Wednesday 28 December 2011

Monitoring /proc/timer_stats

The /proc/timer_stats interface allows one to check on timer usage in a Linux system and hence detect any misuse of timers that can cause excessive wake up events (and also waste power).  /proc/timer_stats reports the process id (pid) of a task that initialised the timer, the name of the task, the name of the function that initialised the timer and the name of the timer callback function.  To enable timer sampling, write "1\n" to /proc/timer_stats and to disable write "0\n".

While this interface is simple to use, collecting multiple samples over a long period of time to monitor overall system behaviour takes a little more effort.   To help with this, I've written a very simple tool called eventstat that calculates the rate of events per second and can dump the data in a .csv (comma separated values) format for importing into a spreadsheet such as LibreOffice for further analysis (such as graphing).

In its basic form, eventstat will run ad infinitum and can be halted by control-C. One can also specify the sample period and number of samples to gather, for example:

 sudo eventstat 10 60  

.. this gathers samples every 10 seconds for 60 samples (which equates to 10 minutes).

The -t option specifies an events/second threshold to discard events less than this threshold, for example:  sudo cpustat -t 10 will show events running at 10Hz or higher.

To dump the samples into a .csv file, use the -r option followed by the name of the .csv file.  If you just want to collect just the samples into a .csv file and not see the statistics during the run, use also the -q option, e.g.

 sudo eventstat -q -r event-report.csv


With eventstat you can quickly identify rouge processes that cause a high frequency of wake ups.   Arguably one can do this with tools such as PowerTop, but eventstat was written to allow one to collect the event statistics over a very long period of time and then help to analyse or graph the data in tools such as Libre Office spreadsheet.


The source is available in the following git repository:  git://kernel.ubuntu.com/cking/eventstat.git and in my power management tools PPA: https://launchpad.net/~colin-king/+archive/powermanagement

In an ideal world, application developers should check their code with tools like eventstat or PowerTop to ensure that the application is not misbehaving and causing excessive wake ups especially because abuse of timers could be happening in the supporting libraries that applications may be using.

Tuesday 13 December 2011

Improving Battery Life in Ubuntu Precise 12.04 LTS

Part of my focus this cycle is to see where we can make power saving improvements for Ubuntu Precise 12.04 LTS. There has been a lot of anecdotal evidence of specific machines or power saving features behaving poorly over the past few cycles.   So, armed with a 6.5 digit precision multimeter from Fluke I've been measuring the power consumption on various laptops in different test scenarios to try and answer some outstanding questions:

* Is it safe to enable Matthew Garrett's PCIe ASPM fix?
* Are the power savings suggested by PowerTop useful and can we reliably enabled any of these in pm-utils?
* How accurate are the ACPI battery readings to estimate power consumption?
* Do the existing pm-utils power.d scripts still make sense?
* Which is better for power saving: i386, i386-pae or amd64?
* How much power does the laptop backlight really use?
* Does halving the mouse input rate really save that much more power?
* Should we re-enable Aggressive Link Power Management (ALPM)?
* Are there any misbehaving applications that are consuming too much power?
* What are the root causes of HDD wake-ups
* Which applications and daemons are creating unnecessary wake events?
* How much does the MSR_IA32_ENERGY_PERF_BIAS save us?

..and many more besides!

From some of the analysis and "crowd sourcing" tests it is clear that the PCIe ASPM fix works well, so we've already incorporated that into Precise.

Aggressive Link Power Management (ALPM) is a mechanism where a SATA AHCI controller can put the SATA link that connects to the disk into a very low power mode during periods of zero I/O activity and into an active power state when work needs to be done. Tests show that this can save around 0.5-1.5 Watts of power on a typical system. However, it has been known in the past to not work on some devices, so I've put a call for testing of ALPM out to the community so we can get a better understanding of the power savings vs reliability.

Some of the PowerTop analysis has shown we can save another 1-2 Watts of power by putting USB and PCI controllers of devices like Webcams, SD card controllers, Wireless, Ethernet and Bluetooth  into a lower power state.  Again, we would like to understand the range of power savings across a large set of hardware and to see how reliable this is, so another crowd sourced call for testing has been also set up.

So, if you want to contribute to the testing, please visit the above links and spend just a few tens of minutes to see we can extend the battery life of your laptop or netbook.  And periodically visit https://wiki.ubuntu.com/Kernel/PowerManagement to see if there any new tests you can participate in.

[UPDATE]

I've written some brief notes on power saving tweaks and also some simple recommendations for application developers to follow too.

The thread continues here (part 2)