A Smackerel of Opinion: stress-ng: an updated system stress test tool

Saturday 21 June 2014

stress-ng: an updated system stress test tool

Recently added to Ubuntu 14.10 is stress-ng, a simple tool designed to stress various components of a Linux system. stress-ng is a re-implementation of the original stress tool written by Amos Waterland and adds various new ways to exercise a computer as well as a very simple "bogo-operation" set of metrics for each stress method.

stress-ng currently contains the following methods to exercise the machine:

CPU compute - just lots of sqrt() operations on pseudo-random values. One can also specify the % loading of the CPUs
Cache thrashing, a naive cache read/write exerciser
Drive stress by writing and removing many temporary files
Process creation and termination, just lots of fork() + exit() calls
I/O syncs, just forcing lots of sync() calls
VM stress via mmap(), memory write and munmap()
Pipe I/O, large pipe writes and reads that exercise pipe, copying and context switching
Socket stressing, much like the pipe I/O test but using sockets
Context switching between a pair of producer and consumer processes
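
The CPU method and its "% loading" control can be pictured with a short Python sketch. This is an illustration only, not stress-ng's own code (stress-ng is written in C). The function name cpu_stress, the 100 ms time slice and the busy/sleep duty-cycle approach are all assumptions made for the example.

```python
import math
import random
import time


def cpu_stress(duration_s, load_pct=100, slice_s=0.1):
    """Busy-loop on sqrt() for roughly load_pct% of each time slice.

    Hypothetical helper: returns the number of sqrt() calls performed.
    """
    if not 0 <= load_pct <= 100:
        raise ValueError("load_pct must be between 0 and 100")
    rng = random.Random(0)  # fixed seed, pseudo-random inputs
    ops = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        # Busy phase: compute sqrt() until this slice's share is used up.
        busy_until = time.monotonic() + slice_s * load_pct / 100.0
        while time.monotonic() < busy_until:
            math.sqrt(rng.random())
            ops += 1
        # Idle phase: sleep for the remainder of the slice.
        idle_s = slice_s * (100 - load_pct) / 100.0
        if idle_s > 0:
            time.sleep(idle_s)
    return ops


if __name__ == "__main__":
    print(cpu_stress(0.5, load_pct=50))
```

The load achieved this way is only approximate, since the scheduler and timer granularity both affect how long each busy and idle phase really lasts.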

Many of the above stress methods have additional configuration options. Each stress method can be run by one or more child processes.
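
Running one stress method in several child processes can be sketched with fork() as follows. This is a minimal Python illustration under stated assumptions, not how stress-ng itself is implemented; run_in_children and child_work are hypothetical names, and the example needs a Unix-like system because it relies on os.fork().

```python
import math
import os
import random


def child_work(iterations):
    """The work each child performs: sqrt() on pseudo-random values."""
    rng = random.Random(os.getpid())
    for _ in range(iterations):
        math.sqrt(rng.random())


def run_in_children(n_children, iterations=100_000):
    """Fork n_children workers, wait for them all, return their exit codes."""
    pids = []
    for _ in range(n_children):
        pid = os.fork()
        if pid == 0:
            # Child: do the work, then leave without returning to the caller.
            code = 1
            try:
                child_work(iterations)
                code = 0
            finally:
                os._exit(code)
        pids.append(pid)

    exit_codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        # -1 marks a child that was killed by a signal rather than exiting.
        exit_codes.append(os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1)
    return exit_codes


if __name__ == "__main__":
    print(run_in_children(4))
```

Using os._exit() in the child matters here: it stops the child from falling through into the parent's code path after its work is done.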

The --metrics option dumps the number of operations performed by each stress method, aka "bogo ops". They are "bogus" because they are a rough and unscientific metric. One can specify how long to run a test either by test duration in seconds or by number of bogo ops.
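
The idea of counting bogo ops and stopping on either a time limit or an operation count can be sketched as below. This is a hedged Python illustration rather than stress-ng's actual accounting; run_stressor, timeout_s and max_ops are names invented for the example.

```python
import math
import random
import time


def run_stressor(op, timeout_s=None, max_ops=None):
    """Call op() repeatedly until a time limit or an op-count limit is hit.

    Returns (bogo_ops, elapsed_seconds). At least one limit is required.
    """
    if timeout_s is None and max_ops is None:
        raise ValueError("give timeout_s, max_ops, or both")
    start = time.monotonic()
    bogo_ops = 0
    while True:
        if max_ops is not None and bogo_ops >= max_ops:
            break
        if timeout_s is not None and time.monotonic() - start >= timeout_s:
            break
        op()
        bogo_ops += 1
    return bogo_ops, time.monotonic() - start


if __name__ == "__main__":
    rng = random.Random(0)
    ops, elapsed = run_stressor(lambda: math.sqrt(rng.random()), timeout_s=0.5)
    print(f"{ops} bogo ops in {elapsed:.2f}s ({ops / elapsed:.0f} bogo ops/s)")
```

Because the count depends on how expensive each op() call happens to be, figures like these are only comparable between runs of the same method on the same build, which is why the metric is described as rough.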

I've tried to make stress-ng compatible with the older stress tool, but note that it is not guaranteed to produce identical results, because the test methods common to the two tools have been implemented differently.

stress-ng has been useful for helping me measure different power-consuming loads. It has also been useful when testing various thermald optimisation tweaks on one of my older machines.

For more information, consult the stress-ng manual page. Be warned, this tool can make your system get seriously busy and warm!

9 comments:

Colin Ian King, 15 December 2014 at 00:25
Note that stress-ng is almost feature complete with version 0.03.x. It now contains a large range of stressors, and some specific stressors such as CPU and VM contain their own subset of stressors too. stress-ng will build on Debian kFreeBSD and Debian Hurd based kernels, as well as on OpenBSD and FreeBSD.
Anonymous, 27 September 2016 at 01:38
I'm running an instance of this tool inside and outside of a Linux container to verify the cgroup CPU Bandwidth or CPU time that each cgroup is getting. Is this an appropriate use-case for this tool?
George, 12 August 2017 at 17:53
Colin,
I'm trying to generate maximum heat load in a server farm, for the purpose of benchmarking heat generation/power consumption under a theoretical maximum load, on Xeon processors. I'm trying to get my CPUs to their maximum wattage. I tried "stress -c 32" for my 32-core machines, and it is generating the expected high load, but only a 10% power consumption peak over an idle machine with no bootable filesystem. I can't tell if that is typical, or if stress's sqrt function doesn't do much in the way of driving power consumption.

What would you suggest as a stress-ng command line/script to achieve my goal? I have dual 16-core Xeons and 128 GB of memory. Disk is solid state, and I'm not really wanting to generate write wear on the SSDs, so the most I would want to do on the drives is reads.

I have a lab for a big-data software development project, and I'm trying to develop metrics for making decisions on the best way to spend budget. A basic question is "Is more cooling needed for a unit of work, or is more modern hardware going to do more work with less cooling?"
Obviously the answer is complex, depending on how old the old hardware is, and how much efficiency gain there is with the new hardware. But developing a testing method that can really crank up the heat is a desirable thing. I can't tell if my new stack is being stressed right now, I can't generate enough heat to convince me that "stress -c32" is doing it. My power consumption is high enough, but the delta from the idle state is too low to be believable.

The best test would be to get the software team to give me their worst case system load, but they don't really know what that is, as every test they give me doesn't seem to take the system out of an idle power level, they spend most of the time with no cpu load and waiting for i/o. Of course that suggests I shouldn't even try to load the machines... But as soon as I make that assumption, someone is going to come up with a test case that breaks my lab. So I want to break it myself, now, so that the worst they can do later is reach my benchmark without surpassing it.
kunal s lawtawar, 1 December 2017 at 23:12
It doesn't work with YAML output. We can't capture all the output for hdd
Unknown, 9 January 2018 at 16:11
Hello Mr. King.

In your post you say that it is possible to "specify the % loading of the CPUs". But according to the user manual, the option for this is missing. I see how to put a limit to the amount of Memory used, the number of CPUs, etc. but not for the % of CPU.
Anonymous, 25 July 2018 at 09:31
What would be the default run time for each sub-stressor under CPU if I select --cpu option? It says "defaulting to a 86400 second run per stressor" does that mean each sub-stressor will run for 24hrs? Or whole CPU class will complete within 24 hrs?