paulmck (paulmck) wrote,

Stupid RCU Tricks: So rcutorture is Not Aggressive Enough For You?

So you read the previous post, but simply running rcutorture did not completely vent your frustration. What can you do?

One thing you can do is to tweak a number of rcutorture settings to adjust the manner and type of torture that your testing inflicts.

RCU CPU Stall Warnings

If you are not averse to a quick act of vandalism, then you might wish to induce an RCU CPU stall warning. The --bootargs argument can be used for this, for example as follows:

tools/testing/selftests/rcutorture/bin/ --allcpus --duration 3 --trust-make \
    --bootargs "rcutorture.stall_cpu=22 rcutorture.fwd_progress=0"

The rcutorture.stall_cpu=22 says to stall a CPU for 22 seconds, that is, one second longer than the default RCU CPU stall timeout in mainline. If you are instead using a distribution kernel, you might need to specify 61 seconds (as in “rcutorture.stall_cpu=61”) in order to allow for the typical 60-second RCU CPU stall timeout. The rcutorture.fwd_progress=0 has no effect except to suppress a warning message (with stack trace included free of charge) that questions the wisdom of running both RCU-callback forward-progress tests and RCU CPU stall tests at the same time. In fact, the code not only emits the warning message, it also automatically suppresses the forward-progress tests. If you prefer living dangerously and don't mind the occasional out-of-memory (OOM) lockup accompanying your RCU CPU stall warnings, feel free to edit kernel/rcu/rcutorture.c to remove this automatic suppression.
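The arithmetic above (stall for one second longer than the stall timeout) can be sketched as a small helper. This is only an illustration: the function name `stall_bootargs` and the two constants are hypothetical, and the 21-second and 60-second timeouts are simply the values quoted above, which your kernel might not share.

```python
# Stall timeouts quoted in the post; check your own kernel's setting.
MAINLINE_STALL_TIMEOUT_S = 21
DISTRO_STALL_TIMEOUT_S = 60


def stall_bootargs(stall_timeout_s, margin_s=1):
    """Return a --bootargs value that stalls a CPU just past the timeout.

    Hypothetical helper: it only formats a string and runs nothing.
    """
    if stall_timeout_s <= 0 or margin_s <= 0:
        raise ValueError("timeout and margin must be positive")
    stall_s = stall_timeout_s + margin_s
    # fwd_progress=0 avoids the warning about combining forward-progress
    # tests with stall tests.
    return f"rcutorture.stall_cpu={stall_s} rcutorture.fwd_progress=0"


print(stall_bootargs(MAINLINE_STALL_TIMEOUT_S))
# rcutorture.stall_cpu=22 rcutorture.fwd_progress=0
print(stall_bootargs(DISTRO_STALL_TIMEOUT_S))
# rcutorture.stall_cpu=61 rcutorture.fwd_progress=0
```

The resulting string is what would be passed, quoted, as the value of --bootargs.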

If you are running on a large system that takes more than ten seconds to boot, you might need to increase the RCU CPU stall holdoff interval. For example, adding rcutorture.stall_cpu_holdoff=120 to the --bootargs list would wait for two minutes before stalling a CPU instead of the default holdoff of 10 seconds. If simply spinning a CPU with preemption disabled does not fully vent your ire, you could undertake a more profound act of vandalism by adding rcutorture.stall_cpu_irqsoff=1 so as to cause interrupts to be disabled on the spinning CPU.

Some flavors of RCU, such as SRCU, permit general blocking within their read-side critical sections, and you can exercise this capability by adding rcutorture.stall_cpu_block=1 to the --bootargs list. Better yet, you can use this kernel-boot parameter to torture flavors of RCU that forbid blocking within read-side critical sections, which allows you to see how they complain about such mistreatment.

The vanilla flavor of RCU has a grace-period kthread, and stalling this kthread is another good way to torture RCU. Simply add rcutorture.stall_gp_kthread=22 to the --bootargs list, which delays the grace-period kthread for 22 seconds. Doing this will normally elicit strident protests from mainline kernels.
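The stall-related parameters mentioned in this section can be collected into a single --bootargs string. The following is a minimal sketch, assuming you want to pick and choose among them. The function name `stall_torture_bootargs` and its keyword arguments are hypothetical, while the rcutorture.* parameter names are the ones given above. Whether any particular combination is a sensible test is up to you.

```python
def stall_torture_bootargs(stall_cpu_s=None, holdoff_s=None,
                           irqsoff=False, block=False, gp_kthread_s=None):
    """Assemble the stall-related boot parameters described in the post.

    Hypothetical helper: it only builds the string for --bootargs.
    """
    params = []
    if stall_cpu_s is not None:
        params.append(f"rcutorture.stall_cpu={stall_cpu_s}")
        # Suppress the warning about mixing stalls with forward-progress tests.
        params.append("rcutorture.fwd_progress=0")
    if holdoff_s is not None:
        # Seconds to wait after boot before stalling (default is 10).
        params.append(f"rcutorture.stall_cpu_holdoff={holdoff_s}")
    if irqsoff:
        # Spin with interrupts disabled, not just preemption.
        params.append("rcutorture.stall_cpu_irqsoff=1")
    if block:
        # Block within the read-side critical section instead of spinning.
        params.append("rcutorture.stall_cpu_block=1")
    if gp_kthread_s is not None:
        # Delay the grace-period kthread for this many seconds.
        params.append(f"rcutorture.stall_gp_kthread={gp_kthread_s}")
    return " ".join(params)


# Large system: hold off for two minutes, then stall with interrupts off.
print(stall_torture_bootargs(stall_cpu_s=22, holdoff_s=120, irqsoff=True))
# Stall only the grace-period kthread.
print(stall_torture_bootargs(gp_kthread_s=22))
```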

Finally, you could starve rcutorture of CPU time by running a large number of rcutorture instances concurrently (each in its own Linux-kernel source tree), thereby overcommitting the CPUs.

But maybe you would prefer to deprive RCU of memory. If so, read on!

Running rcutorture Out of Memory

By default, each rcutorture guest OS is allotted 512MB of memory. But perhaps you would like to have it make do with only 128MB:

tools/testing/selftests/rcutorture/bin/ --allcpus --trust-make --memory 128M

You could go further by making the RCU need-resched testing more aggressive, for example, by increasing the duration of this testing from the default three-quarters of the RCU CPU stall timeout to (say) seven eighths:

tools/testing/selftests/rcutorture/bin/ --allcpus --trust-make --memory 128M \
    --bootargs "rcutorture.fwd_progress_div=8"

More to the point, you might make the RCU callback-flooding tests more aggressive, for example by adjusting the values of the MAX_FWD_CB_JIFFIES, MIN_FWD_CB_LAUNDERS, or MIN_FWD_CBS_LAUNDERED macros and rebuilding the kernel. Alternatively, you could use kill -STOP on one of the vCPUs in the middle of an rcutorture run. Either way, if you break it, you buy it!

Or perhaps you would rather attempt to drown rcutorture in memory, perhaps forcing a full 16GB onto each guest OS:

tools/testing/selftests/rcutorture/bin/ --allcpus --trust-make --memory 16G

Another productive torture method involves unusual combinations of Kconfig options, a topic taken up by the next section.

Confused Kconfig Options

The Kconfig options for a given rcutorture scenario are specified by the corresponding file in the tools/testing/selftests/rcutorture/configs/rcu directory. For example, the Kconfig options for the infamous TREE03 scenario may be found in tools/testing/selftests/rcutorture/configs/rcu/TREE03.

But why not just use the --kconfig argument and be happy, as described previously?

One reason is that the rcutorture scripting refers to a few Kconfig options early in the process, before the --kconfig parameter's additions have been processed. For example, changing CONFIG_NR_CPUS should be done in the file rather than via the --kconfig parameter. Another reason is to avoid supplying a --kconfig argument for each of many repeated rcutorture runs. But perhaps most important, if you want some scenarios to be built with one Kconfig option and others built with some other Kconfig option, modifying each scenario's file avoids the need for multiple rcutorture runs.

For example, you could edit the tools/testing/selftests/rcutorture/configs/rcu/TREE03 file to change the CONFIG_NR_CPUS=16 to instead read CONFIG_NR_CPUS=4, and then run the following on a 12-CPU system:

tools/testing/selftests/rcutorture/bin/ --allcpus --trust-make --configs "3*TREE03"

This would run three concurrent copies of TREE03, but with each guest OS restricted to only 4 CPUs.
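The edit and the arithmetic in this example can be sketched in a few lines. This is an illustration only: the helper names `set_kconfig` and `configs_arg` are hypothetical, the two-line fragment is a made-up excerpt rather than the real TREE03 contents, and the helpers operate on strings instead of touching your source tree.

```python
import re


def set_kconfig(text, option, value):
    """Set `option=value` in Kconfig-fragment text, appending if absent.

    Sketch only: "# CONFIG_FOO is not set" lines are not handled.
    """
    line = f"{option}={value}"
    pattern = re.compile(rf"^{re.escape(option)}=.*$", re.MULTILINE)
    if pattern.search(text):
        return pattern.sub(lambda _match: line, text)
    separator = "" if text == "" or text.endswith("\n") else "\n"
    return text + separator + line + "\n"


def configs_arg(scenario, host_cpus, guest_cpus):
    """Return a --configs value that fills the host without overcommit."""
    copies = host_cpus // guest_cpus
    if copies < 1:
        raise ValueError("guest needs more CPUs than the host has")
    return f"{copies}*{scenario}"


fragment = "CONFIG_SMP=y\nCONFIG_NR_CPUS=16\n"  # hypothetical excerpt
print(set_kconfig(fragment, "CONFIG_NR_CPUS", 4), end="")
# CONFIG_SMP=y
# CONFIG_NR_CPUS=4
print(configs_arg("TREE03", host_cpus=12, guest_cpus=4))
# 3*TREE03
```

Rounding down keeps the guests from overcommitting the host. Deliberately rounding up instead would be one way of starving rcutorture of CPU time.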

Finally, if a given Kconfig option applies to all rcutorture runs and you are tired of repeatedly entering --kconfig arguments, you can instead add that option to the tools/testing/selftests/rcutorture/configs/rcu/CFcommon file.

But sometimes Kconfig options just aren't enough. And that is why we have kernel boot parameters, the subject of the next section.

Boisterous Boot Parameters

We have supplied kernel boot parameters using the --bootargs parameter, but sometimes ordering considerations or sheer laziness motivate greater permanence. Either way, the scenario's .boot file may be brought to bear. For example, the TREE03 scenario's file is located here: tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot.

As of the v5.7 Linux kernel, this file contains the following:

rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30

For example, the probability of RCU's grace-period processing overlapping with CPU-hotplug operations may be adjusted by decreasing the value of rcutorture.onoff_interval from its default of 200 milliseconds or by adjusting the various grace-period delays specified by the rcutree.gp_preinit_delay, rcutree.gp_init_delay, and rcutree.gp_cleanup_delay parameters. In fact, chasing bugs involving races between RCU grace periods and CPU-hotplug operations often involves tuning these four parameters to maximize race probability, thus decreasing the required rcutorture run durations.
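To make the tuning concrete, here is a minimal sketch that renders a candidate .boot line from a dictionary of parameters. The helper name `render_boot_line` is hypothetical, and the numeric values are purely illustrative assumptions, not recommendations. Finding values that actually maximize the race probability for a given bug takes experimentation.

```python
def render_boot_line(params):
    """Join name/value pairs into a space-separated boot-parameter line."""
    for name, value in params.items():
        if any(ch.isspace() for ch in f"{name}{value}"):
            raise ValueError(f"whitespace not allowed in {name!r}={value!r}")
    return " ".join(f"{name}={value}" for name, value in params.items())


# Illustrative values only (assumptions, not taken from the kernel tree).
race_hunting = {
    "rcutorture.onoff_interval": 100,  # lower than the 200 in TREE03.boot
    "rcutorture.onoff_holdoff": 30,    # unchanged from TREE03.boot
    "rcutree.gp_preinit_delay": 3,
    "rcutree.gp_init_delay": 3,
    "rcutree.gp_cleanup_delay": 3,
}

print(render_boot_line(race_hunting))
```

The printed line could be placed in a scenario's .boot file or, for a one-off run, passed via --bootargs.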

The possibilities for the .boot file contents are limited only by the extent of Documentation/admin-guide/kernel-parameters.txt. And actually not even by that, given the all-too-real possibility of undocumented kernel boot parameters.

You can also create your own rcutorture scenarios by creating a new set of files in the tools/testing/selftests/rcutorture/configs/rcu directory. You can make a new scenario run by default (or in response to the CFLIST string to the --configs parameter) by adding its name to the tools/testing/selftests/rcutorture/configs/rcu/CFLIST file. For example, you could create a MYSCENARIO file containing Kconfig options and (optionally) a MYSCENARIO.boot file containing kernel boot parameters in the tools/testing/selftests/rcutorture/configs/rcu directory, and make this scenario run by default by adding a line reading MYSCENARIO to the tools/testing/selftests/rcutorture/configs/rcu/CFLIST file.
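The three files involved can be sketched as follows. This is a hedged illustration that works in a scratch directory rather than in a real kernel tree. The helper name `add_scenario`, the Kconfig lines, the boot parameter, and the pre-existing CFLIST contents are all assumptions chosen for the example.

```python
import tempfile
from pathlib import Path


def add_scenario(configs_dir, name, kconfig_lines, boot_params=""):
    """Create NAME, optionally NAME.boot, and list NAME in CFLIST."""
    configs_dir = Path(configs_dir)
    (configs_dir / name).write_text("\n".join(kconfig_lines) + "\n")
    if boot_params:
        (configs_dir / f"{name}.boot").write_text(boot_params + "\n")
    cflist = configs_dir / "CFLIST"
    listed = cflist.read_text().splitlines() if cflist.exists() else []
    if name not in listed:  # avoid duplicate entries on repeated calls
        listed.append(name)
        cflist.write_text("\n".join(listed) + "\n")


# Stand-in for tools/testing/selftests/rcutorture/configs/rcu.
with tempfile.TemporaryDirectory() as scratch:
    scratch = Path(scratch)
    (scratch / "CFLIST").write_text("TREE03\n")  # made-up starting contents
    add_scenario(scratch, "MYSCENARIO",
                 ["CONFIG_SMP=y", "CONFIG_NR_CPUS=4"],  # illustrative options
                 boot_params="rcutorture.onoff_interval=200")
    print((scratch / "CFLIST").read_text(), end="")
    # TREE03
    # MYSCENARIO
```

Pointing `configs_dir` at the real configs/rcu directory would modify your source tree, so try it on a scratch copy first.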


This post discussed enhancing rcutorture through use of stall warnings, memory limitations, Kconfig options, and kernel boot parameters. The special case of adjusting CONFIG_NR_CPUS deserves more attention, and that is the topic of the next post.
Tags: rcu, scalability, stupid rcu tricks