Back in the old days, I kept mental track of the -rcu tree and ran the tests appropriate to whatever was queued there. This strategy broke down in late 2020 due to family health issues (everyone is now fine, thank you!), resulting in a couple of embarrassing escapes. Some additional automation was clearly required.
This automation took the form of a new torture.sh script. This is not intended to be the main testing mechanism, but instead an overnight touch-test of the full rcutorture suite that is run occasionally, for example, just after accepting a large patch series or just before sending a pull request.
By default, torture.sh runs everything both with and without KASAN, and with a 10-minute “duration base”. The translation from “duration base” into wall-clock time is a bit indirect. The fewer CPUs you have, the more tests you run, and the longer it takes your system to build a kernel, the more wall-clock time that “10 minutes” will turn into. On my 16-hardware-thread laptop, running everything (including the non-default KCSAN runs) turns that 10-minute duration base into about 11 hours. Increasing the duration base by five minutes increases the total wall-clock time by about 100 minutes.
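As a rough illustration of that scaling, the two data points above (about 11 hours at a 10-minute duration base, and about 100 more minutes for each additional five minutes) can be turned into a back-of-the-envelope estimate. This is only a sketch: the function name is made up, the numbers apply only to my 16-hardware-thread laptop with everything enabled, and linear scaling between and beyond those two points is an assumption.

```shell
#!/bin/sh
# Rough wall-clock estimate, in minutes, for a full torture.sh run.
# Assumptions: the 16-hardware-thread laptop figures quoted in the text,
# everything enabled (including KCSAN), and linear scaling.
# estimate_wallclock_minutes is a hypothetical helper, not part of torture.sh.
estimate_wallclock_minutes() {
    # $1 is the duration base in minutes.
    # 11 * 60: about 11 hours at the default duration base of 10 minutes.
    # 100 / 5: about 100 extra minutes per 5 minutes of added duration base.
    echo $(( 11 * 60 + ($1 - 10) * (100 / 5) ))
}

estimate_wallclock_minutes 10   # default duration base: prints 660
estimate_wallclock_minutes 15   # five minutes longer: prints 760
```

Your own system will have different constants, so treat the output as a planning aid rather than a prediction.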
This is therefore not a test to be integrated into a per-commit CI system. However, manually selecting specific tests for the most recent RCU-related commit is far easier than keeping the entire -rcu stack in one's head, and torture.sh assists with this by providing sets of --configs- and --do- parameters.
The --configs- parameters are as follows:
The --do- parameters are as follows:
- --do-all, which enables everything, including non-default options such as KCSAN.
- --do-allmodconfig, which does a single allmodconfig kernel build without running anything, and without either KASAN or KCSAN.
- --do-clocksourcewd, which does a short test of the clocksource watchdog, verifying that it can tell the difference between delay-based skew and clock-based skew.
- --do-kasan, which enables KASAN on everything except --do-allmodconfig.
- --do-kcsan, which enables KCSAN on everything except --do-allmodconfig.
- --do-kvfree, which runs a special rcuscale test of the kvfree_rcu() primitive.
- --do-locktorture, which enables a set of locktorture runs.
- --do-none, which disables everything. Yes, you can give a long series of --do-all and --do-none arguments if you really want to, but the usual approach is to follow --do-none with the list of tests you want to enable. For example, --do-none --do-clocksourcewd will test only the clocksource watchdog, and do so in but a few minutes.
- --do-rcuscale, which enables rcuscale update-side performance tests, adapted to the number of CPUs on your system.
- --do-rcutorture, which enables rcutorture stress tests.
- --do-refscale, which enables refscale read-side performance tests, adapted to the number of CPUs on your system.
- --do-scftorture, which enables scftorture stress tests for smp_call_function() and friends, adapted to the number of CPUs on your system.
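The "--do-none followed by what you want" pattern can be sketched as a small wrapper that assembles the command line. The torture_only function is hypothetical, and the script path is an assumption about where torture.sh lives in the kernel source tree. The sketch only prints the command rather than running it, so it is safe to try anywhere.

```shell
#!/bin/sh
# Hypothetical helper (not part of torture.sh): print a torture.sh command
# line that disables everything and then re-enables only the named tests.
# The path below is an assumption about the location of torture.sh.
TORTURE_SH=tools/testing/selftests/rcutorture/bin/torture.sh

torture_only() {
    cmd="$TORTURE_SH --do-none"
    for t in "$@"; do
        cmd="$cmd --do-$t"      # e.g. "clocksourcewd" becomes --do-clocksourcewd
    done
    echo "$cmd"
}

torture_only clocksourcewd        # only the clocksource watchdog test
torture_only rcutorture kvfree    # rcutorture plus the kvfree_rcu() test
```

Paste the printed command into a shell at the top of the kernel source tree to actually run the selected tests.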
As of early 2021, KCSAN is still a bit picky about compiler versions, so the --kcsan-kmake-arg argument allows you to specify arguments to the --kmake-arg argument to kvm.sh. For example, right now, I use --kcsan-kmake-arg "CC=clang-11".
As noted earlier, both rcuscale and refscale can have tests added and removed over time. The torture.sh script deals with this by doing a grep through the rcuscale.c and refscale.c source code, respectively, and running all of the tests that it finds.
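The grep-the-source idea might look something like the following sketch. It assumes that each test is declared with a .name = "..." initializer, which is an assumption made for illustration and not necessarily the exact pattern that torture.sh matches. The list_test_names function and the sample file contents are likewise made up and are not kernel source.

```shell
#!/bin/sh
# Sketch: list test names by grepping a source file for .name = "..." lines.
# Assumption: each test is declared with a .name = "..." initializer.
# list_test_names is a hypothetical helper, not part of torture.sh.
list_test_names() {
    grep '\.name[[:space:]]*=' "$1" |
        sed -e 's/^[^"]*"//' -e 's/".*$//'   # keep only the quoted string
}

# Made-up sample input standing in for a file such as refscale.c.
sample=$(mktemp)
cat > "$sample" <<'EOF'
static struct example_ops ops_a = {
    .name = "alpha",
};
static struct example_ops ops_b = {
    .name   = "beta",
};
EOF

list_test_names "$sample"   # prints "alpha" and "beta", one per line
rm -f "$sample"
```

The attraction of this approach is that a newly added test gets picked up without anyone needing to remember to update the script.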
The --duration argument specifies the duration base, which, as noted earlier, defaults to 10 minutes. This duration base is apportioned across the kvm.sh script's --duration parameter, with 70% for rcutorture, 10% for locktorture, and 20% for scftorture. So if you specify --duration 20 to torture.sh, the rcutorture kvm.sh runs will specify --duration 14, the locktorture kvm.sh runs will specify --duration 2, and the scftorture kvm.sh runs will specify --duration 4.
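The apportionment described above can be sketched in a few lines of shell arithmetic. The split_duration function is hypothetical, and the use of truncating integer division is an assumption, so durations that do not divide evenly might be rounded differently by the real script.

```shell
#!/bin/sh
# Sketch of the apportionment: 70% of the duration base goes to rcutorture,
# 10% to locktorture, and 20% to scftorture.
# Assumption: truncating integer arithmetic; torture.sh may round differently.
# split_duration is a hypothetical helper, not part of torture.sh.
split_duration() {
    # $1 is the duration base in minutes.
    echo "rcutorture=$(( $1 * 70 / 100 )) locktorture=$(( $1 * 10 / 100 )) scftorture=$(( $1 * 20 / 100 ))"
}

split_duration 20   # prints: rcutorture=14 locktorture=2 scftorture=4
```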
The 100GB of storage consumed by a full run is addressed at least partially by compressing KASAN vmlinux files, which gains roughly a factor of two overall, courtesy of the 1GB size of each such file. Normally, torture.sh uses all available CPUs to do the compression, but you can restrict it using the --compress-kasan-vmlinux parameter. At the extreme, --compress-kasan-vmlinux 0 will disable compression entirely, which can be an attractive option given that compressing takes about an hour of wall-clock time on my 16-CPU laptop.
Finally, torture.sh places all of its output under a date-stamped directory suffixed with -torture, for example, tools/testing/selftests/rcutorture/res/2
Taking all of this together, torture.sh provides a very useful overnight “acceptance test” for RCU.