Log in

Deep Blue vs. Watson Revisited

Some years back, I speculated on the importance of IBM's Watson. Much has happened since then: Watson won Jeopardy, has been applied to medical applications, and has been made available to numerous business partners to enable them to produce Watson-based offerings. In short, it is long past time for a follow-up.

However, The Economist beat me to the punch in their October 3rd print edition. I doubt that I can improve on their article, so I will confine myself to taking the fair-use liberty of quoting their last sentence:

If it [Watson] can pull that off, a truly disturbing possibility looms: that the next TV show featuring Watson might be “America's Got Talent”.


Suppose that a very long linked list was to be protected with SRCU. Let's also make the presumably unreasonable assumption that this list is so long that we don't want to stay in a single SRCU read-side critical section for the whole traversal.

So why not try hand-over-hand SRCU protection, as shown in the following code fragment?

  1 struct foo {
  2   struct list_head list;
  3   ...
  4 };
  6 LIST_HEAD(mylist);
  7 struct srcu_struct mysrcu;
  9 void process(void)
 10 {
 11   int i1, i2;
 12   struct foo *p;
 14   i1 = srcu_read_lock(&mysrcu);
 15   list_for_each_entry_rcu(p, &mylist, list) {
 16     do_something_with(p);
 17     i2 = srcu_read_lock(&mysrcu);
 18     srcu_read_unlock(&mysrcu, i1);
 19     i1 = i2;
 20   }
 21   srcu_read_unlock(&mysrcu, i1);
 22 }

The trick is that on each pass through the loop, we enter a new SRCU read-side critical section, then exit the old one. That way the entire traversal is protected by SRCU, but each SRCU read-side critical section is quite short, covering traversal of but a single element of the list.

As is customary with SRCU, the list is manipulated using list_add_rcu(), list_del_rcu, and friends.

What are the advantages and disadvantages of this hand-over-hand SRCU list traversal?
The official definition of the C Language is the standard, but the standard doesn't actually compile any programs. One can argue that the actual implementations are the real definition of the C Language, although further thought along this line usually results in a much greater appreciation of the benefits of having standards. Nevertheless, the implementations usually win any conflicts with the standard, at least in the short term.

Another interesting source of definitions is the opinions of the developers who actually write C. And both the standards bodies and the various implementations do take these opinions into account at least some of the time. Differences of opinion within the standards bodies are sometimes settled by surveying existing usage, and implementations sometimes provide facilities outside the standard based on user requests. For example, relatively few compiler warnings are actually mandated by the standard.

Although one can argue that the standard is the end-all and be-all definition of the C Language, the fact remains that if none of the implementers provide a facility called out by the standard, the implementers win. Similarly, if nobody uses a facility that is called out by the standard, the users win—even if that facility is provided by each and every implementation. Of course, things get more interesting if the users want something not guaranteed by the standard.

Therefore, it is worth knowing what users expect, even if only to adjust their expectations, as John Regehr has done for number of topics, perhaps most notably signed integer overflow. Some researchers have been taking a more proactive stance, with one example being Peter Sewell's group from the University of Cambridge. This group has put together a survey on padding bytes, pointer arithmetic, and unions. This survey is quite realistic, with “that would be crazy” being a valid answer to a number of the questions.

So, if you think you know a thing or two about C's handling of padding bytes, pointer arithmetic, and unions, take the survey!
Some time ago, I figured out that there are more than a billion instances of the Linux kernel in use, and this in turn led to the realization that a million-year RCU bug is happening about three times a day across the installed base. This realization has caused me to focus more heavily on RCU validation, which has uncovered a number of interesting bugs. I have also dabbled a bit in formal verification, which has not yet found a bug. However, formal verification might be getting there, and might some day be a useful addition to RCU's regression testing. I was therefore quite happy to be invited to this Dagstuhl Seminar. In what follows, I summarize a few of the presentation. See here for the rest of the presentations.

Viktor Vafeiadis presented his analysis of the C11 memory model, including some “interesting” consequences of data races, where a data race is defined as a situation involving multiple concurrent accesses to a non-atomic variable, at least one of which is a write. One such consequence involves a theoretically desirable “strengthening” property. For example, this property would mean that multiplexing two threads onto a single underlying thread would not introduce new behaviors. However, with C11, the undefined-behavior consequences of data races can actually cause new behaviors to appear with fewer threads, for example, see Slide 7. This suggests the option of doing away with the undefined behavior, which is exactly the option that LLVM has taken. However, this approach requires some care, as can be seen on Slide 19. Nevertheless, this approach seems promising. One important takeaway from this talk is that if you are worried about weak ordering, you need to pay careful attention to reining in the compiler's optimizations. If you are unconvinced, take a look at this! Jean Pichon-Pharabod, Kyndylan Nienhuis, and Mike Dodds presented on other aspects of the C11 memory model.

Martin T. Vechev apparently felt that the C11 memory model was too tame, and therefore focused on event-driven applications, specifically javascript running on Android. This presentation included some entertaining concurrency bugs and their effects on the browser's display. Martin also discussed formalizing javascript's memory model.

Hongjin Liang showed that ticket locks can provide starvation freedom given a minimally fair scheduler. This provides a proof point for Björn B. Brandenburg's dissertation, which analyzed the larger question of real-time response from lock-based code. It should also provide a helpful corrective to people who still believe that non-blocking synchronization is required.

Joseph Tassarotti presented a formal proof of the quiescent-state based reclamation (QSBR) variant of userspace RCU. In contrast to previous proofs, this proof did not rely on sequential consistency, but instead leveraged a release-acquire memory model. It is of course good to see researchers focusing their tools on RCU! That said, when a researcher asked me privately whether I felt that the proof incorporated realistic assumptions, I of course could not resist saying that since they didn't find any bugs, the assumptions clearly must have been unrealistic.

My first presentation covered what would be needed for me to be able to use formal verification as part of Linux-kernel RCU's regression testing. As shown on slide 34, these are:

  1. Either automatic translation or no translation required. After all, if I attempt to manually translate Linux-kernel RCU to some special-purpose language every release, human error will make its presence known.
  2. Correctly handle environment, including the memory model, which in turn includes compiler optimizations.
  3. Reasonable CPU and memory overhead. If these overheads are excessive, RCU is better served by simple stress testing.
  4. Map to source code lines containing the bug. After all, I already know that there are bugs—I need to know where they are.
  5. Modest input outside of source code under test. The sad fact is that a full specification of RCU would be at least as large as the implementation, and also at least as buggy.
  6. Find relevant bugs. To see why this is important, imagine that some tool finds 100 different million-year bugs and I fix them all. Because roughly one of six fixes introduces a bug, and because that bug is likely to reproduce in far less than a million years, this process has likely greatly reduced the robustness of the Linux kernel.

I was not surprised to get some “frank and honest” feedback, but I was quite surprised (but not at all displeased) to learn that some of the feedback was of the form “we want to see more C code.” After some discussion, I provided just that.

Verification Challenge 5: Uses of RCU

This is another self-directed verification challenge, this time to validate uses of RCU instead of validating the RCU implementations as in earlier posts. As you can see from Verification Challenge 4, the logic expression corresponding even to the simplest Linux-kernel RCU implementation is quite large, weighing in at tens of thousands of variables and hundreds of thousands of clauses. It is therefore worthwhile to look into the possibility of a trivial model of RCU that could be used for verification.

Because logic expressions do not care about cache locality, memory contention, energy efficiency, CPU hotplug, and a host of other complications that a Linux-kernel implementation must deal with, we can start with extreme simplicity. For example:

 1 static int rcu_read_nesting_global;
 3 static void rcu_read_lock(void)
 4 {
 5   (void)__sync_fetch_and_add(&rcu_read_nesting_global, 2);
 6 }
 8 static void rcu_read_unlock(void)
 9 {
10   (void)__sync_fetch_and_add(&rcu_read_nesting_global, -2);
11 }
13 static inline void assert_no_rcu_read_lock(void)
14 {
15   BUG_ON(rcu_read_nesting_global >= 2);
16 }
18 static void synchronize_rcu(void)
19 {
20   if (__sync_fetch_and_xor(&rcu_read_nesting_global, 1) < 2)
21     return;
23   return;
24 }

The idea is to reject any execution in which synchronize_rcu() does not wait for all readers to be done. As before, SET_ASSERT() sets a variable that suppresses all future assertions.

Please note that this model of RCU has some shortcomings:

  1. There is no diagnosis of rcu_read_lock()/rcu_read_unlock() misnesting. (A later version of the model provides limited diagnosis, but under #ifdef CBMC_PROVE_RCU.)
  2. The heavyweight operations in rcu_read_lock() and rcu_read_unlock() result in artificial ordering constraints. Even in TSO systems such as x86 or s390, a store in a prior RCU read-side critical section might be reordered with loads in later critical sections, but this model will act as if such reordering was prohibited.
  3. Although synchronize_rcu() is permitted to complete once all pre-existing readers are done, in this model it will instead wait until a point in time at which there are absolutely no readers, whether pre-existing or new. Therefore, this model's idea of an RCU grace period is even heavier weight than in real life.

Nevertheless, this approach will allow us to find at least some RCU-usage bugs, and it fits in well with cbmc's default fully-ordered settings. For example, we can use it to verify a variant of the simple litmus test used previously:

 1 int r_x;
 2 int r_y;
 4 int x;
 5 int y;
 7 void *thread_reader(void *arg)
 8 {
 9   rcu_read_lock();
10   r_x = x;
12   rcu_read_unlock();
13   rcu_read_lock();
14 #endif
15   r_y = y;
16   rcu_read_unlock();
17   return NULL;
18 }
20 void *thread_update(void *arg)
21 {
22   x = 1;
24   synchronize_rcu();
25 #endif
26   y = 1;
27   return NULL;
28 }
30 int main(int argc, char *argv[])
31 {
32   pthread_t tr;
34   if (pthread_create(&tr, NULL, thread_reader, NULL))
35     abort();
36   (void)thread_update(NULL);
37   if (pthread_join(tr, NULL))
38     abort();
40   BUG_ON(r_y != 0 && r_x != 1);
41   return 0;
42 }

This model has only 3,032 variables and 8,844 clauses, more than an order of magnitude smaller than for the Tiny RCU verification. Verification takes about half a second, which is almost two orders of magnitude faster than the 30-second verification time for Tiny RCU. In addition, the model successfully flags several injected errors. We have therefore succeeded in producing a simpler and faster model approximating RCU, and that can handle multi-threaded litmus tests.

A natural next step would be to move to litmus tests involving linked lists. Unfortunately, there appear to be problems with cbmc's handling of pointers in multithreaded situations. On the other hand, cbmc's multithreaded support is quite new, so hopefully there will be fixes for these problems in the near future. After fixes appear, I will give the linked-list litmus tests another try.

In the meantime, the full source code for these models may be found here.
So the Linux kernel now has a Documentation/CodeOfConflict file. As one of the people who provided an Acked-by for this file, I thought I should set down what went through my mind while reading it. Taking it one piece at a time:

The Linux kernel development effort is a very personal process compared to “traditional” ways of developing software. Your code and ideas behind it will be carefully reviewed, often resulting in critique and criticism. The review will almost always require improvements to the code before it can be included in the kernel. Know that this happens because everyone involved wants to see the best possible solution for the overall success of Linux. This development process has been proven to create the most robust operating system kernel ever, and we do not want to do anything to cause the quality of submission and eventual result to ever decrease.

In a perfect world, this would go without saying, give or take the “most robust” chest-beating. But I am probably not the only person to have noticed that the world is not always perfect. Sadly, it is probably necessary to remind some people that “job one” for the Linux kernel community is the health and well-being of the Linux kernel itself, and not their own pet project, whatever that might be.

On the other hand, I was also heartened by what does not appear in the above paragraph. There is no assertion that the Linux kernel community's processes are perfect, which is all to the good, because delusions of perfection all too often prevent progress in mature projects. In fact, in this imperfect world, there is nothing so good that it cannot be made better. On the other hand, there also is nothing so bad that it cannot be made worse, so random wholesale changes should be tested somewhere before being applied globally to a project as important as the Linux kernel. I was therefore quite happy to read the last part of this paragraph: “we do not want to do anything to cause the quality of submission and eventual result to ever decrease.”

If however, anyone feels personally abused, threatened, or otherwise uncomfortable due to this process, that is not acceptable.

That sentence is of course critically important, but must be interpreted carefully. For example, it is all too possible that someone might feel abused, threatened, and uncomfortable by the mere fact of a patch being rejected, even if that rejection was both civil and absolutely necessary for the continued robust operation of the Linux kernel. Or someone might claim to feel that way, if they felt that doing so would get their patch accepted. (If this sounds impossible to you, be thankful, but also please understand that the range of human behavior is extremely wide.) In addition, I certainly feel uncomfortable when someone points out a stupid mistake in one of my patches, but that discomfort is my problem, and furthermore encourages me to improve, which is a good thing. For but one example, this discomfort is exactly what motivated me to write the rcutorture test suite. Therefore, although I hope that we all know what is intended by the words “abused”, “threatened”, and “uncomfortable” in that sentence, the fact is that it will never be possible to fully codify the difference between constructive and destructive behavior.

Therefore, the resolution process is quite important:

If so, please contact the Linux Foundation's Technical Advisory Board at <tab@lists.linux-foundation.org>, or the individual members, and they will work to resolve the issue to the best of their ability. For more information on who is on the Technical Advisory Board and what their role is, please see:


There can be no perfect resolution process, but this one seems to be squarely in the “good enough” category. The timeframes are long enough that people will not be rewarded by complaining to the LF TAB instead of fixing their patches. The composition of the LF TAB, although not perfect, is diverse, consisting of both men and women from multiple countries. The LF TAB appears to be able to manage the inevitable differences of opinion, based on the fact that not all members provided their Acked-by for this Code of Conflict. And finally, the LF TAB is an elected body that has oversight via the LF, so there are feedback mechanisms. Again, this is not perfect, but it is good enough that I am willing to overlook my concerns about the first sentence in the paragraph.

On to the final paragraph:

As a reviewer of code, please strive to keep things civil and focused on the technical issues involved. We are all humans, and frustrations can be high on both sides of the process. Try to keep in mind the immortal words of Bill and Ted, “Be excellent to each other.”

And once again, in a perfect world it would not be necessary to say this. Sadly, we are human beings rather than angels, and so it does appear to be necessary. Then again, if we were all angels, this would be a very boring world.

Or at least that is what I keep telling myself!

Verification Challenge 4: Tiny RCU

The first and second verification challenges were directed to people working on verification tools, and the third challenge was directed at developers. Perhaps you are thinking that it is high time that I stop picking on others and instead direct a challenge at myself. If so, this is the challenge you were looking for!

The challenge is to take the v3.19 Linux kernel code implementing Tiny RCU, unmodified, and use some formal-verification tool to prove that its grace periods are correctly implemented.

This requires a tool that can handle multiple threads. Yes, Tiny RCU runs only on a single CPU, but the proof will require at least two threads. The basic idea is to have one thread update a variable, wait for a grace period, then update a second variable, while another thread accesses both variables within an RCU read-side critical section, and a third parent thread verifies that this critical section did not span a grace period, like this:

 1 int x;
 2 int y;
 3 int r1;
 4 int r2;
 6 void rcu_reader(void)
 7 {
 8   rcu_read_lock();
 9   r1 = x; 
10   r2 = y; 
11   rcu_read_unlock();
12 }
14 void *thread_update(void *arg)
15 {
16   x = 1; 
17   synchronize_rcu();
18   y = 1; 
19 }
21 . . .
23 assert(r2 == 0 || r1 == 1);

Of course, rcu_reader()'s RCU read-side critical section is not allowed to span thread_update()'s grace period, which is provided by synchronize_rcu(). Therefore, rcu_reader() must execute entirely before the end of the grace period (in which case r2 must be zero, keeping in mind C's default initialization to zero), or it must execute entirely after the beginning of the grace period (in which case r1 must be one).

There are a few technical problems to solve:

  1. The Tiny RCU code #includes numerous “interesting” files. I supplied empty files as needed and used “-I .” to focus the C preprocessor's attention on the current directory.
  2. Tiny RCU uses a number of equally interesting Linux-kernel primitives. I stubbed most of these out in fake.h, but copied a number of definitions from the Linux kernel, including IS_ENABLED, barrier(), and bool.
  3. Tiny RCU runs on a single CPU, so the two threads shown above must act as if this was the case. I used pthread_mutex_lock() to provide the needed mutual exclusion, keeping in mind that Tiny RCU is available only with CONFIG_PREEMPT=n. The thread that holds the lock is running on the sole CPU.
  4. The synchronize_rcu() function can block. I modeled this by having it drop the lock and then re-acquire it.
  5. The dyntick-idle subsystem assumes that the boot CPU is born non-idle, but in this case the system starts out idle. After a surprisingly long period of confusion, I handled this by having main() invoke rcu_idle_enter() before spawning the two threads. The confusion eventually proved beneficial, but more on that later.

The first step is to get the code to build and run normally. You can omit this step if you want, but given that compilers usually generate better diagnostics than do the formal-verification tools, it is best to make full use of the compilers.

I first tried goto-cc, goto-instrument, and satabs [Slide 44 of PDF] and impara [Slide 52 of PDF], but both tools objected strenuously to my code. My copies of these two tools are a bit dated, so it is possible that these problems have since been fixed. However, I decided to download version 5 of cbmc, which is said to have gained multithreading support.

After converting my code to a logic expression with no fewer than 109,811 variables and 457,344 clauses, cbmc -I . -DRUN fake.c took a bit more than ten seconds to announce VERIFICATION SUCCESSFUL. But should I trust it? After all, I might have a bug in my scaffolding or there might be a bug in cbmc.

The usual way to check for this is to inject a bug and see if cbmc catches it. I chose to break up the RCU read-side critical section as follows:

 1 void rcu_reader(void)
 2 {
 3   rcu_read_lock();
 4   r1 = x; 
 5   rcu_read_unlock();
 6   cond_resched();
 7   rcu_read_lock();
 8   r2 = y; 
 9   rcu_read_unlock();
10 }

Why not remove thread_update()'s call to synchronize_rcu()? Take a look at Tiny RCU's implementation of synchronize_rcu() to see why not!

With this change enabled via #ifdef statements, “cbmc -I . -DRUN -DFORCE_FAILURE fake.c” took almost 20 seconds to find a counter-example in a logic expression with 185,627 variables and 815,691 clauses. Needless to say, I am glad that I didn't have to manipulate this logic expression by hand!

Because cbmc catches an injected bug and verifies the original code, we have some reason to hope that the VERIFICATION SUCCESSFUL was in fact legitimate. As far as I know, this is the first mechanical proof of the grace-period property of a Linux-kernel RCU implementation, though admittedly of a rather trivial implementation. On the other hand, a mechanical proof of some properties of the dyntick-idle counters came along for the ride, courtesy of the WARN_ON_ONCE() statements in the Linux-kernel source code. (Previously, researchers at Oxford mechanically validated the relationship between rcu_dereference() and rcu_assign_pointer(), taking the whole of Tree RCU as input, and researchers at MPI-SWS formally validated userspace RCU's grace-period guarantee—manually.)

As noted earlier, I had confused myself into thinking that cbmc did not handle pthread_mutex_lock(). I verified that cbmc handles the gcc atomic builtins, but it turns out to be impractical to build a lock for cbmc's use from atomics. The problem stems from the “b” for “bounded” in “cbmc”, which means cbmc cannot analyze the unbounded spin loops used in locking primitives.

However, cbmc does do the equivalent of a full state-space search, which means it will automatically model all possible combinations of lock-acquisition delays even in the absence of a spin loop. This suggests something like the following:

 1 if (__sync_fetch_and_add(&cpu_lock, 1))
 2   exit();

The idea is to exclude from consideration any executions where the lock cannot be immediately acquired, again relying on the fact that cbmc automatically models all possible combinations of delays that the spin loop might have otherwise produced, but without the need for an actual spin loop. This actually works, but my mis-modeling of dynticks fooled me into thinking that it did not. I therefore made lock-acquisition failure set a global variable and added this global variable to all assertions. When this failed, I had sufficient motivation to think, which caused me to find my dynticks mistake. Fixing this mistake fixed all three versions (locking, exit(), and flag).

The exit() and flag approaches result in exactly the same number of variables and clauses, which turns out to be quite a bit fewer than the locking approach:

Verification69,050 variables, 287,548 clauses (output)109,811 variables, 457,344 clauses (output)
Verification Forced Failure113,947 variables, 501,366 clauses (output)   185,627 variables, 815,691 clauses (output)

So locking increases the size of the logic expressions by quite a bit, but interestingly enough does not have much effect on verification time. Nevertheless, these three approaches show a few of the tricks that can be used to accomplish real work using formal verification.

The GPL-licensed source for the Tiny RCU validation may be found here. C-preprocessor macros select the various options, with -DRUN being necessary for both real runs and cbmc verification (as opposed to goto-cc or impara verification), -DCBMC forcing the atomic-and-flag substitute for locking, and -DFORCE_FAILURE forcing the failure case. For example, to run the failure case using the atomic-and-flag approach, use:


Possible next steps include verifying dynticks and interrupts, dynticks and NMIs, and of course use of call_rcu() in place of synchronize_rcu(). If you try these out, please let me know how it goes!

Verification Challenge 3: cbmc

The first and second verification challenges were directed to people working on verification tools, but this one is instead directed at developers.

It turns out that there are a number of verification tools that have seen heavy use. For example, I have written several times about Promela and spin (here, here, and here), which I have used from time to time over the past 20 years. However, this tool requires that you translate your code to Promela, which is not conducive to use of Promela for regression tests.

For those of use working in the Linux kernel, it would be nice to have a verification tool that operated directly on C source code. And there are tools that do just that, for example, the C Bounded Model Checker (cbmc). This tool, which is included in a number of Linux distributions, converts a C-language input file into a (possibly quite large) logic expression. This expression is constructed so that if any combination of variables causes the logic expression to evaluate to true, then (and only then) one of the assertions can be triggered. This logic expression is then passed to a SAT solver, and if this SAT solver finds a solution, then there is a set of inputs that can trigger the assertion. The cbmc tool is also capable of checking for array-bounds errors and some classes of pointer misuse.

Current versions of cbmc can handle some useful tasks. For example, suppose it was necessary to reverse the sense of the if condition in the following code fragment from Linux-kernel RCU:

 1   if (rnp->exp_tasks != NULL ||
 2       (rnp->gp_tasks != NULL &&
 3        rnp->boost_tasks == NULL &&
 4        rnp->qsmask == 0 &&
 5        ULONG_CMP_GE(jiffies, rnp->boost_time))) {
 6     if (rnp->exp_tasks == NULL) 
 7       rnp->boost_tasks = rnp->gp_tasks;
 8     /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
 9     t = rnp->boost_kthread_task;
10     if (t)   
11       rcu_wake_cond(t, rnp->boost_kthread_status);
12   } else {
13     rcu_initiate_boost_trace(rnp);
14     /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
15   }

This is a simple application of De Morgan's law, but an error-prone one, particularly if carried out in a distracting environment. Of course, to test a validation tool, it is best to feed it buggy code to see if it detects those known bugs. And applying De Morgan's law in a distracting environment is an excellent way to create bugs, as you can see below:

 1   if (rnp->exp_tasks == NULL &&
 2       (rnp->gp_tasks == NULL ||
 3        rnp->boost_tasks != NULL ||
 4        rnp->qsmask != 0 &&
 5        ULONG_CMP_LT(jiffies, rnp->boost_time))) {
 6     rcu_initiate_boost_trace(rnp);
 7     /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
 8   } else {
 9     if (rnp->exp_tasks == NULL) 
10       rnp->boost_tasks = rnp->gp_tasks;
11     /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
12     t = rnp->boost_kthread_task;
13     if (t)   
14       rcu_wake_cond(t, rnp->boost_kthread_status);
15   }

Of course, a full exhaustive test is infeasible, but structured testing would result in a manageable number of test cases. However, we can use cbmc to do the equivalent of a full exhaustive test, despite the fact that the number of combinations is on the order of two raised to the power 1,000. The approach is to create task_struct and rcu_node structures that contain only those fields that are used by this code fragment, but that also contain flags that indicate which functions were called and what their arguments were. This allows us to wrapper both the old and the new versions of the code fragment in their respective functions, and call them in sequence on different instances of identically initialized task_struct and rcu_node structures. These two calls are followed by an assertion that checks that the return value and the corresponding fields of the structures are identical.

This approach results in checkiftrans-1.c (raw C code here). Lines 5-8 show the abbreviated task_struct structure and lines 13-22 show the abbreviated rcu_node struButcture. Lines 10, 11, 24, and 25 show the instances. Lines 27-31 record a call to rcu_wake_cond() and lines 33-36 record a call to rcu_initiate_boost_trace().

Lines 38-49 initialize a task_struct/rcu_node structure pair. The rather unconventional use of the argv[] array works because cbmc assumes that this array contains random numbers. The old if statement is wrappered by do_old_if() on lines 51-71, while the new if statement is wrappered by do_new_if() on lines 73-93. The assertion is in check() on lines 95-107, and finally the main program is on lines 109-118.

Running cbmc checkiftrans-1.c gives this output, which prominently features VERIFICATION FAILED at the end of the file. On lines 4, 5, 12 and 13 of the file are complaints that neither ULONG_CMP_GE() nor ULONG_CMP_LT() are defined. Lacking definitions for these these two functions, cbmc seems to treat them as random-number generators, which could of course cause the two versions of the if statement to yield different results. This is easily fixed by adding the required definitions:

 1 #define ULONG_MAX         (~0UL)
 2 #define ULONG_CMP_GE(a, b)  (ULONG_MAX / 2 >= (a) - (b))
 3 #define ULONG_CMP_LT(a, b)  (ULONG_MAX / 2 < (a) - (b))

This results in checkiftrans-2.c (raw C code here). However, running cbmc checkiftrans-2.c gives this output, which still prominently features VERIFICATION FAILED at the end of the file. At least there are no longer any complaints about undefined functions!

It turns out that cbmc provides a counterexample in the form of a traceback. This traceback clearly shows that the two instances executed different code paths, and a closer examination of the two representations of the if statement show that I forgot to convert one of the && operators to a ||—that is, the “rnp->qsmask != 0 &&” on line 84 should instead be “rnp->qsmask != 0 ||”. Making this change results incheckiftrans-3.c (raw C code here). The inverted if statement is now as follows:

 1   if (rnp->exp_tasks == NULL &&
 2       (rnp->gp_tasks == NULL ||
 3        rnp->boost_tasks != NULL ||
 4        rnp->qsmask != 0 ||
 5        ULONG_CMP_LT(jiffies, rnp->boost_time))) {
 6     rcu_initiate_boost_trace(rnp);
 7     /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
 8   } else {
 9     if (rnp->exp_tasks == NULL) 
10       rnp->boost_tasks = rnp->gp_tasks;
11     /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
12     t = rnp->boost_kthread_task;
13     if (t)   
14       rcu_wake_cond(t, rnp->boost_kthread_status);
15   }

This time, running cbmc checkiftrans-3.c produces this output, which prominently features VERIFICATION SUCCESSFUL at the end of the file. Furthermore, this verification consumed only about 100 milliseconds on my aging laptop. And, even better, because it refused to verify the buggy version, we have at least some reason to believe it!

Of course, one can argue that doing such work carefully and in a quiet environment would eliminate the need for such verification, and 30 years ago I might have emphatically agreed with this argument. I have since learned that ideal work environments are not always as feasible as we might like to think, especially if there are small children (to say nothing of adult-sized children) in the vicinity. Besides which, human beings do make mistakes, even when working in ideal circumstances, and if we are to have reliable software, we need some way of catching these mistakes.

The canonical pattern for using cbmc in this way is as follows:

 1 retref = funcref(...);
 2 retnew = funcnew(...);
 3 assert(retref == retnew && ...);

The ... sequences represent any needed arguments to the calls and any needed comparisons of side effects within the assertion.

Of course, there are limitations:

  1. The “b” in cbmc stands for “bounded.” In particular, cbmc handles neither infinite loops nor infinite recursion. The --unwind and --depth arguments to cbmc allow you to control how much looping and recursion is analyzed. See the manual for more information.
  2. The SAT solvers used by cbmc have improved greatly over the past 25 years. In fact, where a 100-variable problem was at the edge of what could be handled in the 1990s, most ca-2015 solvers can handle more than a million variables. However, the NP-complete nature of SAT does occasionally make its presence known, for example, programs that reduce to a proof involving the pigeonhole principle are not handled well as of early 2015.
  3. Handling of concurrency is available in later versions of cbmc, but is not as mature as is the handling of single-threaded code.

All that aside, everything has its limitations, and cbmc's ease of use is quite impressive. I expect to continue to use it from time to time, and strongly recommend that you give it a try!
Although junk mail, puppies, and patches often are unwelcome, there are exceptions. For example, if someone has been wanting a particular breed of dog for some time, that person might be willing to accept a puppy, even if that means giving it shots, housebreaking it, teaching it the difference between furniture and food, doing bottlefeeding, watching over it day and night, and even putting up with some sleepless nights.

Similarly, if a patch fixes a difficult and elusive bug, the maintainer might be willing to apply the patch by hand, fix build errors and warnings, fix a few bugs in the patch itself, run a full set of tests, fix and style problems, and even accept the risk that the bug might have unexpected side effects, some of which might result in some sleepless nights. This in fact is one of the reasons for the common advice given to open-source newbies: start by fixing bugs.

Other good advice for new contributors can be found here:

  1. Greg Kroah-Hartman's HOWTO do Linux kernel development – take 2 (2005)
  2. Jonathan Corbet's How to Participate in the Linux Community (2008)
  3. Greg Kroah-Hartman's Write and Submit your first Linux kernel Patch (2010)
  4. My How to make a positive difference in a FOSS project (2012)
  5. Daniel Lezcano's What do we mean by working upstream: A long-term contributor’s view

This list is mostly about contributing to the Linux kernel, but most other projects have similar pages giving good new-contributor advice.
True confession: I was once a serial junk mailer. Not mere email spam, but physical bulk-postage-rate flyers, advertising a series of non-technical conferences. It was of course for a good cause, and one of the most difficult things about that task was convincing that cause's leaders that this flyer was in fact junk mail. They firmly believed that anyone with even a scrap of compassion would of course read the flyer from beginning to end, feeling the full emotional impact of each and every lovingly crafted word. They reluctantly came around to my view, which was that we had at most 300 milliseconds to catch the recipient's attention, that being the amount of time that the recipient might glance at the flyer on its way into the trash. Or at least I think that they came around to my view. All I really know is that they stopped disputing the point.

But junk mail for worthy causes is not the only thing that can be less welcome than its sender might like.

For example, Jim Wasko noticed a sign at a daycare center that read: “If you are late picking up your child and have not called us in advance, we will give him/her an espresso and a puppy. Have a great day.”

Which goes to show that although puppies are cute and lovable, and although their mother no doubt went to a lot of trouble to bring them into this world, they are, just like junk mail, not universally welcome. And this should not be too surprising, given the questions that come to mind when contemplating a free puppy. Has it had its shots? Is it housebroken? Has it learned that furniture is not food? Has it been spayed/neutered? Is it able to eat normal dogfood, or does it still require bottlefeeding? Is it willing to entertain itself for long periods? And, last, but most definitely not least, is it willing to let you sleep through the night?

Nevertheless, people are often surprised and bitterly disappointed when their offers of free puppies are rejected.

Other people are just as surprised and disappointed when their offers of free patches are rejected. After all, they put a lot of work into their patches, and they might even get into trouble if the patch isn't eventually accepted.

But it turns out that patches are a lot like junk mail and puppies. They are greatly valued by those who produce them, but often viewed with great suspicion by the maintainers receiving them. You see, the thought of accepting a free patch also raises questions. Does the patch apply cleanly? Does it build without errors and warnings? Does it run at all? Does it pass regression tests? Has it been tested with the commonly used combination of configuration parameters? Does the patch have good code style? Is the patch maintainable? Does the patch provide a straightforward and robust solution to whatever problem it is trying to solve? In short, will this patch allow the maintainer to sleep through the night?

I am extremely fortunate in that most of the RCU patches that I receive are “good puppies.” However, not everyone is so lucky, and I occasionally hear from patch submitters whose patches were not well received. They often have a long list of reasons why their patches should have been accepted, including:

  1. I put a lot of work into that patch, so it should have been accepted! Unfortunately, hard work on your part does not guarantee a perception of value on the maintainer's part.
  2. The maintainer's job is to accept patches. Maybe not, your maintainer might well be an unpaid volunteer.
  3. But my maintainer is paid to maintain! True, but he is probably not being paid to do your job.
  4. I am not asking him to do my job, but rather his/her job, which is to accept patches! The maintainer's job is not to accept any and all patches, but instead to accept good patches that further the project's mission.
  5. I really don't like your attitude! I put a lot of work into making this be a very good patch! It should have been accepted! Really? Did you make sure it applied cleanly? Did you follow the project's coding conventions? Did you make sure that it passed regression tests? Did you test it on the full set of platforms supported by the project? Does it avoid problems discussed on the project's mailing list? Did you promptly update your patch based on any feedback you might have received? Is your code maintainable? Is your code aligned with the project's development directions? Do you have a good reputation with the community? Do you have a track record of supporting your submissions? In other words, will your patch allow the maintainer to sleep through the night?
  6. But I don't have time to do all that! Then the maintainer doesn't have time to accept your patch. And most especially doesn't have time to deal with all the problems that your patch is likely to cause.

As a recovering proprietary programmer, I can assure you that things work a bit differently in the open-source world, so some adjustment is required. But participation in an open-source project can be very rewarding and worthwhile!