Paul E. McKenney (paulmck) wrote,
Paul E. McKenney
paulmck

Will Your Rust Code Survive the Attack of the Zombie Pointers?

Some of the previous posts in this series have been said to be quite difficult, so I figured I owed you all an easy one. And the zombie-pointer problem really does have a trivial solution, at least in the context of the Linux kernel. In other environments, all bets are off.

But first, what on earth is a zombie pointer?

To answer that question, let's start with a last-in-first-out (LIFO) stack. We don't know when this algorithm was invented or who invented it, but a very similar algorithm was described in R. J. Treiber's classic 1986 technical report (IBM Almaden Research Center RJ 5118), and it was referred to as prior art in (expired) US Patent 3,886,525, which was filed in 1973 (hat trick to Maged Michael). Fun though it might be to analyze this code's original IBM assembly-language implementation, let's instead look at a C11 implementation:

 1 struct node_t* _Atomic top;
 2
 3 void list_push(value_t v)
 4 {
 5   struct node_t *newnode = (struct node_t *) malloc(sizeof(*newnode));
 6
 7   set_value(newnode, v);
 8   newnode->next = atomic_load(&top);
 9   do {
10     // newnode->next may have become invalid
11   } while (!atomic_compare_exchange_weak(&top, &newnode->next, newnode));
12 }
13
14 void list_pop_all()
15 {
16   struct node_t *p = atomic_exchange(&top, NULL);
17
18   while (p) {
19     struct node_t *next = p->next;
20      
21     foo(p);
22     free(p);
23     p = next;
24   }
25 }

Those used to the Linux-kernel cmpxchg() atomic operation should keep in mind that when the C11 atomic_compare_exchange_weak() fails, it writes the current value referenced by its first argument to the pointer passed as its second argument. In other words, instead of passing atomic_compare_exchange_weak() the new value, you instead pass it a pointer to the new value. Each time atomic_compare_exchange_weak() fails, it updates the pointed-to new value. This sometimes allows this function to be used in a tight loop, as in line 11 above.

With these atomic_compare_exchange_weak() semantics in mind, let's step through execution of list_push(C) on a stack initially containing objects A and B:

  1. Line 5 allocates memory for the new object C.
  2. Line 7 initializes this newly allocated memory.
  3. Line 8 sets the new object's ->next pointer to the current top-of-stack pointer.
  4. Line 11 invokes atomic_compare_exchange_weak(), which atomically compares the value of newnode->next to the value of top, and if they are equal, stores the pointer newnode into top and returns true. (As noted before, if the comparison is instead not-equal, atomic_compare_exchange_weak() instead stores the value of top into newnode->next and returns false, repeating until such time as the pointers compare equal.)
  5. One way or another, the stack ends up containing objects C, A, and B, ordered from the top of stack down.

Oddly enough, this code would still work even if line 8 were omitted, however, that would result in a high probability that the first call to atomic_compare_exchange_weak() would fail, which would not be so good for common-case performance. It would also result in compile-time complaints about uninitialized variables, so line 8 is doubly unlikely to be omitted in real life.

While we are at it, let's step through list_pop_all():

  1. Line 16 atomically stores NULL into top and returns its previous value. After this lines executes, the stack is empty and its previous contents are referenced by local variable p.
  2. Lines 18-24 execute for each object on list p:
    1. Line 19 prefetches a pointer to the next object.
    2. Line 21 passes the current object to function foo().
    3. Line 22 frees the current object.
    4. Line 23 advances to the next object.

Thus far, we have seen no zombie pointers. Let's therefore consider the following sequence of events, with the stack initially containing objects A and B:

  1. Thread 1 starts pushing object C, but is interrupted, preempted, otherwise impeded just after executing line 8.
  2. Thread 2 pops the entire list and frees all of the objects. Object C's ->next pointer is now invalid because it points to now-freed memory formerly known as object A.
  3. Thread 2 pushes an object, and happens to allocate memory for it at the same address as the old object A, so let's call it A'.
  4. Object C's ->next pointer is still invalid as far as an omniscient compiler is concerned, but from an assembly language viewpoint happens to reference the type-compatible object A'. This pointer is now a "zombie pointer" that has come back from the dead, or that has at least come to have a more entertaining form of invalidity.
  5. Thread 1 resumes and executes the atomic_compare_exchange_weak() on line 11. Now this primitive, like its Linux-kernel counterpart cmpxchg(), operates not on the compiler's idea of what the pointer is, but rather on the actual bits making up that pointer. Therefore, that atomic_compare_exchange_weak() succeeds despite the fact that the object C's ->next pointer is invalid.
  6. Thread 1 has thus completed its push of object C, and the resulting stack looks great from an assembly-language viewpoint. But the pointer from object C to A' is a zombie pointer, and compilers are therefore within their rights to do arbitrarily strange things with it.

As noted earlier, this has a trivial solution within the Linux kernel. Please do not peek too early. ;-)

But what can be done outside of the Linux kernel?

The traditional solution has been to hide the allocator from the compiler. After all, if the compiler cannot see the calls to malloc() and free(), it cannot figure out that a given pointer is invalid (or, in C-standard parlance, indeterminate). Creating your own allocator with its own function names suffices, and back in the day you often needed to create your own allocator if performance and scalability was important to you.

However, this would also deprive Rust of information that it needs to check pointer operations. So maybe Rust needs to know about allocation and deallocation, but needs to be very careful not to pass this information on to the optimizer. Assuming that this even makes sense in the context of the implementation of Rust.

Perhaps code working with invalid and zombie pointers needs to be carefully written in C. And perhaps there is some way to write it safely in unsafe Rust.

The final approach is to wait until backend support for safe handling of invalid/zombie pointers arrives, and then add Rust syntax as appropriate. Here is some documentation of current efforts towards adding such support to C++:

  1. CPPCON 2020 presentation
  2. Pointer lifetime-end zap (informational/historical)
  3. Pointer lifetime-end zap proposed solutions

So there is good progress, but it might still be some time before this is supported by either the standard or by compilers.

History

October 6, 2021: Expand description per feedback from Jonathan Corbet
October 12, 2021: Self-review.
Tags: linux, lkmm, rust
Subscribe

Recent Posts from This Journal

  • Post a new comment

    Error

    default userpic

    Your reply will be screened

    Your IP address will be recorded 

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.
  • 2 comments