You are viewing paulmck

Previous Entry | Next Entry

Stupid RCU Tricks: RCU-Protected Deletion

RCU
Consider the following code fragment:

 

 

  1 rcu_read_lock();
  2 p = rcu_dereference(gp);
  3 if (must_delete(p)) {
  4   spin_lock(&my_lock);
  5   if (must_delete(p)) {
  6     kfree_rcu(p);
  7     rcu_assign_pointer(gp, q);
  8     already_deleted(p);
  9   }
 10   spin_unlock(&my_lock);
 11 } else {
 12   do_something_with(p);
 13 }
 14 rcu_read_unlock();

Note that the whole fragment is enclosed in an RCU read-side critical section. Line 2 picks up a local pointer to the global RCU-protected pointer gp, which we assume is never NULL. Line 3 checks to see if it is necessary to delete the structure referenced by gp, and, if so, lines 4-10 do the deleting. Otherwise, line 12 does something with the just-accessed structure.

If the structure must be deleted, we first acquire a spinlock on line 4, and then line 5 re-checks whether it is still necessary to delete the structure, but this time under the lock. If the structure still needs to be deleted, then line 6 frees it (but only after an RCU grace period has elapsed) and line 7 makes gp point to some replacement structure q. Clearly this replacement structure must have been allocated somehow, but the details of exactly how that happens are not important to this example. Next, line 8 marks the old structure as having already been deleted, which resolves races when multiple CPUs concurrently notice that the structure needs deleting, and is also the reason that line 5 rechecks under the lock. Finally, line 10 releases the spinlock.

This code fragment is a bit unconventional: Normally lines 6 and 7 would be interchanged. But the RCU read-side critical section will prevent the structure from being freed until after the global pointer is updated, right? Why or why not?

Comments

( 9 comments )
(Anonymous)
Jul. 29th, 2011 11:54 pm (UTC)
Note that this also means you can never synchronize_rcu() inside that spinlock, without deadlocking.
paulmck
Jul. 30th, 2011 04:37 am (UTC)
Indeed!
The general rule says that you are not permitted to wait for a grace period within an RCU read-side critical section. Acquiring a lock that is held across a grace period is in fact one way to wait for a grace period, so acquiring such a lock within an RCU read-side critical section is indeed one way to violate this general rule.

Of course, the consequences of violating the rule depend on the RCU implementation. The PREEMPT_RCU implementations within the Linux kernel would deadlock as you say -- but the kernel would also complain about scheduling within atomic in this case. The !PREEMPT_RCU implementations would silently split the RCU read-side critical section across the lock acquisition if the lock acquisition blocked -- but you would still get a complaint about scheduling within atomic.

If I remember correctly, the user-level RCU implementations would deadlock without complaint, so perhaps this is the situation you were thinking of.
(Anonymous)
Aug. 1st, 2011 01:32 pm (UTC)
Hi I think you're wrong
Hi From these findings?
paulmck
Aug. 2nd, 2011 06:01 am (UTC)
Re: Hi I think you're wrong
I might well be wrong. But would you happen to have a particular line of reasoning demonstrating that?
(Anonymous)
Aug. 2nd, 2011 04:43 pm (UTC)
Hello I think you're wrong
Hi I think you're wrong. I'm sure. I can prove it.
paulmck
Aug. 2nd, 2011 05:06 pm (UTC)
Re: Hello I think you're wrong
Feel free to attempt to do so. :-)
Alex Nikiforov
Mar. 20th, 2012 09:03 am (UTC)
maybe I'm late, but...
Hi, maybe I'm late with this topic, but I have a question.

I'm really sorry, but I dont quite understand answer on this quiz.
You wrote

The moral of this story: Always prevent readers from obtaining new references to the structure before starting the grace period. Always!

Is it means that at first one must hold spin_lock() and then rcu_dereference() bcoz rcu_dereference() will starts grace period or I miss something.

Edited at 2012-03-20 09:55 am (UTC)
paulmck
Mar. 31st, 2012 05:25 pm (UTC)
No read-side spinlocks, just careful updates
The moral is that lines 6 and 7 must be interchanged: You must always always always remove any read-side access to the data element that is to be deleted (the rcu_assign_pointer() statement on line 7) before starting the grace period (the kfree_rcu() on line 6). That way, any reader that acquires a reference to the data item is guaranteed to be waited on by the corresponding grace period.
Alex Nikiforov
Apr. 3rd, 2012 07:04 am (UTC)
Re: No read-side spinlocks, just careful updates
Thx Paul
( 9 comments )