You are viewing paulmck

Previous Entry | Next Entry

Stupid RCU Tricks: Bug Hunt Number 1

RCU
The following code fragment contains a bug semi-similar to one recently found in the Linux kernel.

 

 

struct foo {
	struct list_head list;
	int key;
	int data;
};

LIST_HEAD(mylist);
DEFINE_SPINLOCK(mylock);
struct foo *cache;

int search(int key, int *data)
{
	struct foo *p;

	rcu_read_lock();
	p = rcu_dereference(cache);
	if (p != NULL && p->key == key)
		goto found;
	list_for_each_entry_rcu(p, &mylist, list)
		if (p->key == key) {
			rcu_assign_pointer(cache, p);
			goto found;
		}
	rcu_read_unlock();
	return -ENOENT;
found:
	*data = p->data;
	rcu_read_unlock();
	return 0;
}

int insert(int key, int data)
{
	struct foo *p = kmalloc(sizeof(*p), GFP_KERNEL);

	if (p == NULL)
		return -ENOMEM;
	p->key = key;
	p->data = data;
	spin_lock(&mylock);
	list_add_rcu(&p->list, &mylist);
	spin_unlock(&mylock);
}

int delete(int key)
{
	struct foo *p;

	spin_lock(&mylock);
	list_for_each_entry(p, &mylist, list)
		if (p->key == key) {
			list_del_rcu(&p->list);
			spin_unlock(&mylock);
			synchronize_rcu();
			kfree(p);
			return 0;
		}
	spin_unlock(&mylock);
	return -ENOENT;
}


Can you find it and fix it?

Comments

axboe
Jun. 6th, 2011 02:59 am (UTC)
Example a bit flawed
Paul, the example should be revised a bit to match the cfq-iosched cache problem. The issue was not forgetting to prune the one-hit cache on deletion. I wish it was, since then it would have been found much sooner! However, the one-hit cache was not properly protected. So when killing the item from the linked list and the radix tree, clearing the cache on match was not done inside the lock protecting the other structures.
paulmck
Jun. 6th, 2011 03:10 am (UTC)
Re: Example a bit flawed
Thank you for keeping me honest, Jens!

I will keep the current example, update the description, and pull yours out in a later posting.