Return a pointer from RCU-protected list in LKM
Question
I'm a bit confused by code like this:

    struct a {
        // ...
        struct list_head some_list;
        // ...
    };

    struct e {
        struct list_head next;
        // ...
    };

    static void *traverse(struct a *a)
    {
        struct e *e;

        rcu_read_lock();
        list_for_each_entry_rcu(e, &a->some_list, next) {
            if (...) {
                rcu_read_unlock();
                return e;
            }
        }
        rcu_read_unlock();
        return NULL;
    }

In the function traverse() we enter an RCU read-side critical section with rcu_read_lock() and then iterate over a list until some condition is met. Once it is met, we call rcu_read_unlock() and return the pointer e.
What confuses me is that we leave the RCU read-side critical section but keep a pointer into the list. If the write side removes this element, it seems the pointer e would be left dangling, wouldn't it?
AFAIK, the pointer is valid only inside the read-side critical section, i.e. between rcu_read_lock() and rcu_read_unlock(). Am I wrong?
P.S.: traverse() is called without holding any locks.
Answer 1
Score: 1
Your assumption is right: the snippet you posted looks "broken". What you usually want to do in such a situation is something like the following:

    static void traverse(struct a *a, void (*callback)(struct e *))
    {
        struct e *e;

        rcu_read_lock();
        list_for_each_entry_rcu(e, &a->some_list, next) {
            if (...) {
                callback(e);
                break;
            }
        }
        rcu_read_unlock();
    }

This way, whatever operation you need to perform on e, the callback() function that uses it runs inside the read-side critical section and sees a consistent version of the list. This assumes, of course, that callback() does not save e somewhere to use it later; otherwise we're back at square one.
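As a rough illustration of the callback approach, here is a userspace mock-up rather than kernel code. Everything beyond the shape of traverse() is an assumption made for the example: rcu_read_lock() and rcu_read_unlock() are stubbed with a counter, the list is a plain singly linked list instead of struct list_head, and the id and value fields, the ctx parameter, struct lookup_result, copy_value() and lookup_value() are hypothetical. The point it tries to show is that the callback copies what it needs out of e while still inside the critical section, so nothing that points into the list escapes.

```c
#include <stddef.h>

/* Userspace stand-ins (assumption): a kernel module would use the real
 * RCU primitives instead of this counter. */
static int rcu_depth;                      /* > 0 while inside a read-side section */
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

struct e {
    struct e *next;   /* simplified link; the original uses struct list_head */
    int id;           /* hypothetical key used as the match condition */
    int value;        /* hypothetical payload */
};

struct a {
    struct e *head;
};

/* Hypothetical caller-owned storage filled in by the callback. */
struct lookup_result {
    int value;
    int seen_inside_section;
};

/* Runs the callback on the first matching element, inside the critical
 * section. Returns 1 if an element matched, 0 otherwise. */
static int traverse(struct a *a, int id,
                    void (*callback)(struct e *, void *), void *ctx)
{
    struct e *e;
    int found = 0;

    rcu_read_lock();
    for (e = a->head; e != NULL; e = e->next) {
        if (e->id == id) {
            callback(e, ctx);
            found = 1;
            break;
        }
    }
    rcu_read_unlock();
    return found;
}

/* Copies data out of e by value; it does not keep the pointer itself. */
static void copy_value(struct e *e, void *ctx)
{
    struct lookup_result *res = ctx;

    res->value = e->value;
    res->seen_inside_section = (rcu_depth > 0);
}

static int lookup_value(struct a *a, int id, struct lookup_result *res)
{
    return traverse(a, id, copy_value, res);
}
```

After lookup_value() returns, the caller only touches its own struct lookup_result, so it does not matter whether the element is later removed from the list.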
Doing return e; after rcu_read_unlock(); can cause trouble, as you noted in your question, but in theory it could still be fine depending on the exact scenario. Whether there is a problem depends only on what is done with e after it is returned.
For example, if e is simply checked in the caller with something like if (e != NULL) {...} and never dereferenced, then that'd be fine. Of course, one could argue that in such a case you could have just made the traverse() function return a bool :')
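Another scenario in which returning the pointer can be made safe is when the element carries a reference count and the reader takes a reference before leaving the critical section. Whether the structure in the question has such a count is not known, so the following is only a sketch under that assumption. It is again a userspace mock-up: the RCU calls are empty stubs, the list is a plain singly linked list, and the refs field, e_get_unless_zero(), e_put() and traverse_get() are hypothetical names. In kernel code the analogous helpers would be refcount_inc_not_zero() or kref_get_unless_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Empty userspace stubs (assumption): kernel code would use the real RCU API. */
static void rcu_read_lock(void)   { }
static void rcu_read_unlock(void) { }

struct e {
    struct e *next;    /* simplified link; the original uses struct list_head */
    int id;            /* hypothetical key used as the match condition */
    atomic_int refs;   /* hypothetical refcount; 0 means "being torn down" */
};

struct a {
    struct e *head;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool e_get_unless_zero(struct e *e)
{
    int old = atomic_load(&e->refs);

    while (old != 0) {
        /* On failure, old is refreshed with the current count and we retry. */
        if (atomic_compare_exchange_weak(&e->refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if this was the last one. In a real
 * module the last put would then free the element. */
static bool e_put(struct e *e)
{
    return atomic_fetch_sub(&e->refs, 1) == 1;
}

/* Returns the matching element with a reference held, or NULL. */
static struct e *traverse_get(struct a *a, int id)
{
    struct e *e, *found = NULL;

    rcu_read_lock();
    for (e = a->head; e != NULL; e = e->next) {
        if (e->id == id) {
            /* The reference is taken while still inside the section. */
            if (e_get_unless_zero(e))
                found = e;
            break;
        }
    }
    rcu_read_unlock();
    return found;
}
```

With this shape the caller owns a reference to the returned element and is expected to call e_put() when done. The element may still be unlinked from the list in the meantime, but it is not freed until the last reference is dropped.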