Returning a pointer from an RCU-protected list in a kernel module

Question

I'm a bit confused by code like this:

struct a {
    // ...
    struct list_head some_list;
    // ...
};

struct e {
    struct list_head next;
    // ... 
};

static void *traverse(struct a *a)
{
    struct e *e;
    
    rcu_read_lock();
    list_for_each_entry_rcu(e, &a->some_list, next) {
        if (...) {
            rcu_read_unlock();
            return e;
        }
    }
    rcu_read_unlock();
    return NULL;
}

In the function traverse we take rcu_read_lock() and then iterate over a list until some condition is met. Once it is met, we call rcu_read_unlock() and return the pointer e.

What confuses me is that we leave the RCU read-side critical section but keep a pointer into the list. What if the write side removes this element? It seems the pointer e would then be dangling, wouldn't it?

AFAIK, the pointer is only valid inside the read-side critical section, i.e. between rcu_read_lock() and rcu_read_unlock(). Am I wrong?

P.S.: traverse is called without holding any locks.

Answer 1

Score: 1

Your assumption is right: the snippet you posted looks "broken". What you usually want to do in such a situation is something like the following:

static void traverse(struct a *a, void (*callback)(struct e *))
{
    struct e *e;
    
    rcu_read_lock();
    list_for_each_entry_rcu(e, &a->some_list, next) {
        if (...) {
            callback(e);
            break;
        }
    }
    rcu_read_unlock();
}

This way, whatever operation you need to perform on e happens inside callback(), which runs within the read-side critical section and therefore sees a consistent version of the list (assuming, of course, that it does not save e somewhere for later use, otherwise we're back at square one).
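To make the callback idea concrete, here is a minimal userspace sketch of a callback that copies the fields it needs out by value, so nothing pointing into the list survives the read-side section. Everything beyond the shape of traverse is an assumption for illustration: the rcu_* functions are stubs that only track nesting depth (they give no real protection), the list is a simplified singly linked list rather than list_head, the lookup key replaces the elided condition, and struct snapshot, copy_out, lookup_value and the extra ctx argument are hypothetical names.

```c
/* Userspace model only: the rcu_* stubs stand in for the kernel
 * primitives and merely track nesting depth. */
#include <assert.h>
#include <stddef.h>

static int rcu_depth;   /* hypothetical: models read-side nesting */
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

struct e {
    struct e *next;     /* simplified list link instead of list_head */
    int key;
    int value;
};

struct snapshot {       /* hypothetical: plain copy of what the caller needs */
    int found;
    int value;
};

/* Runs inside the read-side section and copies data out by value. */
static void copy_out(struct e *e, void *ctx)
{
    struct snapshot *s = ctx;

    assert(rcu_depth > 0);  /* e may only be dereferenced here */
    s->found = 1;
    s->value = e->value;
}

static void traverse(struct e *head, int key,
                     void (*callback)(struct e *, void *), void *ctx)
{
    struct e *e;

    rcu_read_lock();
    for (e = head; e != NULL; e = e->next) {
        if (e->key == key) {
            callback(e, ctx);
            break;
        }
    }
    rcu_read_unlock();
}

/* The caller gets a copy, never a pointer into the list. */
static struct snapshot lookup_value(struct e *head, int key)
{
    struct snapshot s = { 0, 0 };

    traverse(head, key, copy_out, &s);
    return s;
}
```

The caller then works with the returned struct snapshot only, which stays valid no matter what a writer does to the list afterwards.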

Doing return e; after rcu_read_unlock(); can cause trouble, as you noted in your question, but it can still be fine depending on the exact scenario. Whether there is a problem depends only on what is done with e after it is returned.

For example, if the caller only checks e with something like if (e != NULL) {...} and never dereferences it, that is fine. Of course, one could argue that in that case traverse could simply return a bool :')
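If the caller really does need to keep using e after traverse returns, one common approach (not something the snippet in the question does) is to take a reference on the element before leaving the read-side section. In the kernel this is typically done with helpers such as kref_get_unless_zero() or refcount_inc_not_zero(), with the final free deferred via kfree_rcu() or call_rcu(). The sketch below is only a userspace model of that idea using C11 atomics. The rcu_* stubs, the simplified list, the key-based condition, and the names refs, e_get_unless_zero, e_put and find_get are all assumptions for illustration.

```c
/* Userspace model only: rcu_* are stubs that track nesting depth. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

struct e {
    struct e *next;     /* simplified list link instead of list_head */
    int key;
    atomic_int refs;    /* 0 means the element is being torn down */
};

/* Take a reference only if the count has not already dropped to zero. */
static int e_get_unless_zero(struct e *e)
{
    int old = atomic_load(&e->refs);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&e->refs, &old, old + 1))
            return 1;
    }
    return 0;
}

/* Drop a reference; returns 1 if this was the last one. */
static int e_put(struct e *e)
{
    int old = atomic_fetch_sub(&e->refs, 1);

    assert(old > 0);
    return old == 1;
}

/* Like traverse, but the returned pointer is pinned by a reference. */
static struct e *find_get(struct e *head, int key)
{
    struct e *e, *found = NULL;

    rcu_read_lock();
    for (e = head; e != NULL; e = e->next) {
        if (e->key == key && e_get_unless_zero(e)) {
            found = e;
            break;
        }
    }
    rcu_read_unlock();
    return found;
}
```

With this shape the caller owns a reference and must call e_put() when done. In real kernel code, the path that drops the last reference would also have to wait for a grace period before freeing the memory.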
