C++: Adding a sleep causes deadlock

Question

<details>
<summary>Original question (English)</summary>

I have this code.

    std::thread t1(func1);
    std::this_thread::sleep_for( std::chrono::seconds( 2 ) );
    std::thread t2(func2);
    std::thread t3(func3);

    t1.join();
    t2.join();
    t3.join();

In this code, t1 takes a lock on mutex m1 and t2 waits to acquire the lock on m1. This works as expected.
If I add a 5-second sleep in t1, the program crashes with the following error:

    terminate called after throwing an instance of 'std::system_error'
    what(): Resource deadlock avoided
    Aborted (core dumped)


Here are the backtraces of all the threads - 

    (gdb) thread 1
    #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1 0x00007fa9eae827f1 in __GI_abort () at abort.c:79
    #2 0x00007fa9eb875957 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #3 0x00007fa9eb87bae6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #4 0x00007fa9eb87ab49 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #5 0x00007fa9eb87b4b8 in __gxx_personality_v0 () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #6 0x00007fa9eb243573 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
    #7 0x00007fa9eb243ad1 in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
    #8 0x00007fa9eb87bd47 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #9 0x00007fa9eb877a23 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #10 0x000055d2f9bbe02f in std::unique_lock<std::mutex>::lock (this=0x55d2fbe062f8) at /usr/include/c++/7/bits/std_mutex.h:264

    (gdb) thread 2
    #0 0x00007fa9ebd7ed2d in __GI___pthread_timedjoin_ex (threadid=140367767017216, thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
    #1 0x00007fa9eb8a6933 in std::thread::join() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #2 0x000055d2f9bb6e20 in main (argc=<optimized out>, argv=<optimized out>) at tests/benchmark-tests/benchmarkTest.cpp:294
    (gdb)

    (gdb) bt
    #0 0x00007fa9ebd87d50 in __GI___nanosleep (requested_time=requested_time@entry=0x7fa9eae40d30, remaining=remaining@entry=0x7fa9eae40d30) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
    #1 0x000055d2f9bbeaf5 in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/include/c++/7/thread:373

My questions are:
1. Why does this crash? The message mentions deadlock avoidance, but I can't see where the deadlock is.
2. Does C++ have built-in deadlock detection? Can it be disabled with a flag?

Added the code which reproduces the issue - 

    #include <iostream>
    #include <thread>
    #include <mutex>
    #include <unordered_map>

    class Worker {
        public:
            Worker() : workerLock(workerMutex, std::defer_lock) {}
            ~Worker() {
                workerLock.lock();
                workerLock.unlock();
            }
            void acquireWorkerLock() {
                workerLock.lock();
            }
            void releaseWorkerLock() {
                workerLock.unlock();
            }
            void work() {
                std::cout << "I'm working" << std::endl;
                std::this_thread::sleep_for(std::chrono::milliseconds(10000));
            }
        private:
            std::mutex workerMutex;
            std::unique_lock<std::mutex> workerLock;
    };

    class WorkerManager {
        public:
            void newWorker(std::string name) {
                const std::lock_guard<std::mutex> lock(managerMutex);

                bool workerAlreadyPresent = ( workers.find( name ) != workers.end() );
                if( !workerAlreadyPresent ) {
                    Worker *worker = new Worker();
                    workers.insert( {name, worker} );
                }
            }
            void work(std::string name) {
                std::unique_lock<std::mutex> lock(managerMutex);
                auto workerIt = workers.find( name );
                if( workerIt != workers.end() ) {
                    Worker *worker = workerIt->second;
                    worker->acquireWorkerLock();
                    lock.unlock();
                    worker->work();
                    worker->releaseWorkerLock();
                }
            }
            void removeWorker(std::string name) {
                std::unique_lock<std::mutex> lock(managerMutex);
                auto workerIt = workers.find( name );
                if( workerIt != workers.end() ) {
                    Worker *worker = workerIt->second;
                    workers.erase(workerIt);
                    lock.unlock();
                    delete worker;
                }
            }
        private:
            std::unordered_map<std::string, Worker*> workers;
            std::mutex managerMutex;
    };

    WorkerManager wm;

    void func1() {
        wm.work("Foo");
        std::cout << "Work done by worker Foo\n";
    }

    void func2() {
        wm.removeWorker("Foo");
        std::cout << "Worker removed Foo\n";
    }

    void func3() {
        wm.newWorker("Bar");
        std::cout << "Worker added Bar\n";
    }

    int main() {
        wm.newWorker("Foo");
        std::thread t1(func1);
        std::this_thread::sleep_for( std::chrono::seconds( 2 ) );
        std::thread t2(func2);
        std::thread t3(func3);

        t1.join();
        t2.join();
        t3.join();
        return 0;
    }

Backtrace for the above code - 

    (gdb) thread 1
    #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1 0x00007f79d03387f1 in __GI_abort () at abort.c:79
    #2 0x00007f79d0bac957 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #3 0x00007f79d0bb2ae6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #4 0x00007f79d0bb1b49 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #5 0x00007f79d0bb24b8 in __gxx_personality_v0 () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #6 0x00007f79d0918573 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
    #7 0x00007f79d0918ad1 in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
    #8 0x00007f79d0bb2d47 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #9 0x00007f79d0baea23 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #10 0x00005653edd0f690 in std::unique_lock<std::mutex>::lock (this=0x5653eeb65e98) at /usr/include/c++/7/bits/std_mutex.h:264
    #11 0x00005653edd0f048 in Worker::~Worker (this=0x5653eeb65e70, __in_chrg=<optimized out>) at test.cpp:10
    #12 0x00005653edd0f443 in WorkerManager::removeWorker (this=0x5653edf16140 <wm>, name="Foo") at test.cpp:57

    (gdb) thread 2
    #0 0x00007f79d06f1d2d in __GI___pthread_timedjoin_ex (threadid=140161156749056, thread_return=0x0, abstime=0x0, block=<optimized out>) at pthread_join_common.c:89
    #1 0x00007f79d0bdd933 in std::thread::join() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #2 0x00005653edd0eb0e in main () at test.cpp:89
    (gdb)

    (gdb) thread 3
    #0 0x00007f79d06fad50 in __GI___nanosleep (requested_time=0x7f79cff58cb0, remaining=0x7f79cff58cb0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
    #1 0x00005653edd0fb2d in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/include/c++/7/thread:373
    #2 0x00005653edd0f135 in Worker::work (this=0x5653eeb65e70) at test.cpp:21
    #3 0x00005653edd0f339 in WorkerManager::work (this=0x5653edf16140 <wm>, name="Foo") at test.cpp:46

</details>


# Answer 1
**Score**: 4

You cannot lock a mutex through the same `std::unique_lock` twice. In your case, while the work is in progress, the worker's mutex is already locked through its `workerLock`. If another thread then tries to remove the worker, the worker's destructor calls `lock` on that same unique lock. Quoting from [cppreference](https://en.cppreference.com/w/cpp/thread/unique_lock/lock):

> **Exceptions**
>
> If the mutex is already locked by this `unique_lock` (in other words, `owns_lock` is true), std::system_error...

IMO, a unique lock or any lock guard is not meant to be used outside of a single code scope (block).


Here is a proposed Worker that doesn't use a unique lock as a member variable:

    class Worker {
      public:
        ~Worker() {
          // Block until no other thread holds the mutex, then release it
          // again when this local guard goes out of scope.
          std::lock_guard<std::mutex> workerLock(workerMutex);
        }
        void acquireWorkerLock() {
          workerMutex.lock();
        }
        void releaseWorkerLock() {
          workerMutex.unlock();
        }
        void work() {
          std::cerr << "I'm working" << std::endl;
          std::this_thread::sleep_for(std::chrono::milliseconds(10000));
        }
      private:
        std::mutex workerMutex;
    };

This question might be relevant: "unique_lock across threads?". A short quote from the main answer, provided by Howard Hinnant:

> unique_lock should not be accessed from multiple threads at once. It was not designed to be thread-safe in that manner. Instead, multiple unique_locks (local variables) reference the same global mutex. Only the mutex itself is designed to be accessed by multiple threads at once.

huangapple
  • Posted on July 11, 2023, 14:47:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76659314.html