Are global variables actually global in their RAM share behavior?

Question

In a multithreaded application, does each thread have its own copy of a given global variable for performance reasons, or do all threads read from and write to the same version?

If the answer is the former, does a mutex Lock() followed by Unlock() guarantee that the data is copied into the global variable?

The language is C++. The compiler is the default Visual Studio compiler. The OS is Windows. The target CPU is a modern one with 5+ cores.

Many comments suggest that, unless otherwise specified, a global variable has only a single copy throughout the running application instance. Is this code totally safe, with no need for any synchronization or other data-access protection mechanism?

(C++ pseudo-code. It isn't best practice; I am just trying to make sure of certain concepts as part of an inquiry.)

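	// In MyApplication.cpp
	#include <cstdio>

	int Variable1 = 0;             // plain global: written by thread A, read by thread B
	bool ThreadAFinished = false;  // plain global used as a "done" flag, no synchronization

	void ThreadAFunction()
	{
		while (true)
		{
			Variable1++;

			if (Variable1 == 1000)
			{
				break;
			}
		}
		ThreadAFinished = true;
	}

	void ThreadBFunction()
	{
		// Spins until thread A sets the flag; both globals are accessed without any protection.
		while (true)
		{
			if (ThreadAFinished)
			{
				if (Variable1 < 1000)
				{
					printf("ERROR");
				}
				else
				{
					printf("TEST PASSED");
				}
				break;
			}
		}
	}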

Answer 1

Score: 2

Others have already commented: you can rely on there being just one copy, accessed by all threads, as long as you don't explicitly make the variable thread-local.
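
To make the distinction concrete, here is a minimal sketch contrasting an ordinary global with a thread_local one. The names shared_counter, per_thread_counter, Observed and run_worker_and_observe are hypothetical and not from the question; the point is only that the worker's write to the ordinary global is visible to the caller after join(), while its write to the thread_local variable is not.

```cpp
#include <thread>

int shared_counter = 0;                   // one object, shared by every thread
thread_local int per_thread_counter = 0;  // a separate object for each thread

// Hypothetical result type: what the calling thread sees afterwards.
struct Observed {
    int shared;
    int per_thread;
};

// Hypothetical helper: a worker thread bumps both counters, then the
// calling thread reports its own view of them.
Observed run_worker_and_observe() {
    std::thread worker([] {
        shared_counter += 1;      // modifies the single shared object
        per_thread_counter += 1;  // modifies only the worker's private copy
    });
    worker.join();  // join() synchronizes, so the read below is race-free
    return {shared_counter, per_thread_counter};
}
```

Called once from the main thread, this should report shared == 1 and per_thread == 0, because the increment of per_thread_counter happened on the worker's own instance.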

As for thread race problems, be aware that it's not only the compiler reordering instructions that will trick you, but also memory and register caching. Do not expect the volatile modifier to solve those problems; it does not guarantee proper behavior in all situations either.

There are good (short and understandable) articles about memory fences / memory barriers on Wikipedia, and the standard library also has facilities available for you: search for "Atomic operations library".
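
As an illustration of what the atomic operations library offers, the sketch below publishes a plain int through an atomic flag using release/acquire ordering. The names payload, ready and publish_and_consume are assumptions made for this example; it is one possible way to get the ordering guarantee, not the only one.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                 // plain data, published through the flag
std::atomic<bool> ready{false};  // synchronization flag

// Hypothetical helper: returns the value the consumer thread observed.
int publish_and_consume() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });
    std::thread consumer([&seen] {
        // 3. the acquire load pairs with the release store above
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // 4. the write from step 1 is visible here
    });

    producer.join();
    consumer.join();
    return seen;
}
```

The release store keeps the write to payload from being reordered after the flag update, and the acquire load keeps the read of payload from being reordered before the flag check, which is the kind of guarantee volatile does not give you in standard C++.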


Answer 2

Score: 1

There can be multiple copies of the same global variable. Each processor has its own data cache, and the cache will typically hold a separate copy of whatever data the processor is using.

That's one of the reasons why you have to provide synchronization between threads that access the same data; one thread can write to the data in its local cache, and another thread reads the "same" data, either from its own local cache or from memory, and doesn't see the change that the other thread made.

The code in the question has a data race: one thread updates the global variables while another thread reads them, with nothing ordering the two. The behavior of the program is undefined. The simplest fix is to change the types of the two global variables to make them atomic:

#include <atomic>
std::atomic<int> Variable1{0};
std::atomic<bool> ThreadAFinished{false};

Now the compiler (well, the runtime library) will provide the appropriate code to ensure that caches get flushed and reloaded appropriately, so that the two global variables act like there is only one copy of each.
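
For completeness, here is a sketch of the question's program with that change applied. ThreadBFunction is adapted to return a bool instead of printing, and RunTest is a hypothetical driver added for this example; neither is part of the original post.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> Variable1{0};
std::atomic<bool> ThreadAFinished{false};

void ThreadAFunction() {
    while (true) {
        ++Variable1;  // atomic read-modify-write
        if (Variable1 == 1000) {  // atomic load
            break;
        }
    }
    ThreadAFinished = true;  // sequentially consistent store
}

// Returns true when the check from the question passes.
bool ThreadBFunction() {
    while (!ThreadAFinished) {  // atomic load on every iteration
        std::this_thread::yield();
    }
    return Variable1 >= 1000;
}

// Hypothetical driver: resets the globals, runs both threads, reports the result.
bool RunTest() {
    Variable1 = 0;
    ThreadAFinished = false;
    bool passed = false;
    std::thread a(ThreadAFunction);
    std::thread b([&passed] { passed = ThreadBFunction(); });
    a.join();
    b.join();
    return passed;
}
```

With the default sequentially consistent ordering, once thread B observes ThreadAFinished as true it is also guaranteed to observe the final value of Variable1, so RunTest() should return true.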

Answer 3

Score: 1

> ...does each thread have its own 'copy' of the global variable...

Yes, but no.

"Yes" because, at a low level that lies beneath anything to do with the C++ language, updates to a variable by different threads can be cached in different memory locations.

"No" because "cache" is not part of the explanation of how the C++ language works. It is not part of the language's memory model.

The way the memory model explains it, there is one and only one copy of any given global variable. But when different threads update global variables with no explicit synchronization, the threads are allowed to disagree on the order in which updates to the different variables happened. That disagreement can allow threads to see inconsistent, sometimes corrupt, views of shared data, but it also allows the system to make the most efficient use of the hardware caches and get the best performance when threads are accessing their private data.

> does a mutex Lock(), followed by Unlock() guarantee copying the data into the global variable proper?

Locking and unlocking mutexes is one kind of explicit synchronization that a program can use to ensure that different threads see consistent views of the same shared data structure.

If you have some shared, global structure, and if every thread locks the same shared, global mutex whenever it accesses the structure, then the structure is safe. What "safe" means is that every time a thread locks the mutex, it will see the data structure in exactly the state that some other thread left it in just before that thread released the mutex.

If any thread accesses the structure without locking the mutex, though, even if it's a read-only access, some thread could see the structure in an inconsistent state, because different parts of what it sees could come from different levels of "cache."
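
The rule above can be sketched with std::mutex and std::lock_guard. The Account structure, its invariant (the two balances always sum to 100) and the helper names are assumptions made up for this example; the point is that both the writer and the reader take the same mutex, so the reader never observes a half-finished transfer.

```cpp
#include <mutex>
#include <thread>

// Hypothetical shared structure; invariant: balance_a + balance_b == 100.
struct Account {
    int balance_a = 100;
    int balance_b = 0;
};

Account account;           // the shared, global structure
std::mutex account_mutex;  // the one mutex every access goes through

void transfer_one() {
    std::lock_guard<std::mutex> lock(account_mutex);
    --account.balance_a;  // the invariant is briefly broken here...
    ++account.balance_b;  // ...and restored before the lock is released
}

int read_total() {
    std::lock_guard<std::mutex> lock(account_mutex);  // reads lock too
    return account.balance_a + account.balance_b;
}

// Returns true if every total the reader observed was 100.
bool run_transfers() {
    bool consistent = true;
    std::thread writer([] {
        for (int i = 0; i < 100; ++i) transfer_one();
    });
    std::thread reader([&consistent] {
        for (int i = 0; i < 100; ++i) {
            if (read_total() != 100) consistent = false;
        }
    });
    writer.join();
    reader.join();
    return consistent;
}
```

If read_total() skipped the lock, the program would have a data race and the reader could in principle see a total of 99, which is exactly the inconsistent view described above.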

huangapple
  • Posted on April 19, 2023 at 19:28:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76053960.html