Two macOS dylibs with Shared Static Library: Why Are Global Variables Shared?

Question

To my surprise, when I compile two separate dylibs that happen to share a static library, the global variables defined in that static library appear to be shared. This SO article seems to indicate that every dynamic library keeps its own copy of its global variables, but the test code included below shows that this is not true in the case described above. I am seeking confirmation that this is expected on macOS (and probably Linux as well).

Given this scenario:

  • static lib "Foo" has a global called "Bar"; this is an int initialized to 123.
  • dylib "AAA" links to "Foo"
  • dylib "BBB" links to "Foo"
  • application "MyApp" links to dylibs AAA and BBB.
  • application calls function in dylib "AAA" that modifies Bar, setting it to 111
  • application calls function in dylib "AAA" that prints Bar
  • application calls function in dylib "BBB" that prints Bar

I expected that when "AAA" printed "Bar" it would be 111, and that when "BBB" printed "Bar" it would still be 123. Instead, when "BBB" prints "Bar", it is 111, indicating that, from MyApp's point of view, there is only a single, shared instance of "Bar".

My suspicion is that since "Bar" is exposed by both "AAA" and "BBB", when you dynamically link the two dylibs, one of the two "wins" because the name is exactly the same and the linker can't distinguish the two.

This suspicion seems to be proved by setting the "-fvisibility=hidden" flag in the "Other C++ Flags" in Xcode. If I do this for the dylibs "AAA" and "BBB", then the two global variables seem to be distinct. I expect this is because 'visibility=hidden' hides the two copies of "Bar", thus resolving the conflict described in the paragraph above.

Can someone confirm my understanding of this?

--- SAMPLE CODE ---

The static library CGlobalTest has a C++ class as shown below. It provides three kinds of global: a static local variable inside a function, a static class member, and a file-scope global defined in the .cpp file. The function GetGlobal() returns a reference to one of these based on the GlobalType parameter.

CGlobalTest.h:

class CGlobalTest
{
public:
	CGlobalTest() { }
	
	static int&	GetFunctionGlobal()
				{
					static int sFunctionGlobal = 123;
					return sFunctionGlobal;
				}
	
	static int&	GetClassGlobal()
				{
					return sClassGlobal;
				}
	
	static int&	GetFileGlobal();
	
	static int&	GetGlobal(
					GlobalType	inType)
				{
					switch (inType) {
					case kFunctionGlobal:
						return GetFunctionGlobal();
						break;
					case kClassGlobal:
						return GetClassGlobal();
						break;
					case kFileGlobal:
						return GetFileGlobal();
						break;
					}
				}
	
	static int	sClassGlobal;
};

CGlobalTest.cpp:

#include "static_lib.h"

int	CGlobalTest::sClassGlobal = 456;

int sFileGlobal = 789;

int&
CGlobalTest::GetFileGlobal()
{
	return sFileGlobal;
}

I've then got two dynamic libraries that use the CGlobalTest static library, called global_test_dynamic_1 and global_test_dynamic_2. The code for 1 and 2 is essentially the same, so I'm including just the first one.

dynamic_lib_1.cpp:

#include "dynamic_lib_1.h"
#include "static_lib.h"
#include "stdio.h"

const char*
GlobalTypeToString(
	GlobalType	inType)
{
	const char* type = "";
	switch (inType) {
	case kFunctionGlobal:
		type = "Function Global";
		break;
	case kClassGlobal:
		type = "Class Global";
		break;
	case kFileGlobal:
		type = "File Global";
		break;
	}
	
	return type;
}

void dynamic_lib_1_set_global(enum GlobalType inType, int value)
{
	int& global = CGlobalTest::GetGlobal((GlobalType) inType);
	global = value;
	printf("Dynamic Lib 1: Set %s: %d (%p)\n", GlobalTypeToString(inType), global, &global);
}

void dynamic_lib_1_print_global(enum GlobalType inType)
{
	const int& global = CGlobalTest::GetGlobal((GlobalType) inType);
	printf("Dynamic Lib 1: %s = %d (%p)\n", GlobalTypeToString(inType), global, &global);
}

dynamic_lib_1.h

#ifdef __cplusplus
#define EXPORT extern "C" __attribute__((visibility("default")))
#else
#define EXPORT
#endif

#include "global_type.h"

EXPORT void dynamic_lib_1_set_global(enum GlobalType inType, int value);
EXPORT void dynamic_lib_1_print_global(enum GlobalType inType);

Finally, there is an application that links to the two dylibs.

#include "dynamic_lib_1.h"
#include "dynamic_lib_2.h"
#include "global_type.h"

#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <unistd.h>

typedef void (*print_func)(enum GlobalType inType);
typedef void (*set_func)(enum GlobalType inType, int value);

int main()
{
	printf("App is starting up...\n");

	// LOAD DYNAMIC LIBRARY 1

	void* handle1 = dlopen("libglobal_test_dynamic_1.dylib", RTLD_NOW);
	assert(handle1 != NULL);

	print_func d1_print = (print_func) dlsym(handle1, "dynamic_lib_1_print_global");
	assert(d1_print != NULL);
	
	set_func d1_set = (set_func) dlsym(handle1, "dynamic_lib_1_set_global");
	assert(d1_set != NULL);
	
	// LOAD DYNAMIC LIBRARY 2

	void* handle2 = dlopen("libglobal_test_dynamic_2.dylib", RTLD_NOW);
	assert(handle2 != NULL);	// check the second handle, not handle1 again

	print_func d2_print = (print_func) dlsym(handle2, "dynamic_lib_2_print_global");
	assert(d2_print != NULL);
	
	set_func d2_set = (set_func) dlsym(handle2, "dynamic_lib_2_set_global");
	assert(d2_set != NULL);
	
	enum GlobalType type;
	
	printf("**************************************************\n");
	printf("** FUNCTION GLOBAL\n");
	printf("**************************************************\n");
	
	type = kFunctionGlobal;
	
	(d1_print)(type);
	(d2_print)(type);
	
	printf("** SET D1 TO 111 - THEN PRINT FROM D2\n");
	d1_set(type, 111);
	d1_print(type);
	d2_print(type);

	printf("** SET D2 TO 222 - THEN PRINT FROM D1\n");
	d2_set(type, 222);
	d2_print(type);
	d1_print(type);

	printf("**************************************************\n");
	printf("** CLASS GLOBAL\n");
	printf("**************************************************\n");
	
	type = kClassGlobal;
	
	(d1_print)(type);
	(d2_print)(type);
	
	printf("** SET D1 TO 111 - THEN PRINT FROM D2\n");
	d1_set(type, 111);
	d1_print(type);
	d2_print(type);

	printf("** SET D2 TO 222 - THEN PRINT FROM D1\n");
	d2_set(type, 222);
	d2_print(type);
	d1_print(type);

	printf("**************************************************\n");
	printf("** FILE GLOBAL\n");
	printf("**************************************************\n");
	
	type = kFileGlobal;
	
	(d1_print)(type);
	(d2_print)(type);
	
	printf("** SET D1 TO 111 - THEN PRINT FROM D2\n");
	d1_set(type, 111);
	d1_print(type);
	d2_print(type);

	printf("** SET D2 TO 222 - THEN PRINT FROM D1\n");
	d2_set(type, 222);
	d2_print(type);
	d1_print(type);

	return 0;
}

Answer 1

Score: 2

This is a feature, not a bug. The symbol in each dylib is explicitly marked "weak-coalesce" (in addition to being exported):

% xcrun dyld_info -fixups libglobal_test_dynamic_1.dylib 
libglobal_test_dynamic_1.dylib [arm64]:
    -fixups:
        segment      section          address                 type   target
        __DATA_CONST __got            0x00004000              bind  weak-coalesce/__ZZN11CGlobalTest17GetFunctionGlobalEvE15sFunctionGlobal
        __DATA_CONST __got            0x00004008              bind  libSystem.B.dylib/_printf
        __DATA_CONST __const          0x00004010            rebase  0x00003F45
        __DATA_CONST __const          0x00004018            rebase  0x00003F55
        __DATA_CONST __const          0x00004020            rebase  0x00003F62

This makes the dynamic linker choose a single definition for all binaries using this symbol across the process.
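If you want to see which image ended up owning the coalesced variable, one option is to ask the dynamic loader about the addresses that the test code prints with %p. The following is only a sketch: ImageContaining and PrintOwner are hypothetical helper names that are not part of the question's code, and it assumes a platform that provides dladdr() in <dlfcn.h> (macOS does, as does glibc on Linux).

```cpp
#include <dlfcn.h>   // dladdr, Dl_info
#include <cstdio>    // printf

// Hypothetical helper: returns the path of the loaded image (executable or
// dylib) whose address range contains addr, or nullptr if the loader does
// not know the address.
const char* ImageContaining(const void* addr)
{
	Dl_info info;
	if (dladdr(addr, &info) == 0) {
		return nullptr;
	}
	return info.dli_fname;
}

// Hypothetical helper: prints an address together with its owning image.
void PrintOwner(const char* label, const void* addr)
{
	const char* image = ImageContaining(addr);
	printf("%s: %p lives in %s\n", label, addr, image ? image : "(unknown)");
}
```

Calling something like PrintOwner("Function Global", &global) from both dylibs should report the same image for a coalesced symbol, and two different images for a symbol that each dylib keeps privately.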

Note that this has to be done explicitly, though. Mach-O binaries use a two-level namespace by default, where each imported symbol is recorded with both its name and the library it is expected to be found in. On top of this, the library has to explicitly ask to import a symbol that it knows it is already exporting. With any ordinary symbol (like a basic function definition) that wouldn't happen; everything within that library would just use the local definition.

But the core issue here is inline functions containing static variables. The C++ standard has this to say ("dcl.inline/6"):

> A static local variable in an inline function with external linkage always refers to the same object.

This is precisely your case. If you move the definition of GetFunctionGlobal() out of the header and into your source file, then you'll get the exact same behaviour as with GetFileGlobal().

And it has to work this way, because otherwise static local variables within inline functions would just be broken, as every translation unit would get its own copy of that variable.
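As a rough single-file illustration of that rule (a sketch only: SharedCounter, AddressSeenByUnitA and AddressSeenByUnitB are made-up names, and the two "unit" functions merely stand in for separate translation units that would each include the same header):

```cpp
// Inline function with external linkage. The standard requires its static
// local to be one object for the whole program, however many translation
// units contain this definition.
inline int& SharedCounter()
{
	static int sCounter = 123;
	return sCounter;
}

// Stand-ins for code in two different translation units that both include
// the header defining SharedCounter().
int* AddressSeenByUnitA() { return &SharedCounter(); }
int* AddressSeenByUnitB() { return &SharedCounter(); }
```

Both functions have to return the same address, so a write through one is visible through the other. The weak-coalesce binding shown above is how that guarantee is carried across dylib boundaries.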

Now, as far as I know, dynamic linking is still very much implementation-defined to this day, so how this behaves with respect to dylibs isn't mandated by the standard. But again, if it weren't implemented this way, it would just be broken. Consider the case where you have a dynamic library that exports the CGlobalTest class, and a binary that imports it. Since your function definitions are still in the header, the binary could now either get its own copy of the static local variable (broken), or it could be aliased to that of the library (sane). And by linking your dylibs against the static library, you're doing exactly that: exporting it. If you don't want to export it, then that's what -fvisibility=hidden is for.
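If you only want to hide selected symbols rather than build everything with -fvisibility=hidden, GCC and Clang also accept a per-declaration visibility attribute. A minimal sketch, assuming one of those compilers; HIDDEN and GetPrivateGlobal are hypothetical names, not part of the question's code:

```cpp
// Assumption: GCC or Clang. The attribute syntax is compiler-specific.
#define HIDDEN __attribute__((visibility("hidden")))

// Marking the function hidden keeps it, and the static local inside it,
// out of the dylib's exported symbols, so the dynamic linker has nothing
// to coalesce with a same-named symbol from another dylib.
HIDDEN inline int& GetPrivateGlobal()
{
	static int sPrivateGlobal = 123;
	return sPrivateGlobal;
}
```

I would still check the result with dyld_info as above before relying on it; the expectation is that the weak-coalesce bind for this symbol no longer appears.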

huangapple
  • Posted on April 6, 2023 at 22:48:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75950871.html