Slashes and dots in function names and prototypes?
Question
I'm new to C and looking at Go's source tree I found this:
https://code.google.com/p/go/source/browse/src/pkg/runtime/race.c
<pre>
void runtime∕race·Read(int32 goid, void *addr, void *pc);
void runtime∕race·Write(int32 goid, void *addr, void *pc);
void
runtime·raceinit(void)
{
    // ...
}
</pre>
What do the slashes and dots (·) mean? Is this valid C?
Answer 1
Score: 20
> IMPORTANT UPDATE:
>
> The ultimate answer is certainly the one you got from Russ Cox, one of Go's authors, on the golang-nuts mailing list. That said, I'm leaving some of my earlier notes below, as they might help to clarify a few things.
>
> Also, from reading the answer linked above, I believe the ∕ "pseudo-slash" may now be translated to a regular / slash too (just as the middot is translated to a dot) in newer versions of the Go C compiler than the one I tested below, but I don't have time to verify.
The file is compiled by the Go Language Suite's internal C compiler, which originates in the Plan 9 C compiler<sup>(1)(2)</sup> and differs from the C standard in some respects (mostly extensions, AFAIK).
One of the extensions is that it allows UTF-8 characters in identifiers.
Now, in the Go Language Suite's C compiler, the middot character (·) is treated specially: it is translated to a regular dot (.) in object files, which the Go Language Suite's internal linker interprets as the namespace separator character.
> ## Example ##
> For the following file example.c (note: it must be saved as UTF-8 without BOM):
> <pre>
> void ·Bar1() {}
> void foo·bar2() {}
> void foo∕baz·bar3() {}
> </pre>
> the internal C compiler produces the following symbols:
> <pre>
> $ go tool 8c example.c
> $ go tool nm example.8
> T "".Bar1
> T foo.bar2
> T foo∕baz.bar3
> </pre>
>
> Now, please note I've given ·Bar1() a capital B. This is because that way I can make it visible to regular Go code: it is translated to the exact same symbol as would result from compiling the following Go code:
> <pre>
> package example
> func Bar1() {} // nm will show: T "".Bar1
> </pre>
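As a rough illustration of the mapping shown above, here is a minimal Go sketch. It is not the compiler's actual code: the function name objectSymbol is hypothetical, and it only models the behaviour observed in the nm output above (the middot becomes a dot, a leading dot gets the "" placeholder namespace, and the ∕ character is left untouched).

```go
package main

import (
	"fmt"
	"strings"
)

// objectSymbol is a hypothetical model of how a C identifier appears as a
// symbol name in the object file. It mirrors only the nm output quoted above.
func objectSymbol(ident string) string {
	// The middot is translated to a regular dot.
	sym := strings.ReplaceAll(ident, "\u00b7", ".")
	// An identifier with nothing before the middot lands in the ""
	// placeholder namespace.
	if strings.HasPrefix(sym, ".") {
		sym = `""` + sym
	}
	return sym
}

func main() {
	for _, id := range []string{"·Bar1", "foo·bar2", "foo∕baz·bar3"} {
		fmt.Println("T", objectSymbol(id))
	}
}
```

A newer compiler that also translates ∕ to / would need one more replacement step; that behaviour is not verified here, so the sketch leaves it out.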
Now, regarding the functions you named in the question, the story goes further down the rabbit hole. I'm a bit less sure if I'm right here, but I'll try to explain based on what I know. Thus, each sentence below this point should be read as if it had "AFAIK" written just at the end.
So, the next missing piece needed to better understand this puzzle is to know something more about the strange "" namespace and how the Go suite's linker handles it. The "" namespace is what we might call an "empty" namespace (because "" means "an empty string" to a programmer), or maybe better, a "placeholder" namespace. When the linker sees an import like this:
<pre>
import examp "path/to/package/example"
//...
func main() {
    examp.Bar1()
}
</pre>
then it takes the $GOPATH/pkg/.../example.a library file and, during the import phase, substitutes on the fly each "" with path/to/package/example. So now, in the linked program, we will see a symbol like this:
<pre>
T path/to/package/example.Bar1
</pre>
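The placeholder substitution described above can be sketched in a few lines of Go. This is only an illustration of the idea, with the same AFAIK caveat: resolvePlaceholder is a hypothetical name, and the real linker works on object-file data rather than on plain strings.

```go
package main

import (
	"fmt"
	"strings"
)

// resolvePlaceholder is a hypothetical helper that replaces every ""
// placeholder namespace in a symbol name with the package's import path.
func resolvePlaceholder(symbol, importPath string) string {
	return strings.ReplaceAll(symbol, `""`, importPath)
}

func main() {
	// The symbol as stored in example.a, and the path used in the import.
	fmt.Println("T", resolvePlaceholder(`"".Bar1`, "path/to/package/example"))
}
```

Symbols that already carry an explicit namespace, such as foo.bar2, contain no placeholder and pass through unchanged.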
Answer 2
Score: 8
The "·" character is U+00B7 (\xB7) according to my JavaScript console.
The "∕" character is U+2215 (\u2215).
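The code points can be double-checked without a browser console. Here is a small Go sketch (the helper name describe is made up for this example) that prints the code point of each character and how many bytes it takes in UTF-8:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports the Unicode code point of r and its UTF-8 encoded length.
func describe(r rune) string {
	return fmt.Sprintf("%U (%d bytes in UTF-8)", r, utf8.RuneLen(r))
}

func main() {
	fmt.Println("middot:      ", describe('·'))
	fmt.Println("pseudo-slash:", describe('∕'))
}
```

Neither character is ASCII, which is why the source file has to be saved as UTF-8.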
The dot falls within Annex D of the C99 standard, which lists the special characters that are valid in identifiers in C source. The slash doesn't seem to, so I suspect it's used as something else (perhaps namespacing) via a #define or preprocessor magic.
That would explain why the dot is present in the actual function definition, but the slash is not.
Edit: Check This Answer for some additional information. It's possible that the Unicode slash is just allowed by GCC's implementation.
Answer 3
Score: 5
It appears this is not standard C, nor C99. In particular, both gcc and clang complain about the dot, even in C99 mode.
This source code is compiled by the Plan 9 compiler suite (in particular, ./pkg/tool/darwin_amd64/6c on OS X), which is bootstrapped by the Go build system. According to this document, bottom of page 8, Plan 9 and its compiler do not use ASCII at all, but use Unicode instead. At the bottom of page 9, it is stated that any character with a sufficiently high code point is considered valid for use in an identifier name.
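To make the identifier rule concrete, here is a minimal Go sketch of such a check. It is not taken from the Plan 9 sources: the names are hypothetical, and the threshold of 0x80 (everything outside ASCII) is an assumption about what "sufficiently high" means, so check the document for the exact value.

```go
package main

import "fmt"

// assumedThreshold is an assumption made for this sketch: every rune at or
// above it (everything outside ASCII) is treated like a letter.
const assumedThreshold = 0x80

// isIdentRune reports whether r may appear in an identifier; digits are not
// allowed in the first position.
func isIdentRune(r rune, first bool) bool {
	switch {
	case r == '_', 'a' <= r && r <= 'z', 'A' <= r && r <= 'Z':
		return true
	case '0' <= r && r <= '9':
		return !first
	default:
		return r >= assumedThreshold
	}
}

// isIdentifier applies isIdentRune to every rune of s.
func isIdentifier(s string) bool {
	if s == "" {
		return false
	}
	for i, r := range s {
		if !isIdentRune(r, i == 0) {
			return false
		}
	}
	return true
}

func main() {
	for _, name := range []string{"runtime·raceinit", "runtime∕race·Read", "runtime/race.Read"} {
		fmt.Printf("%-20s %v\n", name, isIdentifier(name))
	}
}
```

Under this assumption the middot and the ∕ character are accepted like letters, while the ASCII slash and dot in the last name are rejected.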
There's no preprocessing magic at all: the function definitions do not match the function declarations simply because those are different functions. For example, void runtime∕race·Initialize(); is an external function whose definition appears in ./src/pkg/runtime/race/race.go; likewise for void runtime∕race·MapShadow(…).
The function which appears later, void runtime·raceinit(void), is a completely different function, which is apparent from the fact that it actually calls runtime∕race·Initialize();.
Answer 4
Score: 3
The Go compiler/runtime is compiled using the C compilers originally developed for Plan 9. When you build Go from source, it first builds the Plan 9 compilers, then uses those to build Go.
The Plan 9 compilers support Unicode function names [1], and the Go developers use Unicode characters in their function names as pseudo-namespaces.
[1] It looks like this might actually be standards compliant: https://stackoverflow.com/questions/2681778/g-unicode-variable-name but gcc doesn't support Unicode function/variable names.