gofmt preserving of newlines
Question
When gofmt formats Go source code, it preserves newlines so you can group items together. I'm interested in how this is actually implemented. I tried looking at the source code in the GitHub repo golang/go, but couldn't find it immediately. If you look at https://github.com/golang/go/blob/master/src/go/printer/printer.go#L979:
// intersperse extra newlines if present in the source
How does the printer know those extra newlines are present in the source?
Can someone point me in the right direction?
Answer 1
Score: 2
Unlike most lexers, the Go lexer (go/scanner) keeps information that a compiler's lexer usually discards. The stream of tokens it emits includes, among others, tokens for comments (when comment scanning is enabled) and for the semicolons implied by newlines. Whitespace itself is skipped rather than emitted as tokens, but every token carries a position from which its line number can be recovered. This allows the same front end to be used both to regenerate the source and to create structures required by the compiler, such as the AST.
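As a minimal sketch of that idea, the snippet below runs go/scanner over a small input and records the line each token starts on. The helper name tokenLines and the file name example.go are made up for illustration; the point is only that the gaps in the line numbers reveal where blank lines were in the source.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenLines (hypothetical helper) scans src and returns the source line
// on which each token starts, including comments and implied semicolons.
func tokenLines(src []byte) []int {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, src, nil, scanner.ScanComments)

	var lines []int
	for {
		pos, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		lines = append(lines, fset.Position(pos).Line)
	}
	return lines
}

func main() {
	src := []byte("package p\n\nvar a int\n\n\nvar b int\n")
	// No token starts on lines 2, 4 or 5: those are the blank lines.
	fmt.Println(tokenLines(src))
}
```

No token is reported for the blank lines themselves, so a consumer has to compare the line numbers of neighbouring tokens to notice them.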
Answer 2
Score: 1
gofmt works on the AST. When you look at https://golang.org/pkg/go/ast you'll see that every node has the methods Pos() and End(), which return the token.Pos of its beginning and end respectively. These are essentially offsets in the source file and as such know nothing about line numbers or line breaks.
But when combined with a token.FileSet, such a token.Pos can be converted into a token.Position, which includes the line number. gofmt does that in the function printer.go:lineFor().
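A small sketch of that conversion, using only go/parser and go/token: the helper declLines and the file name example.go are illustrative names, not part of gofmt.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// declLines (hypothetical helper) parses src and returns the line on which
// each top-level declaration starts, by resolving its token.Pos through
// the token.FileSet.
func declLines(src string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var lines []int
	for _, decl := range file.Decls {
		// decl.Pos() is only an offset-like value; the FileSet turns it
		// into a full Position with file name, line and column.
		lines = append(lines, fset.Position(decl.Pos()).Line)
	}
	return lines, nil
}

func main() {
	lines, err := declLines("package p\n\nvar a int\n\n\nvar b int\n")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(lines) // prints [3 6]
}
```

The jump from line 3 to line 6 is the only trace of the two blank lines between the declarations.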
The actual insertion of line breaks is done in nodes.go:linebreak(). The first argument to linebreak() is a line number obtained by calling the aforementioned lineFor() on the respective token.Pos. The function computes the difference between this line number and the line number of the last token that was printed (tracked in the pos field of struct printer). This tells it whether the token about to be printed is on the same line in the input file as the previous token. If it isn't, the programmer included one or more line breaks in the original source, and linebreak() will output at most one empty line. While it could preserve all input line breaks, gofmt's policy is to compress a series of empty lines down to a single empty line.
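The rule described above can be sketched roughly as follows. Note that newlinesBetween and the constant maxNewlines are simplified stand-ins written for this illustration, not the printer's actual code; the collapse helper, by contrast, calls the real go/format package so the effect can be observed directly.

```go
package main

import (
	"fmt"
	"go/format"
)

// maxNewlines is an assumed cap for this sketch: two newlines in a row
// correspond to one empty line in the output.
const maxNewlines = 2

// newlinesBetween (hypothetical) returns how many newlines to emit before
// a token on source line `line` when the previous token was on `prevLine`.
func newlinesBetween(prevLine, line int) int {
	n := line - prevLine
	if n < 0 {
		n = 0
	}
	if n > maxNewlines {
		n = maxNewlines
	}
	return n
}

// collapse formats src with go/format, which applies gofmt's formatting.
func collapse(src string) (string, error) {
	out, err := format.Source([]byte(src))
	return string(out), err
}

func main() {
	fmt.Println(newlinesBetween(3, 3)) // same line: 0 newlines
	fmt.Println(newlinesBetween(3, 4)) // next line: 1 newline
	fmt.Println(newlinesBetween(3, 7)) // three blank lines: capped at 2

	out, err := collapse("package p\n\nvar a int\n\n\n\nvar b int\n")
	if err != nil {
		fmt.Println("format error:", err)
		return
	}
	fmt.Printf("%q\n", out) // prints "package p\n\nvar a int\n\nvar b int\n"
}
```

The real linebreak() also takes a minimum number of breaks and whitespace hints, which this sketch leaves out.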
If the reason you're asking this question is that you want to customize gofmt, take a look at https://github.com/mbenkmann/goformat
Answer 3
Score: 0
In internal.go, at lines 40-41, there's this:
> // Insert using a ;, not a newline, so that the line numbers
> // in psrc match the ones in src.
And then this:
psrc := append([]byte("package p;"), src...)
file, err = parser.ParseFile(fset, filename, psrc, parserMode)
Is that what you are looking for, if I understood your question correctly?
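To see why the semicolon matters, here is a small sketch comparing the two possible prefixes. The helper firstDeclLine and the file name fragment.go are made up for this example; only go/parser and go/token are real APIs.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// firstDeclLine (hypothetical helper) parses prefix+src as a Go file and
// returns the line on which the first declaration starts.
func firstDeclLine(prefix, src string) (int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "fragment.go", prefix+src, 0)
	if err != nil {
		return 0, err
	}
	if len(file.Decls) == 0 {
		return 0, fmt.Errorf("no declarations found")
	}
	return fset.Position(file.Decls[0].Pos()).Line, nil
}

func main() {
	src := "var a int\n" // a fragment without a package clause

	withSemi, err := firstDeclLine("package p;", src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	withNewline, err := firstDeclLine("package p\n", src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(withSemi)    // prints 1: same line as in the fragment
	fmt.Println(withNewline) // prints 2: every line is shifted by one
}
```

With the semicolon prefix, line numbers reported for the parsed fragment still match the user's original input.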