Why does json.Encoder add an extra line?
Question
json.Encoder seems to behave slightly differently from json.Marshal. Specifically, it adds a newline at the end of the encoded value. Any idea why that is? It looks like a bug to me.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

func main() {
	var v string
	v = "hello"
	buf := bytes.NewBuffer(nil)
	json.NewEncoder(buf).Encode(v)
	b, _ := json.Marshal(&v)
	fmt.Printf("%q, %q", buf.Bytes(), b)
}
This outputs:
"\"hello\"\n", "\"hello\""
Answer 1
Score: 14
Because Encoder.Encode explicitly appends a newline character. Here's the source code of that func; its doc comment (which is the documentation) actually states that it adds a newline:
https://golang.org/src/encoding/json/stream.go?s=4272:4319
// Encode writes the JSON encoding of v to the stream,
// followed by a newline character.
//
// See the documentation for Marshal for details about the
// conversion of Go values to JSON.
func (enc *Encoder) Encode(v interface{}) error {
	if enc.err != nil {
		return enc.err
	}
	e := newEncodeState()
	err := e.marshal(v)
	if err != nil {
		return err
	}

	// Terminate each value with a newline.
	// This makes the output look a little nicer
	// when debugging, and some kind of space
	// is required if the encoded value was a number,
	// so that the reader knows there aren't more
	// digits coming.
	e.WriteByte('\n')

	if _, err = enc.w.Write(e.Bytes()); err != nil {
		enc.err = err
	}
	encodeStatePool.Put(e)
	return err
}
Now, why did the Go developers do it, other than because it "makes the output look a little nicer"? One answer:
Streaming
The Go json Encoder is optimized for streaming (e.g. MB/GB/PB of JSON data). When streaming, you typically need a way to delimit where one value in the stream ends and the next begins. In the case of Encoder.Encode(), that delimiter is a \n newline character. Sure, you can certainly write to a buffer. But you can also write to any io.Writer, which streams each encoded v out as it is written.
This is in contrast to json.Marshal, which returns the whole encoding as a single in-memory []byte and is generally discouraged when the data comes from an untrusted source of unknown size (e.g. an ajax POST method to your web service: what if someone posts a 100MB JSON file?). Also, the result of json.Marshal is one final, complete JSON value; you wouldn't expect to concatenate a few hundred Marshal results together. For that you'd use Encoder.Encode() to build a large set and write it to a buffer, stream, file, or any other io.Writer.
Whenever I'm in doubt whether something is a bug, I always look up the source. That's one of the advantages of Go: its source and compiler are just pure Go. Within [n]vim I use \gb to open the source definition in a browser with my .vimrc settings.
Answer 2
Score: 0
The Encoder writes a stream of documents. The extra whitespace terminates a JSON document in the stream.
A terminator is required for stream readers. Consider a stream containing these JSON documents: 1, 2, 3. Without the extra whitespace, the data on the wire is the sequence of bytes 123. This is a single JSON document with the number 123, not three documents.
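The 1, 2, 3 case can be checked directly. The sketch below writes the three numbers once with Encoder.Encode and once by concatenating json.Marshal results with no separator, then decodes both byte sequences with a Decoder. The helper names decodeAll, withEncoder and withMarshal are made up for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// decodeAll reads every JSON value found in data and returns them as numbers.
func decodeAll(data []byte) ([]float64, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	var out []float64
	for {
		var n float64
		err := dec.Decode(&n)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, n)
	}
}

// withEncoder writes each number with Encoder.Encode, which adds a newline after each.
func withEncoder(nums ...int) []byte {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	for _, n := range nums {
		if err := enc.Encode(n); err != nil {
			panic(err)
		}
	}
	return buf.Bytes()
}

// withMarshal concatenates json.Marshal results with no separator between them.
func withMarshal(nums ...int) []byte {
	var buf bytes.Buffer
	for _, n := range nums {
		b, err := json.Marshal(n)
		if err != nil {
			panic(err)
		}
		buf.Write(b)
	}
	return buf.Bytes()
}

func main() {
	for _, data := range [][]byte{withEncoder(1, 2, 3), withMarshal(1, 2, 3)} {
		vals, err := decodeAll(data)
		fmt.Printf("%q -> %v (err: %v)\n", data, vals, err)
	}
}
```

The Encoder output "1\n2\n3\n" decodes as three values, while the concatenated Marshal output "123" decodes as the single number 123.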
Answer 3
Score: 0
You can erase the newline by seeking backward in the stream:
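f, _ := os.OpenFile(fname, ...)
encoder := json.NewEncoder(f)
encoder.Encode(v)
// Step back one byte from the current offset so the next write
// overwrites the trailing newline. This has no effect on where data
// lands if the file was opened with O_APPEND.
f.Seek(-1, io.SeekCurrent)
f.WriteString("other data ...")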
They should let the user control this strange behavior:
- a build option to disable it
- Encoder.SetEOF(eof string)
- Encoder.SetIndent(prefix, indent, eof string)