Why does json.Encoder add an extra line?

Question

json.Encoder seems to behave slightly differently from json.Marshal. Specifically, it adds a newline at the end of the encoded value. Any idea why? It looks like a bug to me.

package main

import "fmt"
import "encoding/json"
import "bytes"

func main() {
	var v string
	v = "hello"
	buf := bytes.NewBuffer(nil)
	json.NewEncoder(buf).Encode(v)
	b, _ := json.Marshal(&v)

	fmt.Printf("%q, %q", buf.Bytes(), b)
}

This outputs:

"\"hello\"\n", "\"hello\""


Answer 1

Score: 14

Because Encoder.Encode explicitly appends a newline character. Here is the source of that function; its doc comment (which is the documentation) states that it adds a newline:

https://golang.org/src/encoding/json/stream.go?s=4272:4319

// Encode writes the JSON encoding of v to the stream,
// followed by a newline character.
//
// See the documentation for Marshal for details about the
// conversion of Go values to JSON.
func (enc *Encoder) Encode(v interface{}) error {
	if enc.err != nil {
		return enc.err
	}
	e := newEncodeState()
	err := e.marshal(v)
	if err != nil {
		return err
	}
	
	// Terminate each value with a newline.
	// This makes the output look a little nicer
	// when debugging, and some kind of space
	// is required if the encoded value was a number,
	// so that the reader knows there aren't more
	// digits coming.
	e.WriteByte('\n')

	if _, err = enc.w.Write(e.Bytes()); err != nil {
		enc.err = err
	}
	encodeStatePool.Put(e)
	return err
}
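
As a quick check of what that trailing WriteByte means in practice, here is a minimal sketch comparing the two APIs. The helper names viaEncoder and viaMarshal are made up for illustration, and the comparison assumes an Encoder with default settings (no SetIndent or SetEscapeHTML calls).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// viaEncoder returns exactly what Encoder.Encode writes for v.
// (Hypothetical helper, not part of encoding/json.)
func viaEncoder(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	err := json.NewEncoder(&buf).Encode(v)
	return buf.Bytes(), err
}

// viaMarshal returns json.Marshal's output with a newline appended by hand.
// (Hypothetical helper, not part of encoding/json.)
func viaMarshal(v interface{}) ([]byte, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

func main() {
	a, _ := viaEncoder("hello")
	b, _ := viaMarshal("hello")
	// With a default Encoder the two byte slices should be identical.
	fmt.Printf("%q %q %v\n", a, b, bytes.Equal(a, b))
}
```

So the only difference the question observes is the single '\n' byte at the end.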

Now, why did the Go developers do it, other than because it "makes the output look a little nicer"? One answer:

Streaming

The Go json Encoder is designed for streaming (e.g. MB/GB/PB of JSON data). When streaming, you typically need a way to delimit where each value in the stream ends. In the case of Encoder.Encode(), that delimiter is a \n newline character. Sure, you can write to a buffer. But you can also write to any io.Writer, which streams each encoded value v as it is written.

This contrasts with json.Marshal, which is generally discouraged if your input comes from an untrusted source of unknown size (e.g. an AJAX POST to your web service: what if someone posts a 100 MB JSON file?). Also, json.Marshal produces one final, complete JSON value; you wouldn't expect to concatenate a few hundred Marshal results together. For that you'd use Encoder.Encode() to build a large set and write it to a buffer, stream, file, or any other io.Writer.
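
The streaming use described above can be sketched roughly as follows. The helpers writeAll and encodeStream are hypothetical names introduced only for this illustration; a bytes.Buffer stands in for whatever io.Writer (file, socket, HTTP response) you would really use.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// writeAll encodes each value to w as its own newline-terminated JSON value.
// (Hypothetical helper for illustration.)
func writeAll(w io.Writer, values ...interface{}) error {
	enc := json.NewEncoder(w)
	for _, v := range values {
		if err := enc.Encode(v); err != nil {
			return err
		}
	}
	return nil
}

// encodeStream collects the stream in memory so it can be inspected.
// (Hypothetical helper for illustration.)
func encodeStream(values ...interface{}) (string, error) {
	var buf bytes.Buffer
	if err := writeAll(&buf, values...); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	s, err := encodeStream("hello", 42, map[string]int{"a": 1})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Printf("%q\n", s) // prints "\"hello\"\n42\n{\"a\":1}\n"
}
```

Each call to Encode writes one value followed by one newline, so the values stay separable no matter how many are written to the same writer.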

Whenever I'm in doubt about whether something is a bug, I look up the source. That's one of the advantages of Go: its source and compiler are pure Go. Within [n]vim I use \gb to open the source definition in a browser, via my .vimrc settings.

Answer 2

Score: 0

The Encoder writes a stream of documents. The extra whitespace terminates a JSON document in the stream.

A terminator is required for stream readers. Consider a stream containing these JSON documents: 1, 2, 3. Without the extra whitespace, the data on the wire is the sequence of bytes 123. This is a single JSON document with the number 123, not three documents.
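
The 1, 2, 3 case can be reproduced with json.Decoder. This is a small sketch; decodeInts is a hypothetical helper written for this illustration, and it assumes the stream holds only JSON numbers that fit in an int.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// decodeInts reads consecutive JSON numbers from s until the input ends.
// (Hypothetical helper for illustration.)
func decodeInts(s string) ([]int, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	var out []int
	for {
		var n int
		err := dec.Decode(&n)
		if err == io.EOF {
			return out, nil // clean end of stream
		}
		if err != nil {
			return out, err
		}
		out = append(out, n)
	}
}

func main() {
	joined, _ := decodeInts("123")          // no separators on the wire
	separated, _ := decodeInts("1\n2\n3\n") // what Encoder.Encode produces
	fmt.Println(joined, separated)          // prints [123] [1 2 3]
}
```

Without the newline, the reader has no way to tell where one number stops and the next begins.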

Answer 3

Score: 0

You can erase the newline by seeking the stream backward by one byte:

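f, _ := os.OpenFile(fname, ...)
encoder := json.NewEncoder(f)
encoder.Encode(v)
// Step back one byte from the current offset so the next write
// overwrites the trailing newline.
f.Seek(-1, io.SeekCurrent)
f.WriteString("other data ...")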

They should let the user control this strange behavior:

  • a build option to disable it
  • Encoder.SetEOF(eof string)
  • Encoder.SetIndent(prefix, indent, eof string)
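
When the destination is not seekable (a network connection or a pipe, for example), one alternative is to encode into memory first and strip the newline before writing. This is only a sketch under that assumption; encodeNoNewline is a hypothetical helper, not something encoding/json provides.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeNoNewline runs Encoder.Encode into an in-memory buffer and strips
// the single trailing newline that Encode appends.
// (Hypothetical helper for illustration.)
func encodeNoNewline(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return bytes.TrimSuffix(buf.Bytes(), []byte("\n")), nil
}

func main() {
	b, err := encodeNoNewline("hello")
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Printf("%q\n", b) // prints "\"hello\""
}
```

If you do not need any Encoder-specific settings, json.Marshal already returns the same bytes without the newline; going through an Encoder mainly makes sense when you also want options such as SetEscapeHTML or SetIndent.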

Posted by huangapple on 2016-03-31 05:16:54
Original link: https://go.coder-hub.com/36319918.html