Don't read unneeded JSON key-values into memory

Question

I have a JSON file with a single field that takes a huge amount of space when loaded into memory. The other fields are reasonable, but I'm trying to take care not to load that particular field unless I absolutely have to.

{
    "Field1": "value1",
    "Field2": "value2",
    "Field3": "a very very long string that potentially takes a few GB of memory"
}

When reading that file into memory, I'd want to ignore Field3 (because loading it could crash my app). Here's some code that I would assume does that because it uses io streams rather than passing a []byte type to the Unmarshal command.

package main

import (
	"encoding/json"
	"os"
)

func main() {
	type MyStruct struct {
		Field1 string
		Field2 string
	}
	fi, err := os.Open("myJSONFile.json")
	if err != nil {
		os.Exit(2)
	}
	// create an instance and populate
	var mystruct MyStruct
	err = json.NewDecoder(fi).Decode(&mystruct)
	if err != nil {
		os.Exit(2)
	}
	// do some other stuff
}

The issue is that the built-in json.Decoder type reads the entire file into memory on Decode before throwing away key-values that don't match the struct's fields (as has been pointed out on StackOverflow before: link).

Are there any ways of decoding JSON in Go without keeping the entire JSON object in memory?

Answer 1

Score: 2

You could write a custom io.Reader that you feed to json.Decoder and that will pre-read your json file and skip that specific field.

The other option is to write your own decoder, which is more complicated and messier.

Edit: it seemed like a fun exercise, so here goes:

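package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"os"
)

// IgnoreField is an io.Reader that drops one string-valued key from a JSON
// stream before json.Decoder sees it, so the value is never buffered.
// It assumes a plain ASCII key name and matches that key at any nesting level.
type IgnoreField struct {
	src   *bufio.Reader
	Field string
	out   bytes.Buffer // filtered bytes not yet handed to the caller
	comma bool         // a ',' is held back until the next token is known
	inStr bool         // currently copying the inside of a string
	esc   bool         // previous string byte was a backslash
	err   error        // first error returned by the source
}

func NewIgnoreField(r io.Reader, field string) *IgnoreField {
	return &IgnoreField{src: bufio.NewReader(r), Field: field}
}

func (iF *IgnoreField) Read(p []byte) (int, error) {
	for iF.out.Len() < len(p) && iF.err == nil {
		iF.err = iF.step()
	}
	if iF.out.Len() > 0 {
		return iF.out.Read(p)
	}
	return 0, iF.err
}

func isSpace(b byte) bool { return b == ' ' || b == '\t' || b == '\r' || b == '\n' }

// step consumes one input byte, or one whole ignored field.
func (iF *IgnoreField) step() error {
	b, err := iF.src.ReadByte()
	if err != nil {
		return err
	}
	switch {
	case iF.inStr: // copy string contents through, tracking escapes
		iF.out.WriteByte(b)
		if iF.esc {
			iF.esc = false
		} else if b == '\\' {
			iF.esc = true
		} else if b == '"' {
			iF.inStr = false
		}
	case isSpace(b):
		iF.out.WriteByte(b)
	case b == ',':
		iF.comma = true // the next key may be the ignored one
	case b == '"' && iF.atIgnoredKey():
		return iF.skipValue()
	default:
		if iF.comma {
			iF.out.WriteByte(',')
			iF.comma = false
		}
		iF.out.WriteByte(b)
		iF.inStr = b == '"'
	}
	return nil
}

// atIgnoredKey reports whether the input continues with Field"<ws>: and, if
// so, consumes it. Peek keeps the lookahead non-destructive otherwise.
func (iF *IgnoreField) atIgnoredKey() bool {
	want := iF.Field + `"`
	for n := len(want); ; n++ {
		got, err := iF.src.Peek(n)
		if err != nil || string(got[:len(want)]) != want {
			return false
		}
		if c := got[n-1]; n > len(want) && !isSpace(c) {
			if c == ':' {
				iF.src.Discard(n)
			}
			return c == ':'
		}
	}
}

// skipValue discards the ignored field's string value without storing it,
// then removes one neighbouring comma so the object stays valid JSON.
func (iF *IgnoreField) skipValue() error {
	started, esc := false, false
	for done := false; !done; {
		b, err := iF.src.ReadByte()
		switch {
		case err != nil:
			return err
		case !started && b == '"':
			started = true
		case !started && !isSpace(b):
			return errors.New("IgnoreField: value of " + iF.Field + " is not a string")
		case !started: // whitespace before the opening quote
		case esc:
			esc = false
		case b == '\\':
			esc = true
		case b == '"':
			done = true
		}
	}
	if iF.comma { // drop the comma that preceded the field
		iF.comma = false
		return nil
	}
	for { // field came first: drop the comma that follows it, if any
		next, err := iF.src.Peek(1)
		if err != nil {
			return nil // the next step call reports the error
		}
		c := next[0]
		if !isSpace(c) && c != ',' {
			return nil
		}
		iF.src.Discard(1)
		if c == ',' {
			return nil
		}
	}
}

func main() {
	type MyStruct struct {
		Field1 string
		Field2 string
	}
	fi, err := os.Open("myJSONFile.json")
	if err != nil {
		fmt.Println(err)
		os.Exit(2)
	}
	defer fi.Close()
	// create an instance and populate
	var mystruct MyStruct
	err = json.NewDecoder(NewIgnoreField(fi, "Field3")).Decode(&mystruct)
	if err != nil {
		fmt.Println(err)
	}
	fmt.Println(mystruct)
}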


huangapple
  • Posted on July 29, 2014, 03:13:09
  • When reposting, please keep this link: https://go.coder-hub.com/25002575.html