Don't read unneeded JSON key-values into memory
Question
I have a JSON file with a single field that takes a huge amount of space when loaded into memory. The other fields are reasonable, but I'm trying to take care not to load that particular field unless I absolutely have to.
{
    "Field1": "value1",
    "Field2": "value2",
    "Field3": "a very very long string that potentially takes a few GB of memory"
}
When reading that file into memory, I'd want to ignore Field3 (because loading it could crash my app). Here's some code that I would assume does that because it uses io streams rather than passing a []byte type to the Unmarshal command.
package main

import (
	"encoding/json"
	"os"
)

func main() {
	type MyStruct struct {
		Field1 string
		Field2 string
	}
	fi, err := os.Open("myJSONFile.json")
	if err != nil {
		os.Exit(2)
	}
	// create an instance and populate
	var mystruct MyStruct
	err = json.NewDecoder(fi).Decode(&mystruct)
	if err != nil {
		os.Exit(2)
	}
	// do some other stuff
}
The issue is that the built-in json.Decoder type reads the entire file into memory on Decode before throwing away key-values that don't match the struct's fields (as has been pointed out on StackOverflow before: link).
Are there any ways of decoding JSON in Go without keeping the entire JSON object in memory?
Answer 1
Score: 2
You could write a custom io.Reader that you feed to json.Decoder; it pre-reads your JSON file and skips that specific field.
The other option is to write your own decoder, which is more complicated and messy.
Edit: it seemed like a fun exercise, so here goes:
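package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// IgnoreField wraps a reader that carries JSON and drops the content of the
// string value stored under Field, so the decoder sees "Field": "" instead.
// Limitations: the key is matched at any nesting depth, must not contain
// escape sequences, and only string values are dropped.
type IgnoreField struct {
	Field string

	src      *bufio.Reader
	cur      []byte // start of the string literal being scanned (capped)
	inString bool   // inside a string literal that is passed through
	skipping bool   // inside the string value that is being dropped
	escaped  bool   // the previous byte was an unconsumed backslash
	matched  bool   // the last complete string literal equals Field
	afterKey bool   // a ':' followed the matching key
}

func NewIgnoreField(r io.Reader, field string) *IgnoreField {
	return &IgnoreField{src: bufio.NewReader(r), Field: field}
}

// Read fills p with the filtered stream. Working byte by byte means a key or
// a value that straddles two underlying reads is still handled correctly.
func (f *IgnoreField) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) {
		c, err := f.src.ReadByte()
		if err != nil {
			return n, err
		}
		if f.keep(c) {
			p[n] = c
			n++
		}
	}
	return n, nil
}

// keep advances the scanner by one byte and reports whether to emit it.
func (f *IgnoreField) keep(c byte) bool {
	if f.skipping || f.inString {
		closing := !f.escaped && c == '"'
		f.escaped = !f.escaped && c == '\\'
		if f.skipping {
			if closing {
				f.skipping = false
			}
			return closing // emit only the closing quote
		}
		if closing {
			f.inString = false
			f.matched = string(f.cur) == f.Field
		} else if len(f.cur) <= len(f.Field) {
			f.cur = append(f.cur, c) // one extra byte is enough to detect a mismatch
		}
		return true
	}
	switch c {
	case '"': // a string literal starts; the opening quote is always emitted
		if f.afterKey {
			f.skipping = true
		} else {
			f.inString = true
			f.cur = f.cur[:0]
		}
		f.matched, f.afterKey = false, false
	case ' ', '\t', '\n', '\r': // whitespace does not change the state
	default:
		f.afterKey = f.matched && c == ':'
		f.matched = false
	}
	return true
}

func main() {
	type MyStruct struct {
		Field1 string
		Field2 string
	}
	fi, err := os.Open("myJSONFile.json")
	if err != nil {
		os.Exit(2)
	}
	defer fi.Close()
	// create an instance and populate
	var mystruct MyStruct
	err = json.NewDecoder(NewIgnoreField(fi, "Field3")).Decode(&mystruct)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(mystruct)
}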