Don't read unneeded JSON key-values into memory

Question

I have a JSON file with a single field that takes a huge amount of space when loaded into memory. The other fields are reasonable, but I'm trying to take care not to load that particular field unless I absolutely have to.

    {
        "Field1": "value1",
        "Field2": "value2",
        "Field3": "a very very long string that potentially takes a few GB of memory"
    }

When reading that file into memory, I'd want to ignore Field3 (because loading it could crash my app). Here's some code that I assumed would do that, because it uses an io stream rather than passing a []byte to json.Unmarshal.

    package main

    import (
        "encoding/json"
        "os"
    )

    func main() {
        type MyStruct struct {
            Field1 string
            Field2 string
        }
        fi, err := os.Open("myJSONFile.json")
        if err != nil {
            os.Exit(2)
        }
        // create an instance and populate
        var mystruct MyStruct
        err = json.NewDecoder(fi).Decode(&mystruct)
        if err != nil {
            os.Exit(2)
        }
        // do some other stuff
    }

The issue is that the built-in json.Decoder type reads the entire JSON value (here, the whole file) into memory on Decode before throwing away the key-values that don't match the struct's fields (as has been pointed out on StackOverflow before).

Are there any ways of decoding JSON in Go without keeping the entire JSON object in memory?

Answer 1

Score: 2

You could write a custom io.Reader and feed it to json.Decoder; it pre-reads your JSON file and skips that specific field before the decoder ever sees it.

The other option is to write your own decoder, which is more complicated and messy.

Edit: it seemed like a fun exercise, so here goes:

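    package main

    import (
        "bufio"
        "bytes"
        "encoding/json"
        "errors"
        "fmt"
        "io"
        "os"
    )

    // IgnoreField is an io.Reader that passes JSON through but drops one
    // string-valued field (at any nesting depth) without buffering its value.
    type IgnoreField struct {
        r     *bufio.Reader
        key   []byte       // field name plus its closing quote, e.g. Field3"
        buf   bytes.Buffer // filtered bytes not yet handed to the caller
        err   error        // first error hit while filtering (usually io.EOF)
        inStr bool         // inside a string that is being passed through
        esc   bool         // the previous byte of that string was a backslash
    }

    func NewIgnoreField(r io.Reader, field string) *IgnoreField {
        return &IgnoreField{r: bufio.NewReader(r), key: []byte(field + `"`)}
    }

    func (iF *IgnoreField) Read(p []byte) (int, error) {
        for iF.buf.Len() < len(p) && iF.err == nil {
            iF.err = iF.step()
        }
        if iF.buf.Len() == 0 {
            return 0, iF.err
        }
        return iF.buf.Read(p)
    }

    // step consumes one input byte, or one whole ignored field, and appends
    // whatever survives the filter to buf.
    func (iF *IgnoreField) step() error {
        b, err := iF.r.ReadByte()
        if err != nil {
            return err
        }
        switch {
        case iF.inStr:
            iF.buf.WriteByte(b)
            iF.inStr = iF.esc || b != '"' // only an unescaped quote ends the string
            iF.esc = !iF.esc && b == '\\'
        case b == '"':
            return iF.quote(false)
        case b == ',' && iF.peekSpace() == '"':
            iF.r.Discard(1) // consume the quote; the comma is held back for now
            return iF.quote(true)
        default:
            iF.buf.WriteByte(b)
        }
        return nil
    }

    // quote runs right after an opening quote outside a string has been consumed.
    // heldComma says whether a comma before it is still waiting to be written.
    func (iF *IgnoreField) quote(heldComma bool) error {
        next, _ := iF.r.Peek(len(iF.key))
        matched := bytes.Equal(next, iF.key)
        if matched {
            iF.r.Discard(len(iF.key))
            if iF.peekSpace() == ':' { // it is the ignored key: skip ':' and the value
                iF.r.Discard(1)
                if err := iF.skipString(); err != nil {
                    return err
                }
                // Drop exactly one comma: the held one, else the one that follows.
                if !heldComma && iF.peekSpace() == ',' {
                    iF.r.Discard(1)
                }
                return nil
            }
        }
        if heldComma {
            iF.buf.WriteByte(',')
        }
        iF.buf.WriteByte('"')
        if matched {
            iF.buf.Write(iF.key) // a string value that merely equals the field name
        } else {
            iF.inStr = true
        }
        return nil
    }

    // peekSpace discards whitespace and peeks at the next byte (0 if input ends).
    func (iF *IgnoreField) peekSpace() byte {
        for {
            if b, err := iF.r.Peek(1); err != nil {
                return 0
            } else if b[0] != ' ' && b[0] != '\t' && b[0] != '\n' && b[0] != '\r' {
                return b[0]
            }
            iF.r.Discard(1)
        }
    }

    // skipString discards one JSON string value, honoring backslash escapes.
    func (iF *IgnoreField) skipString() error {
        if iF.peekSpace() != '"' {
            return errors.New("IgnoreField: value of ignored field is not a string")
        }
        iF.r.Discard(1)
        for esc := false; ; {
            b, err := iF.r.ReadByte()
            switch {
            case err != nil:
                return err
            case esc:
                esc = false
            case b == '\\':
                esc = true
            case b == '"':
                return nil
            }
        }
    }

    func main() {
        type MyStruct struct {
            Field1 string
            Field2 string
        }
        fi, err := os.Open("myJSONFile.json")
        if err != nil {
            os.Exit(2)
        }
        defer fi.Close()

        // create an instance and populate
        var mystruct MyStruct
        err = json.NewDecoder(NewIgnoreField(fi, "Field3")).Decode(&mystruct)
        if err != nil {
            fmt.Println(err)
        }
        fmt.Println(mystruct)
    }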


huangapple
  • Posted on July 29, 2014, 03:13:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/25002575.html