Efficiently count the number of JSON objects in a file
Question
I need to get the number of JSON objects in a given file. The file contains an array of JSON objects. Counting a file with 1 million objects takes approximately 150-180 seconds. Is there a way to optimize the code below to get the count faster?
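import (
	"bufio"
	"encoding/json"
	"os"
)

// Count returns the number of elements in the top-level JSON array in file.
func Count(file string) (int, error) {
	f, err := os.Open(file)
	if err != nil {
		return -1, err
	}
	defer f.Close()

	dec := json.NewDecoder(bufio.NewReader(f))
	// Consume the opening '[' of the array.
	if _, err := dec.Token(); err != nil {
		return -1, err
	}

	var count int
	for dec.More() {
		// Decode each element into a map, then count it.
		var tempMap map[string]interface{}
		if err := dec.Decode(&tempMap); err != nil {
			return -1, err
		}
		count++
	}
	return count, nil
}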
Answer 1
Score: 1
Speed things up by counting opening object delimiters ('{') instead of decoding to Go values. Reading the stream token by token avoids building a map for every element.
Based on the code in the question, it looks like your goal is to count objects at the first level of nesting in the document. Here's code that does that:
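import (
	"encoding/json"
	"io"
)

// Count returns the number of objects at the first level of nesting,
// i.e. the objects that are direct elements of the top-level array.
func Count(r io.Reader) (int, error) {
	dec := json.NewDecoder(r)
	nest := 0
	count := 0
	for {
		t, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return -1, err
		}
		switch t {
		case json.Delim('{'):
			if nest == 1 {
				count++
			}
			nest++
		case json.Delim('['):
			// Arrays must deepen the nesting too; otherwise the
			// top-level '[' leaves nest at 0 and nothing is counted.
			nest++
		case json.Delim('}'), json.Delim(']'):
			nest--
		}
	}
	return count, nil
}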
If your goal is to count all objects, including nested ones, remove all uses of nest from the code above:
// Count returns the number of objects in the document, nested ones included.
func Count(r io.Reader) (int, error) {
	dec := json.NewDecoder(r)
	count := 0
	for {
		t, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return -1, err
		}
		if t == json.Delim('{') {
			count++
		}
	}
	return count, nil
}