Gson library does not work reliably when parsing a large JSON file

Question

I have to parse a really huge JSON file (it can reach several GB), so I cannot load the entire JSON string into memory and parse it into an object; I have to read the JSON string line by line and parse it as I go. I am currently using JsonReader from the Gson library, which had been working great, but I recently discovered that it occasionally throws an error saying Unterminated string at line 1 column xxxxxxxxx path $.fieldname[random index].fieldname[random index].fieldname. When I parsed the same file with a different library, Jackson, the parsing went flawlessly (this particular file is not that huge, only 50 MB, so I could load it into memory and parse it into an object). Is this a bug in Gson? If it is, is there another Java library I can use to do the same thing? I would appreciate any answer!

PS: I am using gson-2.8.2

EDIT: I tested the same file again with Gson; the same error occurred, but at a different line and a different position. Does that confirm this is a bug in Gson?

Answer 1

Score: 3

It looks like you should check the GitHub issues for Gson: https://github.com/google/gson/issues
Apart from that, a minimal example that reproduces the problem would be good; you could even generate such a file to make the example self-contained ;-)

By the way, please change the title, since you apparently already know how to do that; it just does not work reliably with Gson...
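One way to follow the suggestion about generating a file is sketched below. It is only an illustration: the class name `SingleLineJsonGenerator`, the field names `items`, `children` and `text`, and the default counts are all made up here, since the original file's structure was not posted. The sketch streams a nested document onto a single line, so the output resembles the single-line files discussed in this thread and can be scaled up by raising the counts.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes a nested JSON document on one line, without
// ever holding the whole document in memory.
final class SingleLineJsonGenerator {

    // Writes {"items":[{"children":[{"text":"..."},...]},...]} with no line breaks.
    static void writeJson(Writer out, int items, int childrenPerItem) {
        try {
            out.write("{\"items\":[");
            for (int i = 0; i < items; i++) {
                if (i > 0) {
                    out.write(',');
                }
                out.write("{\"children\":[");
                for (int j = 0; j < childrenPerItem; j++) {
                    if (j > 0) {
                        out.write(',');
                    }
                    // Each string contains escaped quotes and an escaped backslash,
                    // so the parser's string handling is exercised.
                    out.write("{\"text\":\"value \\\"" + i + "-" + j + "\\\" c:\\\\tmp\"}");
                }
                out.write("]}");
            }
            out.write("]}");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // The output file name and the counts are arbitrary defaults.
        Path target = Paths.get(args.length > 0 ? args[0] : "generated.json");
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writeJson(out, 100_000, 10);
        }
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Posting a generator like this together with the parsing code would let others run the exact same input, which is what a bug report on the Gson issue tracker would need.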

Answer 2

Score: 1

I tested the parsing with the Jackson library as well and still got the same type of error. After many tests, however, it turns out that both Gson and Jackson can (though not always) have a problem handling a JSON file that is NOT pretty-printed, meaning the JSON has no line breaks or indentation. All the JSON files I tested put the entire JSON string on a single line (technically still legitimate JSON). After I formatted the file to add indentation, the parsing went through successfully with both Gson and Jackson. I hope this helps anyone who runs into the same issue.
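For files too large to open in an editor, the reformatting step described above could be done as a stream. The following is a minimal sketch, not the tool the answer used (none was named): `StreamingJsonIndenter` is a hypothetical class that copies JSON character by character, adds line breaks and two-space indentation outside of string literals, and never loads the whole file. It assumes well-formed, UTF-8 encoded input and does no validation.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: re-indents JSON as a stream, in constant memory.
final class StreamingJsonIndenter {

    static void indent(Reader in, Writer out) {
        try {
            int depth = 0;
            boolean inString = false; // currently inside a string literal
            boolean escaped = false;  // previous character was a backslash
            int c;
            while ((c = in.read()) != -1) {
                if (inString) {
                    // Copy string contents verbatim, tracking escapes so that
                    // an escaped quote does not end the string.
                    out.write(c);
                    if (escaped) {
                        escaped = false;
                    } else if (c == '\\') {
                        escaped = true;
                    } else if (c == '"') {
                        inString = false;
                    }
                    continue;
                }
                switch (c) {
                    case '"':
                        inString = true;
                        out.write(c);
                        break;
                    case '{':
                    case '[':
                        out.write(c);
                        depth++;
                        newLine(out, depth);
                        break;
                    case '}':
                    case ']':
                        depth--;
                        newLine(out, depth);
                        out.write(c);
                        break;
                    case ',':
                        out.write(c);
                        newLine(out, depth);
                        break;
                    case ':':
                        out.write(": ");
                        break;
                    case ' ':
                    case '\t':
                    case '\n':
                    case '\r':
                        break; // drop existing whitespace between tokens
                    default:
                        out.write(c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void newLine(Writer out, int depth) throws IOException {
        out.write('\n');
        for (int i = 0; i < depth; i++) {
            out.write("  ");
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: StreamingJsonIndenter <input.json> <output.json>");
            return;
        }
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            indent(in, out);
        }
    }
}
```

Empty objects and arrays end up spread over two lines with this simple approach, which is still valid JSON. Whether reformatting actually avoids the error is only what this answer reports; the sketch just makes that experiment repeatable on large files.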

Posted by huangapple on August 5, 2020 at 02:15:17
Original link: https://go.coder-hub.com/63252842.html