Gson library does not work reliably when parsing a large JSON file
Question
I have to parse a really huge JSON file (it can reach several GB), so I cannot load the entire JSON string into memory and parse it into an object; I have to read the JSON string line by line and parse it as I go. I am currently using JsonReader from the Gson library, which was working great, but I recently discovered that it occasionally throws an error saying Unterminated string at line 1 column xxxxxxxxx path $.fieldname[random index].fieldname[random index].fieldname. When I parsed the same file with a different library, Jackson, the parsing went flawlessly (this particular file is not that huge, only 50 MB, so I can load it into memory and parse it into an object). So is this a bug in Gson? And if it is, is there another Java library I can use to do the same thing? I would appreciate any answer!
PS: I am using gson-2.8.2.
EDIT: I tested the same file again with Gson and the same error occurred, but at a different line and a different position. Does that confirm this is a bug in Gson?
Answer 1
Score: 3
It looks like you should check the GitHub issues for Gson: https://github.com/google/gson/issues
Apart from that, a minimal example that reproduces the problem would be good; you could even generate such a file to make the example self-contained.
Btw, please change the heading, as you apparently know how to do that. It just does not work reliably with Gson...
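As a rough sketch of what "generate such a file" could look like, the following standard-library Java program writes a JSON document entirely on one line, similar in shape to the file described in the question. The class name, method name, and field names (items, id, name, tags) are made up for illustration and are not taken from the original file; the record count is arbitrary and can be raised until the file is large enough to trigger the error.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SingleLineJsonGenerator {

    // Writes {"items":[{...},{...},...]} on a single line, with no line breaks.
    // Field names are hypothetical placeholders, not the asker's real schema.
    static long writeSingleLineJson(Path target, int records) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write("{\"items\":[");
            for (int i = 0; i < records; i++) {
                if (i > 0) {
                    out.write(','); // separator between array elements
                }
                out.write("{\"id\":" + i
                        + ",\"name\":\"record-" + i + "\""
                        + ",\"tags\":[\"a\",\"b\"]}");
            }
            out.write("]}");
        }
        // Report the resulting file size in bytes.
        return Files.size(target);
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("single-line", ".json");
        long bytes = writeSingleLineJson(target, 1_000_000);
        System.out.println("Wrote " + bytes + " bytes to " + target);
    }
}
```

Feeding the generated file to the same JsonReader code that fails would show whether the problem can be reproduced without the original data, which is what a bug report on the Gson tracker would need.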
Answer 2
Score: 1
I tested the parsing with the Jackson library as well and still got the same type of error. However, over many tests, it turned out that both Gson and Jackson can (not always) have a problem handling a JSON file that is NOT pretty-printed, meaning the JSON has no proper indentation. All the JSON files I tested put the entire JSON string on a single line (technically still legitimate JSON). After I formatted the file to add indentation, the parsing went through successfully with both Gson and Jackson. I hope this helps anyone who encounters the same issue as I did.