Question

I am reading a large JSON file into a string using Java (OpenJDK 8).

The code I am using is:

final String fileContents = (Files.readAllLines(Paths.get(filePath.toString()))).stream().collect(Collectors.joining());

The resulting String starts with some unprintable characters that aren't in the file:

Eclipse shows these characters as [-1,-2] before the {"TIPL etc., which is where the actual file content begins.

What is wrong here? What can I do to get Java to read the file correctly?

Answer 1

Score: 4

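Your file is encoded as UTF-16LE (little-endian) and begins with a byte-order mark (BOM), the bytes FF FE. Read as signed Java bytes, FF and FE are -1 and -2, which is what Eclipse is showing you.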
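
As a quick check of how the BOM bytes relate to the [-1,-2] in the question, here is a minimal sketch. The class and method names are made up for illustration; the only point is that Java's byte type is signed, so 0xFF and 0xFE print as negative numbers.

```java
import java.util.Arrays;

class BomBytesDemo {

    // The UTF-16LE byte-order mark, stored in Java's signed byte type.
    static final byte[] UTF16LE_BOM = { (byte) 0xFF, (byte) 0xFE };

    // Hypothetical helper: renders the BOM bytes as signed decimal values,
    // similar to how a debugger displays a byte array.
    static String bomAsSignedBytes() {
        return Arrays.toString(UTF16LE_BOM);
    }

    public static void main(String[] args) {
        System.out.println(bomAsSignedBytes()); // prints [-1, -2]
    }
}
```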

Files.readAllLines() decodes the file as UTF-8 by default, which is why you're seeing the byte-order mark (BOM) characters and NUL characters in your string data.

You should pass a character set as your second parameter to Files.readAllLines():

Files.readAllLines(path, StandardCharsets.UTF_16);

The StandardCharsets.UTF_16 charset reads the BOM to determine the byte order and decodes the rest of the content accordingly. The Javadoc for the Charset class has more detail on how the various Unicode charsets handle byte-order marks when encoding and decoding.
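
To try this without the original file, the sketch below writes a small UTF-16LE file with a BOM and reads it back with StandardCharsets.UTF_16. The class name, the helper methods, the sample JSON, and the temporary file are all assumptions for illustration and are not taken from the question. Lines are joined without a separator, as in the asker's code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class Utf16ReadDemo {

    // Hypothetical helper: writes text to a temp file as UTF-16LE,
    // prefixed with the BOM bytes FF FE.
    static Path writeUtf16LeWithBom(String text) throws IOException {
        byte[] body = text.getBytes(StandardCharsets.UTF_16LE);
        byte[] withBom = new byte[body.length + 2];
        withBom[0] = (byte) 0xFF;
        withBom[1] = (byte) 0xFE;
        System.arraycopy(body, 0, withBom, 2, body.length);

        Path file = Files.createTempFile("utf16-demo", ".json");
        Files.write(file, withBom);
        return file;
    }

    // Reads all lines as UTF-16 and joins them without a separator.
    // The decoder consumes the leading BOM and uses it to pick the byte order.
    static String readUtf16(Path path) throws IOException {
        List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_16);
        return String.join("", lines);
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"key\": \"value\"}"; // made-up sample content
        Path file = writeUtf16LeWithBom(json);
        try {
            String contents = readUtf16(file);
            // The string should start with '{', with no BOM or NUL characters.
            System.out.println(contents.equals(json));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If a file is known to be little-endian but has no BOM, StandardCharsets.UTF_16 falls back to big-endian when decoding, so StandardCharsets.UTF_16LE would be the charset to pass in that case.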
