October 1, 2020, 21:33:58

Why does JUnit not work with files containing non-English characters? (using NetBeans)

Question

I made a program in NetBeans that takes an input .txt file and then writes output to the console.
It works fine, but when I try to test it using JUnit, the program reads the file incorrectly.

For example, instead of 'ö' it reads 'Ă¶'.

Is there any way to solve this problem of JUnit not reading non-English characters?

Answer 1

Score: 0

I suspect that the problem is actually in your program or unit tests, not in JUnit.

If the evidence is as you say it is, then I expect that your code does something like this:

Reader r = new FileReader(filename);

which opens the file and sets up a charset decoder based on the default charset.

When you run the code in NetBeans, the default charset is UTF-8, and the file (which is UTF-8 encoded) is read correctly.
When you run it in the context of a JUnit test, the default charset is (apparently) LATIN-1, which doesn't match the encoding of the input file.
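To make the mismatch concrete, here is a minimal sketch (the class and method names are mine, purely for illustration) that encodes 'ö' as UTF-8 and then decodes the same bytes with a different charset. Decoding with LATIN-1 turns the two UTF-8 bytes into the two characters 'Ã¶'. The 'Ă¶' in the question looks more like what a Central European code page such as windows-1250 would produce, so the test JVM's default may be that rather than LATIN-1, but the mechanism is the same either way.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Encode the text as UTF-8 bytes, then decode those bytes with the given charset.
    // When decodeWith is not UTF-8, non-ASCII characters come back garbled.
    static String decodeUtf8BytesAs(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u00f6"; // the character 'ö'

        // Matching charsets: the round trip preserves the character.
        System.out.println(decodeUtf8BytesAs(original, StandardCharsets.UTF_8));

        // Mismatched charsets: one character becomes two.
        System.out.println(decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1));
    }
}
```

What you see on the console also depends on the console's own encoding, so compare the strings in code (for example with equals) rather than by eye.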

It is possibly incorrect for your code to use the default charset to infer the encoding of its input file. Alternatively, it could be that your JUnit test is incorrect because it does not set the JVM default charset to match the encoding of the test file.

The way to open this file with a specific charset (UTF-8) would be:

// Java 11 and later
Reader r = new FileReader(filename, StandardCharsets.UTF_8);
// Java 10 and earlier
Reader r = new InputStreamReader(new FileInputStream(filename), "UTF-8");
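As a fuller sketch of the same idea, the following uses Files.newBufferedReader (available since Java 7) with an explicit charset, so the result no longer depends on the JVM default. The class name, method name and temp-file contents are made up for the example; your program would pass its real input path instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8FileRead {

    // Read the first line of a file, always decoding it as UTF-8
    // regardless of the JVM's default charset.
    static String readFirstLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical test data: a temp file written explicitly as UTF-8.
        Path tmp = Files.createTempFile("utf8-demo", ".txt");
        try {
            Files.write(tmp, "sz\u00f6veg\n".getBytes(StandardCharsets.UTF_8));
            String line = readFirstLine(tmp);
            System.out.println("read correctly: " + line.equals("sz\u00f6veg"));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Writing the test file with an explicit charset, as above, also keeps the unit test itself independent of the platform default.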

You can't change a running JVM's default charset. But you could possibly override the platform default charset in the JVM options when you start the JVM that runs the JUnit tests. (See https://stackoverflow.com/questions/361975/setting-the-default-java-character-encoding.)
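Before changing any options, it may help to confirm what the default charset actually is in each context. The sketch below (class and method names are mine) reports it; call it once from your normal run and once from inside a JUnit test and compare. The linked question discusses the file.encoding system property, typically passed as -Dfile.encoding=UTF-8 in the VM options of whatever launches the tests. Where that setting lives depends on your project type, and how far the JVM honors it varies between Java versions, so treat it as something to verify rather than a guarantee.

```java
import java.nio.charset.Charset;

public class DefaultCharsetReport {

    // Describe the charset this JVM uses when none is specified explicitly.
    static String describeDefaultCharset() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaultCharset());
    }
}
```

If the two contexts report different charsets, that supports the explanation above; if they report the same one, the problem is more likely elsewhere.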

It is also possible that you have misinterpreted the evidence and the encoding problem is actually on the output side; i.e. in the context in which you are running the JUnit tests, there is a mismatch between the default charset and the console's actual charset.
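One way to check the output side is to look at the bytes a PrintStream produces for a given charset, instead of trusting what the console displays. This is only a sketch with made-up names; it uses the PrintStream(OutputStream, boolean, String) constructor, which works on older Java versions too.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Output {

    // Print the text through a PrintStream that encodes with the named charset,
    // and return the raw bytes that were written.
    static byte[] encodeThroughPrintStream(String text, String charsetName)
            throws UnsupportedEncodingException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, charsetName)) {
            out.print(text);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // 'ö' is two bytes in UTF-8 but one byte in ISO-8859-1.
        System.out.println(encodeThroughPrintStream("\u00f6", "UTF-8").length);
        System.out.println(encodeThroughPrintStream("\u00f6", "ISO-8859-1").length);

        // Wrapping System.out forces UTF-8 output; whether it displays correctly
        // still depends on the console interpreting the bytes as UTF-8.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("\u00f6");
    }
}
```

In a unit test, asserting on the captured bytes or on the decoded string avoids depending on the console at all.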
