2020-10-19 19:21:54

Java: Byte array prints unknown values for same string

Question

I have the following string, stored both in a text file and as a Java variable: ‘destructive’

My code is below:

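import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

// Apache Commons IO (assumed from the FileUtils.readFileToString call)
import org.apache.commons.io.FileUtils;

public class SimpleTest {

    public static void main(String[] args) {
        try {
            // Bytes of the text after reading it from the file
            File file = new File("TestFIle.txt");
            byte[] file_encoded = FileUtils.readFileToString(file, "UTF-8").getBytes("UTF-8");
            System.out.println(Arrays.toString(file_encoded));

            // Bytes of the same text taken from a string literal
            String toEncrypt = "‘destructive’";
            byte[] encoded = toEncrypt.getBytes(Charset.forName("UTF-8"));
            System.out.println(Arrays.toString(encoded));
        } catch (IOException ex) {
            Logger.getLogger(SimpleTest.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}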

As you can see, the variable is declared as:

String toEncrypt = "‘destructive’";

The content of TestFIle.txt is also: ‘destructive’

When I run the code, I get:

[-17, -69, -65, -30, -128, -104, 100, 101, 115, 116, 114, 117, 99, 116, 105, 118, 101, -30, -128, -103]
[-30, -128, -104, 100, 101, 115, 116, 114, 117, 99, 116, 105, 118, 101, -30, -128, -103]

What are the additional [-17, -69, -65] at the start of the byte array when reading the same text from a file, and why do I get them?

Answer 1

Score: 1

Your file seems to contain UTF-8 encoded text with a leading byte order mark (BOM). The UTF-8 BOM is the byte sequence EF BB BF. Interpreted as signed 8-bit values in two's complement, which is how Java's byte type works, these bytes are -17, -69, and -65.
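The arithmetic behind that conversion can be checked with a small sketch. The class name `SignedByteDemo` and the helper `toUnsigned` are hypothetical names introduced for illustration; the only assumption is that the BOM is the UTF-8 encoding of the code point U+FEFF.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SignedByteDemo {

    // Hypothetical helper: reinterprets a signed Java byte (-128..127)
    // as its unsigned value (0..255) by masking the low 8 bits.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // U+FEFF is the code point that UTF-8 encodes as the bytes EF BB BF.
        byte[] bom = "\uFEFF".getBytes(StandardCharsets.UTF_8);

        // Java prints bytes as signed values.
        System.out.println(Arrays.toString(bom)); // [-17, -69, -65]

        // Show each byte as signed decimal, unsigned decimal, and hex.
        for (byte b : bom) {
            int u = toUnsigned(b);
            System.out.printf("%4d -> %3d -> 0x%02X%n", b, u, u);
        }
    }
}
```

Masking with `0xFF` is the usual way to recover the unsigned value, since Java has no unsigned byte type.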

Answer 2

Score: 0

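The leading `[-17, -69, -65]` is the UTF-8 [byte order mark][1] (BOM).
In hexadecimal the BOM is `[0xEF, 0xBB, 0xBF]`, which is `[239, 187, 191]` as unsigned decimal values.
Because Java's `byte` is signed, those values are interpreted (and printed) as negative numbers.

In general, the BOM is optional, and it seems to be common in the Microsoft ecosystem: https://superuser.com/questions/1553666/utf-8-vs-utf-8-with-bom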
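If the goal is for both byte arrays to match, one option is to drop the BOM after decoding. The following is a minimal sketch, not part of either answer: `BomStripDemo` and `stripBom` are hypothetical names, and the file content is simulated in memory rather than read from `TestFIle.txt`. It relies on the fact that Java's UTF-8 decoder keeps the BOM as the character U+FEFF instead of discarding it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BomStripDemo {

    private static final char BOM = '\uFEFF';

    // Hypothetical helper: removes a single leading U+FEFF, if present.
    static String stripBom(String text) {
        return (!text.isEmpty() && text.charAt(0) == BOM) ? text.substring(1) : text;
    }

    public static void main(String[] args) {
        // The quoted word from the question, with the curly quotes written as escapes.
        String expected = "\u2018destructive\u2019";

        // Simulates the raw bytes of a file saved as "UTF-8 with BOM".
        byte[] fileBytes = ("\uFEFF" + expected).getBytes(StandardCharsets.UTF_8);

        // Decoding keeps the BOM as the first character of the string.
        String decoded = new String(fileBytes, StandardCharsets.UTF_8);
        String cleaned = stripBom(decoded);

        // After stripping, the bytes match those of the string literal.
        System.out.println(Arrays.toString(cleaned.getBytes(StandardCharsets.UTF_8)));
        System.out.println(cleaned.equals(expected)); // true
    }
}
```

In the question's code, the same helper could be applied to the result of `FileUtils.readFileToString(file, "UTF-8")` before calling `getBytes`. Apache Commons IO also provides `BOMInputStream` for skipping a BOM at the stream level, which may be preferable when reading large files.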

  [1]: https://en.wikipedia.org/wiki/Byte_order_mark
