2020-08-25 18:58:57

Read NUL-terminated String from ByteBuffer

Question

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

ByteBuffer b = ByteBuffer.wrap(new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
String s0 = null;
String s1 = null;

// Scan from the current position for the first NUL terminator.
int startPosition = b.position();
int nullTerminatorIndex = -1;
while (b.hasRemaining()) {
    byte currentByte = b.get();
    if (currentByte == 0) {
        nullTerminatorIndex = b.position() - 1;
        break;
    }
}
if (nullTerminatorIndex != -1) {
    // Copy the bytes before the terminator and decode them.
    byte[] stringBytes = new byte[nullTerminatorIndex - startPosition];
    b.position(startPosition);
    b.get(stringBytes);
    b.get(); // Move past the null terminator
    s0 = new String(stringBytes, StandardCharsets.UTF_8);

    // Repeat the scan for the second string.
    nullTerminatorIndex = -1;
    startPosition = b.position();
    while (b.hasRemaining()) {
        byte currentByte = b.get();
        if (currentByte == 0) {
            nullTerminatorIndex = b.position() - 1;
            break;
        }
    }
    if (nullTerminatorIndex != -1) {
        stringBytes = new byte[nullTerminatorIndex - startPosition];
        b.position(startPosition);
        b.get(stringBytes);
        b.get(); // Move past the null terminator
        s1 = new String(stringBytes, StandardCharsets.UTF_8);
    }
}

In this code, we iterate through the ByteBuffer starting from the current position until we find a null terminator (byte value 0), which indicates the end of the UTF-8 string. Once we find the null terminator, we create a byte array containing the bytes of the string, and then create a String using the StandardCharsets.UTF_8 encoding.

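Please note that this code assumes the byte buffer contains valid UTF-8 strings, each terminated by a null byte. If no terminator is found before the buffer's limit, nullTerminatorIndex stays -1 and no string is produced. Malformed UTF-8 is not reported as an error: new String(bytes, StandardCharsets.UTF_8) silently substitutes replacement characters.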
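
The scan-and-copy steps described above can be folded into one reusable method so that they are not repeated for every string. This is only a sketch: the class name NulStrings and the method name readNulTerminated are made up for illustration, and returning null when no terminator is found is an assumption about the desired behavior, not something the question specifies.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class NulStrings {
    /**
     * Reads one NUL-terminated UTF-8 string starting at b.position().
     * On success the position is left just past the terminator.
     * Returns null, with the position unchanged, if no NUL occurs before b.limit().
     */
    static String readNulTerminated(ByteBuffer b) {
        int start = b.position();
        // Absolute get(int) does not move the position, so the scan has no side effects.
        for (int i = start; i < b.limit(); i++) {
            if (b.get(i) == 0) {
                byte[] bytes = new byte[i - start];
                b.get(bytes); // relative bulk get: copies the string bytes, position becomes i
                b.get();      // skip the NUL terminator
                return new String(bytes, StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(
                new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
        String s0 = readNulTerminated(b);
        String s1 = readNulTerminated(b);
        System.out.println(s0 + " " + s1); // prints "abcd 124"
    }
}
```

Because the scan uses absolute reads, a failed call leaves the buffer untouched, which may be convenient when more data can still arrive in the buffer later.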

How can I read a NUL-terminated UTF-8 string from a Java ByteBuffer, starting at ByteBuffer#position()?

ByteBuffer b = /* 61 62 63 64 00 31 32 34 00 (hex) */;
String s0 = /* read first string */;
String s1 = /* read second string */;
// `s0` will now contain “abcd” and `s1` will contain “124”.

I have already tried using Charsets.UTF_8.decode(b), but it seems this function ignores the current ByteBuffer position and reads until the end of the buffer.

Is there a more idiomatic way to read such a string from a byte buffer than searching for the byte containing 0 and then limiting the buffer to it (or copying the part with the string into a separate buffer)?

Answer 1

Score: 6

If "idiomatic" means a one-liner, then not that I know of (unsurprising, since NUL-terminated strings are not part of the Java spec).

The first thing I came up with is using b.slice().limit(x) to create a lightweight view onto only the desired bytes (better than copying them anywhere, as you might be able to work directly with the buffer).

ByteBuffer b = ByteBuffer.wrap(new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00 });
int i;
while (b.hasRemaining()) {
  ByteBuffer nextString = b.slice(); // View on b with same start position
  for (i = 0; b.hasRemaining() && b.get() != 0x00; i++) {
    // Count to next NUL
  }
  nextString.limit(i); // view now stops before NUL
  CharBuffer s = StandardCharsets.UTF_8.decode(nextString);
  System.out.println(s);
}
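
The same view-based loop can be wrapped in a method that collects the decoded strings instead of printing them. This is a sketch under assumptions: the names SliceReader and readAll are hypothetical, and trailing bytes that have no terminator are returned as a final string, which matches what the loop above would print.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SliceReader {
    /** Decodes every NUL-terminated UTF-8 string between b.position() and b.limit(). */
    static List<String> readAll(ByteBuffer b) {
        List<String> out = new ArrayList<>();
        while (b.hasRemaining()) {
            // The slice shares b's content; its index 0 is b's current position.
            ByteBuffer view = b.slice();
            int len = 0;
            while (b.hasRemaining() && b.get() != 0) {
                len++; // count the bytes before the next NUL
            }
            view.limit(len); // the view now ends just before the NUL
            CharBuffer chars = StandardCharsets.UTF_8.decode(view);
            out.add(chars.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(
                new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
        System.out.println(readAll(b)); // prints [abcd, 124]
    }
}
```

Since slice() starts at the current position, the method also works when the buffer has already been partly consumed.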

Answer 2

Score: 1

In Java, the char \u0000 (UTF-8 byte 0, Unicode code point U+0000) is an ordinary character. So read everything (perhaps into an oversized byte array) and do:

String s = new String(bytes, StandardCharsets.UTF_8);
String[] s0s1 = s.split("\u0000");
String s0 = s0s1[0];
String s1 = s0s1[1];

If you do not have fixed positions and must read every byte sequentially, the code is ugly. One of the C founders indeed called the NUL-terminated string a historic mistake.

Conversely, to avoid producing a 0 byte when encoding a Java String (typically for further processing as a C/C++ NUL-terminated string), there is modified UTF-8, which encodes U+0000 as the two bytes 0xC0 0x80 instead of a single 0 byte.
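
To make the difference concrete, here is a small sketch comparing standard UTF-8 with the modified UTF-8 written by DataOutputStream.writeUTF. The class and method names are made up for illustration. Note that writeUTF prepends a two-byte length, which the helper strips so that only the encoded characters are compared.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ModifiedUtf8Demo {
    /** Encodes s with DataOutputStream.writeUTF and drops the 2-byte length prefix. */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream and short strings.
            throw new UncheckedIOException(e);
        }
        byte[] all = bytes.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        // Standard UTF-8 keeps the NUL as a single 0 byte.
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_8))); // [97, 0, 98]
        // Modified UTF-8 encodes it as 0xC0 0x80, so no 0 byte appears.
        System.out.println(Arrays.toString(modifiedUtf8(s))); // [97, -64, -128, 98]
    }
}
```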

Answer 3

Score: 0

You can do it with the replace and split functions. Decode your bytes to a String, replace each NUL character with a custom delimiter character, then split the string on that delimiter.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
/**
 * Created by Administrator on 8/25/2020.
 */
public class Jtest {
    public static void main(String[] args) {
        //ByteBuffer b = /* 61 62 63 64 00 31 32 34 00 (hex) */;
        ByteBuffer b = ByteBuffer.allocate(10);
        b.put((byte)0x61);
        b.put((byte)0x62);
        b.put((byte)0x63);
        b.put((byte)0x64);
        b.put((byte)0x00);
        b.put((byte)0x31);
        b.put((byte)0x32);
        b.put((byte)0x34);
        b.put((byte)0x00);
        b.rewind();
        // Print the backing array of the ByteBuffer
        System.out.println("Original ByteBuffer:  "
                + Arrays.toString(b.array()));
        // Decode everything, turn each NUL into ';', then split on ';'
        String s = StandardCharsets.UTF_8.decode(b).toString();
        String ss = s.replace((char) 0, ';');
        String[] words = ss.split(";");
        // `s0` will now contain "abcd" and `s1` will contain "124".
        String s0 = words[0];
        String s1 = words[1];
        for (int i = 0; i < words.length; i++) {
            System.out.println(" Word " + i + " = " + words[i]);
        }
    }
}

I believe you can make this more efficient by dropping the replace step and splitting on the NUL character directly.
