How can I get the original character back from its UTF-8 code in Java?

Question

I am making an ASCII encoder-decoder. I am encoding the characters into UTF-8. To encode, I am using this code:

private String asciiReturn(String inpString) {
    int codePoint = 0;
    StringBuilder str = new StringBuilder();
    for (int i = 0; i < inpString.length(); i++) {
        // Read the full code point at index i (it may span two chars)
        codePoint = Character.codePointAt(inpString, i);
        // Skip the second char of a surrogate pair
        i += Character.charCount(codePoint) - 1;
        str.append(codePoint);
        str.append(" ");
    }
    return str.toString();
}

With this, I can encode emoji characters as well.

For example, for the emoji '🤷🏻‍♂️' I get "129335 127995 8205 9794 65039". This is basically the UTF-8 decimal value of the emoji, and that's exactly what I want. My problem is the decoding.

What I want is (example):

Input string: "72 117 104 33 129335 127995 8205 9794 65039"
Output string: "Huh!🤷🏻‍♂️"

Because:
72 -> 'H'
117 -> 'u'
104 -> 'h'
33 -> '!'
129335 127995 8205 9794 65039 -> '🤷🏻‍♂️'

Thanks in advance 🙂

Answer 1

Score: 1

Try this:

import java.util.Arrays;
import java.util.stream.Collectors;

// Character.toString(int codePoint) requires Java 11 or later
private String decode(String inpString) {
    return Arrays.stream(inpString.split("\\s+"))
        .map(s -> Character.toString(Integer.parseInt(s)))
        .collect(Collectors.joining());
}

and

String input = "72 117 104 33 129335 127995 8205 9794 65039";
System.out.println(decode(input));

Output:

Huh!🤷🏻‍♂️
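
Note that the `Character.toString(int)` overload taking a code point was only added in Java 11. On older versions, a plain loop with `StringBuilder.appendCodePoint` should do the same job. The following is only a minimal sketch under that assumption; the class name `CodePointDecoder` is made up for illustration, and invalid tokens are simply left to throw.

```java
final class CodePointDecoder {

    // Hypothetical helper: turns space-separated decimal code points back into text.
    static String decode(String input) {
        StringBuilder out = new StringBuilder();
        for (String token : input.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input splits into a single empty token
            }
            // appendCodePoint writes a surrogate pair for values above U+FFFF
            out.appendCodePoint(Integer.parseInt(token));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("72 117 104 33 129335 127995 8205 9794 65039"));
    }
}
```

`Integer.parseInt` throws `NumberFormatException` for a non-numeric token, and `appendCodePoint` throws `IllegalArgumentException` for a value that is not a valid code point, so you may want to catch those if the input is not trusted.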

You can also write your encoding method like this:

static String asciiReturn(String s) {
    return s.codePoints()
        .mapToObj(Integer::toString)
        .collect(Collectors.joining(" "));
}

and

String s = "Huh!🤷🏻‍♂️";
System.out.println(asciiReturn(s));

Output:

72 117 104 33 129335 127995 8205 9794 65039
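
One note on terminology: the numbers printed here are Unicode code points, not UTF-8 values. UTF-8 is a byte encoding of those code points, so the two only coincide for ASCII characters. A small sketch to show the difference, where `CodePointVsUtf8` and `utf8Bytes` are hypothetical names used only for this illustration:

```java
import java.nio.charset.StandardCharsets;

final class CodePointVsUtf8 {

    // Hypothetical helper: lists the UTF-8 bytes of s as unsigned decimal values.
    static String utf8Bytes(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(b & 0xFF); // mask so the byte prints as 0..255
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String shrug = new String(Character.toChars(129335)); // U+1F937
        System.out.println(shrug.codePointAt(0)); // 129335
        System.out.println(utf8Bytes(shrug));     // 240 159 164 183
        System.out.println(utf8Bytes("H"));       // 72
    }
}
```

So the encoder and decoder in this thread work on code points, which is what you want here, since one code point maps to exactly one number.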

huangapple
  • Posted on August 9, 2020, 13:26:53
  • Link to this post: https://go.coder-hub.com/63322795.html