Random escape sequences in Java

Question


I am playing with the escape character (backslash \) in Java. When I get the length (the number of bytes actually taken to store it) of \n or \t, I get 1, and when I get the length of \n\t, I get 2, as expected.

My confusion starts when I print:

length of \123  -> 1
length of \177  -> 1
length of \178  -> 2
length of \190  -> 3

How is this happening? If it's related to ASCII or extended ASCII, then this should change from 164. Another observation: after the first three characters, it starts counting each character as length 1, so, for example, \123456 has length 4.

Is it related to the encoding? My IDE is currently set to UTF-8.

This could be a silly question, but I don't have detailed knowledge of Unicode or its encodings. Can someone please explain?

Answer 1

Score: 2


When you write \ followed by digits, you are writing an octal escape (https://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.6). Octal 123 is 53 in hexadecimal (https://decimaltobinary.pro/Convert_octal_number_123_to_hexadecimal_), and hex 53 is 'S' in ASCII (https://ascii.cl/).
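As a minimal sketch of that conversion (the class and method names here are illustrative, not from the question), the following checks that the octal escape \123 is the same character as 'S' and prints its code in both bases:

```java
class OctalEscapeDemo {
    // Returns the single character written as the octal escape \123.
    static char octal123() {
        return '\123';
    }

    public static void main(String[] args) {
        char c = octal123();
        System.out.println(c);                         // prints S
        System.out.println(Integer.toHexString(c));    // prints 53
        System.out.println(Integer.toOctalString(c));  // prints 123
        System.out.println(c == 'S');                  // prints true
    }
}
```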

Because octal is base 8, only the digits 0 to 7 can be part of the escape:

  • \123: all three digits are octal, so the whole sequence is one character.

  • \177: all three digits are octal, so this is also one character.

  • \178: 1 and 7 are octal digits, but 8 is not. The escape ends at \17, and 8 is a separate, ordinary character (length 2).

  • \190: 1 is an octal digit, but 9 is not. The escape ends at \1, and 9 and 0 are ordinary characters (length 3).

  • \123456: an octal escape takes at most three digits (and a three-digit escape can go no higher than \377), so \123 becomes one character and 4, 5, and 6 are ordinary characters (length 4).
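The cases above can be checked directly. This is a small sketch with hypothetical class and method names. It relies on String.length() counting char values (UTF-16 code units), not bytes, so the lengths here are decided by how the compiler splits the escape, not by a UTF-8 setting. The last two lines show that the byte count is a separate matter.

```java
import java.nio.charset.StandardCharsets;

class EscapeLengthDemo {
    // Number of chars (UTF-16 code units) in the string.
    static int length(String s) {
        return s.length();
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(length("\123"));     // 1: one octal escape
        System.out.println(length("\177"));     // 1: one octal escape
        System.out.println(length("\178"));     // 2: \17 followed by '8'
        System.out.println(length("\190"));     // 3: \1 followed by '9' and '0'
        System.out.println(length("\123456"));  // 4: \123 followed by '4', '5', '6'
        System.out.println(length("\400"));     // 2: \40 followed by '0' (above \377)

        // Same string, different measures: one char, two UTF-8 bytes.
        System.out.println(length("\377"));     // 1
        System.out.println(utf8Bytes("\377"));  // 2
    }
}
```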

huangapple
  • Posted on July 30, 2020, 02:49:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63160497.html