Random escape sequences in Java
Question
I am playing with the escape character (backslash, \) in Java. When I get the length (the number of bytes actually taken to store it) of \n or \t, I get 1, and when I get the length of \n\t, I get 2, as expected.
My confusion starts when I print:
length of \123 -> 1
length of \177 -> 1
length of \178 -> 2
length of \190 -> 3
How is this happening? If it's related to ASCII or extended ASCII, then this should change from 164. Another observation: after the first three characters, it starts counting each character as length 1, so e.g. \123456 has length 4.
Is it something related to the encoding? My IDE is currently set to UTF-8.
This could be a silly question, but I don't have detailed knowledge of Unicode or its encodings. Can someone please explain?
Answer 1
Score: 2
When you write \ followed by digits in a string literal, you are writing an octal escape (https://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.6). Octal 123 is 53 in hex (https://decimaltobinary.pro/Convert_octal_number_123_to_hexadecimal_), and hex 53 in ASCII is 'S' (https://ascii.cl/).
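The conversion above can be checked with a few lines of Java. This is only a sketch: the class name `OctalToChar` and the helper `fromOctal` are made up for illustration and are not part of the original answer.

```java
public class OctalToChar {
    // Hypothetical helper: parses a string of octal digits and returns the char with that code.
    static char fromOctal(String octalDigits) {
        return (char) Integer.parseInt(octalDigits, 8);
    }

    public static void main(String[] args) {
        int code = Integer.parseInt("123", 8);          // octal 123 read as a number
        System.out.println(code);                       // 83
        System.out.println(Integer.toHexString(code));  // 53
        System.out.println(fromOctal("123"));           // S
        System.out.println("\123".equals("S"));         // true: the compiler does the same conversion
    }
}
```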
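Because octal is base 8, only the digits 0 to 7 can be part of the escape:
- \123: all three digits are octal, so it is one character.
- \177: all three digits are octal, so it is one character.
- \178: 1 and 7 are octal digits, but 8 is outside base 8, so the escape ends at \17 and 8 is taken as a separate character (length 2).
- \190: 1 is an octal digit but 9 is not, so the escape is just \1, and 9 and every digit after it are treated as ordinary characters (length 3).
- \123456: an octal escape takes at most three digits, and the JLS grammar allows three only when the first digit is 0 to 3, so the range is \0 to \377 (hex FF), not only the ASCII range \0 to \177 (hex 7F). \123 becomes one character and 4, 5, 6 are three more (length 4).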
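The cases in the list can be reproduced with a small program. It is a sketch that assumes the lengths in the question came from `String.length()`; the class name `OctalEscapeDemo` and the method `literalLengths` are hypothetical, chosen only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class OctalEscapeDemo {
    // Maps a printable label for each literal to the length of the string it compiles to.
    static Map<String, Integer> literalLengths() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("\\123", "\123".length());       // one octal escape ('S')
        m.put("\\177", "\177".length());       // one octal escape (U+007F)
        m.put("\\178", "\178".length());       // \17 followed by '8'
        m.put("\\190", "\190".length());       // \1 followed by '9' and '0'
        m.put("\\123456", "\123456".length()); // \123 followed by "456"
        m.put("\\377", "\377".length());       // largest octal escape (U+00FF)
        m.put("\\400", "\400".length());       // first digit above 3: \40 followed by '0'
        return m;
    }

    public static void main(String[] args) {
        literalLengths().forEach((k, v) -> System.out.println("length of " + k + " -> " + v));
        // length() counts chars (UTF-16 code units), not bytes:
        System.out.println("\377".getBytes(StandardCharsets.UTF_8).length); // 2 bytes in UTF-8
    }
}
```

The compiler resolves the escapes when it reads the literal, so the lengths do not depend on the IDE's encoding setting. The last line shows that the byte count in UTF-8 can differ from `length()`.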