How to convert any kind of white space to a char?

Question

I use String.strip() (Java 11) to remove leading and trailing white space from a String. There are 25 different kinds of white space character that can appear in a String, and I want to test my code with some of them.

I have a code example that converts one particular white space character (e.g. \u2002) into a char and then uses it in a String. When I try to convert another white space character, such as \u000A, to a char, I get a compiler error. Why does this happen, and how can I fix it?

public static void main(String...args){
    char chr = '\u2002';//No problem.

    //Compiler error : 
    //Intellij IDEA compiler - Illegal escape character in character literal.
    //Java compiler - java: illegal line end in character literal.
    chr = '\u000a';

    String text = chr + "hello world" + chr;
    text = text.strip();
    System.out.println(text);
}

Answer 1

Score: 3

Are you sure you're not seeing this error instead?

> error: illegal line end in character literal

Unicode escapes like \u000a are processed very early in the compilation process, before the source is split into tokens. The \u000a is replaced with an actual line feed character (code point 10).

It's as if you wrote this:

    chr = '
';

That is why, when I compile your code using JDK 11.0.8, I get the "illegal line end" error.
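To see that this translation really happens before the compiler looks at tokens, here is a small sketch. The class and method names are made up for illustration. It writes an identifier and the quotes of a string literal as Unicode escapes, and the compiler accepts both because it only ever sees the translated characters.

```java
class UnicodeEscapeDemo {
    // The variable name is written as a Unicode escape for the letter 'a'.
    static int escapedIdentifier() {
        int \u0061 = 42;   // the compiler sees: int a = 42;
        return a;
    }

    // The double quotes around the literal are written as Unicode escapes.
    static String escapedQuotes() {
        return \u0022hi\u0022;   // the compiler sees: return "hi";
    }

    public static void main(String[] args) {
        System.out.println(escapedIdentifier());
        System.out.println(escapedQuotes());
    }
}
```

The same mechanism turns the escape for a line feed into a real line break in the middle of the character literal.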

This early conversion is described in the Java Language Specification:

> Because Unicode escapes are processed very early, it is not correct to write '\u000a' for a character literal whose value is linefeed (LF); the Unicode escape \u000a is transformed into an actual linefeed in translation step 1 (§3.3) and the linefeed becomes a LineTerminator in step 2 (§3.4), and so the character literal is not valid in step 3. Instead, one should use the escape sequence '\n' (§3.10.6). Similarly, it is not correct to write '\u000d' for a character literal whose value is carriage return (CR). Instead, use '\r'.
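For testing strip() against several white space characters, one option is to avoid Unicode escapes for line terminators entirely and build the characters from their numeric code points at run time. The following is a minimal sketch under that assumption. The class name, the helper methods fromCodePoint and wrapAndStrip, and the choice of sample code points are illustrative and not from the question.

```java
import java.util.List;

class StripWhitespaceDemo {
    // Builds a one-character string from a code point at run time,
    // so no Unicode escape for a line terminator appears in the source.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Wraps the text in the given character on both sides, then strips it.
    static String wrapAndStrip(int codePoint, String text) {
        String ws = fromCodePoint(codePoint);
        return (ws + text + ws).strip();
    }

    public static void main(String[] args) {
        char lineFeed = '\n';            // escape sequence, as the JLS recommends
        char alsoLineFeed = (char) 0x0A; // numeric value cast to char
        System.out.println(lineFeed == alsoLineFeed);

        // Sample code points: LF, CR, EN SPACE, IDEOGRAPHIC SPACE.
        List<Integer> samples = List.of(0x000A, 0x000D, 0x2002, 0x3000);
        for (int cp : samples) {
            String stripped = wrapAndStrip(cp, "hello world");
            System.out.printf("U+%04X -> [%s]%n", cp, stripped);
        }
    }
}
```

One thing to watch for when extending the sample list: strip() relies on Character.isWhitespace, which does not treat the non-breaking spaces (U+00A0, U+2007, U+202F) as white space, so those characters are left in place.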

huangapple
  • Posted on August 5, 2020, 06:13:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63255781.html