Replace non-UTF-8 characters in an XML string with a space in Java

Question

Following this post, I'm able to replace the â€™ sequence (a mis-encoded apostrophe) in my XML string with a space:

    String s = "<content>abcâ€™s house.</content>";
    s = s.replaceAll("[^\\x00-\\x7F]", " ");
    System.out.println(s);

The issue is that it produces 3 spaces (abc   s house.), presumably because â€™ consists of 3 characters. I need the sequence to be replaced by just one space: abc s house.

If I use the approach below, it works when running from Eclipse, but when I build an executable jar, the â€™ literal in the compiled code is converted to a different character, and since the string doesn't contain that character, the replacement doesn't work. (I can see this by decompiling the jar and looking at the code.)

    s = s.replace("â€™", " ");
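
One possible cause of the jar behaviour, offered here as an assumption rather than something the question confirms, is that the jar build compiles the source file with a different character encoding than Eclipse uses, so the non-ASCII literal is decoded differently. A minimal sketch that sidesteps this by writing the three characters as Unicode escapes (U+00E2, U+20AC, U+2122), which keeps the source file pure ASCII; the class, constant, and method names are made up for illustration:

```java
final class MojibakeReplace {
    // The three characters U+00E2, U+20AC, U+2122 written as Unicode escapes,
    // so the literal does not depend on the encoding used to compile this file.
    static final String MOJIBAKE_APOSTROPHE = "\u00E2\u20AC\u2122";

    static String replaceMojibake(String s) {
        // Literal (non-regex) replacement of the whole 3-character sequence by one space.
        return s.replace(MOJIBAKE_APOSTROPHE, " ");
    }

    public static void main(String[] args) {
        String s = "<content>abc" + MOJIBAKE_APOSTROPHE + "s house.</content>";
        System.out.println(replaceMojibake(s)); // <content>abc s house.</content>
    }
}
```

If the encoding mismatch is indeed the cause, passing a matching -encoding option to the compiler in the jar build would be another way to address it.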

Answer 1

Score: 1

You can use the + quantifier, so that each run of consecutive non-ASCII characters is replaced by a single space:

    String s = "abcâ€™s house.";
    s = s.replaceAll("[^\\x00-\\x7F]+", " ");
    System.out.println(s);

prints:

> abc s house.
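
To make the difference concrete: the mis-encoded apostrophe is three separate non-ASCII characters, so the pattern without + yields three spaces, while the pattern with + collapses the run into one. A small sketch, with hypothetical class and method names, and with the sample string written using Unicode escapes so it does not depend on the source file encoding:

```java
import java.util.regex.Pattern;

final class NonAsciiCollapse {
    // One or more consecutive characters outside the 7-bit ASCII range.
    private static final Pattern NON_ASCII_RUN = Pattern.compile("[^\\x00-\\x7F]+");

    static String collapseNonAscii(String s) {
        // Each run of non-ASCII characters becomes exactly one space.
        return NON_ASCII_RUN.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        String s = "abc\u00E2\u20AC\u2122s house.";
        System.out.println(s.replaceAll("[^\\x00-\\x7F]", " ")); // abc   s house. (three spaces)
        System.out.println(collapseNonAscii(s));                  // abc s house.   (one space)
    }
}
```

Note that this collapses any run of adjacent non-ASCII characters, including legitimate ones such as accented letters, so it only fits input where all non-ASCII content is unwanted.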

huangapple
  • Posted on June 29, 2023, 00:59:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76575294.html