Java regex to replace all special characters in a String with an underscore, while removing leading, trailing, and repeated underscores

Question

I need a regular expression that replaces all special characters with an underscore, collapsing a run of several special characters into a single underscore, and that does not add a leading or trailing underscore when the String starts or ends with special characters. I have tried the following, but it doesn't seem to work.

String myDefaultString = "_@##%Default__$*_123_";
myDefaultString.replaceAll("[\\p{Punct}&&[^_]]", "_");

The final result should be Default_123: the regular expression needs to remove the leading underscores, keep a single underscore between Default and 123, and remove the trailing underscore as well as any repeated underscores inside the String.

I also tried the following regex:

myDefaultString.replaceAll("[^a-zA-Z0-9_.]+", "_");

But it does not seem to work either. Is what I'm trying to achieve very complicated, or is there a better way to do it?
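
For reference, here is a small sketch of what the two attempts above actually return for the sample input. The class and method names (AttemptCheck, firstAttempt, secondAttempt) are only illustrative and are not part of the original post. Note also that Java Strings are immutable, so replaceAll returns a new String and the result has to be assigned or printed. The snippets in the question discard it.

```java
public class AttemptCheck {
    // First attempt from the question: without a '+' quantifier, every
    // non-underscore punctuation character becomes its own underscore.
    public static String firstAttempt(String s) {
        return s.replaceAll("[\\p{Punct}&&[^_]]", "_");
    }

    // Second attempt: runs of characters outside [a-zA-Z0-9_.] collapse to a
    // single underscore, but underscores already in the input are untouched.
    public static String secondAttempt(String s) {
        return s.replaceAll("[^a-zA-Z0-9_.]+", "_");
    }

    public static void main(String[] args) {
        String input = "_@##%Default__$*_123_";
        // replaceAll does not modify 'input'; it returns a new String.
        System.out.println(firstAttempt(input));  // _____Default_____123_
        System.out.println(secondAttempt(input)); // __Default____123_
    }
}
```

Neither pattern touches the leading and trailing underscores, and neither collapses the underscores that are already present, which is why the output is not Default_123.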

Answer 1

Score: 5

You may use this regex in replaceAll:

String str = "_@##%Default__$*_123_";
str = str.replaceAll("[\\p{Punct}&&[^_]]+|^_+|\\p{Punct}+(?=_|$)", "");
//=> "Default_123"

RegEx Details:

  • [\\p{Punct}&&[^_]]+: Match 1+ punctuation characters that are not _
  • |: OR
  • ^_+: Match 1+ underscores at the start of the string
  • |: OR
  • \\p{Punct}+(?=_|$): Match 1+ punctuation characters (underscores included) only if they are followed by a _ or the end of the string.
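
One caveat may be worth checking: the regex above deletes matches rather than replacing them, so it relies on an underscore already being present between the parts to keep. The sketch below compares it with a two-step alternative. The alternative, along with the names UnderscoreCleanup, answerRegex and twoStep, is an assumption added for illustration and is not part of the answer. It also assumes that "special character" means anything outside a-z, A-Z and 0-9, so spaces, dots and non-ASCII letters are replaced as well.

```java
public class UnderscoreCleanup {
    // The regex from the answer above, unchanged: matches are deleted.
    public static String answerRegex(String s) {
        return s.replaceAll("[\\p{Punct}&&[^_]]+|^_+|\\p{Punct}+(?=_|$)", "");
    }

    // Hypothetical two-step alternative (an assumption, not the answer's method).
    public static String twoStep(String s) {
        return s
            // Step 1: collapse each run of non-alphanumerics into one underscore.
            .replaceAll("[^a-zA-Z0-9]+", "_")
            // Step 2: strip underscores left at the start or the end.
            .replaceAll("^_+|_+$", "");
    }

    public static void main(String[] args) {
        String sample = "_@##%Default__$*_123_";
        System.out.println(answerRegex(sample)); // Default_123
        System.out.println(twoStep(sample));     // Default_123

        // No underscore between the two parts in this input.
        String noUnderscore = "Default$*123";
        System.out.println(answerRegex(noUnderscore)); // Default123
        System.out.println(twoStep(noUnderscore));     // Default_123
    }
}
```

Both approaches agree on the sample from the question. They differ only when special characters sit between two alphanumeric parts with no underscore among them, so which one fits depends on the inputs you expect.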

huangapple
  • Published on October 1, 2020, 13:55:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64149821.html