Remove special characters from a string only if they are not inside a word

Question

I want to remove all special characters from a string, but only those that are not inside a word.

Special characters: <>{}\"/|;:.,~!?@#$%^=&*'

Example:

String str = "//won't won't wo/'n't wont wont'.";
str.replaceAll(/* regex? */, ""); // desired result: "won't won't won't wont wont"

Does anyone know how to achieve this with a regex?

Answer 1

Score: 2

This regex:

(?<!\w)\W+|\W+(?!\w)

matches either of two alternatives, (1|2):

  1. Any run of special characters (non-word characters: \W) that is not preceded by a word character (\w).
  2. Any run of special characters that is not followed by a word character.

This works because a special character that matches either alternative has a non-word character (or the start or end of the string) on at least one side, so it cannot be enclosed by a word.

"is preceded by": positive lookbehind. (?<=y)X: X is preceded by y.
"is not preceded by": negative lookbehind. (?<!y)X: X is not preceded by y.
"is followed by": positive lookahead. X(?=y): X is followed by y.
"is not followed by": negative lookahead. X(?!y): X is not followed by y.

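You should replace \W with a character class containing your set of special characters (appropriately escaped), because \W also matches whitespace and every other non-word character.

One thing to note about this solution is that it does not depend on the presence of whitespace between words.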
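The substitution suggested above can be sketched in Java roughly as follows. This is a minimal sketch rather than the answerer's own code: the class name StripOuterSpecials, the constant names, and the strip helper are hypothetical, and reading \" in the question's list as a backslash plus a double quote is an assumption.

```java
import java.util.regex.Pattern;

class StripOuterSpecials {
    // The question's special characters as a regex character class.
    // Assumption: the \" in the question's list means a backslash and a
    // double quote, so both are included here.
    static final String SPECIALS = "[<>{}\\\\\"/|;:.,~!?@#$%^=&*']";

    // (?<!\w)S+ : a run of specials not preceded by a word character
    // S+(?!\w)  : a run of specials not followed by a word character
    static final Pattern OUTSIDE_WORD = Pattern.compile(
            "(?<!\\w)" + SPECIALS + "+|" + SPECIALS + "+(?!\\w)");

    // Removes every special character that is not enclosed by word characters.
    static String strip(String input) {
        return OUTSIDE_WORD.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String str = "//won't won't wo/'n't wont wont'.";
        System.out.println(strip(str)); // won't won't won't wont wont
    }
}
```

Because the character class does not contain whitespace, spaces are never candidates for removal, whatever surrounds them.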

Answer 2

Score: 0

Try this:

(?<![a-z])[<>{}"\/|;:.,~!?@#$%^=&*']|[<>{}"\/|;:.,~!?@#$%^=&*'](?![a-z])

with the case-insensitive flag set (/i; in Java, Pattern.CASE_INSENSITIVE or an inline (?i)).

Java's regex engine performs the following operations.

(?<![a-z])                  # negative lookbehind: the preceding
                            # character is not a letter
[<>{}"\/|;:.,~!?@#$%^=&*']  # match a special character
|
[<>{}"\/|;:.,~!?@#$%^=&*']  # match a special character
(?![a-z])                   # negative lookahead: the following
                            # character is not a letter
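
For illustration, the pattern could be applied from Java along these lines. This is a hedged sketch, not part of the original answer: the class name StripByLetterContext and the strip helper are hypothetical, and the flag is passed as Pattern.CASE_INSENSITIVE instead of /i. The slash is left unescaped because Java regex literals do not use / as a delimiter.

```java
import java.util.regex.Pattern;

class StripByLetterContext {
    // Character class taken from the answer; it contains no backslash.
    static final String SPECIALS = "[<>{}\"/|;:.,~!?@#$%^=&*']";

    // A special character is removed when the character before it is not
    // a letter, or when the character after it is not a letter.
    static final Pattern OUTSIDE_LETTERS = Pattern.compile(
            "(?<![a-z])" + SPECIALS + "|" + SPECIALS + "(?![a-z])",
            Pattern.CASE_INSENSITIVE);

    static String strip(String input) {
        return OUTSIDE_LETTERS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String str = "//won't won't wo/'n't wont wont'.";
        System.out.println(strip(str)); // won't won't won't wont wont
    }
}
```

Since the lookarounds test for letters rather than \w, a special character between two digits (as in 5'6) is removed by this variant, whereas a \w-based lookaround would keep it.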

huangapple
  • Posted on April 6, 2020, 05:18:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/61049643.html