Regular expression to match certain special characters in the middle

Question

I am trying to build a Java regular expression that satisfies the following conditions:

  1. The identifier may not start or end with a special character.

  2. Sequences of multiple special characters are not permitted.

  3. The permitted special characters are: colon, hyphen (minus), period (full stop), and underscore.

I have done some analysis and built this regular expression:

String regularexp = "^[A-Za-z0-9](?:,/-/_.*?[^A-Za-z0-9]{2}).*?[A-Za-z0-9]$";

But somehow it is not working.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// `string` holds the identifier under test.
final Pattern pattern = Pattern.compile(regularexp);
final Matcher matcher = pattern.matcher(string);

// Print every match together with its capture groups.
while (matcher.find()) {
    System.out.println("Full match: " + matcher.group(0));
    for (int i = 1; i <= matcher.groupCount(); i++) {
        System.out.println("Group " + i + ": " + matcher.group(i));
    }
}

Can you please check where I am making a mistake?

Answer 1

Score: 1

The regex is easier to write if you focus on the parts that are not special characters. The string then looks like this (in EBNF notation):

<word> (<special-char> <word>)*
  • <word> is a sequence of one or more alphanumeric characters: [A-Za-z0-9]+.
    Because a <word> must not be empty, two <special-char>s can never be
    adjacent, and the string can neither start nor end with one.
  • <special-char> is exactly one special character: [-:._].
    I put - first so we don't need to escape it.
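The remark about hyphen placement can be checked with a small sketch. The class and method names here (HyphenPlacement, isSpecial, inAccidentalRange) are made up for illustration and are not part of the original answer. It compares [-:._], where the leading hyphen is a literal, with [.-_], where the hyphen sits between two characters and is read as a range:

```java
public class HyphenPlacement {
    // Hyphen first: treated as a literal '-', so the class holds exactly - : . _
    static boolean isSpecial(String s) {
        return s.matches("[-:._]");
    }

    // Hyphen between '.' and '_': read as the range '.' (U+002E) to '_' (U+005F).
    static boolean inAccidentalRange(String s) {
        return s.matches("[.-_]");
    }

    public static void main(String[] args) {
        System.out.println(isSpecial("-"));          // true
        System.out.println(isSpecial("A"));          // false
        System.out.println(inAccidentalRange("-"));  // false: '-' (U+002D) lies just below the range
        System.out.println(inAccidentalRange("A"));  // true: 'A' (U+0041) falls inside the range
    }
}
```

Escaping the hyphen (\\- in a Java string literal) would work as well; putting it first simply avoids the backslashes.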

Putting it together:

^[A-Za-z0-9]+([-:._][A-Za-z0-9]+)*$
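As a rough sketch of how this pattern could be used in Java, the class below wraps it in a validation method. The names IdentifierValidator and isValid are hypothetical, chosen only for this example. It uses Matcher.matches(), which requires the whole input to match, rather than the find() loop from the question:

```java
import java.util.regex.Pattern;

public class IdentifierValidator {
    // One or more alphanumeric words, separated by exactly one of - : . _
    private static final Pattern IDENTIFIER =
            Pattern.compile("^[A-Za-z0-9]+([-:._][A-Za-z0-9]+)*$");

    // Returns true only if the entire string is a valid identifier.
    static boolean isValid(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc"));        // true
        System.out.println(isValid("a-b_c.d:e"));  // true
        System.out.println(isValid("-abc"));       // false: starts with a special character
        System.out.println(isValid("abc-"));       // false: ends with a special character
        System.out.println(isValid("a--b"));       // false: two special characters in a row
        System.out.println(isValid(""));           // false: at least one word is required
    }
}
```

Since matches() already anchors at both ends, the ^ and $ are redundant here; they are kept so the pattern behaves the same if it is later used with find().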


Posted by huangapple on 2020-10-19 20:05:40
  • Link to this article: https://go.coder-hub.com/64426996.html