Regular expression does not match simple case

Question

What I currently have:

`^((([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+)*\.)+$`

Rules:

  1. The first character must be either a "." or [a-zA-Z] (it can only be "." if the string has length 1).
  2. The string must end in a ".".
  3. Immediately before any "." there can only be [a-zA-Z0-9].
  4. Other than a-zA-Z0-9 and ".", the only other character allowed is "-" (hyphen).
  5. Immediately after any "." there cannot be a "-".

Examples that should match:

`.`
`a.`
`a-9.`
`abc.`
`abc.a-c.abc.`

Examples that should not match:

`-.`
`-a.`
`a-.`
`a`
`abc.-bc`
`ab-.abc`
`abc.a-@c`
`..`

Currently it does not match `a.`, which is one of the simplest cases. Do you have any suggestions on how to fix it?
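
The behaviour described in the question can be reproduced with a small check. The following is a minimal sketch using Python's `re` module (the question does not name a language, so Python is an assumption here, and `SHOULD_MATCH`, `SHOULD_NOT_MATCH`, and `failures` are illustrative names). It runs the pattern from the question against both example lists and collects the cases that give the wrong result.

```python
import re

# Pattern copied from the question.
PATTERN = re.compile(r"^((([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+)*\.)+$")

SHOULD_MATCH = [".", "a.", "a-9.", "abc.", "abc.a-c.abc."]
SHOULD_NOT_MATCH = ["-.", "-a.", "a-.", "a", "abc.-bc", "ab-.abc", "abc.a-@c", ".."]


def failures(pattern, should_match, should_not_match):
    """Return the examples whose result differs from the expected one."""
    missed = [s for s in should_match if not pattern.fullmatch(s)]
    wrongly_accepted = [s for s in should_not_match if pattern.fullmatch(s)]
    return missed, wrongly_accepted


missed, wrongly_accepted = failures(PATTERN, SHOULD_MATCH, SHOULD_NOT_MATCH)
print("missed:", missed)
print("wrongly accepted:", wrongly_accepted)
```

With this pattern, `a.` ends up in the missed list, and `..` is accepted even though it should be rejected.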

Answer 1

Score: 1

`/^(?!\-)([A-Z0-9]|[-.](?![-.]))*\.$/i`

This will also handle the `..` and `--` cases.
Give it a try.

[Live demo on Regex101][1]

Let's break it down:

```md
/
^              Line start
(?!\-)         Must not start with -
(              Start of matching group
  [A-Z0-9]     Match a letter or digit
  |            OR
  [-.](?![-.]) A - or . not followed by - or .
)*             End group, matching 0 or more times
\.             Must end in .
$              Line end
/i             Treat as case insensitive
```
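
To check this pattern against the question's examples, here is a minimal sketch in Python (an assumption; the answer uses `/.../i` literal syntax, which corresponds to `re.IGNORECASE` here). It uses the form of the pattern given in the breakdown. The inputs `9.` and `.a.` are extra probes that do not come from the question; they were added to see how the pattern treats rule 1.

```python
import re

# Pattern as given in the breakdown; the /i flag becomes re.IGNORECASE.
PATTERN = re.compile(r"^(?!-)([A-Z0-9]|[-.](?![-.]))*\.$", re.IGNORECASE)

SHOULD_MATCH = [".", "a.", "a-9.", "abc.", "abc.a-c.abc."]
SHOULD_NOT_MATCH = ["-.", "-a.", "a-.", "a", "abc.-bc", "ab-.abc", "abc.a-@c", ".."]

# Every example from the question behaves as requested.
all_examples_ok = (
    all(PATTERN.fullmatch(s) for s in SHOULD_MATCH)
    and not any(PATTERN.fullmatch(s) for s in SHOULD_NOT_MATCH)
)
print("question examples ok:", all_examples_ok)

# Extra probes (not from the question): rule 1 says the first character
# must be a letter, or "." only when the string has length 1.
extra = {s: bool(PATTERN.fullmatch(s)) for s in ["9.", ".a."]}
print(extra)
```

All of the question's examples pass, but both extra probes are accepted. If rule 1 has to be enforced strictly, the pattern would need an additional constraint on the first character.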

Answer 2

Score: 1

As an alternative solution without lookarounds, you can start the string by matching a-zA-Z.

Then use an optional pattern that matches zero or more repetitions of the character class including the hyphen and ends by matching a character class without the hyphen. This prevents a hyphen from appearing before a dot in the repetition or before the final dot of the string.

With case-insensitive matching enabled:

`^(?:[a-z](?:[a-z0-9-]*[a-z0-9])?(?:\.[a-z0-9-]*[a-z0-9])*)?\.$`

In parts:

  • `^` Start of string
  • `(?:` Non-capturing group
    • `[a-z]` Match a single char a-z
    • `(?:` Non-capturing group
      • `[a-z0-9-]*` Match 0+ times any of a-z, 0-9 or -
      • `[a-z0-9]` End with a-z or 0-9 so that there cannot be a - before the .
    • `)?` Close group and make it optional
    • `(?:` Non-capturing group
      • `\.[a-z0-9-]*` Match a . and 0+ times any of a-z, 0-9 or -
      • `[a-z0-9]` End with a-z or 0-9 so that there cannot be a - before the .
    • `)*` Close group and repeat it 0+ times
  • `)?` Close group and make it optional to also allow a single dot
  • `\.` Match a single dot
  • `$` End of string
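
As a quick check, the sketch below runs this pattern in Python (an assumption, since the answer names no language) with `re.IGNORECASE`. It also probes `abc.-bc.`, an input that is not among the question's examples, and compares the result with a hypothetical stricter variant named `STRICT`. That variant is not part of the answer; it forces the character right after each dot to be a letter or digit, which is one way to read rule 5.

```python
import re

# Pattern from the answer, with case-insensitive matching enabled.
ANSWER = re.compile(
    r"^(?:[a-z](?:[a-z0-9-]*[a-z0-9])?(?:\.[a-z0-9-]*[a-z0-9])*)?\.$",
    re.IGNORECASE,
)

# Hypothetical variant (not from the answer): each part after a dot must
# start with a letter or digit, so a hyphen cannot directly follow a dot.
STRICT = re.compile(
    r"^(?:[a-z](?:[a-z0-9-]*[a-z0-9])?(?:\.[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)*)?\.$",
    re.IGNORECASE,
)

SHOULD_MATCH = [".", "a.", "a-9.", "abc.", "abc.a-c.abc."]
SHOULD_NOT_MATCH = ["-.", "-a.", "a-.", "a", "abc.-bc", "ab-.abc", "abc.a-@c", ".."]

# Both patterns agree with every example from the question.
for pattern in (ANSWER, STRICT):
    assert all(pattern.fullmatch(s) for s in SHOULD_MATCH)
    assert not any(pattern.fullmatch(s) for s in SHOULD_NOT_MATCH)

# Extra probe (not from the question): a hyphen directly after a dot.
probe = "abc.-bc."
answer_accepts = bool(ANSWER.fullmatch(probe))
strict_accepts = bool(STRICT.fullmatch(probe))
print(answer_accepts, strict_accepts)
```

The answer's pattern accepts `abc.-bc.` while the stricter variant rejects it, so the variant may be the safer choice if rule 5 matters for inputs beyond the listed examples.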

Regex demo

Answer 3

Score: -1

Taking it left to right:

`^` - start of the string, so far so good.

`(([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+)*` - in other words, attempt to match `([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+` as many times as we can; 0 times is also acceptable.

Let's try to match it once against `a.`:

`([A-Za-z])+` - okay, that'll match the `a`.

`([A-Za-z0-9\-])*` - that'll match nothing.

`([a-zA-Z0-9])+` - the match fails here. This part needs at least one more letter or digit, and the next character is `.`.

Therefore, the group doesn't match even once, and we skip right past that giant blob, after the `*`, and get to:

`\.` - this doesn't match; we're still on the `a`.
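
The trace above comes down to the inner group needing at least two characters: one for `([A-Za-z])+` and one for `([a-zA-Z0-9])+`. A minimal Python sketch (Python is an assumption, and `INNER` is an illustrative name) makes that visible by matching the inner group on its own.

```python
import re

# The inner group from the question, without the surrounding ( ... )* and \.
INNER = re.compile(r"([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+")

# A single letter cannot satisfy both mandatory parts of the group.
single = INNER.fullmatch("a")
print(single)  # prints None

# Two characters are enough: "a" for the first part, "b" for the last.
double = INNER.fullmatch("ab")
print(double is not None)  # prints True
```

This is why `a.` fails while `abc.` succeeds: the label before the dot has to be at least two characters long for the group to match.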

huangapple
  • Posted on August 29, 2020, 11:24:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63643107.html