Confusing example from the Python re module

Question

From the Python docs:

> re.sub(pattern, repl, string, count=0, flags=0)
>
> ...
>
> The optional argument count is the maximum number of pattern
> occurrences to be replaced; count must be a non-negative integer. If
> omitted or zero, all occurrences will be replaced. Empty matches for
> the pattern are replaced only when not adjacent to a previous empty
> match, so sub('x*', '-', 'abxd') returns '-a-b--d-'.

So x* should match

  1. The empty string before a
  2. The empty string between a and b
  3. The empty string between b and x
  4. The substring 'x'
  5. The empty string between x and d
  6. The empty string after d

Evidently (5) is not replaced, but I can't see why. If the word "empty" were removed from the sentence about empty matches quoted above, I could see why (5) would not be replaced. But (5) is not adjacent to a previous empty match.

Answer 1

Score: 2

> "... 3. The empty string between b and x ..."

I don't believe there would be an empty string between b and x, since x matches.
The pattern is effectively, "if x, 1 or more, or none".
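
One way to check this claim is to list the matches the engine actually reports. The following is a minimal sketch, assuming Python 3.7 or later (the handling of empty matches next to a non-empty match changed in 3.7); the variable names are only illustrative.

```python
import re

text = "abxd"

# Every match of x* that the engine reports, as (start, end) spans.
# A span whose start equals its end is an empty match.
spans = [m.span() for m in re.finditer("x*", text)]
print(spans)

# The substitution quoted from the docs.
replaced = re.sub("x*", "-", text)
print(replaced)
```

Under that assumption the spans are (0, 0), (1, 1), (2, 3), (3, 3) and (4, 4): there is no empty match at index 2, because x* consumes the x there.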

For example, the only reason there is an empty match between a and b is that b is not an x.
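
The same point can be probed at a single position with a compiled pattern's `match(string, pos)`. This is a small sketch rather than a description of how `re.sub` is implemented; `at_b` and `at_x` are names chosen here for illustration.

```python
import re

pattern = re.compile("x*")
text = "abxd"

# At index 1 the next character is 'b', so x* can only match the empty string.
at_b = pattern.match(text, 1)

# At index 2 the next character is 'x'; the greedy x* consumes it.
at_x = pattern.match(text, 2)

print(at_b.span(), repr(at_b.group()))
print(at_x.span(), repr(at_x.group()))
```

The first match is empty (start equals end), while the second covers the x, so no separate empty match is reported in front of it.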

             -----------------
characters   | a | b | x | d |
             -----------------
indices      0   1   2   3   4
step  indices  substring  is x   current string
1     0 to 1   a          false  -abxd
2     1 to 2   b          false  -a-bxd
3     2 to 3   x          true   -a-b-d
4     3 to 4   d          false  -a-b--d
5     4                   false  -a-b--d-
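
The steps in the table can be reproduced with a rough trace. This is only a sketch, assuming Python 3.7+ and a literal replacement string (no backreferences); `trace_sub` is a hypothetical helper written for this illustration, not part of the `re` module. Its trace is keyed on matches rather than on characters, so the rows do not line up one-to-one with the table.

```python
import re

def trace_sub(pattern, repl, string):
    """Rebuild the substituted string one match at a time, recording each step."""
    pieces = []    # output built so far
    steps = []     # (span, matched text, output so far)
    last_end = 0   # where the previous match ended
    for m in re.finditer(pattern, string):
        pieces.append(string[last_end:m.start()])  # unmatched text is copied as-is
        pieces.append(repl)                        # the match is replaced by repl
        last_end = m.end()
        steps.append((m.span(), m.group(), "".join(pieces)))
    pieces.append(string[last_end:])               # trailing unmatched text
    return "".join(pieces), steps

result, steps = trace_sub("x*", "-", "abxd")
for span, matched, so_far in steps:
    print(span, repr(matched), so_far)
print(result)
```

Each printed line shows a match span, the matched text (empty or 'x'), and the output accumulated up to that match; the final line should agree with `re.sub('x*', '-', 'abxd')`.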

huangapple
  • Posted on June 26, 2023, 10:07:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76553154.html