Confusing example from the Python re module
Question
From the Python docs:
> re.sub(pattern, repl, string, count=0, flags=0)
>
> ...
>
> The optional argument count is the maximum number of pattern
> occurrences to be replaced; count must be a non-negative integer. If
> omitted or zero, all occurrences will be replaced. Empty matches for
> the pattern are replaced only when
> **not adjacent to a previous empty match**,
> so sub('x*', '-', 'abxd') returns '-a-b--d-'.
So x* should match:
1. The empty string before a
2. The empty string between a and b
3. The empty string between b and x
4. The substring 'x'
5. The empty string between x and d
6. The empty string after d
Evidently (5) is not replaced, but I can't see why. If we removed the word "empty" from the bolded text above, I can see that (5) would not be replaced. But (5) is not adjacent to a previous empty match.
Answer 1
Score: 2
> "... 3. The empty string between b and x ..."
I don't believe there is an empty match between b and x, because x itself matches at that position.
The pattern effectively means "one or more x if there is an x here, otherwise nothing".
For example, the only reason the match between a and b is empty is that b is not an x.
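
The matches can be checked directly with `re.finditer`. This is a minimal sketch, assuming Python 3.7 or later (older versions treat an empty match that directly follows a non-empty one differently); the variable names are only illustrative.

```python
import re

text = "abxd"

# Collect every match of x* as (start, end, matched_text).
# An empty match has start == end.
matches = [(m.start(), m.end(), m.group()) for m in re.finditer(r"x*", text)]

for start, end, matched in matches:
    print(start, end, repr(matched))
# On Python 3.7+ this prints:
# 0 0 ''
# 1 1 ''
# 2 3 'x'
# 3 3 ''
# 4 4 ''

result = re.sub(r"x*", "-", text)
print(result)  # -a-b--d-
```

There is no empty match at position 2: the match starting there consumes the `x`. The empty match at position 3 follows a non-empty match, so it is still replaced.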
| index | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| character | a | b | x | d | (end) |

| step | indices | substring | is x | current string |
|---|---|---|---|---|
| 1 | 0 to 1 | a | false | -abxd |
| 2 | 1 to 2 | b | false | -a-bxd |
| 3 | 2 to 3 | x | true | -a-b-d |
| 4 | 3 to 4 | d | false | -a-b--d |
| 5 | 4 | (empty) | false | -a-b--d- |
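
The "current string" column can be replayed in code. This is only a sketch, not how `re.sub` is implemented internally: `trace_sub` is a hypothetical helper that walks the matches from `re.finditer` and records the string after each replacement, assuming Python 3.7+ and a literal replacement string with no backreferences.

```python
import re

def trace_sub(pattern, repl, text):
    """Hypothetical helper: return the intermediate strings, one per match."""
    out = []        # pieces of the result built so far
    last_end = 0    # end of the previous match in the original text
    steps = []
    for m in re.finditer(pattern, text):
        out.append(text[last_end:m.start()])  # untouched text before this match
        out.append(repl)                      # replacement for this match
        last_end = m.end()
        # Result so far, followed by the part of the input not yet scanned.
        steps.append("".join(out) + text[last_end:])
    return steps

steps = trace_sub(r"x*", "-", "abxd")
for number, current in enumerate(steps, start=1):
    print(number, current)
```

The last entry in `steps` is the same string that `re.sub(r"x*", "-", "abxd")` returns.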