2017-08-24 16:20:40 · go

Golang: Why does regexp.FindAllStringSubmatch() return [][]string and not []string?

Question

I am fairly new to Go, and this is the first time I have had to deal with regexp.

I am a bit surprised that someregex.FindAllStringSubmatch("somestring", -1) returns a slice of slices, [][]string, instead of a simple slice of strings, []string.

Example:

someRegex, _ := regexp.Compile("^.*(mes).*$")
matches := someRegex.FindAllStringSubmatch("somestring", -1)
fmt.Println(matches) // logs [[somestring mes]]

What is the reason for this behavior? I can't figure it out.

Answer 1

Score: 8

func (*Regexp) FindAllStringSubmatch extracts matches together with their captured submatches.

A submatch is the part of the text matched by a part of the regex that is enclosed in a pair of unescaped parentheses (a so-called capturing group).

In your case, ^.*(mes).*$ matches:

^ - start of string
.* - any 0 or more characters (other than a newline), as many as possible
(mes) - Capturing group 1: the substring mes
.*$ - the rest of the string.

So the match value is the whole string, and it is the first value in the output. Then, since there is a capturing group, there must be a place for it in the results, so mes is placed as the second item in the list.

Since there may be more than one match, we need a list of lists.

A better example may be one that extracts several matches and submatches (and has an optional group, too):

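package main

import (
	"fmt"
	"regexp"
)

func main() {
	// MustCompile panics on an invalid pattern instead of silently discarding the error.
	someRegex := regexp.MustCompile(`[^aouiye]([aouiye])([^aouiye])?`)
	// -1 asks for all matches; each inner slice is [whole match, group 1, group 2].
	matches := someRegex.FindAllStringSubmatch("somestri", -1)
	fmt.Printf("%q\n", matches)
}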

[^aouiye]([aouiye])([^aouiye])? matches a non-vowel, a vowel, and an optional non-vowel, capturing the last two into separate groups #1 and #2.

The result is [["som" "o" "m"] ["ri" "i" ""]]. There are 2 matches, and each contains the match value, the Group 1 value, and the Group 2 value. Since the ri match has no text captured into Group 2 (([^aouiye])?), that entry is an empty string, but it is still present because the group is defined in the regex pattern.

Answer 2

Score: 3

> FindAllStringSubmatch is the 'All' version of FindStringSubmatch; it
> returns a slice of all successive matches of the expression, as
> defined by the 'All' description in the package comment. A return
> value of nil indicates no match.

Docs.

To sum up: you get a slice of slices of strings because this is the 'All' version of FindStringSubmatch. FindStringSubmatch itself returns a single slice of strings ([]string) for the first match only.
