Using named matches from Go regex

Question

I'm coming from Python, so I'm probably just not looking at this the right way. I'd like to create a fairly complicated regex and access the matched fields by name. I can't find a good example. The closest I've managed to get is this:

package main

import (
	"fmt"
	"regexp"
)

var myExp = regexp.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	fmt.Printf("%+v", myExp.FindStringSubmatch("1234.5678.9"))

	match := myExp.FindStringSubmatch("1234.5678.9")
	for i, name := range myExp.SubexpNames() {
		fmt.Printf("'%s'\t %d -> %s\n", name, i, match[i])
	}
	//fmt.Printf("by name: %s %s\n", match["first"], match["second"])
}

The commented-out line is how I would expect to access the named fields in Python. What's the equivalent way to do this in Go?

Or, if I need to convert the match to a map, what's the most idiomatic way in Go to build and then access that map?

Answer 1

Score: 79

You can reference your named capture groups by building a map keyed on the names from SubexpNames, as follows:

package main

import (
	"fmt"
	"regexp"
)

var myExp = regexp.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	match := myExp.FindStringSubmatch("1234.5678.9")
	if match == nil {
		// FindStringSubmatch returns nil when the input does not match.
		fmt.Println("no match")
		return
	}
	result := make(map[string]string)
	for i, name := range myExp.SubexpNames() {
		// Index 0 is the whole match; unnamed groups have an empty name.
		if i != 0 && name != "" {
			result[name] = match[i]
		}
	}
	fmt.Printf("by name: %s %s\n", result["first"], result["second"])
}
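
If only one or two fields are needed, building a map can be skipped. Since Go 1.15 the standard `regexp` package provides `(*Regexp).SubexpIndex`, which returns the index of a named group, or -1 if no group has that name. The following is a minimal sketch, assuming Go 1.15 or later; the helper name `groupByName` is hypothetical and not part of the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

var myExp = regexp.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

// groupByName is a hypothetical helper. It returns the text captured by the
// named group, and false if s does not match or the name is unknown.
func groupByName(re *regexp.Regexp, s, name string) (string, bool) {
	match := re.FindStringSubmatch(s)
	idx := re.SubexpIndex(name) // -1 when no group has this name
	if match == nil || idx < 0 {
		return "", false
	}
	return match[idx], true
}

func main() {
	first, _ := groupByName(myExp, "1234.5678.9", "first")
	second, _ := groupByName(myExp, "1234.5678.9", "second")
	fmt.Printf("by name: %s %s\n", first, second) // by name: 1234 9
}
```

This runs the match once per lookup, so for many fields the map approach is likely the better fit.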


Answer 2

Score: 17

I don't have the reputation to comment, so forgive me if this shouldn't be an 'answer', but I found the above answer helpful, so I wrapped it into a function:

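func reSubMatchMap(r *regexp.Regexp, str string) map[string]string {
	match := r.FindStringSubmatch(str)
	if match == nil {
		// No match: return nil rather than indexing into a nil slice.
		return nil
	}
	subMatchMap := make(map[string]string)
	for i, name := range r.SubexpNames() {
		// Skip the whole match (index 0) and unnamed groups (empty name).
		if i != 0 && name != "" {
			subMatchMap[name] = match[i]
		}
	}

	return subMatchMap
}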

Example usage on Playground:
https://play.golang.org/p/LPLND6FnTXO

Hope this is helpful to someone else. Love the ease of named capture groups in Go.

Answer 3

Score: 11

Without a nil check, indexing the result of FindStringSubmatch panics when the input does not match, because the returned slice is nil.

The following avoids that by ranging over the match itself, so it creates a map with whatever named groups were actually found:

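func findNamedMatches(regex *regexp.Regexp, str string) map[string]string {
	// match is nil when str does not match, so the loop below is skipped.
	match := regex.FindStringSubmatch(str)
	names := regex.SubexpNames()

	results := map[string]string{}
	for i, value := range match {
		// Skip the whole match (index 0) and unnamed groups, whose name is "".
		if names[i] != "" {
			results[names[i]] = value
		}
	}
	return results
}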

This approach just returns a map of the named group matches. If the input doesn't match, it returns an empty map, which I've found much easier to deal with than a panic.

Answer 4

Score: 2

You can use the regroup library for this: https://github.com/oriser/regroup

Example:

package main

import (
	"fmt"

	"github.com/oriser/regroup"
)

var myExp = regroup.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	match, err := myExp.Groups("1234.5678.9")
	if err != nil {
		panic(err)
	}
	fmt.Printf("by name: %s %s\n", match["first"], match["second"])
}


You can also use a struct for that:

package main

import (
	"fmt"

	"github.com/oriser/regroup"
)

type Example struct {
	First  int `regroup:"first"`
	Second int `regroup:"second"`
}

var myExp = regroup.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	res := &Example{}
	err := myExp.MatchToTarget("1234.5678.9", res)
	if err != nil {
		panic(err)
	}
	fmt.Printf("by struct: %+v\n", res)
}


huangapple
  • Posted on 2013-12-24 04:38:42
  • When reposting, please keep this link: https://go.coder-hub.com/20750843.html