Replace a regular expression submatch using a function

Question

Let's say I have strings like

input := `bla bla b:foo="hop" blablabla b:bar="hu?"`

and I want to replace the parts between quotes in b:foo="hop" or b:bar="hu?" using a function.

It's easy to build a regular expression to get the match and submatch, for example

r := regexp.MustCompile(`\bb:\w+="([^"]+)"`)

and then to call ReplaceAllStringFunc, but the problem is that the callback receives the whole match, not the submatch:

fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
    // m is the whole match here. Damn.
    return m // placeholder so the snippet compiles
}))

How can I replace the submatch?

Right now, I haven't found a better solution than to decompose m myself inside the callback with a regex, and to rebuild the string after processing the submatch.

I would have used a positive lookbehind if it were available in Go, but it isn't (and it shouldn't be necessary anyway).
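To illustrate the lookbehind point, here is a minimal sketch showing that Go's regexp package refuses to compile such a pattern. The helper name compiles and the exact lookbehind pattern are assumptions made for this example only.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
// (Hypothetical helper, introduced only for this illustration.)
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// A lookbehind version of the pattern: rejected by Go's RE2-based engine.
	fmt.Println(compiles(`(?<=b:foo=")[^"]+`))
	// The capture-group version from the question: accepted.
	fmt.Println(compiles(`\bb:\w+="([^"]+)"`))
}
```

Because the lookbehind form is not available, the value has to be isolated with a capture group instead.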

What can I do here?


EDIT: here's my current solution, which I would like to simplify:

package main

import (
	"fmt"
	"regexp"
)

func complexFunc(s string) string {
	return "dbvalue(" + s + ")" // this could be more complex
}

func main() {
	input := `bla bla b:foo="hop" blablabla b:bar="hu?"`
	r := regexp.MustCompile(`(\bb:\w+=")([^"]+)`)
	fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
		// Second regex pass: split the whole match back into its groups.
		parts := r.FindStringSubmatch(m)
		return parts[1] + complexFunc(parts[2])
	}))
}


What bothers me is that I have to apply the regex twice. This doesn't sound right.

Answer 1

Score: 4


It looks like the OP created an issue for this, but as of this post it still isn't implemented: https://github.com/golang/go/issues/5690

Fortunately, someone else has published their own function that does this: https://gist.github.com/elliotchance/d419395aa776d632d897

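func ReplaceAllStringSubmatchFunc(re *regexp.Regexp, str string, repl func([]string) string) string {
	result := ""
	lastIndex := 0

	for _, v := range re.FindAllStringSubmatchIndex(str, -1) {
		// groups[0] is the whole match; groups[i] is the i-th submatch.
		groups := []string{}
		for i := 0; i < len(v); i += 2 {
			if v[i] < 0 {
				// A group that did not participate in the match has index -1.
				groups = append(groups, "")
				continue
			}
			groups = append(groups, str[v[i]:v[i+1]])
		}

		// Copy the text between matches unchanged, then the replacement.
		result += str[lastIndex:v[0]] + repl(groups)
		lastIndex = v[1]
	}

	return result + str[lastIndex:]
}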
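As a usage sketch, the same idea can be applied to the input from the question so that the regex runs only once. The names replaceSubmatchFunc, attrRe and rewrite are assumptions for this example; the helper is a compact variant of the function above that uses strings.Builder, not the gist's code verbatim.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// attrRe has two groups: the prefix up to the opening quote (group 1)
// and the quoted value (group 2), as in the question's EDIT.
var attrRe = regexp.MustCompile(`(\bb:\w+=")([^"]+)`)

// complexFunc is the placeholder transformation from the question.
func complexFunc(s string) string {
	return "dbvalue(" + s + ")"
}

// replaceSubmatchFunc calls repl with the groups of each match
// (groups[0] is the whole match) and substitutes its result.
func replaceSubmatchFunc(re *regexp.Regexp, s string, repl func(groups []string) string) string {
	var b strings.Builder
	last := 0
	for _, loc := range re.FindAllStringSubmatchIndex(s, -1) {
		groups := make([]string, 0, len(loc)/2)
		for i := 0; i < len(loc); i += 2 {
			if loc[i] < 0 {
				groups = append(groups, "") // group did not participate
				continue
			}
			groups = append(groups, s[loc[i]:loc[i+1]])
		}
		b.WriteString(s[last:loc[0]]) // text before this match, unchanged
		b.WriteString(repl(groups))
		last = loc[1]
	}
	b.WriteString(s[last:]) // text after the final match
	return b.String()
}

// rewrite applies complexFunc to every quoted value in a single pass.
func rewrite(input string) string {
	return replaceSubmatchFunc(attrRe, input, func(g []string) string {
		return g[1] + complexFunc(g[2])
	})
}

func main() {
	fmt.Println(rewrite(`bla bla b:foo="hop" blablabla b:bar="hu?"`))
	// prints: bla bla b:foo="dbvalue(hop)" blablabla b:bar="dbvalue(hu?)"
}
```

The callback still has to re-emit group 1 itself, because the return value replaces the whole match.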

Answer 2

Score: 3


I don't like the code below, but it seems to do what you want:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	input := `bla bla b:foo="hop" blablabla b:bar="hu?"`
	r := regexp.MustCompile(`\bb:\w+="([^"]+)"`)
	r2 := regexp.MustCompile(`"([^"]+)"`)
	fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
		return r2.ReplaceAllString(m, `"whatever"`)
	}))
}



Output

bla bla b:foo="whatever" blablabla b:bar="whatever"

EDIT: Take II.


package main

import (
	"fmt"
	"regexp"
)

func computedFrom(s string) string {
	return fmt.Sprintf("computedFrom(%s)", s)
}

func main() {
	input := `bla bla b:foo="hop" blablabla b:bar="hu?"`
	r := regexp.MustCompile(`\bb:\w+="([^"]+)"`)
	r2 := regexp.MustCompile(`"([^"]+)"`)
	fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
		// match includes the surrounding quotes.
		match := r2.FindString(m)
		return r2.ReplaceAllString(m, computedFrom(match))
	}))
}



Output:

bla bla b:foo=computedFrom("hop") blablabla b:bar=computedFrom("hu?")
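Note that Take II moves the quotes inside the computed value and still needs two regexes. A possible alternative, sketched here under the assumption that only the text of one capture group should change, is to rewrite just that group's byte range using the indexes from FindAllStringSubmatchIndex. The names replaceGroup and rewriteValues are made up for this example and are not part of the regexp package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// replaceGroup rewrites only the text captured by group n of each match,
// leaving the rest of the match (and the rest of s) untouched.
// n must be between 0 and re.NumSubexp(), otherwise the indexing panics.
func replaceGroup(re *regexp.Regexp, s string, n int, f func(string) string) string {
	var b strings.Builder
	last := 0
	for _, loc := range re.FindAllStringSubmatchIndex(s, -1) {
		start, end := loc[2*n], loc[2*n+1]
		if start < 0 {
			continue // group n did not participate in this match
		}
		b.WriteString(s[last:start]) // everything up to the group, unchanged
		b.WriteString(f(s[start:end]))
		last = end
	}
	b.WriteString(s[last:])
	return b.String()
}

func computedFrom(s string) string {
	return "computedFrom(" + s + ")"
}

// rewriteValues uses the original single-group regex from the question.
func rewriteValues(input string) string {
	re := regexp.MustCompile(`\bb:\w+="([^"]+)"`)
	return replaceGroup(re, input, 1, computedFrom)
}

func main() {
	fmt.Println(rewriteValues(`bla bla b:foo="hop" blablabla b:bar="hu?"`))
	// prints: bla bla b:foo="computedFrom(hop)" blablabla b:bar="computedFrom(hu?)"
}
```

With this shape the callback receives only the submatch, and the quotes stay where they were because they lie outside the replaced range.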

Answer 3

Score: 0


Try this:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	input := `bla bla b:foo="hop" blablabla b:bar="hu?"`
	r := regexp.MustCompile(`\b(b:\w+)="([^"]+)"`)
	// $1 re-inserts the captured attribute name; without it the name is dropped.
	fmt.Println(r.ReplaceAllString(input, "$1=whatever"))
}
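This approach only fits when the new value is a fixed string rather than the result of a function. In that case, a minimal sketch using ReplaceAllString's template syntax could look like the following; the function name setAll is an assumption for this example, and the value is escaped because a literal $ in a replacement template has to be written as $$.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// setAll replaces the quoted value of every b:name="..." attribute
// with the same fixed value, keeping the attribute name and the quotes.
func setAll(input, value string) string {
	re := regexp.MustCompile(`\b(b:\w+)="[^"]+"`)
	// Escape $ so the value is not interpreted as a template reference.
	escaped := strings.ReplaceAll(value, "$", "$$")
	// ${1} expands to the captured attribute name.
	return re.ReplaceAllString(input, `${1}="`+escaped+`"`)
}

func main() {
	fmt.Println(setAll(`bla bla b:foo="hop" blablabla b:bar="hu?"`, "whatever"))
	// prints: bla bla b:foo="whatever" blablabla b:bar="whatever"
}
```

It does not answer the original question, since no callback sees the submatch, but it avoids a second regex when the replacement is static.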

huangapple
  • Posted on June 12, 2013, 20:25:04
  • Please keep this link when reposting: https://go.coder-hub.com/17065465.html