How to access a capturing group from regexp.ReplaceAllFunc?
Question
How can I access a capture group from inside ReplaceAllFunc()?
package main

import (
	"fmt"
	"regexp"
)

func main() {
	body := []byte("Visit this page: [PageName]")
	search := regexp.MustCompile("\\[([a-zA-Z]+)\\]")
	body = search.ReplaceAllFunc(body, func(s []byte) []byte {
		// How can I access the capture group here?
	})
	fmt.Println(string(body))
}
The goal is to replace [PageName] with <a href="/view/PageName">PageName</a>.
This is the last task under the "Other tasks" section at the bottom of the Writing Web Applications Go tutorial.
Answer 1
Score: 7
I agree that having access to the capture group inside your function would be ideal, but I don't think it's possible with regexp.ReplaceAllFunc.
The only way I can think of to do this with that function is to strip the brackets from the match yourself:
package main

import (
	"fmt"
	"regexp"
)

func main() {
	body := []byte("Visit this page: [PageName] [OtherPageName]")
	search := regexp.MustCompile("\\[[a-zA-Z]+\\]")
	body = search.ReplaceAllFunc(body, func(s []byte) []byte {
		m := string(s[1 : len(s)-1])
		return []byte("<a href=\"/view/" + m + "\">" + m + "</a>")
	})
	fmt.Println(string(body))
}
EDIT
There is another way to do what you want. First, you can specify a non-capturing group with the syntax (?:re), where re is your regular expression. This is not essential, but it reduces the number of uninteresting submatches.
Next, there is regexp.FindAllSubmatchIndex. It returns a slice of slices, where each inner slice holds the index ranges of the whole match and of every submatch for one match of the regexp.
With these two things, you can build a fairly generic solution:
package main

import (
	"fmt"
	"regexp"
)

func ReplaceAllSubmatchFunc(re *regexp.Regexp, b []byte, f func(s []byte) []byte) []byte {
	idxs := re.FindAllSubmatchIndex(b, -1)
	if len(idxs) == 0 {
		return b
	}
	l := len(idxs)
	// Start with the text that precedes the first match.
	ret := append([]byte{}, b[:idxs[0][0]]...)
	for i, pair := range idxs {
		// Replace the whole match with f applied to the first capture group
		// (pair[2]:pair[3]); the pattern must contain at least one group.
		ret = append(ret, f(b[pair[2]:pair[3]])...)
		if i+1 < l {
			// Copy the text between this match and the next one.
			ret = append(ret, b[pair[1]:idxs[i+1][0]]...)
		}
	}
	// Copy the text after the last match.
	ret = append(ret, b[idxs[len(idxs)-1][1]:]...)
	return ret
}

func main() {
	body := []byte("Visit this page: [PageName] [OtherPageName][XYZ] [XY]")
	search := regexp.MustCompile("(?:\\[)([a-zA-Z]+)(?:\\])")
	body = ReplaceAllSubmatchFunc(search, body, func(s []byte) []byte {
		m := string(s)
		return []byte("<a href=\"/view/" + m + "\">" + m + "</a>")
	})
	fmt.Println(string(body))
}
Answer 2
Score: 3
If you want to get a group inside ReplaceAllFunc, you can call ReplaceAllString on the matched text to extract the subgroup.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	body := []byte("Visit this page: [PageName]")
	search := regexp.MustCompile("\\[([a-zA-Z]+)\\]")
	body = search.ReplaceAllFunc(body, func(s []byte) []byte {
		// How can I access the capture group here?
		group := search.ReplaceAllString(string(s), `$1`)
		fmt.Println(group)
		// handle group as you wish
		newGroup := "<a href='/view/" + group + "'>" + group + "</a>"
		return []byte(newGroup)
	})
	fmt.Println(string(body))
}
When there are several groups, you can extract each one the same way ($1, $2, and so on), then process them and return the desired value.
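When the replacement is a fixed template, as in the question's goal, the $1 expansion can probably be applied directly with ReplaceAll, without any callback. This is a different technique from the callback approach above, shown only as a minimal sketch; the names linkify and pageLink are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// pageLink matches [PageName] and captures the name as group 1.
var pageLink = regexp.MustCompile(`\[([a-zA-Z]+)\]`)

// linkify is a hypothetical helper name. It rewrites every [PageName]
// into an HTML link by expanding ${1} in the replacement template.
func linkify(body []byte) []byte {
	// ${1} is used instead of $1 so the group reference cannot run into
	// any following letters, digits, or underscores in the template.
	return pageLink.ReplaceAll(body, []byte(`<a href="/view/${1}">${1}</a>`))
}

func main() {
	fmt.Println(string(linkify([]byte("Visit this page: [PageName]"))))
}
```

A callback is still the better fit when the replacement has to be computed, for example when the page name must be looked up or escaped first.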
Answer 3
Score: 0
You have to call ReplaceAllFunc first and, inside the callback, call FindStringSubmatch on the same regex again. For example:
func (p parser) substituteEnvVars(data []byte) ([]byte, error) {
	// err is never assigned in this snippet, so the returned error is always nil.
	var err error
	substituted := p.envVarPattern.ReplaceAllFunc(data, func(matched []byte) []byte {
		// Re-run the pattern on the matched text to recover capture group 1.
		varName := p.envVarPattern.FindStringSubmatch(string(matched))[1]
		value := os.Getenv(varName)
		if len(value) == 0 {
			log.Printf("Fatal error substituting environment variable %s\n", varName)
		}
		return []byte(value)
	})
	return substituted, err
}