Is it possible to extract parts of a string and replace those parts in one operation in Golang?

Question

Say I want to extract all numbers from a string (most likely using regex matching), and I also want to replace each matched number with a generic placeholder like "#".

This is easily done in two steps using FindAll, then ReplaceAll. However, I have serious doubts about the performance cost of running both operations.

So take a string

"sdasd 3.2% sadas 6 ... +8.9"

replace it with

"sdasd #% sadas # ... +#"

and get a slice

[3.2,6.0,8.9]

In the most performant way possible.

Edit: I implemented regexp.FindAllString + regexp.ReplaceAllString, and the performance hit to my app was minimal. I will hopefully try Elliot Chance's approach and compare the two when I have time.

Answer 1

Score: 1

If you need raw performance, then regexp is rarely the way to achieve it, even if it is convenient. Iterating token by token should be pretty fast. Some code:

package main

import (
	"fmt"
	"strconv"
	"strings"
)

func main() {
	input := "sdasd 3.2 sadas 6"
	output := []string{}
	numbers := []float64{}

	// Any space-separated token that parses as a float is recorded and masked.
	for _, tok := range strings.Split(input, " ") {
		if f, err := strconv.ParseFloat(tok, 64); err == nil {
			numbers = append(numbers, f)
			tok = "#"
		}
		output = append(output, tok)
	}
	finalString := strings.Join(output, " ")
	fmt.Println(finalString, numbers)
}

playground link

I'm sure there are a few more optimizations that could be made, but this is the general approach I'd take.

Answer 2

Score: 0

Never underestimate the power of regex, especially the RE2 engine of Go.

Also, never, ever, assume anything about performance without benchmarking. It always surprises.

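In Go, a regular expression is compiled when you call regexp.Compile or regexp.MustCompile, and the package does not cache patterns for you. To be sure you pay that cost only once, compile the expression up front and reuse it.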

huangapple
  • Posted on May 6, 2017, 01:15:44
  • Please keep this link when reposting: https://go.coder-hub.com/43810523.html