Is it possible to extract parts of a string and replace those parts in one operation in Golang?
Question
Say I want to extract all numbers from a string (most likely using regex matching), and I also want to replace those matches with a generic placeholder like "#".
This is easily done in two steps using FindAll, then ReplaceAll. However, I have serious doubts about the performance cost of scanning the string twice.
So take the string
"sdasd 3.2% sadas 6 ... +8.9"
replace it with
"sdasd #% sadas # ... +#"
and get the slice
[3.2, 6.0, 8.9]
in the most performant way possible.
Edit: I implemented the regexp.FindAllString + regexp.ReplaceAllString approach, and the performance hit to my app was minimal. I hope to try Elliot Chance's approach and compare the two when I have time.
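For reference, the two-step approach mentioned in the edit might look roughly like the sketch below. The pattern `\d+(?:\.\d+)?` and the function name `extractThenReplace` are assumptions made for illustration; the pattern matches unsigned decimals only, so a leading "+" stays in the text, as in the expected output above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// numberRE is an assumed pattern: unsigned decimal numbers such as "6" or "3.2".
// It is compiled once at package level and reused.
var numberRE = regexp.MustCompile(`\d+(?:\.\d+)?`)

// extractThenReplace (hypothetical name) scans the input twice:
// once to collect the matches, once to substitute the placeholder.
func extractThenReplace(input string) (string, []float64) {
	matches := numberRE.FindAllString(input, -1)
	numbers := make([]float64, 0, len(matches))
	for _, m := range matches {
		// Matches are plain decimal literals, so errors are not expected
		// for ordinary input; skip the value defensively if one occurs.
		if f, err := strconv.ParseFloat(m, 64); err == nil {
			numbers = append(numbers, f)
		}
	}
	return numberRE.ReplaceAllString(input, "#"), numbers
}

func main() {
	out, numbers := extractThenReplace("sdasd 3.2% sadas 6 ... +8.9")
	fmt.Println(out)     // sdasd #% sadas # ... +#
	fmt.Println(numbers) // [3.2 6 8.9]
}
```

Both passes use the same compiled pattern, so the extra cost is one additional scan of the input.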
Answer 1
Score: 1
If you need raw performance, then regexp is rarely the way to achieve it, even if it is convenient. Iterating token by token should be pretty fast. Some code:
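package main

import (
	"fmt"
	"strconv"
	"strings"
)

func main() {
	input := "sdasd 3.2 sadas 6"
	output := []string{}
	numbers := []float64{}
	for _, tok := range strings.Split(input, " ") {
		// A token that parses as a float is recorded and replaced by "#".
		if f, err := strconv.ParseFloat(tok, 64); err == nil {
			numbers = append(numbers, f)
			tok = "#"
		}
		output = append(output, tok)
	}
	finalString := strings.Join(output, " ")
	fmt.Println(finalString, numbers) // prints: sdasd # sadas # [3.2 6]
}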
I'm sure there are a few more optimizations that could be made, but this is the general approach I'd take.
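One caveat with splitting on spaces: a token such as "3.2%" does not parse as a float, and "+8.9" is replaced whole, sign included, so the question's example would not come out as requested. A minimal sketch of a byte-level variant of the same idea follows. The names `scanNumbers` and `isDigit` are hypothetical, and the sketch assumes a number is a run of ASCII digits with an optional fractional part.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isDigit reports whether c is an ASCII digit.
func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// scanNumbers (hypothetical name) walks the input once. Each run of digits
// with an optional ".digits" part is parsed and replaced by "#"; every other
// byte is copied through unchanged.
func scanNumbers(input string) (string, []float64) {
	var out strings.Builder
	out.Grow(len(input)) // the result is never longer than the input
	var numbers []float64

	for i := 0; i < len(input); {
		if !isDigit(input[i]) {
			out.WriteByte(input[i])
			i++
			continue
		}
		// Consume the integer part.
		j := i
		for j < len(input) && isDigit(input[j]) {
			j++
		}
		// Consume a fractional part only when a digit follows the dot,
		// so a sentence-ending "6." keeps its period.
		if j+1 < len(input) && input[j] == '.' && isDigit(input[j+1]) {
			j++
			for j < len(input) && isDigit(input[j]) {
				j++
			}
		}
		if f, err := strconv.ParseFloat(input[i:j], 64); err == nil {
			numbers = append(numbers, f)
			out.WriteByte('#')
		} else {
			// Leave text that fails to parse untouched.
			out.WriteString(input[i:j])
		}
		i = j
	}
	return out.String(), numbers
}

func main() {
	out, numbers := scanNumbers("sdasd 3.2% sadas 6 ... +8.9")
	fmt.Println(out)     // sdasd #% sadas # ... +#
	fmt.Println(numbers) // [3.2 6 8.9]
}
```

Writing into a pre-sized strings.Builder avoids the intermediate slice of tokens and the final Join. Whether that matters in practice is something only a benchmark on real input can show.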
Answer 2
Score: 0
Never underestimate the power of regex, especially Go's RE2-style engine, which is guaranteed to run in time linear in the size of the input.
Also, never, ever assume anything about performance without benchmarking. The results are often surprising.
Some languages compile and cache regular expressions for you. Go's regexp package does not, so compile the pattern once up front (for example with regexp.MustCompile) and reuse it.
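As a sketch of what a single regex pass could look like, the snippet below compiles the pattern once and uses ReplaceAllStringFunc, whose callback sees every match in order, to collect the numbers while the replacement string is built. The pattern and the function name `extractAndReplace` are assumptions for illustration, not something taken from the answers above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// numberRE is an assumed pattern for unsigned decimals, compiled once.
var numberRE = regexp.MustCompile(`\d+(?:\.\d+)?`)

// extractAndReplace (hypothetical name) makes one pass over the input:
// the callback records each number and returns the placeholder.
func extractAndReplace(input string) (string, []float64) {
	var numbers []float64
	out := numberRE.ReplaceAllStringFunc(input, func(m string) string {
		f, err := strconv.ParseFloat(m, 64)
		if err != nil {
			return m // leave text that fails to parse untouched
		}
		numbers = append(numbers, f)
		return "#"
	})
	return out, numbers
}

func main() {
	out, numbers := extractAndReplace("sdasd 3.2% sadas 6 ... +8.9")
	fmt.Println(out)     // sdasd #% sadas # ... +#
	fmt.Println(numbers) // [3.2 6 8.9]
}
```

To compare this against a two-pass or hand-written version, a `testing.B` benchmark run with `go test -bench=.` on representative input is the usual route.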