Inverted Return from strings.Replace() Golang

Question

I have a large dataset where I need to do some string manipulation (I know strings are immutable). The Replace() function in the strings package does exactly what I need, except that I need it to search in reverse.

Say I have this string: AA-BB-CC-DD-EE

Run this script:

package main

import (
	"fmt"
	"strings"
)

func main() {
	fmt.Println(strings.Replace("AA-BB-CC-DD-EE", "-", "", 1))
}

It outputs: AABB-CC-DD-EE

What I need is: AA-BBCCDDEE, where the first instance of the search key is kept and every later instance is removed.

Splitting the string, inserting the dash, and joining it back together works. But I suspect there is a more performant way to achieve this.
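
For reference, the split-and-rejoin baseline mentioned above might look roughly like the sketch below. The name splitJoin is illustrative and not code from the question, and returning the input unchanged when it has no dash is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// splitJoin is a hypothetical name for the split-and-rejoin baseline:
// split on every dash, then put a single dash back after the first piece.
func splitJoin(s string) string {
	parts := strings.Split(s, "-")
	if len(parts) < 2 {
		return s // assumption: an input without a dash is returned as-is
	}
	return parts[0] + "-" + strings.Join(parts[1:], "")
}

func main() {
	fmt.Println(splitJoin("AA-BB-CC-DD-EE")) // AA-BBCCDDEE
}
```

This allocates a slice of substrings and then a joined string, which is the overhead the question hopes to avoid.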

Answer 1

Score: 4

String slices!

in := "AA-BB-CC-DD-EE"
afterDash := strings.Index(in, "-") + 1
fmt.Println(in[:afterDash] + strings.Replace(in[afterDash:], "-", "", -1))

(This might require some tweaking to get the behavior you want when the input has no dashes.)
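
As written, strings.Index returns -1 when there is no dash, so afterDash becomes 0 and the input passes through unchanged. If you would rather make that case explicit, one possible wrapper is sketched below. The helper name keepFirst and its handling of an empty separator are assumptions for illustration, and strings.ReplaceAll needs Go 1.12 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// keepFirst is a hypothetical helper: it keeps the first occurrence of sep
// in s and removes every later occurrence.
func keepFirst(s, sep string) string {
	i := strings.Index(s, sep)
	if sep == "" || i < 0 {
		return s // assumption: nothing to keep or remove, return the input
	}
	cut := i + len(sep) // byte offset just past the first separator
	return s[:cut] + strings.ReplaceAll(s[cut:], sep, "")
}

func main() {
	fmt.Println(keepFirst("AA-BB-CC-DD-EE", "-")) // AA-BBCCDDEE
	fmt.Println(keepFirst("AABB", "-"))           // AABB
}
```

Slicing a string does not copy it, so the only allocations come from the replacement and the final concatenation.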

Answer 2

Score: 1

This can be another solution:

package main

import (
	"strings"
	"fmt"
)

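// Reverse returns s with its runes in reverse order.
func Reverse(s string) string {
	n := len(s) // byte length, an upper bound on the rune count
	runes := make([]rune, n)
	for _, r := range s { // r avoids shadowing the built-in rune type
		n--
		runes[n] = r
	}
	return string(runes[n:])
}

func main() {
	s := "AA-BB-CC-DD-EE"
	// In the reversed string, the first Count-1 dashes are the last Count-1
	// dashes of the original, so only the original first dash survives.
	s = Reverse(strings.Replace(Reverse(s), "-", "", strings.Count(s, "-")-1))
	fmt.Println(s)
}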

Another solution:

package main

import (
	"fmt"
	"strings"
)

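func main() {
	// Protect the first dash with a placeholder, drop the remaining dashes,
	// then restore it. This assumes the input never contains "*".
	s := strings.Replace("AA-BB-CC-DD-EE", "-", "*", 1)
	s = strings.Replace(s, "-", "", -1)
	fmt.Println(strings.Replace(s, "*", "-", 1))
}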

Answer 3

Score: 1

I think you want to use strings.Map rather than rigging things with compositions of functions. It's basically meant for this scenario: character replacement with more complex requirements than Replace and cousins can handle. The definition:

>Map returns a copy of the string s with all its characters modified according to the mapping function. If mapping returns a negative value, the character is dropped from the string with no replacement.

Your mapping function can be built with a fairly simple closure:

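// makeReplaceFn returns a mapping function for strings.Map that keeps the
// first skipCount occurrences of toReplace and drops every later one.
// The closure is stateful, so build a fresh one for each string.
func makeReplaceFn(toReplace rune, skipCount int) func(rune) rune {
    count := 0
    return func(r rune) rune {
        if r == toReplace && count < skipCount {
            count++
        } else if r == toReplace && count >= skipCount {
            return -1
        }

        return r
    }
}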

From there, it's a very straightforward program:

strings.Map(makeReplaceFn('-', 1), "AA-BB-CC-DD-EE")

Run on the Playground, this produces the desired output:

>AA-BBCCDDEE
>
>Program exited.

Without benchmarking, I'm not sure whether this is faster or slower than the other solutions. On one hand, it has to call a function for each rune in the string; on the other hand, it doesn't have to convert (and thus copy) between a []byte/[]rune and a string between function calls. The subslicing answer by hobbs is probably the best overall.

In addition, the method can be easily adapted to other scenarios (e.g. retaining every other dash), with the caveat that strings.Map can only do rune to rune mapping, and not rune to string mapping like strings.Replace does.

Answer 4

Score: 0

This was a fun question to answer. While the solutions offered work neatly, splitting and replacing, to say nothing of calling Replace three times, doesn't seem likely to be performant.

The answer? Don't reinvent the wheel: the Go standard library has already almost solved this problem with Replace(), so let's tweak it. I stumbled a bit over how the API of our new function should work, and finally settled on keeping the signature of strings.Replace and changing as little as possible:

func ReplaceAfter(s, old, new string, skip int) string

The variable skip replaces n to clarify what it does since the caller will specify how many instances of old to skip replacing. skip==0 is defined as replacing every instance and skip==-1 is defined as replacing no instances.

From here there were really only a few bits of the function that needed changing.

// Requires the "strings" and "unicode/utf8" imports.
func ReplaceAfter(s, old, new string, skip int) string {
    if old == new || skip == -1 { // changed
        return s // avoid allocation
    }

    // Compute number of replacements.
    m := strings.Count(s, old)
    if m == 0 || m < skip { // changed
        return s // avoid allocation
    } // changed (removed else if)

    // Apply replacements to buffer.
    n := m - skip // changed, n means the same thing but is calculated differently
    t := make([]byte, len(s)+n*(len(new)-len(old))) // longer buffer
    w := 0
    start := 0
    for i := 0; i < m; i++ {
        j := start
        if len(old) == 0 {
            if i > 0 {
                _, wid := utf8.DecodeRuneInString(s[start:])
                j += wid
            }
        } else {
            j += strings.Index(s[start:], old)
        }
        if i >= skip { // changed, replace
            w += copy(t[w:], s[start:j])
            w += copy(t[w:], new)
        } else { // changed, skip ahead
            w += copy(t[w:], s[start:j+len(old)])
        }
        start = j + len(old)
    }
    w += copy(t[w:], s[start:])
    return string(t[0:w])
}

Here's a playground link with a working demo. If you're interested, I also copied and adapted the relevant Test functions from go/src/strings/, to make sure that the function as written behaved itself predictably.
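
To make the skip semantics concrete, here is a shorter sketch that behaves the same way for the common cases. It is not the function above: replaceAfterSketch is a hypothetical name, it assumes old is non-empty, and it treats any negative skip as "replace nothing". It also swaps the manual buffer for strings.ReplaceAll (Go 1.12 or later).

```go
package main

import (
	"fmt"
	"strings"
)

// replaceAfterSketch is a hypothetical, simplified take on the skip semantics
// described above. It leaves the first skip occurrences of old alone and
// replaces all later ones with new.
// Assumptions: old is non-empty; any negative skip means "replace nothing".
func replaceAfterSketch(s, old, new string, skip int) string {
	if old == "" || old == new || skip < 0 {
		return s
	}
	pos := 0 // byte offset just past the last skipped occurrence
	for i := 0; i < skip; i++ {
		j := strings.Index(s[pos:], old)
		if j < 0 {
			return s // fewer than skip occurrences: nothing to replace
		}
		pos += j + len(old)
	}
	return s[:pos] + strings.ReplaceAll(s[pos:], old, new)
}

func main() {
	in := "AA-BB-CC-DD-EE"
	fmt.Println(replaceAfterSketch(in, "-", "", 0))  // AABBCCDDEE
	fmt.Println(replaceAfterSketch(in, "-", "", 1))  // AA-BBCCDDEE
	fmt.Println(replaceAfterSketch(in, "-", "", -1)) // AA-BB-CC-DD-EE
}
```

The trade-off is an extra concatenation in exchange for much less code; whether that matters for a large dataset is something only a benchmark can settle.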

huangapple
  • Published on 2015-10-24 03:37:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/33310172.html