Easy way to get a sub-string/sub-slice of up to N characters/elements in Go

Question

In Python I can slice a string to get a sub-string of up to N characters and if the string is too short it will simply return the rest of the string, e.g.

"mystring"[:100] # Returns "mystring"

What's the easiest way to do the same in Go? Trying the same thing panics:

"mystring"[:100] // panic: runtime error: slice bounds out of range

Of course, I can write it all manually:

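func Substring(s string, startIndex int, count int) string {
	// Clamp count to the number of bytes remaining after startIndex.
	maxCount := len(s) - startIndex
	if count > maxCount {
		count = maxCount
	}
	// The upper bound is an index, not a length, so offset it by startIndex.
	return s[startIndex : startIndex+count]
}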

fmt.Println(Substring("mystring", 0, n))

But that's rather a lot of work for something so simple and (I would have thought) common. What's more, I don't know how to generalise this function to slices of other types, since Go doesn't support generics. I'm hoping there is a better way. Even math.Min() doesn't easily work here, because it expects and returns float64.

Answer 1

Score: 1

Note that while a function remains the recommended solution (even if it has to be implemented separately for slices of different types), the byte-indexed version above doesn't work well with strings.

fmt.Println(Substring("世界mystring", 0, 5)) would actually print 世�� instead of 世界mys, because the function slices bytes and each of these two characters occupies three bytes in UTF-8.
See "Code points, characters, and runes": a character may be represented by a number of different sequences of code points, and therefore different sequences of UTF-8 bytes.
In Go, a "code point" is called a rune.
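To make the byte/rune distinction concrete, here is a minimal sketch. The helper name byteAndRuneCounts is made up for illustration; len, utf8.RuneCountInString and utf8.ValidString are standard library calls.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts is a hypothetical helper that reports the length of s
// in bytes and in runes (code points).
func byteAndRuneCounts(s string) (nBytes int, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "世界mystring"

	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // 14 10

	// Slicing by byte index can cut a multi-byte character in half,
	// leaving a string that is no longer valid UTF-8.
	cut := s[:5]
	fmt.Println(utf8.ValidString(cut)) // false

	// Ranging over a string yields the byte offset where each rune starts.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()
}
```

The two leading characters start at byte offsets 0 and 3, so a cut at byte 5 lands inside the second one.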

Using runes would be more robust (again, in the case of strings):

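func SubstringRunes(s string, startIndex int, count int) string {
	// Convert to runes so that indices count characters, not bytes.
	runes := []rune(s)
	length := len(runes)
	maxCount := length - startIndex
	if count > maxCount {
		count = maxCount
	}
	// The upper bound is an index, not a length, so offset it by startIndex.
	return string(runes[startIndex : startIndex+count])
}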

See it in action in this playground.
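For readers on a newer toolchain: the question dates from 2017, and Go has since gained type parameters (Go 1.18), so the "one function per slice type" limitation no longer has to apply. The following is only a sketch under that assumption. FirstN and FirstNRunes are hypothetical names, not standard library functions, and FirstNRunes is an alternative to SubstringRunes that avoids converting the whole string to []rune.

```go
package main

import "fmt"

// FirstN returns at most n leading elements of s, for any element type.
// Hypothetical helper; needs type parameters, i.e. Go 1.18 or later.
func FirstN[T any](s []T, n int) []T {
	if n < 0 {
		n = 0
	}
	if n > len(s) {
		n = len(s)
	}
	return s[:n]
}

// FirstNRunes returns the prefix of s that holds at most n runes.
// It walks the string once and never splits a multi-byte character.
func FirstNRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	seen := 0
	for i := range s { // i is the byte offset where each rune starts
		if seen == n {
			return s[:i]
		}
		seen++
	}
	return s // s has n runes or fewer, so keep all of it
}

func main() {
	fmt.Println(FirstN([]int{1, 2, 3}, 100))        // [1 2 3]
	fmt.Println(FirstN([]string{"a", "b", "c"}, 2)) // [a b]
	fmt.Println(FirstNRunes("世界mystring", 5))       // 世界mys
	fmt.Println(FirstNRunes("mystring", 100))       // mystring
}
```

Like Python's slicing, both helpers clamp the count instead of panicking. Note that FirstN returns a sub-slice sharing the original backing array, which is the usual Go slicing behaviour.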

huangapple
  • Published on June 9, 2017, 22:07:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/44459927.html