Easy way to get a sub-string/sub-slice of up to N characters/elements in Go
Question
In Python I can slice a string to get a sub-string of up to N characters, and if the string is too short it simply returns the rest of the string, e.g.:
    "mystring"[:100] # Returns "mystring"
What's the easiest way to do the same in Go? Trying the same thing panics:
    "mystring"[:100] // panic: runtime error: slice bounds out of range
Of course, I can write it all manually:
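    // Substring returns up to count bytes of s, starting at startIndex.
    func Substring(s string, startIndex int, count int) string {
        // Clamp count to the number of bytes left after startIndex.
        maxCount := len(s) - startIndex
        if count > maxCount {
            count = maxCount
        }
        // The upper bound is an index, not a length, so add startIndex.
        return s[startIndex : startIndex+count]
    }

    fmt.Println(Substring("mystring", 0, n))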
But that's rather a lot of work for something so simple and (I would have thought) common. What's more, I don't know how to generalise this function to slices of other types, since Go doesn't support generics. I'm hoping there is a better way. Even math.Min() doesn't easily work here, because it expects and returns float64.
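The generics limitation held when this was asked; Go 1.18 and later support type parameters, and Go 1.21 added built-in min and max functions that work on integers. As a minimal sketch under that assumption, the clamping logic can be written once for any slice type. SubSlice is a hypothetical name, not a standard library function, and the explicit comparisons below could be shortened with the built-in min on Go 1.21+.

```go
package main

import "fmt"

// SubSlice is a hypothetical helper (not part of the standard library).
// It returns up to count elements of s starting at start, clamping both
// bounds to the slice instead of panicking. Requires Go 1.18+ for generics.
func SubSlice[T any](s []T, start, count int) []T {
	if start < 0 {
		start = 0
	}
	if start > len(s) {
		start = len(s)
	}
	if count < 0 {
		count = 0
	}
	// Clamp count to what is left after start; this also avoids
	// integer overflow in start+count for very large counts.
	if remaining := len(s) - start; count > remaining {
		count = remaining
	}
	return s[start : start+count]
}

func main() {
	fmt.Println(SubSlice([]int{1, 2, 3}, 1, 100))   // [2 3]
	fmt.Println(SubSlice([]string{"a", "b"}, 0, 1)) // [a]
}
```

The result shares its backing array with the input, as any Go slice expression does, so no elements are copied.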
Answer 1
Score: 1
Note that while a function remains the recommended solution (even if it has to be implemented separately for each slice type), it would not work well with string, because a string is indexed by byte rather than by character.
fmt.Println(Substring("世界mystring", 0, 5)) would actually print 世�� instead of 世界mys.
See "Code points, characters, and runes": a character may be represented by a number of different sequences of code points, and therefore different sequences of UTF-8 bytes.
And in Go, a "code point" is a rune (as seen here).
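To make the byte/rune distinction concrete, here is a small sketch using the standard unicode/utf8 package. The helper name byteAndRuneCounts is made up for illustration; len, utf8.RuneCountInString, and utf8.ValidString are standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (hypothetical helper) reports the length of s
// in bytes and in runes (Unicode code points).
func byteAndRuneCounts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "世界mystring"
	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // 14 10

	// Each of the two CJK characters takes 3 bytes in UTF-8, so cutting
	// at byte 5 splits the second one and leaves invalid UTF-8 behind.
	fmt.Println(utf8.ValidString(s[:5])) // false
	fmt.Println(utf8.ValidString(s[:6])) // true
}
```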
Using rune would be more robust (again, in the case of strings):
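    // SubstringRunes returns up to count runes of s, starting at rune index startIndex.
    func SubstringRunes(s string, startIndex int, count int) string {
        runes := []rune(s) // decode the UTF-8 bytes into code points
        length := len(runes)
        // Clamp count to the number of runes left after startIndex.
        maxCount := length - startIndex
        if count > maxCount {
            count = maxCount
        }
        // The upper bound is an index, not a length, so add startIndex.
        return string(runes[startIndex : startIndex+count])
    }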
See it in action in this playground.
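Converting to []rune copies the whole string even when only a short prefix is wanted. For the common "first N characters" case from the question, one possible alternative is to walk the string with range, which yields the byte offset at which each rune starts, and slice at the right offset. This is only a sketch; PrefixRunes is a hypothetical name, not a library function.

```go
package main

import "fmt"

// PrefixRunes (hypothetical helper) returns the first n runes of s,
// or all of s if it contains fewer than n runes. It slices the
// original string instead of allocating a []rune.
func PrefixRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	seen := 0
	for i := range s { // i is the byte offset where each rune starts
		if seen == n {
			return s[:i] // cut exactly on a rune boundary
		}
		seen++
	}
	return s // fewer than n runes: return everything
}

func main() {
	fmt.Println(PrefixRunes("世界mystring", 5)) // 世界mys
	fmt.Println(PrefixRunes("mystring", 100)) // mystring
}
```

Like the []rune version, this counts code points, so a user-perceived character built from several code points (for example a letter plus a combining accent) can still be split.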