How can I easily get a substring in Go while guarding against "slice bounds out of range" error?

Question

Using Go, I want to truncate long strings to an arbitrary length (e.g. for logging).

package main

import "fmt"

const maxLen = 100

func main() {
	myString := "This string might be longer, so we'll keep all except the first 100 bytes."

	fmt.Println(myString[:10])     // Prints the first 10 bytes
	fmt.Println(myString[:maxLen]) // panic: runtime error: slice bounds out of range
}

For now, I can solve it with an extra variable and if statement, but that seems very long-winded:

package main

import "fmt"

const maxLen = 100

func main() {
	myString := "This string might be longer, so we'll keep all except the first 100 bytes."

	// Clamp the slice bound to the string length so it can never go out of range.
	limit := len(myString)
	if limit > maxLen {
		limit = maxLen
	}

	fmt.Println(myString[:limit]) // Prints the first 100 bytes, or the whole string if shorter
}

Is there a shorter/cleaner way?

Answer 1

Score: 7

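Use a simple function to hide the implementation details. Note that this version counts characters (runes) rather than bytes, so it never splits a multi-byte character. For example,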

package main

import "fmt"

func maxString(s string, max int) string {
	if len(s) > max {
		r := 0
		for i := range s {
			r++
			if r > max {
				return s[:i]
			}
		}
	}
	return s
}

func main() {
	s := "日本語"
	fmt.Println(s)
	fmt.Println(maxString(s, 2))
}

Output:

日本語
日本
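
If byte-based truncation is all you need, as in the question, the same "hide it in a function" idea gets very short. This is only a minimal sketch: truncateBytes is a hypothetical name, not something from the standard library, and unlike maxString it can split a multi-byte character.

```go
package main

import "fmt"

// truncateBytes is a hypothetical helper: it returns at most limit bytes of s.
// It works on bytes, so it may cut a multi-byte character in half.
func truncateBytes(s string, limit int) string {
	if limit < 0 {
		limit = 0 // treat a negative limit as zero instead of panicking
	}
	if len(s) > limit {
		return s[:limit]
	}
	return s
}

func main() {
	fmt.Println(truncateBytes("hello, world", 5)) // hello
	fmt.Println(truncateBytes("hi", 100))         // hi
}
```

The call site then stays a one-liner, e.g. fmt.Println(truncateBytes(myString, maxLen)).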

Answer 2

Score: 3

I'm assuming you want to keep at most maxLen bytes, i.e. what your code does rather than what your string says.

If you don't need the original myString, you can overwrite it like this:

package main

import "fmt"

const maxLen = 100

func main() {
	myString := "This string might be longer, so we'll keep the first 100 bytes."

	if len(myString) >= maxLen {
		myString = myString[:maxLen] // slicing is a constant-time operation in Go
	}

	fmt.Println(myString) // Prints the first 100 bytes, or the whole string if shorter
}

This might cut a Unicode character in half, leaving some garbage at the end. If you need to handle multi-byte Unicode, which you probably do, try this:

package main

import (
	"fmt"
	"unicode/utf8"
)

const maxLen = 100

func main() {
	myString := "日本語"

	mid := maxLen
	for len(myString) >= mid && !utf8.ValidString(myString[:mid]) {
		mid++ // add another byte from myString until we have a whole multi-byte character
	}
	if len(myString) > mid {
		myString = myString[:mid]
	}

	fmt.Println(myString) // Prints about the first 100 bytes (extended to a character boundary), or the whole string if shorter
}
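
To make the "cut in half" problem concrete, here is a small check on the same sample string. It is only an illustration: cutIsValid is a hypothetical helper name, and it assumes n is within the string's length.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cutIsValid is a hypothetical helper: it reports whether the first n bytes
// of s form valid UTF-8. It assumes 0 <= n <= len(s).
func cutIsValid(s string, n int) bool {
	return utf8.ValidString(s[:n])
}

func main() {
	s := "日本語" // each of these characters takes 3 bytes in UTF-8

	fmt.Println(len(s))                    // 9 (bytes)
	fmt.Println(utf8.RuneCountInString(s)) // 3 (characters)

	fmt.Println(cutIsValid(s, 3)) // true: exactly one whole character
	fmt.Println(cutIsValid(s, 4)) // false: the second character is cut after its first byte
}
```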

Or, if you can accept removing up to one character from the output, this version is a bit cleaner:

package main

import (
	"fmt"
	"unicode/utf8"
)

const maxLen = 100

func main() {
	myString := "日本語"

	for len(myString) > maxLen || !utf8.ValidString(myString) {
		myString = myString[:len(myString)-1] // remove a byte
	}

	fmt.Println(myString) // Prints at most 100 bytes ending on a character boundary, or the whole string if shorter
}
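
The loops above re-validate the string on every iteration. One possible alternative, sketched here under the assumption that the input is already valid UTF-8, is to step back from the byte limit until it lands on the first byte of a character, using utf8.RuneStart. The name truncateAtRune is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateAtRune is a hypothetical helper: it returns at most limit bytes of s
// without splitting a multi-byte character. It assumes s is valid UTF-8.
func truncateAtRune(s string, limit int) string {
	if limit < 0 {
		limit = 0
	}
	if len(s) <= limit {
		return s
	}
	// len(s) > limit here, so s[limit] is in range. While that byte is a
	// continuation byte, cutting at limit would split a character: back up.
	for limit > 0 && !utf8.RuneStart(s[limit]) {
		limit--
	}
	return s[:limit]
}

func main() {
	s := "日本語"
	fmt.Println(truncateAtRune(s, 4))   // 日
	fmt.Println(truncateAtRune(s, 6))   // 日本
	fmt.Println(truncateAtRune(s, 100)) // 日本語
}
```

Since a UTF-8 character is at most 4 bytes long, the loop backs up at most 3 bytes on valid input, so the result is never more than one partial character shorter than the limit.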

huangapple
  • Posted on January 17, 2016, 22:49:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/34839659.html