Indexing string as chars

Question

> The elements of strings have type byte and may be accessed using the
> usual indexing operations.

How can I get an element of a string as a character?

> "some"[1] -> "o"

Answer 1

Score: 10

The simplest solution is to convert it to a slice of runes:

    var runes = []rune("someString")
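As a minimal sketch of how the converted slice can then be indexed: each element is a rune, and converting a rune back to `string` gives its UTF-8 text. The helper name `charAt` is hypothetical and not part of the original answer.

```go
package main

import "fmt"

// charAt is a hypothetical helper: it returns the i-th code point of s
// as a string. It converts the whole string on every call, so it costs
// O(len(s)), and it panics if i is out of range.
func charAt(s string, i int) string {
	runes := []rune(s) // one element per Unicode code point
	return string(runes[i])
}

func main() {
	fmt.Println(charAt("some", 1))   // o
	fmt.Println(charAt("日本語", 1)) // 本
	fmt.Println("日本語"[1])          // 151: plain indexing yields a byte
}
```

If many lookups are needed on the same string, it is probably better to convert once and keep the `[]rune` around instead of calling a helper like this repeatedly.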

Note that when you iterate over a string with range, you don't need the conversion. See this example from Effective Go:

    for pos, char := range "日本語" {
        fmt.Printf("character %c starts at byte position %d\n", char, pos)
    }

This prints:

    character 日 starts at byte position 0
    character 本 starts at byte position 3
    character 語 starts at byte position 6
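Because `range` already decodes runes, a single character can also be fetched without allocating a rune slice. The following is only a sketch under that assumption; `runeAt` is a hypothetical name, and its index counts runes rather than bytes.

```go
package main

import "fmt"

// runeAt is a hypothetical helper: it returns the n-th rune (0-based) of s
// by ranging over the string, and false if s has fewer than n+1 runes.
func runeAt(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	i := 0
	for _, r := range s {
		if i == n {
			return r, true
		}
		i++
	}
	return 0, false
}

func main() {
	if r, ok := runeAt("日本語", 2); ok {
		fmt.Printf("%c\n", r) // 語
	}
	_, ok := runeAt("some", 10)
	fmt.Println(ok) // false
}
```

The lookup is still linear in the length of the string, since UTF-8 is a variable-width encoding and the position of the n-th rune cannot be computed directly.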

Answer 2

Score: 4

Go strings are usually, but not necessarily, UTF-8 encoded. When they do hold Unicode text, the term "character" is pretty complex, and there is no general, unique bijection between runes (code points) and Unicode characters.
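Both points can be illustrated with a short sketch. The sample strings and the helper `stats` are chosen here for illustration and are not from the original answer: a string with an invalid byte still ranges without error (the bad byte decodes as U+FFFD), and one visible character can consist of several code points.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// stats is a hypothetical helper reporting the byte length, the rune
// count, and whether s is valid UTF-8.
func stats(s string) (nBytes, nRunes int, valid bool) {
	return len(s), utf8.RuneCountInString(s), utf8.ValidString(s)
}

func main() {
	// A string is just bytes; nothing forces it to be valid UTF-8.
	bad := "a\xffb"
	fmt.Println(stats(bad)) // 3 3 false
	for _, r := range bad {
		fmt.Printf("%U ", r) // the invalid byte shows up as U+FFFD
	}
	fmt.Println()

	// "e" followed by U+0301 (combining acute accent) displays as one
	// character but is two code points and three bytes.
	fmt.Println(stats("e\u0301")) // 3 2 true
}
```

So indexing a `[]rune` gives code points, which may or may not match what a reader would call a character.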

Anyway, one can easily work with code points (runes) by converting the string to a slice and indexing into it:

    package main

    import "fmt"

    func main() {
        utf8 := "Hello, 世界"
        runes := []rune(utf8) // one element per code point
        // Print the raw UTF-8 bytes in hex, then the rune slice in Go syntax.
        fmt.Printf("utf8:% 02x\nrunes: %#v\n", []byte(utf8), runes)
    }

Also here: http://play.golang.org/p/qWVSA-n93o

Note: Often, the desire to access Unicode "characters" by index is a design mistake. Most textual data is processed sequentially.

Answer 3

Score: 0

Another option is the utf8string package:

    package main

    import "golang.org/x/exp/utf8string"

    func main() {
        s := utf8string.NewString("🧡💛💚💙💜")
        t := s.At(2) // rune at index 2, counted in code points
        println(t == '💚')
    }

https://pkg.go.dev/golang.org/x/exp/utf8string

huangapple
  • Published on 2012-10-29 18:36:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/13119937.html