How can I iterate over a string by runes in Go?

Question

I wanted to do this:

for i := 0; i < len(str); i++ {
    dosomethingwithrune(str[i]) // takes a rune
}

But it turns out that str[i] has type byte (uint8) rather than rune.

How can I iterate over the string by runes rather than bytes?

Answer 1

Score: 171

See this example from Effective Go:

for pos, char := range "日本語" {
    fmt.Printf("character %c starts at byte position %d\n", char, pos)
}

This prints:

character 日 starts at byte position 0
character 本 starts at byte position 3
character 語 starts at byte position 6

> For strings, the range does more work for you, breaking out individual
> Unicode code points by parsing the UTF-8.
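
To make the "parsing the UTF-8" part concrete, here is a minimal sketch of roughly what the range loop does for you, written by hand with utf8.DecodeRuneInString from the standard unicode/utf8 package. The helper name runeStarts is made up for this illustration and is not part of Effective Go.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts (hypothetical helper) returns the byte offset at which each
// rune of s begins, decoding by hand what a range loop decodes implicitly.
func runeStarts(s string) []int {
	var starts []int
	for i := 0; i < len(s); {
		// DecodeRuneInString reports the first rune in s[i:] and its width in bytes.
		_, size := utf8.DecodeRuneInString(s[i:])
		starts = append(starts, i)
		i += size
	}
	return starts
}

func main() {
	fmt.Println(runeStarts("日本語")) // [0 3 6]
}
```

The offsets match the byte positions printed by the range loop above, because each of these characters takes three bytes in UTF-8.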

Answer 2

Score: 42

To mirror an example given at golang.org, Go allows you to easily convert a string to a slice of runes and then iterate over that, just like you wanted to originally:

runes := []rune("Hello, 世界")
for i := 0; i < len(runes); i++ {
    fmt.Printf("Rune %v is '%c'\n", i, runes[i])
}

Of course, we could also use a range operator like in the other examples here, but this more closely follows your original syntax. In any case, this will output:

Rune 0 is 'H'
Rune 1 is 'e'
Rune 2 is 'l'
Rune 3 is 'l'
Rune 4 is 'o'
Rune 5 is ','
Rune 6 is ' '
Rune 7 is '世'
Rune 8 is '界'

Note that since the rune type is an alias for int32, we must use %c instead of the usual %v in the Printf statement, or we will see the integer representation of the Unicode code point (see A Tour of Go).
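
As a quick illustration of that point, the sketch below formats the same rune with several fmt verbs. The helper name describe is invented for this example; %v, %c, %q and %U are standard fmt verbs.

```go
package main

import "fmt"

// describe (hypothetical helper) formats one rune with four verbs:
// %v gives the integer value, %c the character, %q a quoted character
// literal, and %U the Unicode code point notation.
func describe(r rune) string {
	return fmt.Sprintf("%v %c %q %U", r, r, r, r)
}

func main() {
	fmt.Println(describe('世')) // 19990 世 '世' U+4E16
}
```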

Answer 3

Score: 20

For example:

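package main

import "fmt"

func main() {
        // i is the byte offset of each rune, not its index among the runes.
        // The loop variable is named r so it does not shadow the built-in type rune.
        for i, r := range "Hello, 世界" {
                fmt.Printf("%d: %c\n", i, r)
        }
}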

Output:

0: H
1: e
2: l
3: l
4: o
5: ,
6:  
7: 世
10: 界
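
Note the jump from 7 to 10 in the output: the first value produced by range is a byte offset, and 世 occupies three bytes. If you need the position of each rune among the runes instead, one option is to keep a separate counter. The following is a small sketch of that idea; runeAt is a made-up helper name, not a standard library function.

```go
package main

import "fmt"

// runeAt (hypothetical helper) returns the n-th rune (0-based) of s and
// whether it exists, counting runes separately from the byte offset.
func runeAt(s string, n int) (rune, bool) {
	count := 0
	for _, r := range s {
		if count == n {
			return r, true
		}
		count++
	}
	return 0, false
}

func main() {
	s := "Hello, 世界"
	count := 0
	for offset, r := range s {
		fmt.Printf("rune %d at byte %d: %c\n", count, offset, r)
		count++
	}
	r, ok := runeAt(s, 8)
	fmt.Printf("%c %v\n", r, ok) // 界 true
}
```

This walks the string once per lookup, so for repeated random access the []rune conversion shown in Answer 2 is likely the simpler choice.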

Answer 4

Score: 2

Alternatively, here is a code example that doesn't use the fmt package:

package main

func main() {
	for _, r := range "Hello, 世界" {
		println(string(r))
	}
}

In the loop, the variable r holds the current rune. We convert it to a string with the conversion string(r) before printing it.
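
To show what that conversion produces, here is a minimal sketch that collects one string per rune. The helper name runeStrings is invented for this example. Note that the built-in println writes to standard error rather than standard output.

```go
package main

// runeStrings (hypothetical helper) splits s into one string per rune.
func runeStrings(s string) []string {
	var parts []string
	for _, r := range s {
		// string(r) yields the UTF-8 encoding of the code point r.
		parts = append(parts, string(r))
	}
	return parts
}

func main() {
	for _, p := range runeStrings("世界") {
		// Each of these one-rune strings is 3 bytes long.
		println(p, len(p))
	}
}
```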

Posted by huangapple on 2013-08-09 00:08:12
Source: https://go.coder-hub.com/18130859.html