Go: a for loop over a string prints each 'char' as an int. Why?

Question

A very simple Go function:

func genString(v string) {
	for _, c := range v {
		fmt.Println(c)
	}
}

Called in:

func TestBasics(t *testing.T) {
	genString("abc")
}

Then I ran:

go test -v -run TestBasics xxxxxx

It prints:

97
98
99

I expected it to print:

a
b
c

But it prints the corresponding integer values instead. Why, and how can I fix it so that it prints just the char?

Thanks!

Answer 1

Score: 2

> Why

Looping over a string with range yields a sequence of runes.

From the Go spec on for statements with a range clause:

Range expression                          1st value          2nd value

array or slice  a  [n]E, *[n]E, or []E    index    i  int    a[i]       E
string          s  string type            index    i  int    see below  rune
map             m  map[K]V                key      k  K      m[k]       V
channel         c  chan E, <-chan E       element  e  E

(notice the second row and last column in the table)

> 2. For a string value, the "range" clause iterates over the Unicode code
> points in the string starting at byte index 0. On successive
> iterations, the index value will be the index of the first byte of
> successive UTF-8-encoded code points in the string, and the second
> value, of type rune, will be the value of the corresponding code
> point. If the iteration encounters an invalid UTF-8 sequence, the
> second value will be 0xFFFD, the Unicode replacement character, and
> the next iteration will advance a single byte in the string.
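To make the quoted rule concrete, here is a small sketch that records the byte index and code point produced by each iteration. The helper name describeRange and the sample string are made up for illustration; the string deliberately mixes a two-byte character and one invalid byte.

```go
package main

import "fmt"

// describeRange (a hypothetical helper) returns one line per iteration of a
// range loop over s, recording the byte index and the code point it yields.
func describeRange(s string) []string {
	var lines []string
	for i, r := range s {
		lines = append(lines, fmt.Sprintf("index=%d rune=%U", i, r))
	}
	return lines
}

func main() {
	// "é" occupies two bytes in UTF-8, and "\xff" is not valid UTF-8.
	for _, line := range describeRange("aé\xffz") {
		fmt.Println(line)
	}
	// Output:
	// index=0 rune=U+0061
	// index=1 rune=U+00E9
	// index=3 rune=U+FFFD
	// index=4 rune=U+007A
}
```

The index jumps from 1 to 3 because "é" takes two bytes, and the invalid byte shows up as U+FFFD while the loop advances by a single byte.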

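A rune value is an integer identifying a Unicode code point; the type itself is just an alias for int32.
fmt.Println formats its arguments with the default verb %v, which for integers means %d, so each rune is printed as a decimal number.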
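As a quick check of the alias claim, the sketch below prints the type and default formatting of a rune. The helper name typeAndDefault is an assumption introduced only for this example.

```go
package main

import "fmt"

// typeAndDefault (a hypothetical helper) reports the type of r as fmt sees it
// and how r is rendered with the default %v verb.
func typeAndDefault(r rune) string {
	return fmt.Sprintf("%T %v", r, r)
}

func main() {
	fmt.Println(typeAndDefault('a')) // int32 97

	// No conversion is needed here: rune and int32 are the same type.
	var n int32 = 'a'
	fmt.Println(n) // 97
}
```

Because fmt only sees an int32, it has no way to know that the value was meant as a character, which is why an explicit verb or conversion is required.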


> how to fix it and print just the char

Use fmt.Printf with the %c verb to print the character value, i.e. fmt.Printf("%c\n", c)

fmt printing verbs doc:

Integers:

%b	base 2
%c	the character represented by the corresponding Unicode code point
%d	base 10
%o	base 8
%O	base 8 with 0o prefix
%q	a single-quoted character literal safely escaped with Go syntax.
%x	base 16, with lower-case letters for a-f
%X	base 16, with upper-case letters for A-F
%U	Unicode format: U+1234; same as "U+%04X"

(notice the second row in the table)


for _, c := range "abc" {
	fmt.Printf("%c\n", c)
}
}

https://go.dev/play/p/BEjJof4XvIk

huangapple
  • Posted on August 17, 2022 at 15:01:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73384111.html