2022年11月6日 14:35:31go评论129阅读模式

英文:

Type of string elements is uint8 using index and int32 on value

问题

这里我正在使用索引s[k]和值v检查字符串s的每个元素的类型，但返回不同的输出。使用索引i，我得到的类型是uint8，但对于值语义，我得到的是int32。

func main() {
    s := "AaBbCcXxYyZz"
    for k, v := range s {
        fmt.Printf("%v\t%T\t%s\n", s[k], s[k], string(s[k]))
        fmt.Printf("%v\t%T\t%s\n", v, v, string(v))
    }
}

英文:

Here I am checking the type of each elements of string s using the index s[k] and value v but returning different outputs. Using index i am getting the type uint8 but for value semantics I am getting the int32.

func main() {
    s := &quot;AaBbCcXxYyZz&quot;
	for k,v := range s {
        fmt.Printf(&quot;%v\t%T\t%s\n&quot;, s[k], s[k], string(s[k]))
		fmt.Printf(&quot;%v\t%T\t%s\n&quot;, v, v, string(v))
	} 
}

答案1

得分: 1

循环for k,v := range s {}遍历Unicode码点。在Golang中，它们被称为runes，并且以32位有符号整数表示：

对于字符串值，"range"子句从字节索引0开始迭代字符串中的Unicode码点。在后续的迭代中，索引值将是字符串中连续的UTF-8编码码点的第一个字节的索引，第二个值（类型为rune）将是相应码点的值。如果迭代遇到无效的UTF-8序列，则第二个值将为0xFFFD，即Unicode替换字符，并且下一次迭代将在字符串中前进一个字节。

Golang规范

索引s[k]返回字符串的内部表示中的字节。

对于多字节的字母表（如中文），这种差异很容易看出。尝试迭代字符串"給祭断情試紀脱答条証行日稿"（它是一个无意义的中文lorem impsum短语）：

s[0]: 231	uint8	&#231;
     :32102	int32	給
s[3]: 231	uint8	&#231;
     :31085	int32	祭
s[6]: 230	uint8	&#230;
     :26029	int32	断

看到k值之间的步长了吗？这是因为这些中文字符的utf-8编码占据了3个字节。
完整示例：https://go.dev/play/p/-44NZMojcgq

英文:

The loop for k,v := range s {} iterates over unicode codepoints. In Golang they are called runes and are represented as 32-bit signed inegers:
> For a string value, the "range" clause iterates over the Unicode code points in the string starting at byte index 0. On successive iterations, the index value will be the index of the first byte of successive UTF-8-encoded code points in the string, and the second value, of type rune, will be the value of the corresponding code point. If the iteration encounters an invalid UTF-8 sequence, the second value will be 0xFFFD, the Unicode replacement character, and the next iteration will advance a single byte in the string.

Golang specification

The indexing s[k] returns the byte in the internal representation of the string.

The difference is easy to see for multibyte alphabets, such as Chinese. Try iterate the string "給祭断情試紀脱答条証行日稿" (it a meaningless lorem impsum phrase in chinese):

s[0]: 231	uint8	&#231;
     :32102	int32	給
s[3]: 231	uint8	&#231;
     :31085	int32	祭
s[6]: 230	uint8	&#230;
     :26029	int32	断

See the step between the values of k? It is due to utf-8 encoding of those chinese characters occupies 3 bytes.
Full example: https://go.dev/play/p/-44NZMojcgq

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

字符串元素的类型是uint8，使用索引和int32对值进行操作。

问题

答案1

如果在我的 VS Code Golang 项目中自动完成功能不起作用，我该怎么办？

尝试在字符串列表中找到子集。

一个数据在通道中被两个例程接收到。

CORS问题 – React/Axios前端和Golang后端

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。