2015-05-15 23:44:07 · go

How to get a single Unicode character from a string

Question

I wonder how I can get a Unicode character from a string. For example, if the string is "你好", how can I get the first character "你"?

From another place I get one way:

var str = "你好"
runes := []rune(str)
fmt.Println(string(runes[0]))

It does work.
But I still have some questions:

Is there another way to do it?
Why does str[0] in Go return a byte rather than a Unicode character from the string?

Answer 1

Score: 44

First, you may want to read https://blog.golang.org/strings
It will answer part of your questions.

A string in Go can contain arbitrary bytes. When you write str[i], the result is a byte, and the index is always a byte offset, not a character position.
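
To make the byte-versus-character distinction concrete, here is a small sketch using the string from the question. The helper name byteAndRuneCounts is made up for this illustration; len and utf8.RuneCountInString are the standard ways to count bytes and runes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts is a hypothetical helper that returns the length of s
// in bytes and in runes (Unicode code points).
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	str := "你好"
	nBytes, nRunes := byteAndRuneCounts(str)
	fmt.Println(nBytes, nRunes) // 6 2: each of these characters takes 3 bytes in UTF-8
	fmt.Printf("%#x\n", str[0]) // 0xe4: only the first byte of the encoding of '你'
	fmt.Println(str[:3])        // 你: slicing on a rune boundary yields the whole character
}
```

So str[0] is not wrong, it simply answers a different question: it returns the byte at offset 0, not the first character.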

Most of the time, strings are encoded in UTF-8 though. You have multiple ways to deal with UTF-8 encoding in a string.

For instance, you can use the for...range statement to iterate on a string rune by rune.

var first rune
for _, c := range str {
    first = c
    break
}
// first now contains the first rune of the string

You can also leverage the unicode/utf8 package. For instance:

r, size := utf8.DecodeRuneInString(str)
// r contains the first rune of the string
// size is the size of the rune in bytes
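
As a runnable version of the snippet above, the following sketch wraps utf8.DecodeRuneInString in a small function that also handles empty or invalid input. The name firstRune is an assumption made for this example, not something from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRune is a hypothetical helper: it returns the first rune of s and
// reports whether that rune was decoded successfully.
func firstRune(s string) (rune, bool) {
	r, size := utf8.DecodeRuneInString(s)
	// DecodeRuneInString returns (utf8.RuneError, 0) for an empty string
	// and (utf8.RuneError, 1) for an invalid UTF-8 sequence.
	if r == utf8.RuneError && size <= 1 {
		return 0, false
	}
	return r, true
}

func main() {
	r, ok := firstRune("你好")
	fmt.Println(string(r), ok) // 你 true
}
```

Unlike the []rune(str) conversion from the question, this decodes only the first character and does not allocate a slice for the whole string.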

If the string is encoded in UTF-8, there is no direct way to access the nth rune of the string, because the size of the runes (in bytes) is not constant. If you need this feature, you can easily write your own helper function to do it (with for...range, or with the unicode/utf8 package).
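
One possible shape for such a helper, built on for...range, is sketched below. The name nthRune and its (rune, bool) return signature are choices made for this example only.

```go
package main

import "fmt"

// nthRune is a hypothetical helper that returns the rune at rune index n
// (0-based) in s. ok is false when n is out of range.
func nthRune(s string, n int) (r rune, ok bool) {
	if n < 0 {
		return 0, false
	}
	i := 0 // counts runes; the range index itself would be a byte offset
	for _, c := range s {
		if i == n {
			return c, true
		}
		i++
	}
	return 0, false
}

func main() {
	r, ok := nthRune("你好", 1)
	fmt.Println(string(r), ok) // 好 true
}
```

Each call scans from the start of the string, so the cost grows with n. If many random accesses are needed, converting once with []rune(str) and indexing the slice may be the simpler trade-off.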

Answer 2

Score: 2

You can use the utf8string package:

package main
import "golang.org/x/exp/utf8string"
func main() {
   s := utf8string.NewString("ÄÅàâäåçèéêëìîïü")
   // example 1
   r := s.At(1)
   println(r == 'Å')
   // example 2
   t := s.Slice(1, 3)
   println(t == "Åà")
}

https://pkg.go.dev/golang.org/x/exp/utf8string

Answer 3

Score: -2

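You can do this:

package main

import "fmt"

func main() {
  str := "cat"
  var s rune
  // i is the byte offset of each rune, not its character position
  for i, c := range str {
    if i == 1 {
      s = c
      break
    }
  }
  fmt.Println(string(s)) // prints "a"
}

s is now equal to 'a'. Note that the range index is a byte offset, so it matches the character position only for ASCII strings such as "cat".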
