When to use []byte or string in Go?
Question
Frequently in writing Go applications, I find myself with the choice to use []byte or string. Apart from the obvious mutability of []byte, how do I decide which one to use?
I have several use cases as examples:
- A function returns a new []byte. Since the slice capacity is fixed, what reason is there not to return a string?
- []byte values are not printed as nicely as string values by default, so I often find myself converting to string for logging purposes. Should it always have been a string?
- When prepending to a []byte, a new underlying array is always created. If the data to prepend is constant, why should this not be a string?
Answer 1
Score: 51
My advice would be to use string by default when you're working with text. But use []byte instead if one of the following conditions applies:
- The mutability of a []byte will significantly reduce the number of allocations needed.
- You are dealing with an API that uses []byte, and avoiding a conversion to string will simplify your code.
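As a rough illustration of the first condition, here is a minimal sketch that masks digits in a buffer. The function names maskDigits and maskDigitsString are made up for this example. The []byte version edits the data in place, while the string version has to copy, because Go strings are immutable.

```go
package main

import "fmt"

// maskDigits (hypothetical helper) overwrites every ASCII digit in buf
// with '*'. It modifies buf in place and allocates nothing.
func maskDigits(buf []byte) {
	for i, c := range buf {
		if c >= '0' && c <= '9' {
			buf[i] = '*'
		}
	}
}

// maskDigitsString (hypothetical helper) does the same for a string.
// Strings are immutable, so it copies into a []byte, edits the copy,
// and converts back, which allocates new memory.
func maskDigitsString(s string) string {
	buf := []byte(s)
	maskDigits(buf)
	return string(buf)
}

func main() {
	buf := []byte("card 1234 5678")
	maskDigits(buf)
	fmt.Printf("%s\n", buf) // card **** ****

	fmt.Println(maskDigitsString("pin 0000")) // pin ****
}
```

If the data passes through several such steps, keeping it as a []byte avoids one copy per step. For a single transformation the difference is usually negligible.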
Answer 2
Score: 20
I've gotten the sense that in Go, more than in any other non-ML-style language, the type is used to convey meaning and intended use. So the best way to figure out which type to use is to ask yourself what the data is.
A string represents text. Just text. The encoding is not something you have to worry about, and all operations work on a character-by-character basis, regardless of what a 'character' actually is.
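One caveat is worth a small sketch. In Go, len and indexing on a string work on bytes, while a range loop decodes UTF-8 and yields runes. So what counts as a "character" depends on the operation. The helper name byteAndRuneCount is invented for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount (hypothetical helper) reports the length of s
// in bytes and in runes (Unicode code points).
func byteAndRuneCount(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "h\u00e9llo" // "héllo"; the é takes two bytes in UTF-8

	nBytes, nRunes := byteAndRuneCount(s)
	fmt.Println(nBytes, nRunes) // 6 5

	// range decodes UTF-8: i is the byte offset, r is the rune.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()

	// Indexing yields a single byte, not a character.
	fmt.Println(s[1]) // 195, the first byte of é
}
```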
An array represents either binary data or a specific encoding of that data. []byte means that the data is either just a byte stream or a stream of single-byte characters. []int16 represents an integer stream or a stream of two-byte characters.
Given the fact that pretty much everything that deals with bytes also has functions to deal with strings and vice versa, I would suggest that instead of asking what you need to do with the data, you ask what that data represents. Then optimize once you have identified the bottlenecks.
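The symmetry mentioned above shows up in the standard library: the strings and bytes packages offer largely parallel functions. The sketch below uses two made-up wrappers, isErrorLine and isErrorLineBytes, to show the same check written against each type.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// isErrorLine (hypothetical helper) checks a log line held as a string.
func isErrorLine(line string) bool {
	return strings.HasPrefix(line, "ERROR")
}

// isErrorLineBytes (hypothetical helper) performs the same check on a
// []byte without converting it to a string first.
func isErrorLineBytes(line []byte) bool {
	return bytes.HasPrefix(line, []byte("ERROR"))
}

func main() {
	fmt.Println(isErrorLine("ERROR: disk full"))           // true
	fmt.Println(isErrorLineBytes([]byte("INFO: started"))) // false

	// Other functions come in matching pairs as well.
	fmt.Println(strings.ToUpper("abc"))                 // ABC
	fmt.Println(string(bytes.ToUpper([]byte("abc"))))   // ABC
	fmt.Println(strings.Contains("a,b", ","))           // true
	fmt.Println(bytes.Contains([]byte("a,b"), []byte(","))) // true
}
```

Because both forms exist, the choice of type rarely forces a conversion just to call a particular function.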
EDIT: This post is where I got the rationale for using type conversion to break up the string.
Answer 3
Score: 9
- One difference is that the returned []byte can potentially be reused to hold other data (without a new memory allocation), while a string cannot. Another is that, in the gc implementation at least, a string is one word smaller than a []byte. This can save some memory when many such items are live.
- Casting a []byte to string for logging is not necessary. Typical 'text' verbs like %s and %q work equally for string and []byte expressions. In the other direction, the same holds for e.g. %x or % 02x.
- It depends on why the concatenation is performed and whether the result will ever be combined again with something else afterwards. If so, []byte may perform better.
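The first two points can be checked with a short sketch. The helper names sameFormatting, headerWords and reuse are invented for this example, and the header sizes assume the gc toolchain, where a string header is a pointer plus a length and a slice header adds a capacity.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sameFormatting (hypothetical helper) reports whether %s, %q and %x
// produce identical output for a string and the equivalent []byte.
func sameFormatting(s string) bool {
	b := []byte(s)
	return fmt.Sprintf("%s", s) == fmt.Sprintf("%s", b) &&
		fmt.Sprintf("%q", s) == fmt.Sprintf("%q", b) &&
		fmt.Sprintf("%x", s) == fmt.Sprintf("%x", b)
}

// headerWords (hypothetical helper) returns the size of a string header
// and a []byte header, measured in machine words.
func headerWords() (strWords, sliceWords int) {
	word := unsafe.Sizeof(uintptr(0))
	return int(unsafe.Sizeof("") / word), int(unsafe.Sizeof([]byte(nil)) / word)
}

// reuse (hypothetical helper) overwrites buf with data, keeping the
// existing backing array whenever data fits in its capacity.
func reuse(buf []byte, data string) []byte {
	return append(buf[:0], data...)
}

func main() {
	b := []byte("hi\n")
	fmt.Printf("%q\n", b)  // "hi\n"
	fmt.Printf("% x\n", b) // 68 69 0a
	fmt.Println(sameFormatting("hi\n")) // true

	fmt.Println(headerWords()) // 2 3 with the gc toolchain

	buf := make([]byte, 0, 16)
	buf = reuse(buf, "first")
	buf = reuse(buf, "second") // fits in cap 16, so no new allocation
	fmt.Printf("%s %d\n", buf, cap(buf)) // second 16
}
```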