How to convert []int8 to string?

Question

What's the best way (fastest performance) to convert from []int8 to string?

For []byte we could do string(byteslice), but for []int8 it gives an error:

cannot convert ba (type []int8) to type string

I got ba from the SliceScan() method of *sqlx.Rows, which produces []int8 instead of string.

Is this solution the fastest?

func B2S(bs []int8) string {
    ba := []byte{}
    for _, b := range bs {
        ba = append(ba, byte(b))
    }
    return string(ba)
}

EDIT: My bad, it's uint8 instead of int8, so I can do string(ba) directly.

Answer 1

Score: 22

Note beforehand: The asker first stated that the input slice is []int8, so that is what this answer addresses. Later they realized the input is []uint8, which can be converted to string directly because byte is an alias for uint8 (and the []byte => string conversion is supported by the language spec).
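As a quick check of the note above, here is a minimal sketch of the []uint8 case. The helper name U2S is made up for illustration; it only wraps the built-in conversion.

```go
package main

import "fmt"

// U2S is a hypothetical helper name used only for this sketch.
// byte is an alias for uint8, so []uint8 and []byte are the same type
// and the built-in conversion to string applies directly.
func U2S(bs []uint8) string {
	return string(bs)
}

func main() {
	// These are the UTF-8 bytes of "世界" quoted later in this answer.
	bs := []uint8{228, 184, 150, 231, 149, 140}
	fmt.Println(U2S(bs)) // 世界
}
```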


You can't convert between slices of different element types directly; you have to do it manually.

The question is which slice type we should convert to. We have 2 candidates: []byte and []rune. Strings are stored internally as byte sequences (conventionally UTF-8 encoded text), which is what a []byte holds, and a string can also be converted to a slice of runes. The language supports converting both of these types ([]byte and []rune) to string.

A rune is a Unicode code point. If we try to convert each int8 to a rune in a one-to-one fashion, it will fail (meaning wrong output) when the input contains characters that are encoded as multiple bytes in UTF-8, because in that case multiple int8 values should end up in one rune.

Let's start from the string "世界" whose bytes are:

fmt.Println([]byte("世界"))
// Output: [228 184 150 231 149 140]

And its runes:

fmt.Println([]rune("世界"))
// [19990 30028]

That is only 2 runes but 6 bytes, so a 1-to-1 int8->rune mapping clearly won't work; we have to go with a 1-to-1 int8->byte mapping.
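To make the difference concrete, here is a small sketch that runs both mappings on the six bytes of "世界" reinterpreted as int8 values. The function names viaRunes and viaBytes are assumptions for this illustration, not part of any library.

```go
package main

import "fmt"

// viaRunes maps each int8 to one rune. Negative values are not valid
// code points, so the string conversion replaces them with U+FFFD.
func viaRunes(bs []int8) string {
	r := make([]rune, len(bs))
	for i, v := range bs {
		r[i] = rune(v)
	}
	return string(r)
}

// viaBytes maps each int8 to one byte, keeping the bit pattern.
func viaBytes(bs []int8) string {
	b := make([]byte, len(bs))
	for i, v := range bs {
		b[i] = byte(v)
	}
	return string(b)
}

func main() {
	// The UTF-8 bytes of "世界" (228 184 150 231 149 140) as signed values.
	in := []int8{-28, -72, -106, -25, -107, -116}

	fmt.Println(viaBytes(in) == "世界") // true
	fmt.Println(viaRunes(in) == "世界") // false
}
```

For pure ASCII input both functions give the same string, which is why the rune approach can appear to work in simple tests.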

byte is an alias for uint8 with the range 0..255. To convert a []byte to []int8 (range -128..127), we have to use bytevalue - 256 for every byte value greater than 127, so the string "世界" looks like this as an []int8:

[-28 -72 -106 -25 -107 -116]

The backward conversion we want is bytevalue = 256 + int8value when the int8 is negative. We can't compute this in int8 (range -128..127), and not in byte (range 0..255) either, so we first have to convert the value to int (and back to byte at the end). This could look something like this:

if v < 0 {
	b[i] = byte(256 + int(v))
} else {
	b[i] = byte(v)
}

But since signed integers are represented in two's complement, a plain byte(v) conversion gives the same result (for negative numbers it is equivalent to 256 + v).
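This equivalence can be checked exhaustively, since int8 has only 256 values. The following is a minimal sketch; the helper names explicit and allEqual are made up for the example.

```go
package main

import "fmt"

// explicit converts using the 256 + v formula for negative values.
func explicit(v int8) byte {
	if v < 0 {
		return byte(256 + int(v))
	}
	return byte(v)
}

// allEqual reports whether explicit(v) == byte(v) for every int8 value.
func allEqual() bool {
	for i := -128; i <= 127; i++ {
		v := int8(i)
		if explicit(v) != byte(v) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allEqual()) // true

	// The conversion must be applied to a variable: converting the
	// constant -28 to byte would be rejected at compile time.
	v := int8(-28)
	fmt.Println(byte(v)) // 228
}
```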

Note: Since we know the length of the slice, it is faster to allocate a slice of that length up front and set its elements by index than to grow it by calling the built-in append function repeatedly.

So here is the final conversion:

func B2S(bs []int8) string {
	b := make([]byte, len(bs))
	for i, v := range bs {
		b[i] = byte(v)
	}
	return string(b)
}

Try it on the Go Playground.
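The speed claim in the note about preallocation can be measured rather than assumed. Below is a rough sketch that uses testing.Benchmark from an ordinary program; the function names b2sAppend and b2sIndex, the input size of 1024, and the sink variable are choices made for this example. No timings are quoted here because they depend on the machine and the Go version.

```go
package main

import (
	"fmt"
	"testing"
)

// b2sAppend is the version from the question: it grows the slice with append.
func b2sAppend(bs []int8) string {
	ba := []byte{}
	for _, b := range bs {
		ba = append(ba, byte(b))
	}
	return string(ba)
}

// b2sIndex is the version from this answer: it allocates once and indexes.
func b2sIndex(bs []int8) string {
	b := make([]byte, len(bs))
	for i, v := range bs {
		b[i] = byte(v)
	}
	return string(b)
}

// sink keeps the results alive so the calls are not optimized away.
var sink string

func main() {
	// Build an arbitrary ASCII input of 1024 elements.
	in := make([]int8, 1024)
	for i := range in {
		in[i] = int8('a' + i%26)
	}

	ra := testing.Benchmark(func(b *testing.B) {
		for n := 0; n < b.N; n++ {
			sink = b2sAppend(in)
		}
	})
	ri := testing.Benchmark(func(b *testing.B) {
		for n := 0; n < b.N; n++ {
			sink = b2sIndex(in)
		}
	})

	fmt.Println("append:", ra)
	fmt.Println("index: ", ri)
}
```

In a real project the same two loops would more commonly live in a _test.go file and be run with go test -bench.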

Answer 2

Score: 3

Not entirely sure it is the fastest, but I haven't found anything better.
Change ba := []byte{} to ba := make([]byte, 0, len(bs)), so that in the end you have:

func B2S(bs []int8) string {
    ba := make([]byte, 0, len(bs))
    for _, b := range bs {
        ba = append(ba, byte(b))
    }
    return string(ba)
}

This way the append function never has to insert more data than fits in the slice's underlying array, and you avoid unnecessary copying to a bigger array.
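The effect of the preallocated capacity can be observed by counting how often the capacity changes while appending. This is only an illustrative sketch; the function name reallocations is made up, and the exact growth pattern without preallocation depends on the Go version.

```go
package main

import "fmt"

// reallocations appends n bytes to a slice and counts how many times
// the capacity changed, i.e. how often append moved to a new array.
func reallocations(n int, prealloc bool) int {
	var ba []byte
	if prealloc {
		ba = make([]byte, 0, n) // capacity reserved up front
	} else {
		ba = []byte{} // starts with zero capacity
	}
	count := 0
	for i := 0; i < n; i++ {
		before := cap(ba)
		ba = append(ba, 'x')
		if cap(ba) != before {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println("with make(..., 0, n):", reallocations(1000, true)) // 0
	fmt.Println("with []byte{}:       ", reallocations(1000, false))
}
```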

Answer 3

Score: 1

What is certain from "Convert between slices of different types" is that you have to build the right slice from your original []int8.

I ended up using rune (an alias for int32) (playground), assuming that the values were all simple ASCII characters. That is obviously an oversimplification, and icza's answer has more on that.
Besides, the SliceScan() method ended up returning []uint8 anyway.

package main

import (
	"fmt"
)

func main() {
	s := []int8{'a', 'b', 'c'}
	b := make([]rune, len(s))
	for i, v := range s {
		b[i] = rune(v)
	}
	fmt.Println(string(b))
}

But I didn't benchmark it against using a []byte.

Answer 4

Score: 0

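Use the unsafe package.

// Requires the "strings" and "unsafe" packages.
func B2S(bs []int8) string {
  // Reinterpret the []int8 header as []byte, then drop trailing NUL bytes.
  return strings.TrimRight(string(*(*[]byte)(unsafe.Pointer(&bs))), "\x00")
}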
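On newer toolchains the same idea can be written with the helpers added to the unsafe package. The sketch below assumes Go 1.20 or later (for unsafe.String and unsafe.SliceData); the name B2SUnsafe is made up, and unlike the snippet above it does not trim trailing NUL bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// B2SUnsafe returns a string that shares memory with bs instead of copying.
// Assumption: Go 1.20+, where unsafe.String and unsafe.SliceData exist.
// The caller must not modify bs afterwards, because strings are expected
// to be immutable.
func B2SUnsafe(bs []int8) string {
	if len(bs) == 0 {
		return ""
	}
	// int8 and byte have the same size, so the backing array can be
	// viewed as bytes without changing any bits.
	p := (*byte)(unsafe.Pointer(unsafe.SliceData(bs)))
	return unsafe.String(p, len(bs))
}

func main() {
	in := []int8{-28, -72, -106, -25, -107, -116}
	fmt.Println(B2SUnsafe(in)) // 世界
}
```

Avoiding the copy only pays off for large inputs, and it trades away the safety guarantees of the plain loop, so the copying version is usually the better default.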


huangapple
  • Posted on March 4, 2015 at 14:47:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28848187.html