Golang string format using slice values

Question

Here I am trying to create a query string for my API from a slice containing strings.

i.e. where={"node_name":"node1","node_name":"node2"}

package main

import (
	"fmt"
	"strings"
)

func main() {
	nodes := []string{"node1", "node2"}
	var query string
	for _, n := range nodes {
		query += fmt.Sprintf("\"node_name\":\"%s\",", n)
	}
	query = strings.TrimRight(query, ",")
	final := fmt.Sprintf("where={%s}", query)
	fmt.Println(final)
}

Here is the Go Playground link.

What is the best way to get the result?

Answer 1

Score: 16

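Your solution performs far too many allocations: every += on a string allocates a new string and copies the previous contents into it, and each fmt.Sprintf() call allocates as well.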

We'll create some alternative, faster and/or more elegant solutions. Note that the solutions below do not check whether node values contain the quotation mark (") character. If they did, it would have to be escaped somehow (otherwise the result would be an invalid query string).
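
One possible way to handle such values is to let encoding/json quote each one, which escapes embedded quotation marks and backslashes. This is only a sketch and not one of the benchmarked solutions: the name buildEscaped is made up for illustration, and it assumes the API expects JSON-style string escaping. If the query is sent as part of a URL, it would need URL encoding on top of this.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// buildEscaped (hypothetical name) builds the same query as the other
// solutions, but lets encoding/json produce the quoted value so that
// characters such as " and \ are escaped.
func buildEscaped(nodes []string) (string, error) {
	var sb strings.Builder
	sb.WriteString("where={")
	for i, n := range nodes {
		if i > 0 {
			sb.WriteByte(',')
		}
		// json.Marshal returns the value including its surrounding quotes.
		quoted, err := json.Marshal(n)
		if err != nil {
			return "", err
		}
		sb.WriteString(`"node_name":`)
		sb.Write(quoted)
	}
	sb.WriteByte('}')
	return sb.String(), nil
}

func main() {
	q, err := buildEscaped([]string{"node1", `no"de2`})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(q) // where={"node_name":"node1","node_name":"no\"de2"}
}
```

Note that json.Marshal also escapes <, > and & by default, which is harmless for plain node names but changes the output for values containing those characters.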

The complete, runnable code can be found on the Go Playground. The complete testing / benchmarking code can also be found on the Go Playground, but it is not runnable there. Save both files to your Go workspace (e.g. $GOPATH/src/query/query.go and $GOPATH/src/query/query_test.go) and run the benchmarks with the command go test -bench . (the trailing dot is part of the command).

Also be sure to check out this related question: https://stackoverflow.com/questions/1760757/how-to-efficiently-concatenate-strings-in-go

Alternatives

Genesis

Your logic can be captured by the following function:

func buildOriginal(nodes []string) string {
	var query string
	for _, n := range nodes {
		query += fmt.Sprintf("\"node_name\":\"%s\",", n)
	}
	query = strings.TrimRight(query, ",")
	return fmt.Sprintf("where={%s}", query)
}

Using bytes.Buffer

Much better would be to use a single buffer, e.g. bytes.Buffer, build the query in that, and convert it to string at the end:

func buildBuffer(nodes []string) string {
	buf := &bytes.Buffer{}
	buf.WriteString("where={")
	for i, v := range nodes {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteString(`"node_name":"`)
		buf.WriteString(v)
		buf.WriteByte('"')
	}
	buf.WriteByte('}')
	return buf.String()
}

Using it:

nodes := []string{"node1", "node2"}
fmt.Println(buildBuffer(nodes))

Output:

where={"node_name":"node1","node_name":"node2"}

bytes.Buffer improved

bytes.Buffer will still do some reallocations, although far fewer than your original solution.

However, we can reduce the allocations to 1 if we pass a big-enough byte slice when creating the bytes.Buffer using bytes.NewBuffer(). We can calculate the required size in advance:

func buildBuffer2(nodes []string) string {
	size := 8 + len(nodes)*15
	for _, v := range nodes {
		size += len(v)
	}
	buf := bytes.NewBuffer(make([]byte, 0, size))
	buf.WriteString("where={")
	for i, v := range nodes {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteString(`"node_name":"`)
		buf.WriteString(v)
		buf.WriteByte('"')
	}
	buf.WriteByte('}')
	return buf.String()
}

Note that in the size calculation, 8 is the length of the string where={} and 15 is the length of the string "node_name":"", (the fixed text of one entry, including the separating comma).
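
The size arithmetic can be checked with a small program. This is a sketch rather than part of the original benchmarks: it swaps bytes.NewBuffer for strings.Builder with Grow (available since Go 1.10), and the names estimateSize and buildBuilder are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// estimateSize mirrors the calculation in buildBuffer2:
// 8 bytes for `where={}`, 15 bytes of fixed text per node
// (`"node_name":"",`), plus the node names themselves.
func estimateSize(nodes []string) int {
	size := 8 + len(nodes)*15
	for _, v := range nodes {
		size += len(v)
	}
	return size
}

// buildBuilder (hypothetical name) runs the same loop as buildBuffer2,
// written against strings.Builder with the capacity reserved up front.
func buildBuilder(nodes []string) string {
	var sb strings.Builder
	sb.Grow(estimateSize(nodes))
	sb.WriteString("where={")
	for i, v := range nodes {
		if i > 0 {
			sb.WriteByte(',')
		}
		sb.WriteString(`"node_name":"`)
		sb.WriteString(v)
		sb.WriteByte('"')
	}
	sb.WriteByte('}')
	return sb.String()
}

func main() {
	nodes := []string{"node1", "node2"}
	s := buildBuilder(nodes)
	fmt.Println(s) // where={"node_name":"node1","node_name":"node2"}
	// The estimate counts a comma for every node, so for a non-empty
	// slice it is one byte larger than the final string.
	fmt.Println(estimateSize(nodes), len(s)) // 48 47
}
```

The one spare byte is harmless: the point of the estimate is only that the buffer never has to grow while the query is being written.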

Using text/template

We can also create a text template and execute it with the text/template package to generate the result:

var t = template.Must(template.New("").Parse(templ))

func buildTemplate(nodes []string) string {
	size := 8 + len(nodes)*15
	for _, v := range nodes {
		size += len(v)
	}
	buf := bytes.NewBuffer(make([]byte, 0, size))
	if err := t.Execute(buf, nodes); err != nil {
		log.Fatal(err) // Handle error
	}
	return buf.String()
}

const templ = `where={
{{- range $idx, $n := . -}}
    {{if ne $idx 0}},{{end}}"node_name":"{{$n}}"
{{- end -}}
}`

Using strings.Join()

This solution is interesting due to its simplicity. We can use strings.Join() to join the nodes with the static text ","node_name":" in between, and then apply the proper prefix and postfix.

An important thing to note: strings.Join() uses the builtin copy() function with a single preallocated []byte buffer, so it's very fast! "As a special case, it (the copy() function) also will copy bytes from a string to a slice of bytes."

func buildJoin(nodes []string) string {
	if len(nodes) == 0 {
		return "where={}"
	}
	return `where={"node_name":"` + strings.Join(nodes, `","node_name":"`) + `"}`
}
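
To illustrate why a single preallocated buffer combined with copy() is cheap, here is a simplified sketch of a Join-like function. It is not the standard library's source code (the real implementation may differ between Go versions), and joinSketch is a hypothetical name used only here.

```go
package main

import "fmt"

// joinSketch concatenates elems with sep between them. It computes the
// exact final length first, allocates the buffer once, and fills it
// with copy().
func joinSketch(elems []string, sep string) string {
	if len(elems) == 0 {
		return ""
	}
	n := len(sep) * (len(elems) - 1)
	for _, e := range elems {
		n += len(e)
	}
	buf := make([]byte, n)
	// copy() accepts a string as source when the destination is a []byte.
	pos := copy(buf, elems[0])
	for _, e := range elems[1:] {
		pos += copy(buf[pos:], sep)
		pos += copy(buf[pos:], e)
	}
	// Converting to string copies the bytes once more; this sketch
	// accepts that extra copy for simplicity.
	return string(buf)
}

func main() {
	nodes := []string{"node1", "node2"}
	q := `where={"node_name":"` + joinSketch(nodes, `","node_name":"`) + `"}`
	fmt.Println(q) // where={"node_name":"node1","node_name":"node2"}
}
```

Because the final length is known before anything is written, the buffer never has to grow, which is the same idea buildBuffer2() applies by hand.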

Benchmark results

We'll benchmark with the following nodes value:

var nodes = []string{"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
}

And the benchmarking code looks like this:

func BenchmarkOriginal(b *testing.B) {
	for i := 0; i < b.N; i++ {
		buildOriginal(nodes)
	}
}

func BenchmarkBuffer(b *testing.B) {
	for i := 0; i < b.N; i++ {
		buildBuffer(nodes)
	}
}

// ... All the other benchmarking functions look the same

And now the results:

BenchmarkOriginal-4               200000             10572 ns/op
BenchmarkBuffer-4                 500000              2914 ns/op
BenchmarkBuffer2-4               1000000              2024 ns/op
BenchmarkBufferTemplate-4          30000             77634 ns/op
BenchmarkJoin-4                  2000000               830 ns/op

Some unsurprising facts: buildBuffer() is 3.6 times faster than buildOriginal(), and buildBuffer2() (with pre-calculated size) takes about 30% less time than buildBuffer() because it does not need to reallocate (and copy over) the internal buffer.

Some surprising facts: buildJoin() is extremely fast and even beats buildBuffer2() by a factor of 2.4 (because it only uses a []byte and copy()). buildTemplate(), on the other hand, proved quite slow: 7 times slower than buildOriginal(). The main reason is that it uses (has to use) reflection under the hood.
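
The difference in allocations can also be measured directly. The following is a sketch, not part of the benchmark code above: it uses testing.AllocsPerRun from the standard library, a shortened node list, and the made-up helper names countAllocs, benchNodes and sink. Exact counts depend on the Go version, so no expected output is shown.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// benchNodes is a shortened version of the benchmark input.
var benchNodes = []string{"n1", "node2", "nodethree", "fourthNode"}

// sink keeps the results alive so the calls cannot be optimized away.
var sink string

func buildOriginal(nodes []string) string {
	var query string
	for _, n := range nodes {
		query += fmt.Sprintf("\"node_name\":\"%s\",", n)
	}
	query = strings.TrimRight(query, ",")
	return fmt.Sprintf("where={%s}", query)
}

func buildJoin(nodes []string) string {
	if len(nodes) == 0 {
		return "where={}"
	}
	return `where={"node_name":"` + strings.Join(nodes, `","node_name":"`) + `"}`
}

// countAllocs reports the average number of heap allocations per call.
func countAllocs(build func([]string) string) float64 {
	return testing.AllocsPerRun(100, func() {
		sink = build(benchNodes)
	})
}

func main() {
	fmt.Println("buildOriginal allocs/op:", countAllocs(buildOriginal))
	fmt.Println("buildJoin allocs/op:    ", countAllocs(buildJoin))
}
```

Inside a benchmark function, calling b.ReportAllocs() (or running go test -bench . -benchmem) adds the same kind of allocation figures to the regular benchmark output.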

huangapple
  • Posted on January 4, 2017, 14:12:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41457273.html