January 4, 2017 14:12:03 · go

Golang string format using slice values

Question

Here I am trying to create a query string for my API from a slice containing strings.

i.e. where={"node_name":"node1","node_name":"node2"}

package main

import (
	"fmt"
	"strings"
)

func main() {
	nodes := []string{"node1", "node2"}
	var query string
	for _, n := range nodes {
		query += fmt.Sprintf("\"node_name\":\"%s\",", n)
	}
	query = strings.TrimRight(query, ",")
	final := fmt.Sprintf("where={%s}", query)
	fmt.Println(final)
}

Here is the Go Playground link.

What is the best way to get the result?

Answer 1

Score: 16

Your solution uses way too many allocations due to string concatenations.

We'll create some alternative, faster and/or more elegant solutions. Note that the solutions below do not check whether node values contain the quotation mark (") character. If they do, those characters have to be escaped somehow (otherwise the result would be an invalid query string).
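As an illustration of the escaping caveat above, here is a minimal sketch that assumes the API accepts JSON-style string escaping. The function name buildEscaped is hypothetical and not part of the original answer; it relies on json.Marshal, which returns a quoted and escaped JSON string literal when given a string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// buildEscaped is a hypothetical helper (not from the original answer).
// It JSON-escapes every node value so that an embedded quotation mark
// cannot terminate the value early.
func buildEscaped(nodes []string) (string, error) {
	parts := make([]string, len(nodes))
	for i, n := range nodes {
		// For a string, json.Marshal yields the value wrapped in
		// double quotes with special characters escaped.
		quoted, err := json.Marshal(n)
		if err != nil {
			return "", err
		}
		parts[i] = `"node_name":` + string(quoted)
	}
	return "where={" + strings.Join(parts, ",") + "}", nil
}

func main() {
	q, err := buildEscaped([]string{"node1", `no"de2`})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(q) // where={"node_name":"node1","node_name":"no\"de2"}
}
```

Whether the result additionally needs URL encoding (for example with url.QueryEscape) depends on how the API parses the query, which the question does not specify.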

The complete, runnable code can be found on the Go Playground. The complete testing/benchmarking code is also on the Go Playground, but it cannot be run there; save both files to your Go workspace (e.g. $GOPATH/src/query/query.go and $GOPATH/src/query/query_test.go) and run the benchmarks with go test -bench .

Also be sure to check out this related question: https://stackoverflow.com/questions/1760757/how-to-efficiently-concatenate-strings-in-go

Alternatives

Genesis

Your logic can be captured by the following function:

func buildOriginal(nodes []string) string {
	var query string
	for _, n := range nodes {
		query += fmt.Sprintf("\"node_name\":\"%s\",", n)
	}
	query = strings.TrimRight(query, ",")
	return fmt.Sprintf("where={%s}", query)
}

Using `bytes.Buffer`

Much better would be to use a single buffer, e.g. bytes.Buffer, build the query in that, and convert it to string at the end:

func buildBuffer(nodes []string) string {
	buf := &bytes.Buffer{}
	buf.WriteString("where={")
	for i, v := range nodes {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteString(`"node_name":"`)
		buf.WriteString(v)
		buf.WriteByte('"')
	}
	buf.WriteByte('}')
	return buf.String()
}

Using it:

nodes := []string{"node1", "node2"}
fmt.Println(buildBuffer(nodes))

Output:

where={"node_name":"node1","node_name":"node2"}

`bytes.Buffer` improved

bytes.Buffer will still do some reallocations, although far fewer than your original solution.

However, we can reduce the buffer allocations to 1 if we pass a big-enough byte slice when creating the bytes.Buffer with bytes.NewBuffer(). We can calculate the required size beforehand:

func buildBuffer2(nodes []string) string {
	size := 8 + len(nodes)*15
	for _, v := range nodes {
		size += len(v)
	}
	buf := bytes.NewBuffer(make([]byte, 0, size))
	buf.WriteString("where={")
	for i, v := range nodes {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteString(`"node_name":"`)
		buf.WriteString(v)
		buf.WriteByte('"')
	}
	buf.WriteByte('}')
	return buf.String()
}

Note that in the size calculation, 8 is the length of the string where={} and 15 is the length of the string "node_name":"", (including the trailing comma).
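The constants in that size calculation can be checked with a small sketch. The helper name requiredSize is hypothetical; it only repeats the arithmetic from buildBuffer2 using len() instead of literal numbers. Because the last entry has no trailing comma, the estimate comes out one byte larger than the actual result for a non-empty slice, which is harmless for a capacity hint.

```go
package main

import "fmt"

// requiredSize is a hypothetical helper mirroring the capacity estimate
// in buildBuffer2, with the magic numbers spelled out via len().
func requiredSize(nodes []string) int {
	size := len("where={}") + len(nodes)*len(`"node_name":"",`)
	for _, v := range nodes {
		size += len(v)
	}
	return size
}

func main() {
	nodes := []string{"node1", "node2"}
	out := `where={"node_name":"node1","node_name":"node2"}`

	fmt.Println(len("where={}"), len(`"node_name":"",`)) // 8 15
	fmt.Println(requiredSize(nodes), len(out))           // 48 47
}
```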

Using `text/template`

We can also create a text template and use the text/template package to execute it and generate the result (concise, although the benchmark results below show it is not fast):

var t = template.Must(template.New("").Parse(templ))

func buildTemplate(nodes []string) string {
	size := 8 + len(nodes)*15
	for _, v := range nodes {
		size += len(v)
	}
	buf := bytes.NewBuffer(make([]byte, 0, size))
	if err := t.Execute(buf, nodes); err != nil {
		log.Fatal(err) // Handle error
	}
	return buf.String()
}

const templ = `where={
{{- range $idx, $n := . -}}
    {{if ne $idx 0}},{{end}}"node_name":"{{$n}}"
{{- end -}}
}`

Using `strings.Join()`

This solution is interesting due to its simplicity. We can use strings.Join() to join the nodes with the static text ","node_name":" in between, proper prefix and postfix applied.

An important thing to note: strings.Join() uses the builtin copy() function with a single preallocated []byte buffer, so it's very fast! "As a special case, it (the copy() function) also will copy bytes from a string to a slice of bytes."

func buildJoin(nodes []string) string {
	if len(nodes) == 0 {
		return "where={}"
	}
	return `where={"node_name":"` + strings.Join(nodes, `","node_name":"`) + `"}`
}
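To make the copy() remark concrete, here is a simplified sketch of the idea: compute the final length, allocate one []byte, and copy the pieces in. The function joinCopy is a hypothetical illustration, not the standard library source, and the real strings.Join implementation may differ between Go versions.

```go
package main

import "fmt"

// joinCopy is a hypothetical, simplified illustration of the technique
// described above: one preallocated []byte filled with the builtin copy().
func joinCopy(elems []string, sep string) string {
	if len(elems) == 0 {
		return ""
	}
	// Total length: all elements plus one separator between each pair.
	n := len(sep) * (len(elems) - 1)
	for _, e := range elems {
		n += len(e)
	}
	b := make([]byte, n)
	// copy() accepts a string source when the destination is a []byte.
	pos := copy(b, elems[0])
	for _, e := range elems[1:] {
		pos += copy(b[pos:], sep)
		pos += copy(b[pos:], e)
	}
	return string(b)
}

func main() {
	fmt.Println(joinCopy([]string{"node1", "node2"}, `","node_name":"`))
	// node1","node_name":"node2
}
```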

Benchmark results

We'll benchmark with the following nodes value:

var nodes = []string{"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
	"n1", "node2", "nodethree", "fourthNode",
}

And the benchmarking code looks like this:

func BenchmarkOriginal(b *testing.B) {
	for i := 0; i < b.N; i++ {
		buildOriginal(nodes)
	}
}

func BenchmarkBuffer(b *testing.B) {
	for i := 0; i < b.N; i++ {
		buildBuffer(nodes)
	}
}

// ... All the other benchmarking functions look the same

And now the results:

BenchmarkOriginal-4               200000             10572 ns/op
BenchmarkBuffer-4                 500000              2914 ns/op
BenchmarkBuffer2-4               1000000              2024 ns/op
BenchmarkBufferTemplate-4          30000             77634 ns/op
BenchmarkJoin-4                  2000000               830 ns/op

Some unsurprising facts: buildBuffer() is 3.6 times faster than buildOriginal(), and buildBuffer2() (with pre-calculated size) is about 30% faster than buildBuffer() because it does not need to reallocate (and copy over) the internal buffer.

Some surprising facts: buildJoin() is extremely fast, beating even buildBuffer2() by a factor of 2.4 (because it only uses a []byte and copy()). buildTemplate(), on the other hand, proved quite slow: 7 times slower than buildOriginal(). The main reason is that it uses (has to use) reflection under the hood.
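The timings above do not show allocation counts directly. As a rough complement, the sketch below uses testing.AllocsPerRun to compare the original approach with the strings.Join() one. The package-level variable sink is an assumption added here to keep the compiler from discarding the results; exact counts depend on the Go version, so none are quoted.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps results alive so the calls are not optimized away
// (an addition for this sketch, not part of the original answer).
var sink string

func buildOriginal(nodes []string) string {
	var query string
	for _, n := range nodes {
		query += fmt.Sprintf("\"node_name\":\"%s\",", n)
	}
	query = strings.TrimRight(query, ",")
	return fmt.Sprintf("where={%s}", query)
}

func buildJoin(nodes []string) string {
	if len(nodes) == 0 {
		return "where={}"
	}
	return `where={"node_name":"` + strings.Join(nodes, `","node_name":"`) + `"}`
}

func main() {
	nodes := []string{"n1", "node2", "nodethree", "fourthNode"}

	// AllocsPerRun reports the average number of allocations per call.
	orig := testing.AllocsPerRun(100, func() { sink = buildOriginal(nodes) })
	join := testing.AllocsPerRun(100, func() { sink = buildJoin(nodes) })

	fmt.Printf("buildOriginal: %.0f allocs/op\n", orig)
	fmt.Printf("buildJoin:     %.0f allocs/op\n", join)
}
```

Running the benchmarks with go test -bench . -benchmem would report the same kind of information alongside the timings.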
