Why passing pointers to channel is slower
Question
I'm a newbie to golang, trying to rewrite my Java server project in golang.

I found that passing pointers into a channel causes an almost 30% performance drop compared to passing values.

Here is a sample snippet:

    package main

    import (
        "time"
        "fmt"
    )

    var c = make(chan t, 1024)
    // var c = make(chan *t, 1024)

    type t struct {
        a uint
        b uint
    }

    func main() {
        start := time.Now()
        for i := 0; i < 1000; i++ {
            b := t{a: 3, b: 5}
            // c <- &b
            c <- b
        }
        elapsed := time.Since(start)
        fmt.Println(elapsed)
    }

#update: fixed the missing package statement
Answer 1
Score: 15
As a value it can be stack allocated:

    go run -gcflags '-m' tmp.go
    # command-line-arguments
    ./tmp.go:18: inlining call to time.Time.Nanosecond
    ./tmp.go:24: inlining call to time.Time.Nanosecond
    ./tmp.go:25: t2 escapes to heap
    ./tmp.go:25: main ... argument does not escape
    63613

As a pointer, it escapes to the heap:

    go run -gcflags '-m' tmp.go
    # command-line-arguments
    ./tmp.go:18: inlining call to time.Time.Nanosecond
    ./tmp.go:24: inlining call to time.Time.Nanosecond
    ./tmp.go:21: &b escapes to heap <-- Additional GC pressure
    ./tmp.go:20: moved to heap: b <--
    ./tmp.go:25: t2 escapes to heap
    ./tmp.go:25: main ... argument does not escape
    122513

Escaping to the heap introduces some overhead / GC pressure.

Looking at the assembly, the pointer version also introduces additional instructions, including:

    go run -gcflags '-S' tmp.go
    0x0055 00085 (...tmp.go:18) CALL runtime.newobject(SB)

The non-pointer variant doesn't incur this overhead before calling runtime.chansend1.
Answer 2
Score: 1
As a supplement to Martin Gallagher's good analysis, it must be added that the way you are measuring is suspect. The performance of such tiny programs varies a lot, so measurements should be repeated. There are also some mistakes in your example.

First: it doesn't compile, because the package statement is missing.

Second: there is an important difference between Nanoseconds and Nanosecond.

I tried to evaluate your observation this way<sup>*</sup>:
    package main

    import (
        "time"
        "fmt"
    )

    const (
        chan_size   = 1000
        cycle_count = 1000
    )

    var (
        v_ch = make(chan t, chan_size)
        p_ch = make(chan *t, chan_size)
    )

    type t struct {
        a uint
        b uint
    }

    func fill_v() {
        for i := 0; i < chan_size; i++ {
            b := t{a: 3, b: 5}
            v_ch <- b
        }
    }

    func fill_p() {
        for i := 0; i < chan_size; i++ {
            b := t{a: 3, b: 5}
            p_ch <- &b
        }
    }

    func measure_f(f func()) int64 {
        start := time.Now()
        f()
        elapsed := time.Since(start)
        return elapsed.Nanoseconds()
    }

    func main() {
        var v_nanos int64 = 0
        var p_nanos int64 = 0
        for i := 0; i < cycle_count; i++ {
            v_nanos += measure_f(fill_v)
            for i := 0; i < chan_size; i++ {
                _ = <-v_ch
            }
        }
        for i := 0; i < cycle_count; i++ {
            p_nanos += measure_f(fill_p)
            for i := 0; i < chan_size; i++ {
                _ = <-p_ch
            }
        }
        fmt.Println(
            "v:", v_nanos/cycle_count,
            " p:", p_nanos/cycle_count,
            "ratio (v/p):", float64(v_nanos)/float64(p_nanos))
    }
There is indeed a measurable performance drop (I define drop as drop = 1 - (candidate/optimum)), but although I repeat the code 1000 times, it varies between 25% and 50%. I'm not even sure how and when the heap is recycled, so it may be hard to quantify at all.
<sup>*</sup>See a "running" demo on ideone... note that stdout is frozen: v: 34875 p: 59420 ratio (v/p): 0.586923845267128

For some reason, it was not possible to run this code in the Go Playground.