Why is passing pointers to a channel slower?

Question

I'm new to Go and am trying to rewrite my Java server project in Go.

I found that passing pointers into a channel causes an almost 30% performance drop compared to passing values.

Here is a sample snippet:

package main

import (
    "fmt"
    "time"
)

var c = make(chan t, 1024)

// var c = make(chan *t, 1024)

type t struct {
    a uint
    b uint
}

func main() {
    start := time.Now()
    for i := 0; i < 1000; i++ {
        b := t{a: 3, b: 5}
        // c <- &b
        c <- b
    }
    elapsed := time.Since(start)
    // Print the elapsed time as an integer number of nanoseconds.
    fmt.Println(elapsed.Nanoseconds())
}

Update: fixed the missing package statement.

Answer 1

Score: 15

As a value it can be stack allocated:

go run -gcflags '-m' tmp.go
# command-line-arguments
./tmp.go:18: inlining call to time.Time.Nanosecond
./tmp.go:24: inlining call to time.Time.Nanosecond
./tmp.go:25: t2 escapes to heap
./tmp.go:25: main ... argument does not escape
63613

As a pointer, it escapes to the heap:

go run -gcflags '-m' tmp.go
# command-line-arguments
./tmp.go:18: inlining call to time.Time.Nanosecond
./tmp.go:24: inlining call to time.Time.Nanosecond
./tmp.go:21: &b escapes to heap <-- Additional GC pressure
./tmp.go:20: moved to heap: b   <--
./tmp.go:25: t2 escapes to heap
./tmp.go:25: main ... argument does not escape
122513

Escaping to the heap introduces some overhead / GC pressure.

Looking at the assembly, the pointer version also introduces additional instructions, including:

go run -gcflags '-S' tmp.go
0x0055 00085 (...tmp.go:18)	CALL	runtime.newobject(SB)

The non-pointer variant doesn't incur this overhead before calling runtime.chansend1.
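
The extra allocation can also be observed at run time. The following is a minimal sketch, assuming a channel of capacity 1 that is drained right after each send; the helper names valueAllocs and pointerAllocs are made up for this illustration. It uses testing.AllocsPerRun from the standard library to count heap allocations per send:

```go
package main

import (
	"fmt"
	"testing"
)

type t struct {
	a uint
	b uint
}

// valueAllocs reports the average number of heap allocations
// for one send and receive of a struct value (hypothetical helper).
func valueAllocs() float64 {
	ch := make(chan t, 1)
	return testing.AllocsPerRun(1000, func() {
		b := t{a: 3, b: 5}
		ch <- b // the struct is copied into the channel buffer
		<-ch
	})
}

// pointerAllocs reports the average number of heap allocations
// for one send and receive of a pointer to a struct (hypothetical helper).
func pointerAllocs() float64 {
	ch := make(chan *t, 1)
	return testing.AllocsPerRun(1000, func() {
		b := t{a: 3, b: 5}
		ch <- &b // &b escapes, so b has to live on the heap
		<-ch
	})
}

func main() {
	fmt.Println("value send allocs/op:  ", valueAllocs())
	fmt.Println("pointer send allocs/op:", pointerAllocs())
}
```

If the escape analysis output above applies to your compiler version as well, the value variant should report no allocations per send, while the pointer variant should report at least one.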

Answer 2

Score: 1

To supplement Martin Gallagher's good analysis: the way you are measuring is questionable. The performance of such tiny programs varies a lot, so the measurement should be repeated many times. There are also some mistakes in your example.

First: it doesn't compile because the package statement is missing.

Second: there is an important difference between Nanoseconds and Nanosecond.

I tried to evaluate your observation this way*:

package main

import (
    "fmt"
    "time"
)

const (
    chan_size   = 1000
    cycle_count = 1000
)

var (
    v_ch = make(chan t, chan_size)
    p_ch = make(chan *t, chan_size)
)

type t struct {
    a uint
    b uint
}

// fill_v fills the value channel with chan_size structs.
func fill_v() {
    for i := 0; i < chan_size; i++ {
        b := t{a: 3, b: 5}
        v_ch <- b
    }
}

// fill_p fills the pointer channel with chan_size pointers.
func fill_p() {
    for i := 0; i < chan_size; i++ {
        b := t{a: 3, b: 5}
        p_ch <- &b
    }
}

// measure_f returns the time f takes to run, in nanoseconds.
func measure_f(f func()) int64 {
    start := time.Now()
    f()
    elapsed := time.Since(start)
    return elapsed.Nanoseconds()
}

func main() {
    var v_nanos int64 = 0
    var p_nanos int64 = 0
    for i := 0; i < cycle_count; i++ {
        v_nanos += measure_f(fill_v)
        // Drain the channel so the next cycle can fill it again.
        for j := 0; j < chan_size; j++ {
            <-v_ch
        }
    }
    for i := 0; i < cycle_count; i++ {
        p_nanos += measure_f(fill_p)
        // Drain the channel so the next cycle can fill it again.
        for j := 0; j < chan_size; j++ {
            <-p_ch
        }
    }
    fmt.Println(
        "v:", v_nanos/cycle_count,
        " p:", p_nanos/cycle_count,
        "ratio (v/p):", float64(v_nanos)/float64(p_nanos))
}

There is indeed a measurable performance drop (I define drop as drop = 1 - (candidate/optimum)). Although I repeated the measurement 1000 times, the drop varies between 25% and 50%. I'm not even sure how and when the heap is recycled, so it may be hard to quantify at all.

* See a "running" demo on ideone.

Note that the stdout there is frozen: v: 34875 p: 59420 ratio (v/p): 0.586923845267128

For some reason, it was not possible to run this code in the Go Playground.
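
One way to get more stable numbers is Go's built-in benchmark machinery, which chooses the iteration count itself and can report allocations. The following is only a sketch, assuming a channel of capacity 1 that is drained after each send (so it measures a send/receive pair rather than filling a large buffer); benchValue and benchPointer are names made up for this illustration:

```go
package main

import (
	"fmt"
	"testing"
)

type t struct {
	a uint
	b uint
}

// benchValue sends b.N struct values through a buffered channel.
func benchValue(b *testing.B) {
	ch := make(chan t, 1)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		ch <- t{a: 3, b: 5}
		<-ch
	}
}

// benchPointer sends b.N freshly allocated pointers through a buffered channel.
func benchPointer(b *testing.B) {
	ch := make(chan *t, 1)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		ch <- &t{a: 3, b: 5}
		<-ch
	}
}

func main() {
	// testing.Benchmark can be called from an ordinary program;
	// it runs each function until the timing is stable enough.
	v := testing.Benchmark(benchValue)
	p := testing.Benchmark(benchPointer)
	fmt.Println("value:  ", v.String(), v.MemString())
	fmt.Println("pointer:", p.String(), p.MemString())
}
```

The same two functions can be renamed to BenchmarkValue and BenchmarkPointer, placed in a _test.go file, and run with go test -bench=. -benchmem. The absolute numbers will differ between machines and Go versions.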

huangapple
  • Posted on December 16, 2016 at 14:46:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41178729.html