Go和C++中的向量性能

huangapple go评论100阅读模式
英文:

Vector performance in Go and C++

问题

请考虑以下两个代码片段,一个是GO语言的,另一个是C++11的。在C++中,std::vector是一个具有摊销O(1)插入操作的动态数组。如何在GO语言中实现相同的性能?问题在于这段GO代码在我的硬件上大约慢了3倍。运行多次。

编译:

  • go build vec.go(go版本go1.2.1 linux/amd64)
  • g++ -O2 -std=gnu++11 -o vec vec.cc(g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2)

GO版本(vec.go):

package main

type X struct {
	x int32
	y float64
}

const N int = 80000000

func main() {
	x := X{123, 2.64}
	s := make([]X, 1)
	for i := 0; i < N; i++ {
		s = append(s, x)
	}
}

C++11版本(vec.cc):

#include <vector>

const int N = 80000000;

struct X {
        int x;
        double y;
};

int main(void)
{
        X x{123, 2.64};
        std::vector<X> s(1);
        for (int i = 0; i < N; ++i) {
                s.push_back(x);
        }
}
英文:

Please consider these two snippets in GO and C++11. In C++ std::vector is a doubling-array which has amortized O(1) insert operation. How to achieve the same performance in GO? Problem is that this GO code is about 3 times slower on my hardware. Run many times.

Compiled:

  • go build vec.go (go version go1.2.1 linux/amd64)
  • g++ -O2 -std=gnu++11 -o vec vec.cc (g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2)

GO version (vec.go):

<!-- language: go -->

package main

type X struct {
	x int32
	y float64
}

const N int = 80000000

func main() {
	x := X{123, 2.64}
	s := make([]X, 1)
	for i := 0; i &lt; N; i++ {
		s = append(s, x)
	}
}

C++11 version (vec.cc):

<!-- language: c++ -->

#include &lt;vector&gt;

const int N = 80000000;

struct X {
        int x;
        double y;
};

int main(void)
{
        X x{123, 2.64};
        std::vector&lt;X&gt; s(1);
        for (int i = 0; i &lt; N; ++i) {
                s.push_back(x);
        }
}

答案1

得分: 11

Go的规范对append()函数没有特定的复杂度要求,但实际上它也是以摊还常数时间实现的,就像这个问题的答案中所描述的那样。

当前的实现方式如下:对于数组大小小于1024的情况,它会按需求进行倍增,而对于大于1024的情况,它会增加到原始大小的1.25倍。以1.25倍增加仍然是摊还常数时间,但它会导致摊还常数因子比总是倍增的实现方式更高。然而,1.25倍增加的方式在整体上浪费的内存较少。

如果你在只有少数几次操作时(即使在非常大的N值下),得到了不同的性能表现,那么你可能看到了不同的常数因子。我自己注意到,gc编译器生成的机器代码比gccgo生成的代码更高效。

为了验证Go是否以摊还常数时间运行,请尝试绘制运行你的算法所需时间的图表,使用几个不同的N值。

英文:

Go's specification doesn't require any particular complexity for append(), but in practice it's also implemented in ammortized constant time, as described in the answer to this question.

The current implementation works as follows: for array sizes below 1024, it doubles as needed, and above 1024 it increases to 1.25x the original size. Increasing by 1.25x is still amortized constant time, but it has the effect of imposing a higher amortized constant factor than an implementation that always doubles. However 1.25x wastes less memory overall.

If you're getting different performance behavior by only a few times (even at very large N), then you're seeing different constant factors in play. I've noted myself that the machine code produced by the gc compiler is much more efficient than that generated by gccgo.

To verify for yourself that Go is operating in ammortized constant time, try plotting the time it takes to run your algorithm for several different values of N.

答案2

得分: 1

我已经回答了你关于计算复杂度的问题:append复杂度。它的摊销时间是常数时间。

你的基准测试结果如下:

$ rm vec
$ cat vec.cc
#include <vector>

const int N = 80000000;

struct X {
        int x;
        double y;
};

int main(void)
{
        X x{123, 2.64};
        std::vector<X> s(1);
        for (int i = 0; i < N; ++i) {
                s.push_back(x);
        }
}
$ g++ -O2 -std=gnu++11 -o vec vec.cc
$ time ./vec
real    0m1.360s
user    0m0.536s
sys     0m0.816s
$ rm vec
$ cat vec.go
package main

type X struct {
    x int32
    y float64
}

const N int = 80000000

func main() {
    x := X{123, 2.64}
    s := make([]X, 1)
    for i := 0; i < N; i++ {
        s = append(s, x)
    }
}
$ go version
go version devel +6b696a34e0af Sun Aug 03 15:14:59 2014 -0700 linux/amd64
$ go build vec.go
$ time ./vec
real    0m2.590s
user    0m1.192s
sys     0m1.388s
$
英文:

I've already answered your computational complexity question: append complexity. It is amortized constant time.

My results from your benchmark.

$ rm vec
$ cat vec.cc
#include &lt;vector&gt;

const int N = 80000000;

struct X {
        int x;
        double y;
};

int main(void)
{
        X x{123, 2.64};
        std::vector&lt;X&gt; s(1);
        for (int i = 0; i &lt; N; ++i) {
                s.push_back(x);
        }
}
$ g++ -O2 -std=gnu++11 -o vec vec.cc
$ time ./vec
real	0m1.360s
user	0m0.536s
sys	0m0.816s
$ rm vec
$ cat vec.go
package main

type X struct {
	x int32
	y float64
}

const N int = 80000000

func main() {
	x := X{123, 2.64}
	s := make([]X, 1)
	for i := 0; i &lt; N; i++ {
		s = append(s, x)
	}
}
$ go version
go version devel +6b696a34e0af Sun Aug 03 15:14:59 2014 -0700 linux/amd64
$ go build vec.go
$ time ./vec
real	0m2.590s
user	0m1.192s
sys	0m1.388s
$ 

答案3

得分: 0

如果您事先知道元素的数量,可以使用以下方式进行预分配:

s := make([]X, 0, N)
for i := 0; i < N; i++ {
    s = append(s, x)
}

此外,建议使用Go 1.3版本,编译器进行了一些优化。

为了获得更好的向量化效果,可以尝试使用gccgo

英文:

If you know the number of elements before hand, you can preallocate it with:

s := make([]X, 0, N)
for i := 0; i &lt; N; i++ {
    s = append(s, x)
}

Also use Go 1.3, the compiler got some optimizations.

And for better vectorization, try gccgo

huangapple
  • 本文由 发表于 2014年8月4日 03:53:27
  • 转载请务必保留本文链接:https://go.coder-hub.com/25108508.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定