英文:
Is there analog of memset in go?
问题
在C++中,可以使用memset来初始化数组并赋予某个值:
const int MAX = 1000000;
int is_prime[MAX];
memset(is_prime, 1, sizeof(is_prime));
简单来说,memset的作用是快速地将数组填充为某个值。
在Go语言中,可以使用is_prime := make([]int, 1000000)
来创建一个全为0的切片,类似地,可以使用new([1000000]int)
来创建一个全为0的数组,但是没有直接的方法可以创建一个全为1或其他非零元素的数组或切片。
当然,可以使用循环来后续填充数组的值,但是memset
的主要目的是比循环更快速。
那么,Go语言程序员有没有类似于memset
的方法(用于快速地将数组初始化为非零值)呢?
英文:
In C++ I can initialize an array with some value using memset:
const int MAX = 1000000;
int is_prime[MAX]
memset(is_prime, 1, sizeof(is_prime))
What memset does, crudely can be described as filling the array with some value, but doing this really really fast.
In go I can do is_prime := make([]int, 1000000)
, but this will create a slice with all 0, in the similar manner I can use new([1000000]int)
, but nothing will allow me to create an array/slice with all 1 or any other non-zero element.
Of course I can use a loop to populate it with the value later, but the main purpose of memset
is that it is way way faster than the loop.
So do Go programmers have a memset
analog (fast way of initializing array to some non-zero value)?
答案1
得分: 31
使用循环的最简单解决方案如下:
func memsetLoop(a []int, v int) {
for i := range a {
a[i] = v
}
}
标准库中没有memset
函数的支持,但我们可以利用内置的copy()
函数来实现高度优化的效果。
使用重复的copy()
我们可以手动设置第一个元素,并使用copy()
将已设置的部分复制到未设置的部分;已设置的部分每次都会变得越来越大(翻倍),所以迭代次数为log(n)
:
func memsetRepeat(a []int, v int) {
if len(a) == 0 {
return
}
a[0] = v
for bp := 1; bp < len(a); bp *= 2 {
copy(a[bp:], a[:bp])
}
}
这个解决方案受到了bytes.Repeat()
的实现启发。如果你只想创建一个填充相同值的新的[]byte
切片,可以使用bytes.Repeat()
函数。但对于现有的切片或除[]byte
之外的其他切片,你可以使用上述的memsetRepeat()
函数。
对于小切片,memsetRepeat()
可能比memsetLoop()
慢(但对于小切片来说,这并不重要,它会立即执行)。
由于使用了快速的copy()
函数,如果元素数量增加,memsetRepeat()
将会更快。
对这两种解决方案进行基准测试:
var a = make([]int, 1000) // 大小会变化
func BenchmarkLoop(b *testing.B) {
for i := 0; i < b.N; i++ {
memsetLoop(a, 10)
}
}
func BenchmarkRepeat(b *testing.B) {
for i := 0; i < b.N; i++ {
memsetRepeat(a, 11)
}
}
基准测试结果
100个元素:大约快1.15倍
BenchmarkLoop 20000000 81.6 ns/op
BenchmarkRepeat 20000000 71.0 ns/op
1,000个元素:大约快2.5倍
BenchmarkLoop 2000000 706 ns/op
BenchmarkRepeat 5000000 279 ns/op
10,000个元素:大约快2倍
BenchmarkLoop 200000 7029 ns/op
BenchmarkRepeat 500000 3544 ns/op
100,000个元素:大约快1.5倍
BenchmarkLoop 20000 70671 ns/op
BenchmarkRepeat 30000 45213 ns/op
性能提升最高的是在3800-4000个元素左右,大约快3.2倍。
英文:
The simplest solution with a loop would look like this:
func memsetLoop(a []int, v int) {
for i := range a {
a[i] = v
}
}
There is no memset
support in the standard library, but we can make use of the built-in copy()
which is highly optimized.
With repeated copy()
We can set the first element manually, and start copying the already set part to the unset part using copy()
; where the already set part gets bigger and bigger every time (doubles), so the number of iterations is log(n)
:
func memsetRepeat(a []int, v int) {
if len(a) == 0 {
return
}
a[0] = v
for bp := 1; bp < len(a); bp *= 2 {
copy(a[bp:], a[:bp])
}
}
This solution was inspired by the implementation of bytes.Repeat()
. If you just want to create a new []byte
filled with the same values, you can use the bytes.Repeat()
function. You can't use that for an existing slice or slices other than []byte
, for that you can use the presented memsetRepeat()
.
In case of small slices memsetRepeat()
may be slower than memsetLoop()
(but in case of small slices it doesn't really matter, it will run in an instant).
Due to using the fast copy()
, memsetRepeat()
will be much faster if the number of elements grows.
Benchmarking these 2 solutions:
var a = make([]int, 1000) // Size will vary
func BenchmarkLoop(b *testing.B) {
for i := 0; i < b.N; i++ {
memsetLoop(a, 10)
}
}
func BenchmarkRepeat(b *testing.B) {
for i := 0; i < b.N; i++ {
memsetRepeat(a, 11)
}
}
Benchmark results
100 elements: ~1.15 times faster
BenchmarkLoop 20000000 81.6 ns/op
BenchmarkRepeat 20000000 71.0 ns/op
1,000 elements: ~2.5 times faster
BenchmarkLoop 2000000 706 ns/op
BenchmarkRepeat 5000000 279 ns/op
10,000 elements: ~2 times faster
BenchmarkLoop 200000 7029 ns/op
BenchmarkRepeat 500000 3544 ns/op
100,000 elements: ~1.5 times faster
BenchmarkLoop 20000 70671 ns/op
BenchmarkRepeat 30000 45213 ns/op
The highest performance gain is around 3800-4000 elements where it is ~3.2 times faster.
答案2
得分: 7
根据标题为“optimize memset idiom”的这个错误,除了使用循环之外,Go语言中没有其他方法来实现这一点。该问题于2013年1月9日关闭,并发布了以下帖子:
我认为这个问题已经解决了。优化非零情况并不是很有趣。
如果有人对做更多工作有强烈意见,我们可以再开一个错误报告。
因此,解决方案是使用icza已经提到的循环。
虽然有bytes.Repeat函数,但它也只是使用了循环:
func Repeat(b []byte, count int) []byte {
nb := make([]byte, len(b)*count)
bp := copy(nb, b)
for bp < len(nb) {
copy(nb[bp:], nb[:bp])
bp *= 2
}
return nb
}
英文:
According to this bug titled "optimize memset idiom" there is no way to do this in Go other than with a loop. The issue was closed on 9 Jan 2013 with this post
> I consider this fixed. Optimizing non-zero cases isn't very
> interesting.
>
> We can open another bug if people feel strongly about doing more.
So the solution is to use a loop as already covered by icza.
There is bytes.Repeat but that also just uses a loop:
func Repeat(b []byte, count int) []byte {
nb := make([]byte, len(b)*count)
bp := copy(nb, b)
for bp < len(nb) {
copy(nb[bp:], nb[:bp])
bp *= 2
}
return nb
}
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论