英文:
How to iterate int range concurrently
问题
纯粹出于教育目的,我创建了一个 base58 包。它可以使用 比特币 base58 符号表 对 uint64
进行编码/解码,例如:
b58 := Encode(100) // 返回 2j
num := Decode("2j") // 返回 100
在创建第一个测试时,我得到了 这个:
func TestEncode(t *testing.T) {
var i uint64
for i = 0; i <= (1<<64 - 1); i++ {
b58 := Encode(i)
num := Decode(b58)
if num != i {
t.Fatalf("期望 %d 对应 %s", i, b58)
}
}
}
这个"天真"的实现尝试将 uint64 的所有范围(从 0 到 18,446,744,073,709,551,615)转换为 base58,然后再转回 uint64,但是花费的时间太长。
为了更好地理解 Go 如何处理并发,我想知道如何使用通道或 goroutine 并以最高效的方式迭代整个 uint64 范围?
如果可以,数据可以被分块并并行处理,如何实现?
提前感谢。
更新:
如 @Adrien 的答案中所提到的,一种方法是使用 t.Parallel(),但这仅适用于测试包。无论如何,通过实现它,我发现它明显更慢,虽然它可以并行运行,但没有速度上的提升。
我知道完整的 uint64
可能需要几年的时间,但我想找出/了解的是如何使用通道或 goroutine 来加速这个过程(使用小范围 1<<16
进行测试),可能可以使用类似这个的东西 https://play.golang.org/p/9U22NfrXeq 作为示例。
问题不是如何测试包,而是通过使用并发来加速迭代的算法和技术。
英文:
For purely educational purposes I created a base58 package. It will encode/decode a uint64
using the bitcoin base58 symbol chart, for example:
b58 := Encode(100) // return 2j
num := Decode("2j") // return 100
While creating the first tests I came with this:
func TestEncode(t *testing.T) {
var i uint64
for i = 0; i <= (1<<64 - 1); i++ {
b58 := Encode(i)
num := Decode(b58)
if num != i {
t.Fatalf("Expecting %d for %s", i, b58)
}
}
}
This "naive" implementation, tries to convert all the range from uint64 (From 0 to 18,446,744,073,709,551,615) to base58 and later back to uint64 but takes too much time.
To better understand how go handles concurrency I would like to know how to use channels or goroutines and perform the iteration across the full uint64 range in the most efficient way?
Could data be processed by chunks and in parallel, if yes how to accomplish this?
Thanks in advance.
UPDATE:
Like mention in the answer by @Adrien, one-way is to use t.Parallel() but that applies only when testing the package, In any case, by implementing it I found that is noticeably slower, it runs in parallel but there is no speed gain.
I understand that doing the full uint64
may take years but what I want to find/now is how could a channel or goroutine, may help to speed up the process (testing with small range 1<<16
) probably by using something like this https://play.golang.org/p/9U22NfrXeq just as an example.
The question is not about how to test the package is about what algorithm, technic could be used to iterate faster by using concurrency.
答案1
得分: 1
这个功能已经内置在Go的testing
包中,以T.Parallel
的形式存在。
func TestEncode(t *testing.T) {
var i uint64
for i = 0; i <= (1<<64 - 1); i++ {
t.Run(fmt.Sprintf("%d",i), func(t *testing.T) {
j := i // 复制到本地变量 - 重要
t.Parallel() // 将测试标记为可并行运行
b58 := Encode(j)
num := Decode(b58)
if num != j {
t.Fatalf("期望值为 %d,实际值为 %s", j, b58)
}
})
}
}
你需要注意的是,这段代码是用Go语言编写的,T.Parallel
用于标记测试函数可以并行运行。
英文:
This functionality is built into the Go testing
package, in the form of T.Parallel
:
func TestEncode(t *testing.T) {
var i uint64
for i = 0; i <= (1<<64 - 1); i++ {
t.Run(fmt.Sprintf("%d",i), func(t *testing.T) {
j := i // Copy to local var - important
t.Parallel() // Mark test as parallelizable
b58 := Encode(j)
num := Decode(b58)
if num != j {
t.Fatalf("Expecting %d for %s", j, b58)
}
})
}
}
答案2
得分: 0
我提出了以下解决方案:
package main
import (
"fmt"
"time"
"github.com/nbari/base58"
)
func encode(i uint64) {
x := base58.Encode(i)
fmt.Printf("%d = %s\n", i, x)
time.Sleep(time.Second)
}
func main() {
concurrency := 4
sem := make(chan struct{}, concurrency)
for i, val := uint64(0), uint64(1<<16); i <= val; i++ {
sem <- struct{}{}
go func(i uint64) {
defer func() { <-sem }()
encode(i)
}(i)
}
for i := 0; i < cap(sem); i++ {
sem <- struct{}{}
}
}
基本上,启动4个工作线程并调用encode
函数,为了更好地观察和理解这种行为,添加了一个睡眠函数,以便数据可以按4个一组打印。
此外,这些答案帮助我更好地理解并发性:https://stackoverflow.com/a/18405460/1135424
如果有更好的方法,请告诉我。
英文:
I came up with this solutions:
<!-- language: go -->
package main
import (
"fmt"
"time"
"github.com/nbari/base58"
)
func encode(i uint64) {
x := base58.Encode(i)
fmt.Printf("%d = %s\n", i, x)
time.Sleep(time.Second)
}
func main() {
concurrency := 4
sem := make(chan struct{}, concurrency)
for i, val := uint64(0), uint64(1<<16); i <= val; i++ {
sem <- struct{}{}
go func(i uint64) {
defer func() { <-sem }()
encode(i)
}(i)
}
for i := 0; i < cap(sem); i++ {
sem <- struct{}{}
}
}
Basically, start 4 workers and calls the encode function, to notice/understand more this behavior a sleep is added so that the data can be printed in chunks of 4.
Also, these answers helped me to better understand concurrency understanding: https://stackoverflow.com/a/18405460/1135424
If there is a better way please let me know.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论