Processing array in Go parallel: any risk of too many threads?
Question
I have a follow-up question to my previous post, "Processing array in Go parallel". Suppose my arrays are very large, for example:

    a1 := []int{0, 1, 2, 3, 4...1000}
    a2 := []int{10, 20, 30, 40, 50...10000}

and I have only 4 CPUs:

    runtime.GOMAXPROCS(4)
    var wg sync.WaitGroup

Is the following code still correct?

    for i := 1; i < 1000; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            x := process_array(a1[i], a2[i])
            fmt.Println(a1[i], "+", a2[i], "=", x)
        }(i)
    }
    wg.Wait()

In other words, will runtime.GOMAXPROCS(4) limit the number of threads to 4, or will there be a problem with 1000 threads "accumulating"? Thanks for your comments!
Answer 1
Score: 1
When writing parallel code to improve speed, always remember Amdahl's Law. It gives a very useful rule of thumb for when to stop bothering, and can be paraphrased as 'the sequential bits will become the bottleneck'.
If you ignore Amdahl's Law, you might end up wasting your time chasing impossible objectives. Instead, you might need to think about broader issues of concurrency to solve any performance problem in more than one place or in more than one way.
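For reference, the usual statement of Amdahl's Law can be written as below. The symbols are chosen here for illustration and do not come from the answer: $p$ is the fraction of the run time that can be parallelized, $N$ is the number of processors, and $S(N)$ is the resulting speedup.

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

As a worked example under assumed numbers: if 90% of the work parallelizes ($p = 0.9$) and there are 4 CPUs, then $S(4) = 1 / (0.1 + 0.225) \approx 3.08$, and no number of CPUs can push the speedup past $1 / 0.1 = 10$.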
Generally, the approach you are using is data-parallel: the "geometrical" decomposition of independent segments of data structures across multiple processes.
You might also consider function decompositions (essentially pipelines) where different stages do different work.
Then there is the special temporal case, using master-worker or 'data farming' as a way of achieving parallelism.
All these tend to need truly parallel hardware to be seriously useful. A good, but old, summary of multiprocessing using these techniques is in Tidmus/Chalmers Practical Parallel Processing: An introduction to problem solving in parallel (ISBN 1850321353).
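The master-worker approach mentioned above might look roughly like the following in Go. This is only a sketch under stated assumptions: `processArray` is a hypothetical stand-in for the question's `process_array` (here it just adds its arguments), `processAll` and the worker count of 4 are names and values picked for illustration, and `a2` is assumed to be at least as long as `a1`.

```go
package main

import (
	"fmt"
	"sync"
)

// processArray is a hypothetical stand-in for the question's process_array;
// here it simply adds its two arguments.
func processArray(x, y int) int {
	return x + y
}

// processAll farms the element-wise work out to a fixed number of workers.
// The master (this function) sends indices on a channel; each worker takes
// the next index as soon as it is free. Assumes len(a2) >= len(a1).
func processAll(a1, a2 []int, workers int) []int {
	results := make([]int, len(a1))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one worker, so writing
				// to results[i] needs no extra locking.
				results[i] = processArray(a1[i], a2[i])
			}
		}()
	}

	// Hand out the work, then signal that no more jobs are coming.
	for i := range a1 {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	a1 := []int{0, 1, 2, 3, 4}
	a2 := []int{10, 20, 30, 40, 50}
	for i, x := range processAll(a1, a2, 4) {
		fmt.Println(a1[i], "+", a2[i], "=", x)
	}
}
```

With this shape only `workers` goroutines exist at any time, regardless of how long the slices are, and the results are printed in index order because printing happens after all workers have finished.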
Answer 2
Score: 0
Your for loop will create about 1000 goroutines; runtime.GOMAXPROCS(4) only sets the number of CPUs that can be used. From the documentation of the runtime package:
> GOMAXPROCS sets the maximum number of CPUs that can be executing
> simultaneously and returns the previous setting. If n < 1, it does not
> change the current setting. The number of logical CPUs on the local
> machine can be queried with NumCPU. This call will go away when the
> scheduler improves.
and on the same page:
> The GOMAXPROCS variable limits the number of operating system threads
> that can execute user-level Go code simultaneously. There is no limit
> to the number of threads that can be blocked in system calls on behalf
> of Go code; those do not count against the GOMAXPROCS limit. This
> package's GOMAXPROCS function queries and changes the limit.
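The distinction in the quoted documentation can be observed with a small experiment. The sketch below is an illustration, not part of the original answer: `launch` is a hypothetical helper that starts `n` goroutines, parks them on a channel, and reports `runtime.NumGoroutine()` while they are all still alive.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// launch starts n goroutines that all block until release is closed, and
// returns how many goroutines exist while they are parked.
func launch(n int) int {
	release := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release // park until the channel is closed
		}()
	}
	during := runtime.NumGoroutine()
	close(release)
	wg.Wait()
	return during
}

func main() {
	// Setting the limit returns the previous value.
	prev := runtime.GOMAXPROCS(4)
	fmt.Println("previous GOMAXPROCS:", prev)

	// Calling with n < 1 only queries the current setting.
	fmt.Println("current GOMAXPROCS:", runtime.GOMAXPROCS(0))

	// The goroutine count is not capped by GOMAXPROCS.
	fmt.Println("goroutines alive:", launch(1000))
}
```

The last line reports at least 1000 goroutines even though GOMAXPROCS is 4: the setting bounds how many threads run Go code simultaneously, not how many goroutines may exist.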