英文:
Attempt to parallelize not fast enough
问题
我阅读了关于Go语言并发模型的文章,也了解了并发和并行的区别。为了测试并行执行,我编写了以下程序。
package main
import (
"fmt"
"runtime"
"time"
)
const count = 1e8
var buffer [count]int
func main() {
fmt.Println("GOMAXPROCS: ", runtime.GOMAXPROCS(0))
// 初始化缓冲区
for i := 0; i < count; i++ {
buffer[i] = 3
}
// 顺序操作
now := time.Now()
worker(0, count-1)
fmt.Println("顺序操作: ", time.Since(now))
// 尝试并行化
ch := make(chan int, 1)
now = time.Now()
go func() {
worker(0, (count/2)-1)
ch <- 1
}()
worker(count/2, count-1)
<-ch
fmt.Println("并行操作: ", time.Since(now))
}
func worker(start int, end int) {
for i := start; i <= end; i++ {
task(i)
}
}
func task(index int) {
buffer[index] = 2 * buffer[index]
}
但问题是:结果并不是很令人满意。
GOMAXPROCS: 8
顺序操作: 206.85ms
并行操作: 169.028ms
使用goroutine确实加快了速度,但并没有达到预期的两倍速度。我的代码和/或理解有什么问题?我该如何更接近两倍的速度?
英文:
I read about Go's concurrency model and also saw about the difference between concurrency and parallelism. In order to test parallel execution, I wrote the following program.
package main
import (
"fmt"
"runtime"
"time"
)
const count = 1e8
var buffer [count]int
func main() {
fmt.Println("GOMAXPROCS: ", runtime.GOMAXPROCS(0))
// Initialise with dummy value
for i := 0; i < count; i++ {
buffer[i] = 3
}
// Sequential operation
now := time.Now()
worker(0, count-1)
fmt.Println("sequential operation: ", time.Since(now))
// Attempt to parallelize
ch := make(chan int, 1)
now = time.Now()
go func() {
worker(0, (count/2)-1)
ch <- 1
}()
worker(count/2, count-1)
<-ch
fmt.Println("parallel operation: ", time.Since(now))
}
func worker(start int, end int) {
for i := start; i <= end; i++ {
task(i)
}
}
func task(index int) {
buffer[index] = 2 * buffer[index]
}
But the problem is: the results are not very pleasing.
GOMAXPROCS: 8
sequential operation: 206.85ms
parallel operation: 169.028ms
Using a goroutine does speed things up but not enough. I expected it to be closer to being twice as fast. What is wrong with my code and/or understanding? And how can I get closer to being twice as fast?
答案1
得分: 2
并行化是强大的,但在如此小的计算负载下很难看到效果。以下是一个具有更大结果差异的示例代码:
package main
import (
"fmt"
"math"
"runtime"
"time"
)
func calctest(nCPU int) {
fmt.Println("协程数:", nCPU)
ch := make(chan float64, nCPU)
startTime := time.Now()
a := 0.0
b := 1.0
n := 100000.0
deltax := (b - a) / n
stepPerCPU := n / float64(nCPU)
for start := 0.0; start < n; {
stop := start + stepPerCPU
go f(start, stop, a, deltax, ch)
start = stop
}
integral := 0.0
for i := 0; i < nCPU; i++ {
integral += <-ch
}
fmt.Println(time.Now().Sub(startTime))
fmt.Println(deltax * integral)
}
func f(start, stop, a, deltax float64, ch chan float64) {
result := 0.0
for i := start; i < stop; i++ {
result += math.Sqrt(a + deltax*(i+0.5))
}
ch <- result
}
func main() {
nCPU := runtime.NumCPU()
calctest(nCPU)
fmt.Println("")
calctest(1)
}
这是我得到的结果:
协程数: 8
853.181μs
协程数: 1
2.031358ms
英文:
Parallelization is powerful, but it's hard to see with such a small computational load. Here is some sample code with a larger difference in the result:
package main
import (
"fmt"
"math"
"runtime"
"time"
)
func calctest(nCPU int) {
fmt.Println("Routines:", nCPU)
ch := make(chan float64, nCPU)
startTime := time.Now()
a := 0.0
b := 1.0
n := 100000.0
deltax := (b - a) / n
stepPerCPU := n / float64(nCPU)
for start := 0.0; start < n; {
stop := start + stepPerCPU
go f(start, stop, a, deltax, ch)
start = stop
}
integral := 0.0
for i := 0; i < nCPU; i++ {
integral += <-ch
}
fmt.Println(time.Now().Sub(startTime))
fmt.Println(deltax * integral)
}
func f(start, stop, a, deltax float64, ch chan float64) {
result := 0.0
for i := start; i < stop; i++ {
result += math.Sqrt(a + deltax*(i+0.5))
}
ch <- result
}
func main() {
nCPU := runtime.NumCPU()
calctest(nCPU)
fmt.Println("")
calctest(1)
}
This is the result I get:
Routines: 8
853.181µs
Routines: 1
2.031358ms
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论