Go program slowing down when increasing number of goroutines


Question

I'm doing a small project for my parallelism course, and I have tried it with buffered channels, unbuffered channels, and without channels using pointers to slices. I have also tried to optimize it as much as possible (not the current state), but I still get the same result: increasing the number of goroutines (even by one) slows down the whole program. Can someone please tell me what I'm doing wrong, and whether a parallel speedup is even possible in this situation?

Here is part of the code:

func main() {

	rand.Seed(time.Now().UnixMicro())

	numAgents := 2

	fmt.Println("Please pick a number of goroutines: ")
	fmt.Scanf("%d", &numAgents)

	numFiles := 4
	fmt.Println("How many files do you want?")
	fmt.Scanf("%d", &numFiles)
	start := time.Now()

	numAssist := numFiles
	channel := make(chan []File, numAgents)
	files := make([]File, 0)

	for i := 0; i < numAgents; i++ {
		if i == numAgents-1 {
			go generateFiles(numAssist, channel)
		} else {
			go generateFiles(numFiles/numAgents, channel)
			numAssist -= numFiles / numAgents
		}
	}

	for i := 0; i < numAgents; i++ {
		files = append(files, <-channel...)
	}

	elapsed := time.Since(start)
	fmt.Printf("Function took %s\n", elapsed)
}

func generateFiles(numFiles int, channel chan []File) {
	magicNumbersMap := getMap()
	files := make([]File, 0)

	for i := 0; i < numFiles; i++ {
		content := randElementFromMap(&magicNumbersMap)

		length := rand.Intn(400) + 100
		hexSlice := getHex()

		for j := 0; j < length; j++ {
			content = content + hexSlice[rand.Intn(len(hexSlice))]
		}

		hash := getSHA1Hash([]byte(content))

		file := File{
			content: content,
			hash:    hash,
		}

		files = append(files, file)
	}

	channel <- files

}

The expectation was that increasing the number of goroutines would make the program run faster, up to a certain number of goroutines, beyond which I would get the same execution time or slightly slower. Instead, it gets slower from the very first added goroutine.

EDIT: all the functions that are used:

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"math/rand"
	"time"
)

type File struct {
	content string
	hash    string
}

func getMap() map[string]string {
	return map[string]string{
		"D4C3B2A1": "Libcap file format",
		"EDABEEDB": "RedHat Package Manager (RPM) package",
		"4C5A4950": "lzip compressed file",
	}
}

func getHex() []string {
	return []string{
		"0", "1", "2", "3", "4", "5",
		"6", "7", "8", "9", "A", "B",
		"C", "D", "E", "F",
	}
}

func randElementFromMap(m *map[string]string) string {
	x := rand.Intn(len(*m))
	for k := range *m {
		if x == 0 {
			return k
		}
		x--
	}
	return "Error"
}

func getSHA1Hash(content []byte) string {
	h := sha1.New()
	h.Write(content)
	return base64.URLEncoding.EncodeToString(h.Sum(nil))
}



Answer 1

Score: 3

Simply put, the file-generation code is not complex enough to justify parallel execution. All the context switching and moving of data through the channel eats up the benefit of parallel processing.

If you add something like time.Sleep(time.Millisecond * 10) inside the loop in your generateFiles function, as if it were doing something more complex, you'll see what you expected to see: more goroutines work faster. But again, only up to a certain level, at which point the extra cost of parallel processing outweighs the benefit.
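A minimal, self-contained sketch of that experiment. The simulateFiles worker here is a hypothetical stand-in for generateFiles: each "file" costs a fixed 10 ms of sleep instead of string building and hashing, and the work is split across goroutines the same way the question's main does.

```go
package main

import (
	"fmt"
	"time"
)

// simulateFiles stands in for generateFiles: each "file" costs a
// fixed 10ms of simulated work instead of string building and hashing.
func simulateFiles(n int, done chan<- int) {
	for i := 0; i < n; i++ {
		time.Sleep(10 * time.Millisecond)
	}
	done <- n
}

// run splits numFiles across numAgents goroutines, mirroring the
// question's main, and returns the wall-clock time.
func run(numFiles, numAgents int) time.Duration {
	start := time.Now()
	done := make(chan int, numAgents)
	rest := numFiles
	for i := 0; i < numAgents; i++ {
		if i == numAgents-1 {
			go simulateFiles(rest, done)
		} else {
			go simulateFiles(numFiles/numAgents, done)
			rest -= numFiles / numAgents
		}
	}
	for i := 0; i < numAgents; i++ {
		<-done
	}
	return time.Since(start)
}

func main() {
	// With heavier per-file work, more goroutines now means less time:
	// roughly 400ms for one worker vs. roughly 100ms for four.
	fmt.Println("1 goroutine: ", run(40, 1))
	fmt.Println("4 goroutines:", run(40, 4))
}
```

With the original, nearly instant per-file work, the same split shows no such gain, which is the answer's point.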

Note also that the execution time of the last part of your program:

for i := 0; i < numAgents; i++ {
    files = append(files, <-channel...)
}

depends directly on the number of goroutines. Since all the goroutines finish at roughly the same time, this loop almost never runs in parallel with your workers, and the time it takes is simply added to the total.

Also, when you append to the files slice multiple times, it has to grow several times and copy its data over to a new location. You can avoid this by initially creating a slice with enough capacity for all the resulting elements (luckily, you know exactly how many you'll need).
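A minimal sketch of that preallocation, reusing the question's File type; the capacity hint is the total file count, which is known up front:

```go
package main

import "fmt"

// File mirrors the struct from the question.
type File struct {
	content string
	hash    string
}

func main() {
	numFiles := 8
	// Reserve capacity for every expected element up front, so append
	// never has to grow and copy the backing array.
	files := make([]File, 0, numFiles)
	for i := 0; i < numFiles; i++ {
		files = append(files, File{content: fmt.Sprintf("file-%d", i)})
	}
	fmt.Println(len(files), cap(files)) // 8 8
}
```

The same hint works for the per-worker slice inside generateFiles (make([]File, 0, numFiles)), since each worker also knows its count in advance.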


huangapple
  • Posted on 2023-01-07 01:22:29
  • Please keep this link when reposting: https://go.coder-hub.com/75034141.html