Processing a channel concurrently results in unexpected output

Question

I have an unbuffered channel from which i workers each take a value (a filesystem path) and process it (send the file contents over HTTP). I run into a problem when I increase i.

When I run this:

paths := make(chan string)

for i := 0; i < 5; i++ {
	go func() {
		for path := range paths {
			fmt.Println(path)
		}
	}()
}

walkFn := func(path string, info os.FileInfo, err error) error {
	if !info.IsDir() {
		paths <- path
	}
	return nil
}

filepath.Walk("/tmp/foo", walkFn)
close(paths)

It works as expected and prints every file under /tmp/foo:

/tmp/foo/2
/tmp/foo/file9
/tmp/foo/file91
/tmp/foo/file90
/tmp/foo/file900
/tmp/foo/file901
/tmp/foo/file902
/tmp/foo/file92
/tmp/foo/file97
/tmp/foo/file93
/tmp/foo/file94
/tmp/foo/file95
/tmp/foo/file96
/tmp/foo/file98
/tmp/foo/file99

But when I send the file contents over HTTP, the number of affected files suddenly goes down:

for i := 0; i < 5; i++ {
	go func() {
		for path := range paths {
			resp, err := http.Head("https://example.com/" + strings.TrimPrefix(path, rootDir+"/"))
			if err != nil {
				fmt.Printf("Error: %s\n", err)
				return
			}

			fmt.Printf("%s: %s\n", path, resp.Status)
		}
	}()
}

The number of affected files drops from 15 (the number of files in the directory) to 10:

/tmp/foo/2: 404 Not Found
/tmp/foo/file901: 404 Not Found
/tmp/foo/file900: 404 Not Found
/tmp/foo/file9: 404 Not Found
/tmp/foo/file90: 404 Not Found
/tmp/foo/file902: 404 Not Found
/tmp/foo/file91: 404 Not Found
/tmp/foo/file92: 404 Not Found
/tmp/foo/file93: 404 Not Found
/tmp/foo/file94: 404 Not Found

Here's a table that relates the value of i to the number of output lines:

+-----+-------+
| `i` | lines |
+-----+-------+
| 1   | 15    |
| 5   | 10    |
| 6   | 9     |
| 15  | 0     |
+-----+-------+

Why does this happen, and how can I process all the channel entries concurrently? Is it a problem with the HTTP requests?

Answer 1

Score: 1

The problem is that after this line:

filepath.Walk("/tmp/foo", walkFn)

all paths have been sent through the paths channel, which implies that some goroutine received each of them. However, it does not imply that the receiving goroutines have finished processing them.

So when your program exits after close(paths), there are still goroutines working and they get killed because main is finished.

https://golang.org/ref/spec#Program_execution

> Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.

One simple solution is to add

select{}

at the end of your program. This will make it block forever.
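Since select{} never returns, the program cannot exit normally on its own. A common alternative is to wait for the workers explicitly with sync.WaitGroup. The following is a minimal sketch under that assumption, not the answerer's code: processAll and its handle callback are hypothetical names, the slice of paths stands in for filepath.Walk, and the callback stands in for the HTTP request.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans items out to n workers over an unbuffered channel and
// returns only after every worker has finished. It returns the number of
// items that were handled. n must be at least 1, otherwise the send blocks.
func processAll(items []string, n int, handle func(string)) int {
	paths := make(chan string)

	var wg sync.WaitGroup
	var mu sync.Mutex
	handled := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for path := range paths {
				handle(path)
				mu.Lock()
				handled++
				mu.Unlock()
			}
		}()
	}

	for _, p := range items {
		paths <- p
	}
	close(paths) // lets each worker's range loop end
	wg.Wait()    // block until every worker has returned

	return handled
}

func main() {
	items := []string{"/tmp/foo/2", "/tmp/foo/file9", "/tmp/foo/file90"}
	n := processAll(items, 5, func(p string) { fmt.Println(p) })
	fmt.Println("handled:", n)
}
```

The order of the printed paths is not deterministic, but the count covers every item because wg.Wait() does not return until all workers are done.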

huangapple
  • Published on July 9, 2015 at 22:05:12