英文:
Multithreading in Go. Can somebody explain these answers to me?
问题
我在模拟考试中遇到了两个问题。我已经得到了答案,但是无法理解背后的原理。
我首先会发布代码,然后是问题和答案。也许有人可以友好地解释答案给我听?
package main
import "fmt"
func fact(n int, c chan int, d chan int) {
k := /* code to compute factorial of n */
z := <- d
c <- k + z
d <- z + 1
}
func main() {
r := 0
c := make(chan int)
d := make(chan int)
for i = 0 ; i < N ; i++ {
go fact(i,c,d)
}
d <- 0
for j = 0 ; j < N ; j++ {
r = r + <-c
}
fmt.Printf("result = %d\n",r)
}
第一个问题是:
如果我们在主过程中省略了"d <- 0"这一行,程序会如何运行,为什么?
老师的答案是:
所有线程的状态都被阻塞,包括主线程。
第二个问题是:
如果我们交换fact过程的前两行,整个程序的效率会受到什么影响?
答案是:
所有线程将按顺序运行。每个线程只有在完成后才会触发另一个线程。
英文:
I got two questions on a mock-exam. I was given the answers but cannot figure out the rationale behind them.
I will first post the code and then the questions and answers. Maybe somebody would be so kind to explain the answers to me?
package main
import "fmt"
func fact(n int, c chan int, d chan int) {
k := /* code to compute factorial of n */
z := <- d
c <- k + z
d <- z + 1
}
func main() {
r := 0
c := make(chan int)
d := make(chan int)
for i = 0 ; i < N ; i++ {
go fact(i,c,d)
}
d <- 0
for j = 0 ; j < N ; j++ {
r = r + <-c
}
fmt.Printf("result = %d\n",r)
}
The first question is:
How does the program behave if we omit the line "d <- 0" in the main procedure, and why?
The answer from the teacher is:
The state of all the threads are blocked and also the main thread.
The second question is:
How would the efficiency of the whole program be affected if we swapped the first two lines of the fact procedure?
The answer is:
All threads are going to run sequentially. Each thread would trigger another thread only when it is finished.
答案1
得分: 11
最好不要将其视为“多线程”。Go语言提供了直接支持并发的机制,而不是线程。它使用线程来实现并发,但这只是实现的细节。可以参考Rob Pike的演讲《并发不等于并行》(Concurrency is not parallelism)来深入讨论。
关键在于,默认情况下,通道是同步的(除非在构造时将其设置为缓冲通道)。当一个goroutine向通道写入数据时,它会阻塞直到另一个goroutine从该通道读取数据。因此,当执行以下代码时:
z := <-d
它将无法继续执行,直到执行以下代码:
d <- 0
如果在d
通道上没有可用的值,fact
函数将无法继续执行。这对你来说可能是显而易见的。但反过来也是正确的。除非有其他goroutine从d
通道读取数据,否则主goroutine无法继续执行。因此,无缓冲通道提供了并发goroutine之间的同步点。
类似地,主循环无法继续执行,直到c
通道上出现某个值。我发现使用两根手指指向每个goroutine中的当前代码行非常有用。将一根手指向前移,直到达到一个通道操作。然后将另一根手指向前移,直到达到另一个通道操作。如果你的手指指向同一个通道上的读取和写入操作,那么你可以继续执行。如果不是,则会发生死锁。
如果你仔细思考一下,你会发现一个问题。这个程序会泄漏一个goroutine。
func fact(n int, c chan int, d chan int) {
k := /* 计算n的阶乘的代码 */
z := <-d // (1)
c <- k + z
d <- z + 1 // (2)
}
在(2)处,我们尝试向d
通道写入数据。什么会使其继续执行呢?另一个从d
通道读取数据的goroutine。记住,我们启动了N个goroutine,它们都试图从d
通道读取数据。只有一个goroutine会成功。其他的goroutine会在(1)处阻塞,等待d
通道上出现数据。当第一个goroutine到达(2)时,数据会出现在d
通道上。然后该goroutine退出,随机选择另一个goroutine继续执行。
但是,最后会有一个goroutine无法写入d
通道,从而导致泄漏。为了解决这个问题,在最后的Printf
之前需要添加以下代码:
<-d
这将允许最后一个goroutine退出。
英文:
It is best not to think of this as "multi-threading." Go provides direct facilities for concurrency, not for threading. It happens to implement its concurrency with threading, but that is an implementation detail. See Rob Pike's talk, Concurrency is not parallelism for a much deeper discussion.
The key to your questions is that channels are synchronous by default (if they are not made buffered during their construction). When one goroutine writes to a channel, it will block until some other goroutine reads from that channel. So when this line executes:
z := <- d
It cannot proceed until this line executes:
d <- 0
Without some value being available on the d
channel, fact
will never proceed. That is likely obvious to you. But the reverse is also true. Until something reads from the d
channel, the main goroutine cannot proceed. In this way unbuffered channels provide a synchronization point across concurrent goroutines.
Similarly, the main loop cannot proceed until some value appears on c
. I find it useful to use two fingers and point to the current line of code in each goroutine. Advance one finger until you get to a channel operation. Then advance the other until it reaches a channel operation. If your fingers are pointing to a read and a write on the same channel, then you may proceed. If they are not, then you're in deadlock.
If you think this through, you'll discover a problem. This program leaks a goroutine.
func fact(n int, c chan int, d chan int) {
k := /* code to compute factorial of n */
z := <- d // (1)
c <- k + z
d <- z + 1 // (2)
}
At (2) we try to write to d
. What will allow that to proceed? Another goroutine reading from d
. Remember, we started N
goroutines, and all of them tried to read from d
. Only one of them will be successful. The others will block at (1), waiting for something to show up on d
. That happens when the first one gets to (2). Then that goroutine exits and a random goroutine will proceed.
But there will be a final goroutine that will never be able to write to d
and it will leak. In order to resolve that, the following would need to be added before the final Printf
:
<-d
This would allow the last goroutine to exit.
答案2
得分: 4
如果在主过程中省略了"d <- 0"这一行,程序会如何运行,为什么?
如果没有这行代码,由"go fact(...)"启动的每个goroutine都将在语句"z := <- d"处等待来自通道的数据,从而被阻塞。
"fact()"函数对"d"通道的内容没有实质性影响-它会移除一些数据并添加一些数据。因此,如果通道中没有数据,程序将无法继续执行,导致死锁。
一个同时读写同一个通道的goroutine会导致死锁-在实际应用中要避免这种情况!
如果我们交换fact过程的前两行代码,整个程序的效率会受到影响吗?
在执行冗长的阶乘计算之前,"fact()"例程将等待从"d"通道接收到一个令牌。
由于"d"通道一次只能有一个令牌在使用,这意味着每个goroutine只有在接收到令牌时才会执行昂贵的计算,从而使它们串行化。
而在原始代码中,昂贵的阶乘计算是在并行执行之前进行的,然后再等待令牌。
实际上,这种方式的效果可能不如你所期望的那样好,因为goroutine只会在阻塞操作和函数调用时才会被调度。
英文:
> How does the program behave if we omit the line "d <- 0" in the main procedure, and why?
Withough that line, each goroutine started by go fact(...)
will be waiting for something from the channel, blocked at the statement z := <- d
.
The fact()
function has no net effect on the contents of the d
channel -it removes something and adds something. So if there is nothing in the channel there will be no progress and the program will deadlock.
A goroutine which reads and writes from the same channel is asking for a deadlock - avoid in real life!
> How would the efficiency of the whole program be affected if we swapped the first two lines of the fact procedure?
The fact()
routine will wait until it gets a token from the d
channel before doing its lengthy factorial calculation.
Because there is only one token in play in the d
channel at once, that means that each go routine will only do the expensive calculation when it receives the token, effectively serializing them.
As it was originally, the expensive factorial calculations are done in parallel before waiting for the token.
In practice this would work less well than you might have hoped for since goroutines aren't pre-emptively scheduled, only on blocking operations and function calls.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论