Unexpected behavior from launching a method call on a loop variable as a goroutine
Question
I read this article and decided to reproduce the behavior myself and experiment with it:
package main

import (
	"fmt"
	"time"
)

type User struct {
	i     int
	token string
}

func NewUser(i int, token string) User {
	user := User{token: fmt.Sprint(i), i: i}
	return user
}

func (u *User) PrintAddr() {
	fmt.Printf("%d (PrintAddr): %p\n", u.i, u)
}

func main() {
	users := make([]User, 4)
	for i := 0; i < 4; i++ {
		user := NewUser(i, "")
		users[i] = user
	}

	for i, user := range users {
		go user.PrintAddr()
		go users[i].PrintAddr()
	}

	time.Sleep(time.Second)
}
Here is the code output:
1 (PrintAddr): 0xc000056198
2 (PrintAddr): 0xc0000561b0
0 (PrintAddr): 0xc000056180
3 (PrintAddr): 0xc00000c030
3 (PrintAddr): 0xc00000c030
3 (PrintAddr): 0xc00000c030
3 (PrintAddr): 0xc00000c030
3 (PrintAddr): 0xc0000561c8
I also don't understand why four of the five 3 (PrintAddr) lines show the address 0xc00000c030 while the last one is different.
However, if I use a slice of pointers instead of a slice of values, like this,
func NewUser(i int, token string) *User {
	user := &User{token: fmt.Sprint(i), i: i}
	return user
}

// -snip-

func main() {
	users := make([]*User, 4)
	// -snip-
}
then everything's fine here and each entry is printed exactly 2 times with the same address:
1 (PrintAddr): 0xc0000ae030
3 (PrintAddr): 0xc0000ae060
2 (PrintAddr): 0xc0000ae048
2 (PrintAddr): 0xc0000ae048
3 (PrintAddr): 0xc0000ae060
1 (PrintAddr): 0xc0000ae030
0 (PrintAddr): 0xc0000ae018
0 (PrintAddr): 0xc0000ae018
But why does the situation described in the article not apply here? Why didn't I get many 3 (PrintAddr) lines instead?
Answer 1
Score: 3
Problem
Your first version has a synchronisation bug, which manifests itself as a data race:
$ go run -race main.go
0 (PrintAddr): 0xc0000b4018
0 (PrintAddr): 0xc0000c2120
==================
WARNING: DATA RACE
Write at 0x00c0000b4018 by main goroutine:
main.main()
redacted/main.go:29 +0x1e5
Previous read at 0x00c0000b4018 by goroutine 7:
main.(*User).PrintAddr()
redacted/main.go:19 +0x44
Goroutine 7 (finished) created at:
main.main()
redacted/main.go:30 +0x244
==================
1 (PrintAddr): 0xc0000b4018
1 (PrintAddr): 0xc0000c2138
2 (PrintAddr): 0xc0000b4018
2 (PrintAddr): 0xc0000c2150
3 (PrintAddr): 0xc0000b4018
3 (PrintAddr): 0xc0000c2168
Found 1 data race(s)
The for loop (line 29) keeps updating the loop variable user while, concurrently and without proper synchronisation, the PrintAddr method accesses it via its pointer receiver (line 19). Note that if you don't start user.PrintAddr() as a goroutine on line 30, the problem goes away.
The problem and a solution to it are actually given at the bottom of the Wiki you link to.
> But why did the situation in the article not apply here and I didn't get many 3 (PrintAddr) instead?
That synchronisation bug is a source of undesired nondeterminism. In particular, you cannot predict how many times (if any) 3 (PrintAddr) will be printed, and that number may vary from one execution to the next. In fact, scroll up and see for yourself: in my execution with the race detector on, the output happened to feature two of each integer between 0 and 3, despite the bug; but there's no guarantee of that.
Solution
Simply shadow the loop variable user at the top of the loop and the problem goes away:
for i, user := range users {
	user := user // <---
	go user.PrintAddr()
	go users[i].PrintAddr()
}
PrintAddr will now operate on the innermost user variable, which is not updated by the for loop on line 29.
Addendum
You should also use a wait group to wait for all your goroutines to finish. time.Sleep is no way to coordinate goroutines.
Answer 2
Score: 2
The first version of your code, which ranges over the slice of values, is taking the address of the iteration variable. Why?
The method PrintAddr is defined on the pointer receiver:
func (u *User) PrintAddr() {
	fmt.Printf("%d (PrintAddr): %p\n", u.i, u)
}
In the for loop, the user iteration variable is reused on every iteration and assigned the next value in the slice. Therefore it is the same variable each time. But you are taking its address by calling a method that is defined on the pointer receiver:
users := make([]User, 4)
// ...
for i, user := range users {
	go user.PrintAddr()
	go users[i].PrintAddr()
}
Calling the method on the value is equivalent to (&user).PrintAddr():
> If x is addressable and &x's method set contains m, x.m() is shorthand for (&x).m()
Indexing the slice instead works as expected because you are accessing the actual i-th value in the slice, rather than the iteration variable.
Changing the slice to hold pointer values also works around this issue because the iteration variable is now a copy of the pointer to the User value.
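The difference between the two slices can be checked directly, without goroutines. The sketch below is only an illustration, and the helper names loopVarAliasesElement and loopVarSharesPointee are made up for it. It shows that when ranging over a slice of values, the iteration variable is a separate copy whose address never equals the element's address, whereas when ranging over a slice of pointers, the iteration variable holds the same pointer as the element.

```go
package main

import "fmt"

type User struct {
	i int
}

// loopVarAliasesElement reports whether, while ranging over a slice of
// values, the iteration variable ever has the same address as the element.
func loopVarAliasesElement(users []User) bool {
	for i, user := range users {
		if &user == &users[i] { // &user is the address of the copy
			return true
		}
	}
	return false
}

// loopVarSharesPointee reports whether, while ranging over a slice of
// pointers, the iteration variable always points to the same User as the
// corresponding element.
func loopVarSharesPointee(users []*User) bool {
	for i, user := range users {
		if user != users[i] { // the pointer is copied, the User is not
			return false
		}
	}
	return true
}

func main() {
	values := []User{{i: 0}, {i: 1}, {i: 2}, {i: 3}}
	pointers := []*User{{i: 0}, {i: 1}, {i: 2}, {i: 3}}

	fmt.Println(loopVarAliasesElement(values))  // prints false
	fmt.Println(loopVarSharesPointee(pointers)) // prints true
}
```

This is why go user.PrintAddr() on the pointer slice prints the element's own address: the receiver is the copied pointer value itself, not the address of the iteration variable.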