Passing a message by sending a buffer to a Go channel, but it gets overwritten. Why?

Question

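In this example, the problem lies in the scope and reuse of the buf variable in the generator function. When buf is declared outside the loop, every iteration reuses the same slice, and therefore the same underlying array, instead of creating a new one. As a result, after buf has been sent on the channel, a later iteration can modify its contents, which changes the value the receiver sees.

When buf is declared inside the loop, each iteration creates a new slice with its own underlying array, so the contents of each slice are independent. Once buf has been sent on the channel, later iterations no longer modify it, and the output is as expected.

Moving the declaration of buf inside the loop therefore solves the problem.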


Below is a very simple and common case in Go, but the result is not what I expected.

package main

import (
	"fmt"
	"time"
)

func main() {
	consumer(generator())
	for {
		time.Sleep(time.Duration(time.Second))
	}
}

// simple generator through channel
func generator() <-chan []byte {
	ret := make(chan []byte)
	go func() {
		// make buf outside of loop, and result is not expected
		var ch = byte('A')
		count := 0
		buf := make([]byte, 1)
		for {
			if count > 10 {
				return
			}
			// make buf inside loop, and result is expected
			// buf := make([]byte, 1)
			buf[0] = ch
			ret <- buf
			ch++
			count++
			// time.Sleep(time.Duration(time.Second))
		}
	}()
	return ret
}

// simple consumer through channel
func consumer(recv <-chan []byte) {
	go func() {
		for buf := range recv {
			fmt.Println("received:" + string(buf[0]))
		}
	}()
}

output:
received:A
received:B
received:D
received:D
received:F
received:F
received:H
received:H
received:J
received:J
received:K

In generator, if I declare the buf variable inside the for loop, the result is what I expected:

received:A
received:B
received:C
received:D
received:E
received:F
received:G
received:H
received:I
received:J
received:K

My thinking was that even though buf is declared outside the for loop and is always the same slice, after we write it to the channel the receiver has to read it before the next write can happen, so its content should not be overwritten. Go does not seem to behave that way, though. What is going wrong here?

Answer 1

Score: 2


Problem: your code contains a data race

Save your program in a file named main.go, then run it with the race detector: go run -race main.go. You should see something like the following:

$ go run -race main.go 
received:A
==================
WARNING: DATA RACE
Write at 0x00c000180000 by goroutine 7:
  main.generator.func1()
      /redacted/main.go:29 +0x8c

Previous read at 0x00c000180000 by goroutine 8:
  main.consumer.func1()
      /redacted/main.go:43 +0x55

The race detector tells you that your program contains a data race, because two goroutines write to and read from the same shared memory without synchronisation:

  • the anonymous function launched as a goroutine in your generator function updates its local variable named buf at line 29;
  • the anonymous function launched as a goroutine in your consumer function reads from its local variable named buf at line 43.

The data race stems from the conjunction of two things:

  1. Although local variable buf in consumer is just a copy of the homonymous local variable in generator, those slice variables are coupled because they refer to the same underlying array.


    See [the relevant section of the language specification](https://golang.org/ref/spec#Slice_types):

    > A slice, once initialized, is always associated with an underlying array that holds its elements. A slice therefore shares storage with its array and with other slices of the same array [...]

  2. Operations on slices are not concurrency-safe and require proper synchronisation if performed concurrently (i.e. from multiple goroutines at the same time).

What your code displays is a typical case of aliasing. You would do well to familiarise yourself with how slices work.
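The aliasing described above can be seen without any goroutines. The following is a minimal sketch, not part of the original answer; the function name aliasDemo is hypothetical. It copies a slice value and an array value, mutates the originals, and reports what each copy observes.

```go
package main

import "fmt"

// aliasDemo (hypothetical name) copies a slice value and an array value,
// then mutates the originals and returns what each copy observes.
func aliasDemo() (viaSlice, viaArray byte) {
	s := []byte{'A'}
	sCopy := s // copies only the slice header; both share one underlying array
	s[0] = 'B'

	a := [1]byte{'A'}
	aCopy := a // copies the whole array, element included
	a[0] = 'B'

	return sCopy[0], aCopy[0]
}

func main() {
	viaSlice, viaArray := aliasDemo()
	// prints "slice copy sees B, array copy sees A"
	fmt.Printf("slice copy sees %c, array copy sees %c\n", viaSlice, viaArray)
}
```

Sending a slice on a channel behaves like the sCopy assignment: the receiver gets its own slice header, but that header still points at the sender's array.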

Solution

You could eliminate the data race by using a one-byte array ([1]byte) instead of a slice, but arrays are quite inflexible in Go. Whether you really need to use a slice of bytes at all here is unclear. Since you're effectively only sending one byte at a time to the channel, why not simply use a chan byte rather than a chan []byte?
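As a rough illustration of the array option mentioned above, here is a sketch that keeps the shape of the question's generator but sends [1]byte values. The collect helper is a hypothetical addition for gathering the output, and the loop bounds are chosen to match the question's A to K range.

```go
package main

import "fmt"

// generator sends one-byte arrays. Each send copies the array value,
// so later writes to buf cannot affect what the receiver already holds.
func generator() <-chan [1]byte {
	ret := make(chan [1]byte)
	go func() {
		defer close(ret) // lets the receiver's range loop finish
		var buf [1]byte
		for ch := byte('A'); ch <= 'K'; ch++ {
			buf[0] = ch
			ret <- buf
		}
	}()
	return ret
}

// collect (hypothetical helper) drains the channel into a string.
func collect(recv <-chan [1]byte) string {
	var out []byte
	for buf := range recv {
		out = append(out, buf[0])
	}
	return string(out)
}

func main() {
	fmt.Println(collect(generator())) // prints ABCDEFGHIJK
}
```

Because arrays are values in Go, buf can safely live outside the loop here. The trade-off is that the element count is fixed in the channel's type.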

Other improvements unrelated to the data race include:

  • modifying the API of your two functions to make them synchronous (and therefore, easier to reason about);

  • simplifying the generator logic and closing the channel so that main can actually terminate;

  • simplifying the consumer logic and not spawning a goroutine for it.

    package main

    import "fmt"

    func main() {
        ch := make(chan byte)
        go generator(ch)
        consumer(ch)
    }

    func generator(ch chan<- byte) {
        var c byte = 'A'
        for i := 0; i < 10; i++ {
            ch <- c
            c++
        }
        close(ch)
    }

    func consumer(ch <-chan byte) {
        for c := range ch {
            fmt.Printf("received: %c\n", c)
        }
    }

Answer 2

Score: 1


The case is very simple. Both goroutines hold a reference to the same buffer, so the channel alone does not guarantee synchronization of access to it. While the consumer is still reading the buffer it received, the generator is fast enough to modify it, which is why characters get skipped. To fix this, you have to either introduce another channel (one that sends the buffer back) or pass a copy of the buffer.
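Both fixes suggested in this answer might look roughly like the sketch below. It is not the answerer's code: all function names and the done channel are hypothetical, and the loop bounds are chosen to match the question's A to K range.

```go
package main

import "fmt"

// generatorCopy reuses buf but sends a fresh copy on every iteration.
func generatorCopy() <-chan []byte {
	ret := make(chan []byte)
	go func() {
		defer close(ret)
		buf := make([]byte, 1)
		for ch := byte('A'); ch <= 'K'; ch++ {
			buf[0] = ch
			msg := make([]byte, len(buf))
			copy(msg, buf) // the receiver gets its own underlying array
			ret <- msg
		}
	}()
	return ret
}

// generatorHandBack reuses one buffer, but does not touch it again
// until the consumer has returned it on done.
func generatorHandBack(done <-chan []byte) <-chan []byte {
	ret := make(chan []byte)
	go func() {
		defer close(ret)
		buf := make([]byte, 1)
		for ch := byte('A'); ch <= 'K'; ch++ {
			buf[0] = ch
			ret <- buf
			buf = <-done // block until the consumer is finished with the buffer
		}
	}()
	return ret
}

// collectCopy drains generatorCopy into a string.
func collectCopy() string {
	var out []byte
	for buf := range generatorCopy() {
		out = append(out, buf[0])
	}
	return string(out)
}

// collectHandBack drains generatorHandBack, returning each buffer after use.
func collectHandBack() string {
	done := make(chan []byte)
	var out []byte
	for buf := range generatorHandBack(done) {
		out = append(out, buf[0])
		done <- buf // hand ownership back to the generator
	}
	return string(out)
}

func main() {
	fmt.Println(collectCopy())     // prints ABCDEFGHIJK
	fmt.Println(collectHandBack()) // prints ABCDEFGHIJK
}
```

The copy variant costs one allocation per message. The hand-back variant avoids allocations but requires the consumer to return every buffer, otherwise the generator blocks forever.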

huangapple
  • Published on July 5, 2021, 16:21:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/68252843.html