Go中的Python风格生成器

huangapple go评论89阅读模式
英文:

Python-style generators in Go

问题

我目前正在学习Go之旅,我觉得goroutine的使用方式与Python生成器类似,特别是在问题66中。我觉得问题66看起来很复杂,所以我将其重写为以下代码:

package main

import "fmt"

func fibonacci(c chan int) {
    x, y := 1, 1
        
	for {
        c <- x
        x, y = y, x + y
    }
}

func main() {
    c := make(chan int)
	go fibonacci(c)
	
	for i := 0; i < 10; i++ {
		fmt.Println(<-c)
	}
}

这个代码似乎是有效的。有几个问题:

  1. 如果我将通道的缓冲区大小调大到10,fibonacci会尽可能快地填满10个位置,而main会尽可能快地消耗这些位置。这样做是正确的吗?这样做会以内存为代价提高性能,对吗?
  2. 由于fibonacci发送者没有关闭通道,当我们超出作用域时,内存会发生什么变化?我预期的是,一旦cgo fibonacci超出作用域,通道及其上的所有内容都会被垃圾回收。我的直觉告诉我,这可能不是发生的情况。
英文:

I'm currently working through the Tour of Go, and I thought that goroutines have been used similarly to Python generators, particularly with Question 66. I thought 66 looked complex, so I rewrote it to this:

package main

import &quot;fmt&quot;

func fibonacci(c chan int) {
    x, y := 1, 1
        
	for {
        c &lt;- x
        x, y = y, x + y
    }
}

func main() {
    c := make(chan int)
	go fibonacci(c)
	
	for i := 0; i &lt; 10; i++ {
		fmt.Println(&lt;-c)
	}
}

This seems to work. A couple of questions:

  1. If I turn up the buffer size on the channel to say, 10, fibonacci would fill up 10 further spots, as quickly as possible, and main would eat up the spots as quickly as it could go. Is this right? This would be more performant than a buffer size of 1 at the expense of memory, correct?
  2. As the channel doesn't get closed by the fibonacci sender, what happens memory-wise when we go out of scope here? My expectation is that once c and go fibonacci is out of scope, the channel and everything on it gets garbage-collected. My gut tells me this is probably not what happens.

答案1

得分: 26

是的,增加缓冲区大小可能会大大提高程序的执行速度,因为它会减少上下文切换的次数。Goroutines不会被垃圾回收,但通道会被回收。在你的例子中,斐波那契goroutine将永远运行(等待另一个goroutine从通道c中读取),而通道c将永远不会被销毁,因为fib-goroutine仍在使用它。

这是另一个稍微不同的程序,它不会泄漏内存,而且在我看来更类似于Python的生成器:

package main

import "fmt"

func fib(n int) chan int {
    c := make(chan int)
    go func() {
        x, y := 0, 1
        for i := 0; i <= n; i++ {
            c <- x
            x, y = y, x+y
        }
        close(c)
    }()
    return c
}

func main() {
    for i := range fib(10) {
        fmt.Println(i)
    }
}

或者,如果你不知道要生成多少个斐波那契数,你必须使用另一个退出通道,这样你就可以在应该停止生成器goroutine时向其发送信号。这就是golang教程https://tour.golang.org/concurrency/4中所解释的内容。

英文:

Yes, increasing the buffer size might drastically increase the execution speed of your program, because it will reduce the number of context switches. Goroutines aren't garbage-collected, but channels are. In your example, the fibonacci goroutine will run forever (waiting for another goroutine to read from the channel c), and the channel c will never be destroyed, because the fib-goroutine is still using it.

Here is another, sightly different program, which does not leak memory and is imho more similar to Python's generators:

package main

import &quot;fmt&quot;

func fib(n int) chan int {
	c := make(chan int)
	go func() {
		x, y := 0, 1
		for i := 0; i &lt;= n; i++ {
			c &lt;- x
			x, y = y, x+y
		}
		close(c)
	}()
	return c
}

func main() {
	for i := range fib(10) {
		fmt.Println(i)
	}
}

Alternatively, if you do not know how many Fibonacci numbers you want to generate, you have to use another quit channel so that you can send the generator goroutine a signal when it should stop. This is whats explained in golang's tutorial https://tour.golang.org/concurrency/4.

答案2

得分: 17

我喜欢@tux21b的答案;在fib()函数中创建通道使调用代码变得简洁明了。稍微解释一下,只有在调用函数时无法确定何时停止时,才需要一个单独的“quit”通道。如果你只关心“小于X的数字”,你可以这样做:

package main

import "fmt"

func fib(n int) chan int {
    c := make(chan int)

    go func() {
        x, y := 0, 1

        for x < n {
            c <- x
            x, y = y, x+y
        }

        close(c)
    }()

    return c
}

func main() {
    // 打印小于500的斐波那契数列
    for i := range fib(500) {
        fmt.Println(i)
    }
}

如果你想要两者都能实现,这样做有点笨拙,但我个人更喜欢它,而不是在调用者中测试条件,然后通过一个单独的通道发送退出信号:

func fib(wanted func (int, int) bool) chan int {
    c := make(chan int)

    go func() {
        x, y := 0, 1

        for i := 0; wanted(i, x); i++{
            c <- x
            x, y = y, x+y
        }

        close(c)
    }()

    return c
}

func main() {
    // 打印前10个斐波那契数列
    for n := range fib(func(i, x int) bool { return i < 10 }) {
        fmt.Println(n)
    }

    // 打印小于500的斐波那契数列
    for n := range fib(func(i, x int) bool { return x < 500 }) {
        fmt.Println(n)
    }
}

我认为这取决于特定情况下的细节,你可以选择:

  1. 在创建生成器时告诉它何时停止
  2. 传递一个明确的要生成的值的数量
  3. 传递一个目标值
  4. 传递一个确定是否继续的函数
  5. 给生成器一个“quit”通道,自己测试值,并在适当时告诉它退出。

总结并回答你的问题:

  1. 增加通道大小会提高性能,因为减少了上下文切换。在这个简单的例子中,性能和内存消耗都不会成为问题,但在其他情况下,缓冲通道通常是一个很好的选择。在大多数情况下,make(chan int, 100)使用的内存似乎并不重要,但它可能会产生很大的性能差异。

  2. 在你的fibonacci函数中有一个无限循环,所以运行它的goroutine将永远运行(在这种情况下,阻塞在c <- x上)。在调用者中不再从与之共享的通道读取的事实并不会改变这一点。正如@tux21b指出的,由于通道仍在使用中,它永远不会被垃圾回收。这与关闭通道无关(关闭通道的目的是让通道的接收端知道不会再有更多的值到来),而与你的函数没有返回有关。

英文:

I like @tux21b's answer; having the channel created in the fib() function makes the calling code nice and clean. To elaborate a bit, you only need a separate 'quit' channel if there's no way to tell the function when to stop when you call it. If you only ever care about "numbers up to X", you can do this:

package main

import &quot;fmt&quot;

func fib(n int) chan int {
    c := make(chan int)

    go func() {
        x, y := 0, 1

        for x &lt; n {
            c &lt;- x
            x, y = y, x+y
        }

        close(c)
    }()

    return c
}

func main() {
    // Print the Fibonacci numbers less than 500
    for i := range fib(500) {
        fmt.Println(i)
    }
}

If you want the ability to do either, this is a little sloppy, but I personally like it better than testing the condition in the caller and then signalling a quit through a separate channel:

func fib(wanted func (int, int) bool) chan int {
    c := make(chan int)

    go func() {
        x, y := 0, 1

        for i := 0; wanted(i, x); i++{
            c &lt;- x
            x, y = y, x+y
        }

        close(c)
    }()

    return c
}

func main() {
    // Print the first 10 Fibonacci numbers
    for n := range fib(func(i, x int) bool { return i &lt; 10 }) {
        fmt.Println(n)
    }

    // Print the Fibonacci numbers less than 500
    for n := range fib(func(i, x int) bool { return x &lt; 500 }) {
        fmt.Println(n)
    }
}

I think it just depends on the particulars of a given situation whether you:

  1. Tell the generator when to stop when you create it by
  2. Passing an explicit number of values to generate
  3. Passing a goal value
  4. Passing a function that determines whether to keep going
  5. Give the generator a 'quit' channel, test the values yourself, and tell it to quit when appropriate.

To wrap up and actually answer your questions:

  1. Increasing the channel size would help performance due to fewer context switches. In this trivial example, neither performance nor memory consumption are going to be an issue, but in other situations, buffering the channel is often a very good idea. The memory used by make (chan int, 100) hardly seems significant in most cases, but it could easily make a big performance difference.

  2. You have an infinite loop in your fibonacci function, so the goroutine running it will run (block on c &lt;- x, in this case) forever. The fact that (once c goes out of scope in the caller) you won't ever again read from the channel you share with it doesn't change that. And as @tux21b pointed out, the channel will never be garbage collected since it's still in use. This has nothing to do with closing the channel (the purpose of which is to let the receiving end of the channel know that no more values will be coming) and everything to do with not returning from your function.

答案3

得分: 13

你可以使用闭包来模拟生成器。这是来自golang.org的示例。

package main

import "fmt"

// fib返回一个返回连续斐波那契数的函数。
func fib() func() int {
    a, b := 0, 1
    return func() int {
        a, b = b, a+b
        return a
    }
}

func main() {
    f := fib()
    // 函数调用从左到右进行评估。
    fmt.Println(f(), f(), f(), f(), f())
}
英文:

You could use closures to simulate a generator. Here is the example from golang.org.

package main

import &quot;fmt&quot;

// fib returns a function that returns
// successive Fibonacci numbers.
func fib() func() int {
	a, b := 0, 1
	return func() int {
		a, b = b, a+b
		return a
	}
}

func main() {
	f := fib()
	// Function calls are evaluated left-to-right.
	fmt.Println(f(), f(), f(), f(), f())
}

答案4

得分: 8

使用通道来模拟Python生成器有点可行,但它引入了不必要的并发,并且增加了比可能需要的更多的复杂性。在这里,只需明确地保持状态更容易理解、更简短,而且几乎肯定更高效。这样做可以使关于缓冲区大小和垃圾回收的所有问题都变得无关紧要。

type fibState struct {
    x, y int
}

func (f *fibState) Pop() int {
    result := f.x
    f.x, f.y = f.y, f.x + f.y
    return result
}

func main() {
    fs := &fibState{1, 1}
    for i := 0; i < 10; i++ {
        fmt.Println(fs.Pop())
    }
}
英文:

Using channels to emulate Python generators kind of works, but they introduce concurrency where none is needed, and it adds more complication than's probably needed. Here, just keeping the state explicitly is easier to understand, shorter, and almost certainly more efficient. It makes all your questions about buffer sizes and garbage collection moot.

type fibState struct {
    x, y int
}

func (f *fibState) Pop() int {
    result := f.x
    f.x, f.y = f.y, f.x + f.y
    return result
}

func main() {
    fs := &amp;fibState{1, 1}
    for i := 0; i &lt; 10; i++ {
        fmt.Println(fs.Pop())
    }
}

huangapple
  • 本文由 发表于 2012年7月9日 02:27:19
  • 转载请务必保留本文链接:https://go.coder-hub.com/11385556.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定