Go hitting max threads for process?

Question

I'm trying out Go for some filesystem usage analysis, and I tried to make the code as fast as possible by spawning almost everything off as a goroutine and relying on the Go runtime (and GOMAXPROCS) to manage it. I was watching this code run (pretty quickly) until it just stopped dead. I checked top, and it listed my process as having 1500 threads.

I thought maybe I had hit some limit and the process was therefore deadlocked waiting on the OS. I checked my OS (FreeBSD) limits, and sure enough the maximum was listed as 1500 threads per process.

Surprised, I checked the Go docs, which say GOMAXPROCS only limits running threads; blocked threads don't count.

So my questions:

  • Is it fair to say I can't rely on the Go runtime as a global pool to prevent hitting OS limits of this kind?

  • Is there an idiomatic way to handle this (be nice, it's only my second day using Go)?

  • In particular, I haven't found a good way, other than using sync, to close a channel when I'm done with it. Is there a better way?

  • I'd like to abstract away the boilerplate (parallel mapping with goroutines and
    closing the channel when done). Is there a type-safe way to do this without generics?

Here's my current code:

func AnalyzePaths(paths chan string) chan AnalyzedPath {
	analyzed := make(chan AnalyzedPath)
	go func() {
		group := sync.WaitGroup{}
		for path := range paths {
			group.Add(1)
			go func(path string) {
				defer group.Done()
				analyzed <- Analyze(path)
			}(path)
		}
		group.Wait()
		close(analyzed)
	}()
	return analyzed
}

func GetPaths(roots []string) chan string {
	globbed := make(chan string)
	go func() {
		group := sync.WaitGroup{}
		for _, root := range roots {
			group.Add(1)
			go func(root string) {
				defer group.Done()
				for _, path := range glob(root) {
					globbed <- path
				}
			}(root)
		}
		group.Wait()
		close(globbed)
	}()
	return globbed
}

func main() {
    paths := GetPaths(patterns)
    for analyzed := range AnalyzePaths(paths) {
        fmt.Println(analyzed)
    }
}

Answer 1

Score: 3

About two months ago (or more), the language developers talked about introducing thread-count control (and some other limits), so we can expect to see it soon. A month or more ago I looked into this issue and found that on my Linux machine GOMAXPROCS does not exceed 256: if I passed it 300 or more, the result was always 256. But I also found that goroutines are not threads; many goroutines can live in one thread.
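For readers on a current Go release: the thread-count control anticipated above appears to correspond to `debug.SetMaxThreads` in the standard `runtime/debug` package. The sketch below only shows how the two settings can be queried and set; `threadSettings` is a hypothetical helper name and the value 1000 is an arbitrary illustration, not a recommendation.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// threadSettings is a hypothetical helper: it reports the current
// GOMAXPROCS value and installs a new cap on OS threads, returning
// the cap that was in effect before the call.
func threadSettings(maxThreads int) (procs, prevMax int) {
	// Passing 0 only queries GOMAXPROCS; it does not change the setting.
	procs = runtime.GOMAXPROCS(0)
	// SetMaxThreads returns the previous limit.
	prevMax = debug.SetMaxThreads(maxThreads)
	return procs, prevMax
}

func main() {
	// 1000 is an illustrative value, chosen to sit below the 1500-thread
	// OS limit mentioned in the question.
	procs, prev := threadSettings(1000)
	fmt.Println("GOMAXPROCS:", procs, "previous thread cap:", prev)
}
```

Note that a program exceeding this cap crashes instead of waiting, so it works as a safety net and not as a thread pool; bounding the number of goroutines that block at the same time is still up to the program.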

As for idiomatic syncing, I don't think there is any need to synchronize too much.
In my code I usually follow the idea that goroutines communicate only through channels, and that channels should be passed to goroutines as parameters.

func main() {
    ch1 := make(chan SomeType1)
    ch2 := make(chan SomeType2)
    go generator(ch1, ch2)
    go processor(ch1, ch2)
    // main blocks here until it has received both "finished" signals on ch2
    <-ch2
    <-ch2
    // we usually don't need the exact values of the ch2 signals,
    // so we discard them
}

func generator(ch1 chan SomeType1, ch2 chan SomeType2) {
    for YOUR_CONDITION {
        // generate something
        // ...
        // send it to the channel
        ch1 <- someValueOfType1
    }
    ch1 <- magicStopValue
    ch2 <- weAreFinishedSignal1
}

func processor(ch1 chan SomeType1, ch2 chan SomeType2) {
    // read the first value from ch1
    value := <-ch1
    for value != magicStopValue {
        // do some processing
        // ...
        // get the next value from ch1 and repeat the processing
        value = <-ch1
    }
    // signal that the processor goroutine is finished
    ch2 <- weAreFinishedSignal2
}
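
A possible variation on the pattern above, sketched under the assumption that the values are plain ints: the sender closes the channel when it is done and the receiver ranges over it, so no magic stop value is needed. The names `run`, `values`, and `done` are illustrative and not part of the answer's code.

```go
package main

import "fmt"

// generator sends n values and then closes the channel to signal completion.
func generator(n int, out chan<- int) {
	defer close(out)
	for i := 0; i < n; i++ {
		out <- i
	}
}

// processor sums values until in is closed, then reports the sum on done.
func processor(in <-chan int, done chan<- int) {
	sum := 0
	for v := range in { // the loop ends when generator closes the channel
		sum += v
	}
	done <- sum
}

// run wires the two goroutines together and waits for the result.
func run(n int) int {
	values := make(chan int)
	done := make(chan int)
	go generator(n, values)
	go processor(values, done)
	return <-done
}

func main() {
	fmt.Println(run(5)) // 0+1+2+3+4 = 10
}
```

Only the sending side closes the channel here, which avoids a send on a closed channel.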

Goroutines communicate faster when they run in one thread. In my experience channel performance is far from great, but it is good enough for many purposes.
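
Coming back to the thread limit in the question: one common way to keep blocking work bounded is a fixed number of worker goroutines reading from a shared channel, so that at most that many calls can block in the OS at once. The following is a minimal sketch, not the asker's code: `analyze` is a hypothetical stand-in for the question's `Analyze`, and the result type is simplified to `int`.

```go
package main

import (
	"fmt"
	"sync"
)

// analyze is a hypothetical stand-in for the question's Analyze function.
func analyze(path string) int { return len(path) }

// analyzePaths starts exactly `workers` goroutines, so no more than that
// many analyze calls are in flight at any time.
func analyzePaths(paths <-chan string, workers int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for p := range paths {
				out <- analyze(p)
			}
		}()
	}
	// Close the output channel once every worker has finished.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// totalLength feeds the paths through the pool and sums the results.
func totalLength(paths []string, workers int) int {
	in := make(chan string)
	go func() {
		defer close(in)
		for _, p := range paths {
			in <- p
		}
	}()
	total := 0
	for n := range analyzePaths(in, workers) {
		total += n
	}
	return total
}

func main() {
	fmt.Println(totalLength([]string{"/a", "/bb", "/ccc"}, 2)) // 2+3+4 = 9
}
```

The `sync.WaitGroup` is still what closes the output channel, but it is confined to one small function, and the concrete channel types keep the code type-safe without generics.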

huangapple
  • Posted on 2013-11-27 08:39:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/20230861.html