Am I using channels incorrectly with golang?

Question

I come from a Node.js background, where it's super easy to do some async work and then do some more work once that long-running task completes. I'm sure it's the same in Go, but I just haven't quite wrapped my head around how channels work yet.

I'm building a parser for an old game that I play. It analyses lines from auction data logs and parses them to stream to a live feed on a website over Socket.IO. A file could send 100 lines at a time, and my parser has to analyse each line one at a time and extract meta information from it (such as the items, item prices, etc.).

Each line has this for loop run against it (assume the list of items has already been derived from a regexp):

itemChannel := make(chan Item)

for _, itemName := range itemList {
	item := Item {
		Name: itemName,
	}

	// Long running method which does parsing for the item such as pricing, quantity and makes some http calls (runs 75ms on average)
	go item.FetchData(itemChannel)

	// Read from the channel when it's done
	raw := <-itemChannel
	auction.Items = append(auction.Items, raw)
	auction.Seller = seller
}

auctions = append(auctions, auction)
fmt.Println("Appended auction: ", auction)
go c.publishToRelayService(auction)

Right now (from observation) it seems as if raw := <-itemChannel causes the loop to block until the goroutine finishes and passes its data back (which surely means that calling item.FetchData(itemChannel) directly, without go, would do the same thing). How can I read from the channel as data comes back into it, but break out of the loop iterations as quickly as possible? Some lines have 15-20 items in them, which causes the program to halt for ~2-3 seconds before parsing the next line. I'd like to be able to break out and process the next line sooner than that to keep the parser as fast as possible. Is there any mechanism similar to Promises in Node where I can just chain a completion handler on to each completion of item.FetchData()?

NOTE: fetchChannel is written to inside my Item type once all fetch work has completed.

Answer 1

Score: 2

You can write a separate goroutine that waits for new data on the channel and processes it.
This way the producers and the consumer run concurrently. Because the consumer here does very little work (it only appends to a slice), it finishes almost as soon as the last producer does.

You can use a done channel to signal that the consumer has finished.

Here is how you might change the code:

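// Requires the "sync" package.
itemChannel := make(chan Item)
done := make(chan bool)

// Consumer: collects items until itemChannel is closed
go func(channel chan Item) {
	for raw := range channel {
		auction.Items = append(auction.Items, raw)
	}
	done <- true
}(itemChannel)

// Producer: one goroutine per item
var wg sync.WaitGroup
for _, itemName := range itemList {
	item := Item{
		Name: itemName,
	}

	wg.Add(1)
	go func() {
		defer wg.Done()
		// Long running method which does parsing for the item such as pricing, quantity and makes some http calls (runs 75ms on average)
		// Assumes FetchData sends exactly one Item and returns after sending it
		item.FetchData(itemChannel)
	}()
}

// Close the channel once every FetchData call has returned;
// otherwise the consumer's range loop never ends and <-done blocks forever.
wg.Wait()
close(itemChannel)

<-done
auction.Seller = seller
auctions = append(auctions, auction) // append the auction once, not once per item
fmt.Println("Appended auction: ", auction)
go c.publishToRelayService(auction)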
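
One detail this pattern depends on: a for ... range loop over a channel only ends once the channel has been closed, so something has to close it after the last producer has sent. Below is a minimal, self-contained sketch of one way to do that with sync.WaitGroup. The Price field, the body of FetchData and the fetchAll helper are stand-ins invented for illustration; only Item, Name and the FetchData(channel) call shape come from the question.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// Item mirrors the type from the question; only Name is from the original.
type Item struct {
	Name  string
	Price int // hypothetical field, filled in by FetchData
}

// FetchData is a stand-in for the question's long-running method.
// It simulates slow work and then sends the finished item on out.
func (i Item) FetchData(out chan<- Item) {
	time.Sleep(10 * time.Millisecond) // pretend to make HTTP calls
	i.Price = len(i.Name)             // placeholder result
	out <- i
}

// fetchAll starts one goroutine per name and collects every result.
// Results arrive in completion order, not in input order.
func fetchAll(names []string) []Item {
	itemChannel := make(chan Item)
	var wg sync.WaitGroup

	for _, name := range names {
		item := Item{Name: name} // a fresh variable on every iteration
		wg.Add(1)
		go func() {
			defer wg.Done()
			item.FetchData(itemChannel)
		}()
	}

	// Close the channel once all producers have returned,
	// so that the range loop below terminates.
	go func() {
		wg.Wait()
		close(itemChannel)
	}()

	var items []Item
	for it := range itemChannel {
		items = append(items, it)
	}
	return items
}

func main() {
	items := fetchAll([]string{"sword", "shield", "potion"})
	// Sort only to make the printed order stable.
	sort.Slice(items, func(a, b int) bool { return items[a].Name < items[b].Name })
	fmt.Println(items)
}
```

With this shape, the fetches for one line overlap, so the wait is roughly the duration of the slowest fetch rather than the sum of all of them. If the order of the items matters, it has to be restored afterwards (or each result has to carry its index).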

Answer 2

Score: 1

On the broader question of moving from Node.js to Go/CSP channels, you need to start by setting aside thinking in callbacks. Every reactive/async paradigm I've used has been some form of dressing up callbacks to make them easier to use. CSP does not try to be like this.

The key difference in Go is that the cooperative scheduling of lightweight goroutines happens largely independently of the operating system threads (although the implementers usually try hard to make good use of the CPU cores via OS thread tricks under the hood). There is no real comparison with actioning callbacks.

Each goroutine has its own independent life-cycle. It may be quite short. Or, if a goroutine includes some looping, it may exist for a period of time and look rather like an actor (in the actor model).
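
As a rough illustration of the long-lived, actor-like case, here is a minimal sketch of a goroutine that loops over a channel and is the only code that touches its own state. The names counter and sumViaActor are made up for this example.

```go
package main

import "fmt"

// counter is a hypothetical actor-like goroutine: it owns total and
// nothing else ever reads or writes it, so no lock is needed.
func counter(add <-chan int, result chan<- int) {
	total := 0
	for n := range add { // keeps looping until add is closed
		total += n
	}
	result <- total
}

// sumViaActor is an illustrative helper that sends values to the
// counter goroutine and then asks for the final total.
func sumViaActor(values []int) int {
	add := make(chan int)
	result := make(chan int)
	go counter(add, result)

	for _, v := range values {
		add <- v
	}
	close(add) // ends the counter's loop
	return <-result
}

func main() {
	fmt.Println(sumViaActor([]int{1, 2, 3})) // prints 6
}
```

The goroutine lives for as long as its input channel stays open, and closing that channel is what ends its life-cycle.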

This is the thinking you need in order to explore Communicating Sequential Processes (CSP). Thinking of goroutines along the lines of digital electronics building blocks can also be a helpful analogy. Gates and wires are similar to goroutines and channels.

Also, flip-flops can be built from several gates - in the same way, goroutines can be composed from 'smaller' goroutines joined by internal channels. If you get this right, the external channels on the 'bigger' goroutine are the only thing of concern to its collaborators (the internal channels are hidden).
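
A small sketch of that kind of composition, under made-up names: parseStage looks like a single component with one input channel and one output channel, but internally it is two goroutines joined by a channel that callers never see.

```go
package main

import (
	"fmt"
	"strings"
)

// parseStage is a hypothetical 'bigger' component. Callers only see
// the lines channel going in and the returned channel coming out.
func parseStage(lines <-chan string) <-chan int {
	trimmed := make(chan string) // internal channel, hidden from callers
	counts := make(chan int)

	// Inner goroutine 1: strips surrounding whitespace from each line.
	go func() {
		defer close(trimmed)
		for l := range lines {
			trimmed <- strings.TrimSpace(l)
		}
	}()

	// Inner goroutine 2: counts the whitespace-separated fields.
	go func() {
		defer close(counts)
		for t := range trimmed {
			counts <- len(strings.Fields(t))
		}
	}()

	return counts
}

// countWords is an illustrative driver that feeds input through the stage.
func countWords(input []string) []int {
	lines := make(chan string)
	go func() {
		defer close(lines)
		for _, l := range input {
			lines <- l
		}
	}()

	var out []int
	for n := range parseStage(lines) {
		out = append(out, n)
	}
	return out
}

func main() {
	fmt.Println(countWords([]string{"  rusty sword 5p ", "potion"})) // prints [3 1]
}
```

Each stage closes its output when its input is exhausted, so shutdown propagates through the hidden channel without the caller having to know it exists.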

This opens up new ways of designing software and is one of the things that Rob Pike has advocated: "Concurrency is not Parallelism". Think differently.

An example might be simulation software (Conway's Game of Life on a larger scale). I saw a very compelling simulation of blood flowing and clotting in blood vessels based upon modelling the individual behaviour of each cell involved. The demo had 40 million concurrent entities, a very impressive use of this approach, running on an ordinary laptop.

huangapple
  • Published on January 20, 2017 at 08:35:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41754198.html