Whether Go uses shared memory or distributed computing
Question
Go has the slogan "Do not communicate by sharing memory; instead, share memory by communicating". I was wondering whether Go uses a shared-memory or a distributed-computing approach. For example, MPI is clearly distributed and OpenMP is clearly shared memory, but I was not sure about Go, which seems unique.
I have seen many posts, such as https://stackoverflow.com/q/1730655/984260 and the Effective Go document, but could not find a clear answer. Thanks in advance.
Answer 1
Score: 15
Go does not prevent you from sharing memory between goroutines/threads. What they mean by communicating is that you send a chunk of data, or a pointer to that chunk, across a channel. This effectively transfers 'ownership' of the data to the target reader of the channel. Mind you, this transfer of ownership is not enforced by the language or the runtime; it is purely a convention.
You are still perfectly capable of writing to the same memory from two goroutines, if you so choose. In other words: Go does not prevent you from shooting yourself in the foot; it just provides language semantics which make these mistakes easier to detect.
If a value is passed into a channel, the programmer must then assume that the value is no longer theirs to write to from the same goroutine.
// Hypothetical stubs so the snippet compiles; T and getSomeData stand in
// for whatever data type and loader the real program uses.
type T struct{ Field int }

func getSomeData() *T { return &T{} }

func F(c chan *T) {
    // Create/load some data.
    data := getSomeData()

    // Send data into the channel.
    c <- data

    // 'data' should now be considered out of bounds for the remainder of
    // this function. This is purely by convention, and is not enforced
    // anywhere. For example, the following is still valid Go code, but will
    // lead to problems.
    data.Field = 123
}
Answer 2
Score: 5
The question assumes that shared memory and distributed computing are opposites. That's a bit like asking: are RAM and LAN opposites? It would be clearer to differentiate between shared-memory concurrency within a CPU/memory node and concurrency between CPU/memory nodes.
This is part of a bigger picture of parallel processing research. There have been many research projects, including:
- developing non-Von-Neumann computers that have multiple CPUs sharing a single memory, joined by some form of switching fabric (often a Clos network). OpenMP would be a good fit for these;
- developing parallel computers that consist of a collection of CPUs, each with their own separate memory, and with some communications fabric between the nodes. This is typically the home of MPI, amongst others.
The first case is a speciality of the High Performance Computing fraternity. The latter case is the one familiar to most of us. In this case the communication is usually simply via Ethernet these days, but various faster, lower-latency alternatives have been (successfully) developed for certain niches (e.g. IEEE 1355 SpaceWire, which emerged from the Transputer serial links).
For many years, the dominant view was that efficient parallelism would only be possible if the memory was shared, because the cost of communication by passing messages was (naively) assumed to be prohibitive. With shared-memory concurrency, the difficulty is in the software: because everything is interdependent, designing the concurrency gets combinatorially harder and harder as systems get larger. Hard-core expertise is needed.
For the rest of us, Go follows Erlang, Limbo and of course Occam in promoting the passing of messages as the means to choreograph the work to be done. This arises from the algebra of Communicating Sequential Processes, which provides the basis for creating parallel systems of any size. CSP designs are composable: each subsystem can itself be a component of a larger system, without a theoretical limit.
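To make that composability concrete, here is a minimal sketch in Go (the stage names generate and square are illustrative, not from the answer): each stage accepts and returns channels, so stages chain into larger pipelines, and the combined pipeline is itself a stage.

package main

import "fmt"

// generate emits the integers 1..n on its output channel, then closes it.
func generate(n int) <-chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for i := 1; i <= n; i++ {
            out <- i
        }
    }()
    return out
}

// square reads values from in, squares them, and emits them downstream.
func square(in <-chan int) <-chan int {
    out := make(chan int)
    go func() {
        defer close(out)
        for v := range in {
            out <- v * v
        }
    }()
    return out
}

func main() {
    // Stages compose: the output channel of one is the input of the next.
    for v := range square(generate(5)) {
        fmt.Println(v) // 1 4 9 16 25
    }
}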
Your question mentioned OpenMP (shared memory) and MPI (distributed-memory message passing), which can be used together. Go could be considered approximately equivalent to MPI in that it promotes message passing. It does, however, also allow locks and shared memory. Go is different from both MPI and OpenMP because it is not explicitly concerned with multi-processor systems. To progress into the world of parallel processing using Go, a network message-passing framework would be needed, such as OpenCL, for which someone is working on a Go API.
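As a footnote to the remark that Go also allows locks and shared memory, here is a minimal sketch using the standard library's sync package: a counter shared by several goroutines and guarded by a sync.Mutex, which is plain shared-memory concurrency rather than message passing.

package main

import (
    "fmt"
    "sync"
)

func main() {
    var (
        mu      sync.Mutex
        counter int
        wg      sync.WaitGroup
    )
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            mu.Lock()
            counter++ // shared memory, guarded by the lock
            mu.Unlock()
        }()
    }
    wg.Wait()
    fmt.Println(counter) // always prints 4
}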