Is it difficult to implement Go's concurrency patterns in Scala?

Question

There is no doubt about it: Go's syntax is much simpler than Scala's, and it has fewer language features. I quite like the ease with which one can write concurrent code in Go.

As it turns out, performant code is non-blocking code (see http://norvig.com/21-days.html#Answers), and both Go and Scala are very good at this.

My question is how one can write programs in Scala that behave exactly the same way as Go programs, by implementing the same concurrency patterns. The first thing that comes to mind is using Futures in a way similar to Channels.

I'm looking for:

  • possible implementations of Go's concurrency patterns in Scala
  • whether the Go constructs are hard to simulate exactly in Scala
  • code snippets

Any help is much appreciated.

[Edit] A few examples of Go concurrency patterns
http://talks.golang.org/2012/concurrency.slide

Fan-in

func fanIn(input1, input2 <-chan string) <-chan string {
  c := make(chan string)
  go func() {
    for {
      select {
        case s := <-input1:  c <- s
        case s := <-input2:  c <- s
      }
    }
  }()
  return c
}

Timeout granularity (one channel vs whole conversation)
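
A minimal Go sketch of the two granularities, in the spirit of the linked talk. The function names, the message values and the `demoTimeouts` helper are made up for illustration. The only difference between the two receivers is where the timer is created: inside the loop (one timeout per message) or outside it (one timeout for the whole conversation).

```go
package main

import "time"

// recvPerMessage allows up to perMsg of silence before EACH message. The
// timer is re-created on every loop iteration, so a slow but steady sender
// never trips it.
func recvPerMessage(c <-chan string, perMsg time.Duration) []string {
	var got []string
	for {
		select {
		case s, ok := <-c:
			if !ok {
				return got // channel closed by the sender
			}
			got = append(got, s)
		case <-time.After(perMsg):
			return got // this one message took too long
		}
	}
}

// recvWholeConversation gives the entire exchange a single deadline. The
// timer is created once, outside the loop.
func recvWholeConversation(c <-chan string, total time.Duration) []string {
	var got []string
	deadline := time.After(total)
	for {
		select {
		case s, ok := <-c:
			if !ok {
				return got
			}
			got = append(got, s)
		case <-deadline:
			return got // the conversation as a whole ran out of time
		}
	}
}

// demoTimeouts is a hypothetical helper: it preloads three messages into a
// buffered channel and reports how many each receiver collected.
func demoTimeouts() (perMessage, whole int) {
	preload := func() chan string {
		c := make(chan string, 3)
		c <- "a"
		c <- "b"
		c <- "c"
		return c
	}
	perMessage = len(recvPerMessage(preload(), 50*time.Millisecond))
	whole = len(recvWholeConversation(preload(), 50*time.Millisecond))
	return perMessage, whole
}
```

With the messages already buffered, both receivers should collect all three and then return once their timer fires. With a sender that trickles messages out slowly, only the per-message variant keeps going.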

Replicating service calls amongst multiple instances and returning the value of the first one to respond. (This uses a bundle of patterns.)
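
A sketch of that replication pattern in Go. The `Search` type, `fakeReplica` and `demoFirst` are hypothetical stand-ins for a real service; unlike the simplest form of this pattern, the channel here is buffered so that the slower replicas can still deliver their answer and exit instead of blocking forever.

```go
package main

import "time"

// Search is a hypothetical replica of some service.
type Search func(query string) string

// first sends the query to every replica concurrently and returns whichever
// answer arrives first. It must be called with at least one replica,
// otherwise the final receive blocks forever.
func first(query string, replicas ...Search) string {
	// Buffer one slot per replica so the losers never block on send.
	c := make(chan string, len(replicas))
	for _, r := range replicas {
		go func(r Search) { c <- r(query) }(r)
	}
	return <-c
}

// fakeReplica builds a stand-in replica that answers after the given delay.
func fakeReplica(name string, delay time.Duration) Search {
	return func(query string) string {
		time.Sleep(delay)
		return name + ": " + query
	}
}

// demoFirst races a slow replica against a fast one.
func demoFirst() string {
	return first("golang",
		fakeReplica("slow", 200*time.Millisecond),
		fakeReplica("fast", time.Millisecond))
}
```

No locks, condition variables or callbacks appear in the calling code: the caller simply blocks on a single receive.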

All with: No locks. No condition variables. No callbacks.
(Scala Futures use callbacks)

Answer 1

Score: 9

Go has concurrency features built into the core language, while Scala uses the concurrent package and concurrency primitives from Java's java.util.concurrent.

In Scala it's idiomatic to use either thread-based concurrency or the Actor Model, while Go concurrency is based on Hoare's Communicating Sequential Processes.

Although the concurrency primitives between the two languages aren't the same, it looks like there is some similarity.

In Go, concurrency is usually achieved using goroutines and channels. There are also other, more traditional low-level synchronization primitives such as mutexes and wait groups.

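In Scala, as far as I know, any class that implements Runnable can be handed to a Thread or an executor, where it runs on a separate thread without blocking the caller. This is functionally similar to goroutines, although each such thread is a full JVM thread rather than a lightweight one.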

In Scala, queues can be used to pass information between routines in a similar fashion to channels in Go.

EDIT: As pointed out by Chuck, "the crucial difference between Scala's Queues and Go channels is that, by default, Go's channels block on write until something is ready to read from them and block on read until something is ready to write to them." This behaviour would need to be written into any Scala implementation of channels.
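
The rendezvous behaviour Chuck describes can be observed directly in Go. The sketch below uses hypothetical helpers (`trySend`, `tryRecv`, `demoRendezvous`) to show that an unbuffered channel accepts a value only when the other side is ready. On the JVM, the closest standard-library analogue is probably java.util.concurrent.SynchronousQueue, whose put blocks until another thread takes.

```go
package main

// trySend attempts a send without waiting; false means the send would block.
func trySend(c chan<- int, v int) bool {
	select {
	case c <- v:
		return true
	default:
		return false
	}
}

// tryRecv attempts a receive without waiting; ok is false if it would block.
func tryRecv(c <-chan int) (v int, ok bool) {
	select {
	case v = <-c:
		return v, true
	default:
		return 0, false
	}
}

// demoRendezvous is a hypothetical helper that exercises an unbuffered channel.
func demoRendezvous() (sentAlone, recvAlone bool, handedOver int) {
	c := make(chan int) // unbuffered: the channel has no storage of its own

	sentAlone = trySend(c, 1) // no reader is waiting, so nothing is sent
	_, recvAlone = tryRecv(c) // no writer is waiting, so nothing is received

	done := make(chan int)
	go func() { done <- <-c }() // a reader that forwards whatever it receives
	c <- 2                      // blocks until that reader takes the value
	handedOver = <-done
	return sentAlone, recvAlone, handedOver
}
```

A plain queue, by contrast, would accept the first value immediately and let the writer run ahead of the reader.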

EDIT 2: As pointed out by Maurício Linhares, "You can do concurrency without visible callbacks in Scala using Async - github.com/scala/async - but you can't do it without callbacks at all, it's just not possible given the way the JVM is currently implemented."

Thanks to all for the constructive comments.

Answer 2

Score: 5

The short answer is no, it is not difficult.

As you know, concurrency by message passing can operate with blocking or non-blocking synchronisation primitives. Go's channels can do both: they can be unbuffered or buffered, as you choose.
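
A small Go sketch of that choice, using made-up helper names (`fill`, `demoBuffering`). It counts how many values a channel accepts before a send would have to wait: none for an unbuffered channel with no reader, and exactly the buffer capacity for a buffered one.

```go
package main

// fill sends 0, 1, 2, ... into c without ever waiting and returns how many
// values were accepted before a send would have blocked (at most n).
func fill(c chan<- int, n int) int {
	sent := 0
	for i := 0; i < n; i++ {
		select {
		case c <- i:
			sent++
		default:
			return sent // the next send would block
		}
	}
	return sent
}

// demoBuffering compares an unbuffered channel with one of capacity 3.
func demoBuffering() (unbuffered, buffered int) {
	unbuffered = fill(make(chan int), 5)  // nobody is receiving
	buffered = fill(make(chan int, 3), 5) // three slots of storage
	return unbuffered, buffered
}
```

The capacity passed to `make` is the only thing that differs between the two channels; the sending and receiving code is identical.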

A lot is said in JVM circles about non-blocking concurrency always being superior. This is not true in general; it is a consequence of threads being quite expensive on the JVM. In response, most JVM concurrency APIs provide only a non-blocking model, which is unfortunate.

For relatively modest concurrency of up to, say, 1000 JVM threads, blocking concurrency can work very effectively even on the JVM. Because this style doesn't involve any callbacks, it is easy to write and then read later.

The excellent JCSP library from the University of Kent at Canterbury is a good way to write Java/Scala/... programs using CSP channels. This is the same style used by Go; JCSP channels are very similar to Go channels, giving the option of unbuffered or buffered channels (including fixed-size overwriting buffers). Its select is called Alternative and has been proven correct by the JCSP developers via formal analysis.

Because the JVM cannot realistically support more than 1000 or so threads, however, this will not be appropriate for some application areas. But then, there's Go...

Footnote: the current version of JCSP is v1.1rc5 in the Maven repos, contrary to what the JCSP website says.

Answer 3

Score: 3

A good implementation is non-trivial.

That is, you can implement one in a 'blocking' way, where each Go blocking primitive (a channel wait) actually blocks the executing thread. Such an implementation would be trivial, but useless.

The alternative is to build a mechanism that can suspend and resume the execution flow asynchronously while it waits. Since the JVM has no built-in support for continuations, implementing this is quite complex and requires either AST transformations or bytecode weaving.

For an implementation that uses AST transformation (on top of SIP-22 async), you can look at https://github.com/rssh/scala-gopher (disclosure: I'm the author).

Update: scala-gopher-2.0.0 for Scala 3 is based on dotty-cps-async (https://github.com/rssh/dotty-cps-async), which performs a monadic CPS transformation inside async blocks.

Answer 4

Score: 1

Apparently there is a third-party library (from Netflix) which provides reactive extensions for Scala (and also for Java and other JVM languages).
Rx observables can be treated in a similar way to Go's channels.

https://github.com/Netflix/RxJava/tree/master/language-adaptors/rxjava-scala

The documentation is useful as well, providing visual representations of common patterns.

Answer 5

Score: 1

Implementing CSP-style concurrency on the JVM is not easy, whether for Java or for Scala. The reason is that CSP is based on threads with a reduced context, often called green threads. The reduced context consumes less memory, which means that you can run many more green threads than OS threads or Java threads (1 Java thread corresponds to 1 OS thread). I once tried it out: with 4 GB of RAM you can start about 80,000 goroutines (Go's variant of green threads), compared to about 2,000 Java threads.

Now why does that matter? The idea in CSP is that if some channel contains no data, "only" one green thread is lost: it sits on that channel until it receives input. Let's say you have a web application being accessed by 40,000 users. The 80,000 goroutines that can be started on a machine with 4 GB of RAM can handle those 40,000 connections right away (one inbound and one outbound connection each). Without green threads you need a lot more memory or more servers.
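
To make the "one green thread parked per idle channel" idea concrete, here is a small Go sketch. The function name `parkAndRelease` and the count of 10,000 are arbitrary choices for illustration, not the measurement quoted above. Every goroutine blocks on its own empty channel until a value arrives.

```go
package main

// parkAndRelease starts n goroutines that each block on their own channel,
// then feeds every channel one value and returns the sum of what the
// goroutines received.
func parkAndRelease(n int) int {
	chans := make([]chan int, n)
	results := make(chan int, n) // buffered so goroutines can finish freely
	for i := range chans {
		chans[i] = make(chan int)
		go func(in <-chan int) {
			results <- <-in // parked on the receive until input arrives
		}(chans[i])
	}
	for i, c := range chans {
		c <- i // wakes exactly one parked goroutine
	}
	sum := 0
	for range chans {
		sum += <-results
	}
	return sum
}
```

Calling `parkAndRelease(10000)` parks ten thousand goroutines at once; doing the same with one blocked OS thread per channel is where a JVM of that era would start to struggle.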

The other point about green threads is that you simply don't need to worry when a green thread sits on a channel, because you have so many of them. With channel-oriented code you can read code that is truly asynchronous beneath the surface as if it were synchronous. Following the message flow through channels is as easy as following any other method calls. Rob Pike explains this well in a YouTube video, at about the 29:00 mark. This makes CSP-style concurrent code much easier to get right from the beginning, and it also makes concurrency-related bugs easier to find.

The other issue is continuations. Let's say you have a function that consumes data from two channels in a row and computes something from it. The first channel has data, but the second does not. So when the second channel finally receives data, the language runtime has to jump back inside the function, to the place where the second channel supplies its data. To be able to do that, the runtime needs to remember where to jump to, and it has to store the data taken from the first channel somewhere and restore it, because it is used together with the data from the second channel.

This can be done on the JVM using continuation libraries that use bytecode injection to stash intermediate results and remember the locations to jump to. One library for Java and Kotlin that can do this is Quasar: http://docs.paralleluniverse.co/quasar/ Quasar also has fibers, which serve as something similar to green threads on the JVM. The developer of Quasar is Ron Pressler, who was hired by Oracle to work on Project Loom: http://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html The idea of this project is to support fibers and continuations at the JVM level, which would make fibers more efficient and bytecode injection for continuations less cumbersome.
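
For comparison, this is what the "two channels in a row" function looks like in Go, where the runtime does that bookkeeping itself. The names `addPair` and `demoAddPair` are made up for this sketch. While the goroutine waits on the second channel, the first value and the resume point simply stay on the goroutine's own stack.

```go
package main

// addPair reads one value from a, then one from b, and returns their sum.
func addPair(a, b <-chan int) int {
	x := <-a
	y := <-b // may park here with x already in hand
	return x + y
}

// demoAddPair is a hypothetical helper: the first channel already holds
// data, while the second one is only fed after addPair has started.
func demoAddPair() int {
	a := make(chan int, 1)
	b := make(chan int)
	out := make(chan int)

	a <- 40 // the first channel has data right away
	go func() { out <- addPair(a, b) }()
	b <- 2 // the second channel gets its data later
	return <-out
}
```

A JVM library without runtime support has to recreate this effect by rewriting `addPair` into a state machine, which is the job of the AST transformation or bytecode injection discussed here.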

Then there are also coroutines in Kotlin: https://kotlinlang.org/docs/reference/coroutines.html Kotlin's coroutines also implement fibers and continuations, but these are supplied by the Kotlin compiler, so the developer does not need to tell a CSP library (e.g. Quasar) which functions need bytecode injection.

Unfortunately, Kotlin's coroutines are only for Kotlin and cannot be used outside of it, so they are not available to other JVM languages. Quasar does not work with Scala, because bytecode injection for continuations would be much more difficult for Scala than for Java or Kotlin, Scala being a much more elaborate language. At least that is the reasoning provided by the developer of Quasar.

So the best thing to do as far as Scala is concerned is to stick to Akka or wait for Project Loom to finish. Then some Scala people could start implementing CSP for Scala at a level that truly implements CSP. At the time of writing, Project Loom is in progress but not yet officially approved by Oracle. So it is not yet clear whether some future JDK will contain the things needed for full-scale CSP.

  • Posted by huangapple on 2013-11-26 08:16:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/20206186.html