How are message-passing concurrent languages better than shared-memory concurrent languages in practice?
Question
I've been a Java developer for many years but never had to deal too much with concurrency issues until I started doing Android development, and suddenly started finding "application not responding" and apparent deadlock situations.
This made me realize how hard it can be to understand and debug some of these concurrency issues. How do newer languages such as Scala and Go improve concurrency? How are they more understandable, and how do they prevent concurrency bugs? Can someone provide real-world examples that demonstrate the advantages?
Answer 1
Score: 49
The three main contenders for simplifying concurrency are actors, software transactional memory (STM), and automatic parallelization. Scala has implementations of all three.
Actors
Actors find their most notable implementation in the language Erlang, which as far as I know is where the idea started*. Erlang is designed from the ground up around actors. The idea is that actors themselves are black boxes to each other; they interact only by passing messages.
Scala has an implementation of actors in its library, and variants are available in external libraries. In the main library, the black-box-ness is not enforced, but there are easy-to-use methods for passing messages, and Scala makes it easy to create immutable messages (so you don't have to worry that you send a message with some content, and then change the content at some random time).
The advantage of actors is that you don't have to worry about complex shared state, which really simplifies the reasoning involved. Also, you can decompose the problem into smaller pieces than threads and let the actor library figure out how to bundle actors into the appropriate number of threads.
The downside is that if you are trying to do something complex, you have a lot of logic to deal with for sending messages, handling errors, and so on, before you know it succeeds.
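To make the "black box that only receives messages" idea concrete, here is a minimal sketch in Go rather than Scala or Erlang. It is not an actor library: the names `message`, `startCounter` and `runCounterDemo` are made up for illustration, and a goroutine with a channel as its mailbox only approximates an actor.

```go
package main

import "fmt"

// message is what other goroutines send to the counter "actor".
// A nil reply channel means fire-and-forget. (Hypothetical type.)
type message struct {
	delta int
	reply chan int // optional: receives the total after delta is applied
}

// startCounter launches a goroutine that owns total. No other goroutine
// can reach the variable; the mailbox channel is the only way in.
func startCounter() chan<- message {
	mailbox := make(chan message)
	go func() {
		total := 0
		for msg := range mailbox { // handles one message at a time
			total += msg.delta
			if msg.reply != nil {
				msg.reply <- total
			}
		}
	}()
	return mailbox
}

// runCounterDemo sends one increment from each of `senders` goroutines,
// then asks the actor for the resulting total.
func runCounterDemo(senders int) int {
	mailbox := startCounter()
	done := make(chan struct{})
	for i := 0; i < senders; i++ {
		go func() {
			mailbox <- message{delta: 1}
			done <- struct{}{}
		}()
	}
	for i := 0; i < senders; i++ {
		<-done // wait until every increment has been delivered
	}
	reply := make(chan int)
	mailbox <- message{delta: 0, reply: reply}
	total := <-reply
	close(mailbox) // lets the actor goroutine exit
	return total
}

func main() {
	fmt.Println(runCounterDemo(100))
}
```

No mutex appears anywhere, yet the increments cannot interleave, because only one goroutine ever touches `total`. The cost mentioned above also shows up: even reading the value requires a request message and a reply channel.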
Software Transactional Memory
STM is based on the idea that the most important concurrent operation is to grab some shared state, fiddle with it, and write it back. So it provides a means of doing this; however, if it encounters some problem--which it typically delays detecting until the very end, at which point it checks to make sure the writes all went correctly--it rolls back the changes and returns a failure (or tries again).
This is both high-performance (in situations with only moderate contention, since usually everything goes just fine) and robust to most sorts of locking errors, since the STM system can detect problems (and even potentially do things like take access away from a lower-priority request and give it to a higher-priority one).
Unlike with actors, attempting complex things is easier, as long as you can handle failure. However, you also have to reason correctly about the underlying state: STM prevents rare unintentional deadlocks by failing and retrying, but if you have simply made a logic error and a certain set of steps can never complete, STM cannot make it succeed.
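Go has no STM in its standard library, so the following is only a rough analogy, not STM itself: the same "read, compute, commit if nobody interfered, otherwise retry" loop applied to a single value with compare-and-swap. Real STM extends this to several variables inside one composable transaction. The names `withdraw` and `runWithdrawals` are hypothetical, and `atomic.Int64` assumes Go 1.19 or later.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

var errInsufficient = errors.New("insufficient funds")

// withdraw reads a snapshot, computes the new value, and commits only if
// the balance is still what it read. On a conflict it simply tries again.
func withdraw(balance *atomic.Int64, amount int64) error {
	for {
		old := balance.Load()
		if old < amount {
			// A logic-level failure: retrying cannot fix this.
			return errInsufficient
		}
		if balance.CompareAndSwap(old, old-amount) {
			return nil // commit succeeded
		}
		// Another goroutine changed the balance in between: retry.
	}
}

// runWithdrawals starts n concurrent withdrawals and reports the final
// balance and how many of them failed.
func runWithdrawals(start int64, n int, amount int64) (int64, int64) {
	var balance, failed atomic.Int64
	balance.Store(start)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := withdraw(&balance, amount); err != nil {
				failed.Add(1)
			}
		}()
	}
	wg.Wait()
	return balance.Load(), failed.Load()
}

func main() {
	final, failures := runWithdrawals(100, 15, 10)
	fmt.Println(final, failures)
}
```

Conflicts are resolved by retrying, so nothing ever blocks while holding a lock. The insufficient-funds case is the other half of the point: a step that can never succeed is reported as a failure rather than fixed by the retry mechanism.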
Scala has an STM library that is not part of the standard library but is being considered for inclusion. Clojure and Haskell both have well-developed STM libraries.
Automatic Parallelization
Automatic parallelization takes the view that you don't want to think about concurrency; you just want stuff to happen fast. So if you have some sort of parallel operation--applying some complex operation to a collection of items, one at a time, and producing some other collection as a result, for instance--you should have routines that automatically do this in parallel. Scala's collections can be used in this way (there is a .par method that converts a conventional serial collection into its parallel analog). Many other languages have similar features (Clojure, Matlab, etc.).
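For a sense of what such a routine does behind the scenes, here is a hand-written sketch in Go, which has no built-in equivalent of `.par`. The helper `parallelMap` is a made-up name, and starting one goroutine per element is a simplification; a real library would presumably split the work into chunks sized to the available cores.

```go
package main

import (
	"fmt"
	"sync"
)

// parallelMap applies f to every element concurrently and returns the
// results in input order. (Hypothetical helper, not a library function.)
func parallelMap(in []int, f func(int) int) []int {
	out := make([]int, len(in))
	var wg sync.WaitGroup
	for i, v := range in {
		wg.Add(1)
		go func(i, v int) {
			defer wg.Done()
			out[i] = f(v) // each goroutine writes to its own index only
		}(i, v)
	}
	wg.Wait() // all results are in place once this returns
	return out
}

func main() {
	squares := parallelMap([]int{1, 2, 3, 4}, func(x int) int { return x * x })
	fmt.Println(squares)
}
```

The caller only sees "apply this function to that collection"; all of the concurrency is hidden inside the helper, which is the appeal of this approach.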
Edit: Actually, the Actor model was described back in 1973 and was probably motivated by earlier work in Simula 67 (using coroutines instead of concurrency); in 1978 came the related Communicating Sequential Processes. So Erlang's capabilities were not unique at the time, but the language was uniquely effective at deploying the actor model.
Answer 2
Score: 7
In an idiomatic Go program, goroutines (Go's lightweight threads) communicate state and data through channels.
This can be done without explicit locks in your own code (channels still use locking under the hood). Passing data through a channel to a receiver implies transfer of ownership of the data. Once you send a value through a channel, you should no longer operate on it, since whoever received it now 'owns' it.
However, it should be noted that this transfer of 'ownership' is not enforced by the Go runtime in any way. Objects sent through channels are not flagged or marked or anything like that. It is merely a convention. So you can, if you are so inclined, shoot yourself in the foot by mutating a value you previously sent through a channel.
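A small sketch of that convention, with made-up names (`report`, `produce`, `consume`, `handOff`): the sender builds a value, sends a pointer to it, and then leaves it alone. The commented-out line marks where the foot-gun would be.

```go
package main

import "fmt"

// report is a hypothetical value that gets handed from one goroutine to another.
type report struct {
	lines []string
}

// produce builds a report and hands it off. After the send it no longer
// touches r, even though nothing in the language would stop it.
func produce(out chan<- *report) {
	r := &report{lines: []string{"header"}}
	out <- r // by convention, ownership passes to the receiver here
	// r.lines = append(r.lines, "oops") // compiles fine, but is a data race
	close(out)
}

// consume now owns each report it receives and may mutate it freely.
// It returns the total number of lines seen.
func consume(in <-chan *report) int {
	n := 0
	for r := range in {
		r.lines = append(r.lines, "footer")
		n += len(r.lines)
	}
	return n
}

// handOff wires the two goroutines together.
func handOff() int {
	ch := make(chan *report)
	go produce(ch)
	return consume(ch)
}

func main() {
	fmt.Println(handOff())
}
```

If the commented-out line were enabled, running with `go run -race` would most likely flag it, but that is a tool check at run time, not something the compiler enforces.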
Go's strength lies in the syntax it offers (launching goroutines and the way channels work), which makes it a lot easier to write code that functions correctly and so avoids race conditions and deadlocks. Go's clear concurrency mechanics make it very easy to reason about what is going to happen in your program.
As a side note: The standard library in Go does still offer the traditional mutexes and semaphores if you really want to use them. But you obviously do so at your own discretion and risk.
Answer 3
Score: 7
For me, using Scala (Akka) actors has had several advantages over traditional concurrency models:
- Using a message-passing system like actors gives you a way to easily handle shared state. For example, I will frequently wrap a mutable data structure in an actor, so the only way to access it is through message-passing. Since actors always process one message at a time, this ensures that all operations on the data are thread-safe.
- Actors partially eliminate the need to deal with spawning and maintaining threads. Most actor libraries will handle distributing actors across threads, so you only need to worry about starting and stopping actors. Often I will create a series of identical actors, one per physical CPU core, and use a load-balancer actor to evenly distribute messages to them.
- Actors can help improve system reliability. I use Akka actors, and one feature is that you can create a supervisor for actors, so that if an actor crashes the supervisor will automatically create a new instance. This can help prevent situations where a thread crashes and you're stuck with a half-running program. It's also very easy to spin up new actors as needed and to work with remote actors running in another application.
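The "identical workers, one per core" arrangement from the second point can be sketched without Akka. This is a Go approximation, not the Akka API: `sumSquares` is a made-up function, `runtime.NumCPU` reports logical CPUs rather than physical cores, and the shared `jobs` channel stands in for the load-balancer actor by handing each job to whichever worker is free (so distribution is not strictly even).

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares fans the inputs out to a pool of identical workers and adds
// up their results. (Hypothetical example function.)
func sumSquares(nums []int) int {
	jobs := make(chan int)
	results := make(chan int)
	workers := runtime.NumCPU() // logical CPUs, an approximation of "one per core"

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs { // each worker handles one job at a time
				results <- n * n
			}
		}()
	}

	// Feed the jobs, then signal that no more are coming.
	go func() {
		for _, n := range nums {
			jobs <- n
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}))
}
```

The code that submits work never creates or joins a thread itself; it only starts workers and sends messages, which is the partial relief from thread management described above. Supervision (restarting a crashed worker) is not shown here.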
You still need a decent understanding of concurrency and multi-threaded programming since deadlocks and race conditions are still possible, but actors make it much easier to identify and solve these issues. I don't know how much these apply to Android apps, but I mostly do server-side programming and using actors has made development much easier.
Answer 4
Score: 0
Scala actors work on a shared-nothing principle, so there are no locks (and hence no deadlocks)! Actors listen for messages and are invoked by the code that has something for an actor to work upon.