Java ArrayList thread unsafe example explanation
Question
    import java.util.ArrayList;

    class ThreadUnsafe {
        static final int THREAD_NUMBER = 2;
        static final int LOOP_NUMBER = 200;

        public static void main(String[] args) {
            ThreadUnsafe test = new ThreadUnsafe();
            for (int i = 0; i < THREAD_NUMBER; i++) {
                new Thread(() -> {
                    test.method1(LOOP_NUMBER);
                }, "Thread" + i).start();
            }
        }

        // Shared by all threads, with no synchronization.
        ArrayList<String> list = new ArrayList<>();

        // Each iteration adds one element and then removes one.
        public void method1(int loopNumber) {
            for (int i = 0; i < loopNumber; i++) {
                method2();
                method3();
            }
        }

        private void method2() {
            list.add("1");
        }

        private void method3() {
            list.remove(0);
        }
    }
The code above throws:

    java.lang.IndexOutOfBoundsException: Index: 0, Size: 1

I know ArrayList is not thread-safe, but in this example I think every remove() call is guaranteed to be preceded by at least one add() call, so the code should be OK even if the calls are interleaved like this:

    thread0: method2()
    thread1: method2()
    thread1: method3()
    thread0: method3()

Could someone explain why it fails?
Answer 1
Score: 6
If one add() or remove() call always finished completely before another one started, your reasoning would be correct. But ArrayList doesn't guarantee that, because its methods aren't synchronized. So two threads can be in the middle of modifying calls at the same time.
Let's look at the internals of, for example, the add() method to understand one possible failure mode.
When adding an element, ArrayList increases the size using size++, and this is not atomic.
Now imagine the list is empty, and two threads A and B add an element at exactly the same moment, doing the size++ in parallel (maybe on different CPU cores). Let's imagine things happen in the following order:
- A reads size as 0.
- B reads size as 0.
- A adds one to its value, giving 1.
- B adds one to its value, giving 1.
- A writes its new value back into the size field, resulting in size=1.
- B writes its new value back into the size field, resulting in size=1.
Although we had 2 add() calls, the size is only 1. If you now try to remove 2 elements (and this time it happens sequentially), the second remove() will fail.
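The interleaving listed above can be replayed deterministically in a single thread, which makes the lost update easy to see. This is only an illustrative sketch: the class LostUpdateDemo and its size field are hypothetical stand-ins for the list's internal counter, not the real ArrayList implementation.

```java
// Hypothetical stand-in for the list's internal size counter (not the real ArrayList).
class LostUpdateDemo {
    static int size = 0;

    // Replays the six steps of the interleaving, one by one, in a single thread.
    static int simulateInterleavedAdds() {
        size = 0;
        int readByA = size;          // A reads size as 0
        int readByB = size;          // B reads size as 0
        int newValueA = readByA + 1; // A adds one, giving 1
        int newValueB = readByB + 1; // B adds one, giving 1
        size = newValueA;            // A writes 1 back
        size = newValueB;            // B writes 1 back, overwriting A's update
        return size;
    }

    public static void main(String[] args) {
        // prints "size after two adds: 1"
        System.out.println("size after two adds: " + simulateInterleavedAdds());
    }
}
```

With real threads the same sequence only happens occasionally, which is why the exception in the question does not show up on every run.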
To achieve thread safety, no other thread should be able to mess around with internals like size (or the elements array) while an access is in progress.
Multi-threading is inherently complex: calls from multiple threads can not only happen in any (expected or unexpected) order, they can also overlap, unless protected by some mechanism like synchronized. On the other hand, excessive use of synchronization can easily lead to poor multi-threaded performance, and also to deadlocks.
Answer 2
Score: 1
As a supplement to @RalfKleberhoff's answer,
> I think every remove() call is guaranteed to be preceded by at least one add() call,
Yes.
> so the code should be OK even if the order is messed up
No, that is not a valid inference with respect to a multithreaded program.
Your program contains data races: two threads both access the same shared, non-atomic object, some of those accesses are writes, and there is no appropriate synchronization. Under the Java memory model, a program that contains data races is not guaranteed to behave as if its actions ran in some sequential interleaving, so you cannot draw reliable conclusions about its behavior by reasoning about the order of whole method calls.
Do not try to cheat or scrimp on synchronization. Do minimize the amount of it that you need by limiting your use of shared objects, but where you need it, you need it, and the rules for determining when and where you need it are not that hard to learn.
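One way to read "limiting your use of shared objects" is thread confinement: if each thread works on its own list, the list needs no synchronization at all. The following is a minimal sketch under that assumption; the class name ConfinedListDemo and the run() helper are hypothetical and not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each worker owns its list, so the list is never shared.
class ConfinedListDemo {
    // Runs the add/remove loop on several threads and returns the sum of the final list sizes.
    static int run(int threads, int loops) {
        int[] finalSizes = new int[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                List<String> local = new ArrayList<>(); // visible to this thread only
                for (int j = 0; j < loops; j++) {
                    local.add("1");
                    local.remove(0);
                }
                finalSizes[id] = local.size(); // each thread writes only its own slot
            }, "Thread" + i);
            workers[i].start();
        }
        int total = 0;
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join(); // join() makes the worker's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            total += finalSizes[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // prints "sum of final sizes: 0"
        System.out.println("sum of final sizes: " + run(2, 200));
    }
}
```

The only cross-thread communication left is the result array, and Thread.join() provides the ordering needed to read it safely.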
Answer 3
Score: 0
The Javadoc for ArrayList says:
> Note that this implementation is not synchronized. If multiple threads
> access an ArrayList instance concurrently, and at least one of the
> threads modifies the list structurally, it must be synchronized
> externally.
Why is this code not thread-safe?
> Multiple threads running on a machine run independently of each other.
>
>     public void method1(int loopNumber) {
>         for (int i = 0; i < loopNumber; i++) {
>             method2();
>             method3();
>         }
>     }
>
> Here method2() and method3() run sequentially within a single thread, but not across threads. The ArrayList list is shared by both threads, so on a multi-core system it can end up in an inconsistent state.
An interesting test is to add an emptiness check in method3() and set LOOP_NUMBER = 10000:

    private void method3()
    {
        if (!list.isEmpty())
            list.remove(0);
    }

You can still get a runtime exception such as java.lang.IndexOutOfBoundsException: Index: 0, Size: 1 or java.lang.IndexOutOfBoundsException: Index: 0, Size: 0, for the same reason: the check and the removal are not atomic, and the list's internal state, in particular size, can be inconsistent.
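To fix this, you could add a synchronized block as shown below, or use a synchronized list (for example, one created with Collections.synchronizedList):

    public void method1(int loopNumber)
    {
        for (int i = 0; i < loopNumber; i++)
        {
            // Only one thread at a time can run the add/remove pair.
            synchronized (list)
            {
                method2();
                method3();
            }
        }
    }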
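The synchronized-list alternative mentioned above might look like the following sketch. It assumes the list is wrapped with Collections.synchronizedList from java.util; the class name SynchronizedListDemo and the run() helper are hypothetical and only loosely mirror the question's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical variant of the question's class; the key change is the list declaration.
class SynchronizedListDemo {
    // Each individual add() and remove() call now runs while holding the wrapper's lock.
    final List<String> list = Collections.synchronizedList(new ArrayList<>());

    void method1(int loopNumber) {
        for (int i = 0; i < loopNumber; i++) {
            list.add("1");
            list.remove(0);
        }
    }

    // Runs the add/remove loop on several threads and returns the final list size.
    static int run(int threads, int loops) {
        SynchronizedListDemo demo = new SynchronizedListDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> demo.method1(loops), "Thread" + i);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.list.size();
    }

    public static void main(String[] args) {
        // prints "final size: 0"
        System.out.println("final size: " + run(2, 200));
    }
}
```

Because every add() and remove() is now atomic and each thread removes only after its own add, the list can never be empty when remove(0) runs. A compound action such as the isEmpty() check followed by remove(0) would still need an explicit synchronized (list) block around both calls.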