Java ArrayList thread unsafe example explanation

Question

import java.util.ArrayList;

class ThreadUnsafe {

    static final int THREAD_NUMBER = 2;
    static final int LOOP_NUMBER = 200;

    public static void main(String[] args) {
        ThreadUnsafe test = new ThreadUnsafe();
        // Start THREAD_NUMBER threads that all work on the same instance.
        for (int i = 0; i < THREAD_NUMBER; i++) {
            new Thread(() -> {
                test.method1(LOOP_NUMBER);
            }, "Thread" + i).start();
        }
    }

    // Shared by all threads, with no synchronization.
    ArrayList<String> list = new ArrayList<>();

    public void method1(int loopNumber) {
        for (int i = 0; i < loopNumber; i++) {
            method2();
            method3();
        }
    }

    private void method2() {
        list.add("1");
    }

    private void method3() {
        list.remove(0);
    }

}

The code above throws

java.lang.IndexOutOfBoundsException: Index: 0, Size: 1

I know ArrayList is not thread-safe, but in this example I think every remove() call is guaranteed to be preceded by at least one add() call, so the code should be OK even if the order is messed up like the following:

thread0: method2()
thread1: method2()
thread1: method3()
thread0: method3() 

Could someone explain what goes wrong here?

Answer 1

Score: 6

If each add() or remove() call always finished completely before another one started, your reasoning would be correct. But ArrayList doesn't guarantee that, as its methods aren't synchronized. So it can happen that two threads are in the middle of modifying calls at the same time.

Let's look at the internals of e.g. the add() method to understand one possible failure mode.

When adding an element, ArrayList increases the size using size++, and this is not an atomic operation: it is a read, an addition, and a write.

Now imagine the list being empty, and two threads A and B adding an element at exactly the same moment, doing the size++ in parallel (maybe in different CPU cores). Let's imagine things happen in the following order:

  • A reads size as 0.
  • B reads size as 0.
  • A adds one to its value, giving 1.
  • B adds one to its value, giving 1.
  • A writes its new value back into the size field, resulting in size=1.
  • B writes its new value back into the size field, resulting in size=1.

Although we had 2 add() calls, the size is only 1. If you now try to remove 2 elements (this time sequentially), the second remove() will fail.
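The interleaving listed above can be replayed deterministically in a single thread. This is only a sketch: the class name, the static size field, and the method simulateInterleavedIncrements() are hypothetical stand-ins, not the actual ArrayList internals. It makes each step of size++ explicit so the lost update is visible.

```java
public class LostUpdateSketch {

    // Hypothetical stand-in for the list's internal size field.
    static int size = 0;

    // Replays the six steps from the list above, in that exact order.
    static int simulateInterleavedIncrements() {
        size = 0;                    // the list starts empty
        int readByA = size;          // A reads size as 0
        int readByB = size;          // B reads size as 0
        int newValueA = readByA + 1; // A adds one to its value, giving 1
        int newValueB = readByB + 1; // B adds one to its value, giving 1
        size = newValueA;            // A writes back: size = 1
        size = newValueB;            // B writes back: size = 1 (A's update is lost)
        return size;
    }

    public static void main(String[] args) {
        // Two "increments" happened, but only one is reflected in size.
        System.out.println("size after two interleaved increments: "
                + simulateInterleavedIncrements());
    }
}
```

With real threads the same steps can interleave this way by chance, which is why the failure in the question shows up only on some runs.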

To achieve thread safety, no other thread should be able to mess around with the internals like size (or the elements array) while one access is currently in progress.

Multi-threading is inherently complex in that calls from multiple threads can not only happen in any (expected or unexpected) order, but can also overlap, unless protected by some mechanism like synchronized. On the other hand, excessive use of synchronization can easily lead to poor multi-threaded performance, and also to deadlocks.
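One way to get the "one call finishes before the next starts" guarantee is to wrap the list with Collections.synchronizedList, which makes each individual add() and remove() call hold the wrapper's lock. The sketch below is an illustration under that assumption. The class name and the runOnce() helper are made up for this example, and the constants are copied from the question. Once calls can no longer overlap, the reasoning from the question holds: a thread about to call remove(0) has completed its own add(), so the list cannot be empty at that point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class SynchronizedListSketch {

    static final int THREAD_NUMBER = 2;
    static final int LOOP_NUMBER = 200;

    // Hypothetical helper: runs the add/remove loop from the question on a
    // synchronized wrapper and returns the final list size.
    static int runOnce() {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread[] threads = new Thread[THREAD_NUMBER];

        for (int i = 0; i < THREAD_NUMBER; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < LOOP_NUMBER; j++) {
                        list.add("1");   // each call holds the wrapper's lock
                        list.remove(0);  // so calls can no longer overlap
                    }
                } catch (Throwable t) {
                    failure.compareAndSet(null, t); // remember the first failure
                }
            }, "Thread" + i);
            threads[i].start();
        }

        try {
            for (Thread t : threads) {
                t.join(); // wait for every worker before inspecting the list
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        if (failure.get() != null) {
            throw new IllegalStateException(failure.get());
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println("final size: " + runOnce());
    }
}
```

Note that the wrapper only protects single calls. A compound action such as "check isEmpty(), then remove" or an iteration over the list still needs an explicit synchronized block around the whole action.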

Answer 2

Score: 1

As a supplement to @RalfKleberhoff's answer,

> I think every remove() call is guaranteed to be preceded by at least one add() call,

Yes.

> so the code should be OK even if the order is messed up

No, that is not a valid inference with respect to a multithreaded program.

Your program contains data races as a result of two threads both accessing the same shared, non-atomic object, with some of those accesses being writes, without appropriate synchronization. The Java Memory Model gives very few guarantees about what a program that contains data races will do, so you cannot reliably draw conclusions about its behavior.

Do not try to cheat or scrimp on synchronization. Do minimize the amount of it that you need by limiting your use of shared objects, but where you need it, you need it, and the rules for determining when and where you need it are not that hard to learn.

Answer 3

Score: 0

The Javadoc for ArrayList says:

> Note that this implementation is not synchronized. If multiple threads
> access an ArrayList instance concurrently, and at least one of the
> threads modifies the list structurally, it must be synchronized
> externally.

Why is this code not thread-safe?

> Multiple threads running on a machine run independently of each other.
>
>     public void method1(int loopNumber) {
>         for (int i = 0; i < loopNumber; i++) {
>             method2();
>             method3();
>         }
>     }
>
> Here method2() and method3() are executed sequentially within each
> thread, but not across threads. The ArrayList list is shared by both
> threads, so on a multi-core system it can end up in an inconsistent state.

An interesting test is to add an emptiness check in method3() and set LOOP_NUMBER = 10000:

private void method3()
{
	if (!list.isEmpty())
		list.remove(0);
}

As a result, you should still get a runtime exception, something like java.lang.IndexOutOfBoundsException: Index: 0, Size: 1 or java.lang.IndexOutOfBoundsException: Index: 0, Size: 0, for the same reason: the list's internal state (i.e. size) is inconsistent.

To fix this issue, you could add synchronized as in the following method, or use a synchronized list:

public void method1(int loopNumber)
{
	for (int i = 0; i < loopNumber; i++)
	{
		synchronized (list)
		{
			method2();
			method3();
		}
	}
} 
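For completeness, here is a minimal runnable sketch of the synchronized-block variant. The class name and the runAndGetFinalSize() helper are hypothetical additions for this example, and the constants follow the values used in this answer. Because the lock is held across the add/remove pair, each pair behaves as one indivisible step.

```java
import java.util.ArrayList;
import java.util.List;

public class SynchronizedBlockSketch {

    static final int THREAD_NUMBER = 2;
    static final int LOOP_NUMBER = 10000;

    private final List<String> list = new ArrayList<>();

    public void method1(int loopNumber) {
        for (int i = 0; i < loopNumber; i++) {
            // Hold the list's monitor across both calls, so no other
            // thread can touch the list between add and remove.
            synchronized (list) {
                list.add("1");
                list.remove(0);
            }
        }
    }

    // Hypothetical helper: starts the workers, waits for them to finish,
    // and returns the final size of the shared list.
    static int runAndGetFinalSize() {
        SynchronizedBlockSketch test = new SynchronizedBlockSketch();
        Thread[] threads = new Thread[THREAD_NUMBER];
        for (int i = 0; i < THREAD_NUMBER; i++) {
            threads[i] = new Thread(() -> test.method1(LOOP_NUMBER), "Thread" + i);
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (test.list) {
            return test.list.size();
        }
    }

    public static void main(String[] args) {
        System.out.println("final size: " + runAndGetFinalSize());
    }
}
```

The trade-off is that the threads now take turns on the list, so this loop gains nothing from running in parallel. That is the price of sharing one mutable list.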

huangapple
  • Posted on October 9, 2020 at 17:59:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64277789.html