Is it OK to modify items in an ArrayList from multiple threads, if those threads never modify the same item?



Say I have an ArrayList<ContentStub>, where ContentStub is defined as follows:

public class ContentStub {
    ContentType contentType;
    Object content;
}

I have several class implementations that "inflate" the stubs for each ContentType, e.g.:

public class TypeAStubInflater {

    public void inflate(List<ContentStub> contentStubs) {
        contentStubs.forEach(stub -> {
            // Only touch the stubs this inflater is responsible for.
            if (stub.contentType == ContentType.TYPE_A) {
                stub.content = someService.getContent();
            }
        });
    }
}

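The idea is that a TypeAStubInflater, which only modifies items of ContentType.TYPE_A, runs in one thread, a TypeBStubInflater, which only modifies items of ContentType.TYPE_B, runs in another, and so on. However, each instance's inflate() method modifies items in the same contentStubs list, in parallel.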


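  • No thread will ever change the size of the ArrayList.
  • No thread will ever attempt to modify a value that another thread is modifying.
  • No thread will ever attempt to read a value written by another thread.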

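Given all this, it seems that no additional measures are needed to ensure thread safety. From a (very) quick look at the ArrayList implementation, there seems to be no risk of a ConcurrentModificationException. That doesn't mean something else can't go wrong, though. Am I missing something, or is this safe to do?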


In general, that will work, because you are not modifying the state of the List itself. A structural change, such as adding or removing elements while an iterator is active, is what triggers a ConcurrentModificationException. You are only modifying objects that the list refers to, which is fine from the list's point of view.

I would recommend splitting your list up into a Map<ContentType, List<ContentStub>> and then starting threads on those specific lists.

You could convert your list to a map with this:

Map<ContentType, List<ContentStub>> stubsByType = stubs.stream().collect(Collectors.groupingBy(stub -> stub.contentType));
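
To make the per-type split concrete, here is a minimal sketch. It assumes simplified stand-ins for ContentType and ContentStub, and a hypothetical fetchContent() helper in place of someService.getContent(); none of these names come from the original code. Each group of stubs is handed to its own task, and calling Future.get() on every task both surfaces failures and ensures the results are visible to the calling thread afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

// Simplified stand-ins for the types from the question (assumptions).
enum ContentType { TYPE_A, TYPE_B }

class ContentStub {
    final ContentType contentType;
    Object content;

    ContentStub(ContentType contentType) {
        this.contentType = contentType;
    }
}

class PerTypeInflation {

    // Hypothetical placeholder for someService.getContent().
    static Object fetchContent(ContentType type) {
        return "content for " + type;
    }

    static void inflateAll(List<ContentStub> stubs)
            throws InterruptedException, ExecutionException {
        // One sub-list per ContentType; the original list itself is not modified.
        Map<ContentType, List<ContentStub>> stubsByType =
                stubs.stream().collect(Collectors.groupingBy(stub -> stub.contentType));

        // A pool needs at least one thread, even if the input list is empty.
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, stubsByType.size()));
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            stubsByType.forEach((type, group) -> tasks.add(() -> {
                for (ContentStub stub : group) {
                    stub.content = fetchContent(type);
                }
                return null;
            }));
            // invokeAll blocks until every task has finished; get() rethrows
            // task failures and makes the tasks' writes visible to this thread.
            for (Future<Void> done : pool.invokeAll(tasks)) {
                done.get();
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<ContentStub> stubs = new ArrayList<>();
        stubs.add(new ContentStub(ContentType.TYPE_A));
        stubs.add(new ContentStub(ContentType.TYPE_B));
        inflateAll(stubs);
        stubs.forEach(stub -> System.out.println(stub.contentType + " -> " + stub.content));
    }
}
```

Using an ExecutorService rather than raw threads is only one option; the point is that each task sees just the stubs of its own type.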

If your list is not that big (fewer than about 1000 entries), I would even recommend not using any threading at all: just iterate with a plain indexed for loop, or even .forEach if those two extra integers are no concern.


Let's assume that thread A writes TYPE_A content and thread B writes TYPE_B content. The list contentStubs is only used to obtain instances of ContentStub: read access only. So from the perspective of A, B and contentStubs, there is no problem. However, the updates done by threads A and B are not guaranteed to ever be seen by another thread; e.g. another thread C may well conclude that stub.content == null for all elements in the list.

The reason for this is the Java Memory Model. Unless you use constructs such as locks, synchronization, volatile or atomic variables, the memory model gives no guarantee about if and when modifications made to an object by one thread become visible to another thread. To make this a little more practical, let's look at an example.

Imagine that a thread A executes the following code:

    stub.content = someService.getContent(); // happens to be element[17]

List element 17 is a reference to a ContentStub object on the global heap. Conceptually, the VM is allowed to make a thread-private copy of that object (in practice, values held in registers or CPU caches). All subsequent accesses through that reference in thread A may then use the copy. The VM is free to decide when, and whether, to update the original object on the global heap.

Now imagine a thread C that executes the following code:

    ContentStub stub = contentStubs.get(17);

The VM may well play the same trick with a private copy in thread C.

If thread C already accessed the object before thread A updated it, thread C may keep using its stale copy and ignore the global original for a long time. But even if thread C accesses the object for the first time after thread A updated it, there is no guarantee that the changes in thread A's private copy have already ended up in the global heap.

In short: without a lock or some other form of synchronization, thread C has no guarantee of ever reading anything but null in each stub.content.
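
One simple way to get that guarantee, if the reader only needs the results once all inflaters are done, is to join the writer threads first: everything a thread did happens-before another thread's successful return from join() on it. The following is a minimal sketch under assumptions; the ContentType and ContentStub stand-ins, the inflater() helper and the placeholder content string are illustrative and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the types from the question (assumptions).
enum ContentType { TYPE_A, TYPE_B }

class ContentStub {
    final ContentType contentType;
    Object content; // plain, non-volatile field, as in the question

    ContentStub(ContentType contentType) {
        this.contentType = contentType;
    }
}

class JoinVisibilityDemo {

    // Plays the role of one inflater: it only touches stubs of its own type.
    static Thread inflater(List<ContentStub> stubs, ContentType type) {
        return new Thread(() -> {
            for (ContentStub stub : stubs) {
                if (stub.contentType == type) {
                    // Placeholder for someService.getContent().
                    stub.content = "content for " + type;
                }
            }
        });
    }

    static List<ContentStub> inflateAndJoin() throws InterruptedException {
        List<ContentStub> stubs = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            stubs.add(new ContentStub(i % 2 == 0 ? ContentType.TYPE_A : ContentType.TYPE_B));
        }

        Thread a = inflater(stubs, ContentType.TYPE_A);
        Thread b = inflater(stubs, ContentType.TYPE_B);
        a.start();
        b.start();

        // After join() returns, every write made by that thread is visible here.
        a.join();
        b.join();
        return stubs;
    }

    public static void main(String[] args) throws InterruptedException {
        long stillNull = inflateAndJoin().stream()
                .filter(stub -> stub.content == null)
                .count();
        System.out.println("stubs still null: " + stillNull);
    }
}
```

Because the main thread reads the stubs only after both joins, it prints "stubs still null: 0". A thread that reads the stubs without such a join (or another synchronization action) gets no such promise.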

The reason for this memory model is performance. On modern hardware, there is a trade-off between performance and consistency across all CPUs/cores. If the memory model of a modern language required full consistency, that would be very hard to guarantee on all hardware and would likely cost too much performance. Modern languages therefore embrace weak consistency and offer the developer explicit constructs to enforce it when needed. In combination with instruction reordering by both compilers and processors, that makes old-fashioned linear reasoning about your program code ... interesting.
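
As an illustration of one such explicit construct, the sketch below marks the content field volatile, so a reader that polls the field is guaranteed to eventually see the writer's update. The VolatileStub class and the literal "inflated" value are assumptions made for this example; whether volatile is the right tool for the original code depends on how the stubs are consumed.

```java
// Simplified stand-in for the enum from the question (assumption).
enum ContentType { TYPE_A, TYPE_B }

// Variant of ContentStub whose content field is volatile (assumption).
class VolatileStub {
    final ContentType contentType;
    // A volatile write is guaranteed to become visible to subsequent
    // volatile reads of the same field in other threads.
    volatile Object content;

    VolatileStub(ContentType contentType) {
        this.contentType = contentType;
    }
}

class VolatileVisibilityDemo {

    static Object writeThenObserve() {
        VolatileStub stub = new VolatileStub(ContentType.TYPE_A);

        // The writer plays the role of an inflater thread.
        Thread writer = new Thread(() -> stub.content = "inflated");
        writer.start();

        // The calling thread polls the field. With a plain (non-volatile)
        // field this loop would not be guaranteed to terminate.
        while (stub.content == null) {
            Thread.yield();
        }
        return stub.content;
    }

    public static void main(String[] args) {
        System.out.println(writeThenObserve());
    }
}
```

Busy-waiting is used here only to keep the example short; in real code a Future, a latch or a join is usually a better way to wait for a result.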

