Efficient way to synchronize an ArrayList in Java when you have to process it in parallel

Question

I have a collection of lists, and I have to iterate over each list element and put it into another list. The data set is very large, so I need to process it in parallel to get a good processing time. I also need to preserve the order of the lists. When I use the code shown here, I sometimes lose elements from the list, or sometimes get NULL entries. What is an efficient way of making the list synchronized or thread-safe?

java.util.List<T> metadata = new ArrayList<T>();
sourceValuesIterable.parallelStream().forEach(tblRow ->
{
    metadata.add();
});

One more question: when you remove the NULLs from a collection using Guava's Predicates, does it change the order of the list elements?

Thanks in advance.

Answer 1

Score: 1

Parallelism requires a single stream pipeline if you want to stand any chance of order being preserved. Fortunately, you can do that here: map the elements of your sourceValuesIterable (sVI) to Ts, then turn the stream into a list by collecting it:

List<T> metadata = sVI.parallelStream()
    .map(tblRow -> new ThingieThatGoesInMetadata())
    .collect(Collectors.toList());

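Start there. No shared list is mutated from several threads, so no elements are lost, and because the source list is ordered, Collectors.toList() keeps its encounter order even on a parallel stream. This way, the ordering is guaranteed.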
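To make the idea concrete, here is a minimal, self-contained sketch of the map-then-collect approach. The class name OrderedParallelCollect, the toMetadata helper, and the use of Integer rows and String results are assumptions made purely for illustration; they stand in for whatever T and per-row conversion the real code uses.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class OrderedParallelCollect {

    // Hypothetical conversion standing in for whatever value goes into `metadata`.
    static String toMetadata(Integer tblRow) {
        return "row-" + tblRow;
    }

    // map + collect: the stream builds partial results per worker and merges them
    // in encounter order, so no shared ArrayList is mutated concurrently.
    static List<String> transform(List<Integer> sourceValues) {
        return sourceValues.parallelStream()
                .map(OrderedParallelCollect::toMetadata)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative input: 100,000 sequential integers.
        List<Integer> source = IntStream.range(0, 100_000)
                .boxed()
                .collect(Collectors.toList());

        List<String> metadata = transform(source);

        // The result has one entry per input row, in the same order as the source.
        System.out.println(metadata.size());
        System.out.println(metadata.get(0) + " ... " + metadata.get(metadata.size() - 1));
    }
}
```

If the per-row work is cheap, the parallel version may not be faster than a sequential stream, so it is worth measuring both on real data.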

Answer 2

Score: 0

I think it's a mistake to assume that parallelising this task and adding elements one at a time to the new list is automatically going to be the fastest way to copy it.

For starters, you didn't pre-size the new ArrayList, so it will have to resize its backing array repeatedly as you add elements until it reaches the necessary capacity.
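As a rough illustration of pre-sizing, the sketch below passes the source size to the ArrayList constructor so the backing array is allocated once. The class and method names (PreSizedCopy, copyPreSized) are hypothetical, and the loop is deliberately sequential.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PreSizedCopy {

    // Allocates the target with the final capacity up front, which avoids
    // repeated growth of the backing array while elements are added.
    static <T> List<T> copyPreSized(List<T> source) {
        List<T> target = new ArrayList<>(source.size());
        for (T element : source) {
            target.add(element); // a plain sequential add keeps the source order
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "c");
        List<String> copy = copyPreSized(source);
        System.out.println(copy);
    }
}
```

Pre-sizing only removes the resizing cost. It does not make ArrayList safe to fill from several threads at once.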

There is also an overhead associated with spinning up a parallel stream and with merging the results.

ArrayList already has a copy constructor which will do an efficient copy. Ultimately, that's just going to be copying the underlying array of references. It's hard to imagine being able to beat that kind of low-level operation for performance.
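A short sketch of the copy-constructor approach follows, assuming a plain element-for-element copy is all that is needed. CopyConstructorDemo and its copy method are illustrative names only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CopyConstructorDemo {

    // new ArrayList<>(source) copies the references in iteration order.
    // The copy is shallow: both lists refer to the same element objects.
    static <T> List<T> copy(List<T> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("x", "y", "z"));
        List<String> duplicate = copy(source);

        // Structural changes to the copy do not affect the original list.
        duplicate.add("extra");

        System.out.println(source);
        System.out.println(duplicate);
    }
}
```

This covers only the case where elements are copied unchanged. If each row needs a transformation, the stream map-and-collect form is still required.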

As always with performance-related concerns, your best bet is to profile it, measure the results, and use data to inform your decisions.
