英文:
ConcurrentHashMap - Can we get rid of i >= n from transfer()?
问题
与以下内容相关:
https://stackoverflow.com/q/63597067/7134737
我正在研究j.u.c.ConcurrentHashMap中的transfer()方法。
是否有必要从transfer()方法中去掉i >= n || i + n >= nextn?
for (int i = 0, bound = 0;;) {
Node<K,V> f; int fh;
while (advance) {
int nextIndex, nextBound;
if (--i >= bound || finishing) // 始终递减
advance = false;
else if ((nextIndex = transferIndex) <= 0) {
i = -1; // 显然小于n
advance = false;
}
else if (U.compareAndSetInt
(this, TRANSFERINDEX, nextIndex,
nextBound = (nextIndex > stride ?
nextIndex - stride : 0))) {
bound = nextBound;
i = nextIndex - 1; // 保证小于n
advance = false;
}
}
if (i < 0 || i >= n || i + n >= nextn) { // 除了i < 0以外,为什么要保留其他条件
考虑到i
是一个局部变量,我看不到它超过n
(旧表长度)的数学方式。
此外,上面链接的问题还说这是无用的代码。
我是否忽略了关于JMM需要检查这个奇怪条件的什么内容?
英文:
Related to :
https://stackoverflow.com/q/63597067/7134737
I am looking into the transfer() method in j.u.c.ConcurrentHashMap.
Doesn't it make sense to just get rid of i >= n || i + n >= nextn from the transfer() method ?
for (int i = 0, bound = 0;;) {
Node<K,V> f; int fh;
while (advance) {
int nextIndex, nextBound;
if (--i >= bound || finishing) // always decreasing
advance = false;
else if ((nextIndex = transferIndex) <= 0) {
i = -1; // obviously less than n
advance = false;
}
else if (U.compareAndSetInt
(this, TRANSFERINDEX, nextIndex,
nextBound = (nextIndex > stride ?
nextIndex - stride : 0))) {
bound = nextBound;
i = nextIndex - 1; // will be guaranteed to be less than n
advance = false;
}
}
if (i < 0 || i >= n || i + n >= nextn) { // why keep any thing other than i < 0
Given that i
is a local variable I see no mathematical way it can exceed n
(old table len).
Also, the question linked above also says that its dead code.
Am I missing something about the JMM that needs to check this weird condition?
答案1
得分: 1
这段代码的注释和解释涉及到了一些特定的编程和并发概念,需要一定的上下文理解。以下是代码中的一些主要部分的简要翻译:
// 当 advance 为真时执行以下循环
while (advance) {
int nextIndex, nextBound;
// 如果减小 i 后仍然大于等于 bound 或者已经完成,将 advance 设为假
if (--i >= bound || finishing)
advance = false;
// 如果 nextIndex 小于等于 0,则进入下面的 else if 块
else if ((nextIndex = transferIndex) <= 0) {
i = -1;
advance = false;
}
// 使用 CAS 操作尝试设置 nextIndex 和 nextBound,以确保线程安全
else if (U.compareAndSetInt
(this, TRANSFERINDEX, nextIndex,
nextBound = (nextIndex > stride ?
nextIndex - stride : 0))) {
bound = nextBound;
i = nextIndex - 1;
advance = false;
}
}
// 如果 i 小于 0,或者大于等于 n,或者大于等于 nextn,则执行以下操作
if (i < 0 || i >= n || i + n >= nextn) {
// 这里执行一些其他操作
}
请注意,这段代码是为了在并发环境下执行一些操作,使用了CAS(比较并交换)来确保线程安全性。这种代码通常需要通过不断重试来实现操作,直到成功为止,以处理并发冲突。此外,还提到了与数据库中的MVCC(多版本并发控制)相似的原理,其中也使用了重试来解决冲突。
英文:
It can, though:
>
> while (advance) {
> int nextIndex, nextBound;
> if (--i >= bound || finishing)
> advance = false;
> else if ((nextIndex = transferIndex) <= 0) { // !!!! LINE 1
> i = -1;
> advance = false;
> }
> else if (U.compareAndSetInt
> (this, TRANSFERINDEX, nextIndex,
> nextBound = (nextIndex > stride ?
> nextIndex - stride : 0))) {
> bound = nextBound;
> i = nextIndex - 1;
> advance = false;
> }
> }
>
> if (i < 0 || i >= n || i + n >= nextn) {
>
This code is highly non-idiomatic; it is written in a style somewhat common in older kernel/low-level C code style and not at all in java style. No wonder; this stuff is the domain of low-level stuff.
Line 1 as marked in the snippet I copy/pasted directly from the source of CHM (other than that remark) looks like a check but it is not. It sets nextIndex
to transferIndex, and then checks if that value is below or equal to 0 in which case it enters the else if
's block.
So, here's the flow to get to i >= n
triggering:
-
This code loops once or a few times.
-
It gets to that
if
check.transferIndex
is a field. Another thread can change it (and given that we're talking about CHM, the code is obviously written to take into account that may have occurred). -
Because of this,
nextIndex
is now quite high. Higher thann
. That's because another thread has been growing this map in the middle of thistransfer()
operation. -
Nevertheless, whilst
nextIndex
has been updated to the value ofnextTransfer
, the if fails. The next elseif (a CAS (Compare-and-set, a CPU primitive that is incredibly useful to write lock-free code that operates correctly in concurrent situations) op to ensure that in the few cycles we just spent going through these hoops,transferIndex
is still the value we read a cycle or two ago), we have now 'claimed' the job and seti
tonextIndex - 1
which could be higher thann
!. -
Then we get to the
i >= n
check you are asking about, and it could therefore indeed be true. I think the situation where that occurs is if, halfway through a transfer, some other thread has added some data to the CHM, but that's already been added to the post-transfer intended table so there is no need to act, thus, start the 'finishing' job which requires another CAS check and one more loop withfinishing = true
on.
NB: You may be thinking: Wait, transferIndex
is being read but there is absolutely nothing establishing Happens-Before on transferIndex here, so whatever it reads, it's by definition a JMM violation and could be (and is in fact likely to be on modern CPUs) unaffected by other thread writes. But not so fast: Before a likely stale value of transferIndex is actually used, the code does a CAS check on TRANSFERINDEX
(it's in the snippet I just pasted, in the final else if
) and CAS checks also establish HB. It's not documented, because Unsafe
isn't officially part of the spec, more or less. (It's sort of part of JVM spec and sort of not, bit of an oddball). For stale reads on this, the loop more or less just reloops.
As a general rule, to be truly fast in the face of likely concurrent operations, you want to be lock-free. To be lock-free, you need lots of CAS, and you use the CAS generally in a 'retry' style: You check current state, you act on current state, you then use CAS to confirm that the state you checked before is still the state, and if so, update and move on. If not, just.. do it again. Over and over until you 'win' ('win' = the CAS call succeeds because the thing you are CASing does have the expected value, expected = the thing it was when you started the calculation). Eventually you 'win'. Hence, it's best to read such code as having looped a few times, having done absolutely nothing, because CAS calls failed.
DBs work the same way: If you're familiar with MVCC style dbs and how you can use them to do lock-free ops with nevertheless the very highest concurrency guarantees (namely, TransactionIsolationLevel.SERIALIBLE
, the 'toughest' level) - same principle at work. Retry-on-conflict - same thing happening here.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论