Memory Barrier Vs CAS
Question
I find that CAS will flush all CPU write caches to main memory. Is this similar to a memory barrier?
If this is true, does it mean CAS can make Java happens-before work?
To clarify for answerers:
By CAS I mean the CPU instruction.
The barrier I mean is a StoreLoad barrier, because what I care about is whether data written before the CAS can be read after the CAS.
More detail:
I have this question because I am writing a fork-join implementation in Java. The implementation looks like this:
    {
        // initialize result container (n = number of worker threads; the size is elided in the original)
        Object[] result = new Object[n];
        // count of workers that have not finished yet
        AtomicInteger state = new AtomicInteger(result.length);
    }
    // worker thread i
    {
        result[i] = new Object();
        // this is a CAS operation
        state.getAndDecrement();
        if (state.get() == 0) {
            // do something using result array
        }
    }
I want to know whether the "do something using result array" part can see all the result elements written by the other worker threads.
Answer 1
Score: 3
> I find that CAS will flush all CPU write caches to main memory. Is this similar to a memory barrier?
- It depends on what you mean by CAS. (A specific hardware instruction? An implementation strategy used in the implementation of some Java class?)
- It depends on what kind of memory barrier you are talking about. There are a number of different kinds ...
- It is not necessarily true that a CAS instruction flushes all dirty cache lines. It depends on how a particular instruction set / hardware implements the CAS instruction.
It is unclear what you mean by "make happens-before work". Certainly, under some circumstances a CAS instruction would provide the memory coherency properties needed for a specific happens-before relationship, but not necessarily for all such relationships. It would depend on how the hardware implements the CAS instruction.
To be honest, unless you are actually writing a Java compiler, you would do better not to try to understand the intricacies of what a JIT compiler needs to do to implement the Java Memory Model. Just apply the happens-before rules.
UPDATE
It turns out from your recent updates and comments that your actual question is about the behavior of AtomicInteger operations.
The memory semantics of the atomic types are specified in the package javadoc for java.util.concurrent.atomic as follows:
> The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in The Java Language Specification (17.4 Memory Model):
>
> - get has the memory effects of reading a volatile variable.
> - set has the memory effects of writing (assigning) a volatile variable.
> - lazySet has the memory effects of writing (assigning) a volatile variable except that it permits reorderings with subsequent (but not previous) memory actions that do not themselves impose reordering constraints with ordinary non-volatile writes. Among other usage contexts, lazySet may apply when nulling out, for the sake of garbage collection, a reference that is never accessed again.
> - weakCompareAndSet atomically reads and conditionally writes a variable but does not create any happens-before orderings, so provides no guarantees with respect to previous or subsequent reads and writes of any variables other than the target of the weakCompareAndSet.
> - compareAndSet and all other read-and-update operations such as getAndIncrement have the memory effects of both reading and writing volatile variables.
As you can see, operations on Atomic types are specified to have memory semantics equivalent to those of volatile variables. This should be sufficient to reason about your use of Java atomic types, without resorting to dubious analogies with CAS instructions and memory barriers.
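As a rough illustration of those volatile-like semantics, here is a minimal sketch (the class name AtomicPublication and its fields are made up for this example, not taken from the question). A plain field is written before set and read after get observes the new value, so the happens-before rules guarantee the reader sees the plain write.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: publishes a plain field through an AtomicInteger.
class AtomicPublication {
    private int payload;                                    // ordinary, non-volatile field
    private final AtomicInteger ready = new AtomicInteger(0);

    // Starts a writer thread, waits for its signal, and returns the payload the caller observed.
    static int publishAndRead() {
        AtomicPublication shared = new AtomicPublication();
        Thread writer = new Thread(() -> {
            shared.payload = 42;      // plain write, ordered before the set by program order
            shared.ready.set(1);      // memory effects of a volatile write
        });
        writer.start();
        while (shared.ready.get() == 0) {   // memory effects of a volatile read
            Thread.yield();
        }
        int seen = shared.payload;    // guaranteed to be 42 once get() has returned 1
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

If ready were a plain int field instead, the loop might never terminate and the read of payload would be a data race.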
Your example is incomplete and it is difficult to understand what it is trying to do. Therefore, I can't comment on its correctness. However, you should be able to analyze it yourself using happens-before logic, etc.
Answer 2
Score: 1
> I find that CAS will flush all CPU write caches to main memory.
> Is this similar to a memory barrier?
A CAS in Java on x86 is implemented using a lock prefix; which instruction is actually used depends on the type of CAS, but that isn't relevant for this discussion. A locked instruction is effectively a full barrier, so it includes all 4 fences: LoadLoad/LoadStore/StoreLoad/StoreStore. Since x86 already provides all but StoreLoad due to TSO, only the StoreLoad needs to be added, just as with a volatile write.
A StoreLoad doesn't force changes to be written to main memory; it only forces the CPU to hold off executing loads until the store buffer has been drained to the L1d. However, with MESI-based cache coherence protocols (Intel), a cache line that is in the MODIFIED state on a different CPU may need to be flushed to main memory before it can be returned as EXCLUSIVE. With MOESI-based cache coherence protocols (AMD), this is not an issue. If the cache line is already in the MODIFIED or EXCLUSIVE state on the core doing the StoreLoad, the StoreLoad doesn't cause the cache line to be flushed to main memory. The cache is the source of truth.
> If this is true, does this mean CAS can make Java happens-before work?
From a memory model perspective, a successful CAS in Java is nothing more than a volatile read followed by a volatile write. So there is a happens-before relation between a volatile write of some field on some object instance and a subsequent volatile read of the same field on the same object instance.
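The "volatile read followed by a volatile write" view can be seen in the usual compareAndSet retry loop. The following is only a sketch; the class CasLoop and the method doubleAtomically are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper showing a compareAndSet retry loop.
class CasLoop {
    // Atomically doubles the counter and returns the new value.
    static int doubleAtomically(AtomicInteger counter) {
        while (true) {
            int current = counter.get();                 // memory effects of a volatile read
            int next = current * 2;
            // A successful compareAndSet has the memory effects of both
            // reading and writing a volatile variable.
            if (counter.compareAndSet(current, next)) {
                return next;
            }
            // Another thread changed the value in between; retry with a fresh read.
        }
    }
}
```

A thread that later reads the counter and observes the doubled value is also guaranteed to see every write the updating thread made before its successful compareAndSet.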
Since you are working with Java, I would focus on the Java Memory Model and not too much on how it is implemented in hardware. The JMM allows executions that can't be explained purely by thinking in terms of fences.
Regarding your example:
    result[i] = new Object();
    // this is a CAS operation
    state.getAndDecrement();
    if (state.get() == 0) {
        // do something using result array
    }
I'm not sure what the intended logic is. In your example, multiple threads at the same time could see that the state is 0, so all could start to do something with the array. If this behavior is undesirable, then this is caused by a race condition. I would use something like this:
    result[i] = new Object();
    // this is a CAS operation; decrementAndGet returns the updated value,
    // so only the last worker to finish observes 0
    int s = state.decrementAndGet();
    if (s == 0) {
        // do something using result array
    }
Now the other question is whether there is a data race on the array content. There is a happens-before edge between the write to the array content and the write to 'state' (program order rule). There is a happens-before edge between the write of the state and the read of the state (volatile variable rule), and there is a happens-before edge between the read of the state and the read of the array content (program order rule). So, in this particular example, there is a happens-before edge between writing to the array and reading its content, due to the transitive nature of the happens-before relation.
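To make that chain concrete, here is a self-contained sketch of the pattern under some assumptions of mine: the class name ForkJoinSketch, the use of Integer results, and the summing step are all illustrative, and the counter is decremented with decrementAndGet so that exactly one worker (the last to finish) observes 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the question's pattern; names and the summing step are assumptions.
class ForkJoinSketch {
    // Runs `workers` threads. Returns the sum computed by the last finisher,
    // or -1 if the completion step did not run exactly once.
    static int run(int workers) {
        Object[] result = new Object[workers];
        AtomicInteger state = new AtomicInteger(workers);
        AtomicInteger completions = new AtomicInteger(0);
        AtomicInteger sum = new AtomicInteger(0);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                result[id] = Integer.valueOf(id + 1);     // plain write to the array slot
                if (state.decrementAndGet() == 0) {       // volatile read + volatile write
                    // Every earlier decrement happens-before this one, so all
                    // slots written before those decrements are visible here.
                    int total = 0;
                    for (Object o : result) {
                        total += (Integer) o;
                    }
                    sum.set(total);
                    completions.incrementAndGet();
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completions.get() == 1 ? sum.get() : -1;
    }
}
```

For example, ForkJoinSketch.run(8) stores the values 1 through 8 and the last worker sums them to 36.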
Personally, I would not try to be too smart, and would use something less error-prone such as an AtomicReferenceArray; then at least you don't need to worry about a missing happens-before edge between the write to the array and the read.
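A variant along those lines might look like the sketch below. It is only an illustration: AtomicArraySketch, the Integer element type, and the summing step are my assumptions, not part of the question. Each slot is written with set and read with get, which carry volatile write and read semantics per element.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical variant of the pattern that stores results in an AtomicReferenceArray.
class AtomicArraySketch {
    // Runs `workers` threads. Returns the sum computed by the last finisher,
    // or -1 if the completion step did not run exactly once.
    static int run(int workers) {
        AtomicReferenceArray<Integer> results = new AtomicReferenceArray<>(workers);
        AtomicInteger remaining = new AtomicInteger(workers);
        AtomicInteger completions = new AtomicInteger(0);
        AtomicInteger sum = new AtomicInteger(0);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                results.set(id, id + 1);                  // volatile-style write to the slot
                if (remaining.decrementAndGet() == 0) {   // only the last worker sees 0
                    int total = 0;
                    for (int k = 0; k < results.length(); k++) {
                        total += results.get(k);          // volatile-style read of the slot
                    }
                    sum.set(total);
                    completions.incrementAndGet();
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completions.get() == 1 ? sum.get() : -1;
    }
}
```

The counter is still what tells the last worker that everyone is done; the atomic array only removes the need to reason about visibility of the individual slots.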