Will a memory write be visible after sending an IPI on x86?
Question
I have read section 9.2, MEMORY ORDERING, of the Intel 64 and IA-32 Architectures SDM vol 3A, but one question keeps bothering me.
Suppose I first write to a memory address and then send an interprocessor interrupt (IPI) with x2APIC, which means that sending the IPI requires no memory write (just a wrmsr). When another core receives the IPI and reads that memory, will it read the new value?
For example:
Initially x = 0
Processor 0:
mov [_x], 1
wrmsr # use x2APIC to send IPI
Processor 1:
# receive IPI; in the interrupt service routine:
mov r1, [_x]
Is r1 = 0 allowed?
Answer 1
Score: 3
That is an interesting question. On the face of it, one would think that since WRMSR is a serializing instruction, it flushes the preceding memory writes and all is well. Even then, to quote the manual:
> These instructions force the processor to complete all modifications
> to flags, registers, and memory by previous instructions and to drain
> all buffered writes to memory before the next instruction is fetched
> and executed.
(Emphasis mine)
It says nothing about ordering with respect to sending the IPI, since that is part of the current instruction, not the next one. So in theory the other core could execute the mov r1, [_x] while the originating core is still draining its buffered writes, although that is very unlikely: the target core first has to service the interrupt, which probably has much higher latency.
As @harold mentioned, this point is moot, since WRMSR is not always serializing. From the footnote that I initially missed:
> WRMSR to the IA32_TSC_DEADLINE MSR (MSR index 6E0H) and the X2APIC
> MSRs (MSR indices 802H to 83FH) are not serializing.
So there is absolutely no guarantee that the write to x is flushed before the IPI is sent.
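The footnote can be written down as a small predicate. This is only a sketch in C: the function name `wrmsr_is_relaxed_per_footnote` is made up for illustration, and it encodes just the two exceptions quoted above, so other revisions of the manual may list more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the SDM or any kernel): returns true when
 * the quoted footnote says WRMSR to this MSR index is NOT serializing. */
static bool wrmsr_is_relaxed_per_footnote(uint32_t msr)
{
    if (msr == 0x6E0u)                      /* IA32_TSC_DEADLINE */
        return true;
    return msr >= 0x802u && msr <= 0x83Fu;  /* x2APIC MSR range */
}
```

The x2APIC Interrupt Command Register used to send IPIs is MSR 830H, which falls inside that range.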
Answer 2
Score: 2
From Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1
11.12.3 MSR Access in x2APIC Mode
> To allow for efficient access to the APIC registers in x2APIC mode, the serializing semantics of WRMSR are relaxed when writing to the APIC registers. Thus, system software should not use “WRMSR to APIC registers in x2APIC mode” as a serializing instruction. Read and write accesses to the APIC registers will occur in program order. A WRMSR to an APIC register may complete before all preceding stores are globally visible; software can prevent this by inserting a serializing instruction or the sequence MFENCE;LFENCE before the WRMSR.
>
> The RDMSR instruction is not serializing and this behavior is unchanged when reading APIC registers in x2APIC mode. System software accessing the APIC registers using the RDMSR instruction should not expect a serializing behavior. (Note: The MMIO-based xAPIC interface is mapped by system software as an un-cached region. Consequently, read/writes to the xAPIC-MMIO interface have serializing semantics in the xAPIC mode.)
However, I still don't know whether this also applies to AMD processors.
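For the Intel case, the MFENCE;LFENCE sequence recommended in the quoted section could be sketched roughly as follows. This is not a definitive implementation. It assumes GCC or Clang (inline asm and `__atomic` builtins), and the names `wrmsr_fn`, `fence_before_ipi`, `publish_and_send_ipi`, and `recording_wrmsr` are hypothetical. Because WRMSR is privileged, the actual MSR write is passed in as a callback, and the ICR value is simplified to a destination and a vector with every other field left at zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X2APIC_MSR_ICR 0x830u   /* x2APIC Interrupt Command Register */

/* Hypothetical hook standing in for the privileged WRMSR instruction. */
typedef void (*wrmsr_fn)(uint32_t msr, uint64_t value);

static volatile int x;          /* the shared variable from the question */

/* MFENCE;LFENCE as the manual suggests; a generic full fence is used on
 * non-x86-64 builds only so that the sketch still compiles there. */
static inline void fence_before_ipi(void)
{
#if defined(__x86_64__)
    __asm__ volatile("mfence\n\tlfence" ::: "memory");
#else
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

static void publish_and_send_ipi(wrmsr_fn wrmsr, uint32_t dest_apic_id,
                                 uint8_t vector)
{
    assert(wrmsr != NULL);
    x = 1;                       /* mov [_x], 1 */
    fence_before_ipi();          /* fence between the store and the WRMSR */
    /* Simplified ICR value: destination in bits 63:32, vector in bits 7:0,
     * all other fields zero (an assumption made for this sketch). */
    wrmsr(X2APIC_MSR_ICR, ((uint64_t)dest_apic_id << 32) | vector);
}

/* User-space stand-in for WRMSR: records its arguments and what x held. */
static uint32_t last_msr;
static uint64_t last_value;
static int x_seen_at_send;

static void recording_wrmsr(uint32_t msr, uint64_t value)
{
    last_msr = msr;
    last_value = value;
    x_seen_at_send = x;
}
```

Calling `publish_and_send_ipi(recording_wrmsr, 3, 0x40)` in user space only exercises the program order of the store, the fence, and the callback. It cannot demonstrate cross-core visibility, which would need a real WRMSR executed in ring 0.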