May 29, 2023 02:08:32

Will memory write be visible after sending an IPI on x86?

Question

I have read Intel 64 and IA-32 Architectures SDM vol 3A, 9.2 MEMORY ORDERING, but there was one question that kept bothering me.

Suppose I first write to a memory address and then send an interprocessor interrupt (IPI) using x2APIC, which means sending the IPI does not itself require a memory write (only a wrmsr). When another core receives the IPI and reads that memory, will it read the correct value?

For example:

Initially x = 0

Processor 0:

mov [ _x], 1
wrmsr       # use x2APIC to send IPI

Processor 1:

# receive IPI, in the interrupt service routine:
mov r1, [ _x]

Is r1 = 0 allowed?

Answer 1

Score: 3

That is an interesting question. On the face of it, one would think that since WRMSR is a serializing instruction it flushes the preceding memory writes and all is well. Even then, to quote the manual:

> These instructions force the processor to complete all modifications
> to flags, registers, and memory by previous instructions and to drain
> all buffered writes to memory before the **next instruction** is fetched
> and executed.

(Emphasis mine)

It says nothing about ordering with respect to sending the IPI, because the IPI is sent as part of the current instruction, not the next one. In theory, then, the other core could execute mov r1, [ _x] while the originating core is still draining its buffered writes. In practice this is very unlikely, since the target core must first service the interrupt, which probably has much higher latency.

As @harold mentioned, this point is moot since WRMSR is not always serializing. Reading the footnote that I initially missed:

> WRMSR to the IA32_TSC_DEADLINE MSR (MSR index 6E0H) and the X2APIC
> MSRs (MSR indices 802H to 83FH) are not serializing.

So there is absolutely no guarantee that the write to x is flushed before the IPI is sent.

Answer 2

Score: 2

From Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1

11.12.3 MSR Access in x2APIC Mode

To allow for efficient access to the APIC registers in x2APIC mode, the serializing semantics of WRMSR are relaxed when writing to the APIC registers. Thus, system software should not use “WRMSR to APIC registers in x2APIC mode” as a serializing instruction. Read and write accesses to the APIC registers will occur in program order. A WRMSR to an APIC register may complete before all preceding stores are globally visible; software can prevent this by inserting a serializing instruction or the sequence MFENCE;LFENCE before the WRMSR.

The RDMSR instruction is not serializing and this behavior is unchanged when reading APIC registers in x2APIC mode. System software accessing the APIC registers using the RDMSR instruction should not expect a serializing behavior. (Note: The MMIO-based xAPIC interface is mapped by system software as an un-cached region. Consequently, read/writes to the xAPIC-MMIO interface have serializing semantics in the xAPIC mode.)

However, I still don't know whether this also applies to AMD processors.
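
The MFENCE;LFENCE sequence from the quoted SDM passage could be sketched in C roughly as follows. This is only an illustration, assuming GCC or Clang inline assembly, ring-0 execution, and an x2APIC that is already enabled. The helper names `x2apic_icr_fixed` and `send_ipi_after_stores` are hypothetical, and the ICR value assumes a fixed-delivery IPI with a physical destination (all other ICR fields left at zero).

```c
#include <stdint.h>

/* In x2APIC mode the Interrupt Command Register is a single 64-bit MSR. */
#define X2APIC_ICR_MSR 0x830u

/* Hypothetical helper: build an ICR value for a fixed-delivery IPI with a
 * physical destination. Destination APIC ID goes in bits 63:32, the vector
 * in bits 7:0; every other field is assumed to be zero. */
static inline uint64_t x2apic_icr_fixed(uint32_t dest_apic_id, uint8_t vector)
{
    return ((uint64_t)dest_apic_id << 32) | vector;
}

#if defined(__x86_64__) || defined(__i386__)
/* Hypothetical helper: make earlier stores globally visible, then send the
 * IPI. WRMSR is privileged, so this must only be called from ring 0. */
static inline void send_ipi_after_stores(uint32_t dest_apic_id, uint8_t vector)
{
    uint64_t icr = x2apic_icr_fixed(dest_apic_id, vector);

    __asm__ volatile(
        "mfence\n\t"   /* drain preceding stores */
        "lfence\n\t"   /* keep WRMSR from starting before MFENCE completes */
        "wrmsr"        /* ECX = MSR index, EDX:EAX = 64-bit value */
        :
        : "c"(X2APIC_ICR_MSR),
          "a"((uint32_t)icr),
          "d"((uint32_t)(icr >> 32))
        : "memory");
}
#endif
```

With this pattern, processor 0 in the question would perform the store to x and then call `send_ipi_after_stores`, so the store is globally visible before the x2APIC write that triggers the interrupt.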
