2023-05-25 18:54:06


RISC-V li instruction

Question


In RISC-V assembly, "li" is a pseudoinstruction.
I have these instructions:

li      t2, 0x1800
csrc    mstatus, t2

The "li" is assembled into the following two instructions:

lui x7, 2
addi x7, x7, -2048

My questions: why 2 and -2048? And why is "li" assembled into lui and addi?
Is there a document for this kind of behavior?

I used "riscv64-unknown-elf-as" as the assembler.

Answer 1

Score: 5


> Is there a document for this kind of behavior?

This is not really considered behavior, but rather a clever yet well-known code sequence, used by assemblers and compilers, that shortens the composition of immediates.

The only processor behavior involved is sign extension of the 12-bit immediate in I-type instructions such as addi and lw (S-type stores such as sw sign-extend their 12-bit immediate in the same way).
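As a rough illustration of that sign extension, here is a minimal Python sketch. The helper name `sign_extend_12` is made up for this example and is not part of any toolchain; it only models how a 12-bit immediate field is interpreted as a two's-complement number.

```python
def sign_extend_12(imm12: int) -> int:
    """Interpret the low 12 bits of imm12 as a two's-complement value."""
    imm12 &= 0xFFF              # keep only the 12-bit immediate field
    if imm12 & 0x800:           # bit 11 is the sign bit of the field
        return imm12 - 0x1000   # negative: subtract 2**12
    return imm12


print(sign_extend_12(0x7FF))    # 2047, the largest positive immediate
print(sign_extend_12(0x800))    # -2048, sign bit set
# The same value as a 32-bit register pattern: the upper 20 bits are all ones.
print(hex(sign_extend_12(0x800) & 0xFFFFFFFF))   # 0xfffff800
```

Any immediate with bit 11 set therefore contributes all ones in the upper 20 bits, which is the effect the rest of this answer works around.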

The reason the designers do this is a combination of two things:

- They want to allow negative immediates for instructions like addi, as well as for lw and sw, deeming negative offsets sufficiently useful: they can be used for frame-pointer-relative addressing of local variables, or for reaching a header that immediately precedes a block, among other things.
- Further, they want the hardware to have only one kind of 12-bit extension, namely sign extension.

Taken together, these two points mean that lui followed by one of addi, lw, or sw can form a full 32-bit value or address, and all three combinations work the same way: because the second instruction sign-extends its immediate, the constant used for the lui may need to be incremented by one.

They didn't have to architect it this way. For example, they could have provided another instruction, a hypothetical addui, that clears the upper 20 bits of the immediate before adding; or they could have provided versions of lw and sw that do the same; or they could have defined lw and sw to support only 12-bit unsigned immediates.

But what they chose was a compromise: it allows negative immediates in general, and among the hardware alternatives it is the simpler one.

The designers have gone to some lengths to simplify the hardware, with embedded and otherwise power- or size-limited processors in mind.

> why 2 and -2048?

To avoid the sign-extension behavior of addi with negative 12-bit numbers, you would have to limit immediates to 11-bit unsigned values. That leaves the 12th bit, the sign bit, as zero, so the immediate is never negative in 12 bits and never extends with ones. For example, 0x400 fits in 11 bits, so with that we can do:

lui x7, 1
addi x7, x7, 0x400
addi x7, x7, 0x400

achieving 0x1000 + 0x400 + 0x400 = 0x1800.

However, as you can see, that involves three instructions!

To shorten the code sequence, we must take advantage of the 12th (sign) bit, even though it is going to be set. Once it is set, sign extension fills the upper 20 bits of the immediate with ones, so the upper 20 bits amount to -1 before the addi uses the value.

That -1 in the upper 20 bits needs to be offset by +1 in the upper 20 bits to obtain the desired number, and that +1 is applied in the lui instruction: hence lui x7, 2 instead of lui x7, 1, followed by an addi whose 12-bit immediate field holds 0x800, for a two-instruction sequence. 0x800 taken as a signed 12-bit number is -2048, which is why the disassembly shows 2 and -2048: 0x2000 = 8192; 8192 + (-2048) = 6144; 6144 = 0x1800.
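The arithmetic above can be checked with a small Python sketch. It is only a model under stated assumptions: registers are treated as 32 bits wide, and the helper names (`sign_extend_12`, `split_for_lui_addi`, `lui_addi`) are invented for this example rather than taken from any assembler.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_12(imm12: int) -> int:
    """Interpret the low 12 bits as a signed two's-complement value."""
    imm12 &= 0xFFF
    return imm12 - 0x1000 if imm12 & 0x800 else imm12


def split_for_lui_addi(value: int) -> tuple[int, int]:
    """Return (hi20, lo12) so that lui hi20 then addi lo12 rebuilds value."""
    value &= MASK32
    lo12 = sign_extend_12(value)                # what addi will actually add
    hi20 = ((value - lo12) >> 12) & 0xFFFFF     # compensate in the upper part
    return hi20, lo12


def lui_addi(hi20: int, lo12: int) -> int:
    """Model lui rd, hi20 followed by addi rd, rd, lo12 (32-bit wraparound)."""
    reg = (hi20 << 12) & MASK32
    return (reg + lo12) & MASK32


hi, lo = split_for_lui_addi(0x1800)
print(hi, lo)                  # 2 -2048
print(hex(lui_addi(hi, lo)))   # 0x1800
print(hex(lui_addi(1, -2048))) # 0x800: without the +1, the result is 0x1000 too small
```

The last line shows what happens if the lui constant is not incremented: the sign-extended addi immediate pulls the result down by 0x1000.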

Answer 2

Score: 0


You may check RISCV_CONST_HIGH_PART in the binutils source code for more details.
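For readers who do not want to dig through the headers, here is a Python sketch of the rounding idea that macro is understood to implement: add half of the 12-bit reach, then mask off the low 12 bits. This is an assumption about the macro's shape, not a copy of it, and the Python names below are invented for this example; the binutils headers remain the authoritative definition.

```python
IMM_BITS = 12
IMM_REACH = 1 << IMM_BITS   # 4096: the span covered by a 12-bit immediate


def const_high_part(value: int) -> int:
    """Round value to the nearest multiple of 4096 (assumed macro behavior)."""
    return (value + IMM_REACH // 2) & ~(IMM_REACH - 1)


def const_low_part(value: int) -> int:
    """Signed remainder, which always fits in a 12-bit signed immediate."""
    return value - const_high_part(value)


value = 0x1800
print(const_high_part(value) >> IMM_BITS)   # 2, the lui operand
print(const_low_part(value))                # -2048, the addi operand
```

Rounding to the nearest multiple of 4096, instead of truncating, is what produces the incremented lui operand whenever bit 11 of the constant is set.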
