Why does the opcode for MOV from a segment register not have its low bit set? It's not 8-bit operand-size, so the W bit should be set

Question

I can't understand the opcode of the instruction MOV BX,CS. The first byte is 10001100, where the first 6 bits are the opcode, followed by the 1-bit direction field (D) and then the W bit. Why are the opcode bits 100011 and not 100010, and why is the W bit 0 and not 1? Is there a purpose behind this, or some condition I should watch out for?

I would really appreciate a detailed answer.

I tried to work it out with more than one method but couldn't arrive at the encoding shown in the picture above, and I watched a lot of videos without finding an answer that cleared up my doubt.

Answer 1

Score: 5


Some opcodes don't follow that pattern, including the ones for mov to/from segment registers. There's no 8-bit version of those instructions, so the low bit does not represent Width = 8 vs. 16-bit.

The low 2 bits being Direction and Width is not universal. It's just a pattern found in most ordinary ALU opcodes that come in pairs for 8-bit and 16/32/64-bit operand-size<sup>1</sup>, like and, add, cmp, etc. Those have both directions, immediate forms with both imm16 and sign-extended imm8 for 16-bit operand-size, and short al, imm8 / ax, imm16 encodings with no ModRM. Others, like shr, don't have both directions but do have both sizes.
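
To make the regular pattern concrete, here is a minimal Python sketch that splits a one-byte opcode into its high 6 bits, the d bit, and the w bit. The table `REGULAR` and the helpers `split_dw` and `describe` are names made up for this illustration; the table lists only a few well-known 8086 opcodes that do follow the pattern, not a complete opcode map.

```python
# A few 8086 opcodes that DO follow the d/w pattern (illustrative subset only).
REGULAR = {
    0x00: "add r/m8, r8",
    0x01: "add r/m16, r16",
    0x02: "add r8, r/m8",
    0x03: "add r16, r/m16",
    0x88: "mov r/m8, r8",
    0x89: "mov r/m16, r16",
    0x8A: "mov r8, r/m8",
    0x8B: "mov r16, r/m16",
}


def split_dw(opcode):
    """Return (high 6 bits, d bit, w bit) of a one-byte opcode."""
    return opcode >> 2, (opcode >> 1) & 1, opcode & 1


def describe(opcode):
    """Format one table entry with its bit fields spelled out."""
    base, d, w = split_dw(opcode)
    return f"{opcode:02X}: base={base:06b} d={d} w={w}  {REGULAR[opcode]}"


if __name__ == "__main__":
    for op in sorted(REGULAR):
        print(describe(op))
```

For these opcodes, w=0 always pairs with the 8-bit form and d=1 always makes the register the destination. Running `split_dw(0x8C)` gives base 100011 with d=0 and w=0, which is exactly the case where reading those bits as direction and width stops being meaningful.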

Even mov for non-segment registers has some special forms: the mov reg, imm form puts the register number in the low 3 bits of the opcode, instead of direction or width. And there's no mov r/m16, sign_extended_imm8, because 8086 had no use for that. (https://www.felixcloutier.com/x86/mov shows the various mov opcodes.)
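
As a rough illustration of the mov reg, imm form, the sketch below builds the 8086 encodings B0+r (8-bit) and B8+r (16-bit), where the register number sits in the low 3 bits of the opcode byte. The function name `encode_mov_reg_imm` and the lookup lists are hypothetical helpers written for this example, not part of any assembler.

```python
# Register numbers as used in x86 encodings (index = register number).
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def encode_mov_reg_imm(reg, imm):
    """Encode `mov reg, imm` for 16-bit code using the short B0+r / B8+r forms."""
    if reg in REG16:
        opcode = 0xB8 | REG16.index(reg)  # low 3 bits = register number
        return bytes([opcode]) + (imm & 0xFFFF).to_bytes(2, "little")
    if reg in REG8:
        opcode = 0xB0 | REG8.index(reg)   # same idea for the 8-bit registers
        return bytes([opcode, imm & 0xFF])
    raise ValueError(f"unknown register: {reg}")


if __name__ == "__main__":
    print(encode_mov_reg_imm("bx", 0x1234).hex())  # bb3412
    print(encode_mov_reg_imm("cl", 5).hex())       # b105
```

Here the low 2 bits of BB are 11 simply because BX is register number 3, not because of any direction or width setting.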


The mov r/m, Sreg opcode you're looking at is 8C. The four opcodes that differ only in the low 2 bits include mov Sreg, r/m, but also two unrelated opcodes that have the low bit set. All four exist only with 16-bit operand-size, so the low bit doesn't really mean W for any of them, although a set low bit does happen to be consistent with the operand-size of lea and pop.

  • 8C is mov r/m, Sreg, the opcode you're asking about.

  • 8D is lea

  • 8E is mov Sreg, r/m (which does exist with the d bit set, and works with any segment reg other than CS in the /r field of ModRM. Encoding it with a CS destination is possible, but it will raise a #UD exception on CPUs after 8086. 8086 didn't have an illegal-instruction exception; every bit-pattern ran as something. Some 8086 models even ran mov to CS as a jump.)

  • 8F is pop r/m.

See http://ref.x86asm.net/coder32.html#x8C for a list of instructions by opcode. There are many other exceptions to the pattern, like cli and sti being FA and FB respectively.
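
The 8C / 8E encodings listed above can be sketched in a few lines of Python. This is only a toy under stated assumptions: it handles register operands (ModRM mod = 11) and the four 8086 segment registers, and the names `encode_mov_reg_from_sreg` and `decode_mov_sreg` are invented for the example.

```python
# Register numbers as they appear in the ModRM byte.
SREG = ["es", "cs", "ss", "ds"]                            # /r field for 8C and 8E
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]   # r/m field when mod = 11


def encode_mov_reg_from_sreg(dst, sreg):
    """Encode `mov dst, sreg` (opcode 8C) with a 16-bit register destination."""
    modrm = (0b11 << 6) | (SREG.index(sreg) << 3) | REG16.index(dst)
    return bytes([0x8C, modrm])


def decode_mov_sreg(code):
    """Decode a two-byte 8C or 8E instruction with register operands."""
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode not in (0x8C, 0x8E):
        raise ValueError("not a segment-register mov")
    if mod != 0b11 or reg > 3:
        raise ValueError("sketch handles only register operands and ES/CS/SS/DS")
    if opcode == 0x8C:
        return f"mov {REG16[rm]}, {SREG[reg]}"   # segment register is the source
    return f"mov {SREG[reg]}, {REG16[rm]}"       # segment register is the destination


if __name__ == "__main__":
    print(encode_mov_reg_from_sreg("bx", "cs").hex())  # 8ccb
    print(decode_mov_sreg(bytes([0x8E, 0xD8])))        # mov ds, ax
```

So MOV BX,CS assembles to 8C CB: the first byte is the 10001100 from the question, and the ModRM byte 11 001 011 selects CS in the /r field and BX in the r/m field.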


Footnote 1: The 16, 32, and 64-bit operand-sizes all share the same opcode. The default is implied by the current mode. A 66h prefix selects the opposite of the default (16 vs. 32-bit), and a REX.W prefix selects 64-bit.

There wasn't room for new opcodes when 386 and AMD64 were being designed, so they went with prefixes instead.
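
The prefix rule in the footnote can be written down as a small function. This is a simplified sketch for ordinary non-byte opcodes such as 01 (add r/m, r); the function name `operand_size` and its parameters are assumptions for the example, and it ignores the handful of instructions (push/pop, near branches) that default to 64-bit in 64-bit mode.

```python
def operand_size(mode, prefix_66=False, rex_w=False):
    """Effective operand-size in bits for a non-byte opcode.

    mode is the current code size: 16, 32, or 64.
    """
    if mode not in (16, 32, 64):
        raise ValueError("mode must be 16, 32, or 64")
    if rex_w:
        if mode != 64:
            raise ValueError("REX prefixes exist only in 64-bit mode")
        return 64                      # REX.W takes precedence over 66h
    default = 16 if mode == 16 else 32  # 64-bit mode still defaults to 32-bit
    if prefix_66:
        return 32 if default == 16 else 16
    return default


if __name__ == "__main__":
    # The same opcode bytes 01 D8 mean add ax, bx or add eax, ebx
    # depending on mode and prefixes; 48 01 D8 is add rax, rbx.
    print(operand_size(16))                  # 16
    print(operand_size(16, prefix_66=True))  # 32
    print(operand_size(64))                  # 32
    print(operand_size(64, rex_w=True))      # 64
```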

huangapple
  • Posted on May 25, 2023 at 19:40:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76331878.html