Assembly code for operating on bit fields in C behaves weirdly in QEMU
Question
I ran into some strange behavior while doing the 2018 version of MIT 6.828, whose labs run on QEMU with an emulated 80386 CPU:
What I want to do is initialize the receive process for the Intel 82540EM chip, also known as the E1000. I basically just write some bytes to the device's registers.
First I defined a structure with bit fields, since it is actually a register in hardware:
struct rx_addr_reg {
// low 32 bit
unsigned ral : 32; // 0 - 31
// high 32 bit
unsigned rah : 16; // 0 - 15
unsigned as : 2; // 16 - 17
unsigned rs : 13; // 18 - 30
unsigned av : 1; // 31
};
I decided to use it via C macros:
#define E1000_RA 0x05400 /* Receive Address - RW Array */
#define E1000_RAH_AV 0x80000000 /* Receive descriptor valid */
#define E1000_GET_REG(base,reg) \
{ ((void*)(base) + (reg)) }
#define E1000_SET_RECEIVE_ADDR_REG(addr,as,rs,av) (struct rx_addr_reg)\
{ (addr >> 16) & 0xffffffff, (addr) & 0xffff, \
(as) & 0x3, (rs) & 0x1fff, (av) & 0x1 }
Then, in my .c file, I try to access and initialize the register:
// Receive Initialization
// Program the Receive Address Registers (RAL/RAH) with the desired Ethernet addresses
struct rx_addr_reg* rar = (struct rx_addr_reg*) E1000_GET_REG(e1000_va, E1000_RA);
*rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1);
What I expected to see for rar in memory is something like:
Memory address: content
0x????????: 0x12005452 0x80005634
However, the result ended up as:
Memory address: content
0x????????: 0x12005452 0x00000080
That is weird, so I checked the program in GDB:
+ target remote localhost:26000
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0: ljmp $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file obj/kern/kernel
(gdb) br e1000.c:64
Breakpoint 1 at 0xf0107470: file kern/e1000.c, line 64.
(gdb) si
[f000:e05b] 0xfe05b: cmpl $0x0,%cs:0x6ac8
0x0000e05b in ?? ()
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0xf0107470 <pci_e1000_attach+264>: movl $0x60200a,0x410(%eax)
Breakpoint 1, pci_e1000_attach (pcif=0xf012af10) at kern/e1000.c:64
64 *(uint32_t*)((char*)e1000_va + E1000_TIPG) |= 10 | 8 << 10 | 6 << 20;
(gdb) si
=> 0xf010747a <pci_e1000_attach+274>: movl $0x12005452,0x5400(%eax)
82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb)
=> 0xf0107484 <pci_e1000_attach+284>: movw $0x5634,0x5404(%eax)
0xf0107484 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb)
=> 0xf010748d <pci_e1000_attach+293>: andb $0xfc,0x5406(%eax)
0xf010748d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00005634
(gdb) si
=> 0xf0107494 <pci_e1000_attach+300>: andw $0x8003,0x5406(%eax)
0xf0107494 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000034
(gdb) si
=> 0xf010749d <pci_e1000_attach+309>: orb $0x80,0x5407(%eax)
0xf010749d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000000
(gdb) si
=> 0xf01074a4 <pci_e1000_attach+316>: movl $0x1,0xc(%esp)
86 cprintf("[RAH:RAL] [av]: [%x:%x] [%x]\n", rar->rah, rar->ral, rar->av);
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000080
(gdb)
Below are the points that I cannot understand:
- The assembly code tries to AND the byte at 0x5406(%eax) with 0xfc, but it actually seems to clear the byte at 0x5405:
(gdb)
=> 0xf010748d <pci_e1000_attach+293>: andb $0xfc,0x5406(%eax)
0xf010748d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00005634
(gdb) si
=> 0xf0107494 <pci_e1000_attach+300>: andw $0x8003,0x5406(%eax)
0xf0107494 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000034
- Then something goes wrong with the ANDW: it seems to clear the byte at 0x5404(%eax):
(gdb) si
=> 0xf0107494 <pci_e1000_attach+300>: andw $0x8003,0x5406(%eax)
0xf0107494 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000034
(gdb) si
=> 0xf010749d <pci_e1000_attach+309>: orb $0x80,0x5407(%eax)
0xf010749d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000000
- Finally, the ORB modifies the byte at 0x5404(%eax), when it should OR the byte at 0x5407(%eax):
(gdb) si
=> 0xf010749d <pci_e1000_attach+309>: orb $0x80,0x5407(%eax)
0xf010749d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000000
(gdb) si
=> 0xf01074a4 <pci_e1000_attach+316>: movl $0x1,0xc(%esp)
86 cprintf("[RAH:RAL] [av]: [%x:%x] [%x]\n", rar->rah, rar->ral, rar->av);
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000080
- BTW, when I try to print the bytes at 0x5400(%eax), why does GDB refuse to do so and only show the content at 4-byte-aligned addresses?
(gdb) x/xw $eax+0x5404
0xef809404: 0x00000034
(gdb) x/xw $eax+0x5406
0xef809406: 0x00000034
(gdb) x/xb $eax+0x5406
0xef809406: 0x34
(gdb) x/xb $eax+0x5404
0xef809404: 0x34
One point that I think might explain the problem, though I'm not sure:
The structure I defined is 8 bytes long, and the system is running in 32-bit mode. So if the device does not allow writes to individual bit fields and only accepts whole 4-byte writes, the behavior may be reasonable.
I'd really appreciate your answers!
Answer 1
Score: 7
This hardware defines its registers as 32 bits wide. That means you need to read and write them 32 bits at a time. Your C code doesn't do anything to ensure that happens; the compiler assumes that you're reading and writing plain old RAM when you operate on pointers to structs. For RAM it is fine to update sub-fields in a 32-bit value by reading and writing less than 32 bits at a time, and that is what the code the compiler generates is doing, with its byte and word operations. However, this will not work correctly on a device register. (QEMU's implementation will ignore the byte and word access attempts; you can see this as well when you try to access the device via the gdbstub.)
So you can't just define a struct with bitfields that line up with the registers in the specification and expect writing to an individual bitfield to work correctly. If you want to update an individual field in a register, you should read the whole 32-bit register, update the relevant part of the value, and then write the whole 32-bit value back again. (Often you want to update all the fields at once, in which case you can just do a write of the full new value without having to do a read first.)
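As a rough illustration of composing the full value first, the sketch below builds the two 32-bit words with shifts and masks instead of a bitfield struct. The names make_ral, make_rah and the RAH_* constants are hypothetical; the field positions simply mirror the struct and macro in the question and have not been checked against the 82540EM datasheet.

```c
#include <stdint.h>

/* Field positions inside the high 32-bit word, copied from the
 * bitfield struct in the question (assumed, not datasheet-verified). */
#define RAH_ADDR_MASK 0x0000ffffu /* bits 0-15  */
#define RAH_AS_SHIFT  16          /* bits 16-17 */
#define RAH_RS_SHIFT  18          /* bits 18-30 */
#define RAH_AV_SHIFT  31          /* bit 31     */

/* Low word: same expression as the first initializer in the question's macro. */
static inline uint32_t make_ral(uint64_t addr)
{
    return (uint32_t)(addr >> 16);
}

/* High word: every field is masked, shifted into place and OR-ed together,
 * so the result can be stored with a single 32-bit write. */
static inline uint32_t make_rah(uint64_t addr, uint32_t as, uint32_t rs, uint32_t av)
{
    return ((uint32_t)addr & RAH_ADDR_MASK)
         | ((as & 0x3u) << RAH_AS_SHIFT)
         | ((rs & 0x1fffu) << RAH_RS_SHIFT)
         | ((av & 0x1u) << RAH_AV_SHIFT);
}
```

With the arguments from the question (0x120054525634, 0, 0, 1), these helpers produce the two words the asker expected to see, 0x12005452 and 0x80005634, each of which can then be written to the device as one whole 32-bit value.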
You also want to make sure the compiler doesn't think this is just RAM and so it can happily reorder, merge or drop updates. Personally, I like the Linux kernel's approach of defining functions for doing accesses that eventually boil down to asm loads and stores, so that it's always 100% clear exactly what the generated code will be doing; there are other approaches too.
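One of those other approaches, sketched here under stated assumptions, is a pair of accessor functions that go through a volatile uint32_t pointer. The names mmio_read32, mmio_write32 and mmio_update32 are hypothetical and are not the Linux kernel's API. The volatile qualifier stops the compiler from merging or dropping the accesses, and in practice GCC emits a single aligned 32-bit load or store for each one, but unlike inline asm that is not a hard guarantee of the C standard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor: one aligned 32-bit load from base + off. */
static inline uint32_t mmio_read32(volatile void *base, uint32_t off)
{
    assert((off & 3u) == 0); /* registers are 32 bits wide and 4-byte aligned */
    return *(volatile uint32_t *)((volatile char *)base + off);
}

/* Hypothetical accessor: one aligned 32-bit store to base + off. */
static inline void mmio_write32(volatile void *base, uint32_t off, uint32_t val)
{
    assert((off & 3u) == 0);
    *(volatile uint32_t *)((volatile char *)base + off) = val;
}

/* Read-modify-write: read the whole word, change only the bits selected
 * by mask, then write the whole word back. */
static inline void mmio_update32(volatile void *base, uint32_t off,
                                 uint32_t mask, uint32_t val)
{
    uint32_t v = mmio_read32(base, off);
    v = (v & ~mask) | (val & mask);
    mmio_write32(base, off, v);
}
```

In the question's terms, setting only the AV bit would then look like mmio_update32(e1000_va, E1000_RA + 4, E1000_RAH_AV, E1000_RAH_AV), while programming both words at once needs just two mmio_write32 calls and no read.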