Get a `&mut u8` reference to part of a `&mut u32`

Question

To convert a `u32` to its bytes, I know there is already `u32::to_be_bytes` (along with its `to_le_bytes` and `to_ne_bytes` siblings).

However, is there any sound and cross-platform way to convert a `&mut u32` into references to its bytes, as `&mut u8`?

Like this:

    fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
        todo!()
    }

    fn main() {
        let mut num: u32 = 0x12345678;
        // Prints `0x12345678 - [12, 34, 56, 78]`
        println!("{:#x} - {:x?}", num, num.to_be_bytes());

        let parts = as_be_bytes_mut(&mut num);
        *parts[2] = 0xfe;

        // Should print `0x1234fe78 - [12, 34, fe, 78]`
        println!("{:#x} - {:x?}", num, num.to_be_bytes());
    }

The rationale is that it should be theoretically possible, because there are no invalid states of a u32, no matter how you modify its underlying bytes.
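For comparison, here is a minimal copy-based baseline that needs no `unsafe`. It does not hand out references, so it is not what the question asks for, but it shows the intended semantics. The name `set_be_byte` is made up for this sketch and is not a standard library API.

```rust
/// Copy-based baseline: overwrite the byte at big-endian position `idx`
/// (0 = most significant byte) and write the whole value back.
/// `set_be_byte` is a hypothetical helper, not part of std.
pub fn set_be_byte(num: &mut u32, idx: usize, value: u8) {
    // Copy the bytes out in big-endian order, independent of the target.
    let mut bytes = num.to_be_bytes();
    // Panics if `idx >= 4`, like any out-of-range array index.
    bytes[idx] = value;
    // Any four bytes form a valid `u32`, so this conversion cannot fail.
    *num = u32::from_be_bytes(bytes);
}
```

Calling `set_be_byte(&mut num, 2, 0xfe)` on `0x12345678` leaves `num` at `0x1234fe78` on any target, because the byte order is fixed by `to_be_bytes` and `from_be_bytes` rather than by the memory layout.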


Attempt:

    fn as_ne_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
        unsafe {
            let ptr: *mut u8 = (num as *mut u32).cast();
            [
                &mut *ptr.add(0),
                &mut *ptr.add(1),
                &mut *ptr.add(2),
                &mut *ptr.add(3),
            ]
        }
    }

    fn main() {
        let mut num: u32 = 0x12345678;
        println!("{:#x} - {:x?}", num, num.to_be_bytes());

        let parts = as_ne_bytes_mut(&mut num);
        *parts[1] = 0xfe;

        println!("{:#x} - {:x?}", num, num.to_be_bytes());
    }

Output:

    0x12345678 - [12, 34, 56, 78]
    0x1234fe78 - [12, 34, fe, 78]

I think (tm) this is sound, because a `u32` is a packed array of `u8`s, `u8`s are always correctly aligned, and the lifetimes should also match. I haven't found a way to implement `as_be_bytes_mut`/`as_le_bytes_mut` yet. I'm also not 100% sure this is sound, so some feedback would help.
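The layout facts this argument leans on can at least be written down as compile-time checks. This is only a sketch: it documents the size and alignment assumptions and does not by itself prove the `unsafe` code sound. The helper name `layout_assumptions_hold` is made up for illustration.

```rust
use std::mem::{align_of, size_of};

// Compile-time checks: the build fails if any of these does not hold.
const _: () = assert!(size_of::<u32>() == 4);
const _: () = assert!(size_of::<[u8; 4]>() == size_of::<u32>());
const _: () = assert!(align_of::<u8>() == 1);

/// Hypothetical run-time mirror of the checks above, handy in a unit test.
pub fn layout_assumptions_hold() -> bool {
    size_of::<u32>() == 4
        && size_of::<[u8; 4]>() == size_of::<u32>()
        && align_of::<u8>() == 1
        && align_of::<[u8; 4]>() == 1
}
```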

Answer 1

Score: 2

It should be sound to get a `&mut [u8; 4]` out of it.

    pub fn as_ne_bytes_mut(num: &mut u32) -> &mut [u8; 4] {
        unsafe {
            let arr: *mut [u8; 4] = (num as *mut u32).cast();
            &mut *arr
        }
    }

This is much better than `[&mut u8; 4]` since it's a no-op. However, you aren't going to be able to create `le` and `be` versions unless you swap the bytes around and return an `impl DerefMut<Target = [u8; 4]>` guard that swaps the bytes back when it's dropped.
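A guard along those lines might look like the sketch below. It is one possible shape, not the answer's own code: instead of swapping the bytes in place it keeps a big-endian copy and writes it back on drop, which avoids `unsafe` entirely. The names `BeBytesGuard` and `as_be_bytes_guard` are made up for illustration.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical guard: holds a big-endian copy of the bytes and writes
/// them back into the borrowed `u32` when it is dropped.
pub struct BeBytesGuard<'a> {
    num: &'a mut u32,
    bytes: [u8; 4],
}

/// Hypothetical constructor for the guard.
pub fn as_be_bytes_guard(num: &mut u32) -> BeBytesGuard<'_> {
    let bytes = num.to_be_bytes();
    BeBytesGuard { num, bytes }
}

impl Deref for BeBytesGuard<'_> {
    type Target = [u8; 4];

    fn deref(&self) -> &[u8; 4] {
        &self.bytes
    }
}

impl DerefMut for BeBytesGuard<'_> {
    fn deref_mut(&mut self) -> &mut [u8; 4] {
        &mut self.bytes
    }
}

impl Drop for BeBytesGuard<'_> {
    fn drop(&mut self) {
        // Commit the (possibly modified) big-endian bytes to the original.
        *self.num = u32::from_be_bytes(self.bytes);
    }
}
```

One caveat of this design: the write-back happens only in `drop`, so leaking the guard (for example with `std::mem::forget`) silently discards the changes.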

I would probably go with four functions that return one byte each instead of `le` and `be` functions that return `[&mut u8; 4]`. Although if you use them the same, they should optimize to the same thing.

    pub fn most_sig_byte_0(num: &mut u32) -> &mut u8 {
        if cfg!(target_endian = "big") {
            &mut as_ne_bytes_mut(num)[0]
        } else {
            &mut as_ne_bytes_mut(num)[3]
        }
    }

    pub fn most_sig_byte_1(num: &mut u32) -> &mut u8 {
        if cfg!(target_endian = "big") {
            &mut as_ne_bytes_mut(num)[1]
        } else {
            &mut as_ne_bytes_mut(num)[2]
        }
    }

    pub fn most_sig_byte_2(num: &mut u32) -> &mut u8 {
        if cfg!(target_endian = "big") {
            &mut as_ne_bytes_mut(num)[2]
        } else {
            &mut as_ne_bytes_mut(num)[1]
        }
    }

    pub fn most_sig_byte_3(num: &mut u32) -> &mut u8 {
        if cfg!(target_endian = "big") {
            &mut as_ne_bytes_mut(num)[3]
        } else {
            &mut as_ne_bytes_mut(num)[0]
        }
    }

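Miri thinks these are fine, at least. You could make them a single const generic function, but it would be less ergonomic to give it the `0..4` bound. ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=35158fde09016e23b8eaa064bfd209c7))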
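For illustration, a const generic version could be sketched as follows. The name `most_sig_byte` is made up here, and the `0..4` bound is only enforced with a run-time `assert!`, which is exactly the ergonomic gap mentioned above.

```rust
/// Reinterpret the `u32` as its four bytes in native order.
pub fn as_ne_bytes_mut(num: &mut u32) -> &mut [u8; 4] {
    // SAFETY: `[u8; 4]` has the same size as `u32`, an alignment of 1,
    // and every bit pattern is valid for both types.
    unsafe { &mut *(num as *mut u32).cast::<[u8; 4]>() }
}

/// Hypothetical single function: `N = 0` selects the most significant byte.
pub fn most_sig_byte<const N: usize>(num: &mut u32) -> &mut u8 {
    // The bound is not expressed in the type, so check it at run time.
    assert!(N < 4, "byte index out of range");
    // Map the significance index onto the native memory order.
    let idx = if cfg!(target_endian = "big") { N } else { 3 - N };
    &mut as_ne_bytes_mut(num)[idx]
}
```

Usage then reads `*most_sig_byte::<2>(&mut num) = 0xfe;`, which turns `0x12345678` into `0x1234fe78` regardless of the target's endianness.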

Answer 2

Score: 1

This can be achieved with the help of `cfg(target_endian = "...")`:

    pub fn as_ne_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
        unsafe {
            let ptr: *mut u8 = (num as *mut u32).cast();
            [
                &mut *ptr.add(0),
                &mut *ptr.add(1),
                &mut *ptr.add(2),
                &mut *ptr.add(3),
            ]
        }
    }

    pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
        let mut b = as_ne_bytes_mut(num);
        // Native order is already big-endian on big-endian targets.
        #[cfg(target_endian = "little")]
        b.reverse();
        b
    }

    pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
        let mut b = as_be_bytes_mut(num);
        b.reverse();
        b
    }

    fn main() {
        let mut num: u32 = 0x12345678;
        // Prints `0x12345678 - [12, 34, 56, 78]`
        println!("{:#x} - {:x?}", num, num.to_be_bytes());

        let parts = as_be_bytes_mut(&mut num);
        *parts[2] = 0xfe;

        // Should print `0x1234fe78 - [12, 34, fe, 78]`
        println!("{:#x} - {:x?}", num, num.to_be_bytes());
    }

Output:

    0x12345678 - [12, 34, 56, 78]
    0x1234fe78 - [12, 34, fe, 78]

While it does look a little convoluted, the compiler manages to optimize it perfectly:

    example::as_ne_bytes_mut:
            mov     rax, rdi
            lea     rcx, [rsi + 1]
            lea     rdx, [rsi + 2]
            mov     qword ptr [rdi], rsi
            add     rsi, 3
            mov     qword ptr [rdi + 8], rcx
            mov     qword ptr [rdi + 16], rdx
            mov     qword ptr [rdi + 24], rsi
            ret

    example::as_be_bytes_mut:
            mov     rax, rdi
            lea     rcx, [rsi + 1]
            lea     rdx, [rsi + 2]
            lea     rdi, [rsi + 3]
            mov     qword ptr [rax], rdi
            mov     qword ptr [rax + 24], rsi
            mov     qword ptr [rax + 8], rdx
            mov     qword ptr [rax + 16], rcx
            ret

    example::as_le_bytes_mut:
            mov     rax, rdi
            lea     rcx, [rsi + 1]
            lea     rdx, [rsi + 2]
            mov     qword ptr [rdi], rsi
            add     rsi, 3
            mov     qword ptr [rdi + 24], rsi
            mov     qword ptr [rdi + 8], rcx
            mov     qword ptr [rdi + 16], rdx
            ret

Answer 3

Score: 1

It is possible to do this without pointer arithmetic.

Code:

    pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
        let num_slice = std::slice::from_mut(num);
        let (pref, middle, suff): (_, &mut [u8], _) = unsafe {
            num_slice.align_to_mut()
        };
        // This would be always true when we cast to u8
        assert!(pref.is_empty() && suff.is_empty());

        match middle {
            #[cfg(target_endian = "little")]
            [a, b, c, d] => [a, b, c, d],
            #[cfg(target_endian = "big")]
            [a, b, c, d] => [d, c, b, a],
            _ => unreachable!()
        }
    }

    pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
        let mut r = as_le_bytes_mut(num);
        r.reverse();
        r
    }

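And it compiles to nice assembly too: [godbolt link](https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:1,lang:rust,selection:(endColumn:2,endLineNumber:22,positionColumn:1,positionLineNumber:1,selectionStartColumn:2,selectionStartLineNumber:22,startColumn:1,startLineNumber:1),source:'pub+fn+as_le_bytes_mut(num:+%26mut+u32)-%3E%5B%26mut+u8%3B+4%5D%7B%0A++++let+num_slice+%3D+std::slice::from_mut(num)%3B%0A++++let+(pref,+middle,+suff):+(_,%26mut+%5Bu8%5D,_)+%3D+unsafe%7B%0A++++++++num_slice.align_to_mut()%0A++++%7D%3B%0A++++//+This+would+be+always+true+when+we+cast+to+u8%0A++++assert!!(pref.is_empty()+%26%26+suff.is_empty())%3B%0A++++%0A++++match+middle+%7B%0A++++++++%23%5Bcfg(target_endian+%3D+%22little%22)%5D%0A++++++++%5Ba,+b,+c,+d%5D+%3D%3E+%5Ba,+b,+c,+d%5D,%0A++++++++%23%5Bcfg(target_endian+%3D+%22big%22)%5D%0A++++++++%5Ba,+b,+c,+d%5D+%3D%3E+%5Bd,+c,+b,+a%5D,%0A++++++++_+%3D%3E+unreachable!!()%0A++++%7D%0A%7D%0A%0Apub+fn+as_be_bytes_mut(num:+%26mut+u32)-%3E%5B%26mut+u8%3B+4%5D%7B%0A++++let+mut+r+%3D+as_le_bytes_mut(num)%3B%0A++++r.reverse()%3B%0A++++r%0A%7D'),l:'5',n:'0',o:'',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0').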
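As a further variation, the array cast from Answer 1 can be combined with the destructuring used here, which avoids both pointer arithmetic and the `unreachable!()` arm. This is a sketch under the same layout assumptions as the other answers, and the name `as_le_bytes_mut_via_array` is made up for illustration.

```rust
/// Hypothetical variant: cast once to `&mut [u8; 4]`, then destructure.
pub fn as_le_bytes_mut_via_array(num: &mut u32) -> [&mut u8; 4] {
    // SAFETY: `[u8; 4]` has the same size as `u32`, an alignment of 1,
    // and every bit pattern is valid for both types.
    let bytes: &mut [u8; 4] = unsafe { &mut *(num as *mut u32).cast::<[u8; 4]>() };

    // Destructuring a `&mut [u8; 4]` yields four disjoint `&mut u8` borrows,
    // and the pattern is irrefutable because the length is part of the type.
    let [a, b, c, d] = bytes;

    if cfg!(target_endian = "little") {
        [a, b, c, d]
    } else {
        [d, c, b, a]
    }
}
```

Index 0 of the returned array is the least significant byte on every target, so writing `0xfe` through it turns `0x12345678` into `0x123456fe`.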

Posted by huangapple on 2023-05-26 13:38:14
Source: https://go.coder-hub.com/76337924.html