Get a `&mut u8` reference to part of a `&mut u32`
Question
To convert a `u32` to its bytes, I know there is already:
u32::to_le_bytes
u32::to_be_bytes
u32::to_ne_bytes
u32::from_le_bytes
u32::from_be_bytes
u32::from_ne_bytes
However, is there any sound and cross-platform way to convert a `&mut u32` into references to its bytes, as `&mut u8`?
Like this:
fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    todo!()
}

fn main() {
    let mut num: u32 = 0x12345678;

    // Prints `0x12345678 - [12, 34, 56, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());

    let parts = as_be_bytes_mut(&mut num);
    *parts[2] = 0xfe;

    // Should print `0x1234fe78 - [12, 34, fe, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
}
The rationale is that it should be theoretically possible, because a `u32` has no invalid states, no matter how you modify its underlying bytes.
Attempt:
fn as_ne_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    unsafe {
        let ptr: *mut u8 = (num as *mut u32).cast();
        [
            &mut *ptr.add(0),
            &mut *ptr.add(1),
            &mut *ptr.add(2),
            &mut *ptr.add(3),
        ]
    }
}

fn main() {
    let mut num: u32 = 0x12345678;
    println!("{:#x} - {:x?}", num, num.to_be_bytes());

    let parts = as_ne_bytes_mut(&mut num);
    *parts[1] = 0xfe;

    println!("{:#x} - {:x?}", num, num.to_be_bytes());
}
0x12345678 - [12, 34, 56, 78]
0x1234fe78 - [12, 34, fe, 78]
I think (tm) this is sound, because a `u32` is a packed array of `u8`s, `u8`s are always correctly aligned, and the lifetimes should also match. I haven't found a way to implement `as_be_bytes_mut` / `as_le_bytes_mut` yet. Also, I'm not 100% sure this is sound, so some feedback would help.
Answer 1
Score: 2
It should be sound to get a `&mut [u8; 4]` out of it.
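pub fn as_ne_bytes_mut(num: &mut u32) -> &mut [u8; 4] {
    // SAFETY: `u32` and `[u8; 4]` have the same size, `u8` has alignment 1,
    // and every bit pattern is valid for both types.
    unsafe {
        let arr: *mut [u8; 4] = (num as *mut u32).cast();
        &mut *arr
    }
}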
This is much better than `[&mut u8; 4]` since it's a no-op. However, you aren't going to be able to create `le` and `be` versions unless you swap the bytes around and return an `impl DerefMut<Target = [u8; 4]>` guard that swaps the bytes back when it's dropped.
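As a rough illustration of the guard idea, here is a minimal sketch. The type name `BeBytesGuard` is hypothetical and not from the answer, and it is a variation on the idea: instead of swapping the bytes of the `u32` in place, it keeps a big-endian copy and writes it back on drop, which avoids `unsafe` entirely.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical guard type (not part of the original answer).
/// Holds a big-endian copy of the bytes and writes it back on drop.
pub struct BeBytesGuard<'a> {
    num: &'a mut u32,
    bytes: [u8; 4],
}

impl<'a> BeBytesGuard<'a> {
    pub fn new(num: &'a mut u32) -> Self {
        // Take a big-endian snapshot of the current value.
        let bytes = num.to_be_bytes();
        Self { num, bytes }
    }
}

impl Deref for BeBytesGuard<'_> {
    type Target = [u8; 4];

    fn deref(&self) -> &[u8; 4] {
        &self.bytes
    }
}

impl DerefMut for BeBytesGuard<'_> {
    fn deref_mut(&mut self) -> &mut [u8; 4] {
        &mut self.bytes
    }
}

impl Drop for BeBytesGuard<'_> {
    fn drop(&mut self) {
        // Write the (possibly modified) big-endian bytes back into the `u32`.
        *self.num = u32::from_be_bytes(self.bytes);
    }
}
```

With this sketch, the write to the `u32` only becomes visible once the guard goes out of scope, and the guard keeps the `u32` mutably borrowed for as long as it lives.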
I would probably go with four functions that return one byte each instead of `le` and `be` functions that return `[&mut u8; 4]`. Although if you use them the same way, they should optimize to the same thing.
pub fn most_sig_byte_0(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[0]
    } else {
        &mut as_ne_bytes_mut(num)[3]
    }
}

pub fn most_sig_byte_1(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[1]
    } else {
        &mut as_ne_bytes_mut(num)[2]
    }
}

pub fn most_sig_byte_2(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[2]
    } else {
        &mut as_ne_bytes_mut(num)[1]
    }
}

pub fn most_sig_byte_3(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[3]
    } else {
        &mut as_ne_bytes_mut(num)[0]
    }
}
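Miri thinks these are fine, at least. You could make them a single const generic function, but it would be less ergonomic to give it the `0..4` bound. ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=35158fde09016e23b8eaa064bfd209c7))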
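For reference, the const generic variant might look roughly like the sketch below. The name `most_sig_byte` is made up for illustration, and the `0..4` bound is enforced here with a plain `assert!` (a panic at run time if violated) rather than a compile-time check, which is one simple way to handle the bound and not necessarily what the answer had in mind.

```rust
pub fn as_ne_bytes_mut(num: &mut u32) -> &mut [u8; 4] {
    // SAFETY: `u32` and `[u8; 4]` have the same size, `u8` has alignment 1,
    // and every bit pattern is valid for both types.
    unsafe { &mut *(num as *mut u32).cast::<[u8; 4]>() }
}

/// Hypothetical helper: returns the `N`-th byte counted from the most
/// significant one (`N = 0` is the most significant byte).
pub fn most_sig_byte<const N: usize>(num: &mut u32) -> &mut u8 {
    // `N` is a constant, so this check is trivially foldable by the optimizer.
    assert!(N < 4, "byte index must be in 0..4");
    // Map the "most significant first" index onto the native byte order.
    let idx = if cfg!(target_endian = "big") { N } else { 3 - N };
    &mut as_ne_bytes_mut(num)[idx]
}
```

A call then looks like `*most_sig_byte::<2>(&mut num) = 0xfe;`, which shows the ergonomic cost the answer mentions: the index has to be written with turbofish syntax.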
Answer 2
Score: 1
This can be achieved with the help of `cfg(target_endian = "...")`:
pub fn as_ne_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    unsafe {
        let ptr: *mut u8 = (num as *mut u32).cast();
        [
            &mut *ptr.add(0),
            &mut *ptr.add(1),
            &mut *ptr.add(2),
            &mut *ptr.add(3),
        ]
    }
}

pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let mut b = as_ne_bytes_mut(num);
    #[cfg(target_endian = "little")]
    b.reverse();
    b
}

pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let mut b = as_be_bytes_mut(num);
    b.reverse();
    b
}

fn main() {
    let mut num: u32 = 0x12345678;

    // Prints `0x12345678 - [12, 34, 56, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());

    let parts = as_be_bytes_mut(&mut num);
    *parts[2] = 0xfe;

    // Should print `0x1234fe78 - [12, 34, fe, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
}
0x12345678 - [12, 34, 56, 78]
0x1234fe78 - [12, 34, fe, 78]
While it does look a little convoluted, the compiler manages to optimize it perfectly:
example::as_ne_bytes_mut:
        mov     rax, rdi
        lea     rcx, [rsi + 1]
        lea     rdx, [rsi + 2]
        mov     qword ptr [rdi], rsi
        add     rsi, 3
        mov     qword ptr [rdi + 8], rcx
        mov     qword ptr [rdi + 16], rdx
        mov     qword ptr [rdi + 24], rsi
        ret

example::as_be_bytes_mut:
        mov     rax, rdi
        lea     rcx, [rsi + 1]
        lea     rdx, [rsi + 2]
        lea     rdi, [rsi + 3]
        mov     qword ptr [rax], rdi
        mov     qword ptr [rax + 24], rsi
        mov     qword ptr [rax + 8], rdx
        mov     qword ptr [rax + 16], rcx
        ret

example::as_le_bytes_mut:
        mov     rax, rdi
        lea     rcx, [rsi + 1]
        lea     rdx, [rsi + 2]
        mov     qword ptr [rdi], rsi
        add     rsi, 3
        mov     qword ptr [rdi + 24], rsi
        mov     qword ptr [rdi + 8], rcx
        mov     qword ptr [rdi + 16], rdx
        ret
Answer 3
Score: 1
It is possible to do this without pointer arithmetic.
Code:
pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let num_slice = std::slice::from_mut(num);
    let (pref, middle, suff): (_, &mut [u8], _) = unsafe {
        num_slice.align_to_mut()
    };
    // Always true when the target type is `u8`, which has alignment 1
    assert!(pref.is_empty() && suff.is_empty());
    match middle {
        #[cfg(target_endian = "little")]
        [a, b, c, d] => [a, b, c, d],
        #[cfg(target_endian = "big")]
        [a, b, c, d] => [d, c, b, a],
        _ => unreachable!()
    }
}

pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let mut r = as_le_bytes_mut(num);
    r.reverse();
    r
}
And it compiles to nice assembly too: godbolt link.
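The pattern-matching trick also combines with the `&mut [u8; 4]` cast from Answer 1. The following is only a sketch under that assumption: the helper name `as_ne_bytes_array_mut` is invented here, and the single `unsafe` cast is followed by a safe destructuring of the array reference into four disjoint `&mut u8` borrows.

```rust
/// Hypothetical helper name; the cast itself follows Answer 1.
pub fn as_ne_bytes_array_mut(num: &mut u32) -> &mut [u8; 4] {
    // SAFETY: `u32` and `[u8; 4]` have the same size, `u8` has alignment 1,
    // and every bit pattern is valid for both types.
    unsafe { &mut *(num as *mut u32).cast::<[u8; 4]>() }
}

pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    // Destructuring a `&mut [u8; 4]` binds each element as `&mut u8`.
    let [a, b, c, d] = as_ne_bytes_array_mut(num);
    if cfg!(target_endian = "big") {
        [a, b, c, d]
    } else {
        [d, c, b, a]
    }
}

pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let mut r = as_be_bytes_mut(num);
    r.reverse();
    r
}
```

Because the array pattern is irrefutable, this version needs neither the `assert!` nor the `unreachable!()` arm.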