May 26, 2023, 13:38:14

Get a `&mut u8` reference to part of a `&mut u32`

Question

To convert a u32 to its bytes, I know there are already `u32::to_ne_bytes`, `u32::to_be_bytes`, and `u32::to_le_bytes`.

However, is there any sound and cross-platform way to convert a &mut u32 into references of its bytes, as &mut u8?

Like this:

fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    todo!()
}
fn main() {
    let mut num: u32 = 0x12345678;
    // Prints `0x12345678 - [12, 34, 56, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
    let parts = as_be_bytes_mut(&mut num);
    *parts[2] = 0xfe;
    // Should print `0x1234fe78 - [12, 34, fe, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
}

The rationale is that it should be theoretically possible, because there are no invalid states of a u32, no matter how you modify its underlying bytes.

Attempt:

fn as_ne_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    unsafe {
        let ptr: *mut u8 = (num as *mut u32).cast();
        [
            &mut *ptr.add(0),
            &mut *ptr.add(1),
            &mut *ptr.add(2),
            &mut *ptr.add(3),
        ]
    }
}
fn main() {
    let mut num: u32 = 0x12345678;
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
    let parts = as_ne_bytes_mut(&mut num);
    *parts[1] = 0xfe;
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
}

0x12345678 - [12, 34, 56, 78]
0x1234fe78 - [12, 34, fe, 78]

I think (tm) this is sound, because a u32 is a packed array of four u8s, a u8 is always correctly aligned, and the lifetimes should also match. I haven't found a way to implement as_be_bytes_mut or as_le_bytes_mut yet. Also, I'm not 100% sure this is sound, so some feedback would help.
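For comparison, the same edit can be done without any `unsafe` by round-tripping through the existing conversions. The sketch below is not from the question: `with_be_bytes_mut` and `demo_with_be_bytes_mut` are hypothetical names, and the helper works on a copy of the bytes rather than handing out references into the `u32` itself.

```rust
/// Hypothetical helper (not from the question): lets a closure edit the
/// big-endian bytes of `num`, then writes the result back.
pub fn with_be_bytes_mut<R>(num: &mut u32, edit: impl FnOnce(&mut [u8; 4]) -> R) -> R {
    // Copy the value out as big-endian bytes; index 0 is the most significant byte.
    let mut bytes = num.to_be_bytes();
    let result = edit(&mut bytes);
    // Every 4-byte pattern is a valid u32, so the write-back cannot fail.
    *num = u32::from_be_bytes(bytes);
    result
}

/// Usage sketch mirroring the `main` from the question.
pub fn demo_with_be_bytes_mut() -> u32 {
    let mut num: u32 = 0x12345678;
    with_be_bytes_mut(&mut num, |bytes| bytes[2] = 0xfe);
    num
}
```

Because the byte order is fixed by `to_be_bytes` and `from_be_bytes`, this behaves the same on little-endian and big-endian targets. It does not answer the question of getting real `&mut u8` references, though.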

Answer 1

Score: 2

It should be sound to get a &mut [u8; 4] out of it.

pub fn as_ne_bytes_mut(num: &mut u32) -> &mut [u8; 4] {
    unsafe {
        let arr: *mut [u8; 4] = (num as *mut u32).cast();
        &mut *arr
    }
}

This is much better than `[&mut u8; 4]` since it's a no-op. However, you aren't going to be able to create `le` and `be` versions unless you swap the bytes around and return an `impl DerefMut<Target = [u8; 4]>` guard that swaps the bytes back when it's dropped.
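A guard of that kind could look roughly like the sketch below. It is only an illustration: `BeBytesGuard` and `demo_guard` are hypothetical names, and instead of swapping the bytes in place it keeps a big-endian copy and writes it back on drop, which keeps the whole thing in safe code.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical guard: exposes the big-endian bytes of a `u32` and
/// stores them back into the number when the guard is dropped.
pub struct BeBytesGuard<'a> {
    num: &'a mut u32,
    bytes: [u8; 4],
}

impl<'a> BeBytesGuard<'a> {
    pub fn new(num: &'a mut u32) -> Self {
        // Take a big-endian copy; index 0 is the most significant byte.
        let bytes = num.to_be_bytes();
        Self { num, bytes }
    }
}

impl Deref for BeBytesGuard<'_> {
    type Target = [u8; 4];
    fn deref(&self) -> &[u8; 4] {
        &self.bytes
    }
}

impl DerefMut for BeBytesGuard<'_> {
    fn deref_mut(&mut self) -> &mut [u8; 4] {
        &mut self.bytes
    }
}

impl Drop for BeBytesGuard<'_> {
    fn drop(&mut self) {
        // Write the (possibly modified) bytes back into the original number.
        *self.num = u32::from_be_bytes(self.bytes);
    }
}

/// Usage sketch: the write becomes visible once the guard goes out of scope.
pub fn demo_guard() -> u32 {
    let mut num: u32 = 0x12345678;
    {
        let mut bytes = BeBytesGuard::new(&mut num);
        bytes[2] = 0xfe;
    }
    num
}
```

One caveat of this design: if the guard is leaked (for example with `std::mem::forget`), the write-back never runs and the changes are lost.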

I would probably go with four functions that return one byte each instead of `le` and `be` functions that return `[&mut u8; 4]`. If you use them the same way, though, they should optimize to the same thing.

pub fn most_sig_byte_0(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[0]
    } else {
        &mut as_ne_bytes_mut(num)[3]
    }
}
pub fn most_sig_byte_1(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[1]
    } else {
        &mut as_ne_bytes_mut(num)[2]
    }
}
pub fn most_sig_byte_2(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[2]
    } else {
        &mut as_ne_bytes_mut(num)[1]
    }
}
pub fn most_sig_byte_3(num: &mut u32) -> &mut u8 {
    if cfg!(target_endian = "big") {
        &mut as_ne_bytes_mut(num)[3]
    } else {
        &mut as_ne_bytes_mut(num)[0]
    }
}

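Miri thinks these are fine, at least. You could make them a single const generic function, but it would be less ergonomic to give it the 0..4 bound. ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=35158fde09016e23b8eaa064bfd209c7))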
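The const generic variant mentioned above might look something like this. It is a sketch under assumptions: the name `most_sig_byte` is hypothetical, and the `0..4` bound is enforced with a plain runtime `assert!` rather than a compile-time check.

```rust
/// Reinterprets a `u32` as its four native-endian bytes.
pub fn as_ne_bytes_mut(num: &mut u32) -> &mut [u8; 4] {
    // SAFETY: `u32` and `[u8; 4]` have the same size, `[u8; 4]` has
    // alignment 1, every bit pattern is valid for both types, and the
    // exclusive borrow of `num` is carried over to the result.
    unsafe { &mut *(num as *mut u32).cast::<[u8; 4]>() }
}

/// Hypothetical const generic accessor: `N = 0` is the most significant
/// byte and `N = 3` the least significant one, regardless of target.
pub fn most_sig_byte<const N: usize>(num: &mut u32) -> &mut u8 {
    // Assumption of this sketch: the bound is checked at run time.
    assert!(N < 4, "byte index must be in 0..4");
    let idx = if cfg!(target_endian = "big") { N } else { 3 - N };
    &mut as_ne_bytes_mut(num)[idx]
}

/// Usage sketch.
pub fn demo_most_sig_byte() -> u32 {
    let mut num: u32 = 0x12345678;
    *most_sig_byte::<2>(&mut num) = 0xfe;
    num
}
```

The call site has to spell the index with turbofish syntax (`most_sig_byte::<2>`), which is the ergonomic cost the answer alludes to.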

Answer 2

Score: 1

This can be achieved with the help of cfg(target_endian = "..."):

pub fn as_ne_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    unsafe {
        let ptr: *mut u8 = (num as *mut u32).cast();
        [
            &mut *ptr.add(0),
            &mut *ptr.add(1),
            &mut *ptr.add(2),
            &mut *ptr.add(3),
        ]
    }
}
pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let mut b = as_ne_bytes_mut(num);
    #[cfg(target_endian = "little")]
    b.reverse();
    b
}
pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let mut b = as_be_bytes_mut(num);
    b.reverse();
    b
}
fn main() {
    let mut num: u32 = 0x12345678;
    // Prints `0x12345678 - [12, 34, 56, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
    let parts = as_be_bytes_mut(&mut num);
    *parts[2] = 0xfe;
    // Should print `0x1234fe78 - [12, 34, fe, 78]`
    println!("{:#x} - {:x?}", num, num.to_be_bytes());
}

0x12345678 - [12, 34, 56, 78]
0x1234fe78 - [12, 34, fe, 78]

While it does look a little convoluted, the compiler manages to optimize it perfectly:

example::as_ne_bytes_mut:
        mov     rax, rdi
        lea     rcx, [rsi + 1]
        lea     rdx, [rsi + 2]
        mov     qword ptr [rdi], rsi
        add     rsi, 3
        mov     qword ptr [rdi + 8], rcx
        mov     qword ptr [rdi + 16], rdx
        mov     qword ptr [rdi + 24], rsi
        ret
example::as_be_bytes_mut:
        mov     rax, rdi
        lea     rcx, [rsi + 1]
        lea     rdx, [rsi + 2]
        lea     rdi, [rsi + 3]
        mov     qword ptr [rax], rdi
        mov     qword ptr [rax + 24], rsi
        mov     qword ptr [rax + 8], rdx
        mov     qword ptr [rax + 16], rcx
        ret
example::as_le_bytes_mut:
        mov     rax, rdi
        lea     rcx, [rsi + 1]
        lea     rdx, [rsi + 2]
        mov     qword ptr [rdi], rsi
        add     rsi, 3
        mov     qword ptr [rdi + 24], rsi
        mov     qword ptr [rdi + 8], rcx
        mov     qword ptr [rdi + 16], rdx
        ret

Answer 3

Score: 1

It is possible to do this without pointer arithmetic.

Code:

pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let num_slice = std::slice::from_mut(num);
    let (pref, middle, suff): (_, &mut [u8], _) = unsafe {
        num_slice.align_to_mut()
    };
    // This would be always true when we cast to u8
    assert!(pref.is_empty() && suff.is_empty());

    match middle {
        #[cfg(target_endian = "little")]
        [a, b, c, d] => [a, b, c, d],
        #[cfg(target_endian = "big")]
        [a, b, c, d] => [d, c, b, a],
        _ => unreachable!()
    }
}
pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let mut r = as_le_bytes_mut(num);
    r.reverse();
    r
}

And it compiles to nice assembly too: godbolt link.
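As a possible variation that also avoids pointer arithmetic, the cast to `&mut [u8; 4]` can be combined with array destructuring, which splits one exclusive borrow into four disjoint ones in safe code. This is a sketch rather than part of the answer above; it reuses the thread's function names for illustration only.

```rust
/// Splits a `u32` into references to its four native-endian bytes.
pub fn as_ne_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    // SAFETY: `u32` and `[u8; 4]` have the same size, `[u8; 4]` has
    // alignment 1, every bit pattern is valid for both types, and the
    // exclusive borrow of `num` is carried over to `bytes`.
    let bytes: &mut [u8; 4] = unsafe { &mut *(num as *mut u32).cast::<[u8; 4]>() };
    // Destructuring a `&mut [u8; 4]` yields four disjoint `&mut u8`.
    let [a, b, c, d] = bytes;
    [a, b, c, d]
}

/// Index 0 is the most significant byte.
pub fn as_be_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let [a, b, c, d] = as_ne_bytes_mut(num);
    if cfg!(target_endian = "little") {
        [d, c, b, a]
    } else {
        [a, b, c, d]
    }
}

/// Index 0 is the least significant byte.
pub fn as_le_bytes_mut(num: &mut u32) -> [&mut u8; 4] {
    let [a, b, c, d] = as_ne_bytes_mut(num);
    if cfg!(target_endian = "little") {
        [a, b, c, d]
    } else {
        [d, c, b, a]
    }
}
```

The only `unsafe` operation left is the single pointer cast; the reordering for each endianness is ordinary safe code on the array of references.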
