Why can't `std::mem::forget` be used to create static references?

Question

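Explanation:
std::mem::forget does take ownership of a value without running its destructor. It does not, however, extend the lifetime of the storage the value lived in. A by-value self parameter is a local variable in the function's stack frame, and that frame is released when the function returns, whether or not drop ran. Any pointer or reference to that local that outlives the call is dangling, and dereferencing it is undefined behavior, which can show up as a segfault.

In your code, the forget_me_as_ptr and forget_me_as_mut_ref methods take the address of the local self, call forget on it, and then return that address. The order of the forget call is not the problem. The problem is that the returned pointer refers to stack memory that no longer holds the value once the function returns.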

Specifically, in the following code:

fn forget_me_as_ptr(self) -> *const Self {
    let p = addr_of!(self); // pointer to the local `self` in this stack frame
    forget(self); // moves `self` into `forget`; the destructor is skipped
    p // dangling as soon as this function returns
}

fn forget_me_as_mut_ref(mut self) -> &'static mut Self {
    let p = addr_of_mut!(self); // mutable pointer to the local `self`
    forget(self); // moves `self` into `forget`; the destructor is skipped
    unsafe { p.as_mut().unwrap() } // reference into stack memory that is invalid after return
}

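After forget is called, the destructor will never run, but the stack slot that held self is still released when the function returns. The caller then reads through a pointer into that released slot, which is what leads to the segfault.

To fix this, do not return pointers or references to a by-value self. forget is not a way to create references with a static lifetime: a 'static reference needs storage that lives for the rest of the program, such as a static item or a deliberately leaked heap allocation. In all other cases, use lifetime parameters to tie a reference to the value it borrows from.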
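To make the point about what forget does and does not do concrete, here is a minimal sketch. The names `DROPS`, `Noisy`, `drop_normally`, and `forget_it` are hypothetical and exist only for this illustration. It counts destructor calls to show that forget suppresses Drop and nothing more.

```rust
use std::mem::forget;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter used only for this illustration.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy;

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// The local goes out of scope normally, so its destructor runs.
fn drop_normally() {
    let _n = Noisy;
}

// The local is moved into `forget`, so its destructor never runs.
// The stack slot it occupied is still released when this function returns.
fn forget_it() {
    let n = Noisy;
    forget(n);
}

fn main() {
    drop_normally();
    println!("drops after drop_normally: {}", DROPS.load(Ordering::SeqCst));
    forget_it();
    println!("drops after forget_it: {}", DROPS.load(Ordering::SeqCst));
}
```

The counter does not change across `forget_it`, which is all that forget guarantees. Nothing in it keeps the local's storage alive.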

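use std::any::type_name;
use std::fmt::Display;
use std::mem::forget;
use std::ptr::{addr_of, addr_of_mut};
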
pub trait Forget {
    fn forget_me_as_ptr(self) -> *const Self
    where
        Self: Sized,
    {
        let p = addr_of!(self);
        forget(self);
        p
    }

    fn forget_me_as_mut_ptr(self) -> *mut Self
    where
        Self: Sized,
    {
        self.forget_me_as_ptr() as *mut Self
    }

    fn forget_me_as_ref(self) -> &'static Self
    where
        Self: Sized,
    {
        self.forget_me_as_mut_ref()
    }

    fn forget_me_as_mut_ref(mut self) -> &'static mut Self
    where
        Self: Sized,
    {
        let p = addr_of_mut!(self);
        forget(self);
        unsafe { p.as_mut().unwrap() }
    }
}

pub fn default<T: Forget + Default>() -> &'static T {
    T::default().forget_me_as_ref()
}

#[derive(Debug)]
struct Player {
    health: u8,
    name: String,
    aliases: Vec<String>,
}

impl Drop for Player {
    fn drop(&mut self) {
        println!("Dropping [{}, {self:p}]", type_name::<Self>());
    }
}

impl Display for Player {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(
            f,
            "[{}, {self:p}] = name: {}, health: {}, aliases: {:#?}",
            type_name::<Self>(),
            self.name,
            self.health,
            self.aliases
        )
    }
}

impl Forget for Player {}

impl Default for Player {
    fn default() -> Self {
        Self {
            health: 100,
            name: "default".into(),
            aliases: vec!["default".to_string()],
        }
    }
}

fn main() {
    let p = default::<Player>();
    println!("{p}");
}

output:

$ cargo r
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/static_factory`
Segmentation fault (core dumped)

Why does the code snippet above segfault?
std::mem::forget just takes ownership and does not run the destructor/drop, so shouldn't that be exactly what is needed to create references whose lifetime is static? Where is the flaw in this logic? I am new to Rust and want to understand the relationship between references, instances of a struct T, and pointers. The code above looks logically correct at first glance. Please explain and correct.

Answer 1

Score: 6

Using std::mem::forget avoids running the Drop implementation, but since self is a local variable, its memory will still be reclaimed when the function returns and reused for other things.

To get the desired behavior, you need to move the value to a stable address that won't be reused. You can do this by putting it in a Box, getting a pointer to it, and forgetting the box. Fortunately, there is already a function for that: Box::leak. Something like:

pub trait Forget {
    // ... the rest ...

    fn forget_me_as_mut_ref(self) -> &'static mut Self
    where
        Self: Sized,
    {
        // Move `self` to the heap and leak the allocation so it is never freed.
        Box::leak(Box::new(self))
    }
}
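
For reference, a self-contained version of this idea might look like the sketch below. It assumes a cut-down `Player` with only two fields and a hypothetical `new_player` helper, and it adds an explicit `'static` bound on the trait that the original snippet does not spell out.

```rust
// Sketch only: `Forget` mirrors the trait from the question. The explicit
// `Sized + 'static` supertraits are an addition made for this example.
pub trait Forget: Sized + 'static {
    // Move `self` to the heap and leak the allocation. The heap address
    // stays valid for the rest of the program because it is never freed.
    fn forget_me_as_mut_ref(self) -> &'static mut Self {
        Box::leak(Box::new(self))
    }

    // A shared reference is just the mutable one, reborrowed immutably.
    fn forget_me_as_ref(self) -> &'static Self {
        self.forget_me_as_mut_ref()
    }
}

#[derive(Debug)]
struct Player {
    health: u8,
    name: String,
}

impl Forget for Player {}

// Hypothetical helper that builds the same default values as the question.
fn new_player() -> Player {
    Player {
        health: 100,
        name: "default".into(),
    }
}

fn main() {
    let p: &'static mut Player = new_player().forget_me_as_mut_ref();
    p.health -= 10;
    println!("{p:?}");
}
```

The trade-off is that the allocation is never freed and the destructor never runs, so this fits values that really should live until the program exits.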

Answer 2

Score: 3

Local variables, like the parameters of a function, live on the stack. Whether drop is called on a local's memory makes no difference: any pointer to it becomes dangling as soon as you leave the function. Therefore unsafe { p.as_mut().unwrap() } is unsound when p points to local memory (such as the argument self) and you return the result.
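
By contrast, a raw pointer is sound to return when it points at memory that outlives the function. The sketch below is one way to do that with the heap; `Tracked`, `DROPS`, `into_heap_ptr`, and `reclaim` are hypothetical names, not part of the question. It uses Box::into_raw to hand out the pointer and Box::from_raw to take the value back so that its destructor can still run.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter used only to observe destructor calls.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked(u32);

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Heap-allocate `value` and return a raw pointer to it. The allocation is
// not tied to this function's stack frame, so the pointer stays valid.
fn into_heap_ptr<T>(value: T) -> *mut T {
    Box::into_raw(Box::new(value))
}

// Take the value back out of the heap allocation.
// Safety: `ptr` must come from `into_heap_ptr` and must not be used again.
unsafe fn reclaim<T>(ptr: *mut T) -> T {
    unsafe { *Box::from_raw(ptr) }
}

fn main() {
    let p = into_heap_ptr(Tracked(7));
    // The pointee is still alive here because it lives on the heap.
    let value = unsafe { (*p).0 };
    println!("value = {value}");
    // Reclaiming the value lets its destructor run exactly once.
    drop(unsafe { reclaim(p) });
    println!("drops = {}", DROPS.load(Ordering::SeqCst));
}
```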

Answer 3

Score: 0

My suspicion is with these lines:

pub fn default<T: Forget + Default>() -> &'static T {
    T::default().forget_me_as_ref()
}

T::default() creates a temporary object on the stack, and we create a reference to it. Once the function call ends, the memory of that stack-allocated object may be overwritten. Had the object been created on the heap, I think this problem would go away.
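
As a rough sketch of the heap-based variant suggested here, the function below builds the default value, boxes it, and leaks the box. The name `leaked_default`, the explicit `'static` bound, and the reduced `Player` type are assumptions made for this example.

```rust
// Hypothetical replacement for the question's `default` function: the value
// is moved to the heap before a reference to it is created.
pub fn leaked_default<T: Default + 'static>() -> &'static T {
    Box::leak(Box::new(T::default()))
}

#[derive(Debug)]
struct Player {
    health: u8,
    name: String,
}

impl Default for Player {
    fn default() -> Self {
        Self {
            health: 100,
            name: "default".into(),
        }
    }
}

fn main() {
    let p: &'static Player = leaked_default::<Player>();
    println!("{p:?}");
}
```

Each call leaks a fresh allocation, so calling it in a loop grows memory without bound. If a single shared default is enough, a static item is usually the better fit.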

huangapple
  • Published on July 18, 2023 at 01:24:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76706784.html