Rust way to transfer ownership while guaranteeing no underlying data copy

Question

I am a bit confused about how to transfer ownership without the overhead of an actual data copy.
I have the following code. By "memcopy" I mean a copy of the underlying data performed by the OS.

fn main() {
    let v1 = Vec::from([1; 1024]);
    take_ownership_but_memcopies(v1);

    let v2 = Vec::from([2; 1024]);
    dont_memecopy_but_dont_take_ownership(&v2);

    let v3 = Vec::from([3; 1024]);
    take_ownership_dont_memcopy(???);
}

// Moves but memcopies all elements
fn take_ownership_but_memcopies(my_vec1: Vec<i32>) {
    println!("{:?}", my_vec1);
}

// Doesn't memcopy but doesn't take ownership
fn dont_memecopy_but_dont_take_ownership(my_vec2: &Vec<i32>) {
    println!("{:?}", my_vec2);
}

// Take ownership without the overhead of memcopy
fn take_ownership_dont_memcopy(my_vec3: ???) {
    println!("{:?}", my_vec3);
}

As I understand it, if I pass a reference as with v2, I don't get ownership. If I pass the value as with v1, there could be a memcopy.

How should I pass v3 to guarantee that there is no underlying memcopy by the OS?

Answer 1

Score: 3

Your understanding of what happens when you move a Vec is incorrect - it does not copy every element within the Vec!

To understand why, we need to take a step back and look at how a Vec is represented internally:

// This is slightly simplified, look at the source for more details!

struct Vec<T> {
    pointer: *mut T, // pointer to the data (on the heap)
    capacity: usize, // the current capacity of the Vec
    len: usize,      // the current number of elements in the Vec
}

While the Vec conceptually 'owns' the elements, they are not stored within the Vec struct - it only holds a pointer to that data. So when you move a Vec, it is only the pointer (plus the capacity and length) that gets copied.
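
One way to check this yourself is to compare the address of the heap buffer before and after a move. The sketch below is only an illustration: `take_and_return` and `buffer_survives_move` are hypothetical helpers that are not part of the question, and the three-word size of the handle is what current implementations produce rather than a documented guarantee.

```rust
use std::mem::size_of;

// Hypothetical helper: takes ownership of the Vec and hands it back,
// so the caller can inspect the heap buffer after two moves.
fn take_and_return(v: Vec<i32>) -> Vec<i32> {
    v
}

// Returns true if the heap buffer address is unchanged by moving the Vec.
fn buffer_survives_move() -> bool {
    let v = Vec::from([3; 1024]);
    let before = v.as_ptr(); // address of the first element on the heap
    let moved = take_and_return(v); // `v` is moved in, then moved back out
    before == moved.as_ptr()
}

fn main() {
    // The handle is pointer + capacity + length; the elements live elsewhere.
    println!("size of the Vec<i32> handle: {} bytes", size_of::<Vec<i32>>());
    println!("heap buffer kept its address: {}", buffer_survives_move());
}
```

If the elements had been copied, the moved Vec would point at a different allocation. The pointer staying the same shows that only the small handle travelled.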


If you are attempting to avoid copying altogether, as opposed to avoiding copying the contents of the Vec, that isn't really possible - in the semantics of the compiler, a move is a copy (just one that prevents you from using the old data afterwards). However, the compiler can and will optimize trivial copies into something more efficient.
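
To make the distinction concrete, here is a rough sketch contrasting a by-value array, where the elements are the value being moved, with a Vec, where only the handle is moved. `sum_array` and `sum_vec` are hypothetical functions introduced for this illustration, and whether the array copy actually happens at runtime depends on the optimizer.

```rust
use std::mem::size_of;

// Hypothetical function taking an array by value: the elements ARE the value,
// so passing it is semantically a copy of the whole array (the optimizer may elide it).
fn sum_array(a: [i32; 1024]) -> i64 {
    a.iter().map(|&x| i64::from(x)).sum()
}

// Hypothetical function taking a Vec by value: only the small handle is moved.
fn sum_vec(v: Vec<i32>) -> i64 {
    v.iter().map(|&x| i64::from(x)).sum()
}

fn main() {
    println!("[i32; 1024]: {} bytes", size_of::<[i32; 1024]>());
    println!("Vec<i32>:    {} bytes", size_of::<Vec<i32>>());

    let v = vec![1; 1024];
    println!("{}", sum_vec(v));
    // println!("{:?}", v); // would not compile: `v` was moved above

    let a = [1; 1024];
    println!("{}", sum_array(a));
}
```

The commented-out line is the "prevents you from using the old data afterwards" part: the bytes of the old handle may still exist, but the compiler refuses to let you read them.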

Answer 2

Score: 1

> How should I pass v3 to guarantee that there is no underlying memcopy by the OS?

You can't, because a move is defined as a copy in Rust's semantics.

However, a Vec is just 3 words on the stack, and that is all that gets "memcopy"d. That copy is intrinsic: you are not going to get a memcpy function call in there, and the vector's contents are not duplicated. And that's assuming the function call does not get inlined and the compiler does not decide to pass the object by reference anyway. It could also pass all 3 words through registers, at which point there's nothing to memcpy.

It's not entirely clear why you care either way, though. If you only want to read from the collection, your function should be:

// Borrows the elements as a slice: no ownership transfer, no copy of the data
fn take_ownership_dont_memcopy(my_vec3: &[i32]) {
    println!("{:?}", my_vec3);
}

That is the most efficient and flexible signature: it's just two words (pointer and length), there's a single level of indirection (unlike &Vec, which points to the Vec, which in turn points to the data), and it allows for non-Vec sources.
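
As a small illustration of the flexibility of a slice parameter, the sketch below passes a Vec, an array, and a sub-range to the same function. `print_all` is a hypothetical name chosen for this example, and it returns the length only to make the behavior easy to check.

```rust
use std::mem::size_of;

// Borrows any contiguous run of i32 values and returns how many it saw.
fn print_all(items: &[i32]) -> usize {
    println!("{:?}", items);
    items.len()
}

fn main() {
    let v = Vec::from([3; 8]);
    let a = [1, 2, 3];

    print_all(&v);       // &Vec<i32> coerces to &[i32]
    print_all(&a);       // a reference to an array coerces to a slice as well
    print_all(&v[2..5]); // a sub-range of the Vec, no copy involved
    println!("{:?}", v); // `v` is still usable: it was only borrowed

    // A slice reference is a pointer plus a length; &Vec is a single pointer
    // to the Vec handle, which then points to the data.
    println!("&[i32]:    {} bytes", size_of::<&[i32]>());
    println!("&Vec<i32>: {} bytes", size_of::<&Vec<i32>>());
}
```

The caller keeps ownership in every case, so this only fits when the function does not need to own the data.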

huangapple
  • Posted on February 8, 2023 at 21:45:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75386706.html