Move Semantics in Golang

Question

This is from Bjarne Stroustrup's The C++ Programming Language, Fourth Edition, section 3.3.2.

> We didn’t really want a copy; we just wanted to get the result out of
> a function: we wanted to move a Vector rather than to copy it.
> Fortunately, we can state that intent:
>
>     class Vector {
>         // ...
>
>         Vector(const Vector& a);            // copy constructor
>         Vector& operator=(const Vector& a); // copy assignment
>
>         Vector(Vector&& a);                 // move constructor
>         Vector& operator=(Vector&& a);      // move assignment
>     };
>
> Given that definition, the compiler will choose the move constructor
> to implement the transfer of the return value out of the function.
> This means that r=x+y+z will involve no copying of Vectors. Instead,
> Vectors are just moved. As is typical, Vector’s move constructor is
> trivial to define...

I know Golang supports traditional passing by value and passing by reference using Go-style pointers.

Does Go support "move semantics" the way C++11 does, as described by Stroustrup above, to avoid the useless copying back and forth? If so, is this automatic, or does it require us to do something in our code to make it happen?


Note: A few answers have been posted - I have to digest them a bit, so I haven't accepted one yet - thanks.

Answer 1

Score: 16

The breakdown is as follows:

  1. Everything in Go is passed by value.
  2. But there are five built-in "reference types" which are also passed by value but internally hold references to separately maintained data structures: maps, slices, channels, strings and function values (there is no way to mutate the data the latter two reference).

Your own answer, @Vector, is incorrect in that nothing in Go is passed by reference. Rather, there are types with reference semantics. Values of those types are still passed by value (sic!).

Your confusion presumably stems from the fact that your mind is currently burdened by C++, Java, etc., while these things in Go are done mostly "as in C".

Take arrays and slices for instance. An array is passed by value in Go, but a slice is a packed struct containing a pointer (to an underlying array) and two platform-sized integers (the length and the capacity of the slice). It is the value of this structure (a pointer and two integers) which is copied when the slice is assigned, passed or returned. Should you copy a "bare" array, it would be copied literally, with all its elements.
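The difference between copying an array and copying a slice header can be observed directly. The following is a minimal sketch; the function names `modifyArray`, `modifySlice` and `arrayVsSlice` are made up for illustration and are not part of any library.

```go
package main

import "fmt"

// modifyArray receives a copy of the whole array,
// so the caller's array is left untouched.
func modifyArray(a [3]int) {
	a[0] = 100
}

// modifySlice receives a copy of the slice header (pointer, length,
// capacity); the copied pointer still refers to the caller's backing array.
func modifySlice(s []int) {
	s[0] = 100
}

// arrayVsSlice returns the first element of an array and of a slice
// after each has been passed to a modifying function.
func arrayVsSlice() (int, int) {
	arr := [3]int{1, 2, 3}
	sl := []int{1, 2, 3}
	modifyArray(arr)
	modifySlice(sl)
	return arr[0], sl[0]
}

func main() {
	a, s := arrayVsSlice()
	fmt.Println(a, s) // prints "1 100"
}
```

Both calls pass their argument by value; only what gets copied differs (three integers versus a three-word header).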

The same applies to channels and maps. You can think of types defining channels and maps as declared something like this:

type Map struct {
   impl *mapImplementation
}

type Chan struct {
   impl *chanImplementation
}

(By the way, if you know C++, you should be aware that some C++ code uses this trick, known as the pimpl idiom, to reduce the exposure of implementation details in header files.)

So when you later have

m := make(map[int]string)

you could think of it as m having the type Map and so when you later do

x := m

the value of m gets copied, but it contains just a single pointer, and so both x and m now reference the same underlying data structure. Was m copied by reference ("move semantics")? Surely not! Do values of map, slice and channel types have reference semantics? Yes!
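A short sketch of the assignment described above; the helper name `sharedMap` is hypothetical and exists only so the result can be inspected.

```go
package main

import "fmt"

// sharedMap copies a map value, writes through the copy, and returns
// what the original variable observes afterwards.
func sharedMap() (string, int) {
	m := make(map[int]string)
	x := m // copies the map value itself, not the entries it refers to
	x[1] = "one"
	return m[1], len(m)
}

func main() {
	v, n := sharedMap()
	fmt.Println(v, n) // prints "one 1"
}
```

The write through x is visible through m because both variables hold copies of the same reference to one hash table.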

Note that these five types are not at all special in this respect: implementing your custom type by embedding in it a pointer to some complicated data structure is a rather common pattern.

In other words, Go allows the programmer to decide what semantics they want for their types. And Go happens to have five built-in types which have reference semantics already (while all the other built-in types have value semantics). Picking one semantics over the other does not affect the rule of copying everything by value in any way. For instance, it's fine to have pointers to values of any kind of type in Go, and to assign them (so long as they have compatible types); these pointers will be copied by value.
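To illustrate choosing the semantics of your own type, here is a minimal sketch. All type and function names (`counterState`, `RefCounter`, `ValCounter`, `compareCounters`) are invented for this example.

```go
package main

import "fmt"

// counterState is the separately maintained data.
type counterState struct{ n int }

// RefCounter has reference semantics: every copy shares one counterState.
type RefCounter struct{ impl *counterState }

func NewRefCounter() RefCounter { return RefCounter{impl: &counterState{}} }

func (c RefCounter) Inc()       { c.impl.n++ }
func (c RefCounter) Value() int { return c.impl.n }

// ValCounter has value semantics: every copy carries its own n.
type ValCounter struct{ n int }

func (c *ValCounter) Inc()      { c.n++ }
func (c ValCounter) Value() int { return c.n }

// compareCounters increments a copy of each counter and returns
// the values seen through the originals.
func compareCounters() (int, int) {
	r1 := NewRefCounter()
	r2 := r1 // copies the struct, that is, the pointer inside it
	r2.Inc()

	v1 := ValCounter{}
	v2 := v1 // copies the whole struct, including n
	v2.Inc()

	return r1.Value(), v1.Value()
}

func main() {
	r, v := compareCounters()
	fmt.Println(r, v) // prints "1 0"
}
```

In both cases the assignment copies the struct by value; the observable difference comes only from what the struct contains.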

Another angle to look at this is that many Go packages (standard and third-party) prefer to work with pointers to (complex) values. One example is os.Open() (which opens a file on a filesystem) returning a value of the type *os.File. That is, it returns a pointer and expects the calling code to pass this pointer around. Sure, the Go authors could have declared os.File to be a struct containing a single pointer, essentially making this value have reference semantics, but they did not do that. I think the reason for this is that there's no special syntax to work with the values of this type, so there's no reason to make them work as maps, channels and slices do. KISS, in other words.


Answer 2

Score: 2

> The Go Programming Language Specification
>
> Calls
>
> In a function call, the function value and arguments are evaluated in
> the usual order. After they are evaluated, the parameters of the call
> are passed by value to the function and the called function begins
> execution. The return parameters of the function are passed by value
> back to the calling function when the function returns.

In Go, everything is passed by value.

> Rob Pike
>
> In Go, everything is passed by value. Everything.
>
> There are some types (pointers, channels, maps, slices) that have
> reference-like properties, but in those cases the relevant data
> structure (pointer, channel pointer, map header, slice header) holds a
> pointer to an underlying, shared object (pointed-to thing, channel
> descriptor, hash table, array); the data structure itself is passed by
> value. Always.
>
> Always.
>
> -rob
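One consequence of "the data structure itself is passed by value" can be sketched as follows. The helper names (`grow`, `growPtr`, `reassign`, `lengths`) are hypothetical and chosen only for this illustration.

```go
package main

import "fmt"

// grow appends to its own copy of the slice header; the caller's
// header, and therefore its length, does not change.
func grow(s []int) {
	s = append(s, 4)
}

// growPtr receives a copy of a pointer to the caller's header,
// so it can replace that header.
func growPtr(s *[]int) {
	*s = append(*s, 4)
}

// reassign changes only its local copy of the pointer.
func reassign(p *int) {
	other := 42
	p = &other
}

// lengths returns the caller-side view after each call.
func lengths() (int, int, int) {
	a := []int{1, 2, 3}
	grow(a)

	b := []int{1, 2, 3}
	growPtr(&b)

	v := 7
	reassign(&v)

	return len(a), len(b), v
}

func main() {
	fmt.Println(lengths()) // prints "3 4 7"
}
```

Even the pointer in `growPtr` is passed by value; it works because the copied pointer still points at the caller's slice header.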

Answer 3

Score: 1

It is my understanding that Go, as well as Java and C#, never had the excessive copying costs of C++, but they do not solve the transfer of ownership to containers. Therefore there is still copying involved. As C++ becomes more of a value-semantics language, with references/pointers being relegated to i) smart-pointer managed objects inside classes and ii) dependence references, move semantics solves the problem of excessive copying. Note that this has nothing to do with "pass by value"; nowadays everyone passes objects by reference (&) or const reference (const &) in C++.

Let's look at this (1):

BigObject BO(big,stuff,inside);
vector<BigObject> vo;
vo.reserve(1000000);
vo.push_back(BO);

Or (2)

vector<BigObject> vo;
vo.reserve(1000000);
vo.push_back(BigObject(big,stuff,inside));

Although you're passing by reference to the vector vo, in C++03 there was a copy inside the vector code.
In the second case, there is a temporary object that has to be constructed and then is copied inside the vector. Since it can only be accessed by the vector, that is a wasteful copy.

However, in the first case, our intent could be just to give control of BO to the vector itself. C++11 and later allow this:

(1, C++11)

vector<BigObject> vo;
vo.reserve(1000000);
vo.emplace_back(big,stuff,inside); // constructs the element in place inside the vector

Or (2, C++11)

vector<BigObject> vo;
vo.reserve(1000000);
vo.push_back(BigObject(big,stuff,inside)); // the temporary is moved if BigObject is movable

From what I've read, it is not clear that Java, C# or Go are exempt from the same redundant copying that C++03 suffered from in the case of containers.

The old-fashioned COW (copy-on-write) technique also had the same problem, since the resources are copied as soon as the object inside the vector is duplicated.
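For the Go side of the container question, a minimal sketch may help. It assumes a hypothetical `BigObject` value type; the function names are likewise invented. Appending a struct value to a slice copies the struct, while a slice of pointers stores only the pointer.

```go
package main

import "fmt"

// BigObject is a hypothetical large value type.
type BigObject struct {
	payload [1024]byte
}

// storeByValue appends bo to a []BigObject and then modifies the stored
// element. It returns the first payload byte of the original and of the
// stored element.
func storeByValue() (byte, byte) {
	bo := BigObject{}
	vo := make([]BigObject, 0, 16)
	vo = append(vo, bo) // copies all of bo into the slice's backing array
	vo[0].payload[0] = 1
	return bo.payload[0], vo[0].payload[0]
}

// storeByPointer does the same with a []*BigObject.
func storeByPointer() (byte, byte) {
	bo := &BigObject{}
	vo := make([]*BigObject, 0, 16)
	vo = append(vo, bo) // copies only the pointer
	vo[0].payload[0] = 1
	return bo.payload[0], vo[0].payload[0]
}

func main() {
	fmt.Println(storeByValue())   // prints "0 1"
	fmt.Println(storeByPointer()) // prints "1 1"
}
```

So Go does copy values placed into a slice of structs; avoiding that copy is done by storing pointers (leaving ownership to the garbage collector), not by a move operation.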

Answer 4

Score: -1

Stroustrup is talking about C++, which allows you to pass containers, etc. by value, so the excessive copying becomes an issue.

In Go (like in Delphi, Java, etc.), when you pass a container type, etc., they are always references, so it's a non-issue. Regardless, you don't have to deal with it or worry about it in Go: the compiler just does what it needs to do, and from what I've seen thus far, it's doing it right.

Thanks to @KerrekSB for putting me on the right track.

@KerrekSB - I hope this is the right answer. If it's wrong, you bear no responsibility. :)

huangapple
  • Posted on December 31, 2013, 08:34:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/20849911.html