Value receiver vs. pointer receiver
Question
It is very unclear to me in which cases I would want to use a value receiver instead of always using a pointer receiver.
To recap from the docs:
```go
type T struct {
	a int
}

func (tv T) Mv(a int) int { return 0 }          // value receiver
func (tp *T) Mp(f float32) float32 { return 1 } // pointer receiver
```
The docs also say "For types such as basic types, slices, and small structs, a value receiver is very cheap so unless the semantics of the method requires a pointer, a value receiver is efficient and clear."
First point: the docs say a value receiver is "very cheap", but the question is whether it is cheaper than a pointer receiver. So I made a small benchmark (code on gist), which showed me that a pointer receiver is faster even for a struct that has only one string field. These are the results:
```
// Struct one empty string property
BenchmarkChangePointerReceiver  2000000000  0.36 ns/op
BenchmarkChangeItValueReceiver  500000000   3.62 ns/op

// Struct one zero int property
BenchmarkChangePointerReceiver  2000000000  0.36 ns/op
BenchmarkChangeItValueReceiver  2000000000  0.36 ns/op
```
(Edit: Please note that the second point became invalid in newer Go versions; see comments.)
Second point: the docs say that a value receiver is "efficient and clear", which is more a matter of taste, isn't it? Personally, I prefer consistency by using the same thing everywhere. Efficient in what sense? Performance-wise, it seems pointers are almost always more efficient. A few test runs with one int property showed a minimal advantage for the value receiver (in the range of 0.01-0.1 ns/op).
Can someone tell me a case where a value receiver clearly makes more sense than a pointer receiver? Or am I doing something wrong in the benchmark? Did I overlook other factors?
Answer 1
Score: 236
Note that the FAQ does mention consistency:
> Next is consistency. If some of the methods of the type must have pointer receivers, the rest should too, so the method set is consistent regardless of how the type is used. See the section on method sets for details.
As mentioned in this thread:
> The rule about pointers vs. values for receivers is that value methods can be invoked on pointers and values, but pointer methods can only be invoked on pointers.

That is not entirely true, as Sart Simha commented:
> Both value receiver and pointer receiver methods can be invoked on a correctly-typed pointer or non-pointer.
>
> Regardless of what the method is called on, within the method body the identifier of the receiver refers to a by-copy value when a value receiver is used, and a pointer when a pointer receiver is used: example.
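
To make the comment above concrete, here is a minimal sketch. The `Counter` type and its method names are hypothetical and exist only for this illustration. It calls a value-receiver method and a pointer-receiver method on both an addressable variable and a pointer, and shows that only the pointer-receiver method changes the original.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for this illustration.
type Counter struct {
	n int
}

// IncByValue has a value receiver: it increments a copy of the Counter.
func (c Counter) IncByValue() { c.n++ }

// IncByPointer has a pointer receiver: it increments the original Counter.
func (c *Counter) IncByPointer() { c.n++ }

// demo calls both methods on an addressable value and on a pointer,
// and returns the resulting counts.
func demo() (valueCount, pointerCount int) {
	v := Counter{}
	v.IncByValue()   // works on a copy; v is unchanged
	v.IncByPointer() // shorthand for (&v).IncByPointer(), because v is addressable

	p := &Counter{}
	p.IncByValue()   // shorthand for (*p).IncByValue(); works on a copy
	p.IncByPointer() // modifies *p

	return v.n, p.n
}

func main() {
	v, p := demo()
	fmt.Println(v, p) // 1 1
}
```

Both call styles compile in all four combinations here because `v` is an addressable variable; the difference is only in what the method body sees.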
Now:
> Can someone tell me a case where a value receiver clearly makes more sense than a pointer receiver?

The Code Review Comments page can help:
> - If the receiver is a map, func or chan, don't use a pointer to it.
> - If the receiver is a slice and the method doesn't reslice or reallocate the slice, don't use a pointer to it.
> - If the method needs to mutate the receiver, the receiver must be a pointer.
> - If the receiver is a struct that contains a `sync.Mutex` or similar synchronizing field, the receiver must be a pointer to avoid copying.
> - If the receiver is a large struct or array, a pointer receiver is more efficient. How large is large? Assume it's equivalent to passing all its elements as arguments to the method. If that feels too large, it's also too large for the receiver.
> - Can function or methods, either concurrently or when called from this method, be mutating the receiver? A value type creates a copy of the receiver when the method is invoked, so outside updates will not be applied to this receiver. If changes must be visible in the original receiver, the receiver must be a pointer.
> - If the receiver is a struct, array or slice and any of its elements is a pointer to something that might be mutating, prefer a pointer receiver, as it will make the intention more clear to the reader.
> - If the receiver is a small array or struct that is naturally a value type (for instance, something like the `time.Time` type), with no mutable fields and no pointers, or is just a simple basic type such as int or string, a value receiver makes sense. **A value receiver can reduce the amount of garbage that can be generated; if a value is passed to a value method, an on-stack copy can be used instead of allocating on the heap.** (The compiler tries to be smart about avoiding this allocation, but it can't always succeed.) Don't choose a value receiver type for this reason without profiling first.
> - Finally, when in doubt, use a pointer receiver.
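
A few of these guidelines can be sketched in code. The types below (`Registry`, `SafeCounter`, `Celsius`) are hypothetical names chosen for illustration, not taken from the guidelines: a map type that needs no pointer, a struct with a `sync.Mutex` that does, and a small basic type where a value receiver is natural.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a map type. A map value already refers to shared underlying
// data, so a value receiver can update entries without a pointer.
type Registry map[string]int

func (r Registry) Set(key string, v int) { r[key] = v }

// SafeCounter contains a sync.Mutex, so its methods use pointer receivers
// to avoid copying the lock along with the struct.
type SafeCounter struct {
	mu sync.Mutex
	n  int
}

func (c *SafeCounter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

func (c *SafeCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// Celsius is a simple basic type; a value receiver is cheap and clear.
type Celsius float64

func (c Celsius) Fahrenheit() float64 { return float64(c)*9/5 + 32 }

func main() {
	r := Registry{}
	r.Set("a", 1)

	var c SafeCounter
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()

	fmt.Println(r["a"], c.Value(), Celsius(100).Fahrenheit()) // 1 100 212
}
```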
Note on "If the receiver is a slice and the method doesn't reslice or reallocate the slice, don't use a pointer to it":
The statement is suggesting that if you have a method that reslices or reallocates the slice, then you should use a pointer receiver.
In other words, if you modify the slice within the method, such as appending elements or changing the length/capacity of the slice, it's recommended to use a pointer receiver.
In the case of implementing deletion and insertion methods for a slice type, you will likely be modifying the slice (changing its length, appending or removing elements). Therefore, you should use a pointer receiver for these methods.
Example (playground):
```go
package main

import "fmt"

type MySlice []int

func (s *MySlice) Insert(index int, value int) {
	// Insert value at index and shift elements
	*s = append((*s)[:index], append([]int{value}, (*s)[index:]...)...)
}

func (s *MySlice) Delete(index int) {
	// Remove the element at index and shift elements
	*s = append((*s)[:index], (*s)[index+1:]...)
}

func main() {
	s := MySlice{1, 2, 3, 4, 5}
	s.Insert(2, 42)
	fmt.Println(s) // Output: [1 2 42 3 4 5]
	s.Delete(2)
	fmt.Println(s) // Output: [1 2 3 4 5]
}
```
In this example, the `Insert` and `Delete` methods are modifying the slice by appending and removing elements. As a result, a pointer receiver is used to ensure the modifications are visible outside the method.
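
For contrast, here is a small sketch of what happens when a method that changes the slice length uses a value receiver instead. `IntList` and its methods are hypothetical names for this illustration. Writing to an existing element is visible to the caller either way, but an `append` through a value receiver only updates the method's local copy of the slice header.

```go
package main

import "fmt"

// IntList is a hypothetical slice type used only for this illustration.
type IntList []int

// AppendByValue has a value receiver: the result of append is assigned to
// the method's local copy of the slice header, so the caller's length does
// not change.
func (s IntList) AppendByValue(v int) { s = append(s, v) }

// AppendByPointer stores the new slice header through the pointer, so the
// caller sees the longer slice.
func (s *IntList) AppendByPointer(v int) { *s = append(*s, v) }

// SetFirst only writes to an existing element. It neither reslices nor
// reallocates, so a value receiver is enough.
func (s IntList) SetFirst(v int) { s[0] = v }

func main() {
	s := IntList{1, 2, 3}

	s.AppendByValue(4)
	fmt.Println(len(s)) // 3

	s.AppendByPointer(4)
	fmt.Println(len(s)) // 4

	s.SetFirst(10)
	fmt.Println(s[0]) // 10
}
```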
The part in bold is found, for instance, in `net/http/server.go#Write()`:
```go
// Write writes the headers described in h to w.
//
// This method has a value receiver, despite the somewhat large size
// of h, because it prevents an allocation. The escape analysis isn't
// smart enough to realize this function doesn't mutate h.
func (h extraHeader) Write(w *bufio.Writer) {
	...
}
```
Note: irbull points out in the comments a warning about interface methods:
> Following the advice that the receiver type should be consistent, if you have a pointer receiver, then your `(p *type) String() string` method should also use a pointer receiver.
>
> But this does not implement the `Stringer` interface, unless the caller of your API also uses pointer to your type, which might be a usability problem of your API.
>
> I don't know if consistency beats usability here.
This points to:
> you can mix and match methods with value receivers and methods with pointer receivers, and use them with variables containing values and pointers, without worrying about which is which. Both will work, and the syntax is the same.
>
> However, if methods with pointer receivers are needed to satisfy an interface, then only a pointer will be assignable to the interface; a value won't be valid.
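
The interface point can be checked with a short sketch. `Temp` and `isStringer` are hypothetical names introduced here. Because `String` is declared on `*Temp`, only the pointer type has it in its method set, so only a pointer satisfies `fmt.Stringer`.

```go
package main

import "fmt"

// Temp is a hypothetical type used only for this illustration.
type Temp struct {
	deg int
}

// String has a pointer receiver, so only *Temp has String in its method set.
func (t *Temp) String() string { return fmt.Sprintf("%d degrees", t.deg) }

// isStringer reports whether the dynamic type of v implements fmt.Stringer.
func isStringer(v interface{}) bool {
	_, ok := v.(fmt.Stringer)
	return ok
}

func main() {
	t := Temp{deg: 21}

	fmt.Println(isStringer(t))  // false: Temp itself has no String method
	fmt.Println(isStringer(&t)) // true: *Temp does

	fmt.Println(&t) // 21 degrees (String is used)
	fmt.Println(t)  // {21} (default struct formatting)
}
```

This is the usability concern from the comment: a caller who passes a `Temp` value to `fmt.Println` silently gets the default formatting.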
- "Go interfaces and automatically generated functions" from Chris Siebenmann (June 2017)
> Calling value receiver methods through interfaces always creates extra copies of your values.
>
> Interface values are fundamentally pointers, while your value receiver methods require values; ergo every call requires Go to create a new copy of the value, call your method with it, and then throw the value away. There is no way to avoid this as long as you use value receiver methods and call them through interface values; it's a fundamental requirement of Go.
- "Learning about Go's unaddressable values and slicing", also from Chris Siebenmann (Sept. 2018)
> Concept of unaddressable values, which are the opposite of addressable values. The careful technical version is in the Go specification in Address operators, but the hand waving summary version is that most anonymous values are not addressable (one big exception is composite literals)
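
As a rough illustration of why addressability matters for receivers, the sketch below uses a map element, which is one common unaddressable value. `Counter` and `countAfterInc` are hypothetical names. The line that would not compile is kept as a comment so the example still builds.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for this illustration.
type Counter struct {
	n int
}

func (c *Counter) Inc()      { c.n++ }
func (c Counter) Value() int { return c.n }

// countAfterInc shows how to work with pointer-receiver methods when the
// value lives in a map, whose elements are not addressable.
func countAfterInc() int {
	m := map[string]Counter{"a": {}}

	// m["a"].Inc() would not compile: the compiler cannot take the address
	// of a map element to pass it as the pointer receiver.
	_ = m["a"].Value() // value-receiver methods work on unaddressable values

	// Workaround: copy the element out, modify the copy, store it back.
	c := m["a"]
	c.Inc()
	m["a"] = c

	// Taking the address of a composite literal is allowed.
	p := &Counter{}
	p.Inc()

	return m["a"].Value() + p.Value()
}

func main() {
	fmt.Println(countAfterInc()) // 2
}
```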
Answer 2
Score: 40
To add to @VonC's great, informative answer:
I am surprised no one really mentioned the maintenance cost once the project gets larger, old devs leave and new ones arrive. Go surely is a young language.
Generally speaking, I try to avoid pointers when I can but they do have their place and beauty.
I use pointers when:
- working with large datasets
- I have a struct maintaining state, e.g. TokenCache,
  - I make sure ALL fields are PRIVATE, interaction is possible only via defined method receivers
  - I don't pass this function to any goroutine

E.g.:
```go
type TokenCache struct {
	cache map[string]map[string]bool
}

func (c *TokenCache) Add(contract string, token string, authorized bool) {
	tokens := c.cache[contract]
	if tokens == nil {
		tokens = make(map[string]bool)
	}
	tokens[token] = authorized
	c.cache[contract] = tokens
}
```
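
A usage sketch for the cache above may help. Note that `NewTokenCache` and `IsAuthorized` are assumptions added here for illustration and are not part of the answer; the constructor is needed because `Add` writes to `c.cache`, and writing to a nil outer map would panic.

```go
package main

import "fmt"

type TokenCache struct {
	cache map[string]map[string]bool
}

// NewTokenCache is a hypothetical constructor (not in the original answer).
// It initializes the outer map so that Add never writes to a nil map.
func NewTokenCache() *TokenCache {
	return &TokenCache{cache: make(map[string]map[string]bool)}
}

// Add records whether token is authorized for contract.
func (c *TokenCache) Add(contract string, token string, authorized bool) {
	tokens := c.cache[contract]
	if tokens == nil {
		tokens = make(map[string]bool)
	}
	tokens[token] = authorized
	c.cache[contract] = tokens
}

// IsAuthorized is a hypothetical read accessor. Reading from a missing
// (nil) inner map is safe and yields false.
func (c *TokenCache) IsAuthorized(contract string, token string) bool {
	return c.cache[contract][token]
}

func main() {
	c := NewTokenCache()
	c.Add("contract-1", "token-a", true)

	fmt.Println(c.IsAuthorized("contract-1", "token-a")) // true
	fmt.Println(c.IsAuthorized("contract-1", "token-b")) // false
	fmt.Println(c.IsAuthorized("contract-2", "token-a")) // false
}
```

Keeping the field private and exposing only methods, as the answer suggests, means callers cannot reach the map directly.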
Reasons why I avoid pointers:
- pointers are not concurrency-safe (and safe concurrency is the whole point of Go)
- once pointer receiver, always pointer receiver (for all of the struct's methods, for consistency)
- mutexes are surely more expensive, slower and harder to maintain compared to the "value copy cost"
- speaking of "value copy cost", is that really an issue? Premature optimization is the root of all evil; you can always add pointers later
- it directly, consciously forces me to design small structs
- pointers can mostly be avoided by designing pure functions with clear intention and obvious I/O
- garbage collection is harder with pointers, I believe
- easier to reason about encapsulation and responsibilities
- keep it simple, stupid (yes, pointers can be tricky because you never know who the next developer on the project will be)
- unit testing is like walking through a pink garden (a Slovak-only expression?), meaning it is easy
- no nil checks in if conditions (nil can be passed where a pointer was expected)
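
The last item in the list can be illustrated with a small sketch. `Settings` and both functions are hypothetical names. A function that takes a pointer has to decide what to do with nil, while a function that takes a value cannot receive nil at all.

```go
package main

import "fmt"

// Settings is a hypothetical type used only for this illustration.
type Settings struct {
	Retries int
}

// retriesFromPointer accepts a pointer, so it must handle nil explicitly;
// dereferencing a nil *Settings would panic.
func retriesFromPointer(s *Settings) int {
	if s == nil {
		return 0
	}
	return s.Retries
}

// retriesFromValue accepts a value. A Settings value can never be nil,
// so no check is needed; the zero value is simply Settings{}.
func retriesFromValue(s Settings) int {
	return s.Retries
}

func main() {
	fmt.Println(retriesFromPointer(nil))                  // 0
	fmt.Println(retriesFromPointer(&Settings{Retries: 2})) // 2
	fmt.Println(retriesFromValue(Settings{}))             // 0
	fmt.Println(retriesFromValue(Settings{Retries: 3}))   // 3
}
```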
My rule of thumb: write as many encapsulated methods as possible, such as:
```go
package rsa

// EncryptPKCS1v15 encrypts the given message with RSA and the padding scheme from PKCS#1 v1.5.
func EncryptPKCS1v15(rand io.Reader, pub *PublicKey, msg []byte) ([]byte, error) {
	return []byte("secret text"), nil
}

// At the call site (inside some function in another package):
cipherText, err := rsa.EncryptPKCS1v15(rand, pub, keyBlock)
```
UPDATE:
This question inspired me to research the topic more and write a blog post about it https://medium.com/gophersland/gopher-vs-object-oriented-golang-4fa62b88c701
Answer 3
Score: 3
It is a question of semantics. Imagine you write a function taking two numbers as arguments. You don't want to suddenly find out that either of these numbers got mutated by the function you called. If you pass them as pointers, that is possible. Lots of things should act just like numbers: points, 2D vectors, dates, rectangles, circles, etc. These things don't have identity. Two circles at the same position and with the same radius should not be distinguished from each other. They are value types.
But something like a database connection, a file handle, or a button in the GUI is something where identity matters. In these cases you want a pointer to the object.
When something is inherently a value type, such as a rectangle or point, it is really preferable to be able to pass it without using pointers. Why? Because it means you are certain to avoid mutating the caller's object. It clarifies semantics and intent to the reader of your code. It is clear that the function receiving the object cannot and will not mutate the caller's copy.
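
A minimal sketch of this value-type idea, using hypothetical `Point` and `Circle` types: the value-receiver method returns a new value instead of changing the original, and two circles with the same fields compare as equal because they have no identity.

```go
package main

import "fmt"

// Point is a hypothetical value type: it has no identity, only coordinates.
type Point struct {
	X, Y int
}

// Translate has a value receiver and returns a new Point; the original
// cannot be modified through this method.
func (p Point) Translate(dx, dy int) Point {
	return Point{p.X + dx, p.Y + dy}
}

// Circle is another hypothetical value type built from comparable fields.
type Circle struct {
	Center Point
	Radius int
}

func main() {
	a := Point{1, 2}
	b := a.Translate(3, 4)
	fmt.Println(a, b) // {1 2} {4 6}

	c1 := Circle{Point{0, 0}, 5}
	c2 := Circle{Point{0, 0}, 5}
	fmt.Println(c1 == c2) // true
}
```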