2023-02-25 00:58:19 go

Understanding Pointer Operations & CPU/Memory usage

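Question

I was talking to a colleague at work about whether it is more efficient to pass a pointer to a function and/or return a pointer.

I put together some benchmark functions to test the different approaches. Each function accepts a variable, transforms it, and passes it back. There are four ways of doing it:

1) Pass the variable in normally, create a new variable for the result of the transformation, and pass back a copy of it.
2) Pass the variable in normally, create a new variable for the result of the transformation, and pass back its memory address.
3) Pass in a pointer to a variable, create a new variable for the result of the transformation, and pass back a copy of that variable.
4) Pass in a pointer to a variable and perform the transformation on the value behind the pointer, with nothing to pass back.

I wrote the following code in Go: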

package main

import (
	"fmt"
	"testing"
)

type MyStruct struct {
	myString string
}

func acceptParamReturnVariable(s MyStruct) MyStruct {
	ns := MyStruct{
		fmt.Sprintf("I'm quoting this: \"%s\"", s.myString),
	}
	return ns
}

func acceptParamReturnPointer(s MyStruct) *MyStruct {
	ns := MyStruct{
		fmt.Sprintf("I'm quoting this: \"%s\"", s.myString),
	}
	return &ns
}

func acceptPointerParamReturnVariable(s *MyStruct) MyStruct {
	ns := MyStruct{
		fmt.Sprintf("I'm quoting this: \"%s\"", s.myString),
	}
	return ns
}

func acceptPointerParamNoReturn(s *MyStruct) {
	s.myString = fmt.Sprintf("I'm quoting this: \"%s\"", s.myString)
}

func BenchmarkNormalParamReturnVariable(b *testing.B) {
	s := MyStruct{
		myString: "Hello World",
	}
	var ns MyStruct
	for i := 0; i < b.N; i++ {
		ns = acceptParamReturnVariable(s)
	}
	_ = ns
}

func BenchmarkNormalParamReturnPointer(b *testing.B) {
	s := MyStruct{
		myString: "Hello World",
	}
	var ns *MyStruct
	for i := 0; i < b.N; i++ {
		ns = acceptParamReturnPointer(s)
	}
	_ = ns
}

func BenchmarkPointerParamReturnVariable(b *testing.B) {
	s := MyStruct{
		myString: "Hello World",
	}
	var ns MyStruct
	for i := 0; i < b.N; i++ {
		ns = acceptPointerParamReturnVariable(&s)
	}
	_ = ns
}

func BenchmarkPointerParamNoReturn(b *testing.B) {
	s := MyStruct{
		myString: "Hello World",
	}
	for i := 0; i < b.N; i++ {
		acceptPointerParamNoReturn(&s)
	}
	_ = s
}

I found the results rather surprising.

$ go test -run=XXXX -bench=. -benchmem
goos: darwin
goarch: amd64
pkg: XXXX
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkNormalParamReturnVariable-16           10538138               103.3 ns/op            48 B/op          2 allocs/op
BenchmarkNormalParamReturnPointer-16             9526380               201.2 ns/op            64 B/op          3 allocs/op
BenchmarkPointerParamReturnVariable-16           7542066               147.0 ns/op            48 B/op          2 allocs/op
BenchmarkPointerParamNoReturn-16                   45897            119265 ns/op          924351 B/op          5 allocs/op

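Before running this, I figured the fourth test would be the most efficient, since no new variable is created in the scope of the called function and only memory addresses are passed around. However, the fourth one turns out to be the least efficient: it takes the most time and uses the most memory.

Could someone explain this to me, or point me to some good reading that explains it?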


Answer 1

Score: 1

The benchmarks you wrote don't answer the question you are asking. Microbenchmarking is notoriously hard, not only in Go but in general.
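One concrete reason is visible in the question's own code: acceptPointerParamNoReturn overwrites s.myString, and the benchmark reuses the same s on every iteration, so each call quotes the already-quoted string from the previous call. The input grows by 20 bytes per call, which would mean the fourth benchmark measures formatting of ever-larger strings rather than the cost of passing a pointer. A minimal sketch of that effect, where lengthAfter is a hypothetical helper that is not part of the original code:

```go
package main

import "fmt"

type MyStruct struct {
	myString string
}

// Same body as the in-place variant from the question.
func acceptPointerParamNoReturn(s *MyStruct) {
	s.myString = fmt.Sprintf("I'm quoting this: \"%s\"", s.myString)
}

// lengthAfter is a hypothetical helper: it applies the in-place
// transformation n times to a single value, the way the benchmark
// loop does, and reports the resulting string length in bytes.
func lengthAfter(n int) int {
	s := MyStruct{myString: "Hello World"}
	for i := 0; i < n; i++ {
		acceptPointerParamNoReturn(&s)
	}
	return len(s.myString)
}

func main() {
	// "Hello World" is 11 bytes; each call wraps it in 20 more.
	for _, n := range []int{0, 1, 10, 1000} {
		fmt.Printf("after %4d calls: %d bytes\n", n, lengthAfter(n))
	}
}
```

Resetting s at the start of each iteration, or working on a fresh copy, would keep the input size constant and make the four variants comparable.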

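Back to the efficiency question. Typically, a value whose pointer is passed into a function does not escape to the heap, and typically, a value whose pointer is returned from a function does escape to the heap. "Typically" is the key word here: you can't really say in advance when the compiler will allocate something on the stack and when on the heap. This is not a trivial problem. A very good, short explanation can be found here.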

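But if you need to know, you can ask. Start by printing the optimization decisions the compiler makes, which you can do by passing the -m flag to go tool compile: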

go build -gcflags -m=1
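As a small program to try the -m flag on, here is a sketch that contrasts the two "typical" cases. Point, newPoint and sum are hypothetical names chosen for illustration, and the //go:noinline directives are only there to keep the compiler from inlining the calls and hiding the difference. The exact diagnostics depend on the Go version, but you would usually expect a line such as "moved to heap: p" for newPoint and a "does not escape" note for the parameter of sum.

```go
package main

import "fmt"

type Point struct {
	X, Y int
}

// newPoint returns the address of a local variable. The value has to
// outlive the call, so the compiler typically moves p to the heap.
//
//go:noinline
func newPoint(x, y int) *Point {
	p := Point{X: x, Y: y}
	return &p
}

// sum only reads through the pointer and does not retain it, so the
// caller's value can typically stay on the caller's stack.
//
//go:noinline
func sum(p *Point) int {
	return p.X + p.Y
}

func main() {
	local := Point{X: 1, Y: 2}
	fmt.Println(sum(&local))         // pointer passed in
	fmt.Println(sum(newPoint(3, 4))) // pointer returned
}
```

The program prints 3 and 7 either way; the interesting part is the compiler output, not the result.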

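If you pass an integer greater than 1, you get more verbose output. If that doesn't give you the answers you need to optimize your program, try profiling, which goes well beyond memory analysis.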

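In general, don't bother with naive optimization decisions in your daily work. Don't get too attached to statements that begin with "Typically...", because in the real world you never know. Aim for correctness first, and optimize for performance only when you really need to and have proven that you need to. Don't guess, and don't take things on trust. Also keep in mind that Go keeps changing, so what we prove in one version doesn't have to hold in another.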

