Performance influence of the condition expressions in "for" statement
Question
Is there any performance difference between the two examples below?
1.
var slice = []int{ ... huge list of items ... }
for i := 0; i < len(slice); i++ { .... do something .... }
2.
var slice = []int{ ... huge list of items ... }
sliceLen := len(slice)
for i := 0; i < sliceLen; i++ { .... do something .... }
Are the condition expressions in a "for" statement evaluated at each iteration or only once?
Answer 1
Score: 1
TL;DR: There is almost no difference.
A good way to test this is with the benchmarking support in the standard testing library.
Create a test file, e.g. forcycle_test.go:
    package perftest

    import (
        "testing"
    )

    func BenchmarkLenInside(b *testing.B) {
        testData := make([]int, 1000000)
        for i := 0; i < b.N; i++ {
            // Benchmarked code start
            for j := 0; j < len(testData); j++ {
                doSth(testData[j])
            }
            // end
        }
    }

    func BenchmarkLenOutside(b *testing.B) {
        testData := make([]int, 1000000)
        for i := 0; i < b.N; i++ {
            // Benchmarked code start
            sliceLen := len(testData)
            for j := 0; j < sliceLen; j++ {
                doSth(testData[j])
            }
            // end
        }
    }

    func doSth(n int) {
        _ = n + n
    }
Run the benchmark:
go test -bench .
Example output:
goos: linux
goarch: amd64
pkg: forperf
BenchmarkLenInside-6 4543 259117 ns/op
BenchmarkLenOutside-6 4620 258069 ns/op
PASS
ok forperf 3.811s
> The benchmark function must run the target code b.N times. During benchmark execution, b.N is adjusted until the benchmark function lasts long enough to be timed reliably.
As you can see in this benchmark run, calling len in the for loop condition is only slightly slower than reading the length once before the loop.
Variant | Iterations run (b.N) | Avg time per b.N iteration |
---|---|---|
len inside the loop | 4543 | 259117 ns |
len outside the loop | 4620 | 258069 ns |
Note that repeated runs give different results. If the time with len inside the loop is A and the time with len outside the loop is B, then A < B, A ≈ B, or even A > B is possible.
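To observe this run-to-run variation directly, the two variants can also be timed from an ordinary program with testing.Benchmark, which runs a benchmark function outside of go test. The following is only a sketch: the helper names sumLenInside and sumLenOutside and the package-level sink variable are assumptions introduced for illustration (they replace doSth so that each loop produces a result the program actually uses), and the numbers printed will differ between machines and between rounds.

```go
package main

import (
	"fmt"
	"testing"
)

// sink stores each result so the summing loops have an observable effect.
// It is a hypothetical helper, not part of the original benchmark.
var sink int

// sumLenInside calls len(data) in the loop condition.
func sumLenInside(data []int) int {
	total := 0
	for j := 0; j < len(data); j++ {
		total += data[j]
	}
	return total
}

// sumLenOutside reads the length once before the loop.
func sumLenOutside(data []int) int {
	total := 0
	n := len(data)
	for j := 0; j < n; j++ {
		total += data[j]
	}
	return total
}

func main() {
	data := make([]int, 1000000)
	// Repeat the comparison a few times to see how much the results move.
	for round := 1; round <= 3; round++ {
		in := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = sumLenInside(data)
			}
		})
		out := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = sumLenOutside(data)
			}
		})
		fmt.Printf("round %d: len inside %d ns/op, len outside %d ns/op\n",
			round, in.NsPerOp(), out.NsPerOp())
	}
}
```

If the differences between rounds are as large as the difference between the two variants, the measurement does not support preferring one form over the other.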
Answer 2
Score: 1
> Are the condition expressions in "for" statement evaluated at each iteration or only once?
The specification says that the condition is evaluated before each iteration.
Because the value of i changes on each iteration, it makes sense that the conditions i<len(slice) and i<sliceLen are evaluated before each iteration.
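The per-iteration evaluation can be made visible by putting a function call with a side effect in the condition. This is a small sketch; conditionEvaluations and the limit closure are names made up for the illustration.

```go
package main

import "fmt"

// conditionEvaluations runs a loop whose body executes n times and reports
// how many times the loop condition was evaluated.
func conditionEvaluations(n int) int {
	calls := 0
	// limit counts every evaluation of the condition expression.
	limit := func() int {
		calls++
		return n
	}
	for i := 0; i < limit(); i++ {
		// empty body: only the condition matters here
	}
	return calls
}

func main() {
	// The condition is checked for i = 0, 1, 2 (true) and i = 3 (false).
	fmt.Println(conditionEvaluations(3)) // prints 4
}
```

The count is one more than the number of iterations because the condition is also evaluated the final time, when it turns out to be false.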
The compiler can hoist parts of the condition expression evaluation out of the loop as long as the resulting program executes as if the expression were evaluated every time. For example, the compiler can load len(slice) or sliceLen into a register before the loop and use that register in the loop.
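The "as if" condition matters when the loop body changes the slice. In the sketch below (function names are hypothetical), the body shrinks the slice by one element per iteration, so the two forms from the question are no longer equivalent and a compiler could not simply replace len(s) with a value read once before the loop.

```go
package main

import "fmt"

// iterationsLiveLen re-reads len(s) in the condition while the body shrinks s.
func iterationsLiveLen(n int) int {
	s := make([]int, n)
	count := 0
	for i := 0; i < len(s); i++ {
		s = s[:len(s)-1] // the length drops by one on every iteration
		count++
	}
	return count
}

// iterationsCachedLen reads the length once, so later changes to s are ignored.
func iterationsCachedLen(n int) int {
	s := make([]int, n)
	count := 0
	sliceLen := len(s)
	for i := 0; i < sliceLen; i++ {
		s = s[:len(s)-1]
		count++
	}
	return count
}

func main() {
	fmt.Println(iterationsLiveLen(6))   // prints 3
	fmt.Println(iterationsCachedLen(6)) // prints 6
}
```

When the body does not modify the slice, as in the question, both forms run the same number of iterations and the hoisting described above is safe.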
> Is there any performance difference between two examples below
Both code snippets compare variable i to a value read from a variable. In the first snippet, the value is read from the slice header length field; in the second, it is read from the local variable sliceLen. See Slices: usage and internals if you are not familiar with how a slice is implemented.
The performance should be similar, if not identical.
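As a rough picture of what "read from the slice header length field" means, the sketch below declares a struct with the three parts a slice value is described as having: a pointer to the backing array, a length, and a capacity. The sliceHeader type and headerMatches function are hypothetical and exist only for illustration; the exact in-memory layout is an implementation detail of the Go toolchain, not something the language specification promises.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader is an illustrative mirror of a slice value: pointer, length,
// capacity. It is an assumption for this example, not the runtime's type.
type sliceHeader struct {
	data unsafe.Pointer
	len  int
	cap  int
}

// headerMatches reports whether a slice value occupies the same number of
// bytes as the three-field struct above.
func headerMatches() bool {
	var s []int
	return unsafe.Sizeof(s) == unsafe.Sizeof(sliceHeader{})
}

func main() {
	s := make([]int, 3, 8)
	// len and cap simply report values stored alongside the data pointer.
	fmt.Println(len(s), cap(s)) // prints 3 8
	fmt.Println(headerMatches())
}
```

Seen this way, len(slice) in the loop condition is a field read rather than a computation over the elements, which is consistent with the two loops performing about the same.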