访问变量为什么比访问len()函数慢得多?

huangapple go评论84阅读模式
英文:

Why is accessing a variable so much slower than accessing len()?

问题

我写了这个函数uniq,它接受一个已排序的int切片,并返回删除重复项后的切片:

func uniq(x []int) []int {
    i := 0
    for i < len(x)-1 {
        if x[i] == x[i+1] {
            copy(x[i:], x[i+1:])
            x = x[:len(x)-1]
        } else {
            i++
        }
    }
    return x
}

还有uniq2,是对uniq进行了重写,结果相同:

func uniq2(x []int) []int {
    i := 0
    l := len(x)
    for i < l-1 {
        if x[i] == x[i+1] {
            copy(x[i:], x[i+1:])
            l--
        } else {
            i++
        }
    }
    return x[:l]
}

这两个函数的唯一区别是,在uniq2中,我保存了len(x)到变量l中,并在移动切片时递减它,而不是每次都切片x和直接访问len(x)

我认为uniq2应该比uniq稍微快一点,因为不再在迭代中调用len(x),但实际上,它的速度却不可思议地慢得多。

通过这个在Linux上运行的测试,它生成一个随机排序的切片,并在该切片上调用uniq/uniq2 1000 次:

func main() {
    rand.Seed(time.Now().Unix())
    for i := 0; i < 1000; i++ {
        _ = uniq(genSlice())
        //_ = uniq2(genSlice())
    }
}
func genSlice() []int {
    x := make([]int, 0, 1000)
    for num := 1; num <= 10; num++ {
        amount := rand.Intn(1000)
        for i := 0; i < amount; i++ {
            x = append(x, num)
        }
    }
    return x
}
$ go build uniq.go
$ time ./uniq

uniq通常需要5-6秒才能完成,而uniq2要慢得多,需要12-15秒。

为什么在uniq2中,我保存切片长度到一个变量中,会比直接调用lenuniq慢得多?

它不应该稍微快一点吗?

英文:

I wrote this function uniq that takes in a sorted slice of ints
and returns the slice with duplicates removed:

func uniq(x []int) []int {
	i := 0
	for i &lt; len(x)-1 {
		if x[i] == x[i+1] {
			copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	return x
}

and uniq2, a rewrite of uniq with the same results:

func uniq2(x []int) []int {
	i := 0
	l := len(x)
	for i &lt; l-1 {
		if x[i] == x[i+1] {
			copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	return x[:l]
}

The only difference between the two functions
is that in uniq2, instead of slicing x
and directly accessing len(x) each time,
I save len(x) to a variable l
and decrement it whenever I shift the slice.

I thought that uniq2 would be slightly faster than uniq
because len(x) would no longer be called iteration,
but in reality, it is inexplicably much slower.

With this test that generates a random sorted slice
and calls uniq/uniq2 on it 1000 times,
which I run on Linux:

func main() {
	rand.Seed(time.Now().Unix())
	for i := 0; i &lt; 1000; i++ {
		_ = uniq(genSlice())
		//_ = uniq2(genSlice())
	}
}
func genSlice() []int {
	x := make([]int, 0, 1000)
	for num := 1; num &lt;= 10; num++ {
		amount := rand.Intn(1000)
		for i := 0; i &lt; amount; i++ {
			x = append(x, num)
		}
	}
	return x
}
$ go build uniq.go
$ time ./uniq

uniq usually takes 5--6 seconds to finish.
while uniq2 is more than two times slower,
taking between 12--15 seconds.

Why is uniq2, where I save the slice length to a variable,
so much slower than uniq, where I directly call len?

Shouldn't it slightly faster?

答案1

得分: 3

你预计执行时间大致相同,因为你认为它们做的事情大致相同。

唯一的区别在于,在uniq2中,我将x切片和直接访问len(x)的操作替换为将len(x)保存到变量l中,并在每次移动切片时将其递减。

这是错误的。

第一个版本执行以下操作:

        copy(x[i:], x[i+1:])
        x = x[:len(x)-1]

而第二个版本执行以下操作:

        copy(x[i:], x[i+1:])
        l--

第一个区别是,第一个版本分配(复制)了一个切片头部,它是一个reflect.SliceHeader值,包含3个整数(在64位架构上占用24字节),而l--只是一个简单的递减操作,速度更快。

但是,主要的区别并不源自此。主要的区别在于,由于第一个版本更改了x切片(包括头部和长度),因此您最终会复制越来越少的元素,而第二个版本不会更改x,并且始终复制到切片的末尾。x[i+1:]等同于x[x+1:len(x)]

为了证明这一点,想象一下您传递了一个长度为10且所有元素都相等的切片。第一个版本将首先复制9个元素,然后是8个,然后是7个等等。第二个版本将首先复制9个元素,然后再次复制9个,然后再次复制9个等等。

让我们修改您的函数以计算复制的元素数量:

func uniq(x []int) []int {
	count := 0
	i := 0
	for i < len(x)-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	fmt.Println("uniq copied", count, "elements")
	return x
}

func uniq2(x []int) []int {
	count := 0
	i := 0
	l := len(x)
	for i < l-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	fmt.Println("uniq2 copied", count, "elements")
	return x[:l]
}

进行测试:

uniq(make([]int, 1000))
uniq2(make([]int, 1000))

输出结果为:

uniq copied 499500 elements
uniq2 copied 998001 elements

uniq2()复制的元素是uniq()的两倍!

如果我们使用一个随机切片进行测试:

uniq(genSlice())
uniq2(genSlice())

输出结果为:

uniq copied 7956671 elements
uniq2 copied 11900262 elements

同样,uniq2()复制的元素大约是uniq()的1.5倍!(但这在很大程度上取决于随机数。)

Go Playground上尝试这些示例。

"修复"是修改uniq2()以复制到l为止:

        copy(x[i:], x[i+1:l])
        l--

通过这个"适当"的更改,性能大致相同。

英文:

You expect roughly the same execution time because you think they do roughly the same thing.

> The only difference between the two functions is that in uniq2, instead of slicing x and directly accessing len(x) each time, I save len(x) to a variable l and decrement it whenever I shift the slice.

This is wrong.

The first version does:

        copy(x[i:], x[i+1:])
        x = x[:len(x)-1]

And second does:

        copy(x[i:], x[i+1:])
        l--

The first difference is that the first assigns (copies) a slice header which is a reflect.SliceHeader value, being 3 integer (24 bytes on 64-bit architecture), while l-- does a simple decrement, it's much faster.

But the main difference does not stem from this. The main difference is that since the first version changes the x slice (the header, the length included), you end up copying less and less elements, while the second version does not change x and always copies to the end of the slice. x[i+1:] is equivalent to x[x+1:len(x)].

To demonstrate, imagine you pass a slice with length=10 and having all equal elements. The first version will copy 9 elements first, then 8, then 7 etc. The second version will copy 9 elements first, then 9 again, then 9 again etc.

Let's modify your functions to count the number of copied elements:

func uniq(x []int) []int {
	count := 0
	i := 0
	for i &lt; len(x)-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	fmt.Println(&quot;uniq copied&quot;, count, &quot;elements&quot;)
	return x
}

func uniq2(x []int) []int {
	count := 0
	i := 0
	l := len(x)
	for i &lt; l-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	fmt.Println(&quot;uniq2 copied&quot;, count, &quot;elements&quot;)
	return x[:l]
}

Testing it:

uniq(make([]int, 1000))
uniq2(make([]int, 1000))

Output is:

uniq copied 499500 elements
uniq2 copied 998001 elements

uniq2() copies twice as many elements!

If we test it with a random slice:

uniq(genSlice())
uniq2(genSlice())

Output is:

uniq copied 7956671 elements
uniq2 copied 11900262 elements

Again, uniq2() copies roughly 1.5 times more elements! (But this greatly depends on the random numbers.)

Try the examples on the Go Playground.

The "fix" is to modify uniq2() to copy until l:

        copy(x[i:], x[i+1:l])
        l--

With this "appropriate" change, performance is roughly the same.

huangapple
  • 本文由 发表于 2022年7月11日 09:55:10
  • 转载请务必保留本文链接:https://go.coder-hub.com/72933040.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定