2022年7月11日 09:55:10go评论67阅读模式

英文:

Why is accessing a variable so much slower than accessing len()?

问题

我写了这个函数uniq，它接受一个已排序的int切片，并返回删除重复项后的切片：

func uniq(x []int) []int {
    i := 0
    for i < len(x)-1 {
        if x[i] == x[i+1] {
            copy(x[i:], x[i+1:])
            x = x[:len(x)-1]
        } else {
            i++
        }
    }
    return x
}

还有uniq2，是对uniq进行了重写，结果相同：

func uniq2(x []int) []int {
    i := 0
    l := len(x)
    for i < l-1 {
        if x[i] == x[i+1] {
            copy(x[i:], x[i+1:])
            l--
        } else {
            i++
        }
    }
    return x[:l]
}

这两个函数的唯一区别是，在uniq2中，我保存了len(x)到变量l中，并在移动切片时递减它，而不是每次都切片x和直接访问len(x)。

我认为uniq2应该比uniq稍微快一点，因为不再在迭代中调用len(x)，但实际上，它的速度却不可思议地慢得多。

通过这个在Linux上运行的测试，它生成一个随机排序的切片，并在该切片上调用uniq/uniq2 1000 次：

func main() {
    rand.Seed(time.Now().Unix())
    for i := 0; i < 1000; i++ {
        _ = uniq(genSlice())
        //_ = uniq2(genSlice())
    }
}
func genSlice() []int {
    x := make([]int, 0, 1000)
    for num := 1; num <= 10; num++ {
        amount := rand.Intn(1000)
        for i := 0; i < amount; i++ {
            x = append(x, num)
        }
    }
    return x
}

$ go build uniq.go
$ time ./uniq

uniq通常需要5-6秒才能完成，而uniq2要慢得多，需要12-15秒。

为什么在uniq2中，我保存切片长度到一个变量中，会比直接调用len的uniq慢得多？

它不应该稍微快一点吗？

英文:

I wrote this function uniq that takes in a sorted slice of ints
and returns the slice with duplicates removed:

func uniq(x []int) []int {
	i := 0
	for i &lt; len(x)-1 {
		if x[i] == x[i+1] {
			copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	return x
}

and uniq2, a rewrite of uniq with the same results:

func uniq2(x []int) []int {
	i := 0
	l := len(x)
	for i &lt; l-1 {
		if x[i] == x[i+1] {
			copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	return x[:l]
}

The only difference between the two functions
is that in uniq2, instead of slicing x
and directly accessing len(x) each time,
I save len(x) to a variable l
and decrement it whenever I shift the slice.

I thought that uniq2 would be slightly faster than uniq
because len(x) would no longer be called iteration,
but in reality, it is inexplicably much slower.

With this test that generates a random sorted slice
and calls uniq/uniq2 on it 1000 times,
which I run on Linux:

func main() {
	rand.Seed(time.Now().Unix())
	for i := 0; i &lt; 1000; i++ {
		_ = uniq(genSlice())
		//_ = uniq2(genSlice())
	}
}
func genSlice() []int {
	x := make([]int, 0, 1000)
	for num := 1; num &lt;= 10; num++ {
		amount := rand.Intn(1000)
		for i := 0; i &lt; amount; i++ {
			x = append(x, num)
		}
	}
	return x
}

$ go build uniq.go
$ time ./uniq

uniq usually takes 5--6 seconds to finish.
while uniq2 is more than two times slower,
taking between 12--15 seconds.

Why is uniq2, where I save the slice length to a variable,
so much slower than uniq, where I directly call len?

Shouldn't it slightly faster?

答案1

得分: 3

你预计执行时间大致相同，因为你认为它们做的事情大致相同。

唯一的区别在于，在uniq2中，我将x切片和直接访问len(x)的操作替换为将len(x)保存到变量l中，并在每次移动切片时将其递减。

这是错误的。

第一个版本执行以下操作：

        copy(x[i:], x[i+1:])
        x = x[:len(x)-1]

而第二个版本执行以下操作：

        copy(x[i:], x[i+1:])
        l--

第一个区别是，第一个版本分配（复制）了一个切片头部，它是一个reflect.SliceHeader值，包含3个整数（在64位架构上占用24字节），而l--只是一个简单的递减操作，速度更快。

但是，主要的区别并不源自此。主要的区别在于，由于第一个版本更改了x切片（包括头部和长度），因此您最终会复制越来越少的元素，而第二个版本不会更改x，并且始终复制到切片的末尾。x[i+1:]等同于x[x+1:len(x)]。

为了证明这一点，想象一下您传递了一个长度为10且所有元素都相等的切片。第一个版本将首先复制9个元素，然后是8个，然后是7个等等。第二个版本将首先复制9个元素，然后再次复制9个，然后再次复制9个等等。

让我们修改您的函数以计算复制的元素数量：

func uniq(x []int) []int {
	count := 0
	i := 0
	for i < len(x)-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	fmt.Println("uniq copied", count, "elements")
	return x
}

func uniq2(x []int) []int {
	count := 0
	i := 0
	l := len(x)
	for i < l-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	fmt.Println("uniq2 copied", count, "elements")
	return x[:l]
}

进行测试：

uniq(make([]int, 1000))
uniq2(make([]int, 1000))

输出结果为：

uniq copied 499500 elements
uniq2 copied 998001 elements

uniq2()复制的元素是uniq()的两倍！

如果我们使用一个随机切片进行测试：

uniq(genSlice())
uniq2(genSlice())

输出结果为：

uniq copied 7956671 elements
uniq2 copied 11900262 elements

同样，uniq2()复制的元素大约是uniq()的1.5倍！（但这在很大程度上取决于随机数。）

在Go Playground上尝试这些示例。

"修复"是修改uniq2()以复制到l为止：

        copy(x[i:], x[i+1:l])
        l--

通过这个"适当"的更改，性能大致相同。

英文:

You expect roughly the same execution time because you think they do roughly the same thing.

> The only difference between the two functions is that in uniq2, instead of slicing x and directly accessing len(x) each time, I save len(x) to a variable l and decrement it whenever I shift the slice.

This is wrong.

The first version does:

        copy(x[i:], x[i+1:])
        x = x[:len(x)-1]

And second does:

        copy(x[i:], x[i+1:])
        l--

The first difference is that the first assigns (copies) a slice header which is a reflect.SliceHeader value, being 3 integer (24 bytes on 64-bit architecture), while l-- does a simple decrement, it's much faster.

But the main difference does not stem from this. The main difference is that since the first version changes the x slice (the header, the length included), you end up copying less and less elements, while the second version does not change x and always copies to the end of the slice. x[i+1:] is equivalent to x[x+1:len(x)].

To demonstrate, imagine you pass a slice with length=10 and having all equal elements. The first version will copy 9 elements first, then 8, then 7 etc. The second version will copy 9 elements first, then 9 again, then 9 again etc.

Let's modify your functions to count the number of copied elements:

func uniq(x []int) []int {
	count := 0
	i := 0
	for i &lt; len(x)-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	fmt.Println(&quot;uniq copied&quot;, count, &quot;elements&quot;)
	return x
}

func uniq2(x []int) []int {
	count := 0
	i := 0
	l := len(x)
	for i &lt; l-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	fmt.Println(&quot;uniq2 copied&quot;, count, &quot;elements&quot;)
	return x[:l]
}

Testing it:

uniq(make([]int, 1000))
uniq2(make([]int, 1000))

Output is:

uniq copied 499500 elements
uniq2 copied 998001 elements

uniq2() copies twice as many elements!

If we test it with a random slice:

uniq(genSlice())
uniq2(genSlice())

Output is:

uniq copied 7956671 elements
uniq2 copied 11900262 elements

Again, uniq2() copies roughly 1.5 times more elements! (But this greatly depends on the random numbers.)

Try the examples on the Go Playground.

The "fix" is to modify uniq2() to copy until l:

        copy(x[i:], x[i+1:l])
        l--

With this "appropriate" change, performance is roughly the same.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

访问变量为什么比访问len()函数慢得多？

问题

答案1

从字符串中提取数字集合的正则表达式，当前缀出现时，不匹配该前缀。

In what situation will http.ListenAndServe return

Golang在路由之前修改HTTP请求参数，例如URL路径

使用接口来模拟数据库返回类型

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论