Does Go garbage collect parts of slices?
Question
If I implement a queue like this...
package main

import (
	"fmt"
)

func PopFront(q *[]string) string {
	r := (*q)[0]
	*q = (*q)[1:len(*q)]
	return r
}

func PushBack(q *[]string, a string) {
	*q = append(*q, a)
}

func main() {
	q := make([]string, 0)
	PushBack(&q, "A")
	fmt.Println(q)
	PushBack(&q, "B")
	fmt.Println(q)
	PushBack(&q, "C")
	fmt.Println(q)
	PopFront(&q)
	fmt.Println(q)
	PopFront(&q)
	fmt.Println(q)
}
... I end up with an array ["A", "B", "C"] that has no slices pointing to the first two elements. Since the "start" pointer of a slice can never be decremented (AFAIK), those elements can never be accessed.
Is Go's garbage collector smart enough to free them?
Answer 1
Score: 38
Slices are just descriptors (small struct-like data structures) which, once no longer referenced, will be garbage collected properly.
The underlying array of a slice (the one the descriptor points to), on the other hand, is shared between all slices created by reslicing it. Quoting the Go Language Specification, Slice types:
> A slice, once initialized, is always associated with an underlying array that holds its elements. A slice therefore shares storage with its array and with other slices of the same array; by contrast, distinct arrays always represent distinct storage.
Therefore, as long as at least one such slice exists, or a variable holding the array (if the slice was created by slicing an array), the array will not be garbage collected.
Official statement about this:
The blog post Go Slices: usage and internals by Andrew Gerrand clearly states this behaviour:
> As mentioned earlier, re-slicing a slice doesn't make a copy of the underlying array. The full array will be kept in memory until it is no longer referenced. Occasionally this can cause the program to hold all the data in memory when only a small piece of it is needed.
>
> ...
>
> Since the slice references the original array, as long as the slice is kept around the garbage collector can't release the array.
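The shared storage described in the quotes above can be sketched as follows. This is only an illustration; the function name sharesStorage is made up for this example and does not come from the spec or the blog post.

```go
package main

import "fmt"

// sharesStorage reports whether a write made through a resliced view is
// visible through the slice it was created from.
func sharesStorage() bool {
	base := []string{"A", "B", "C"}
	tail := base[1:] // new descriptor, same underlying array
	tail[0] = "changed"
	return base[1] == "changed"
}

func main() {
	fmt.Println(sharesStorage()) // prints "true"
}
```

Because tail and base point into the same array, keeping either of them alive keeps the whole array alive.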
Back to your example
While the underlying array will not be freed, note that if you add new elements to the queue, the built-in append function may occasionally allocate a new array and copy the current elements to it. That copy covers only the elements of the slice, not the whole underlying array! When such a reallocation and copy occurs, the "old" array may be garbage collected if no other reference to it exists.
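A minimal sketch of that reallocation, assuming a queue whose array is exactly full before two elements are popped. The helper name growAfterPop is hypothetical, and the exact capacity chosen by append is an implementation detail, so the example prints it without relying on a specific value.

```go
package main

import "fmt"

// growAfterPop drops two elements from the front of a full three-element
// queue and then appends once more. It returns the length and capacity
// after the append and whether the first element moved to a new array.
func growAfterPop() (length, capacity int, moved bool) {
	q := make([]string, 0, 3)
	q = append(q, "A", "B", "C") // fills the array exactly
	q = q[2:]                    // len 1, cap 1: only "C" is reachable
	before := &q[0]
	q = append(q, "D") // no spare capacity, so append must reallocate
	return len(q), cap(q), before != &q[0]
}

func main() {
	length, capacity, moved := growAfterPop()
	fmt.Println(length, capacity, moved)
}
```

After the append, the queue lives in a fresh array that holds only "C" and "D"; the old three-element array becomes collectable once nothing else refers to it.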
Another very important point: if an element is popped from the front, the slice is resliced and no longer contains a reference to the popped element, but since the underlying array still holds that value, the value also remains in memory (not just the array). Whenever an element is popped or removed from your queue (slice/array), it is recommended to always zero it (its respective element in the slice) so the value does not remain in memory needlessly. This becomes even more critical if your slice contains pointers to big data structures.
func PopFront(q *[]string) string {
	r := (*q)[0]
	(*q)[0] = "" // Always zero the removed element!
	*q = (*q)[1:len(*q)]
	return r
}
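For element types other than string, the same idea might look like the sketch below. The Queue type and its methods are hypothetical and only illustrate zeroing with a generic zero value (this needs Go 1.18 or later); the empty-queue check is an addition that the original PopFront does not have.

```go
package main

import "fmt"

// Queue is a hypothetical generic FIFO backed by a slice.
type Queue[T any] struct {
	items []T
}

// PushBack appends v to the end of the queue.
func (q *Queue[T]) PushBack(v T) { q.items = append(q.items, v) }

// PopFront removes and returns the first element. It zeroes the vacated
// slot so the underlying array no longer references the popped value.
func (q *Queue[T]) PopFront() (v T, ok bool) {
	if len(q.items) == 0 {
		return v, false
	}
	v = q.items[0]
	var zero T
	q.items[0] = zero
	q.items = q.items[1:]
	return v, true
}

// Len returns the number of queued elements.
func (q *Queue[T]) Len() int { return len(q.items) }

func main() {
	q := &Queue[*[1024]byte]{}
	q.PushBack(new([1024]byte))
	q.PushBack(new([1024]byte))
	backing := q.items // kept only to inspect the underlying array
	q.PopFront()
	fmt.Println(backing[0] == nil, q.Len()) // prints "true 1"
}
```

With pointer elements, the zeroed slot means the popped buffer is no longer reachable through the array and can be collected as soon as the caller drops it.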
The same advice appears on the Slice Tricks wiki page:
> ### Delete without preserving order
>
> a[i] = a[len(a)-1]
> a = a[:len(a)-1]
>
> NOTE If the type of the element is a pointer or a struct with pointer fields, which need to be garbage collected, the above implementations of Cut and Delete have a potential memory leak problem: some elements with values are still referenced by slice a and thus can not be collected.
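One way to avoid the leak described in that note is to clear the vacated last slot before shrinking the slice. The helper below is a sketch under that assumption; the name deleteUnordered is made up for this example.

```go
package main

import "fmt"

// deleteUnordered removes a[i] without preserving order. It clears the
// vacated last slot so the underlying array drops its reference.
func deleteUnordered[T any](a []T, i int) []T {
	last := len(a) - 1
	a[i] = a[last]
	var zero T
	a[last] = zero
	return a[:last]
}

func main() {
	x, y, z := 1, 2, 3
	a := []*int{&x, &y, &z}
	full := a // kept only to inspect the underlying array
	a = deleteUnordered(a, 0)
	fmt.Println(len(a), *a[0], *a[1], full[2] == nil) // prints "2 3 2 true"
}
```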
Answer 2
Score: 6
No. At the time of this writing, the Go garbage collector (GC) is not smart enough to collect the beginning of an underlying array in a slice, even if it is inaccessible.
As mentioned by others here, a slice (under the hood) is a struct of exactly three things: a pointer to its underlying array, the length of the slice (values accessible without reslicing), and the capacity of the slice (values accessible by reslicing). On the Go blog, slice internals are discussed at length. Here is another article I like about Go memory layouts.
When you reslice and cut off the tail end of a slice, it is obvious (upon understanding the internals) that the underlying array, the pointer to the underlying array, and the slice's capacity are all left unchanged; only the slice length field is updated. When you re-slice and cut off the beginning of a slice, you are really changing the pointer to the underlying array along with the length and capacity. In this case, it is generally unclear (based on my readings) why the GC does not clean up this inaccessible part of the underlying array because you cannot re-slice the array to access it again. My assumption is that the underlying array is treated as one block of memory from the GC's point of view. If you can point to any part of the underlying array, the entire thing is ineligible for deallocation.
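The difference between cutting the tail and cutting the head can be observed without looking at the runtime's internal struct. In this sketch the view type and the describe helper are hypothetical; the start index is derived from the capacities and is only valid for views produced by plain two-index slicing of orig.

```go
package main

import "fmt"

// view summarises a slice descriptor: its length, its capacity, and the
// index in the original array at which it starts.
type view struct {
	length, capacity, start int
}

// describe compares a resliced view s against the slice orig it came from.
func describe(orig, s []int) view {
	return view{length: len(s), capacity: cap(s), start: cap(orig) - cap(s)}
}

func main() {
	s := make([]int, 5, 8)
	fmt.Println(describe(s, s[:3])) // tail cut: prints "{3 8 0}"
	fmt.Println(describe(s, s[2:])) // head cut: prints "{3 6 2}"
}
```

Cutting the tail changes only the length, while cutting the head advances the start and shrinks both length and capacity, which is why the skipped prefix cannot be reached again through that slice.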
I know what you're thinking... like the true computer scientist you are, you may want some proof. I'll indulge you:
https://goplay.space/#tDBQs1DfE2B
As mentioned by others and as shown in the sample code, using append can cause a reallocation and copy of the underlying array, which allows the old underlying array to be garbage collected.
Answer 3
Score: 2
Simple question, simple answer: No. (But if you keep pushing, the slice will at some point overflow its underlying array, and then the unused elements become available to be freed.)
Answer 4
Score: -1
Contrary to what I'm reading, Golang certainly seems to garbage collect at least the unused starting sections of slices. The following test case provides evidence.
In the first case the slice is set to slice[1:] in each iteration. In the comparison case, that step is skipped.
The memory consumed in the second case dwarfs that of the first. But why?
// Needs the imports "fmt", "runtime" and "testing".
func TestArrayShiftMem(t *testing.T) {
	slice := [][1024]byte{}
	mem := runtime.MemStats{}
	mem2 := runtime.MemStats{}
	runtime.GC()
	runtime.ReadMemStats(&mem)
	for i := 0; i < 1024*1024*1024*1024; i++ {
		slice = append(slice, [1024]byte{})
		slice = slice[1:]
		runtime.GC()
		if i%(1024) == 0 {
			runtime.ReadMemStats(&mem2)
			fmt.Println(mem2.HeapInuse - mem.HeapInuse)
			fmt.Println(mem2.StackInuse - mem.StackInuse)
			fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
		}
	}
}

func TestArrayShiftMem3(t *testing.T) {
	slice := [][1024]byte{}
	mem := runtime.MemStats{}
	mem2 := runtime.MemStats{}
	runtime.GC()
	runtime.ReadMemStats(&mem)
	for i := 0; i < 1024*1024*1024*1024; i++ {
		slice = append(slice, [1024]byte{})
		// slice = slice[1:]
		runtime.GC()
		if i%(1024) == 0 {
			runtime.ReadMemStats(&mem2)
			fmt.Println(mem2.HeapInuse - mem.HeapInuse)
			fmt.Println(mem2.StackInuse - mem.StackInuse)
			fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
		}
	}
}
Output Test1:
go test -run=.Mem -v .
...
0
393216
21472
^CFAIL github.com/ds0nt/cs-mind-grind/arrays 1.931s
Output Test3:
go test -run=.Mem3 -v .
...
19193856
393216
19213888
^CFAIL github.com/ds0nt/cs-mind-grind/arrays 2.175s
If you disable garbage collection in the first test, memory does indeed skyrocket. The resulting code looks like this:
// Additionally needs the import "runtime/debug".
func TestArrayShiftMem2(t *testing.T) {
	debug.SetGCPercent(-1)
	slice := [][1024]byte{}
	mem := runtime.MemStats{}
	mem2 := runtime.MemStats{}
	runtime.GC()
	runtime.ReadMemStats(&mem)
	// 1kb per
	for i := 0; i < 1024*1024*1024*1024; i++ {
		slice = append(slice, [1024]byte{})
		slice = slice[1:]
		// runtime.GC()
		if i%(1024) == 0 {
			fmt.Println("len, cap:", len(slice), cap(slice))
			runtime.ReadMemStats(&mem2)
			fmt.Println(mem2.HeapInuse - mem.HeapInuse)
			fmt.Println(mem2.StackInuse - mem.StackInuse)
			fmt.Println(mem2.HeapAlloc - mem.HeapAlloc)
		}
	}
}