Is there an efficient way of reclaiming over-capacity slices?
Question
I have a large number of allocated slices (a few million) which I have appended to. I'm sure a large number of them are over capacity. I want to try and reduce memory usage.
My first attempt was to iterate over all of them, allocate a new slice of len(oldSlice), and copy the values over. Unfortunately this appears to increase memory usage (up to double), and garbage collection is slow to reclaim the memory.
Is there a good general way to slim down memory usage for a large number of over-capacity slices?
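For concreteness, the copy-based shrinking described in the question can be sketched like this (a minimal example; note that the old, larger backing arrays are only reclaimed once the garbage collector runs and nothing else references them, which is why memory can temporarily double):

```go
package main

import "fmt"

// shrink copies s into a slice whose capacity is exactly len(s),
// letting the over-capacity backing array become garbage
// once nothing else references it.
func shrink(s []byte) []byte {
	out := make([]byte, len(s))
	copy(out, s)
	return out
}

func main() {
	s := make([]byte, 3, 1024) // length 3, capacity 1024
	s2 := shrink(s)
	fmt.Println(len(s2), cap(s2)) // 3 3
}
```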
Answer 1 (score: 1)
Choosing the right strategy to allocate your buffers is hard without knowing the exact problem.

In general you can try to reuse your buffers:

    type buffer struct{}

    // buffers is a leaky pool: it holds up to 1024 buffers for reuse.
    var buffers = make(chan *buffer, 1024)

    // newBuffer returns a pooled buffer if one is available,
    // otherwise it allocates a fresh one.
    func newBuffer() *buffer {
        select {
        case b := <-buffers:
            return b
        default:
            return &buffer{}
        }
    }

    // returnBuffer puts b back in the pool, dropping it when the pool is full.
    func returnBuffer(b *buffer) {
        select {
        case buffers <- b:
        default:
        }
    }
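For illustration, here is a usage sketch of such a pool. The data field is a hypothetical addition so the buffer carries some reusable capacity; the pool size of 1024 is arbitrary:

```go
package main

import "fmt"

type buffer struct{ data []byte }

// buffers is a leaky pool holding up to 1024 buffers for reuse.
var buffers = make(chan *buffer, 1024)

func newBuffer() *buffer {
	select {
	case b := <-buffers:
		return b
	default:
		return &buffer{}
	}
}

func returnBuffer(b *buffer) {
	select {
	case buffers <- b:
	default:
	}
}

func main() {
	b := newBuffer()
	b.data = append(b.data[:0], "hello"...) // reuse any existing capacity
	returnBuffer(b)

	// The next caller gets the same buffer back, capacity intact.
	b2 := newBuffer()
	fmt.Println(b == b2, cap(b2.data) >= 5) // true true
}
```

The standard library's sync.Pool serves the same purpose and additionally releases pooled objects under GC pressure, which may be preferable when memory is the main concern.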
Answer 2 (score: -1)
The heuristic used in append may not be suitable for all applications. It's designed for use when you don't know the final length of the data you'll be storing. Instead of iterating over the slices later, I'd try to minimize the amount of extra capacity you allocate as early as possible. Here's a simple example of one strategy: use a scratch buffer only while the length is unknown, and reuse that buffer:

    type buffer struct {
        names []string
        ... // possibly other things
    }

    // assume this is called frequently and has lots and lots of names
    func (b *buffer) readNames(lines *bufio.Scanner) ([]string, error) {
        // Reset to zero length, so we can reuse the existing capacity
        b.names = b.names[:0]
        for lines.Scan() {
            b.names = append(b.names, lines.Text())
        }
        // Figure out the error
        err := lines.Err()
        if err == io.EOF {
            err = nil
        }
        // Allocate a minimal slice of exactly the final length
        out := make([]string, len(b.names))
        copy(out, b.names)
        return out, err
    }
Of course, you'll need to modify this if you need something that's safe for concurrent use; for that I'd recommend using a buffered channel as a leaky bucket for storing your buffers.
Comments