Is there an efficient way of reclaiming over-capacity slices?

Question

I have a large number of allocated slices (a few million) which I have appended to. I'm sure a large number of them are over capacity. I want to try and reduce memory usage.

My first attempt is to iterate over all of them, allocate a new slice of len(oldSlice) and copy the values over. Unfortunately this appears to increase memory usage (up to double) and the garbage collection is slow to reclaim the memory.

Is there a good general way to slim down memory usage for a large number of over-capacity slices?
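The shrink-by-copy step described in the question can be sketched as follows (a minimal illustration with an `int` element type; the `shrink` name is hypothetical, and other element types would need their own copies):

```go
// shrink returns a copy of s whose capacity equals its length, so the
// over-capacity backing array of s becomes eligible for garbage collection
// once nothing else references it.
func shrink(s []int) []int {
	out := make([]int, len(s))
	copy(out, s)
	return out
}
```

As the question notes, both the old and new backing arrays are live during the copy, so peak memory temporarily rises before the collector reclaims the old arrays.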

Answer 1

Score: 1

Choosing the right strategy to allocate your buffers is hard without knowing the exact problem.

In general you can try to reuse your buffers:

type buffer struct{}

var buffers = make(chan *buffer, 1024)

func newBuffer() *buffer {
	select {
	case b := <-buffers:
		return b
	default:
		return &buffer{}
	}
}

func returnBuffer(b *buffer) {
	select {
	case buffers <- b:
	default:
	}
}
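For completeness, the standard library's `sync.Pool` (added in Go 1.3, after this question was asked) implements the same leaky-pool idea; a minimal sketch, where the 1024-byte capacity and the `withBuffer` helper are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// pool hands out reusable byte buffers; New runs only when the pool is empty.
var pool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 1024) },
}

// withBuffer borrows a buffer, fills it, returns it to the pool,
// and reports the buffer's capacity.
func withBuffer(data string) int {
	b := pool.Get().([]byte)[:0] // reset length, keep capacity
	b = append(b, data...)
	n := cap(b)
	pool.Put(b)
	return n
}

func main() {
	fmt.Println(withBuffer("hello")) // capacity survives round-trips through the pool
}
```

Unlike the hand-rolled channel pool, `sync.Pool` may drop pooled objects at any garbage collection, which is usually what you want for transient buffers.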

Answer 2

Score: -1

The heuristic used in append may not be suitable for all applications. It's designed for use when you don't know the final length of the data you'll be storing. Instead of iterating over them later, I'd try to minimize the amount of extra capacity you're allocating as early as possible. Here's a simple example of one strategy, which is to use a buffer only while the length is not known, and to reuse that buffer:

type buffer struct {
  names []string
  // ... possibly other things
}

// assume this is called frequently and has lots and lots of names
func (b *buffer) readNames(lines *bufio.Scanner) ([]string, error) {
  // Start from zero, so we can re-use capacity
  b.names = b.names[:0]

  for lines.Scan() {
    b.names = append(b.names, lines.Text())
  }

  // Figure out the error
  err := lines.Err()
  if err == io.EOF {
    err = nil
  }

  // Allocate a minimal slice
  out := make([]string, len(b.names))
  copy(out, b.names)
  return out, err
}

Of course, you'll need to modify this if you need something that's safe for concurrent use; for that I'd recommend using a buffered channel as a leaky bucket for storing your buffers.

huangapple
  • Posted on 2014-01-27 03:35:12
  • Please retain this link when reposting: https://go.coder-hub.com/21368220.html