Go: Best way to handle excessive memory application? Mmap, memory or caching?

Question

I have a Go application which requires around 600GB of memory. The machine on which it will run has 128GB of RAM. I'm trying to decide how best to handle this.

The options are:

  1. Just load everything into the memory (pretend like I have 600GB RAM) and let the OS page out the infrequently accessed part of the memory into virtual memory. I like this idea because I don't have to do anything special in the code, the OS will just handle everything. However, I'm not sure this is a good idea.

  2. Have the data stored on disk and use mmap (memory-mapped file), which I guess is similar to the above but will require a lot more coding. It also appears to mean that the data will have to be stored as []byte and then parsed every time I need to use it, rather than already being in whatever type I need for the actual calculations.

  3. Build a caching system in which the data is kept on the HDD and loaded when it's needed, with the most frequently accessed data held in memory and the least frequently accessed data purged whenever the memory limit is exceeded.

What are the advantages and disadvantages of each? I'd prefer to go with (1) if possible because of its simplicity... is there anything wrong with that?

Answer 1

Score: 1

It all depends on the nature of the data access. Will the accesses to those 600GB be uniformly distributed? If not, a solution where you cache part of your content in memory and keep the rest on the HDD will likely be sufficient, since 128GB of RAM is enough to cache roughly 20% of your data. Keeping everything in virtual memory may come with surprising drawbacks, such as the need for a huge swap partition (on the order of 470GB here, since the swap has to hold whatever does not fit in RAM).
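To make the "cache part of it in memory" idea concrete, here is a minimal sketch of a bounded cache in front of a slow loader. It is only an illustration under some assumptions: it evicts the least recently used entry (not the least frequently used one mentioned in the question), it bounds the number of entries where a real cache for this workload would more likely bound total bytes, and the names LRU, NewLRU and load are made up for this example.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached key/value pair stored in the recency list.
type entry struct {
	key   string
	value []byte
}

// LRU is a tiny least-recently-used cache bounded by entry count.
// All names here are hypothetical; this is a sketch, not a library API.
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in order
	load     func(key string) []byte  // stand-in for a disk read
}

// NewLRU creates a cache holding at most capacity entries.
func NewLRU(capacity int, load func(string) []byte) *LRU {
	return &LRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
		load:     load,
	}
}

// Get returns the cached value, calling load on a miss and evicting
// the least recently used entry once the cache exceeds its capacity.
func (c *LRU) Get(key string) []byte {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).value
	}
	v := c.load(key)
	c.items[key] = c.order.PushFront(&entry{key: key, value: v})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	return v
}

func main() {
	loads := 0
	cache := NewLRU(2, func(key string) []byte {
		loads++ // counts simulated disk reads
		return []byte("data for " + key)
	})
	for _, k := range []string{"a", "b", "a", "c", "b"} {
		cache.Get(k)
	}
	fmt.Println("simulated disk loads:", loads)
}
```

Whether something like this beats plain swapping depends entirely on how skewed your access pattern is, which is why measuring matters.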

To keep the data on disk, you could use a DB engine, as Dave suggests, since they usually do a good job of caching the most frequently accessed content. You could also use memcached, an in-memory key-value cache server that your application talks to through a client library.

The bottom line is that optimizing performance without knowing the exact usage patterns is hard. Luckily, with Go you don't have to guess. You can test and measure.

You can define an interface similar to:

type Index interface {
    Lookup(query string) Result
}

Then try each of your solutions, starting with the easiest to implement.

type inMemoryIndex struct {...}

func (*inMemoryIndex) Lookup(query string) Result {...}

type memcachedIndex struct {...}

type dbIndex struct {...}
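As a rough, runnable illustration of the pattern above: the sketch below fills in one implementation with a plain map and adds a small wrapper that works with any Index. The definition of Result as a string, the data field, and the countingIndex wrapper are assumptions made for this example; the elided parts of the snippets above are left to you.

```go
package main

import "fmt"

// Result is a placeholder type; its real definition depends on your data.
type Result string

// Index is the interface every storage strategy implements.
type Index interface {
	Lookup(query string) Result
}

// inMemoryIndex keeps everything in a Go map (option 1 in the question).
type inMemoryIndex struct {
	data map[string]Result // hypothetical field for this sketch
}

func (idx *inMemoryIndex) Lookup(query string) Result {
	return idx.data[query]
}

// countingIndex wraps any Index and counts lookups. It shows that
// callers and wrappers only depend on the interface, so the backing
// implementation can be swapped without touching them.
type countingIndex struct {
	inner Index
	calls int
}

func (c *countingIndex) Lookup(query string) Result {
	c.calls++
	return c.inner.Lookup(query)
}

func main() {
	var idx Index = &inMemoryIndex{
		data: map[string]Result{"go": "a programming language"},
	}
	counted := &countingIndex{inner: idx}
	res := counted.Lookup("go")
	fmt.Println(res, counted.calls)
}
```

A memcached-backed or DB-backed implementation would satisfy the same interface, so the rest of the program does not need to know which one it is using.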

Then you can use Go's built-in benchmarking tools (the testing package and go test -bench) to benchmark each implementation and see if it lives up to your standards. You can even benchmark on the target machine, using real data and mocked user queries.
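Here is a small sketch of what such a measurement could look like. Normally you would write a BenchmarkXxx function in a _test.go file and run it with go test -bench=. but to keep this example in one runnable file it calls testing.Benchmark directly. The mapIndex type, the Result definition and the generated keys are stand-ins for your real implementation and queries.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// Result and Index mirror the interface discussed above;
// Result as a string is an assumption for this sketch.
type Result string

type Index interface {
	Lookup(query string) Result
}

// mapIndex is a stand-in implementation backed by a map.
type mapIndex struct {
	data map[string]Result
}

func (m *mapIndex) Lookup(query string) Result {
	return m.data[query]
}

// benchmarkLookup measures Lookup on any Index by cycling through the
// given queries. testing.Benchmark picks the iteration count b.N itself.
func benchmarkLookup(idx Index, queries []string) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			idx.Lookup(queries[i%len(queries)])
		}
	})
}

func main() {
	data := make(map[string]Result)
	queries := make([]string, 0, 1000)
	for i := 0; i < 1000; i++ {
		key := "key" + strconv.Itoa(i)
		data[key] = Result("value" + strconv.Itoa(i))
		queries = append(queries, key)
	}
	res := benchmarkLookup(&mapIndex{data: data}, queries)
	fmt.Println(res) // prints the iteration count and time per operation
}
```

Because benchmarkLookup takes an Index, the same function can time every candidate implementation, which makes the numbers directly comparable.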

You're correct to assume that mmap would require more coding, so I would save it until I had tried all the other options.
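If you do get to that point, the parsing concern from the question may be smaller than it looks when the data can be laid out as fixed-width records. The sketch below is one possible approach, not the only one: it assumes a Unix-like system (syscall.Mmap is not available on Windows), assumes each record is a single little-endian uint64, and uses made-up helper names (writeRecords, mapFile, record, roundTrip).

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// recordSize is the assumed layout: one little-endian uint64 per record.
const recordSize = 8

// writeRecords writes the values to path as fixed-width records.
func writeRecords(path string, values []uint64) error {
	buf := make([]byte, recordSize*len(values))
	for i, v := range values {
		binary.LittleEndian.PutUint64(buf[i*recordSize:], v)
	}
	return os.WriteFile(path, buf, 0o600)
}

// mapFile maps the whole file read-only. The OS pages data in on
// demand, so the file may be much larger than physical RAM.
func mapFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // the mapping stays valid after the file is closed
	info, err := f.Stat()
	if err != nil {
		return nil, err
	}
	return syscall.Mmap(int(f.Fd()), 0, int(info.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
}

// record decodes record i straight from the mapped bytes.
func record(data []byte, i int) uint64 {
	return binary.LittleEndian.Uint64(data[i*recordSize:])
}

// roundTrip writes values to a temporary file, maps it, and reads
// record i back through the mapping.
func roundTrip(values []uint64, i int) (uint64, error) {
	dir, err := os.MkdirTemp("", "mmap-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "records.bin")
	if err := writeRecords(path, values); err != nil {
		return 0, err
	}
	data, err := mapFile(path)
	if err != nil {
		return 0, err
	}
	defer syscall.Munmap(data)
	return record(data, i), nil
}

func main() {
	v, err := roundTrip([]uint64{10, 20, 30}, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("record 1:", v)
}
```

Variable-length or pointer-rich data would need an offset table or a serialization format on top of this, which is where most of the extra coding comes from.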

huangapple
  • Posted on April 6, 2015, 16:35:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/29467899.html