How to manage the cache for a large e-commerce website

Question

This is my first time building a cache system for a large website, and I'm not sure how to go about it.

My question is: should I use a Redis cache or a file-based cache system?

Information about the project

  1. Based on a PHP MVC framework
  2. Database: 100k products, 400k images, 10k filters, 800 categories

Answer 1

Score: 1

Regardless of the cache persistence mechanism, this is called a "pull-through" cache strategy.
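A minimal sketch of that strategy: look in the cache first, and only on a miss load the value and store it with a TTL. Python is used here purely for brevity; the same shape carries over to PHP. `InMemoryClient`, `get_or_load`, and `load_product_42` are hypothetical names made up for this illustration. The stand-in client exposes only `get` and `setex`, which mirror the calls of the same name on a redis-py client, so a real client could presumably be passed in instead.

```python
import json
import time


class InMemoryClient:
    """Hypothetical stand-in exposing the two calls used below: get() and setex()."""

    def __init__(self):
        self._data = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[0] <= time.monotonic():
            self._data.pop(key, None)  # missing or expired
            return None
        return entry[1]

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (time.monotonic() + ttl_seconds, value)


def get_or_load(client, key, ttl_seconds, loader):
    """Return the cached value for key, calling loader() only on a miss."""
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)
    value = loader()  # e.g. the expensive database query
    client.setex(key, ttl_seconds, json.dumps(value))
    return value


def load_product_42():
    # Hypothetical loader; a real one would query the database.
    load_product_42.calls += 1
    return {"id": 42, "name": "example product"}


load_product_42.calls = 0

client = InMemoryClient()
first = get_or_load(client, "product:42", 300, load_product_42)   # miss: loader runs
second = get_or_load(client, "product:42", 300, load_product_42)  # hit: loader skipped
```

The caller never talks to the database directly, so the persistence mechanism behind `client` can change without touching the calling code.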

To your actual questions:

  1. Even on modern filesystems, directory lookup performance usually starts to degrade noticeably around the 10k-entry mark, and 50k is well past that. As written, your code would do one lookup for every request and a second for every cache hit.
  2. Probably, but that's not going to be simple or fun to implement. The alternative is to delete the entire cache from time to time, which is its own problem: with 50k directory entries, rm -f cache/* or similar will break down because the shell expands the glob into an argument list that exceeds the system limit.
  3. Yes. Just use Redis. Redis handles cache TTLs, and with a memory limit and an LRU eviction policy configured it evicts the least recently used keys to keep you under that limit. Unless you have very stringent requirements, there's generally no good reason to cache every single scrap of data. The cache will naturally settle on holding the data that is requested most frequently.
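To illustrate how a size-capped cache with TTLs settles on the frequently requested data, here is a toy in-memory version in Python. It is only a sketch of the idea, not Redis's algorithm: Redis approximates LRU by sampling keys and evicts only when a memory limit and an eviction policy are configured. `TinyLruCache` and the key names are hypothetical.

```python
import time
from collections import OrderedDict


class TinyLruCache:
    """Toy cache with a size cap, per-entry TTL, and LRU eviction."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # key -> (expires_at, value), oldest first

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= time.monotonic():
            del self._entries[key]  # expired
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (time.monotonic() + ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used

    def keys(self):
        return list(self._entries)


cache = TinyLruCache(max_entries=2)
cache.set("product:1", "popular", ttl_seconds=60)
cache.set("product:2", "rarely viewed", ttl_seconds=60)
cache.get("product:1")                         # product:1 becomes most recent
cache.set("product:3", "new", ttl_seconds=60)  # over the cap: product:2 is evicted
remaining = cache.keys()
```

Entries that keep getting read stay near the "recent" end and survive, while entries nobody asks for drift to the front and are dropped first.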

In general, you should also have resource monitoring enabled for your server so that you can track your CPU/memory/IO/etc over time and make informed decisions on how to architect your app/stack.

Opinion: Disk IO is generally your most limited resource, so you should only cache things there that are:

  1. Large enough to push out an unacceptable amount of data from the memory cache.
  2. Resource-intensive enough to justify pulling from disk rather than re-computing/re-fetching/etc.
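The two criteria above can be read as a simple decision rule, sketched below in Python. The thresholds and the names `CacheCandidate` and `choose_tier` are hypothetical and chosen only for illustration; sensible values would have to come from your own monitoring data.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values should come from monitoring data.
MAX_MEMORY_ITEM_BYTES = 512 * 1024  # larger items would crowd out the memory cache
MIN_DISK_WORTHY_SECONDS = 0.5       # cheaper items are simply recomputed


@dataclass
class CacheCandidate:
    name: str
    size_bytes: int
    recompute_seconds: float


def choose_tier(item):
    """Return 'memory', 'disk', or 'none' for a cache candidate."""
    if item.size_bytes <= MAX_MEMORY_ITEM_BYTES:
        return "memory"
    if item.recompute_seconds >= MIN_DISK_WORTHY_SECONDS:
        return "disk"
    return "none"  # too big for memory, too cheap to justify disk IO


small_row = CacheCandidate("product row", 2_000, 0.01)
big_report = CacheCandidate("category report", 5_000_000, 3.0)
big_cheap = CacheCandidate("raw export", 5_000_000, 0.05)
tiers = [choose_tier(c) for c in (small_row, big_report, big_cheap)]
```

Under these assumed thresholds, only the large and expensive report goes to disk; the large but cheap export is not cached at all.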

huangapple
  • Published on April 20, 2023, 03:49:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76058312.html