How to assess the total number of items?

Question

I have a stream processing system that has a cache.
The cache size is 100K and the TTL for entries is 30 minutes. I want to come up with a metric that will help me assess the total number of items that were pushed to the system.
Note: I want to do it without having a set that simply holds all the unique keys.

Here's what I thought:
I can listen to the event of entry eviction and have two counters:

  1. eviction_full_capacity - the entry was removed because the cache was full and another item needed to be added
  2. eviction_ttl - the entry expired and had to be removed

I can also count or check the rate of the input to the system.

Given the above, how can I assess the total number of unique items that were processed by the system?
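
To make the two counters concrete, here is a minimal sketch of a toy cache that counts evictions by cause. The class name CountingCache, the insertions_so_far helper, and the explicit `now` argument are all hypothetical and chosen for illustration; a real cache library would report evictions through its own listener API. The sketch also assumes that a cache hit does not refresh the TTL.

```python
from collections import OrderedDict


class CountingCache:
    """Toy cache with a capacity limit and a TTL that counts evictions by cause."""

    def __init__(self, capacity, ttl_seconds):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.entries = OrderedDict()  # key -> insertion time, oldest first
        self.eviction_full_capacity = 0
        self.eviction_ttl = 0

    def _expire(self, now):
        # Entries are ordered by insertion time, so expired ones sit at the front.
        while self.entries:
            key, inserted_at = next(iter(self.entries.items()))
            if now - inserted_at < self.ttl:
                break
            self.entries.popitem(last=False)
            self.eviction_ttl += 1

    def put(self, key, now):
        self._expire(now)
        if key in self.entries:
            return  # already cached; the TTL is not refreshed in this sketch
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry
            self.eviction_full_capacity += 1
        self.entries[key] = now

    def insertions_so_far(self):
        # Every inserted entry was either evicted or is still in the cache.
        return self.eviction_full_capacity + self.eviction_ttl + len(self.entries)


cache = CountingCache(capacity=2, ttl_seconds=10)
cache.put("a", now=0)
cache.put("b", now=1)
cache.put("c", now=2)   # cache is full, "a" is evicted
cache.put("a", now=3)   # "a" comes back and is inserted a second time
cache.put("d", now=20)  # "c" and "a" have expired by now

unique_keys = 4  # a, b, c, d
print(cache.eviction_full_capacity, cache.eviction_ttl, cache.insertions_so_far())
```

In this run the counters add up to 5 insertions while only 4 distinct keys were seen, because "a" was evicted and then inserted again. The sum of the two eviction counters plus the current cache size is therefore an upper bound on the number of unique items, not the number itself.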

Answer 1

Score: 0

The HyperLogLog algorithm will estimate, fairly closely, the number of distinct items seen using only a small amount of memory.

If you're using Redis, this is already implemented.
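
For illustration, here is a minimal pure-Python sketch of the HyperLogLog idea. It is not the Redis implementation: the class name, the choice of SHA-256 as the hash, and the default of 2**12 registers are assumptions made for this example, and it only includes the basic small-range (linear counting) correction.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")  # use 64 bits of the hash
        index = x >> (64 - self.p)             # first p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog(p=12)
for i in range(100_000):
    key = f"item-{i}"
    hll.add(key)
    hll.add(key)  # adding the same key again does not change the registers

estimate = hll.count()
print(f"estimated distinct items: {estimate:.0f} (true value: 100000)")
```

With 4096 registers the expected standard error is roughly 1.04 / sqrt(4096), or about 1.6%, and memory use stays fixed no matter how many items pass through. In Redis the same thing is done with the PFADD command for each incoming key and PFCOUNT to read the estimate.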

huangapple
  • Posted on June 22, 2023, 19:29:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76531422.html