How to assess the total number of items?
Question
I have a stream processing system that includes a cache.
The cache size is 100K and the TTL for entries is 30 minutes. I want to come up with a metric that will help me assess the total number of items that were pushed to the system.
Note: I want to do this without keeping a set that simply holds all the unique keys.
Here's what I thought:
I can listen for entry eviction events and keep two counters:
- eviction_full_capacity - the entry was removed because the cache was full and another item needed to be added
- eviction_ttl - the entry expired and had to be removed
I can also count the input to the system or measure its rate.
Given the above, how can I assess the total number of unique items that were processed by the system?
Answer 1
Score: 0
The HyperLogLog algorithm will estimate, fairly closely, the number of distinct items seen using only a small amount of memory.
If you're using Redis, this is already implemented.
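To make the idea concrete, here is a minimal pure-Python sketch of a HyperLogLog counter. It is an illustration under assumptions, not a production implementation: the class name `HyperLogLog`, the precision parameter `p`, the use of string keys, and SHA-1 as the hash function are all choices made for this example. With `p = 14` it keeps 16384 one-byte registers, and the usual standard error of the estimate is about 1.04 / sqrt(16384), roughly 0.8%.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical helper, not tuned for production)."""

    def __init__(self, p: int = 14):
        self.p = p                          # number of bits used to pick a register
        self.m = 1 << p                     # number of registers
        self.registers = bytearray(self.m)  # one small counter per register
        # Bias-correction constant; this formula is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Derive a 64-bit hash from the first 8 bytes of the SHA-1 digest.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                 # top p bits select the register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> int:
        # Raw estimate from the harmonic mean of 2^register.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)


hll = HyperLogLog()
for key in ["a", "b", "a", "c"]:
    hll.add(key)      # duplicates do not increase the estimate
print(hll.count())    # estimated number of distinct keys
```

Calling `add` for every key pushed into the stream and reading `count` periodically gives an estimate of the total unique items without storing the keys themselves. In Redis, the same structure is exposed through the `PFADD` and `PFCOUNT` commands.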