英文:
Why is java HashMap resize or rehash not taking gradual approach like Redis
问题
我只是在想为什么 JDK 的 HashMap 重新哈希过程不像 Redis 那样采取渐进的方法。虽然 JDK HashMap 的重新哈希计算相当优雅且有效,但在原始 HashMap 中的元素数量相当多时,仍会花费可观的时间。
我并不是一个有经验的 Java 用户,所以我一直认为 Java 设计师肯定考虑到了我认知能力的限制之外的因素。
类似于 Redis 的渐进式重新哈希可以有效地将工作负载分配到 HashMap 中的每个 put、delete 或 get 操作中,这可以显著减少调整大小/重新哈希所需的时间。
而且我还比较了这两种哈希方法,在我看来,并没有限制 JDK 进行渐进式重新哈希的因素。
希望有人能给出一些线索或灵感。非常感谢。
英文:
I am just wondering why the jdk HashMap reshashing process not taking the gradual approach as Redis. Though the rehash calculation of Jdk HashMap is quite elegant and effective, it will still take noticeable time when the number of elements in the original HashMap contains quite a number of entries.
I am not an experience java user so I always suppose that there must be consideration of the java designers that is beyond the limit of my cognitive capability.
The gradual rehash like Redis can effectively distributes the workload to each put, delete or get in the HashMap, which could significantly reduce the resize/rehashing time .
And I have also compared the two hash methods which in my mind doesn't restrict Jdk from doing a gradual rehashing.
Hope someone could give an clue or some inspiration. Thanks a lot in advance.
答案1
得分: 3
如果考虑像 HashMap
这样的增量重哈希的成本和收益,结果表明成本并不可忽视,而收益也不如您所希望的那样大。
一个增量重哈希的 HashMap
:
- 平均使用更多 50% 的内存,因为在增量重哈希期间需要同时保留旧表和新表;并且
- 每个操作的计算成本稍高。此外:
- 重哈希仍然不是完全增量的,因为新哈希表数组的分配必须一次性完成;所以
- 任何操作的渐近复杂度都没有改进。最后:
- 几乎没有任何真正需要增量重哈希的东西可以在 Java 中实现,因为不可预测的垃圾回收暂停,所以为什么要费心呢?
英文:
If you think about the costs and benefits of incremental rehashing for something like HashMap
, It turns out that the costs are not insignificant, and the benefits are not as great as you might like.
An incrementally rehashing HashMap
:
- Uses 50% more memory on average, because it needs to keep both the old table and new table around during the incremental rehash; and
- Has a somewhat higher computational cost per operation. Also:
- The rehashing is still not entirely incremental, because allocating the new hash table array has to be done all at once; so
- There are no improvements in the asymptotic complexity of any operation. And finally:
- Almost nothing that would really need incremental rehashing can be implemented in Java at all, due to the unpredictable GC pauses, so why bother?
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论