Is 'hashing' more efficient than 'linear' search?

Question

I decided to revise the Java Collections Framework, so I started with the internal implementations. One question came to mind that I can't answer. I hope someone can give a clear explanation of the following.

ArrayList uses linear or binary search (both have pros and cons), but we can do anything with them! My question is: why do all the 'hashing' classes (HashMap, for example) use the hashing principle? Couldn't they settle for linear or binary search? Why not just store the key/value pairs inside an array? And conversely, why isn't an ArrayList, for example, stored in a hash table?

Answer 1

Score: 3

The intention of the collections framework is that the programmer will choose the data structure appropriate to the use case. Depending on what you're using it for, different data structures are appropriate.

Hashing classes use the hashing principle, as you put it, because if you choose them, then that's what you want to use. (Hashing is generally the best choice for simple, straightforward lookups.) A screwdriver uses the screwing principle because if you pick up a screwdriver, you want to screw something in; if you had a nail you needed to put in, you would have picked up the hammer instead.

But if you're not going to be performing lookups, or if linear search is good enough for you, then an ArrayList is what you want. It's not worth adding a hash table to a collection that's never going to use it; maintaining one costs CPU and memory for work you aren't going to need.
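To make the trade-off concrete, here is a minimal sketch of the three lookup styles under discussion. The class name `LookupStyles`, its helper methods, and the sample data are made up for illustration; only `ArrayList.indexOf`, `Collections.binarySearch`, and `HashMap.get` are standard library calls. Note that the binary search is not something ArrayList does on its own: it is a separate utility that only gives meaningful results on a list that is already sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupStyles {
    // Linear search: indexOf compares element by element, O(n).
    static int linearSearch(List<String> items, String target) {
        return items.indexOf(target);
    }

    // Binary search: O(log n), but the list must already be sorted.
    static int binarySearch(List<String> sortedItems, String target) {
        return Collections.binarySearch(sortedItems, target);
    }

    // Hash lookup: the key's hashCode selects a bucket, O(1) on average.
    static Integer hashLookup(Map<String, Integer> byKey, String key) {
        return byKey.get(key);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("carol", "alice", "bob"));
        System.out.println(linearSearch(names, "bob"));   // index in insertion order

        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);                          // required before binary search
        System.out.println(binarySearch(sorted, "bob"));  // index in sorted order

        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        ages.put("carol", 41);
        System.out.println(hashLookup(ages, "bob"));      // value stored under the key
    }
}
```

The list keeps insertion order and allows positional access at no extra cost, while the map pays for hashing and bucket storage in exchange for fast lookups by key. Which one fits depends on how the collection will be used.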

Answer 2

Score: 1

I had a large hashmap of values (about 1,500 entries). The nature of the code was that once the hashmap was loaded, it would never be altered. The hashmap was accessed many times per web page, and I wondered whether it could be sped up for faster page loading.

One day I had some time, so I ran a series of timing tests (using the nano time function). I then reworked the hashmap usage into an array. Not an ArrayList, but an actual array[]. I stored the index in the key class that had been used to get the hash value.

There was a difference: the array lookup was faster. I calculated that over a day's worth of activity I would have saved almost a full second!

So yes, using an array is faster than using a hash, but YMMV.

And I reverted my code back to using a hashmap, as it was easier to maintain...
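The original code for this experiment is not shown, so what follows is only a rough sketch of how such a setup might look. The names `IndexedLookup`, `Key`, `fromMap`, and `fromArray` are hypothetical, and the assumption is that each key object carries a fixed slot number into a plain array. The timing loop uses `System.nanoTime` and is deliberately crude: it ignores JIT warm-up and other effects that a proper benchmark harness would control for, so the numbers it prints are indicative at best.

```java
import java.util.HashMap;
import java.util.Map;

class IndexedLookup {
    // Hypothetical key type: besides its name, it stores a fixed array slot.
    static final class Key {
        final String name;
        final int index;

        Key(String name, int index) {
            this.name = name;
            this.index = index;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Hash-based lookup: hashCode, bucket selection, then equals.
    static String fromMap(Map<Key, String> map, Key key) {
        return map.get(key);
    }

    // Direct lookup: the key already knows its slot, so no hashing is needed.
    static String fromArray(String[] values, Key key) {
        return values[key.index];
    }

    public static void main(String[] args) {
        int size = 1500; // roughly the size mentioned in the answer
        Key[] keys = new Key[size];
        Map<Key, String> map = new HashMap<>();
        String[] values = new String[size];
        for (int i = 0; i < size; i++) {
            keys[i] = new Key("key" + i, i);
            map.put(keys[i], "value" + i);
            values[i] = "value" + i;
        }

        long checksum = 0; // consumed below so the loops are not optimized away
        long start = System.nanoTime();
        for (Key k : keys) checksum += fromMap(map, k).length();
        long mapNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (Key k : keys) checksum += fromArray(values, k).length();
        long arrayNanos = System.nanoTime() - start;

        System.out.println("map: " + mapNanos + " ns, array: " + arrayNanos
                + " ns (checksum " + checksum + ")");
    }
}
```

The array version only works because the data set is fixed after loading and every key can be assigned a slot up front. It also means every key must be kept in sync with the array layout, which is consistent with the maintenance concern that led to the revert.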

Posted by huangapple on 2020-08-20 03:04:23
Source: https://go.coder-hub.com/63493422.html