August 20, 2020, 03:04:23


Is 'hashing' more efficient than 'linear' search?

Question

I decided to review the Java Collections Framework, so I started with its internal implementation. One question came to mind that I can't answer. I hope someone can give a clear explanation of the following.

ArrayList uses linear or binary search (both have pros and cons), but we can do anything with it! My question is: why do all the 'hashing' classes (like HashMap, for example) use the hashing principle? Couldn't they settle for linear or binary search? Why not just store the key/value pairs inside an array? And the opposite: why isn't, for example, an ArrayList backed by a hash table?

Answer 1

Score: 3

The intention of the collections framework is that the programmer will choose the data structure appropriate to the use case. Depending on what you're using it for, different data structures are appropriate.

Hashing classes use the hashing principle, as you put it, because if you choose them, then that's what you want to use. (Hashing is generally the best choice for simple, straightforward lookups.) A screwdriver uses the screwing principle because if you pick up a screwdriver, you want to screw something in; if you had a nail you needed to put in, you would have picked up the hammer instead.

But if you're not going to be performing lookups, or if linear search is good enough for you, then an ArrayList is what you want. It's not worth adding a hash table to a collection that will never use it: building and maintaining the table costs CPU and memory for something you don't need.
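To make the trade-off concrete, here is a minimal sketch of the three lookup styles discussed in this thread. The class name `LookupDemo`, its method names, and the sample data are made up for illustration; only `ArrayList.indexOf`, `Collections.binarySearch`, and `HashMap.get` come from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; names and data are illustrative only.
public class LookupDemo {
    // Linear search: indexOf compares elements one by one, O(n).
    public static int linearIndex(List<String> list, String key) {
        return list.indexOf(key);
    }

    // Binary search: O(log n), but the list must already be sorted.
    public static int binaryIndex(List<String> sortedList, String key) {
        return Collections.binarySearch(sortedList, key);
    }

    // Hash lookup: expected O(1), since hashCode() selects the bucket directly.
    public static Integer hashLookup(Map<String, Integer> map, String key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        // Already in sorted order, so binary search is valid here.
        List<String> names = new ArrayList<>(Arrays.asList("alice", "bob", "carol"));

        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        ages.put("carol", 41);

        System.out.println(linearIndex(names, "carol")); // position in the list
        System.out.println(binaryIndex(names, "carol")); // same position, fewer comparisons
        System.out.println(hashLookup(ages, "bob"));     // value stored under the key
    }
}
```

For three elements the difference is negligible; it only starts to matter as the collection grows and lookups become frequent.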

Answer 2

Score: 1

I had a large hash map of values (about 1,500 entries). The nature of the code was that once the hashmap was loaded, it would never be altered. The hashmap was accessed many times per web page, and I wondered whether it could be sped up for faster page loading.

One day I had some time, so I ran a series of timing tests (using the nano time function). I then reworked the hashmap use into an array. Not an ArrayList, but an actual array[]. I stored the array index in the key class that had been used to get the hash value.

There was a difference: the array lookup was faster. I calculated that over a day's worth of activity I would have saved almost a full second!

So yes, using an array is faster than using a hash, but YMMV.

And I reverted my code back to using a hashmap, as it was easier to maintain...
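For readers who want to try something similar, here is a rough sketch of that kind of experiment. It is not the original code: the `Key` class, its `index` field, and the generated data are assumptions, and `System.nanoTime` is assumed to be the "nano time function" mentioned above. A single timed loop like this is heavily affected by JIT warm-up, so treat the printed numbers as indicative at best.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reconstruction; none of these names come from the original code.
public class ArrayVsHashMap {
    // Assumed key class: it carries the array slot that holds its value.
    static final class Key {
        final String name;
        final int index;

        Key(String name, int index) {
            this.name = name;
            this.index = index;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Hash lookup: computes hashCode(), finds the bucket, then calls equals().
    static String fromMap(Map<Key, String> map, Key key) {
        return map.get(key);
    }

    // Array lookup: a single indexed read using the slot stored in the key.
    static String fromArray(String[] values, Key key) {
        return values[key.index];
    }

    public static void main(String[] args) {
        int n = 1500; // roughly the size mentioned in the answer
        Key[] keys = new Key[n];
        String[] values = new String[n];
        Map<Key, String> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            keys[i] = new Key("key" + i, i);
            values[i] = "value" + i;
            map.put(keys[i], values[i]);
        }

        int checksum = 0; // consume the results so the loops are not trivially dead code
        long t0 = System.nanoTime();
        for (Key k : keys) checksum += fromMap(map, k).length();
        long t1 = System.nanoTime();
        for (Key k : keys) checksum += fromArray(values, k).length();
        long t2 = System.nanoTime();

        System.out.println("HashMap: " + (t1 - t0) + " ns, array: " + (t2 - t1) + " ns");
        System.out.println("checksum: " + checksum);
    }
}
```

The array version only works because the data set is fixed after loading and every key already knows its slot; the hashmap needs neither assumption, which is part of why it is easier to maintain.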
