C#: Dictionary or list of key-value pairs

Question

I'm reading a large amount of data that I need to store as key-value pairs. Afterwards I'll need to look up the value for almost every key, once per key.

Since I'll look up each key only once, is it worth using a Dictionary, or would a List<Pair<...>> perform better here?

Answer 1

Score: 1

It depends on how much data you have and how expensive it is to compute the hash of a key.

A lookup in an unsorted List scans the items one by one, so a successful lookup costs about N/2 key comparisons on average, which grows with N. A Dictionary lookup costs roughly the same regardless of N: one hash computation plus, usually, one or a few equality checks. That hash computation can be fairly expensive, though. Therefore, for any data set there is some number of items N at which the List's average lookup cost starts to exceed the cost of the hash lookup. Below that number the List is faster; above it the Dictionary is faster.
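To make the comparison concrete, here is a minimal sketch of the two lookups. The helper names FindInList and FindInDictionary, the sample data, and the string/int key and value types are assumptions chosen for illustration; the question does not say what the real types are, and KeyValuePair stands in for the question's Pair type.

```csharp
using System;
using System.Collections.Generic;

// Linear scan: compares keys one by one, about N/2 comparisons for an average hit.
static bool FindInList(List<KeyValuePair<string, int>> items, string key, out int value)
{
    foreach (var item in items)
    {
        if (item.Key == key)
        {
            value = item.Value;
            return true;
        }
    }
    value = default;
    return false;
}

// Hash lookup: one hash computation plus, usually, one or a few equality checks.
static bool FindInDictionary(Dictionary<string, int> lookup, string key, out int value)
{
    return lookup.TryGetValue(key, out value);
}

// Hypothetical sample data.
var samplePairs = new List<KeyValuePair<string, int>>
{
    new KeyValuePair<string, int>("apple", 1),
    new KeyValuePair<string, int>("banana", 2),
    new KeyValuePair<string, int>("cherry", 3),
};

// Building the dictionary is itself an O(n) step that hashes every key once.
var sampleMap = new Dictionary<string, int>();
foreach (var pair in samplePairs)
{
    sampleMap[pair.Key] = pair.Value;
}

bool inList = FindInList(samplePairs, "cherry", out int listValue);
bool inMap = FindInDictionary(sampleMap, "cherry", out int mapValue);
Console.WriteLine($"List: {inList} {listValue}, Dictionary: {inMap} {mapValue}");
```

Both calls return the same answer; the difference is only in how much work each one does as the number of items grows.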

Where this break-even point falls depends on your data. I recall that many years ago we used about 10 items as a rule of thumb, but I have no idea how valid that really is. The best way to find out is to try both on a sample of your real data.
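One rough way to try both is a Stopwatch comparison like the sketch below. Everything specific in it is an assumption: the int keys, the item count of 5000, the generated placeholder data, and the helper names LookUpAllInList and LookUpAllInDictionary. Swap in a sample of the real data before drawing conclusions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Scans the list once for every key and sums the values found (O(n^2) overall).
static long LookUpAllInList(List<KeyValuePair<int, int>> items, int[] keys)
{
    long sum = 0;
    foreach (int key in keys)
    {
        foreach (var item in items)
        {
            if (item.Key == key)
            {
                sum += item.Value;
                break;
            }
        }
    }
    return sum;
}

// Builds the dictionary (its cost is part of the measurement), then looks up every key once.
static long LookUpAllInDictionary(List<KeyValuePair<int, int>> items, int[] keys)
{
    var lookup = new Dictionary<int, int>(items.Count);
    foreach (var item in items)
    {
        lookup[item.Key] = item.Value;
    }

    long sum = 0;
    foreach (int key in keys)
    {
        if (lookup.TryGetValue(key, out int value))
        {
            sum += value;
        }
    }
    return sum;
}

// Placeholder data: replace with a sample of the real data set.
const int itemCount = 5000;
var sampleItems = new List<KeyValuePair<int, int>>(itemCount);
var sampleKeys = new int[itemCount];
for (int i = 0; i < itemCount; i++)
{
    sampleItems.Add(new KeyValuePair<int, int>(i, i * 2));
    sampleKeys[i] = itemCount - 1 - i; // look keys up in a different order than stored
}

var listTimer = Stopwatch.StartNew();
long listSum = LookUpAllInList(sampleItems, sampleKeys);
listTimer.Stop();

var dictTimer = Stopwatch.StartNew();
long dictSum = LookUpAllInDictionary(sampleItems, sampleKeys);
dictTimer.Stop();

// The checksums should match; they only confirm both approaches found the same values.
Console.WriteLine($"List:       {listTimer.Elapsed.TotalMilliseconds} ms (checksum {listSum})");
Console.WriteLine($"Dictionary: {dictTimer.Elapsed.TotalMilliseconds} ms (checksum {dictSum})");
```

A single Stopwatch run like this includes JIT warm-up and other noise, so treat the numbers as a rough indication. Repeating the measurement several times, or using a dedicated benchmarking tool such as BenchmarkDotNet, gives more trustworthy results.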

Sorting adds a further wrinkle. If you know you'll need to find every item in the list exactly once, then the total cost for unsorted data is N lookups * N/2 comparisons per lookup. This is an O(n^2) algorithm, which is not great. But very often you can sort the data so that the lookups happen in the same order as the stored items, and then walk the List once. Sorting costs O(n*log(n)) and the walk is O(n), so the total is O(n*log(n)), which can be MUCH more efficient than repeated scans and is more likely to beat the Dictionary.
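The sort-and-walk idea could look roughly like the sketch below. It assumes the keys are comparable (ints here), that both the stored pairs and the keys to look up can be sorted, and that getting the results back in sorted-key order is acceptable. The name SortAndWalk and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Sorts both inputs in place by key, then advances through them together in one pass.
// Returns the matched values in sorted-key order; keys with no match are skipped.
static List<int> SortAndWalk(List<KeyValuePair<int, int>> items, List<int> keys)
{
    items.Sort((a, b) => a.Key.CompareTo(b.Key)); // O(n log n)
    keys.Sort();                                  // O(n log n)

    var found = new List<int>(keys.Count);
    int position = 0;
    foreach (int key in keys)
    {
        // Skip stored pairs whose key is smaller than the one we want.
        while (position < items.Count && items[position].Key < key)
        {
            position++;
        }

        if (position < items.Count && items[position].Key == key)
        {
            found.Add(items[position].Value);
        }
    }
    return found;
}

// Hypothetical sample data: the key 4 has no stored pair.
var storedPairs = new List<KeyValuePair<int, int>>
{
    new KeyValuePair<int, int>(3, 30),
    new KeyValuePair<int, int>(1, 10),
    new KeyValuePair<int, int>(2, 20),
    new KeyValuePair<int, int>(5, 50),
};
var wantedKeys = new List<int> { 5, 1, 4, 3 };

List<int> walked = SortAndWalk(storedPairs, wantedKeys);
Console.WriteLine(string.Join(", ", walked)); // prints "10, 30, 50"
```

The position index never moves backwards, so after the two sorts the walk touches each stored pair at most once. If the data already arrives sorted, the sorts can be dropped and only the linear walk remains.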
