Question

I have a slice that contains around 3000 bson objects. Every object has some nested maps, and the average object is about 4 KB. In my code I have to be able to retrieve these objects by their uid field as fast as possible. My original plan was to write a function that simply loops through the original slice and checks for the matching uid, like object["uid"] == uidToFind. However, I now believe it would be better to create one big map where the keys are the uid fields and the values are the corresponding objects, something like this:

m := make(map[string]bson.M)
m["sample_UID_0"] = bsonObjects[0]
m["sample_UID_1"] = bsonObjects[1]
//... continue with the remaining 3000 objects...

My question is: should I favor this solution over looping through the original slice every time? As I don't have millions of objects, I assume it would be better to keep the important data in one globally available map and simply access it with m["sample_UID"], rather than always looping through the whole slice.

Answer 1

Score: 1

Beyond performance and memory usage, there is one main difference between the two solutions: a map defined the way yours is can hold only one entry per id, whereas nothing prevents the same id from appearing several times in your array.
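To make the uniqueness point concrete, here is a minimal sketch of building the lookup map in a loop while recording any uid that shows up more than once. The names Object and buildIndex are hypothetical; Object is a stand-in for bson.M (which is itself a map[string]interface{}) so the example runs without the driver, and the "last object wins" policy for duplicates is an assumption, not something the question specifies.

```go
package main

import "fmt"

// Object stands in for bson.M (assumption: same underlying type,
// map[string]interface{}), so the sketch needs no external package.
type Object = map[string]interface{}

// buildIndex maps each uid to its object. It also returns one entry
// for every repeated occurrence of a uid; for such uids the object
// seen last is the one kept in the map.
func buildIndex(objects []Object) (map[string]Object, []string) {
	index := make(map[string]Object, len(objects))
	var duplicates []string
	for _, obj := range objects {
		uid, ok := obj["uid"].(string)
		if !ok {
			continue // skip objects without a string uid
		}
		if _, seen := index[uid]; seen {
			duplicates = append(duplicates, uid)
		}
		index[uid] = obj
	}
	return index, duplicates
}

func main() {
	objects := []Object{
		{"uid": "sample_UID_0", "name": "first"},
		{"uid": "sample_UID_1", "name": "second"},
		{"uid": "sample_UID_0", "name": "third"},
	}
	index, duplicates := buildIndex(objects)
	fmt.Println(len(index), duplicates, index["sample_UID_0"]["name"])
}
```

If duplicate ids are legitimate in your data, a map[string][]Object that collects every object per uid would be the natural variation.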

In general, if your array is sorted, you can use a binary search, which at this size should show no noticeable difference from a map lookup, but you must keep the array sorted when adding new entries. So the benefit of this solution depends on how often your array is modified. An array should use less memory than a map.
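The sorted-array alternative could look roughly like the sketch below, which uses sort.Slice and sort.Search from the standard library. The helper names (uidOf, sortByUID, findByUID, insertSorted) and the Object stand-in for bson.M are hypothetical, and the code assumes every object carries a string uid.

```go
package main

import (
	"fmt"
	"sort"
)

// Object stands in for bson.M (assumption: map[string]interface{}).
type Object = map[string]interface{}

// uidOf returns the object's uid, or "" if it is missing or not a string.
func uidOf(obj Object) string {
	uid, _ := obj["uid"].(string)
	return uid
}

// sortByUID sorts the slice in place by uid.
func sortByUID(objects []Object) {
	sort.Slice(objects, func(i, j int) bool {
		return uidOf(objects[i]) < uidOf(objects[j])
	})
}

// findByUID binary-searches a slice that is already sorted by uid.
func findByUID(sorted []Object, uid string) (Object, bool) {
	i := sort.Search(len(sorted), func(i int) bool {
		return uidOf(sorted[i]) >= uid
	})
	if i < len(sorted) && uidOf(sorted[i]) == uid {
		return sorted[i], true
	}
	return nil, false
}

// insertSorted adds obj at the position that keeps the slice sorted.
func insertSorted(sorted []Object, obj Object) []Object {
	uid := uidOf(obj)
	i := sort.Search(len(sorted), func(i int) bool {
		return uidOf(sorted[i]) >= uid
	})
	sorted = append(sorted, nil)   // grow by one slot
	copy(sorted[i+1:], sorted[i:]) // shift the tail right
	sorted[i] = obj
	return sorted
}

func main() {
	objects := []Object{{"uid": "c"}, {"uid": "a"}, {"uid": "d"}}
	sortByUID(objects)
	objects = insertSorted(objects, Object{"uid": "b"})
	_, found := findByUID(objects, "b")
	fmt.Println(len(objects), found)
}
```

Each insert shifts part of the slice, which is the maintenance cost mentioned above; lookups stay logarithmic.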

In your specific example, 3000 entries is not that many. Depending on how often you search the data structure, you may not notice any difference in performance. You may want to run a benchmark to check.
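One way to run such a benchmark without setting up a test file is testing.Benchmark, which can be called from an ordinary program. The sketch below compares a linear scan with a map lookup over 3000 synthetic objects. The names makeObjects, findLinear, and sink are hypothetical, Object stands in for bson.M, and the synthetic objects are far smaller than the 4 KB originals, so treat the numbers only as a rough indication.

```go
package main

import (
	"fmt"
	"testing"
)

// Object stands in for bson.M (assumption: map[string]interface{}).
type Object = map[string]interface{}

// sink keeps results alive so the compiler cannot drop the lookups.
var sink Object

// makeObjects builds n small synthetic objects with sequential uids.
func makeObjects(n int) []Object {
	objects := make([]Object, n)
	for i := range objects {
		objects[i] = Object{"uid": fmt.Sprintf("sample_UID_%d", i)}
	}
	return objects
}

// findLinear scans the slice until it finds a matching uid.
func findLinear(objects []Object, uid string) (Object, bool) {
	for _, obj := range objects {
		if obj["uid"] == uid {
			return obj, true
		}
	}
	return nil, false
}

func main() {
	objects := makeObjects(3000)
	index := make(map[string]Object, len(objects))
	for _, obj := range objects {
		index[obj["uid"].(string)] = obj
	}
	target := "sample_UID_2999" // last element: worst case for the scan

	linear := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink, _ = findLinear(objects, target)
		}
	})
	mapped := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = index[target]
		}
	})
	fmt.Println("linear scan:", linear)
	fmt.Println("map lookup: ", mapped)
}
```

For a decision that matters, measuring with your real objects and your real access pattern will be more reliable than this synthetic setup.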
