Is it better to loop over a slice or use a map to retrieve an object?

Question

I have a slice that contains around 3000 bson objects. Every object has some nested maps, and the average object size is about 4 KB. In my code I need to retrieve these objects by their uid field as fast as possible. My original plan was to write a function that simply loops through the slice and checks for a matching uid, like object["uid"] == uidToFind. However, I now believe it would be better to build one big map where the keys are the uid values and the values are the corresponding objects, something like this:

m := make(map[string]bson.M)
m["sample_UID_0"] = bsonObjects[0]
m["sample_UID_1"] = bsonObjects[1]
//... continue with the remaining 3000 objects...

My question is: should I favor this solution over looping through the original slice every time? Since I don't have millions of objects, I assume it would be better to keep the important data in one globally available map and simply access it with m["sample_UID"], rather than looping through the whole slice on every lookup.
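For illustration, the map described above could be built in a single pass over the slice instead of being filled in by hand. This is only a sketch: the type M is a stand-in for bson.M so the example compiles without the MongoDB driver, and buildIndex is a hypothetical helper name, not something from the question.

```go
package main

import "fmt"

// M is a stand-in for bson.M (an assumption made to keep the sketch self-contained).
type M map[string]interface{}

// buildIndex walks the slice once and maps each string uid to its object.
// Objects without a string "uid" field are skipped.
func buildIndex(objects []M) map[string]M {
	index := make(map[string]M, len(objects))
	for _, obj := range objects {
		uid, ok := obj["uid"].(string)
		if !ok {
			continue
		}
		index[uid] = obj
	}
	return index
}

func main() {
	objects := []M{
		{"uid": "sample_UID_0", "name": "first"},
		{"uid": "sample_UID_1", "name": "second"},
	}
	index := buildIndex(objects)

	// The two-value form distinguishes a missing uid from a nil object.
	if obj, found := index["sample_UID_1"]; found {
		fmt.Println(obj["name"]) // prints "second"
	}
}
```

Because Go maps hold references to the underlying map values here, the index does not copy the 4 KB objects; it only adds the per-entry overhead of the map itself.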

Answer 1

Score: 1

Beyond performance and memory usage, there is one main difference between the two solutions: a map, as you have defined it, can contain only one entry per id, whereas nothing prevents the same id from appearing multiple times in your slice.
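A small sketch of that difference, assuming a stand-in type M in place of bson.M. With a plain map[string]M, a later object silently replaces an earlier one with the same uid; if duplicates matter, one option is to map each uid to a slice of objects (groupByUID is a hypothetical helper name).

```go
package main

import "fmt"

// M is a stand-in for bson.M (an assumption made to keep the sketch self-contained).
type M map[string]interface{}

// groupByUID keeps every object that shares a uid instead of overwriting.
func groupByUID(objects []M) map[string][]M {
	groups := make(map[string][]M)
	for _, obj := range objects {
		if uid, ok := obj["uid"].(string); ok {
			groups[uid] = append(groups[uid], obj)
		}
	}
	return groups
}

func main() {
	objects := []M{
		{"uid": "dup", "version": 1},
		{"uid": "dup", "version": 2},
	}

	// Plain map: the second object overwrites the first.
	single := make(map[string]M)
	for _, obj := range objects {
		single[obj["uid"].(string)] = obj
	}
	fmt.Println(len(single), single["dup"]["version"]) // prints "1 2"

	// Grouped map: both objects are kept.
	fmt.Println(len(groupByUID(objects)["dup"])) // prints "2"
}
```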

In general, if your slice is sorted by id, you can use a binary search, which should show no noticeable difference from a lookup in a map; however, you must keep the slice sorted when adding new entries. So the benefit of this approach depends on how frequently the slice is modified. A slice should also use less memory than a map.
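The binary search approach can be sketched with sort.Slice and sort.Search from the standard library. The type M again stands in for bson.M, and uidOf, sortByUID and findSorted are hypothetical helper names chosen for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// M is a stand-in for bson.M (an assumption made to keep the sketch self-contained).
type M map[string]interface{}

// uidOf returns the object's uid, or "" if it has no string uid.
func uidOf(obj M) string {
	uid, _ := obj["uid"].(string)
	return uid
}

// sortByUID sorts the slice in place by uid; it must be repeated
// (or the insert position found by search) whenever entries are added.
func sortByUID(objects []M) {
	sort.Slice(objects, func(i, j int) bool {
		return uidOf(objects[i]) < uidOf(objects[j])
	})
}

// findSorted binary-searches a slice that is already sorted by uid.
func findSorted(objects []M, uid string) (M, bool) {
	i := sort.Search(len(objects), func(i int) bool {
		return uidOf(objects[i]) >= uid
	})
	if i < len(objects) && uidOf(objects[i]) == uid {
		return objects[i], true
	}
	return nil, false
}

func main() {
	objects := []M{
		{"uid": "c", "name": "third"},
		{"uid": "a", "name": "first"},
		{"uid": "b", "name": "second"},
	}
	sortByUID(objects)

	if obj, found := findSorted(objects, "b"); found {
		fmt.Println(obj["name"]) // prints "second"
	}
}
```

For 3000 entries a binary search needs at most about 12 comparisons per lookup, which is why the gap to a map lookup is expected to be small.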

In your specific example, 3000 entries is not that many. Depending on how often you search the data structure, you may not notice any difference in performance. You may want to run a benchmark to check.
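One possible way to run such a benchmark is testing.Benchmark, which works in an ordinary program without go test. The sketch below uses synthetic objects that hold only a uid (real 4 KB objects are not reproduced), a stand-in type M for bson.M, and hypothetical helpers makeObjects and findLinear; the timings it prints depend entirely on the machine, so no results are claimed here.

```go
package main

import (
	"fmt"
	"testing"
)

// M is a stand-in for bson.M (an assumption made to keep the sketch self-contained).
type M map[string]interface{}

// sink keeps the compiler from discarding the lookups being measured.
var sink M

// makeObjects builds n synthetic objects with uids sample_UID_0..n-1.
func makeObjects(n int) []M {
	objects := make([]M, n)
	for i := range objects {
		objects[i] = M{"uid": fmt.Sprintf("sample_UID_%d", i)}
	}
	return objects
}

// findLinear scans the slice until it finds a matching uid.
func findLinear(objects []M, uid string) (M, bool) {
	for _, obj := range objects {
		if obj["uid"] == uid {
			return obj, true
		}
	}
	return nil, false
}

func main() {
	objects := makeObjects(3000)
	index := make(map[string]M, len(objects))
	for _, obj := range objects {
		index[obj["uid"].(string)] = obj
	}
	target := "sample_UID_2999" // last element: worst case for the linear scan

	linear := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink, _ = findLinear(objects, target)
		}
	})
	mapped := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = index[target]
		}
	})

	fmt.Println("linear scan:", linear)
	fmt.Println("map lookup: ", mapped)
}
```

Whether the measured gap matters depends on how many lookups the program performs, which is the point made above.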

Posted by huangapple on 2022-07-05 22:00:26
Original link: https://go.coder-hub.com/72870741.html