mgo - bson.ObjectId vs string id
Question
Using mgo, it seems that best practice is to set object ids to be bson.ObjectId.
This is not very convenient, because instead of a plain string id, the id is stored as binary in the DB. Googling this yields tons of questions like "how do I get a string out of the bson id?", and indeed in golang the ObjectId has a Hex() method that returns the string.
The bson becomes even more annoying to work with when exporting data from mongo to another DB platform. This happens when dealing with collected big data that you want to merge with some properties from the back office mongo DB. It means a lot of pain: you need to transform the binary ObjectId to a string in order to join with the id on platforms that do not use the bson representation.
My question is: what are the benefits of using bson.ObjectId vs string id? Will I lose anything significant if I store my mongo entities with a plain string id?
Answer 1
Score: 3
As was already mentioned in the comments, storing the ObjectId as a hex string would double the space needed for it (24 characters instead of 12 bytes), and if you want to extract one of its embedded values, you'd first need to construct an ObjectId from that string.
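To make the size difference concrete, here is a minimal sketch using only the Go standard library, not mgo itself. It relies on the documented ObjectId layout, in which the first four bytes hold a big-endian Unix timestamp in seconds. The helper name objectIDTime is made up for this illustration; with mgo you would more likely reach for bson.ObjectIdHex and the Time() method of ObjectId.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"time"
)

// objectIDTime is a hypothetical helper: it decodes a 24-character hex
// ObjectId into its 12 raw bytes and returns the creation time stored in
// the first four bytes (big-endian seconds since the Unix epoch).
func objectIDTime(hexID string) (time.Time, error) {
	raw, err := hex.DecodeString(hexID)
	if err != nil {
		return time.Time{}, err
	}
	if len(raw) != 12 {
		return time.Time{}, fmt.Errorf("expected 12 bytes, got %d", len(raw))
	}
	secs := binary.BigEndian.Uint32(raw[:4])
	return time.Unix(int64(secs), 0).UTC(), nil
}

func main() {
	const hexID = "56b0d36c23da2af0363abe37"

	// The binary form is half the size of its hex representation.
	raw, err := hex.DecodeString(hexID)
	if err != nil {
		panic(err)
	}
	fmt.Printf("binary: %d bytes, hex string: %d bytes\n", len(raw), len(hexID))

	// Reading the embedded timestamp requires going back to the bytes first.
	created, err := objectIDTime(hexID)
	if err != nil {
		panic(err)
	}
	fmt.Println("created:", created)
}
```

The point of the sketch is only that a hex string id has to be decoded back into bytes before anything stored inside it can be read, which is the extra step described above.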
But you have a misconception. There is absolutely no need to use an ObjectId for the mandatory _id field. Quite often, I advise against it. Here is why.
Take the simple example of a book, with relations and some other considerations set aside for simplicity:
{
_id: ObjectId("56b0d36c23da2af0363abe37"),
isbn: "978-3453056657",
title: "Neuromancer",
author: "William Gibson",
language: "German"
}
Now, what use would the ObjectId have here? Actually none. It would be an index with hardly any use, since you would never search your book database by an artificial key like that. It holds no semantic value. It would be a unique ID for an object which already has a globally unique ID: the ISBN.
So we simplify our book document like this:
{
_id: "978-3453056657",
title: "Neuromancer",
author: "William Gibson",
language: "German"
}
We have reduced the size of the document, made use of a preexisting globally unique ID, and no longer have a basically unused index.
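On the Go side, the simplified document could be modelled roughly as follows. This is a sketch under assumptions: the Book type and the bsonKey helper are invented for illustration, and the block only inspects the struct tags with the standard library rather than talking to a database. The `bson:"_id"` tag is the convention mgo's bson package uses to map a field onto the _id key.

```go
package main

import (
	"fmt"
	"reflect"
)

// Book is a hypothetical type mirroring the simplified document:
// the ISBN itself is stored as _id, so no separate ObjectId is needed.
type Book struct {
	ISBN     string `bson:"_id"`
	Title    string `bson:"title"`
	Author   string `bson:"author"`
	Language string `bson:"language"`
}

// bsonKey returns the bson struct tag of the named field of a struct value,
// or an empty string if the field does not exist.
func bsonKey(v interface{}, field string) string {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get("bson")
}

func main() {
	b := Book{
		ISBN:     "978-3453056657",
		Title:    "Neuromancer",
		Author:   "William Gibson",
		Language: "German",
	}
	fmt.Printf("%s is stored under key %q\n", b.ISBN, bsonKey(b, "ISBN"))
}
```

With mgo, such a value could then be passed to a collection's Insert method and looked up again with FindId using the ISBN string, with no hex conversion involved.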
Back to your basic question of whether you lose something by not using ObjectIds: quite often, not using the ObjectId is the better choice. But if you do use it, use the binary form.