英文:
MongoDB: querying against multiple concatenated collections
问题
我有多个集合,其中包含相同类型和架构的数据。
我只希望MongoDB将它们视为单个集合。
我不想要连接集合,我只想按照连续文档的方式查询它们,所有文档都具有相同的架构。我知道这是一个特殊的用例。其他人一直在谈论基于列聚合集合并进行SQL连接。
我希望查询根据插入时间对集合进行排序。
假设我们有一个名为C1的集合,其中包含C1_d1、c1_d2、c1_d3等文档,以及集合C2,其中包含C2_d1和C2_d2。假设首先将C2_d1插入到C2,然后将C1_d1插入到C1,然后将C2_d2插入到C2,然后插入剩下的C1文档。
现在我想同时查询C1和C2,因此它应该按照以下顺序考虑文档:
C2_d1,C1_d1,C2_d2,C1_d2,C1_d3,就像它们是单个集合中的文档一样。
如果可以的话,我希望您用JVM驱动程序进行描述。
英文:
i have hot multiple collections which has got data of the same type and schema.
all i want is mongodb to look at them as they are single collection
i don't want to join collections.i just want to query them as if they are a single collection with consecutive documents.all documents has got the same schema.i know it's a special usecase.others keep talking about aggregating collection based on a column and doing a sql join.
i want the query to sort the collection based on the insert time
suppose that we have got a collection named C1 which has got C1_d1,c1_d2,c1_d3 documents and the collection C2 has got the C2_d1 and C2_d2 and suppose that first C2_d1 has been inserted to C2 and then C1_d1 to C1, then C2_d2 has been inserted to C2 and then the rest of C1 documents
now i want to query C1 and C2 simulteniously, so it should consider documents in this order:
C2_d1 , C1_d1 , C2_d2 , C1_d2 , C1_d3 as if they are in a single collection
i appreciate if you describe it with the jvm driver
答案1
得分: 3
你可以通过使用以下聚合管道查询来实现这一点:
db.C1.aggregate([{ $unionWith: "C2" }])
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbcf"), "name" : "C1_d1" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd0"), "name" : "C1_d2" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd1"), "name" : "C1_d3" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd2"), "name" : "C2_d1" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd3"), "name" : "C2_d2" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd4"), "name" : "C1_d3" }
这将联合C1
和C2
集合,如果你需要以某种方式对其进行排序,你可以在末尾添加一个排序阶段:
db.C1.aggregate([
{ $unionWith: "C2" },
{ $sort: {field1: 1} }
])
如果你总是以这种方式查询,你可能希望将其放入一个视图中,以使得操作变得更加简便。要创建一个管道视图,请使用以下代码:
db.createView("View1",
"C1",
[{ $unionWith: "C2" }]
);
现在,如果我们向 MongoDB 集合中插入一些数据:
db.C1.insertMany([
{name: "C1_d1"},
{name: "C1_d2"},
{name: "C1_d3"}
]);
db.C2.insertMany([
{name: "C2_d1"},
{name: "C2_d2"},
{name: "C1_d3"}
]);
我们可以查询视图并获得结果:
db.View1.find()
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbcf"), "name" : "C1_d1" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd0"), "name" : "C1_d2" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd1"), "name" : "C1_d3" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd2"), "name" : "C2_d1" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd3"), "name" : "C2_d2" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd4"), "name" : "C1_d3" }
此外,在你的 Java 代码中,你可以通过以下方式访问该视图:
MongoClient mongoClient = new MongoClient();
DB database = mongoClient.getDB("test");
DBCollection collection = database.getCollection("View1");
如果在创建视图时出现以下错误:
Invalid pipeline for view test.View1 :: caused by :: $unionWith is not allowed in the current feature compatibility version. See https://docs.mongodb.com/master/release-notes/4.4-compatibility/#feature-compatibility for more information.
你可以使用以下命令开启此特性:
db.adminCommand( { setFeatureCompatibilityVersion: "4.4" } )
英文:
You can achieve this by just querying using an aggregation pipeline of the following:
db.C1.aggregate([{ $unionWith: "C2" }])
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbcf"), "name" : "C1_d1" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd0"), "name" : "C1_d2" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd1"), "name" : "C1_d3" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd2"), "name" : "C2_d1" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd3"), "name" : "C2_d2" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd4"), "name" : "C1_d3" }
This will union both C1
and C2
collections, if you need to have these sorted in some way, you can add a sort stage to the end.
db.C1.aggregate([
{ $unionWith: "C2" },
{ $sort: {field1: 1} }
])
If you're always querying like this you might want to put this in a view to make your life a bit easier, to create a view of a pipeline use the following code:
db.createView("View1",
"C1",
[{ $unionWith: "C2" }]
);
Now if we insert some data in to our MongoDB collections.
db.C1.insertMany([
{name: "C1_d1"},
{name: "C1_d2"},
{name: "C1_d3"}
]);
db.C2.insertMany([
{name: "C2_d1"},
{name: "C2_d2"},
{name: "C1_d3"}
]);
We'll be able to query the view and get back our results.
db.View1.find()
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbcf"), "name" : "C1_d1" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd0"), "name" : "C1_d2" }
{ "_id" : ObjectId("5f5895b5c5140b67d2d6cbd1"), "name" : "C1_d3" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd2"), "name" : "C2_d1" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd3"), "name" : "C2_d2" }
{ "_id" : ObjectId("5f5895bec5140b67d2d6cbd4"), "name" : "C1_d3" }
Also, within your java code you'll be able to access the view with
MongoClient mongoClient = new MongoClient();
DB database = mongoClient.getDB("test");
DBCollection collection = database.getCollection("View1");
If you get the following error when creating the view:
Invalid pipeline for view test.View1 :: caused by :: $unionWith is not allowed in the current feature compatibility version. See https://docs.mongodb.com/master/release-notes/4.4-compatibility/#feature-compatibility for more information.
you can turn on the feature with the following command
db.adminCommand( { setFeatureCompatibilityVersion: "4.4" } )
答案2
得分: 0
除了上面足够的答案外,我想分享一下它的 JVM API 版本。在聚合函数中有一些可以使用的聚合阶段,无论你是否使用 Java API。
例如,您可以使用 $skip 阶段跳过一些结果,或者使用 $limit 阶段限制结果。$unionWith 是最新的阶段,只适用于 v4.4,我之前并不知道。当前版本为 4.4,我已经更新了我的 MongoDB 服务器,并且能够使用该阶段。其他人可能会在更复杂的情况下使用聚合,比如执行连接和分组等操作。
因此,根据我所说的,我们可以将两个(或更多)集合合并成一个,并对其进行查询:
val observable = MongoDBGateway.mongoClient.getDatabase("DB1")
.getCollection("C1")
.aggregate(mutableListOf(
Document("$unionWith", "C2"),
Document("$sort", Document("Data", 1)),
Document("$match", Filters.or(
Filters.eq("Data", 2),
Filters.eq("Data", 1)
)),
Document("$skip", 2),
Document("$limit", 2),
))
val iterator = observable.iterator()
while (iterator.hasNext())
println(iterator.next())
我使用了 Kotlin 语言,但在任何 JVM 语言上都是相同的。如您所见,我使用了 $unionWith 阶段将 C1 集合与 C2 集合连接起来。然后我立即对结果按 Data 列进行升序排序,接着对最后一个阶段的结果进行了查询,其中 Data 列为 1 或 2,然后跳过了 2 个结果,并限制了结果数量。
因此,尽管我们无法直接访问 FindIterable 结果来执行查询、限制或跳过操作,但我们可以通过这些阶段来实现所有这些操作。它们可能会在后面提供一个不需要了解阶段的 API。
最后一点,注意阶段的顺序。因为每个阶段的结果都会传递给下一个阶段。例如,如果我们将 $limit 阶段放在第一位,它将首先获取 C1 的前两个文档,然后再执行其余操作,最终的结果将完全不同。因此,首先进行联合,然后排序,然后匹配,然后跳过,最后限制。
英文:
in addition to the answer above which is enough, i want to share the jvm api version of it.there are some aggregate stages which can be used, in the aggregate function whether you are using Java api or not.
for example you can skip some result with $skip stage or limit the result with $limit stage.the $unionWith is the newest stage which only works on v4.4 and i did not know that.as i'm speaking now the current version is 4.4 and i updated my MongoDB server and i was able to use that stage.others may use aggregates for more complex situations like doing joins and grouping and etc...
so with that said, we can combine two(or more) collection into one and do a query on it:
val observable = MongoDBGateway.mongoClient.getDatabase("DB1")
.getCollection("C1")
.aggregate(mutableListOf(
Document("$unionWith", "C2"),
Document("$sort", Document("Data",1)),
Document("$match", Filters.or(
Filters.eq("Data",2),
Filters.eq("Data",1)
)),
Document("$skip", 2),
Document("$limit", 2),
))
val iterator = observable.iterator()
while (iterator.hasNext())
println(iterator.next())
i have used Kotlin language, but it's same on any jvm language.as you can see, i have concatenated the C1 collection with C2 with the help of $unionWith stage.then immediately i have sorted the result on Data column ascending and then i have done a query on the result of last stage where Data column is 1 or 2 and then i have skipped 2 of the results and the i have limitted the result
so although we don't have access to the FindIterable result to do queries and limits or skip, we could do all that with the stages.they may provide an api later that does not need to know about stages.
one last thing, pay attention to the order of stages.because the result of each stage is piped to the next.for example if we put $limit stage first, it will first get the first 2 documents of C1 and then it will do the rest and the total result will be absolutely different.so first do the union and then sort and then match and then skip and then limit.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论