Does MongoDB use indexes after root replacement?
Question
I use the following aggregation pipeline:
db.myCollection.aggregate([
  {
    "$match": {
      "x": {
        "$lte": 100
      }
    }
  }, {
    "$sort": {
      "x": -1
    }
  }, {
    "$group": {
      "_id": {
        "y": "$y",
        "z": "$z"
      },
      "r": {
        "$first": "$$CURRENT"
      }
    }
  }, {
    "$replaceRoot": {
      "newRoot": "$r"
    }
  }, {
    "$match": {
      "a": true,
      "b": "value"
    }
  }
]);
Obviously, the performance of the query can be improved by adding the following index:
db.myCollection.createIndex({ x: -1 });
The documentation says the $group stage will not use indexes, because it groups by a different field than the $match and $sort stages. Am I right, or is it possible to add the x and y fields to the first index so that an index is used?
And please help me understand: can MongoDB use indexes after the $replaceRoot stage? Will it improve performance if I add the following index?
db.myCollection.createIndex({ a: 1, b: 1 });
Answer 1
Score: 4
You've asked two different questions here. This answer focuses mostly on the question in the title, which is about index usage in a $match stage that follows a $replaceRoot.
The answer to that question is: no, an index cannot help here in your particular case.
Keep in mind that indexes are used to efficiently find documents in the collection. But after the $group stage here (and therefore after the $replaceRoot as well), you no longer have the original documents. Instead, you have synthesized something new. The database operates on those new items from that point on, so there would generally be no benefit (or even ability) to go back to the collection and start scanning source documents again.
There is potentially a special case when the $match predicates are on fields used during the grouping. However, that does not apply here either. You confirmed in the comments that you cannot manually move the trailing $match stage earlier, as it would break the business logic. The database cannot do so for the same reason: it is not allowed to perform any optimization that would logically change the results requested by the client.
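To make the reordering problem concrete, here is a minimal sketch in plain JavaScript that emulates the pipeline in memory rather than calling MongoDB. The sample documents and the helper names (latestPerGroup, matchesAB) are made up for illustration; only the field names come from the question.

```javascript
// Hypothetical sample documents; field names follow the question's pipeline.
const docs = [
  { x: 90, y: 1, z: 1, a: false, b: "value" },
  { x: 80, y: 1, z: 1, a: true, b: "value" },
  { x: 70, y: 2, z: 2, a: true, b: "value" },
];

// Emulates: $match {x: {$lte: 100}} -> $sort {x: -1}
//           -> $group by {y, z} keeping $first -> $replaceRoot
function latestPerGroup(input) {
  const sorted = input.filter((d) => d.x <= 100).sort((p, q) => q.x - p.x);
  const firstByKey = new Map();
  for (const d of sorted) {
    const key = JSON.stringify([d.y, d.z]);
    if (!firstByKey.has(key)) firstByKey.set(key, d);
  }
  return [...firstByKey.values()];
}

// Emulates the trailing $match {a: true, b: "value"}
const matchesAB = (d) => d.a === true && d.b === "value";

// Original order: group first, then filter.
const filterAfter = latestPerGroup(docs).filter(matchesAB);
// Reordered: filter first, then group.
const filterBefore = latestPerGroup(docs.filter(matchesAB));

console.log(filterAfter.map((d) => d.x));  // [ 70 ]
console.log(filterBefore.map((d) => d.x)); // [ 80, 70 ]
```

In the original order, the group (y: 1, z: 1) is represented by the x = 90 document, which then fails the final filter, so the group disappears. Filtering first lets the x = 80 document stand in for that group instead, which is a different result.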
As for your other question about index usage for $group itself: last I checked, the optimization you linked was not applicable to grouping on compound values, as you have in your example. You could try creating a compound index on all three fields ({ x: 1, y: 1, z: 1 }) and see whether the .explain() output shows a plan with a DISTINCT_SCAN on the version that you are using. Even if it does, it might not necessarily be faster, depending on how many unique values there are for y and z.
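One way to check the explain output without reading it by eye is a small helper that collects every plan stage name from the returned document. The sketch below is an assumption-laden illustration: collectStages is a hypothetical helper, and sampleExplain is a hand-written stand-in, not output copied from a real server. The exact shape of explain output varies by MongoDB version, which is why the helper walks the whole object instead of relying on a fixed path.

```javascript
// Collects every string stored under a "stage" key anywhere in an explain document.
function collectStages(node, found = []) {
  if (Array.isArray(node)) {
    for (const item of node) collectStages(item, found);
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") found.push(node.stage);
    for (const value of Object.values(node)) collectStages(value, found);
  }
  return found;
}

// Hand-written stand-in for explain output (an assumption, not real server output).
const sampleExplain = {
  stages: [
    {
      $cursor: {
        queryPlanner: {
          winningPlan: {
            stage: "FETCH",
            inputStage: { stage: "IXSCAN", indexName: "x_-1" },
          },
        },
      },
    },
    { $group: {} },
  ],
};

const stages = collectStages(sampleExplain);
console.log(stages);                           // [ 'FETCH', 'IXSCAN' ]
console.log(stages.includes("DISTINCT_SCAN")); // false

// In mongosh, the same helper could be applied to real output, for example:
//   collectStages(db.myCollection.explain().aggregate(pipeline))
// where `pipeline` is the array from the question.
```

If DISTINCT_SCAN does not appear in the collected stages on your version, the $group is being fed by an ordinary scan and the compound index is not being used for the grouping itself.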