2023-03-09 20:32:27

Does MongoDB use indexes after root replacement?

Question

I use the following aggregation pipeline:

db.myCollection.aggregate([
    {
        "$match": {
            "x": {
                "$lte": 100
            }
        }
    }, {
        "$sort": {
            "x": -1
        }
    }, {
        "$group": {
            "_id": {
                "y": "$y",
                "z": "$z"
            },
            "r": {
                "$first": "$$CURRENT"
            }
        }
    }, {
        "$replaceRoot": {
            "newRoot": "$r"
        }
    }, {
        "$match": {
            "a": true,
            "b": "value"
        }
    }
]);

Obviously, the performance of the query can be improved by adding the following index:

db.myCollection.createIndex({ x: -1 });

As I read the documentation, the $group stage will not use an index here, because it groups by different fields than the ones used in the $match and $sort stages. Am I right, or is it possible to add the x and y fields to the first index so that $group can use it?

Also, please help me understand: can MongoDB use indexes after the $replaceRoot stage? Will adding the following index improve performance?

db.myCollection.createIndex({ a: 1, b: 1 });

Answer 1

Score: 4

You've asked two different questions here. This answer will focus mostly on the question in the title which is about index usage in a $match stage that follows a $replaceRoot.

The answer to that question is: no, an index cannot help here in your particular case.

Keep in mind that indexes are used to efficiently find documents in the collection. After the $group stage (and therefore the $replaceRoot as well), you no longer have the original documents; you have synthesized something new. The database operates on those new items from that point on, so there is generally no benefit (or even ability) to go back to the collection and scan the source documents again.

There is potentially a special case when the $match predicates are on fields used during the grouping, but that does not apply here either. You confirmed in the comments that you cannot manually move the trailing $match stage earlier, as it would break the business logic. The database cannot do so for the same reason: it is not allowed to perform any optimization that would logically change the results requested by the client.
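To see why moving the trailing $match ahead of the $group changes the result, here is a minimal sketch in plain JavaScript that mimics the pipeline on an in-memory array. The sample documents and the helper names (`latestPerGroup`, `matchesAB`) are made up for illustration, and this is not how the server actually executes the pipeline.

```javascript
// Hypothetical sample data; field names follow the question's pipeline.
const docs = [
  { x: 90,  y: 1, z: 1, a: false, b: "value" },
  { x: 50,  y: 1, z: 1, a: true,  b: "value" },
  { x: 80,  y: 2, z: 1, a: true,  b: "value" },
  { x: 150, y: 2, z: 1, a: true,  b: "value" }, // dropped by x <= 100
];

// Mimics $match {x: {$lte: 100}}, $sort {x: -1}, $group with $first, $replaceRoot.
function latestPerGroup(input) {
  const sorted = input.filter(d => d.x <= 100).sort((p, q) => q.x - p.x);
  const firstByKey = new Map();
  for (const d of sorted) {
    const key = JSON.stringify([d.y, d.z]);
    // Keep only the first (highest x) document seen for each (y, z) pair.
    if (!firstByKey.has(key)) firstByKey.set(key, d);
  }
  return [...firstByKey.values()];
}

// Mimics the trailing $match {a: true, b: "value"}.
const matchesAB = d => d.a === true && d.b === "value";

// Original order: group first, then filter.
const groupThenMatch = latestPerGroup(docs).filter(matchesAB);

// Reordered: filter first, then group.
const matchThenGroup = latestPerGroup(docs.filter(matchesAB));

console.log(groupThenMatch.map(d => d.x));
console.log(matchThenGroup.map(d => d.x));
```

With this data, the original order returns one document (x = 80), because the top document of the (1, 1) group has a = false and is discarded after grouping. The reordered version returns two (x = 80 and x = 50), because the filter lets a different document become the first in that group.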

As for your other question about index usage for $group itself: last I checked, the optimization you linked was not applicable to grouping on compound values, as in your example. You could try creating a compound index on all three fields ({ x: 1, y: 1, z: 1 }) and see whether the .explain() output shows a plan with a DISTINCT_SCAN on the version you are using. Even if it does, it might not necessarily be faster, depending on how many unique values there are for y and z.
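One way to run that check in mongosh is sketched below. The helper `planUsesStage` is a hypothetical name. It simply walks whatever explain document it is given, looking for a `stage` field, since the exact shape of explain output varies between server versions. The `db` calls are guarded so they only run inside the shell, and the pipeline is trimmed to the first three stages from the question.

```javascript
// Hypothetical helper: recursively search an explain document for a plan stage.
function planUsesStage(node, stageName) {
  if (node === null || typeof node !== "object") return false;
  if (node.stage === stageName) return true;
  // Object.values also iterates array elements, so nested stage lists are covered.
  return Object.values(node).some(child => planUsesStage(child, stageName));
}

// Only runs inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  // Side effect: creates the compound index suggested above.
  db.myCollection.createIndex({ x: 1, y: 1, z: 1 });

  const explainOutput = db.myCollection.explain().aggregate([
    { $match: { x: { $lte: 100 } } },
    { $sort: { x: -1 } },
    { $group: { _id: { y: "$y", z: "$z" }, r: { $first: "$$CURRENT" } } },
  ]);

  print("DISTINCT_SCAN used: " + planUsesStage(explainOutput, "DISTINCT_SCAN"));
  print("IXSCAN used: " + planUsesStage(explainOutput, "IXSCAN"));
}
```

If only IXSCAN shows up, the index is serving the leading $match and $sort but not the grouping itself. Comparing timings with and without the index is still the more reliable test of whether it helps.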
