June 29, 2023 18:26:36

How to use MongoDB aggregation to get only those records whose index is divisible by 5?

Question

Here is what my data looks like in a collection named usages:

{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675122960, usage: null, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123020, usage: 2.74, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123080, usage: 2.62, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123140, usage: 4.32, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123200, usage: null, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123260, usage: 5.38, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123320, usage: 5.08, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123380, usage: 1.02, limit: 2.7 },
{ _id: AUTOGENERATED_MONGO_OBJECT_ID, timeStamp: 1675123440, usage: 2.35, limit: 2.7 },

Please note: this is just a sample of how my data looks; I actually have 40,000 records.

I am trying to get the last 60 records using this mongoose query:

  const count = await Usage.count()
  Usage.find({}).sort({ timeStamp: 1 }).skip(count - 60);

So far, everything is working fine.

Now there is a requirement to return fewer data points for the same range. Say the output should contain only 12 of the last 60 records, without losing too much of the shape of the data. We will plot a line chart on the frontend using this data.

One way to do that is to keep only the records whose index is divisible by 5.

Here is my code:

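    // `start` is the number of trailing records to fetch (60 in this example)
    const count = await Usage.count();
    const data = await Usage
        .find({})
        .sort({ timeStamp: 1 })
        .skip(count - start)
        .exec();
    // Keep every 5th record (zero-based indexes 0, 5, 10, ...)
    const requiredData = data.filter((item, index) => index % 5 === 0);
    console.log(requiredData);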

The above code works as expected, but it is very slow because I filter the data only after fetching it from the database.

Ideally, the filtering should happen in the query so that the database returns only the filtered results.

Can someone help me achieve that?

Here is a very similar question: https://stackoverflow.com/questions/62201364/filter-items-of-array-by-index-in-aggregation-pipeline

In that question, however, the OP wants to filter on a particular field named arr, whereas I want to filter across the whole collection. I also don't know much about writing queries with aggregate. I tried some video tutorials, but it looks a bit complex and I need more time to learn it.

Answer 1

Score: 3

Try this one:

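db.collection.aggregate([
   // Take the 60 newest documents (timeStamp is the sort field from the question)
   { $sort: { timeStamp: -1 } },
   { $limit: 60 },
   // Number them 1..60 in ascending time order (requires MongoDB 5.0+)
   {
      $setWindowFields: {
         sortBy: { timeStamp: 1 },
         output: {
            pos: { $documentNumber: {} }
         }
      }
   },
   // $documentNumber starts at 1, so remainder 1 keeps positions 1, 6, 11, ...
   { $set: { remainder: { $mod: ["$pos", 5] } } },
   { $match: { remainder: 1 } },
   { $sort: { timeStamp: 1 } },
   // Drop the helper fields
   { $unset: ["pos", "remainder"] }
])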
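
To call this from Mongoose, the same stages can be generated by a small helper that uses the question's `timeStamp` field. This is only a sketch, not part of the original answer: `buildSamplingPipeline`, `sampleInMemory`, `lastN`, and `step` are hypothetical names, and it assumes MongoDB 5.0 or newer, which `$setWindowFields` requires. Because `$documentNumber` starts at 1, matching remainder `1 % step` keeps positions 1, 6, 11, ..., the same documents as zero-based indexes 0, 5, 10, ... in the question's `filter` call.

```javascript
// Hypothetical helper: pipeline for "every `step`-th document of the last `lastN`".
function buildSamplingPipeline(lastN, step, sortField = "timeStamp") {
  if (!Number.isInteger(lastN) || lastN < 1) {
    throw new RangeError("lastN must be a positive integer");
  }
  if (!Number.isInteger(step) || step < 1) {
    throw new RangeError("step must be a positive integer");
  }
  return [
    // Newest `lastN` documents first
    { $sort: { [sortField]: -1 } },
    { $limit: lastN },
    // Number them 1..lastN in ascending order of `sortField`
    {
      $setWindowFields: {
        sortBy: { [sortField]: 1 },
        output: { pos: { $documentNumber: {} } },
      },
    },
    // Keep positions 1, 1 + step, 1 + 2 * step, ...
    { $match: { $expr: { $eq: [{ $mod: ["$pos", step] }, 1 % step] } } },
    { $sort: { [sortField]: 1 } },
    { $unset: "pos" },
  ];
}

// In-memory version of the same rule, for comparison with the question's filter.
// Assumes `docs` is already sorted ascending by time.
function sampleInMemory(docs, lastN, step) {
  return docs.slice(-lastN).filter((_, i) => (i + 1) % step === 1 % step);
}
```

Assuming `Usage` is the Mongoose model from the question, the call would look like `const data = await Usage.aggregate(buildSamplingPipeline(60, 5));`. Sorting descending and limiting also removes the need for the separate `count()` query.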

With $setWindowFields you can go further, for example by skipping a document when its difference from the previous document is less than some threshold.
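
As one illustration of that idea, the sketch below uses the `$shift` window operator to attach the previous document's `usage` value to each document, then keeps only documents whose change from it is at least a threshold. The names `buildChangePipeline`, `filterByChange`, `minDelta`, and `prevUsage` are hypothetical. The handling of `null` readings is an assumption made here: they are dropped, except that the first document is always kept. Each document is compared with its immediate predecessor in the window, not with the last document that was kept, and the pipeline has not been run against the asker's data.

```javascript
// Hypothetical helper: keep documents whose `usage` moved by at least `minDelta`
// relative to the previous document (ordered by `sortField`).
function buildChangePipeline(lastN, minDelta, sortField = "timeStamp") {
  return [
    { $sort: { [sortField]: -1 } },
    { $limit: lastN },
    {
      $setWindowFields: {
        sortBy: { [sortField]: 1 },
        output: {
          pos: { $documentNumber: {} },
          // Value of `usage` one document earlier; null for the first document
          prevUsage: { $shift: { output: "$usage", by: -1, default: null } },
        },
      },
    },
    {
      $match: {
        $expr: {
          $or: [
            { $eq: ["$pos", 1] }, // always keep the first document
            // A null operand makes the difference null, which should fail $gte
            { $gte: [{ $abs: { $subtract: ["$usage", "$prevUsage"] } }, minDelta] },
          ],
        },
      },
    },
    { $sort: { [sortField]: 1 } },
    { $unset: ["pos", "prevUsage"] },
  ];
}

// In-memory mirror of the same rule, useful for checking expectations locally.
// Assumes `docs` is already sorted ascending by time.
function filterByChange(docs, minDelta) {
  return docs.filter((doc, i) => {
    if (i === 0) return true;
    const prev = docs[i - 1].usage;
    if (doc.usage == null || prev == null) return false;
    return Math.abs(doc.usage - prev) >= minDelta;
  });
}
```

Unlike the fixed every-5th rule, this returns a variable number of points, so it suits cases where flat stretches can be thinned out but jumps must stay visible in the chart.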
