How to get sum of specific number of nested objects in an array sorted by date with aggregation pipeline, MongoDB

Question

I need to get the sum of the "points" of the 3 most recent entries/objects in the members array, based on date. So the total should be 3000 (Person B 500, Person C 1000, and Person D 1500).

How do I do that with my current aggregation pipeline? Do I use slice or something?

Here's the document:

[{
  "_id": { "$oid": "00001" },
  "leaderId": 001,
  "leader": "Josh",
  "members": [
    {
      "name": "Person A",
      "points": 500,
      "date": { "$date": "2023-05-30T18:00:00.000+00:00" }
    },
    {
      "name": "Person B",
      "points": 500,
      "date": { "$date": "2023-05-30T19:10:00.000+00:00" }
    },
    {
      "name": "Person C",
      "points": 1000,
      "date": { "$date": "2023-05-30T19:20:00.000+00:00" }
    },
    {
      "name": "Person D",
      "points": 1500,
      "date": { "$date": "2023-05-30T19:30:00.000+00:00" }
    }
  ]
}]

Here's my aggregation pipeline, which returns 3500 because it sums the points of all objects in the members array:

db.users.aggregate([
  { $match: { leader: "Josh" } },
  { $unwind: "$members" },
  { $sort: { "members.date": -1 } },
  { $group: { _id: "$leaderId", latestThreePoints: { $sum: "$members.points" } } }
])

Answer 1

Score: 1

Building on your current pipeline, you just need an extra $limit stage after the $sort:

db.collection.aggregate([
  {
    $match: {
      leader: "Josh"
    }
  },
  {
    $unwind: "$members"
  },
  {
    $sort: {
      "members.date": -1
    }
  },
  {
    $limit: 3
  },
  {
    $group: {
      _id: "$leaderId",
      latestThreePoints: {
        $sum: "$members.points"
      }
    }
  }
])
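
One caveat worth keeping in mind: $limit applies to the whole stream of unwound documents, not per group, so this pipeline gives the intended total only when $match selects a single leader document. The stage-by-stage effect can be sketched in plain JavaScript with no MongoDB connection involved. The variable names and the use of Date objects here are assumptions made for illustration, and leaderId is written as 1.

```javascript
// Sample document from the question, with dates as JavaScript Date objects.
const doc = {
  leaderId: 1,
  leader: "Josh",
  members: [
    { name: "Person A", points: 500, date: new Date("2023-05-30T18:00:00.000+00:00") },
    { name: "Person B", points: 500, date: new Date("2023-05-30T19:10:00.000+00:00") },
    { name: "Person C", points: 1000, date: new Date("2023-05-30T19:20:00.000+00:00") },
    { name: "Person D", points: 1500, date: new Date("2023-05-30T19:30:00.000+00:00") },
  ],
};

// Analogue of $unwind: one record per array element.
const unwound = doc.members.map((m) => ({ leaderId: doc.leaderId, members: m }));

// Analogue of $sort on "members.date" descending (newest first).
const sorted = [...unwound].sort((a, b) => b.members.date - a.members.date);

// Analogue of $limit: 3, which keeps only the first three records of the stream.
const limited = sorted.slice(0, 3);

// Analogue of $group with $sum over "members.points".
const latestThreePoints = limited.reduce((sum, r) => sum + r.members.points, 0);

console.log(latestThreePoints); // 3000
```

Without the slice step, the same reduction runs over all four records and yields 3500, which matches the result reported in the question.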

Answer 2

Score: 1

I agree with @ray's answer that you can achieve your desired result from the starting pipeline by adding a $limit stage.

That said, it seems as if the _id field is related to the leaderId field. If so, the $unwind -> processing -> $group on leaderId approach being used is an antipattern, as it unnecessarily introduces a blocking stage.

Instead, consider processing the documents in place. You mentioned $slice, which seems like the appropriate solution to me. If we know that the members array is always in ascending date order, then the following should compute what you want:

{
  "$project": {
    _id: "$leaderId",
    "latestThreePoints": {
      "$reduce": {
        "input": {
          "$slice": [
            "$members",
            -3
          ]
        },
        "initialValue": 0,
        "in": {
          $sum: [
            "$$this.points",
            "$$value"
          ]
        }
      }
    }
  }
}
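
As a rough illustration of what this stage computes, here is a plain JavaScript analogue (not MongoDB code): Array.prototype.slice(-3) stands in for $slice with -3, and Array.prototype.reduce stands in for $reduce. The names members and sumLastThree are hypothetical; only the points and the ordering come from the sample data in the question.

```javascript
// Members in ascending date order, as the answer assumes.
const members = [
  { name: "Person A", points: 500 },
  { name: "Person B", points: 500 },
  { name: "Person C", points: 1000 },
  { name: "Person D", points: 1500 },
];

// Analogue of { $slice: ["$members", -3] } fed into $reduce:
// take the last three elements, then fold their points into a running total.
function sumLastThree(arr) {
  return arr.slice(-3).reduce((value, item) => item.points + value, 0);
}

console.log(sumLastThree(members)); // 3000
// With fewer than three entries, every entry is summed.
console.log(sumLastThree(members.slice(0, 2))); // 1000
```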

If the array is not always sorted, but you are on version 6.0 or higher, then you can use the $sortArray operator. The stage would look as follows:

{
  "$project": {
    _id: "$leaderId",
    "latestThreePoints": {
      "$reduce": {
        "input": {
          "$slice": [
            {
              $sortArray: {
                input: "$members",
                sortBy: {
                  date: -1
                }
              }
            },
            3
          ]
        },
        "initialValue": 0,
        "in": {
          $sum: [
            "$$this.points",
            "$$value"
          ]
        }
      }
    }
  }
}
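
To see why the sort matters, the following plain JavaScript analogue (not MongoDB code) compares both approaches on a deliberately shuffled copy of the sample data. The shuffled order and the names shuffled, sumPoints, lastThree, and latestThree are assumptions made for illustration.

```javascript
// Sample members from the question, stored out of date order on purpose.
const shuffled = [
  { name: "Person D", points: 1500, date: new Date("2023-05-30T19:30:00.000+00:00") },
  { name: "Person A", points: 500, date: new Date("2023-05-30T18:00:00.000+00:00") },
  { name: "Person C", points: 1000, date: new Date("2023-05-30T19:20:00.000+00:00") },
  { name: "Person B", points: 500, date: new Date("2023-05-30T19:10:00.000+00:00") },
];

// Analogue of the $reduce expression: fold points into a running total.
const sumPoints = (arr) => arr.reduce((value, item) => item.points + value, 0);

// Taking the last three positions without sorting picks A, C, and B.
const lastThree = sumPoints(shuffled.slice(-3));

// Analogue of $sortArray with sortBy { date: -1 }, then $slice with 3:
// sort a copy newest first and keep the first three (D, C, and B).
const latestThree = sumPoints(
  [...shuffled].sort((a, b) => b.date - a.date).slice(0, 3)
);

console.log(lastThree);   // 2000
console.log(latestThree); // 3000
```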

Also of note, you can use $addFields instead of $project if you would like to add this new field to the existing fields in the document as opposed to removing all of the others.
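
A minimal sketch of that variant, assuming the users collection and leader filter from the question and a MongoDB version that supports $sortArray. The pipeline variable is a hypothetical name, and the expression is the same one used in the $project stage above.

```javascript
// Hypothetical pipeline: same computation, written with $addFields so that
// leader, leaderId, and members remain in the output document.
const pipeline = [
  { $match: { leader: "Josh" } },
  {
    $addFields: {
      latestThreePoints: {
        $reduce: {
          input: {
            $slice: [
              { $sortArray: { input: "$members", sortBy: { date: -1 } } },
              3,
            ],
          },
          initialValue: 0,
          in: { $sum: ["$$this.points", "$$value"] },
        },
      },
    },
  },
];

// In mongosh, against the collection from the question:
// db.users.aggregate(pipeline)
```

Because $addFields keeps the original _id, there is no need to remap it to leaderId as the $project version does.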

huangapple
  • Posted on May 30, 2023, 01:19:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76359228.html