How to get sum of specific number of nested objects in an array sorted by date with aggregation pipeline, MongoDB


Question


I need to get the sum of the "points" of the 3 most recent entries/objects in the members array, based on date. So the total should be 3000 (Person B 500, Person C 1000, and Person D 1500).

How do I do that with my current aggregation pipeline? Do I use slice or something?

Here's the document:

    [{
      "_id": { "$oid": "00001" },
      "leaderId": 001,
      "leader": "Josh",
      "members": [
        {
          "name": "Person A",
          "points": 500,
          "date": "2023-05-30T18:00:00.000+00:00"
        },
        {
          "name": "Person B",
          "points": 500,
          "date": "2023-05-30T19:10:00.000+00:00"
        },
        {
          "name": "Person C",
          "points": 1000,
          "date": "2023-05-30T19:20:00.000+00:00"
        },
        {
          "name": "Person D",
          "points": 1500,
          "date": "2023-05-30T19:30:00.000+00:00"
        }
      ]
    }]

Here's my aggregation pipeline. It returns 3500 because it sums the points of all objects in the members array:

    db.users.aggregate([
      { $match: { leader: "Josh" } },
      { $unwind: "$members" },
      { $sort: { "members.date": -1 } },
      { $group: { _id: "$leaderId", latestThreePoints: { $sum: "$members.points" } } }
    ])

Answer 1

Score: 1

Starting from your current pipeline, you only need an extra $limit stage after the $sort stage:

    // Replace "collection" with your collection name ("users" in the question).
    db.collection.aggregate([
      {
        $match: {
          leader: "Josh"
        }
      },
      {
        // One output document per element of the members array
        $unwind: "$members"
      },
      {
        // Newest member entries first
        $sort: {
          "members.date": -1
        }
      },
      {
        // Keep only the three newest entries
        $limit: 3
      },
      {
        $group: {
          _id: "$leaderId",
          latestThreePoints: {
            $sum: "$members.points"
          }
        }
      }
    ])
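
To make the effect of the extra stage concrete, here is a minimal plain JavaScript sketch that mimics the stages in memory, with no MongoDB involved. The function name latestThreeViaLimit is hypothetical, the sample data copies the question's document, and leaderId is written as 1 so the literal is valid everywhere.

```javascript
// In-memory imitation of $match -> $unwind -> $sort -> $limit -> $group.
// Assumption: dates are ISO-8601 strings, as in the sample document.
const users = [
  {
    leaderId: 1,
    leader: "Josh",
    members: [
      { name: "Person A", points: 500, date: "2023-05-30T18:00:00.000+00:00" },
      { name: "Person B", points: 500, date: "2023-05-30T19:10:00.000+00:00" },
      { name: "Person C", points: 1000, date: "2023-05-30T19:20:00.000+00:00" },
      { name: "Person D", points: 1500, date: "2023-05-30T19:30:00.000+00:00" },
    ],
  },
];

function latestThreeViaLimit(docs, leaderName) {
  const unwound = docs
    .filter((doc) => doc.leader === leaderName) // $match
    .flatMap((doc) =>
      // $unwind: one row per member, keeping the parent's leaderId
      doc.members.map((member) => ({ leaderId: doc.leaderId, members: member }))
    );

  const newestThree = unwound
    .sort((a, b) => Date.parse(b.members.date) - Date.parse(a.members.date)) // $sort: -1
    .slice(0, 3); // $limit: 3

  // $group by leaderId with $sum of members.points
  const totals = new Map();
  for (const row of newestThree) {
    totals.set(row.leaderId, (totals.get(row.leaderId) ?? 0) + row.members.points);
  }
  return [...totals].map(([id, total]) => ({ _id: id, latestThreePoints: total }));
}

// For the sample data: one group for leaderId 1 with latestThreePoints 3000.
console.log(latestThreeViaLimit(users, "Josh"));
```

Note that $limit applies to the whole stream of unwound documents, so this approach gives a per-leader result only when $match returns a single document. If several documents match, the three newest members are taken across all of them.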



Answer 2

Score: 1

I agree with @ray's answer that you can achieve your desired result from the starting pipeline by adding a $limit stage.

That said, it seems as if the _id field is related to the leaderId field, presumably with one document per leader. If so, the $unwind -> processing -> $group on leaderId approach is an antipattern, because it unnecessarily introduces a blocking stage.

Instead, consider processing the documents in place. You mentioned $slice, which seems like the appropriate solution to me. If we know that the members array is always in ascending date order, then the following stage should compute what you want:

    {
      "$project": {
        _id: "$leaderId",
        "latestThreePoints": {
          "$reduce": {
            // The last three elements are the newest when the array is ascending by date
            "input": {
              "$slice": [
                "$members",
                -3
              ]
            },
            "initialValue": 0,
            // Add the current element's points to the running total
            "in": {
              $sum: [
                "$$this.points",
                "$$value"
              ]
            }
          }
        }
      }
    }
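
As a rough illustration of what the $slice plus $reduce expression computes, here is the same logic in plain JavaScript. It is only a sketch: the function name sumLastThree is hypothetical, and it assumes, like the stage above, that members is already in ascending date order.

```javascript
// Mirrors { $slice: ["$members", -3] } fed into $reduce with initialValue 0.
// Assumption: the array is already sorted oldest to newest.
function sumLastThree(members) {
  return members
    .slice(-3) // last three elements, or the whole array if it is shorter
    .reduce((value, item) => value + item.points, 0); // running total of points
}

const members = [
  { name: "Person A", points: 500, date: "2023-05-30T18:00:00.000+00:00" },
  { name: "Person B", points: 500, date: "2023-05-30T19:10:00.000+00:00" },
  { name: "Person C", points: 1000, date: "2023-05-30T19:20:00.000+00:00" },
  { name: "Person D", points: 1500, date: "2023-05-30T19:30:00.000+00:00" },
];

console.log(sumLastThree(members)); // 3000 (500 + 1000 + 1500)
```

Because each document is handled on its own, nothing has to be unwound and regrouped, which is the point of processing in place.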


If the array is not always sorted, but you are on version 6.0 or higher, then you can use the $sortArray operator. The stage would look as follows:

    {
      "$project": {
        _id: "$leaderId",
        "latestThreePoints": {
          "$reduce": {
            "input": {
              "$slice": [
                {
                  // Sort newest first, so the first three elements are the latest
                  $sortArray: {
                    input: "$members",
                    sortBy: {
                      date: -1
                    }
                  }
                },
                3
              ]
            },
            "initialValue": 0,
            // Add the current element's points to the running total
            "in": {
              $sum: [
                "$$this.points",
                "$$value"
              ]
            }
          }
        }
      }
    }
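
The unsorted case can be sketched the same way in plain JavaScript: sort a copy newest first, keep the first three, then sum. The function name sumLatestThree is hypothetical, and the sketch assumes the dates are ISO-8601 strings that Date.parse understands.

```javascript
// Mirrors $sortArray (date: -1) -> $slice (first 3) -> $reduce (sum of points).
function sumLatestThree(members) {
  return [...members] // copy, so the caller's array is not reordered
    .sort((a, b) => Date.parse(b.date) - Date.parse(a.date)) // newest first
    .slice(0, 3) // the three newest entries
    .reduce((value, item) => value + item.points, 0);
}

// Same members as in the question, deliberately out of order.
const shuffledMembers = [
  { name: "Person C", points: 1000, date: "2023-05-30T19:20:00.000+00:00" },
  { name: "Person A", points: 500, date: "2023-05-30T18:00:00.000+00:00" },
  { name: "Person D", points: 1500, date: "2023-05-30T19:30:00.000+00:00" },
  { name: "Person B", points: 500, date: "2023-05-30T19:10:00.000+00:00" },
];

console.log(sumLatestThree(shuffledMembers)); // 3000
```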


Also of note, you can use $addFields instead of $project if you would like to add this new field to the existing fields in the document as opposed to removing all of the others.
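
For reference, a sketch of the $addFields variant, written as a JavaScript object that could be passed to aggregate in mongosh. The variable name addLatestThreePoints is hypothetical, the collection name users comes from the question, and the sketch assumes the members array is in ascending date order, as in the first stage above.

```javascript
// Same expression as the first $project stage, but under $addFields,
// so leader, leaderId, members and _id are all kept in the output.
const addLatestThreePoints = {
  $addFields: {
    latestThreePoints: {
      $reduce: {
        input: { $slice: ["$members", -3] },
        initialValue: 0,
        in: { $sum: ["$$this.points", "$$value"] },
      },
    },
  },
};

// In mongosh, assuming the question's collection:
// db.users.aggregate([{ $match: { leader: "Josh" } }, addLatestThreePoints])
console.log(JSON.stringify(addLatestThreePoints, null, 2));
```

Unlike the $project version, this stage does not overwrite _id with leaderId.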


huangapple
  • Posted on May 30, 2023 at 01:19:04
  • If you repost this article, please keep the link: https://go.coder-hub.com/76359228.html