MongoDB - How to group documents and conditionally push each item to array
Question
I have a collection of phrases with the following structure:
- phrase - string
- phraseLength (the length of the phrase string)
- author - string

I would like to group phrases by author, with the constraint that the total length of each author's phrases must not exceed 200 characters.
In other words, I am trying to write an aggregation query that will:
- loop through each phrase
- push the phrase to its author's phrases array only if that author's running total phrase length stays within 200 characters

For example, given the following documents (the phraseLength values are deliberately not the real lengths, to make the example easier to follow):
[
    {
        phrase: 'This is phrase 1 of author 1',
        phraseLength: 50,
        author: 'Author 1',
    },
    {
        phrase: 'This is phrase 1 of author 1',
        phraseLength: 150,
        author: 'Author 1',
    },
    {
        phrase: 'This is phrase 1 of author 1',
        phraseLength: 10,
        author: 'Author 1',
    },
    {
        phrase: 'This is phrase 1 of author 2',
        phraseLength: 20,
        author: 'Author 2',
    },
    {
        phrase: 'This is phrase 2  of author 2',
        phraseLength: 180,
        author: 'Author 2',
    },
    {
        phrase: 'This is phrase 3  of author 2',
        phraseLength: 50,
        author: 'Author 2',
    },
]
This is the desired output:
[
    {
        _id: 'Author 1',
        phrases: [
            {
                phrase: 'This is phrase 1 of author 1',
                phraseLength: 50,
                author: 'Author 1',
            },
            {
                phrase: 'This is phrase 1 of author 1',
                phraseLength: 150,
                author: 'Author 1',
            },
        ]
    },
    {
        _id: 'Author 2',
        phrases: [
            {
                phrase: 'This is phrase 1 of author 2',
                phraseLength: 20,
                author: 'Author 2',
            },
            {
                phrase: 'This is phrase 2  of author 2',
                phraseLength: 180,
                author: 'Author 2',
            },
        ]
    },
]
I know how to group all phrases by author like this:
db.phrases.aggregate([
    {
        $group: {
            _id: '$author',
            phrases: {
                $push: { phrase: '$phrase', phraseLength: '$phraseLength', author: '$author' }
            }
        }
    },
])
I tried the following query to get the desired output, but it didn't work:
db.phrases.aggregate([
    {
        $group: {
            _id: "$author",
            runningTotalPhraseLength: { $sum: '$phraseLength' },
            $cond: {
                if: {
                    $lte: [{ $size: '$runningTotalPhraseLength' }, 200]
                },
                then: {
                    phrases: {
                        $push: { phrase: '$phrase', phraseLength: '$phraseLength', author: '$author' }
                    }
                },
                else: {},
            },
        }
    },
])
Any help is very much appreciated! 🙂
Answer 1
Score: 1
I don't think the conditional, cumulative push logic can be expressed in the $group stage alone.
Instead, follow the $group stage with a $set stage that uses the $reduce operator.
1. $group - Group the documents by author and push each phrase into the phrases array.
2. $set - Create a temp field.
  2.1. $reduce - Iterate over each element in the phrases array and return a new object.
    2.1.1. input - The phrases array.
    2.1.2. initialValue - An object with length and phrases fields.
    2.1.3. in -
      2.1.3.1. $cond - Check whether the accumulated $$value.length plus the current element's phraseLength is less than or equal to 200. If so, return an object whose length is $$value.length + $$this.phraseLength and whose phrases is $$value.phrases with the current phrase object added. Otherwise, return $$value unchanged.
3. $set - Replace the phrases array with temp.phrases.
4. $unset - Remove the temp field.
db.phrases.aggregate([
  {
    $group: {
      _id: "$author",
      phrases: {
        $push: {
          phrase: "$phrase",
          phraseLength: "$phraseLength",
          author: "$author"
        }
      }
    }
  },
  {
    $set: {
      temp: {
        $reduce: {
          input: "$phrases",
          initialValue: {
            length: 0,
            phrases: []
          },
          in: {
            $cond: {
              if: {
                $lte: [
                  {
                    $sum: [
                      "$$this.phraseLength",
                      "$$value.length"
                    ]
                  },
                  200
                ]
              },
              then: {
                length: {
                  $sum: [
                    "$$this.phraseLength",
                    "$$value.length"
                  ]
                },
                phrases: {
                  $concatArrays: [
                    "$$value.phrases",
                    [
                      "$$this"
                    ]
                  ]
                }
              },
              else: "$$value"
            }
          }
        }
      }
    }
  },
  {
    $set: {
      phrases: "$temp.phrases"
    }
  },
  {
    $unset: "temp"
  }
])
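To make the accumulator logic easier to follow, here is a minimal sketch of the same idea in plain JavaScript, outside MongoDB. The function name capPhrases and its limit parameter are made up for illustration, and the sample data is Author 1 from the question. The sketch appends each kept phrase so that the input order is preserved.

```javascript
// Plain-JavaScript model of the $reduce logic; this is not MongoDB code.
// capPhrases and limit are hypothetical names used only for this sketch.
function capPhrases(phrases, limit = 200) {
  const result = phrases.reduce(
    (value, current) => {
      // Running total if the current phrase were accepted.
      const total = value.length + current.phraseLength;
      return total <= limit
        ? { length: total, phrases: [...value.phrases, current] } // accept
        : value; // skip this phrase, keep the accumulator unchanged
    },
    { length: 0, phrases: [] } // mirrors initialValue in the pipeline
  );
  return result.phrases;
}

// Author 1's documents from the question.
const author1 = [
  { phrase: 'This is phrase 1 of author 1', phraseLength: 50, author: 'Author 1' },
  { phrase: 'This is phrase 1 of author 1', phraseLength: 150, author: 'Author 1' },
  { phrase: 'This is phrase 1 of author 1', phraseLength: 10, author: 'Author 1' },
];

const kept = capPhrases(author1);
console.log(kept.map((p) => p.phraseLength)); // [ 50, 150 ]
```

The third phrase is dropped because 50 + 150 + 10 would exceed 200, which matches the desired output for Author 1.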
Answer 2
Score: 1
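Use $setWindowFields (available since MongoDB 5.0) to compute the cumulative length per author, then group and push each phrase only while the cumulative length is within 200. Note that sortBy: {null: 1} sorts on a field named null, which the sample documents do not have, so replace it with a real field if the order of phrases matters: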
db.phrases.aggregate([
  {
    $setWindowFields: {
      partitionBy: "$author",
      sortBy: {null: 1},
      output: {
        "cumuLength": {
          $sum: "$phraseLength",
          window: {
            documents: ["unbounded","current"]
          }
        }
      }
    }
  },
  {
    $group: {
      _id: "$author",
      phrases: {
        $push: {
          $cond: [
            {$lte: ["$cumuLength",200]},
            {
              phrase: "$phrase",
              phraseLength: "$phraseLength",
              author: "$author"
            },
            "$$REMOVE"
          ]
        }
      }
    }
  }
])
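One behaviour of this approach is worth spelling out: the window sum counts every phrase, including ones that end up excluded, so once an author's cumulative length passes 200 no later phrase is kept. A running total that only counts kept phrases (as a $reduce-based accumulator does) can still admit a later, shorter phrase. For the question's sample data both give the same result. The plain-JavaScript sketch below illustrates the difference; the function names and the lengths 150, 100, 30 are hypothetical and chosen only to show the divergence.

```javascript
// Plain-JavaScript comparison of the two cut-off rules; not MongoDB code.
// Function names and sample lengths are hypothetical.

// Window-style rule: the cumulative sum includes every phrase seen so far.
function keepByCumulativeSum(lengths, limit = 200) {
  let cumulative = 0;
  return lengths.filter((len) => {
    cumulative += len;
    return cumulative <= limit;
  });
}

// Accumulator-style rule: only phrases that were kept count toward the total.
function keepByKeptTotal(lengths, limit = 200) {
  let keptTotal = 0;
  return lengths.filter((len) => {
    if (keptTotal + len > limit) return false; // skip, total unchanged
    keptTotal += len;
    return true;
  });
}

const lengths = [150, 100, 30];
console.log(keepByCumulativeSum(lengths)); // [ 150 ]
console.log(keepByKeptTotal(lengths));     // [ 150, 30 ]
```

Which rule is wanted depends on whether a phrase that is too long should block everything after it or simply be skipped.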