Golang MongoDB (mgo) aggregation with nested arrays
Question
I have MongoDB data of the following form:
{"_id":"53eb9a5673a57578a10074ec","data":{"statistics":{"gsm":[{"type":"Attacks","value":{"team1":66,"team2":67}},{"type":"Corners","value":{"team1":8,"team2":5}},{"type":"Dangerous attacks","value":{"team1":46,"team2":49}},{"type":"Fouls","value":{"team1":9,"team2":14}},{"type":"Free kicks","value":{"team1":18,"team2":10}},{"type":"Goals","value":{"team1":2,"team2":1}},{"type":"Goal kicks","value":{"team1":10,"team2":11}},{"type":"Offsides","value":{"team1":1,"team2":4}},{"type":"Posession","value":{"team1":55,"team2":45}},{"type":"Shots blocked","value":{"team1":4,"team2":1}},{"type":"Shots off target","value":{"team1":7,"team2":5}}]}}}
I want to get the average of data.statistics.gsm.value.team1 where data.statistics.gsm.type == "Attacks", using the Golang MongoDB driver mgo. This is the code I have tried so far (with either one or both of the group statements below):
pipeline := []bson.M{
    bson.M{"$match": bson.M{"kick_off.utc.gsm.date_time": bson.M{"$gt": start, "$lt": end}}},
    bson.M{
        "$group": bson.M{
            "_id":         "$gsm_id",
            "event_array": bson.M{"$first": "$data.statistics.gsm"},
        },
    },
    bson.M{
        "$group": bson.M{
            "_id":        "$type",
            "avg_attack": bson.M{"$avg": "$data.statistics.gsm.value.team1"},
        },
    },
}
With only the first group statement, I get back the result below, but adding the second group statement doesn't give me the average.
[{"_id":1953009,"event_array":[{"type":"Attacks","value":{"team1":48,"team2":12}},{"type":"Corners","value":{"team1":12,"team2":0}},{"type":"Dangerous attacks","value":{"team1":46,"team2":7}},{"type":"Fouls","value":{"team1":10,"team2":3}},{"type":"Free kicks","value":{"team1":5,"team2":12}},{"type":"Goals","value":{"team1":8,"team2":0}}]}]
Answer 1
Score: 3
I always find it helpful to get a pretty-printed view of the JSON. Here is what you say you get from the first group statement:
[
  {
    "_id": 1953009,
    "event_array": [
      {
        "type": "Attacks",
        "value": {
          "team1": 48,
          "team2": 12
        }
      },
      {
        "type": "Corners",
        "value": {
          "team1": 12,
          "team2": 0
        }
      },
      ...
Now the second group statement you use:
"$group": bson.M{
    "_id":        "$type",
    "avg_attack": bson.M{"$avg": "$data.statistics.gsm.value.team1"},
}
You're trying to take the average of data.statistics.gsm.value.team1 on the results of the first group statement, but that field doesn't exist there: the first group's output only contains _id and event_array. So of course it won't give you an average.
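Instead of the approach you're using, I'd suggest looking into the $unwind operator to break the array down into a set of documents, one per element. Then you should be able to group them in the way you're trying to here, with something like {$avg: "$event_array.value.team1"} (after unwinding event_array, each element still sits under that field name).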
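To make the effect of $unwind followed by a grouped average concrete, here is a minimal in-memory sketch in plain Go. It does not talk to MongoDB; the types Stat, Match and Unwound and the functions unwind and avgTeam1ByType are hypothetical names invented for illustration, and the second match's _id (1953010) is made up.

```go
package main

import "fmt"

// Stat mirrors one element of the event_array shown above.
type Stat struct {
	Type  string
	Team1 float64
	Team2 float64
}

// Match mirrors one document produced by the first $group stage.
type Match struct {
	ID         int
	EventArray []Stat
}

// Unwound is the shape $unwind emits: one record per array element,
// with the parent _id repeated on each record.
type Unwound struct {
	ID    int
	Event Stat
}

// unwind flattens every match's array into one record per element.
func unwind(matches []Match) []Unwound {
	var out []Unwound
	for _, m := range matches {
		for _, s := range m.EventArray {
			out = append(out, Unwound{ID: m.ID, Event: s})
		}
	}
	return out
}

// avgTeam1ByType groups the unwound records by type and averages team1,
// which is roughly what a $group with $avg does after the $unwind.
func avgTeam1ByType(rows []Unwound) map[string]float64 {
	sum := map[string]float64{}
	count := map[string]float64{}
	for _, r := range rows {
		sum[r.Event.Type] += r.Event.Team1
		count[r.Event.Type]++
	}
	avg := make(map[string]float64, len(sum))
	for t, s := range sum {
		avg[t] = s / count[t]
	}
	return avg
}

func main() {
	matches := []Match{
		{ID: 1953009, EventArray: []Stat{{"Attacks", 48, 12}, {"Corners", 12, 0}}},
		{ID: 1953010, EventArray: []Stat{{"Attacks", 66, 67}, {"Corners", 8, 5}}}, // hypothetical second match
	}
	rows := unwind(matches)
	fmt.Println(len(rows))                        // 4 records: two per match
	fmt.Println(avgTeam1ByType(rows)["Attacks"]) // (48 + 66) / 2 = 57
}
```

The point of the sketch is only that averaging becomes straightforward once each array element is its own record; in the real pipeline the server does this work.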
So the overall pipeline that produces the aggregation would be: $match -> $group1 -> $unwind -> $group2. Just keep in mind that each stage of the pipeline operates on the data produced by the previous stage, which is why your data.statistics.gsm.value.team1 part was incorrect.
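As a rough sketch of how that flow might be written out in Go, the stages could look like the following. Several things here are assumptions rather than part of the original answer: the type M is a local stand-in for mgo's bson.M so the example compiles without the driver, buildPipeline is a hypothetical helper, the date bounds are placeholders, and the extra $match after $unwind is added only to restrict the average to "Attacks" as the question asks. Because the first group stores the array under event_array, the later stages reference paths under that name.

```go
package main

import "fmt"

// M stands in for mgo's bson.M (a map[string]interface{}) so this sketch
// compiles without the driver; with mgo you would use bson.M directly.
type M map[string]interface{}

// buildPipeline returns the stages $match -> $group -> $unwind -> $match -> $group.
// start and end are whatever values the date_time field should be compared against.
func buildPipeline(start, end interface{}) []M {
	return []M{
		// Keep only matches inside the requested kick-off window.
		{"$match": M{"kick_off.utc.gsm.date_time": M{"$gt": start, "$lt": end}}},
		// One document per gsm_id, carrying its statistics array.
		{"$group": M{
			"_id":         "$gsm_id",
			"event_array": M{"$first": "$data.statistics.gsm"},
		}},
		// One document per array element.
		{"$unwind": "$event_array"},
		// Keep only the "Attacks" entries (an addition to the flow described above).
		{"$match": M{"event_array.type": "Attacks"}},
		// Average team1 over the remaining entries.
		{"$group": M{
			"_id":        "$event_array.type",
			"avg_attack": M{"$avg": "$event_array.value.team1"},
		}},
	}
}

func main() {
	p := buildPipeline("2014-08-01", "2014-08-31") // placeholder bounds
	fmt.Println(len(p))
	fmt.Println(p[2]["$unwind"])
}
```

With mgo itself, the same stages written as bson.M values would typically be run with something like collection.Pipe(pipeline).All(&results), where collection is your *mgo.Collection and results is a slice to decode into.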