Golang MongoDB (mgo) aggregation with nested arrays

Question

I have MongoDB data of the following form:

{"_id":"53eb9a5673a57578a10074ec","data":{"statistics":{"gsm":[{"type":"Attacks","value":{"team1":66,"team2":67}},{"type":"Corners","value":{"team1":8,"team2":5}},{"type":"Dangerous attacks","value":{"team1":46,"team2":49}},{"type":"Fouls","value":{"team1":9,"team2":14}},{"type":"Free kicks","value":{"team1":18,"team2":10}},{"type":"Goals","value":{"team1":2,"team2":1}},{"type":"Goal kicks","value":{"team1":10,"team2":11}},{"type":"Offsides","value":{"team1":1,"team2":4}},{"type":"Posession","value":{"team1":55,"team2":45}},{"type":"Shots blocked","value":{"team1":4,"team2":1}},{"type":"Shots off target","value":{"team1":7,"team2":5}}]}}}

I want to get the average of data.statistics.gsm.value.team1 where data.statistics.gsm.type == "Attacks", using the Golang MongoDB driver mgo. This is the code I have tried so far (with either one or both of the group statements below):

pipeline := []bson.M{
	bson.M{"$match": bson.M{"kick_off.utc.gsm.date_time": bson.M{"$gt": start, "$lt": end}}},
	bson.M{
		"$group": bson.M{
			"_id":         "$gsm_id",
			"event_array": bson.M{"$first": "$data.statistics.gsm"},
		},
	},
	bson.M{
		"$group": bson.M{
			"_id":        "$type",
			"avg_attack": bson.M{"$avg": "$data.statistics.gsm.value.team1"},
		},
	},
}

With only the first group statement, I get back the result below, but adding the second group statement doesn't give me the average.

[{"_id":1953009,"event_array":[{"type":"Attacks","value":{"team1":48,"team2":12}},{"type":"Corners","value":{"team1":12,"team2":0}},{"type":"Dangerous attacks","value":{"team1":46,"team2":7}},{"type":"Fouls","value":{"team1":10,"team2":3}},{"type":"Free kicks","value":{"team1":5,"team2":12}},{"type":"Goals","value":{"team1":8,"team2":0}}]}]

Answer 1

Score: 3

I always find it helpful to pretty-print the JSON. Here is what you say you get from the first group statement:

[
  {
    "_id": 1953009,
    "event_array": [
      {
        "type": "Attacks",
        "value": {
          "team1": 48,
          "team2": 12
        }
      },
      {
        "type": "Corners",
        "value": {
          "team1": 12,
          "team2": 0
        }
      },
      ...

Now the second group statement you use:

"$group": bson.M{
     "_id":     "$type",
     "avg_attack" : bson.M{"$avg": "$data.statistics.gsm.value.team1"}
}

You're trying to average data.statistics.gsm.value.team1 over the results of the first group statement, but that field doesn't exist there: the first group emits only _id and event_array, so the second group has nothing to average.

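Instead, I'd suggest using the $unwind operator to break the array down into a set of documents, one per array element. Then you should be able to group them the way you're trying to here. Note that after unwinding event_array each element still sits under that field name, so the average would be written as {$avg: "$event_array.value.team1"} rather than with the original data.statistics.gsm path.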
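To make the effect of $unwind followed by a $group with $avg concrete, here is a small in-memory sketch in plain Go. It does not talk to MongoDB at all; the Stat and Match types and the unwind and avgTeam1ByType helpers are hypothetical names made up for illustration. The second match (ID 1) is invented, with values borrowed from the sample document in the question. One simplification: the real $unwind keeps the parent document's other fields (such as _id) on every emitted document, while this sketch drops them.

```go
package main

import "fmt"

// Stat mirrors one element of the event_array built by the first $group.
type Stat struct {
	Type  string
	Team1 float64
	Team2 float64
}

// Match mirrors one document emitted by the first $group stage.
type Match struct {
	ID         int
	EventArray []Stat
}

// unwind imitates $unwind: it emits one record per array element.
// (Simplified: the parent ID is not carried along.)
func unwind(matches []Match) []Stat {
	var out []Stat
	for _, m := range matches {
		out = append(out, m.EventArray...)
	}
	return out
}

// avgTeam1ByType imitates a $group keyed on the type with an $avg over team1.
func avgTeam1ByType(stats []Stat) map[string]float64 {
	sum := map[string]float64{}
	count := map[string]int{}
	for _, s := range stats {
		sum[s.Type] += s.Team1
		count[s.Type]++
	}
	avg := make(map[string]float64, len(sum))
	for t, total := range sum {
		avg[t] = total / float64(count[t])
	}
	return avg
}

func main() {
	matches := []Match{
		{ID: 1953009, EventArray: []Stat{{"Attacks", 48, 12}, {"Corners", 12, 0}}},
		{ID: 1, EventArray: []Stat{{"Attacks", 66, 67}, {"Corners", 8, 5}}}, // hypothetical second match
	}
	avg := avgTeam1ByType(unwind(matches))
	fmt.Println(avg["Attacks"], avg["Corners"]) // prints "57 10"
}
```

The point of the sketch is that the averaging step only ever sees what the unwinding step produced, which is the same rule the aggregation pipeline follows.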

So the overall pipeline that produces the aggregation would be: $match -> $group1 -> $unwind -> $group2. Just keep in mind that each stage of the pipeline operates on the data produced by the previous stage, which is why your data.statistics.gsm.value.team1 part was incorrect.
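As a rough sketch of what those four stages might look like in Go, the block below builds the pipeline and prints it as JSON. To keep it compilable without the driver, it uses a local type M as a stand-in for mgo's bson.M (both are maps from string to interface{}); buildPipeline is a hypothetical helper name, and the dates in main are made up. The types of start and end are not shown in the question, so they are accepted as interface{} here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// M is a stand-in for bson.M so this sketch builds without the mgo driver.
type M map[string]interface{}

// buildPipeline returns the stages $match -> $group -> $unwind -> $group.
// start and end are interface{} because the question does not show their types.
func buildPipeline(start, end interface{}) []M {
	return []M{
		// Stage 1: restrict to the time window, as in the question.
		{"$match": M{"kick_off.utc.gsm.date_time": M{"$gt": start, "$lt": end}}},
		// Stage 2: one document per gsm_id, keeping its statistics array.
		{"$group": M{
			"_id":         "$gsm_id",
			"event_array": M{"$first": "$data.statistics.gsm"},
		}},
		// Stage 3: one document per element of event_array.
		{"$unwind": "$event_array"},
		// Stage 4: average team1 per statistic type. The paths refer to the
		// output of stage 3, not to the original documents.
		{"$group": M{
			"_id":        "$event_array.type",
			"avg_attack": M{"$avg": "$event_array.value.team1"},
		}},
	}
}

func main() {
	// Hypothetical time window, for illustration only.
	start := time.Date(2014, 8, 1, 0, 0, 0, 0, time.UTC)
	end := time.Date(2014, 9, 1, 0, 0, 0, 0, time.UTC)

	out, err := json.MarshalIndent(buildPipeline(start, end), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With the real driver, the same stages would be written with bson.M and run through something like collection.Pipe(pipeline).All(&results). Since the last stage groups by type, the "Attacks" average is one row of the result; if only that row is wanted, a $match on event_array.type placed right after the $unwind stage should narrow it down.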

huangapple
  • Published on November 14, 2014, 00:57:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/26914225.html