在MongoDB聚合中检索每个日期的项目数量。

huangapple go评论76阅读模式
英文:

Retrieving item count per date on MongoDB aggregate

问题

我有一个包含事件的集合。每个事件都有时间戳,精确到毫秒。我想按天对事件进行分组计数。

例如:

我有以下数据:

KEY       VALUE

_id       111222333444555ddd666fff
time      2023-04-23T15:35:19.631Z
type      pizza-event

_id       111222333444555ddd666fff
time      2023-04-23T01:41:20.631Z
type      TV-event

_id       111222333444555ddd666fff
time      2023-04-22T05:00:05.631Z
type      some-event

我希望得到以下结果:

KEY        VALUE

date       04-22-2023
count      1.0

date       04-23-2023
count      2.0

最终目标是在一个 Golang 项目中使用这个查询。

到目前为止,我有以下查询:

[
        {
            "$match" : {
                "$and" : [
                    {
                        "type" : "foo.bar.event"
                    },
                    {
                        "time" : {
                            "$gte" : ISODate("2023-04-23T00:00:00.000+0000")
                        }
                    },
                    {
                        "time" : {
                            "$lte" : ISODate("2023-04-25T00:00:00.000+0000")
                        }
                    }
                ]
            }
        }, 
        {
            "$group" : {
                "_id" : {
                    "$dateToString" : {
                        "format" : "%m-%d-%Y",
                        "date" : "$time"
                    }
                },
                "count" : {
                    "$sum" : 1.0
                }
            }
        }
    ]

这个查询返回的结果是:

KEY            VALUE

_id            04-24-2023
count          476.0

_id            04-23-2023
count          28.0

这个查询本来是可以工作的,但是当我在 Go 项目中编写这个查询时,"$dateToString" 下面出现了一个红色波浪线,提示 "Invalid field name",而且我希望日期的键名是 "time" 而不是 "_id"。当我在 group 阶段做以下更改时:

{
    _id: null,
    "date": {"$dateToString": { "format": "%m-%d-%Y", "date": "$time"}},
    "count": {"$sum": 1}
}

我得到了一个未知的 group 操作符 "$dateToString" 的错误。所以我想先创建分组,然后再添加一个 project 阶段来使用 "$dateToString",但是现在 group 阶段返回的是每毫秒的分组,这违背了分组的目的。

我意识到我在讨论两个不同的问题。然而,虽然帮助解决所有问题会很好,但这个问题特别是关于修复 MongoDB 查询的。如果需要,我可以在另一个线程中讨论 Golang 代码编写的问题。如果有需要,请告诉我如何更清楚地表达。

英文:

I have a collection containing events. Each event is timestamped down to the millisecond. I would like to group the events in a count per day.
e.g.:

I have

KEY       VALUE

_id       111222333444555ddd666fff
time      2023-04-23T15:35:19.631Z
type      pizza-event

_id       111222333444555ddd666fff
time      2023-04-23T01:41:20.631Z
type      TV-event

_id       111222333444555ddd666fff
time      2023-04-22T05:00:05.631Z
type      some-event

I would like

KEY        VALUE

date       04-22-2023
count      1.0

date       04-23-2023
count      2.0

The ultimate goal is to use the query in a Golang project.

So far I have

[
        {
            "$match" : {
                "$and" : [
                    {
                        "type" : "foo.bar.event"
                    },
                    {
                        "time" : {
                            "$gte" : ISODate("2023-04-23T00:00:00.000+0000")
                        }
                    },
                    {
                        "time" : {
                            "$lte" : ISODate("2023-04-25T00:00:00.000+0000")
                        }
                    }
                ]
            }
        }, 
        {
            "$group" : {
                "_id" : {
                    "$dateToString" : {
                        "format" : "%m-%d-%Y",
                        "date" : "$time"
                    }
                },
                "count" : {
                    "$sum" : 1.0
                }
            }
        }
    ]

which returns

KEY            VALUE

_id            04-24-2023
count          476.0

_id            04-23-2023
count          28.0

That could've worked but when I code this query in the go project a red squiggly appears under "$dateToString" with the message "Invalid field name" and ideally I would like the date to have a key of "time" instead of "_id". When I do following change in the group stage:

{
    _id: null,
    "date": {"$dateToString": { "format": "%m-%d-%Y", "date": "$time"}},
    "count": {"$sum": 1}
}

I get an unknown group operator "$dateToString" error. So I thought of creating the groups and then add a project stage for the "$dateToString" but now the group stage returns groups for every millisecond which defeats the point of grouping.

I realize I am discussing 2 different problems. However, while help with everything would be wonderful, this question is specifically about fixing the mongo query. I will return for the Golang coding on an other thread if necessary.
Let me know if I can be clearer.

答案1

得分: 0

首先,如果你有一个应用程序的用户界面来显示查询结果,不需要在查询中格式化输出。这是应用程序用户界面的责任。顺便说一句,如果你有一个应用程序用户界面,考虑使用$dateTrunc而不是$dateToString

无论如何,关于你问题中的要求,下面这个$project阶段应该适合你:

[
  {
    "$group": {
      "_id": {
        "$dateToString": { "date": "$time", "format": "%m-%d-%Y" }
      },
      "count": { "$sum": 1 }
    }
  },
  { "$project": { "_id": 0, "time": "$_id", "count": "$count" } }
]

在MongoDB Shell中的输出:

{ "time" : "02-08-2020", "count" : 2 }
{ "time" : "05-18-2020", "count" : 2 }
{ "time" : "03-20-2021", "count" : 3 }
{ "time" : "01-11-2021", "count" : 1 }

关于在Go项目中使用该查询,这是一个示例:

package main

import (
	"context"
	"fmt"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancel()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		panic(err)
	}

	coll := client.Database("baz").Collection("cakeSales")

	matchStage := bson.D{
		{"$match", bson.D{
			{"$and", []bson.D{
				{{"time", bson.D{
					{"$gte", time.Date(2019, 6, 1, 0, 0, 0, 0, time.UTC)},
				}}},
				{{"time", bson.D{
					{"$lte", time.Date(2021, 2, 1, 0, 0, 0, 0, time.UTC)},
				}}},
			}},
		}},
	}

	groupStage := bson.D{
		{"$group", bson.D{
			{"_id", bson.D{
				{"$dateToString", bson.D{
					{"date", "$time"},
					{"format", "%m-%d-%Y"},
				}},
			}},
			{"count", bson.D{
				{"$sum", 1},
			}},
		}},
	}

	projectStage := bson.D{
		{"$project", bson.D{
			{"_id", 0},
			{"time", "$_id"},
			{"count", "$count"},
		}},
	}

	cursor, err := coll.Aggregate(context.TODO(),
		mongo.Pipeline{matchStage, groupStage, projectStage})
	if err != nil {
		panic(err)
	}

	var results []bson.M
	if err = cursor.All(context.TODO(), &results); err != nil {
		panic(err)
	}
	for _, result := range results {
		fmt.Printf(
			"time: %s count: %v\n",
			result["time"],
			result["count"])
	}
}

希望对你有帮助!

英文:

First of all, if you have an application UI to display the query result, don't bother to format the output in the query. That's the application UI's responsibility. BTW, if you have an application UI, consider using $dateTrunc instead of $dateToString.

Anyway, regarding the requirement in your question, a $project stage like this one should work for you:

[
{
"$group": {
"_id": {
"$dateToString": { "date": "$time", "format": "%m-%d-%Y" }
},
"count": { "$sum": 1 }
}
},
{ "$project": { "_id": 0, "time": "$_id", "count": "$count" } }
]

Output from MongoDB Shell:

{ "time" : "02-08-2020", "count" : 2 }
{ "time" : "05-18-2020", "count" : 2 }
{ "time" : "03-20-2021", "count" : 3 }
{ "time" : "01-11-2021", "count" : 1 }

Regarding using the query in a Go project, here is a demo:

package main

import (
	"context"
	"fmt"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancel()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		panic(err)
	}

	coll := client.Database("baz").Collection("cakeSales")

	matchStage := bson.D{
		{"$match", bson.D{
			{"$and", []bson.D{
				{{"time", bson.D{
					{"$gte", time.Date(2019, 6, 1, 0, 0, 0, 0, time.UTC)},
				}}},
				{{"time", bson.D{
					{"$lte", time.Date(2021, 2, 1, 0, 0, 0, 0, time.UTC)},
				}}},
			}},
		}},
	}

	groupStage := bson.D{
		{"$group", bson.D{
			{"_id", bson.D{
				{"$dateToString", bson.D{
					{"date", "$time"},
					{"format", "%m-%d-%Y"},
				}},
			}},
			{"count", bson.D{
				{"$sum", 1},
			}},
		}},
	}

	projectStage := bson.D{
		{"$project", bson.D{
			{"_id", 0},
			{"time", "$_id"},
			{"count", "$count"},
		}},
	}

	cursor, err := coll.Aggregate(context.TODO(),
		mongo.Pipeline{matchStage, groupStage, projectStage})
	if err != nil {
		panic(err)
	}

	var results []bson.M
	if err = cursor.All(context.TODO(), &results); err != nil {
		panic(err)
	}
	for _, result := range results {
		fmt.Printf(
			"time: %s count: %v\n",
			result["time"],
			result["count"])
	}
}

huangapple
  • 本文由 发表于 2023年4月26日 22:24:50
  • 转载请务必保留本文链接:https://go.coder-hub.com/76111924.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定