英文:
Retrieving item count per date on MongoDB aggregate
问题
我有一个包含事件的集合。每个事件都有时间戳,精确到毫秒。我想按天对事件进行分组计数。
例如:
我有以下数据:
KEY VALUE
_id 111222333444555ddd666fff
time 2023-04-23T15:35:19.631Z
type pizza-event
_id 111222333444555ddd666fff
time 2023-04-23T01:41:20.631Z
type TV-event
_id 111222333444555ddd666fff
time 2023-04-22T05:00:05.631Z
type some-event
我希望得到以下结果:
KEY VALUE
date 04-22-2023
count 1.0
date 04-23-2023
count 2.0
最终目标是在一个 Golang 项目中使用这个查询。
到目前为止,我有以下查询:
[
{
"$match" : {
"$and" : [
{
"type" : "foo.bar.event"
},
{
"time" : {
"$gte" : ISODate("2023-04-23T00:00:00.000+0000")
}
},
{
"time" : {
"$lte" : ISODate("2023-04-25T00:00:00.000+0000")
}
}
]
}
},
{
"$group" : {
"_id" : {
"$dateToString" : {
"format" : "%m-%d-%Y",
"date" : "$time"
}
},
"count" : {
"$sum" : 1.0
}
}
}
]
这个查询返回的结果是:
KEY VALUE
_id 04-24-2023
count 476.0
_id 04-23-2023
count 28.0
这个查询本来是可以工作的,但是当我在 Go 项目中编写这个查询时,"$dateToString" 下面出现了一个红色波浪线,提示 "Invalid field name",而且我希望日期的键名是 "time" 而不是 "_id"。当我在 group 阶段做以下更改时:
{
_id: null,
"date": {"$dateToString": { "format": "%m-%d-%Y", "date": "$time"}},
"count": {"$sum": 1}
}
我得到了一个未知的 group 操作符 "$dateToString" 的错误。所以我想先创建分组,然后再添加一个 project 阶段来使用 "$dateToString",但是现在 group 阶段返回的是每毫秒的分组,这违背了分组的目的。
我意识到我在讨论两个不同的问题。然而,虽然帮助解决所有问题会很好,但这个问题特别是关于修复 MongoDB 查询的。如果需要,我可以在另一个线程中讨论 Golang 代码编写的问题。如果有需要,请告诉我如何更清楚地表达。
英文:
I have a collection containing events. Each event is timestamped down to the millisecond. I would like to group the events in a count per day.
e.g.:
I have
KEY VALUE
_id 111222333444555ddd666fff
time 2023-04-23T15:35:19.631Z
type pizza-event
_id 111222333444555ddd666fff
time 2023-04-23T01:41:20.631Z
type TV-event
_id 111222333444555ddd666fff
time 2023-04-22T05:00:05.631Z
type some-event
I would like
KEY VALUE
date 04-22-2023
count 1.0
date 04-23-2023
count 2.0
The ultimate goal is to use the query in a Golang project.
So far I have
[
{
"$match" : {
"$and" : [
{
"type" : "foo.bar.event"
},
{
"time" : {
"$gte" : ISODate("2023-04-23T00:00:00.000+0000")
}
},
{
"time" : {
"$lte" : ISODate("2023-04-25T00:00:00.000+0000")
}
}
]
}
},
{
"$group" : {
"_id" : {
"$dateToString" : {
"format" : "%m-%d-%Y",
"date" : "$time"
}
},
"count" : {
"$sum" : 1.0
}
}
}
]
which returns
KEY VALUE
_id 04-24-2023
count 476.0
_id 04-23-2023
count 28.0
That could've worked but when I code this query in the go project a red squiggly appears under "$dateToString" with the message "Invalid field name" and ideally I would like the date to have a key of "time" instead of "_id". When I do following change in the group stage:
{
_id: null,
"date": {"$dateToString": { "format": "%m-%d-%Y", "date": "$time"}},
"count": {"$sum": 1}
}
I get an unknown group operator "$dateToString" error. So I thought of creating the groups and then add a project stage for the "$dateToString" but now the group stage returns groups for every millisecond which defeats the point of grouping.
I realize I am discussing 2 different problems. However, while help with everything would be wonderful, this question is specifically about fixing the mongo query. I will return for the Golang coding on an other thread if necessary.
Let me know if I can be clearer.
答案1
得分: 0
首先,如果你有一个应用程序的用户界面来显示查询结果,不需要在查询中格式化输出。这是应用程序用户界面的责任。顺便说一句,如果你有一个应用程序用户界面,考虑使用$dateTrunc
而不是$dateToString
。
无论如何,关于你问题中的要求,下面这个$project
阶段应该适合你:
[
{
"$group": {
"_id": {
"$dateToString": { "date": "$time", "format": "%m-%d-%Y" }
},
"count": { "$sum": 1 }
}
},
{ "$project": { "_id": 0, "time": "$_id", "count": "$count" } }
]
在MongoDB Shell中的输出:
{ "time" : "02-08-2020", "count" : 2 }
{ "time" : "05-18-2020", "count" : 2 }
{ "time" : "03-20-2021", "count" : 3 }
{ "time" : "01-11-2021", "count" : 1 }
关于在Go项目中使用该查询,这是一个示例:
package main
import (
"context"
"fmt"
"time"
"go.mongodb.org/mongo-driver/bson"
"go.mongodb.org/mongo-driver/mongo"
"go.mongodb.org/mongo-driver/mongo/options"
)
func main() {
ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
defer cancel()
client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
if err != nil {
panic(err)
}
coll := client.Database("baz").Collection("cakeSales")
matchStage := bson.D{
{"$match", bson.D{
{"$and", []bson.D{
{{"time", bson.D{
{"$gte", time.Date(2019, 6, 1, 0, 0, 0, 0, time.UTC)},
}}},
{{"time", bson.D{
{"$lte", time.Date(2021, 2, 1, 0, 0, 0, 0, time.UTC)},
}}},
}},
}},
}
groupStage := bson.D{
{"$group", bson.D{
{"_id", bson.D{
{"$dateToString", bson.D{
{"date", "$time"},
{"format", "%m-%d-%Y"},
}},
}},
{"count", bson.D{
{"$sum", 1},
}},
}},
}
projectStage := bson.D{
{"$project", bson.D{
{"_id", 0},
{"time", "$_id"},
{"count", "$count"},
}},
}
cursor, err := coll.Aggregate(context.TODO(),
mongo.Pipeline{matchStage, groupStage, projectStage})
if err != nil {
panic(err)
}
var results []bson.M
if err = cursor.All(context.TODO(), &results); err != nil {
panic(err)
}
for _, result := range results {
fmt.Printf(
"time: %s count: %v\n",
result["time"],
result["count"])
}
}
希望对你有帮助!
英文:
First of all, if you have an application UI to display the query result, don't bother to format the output in the query. That's the application UI's responsibility. BTW, if you have an application UI, consider using $dateTrunc
instead of $dateToString
.
Anyway, regarding the requirement in your question, a $project
stage like this one should work for you:
[
{
"$group": {
"_id": {
"$dateToString": { "date": "$time", "format": "%m-%d-%Y" }
},
"count": { "$sum": 1 }
}
},
{ "$project": { "_id": 0, "time": "$_id", "count": "$count" } }
]
Output from MongoDB Shell:
{ "time" : "02-08-2020", "count" : 2 }
{ "time" : "05-18-2020", "count" : 2 }
{ "time" : "03-20-2021", "count" : 3 }
{ "time" : "01-11-2021", "count" : 1 }
Regarding using the query in a Go project, here is a demo:
package main
import (
"context"
"fmt"
"time"
"go.mongodb.org/mongo-driver/bson"
"go.mongodb.org/mongo-driver/mongo"
"go.mongodb.org/mongo-driver/mongo/options"
)
func main() {
ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
defer cancel()
client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
if err != nil {
panic(err)
}
coll := client.Database("baz").Collection("cakeSales")
matchStage := bson.D{
{"$match", bson.D{
{"$and", []bson.D{
{{"time", bson.D{
{"$gte", time.Date(2019, 6, 1, 0, 0, 0, 0, time.UTC)},
}}},
{{"time", bson.D{
{"$lte", time.Date(2021, 2, 1, 0, 0, 0, 0, time.UTC)},
}}},
}},
}},
}
groupStage := bson.D{
{"$group", bson.D{
{"_id", bson.D{
{"$dateToString", bson.D{
{"date", "$time"},
{"format", "%m-%d-%Y"},
}},
}},
{"count", bson.D{
{"$sum", 1},
}},
}},
}
projectStage := bson.D{
{"$project", bson.D{
{"_id", 0},
{"time", "$_id"},
{"count", "$count"},
}},
}
cursor, err := coll.Aggregate(context.TODO(),
mongo.Pipeline{matchStage, groupStage, projectStage})
if err != nil {
panic(err)
}
var results []bson.M
if err = cursor.All(context.TODO(), &results); err != nil {
panic(err)
}
for _, result := range results {
fmt.Printf(
"time: %s count: %v\n",
result["time"],
result["count"])
}
}
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论