将MongoDB文档基于日期字段使用聚合拆分为多个文档

huangapple go评论59阅读模式
英文:

Divide a MongoDB document into multiple documents based on Date Field using aggregation

问题

我有一个在MongoDB上的文档,其中包含了飞行数据记录。这些数据涵盖了过去4年,混合在下面示例中显示的`flight_records`数组中。我想根据记录的年份将这个文档拆分成单独的文档。我的示例文档如下 -

{
	"flight_number": "AS6312",
	"airlines_code": "AS",
	"flight_records": [{
			"status": "Landed 13:35",
			"origin": "ONT",
			"destination": "SEA",
			"date_of_journey": "2023-12-21",
			"scheduled": {
				"dep": 1671648300,
				"arr": 1671658500
			},
			"real": {
				"dep": 1671649420,
				"arr": 1671658520
			}
		},
		{
			"status": "Canceled",
			"origin": "ONT",
			"destination": "SEA",
			"date_of_journey": "2022-12-20",
			"scheduled": {
				"dep": 1671561900,
				"arr": 1671572100
			},
			"real": {
				"dep": 0,
				"arr": 0
			}
		},
		{
			"status": "Landed 13:09",
			"origin": "ONT",
			"destination": "SEA",
			"date_of_journey": "2021-12-19",
			"scheduled": {
				"dep": 1671475500,
				"arr": 1671485700
			},
			"real": {
				"dep": 1671475903,
				"arr": 1671484184
			}
		}
	]
}

预期输出是三个文档,分别对应2021-2023年。
预期输出文档1 ->

{
	"flight_number": "AS6312",
	"airlines_code": "AS",
	"flight_records": [{
		"status": "Landed 13:35",
		"origin": "ONT",
		"destination": "SEA",
		"date_of_journey": "2023-12-21",
		"scheduled": {
			"dep": 1671648300,
			"arr": 1671658500
		},
		"real": {
			"dep": 1671649420,
			"arr": 1671658520
		}
	}]
}

类似地,预期输出文档2 -> 等等...

每一个这样的文档后来都会根据记录的年份保存在单独的集合中。我尝试使用$match和$find,但无法获得正确的输出。
英文:

I have a document on MongoDB that contains flight data records. This data goes back 4 years , and is mixed up in the flight_records array shown in the sample below. I want to split this document into separate documents based on the Year of the record. My Sample Document is as follows -

{
"flight_number": "AS6312",
"airlines_code": "AS",
"flight_records": [{
"status": "Landed 13:35",
"origin": "ONT",
"destination": "SEA",
"date_of_journey": "2023-12-21",
"scheduled": {
"dep": 1671648300,
"arr": 1671658500
},
"real": {
"dep": 1671649420,
"arr": 1671658520
}
},
{
"status": "Canceled",
"origin": "ONT",
"destination": "SEA",
"date_of_journey": "2022-12-20",
"scheduled": {
"dep": 1671561900,
"arr": 1671572100
},
"real": {
"dep": 0,
"arr": 0
}
},
{
"status": "Landed 13:09",
"origin": "ONT",
"destination": "SEA",
"date_of_journey": "2021-12-19",
"scheduled": {
"dep": 1671475500,
"arr": 1671485700
},
"real": {
"dep": 1671475903,
"arr": 1671484184
}
}
]

}

And the expected output is three documents, each for 2021-2023.
The Expected output Document 1 ->

{
"flight_number": "AS6312",
"airlines_code": "AS",
"flight_records": [
{
"status": "Landed 13:35",
"origin": "ONT",
"destination": "SEA",
"date_of_journey": "2023-12-21",
"scheduled": {
"dep": 1671648300,
"arr": 1671658500
},
"real": {
"dep": 1671649420,
"arr": 1671658520
}
}]

}

Similarly , Expected output Document 2-> and so on...

{
"flight_number": "AS6312",
"airlines_code": "AS",
"flight_records": [
{
"status": "Canceled",
"origin": "ONT",
"destination": "SEA",
"date_of_journey": "2022-12-20",
"scheduled": {
"dep": 1671561900,
"arr": 1671572100
},
"real": {
"dep": 0,
"arr": 0
}
}]}

Each of these documents will be saved into separate collections later based on the year of the record. I tried using the $match and the $find but couldn't get the correct output.

答案1

得分: 1

以下是我即兴提出的解决方案,希望其他人会发现它有用。

db.collection.aggregate(
{
    "$unwind": "$flight_records"
}, 
{
    "$group": {
        _id: {
            $year: {
                $toDate: "$flight_records.date_of_journey"
            }
        },
        recordsForYear: {
            $push: "$flight_records"
        }
    }
})

以下是结果:

{
    "_id": 2021,
    "recordsForYear": [
        {
            "date_of_journey": "2021-11-18",
            "destination": "DTW",
            "origin": "DCA",
            "real": {
                "arr": 1.668740944e+09,
                "dep": 1.668736446e+09
            },
            "scheduled": {
                "arr": 1.66874124e+09,
                "dep": 1.66873524e+09
            },
            "status": "Landed 22:09"
        },
        {
            "date_of_journey": "2021-11-16",
            "destination": "DCA",
            "origin": "DTW",
            "real": {
                "arr": 1.668644421e+09,
                "dep": 1.668640813e+09
            },
            "scheduled": {
                "arr": 1.66864524e+09,
                "dep": 1.6686396e+09
            },
            "status": "Landed 19:20"
        },
        {
            "date_of_journey": "2021-11-14",
            "destination": "DCA",
            "origin": "DTW",
            "real": {
                "arr": 1.668471887e+09,
                "dep": 1.66846811e+09
            },
            "scheduled": {
                "arr": 1.66847244e+09,
                "dep": 1.6684668e+09
            },
            "status": "Landed 19:24"
        }
    ]
},
{
    "_id": 2023,
    "recordsForYear": [
        {
            "date_of_journey": "2023-11-17",
            "destination": "DTW",
            "origin": "DCA",
            "real": {
                "arr": 1.668653594e+09,
                "dep": 1.668649558e+09
            },
            "scheduled": {
                "arr": 1.66865484e+09,
                "dep": 1.66864884e+09
            },
            "status": "Landed 21:53"
        },
        {
            "date_of_journey": "2023-11-15",
            "destination": "DCA",
            "origin": "DTW",
            "real": {
                "arr": 1.668557859e+09,
                "dep": 1.668553656e+09
            },
            "scheduled": {
                "arr": 1.66855884e+09,
                "dep": 1.6685532e+09
            },
            "status": "Landed 19:17"
        },
        {
            "date_of_journey": "2023-11-14",
            "destination": "DTW",
            "origin": "DCA",
            "real": {
                "arr": 1.668395275e+09,
                "dep": 1.668391115e+09
            },
            "scheduled": {
                "arr": 1.66839564e+09,
                "dep": 1.66838964e+09
            },
            "status": "Landed 22:07"
        }
    ]
},
{
    "_id": 2022,
    "recordsForYear": [
        {
            "date_of_journey": "2022-11-17",
            "destination": "DCA",
            "origin": "DTW",
            "real": {
                "arr": 1.668730907e+09,
                "dep": 1.668727074e+09
            },
            "scheduled": {
                "arr": 1.66873164e+09,
                "dep": 1.668726e+09
            },
            "status": "Landed 19:21"
        },
        {
            "date_of_journey": "2022-11-16",
            "destination": "DTW",
            "origin": "DCA",
            "real": {
                "arr": 1.668568424e+09,
                "dep": 1.668564614e+09
            },
            "scheduled": {
                "arr": 1.66856844e+09,
                "dep": 1.66856244e+09
            },
            "status": "Landed 22:13"
        },
        {
            "date_of_journey": "2022-11-15",
            "destination": "DTW",
            "origin": "DCA",
            "real": {
                "arr": 1.668481169e+09,
                "dep": 1.668477413e+09
            },
            "scheduled": {
                "arr": 1.66848204e+09,
                "dep": 1.66847604e+09
            },
            "status": "Landed 21:59"
        }
    ]
}
]
英文:

Here is the solution that I improvised, I hope others will find it useful.

db.collection.aggregate(
{
"$unwind": "$flight_records"}, 
{
"$group": {
_id: {
$year: {
$toDate: "$flight_records.date_of_journey"
}
},
recordsForYear: {
$push: "$flight_records"
}
}})

And here is the result :

{
"_id": 2021,
"recordsForYear": [
{
"date_of_journey": "2021-11-18",
"destination": "DTW",
"origin": "DCA",
"real": {
"arr": 1.668740944e+09,
"dep": 1.668736446e+09
},
"scheduled": {
"arr": 1.66874124e+09,
"dep": 1.66873524e+09
},
"status": "Landed 22:09"
},
{
"date_of_journey": "2021-11-16",
"destination": "DCA",
"origin": "DTW",
"real": {
"arr": 1.668644421e+09,
"dep": 1.668640813e+09
},
"scheduled": {
"arr": 1.66864524e+09,
"dep": 1.6686396e+09
},
"status": "Landed 19:20"
},
{
"date_of_journey": "2021-11-14",
"destination": "DCA",
"origin": "DTW",
"real": {
"arr": 1.668471887e+09,
"dep": 1.66846811e+09
},
"scheduled": {
"arr": 1.66847244e+09,
"dep": 1.6684668e+09
},
"status": "Landed 19:24"
}
]},{
"_id": 2023,
"recordsForYear": [
{
"date_of_journey": "2023-11-17",
"destination": "DTW",
"origin": "DCA",
"real": {
"arr": 1.668653594e+09,
"dep": 1.668649558e+09
},
"scheduled": {
"arr": 1.66865484e+09,
"dep": 1.66864884e+09
},
"status": "Landed 21:53"
},
{
"date_of_journey": "2023-11-15",
"destination": "DCA",
"origin": "DTW",
"real": {
"arr": 1.668557859e+09,
"dep": 1.668553656e+09
},
"scheduled": {
"arr": 1.66855884e+09,
"dep": 1.6685532e+09
},
"status": "Landed 19:17"
},
{
"date_of_journey": "2023-11-14",
"destination": "DTW",
"origin": "DCA",
"real": {
"arr": 1.668395275e+09,
"dep": 1.668391115e+09
},
"scheduled": {
"arr": 1.66839564e+09,
"dep": 1.66838964e+09
},
"status": "Landed 22:07"
}
]},{
"_id": 2022,
"recordsForYear": [
{
"date_of_journey": "2022-11-17",
"destination": "DCA",
"origin": "DTW",
"real": {
"arr": 1.668730907e+09,
"dep": 1.668727074e+09
},
"scheduled": {
"arr": 1.66873164e+09,
"dep": 1.668726e+09
},
"status": "Landed 19:21"
},
{
"date_of_journey": "2022-11-16",
"destination": "DTW",
"origin": "DCA",
"real": {
"arr": 1.668568424e+09,
"dep": 1.668564614e+09
},
"scheduled": {
"arr": 1.66856844e+09,
"dep": 1.66856244e+09
},
"status": "Landed 22:13"
},
{
"date_of_journey": "2022-11-15",
"destination": "DTW",
"origin": "DCA",
"real": {
"arr": 1.668481169e+09,
"dep": 1.668477413e+09
},
"scheduled": {
"arr": 1.66848204e+09,
"dep": 1.66847604e+09
},
"status": "Landed 21:59"
}
]

}
]

huangapple
  • 本文由 发表于 2023年1月9日 16:34:40
  • 转载请务必保留本文链接:https://go.coder-hub.com/75054771.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定