英文:
Migrating from MongoDB to PostgreSQL: Writing aggregation pipeline query for PostgreSQL
问题
We have a requirement of migrating from MongoDB to PostgreSQL in our Node.js application. Our express APIs heavily use mongodb "aggregation" framework, which we have to re-write in PostgreSQL. For Example, we have below form of data:
[{
"product": "PD1",
"application": "A",
"project": "PR1",
"version": "1"
},{
"product": "PD1",
"application": "A",
"project": "PR2",
"version": "3"
},{
"product": "PD1",
"application": "B",
"project": "PR3",
"version": "2"
},{
"product": "PD2",
"application": "C",
"project": "PR4",
"version": "5"
},{
"product": "PD2",
"application": "C",
"project": "PR5",
"version": "2"
}]
It is guaranteed that an application belongs to only one product. Similarly, a project belongs to only one application. I have written below MongoDB aggregate query to summarize this data:
let data = await CompanyModel.aggregate([{
$group: {
"_id": "$application",
"projects":{
$push: {
"project": "$project",
"version": "$version"
}
},
"product": {$first: "$product"}
}
},{
$group: {
"_id": "$product",
"applications":{
$push: {
"application": "$_id",
"projects": "$projects"
}
},
}
}])
The above query outputs below result:
[{
"_id": "PD2",
"applications": [{
"application": "C",
"projects": [{ "project": "PR4", "version": "5" }, { "project": "PR5", "version": "2" }]
}]
},
{
"_id": "PD1",
"applications": [{
"application": "B",
"projects": [{ "project": "PR3", "version": "2" }]
}, { "application": "A", "projects": [{ "project": "PR1", "version": "1" }, { "project": "PR2", "version": "3" }] }]
}]
There is some hard requirement to structure data in this way only.
Now comes the main part, how do we write a query in PostgreSQL that acts like a pipeline just like above query, where output of one stage is input to other stage. I have some experience with SQL but have never worked on PostgreSQL. But I know that PostgreSQL supports JSON data. My guess is that in SQL, I would do something like below:
SELECT * from (SELECT * from table_name group by field_name1) group by field_name2;
Basically I am trying to check if any such pipeline feature exist in PostgreSQL but couldn't find any.
I need some guidance on this. Thanks in advance!
英文:
We have a requirement of migrating from MongoDB to PostgreSQL in our Node.js application. Our express APIs heavily use mongodb aggregation
framework, which we have to re-write in PostgreSQL. For Example, we have below form of data:
[{
"product": "PD1",
"application": "A",
"project": "PR1",
"version": "1"
},{
"product": "PD1",
"application": "A",
"project": "PR2",
"version": "3"
},{
"product": "PD1",
"application": "B",
"project": "PR3",
"version": "2"
},{
"product": "PD2",
"application": "C",
"project": "PR4",
"version": "5"
},{
"product": "PD2",
"application": "C",
"project": "PR5",
"version": "2"
}]
It is guaranteed that an application belongs to only one product. Similarly, a project belongs to only one application. I have written below MongoDB aggregate query to summarize this data:
let data = await CompanyModel.aggregate([{
$group: {
"_id": "$application",
"projects":{
$push: {
"project": "$project",
"version": "$version"
}
},
"product": {$first: "$product"}
}
},{
$group: {
"_id": "$product",
"applications":{
$push: {
"application": "$_id",
"projects": "$projects"
}
},
}
}])
The above query outputs below result:
[{
"_id": "PD2",
"applications": [{
"application": "C",
"projects": [{ "project": "PR4", "version": "5" }, { "project": "PR5", "version": "2" }]
}]
},
{
"_id": "PD1",
"applications": [{
"application": "B",
"projects": [{ "project": "PR3", "version": "2" }]
}, { "application": "A", "projects": [{ "project": "PR1", "version": "1" }, { "project": "PR2", "version": "3" }] }]
}]
There is some hard requirement to structure data in this way only.
Now comes the main part, how do we write a query in PostgreSQL that acts like a pipeline just like above query, where output of one stage is input to other stage. I have some experience with SQL but have never worked on PostgreSQL. But I know that PostgreSQL supports JSON data. My guess is that in SQl, I would do something like below:
SELECT * from (SELECT * from table_name group by field_name1) group by field_name2;
Basically I am trying to check if any such pipeline feature exist in PostgreSQL but couldn't find any.
I need some guidance on this.
Thanks in advance!
答案1
得分: 0
I'll provide a translation of the code portion you provided:
WITH app_projects AS (
SELECT
application,
product,
jsonb_agg(
jsonb_build_object(
'project', project,
'version', version
)
) AS projects
FROM
CompanyTable
GROUP BY
application,
product
),
prod_apps AS (
SELECT
product,
jsonb_agg(
jsonb_build_object(
'application', application,
'projects', projects
)
) AS applications
FROM
app_projects
GROUP BY
product
)
SELECT
product AS _id,
applications
FROM
prod_apps;
这是你提供的SQL代码的翻译部分。如果你需要进一步的解释或有其他问题,请随时提问。
英文:
WITH app_projects AS (
SELECT
application,
product,
jsonb_agg(
jsonb_build_object(
'project', project,
'version', version
)
) AS projects
FROM
CompanyTable
GROUP BY
application,
product
),
prod_apps AS (
SELECT
product,
jsonb_agg(
jsonb_build_object(
'application', application,
'projects', projects
)
) AS applications
FROM
app_projects
GROUP BY
product
)
SELECT
product AS _id,
applications
FROM
prod_apps;
I used here CTE in postgresql. You can read more here, good explanation:
https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-cte/
-
In
app_projects
CTE, for eachapplication
andproduct
, I group projects and versions into a json array of objects using thejsonb_agg
andjsonb_build_object
functions. -
In
prod_apps
CTE, I group the results of the previous CTE byproduct
and create a json array of objects for each product containingapplications
and theirprojects
.
At the end, I select all the results from the prod_apps
CTE to get the desired output.
jsonb_build_object
- it's a function
jsonb_build_object(
'project', project,
'version', version
)
workes like this:
{'project': project, 'version': version}
jsonb_agg
- it's an expression
if you have some_value
:
some_value
------
1
2
3
jsonb_agg(some_value)
will make:
jsonb_agg
---------
[1, 2, 3]
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论