Is it possible to add data from selected columns of multiple tables in a MySQL RDS database to a single table in another MySQL RDS instance?
Question
Is it possible to use AWS Glue to add data from selected columns of multiple tables in a MySQL RDS database to a single table in another MySQL RDS instance?
Please suggest.
Thanks
Answer 1
Score: 1
Yes, this is possible with Glue. There are two approaches.
First approach:
- Run a Glue crawler on the source tables, then load them into your Glue job from the Glue Data Catalog.
- Each table arrives as its own Glue DynamicFrame. From each one, select the columns you need along with the join key.
- Join the DynamicFrames on that key and write the combined result to the target MySQL RDS table.
With this approach, every column of every source table is loaded first; the column selection and the join happen inside the Glue job.
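A minimal sketch of the first approach as a Glue PySpark job might look like the following. Every name in it is an assumption made for illustration: the catalog database `source_catalog_db`, the tables `customers` and `orders`, their columns, the join key `customer_id`, the Glue connection `target-mysql-connection`, and the target `target_db.combined_table`. The join is done on Spark DataFrames rather than directly on DynamicFrames, which is a choice made here so the join key appears only once in the output.

```python
# All names below are hypothetical placeholders; replace them with your own.
SOURCE_DB = "source_catalog_db"  # Glue Data Catalog database populated by the crawler
TABLES = {
    "customers": ["customer_id", "name"],
    "orders": ["customer_id", "order_total"],
}
JOIN_KEY = "customer_id"


def columns_to_keep(wanted, join_key):
    """Return the wanted columns with the join key first and listed exactly once."""
    return [join_key] + [col for col in wanted if col != join_key]


def run_job():
    # Imported here because awsglue and pyspark are only available in the Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Load each crawled table and keep only the selected columns plus the join key.
    frames = []
    for table, wanted in TABLES.items():
        dyf = glue_context.create_dynamic_frame.from_catalog(
            database=SOURCE_DB, table_name=table
        )
        frames.append(dyf.select_fields(columns_to_keep(wanted, JOIN_KEY)).toDF())

    # Inner-join all frames on the shared key.
    combined = frames[0]
    for df in frames[1:]:
        combined = combined.join(df, on=JOIN_KEY, how="inner")

    # Convert back to a DynamicFrame and write through the target Glue connection.
    result = DynamicFrame.fromDF(combined, glue_context, "combined")
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=result,
        catalog_connection="target-mysql-connection",
        connection_options={"dbtable": "combined_table", "database": "target_db"},
    )
```

In a real job you would call `run_job()` at the end of the script. The inner join is also an assumption; pick the join type that matches how the source tables relate.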
Second approach:
- Write a SQL query that selects the required columns and joins the tables, and push it down to the MySQL engine.
- MySQL computes the result, which you then load into a Spark DataFrame.
- Finally, convert the DataFrame to a DynamicFrame and write it to the target MySQL table.
With this approach, the computation is delegated to MySQL, so large tables will put load on the database engine.
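The second approach could be sketched roughly as below. The helper `build_pushdown_query`, the table and column names, the JDBC URL, the credentials, and the connection and target names are all hypothetical. The query is passed to Spark's JDBC reader as a subquery in the `dbtable` option, which is one common way to have MySQL run the join; treat it as an illustration under those assumptions rather than the only way to do this.

```python
# All names below are hypothetical placeholders; replace them with your own.
TABLES = {"customers": ["name"], "orders": ["order_total"]}
JOIN_KEY = "customer_id"


def build_pushdown_query(tables, join_key):
    """Build a SELECT that picks the chosen columns and inner-joins every table on join_key."""
    names = list(tables)
    base = names[0]
    select_cols = [f"{base}.{join_key}"]
    for name in names:
        select_cols += [f"{name}.{col}" for col in tables[name] if col != join_key]
    joins = "".join(
        f" JOIN {name} ON {name}.{join_key} = {base}.{join_key}" for name in names[1:]
    )
    return f"SELECT {', '.join(select_cols)} FROM {base}{joins}"


def run_job():
    # Imported here because awsglue and pyspark are only available in the Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())
    spark = glue_context.spark_session

    # MySQL executes the join; Spark only receives the result set.
    query = build_pushdown_query(TABLES, JOIN_KEY)
    df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:mysql://source-host:3306/source_db")  # assumed endpoint
        .option("dbtable", f"({query}) AS src")
        .option("user", "source_user")          # assumed; load real credentials securely
        .option("password", "source_password")  # assumed; load real credentials securely
        .load()
    )

    # Convert to a DynamicFrame and write through the target Glue connection.
    result = DynamicFrame.fromDF(df, glue_context, "combined")
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=result,
        catalog_connection="target-mysql-connection",
        connection_options={"dbtable": "combined_table", "database": "target_db"},
    )
```

Because the join runs inside MySQL, this only works when all the source tables are reachable from the same source connection.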