ADF: Does a Copy activity within a ForEach loop hit the source DB multiple times?
Question
I need to upsert an entire table from a source DB to multiple destination DBs (all Azure SQL). I have a Lookup activity that pulls the list of server/DB names and passes it to a ForEach loop, inside which I've placed a Copy Data activity with a static dataset as the Source and a dynamic Destination.
My question is this: Will this arrangement hit the source DB every time the loop runs? I really only need to pull the data once since the exact same data needs to be upserted to all the destination DBs. Is there a way to pull the data one time and dump it to memory somehow instead of going back to the source DB every time the loop runs? Or by definition, does my use of a static dataset as the Source mean that the data is only pulled once per pipeline run?
Does any of this make any sense?
Answer 1
Score: 1
In Azure Data Factory, using a static dataset as the Source in the Copy Data activity does not mean the data is pulled only once per pipeline run. A dataset is just a reference to the data, not a cached copy of it. Each iteration of the ForEach loop starts its own Copy activity run, and each run queries the source DB again, so with N destination DBs the source table is read N times.
ADF has no built-in way to hold the result of a Copy activity in memory and reuse it across iterations. To read the source only once, add a Copy activity before the ForEach loop that lands the table in a staging store (for example, Parquet files in Blob Storage, or a staging table), then point the Copy activity inside the loop at the staged copy and upsert from there to each destination database.
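One way to make the single read explicit, rather than relying on how the Copy activity behaves inside the loop, is to stage the table once and fan out from the staged copy. The pipeline below is a minimal sketch of that layout, not a tested definition. Every activity name, dataset name, dataset parameter, the Lookup query, and the upsert key column `Id` are hypothetical placeholders, marked as such in the `description` fields because JSON has no comment syntax.

```json
{
  "name": "StageOnceThenFanOut",
  "properties": {
    "description": "Sketch only: all names, datasets, parameters and the key column are hypothetical.",
    "activities": [
      {
        "name": "LookupTargets",
        "description": "Returns the list of destination server/DB names (hypothetical config table).",
        "type": "Lookup",
        "typeProperties": {
          "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": "SELECT ServerName, DatabaseName FROM dbo.Targets"
          },
          "dataset": { "referenceName": "ConfigDbDataset", "type": "DatasetReference" },
          "firstRowOnly": false
        }
      },
      {
        "name": "StageSourceTable",
        "description": "The only activity that reads the source DB; writes the table to staging as Parquet.",
        "type": "Copy",
        "inputs": [ { "referenceName": "SourceTableDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "StagedParquetDataset", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": { "type": "ParquetSink" }
        }
      },
      {
        "name": "ForEachTarget",
        "type": "ForEach",
        "dependsOn": [
          { "activity": "LookupTargets", "dependencyConditions": [ "Succeeded" ] },
          { "activity": "StageSourceTable", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "items": { "value": "@activity('LookupTargets').output.value", "type": "Expression" },
          "activities": [
            {
              "name": "UpsertToTarget",
              "description": "Reads the staged copy, not the source DB; key column Id is hypothetical.",
              "type": "Copy",
              "inputs": [ { "referenceName": "StagedParquetDataset", "type": "DatasetReference" } ],
              "outputs": [
                {
                  "referenceName": "TargetTableDataset",
                  "type": "DatasetReference",
                  "parameters": {
                    "serverName": { "value": "@item().ServerName", "type": "Expression" },
                    "databaseName": { "value": "@item().DatabaseName", "type": "Expression" }
                  }
                }
              ],
              "typeProperties": {
                "source": { "type": "ParquetSource" },
                "sink": {
                  "type": "AzureSqlSink",
                  "writeBehavior": "upsert",
                  "upsertSettings": { "useTempDB": true, "keys": [ "Id" ] }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
```

With this layout the source DB is queried once by `StageSourceTable`, and each loop iteration reads only the staged files. The trade-off is an extra storage hop and the need to clean up or overwrite the staged copy between runs.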