Copy Activity in Azure Data Factory
Question
How can the Copy activity in Azure Data Factory be used to rename folders while moving data from Data Lake Gen1 to Data Lake Gen2?
If the original folder structure in Data Lake Gen1 is y=2022/m=08/d=01 and the desired structure in Data Lake Gen2 is 2022/08/01, can the Copy activity be configured to rename the folders accordingly?
I want to do this for every month and day of the year.
Answer 1
Score: 1
- Store the folder and file path details of the source ADLS Gen1 account in a control table, and use that table as the source dataset for a Lookup activity. Clear the Lookup activity's "First row only" option so that it returns every row.
- Add a ForEach activity and connect it so that it runs after the Lookup activity. In the ForEach activity, set the Items expression to
@activity('Lookup1').output.value
- Inside the ForEach activity, add a Copy activity. Create a linked service for ADLS Gen1 and a source dataset with parameters for the folder and the file. In the source dataset, pass
@item().Folder
as the value of the folder parameter and
@item().File
as the value of the file parameter.
- Similarly, create a sink dataset for ADLS Gen2, also with dataset parameters for the folder and file names.
- In the sink dataset, pass
@replace(replace(replace(item().Folder,'y=',''),'m=',''),'d=','')
as the value of the folder parameter. This removes 'y=', 'm=' and 'd=' from the item().Folder value. Pass
@item().File
as the value of the file parameter.
When the pipeline runs, the folders are renamed as required.
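For reference, the ForEach and Copy wiring described in the steps could look roughly like the pipeline JSON that the Python sketch below prints. The dataset names `AdlsGen1Source` and `AdlsGen2Sink`, the activity names `ForEach1` and `CopyGen1ToGen2`, and the parameter names `Folder` and `File` are assumptions made for illustration; the Copy activity's source and sink settings are left out because they depend on the file format.

```python
import json

# Hypothetical dataset names; the answer does not name the datasets.
SOURCE_DATASET = "AdlsGen1Source"
SINK_DATASET = "AdlsGen2Sink"

# Expression from the answer that strips the 'y=', 'm=' and 'd=' prefixes.
STRIP_PREFIXES = "@replace(replace(replace(item().Folder,'y=',''),'m=',''),'d=','')"


def expr(value):
    """Wrap a string as a Data Factory expression object."""
    return {"value": value, "type": "Expression"}


def dataset_ref(name, folder_expr, file_expr):
    """Build a dataset reference that passes folder and file parameters."""
    return {
        "referenceName": name,
        "type": "DatasetReference",
        # Parameter names are assumed; use the names defined on your datasets.
        "parameters": {"Folder": expr(folder_expr), "File": expr(file_expr)},
    }


def build_foreach():
    """Return the ForEach activity (with its inner Copy activity) as a dict."""
    copy_activity = {
        "name": "CopyGen1ToGen2",  # hypothetical activity name
        "type": "Copy",
        "inputs": [dataset_ref(SOURCE_DATASET, "@item().Folder", "@item().File")],
        "outputs": [dataset_ref(SINK_DATASET, STRIP_PREFIXES, "@item().File")],
        # Source and sink typeProperties are omitted: they depend on the format.
    }
    return {
        "name": "ForEach1",  # hypothetical activity name
        "type": "ForEach",
        "dependsOn": [
            {"activity": "Lookup1", "dependencyConditions": ["Succeeded"]}
        ],
        "typeProperties": {
            "items": expr("@activity('Lookup1').output.value"),
            "activities": [copy_activity],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_foreach(), indent=2))
```

The printed JSON is only a fragment for comparison with what the Data Factory authoring UI generates; it is not a complete pipeline definition.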
**Source file path:** for example, y=2022/m=08/d=01
**Sink file path:** for example, 2022/08/01
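To check the folder mapping offline before running the pipeline, the nested replace expression can be mirrored in a few lines of Python. This is only a sketch: `sink_folder` and `source_folders` are hypothetical helper names, and the generator assumes the control table holds one y=YYYY/m=MM/d=DD folder per calendar day, which may not match folders that actually exist in the Gen1 account.

```python
from datetime import date, timedelta


def sink_folder(folder: str) -> str:
    """Mirror replace(replace(replace(folder,'y=',''),'m=',''),'d=','')."""
    return folder.replace("y=", "").replace("m=", "").replace("d=", "")


def source_folders(year: int):
    """Yield y=YYYY/m=MM/d=DD for every calendar day of the given year."""
    day = date(year, 1, 1)
    while day.year == year:
        yield f"y={day.year}/m={day.month:02d}/d={day.day:02d}"
        day += timedelta(days=1)


if __name__ == "__main__":
    # Show the first few source folders and the sink folders they map to.
    for folder in list(source_folders(2022))[:3]:
        print(folder, "->", sink_folder(folder))
```

Note that the replace calls, like the Data Factory expression, remove 'y=', 'm=' and 'd=' wherever they occur in the path, so a folder segment such as day=05 would also be altered. The approach suits paths that contain only the three partition prefixes.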