Azure Data Factory, Copy Activity with "flatten hierarchy" Copy behavior
Question
I have a Data Factory data pipeline with a Copy Activity that has the 'Flatten Hierarchy' option. This option changes the file names in the destination. Is there a way to trace back the original file path from the auto-generated file name?
Answer 1
Score: 0
- There may be no way to retain the original path or file name when using `flatten hierarchy`. Even the file metadata is not retained for files copied with `flatten hierarchy` (I observed this with the files I used for a demonstration).
- If you know the depth of the hierarchy, you can flatten the files yourself using loops in Azure Data Factory. However, this would require a large number of loop activities, which is not practical.
- Recursion is not supported in Azure Data Factory: the cycle would be detected and an error would be thrown.
- So it may not be possible to preserve file names and paths using Azure Data Factory alone. As an alternative, you can use a programming language such as Python with Databricks to flatten the files while preserving their names.
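
The Python alternative mentioned in the last point could look roughly like the sketch below. It is only an illustration under stated assumptions: the source and destination are assumed to be reachable as ordinary filesystem paths (for example through a mount), and the function name `flatten_with_manifest`, the `__` separator, and the `manifest.csv` file are hypothetical choices, not anything Azure Data Factory or Databricks produces. Each file is copied into a single folder under a name that encodes its original relative path, and a manifest records the mapping so the original path can be traced back.

```python
import csv
import shutil
from pathlib import Path


def flatten_with_manifest(src_root, dst_dir, manifest_name="manifest.csv", sep="__"):
    """Copy every file under src_root into dst_dir with no subfolders.

    The flat file name is the original relative path with its parts joined
    by `sep`. A CSV manifest in dst_dir maps each flat name back to the
    original relative path. dst_dir is assumed to lie outside src_root.
    """
    src_root = Path(src_root)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    mapping = {}
    for path in sorted(src_root.rglob("*")):
        if not path.is_file():
            continue  # skip directories
        rel = path.relative_to(src_root)
        flat_name = sep.join(rel.parts)  # e.g. a/b/c.txt -> a__b__c.txt
        if flat_name in mapping:
            # Two different paths would collapse to the same flat name.
            raise ValueError(f"name collision for {rel.as_posix()}")
        shutil.copy2(path, dst_dir / flat_name)
        mapping[flat_name] = rel.as_posix()

    # Write the lookup table used to trace flat names back to source paths.
    with open(dst_dir / manifest_name, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["flat_name", "original_path"])
        writer.writerows(sorted(mapping.items()))

    return mapping
```

Encoding the path in the name keeps each file self-describing, while the manifest covers cases where the separator alone would be ambiguous (for instance, a source file whose own name already contains `__`).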