Creating Dynamic Folders based on Filename in Azure Data Factory DataFlow
Question
I am working on an Azure Data Factory data flow pipeline that has a Sink activity. One of the columns in the sink contains filename information in the format "2023-07-19_diane_12345.csv". I want to extract the date from the filename and create folders following the pattern yyyy/mm/dd.
I tried using the substring and lastIndexOf functions in the expression, but it did not work as expected.
A parameter contains the filename, and in the Sink activity I am trying to use the expression builder to do the job, but it says the column is not found.
It seems the expression I am building is for the Copy activity rather than for a data flow. Is there another way to do this and create the folders dynamically?
[Screenshots: the pipeline and the sink settings]
Answer 1
Score: 1
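In a Derived Column transformation, add a column with the expression below:
replace(substring(fileName, 1, 10), "-", "/")
Here fileName is the column that stores the name of the file. Positions in the data flow substring function are 1-based, so position 1 with length 10 selects the leading date, for example "2023-07-19".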
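To sanity-check the logic of that expression outside Data Factory, here is a minimal Python sketch of the same two steps: take the first 10 characters, then swap the hyphens for slashes. The function name `folder_path_from_filename` is hypothetical and exists only for this illustration; it assumes the filename always starts with a date in yyyy-mm-dd form.

```python
def folder_path_from_filename(file_name: str) -> str:
    """Return the yyyy/mm/dd folder path encoded at the start of file_name.

    Hypothetical helper for illustration only; assumes the name begins
    with a date formatted as yyyy-mm-dd.
    """
    date_part = file_name[:10]          # first 10 characters, e.g. "2023-07-19"
    return date_part.replace("-", "/")  # hyphens become folder separators


print(folder_path_from_filename("2023-07-19_diane_12345.csv"))
```

If a filename does not start with a date, this simply returns its first 10 characters with hyphens replaced, so any validation would have to happen upstream.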
Configure the Sink settings as follows:
- File name option: Name folder as column data
- Column data: <created column>
The files will then be stored under folders in the yyyy/mm/dd format.