Convert Column to JSON object of Another column in Azure Dataflow
Question
I have data in the format below, and I am using a Data Flow to convert each record to JSON and store it in another column of the data.
I want to convert it to the format below using the Data Flow:
I have not found a way to do this with Data Flow.
Answer 1
Score: 1
- You can achieve this with derived column transformations. I used a source with the columns TRANS_ID, CUST_ID_A, SCORE A, CUST_ID_B and SCORE B.
- First, in a derived column transformation, use the associate function to build one key-value pair per customer, stored in two new columns:
A: associate(CUST_ID_A,{SCORE A})
B: associate(CUST_ID_B,{SCORE B})
- Next, in a second derived column transformation, combine the two new columns into a single array column, cust_conf: array(A,B)
- In the sink, I chose a JSON file and mapped only the required columns, TRANS_ID and cust_conf.
- The data preview then shows the final data in the required format.
- The following is the complete dataflow JSON:
{
    "name": "dataflow1",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "DelimitedText1",
                        "type": "DatasetReference"
                    },
                    "name": "source1"
                }
            ],
            "sinks": [
                {
                    "dataset": {
                        "referenceName": "Json1",
                        "type": "DatasetReference"
                    },
                    "name": "sink1"
                }
            ],
            "transformations": [
                {
                    "name": "derivedColumn1"
                },
                {
                    "name": "derivedColumn2"
                }
            ],
            "scriptLines": [
                "source(output(",
                "          TRANS_ID as string,",
                "          CUST_ID_A as string,",
                "          {SCORE A} as string,",
                "          CUST_ID_B as string,",
                "          {SCORE B} as string",
                "     ),",
                "     allowSchemaDrift: true,",
                "     validateSchema: false,",
                "     ignoreNoFilesFound: false) ~> source1",
                "source1 derive(A = associate(CUST_ID_A,{SCORE A}),",
                "          B = associate(CUST_ID_B,{SCORE B})) ~> derivedColumn1",
                "derivedColumn1 derive(cust_conf = array(A,B)) ~> derivedColumn2",
                "derivedColumn2 sink(allowSchemaDrift: true,",
                "     validateSchema: false,",
                "     partitionFileNames:['op.json'],",
                "     umask: 0022,",
                "     preCommands: [],",
                "     postCommands: [],",
                "     skipDuplicateMapInputs: true,",
                "     skipDuplicateMapOutputs: true,",
                "     mapColumn(",
                "          TRANS_ID,",
                "          cust_conf",
                "     ),",
                "     partitionBy('hash', 1)) ~> sink1"
            ]
        }
    }
}
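To make the shape of the result easier to picture, here is a minimal Python sketch that emulates the two derived column steps on a single row. It is not Data Flow code: the `associate` and `to_sink_row` helpers are hypothetical stand-ins, the sample values are invented for illustration, and it assumes the JSON sink writes each map produced by associate as a JSON object.

```python
import json


def associate(key, value):
    """Hypothetical stand-in for the data flow associate() call: a one-entry map."""
    return {key: value}


def to_sink_row(row):
    """Emulate derivedColumn1 (A, B), derivedColumn2 (cust_conf) and the sink mapping."""
    a = associate(row["CUST_ID_A"], row["SCORE A"])
    b = associate(row["CUST_ID_B"], row["SCORE B"])
    # Only TRANS_ID and cust_conf are mapped in the sink.
    return {"TRANS_ID": row["TRANS_ID"], "cust_conf": [a, b]}


# Invented sample row; every column is a string, as declared in the source schema.
sample = {
    "TRANS_ID": "T001",
    "CUST_ID_A": "C100",
    "SCORE A": "0.91",
    "CUST_ID_B": "C200",
    "SCORE B": "0.47",
}

print(json.dumps(to_sink_row(sample)))
```

Under these assumptions, each output record carries the transaction ID plus an array of two single-key objects, one per customer, keyed by customer ID with the score as the value.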