string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

huangapple go评论77阅读模式
英文:

string_agg substitute used in Derive transformation in Azure data flow expression language

问题

如何将此SQL代码 "left(string_agg(zationClient, ';;') within group (order by zationClient), 100)" 转换为Azure数据工厂中的数据流表达式语言中的转换?

我期望的输出是这样的,分隔符是 "NAFDQ-Trulife; EAQVN - Trulife"

我明白在数据流表达式语言中不支持string_agg。

我尝试过使用substring(concat(zationClient, ';;'), 1, 128),但它没有给出所需的结果。

英文:

How do I convert this sql code "left(string_agg(zationClient, '; ') within group (order by zationClient) , 100)" as a Transformation in data factory specifically in azure data flow expression language?

I am expecting an output like this with delimiter "NAFDQ-Trulife; EAQVN - Trulife"

I understand that string_agg is not supported in data flow expression language.

I tried substring(concat(zationClient, '; '), 1, 128) but it doesn't give the required result.

答案1

得分: 0

  • 您需要使用派生列转换、排序转换和聚合转换的组合来获得所需的结果。以下是我拥有的起始数据:

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • 我已添加了一个名为tp的新列,其静态值为1。这可以用于按需分组数据。

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • 现在,使用排序转换在所需列上对数据进行排序,其配置类似于下面所示的配置:

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • 现在,使用聚合转换按tp列进行分组。在聚合选项卡中,使用以下表达式来获取所需的结果:

left(substring(toString(reduce(collect(con), ''', #acc+';'+#item, #result)),2),100)

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • 以下是完整的数据流 JSON:
{
    "name": "dataflow1",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "DelimitedText1",
                        "type": "DatasetReference"
                    },
                    "name": "source1"
                }
            ],
            "sinks": [
                {
                    "name": "sink1"
                }
            ],
            "transformations": [
                {
                    "name": "derivedColumn1"
                },
                {
                    "name": "sort1"
                },
                {
                    "name": "aggregate1"
                }
            ],
            "scriptLines": [
                "source(output(",
                "          id as short,",
                "          con as string",
                "     ),",
                "     allowSchemaDrift: true,",
                "     validateSchema: false,",
                "     ignoreNoFilesFound: false) ~> source1",
                "source1 derive(tp = 1) ~> derivedColumn1",
                "derivedColumn1 sort(asc(con, true),",
                "     caseInsensitive: true) ~> sort1",
                "sort1 aggregate(groupBy(tp),",
                "     con = left(substring(toString(reduce(collect(con), ''', #acc+';'+#item, #result)),2),100)) ~> aggregate1",
                "aggregate1 sink(validateSchema: false,",
                "     skipDuplicateMapInputs: true,",
                "     skipDuplicateMapOutputs: true,",
                "     store: 'cache',",
                "     format: 'inline',",
                "     output: false,",
                "     saveOrder: 1) ~> sink1"
            ]
        }
    }
}
英文:
  • You need to use a combination of derived column transformation, sort transformation and an aggregate transformation to get the desired result. The following is the starting data that I have:

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • I have added a new column tp with a static value 1. This can be used to group the data as required.

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • Now sort the data on the required column using sort transformation with configurations similar to the ones shown below;

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • Now using the aggregate transformations, to group by the tp column. In the aggregates tab, use the required column with the following expression to get the desired results.
left(substring(toString(reduce(collect(con), '', #acc+';'+#item, #result)),2),100)

string_agg函数在Azure数据流表达式语言中用于Derive转换中的替代。

  • The following is the entire Dataflow JSON:
{
    "name": "dataflow1",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "DelimitedText1",
                        "type": "DatasetReference"
                    },
                    "name": "source1"
                }
            ],
            "sinks": [
                {
                    "name": "sink1"
                }
            ],
            "transformations": [
                {
                    "name": "derivedColumn1"
                },
                {
                    "name": "sort1"
                },
                {
                    "name": "aggregate1"
                }
            ],
            "scriptLines": [
                "source(output(",
                "          id as short,",
                "          con as string",
                "     ),",
                "     allowSchemaDrift: true,",
                "     validateSchema: false,",
                "     ignoreNoFilesFound: false) ~> source1",
                "source1 derive(tp = 1) ~> derivedColumn1",
                "derivedColumn1 sort(asc(con, true),",
                "     caseInsensitive: true) ~> sort1",
                "sort1 aggregate(groupBy(tp),",
                "     con = left(substring(toString(reduce(collect(con), '', #acc+';'+#item, #result)),2),100)) ~> aggregate1",
                "aggregate1 sink(validateSchema: false,",
                "     skipDuplicateMapInputs: true,",
                "     skipDuplicateMapOutputs: true,",
                "     store: 'cache',",
                "     format: 'inline',",
                "     output: false,",
                "     saveOrder: 1) ~> sink1"
            ]
        }
    }
}

huangapple
  • 本文由 发表于 2023年6月5日 20:07:49
  • 转载请务必保留本文链接:https://go.coder-hub.com/76406264.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定