Flatten活动在ADF数据流中能够处理动态展开吗?

huangapple go评论59阅读模式
英文:

Can the Flatten activity in ADF data flow handle dynamic unroll by

问题

I'm importing items from Cosmos DB, however, each item can potentially have a different schema structure. Below are a few examples of how they could look.

Ex.1
{
  "id": "01-000008",
  "pKey": "000008",
  "delete": false,
  "/": {
    "fwqs": false,
    "itemDescription": "Test Description one",
    "linkset": {
      "anchor": "Test anchor one",
      "itemDescription": "Test Description one",
      "https://test/testIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to test one",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ],
      "https://test/activityIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to buy",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ]
    }
  }
}

Ex.2
{
  "id": "01-000009",
  "pKey": "000009",
  "delete": false,
  "/10/101": {
    "fwqs": false,
    "itemDescription": "Test Description Two",
    "linkset": {
      "anchor": "Test anchor Two",
      "itemDescription": "Test Description Two",
      "https://test/testIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to test",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ]
    }
  }
}

Ex.3
{
  "id": "01-0000010",
  "pKey": "0000010",
  "delete": false,
  "/10/12345/AB/100001": {
    "fwqs": false,
    "itemDescription": "Test Description Three",
    "linkset": {
      "anchor": "Test anchor Three",
      "itemDescription": "Test Description Three",
      "https://test/testIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to test",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ]
    }
  }
}

I've made sure to clear the data set schema and the source projection in preparation to handle the dynamic key.

In Ex.1 in the unroll by, I'm trying to access {/}.linkset.{https://test/testIdeas}. But because the "{/}" key is always changing, I need to make this part dynamic somehow in order to also pick up the keys you find in Ex.2 and Ex.3.

I'm struggling to resolve the issue, and I'm not sure what function should be used to identify a dynamic key and if it should be done in the "add dynamic content" or the "add column pattern" part of the "unroll by" option in the flatten activity.

Desired Results
Current Results

Flatten Activity Ideas

Flatten Test Ideas

Derived 'new_col' Activity

英文:

I'm importing items from Cosmos db, however each item can potentially have a different schema structure. Below are a few examples of how they could look.

Ex.1
{
"id": "01-000008",
"pKey": "000008",
"delete": false,
"/": {
"fwqs": false,
"itemDescription": "Test Description one",
"linkset": {
"anchor": "Test anchor one",
"itemDescription": "Test Description one",
"https://test/testIdeas": [
{
"href": "https://test.com/test/",
"title": "Where to test one",
"fwqs": true,
"public": true,
"type": "text/html",
"context": [
"us"
],
"hreflang": [
"en"
]
}
],
"https://test/activityIdeas": [
{
"href": "https://test.com/test/",
"title": "Where to buy",
"fwqs": true,
"public": true,
"type": "text/html",
"context": [
"us"
],
"hreflang": [
"en"
]
}
]
}
}
}
Ex.2
{
"id": "01-000009",
"pKey": "000009",
"delete": false,
"/10/101": {
"fwqs": false,
"itemDescription": "Test Description Two",
"linkset": {
"anchor": "Test anchor Two",
"itemDescription": "Test Description Two",
"https://test/testIdeas": [
{
"href": "https://test.com/test/",
"title": "Where to test",
"fwqs": true,
"public": true,
"type": "text/html",
"context": [
"us"
],
"hreflang": [
"en"
]
}
]
}
}
}
Ex.3
{
"id": "01-0000010",
"pKey": "0000010",
"delete": false,
"/10/12345/AB/100001": {
"fwqs": false,
"itemDescription": "Test Description Three",
"linkset": {
"anchor": "Test anchor Three",
"itemDescription": "Test Description Three",
"https://test/testIdeas": [
{
"href": "https://test.com/test/",
"title": "Where to test",
"fwqs": true,
"public": true,
"type": "text/html",
"context": [
"us"
],
"hreflang": [
"en"
]
}
]
}
}
}

I've made sure to clear the data set schema and the source projection in preparation to handle the dynamic key.

In Ex.1 in the unroll by I'm trying to access {/}.linkset.{https://test/testIdeas}. But because the "{/}" key is always changing I need to make this part dynamic somehow in order to also pick up the keys you find in Ex.2 & Ex.3.

I'm struggling to resolve the issue, I'm not sure what function should be used to identify a dynamic key and if it should be done in the "add dynamic content" or the "add column pattern" part of the "unroll by" option in the flatten activity.

desired results
current results

flatten activity ideas

flatten test ideas

derived 'new_col' activity

答案1

得分: 0

One approach to solve this is renaming the dynamic key with a static name and then flattening that column. Since the key has the pattern starting with \, column which begins with \ can be renamed as new_column. Below are the detailed steps to rename the dynamic key.

  • One sample source JSON dataset is taken in source transformation.
    Flatten活动在ADF数据流中能够处理动态展开吗?

  • Then derived column transformation is taken to rename the column that starts with \ to new_col.

Flatten活动在ADF数据流中能够处理动态展开吗?

Dataflow script for derived column:
derive(each(match(startsWith(name,'/')), 'new_col' = $$)) ~> derivedColumn1

  • Then you can use flatten transformation and give the unroll by basis the new_column field.

Flatten活动在ADF数据流中能够处理动态展开吗?

By this way, flatten the Json with dynamic key can be flattened.

英文:

One approach to solve this is renaming the dynamic key with a static name and then flattening that column. Since the key has the pattern starting with \, column which begins with \ can be renamed as new_column. Below are the detailed steps to rename the dynamic key.

  • One sample source JSON dataset is taken in source transformation.
    Flatten活动在ADF数据流中能够处理动态展开吗?

  • Then derived column transformation is taken to rename the column that starts with \ to new_col.

Flatten活动在ADF数据流中能够处理动态展开吗?

Dataflow script for derived column:

derive(each(match(startsWith(name,'/')), 'new_col'  =  $$)) ~> derivedColumn1
  • Then you can use flatten transformation and give the unroll by basis the new_column field.

Flatten活动在ADF数据流中能够处理动态展开吗?

By this way, flatten the Json with dynamic key can be flattened.

huangapple
  • 本文由 发表于 2023年5月11日 17:33:51
  • 转载请务必保留本文链接:https://go.coder-hub.com/76226168.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定