Azure Data Factory - Copy data REST API => ADLS

Question

I'm struggling with an issue in Azure Data Factory.

I am extracting data from the Salesforce Marketing Cloud REST API.
The first page goes well, but when I try to use pagination it no longer works.

The base URL looks like https://mcXXXXXXXXXXXXXXX.rest.marketingcloudapis.com

In my pipeline I use a Copy data activity:
[Screenshot: Copy data activity]

The method is GET, and I send the API path through the dataset properties: /data/v1/customobjectdata/key/GMC_Historique_ADF/rowset

My first response looks like this:
[Screenshot: first API response]

So in the pagination rules I use this:
[Screenshot: pagination rules]

The issue is that /data/ is missing from the URL in the links.next value.

Is there a way to add /data/ to the base URL when following links.next, to get something like the following link: https://mcXXXXXXXXXXXXXXX.rest.marketingcloudapis.com/data/v1/customobjectdata/token/xxxxxxxxxx-xxxxxxx-xxxxxx/rowset?$page=2

Answer 1

Score: 0

>The next link itself is not valid, i would like to setup manually the base URL and only take the next.link value to build a custom URL

If the next link itself is not valid, you cannot use pagination in ADF. Pagination requires each page's response to contain a valid link to the next page.
The API would need to return a valid next link for pagination to work.

>Is there any solution to add /data/ in base url from links.next to have something like the following link : https://mcXXXXXXXXXXXXXXX.rest.marketingcloudapis.com/data/v1/customobjectdata/token/xxxxxxxxxx-xxxxxxx-xxxxxx/rowset?$page=2

If all of your URLs are the same and the only difference between them is the page number, you can try the workaround below.

NOTE: This method only works if you know the total number of pages.

Build an array whose length is the total number of pages and pass it to a ForEach activity:

@range(1,<total_number_of_pages>)
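
To make the semantics of that expression concrete: ADF's range function takes a start index and a count, not a start and an end. The Python sketch below only mimics that behavior for illustration; `adf_range` and the page count of 5 are hypothetical names and values, not part of ADF or of the original pipeline.

```python
def adf_range(start_index, count):
    """Mimic the ADF expression range(startIndex, count).

    Returns `count` consecutive integers beginning at `start_index`.
    """
    return list(range(start_index, start_index + count))


# Assumed page count, purely for illustration.
total_number_of_pages = 5
pages = adf_range(1, total_number_of_pages)
print(pages)
```

Each element of the resulting array becomes one `item()` inside the ForEach, so the array has to start at 1 for the first iteration to request page 1.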

Inside the ForEach, use a Copy activity with a REST dataset as the source and a temporary location in ADLS as the sink.

For the demo I used a sample REST API. In your case, give your own URL as the base URL and use a dataset parameter for the relative URL.

[Screenshot: REST dataset settings]

Give rowset?$page=@{item()} as the value of that parameter in the source.

[Screenshot: Copy activity source settings]

Also use a dataset parameter for the file name of the JSON sink dataset (temporary location), and give file@{item()}.json as its dynamic content.
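
As a rough illustration of what the two dynamic-content strings evaluate to on each iteration, here is a small Python sketch. It is not ADF code. The split between base URL and relative URL shown here (the base ending in `GMC_Historique_ADF/`) is an assumption, and the helper names are hypothetical.

```python
# Base URL and path taken from the question; where the base ends and the
# relative URL begins is an assumption made for this sketch.
BASE_URL = (
    "https://mcXXXXXXXXXXXXXXX.rest.marketingcloudapis.com"
    "/data/v1/customobjectdata/key/GMC_Historique_ADF/"
)


def relative_url(page):
    """Equivalent of the dynamic content rowset?$page=@{item()}."""
    return f"rowset?$page={page}"


def sink_file_name(page):
    """Equivalent of the dynamic content file@{item()}.json."""
    return f"file{page}.json"


# Three pages are used here only as an example.
for page in range(1, 4):
    print(BASE_URL + relative_url(page), "->", sink_file_name(page))
```

Because the file name depends on the page number, each iteration writes its own file in the temporary location and no page overwrites another.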

Outside the ForEach, use another Copy activity. It merges all the individual page-response JSON files from the temporary location into a single JSON file in the target location.

In the source of this Copy activity, give the temporary location and * as the wildcard path.

[Screenshot: merge Copy activity source settings]

In the sink dataset, give your target file location, and set Copy behavior to Merge files. Also set File pattern to Array of objects. After execution this gives you the final JSON file.

[Screenshot: merge Copy activity sink settings]
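
For readers who want to picture the result of that merge step, the following Python sketch does something comparable outside ADF: it reads every file in a staging folder and writes the parsed contents as one JSON array. It assumes each staged file holds a single JSON document. `merge_pages` and the folder layout are hypothetical, and the sketch does not claim to reproduce ADF's exact output.

```python
import json
import tempfile
from pathlib import Path


def merge_pages(staging_dir, target_file):
    """Combine every file in staging_dir into one JSON array in target_file.

    Files are read in lexicographic name order (so file10.json sorts
    before file2.json). Returns the number of files merged.
    """
    merged = []
    for path in sorted(Path(staging_dir).glob("*")):
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                merged.append(json.load(handle))
    Path(target_file).write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return len(merged)


# Small demonstration with two made-up page responses.
with tempfile.TemporaryDirectory() as tmp:
    staging = Path(tmp) / "staging"
    staging.mkdir()
    for page in (1, 2):
        (staging / f"file{page}.json").write_text(
            json.dumps({"page": page, "items": []}), encoding="utf-8"
        )
    target = Path(tmp) / "merged.json"
    print(merge_pages(staging, target), "files merged")
```

The target file is kept outside the staging folder so that a later run does not pick up the merged output as if it were another page.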

Posted by huangapple on 2023-06-29 21:08:12
Original link: https://go.coder-hub.com/76581375.html