Is there a way to copy data from SharePoint On-Premises using Azure Data Factory or Azure Synapse Pipelines?
Question
Hello, I hope you are well.
There is a SharePoint On-Premises site that stores a few tables I would like to analyse and transform. Is it possible to connect Azure Data Factory or Azure Synapse Pipelines to SharePoint On-Premises in order to copy the data to Azure?
I looked in linked services for a SharePoint On-Premises connector, but I only found one for SharePoint lists. How else can I connect to this data source?
Answer 1
Score: 2
Yes. To copy data from SharePoint On-Premises using Azure Data Factory or Azure Synapse Pipelines, you will need the following:
- Application ID
- Application key
- Tenant ID
Open the SharePoint Online site link, e.g. https://[your_site_url]/_layouts/15/appinv.aspx (replace the site URL).
This page lets you grant permissions to your application ID. Fill in the fields as follows:
• App Domain: contoso.com
• Redirect URL: https://www.contoso.com
• Permission Request XML:
<AppPermissionRequests AllowAppOnlyPolicy="true">
<AppPermissionRequest Scope="http://sharepoint/content/sitecollection/web" Right="Read"/>
</AppPermissionRequests>
Go to Active Directory; the Overview section shows the Tenant ID.
Go to App registrations and create a new app registration to get the Client ID.
Go to Certificates & secrets and create a secret value. Copy all three values into a notepad for later use.
Go to the ADF or Synapse workspace.
Create a pipeline and add a Web activity.
URL: https://accounts.accesscontrol.windows.net/72f988bf-86f1-41af-91ab-2d7cd011db47/tokens/Oauth/2 (the GUID after the host name is the tenant ID; use your own)
Method: POST
Headers:
Content-Type: application/x-www-form-urlencoded
Body: grant_type=client_credentials&client_id=[Client-ID]@[Tenant-ID]&client_secret=[Client-Secret]&resource=00000003-0000-0ff1-ce00-000000000000/[Tenant-Name].sharepoint.com@[Tenant-ID]
Replace the Client ID, Tenant ID, Tenant Name and Client Secret with your configuration.
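For anyone who wants to check the values outside the pipeline first, the Web activity settings above can be assembled in a few lines of Python. This is only a sketch: the function name build_token_request and the sample values are made up for illustration, and the request is built but not sent.

```python
from urllib.parse import urlencode

# Fixed resource prefix taken from the Body line above.
SHAREPOINT_PRINCIPAL = "00000003-0000-0ff1-ce00-000000000000"


def build_token_request(tenant_id, tenant_name, client_id, client_secret):
    """Return (url, headers, body) matching the Web activity settings.

    Hypothetical helper for illustration only; it does not call the endpoint.
    """
    url = f"https://accounts.accesscontrol.windows.net/{tenant_id}/tokens/Oauth/2"
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # urlencode percent-encodes '@' and '/', which is the standard encoding
    # for application/x-www-form-urlencoded bodies.
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": f"{client_id}@{tenant_id}",
        "client_secret": client_secret,
        "resource": f"{SHAREPOINT_PRINCIPAL}/{tenant_name}.sharepoint.com@{tenant_id}",
    })
    return url, headers, body


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    url, headers, body = build_token_request(
        "tenant-guid", "contoso", "client-guid", "client-secret"
    )
    print(url)
    print(body)
```

Posting this body to the URL with the given header should be equivalent to what the Web activity does; the response is expected to contain the token used in the later steps.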
Next, build the URL that requests the file. It looks something like this:
https://[site-url]/_api/web/GetFileByServerRelativeUrl('[relative-path-to-file]')/$value
Replace the site URL and the relative path to the file. The relative path should again start with the site path, followed by the folder structure down to the file.
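As an illustration of how the two placeholders fit together, here is a small Python sketch that fills in the URL template. The helper name build_file_url, the contoso host and the file name report.csv are assumptions for the example, not values from the original setup.

```python
from urllib.parse import quote


def build_file_url(site_url, relative_path):
    """Fill in the GetFileByServerRelativeUrl template shown above.

    Hypothetical helper. Spaces and other special characters in the path are
    percent-encoded; paths containing single quotes are not handled here.
    """
    encoded_path = quote(relative_path, safe="/")
    return (
        f"{site_url.rstrip('/')}/_api/web/"
        f"GetFileByServerRelativeUrl('{encoded_path}')/$value"
    )


if __name__ == "__main__":
    print(build_file_url(
        "https://contoso.sharepoint.com/teams/sharepointaccess",
        "/teams/sharepointaccess/Shared Documents/report.csv",
    ))
```

Note how the relative path repeats the site path (/teams/sharepointaccess) before the library and file name, as described above.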
Now go back to the ADF or Synapse pipeline and create a new linked service of type HTTP:
Give the linked service a name, such as sharepoint.
Base URL: something like microsoft.sharepoint.com
Disable certificate validation.
Authentication: Anonymous
Create a Copy activity.
Source:
In the source, create a dataset of type HTTP with the Binary format.
Choose the HTTP linked service you created.
The relative URL will be something like this:
/teams/sharepointaccess/Shared%20Documents/Forms/AllItems.aspx
In the source's Additional headers field, add dynamic content that concatenates the authorization prefix with the token from the Web activity output, for example @{concat('Authorization: Bearer ', activity('<Web-activity-name>').output.access_token)}.
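To make the header step concrete, the following Python sketch shows roughly what the Copy activity source ends up sending: a GET request to the file URL with a Bearer token taken from the token response. It assumes the token response is JSON with an access_token field; build_download_request is a hypothetical name, and the request is only constructed, not executed.

```python
import json
from urllib.request import Request


def build_download_request(file_url, token_response_text):
    """Build a GET request for the file, authorised with the access token.

    Hypothetical helper: token_response_text is assumed to be the JSON text
    returned by the token request, containing an "access_token" field.
    """
    token = json.loads(token_response_text)["access_token"]
    return Request(file_url, headers={"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    # Placeholder token response and file URL for illustration.
    sample_response = '{"token_type": "Bearer", "access_token": "abc123"}'
    request = build_download_request(
        "https://contoso.sharepoint.com/teams/sharepointaccess/_api/web/"
        "GetFileByServerRelativeUrl('/teams/sharepointaccess/"
        "Shared%20Documents/report.csv')/$value",
        sample_response,
    )
    print(request.get_method(), request.full_url)
    print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen and writing the response bytes to a file would mirror the binary copy the pipeline performs.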
Sink:
On the sink side, create a dataset for ADLS Gen2 in the Binary format.
Attach the linked service for the sink.
You can now trigger the pipeline.
Docs referred:
You can also use the Microsoft documentation for reference.