How can I configure Azure Data Factory to create Table Storage tables during a copy activity (using SAS token)?
Question
I'm creating a Data Factory pipeline to copy data from Blob Storage to Table Storage. The target table may or may not exist in Table Storage, and it needs to be created if it doesn't exist. The authentication method for my Table Storage linked service must be a SAS token (whose value I'm grabbing from Key Vault).
Everything about my linked service configuration and pipeline works fine, except that the pipeline fails if the table sink doesn't exist. I've tried a similar configuration using account key authentication, which works, but I'm looking for a way to do this with SAS token authentication.
Current Configuration
This is my Linked Service configuration for the table storage account:
{
    "name": "Table Storage Account",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "parameters": {
            "StorageAccountName": {
                "type": "string"
            },
            "TableName": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "AzureTableStorage",
        "typeProperties": {
            "sasUri": "https://@{linkedService().StorageAccountName}.table.core.windows.net/@{linkedService().TableName}",
            "sasToken": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "Key Vault",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@{linkedService().StorageAccountName}-sas",
                    "type": "Expression"
                }
            }
        }
    }
}
These are the SAS Token settings:
sv=2021-12-02&ss=t&srt=sco&sp=rwlacu&se=2023-05-02T03:00:00Z&st=2023-04-19T16:09:39Z&spr=https&sig=[REDACTED]
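For reference, the fields in that query string are what determine whether the token can create tables at all: ss=t scopes it to the Table service, srt=sco grants service-, container-, and object-level access, and sp=rwlacu includes c (Create). A quick stdlib sketch that decodes the example token above (sig value replaced, as in the question):

```python
from urllib.parse import parse_qs

# The example account SAS from the question, with the signature redacted.
sas = "sv=2021-12-02&ss=t&srt=sco&sp=rwlacu&se=2023-05-02T03:00:00Z&st=2023-04-19T16:09:39Z&spr=https&sig=REDACTED"

fields = {k: v[0] for k, v in parse_qs(sas).items()}

# ss=t    -> token is valid for the Table service
# srt=sco -> valid at service, container, and object scope
# sp      -> permission characters; 'c' is Create, needed for table creation
assert "t" in fields["ss"]
assert set("sco") <= set(fields["srt"])
assert "c" in fields["sp"]
print("SAS grants table-service create permission")
```

So the token itself is not the missing piece here; the create permission is already present.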
This is the Data Set configuration used by the Copy Data activity:
{
    "name": "StorageTable",
    "properties": {
        "description": "Dataset for the azure table account.",
        "linkedServiceName": {
            "referenceName": "Table Storage Account",
            "type": "LinkedServiceReference",
            "parameters": {
                "StorageAccountName": {
                    "value": "@dataset().StorageAccountName",
                    "type": "Expression"
                },
                "TableName": {
                    "value": "@dataset().TableName",
                    "type": "Expression"
                }
            }
        },
        "parameters": {
            "StorageAccountName": {
                "type": "string"
            },
            "TableName": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "AzureTable",
        "schema": [],
        "typeProperties": {
            "tableName": {
                "value": "@dataset().TableName",
                "type": "Expression"
            }
        }
    },
    "type": "Microsoft.DataFactory/factories/datasets"
}
This is the sink configuration for the Copy Data activity (though nothing seems relevant here):
"sink": {
    "type": "AzureTableSink",
    "azureTableInsertType": "replace",
    "azureTablePartitionKeyName": {
        "value": "PartitionKey",
        "type": "Expression"
    },
    "azureTableRowKeyName": {
        "value": "RowKey",
        "type": "Expression"
    },
    "writeBatchSize": 10000
}
With this configuration, all connections can be validated successfully in the portal, but ADF won't create a table if it doesn't exist.
Example Error Message
ErrorCode=FailedStorageOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A storage operation failed with the following error '0:The table specified does not exist.
RequestId:0c4fc844-d002-0001-0d87-738fd1000000
Time:2023-04-20T12:58:38.1032528Z'.,Source=,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=0:The table specified does not exist.
RequestId:0c4fc844-d002-0001-0d87-738fd1000000
Time:2023-04-20T12:58:38.1032528Z,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=0:The table specified does not exist.
RequestId:0c4fc844-d002-0001-0d87-738fd1000000
Time:2023-04-20T12:58:38.1032528Z,,'
Other Attempts
Update SAS URI
I changed the "sasUri" to https://@{linkedService().StorageAccountName}.table.core.windows.net
(removing the table name), hoping that with the base account URI, Data Factory would figure out which REST URL paths to use depending on the operation.
This change broke the connection validation in the portal (which I ignored for testing purposes), but the pipeline still worked (probably because the table name was still provided in the dataset). Unfortunately, it still did not create tables that didn't exist.
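For context on why the scoping of the sasUri might matter: the Table service's Create Table REST operation is a POST against the account-level Tables resource, not against the table's own path, with the SAS appended as the query string. A minimal stdlib sketch of how that request is shaped (the account name, table name, and SAS value below are placeholders, and actually sending the request would of course require a live storage account):

```python
import json
from urllib.request import Request


def build_create_table_request(account: str, table: str, sas: str) -> Request:
    """Shape the Table service 'Create Table' REST call.

    The request targets the account-level /Tables resource rather than a
    table-scoped URL, which may be why a sasUri scoped to a single table
    is not enough when the table does not exist yet.
    """
    url = f"https://{account}.table.core.windows.net/Tables?{sas}"
    body = json.dumps({"TableName": table}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json;odata=nometadata",
            "Content-Type": "application/json",
        },
    )


req = build_create_table_request("mystorageacct", "MyTable", "sv=...&sig=...")
print(req.full_url)
# https://mystorageacct.table.core.windows.net/Tables?sv=...&sig=...
```

Sending this request with urllib.request.urlopen against a real account (and a real SAS with create permission) returns 201 Created when the table is created.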
Answer 1
Score: 2
> How can I configure Azure Data Factory to create Table Storage tables during a copy activity (using SAS token)?

Follow the process below to create Table Storage tables during a copy activity with a SAS token (and check whether any of these steps were missed):

- First, grab the SAS token with the settings below from Storage account >> Security + networking >> Shared access signature, then generate and copy the SAS token.
- Create a key vault to store this SAS token as a secret. In the key vault, add a secret with the name storage_account_name-sas and the SAS token as its value.
- Now create the linked service with the above values and some linked service parameters. Since linked service parameters were created here for the storage account name and table name, the SAS URL was provided as https://@{linkedService().StorageAccountName}.table.core.windows.net/@{linkedService().TableName}
  Linked service settings:
- Now create the dataset for Azure Table Storage with the above linked service, then create dataset parameters for the storage account name and table name. Add these parameters to the respective linked service properties and to the table name.
- Now create the pipeline, add a Copy activity to it, and select the blob file as the source. Then pass values to the dataset properties for the storage account name and the table name, such as sampleblob5, which does not exist in Table Storage.

Pipeline ran successfully:
Output:
Before running the pipeline (the sampleblob5 table does not exist):
After running the pipeline (the Copy activity created the sampleblob5 table):
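The answer's before/after verification can also be done outside the portal: the Tables('name') OData path addresses a single table and answers with 200 if it exists, 404 if not. A small stdlib sketch (the account name, table name, and SAS below are placeholders; calling table_exists needs a live account):

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def table_status_url(account: str, table: str, sas: str) -> str:
    # The Tables('name') OData path addresses one table in the account.
    return f"https://{account}.table.core.windows.net/Tables('{table}')?{sas}"


def table_exists(account: str, table: str, sas: str) -> bool:
    req = Request(
        table_status_url(account, table, sas),
        headers={"Accept": "application/json;odata=nometadata"},
    )
    try:
        with urlopen(req) as resp:
            return resp.status == 200
    except HTTPError as err:
        if err.code == 404:  # the table does not exist
            return False
        raise


# Placeholder values; only prints the URL that would be queried.
print(table_status_url("mystorageacct", "sampleblob5", "sv=...&sig=..."))
```

Running this before and after the pipeline (with real values) reproduces the check shown in the answer's screenshots: False before the run, True after the Copy activity creates the table.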