How can I configure Azure Data Factory to create Table Storage tables during a copy activity (using SAS token)?

Question


I'm creating a Data Factory pipeline to copy data from Blob Storage to Table Storage. The table may or may not already exist in Table Storage and needs to be created if it doesn't. The authentication method for my Table Storage linked service must be a SAS token (the value of which I'm grabbing from Key Vault).

Everything about my linked service configuration and pipeline works fine, except that the pipeline fails if the table sink doesn't exist. I've tried a similar configuration using Account Key authentication that works, but I'm looking for a way to do this with SAS token authentication.
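
For comparison, the Account Key variant of an AzureTableStorage linked service looks roughly like this (a minimal sketch with placeholder values; the connection string could equally be referenced from Key Vault):

{
    "name": "Table Storage Account (Account Key)",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "annotations": [],
        "type": "AzureTableStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<storage-account>;AccountKey=<account-key>;EndpointSuffix=core.windows.net"
        }
    }
}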

Current Configuration

This is my Linked Service configuration for the table storage account:

{
    "name": "Table Storage Account",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "parameters": {
            "StorageAccountName": {
                "type": "string"
            },
            "TableName": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "AzureTableStorage",
        "typeProperties": {
            "sasUri": "https://@{linkedService().StorageAccountName}.table.core.windows.net/@{linkedService().TableName}",
            "sasToken": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "Key Vault",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@{linkedService().StorageAccountName}-sas",
                    "type": "Expression"
                }
            }
        }
    }
}

These are the SAS Token settings:

sv=2021-12-02&ss=t&srt=sco&sp=rwlacu&se=2023-05-02T03:00:00Z&st=2023-04-19T16:09:39Z&spr=https&sig=[REDACTED]
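
For reference, these account SAS parameters break down as follows (annotations added here; they are not part of the token):

sv=2021-12-02             storage service version used to sign the token
ss=t                      allowed services: Table only
srt=sco                   allowed resource types: service, container, object
sp=rwlacu                 permissions: read, write, list, add, create, update
se=2023-05-02T03:00:00Z   expiry time (UTC)
st=2023-04-19T16:09:39Z   start time (UTC)
spr=https                 HTTPS only
sig=[REDACTED]            signature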

This is the Data Set configuration used by the Copy Data activity:

{
    "name": "StorageTable",
    "properties": {
        "description": "Dataset for the azure table account.",
        "linkedServiceName": {
            "referenceName": "Table Storage Account",
            "type": "LinkedServiceReference",
            "parameters": {
                "StorageAccountName": {
                    "value": "@dataset().StorageAccountName",
                    "type": "Expression"
                },
                "TableName": {
                    "value": "@dataset().TableName",
                    "type": "Expression"
                }
            }
        },
        "parameters": {
            "StorageAccountName": {
                "type": "string"
            },
            "TableName": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "AzureTable",
        "schema": [],
        "typeProperties": {
            "tableName": {
                "value": "@dataset().TableName",
                "type": "Expression"
            }
        }
    },
    "type": "Microsoft.DataFactory/factories/datasets"
}

This is the sink configuration for the Copy Data activity (though nothing seems relevant here):

"sink": {
    "type": "AzureTableSink",
    "azureTableInsertType": "replace",
    "azureTablePartitionKeyName": {
        "value": "PartitionKey",
        "type": "Expression"
    },
    "azureTableRowKeyName": {
        "value": "RowKey",
        "type": "Expression"
    },
    "writeBatchSize": 10000
}

With this configuration, all connections can be validated successfully in the portal, but ADF won't create a table if it doesn't exist.

Example Error Message

ErrorCode=FailedStorageOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A storage operation failed with the following error '0:The table specified does not exist.
RequestId:0c4fc844-d002-0001-0d87-738fd1000000
Time:2023-04-20T12:58:38.1032528Z'.,Source=,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=0:The table specified does not exist.
RequestId:0c4fc844-d002-0001-0d87-738fd1000000
Time:2023-04-20T12:58:38.1032528Z,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=0:The table specified does not exist.
RequestId:0c4fc844-d002-0001-0d87-738fd1000000
Time:2023-04-20T12:58:38.1032528Z,,'

Other Attempts

Update SAS URI

I changed the "sasUri" to https://@{linkedService().StorageAccountName}.table.core.windows.net (removing the table name), hoping that with the base account URI, data factory would figure out which REST URL paths to use depending on the operation.
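
In other words, the linked service typeProperties became (everything else unchanged):

"typeProperties": {
    "sasUri": "https://@{linkedService().StorageAccountName}.table.core.windows.net",
    "sasToken": {
        "type": "AzureKeyVaultSecret",
        "store": {
            "referenceName": "Key Vault",
            "type": "LinkedServiceReference"
        },
        "secretName": {
            "value": "@{linkedService().StorageAccountName}-sas",
            "type": "Expression"
        }
    }
}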

This change broke the connection validation in the portal (which I ignored for testing purposes), but the pipeline still worked fine (probably because the table name was still provided in the dataset). Unfortunately, it still did not create tables that didn't exist.

Answer 1

Score: 2


> How can I configure Azure Data Factory to create Table Storage tables during a copy activity (using SAS token)?

Follow the process below to create Table Storage tables during a copy activity using a SAS token, and check whether you missed any of these steps:

  • First, grab the SAS token from the storage account >> Security + networking >> Shared access signature blade with the settings shown in the original screenshots (not reproduced here), then generate and copy the SAS token.

  • Create a Key Vault to store this SAS token as a secret. In this Key Vault, create a secret named storage_account_name-sas with the SAS token as its value.

  • Now create the linked service with the above values and some linked service parameters.
    Since linked service parameters were created here for the storage account name and table name, I provided the SAS URL as https://@{linkedService().StorageAccountName}.table.core.windows.net/@{linkedService().TableName}

Linked service settings: (screenshot not reproduced)

  • Now create a dataset for Azure Table Storage using the above linked service, and create dataset parameters for the storage account name and table name. Then map these parameters to the corresponding linked service properties and to the table name.

  • Now create a pipeline, add a Copy activity to it, and select the blob file as the source. Then pass values for the dataset properties: the storage account name, and a table name such as sampleblob5, which does not exist in Table Storage (see the sketch after this list).
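
Since the screenshots are not reproduced here, a rough sketch of how the sink dataset reference with those parameter values might look inside the Copy activity JSON (the storage account name is a placeholder):

"outputs": [
    {
        "referenceName": "StorageTable",
        "type": "DatasetReference",
        "parameters": {
            "StorageAccountName": "<storage-account-name>",
            "TableName": "sampleblob5"
        }
    }
]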

The pipeline ran successfully (screenshot not reproduced).

Output:

Before running the pipeline, the sampleblob5 table does not exist (screenshot not reproduced).

After running the pipeline, the Copy activity has created the sampleblob5 table (screenshot not reproduced).
