Create a container on Azure Data Lake Gen2

Question

I am trying to create a container on Azure Data Lake Gen2 using Python code run from Databricks.

I have tried many variations of the code and got different errors. One example is below.
The code (copied from stackoverflow.com) is used as is:

from azure.storage.filedatalake import DataLakeServiceClient

service_client = DataLakeServiceClient(account_url="{}://{}.dfs.core.windows.net".format(
           "https", storage_account_name), credential=my_secret_key)

# With library version 12.3.0, get_file_system_client does not raise an error if the file system does not exist
file_system_client = service_client.get_file_system_client("mytestfolder")

print("the file system exists: " + str(file_system_client.exists()))

# Create the file system if it does not exist
if not file_system_client.exists():
    file_system_client.create_file_system()
    print("the file system is created.")

The error I am getting with this is:

AzureSigningError: Invalid base64-encoded string: number of data characters (37) cannot be 1 more than a multiple of 4

I am looking to fix the above code, or for any other piece of Python / PySpark code that creates containers on Gen2 storage.
I also tried the code below, with the same error as above.

from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

container_name = 'test'
storage_connection_string = f'DefaultEndpointsProtocol=https;AccountName={acname};AccountKey={ackey};EndpointSuffix=core.windows.net;'
blob_service_client = BlobServiceClient.from_connection_string(storage_connection_string)
blob_service_client.create_container(container_name, public_access='container', timeout=10)

Any help would be appreciated.
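
A note on the error message quoted in the question: its wording matches what Python's standard base64 decoder raises when the number of data characters is one more than a multiple of 4. With shared-key authentication the SDK has to base64-decode the account key before it can sign requests, so one plausible cause (an assumption, since the question does not show the key) is that the value passed as the credential is not a storage account access key, for example a client secret or a truncated copy. The helper name `looks_like_account_key` below is hypothetical; the sketch uses only the standard library.

```python
import base64
import binascii


def looks_like_account_key(value: str) -> bool:
    """Return True if `value` decodes as strict, non-empty base64.

    This only checks the encoding; it cannot tell whether the key
    actually belongs to a given storage account.
    """
    try:
        decoded = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        # Wrong length, wrong padding, or characters outside the alphabet.
        return False
    return len(decoded) > 0


if __name__ == "__main__":
    # A 37-character value cannot be valid base64, whatever it contains.
    try:
        base64.b64decode("a" * 37)
    except binascii.Error as exc:
        print("decoder says:", exc)

    print(looks_like_account_key("a" * 37))
```

If the check fails for the value passed as `credential`, it may be worth re-copying the access key from the storage account and confirming that it was not cut short or swapped for a different kind of secret.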

Answer 1

Score: 0

Could you try the sample below to create a container (file system) in an ADLS Gen2 account and check whether it helps?

import os
import random

from azure.storage.filedatalake import (
    DataLakeServiceClient,
)

def run():
    account_name = os.getenv('STORAGE_ACCOUNT_NAME', "MyStorageadlsgen2")
    account_key = os.getenv('STORAGE_ACCOUNT_KEY', "R/puXXXXXXXXXXXXXXXXfSLo2PiqPXf4ltj+CUs2yg==")

    # set up the service client with the credentials from the environment variables
    service_client = DataLakeServiceClient(account_url="{}://{}.dfs.core.windows.net".format(
        "https",
        account_name
    ), credential=account_key)

    # generate a random name for testing purpose
    fs_name = "testfs{}".format(random.randint(1, 1000))
    print("Generating a test filesystem named '{}'.".format(fs_name))

    # create the filesystem
    filesystem_client = service_client.create_file_system(file_system=fs_name)

    print("ADLS filesystem created successfully")


if __name__ == '__main__':
    run()
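
If the secret used in the question is in fact a service principal (app registration) client secret rather than an account key, which is only a guess based on the error, an alternative is to authenticate through the azure-identity package instead of a shared key. The sketch below assumes that `azure-identity` and `azure-storage-file-datalake` are both installed and that the principal holds a data-plane role such as Storage Blob Data Contributor on the account. The environment variable names and the helpers `build_account_url` and `create_container_with_service_principal` are made up for illustration.

```python
import os


def build_account_url(account_name: str) -> str:
    """Return the ADLS Gen2 (dfs) endpoint for a storage account name."""
    return "https://{}.dfs.core.windows.net".format(account_name)


def create_container_with_service_principal(account_name: str, fs_name: str):
    """Create a file system (container) using a service principal.

    Assumption: AZURE_TENANT_ID, AZURE_CLIENT_ID and AZURE_CLIENT_SECRET
    are illustrative variable names holding the principal's details.
    """
    # Imported here so the URL helper above works without the Azure SDK.
    from azure.identity import ClientSecretCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    credential = ClientSecretCredential(
        tenant_id=os.environ["AZURE_TENANT_ID"],
        client_id=os.environ["AZURE_CLIENT_ID"],
        client_secret=os.environ["AZURE_CLIENT_SECRET"],
    )
    service_client = DataLakeServiceClient(
        account_url=build_account_url(account_name),
        credential=credential,
    )
    # Same call as in the sample above; only the credential type differs.
    return service_client.create_file_system(file_system=fs_name)
```

Calling `create_container_with_service_principal("mystorageaccount", "testfs1")` would then attempt to create the container, assuming the role assignment is already in place; both argument values here are placeholders.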

Posted by huangapple on 2023-05-17 08:48:25
When reposting, please keep the link to this article: https://go.coder-hub.com/76267913.html