Create a container on Azure Data Lake Gen2

Question

I am trying to create a container on Azure Data Lake Gen2 using Python code run from Databricks.

I have tried many variations of the code and got different errors. One example is below. The code I used (copied from stackoverflow.com) is:

    from azure.storage.filedatalake import DataLakeServiceClient

    service_client = DataLakeServiceClient(
        account_url="{}://{}.dfs.core.windows.net".format("https", storage_account_name),
        credential=my_secret_key,
    )

    # the get_file_system_client method will not throw an error if the file system
    # does not exist, if you're using the latest library 12.3.0
    file_system_client = service_client.get_file_system_client("mytestfolder")
    print("the file system exists: " + str(file_system_client.exists()))

    # create the file system if it does not exist
    if not file_system_client.exists():
        file_system_client.create_file_system()
        print("the file system is created.")

The error I get with this is:

    AzureSigningError: Invalid base64-encoded string: number of data characters (37) cannot be 1 more than a multiple of 4

I am looking to fix the above code, or for any other piece of Python / PySpark code that creates containers on Gen2 storage.
I also tried the code below and got the same error as above.

    from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

    container_name = 'test'
    storage_connection_string = f'DefaultEndpointsProtocol=https;AccountName={acname};AccountKey={ackey};EndpointSuffix= core.windows.net;'
    blob_service_client = BlobServiceClient.from_connection_string(storage_connection_string)
    blob_service_client.create_container(container_name, public_access='CONTAINER', timeout=10)

Any help would be appreciated.

Answer 1

Score: 0

Could you try the sample below to create a container (file system) in your ADLS Gen2 account and check whether it helps?

    import os
    import random

    from azure.storage.filedatalake import DataLakeServiceClient


    def run():
        # read the credentials from environment variables; the defaults are placeholders
        account_name = os.getenv('STORAGE_ACCOUNT_NAME', "MyStorageadlsgen2")
        account_key = os.getenv('STORAGE_ACCOUNT_KEY', "R/puXXXXXXXXXXXXXXXXfSLo2PiqPXf4ltj+CUs2yg==")

        # set up the service client with the credentials from the environment variables
        service_client = DataLakeServiceClient(
            account_url="{}://{}.dfs.core.windows.net".format("https", account_name),
            credential=account_key,
        )

        # generate a random name for testing purposes
        fs_name = "testfs{}".format(random.randint(1, 1000))
        print("Generating a test filesystem named '{}'.".format(fs_name))

        # create the filesystem
        filesystem_client = service_client.create_file_system(file_system=fs_name)
        print("ADLS filesystem created successfully")


    if __name__ == '__main__':
        run()
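
The wording of the error in the question ("Invalid base64-encoded string: number of data characters (37) ...") is the message Python's base64 decoder produces, so one possible cause, offered here as an assumption rather than a confirmed diagnosis, is that the value passed as the credential is not a storage account access key. Access keys are base64 strings (typically 88 characters ending in `==`), whereas a 37-character value cannot be valid base64. A minimal sketch for checking a value before handing it to the client is below; the helper name `looks_like_account_key` is hypothetical and is not part of the Azure SDK.

```python
import base64
import binascii


def looks_like_account_key(key: str) -> bool:
    """Return True if `key` is non-empty and decodes as strict base64.

    This only checks the encoding; it cannot tell whether the key is
    actually valid for a given storage account.
    """
    if not key:
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet
        base64.b64decode(key, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


# A 37-character value, like the one implied by the error message, is rejected.
print(looks_like_account_key("x" * 37))

# A stand-in for a real key: 64 zero bytes encoded as base64 (88 characters).
fake_key = base64.b64encode(bytes(64)).decode()
print(len(fake_key), looks_like_account_key(fake_key))
```

If the check fails for the value read in Databricks, it may be worth confirming that the secret holds one of the account's access keys rather than some other identifier or secret.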

huangapple
  • Published on May 17, 2023 at 08:48:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76267913.html