Create container on Azure datalake Gen2
Question
I am trying to create a container on Azure Data Lake Gen2 using Python code running on Databricks.
I have tried many variations of the code and got different errors. One example is below.
The code I used (copied as is from stackoverflow.com):
from azure.storage.filedatalake import DataLakeServiceClient

service_client = DataLakeServiceClient(
    account_url="{}://{}.dfs.core.windows.net".format("https", storage_account_name),
    credential=my_secret_key)

# the get_file_system_client method will not throw an error if the file system
# does not exist, if you're using the latest library 12.3.0
file_system_client = service_client.get_file_system_client("mytestfolder")
print("the file system exists: " + str(file_system_client.exists()))

# create the file system if it does not exist
if not file_system_client.exists():
    file_system_client.create_file_system()
    print("the file system is created.")
The error I get with this is:
AzureSigningError: Invalid base64-encoded string: number of data characters (37) cannot be 1 more than a multiple of 4
I am looking to fix the above code, or for any other piece of Python / PySpark code that creates containers on Gen2 storage.
I also tried the code below, with the same error as above.
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

container_name = 'test'
storage_connection_string = f'DefaultEndpointsProtocol=https;AccountName={acname};AccountKey={ackey};EndpointSuffix=core.windows.net;'
blob_service_client = BlobServiceClient.from_connection_string(storage_connection_string)
blob_service_client.create_container(container_name, public_access='CONTAINER', timeout=10)
Any help would be appreciated.
Answer 1
Score: 0
Could you please try the sample below to create a container (file system) in the ADLS Gen2 account and check whether it helps?
import os
import random
import uuid

from azure.storage.filedatalake import (
    DataLakeServiceClient,
)


def run():
    account_name = os.getenv('STORAGE_ACCOUNT_NAME', "MyStorageadlsgen2")
    account_key = os.getenv('STORAGE_ACCOUNT_KEY', "R/puXXXXXXXXXXXXXXXXfSLo2PiqPXf4ltj+CUs2yg==")

    # set up the service client with the credentials from the environment variables
    service_client = DataLakeServiceClient(account_url="{}://{}.dfs.core.windows.net".format(
        "https",
        account_name
    ), credential=account_key)

    # generate a random name for testing purpose
    fs_name = "testfs{}".format(random.randint(1, 1000))
    print("Generating a test filesystem named '{}'.".format(fs_name))

    # create the filesystem
    filesystem_client = service_client.create_file_system(file_system=fs_name)
    print("ADLS filesystem created successfully")


if __name__ == '__main__':
    run()
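
A note on the error in the question: "Invalid base64-encoded string: number of data characters (37)" suggests that the value passed as the credential is not a valid base64 string, so the SDK cannot use it to sign requests. A storage account key (the kind shown in the sample above) is base64-encoded, whereas a 37-character value may well be something else, for example a service principal client secret, which would need a credential such as ClientSecretCredential from azure-identity instead. That is a guess, since the question does not say where my_secret_key comes from. Below is a minimal sketch, assuming an account key is the intended credential, that checks the key before calling the SDK and treats an already existing file system as a non-error. The helper names looks_like_account_key and create_file_system_if_missing are hypothetical, not part of the Azure SDK.

```python
import base64
import binascii


def looks_like_account_key(value: str) -> bool:
    """Return True if value decodes as strict, non-empty base64.

    Hypothetical helper: it only checks that the string is valid base64,
    not that the key actually belongs to the storage account.
    """
    try:
        decoded = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(decoded) > 0


def create_file_system_if_missing(account_name: str, account_key: str, fs_name: str) -> bool:
    """Create the file system (container); return False if it already existed.

    Hypothetical wrapper around DataLakeServiceClient.create_file_system.
    """
    # Fail early with a clearer message than the signing error.
    if not looks_like_account_key(account_key.strip()):
        raise ValueError(
            "credential is not valid base64; is it really the storage account key?"
        )

    # Imported here so the key check above can be used without the SDK installed.
    from azure.core.exceptions import ResourceExistsError
    from azure.storage.filedatalake import DataLakeServiceClient

    service_client = DataLakeServiceClient(
        account_url="https://{}.dfs.core.windows.net".format(account_name),
        credential=account_key.strip(),
    )
    try:
        service_client.create_file_system(file_system=fs_name)
        return True
    except ResourceExistsError:
        return False
```

If the check fails, it is worth confirming which secret the Databricks scope actually holds (account key, SAS token, or client secret) before changing the container creation code itself.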