How to download a parquet "file" (actually directory) using the Azure Client?
Question
I am using az storage fs file download to download the contents of a parquet directory like this:
az storage fs file download
   --path myname/1/batch-repo/form/Fulfillment/2022/01/02/batch-form-Fulfillment.parquet/
   --account-name my-storage-account --f my-container
The download was attempted, but apparently the az CLI is not aware that this is a parquet directory and cannot handle it, either at the directory level or for individual files:
> ValueError: This pipeline didn't have the RawDeserializer policy; can't deserialize
Is there any workaround to download the contents of a parquet file?
Answer 1
Score: 1
After reproducing this on my end, I received the same error when downloading a directory using the same script as yours.

I can see that individual files are downloaded successfully with the script below.
az storage fs file download -f container --path dir1/part-00004-a9e77425-5fb4-456f-ba52-f821123bd193-c000.snappy.parquet --account-name <ACCOUNT_NAME> --account-key "<ACCOUNT_KEY>"
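Since the single-file command works, one possible workaround is to list the files under the directory and download them one at a time. The following is only a minimal sketch under stated assumptions: download_parquet_parts is a hypothetical helper name, ACCOUNT_NAME and ACCOUNT_KEY are assumed to be set as environment variables, and the parquet directory is assumed to be flat (part files only), because each file is saved under its base name.

```shell
#!/bin/sh
# Sketch: download every file under an ADLS Gen2 directory, one file at a time.
# Assumptions: ACCOUNT_NAME and ACCOUNT_KEY are set in the environment, and
# download_parquet_parts is a hypothetical helper, not part of the az CLI.
download_parquet_parts() {
    fs="$1"        # file system (container) name
    src_dir="$2"   # remote directory, e.g. dir1
    dest_dir="$3"  # local directory to write into

    mkdir -p "$dest_dir" || return 1

    # List only files (no sub-directories) under the remote directory,
    # printing one remote path per line.
    az storage fs file list -f "$fs" --path "$src_dir" --exclude-dir \
        --account-name "$ACCOUNT_NAME" --account-key "$ACCOUNT_KEY" \
        --query "[].name" -o tsv |
    while IFS= read -r remote_path; do
        [ -n "$remote_path" ] || continue
        # Reuse the single-file download for each listed path.
        # stdin is redirected so the command cannot consume the listing.
        az storage fs file download -f "$fs" --path "$remote_path" \
            --destination "$dest_dir/$(basename "$remote_path")" \
            --account-name "$ACCOUNT_NAME" --account-key "$ACCOUNT_KEY" \
            < /dev/null
    done
}
```

It could be called as download_parquet_parts container dir1 ./folder1. Files in nested sub-directories would collide if they share a base name, so this sketch is only suitable for a flat directory of part files.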
However, if you are trying to download at the directory level, you must use az storage fs directory download. Below is the complete script that worked for me.
az storage fs directory download -f container -d folder1 -s dir1 --account-name <ACCOUNT_NAME> --account-key "<ACCOUNT_KEY>"
Results:

Below is the structure of my files