Databricks dbutils.fs.ls() error Py4JSecurityException: Constructor public com.databricks.backend.daemon.dbutils.FSUtilsParallel is not whitelisted
Question
When executing dbutils.fs.ls() to list files in Databricks, I get the error:
py4j.security.Py4JSecurityException: Constructor public com.databricks.backend.daemon.dbutils.FSUtilsParallel(org.apache.spark.SparkContext) is not whitelisted.
I get the same error on every dbutils.fs.ls() call I try.
Unfortunately, setting the following Spark config did not help: spark.databricks.pyspark.enablePy4JSecurity false
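For context, my understanding (an assumption on my part) is that this key has to be set in the cluster's Spark config under Advanced options rather than from a notebook, so that is where I put it:

```
spark.databricks.pyspark.enablePy4JSecurity false
```

```python
# Read the flag back from a notebook to confirm it actually reached the cluster:
print(spark.conf.get("spark.databricks.pyspark.enablePy4JSecurity", "unset"))
```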
I have done some research; can someone advise whether any of the following would fix the issue?
- Enabling credential passthrough for standard and high-concurrency clusters.
- Configuring credential passthrough and initializing storage resources in ADLS accounts.
- Accessing ADLS resources directly when credential passthrough is enabled.
- Accessing ADLS resources through a mount point when credential passthrough is enabled (see the mount sketch after this list).
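For reference, this is roughly what I would try for the mount-point option, following the pattern in the Azure Databricks docs for credential passthrough mounts. The container, storage account, and mount point names are placeholders:

```python
# Mount ADLS Gen2 using credential passthrough (placeholders in angle brackets).
configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class":
        spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName"),
}

dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/passthrough",  # placeholder mount name
    extra_configs=configs,
)

# Then list through the mount point:
display(dbutils.fs.ls("/mnt/passthrough"))
```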
Thanks
Answer 1
Score: 1
As per the Microsoft documentation:
> The error you are facing occurs because, when using a High Concurrency cluster with credential passthrough enabled, this problem arises with various library operations.
> The issue is raised when you call a method that Azure Databricks has not expressly designated as safe for Azure Data Lake Storage credential passthrough clusters.
To work around this, you can:
- Use a different cluster mode, if that fits your situation.
- Set spark.databricks.pyspark.enableProcessIsolation to false; for this, your cluster needs to run with the No Isolation Shared access mode (a minimal sketch follows this list).
- Use standard clusters in your case as a workaround.
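A minimal sketch of the enableProcessIsolation workaround, assuming you can edit the cluster: add the line below to the cluster's Spark config (Compute > your cluster > Advanced options) and restart the cluster, with its access mode set to No Isolation Shared:

```
spark.databricks.pyspark.enableProcessIsolation false
```

After the restart you can verify the setting and retry the call that was failing from a notebook:

```python
# Verify the config took effect, then retry the listing that was failing.
print(spark.conf.get("spark.databricks.pyspark.enableProcessIsolation", "unset"))
for f in dbutils.fs.ls("/"):
    print(f.path)
```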