
Databricks - unable to write output to folder using key stored in Azure Key Vault

Question

I'm currently facing a weird issue.

I recently changed the way I handle secrets (the storage key) in my Databricks project, using Azure Key Vault and a secret scope.

When I load the input file from blob storage it works fine, but when I try to write the output I get the following error:

shaded.databricks.org.apache.hadoop.fs.azure.AzureException: Unable to access container analysis in account [REDACTED].blob.core.windows.net using anonymous credentials, and no credentials found for them in the configuration.

I did set the key at the Spark conf level: spark.conf.set("fs.azure.account.key.ACCOUNTNAME.blob.core.windows.net", dbutils.secrets.get('scope', 'STORAGE-KEY'))
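For reference, a minimal sketch of that session-level setup; the account, container, scope, and secret names are the placeholders from the snippets above, and the file paths are illustrative:

```python
# Session-level configuration inside a Databricks notebook (placeholder names).
account = "ACCOUNTNAME"
container = "analysis"

# Fetch the storage key from the Key Vault-backed secret scope.
storage_key = dbutils.secrets.get(scope="scope", key="STORAGE-KEY")

# Register the account key for this Spark session only.
spark.conf.set(f"fs.azure.account.key.{account}.blob.core.windows.net", storage_key)

base = f"wasbs://{container}@{account}.blob.core.windows.net"

# Reading the input works fine...
df = spark.read.csv(f"{base}/input.csv", header=True)

# ...but this write is where the AzureException is raised.
df.write.mode("overwrite").parquet(f"{base}/output/")
```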

Now, when I hardcode fs.azure.account.key.sparkstorageprod.blob.core.windows.net STORAGE-KEY in the Spark config section of the cluster configuration, it works fine and I'm able to write the output.

I would like to know how to avoid hardcoding the key value in the Spark config section of the cluster configuration, and rely only on the secret scope and Azure Key Vault.

Answer 1

Score: 0

I tried the same in my environment and was able to write to the storage account without any issue.

Recheck all of the secret scope setup steps.
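One quick way to do that is to confirm the scope and secret are actually visible from the workspace; a small sanity-check sketch using dbutils.secrets, where the scope and secret names are placeholders:

```python
# List every secret scope attached to this workspace.
for scope in dbutils.secrets.listScopes():
    print(scope.name)

# List the secret keys inside the expected scope (values stay hidden).
for secret in dbutils.secrets.list("scope"):
    print(secret.key)

# Fetch the secret; its value prints as [REDACTED], so check its length instead.
key = dbutils.secrets.get(scope="scope", key="STORAGE-KEY")
print(len(key))
```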

> I would like to know how to avoid hardcoding the key value in the Spark config section of the cluster configuration, and rely only on the secret scope and Azure Key Vault.

You can use the secret scope at the cluster level without hardcoding the secret, as below. Reference the scope and secret name inside {{ }}; note the spark.hadoop. prefix, which pushes the setting into the Hadoop configuration used by the storage driver.

Ex: spark.hadoop.fs.azure.account.key.<account_name>.blob.core.windows.net {{secrets/<secret-scope-name>/<secret-name>}}
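With that entry in the cluster's Spark config, the key is resolved from the secret scope when the cluster starts, so the notebook no longer needs any spark.conf.set call. A sketch, where the account, container, and paths are placeholders:

```python
# No spark.conf.set needed here: the cluster-level
# spark.hadoop.fs.azure.account.key... entry already supplies the key.
base = "wasbs://analysis@ACCOUNTNAME.blob.core.windows.net"

df = spark.read.csv(f"{base}/input.csv", header=True)
df.write.mode("overwrite").parquet(f"{base}/output/")
```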

My Demo:

Cluster configurations:

You can see that I am able to read and write after using the secret scope at the cluster level.
