Databricks file trigger - how to whitelist the storage firewall


Question



Recently, Databricks added a new feature: file triggers. However, this functionality seems to require the storage account to allow all network traffic.

My storage account has a firewall configured that denies traffic from unknown sources. The Databricks workspace is deployed to our internal network - we are using VNet injection. All the necessary subnets are whitelisted, and storage generally works fine, but not with a file trigger.
If I turn off the storage firewall, the file trigger works fine.
The external location and the Azure Databricks access connector are configured correctly.

The error I get:

> Invalid credentials for storage location abfss://<container>@<storage>.dfs.core.windows.net/. The credentials for the external location in the Unity Catalog cannot be used to read the files from the configured path. Please grant the required permissions.

Looking at the logs in my storage account, it appears the file trigger lists the storage account from a private IP address starting with 10.120.x.x.
How do I whitelist this service? I want to keep my storage behind the firewall.
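As a side note, one way to see which caller IPs and auth types the file-trigger service uses is to query the storage account's diagnostic logs. This is a sketch under the assumption that the storage account streams its `StorageBlobLogs` to a Log Analytics workspace; the workspace GUID is a placeholder:

```shell
# Query blob diagnostic logs for ListBlobs calls and the IPs making them.
# Assumes StorageBlobLogs are exported to Log Analytics; replace
# <workspace-guid> with your workspace's customer ID.
az monitor log-analytics query \
  --workspace "<workspace-guid>" \
  --analytics-query 'StorageBlobLogs
    | where TimeGenerated > ago(1h)
    | where OperationName == "ListBlobs"
    | summarize Requests = count() by CallerIpAddress, AuthenticationType
    | order by Requests desc' \
  --output table
```

This makes it easy to confirm that the listing calls really come from the 10.120.x.x range rather than from a whitelisted subnet.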

Answer 1

Score: 2

Update (3 April 2023): ADLS firewall support isn't available out of the box right now; work is in progress to solve this issue.

It's described in the documentation - you need to:

  • Create a managed identity by creating a Databricks Access Connector
  • Give this managed identity permission to access your storage account
  • Create a UC external location using the managed identity
  • Grant the access connector access to your storage account - in "Networking", select "Resource instances", then select a Resource type of Microsoft.Databricks/accessConnectors and select your Azure Databricks access connector.
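The steps above can be sketched with the Azure CLI. This is a hedged sketch, not the documented procedure verbatim: all resource names are placeholders, the `databricks` CLI extension is assumed to be installed, and the UC external location itself is still created from the Databricks UI or API.

```shell
# Sketch of the answer's steps using the Azure CLI.
# Requires the databricks extension: az extension add --name databricks
# myRG, mystorageacct, my-access-connector are placeholder names.

# 1. Create the Databricks Access Connector (system-assigned managed identity).
az databricks access-connector create \
  --resource-group myRG \
  --name my-access-connector \
  --location westeurope \
  --identity-type SystemAssigned

# 2. Give the managed identity permission on the storage account.
PRINCIPAL_ID=$(az databricks access-connector show \
  --resource-group myRG --name my-access-connector \
  --query identity.principalId -o tsv)
STORAGE_ID=$(az storage account show \
  --resource-group myRG --name mystorageacct --query id -o tsv)
az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "Storage Blob Data Contributor" \
  --scope "$STORAGE_ID"

# 3. Create the UC external location in Databricks (UI or API), using a
#    storage credential backed by this access connector's identity.

# 4. Allow the access connector through the storage firewall as a
#    resource instance (the "Resource instances" option in Networking).
CONNECTOR_ID=$(az databricks access-connector show \
  --resource-group myRG --name my-access-connector --query id -o tsv)
az storage account network-rule add \
  --resource-group myRG \
  --account-name mystorageacct \
  --resource-id "$CONNECTOR_ID" \
  --tenant-id "$(az account show --query tenantId -o tsv)"
```

The resource-instance rule in step 4 is what replaces "allow all network traffic": it lets this specific access connector through while the firewall continues to deny unknown sources.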

huangapple
  • Published on 2023-03-31 19:39:56
  • Please retain this link when reposting: https://go.coder-hub.com/75898118.html