Azure quota limit issue when deploying Databricks via Terraform

Question

I am trying to deploy a Databricks workspace in Azure and create a Single Node cluster. To do that, I use the following Terraform main.tf file:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.55.0"
    }
    databricks = {
      source  = "databricks/databricks"
      version = "1.0.0"
    }
  }
}

# Configure Azure provider
provider "azurerm" {
  features {}
}

# Configure Databricks provider
provider "databricks" {
  host = azurerm_databricks_workspace.databricks_workspace.workspace_url
}

# Create resource group
resource "azurerm_resource_group" "resource_group" {
  name     = var.resource_group_name
  location = var.resource_group_location
}

# Create Databricks workspace
resource "azurerm_databricks_workspace" "databricks_workspace" {
  location            = azurerm_resource_group.resource_group.location
  name                = "databricks-test-001"
  resource_group_name = azurerm_resource_group.resource_group.name
  sku                 = "standard"

  depends_on = [
    azurerm_resource_group.resource_group
  ]
}

# Create cluster
data "databricks_node_type" "smallest" {
  local_disk = true

  depends_on = [
    azurerm_databricks_workspace.databricks_workspace
  ]
}

data "databricks_spark_version" "latest_lts" {
  long_term_support = true

  depends_on = [
    azurerm_databricks_workspace.databricks_workspace
  ]
}

resource "databricks_cluster" "single_node" {
  cluster_name            = "Single Node"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 10

  spark_conf = {
    # Single-node
    "spark.databricks.cluster.profile" : "singleNode"
    "spark.master" : "local[*]"
  }

  custom_tags = {
    "ResourceClass" = "SingleNode"
  }

  depends_on = [
    azurerm_databricks_workspace.databricks_workspace
  ]
}

# Create Notebook
resource "databricks_notebook" "notebook" {
  content_base64 = base64encode("print('Welcome to Databricks-Labs notebook')")
  path           = "/Shared/Demo/demo_example_notebook"
  language       = "PYTHON"

  depends_on = [
    databricks_cluster.single_node,
    azurerm_databricks_workspace.databricks_workspace
  ]
}

When I apply the terraform plan, I get the following error:

*Required: 24, (Minimum) New Limit Required: 24. Submit a request for Quota increase at https://aka.ms/ProdportalCRP/#blade/Microsoft_Azure_Capacity/UsageAndQuota.ReactView/Parameters/%7B%22subscriptionId%22:%22a9f6a84e-aa76-4493-ad46-7335d8bc7ea5%22,%22command%22:%22openQuotaApprovalBlade%22,%22quotas%22:[%7B%22location%22:%22westus%22,%22providerId%22:%22Microsoft.Compute%22,%22resourceName%22:%22StandardNCADSA100v4Family%22,%22quotaRequest%22:%7B%22properties%22:%7B%22limit%22:24,%22unit%22:%22Count%22,%22name%22:%7B%22value%22:%22StandardNCADSA100v4Family%22%7D%7D%7D%7D]%7D by specifying parameters listed in the 'Details' section for deployment to succeed. Please read more about quota limits at https://docs.microsoft.com/en-us/azure/azure-supportability/per-vm-quota-requests databricks_error_message: Error code: QuotaExceeded, error message: Operation could not be completed as it results in exceeding approved StandardNCADSA100v4Family Cores quota. Additional details - Deployment Model: Resource Manager, Location: westus, Current Limit: 0, Current Usage: 0, Additional Required: 24, (Minimum) New Limit Required: 24. Submit a request for Quota increase at https://aka.ms/ProdportalCRP/#blade/Microsoft_Azure_Capacity/UsageAndQuota.ReactView/Parameters/%7B%22subscriptionId%22:%22a9f6a84e-aa76-4493-ad46-7335d8bc7ea5%22,%22command%22:%22openQuotaApprovalBlade%22,%22quotas%22:[%7B%22location%22:%22westus%22,%22providerId%22:%22Microsoft.Compute%22,%22resourceName%22:%22StandardNCADSA100v4Family%22,%22quotaRequest%22:%7B%22properties%22:%7B%22limit%22:24,%22unit%22:%22Count%22,%22name%22:%7B%22value%22:%22StandardNCADSA100v4Family%22%7D%7D%7D%7D]%7D by specifying parameters listed in the 'Details' section for deployment to succeed. Please read more about quota limits at https://docs.microsoft.com/en-us/azure/azure-supportability/per-vm-quota-requests. Please see https://docs.databricks.com/dev-tools/api/latest/clusters.html#clusterclusterstate for more details*


I understand that the quota limit needs to be increased, and I have submitted a request, which was denied. I have also tried deploying in different Azure regions, but that failed for the same reason. What I don't understand is that if I go to the created Databricks workspace and create a Single Node cluster from there, it works with no issues at all. To me this suggests that the quota limit might not be the actual problem. I would appreciate any suggestions about what the issue might be.
# Answer 1
**Score**: 0
I have checked with the following code and tried creating a Single Node cluster.

**Main.tf:**

```hcl
resource "azurerm_databricks_workspace" "databricks_workspace" {
  location            = data.azurerm_resource_group.example.location
  name                = "databricks-test-001"
  resource_group_name = data.azurerm_resource_group.example.name
  sku                 = "standard"
}

data "databricks_current_user" "me" {
  depends_on = [azurerm_databricks_workspace.databricks_workspace]
}

data "databricks_spark_version" "latest" {
  depends_on = [
    azurerm_databricks_workspace.databricks_workspace
  ]
}

data "databricks_spark_version" "latest_lts" {
  long_term_support = true

  depends_on = [
    azurerm_databricks_workspace.databricks_workspace
  ]
}

data "databricks_node_type" "smallest" {
  local_disk = true

  depends_on = [
    azurerm_databricks_workspace.databricks_workspace
  ]
}

resource "databricks_cluster" "single_node" {
  cluster_name            = "Single Node"
  spark_version           = data.databricks_spark_version.latest_lts.id
  node_type_id            = data.databricks_node_type.smallest.id
  autotermination_minutes = 10

  spark_conf = {
    # Single-node
    "spark.databricks.cluster.profile" : "singleNode"
    "spark.master" : "local[*]"
  }

  custom_tags = {
    "ResourceClass" = "SingleNode"
  }

  depends_on = [
    azurerm_databricks_workspace.databricks_workspace
  ]
}
```

After terraform apply, it executed successfully.

The issue mainly occurs due to Azure quota limitations for a region. Since, as you said, the issue persists even after changing regions, it may instead be caused by a misconfigured Azure subscription, for example logging in with the wrong Azure credentials.

Make sure to set the correct subscription after logging in with az login:

az account set --subscription "xxxx"
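Before that, you can confirm which subscription the Azure CLI is currently targeting. A minimal sketch (the subscription id passed to `az account set` is a placeholder, as above):

```shell
# Show the subscription the CLI is currently using
az account show --query "{name:name, id:id}" -o table

# List all subscriptions available to the logged-in account,
# with a flag marking the current default
az account list --query "[].{name:name, id:id, default:isDefault}" -o table
```

If the default subscription is not the one that holds the approved quota, `terraform apply` will be checked against the wrong subscription's limits.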

main.tf

```hcl
provider "databricks" {
  host = azurerm_databricks_workspace.databricks_workspace.workspace_url
}

provider "azuread" {
  subscription_id = "xxx"
  tenant_id       = "xxx"
}

data "azuread_client_config" "current" {
  //tenant_id = "xxxx"
}

data "azurerm_client_config" "current" {
  //tenant_id = "xxx"
  //subscription_id = "8ae0844f-xxx"
}

data "azurerm_subscription" "current" {
  //subscription_id = "xxx"
}

data "azurerm_storage_account" "example" {
  name                = "remoxxxx"
  resource_group_name = data.azurerm_resource_group.example.name
}

terraform {
  backend "azurerm" {
    resource_group_name  = "vxxx"
    storage_account_name = "remotestatekavstr233"
    container_name       = "terraform"
    key                  = "terraform.tfstate"
  }
}
```

If the subscription is confirmed to be set correctly, then the subscription may have reached its quota limit.
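To verify whether the quota really is the blocker, you can inspect the per-family vCPU usage and limits for the region directly. A sketch assuming the region (`westus`) and VM family (`StandardNCADSA100v4Family`) from the error message; adjust both to match your own error:

```shell
# List current vCPU usage vs. limit for every VM family in westus
az vm list-usage --location westus -o table

# Filter for the family named in the QuotaExceeded error
az vm list-usage --location westus \
  --query "[?contains(name.value, 'StandardNCADSA100v4Family')]" -o table
```

A `limit` of 0 for the family, as in the error above, means the subscription has no approved cores for that VM series at all in that region.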

Please check Troubleshoot QuotaExceeded error code - Azure | Microsoft Learn.


huangapple
  • Posted on 2023-05-13 20:39:53
  • Please keep this link when reposting: https://go.coder-hub.com/76242778.html