How to automate the creation of an Azure Databricks cluster using a Python script

Question

Is there any way to automate the creation of an Azure Databricks cluster using a Python script?

Answer 1

Score: 2

If you don't want to install any packages, you can call the Databricks REST API directly: define a payload and send it to the create-cluster endpoint. Something like this:

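import requests

# Workspace URL and personal access token (fill in your own values)
workspace_host = "https://adb-.....azuredatabricks.net"
personal_token = "dapi...."

# Minimal cluster specification sent as the JSON payload
cluster_spec = {
    "cluster_name": "my-cluster",
    "node_type_id": "Standard_DS3_v2",
    "spark_version": "11.3.x-scala2.12",
    "num_workers": 3,
}

res = requests.post(
    workspace_host + "/api/2.0/clusters/create",
    headers={"Authorization": "Bearer " + personal_token},
    json=cluster_spec,
    timeout=30,
)
res.raise_for_status()  # surface HTTP errors instead of a KeyError on the next line
cluster_id = res.json()["cluster_id"]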
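
The create call returns a cluster_id as soon as the request is accepted, while the cluster itself is still starting. A minimal sketch of waiting for it to come up, assuming the clusters/get endpoint of the same REST API and its state field. The helper names wait_for_cluster and rest_state_fetcher, the timeout, and the poll interval are illustrative choices, not part of the answer:

```python
import time


def wait_for_cluster(fetch_state, timeout_s=1200, poll_s=15, sleep=time.sleep):
    """Poll until the cluster reports RUNNING; raise on failure states or timeout.

    fetch_state is any zero-argument callable returning the cluster's state string.
    """
    waited = 0
    while True:
        state = fetch_state()
        if state == "RUNNING":
            return state
        # States from which the cluster will not become usable on its own
        if state in ("TERMINATING", "TERMINATED", "ERROR", "UNKNOWN"):
            raise RuntimeError(f"cluster entered state {state}")
        if waited >= timeout_s:
            raise TimeoutError(f"cluster still {state} after {waited}s")
        sleep(poll_s)
        waited += poll_s


def rest_state_fetcher(workspace_host, personal_token, cluster_id):
    """Build a fetch_state callable backed by GET /api/2.0/clusters/get."""
    import requests  # third-party; imported lazily so the polling logic has no dependencies

    def fetch_state():
        res = requests.get(
            workspace_host + "/api/2.0/clusters/get",
            headers={"Authorization": "Bearer " + personal_token},
            params={"cluster_id": cluster_id},
            timeout=30,
        )
        res.raise_for_status()
        return res.json()["state"]

    return fetch_state
```

With the variables from the snippet above, this could be called as wait_for_cluster(rest_state_fetcher(workspace_host, personal_token, cluster_id)). Passing the state lookup in as a callable keeps the polling loop easy to test without a live workspace.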

But it's better to use the new Databricks SDK for Python, which simplifies many operations and supports more authentication methods.
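
For comparison, here is a rough sketch of the same cluster request made through the Databricks SDK for Python (pip install databricks-sdk). It assumes the SDK's WorkspaceClient and its clusters.create call, which returns a waiter whose result() blocks until the cluster is running. The helper names build_cluster_spec and create_cluster are made up for illustration:

```python
def build_cluster_spec(cluster_name, node_type_id, spark_version, num_workers):
    """Return keyword arguments mirroring the REST payload fields."""
    if num_workers < 0:
        raise ValueError("num_workers must be non-negative")
    return {
        "cluster_name": cluster_name,
        "node_type_id": node_type_id,
        "spark_version": spark_version,
        "num_workers": num_workers,
    }


def create_cluster(spec, host=None, token=None):
    """Create a cluster through the SDK and block until it is running."""
    # Imported lazily so build_cluster_spec works without the SDK installed
    from databricks.sdk import WorkspaceClient

    # When host/token are omitted, the client falls back to its default
    # authentication chain (environment variables, config profile, Azure CLI, ...)
    w = WorkspaceClient(host=host, token=token)
    cluster = w.clusters.create(**spec).result()
    return cluster.cluster_id
```

A call might look like create_cluster(build_cluster_spec("my-cluster", "Standard_DS3_v2", "11.3.x-scala2.12", 3)), with credentials picked up from the environment rather than hard-coded in the script.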

Answer 2

Score: -1

Yes, you can automate this with a Python script using the Azure SDK for Python. Note that the azure-mgmt-databricks package works at the Azure resource level: it provisions the Databricks workspace, and clusters inside that workspace are then created through the Databricks REST API or SDK.

pip install azure-identity azure-mgmt-databricks

from azure.identity import DefaultAzureCredential
from azure.mgmt.databricks import AzureDatabricksManagementClient
from azure.mgmt.databricks.models import Sku, Workspace

# Define your Azure subscription ID and resource group details
subscription_id = "<your-subscription-id>"
resource_group = "<your-resource-group>"

# Create a Databricks management client
credential = DefaultAzureCredential()
databricks_client = AzureDatabricksManagementClient(credential, subscription_id)

# Define the workspace details
location = "<azure-region>"
workspace_name = "<databricks-workspace-name>"
# Placeholder: Azure creates this managed resource group; it must differ from resource_group
managed_resource_group = "<managed-resource-group-name>"

# This package manages workspaces, not clusters, so node type and worker
# count are not set here; they belong in the cluster spec sent to Databricks.
workspace = Workspace(
    location=location,
    managed_resource_group_id=f"/subscriptions/{subscription_id}/resourceGroups/{managed_resource_group}",
    sku=Sku(name="premium"),  # pricing tier, e.g. "standard" or "premium"
)

# begin_create_or_update returns a poller; result() blocks until provisioning finishes
poller = databricks_client.workspaces.begin_create_or_update(resource_group, workspace_name, workspace)
poller.result()

Posted by huangapple on June 5, 2023 at 20:31:04
Original link: https://go.coder-hub.com/76406432.html