How to automate the creation of an Azure Databricks cluster using a Python script

Question

Is there any way to automate the creation of an Azure Databricks cluster using a Python script?

Answer 1

Score: 2

If you don't want to install any packages, you can use the Databricks REST API directly: define a payload and call the create cluster API. Something like this:

    import requests

    # Workspace URL and personal access token (placeholders as in the original post)
    workspace_host = "https://adb-.....azuredatabricks.net"
    personal_token = "dapi...."

    # Payload for the create cluster call
    cluster_spec = {
        "cluster_name": "my-cluster",
        "node_type_id": "Standard_DS3_v2",
        "spark_version": "11.3.x-scala2.12",
        "num_workers": 3,
    }

    res = requests.post(
        workspace_host + "/api/2.0/clusters/create",
        headers={"Authorization": "Bearer " + personal_token},
        json=cluster_spec,
    )
    res.raise_for_status()  # fail on an HTTP error instead of a KeyError on the next line
    cluster_id = res.json()["cluster_id"]
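The create call returns a `cluster_id` as soon as the request is accepted, while the cluster itself is still starting. A minimal sketch of waiting for it, assuming the `/api/2.0/clusters/get` endpoint and its `state` field; the helper names `fetch_state` and `wait_until_running`, the timeout, and the poll interval are illustrative choices, not part of the original answer. It uses only the standard library so it runs without extra packages:

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable

# States treated here as "the cluster will not come up" (an assumption of this sketch).
FAILURE_STATES = {"ERROR", "TERMINATED"}


def fetch_state(workspace_host: str, personal_token: str, cluster_id: str) -> str:
    """Ask the workspace for the current state of one cluster."""
    query = urllib.parse.urlencode({"cluster_id": cluster_id})
    req = urllib.request.Request(
        workspace_host + "/api/2.0/clusters/get?" + query,
        headers={"Authorization": "Bearer " + personal_token},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["state"]


def wait_until_running(
    get_state: Callable[[], str],
    timeout_s: float = 1200.0,
    poll_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it reports RUNNING, a failure state, or the timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == "RUNNING":
            return state
        if state in FAILURE_STATES:
            raise RuntimeError(f"cluster entered state {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still in state {state} after {timeout_s}s")
        sleep(poll_s)
```

With the variables from the snippet above, this could be called as `wait_until_running(lambda: fetch_state(workspace_host, personal_token, cluster_id))`.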

But it's better to use the new Databricks SDK for Python, which simplifies many operations and supports more authentication methods.
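As a rough sketch of the SDK route, assuming the `databricks-sdk` package is installed (`pip install databricks-sdk`) and that credentials are available from the environment or a configuration profile. The function names `cluster_settings` and `create_cluster_with_sdk` are made up for illustration; the cluster values are the ones from the REST example:

```python
from typing import Any, Dict


def cluster_settings() -> Dict[str, Any]:
    """Same settings as the REST payload, as keyword arguments for the SDK."""
    return {
        "cluster_name": "my-cluster",
        "node_type_id": "Standard_DS3_v2",
        "spark_version": "11.3.x-scala2.12",
        "num_workers": 3,
    }


def create_cluster_with_sdk(settings: Dict[str, Any]) -> str:
    """Create a cluster through the SDK and return its ID once it is running."""
    # Imported here so this module still loads where the SDK is not installed.
    from databricks.sdk import WorkspaceClient

    # With no arguments, the client picks up credentials from the environment
    # (for example DATABRICKS_HOST and DATABRICKS_TOKEN) or a config profile.
    w = WorkspaceClient()

    # create() submits the request; result() blocks until the cluster is running.
    cluster = w.clusters.create(**settings).result()
    return cluster.cluster_id
```

Calling `create_cluster_with_sdk(cluster_settings())` would then replace both the `requests.post` call and any manual polling.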

Answer 2

Score: -1

It is possible to automate the creation of Azure Databricks clusters using a Python script. Azure provides the Azure SDK for Python.

    # Install first: pip install azure-identity azure-mgmt-databricks
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.databricks import AzureDatabricksManagementClient
    from azure.mgmt.databricks.models import Sku, Workspace

    # Define your Azure subscription ID and resource group details
    subscription_id = "<your-subscription-id>"
    resource_group = "<your-resource-group>"

    # Create a Databricks management client
    credential = DefaultAzureCredential()
    databricks_client = AzureDatabricksManagementClient(credential, subscription_id)

    # Define the workspace details.
    # Note: azure-mgmt-databricks manages the workspace (the Azure resource).
    # It has no cluster operations, so node type and worker count belong in a
    # Clusters API call against the workspace, not here.
    location = "<azure-region>"
    workspace_name = "<databricks-workspace-name>"
    pricing_tier = "<pricing-tier>"  # the SKU is the pricing tier, e.g. standard or premium
    # Resource group that Azure creates and manages for the workspace;
    # it must be different from resource_group.
    managed_resource_group = "<managed-resource-group>"

    # Create (or update) the workspace and wait for the operation to finish
    workspace = Workspace(
        location=location,
        managed_resource_group_id=f"/subscriptions/{subscription_id}/resourceGroups/{managed_resource_group}",
        sku=Sku(name=pricing_tier),
    )
    poller = databricks_client.workspaces.begin_create_or_update(
        resource_group_name=resource_group,
        workspace_name=workspace_name,
        parameters=workspace,
    )
    created_workspace = poller.result()
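The management package above works at the Azure resource level; the cluster itself is created through the workspace's REST API. One way to connect the two is to reuse the same Azure credential for that call. This is only a sketch under stated assumptions: the identity has already been granted access to the workspace, the scope uses the well-known Azure Databricks application ID, and the helper names `bearer_headers` and `create_cluster` are hypothetical. It uses only the standard library for the HTTP call:

```python
import json
import urllib.request
from typing import Any, Dict

# Token scope for the Azure Databricks resource (its well-known application ID).
AZURE_DATABRICKS_SCOPE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"


def bearer_headers(credential: Any) -> Dict[str, str]:
    """Build an Authorization header from any azure-identity style credential."""
    access_token = credential.get_token(AZURE_DATABRICKS_SCOPE)
    return {"Authorization": "Bearer " + access_token.token}


def create_cluster(workspace_host: str, credential: Any, cluster_spec: Dict[str, Any]) -> str:
    """POST the cluster spec to the workspace and return the new cluster ID."""
    headers = bearer_headers(credential)
    headers["Content-Type"] = "application/json"
    req = urllib.request.Request(
        workspace_host + "/api/2.0/clusters/create",
        data=json.dumps(cluster_spec).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["cluster_id"]
```

For example, `create_cluster("https://adb-.....azuredatabricks.net", DefaultAzureCredential(), cluster_spec)`, where `cluster_spec` is a payload like the one in Answer 1.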

huangapple
  • Posted on June 5, 2023, 20:31:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76406432.html