How to get the whole cluster information in Azure Databricks at runtime?

# Question

The code below worked on an older Databricks runtime version, but after the runtime version changed it no longer works in Databricks.

Latest version: 12.0 (includes Apache Spark 3.3.1, Scala 2.12)

```python
dbutils.notebook.entry_point.getDbutils().notebook().getContext().tags()
```

What is the alternative to this code?

Error:

```
py4j.security.Py4JSecurityException: Method public scala.collection.immutable.Map com.databricks.backend.common.rpc.CommandContext.tags() is not whitelisted on class class com.databricks.backend.common.rpc.CommandContext
```
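For context, the failing call returns a Scala `Map` of context tags, typically used with a per-key lookup like the sketch below (an illustration, not from the original post; `clusterId` is a tag key commonly present in that map):

```python
# Pre-12.0 pattern: fetch the notebook context tags via py4j.
# On clusters that enforce py4j security (e.g. with table ACLs or
# credential passthrough enabled), this now raises the
# Py4JSecurityException shown above, because tags() is not whitelisted.
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
tags = ctx.tags()                # scala.collection.immutable.Map
print(tags.apply("clusterId"))   # look up a single tag by key
```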

# Answer 1

**Score**: 2

You can get most of the cluster info directly from the Spark config:

```scala
%scala
val p = "spark.databricks.clusterUsageTags."
spark.conf.getAll
  .collect{ case (k, v) if k.startsWith(p) => s"${k.replace(p, "")}: $v" }
  .toList.sorted.foreach(println)
```

```python
%python
p = "spark.databricks.clusterUsageTags."
conf = [f"{k.replace(p, '')}: {v}" for k, v in spark.sparkContext.getConf().getAll() if k.startswith(p)]
for l in sorted(conf): print(l)
```

Output:

```
[...]
clusterId: 0123-456789-0abcde1
clusterLastActivityTime: 1676449848620
clusterName: test
clusterNodeType: Standard_F4s_v2
[...]
```
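If you only need a single value, you can also read one key directly instead of scanning the whole config (a minimal Python sketch using keys from the listing above; the `"unknown"` default is an assumption for keys that may be absent):

```python
# Read individual cluster tags straight from the Spark conf.
# spark.conf.get(key) raises if the key is missing, so pass a
# default for keys that may not be set on every cluster type.
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
cluster_name = spark.conf.get("spark.databricks.clusterUsageTags.clusterName", "unknown")
print(cluster_id, cluster_name)
```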




# Answer 2

**Score**: 0

I have created a cluster with Databricks Runtime 12.0 as shown below:

[Screenshot: cluster configuration with Databricks Runtime 12.0]

- The above command gave the required output without any error:

[Screenshot: notebook output of the command]

- But if it is the full cluster information that you need, then you can use the Clusters 2.0 API. The following code would work:


```python
import requests
import json

my_json = {"cluster_id": spark.conf.get("spark.databricks.clusterUsageTags.clusterId")}

auth = {"Authorization": "Bearer <your_access_token>"}

response = requests.get('https://<workspace_url>/api/2.0/clusters/get', json=my_json, headers=auth).json()
print(response)
```
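The response is a JSON object whose fields include `cluster_name`, `node_type_id`, `spark_version`, and `state` (field names per the Clusters 2.0 API). A short sketch of picking a few of them out of the `response` dict from the call above:

```python
# Pull a few common fields out of the Clusters 2.0 "get" response.
# dict.get() returns None instead of raising if a field is absent.
print(response.get("cluster_name"))
print(response.get("node_type_id"))
print(response.get("spark_version"))
print(response.get("state"))
```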
