how to access Dataproc cluster metadata?
Question
After the creation of a cluster, I'm trying to retrieve the URL addresses of my additional components (without using the GCP Dashboard). I am using the [Dataproc Python API][1], and more specifically the `get_cluster()` function.

The function returns a lot of data, but I cannot manage to find the Jupyter gateway URL or other metadata.
```python
from google.cloud import dataproc_v1

project_id, cluster_name = '', ''
region = 'europe-west4'

client = dataproc_v1.ClusterControllerClient(
    client_options={
        'api_endpoint': '{}-dataproc.googleapis.com:443'.format(region)
    }
)

response = client.get_cluster(project_id, region, cluster_name)
print(response)
```
Does anyone have a solution to this?

[1]: https://googleapis.dev/python/dataproc/latest/gapic/v1/api.html
Answer 1 (score: 4)
If you have followed this doc to set up Jupyter access by enabling Component Gateway, then you can access the Web Interfaces as described here. The trick is that this information is included in the API response for the `v1beta2` version.

The changes needed in the code are minimal (no additional requirements apart from the `google-cloud-dataproc` library). Just replace `dataproc_v1` with `dataproc_v1beta2` and access the endpoints with `response.config.endpoint_config`:
```python
from google.cloud import dataproc_v1beta2

project_id, cluster_name = '', ''
region = 'europe-west4'

client = dataproc_v1beta2.ClusterControllerClient(
    client_options={
        'api_endpoint': '{}-dataproc.googleapis.com:443'.format(region)
    }
)

response = client.get_cluster(project_id, region, cluster_name)
print(response.config.endpoint_config)
```
In my case I get:
```
http_ports {
  key: "HDFS NameNode"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/hdfs/dfshealth.html"
}
http_ports {
  key: "Jupyter"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jupyter/"
}
http_ports {
  key: "JupyterLab"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jupyter/lab/"
}
http_ports {
  key: "MapReduce Job History"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jobhistory/"
}
http_ports {
  key: "Spark History Server"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/sparkhistory/"
}
http_ports {
  key: "Tez"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/apphistory/tez-ui/"
}
http_ports {
  key: "YARN Application Timeline"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/apphistory/"
}
http_ports {
  key: "YARN ResourceManager"
  value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/yarn/"
}
enable_http_port_access: true
```
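If you need one specific URL programmatically rather than the whole printout, the `http_ports` field of `endpoint_config` behaves like a string-to-string map keyed by interface name. A minimal sketch, using a plain dict (with placeholder URLs) to stand in for `response.config.endpoint_config.http_ports`, which supports the same dict-style access:

```python
# Stand-in for response.config.endpoint_config.http_ports, which maps
# web-interface names to Component Gateway URLs (keys as shown above).
http_ports = {
    "Jupyter": "https://example-dot-europe-west4.dataproc.googleusercontent.com/jupyter/",
    "JupyterLab": "https://example-dot-europe-west4.dataproc.googleusercontent.com/jupyter/lab/",
}

def component_url(ports, name):
    """Return the gateway URL for a named web interface, or None if absent."""
    return ports.get(name)

print(component_url(http_ports, "Jupyter"))
```

This avoids hard-coding positions in the response and degrades gracefully when a component (and therefore its key) is not installed on the cluster.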
Answer 2 (score: 0)
You need to use `v1beta2`. Enable Component Gateway with:

```python
'endpoint_config': {
    'enable_http_port_access': True
},
```

then the above answer will work:

```python
client.get_cluster(project_id, region, cluster_name)
```
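For context, here is a sketch of where that fragment sits in the cluster definition passed at creation time. The field names follow the `v1beta2` cluster config; `my-project` and `my-cluster` are placeholders:

```python
# Hypothetical cluster spec: 'my-project' / 'my-cluster' are placeholders.
cluster = {
    'project_id': 'my-project',
    'cluster_name': 'my-cluster',
    'config': {
        # Jupyter must be enabled as an optional component...
        'software_config': {
            'optional_components': ['ANACONDA', 'JUPYTER'],
        },
        # ...and Component Gateway turned on, so that http_ports
        # URLs appear in later get_cluster() responses.
        'endpoint_config': {
            'enable_http_port_access': True,
        },
    },
}
```

Without `enable_http_port_access`, the `endpoint_config` in the `get_cluster()` response will simply contain no `http_ports` entries.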