Running a script after creation of instance in GCP Managed Instance Group
Question
I'm using the distributed compute framework Bacalhau[0]. The pattern for setting up a cluster is the following:
$ curl -sL https://get.bacalhau.org/install.sh | bash
[...output...]
$ bacalhau serve
To connect another node to this private one, run the following command in your shell:
bacalhau serve --node-type compute --private-internal-ipfs --peer /ip4/10.158.0.2/tcp/1235/p2p/QmeEoVj8wyxMxhcUSr6p7EK1Dcie7PvNeXCVQny15Htb1W --ipfs-swarm-addr /ip4/10.158.0.2/tcp/46199/p2p/QmVPFmHmruuuAcEmsGRapB6yDDaPxhf2huqa9PhPVEHK8F
(Doing this in a production-friendly format involves using systemd; I have excluded it here.)
What I'd like to do is have a Google managed instance group that watches a Cloud Pub/Sub topic (not covered here) and creates a new instance when the signal is sent. The problem is that the peering string is only known after the first instance starts. My initial thought was to start one instance, capture the output, and write it to a common location that everything else could read from.
I've thought about the following patterns:
- Create an instance template that checks a central endpoint (KV store?) for this information
- Create an instance template that reads from a GCS bucket for this information
- Something else?
I've read this piece [1] about leader election using GCS, but can I force GCS to act as the locking mechanism? Or do I need to use a whole library [2]? Or is there another solution? I can use any managed service on GCP to accomplish this.
My preference would NOT be to use golang, but to use a non-compiled language (e.g. Python) to accomplish this.
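To make the "GCS as a lock" idea concrete, here is roughly what I have in mind in Python, using the google-cloud-storage client with an if_generation_match=0 precondition so that only the first writer succeeds (the bucket and object names below are made up):

from google.api_core.exceptions import PreconditionFailed
from google.cloud import storage

BUCKET = "my-cluster-bootstrap"   # made-up bucket name
BLOB = "bacalhau-peer-string"     # made-up object name

def publish_or_fetch_peer_string(peer_string: str) -> str:
    blob = storage.Client().bucket(BUCKET).blob(BLOB)
    try:
        # if_generation_match=0 tells GCS to create the object only if it
        # does not exist yet; the check is atomic, so exactly one instance
        # wins the race and acts as the first node.
        blob.upload_from_string(peer_string, if_generation_match=0)
        return peer_string
    except PreconditionFailed:
        # Another instance already wrote the peer string; use theirs.
        return blob.download_as_text()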
[0] https://docs.bacalhau.org/quick-start-pvt-cluster
[1] https://cloud.google.com/blog/topics/developers-practitioners/implementing-leader-election-google-cloud-storage
[2] https://pkg.go.dev/github.com/hashicorp/vault/physical/gcs
Answer 1
Score: 2
One approach you could take is to use the instance metadata server to store the peering string (see the GCP documentation on instance metadata).
GCP provides an instance metadata server that allows you to store and retrieve metadata for your instances. When you create a new instance, you can set the peering string as metadata on the instance using the gcloud command-line tool or the Google Cloud API:
gcloud compute instances add-metadata INSTANCE_NAME --metadata PEERING_STRING=VALUE
To read the metadata from within your Bacalhau startup script:
curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/attributes/PEERING_STRING"
If you specifically want a Python script, you can make the same request to this metadata endpoint using the requests library.
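For example, a minimal Python equivalent of the curl call above (assuming the requests package is available on the instance):

import requests

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1"
    "/instance/attributes/PEERING_STRING"
)

def get_peering_string() -> str:
    # The Metadata-Flavor header is mandatory; the metadata server
    # rejects requests that do not carry it.
    resp = requests.get(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}, timeout=5
    )
    resp.raise_for_status()
    return resp.text

print(get_peering_string())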
Answer 2
Score: 1
Another option, alongside the first answer, would be to use GCP Secret Manager.
In essence, the initial node would write the needed information to a secret, and the code processing the Pub/Sub message would pull that information from the secret in order to add new nodes.
The reason I suggest this is that this information would seemingly give a hacker the ability to join your cluster maliciously, so I would treat it as protected. Using the metadata of the prime instance lets anybody with a fairly low level of permission read it, and therefore potentially add infected nodes to your cluster.
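As a rough sketch of that flow with the google-cloud-secret-manager Python client (the project ID and secret name below are placeholders):

from google.cloud import secretmanager

PROJECT = "my-project"                 # placeholder project ID
SECRET_ID = "bacalhau-peer-string"     # placeholder secret name

client = secretmanager.SecretManagerServiceClient()
parent = f"projects/{PROJECT}"

def store_peer_string(peer_string: str) -> None:
    # Run once on the initial node after `bacalhau serve` prints the string.
    secret = client.create_secret(
        request={
            "parent": parent,
            "secret_id": SECRET_ID,
            "secret": {"replication": {"automatic": {}}},
        }
    )
    client.add_secret_version(
        request={"parent": secret.name, "payload": {"data": peer_string.encode()}}
    )

def read_peer_string() -> str:
    # Run from the code handling the Pub/Sub message before adding a node.
    name = f"{parent}/secrets/{SECRET_ID}/versions/latest"
    return client.access_secret_version(request={"name": name}).payload.data.decode()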