Running a script after creation of instance in GCP Managed Instance Group
Question
I'm using the distributed compute framework Bacalhau[0]. The pattern for setting up a cluster is the following:
$ curl -sL https://get.bacalhau.org/install.sh | bash
[...output...]
$ bacalhau serve
To connect another node to this private one, run the following command in your shell:
bacalhau serve --node-type compute --private-internal-ipfs --peer /ip4/10.158.0.2/tcp/1235/p2p/QmeEoVj8wyxMxhcUSr6p7EK1Dcie7PvNeXCVQny15Htb1W --ipfs-swarm-addr /ip4/10.158.0.2/tcp/46199/p2p/QmVPFmHmruuuAcEmsGRapB6yDDaPxhf2huqa9PhPVEHK8F
(Doing this in a production-friendly format involves using systemd; I have excluded it here.)
What I'd like to do is have a Google managed instance group that watches a Cloud Pub/Sub topic (not covered here) and creates a new instance when the signal is sent. The problem is that the peering string is only known after the first instance starts. My initial thought was to start one instance, capture the output, and write it to a common location that everything else could read from.
I've thought about the following patterns:
- Create an instance template that checks a central endpoint (KV store?) for this information
- Create an instance template that reads from a GCS bucket for this information
- Something else?
I've read this piece [1] about leader election using GCS, but can I force GCS to act as the locking mechanism? Or do I need to use a whole library [2]? Or is there another solution? I can use any managed service on GCP to accomplish this.
My preference would NOT be to use golang, but to use a non-compiled language (e.g. Python) to accomplish this.
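To make the "GCS as a lock" idea concrete, here is roughly what I have in mind in Python, using the google-cloud-storage client with an if_generation_match=0 precondition so that only the first writer succeeds (the bucket and object names below are made up):

from google.api_core.exceptions import PreconditionFailed
from google.cloud import storage

BUCKET = "my-cluster-bootstrap"   # made-up bucket name
BLOB = "bacalhau-peer-string"     # made-up object name

def publish_or_fetch_peer_string(peer_string: str) -> str:
    blob = storage.Client().bucket(BUCKET).blob(BLOB)
    try:
        # if_generation_match=0 tells GCS to create the object only if it
        # does not exist yet; the check is atomic, so exactly one instance
        # wins the race and acts as the first node.
        blob.upload_from_string(peer_string, if_generation_match=0)
        return peer_string
    except PreconditionFailed:
        # Another instance already wrote the peer string; use theirs.
        return blob.download_as_text()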
[0] https://docs.bacalhau.org/quick-start-pvt-cluster
[1] https://cloud.google.com/blog/topics/developers-practitioners/implementing-leader-election-google-cloud-storage
[2] https://pkg.go.dev/github.com/hashicorp/vault/physical/gcs
Answer 1
Score: 2
One approach you could take is to use the instance metadata server to store the peering string (see the GCP documentation on instance metadata).
GCP provides an instance metadata server that allows you to store and retrieve metadata for your instances. When you create a new instance, you can set the peering string as metadata on the instance using the gcloud command-line tool or the Google Cloud API:
gcloud compute instances add-metadata INSTANCE_NAME --metadata PEERING_STRING=VALUE
To read the metadata from within your Bacalhau startup script:
curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/attributes/PEERING_STRING"
If you specifically want a Python script, you can make the same request to this metadata endpoint using the requests library.
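For example, a minimal Python equivalent of the curl call above (assuming the requests package is available on the instance):

import requests

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1"
    "/instance/attributes/PEERING_STRING"
)

def get_peering_string() -> str:
    # The Metadata-Flavor header is mandatory; the metadata server
    # rejects requests that do not carry it.
    resp = requests.get(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}, timeout=5
    )
    resp.raise_for_status()
    return resp.text

print(get_peering_string())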
Answer 2
Score: 1
Another option, alongside the first answer, would be to use GCP Secret Manager.
In essence, the initial node would write the needed information to a secret, and the code processing the Pub/Sub message would pull that information from the secret in order to add new nodes.
The reason I suggest this is that this information would seemingly give a hacker the ability to join your cluster maliciously, so I would treat it as protected. Using the metadata of the prime instance lets anybody with a fairly low level of permission read it, and therefore potentially add infected nodes to your cluster.
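As a rough sketch of that flow with the google-cloud-secret-manager Python client (the project ID and secret name below are placeholders):

from google.cloud import secretmanager

PROJECT = "my-project"                 # placeholder project ID
SECRET_ID = "bacalhau-peer-string"     # placeholder secret name

client = secretmanager.SecretManagerServiceClient()
parent = f"projects/{PROJECT}"

def store_peer_string(peer_string: str) -> None:
    # Run once on the initial node after `bacalhau serve` prints the string.
    secret = client.create_secret(
        request={
            "parent": parent,
            "secret_id": SECRET_ID,
            "secret": {"replication": {"automatic": {}}},
        }
    )
    client.add_secret_version(
        request={"parent": secret.name, "payload": {"data": peer_string.encode()}}
    )

def read_peer_string() -> str:
    # Run from the code handling the Pub/Sub message before adding a node.
    name = f"{parent}/secrets/{SECRET_ID}/versions/latest"
    return client.access_secret_version(request={"name": name}).payload.data.decode()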