Multi cluster CockroachDB with Cilium Cluster Mesh
Question
I am trying to enable a multi-cluster CockroachDB spanning 3 k8s clusters connected with Cilium Cluster Mesh. The idea of a multi-cluster CockroachDB is described on cockroachlabs.com - 1, 2. Given that the article calls for changes to the CoreDNS ConfigMap rather than using Cilium global services, the approach feels suboptimal.
Therefore the question arises: how can a multi-cluster CockroachDB be enabled in a Cilium Cluster Mesh environment, using Cilium global services instead of hacking the CoreDNS ConfigMap?
When CockroachDB is installed via Helm, it deploys a StatefulSet with a carefully crafted --join parameter, which contains the FQDNs of the CockroachDB pods that are to join the cluster.
The pod FQDNs come from service.discovery, which is created with clusterIP: None and
> (...) only exists to create DNS entries for each pod in the StatefulSet such that they can resolve each other's IP addresses.
The discovery service automatically registers DNS entries for all pods within the StatefulSet, so that they can be easily referenced.
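For reference, the discovery service produced by the chart looks roughly like the sketch below. This is a simplified approximation, assuming a release named dbs in the dbs namespace as in the manifests further down; verify against the chart's actual output. With such a headless service, each pod gets a DNS record of the form <pod>.<service>.<namespace>.svc.<cluster-domain>, e.g. dbs-cockroachdb-0.dbs-cockroachdb.dbs in short form, as used in the --join list further below.
# Simplified sketch of the headless discovery Service; names assume a
# release called "dbs" in namespace "dbs" and may differ in your setup.
apiVersion: v1
kind: Service
metadata:
  name: dbs-cockroachdb
  namespace: dbs
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  labels:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
spec:
  clusterIP: None                 # headless: per-pod DNS records, no virtual IP
  publishNotReadyAddresses: true  # pods are resolvable before they become ready
  selector:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
  ports:
    - name: grpc
      port: 26257
      targetPort: grpc
    - name: http
      port: 8080
      targetPort: http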
Can a similar discovery service, or some alternative, be created for a StatefulSet running on a remote cluster, so that with cluster mesh enabled, pods J, K, L in cluster B can be reached from pods X, Y, Z in cluster A by their FQDNs?
As suggested in create-service-per-pod-in-statefulset, one could create services like the following:
{{- range $i, $_ := until 3 -}}
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
    io.cilium/global-service: 'true'
    service.cilium.io/affinity: "remote"
  labels:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
  name: dbs-cockroachdb-remote-{{ $i }}
  namespace: dbs
spec:
  ports:
    - name: grpc
      port: 26257
      protocol: TCP
      targetPort: grpc
    - name: http
      port: 8080
      protocol: TCP
      targetPort: http
  selector:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
    statefulset.kubernetes.io/pod-name: cockroachdb-{{ $i }}
  type: ClusterIP
  clusterIP: None
  publishNotReadyAddresses: true
---
kind: Service
apiVersion: v1
metadata:
  name: dbs-cockroachdb-public-remote-{{ $i }}
  namespace: dbs
  labels:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
  annotations:
    io.cilium/global-service: 'true'
    service.cilium.io/affinity: "remote"
spec:
  ports:
    - name: grpc
      port: 26257
      protocol: TCP
      targetPort: grpc
    - name: http
      port: 8080
      protocol: TCP
      targetPort: http
  selector:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
{{- end -}}
These services resemble the original service.discovery and service.public.
However, despite the presence of the Cilium annotations
io.cilium/global-service: 'true'
service.cilium.io/affinity: "remote"
the services appear to be bound to the local k8s cluster, resulting in a CockroachDB cluster consisting of 3 nodes instead of 6 (3 in cluster A + 3 in cluster B).
It does not matter which service (dbs-cockroachdb-public-remote-X or dbs-cockroachdb-remote-X) I use in my --join command override:
join:
- dbs-cockroachdb-0.dbs-cockroachdb.dbs:26257
- dbs-cockroachdb-1.dbs-cockroachdb.dbs:26257
- dbs-cockroachdb-2.dbs-cockroachdb.dbs:26257
- dbs-cockroachdb-public-remote-0.dbs:26257
- dbs-cockroachdb-public-remote-1.dbs:26257
- dbs-cockroachdb-public-remote-2.dbs:26257
The result is the same: 3 nodes instead of 6.
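For context, such a --join override typically lives in the Helm values roughly as sketched below; the conf.join key is an assumption about the chart's values layout and should be checked against the chart version in use.
# Hedged sketch of the values override; conf.join is assumed to be the
# chart's key for overriding the join targets — verify in values.yaml.
conf:
  join:
    - dbs-cockroachdb-0.dbs-cockroachdb.dbs:26257
    - dbs-cockroachdb-1.dbs-cockroachdb.dbs:26257
    - dbs-cockroachdb-2.dbs-cockroachdb.dbs:26257
    - dbs-cockroachdb-public-remote-0.dbs:26257
    - dbs-cockroachdb-public-remote-1.dbs:26257
    - dbs-cockroachdb-public-remote-2.dbs:26257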
Any ideas?
Answer 1
Score: 2
Apparently, due to 7070, patching the CoreDNS ConfigMap is the most reasonable thing we can do. In the comments of that bug, an article is mentioned that provides additional context.
My twist on this story is that I updated the ConfigMap with a kubernetes plugin config:
apiVersion: v1
data:
  Corefile: |-
    saturn.local {
        log
        errors
        kubernetes saturn.local {
            endpoint https://[ENDPOINT]
            kubeconfig [PATH_TO_KUBECONFIG]
        }
    }
    rhea.local {
    ...
This lets me resolve names from the other clusters as well.
In my setup, each cluster has its own domain.local. PATH_TO_KUBECONFIG is a plain kubeconfig file. A generic secret has to be created in the kube-system namespace, and the secret volume has to be mounted in the coredns deployment, as sketched below.
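For illustration, the extra wiring could look roughly like the following sketch. The secret name, mount path, and file name are hypothetical placeholders (the secret can equally be created with kubectl create secret generic --from-file in kube-system); adapt them to your environment.
# Hypothetical sketch: a secret carrying the remote cluster's kubeconfig,
# plus the volume/volumeMount added to the coredns Deployment so that
# [PATH_TO_KUBECONFIG] points at the mounted file. All names are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: remote-kubeconfig
  namespace: kube-system
stringData:
  kubeconfig: |
    # contents of the plain kubeconfig for the remote cluster
---
# Fragment of the coredns Deployment pod template (not a complete manifest)
spec:
  template:
    spec:
      volumes:
        - name: remote-kubeconfig
          secret:
            secretName: remote-kubeconfig
      containers:
        - name: coredns
          volumeMounts:
            - name: remote-kubeconfig
              mountPath: /etc/coredns/remote   # PATH_TO_KUBECONFIG -> /etc/coredns/remote/kubeconfig
              readOnly: true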