Multi cluster CockroachDB with Cilium Cluster Mesh

Question


I am trying to enable a multi cluster CockroachDB spanning 3 k8s clusters connected with Cilium Cluster Mesh. The idea of a multi cluster CockroachDB is described on cockroachlabs.com - 1, 2. Given that the article calls for a change to the CoreDNS ConfigMap instead of using Cilium global services, the approach feels suboptimal.

Therefore the question arises: how does one enable a multi cluster CockroachDB in a Cilium Cluster Mesh environment, using Cilium global services instead of patching the CoreDNS ConfigMap?

Installing CockroachDB via Helm deploys a StatefulSet with a carefully crafted --join parameter, which contains the FQDNs of the CockroachDB pods that are to join the cluster.

The pod FQDNs come from service.discovery, which is created with clusterIP: None and

> (...) only exists to create DNS entries for each pod in the StatefulSet such that they can resolve each other's IP addresses.

The discovery service automatically registers DNS entries for all pods within the StatefulSet, so that they can be easily referenced.
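Concretely, with the names used in this question (pods dbs-cockroachdb-{0,1,2}, service dbs-cockroachdb, namespace dbs) and assuming the default cluster.local zone, the rendered flag looks roughly like:

    --join=dbs-cockroachdb-0.dbs-cockroachdb.dbs.svc.cluster.local:26257,dbs-cockroachdb-1.dbs-cockroachdb.dbs.svc.cluster.local:26257,dbs-cockroachdb-2.dbs-cockroachdb.dbs.svc.cluster.local:26257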

Can a similar discovery service (or an alternative) be created for a StatefulSet running on a remote cluster, so that with Cluster Mesh enabled, pods J, K, L in cluster B could be reached from pods X, Y, Z in cluster A by their FQDNs?

As suggested in create-service-per-pod-in-statefulset, one could create services like this:

{{- range $i, $_ := until 3 -}}
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
    io.cilium/global-service: 'true'
    service.cilium.io/affinity: "remote"
  labels:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
  name: dbs-cockroachdb-remote-{{ $i }}
  namespace: dbs
spec:
  ports:
  - name: grpc
    port: 26257
    protocol: TCP
    targetPort: grpc
  - name: http
    port: 8080
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
    statefulset.kubernetes.io/pod-name: cockroachdb-{{ $i }}
  type: ClusterIP
  clusterIP: None
  publishNotReadyAddresses: true
---
kind: Service
apiVersion: v1
metadata:
  name: dbs-cockroachdb-public-remote-{{ $i }}
  namespace: dbs
  labels:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
  annotations:
    io.cilium/global-service: 'true'
    service.cilium.io/affinity: "remote"
spec:
  ports:
  - name: grpc
    port: 26257
    protocol: TCP
    targetPort: grpc
  - name: http
    port: 8080
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/component: cockroachdb
    app.kubernetes.io/instance: dbs
    app.kubernetes.io/name: cockroachdb
{{- end -}}

The intent is that these resemble the original service.discovery and service.public.
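One operational note: Cilium only treats a service as global when a service with the same name and namespace exists in each cluster, so manifests like the ones above have to be applied on both sides of the mesh. A minimal sketch, assuming the template is rendered from a local chart directory ./dbs-chart and kube contexts named cluster-a and cluster-b (all three names are hypothetical):

    # Render the per-pod Services and apply them in both clusters:
    helm template dbs ./dbs-chart -n dbs | kubectl --context cluster-a apply -f -
    helm template dbs ./dbs-chart -n dbs | kubectl --context cluster-b apply -f -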

However, despite the presence of the Cilium annotations

io.cilium/global-service: 'true'
service.cilium.io/affinity: "remote"

the services appear to be bound to the local k8s cluster, resulting in a CockroachDB cluster of 3 nodes instead of 6 (3 in cluster A + 3 in cluster B).

(Hubble and CockroachDB console screenshots omitted.)

It does not matter which service (dbs-cockroachdb-public-remote-X or dbs-cockroachdb-remote-X) I use in my --join override:

    join:
      - dbs-cockroachdb-0.dbs-cockroachdb.dbs:26257
      - dbs-cockroachdb-1.dbs-cockroachdb.dbs:26257
      - dbs-cockroachdb-2.dbs-cockroachdb.dbs:26257
      - dbs-cockroachdb-public-remote-0.dbs:26257
      - dbs-cockroachdb-public-remote-1.dbs:26257
      - dbs-cockroachdb-public-remote-2.dbs:26257

The result is the same: 3 nodes instead of 6.
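For reference, one way to see how many nodes actually joined is CockroachDB's node status command; a minimal sketch, assuming an insecure deployment:

    # List the nodes currently known to the cluster:
    kubectl -n dbs exec dbs-cockroachdb-0 -- /cockroach/cockroach node status --insecure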

Any ideas?

Answer 1

Score: 2


Apparently, due to 7070, patching the CoreDNS ConfigMap is the most reasonable thing we can do: Cilium global services load-balance across the ClusterIPs of matching services, whereas a headless discovery service is resolved purely by the local CoreDNS, which knows nothing about pods in remote clusters. In the comments of that bug, an article is mentioned that provides additional context.

My twist to this story is that I updated the ConfigMap with a kubernetes plugin configuration:

apiVersion: v1
data:
  Corefile: |-
    saturn.local {
      log
      errors
      kubernetes saturn.local {
        endpoint https://[ENDPOINT]
        kubeconfig [PATH_TO_KUBECONFIG]
      }
    }
    rhea.local {
      ...

This way I could resolve names from the other clusters as well.
In my setup, each cluster has its own domain.local. PATH_TO_KUBECONFIG is a plain kubeconfig file; a generic secret has to be created in the kube-system namespace and the secret volume has to be mounted in the coredns deployment.
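A minimal sketch of those last two steps; the secret name remote-kubeconfig, the mount path, and the file name are hypothetical, and [PATH_TO_KUBECONFIG] in the Corefile would then point at the mounted file:

    # Create a generic secret in kube-system holding the remote cluster's kubeconfig:
    kubectl -n kube-system create secret generic remote-kubeconfig \
      --from-file=kubeconfig=./remote-cluster.kubeconfig

The corresponding fragment of the coredns Deployment then mounts that secret:

    # coredns Deployment (kube-system), abbreviated to the volume wiring:
    spec:
      template:
        spec:
          containers:
          - name: coredns
            volumeMounts:
            - name: remote-kubeconfig
              mountPath: /etc/coredns/remote
              readOnly: true
          volumes:
          - name: remote-kubeconfig
            secret:
              secretName: remote-kubeconfig

With this in place, [PATH_TO_KUBECONFIG] becomes /etc/coredns/remote/kubeconfig.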

