Kubernetes Pod Cannot Resolve Service by Name

Question

I am new to Kubernetes and am having trouble with my deployment. I have a local k8s cluster (minikube) running two Pods: RabbitMQ and a Python microservice that needs to connect to RabbitMQ. I have created a Service for RMQ, but I cannot get the microservice to connect using the Service name. If I use the Service Endpoint IP in place of the name, just to test that RMQ is deployed correctly, it works fine. I'm aware that it's best practice to use Helm charts to deploy RabbitMQ, but since I'm learning k8s, I want to get this working first before migrating to Helm. I should also note that this all works when run with docker-compose; I want to migrate from compose to k8s.

rabbitMQ deployment yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbitmq3
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq3
  template:
    metadata:
      labels:
        app: rabbitmq3
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:3.11.10-management-alpine
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: "256Mi"
            cpu: "500m"
        ports:
          - containerPort: 5672
            name: amqp
          - containerPort: 15672
            name: discovery
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: rabbitmq3
  name: rmqserver-ui
spec:
  type: NodePort
  ports:
    - name: discovery
      port: 15672
      targetPort: 15672
  selector:
    app: rabbitmq3
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: rabbitmq3
  name: rmqserver
spec:
  type: ClusterIP
  ports:
    - name: amqp
      port: 5672
      targetPort: 5672
  selector:
    app: rabbitmq3

The Python microservice uses Pika to connect to RabbitMQ via the Service's DNS name (rmqserver, as defined in the Service YAML above). This code runs in a separate Pod.

import pika

# self._rmqusername and self._rmqpassword are guest/guest
credentials = pika.PlainCredentials(self._rmqusername, self._rmqpassword)
# "rmqserver" is the Service name from the YAML above; the port defaults to 5672
connection = pika.BlockingConnection(pika.ConnectionParameters(host="rmqserver", credentials=credentials))

When I deploy the microservice to k8s the Pod logs throw:

2023-03-03 19:54:26,572 [ERROR] Address resolution failed: gaierror(-3, 'Try again')
2023-03-03 19:54:26,573 [ERROR] getaddrinfo failed: gaierror(-3, 'Try again').
2023-03-03 19:54:26,573 [ERROR] AMQP connection workflow failed: AMQPConnectionWorkflowFailed: 1 exceptions in all; last exception - gaierror(-3, 'Try again'); first exception - None.
2023-03-03 19:54:26,573 [ERROR] AMQPConnectionWorkflow - reporting failure: AMQPConnectionWorkflowFailed: 1 exceptions in all; last exception - gaierror(-3, 'Try again'); first exception - None
2023-03-03 19:54:26,574 [ERROR] Connection workflow failed: AMQPConnectionWorkflowFailed: 1 exceptions in all; last exception - gaierror(-3, 'Try again'); first exception - None
2023-03-03 19:54:26,574 [ERROR] Error in _create_connection().
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/pika/adapters/blocking_connection.py", line 451, in _create_connection
    raise self._reap_last_connection_workflow_error(error)
  File "/usr/lib/python3.8/site-packages/pika/adapters/utils/selector_ioloop_adapter.py", line 565, in _resolve
    result = socket.getaddrinfo(self._host, self._port, self._family,
  File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Try again
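
The error code in this traceback, -3 with the message "Try again", appears to correspond to EAI_AGAIN on Linux, which the resolver uses for a temporary name-resolution failure. One way to tell a transient failure (for example, cluster DNS not ready yet) from a persistent one is to retry only on that code. The following is a minimal sketch, not part of the original service: `call_with_dns_retry` and its parameters are hypothetical names, and `connect` stands for any zero-argument callable that opens the connection.

```python
import socket
import time


def call_with_dns_retry(connect, attempts=5, delay=2.0, sleep=time.sleep):
    """Call connect(); retry only when name resolution fails with EAI_AGAIN.

    Any other exception, or EAI_AGAIN on the final attempt, is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except socket.gaierror as exc:
            # EAI_AGAIN signals a temporary resolver failure; other
            # gaierror codes (e.g. EAI_NONAME) are not worth retrying.
            if exc.errno != socket.EAI_AGAIN or attempt == attempts:
                raise
            sleep(delay)
```

With Pika, `connect` could be something like `lambda: pika.BlockingConnection(params)`. If every attempt still fails, as it would here, the problem is with cluster DNS itself rather than with timing.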

If I set the host to the Service Endpoint IP, there is no exception and it connects. This leads me to believe that there is a problem with the DNS name lookup and that I'm missing a step or some configuration.
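
To check the DNS lookup in isolation, without Pika in the way, a small script can be run inside the microservice Pod. This is a minimal sketch under assumptions: the Service lives in the `default` namespace (the YAML above sets none) and the cluster domain is the usual `cluster.local`. The function names are made up for illustration.

```python
import socket


def candidate_names(service, namespace="default", cluster_domain="cluster.local"):
    """Return the short, namespaced, and fully qualified forms of a Service name."""
    return [
        service,
        f"{service}.{namespace}",
        f"{service}.{namespace}.svc.{cluster_domain}",
    ]


def resolve(host, port=5672):
    """Return (sorted unique addresses, None) on success or ([], error text) on failure."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        return [], str(exc)
    return sorted({info[4][0] for info in infos}), None


def report(service):
    """Print the lookup result for every candidate form of the Service name."""
    for name in candidate_names(service):
        addresses, error = resolve(name)
        print(name, "->", addresses if error is None else f"FAILED ({error})")
```

Calling `report("rmqserver")` from inside the Pod shows which forms resolve. If all three fail with the same error, the Pod most likely cannot reach the cluster DNS service at all, which points away from the Service definition.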

Answer 1

Score: 0

Reporting back with an update that fixed the problem: I deleted a worker node that I had created with $ minikube node add when I was first learning to set up minikube.

$ minikube status

minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

minikube-m02
type: Worker
host: Running
kubelet: Running

After deleting minikube-m02 and reapplying the RMQ Deployment and the Python service Pod, it connected just fine. I'm not sure why the extra node caused the problem, though.
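
For anyone trying to find out why a particular node breaks name lookups, one possible starting point is to check, from a Pod scheduled on that node, which DNS server the Pod is configured to use and whether it is reachable. The sketch below makes assumptions: it expects the standard `/etc/resolv.conf` format, the helper names are hypothetical, and a TCP connection to port 53 is only a rough reachability check, not proof that lookups work.

```python
import socket


def parse_resolv_conf(text):
    """Extract nameserver addresses and search domains from resolv.conf text."""
    nameservers, search = [], []
    for line in text.splitlines():
        # Drop trailing comments, then split the directive into fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            nameservers.append(fields[1])
        elif fields and fields[0] == "search":
            search = fields[1:]
    return nameservers, search


def dns_port_reachable(nameserver, timeout=2.0):
    """Return True if a TCP connection to port 53 on the nameserver succeeds."""
    try:
        with socket.create_connection((nameserver, 53), timeout=timeout):
            return True
    except OSError:
        return False
```

Inside a Pod, this could be used as `parse_resolv_conf(open("/etc/resolv.conf").read())` followed by `dns_port_reachable(...)` for each nameserver. If the nameserver is reachable from Pods on one node but not on the other, the issue likely lies in networking between the nodes rather than in the Service or the application.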

Posted by huangapple on March 4, 2023, 04:06:34
Source: https://go.coder-hub.com/75631453.html