Kubernetes Pod Cannot Resolve Service by Name

Question

I am new to Kubernetes and I am having trouble with my deployment. I have a local k8s cluster (minikube) running two Pods: RabbitMQ and a Python microservice that needs to connect to RabbitMQ. I have created a Service for RMQ, but I cannot get the microservice to connect using the Service name. If I use the Service Endpoint IP in place of the name, just to test that RMQ is deployed correctly, it works fine. I'm also aware that it's best practice to use Helm charts to deploy RabbitMQ, but since I'm learning k8s, I want to get this working first before migrating to Helm. I should also note that this all works when run with docker-compose; I want to migrate from compose to k8s.

rabbitMQ deployment yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: rabbitmq3
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: rabbitmq3
      template:
        metadata:
          labels:
            app: rabbitmq3
        spec:
          containers:
            - name: rabbitmq
              image: rabbitmq:3.11.10-management-alpine
              imagePullPolicy: IfNotPresent
              resources:
                limits:
                  memory: "256Mi"
                  cpu: "500m"
              ports:
                - containerPort: 5672
                  name: amqp
                - containerPort: 15672
                  name: discovery
          restartPolicy: Always
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: rabbitmq3
      name: rmqserver-ui
    spec:
      type: NodePort
      ports:
        - name: discovery
          port: 15672
          targetPort: 15672
      selector:
        app: rabbitmq3
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: rabbitmq3
      name: rmqserver
    spec:
      type: ClusterIP
      ports:
        - name: amqp
          port: 5672
          targetPort: 5672
      selector:
        app: rabbitmq3

The Python microservice uses Pika to connect to RabbitMQ via the DNS name (rmqserver, the name of the Service defined in the YAML above). This code runs in a separate Pod.

    import pika
    # self._rmqusername and self._rmqpassword are guest/guest
    credentials = pika.PlainCredentials(self._rmqusername, self._rmqpassword)
    # "rmqserver" is the name of the ClusterIP Service defined above
    params = pika.ConnectionParameters(host="rmqserver", credentials=credentials)
    connection = pika.BlockingConnection(params)
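
For reference, a Service is normally reachable inside the cluster under several DNS names, from the short name up to the fully qualified one. The sketch below only builds those names so each can be tried as the host value. The helper name service_dns_candidates is hypothetical, the namespace "default" is an assumption (the manifests above set no namespace), and "cluster.local" is the usual default cluster domain, which a cluster may override.

```python
def service_dns_candidates(service, namespace="default", cluster_domain="cluster.local"):
    """Return the DNS names a Pod can normally use for a Service, shortest first.

    namespace and cluster_domain are assumptions; adjust them to the real cluster.
    """
    return [
        service,                                         # short name, same namespace only
        f"{service}.{namespace}",                        # reachable from other namespaces
        f"{service}.{namespace}.svc",
        f"{service}.{namespace}.svc.{cluster_domain}",   # fully qualified name
    ]


# Print the candidates for the rmqserver Service from the manifest above.
for name in service_dns_candidates("rmqserver"):
    print(name)
```

If the fully qualified name resolves but the short name does not, the search domains in the Pod's resolver configuration are the likely suspect; if none resolve, cluster DNS itself is probably unreachable from that Pod.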

When I deploy the microservice to k8s the Pod logs throw:

    2023-03-03 19:54:26,572 [ERROR] Address resolution failed: gaierror(-3, 'Try again')
    2023-03-03 19:54:26,573 [ERROR] getaddrinfo failed: gaierror(-3, 'Try again').
    2023-03-03 19:54:26,573 [ERROR] AMQP connection workflow failed: AMQPConnectionWorkflowFailed: 1 exceptions in all; last exception - gaierror(-3, 'Try again'); first exception - None.
    2023-03-03 19:54:26,573 [ERROR] AMQPConnectionWorkflow - reporting failure: AMQPConnectionWorkflowFailed: 1 exceptions in all; last exception - gaierror(-3, 'Try again'); first exception - None
    2023-03-03 19:54:26,574 [ERROR] Connection workflow failed: AMQPConnectionWorkflowFailed: 1 exceptions in all; last exception - gaierror(-3, 'Try again'); first exception - None
    2023-03-03 19:54:26,574 [ERROR] Error in _create_connection().
    Traceback (most recent call last):
      File "/usr/lib/python3.8/site-packages/pika/adapters/blocking_connection.py", line 451, in _create_connection
        raise self._reap_last_connection_workflow_error(error)
      File "/usr/lib/python3.8/site-packages/pika/adapters/utils/selector_ioloop_adapter.py", line 565, in _resolve
        result = socket.getaddrinfo(self._host, self._port, self._family,
      File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno -3] Try again
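
The error code in the traceback, gaierror -3 "Try again", is what Linux reports for EAI_AGAIN, a temporary name resolution failure. A minimal sketch of retrying only on that code follows. The function name resolve_with_retry and its attempts, delay, resolver and sleep parameters are hypothetical, and a retry helps only if the failure really is transient; it would not repair a Pod that cannot reach cluster DNS at all.

```python
import socket
import time


def resolve_with_retry(host, port, attempts=5, delay=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host:port, retrying only on the temporary EAI_AGAIN failure.

    resolver and sleep are injectable so the logic can be exercised offline.
    """
    for attempt in range(1, attempts + 1):
        try:
            return resolver(host, port, proto=socket.IPPROTO_TCP)
        except socket.gaierror as exc:
            # Other codes (for example EAI_NONAME) are not transient: re-raise at once.
            # Also give up once the last attempt has failed.
            if exc.errno != socket.EAI_AGAIN or attempt == attempts:
                raise
            sleep(delay)
```

Inside the microservice Pod this could be called as resolve_with_retry("rmqserver", 5672) before opening the Pika connection, so that a persistent failure is reported as a DNS problem rather than an AMQP one.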

If I set the host to the Service Endpoint IP, there is no exception and it connects. This leads me to believe there is a problem with the DNS name lookup and that I'm missing a step or some configuration.
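
One way to narrow down a lookup problem like this is to inspect the resolver configuration the Pod actually received. The sketch below parses resolv.conf-style text into nameservers, search domains and options. The function name parse_resolv_conf is hypothetical, and the sample text is illustrative only: the nameserver address and search list are typical values, not output taken from this cluster.

```python
def parse_resolv_conf(text):
    """Extract nameservers, search domains and options from resolv.conf text."""
    config = {"nameservers": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines, which start with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        key, *values = line.split()
        if key == "nameserver":
            config["nameservers"].extend(values)
        elif key == "search":
            config["search"] = values  # a later search line replaces an earlier one
        elif key == "options":
            config["options"].extend(values)
    return config


# Illustrative sample (assumed values, not taken from the cluster in question).
SAMPLE = """\
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
"""

print(parse_resolv_conf(SAMPLE))
```

Inside the Pod the real file can be read with open("/etc/resolv.conf").read() and passed to the same function. A missing search list, or a nameserver that is not the cluster DNS Service, would explain why the short name rmqserver fails while the IP works.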

Answer 1

Score: 0

Reporting back with an update: the problem is fixed. I deleted a worker node that I had created with $ minikube node add when I was first learning to set up minikube.

    $ minikube status
    minikube
    type: Control Plane
    host: Running
    kubelet: Running
    apiserver: Running
    kubeconfig: Configured

    minikube-m02
    type: Worker
    host: Running
    kubelet: Running

After deleting minikube-m02 and reapplying the RMQ Deployment and the Python service Pod, everything connected just fine. I'm not sure why the extra node caused the problem, though.
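
The answer does not establish why the second node mattered. One way to check whether a failing Pod was scheduled on the worker node is to group Pods by the node they run on. The sketch below does that for the JSON printed by kubectl get pods -o json, using the spec.nodeName field. The function name pods_by_node and the Pod names in the sample are hypothetical.

```python
import json
from collections import defaultdict


def pods_by_node(pod_list_json):
    """Group Pod names by the node they are scheduled on.

    Expects the JSON text printed by `kubectl get pods -o json`.
    """
    grouped = defaultdict(list)
    for item in json.loads(pod_list_json).get("items", []):
        # spec.nodeName is absent while a Pod is still unscheduled.
        node = item.get("spec", {}).get("nodeName", "<unscheduled>")
        grouped[node].append(item["metadata"]["name"])
    return dict(grouped)


# Hypothetical sample: the Pod names are made up for illustration.
SAMPLE_PODS = json.dumps({
    "items": [
        {"metadata": {"name": "rabbitmq3-example"}, "spec": {"nodeName": "minikube"}},
        {"metadata": {"name": "python-service-example"}, "spec": {"nodeName": "minikube-m02"}},
    ]
})

print(pods_by_node(SAMPLE_PODS))
```

If only the Pods on minikube-m02 fail to resolve Service names, that points at networking or DNS reachability from that node rather than at the Service definition.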

huangapple
  • Posted on March 4, 2023 at 04:06:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75631453.html