Kafka: events published from the host machine are not consumed by the application running in Docker
Question
I am writing end-to-end tests for an application. I start an instance of an application, a Kafka instance, and a Zookeeper (all Dockerized) and then I interact with the application API to test its functionality. I need to test an event consumer's functionality in this application. I publish events from my tests and the application is expected to handle them.
Problem: If I run the application locally (not in Docker) and run tests that produce events, the consumer in the application code handles the events correctly. In this case, both the consumer and the test have bootstrapServers set to localhost:9092. But if the application runs as a Dockerized instance, it doesn't see the events. In this case bootstrapServers is set to kafka:9092 in the application and to localhost:9092 in the test, where kafka is the Docker container name. The kafka container exposes its 9092 port to the host so that the same Kafka instance can be accessed both from inside a Docker container and from the host (which runs my tests).
The only difference in the code is localhost vs. kafka set as the bootstrap server. In both scenarios consumers and producers start successfully; events are published without errors. It is just that in one case the consumer doesn't receive events.
Question: How can I make Dockerized consumers see events published from the host machine?
Note: I have a properly configured Docker network that includes the application instance, Zookeeper, and Kafka. They all "see" each other. The corresponding ports of kafka and zookeeper are exposed to the host.
Kafka ports: 0.0.0.0:9092->9092/tcp. Zookeeper ports: 22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp.
I am using wurstmeister/kafka and wurstmeister/zookeeper Docker images (I cannot replace them).
Any ideas/thoughts are appreciated. How would you debug it?
UPDATE: The issue was with the KAFKA_ADVERTISED_LISTENERS and KAFKA_LISTENERS env variables, which were set to different ports for INSIDE and OUTSIDE communications. The solution was to use the correct port in the application code when it runs inside a Docker container.
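To illustrate the update, here is a minimal Python sketch of how a client could pick its bootstrap address from a KAFKA_ADVERTISED_LISTENERS value that defines separate INSIDE and OUTSIDE listeners. The listener string, its ports, and the helper names parse_listeners and bootstrap_for are hypothetical; the original post does not show the actual values.

```python
def parse_listeners(value: str) -> dict[str, str]:
    """Map each listener name to its host:port, e.g. {'INSIDE': 'kafka:9093'}."""
    listeners = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        listeners[name] = address
    return listeners


def bootstrap_for(value: str, running_in_docker: bool) -> str:
    """Return the advertised address that matches where the client runs."""
    listeners = parse_listeners(value)
    return listeners["INSIDE" if running_in_docker else "OUTSIDE"]


# Assumed example value, not taken from the original post.
ADVERTISED = "INSIDE://kafka:9093,OUTSIDE://localhost:9092"

if __name__ == "__main__":
    print("app in Docker :", bootstrap_for(ADVERTISED, running_in_docker=True))
    print("tests on host :", bootstrap_for(ADVERTISED, running_in_docker=False))
```

With a configuration like this, an application inside Docker that keeps using kafka:9092 would be pointing at the wrong listener port, which matches the symptom described in the update.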
Answer 1
Score: 3
These kinds of issues are usually related to the way Kafka handles the broker's address.
When you start a Kafka broker, it binds itself to 0.0.0.0:9092 and registers itself in Zookeeper with the address <hostname>:9092. When you connect with a client, Zookeeper is contacted to fetch the address of the specific broker.
This means that when you start a Kafka container you have a situation like the following:
- container name: kafka
- network name: kafkanet
- hostname: kafka
- registration on zookeeper: kafka:9092
Now, if you connect a client to your Kafka from a container inside the kafkanet network, the address you get back from Zookeeper is kafka:9092, which is resolvable through the kafkanet network.
However, if you connect to Kafka from outside Docker (i.e. using the localhost:9092 endpoint mapped by Docker), you still get back the kafka:9092 address, which is not resolvable.
To address this issue, you can specify advertised.host.name and advertised.port in the broker configuration in such a way that the address is resolvable by all clients (see documentation).
What is usually done is to set advertised.host.name to <container-name>.<network> (in your case something like kafka.kafkanet) so that any container connected to the network can correctly resolve the IP of the Kafka broker.
In your case, however, you have a mixed network configuration: some components live inside Docker (and can therefore resolve names on the kafkanet network) while others live outside it. If this were a production system, my suggestion would be to set advertised.host.name to the DNS name or IP of the host machine and always rely on Docker port mapping to reach the Kafka broker.
From my understanding, however, you only need this setup to test things out, so the easiest thing would be to "trick" the system living outside Docker. Using the naming specified above, this simply means adding the line 127.0.0.1 kafka.kafkanet to your /etc/hosts (or the Windows equivalent).
This way, when your client living outside Docker connects to Kafka, the following should happen:
- client -> Kafka via localhost:9092
- Kafka queries Zookeeper and returns the host kafka.kafkanet
- client resolves kafka.kafkanet to 127.0.0.1
- client -> Kafka via 127.0.0.1:9092
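The steps above can be sketched as a small Python simulation. It does not talk to a real broker: the function connect and the hosts dictionaries are hypothetical stand-ins for the client's name resolution (DNS plus /etc/hosts), and the addresses reuse the kafka.kafkanet naming from this answer.

```python
def connect(bootstrap: str, advertised: str, hosts: dict[str, str]) -> str:
    """Simulate the two-step connection: bootstrap first, then the advertised address.

    Returns the resolved ip:port the client ends up talking to, or raises
    LookupError if either hostname cannot be resolved by the client.
    """
    for address in (bootstrap, advertised):
        host, _, port = address.rpartition(":")
        if host not in hosts:
            raise LookupError(f"cannot resolve {host!r}")
    # After the loop, host and port belong to the advertised address.
    return f"{hosts[host]}:{port}"


# Name resolution as seen from the host machine, before and after the /etc/hosts edit.
HOSTS_BEFORE = {"localhost": "127.0.0.1"}
HOSTS_AFTER = {"localhost": "127.0.0.1", "kafka.kafkanet": "127.0.0.1"}

if __name__ == "__main__":
    try:
        connect("localhost:9092", "kafka.kafkanet:9092", HOSTS_BEFORE)
    except LookupError as error:
        print("before /etc/hosts edit:", error)
    print("after /etc/hosts edit:", connect("localhost:9092", "kafka.kafkanet:9092", HOSTS_AFTER))
```

The point of the sketch is that the bootstrap address only has to work for the first step; all later traffic goes to whatever address the broker advertises.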
EDIT
As pointed out in a comment, newer Kafka versions now use the concept of listeners and advertised.listeners, which are used in place of host.name and advertised.host.name (these are deprecated and only used if the former are not specified). The general idea is the same, however:
- host.name: specifies the host to which the Kafka broker should bind (works in conjunction with port)
- listeners: specifies all the endpoints to which the Kafka broker should bind (for instance PLAINTEXT://0.0.0.0:9092,SSL://0.0.0.0:9091)
- advertised.host.name: specifies how the broker is advertised to clients (i.e. which address clients should use to connect to it)
- advertised.listeners: specifies all the advertised endpoints (for instance PLAINTEXT://kafka.example.com:9092,SSL://kafka.example.com:9091)
In both cases, for clients to communicate successfully with Kafka, they need to be able to resolve and connect to the advertised hostname and port.
In both cases, if these values are not specified, the broker derives them automatically from the hostname of the machine it is running on.
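As a debugging aid, a short Python sketch like the following could check whether an advertised host and port are resolvable and reachable from wherever the client runs (the host, or inside the application container). The helper name can_connect is hypothetical, and the script only tests TCP reachability; it does not speak the Kafka protocol.

```python
import socket
import sys


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if the name resolves and a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures (socket.gaierror), refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Usage (addresses are examples): python check.py localhost:9092 kafka.kafkanet:9092
    for address in sys.argv[1:]:
        host, _, port = address.rpartition(":")
        status = "reachable" if can_connect(host, int(port)) else "NOT reachable"
        print(f"{address}: {status}")
```

Running it once on the host and once inside the application container, against the addresses the broker advertises, should show which side cannot reach its advertised endpoint.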
Answer 2
Score: 1
You kept referencing 8092. Was that intentional? Kafka runs on 9092. The easiest test is to download the same version of Kafka and manually run its kafka-console-consumer and kafka-console-producer scripts to see if you can pub-sub from your host machine.
Answer 3
Score: 0
Did you try "host.docker.internal" in the Dockerized application?
Answer 4
Score: 0
You could create a Docker network for your containers; the containers will then be able to resolve each other's hostnames and communicate.
Note: this works with docker-compose as well as with standalone containers.