Apache Kafka: Trouble deleting and re-creating topic with a dot in the name

Question

<details>
<summary>English:</summary>

I have run into an issue in Kafka: if I create a topic with a dot in the name, delete it, and then create it again, topic creation fails. I am using kafka_2.13-3.3.1 with KRaft on a 5-node cluster.

I originally ran into this problem while setting up MirrorMaker2. It creates topics with dots in the name, and I nuked the MM2 topics to redo MM2; now MM2 can't recreate its topics.

Anyway, here is a simple CLI example:

bin/kafka-topics.sh --create --topic a.test.topic --bootstrap-server kfk-01:9092

WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic a.test.topic.

bin/kafka-topics.sh --delete --topic a.test.topic --bootstrap-server kfk-01:9092

bin/kafka-topics.sh --create --topic a.test.topic --bootstrap-server kfk-01:9092

WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Error while executing topic command : The server experienced an unexpected error when processing the request.
[2023-02-06 19:35:59,110] ERROR org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request.


I don't think this is a timing issue: if I perform this exercise with a topic without dots in the name, it always succeeds.

I am getting some messages in the server logs on the local node. They go back and forth between this one:

[2023-02-06 20:01:31,740] WARN [Controller 1] createTopics: failed with unknown server exception NoSuchElementException at epoch 10188 in 40 us. Renouncing leadership and reverting to the last committed offset 2760857. (org.apache.kafka.controller.QuorumController)
java.util.NoSuchElementException
        at org.apache.kafka.timeline.SnapshottableHashTable$CurrentIterator.next(SnapshottableHashTable.java:167)
        at org.apache.kafka.timeline.SnapshottableHashTable$CurrentIterator.next(SnapshottableHashTable.java:139)
        at org.apache.kafka.timeline.TimelineHashSet$ValueIterator.next(TimelineHashSet.java:120)
        at org.apache.kafka.controller.ReplicationControlManager.validateNewTopicNames(ReplicationControlManager.java:799)
        at org.apache.kafka.controller.ReplicationControlManager.createTopics(ReplicationControlManager.java:567)
        at org.apache.kafka.controller.QuorumController.lambda$createTopics$7(QuorumController.java:1832)
        at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:767)
        at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
        at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
        at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
        at java.base/java.lang.Thread.run(Thread.java:829)
[2023-02-06 20:01:31,740] INFO [RaftManager nodeId=1] Received user request to resign from the current epoch 10188 (org.apache.kafka.raft.KafkaRaftClient)
[2023-02-06 20:01:31,740] INFO [RaftManager nodeId=1] Completed transition to ResignedState(localId=1, epoch=10188, voters=[1, 2, 3, 4, 5], electionTimeoutMs=1140, unackedVoters=[2, 3, 4, 5], preferredSuccessors=[2, 3, 4, 5]) (org.apache.kafka.raft.QuorumState)
[2023-02-06 20:01:31,750] INFO [RaftManager nodeId=1] Completed transition to Unattached(epoch=10189, voters=[1, 2, 3, 4, 5], electionTimeoutMs=1909) (org.apache.kafka.raft.QuorumState)
[2023-02-06 20:01:31,750] INFO [Controller 1] In the new epoch 10189, the leader is (none). (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:01:31,754] INFO [RaftManager nodeId=1] Completed transition to Voted(epoch=10189, votedId=2, voters=[1, 2, 3, 4, 5], electionTimeoutMs=1929) (org.apache.kafka.raft.QuorumState)
[2023-02-06 20:01:31,754] INFO [RaftManager nodeId=1] Vote request VoteRequestData(clusterId='ZmJlNWVjMDI5OWFlNDVhYw', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=10189, candidateId=2, lastOffsetEpoch=10188, lastOffset=2760858)])]) wit
[2023-02-06 20:01:31,763] INFO [RaftManager nodeId=1] Completed transition to FollowerState(fetchTimeoutMs=2000, epoch=10189, leaderId=2, voters=[1, 2, 3, 4, 5], highWatermark=Optional.empty, fetchingSnapshot=Optional.empty) (org.apache.kafka.raft.QuorumState)
[2023-02-06 20:01:31,763] INFO [Controller 1] In the new epoch 10189, the leader is 2. (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:01:31,823] ERROR [Controller 1] processBrokerHeartbeat: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:01:33,468] ERROR [Controller 1] processBrokerHeartbeat: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:01:33,512] ERROR [Controller 1] processBrokerHeartbeat: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:01:33,595] ERROR [Controller 1] processBrokerHeartbeat: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:01:33,595] INFO [BrokerToControllerChannelManager broker=1 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
[2023-02-06 20:01:33,595] INFO [BrokerToControllerChannelManager broker=1 name=heartbeat]: Recorded new controller, from now on will use broker kfk-02:9091 (id: 2 rack: null) (kafka.server.BrokerToControllerRequestThread)
[2023-02-06 20:01:33,671] ERROR [Controller 1] processBrokerHeartbeat: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)


and this one:

[2023-02-06 20:02:15,898] ERROR [Controller 1] createTopics: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:02:16,034] INFO [RaftManager nodeId=1] Become candidate due to fetch timeout (org.apache.kafka.raft.KafkaRaftClient)
[2023-02-06 20:02:16,039] INFO [RaftManager nodeId=1] Completed transition to CandidateState(localId=1, epoch=10190, retries=1, electionTimeoutMs=1411) (org.apache.kafka.raft.QuorumState)
[2023-02-06 20:02:16,040] INFO [Controller 1] In the new epoch 10190, the leader is (none). (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:02:16,051] INFO [RaftManager nodeId=1] Completed transition to Leader(localId=1, epoch=10190, epochStartOffset=2760947, highWatermark=Optional.empty, voterStates={1=ReplicaState(nodeId=1, endOffset=Optional.empty, lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, hasAcknowledgedLeader=
[2023-02-06 20:02:16,067] INFO [Controller 1] Becoming the active controller at epoch 10190, committed offset 2760946, committed epoch 10189 (org.apache.kafka.controller.QuorumController)
[2023-02-06 20:02:17,742] INFO [BrokerToControllerChannelManager broker=1 name=heartbeat] Client requested disconnect from node 2 (org.apache.kafka.clients.NetworkClient)
[2023-02-06 20:02:17,743] INFO [BrokerToControllerChannelManager broker=1 name=heartbeat]: Recorded new controller, from now on will use broker kfk-01:9091 (id: 1 rack: null) (kafka.server.BrokerToControllerRequestThread)


What am I doing wrong?  Is this a bug?
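
To trace the back-and-forth described above across a longer log file, a small summarizer can help. This is a minimal Python sketch, assuming the log lines look exactly like the ones quoted in this question. The function name `summarize_controller_log` and both regular expressions are illustrative and not part of Kafka.

```python
import re

# Matches QuorumController lines such as:
#   "... INFO [Controller 1] In the new epoch 10189, the leader is 2. (...)"
LEADER_RE = re.compile(r"In the new epoch (\d+), the leader is (\(none\)|\d+)\.")

# Matches the WARN line logged when a request fails and the controller resigns:
#   "... WARN [Controller 1] createTopics: failed with unknown server exception
#    NoSuchElementException at epoch 10188 in 40 us. ..."
RESIGN_RE = re.compile(
    r"\] (\w+): failed with unknown server exception (\w+) at epoch (\d+)"
)


def summarize_controller_log(lines):
    """Return (leader_changes, resignations) found in controller log lines.

    leader_changes: list of (epoch, leader_id), leader_id is None for "(none)"
    resignations:   list of (operation, exception_name, epoch)
    """
    leader_changes = []
    resignations = []
    for line in lines:
        match = LEADER_RE.search(line)
        if match:
            leader = None if match.group(2) == "(none)" else int(match.group(2))
            leader_changes.append((int(match.group(1)), leader))
            continue
        match = RESIGN_RE.search(line)
        if match:
            resignations.append(
                (match.group(1), match.group(2), int(match.group(3)))
            )
    return leader_changes, resignations
```

Feeding it the lines of the server log (for example `summarize_controller_log(open(path))`, with `path` pointing at your own log file) gives one entry per leadership change and one per failed request that caused a resignation.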

</details>


# Answer 1
**Score**: 0

<details>
<summary>English:</summary>

Use Kafka 3.3.2, where this is fixed:

https://issues.apache.org/jira/browse/KAFKA-14337
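
The stack trace in the question fails inside `validateNewTopicNames` while calling `next()` on a set iterator, which is what would happen if an empty set of "colliding" names were left behind after a delete. The toy model below is only a guess at that failure mode, written in Python rather than Java. `ToyTopicRegistry`, its fields, and the `buggy_delete` flag are all hypothetical and are not Kafka's actual code.

```python
class ToyTopicRegistry:
    """Toy model of a controller-side topic index. NOT Kafka's real code.

    Assumption: every name is indexed under a normalized key ('.' replaced
    by '_') so that names differing only in '.' vs '_' can be detected.
    """

    def __init__(self, buggy_delete):
        self.topics = set()
        self.by_normalized = {}  # normalized name -> set of real names
        self.buggy_delete = buggy_delete

    @staticmethod
    def normalize(name):
        return name.replace(".", "_")

    def create(self, name):
        if name in self.topics:
            raise ValueError(f"topic {name!r} already exists")
        key = self.normalize(name)
        colliding = self.by_normalized.get(key)
        if colliding is not None:
            # Reports the first colliding name. On an empty set this raises
            # StopIteration, Python's counterpart of NoSuchElementException.
            other = next(iter(colliding))
            raise ValueError(f"topic {name!r} collides with {other!r}")
        self.topics.add(name)
        self.by_normalized.setdefault(key, set()).add(name)

    def delete(self, name):
        self.topics.remove(name)
        key = self.normalize(name)
        colliding = self.by_normalized[key]
        colliding.remove(name)
        if not colliding:
            # Hypothetical bug: removing the entry under the raw name instead
            # of the normalized key leaves an empty set for dotted names.
            self.by_normalized.pop(name if self.buggy_delete else key, None)
```

With `buggy_delete=True`, a create/delete/create cycle works for a name like `plain` (its raw and normalized names are identical) but fails on the second create for `a.test.topic`, which matches the behavior reported in the question.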

Or heed the warning and use underscores (or hyphens) instead of dots, since metrics will always use dot notation.
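
The CLI warning is about names that differ only in '.' versus '_'. A pre-flight check for that can be sketched as follows, assuming that "collide" means "equal after replacing every '.' with '_'". The function names are illustrative and do not call any Kafka API.

```python
def has_collision_chars(topic):
    """True if the name contains a character the CLI warning singles out."""
    return "." in topic or "_" in topic


def unify_collision_chars(topic):
    """Assumed normalization: treat '.' and '_' as the same character."""
    return topic.replace(".", "_")


def find_collisions(candidate, existing_topics):
    """Return existing names that differ from `candidate` only in '.' vs '_'."""
    key = unify_collision_chars(candidate)
    return sorted(
        topic
        for topic in existing_topics
        if topic != candidate and unify_collision_chars(topic) == key
    )
```

A name made only of letters, digits, and hyphens, such as `a-test-topic`, never trips `has_collision_chars`, which is why hyphens are the safest separator under this assumption.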

</details>



huangapple
  • Posted on 2023-02-07 04:24:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75366203.html