Kafka broker restart with "Replication factor: 3 larger than available brokers: 0."

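Question

From the information you provided, the problem appears when Kafka broker 2 joins the cluster and reports the error "org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 0."

Your colleague suggested setting the replication factor of every topic to 3 and reassigning the existing topics accordingly. This may be the root cause, because the replication factor specifies how many replicas of each partition are kept in the cluster, and if the cluster does not have enough brokers to satisfy the requested replication factor, this error is raised.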
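
The rule described above can be sketched in a few lines of Java. This is a simplified model for illustration, not Kafka's actual implementation: the class `ReplicationCheck`, the method `validate`, the broker ids 0 to 7, and the use of `IllegalArgumentException` in place of `InvalidReplicationFactorException` are all assumptions made here. It only shows that the number at the end of the message is the count of brokers visible to the node that handled the create-topic request.

```java
import java.util.Set;

public class ReplicationCheck {

    /**
     * Simplified model (an assumption, not Kafka's real code): reject a
     * create-topic request when the requested replication factor exceeds
     * the number of brokers the handling node currently considers alive.
     */
    public static void validate(int replicationFactor, Set<Integer> aliveBrokerIds) {
        if (replicationFactor > aliveBrokerIds.size()) {
            throw new IllegalArgumentException(
                "Replication factor: " + replicationFactor
                + " larger than available brokers: " + aliveBrokerIds.size() + ".");
        }
    }

    public static void main(String[] args) {
        // Eight visible brokers (hypothetical ids): a replication factor of 3 is accepted.
        validate(3, Set.of(0, 1, 2, 3, 4, 5, 6, 7));

        // An empty view of the cluster produces a message of the same form as in the log.
        try {
            validate(3, Set.of());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, a count of 0 suggests that the handling node saw no live brokers at that moment, which is a different condition from the cluster having fewer than three brokers.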

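To find the root cause, you can try the following steps:

  1. Make sure the Kafka cluster has enough brokers to satisfy the required replication factor.
  2. Check the replication factor of the existing topics and make sure it is consistent with the number of available brokers in the cluster.
  3. If possible, set the replication factor of the existing topics to 3 and reassign their partitions to match the new setting.
  4. Monitor the health of the Kafka cluster to make sure no other problem is causing the replication factor error.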
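
Steps 1 to 3 amount to comparing each topic's replication factor with the number of live brokers. The sketch below does that over hard-coded sample data. The class `ReplicationAudit`, its two methods, the topic name `orders`, and the replication factors shown are hypothetical; in practice the values would come from the cluster itself, for example from the output of kafka-topics --describe.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReplicationAudit {

    /** Returns the topics whose replication factor is below the target, in map order. */
    public static List<String> topicsBelowTarget(Map<String, Integer> replicationByTopic, int target) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : replicationByTopic.entrySet()) {
            if (entry.getValue() < target) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** A target replication factor can only be met if at least that many brokers are live. */
    public static boolean targetIsFeasible(int target, int liveBrokers) {
        return target <= liveBrokers;
    }

    public static void main(String[] args) {
        int liveBrokers = 8; // the cluster in the question has eight brokers
        int target = 3;      // the replication factor the colleague proposes

        // Hypothetical sample data, not taken from the real cluster.
        Map<String, Integer> replicationByTopic = new LinkedHashMap<>();
        replicationByTopic.put("_schemas", 3);
        replicationByTopic.put("orders", 1);

        System.out.println("Target feasible: " + targetIsFeasible(target, liveBrokers));
        System.out.println("Topics below target: " + topicsBelowTarget(replicationByTopic, target));
    }
}
```

Keeping the feasibility check separate from the per-topic audit makes it easier to tell a cluster that is too small apart from topics that were simply created with fewer replicas.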

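In short, your colleague's suggestion may be one way to address the problem, but you should make sure that the Kafka cluster itself has no other issues and that it has enough brokers to satisfy the required replication factor. If the problem persists, you may need to investigate other potential root causes.

Original question: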

We have a cluster of eight Kafka brokers with a five-node ZooKeeper ensemble on Confluent Platform 7.0.1. For example, when we restart broker id 2, the Kafka log shows the following:

[2023-06-15 09:56:09,906] INFO Socket connection established, initiating session, client: /10.136.132.61:36972, server: datagovstg-kfk04.deltaww.com/10.136.132.62:2181 (org.apache.zookeeper.ClientCnxn)
[2023-06-15 09:56:10,013] INFO Session establishment complete on server datagovstg-kfk04.deltaww.com/10.136.132.62:2181, session id = 0x535f9587f0000, negotiated timeout = 18000 (org.apache.zookeeper.ClientCnxn)
[2023-06-15 09:56:10,014] INFO [ZooKeeperClient Kafka server] Connected. (kafka.zookeeper.ZooKeeperClient)
[2023-06-15 09:56:11,283] INFO [feature-zk-node-event-process-thread]: Starting (kafka.server.FinalizedFeatureChangeListener$ChangeNotificationProcessorThread)
[2023-06-15 09:56:11,483] INFO Updated cache from existing <empty> to latest FinalizedFeaturesAndEpoch(features=Features{}, epoch=0). (kafka.server.FinalizedFeatureCache)
[2023-06-15 09:56:11,490] INFO Cluster ID = Xugt0nfcSjW7zXrDhEzkuA (kafka.server.KafkaServer)

// DELETED

[2023-06-15 09:56:13,644] INFO Creating /brokers/ids/2 (is it secure? false) (kafka.zk.KafkaZkClient)
[2023-06-15 09:56:13,732] INFO Stat of the created znode at /brokers/ids/2 is: 85899346228,85899346228,1686794161605,1686794161605,1,0,0,1466719931400192,329,0,85899346228
 (kafka.zk.KafkaZkClient)
[2023-06-15 09:56:13,732] INFO Registered broker 2 at path /brokers/ids/2 with addresses: PLAINTEXT://datagovstg-kfk03.deltaww.com:9092,SASL_PLAINTEXT://datagovstg-kfk03.deltaww.com:9093, czxid (broker epoch): 85899346228 (kafka.zk.KafkaZkClient)
[2023-06-15 09:56:13,789] INFO [ExpirationReaper-2-topic]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2023-06-15 09:56:13,793] INFO [ExpirationReaper-2-Heartbeat]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2023-06-15 09:56:13,794] INFO [ExpirationReaper-2-Rebalance]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2023-06-15 09:56:13,809] INFO [GroupCoordinator 2]: Starting up. (kafka.coordinator.group.GroupCoordinator)
[2023-06-15 09:56:13,818] INFO [GroupCoordinator 2]: Startup complete. (kafka.coordinator.group.GroupCoordinator)
[2023-06-15 09:56:13,834] INFO [TransactionCoordinator id=2] Starting up. (kafka.coordinator.transaction.TransactionCoordinator)
[2023-06-15 09:56:13,838] INFO [Transaction Marker Channel Manager 2]: Starting (kafka.coordinator.transaction.TransactionMarkerChannelManager)
[2023-06-15 09:56:13,838] INFO [TransactionCoordinator id=2] Startup complete. (kafka.coordinator.transaction.TransactionCoordinator)

// DELETED

[2023-06-15 09:56:21,639] INFO Kafka version: 7.0.1-ccs (org.apache.kafka.common.utils.AppInfoParser)
[2023-06-15 09:56:21,639] INFO Kafka commitId: b7e52413e7cb3e8b (org.apache.kafka.common.utils.AppInfoParser)
[2023-06-15 09:56:21,639] INFO Kafka startTimeMs: 1686794181632 (org.apache.kafka.common.utils.AppInfoParser)
[2023-06-15 09:56:21,641] INFO [KafkaServer id=2] started (kafka.server.KafkaServer)
[2023-06-15 09:56:21,940] INFO [Admin Manager on Broker 2]: Error processing create topic request CreatableTopic(name='_schemas', numPartitions=3, replicationFactor=3, assignments=[], configs=[]) (kafka.server.ZkAdminManager)
org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 0.
[2023-06-15 09:56:21,994] INFO [Admin Manager on Broker 2]: Error processing create topic request CreatableTopic(name='__consumer_offsets', numPartitions=50, replicationFactor=3, assignments=[], configs=[CreateableTopicConfig(name='compression.type', value='producer'), CreateableTopicConfig(name='cleanup.policy', value='compact'), CreateableTopicConfig(name='segment.bytes', value='104857600')]) (kafka.server.ZkAdminManager)
org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 0.
[2023-06-15 09:56:22,095] INFO [Admin Manager on Broker 2]: Error processing create topic request CreatableTopic(name='_schemas', numPartitions=3, replicationFactor=3, assignments=[], configs=[]) (kafka.server.ZkAdminManager)
org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 0.
[2023-06-15 09:56:22,136] INFO [Admin Manager on Broker 2]: Error processing create topic request CreatableTopic(name='_schemas', numPartitions=3, replicationFactor=3, assignments=[], configs=[]) (kafka.server.ZkAdminManager)
org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 0.
[2023-06-15 09:56:22,174] INFO [Admin Manager on Broker 2]: Error processing create topic request CreatableTopic(name='__consumer_offsets', numPartitions=50, replicationFactor=3, assignments=[], configs=[CreateableTopicConfig(name='compression.type', value='producer'), CreateableTopicConfig(name='cleanup.policy', value='compact'), CreateableTopicConfig(name='segment.bytes', value='104857600')]) (kafka.server.ZkAdminManager)

It looks like broker id 2 has joined the cluster, but it keeps logging:

org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 0.

I wrote a simple Java program that uses bootstrap.servers=broker1 to describe the cluster, and broker 2 appears among the available nodes. However, I cannot describe the cluster or list topics when I use bootstrap.servers=broker2. When I connect to ZooKeeper with zookeeper-shell and run ls /brokers/ids, I get all eight broker ids as children.
When I shut down all Kafka brokers and restart them, the problem disappears. I don't think this is the right way to handle this error.

We have some topics with only one replica. A colleague believes that this is the root cause and suggests that we should set the replication factor to 3 for each topic. The existing topics should be reassigned to have a replication factor of 3. Is this the root cause of the problem we are encountering?

Does anyone have any suggestions on what I can try to find the root cause?

Answer 1

Score: 0

The error is related to your Schema Registry's bootstrap.servers config, not Kafka itself, since it requires the _schemas and __consumer_offsets topics to exist, and the default replication factor for both is 3.

If you don't need the Schema Registry, you can ignore that, but consumer clients will also not be able to read data with consumer groups without the __consumer_offsets topic, regardless of the number of brokers in the cluster or what you've set as bootstrap.servers in them.

huangapple
  • Posted on June 15, 2023, 16:43:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76480686.html