Runbook

Kafka Leader Election Failures Incident

Overview

This incident type covers failures of the leader election process in Apache Kafka, a distributed streaming platform. Leader election determines which broker handles read and write requests for each partition. When election fails, the affected partitions can be left without a leader, which can cause service disruptions, message loss, and data inconsistencies. Treat this as an incident that needs immediate attention to keep the Kafka cluster stable and reliable.

Debug

Check ZooKeeper status
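One way to run this check is to send a ZooKeeper four-letter command to each ensemble member and read the reported mode. The sketch below assumes the `srvr` command is allowed on the server (recent ZooKeeper versions restrict four-letter commands through `4lw.commands.whitelist`) and that the client port is 2181. The function names are illustrative, not part of any Kafka or ZooKeeper tooling.

```python
import socket


def zk_four_letter(host, port=2181, cmd="srvr", timeout=3.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        # ZooKeeper closes the connection after replying, so read until EOF.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zk_mode(srvr_reply):
    """Return the value of the 'Mode:' line (leader, follower, standalone), or None."""
    for line in srvr_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

For example, `zk_mode(zk_four_letter("zk1.example.internal"))` (a hypothetical host name) should report `leader` on one ensemble member and `follower` on the others. A result of `None` suggests the node is not currently serving requests.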

Check Kafka broker status

Check Kafka logs for any errors

Check ZooKeeper logs for any errors
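The two log checks above can be narrowed with a small filter that keeps only warning and error lines mentioning election-related terms. This is a sketch: the keyword list is an assumption to be tuned against the messages your brokers and ZooKeeper nodes actually write, and the log path is site-specific.

```python
import re

# Assumed keywords; adjust to match the messages seen in your own logs.
KEYWORDS = ("leader", "election", "controller", "zookeeper", "session")
LEVEL_RE = re.compile(r"\b(ERROR|FATAL|WARN)\b")


def suspicious_lines(lines, keywords=KEYWORDS):
    """Return (line_number, level, text) for WARN/ERROR/FATAL lines that mention a keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = LEVEL_RE.search(line)
        if match and any(word in line.lower() for word in keywords):
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits
```

To use it, open the broker or ZooKeeper log file and pass the file object to `suspicious_lines`. Each hit carries the line number, so the surrounding context is easy to find in the file.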

Check the Kafka topic for any issues
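For the topic check, the output of `kafka-topics.sh --describe` can be scanned for partitions that have no leader or a shrunken in-sync replica set. The sketch below assumes the `Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ...` line layout and treats any non-numeric leader value (for example `none` or `-1`) as leaderless. Verify the layout against the output of your Kafka version before relying on it.

```python
import re

# One line per partition in the describe output (layout assumed, see above).
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)"
    r"\s+Leader:\s*(?P<leader>\S+)"
    r"\s+Replicas:\s*(?P<replicas>[\d,]*)\s+Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(field):
    """Split a comma-separated broker id list, ignoring empty entries."""
    return [item for item in field.split(",") if item]


def partition_problems(describe_output):
    """Return (leaderless, under_replicated) lists of (topic, partition) pairs."""
    leaderless, under_replicated = [], []
    for line in describe_output.splitlines():
        match = PARTITION_RE.search(line)
        if not match:
            continue  # topic summary lines and blank lines do not match
        key = (match.group("topic"), int(match.group("partition")))
        if not match.group("leader").isdigit():
            leaderless.append(key)
        if len(_ids(match.group("isr"))) < len(_ids(match.group("replicas"))):
            under_replicated.append(key)
    return leaderless, under_replicated
```

Leaderless partitions are the direct symptom of this incident. Under-replicated partitions are worth noting as well, since a partition with an empty in-sync replica set has no eligible replica to elect unless unclean election is enabled.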

Check the Kafka broker configuration
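When reviewing the broker configuration, it can help to pull out only the settings that influence leader election and ZooKeeper connectivity. The sketch below reads the text of a `server.properties` file with a deliberately simplified parser (plain `key=value` lines only, no line continuations). The choice of keys is an assumption about what is relevant here, not an exhaustive list.

```python
# Broker settings that commonly matter for leader election (selection is an assumption).
ELECTION_KEYS = (
    "broker.id",
    "zookeeper.connect",
    "zookeeper.session.timeout.ms",
    "unclean.leader.election.enable",
    "auto.leader.rebalance.enable",
    "min.insync.replicas",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def election_settings(text):
    """Return the election-related settings; None means the key is not set in the file."""
    props = parse_properties(text)
    return {key: props.get(key) for key in ELECTION_KEYS}
```

A value of `None` means the broker falls back to its built-in default for that key. Comparing the result across brokers is a quick way to spot a node whose `broker.id` or `zookeeper.connect` differs from what is expected.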

Check the ZooKeeper configuration

Check the Kafka broker leader election status

Check the ZooKeeper node status

Check the ZooKeeper node children
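The ZooKeeper node checks above usually come down to two questions: which broker currently holds the controller role, and which brokers are registered. The sketch below assumes you have already fetched the data of the `/controller` znode and the child list of `/brokers/ids` (for example with the ZooKeeper CLI) and only interprets the text. It assumes the controller payload is JSON with a `brokerid` field and that the child list is passed as the bracketed line, such as `[1, 2, 3]`. The function names are illustrative.

```python
import json
import re


def controller_broker_id(controller_json):
    """Return the broker id stored in the /controller znode payload, or None if unreadable."""
    try:
        return json.loads(controller_json).get("brokerid")
    except (ValueError, AttributeError):
        return None


def missing_brokers(ls_output, expected_ids):
    """Return expected broker ids that are absent from the /brokers/ids child list."""
    registered = {int(item) for item in re.findall(r"\d+", ls_output)}
    return sorted(set(expected_ids) - registered)
```

If `controller_broker_id` returns `None`, or the broker it names appears in the `missing_brokers` result, the cluster may have no active controller, which would be consistent with leader elections not completing.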

Repair

Check whether the ZooKeeper ensemble is the cause (for example, a lost quorum or an unreachable node) and fix it before touching the brokers.

Restart the affected brokers so that they reconnect to ZooKeeper and re-register with the cluster.
