This incident is raised when one or more Kafka topics have under-replicated partitions, meaning that fewer replicas are in sync than the topic's replication factor calls for. An in-sync replica (ISR) is a replica that is fully caught up with the partition leader and is therefore eligible to take over as leader without losing data. When the ISR count for a partition drops below the expected number (the replication factor, or whatever threshold the alert is configured with), the alert notifies administrators to investigate and take corrective action so that data consistency and availability are preserved.
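To make the terminology concrete, here is a minimal sketch of how a partition's health can be derived from its replica list, its ISR list, and its leader. The function name `partition_state` and the state labels are hypothetical and chosen for illustration. The conditions mirror what the broker metrics `OfflinePartitionsCount`, `UnderMinIsrPartitionCount`, and `UnderReplicatedPartitions` count, and the default of 1 for `min_insync_replicas` matches Kafka's default for `min.insync.replicas`.

```python
def partition_state(replicas, isr, leader, min_insync_replicas=1):
    """Classify one partition from its assigned replicas, ISR, and leader.

    replicas and isr are lists of broker ids; leader is a broker id or None.
    The returned labels are illustrative, not official Kafka terms.
    """
    if leader is None:
        # No leader at all: the partition cannot be read or written.
        return "offline"
    if len(isr) < min_insync_replicas:
        # Producers using acks=all are rejected in this state.
        return "under-min-isr"
    if len(isr) < len(replicas):
        # Still usable, but with less redundancy than configured.
        return "under-replicated"
    return "healthy"


if __name__ == "__main__":
    print(partition_state([1, 2, 3], [1, 2], leader=1))
```

The checks are ordered from most to least severe, so a partition that is both under-replicated and below `min.insync.replicas` is reported as the more urgent of the two.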
Parameters
Debug
1. Check the status of the Kafka brokers
2. Verify that all Kafka brokers are up and running
3. Check the in-sync replicas for each partition of the under-replicated topic
4. Check the replication factor for the under-replicated topic
5. Check the ISR (in-sync replica) count for each partition of the under-replicated topic
6. Check the logs for any errors or warnings related to the under-replicated topic
7. Check the network connectivity between the Kafka brokers and ZooKeeper
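Steps 3 to 5 can be partly automated by parsing the text printed by `kafka-topics.sh --describe`. The sketch below assumes the per-partition line layout `Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ...`, which can differ between Kafka versions, so treat the regular expression as an assumption to adjust. The helper names (`parse_describe`, `under_replicated`, `lagging_brokers`) and the sample topic `orders` are hypothetical.

```python
import re
from collections import Counter

# Matches one per-partition line of `kafka-topics.sh --describe` output.
# Assumed layout; adjust if your Kafka version prints the fields differently.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(field):
    """Turn a comma-separated broker list such as '1,2,3' into [1, 2, 3]."""
    return [int(x) for x in field.split(",") if x]


def parse_describe(text):
    """Return one dict per partition line found in the describe output."""
    partitions = []
    for line in text.splitlines():
        m = PARTITION_LINE.search(line)
        if m:
            partitions.append({
                "topic": m["topic"],
                "partition": int(m["partition"]),
                "leader": m["leader"],
                "replicas": _ids(m["replicas"]),
                "isr": _ids(m["isr"]),
            })
    return partitions


def under_replicated(partitions):
    """Partitions whose ISR is smaller than their assigned replica set."""
    return [p for p in partitions if len(p["isr"]) < len(p["replicas"])]


def lagging_brokers(partitions):
    """Count, per broker id, how many partitions it is missing from the ISR of."""
    missing = Counter()
    for p in partitions:
        for broker in p["replicas"]:
            if broker not in p["isr"]:
                missing[broker] += 1
    return missing


if __name__ == "__main__":
    sample = (
        "Topic: orders\tPartitionCount: 2\tReplicationFactor: 3\n"
        "\tTopic: orders\tPartition: 0\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,2,3\n"
        "\tTopic: orders\tPartition: 1\tLeader: 2\tReplicas: 2,3,1\tIsr: 2\n"
    )
    parts = parse_describe(sample)
    for p in under_replicated(parts):
        print(p["topic"], p["partition"], "replicas", p["replicas"], "isr", p["isr"])
    print(lagging_brokers(parts).most_common())
```

A broker that shows up in `lagging_brokers` for many partitions is a good place to start when reading logs (step 6) or testing connectivity (step 7).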
One possible cause is that the replication factor for the affected topics was set too low, which leaves little margin when a broker fails or falls behind.
Repair
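Increase the replication factor for the affected topic so that each partition has more replicas available. For an existing topic this is done through a partition reassignment, because kafka-topics.sh --alter cannot change the replication factor.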
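Raising the replication factor of an existing topic means giving `kafka-reassign-partitions.sh` a JSON file that lists the full desired replica set for each partition. The sketch below builds such a file in the `{"version": 1, "partitions": [...]}` format that the tool accepts through `--reassignment-json-file`. The function name `build_reassignment`, the broker ids, and the topic `orders` are hypothetical, and the placement logic is a simple rotation rather than Kafka's own assignment algorithm. Rack awareness and disk usage are not taken into account.

```python
import json


def build_reassignment(current, brokers, target_rf):
    """Build a reassignment plan that grows each partition to target_rf replicas.

    current: dict mapping (topic, partition) to the existing replica list.
    brokers: list of live broker ids that may receive new replicas.
    Existing replicas keep their order, so the preferred leader is unchanged.
    """
    if target_rf > len(brokers):
        raise ValueError("target replication factor exceeds number of brokers")
    plan = []
    for (topic, partition), replicas in sorted(current.items()):
        new_replicas = list(replicas)
        # Brokers that do not yet host this partition.
        candidates = [b for b in brokers if b not in new_replicas]
        if candidates:
            # Rotate by partition number to spread new replicas across brokers.
            offset = partition % len(candidates)
            candidates = candidates[offset:] + candidates[:offset]
        needed = max(0, target_rf - len(new_replicas))
        new_replicas += candidates[:needed]
        plan.append({"topic": topic, "partition": partition, "replicas": new_replicas})
    return {"version": 1, "partitions": plan}


if __name__ == "__main__":
    current = {("orders", 0): [1, 2], ("orders", 1): [2, 3]}
    plan = build_reassignment(current, brokers=[1, 2, 3, 4], target_rf=3)
    print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and passed to `kafka-reassign-partitions.sh` with `--reassignment-json-file` and `--execute`. Review the plan before applying it, since copying the new replicas adds network and disk load to the cluster.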