Runbook

Kafka liveness check failure.


Overview

This incident occurs when the Kafka liveness check cannot reach or communicate with the host where the broker is running. Possible causes include network issues, broker failure, and configuration errors. While the condition persists, the broker may be unavailable or unresponsive to requests, which can lead to service disruption or downtime.


Debug

Check if Zookeeper is running on the specified <zk_host> and <zk_port>
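One way to run this check is to send ZooKeeper's "ruok" four-letter command and look for the "imok" reply. The sketch below assumes the command is allowed by the server's `4lw.commands.whitelist` setting (recent ZooKeeper versions enable only `srvr` by default), so a False result can also mean the command is not whitelisted. The function name `zookeeper_is_ok` is illustrative and not part of any Kafka or ZooKeeper tooling.

```python
import socket


def zookeeper_is_ok(zk_host: str, zk_port: int, timeout: float = 3.0) -> bool:
    """Send "ruok" to ZooKeeper and return True only if it answers "imok"."""
    try:
        with socket.create_connection((zk_host, zk_port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
    return reply.strip() == b"imok"
```

Call it as `zookeeper_is_ok("<zk_host>", <zk_port>)` with the values from your environment.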

Check the status of all Kafka brokers

Check if the Kafka broker is running on the specified <broker_host> and <broker_port>
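A plain TCP connection attempt is a reasonable first probe, because the way it fails hints at the cause: a refused connection suggests the broker process is down, while a timeout points toward a network or firewall problem. This is a minimal sketch using only the standard library; `probe_broker` and its return labels are hypothetical names chosen for illustration, and a successful connect only shows that something is listening on the port, not that Kafka is healthy.

```python
import socket


def probe_broker(broker_host: str, broker_port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect to the broker and classify the outcome."""
    try:
        with socket.create_connection((broker_host, broker_port), timeout=timeout):
            return "reachable"
    except socket.gaierror:
        return "dns_failure"   # host name could not be resolved
    except ConnectionRefusedError:
        return "refused"       # host is up, nothing listening on the port
    except socket.timeout:
        return "timeout"       # packets dropped or host unresponsive
    except OSError:
        return "unreachable"   # any other network-level error
```

Call it as `probe_broker("<broker_host>", <broker_port>)`.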

Check if the Kafka broker is registered with Zookeeper
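In a ZooKeeper-based cluster, each live broker registers an ephemeral znode under `/brokers/ids`, so listing that path shows which broker IDs are currently registered. The sketch below shells out to the `zookeeper-shell.sh` script shipped with Kafka and parses the bracketed list from its output. It assumes the script is on `PATH` and that the list appears on its own line; the helper names are illustrative.

```python
import re
import subprocess
from typing import List


def parse_broker_ids(shell_output: str) -> List[int]:
    """Extract broker IDs from the last line that looks like "[0, 1, 2]"."""
    for line in reversed(shell_output.splitlines()):
        match = re.fullmatch(r"\[([\d,\s]*)\]", line.strip())
        if match:
            return [int(part) for part in match.group(1).split(",") if part.strip()]
    return []


def registered_broker_ids(zk_host: str, zk_port: int,
                          shell: str = "zookeeper-shell.sh") -> List[int]:
    """Run `ls /brokers/ids` through the ZooKeeper shell (assumed to be on PATH)."""
    result = subprocess.run(
        [shell, f"{zk_host}:{zk_port}", "ls", "/brokers/ids"],
        capture_output=True, text=True, timeout=30, check=False,
    )
    return parse_broker_ids(result.stdout)
```

If the ID of the affected broker is missing from the returned list, the broker has lost its ZooKeeper session or never registered.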

Check if the Kafka topics are being replicated correctly
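Replication health can be read from the output of `kafka-topics.sh --describe`: a partition is under-replicated when its in-sync replica set (Isr) is smaller than its assigned replica set (Replicas). The sketch below parses that text and lists the affected partitions. It assumes the usual `Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ...` line layout, which may vary slightly between Kafka versions; `under_replicated` is an illustrative name. The CLI also accepts `--under-replicated-partitions` to do this filtering itself.

```python
import re
from typing import List, Tuple

# Matches per-partition lines; the topic summary line ("PartitionCount: ...") does not match.
_PARTITION_RE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(\S+)"
    r"\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def under_replicated(describe_output: str) -> List[Tuple[str, int]]:
    """Return (topic, partition) pairs whose ISR is smaller than the replica set."""
    flagged = []
    for line in describe_output.splitlines():
        match = _PARTITION_RE.search(line)
        if not match:
            continue
        topic, partition, _leader, replicas, isr = match.groups()
        replica_ids = {r for r in replicas.split(",") if r}
        isr_ids = {r for r in isr.split(",") if r}
        if len(isr_ids) < len(replica_ids):
            flagged.append((topic, int(partition)))
    return flagged
```

An empty result means every listed partition has a full in-sync replica set.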

Check if the Kafka producer is able to send messages to the broker

Check if the Kafka consumer is able to receive messages from the broker

Check for network connectivity issues between the liveness check service and the broker host.

Check whether the broker host is experiencing high CPU or memory utilization that is making it unresponsive.
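To check CPU and memory pressure, the following sketch can be run on the broker host itself. It compares the 1-minute load average with the CPU count and reads the available-memory fraction from `/proc/meminfo`, so it assumes a Linux host. The function names and the default threshold are hypothetical and should be tuned to your environment.

```python
import os


def cpu_saturated(load_1min: float, cpu_count: int, ratio: float = 1.0) -> bool:
    """True when the 1-minute load per CPU exceeds `ratio` (an assumed threshold)."""
    return load_1min / max(cpu_count, 1) > ratio


def mem_available_fraction(meminfo_text: str) -> float:
    """Return MemAvailable / MemTotal from the contents of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])  # values are reported in kB
    return fields["MemAvailable"] / fields["MemTotal"]


# Example use on the broker host (Linux only):
#   load = os.getloadavg()[0]
#   busy = cpu_saturated(load, os.cpu_count() or 1)
#   with open("/proc/meminfo") as fh:
#       free_fraction = mem_available_fraction(fh.read())
```

A saturated CPU or a very low available-memory fraction suggests the broker is too starved to answer the liveness check in time.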

Repair

Check if the broker is running on the host specified in the configuration file. If not, start the broker on the specified host.
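The repair step can be sketched as "probe the port, start the broker only if nothing answers, then wait for the port to open". In the sketch below the start command is supplied by the caller, because how the broker is launched depends on the installation (for example a service manager, or `kafka-server-start.sh -daemon` with your `server.properties`). The command must be able to start the broker on the target host, and `ensure_broker_running` is an illustrative name.

```python
import socket
import subprocess
import time
from typing import Sequence


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ensure_broker_running(host: str, port: int, start_cmd: Sequence[str],
                          wait_seconds: float = 30.0,
                          poll_interval: float = 1.0) -> bool:
    """Start the broker with `start_cmd` if its port is closed, then wait for it."""
    if port_open(host, port):
        return True  # already running, nothing to do
    # Caller-supplied command; raises CalledProcessError if it exits non-zero.
    subprocess.run(list(start_cmd), check=True)
    deadline = time.monotonic() + wait_seconds
    while time.monotonic() < deadline:
        if port_open(host, port):
            return True
        time.sleep(poll_interval)
    return False
```

A False return means the start command ran but the port never opened within the wait window, in which case the broker logs are the next place to look.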
