Runbook

Cassandra connection timeouts incident

Overview

This incident covers connection timeouts between nodes in a Cassandra cluster. It is raised automatically by the alerting system and assigned to an engineer for resolution. Treat it as high urgency: inter-node timeouts can degrade the availability and performance of the Cassandra service.

Debug

Check Cassandra service status
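
A minimal sketch of this check, assuming the node runs Cassandra under systemd with a unit named `cassandra` (both are assumptions; adjust to your install). The `run` parameter is a hypothetical hook so the function can be exercised without systemd present.

```python
import subprocess


def cassandra_service_active(unit="cassandra", run=subprocess.run):
    """Return True if systemd reports the given unit as active.

    Assumption: Cassandra is managed by systemd under the unit name `unit`.
    """
    # `systemctl is-active` prints "active" and exits 0 only when the unit is running.
    result = run(["systemctl", "is-active", unit], capture_output=True, text=True)
    return result.returncode == 0 and result.stdout.strip() == "active"
```

Calling `cassandra_service_active()` on each node gives a quick yes/no before looking at cluster-level state.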

Check Cassandra cluster status
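
One way to automate this step is to run `nodetool status` and list every node whose state is not `UN` (Up/Normal). The sketch below assumes `nodetool` is on the PATH and that each node line starts with a two-letter state code followed by the address; the function names are illustrative, not part of any Cassandra API.

```python
import subprocess

# Two-letter codes printed by `nodetool status`:
# first letter Up/Down, second letter Normal/Leaving/Joining/Moving.
NODE_STATES = {"UN", "UL", "UJ", "UM", "DN", "DL", "DJ", "DM"}


def parse_nodetool_status(output):
    """Return (state, address) pairs found in `nodetool status` text."""
    nodes = []
    for line in output.splitlines():
        fields = line.split()
        # Header and separator lines do not start with a state code, so they are skipped.
        if len(fields) >= 2 and fields[0] in NODE_STATES:
            nodes.append((fields[0], fields[1]))
    return nodes


def unhealthy_nodes(output):
    """Return addresses of nodes that are not Up/Normal."""
    return [addr for state, addr in parse_nodetool_status(output) if state != "UN"]


def cluster_unhealthy_nodes(run=subprocess.run):
    """Run `nodetool status` locally and return the addresses that are not UN."""
    result = run(["nodetool", "status"], capture_output=True, text=True, check=True)
    return unhealthy_nodes(result.stdout)
```

An empty list from `cluster_unhealthy_nodes()` suggests every node sees its peers as up; any address returned is a candidate for the checks that follow.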

Check if there are any network issues
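
A basic reachability test can be sketched as a TCP connect to each peer. The example assumes the default ports from `cassandra.yaml` (7000 for inter-node traffic, 9042 for the native client protocol); if your cluster overrides them or uses the encrypted inter-node port, pass different values. The peer addresses are supplied by the caller.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False


def check_peers(peers, ports=(7000, 9042), timeout=3.0):
    """Map each (host, port) pair to whether it accepted a TCP connection.

    Assumption: 7000 and 9042 are the Cassandra defaults; adjust if overridden.
    """
    return {(h, p): port_reachable(h, p, timeout) for h in peers for p in ports}
```

Run it from the node that reports timeouts, for example `check_peers(["10.0.0.2", "10.0.0.3"])` with your own peer addresses. A `False` for port 7000 points at a firewall, routing, or listener problem rather than at Cassandra's internals.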

Check if there are any Cassandra connection issues
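
As a rough heuristic, the Cassandra system log can be scanned for lines that mention a timeout, counting which peer addresses appear on them. This sketch assumes the log lives at `/var/log/cassandra/system.log` (a common package default, not guaranteed) and makes no assumption about the exact message format: it only matches the words "timeout" or "timed out" and any IPv4 address on the same line.

```python
import re
from collections import Counter

TIMEOUT_RE = re.compile(r"timeout|timed out", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def timeout_mentions(lines):
    """Count IPv4 addresses that appear on lines mentioning a timeout."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_RE.search(line):
            for ip in IPV4_RE.findall(line):
                counts[ip] += 1
    return counts


def scan_log(path="/var/log/cassandra/system.log"):
    """Scan a log file; the default path is an assumption about the install layout."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return timeout_mentions(fh)
```

`scan_log().most_common(5)` then gives the peers most often named alongside a timeout, which is a starting point for investigation rather than proof of fault.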

Possible cause: network connectivity issues between nodes in the Cassandra cluster.

Repair

Identify the root cause of the connection timeouts

Restart the Cassandra nodes that are causing the timeouts, then monitor the cluster to confirm that the timeouts stop.
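
The restart and the follow-up monitoring could be scripted along these lines. This is a sketch under several assumptions: it runs on the affected node itself, Cassandra is a systemd unit named `cassandra`, `sudo` is available, and draining the node with `nodetool drain` before the restart is acceptable in your environment. `fetch_status` is a hypothetical callable that returns the text of `nodetool status`.

```python
import subprocess
import time


def restart_local_node(unit="cassandra", run=subprocess.run):
    """Drain the local node, then restart its service (assumed systemd unit)."""
    commands = [
        ["nodetool", "drain"],                   # flush memtables, stop accepting traffic
        ["sudo", "systemctl", "restart", unit],  # restart the Cassandra service
    ]
    for cmd in commands:
        run(cmd, check=True)  # with subprocess.run, a failing command raises and stops the sequence


def wait_until_up(address, fetch_status, attempts=30, delay=10.0, sleep=time.sleep):
    """Poll status text until `address` is reported as UN; return False if it never is."""
    for _ in range(attempts):
        for line in fetch_status().splitlines():
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "UN" and fields[1] == address:
                return True
        sleep(delay)
    return False
```

Restarting one node and waiting for `wait_until_up` to return `True` before touching the next keeps the rest of the cluster serving traffic while each node recovers.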
