Runbook

RabbitMQ Node Performance Degradation Incident

Overview

RabbitMQ is an open-source message broker that relays messages between distributed systems. This incident type covers cases where a RabbitMQ node's performance degrades: the node is not performing as expected and may be causing delays or other issues for the applications that rely on it. The impact on system availability and reliability is greatest when the degraded node is a critical component of the overall architecture. Resolving the incident requires identifying the root cause of the degradation and applying the appropriate remediation.

Debug

1. Check RabbitMQ status

2. Check if RabbitMQ is running on all nodes

3. Check RabbitMQ log files for any errors

4. Check system resource usage

5. Check network connectivity

6. Check for any blocked connections

7. Check for any blocked channels

8. Check for any queues with high resource usage
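
The eight checks above can be sketched as a small Python helper. This is a minimal sketch, not the runbook's original commands: it assumes `rabbitmqctl` and `rabbitmq-diagnostics` are on the PATH of the broker host and that the CLI prints tab-separated rows. The function names (`parse_rows`, `blocked_connections`, `busiest`, `collect`) are hypothetical. The mapping of steps 3 to 5 onto `rabbitmq-diagnostics` subcommands is also an assumption: they cover only the broker's own logs, memory, and listeners, so OS-level CPU, disk, and network checks still need the host's usual tools. Because the channel listing has no blocked state column here, step 7 looks at unacknowledged message counts instead.

```python
import subprocess

# Read-only CLI commands, one per debug step (assumed mapping, see lead-in).
DEBUG_COMMANDS = {
    "1": ["rabbitmqctl", "status"],
    "2": ["rabbitmqctl", "cluster_status"],
    "3": ["rabbitmq-diagnostics", "log_tail"],
    "4": ["rabbitmq-diagnostics", "memory_breakdown"],
    "5": ["rabbitmq-diagnostics", "check_port_connectivity"],
    "6": ["rabbitmqctl", "-q", "list_connections", "name", "state"],
    "7": ["rabbitmqctl", "-q", "list_channels", "name", "messages_unacknowledged"],
    "8": ["rabbitmqctl", "-q", "list_queues", "name", "messages", "memory"],
}


def parse_rows(output, n_cols):
    """Split tab-separated CLI output into rows, dropping malformed lines."""
    rows = []
    for line in output.splitlines():
        cols = line.split("\t")
        if len(cols) == n_cols:
            rows.append(cols)
    return rows


def blocked_connections(output):
    """Names of connections whose state is 'blocking' or 'blocked' (step 6)."""
    return [name for name, state in parse_rows(output, 2)
            if state in ("blocking", "blocked")]


def busiest(output, n_cols, top=5):
    """Rows sorted by their last (numeric) column, largest first (steps 7, 8).

    Rows whose last column is not a number, such as a header row, are skipped.
    """
    numeric = [row for row in parse_rows(output, n_cols) if row[-1].isdigit()]
    return sorted(numeric, key=lambda row: int(row[-1]), reverse=True)[:top]


def collect():
    """Run every command and return {step: stdout}. Call this on a broker host."""
    results = {}
    for step, cmd in DEBUG_COMMANDS.items():
        done = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[step] = done.stdout
    return results
```

On a broker host, `out = collect()` followed by `blocked_connections(out["6"])` and `busiest(out["8"], 3)` would list blocked connections and the queues using the most memory. The raw output of steps 1 to 5 is meant to be read directly rather than parsed.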

Repair

Increase the resources (CPU, RAM, or disk space) allocated to each RabbitMQ node affected by the incident to improve its performance.
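
Before adding RAM, it can help to confirm that memory is actually the constraint. The sketch below is illustrative arithmetic only, under stated assumptions: RabbitMQ blocks publishers once its memory use reaches a configurable fraction of system RAM (the memory high watermark), and the function name `needs_more_memory`, the `margin` parameter, and the sample numbers are hypothetical. The watermark value should be taken from the node's own configuration rather than assumed.

```python
def needs_more_memory(used_bytes, total_ram_bytes, watermark_ratio, margin=0.1):
    """Return True when broker memory is within `margin` of the high watermark.

    used_bytes       memory the broker reports using
    total_ram_bytes  RAM available to the node
    watermark_ratio  configured high watermark as a fraction of RAM
    margin           hypothetical safety margin below the watermark (10% here)
    """
    limit = watermark_ratio * total_ram_bytes   # point where publishers get blocked
    threshold = limit * (1 - margin)            # start worrying slightly earlier
    return used_bytes >= threshold


GIB = 2 ** 30
# Hypothetical node: 10 GiB RAM, watermark at 40% (a 4 GiB limit).
print(needs_more_memory(3.8 * GIB, 10 * GIB, 0.4))  # near the limit
print(needs_more_memory(2.0 * GIB, 10 * GIB, 0.4))  # comfortable headroom
```

If the check returns True repeatedly under normal load, adding RAM to the node is a reasonable remediation. If it returns False, the degradation is more likely to come from CPU, disk, or network, and the other debug steps are the better guide.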
