Runbook

Primary PostgreSQL Node Failure on Linux

Overview

This incident type covers a failure of the primary PostgreSQL node on a Linux system. PostgreSQL is an open source relational database management system. In a replicated cluster, the primary node accepts all write traffic and streams its changes to the standby nodes. When the primary fails, writes stop until it is recovered or a standby is promoted, which can cause downtime, degraded performance, and, if replication is asynchronous, loss of transactions that had not yet reached a standby. This type of incident requires immediate attention to minimize the impact on users and protect data integrity.

Debug

Check PostgreSQL service status
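
One way to script this check is sketched below. It asks systemd whether the unit is active and then probes the server with `pg_isready`. The unit name `postgresql`, the host, and the port are assumptions; unit names differ between distributions and PostgreSQL versions (on some systems the real service is a versioned or per-cluster unit), so adjust them to your installation.

```python
import shutil
import subprocess

# Exit codes documented for the pg_isready utility.
PG_ISREADY_STATUS = {
    0: "accepting connections",
    1: "rejecting connections (for example, during startup)",
    2: "no response",
    3: "no attempt made (for example, invalid parameters)",
}


def describe_pg_isready(returncode):
    """Translate a pg_isready exit code into a readable status."""
    return PG_ISREADY_STATUS.get(returncode, f"unexpected exit code {returncode}")


def check_service(unit="postgresql", host="localhost", port=5432):
    """Collect the systemd state and the pg_isready result, when the tools exist.

    The default unit, host, and port are assumptions, not values from this runbook.
    """
    report = {}
    if shutil.which("systemctl"):
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True, text=True, check=False,
        )
        # Prints a state such as "active", "inactive", or "failed".
        report["systemd"] = result.stdout.strip() or result.stderr.strip()
    if shutil.which("pg_isready"):
        result = subprocess.run(
            ["pg_isready", "-h", host, "-p", str(port)],
            capture_output=True, text=True, check=False,
        )
        report["pg_isready"] = describe_pg_isready(result.returncode)
    return report
```

Calling `check_service()` on the affected node returns a small dictionary that can be pasted into the incident notes. A systemd state of `failed` together with `no response` from `pg_isready` points at a stopped or crashed server rather than a network problem.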

Check PostgreSQL logs for any errors
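
The log check can be narrowed to the severities that usually matter during an outage. The sketch below filters log lines for `ERROR`, `FATAL`, and `PANIC` entries and counts them. The log file location is not given in this runbook and depends on the distribution and on the `log_directory` setting, so the path passed to `scan_log_file` is something you need to supply.

```python
import re
from collections import Counter

# PostgreSQL writes the severity followed by a colon, e.g. "FATAL:  ...".
SEVERE = re.compile(r"\b(ERROR|FATAL|PANIC):")


def find_severe_lines(lines):
    """Return (severity, line) pairs for ERROR, FATAL, and PANIC entries."""
    hits = []
    for line in lines:
        match = SEVERE.search(line)
        if match:
            hits.append((match.group(1), line.rstrip("\n")))
    return hits


def summarize(hits):
    """Count the hits per severity."""
    return Counter(severity for severity, _ in hits)


def scan_log_file(path, tail=2000):
    """Scan the last `tail` lines of a log file (the path is site-specific)."""
    with open(path, errors="replace") as handle:
        lines = handle.readlines()[-tail:]
    return find_severe_lines(lines)
```

`PANIC` entries mean the server aborted all sessions, and `FATAL` entries around the time of the failure often name the direct cause, so read those first.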

Check the PostgreSQL configuration file for misconfigurations
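
When the server will not start, the configuration file has to be inspected offline. The following is a minimal sketch of a `postgresql.conf` reader that handles the common cases: `#` comments, an optional equals sign, and single-quoted values. It also reports parameters that are set more than once, since the last occurrence is the one that takes effect. It does not follow `include` directives, and it is not a substitute for the server's own parser.

```python
import re

# name, then "=" or whitespace, then a quoted or bare value.
SETTING = re.compile(
    r"^\s*([A-Za-z_][A-Za-z0-9_.]*)(?:\s*=\s*|\s+)('(?:[^']|'')*'|[^\s#]+)"
)
INCLUDE_DIRECTIVES = {"include", "include_if_exists", "include_dir"}


def parse_conf(text):
    """Return (settings, duplicates) parsed from postgresql.conf text."""
    settings = {}
    duplicates = []
    for line in text.splitlines():
        match = SETTING.match(line)
        if not match:
            continue  # blank line, comment, or something this sketch ignores
        name, raw = match.group(1).lower(), match.group(2)
        if name in INCLUDE_DIRECTIVES:
            continue  # included files are not followed here
        if len(raw) >= 2 and raw.startswith("'") and raw.endswith("'"):
            raw = raw[1:-1].replace("''", "'")  # unquote the value
        if name in settings:
            duplicates.append(name)
        settings[name] = raw  # the last occurrence wins
    return settings, duplicates
```

Comparing the parsed settings with a known-good copy of the file, or with the values on a healthy standby, is usually the quickest way to spot a recent change that broke startup.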

Check disk space and resource utilization
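
A full data volume is a common reason for a primary to stop, so it is worth measuring rather than eyeballing. The sketch below reports disk usage for the data directory, available memory from `/proc/meminfo`, and the load average. The default data directory path is an assumption; pass the path your installation uses.

```python
import os
import shutil


def disk_used_percent(path):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def resource_report(data_dir="/var/lib/postgresql"):
    """Gather disk, memory, and load figures (data_dir is an assumed path)."""
    with open("/proc/meminfo") as handle:
        mem = parse_meminfo(handle.read())
    return {
        "disk_used_percent": round(disk_used_percent(data_dir), 1),
        "mem_available_kb": mem.get("MemAvailable"),
        "load_average": os.getloadavg(),
    }
```

If the write-ahead log lives on a separate volume, run `disk_used_percent` against that path as well, because the server stops when it can no longer write WAL.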

Repair

Restart the PostgreSQL service on the failed node to bring it back online.
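
A restart is only done once the server accepts connections again, so it helps to pair the restart with a readiness poll. The sketch below assumes a systemd unit named `postgresql` and that the script runs with enough privileges to restart it; both are assumptions to adapt. The number of attempts and the delay are illustrative defaults.

```python
import subprocess
import time


def wait_until_ready(probe, attempts=10, delay=3.0, sleep=time.sleep):
    """Call `probe` until it returns True; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


def pg_is_accepting(host="localhost", port=5432):
    """True when pg_isready reports that the server accepts connections."""
    result = subprocess.run(
        ["pg_isready", "-q", "-h", host, "-p", str(port)], check=False
    )
    return result.returncode == 0


def restart_and_wait(unit="postgresql"):
    """Restart the (assumed) systemd unit, then poll until the server is ready."""
    subprocess.run(["systemctl", "restart", unit], check=True)
    return wait_until_ready(pg_is_accepting)
```

`restart_and_wait()` returns the attempt on which the server became ready, or `None` if it never did. A `None` result, or an exception from the restart itself, is the signal to move on to the next repair step.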

If restarting the service doesn't work, fail over to a standby node in the PostgreSQL cluster.
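
For a manually managed cluster, failover usually means promoting the standby that has replayed the most write-ahead log. The sketch below assumes you have already collected each standby's position with `SELECT pg_last_wal_replay_lsn()`; it compares those positions and builds the `pg_ctl promote` command for the chosen node. The standby names and data directory in the usage note are hypothetical. If the cluster is run by a failover manager, use that tool's own procedure instead, and make sure the old primary is stopped before promoting so that two nodes never accept writes at once.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '0/3000060' into a comparable integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def most_advanced(standbys):
    """Pick the standby name whose replayed LSN is highest.

    `standbys` maps a node name to the text returned by
    pg_last_wal_replay_lsn() on that node.
    """
    return max(standbys, key=lambda name: lsn_to_int(standbys[name]))


def promote_command(data_dir):
    """Build the pg_ctl command that promotes the standby using `data_dir`."""
    return ["pg_ctl", "promote", "-D", data_dir]
```

For example, `most_advanced({"standby-a": "0/3000060", "standby-b": "0/5000000"})` selects `standby-b`. Run the command from `promote_command` on that node as the PostgreSQL operating system user, then repoint clients at the new primary.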
