Runbook

Kubernetes Deployments Replica Pods Monitoring Incident

Overview

This incident fires when a Kubernetes deployment has fewer available replica pods than its desired replica count. It is typically triggered by a query alert monitor and may need immediate action, because applications served by the affected deployment can be degraded or unavailable until the underlying cause is found and fixed.

Parameters

Debug

List all deployments in the affected namespace
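
A minimal sketch of this step, assuming kubectl is already configured for the affected cluster. The function name list_deployments and its namespace argument are illustrative placeholders, not part of the original runbook.

```shell
# list_deployments NAMESPACE
# Lists the deployments in a namespace with their READY, UP-TO-DATE and
# AVAILABLE counts. NAMESPACE is a placeholder for the affected namespace.
list_deployments() {
  kubectl get deployments --namespace "$1" -o wide
}
```

For example, list_deployments my-namespace. A READY value below the desired count (such as 1/3) points at the deployment to investigate.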

Check if there are any pods that are not ready
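
One possible way to do this is to filter the READY column of the pod listing. The sketch below assumes the default kubectl table output; the function name not_ready_pods is hypothetical.

```shell
# not_ready_pods NAMESPACE
# Prints pods whose READY column (ready/total containers) is not full.
# Note: pods of finished Jobs also match (for example 0/1 Completed).
not_ready_pods() {
  kubectl get pods --namespace "$1" --no-headers |
    awk '{ split($2, c, "/"); if (c[1] != c[2]) print }'
}
```

The STATUS column of the matching pods (for example Pending, CrashLoopBackOff or ImagePullBackOff) usually indicates which of the following checks is most relevant.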

Check the logs of the affected pods for any errors
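
A hedged sketch of the log check, assuming a single-container pod. The function name pod_logs, the 100-line tail and both arguments are illustrative choices.

```shell
# pod_logs NAMESPACE POD
# Shows the last 100 log lines of the pod, then the logs of the previous
# container instance, which helps when the container is crash-looping.
pod_logs() {
  kubectl logs "$2" --namespace "$1" --tail=100
  # --previous fails if the container has never restarted; ignore that case.
  kubectl logs "$2" --namespace "$1" --tail=100 --previous 2>/dev/null || true
}
```

For pods with more than one container, kubectl logs also accepts -c <container> to select one of them.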

Check the events in the affected namespace
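
This step could look roughly like the following, with events sorted so the most recent ones appear last. The function name namespace_events is a placeholder.

```shell
# namespace_events NAMESPACE
# Lists events in the namespace, oldest first, newest at the bottom.
namespace_events() {
  kubectl get events --namespace "$1" --sort-by=.lastTimestamp
}
```

Event reasons such as FailedScheduling, BackOff or Unhealthy are often the quickest hint at why replicas are missing.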

Check the status of the Kubernetes nodes
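
A small sketch for the node check. The function name node_status and the optional node argument are assumptions made for illustration.

```shell
# node_status [NODE]
# Lists all nodes; if a node name is given, also describes that node
# (conditions, allocatable resources, and the pods scheduled on it).
node_status() {
  kubectl get nodes -o wide
  if [ -n "${1:-}" ]; then
    kubectl describe node "$1"
  fi
}
```

Nodes reported as NotReady, or with conditions such as MemoryPressure or DiskPressure, can prevent new replica pods from being scheduled.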

Check the status of the Kubernetes services
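
One way to sketch the service check is to list the services together with their endpoints. The function name service_status is hypothetical.

```shell
# service_status NAMESPACE
# Lists services and their endpoints in the namespace.
service_status() {
  kubectl get services --namespace "$1" -o wide
  kubectl get endpoints --namespace "$1"
}
```

A service whose endpoints list is empty has no ready pods matching its selector, which is consistent with missing replicas.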

A recent deployment or upgrade of applications on Kubernetes might have caused the pods to go down.

There might be a scaling issue where the desired number of replica pods is not being met due to resource constraints or misconfiguration.
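
To quantify the gap between desired and available replicas, something like the sketch below may help. The function name replica_shortfall and its arguments are placeholders; it assumes the deployment sets .spec.replicas and treats a missing .status.availableReplicas as zero.

```shell
# replica_shortfall NAMESPACE DEPLOYMENT
# Prints desired and available replica counts and the difference.
replica_shortfall() {
  counts="$(kubectl get deployment "$2" --namespace "$1" \
    -o jsonpath='{.spec.replicas} {.status.availableReplicas}')"
  desired="${counts%% *}"      # text before the first space
  available="${counts#* }"     # text after the first space
  available="${available:-0}"  # the field is absent when nothing is available
  echo "desired=$desired available=$available missing=$((desired - available))"
}
```

If the shortfall persists while pods sit in Pending, the output of kubectl describe pod for one of them typically states whether CPU, memory or another scheduling constraint is the blocker.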

Repair

Check whether any recent change to the deployment could have caused the issue. Verify whether the replica count was scaled down or whether the deployment configuration itself is at fault.
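
A hedged sketch of how recent changes might be reviewed. The function name recent_changes and its arguments are illustrative.

```shell
# recent_changes NAMESPACE DEPLOYMENT
# Shows the rollout revision history, then the current deployment spec,
# replica counts, conditions and recent events.
recent_changes() {
  kubectl rollout history "deployment/$2" --namespace "$1"
  kubectl describe deployment "$2" --namespace "$1"
}
```

If a recent revision turns out to be the cause, kubectl rollout undo can return the deployment to its previous revision; whether that is appropriate depends on the change in question.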

Perform a rolling restart of the affected Kubernetes deployment.
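
A rolling restart could be sketched as follows. The function name rolling_restart, its arguments and the 300-second timeout are assumptions, not values taken from the original runbook.

```shell
# rolling_restart NAMESPACE DEPLOYMENT
# Triggers a rolling restart and waits for the rollout to finish.
# The status command exits non-zero if the rollout does not complete
# within the timeout.
rolling_restart() {
  kubectl rollout restart "deployment/$2" --namespace "$1" &&
    kubectl rollout status "deployment/$2" --namespace "$1" --timeout=300s
}
```

Pods are replaced gradually according to the deployment's update strategy, so the restart should not take every replica down at once.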
