Runbook

Kubernetes Pod Restarting Monitoring

Overview

A Kubernetes Pod Restarting Monitoring incident is triggered when a pod in a Kubernetes cluster restarts multiple times within a defined time window. It signals a problem with the application or the infrastructure it runs on; common causes include resource constraints, misconfigurations, and bugs in the application code. To resolve the incident, identify and fix the underlying cause of the restarts.

Debug

List all pods in <namespace>
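A minimal sketch of this step, assuming kubectl is installed and pointed at the affected cluster. The function name list_pods is hypothetical; the namespace is passed as its only argument.

```shell
# List every pod in a namespace. The RESTARTS column shows how often
# each pod's containers have restarted; -o wide adds the node name.
# list_pods is a hypothetical helper name, not part of kubectl.
list_pods() {
  namespace="${1:?usage: list_pods NAMESPACE}"
  kubectl get pods -n "$namespace" -o wide
}
```

For example, list_pods demo lists the pods in a namespace called demo (a placeholder name).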

Get detailed information about a specific pod
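One way to do this is with kubectl describe, sketched below under the assumption that kubectl is already configured for the cluster. The function name describe_pod is hypothetical.

```shell
# Show a pod's spec, container states, last termination details,
# and recent events. describe_pod is a hypothetical helper name.
describe_pod() {
  namespace="${1:?usage: describe_pod NAMESPACE POD}"
  pod="${2:?usage: describe_pod NAMESPACE POD}"
  kubectl describe pod "$pod" -n "$namespace"
}
```

In the output, the Last State, Restart Count, and Events sections are usually the most relevant for a restarting pod.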

View the logs for a specific container in a pod
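A possible sketch of this step, assuming kubectl access to the cluster. The function name container_logs and the 100-line tail are illustrative choices, not values from this runbook. The second command uses --previous to read the logs of the container instance that ran before the last restart, which is often where the crash is recorded.

```shell
# Print recent logs for one container in a pod, first from the
# running instance and then from the previously terminated one.
# container_logs is a hypothetical helper name.
container_logs() {
  namespace="${1:?usage: container_logs NAMESPACE POD CONTAINER}"
  pod="${2:?usage: container_logs NAMESPACE POD CONTAINER}"
  container="${3:?usage: container_logs NAMESPACE POD CONTAINER}"
  # Logs from the current container instance.
  kubectl logs "$pod" -c "$container" -n "$namespace" --tail=100
  # Logs from the previous instance; this fails if the container
  # has not restarted yet.
  kubectl logs "$pod" -c "$container" -n "$namespace" --previous --tail=100
}
```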

View the metrics for a specific pod
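A minimal sketch of this step, assuming the Kubernetes Metrics Server (or another Metrics API provider) is installed in the cluster, since kubectl top depends on it. The function name pod_metrics is hypothetical.

```shell
# Show current CPU and memory usage for a pod, broken down per
# container. pod_metrics is a hypothetical helper name.
pod_metrics() {
  namespace="${1:?usage: pod_metrics NAMESPACE POD}"
  pod="${2:?usage: pod_metrics NAMESPACE POD}"
  kubectl top pod "$pod" -n "$namespace" --containers
}
```

Comparing the reported memory usage with the container's configured limit can indicate whether the pod is running close to being killed for exceeding it.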

Misconfigurations: The pod may be restarting because of errors in its Kubernetes manifests, such as incorrect environment variables or volume mounts, or misconfigured resource requests and limits.
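Misconfigurations often surface as warning events on the pod, for example failed volume mounts. The sketch below lists the events recorded for a single pod, oldest first. It assumes kubectl access to the cluster, and the function name pod_events is hypothetical.

```shell
# List events whose involved object is the given pod, sorted by
# time. pod_events is a hypothetical helper name.
pod_events() {
  namespace="${1:?usage: pod_events NAMESPACE POD}"
  pod="${2:?usage: pod_events NAMESPACE POD}"
  kubectl get events -n "$namespace" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}
```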

Resource constraints: The pod may be restarting because it lacks sufficient CPU or memory. Other pods on the same node or cluster may be consuming those resources, or the pod's own resource requests and limits may be set incorrectly.
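A container that exceeds its memory limit is terminated with the reason OOMKilled, so the last termination reason is a quick way to tell a memory problem from an application crash. The sketch below prints each container's name, restart count, and last termination reason. It assumes kubectl access to the cluster, and the function name last_termination is hypothetical.

```shell
# Print one line per container: name, restart count, and the reason
# the previous instance terminated (for example OOMKilled or Error).
# last_termination is a hypothetical helper name.
last_termination() {
  namespace="${1:?usage: last_termination NAMESPACE POD}"
  pod="${2:?usage: last_termination NAMESPACE POD}"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}
```

The reason field is empty for containers that have not terminated before.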

Repair

Adjust the memory requests and limits.
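A minimal sketch of this repair, assuming the pod is managed by a Deployment; for a StatefulSet or DaemonSet the resource type would differ. The function name set_memory and the example values are hypothetical and should be sized from observed usage rather than copied.

```shell
# Set the memory request and limit for one container of a
# Deployment. Changing the pod template triggers a rolling update.
# set_memory is a hypothetical helper name.
set_memory() {
  namespace="${1:?usage: set_memory NAMESPACE DEPLOYMENT CONTAINER REQUEST LIMIT}"
  deployment="${2:?usage: set_memory NAMESPACE DEPLOYMENT CONTAINER REQUEST LIMIT}"
  container="${3:?usage: set_memory NAMESPACE DEPLOYMENT CONTAINER REQUEST LIMIT}"
  request="${4:?usage: set_memory NAMESPACE DEPLOYMENT CONTAINER REQUEST LIMIT}"
  limit="${5:?usage: set_memory NAMESPACE DEPLOYMENT CONTAINER REQUEST LIMIT}"
  kubectl set resources deployment "$deployment" -n "$namespace" \
    -c "$container" --requests="memory=$request" --limits="memory=$limit"
}
```

For example, set_memory demo web app 256Mi 512Mi requests 256Mi and caps the container at 512Mi. If the manifests live in version control, the same change should also be made there so the next deploy does not revert it.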
