A Kubernetes Pod Restarting Monitoring incident is triggered when a pod in a Kubernetes cluster restarts repeatedly within a defined time window. It signals a problem with the application or the infrastructure it runs on; common causes include resource constraints, misconfigurations, and bugs in the application code. The incident is resolved by identifying and fixing the underlying cause of the restarts.
Parameters
Debug
List all pods in <namespace>
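A minimal sketch of this step, assuming `kubectl` is already configured for the affected cluster. The function name `list_pods` is hypothetical and only wraps the command so the namespace can be passed in; sorting by the first container's restart count puts the pods that restart most at the bottom of the output.

```shell
# List the pods in a namespace, ordered by the restart count of each pod's
# first container. `list_pods` is an illustrative helper, not part of kubectl.
list_pods() {
  namespace="${1:?namespace is required}"
  kubectl get pods --namespace "$namespace" \
    --sort-by='.status.containerStatuses[0].restartCount'
}

# Example (hypothetical namespace):
#   list_pods my-namespace
```

The RESTARTS column in the output shows which pods are affected.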
Get detailed information about a specific pod
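One way to do this is `kubectl describe pod`, sketched below. The wrapper name `describe_pod` and the example arguments are assumptions for illustration.

```shell
# Print the full description of one pod: container states, last termination
# details, resource requests and limits, probes, mounts, and recent events.
# `describe_pod` is an illustrative helper, not part of kubectl.
describe_pod() {
  namespace="${1:?namespace is required}"
  pod="${2:?pod name is required}"
  kubectl describe pod "$pod" --namespace "$namespace"
}

# Example (hypothetical names):
#   describe_pod my-namespace my-pod
```

In the output, the "Last State" and "Restart Count" fields of each container are usually the quickest pointers to why it restarted.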
View the logs for a specific container in a pod
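A sketch of the log commands for this step. For a restarting pod, the logs of the previous container instance are often more useful than the current ones, because they cover the moments before the crash; `--previous` retrieves them. The helper names and the 100-line tail are assumptions chosen for illustration.

```shell
# Show the last 100 log lines of the currently running container instance.
# Both helpers are illustrative wrappers, not part of kubectl.
container_logs() {
  namespace="${1:?namespace is required}"
  pod="${2:?pod name is required}"
  container="${3:?container name is required}"
  kubectl logs "$pod" --namespace "$namespace" --container "$container" --tail=100
}

# Show the last 100 log lines of the instance that ran before the latest restart.
previous_container_logs() {
  namespace="${1:?namespace is required}"
  pod="${2:?pod name is required}"
  container="${3:?container name is required}"
  kubectl logs "$pod" --namespace "$namespace" --container "$container" --tail=100 --previous
}

# Example (hypothetical names):
#   previous_container_logs my-namespace my-pod my-container
```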
View the events related to a specific pod
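Events can be filtered to a single pod with a field selector, as in the sketch below. The helper name `pod_events` is hypothetical. Note that the cluster retains events only for a limited time, so older restarts may no longer appear.

```shell
# List the events whose involved object is the given pod, oldest first.
# `pod_events` is an illustrative helper, not part of kubectl.
pod_events() {
  namespace="${1:?namespace is required}"
  pod="${2:?pod name is required}"
  kubectl get events --namespace "$namespace" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$pod" \
    --sort-by='.lastTimestamp'
}

# Example (hypothetical names):
#   pod_events my-namespace my-pod
```

Events with reasons such as BackOff, Unhealthy, or FailedMount narrow down whether the restarts come from crashes, failing probes, or volume problems.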
View the metrics for a specific pod
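A sketch using `kubectl top`, which only works if the Metrics Server (or another Metrics API provider) is installed in the cluster; that is an assumption here. The helper name `pod_metrics` is hypothetical.

```shell
# Show current CPU and memory usage for each container in the pod.
# Requires the Metrics API to be available in the cluster.
# `pod_metrics` is an illustrative helper, not part of kubectl.
pod_metrics() {
  namespace="${1:?namespace is required}"
  pod="${2:?pod name is required}"
  kubectl top pod "$pod" --namespace "$namespace" --containers
}

# Example (hypothetical names):
#   pod_metrics my-namespace my-pod
```

Comparing these figures with the requests and limits shown by `kubectl describe pod` indicates how close each container runs to its limits.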
Misconfigurations: The pod may be restarting because of errors in its Kubernetes manifest, such as incorrect environment variables or volume mounts, or resource requests and limits that do not match what the application needs.
Resource constraints: The pod may be restarting because it lacks CPU or memory. A container that exceeds its memory limit is killed and restarted, with the termination reason reported as OOMKilled. High resource usage by other pods on the same node or cluster can leave too little for this pod, and requests or limits set too low for the workload produce the same symptoms.
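To tell these two causes apart, it can help to read the last termination state of each container. The sketch below assumes a hypothetical helper name, `last_termination`, and prints one tab-separated line per container: name, restart count, last termination reason, and exit code.

```shell
# Print name, restart count, last termination reason, and exit code for each
# container in the pod. The reason and exit code are empty for containers
# that have never been terminated.
# `last_termination` is an illustrative helper, not part of kubectl.
last_termination() {
  namespace="${1:?namespace is required}"
  pod="${2:?pod name is required}"
  kubectl get pod "$pod" --namespace "$namespace" \
    --output jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
}

# Example (hypothetical names):
#   last_termination my-namespace my-pod
```

A reason of OOMKilled points to a memory constraint. A reason of Error with a non-zero exit code more often points to a misconfiguration or an application bug, which the container logs should confirm.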
Repair
Adjust the memory requests and limits.
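A minimal sketch of this repair, assuming the pod is managed by a Deployment; for other controllers the resource type would differ. The helper names and the example values are hypothetical, and suitable sizes should come from the usage observed during debugging.

```shell
# Set the memory request and limit for one container of a Deployment.
# Changing the pod template triggers a rolling update of the pods.
# Both helpers are illustrative wrappers, not part of kubectl.
set_memory() {
  namespace="${1:?namespace is required}"
  deployment="${2:?deployment name is required}"
  container="${3:?container name is required}"
  request="${4:?memory request is required, e.g. 256Mi}"
  limit="${5:?memory limit is required, e.g. 512Mi}"
  kubectl set resources deployment "$deployment" --namespace "$namespace" \
    --containers "$container" \
    "--requests=memory=$request" "--limits=memory=$limit"
}

# Wait for the rolling update started by the change to finish.
wait_for_rollout() {
  namespace="${1:?namespace is required}"
  deployment="${2:?deployment name is required}"
  kubectl rollout status "deployment/$deployment" --namespace "$namespace"
}

# Example (hypothetical names and sizes):
#   set_memory my-namespace my-deployment my-container 256Mi 512Mi
#   wait_for_rollout my-namespace my-deployment
```

If the workload is deployed from manifests in version control, the same values should also be updated there, otherwise the next deployment will revert the change.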