Runbook

Kubernetes DaemonSet Multiple Restarts

Overview

This runbook covers an alert that fires when the pods of a Kubernetes DaemonSet restart multiple times within a short period. Frequent restarts can indicate a problem with the application or with the underlying infrastructure, and should be investigated and resolved promptly. Identifying the root cause and implementing a fix that prevents recurrence may require collaboration between the development and operations teams.

Parameters

Debug

List all daemonsets in the default namespace
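
A minimal sketch of this step, assuming `kubectl` is already configured for the affected cluster. The function name `list_daemonsets` is illustrative and not part of any tool.

```shell
# List DaemonSets with their desired, current and ready pod counts.
# $1: namespace (optional, defaults to "default").
list_daemonsets() {
  kubectl get daemonsets --namespace "${1:-default}"
}
```

Run `list_daemonsets` for the default namespace, or pass another namespace such as `kube-system`. A READY count lower than DESIRED points at the DaemonSet whose pods are failing.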

Describe a specific daemonset
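
One way to do this, sketched as a small wrapper. The function name and its argument order are assumptions made for illustration.

```shell
# Show the pod template, selector, pod status counts and recent events
# for one DaemonSet.
# $1: DaemonSet name (required).
# $2: namespace (optional, defaults to "default").
describe_daemonset() {
  kubectl describe daemonset "$1" --namespace "${2:-default}"
}
```

The Events section at the end of the output is usually the quickest place to spot failed pod creations, and the Selector field gives the label needed to list the DaemonSet's pods.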

Check the status of all daemonset pods
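
A possible form of this check, assuming the DaemonSet's pods can be selected by a label. The label `app=node-agent` used in the comment is a hypothetical example; take the real selector from the DaemonSet description.

```shell
# List the pods matching a label selector, including the node each one
# runs on and its restart count.
# $1: label selector, e.g. "app=node-agent" (hypothetical label).
# $2: namespace (optional, defaults to "default").
list_daemonset_pods() {
  kubectl get pods --namespace "${2:-default}" --selector "$1" -o wide
}
```

Pods in `CrashLoopBackOff` or with a high RESTARTS value are the ones to inspect further. The NODE column shows whether the restarts are confined to particular nodes.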

Get logs for a specific pod
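
For a pod that keeps restarting, the logs of the previous container instance are often more useful than the current ones. The sketch below assumes a single-container pod. The function name and the 100-line tail are illustrative choices.

```shell
# Print the last 100 log lines from the previous (terminated) instance
# of the pod's container.
# $1: pod name (required).
# $2: namespace (optional, defaults to "default").
previous_pod_logs() {
  kubectl logs "$1" --namespace "${2:-default}" --previous --tail=100
}
```

If the container has not restarted yet, `--previous` returns an error and should be dropped. For pods with several containers, add `-c` with the container name.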

Check the restart count for a specific pod
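
The restart count is stored per container in the pod status. A minimal sketch that reads it with a JSONPath query follows. The function name is an assumption.

```shell
# Print the restart count of every container in a pod, space-separated.
# $1: pod name (required).
# $2: namespace (optional, defaults to "default").
pod_restart_counts() {
  kubectl get pod "$1" --namespace "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[*].restartCount}'
}
```

Running it twice a few minutes apart shows whether the pod is still restarting or has stabilized.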

Check the status of all nodes in the cluster
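
A short sketch of the node check. The wrapper name is illustrative.

```shell
# List all nodes with their status, roles, version and addresses.
list_nodes() {
  kubectl get nodes -o wide
}
```

Any node whose STATUS is not `Ready` is a candidate for closer inspection, since a DaemonSet schedules one pod on each eligible node.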

Check the status of all pods running on a specific node
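
Pods can be filtered by the node they are scheduled on with a field selector. The sketch below assumes the node name is known from the previous steps. The function name is illustrative.

```shell
# List the pods from every namespace that are scheduled on one node.
# $1: node name (required).
pods_on_node() {
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}
```

If pods from unrelated workloads on the same node are also restarting, the problem is more likely with the node than with the DaemonSet itself.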

A likely cause: the cluster nodes may not have enough CPU or memory for the containers in the DaemonSet's pods, causing them to restart frequently.
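
One way to test this hypothesis, sketched under the assumption that a pod and its node were identified in the earlier steps. Both function names are illustrative.

```shell
# Print the reason each container in a pod last terminated,
# for example OOMKilled or Error.
# $1: pod name (required).
# $2: namespace (optional, defaults to "default").
last_termination_reasons() {
  kubectl get pod "$1" --namespace "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
}

# Show a node's conditions, capacity and allocated resources.
# $1: node name (required).
node_allocation() {
  kubectl describe node "$1"
}
```

A reason of `OOMKilled` means the container exceeded its memory limit. In the node description, conditions such as `MemoryPressure` and a nearly full "Allocated resources" section suggest the node itself is short on capacity.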

Repair

Increase the CPU and memory requests and limits of the DaemonSet's containers so that they no longer run out of memory or CPU and restart.
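
A hedged sketch of one way to apply this with `kubectl set resources`, assuming the DaemonSet uses the default RollingUpdate strategy. The function name is illustrative, and the resource values in the usage note are placeholders rather than recommendations.

```shell
# Update requests and limits on all containers of a DaemonSet, then
# wait for the rollout to finish.
# $1: DaemonSet name.
# $2: namespace.
# $3: requests, e.g. "cpu=100m,memory=256Mi" (placeholder values).
# $4: limits,   e.g. "cpu=200m,memory=512Mi" (placeholder values).
raise_daemonset_resources() {
  kubectl set resources daemonset "$1" --namespace "$2" \
    --requests="$3" --limits="$4" &&
  kubectl rollout status "daemonset/$1" --namespace "$2"
}
```

For example, `raise_daemonset_resources node-agent default cpu=100m,memory=256Mi cpu=200m,memory=512Mi`. Without a container filter the change applies to every container in the pod template. If the DaemonSet is managed from a manifest or chart, the same values should also be changed there so the next deployment does not revert them.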
