Runbook

Kubernetes Nodes with Network Unavailable


Overview

This incident type covers nodes in a Kubernetes cluster whose network is unavailable, so the nodes cannot be reached. Possible causes include a misconfiguration, route exhaustion, or a physical problem with the network connection to the hardware. It is a high-urgency incident: restore network connectivity to the affected nodes immediately.

Debug

Check if Kubernetes nodes are available
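
A minimal sketch of this check, assuming the node list has been exported with `kubectl get nodes -o json`. The function names `not_ready_nodes` and `fetch_nodes` are illustrative, not part of any existing tool.

```python
import json
import subprocess


def not_ready_nodes(node_list: dict) -> list[str]:
    """Return names of nodes whose Ready condition is not "True"."""
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A missing Ready condition is treated as not ready.
        if ready is None or ready.get("status") != "True":
            names.append(node["metadata"]["name"])
    return names


def fetch_nodes() -> dict:
    """Load the node list from the cluster (requires working kubectl access)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `not_ready_nodes(fetch_nodes())` on a machine with cluster access should list the nodes that need attention. A status of `Unknown` usually means the control plane has stopped hearing from the node, which is consistent with a network outage.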

Check the status of each Kubernetes node
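
Beyond readiness, each node reports a `NetworkUnavailable` condition that is `True` when the node's network is not correctly configured. The sketch below assumes the same `kubectl get nodes -o json` output as input. The helper name `network_unavailable` is hypothetical.

```python
def network_unavailable(node_list: dict) -> dict[str, str]:
    """Map node name -> "reason: message" for nodes with NetworkUnavailable=True."""
    affected = {}
    for node in node_list.get("items", []):
        for cond in node.get("status", {}).get("conditions", []):
            if cond.get("type") == "NetworkUnavailable" and cond.get("status") == "True":
                reason = cond.get("reason", "unknown")
                message = cond.get("message", "")
                affected[node["metadata"]["name"]] = f"{reason}: {message}"
    return affected
```

The reason and message on the condition often point at the responsible component (for example, a cloud route controller or the CNI plugin), so they are worth reading before going further.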

Check the network configuration of each Kubernetes node
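
One part of the node network configuration that can be checked offline is the pod CIDR assignment. The sketch below, again fed with `kubectl get nodes -o json` output, flags nodes without `spec.podCIDR` and nodes whose ranges overlap. This is an assumption-laden check: some CNI plugins manage pod addressing themselves and leave `spec.podCIDR` empty, in which case the "no podCIDR" findings can be ignored. The name `pod_cidr_problems` is illustrative.

```python
import ipaddress
from itertools import combinations


def pod_cidr_problems(node_list: dict) -> list[str]:
    """Report nodes with no podCIDR and pairs of nodes with overlapping podCIDRs."""
    cidrs = {}
    problems = []
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        cidr = node.get("spec", {}).get("podCIDR")
        if not cidr:
            problems.append(f"{name}: no podCIDR assigned")
        else:
            cidrs[name] = ipaddress.ip_network(cidr, strict=False)
    # Compare every pair of nodes once, in name order.
    for (a, net_a), (b, net_b) in combinations(sorted(cidrs.items()), 2):
        if net_a.version == net_b.version and net_a.overlaps(net_b):
            problems.append(f"{a} and {b}: overlapping podCIDRs {net_a} and {net_b}")
    return problems
```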

Check if there are any pods that are failing due to network issues
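
Pods that cannot get networking typically stay in `ContainerCreating` while the kubelet emits `FailedCreatePodSandBox` events. As a rough sketch, the function below filters the output of `kubectl get events -A -o json` for those events. Sandbox creation can fail for reasons other than networking, so treat the result as a list of candidates. The name `sandbox_failures` is hypothetical.

```python
def sandbox_failures(event_list: dict) -> list[tuple[str, str]]:
    """Return (namespace/pod, message) pairs for FailedCreatePodSandBox events."""
    failures = []
    for event in event_list.get("items", []):
        obj = event.get("involvedObject", {})
        if event.get("reason") == "FailedCreatePodSandBox" and obj.get("kind") == "Pod":
            pod = f"{obj.get('namespace', '')}/{obj.get('name', '')}"
            failures.append((pod, event.get("message", "")))
    return failures
```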

Check the status of the Kubernetes network components
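
Network components such as the CNI agent and kube-proxy commonly run as DaemonSets in the `kube-system` namespace, although the exact names and namespace depend on the cluster. Assuming that layout, the sketch below reads `kubectl get daemonsets -n kube-system -o json` output and reports DaemonSets with fewer ready pods than desired. The name `unhealthy_daemonsets` is illustrative.

```python
def unhealthy_daemonsets(ds_list: dict) -> dict[str, tuple[int, int]]:
    """Map DaemonSet name -> (ready, desired) where ready < desired."""
    unhealthy = {}
    for ds in ds_list.get("items", []):
        status = ds.get("status", {})
        desired = status.get("desiredNumberScheduled", 0)
        ready = status.get("numberReady", 0)
        if ready < desired:
            unhealthy[ds["metadata"]["name"]] = (ready, desired)
    return unhealthy
```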

Check if there are any network policies that could be blocking traffic
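
A quick way to narrow this down is to look for default-deny policies, that is, NetworkPolicies that select every pod in a namespace and list a direction without any allow rules. The sketch below assumes input from `kubectl get networkpolicies -A -o json`. The helper name `default_deny_policies` is hypothetical, and a hit is not necessarily a fault, since other policies in the same namespace may allow the required traffic.

```python
def default_deny_policies(policy_list: dict) -> list[tuple[str, str, str]]:
    """Return (namespace, name, direction) for policies that deny all traffic
    in a direction for every pod in their namespace."""
    found = []
    for policy in policy_list.get("items", []):
        spec = policy.get("spec", {})
        selector = spec.get("podSelector") or {}
        # An empty podSelector selects all pods in the namespace.
        if selector.get("matchLabels") or selector.get("matchExpressions"):
            continue
        # If policyTypes is omitted, Ingress is implied; Egress is only implied
        # when egress rules exist, which would not be a deny-all anyway.
        directions = spec.get("policyTypes") or ["Ingress"]
        for direction in directions:
            if not spec.get(direction.lower()):
                meta = policy["metadata"]
                found.append((meta["namespace"], meta["name"], direction))
    return found
```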

Check if there are any issues with Kubernetes Services

Check if there are any issues with Kubernetes Endpoints
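
Service and endpoint problems can be cross-checked together: a Service with a selector but no ready endpoint addresses cannot route traffic anywhere. The sketch below assumes two inputs, the output of `kubectl get services -A -o json` and of `kubectl get endpoints -A -o json`. The function name `services_without_endpoints` is illustrative.

```python
def services_without_endpoints(service_list: dict, endpoints_list: dict) -> list[str]:
    """Return "namespace/name" for selector-based Services with no ready addresses."""
    ready = set()
    for ep in endpoints_list.get("items", []):
        # Only "addresses" counts as ready; "notReadyAddresses" does not.
        if any(subset.get("addresses") for subset in ep.get("subsets") or []):
            ready.add((ep["metadata"]["namespace"], ep["metadata"]["name"]))
    missing = []
    for svc in service_list.get("items", []):
        key = (svc["metadata"]["namespace"], svc["metadata"]["name"])
        # Services without a selector are skipped, since their endpoints
        # are not derived from pods.
        if svc.get("spec", {}).get("selector") and key not in ready:
            missing.append(f"{key[0]}/{key[1]}")
    return missing
```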

Check if there are any issues with Kubernetes Ingress resources
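
For Ingress resources, one simple signal is whether the controller has published a load balancer address in the resource status. The sketch below assumes input from `kubectl get ingress -A -o json`. Some ingress controllers never populate this field, so an empty status is a hint to investigate, not proof of a fault. The name `ingresses_without_address` is hypothetical.

```python
def ingresses_without_address(ingress_list: dict) -> list[str]:
    """Return "namespace/name" for Ingress resources with no load balancer address."""
    pending = []
    for ing in ingress_list.get("items", []):
        addresses = ing.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        if not addresses:
            meta = ing["metadata"]
            pending.append(f"{meta['namespace']}/{meta['name']}")
    return pending
```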

Check whether firewall or security group settings are blocking network traffic on the affected nodes
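
Firewall and security group rules are provider-specific, but their effect can be tested from another machine with a plain TCP probe. The sketch below tries to open a connection to a given host and port. Port 10250 is the kubelet's default API port, while the node address used in the usage note is a placeholder. The name `tcp_reachable` is illustrative.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, `tcp_reachable("<node-ip>", 10250)` run from a control plane host indicates whether the kubelet port on that node is reachable. A timeout (as opposed to an immediate refusal) often suggests that a firewall is silently dropping packets.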

Check for routing issues in the cluster

Repair

Check if the routing tables are correctly configured to ensure that the nodes can communicate with each other.
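
As a rough sketch of the routing check, the function below compares each node's pod CIDR against a routing table and reports nodes whose range is not covered by any non-default route. It assumes the routes come from `ip -j route show` on a node (a JSON list of objects with a `dst` field) and that the cluster's CNI installs routes for pod CIDRs on the host, which is not true for every setup. The name `missing_routes` and the input shapes are illustrative.

```python
import ipaddress


def missing_routes(pod_cidrs: dict[str, str], routes: list[dict]) -> list[str]:
    """Return node names whose podCIDR is not covered by any non-default route."""
    networks = []
    for route in routes:
        dst = route.get("dst")
        if dst and dst != "default":
            networks.append(ipaddress.ip_network(dst, strict=False))
    missing = []
    for node, cidr in pod_cidrs.items():
        target = ipaddress.ip_network(cidr, strict=False)
        # A route covers the podCIDR if it is the same network or a supernet.
        covered = any(
            net.version == target.version and target.subnet_of(net)
            for net in networks
        )
        if not covered:
            missing.append(node)
    return missing
```

A node that shows up here on many peers at once suggests that its route was never created, which fits the route exhaustion cause mentioned in the overview.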

Check for any network security policies that may be blocking traffic between nodes and adjust them accordingly.
