Runbook

Airflow Worker Node Overload

Overview

Apache Airflow is a platform for creating, scheduling, and monitoring workflows. A worker node is the component that executes tasks, often many in parallel. A worker node is overloaded when it cannot keep up with the tasks assigned to it, which causes task and workflow failures. This runbook covers how to diagnose an overloaded Airflow worker node and restore capacity so that the platform continues to function properly.

Debug

Check CPU usage of worker nodes
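
One way to check CPU pressure is to compare the load average with the number of cores. The sketch below uses only the Python standard library and is meant to run on the worker host itself. The function names and the 1.0 load-per-core limit are illustrative assumptions, not values defined by Airflow.

```python
import os

# Assumed threshold: a sustained load above ~1.0 per core suggests CPU saturation.
LOAD_PER_CORE_LIMIT = 1.0


def cpu_load_per_core(load_avg: float, cores: int) -> float:
    """Return the load average normalised by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def is_cpu_overloaded(load_avg: float, cores: int,
                      limit: float = LOAD_PER_CORE_LIMIT) -> bool:
    """True when the normalised load exceeds the (assumed) limit."""
    return cpu_load_per_core(load_avg, cores) > limit


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"5-minute load per core: {cpu_load_per_core(five_min, cores):.2f}")
    print(f"overloaded: {is_cpu_overloaded(five_min, cores)}")
```

The 5-minute average is used here because a single short spike is usually less interesting than sustained pressure; adjust the window and limit to the workload.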

Check memory usage of worker nodes
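
On a Linux worker, memory usage can be read from `/proc/meminfo`. The following is a minimal sketch, assuming a Linux host whose kernel reports `MemAvailable`; the helper names are hypothetical.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value}, sizes in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def memory_used_fraction(meminfo: dict) -> float:
    """Fraction of memory in use, based on MemTotal and MemAvailable."""
    return 1.0 - meminfo["MemAvailable"] / meminfo["MemTotal"]


if __name__ == "__main__":
    path = Path("/proc/meminfo")  # Linux only
    if path.exists():
        info = parse_meminfo(path.read_text())
        print(f"memory used: {memory_used_fraction(info):.0%}")
```

`MemAvailable` is used rather than `MemFree` because it accounts for reclaimable page cache, which gives a more realistic picture of how much memory new tasks can obtain.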

Check disk usage of worker nodes
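
Disk usage can be checked with `shutil.disk_usage` from the Python standard library. This is a sketch only: the paths to inspect depend on where the deployment stores Airflow logs and temporary files, so the list below is an assumption.

```python
import os
import shutil


def disk_used_fraction(path: str = "/") -> float:
    """Fraction of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


if __name__ == "__main__":
    # Assumed locations; replace with the directories the workers actually use.
    for path in ("/", "/tmp"):
        if os.path.exists(path):
            print(f"{path}: {disk_used_fraction(path):.0%} used")
```

The figure may differ slightly from what `df` reports, since `df` computes its percentage from the space available to unprivileged users.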

Check airflow worker logs for errors
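
A quick first pass over a worker log is to count lines that contain common error markers. The sketch below assumes the log file path is passed as the first command-line argument, because log locations vary by deployment; the marker list and function name are illustrative.

```python
import sys
from collections import Counter

# Assumed markers; extend with strings specific to your deployment.
MARKERS = ("ERROR", "CRITICAL", "Traceback")


def count_markers(lines, markers=MARKERS) -> Counter:
    """Count how many lines contain each marker string."""
    counts = Counter()
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # errors="replace" avoids crashing on stray non-UTF-8 bytes in the log.
    with open(sys.argv[1], errors="replace") as handle:
        for marker, count in count_markers(handle).items():
            print(f"{marker}: {count}")
```

A rising count of these markers alongside high resource usage is a hint, not proof, that the worker is failing tasks because of overload; the matching lines still need to be read.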

Check the number of running tasks on the worker nodes
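
The number of running tasks per worker can be estimated from the Airflow metadata database, where each task instance records its state and the hostname it runs on. The sketch below assumes Airflow 2.x and a host that can reach the metadata database; `fetch_running_hostnames` and `running_tasks_by_host` are hypothetical helper names.

```python
from collections import Counter


def running_tasks_by_host(hostnames) -> Counter:
    """Count running task instances per worker hostname, skipping blanks."""
    return Counter(name for name in hostnames if name)


def fetch_running_hostnames() -> list:
    """Return the hostname of every task instance currently in the running state.

    Assumes Airflow 2.x. Imports are done lazily so that the counting helper
    above can be used without Airflow installed.
    """
    from airflow.models import TaskInstance
    from airflow.utils.session import create_session
    from airflow.utils.state import State

    with create_session() as session:
        rows = (
            session.query(TaskInstance.hostname)
            .filter(TaskInstance.state == State.RUNNING)
            .all()
        )
        return [row[0] for row in rows]
```

On a host with Airflow installed, `running_tasks_by_host(fetch_running_hostnames())` gives the per-worker counts, which can then be compared with the concurrency configured for each worker.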

Possible cause: configuration issues, such as worker node resource settings (CPU, memory, or disk space) that do not match the workload requirements.
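
A rough sanity check for this kind of mismatch is to compare the configured task concurrency with what the node's CPU and memory can support. The sketch below is a back-of-the-envelope estimate: the per-task memory figure and the tasks-per-core ratio are assumptions that must be measured for the actual workload. With the Celery executor, the result could be compared against the `worker_concurrency` option in the `[celery]` section of the Airflow configuration.

```python
def max_safe_concurrency(cores: int, memory_mib: float,
                         per_task_memory_mib: float,
                         tasks_per_core: float = 1.0) -> int:
    """Estimate how many tasks a worker can run at once.

    The limit is the smaller of the CPU-based and memory-based estimates,
    and never less than 1.
    """
    by_cpu = int(cores * tasks_per_core)
    by_memory = int(memory_mib // per_task_memory_mib)
    return max(1, min(by_cpu, by_memory))


if __name__ == "__main__":
    # Hypothetical node: 4 cores, 8 GiB RAM, tasks that need about 1 GiB each.
    print(max_safe_concurrency(cores=4, memory_mib=8192, per_task_memory_mib=1024))
```

If the configured concurrency is well above this estimate, lowering it or resizing the node are both reasonable options to evaluate.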

Repair

Increase the capacity of the worker node by adding more resources such as CPU, memory, or storage.
