Runbook
Overloading of Resources in Apache Airflow
Overview
This incident type covers overloading of system resources caused by too much task parallelism or concurrency in Apache Airflow, a platform for scheduling and monitoring workflows. When too many tasks run at once, they can exhaust the available resources, causing individual tasks or even the entire system to fail. Resolving it means limiting how many tasks can run simultaneously and sizing the resources each task requires so that the system keeps performing smoothly.
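As a rough illustration of where these limits live: Airflow reads its concurrency settings from airflow.cfg or from matching AIRFLOW__SECTION__KEY environment variables. The sketch below assumes the option names used by Airflow 2.2 and later. The values are illustrative only, not tuning recommendations.

```shell
# Illustrative values only; tune them to the capacity of your workers.
# Option names assume Airflow 2.2+; check the configuration reference for your version.

# Maximum number of task instances that can run at once per scheduler
export AIRFLOW__CORE__PARALLELISM=32
# Maximum number of task instances that can run at once within a single DAG
export AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG=16
# Maximum number of active DAG runs per DAG
export AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG=16
```

Lowering these values trades throughput for headroom. Running `airflow config get-value core parallelism` inside an Airflow container should show the value actually in effect.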
Parameters
Debug
List all running pods in the Airflow namespace
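The original command for this step is not shown. A minimal sketch, assuming Airflow runs on Kubernetes in a namespace named `airflow` (override `NAMESPACE` if yours differs); `list_airflow_pods` is a hypothetical helper name:

```shell
# Assumed namespace; override with NAMESPACE=<your-namespace>.
NAMESPACE="${NAMESPACE:-airflow}"

list_airflow_pods() {
  # The field selector keeps only pods whose phase is Running.
  kubectl get pods --namespace "$NAMESPACE" --field-selector=status.phase=Running
}
```

Call `list_airflow_pods` to print the pods. Dropping the field selector also shows Pending or Failed pods, which is often useful when resources are exhausted.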
View the logs of a specific pod
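One possible form of this step, assuming the `airflow` namespace; `pod_logs` is a hypothetical helper, and the pod name comes from the pod listing:

```shell
# Assumed namespace; override with NAMESPACE=<your-namespace>.
NAMESPACE="${NAMESPACE:-airflow}"

pod_logs() {
  # $1: pod name (required). --tail limits output to the most recent 200 lines.
  kubectl logs "${1:?usage: pod_logs POD_NAME}" --namespace "$NAMESPACE" --tail=200
}
```

For example, `pod_logs my-worker-pod`, where `my-worker-pod` is a placeholder. Adding `--previous` to the `kubectl logs` call shows the logs of a container that has already restarted, for instance after being killed for exceeding its memory limit.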
Check the status of the Airflow scheduler
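A sketch of one way to do this, assuming the scheduler pods carry the label `component=scheduler` (the label I believe the official Helm chart applies; other installations may use a different one). `scheduler_status` and `SCHEDULER_SELECTOR` are hypothetical names:

```shell
# Assumed namespace and label selector; override either to match your deployment.
NAMESPACE="${NAMESPACE:-airflow}"
SCHEDULER_SELECTOR="${SCHEDULER_SELECTOR:-component=scheduler}"

scheduler_status() {
  # -o wide adds the pod IP and node; the RESTARTS column helps spot crash loops.
  kubectl get pods --namespace "$NAMESPACE" --selector "$SCHEDULER_SELECTOR" -o wide
}
```

A scheduler pod that is not `Running`, or whose restart count keeps growing, is a strong hint that it is being starved of resources.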
Check the resource usage of a specific pod
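A minimal sketch for this step, assuming the `airflow` namespace and that the Kubernetes Metrics Server is installed in the cluster (`kubectl top` depends on it). `pod_usage` is a hypothetical helper name:

```shell
# Assumed namespace; override with NAMESPACE=<your-namespace>.
NAMESPACE="${NAMESPACE:-airflow}"

pod_usage() {
  # $1: pod name (required). --containers breaks usage down per container.
  kubectl top pod "${1:?usage: pod_usage POD_NAME}" --namespace "$NAMESPACE" --containers
}
```

For example, `pod_usage my-worker-pod`, where `my-worker-pod` is a placeholder for a name taken from the pod listing.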
Check the resource usage of all pods in the Airflow namespace
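The namespace-wide variant could look like the following, again assuming the `airflow` namespace and an installed Metrics Server; `all_pod_usage` is a hypothetical helper name:

```shell
# Assumed namespace; override with NAMESPACE=<your-namespace>.
NAMESPACE="${NAMESPACE:-airflow}"

all_pod_usage() {
  # $1: sort key, either "cpu" or "memory" (defaults to memory).
  kubectl top pods --namespace "$NAMESPACE" --sort-by="${1:-memory}"
}
```

Sorting puts the heaviest consumers first, so `all_pod_usage cpu` or `all_pod_usage memory` quickly shows which pods are driving the overload.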
Check the resource requests and limits for a specific pod
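One way to read the configured requests and limits is a JSONPath query over the pod spec. This is a sketch assuming the `airflow` namespace; `pod_resources` is a hypothetical helper name:

```shell
# Assumed namespace; override with NAMESPACE=<your-namespace>.
NAMESPACE="${NAMESPACE:-airflow}"

pod_resources() {
  # $1: pod name (required).
  # Prints one line per container: its name, a tab, then its resources block
  # (requests and limits) as stored in the pod spec.
  kubectl get pod "${1:?usage: pod_resources POD_NAME}" --namespace "$NAMESPACE" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'
}
```

An empty resources block means the container has no requests or limits set, so nothing at the container level prevents it from consuming as much of the node as it can.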
Check the resource requests and limits for all pods in the Airflow namespace
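For the whole namespace, a `custom-columns` table is one compact option. The sketch below assumes the `airflow` namespace; `all_pod_resources`, `RESOURCE_COLUMNS`, and the column headers are illustrative choices:

```shell
# Assumed namespace; override with NAMESPACE=<your-namespace>.
NAMESPACE="${NAMESPACE:-airflow}"

# Column spec built piece by piece for readability; [*] covers every container in a pod.
RESOURCE_COLUMNS="POD:.metadata.name"
RESOURCE_COLUMNS="${RESOURCE_COLUMNS},CPU_REQ:.spec.containers[*].resources.requests.cpu"
RESOURCE_COLUMNS="${RESOURCE_COLUMNS},MEM_REQ:.spec.containers[*].resources.requests.memory"
RESOURCE_COLUMNS="${RESOURCE_COLUMNS},CPU_LIM:.spec.containers[*].resources.limits.cpu"
RESOURCE_COLUMNS="${RESOURCE_COLUMNS},MEM_LIM:.spec.containers[*].resources.limits.memory"

all_pod_resources() {
  # One row per pod; multi-container pods show comma-separated values per column.
  kubectl get pods --namespace "$NAMESPACE" -o custom-columns="$RESOURCE_COLUMNS"
}
```

Comparing this table with the live usage figures shows whether the requests and limits are realistic for the workload.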
Repair