In this tutorial, you’ll learn how to install Apache Airflow on a Kubernetes cluster. We’ll first deploy Airflow on Kubernetes using Helm, run a sample DAG, and then clean up the installation. By the end, you’ll have a fully functional Airflow setup on Kubernetes.
Please note that a basic understanding of Kubernetes is required to follow this tutorial.
Prerequisites #
Before you start, make sure you have the following available:
- A Kubernetes cluster (a local cluster provided by Docker Desktop or Minikube also works)
- kubectl to manage Kubernetes resources
- Helm to deploy resources based on Helm charts
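Before continuing, it can help to confirm that the command-line tools are actually on your PATH. The following is a minimal sketch; the function name check_tools is hypothetical and not part of kubectl or Helm.

```shell
# Report any required CLI tool that is not on PATH.
# Prints one line per missing tool and returns non-zero if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_tools kubectl helm && kubectl cluster-info
```

If both tools are found, kubectl cluster-info then checks that kubectl can reach your cluster.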
Step 1: Create airflow namespace #
To start, let’s create a dedicated namespace for Airflow within the Kubernetes cluster. Using a separate namespace helps isolate Airflow’s resources, making it easier to manage, monitor, and troubleshoot while preventing conflicts with other deployments.
kubectl create ns airflow
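Running the create command a second time fails because the namespace already exists. If you expect to repeat this tutorial, a small wrapper can make the step safe to re-run. This is a sketch only; ensure_namespace is a hypothetical helper name.

```shell
# Create the namespace only when it does not exist yet,
# so the step can be repeated without an "AlreadyExists" error.
ensure_namespace() {
  ns="$1"
  kubectl get namespace "$ns" >/dev/null 2>&1 || kubectl create namespace "$ns"
}

# Usage: ensure_namespace airflow
```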
Step 2: Add Airflow Helm repository #
Now that the Airflow namespace is in place, we need to add the Airflow Helm repository. Use the following commands:
helm repo add apache-airflow https://airflow.apache.org
helm repo update
helm search repo airflow
To confirm that the Airflow repository has been added, run:
helm repo list
The result should include the name of the Airflow repository.
NAME            URL
apache-airflow  https://airflow.apache.org
With the repository configured, deploy Apache Airflow to your Kubernetes cluster using the Helm install command below.
helm install airflow apache-airflow/airflow --namespace airflow --debug
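The install command above uses whatever chart version is newest in your local repository cache, and it fails if a release named airflow already exists. As an alternative sketch, helm upgrade --install installs the release on the first run and upgrades it on later runs, and --version pins the chart. The function name install_airflow is hypothetical, and the chart version is an argument you supply yourself (helm search repo apache-airflow/airflow --versions lists the available ones).

```shell
# Install the release if it is absent, upgrade it otherwise.
# An optional first argument pins the chart version.
install_airflow() {
  if [ -n "${1:-}" ]; then
    helm upgrade --install airflow apache-airflow/airflow \
      --namespace airflow --version "$1"
  else
    helm upgrade --install airflow apache-airflow/airflow \
      --namespace airflow
  fi
}

# Usage: install_airflow <chart-version>
```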
Step 3: Verify the Deployment #
To ensure that Airflow has been deployed successfully, you can check the status of the pods.
kubectl get pods -n airflow
You should get results similar to the following.
NAME                                 READY   STATUS    RESTARTS   AGE
airflow-postgresql-0                 1/1     Running   0          5m17s
airflow-redis-0                      1/1     Running   0          5m17s
airflow-scheduler-6f48fcad45-kkxmr   2/2     Running   0          5m17s
airflow-statsd-7d985bcb6f-b42t7      1/1     Running   0          5m17s
airflow-triggerer-0                  2/2     Running   0          5m17s
airflow-webserver-9f64f5b98-gv4sk    1/1     Running   0          5m17s
airflow-worker-0                     2/2     Running   0          5m17s
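Instead of re-running the pod listing until everything shows Running, you can block until the main components have rolled out. The sketch below assumes the scheduler and webserver are Deployments named airflow-scheduler and airflow-webserver, which is what the pod names above suggest; wait_for_airflow is a hypothetical helper name and the 300-second timeout is an arbitrary choice.

```shell
# Wait until the scheduler and webserver Deployments report a
# completed rollout; stop at the first one that fails or times out.
wait_for_airflow() {
  namespace="$1"
  for deploy in airflow-scheduler airflow-webserver; do
    kubectl rollout status "deployment/$deploy" \
      --namespace "$namespace" --timeout=300s || return 1
  done
}

# Usage: wait_for_airflow airflow
```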
Step 4: Port Forwarding for Airflow #
The deployment also creates a service named airflow-webserver. To list the services and confirm its name, run:
kubectl get svc -n airflow
The result should resemble the output below.
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
airflow-postgresql      ClusterIP   10.108.13.202   <none>        5432/TCP            6m47s
airflow-postgresql-hl   ClusterIP   None            <none>        5432/TCP            6m47s
airflow-redis           ClusterIP   10.101.6.45     <none>        6379/TCP            6m47s
airflow-statsd          ClusterIP   10.102.16.173   <none>        9125/UDP,9102/TCP   6m47s
airflow-triggerer       ClusterIP   None            <none>        8794/TCP            6m47s
airflow-webserver       ClusterIP   10.109.38.86    <none>        8080/TCP            6m47s
airflow-worker          ClusterIP   None            <none>        8793/TCP            6m47s
Now, run the following command to forward the service’s port to your local machine.
kubectl port-forward svc/airflow-webserver 8080:8080 -n airflow
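The port-forward command keeps running in the foreground until you stop it with Ctrl+C, so leave that terminal open. If local port 8080 is already in use, you can map a different local port to the service’s port 8080. A minimal sketch, with forward_webserver as a hypothetical helper name:

```shell
# Forward a chosen local port (default 8080) to port 8080 of the
# airflow-webserver service. Blocks until interrupted.
forward_webserver() {
  local_port="${1:-8080}"
  kubectl port-forward svc/airflow-webserver "${local_port}:8080" \
    --namespace airflow
}

# Usage: forward_webserver 8081   # then browse to http://localhost:8081
```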
Step 5: Log in to Airflow #
After starting the port forward, wait a moment for the Airflow webserver to become ready.
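Rather than guessing how long to wait, you can poll the webserver until it answers. The sketch below assumes the port forward from the previous step is running in another terminal and that the webserver exposes a /health endpoint, as Airflow 2.x does; the helper name wait_for_ui, the 30 attempts, and the 2-second pause are arbitrary choices.

```shell
# Poll a URL until it returns a successful HTTP response,
# giving up after a fixed number of attempts.
wait_for_ui() {
  url="${1:-http://localhost:8080/health}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Usage: wait_for_ui && echo "Airflow UI is reachable"
```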
Open a web browser and navigate to http://localhost:8080. You should see the Apache Airflow login page.
Log in using the following credentials:
- Username: admin
- Password: admin
Please note that you should not use these for production environments.
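If you keep the installation around for more than a quick test, one option is to create your own admin account with the Airflow CLI inside the webserver pod. This is a sketch under assumptions: it relies on the airflow users create command being available in the webserver container and on the Deployment being named airflow-webserver; create_admin_user and the first and last name values are placeholders.

```shell
# Create an additional Admin user by running the Airflow CLI
# inside the webserver Deployment's pod.
create_admin_user() {
  username="$1"
  password="$2"
  email="$3"
  kubectl exec --namespace airflow deploy/airflow-webserver -- \
    airflow users create \
      --username "$username" \
      --password "$password" \
      --firstname Admin \
      --lastname User \
      --role Admin \
      --email "$email"
}

# Usage: create_admin_user <username> <password> <email>
```

Passing a password as a command-line argument leaves it in your shell history, so treat this as a convenience for test clusters only.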
Step 6: Cleanup #
Once you’ve completed your tasks and want to uninstall the Airflow Helm release, use the following command:
helm uninstall airflow --namespace airflow
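Uninstalling the release typically leaves behind PersistentVolumeClaims that were created for StatefulSets, such as the PostgreSQL data volume, as well as the namespace itself. The sketch below lists any remaining claims and then deletes the whole namespace. It is destructive and assumes the namespace holds nothing but this tutorial’s Airflow installation; cleanup_airflow is a hypothetical helper name.

```shell
# Show leftover PersistentVolumeClaims, then delete the namespace
# and every resource that still lives in it.
cleanup_airflow() {
  namespace="$1"
  kubectl get pvc --namespace "$namespace"
  kubectl delete namespace "$namespace"
}

# Usage: cleanup_airflow airflow
```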
You may also want to monitor your Airflow installation. To do so, you can set up a monitoring platform by following our guide, Deploy Prometheus Operator in Kubernetes.