## First Time Setup

- Create a Python virtual environment in the `.venv` folder:

  ```bash
  python3 -m venv .venv
  ```

- Create a `.env` file from the example:

  ```bash
  cp env.example .env
  ```
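For reference, a `.env` file holds `KEY=value` pairs. The keys below are hypothetical placeholders; copy the actual keys defined in `env.example`:

```
# Hypothetical keys -- use the ones defined in env.example
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
MYSQL_PASSWORD=your-mysql-password
```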
## Activating Virtual Environment

- Activate the virtual environment so that Python uses the packages installed inside your `.venv` folder:

  ```bash
  source .venv/bin/activate
  ```

- Check that Python is using the `.venv` folder:

  ```bash
  which python3
  ```
## Install Packages

- Install the latest packages for the project:

  ```bash
  pip install -r requirements.txt
  ```

- Run the tests with pytest. Note that all test files must end in `_test.py`; a minimal example is shown right after this step.

  ```bash
  pytest
  ```
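For example, a minimal test file following that naming convention (the file name and the function under test are hypothetical):

```python
# math_utils_test.py -- the file name ends in _test.py so pytest collects it
def add(a, b):
    return a + b


def test_add():
    assert add(2, 3) == 5
```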
- After adding or upgrading packages, update the requirements file:

  ```bash
  pip freeze > requirements.txt
  ```

- Install Airflow, pinning its dependencies with the constraints file. Change `constraints-3.8.txt` to match your Python version:

  ```bash
  pip install 'apache-airflow==2.9.1' \
      --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.9.1/constraints-3.8.txt"
  ```

- More details in the Airflow documentation.
- Create an `airflow` folder under your project, and create a `dags` folder under `airflow`.
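The layout should look like this (the project folder name is a placeholder; by default Airflow looks for DAG files in `$AIRFLOW_HOME/dags`):

```
your-project/
└── airflow/        <- point AIRFLOW_HOME here
    └── dags/       <- put your DAG files here
```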
- To configure Airflow to recognize your DAGs directory, set the `AIRFLOW_HOME` environment variable, replacing `/path/to/dags/folder/parent/folder` with the actual path to your desired directory:

  ```bash
  export AIRFLOW_HOME=/path/to/dags/folder/parent/folder
  ```

- Run Airflow standalone and get the default username and password. The command initializes the database, creates a user, and starts all components:

  ```bash
  airflow standalone
  ```
- If you prefer to run individual components of Airflow manually, or if you need personalized user information, instead of using the all-in-one standalone command, you can run the following:

  ```bash
  airflow db init
  airflow users create \
      --username admin \
      --firstname Peter \
      --lastname Parker \
      --role Admin \
      --email [email protected]
  airflow webserver --port 8080
  airflow scheduler
  airflow triggerer
  ```

- More details in the Airflow documentation.
- Access the Airflow UI: visit `localhost:8080` in your browser.

- Connect to AWS S3: choose Connections under Admin and create a new connection. Input `oncokb_s3` in Connection Id, choose Amazon Web Services as the Connection Type, and input the AWS Access Key ID and AWS Secret Access Key.
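As a quick way to verify the S3 connection from a task, here is a minimal sketch using the Amazon provider's `S3Hook`; the bucket and key names are hypothetical, and it assumes `apache-airflow-providers-amazon` is installed:

```python
# Hypothetical sketch: read one object through the oncokb_s3 connection.
# Assumes the apache-airflow-providers-amazon package is installed.
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


def read_example_object():
    hook = S3Hook(aws_conn_id="oncokb_s3")
    # read_key returns the object's contents as a string
    return hook.read_key(key="path/to/object.txt", bucket_name="my-bucket")
```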
- Connect to MySQL: choose Connections under Admin and create a new connection. Input `oncokb_mysql` in Connection Id, choose MySQL as the Connection Type, and input the Host, Schema, Login, Password, and Port.
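Similarly, a minimal sketch using the MySQL provider's `MySqlHook`; the query is a placeholder, and it assumes `apache-airflow-providers-mysql` is installed:

```python
# Hypothetical sketch: run a query through the oncokb_mysql connection.
# Assumes the apache-airflow-providers-mysql package is installed.
from airflow.providers.mysql.hooks.mysql import MySqlHook


def check_mysql_connection():
    hook = MySqlHook(mysql_conn_id="oncokb_mysql")
    # get_records returns the result set as a list of tuples
    return hook.get_records("SELECT 1")
```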
- If you want to hide the example DAGs or default connections on the Airflow webserver, open `airflow.cfg` and set `load_examples = False` or `load_default_connections = False`.
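To have something to test, place at least one DAG file in the `dags` folder. Here is a minimal sketch; the file name, `dag_id`, and task are hypothetical:

```python
# airflow/dags/hello_dag.py -- a minimal, hypothetical DAG for smoke-testing
# the setup above.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    print("hello from airflow")


with DAG(
    dag_id="hello_dag",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # only runs when triggered or tested manually
    catchup=False,
) as dag:
    PythonOperator(task_id="say_hello", python_callable=say_hello)
```

With this file in place, `airflow dags test hello_dag` should run its single task.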
- Test your DAG with the Airflow CLI; you can check the result and logs in the Airflow UI:

  ```bash
  airflow dags test <dag_id>
  ```

- You can use the CLI to list all the DAGs you have:

  ```bash
  airflow dags list
  ```