This is the repository for the Manifold Airflow DAGs (Directed Acyclic Graphs, i.e., data processing workflows) and related jobs. These DAGs are expected to run within an Airflow installation akin to the one built by our TUL Airflow Playbook (private repository).
This repository has 3 main groups of files:
- Airflow DAG definition Python files (ending with `_dag.py`);
- Airflow DAG task Python files used by the above (starting with `task_`);
- and required local development, test, deployment, and CI files (`tests`, `configs`, `.travis`, Pipfile, etc.).
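The naming convention above can be sketched as a small classifier. This is only an illustration of the convention, not code from this repository; the function name `classify`, the group labels, and the example file names are all hypothetical.

```python
from pathlib import PurePosixPath


def classify(path: str) -> str:
    """Return which of the three file groups a repository path falls into.

    Only the file name is inspected, not its parent directories.
    """
    name = PurePosixPath(path).name
    if name.endswith("_dag.py"):
        return "dag_definition"
    if name.startswith("task_") and name.endswith(".py"):
        return "dag_task"
    # Everything else: tests, configs, CI files, Pipfile, and so on.
    return "support"


# Hypothetical file names, used only to illustrate the convention.
examples = ["example_dag.py", "task_example.py", "tests/test_example.py", "Pipfile"]
groups = {path: classify(path) for path in examples}
```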
The following are the Airflow expectations for the DAGs to successfully run:
Libraries & Packages
- Python Version and Packages: see the Pipfile
Airflow Variables
See the `variables.json` file.
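As a rough sketch of how the expected variables could be checked before a DAG run, the snippet below parses a variables file and reports missing names. It assumes `variables.json` is a flat JSON object mapping variable names to values, which is the shape Airflow's variable import accepts. The variable names shown are hypothetical placeholders, not the ones this repository actually requires.

```python
import json


def load_variables(text: str) -> dict:
    """Parse the contents of a variables file into a name -> value mapping."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of variable names to values")
    return data


def missing_variables(variables: dict, required: list) -> list:
    """Return the required variable names that are absent, in order."""
    return [name for name in required if name not in variables]


# Hypothetical variable names and values, for illustration only.
sample = '{"EXAMPLE_BATCH_SIZE": 100, "EXAMPLE_SOLR_COLLECTION": "example"}'
variables = load_variables(sample)
missing = missing_variables(variables, ["EXAMPLE_BATCH_SIZE", "EXAMPLE_TIMEOUT"])
```

In practice the `sample` string would be replaced by the contents of the real file, and the required list by the names the DAGs actually read.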
Airflow Connections
- `SOLRCLOUD`: An HTTP Connection used to connect to SolrCloud.
- `AIRFLOW_S3`: An AWS Connection (not an S3 Connection, as of the latest Airflow upgrade) used to manage AWS credentials, which we use to interact with our Airflow Data S3 Bucket.
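One way to supply these two connections in a local setup is through Airflow's `AIRFLOW_CONN_<CONN_ID>` environment variables, whose values are connection URIs. The sketch below only builds the variable names and URI strings; whether this repository configures connections this way (rather than through the Airflow UI, for example) is an assumption, and the host, port, and credentials are placeholders.

```python
from urllib.parse import quote


def conn_env_name(conn_id: str) -> str:
    """Airflow reads connections from variables named AIRFLOW_CONN_<CONN_ID>."""
    return f"AIRFLOW_CONN_{conn_id.upper()}"


def http_uri(host: str, port: int) -> str:
    """Build a URI for an HTTP connection."""
    return f"http://{host}:{port}"


def aws_uri(access_key: str, secret_key: str) -> str:
    """Build a URI for an AWS connection.

    Credentials are percent-encoded so characters such as "/" survive URI parsing.
    """
    return f"aws://{quote(access_key, safe='')}:{quote(secret_key, safe='')}@"


# Placeholder host and credentials, for illustration only.
env = {
    conn_env_name("SOLRCLOUD"): http_uri("solr.example.org", 8983),
    conn_env_name("AIRFLOW_S3"): aws_uri("EXAMPLEKEY", "example/secret"),
}
```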
The following commands are available for local testing and development:
- `make up`: Sets up local Airflow with these DAGs.
- `make down`: Shuts down the local setup.
- `make reload`: Reloads configurations for the local setup.
- `make tty-webserver`: Enters the Airflow webserver container instance.
- `make tty-worker`: Enters the Airflow worker container instance.
- `make tty-schedular`: Enters the Airflow scheduler container instance.