Preflow is an open-source, Prefect-based workflow system designed to automate and orchestrate data processing for CKAN. It supports a wide range of data operations, including harvesting, validation, transformation, and loading of resource files into the CKAN Datastore. Preflow is built for flexibility and can be extended with additional pipeline tasks to suit your requirements.
- ckan_datastore_ingestion: Ingests resource files into the CKAN Datastore. This flow is triggered by the ckanext-preflow extension when a resource is created or updated in CKAN.
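To make the idea of loading a resource file into the Datastore concrete, here is a minimal sketch of how CSV rows might be turned into a request for CKAN's `datastore_create` action. It is not Preflow's actual implementation. The function name `build_datastore_request`, the CKAN URL, the token, and the resource ID are hypothetical placeholders, and every column is declared as `text` for simplicity.

```python
import csv
import io
import json
import urllib.request


def build_datastore_request(ckan_url, api_token, resource_id, csv_text):
    """Build (but do not send) a CKAN datastore_create request from CSV text.

    Hypothetical helper for illustration; not part of Preflow.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = list(reader)
    # Assumption: every column is stored as text. A real pipeline would
    # likely infer or validate column types before loading.
    fields = [{"id": name, "type": "text"} for name in (reader.fieldnames or [])]
    payload = {"resource_id": resource_id, "fields": fields, "records": records}
    return urllib.request.Request(
        url=f"{ckan_url.rstrip('/')}/api/3/action/datastore_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_token, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; replace with your own CKAN instance details.
request = build_datastore_request(
    "https://ckan.example.org",
    "hypothetical-token",
    "hypothetical-resource-id",
    "city,population\nOslo,700000\n",
)
# urllib.request.urlopen(request) would send it to a live CKAN instance.
```

The request is only constructed here, so the sketch can be inspected without a running CKAN instance.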
- Clone this repository and install dependencies:
  git clone https://github.com/datopian/preflow.git
- Build and run the Docker container:
  docker compose -f docker-compose.yml up --build
- Deploy the Prefect flow:
  docker exec -it prefect-worker prefect deploy -n ckan_datastore_ingestion
- Trigger the flow run from the Prefect API, UI, or CLI.
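As an illustration of the API route, the sketch below builds the two Prefect REST calls typically involved: looking up a deployment by name, then creating a flow run for it. Several values are assumptions rather than facts from this repository: the API URL (Prefect's default local address), the flow name `ckan-datastore-ingestion`, and the `resource_id` parameter. Check your own deployment for the real values.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prefect's default local API address; your Docker setup may differ.
PREFECT_API_URL = "http://127.0.0.1:4200/api"


def deployment_lookup_request(api_url, flow_name, deployment_name):
    """Build a GET request that reads a deployment by flow and deployment name."""
    path = "/".join(
        urllib.parse.quote(part, safe="") for part in (flow_name, deployment_name)
    )
    return urllib.request.Request(f"{api_url}/deployments/name/{path}", method="GET")


def create_flow_run_request(api_url, deployment_id, parameters):
    """Build a POST request that creates a flow run from a deployment."""
    return urllib.request.Request(
        f"{api_url}/deployments/{deployment_id}/create_flow_run",
        data=json.dumps({"parameters": parameters}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_flow_run(api_url, flow_name, deployment_name, parameters):
    """Send both requests and return the new flow run's ID."""
    lookup = deployment_lookup_request(api_url, flow_name, deployment_name)
    with urllib.request.urlopen(lookup) as response:
        deployment_id = json.load(response)["id"]
    create = create_flow_run_request(api_url, deployment_id, parameters)
    with urllib.request.urlopen(create) as response:
        return json.load(response)["id"]


# Hypothetical flow name and parameter, shown without sending anything.
lookup = deployment_lookup_request(
    PREFECT_API_URL, "ckan-datastore-ingestion", "ckan_datastore_ingestion"
)
create = create_flow_run_request(
    PREFECT_API_URL, "hypothetical-deployment-id", {"resource_id": "hypothetical-id"}
)
```

Calling `trigger_flow_run(...)` with the same arguments would perform the requests against a running Prefect server.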
- Edit flows/prefect.yaml to adjust deployments, parameters, and schedules.
- To use the ckan_datastore_ingestion flow, the ckanext-preflow extension must be installed and enabled on your CKAN instance, since it is what triggers the Prefect flow.