MACHINE LEARNING OPERATIONS PIPELINE

Complete MLOps pipeline: 6-stage ML workflow, Kubernetes deployment, Prometheus monitoring, Airflow orchestration, CI/CD automation, and an observability stack (Prometheus, EFK, Jaeger).

Built with the tools and technologies:

Flask · JSON · Markdown · YAML · scikit-learn · XGBoost · DVC · NumPy · pandas · Kaggle · MLflow · Docker · Kubernetes · Python · Apache Airflow · GitHub Actions · Prometheus · Grafana · Elasticsearch


About the Project & Approach

This repository demonstrates a comprehensive MLOps (Machine Learning Operations) pipeline that showcases industry-standard practices for end-to-end machine learning workflow automation. The project implements a production-ready ML system with automated training, validation, deployment, and monitoring capabilities.

🎯 Project Approach

This MLOps pipeline follows a modular, scalable architecture designed to handle real-world machine learning challenges:

  1. Data-Centric Approach: Implements robust data ingestion, validation, and feature engineering processes
  2. Model-Centric Operations: Automated model training, evaluation, and deployment with performance tracking
  3. Infrastructure-as-Code: Kubernetes manifests and Docker containerization for scalable deployments
  4. Observability-First: Comprehensive monitoring with metrics, logs, and distributed tracing
  5. CI/CD Integration: Automated testing, security scanning, and deployment pipelines

What I Built

Core ML Pipeline Components:

  • πŸ”„ 6-Stage DVC Pipeline: Data ingestion β†’ Validation β†’ Feature engineering β†’ Transformation β†’ Training β†’ Evaluation
  • πŸ€– XGBoost Model: Achieved 92.15% accuracy with automated hyperparameter tuning
  • πŸ“Š MLflow Integration: Experiment tracking and model registry for version control
  • πŸ” Data Quality Monitoring: Evidently AI for drift detection and quality assessment

Production Infrastructure:

  • 🐳 Docker Containerization: Multi-stage builds with security best practices
  • ☸️ Kubernetes Deployment: Scalable orchestration with services, ingress, and monitoring
  • πŸŒͺ️ Apache Airflow: Workflow orchestration and scheduling for automated pipeline execution
  • πŸš€ GitHub Actions CI/CD: Automated testing, security scanning, and deployment

Observability Stack:

  • πŸ“ˆ Prometheus: Metrics collection and alerting for model performance
  • πŸ“Š Grafana: Custom dashboards for ML pipeline visualization
  • πŸ” ELK Stack: Centralized logging with Elasticsearch, Kibana, and Fluentd
  • πŸ•ΈοΈ Jaeger: Distributed tracing for microservices monitoring

Getting Started

Prerequisites

  • Python 3.11+
  • Docker & Docker Compose
  • Git & DVC
  • Kaggle Account (for data access)

Installation

  1. Clone the repository
git clone https://github.com/Abeshith/MLOps_PipeLine.git
cd MLOps_PipeLine
  2. Set up Python environment
# Using uv (recommended)
uv venv
uv pip install -r requirements.txt

# Or using pip
pip install -r requirements.txt
  3. Configure Kaggle credentials
# Create kaggle.json in ~/.kaggle/ directory
{
  "username": "your_kaggle_username",
  "key": "your_kaggle_key"
}

πŸ”„ Pipeline Execution

Individual Stage Execution

Stage 1: Data Ingestion

python -m src.mlpipeline.pipeline.stage_01_data_ingestion

or 

$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_01_data_ingestion.py                

Downloads Kaggle competition data, splits into train/test sets, validates data integrity

Stage 2: Data Validation

python -m src.mlpipeline.pipeline.stage_02_data_validation

or 

$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_02_data_validation.py                

Schema validation against config/schema.yaml, data drift detection using Evidently AI
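A rough sketch of the drift check, assuming Evidently's Report/DataDriftPreset API; the artifact paths are illustrative:

import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

reference = pd.read_csv("artifacts/data_ingestion/train.csv")  # illustrative paths
current = pd.read_csv("artifacts/data_ingestion/test.csv")

# Compare current data against the reference distribution
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("artifacts/data_validation/drift_report.html")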

Stage 3: Feature Engineering

python -m src.mlpipeline.pipeline.stage_03_feature_engineering

or

$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_03_feature_engineering.py      

Feature creation and transformation, correlation analysis, feature importance calculation
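A minimal sketch of the correlation step; the file path and "target" column name are placeholders:

import pandas as pd

df = pd.read_csv("artifacts/data_ingestion/train.csv")  # placeholder path

# Rank features by correlation with the label ("target" is a placeholder name)
correlations = df.corr(numeric_only=True)["target"].sort_values(ascending=False)
print(correlations.head(10))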

Stage 4: Data Transformation

python -m src.mlpipeline.pipeline.stage_04_data_transformation

or  

$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_04_data_transformation.py                

Data preprocessing and scaling, encoding categorical variables, train/test preparation
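Scaling and encoding might be wired up with scikit-learn's ColumnTransformer, roughly like this; the column names are placeholders, with the real schema coming from config/schema.yaml:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

train = pd.read_csv("artifacts/data_ingestion/train.csv")  # placeholder path
numeric_cols = ["feature_a", "feature_b"]   # placeholder column names
categorical_cols = ["category_x"]

preprocessor = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),                            # scale numeric features
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),  # encode categoricals
])
X = preprocessor.fit_transform(train[numeric_cols + categorical_cols])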

Stage 5: Model Training

python -m src.mlpipeline.pipeline.stage_05_model_trainer

or 

$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_05_model_trainer.py                

XGBoost model training with hyperparameter tuning, MLflow experiment tracking
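A condensed sketch of what this stage does, assuming a scikit-learn-style grid search and MLflow's xgboost flavor; paths, the "target" column, the experiment name, and the parameter grid are all illustrative:

import pandas as pd
import mlflow
import mlflow.xgboost
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

train = pd.read_csv("artifacts/data_transformation/train.csv")      # placeholder path
X_train, y_train = train.drop(columns=["target"]), train["target"]  # "target" is a placeholder

mlflow.set_experiment("mlops-pipeline")  # illustrative experiment name
with mlflow.start_run():
    grid = {"n_estimators": [100, 200], "max_depth": [3, 5]}  # illustrative grid
    search = GridSearchCV(XGBClassifier(eval_metric="logloss"), grid, cv=3)
    search.fit(X_train, y_train)

    # Log the winning configuration and model to MLflow
    mlflow.log_params(search.best_params_)
    mlflow.log_metric("cv_accuracy", search.best_score_)
    mlflow.xgboost.log_model(search.best_estimator_, "model")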

Stage 6: Model Evaluation

python -m src.mlpipeline.pipeline.stage_06_model_evaluation

or

$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_06_model_evaluation.py                            

Performance metrics calculation, model comparison, results logging to MLflow
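The evaluation stage plausibly boils down to something like this; the model filename, paths, and "target" column are assumptions:

import joblib
import mlflow
import pandas as pd
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

test = pd.read_csv("artifacts/data_transformation/test.csv")    # placeholder path
X_test, y_test = test.drop(columns=["target"]), test["target"]  # "target" is a placeholder
model = joblib.load("artifacts/model_trainer/model.joblib")     # placeholder filename

preds = model.predict(X_test)
proba = model.predict_proba(X_test)[:, 1]
with mlflow.start_run():
    mlflow.log_metrics({
        "accuracy": accuracy_score(y_test, preds),
        "precision": precision_score(y_test, preds),
        "recall": recall_score(y_test, preds),
        "f1": f1_score(y_test, preds),
        "auc": roc_auc_score(y_test, proba),
    })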

Complete Pipeline Execution

Run Complete Pipeline Script

python main.py

or

$env:PYTHONPATH = "src"; python main.py

Executes all 6 stages sequentially with comprehensive logging and error handling

Start Flask Application

python app.py

or 

$env:PYTHONPATH = "src"; python app.py

Launches web interface at http://localhost:5000 for model predictions and monitoring
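For orientation, a stripped-down version of what the prediction route in app.py might look like; the route, input format, and model path are assumptions:

import joblib
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("artifacts/model_trainer/model.joblib")  # placeholder model path

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON object mapping feature names to values
    features = pd.DataFrame([request.get_json()])
    return jsonify({"prediction": int(model.predict(features)[0])})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)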

DVC Pipeline Management

# Run complete pipeline
dvc repro

# Run specific stages
dvc repro data_ingestion
dvc repro data_validation
dvc repro feature_engineering
dvc repro data_transformation
dvc repro model_trainer
dvc repro model_evaluation

# View pipeline status
dvc dag

⚑ Apache Airflow Pipeline Setup

Prerequisites

Note: Apache Airflow runs best on Linux-based systems. Windows users should use WSL or a Linux environment.

Environment Setup

1. Copy Project to Linux Environment (Windows Users)

cp -r /mnt/d/MLOps_PipeLine ~/

2. Install Python and Setup Virtual Environment

# Install Python3 if not already installed
sudo apt update
sudo apt install python3 python3-pip python3-venv

# Navigate to project directory
cd ~/MLOps_PipeLine

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

3. Configure Apache Airflow

# Set Airflow home directory
export AIRFLOW_HOME=~/airflow
echo $AIRFLOW_HOME

# Configure Airflow settings
vim ~/airflow/airflow.cfg

# Inside vim: press i to enter insert mode, then replace the line
#   auth_manager = airflow.api.fastapi.auth.managers.simple.simple_auth_manager.SimpleAuthManager
# with
#   auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager
# Press Esc, then type :wq! to save and exit

# Create DAGs directory
mkdir -p ~/airflow/dags

# Copy DAG file to Airflow directory
cp model_dag.py ~/airflow/dags/

# Test DAG configuration
python ~/airflow/dags/model_dag.py
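
For reference, here is a minimal sketch of a six-stage DAG along the lines of model_dag.py. The BashOperator import path varies by Airflow version, and the bash commands assume the project lives in ~/MLOps_PipeLine:

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator  # import path differs in some Airflow versions

STAGES = [
    "stage_01_data_ingestion",
    "stage_02_data_validation",
    "stage_03_feature_engineering",
    "stage_04_data_transformation",
    "stage_05_model_trainer",
    "stage_06_model_evaluation",
]

with DAG(dag_id="ml_pipeline_dag", start_date=datetime(2024, 1, 1),
         schedule=None, catchup=False) as dag:
    tasks = [
        BashOperator(
            task_id=stage,
            bash_command=f"cd ~/MLOps_PipeLine && python -m src.mlpipeline.pipeline.{stage}",
        )
        for stage in STAGES
    ]
    # Chain the stages sequentially: ingestion -> ... -> evaluation
    for upstream, downstream in zip(tasks, tasks[1:]):
        upstream >> downstream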

4. Launch Airflow Webserver

# Start Airflow standalone mode
airflow standalone

5. Execute Pipeline

  • Open your web browser and navigate to: http://localhost:8080
  • Search for ml_pipeline_dag in the DAGs list
  • Click on the DAG and trigger the pipeline execution
  • Monitor the workflow progress through the Airflow UI

☸️ Kubernetes Deployment

1. Start Minikube Cluster

# Initialize Minikube cluster
minikube start

2. Deploy Application

# Deploy the application using deployment manifest
kubectl apply -f k8s/deployment.yaml

# Check pod status
kubectl get pods

3. Deploy Services

# Apply service configuration
kubectl apply -f k8s/service.yaml

# Verify service deployment
kubectl get svc

4. Access Application - Method 1 (Port Forwarding)

# Forward service port to local machine
kubectl port-forward svc/mlapp-service 8000:80

# Access application in browser
# Navigate to: http://localhost:8000

5. Access Application - Method 2 (Load Balancer)

# Edit service configuration
kubectl edit svc mlapp-service

# Change the service type from NodePort to LoadBalancer,
# then press Esc and type :wq! to save and exit

# Open new terminal and create tunnel
minikube tunnel

# In original terminal, check for external IP
kubectl get svc

# Access application using external IP in browser
# With minikube tunnel running, the external IP is typically 127.0.0.1

6. Configure Ingress (Optional)

# Enable the NGINX ingress controller addon
minikube addons enable ingress

# Deploy the ingress resource
kubectl apply -f k8s/ingress.yaml

# Verify the ingress controller pods are running
kubectl get pods -A | grep nginx

# Confirm the ingress resource has been assigned an address (e.g. 192.168.49.2)
kubectl get ingress

# Map the ingress host in the local hosts file
sudo vim /etc/hosts

# Add an entry pointing the ingress address at the hostname,
# then press Esc and type :wq! to save:
192.168.49.2    foo.bar.com

# Verify the hostname resolves
ping foo.bar.com

# Then open in a browser:
http://foo.bar.com/demo
http://foo.bar.com/admin

πŸ“Š Observability Stack

πŸ” Metrics Collection with Prometheus

Start Prometheus Monitoring

# Navigate to observability directory
cd observability

# Start Prometheus service
docker-compose up -d prometheus

# Access Prometheus UI
# URL: http://localhost:9090

Prometheus Operations

# Check application health
curl http://localhost:9090/api/v1/query?query=up

# View custom metrics
curl http://localhost:9090/api/v1/query?query=ml_model_accuracy

# Check targets status
curl http://localhost:9090/api/v1/targets

# View alert rules
curl http://localhost:9090/api/v1/rules

Configure ML Pipeline Metrics

# Prometheus collects metrics for:
# - Model accuracy and performance
# - Pipeline execution times
# - Resource utilization
# - API response times
# - Error rates and failures
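
One way to expose such metrics from Python is the prometheus_client library. A hedged sketch; apart from ml_model_accuracy (queried above), the metric names and port are assumptions:

from prometheus_client import Counter, Gauge, Histogram, start_http_server

MODEL_ACCURACY = Gauge("ml_model_accuracy", "Latest evaluated model accuracy")
STAGE_DURATION = Histogram("pipeline_stage_duration_seconds",
                           "Pipeline stage runtime", ["stage"])
PREDICTION_ERRORS = Counter("prediction_errors_total", "Failed prediction requests")

start_http_server(8001)     # expose /metrics for Prometheus to scrape (port is illustrative)
MODEL_ACCURACY.set(0.9215)  # e.g. written after the evaluation stage

with STAGE_DURATION.labels(stage="model_trainer").time():
    pass  # run the training stage here; its duration is recorded automatically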

πŸ“ˆ Visualization with Grafana

Start Grafana Dashboard

# Start Grafana service
docker-compose up -d grafana

# Access Grafana UI
# URL: http://localhost:3000
# Default credentials: admin/admin

Import ML Pipeline Dashboards

# Pre-configured dashboards available:
# 1. ML Pipeline Performance Dashboard
# 2. Model Metrics Dashboard  
# 3. Infrastructure Monitoring Dashboard

# Import custom dashboard JSON files from:
# observability/grafana/dashboards/

Grafana Operations

# Create data source (Prometheus)
# URL: http://prometheus:9090

# Import dashboard from file
# Upload: observability/grafana/dashboards/ml-pipeline-dashboard.json

# Set up alerting rules for model performance
# Configure notification channels (email, slack, webhook)

πŸ“‹ Log Management with EFK Stack

Start Elasticsearch & Kibana

# Start Elasticsearch service
docker-compose up -d elasticsearch

# Start Kibana service  
docker-compose up -d kibana

# Access Kibana UI
# URL: http://localhost:5601

Elasticsearch Operations

# Check cluster health
curl http://localhost:9200/_cluster/health

# List indices
curl http://localhost:9200/_cat/indices?v

# Search ML pipeline logs
curl -X GET "localhost:9200/logs-*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "level": "ERROR"
    }
  }
}'

# View pipeline execution logs
curl -X GET "localhost:9200/ml-pipeline-*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1h"
      }
    }
  }
}'
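
The same queries can be issued from Python with the official elasticsearch client (8.x-style API shown, using the same index pattern and query as the curl example):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Same ERROR-level query as the curl example above
resp = es.search(index="logs-*", query={"match": {"level": "ERROR"}}, size=10)
for hit in resp["hits"]["hits"]:
    print(hit["_source"])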

Kibana Dashboard Setup

# Create index patterns for:
# - ml-pipeline-* (Pipeline execution logs)
# - application-* (Application logs)
# - error-* (Error logs)
# - performance-* (Performance metrics)

# Build visualizations for:
# - Pipeline success/failure rates
# - Model training duration trends
# - Error analysis and patterns
# - Resource usage over time

Fluentd Log Collection

# Start Fluentd service
docker-compose up -d fluentd

# Fluentd automatically collects logs from:
# - ML pipeline stages
# - Flask application
# - Kubernetes pods (if deployed)
# - Docker containers
# - System logs

# Configure log parsing in:
# observability/fluentd/fluent.conf

πŸ•ΈοΈ Distributed Tracing with Jaeger

Start Jaeger Tracing

# Start Jaeger service
docker-compose up -d jaeger

# Access Jaeger UI
# URL: http://localhost:16686

Jaeger Operations

# View ML pipeline traces
# Search for service: ml-pipeline
# View operation: model_training

# Analyze request flows:
# - Data ingestion β†’ validation β†’ training β†’ evaluation
# - API request β†’ prediction β†’ response
# - Error traces and bottleneck identification

# Configure trace sampling and retention
# Monitor microservice dependencies
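
On the application side, instrumentation could use OpenTelemetry with an OTLP exporter pointed at Jaeger's collector. A sketch assuming Jaeger's OTLP gRPC endpoint on port 4317, with the service and span names matching those referenced above:

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Register a tracer provider that exports spans to Jaeger via OTLP
provider = TracerProvider(resource=Resource.create({"service.name": "ml-pipeline"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("model_training"):
    pass  # training code here; the span appears under the ml-pipeline service in Jaeger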

🚨 Alerting & Monitoring

Complete Monitoring Stack

# Start all monitoring services
docker-compose up -d

# Check services status
docker-compose ps

# View service logs
docker-compose logs prometheus
docker-compose logs grafana
docker-compose logs elasticsearch

Alert Configuration

# Prometheus alerts configured for:
# - Model accuracy degradation (< 90%)
# - High error rates (> 5%)
# - Pipeline failures
# - Resource exhaustion
# - API latency issues

# View active alerts
# URL: http://localhost:9090/alerts

# Configure alert notifications in Grafana
# Set up escalation policies and on-call rotations

Monitoring Cleanup

# Stop all services
docker-compose down

# Remove volumes (caution: deletes all data)
docker-compose down -v

# Remove specific service
docker-compose stop grafana
docker-compose rm grafana

πŸ”„ CI/CD Pipeline

Automated MLOps pipeline with GitHub Actions

Pipeline Stages:

  1. Setup: Python environment and dependencies
  2. Code Quality: Black formatting, isort, flake8 linting
  3. Data Validation: Schema validation and data quality checks
  4. Model Performance: Accuracy threshold validation (>90%)
  5. Security Scan: Secret detection and vulnerability scanning
  6. Build & Deploy: Docker build and push to registry
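
Stage 4's performance gate could be as simple as a script the workflow runs after evaluation. A hypothetical sketch; the metrics file path and key are assumptions:

import json
import sys

with open("artifacts/model_evaluation/metrics.json") as f:  # placeholder metrics artifact
    metrics = json.load(f)

accuracy = metrics.get("accuracy", 0.0)
if accuracy < 0.90:
    # A non-zero exit fails the CI job
    sys.exit(f"Accuracy {accuracy:.4f} is below the 0.90 gate")
print(f"Performance gate passed: accuracy = {accuracy:.4f}")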

Triggers:

  • Push to main branch
  • Pull request creation
  • Manual workflow dispatch

Required Secrets:

# Add to GitHub repository secrets:
DOCKER_USERNAME=your_docker_username
DOCKER_PASSWORD=your_docker_password

πŸ“ Project Structure

MLOps_PipeLine/
β”œβ”€β”€ πŸ“‚ src/mlpipeline/           # Core ML pipeline source code
β”‚   β”œβ”€β”€ πŸ“‚ components/           # ML pipeline components
β”‚   β”œβ”€β”€ πŸ“‚ config/              # Configuration management
β”‚   β”œβ”€β”€ πŸ“‚ entity/              # Data entities and schemas
β”‚   β”œβ”€β”€ πŸ“‚ pipeline/            # Stage-wise pipeline execution
β”‚   └── πŸ“‚ utils/               # Utility functions
β”œβ”€β”€ πŸ“‚ config/                   # Configuration files
β”‚   β”œβ”€β”€ config.yaml             # Main configuration
β”‚   β”œβ”€β”€ params.yaml             # Model parameters
β”‚   └── schema.yaml             # Data schema validation
β”œβ”€β”€ πŸ“‚ artifacts/               # Generated artifacts (DVC tracked)
β”‚   β”œβ”€β”€ πŸ“‚ data_ingestion/      # Raw and processed data
β”‚   β”œβ”€β”€ πŸ“‚ data_validation/     # Validation reports
β”‚   β”œβ”€β”€ πŸ“‚ feature_engineering/ # Feature artifacts
β”‚   β”œβ”€β”€ πŸ“‚ model_trainer/       # Trained models
β”‚   └── πŸ“‚ model_evaluation/    # Performance metrics
β”œβ”€β”€ πŸ“‚ k8s/                     # Kubernetes manifests
β”‚   β”œβ”€β”€ deployment.yaml         # Application deployment
β”‚   β”œβ”€β”€ service.yaml           # Service configuration
β”‚   └── ingress.yaml           # Ingress routing
β”œβ”€β”€ πŸ“‚ observability/           # Monitoring stack
β”‚   β”œβ”€β”€ docker-compose.yml     # Complete monitoring setup
β”‚   β”œβ”€β”€ πŸ“‚ prometheus/         # Metrics collection
β”‚   β”œβ”€β”€ πŸ“‚ grafana/           # Visualization dashboards
β”‚   β”œβ”€β”€ πŸ“‚ elasticsearch/     # Log storage
β”‚   └── πŸ“‚ kibana/            # Log analysis
β”œβ”€β”€ πŸ“‚ .github/workflows/       # CI/CD automation
β”‚   └── ci-cd.yml              # Complete MLOps pipeline
β”œβ”€β”€ dvc.yaml                   # DVC pipeline definition
β”œβ”€β”€ Dockerfile                 # Container definition
β”œβ”€β”€ model_dag.py              # Airflow DAG definition
└── app.py                    # Flask application

πŸ“ˆ Model Performance

Current model performance metrics:

  • Algorithm: XGBoost Classifier
  • Accuracy: 92.15%
  • Precision: 91.47%
  • Recall: 92.00%
  • F1-Score: 91.64%
  • AUC: 94.04%

Performance tracking via MLflow with experiment comparison and model registry integration.
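
Run comparison can be scripted against the tracking server. A small sketch, assuming the metrics above were logged under an experiment named "mlops-pipeline":

import mlflow

# Rank logged runs by accuracy (experiment and metric names are assumptions)
runs = mlflow.search_runs(experiment_names=["mlops-pipeline"],
                          order_by=["metrics.accuracy DESC"])
print(runs[["run_id", "metrics.accuracy", "metrics.f1"]].head())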


Infrastructure Components

  • Container Orchestration: Kubernetes with auto-scaling
  • Service Mesh: Istio for traffic management (optional)
  • Data Storage: DVC for version control, MLflow for experiments
  • Monitoring: Full observability stack with metrics, logs, traces
  • CI/CD: Automated testing, security scanning, deployment

🎯 Conclusion

This MLOps pipeline demonstrates development- and deployment-ready machine learning operations with:

βœ… End-to-End Automation: From data ingestion to model deployment
βœ… Scalable Infrastructure: Kubernetes orchestration with monitoring
βœ… Quality Assurance: Automated testing, validation, and security scanning
βœ… Observability: Comprehensive metrics, logging, and tracing
βœ… Continuous Integration: GitHub Actions for automated workflows
βœ… Model Governance: Version control, experiment tracking, performance monitoring

The pipeline achieves 92.15% model accuracy while maintaining production standards for reliability, scalability, and maintainability. It serves as a blueprint for implementing MLOps best practices in enterprise environments.

Key Achievements:

  • πŸ”„ Fully automated 6-stage ML pipeline
  • 🐳 Containerized deployment with security best practices
  • ☸️ Kubernetes orchestration with service mesh capabilities
  • πŸ“Š Real-time monitoring and alerting system
  • πŸš€ CI/CD automation with quality gates
  • πŸ“ˆ Model performance tracking and drift detection

⭐ Star this repository if you found it helpful!
