This repository demonstrates a comprehensive MLOps (Machine Learning Operations) pipeline that showcases industry-standard practices for end-to-end machine learning workflow automation. The project implements a production-ready ML system with automated training, validation, deployment, and monitoring capabilities.
This MLOps pipeline follows a modular, scalable architecture designed to handle real-world machine learning challenges:
- Data-Centric Approach: Implements robust data ingestion, validation, and feature engineering processes
- Model-Centric Operations: Automated model training, evaluation, and deployment with performance tracking
- Infrastructure-as-Code: Kubernetes manifests and Docker containerization for scalable deployments
- Observability-First: Comprehensive monitoring with metrics, logs, and distributed tracing
- CI/CD Integration: Automated testing, security scanning, and deployment pipelines
Core ML Pipeline Components:
- 6-Stage DVC Pipeline: Data ingestion → Validation → Feature engineering → Transformation → Training → Evaluation
- XGBoost Model: Achieved 92.15% accuracy with automated hyperparameter tuning
- MLflow Integration: Experiment tracking and model registry for version control
- Data Quality Monitoring: Evidently AI for drift detection and quality assessment
Production Infrastructure:
- Docker Containerization: Multi-stage builds with security best practices
- Kubernetes Deployment: Scalable orchestration with services, ingress, and monitoring
- Apache Airflow: Workflow orchestration and scheduling for automated pipeline execution
- GitHub Actions CI/CD: Automated testing, security scanning, and deployment
Observability Stack:
- Prometheus: Metrics collection and alerting for model performance
- Grafana: Custom dashboards for ML pipeline visualization
- ELK Stack: Centralized logging with Elasticsearch, Kibana, and Fluentd
- Jaeger: Distributed tracing for microservices monitoring
- Python 3.11+
- Docker & Docker Compose
- Git & DVC
- Kaggle Account (for data access)
- Clone the repository
git clone https://github.com/Abeshith/MLOps_PipeLine.git
cd MLOps_PipeLine
- Set up Python environment
# Using uv (recommended)
uv venv
uv pip install -r requirements.txt
# Or using pip
pip install -r requirements.txt
- Configure Kaggle credentials
# Create kaggle.json in ~/.kaggle/ directory
{
"username": "your_kaggle_username",
"key": "your_kaggle_key"
}
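The credentials file can also be written programmatically. A minimal sketch, assuming nothing beyond the standard library (the helper name `write_kaggle_credentials` is illustrative, not part of this repo):

```python
import json
from pathlib import Path

def write_kaggle_credentials(username: str, key: str,
                             kaggle_dir: Path = Path.home() / ".kaggle") -> Path:
    """Write kaggle.json where the Kaggle CLI expects to find it."""
    kaggle_dir.mkdir(parents=True, exist_ok=True)
    cred_path = kaggle_dir / "kaggle.json"
    cred_path.write_text(json.dumps({"username": username, "key": key}))
    # The Kaggle CLI warns unless the file is readable by the owner only.
    cred_path.chmod(0o600)
    return cred_path
```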
Stage 1: Data Ingestion
python -m src.mlpipeline.pipeline.stage_01_data_ingestion
or
$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_01_data_ingestion.py
Downloads Kaggle competition data, splits into train/test sets, validates data integrity
Stage 2: Data Validation
python -m src.mlpipeline.pipeline.stage_02_data_validation
or
$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_02_data_validation.py
Schema validation against config/schema.yaml, data drift detection using Evidently AI
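Conceptually, the schema check compares the incoming columns against those declared in config/schema.yaml. A stdlib-only sketch of that idea (the repo's actual component also runs Evidently AI drift checks; the column names below are hypothetical):

```python
def validate_columns(actual_columns, schema_columns):
    """Return (is_valid, missing, unexpected) for a simple column-level schema check."""
    actual, expected = set(actual_columns), set(schema_columns)
    missing = sorted(expected - actual)      # columns the schema requires but data lacks
    unexpected = sorted(actual - expected)   # columns present in data but not in schema
    return (not missing and not unexpected), missing, unexpected
```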
Stage 3: Feature Engineering
python -m src.mlpipeline.pipeline.stage_03_feature_engineering
or
$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_03_feature_engineering.py
Feature creation and transformation, correlation analysis, feature importance calculation
Stage 4: Data Transformation
python -m src.mlpipeline.pipeline.stage_04_data_transformation
or
$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_04_data_transformation.py
Data preprocessing and scaling, encoding categorical variables, train/test preparation
Stage 5: Model Training
python -m src.mlpipeline.pipeline.stage_05_model_trainer
or
$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_05_model_trainer.py
XGBoost model training with hyperparameter tuning, MLflow experiment tracking
Stage 6: Model Evaluation
python -m src.mlpipeline.pipeline.stage_06_model_evaluation
or
$env:PYTHONPATH = "src"; python src/mlpipeline/pipeline/stage_06_model_evaluation.py
Performance metrics calculation, model comparison, results logging to MLflow
Run Complete Pipeline Script
python main.py
or
$env:PYTHONPATH = "src"; python main.py
Executes all 6 stages sequentially with comprehensive logging and error handling
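The orchestration in main.py follows the usual stage-runner pattern: run each stage in order, log progress, and abort on the first failure. A minimal self-contained sketch of that pattern (not the repo's actual code):

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s [%(levelname)s] %(message)s")
logger = logging.getLogger("mlpipeline")

def run_pipeline(stages):
    """Run (name, callable) pairs in order, aborting on the first failure."""
    completed = []
    for name, stage in stages:
        logger.info(">>> stage started: %s", name)
        try:
            stage()
        except Exception:
            logger.exception(">>> stage failed: %s", name)
            raise  # stop the pipeline; downstream stages depend on this one
        logger.info(">>> stage completed: %s", name)
        completed.append(name)
    return completed
```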
Start Flask Application
python app.py
or
$env:PYTHONPATH = "src"; python app.py
Launches web interface at http://localhost:5000 for model predictions and monitoring
# Run complete pipeline
dvc repro
# Run specific stages
dvc repro data_ingestion
dvc repro data_validation
dvc repro feature_engineering
dvc repro data_transformation
dvc repro model_trainer
dvc repro model_evaluation
# View pipeline status
dvc dag
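Each `dvc repro` target maps to a stage declared in dvc.yaml, which records the stage's command, dependencies, and outputs so unchanged stages are skipped. A hypothetical excerpt showing the shape of one stage (paths here are assumptions; see the repo's actual dvc.yaml for the real dependency graph):

```yaml
stages:
  data_ingestion:
    cmd: python -m src.mlpipeline.pipeline.stage_01_data_ingestion
    deps:
      - src/mlpipeline/pipeline/stage_01_data_ingestion.py
      - config/config.yaml
    outs:
      - artifacts/data_ingestion
```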
Prerequisites
Note: Apache Airflow is not supported natively on Windows. Windows users should run it under WSL or in a Linux environment.
1. Copy Project to Linux Environment (Windows Users)
cp -r /mnt/d/MLOps_PipeLine ~/
2. Install Python and Setup Virtual Environment
# Install Python3 if not already installed
sudo apt update
sudo apt install python3 python3-pip python3-venv
# Navigate to project directory
cd ~/MLOps_PipeLine
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
3. Configure Apache Airflow
# Set Airflow home directory
export AIRFLOW_HOME=~/airflow
echo $AIRFLOW_HOME
# Configure Airflow settings
vim ~/airflow/airflow.cfg
Press `i` to enter insert mode, then replace the line
auth_manager = airflow.api.fastapi.auth.managers.simple.simple_auth_manager.SimpleAuthManager
with
auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager
When done, press Esc and type `:wq!` to save and exit.
# Create DAGs directory
mkdir -p ~/airflow/dags
# Copy DAG file to Airflow directory
cp model_dag.py ~/airflow/dags/
# Test DAG configuration
python ~/airflow/dags/model_dag.py
4. Launch Airflow Webserver
# Start Airflow standalone mode
airflow standalone
5. Execute Pipeline
- Open your web browser and navigate to: http://localhost:8080
- Search for ml_pipeline_dag in the DAGs list
- Click on the DAG and trigger the pipeline execution
- Monitor the workflow progress through the Airflow UI
# Initialize Minikube cluster
minikube start
# Deploy the application using deployment manifest
kubectl apply -f k8s/deployment.yaml
# Check pod status
kubectl get pods
# Apply service configuration
kubectl apply -f k8s/service.yaml
# Verify service deployment
kubectl get svc
# Forward service port to local machine
kubectl port-forward svc/mlapp-service 8000:80
# Access application in browser
# Navigate to: http://localhost:8000
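For reference, a hypothetical excerpt of k8s/service.yaml consistent with the commands above: the service is named `mlapp-service` and exposes port 80, which `kubectl port-forward svc/mlapp-service 8000:80` maps to port 8000 locally (the selector label and the Flask container port are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mlapp-service
spec:
  type: NodePort
  selector:
    app: mlapp          # assumed pod label from deployment.yaml
  ports:
    - port: 80          # service port used by port-forward
      targetPort: 5000  # assumed Flask container port
```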
# Edit service configuration
kubectl edit svc mlapp-service
# Change service type from NodePort to LoadBalancer
# Then press Esc and type :wq! to save and exit the editor
# Open new terminal and create tunnel
minikube tunnel
# In original terminal, check for external IP
kubectl get svc
# Access the application using the EXTERNAL-IP shown
# (with minikube tunnel this is typically 127.0.0.1)
# Deploy the ingress.yaml file
kubectl apply -f k8s/ingress.yaml
# Install the Ingress Controller (nginx)
minikube addons enable ingress
# Verify the nginx ingress controller pods are running
kubectl get pods -A | grep nginx
# Check Ingress is Deployed
kubectl get ingress # A Address is Being Updated like -> 192.168.49.2
# Set up local hostname resolution
sudo vim /etc/hosts
# Add
127.0.0.1 localhost
127.0.1.1 Abis-PC
192.168.49.2 foo.bar.com
# Press Esc and type :wq! to save and exit
# Check Updated or not
ping foo.bar.com
# then go to browser
http://foo.bar.com/demo
http://foo.bar.com/admin
Start Prometheus Monitoring
# Navigate to observability directory
cd observability
# Start Prometheus service
docker-compose up -d prometheus
# Access Prometheus UI
# URL: http://localhost:9090
Prometheus Operations
# Check application health
curl http://localhost:9090/api/v1/query?query=up
# View custom metrics
curl http://localhost:9090/api/v1/query?query=ml_model_accuracy
# Check targets status
curl http://localhost:9090/api/v1/targets
# View alert rules
curl http://localhost:9090/api/v1/rules
Configure ML Pipeline Metrics
# Prometheus collects metrics for:
# - Model accuracy and performance
# - Pipeline execution times
# - Resource utilization
# - API response times
# - Error rates and failures
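The curl queries above return JSON in Prometheus' instant-query format. A small stdlib-only helper for turning such a response into (labels, value) pairs; the `instant_query` wrapper and the metric names are illustrative:

```python
import json
from urllib.request import urlopen

def parse_instant_query(payload):
    """Extract (labels, value) pairs from a Prometheus instant-query response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    # Each result carries a label set and a [timestamp, value-string] pair.
    return [(r["metric"], float(r["value"][1]))
            for r in payload["data"]["result"]]

def instant_query(expr, base_url="http://localhost:9090"):
    """Fetch and parse one PromQL expression (requires a running Prometheus)."""
    with urlopen(f"{base_url}/api/v1/query?query={expr}") as resp:
        return parse_instant_query(json.load(resp))
```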
Start Grafana Dashboard
# Start Grafana service
docker-compose up -d grafana
# Access Grafana UI
# URL: http://localhost:3000
# Default credentials: admin/admin
Import ML Pipeline Dashboards
# Pre-configured dashboards available:
# 1. ML Pipeline Performance Dashboard
# 2. Model Metrics Dashboard
# 3. Infrastructure Monitoring Dashboard
# Import custom dashboard JSON files from:
# observability/grafana/dashboards/
Grafana Operations
# Create data source (Prometheus)
# URL: http://prometheus:9090
# Import dashboard from file
# Upload: observability/grafana/dashboards/ml-pipeline-dashboard.json
# Set up alerting rules for model performance
# Configure notification channels (email, slack, webhook)
Start Elasticsearch & Kibana
# Start Elasticsearch service
docker-compose up -d elasticsearch
# Start Kibana service
docker-compose up -d kibana
# Access Kibana UI
# URL: http://localhost:5601
Elasticsearch Operations
# Check cluster health
curl http://localhost:9200/_cluster/health
# List indices
curl http://localhost:9200/_cat/indices?v
# Search ML pipeline logs
curl -X GET "localhost:9200/logs-*/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"level": "ERROR"
}
}
}'
# View pipeline execution logs
curl -X GET "localhost:9200/ml-pipeline-*/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"range": {
"@timestamp": {
"gte": "now-1h"
}
}
}
}'
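The request bodies above can also be built programmatically before being sent to Elasticsearch. A sketch reproducing the two query shapes shown (field and index names follow the curl examples):

```python
import json

def match_query(field, value):
    """Body for a simple match query, e.g. ERROR-level log lines."""
    return {"query": {"match": {field: value}}}

def recent_logs_query(window="now-1h"):
    """Body for a time-range query over the @timestamp field."""
    return {"query": {"range": {"@timestamp": {"gte": window}}}}
```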
Kibana Dashboard Setup
# Create index patterns for:
# - ml-pipeline-* (Pipeline execution logs)
# - application-* (Application logs)
# - error-* (Error logs)
# - performance-* (Performance metrics)
# Build visualizations for:
# - Pipeline success/failure rates
# - Model training duration trends
# - Error analysis and patterns
# - Resource usage over time
Fluentd Log Collection
# Start Fluentd service
docker-compose up -d fluentd
# Fluentd automatically collects logs from:
# - ML pipeline stages
# - Flask application
# - Kubernetes pods (if deployed)
# - Docker containers
# - System logs
# Configure log parsing in:
# observability/fluentd/fluent.conf
Start Jaeger Tracing
# Start Jaeger service
docker-compose up -d jaeger
# Access Jaeger UI
# URL: http://localhost:16686
Jaeger Operations
# View ML pipeline traces
# Search for service: ml-pipeline
# View operation: model_training
# Analyze request flows:
# - Data ingestion → validation → training → evaluation
# - API request → prediction → response
# - Error traces and bottleneck identification
# Configure trace sampling and retention
# Monitor microservice dependencies
Complete Monitoring Stack
# Start all monitoring services
docker-compose up -d
# Check services status
docker-compose ps
# View service logs
docker-compose logs prometheus
docker-compose logs grafana
docker-compose logs elasticsearch
Alert Configuration
# Prometheus alerts configured for:
# - Model accuracy degradation (< 90%)
# - High error rates (> 5%)
# - Pipeline failures
# - Resource exhaustion
# - API latency issues
# View active alerts
# URL: http://localhost:9090/alerts
# Configure alert notifications in Grafana
# Set up escalation policies and on-call rotations
Monitoring Cleanup
# Stop all services
docker-compose down
# Remove volumes (caution: deletes all data)
docker-compose down -v
# Remove specific service
docker-compose stop grafana
docker-compose rm grafana
Automated MLOps pipeline with GitHub Actions
- Setup: Python environment and dependencies
- Code Quality: Black formatting, isort, flake8 linting
- Data Validation: Schema validation and data quality checks
- Model Performance: Accuracy threshold validation (>90%)
- Security Scan: Secret detection and vulnerability scanning
- Build & Deploy: Docker build and push to registry
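The model-performance step can be implemented as a small gate script that fails the CI job when logged accuracy drops below the threshold. A sketch under the assumption that metrics are written to a JSON file with an `accuracy` key (the path and key are assumptions, not the repo's confirmed layout):

```python
import json
import sys

def enforce_accuracy_gate(metrics_path, threshold=0.90):
    """Exit non-zero if the recorded accuracy is below the CI threshold."""
    with open(metrics_path) as f:
        accuracy = json.load(f)["accuracy"]
    if accuracy < threshold:
        # A non-zero exit status fails the GitHub Actions step.
        sys.exit(f"accuracy {accuracy:.4f} is below the {threshold:.2f} gate")
    return accuracy
```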
Pipeline triggers:
- Push to the `main` branch
- Pull request creation
- Manual workflow dispatch
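In .github/workflows/ci-cd.yml these triggers correspond to an `on:` block along the following lines (hypothetical excerpt; the actual file may differ):

```yaml
on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:   # manual trigger from the Actions tab
```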
# Add to GitHub repository secrets:
DOCKER_USERNAME=your_docker_username
DOCKER_PASSWORD=your_docker_password
MLOps_PipeLine/
├── src/mlpipeline/            # Core ML pipeline source code
│   ├── components/            # ML pipeline components
│   ├── config/                # Configuration management
│   ├── entity/                # Data entities and schemas
│   ├── pipeline/              # Stage-wise pipeline execution
│   └── utils/                 # Utility functions
├── config/                    # Configuration files
│   ├── config.yaml            # Main configuration
│   ├── params.yaml            # Model parameters
│   └── schema.yaml            # Data schema validation
├── artifacts/                 # Generated artifacts (DVC tracked)
│   ├── data_ingestion/        # Raw and processed data
│   ├── data_validation/       # Validation reports
│   ├── feature_engineering/   # Feature artifacts
│   ├── model_trainer/         # Trained models
│   └── model_evaluation/      # Performance metrics
├── k8s/                       # Kubernetes manifests
│   ├── deployment.yaml        # Application deployment
│   ├── service.yaml           # Service configuration
│   └── ingress.yaml           # Ingress routing
├── observability/             # Monitoring stack
│   ├── docker-compose.yml     # Complete monitoring setup
│   ├── prometheus/            # Metrics collection
│   ├── grafana/               # Visualization dashboards
│   ├── elasticsearch/         # Log storage
│   └── kibana/                # Log analysis
├── .github/workflows/         # CI/CD automation
│   └── ci-cd.yml              # Complete MLOps pipeline
├── dvc.yaml                   # DVC pipeline definition
├── Dockerfile                 # Container definition
├── model_dag.py               # Airflow DAG definition
└── app.py                     # Flask application
Current model performance metrics:
- Algorithm: XGBoost Classifier
- Accuracy: 92.15%
- Precision: 91.47%
- Recall: 92.00%
- F1-Score: 91.64%
- AUC: 94.04%
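For reference, accuracy, precision, recall, and F1 all derive from the binary confusion matrix in the standard way. A stdlib sketch with illustrative counts (not the model's actual confusion matrix):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)             # of predicted positives, how many were right
    recall = tp / (tp + fn)                # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```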
Performance tracking via MLflow with experiment comparison and model registry integration.
- Container Orchestration: Kubernetes with auto-scaling
- Service Mesh: Istio for traffic management (optional)
- Data Storage: DVC for version control, MLflow for experiments
- Monitoring: Full observability stack with metrics, logs, traces
- CI/CD: Automated testing, security scanning, deployment
This MLOps pipeline demonstrates development- and deployment-ready machine learning operations with:
- ✅ End-to-End Automation: From data ingestion to model deployment
- ✅ Scalable Infrastructure: Kubernetes orchestration with monitoring
- ✅ Quality Assurance: Automated testing, validation, and security scanning
- ✅ Observability: Comprehensive metrics, logging, and tracing
- ✅ Continuous Integration: GitHub Actions for automated workflows
- ✅ Model Governance: Version control, experiment tracking, performance monitoring
The pipeline achieves 92.15% model accuracy while maintaining production standards for reliability, scalability, and maintainability. It serves as a blueprint for implementing MLOps best practices in enterprise environments.
- Fully automated 6-stage ML pipeline
- Containerized deployment with security best practices
- Kubernetes orchestration with service mesh capabilities
- Real-time monitoring and alerting system
- CI/CD automation with quality gates
- Model performance tracking and drift detection
⭐ Star this repository if you found it helpful!