🌍 AQI Prediction using Machine Learning

Predict the Air Quality Index (AQI) from real‑world pollutant data (CO, NO₂, SO₂, O₃, PM2.5, PM10) across multiple global cities using Python & ML.

✨ Features

📊 Exploratory Data Analysis – Visualize pollutant distributions and AQI trends.
🧹 Data Cleaning – Check for nulls and duplicates; drop unused columns.
🧮 Feature Engineering – One‑hot encode cities; scale numeric features.
🤖 Machine Learning Models – Linear Regression baseline vs Random Forest ensemble.
📈 Model Evaluation – R², RMSE, MAE comparison.
🔮 Custom Prediction – Plug in new pollutant readings and estimate AQI.

🎯 Why This Matters

Poor air quality affects respiratory health, productivity, and urban planning. An ML model that estimates AQI from pollutant levels helps:

Citizens track exposure risk.
City agencies forecast alerts.
Students learn regression modeling on environmental data.

📂 Dataset

Rows: 52,560 hourly records
Columns: Date, City, CO, NO2, SO2, O3, PM2.5, PM10, AQI (Date is dropped before modeling)
Cities Covered: Brasilia, Cairo, Dubai, London, New York, Sydney
Use: Educational / learning project dataset (bundled locally in repo).

If you later host the dataset separately (e.g., Kaggle), update the link here.
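
Before modeling, it can help to confirm that the file matches the description above. The snippet below is a minimal sketch of such a check, not code taken from the notebook: the `summarize` helper is a hypothetical name, and the tiny in-memory frame is made-up data standing in for the real CSV.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return basic sanity-check numbers for an air quality table."""
    return {
        "shape": df.shape,                          # (rows, columns)
        "nulls": int(df.isna().sum().sum()),        # total missing cells
        "duplicates": int(df.duplicated().sum()),   # exact repeated rows
        "cities": sorted(df["City"].unique()),      # distinct city names
    }


# Made-up rows for illustration only (NOT values from the dataset).
sample = pd.DataFrame({
    "City": ["Cairo", "Cairo", "London", "Sydney"],
    "CO": [0.7, 0.7, 0.3, None],
    "PM2.5": [58.0, 58.0, 12.0, 9.0],
    "AQI": [105.0, 105.0, 40.0, 30.0],
})

report = summarize(sample)
print(report)
```

On the real data you would call `summarize(pd.read_csv("Air_Quality_dataset.csv"))` and expect 52,560 rows and the six cities listed above.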

🛠 Tech Stack

Python (pandas, numpy)
Visualization: matplotlib, seaborn
Modeling: scikit-learn (LinearRegression, RandomForestRegressor, MinMaxScaler, metrics)
Environment: Jupyter Notebook

🔍 Workflow

1. Load CSV → pandas.read_csv()
2. Inspect shape, dtypes, nulls
3. Drop Date (not modeled)
4. Encode City → one-hot columns
5. Split train/test
6. Scale features → MinMaxScaler
7. Train models:
   - Linear Regression (baseline)
   - Random Forest Regressor (ensemble)
8. Evaluate → R², RMSE, MAE
9. Predict on new samples
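
The workflow above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the notebook's exact code: `make_synthetic` and `run_pipeline` are hypothetical helpers, the data is randomly generated so the block runs without the CSV, and choices such as the 80/20 split, dropping rows with nulls, and 100 trees are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

POLLUTANTS = ["CO", "NO2", "SO2", "O3", "PM2.5", "PM10"]


def make_synthetic(n=600, seed=0):
    """Random stand-in for Air_Quality_dataset.csv (NOT real data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.uniform(0, 100, size=(n, len(POLLUTANTS))),
                      columns=POLLUTANTS)
    df["City"] = rng.choice(["Cairo", "London", "Sydney"], size=n)
    df["Date"] = pd.Timestamp("2024-01-01") + pd.to_timedelta(np.arange(n), unit="h")
    # Arbitrary target so the models have something to learn.
    df["AQI"] = 0.8 * df["PM2.5"] + 0.4 * df["PM10"] + rng.normal(0, 5, size=n)
    return df


def run_pipeline(df, seed=42):
    # Steps 2-3: basic cleaning and dropping the unused Date column.
    df = df.drop(columns=["Date"], errors="ignore").drop_duplicates().dropna()
    # Step 4: one-hot encode City.
    X = pd.get_dummies(df.drop(columns=["AQI"]), columns=["City"], dtype=float)
    y = df["AQI"]
    # Step 5: hold out a test set (80/20 split is an assumption).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    # Step 6: fit the scaler on training rows only, then apply to both sets.
    scaler = MinMaxScaler().fit(X_train)
    X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)
    # Steps 7-8: train both models and score them on the held-out rows.
    models = {
        "Linear Regression": LinearRegression(),
        "Random Forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train_s, y_train)
        pred = model.predict(X_test_s)
        scores[name] = {
            "R2": float(r2_score(y_test, pred)),
            "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
            "MAE": float(mean_absolute_error(y_test, pred)),
        }
    return scores


scores = run_pipeline(make_synthetic())
for name, s in scores.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
```

Fitting the scaler on the training split only keeps test-set statistics out of training. The scores printed here come from synthetic data and say nothing about the real dataset.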

📊 Results

| Model | R² | RMSE | MAE | Notes |
|---|---|---|---|---|
| Linear Regression | 0.83 | 10.21 | 7.38 | Baseline |
| Random Forest | 0.86 | 9.37 | 6.33 | ✅ Best Model |

(Metrics are from a single notebook run and will vary with the train/test split and random seed. RMSE and MAE are in AQI units.)
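
For readers new to these metrics, the sketch below computes R², RMSE, and MAE from their standard definitions in plain Python. The `regression_metrics` helper and the three sample values are made up for illustration; the notebook itself uses scikit-learn's metrics.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R², RMSE, and MAE from their textbook definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "R2": 1 - ss_res / ss_tot,                       # share of variance explained
        "RMSE": math.sqrt(ss_res / n),                   # penalizes large errors more
        "MAE": sum(abs(e) for e in errors) / n,          # average absolute error
    }


# Made-up AQI values: errors are -10, +10, and 0.
m = regression_metrics([50.0, 100.0, 150.0], [60.0, 90.0, 150.0])
print({k: round(v, 3) for k, v in m.items()})
# R2 = 1 - 200/5000 = 0.96, RMSE = sqrt(200/3) ≈ 8.165, MAE = 20/3 ≈ 6.667
```

RMSE is always at least as large as MAE, and the gap widens when a few predictions are far off, which is why the table reports both.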

🚀 Quickstart

1. Clone the repository

git clone https://github.com/prachi757/aqi-prediction.git
cd aqi-prediction

If you forked this repo instead: replace the URL with your fork (shown on GitHub after you click Fork).

2. (Optional) Create & activate a virtual environment

macOS / Linux

python -m venv .venv
source .venv/bin/activate

Windows PowerShell

python -m venv .venv
.\.venv\Scripts\activate

3. Install dependencies

python -m pip install --upgrade pip
pip install -r requirements.txt

4. Launch the notebook

jupyter notebook Major_Project.ipynb

Run cells top→bottom.

🧪 Try Your Own Prediction

After running the notebook and training the Random Forest model:

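# Example pollutant reading (raw values; scaled below with the fitted scaler)
# Column order must match the training features:
# CO, NO2, SO2, O3, PM2_5, PM10, then one-hot city flags
# (Brasilia, Cairo, Dubai, London, New_York, Sydney). This sample is for Cairo.
new_sample = [[0.7, 45.0, 12.0, 32.0, 58.0, 105.0, 0, 1, 0, 0, 0, 0]]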

# IMPORTANT: Use the *same* scaler fitted on training data
new_sample_scaled = scaler.transform(new_sample)

pred = AQI_Regressor.predict(new_sample_scaled)
print(f"Predicted AQI: {pred[0]:.2f}")
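
Typing the twelve values by hand makes it easy to put a reading or a city flag in the wrong position. The sketch below builds the row by name instead. It is a hypothetical helper, not part of the notebook, and it assumes the feature order given in the comment above; check the order against your own training columns before relying on it.

```python
# Assumed feature order, copied from the comment in the example above.
POLLUTANTS = ["CO", "NO2", "SO2", "O3", "PM2_5", "PM10"]
CITIES = ["Brasilia", "Cairo", "Dubai", "London", "New_York", "Sydney"]


def build_sample(readings: dict, city: str) -> list:
    """Return a one-row 2D list: pollutant values, then one-hot city flags."""
    missing = [p for p in POLLUTANTS if p not in readings]
    if missing:
        raise ValueError(f"Missing pollutant readings: {missing}")
    if city not in CITIES:
        raise ValueError(f"Unknown city {city!r}; expected one of {CITIES}")
    row = [float(readings[p]) for p in POLLUTANTS]
    row += [1.0 if c == city else 0.0 for c in CITIES]
    return [row]


new_sample = build_sample(
    {"CO": 0.7, "NO2": 45.0, "SO2": 12.0, "O3": 32.0, "PM2_5": 58.0, "PM10": 105},
    city="Cairo",
)
print(new_sample)
```

The result can be passed to `scaler.transform(...)` and then to the model exactly as in the example above.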

📁 Project Structure

aqi-prediction/
│
├── Major_Project.ipynb        # Notebook: EDA + Modeling
├── Air_Quality_dataset.csv    # Dataset (hourly pollutant readings)
├── requirements.txt           # Python dependencies (install with pip)
└── README.md                  # You are here!

🗺 Roadmap (Future Ideas)

Add Gradient Boosting / XGBoost
Include time‑series features from Date (hour, month, season)
Hyperparameter tuning (GridSearchCV)
Streamlit mini‑app for live AQI prediction
Feature importance + SHAP explainability
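
As a starting point for the time-series idea, the sketch below derives hour, month, and day-of-week columns from Date with pandas. It is a rough illustration only: `add_time_features` is a hypothetical helper, the two rows are made up, and it assumes the Date column holds strings that `pd.to_datetime` can parse.

```python
import pandas as pd


def add_time_features(df: pd.DataFrame, date_col: str = "Date") -> pd.DataFrame:
    """Return a copy of df with hour, month, and dayofweek columns added."""
    out = df.copy()
    ts = pd.to_datetime(out[date_col])
    out["hour"] = ts.dt.hour            # 0-23
    out["month"] = ts.dt.month          # 1-12
    out["dayofweek"] = ts.dt.dayofweek  # Monday = 0
    return out


# Made-up rows for illustration only.
demo = pd.DataFrame({
    "Date": ["2024-01-01 08:00:00", "2024-07-15 17:00:00"],
    "AQI": [40.0, 95.0],
})
features = add_time_features(demo)
print(features[["hour", "month", "dayofweek"]])
```

A season feature would need care, since Brasilia and Sydney are in the southern hemisphere and the same month maps to a different season there.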

🙋 Contact

Prachi Garg
GitHub: prachi757
LinkedIn: Prachi Garg
Email: [email protected]

📜 License

Educational & portfolio use. Feel free to fork, learn, and extend—please credit the original author.

⭐ Like this project?

If it helped you, star the repo and share! 🙌
