## Table of Contents

- Overview
- Project Structure
- Features
- Installation
- Usage
- Model Training & Evaluation
- Deployment
- Customization
- Contributing
- License
- Acknowledgements
- Contact
## Overview

Credit Risk Classification using AWS SageMaker is an end-to-end project that demonstrates how to build, train, and deploy a machine learning model to classify credit risk (e.g., predicting whether a loan applicant is likely to default) using AWS SageMaker’s managed ML services. The project covers data preprocessing, model development, evaluation, and deployment in a scalable and reproducible way.
## Project Structure

```
.
├── data/               # Raw and processed datasets
├── notebooks/          # Jupyter notebooks for EDA, training, and inference
├── src/                # Source code for data loading, model, utils, etc.
│   ├── preprocessing.py
│   ├── train.py
│   └── inference.py
├── requirements.txt    # Python dependencies
├── README.md
├── .gitignore
└── config/             # Configuration files and hyperparameters
```
## Features

- Data Preprocessing: Cleaning, feature engineering, and transformation scripts.
- Model Training: Training pipelines using the AWS SageMaker SDK.
- Model Evaluation: Metrics and visualization for evaluating model performance.
- Deployment: Scripts and steps for deploying models as SageMaker endpoints.
- Automation: Sample workflow for automating training and deployment.
- Scalability: Easily adaptable to larger datasets and more complex models.
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/naman-sriv/Credit_Risk_Classification_AWS_Sagemaker.git
   cd Credit_Risk_Classification_AWS_Sagemaker
   ```

2. Create and activate a virtual environment (recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. (Optional) Set up AWS credentials:
   - Configure your AWS CLI with `aws configure`, or set environment variables as described in the AWS docs. You can verify access with the snippet below.
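Before launching any SageMaker jobs, it can help to confirm that your credentials are being picked up. This snippet is illustrative, not part of the repository; it only assumes `boto3` is installed:

```python
import boto3

# Ask AWS STS which identity the current credentials resolve to;
# this raises an error if no valid credentials are configured.
identity = boto3.client("sts").get_caller_identity()
print(f"Authenticated as: {identity['Arn']}")
```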
## Usage

- Check the `notebooks/` directory for EDA, training, and deployment walkthroughs.
- Example: `notebooks/eda.ipynb`, `notebooks/train_model.ipynb`, `notebooks/deploy_model.ipynb`
- Place your raw dataset in the `data/` folder, or update the paths in the config files.
- Run the data preprocessing script (sketched below):

  ```bash
  python src/preprocessing.py --config config/preprocessing.yaml
  ```
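The repository's `src/preprocessing.py` is the source of truth for this step. As a rough sketch of what a config-driven preprocessing script like it might do (the config keys and cleaning steps here are assumptions, not the repo's actual interface):

```python
import argparse

import pandas as pd
import yaml


def main() -> None:
    parser = argparse.ArgumentParser(description="Preprocess raw credit data.")
    parser.add_argument("--config", required=True, help="Path to a YAML config file.")
    args = parser.parse_args()

    with open(args.config) as f:
        cfg = yaml.safe_load(f)  # hypothetical keys: input_path, output_path

    df = pd.read_csv(cfg["input_path"])

    # Minimal cleaning: drop duplicates and impute missing numeric values.
    df = df.drop_duplicates()
    numeric_cols = df.select_dtypes("number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

    df.to_csv(cfg["output_path"], index=False)


if __name__ == "__main__":
    main()
```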
- Local training:

  ```bash
  python src/train.py --config config/train_config.yaml
  ```

- Or use SageMaker: follow the instructions in `notebooks/train_model.ipynb` to launch a SageMaker training job (a minimal SDK sketch follows).
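For orientation, launching a script-mode training job with the SageMaker Python SDK generally looks like the sketch below. The estimator class (this assumes a scikit-learn model), the role ARN, instance type, and S3 paths are all placeholders; the notebook remains the authoritative walkthrough:

```python
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder ARN

# Run src/train.py as a managed SageMaker training job.
estimator = SKLearn(
    entry_point="train.py",
    source_dir="src",
    role=role,
    instance_type="ml.m5.large",
    instance_count=1,
    framework_version="1.2-1",
    sagemaker_session=session,
)

# The S3 training-data location is a placeholder.
estimator.fit({"train": "s3://your-bucket/credit-risk/train/"})
```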
- Use evaluation scripts or notebooks to review metrics and visualizations.
- Deploy using SageMaker endpoint scripts or via notebook. Example (an SDK-level sketch follows):

  ```bash
  python src/deploy.py --model-path <model_artifact>
  ```
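As an illustration of what a deploy step like this typically does with the SageMaker SDK (the model class, which again assumes scikit-learn, plus the artifact path, role, and endpoint name are all placeholders):

```python
from sagemaker.sklearn.model import SKLearnModel

# Wrap a trained model artifact from S3; inference.py handles request parsing.
model = SKLearnModel(
    model_data="s3://your-bucket/credit-risk/model.tar.gz",  # placeholder path
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    entry_point="inference.py",
    source_dir="src",
    framework_version="1.2-1",
)

# Create a real-time HTTPS endpoint backed by a single instance.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="credit-risk-endpoint",  # placeholder name
)
```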
## Model Training & Evaluation

- Algorithms Used: (e.g., Logistic Regression, XGBoost, Random Forest)
- Evaluation Metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC, etc. (see the sketch below)
- Validation: k-fold cross-validation, hold-out set, etc.
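The listed metrics map directly onto scikit-learn's `sklearn.metrics` module. A minimal sketch, assuming a fitted binary classifier with `predict`/`predict_proba` and labels where 1 means default:

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)


def evaluate(model, X_test, y_true):
    """Compute the metrics listed above for a binary credit-risk classifier."""
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # P(default), used for ROC-AUC
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_prob),
    }
```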
## Deployment

- SageMaker Endpoint: Deploy the trained model as a REST API endpoint.
- Sample Request:

  ```python
  import boto3

  # Invoke the deployed endpoint with a CSV payload.
  runtime = boto3.client('sagemaker-runtime')
  response = runtime.invoke_endpoint(
      EndpointName='your-endpoint-name',
      ContentType='text/csv',
      Body='<CSV_DATA>'
  )
  print(response['Body'].read())
  ```
## Customization

- Change hyperparameters in the `config/` directory.
- Add new features or models in `src/` (see the sketch below).
- Update data sources as needed.
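One common way to keep new models easy to plug in is a small registry in the training code. This is a hypothetical pattern, not the repo's actual structure:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Hypothetical registry: add an entry here to make a new algorithm
# selectable from the config (e.g. via a `model_name` key).
MODELS = {
    "logistic_regression": lambda: LogisticRegression(max_iter=1000),
    "random_forest": lambda: RandomForestClassifier(n_estimators=200),
}


def build_model(name: str):
    """Instantiate the model named in the config."""
    return MODELS[name]()
```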
## Contributing

Contributions are welcome! Please open issues or submit pull requests for improvements.

- Fork the repository.
- Create your feature branch: `git checkout -b feature/YourFeature`
- Commit your changes: `git commit -am 'Add some feature'`
- Push to the branch: `git push origin feature/YourFeature`
- Open a pull request.
## License

This project is licensed under the MIT License. See LICENSE for more information.
## Contact

For questions or feedback, please contact [naman-sriv](https://github.com/naman-sriv).