Skip to content

naman-sriv/Credit_Risk_Classification_AWS_Sagemaker

Repository files navigation

Credit Risk Classification using AWS SageMaker

Table of Contents


Overview

Credit Risk Classification using AWS SageMaker is an end-to-end project that demonstrates how to build, train, and deploy a machine learning model to classify credit risk (e.g., predicting if a loan applicant is likely to default), utilizing AWS SageMaker’s managed ML services. The project covers data preprocessing, model development, evaluation, and deployment in a scalable and reproducible way.


Project Structure

.
├── data/                   # Raw and processed datasets
├── notebooks/              # Jupyter notebooks for EDA, training, and inference
├── src/                    # Source code for data loading, model, utils, etc.
│   ├── preprocessing.py
│   ├── train.py
│   └── inference.py
├── requirements.txt        # Python dependencies
├── README.md
├── .gitignore
└── config/                 # Configuration files and hyperparameters

Features

  • Data Preprocessing: Cleaning, feature engineering, and transformation scripts.
  • Model Training: Training pipelines using AWS SageMaker SDK.
  • Model Evaluation: Metrics and visualization for evaluating model performance.
  • Deployment: Scripts and steps for deploying models as SageMaker endpoints.
  • Automation: Sample workflow for automating training and deployment.
  • Scalability: Easily adaptable to larger datasets and more complex models.

Installation

  1. Clone the repository

    git clone https://github.com/naman-sriv/Credit_Risk_Classification_AWS_Sagemaker.git
    cd Credit_Risk_Classification_AWS_Sagemaker
  2. Create and activate a virtual environment (recommended)

    python -m venv venv
    source venv/bin/activate   # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. (Optional) Set up AWS credentials

    • Configure your AWS CLI with aws configure, or set environment variables as per AWS docs.

Usage

1. Explore the Notebooks

  • Check the notebooks/ directory for EDA, training, and deployment walkthroughs.
  • Example: notebooks/eda.ipynb, notebooks/train_model.ipynb, notebooks/deploy_model.ipynb

2. Prepare Data

  • Place your raw dataset in the data/ folder or update paths in the config files.
  • Run data preprocessing scripts:
    python src/preprocessing.py --config config/preprocessing.yaml

3. Train the Model

  • Local training:
    python src/train.py --config config/train_config.yaml
  • Or use SageMaker:
    • Follow instructions in notebooks/train_model.ipynb to launch a SageMaker training job.

4. Evaluate the Model

  • Use evaluation scripts or notebooks to review metrics and visualizations.

5. Deploy the Model

  • Deploy using SageMaker endpoint scripts or via notebook.
  • Example:
    python src/deploy.py --model-path <model_artifact>

Model Training & Evaluation

  • Algorithms Used: (e.g., Logistic Regression, XGBoost, Random Forest)
  • Evaluation Metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC, etc.
  • Validation: k-fold cross-validation, hold-out set, etc.

Deployment

  • SageMaker Endpoint: Deploy trained model as a REST API endpoint.
  • Sample Request:
    import boto3
    
    runtime = boto3.client('sagemaker-runtime')
    response = runtime.invoke_endpoint(
        EndpointName='your-endpoint-name',
        ContentType='text/csv',
        Body='<CSV_DATA>'
    )
    print(response['Body'].read())

Customization

  • Change hyperparameters in the config/ directory.
  • Add new features or models in src/.
  • Update data sources as needed.

Contributing

Contributions are welcome! Please open issues or submit pull requests for improvements.

  1. Fork the repository.
  2. Create your feature branch: git checkout -b feature/YourFeature
  3. Commit your changes: git commit -am 'Add some feature'
  4. Push to the branch: git push origin feature/YourFeature
  5. Open a pull request.

License

This project is licensed under the MIT License. See LICENSE for more information.


Acknowledgements


Contact

For questions or feedback, please contact naman-sriv.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published