wavhalkomal/Classification-algorithm-for-dietetics-dataset-

CS 513 - Knowledge Discovery and Data Mining

Team:

Komal Wavhal, Lilli Nappi, Milan Girish Chandiramani, Suraj Gangwani

Problem Statement:

Diabetes, a prevalent chronic disease, affects millions of people worldwide and is linked to severe complications such as heart disease, vision loss, and kidney failure. Early detection of diabetes can significantly improve treatment outcomes and reduce healthcare costs. This problem can be addressed by developing predictive models to identify individuals at risk of developing diabetes or prediabetes.

The dataset from the CDC’s Behavioral Risk Factor Surveillance System (BRFSS) for 2015 contains responses from 70,692 individuals, equally split between those with diabetes or prediabetes (class 1) and those without (class 0). The dataset includes 21 feature variables related to health behaviors and conditions, providing an opportunity to build a classification model to predict diabetes risk.

The goal of this project is to create a machine learning model that can accurately classify individuals into two categories: those at risk of diabetes (prediabetes or diabetes) and those who are not. Such a model would support early diagnosis, allowing for timely interventions to manage and potentially prevent the disease.
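What "accurately classify" means can be made concrete with a few standard metrics. The sketch below is illustrative only and is not taken from the project code: the function names and the toy label lists are hypothetical, and it assumes labels are encoded as 0 (no diabetes) and 1 (prediabetes or diabetes). For a screening task, recall on class 1 is usually worth reporting next to accuracy, since a missed at-risk individual is the costlier error.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0 and 1."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Compute accuracy, precision, and recall for the positive class (1)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels, used only to show the call shape (not real model output).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
metrics = summarize(y_true, y_pred)
```

Because the dataset is balanced, accuracy is not distorted by class imbalance here, but precision and recall still show which kind of error a model makes.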

About Dataset: diabetes_binary_5050split_health_indicators_BRFSS2015.csv is a clean dataset of 70,692 survey responses to the CDC's BRFSS2015. It has an equal 50-50 split of respondents with no diabetes and with either prediabetes or diabetes. The target variable Diabetes_binary has 2 classes. 0 is for no diabetes, and 1 is for prediabetes or diabetes. This dataset has 21 feature variables and is balanced.
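The balance claim is easy to verify before any modelling. The following is a minimal sketch using only the Python standard library. The helper names `class_counts` and `is_balanced`, the 1% tolerance, and the placeholder feature columns in the sample are assumptions made for illustration; only the `Diabetes_binary` column name comes from the description above.

```python
import csv
import io
from collections import Counter

TARGET = "Diabetes_binary"  # target column named in the dataset description


def class_counts(csv_file):
    """Count rows per class label in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    # Labels may be stored as "0.0"/"1.0", so normalize via float then int.
    return Counter(int(float(row[TARGET])) for row in reader)


def is_balanced(counts, tolerance=0.01):
    """True if every class share is within `tolerance` of an even split."""
    total = sum(counts.values())
    if total == 0:
        return False
    even_share = 1 / len(counts)
    return all(abs(n / total - even_share) <= tolerance for n in counts.values())


# Tiny made-up sample (not real BRFSS rows); feature_a/feature_b are placeholders.
sample = io.StringIO(
    "Diabetes_binary,feature_a,feature_b\n"
    "0.0,1,26\n"
    "1.0,1,31\n"
    "0.0,0,22\n"
    "1.0,0,29\n"
)
counts = class_counts(sample)
```

To run the same check on the real file, open it with `open(path, newline="")` and pass the file object to `class_counts`. If the description holds, each class should have 35,346 rows.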

Dataset: https://www.kaggle.com/code/chanpreetsingh07/harsirat-kaur-project-demo

![image](https://github.com/user-attachments/assets/136af0a3-0c9d-49aa-9ae6-17f23aaac2f1)

About

Applies classification algorithms, including Random Forest, CART, ANN, XGBoost, SVM, k-means, k-NN, and Naive Bayes, to the dietetics dataset: https://www.kaggle.com/datasets/cdc/behavioral-risk-factor-surveillance-system?select=2015.csv
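To illustrate how one of the listed algorithms works, here is a minimal pure-Python sketch of k-nearest-neighbours (k-NN) classification by majority vote. It is not the project's implementation. The function name `knn_predict`, the choice of Euclidean distance, `k=3`, and the toy points are all assumptions made for illustration. In practice a library implementation such as scikit-learn's `KNeighborsClassifier` would be the usual choice.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Vote among the k closest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature points (made up): class 0 near the origin, class 1 near (5, 5).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]

near_origin = knn_predict(train_X, train_y, (0.5, 0.5))
near_five = knn_predict(train_X, train_y, (5.5, 5.5))
```

An odd `k` avoids tied votes when there are only two classes. Because k-NN relies on distances, features on very different scales would normally be standardized first.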
