Add autism #203

Merged
merged 6 commits into from
Mar 4, 2025
169 changes: 169 additions & 0 deletions docs/machine-learning/autism-detection.md
@@ -0,0 +1,169 @@
# 🌟 Autism Spectrum Disorder (ASD) Detection using Machine Learning

<div align="center">
<img src="https://github.com/user-attachments/assets/62cc5129-b502-4164-849b-8f74da079ee3" />
</div>

## 🎯 AIM
To develop a machine learning model that predicts the likelihood of Autism Spectrum Disorder (ASD) based on behavioral and demographic features.

## 🌊 DATASET LINK
[Autism Screening Data](https://www.kaggle.com/code/konikarani/autism-prediction/data)

## 📚 KAGGLE NOTEBOOK
[Autism Detection Kaggle Notebook](https://www.kaggle.com/code/thatarguy/autism-prediction-using-ml?kernelSessionId=224830771)

??? Abstract "Kaggle Notebook"

<iframe src="https://www.kaggle.com/embed/thatarguy/autism-prediction-using-ml?kernelSessionId=224830771" height="800" style="margin: 0 auto; width: 100%; max-width: 950px;" frameborder="0" scrolling="auto" title="autism prediction using ml"></iframe>

## ⚙️ TECH STACK

| **Category** | **Technologies** |
|--------------------------|---------------------------------------------|
| **Languages** | Python |
| **Libraries/Frameworks** | Pandas, NumPy, Scikit-learn, XGBoost        |
| **Tools** | Jupyter Notebook, VS Code |

---

## 🖍 DESCRIPTION
!!! info "What is the requirement of the project?"
- Rising ASD case counts make early detection increasingly important.
- Traditional diagnostic methods are time-consuming and expensive.
- Machine learning can provide quick preliminary predictions to aid early intervention.

??? info "How is it beneficial and used?"
- Helps doctors and researchers identify ASD tendencies early.
- Reduces the time taken for ASD screening.
- Provides a scalable and cost-effective approach.

??? info "How did you start approaching this project? (Initial thoughts and planning)"
- Collected and preprocessed the dataset.
- Explored different ML models for classification.
- Evaluated models based on accuracy and efficiency.


---

## 🔍 PROJECT EXPLANATION

### 🧩 DATASET OVERVIEW & FEATURE DETAILS
The dataset consists of **800 rows** and **22 columns**: a record ID, ten binary screening answers, demographic and background attributes, a screening score, and the ASD label.


| **Feature Name** | **Description** | **Datatype** |
|---------------------|----------------------------------------------------|:-----------:|
| `ID` | Unique identifier for each record | `int64` |
| `A1_Score` - `A10_Score` | Responses to 10 screening questions (0 or 1) | `int64` |
| `age` | Age of the individual | `float64` |
| `gender` | Gender (`m` for male, `f` for female) | `object` |
| `ethnicity` | Ethnic background | `object` |
| `jaundice` | Whether the individual had jaundice at birth (`yes/no`) | `object` |
| `austim` | Family history of autism (`yes/no`) | `object` |
| `contry_of_res` | Country of residence | `object` |
| `used_app_before` | Whether the individual used a screening app before (`yes/no`) | `object` |
| `result` | Score calculated based on the screening test | `float64` |
| `age_desc` | Age description (e.g., "18 and more") | `object` |
| `relation` | Relation of the person filling out the form | `object` |
| `Class/ASD` | ASD diagnosis label (`1` for ASD, `0` for non-ASD) | `int64` |

These features let a model predict ASD from questionnaire responses and demographic information. The column names `austim` and `contry_of_res` are spelled as they appear in the dataset.
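
As a rough illustration of the schema above, the sketch below builds a tiny hand-made frame with the same column names and runs a few sanity checks on it. The two rows are made up for illustration, and `check_schema` is a hypothetical helper, not something taken from the notebook.

```python
import pandas as pd

# Column names follow the feature table above, including the dataset's own spellings.
SCORE_COLS = [f"A{i}_Score" for i in range(1, 11)]
EXPECTED_COLS = (
    ["ID"]
    + SCORE_COLS
    + ["age", "gender", "ethnicity", "jaundice", "austim", "contry_of_res",
       "used_app_before", "result", "age_desc", "relation", "Class/ASD"]
)


def check_schema(df: pd.DataFrame) -> list[str]:
    """Return a list of problems found; an empty list means the frame looks sane."""
    missing = [col for col in EXPECTED_COLS if col not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    if not df[SCORE_COLS].isin([0, 1]).all().all():
        problems.append("screening answers must be 0 or 1")
    if not df["Class/ASD"].isin([0, 1]).all():
        problems.append("label must be 0 or 1")
    return problems


# Two made-up rows, for illustration only (not taken from the real dataset).
sample = pd.DataFrame({
    "ID": [1, 2],
    **{col: [1, 0] for col in SCORE_COLS},
    "age": [24.0, 31.5],
    "gender": ["f", "m"],
    "ethnicity": ["Asian", "White-European"],
    "jaundice": ["no", "yes"],
    "austim": ["no", "no"],
    "contry_of_res": ["Austria", "India"],
    "used_app_before": ["no", "no"],
    "result": [6.35, 2.26],
    "age_desc": ["18 and more", "18 and more"],
    "relation": ["Self", "Self"],
    "Class/ASD": [1, 0],
})

print(len(EXPECTED_COLS))    # 22
print(check_schema(sample))  # []
```

The same check could be run on the real CSV after loading it with `pd.read_csv`, before any preprocessing.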


---

### 🛠 PROJECT WORKFLOW
!!! success "Project workflow"
``` mermaid
graph LR
A[Start] --> B[Data Preprocessing];
B --> C[Feature Engineering];
C --> D[Model Training];
D --> E[Model Evaluation];
E --> F[Deployment];
```

=== "Step 1"
- Collected dataset and performed exploratory data analysis.

=== "Step 2"
- Preprocessed data (handling missing values, encoding categorical data).

=== "Step 3"
- Feature selection and engineering.

=== "Step 4"
- Trained multiple classification models (Decision Tree, Random Forest, XGBoost).

=== "Step 5"
- Evaluated models using accuracy, precision, recall, and F1-score.
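
Step 2 could look roughly like the following sketch, which imputes missing values and one-hot encodes the text columns with pandas. The imputation strategy (median for numeric columns, most frequent value for categorical ones), the `preprocess` helper, and the toy rows are all assumptions for illustration; the notebook may make different choices, such as label encoding.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame, target: str = "Class/ASD") -> tuple[pd.DataFrame, pd.Series]:
    """Impute missing values and one-hot encode categorical columns."""
    df = df.copy()
    y = df.pop(target)
    # Drop the record identifier if present: it carries no predictive signal.
    df = df.drop(columns=["ID"], errors="ignore")

    numeric_cols = df.select_dtypes(include="number").columns
    categorical_cols = df.columns.difference(numeric_cols)

    # Assumed strategy: median for numeric gaps, most frequent value for categorical gaps.
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # One-hot encode text columns; numeric columns pass through unchanged.
    X = pd.get_dummies(df, columns=list(categorical_cols), dtype=int)
    return X, y


# Made-up rows with one missing age and one missing gender.
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "A1_Score": [1, 0, 1, 0],
    "age": [24.0, np.nan, 40.0, 18.0],
    "gender": ["f", "m", None, "m"],
    "jaundice": ["no", "yes", "no", "no"],
    "Class/ASD": [1, 0, 1, 0],
})

X, y = preprocess(toy)
print(X.columns.tolist())
print(int(X.isna().sum().sum()))  # number of missing values left after imputation
```

One-hot encoding avoids imposing an artificial order on categories such as `ethnicity`, at the cost of more columns; for tree-based models either encoding is workable.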


---

### 🖥️ CODE EXPLANATION
=== "Section 1: Data Preprocessing"
- Loaded dataset and handled missing values.

=== "Section 2: Model Training"
- Trained Decision Tree, Random Forest, and XGBoost classifiers.
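
A minimal sketch of training and comparing two of the tree-based classifiers named in the workflow, using scikit-learn. Synthetic data from `make_classification` stands in for the preprocessed screening features, the hyperparameters are assumed values rather than the notebook's, and XGBoost is left out so the example needs nothing beyond scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed features (800 rows, like the dataset).
X, y = make_classification(
    n_samples=800, n_features=20, n_informative=8, random_state=42
)

# Hold out 20% for evaluation, keeping the class ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 2) for metric, value in scores.items()})
```

Fixing `random_state` everywhere keeps the comparison reproducible between runs.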

---

### ⚖️ PROJECT TRADE-OFFS AND SOLUTIONS
=== "Trade Off 1"
- **Accuracy vs. Model Interpretability**: Used a Random Forest model instead of a deep neural network for better interpretability.

=== "Trade Off 2"
- **Speed vs. Accuracy**: Simpler models such as Logistic Regression give faster predictions for real-time applications, typically at some cost in accuracy.
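
One way the interpretability of a Random Forest shows up in practice is through its impurity-based feature importances. The sketch below uses synthetic data with hypothetical feature names (`signal`, `noise_a`, `noise_b`); it illustrates the idea and is not the notebook's analysis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Hypothetical features: one informative binary column and two pure-noise columns.
signal = rng.integers(0, 2, size=n)
noise_a = rng.normal(size=n)
noise_b = rng.normal(size=n)
X = np.column_stack([signal, noise_a, noise_b])

# The label copies the signal, except for roughly 5% of rows that are flipped.
flip = rng.random(n) < 0.05
y = np.where(flip, 1 - signal, signal)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank features by importance; the values are normalised to sum to 1.
names = ["signal", "noise_a", "noise_b"]
ranked = sorted(
    zip(names, forest.feature_importances_), key=lambda pair: pair[1], reverse=True
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances are known to favour features with many distinct values, so scikit-learn's `permutation_importance` is a common cross-check.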

---

## 🖼 SCREENSHOTS
!!! tip "Visualizations and EDA of different features"

=== "Age Distribution"
![img](https://github.com/user-attachments/assets/412aa82d-0f7a-4c7a-bdca-30a553de36b4)

??? example "Model performance graphs"

=== "Confusion Matrix"
![img](https://github.com/user-attachments/assets/71c5773c-fe1f-42bb-ab76-e1150f564507)

??? example "Features Correlation"

=== "Feature Correlation Heatmap"
![img](https://github.com/user-attachments/assets/60d24749-2f2e-4222-9895-c46c29ea596e)


---

## 📉 MODELS USED AND THEIR EVALUATION METRICS
| Model | Accuracy | Precision | Recall | F1-score |
|------------|----------|-----------|--------|----------|
| Decision Tree | 73% | 0.71 | 0.73 | 0.72 |
| Random Forest | 82% | 0.82 | 0.82 | 0.82 |
| XGBoost | 81% | 0.81 | 0.81 | 0.81 |
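
For reference, the metrics in the table are defined in terms of confusion-matrix counts. The sketch below computes them for the positive class in plain Python; `classification_metrics` is a hypothetical helper, and the counts in the usage example are made up and do not come from the notebook.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much was actually positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=100)
print({name: round(value, 3) for name, value in metrics.items()})
# {'accuracy': 0.875, 'precision': 0.8, 'recall': 0.8, 'f1': 0.8}
```

In a screening setting, recall on the ASD class usually deserves the most attention, because a missed case costs more than a false alarm that a clinician later rules out.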

---

## ✅ CONCLUSION
### 🔑 KEY LEARNINGS
!!! tip "Insights gained from the data"
- Behavioral screening scores are the strongest predictors of ASD.
- Family history and neonatal jaundice also show correlations with ASD diagnosis.

??? tip "Improvements in understanding machine learning concepts"
- Feature selection and engineering play a crucial role in medical predictions.
- Trade-offs between accuracy, interpretability, and computational efficiency need to be balanced.

---

### 🌍 USE CASES
=== "Early ASD Screening"
- Helps parents and doctors identify ASD tendencies at an early stage.

=== "Assistive Diagnostic Tool"
- Can support psychologists in preliminary ASD assessments before clinical diagnosis.


11 changes: 11 additions & 0 deletions docs/machine-learning/index.md
@@ -97,5 +97,16 @@
</div>
</a>
</figure>
<!-- autism detection -->
<figure style="padding: 1rem; background: rgba(39, 39, 43, 0.5); border-radius: 10px; border: 1px solid rgba(76, 76, 82, 0.4); box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1); transition: transform 0.2s ease-in-out; text-align: center; max-width: 320px; margin: auto;">
<a href="autism-detection" style="color: white; text-decoration: none; display: block;">
<img src="https://github.com/user-attachments/assets/2c1bdd07-f30a-4b1e-b0c5-74248ae0b700" alt="autism detection using ml" style="width: 100%; height: 150px; object-fit: cover; border-radius: 8px; transition: transform 0.2s;" />
<div style="padding: 0.8rem;">
<h3 style="margin: 0; font-size: 18px;">Autism Detection</h3>
<p style="font-size: 14px; opacity: 0.8;">Predicting Autism Using Machine Learning</p>
<p style="font-size: 12px; opacity: 0.6;">📅 2025-02-26 | ⏱️ 8 mins</p>
</div>
</a>
</figure>

</div>