This project implements a Convolutional Neural Network (CNN) for real-time facial emotion recognition using the FER2013 dataset. The model is capable of detecting 7 emotions from facial expressions: angry, disgust, fear, happy, sad, surprise, and neutral.
- `training.py`: Script to train the CNN model on the FER2013 dataset.
- `model.py`: Definition of the neural network architecture using PyTorch.
- `app.py`: Real-time emotion recognition using OpenCV and a webcam.
- `model.pth`: Trained model weights (exported after training).
The architecture consists of 4 convolutional blocks (Conv2D + BatchNorm + ReLU + MaxPooling), followed by a fully connected head:
`self.conv1 → self.conv2 → self.conv3 → self.conv4 → flatten → fc1 → fc2`

Dropout is used to reduce overfitting, and the final layer outputs a 7-class prediction using softmax.
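A minimal sketch of what the model definition in `model.py` might look like. The filter counts (32/64/128/256), the 256-unit hidden layer, the 0.5 dropout rate, and the class name `EmotionCNN` are assumptions for illustration, not values read from the repository:

```python
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    """Four conv blocks (Conv2d + BatchNorm + ReLU + MaxPool) followed by a two-layer FC head."""
    def __init__(self, num_classes: int = 7):
        super().__init__()
        def block(in_ch, out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )
        # 48x48 input -> 24 -> 12 -> 6 -> 3 after four pooling stages
        self.conv1 = block(1, 32)    # assumed channel widths
        self.conv2 = block(32, 64)
        self.conv3 = block(64, 128)
        self.conv4 = block(128, 256)
        self.dropout = nn.Dropout(0.5)           # assumed dropout rate
        self.fc1 = nn.Linear(256 * 3 * 3, 256)
        self.fc2 = nn.Linear(256, num_classes)   # 7 emotion classes

    def forward(self, x):
        x = self.conv4(self.conv3(self.conv2(self.conv1(x))))
        x = torch.flatten(x, 1)
        x = self.dropout(torch.relu(self.fc1(x)))
        # Raw logits: CrossEntropyLoss applies the softmax during training,
        # and inference code applies it explicitly when probabilities are needed.
        return self.fc2(x)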
- Dataset: FER2013 (48x48 grayscale facial images)
- Accuracy achieved: ~56% on validation
- Loss function: CrossEntropyLoss
- Optimizer: Adam, learning rate = 0.001
- Epochs: 50
- Data Augmentation:
- Random horizontal flip
- Random rotation
Training was performed in Google Colab using GPU acceleration. The final model was exported to a .pth file and used locally for inference.
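A condensed sketch of the training setup described above (CrossEntropyLoss, Adam with a learning rate of 0.001, 50 epochs, flip/rotation augmentation, export to `model.pth`). The dataset layout, batch size, rotation angle, and the `EmotionCNN` class name are assumptions:

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

from model import EmotionCNN  # assumed class name, see the architecture sketch above

# Augmentation and preprocessing for the training split.
train_tf = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),   # rotation angle is an assumption
    transforms.ToTensor(),
])

# Assumes the FER2013 images have been extracted to data/train/<emotion>/*.png
train_set = datasets.ImageFolder("data/train", transform=train_tf)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = EmotionCNN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

for epoch in range(50):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "model.pth")  # weights later loaded by app.py
```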
- Open webcam feed using OpenCV.
- Detect face using Haar Cascades.
- Preprocess the detected face:
- Convert to grayscale
- Resize to 48x48
- Convert to PyTorch tensor
- Pass the image through the trained model.
- Display the predicted emotion on screen using `cv2.putText` (see the sketch below).
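A rough sketch of how `app.py` could put these steps together. The Haar cascade file, window name, and emotion label order are assumptions; in particular, the label order must match the class order used during training:

```python
import cv2
import torch

from model import EmotionCNN  # assumed class name

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]  # assumed order

model = EmotionCNN()
model.load_state_dict(torch.load("model.pth", map_location="cpu"))
model.eval()

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
        # Preprocess exactly as during training: grayscale, 48x48, scaled to [0, 1].
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
        tensor = torch.from_numpy(face).float().div(255).view(1, 1, 48, 48)
        with torch.no_grad():
            probs = torch.softmax(model(tensor), dim=1)
        label = EMOTIONS[int(probs.argmax())]
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    cv2.imshow("Emotion Recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```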
Install dependencies with:

```bash
pip install -r requirements.txt
```

Main libraries used:

- torch
- torchvision
- opencv-python
- imutils
- To train the model:

  ```bash
  python training.py
  ```

- To run the real-time emotion detector:

  ```bash
  python app.py
  ```

Make sure `model.pth` is in the same directory as `app.py`.
- The model was trained on grayscale images resized to 48x48 pixels, so input preprocessing must match.
- You can further improve performance by:
- Using a deeper CNN (e.g., ResNet; see the sketch after this list)
- Applying more aggressive data augmentation
- Training on a larger or more balanced dataset
- Emotion detection is approximate and works best under good lighting conditions.
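If you want to try the deeper-CNN route, one option (assuming a recent torchvision version) is to adapt `resnet18` to single-channel 48x48 inputs and a 7-class head; the rest of the training code stays the same:

```python
import torch.nn as nn
from torchvision import models

resnet = models.resnet18(weights=None)
# FER2013 images are single-channel, so replace the RGB stem conv,
# and swap the ImageNet classifier head for a 7-class one.
resnet.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
resnet.fc = nn.Linear(resnet.fc.in_features, 7)
# The adaptive average pooling inside ResNet makes 48x48 inputs work without further changes.
```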
This project is licensed under the MIT License – see the LICENSE file for details.
