The project aims to detect human emotions such as "angry", "happy", and "sad" from image data. It uses a custom convolutional neural network as a baseline model, incorporates modern CNNs such as ResNet and EfficientNet, and also explores Vision Transformers in search of the best results.
Streamlit Web Application - https://human-emotion-detection.streamlit.app/
The dataset is obtained from Kaggle - https://www.kaggle.com/datasets/muhammadhananasghar/human-emotions-datasethes It comes with separate train and test sets, each containing 3 classes - 'angry', 'happy', and 'sad'.
The train set has about 6799 files and the test set has 2278 files.
- Initially, the data was converted into a TensorFlow dataset so it could be fed to the neural networks: the images were batched, the target variable was converted into categorical (one-hot) values, and the data was shuffled to remove any data-collection ordering bias.
- Data augmentation was performed using the Keras random rotation, random flip, and random contrast layers, to help the models become invariant to the location and orientation of faces in the images.
- Images were rescaled and resized to a standard (224, 224, 3) shape.
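The preprocessing steps above can be sketched as follows. This is a minimal sketch, not the project's actual code: it assumes a class-per-folder directory layout and uses `tf.keras.utils.image_dataset_from_directory`; the image size of 224 follows the bullet above (note the configuration later in this README lists `IM_SIZE: 256`, so the exact value used may differ).

```python
import tensorflow as tf

IM_SIZE = 224      # per the preprocessing note; the config section lists 256
BATCH_SIZE = 32
CLASS_NAMES = ["angry", "happy", "sad"]

def make_dataset(directory):
    """Load a class-per-folder image directory into a batched tf.data.Dataset."""
    ds = tf.keras.utils.image_dataset_from_directory(
        directory,
        labels="inferred",
        label_mode="categorical",    # one-hot targets for categorical cross-entropy
        class_names=CLASS_NAMES,
        image_size=(IM_SIZE, IM_SIZE),
        batch_size=BATCH_SIZE,
        shuffle=True,                # shuffle to remove data-collection ordering bias
    )
    return ds.prefetch(tf.data.AUTOTUNE)

# Augmentation: random rotation, flip, and contrast, active only at training time.
augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomContrast(0.1),
])

# Rescale pixel values from [0, 255] down to [0, 1].
rescale = tf.keras.layers.Rescaling(1.0 / 255)
```

The augmentation factors (0.1) are illustrative defaults, not values taken from the project.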
- Baseline Convolutional Neural Network
- ResNet50
- EfficientNetB4
- Vision Transformer
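A baseline CNN consistent with the configuration values listed below might look like the following. This is one plausible reading of the configs (`N_FILTERS`, `KERNEL_SIZE`, `N_STRIDES`, `POOL_SIZE`, `NUM_CLASSES`), not the project's actual architecture; the number of conv blocks and the dense layer width are assumptions.

```python
import tensorflow as tf

# Hyperparameters taken from the configuration section of this README.
IM_SIZE = 256
N_FILTERS = 6
KERNEL_SIZE = 3
N_STRIDES = 1
POOL_SIZE = 2
NUM_CLASSES = 3

def build_baseline_cnn():
    """A minimal two-block Conv/Pool baseline built from the listed configs."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(IM_SIZE, IM_SIZE, 3)),
        tf.keras.layers.Conv2D(N_FILTERS, KERNEL_SIZE,
                               strides=N_STRIDES, activation="relu"),
        tf.keras.layers.MaxPooling2D(POOL_SIZE),
        tf.keras.layers.Conv2D(N_FILTERS * 2, KERNEL_SIZE,
                               strides=N_STRIDES, activation="relu"),
        tf.keras.layers.MaxPooling2D(POOL_SIZE),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(100, activation="relu"),   # width is an assumption
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # 3 emotions
    ])

model = build_baseline_cnn()
```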
All the models were trained for 30 epochs with an initial learning rate of 0.001 (5e-5 for the Vision Transformer). Adam was used as the optimizer to minimize the categorical cross-entropy loss. Below are the configurations:
- BATCH_SIZE: 32
- IM_SIZE: 256
- LEARNING_RATE: 0.001
- N_EPOCH: 30
- N_FILTERS: 6
- KERNEL_SIZE: 3
- N_STRIDES: 1
- POOL_SIZE: 2
- NUM_CLASSES: 3
TensorFlow callbacks were used for logging, for later visualization in TensorBoard. 'Early Stopping' was implemented to stop training when there was no further improvement for a certain number of epochs, and 'Reduce LR on Plateau' to lower the learning rate in the same situation.
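A training setup along the lines described above can be sketched as follows. The patience values, the ReduceLROnPlateau factor, and the log directory are assumptions (the README does not state them), and the tiny stand-in model exists only to keep the sketch runnable; any of the four models would slot into its place.

```python
import tensorflow as tf

LEARNING_RATE = 0.001   # 5e-5 was used for the Vision Transformer
N_EPOCH = 30

# Callbacks as described: TensorBoard logging, early stopping, and LR reduction
# on plateau. Patience/factor values here are assumptions.
callbacks = [
    tf.keras.callbacks.TensorBoard(log_dir="./logs"),
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.1,
                                         patience=3),
]

# Stand-in model so the compile step below is self-contained and runnable.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(256, 256, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_ds, validation_data=val_ds, epochs=N_EPOCH, callbacks=callbacks)
```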
Model Comparison Plot
As one might expect, the Transformer model outperforms the CNN models after training for just 10 epochs, and among the CNN models, EfficientNetB4 does a better job than ResNet50 with much lower model complexity.