This project recognizes the American Sign Language (ASL) alphabet using deep learning. It preprocesses images of hands showing ASL signs and classifies them with Fastai models. Streamlit provides the web interface, MediaPipe detects the hand, and OpenCV crops the hand's bounding box.
The dataset used is the ASL Alphabet dataset.
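The hand-cropping step described above can be sketched as plain arithmetic: MediaPipe returns hand landmarks as normalized (x, y) coordinates in [0, 1], and the crop is the padded pixel bounding box around them. This is a minimal sketch; the function name, the margin value, and the landmark format are illustrative assumptions, not the project's exact code.

```python
def hand_bbox(landmarks, img_w, img_h, margin=0.1):
    """Compute a pixel bounding box around normalized (x, y) hand landmarks.

    `landmarks` mimics MediaPipe's output: a list of (x, y) pairs in [0, 1].
    `margin` pads the box by a fraction of its size so fingertips are not clipped.
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    pad_x = (x1 - x0) * margin
    pad_y = (y1 - y0) * margin
    # Clamp to the image so the crop stays in bounds.
    left = max(0, int((x0 - pad_x) * img_w))
    top = max(0, int((y0 - pad_y) * img_h))
    right = min(img_w, int((x1 + pad_x) * img_w))
    bottom = min(img_h, int((y1 + pad_y) * img_h))
    return left, top, right, bottom
```

With OpenCV, the crop itself is then just a NumPy slice of the frame: `crop = frame[top:bottom, left:right]`.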
Model 1: ResNet34
Architecture:
- Uses a pre-trained ResNet34 model from the Fastai library.
- The final layers are customized to fit the number of ASL alphabet classes.
Training:
- The dataset is split into training and validation sets using a random splitter with 20% validation data.
- Images are resized to 224x224 pixels and standard data augmentation transforms are applied.
- The model is trained using the One Cycle Policy for 4 epochs.
- Cross-entropy loss is used as the loss function.
- The base learning rate is chosen using the learning rate finder.
- The model's performance is evaluated by accuracy on the validation set.
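The 80/20 split above uses Fastai's RandomSplitter, which shuffles item indices and holds out a fraction for validation. A pure-Python sketch of the same idea (the helper name and seed are illustrative, not Fastai's API):

```python
import random

def random_split(n_items, valid_pct=0.2, seed=42):
    """Shuffle item indices and hold out `valid_pct` of them for validation,
    mirroring what Fastai's RandomSplitter does."""
    idxs = list(range(n_items))
    random.Random(seed).shuffle(idxs)      # deterministic shuffle for reproducibility
    n_valid = int(n_items * valid_pct)
    # (train indices, validation indices)
    return idxs[n_valid:], idxs[:n_valid]
```

Each image lands in exactly one of the two sets, so validation accuracy is measured on images the model never saw during training.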
Model 2: ResNet50
Architecture:
- Uses a pre-trained ResNet50 model from the Fastai library.
- The final layers are customized to fit the number of ASL alphabet classes.
Training:
- The dataset is split into training and validation sets using a random splitter with 20% validation data.
- Images are resized to 224x224 pixels with padding, and augmentations such as brightness, contrast, rotation, dihedral flips, and random resized crops are applied.
- The model is trained using the One Cycle Policy for 4 epochs with a base learning rate of 1e-3.
- Cross-entropy loss is used as the loss function.
- The base learning rate is chosen using the learning rate finder.
- The model's performance is evaluated by accuracy on the validation set.
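The One Cycle Policy used for both models varies the learning rate over training: it warms up from a small value to the base rate, then anneals back down. A pure-Python sketch of the schedule, assuming Fastai-style cosine interpolation and defaults (`div=25`, `pct_start=0.25`, `div_final=1e5`); the exact constants are assumptions, not taken from the project:

```python
import math

def one_cycle_lr(pos, lr_max=1e-3, pct_start=0.25, div=25.0, div_final=1e5):
    """Learning rate at training position `pos` in [0, 1] under a one-cycle
    schedule: cosine warm-up from lr_max/div to lr_max over the first
    `pct_start` of training, then cosine annealing down to lr_max/div_final."""
    def cos_interp(start, end, frac):
        return start + (end - start) * (1 - math.cos(math.pi * frac)) / 2
    if pos < pct_start:  # warm-up phase
        return cos_interp(lr_max / div, lr_max, pos / pct_start)
    # annealing phase
    frac = (pos - pct_start) / (1 - pct_start)
    return cos_interp(lr_max, lr_max / div_final, frac)
```

The large-then-small rate lets training escape poor minima early and settle precisely at the end, which is why a short 4-epoch run can still converge well.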