- About The Project
- Built With
- Features
- Project Structure
- Getting Started
- Installation
- Usage
- Training
- Fine-Tuning
- Roadmap
- Contributing
- Contributors
- Authors
- Acknowledgements
- License
- About Me
TINY-ViT offers a minimalist, yet complete implementation of the Vision Transformer (ViT) architecture for computer vision tasks. This project aims to provide a clear and structured approach to building Vision Transformers, making it accessible for educational purposes and practical applications alike.
- Modular Design: Clear separation of components such as data processing, model architecture, and training routines.
- Customizable: Easy to adapt the architecture and data pipeline to different datasets and applications.
- Poetry Dependency Management: Uses Poetry for simple and reliable package management.
- Advanced Embedding Techniques: Implements three distinct patch-embedding strategies for Vision Transformers (see the sketch after this list):
  - ViTConv2dEmbedding: Uses a Conv2D layer to transform the input image into a sequence of flattened 2D patches, with a learnable class token added.
  - ViTLNEmbedding: Applies layer normalization to the flattened input patches before projecting them into the embedding space, improving stability and performance.
  - ViTPyCon2DEmbedding: Uses a pure tensor-reshaping strategy to transform the input image into a sequence of embedded patches, also with a learnable class token.
- Custom Activation Function: The ViTGELUActFun class implements the Gaussian Error Linear Unit (GELU), providing smoother gating behavior than traditional nonlinearities such as ReLU.
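The embedding and activation classes above live in the model package. As a rough sketch of the ideas behind the Conv2D embedding and the GELU activation (class names, constructor arguments, and defaults below are illustrative assumptions, not the project's exact signatures):

```python
import math
import torch
import torch.nn as nn


class Conv2dPatchEmbedding(nn.Module):
    """Illustrative sketch: patchify an image with a Conv2d and prepend a learnable class token."""

    def __init__(self, img_size=224, patch_size=16, in_channels=3, embed_dim=768):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # kernel_size == stride == patch_size => non-overlapping patches, each projected to embed_dim
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, embed_dim))

    def forward(self, x):                      # x: (B, C, H, W)
        x = self.proj(x)                       # (B, D, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)       # (B, N, D)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)         # (B, N + 1, D)
        return x + self.pos_embed


class GELUActFun(nn.Module):
    """Tanh approximation of the Gaussian Error Linear Unit."""

    def forward(self, x):
        return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3))))
```

The key trick in the Conv2D variant is that a kernel size equal to the stride partitions the image into non-overlapping patches, so a single convolution performs both patching and linear projection.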
TINY-VIT-TRANSFORMER-FROM-SCRATCH
│
├── dataset                              # Dataset directory
├── tests                                # Test scripts
├── tiny_vit_transformer_from_scratch
│   ├── core                             # Core configurations and caching
│   ├── data                             # Data processing modules
│   └── model                            # Transformer model components
├── train.py                             # Script to train the model
├── finetune.py                          # Script for fine-tuning the model
├── README.md                            # Project README file
├── poetry.lock                          # Poetry lock file for consistent builds
└── pyproject.toml                       # Poetry project file with dependency descriptions
This project is built with the following major frameworks and tools: PyTorch for the model and training code, and Poetry for dependency management.
To get a local copy up and running, follow these simple steps.
- Clone the repo
git clone https://github.com/benisalla/tiny-vit-transformer-from-scratch.git
- Install Poetry packages
poetry install
Here is how you can use this code.
To train the model using the default configuration:
poetry run python train.py
To fine-tune a pre-trained model:
poetry run python finetune.py
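Both scripts wrap a standard PyTorch training loop around the model, data pipeline, and configuration. As a hypothetical sketch of what a single training epoch involves (the function below is illustrative and not taken from train.py):

```python
import torch.nn as nn


def train_one_epoch(model, loader, optimizer, device="cuda"):
    """Hypothetical single training epoch for a ViT classifier (not the project's actual train.py)."""
    model.train()
    criterion = nn.CrossEntropyLoss()
    running_loss = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)              # (B, num_classes)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    return running_loss / len(loader)
```

Fine-tuning follows the same loop after loading a pre-trained checkpoint (for example with torch.load and model.load_state_dict), typically with a smaller learning rate.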
The TINY-ViT model was evaluated on a comprehensive set of test images to gauge its accuracy and performance. Here are the results:
- Accuracy on test images: 81.60%
These results demonstrate the effectiveness of the TINY-ViT model in handling complex image recognition tasks. We continuously seek to improve the model and will update the metrics as new test results become available.
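Accuracy here means the fraction of test images whose predicted class matches the ground-truth label. A minimal sketch of how such a figure can be computed (illustrative, not the project's exact evaluation code):

```python
import torch


@torch.no_grad()
def evaluate_accuracy(model, loader, device="cuda"):
    """Fraction of test images whose predicted class matches the label (illustrative helper)."""
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total                  # e.g. 0.8160 -> 81.60%
```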
See the open issues for a list of proposed features (and known issues).
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.
- Asmae El-Ghezzaz - Data Scientist/ML
  - GitHub
  - Contributions: Provided expertise in machine learning and data science methodologies.
- Idriss El Houari - Data Scientist
- Farheen Akhter - Graduate Student at California State University
- Aicha Dessa - Data Scientist Intern
- Zeroual Salma - Student at the Agronomic and Veterinary Institute Hassan II
  - GitHub
  - Contributions: Focused on plant pathology and contributed to dataset analysis and insights into Botrytis disease.
- El Fakir Chaimae - Master's Student in Artificial Intelligence
  - GitHub
  - Contributions: Provided datasets on pathogen detection and collaborated on developing AI models for disease prediction.
- Laghbissi Salma - Master's Student in Software Engineering for Cloud Computing
  - GitHub
  - Contributions: Researched plant pathologies and contributed significantly to dataset understanding and processing.
- Ismail Ben Alla (Me) - View My GitHub Profile
- Asmae El-Ghezzaz (a friend of mine) - deserves special thanks for her help and advice
This project owes its success to the invaluable support and resources provided by several individuals and organizations. A heartfelt thank you to:
- Asmae El-Ghezzaz - For inviting me to be a member of Moroccan Data Scientists (MDS), where I had the opportunity to develop this project.
- Moroccan Data Scientists (MDS) - Although I am no longer a member, I hold great admiration for the community and wish it continued success.
- Pests and Vegetables Disease Detection Team in MDS - Aicha, Hiba, Idriss, Farheen, Asmae, ...
- PyTorch - For the powerful and flexible deep learning platform that has made implementing models a smoother process.
- Kaggle - For providing the datasets used in training our models and hosting competitions that inspire our approaches.
- Google Colab - For the computational resources that have been instrumental in training and testing our models efficiently.
This project is made available under fair use guidelines. While there is no formal license associated with the repository, users are encouraged to credit the source if they utilize or adapt the code in their work. This approach promotes ethical practices and contributions to the open-source community. For citation purposes, please use the following:
@misc{tiny_vit_2024,
  title={TINY-ViT: Vision Transformer from Scratch},
  author={Ben Alla Ismail},
  year={2024},
  url={https://github.com/benisalla/tiny-vit-transformer-from-scratch}
}
Ismail Ben Alla - Neural Network Enthusiast
I am deeply passionate about exploring artificial intelligence and its potential to solve complex problems and unravel the mysteries of our universe. My academic and professional journey is characterized by a commitment to learning and innovation in AI, deep learning, and machine learning.
- Passion for AI: Eager to push the boundaries of technology and discover new possibilities.
- Continuous Learning: Committed to staying informed and skilled in the latest advancements.
- Optimism and Dedication: Motivated by the challenges and opportunities that the future of AI holds.
I thoroughly enjoy what I do and am excited about the future of AI and machine learning. Let's connect and explore the endless possibilities of artificial intelligence together!