
This repository offers a straightforward implementation of Vision Transformers (ViT), specifically designed for computer vision tasks using PyTorch. Dive into efficient and practical transformer applications for image recognition.


TINY-ViT: Vision Transformer from Scratch

Implementing a Vision Transformer model from scratch in PyTorch.

A little demo
Explanatory Video

About The Project

TINY-ViT offers a minimalist yet complete implementation of the Vision Transformer (ViT) architecture for computer vision tasks. The project aims to provide a clear, structured approach to building Vision Transformers, making it accessible for education and practical use alike.


Features

  • Modular Design: Clear separation of components like data processing, model architecture, and training routines.

  • Customizable: Easy to adapt the architecture and data pipeline for various datasets and applications.

  • Poetry Dependency Management: Utilizes Poetry for simple and reliable package management.

  • Advanced Embedding Techniques: Implements three distinct strategies for image embedding in Vision Transformers:

    • ViTConv2dEmbedding: Uses a Conv2d layer to transform input images into a sequence of flattened 2D patches, with a learnable class token prepended.
    • ViTLNEmbedding: Applies layer normalization to flattened input patches before projecting them into the embedding space, improving training stability.
    • ViTPyCon2DEmbedding: Uses a pure tensor-reshaping strategy to turn input images into a sequence of embedded patches, also with a learnable class token.
  • Custom Activation Function: The ViTGELUActFun class implements the Gaussian Error Linear Unit (GELU), which provides smoother gating behavior than traditional nonlinearities such as ReLU.
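The Conv2d-based embedding and the GELU activation can be sketched roughly as follows. The constructor arguments (img_size, patch_size, embed_dim) and internals here are illustrative assumptions, not the repository's exact code:

```python
import math
import torch
import torch.nn as nn

class ViTGELUActFun(nn.Module):
    """Tanh approximation of the Gaussian Error Linear Unit (GELU)."""
    def forward(self, x):
        return 0.5 * x * (1.0 + torch.tanh(
            math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

class ViTConv2dEmbedding(nn.Module):
    """Split an image into patches with a strided Conv2d, prepend a class token."""
    def __init__(self, img_size=224, patch_size=16, in_ch=3, embed_dim=192):
        super().__init__()
        # A Conv2d with kernel == stride == patch_size cuts the image into
        # non-overlapping patches and projects each to embed_dim at once.
        self.proj = nn.Conv2d(in_ch, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        n_patches = (img_size // patch_size) ** 2
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, embed_dim))

    def forward(self, x):                     # x: (B, C, H, W)
        x = self.proj(x)                      # (B, D, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)      # (B, N, D)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1)        # (B, N+1, D)
        return x + self.pos_embed
```

For a 224x224 RGB image with 16x16 patches, the output is a sequence of 197 tokens (196 patches plus the class token), each of dimension embed_dim.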

Project Structure

TINY-VIT-TRANSFORMER-FROM-SCRATCH
β”‚
β”œβ”€β”€ dataset                   # Dataset directory
β”œβ”€β”€ tests                     # Test scripts
β”œβ”€β”€ tiny_vit_transformer_from_scratch
β”‚   β”œβ”€β”€ core                  # Core configurations and caching
β”‚   β”œβ”€β”€ data                  # Data processing modules
β”‚   └── model                 # Transformer model components
β”œβ”€β”€ train.py                  # Script to train the model
β”œβ”€β”€ finetune.py               # Script for fine-tuning the model
β”œβ”€β”€ README.md                 # Project README file
β”œβ”€β”€ poetry.lock               # Poetry lock file for consistent builds
└── pyproject.toml            # Poetry project file with dependency descriptions

Built With

This project is built with:

  • PyTorch – deep learning framework used to implement and train the model
  • Poetry – dependency management and packaging


Getting Started

To get a local copy up and running, follow these steps.

Installation

  1. Clone the repo
    git clone https://github.com/benisalla/tiny-vit-transformer-from-scratch.git
  2. Install Poetry packages
    poetry install

Usage

The sections below show how to train, fine-tune, and evaluate the model.

Training

To train the model using the default configuration:

poetry run python train.py
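A training script like this typically wires a model, an optimizer, and a cross-entropy loss into a standard supervised loop. The sketch below shows that generic pattern, not the script's actual internals (the function name and arguments are illustrative):

```python
import torch
import torch.nn as nn

def train_one_epoch(model, loader, optimizer, device="cpu"):
    """One pass over the training data with cross-entropy loss."""
    model.train()
    criterion = nn.CrossEntropyLoss()
    total_loss = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * images.size(0)
    return total_loss / len(loader.dataset)
```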

Fine-Tuning

To fine-tune a pre-trained model:

poetry run python finetune.py
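Fine-tuning a pre-trained ViT commonly means freezing the backbone and swapping in a fresh classification head for the new dataset. The helper below sketches that common pattern; the attribute name "head" and the function itself are hypothetical, not this repository's API:

```python
import torch
import torch.nn as nn

def prepare_for_finetuning(model, head_attr="head", num_classes=10, freeze=True):
    """Freeze backbone parameters and swap in a fresh classification head."""
    if freeze:
        for p in model.parameters():
            p.requires_grad = False
    old_head = getattr(model, head_attr)
    # The new head is created after freezing, so its parameters stay trainable.
    new_head = nn.Linear(old_head.in_features, num_classes)
    setattr(model, head_attr, new_head)
    return model
```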

Model Performance

The TINY-ViT model was evaluated on a held-out set of test images to measure its accuracy. Here are the results:

  • Accuracy on test images: 81.60%

These results demonstrate the effectiveness of the tiny-vit model in handling complex image recognition tasks. We continuously seek to improve the model and update the metrics as new test results become available.
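The figure above is a top-1 accuracy: the fraction of test images whose highest-scoring prediction matches the label. A minimal sketch of how such a number can be computed over a test loader (generic pattern, assumed classification setup):

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cpu"):
    """Fraction of test images whose argmax prediction matches the label."""
    model.eval()
    correct = total = 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.size(0)
    return correct / total
```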



Roadmap

See the open issues for a list of proposed features (and known issues).


Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.


Contributors

  • Asmae El-Ghezzaz - Data Scientist/ML

    • GitHub
    • Contributions: Provided expertise in machine learning and data science methodologies.
  • Idriss El Houari - Data Scientist

    • GitHub
    • Kaggle
    • Contributions: Curated the plant disease dataset and developed the initial analysis notebook.
  • Farheen Akhter - Graduate Student at California State University

    • GitHub
    • Kaggle
    • Contributions: Worked on improving crop yield and pest/disease detection through data analytics. Provided dataset and analytical insights on local farms in Ghana.
  • Aicha Dessa - Data Scientist Intern

    • GitHub
    • Kaggle
    • Contributions: Analyzed plant disease recognition data and contributed to model training and testing processes.
  • Zeroual Salma - Student at Agronomic and Veterinary Institute Hassan II

    • GitHub
    • Contributions: Focused on plant pathology and contributed to dataset analysis and insights into Botrytis disease.
  • El Fakir Chaimae - Master's Student in Artificial Intelligence

    • GitHub
    • Contributions: Provided datasets on pathogen detection and collaborated on developing AI models for disease prediction.
  • Laghbissi Salma - Master's Student in Software Engineering for Cloud Computing

    • GitHub
    • Contributions: Researched plant pathologies and contributed significantly to dataset understanding and processing.

Authors

  • Ismail Ben Alla (Me πŸ˜‰) - View My GitHub Profile
  • Asmae El-Ghezzaz (a friend of mine) - deserves special thanks for her help and advice

Acknowledgements

This project owes its success to the invaluable support and resources provided by several individuals and organizations. A heartfelt thank you to:

  • Asmae El-Ghezzaz - For inviting me to be a member of Moroccan Data Scientists (MDS), where I had the opportunity to develop this project.
  • Moroccan Data Scientists (MDS) - Although I am no longer a member, I hold great admiration for the community and wish it continued success.
  • Pests and Vegetables Disease Detection Team in MDS - Aicha, Hiba, Idriss, Farheen, Asmae, ...
  • PyTorch - For the powerful and flexible deep learning platform that has made implementing models a smoother process.
  • Kaggle - For providing the datasets used in training our models and hosting competitions that inspire our approaches.
  • Google Colab - For the computational resources that have been instrumental in training and testing our models efficiently.

License

This project is made available under fair use guidelines. While there is no formal license associated with the repository, users are encouraged to credit the source if they utilize or adapt the code in their work. This approach promotes ethical practices and contributions to the open-source community. For citation purposes, please use the following:

@misc{tiny_vit_2024,
  title={TINY-ViT: Vision Transformer from Scratch},
  author={Ben Alla Ismail},
  year={2024},
  url={https://github.com/benisalla/tiny-vit-transformer-from-scratch}
}

About Me

πŸŽ“ Ismail Ben Alla - Neural Network Enthusiast

I am deeply passionate about exploring artificial intelligence and its potential to solve complex problems and unravel the mysteries of our universe. My academic and professional journey is characterized by a commitment to learning and innovation in AI, deep learning, and machine learning.

What Drives Me

  • Passion for AI: Eager to push the boundaries of technology and discover new possibilities.
  • Continuous Learning: Committed to staying informed and skilled in the latest advancements.
  • Optimism and Dedication: Motivated by the challenges and opportunities that the future of AI holds.

I thoroughly enjoy what I do and am excited about the future of AI and machine learning. Let's connect and explore the endless possibilities of artificial intelligence together!


Get ready to see pixels transform into insights πŸŒŸπŸ”βœ¨
