
InPainTor🎨: Context-Aware Segmentation and Inpainting in Real-Time

License: GPL v3 · Python 3.8+ · Conda


CIIC CT Logo


InPainTor🎨 is a deep learning model designed for context-aware segmentation and inpainting in real time. It recognizes objects of interest and performs inpainting on specific classes while preserving the surrounding context.

Training demo

🚀 Features

  • Real-time object recognition and inpainting
  • Selective removal and filling of missing or unwanted objects
  • Context preservation during inpainting
  • Two-stage training process: segmentation and inpainting
  • Support for COCO and RORD datasets

🚧 WIP (Work In Progress)

This project is currently under development. Use with caution and expect changes.

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/ipleiria-ciic/in-pain-tor.git
    cd in-pain-tor
  2. Create and activate the Conda environment:

    conda env create -f environment.yml
    conda activate inpaintor

🖥️ Usage

Training

To train the InPainTor model:

python src/train.py --coco_data_dir "path/to/COCO" --rord_data_dir "path/to/RORD" --seg_epochs <num_epochs> --inpaint_epochs <num_epochs>
Training arguments:
  • --coco_data_dir: Path to the COCO 2017 dataset directory
  • --rord_data_dir: Path to the RORD dataset directory
  • --seg_epochs: Number of epochs for segmentation training (default: 10)
  • --inpaint_epochs: Number of epochs for inpainting training (default: 10)
  • --batch_size: Batch size for training (default: 2)
  • --learning_rate: Learning rate for the optimizer (default: 0.1)
  • --image_size: Size of the input images, assumed to be square (default: 512)
  • --mask_size: Size of the masks, assumed to be square (default: 256)
  • --model_name: Name of the model (default: 'InPainTor')
  • --log_interval: Log interval for training (default: 1000)
  • --resume_checkpoint: Path to the checkpoint to resume training from (default: None)
  • --selected_classes: List of class IDs for inpainting (default: [1, 72, 73, 77])
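
With the 91-class COCO 2017 label set, the default --selected_classes IDs (1, 72, 73, 77) correspond to person, tv, laptop, and cell phone. As a concrete example, the run below spells out every default listed above (the paths are placeholders, and the space-separated form for --selected_classes assumes the script parses the list that way):

python src/train.py --coco_data_dir "path/to/COCO" --rord_data_dir "path/to/RORD" --seg_epochs 10 --inpaint_epochs 10 --batch_size 2 --learning_rate 0.1 --image_size 512 --mask_size 256 --selected_classes 1 72 73 77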

Inference

To perform inference using the trained InPainTor model:

python src/inference.py --model_path "path/to/model.pth" --data_dir "path/to/data" --image_size 512 --mask_size 256 --batch_size <num_examples_per_batch> --output_dir "path/to/outputs"
Inference arguments:
  • --model_path: Path to the trained model checkpoint
  • --data_dir: Path to the directory containing images for inference
  • --image_size: Size of the input images, assumed to be square (default: 512)
  • --mask_size: Size of the masks, assumed to be square (default: 256)
  • --batch_size: Batch size for inference (default: 1)
  • --output_dir: Path to the directory to save the inpainted images
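
If you need to script inference rather than call the CLI, the sketch below shows the general PyTorch pattern. It is illustrative only (it assumes the checkpoint stores a full pickled model and that the model maps a batch of images to inpainted outputs), not the actual code in src/inference.py:

import os
import torch
from PIL import Image
from torchvision import transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Assumption: the .pth file holds the whole model; a bare state_dict
# would instead require constructing the model class first.
model = torch.load("path/to/model.pth", map_location=device)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((512, 512)),  # matches the --image_size default
    transforms.ToTensor(),
])

with torch.no_grad():
    for name in os.listdir("path/to/data"):
        image = Image.open(os.path.join("path/to/data", name)).convert("RGB")
        batch = preprocess(image).unsqueeze(0).to(device)
        output = model(batch)  # output format is model-specific
        # ... save the inpainted result under --output_dir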

📁 Project Structure

Click to view the repository structure
InpainTor/ 
├── assets/                   📂: Repository assets (images, logos, etc.)
├── checkpoints/              💾: Model checkpoints
├── logs/                     📃: Log files
├── notebooks/                📓: Jupyter notebooks
├── outputs/                  📺: Output files generated during inference, training and debugging
├── src/                      📜: Source code files
│   ├── __init__.py           📊: Initialization file
│   ├── data_augmentation.py  📑: Data augmentation operations
│   ├── dataset.py            📊: Dataset loading and preprocessing
│   ├── debug_model.py        📊: Model debugging
│   ├── inference.py          📊: Inference script
│   ├── layers.py             📊: Model layers
│   ├── losses.py             📊: Loss functions
│   ├── model.py              📑: InpainTor model implementation
│   ├── train.py              📊: Training script
│   └── visualizations.py     📊: Visualization functions
├── .gitignore                🚫: Files to ignore in Git
├── environment.yml           🎛️: Conda environment configuration
└── README.md                 📖: Project README file

🧠 Model Architecture

The InPainTor model consists of three main components:

  1. SharedEncoder: Encodes input images into a series of feature maps.
  2. SegmentorDecoder: Decodes encoded features into segmentation masks.
  3. GenerativeDecoder: Uses segmentation information to generate inpainted images.
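
A minimal PyTorch-style sketch of how these components could fit together; the class name InPainTor matches the project, but the attribute names, signatures, and forward logic are illustrative assumptions, not the actual src/model.py:

import torch.nn as nn

class InPainTor(nn.Module):
    # Illustrative sketch of the three-component design.
    def __init__(self, shared_encoder, segmentor_decoder, generative_decoder):
        super().__init__()
        self.encoder = shared_encoder        # SharedEncoder
        self.segmentor = segmentor_decoder   # SegmentorDecoder
        self.generator = generative_decoder  # GenerativeDecoder

    def forward(self, x):
        features = self.encoder(x)        # series of feature maps
        masks = self.segmentor(features)  # segmentation masks
        # The generator uses the segmentation information to inpaint.
        inpainted = self.generator(features, masks)
        return masks, inpainted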

Overview of InPainTor Model Architecture

Model Components in Detail

Model Concept

Model Training Process

  1. Train the SharedEncoder and SegmentorDecoder for accurate segmentation.
  2. Freeze the SharedEncoder and SegmentorDecoder, then train the GenerativeDecoder (see the sketch below).
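
A minimal sketch of stage 2, assuming standard PyTorch modules and the illustrative attribute names from the architecture sketch above (the optimizer choice is also an assumption; the repository only documents the learning rate):

import torch

# `model` is an InPainTor instance (see the architecture sketch above).
# Stage 2: freeze the segmentation weights, train only the generator.
for param in model.encoder.parameters():    # SharedEncoder
    param.requires_grad = False
for param in model.segmentor.parameters():  # SegmentorDecoder
    param.requires_grad = False

optimizer = torch.optim.Adam(
    model.generator.parameters(),  # GenerativeDecoder only
    lr=0.1,                        # matches the --learning_rate default
)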

Example of Loss During Training Stages

📊 Dataset Requirements

RORD Inpainting Dataset Structure

The RORD dataset should be organized as follows:

root_dir/
├── train/
│   ├── img/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── gt/
│       ├── image1.jpg
│       ├── image2.jpg
│       └── ...
└── val/
    ├── img/
    │   ├── image1.jpg
    │   ├── image2.jpg
    │   └── ...
    └── gt/
        ├── image1.jpg
        ├── image2.jpg
        └── ...
COCO Segmentation Dataset Structure

The COCO dataset (2017 version with 91 classes) should be organized as follows:

root_dir/
├── train/
│   ├── img/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── gt/
│       ├── image1.jpg
│       ├── image2.jpg
│       └── ...
└── val/
    ├── img/
    │   ├── image1.jpg
    │   ├── image2.jpg
    │   └── ...
    └── gt/
        ├── image1.jpg
        ├── image2.jpg
        └── ...
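
Both datasets share the same img/gt layout, so a quick pairing check can catch misplaced files before training. A minimal sketch (not part of the repository; adjust the root paths to your setup):

import os

def check_split(root_dir, split):
    # Verify every image in img/ has a matching ground-truth file in gt/.
    imgs = set(os.listdir(os.path.join(root_dir, split, "img")))
    gts = set(os.listdir(os.path.join(root_dir, split, "gt")))
    missing = sorted(imgs - gts)
    if missing:
        print(f"{split}: {len(missing)} images lack ground truth, e.g. {missing[:3]}")
    else:
        print(f"{split}: {len(imgs)} image/gt pairs look consistent")

for split in ("train", "val"):
    check_split("path/to/RORD", split)
    check_split("path/to/COCO", split)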

For more information on COCO dataset classes, refer to the official COCO documentation.

🔮 Current Limitations and Future Work

Limitations
  1. Segmentation Performance:

    • The current segmentation model works relatively well on small datasets with limited variety.
    • It struggles with larger, more diverse datasets like COCO 2017.
  2. Generator Performance:

    • The current generator architecture may be too simplistic, particularly in the layers following the masking process.
    • The frozen encoder in the generator could be limiting the model's learning capacity.
  3. Hardware Constraints:

    • Memory limitations restrict model size and batch processing capabilities.
    • This limits the choice of architectures and training strategies.
  4. No Data Augmentation:

    • Not currently integrated into the training pipeline (though the implementation is 90% ready).
Future Work
  1. Improve Segmentation Section:

    • Investigate and implement more sophisticated segmentation architectures, such as ENet or BiSeNet.
    • Check whether it's possible to adapt pre-trained models to this architecture.
  2. Enhance Generator Architecture:

    • Increase the number of parameters and layers after the masking process in the generator.
    • Experiment with more sophisticated generator designs, potentially allowing (limited) parts of the encoder to be trainable.
  3. Experiment with Cost Functions:

    • Test and evaluate alternative loss functions.
    • Consider multi-objective loss functions that balance different aspects of the inpainting task.
  4. Incorporate Data Augmentation:

    • Integrate the already implemented data augmentation techniques into the training pipeline.
  5. Evaluation Metrics:

    • Implement evaluation metrics to better assess the quality of inpainted images.

🤝 Contributing

Contributions to the InPainTor project are welcome! Please follow these steps to contribute:

  1. Fork the repository
  2. Create a new branch for your feature or bug fix
  3. Commit your changes
  4. Push to your fork and submit a pull request

We appreciate your contributions to improve InPainTor!

🙏 Acknowledgements

This work is funded by FCT - Fundação para a Ciência e a Tecnologia, I.P., through the project with reference 2022.09235.PTDC.

📄 License

This project is licensed under GPLv3.


For more information or support, please open an issue in the GitHub repository.