This is my project for the Deep Learning and Generative Models course at @UniPr. The goal of the project was to train a Mask R-CNN on the ModaNet dataset and run some ablation studies. ModaNet consists of more than 40k images annotated with clothing items, grouped into the following 13 categories:
Label | Description | Fine-grained categories |
---|---|---|
1 | bag | bag |
2 | belt | belt |
3 | boots | boots |
4 | footwear | footwear |
5 | outer | coat/jacket/suit/blazers/cardigan/sweater/jumpsuits/rompers/vest
6 | dress | dress/t-shirt dress |
7 | sunglasses | sunglasses |
8 | pants | pants/jeans/leggings |
9 | top | top/blouse/t-shirt/shirt |
10 | shorts | shorts |
11 | skirt | skirt |
12 | headwear | headwear |
13 | scarf & tie | scarf & tie
The project's task also requires modifying the Mask R-CNN's structure by adding a new branch to the head of the network, in charge of classifying whether an object detected by the network is an accessory or not, based on the following grouping (see the sketch after the table below):
Label | Accessory |
---|---|
1 | bag |
2 | belt |
7 | sunglasses |
12 | headwear |
13 | scarf & tie |
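As a rough illustration of this extra branch, the box predictor can expose a third output with two logits (accessory / not accessory) next to the usual class scores and box deltas. The snippet below is only a minimal sketch of what `FastRCNNPredictorWithAccessory` could look like; the layer names and sizes are assumptions, not the exact code in `models/mask_rcnn.py`.

```python
import torch
from torch import nn

class FastRCNNPredictorWithAccessory(nn.Module):
    """Box predictor with an extra binary accessory head (illustrative sketch)."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.cls_score = nn.Linear(in_channels, num_classes)      # clothing classes + background
        self.bbox_pred = nn.Linear(in_channels, num_classes * 4)  # per-class box deltas
        self.accessory_score = nn.Linear(in_channels, 2)          # accessory vs. non-accessory

    def forward(self, x: torch.Tensor):
        if x.dim() == 4:
            x = x.flatten(start_dim=1)  # pooled RoI features arrive as (N, C, 1, 1)
        return self.cls_score(x), self.bbox_pred(x), self.accessory_score(x)
```

The custom `RoIHeads` can then add a cross-entropy term over the accessory logits to the standard Faster R-CNN losses.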
So, the final head of the model combines this extra accessory branch with the standard classification, box-regression and mask outputs.
Here is the structure of the project:
- `main.py`: Launches the run and handles the command-line arguments.
- `solver.py`: Contains the methods for training, validation, testing and evaluation. It is the core of the project.
- `models/mask_rcnn.py`: Creates the Mask R-CNN model, based on the official PyTorch implementation, with a choice between the two backbones provided by PyTorch. It also defines `FastRCNNPredictorWithAccessory`, which replaces the default predictor when accessory classification is required.
- `models/roi_heads.py`: Implements custom `RoIHeads` for binary accessory classification and a custom Faster R-CNN loss that includes the accessory term.
- `dataset/modanet.py`: Implements the DataLoader for the ModaNet dataset, using COCO-style annotations.
- `utils/utils.py`: Contains utility functions, such as drawing masks and boxes on an image.
- `test_rt.py`: Work-in-progress module that will capture frames from a webcam and run the model's inference on them.
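For context, this is roughly how a torchvision Mask R-CNN can be adapted to ModaNet's 13 categories (plus background). The snippet is a simplified sketch, not the exact contents of `models/mask_rcnn.py`; depending on your torchvision version, pretrained weights are requested with `weights="DEFAULT"` or with the older `pretrained=True` flag.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 14  # 13 ModaNet categories + background

def build_model(pretrained: bool = True):
    # Backbone and heads pretrained on COCO, predictors replaced for ModaNet.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(
        weights="DEFAULT" if pretrained else None
    )

    # Swap the box predictor for one sized to ModaNet's classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

    # Swap the mask predictor as well.
    in_channels_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels_mask, 256, NUM_CLASSES)
    return model
```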
All of the packages needed to run the models are listed in the `requirements.txt` file. I also provide `requirments_used.txt`, which additionally pins the package versions, in case you want to replicate my environment exactly.
First clone this repo:

```bash
git clone https://github.com/FilippoBotti/Mask_RCNN.git
```

Then install the requirements:

```bash
pip install -r requirements.txt
```
To download the dataset I suggest following the official guide. However, I ran into trouble with those instructions, so here is another useful link.
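Once the images and the COCO-style annotation file are in place, a quick way to sanity-check them is to open the annotations with pycocotools. The path and file name below are placeholders; use whatever annotation file you downloaded (see the `annotations_file` argument later on).

```python
from pycocotools.coco import COCO

# Placeholder path: adjust to your extracted ModaNet location and annotation file name.
coco = COCO("DATASET_PATH/annotations/instances_train.json")

print(len(coco.getImgIds()), "images")
print([cat["name"] for cat in coco.loadCats(coco.getCatIds())])  # should list the 13 ModaNet labels
```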
For the full list of available arguments, see the corresponding section below.

To train the model from scratch:

```bash
python main.py --mode train --model_name modanet_training --dataset_path DATASET_PATH --checkpoint_path CHECKPOINT_PATH --epochs 40 --pretrained False --manual_seed False
```
I also provide fine-tuning of the Mask R-CNN starting from PyTorch's weights pretrained on the COCO dataset:

```bash
python main.py --mode train --model_name modanet --dataset_path DATASET_PATH --checkpoint_path CHECKPOINT_PATH --epochs 40 --pretrained True --manual_seed False
```

To test the model:

```bash
python main.py --mode test --model_name modanet --dataset_path DATASET_PATH --checkpoint_path CHECKPOINT_PATH
```

To evaluate the model:

```bash
python main.py --mode evaluate --model_name modanet --dataset_path DATASET_PATH --checkpoint_path CHECKPOINT_PATH
```
Parameter | Description |
---|---|
model_name | The name of the model to be saved/loaded |
annotations_file | The name of the annotations file |
epochs | Number of epochs to train |
workers | Number of workers in data loader |
print_every | Print loss every N iterations |
lr | Learning rate |
opt | The optimizer, either SGD or ADAM (ADAM gave better results) |
dataset_path | The dataset directory |
checkpoint_path | Path where to save the trained model |
resume_train | Whether to resume training from a saved checkpoint |
mode | Running mode: train, test, evaluate, or debug |
pretrained | Whether to load COCO pretrained weights or train from scratch |
version | The backbone version to use (V1 or V2) |
cls_accessory | Add the accessory classifier to the network |
manual_seed | Use the same random seed (1) to replicate training results |
coco_evaluation | Use evaluation from COCO standard; if false, Mean Average Precision (mAP) will be used |
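For reference, a minimal argument parser covering a subset of these options could look like the sketch below. This is not the actual `main.py`: the defaults and the `str2bool` helper are assumptions (the commands above pass booleans as the strings `True`/`False`, which a plain `type=bool` would not handle correctly).

```python
import argparse

def str2bool(value: str) -> bool:
    # Accept booleans passed on the command line as strings, e.g. --pretrained False.
    return str(value).lower() in ("true", "1", "yes")

def get_args():
    parser = argparse.ArgumentParser(description="Mask R-CNN on ModaNet (illustrative)")
    parser.add_argument("--mode", choices=["train", "test", "evaluate", "debug"], default="train")
    parser.add_argument("--model_name", type=str, required=True)
    parser.add_argument("--dataset_path", type=str, required=True)
    parser.add_argument("--checkpoint_path", type=str, required=True)
    parser.add_argument("--epochs", type=int, default=40)
    parser.add_argument("--lr", type=float, default=0.005)
    parser.add_argument("--opt", choices=["SGD", "ADAM"], default="ADAM")
    parser.add_argument("--pretrained", type=str2bool, default=False)
    parser.add_argument("--manual_seed", type=str2bool, default=False)
    parser.add_argument("--cls_accessory", type=str2bool, default=False)
    return parser.parse_args()
```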
- Filippo Botti
This project is licensed under the MIT License - see the LICENSE.md file for details
This code is based on Zheng et al., 2018
- Zheng, S., Yang, F., Kiapour, M. H., & Piramuthu, R. (2018). ModaNet: A Large-Scale Street Fashion Dataset with Polygon Annotations. In ACM Multimedia.