Inverse Image Frequency for Long-tailed Image Recognition

Abstract The long-tailed distribution is a common phenomenon in the real world. Extracted large scale image datasets inevitably demonstrate the long-tailed property and models trained with imbalanced data can obtain high performance for the over-represented categories, but struggle for the under-represented categories, leading to biased predictions and performance degradation. To address this challenge, we propose a novel de-biasing method named Inverse Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. Our method achieves stronger performance than similar works and it is especially useful for downstream tasks such as long-tailed instance segmentation as it produces fewer false positive detections. Our extensive experiments show that IIF surpasses the state of the art on many long-tailed benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8 top-1 accuracy with ResNet50 on ImageNet-LT and 26.3 segmentation AP with MaskRCNN ResNet50 on LVIS.
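The multiplicative adjustment described above can be illustrated with a small sketch. It is not taken from this repository: it assumes the per-class weight is the negative log of the class's relative frequency (one plausible variant), and the names `iif_weights` and `apply_iif` and the toy counts are hypothetical.

```python
import math


def iif_weights(class_counts):
    """Per-class weights, assumed here to be -log(relative frequency)."""
    total = sum(class_counts)
    return [-math.log(c / total) for c in class_counts]


def apply_iif(logits, weights):
    """Multiply each class logit by its weight (multiplicative adjustment)."""
    return [z * w for z, w in zip(logits, weights)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


# Toy long-tailed setting: one frequent class and two rarer ones.
counts = [900, 90, 10]
weights = iif_weights(counts)

# Raw logits slightly favour the frequent class (index 0).
logits = [2.0, 1.8, 1.9]
adjusted = apply_iif(logits, weights)
probs = softmax(adjusted)
```

In this toy case rarer classes receive larger weights, so the adjusted prediction moves from the frequent class toward the rarest one. The weighting variants actually used in the repository may differ from this assumption.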

Progress

Training code.
Evaluation code.
LVIS v1.0, ImageNet-LT, Places-LT datasets.
Provide classification checkpoint models.
Provide instance segmentation checkpoint models.

Tested with

python==3.8.12
torch==1.7.1
torchvision==0.8.2
mmdet==2.15.1
lvis
Tested on CUDA 10.1 and 10.0

Please note that there is a reproducibility issue with CUDA 10.2: it drops classification performance by ~5%. For this reason, please use either CUDA 10.1 or CUDA 10.0. Other versions are not tested.
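As a convenience, the CUDA version can be checked before training. The helper below is a hypothetical sketch, not part of this repository; it only compares a version string against the two versions listed above.

```python
# CUDA versions this README reports as tested.
TESTED_CUDA = {"10.0", "10.1"}


def cuda_is_tested(version):
    """Return True if `version` (e.g. "10.1") is one of the tested CUDA versions.

    `version` may be None, which is what PyTorch reports for CPU-only builds.
    """
    return version in TESTED_CUDA
```

With PyTorch installed, the string to pass is `torch.version.cuda`, for example `cuda_is_tested(torch.version.cuda)`.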

Getting Started

Create a virtual environment

conda create --name mmdet pytorch=1.7.1 -y
conda activate mmdet

Install dependency packages

conda install torchvision -y
conda install pandas scipy -y
conda install opencv -y
pip install catalyst
pip install imgaug
pip install randaugment

Install MMDetection

pip install openmim
mim install mmdet==2.15.1

Clone this repo

git clone https://github.com/kostas1515/iif.git
cd iif

Datasets

For COCO and LVIS datasets:

Create a data directory, download the COCO 2017 datasets from https://cocodataset.org/#download (2017 Train images [118K/18GB], 2017 Val images [5K/1GB], 2017 Train/Val annotations [241MB]) and extract the zip files:

mkdir data
cd data
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
unzip train2017.zip
unzip val2017.zip

# download and unzip LVIS annotations
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_train.json.zip
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_val.json.zip
unzip lvis_v1_train.json.zip
unzip lvis_v1_val.json.zip

Modify mmdetection/configs/_base_/datasets/lvis_v1_instance.py and make sure the data_root variable points to the above data directory, e.g., data_root = "<user_path>"

For ImageNet and Places-LT:

Download the ImageNet_2014 and Places_365 datasets.

Citation

 @article{alexandridis2023inverse,
  title={Inverse Image Frequency for Long-tailed Image Recognition},
  author={Alexandridis, Konstantinos Panagiotis and Luo, Shan and Nguyen, Anh and Deng, Jiankang and Zafeiriou, Stefanos},
  journal={IEEE Transactions on Image Processing},
  year={2023},
  publisher={IEEE}
}

Acknowledgements

This code uses the mmdet framework for instance segmentation. For classification, it uses MiSLAS and LDAM. Thank you for your wonderful work!
