PyTorch implementation of the paper:
Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval.
Minjoon Jung, Seongho Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang
This project is implemented with PyTorch and Anaconda.
git clone https://github.com/minjoong507/MPGN.git
Download the original feature files. Also, please check here for how to generate the pseudo supervision.
Our environment:
- Python 3.9
- PyTorch 1.13.1
- CUDA 12.0
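As a quick sanity check of this setup, you can verify the installed versions from Python. This is a minimal sketch, not part of the repo; the exact CUDA build string depends on your install:

import torch

# Sanity-check the environment listed above.
print(torch.__version__)           # expect 1.13.1
print(torch.version.cuda)          # CUDA build PyTorch was compiled against
print(torch.cuda.is_available())   # True if a GPU is visible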
You can also run our code under the same environment as TVR.
We provide the code for training Cross-modal Moment Localization (XML). To train the model with the pseudo queries, add the --training_w_pseudo_supervision flag, and use --training_strategy to choose the type of pseudo queries (visual, textual, or aug). aug refers to using both types of pseudo queries; a sketch of this selection logic follows the training command below.
bash baselines/crossmodal_moment_localization/scripts/train.sh \
tvr video_sub resnet_i3d \
--exp_id test_run \
--training_w_pseudo_supervision \
--training_strategy aug
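For intuition, the strategy flag could gate which pseudo queries are used for training roughly as follows. This is a hypothetical sketch; the function and variable names are illustrations, not the repo's actual API:

def select_pseudo_queries(visual_queries, textual_queries, strategy="aug"):
    # Hypothetical helper: pick the pseudo queries for one training run.
    if strategy == "visual":
        return visual_queries
    if strategy == "textual":
        return textual_queries
    if strategy == "aug":
        # "aug" uses both modal-specific pseudo query types together.
        return visual_queries + textual_queries
    raise ValueError(f"Unknown training strategy: {strategy}")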
If our project is useful to your research, please consider citing our paper:
@inproceedings{jung2022modal,
title={Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval},
author={Minjoon Jung and Seongho Choi and Joochan Kim and Jin-Hwa Kim and Byoung-Tak Zhang},
booktitle={EMNLP},
year={2022}
}
Our project builds on the code of TVR. We thank the authors for sharing their great work.