[AAAI-24] VVS: Video-to-Video Retrieval With Irrelevant Frame Suppression [Project Page]
Official PyTorch implementation of VVS: Video-to-Video Retrieval With Irrelevant Frame Suppression
Paper: Video-to-Video Retrieval With Irrelevant Frame Suppression
- For fast verification, follow the simple evaluation protocol below.
- The fast evaluation of VVS on FIVR5K takes 3 steps:
  - Download the data from the Google Drive link.
  - Place the data as follows:
    - Place pca.pkl inside the VVS/data/vcdb folder.
    - Place fivr5k_resnet50_l4imac inside the VVS/features folder.
    - Place table_benchmark_dim_3840 inside the VVS/jobs folder.
  - Run the command to evaluate VVS on FIVR5K:
    bash experiments/review/fast_evaluation_fivr5k.sh
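Before launching the fast evaluation, it can help to confirm that the three downloaded items are where the script expects them. The following is a minimal sketch, not part of the repository: the function name `missing_fast_eval_items` is hypothetical, and the relative paths are simply the ones listed in the placement steps above.

```python
from pathlib import Path

# Relative paths taken from the placement steps above. The repository root
# (the VVS folder) is supplied by the caller.
FAST_EVAL_ITEMS = (
    "data/vcdb/pca.pkl",
    "features/fivr5k_resnet50_l4imac",
    "jobs/table_benchmark_dim_3840",
)


def missing_fast_eval_items(repo_root):
    """Return the fast-evaluation items that do not exist under repo_root."""
    root = Path(repo_root)
    return [item for item in FAST_EVAL_ITEMS if not (root / item).exists()]
```

An empty list from `missing_fast_eval_items("VVS")` suggests the data is in place; any returned path still needs to be downloaded or moved.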
- Download the raw video dataset you want. The supported options are:
- Contact the author about any videos that turn out to be missing during the download.
- Preparing the raw videos is not essential, since we provide the features we used.
- If you do use raw videos, arrange them in the structure below.
    ├── videos
        ├── fivr
            └── videos
                ├── video_1
                ├── video_2
                └── ...
        ├── cc_web
            └── videos
                ├── video_1
                ├── video_2
                └── ...
        ├── evve
            └── videos
                ├── video_1
                ├── video_2
                └── ...
- For convenience, we provide the features we used. You can find them here.
- Before running, place the features inside the VVS/features folder.
    ├── features
        └── vcdb_resnet50_l4imac
            ├── features
                ├── feat_1
                ├── feat_2
                └── ...
        └── fivr_resnet50_l4imac
            ├── features
                ├── feat_1
                ├── feat_2
                └── ...
        └── cc_web_resnet50_l4imac
            ├── features
                ├── feat_1
                ├── feat_2
                └── ...
        └── evve_resnet50_l4imac
            ├── features
                ├── feat_1
                ├── feat_2
                └── ...
- OS : Ubuntu 18.04
- CUDA : 10.2
- Python : 3.7
- PyTorch : 1.8.1, Torchvision : 0.9.1
- GPU : NVIDIA Tesla V100 (32G)
Required packages are listed in environment.yaml. You can install them by running:
conda env create -f environment.yaml
conda activate VVS
If your GPU only supports CUDA 11.0 or above, install the environment by running:
conda env create -f environment_cuda11.yaml
conda activate VVS
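The choice between the two environment files depends only on the CUDA version your GPU supports. As an illustration, the rule can be written as a small helper. `pick_env_file` is a hypothetical name, and the sketch assumes the version is given as a string such as "10.2" or "11.3".

```python
def pick_env_file(cuda_version):
    """Choose the conda environment file for a CUDA version string like "10.2"."""
    major = int(cuda_version.split(".")[0])
    # CUDA 11.0 and above use environment_cuda11.yaml; older versions use
    # the default environment.yaml.
    return "environment_cuda11.yaml" if major >= 11 else "environment.yaml"
```

The returned file name is the one to pass to `conda env create -f`.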
- Before running, place pca.pkl inside the VVS/data/vcdb folder, or calculate the PCA weights directly by running python cal_pca.py.
- You can easily evaluate the model by running the provided script.
Please follow the instructions in README.md for training and evaluation
We provide checkpoints to successfully reproduce our benchmark experiments.
- You can run the script according to the feature dimension.
Dataset | script |
---|---|
FIVR5K | $ bash experiments/main_script/train/table_benchmark/eval_benchmark_fivr5k_dim_{dim}.sh |
FIVR200K | $ bash experiments/main_script/train/table_benchmark/eval_benchmark_fivr200k_dim_{dim}.sh |
CC_WEB_VIDEO | $ bash experiments/main_script/train/table_benchmark/eval_benchmark_cc_web_dim_{dim}.sh |
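The script names in the table follow one pattern, with {dim} replaced by the feature dimension. A minimal sketch of filling in that pattern is shown below. `eval_script_path` is a hypothetical helper, and it does not check that a script exists for a given dimension; the dimensions reported in the results tables are 500, 512, 1024, and 3840, and whether a script exists for each of them is an assumption.

```python
# Directory and file-name pattern copied from the table above.
SCRIPT_DIR = "experiments/main_script/train/table_benchmark"

# Dataset names from the table mapped to the key used in the script name.
DATASET_KEYS = {
    "FIVR5K": "fivr5k",
    "FIVR200K": "fivr200k",
    "CC_WEB_VIDEO": "cc_web",
}


def eval_script_path(dataset, dim):
    """Build the evaluation script path for a dataset and feature dimension."""
    key = DATASET_KEYS[dataset]  # raises KeyError for datasets not in the table
    return f"{SCRIPT_DIR}/eval_benchmark_{key}_dim_{int(dim)}.sh"
```

The result can be run with bash from the repository root, for example `bash` followed by the value of `eval_script_path("FIVR5K", 3840)`.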
Usage | Method | Train dataset | DSVR | CSVR | ISVR |
---|---|---|---|---|---|
frame | TN | VCDB | 0.724 | 0.699 | 0.589 |
frame | DP | VCDB | 0.775 | 0.740 | 0.632 |
frame | TCAsym | VCDB | 0.728 | 0.698 | 0.592 |
frame | TCAf | VCDB | 0.877 | 0.830 | 0.703 |
frame | SCFV+NIP256 | VCDB | 0.819 | 0.764 | 0.622 |
frame | SCFV+TNIP256 | VCDB | 0.896 | 0.833 | 0.674 |
frame | ViSiLsym | VCDB | 0.833 | 0.792 | 0.654 |
frame | ViSiLf | VCDB | 0.843 | 0.797 | 0.660 |
frame | ViSiLv | VCDB | 0.892 | 0.841 | 0.702 |
frame | DnS(SfA) | DnS-100K | 0.921 | 0.875 | 0.741 |
video | HC | VCDB | 0.265 | 0.247 | 0.193 |
video | DML | VCDB | 0.398 | 0.378 | 0.309 |
video | TMK | VCDB | 0.417 | 0.394 | 0.319 |
video | LAMV | VCDB | 0.489 | 0.459 | 0.364 |
video | VRAG | VCDB | 0.484 | 0.470 | 0.399 |
video | TCAc | VCDB | 0.570 | 0.553 | 0.473 |
video | DnS(Sc) | DnS-100K | 0.574 | 0.558 | 0.476 |
video | VVS500(Ours) | VCDB | 0.606 | 0.588 | 0.502 |
video | VVS512(Ours) | VCDB | 0.608 | 0.590 | 0.505 |
video | VVS1024(Ours) | VCDB | 0.645 | 0.627 | 0.536 |
video | VVS3840(Ours) | VCDB | 0.711 | 0.689 | 0.590 |
Usage | Method | Train dataset | cc_web | cc_web* | cc_webc | cc_webc* |
---|---|---|---|---|---|---|
frame | TN | VCDB | 0.978 | 0.965 | 0.991 | 0.987 |
frame | DP | VCDB | 0.975 | 0.958 | 0.990 | 0.982 |
frame | CTE | VCDB | 0.996 | - | - | - |
frame | TCAsym | VCDB | 0.982 | 0.962 | 0.992 | 0.981 |
frame | TCAf | VCDB | 0.983 | 0.969 | 0.994 | 0.990 |
frame | SCFV+NIP256 | VCDB | 0.973 | 0.953 | 0.976 | 0.959 |
frame | SCFV+TNIP256 | VCDB | 0.978 | 0.969 | 0.983 | 0.975 |
frame | ViSiLsym | VCDB | 0.982 | 0.969 | 0.991 | 0.988 |
frame | ViSiLf | VCDB | 0.984 | 0.969 | 0.993 | 0.987 |
frame | ViSiLv | VCDB | 0.985 | 0.971 | 0.996 | 0.993 |
frame | DnS(SfA) | DnS-100K | 0.984 | 0.973 | 0.995 | 0.992 |
video | HC | VCDB | 0.958 | - | - | - |
video | DML | VCDB | 0.971 | 0.941 | 0.979 | 0.959 |
video | VRAG | VCDB | 0.971 | 0.952 | 0.980 | 0.967 |
video | TCAc | VCDB | 0.973 | 0.947 | 0.983 | 0.965 |
video | DnS(Sc) | DnS-100K | 0.972 | 0.952 | 0.980 | 0.967 |
video | VVS500(Ours) | VCDB | 0.973 | 0.952 | 0.981 | 0.966 |
video | VVS512(Ours) | VCDB | 0.973 | 0.952 | 0.981 | 0.967 |
video | VVS1024(Ours) | VCDB | 0.973 | 0.952 | 0.982 | 0.969 |
video | VVS3840(Ours) | VCDB | 0.975 | 0.955 | 0.984 | 0.973 |
We referenced the repos below for the code.
If you have any questions or comments, please open an issue.