The code has been tested on Ubuntu 14.04. The only dependencies are Python packages, which can be installed with pip.

Dependencies

Install CUDA-8.0 and CUDNN6 from NVIDIA.
Install all python dependencies: cv2, h5py, numpy, PIL, pytorch.

sudo apt-get install python-opencv python-pip
sudo pip install h5py numpy Pillow
sudo pip install http://download.pytorch.org/whl/cu80/torch-0.1.12.post2-cp27-none-linux_x86_64.whl
sudo pip install torchvision
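
After installation, a quick import check can confirm that the packages are visible to the interpreter. The snippet below is a minimal sketch and not part of the repository. The helper name find_missing is hypothetical, and the module list assumes the usual import names (cv2 for python-opencv, PIL for Pillow, torch for pytorch). It avoids Python 3 only syntax so that it should also run under the Python 2.7 interpreter targeted by the wheel above.

```python
import importlib

# Import names assumed for the packages installed above.
REQUIRED_MODULES = ["cv2", "h5py", "numpy", "PIL", "torch", "torchvision"]


def find_missing(modules=REQUIRED_MODULES):
    """Return the names in `modules` that fail to import."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


missing = find_missing()
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All required modules were imported successfully.")
```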

Model and Data

Download and uncompress the data from the competition. Put IsoGD_phase_1 and IsoGD_phase_2, which contain the original .avi videos, in dataset. The videos in these folders should follow the layout provided by the competition:

dataset/IsoGD_phase_1/train/001/K_00001.avi
dataset/IsoGD_phase_1/train/001/M_00001.avi
...
dataset/IsoGD_phase_1/valid/001/K_00001.avi
dataset/IsoGD_phase_1/valid/001/M_00001.avi
...
dataset/IsoGD_phase_2/test/001/K_00001.avi
dataset/IsoGD_phase_2/test/001/M_00001.avi
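
The layout above pairs every K_ video with an M_ video of the same number. A small check like the one below can catch incomplete extractions before training starts. It is only a sketch based on the paths listed above: the function name find_unpaired_videos is hypothetical and the repository may perform its own validation.

```python
import os


def find_unpaired_videos(split_dir):
    """Return "<subfolder>/<sample id>" entries that lack a K_ or an M_ video."""
    unpaired = []
    for subdir in sorted(os.listdir(split_dir)):
        folder = os.path.join(split_dir, subdir)
        if not os.path.isdir(folder):
            continue
        names = set(os.listdir(folder))
        # Collect the numeric part of every K_*.avi / M_*.avi file name.
        ids = set(
            n[2:-4] for n in names
            if n[:2] in ("K_", "M_") and n.endswith(".avi")
        )
        for sample_id in sorted(ids):
            has_k = ("K_%s.avi" % sample_id) in names
            has_m = ("M_%s.avi" % sample_id) in names
            if not (has_k and has_m):
                unpaired.append(os.path.join(subdir, sample_id))
    return unpaired
```

For example, find_unpaired_videos("dataset/IsoGD_phase_1/train") should return an empty list when every sample has both files.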

Download and decompress model_data.tar.gz, which contains the pre-trained RGB-D models and the preprocessed pose data.

wget "https://www.dropbox.com/s/plzuw7coomwtkas/model_data.tar.gz?dl=1" -O model_data.tar.gz
tar xf model_data.tar.gz

The model can also be downloaded from: https://pan.baidu.com/s/1miA27qC.

The md5sum for the file is: 39cd8d469bfea471a8e01168aa5889c7
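
The checksum can be verified before unpacking. The sketch below assumes that the md5sum quoted above refers to model_data.tar.gz; the helper md5_of_file is a hypothetical name, not something shipped with the code.

```python
import hashlib

# Checksum quoted above (assumed to be that of model_data.tar.gz).
EXPECTED_MD5 = "39cd8d469bfea471a8e01168aa5889c7"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A download is intact if md5_of_file("model_data.tar.gz") == EXPECTED_MD5 evaluates to True.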

After decompression, the models and the pose data are in two new folders, model and pose_h5, in the root folder of the code.

After all these steps, the code directory should contain:

dataset/IsoGD_phase_1/train
dataset/IsoGD_phase_1/valid
dataset/IsoGD_phase_1/info
logs/test
model/*.model
pose_h5/phase1/*
pose_h5/phase2/*
isogd.py
main_rgbd_fusion.py
main_train_rgbd_c3d.py
rgbd_c3d.py
utils.py
README.md

You are now ready for training or testing.

Testing

We ran the model ourselves; the prediction results used for our submission can be found in pred.

To replicate these results with the pre-trained model, run the commands below.

To evaluate on the validation split (phase1) with the RGB modality:

python main_train_rgbd_c3d.py --resume ./model/focus_rgb_init_depth.model \
--gpu_id 0,1,2,3 --evaluate 1 --eval_split val --modality focus_rgb

To evaluate on the validation split (phase1) with the depth modality:

python main_train_rgbd_c3d.py --resume ./model/focus_depth_init_rgb.model \
--gpu_id 0,1,2,3 --evaluate 1 --eval_split val --modality focus_depth

To evaluate on the test split (phase2) with the RGB modality:

python main_train_rgbd_c3d.py --resume ./model/focus_rgb_init_depth.model \
--gpu_id 0,1,2,3 --evaluate 1 --eval_split test --modality focus_rgb

To evaluate on the test split (phase2) with the depth modality:

python main_train_rgbd_c3d.py --resume ./model/focus_depth_init_rgb.model \
--gpu_id 0,1,2,3 --evaluate 1 --eval_split test --modality focus_depth

After this, the logs/test/score folder will be populated with the raw prediction scores for each validation/test sample.

These scores are used for late fusion, which generates the final prediction file for submission, as described in the next section (Late fusion of RGBD).
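
The four evaluation commands above differ only in the checkpoint, the modality and the split, so they can be generated in a loop. The following is a sketch that reuses the flags and paths exactly as they appear above. The helpers build_eval_command and run_all are hypothetical names and not part of the repository.

```python
import subprocess

# Checkpoint used for each modality, copied from the commands above.
CHECKPOINTS = {
    "focus_rgb": "./model/focus_rgb_init_depth.model",
    "focus_depth": "./model/focus_depth_init_rgb.model",
}


def build_eval_command(modality, split, gpu_id="0,1,2,3"):
    """Assemble one evaluation command as an argument list."""
    return [
        "python", "main_train_rgbd_c3d.py",
        "--resume", CHECKPOINTS[modality],
        "--gpu_id", gpu_id,
        "--evaluate", "1",
        "--eval_split", split,
        "--modality", modality,
    ]


def run_all(dry_run=True):
    """Print every evaluation command; execute it as well unless dry_run is set."""
    for split in ("val", "test"):
        for modality in ("focus_rgb", "focus_depth"):
            command = build_eval_command(modality, split)
            print(" ".join(command))
            if not dry_run:
                subprocess.check_call(command)
```

Calling run_all() only prints the commands; run_all(dry_run=False) executes them one after another from the root folder of the code.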

Late fusion of RGBD

Prediction scores from the RGB and depth modalities are fused with a weight of 0.5 each.

For validation split (phase1):

python main_rgbd_fusion.py --score1 logs/test/score/focus_rgb_init_depth_val_score.txt \
 --score2 logs/test/score/focus_depth_init_rgb_val_score.txt --eval_split val

For test split (phase2):

python main_rgbd_fusion.py --score1 logs/test/score/focus_rgb_init_depth_test_score.txt \
 --score2 logs/test/score/focus_depth_init_rgb_test_score.txt --eval_split test

The prediction file in the required format will be located in ./logs/test/pred.
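
To illustrate what fusing with weight 0.5 means, the sketch below averages per-class scores from the two modalities and picks the highest-scoring class for each sample. It is not the code of main_rgbd_fusion.py: the score file format is not described here, so the inputs are assumed to be plain lists with one row of class scores per sample, and the function name late_fusion is hypothetical.

```python
def late_fusion(rgb_scores, depth_scores, rgb_weight=0.5):
    """Fuse per-class scores of two modalities; return one class index per sample."""
    depth_weight = 1.0 - rgb_weight
    predictions = []
    for rgb_row, depth_row in zip(rgb_scores, depth_scores):
        # Weighted sum of the two score vectors for this sample.
        fused = [rgb_weight * r + depth_weight * d
                 for r, d in zip(rgb_row, depth_row)]
        # Index of the class with the largest fused score.
        predictions.append(max(range(len(fused)), key=fused.__getitem__))
    return predictions


# Toy scores for two samples and three classes (made-up numbers).
rgb = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
depth = [[0.1, 0.7, 0.2], [0.1, 0.2, 0.7]]
print(late_fusion(rgb, depth))  # prints [1, 2]
```

In the toy example the RGB scores alone would pick class 0 for the first sample, but the fused scores favor class 1 because the depth scores support it strongly.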

Training

A two-phase training procedure is adopted for the RGB and depth models. Each phase takes 6-8 hours per modality on 4 x Titan X (Maxwell).

To train the models, follow the steps below:

Train RGB model initialized from C3D:

python main_train_rgbd_c3d.py --gpu_id 0,1,2,3 --modality focus_rgb

Train Depth model initialized from C3D:

python main_train_rgbd_c3d.py --gpu_id 0,1,2,3 --modality focus_depth

Finetune RGB model initialized with depth C3D:

python main_train_rgbd_c3d.py --gpu_id 0,1,2,3 --modality focus_rgb --resume ./model/focus_depth.model

Finetune depth model initialized with RGB C3D:

python main_train_rgbd_c3d.py --gpu_id 0,1,2,3 --modality focus_depth --resume ./model/focus_rgb.model

After the training, the final models should be found at:

model/focus_rgb_init_depth.model
model/focus_depth_init_rgb.model

Performance

Training time on 4 x Titan X (Maxwell) with batch size 32: 1.868s
Testing time on 4 x Titan X (Maxwell) with batch size 32: 1.093s
Accuracy on validation with RGBD late fusion (0.5 RGB + 0.5 D): 0.6215
Accuracy on test with RGBD late fusion (0.5 RGB + 0.5 D): [TBD]

lolistoy/isogd
