SM4Depth

Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model

Yihao Liu · Feng Xue · Anlong Ming* · Mingshuai Zhao · Huadong Ma · Nicu Sebe

(* denotes corresponding author)



News

  • The training code will be released.
  • 2024.10.24: The test code has been released.
  • 2024.03.14: The paper has been released on arXiv.

Introduction

In this paper, we propose SM4Depth, an approach that learns the widely varying depth ranges of diverse scenes captured by different cameras, and that can be applied seamlessly to both indoor and outdoor scenes. Trained on 150K RGB-Depth pairs with various scales, SM4Depth outperforms the state-of-the-art methods on all never-seen-before datasets.

To evaluate the accuracy consistency of monocular metric depth estimation (MMDE) across indoor and outdoor scenes, we propose the BUPT Depth dataset. It consists of 14,932 continuous RGB-Depth pairs captured on the campus of Beijing University of Posts and Telecommunications (BUPT) with a ZED2 stereo camera. It also contains depth maps re-generated by CreStereo and sky masks from ViT-Adapter. The color and depth streams are captured with a focal length of 1091.517 and a baseline of 120.034 mm. More visualizations can be found on the project page.
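For reference, metric depth and stereo disparity are related by depth = focal_length × baseline / disparity. The sketch below is purely illustrative (it is not part of the released code) and assumes the reported focal length is in pixels:

import numpy as np

# ZED2 intrinsics reported above (focal length assumed to be in pixels)
FOCAL_LENGTH_PX = 1091.517
BASELINE_M = 120.034 / 1000.0  # 120.034 mm converted to meters

def disparity_to_depth(disparity_px):
    """Convert a disparity map (in pixels) to metric depth (in meters)."""
    depth = np.zeros_like(disparity_px, dtype=np.float32)
    valid = disparity_px > 0  # zero disparity has no finite depth
    depth[valid] = FOCAL_LENGTH_PX * BASELINE_M / disparity_px[valid]
    return depth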

Environment setup

To train SM4Depth, you need an NVIDIA 3090 (or any GPU with more than 24 GB of memory) and 400 GB of disk space for the training sets (to be released).

# create and activate the conda environment
conda create -n sm4depth python=3.8
conda activate sm4depth
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html  # torch 1.10 also works

# install requirements
pip install opencv-python tensorboardX timm thop scipy h5py
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"

# download the SM4Depth code
cd && git clone https://github.com/1hao-Liu/SM4Depth.git
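After installation, a quick sanity check (a hypothetical snippet, not part of the repository) can confirm that the GPU is visible and the key dependencies import cleanly:

# check PyTorch, CUDA, and the main dependencies
import torch
import torchvision
import timm
import cv2
import h5py

print(f"torch {torch.__version__}, torchvision {torchvision.__version__}")
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))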

Dataset Preparation for Zero-shot Evaluation

All data can be downloaded via RESOURCE.

The resource folder should have the following structure:

* denotes that the archive should be unzipped in place.

 ├── MODELS_ROOT
 │   ├── best_ckpt
 │   └── simmim_finetune__swin_base__img224_window7__800ep.pth
 └── TESTING_SETS_ROOT
     ├── * DDAD.zip
     ├── * ETH3D.zip
     ├── * iBims1.zip
     ├── * kitti_dataset.zip
     ├── * nuScenes
     ├── * nyu_test.zip
     ├── * SUNRGBD.zip
     └── BUPTDepth
         ├── * left.zip
         ├── ...
         └── * crestereo.zip

Extract BUPT Depth's depth maps in meters:

# depth values exceeding 30 m are invalid
import numpy as np
from PIL import Image

# ZED2 depth
depth = Image.open(depth_file)
depthmap = np.asarray(depth, dtype=np.float32) / 256.0

# CreStereo depth
depth = Image.open(depth_file)
depthmap = np.asarray(depth, dtype=np.float32) * 1.2231 / 256.0
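The two cases above can be wrapped in a small convenience helper that also applies the 30 m validity cutoff; load_bupt_depth is a hypothetical function, not part of this repository:

import numpy as np
from PIL import Image

def load_bupt_depth(depth_file, source="zed2", max_depth_m=30.0):
    """Load a BUPT Depth map in meters; source is 'zed2' or 'crestereo'."""
    depth = np.asarray(Image.open(depth_file), dtype=np.float32) / 256.0
    if source == "crestereo":
        depth *= 1.2231  # re-generated CreStereo maps use an extra scale factor
    depth[depth > max_depth_m] = 0.0  # depths beyond 30 m are invalid
    return depth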

Inference (quick start)

Before the first use, download the pre-trained checkpoint from Google Drive or Baidu Netdisk, check the content of configs/test.txt, and fill in your images' info in data_splits/quick_test.txt.

conda activate sm4depth
cd && cd SM4Depth/sm4depth
python test.py ../configs/test.txt

Evaluation

Please check the content of configs/eval.txt before you use it for the first time.

conda activate sm4depth
cd && cd SM4Depth/sm4depth
python eval.py ../configs/eval.txt

# for example: evaluate SM4Depth on the iBims-1 dataset
== Load encoder backbone from: None
== Total number of parameters: 128403284
== Total number of learning parameters: 128403284
== Model Initialized
== Loading checkpoint '/data_root/models/best_ckpt'
== Loaded checkpoint '/data_root/models/best_ckpt'
100%|█████████████████████████████████████████████████████████████████████████████| 100/100 [00:10<00:00,  9.57it/s]
Computing errors for 100 eval samples , post_process:  True
  silog, abs_rel,   log10,     rms,  sq_rel, log_rms,      d1,      d2,      d3
10.4001,  0.1346,  0.0620,  0.6732,  0.1232,  0.1704,  0.7904,  0.9781,  0.9922
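The columns are the standard monocular depth error metrics. As an illustration (not the repository's eval.py), they are typically computed over valid ground-truth pixels as follows:

import numpy as np

def compute_depth_metrics(gt, pred):
    """Standard depth metrics over valid ground-truth pixels."""
    mask = gt > 0
    gt, pred = gt[mask], pred[mask]
    thresh = np.maximum(gt / pred, pred / gt)
    d1, d2, d3 = [(thresh < 1.25 ** k).mean() for k in (1, 2, 3)]
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rms = np.sqrt(np.mean((gt - pred) ** 2))
    log_rms = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    log10 = np.mean(np.abs(np.log10(gt) - np.log10(pred)))
    err = np.log(pred) - np.log(gt)
    silog = np.sqrt(np.mean(err ** 2) - np.mean(err) ** 2) * 100
    return dict(silog=silog, abs_rel=abs_rel, log10=log10, rms=rms,
                sq_rel=sq_rel, log_rms=log_rms, d1=d1, d2=d2, d3=d3)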

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Citation

If you use SM4Depth, please consider citing:

@inproceedings{liu2024sm4depth,
    author    = {Liu, Yihao and Xue, Feng and Ming, Anlong and Zhao, Mingshuai and Ma, Huadong and Sebe, Nicu},
    title     = {SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model},
    booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia (MM '24)},
    year      = {2024},
    publisher = {ACM}
}

Acknowledgments

SM4Depth builds on the code bases of previous works such as NeWCRFs and DANet. If you find SM4Depth useful, please consider citing these works as well.
