BELM: High-quality Exact Inversion sampler of Diffusion Models 🏆

This repository is the official implementation of the NeurIPS 2024 paper: "BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models"

Keywords: Diffusion Model, Exact Inversion, ODE Solver

Fangyikang Wang¹, Hubery Yin², Yuejiang Dong³, Huminhao Zhu¹,
Chao Zhang¹, Hanbin Zhao¹, Hui Qian¹, Chen Li²

¹Zhejiang University ²WeChat, Tencent Inc. ³Tsinghua University

🆕 What's New?

🔥 We use the idea of a bidirectional explicit constraint to enable exact inversion

Schematic description of DDIM (left) and BELM (right).

DDIM uses $\mathbf{x}_i$ and $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i)$ to calculate $\mathbf{x}_{i-1}$, based on a linear relation between $\mathbf{x}_i$, $\mathbf{x}_{i-1}$ and $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i)$ (the blue line). DDIM inversion, however, uses $\mathbf{x}_{i-1}$ and $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i-1},i-1)$ to calculate $\mathbf{x}_{i}$, based on a different linear relation (the red line). This mismatch is why DDIM inversion is inexact.

In contrast, BELM establishes a single linear relation between $\mathbf{x}_{i-1}$, $\mathbf{x}_i$, $\mathbf{x}_{i+1}$ and $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i}, i)$ (the green line). Both BELM and its inversion are derived from this one relation, which makes the inversion exact. Specifically, BELM uses a linear combination of $\mathbf{x}_i$, $\mathbf{x}_{i+1}$ and $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i},i)$ to calculate $\mathbf{x}_{i-1}$, while the BELM inversion uses a linear combination of $\mathbf{x}_{i-1}$, $\mathbf{x}_i$ and $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i},i)$ to calculate $\mathbf{x}_{i+1}$. The bidirectional explicit constraint means that this linear relation excludes the derivatives at the two endpoints, namely $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i-1},i-1)$ and $\boldsymbol{\varepsilon}_\theta(\mathbf{x}_{i+1},i+1)$.
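The DDIM mismatch described here can be reproduced on a toy problem. The sketch below is illustrative only: the schedule (`alpha`, `sigma_bar`) and the stand-in noise predictor `eps_model` are hypothetical choices made for this example, not part of this repository, and the DDIM step is written in the $\sigma/\alpha$ parameterization under the assumption that $\mathbf{x}_i = \alpha_i \mathbf{x}_0 + \sigma_i \boldsymbol{\varepsilon}$.

```python
import math

# Hypothetical toy schedule: alpha_i = cos(theta_i), sigma_i = sin(theta_i).
N = 10
theta = [0.1 + 1.2 * i / N for i in range(N + 1)]
alpha = [math.cos(t) for t in theta]
sigma_bar = [math.tan(t) for t in theta]  # sigma_i / alpha_i


def eps_model(x, i):
    """Stand-in for eps_theta(x_i, i); any state-dependent function would do."""
    return math.tanh(x) + 0.1 * i


def ddim_step(x_i, i):
    """x_i -> x_{i-1}, evaluating the predictor at (x_i, i)."""
    h = sigma_bar[i] - sigma_bar[i - 1]
    return alpha[i - 1] / alpha[i] * x_i - alpha[i - 1] * h * eps_model(x_i, i)


def ddim_inversion_step(x_prev, i):
    """x_{i-1} -> x_i, evaluating the predictor at (x_{i-1}, i-1) instead."""
    h = sigma_bar[i] - sigma_bar[i - 1]
    return alpha[i] / alpha[i - 1] * x_prev + alpha[i] * h * eps_model(x_prev, i - 1)


i = 6
x_i = 0.5
x_prev = ddim_step(x_i, i)
x_i_recovered = ddim_inversion_step(x_prev, i)

# The two steps evaluate the predictor at different points, so the
# round trip does not return to the starting value.
roundtrip_error = abs(x_i_recovered - x_i)
print(roundtrip_error)
```

The residual equals $\alpha_i h_i$ times the difference between the two predictor evaluations, so it vanishes only if the predictor returns the same value at $(\mathbf{x}_{i-1}, i-1)$ and $(\mathbf{x}_i, i)$.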

🔥 We introduce a generic formulation of the exact inversion samplers, BELM.

The general k-step BELM:

$$\bar{\mathbf{x}}_{i-1} = \sum_{j=1}^{k} a_{i,j}\cdot \bar{\mathbf{x}}_{i-1+j} +\sum_{j=1}^{k-1}b_{i,j}\cdot h_{i-1+j}\cdot\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}_{i-1+j},\bar{\sigma}_{i-1+j}).$$

2-step BELM:

$$\bar{\mathbf{x}}_{i-1} = a_{i,2}\bar{\mathbf{x}}_{i+1} +a_{i,1}\bar{\mathbf{x}}_{i} + b_{i,1} h_i\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}_i,\bar{\sigma}_i).$$
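Since this relation is linear in $\bar{\mathbf{x}}_{i+1}$, the corresponding inversion step follows by solving the same equation for that term. As a worked rearrangement, assuming $a_{i,2}\neq 0$:

$$\bar{\mathbf{x}}_{i+1} = \frac{1}{a_{i,2}}\bar{\mathbf{x}}_{i-1} - \frac{a_{i,1}}{a_{i,2}}\bar{\mathbf{x}}_{i} - \frac{b_{i,1}}{a_{i,2}} h_i\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}_i,\bar{\sigma}_i).$$

The inversion introduces no new network evaluation: it reuses $\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}_i,\bar{\sigma}_i)$, the same quantity that appears in the sampling step.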

🔥 We derive the optimal coefficients for BELM via LTE minimization.

Proposition. The local truncation error (LTE) $\tau_i$ of the BELM diffusion sampler, given by $\tau_i = \bar{\mathbf{x}}(t_{i-1}) - a_{i,2}\bar{\mathbf{x}}(t_{i+1}) -a_{i,1}\bar{\mathbf{x}}(t_{i}) - b_{i,1} h_i\bar{\boldsymbol{\varepsilon}}_\theta(\bar{\mathbf{x}}(t_i),\bar{\sigma}_i)$, is accurate up to $\mathcal{O}\left({(h_{i}+h_{i+1})}^3\right)$ when the coefficients are chosen as $a_{i,1} = \frac{h_{i+1}^2 - h_i^2}{h_{i+1}^2}$, $a_{i,2}=\frac{h_i^2}{h_{i+1}^2}$, $b_{i,1}=- \frac{h_i+h_{i+1}}{h_{i+1}}$,

where $h_i = \frac{\sigma_i}{\alpha_i}-\frac{\sigma_{i-1}}{\alpha_{i-1}}$.
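One way to sanity-check these coefficients is to Taylor-expand $\bar{\mathbf{x}}(t_{i\pm1})$ around $t_i$ and require the constant, first-order and second-order terms of $\tau_i$ to cancel. The sketch below evaluates those three residuals in exact rational arithmetic. The function names are hypothetical helpers written for this example, and the order conditions are a reading of the proposition, assuming $\bar{\boldsymbol{\varepsilon}}_\theta$ plays the role of the derivative of $\bar{\mathbf{x}}$ with respect to $\bar{\sigma}$.

```python
from fractions import Fraction


def obelm_coefficients(h_i, h_next):
    """Return (a_{i,1}, a_{i,2}, b_{i,1}) for step sizes h_i and h_{i+1}."""
    a_i1 = (h_next**2 - h_i**2) / h_next**2
    a_i2 = h_i**2 / h_next**2
    b_i1 = -(h_i + h_next) / h_next
    return a_i1, a_i2, b_i1


def order_condition_residuals(h_i, h_next):
    """Residuals of the Taylor terms of tau_i; all three should be zero."""
    a_i1, a_i2, b_i1 = obelm_coefficients(h_i, h_next)
    zeroth = 1 - a_i1 - a_i2                      # coefficient of x(t_i)
    first = -h_i - a_i2 * h_next - b_i1 * h_i     # coefficient of x'(t_i)
    second = h_i**2 / 2 - a_i2 * h_next**2 / 2    # coefficient of x''(t_i)
    return zeroth, first, second


# Arbitrary non-uniform step sizes, kept as exact fractions.
h_i, h_next = Fraction(3, 10), Fraction(1, 2)
coefficients = obelm_coefficients(h_i, h_next)
residuals = order_condition_residuals(h_i, h_next)
print(coefficients)
print(residuals)
```

With all three residuals equal to zero, the leading remaining term of $\tau_i$ is cubic in the step sizes, which matches the order stated in the proposition.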

The Optimal-BELM (O-BELM) sampler:

$$\mathbf{x}_{i-1} = \frac{h_i^2}{h_{i+1}^2}\frac{\alpha_{i-1}}{\alpha_{i+1}}\mathbf{x}_{i+1} +\frac{h_{i+1}^2 - h_i^2}{h_{i+1}^2}\frac{\alpha_{i-1}}{\alpha_{i}}\mathbf{x}_{i} - \frac{h_i(h_i+h_{i+1})}{h_{i+1}}\alpha_{i-1}\boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i).$$

The inversion of the O-BELM diffusion sampler reads:

$$\mathbf{x}_{i+1}= \frac{h_{i+1}^2}{h_i^2}\frac{\alpha_{i+1}}{\alpha_{i-1}}\mathbf{x}_{i-1} + \frac{h_i^2-h_{i+1}^2}{h_i^2}\frac{\alpha_{i+1}}{\alpha_{i}}\mathbf{x}_{i}+\frac{h_{i+1}(h_i+h_{i+1})}{h_i}\alpha_{i+1} \boldsymbol{\varepsilon}_\theta(\mathbf{x}_i,i).$$
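The two update rules above can be checked against each other on a toy problem. The following is a minimal sketch, not the implementation in `samplers`: the schedule, the stand-in predictor `eps_model`, and the arbitrary starting pair $(\mathbf{x}_N, \mathbf{x}_{N-1})$ are all assumptions made for this example. It samples down to $\mathbf{x}_0$, then inverts starting from the pair $(\mathbf{x}_0, \mathbf{x}_1)$ and compares the recovered trajectory with the original one.

```python
import math

# Hypothetical toy schedule: alpha_i = cos(theta_i), sigma_i / alpha_i = tan(theta_i).
N = 10
theta = [0.1 + 1.2 * i / N for i in range(N + 1)]
alpha = [math.cos(t) for t in theta]
sigma_bar = [math.tan(t) for t in theta]
# h[i] = sigma_i/alpha_i - sigma_{i-1}/alpha_{i-1}; h[0] is a placeholder and is never used.
h = [0.0] + [sigma_bar[i] - sigma_bar[i - 1] for i in range(1, N + 1)]


def eps_model(x, i):
    """Stand-in for eps_theta(x_i, i)."""
    return math.tanh(x) + 0.1 * i


def obelm_step(x_next, x_i, i):
    """(x_{i+1}, x_i) -> x_{i-1}, following the O-BELM sampler formula."""
    hi, hn = h[i], h[i + 1]
    return (hi**2 / hn**2 * alpha[i - 1] / alpha[i + 1] * x_next
            + (hn**2 - hi**2) / hn**2 * alpha[i - 1] / alpha[i] * x_i
            - hi * (hi + hn) / hn * alpha[i - 1] * eps_model(x_i, i))


def obelm_inversion_step(x_prev, x_i, i):
    """(x_{i-1}, x_i) -> x_{i+1}, following the O-BELM inversion formula."""
    hi, hn = h[i], h[i + 1]
    return (hn**2 / hi**2 * alpha[i + 1] / alpha[i - 1] * x_prev
            + (hi**2 - hn**2) / hi**2 * alpha[i + 1] / alpha[i] * x_i
            + hn * (hi + hn) / hi * alpha[i + 1] * eps_model(x_i, i))


# Sampling: start from an arbitrary pair (x_N, x_{N-1}) and run down to x_0.
xs = [None] * (N + 1)
xs[N], xs[N - 1] = 1.5, 1.3
for i in range(N - 1, 0, -1):
    xs[i - 1] = obelm_step(xs[i + 1], xs[i], i)

# Inversion: start from the pair (x_0, x_1) and run back up to x_N.
rec = [None] * (N + 1)
rec[0], rec[1] = xs[0], xs[1]
for i in range(1, N):
    rec[i + 1] = obelm_inversion_step(rec[i - 1], rec[i], i)

max_error = max(abs(r - x) for r, x in zip(rec, xs))
print(max_error)
```

Both steps evaluate the predictor at the same point $(\mathbf{x}_i, i)$, so any remaining discrepancy in this sketch comes from floating-point round-off rather than from the scheme itself. Note that the inversion needs two consecutive states to start, which is why the sketch keeps both $\mathbf{x}_0$ and $\mathbf{x}_1$.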

👨🏻‍💻 Run the code

1) Get started

Python 3.8.12
CUDA 11.7
NVIDIA A100 40GB PCIe
Torch 2.0.0
Torchvision 0.14.0

Please follow the official diffusers installation instructions to install diffusers.

2) Run

First, switch to the repository root directory.

CIFAR10 sampling

python3 ./scripts/cifar10.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10

CelebA-HQ sampling

python3 ./scripts/celeba.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10

FID evaluation

python3 ./scripts/celeba.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10

Interpolation

python3 ./scripts/interpolate.py --test_num 10 --batch_size 1 --num_inference_steps 100 --save_dir YOUR/SAVE/DIR --model_id xx

Reconstruction error calculation

python3 ./scripts/reconstruction.py --test_num 10 --num_inference_steps 100 --directory WHERE/YOUR/IMAGES/ARE --sampler_type belm

Image editing

python3 ./scripts/image_editing.py --num_inference_steps 200 --freeze_step 50 --guidance 2.0  --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxxxx/stable-diffusion-v1-5 --ori_im_path images/imagenet_dog_1.jpg --ori_prompt 'A dog' --res_prompt 'A Dalmatian'

🪪 License

This project is licensed under the MIT License - see the LICENSE file for details.

📝 Citation

If our work assists your research, feel free to give us a star ⭐ or cite us using:

@article{wang2024belm,
  title={BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models},
  author={Wang, Fangyikang and Yin, Hubery and Dong, Yuejiang and Zhu, Huminhao and Zhang, Chao and Zhao, Hanbin and Qian, Hui and Li, Chen},
  journal={arXiv preprint arXiv:2410.07273},
  year={2024}
}

📩 Contact me

My e-mail address:

wangfangyikang@zju.edu.cn
