E3DGE is an encoder-based 3D GAN inversion framework that yields high-quality shape and texture reconstruction.
Teaser (left to right): Input | Inversion | Editing (-Smile) | Editing (+Smile) | Toonify
For more visual results, please check out our project page 📃
This repository contains the official implementation of E3DGE: Self-supervised Geometry-Aware Encoder for Style-based 3D GAN Inversion.
[06/2023] Inference and training code on FFHQ with the StyleSDF base model is released, including a Colab demo.
[03/2023] E3DGE is accepted to CVPR 2023 🥳!
- Release the inference and training code.
- Release the Colab demo.
- Release the Hugging Face demo.
- Release pre-trained models using EG3D as the base model.
- Release video inversion code.
If you find our work useful for your research, please consider citing the paper:
@inproceedings{lan2022e3dge,
title={E3DGE: Self-Supervised Geometry-Aware Encoder for Style-based 3D GAN Inversion},
author={Lan, Yushi and Meng, Xuyi and Yang, Shuai and Loy, Chen Change and Dai, Bo},
booktitle={Computer Vision and Pattern Recognition (CVPR)},
year={2023}
}
NVIDIA GPUs are required for this project. We have tested the inference code on NVIDIA T4 and NVIDIA V100 GPUs. The training code has been tested on NVIDIA V100 (32GB) GPUs. We recommend using Anaconda to manage the Python environment.
conda create --name e3dge python=3.8
conda activate e3dge
conda install -c conda-forge ffmpeg
conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=10.2 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
pip install -r requirements.txt
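As an optional sanity check (not part of the official setup), the short snippet below verifies that the versions installed above are picked up and that CUDA is visible:

```python
# Optional sanity check for the environment created above.
import torch
import torchvision

print("torch:", torch.__version__)              # expected: 1.9.0 (cu102 build)
print("torchvision:", torchvision.__version__)  # expected: 0.10.0
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```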
The pretrained 3D generators and encoders are needed for inference.
The following scripts download the pretrained models and the test dataset.
python download_models.py # download pre-trained models
python download_test_data.py # download preprocessed CelebA-HQ test set
Note that rendering the mesh requires 15GB of GPU memory.
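If you are unsure whether your card has enough memory, a quick check along these lines can help (an optional snippet of ours, not part of the released scripts):

```python
# Optional: check total GPU memory against the ~15GB needed for mesh rendering.
import torch

assert torch.cuda.is_available(), "No CUDA device found."
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"GPU 0 total memory: {total_gb:.1f} GB")
if total_gb < 15:
    print("Warning: mesh rendering needs about 15GB of GPU memory and may fail on this device.")
```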
Perform novel view synthesis given 2D images (on some demo images):
bash scripts/test/demo_view_synthesis.sh
Conduct semantic editing (on some demo images):
bash scripts/test/demo_editing.sh
3D toonification with our pre-trained encoder:
bash scripts/test/demo_toonify.sh
Reproduce the results in Table 1 (quantitative performance on CelebA-HQ):
bash scripts/test/eval_2dmetrics_ffhq.sh
More explanation of the inference scripts is included in scripts/test/RUN.md.
For all the experiments, we use 4 V100 GPUs by default.
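Before launching the training scripts, you can optionally confirm that four GPUs are visible to PyTorch (a small check of ours, not part of the released code):

```python
# Optional: confirm that the default 4-GPU setup is visible to PyTorch.
import torch

num_gpus = torch.cuda.device_count()
print(f"Visible GPUs: {num_gpus}")
if num_gpus < 4:
    print("Fewer than 4 GPUs detected; adjust the training scripts or CUDA_VISIBLE_DEVICES accordingly.")
```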
python download_models.py # download pre-trained models
python download_datasets.py # download preprocessed CelebA-HQ test set
Stage 1 training (Sec. 1, self-supervision for plausible shape inversion):
bash scripts/train/ffhq/stage1.sh
Stage 2.1 training (Sec. 2+3, local feature fusion for high-fidelity inversion; uses 2D alignment only):
bash scripts/train/ffhq/stage2.1.sh
Stage 2.2 training (Sec. 3, hybrid feature alignment for high-quality editing):
# update the ffhq/afhq dataset path in the bash script if adv_lambda != 0 (enables adversarial training)
bash scripts/train/ffhq/stage2.2.sh
Intermediate test results will be saved under ${checkpoints_dir} every 2000 iterations, and the training results will be saved every 100 iterations.
For the training results, from left to right are the (synthetic) ground-truth images, the E0 reconstruction (64x64 resolution), the residual, the aligned residual, and the E1 reconstructions (both the thumbnail image and the SR image).
For the test results, the first row shows the real ground-truth images and the second row shows the texture inversion results.
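To quickly pull up the most recently written intermediate images without browsing the directory by hand, a small helper like the one below can be used (a sketch that assumes results are saved as .png/.jpg files somewhere below ${checkpoints_dir}; adapt the patterns to the actual layout):

```python
# Sketch: list the most recently written result images under the checkpoint directory.
# Assumption: intermediate results are saved as .png/.jpg files below ${checkpoints_dir}.
import glob
import os
import sys

checkpoints_dir = sys.argv[1] if len(sys.argv) > 1 else "checkpoints"
images = glob.glob(os.path.join(checkpoints_dir, "**", "*.png"), recursive=True)
images += glob.glob(os.path.join(checkpoints_dir, "**", "*.jpg"), recursive=True)
for path in sorted(images, key=os.path.getmtime, reverse=True)[:5]:
    print(path)
```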
To run inference with the trained models, please refer to the Inference section.
Support for more datasets coming soon...
We have uploaded the Python script used to export the demo video here: gallary_video.py. You can modify the video paths and use it in your own project.
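For reference, the core of such an export step usually amounts to reading the per-sample clips and writing them into one gallery video. The sketch below is our own illustration (assuming imageio with the imageio-ffmpeg backend, and clips that share the same resolution and frame rate), not the released gallary_video.py itself:

```python
# Illustrative sketch only, NOT the released gallary_video.py.
# Assumptions: imageio + imageio-ffmpeg are installed; all clips share the same height and frame rate.
import imageio
import numpy as np

clip_paths = ["input.mp4", "inversion.mp4", "editing.mp4"]  # replace with your own video paths
readers = [imageio.get_reader(p) for p in clip_paths]
fps = readers[0].get_meta_data()["fps"]

with imageio.get_writer("gallery.mp4", fps=fps) as writer:
    # zip() stops at the shortest clip, so the gallery is as long as the shortest input
    for frames in zip(*readers):
        writer.append_data(np.concatenate(frames, axis=1))  # stack the clips side by side

for reader in readers:
    reader.close()
```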
Although our test_ae.py automatically calculates the inversion metrics, you can also simply run the script calc_losses_on_images.py and set the --data_path and --gt_path arguments to calculate the inversion performance of your own results.
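For example, a typical invocation might look like the following (the paths are placeholders; check the script for any additional arguments it may require):
python calc_losses_on_images.py --data_path /path/to/your/inversion_results --gt_path /path/to/ground_truth_images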
This study is supported under the RIE2020 Industry Alignment Fund Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s). It is also partially supported by Singapore MOE AcRF Tier 2 (MOE-T2EP20221-0011) and the NTU URECA research program.
This project is built on source codes shared by StyleSDF.
Distributed under the S-Lab License. See LICENSE for more information.
If you have any questions, please feel free to contact us via yushi001@e.ntu.edu.sg.