This project synthesizes photorealistic images (deepfakes) without a generative adversarial network. It is an implementation of the convolutional neural network described in "Photographic Image Synthesis with Cascaded Refinement Networks" by Qifeng Chen and Vladlen Koltun. This implementation differs from theirs in a few ways, which are listed further down. More information is available on the authors' website.
- TensorFlow
- Keras
- OpenCV
- Pillow
- NumPy
- h5py
- Python 3
Please download the Cityscapes dataset. We used the gtFine_trainvaltest (labels) and leftImg8bit_trainvaltest (images) packages.
Link: https://www.cityscapes-dataset.com/downloads/
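Training pairs each photograph with its label map, so the two downloads have to be matched file by file. The sketch below shows one way to derive a label path from an image path. It assumes the standard Cityscapes directory layout and file naming after both archives are unzipped side by side. The helper name `label_path_for` is hypothetical, and which label variant this repository actually reads (`labelIds`, `color`, or another) is not stated here, so the suffix is a parameter.

```python
from pathlib import PurePosixPath

# Assumption: the standard Cityscapes layout after unzipping both archives, e.g.
#   leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png
#   gtFine/train/aachen/aachen_000000_000019_gtFine_labelIds.png
IMAGE_ROOT = "leftImg8bit"
LABEL_ROOT = "gtFine"
IMAGE_SUFFIX = "_leftImg8bit.png"


def label_path_for(image_path, label_suffix="_gtFine_labelIds.png"):
    """Map a leftImg8bit image path to the matching gtFine label path.

    Hypothetical helper for illustration; it is not part of crn.py.
    """
    path = PurePosixPath(image_path)
    if not path.name.endswith(IMAGE_SUFFIX):
        raise ValueError(f"not a leftImg8bit image: {image_path}")

    parts = list(path.parts)
    # Swap the top-level image directory for the label directory.
    parts[parts.index(IMAGE_ROOT)] = LABEL_ROOT
    # Swap the file-name suffix, keeping the city_sequence_frame stem.
    stem = path.name[: -len(IMAGE_SUFFIX)]
    parts[-1] = stem + label_suffix
    return str(PurePosixPath(*parts))
```

Passing `label_suffix="_gtFine_color.png"` selects the color-coded label images instead of the class-ID images.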
- Clone this repository.
- Download the dataset from Cityscapes.
- Prepare a save file to begin training by using the `prepvgg` and then `prepcrn` subcommands.
- Then train by using the `train` subcommand.
- To synthesize images, use the `generate` subcommand after training.
- Run `python3 crn.py --help` for more information.
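The order of the subcommands matters, so it can help to script the sequence. The following is a minimal sketch that only strings together the four subcommand names given above. The helper names `build_commands` and `run_all` are hypothetical, and any additional arguments a subcommand may need are not shown; consult `python3 crn.py --help` for those.

```python
import subprocess
import sys

# Subcommand order taken from the steps above. Extra arguments, if any,
# are deliberately left out; see `python3 crn.py --help`.
STEPS = ["prepvgg", "prepcrn", "train", "generate"]


def build_commands(script="crn.py", python=sys.executable):
    """Return one argv list per workflow step, in the order they must run."""
    return [[python, script, step] for step in STEPS]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_commands():
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default so nothing is launched by accident.
    run_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to confirm the order before committing to a long training run.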
Running this neural network requires a substantial amount of memory. Training the network in 256p requires at least 40 GB for a batch size of 1. Training in 1024p requires at least 120 GB for a batch size of 5.
Only the 256p configuration is enabled. To use the code for 512p and 1024p, uncomment the extra modules.
- Uses batch normalization instead of layer normalization.
- Uses an earlier version of their loss function.
- Uses max pooling instead of bilinear subsampling.
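To make two of these differences concrete, here is a small NumPy sketch of the operations involved, using their usual textbook definitions. It is an illustration only and is not code from this repository or from the paper: the function names are hypothetical, tensors are assumed to be in `(batch, height, width, channels)` order, and the learnable scale and shift of the normalization layers are omitted. For the subsampling comparison it assumes an exact factor of 2 with half-pixel-centered sampling, in which case bilinear subsampling reduces to a 2x2 average.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    """Normalize each channel with statistics over batch, height, and width."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def layer_norm(x, eps=1e-5):
    """Normalize each sample with statistics over height, width, and channels."""
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def _blocks_2x2(x):
    """View (N, H, W, C) as (N, H/2, 2, W/2, 2, C); H and W must be even."""
    n, h, w, c = x.shape
    return x.reshape(n, h // 2, 2, w // 2, 2, c)


def max_pool_2x2(x):
    """Keep the largest value in each non-overlapping 2x2 block."""
    return _blocks_2x2(x).max(axis=(2, 4))


def bilinear_half(x):
    """Average each 2x2 block (bilinear subsampling by exactly 2)."""
    return _blocks_2x2(x).mean(axis=(2, 4))
```

Max pooling keeps only the strongest response in each block, while the averaging form blends all four values, so the two produce visibly different downsampled label maps and feature maps.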
Qifeng Chen and Vladlen Koltun. Photographic Image Synthesis with Cascaded Refinement Networks. In ICCV 2017.