PyTorch implementation of the paper "Reference-Based Sketch Image Colorization using Augmented-Self Reference and Dense Semantic Correspondence" (CVPR 2020)
- PyTorch
- torchvision
- numpy
- PIL
- OpenCV
- tqdm
- Clone the repository
git clone https://github.com/Snailpong/style_transfer_implementation.git
- Dataset download
- Tag2Pix (filtered Danbooru2020): Link
- You need to change 'danbooru2018' to 'danbooru2020' in the download script (the year can be changed as needed)
- In my experiment, I used about 6000 images filtered by
python preprocessor/tagset_extractor.py
- I stopped the process when the 0080 folder finished downloading.
- Sketch image generation
- XDoG: Link
- For automatic generation, I edited the main function as follows:
import os
import cv2

if __name__ == '__main__':
    # convert every color image to a sketch and save it under the same filename
    # (xdog() is defined in this XDoG script)
    for file_name in os.listdir('../data/danbooru/color'):
        print(file_name, end='\r')
        image = cv2.imread(f'../data/danbooru/color/{file_name}', cv2.IMREAD_GRAYSCALE)
        result = xdog(image)
        cv2.imwrite(f'../data/danbooru/sketch/{file_name}', result)
- folder structure example
.
└── data
    ├── danbooru
    │   ├── color
    │   │   ├── 7.jpg
    │   │   └── ...
    │   └── sketch
    │       ├── 7.jpg
    │       └── ...
    └── val
        ├── color
        │   ├── 1.jpg
        │   └── ...
        └── sketch
            ├── 1.jpg
            └── ...
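- For reference, a minimal sketch of how paired color/sketch images could be loaded from this layout. The class name, transforms, and image size are illustrative assumptions, not the repository's actual data loader:

```python
import os
from PIL import Image
from torch.utils.data import Dataset
import torchvision.transforms as T

class PairedSketchDataset(Dataset):
    """Pairs color and sketch images that share the same filename."""
    def __init__(self, root='./data/danbooru', image_size=256):
        self.color_dir = os.path.join(root, 'color')
        self.sketch_dir = os.path.join(root, 'sketch')
        # keep only filenames present in both folders
        self.files = sorted(set(os.listdir(self.color_dir)) & set(os.listdir(self.sketch_dir)))
        self.transform = T.Compose([T.Resize((image_size, image_size)), T.ToTensor()])

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        name = self.files[idx]
        color = Image.open(os.path.join(self.color_dir, name)).convert('RGB')
        sketch = Image.open(os.path.join(self.sketch_dir, name)).convert('L')
        return self.transform(sketch), self.transform(color)
```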
- TPS transformation module
- TPS: Link
- Place the thinplate folder in the main (project root) folder
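- As a reference for how TPS warping is typically used here, below is a minimal sketch that assumes the linked thinplate code follows the common py-thin-plate-spline API (tps_theta_from_points / tps_grid / tps_grid_to_remap); if the linked module differs, adjust the calls accordingly. The control-point grid and jitter amount are illustrative:

```python
import cv2
import numpy as np
import thinplate as tps  # assumed API: tps_theta_from_points / tps_grid / tps_grid_to_remap

def random_tps_warp(image, jitter=0.05):
    """Warp an image with a randomly perturbed thin-plate-spline grid."""
    h, w = image.shape[:2]
    # source control points: a 3x3 grid in normalized coordinates
    xs, ys = np.meshgrid(np.linspace(0, 1, 3), np.linspace(0, 1, 3))
    c_src = np.stack([xs.ravel(), ys.ravel()], axis=1)
    # destination points: the same grid plus small random offsets
    c_dst = c_src + np.random.uniform(-jitter, jitter, c_src.shape)
    theta = tps.tps_theta_from_points(c_src, c_dst, reduced=True)
    grid = tps.tps_grid(theta, c_dst, (h, w))
    mapx, mapy = tps.tps_grid_to_remap(grid, image.shape)
    return cv2.remap(image, mapx, mapy, cv2.INTER_CUBIC)
```

- In the paper's augmented-self-reference scheme, a spatial warp like this (together with an appearance transformation such as color jitter) is applied to the ground-truth color image to create the reference input.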
- Train
python train.py
- arguments
- load_model: True/False
- cuda_visible: CUDA_VISIBLE_DEVICES (e.g. 1)
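- A hypothetical sketch of how these two arguments could be parsed and applied; the actual train.py may define them differently:

```python
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument('--load_model', type=str, default='False', help='True/False: resume from a saved checkpoint')
parser.add_argument('--cuda_visible', type=str, default='0', help='value for CUDA_VISIBLE_DEVICES, e.g. 1')
args = parser.parse_args()

# restrict visible GPUs before any CUDA initialization
os.environ['CUDA_VISIBLE_DEVICES'] = args.cuda_visible
load_model = args.load_model == 'True'
```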
- Test
- python test.py
- arguments
- image_path: path to the folder of images to convert
- cuda_visible
- Results: example outputs shown as Sketch | Reference | Result comparisons
- In Eq. (1), I could not scale by the number of activation maps; instead, I scaled the activation map into .
- In Eq. (5), I implemented the negative region as the same spatial region taken from different samples in the batch, since the negative region is ambiguous (see the triplet-loss sketch after this list).
- In Eq. (9), since the choice of activation map is unclear in contrast to Eq. (8), I computed the style (Gram) loss with the relu5_1 activation map (see the Gram-loss sketch after this list).
- In this experiment, there was little difference in quality with or without the similarity-based triplet loss. After it converged from 20 to 0 within about 1 epoch, there was little further change.
- When the test images were predicted every epoch after the content loss had converged, the difference in color quality between epochs was still remarkable.
- The converged adversarial losses of the generator and discriminator were 0.7 ~ 0.8 and 0.15 ~ 0.2, respectively.
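- A minimal sketch of the in-batch-negative choice described above for the similarity-based triplet loss: the anchor and positive come from the same sample, while the negative is the same spatial region taken from another sample in the batch (here, by rolling the batch). The margin and feature shapes are illustrative; this is not the repository's exact implementation:

```python
import torch
import torch.nn.functional as F

def similarity_triplet_loss(anchor, positive, margin=1.0):
    """anchor, positive: (B, C, H, W) feature maps from corresponding regions.
    The negative for each sample is the positive of another sample in the batch."""
    negative = torch.roll(positive, shifts=1, dims=0)  # same region, different batch sample
    a = anchor.flatten(1)
    p = positive.flatten(1)
    n = negative.flatten(1)
    return F.triplet_margin_loss(a, p, n, margin=margin)
```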
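- And a minimal sketch of a style (Gram) loss computed on the relu5_1 activation map mentioned above, using torchvision's pretrained VGG19 (relu5_1 is index 29 of vgg19.features, so the slice features[:30] includes it); the choice of L1 distance and the lack of weighting are assumptions:

```python
import torch
import torch.nn as nn
from torchvision import models

# VGG19 features up to and including relu5_1 (index 29 in torchvision's layer list)
vgg_relu5_1 = models.vgg19(pretrained=True).features[:30].eval()
for p in vgg_relu5_1.parameters():
    p.requires_grad_(False)

def gram_matrix(feat):
    b, c, h, w = feat.size()
    f = feat.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def style_loss(fake, real):
    """L1 distance between Gram matrices of relu5_1 activations."""
    return nn.functional.l1_loss(gram_matrix(vgg_relu5_1(fake)), gram_matrix(vgg_relu5_1(real)))
```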