This is an official implementation of the paper *GDB: Gated convolutions-based Document Binarization*.
This repository also collects the datasets commonly used in document binarization. Below is a summary table of these datasets, along with links to download them.
- Python >= 3.7
- torch >= 1.7.0
- torchvision >= 0.8.0
Note: the pre-processing code is not provided yet, but it is on the way.
You can download the datasets from the links below and put them in the `datasets_ori` folder.
When evaluating performance on the DIBCO2019 dataset, first gather all datasets except DIBCO2019 and place them in the `img` and `gt` folders under the `datasets_ori` directory. Then crop the images and ground-truth images into 256 × 256 patches and place the patches in the `img` and `gt` folders under the `datasets/DIBCO2019` directory.
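Since the pre-processing code is not released yet, a minimal cropping sketch might look like the following. The folder names follow this README, while the non-overlapping stride, the white padding of border patches, and the `crop_to_patches` helper are assumptions:

```python
# Hypothetical patch-cropping sketch; folder names follow the README,
# the non-overlapping stride and white padding are assumptions.
import os
from PIL import Image

PATCH = 256

def crop_to_patches(src_dir, dst_dir):
    os.makedirs(dst_dir, exist_ok=True)
    for name in sorted(os.listdir(src_dir)):
        img = Image.open(os.path.join(src_dir, name))
        w, h = img.size
        stem, ext = os.path.splitext(name)
        for top in range(0, h, PATCH):
            for left in range(0, w, PATCH):
                box = (left, top, min(left + PATCH, w), min(top + PATCH, h))
                patch = img.crop(box)
                # Pad border patches to a full 256x256 (assumption: white padding).
                if patch.size != (PATCH, PATCH):
                    padded = Image.new(img.mode, (PATCH, PATCH), "white")
                    padded.paste(patch, (0, 0))
                    patch = padded
                patch.save(os.path.join(dst_dir, f"{stem}_{top}_{left}{ext}"))

# Running the same function over img and gt keeps patch names aligned
# between images and their ground truths.
crop_to_patches("datasets_ori/img", "datasets/DIBCO2019/img")
crop_to_patches("datasets_ori/gt", "datasets/DIBCO2019/gt")
```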
Next, use Otsu's thresholding method to binarize the images under `datasets/img` and place the results in the `datasets/otsu` folder.
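A hedged sketch of the Otsu step using OpenCV (the paths follow this README; using `cv2.threshold` with the `THRESH_OTSU` flag is an assumption about how the authors implement it):

```python
# Sketch of the Otsu binarization step; THRESH_OTSU picks the threshold
# automatically from the grayscale histogram.
import os
import cv2

src, dst = "datasets/img", "datasets/otsu"
os.makedirs(dst, exist_ok=True)
for name in sorted(os.listdir(src)):
    gray = cv2.imread(os.path.join(src, name), cv2.IMREAD_GRAYSCALE)
    _, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite(os.path.join(dst, name), otsu)
```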
Then use the Sobel operator to process the images under `datasets/img` and place the results in the `datasets/sobel` folder.
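Similarly, a sketch of the Sobel step; the 3 × 3 kernel and the min-max normalization of the gradient magnitude are assumptions, not necessarily the authors' exact recipe:

```python
# Sketch of the Sobel step: compute horizontal/vertical gradients,
# take the magnitude, and rescale it to an 8-bit image.
import os
import cv2
import numpy as np

src, dst = "datasets/img", "datasets/sobel"
os.makedirs(dst, exist_ok=True)
for name in sorted(os.listdir(src)):
    gray = cv2.imread(os.path.join(src, name), cv2.IMREAD_GRAYSCALE)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    out = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    cv2.imwrite(os.path.join(dst, name), out)
```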
With these preprocessing steps completed, pass `./datasets/img` as the `--dataRoot` argument in `train.py` and begin training:

`python train.py --dataRoot ./datasets/img`
To test a trained model, run `python test.py`.
- Add the code for training
- Add the code for testing
- Add the code for pre-processing
- Restructure the code
- Upload the pretrained weights
- Comprehensively collate document binarization benchmark datasets
- Add the code for evaluating the performance of the model (a metric sketch is given below)
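Until the evaluation code is added, a minimal sketch of two metrics commonly reported on DIBCO benchmarks (F-measure and PSNR) could look like the following. It assumes 8-bit binary images where ink is 0 and background is 255, and it is not the authors' official evaluation protocol:

```python
# Minimal sketch of two standard DIBCO metrics (F-measure, PSNR).
# Assumes 8-bit binary images with foreground (ink) = 0, background = 255.
import numpy as np

def f_measure(pred, gt):
    pred_fg = pred == 0
    gt_fg = gt == 0
    tp = np.logical_and(pred_fg, gt_fg).sum()
    precision = tp / max(pred_fg.sum(), 1)
    recall = tp / max(gt_fg.sum(), 1)
    return 2 * precision * recall / max(precision + recall, 1e-8)

def psnr(pred, gt):
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse) if mse > 0 else float("inf")
```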
This work is licensed for academic research purposes only. For commercial use, please contact the author.
If you find this work useful, please cite it as:
@article{yang2024gdb,
  title={GDB: gated convolutions-based document binarization},
  author={Yang, Zongyuan and Liu, Baolin and Xiong, Yongping and Wu, Guibin},
  journal={Pattern Recognition},
  volume={146},
  pages={109989},
  year={2024},
  publisher={Elsevier}
}