PyTorchUNet is a PyTorch-based implementation of the UNet architecture for semantic image segmentation. The repository contains a complete implementation of the architecture, including both the encoder and decoder modules, built on PyTorch 2.0. The code is written to be easy to read and straightforward to extend to new datasets and tasks.
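For orientation, below is a minimal sketch of how a UNet encoder/decoder with skip connections can be written in PyTorch. The class names, channel sizes, and double-convolution blocks here are illustrative assumptions, not necessarily the modules used in this repository.

```python
# Minimal UNet sketch (illustrative; class names and channel sizes are assumptions).
import torch
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 convolutions, each followed by batch norm and ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class UNet(nn.Module):
    """Downsampling encoder, upsampling decoder, and skip connections."""
    def __init__(self, in_channels=3, num_classes=1, features=(64, 128, 256, 512)):
        super().__init__()
        self.downs = nn.ModuleList()
        self.ups = nn.ModuleList()
        self.pool = nn.MaxPool2d(2)

        # Encoder: repeated DoubleConv + max-pool
        ch = in_channels
        for f in features:
            self.downs.append(DoubleConv(ch, f))
            ch = f
        self.bottleneck = DoubleConv(features[-1], features[-1] * 2)

        # Decoder: transposed-conv upsampling + DoubleConv on concatenated skip features
        for f in reversed(features):
            self.ups.append(nn.ConvTranspose2d(f * 2, f, kernel_size=2, stride=2))
            self.ups.append(DoubleConv(f * 2, f))
        self.head = nn.Conv2d(features[0], num_classes, kernel_size=1)

    def forward(self, x):
        skips = []
        for down in self.downs:
            x = down(x)
            skips.append(x)
            x = self.pool(x)
        x = self.bottleneck(x)
        for i in range(0, len(self.ups), 2):
            x = self.ups[i](x)              # upsample
            skip = skips[-(i // 2 + 1)]     # matching encoder feature map
            x = torch.cat([skip, x], dim=1) # skip connection
            x = self.ups[i + 1](x)          # fuse
        return self.head(x)
```

With padded convolutions as above, input height and width should be divisible by 16 so that the decoder feature maps line up with the encoder skips.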
To use PyTorchUNet, you need to have Python 3.8 or higher installed on your system. You can install the required Python packages by running the following command:
pip install -r requirements.txt
This will install all the required packages, including PyTorch and its dependencies.
To train and test the PyTorchUNet model, you need to prepare your data in the appropriate format. The input images and masks should be stored in separate folders, and each image should have a corresponding mask with the same file name. The folder structure should look like this (a minimal dataset class for this layout is sketched after the tree):
data/
    images/
        image1.png
        image2.png
        ...
    masks/
        image1.png
        image2.png
        ...
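The sketch below shows one way to load such image/mask pairs. The class name, the tensor conversion, and the assumption that masks are single-channel PNGs are all illustrative; adapt it to whatever scripts/data_setup.py actually produces.

```python
# Illustrative Dataset for the folder layout above (names and transforms are assumptions).
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class SegmentationDataset(Dataset):
    def __init__(self, image_dir, mask_dir):
        self.image_dir = image_dir
        self.mask_dir = mask_dir
        self.names = sorted(os.listdir(image_dir))  # masks share the same file names

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.image_dir, name)).convert("RGB")
        mask = Image.open(os.path.join(self.mask_dir, name)).convert("L")
        image = torch.from_numpy(np.array(image)).permute(2, 0, 1).float() / 255.0
        mask = torch.from_numpy(np.array(mask)).unsqueeze(0).float() / 255.0
        return image, mask
```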
To prepare the dataset, run the following command:
python scripts/data_setup.py
To train the model, run the following command:
python train.py
This will train the PyTorchUNet model on the specified dataset and save checkpoints to the configured output path.
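For reference, the core of such a training run typically looks like the sketch below, reusing the UNet and SegmentationDataset sketches from above. The paths, loss, hyperparameters, and checkpoint names are assumptions; train.py exposes the actual options.

```python
# Minimal training-loop sketch (paths, loss, and hyperparameters are assumptions).
import torch
from torch.utils.data import DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"
model = UNet(in_channels=3, num_classes=1).to(device)
dataset = SegmentationDataset("data/images", "data/masks")
loader = DataLoader(dataset, batch_size=4, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.BCEWithLogitsLoss()  # binary segmentation loss on raw logits

for epoch in range(10):
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), masks)
        loss.backward()
        optimizer.step()
    # Save a checkpoint after every epoch (file name is a placeholder)
    torch.save(model.state_dict(), f"checkpoints/unet_epoch{epoch}.pth")
```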
To test the model, run the following command:
python test.py -c <path_to_checkpoint> -i <path_to_test_image> -s <path_to_output_image>
# -s is optional
This will load the specified checkpoint and run the model on the given test image; if -s is provided, the predicted mask is saved to that path.
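In essence, test.py performs inference along the lines of the sketch below. The file names, the sigmoid activation, and the 0.5 threshold are assumptions; check test.py for the exact behaviour.

```python
# Inference sketch corresponding to the test.py flags above (details are assumptions).
import numpy as np
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model = UNet(in_channels=3, num_classes=1).to(device)
model.load_state_dict(torch.load("checkpoints/unet_epoch9.pth", map_location=device))
model.eval()

# Load and normalize the test image (-i)
image = Image.open("data/images/image1.png").convert("RGB")
x = torch.from_numpy(np.array(image)).permute(2, 0, 1).float().unsqueeze(0) / 255.0

with torch.no_grad():
    logits = model(x.to(device))
    mask = (torch.sigmoid(logits) > 0.5).squeeze().cpu().numpy().astype(np.uint8) * 255

# Save the predicted mask (corresponds to the optional -s flag)
Image.fromarray(mask).save("prediction.png")
```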
If you find a bug or have a feature request, please open an issue or submit a pull request. We welcome contributions from the community!
This implementation is based on the original UNet paper by Ronneberger et al. [1]. I would like to thank the PyTorch team for providing an excellent deep learning framework.
[1] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional Networks for Biomedical Image Segmentation." International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015.
This project is licensed under the MIT License. See the LICENSE file for details.