HuBMAP-HackingtheKidney

Competition Homepage: https://www.kaggle.com/c/hubmap-kidney-segmentation/data

Dataset

The dataset consists of very large TIFF files (roughly 500 MB to 5 GB each). The training set has 8 images, and the public test set has 5. The private test set is larger than the public test set.

The training set includes annotations in both RLE-encoded and unencoded (JSON) forms. The annotations denote segmentations of glomeruli.
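As an illustration of what working with the RLE form involves, here is a minimal decoding sketch. It assumes the usual Kaggle convention (space-separated "start length" pairs, 1-indexed starts, pixels numbered down each column first); the function name `rle_decode` is hypothetical and is not taken from this repository.

```python
import numpy as np

def rle_decode(rle: str, shape: tuple[int, int]) -> np.ndarray:
    """Decode a run-length string into a binary mask of shape (height, width).

    Assumed convention: "start length" pairs, 1-indexed starts,
    pixels numbered top-to-bottom, then left-to-right (column-major).
    """
    height, width = shape
    values = np.asarray(rle.split(), dtype=np.int64)
    starts, lengths = values[0::2] - 1, values[1::2]  # convert to 0-indexed
    flat = np.zeros(height * width, dtype=np.uint8)
    for start, length in zip(starts, lengths):
        flat[start:start + length] = 1
    # Column-major numbering: fill as (width, height), then transpose.
    return flat.reshape(width, height).T

# Toy example on a 3x3 image.
mask = rle_decode("2 2 6 1", shape=(3, 3))
print(mask)
```

If the competition files use a different pixel ordering, only the final reshape line would need to change.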

Both the training and public test sets also include anatomical structure segmentation. They are intended to help you identify the various parts of the tissue.

Installation

To run the code seamlessly, you can install all the packages used by this repo from the HuBMAP.yml file.

To install all the packages use the following command:

conda env create -f HuBMAP.yml

Patch Creation

Because the train and test images are too large to fit in memory, we split each one into small patches. We extract overlapping 1024x1024 tiles and then resize them, together with their ground-truth masks, to 256x256 pixels.

The patch generation code can be run using:

python create_patches.py
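The tiling idea described above can be sketched roughly as follows. This is not the code in create_patches.py: the stride of 512 is an assumption (the overlap is not stated here), the block-averaging `downsample` is a stand-in for whatever resizing the script uses, the image is treated as a single 2-D channel at least 1024 pixels on each side, and all function names are hypothetical.

```python
import numpy as np

TILE = 1024    # tile size stated above
OUT = 256      # size after resizing, stated above
STRIDE = 512   # assumed: the amount of overlap is not stated

def tile_origins(length: int, tile: int = TILE, stride: int = STRIDE) -> list[int]:
    """Start offsets along one axis so that tiles overlap and cover the full length."""
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the image edge
    return origins

def downsample(patch: np.ndarray, factor: int = TILE // OUT) -> np.ndarray:
    """Shrink a 2-D patch by averaging factor x factor blocks (stand-in resize)."""
    h, w = patch.shape
    return patch.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def iter_patches(image: np.ndarray, mask: np.ndarray):
    """Yield (row, col, small_image, small_mask) for every overlapping tile."""
    for r in tile_origins(image.shape[0]):
        for c in tile_origins(image.shape[1]):
            img_tile = image[r:r + TILE, c:c + TILE]
            msk_tile = mask[r:r + TILE, c:c + TILE]
            # Re-binarise the mask after averaging so it stays 0/1.
            small_mask = (downsample(msk_tile) >= 0.5).astype(np.uint8)
            yield r, c, downsample(img_tile), small_mask

# Toy usage on a random single-channel image.
image = np.random.rand(2048, 1536)
mask = (image > 0.5).astype(np.uint8)
patches = list(iter_patches(image, mask))
print(len(patches), patches[0][2].shape)
```

For the real multi-gigabyte TIFFs, each tile would be read from disk on demand rather than sliced from an in-memory array.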

Pseudo Labelling

The training set contains very few glomeruli, so we use pseudo labelling to improve model generalization. The approach was inspired by a talk by Yauhen Babakhin, in which predictions on the public test set are repeated over multiple iterations to refine the generated pseudo labels. We used an ensemble of two U-Net models (EfficientNet-B2 and EfficientNet-B4 backbones) to generate the pseudo labels.

The pseudo labels can be generated using:

python generate_pseudo_labels.py
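One round of ensemble pseudo labelling might look like the sketch below. It assumes each model outputs a per-pixel probability map and that maps are averaged and thresholded at 0.5; neither detail is confirmed by this README, and `pseudo_label` is a hypothetical name rather than a function from generate_pseudo_labels.py.

```python
import numpy as np

def pseudo_label(prob_maps: list[np.ndarray], threshold: float = 0.5) -> np.ndarray:
    """Average per-pixel probabilities from several models and binarise the result."""
    mean_prob = np.mean(np.stack(prob_maps), axis=0)
    return (mean_prob >= threshold).astype(np.uint8)

# Toy 2x2 outputs standing in for two models' predictions on one test patch.
pred_b2 = np.array([[0.9, 0.2], [0.6, 0.1]])
pred_b4 = np.array([[0.8, 0.4], [0.3, 0.0]])
labels = pseudo_label([pred_b2, pred_b4])
print(labels)
```

In an iterative scheme, the resulting masks would be added to the training data, the models retrained, and the prediction step repeated.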

Training

We trained U-Net and FPN models with various backbones. The parameters used for our HuBMAP competition submission are set in train.py and train_Fold.py: train.py trains a model on a single fold of the data, and train_Fold.py trains models with 5-fold cross-validation. To change the training or model parameters, edit these files.

To start single-fold training, use:

python train.py

To start 5-fold cross-validation training, use:

python train_Fold.py
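For readers unfamiliar with 5-fold cross-validation, the splitting step can be sketched as below. Whether train_Fold.py splits by whole image or by patch is not stated here; this sketch splits by image, and the image IDs and the function name `k_fold_splits` are placeholders.

```python
import random

def k_fold_splits(items: list[str], k: int = 5, seed: int = 0):
    """Return k (train, validation) pairs; each item is validated exactly once."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # round-robin assignment to folds
    splits = []
    for i, val in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, val))
    return splits

image_ids = [f"image_{n}" for n in range(8)]  # placeholder IDs for 8 training images
splits = k_fold_splits(image_ids)
for fold, (train, val) in enumerate(splits):
    print(fold, "train:", len(train), "val:", val)
```

Splitting by image rather than by patch keeps overlapping patches from the same image out of both the training and validation sets of a fold.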

Testing

The test script calculates evaluation metrics for model performance on the validation data. To start the test script, use:

python test.py
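This README does not list which metrics test.py computes. The competition itself is scored with the Dice coefficient, so a minimal sketch of that metric for binary masks is given here; the function name and the smoothing term `eps` are illustrative choices, not taken from test.py.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # eps avoids division by zero when both masks are empty.
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_coefficient(pred, target)
print(round(score, 4))
```

With this smoothing term, two empty masks score 1.0, which treats "no glomeruli predicted, none present" as a perfect match.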

Inference Code

The inference code for the competition submission notebook can be found here.

Quantitative Results

Model	Backbone	Public Leaderboard	Private Leaderboard
U-Net	EfficientNet-B2	0.921	0.918
U-Net	EfficientNet-B4	0.915	0.920
FPN	EfficientNet-B2	0.918	0.916
FPN	EfficientNet-B4	0.919	0.919

nauyan/HuBMAP-HackingtheKidney
