CBIS-DDSM-DATASET

If you want to train a breast cancer classifier or a segmentation model using the CBIS-DDSM dataset, this repository may help you to easily extract the mammograms and the masks from the original folder.

Setup

The dataset can be downloaded directly from the official site.
If you want to go into detail about the CBIS-DDSM dataset, you can check this paper. It describes how to use the dataset and how the dataset was built.

Quantitative Description

Although the paper states that CBIS-DDSM has 753 calcification cases and 891 mass cases, it is difficult to determine how many images the dataset actually contains. According to the metadata provided in the CSV files, CBIS-DDSM contains 3,103 mammograms, 465 of which have more than one abnormality. Of these, 2,458 mammograms (79.21%) belong to the training set and 645 (20.79%) belong to the test set. The dataset also includes 3,568 cropped mammograms and 3,568 masks.
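The split percentages above follow directly from the counts. As a quick sanity check, the short sketch below recomputes them; the constant and function names are illustrative and not taken from the repository.

```python
# Counts quoted in the description above (taken from the CSV metadata).
TOTAL_MAMMOGRAMS = 3103
TRAIN_MAMMOGRAMS = 2458
TEST_MAMMOGRAMS = 645


def share(part, whole):
    """Return `part` as a percentage of `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# The two subsets should account for every mammogram.
assert TRAIN_MAMMOGRAMS + TEST_MAMMOGRAMS == TOTAL_MAMMOGRAMS

train_share = share(TRAIN_MAMMOGRAMS, TOTAL_MAMMOGRAMS)
test_share = share(TEST_MAMMOGRAMS, TOTAL_MAMMOGRAMS)
print(f"train: {train_share}%  test: {test_share}%")
```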

A bit of explanation of the repository's functions

Mammograms_code.ipynb:

This script contains a function that retrieves the path of all mammograms on your local machine and merges each image path with its pathology in a data frame. The data frame is subsequently saved as a CSV file.
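The idea can be sketched roughly as follows. This is not the notebook's code: it uses only the standard library instead of a data frame, and it assumes that each case lives in its own top-level folder under the download directory and that a mapping from case-folder name to pathology has already been built from the CSV metadata. The names `index_mammograms`, `save_index`, and `pathology_by_case` are hypothetical.

```python
import csv
from pathlib import Path


def index_mammograms(root, pathology_by_case):
    """Pair every local DICOM file with the pathology of its case folder.

    root              -- directory holding one sub-folder per case (assumed layout)
    pathology_by_case -- dict mapping case-folder name to a pathology label
    """
    rows = []
    for dcm in sorted(Path(root).rglob("*.dcm")):
        # The first path component below `root` is taken to name the case.
        case = dcm.relative_to(root).parts[0]
        if case in pathology_by_case:
            rows.append({"path": str(dcm), "pathology": pathology_by_case[case]})
    return rows


def save_index(rows, csv_path):
    """Write the (path, pathology) pairs to a CSV file with a header row."""
    with open(csv_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["path", "pathology"])
        writer.writeheader()
        writer.writerows(rows)
```

Files whose case folder has no entry in the mapping are skipped rather than written with an empty label, so the resulting CSV only contains images with a known pathology.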

mask_code.ipynb:

This script contains a function that retrieves the path of all patches on your local machine and then merges each mask path with its pathology in a data frame. The data frame is subsequently saved as a CSV file. Note: There are more masks than mammograms because some mammograms have more than one lesion.
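To see why the mask count exceeds the mammogram count, it can help to count mask rows per mammogram. The sketch below does this on three made-up rows; the field names `mammogram` and `abnormality_id` and the example identifiers are illustrative assumptions, not the repository's column names.

```python
from collections import Counter


def masks_per_mammogram(mask_rows):
    """Count how many mask rows refer to each mammogram."""
    return Counter(row["mammogram"] for row in mask_rows)


# Toy data: the first mammogram has two lesions, hence two masks.
rows = [
    {"mammogram": "P_00001_LEFT_CC", "abnormality_id": 1},
    {"mammogram": "P_00001_LEFT_CC", "abnormality_id": 2},
    {"mammogram": "P_00002_RIGHT_MLO", "abnormality_id": 1},
]

counts = masks_per_mammogram(rows)
multi_lesion = [name for name, n in counts.items() if n > 1]
print(len(rows), "masks for", len(counts), "mammograms")
```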

convert_dicom.ipynb:

The images provided by CBIS-DDSM (mammograms, masks, and crops of abnormalities) are stored in DICOM format. The function in this notebook reads a 16-bit mammogram from a DICOM file and saves it as a rescaled 16-bit PNG file.
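A minimal sketch of such a conversion is shown below. It assumes that "rescaled" means min-max stretching to the full 16-bit range, which the notebook may or may not do, and it assumes `numpy`, `pydicom`, and `opencv-python` are installed. The function names are hypothetical.

```python
import numpy as np


def rescale_to_uint16(pixels):
    """Min-max rescale an array to the full 16-bit range [0, 65535]."""
    arr = np.asarray(pixels, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # Constant image: return zeros instead of dividing by zero.
        return np.zeros(arr.shape, dtype=np.uint16)
    return np.round((arr - lo) / (hi - lo) * 65535).astype(np.uint16)


def dicom_to_png(dicom_path, png_path):
    """Read one DICOM file and write it as a rescaled 16-bit PNG."""
    # Imported here so that rescale_to_uint16 can be used without these packages.
    import cv2
    import pydicom

    pixels = pydicom.dcmread(str(dicom_path)).pixel_array
    # cv2.imwrite preserves 16-bit depth when given a uint16 array and a .png path.
    if not cv2.imwrite(str(png_path), rescale_to_uint16(pixels)):
        raise IOError(f"could not write {png_path}")
```

Keeping the rescaling separate from the file I/O makes it easy to check the intensity mapping on a small array before converting the whole dataset.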

Original_Split.ipynb:

This script creates the training and test sets according to the standardized split given by the official paper. The paths of all images are stored in a data frame, which is saved as a CSV file.
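One way to recover the split for a given image is to look at its case-folder name. The sketch below assumes that folder names carry a `-Training_` or `-Test_` marker (for example `Mass-Training_P_00001_LEFT_CC`); check this against your local copy before relying on it. The notebook itself may instead use the separate training and test CSV files. The function name `split_of` is hypothetical.

```python
def split_of(image_path):
    """Return "train", "test", or None based on markers in the path components."""
    # Normalise Windows separators so the same logic works on any platform.
    for part in image_path.replace("\\", "/").split("/"):
        if "-Training_" in part:
            return "train"
        if "-Test_" in part:
            return "test"
    return None  # no recognisable marker in any component


paths = [
    "Mass-Training_P_00001_LEFT_CC/series/000000.dcm",
    "Calc-Test_P_00038_LEFT_CC/series/000000.dcm",
]
labels = [split_of(p) for p in paths]
```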

Bonus:

In this repository, I implemented the deep learning classifier introduced in the paper "Deep Learning to Improve Breast Cancer Detection on Screening Mammography" using PyTorch and the CBIS-DDSM dataset. The original code and model are available here. However, that code is written in Keras.
My main goal is to provide an understandable implementation of this model, which can be helpful for everyone, especially those who are beginning to work with deep learning and are interested in medical applications.
