Investigation Into the Effects of Image Augmentation on Classification Accuracy of Convolutional Neural Networks
An investigation into the effects of image augmentation on the accuracy and loss of Convolutional Neural Networks. This work was completed as part of a dissertation project for a Bachelor of Science (Honours) in Computer Science with a specialism in Artificial Intelligence.
The summarised research notes can be found [here](github-pages/Investigation of the Effects of Medical Image Augmentation on the Robustness of Convolutional Neural Networks - A summary of Undergraduate Dissertation in Computer Science with Artificial Intelligence.pdf).
The aim is to generate a series of augmented dataset samples using rotation and noise, and to compare the effects that these augmented samples have on model performance. Performance is measured in terms of general model accuracy and loss as reported by Keras, with precision and error rates additionally derived from confusion matrices.
A varying degree of image augmentation is applied in model training, testing, or both (depending on the experiment performed) to determine how it affects the model's ability to classify samples correctly.
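The precision and error-rate calculations mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the project's own code: the function names and the example labels are hypothetical, and it assumes malignant is encoded as `1` and benign as `0`.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """Fraction of positive predictions that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def error_rate(tp, fp, fn, tn):
    """Fraction of all predictions that are wrong."""
    total = tp + fp + fn + tn
    return (fp + fn) / total if total else 0.0


# Hypothetical labels (assumption): 1 = malignant, 0 = benign
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print("precision:", precision(tp, fp))
print("error rate:", error_rate(tp, fp, fn, tn))
```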
The Convolutional Neural Network model is based on the LeNet architecture, with some modifications necessary to achieve a reliable classification rate. It consists of 666,722 trainable parameters distributed across the following layers:
Layer Type | Output Shape | No. of Parameters |
---|---|---|
conv2d_1 (Conv2D) | (None, 64, 64, 20) | 1520 |
activation_1 (Activation) | (None, 64, 64, 20) | 0 |
max_pooling2d_1 (MaxPooling2D) | (None, 32, 32, 20) | 0 |
conv2d_2 (Conv2D) | (None, 32, 32, 50) | 25050 |
activation_2 (Activation) | (None, 32, 32, 50) | 0 |
max_pooling2d_2 (MaxPooling2D) | (None, 16, 16, 50) | 0 |
flatten_1 (Flatten) | (None, 12800) | 0 |
dense_1 (Dense) | (None, 50) | 640050 |
activation_3 (Activation) | (None, 50) | 0 |
dense_2 (Dense) | (None, 2) | 102 |
activation_4 (Activation) | (None, 2) | 0 |
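The parameter counts in the table can be checked by hand. The sketch below assumes 5×5 convolution kernels, a 3-channel (RGB) 64×64 input and "same" padding; these values are not stated in the table but are the ones consistent with its output shapes and counts. The helper names are illustrative only.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights (kernel_h * kernel_w * in_channels per filter) plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_units, out_units):
    """One weight per input/output pair plus one bias per output unit."""
    return in_units * out_units + out_units


# Assumed hyperparameters: 5x5 kernels, RGB input
conv1 = conv2d_params(5, 5, 3, 20)     # conv2d_1
conv2 = conv2d_params(5, 5, 20, 50)    # conv2d_2
flat_units = 16 * 16 * 50              # flatten_1 output size
dense1 = dense_params(flat_units, 50)  # dense_1
dense2 = dense_params(50, 2)           # dense_2

total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, dense1, dense2, total)
```

Activation, pooling and flatten layers have no trainable parameters, so only the four layers above contribute to the total.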
The experiment focussed on binary classification of samples as either malignant or benign, using images from the International Skin Imaging Collaboration (ISIC) Archive. The archive contains high-quality images of nevi (skin moles) from both healthy patients and those with various types of cancer. To convert this dataset from a multi-class problem to a binary one, the cancer samples were cut down to a single cancer sub-type.
The final dataset contained 1080 benign and 1080 malignant samples, 2160 in total. The selected dataset was then split into 1620 (75%) training samples and 540 (25%) validation samples.
Note: The main hindrance of this project is the limited dataset size, which limits the ultimate accuracy achievable by the network.
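A 75%/25% split of this kind could be produced as sketched below. The source does not say whether the split was stratified by class or how files were named; the per-class split, the fixed seed and the file names here are all assumptions made for illustration.

```python
import random


def stratified_split(samples_by_class, train_fraction, seed=0):
    """Shuffle each class separately and split it at train_fraction."""
    rng = random.Random(seed)
    train, validation = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train += [(name, label) for name in shuffled[:cut]]
        validation += [(name, label) for name in shuffled[cut:]]
    return train, validation


# Hypothetical file names standing in for the 1080 samples per class
dataset = {
    "benign": [f"benign_{i}.jpg" for i in range(1080)],
    "malignant": [f"malignant_{i}.jpg" for i in range(1080)],
}

train, validation = stratified_split(dataset, train_fraction=0.75)
print(len(train), len(validation))
```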
The model was trained for 100 epochs for each experiment variation on a TeamCity continuous integration server.
Initial results from varying the network structure and hyperparameters can be found in the learning rate and network density result directories. Each contains the accuracy and loss graphs at varying parameter settings for the given investigation.
Once the optimal trade-off had been selected, the finalised parameters were used in all of the following experiments.
Noise augmentation was applied to the images at settings ranging from 0% (control data with no noise) to 15% (maximum corruption in the experimental design) in 5% steps (0%, 5%, 10%, 15%).
The results can be found here.
Rotation augmentation was applied to the samples at various settings using a built-in Keras function that pre-processes the images on the fly as they are fed into the network model. The rotations were applied from 0° to 180° in 45° increments (0°, 45°, 90°, 135°, 180°).
The results can be found here.
To support the experimental results above, heatmaps were generated which highlight the areas of focus for the model. They show which areas of the image the network focuses on through a "thermal" overlay atop the original sample.
Heatmaps for noise experiments can be found here and rotation experiments here.
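As a rough illustration of how such a heatmap is formed, the sketch below implements the combination step described in the Grad-CAM paper (channel weights from spatially averaged gradients, a weighted sum of activations, then ReLU) and a simple overlay. It is not the project's tooling: the activations and gradients are random stand-ins for values that would come from the last convolutional layer, the overlay uses only the red channel rather than a full thermal colour map, and the image size is assumed to be an integer multiple of the heatmap size.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """Combine (H, W, K) activations and gradients into an (H, W) map in [0, 1]."""
    weights = gradients.mean(axis=(0, 1))                       # one weight per channel
    cam = np.tensordot(activations, weights, axes=([2], [0]))   # weighted channel sum
    cam = np.maximum(cam, 0)                                    # ReLU
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def overlay_heatmap(image, cam, alpha=0.4):
    """Upsample the map by pixel repetition and blend it into the red channel."""
    scale_h = image.shape[0] // cam.shape[0]
    scale_w = image.shape[1] // cam.shape[1]
    upsampled = np.kron(cam, np.ones((scale_h, scale_w)))
    heat = np.zeros_like(image, dtype=float)
    heat[..., 0] = 255 * upsampled
    blended = (1 - alpha) * image + alpha * heat
    return blended.astype(np.uint8)


# Random stand-ins (assumption) shaped like the second pooling output, 16x16x50
rng = np.random.default_rng(0)
activations = rng.random((16, 16, 50))
gradients = rng.normal(size=(16, 16, 50))

cam = grad_cam_map(activations, gradients)
image = np.full((64, 64, 3), 128, dtype=np.uint8)
result = overlay_heatmap(image, cam)
print(cam.shape, result.shape)
```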
Images have been augmented to a varying degree using the following methods.
Noise was artificially added to the sample image using black pixels. An augmentation factor determines the percentage of pixels to affect in an image of size X*Y, and black pixels are then placed at random locations along the X and Y axes of the image. The new, augmented image is then saved for model training and testing.
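The noise procedure described above can be sketched with NumPy as follows. This is an illustrative version under stated assumptions, not the project's script: the function name is hypothetical, and pixel positions are sampled without replacement so that the requested fraction is hit exactly, which the original implementation may not do.

```python
import numpy as np


def add_black_pixel_noise(image, factor, rng=None):
    """Return a copy of image with `factor` (0.0 to 1.0) of its pixels set to black."""
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    n_noisy = int(round(factor * height * width))
    # Pick distinct pixel positions, then convert flat indices to (row, col)
    flat_idx = rng.choice(height * width, size=n_noisy, replace=False)
    rows, cols = np.unravel_index(flat_idx, (height, width))
    noisy = image.copy()
    noisy[rows, cols] = 0  # zero every channel at the chosen positions
    return noisy


# A plain white 64x64 RGB image as a stand-in sample
image = np.full((64, 64, 3), 255, dtype=np.uint8)
noisy = add_black_pixel_noise(image, 0.10, rng=np.random.default_rng(0))

black_fraction = np.mean(np.all(noisy == 0, axis=-1))
print(f"black pixels: {black_fraction:.2%}")
```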
Rotation uses built-in Keras functionality to load the original images from the dataset directories and rotate them before they are fed into the model. The rotation degree sets the bound for both positive and negative rotation of the image. For example, an image set to 45° would be rotated to a random angle between -45° and 45°.
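The meaning of the rotation setting can be illustrated with a few lines of plain Python. This sketch only shows how an angle is drawn for each image; it does not rotate anything. It mirrors the behaviour of the `rotation_range` argument of Keras's `ImageDataGenerator`, which is assumed (not confirmed by the source) to be the built-in function used here. The helper name is hypothetical.

```python
import random


def sample_rotation_angle(rotation_range, rng=random):
    """Draw an angle in degrees uniformly from [-rotation_range, rotation_range]."""
    return rng.uniform(-rotation_range, rotation_range)


rng = random.Random(42)
angles = [sample_rotation_angle(45, rng) for _ in range(1000)]
print(f"min {min(angles):.1f}, max {max(angles):.1f}")
```

A setting of 0° therefore always yields an unrotated image, while 180° allows any orientation.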
The tools directory contains various tools used during the preparation of datasets, directory creation, extraction of model heatmaps, visualisation and image reallocation.
The team behind the Grad-CAM tool, as demonstrated in the paper "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization", made heatmap visualisation possible.
The International Society for Digital Imaging of the Skin made their dataset publicly available for research purposes. The archive used in this investigation can be found here.