
Neural Network Models

Vamsi Nadella edited this page Apr 6, 2018 · 5 revisions

UNET

Theory

The U-Net is a convolutional network architecture for fast and precise segmentation of images. It has outperformed the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopy stacks.

The provided model is essentially a convolutional auto-encoder, but with a twist: it has skip connections from encoder layers to the decoder layers at the same "level".
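Each skip connection simply concatenates an encoder feature map with the upsampled decoder feature map at the same level, along the channel axis. A minimal NumPy sketch of that operation (the array names and shapes here are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical feature maps at the same "level" of the U-Net:
# the encoder output before pooling, and the decoder map after upsampling.
encoder_features = np.random.rand(1, 64, 64, 128)  # (batch, H, W, channels)
decoder_features = np.random.rand(1, 64, 64, 128)

# The skip connection concatenates the two along the channel axis,
# so the decoder sees both coarse context and fine spatial detail.
merged = np.concatenate([encoder_features, decoder_features], axis=-1)
print(merged.shape)  # (1, 64, 64, 256)
```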

This deep neural network is implemented with the Keras functional API, which makes it easy to experiment with different architectures.

U-NET Input

In the first attempt we trained the network with plain images extracted from the frames.

In the second attempt, variance was added as one more input to the U-Net,

and finally we added the optical flow from OpenCV as the third input to the U-Net.

Thus,

Input for U-Net = Images + Variance + Optical-Flow + Optical-Flow-Magnitude

To know more about variance, see the preprocessing page.
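Concretely, the per-frame inputs can be stacked as channels of a single array before being fed to the network. A minimal NumPy sketch under assumed shapes (all names here are hypothetical; the flow magnitude is derived from the two flow components, e.g. as produced by OpenCV's dense optical flow):

```python
import numpy as np

H, W = 640, 640

# Hypothetical per-frame inputs: grayscale image, variance map,
# and the two optical-flow components.
image = np.random.rand(H, W)
variance = np.random.rand(H, W)
flow_x = np.random.rand(H, W)
flow_y = np.random.rand(H, W)

# Optical-flow magnitude derived from the flow components.
flow_mag = np.sqrt(flow_x ** 2 + flow_y ** 2)

# Stack everything channel-wise: Images + Variance + Optical-Flow + Magnitude.
x = np.stack([image, variance, flow_x, flow_y, flow_mag], axis=-1)
print(x.shape)  # (640, 640, 5)
```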

Implementation

We built our U-Net model using the Keras functional API, with a few modifications: dropout layers in each block of the U-Net to avoid overfitting, and a batch-normalization layer after each convolution layer to control large shifts in the data distribution.
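Batch normalization standardizes each channel's activations to zero mean and unit variance, while dropout randomly zeroes a fraction of activations during training. A NumPy sketch of the underlying math (illustrative only, not the actual Keras layers used in the model):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical activation tensor: (batch, H, W, channels).
activations = rng.normal(loc=3.0, scale=2.0, size=(4, 64, 64, 16))

# Batch normalization (inference-style, per channel): subtract the mean
# and divide by the standard deviation over the batch and spatial axes.
mean = activations.mean(axis=(0, 1, 2), keepdims=True)
std = activations.std(axis=(0, 1, 2), keepdims=True)
normalized = (activations - mean) / (std + 1e-5)

# Dropout with rate 0.3: zero out ~30% of activations and rescale the rest
# so the expected activation magnitude is unchanged during training.
drop_rate = 0.3
mask = rng.random(normalized.shape) >= drop_rate
dropped = normalized * mask / (1.0 - drop_rate)
```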

Model has the following arguments:

  • trainPath - the folder of training images

  • testPath - the folder of testing images

  • batch_size - training batch size; suggested values: 2, 4

  • drop - dropout rate for the dropout layers

  • optimizer - optimizer in Unet, default: adam

  • epoch - the number of epochs in training

While training the network we use the following sample parameters:

 model inputs - x_train, y_train (x_train includes variance and optical flow)
 epochs - in the range 1000-1500
 batch_size = 4
 lambda = 0.0002
 drop = 0.3
 kernel_size = 3
 input_shape = (640, 640, 1)
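Put together, a training configuration mirroring these sample parameters might look like the following (the `unet_train` name and exact argument names are hypothetical; the real signature is defined in the repository):

```python
# Hypothetical configuration built from the sample parameters above.
params = {
    "epochs": 1200,            # somewhere in the 1000-1500 range
    "batch_size": 4,
    "lambda": 0.0002,          # regularization strength
    "drop": 0.3,               # dropout rate
    "kernel_size": 3,
    "input_shape": (640, 640, 1),
}

# model = unet_train(x_train, y_train, **params)  # hypothetical call
```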

In the end, the trained model is returned, which can be used for prediction.

Other Models

Fast R-CNN

Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional networks. Compared to previous work, Fast R-CNN employs several innovations to improve training and testing speed while also increasing detection accuracy. Fast R-CNN trains the very deep VGG16 network 9x faster than R-CNN.

Unfortunately, this model did not work well, and we continued with the U-Net implementation.

The One Hundred Layers Tiramisu:

This network was developed by Simon Jegou, Michal Drozdzal, David Vazquez, Adriana Romero and Yoshua Bengio. The One Hundred Layers Tiramisu network is inspired by DenseNets, which it extends to solve semantic image segmentation on urban-scene benchmark datasets such as CamVid and Gatech.

The results reported in the paper were outstanding compared to the then-existing image segmentation techniques, without pretraining or post-processing.

The authors' implementation of this network is based on Theano and Lasagne and was open-sourced. Since Theano was complicated to use, we turned to a similar Keras implementation of this 100-layer network by Jun Yamada.

We started by tweaking this Keras-based repository to use Tiramisu for segmentation of cilia in microscopy images. The preprocessing and post-processing techniques used for this network are similar to the ones used for the U-Net in this project.

One major challenge while implementing this network is memory: since the network is very deep, we kept running out of memory, causing execution to fail. Several attempts were made:

  1. With CPU #26

    • Increasing the VM memory to almost 120 Gb.
    • Reducing the samples.
    • Reducing the image size.
  2. With GPU #28

    • Tried several batch sizes like 1, 2, 5, 8, 15, 28, 32, 128, 256, 500 and 5000.
    • Reducing the image size to 240x240 instead of 640x640 (this one came pretty close but still did not help).
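Shrinking the images helps because activation memory scales roughly with batch_size × H × W × channels per feature map. A quick back-of-the-envelope comparison for the input layer (assuming float32 and a single channel; deeper layers multiply this by their channel counts):

```python
# Rough activation-memory comparison (float32 = 4 bytes per value).
def input_bytes(batch, h, w, channels=1):
    return batch * h * w * channels * 4

full = input_bytes(4, 640, 640)    # the original 640x640 inputs
small = input_bytes(4, 240, 240)   # the reduced 240x240 inputs

print(full / small)  # ~7.1x less memory at 240x240
```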