cs6910-Assignment-3 [CS20M005, CS20M016]

The code RnnWandB.ipynb implements a Recurrent Neural Network on a subset of the Dakshina dataset with the following functionalities:

  • Dropout : The Dropout layer randomly sets input units to 0, with a frequency given by the dropout rate, at each step during training, which helps prevent overfitting.

  • Recurrent Dropout : Just as with regular dropout, recurrent dropout has a regularizing effect and can prevent overfitting. In Keras it is used by simply passing the recurrent_dropout argument to the LSTM, GRU, or SimpleRNN layer (see the Keras sketch below).

  • Beam Search : The beam search strategy generates the translation token by token from left to right while keeping a fixed number (the beam width) of active candidates at each time step. Increasing the beam size can improve translation quality at the expense of significantly slower decoding (a decoding sketch follows this list).

  • Cell Type : Three types of recurrent cells are supported: LSTM, GRU, and Simple RNN.
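
As a rough illustration of the beam search described above, a generic left-to-right decoder can be organised as below. This is a minimal sketch: the step function, token names, and defaults are assumptions, not the exact code in the notebooks.

import numpy as np

def beam_search_decode(step_fn, start_token, end_token, beam_width=3, max_len=25):
    """Generic left-to-right beam search.

    step_fn(prefix) must return log-probabilities over the output
    vocabulary for the next token, given the current prefix.
    """
    # Each candidate is a (token_sequence, cumulative_log_prob) pair.
    beams = [([start_token], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == end_token:          # finished hypotheses are carried over unchanged
                candidates.append((seq, score))
                continue
            log_probs = step_fn(seq)          # scores for every possible next token
            top = np.argsort(log_probs)[-beam_width:]
            for tok in top:
                candidates.append((seq + [tok], score + log_probs[tok]))
        # Keep only the beam_width best partial hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for seq, _ in beams):
            break
    return beams[0][0]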

At the output dense layer, "softmax" is used as the activation function.
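
A minimal sketch of how the dropout, recurrent dropout, cell type, and softmax options above map onto Keras layers is given below. Only a single encoder/decoder layer is shown, and the vocabulary sizes and variable names are illustrative, not the exact notebook code.

from tensorflow.keras import layers, models

def build_encoder_decoder(cell_type="LSTM", hidden_neurons=256,
                          dropout=0.1, recc_dropout=0.1,
                          num_enc_tokens=30, num_dec_tokens=70):
    rnn = {"LSTM": layers.LSTM, "GRU": layers.GRU, "RNN": layers.SimpleRNN}[cell_type]

    # Encoder: consumes the romanized source characters and returns its state(s).
    enc_inputs = layers.Input(shape=(None, num_enc_tokens))
    enc_layer = rnn(hidden_neurons, return_state=True,
                    dropout=dropout, recurrent_dropout=recc_dropout)
    enc_out = enc_layer(enc_inputs)
    enc_states = enc_out[1:]                 # LSTM gives [h, c]; GRU/SimpleRNN give [h]

    # Decoder: generates the target characters, initialised with the encoder state.
    dec_inputs = layers.Input(shape=(None, num_dec_tokens))
    dec_layer = rnn(hidden_neurons, return_sequences=True, return_state=True,
                    dropout=dropout, recurrent_dropout=recc_dropout)
    dec_out = dec_layer(dec_inputs, initial_state=enc_states)[0]

    # Output dense layer: softmax over the target character vocabulary.
    outputs = layers.Dense(num_dec_tokens, activation="softmax")(dec_out)
    return models.Model([enc_inputs, dec_inputs], outputs)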

The implementation is linked with wandb, and hyperparameter tuning can be done effectively by changing the values of the sweep configuration in the script. The configuration used for the parameter search is as follows.

'epoch'          : [2,5,10,15,20]
'batch_size'     : [32,64]
'dropout'        : [0.1,0.01,0.0]
'recc_dropout'   : [0.1,0.01,0.0]
'beam_search'    : [0,3,5,6]
'layers'         : [3,4]
'hidden_neurons' : [64,126,256]
'cell_type'      : ['RNN','GRU','LSTM']
'optimizer_fn'   : ['adam','rmsprop']
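
As a rough illustration, this search space could be expressed as a wandb sweep configuration like the one below. The sweep method, metric name, project name, and the train entry point are assumptions, not the exact values used in the notebooks.

import wandb

sweep_config = {
    "method": "bayes",                                         # or "grid" / "random"
    "metric": {"name": "val_accuracy", "goal": "maximize"},    # assumed metric name
    "parameters": {
        "epoch":          {"values": [2, 5, 10, 15, 20]},
        "batch_size":     {"values": [32, 64]},
        "dropout":        {"values": [0.1, 0.01, 0.0]},
        "recc_dropout":   {"values": [0.1, 0.01, 0.0]},
        "beam_search":    {"values": [0, 3, 5, 6]},
        "layers":         {"values": [3, 4]},
        "hidden_neurons": {"values": [64, 126, 256]},
        "cell_type":      {"values": ["RNN", "GRU", "LSTM"]},
        "optimizer_fn":   {"values": ["adam", "rmsprop"]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="cs6910-assignment-3")
wandb.agent(sweep_id, function=train)        # `train` is the training function defined in the notebook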

Repository Contents

A neural machine translation system with an RNN is implemented in RNNWandB.ipynb. The transliteration system gives the Hindi rendering of a romanized English input word.

The best configuration for the model among RNN, LSTM, and GRU, obtained from the hyperparameter search in WandB, is implemented in BestModel_Rnn.ipynb.

An attention mechanism is used to improve the performance of the Encoder-Decoder RNN on machine translation.
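
As a rough sketch (not the exact notebook code), a dot-product attention step between decoder and encoder outputs can be wired up with the built-in Keras Attention layer; the 256-unit size mirrors the latent dimension used later.

from tensorflow.keras import layers

# Encoder outputs: (batch, src_len, units); decoder outputs: (batch, tgt_len, units).
enc_seq = layers.Input(shape=(None, 256))
dec_seq = layers.Input(shape=(None, 256))

# Each decoder step attends over all encoder steps (dot-product attention).
context, attn_weights = layers.Attention()(
    [dec_seq, enc_seq], return_attention_scores=True)

# The context vectors are concatenated with the decoder outputs before the softmax layer.
combined = layers.Concatenate()([context, dec_seq])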

Attention.ipynb implements a single-layered Recurrent Neural Network with the attention mechanism and beam search.

NoAttention.ipynb implements a single-layered Recurrent Neural Network without attention or beam search.

Bestmodel_attention.ipynb implements the best NMT model with attention obtained from the hyperparameter search in WandB. In addition, visualisation of attention heatmaps and connectivity visualisation are included to give intuition about how the model is being trained.
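
For the heatmaps, the attention weights of one decoded word can be plotted roughly as follows. This is a sketch: plot_attention and its arguments are illustrative names, with attn_weights assumed to be a (target_len, source_len) matrix.

import matplotlib.pyplot as plt
import numpy as np

def plot_attention(attn_weights, src_chars, tgt_chars):
    """Heatmap of attention weights: rows = output characters, columns = input characters."""
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.imshow(np.asarray(attn_weights), cmap="viridis")
    ax.set_xticks(range(len(src_chars)))
    ax.set_xticklabels(src_chars)
    ax.set_yticks(range(len(tgt_chars)))
    ax.set_yticklabels(tgt_chars)
    ax.set_xlabel("source (romanized) characters")
    ax.set_ylabel("predicted (Devanagari) characters")
    plt.show()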

Best Models

The best configuration that gave the maximum accuracy for the RNN model is:

Word level Validation Accuracy : 52.7%

'epoch'         :15
'batch_size'    : 64
'dropout'       : 0.1
'beam_search'   : 6
'layers'        : 3
'hidden_neurons': 256
'recc_dropout'  : 0.1
'cell_type'     : GRU
'optimizer_fn'  : adam

The best configuration that gave the maximum accuracy for the Attention model is:

Word level Validation Accuracy : 54.8%

'epoch'             : 10
'batch_size'        : 64
'dropout'           : 0.01
'recc_dropout'      : 0.05
'beam_search'       : 6
'latent dimensions' : 256
'cell_type'         : 'GRU'
'optimizer_fn'      : 'rmsprop'

About

Deep Learning Assignment-3
