
RNN Models #20

Open
5 tasks
Tracked by #17
jramapuram opened this issue Jun 18, 2015 · 12 comments
@jramapuram
Member

Once we have implementations of the Layer class (#17), the Optimizer class, and the DataSet class, we can go about creating RNN flavors. Three models should be implemented:

  • Vanilla RNN
  • LSTM
  • GRU

These will require implementing both their forward-prop values and their derivatives.
Certain details to consider:

  • RNNs have a stack of weight matrices and biases (not just one per layer), so the Layer class needs to be general enough to handle this
  • The optimization needs to be handled via two methods:
    • RTRL (real-time recurrent learning) &
    • BPTT (backpropagation through time)

To enable the above two methods of learning we should consider inheriting from Layer and implementing a Recurrent Layer.
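To make the discussion concrete, here is a minimal numpy sketch of the forward prop for the vanilla RNN flavor listed above. The function and parameter names (`rnn_forward`, `Wxh`, `Whh`, `bh`) are illustrative assumptions, not the project's actual API; an ArrayFire implementation would use `af::array` and its matmul routines instead.

```python
import numpy as np

def rnn_forward(x_seq, h0, Wxh, Whh, bh):
    """Forward pass of a vanilla (Elman) RNN over a sequence.

    x_seq: (T, input_dim), h0: (hidden_dim,)
    Wxh: (hidden_dim, input_dim), Whh: (hidden_dim, hidden_dim), bh: (hidden_dim,)
    Returns the stack of hidden states, one per time step.
    """
    h = h0
    hs = []
    for x in x_seq:
        # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)
        h = np.tanh(Wxh @ x + Whh @ h + bh)
        hs.append(h)
    return np.stack(hs)
```

Note that a single "layer" here already carries two weight matrices plus a bias, which is exactly why the Layer abstraction needs to hold a stack of parameters.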

@sherjilozair

BPTT is much more popularly used than RTRL, so it might be better to prioritize BPTT over RTRL.

@jramapuram
Member Author

BPTT doesn't solve the stateful RNN problem, unfortunately, and truncated BPTT is a crude approximation. Also, there are already a plethora of BPTT implementations: keras, blocks, lasagne, chainer, ...

The only way to have an online RNN is to use RTRL: each step utilizes the full Jacobian product from the previous time step, which allows for smooth information flow.
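To illustrate the RTRL idea: instead of unrolling the network in time, RTRL carries a sensitivity tensor dh/dW forward alongside the hidden state. Below is a hedged numpy sketch for a vanilla RNN that tracks sensitivities with respect to the recurrent matrix only; all names are hypothetical, and a real implementation would also track Wxh and the bias.

```python
import numpy as np

def rtrl_step(x, h_prev, S_prev, Wxh, Whh, bh):
    """One online RTRL step for a vanilla RNN, tracking dh/dWhh only.

    S_prev[k, i, j] = d h_prev[k] / d Whh[i, j]   (shape H x H x H)
    Returns the new hidden state and updated sensitivity tensor, so the
    gradient of any instantaneous loss can be formed without unrolling.
    """
    a = Wxh @ x + Whh @ h_prev + bh
    h = np.tanh(a)
    d = 1.0 - h ** 2                      # tanh'(a)
    H = h.shape[0]
    # Direct term: d a[k] / d Whh[i, j] = delta(k, i) * h_prev[j]
    direct = np.zeros((H, H, H))
    for i in range(H):
        direct[i, i, :] = h_prev
    # Recurrent term: Whh propagates the old sensitivities forward in time.
    S = d[:, None, None] * (np.einsum('kl,lij->kij', Whh, S_prev) + direct)
    return h, S
```

The O(H^3) sensitivity tensor (O(H^4) per update for the full parameter set) is the classic cost of RTRL, which is why BPTT dominates in practice despite RTRL being the truly online method.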

@sherjilozair

Pretty much all other RNN implementations are slow, and not suitable for production. My own reason for betting on arrayfire is that this might yield production-ready implementations for deep learning algorithms.

@pavanky pavanky mentioned this issue Jul 10, 2017
Closed
4 tasks
@jramapuram
Member Author

I will be doing an internship until September and most likely won't have time to work on this until then. If someone wants to implement these first, that would be great, now that we have AD set up.

@pavanky pavanky mentioned this issue Jul 10, 2017
20 tasks
@WilliamTambellini

Hi, I would be interested in either LSTM or GRU; the forward pass would be a good first step before implementing the backward pass/training.

@pavanky pavanky added this to the 0.1 milestone Jul 11, 2017
@pavanky
Member

pavanky commented Jul 11, 2017

@jramapuram I am going to try and implement this. Maybe you and @WilliamTambellini can review it once I send a PR.

@WilliamTambellini

Hi @pavanky,
That sounds good. GRU is usually a little simpler to implement than LSTM, but it's your choice.
Have you thought about a possible example application (text generation, summarization, translation, question answering, ...)?
Cheers,
W.
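For reference, a single GRU step is only a few lines; the sketch below is a minimal numpy version with biases omitted for brevity. The parameter names (`Wz`, `Uz`, etc.) are illustrative assumptions, and note that the update-gate convention (which of `h_prev` and the candidate gets `z`) varies between papers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """Single GRU step (biases omitted for brevity).

    z: update gate, r: reset gate, h_tilde: candidate state.
    """
    z = sigmoid(Wz @ x + Uz @ h_prev)          # how much to overwrite
    r = sigmoid(Wr @ x + Ur @ h_prev)          # how much history to expose
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))
    # Convention used here: z interpolates toward the candidate state.
    return (1.0 - z) * h_prev + z * h_tilde
```

Compared with LSTM, there is no separate cell state and one fewer gate, which is why GRU is usually the easier of the two to get right first.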

@jramapuram
Member Author

I suggest a simple char-rnn-type problem.
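The data side of a char-rnn-style task (predict the next character from the current one) is tiny, which is part of the appeal as a first example. A minimal sketch, with hypothetical helper names:

```python
def make_char_dataset(text):
    """Turn raw text into (input index, target index) pairs for a
    char-rnn style next-character prediction task."""
    vocab = sorted(set(text))
    stoi = {c: i for i, c in enumerate(vocab)}
    xs = [stoi[c] for c in text[:-1]]   # character at time t
    ys = [stoi[c] for c in text[1:]]    # character at time t + 1
    return xs, ys, vocab
```

The indices would then be one-hot encoded (or fed through an embedding) before going into the recurrent layer.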

@pavanky
Member

pavanky commented Jul 12, 2017

@jramapuram @WilliamTambellini If you have specific examples in mind, please let me know. Preferably one already implemented as an example in another ML toolkit :)

@pavanky
Member

pavanky commented Jul 12, 2017

@jramapuram @WilliamTambellini I think I am going to target this example as a first step: https://github.com/pytorch/examples/tree/master/word_language_model

@pavanky pavanky self-assigned this Jul 23, 2017
@WilliamTambellini

Hi @pavanky, that sounds very good: the Penn Treebank dataset is quite small (about 5 MB) and training time shouldn't be long. Perfect for an example. Have you decided between Elman, GRU, or LSTM?

@pavanky
Member

pavanky commented Aug 29, 2017

@WilliamTambellini Will start with plain (Elman) RNNs first.
