This repo contains (or will shortly contain) a PyTorch implementation of the Transformer architecture (Vaswani et al., 2017), as well as experiments with generative pre-training (Radford et al., 2018; Devlin et al., 2018).
The repo also contains slides for a presentation given at the Scientific Discussions at Intact Data Lab.
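As a rough illustration of what the implementation will build on, here is a minimal sketch of scaled dot-product attention, the core operation of the Transformer (Vaswani et al., 2017). It is not the repo's code, just a self-contained reference; the function name and tensor layout are illustrative.

```python
import math
import torch

def scaled_dot_product_attention(query, key, value, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V as in Vaswani et al. (2017).

    query, key, value: tensors of shape (batch, heads, seq_len, d_k).
    mask: optional boolean tensor broadcastable to (batch, heads, seq_len, seq_len);
          positions where mask is False are excluded from attention.
    """
    d_k = query.size(-1)
    # Similarity scores between every query and key position, scaled by sqrt(d_k).
    scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions get -inf so their softmax weight becomes zero.
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, value), weights
```

In a multi-head attention layer, the queries, keys, and values come from learned linear projections of the input, and the per-head outputs are concatenated and projected back to the model dimension.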
- Create a training setup similar to the one described in Vaswani et al. (2017)
- Add dropout
- Use byte-pair encoding (BPE) to tokenize sentences
- Preprocess text with spaCy
- Train on WMT and the Cornell Movie-Dialogs Corpus
- Add label smoothing (see the sketch after this list)
- Implement beam search for decoding
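For the label smoothing item, a minimal sketch of the common KL-divergence formulation used for Transformer training (Vaswani et al. use a smoothing value of 0.1) is shown below. The `vocab_size` and `padding_idx` arguments are placeholders for whatever vocabulary and padding token the final pipeline ends up using.

```python
import torch
import torch.nn as nn

class LabelSmoothingLoss(nn.Module):
    """KL-divergence loss against a smoothed target distribution.

    The true class gets probability 1 - smoothing; the remaining mass is
    spread uniformly over the other (non-padding) classes.
    """
    def __init__(self, vocab_size, padding_idx, smoothing=0.1):
        super().__init__()
        self.criterion = nn.KLDivLoss(reduction="sum")
        self.vocab_size = vocab_size
        self.padding_idx = padding_idx
        self.smoothing = smoothing

    def forward(self, log_probs, target):
        # log_probs: (batch * seq_len, vocab_size) log-probabilities
        # target:    (batch * seq_len,) gold token indices
        smooth_dist = torch.full_like(log_probs, self.smoothing / (self.vocab_size - 2))
        smooth_dist.scatter_(1, target.unsqueeze(1), 1.0 - self.smoothing)
        # Never assign probability mass to the padding token.
        smooth_dist[:, self.padding_idx] = 0.0
        # Zero out rows whose target is padding so they contribute no loss.
        pad_rows = (target == self.padding_idx).unsqueeze(1)
        smooth_dist = smooth_dist.masked_fill(pad_rows, 0.0)
        return self.criterion(log_probs, smooth_dist)
```

The loss expects log-probabilities, so the decoder output should go through `log_softmax` before being passed in.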
- Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems, 2017.
- Radford, Alec, et al. "Improving language understanding by generative pre-training." 2018. URL: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
- Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805, 2018.