This repository contains the slides of my talk “Introduction to Advanced Neural Network Architectures for Natural Language Processing”. The target audience is people who are new to Machine Learning and have perhaps been exposed to the very basics of neural networks before. Undergraduate linear algebra knowledge should be enough to follow along. The talk is designed to take roughly two hours. It first reviews deep learning basics, then presents proven architectures in natural language processing (recurrent and convolutional neural networks and the attention mechanism), and finally introduces the current state-of-the-art approach of pre-training and fine-tuning, specifically the OpenAI GPT and BERT models.
The contents of this talk are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).