This repo contains short summaries of papers I've been reading, primarily at the intersection of deep learning and computer vision. If you find these summaries helpful, please consider starring the repo so that other readers can find it more easily!
-
Added 6/2018:
- SoundNet: Learning Sound Representations from Unlabeled Video (Aytar, 2016)
- Dynamic Memory Networks for Visual and Textual Question Answering (Xiong, 2016)
- Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books (Zhu, 2015)
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (Zhu, 2017)
- Dense-Captioning Events in Videos (Krishna, 2017)
- Grounding of Textual Phrases in Images by Reconstruction (Rohrbach, 2015)
- Deep Compositional Question Answering with Neural Module Networks (Andreas, 2015)
- StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks (Zhang, 2016)
- Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning (Das, 2017)
-
Added 2/2018:
-
Added 11/2017:
-
Added 10/2017: