Papers of Interest
For the evaluation of the theoretical aspects of the course, we offer a list of papers of interest that the student may choose to read and review. These are loosely categorized. For older papers that have been thoroughly reviewed by the community (more than two years old), the student will be expected to focus on novel interpretations of the contribution.
Convolutional Neural Networks
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “Imagenet classification with deep convolutional neural networks.” Advances in neural information processing systems. 2012.
Zeiler, Matthew D., and Rob Fergus. “Visualizing and understanding convolutional networks.” European conference on computer vision. Springer, Cham, 2014.
Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition.” arXiv preprint arXiv:1409.1556 (2014).
Lin, Min, Qiang Chen, and Shuicheng Yan. “Network in network.” arXiv preprint arXiv:1312.4400 (2013).
Szegedy, Christian, et al. “Going deeper with convolutions.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
He, Kaiming, et al. “Deep residual learning for image recognition.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
Jaderberg, Max, Karen Simonyan, and Andrew Zisserman. “Spatial transformer networks.” Advances in Neural Information Processing Systems. 2015.
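The papers above describe architectures rather than code; as a small companion, the following is a minimal PyTorch sketch of the skip-connection idea from He et al. (2016). The layer sizes and the test input are arbitrary choices for this illustration, not values taken from the paper.

    # Minimal sketch of a residual block, illustrating the skip-connection
    # idea from He et al. (2016).  Sizes are arbitrary; this is not a
    # faithful reproduction of any ResNet variant.
    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            # F(x) + x: the block learns a residual on top of the identity path.
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            return self.relu(out + x)

    if __name__ == "__main__":
        block = ResidualBlock(channels=16)
        x = torch.randn(1, 16, 32, 32)   # one 32x32 feature map with 16 channels
        print(block(x).shape)            # torch.Size([1, 16, 32, 32])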
Generative Adversarial Networks
The next three papers should be read and reviewed together
Embedding Spaces
The next four papers should be read and reviewed together
Mikolov, Tomas, et al. “Efficient estimation of word representations in vector space.” arXiv preprint arXiv:1301.3781 (2013).
Mikolov, Tomas, et al. “Distributed representations of words and phrases and their compositionality.” Advances in neural information processing systems. 2013.
Mikolov, Tomas, Wen-tau Yih, and Geoffrey Zweig. “Linguistic regularities in continuous space word representations.” Proceedings of NAACL-HLT. 2013.
Mikolov, Tomas, Quoc V. Le, and Ilya Sutskever. “Exploiting similarities among languages for machine translation.” arXiv preprint arXiv:1309.4168 (2013).
Rong, Xin. “word2vec parameter learning explained.” arXiv preprint arXiv:1411.2738 (2014).
Goldberg, Yoav, and Omer Levy. “word2vec Explained: deriving Mikolov et al.’s negative-sampling word-embedding method.” arXiv preprint arXiv:1402.3722 (2014).
Pennington, Jeffrey, Richard Socher, and Christopher Manning. “Glove: Global vectors for word representation.” Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014.
Levy, Omer, and Yoav Goldberg. “Linguistic regularities in sparse and explicit word representations.” Proceedings of the eighteenth conference on computational natural language learning. 2014.
Zou, Will Y., et al. “Bilingual word embeddings for phrase-based machine translation.” Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 2013.
Szegedy, Christian, et al. “Intriguing properties of neural networks.” arXiv preprint arXiv:1312.6199 (2013). Limitations of neuron semantics, and introduction to adversarial examples.
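As a concrete companion to the word2vec papers above, here is a toy NumPy sketch of skip-gram with negative sampling in the spirit of Mikolov et al. (2013), following the derivation in Goldberg and Levy (2014). The corpus, dimensions, and hyperparameters are invented for illustration; real implementations add frequency-weighted negative sampling and subsampling of frequent words, which are omitted here.

    # Toy skip-gram with negative sampling on a made-up corpus.
    import numpy as np

    rng = np.random.default_rng(0)
    corpus = "the quick brown fox jumps over the lazy dog".split()
    vocab = sorted(set(corpus))
    idx = {w: i for i, w in enumerate(vocab)}
    V, D = len(vocab), 8                        # vocabulary size, embedding dim

    W_in = rng.normal(scale=0.1, size=(V, D))   # "input" (center-word) vectors
    W_out = rng.normal(scale=0.1, size=(V, D))  # "output" (context-word) vectors

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    lr, window, k = 0.05, 2, 3                  # learning rate, window, negatives
    for _ in range(200):                        # a few passes over the tiny corpus
        for pos, word in enumerate(corpus):
            c = idx[word]
            lo, hi = max(0, pos - window), min(len(corpus), pos + window + 1)
            for ctx_pos in range(lo, hi):
                if ctx_pos == pos:
                    continue
                o = idx[corpus[ctx_pos]]
                negs = rng.integers(0, V, size=k)     # uniform negatives (toy choice)
                # positive pair: push sigmoid(v_c . u_o) toward 1
                g = sigmoid(W_in[c] @ W_out[o]) - 1.0
                grad_c = g * W_out[o]
                W_out[o] -= lr * g * W_in[c]
                # negative pairs: push sigmoid(v_c . u_neg) toward 0
                for n in negs:
                    gn = sigmoid(W_in[c] @ W_out[n])
                    grad_c += gn * W_out[n]
                    W_out[n] -= lr * gn * W_in[c]
                W_in[c] -= lr * grad_c

    # cosine similarity between two learned vectors
    a, b = W_in[idx["quick"]], W_in[idx["brown"]]
    print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))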
Multimodal Approaches
Karpathy, Andrej, and Li Fei-Fei. “Deep visual-semantic alignments for generating image descriptions.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
Vinyals, Oriol, et al. “Show and tell: A neural image caption generator.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
Kiros, Ryan, Ruslan Salakhutdinov, and Richard S. Zemel. “Unifying visual-semantic embeddings with multimodal neural language models.” arXiv preprint arXiv:1411.2539 (2014).
Vendrov, Ivan, et al. “Order-embeddings of images and language.” arXiv preprint arXiv:1511.06361 (2015).
Socher, Richard, et al. “Zero-shot learning through cross-modal transfer.” Advances in neural information processing systems. 2013.
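Most of the papers above score image-sentence pairs by embedding both modalities into a shared space and ranking by similarity. The sketch below illustrates only that scoring step; the random projection matrices stand in for a trained CNN image encoder and a trained sentence encoder, and all feature sizes are assumptions made for this example.

    # Ranking candidate captions against an image in a joint embedding space.
    import numpy as np

    rng = np.random.default_rng(1)
    D_img, D_txt, D_joint = 4096, 300, 128        # assumed feature sizes

    W_img = rng.normal(size=(D_joint, D_img))     # placeholder for a learned projection
    W_txt = rng.normal(size=(D_joint, D_txt))     # placeholder for a learned projection

    def embed(x, W):
        z = W @ x
        return z / np.linalg.norm(z)              # L2-normalize so dot product = cosine

    image_feat = rng.normal(size=D_img)           # stands in for CNN features
    captions = {name: rng.normal(size=D_txt)      # stand in for sentence encodings
                for name in ["caption_a", "caption_b", "caption_c"]}

    img_vec = embed(image_feat, W_img)
    scores = {name: float(img_vec @ embed(v, W_txt)) for name, v in captions.items()}
    print(sorted(scores.items(), key=lambda kv: -kv[1]))  # best-matching caption first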
Transfer Learning
Recurrent Neural Networks
Graves, Alex. “Generating sequences with recurrent neural networks.” arXiv preprint arXiv:1308.0850 (2013).
Cho, Kyunghyun, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. “Learning phrase representations using RNN encoder-decoder for statistical machine translation.” Proceedings of EMNLP. 2014. arXiv:1406.1078.
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to sequence learning with neural networks.” Advances in neural information processing systems. 2014. arXiv:1409.3215.
Graves, Alex, Greg Wayne, and Ivo Danihelka. “Neural Turing machines.” arXiv preprint arXiv:1410.5401 (2014).
Schuster, Mike, and Kuldip K. Paliwal. “Bidirectional recurrent neural networks.” IEEE Transactions on Signal Processing. 1997.
Kalchbrenner, Nal, Ivo Danihelka, and Alex Graves. “Grid long short-term memory.” arXiv preprint arXiv:1507.01526 (2015).
Tai, Kai Sheng, Richard Socher, and Christopher D. Manning. “Improved semantic representations from tree-structured long short-term memory networks.” Proceedings of ACL. 2015. arXiv:1503.00075.
Kumar, Ankit, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher. “Ask me anything: Dynamic memory networks for natural language processing.” arXiv preprint arXiv:1506.07285 (2015).
Zaremba, Wojciech, Ilya Sutskever, and Oriol Vinyals. “Recurrent neural network regularization.” arXiv preprint arXiv:1409.2329 (2014).
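For the encoder-decoder papers (Cho et al. 2014; Sutskever et al. 2014), the following is a minimal PyTorch sketch of a GRU encoder-decoder forward pass. The vocabulary sizes and dimensions are arbitrary, and batching, attention, and the training loop are omitted.

    # Minimal GRU encoder-decoder forward pass (no attention, no training).
    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, emb)
            self.tgt_emb = nn.Embedding(tgt_vocab, emb)
            self.encoder = nn.GRU(emb, hidden, batch_first=True)
            self.decoder = nn.GRU(emb, hidden, batch_first=True)
            self.out = nn.Linear(hidden, tgt_vocab)

        def forward(self, src, tgt):
            # Encode the source sequence into a fixed-size summary vector h.
            _, h = self.encoder(self.src_emb(src))
            # Condition the decoder on h and the (shifted) target sequence.
            dec_out, _ = self.decoder(self.tgt_emb(tgt), h)
            return self.out(dec_out)          # per-step logits over target vocab

    if __name__ == "__main__":
        model = Seq2Seq(src_vocab=100, tgt_vocab=120)
        src = torch.randint(0, 100, (1, 7))   # one source sentence of 7 tokens
        tgt = torch.randint(0, 120, (1, 5))   # one target prefix of 5 tokens
        print(model(src, tgt).shape)          # torch.Size([1, 5, 120])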
Systems and Hardware for Deep Learning
Yiping Kang, Johann Hauswald, Cao Gao, Austin Rovinski, Trevor Mudge, Jason Mars, Lingjia Tang: Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge. ASPLOS 2017: 615-629
Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, et al. In-Datacenter Performance Analysis of a Tensor Processing Unit. ISCA 2017: 1-12
Tianshi Chen, Zidong Du, Ninghui Sun, Jia Wang, Chengyong Wu, Yunji Chen, Olivier Temam: DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. ASPLOS 2014: 269-284
Ammar Ahmad Awan, Khaled Hamidouche, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda: S-Caffe: Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters. PPoPP 2017: 193-205
Muhammet Mustafa Ozdal, Serif Yesil, Taemin Kim, Andrey Ayupov, John Greth, Steven M. Burns, Özcan Özturk: Energy Efficient Architecture for Graph Analytics Accelerators. ISCA 2016: 166-177
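Neurosurgeon's core idea, choosing the layer at which to hand a DNN off from the mobile device to the cloud, reduces to a small search over split points. The sketch below illustrates that search with invented per-layer latencies and output sizes; the real system profiles and predicts these values per layer and also optimizes for energy.

    # Toy split-point search in the spirit of Neurosurgeon (Kang et al., 2017).
    # All per-layer numbers and the uplink bandwidth are invented for illustration.
    layers = [
        # (name, mobile_ms, cloud_ms, output_kb)
        ("conv1", 12.0, 1.5, 280.0),
        ("pool1",  2.0, 0.3,  70.0),
        ("conv2", 20.0, 2.5, 140.0),
        ("fc1",   30.0, 3.0,   4.0),
        ("fc2",    5.0, 0.5,   1.0),
    ]
    UPLINK_KB_PER_MS = 1.0                    # assumed wireless bandwidth

    def total_latency(split):
        """Latency if layers[:split] run on the phone and layers[split:] in the cloud."""
        mobile = sum(l[1] for l in layers[:split])
        cloud = sum(l[2] for l in layers[split:])
        # Data shipped is the output of the last on-device layer, or the raw
        # input (here assumed to be 600 KB) when everything runs in the cloud.
        upload_kb = layers[split - 1][3] if split > 0 else 600.0
        return mobile + upload_kb / UPLINK_KB_PER_MS + cloud

    best = min(range(len(layers) + 1), key=total_latency)
    print("best split:", best, "latency (ms):", round(total_latency(best), 1))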