This repository collects important papers (mostly NLP) and well-explained materials that everyone working in the field should know about and read.
I have also implemented several state-of-the-art NLP models, which you can find in my repo. Check:
- Neural Network Language Model (NNLM)
- Attention Is All You Need (Transformer)
- NLP: Pretrained Language Model, Machine Translation, Text Summarization
- CV: Image-to-image Translation
- Learning Algorithm: Meta Learning
- Deep Learning
- NLP
- Yongjun Hong, et al. How Generative Adversarial Networks and Their Variants Work: An Overview. ACM 2019. [ACM]
- Samuel L. Smith, et al. Don't Decay the Learning Rate, Increase the Batch Size. ICLR 2018. [ICLR]
- Peter F. Brown, et al. Class-Based n-gram Models of Natural Language. 1992. [ACL Anthology]
- Tomas Mikolov, et al. Efficient Estimation of Word Representations in Vector Space. 2013. [ArXiv]
- Tomas Mikolov, et al. Distributed Representations of Words and Phrases and their Compositionality. NIPS 2013. [ArXiv]
- Quoc V. Le and Tomas Mikolov. Distributed Representations of Sentences and Documents. 2014. [ArXiv]
- Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation. 2014. [ACL Anthology]
- Piotr Bojanowski, et al. Enriching Word Vectors with Subword Information. 2017. [ACL Anthology]
- Junjie Hu, et al. XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization. 2020. [ArXiv]
- Kishore Papineni, et al. BLEU: a Method for Automatic Evaluation of Machine Translation. 2002. [CiteSeer]
- Chin-Yew Lin. ROUGE: A Package for Automatic Evaluation of Summaries. ACL 2004. [ACL Anthology]
- Amosse Edouard. Event Detection and Analysis On Short Text Messages. 2018. [ResearchGate]
- Deepayan Chakrabarti and Kunal Punera. Event Summarization Using Tweets. ICWSM 2011. [ResearchGate]
- Maria Vargas-Vera and David Celjuska. Event Recognition on News Stories and Semi-Automatic Population of an Ontology. Web Intelligence 2004. [ResearchGate]
- Junyoung Chung, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. CoRR 2014. [ArXiv]
- Steven J. Rennie, et al. Self-critical Sequence Training for Image Captioning. CVPR 2017. [ArXiv]
- Alexey Dosovitskiy, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR 2021. [ICLR]
- Jun-Yan Zhu, et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. ICCV 2017. [ArXiv]
- Yunjey Choi, et al. StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. CVPR 2018. [ArXiv]
- Taesung Park, et al. Contrastive Learning for Unpaired Image-to-Image Translation. ECCV 2020. [ArXiv]
- Yoshua Bengio, et al. A Neural Probabilistic Language Model. Journal of Machine Learning Research, 2003. [ACM DL]
- Rafal Jozefowicz, et al. Exploring the Limits of Language Modeling. 2016. [ArXiv]
- Matthew Peters, et al. Semi-supervised sequence tagging with bidirectional language models. ACL 2017. [ArXiv]
- Matthew Peters, et al. Deep contextualized word representations. NAACL 2018. [ArXiv]
- Jacob Devlin, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2018. [ArXiv]
- Jeremy Howard and Sebastian Ruder. Universal Language Model Fine-tuning for Text Classification. ACL 2018. [ArXiv]
- Alec Radford, et al. Improving Language Understanding by Generative Pre-Training. 2018. [OpenAI]
- Alec Radford, et al. Language Models are Unsupervised Multitask Learners. 2019. [OpenAI]
- Zhenzhong Lan, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. ICLR 2020. [OpenReview]
- Zihang Dai, et al. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. ACL 2019. [ArXiv]
- Zhilin Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding. NIPS 2019. [ArXiv]
- Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. [ArXiv]
- Nikita Kitaev, et al. Reformer: The Efficient Transformer. [ArXiv]
- Kevin Clark, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ICLR 2020. [ArXiv]
- Tom B. Brown, et al. Language Models are Few-Shot Learners. 2020. [ArXiv]
- Louis Martin, et al. CamemBERT: a Tasty French Language Model. ACL 2020. [ArXiv]
- Dzmitry Bahdanau, et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015. [ArXiv]
- Minh-Thang Luong, et al. Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015. [ArXiv]
- Denny Britz, et al. Massive Exploration of Neural Machine Translation Architectures. EMNLP 2017. [ArXiv]
- Yun Chen, et al. A Teacher-Student Framework for Zero-Resource Neural Machine Translation. ACL 2017. [ArXiv]
- Ashish Vaswani, et al. Attention Is All You Need. 2017. [ArXiv]
- Guillaume Lample and Alexis Conneau. Cross-lingual Language Model Pretraining. 2019. [ArXiv]
- Alexis Conneau, et al. Unsupervised Cross-lingual Representation Learning at Scale. ACL 2020. [ArXiv]
- Christos Baziotis, et al. Language Model Prior for Low-Resource Neural Machine Translation. EMNLP 2020. [ArXiv]
- Chelsea Finn, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017. [ArXiv]
- Sachin Ravi and Hugo Larochelle. Optimization as a Model for Few-Shot Learning. ICLR 2017. [OpenReview]
- Andrei A. Rusu, et al. Meta-Learning with Latent Embedding Optimization. ICLR 2019. [ArXiv]
- Aravind Rajeswaran, et al. Meta-Learning with Implicit Gradients. NIPS 2019. [ArXiv]
- Victor Sanh, et al. A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks. AAAI 2019. [ArXiv]
- Guillaume Lample, et al. Neural Architectures for Named Entity Recognition. ACL 2016. [ArXiv]
- Xuezhe Ma and Eduard Hovy. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF. ACL 2016. [ArXiv]
- Matthew Peters, et al. Semi-Supervised Sequence Tagging With Bidirectional Language Models. ACL 2017. [ArXiv]
- Kevin Clark, et al. Semi-Supervised Sequence Modeling with Cross-View Training. EMNLP 2018. [ArXiv]
- Matthew Peters, et al. Deep Contextualized Word Representations. NAACL 2018. [ArXiv]
- Abbas Ghaddar and Philippe Langlais. Robust Lexical Features for Improved Neural Network Named-Entity Recognition. COLING 2018. [ACL Anthology]
- Alan Akbik, et al. Contextual String Embeddings for Sequence Labeling. ACL 2018. [ResearchGate]
- Alexei Baevski, et al. Cloze-driven Pretraining of Self-attention Networks. 2019. [ArXiv]
- John Lafferty, Andrew McCallum, and Fernando C.N. Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML 2001. [ACM DL]
- Kristopher De Asis, et al. Multi-Step Reinforcement Learning: A Unifying Algorithm. AAAI 2018. [ArXiv]
- Thibault Fevry and Jason Phang. Unsupervised Sentence Compression using Denoising Auto-Encoders. CoNLL 2018. [ACL Anthology]
- Ilya Sutskever, et al. Sequence to Sequence Learning with Neural Networks. 2014. [ArXiv]
- Yoon Kim. Convolutional Neural Networks for Sentence Classification. EMNLP 2014. [ArXiv]
- Xiang Zhang, et al. Character-Level Convolutional Networks For Text Classification. NIPS 2015. [ArXiv]
- Yoon Kim, et al. Character-Aware Neural Language Models. AAAI 2016. [ArXiv]
- Zichao Yang, et al. Hierarchical Attention Networks for Document Classification. NAACL 2016. [ACL Anthology]
- Alon Jacovi, et al. Understanding Convolutional Neural Networks for Text Classification. EMNLP 2018. [ACL Anthology]
- Lantao Yu, et al. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. AAAI 2017. [ArXiv]
- William Fedus, et al. MaskGAN: Better Text Generation via Filling in the ______. ICLR 2018. [ArXiv]
- Weili Nie, et al. RelGAN: Relational Generative Adversarial Networks for Text Generation. ICLR 2019. [ICLR]
- Kaitao Song, et al. MASS: Masked Sequence to Sequence Pre-Training for Language Generation. ICML 2019. [ArXiv]
- Mike Lewis, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. ACL 2020. [ArXiv]
- Zichao Yang, et al. Unsupervised Text Style Transfer using Language Models as Discriminators. NIPS 2018. [ArXiv]
- Sandeep Subramanian, et al. Multiple-Attribute Text Style Transfer. ICLR 2019. [ArXiv]
- Romain Paulus, et al. A Deep Reinforced Model for Abstractive Summarization. ICLR 2018. [ArXiv]
- Angela Fan, et al. Controllable Abstractive Summarization. ACL 2018. [ArXiv]
- Yaushian Wang and Hung-Yi Lee. Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks. EMNLP 2018. [ACL Anthology]
- Peter J. Liu, et al. SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders. 2019. [ArXiv]
- Christos Baziotis, et al. SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression. NAACL 2019. [ArXiv]
- Jingqing Zhang, et al. PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. ICML 2020. [ArXiv]