diff --git a/2022Fall_AntNLP/README.md b/2022Fall_AntNLP/README.md
index 9a22a59..d7dcb2f 100644
--- a/2022Fall_AntNLP/README.md
+++ b/2022Fall_AntNLP/README.md
@@ -44,7 +44,7 @@ Week | Date | Speaker | Paper | Materials
 3 |10.5 | 国庆节 |-|-
 4 |10.12 | 王志承 ||
 5 |10.19 | 杜威 |Document-Level Event Extraction|[Slides](https://github.com/AntNLP/seminar/blob/master/2022Fall_AntNLP/week5/2022-10-24%E7%BB%84%E4%BC%9A.pdf)
-6 |10.26 | 刘宇芳 ||
+6 |10.26 | 刘宇芳 |Calibrating Factual Knowledge in PLMs|[Slides](https://github.com/AntNLP/seminar/blob/master/2022Fall_AntNLP/week6/1102.pdf)
 7 |11.2 | 杨晰 ||
 8 |11.9 | 汪杰 ||
 9 |11.16 | 李雨倩 ||
diff --git a/2022Fall_AntNLP/week6/1102.pdf b/2022Fall_AntNLP/week6/1102.pdf
new file mode 100644
index 0000000..29229fd
Binary files /dev/null and b/2022Fall_AntNLP/week6/1102.pdf differ
diff --git a/2022Fall_PLM/README.md b/2022Fall_PLM/README.md
index dcea5de..c7e189c 100644
--- a/2022Fall_PLM/README.md
+++ b/2022Fall_PLM/README.md
@@ -20,7 +20,7 @@ Welcome to AntNLP Seminar for PLMs 2022 Fall. : )
 Week | Date | Speaker | Topic |Paper | Key Phrase |Slides
 ---- | ---- | ---- | ---- | ---- | ---- | ----
 1 | 10.20 | 李雨倩 | Preliminaries: Past, Architectures, Pre-training, Capabilities | 1. [Attention Is All You Need](https://arxiv.org/abs/1706.03762) (Transformer)<br>2. [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805.pdf) (Bert)<br>3. [Improving Language Understanding by Generative Pre-Training](https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf) (OpenAI GPT)<br>4. [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/pdf/1907.11692.pdf) (RoBERTa)<br>5. [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/pdf/2003.10555.pdf) (ELECTRA)| transformer; encoder-only models|[Slides](https://github.com/AntNLP/seminar/tree/master/2022Fall_PLM/week1/ppt.pptx)
-2 | 10.27 | 刘燕婷 | Other Pretraining Language Models I|1. [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf) (T5)<br>2. [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/pdf/1910.13461.pdf) (BART)<br>3. [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/pdf/2010.11934.pdf) (mT5)<br>4. [AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2seq Model](https://arxiv.org/pdf/2208.01448.pdf) (AlexaTM) | encoder-decoder models |
+2 | 10.27 | 刘燕婷 | Other Pretraining Language Models I|1. [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf) (T5)<br>2. [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/pdf/1910.13461.pdf) (BART)<br>3. [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/pdf/2010.11934.pdf) (mT5)<br>4. [AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2seq Model](https://arxiv.org/pdf/2208.01448.pdf) (AlexaTM) | encoder-decoder models |[Slides](https://github.com/AntNLP/seminar/tree/master/2022Fall_PLM/week2/ppt.pptx)
 3 | 11.3 | 杜威 | Other Pretraining Language Models II | 1. [Language Models are Few-Shot Learners](https://arxiv.org/pdf/2005.14165.pdf) (GPT3)<br>2. [Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) (GPT2)<br>3. [PaLM: Scaling Language Modeling with Pathways](https://arxiv.org/pdf/2204.02311.pdf) (PaLM)<br>4. [OPT: Open Pre-trained Transformer Language Models](https://arxiv.org/pdf/2205.01068.pdf) (OPT) | decoder-only models |
 4 | 11.10 | 丁炫文 | Prompting for few-shot learning |1. [Making Pre-trained Language Models Better Few-shot Learners](https://arxiv.org/pdf/2012.15723.pdf)<br>2. [How Many Data Points is a Prompt Worth?](https://arxiv.org/pdf/2103.08493.pdf)<br>3. [Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference](https://arxiv.org/pdf/2001.07676.pdf)<br>4. [True Few-Shot Learning with Language Models](https://arxiv.org/pdf/2105.11447.pdf)<br>5. [Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models](https://arxiv.org/pdf/2106.13353.pdf) | PET |
 5 | 11.17 | 汪杰 | In-context learning and limits | 1. [Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity](https://arxiv.org/abs/2104.08786)<br>2. [On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model](https://arxiv.org/abs/2204.13509)<br>3. [Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?](https://arxiv.org/pdf/2202.12837.pdf)<br>4. [Impact of Pretraining Term Frequencies on Few-Shot Reasoning](https://arxiv.org/pdf/2202.07206.pdf)<br>5. [Do Prompt-Based Models Really Understand the Meaning of their Prompts?](https://arxiv.org/abs/2109.01247) | In-context learning |
diff --git a/2022Fall_PLM/week2/ppt.pptx b/2022Fall_PLM/week2/ppt.pptx
new file mode 100644
index 0000000..d533f63
Binary files /dev/null and b/2022Fall_PLM/week2/ppt.pptx differ