Note that this repo is still a work in progress...
In this repository, I share my solutions and additional resources for CS224N (Spring 2024). While there are already some excellent repos offering solutions for various semesters, I found that none fully covers the latest version of the course. Moreover, many of those solutions appear to be copied from past offerings and may not reflect the latest updates. Hence, I decided to document my own learning process and provide all the solutions here. If you spot any mistakes, feel free to file an issue!
Just follow your own path.
- Watch Lecture 3 - Backprop and Neural Networks.
- Add assignment 1 part 1 draft.
- Watch Lecture 5 - Recurrent Neural Networks (RNNs) and Lecture 6 - Simple and LSTM RNNs.
- Read cs224n-2019-notes04-dependencyparsing.
- Add assignment 2 coding part and some of the written part.
- Train a neural dependency parser, which achieves UAS 88.47 on the dev set and 89.18 on the test set.
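For my own reference, the assignment 2 parser is a feed-forward network that scores transitions from features extracted from the current stack/buffer configuration. The sketch below is my own simplified illustration of that idea; the class name, argument names, and sizes are assumptions, not the assignment's starter code.

```python
import torch
import torch.nn as nn

class TransitionScorer(nn.Module):
    """Minimal sketch of a feed-forward transition classifier for
    dependency parsing; names and sizes are illustrative only."""

    def __init__(self, vocab_size=10000, embed_size=50, n_features=36,
                 hidden_size=200, n_transitions=3, dropout_prob=0.5):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embed_size)
        self.hidden = nn.Linear(n_features * embed_size, hidden_size)
        self.dropout = nn.Dropout(dropout_prob)
        self.out = nn.Linear(hidden_size, n_transitions)  # e.g. shift / left-arc / right-arc

    def forward(self, feature_ids):
        # feature_ids: (batch, n_features) ids of words / POS tags / arc labels
        # extracted from the current parser state
        x = self.embeddings(feature_ids).flatten(start_dim=1)
        h = torch.relu(self.hidden(x))
        return self.out(self.dropout(h))  # logits over transitions
```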
- Watch Lecture 7 - Translation, Seq2Seq, Attention, Lecture 8 - Self-Attention and Transformers, and Lecture 9 - Pretraining.
- (Optional) Read An Analysis on the Learning Rules of the Skip-Gram Model up to Section III.
- Add assignment 2 written part up to Q1 (c).
- Read Dropout: A Simple Way to Prevent Neural Networks from Overfitting.
- Only focus on the motivation and effect of dropout.
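To make the "effect of dropout" concrete for myself, here is a minimal inverted-dropout sketch (my own illustration with a made-up function name, not the paper's or any library's code):

```python
import torch

def inverted_dropout(x, p=0.5, training=True):
    """Randomly zero units during training and rescale by 1/(1-p),
    so the expected activation is unchanged and nothing special is
    needed at test time. `torch.nn.Dropout` does this bookkeeping."""
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) > p).float()
    return x * mask / (1.0 - p)
```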
- Finish assignment 2.
- Do the assignment 3 coding part.
- Temporarily skip the code snippets related to beam search (a rough sketch of the idea is included below).
- Train an NMT model, which achieves BLEU 22.33 on the test set.
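Since I skipped the beam search snippets for now, here is a rough, generic sketch of the idea to come back to later. The `step` interface and all names are my own assumptions, not the assignment's API.

```python
import torch

def beam_search(step, start_id, end_id, beam_size=5, max_len=50):
    """Generic beam search sketch. `step(prefix)` is assumed to return
    log-probabilities over the next token given a list of token ids."""
    beams = [([start_id], 0.0)]          # (prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            log_probs = step(prefix)                     # shape: (vocab_size,)
            top_lp, top_ids = torch.topk(log_probs, beam_size)
            for lp, tok in zip(top_lp.tolist(), top_ids.tolist()):
                candidates.append((prefix + [tok], score + lp))
        # keep only the best `beam_size` partial hypotheses
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            # hypotheses that produced the end token are done
            (finished if prefix[-1] == end_id else beams).append((prefix, score))
        if not beams:
            break
    # return the highest-scoring (hypothesis, score) pair
    return max(finished or beams, key=lambda c: c[1])
```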
- Finish assignment 3.
- Read chapters 1 and 2 of Natural Language Processing with Transformers.
- Read chapters 3 to 5 of Natural Language Processing with Transformers.
I switched back to the Spring 2023 course schedule, because there are no updated lectures for 2024. However, all solutions still follow the 2024 version.
- Watch Lecture 11 - Natural Language Generation.
- I think it's more consistent to take this lecture before Lecture 10 - Prompting, Reinforcement Learning from Human Feedback.
- Read chapter 6 of Natural Language Processing with Transformers.
- Add assignment 4 written part up to Q1 (c) i and Q2.
- Recently played around with a RAG application.
- As my custom final project?
- Read chapter 7 of Natural Language Processing with Transformers.
- Haven't run the corresponding notebook due to problems launching Elasticsearch locally.
- Watch Lecture 12 - Question Answering.
- Read RoFormer: Enhanced Transformer with Rotary Position Embedding.
- Have to reread it to grasp the main points.
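To fix the core idea before rereading: RoPE rotates pairs of query/key dimensions by position-dependent angles, so the dot product between a query and a key depends only on their relative position. The sketch below is my own note, using the common half-split layout rather than the paper's interleaved pairs; all names are mine.

```python
import torch

def rotary_embedding(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim),
    with dim even. Each pair of dimensions is rotated by an angle that
    grows with position. Sketch for my own understanding only."""
    seq_len, dim = x.shape
    half = dim // 2
    # frequencies: theta_i = base^(-2i/dim), one per dimension pair
    freqs = base ** (-torch.arange(half, dtype=torch.float) / half)
    angles = torch.arange(seq_len, dtype=torch.float)[:, None] * freqs  # (seq_len, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    # rotate each (x1_i, x2_i) pair by its position-dependent angle
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```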
- Add assignment 4 written and coding parts.
- Pre-train and fine-tune a simple GPT, which achieves an accuracy of 29.00 on the dev set in Q3 (g); a rough sketch of the usual next-token loss is included below.
- I will finish Q1 (c) to (e) after reviewing probability theory.
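For future reference, here is a generic sketch of the shifted next-token-prediction loss that GPT-style pre-training typically uses. It is my own illustration with placeholder names (e.g. `pad_id`), not the assignment's exact code.

```python
import torch
import torch.nn.functional as F

def lm_loss(logits, input_ids, pad_id=0):
    """Next-token prediction loss for a causal language model.
    logits: (batch, seq_len, vocab); input_ids: (batch, seq_len)."""
    # shift so that the prediction at position t is scored against token t+1
    logits = logits[:, :-1, :]
    targets = input_ids[:, 1:]
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=pad_id,  # ignore padding positions, if any
    )
```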
- Watch Lecture 16 - Multimodal Deep Learning, Lecture 17 - Coreference Resolution, Lecture 18 - Model Analysis and Explanation, and Lecture 19 - Model Interpretability & Editing.
- I will dive deeper into Lecture 16, which looks fascinating. Also, I can't quite grasp Lecture 19 yet.
- Add the default final project (minBERT) with Section 3 implemented.
- Finish the first half of minBERT.
- Fine-tuning the last linear layer for SST: Dev acc 0.931.
- Fine-tuning the last linear layer for CFIMDB: Dev acc 0.771.
- Fine-tuning the full model for SST: Dev acc 0.517.
- Fine-tuning the full model for CFIMDB: Dev acc 0.959.
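The two regimes above differ only in what is trainable: "last linear layer" freezes the BERT encoder and trains just the classification head, while "full model" updates all parameters. A tiny sketch of how such a toggle can be implemented; the `bert` and `classifier` attribute names and the mode strings are my assumptions, not necessarily minBERT's exact ones.

```python
def configure_finetuning(model, mode="last-linear-layer"):
    """Switch between linear probing and full fine-tuning.
    Assumes `model` exposes a `bert` encoder and a `classifier` head."""
    freeze_encoder = (mode == "last-linear-layer")
    for param in model.bert.parameters():
        param.requires_grad = not freeze_encoder  # linear probing freezes the encoder
    for param in model.classifier.parameters():
        param.requires_grad = True                # the head is always trained
```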
- Read BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.