From 2ba406738442a3cbcc2994f0a1eeb6f9aecce333 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tesla=20Zhang=E2=80=AE?=
Date: Tue, 3 Sep 2024 14:55:41 -0400
Subject: [PATCH 1/2] Change the date to a more persistent pointer

---
 hw0.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw0.ipynb b/hw0.ipynb
index e15d948..211ec05 100644
--- a/hw0.ipynb
+++ b/hw0.ipynb
@@ -328,7 +328,7 @@
    "source": [
     "## Question 3: Softmax loss\n",
     "\n",
-    "Implement the softmax (a.k.a. cross-entropy) loss as defined in `softmax_loss()` function in `src/simple_ml.py`. Recall (hopefully this is review, but we'll also cover it in lecture on 9/1), that for a multi-class output that can take on values $y \\in \\{1,\\ldots,k\\}$, the softmax loss takes as input a vector of logits $z \\in \\mathbb{R}^k$, the true class $y \\in \\{1,\\ldots,k\\}$ returns a loss defined by\n",
+    "Implement the softmax (a.k.a. cross-entropy) loss as defined in the `softmax_loss()` function in `src/simple_ml.py`. Recall (hopefully this is review, but we'll also cover it in the second lecture of week 1) that for a multi-class output that can take on values $y \\in \\{1,\\ldots,k\\}$, the softmax loss takes as input a vector of logits $z \\in \\mathbb{R}^k$ and the true class $y \\in \\{1,\\ldots,k\\}$, and returns a loss defined by\n",
     "\\begin{equation}\n",
     "\\ell_{\\mathrm{softmax}}(z, y) = \\log\\sum_{i=1}^k \\exp z_i - z_y.\n",
     "\\end{equation}\n",

From ffbc3cbf8c313d7846196040f1e664cdd44d8bf0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tesla=20Zhang=E2=80=AE?=
Date: Tue, 3 Sep 2024 20:35:32 -0400
Subject: [PATCH 2/2] There seems to be more of these dates

---
 hw0.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw0.ipynb b/hw0.ipynb
index 211ec05..988a350 100644
--- a/hw0.ipynb
+++ b/hw0.ipynb
@@ -369,7 +369,7 @@
    "source": [
     "## Question 4: Stochastic gradient descent for softmax regression\n",
     "\n",
-    "In this question you will implement stochastic gradient descent (SGD) for (linear) softmax regression. In other words, as discussed in lecture on 9/1, we will consider a hypothesis function that makes $n$-dimensional inputs to $k$-dimensional logits via the function\n",
+    "In this question you will implement stochastic gradient descent (SGD) for (linear) softmax regression. In other words, as discussed in lecture 2 of week 1, we will consider a hypothesis function that maps $n$-dimensional inputs to $k$-dimensional logits via the function\n",
     "\\begin{equation}\n",
     "h(x) = \\Theta^T x\n",
     "\\end{equation}\n",
@@ -494,7 +494,7 @@
     "\\minimize_{W_1, W_2} \\;\\; \\ell_{\\mathrm{softmax}}(\\mathrm{ReLU}(X W_1) W_2, y).\n",
     "\\end{equation}\n",
     "\n",
-    "Using the chain rule, we can derive the backpropagation updates for this network (we'll briefly cover these in class, on 9/8, but also provide the final form here for ease of implementation). Specifically, let\n",
+    "Using the chain rule, we can derive the backpropagation updates for this network (we'll briefly cover these in lecture 2 of week 2, but also provide the final form here for ease of implementation). Specifically, let\n",
     "\\begin{equation}\n",
     "\\begin{split}\n",
     "Z_1 \\in \\mathbb{R}^{m \\times d} & = \\mathrm{ReLU}(X W_1) \\\\\n",
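
For reference, the loss touched by the first hunk is the softmax (cross-entropy) loss \ell_{\mathrm{softmax}}(z, y) = \log\sum_{i=1}^k \exp z_i - z_y. A minimal NumPy sketch of a batched version is below; it assumes Z is an (m, k) array of logits and y an (m,) array of integer labels, and the function name and signature are illustrative only, not necessarily the exact softmax_loss() API expected by src/simple_ml.py.

import numpy as np

def softmax_loss_sketch(Z, y):
    # Average softmax loss over a batch (illustrative sketch only;
    # the exact signature required in src/simple_ml.py may differ).
    # Z: (m, k) logits; y: (m,) integer labels in {0, ..., k-1}.
    # Subtracting the row-wise max leaves the loss unchanged but avoids overflow in exp.
    Zs = Z - Z.max(axis=1, keepdims=True)
    log_sum_exp = np.log(np.exp(Zs).sum(axis=1))
    # log sum_i exp z_i - z_y, averaged over the m examples
    return float(np.mean(log_sum_exp - Zs[np.arange(Z.shape[0]), y]))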
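The second hunk's question uses the linear hypothesis h(x) = \Theta^T x together with the same softmax loss; differentiating the average loss over a minibatch gives the gradient X^T (softmax(X \Theta) - I_y) / b, where I_y is the one-hot encoding of the labels and b the batch size. A sketch of one SGD epoch built on that expression follows; the function name, argument order, and defaults are assumptions, not the interface required by src/simple_ml.py.

import numpy as np

def softmax_regression_epoch_sketch(X, y, Theta, lr=0.1, batch=100):
    # One pass of minibatch SGD for linear softmax regression (illustrative only).
    # X: (m, n) examples; y: (m,) integer labels; Theta: (n, k) floats, updated in place.
    m, k = X.shape[0], Theta.shape[1]
    for start in range(0, m, batch):
        Xb, yb = X[start:start + batch], y[start:start + batch]
        b = Xb.shape[0]
        logits = Xb @ Theta                                   # h(x) = Theta^T x, batched as X Theta
        logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
        S = np.exp(logits)
        S /= S.sum(axis=1, keepdims=True)                     # row-wise softmax probabilities
        Iy = np.zeros((b, k))
        Iy[np.arange(b), yb] = 1.0                            # one-hot labels
        Theta -= lr * (Xb.T @ (S - Iy)) / b                   # gradient step on the average loss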
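The last hunk belongs to the two-layer-network objective \minimize_{W_1, W_2} \ell_{\mathrm{softmax}}(\mathrm{ReLU}(X W_1) W_2, y). One standard way to write the chain-rule gradients of the average loss, with Z_1 = \mathrm{ReLU}(X W_1), S the row-wise softmax of Z_1 W_2, G_2 = S - I_y, and G_1 = 1{Z_1 > 0} \circ (G_2 W_2^T), is X^T G_1 / b for W_1 and Z_1^T G_2 / b for W_2. The sketch below simply transcribes those expressions as a single full-batch gradient step; the name and update style are assumptions rather than the interface required by src/simple_ml.py.

import numpy as np

def two_layer_sgd_step_sketch(X, y, W1, W2, lr=0.1):
    # One full-batch gradient step for the two-layer ReLU network (illustrative only).
    # X: (m, n); y: (m,) integer labels; W1: (n, d); W2: (d, k); W1 and W2 updated in place.
    m, k = X.shape[0], W2.shape[1]
    Z1 = np.maximum(X @ W1, 0)                    # Z_1 = ReLU(X W_1)
    logits = Z1 @ W2
    logits = logits - logits.max(axis=1, keepdims=True)
    S = np.exp(logits)
    S /= S.sum(axis=1, keepdims=True)             # row-wise softmax
    Iy = np.zeros((m, k))
    Iy[np.arange(m), y] = 1.0                     # one-hot labels
    G2 = S - Iy                                   # gradient at the logits
    G1 = (Z1 > 0) * (G2 @ W2.T)                   # backprop through the ReLU
    W2 -= lr * (Z1.T @ G2) / m
    W1 -= lr * (X.T @ G1) / m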