Change the date to a more persistent pointer #14

Open · wants to merge 2 commits into base: main
6 changes: 3 additions & 3 deletions hw0.ipynb
@@ -328,7 +328,7 @@
"source": [
"## Question 3: Softmax loss\n",
"\n",
"Implement the softmax (a.k.a. cross-entropy) loss as defined in `softmax_loss()` function in `src/simple_ml.py`. Recall (hopefully this is review, but we'll also cover it in lecture on 9/1), that for a multi-class output that can take on values $y \\in \\{1,\\ldots,k\\}$, the softmax loss takes as input a vector of logits $z \\in \\mathbb{R}^k$, the true class $y \\in \\{1,\\ldots,k\\}$ returns a loss defined by\n",
"Implement the softmax (a.k.a. cross-entropy) loss as defined in `softmax_loss()` function in `src/simple_ml.py`. Recall (hopefully this is review, but we'll also cover it in the second lecture of weeek 1), that for a multi-class output that can take on values $y \\in \\{1,\\ldots,k\\}$, the softmax loss takes as input a vector of logits $z \\in \\mathbb{R}^k$, the true class $y \\in \\{1,\\ldots,k\\}$ returns a loss defined by\n",
"\\begin{equation}\n",
"\\ell_{\\mathrm{softmax}}(z, y) = \\log\\sum_{i=1}^k \\exp z_i - z_y.\n",
"\\end{equation}\n",
@@ -369,7 +369,7 @@
"source": [
"## Question 4: Stochastic gradient descent for softmax regression\n",
"\n",
"In this question you will implement stochastic gradient descent (SGD) for (linear) softmax regression. In other words, as discussed in lecture on 9/1, we will consider a hypothesis function that makes $n$-dimensional inputs to $k$-dimensional logits via the function\n",
"In this question you will implement stochastic gradient descent (SGD) for (linear) softmax regression. In other words, as discussed in lecture 2 in week 1, we will consider a hypothesis function that makes $n$-dimensional inputs to $k$-dimensional logits via the function\n",
"\\begin{equation}\n",
"h(x) = \\Theta^T x\n",
"\\end{equation}\n",
@@ -494,7 +494,7 @@
"\\minimize_{W_1, W_2} \\;\\; \\ell_{\\mathrm{softmax}}(\\mathrm{ReLU}(X W_1) W_2, y).\n",
"\\end{equation}\n",
"\n",
"Using the chain rule, we can derive the backpropagation updates for this network (we'll briefly cover these in class, on 9/8, but also provide the final form here for ease of implementation). Specifically, let\n",
"Using the chain rule, we can derive the backpropagation updates for this network (we'll briefly cover these in the lecture 2 in week 2, but also provide the final form here for ease of implementation). Specifically, let\n",
"\\begin{equation}\n",
"\\begin{split}\n",
"Z_1 \\in \\mathbb{R}^{m \\times d} & = \\mathrm{ReLU}(X W_1) \\\\\n",
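The diff is truncated above before the remaining backpropagation quantities are defined. Purely for orientation, here is a hedged numpy sketch of one minibatch step for this two-layer network; the gradient expressions follow the standard chain-rule derivation for ReLU followed by the softmax loss and are my assumptions, not a claim about the exact form the notebook provides:

```python
import numpy as np

def two_layer_sgd_step_sketch(Xb, yb, W1, W2, lr=0.1):
    """One minibatch gradient step for the network ReLU(X W1) W2 under softmax loss,
    updating W1 and W2 in place."""
    m = Xb.shape[0]
    Z1 = np.maximum(Xb @ W1, 0)        # Z1 = ReLU(X W1)
    P = np.exp(Z1 @ W2)
    P /= P.sum(axis=1, keepdims=True)  # softmax over the output logits
    Iy = np.zeros_like(P)
    Iy[np.arange(m), yb] = 1           # one-hot true labels
    G2 = P - Iy                        # gradient of the loss w.r.t. the output logits
    G1 = (Z1 > 0) * (G2 @ W2.T)        # backpropagate through W2 and the ReLU
    W1 -= lr * (Xb.T @ G1) / m
    W2 -= lr * (Z1.T @ G2) / m
```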