Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,383 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"metadata": { | ||
"collapsed": true, | ||
"scrolled": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import numpy as np\n", | ||
"import matplotlib.pyplot as plt\n", | ||
"import seaborn as sns" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 8, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Populating the interactive namespace from numpy and matplotlib\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"# Plot style\n", | ||
"sns.set()\n", | ||
"%pylab inline\n", | ||
"pylab.rcParams['figure.figsize'] = (4, 4)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"$$\n", | ||
"\\newcommand\\norm[1]{\\left\\lVert#1\\right\\rVert} \n", | ||
"\\DeclareMathOperator{\\Tr}{Tr}\n", | ||
"\\newcommand\\bs[1]{\\boldsymbol{#1}}\n", | ||
"$$" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Introduction\n", | ||
"\n", | ||
"This chapter is very light and should take about a minute to read, a welcome break after the last two chapters, which were quite long. We will see what the trace of a matrix is. It will be needed for the last chapter, on Principal Component Analysis (PCA)." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# 2.10 The Trace Operator" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<img src=\"images/trace-matrix.png\" width=\"200\" alt=\"Calculating the trace of a matrix\" title=\"Calculating the trace of a matrix\">\n", | ||
"<em>The trace of a matrix</em>\n", | ||
"\n", | ||
"\n", | ||
"The trace is the sum of all values on the main diagonal of a square matrix.\n", | ||
"\n", | ||
"$$\n", | ||
"\\bs{A}=\n", | ||
"\\begin{bmatrix}\n", | ||
" 2 & 9 & 8 \\\\\\\\\n", | ||
" 4 & 7 & 1 \\\\\\\\\n", | ||
" 8 & 2 & 5\n", | ||
"\\end{bmatrix}\n", | ||
"$$\n", | ||
"\n", | ||
"$$\n", | ||
"\\mathrm{Tr}(\\bs{A}) = 2 + 7 + 5 = 14\n", | ||
"$$\n", | ||
"\n", | ||
"NumPy provides the function `trace()` to calculate it:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 10, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"array([[2, 9, 8],\n", | ||
" [4, 7, 1],\n", | ||
" [8, 2, 5]])" | ||
] | ||
}, | ||
"execution_count": 10, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"A = np.array([[2, 9, 8], [4, 7, 1], [8, 2, 5]])\n", | ||
"A" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 11, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"14" | ||
] | ||
}, | ||
"execution_count": 11, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"A_tr = np.trace(A)\n", | ||
"A_tr" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Goodfellow et al. explain that the trace can be used to specify the Frobenius norm of a matrix (see [2.5](https://hadrienj.github.io/posts/Deep-Learning-Book-Series-2.5-Norms/)). The Frobenius norm is the equivalent of the $L^2$ norm for matrices. It is defined by:\n", | ||
"\n", | ||
"$$\n", | ||
"\\norm{\\bs{A}}_F=\\sqrt{\\sum_{i,j}A^2_{i,j}}\n", | ||
"$$\n", | ||
"\n", | ||
"That is, square every element, sum the squares, and take the square root of the result. This norm can also be calculated with:\n", | ||
"\n", | ||
"$$\n", | ||
"\\norm{\\bs{A}}_F=\\sqrt{\\Tr({\\bs{AA}^T})}\n", | ||
"$$\n", | ||
"\n", | ||
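"This works because the $i$-th diagonal entry of $\\bs{AA}^T$ is the sum of the squared elements of row $i$ of $\\bs{A}$. We can check this numerically. The norm can be computed directly with `np.linalg.norm()`, which returns the Frobenius norm by default for a 2-D array:" | ||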
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 12, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"17.549928774784245" | ||
] | ||
}, | ||
"execution_count": 12, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"np.linalg.norm(A)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The Frobenius norm of $\\bs{A}$ is 17.549928774784245.\n", | ||
"\n", | ||
"With the trace the result is identical:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 13, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"17.549928774784245" | ||
] | ||
}, | ||
"execution_count": 13, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"np.sqrt(np.trace(A.dot(A.T)))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Since transposing a matrix doesn't change its diagonal, the trace of a matrix is equal to the trace of its transpose:\n", | ||
"\n", | ||
"$$\n", | ||
"\\Tr(\\bs{A})=\\Tr(\\bs{A}^T)\n", | ||
"$$" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Trace of a product\n", | ||
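"\n", | ||
"The trace is invariant under cyclic permutations of a matrix product, as long as the shapes allow each product to be computed:\n", | ||
"\n", | ||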
"$$\n", | ||
"\\Tr(\\bs{ABC}) = \\Tr(\\bs{CAB}) = \\Tr(\\bs{BCA})\n", | ||
"$$\n", | ||
"\n", | ||
"\n", | ||
"### Example 1.\n", | ||
"\n", | ||
"Let's see an example of this property.\n", | ||
"\n", | ||
"$$\n", | ||
"\\bs{A}=\n", | ||
"\\begin{bmatrix}\n", | ||
" 4 & 12 \\\\\\\\\n", | ||
" 7 & 6\n", | ||
"\\end{bmatrix}\n", | ||
"$$\n", | ||
"\n", | ||
"$$\n", | ||
"\\bs{B}=\n", | ||
"\\begin{bmatrix}\n", | ||
" 1 & -3 \\\\\\\\\n", | ||
" 4 & 3\n", | ||
"\\end{bmatrix}\n", | ||
"$$\n", | ||
"\n", | ||
"$$\n", | ||
"\\bs{C}=\n", | ||
"\\begin{bmatrix}\n", | ||
" 6 & 6 \\\\\\\\\n", | ||
" 2 & 5\n", | ||
"\\end{bmatrix}\n", | ||
"$$" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 14, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"531" | ||
] | ||
}, | ||
"execution_count": 14, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"A = np.array([[4, 12], [7, 6]])\n", | ||
"B = np.array([[1, -3], [4, 3]])\n", | ||
"C = np.array([[6, 6], [2, 5]])\n", | ||
"\n", | ||
"np.trace(A.dot(B).dot(C))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 15, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"531" | ||
] | ||
}, | ||
"execution_count": 15, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"np.trace(C.dot(A).dot(B))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 16, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"531" | ||
] | ||
}, | ||
"execution_count": 16, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"np.trace(B.dot(C).dot(A))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"$$\n", | ||
"\\bs{ABC}=\n", | ||
"\\begin{bmatrix}\n", | ||
" 360 & 432 \\\\\\\\\n", | ||
" 180 & 171\n", | ||
"\\end{bmatrix}\n", | ||
"$$\n", | ||
"\n", | ||
"$$\n", | ||
"\\bs{CAB}=\n", | ||
"\\begin{bmatrix}\n", | ||
" 498 & 126 \\\\\\\\\n", | ||
" 259 & 33\n", | ||
"\\end{bmatrix}\n", | ||
"$$\n", | ||
"\n", | ||
"$$\n", | ||
"\\bs{BCA}=\n", | ||
"\\begin{bmatrix}\n", | ||
" -63 & -54 \\\\\\\\\n", | ||
" 393 & 594\n", | ||
"\\end{bmatrix}\n", | ||
"$$\n", | ||
"\n", | ||
"$$\n", | ||
"\\Tr(\\bs{ABC}) = \\Tr(\\bs{CAB}) = \\Tr(\\bs{BCA}) = 531\n", | ||
"$$" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# References\n", | ||
"\n", | ||
"[Trace (linear algebra) - Wikipedia](https://en.wikipedia.org/wiki/Trace_(linear_algebra))\n", | ||
"\n", | ||
"[NumPy trace function](https://docs.scipy.org/doc/numpy/reference/generated/numpy.trace.html)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.6.7" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
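
The notebook in this diff checks the trace identities one cell at a time. As a rough, self-contained sketch (not part of the commit), the same checks can be gathered in one script. It assumes only that NumPy is installed. The rectangular matrices `P`, `Q`, `R` and the random seed are hypothetical values chosen for illustration. They show that the cyclic property also holds for non-square factors, while a non-cyclic reordering generally changes the trace.

```python
import numpy as np

# 3x3 matrix from the notebook's first example.
A = np.array([[2, 9, 8], [4, 7, 1], [8, 2, 5]])

trace_A = np.trace(A)                          # sum of the main diagonal
frob_direct = np.linalg.norm(A)                # Frobenius norm (default for 2-D input)
frob_from_trace = np.sqrt(np.trace(A @ A.T))   # sqrt(Tr(A A^T))

# Transposing leaves the diagonal untouched, so the trace is unchanged.
trace_A_T = np.trace(A.T)

# 2x2 matrices from the notebook's "Example 1".
A2 = np.array([[4, 12], [7, 6]])
B = np.array([[1, -3], [4, 3]])
C = np.array([[6, 6], [2, 5]])

cyclic = [np.trace(A2 @ B @ C), np.trace(C @ A2 @ B), np.trace(B @ C @ A2)]
non_cyclic = np.trace(A2 @ C @ B)              # swapping B and C is not a cyclic shift

# Hypothetical rectangular factors: (2x3)(3x4)(4x2), so every cyclic product is defined.
rng = np.random.default_rng(0)
P = rng.standard_normal((2, 3))
Q = rng.standard_normal((3, 4))
R = rng.standard_normal((4, 2))
rect_traces = [np.trace(P @ Q @ R), np.trace(R @ P @ Q), np.trace(Q @ R @ P)]

print(trace_A, frob_direct, frob_from_trace)
print(cyclic, non_cyclic)
print(rect_traces)
```

Floating-point results such as the two norm computations and the rectangular traces are best compared with `np.isclose` rather than `==`, since the products are accumulated in a different order.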