forked from RussTedrake/underactuated
-
Notifications
You must be signed in to change notification settings - Fork 0
/
stochastic.html
372 lines (295 loc) · 18.4 KB
/
stochastic.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
<!DOCTYPE html>
<html>
<head>
<title>Underactuated Robotics</title>
<meta name="Underactuated Robotics" content="text/html; charset=utf-8;" />
<script type="text/javascript" src="htmlbook/book.js"></script>
<script src="htmlbook/mathjax-config.js" defer></script>
<script type="text/javascript" id="MathJax-script" defer
src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js">
</script>
<script>window.MathJax || document.write('<script type="text/javascript" src="htmlbook/MathJax/es5/tex-chtml.js" defer><\/script>')</script>
<link rel="stylesheet" href="htmlbook/highlight/styles/default.css">
<script src="htmlbook/highlight/highlight.pack.js"></script> <!-- http://highlightjs.readthedocs.io/en/latest/css-classes-reference.html#language-names-and-aliases -->
<script>hljs.initHighlightingOnLoad();</script>
<link rel="stylesheet" type="text/css" href="htmlbook/book.css">
</head>
<body onload="loadChapter('underactuated');">
<div data-type="titlepage">
<header>
<h1><a href="underactuated.html" style="text-decoration:none;">Underactuated Robotics</a></h1>
<p data-type="subtitle">Algorithms for Walking, Running, Swimming, Flying, and Manipulation</p>
<p style="font-size: 18px;"><a href="http://people.csail.mit.edu/russt/">Russ Tedrake</a></p>
<p style="font-size: 14px; text-align: right;">
© Russ Tedrake, 2020<br/>
<a href="tocite.html">How to cite these notes</a><br/>
</p>
</header>
</div>
<p><b>Note:</b> These are working notes used for <a
href="http://underactuated.csail.mit.edu/Spring2020/">a course being taught
at MIT</a>. They will be updated throughout the Spring 2020 semester. <a
href="https://www.youtube.com/channel/UChfUOAhz7ynELF-s_1LPpWg">Lecture videos are available on YouTube</a>.</p>
<table style="width:100%;"><tr style="width:100%">
<td style="width:33%;text-align:left;"><a class="previous_chapter" href=humanoids.html>Previous Chapter</a></td>
<td style="width:33%;text-align:center;"><a href=underactuated.html>Table of contents</a></td>
<td style="width:33%;text-align:right;"><a class="next_chapter" href=dp.html>Next Chapter</a></td>
</tr></table>
<!-- EVERYTHING ABOVE THIS LINE IS OVERWRITTEN BY THE INSTALL SCRIPT -->
<chapter style="counter-reset: chapter 5"><h1>Model Systems
with Stochasticity</h1>
<!-- TODO: Adopt more of the 6.437 notation; it's excellent and compatible:
https://stellar.mit.edu/S/course/6/sp19/6.437/courseMaterial/topics/topic5/readings/Preliminaries/Preliminaries.pdf
-->
<p>My goals for this chapter are to build intuition for the beautiful and rich
behavior of nonlinear dynamical system that are subjected to random
(noise/disturbance) inputs. So far we have focused primarily on systems
described by \[ \dot{\bx}(t) = f(\bx(t),\bu(t)) \quad \text{or} \quad \bx[n+1]
= f(\bx[n],\bu[n]). \] In this chapter, I would like to broaden the scope to
think about \[ \dot{\bx}(t) = f(\bx(t),\bu(t),\bw(t)) \quad \text{or} \quad
\bx[n+1] = f(\bx[n],\bu[n],\bw[n]), \] where this additional input $\bw$ is
the (vector) output of some random process. In other words, we can begin
thinking about stochastic systems by simply understanding the dynamics of our
existing ODEs subjected to an additional random input. </p>
<p> This form is extremely general as written. $\bw(t)$ can represent
time-varying random disturbances (e.g. gusts of wind), or even constant model
errors/uncertainty. One thing that we are not adding, yet, is measurement
uncertainty. There is a great deal of work on observability and state
estimation that study the question of how you can infer the true state of the
system given noise sensor readings. For this chapter we are assuming perfect
measurements of the full state, and are focused instead on the way that
"process noise" shapes the long-term dynamics of the system. </p>
<p> I will also stick primarily to discrete time dynamics for this chapter,
simply because it is easier to think about the output of a discrete-time
random process, $\bw[n]$, than a $\bw(t)$. But you should know that all of the
ideas work in continuous time, too. Also, most of our examples will take the
form of <i>additive noise</i>: \[ \bx[n+1] = f(\bx[n],\bu[n]) + \bw[n], \]
which is a particular useful and common specialization of our general form.
And this form doesn't give up much -- the disturbance on step $k$ can pass
through the nonlinear function $f$ on step $k+1$ giving rich results -- but is
often much easier to work with. </p>
<section><h1>The Master Equation</h1>
<p>Let's start by looking at some simple examples.</p>
<example><h1>A Bistable System + Noise</h1>
<p>Let's consider (a time-reversed version of) one of my favorite
one-dimensional systems: \[ \dot{x} = x - x^3. \] <todo>add plot</todo>
This deterministic system has stable fixed points at $x^* = \{-1,1\}$ and
an unstable fixed point at $x^* = 0$. </p> <p> A reasonable discrete-time
approximation of these dynamics, now with additive noise, is \[ x[n+1] =
x[n] + h (x[n] - x[n]^3 + w[n]). \] When $w[n]=0$, this system has the
same fixed points and stability properties as the continuous time system.
But let's examine the system when $w[n]$ is instead the result of a
<i>zero-mean Gaussian white noise process</i>, defined by: \begin{gather*}
\forall n, E\left[ w[n] \right] = 0,\\ E\left[ w[i]w[j] \right] =
\begin{cases} \sigma^2, & \text{ if } i=j,\\ 0, & \text{ otherwise.}
\end{cases} \end{gather*} Here $\sigma$ is the standard deviation of the
Gaussian. </p> <p>When you simulate this system for small values of
$\sigma$, you will see trajectories move roughly towards one of the two
fixed points (for the deterministic system), but every step is modified by
the noise. In fact, even if the trajectory were to arrive exactly at what
was once a fixed point, it is almost surely going to move again on the
very next step. In fact, if we plot many runs of the simulation from
different initial conditions all on the same plot, we will see something
like the figure below.</p> <figure> <img width="80%"
src="figures/cubic_bistable_particles_time.svg"/>
<figcaption>Simulation of the bistable system with noise ($\sigma = 0.01$)
from many initial conditions.</figcaption> </figure>
<p>During any individual simulation, the state jumps around randomly for
all time, and could even transition from the vicinity of one fixed point
to the other fixed point. Another way to visualize this output is to
animate a histogram of the particles over time.</p>
<figure> <img width="80%"
src="figures/cubic_bistable_histogram.svg"/> <figcaption>Histogram of
the states of the bistable system with noise ($\sigma = 0.01$) after
simulating from random initial conditions until $t=30$. <br/><a
href="figures/cubic_bistable_histogram.swf">Click here to see the
animation</a></figcaption> </figure>
<p>You can run this demo for yourself:</p>
<pysrc>simple/stochastic_system_particles.py</pysrc>
</example>
<p>Let's take a moment to appreciate the implications of what we've just
observed. Every time that we've analyzed a system to date, we've asked
questions like "given x[0], what is the long-term behavior of the system,
$\lim_{n\rightarrow\infty} x[n]$?", but now $x[n]$ is a <i>random
variable</i>. The trajectories of this system do not converge, and the
system does not exhibit any form of stability that we've introduced so far.
</p>
<p>All is not lost. If you watch the animation closely, you might notice
the <i>distribution</i> of this random variable is actually very well
behaved. This is the key idea for this chapter.</p>
<p>Let us use $p_n(x)$ to denote the <a
href="https://en.wikipedia.org/wiki/Probability_density_function">
probability density function</a> over the random variable $x$ at time $n$.
Note that this density function must satisfy \[ \int_{-\infty}^\infty p_n(x)
dx = 1.\] It is actually possible to write the <i>dynamics of the
probability density</i> with the simple relation \[ p_{n+1}(x) =
\int_{-\infty}^\infty p(x|x') p_n(x') dx', \] where $p(x|x')$ encodes the
stochastic dynamics as a conditional distribution of the next state (here
$x$) as a function of the current state (here $x'$). Dynamical systems that
can be encoded in this way are known as <i>continuous-state Markov
Processes</i>, and the governing equation above is often referred to as the
"<a href="https://en.wikipedia.org/wiki/Master_equation">master
equation</a>" for the stochastic process. In fact this update is even
linear(!) ; a fact that can enable closed-form solutions to some impressive
long-term statistics, like mean time to failure or first passage
times<elib>Byl08a</elib>. Unfortunately, it is often difficult to perform
the integral in closed form, so we must often resort to discretation or
numerical analysis. </p>
<p>The slightly more general form of the master equation, which works for
multivariable distributions with state-domain ${\cal X}$, and systems with
control inputs $\bu$, is \[ p_{n+1}(\bx) = \int_{\cal X} p(\bx|\bx',\bu)
p_n(\bx') d\bx'. \] This is a (continuous-state) Markov Decision
Process.</p>
</section>
<section><h1>Stationary Distributions</h1>
<p>In the example above, the histogram is our numerical approximation of the
probability density. If you simulate it a few times, you will probably
believe that, although the individual trajectories of the system <i>do
not</i> converge, the probability distribution actually <i>does</i> converge
to what's known as a <i>stationary distribution</i> -- a fixed point of the
master equation. Instead of thinking about the dynamics of the
trajectories, we need to start thinking about the dynamics of the
distribution.</p>
<example><h1>The Linear Gaussian System</h1>
<p>Let's consider the one-dimensional linear system with additive noise:
\[ x[n+1] = a x[n] + w[n]. \] When $w[n]=0$, the system is stable (to the
origin) for $-1 < a < 1$. Let's make sure that we can understand the
dynamics of the master equation for the case when $w[n]$ is Gaussian white
noise with standard deviation $\sigma$. </p>
<p>First, recall that the probability density function of a Gaussian with
mean $\mu$ is given by \[ p(w) = \frac{1}{\sqrt{2 \pi \sigma^2}}
e^{-\frac{(w-\mu)^2}{2\sigma^2}}. \] When conditioned on $x[n]$, the
distribution given by the dynamics subjected to mean-zero Gaussian white
noise is simply another Gaussian, with the mean given by $ax[n]$: \[
p(x[n+1]|x[n]) = \frac{1}{\sqrt{2 \pi \sigma^2}}
e^{-\frac{(x[n+1]-ax[n])^2}{2\sigma^2}}, \] yielding the master equation
\[ p_{n+1}(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} \int_{-\infty}^\infty
e^{-\frac{(x-ax')^2}{2\sigma^2}} p_n(x') dx'. \] </p>
<p>Now here's the magic. Let's push a distribution, $p_n(x)$, which is
zero-mean, with standard deviation $\sigma_n$, through the master
equation: \begin{align*} p_{n+1}(x) =& \frac{1}{\sqrt{2 \pi
\sigma^2}}\frac{1}{\sqrt{2 \pi \sigma_n^2}} \int_{-\infty}^\infty
e^{-\frac{(x-ax')^2}{2\sigma^2}} e^{-\frac{(x')^2}{2\sigma_n^2}} dx',\\ =&
\frac{1}{\sqrt{2 \pi (\sigma^2+a^2 \sigma_n^2)}} e^{-\frac{x^2}{2(\sigma^2
+ a^2 \sigma_n^2)}}. \end{align*} The result is another mean-zero Gaussian
with $\sigma_{n+1}^2 = \sigma^2 + a^2 \sigma_n^2$. This result
generalizes to the multi-variate case, too, and might be familiar to you
e.g. from the process update of the <a
href="https://en.wikipedia.org/wiki/Kalman_filter">Kalman filter</a>. </p>
<p>Taking it a step further, we can see that a stationary distribution for
this system is given by a mean-zero Gaussian with \[ \sigma_*^2 =
\frac{\sigma^2}{1-a^2}. \] Note that this distribution is well defined
when $-1 < a < 1$ (only when the system is stable).</p>
</example>
<p>The stationary distribution of the linear Gaussian system reveals the
fundamental and general balance between two terms in the governing equations
of any stochastic dynamical system: the stability of the deterministic
system is bringing trajectories together (smaller $a$ means faster
convergence of the deterministic system and results in a more narrow
distribution), but the noise in the system is forcing trajectories apart
(larger $\sigma$ means larger noise and results in a wider distribution).
</p>
<p>Given how rich the dynamics can be for deterministic nonlinear systems,
you can probably imagine that the possible long-term dynamics of the
probability are also extremely rich. If we simply flip the signs in the
dynamics we examined above, we'll get our next example:</p>
<example><h1>The Cubic Example + Noise</h1>
<p>Now let's consider the discrete-time approximation of \[ \dot{x} = -x +
x^3, \] again with additive noise: \[ x[n+1] = x[n] + h (-x[n] + x[n]^3 +
w[n]). \] With $w[n]=0$, the system has only a single stable fixed point
(at the origin), whose deterministic region of attraction is given by $x
\in (-1,1)$. If we again simulate the system from a set of random initial
conditions and plot the histogram, we will see something like the figure
below. </p>
<figure> <img width="80%" src="figures/cubic_histogram.svg"/>
<figcaption>Histogram of the states of the bistable system with noise
($\sigma = 0.01$) after simulating from random initial conditions until
$t=30$. <br/><a href="figures/cubic_histogram.swf">Click here to see the
animation</a></figcaption> </figure>
<p>Be sure to watch the animation. Or better yet, run the simulation
for yourself by changing the sign of the derivative in the example
and re-running:</p>
<pysrc>simple/stochastic_system_particles.py</pysrc>
<p>Now try increasing the noise (e.g. pre-multiply the noise input,
$w$, in the dynamics by a scalar like 2).</p>
<p style="text-align:center"><a
href="figures/cubic_histogram_more_noise.swf">Click here to see the
animation</a></p>
<p>What is the stationary distribution for this system? In fact, there
isn't one. Although we initially collect probability density around the
stable fixed point, you should notice a slow leak -- on every step there
is some probability of transitioning past unstable fixed points and
getting driven by the unstable dynamics off towards infinity. If we run
the simulation long enough, there won't be any probability density left at
$x=0$.</p>
</example>
<example><h1>The Stochastic Van der Pol Oscillator</h1>
<p>One more example; this is a fun one. Let's think about one of the
simplest examples that we had for a system that demonstrates limit cycle
stability -- the Van der Pol oscillator -- but now we'll add Gaussian
white noise, $$\ddot{q} + (q^2 - 1)\dot{q} + q = w(t),$$ Here's the
question: if we start with a small set of initial conditions centered
around one point on the limit cycle, then what is the long-term behavior
of this distribution? </p>
<figure> <img width="80%" src="figures/vanderpol_particles.svg"/>
<figcaption>Randomly sampled initial conditions pictured with the stable
limit cycle of the Van der Pol oscillator.</figcaption> </figure>
<p>Since the long-term behavior of the deterministic system is periodic,
it would be very logical to think that the state distribution for this
stochastic system would fall into a stable periodic solution, too. But
think about it a little more, and then watch the animation (or run the
simulation yourself).</p>
<pysrc>van_der_pol/particles.py</pysrc>
<p style="text-align:center"><a
href="figures/vanderpol_particles.swf">Click here to see the animation
(first 20 seconds)</a></p>
<p style="text-align:center"><a
href="figures/vanderpol_particles_mixed.svg">Click here to see the
particles after 2000 seconds of simulation</a></p>
<p>The explanation is simple: the periodic solution of the system is only
<i>orbitally stable</i>; there is no stability along the limit cycle. So
any disturbances that push the particles along the limit cycle will go
unchecked. Eventually, the distribution will "mix" along the entire
cycle. Perhaps surprisingly, this system that has a limit cycle stability
when $w=0$ eventually reaches a stationary distribution (fixed point) in
the master equation.</p>
</example>
</section>
<section><h1>Extended Example: The Rimless Wheel on Rough
Terrain</h1>
<p>Coming soon. Read the paper <elib>Byl08f</elib> and/or watch the <a
href="https://youtu.be/TaaJgvjYBQc">video</a>.</p>
</section>
<section><h1>Noise models for real robots/systems.</h1>
<p>Sensor models. Beam model from probabilistic robotics. RGB-D
dropouts.</p>
<p>Perception subsystem. Output of a
perception system is not Gaussian noise, it's missed
detections/drop-outs...</p>
<p>Distributions over tasks/environments.</p>
</section>
<section><h1>Analysis (a preview)</h1>
<p>Coming soon. Uncertainty quantification (e.g., linear guassian
approximation; discretize then closed form solutions using the transition
matrix....). Monte-Carlo. Particle sim/filter. Rare event simulation </p>
<p>L2-gain with dissipation inequalities. Finite-time verification with
sums of squares.</p>
</section>
<section><h1>Control Design (a preview)</h1>
</section>
</chapter>
<!-- EVERYTHING BELOW THIS LINE IS OVERWRITTEN BY THE INSTALL SCRIPT -->
<table style="width:100%;"><tr style="width:100%">
<td style="width:33%;text-align:left;"><a class="previous_chapter" href=humanoids.html>Previous Chapter</a></td>
<td style="width:33%;text-align:center;"><a href=underactuated.html>Table of contents</a></td>
<td style="width:33%;text-align:right;"><a class="next_chapter" href=dp.html>Next Chapter</a></td>
</tr></table>
<div id="footer">
<hr>
<table style="width:100%;">
<tr><td><em>Underactuated Robotics</em></td><td align="right">© Russ
Tedrake, 2020</td></tr>
</table>
</div>
</body>
</html>