In this unit we're going to turn from the pure mathematical objects of Random Variables and Conditional Expectations to start getting our hands dirty with data.
There are two mandatory sections for this unit, and one optional section.
The first section introduces "random sampling": a technical definition of what we hope to achieve when we sample, several pieces of mathematical framework, and then a crucial theorem in statistics, the Weak Law of Large Numbers.
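To preview what the WLLN guarantees, here is a minimal simulation sketch; it is ours, not part of the course materials, and the die-rolling population, seed, and sample sizes are purely illustrative. As the sample size grows, the average of i.i.d. draws concentrates around the population mean.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
population_mean = 3.5  # mean of a fair six-sided die

for n in (10, 100, 10_000, 1_000_000):
    rolls = rng.integers(1, 7, size=n)  # n i.i.d. die rolls in {1, ..., 6}
    gap = abs(rolls.mean() - population_mean)
    print(f"n = {n:>9}: |sample mean - population mean| = {gap:.4f}")
```

Running this, the gap shrinks toward zero as `n` grows, which is exactly the convergence-in-probability statement the WLLN makes precise.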
The second section introduces what is probably the most central theorem in applied statistics, the Central Limit Theorem (CLT). We use the CLT and its guarantee that, under very general conditions, suitably centered and scaled sample averages converge in distribution to a Gaussian to characterize the certainty (or uncertainty) about the estimate produced by a sample estimator.
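As a hedged sketch of how this is used in practice (the exponential population and the specific numbers are our illustrative choices, not the course's), the CLT justifies a Gaussian-approximation confidence interval built from a single i.i.d. sample:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
sample = rng.exponential(scale=2.0, size=500)   # skewed population with mean 2

mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(sample.size)  # estimated standard error of the mean
lo, hi = mean - 1.96 * se, mean + 1.96 * se     # Gaussian-approximation 95% interval
print(f"estimate {mean:.3f}, approximate 95% CI ({lo:.3f}, {hi:.3f})")
```

Even though the population is skewed, the CLT tells us the sampling distribution of the mean is approximately Gaussian at this sample size, which is what licenses the `1.96` multiplier.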
The second section concludes with a discussion of the applied concept of a "plug-in estimator," a pragmatic recipe for estimating a population parameter from an i.i.d. sample.
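The plug-in idea is simply: apply to the sample the same functional that defines the population parameter. A minimal sketch (our example; the target parameter P(X > 3) and the exponential population are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
sample = rng.exponential(scale=2.0, size=1_000)

plug_in = np.mean(sample > 3.0)   # empirical proportion of the sample above 3
truth = np.exp(-3.0 / 2.0)        # exact P(X > 3) for Exponential(scale=2)
print(f"plug-in estimate {plug_in:.3f} vs true value {truth:.3f}")
```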
The optional third section discusses the asymptotic properties of the estimators presented in the first two sections. The general takeaway is that with more (ideally infinite) data, estimators either produce estimates with stronger guarantees or produce the same guarantees under weaker assumptions about the data.
The third section also presents a pragmatic method for estimating the heretofore unknowable joint pdf -- a flexible kernel method whose estimate converges to the true joint pdf as the sample size grows.
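For a feel of the kernel idea, here is a hedged sketch of a Gaussian kernel density estimator; we show the univariate case for brevity (the section's estimator targets a joint pdf, but the principle is the same), and the bandwidth, seed, and grid are illustrative choices of ours:

```python
import numpy as np

def gaussian_kde(sample, grid, bandwidth):
    # Average one Gaussian bump per observation, evaluated at each grid point.
    z = (grid[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))

rng = np.random.default_rng(seed=3)
sample = rng.normal(size=2_000)                 # true pdf is the standard normal
grid = np.linspace(-3, 3, 7)

estimate = gaussian_kde(sample, grid, bandwidth=0.3)
truth = np.exp(-0.5 * grid**2) / np.sqrt(2 * np.pi)
print(np.round(estimate - truth, 3))            # small pointwise errors; they shrink with n
```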
We are aware that this is another big week of study and that we're asking a lot of you. But, be encouraged by this: *at the conclusion of this week, you will have learned and begun to apply some of the most important pieces of statistics. And, you will have built this understanding in a way that lets you expand into every applied data science discipline.*