<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.6.1">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>8 Probability Theory, Part 1 – Resampling statistics</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
vertical-align: middle;
}
/* CSS for syntax highlighting */
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
}
pre.numberSource { margin-left: 3em; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
/* CSS for citations */
div.csl-bib-body { }
div.csl-entry {
clear: both;
margin-bottom: 0em;
}
.hanging-indent div.csl-entry {
margin-left:2em;
text-indent:-2em;
}
div.csl-left-margin {
min-width:2em;
float:left;
}
div.csl-right-inline {
margin-left:2em;
padding-left:1em;
}
div.csl-indent {
margin-left: 2em;
}</style>
<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./probability_theory_1b.html" rel="next">
<link href="./sampling_tools.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="site_libs/quarto-html/anchor.min.js"></script>
<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="site_libs/bootstrap/bootstrap.min.js"></script>
<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
<script id="quarto-search-options" type="application/json">{
"location": "sidebar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "start",
"type": "textbox",
"limit": 50,
"keyboard-shortcut": [
"f",
"/",
"s"
],
"show-item-context": false,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-text-placeholder": "",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit",
"search-label": "Search"
}
}</script>
<script type="text/javascript">
$(document).ready(function() {
$("table").addClass('lightable-paper lightable-striped lightable-hover')
});
</script>
<script src="https://cdnjs.cloudflare.com/polyfill/v3/polyfill.min.js?features=es6"></script>
<script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.js" type="text/javascript"></script>
<script type="text/javascript">
const typesetMath = (el) => {
if (window.MathJax) {
// MathJax Typeset
window.MathJax.typeset([el]);
} else if (window.katex) {
// KaTeX Render
var mathElements = el.getElementsByClassName("math");
var macros = [];
for (var i = 0; i < mathElements.length; i++) {
var texText = mathElements[i].firstChild;
if (mathElements[i].tagName == "SPAN") {
window.katex.render(texText.data, mathElements[i], {
displayMode: mathElements[i].classList.contains('display'),
throwOnError: false,
macros: macros,
fleqn: false
});
}
}
}
}
window.Quarto = {
typesetMath
};
</script>
<link rel="stylesheet" href="style.css">
<link rel="stylesheet" href="font-awesome.min.css">
</head>
<body class="nav-sidebar floating">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="quarto-secondary-nav">
<div class="container-fluid d-flex">
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" role="button" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<i class="bi bi-layout-text-sidebar-reverse"></i>
</button>
<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"><li class="breadcrumb-item"><a href="./probability_theory_1a.html"><span class="chapter-number">8</span> <span class="chapter-title">Probability Theory, Part 1</span></a></li></ol></nav>
<a class="flex-grow-1" role="navigation" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
</a>
<button type="button" class="btn quarto-search-button" aria-label="Search" onclick="window.quartoOpenSearch();">
<i class="bi bi-search"></i>
</button>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse collapse-horizontal quarto-sidebar-collapse-item sidebar-navigation floating overflow-auto">
<div class="pt-lg-2 mt-2 text-left sidebar-header">
<div class="sidebar-title mb-0 py-0">
<a href="./">Resampling statistics</a>
</div>
</div>
<div class="mt-2 flex-shrink-0 align-items-center">
<div class="sidebar-search">
<div id="quarto-search" class="" title="Search"></div>
</div>
</div>
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./index.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">R version</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./preface_third.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Preface to the third edition</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./preface_second.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Preface to the second edition</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./intro.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">1</span> <span class="chapter-title">Introduction</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./resampling_method.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">2</span> <span class="chapter-title">The resampling method</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./what_is_probability.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">3</span> <span class="chapter-title">What is probability?</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./about_technology.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">4</span> <span class="chapter-title">Introducing R and the Jupyter notebook</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./resampling_with_code.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">5</span> <span class="chapter-title">Resampling with code</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./resampling_with_code2.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">6</span> <span class="chapter-title">More resampling with code</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./sampling_tools.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">7</span> <span class="chapter-title">Tools for samples and sampling</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./probability_theory_1a.html" class="sidebar-item-text sidebar-link active">
<span class="menu-text"><span class="chapter-number">8</span> <span class="chapter-title">Probability Theory, Part 1</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./probability_theory_1b.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">9</span> <span class="chapter-title">Probability Theory Part I (continued)</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./more_sampling_tools.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">10</span> <span class="chapter-title">Two puzzles and more tools</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./probability_theory_2_compound.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">11</span> <span class="chapter-title">Probability Theory, Part 2: Compound Probability</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./probability_theory_3.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">12</span> <span class="chapter-title">Probability Theory, Part 3</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./probability_theory_4_finite.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span> <span class="chapter-title">Probability Theory, Part 4: Estimating Probabilities from Finite Universes</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./sampling_variability.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span> <span class="chapter-title">On Variability in Sampling</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./monte_carlo.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">15</span> <span class="chapter-title">The Procedures of Monte Carlo Simulation (and Resampling)</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./standard_scores.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">16</span> <span class="chapter-title">Ranks, Quantiles and Standard Scores</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./inference_ideas.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">17</span> <span class="chapter-title">The Basic Ideas in Statistical Inference</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./inference_intro.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">18</span> <span class="chapter-title">Introduction to Statistical Inference</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./point_estimation.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">19</span> <span class="chapter-title">Point Estimation</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./framing_questions.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">20</span> <span class="chapter-title">Framing Statistical Questions</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./testing_counts_1.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">21</span> <span class="chapter-title">Hypothesis-Testing with Counted Data, Part 1</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./significance.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">22</span> <span class="chapter-title">The Concept of Statistical Significance in Testing Hypotheses</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./testing_counts_2.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">23</span> <span class="chapter-title">The Statistics of Hypothesis-Testing with Counted Data, Part 2</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./testing_measured.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">24</span> <span class="chapter-title">The Statistics of Hypothesis-Testing With Measured Data</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./testing_procedures.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">25</span> <span class="chapter-title">General Procedures for Testing Hypotheses</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./confidence_1.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">26</span> <span class="chapter-title">Confidence Intervals, Part 1: Assessing the Accuracy of Samples</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./confidence_2.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">27</span> <span class="chapter-title">Confidence Intervals, Part 2: The Two Approaches to Estimating Confidence Intervals</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./reliability_average.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">28</span> <span class="chapter-title">Some Last Words About the Reliability of Sample Averages</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./correlation_causation.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">29</span> <span class="chapter-title">Correlation and Causation</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./how_big_sample.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">30</span> <span class="chapter-title">How Large a Sample?</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./bayes_simulation.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">31</span> <span class="chapter-title">Bayesian Analysis by Simulation</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./references.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">References</span></a>
</div>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true">
<span class="menu-text">Appendices</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-1" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./exercise_solutions.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">A</span> <span class="chapter-title">Exercise Solutions</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./technical_note.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">B</span> <span class="chapter-title">Technical Note to the Professional Reader</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./acknowlegements.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">C</span> <span class="chapter-title">Acknowledgements</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./code_topics.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">D</span> <span class="chapter-title">Code topics</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./errors_suggestions.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">E</span> <span class="chapter-title">Errors and suggestions</span></span></a>
</div>
</li>
</ul>
</li>
</ul>
</div>
</nav>
<div id="quarto-sidebar-glass" class="quarto-sidebar-collapse-item" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item"></div>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">Table of contents</h2>
<ul>
<li><a href="#introduction" id="toc-introduction" class="nav-link active" data-scroll-target="#introduction"><span class="header-section-number">8.1</span> Introduction</a></li>
<li><a href="#definitions" id="toc-definitions" class="nav-link" data-scroll-target="#definitions"><span class="header-section-number">8.2</span> Definitions</a></li>
<li><a href="#theoretical-and-historical-methods-of-estimation" id="toc-theoretical-and-historical-methods-of-estimation" class="nav-link" data-scroll-target="#theoretical-and-historical-methods-of-estimation"><span class="header-section-number">8.3</span> Theoretical and historical methods of estimation</a></li>
<li><a href="#samples-and-universes" id="toc-samples-and-universes" class="nav-link" data-scroll-target="#samples-and-universes"><span class="header-section-number">8.4</span> Samples and universes</a>
<ul class="collapse">
<li><a href="#the-concept-of-a-sample" id="toc-the-concept-of-a-sample" class="nav-link" data-scroll-target="#the-concept-of-a-sample"><span class="header-section-number">8.4.1</span> The concept of a sample</a></li>
</ul></li>
<li><a href="#the-concept-of-a-universe-or-population" id="toc-the-concept-of-a-universe-or-population" class="nav-link" data-scroll-target="#the-concept-of-a-universe-or-population"><span class="header-section-number">8.5</span> The concept of a universe or population</a></li>
<li><a href="#the-conventions-of-probability" id="toc-the-conventions-of-probability" class="nav-link" data-scroll-target="#the-conventions-of-probability"><span class="header-section-number">8.6</span> The conventions of probability</a></li>
<li><a href="#sec-addition-rule" id="toc-sec-addition-rule" class="nav-link" data-scroll-target="#sec-addition-rule"><span class="header-section-number">8.7</span> Mutually exclusive events — the addition rule</a></li>
<li><a href="#joint-probabilities" id="toc-joint-probabilities" class="nav-link" data-scroll-target="#joint-probabilities"><span class="header-section-number">8.8</span> Joint probabilities</a></li>
<li><a href="#sec-what-is-resampling" id="toc-sec-what-is-resampling" class="nav-link" data-scroll-target="#sec-what-is-resampling"><span class="header-section-number">8.9</span> The Monte Carlo simulation method (resampling)</a></li>
<li><a href="#sec-if-statements" id="toc-sec-if-statements" class="nav-link" data-scroll-target="#sec-if-statements"><span class="header-section-number">8.10</span> If statements in R</a></li>
<li><a href="#the-deductive-formulaic-method" id="toc-the-deductive-formulaic-method" class="nav-link" data-scroll-target="#the-deductive-formulaic-method"><span class="header-section-number">8.11</span> The deductive formulaic method</a></li>
<li><a href="#sec-multiplication-rule" id="toc-sec-multiplication-rule" class="nav-link" data-scroll-target="#sec-multiplication-rule"><span class="header-section-number">8.12</span> Multiplication rule</a></li>
<li><a href="#sec-cond-uncond" id="toc-sec-cond-uncond" class="nav-link" data-scroll-target="#sec-cond-uncond"><span class="header-section-number">8.13</span> Conditional and unconditional probabilities</a></li>
<li><a href="#sec-shuffling" id="toc-sec-shuffling" class="nav-link" data-scroll-target="#sec-shuffling"><span class="header-section-number">8.14</span> Shuffling with <span class="r"><code>sample</code></span></a></li>
<li><a href="#code-answers-to-the-cards-and-pennies-problem" id="toc-code-answers-to-the-cards-and-pennies-problem" class="nav-link" data-scroll-target="#code-answers-to-the-cards-and-pennies-problem"><span class="header-section-number">8.15</span> Code answers to the cards and pennies problem</a></li>
<li><a href="#the-commanders-again-plus-leaving-the-game-early" id="toc-the-commanders-again-plus-leaving-the-game-early" class="nav-link" data-scroll-target="#the-commanders-again-plus-leaving-the-game-early"><span class="header-section-number">8.16</span> The Commanders again, plus leaving the game early</a></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title"><span id="sec-prob-theory-one-a" class="quarto-section-identifier"><span class="chapter-number">8</span> <span class="chapter-title">Probability Theory, Part 1</span></span></h1>
</div>
<div class="quarto-title-meta">
</div>
</header>
<section id="introduction" class="level2" data-number="8.1">
<h2 data-number="8.1" class="anchored" data-anchor-id="introduction"><span class="header-section-number">8.1</span> Introduction</h2>
<p>Let’s assume we understand the nature of the system or mechanism that produces the uncertain events in which we are interested. That is, the probability of the relevant independent <em>simple</em> events is assumed to be known, the way we assume we know the probability of a single “6” with a given die. The task is to determine the probability of various sequences or combinations of the simple events — say, three “6’s” in a row with the die. These are the sorts of probability problems dealt with in this chapter.</p>
<!---
Define or rephrase "independent". This is discussed in 1b; maybe move the
discussion here, or duck it here and defer.
-->
<p>The resampling method (or call it simulation, or the Monte Carlo method, if you prefer) will be illustrated with classic examples. Typically, a single trial of the system is simulated with cards, dice, random numbers, or a computer program. Then trials are repeated again and again to find the proportion of trials in which the event of interest occurs; that proportion is our estimate of the probability we seek. We can make the estimate as accurate as we wish by increasing the number of trials. The key task in each situation is <em>designing an experiment that accurately simulates the system in which we are interested</em>.</p>
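<p>As a minimal sketch of this repeat-and-count idea, the R code below estimates the chance of three “6’s” in a row with a die. It assumes a fair six-sided die; the variable names, the number of trials, and the seed are illustrative choices rather than anything fixed by the text.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Assumed for illustration: a fair six-sided die and 10000 trials.
set.seed(1)  # Fix the random numbers so the run is repeatable.
n_trials &lt;- 10000
# Space to record, for each trial, whether all three throws were sixes.
all_sixes &lt;- logical(n_trials)
for (i in 1:n_trials) {
    # One trial: throw the die three times.
    throws &lt;- sample(1:6, size = 3, replace = TRUE)
    # TRUE if all three throws show a six, FALSE otherwise.
    all_sixes[i] &lt;- sum(throws == 6) == 3
}
# The proportion of trials with three sixes is the estimate.
p_three_sixes &lt;- mean(all_sixes)
p_three_sixes</code></pre></div>
<p>For comparison, the exact value for a fair die is 1/216, or about 0.0046. The estimate will differ a little from run to run, and tends to settle closer to that value as <code>n_trials</code> grows.</p>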
<p>This chapter begins the Monte Carlo simulation work that culminates in the resampling method in statistics proper. The chapter deals with problems in probability theory — that is, situations where one wants to estimate the probability of one or more particular events when the basic structure and parameters of the system are known. In later chapters we move on to inferential statistics, where similar simulation work is known as resampling.</p>
</section>
<section id="definitions" class="level2" data-number="8.2">
<h2 data-number="8.2" class="anchored" data-anchor-id="definitions"><span class="header-section-number">8.2</span> Definitions</h2>
<p>A few definitions first:</p>
<ul>
<li><em>Simple Event</em>: An event such as a single flip of a coin, or one draw of a single card. A simple event cannot be broken down into simpler events of a similar sort.</li>
<li><em>Simple Probability</em> (also called “primitive probability”): The probability that a simple event will occur; for example, that my favorite football team, the Washington Commanders, will win on Sunday.</li>
</ul>
<p>During a recent season, the “experts” said that the Commanders had a 60 percent chance of winning on Opening Day; that estimate is a simple probability. We can <em>model</em> that probability by putting into a bucket six green balls to stand for wins, and four red balls to stand for losses (or we could use 60 and 40 balls, or 600 and 400). For the outcome on any given day, we draw one ball from the bucket, and record a simulated win if the ball is green, a loss if the ball is red.</p>
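<p>The bucket can also be written down in code. The following is one possible sketch in R, not the only way to do it; the names <code>bucket</code>, <code>ball</code> and <code>is_win</code> are made up for this illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># The bucket: six green balls (wins) and four red balls (losses).
bucket &lt;- rep(c("green", "red"), times = c(6, 4))
# Draw one ball at random to simulate the outcome of one game.
ball &lt;- sample(bucket, size = 1)
# A green ball stands for a win; a red ball stands for a loss.
is_win &lt;- ball == "green"
ball
is_win</code></pre></div>
<p>Each run draws a fresh ball, so the result changes from run to run; over many runs, about 60 percent of the draws should be green.</p>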
<p>So far the bucket has served only as a physical representation of our thoughts. But as we shall see shortly, this representation can help us think clearly about the process of interest to us. It can also give us information that is not yet in our thoughts.</p>
<p>Estimating simple probabilities wisely depends largely upon gathering evidence well. It also helps to adjust one’s probability estimates skillfully to make them internally consistent. Estimating probabilities has much in common with estimating lengths, weights, skills, costs, and other subjects of measurement and judgment.</p>
<p>Some more definitions:</p>
<ul>
<li><em>Composite Event</em>: A composite event is the combination of two or more simple events. Examples include all heads in three throws of a single coin; all heads in one throw of three coins at once; Sunday being a nice day <em>and</em> the Commanders winning; and the birth of nine females out of the next ten calves born if the chance of a female in a single birth is 0.48.</li>
<li><em>Compound Probability</em>: The probability that a composite event will occur.</li>
</ul>
<p>The difficulty in estimating <em>simple</em> probabilities such as the chance of the Commanders winning on Sunday arises from our lack of understanding of the world around us. The difficulty of estimating <em>compound</em> probabilities such as the probability of it being a nice day Sunday <em>and</em> the Commanders winning is the weakness in our mathematical intuition interacting with our lack of understanding of the world around us. Our task in the study of probability and statistics is to overcome the weakness of our mathematical intuition by using a systematic process of simulation (or the devices of formulaic deductive theory).</p>
<p>Consider now a question about a compound probability: What are the chances of the Commanders winning their first <em>two</em> games if we think that <em>each</em> of those games can be modeled by our bucket containing six green and four red balls? If one drawing from the bucket represents one game, a second drawing should represent the second game (assuming we replace the first ball drawn in order to keep the chances of winning the same for the two games). If so, two drawings from the bucket should represent two games. And we can then estimate the compound probability we seek with a series of two-ball trial experiments.</p>
<p>More specifically, our procedure in this case — the prototype of all procedures in the resampling simulation approach to probability and statistics — is as follows:</p>
<ol type="1">
<li>Put six green (“Win”) and four red (“Lose”) balls in a bucket.</li>
<li>Draw a ball, record its color, and replace it (so that the probability of winning the second simulated game is the same as the first).</li>
<li>Draw another ball and record its color.</li>
<li>If both balls drawn were green record “Yes”; otherwise record “No.”</li>
<li>Repeat steps 2-4 a thousand times.</li>
<li>Compute the proportion of “Yes”s out of the total number of “Yes”s and “No”s; the result is our estimate of the probability we seek.</li>
</ol>
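<p>The six steps above can be sketched in R. This is a minimal sketch under stated assumptions: the names <code>bucket</code>, <code>n_trials</code>, <code>both_green</code> and <code>p_two_wins</code> are our own, and the call to <code>set.seed</code> is added only so that a run can be repeated.</p>

```r
# A minimal sketch of the six-step bucket procedure.
set.seed(1)  # assumption: fixed seed, only to make the run repeatable

# Step 1: six green ("Win") and four red ("Lose") balls.
bucket <- rep(c('green', 'red'), c(6, 4))

n_trials <- 1000
# One TRUE / FALSE result per trial; TRUE stands for "Yes".
both_green <- logical(n_trials)

# Step 5: repeat steps 2 to 4 a thousand times.
for (i in 1:n_trials) {
    # Steps 2 and 3: draw two balls, replacing the first before the second.
    two_balls <- sample(bucket, size=2, replace=TRUE)
    # Step 4: "Yes" (TRUE) only if both balls drawn were green.
    both_green[i] <- all(two_balls == 'green')
}

# Step 6: proportion of "Yes" trials among all trials.
p_two_wins <- sum(both_green) / n_trials
p_two_wins
```

<p>The estimate should come out near 0.6 × 0.6 = 0.36, and it will vary a little from run to run if you change or remove the seed.</p>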
<p>Much the same procedure could be used to estimate the probability of the Commanders winning (say) 3 of their next 4 games. We will return to this illustration again and we will see how it enables us to estimate many other sorts of probabilities.</p>
<ul>
<li><em>Experiment or Experimental Trial, or Trial, or Resampling Experiment</em>: A simulation experiment or trial is a randomly-generated composite event which has the same characteristics as the actual composite event in which we are interested (except that in inferential statistics the resampling experiment is generated with the “benchmark” or “null” universe rather than with the “alternative” universe). <!---
Explain the above better.
--></li>
<li><em>Parameter</em>: A numerical property of a universe. For example, the “true” mean (don’t worry about the meaning of “true”), and the range between largest and smallest members, are two of its parameters.</li>
</ul>
</section>
<section id="theoretical-and-historical-methods-of-estimation" class="level2" data-number="8.3">
<h2 data-number="8.3" class="anchored" data-anchor-id="theoretical-and-historical-methods-of-estimation"><span class="header-section-number">8.3</span> Theoretical and historical methods of estimation</h2>
<p>As introduced in <a href="what_is_probability.html#sec-probability-ways" class="quarto-xref"><span>Section 3.5</span></a>, there are two general ways to tackle any probability problem: <em>theoretical-deductive</em> and <em>empirical</em>, each of which has two sub-types. These concepts have complicated links with the concept of “frequency series” discussed earlier.</p>
<ul>
<li><p><em>Empirical Methods</em>. One empirical method is to look at <em>actual cases in nature</em> — for example, examine all (or a sample of) the families in Brazil that have four children and count the proportion that have three girls among them. (This is the most fundamental process in science and in information-getting generally. But in general we do not discuss it in this book and leave it to courses called “research methods.” I regard that as a mistake and a shame, but so be it.) In some cases, of course, we cannot get data in such fashion because it does not exist.</p>
<p>Another empirical method is to manipulate the simple elements in such fashion as to produce hypothetical experience with how the simple elements behave. This is the heart of the resampling method, as well as of physical simulations such as wind tunnels.</p></li>
<li><p><em>Theoretical Methods</em>. The most fundamental theoretical approach is to resort to first principles, working with the elements in their full deductive simplicity, and examining all possibilities. This is what we do when we use a tree diagram to calculate the probability of three girls in families of four children.</p></li>
</ul>
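<p>“Examining all possibilities” for the four-child families can be sketched in R by listing every possible family and counting. This sketch assumes that a girl and a boy are equally likely at each birth and that births are independent; neither assumption comes from data. The names <code>families</code>, <code>n_girls</code> and <code>p_three_girls</code> are our own.</p>

```r
# Assumption: girl and boy are equally likely, and births are independent,
# so every row of the table below is equally likely.
# List all 2 * 2 * 2 * 2 = 16 possible families of four children.
families <- expand.grid(child1=c('girl', 'boy'),
                        child2=c('girl', 'boy'),
                        child3=c('girl', 'boy'),
                        child4=c('girl', 'boy'),
                        stringsAsFactors=FALSE)

# Count the girls in each family (one family per row).
n_girls <- rowSums(families == 'girl')

# Proportion of the 16 families that have exactly three girls.
p_three_girls <- sum(n_girls == 3) / nrow(families)
p_three_girls
```

<p>Four of the 16 equally likely families have exactly three girls, so under these assumptions the probability is 4/16 = 0.25.</p>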
<!---
Check we have in fact introduced tree diagram.
-->
<p>The formulaic approach is a theoretical method that aims to avoid the inconvenience of resorting to first principles, and instead uses calculation shortcuts that have been worked out in the past.</p>
<p><em>What the Book Teaches</em>. This book teaches you the empirical method using hypothetical cases. Formulas can be misleading for most people in most situations, and should be used as a shortcut only when a person understands exactly which first principles are embodied in the formulas. But most of the time, students and practitioners resort to the formulaic approach without understanding the first principles that lie behind them — indeed, their own teachers often do not understand these first principles — and therefore they have almost no way to verify that the formula is right. Instead they use canned checklists of qualifying conditions.</p>
</section>
<section id="samples-and-universes" class="level2" data-number="8.4">
<h2 data-number="8.4" class="anchored" data-anchor-id="samples-and-universes"><span class="header-section-number">8.4</span> Samples and universes</h2>
<p>The terms “sample” and “universe” (or “population”)<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a> were used earlier without definition. But now these terms must be defined.</p>
<section id="the-concept-of-a-sample" class="level3" data-number="8.4.1">
<h3 data-number="8.4.1" class="anchored" data-anchor-id="the-concept-of-a-sample"><span class="header-section-number">8.4.1</span> The concept of a sample</h3>
<p>For our purposes, a “sample” is a collection of observations for which you obtain the data to be used in the problem. Almost any set of observations for which you have data constitutes a sample. (You might, or might not, choose to call a complete census a sample.)</p>
<!---
The idea of a census and a population and a sample above?
-->
</section>
</section>
<section id="the-concept-of-a-universe-or-population" class="level2" data-number="8.5">
<h2 data-number="8.5" class="anchored" data-anchor-id="the-concept-of-a-universe-or-population"><span class="header-section-number">8.5</span> The concept of a universe or population</h2>
<p>For every sample there must also be a universe “behind” it. But “universe” is harder to define, partly because it is often an <em>imaginary</em> concept. A universe is the collection of things or people <em>that you want to say that your sample was taken from</em>. A universe can be finite and well defined — “all live holders of the Congressional Medal of Honor,” “all presidents of major universities,” “all billion-dollar corporations in the United States.” Of course, these finite universes may not be easy to pin down; for instance, what is a “major university”? And these universes may contain some elements that are difficult to find; for instance, some Congressional Medal winners may have left the country, and there may not be adequate public records on some billion-dollar corporations.</p>
<p>Universes that are called “infinite” are harder to understand, and it is often difficult to decide which universe is appropriate for a given purpose. For example, if you are studying a sample of patients suffering from schizophrenia, what is the universe from which the sample comes? Depending on your purposes, the appropriate universe might be all patients with schizophrenia now alive, or it might be all patients who might <em>ever</em> live. The latter concept of the universe of patients with schizophrenia is <em>imaginary</em> because some of the universe does not exist. And it is <em>infinite</em> because it goes on forever.</p>
<p>Not everyone likes this definition of “universe.” Others prefer to think of a universe, not as the collection of people or things that you <em>want</em> to say your sample was taken from, but as the collection that the sample was <em>actually</em> taken from. This latter view equates the universe to the “sampling frame” (the actual list or set of elements you sample from) which is always finite and existent. The definition of universe offered here is simply the most practical, in our opinion.</p>
<!---
More here on hypothetical world / universe. Refer back to previous chapters?
-->
</section>
<section id="the-conventions-of-probability" class="level2" data-number="8.6">
<h2 data-number="8.6" class="anchored" data-anchor-id="the-conventions-of-probability"><span class="header-section-number">8.6</span> The conventions of probability</h2>
<p>Let’s review the basic conventions and rules used in the study of probability:</p>
<ol type="1">
<li>Probabilities are expressed as decimals between 0 and 1, corresponding to percentages between 0 and 100. The weather forecaster might say that the probability of rain tomorrow is 0.2 (a 20 percent chance), or 0.97.</li>
<li>The probabilities of all the possible alternative outcomes in a single “trial” must add to unity. If you are prepared to say that it must either rain or not rain, with no other outcome being possible and the two never occurring together (that is, if you consider the outcomes to be <em>mutually exclusive</em>, a term that we discuss below, and to cover every possibility), then one of those probabilities implies the other. That is, if you estimate that the probability of rain is 0.2, written <span class="math inline">\(P(\text{rain}) = 0.2\)</span>, that implies that you estimate that <span class="math inline">\(P(\text{no rain}) = 0.8\)</span>.</li>
</ol>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Writing probabilities
</div>
</div>
<div class="callout-body-container callout-body">
<p>We will now be writing some simple formulae using probability. Above we write the <em>probability of rain tomorrow</em> as <span class="math inline">\(P(\text{rain})\)</span>. This probability might be 0.2, and we could write this as:</p>
<p><span class="math display">\[
P(\text{rain}) = 0.2
\]</span></p>
<p>We can term “rain tomorrow” an <em>event</em> — the event may occur: <span class="math inline">\(\text{rain}\)</span>, or it may <em>not</em> occur: <span class="math inline">\(\text{no rain}\)</span>.</p>
<p>We often shorten the <em>name</em> of our event — here <span class="math inline">\(\text{rain}\)</span> — to a single letter, such as <span class="math inline">\(R\)</span>. So, in this case, we could write <span class="math inline">\(P(\text{rain}) = 0.2\)</span> as <span class="math inline">\(P(R) = 0.2\)</span> — meaning the same thing. We tend to prefer single letters — as in <span class="math inline">\(P(R)\)</span> — to longer names — as in <span class="math inline">\(P(\text{rain})\)</span>. This is because the single letters can be easier to read in these compact formulae.</p>
<p>Above we have written the probability of the “rain tomorrow” event <em>not</em> occurring as <span class="math inline">\(P(\text{no rain})\)</span>. Another way of referring to an event <em>not</em> occurring is to prefix the event name with a <em>caret</em> (^) character like this: <span class="math inline">\(\ \hat{} R\)</span>. So read <span class="math inline">\(P(\ \hat{} R)\)</span> as “the probability that it will not rain”; it is just another way of writing <span class="math inline">\(P(\text{no rain})\)</span>. We sometimes call <span class="math inline">\(\ \hat{} R\)</span> the <em>complement</em> of <span class="math inline">\(R\)</span>.</p>
<p>We use <span class="math inline">\(\text{and}\)</span> between two events to mean <em>both</em> events occur.</p>
<p>For example, say we call the event “Commanders win the game” <span class="math inline">\(W\)</span>. One example of a <em>composite event</em> (see above) would be the event <span class="math inline">\(W \text{ and } R\)</span>, meaning the event where the Commanders won the game <em>and</em> it rained.</p>
</div>
</div>
<!---
End of callout note.
-->
</section>
<section id="sec-addition-rule" class="level2" data-number="8.7">
<h2 data-number="8.7" class="anchored" data-anchor-id="sec-addition-rule"><span class="header-section-number">8.7</span> Mutually exclusive events — the addition rule</h2>
<p><strong>Definition:</strong> If there are just two events <span class="math inline">\(A\)</span> and <span class="math inline">\(B\)</span> and they are “mutually exclusive” or “disjoint,” each implies the absence of the other. Green and red coats are mutually exclusive for you if (but only if) you never wear more than one coat at a time.</p>
<p>To state this idea formally, if <span class="math inline">\(A\)</span> and <span class="math inline">\(B\)</span> are mutually exclusive, then:</p>
<p><span class="math display">\[
P(A \text{ and } B) = 0
\]</span></p>
<p>If <span class="math inline">\(A\)</span> is “wearing a green coat” and <span class="math inline">\(B\)</span> is “wearing a red coat” (and you never wear two coats at the same time), then the probability that you are wearing a green coat <em>and</em> a red coat is 0: <span class="math inline">\(P(A \text{ and } B) = 0\)</span>.</p>
<p>An outcome <span class="math inline">\(A\)</span> and its own absence (written <span class="math inline">\(\ \hat{} A\)</span>) are necessarily mutually exclusive. Because one or the other must occur, their two probabilities add to unity:</p>
<!---
This needs some serious fixing, by adding the "collectively exhaustive" condition.
Specifically, the addition rule, as usually put, is that, for mutually
exclusive events:
P(A or B) = P(A) + P(B)
But here we have this rule in the description:
P(A and B) = 0
However, we then go onto P(A or ^A) = 1, which is the extension of the collectively exhaustive condition.
What to do? Move? Delete? Rewrite?
-->
<p><span class="math display">\[
P(A) + P(\ \hat{} A) = 1
\]</span></p>
<p>The sales of your store in a given year cannot be both above and below $1 million. Therefore if <span class="math inline">\(P(\text{sales > \$1 million}) = 0.2\)</span>, <span class="math inline">\(P(\text{sales <=
\$1 million}) = 0.8\)</span>.</p>
<p>This “complements” rule is useful as a consistency check on your estimates of probabilities. If you say that the probability of rain is 0.2, then you should check that you think that the probability of no rain is 0.8; if not, reconsider both the estimates. The same for the probabilities of your team winning and losing its next game.</p>
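<p>The addition rule is more usually stated for an “or” event. As a sketch in the notation above (writing <span class="math inline">\(\text{or}\)</span> to mean that <em>at least one</em> of the two events occurs is our own assumption, by analogy with <span class="math inline">\(\text{and}\)</span>), if <span class="math inline">\(A\)</span> and <span class="math inline">\(B\)</span> are mutually exclusive, then:</p>
<p><span class="math display">\[
P(A \text{ or } B) = P(A) + P(B)
\]</span></p>
<p>Taking <span class="math inline">\(B\)</span> to be <span class="math inline">\(\ \hat{} A\)</span>, and noting that either <span class="math inline">\(A\)</span> or <span class="math inline">\(\ \hat{} A\)</span> must occur, so that <span class="math inline">\(P(A \text{ or } \ \hat{} A) = 1\)</span>, gives the complements rule. For the store example:</p>
<p><span class="math display">\[
P(\text{sales over \$1 million}) + P(\text{sales at most \$1 million}) = 0.2 + 0.8 = 1
\]</span></p>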
</section>
<section id="joint-probabilities" class="level2" data-number="8.8">
<h2 data-number="8.8" class="anchored" data-anchor-id="joint-probabilities"><span class="header-section-number">8.8</span> Joint probabilities</h2>
<p>Let’s return now to the Commanders. We said earlier that our best guess of the probability that the Commanders will win the first game is 0.6. Let’s complicate the matter a bit and say that the probability of the Commanders winning depends upon the weather; on a nice day we estimate a 0.65 chance of winning, on a nasty (rainy or snowy) day a chance of 0.55. It is obvious that we then want to know the chance of a nice day, and we estimate a probability of 0.7. Let’s now ask the probability that both will happen — <em>it will be a nice day and the Commanders will win</em>. Before getting on with the process of estimation itself, let’s tarry a moment to discuss the probability estimates. Where do we get the notion that the probability of a nice day next Sunday is 0.7? We might have done so by checking the records of the past 50 years, and finding 35 nice days on that date. If we assume that the weather has not changed over that period (an assumption that some might not think reasonable, and the wisdom of which must be the outcome of some non-objective judgment), our probability estimate of a nice day would then be 35/50 = 0.7.</p>
<p>Two points to notice here: 1) The source of this estimate is an objective “frequency series.” And 2) the data come to us as the records of 50 days, of which 35 were nice. We would do best to stick with exactly those numbers rather than convert them into a single number — 70 percent. Percentages have a way of being confusing. (When his point score goes up from 2 to 3, my racquetball partner is fond of saying that he has made a “fifty percent increase”; that’s just one of the confusions with percentages.) And converting to a percent loses information: We no longer know how many observations the percent is based upon, whereas 35/50 keeps that information.</p>
<p>Now, what about the estimate that the Commanders have a 0.65 chance of winning on a nice day? Where does that come from? Unlike the weather situation, there is no long series of stable data to provide that information about the probability of winning. Instead, we <em>construct</em> an estimate using whatever information or “hunch” we have. The information might include the Commanders’ record earlier in this season, injuries that have occurred, what the “experts” in the newspapers say, the gambling odds, and so on. The result certainly is not “objective,” or the result of a stable frequency series. But we treat the 0.65 probability in quite the same way as we treat the 0.7 estimate of a nice day. In the case of winning, however, we produce an estimate expressed directly as a single proportion, not as a count of observations.</p>
<p>If we are shaky about the estimate of winning (as indeed we ought to be, because so much judgment and guesswork inevitably goes into it) we might proceed as follows: Take hold of a bucket and two bags of balls, green and red. Put into the bucket some number of green balls, say 10. Now add red balls until the <em>ratio</em> of green to red balls expresses your judgment of the ratio of expected wins to losses on a nice day, adding or subtracting green balls as necessary to get the ratio you want. If you end up with 13 green and 7 red balls, then you are “modeling” a probability of 0.65, as stated above. If you end up with a different ratio of balls, then you have learned from this experiment with your own mind processes that you think that the probability of a win on a nice day is something other than 0.65.</p>
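<p>To see which probability a given mix of balls models, we can build the bucket in R and calculate the proportion of green balls. This is a small sketch using the 13 green and 7 red balls from the example; the names <code>nice_day_bucket</code> and <code>p_win_nice</code> are our own.</p>

```r
# 13 green ("win") and 7 red ("lose") balls, as in the example.
nice_day_bucket <- rep(c('green', 'red'), c(13, 7))

# The proportion of green balls is the probability the bucket models.
p_win_nice <- sum(nice_day_bucket == 'green') / length(nice_day_bucket)
p_win_nice
```

<p>Here 13 of the 20 balls are green, so the bucket models a probability of 13/20 = 0.65. Changing the two counts lets you check any other mix of balls in the same way.</p>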
<p>Don’t put away the bucket. We will be using it again shortly. And keep in mind how we have just been using it, because our use later will be somewhat different though directly related.</p>
<p>One good way to begin the process of producing a compound estimate is by portraying the available data in a “tree diagram” like <a href="#fig-commanders-tree" class="quarto-xref">Figure <span>8.1</span></a>. The tree diagram shows the possible events in the order in which they might occur. A tree diagram is extremely valuable whether you go on to use simulation or the formulaic method.</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div id="fig-commanders-tree" class="quarto-float quarto-figure quarto-figure-center anchored" data-fig-align="center">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-commanders-tree-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="diagrams/commanders_tree.svg" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:70.0%">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-commanders-tree-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure 8.1: Tree diagram
</figcaption>
</figure>
</div>
</div>
</div>
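<p>As a worked sketch of the formulaic route through the tree, assume each branch carries the estimates given above: 0.7 for a nice day, 0.65 for a win on a nice day, and 0.55 for a win on a nasty day. Multiplying the probabilities along a path gives the probability of reaching the end of that path:</p>
<p><span class="math display">\[
\begin{gathered}
P(\text{nice and win}) = 0.7 \times 0.65 = 0.455 \\
P(\text{nice and lose}) = 0.7 \times 0.35 = 0.245 \\
P(\text{nasty and win}) = 0.3 \times 0.55 = 0.165 \\
P(\text{nasty and lose}) = 0.3 \times 0.45 = 0.135
\end{gathered}
\]</span></p>
<p>The four paths are mutually exclusive and cover every possibility, so their probabilities add to 1, which is a useful check on the arithmetic.</p>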
</section>
<section id="sec-what-is-resampling" class="level2" data-number="8.9">
<h2 data-number="8.9" class="anchored" data-anchor-id="sec-what-is-resampling"><span class="header-section-number">8.9</span> The Monte Carlo simulation method (resampling)</h2>
<p>The steps we follow to simulate an answer to the compound probability question are as follows:</p>
<ol type="1">
<li>Put seven blue balls (for “nice day”) and three yellow balls (“not nice”) into a bucket labeled A.</li>
<li>Put 65 green balls (for “win”) and 35 red balls (“lose”) into a bucket labeled B. This bucket represents the chance that the Commanders will win when it is a nice day.</li>
<li>Draw one ball from bucket A. If it is blue, carry on to the next step; otherwise record “no” and stop.</li>
<li>If you have drawn a blue ball from bucket A, now draw a ball from bucket B, and if it is green, record “yes” on a score sheet; otherwise write “no.”</li>
<li>Repeat steps 3-4 perhaps 10000 times.</li>
<li>Count the number of “yes” trials.</li>
<li>Compute the probability you seek as (number of “yeses” / 10000). This is the same as (number of “yeses”) / (number of “yeses” + number of “noes”).</li>
</ol>
<p>Actually doing the above series of steps by hand is useful to build your intuition about probability and simulation methods. But the procedure can also be simulated with a computer. We will use R to do this in a moment.</p>
</section>
<section id="sec-if-statements" class="level2" data-number="8.10">
<h2 data-number="8.10" class="anchored" data-anchor-id="sec-if-statements"><span class="header-section-number">8.10</span> If statements in R</h2>
<p>Before we get to the simulation, we need another feature of R, called a <em>conditional</em> or <em>if</em> statement.</p>
<p>Here we have rewritten step 4 above, but using indentation to emphasize the idea:</p>
<pre><code>If you have drawn a blue ball from bucket A:
    Draw a ball from bucket B
    if the ball is green:
        record "yes"
    otherwise:
        record "no".</code></pre>
<p>Notice the structure. The first line is the <em>header</em> of the <code>if</code> statement. It has a <em>condition</em>; this is why <code>if</code> statements are often called <em>conditional</em> statements. The condition here is “you have drawn a blue ball from bucket A”. If this condition is met (it is True that you have drawn a blue ball from bucket A) <em>then</em> we go on to do the stuff that is indented. Otherwise we do not do any of the stuff that is indented.</p>
<p>The indented stuff above is the <em>body</em> of the <code>if</code> statement. It is the stuff we do <code>if</code> the <em>conditional</em> at the top is True.</p>
<p>Now let’s see how we would write that in R.</p>
<p>Let’s make bucket A. Remember, this is the <em>weather</em> bucket. It has seven blue balls (for the 70% of days that are nice) and three yellow balls (for the 30% of days that are not nice). See <a href="sampling_tools.html#sec-repeating" class="quarto-xref"><span>Section 7.6</span></a> for the <span class="r"><code>rep</code></span> way of repeating elements multiple times.</p>
<div id="nte-fine_win" class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Note 8.1: Notebook: Fine day and win
</div>
</div>
<div class="callout-body-container callout-body">
<div class="nb-links">
<p><a class="notebook-link" href="notebooks/fine_win.Rmd">Download notebook</a> <a class="interact-button" href="./interact/lab/index.html?path=fine_win.ipynb">Interact</a></p>
</div>
</div>
</div>
<div class="nb-start" name="fine_win" title="Fine day and win">
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="co"># blue means "nice day", yellow means "not nice".</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a>bucket_A <span class="ot"><-</span> <span class="fu">rep</span>(<span class="fu">c</span>(<span class="st">'blue'</span>, <span class="st">'yellow'</span>), <span class="fu">c</span>(<span class="dv">7</span>, <span class="dv">3</span>))</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>bucket_A</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code> [1] "blue" "blue" "blue" "blue" "blue" "blue" "blue" "yellow"
[9] "yellow" "yellow"</code></pre>
</div>
</div>
<p>Now let us draw a ball at random from bucket_A:</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>a_ball <span class="ot"><-</span> <span class="fu">sample</span>(bucket_A, <span class="at">size=</span><span class="dv">1</span>)</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>a_ball</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "blue"</code></pre>
</div>
</div>
<p>Now we run our first <code>if</code> statement. Running this code will display “The ball was blue” if the ball was blue, otherwise it will not display anything:</p>
<!---
End of Python block.
-->
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> (a_ball <span class="sc">==</span> <span class="st">'blue'</span>) {</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">message</span>(<span class="st">'The ball was blue'</span>)</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>The ball was blue</code></pre>
</div>
</div>
<div class="r">
<p>Notice that the header line has <code>if</code>, followed by an open parenthesis <code>(</code> introducing the <em>conditional expression</em> <code>a_ball == 'blue'</code>. A close parenthesis <code>)</code> follows to finish the conditional expression. Next there is an open curly brace <code>{</code> to signal the start of the body of the <code>if</code> statement. The <em>body</em> of the <code>if</code> statement is one or more lines of code, followed by the close curly brace <code>}</code>. Here there is only one line: <code>message('The ball was blue')</code>. R only runs the body of the <code>if</code> statement if the <em>condition</em> is <code>TRUE</code>.<a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a></p>
</div>
<!---
End of R block
-->
<p>To confirm we see “The ball was blue” if <code>a_ball</code> is <code>'blue'</code> and nothing otherwise, we can set <code>a_ball</code> and re-run the code:</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Set value of a_ball so we know what it is.</span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a>a_ball <span class="ot"><-</span> <span class="st">'blue'</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> (a_ball <span class="sc">==</span> <span class="st">'blue'</span>) {</span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a> <span class="co"># The conditional statement is True in this case, so the body does run.</span></span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">message</span>(<span class="st">'The ball was blue'</span>)</span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>The ball was blue</code></pre>
</div>
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a>a_ball <span class="ot"><-</span> <span class="st">'yellow'</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb12"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> (a_ball <span class="sc">==</span> <span class="st">'blue'</span>) {</span>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a> <span class="co"># The conditional statement is False, so the body does not run.</span></span>
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">message</span>(<span class="st">'The ball was blue'</span>)</span>
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<p>We can add an <code>else</code> clause to the <code>if</code> statement. Remember the <em>body</em> of the <code>if</code> statement runs if the <em>conditional expression</em> (here <code>a_ball == 'blue'</code>) is <code>TRUE</code>. The <code>else</code> clause runs if the conditional expression is <code>FALSE</code>. This may be clearer with an example:</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>a_ball <span class="ot"><-</span> <span class="st">'blue'</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> (a_ball <span class="sc">==</span> <span class="st">'blue'</span>) {</span>
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a> <span class="co"># The conditional expression is True in this case, so the body runs.</span></span>
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">message</span>(<span class="st">'The ball was blue'</span>)</span>
<span id="cb14-4"><a href="#cb14-4" aria-hidden="true" tabindex="-1"></a>} <span class="cf">else</span> {</span>
<span id="cb14-5"><a href="#cb14-5" aria-hidden="true" tabindex="-1"></a> <span class="co"># The conditional expression was True, so the else clause does not run.</span></span>
<span id="cb14-6"><a href="#cb14-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">message</span>(<span class="st">'The ball was not blue'</span>)</span>
<span id="cb14-7"><a href="#cb14-7" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>The ball was blue</code></pre>
</div>
</div>
<div class="r">
<p>Notice that the <code>else</code> clause of the <code>if</code> statement begins right after the closing curly brace <code>}</code> that ends the <code>if</code> body. The keyword <code>else</code> comes next, followed by the opening curly brace <code>{</code> that starts the body of the <code>else</code> clause. The body of the <code>else</code> clause only runs if the initial conditional expression is <em>not</em> <code>TRUE</code>.</p>
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb16"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a>a_ball <span class="ot"><-</span> <span class="st">'yellow'</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> (a_ball <span class="sc">==</span> <span class="st">'blue'</span>) {</span>
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a> <span class="co"># The conditional expression was False, so the body does not run.</span></span>
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a> <span class="fu">message</span>(<span class="st">'The ball was blue'</span>)</span>
<span id="cb17-4"><a href="#cb17-4" aria-hidden="true" tabindex="-1"></a>} <span class="cf">else</span> {</span>
<span id="cb17-5"><a href="#cb17-5" aria-hidden="true" tabindex="-1"></a> <span class="co"># but the else clause does run.</span></span>
<span id="cb17-6"><a href="#cb17-6" aria-hidden="true" tabindex="-1"></a> <span class="fu">message</span>(<span class="st">'The ball was not blue'</span>)</span>
<span id="cb17-7"><a href="#cb17-7" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>The ball was not blue</code></pre>
</div>
</div>
<p>With this machinery, we can now implement the full logic of step 4 above:</p>
<pre><code>If you have drawn a blue ball from bucket A:
Draw a ball from bucket B
if the ball is green:
record "yes"
otherwise:
record "no".</code></pre>
<p>Here is bucket B. Remember green means “win” (65% of the time) and red means “lose” (35% of the time). We could call this the “Commanders win when it is a nice day” bucket:</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>bucket_B <span class="ot"><-</span> <span class="fu">rep</span>(<span class="fu">c</span>(<span class="st">'green'</span>, <span class="st">'red'</span>), <span class="fu">c</span>(<span class="dv">65</span>, <span class="dv">35</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<p>The first code cell below gives the full logic for step 4, for a single trial. With that in place, we have everything we need to run many trials with the same logic, as the second cell does.</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="co"># By default, say we have no result.</span></span>
<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a>result <span class="ot">=</span> <span class="st">'No result'</span></span>
<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a>a_ball <span class="ot"><-</span> <span class="fu">sample</span>(bucket_A, <span class="at">size=</span><span class="dv">1</span>)</span>
<span id="cb21-4"><a href="#cb21-4" aria-hidden="true" tabindex="-1"></a><span class="co"># If you have drawn a blue ball from bucket A: (then run code between {})</span></span>
<span id="cb21-5"><a href="#cb21-5" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span> (a_ball <span class="sc">==</span> <span class="st">'blue'</span>) {</span>
<span id="cb21-6"><a href="#cb21-6" aria-hidden="true" tabindex="-1"></a> <span class="co"># Draw a ball at random from bucket B</span></span>
<span id="cb21-7"><a href="#cb21-7" aria-hidden="true" tabindex="-1"></a> b_ball <span class="ot"><-</span> <span class="fu">sample</span>(bucket_B, <span class="at">size=</span><span class="dv">1</span>)</span>
<span id="cb21-8"><a href="#cb21-8" aria-hidden="true" tabindex="-1"></a> <span class="co"># if the ball is green: (then run code between {})</span></span>
<span id="cb21-9"><a href="#cb21-9" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> (b_ball <span class="sc">==</span> <span class="st">'green'</span>) {</span>
<span id="cb21-10"><a href="#cb21-10" aria-hidden="true" tabindex="-1"></a> <span class="co"># record "yes"</span></span>
<span id="cb21-11"><a href="#cb21-11" aria-hidden="true" tabindex="-1"></a> result <span class="ot"><-</span> <span class="st">'yes'</span></span>
<span id="cb21-12"><a href="#cb21-12" aria-hidden="true" tabindex="-1"></a> <span class="co"># otherwise:</span></span>
<span id="cb21-13"><a href="#cb21-13" aria-hidden="true" tabindex="-1"></a> } <span class="cf">else</span> {</span>
<span id="cb21-14"><a href="#cb21-14" aria-hidden="true" tabindex="-1"></a> <span class="co"># record "no".</span></span>
<span id="cb21-15"><a href="#cb21-15" aria-hidden="true" tabindex="-1"></a> result <span class="ot"><-</span> <span class="st">'no'</span></span>
<span id="cb21-16"><a href="#cb21-16" aria-hidden="true" tabindex="-1"></a> }</span>
<span id="cb21-17"><a href="#cb21-17" aria-hidden="true" tabindex="-1"></a>}</span>
<span id="cb21-18"><a href="#cb21-18" aria-hidden="true" tabindex="-1"></a><span class="co"># Show what we got in this case.</span></span>
<span id="cb21-19"><a href="#cb21-19" aria-hidden="true" tabindex="-1"></a>result</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "yes"</code></pre>
</div>
</div>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb23"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="co"># The result of each trial.</span></span>
<span id="cb23-2"><a href="#cb23-2" aria-hidden="true" tabindex="-1"></a><span class="co"># To start with, say we have no result for all the trials.</span></span>
<span id="cb23-3"><a href="#cb23-3" aria-hidden="true" tabindex="-1"></a>z <span class="ot"><-</span> <span class="fu">rep</span>(<span class="st">'No result'</span>, <span class="dv">10000</span>)</span>
<span id="cb23-4"><a href="#cb23-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb23-5"><a href="#cb23-5" aria-hidden="true" tabindex="-1"></a><span class="co"># Repeat trial procedure 10000 times</span></span>
<span id="cb23-6"><a href="#cb23-6" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> (i <span class="cf">in</span> <span class="dv">1</span><span class="sc">:</span><span class="dv">10000</span>) {</span>
<span id="cb23-7"><a href="#cb23-7" aria-hidden="true" tabindex="-1"></a> <span class="co"># draw one "ball" for the weather, store in "a_ball"</span></span>
<span id="cb23-8"><a href="#cb23-8" aria-hidden="true" tabindex="-1"></a> <span class="co"># blue is "nice day", yellow is "not nice"</span></span>
<span id="cb23-9"><a href="#cb23-9" aria-hidden="true" tabindex="-1"></a> a_ball <span class="ot"><-</span> <span class="fu">sample</span>(bucket_A, <span class="at">size=</span><span class="dv">1</span>)</span>
<span id="cb23-10"><a href="#cb23-10" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> (a_ball <span class="sc">==</span> <span class="st">'blue'</span>) { <span class="co"># nice day</span></span>
<span id="cb23-11"><a href="#cb23-11" aria-hidden="true" tabindex="-1"></a> <span class="co"># if nice day, check on game outcome</span></span>
<span id="cb23-12"><a href="#cb23-12" aria-hidden="true" tabindex="-1"></a> <span class="co"># green is "win" (given nice day), red is "lose" (given nice day).</span></span>
<span id="cb23-13"><a href="#cb23-13" aria-hidden="true" tabindex="-1"></a> b_ball <span class="ot"><-</span> <span class="fu">sample</span>(bucket_B, <span class="at">size=</span><span class="dv">1</span>)</span>
<span id="cb23-14"><a href="#cb23-14" aria-hidden="true" tabindex="-1"></a> <span class="cf">if</span> (b_ball <span class="sc">==</span> <span class="st">'green'</span>) { <span class="co"># Commanders win</span></span>
<span id="cb23-15"><a href="#cb23-15" aria-hidden="true" tabindex="-1"></a> <span class="co"># Record result.</span></span>
<span id="cb23-16"><a href="#cb23-16" aria-hidden="true" tabindex="-1"></a> z[i] <span class="ot"><-</span> <span class="st">'yes'</span></span>
<span id="cb23-17"><a href="#cb23-17" aria-hidden="true" tabindex="-1"></a> } <span class="cf">else</span> {</span>
<span id="cb23-18"><a href="#cb23-18" aria-hidden="true" tabindex="-1"></a> z[i] <span class="ot"><-</span> <span class="st">'no'</span></span>
<span id="cb23-19"><a href="#cb23-19" aria-hidden="true" tabindex="-1"></a> }</span>
<span id="cb23-20"><a href="#cb23-20" aria-hidden="true" tabindex="-1"></a> }</span>
<span id="cb23-21"><a href="#cb23-21" aria-hidden="true" tabindex="-1"></a> <span class="co"># End of trial, go back to the beginning until done.</span></span>
<span id="cb23-22"><a href="#cb23-22" aria-hidden="true" tabindex="-1"></a>}</span>
<span id="cb23-23"><a href="#cb23-23" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb23-24"><a href="#cb23-24" aria-hidden="true" tabindex="-1"></a><span class="co"># Count of the number of times we got "yes".</span></span>
<span id="cb23-25"><a href="#cb23-25" aria-hidden="true" tabindex="-1"></a>k <span class="ot"><-</span> <span class="fu">sum</span>(z <span class="sc">==</span> <span class="st">'yes'</span>)</span>
<span id="cb23-26"><a href="#cb23-26" aria-hidden="true" tabindex="-1"></a><span class="co"># Show the proportion of *both* fine day *and* wins</span></span>
<span id="cb23-27"><a href="#cb23-27" aria-hidden="true" tabindex="-1"></a>kk <span class="ot"><-</span> k <span class="sc">/</span> <span class="dv">10000</span></span>
<span id="cb23-28"><a href="#cb23-28" aria-hidden="true" tabindex="-1"></a>kk</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 0.461</code></pre>
</div>
</div>
<p>The above procedure gives us the probability that it will be a nice day and the Commanders will win — about 46.1%.</p>
<div class="nb-end">
</div>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
End of notebook: Fine day and win
</div>
</div>
<div class="callout-body-container callout-body">
<p><code>fine_win</code> starts at <a href="#nte-fine_win" class="quarto-xref">Note <span>8.1</span></a>.</p>
</div>
</div>
<!---
End of notebook.
-->
<p>Let’s say that we think that the Commanders have a 0.55 (55%) chance of winning on a not-nice day. With the aid of a bucket with a different composition (one made by substituting 55 green and 45 red balls in Step 4), a similar procedure yields the chance that it will be a <em>nasty</em> day and the Commanders will win. With a similar substitution and procedure we could also estimate the probabilities that it will be a nasty day and the Commanders will lose, and a nice day and the Commanders will lose. The sum of these probabilities should come close to unity, because the sum includes all the possible outcomes. But it will not <em>exactly</em> equal unity because of what we call “sampling variation” or “sampling error.”</p>
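<p>As a rough sketch of the paragraph above, the code below estimates each of the four combinations with its own separate simulation and then adds the estimates. The bucket names <code>bucket_nice</code> and <code>bucket_nasty</code> and the helper <code>estimate_joint</code> are hypothetical names chosen for this illustration; the compositions (70/30, 65/35, 55/45) are assumed from the probabilities stated in the text.</p>

```r
# Assumed bucket compositions, taken from the probabilities in the text.
bucket_A <- rep(c('blue', 'yellow'), c(70, 30))     # nice day / nasty day
bucket_nice <- rep(c('green', 'red'), c(65, 35))    # win / lose on a nice day
bucket_nasty <- rep(c('green', 'red'), c(55, 45))   # win / lose on a nasty day

# Hypothetical helper: estimate the probability of one combination of
# weather ball and game ball, using its own set of trials.
estimate_joint <- function(weather, outcome, n_trials = 10000) {
    hits <- 0
    for (i in 1:n_trials) {
        a_ball <- sample(bucket_A, size=1)
        # Choose the game bucket that matches the weather.
        if (a_ball == 'blue') {
            b_ball <- sample(bucket_nice, size=1)
        } else {
            b_ball <- sample(bucket_nasty, size=1)
        }
        if (a_ball == weather && b_ball == outcome) {
            hits <- hits + 1
        }
    }
    hits / n_trials
}

# Four separate simulations, one for each combination.
p_nice_win <- estimate_joint('blue', 'green')
p_nice_lose <- estimate_joint('blue', 'red')
p_nasty_win <- estimate_joint('yellow', 'green')
p_nasty_lose <- estimate_joint('yellow', 'red')

# The sum should be close to 1, but usually not exactly 1.
total <- p_nice_win + p_nice_lose + p_nasty_win + p_nasty_lose
total
```

<p>Because each estimate comes from a different set of random trials, the total typically differs from 1 by a small amount; that difference is the sampling variation.</p>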
<p>Please notice that each trial of the procedure begins with the same numbers of balls in the buckets as the previous trial. That is, you must replace the balls you draw after each trial in order that the probabilities remain the same from trial to trial. Later we will discuss the general concept of replacement versus non-replacement more fully.</p>
</section>
<section id="the-deductive-formulaic-method" class="level2" data-number="8.11">
<h2 data-number="8.11" class="anchored" data-anchor-id="the-deductive-formulaic-method"><span class="header-section-number">8.11</span> The deductive formulaic method</h2>
<p>It also is possible to get an answer with formulaic methods to the question about a nice day and the Commanders winning. The following discussion of nice-day-Commanders-win handled by formula is a prototype of the formulaic deductive method for handling other problems.</p>
<p>Return now to the tree diagram (<a href="#fig-commanders-tree" class="quarto-xref">Figure <span>8.1</span></a>) above. We can read from the tree diagram that 70 percent of the time it will be nice, and of that 70 percent of the time, 65 percent of the games will be wins. That is, <span class="math inline">\(0.65 * 0.7 = 0.455\)</span> = the probability of a nice day and a win. That is the answer we seek. The method seems easy, but it also is easy to get confused and obtain the wrong answer.</p>
</section>
<section id="sec-multiplication-rule" class="level2" data-number="8.12">
<h2 data-number="8.12" class="anchored" data-anchor-id="sec-multiplication-rule"><span class="header-section-number">8.12</span> Multiplication rule</h2>
<p>We can generalize what we have just done. The foregoing formula exemplifies what is known as the “multiplication rule”:</p>
<p><span class="math display">\[
P(\text{nice day and win}) = P(\text{nice day}) * P(\text{winning | nice day})
\]</span></p>
<p>where the vertical line in <span class="math inline">\(P(\text{winning | nice day})\)</span> means “conditional upon” or “given that.” That is, the vertical line indicates a “conditional probability,” a concept we must consider in a minute.</p>
<p>The multiplication rule is a formula that produces the probability of the <em>combination (juncture) of two or more events</em>. More discussion of it will follow below.</p>
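<p>To make the rule concrete, here is a minimal sketch that applies it to all four combinations of weather and result. The variable names are made up for this illustration; the values 0.70, 0.65 and 0.55 are the probabilities assumed in the text.</p>

```r
# Probabilities assumed in the text.
p_nice <- 0.70              # P(nice day)
p_win_given_nice <- 0.65    # P(winning | nice day)
p_win_given_nasty <- 0.55   # P(winning | nasty day)

# Multiplication rule: P(A and B) = P(A) * P(B | A).
joint <- c(
    nice_win = p_nice * p_win_given_nice,
    nice_lose = p_nice * (1 - p_win_given_nice),
    nasty_win = (1 - p_nice) * p_win_given_nasty,
    nasty_lose = (1 - p_nice) * (1 - p_win_given_nasty)
)
# Show the four joint probabilities and their sum.
joint
sum(joint)
```

<p>The four values are 0.455, 0.245, 0.165 and 0.135. Unlike simulated estimates, these add up to 1 (apart from floating point rounding), because no sampling is involved.</p>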
</section>
<section id="sec-cond-uncond" class="level2" data-number="8.13">
<h2 data-number="8.13" class="anchored" data-anchor-id="sec-cond-uncond"><span class="header-section-number">8.13</span> Conditional and unconditional probabilities</h2>
<p>Two kinds of probability statements — <em>conditional</em> and <em>unconditional</em> — must now be distinguished.</p>
<p>An <em>unconditional</em> probability is the appropriate concept when many factors, all small relative to each other rather than one force having an overwhelming influence, affect the outcome.</p>
<p>A <em>conditional</em> probability is formally written <span class="math inline">\(P(\text{Commanders win
| rain}) = 0.65\)</span>, and it is read “The probability that the Commanders will win if (given that) it rains is 0.65.” It is the appropriate concept when there is one (or more) major event of interest in decision contexts.</p>
<p>Let’s use another football example to explain conditional and unconditional probabilities. In the year this was being written, the University of Maryland had an unpromising football team. Someone may nevertheless ask what chance the team had of winning the postseason game at the bowl to which only the best team in the University of Maryland’s league is sent. One may say that <em>if</em> by some miracle the University of Maryland does get to the bowl, its chance would be a bit less than 50-50, say 0.40. That is, the probability of its winning, <em>conditional</em> on getting to the bowl, is 0.40. But the chance of its getting to the bowl at all is very low, perhaps 0.01. If so, the unconditional probability of winning at the bowl is the probability of its getting there multiplied by the probability of winning <em>if</em> it gets there; that is, 0.01 x 0.40 = 0.004. (It would be even better to say that 0.004 is the probability of winning conditional only on having a team, there being a league, and so on, all of which seem almost sure things.) Every probability is conditional on many things: that war does not break out, that the sun continues to rise, and so on. But if all those unspecified conditions are very sure, and can be taken for granted, we talk of the probability as unconditional.</p>
<p>A conditional probability is a statement that the probability of an event is such-and-such <em>if</em> something else is so-and-so; it is the “if” that makes a probability statement conditional. True, in <em>some</em> sense all probability statements are conditional; for example, the probability of an even-numbered spade is 6/52 <em>if</em> the deck is a poker deck and not necessarily if it is a pinochle deck or Tarot deck. But we ignore such conditions for most purposes.</p>
<p>Most of the use of the concept of probability in the social sciences is conditional probability. All hypothesis-testing statistics (discussed starting in <a href="framing_questions.html" class="quarto-xref"><span>Chapter 20</span></a>) are conditional probabilities.</p>
<p>Here is the typical conditional-probability question used in social-science statistics: What is the probability of obtaining this sample S (by chance) <em>if</em> the sample were taken from universe A? For example, what is the probability of getting a sample of five children with I.Q.s over 100 <em>by chance</em> in a sample randomly chosen from the universe of children whose average I.Q. is 100?</p>
<p>One way to obtain such conditional-probability statements is by examination of the results generated by universes like the conditional universe. For example, assume that we are considering a universe of children where the average I.Q. is 100.</p>
<p>Write down “over 100” and “under 100” respectively on many slips of paper, put them into a hat, draw five slips several times, and see how often the first five slips drawn are all over 100. This is the resampling (Monte Carlo simulation) method of estimating probabilities.</p>
<p>Another way to obtain such conditional-probability statements is formulaic calculation. For example, if half the slips in the hat have numbers under 100 and half over 100, the probability of getting five in a row above 100 is 0.03125 — that is, <span class="math inline">\(0.5^5\)</span>, or 0.5 x 0.5 x 0.5 x 0.5 x 0.5, using the multiplication rule introduced above. But if you are not absolutely sure you know the proper mathematical formula, you are more likely to come up with a sound answer with the simulation method.</p>
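<p>The two approaches can be compared with a short sketch. It assumes a hat with equal numbers of “over” and “under” slips, and it assumes each slip is put back after it is drawn (<code>replace=TRUE</code>), so that every draw has the same 0.5 chance that the formula uses. The names <code>hat</code>, <code>all_over</code>, <code>p_sim</code> and <code>p_formula</code> are invented for this illustration.</p>

```r
# Assumed universe: half the slips say "over", half say "under".
hat <- rep(c('over', 'under'), c(50, 50))

n_trials <- 10000
# One TRUE / FALSE result per trial.
all_over <- logical(n_trials)

for (i in 1:n_trials) {
    # Draw five slips, replacing each slip after drawing it.
    slips <- sample(hat, size=5, replace=TRUE)
    # TRUE if every one of the five slips was "over".
    all_over[i] <- all(slips == 'over')
}

# Simulation estimate of P(five in a row over 100).
p_sim <- sum(all_over) / n_trials
# Formulaic answer from the multiplication rule.
p_formula <- 0.5 ^ 5

p_sim
p_formula
```

<p>The simulated proportion should land near the formulaic value of 0.03125, differing only by sampling variation.</p>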
<p>Let’s illustrate the concept of conditional probability with four cards — two aces and two 3’s (or two black and two red). What is the probability of an ace? Obviously, 0.5. If you first draw an ace, what is the probability of an ace now? That is, what is the probability of an ace <em>conditional on</em> having drawn one already? Obviously not 0.5.</p>
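<p>A small simulation can put numbers on this example. The sketch below assumes that calling <code>sample</code> without <code>replace=TRUE</code> draws without replacement, so the first two values of the result stand for the first and second card drawn. The names <code>cards</code>, <code>first</code> and <code>second</code> are chosen for this illustration only.</p>

```r
# The four cards: two aces and two 3's.
cards <- c('ace', 'ace', 'three', 'three')

n_trials <- 10000
first <- character(n_trials)
second <- character(n_trials)

for (i in 1:n_trials) {
    # Put the cards in random order (sampling without replacement).
    shuffled <- sample(cards)
    first[i] <- shuffled[1]
    second[i] <- shuffled[2]
}

# Unconditional probability that the first card is an ace.
p_first_ace <- sum(first == 'ace') / n_trials

# Conditional probability: look only at trials where the first card
# was an ace, and ask how often the second card was also an ace.
first_was_ace <- first == 'ace'
p_second_given_first <- sum(second[first_was_ace] == 'ace') / sum(first_was_ace)

p_first_ace
p_second_given_first
```

<p>The first proportion should be close to 0.5. The second should be close to 1/3, because once one ace is gone, only one of the three remaining cards is an ace.</p>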
<p>This change in the conditional probabilities is the basis of mathematician <a href="https://en.wikipedia.org/wiki/Edward_O._Thorp">Edward Thorp’s</a> famous system of card-counting to beat the casinos at blackjack (Twenty One).</p>
<p>Casinos can defeat card counting by using many decks at once, so that conditional probabilities change more slowly and are not very different from unconditional probabilities. Looking ahead, we will see that sampling with replacement, and sampling without replacement from a huge universe, are much the same in practice, so we can substitute one for the other at our convenience.</p>
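<p>The effect of using many decks can be sketched with the four-card example. Here a “deck” is taken to be the toy packet of two aces and two 3’s from above, which is an assumption made purely for illustration, not a real blackjack deck. The calculation gives the probability that the second card is an ace, given that the first was an ace, when several such packets are mixed together.</p>

```r
# Number of toy four-card "decks" mixed together (illustrative values).
n_decks <- c(1, 2, 8, 100)

# Each toy deck has 4 cards, of which 2 are aces.
n_cards <- 4 * n_decks
n_aces <- 2 * n_decks

# After drawing one ace, one fewer ace and one fewer card remain.
p_second_ace_given_first <- (n_aces - 1) / (n_cards - 1)
p_second_ace_given_first
```

<p>With one toy deck the conditional probability is 1/3; with 100 it is 199/399, which is very nearly the unconditional probability of 0.5.</p>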
<p>Let’s further illustrate the concept of conditional probability with a puzzle <span class="citation" data-cites="gardner2001colossal">(from <a href="references.html#ref-gardner2001colossal" role="doc-biblioref">Gardner 2001, 288</a>)</span>. “… shuffle a packet of four cards — two red, two black — and deal them face down in a row. Two cards are picked at random, say by placing a penny on each. What is the probability that those two cards are the same color?”</p>
<p><strong>1.</strong> Play the game with the cards 100 times, and estimate the probability sought.</p>
<p>OR</p>
<ol type="1">
<li>Put slips with the numbers “1,” “1,” “2,” and “2” in a hat, or in a vector named <code>N</code> on a computer.</li>
<li>Shuffle the slips of paper by shaking the hat or shuffling the vector (of which more below).</li>
<li>Take two slips of paper from the hat or from <code>N</code>, to get two numbers.</li>
<li>Call the first number you selected <code>A</code> and the second <code>B</code>.</li>
<li>Are <code>A</code> and <code>B</code> the same? If so, record “Yes” otherwise “No”.</li>
<li>Repeat (2-5) 10000 times, and count the proportion of “Yes” results. That proportion is our estimate of the probability we seek.</li>
</ol>
<p>Before we proceed to do this procedure in R, we need a command to <em>shuffle</em> a vector.</p>
</section>
<section id="sec-shuffling" class="level2" data-number="8.14">
<h2 data-number="8.14" class="anchored" data-anchor-id="sec-shuffling"><span class="header-section-number">8.14</span> Shuffling with <span class="r"><code>sample</code></span></h2>
<p>In the recipe above, the vector <code>N</code> has four values:</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb25"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a>N <span class="ot">=</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">1</span>, <span class="dv">2</span>, <span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<p>For the physical simulation, we specified that we would shuffle the slips of paper with these numbers, meaning that we would jumble them up into a random order. When we have done this, we will select two slips — say the first two — from the shuffled slips.</p>
<p>As we will be discussing more in various places, this shuffle-then-draw procedure is also called <em>resampling without replacement</em>. The <em>without replacement</em> idea refers to the fact that, after shuffling, we take a first virtual slip of paper from the shuffled vector, and then a second — but we do not replace the first slip of paper into the shuffled vector before drawing the second. For example, say I drew a “1” from <code>N</code> for the first value. If I am sampling <em>without replacement</em> then, when I draw the next value, the candidates I am choosing from are now “1”, “2” and “2”, because I have removed the “1” I got as the first value. If I had instead been sampling <em>with replacement</em>, then I would put back the “1” I had drawn, and would draw the second sample from the full set of “1”, “1”, “2”, “2”.</p>
<div class="r">
<p>In fact we can use R’s <code>sample</code> function to shuffle any vector. The <em>default</em> behavior of <code>sample</code> is to sample <em>without replacement</em>. Up until now we have always told R to change that default behavior, using the <code>replace=TRUE</code> argument to <code>sample</code>. <code>replace=TRUE</code> tells <code>sample</code> to sample <em>with replacement</em>. Now we want to sample <em>without replacement</em>, so we leave out <code>replace=TRUE</code> to let <code>sample</code> do its default sampling, without replacement. That is, when we do not specify <code>replace=</code>, R assumes <code>replace=FALSE</code>, meaning sampling without replacement.</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb26"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb26-1"><a href="#cb26-1" aria-hidden="true" tabindex="-1"></a><span class="co"># The vector N, shuffled into a random order.</span></span>
<span id="cb26-2"><a href="#cb26-2" aria-hidden="true" tabindex="-1"></a><span class="co"># Note that "sample" *by default*, samples without replacement.</span></span>
<span id="cb26-3"><a href="#cb26-3" aria-hidden="true" tabindex="-1"></a><span class="co"># When we ask for size=4, we are asking for a sample that is the same</span></span>
<span id="cb26-4"><a href="#cb26-4" aria-hidden="true" tabindex="-1"></a><span class="co"># size as the original vector, and so, this will be the original vector</span></span>
<span id="cb26-5"><a href="#cb26-5" aria-hidden="true" tabindex="-1"></a><span class="co"># with a random reordering.</span></span>
<span id="cb26-6"><a href="#cb26-6" aria-hidden="true" tabindex="-1"></a>shuffled <span class="ot"><-</span> <span class="fu">sample</span>(N, <span class="at">size=</span><span class="dv">4</span>)</span>
<span id="cb26-7"><a href="#cb26-7" aria-hidden="true" tabindex="-1"></a><span class="co"># The "slips" are now in random order.</span></span>
<span id="cb26-8"><a href="#cb26-8" aria-hidden="true" tabindex="-1"></a>shuffled</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 1 2 2 1</code></pre>
</div>
</div>
<p>And in fact, if you omit the <code>size=</code> argument to <code>sample</code>, it will assume you mean the size to be the same as the length of the input vector; in this case, it will assume <code>size=length(N)</code> and therefore <code>size=4</code>. So we can get the same effect of a reordered (shuffled) vector by omitting both <code>size=</code> and <code>replace=</code>:</p>
<div class="cell" data-layout-align="center">
<div class="sourceCode cell-code" id="cb28"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb28-1"><a href="#cb28-1" aria-hidden="true" tabindex="-1"></a><span class="co"># The vector N, shuffled into a random order (the same procedure as the chunk</span></span>
<span id="cb28-2"><a href="#cb28-2" aria-hidden="true" tabindex="-1"></a><span class="co"># above).</span></span>
<span id="cb28-3"><a href="#cb28-3" aria-hidden="true" tabindex="-1"></a>shuffled <span class="ot"><-</span> <span class="fu">sample</span>(N)</span>
<span id="cb28-4"><a href="#cb28-4" aria-hidden="true" tabindex="-1"></a><span class="co"># The "slips" are now in random order.</span></span>
<span id="cb28-5"><a href="#cb28-5" aria-hidden="true" tabindex="-1"></a>shuffled</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 2 1 1 2</code></pre>
</div>
</div>
</div>
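<p>One way to convince yourself that a shuffle only reorders the values is to compare the sorted versions of the original and shuffled vectors. The sketch below does that, and contrasts the shuffle with a sample drawn <em>with</em> replacement, where the counts of each value can change. The names <code>same_values</code> and <code>with_repl</code> are invented for this illustration.</p>

```r
N <- c(1, 1, 2, 2)

# Shuffle: sampling without replacement, same size as N.
shuffled <- sample(N)
# A shuffle keeps exactly the same values; only the order changes.
same_values <- all(sort(shuffled) == sort(N))
same_values

# Sampling with replacement can give different counts of 1s and 2s,
# for example three 1s and one 2.
with_repl <- sample(N, size=4, replace=TRUE)
table(with_repl)
```

<p>Here <code>same_values</code> is always <code>TRUE</code>, whereas the table for <code>with_repl</code> will vary from run to run and need not show two of each value.</p>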
<!---
End of Python block.
-->
<p>See <a href="probability_theory_2_compound.html#sec-shuffling-deck" class="quarto-xref"><span>Section 11.4</span></a> for some more discussion of shuffling and sampling without replacement.</p>
</section>
<section id="code-answers-to-the-cards-and-pennies-problem" class="level2" data-number="8.15">
<h2 data-number="8.15" class="anchored" data-anchor-id="code-answers-to-the-cards-and-pennies-problem"><span class="header-section-number">8.15</span> Code answers to the cards and pennies problem</h2>
<div id="nte-cards_pennies" class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Note 8.2: Notebook: Cards and pennies
</div>
</div>
<div class="callout-body-container callout-body">
<div class="nb-links">
<p><a class="notebook-link" href="notebooks/cards_pennies.Rmd">Download notebook</a> <a class="interact-button" href="./interact/lab/index.html?path=cards_pennies.ipynb">Interact</a></p>
</div>
</div>
</div>