-
Notifications
You must be signed in to change notification settings - Fork 1
/
quiz2.Rmd
159 lines (130 loc) · 4.25 KB
/
quiz2.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
---
title: "Quiz 2"
author: "Liam Damewood"
date: "September 7, 2014"
output: html_document
---
### Question 1
```{r, echo = F}
rm(list = ls())
```
Consider the following data with x as the predictor and y as as the outcome.
```{r}
x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)
```
Give a P-value for the two sided hypothesis test of whether $\beta_1$ from a linear regression model is 0 or not.
* 0.05296
* 2.325
* 0.025
* 0.391
```{r}
# Easy way
summary(lm(y ~ x))$coefficients[2,4]
```
### Question 2
Consider the previous problem, give the estimate of the residual standard deviation.
* 0.3552
* 0.223
* 0.4358
* 0.05296
```{r}
fit <- lm(y ~ x)
e <- resid(fit)
plot(y ~ x)
abline(fit)
plot(e ~ x)
abline(h = 0)
sqrt(sum(e*e)/(length(e)-2))
sd(e) * sqrt((length(e) - 1)/(length(e) - 2))
```
### Question 3
```{r, echo = F}
rm(list = ls())
```
In the `mtcars` data set, fit a linear regression model of weight (predictor) on mpg (outcome). Get a 95% confidence interval for the expected mpg at the average weight. What is the lower endpoint?
* -4.00
* 21.190
* -6.486
* 18.991
```{r}
data(mtcars)
# Easy way
fit <- lm(mpg ~ wt, data = mtcars)
coef <- summary(fit)$coefficients
coef[1,1] + -1 * qt(0.975, df = fit$df) * coef[1,2]
```
### Question 4
Refer to the previous question. Read the help file for mtcars. What is the weight coefficient interpreted as?
* The estimated expected change in mpg per 1 lb increase in weight.
* The estimated 1,000 lb change in weight per 1 mpg increase.
* The estimated expected change in mpg per 1,000 lb increase in weight.
* It can't be interpreted without further information
```{r}
?mtcars
```
### Question 5
Consider again the `mtcars` data set and a linear regression model with mpg as predicted by weight (1,000 lbs). A new car is coming weighing 3000 pounds. Construct a 95% prediction interval for its mpg. What is the upper endpoint?
* 21.25
* 27.57
* -5.77
* 14.93
```{r}
fit <- lm(mpg ~ wt, data = mtcars)
coef <- summary(fit)$coefficients
sigma = summary(fit)$sigma
mpg <- predict(fit, newdata = data.frame(wt = 3))
mpg + qt(0.975, df = fit$df) * sigma
```
### Question 6
Consider again the `mtcars` data set and a linear regression model with mpg as predicted by weight (in 1,000 lbs). A “short” ton is defined as 2,000 lbs. Construct a 95% confidence interval for the expected change in mpg per 1 short ton increase in weight. Give the lower endpoint.
* -6.486
* -9.000
* -12.973
* 4.2026
```{r}
fit <- lm(y ~ I(x/2 - mean(x/2)))
coef <- summary(fit)$coefficients
coef[2,1] + -1 * qt(0.975, df = fit$df) * coef[2,2]
```
### Question 7
If my $X$ from a linear regression is measured in centimeters and I convert it to meters what would happen to the slope coefficient?
* It would get multiplied by 100 <--
* It would get divided by 10
* It would get multiplied by 10
* It would get divided by 100
```{r}
lm(y ~ x)
lm(y ~ I(x/100))
```
### Question 8
I have an outcome, $Y$, and a predictor, $X$ and fit a linear regression model with $Y=\beta_0+\beta_1X+\epsilon$ to obtain $\hat\beta_0$ and $\hat\beta_1$. What would be the consequence to the subsequent slope and intercept if I were to refit the model with a new regressor, $X+c$ for some constant, $c$?
* The new slope would be $c\hat\beta_1$
* The new slope would be $\hat\beta_1+c$
* The new intercept would be $\hat\beta_0-c\beta_1$ <--
* The new intercept would be $\hat\beta_0+c\hat\beta_1$
```{r}
l = lm(y ~ x)
lm(y ~ I(x + 10))
```
### Question 9
Refer back to the mtcars data set with mpg as an outcome and weight (wt) as the predictor. About what is the ratio of the the sum of the squared errors, $\sum_{i=1}^n(Y_i-\hat Y_i)^2$ when comparing a model with just an intercept (denominator) to the model with the intercept and slope (numerator)?
* 0.25 <--
* 0.75
* 0.50
* 4.00
```{r}
modelA = mean(y)
modelB = lm(y ~ x)
sum((modelB$fitted.values - y)^2) / sum((modelA - y)^2)
```
### Question 10
Do the residuals always have to sum to 0 in linear regression?
* The residuals never sum to zero.
* The residuals must always sum to zero.
* If an intercept is included, the residuals most likely won't sum to zero.
* If an intercept is included, then they will sum to 0. <--
```{r}
modelC = lm(y ~ x - 1)
sum(modelC$fitted.values - y)
```