---
date: "`r (lubridate::ymd('20241002')+lubridate::dweeks(7))`"
title: "Class 16 Instrumental Variables and Two-Stage Least Squares"
---
# Instrumental Variable
## Class Objectives
- Understand the requirements of a valid instrumental variable and how to find good instruments
- Build intuition for why instrumental variables solve endogeneity problems
- Apply the two-stage least squares (2SLS) method to estimate causal effects using instrumental variables
## Causal Inference from OLS
- With non-experimental secondary data, it is generally impossible to control for all confounding factors, so OLS estimates usually cannot be interpreted as causal effects.
- Can we still draw causal inferences from secondary data?
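
To make the problem concrete, here is a minimal simulated sketch in base R. The variable names (`u`, `x`, `y`) and all coefficient values are assumptions chosen for illustration; they do not come from any course dataset. The true effect of `x` on `y` is set to 2, and `u` plays the role of a confounder that we normally cannot observe.

```r
# Simulated data: every number here is an illustrative assumption
set.seed(123)
n <- 10000
u <- rnorm(n)                      # confounder, unobserved in real data
x <- u + rnorm(n)                  # x is partly driven by u
y <- 1 + 2 * x + 3 * u + rnorm(n)  # true causal effect of x on y is 2

ols_naive <- coef(lm(y ~ x))["x"]      # u omitted: slope absorbs part of u's effect
ols_full  <- coef(lm(y ~ x + u))["x"]  # u controlled: slope should be close to 2

print(c(naive = unname(ols_naive), with_u = unname(ols_full)))
```

The second regression is only possible here because the confounder was simulated. With real secondary data `u` is not available, which is why a different strategy is needed.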
## What is an Instrumental Variable
::: block
### Instrumental Variable
An instrumental variable $Z$ is a variable (or set of variables) that satisfies the following requirements:
1. $Z$ is exogenous, that is, uncorrelated with the error term $\epsilon$: $cov(Z,\epsilon) = 0$
2. $Z$ affects $Y$ only through $X$ and has no direct effect on $Y$
3. $Z$ affects $X$ to some extent, that is, $cov(Z,X) \neq 0$
:::
\small
- Point 1 is called the **exogeneity** requirement: the instrumental variable should be beyond the individual's control, so that it is uncorrelated with the individual's unobserved confounding factors.
    - Potential IVs: government policies; natural disasters; randomized experiments; birthdays; etc.
- Point 2 is called the **exclusion restriction**: the instrumental variable should affect $Y$ only through $X$, with no direct effect on $Y$.
- Point 3 is called the **relevance requirement**: although it is beyond the individual's control, the instrumental variable should still affect the individual's $X$, causing exogenous changes in $X$.
    - If the correlation between $Z$ and $X$ is too small, we have a [**weak IV** problem](https://www.econometrics-with-r.org/12.3-civ.html).
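
The three requirements can be illustrated with a small simulation in base R. This is a sketch under stated assumptions: the names `u`, `z`, `x`, `y` and all coefficients are hypothetical, and the instrument is valid by construction.

```r
# Simulated data in which z is a valid instrument by construction
set.seed(123)
n <- 10000
u <- rnorm(n)                       # unobserved confounder (part of the error term)
z <- rnorm(n)                       # instrument, drawn independently of u
x <- 0.8 * z + u + rnorm(n)         # relevance: z shifts x
y <- 1 + 2 * x + 3 * u + rnorm(n)   # exclusion: z does not appear in this equation

cov_zx <- cov(z, x)   # relevance: clearly non-zero
cov_zu <- cov(z, u)   # exogeneity: near zero

print(round(c(cov_zx = cov_zx, cov_zu = cov_zu), 3))
```

Only relevance can be checked this way in real data. Exogeneity and the exclusion restriction involve the unobserved error term, so they have to be argued from how the instrument arises rather than computed.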
## Graphical Illustration of IV
```{r}
#| echo: false
#| fig-align: 'center'
knitr::include_graphics("images/Week 8/InstrumentalVariable.png")
```
## A Classic Example of Instrumental Variable
**Return of Military Service to Lifetime Income**[^1]
[^1]: Angrist, Joshua D., Stacey H. Chen, and Jae Song. "Long-term consequences of Vietnam-era conscription: New estimates using social security data." *American Economic Review* 101, no. 3 (2011): 334-38.
$$
\text{Income} = \beta_0 + \beta_1 \, \text{MilitaryService} + \epsilon
$$
- OLS suffers from endogeneity problems. What are the potential endogeneity issues?
- A draft lottery based on randomly drawn birthdays determined which men were called up for military service.
## A Classic Example of Instrumental Variable
- The date of birth ($z$), or the zodiac sign derived from it, can serve as an **instrumental variable** for military service ($x$) in this case.
    - Relevance requirement: $z$ affects years of military service: $cov(z,x) \neq 0$
    - Exogeneity requirement: $z$ is randomly drawn and thus uncorrelated with any confounders: $cov(z,\epsilon) = 0$
    - Exclusion restriction: $z$ affects income ($Y$) only through military service ($x$) and has no direct effect on $Y$.
```{r}
#| echo: false
#| fig-align: 'center'
knitr::include_graphics("images/Week 8/zodiac.jpg")
```
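
With a yes/no instrument such as a lottery outcome, the IV logic reduces to a ratio of two differences in means, often called the Wald estimator. The sketch below uses simulated data only; it is not the data from the cited paper. The variable names, the 0/1 coding of service (a simplification of years of service), and all coefficients, including the assumed effect of -2, are hypothetical.

```r
# Simulated lottery-style example: all values are illustrative assumptions
set.seed(42)
n <- 50000
drafted <- rbinom(n, 1, 0.5)   # z: lottery outcome, random by design
u       <- rnorm(n)            # unobserved confounder
# x: serving is more likely if drafted, and also depends on the confounder
served  <- as.integer(1.0 * drafted - 0.5 * u + rnorm(n) > 0.5)
# y: assumed true effect of serving on income is -2
income  <- 10 - 2 * served + u + rnorm(n)

# Naive comparison of those who served with those who did not (confounded)
naive <- mean(income[served == 1]) - mean(income[served == 0])

# Wald estimator: effect of z on y divided by effect of z on x
reduced_form <- mean(income[drafted == 1]) - mean(income[drafted == 0])
first_stage  <- mean(served[drafted == 1]) - mean(served[drafted == 0])
wald <- reduced_form / first_stage

print(round(c(naive = naive, wald = wald), 2))
```

In this simulation the naive comparison overstates the income loss because people with low `u` are more likely to serve, while the Wald ratio should land close to the assumed effect of -2.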
## More Examples of IVs
::: {.content-visible when-format="beamer"}
**Can you come up with IV candidates for the following causal questions?**[^2]
- Number of restaurants on Uber Eats =\> Number of orders on Uber Eats
- Retail price =\> Sales
:::
[^2]: See the HTML version for answers.
::: {.content-visible when-format="html"}
**Can you come up with IV candidates for the following causal questions?**
- Number of restaurants on Uber Eats =\> Number of orders on Uber Eats
    - temporary closures of restaurants due to government inspections
- Retail price =\> Sales
    - wholesale price
    - costs of raw materials
    - cost of goods sold (COGS)
    - Hausman instruments: the prices of the same product in other markets
:::
# Two-Stage Least Squares
## Solving Endogeneity Using IV
- Given an OLS regression with an endogenous regressor,
$$
y_{i}=X_{i} \beta+\varepsilon_{i}, \quad \operatorname{cov}\left(X_{i}, \varepsilon_{i}\right) \neq 0
$$
- Find instrumental variables $Z_i$ that are correlated with $X_i$ but influence $y_i$ only through $X_i$, not directly
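
Why these conditions on $Z_i$ are enough can be sketched in one line for the simplest case of a single regressor and a single instrument (a simplifying assumption made here for illustration). Taking the covariance of both sides of the regression with $Z_i$ gives

$$
\operatorname{cov}\left(Z_{i}, y_{i}\right)=\operatorname{cov}\left(Z_{i}, X_{i}\right) \beta+\operatorname{cov}\left(Z_{i}, \varepsilon_{i}\right)=\operatorname{cov}\left(Z_{i}, X_{i}\right) \beta
\quad \Longrightarrow \quad
\beta=\frac{\operatorname{cov}\left(Z_{i}, y_{i}\right)}{\operatorname{cov}\left(Z_{i}, X_{i}\right)}
$$

The second equality uses exogeneity, $\operatorname{cov}\left(Z_{i}, \varepsilon_{i}\right)=0$, and the final division requires relevance, $\operatorname{cov}\left(Z_{i}, X_{i}\right) \neq 0$. When the denominator is close to zero the ratio becomes unstable, which is the weak IV problem.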
## Two-Stage Least Squares: Stage 1
1. Run the regression `X ~ Z` and obtain the fitted values $\hat{X}$. Because $\hat{X}$ is a function of $Z$ only, it should be uncorrelated with the error term $\varepsilon$ of the original regression.
    - $\hat{X}$ (the part of the variation in $X$ due to $Z$) is exogenous, because $Z$ is exogenous
    - All endogenous variation in $X$ is left in the first-stage error term $\nu_{i}$
$$
X_{i}=Z_{i}\eta+\nu_{i}
$$
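
A minimal base R sketch of the first stage on simulated data follows. The names `u`, `z`, `x` and the coefficients are assumptions for illustration, and `u` is only available because it is simulated.

```r
# Stage 1 on simulated data (illustrative assumptions throughout)
set.seed(123)
n <- 10000
u <- rnorm(n)                 # unobserved confounder
z <- rnorm(n)                 # instrument
x <- 0.8 * z + u + rnorm(n)   # endogenous regressor

first_stage <- lm(x ~ z)
x_hat  <- fitted(first_stage)   # part of x explained by z
resid1 <- resid(first_stage)    # leftover part of x, which still contains u

# The fitted values are (nearly) uncorrelated with u; the residuals are not
print(round(c(cor_xhat_u = cor(x_hat, u), cor_resid_u = cor(resid1, u)), 3))

# First-stage F statistic, a common diagnostic for instrument strength
f_stat <- summary(first_stage)$fstatistic["value"]
print(f_stat)
```

A widely used rule of thumb treats a first-stage F statistic below about 10 as a sign of a weak instrument.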
## Two-Stage Least Squares: Stage 2
2. Run the regression of $Y$ on $\hat{X}$: because $\hat{X}$ is uncorrelated with the error term, the second-stage coefficient $\beta$ can be given a causal interpretation, provided the instrument is valid.
$$
y_{i}=\hat{X}_{i} \beta+\varepsilon_{i}, \quad \operatorname{cov}\left(\hat{X}_{i}, \varepsilon_{i}\right) = 0
$$
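
Both stages can be run by hand with `lm()`. The sketch below uses simulated data with an assumed true effect of 2; the variable names and coefficients are hypothetical and chosen only for illustration.

```r
# Manual 2SLS on simulated data (illustrative assumptions throughout)
set.seed(123)
n <- 10000
u <- rnorm(n)                       # unobserved confounder
z <- rnorm(n)                       # instrument
x <- 0.8 * z + u + rnorm(n)         # endogenous regressor
y <- 1 + 2 * x + 3 * u + rnorm(n)   # true effect of x on y is 2

x_hat     <- fitted(lm(x ~ z))               # stage 1: keep only the part of x driven by z
beta_2sls <- coef(lm(y ~ x_hat))["x_hat"]    # stage 2: regress y on the fitted values
beta_ols  <- coef(lm(y ~ x))["x"]            # plain OLS for comparison (biased)
beta_cov  <- cov(z, y) / cov(z, x)           # covariance-ratio form of the IV estimate

print(round(c(ols = unname(beta_ols), tsls = unname(beta_2sls), ratio = beta_cov), 3))
```

With one instrument and one regressor, the second-stage slope coincides with the covariance ratio. Note that the standard errors reported by the manual second-stage `lm()` are not valid, because they are computed from residuals based on $\hat{X}$ rather than $X$. In practice a dedicated routine such as `ivreg()` from the AER package (for example `ivreg(y ~ x | z)`) handles this correctly.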
## After-Class Readings
- Next week, we are going to discuss a case study using IV and 2SLS. Please read the case study before the next class.
- (optional) [Econometrics with R: Instrumental Variables Regression](https://www.econometrics-with-r.org/12-ivr.html)