forked from hadley/adv-r
# (PART) Metaprogramming {-}
# Introduction {#meta}
```{r setup, include = FALSE}
source("common.R")
```
> "Flexibility in syntax, if it does not lead to ambiguity, would seem a
> reasonable thing to ask of an interactive programming language."
>
> --- Kent Pitman
Compared to most modern programming languages, one of the most surprising things about R is its capability for metaprogramming: the ability of code to inspect and modify other code. Metaprogramming is particularly important for R because R is not just a programming language; it is also an environment for interactive data analysis. R's metaprogramming tools make packages like ggplot2 and dplyr possible: both bend the usual rules of the language in order to provide tools that ease interactive data exploration.

In particular, many data analysis packages allow you to use the names of variables in a data frame as if they were objects in the environment. This makes interactive exploration more fluid at the cost of introducing some minor ambiguity. For example, take the `subset()` function in base R. It allows you to pick rows from a data frame based on the values of their observations:
```{r}
data(diamonds, package = "ggplot2")
subset(diamonds, x == 0 & y == 0 & z == 0)
```
This is considerably shorter than the equivalent code using `[` and `$`:
```{r}
diamonds[diamonds$x == 0 & diamonds$y == 0 & diamonds$z == 0, ]
```
(Base R functions like `subset()` and `transform()` are what inspired the development of dplyr.)

Functions like `subset()` are often said to use __non-standard evaluation__, or NSE for short. \index{non-standard evaluation} That's because they evaluate one (or more) of their arguments in a way that differs from how they'd normally be evaluated. For example, if you take the second argument to `subset()` above and try to evaluate it directly, the code will not work, because `x`, `y`, and `z` are columns of `diamonds` rather than objects in the environment:
```{r, eval = FALSE}
x == 0 & y == 0 & z == 0
```
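To make the contrast concrete, here is a minimal base R sketch of the same expression being evaluated with a data frame supplying the variables. It is not how `subset()` is implemented. The data frame `df` and its values are made up for illustration, so the example runs without ggplot2.

```{r}
# A small stand-in for diamonds (hypothetical values, for illustration only)
df <- data.frame(x = c(0, 4.2, 0), y = c(0, 4.1, 3.9), z = c(0, 2.6, 0))

# quote() captures the expression without evaluating it
cond <- quote(x == 0 & y == 0 & z == 0)

# Evaluated where only base R is visible, the lookup of `x` fails
res <- try(eval(cond, new.env(parent = baseenv())), silent = TRUE)
failed <- inherits(res, "try-error")

# Evaluated with the data frame as the context, x, y and z are its columns
rows <- eval(cond, df)
df[rows, ]
```

The second argument of `eval()` can be a data frame or list as well as an environment, which is what lets column names stand in for variables here.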
NSE is so thoroughly woven through R that it's sometimes difficult to spot. For example, each line of code in the following block uses NSE in some way:
```{r}
data(diamonds, package = "ggplot2")
mean(diamonds$carat)
rm(diamonds)
```
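One way to spot the NSE in each line is to compare it with a form that takes the name as an ordinary string. The sketch below uses the built-in `mtcars` dataset and some throwaway objects (`df`, `a`, `b`) as stand-ins, so it runs without ggplot2.

```{r}
# data() accepts a bare name; the `list` argument takes the name as a string
data(list = "mtcars", package = "datasets")

# `$` treats its right-hand side as a name rather than a variable to look up;
# `[[` with a string is the standard-evaluation counterpart
df <- data.frame(carat = c(0.23, 0.21, 0.31))
same_mean <- identical(mean(df$carat), mean(df[["carat"]]))

# rm() captures bare names; its `list` argument takes strings instead
a <- 1
b <- 2
rm(a)
rm(list = "b")
```

In each pair the first form relies on the function capturing what you typed, while the second passes a regular value that could just as well be stored in a variable.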
As you might guess, defining these tools by what they are not (standard evaluation) is not particularly precise. Additionally, the expression of these ideas has grown organically in base R over the last twenty years, which sometimes makes it hard to see the big ideas. Rather than using functions from base R to illustrate the ideas, we'll use functions from the rlang package. This package was developed recently, so it makes the mapping between the big ideas and the code much clearer.

In this section of the book, you'll learn about the three big ideas that underpin NSE:

* In __Expressions__, [Expressions], you'll learn that R code defines a
hierarchical structure. You'll learn how to visualise the hierarchy for
arbitrary code, and how the rules of R's grammar convert linear sequences
of characters into a tree. You'll also learn how to use recursive functions
to iterate over the trees in order to extract useful information or modify
the code in some way.
* In __Quotation__, [Quotation], you'll learn how to use tools from the rlang
package to capture unevaluated function arguments. You'll also learn more
about the powerful tool of quasiquotation and how you can use it to
construct calls "by hand".
* In __Evaluation__, [Evaluation], you'll put all the pieces together in order
to understand how NSE works, and how to write your own functions that work
like `subset()`.
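
As a rough preview of how the three ideas fit together, the sketch below uses a few rlang functions (`expr()`, `enexpr()`, `!!`, and `eval_tidy()`). It assumes rlang is installed. The data frame `df`, the helper `capture()`, and the variable `threshold` are made up for illustration; the chapters themselves develop these tools in much more detail.

```{r}
library(rlang)

# Expressions: captured code is a tree that can be indexed like a list
e <- expr(x == 0 & y == 0)
e[[1]]  # the function at the root of the tree
e[[2]]  # the left-hand branch

# Quotation: enexpr() captures what the caller typed,
# and !! inserts a value into a quoted expression
capture <- function(arg) enexpr(arg)
captured <- capture(a + b)

threshold <- 0
e2 <- expr(x > !!threshold)

# Evaluation: eval_tidy() evaluates an expression with the columns
# of a data frame in scope
df <- data.frame(x = c(0, 1, 2), y = c(0, 0, 5))
res1 <- eval_tidy(e, data = df)
res2 <- eval_tidy(e2, data = df)
```
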
This part of the book concludes with [DSLs](#dsls), a chapter that pulls together all these threads and shows how you can use R to create two __domain specific languages__. You'll learn how to translate R code into HTML and SQL, ideas that underpin the shiny and dplyr packages.

Each chapter follows the same basic structure. You'll get the lay of the land in the introduction and see a motivating example. Then we'll discuss the big ideas using tools from the rlang package, and circle back to talk about how those ideas are expressed in base R. Finally, each chapter finishes with a case study that uses the ideas to solve a bigger problem.

If you're reading these chapters primarily to better understand tidy evaluation so you can better program with the tidyverse, I'd recommend just reading the first 2-3 sections of each chapter; skip the sections about base R and more advanced techniques.