---
title: "Advanced Data Visualization with ggplot2"
---
```{r init, include=F}
library(knitr)
opts_chunk$set(message=FALSE, warning=FALSE, eval=FALSE, echo=TRUE, fig.keep="none", cache=TRUE)
options(digits=3)
options(max.print=200)
.ex <- 1 # Track ex numbers w/ hidden var. Increment each ex: `r .ex``r .ex=.ex+1`
library(ggplot2)
theme_set(theme_bw(base_size=16) + theme(strip.background = element_blank()))
```
This section will cover fundamental concepts for creating effective data visualization and will introduce tools and techniques for visualizing large, high-dimensional data using R. We will review fundamental concepts for visually displaying quantitative information, such as using series of small multiples, avoiding "chart-junk," and maximizing the data-ink ratio. We will cover the grammar of graphics (geoms, aesthetics, stats, and faceting), and using the ggplot2 package to create plots layer-by-layer.
This lesson assumes a [basic familiarity with R](r-basics.html), [data frames](r-dataframes.html), and [manipulating data with dplyr and `%>%`](r-dplyr-yeast.html).
## Review
### Gapminder data
We're going to work with a different dataset for this section. It's a [cleaned-up excerpt](https://github.com/jennybc/gapminder) from the [Gapminder data](http://www.gapminder.org/data/). Download the [**gapminder.csv** data by clicking here](data/gapminder.csv) or using the link above.
Let's read in the data to an object called `gm` and take a look with `View`. Remember, we need to load both the dplyr and readr packages for efficiently reading in and displaying this data.
```{r readGapminder, eval=TRUE}
# Load packages
library(readr)
library(dplyr)
# Download the data locally and read the file
gm <- read_csv(file="data/gapminder.csv")
# Show the first few lines of the data
gm
# Optionally bring up data in a viewer window.
# View(gm)
```
This particular excerpt has 1704 observations on six variables:
* `country`: a categorical variable with 142 levels
* `continent`: a categorical variable with 5 levels
* `year`: going from 1952 to 2007 in increments of 5 years
* `pop`: population
* `gdpPercap`: GDP per capita
* `lifeExp`: life expectancy
### dplyr review
The dplyr package gives you a handful of useful **verbs** for managing data. On their own they don't do anything that base R can't do. Here are some of the _single-table_ verbs we'll be working with in this lesson (single-table meaning that they only work on a single table -- contrast that to _two-table_ verbs used for joining data together). They all take a `data.frame` or `tbl` as their input for the first argument, and they all return a `data.frame` or `tbl` as output.
1. `filter()`: filters _rows_ of the data where some condition is true
1. `select()`: selects out particular _columns_ of interest
1. `mutate()`: adds new columns or changes values of existing columns
1. `arrange()`: arranges a data frame by the value of a column
1. `summarize()`: summarizes multiple values to a single value, most useful when combined with...
1. `group_by()`: groups a data frame by one or more variables. Most data operations are most useful when done on groups defined by variables in the dataset. The `group_by` function takes an existing data frame and converts it into a grouped data frame where `summarize()` operations are performed _by group_.
Additionally, the **`%>%`** operator allows you to "chain" operations together. Rather than nesting functions inside out, the `%>%` operator allows you to write operations left-to-right, top-to-bottom. Let's say we wanted to get the average life expectancy and GDP (not GDP per capita) for Asian countries for each year.
![](img/nest_vs_pipe_gm.png)
The `%>%` would allow us to do this:
```{r pipe, eval=TRUE}
gm %>%
mutate(gdp=gdpPercap*pop) %>%
filter(continent=="Asia") %>%
group_by(year) %>%
summarize(mean(lifeExp), mean(gdp))
```
Instead of this:
```{r nopipemess, eval=TRUE, results='hide'}
summarize(
group_by(
filter(
mutate(gm, gdp=gdpPercap*pop),
continent=="Asia"),
year),
mean(lifeExp), mean(gdp))
```
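To check that the two forms really are equivalent, here is a minimal sketch on a small made-up data frame. The `toy` object and its values are invented for illustration and are not the real Gapminder numbers. The sketch also names the summary columns, which the examples above leave unnamed.

```r
library(dplyr)

# Toy data, invented for illustration only (not the real Gapminder values)
toy <- data.frame(
  continent = c("Asia", "Asia", "Asia", "Europe"),
  year      = c(1952, 1952, 1957, 1952),
  lifeExp   = c(40, 50, 55, 70),
  gdpPercap = c(1000, 2000, 3000, 9000),
  pop       = c(10, 20, 30, 40)
)

# Piped version: read left-to-right, top-to-bottom
piped <- toy %>%
  mutate(gdp = gdpPercap * pop) %>%
  filter(continent == "Asia") %>%
  group_by(year) %>%
  summarize(mean_lifeExp = mean(lifeExp), mean_gdp = mean(gdp))

# Nested version: read from the inside out
nested <- summarize(
  group_by(
    filter(mutate(toy, gdp = gdpPercap * pop), continent == "Asia"),
    year),
  mean_lifeExp = mean(lifeExp), mean_gdp = mean(gdp))

# Both approaches give the same two-row summary (one row per year)
all.equal(piped, nested)
piped
```

Naming the summaries (`mean_lifeExp = ...`) is optional, but it makes the resulting columns much easier to refer to in later steps.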
## About ggplot2
**ggplot2** is a widely used R package that extends R's visualization capabilities. It takes the hassle out of things like creating legends, mapping other variables to scales like color, or faceting plots into small multiples. We'll learn about what all these things mean shortly.
_Where does the "gg" in ggplot2 come from?_ The **ggplot2** package provides an R implementation of Leland Wilkinson's *Grammar of Graphics* (1999). The *Grammar of Graphics* allows you to think beyond the garden variety plot types (e.g. scatterplot, barplot) and consider the components that make up a plot or graphic, such as how data are represented on the plot (as lines, points, etc.), how variables are mapped to coordinates or plotting shape or color, what transformation or statistical summary is required, and so on.
Specifically, **ggplot2** allows you to build a plot layer-by-layer by specifying:
- a **geom**, which specifies how the data are represented on the plot (points, lines, bars, etc.),
- **aesthetics** that map variables in the data to axes on the plot or to plotting size, shape, color, etc.,
- a **stat**, a statistical transformation or summary of the data applied prior to plotting,
- **facets**, which allow the data to be divided into chunks on the basis of other categorical or continuous variables and the same plot drawn for each chunk.
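As a rough sketch of how these four pieces fit together, the following builds one small plot that uses each of them. The `toy` data frame and the name `p_toy` are made up for illustration; the rest of the lesson builds the same ideas up step by step with the Gapminder data.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(
  x     = c(1, 2, 3, 4, 1, 2, 3, 4),
  y     = c(2, 4, 5, 8, 1, 3, 3, 6),
  group = rep(c("a", "b"), each = 4)
)

p_toy <- ggplot(toy, aes(x = x, y = y)) +   # aesthetics: map variables to the axes
  geom_point(aes(color = group)) +          # geom: draw points, colored by group
  stat_smooth(method = "lm", se = FALSE) +  # stat: fit a linear model before drawing
  facet_wrap(~group)                        # facets: one panel per group

p_toy
```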
_First, a note about `qplot()`._ The `qplot()` function is a quick and dirty way of making ggplot2 plots. You might see it if you look for help with ggplot2, and it's even covered extensively in the ggplot2 book. And if you're used to making plots with built-in base graphics, the `qplot()` function will probably feel more familiar. But the sooner you abandon the `qplot()` syntax the sooner you'll start to really understand ggplot2's approach to building up plots layer by layer. So we're not going to use it at all in this class.
Finally, see [this course's help page](help.html#ggplot2-resources) for links to getting more help with ggplot2.
## Plotting bivariate data: continuous Y by continuous X
The `ggplot` function has two required arguments: the *data* used for creating the plot, and an *aesthetic* mapping to describe how variables in said data are mapped to things we can see on the plot.
First let's load the package:
```{r loadggplot2, eval=TRUE}
library(ggplot2)
```
Now, let's lay out the plot. If we want to plot a continuous Y variable by a continuous X variable we're probably most interested in a scatter plot. Here, we're telling ggplot that we want to use the `gm` dataset, and the aesthetic mapping will map `gdpPercap` onto the x-axis and `lifeExp` onto the y-axis. Remember that the variable names are case sensitive!
```{r noLayers}
ggplot(gm, aes(x = gdpPercap, y = lifeExp))
```
When we do that we get a blank canvas with no data showing (you might get an error if you're using an old version of ggplot2). That's because all we've done is laid out a two-dimensional plot specifying what goes on the x and y axes, but we haven't told it what kind of geometric object to plot. The obvious choice here is a point. Check out [docs.ggplot2.org](http://docs.ggplot2.org/) to see what kind of geoms are available.
```{r}
ggplot(gm, aes(x = gdpPercap, y = lifeExp)) + geom_point()
```
Here, we've built our plot in layers. First, we create a canvas for plotting layers to come using the `ggplot` function, specifying which **data** to use (here, the **gm** data frame), and an **aesthetic mapping** of `gdpPercap` to the x-axis and `lifeExp` to the y-axis. We next add a layer to the plot, specifying a **geom**, or a way of visually representing the aesthetic mapping.
Now, the typical workflow for building up a ggplot2 plot is to first construct the figure and save that to a variable (for example, `p`), and as you're experimenting, you can continue to re-define the `p` object as you develop "keeper commands".
First, let's construct the graphic. Notice that we don't have to specify `x=` and `y=` if we specify the arguments in the correct order (x is first, y is second).
```{r, eval=TRUE}
p <- ggplot(gm, aes(gdpPercap, lifeExp))
```
The `p` object now contains the canvas, but nothing else. Try displaying it by just running `p`. Let's experiment with adding points and a different scale to the x-axis.
```{r}
# Experiment with adding points
p + geom_point()
# Experiment with a different scale
p + geom_point() + scale_x_log10()
```
I like the look of using a log scale for the x-axis. Let's make that stick.
```{r, eval=TRUE}
p <- p + scale_x_log10()
```
Now, if we re-ran `p` still nothing would show up because the `p` object just contains a blank canvas. Now, re-plot again with a layer of points:
```{r}
p + geom_point()
```
Now notice what I've saved to `p` at this point: only the basic plot layout and the log10 mapping on the x-axis. I didn't save any layers yet because I want to fiddle around with the points for a bit first.
Above we implied the aesthetic mappings for the x- and y- axis should be `gdpPercap` and `lifeExp`, but we can also add aesthetic mappings to the geoms themselves. For instance, what if we wanted to color the points by the value of another variable in the dataset, say, continent?
```{r}
p + geom_point(aes(color=continent))
```
Notice the difference here. If I wanted the colors to be some static value, I wouldn't wrap that in a call to `aes()`. I would just specify it outright. Same thing with other features of the points. For example, let's make all the points huge (`size=8`), blue (`color="blue"`), semitransparent (`alpha=1/4`) triangles (`pch=17`):
```{r}
p + geom_point(color="blue", pch=17, size=8, alpha=1/4)
```
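The difference between _mapping_ an aesthetic (inside `aes()`) and _setting_ it (outside `aes()`) is easy to get wrong, so here is a small sketch on made-up data. The `toy` data frame and the `p_*` names are hypothetical. `layer_data()` is used only to inspect the colors ggplot2 actually computed.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(
  gdpPercap = c(500, 1000, 5000, 20000),
  lifeExp   = c(45, 55, 65, 78),
  continent = c("Africa", "Asia", "Americas", "Europe")
)
p_toy <- ggplot(toy, aes(gdpPercap, lifeExp))

# Mapped: color depends on the data, so it goes inside aes() and gets a legend
p_mapped <- p_toy + geom_point(aes(color = continent))

# Set: color is a fixed value, so it goes outside aes() and gets no legend
p_set <- p_toy + geom_point(color = "blue")

# A common mistake: a constant inside aes() is treated as data, so the
# points are NOT drawn in blue; "blue" just becomes a legend label
p_oops <- p_toy + geom_point(aes(color = "blue"))

# Inspect the colors ggplot2 computed for each version
unique(layer_data(p_set)$colour)
unique(layer_data(p_oops)$colour)
```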
Now, this time, let's map the aesthetics of the point character to certain features of the data. For instance, let's give the points different colors and character shapes according to the continent, and map the size of the point onto the life expectancy:
```{r}
p + geom_point(aes(col=continent, shape=continent, size=lifeExp))
```
Now, this isn't a great plot because there are several aesthetic mappings that are redundant. Life expectancy is mapped to both the y-axis and the size of the points -- the size mapping is superfluous. Similarly, continent is mapped to both the color and the point character (the shape is superfluous). Let's get rid of that, but let's make the points a little bigger outside of an aesthetic mapping.
```{r scatter_colContinent_size4, eval=TRUE, fig.keep="last"}
p + geom_point(aes(col=continent), size=3)
```
----
**EXERCISE `r .ex``r .ex=.ex+1`**
Re-create this same plot from scratch without saving anything to a variable. That is, start from the `ggplot` call.
* Start with the `ggplot()` function.
* Use the gm data.
* Map `gdpPercap` to the x-axis and `lifeExp` to the y-axis.
* Add points to the plot
* Make the points size 3
* Map continent onto the aesthetics of the point
* Use a log<sub>10</sub> scale for the x-axis.
```{r, echo=FALSE}
ggplot(gm, aes(gdpPercap, lifeExp)) +
geom_point(aes(col=continent), size=3) +
scale_x_log10()
```
----
### Adding layers
Let's add a fitted curve to the points. Recreate the plot in the `p` object if you need to.
```{r scatter_addsmoothlayer, eval=TRUE, fig.keep="last"}
p <- ggplot(gm, aes(gdpPercap, lifeExp)) + scale_x_log10()
p + geom_point() + geom_smooth()
```
By default, `geom_smooth()` fits a loess curve for data with fewer than 1,000 observations and a generalized additive model (GAM) for larger data. We can change that behavior by tweaking the parameters to use a thick red line, use a linear model instead of a GAM, and turn off the standard error stripes.
```{r}
p + geom_point() + geom_smooth(lwd=2, se=FALSE, method="lm", col="red")
```
But let's add back in our aesthetic mapping to the continents. Notice what happens here. We're mapping continent as an aesthetic mapping _to the color of the points only_ -- so `geom_smooth()` still works only on the entire data.
```{r}
p + geom_point(aes(color = continent)) + geom_smooth()
```
But notice what happens here: we make the call to `aes()` outside of the `geom_point()` call, and the continent variable gets mapped as an aesthetic to any further geoms. So here, we get separate smoothing lines for each continent. Let's do it again but remove the standard error stripes and make the lines a bit thicker.
```{r scatter_final, eval=TRUE, fig.keep="last"}
p + aes(color = continent) + geom_point() + geom_smooth()
p + aes(color = continent) + geom_point() + geom_smooth(se=FALSE, lwd=2)
```
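To see the inheritance rule in isolation, here is a sketch on made-up data that counts how many groups the smoothing layer sees in each case. The `toy` data and the `p_layer` / `p_global` names are hypothetical, and `method = "lm"` is used only to keep the example small.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(
  x         = rep(1:5, times = 2),
  y         = c(1.1, 1.9, 3.2, 3.9, 5.1, 4.8, 4.1, 3.0, 2.2, 0.9),
  continent = rep(c("A", "B"), each = 5)
)
p_toy <- ggplot(toy, aes(x, y))

# color mapped inside geom_point() only: one fit through all ten points
p_layer <- p_toy + geom_point(aes(color = continent)) +
  geom_smooth(method = "lm", se = FALSE)

# color mapped at the plot level: inherited by every layer, one fit per group
p_global <- p_toy + aes(color = continent) + geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# Count the groups seen by the smoothing layer (layer 2) in each plot
length(unique(layer_data(p_layer, 2)$group))   # 1
length(unique(layer_data(p_global, 2)$group))  # 2
```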
### Faceting
Facets display subsets of the data in different panels. There are a couple ways to do this, but `facet_wrap()` tries to sensibly wrap a series of facets into a 2-dimensional grid of small multiples. Just give it a formula specifying which variables to facet by. We can continue adding more layers, such as smoothing. If you have a look at the help for `?facet_wrap()` you'll see that we can control how the wrapping is laid out.
```{r scatter_facet1, eval=TRUE, fig.keep="last", fig.width=5, fig.height=12}
p + geom_point() + facet_wrap(~continent)
p + geom_point() + geom_smooth() + facet_wrap(~continent, ncol=1)
```
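As a small illustration of controlling the layout, the sketch below uses made-up data whose groups sit on very different scales. The `toy` data and `p_*` names are hypothetical; `ncol` and `scales` are arguments documented in `?facet_wrap`.

```r
library(ggplot2)

# Toy data, invented for illustration only: three groups on different scales
toy <- data.frame(
  x         = rep(1:3, times = 3),
  y         = c(1, 2, 3, 10, 20, 30, 100, 200, 300),
  continent = rep(c("A", "B", "C"), each = 3)
)
p_toy <- ggplot(toy, aes(x, y)) + geom_point()

# Default: panels wrapped into a grid, all sharing the same axis ranges
p_grid <- p_toy + facet_wrap(~continent)

# One column, with each panel free to choose its own y-axis range
p_col <- p_toy + facet_wrap(~continent, ncol = 1, scales = "free_y")

p_col
```

Shared axes make panels directly comparable, while free scales make the pattern inside each panel easier to see. Which one is appropriate depends on the comparison you want the reader to make.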
### Saving plots
There are a few ways to save ggplots. The quickest way, that works in an interactive session, is to use the `ggsave()` function. You give it a file name and by default it saves the last plot that was printed to the screen.
```{r, eval=FALSE}
p + geom_point()
ggsave(file="myplot.png")
```
But if you're running this through a script, the best way to do it is to pass `ggsave()` the object containing the plot that is meant to be saved. We can also adjust things like the width, height, and resolution. `ggsave()` also recognizes the name of the file extension and saves the appropriate kind of file. Let's save a PDF.
```{r, eval=FALSE}
pfinal <- p + geom_point() + geom_smooth() + facet_wrap(~continent, ncol=1)
ggsave(filename="myplot.pdf", plot=pfinal, width=5, height=15)
```
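For scripts, a pattern along these lines may be a little more robust: name the `plot` explicitly and check that the file was written. This is only a sketch; the `toy` data is made up, and the temporary file path is used just so the example leaves nothing behind in your working directory.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(x = 1:5, y = c(2, 4, 3, 6, 5))
p_toy <- ggplot(toy, aes(x, y)) + geom_point()

# Write to a temporary file; the .pdf extension selects the PDF device
out <- tempfile(fileext = ".pdf")

# Name the plot explicitly; width and height are in inches by default
ggsave(filename = out, plot = p_toy, width = 5, height = 4)

# Confirm that the file was created
file.exists(out)
```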
----
**EXERCISE `r .ex``r .ex=.ex+1`**
1. Make a scatter plot of `lifeExp` on the y-axis against `year` on the x.
1. Make a series of small multiples faceting on continent.
1. Add a fitted curve, smooth or lm, with and without facets.
1. **Bonus**: using `geom_line()` and an aesthetic mapping of `country` to `group=`, make a "spaghetti plot", showing _semitransparent_ lines connected for each country, faceted by continent. Add a smoothed loess curve with a thick (`lwd=3`) line with no standard error stripe. Reduce the opacity (`alpha=`) of the individual black lines. _Don't_ show Oceania countries (that is, `filter()` the data where `continent!="Oceania"` before you plot it).
```{r, include=FALSE, eval=FALSE}
p <- ggplot(gm, aes(year, lifeExp))
p + geom_point()
p + geom_point() + geom_smooth()
p + geom_point() + geom_smooth() + facet_wrap(~continent)
```
```{r spaghetti, echo=FALSE, eval=TRUE, fig.keep="last", fig.width=8}
p <- ggplot(filter(gm, continent!="Oceania"), aes(year, lifeExp))
# p + facet_wrap(~continent) + geom_line()
# p + facet_wrap(~continent) + geom_line(aes(group=country))
p + facet_wrap(~continent) + geom_line(aes(group=country), alpha=.5) + geom_smooth(lwd=3, se=FALSE)
```
----
## Plotting bivariate data: continuous Y by categorical X
With the last example we examined the relationship between a continuous Y variable and a continuous X variable. A scatter plot was the obvious kind of data visualization. But what if we wanted to visualize a continuous Y variable against a categorical X variable? We sort of saw what that looked like in the last exercise. `year` is a continuous variable, but in this dataset, it's broken up into 5-year segments, so you could almost think of each year as a categorical variable. But a better example would be life expectancy against continent or country.
First, let's set up the basic plot:
```{r, eval=TRUE}
p <- ggplot(gm, aes(continent, lifeExp))
```
Then add points:
```{r}
p + geom_point()
```
That's not terribly useful. There's a big overplotting problem. We can try to solve it with transparency:
```{r}
p + geom_point(alpha=1/4)
```
But that really only gets us so far. What if we spread things out by adding a little bit of horizontal noise (aka "jitter") to the data?
```{r}
p + geom_jitter()
```
Note that the little bit of horizontal noise added by the jitter is random. If you run that command over and over again, each time it will look slightly different. The idea is to visualize the density at each vertical position, and spreading out the points horizontally allows you to do that. If there were still lots of over-plotting you might think about adding some transparency by setting the `alpha=` value for the jitter.
```{r}
p + geom_jitter(alpha=1/2)
```
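If you need the jittered plot to look the same on every run, one option is to control the jitter explicitly with `position_jitter()`, which accepts `width`, `height`, and `seed` arguments. By default `geom_jitter()` also adds a small amount of vertical noise; setting `height = 0` keeps the y values exactly as they are in the data. The `toy` data and the particular values below are made up for illustration.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(
  continent = rep(c("A", "B"), each = 4),
  lifeExp   = c(50, 50, 51, 52, 70, 70, 71, 72)
)

# width controls the horizontal spread; height = 0 leaves y untouched;
# seed makes the random offsets the same on every run
jit <- position_jitter(width = 0.2, height = 0, seed = 42)
p_jit <- ggplot(toy, aes(continent, lifeExp)) + geom_point(position = jit)

# Building the plot twice gives the same jittered x positions
first  <- layer_data(p_jit)
second <- layer_data(p_jit)
all.equal(as.numeric(first$x), as.numeric(second$x))

# The y values are exactly the original data
all(first$y == toy$lifeExp)
```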
Probably a more common visualization is to show a box plot:
```{r}
p + geom_boxplot()
```
But why not show the summary and the raw data?
```{r}
p + geom_jitter() + geom_boxplot()
```
Notice how in that example we first added the jitter layer then added the boxplot layer. But the boxplot is now superimposed over the jitter layer. Let's make the jitter layer go on top. Also, go back to just the boxplots. Notice that the outliers are represented as points. But there's no distinction between the outlier point from the boxplot geom and all the other points from the jitter geom. Let's change that. Notice the British spelling.
```{r}
p + geom_boxplot(outlier.colour = "red") + geom_jitter(alpha=1/2)
```
There's another geom that's useful here, called a violin plot.
```{r}
p + geom_violin()
p + geom_violin() + geom_jitter(alpha=1/2)
```
Let's go back to our boxplot for a moment.
```{r}
p + geom_boxplot()
```
This plot would be a lot more effective if the continents were shown in some sort of order other than alphabetical. To do that, we'll have to go back to our basic build of the plot again and use the `reorder` function in our original aesthetic mapping. Here, `reorder` is taking the first variable, which is some categorical variable, and ordering its levels by the mean of the second variable, which is a continuous variable. It looks like this:
```{r, eval=TRUE}
p <- ggplot(gm, aes(x=reorder(continent, lifeExp), y=lifeExp))
```
```{r boxplot, eval=TRUE, fig.keep='last'}
p + geom_boxplot()
```
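To see what `reorder()` does on its own, outside of a plot, here is a sketch on made-up data. The `toy` data frame is invented for illustration. By default the levels are ordered by the mean of the second argument; the `FUN` argument swaps in a different summary.

```r
# Toy data, invented for illustration only
toy <- data.frame(
  continent = rep(c("A", "B", "C"), each = 3),
  lifeExp   = c(70, 75, 80,   # A: mean 75, median 75
                40, 45, 50,   # B: mean 45, median 45
                50, 60, 130)  # C: mean 80, median 60
)

# Default: order levels by the mean of lifeExp within each continent
by_mean <- reorder(toy$continent, toy$lifeExp)
levels(by_mean)    # "B" "A" "C"

# Use FUN to order by a different summary, here the median
by_median <- reorder(toy$continent, toy$lifeExp, FUN = median)
levels(by_median)  # "B" "C" "A"

# Negate the values to get a descending order instead
by_mean_desc <- reorder(toy$continent, -toy$lifeExp)
levels(by_mean_desc)  # "C" "A" "B"
```

Because the ordering depends on the summary function, a skewed variable can sort differently by mean than by median.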
----
**EXERCISE `r .ex``r .ex=.ex+1`**
1. Make a jittered strip plot of GDP per capita against continent.
1. Make a box plot of GDP per capita against continent.
1. Using a log<sub>10</sub> y-axis scale, overlay semitransparent jittered points on top of box plots, where outlying points are colored.
1. **BONUS**: Try to reorder the continents on the x-axis by GDP per capita. Why isn't this working as expected? See `?reorder` for clues.
```{r, echo=FALSE}
p <- ggplot(gm, aes(continent, gdpPercap))
p + geom_jitter()
p + geom_boxplot()
p <- ggplot(gm, aes(reorder(continent, gdpPercap), gdpPercap))
p <- p + scale_y_log10()
p + geom_boxplot(outlier.colour="red") + geom_jitter(alpha=1/2)
library(dplyr)
gm %>% group_by(continent) %>% summarize(mean(gdpPercap))
gm %>% group_by(continent) %>% summarize(mean(log10(gdpPercap)))
p <- ggplot(gm, aes(reorder(continent, gdpPercap, FUN=function(x) mean(log10(x))), gdpPercap))
p <- p + scale_y_log10()
p + geom_boxplot(outlier.colour="red") + geom_jitter(alpha=1/2)
```
----
## Plotting univariate continuous data
What if we just wanted to visualize the distribution of a single continuous variable? A histogram is the usual go-to visualization. Here we only have one aesthetic mapping instead of two.
```{r init_histogram, eval=TRUE}
p <- ggplot(gm, aes(lifeExp))
```
```{r}
p + geom_histogram()
```
When we do this ggplot lets us know that we're automatically selecting the width of the bins, and we might want to think about this a little further.
```{r}
p + geom_histogram(bins=30)
p + geom_histogram(bins=10)
p + geom_histogram(bins=200)
```
```{r histogram, eval=TRUE, fig.keep="last"}
p + geom_histogram(bins=60)
```
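Instead of choosing the number of bins, you can choose their width in data units with `binwidth=`. The sketch below contrasts the two on made-up data; the `toy` values and the choice of a 10-year width are assumptions for illustration only.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(lifeExp = c(41, 43, 47, 52, 58, 61, 63, 66, 72, 79))
p_toy <- ggplot(toy, aes(lifeExp))

# bins fixes the NUMBER of bars; their width follows from the data range
p_bins <- p_toy + geom_histogram(bins = 4)

# binwidth fixes the WIDTH of each bar in data units (here, 10 years);
# boundary = 0 aligns the bin edges to multiples of 10
p_width <- p_toy + geom_histogram(binwidth = 10, boundary = 0)

# Either way, every observation lands in exactly one bin
sum(layer_data(p_bins)$count)
sum(layer_data(p_width)$count)
```

A width in meaningful units (such as 5 or 10 years of life expectancy) is often easier to explain to a reader than an arbitrary bin count.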
Alternatively, we could plot a smoothed density curve instead of a histogram:
```{r}
p + geom_density()
```
Back to histograms. What if we wanted to color this by continent?
```{r}
p + geom_histogram(aes(color=continent))
```
That's not what we had in mind. That's just the outline of the bars. We want to change the _fill_ color of the bars.
```{r}
p + geom_histogram(aes(fill=continent))
```
Well, that's not exactly what we want either. If you look at the help for `?geom_histogram` you'll see that by default it stacks overlapping bars. This isn't really an effective visualization. Let's change the position argument.
```{r}
p + geom_histogram(aes(fill=continent), position="identity")
```
But the problem there is that the histograms are blocking each other. What if we tried transparency?
```{r}
p + geom_histogram(aes(fill=continent), position="identity", alpha=1/3)
```
That's somewhat helpful, and might work for two distributions, but it gets cumbersome with 5. Let's go back and try this with density plots, first changing the color of the line:
```{r}
p + geom_density(aes(color=continent))
```
Then by changing the color of the fill and setting the transparency to 25%:
```{r densityplot, eval=TRUE, fig.keep="last"}
p + geom_density(aes(fill=continent), alpha=1/4)
```
----
**EXERCISE `r .ex``r .ex=.ex+1`**
1. Plot a histogram of GDP Per Capita.
1. Do the same but use a log<sub>10</sub> x-axis.
1. Still on the log<sub>10</sub> x-axis scale, try a density plot mapping continent to the fill of each density distribution, and reduce the opacity.
1. Still on the log<sub>10</sub> x-axis scale, make a histogram faceted by continent _and_ filled by continent. Facet with a single column (see `?facet_wrap` for help).
1. Save this figure to a 6x10 PDF file.
```{r, echo=FALSE, eval=FALSE}
p <- ggplot(gm, aes(gdpPercap))
p + geom_histogram()
p <- p + scale_x_log10()
p + geom_histogram()
p + geom_density(aes(fill=continent), alpha=1/4)
p + geom_histogram(aes(fill=continent)) + facet_wrap(~continent, ncol=1)
ggsave("myplot.pdf", width=6, height=10)
```
----
## Publication-ready plots & themes
Let's make a plot we made earlier (life expectancy versus the log of GDP per capita with points colored by continent with lowess smooth curves overlaid without the standard error ribbon):
```{r, eval=TRUE}
p <- ggplot(gm, aes(gdpPercap, lifeExp))
p <- p + scale_x_log10()
p <- p + aes(col=continent) + geom_point() + geom_smooth(lwd=2, se=FALSE)
```
Give the plot a title and axis labels:
```{r, eval=TRUE}
p <- p + ggtitle("Life expectancy vs GDP by Continent")
p <- p + xlab("GDP Per Capita (USD)") + ylab("Life Expectancy (years)")
```
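The title and axis labels can also be set in a single call with `labs()`, which some people find easier to read than separate `ggtitle()`, `xlab()`, and `ylab()` calls. This is a sketch on made-up data; `toy` and `p_toy` are hypothetical names, and the label text is only an example.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(gdpPercap = c(500, 5000, 50000), lifeExp = c(45, 65, 80))

# labs() sets the title and both axis labels in one call
p_toy <- ggplot(toy, aes(gdpPercap, lifeExp)) +
  geom_point() +
  scale_x_log10() +
  labs(
    title = "Life expectancy vs GDP",
    x     = "GDP Per Capita (USD)",
    y     = "Life Expectancy (years)"
  )

p_toy
```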
By default, the "gray" theme is the usual background (I've changed this course website to use the black and white background for all images).
```{r theme_gray, eval=TRUE, fig.keep="last"}
p + theme_gray()
```
We could also get a black and white background:
```{r}
p + theme_bw()
```
Or go a step further and remove the gridlines:
```{r theme_classic, eval=TRUE, fig.keep="last"}
p + theme_classic()
```
Finally, there's another package that gives us lots of different themes. Install it if you don't have it already. Install all its dependencies along with it.
```{r, eval=FALSE}
install.packages("ggthemes", dependencies = TRUE)
```
```{r themes, eval=FALSE}
library(ggthemes)
p <- ggplot(gm, aes(gdpPercap, lifeExp))
p <- p + scale_x_log10()
p <- p + aes(col=continent) + geom_point() + geom_smooth(lwd=2, se=FALSE)
p + theme_excel()
p + theme_excel() + scale_colour_excel()
p + theme_gdocs() + scale_colour_gdocs()
p + theme_stata() + scale_colour_stata()
p + theme_wsj() + scale_colour_wsj()
p + theme_economist()
p + theme_fivethirtyeight()
p + theme_tufte()
```
Don't forget - see [this course's help page](help.html#ggplot2-resources) for links to getting more help with ggplot2!
## Homework
Looking for more practice? [Try this homework assignment](r-viz-homework.html).
----