-
Notifications
You must be signed in to change notification settings - Fork 0
/
7.mutate.qmd
98 lines (75 loc) · 3.8 KB
/
7.mutate.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
---
title: "7. Mutate"
author: "Patrick Spauster"
format:
html:
code-copy: true
toc: true
toc-location: right
editor: visual
---
## Video Tutorial
<div style="max-width:608px"><div style="position:relative;padding-bottom:66.118421052632%"><iframe id="kaltura_player" src="https://cdnapisec.kaltura.com/p/1674401/sp/167440100/embedIframeJs/uiconf_id/23435151/partner_id/1674401?iframeembed=true&playerId=kaltura_player&entry_id=1_zcp7gcg8&flashvars[streamerType]=auto&flashvars[localizationCode]=en&flashvars[sideBarContainer.plugin]=true&flashvars[sideBarContainer.position]=left&flashvars[sideBarContainer.clickToClose]=true&flashvars[chapters.plugin]=true&flashvars[chapters.layout]=vertical&flashvars[chapters.thumbnailRotator]=false&flashvars[streamSelector.plugin]=true&flashvars[EmbedPlayer.SpinnerTarget]=videoHolder&flashvars[dualScreen.plugin]=true&flashvars[LeadWithHLSOnFlash]=true&flashvars[Kaltura.addCrossoriginToIframe]=true&&wid=1_wy3hyyla" width="608" height="402" allowfullscreen webkitallowfullscreen mozAllowFullScreen allow="autoplay *; fullscreen *; encrypted-media *" sandbox="allow-downloads allow-forms allow-same-origin allow-scripts allow-top-navigation allow-pointer-lock allow-popups allow-modals allow-orientation-lock allow-popups-to-escape-sandbox allow-presentation allow-top-navigation-by-user-activation" frameborder="0" title="7. mutate" style="position:absolute;top:0;left:0;width:100%;height:100%"></iframe></div></div>
## Mutate
Mutate is an incredibly powerful tool to create new columns and new variables.
```{r include = F}
library(tidyverse)
library(janitor)
?mutate()
```
Let's grab our code to read in the clean dataframe
```{r message = F}
fhv_clean <- read_csv(file = "For_Hire_Vehicles__FHV__-_Active.csv") %>%
clean_names() %>%
rename(hybrid = veh)
```
We create a new column with mutate by setting the name of our new column and a new value
```{r}
fhv_clean %>%
mutate(city = "New York City")
```
You can create multiple new columns (`...)` at once
```{r}
fhv_clean %>%
mutate(city = "New York City",
active = TRUE) #I can overwrite column names too. I've made this active column boolean (true or false)
```
## Mutate with logical expressions
Where mutate gets powerful is when you use it with logical expressions. Here we use `if_else()`
```{r}
fhv_rideshare <- fhv_clean %>%
mutate(rideshare = if_else(
condition = base_name == "UBER USA, LLC",
true = "rideshare",
false = "limo"
)) #if it's an uber call it rideshare, if its a limo call it something else
#notice I named the arguments here! A good practice when the argument is not ...
```
Tabulate the variable we made with the `count`() funtion
```{r}
fhv_rideshare %>%
count(rideshare)
```
What if we have more than one logical expression we care about? Check out `case_when`.
```{r}
fhv_blackcar <- fhv_clean %>%
mutate(
ride_type = case_when(
base_name == "UBER USA, LLC" & base_type == "BLACK-CAR" ~ "BLACK CAR RIDESHARE",
base_name != "UBER USA, LLC" & base_type == "BLACK-CAR" ~ "BLACK CAR NON-RIDESHARE",
TRUE ~ base_type #if it doesn't meet either condition, return the base_type
))
```
Use `&` and `|` for **and** and **or** logical expressions with multiple conditions
```{r}
fhv_blackcar %>%
count(ride_type)#now we have four categories!
```
## Normalizing with Mutate
You can use statistical functions like mean to normalize data with mutate. mean will return the average of all the vehicle years. You can use mutate to generate a new variable that takes the distance from each observation to the mean.
```{r}
fhv_clean %>%
mutate(year_norm = vehicle_year/mean(vehicle_year, na.rm = T),
year_pct = percent_rank(vehicle_year)) %>%
select(vehicle_license_number, vehicle_year, year_norm, year_pct)
```