forked from itsleeds/roadworksUK
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
151 lines (108 loc) · 3.89 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
---
output: word_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# roadworksUK
The goal of roadworksUK is to enable you to access, process and visualis data on UK roadworks, particularly Electronic Transfer of Notifications (EToN) records.
## Installation
<!-- You can install the released version of roadworksUK from [CRAN](https://CRAN.R-project.org) with: -->
```{r, echo=FALSE, eval=FALSE}
install.packages("roadworksUK")
```
Install the package hosted on [GitHub](https://github.com/) with:
```{r, message=FALSE}
# install.packages("devtools")
devtools::install_github("ITSLeeds/roadworksUK")
```
Load the package with:
```{r}
library(roadworksUK)
```
## What's in the package?
There are a number of functions for processing EToN records, as shown by:
```{r}
x <- library(help = roadworksUK)
x$info[[2]]
```
The datasets provided by the package include `kent10`, a minimal dataset containing data from Kent, `htdd_ashford`, ~981 records from Ashford, Kent.
The characteristics of these datasets is demonstrated below:
```{r}
data("htdd_ashford")
dim(htdd_ashford) # a larger dataset
htdd_ashford[1:3, 1:5]
```
```{r}
file_path = system.file("extdata", "kent10.csv.gz", package = "roadworksUK")
file.exists(file_path)
htdd_example = rw_import_elgin_htdd(file_path = file_path)
ncol(htdd_example)
nrow(htdd_example) # a small dataset with 10 rows
```
## Example: common roadworks in Ashford
This section explores roadworks `htdd_ashford`, a dataset that is provided by the package: it's made available when you load **roadworksUK**.
It relies on the **tidyverse**:
```{r, message=FALSE}
library(tidyverse)
```
A good way to 'get the measure' of potentially large spatio-temporal datasets is to find their size (in MB/GB/TB and number of rows/columns) and their temporal extent (we'll plot its spatial extent soon).
We can find all of these things for the `htdd_ashford` dataset as follows:
```{r}
pryr::object_size(htdd_ashford) # less than 2 MB - good test dataset
nrow(htdd_ashford)
ncol(htdd_ashford)
range(htdd_ashford$e__date_created)
```
A summary table of the contents of this dataset can be generated as follows:
```{r}
htdd_ashford %>%
select(id, responsible_org_name, responsible_org_sector, description) %>%
slice(1:10) %>%
knitr::kable()
```
The following commands can find-out who reports road works in Ashford, using the **dplyr** package (part of the tidyverse):
```{r, message=FALSE}
htdd_ashford %>%
group_by(publisher_name) %>%
summarise(n = n()) %>%
arrange(desc(n))
```
It's mostly Kenty County Council. There are a handful of reports by HE in the region also.
Find out who does the work with the following commands (this finds the top 5 organisations, change then `n` parameter to see more organisations):
```{r}
htdd_ashford %>%
group_by(responsible_org_name) %>%
summarise(n = n()) %>%
arrange(desc(n)) %>%
top_n(n = 5, wt = n)
```
Let's take a look at the temporal distribution of roadworks in the example dataset:
```{r}
plot(htdd_ashford$e__date_created, htdd_ashford$e__duration_days)
```
This distribution is characteristic of roadworks data: it's not usually logged when it begins but after it ends.
A log of actual reporting dates is illustrated in the next plot:
```{r}
plot(htdd_ashford$e__date_updated, htdd_ashford$e__duration_days)
```
This shows the log comes from a single month (June) in 2018.
We can do more sophisticated plots building on these examples and using packages such as **ggplot2**.
For now, we will move on to plot the spatial extent of the object:
```{r}
library(tmap)
tmap_mode("view")
tm_basemap(server = leaflet::providers$OpenTopoMap) +
qtm(htdd_ashford$i__location_point)
```
```{r}
library(tidyverse)
htdd_small = htdd_ashford %>%
select(description)
```