-
Notifications
You must be signed in to change notification settings - Fork 4
/
session-messy-csv.qmd
84 lines (54 loc) · 2.71 KB
/
session-messy-csv.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
---
title: "Introduction to R and Rstudio"
subtitle: "Session - Import a messy csv"
execute:
echo: false
eval: true
---
## Importing a csv that needs tidying
```{r}
#| label: "libs"
#| include: false
library(readr)
```
```{r }
#| label: "load-data"
beds_data <- read_csv(url("https://raw.githubusercontent.com/nhs-r-community/intro_r_data/main/beds_data.csv"),
col_types = cols(date = col_date(format = "%d/%m/%Y")),
skip = 3)
```
### Exploring Mental Health (MH) Inpatient Capacity
The following is some analysis of Mental Health inpatient capacity in England.
### Background
Maintaining clinical effectiveness and safety when a ward is fully occupied is a serious challenge for staff.
Inappropriate out of area placements have an added cost and also mean patients are separated from their social support networks.
## The Data
KH03 returns (bed numbers and occupancy) by organisation, published by NHS England.
Scraped from the [NHSE statistics](https://www.england.nhs.uk/statistics/statistical-work-areas/bed-availability-and-occupancy/bed-data-overnight/
) website which is partially cleaned
## Accessible spreadsheets
This is a perfect opportunity to say that are [Government data standards](https://analysisfunction.civilservice.gov.uk/area_of_work/accessibility/) to make spreadsheets and charts accessible.
## Importing a messy dataset
::: columns
::: {.column width="40%"}
Click on the file `beds_data.csv` in the Files pane (bottom right) and then `Import Dataset...`
:::
::: {.column width="60%"}
<img class="center" src="img/session06/beds-data-wizard.PNG" alt="Screenshot of the import wizard with the beds_ae.csv data populating it. Notice that something isn't right."/>
:::
:::
## Cleaning in the wizard
It's a common issue to have blank metadata and blank lines at the top of spreadsheets
<img class="center" src="img/session06/beds-data-wizard-2.png" alt="Screenshot of the beds_data.csv data in the import wizard with the Title and Source rows highlighted." />
## Skip rows
<img class="center" src="img/session06/beds-data-wizard-skip.PNG" alt="Screenshot of where to skip rows in the import wizard. In this case skip 3 rows so the column headers appear at the top of the preview."/>
Move the cursor to another area to update the view immediately.
## Dates may be a problem
<img class="center" src="img/session06/beds-data-wizard-date.png" alt="Screenshot again with one more thing to fix: Click on the drop down menu by the column called date."/>
## Default US dates on UK data
<img class="center" src="img/session06/beds-data-date-wizard.PNG" alt="Screenshot of a wizard pop up to change the date format to %d/%m/%Y."/>
## Final data
```{r}
beds_data
```
## End session