Suggestions for the R and SQL episode #835

Closed
Aron-github opened this issue Feb 23, 2023 · 1 comment

@Aron-github

In the R Ecology Lesson, under "SQL databases and R", I would suggest the following edits, which I believe would help learners transition from the previous episode to the current one.

  • In the Introduction section: Add a sentence to introduce what SQL means and why we will be working with SQL databases. For example, after the second paragraph, one could add:

Often, public or private databases are structured using SQL (Structured Query Language), a standardized programming language that is used to manage relational databases and perform various operations on the data in them. These operations are in principle similar to what we have explored so far using the tidyverse in the Manipulating data episode (select, filter, perform operations, etc.) but are expressed through the SQL grammar.

  • In the Introduction section: avoid using SQL jargon before introducing it, or add a vocabulary list and/or an example to introduce it first. In the fourth paragraph, I would therefore suggest replacing this sentence:

Interfacing with databases using dplyr focuses on retrieving and analyzing datasets by generating SELECT SQL statements, but it doesn’t modify the database itself. dplyr does not offer functions to UPDATE or DELETE entries.

with

Interfacing with databases using dplyr focuses on retrieving and analyzing datasets by converting R code into the corresponding SQL statements, but it doesn’t include any function to directly modify the content of the database itself (e.g. by updating or deleting data).

  • Reorder the subsections as follows, so that even learners who have never seen SQL before are gently introduced to its grammar:
  1. Querying the database with the dplyr syntax
  2. SQL translation (using the same example used in 1.)
  3. Querying the database with the SQL syntax

In this way, learners could understand the SQL code used in 3. because it is the SQL translation of the R query used in 1.
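
As a rough sketch of how the proposed order could read, here is a small example. It assumes an in-memory SQLite database with a made-up three-row `surveys` table standing in for the lesson's database (the connection name `con` and the data are hypothetical), and that the DBI, RSQLite, dplyr and dbplyr packages are installed:

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the lesson's database: an in-memory SQLite
# database holding a tiny, made-up surveys table.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "surveys", data.frame(
  year       = c(1977, 1977, 1990),
  species_id = c("DM", "NL", "DM"),
  weight     = c(40, 4, 45)
))

surveys <- tbl(con, "surveys")

# 1. Querying the database with the dplyr syntax
light <- surveys %>%
  filter(weight < 5) %>%
  select(species_id, weight)

# 2. SQL translation of the same query (prints the generated SELECT statement)
show_query(light)

# 3. Querying the database with the SQL syntax, written by hand
light_sql <- tbl(con, sql("SELECT species_id, weight FROM surveys WHERE weight < 5"))

# Bring both results into R before closing the connection
from_dplyr <- collect(light)
from_sql   <- collect(light_sql)

DBI::dbDisconnect(con)
```

Because step 3 reuses the query from step 1, learners can compare the handwritten SQL with the translation printed in step 2 and check that both return the same rows.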

  • In the Complex database queries challenge: offer a solution to the challenge without the tally() function, which hasn't been introduced in the previous episodes. This could be achieved by substituting
species <- tbl(mammals, "species")
genus_counts <- left_join(surveys, plots) %>%
  left_join(species) %>%
  filter(taxa == "Rodent") %>%
  group_by(plot_type, genus) %>%
  tally() 

with

species <- tbl(mammals, "species")
genus_counts <- left_join(surveys, plots) %>%
  left_join(species) %>%
  filter(taxa == "Rodent") %>%
  count(plot_type, genus) 
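
To illustrate why the two versions should give the same counts, here is a minimal sketch on a local tibble. The `rodents` data below is made up for illustration and is not the lesson's database table:

```r
library(dplyr)

# Hypothetical toy data standing in for the joined and filtered survey table
rodents <- tibble(
  plot_type = c("Control", "Control", "Spectab exclosure", "Control"),
  genus     = c("Dipodomys", "Dipodomys", "Neotoma", "Neotoma")
)

# Version using group_by() followed by tally()
with_tally <- rodents %>%
  group_by(plot_type, genus) %>%
  tally() %>%
  ungroup()  # tally() leaves the result grouped by plot_type

# Version using count(), which groups, counts and ungroups in one step
with_count <- rodents %>%
  count(plot_type, genus)
```

The only difference I am aware of is that the `tally()` version leaves the result grouped by `plot_type` unless `ungroup()` is called, while `count()` on ungrouped input returns an ungrouped result.
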
@tobyhodges
Member

Thanks @Aron-github for opening this issue, and hi 👋 The lesson underwent a major update and reorganisation when #887 was merged. One significant change is that the content on interacting with databases has been removed. As this issue relates to that content, I will close it. Please open a new issue if you have suggestions for how the new content could be improved.
