feat!: splitting code tips into items of a dropdown (easier to navigate)
njlyon0 committed Aug 22, 2024
1 parent 33b4f46 commit db8e0c3
Showing 8 changed files with 95 additions and 63 deletions.
15 changes: 15 additions & 0 deletions _freeze/tip_packages/execute-results/html.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
{
"hash": "3eface853d5b64411045b9234e761a14",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Streamlined R Package Loading\"\n---\n\n\n## Overview \n\nLoading packages / libraries in R can be cumbersome when working collaboratively because there is no guarantee that you all have the same packages installed. While you could comment-out an `install.packages()` line for every package you need for a given script, we recommend using the R package `librarian` to greatly simplify this process!\n\n`librarian::shelf()` accepts the names of all of the packages--either CRAN or GitHub--installs those that are missing in that particular R session and then attaches all of them.\n\n## Traditional Package Loading\n\nTo load packages typically you'd have something like the following in your script:\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Install packages (if needed)\ninstall.packages(\"tidyverse\")\ninstall.packages(\"devtools\")\ndevtools::install_github(\"NCEAS/scicomptools\")\n\n# Load libraries\nlibrary(tidyverse); library(scicomptools)\n```\n:::\n\n\n## Package Loading with `librarian`\n\nWith `librarian::shelf()` however this becomes _much_ cleaner! In addition to being fewer lines, using `librarian` also removes the possibility that someone running your code misses one of the packages that your script depends on and then the script breaks for them later on. `librarian::shelf()` automatically detects whether a package is installed, installs it if necessary, and then attaches the package.\n\nIn essence, `librarian::shelf()` wraps `install.packages()`, `devtools::install_github()`, and `library()` into a single, human-readable function.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Install and load packages!\nlibrarian::shelf(tidyverse, NCEAS/scicomptools)\n```\n:::\n\n\nWhen using `librarian::shelf()`, package names do not need to be quoted and GitHub packages can be installed without the additional steps of installing the `devtools` package and using `devtools::install_github()` instead of `install.packages()`.\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
15 changes: 15 additions & 0 deletions _freeze/tip_paths/execute-results/html.json
@@ -0,0 +1,15 @@
{
"hash": "0df1c095a536f900bbc9ddef437146a2",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Reproducible File Paths\"\n---\n\n\n## Overview\n\nThis section contains our recommendations for handling file paths. When you code collaboratively (e.g., with GitHub), accounting for the difference between your folder structure and those of your colleagues becomes critical. **Ideally your code should be completely agnostic about (1) the operating system of the computer it is running on (i.e., Windows vs. Mac) and (2) the folder structure of the computer**. We can--fortunately--handle these two considerations relatively simply. This may seem somewhat dry but it is worth mentioning that failing to use relative file paths is a significant hindrance to reproducibility (see [Trisovic _et al._ 2022](https://www.nature.com/articles/s41597-022-01143-6)).\n\nYou may also find our [tutorial on storing user-specific information](https://lter.github.io/scicomp/tutorial_json.html) valuable in this context.\n\n## 1. Preserve File Paths as Objects\n\nDepending on the operating system of the computer, the slashes between folder names are different (`\\` versus `/`). The `file.path` function automatically detects the computer operating system and inserts the correct slash. We recommend using this function and assigning your file path to an object.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_path <- file.path(\"path\", \"to\", \"my\", \"file\")\nmy_path\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"path/to/my/file\"\n```\n\n\n:::\n:::\n\n\nOnce you have that path object, you can use it everywhere you import or export information to/from the code (with another use of `file.path` to get the right type of slash!).\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Import\nmy_raw_data <- read.csv(file = file.path(my_path, \"raw_data.csv\"))\n\n# Export\nwrite.csv(x = data_object, file = file.path(my_path, \"tidy_data.csv\"))\n```\n:::\n\n\n## 2. 
Create Necessary Sub-Folders in the Code\n\nUsing `file.path` guarantees that your code will work regardless of the upstream folder structure but what about the folders that you need to export or import things to/from? For example, say your `graphs.R` script saves a couple of useful exploratory graphs to the \"Plots\" folder, how would you guarantee that everyone running `graphs.R` *has* a \"Plots folder\"? You can use the `dir.create` function to create the folder in the code (and include your path object from step 1!).\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Create needed folder\ndir.create(path = file.path(my_path, \"Plots\"), showWarnings = FALSE)\n\n# Then export to that folder\nggplot2::ggsave(filename = file.path(my_path, \"Plots\", \"my_plot.png\"))\n```\n:::\n\n\nThe `showWarnings` argument of `dir.create` simply warns you if the folder you're creating already exists or not. There is no negative to \"creating\" a folder that already exists (nothing is overwritten!!) but the warning can be confusing so we can silence it ahead of time.\n\n## File Paths Summary\n\nWe strongly recommend following these guidelines so that your scripts work regardless of (1) the operating system, (2) folders \"upstream\" of the working directory, and (3) folders within the project. This will help your code by flexible and reproducible when others are attempting to re-run your scripts!\n\n<p align=\"center\">\n<img src=\"images/lter-photos/penguins.jpg\" alt=\"Photo of two penguins swimming in icy water\" width=\"80%\"/>\n</p>\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
12 changes: 10 additions & 2 deletions _quarto.yml
@@ -37,8 +37,16 @@ website:
href: tutorial_json.qmd
- text: "Connect Google Drive + R"
href: tutorial_googledrive-pkg.qmd
- text: "Coding Tips"
href: best_practices.qmd
- text: "Tips"
menu:
- text: "Notebooks vs. Scripts"
href: tip_notebook-vs-script.qmd
- text: "File Names"
href: tip_names.qmd
- text: "File Paths"
href: tip_paths.qmd
- text: "Package Loading"
href: tip_packages.qmd
- text: "Portfolio"
href: portfolio.qmd
right:
42 changes: 0 additions & 42 deletions best_practices.qmd

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,5 +1,13 @@
---
title: "Good Naming Conventions"
---

## Overview

When you first start working on a project with your group members, figuring out what to name your folders/files may not be at the top of your priority list. However, following a good naming convention will allow team members to quickly locate files and figure out what they contain. The organized naming structure will also allow new members of the group to be onboarded more easily!

## Naming Tips

Here is a summary of some naming tips that we recommend. These were taken from the [Reproducibility Best Practices module](https://lter.github.io/ssecr/mod_reproducibility.html#naming-tips) in the LTER's SSECR course. Please feel free to refer to the aforementioned link for more information.

- Names should be informative
@@ -9,4 +17,3 @@ Here is a summary of some naming tips that we recommend. These were taken from t
- Spaces and special characters (e.g., é, ü, etc.) in folder/file names may cause errors when someone with a Windows computer tries to read those file paths. You can replace spaces with delimiters like underscores or hyphens to increase machine readability.
- Follow a consistent naming convention throughout!
- If you and your group members find a naming convention that works, stick with it! Having a consistent naming convention is key to getting new collaborators to follow it.
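
As a rough illustration of the tip about spaces and special characters, here is a minimal sketch of a name-cleaning helper in base R. The function `clean_file_name()` is hypothetical (it does not come from any package), and the replacement rules are only one reasonable assumption; adapt them to whatever convention your group agrees on.

```r
# Hypothetical helper (an assumption for illustration, not from a package):
# make a file name friendlier to other operating systems
clean_file_name <- function(name) {
  name <- tolower(name)
  # Replace each run of characters other than letters, digits, ".", "_", "-" with "_"
  name <- gsub("[^a-z0-9._-]+", "_", name)
  # Drop underscores left directly before a dot (e.g., before the extension)
  name <- gsub("_+\\.", ".", name)
  # Trim leading / trailing underscores
  gsub("^_+|_+$", "", name)
}

clean_file_name("Site Data (final).csv")
#> [1] "site_data_final.csv"
```

Running a helper like this once, before files are shared, is usually easier than repairing broken file paths afterwards.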

Original file line number Diff line number Diff line change
@@ -1,22 +1,33 @@
---
title: "Notebooks versus Scripts"
---

When coding in R, either R scripts (.R files) or R markdowns (.Rmd files) are viable options but they have different advantages and disadvantages that we will cover below.

### R Scripts - Positives
## R Scripts

### Positives

R scripts' greatest strength is their flexibility. They allow you to format a file in whatever way is most intuitive to you. Additionally, R scripts can be cleaner for `for` loops insofar as they need not be concerned with staying within a given code chunk (as would be the case for a .Rmd). Developing a new workflow can be swiftly accomplished in an R script as some or all of the code in a script can be run by simply selecting the desired lines rather than manually running the desired chunks in a .Rmd file. Finally, R scripts can also be a better home for custom functions that can be `source`d by another file (even a .Rmd!) for making repeated operations simpler to read.
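
To make the point about `source`-ing custom functions concrete, here is a minimal sketch. The file name `helpers.R` and the function `celsius_to_kelvin()` are hypothetical, and the helper script is written to a temporary folder only so that the example is self-contained; in a real project it would be a file in your repository.

```r
# Write a tiny (hypothetical) helper script to a temporary folder
helper_file <- file.path(tempdir(), "helpers.R")
writeLines("celsius_to_kelvin <- function(temp_c) temp_c + 273.15", helper_file)

# Any other script (or a chunk in a .Rmd) can now load the function
source(helper_file)

# Use the sourced function
celsius_to_kelvin(25)
#> [1] 298.15
```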

### R Scripts - Potential Weaknesses
### Potential Weaknesses

The benefit of extreme flexibility in R scripts can sometimes be a disadvantage however. We've all seen (and written) R scripts that have few or no comments or where lines of code are densely packed without spacing or blank lines to help someone new to the code understand what is being done. R scripts can certainly be written in a way that is accessible to those without prior knowledge of what the script accomplishes but they do not *enforce* such structure. This can make it easy, especially when we're feeling pressed for time, to exclude structure that helps our code remain reproducible and understandable.

### R Markdowns - Positives
## R Markdowns

### Positives

R markdown files' ability to "knit" as HTML or PDF documents makes them extremely useful in creating outward-facing reports. This is particularly the case when the specific code is less important to communicate than visualizations and/or analyses of the data but .Rmd files do facilitate `echo`ing the code so that report readers can see how background operations were accomplished. The code chunk structure of these files can also nudge users towards including valuable comments (both between chunks and within them) though of course .Rmd files do not enforce such non-code content.

### R Markdowns - Potential Weaknesses
### Potential Weaknesses

R markdowns can fail to knit for reasons unrelated to the analysis itself, even when the code within the chunks works as desired. Duplicate code chunk names or a missing LaTeX installation can be a frustrating hurdle between functioning code and a knit output file. When code must be re-run repeatedly (as is often the case when developing a new workflow), the stop-and-start nature of running each code chunk separately can also be a small irritation.

### Script vs. Markdown Summary
## Script vs. Markdown Summary

Taken together, both R scripts and R markdown files can empower users to write reproducible, transparent code. However, both file types have some key limitations that should be taken into consideration when choosing which to use as you set out to create a new code product.

<p align="center">
<img src="images/lter-photos/patches.jpg" alt="Photo of people walking sampling insects at a farm where a wildflower strip has been planted between two row-cropped areas" width="100%"/>
</p>
25 changes: 18 additions & 7 deletions modules_best-practices/pkg-loading.qmd → tip_packages.qmd
@@ -1,21 +1,32 @@
---
title: "Streamlined R Package Loading"
---

## Overview

Loading packages / libraries in R can be cumbersome when working collaboratively because there is no guarantee that you all have the same packages installed. While you could comment out an `install.packages()` line for every package you need for a given script, we recommend using the R package `librarian` to greatly simplify this process!

`librarian::shelf()` accepts the names of all of the packages--either CRAN or GitHub--installs those that are missing in that particular R session and then attaches all of them. See below for an example:
`librarian::shelf()` accepts the names of all of the packages--either CRAN or GitHub--installs those that are missing in that particular R session and then attaches all of them.

## Traditional Package Loading

To load packages typically you'd have something like the following in your script:

```{r library_og_method, eval = FALSE}
## Install packages (if needed)
# install.packages("tidyverse")
# install.packages("devtools")
# devtools::install_github("NCEAS/scicomptools")
```{r library_og_method}
#| eval: false
# Install packages (if needed)
install.packages("tidyverse")
install.packages("devtools")
devtools::install_github("NCEAS/scicomptools")
# Load libraries
library(tidyverse); library(scicomptools)
```

With `librarian::shelf()` however this becomes *much* cleaner! In addition to being fewer lines, using `librarian` also removes the possibility that someone running your code misses one of the packages that your script depends on and then the script breaks for them later on. `librarian::shelf()` automatically detects whether a package is installed, installs it if necessary, and then attaches the package.
## Package Loading with `librarian`

With `librarian::shelf()` however this becomes _much_ cleaner! In addition to being fewer lines, using `librarian` also removes the possibility that someone running your code misses one of the packages that your script depends on and then the script breaks for them later on. `librarian::shelf()` automatically detects whether a package is installed, installs it if necessary, and then attaches the package.

In essence, `librarian::shelf()` wraps `install.packages()`, `devtools::install_github()`, and `library()` into a single, human-readable function.
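
For intuition, the "install if missing, then attach" pattern can be sketched in base R as below. This is not `librarian`'s actual implementation: `shelf_sketch()` is a hypothetical function, it takes quoted package names, and it only handles CRAN packages (GitHub packages are deliberately left out of the sketch).

```r
# Hypothetical stand-in for the install-then-attach idea (CRAN packages only)
shelf_sketch <- function(pkgs) {
  # requireNamespace() returns FALSE (rather than erroring) when a package is absent
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!is_installed]

  # Install only what is missing
  if (length(missing_pkgs) > 0) {
    install.packages(missing_pkgs)
  }

  # library() needs character.only = TRUE when package names are strings
  for (pkg in pkgs) {
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Both packages ship with R, so nothing is installed here
shelf_sketch(c("stats", "utils"))
```

In practice we would still reach for `librarian::shelf()` itself; the sketch is only meant to show what is being wrapped.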

19 changes: 13 additions & 6 deletions modules_best-practices/file-paths.qmd → tip_paths.qmd
@@ -1,9 +1,14 @@
---
title: "Reproducible File Paths"
---

This section contains our recommendations for handling **file paths**. When you code collaboratively (e.g., with GitHub), accounting for the difference between your folder structure and those of your colleagues becomes critical. Ideally your code should be completely agnostic about (1) the operating system of the computer it is running on (i.e., Windows vs. Mac) and (2) the folder structure of the computer. We can--fortunately--handle these two considerations relatively simply.
## Overview

This may seem somewhat dry but it is worth mentioning that failing to use relative file paths is a significant hindrance to reproducibility (see [Trisovic et al. 2022](https://www.nature.com/articles/s41597-022-01143-6)).
This section contains our recommendations for handling file paths. When you code collaboratively (e.g., with GitHub), accounting for the difference between your folder structure and those of your colleagues becomes critical. **Ideally your code should be completely agnostic about (1) the operating system of the computer it is running on (i.e., Windows vs. Mac) and (2) the folder structure of the computer**. We can--fortunately--handle these two considerations relatively simply. This may seem somewhat dry but it is worth mentioning that failing to use relative file paths is a significant hindrance to reproducibility (see [Trisovic _et al._ 2022](https://www.nature.com/articles/s41597-022-01143-6)).

### 1. Preserve File Paths as Objects Using `file.path`
You may also find our [tutorial on storing user-specific information](https://lter.github.io/scicomp/tutorial_json.html) valuable in this context.

## 1. Preserve File Paths as Objects

Operating systems write the slashes between folder names differently (`\` on Windows versus `/` on Mac and Linux). The `file.path` function joins folder names with R's own file separator (`/`, which R also understands on Windows), so you never have to type the slashes yourself. We recommend using this function and assigning your file path to an object.
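
A small sketch of how path pieces built with `file.path` can be combined and taken apart again is below. The folder and file names (`data`, `raw`, `sites.csv`) are placeholders chosen for illustration, not files in this repository.

```r
# Build a folder path once, then reuse it (names are hypothetical)
data_dir <- file.path("data", "raw")
csv_path <- file.path(data_dir, "sites.csv")
csv_path
#> [1] "data/raw/sites.csv"

# The separator file.path() uses by default
.Platform$file.sep
#> [1] "/"

# Split the path back into its file and folder parts
basename(csv_path)
#> [1] "sites.csv"
dirname(csv_path)
#> [1] "data/raw"
```

Because every path is assembled from `data_dir`, moving the data later means editing a single line.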

@@ -22,7 +27,7 @@ my_raw_data <- read.csv(file = file.path(my_path, "raw_data.csv"))
write.csv(x = data_object, file = file.path(my_path, "tidy_data.csv"))
```

### 2. Create Necessary Sub-Folders in the Code with `dir.create`
## 2. Create Necessary Sub-Folders in the Code

Using `file.path` guarantees that your code will work regardless of the upstream folder structure but what about the folders that you need to export or import things to/from? For example, say your `graphs.R` script saves a couple of useful exploratory graphs to the "Plots" folder. How would you guarantee that everyone running `graphs.R` *has* a "Plots" folder? You can use the `dir.create` function to create the folder in the code (and include your path object from step 1!).

@@ -36,8 +41,10 @@ ggplot2::ggsave(filename = file.path(my_path, "Plots", "my_plot.png"))

The `showWarnings` argument of `dir.create` controls whether R warns you when the folder you're creating already exists. There is no harm in "creating" a folder that already exists (nothing is overwritten!!) but the warning can be confusing, so we silence it ahead of time by setting `showWarnings = FALSE`.
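
If you want to convince yourself that re-creating a folder is harmless, the sketch below does it in a temporary folder. The folder and file names (`demo_project`, `exploratory`, `keep_me.txt`) are hypothetical; `recursive = TRUE` is an extra argument, not used in the example above, that also creates any missing parent folders.

```r
# Work in a temporary location so nothing in your project is touched
project_dir <- file.path(tempdir(), "demo_project")
plots_dir <- file.path(project_dir, "Plots", "exploratory")

# recursive = TRUE creates "demo_project" and "Plots" along the way
dir.create(path = plots_dir, recursive = TRUE, showWarnings = FALSE)

# Put a placeholder file in the new folder
writeLines("placeholder", file.path(plots_dir, "keep_me.txt"))

# "Creating" the folder again changes nothing inside it
dir.create(path = plots_dir, recursive = TRUE, showWarnings = FALSE)
file.exists(file.path(plots_dir, "keep_me.txt"))
#> [1] TRUE
```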

### File Paths Summary
## File Paths Summary

We strongly recommend following these guidelines so that your scripts work regardless of (1) the operating system, (2) folders "upstream" of the working directory, and (3) folders within the project. This will help your code be flexible and reproducible when others are attempting to re-run your scripts!

Also, for more information on how to read files in cloud storage locations such as Google Drive, Box, Dropbox, etc., please refer to our [Other Tutorials](https://nceas.github.io/scicomp.github.io/tutorials.html).
<p align="center">
<img src="images/lter-photos/penguins.jpg" alt="Photo of two penguins swimming in icy water" width="80%"/>
</p>
