From db8e0c3ce71bf00b0563ea97f357bc5ced1f831f Mon Sep 17 00:00:00 2001 From: njlyon0 Date: Thu, 22 Aug 2024 17:07:23 -0400 Subject: [PATCH] feat!: splitting code tips into items of a dropdown (easier to navigate) --- .../tip_packages/execute-results/html.json | 15 +++++++ _freeze/tip_paths/execute-results/html.json | 15 +++++++ _quarto.yml | 12 +++++- best_practices.qmd | 42 ------------------- .../naming-conventions.qmd => tip_names.qmd | 9 +++- ...s-script.qmd => tip_notebook-vs-script.qmd | 21 +++++++--- .../pkg-loading.qmd => tip_packages.qmd | 25 +++++++---- .../file-paths.qmd => tip_paths.qmd | 19 ++++++--- 8 files changed, 95 insertions(+), 63 deletions(-) create mode 100644 _freeze/tip_packages/execute-results/html.json create mode 100644 _freeze/tip_paths/execute-results/html.json delete mode 100644 best_practices.qmd rename modules_best-practices/naming-conventions.qmd => tip_names.qmd (95%) rename modules_best-practices/markdown-vs-script.qmd => tip_notebook-vs-script.qmd (87%) rename modules_best-practices/pkg-loading.qmd => tip_packages.qmd (77%) rename modules_best-practices/file-paths.qmd => tip_paths.qmd (64%) diff --git a/_freeze/tip_packages/execute-results/html.json b/_freeze/tip_packages/execute-results/html.json new file mode 100644 index 0000000..e038792 --- /dev/null +++ b/_freeze/tip_packages/execute-results/html.json @@ -0,0 +1,15 @@ +{ + "hash": "3eface853d5b64411045b9234e761a14", + "result": { + "engine": "knitr", + "markdown": "---\ntitle: \"Streamlined R Package Loading\"\n---\n\n\n## Overview \n\nLoading packages / libraries in R can be cumbersome when working collaboratively because there is no guarantee that you all have the same packages installed. 
While you could comment-out an `install.packages()` line for every package you need for a given script, we recommend using the R package `librarian` to greatly simplify this process!\n\n`librarian::shelf()` accepts the names of all of the packages--either CRAN or GitHub--installs those that are missing in that particular R session and then attaches all of them.\n\n## Traditional Package Loading\n\nTo load packages typically you'd have something like the following in your script:\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Install packages (if needed)\ninstall.packages(\"tidyverse\")\ninstall.packages(\"devtools\")\ndevtools::install_github(\"NCEAS/scicomptools\")\n\n# Load libraries\nlibrary(tidyverse); library(scicomptools)\n```\n:::\n\n\n## Package Loading with `librarian`\n\nWith `librarian::shelf()` however this becomes _much_ cleaner! In addition to being fewer lines, using `librarian` also removes the possibility that someone running your code misses one of the packages that your script depends on and then the script breaks for them later on. 
`librarian::shelf()` automatically detects whether a package is installed, installs it if necessary, and then attaches the package.\n\nIn essence, `librarian::shelf()` wraps `install.packages()`, `devtools::install_github()`, and `library()` into a single, human-readable function.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Install and load packages!\nlibrarian::shelf(tidyverse, NCEAS/scicomptools)\n```\n:::\n\n\nWhen using `librarian::shelf()`, package names do not need to be quoted and GitHub packages can be installed without the additional steps of installing the `devtools` package and using `devtools::install_github()` instead of `install.packages()`.\n", + "supporting": [], + "filters": [ + "rmarkdown/pagebreak.lua" + ], + "includes": {}, + "engineDependencies": {}, + "preserve": {}, + "postProcess": true + } +} \ No newline at end of file diff --git a/_freeze/tip_paths/execute-results/html.json b/_freeze/tip_paths/execute-results/html.json new file mode 100644 index 0000000..d08523a --- /dev/null +++ b/_freeze/tip_paths/execute-results/html.json @@ -0,0 +1,15 @@ +{ + "hash": "0df1c095a536f900bbc9ddef437146a2", + "result": { + "engine": "knitr", + "markdown": "---\ntitle: \"Reproducible File Paths\"\n---\n\n\n## Overview\n\nThis section contains our recommendations for handling file paths. When you code collaboratively (e.g., with GitHub), accounting for the difference between your folder structure and those of your colleagues becomes critical. **Ideally your code should be completely agnostic about (1) the operating system of the computer it is running on (i.e., Windows vs. Mac) and (2) the folder structure of the computer**. We can--fortunately--handle these two considerations relatively simply. 
This may seem somewhat dry but it is worth mentioning that failing to use relative file paths is a significant hindrance to reproducibility (see [Trisovic _et al._ 2022](https://www.nature.com/articles/s41597-022-01143-6)).\n\nYou may also find our [tutorial on storing user-specific information](https://lter.github.io/scicomp/tutorial_json.html) valuable in this context.\n\n## 1. Preserve File Paths as Objects\n\nDepending on the operating system of the computer, the slashes between folder names are different (`\\` versus `/`). The `file.path` function automatically detects the computer operating system and inserts the correct slash. We recommend using this function and assigning your file path to an object.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_path <- file.path(\"path\", \"to\", \"my\", \"file\")\nmy_path\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"path/to/my/file\"\n```\n\n\n:::\n:::\n\n\nOnce you have that path object, you can use it everywhere you import or export information to/from the code (with another use of `file.path` to get the right type of slash!).\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Import\nmy_raw_data <- read.csv(file = file.path(my_path, \"raw_data.csv\"))\n\n# Export\nwrite.csv(x = data_object, file = file.path(my_path, \"tidy_data.csv\"))\n```\n:::\n\n\n## 2. Create Necessary Sub-Folders in the Code\n\nUsing `file.path` guarantees that your code will work regardless of the upstream folder structure but what about the folders that you need to export or import things to/from? For example, say your `graphs.R` script saves a couple of useful exploratory graphs to the \"Plots\" folder, how would you guarantee that everyone running `graphs.R` *has* a \"Plots\" folder? 
You can use the `dir.create` function to create the folder in the code (and include your path object from step 1!).\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Create needed folder\ndir.create(path = file.path(my_path, \"Plots\"), showWarnings = FALSE)\n\n# Then export to that folder\nggplot2::ggsave(filename = file.path(my_path, \"Plots\", \"my_plot.png\"))\n```\n:::\n\n\nThe `showWarnings` argument of `dir.create` controls whether R warns you when folder creation fails, for example because the folder already exists. There is no negative to \"creating\" a folder that already exists (nothing is overwritten!!) but the warning can be confusing so we can silence it ahead of time.\n\n## File Paths Summary\n\nWe strongly recommend following these guidelines so that your scripts work regardless of (1) the operating system, (2) folders \"upstream\" of the working directory, and (3) folders within the project. This will help your code be flexible and reproducible when others are attempting to re-run your scripts!\n\n

\nPhoto of two penguins swimming in icy water\n

\n", + "supporting": [], + "filters": [ + "rmarkdown/pagebreak.lua" + ], + "includes": {}, + "engineDependencies": {}, + "preserve": {}, + "postProcess": true + } +} \ No newline at end of file diff --git a/_quarto.yml b/_quarto.yml index ca979b3..554db59 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -37,8 +37,16 @@ website: href: tutorial_json.qmd - text: "Connect Google Drive + R" href: tutorial_googledrive-pkg.qmd - - text: "Coding Tips" - href: best_practices.qmd + - text: "Tips" + menu: + - text: "Notebooks vs. Scripts" + href: tip_notebook-vs-script.qmd + - text: "File Names" + href: tip_names.qmd + - text: "File Paths" + href: tip_paths.qmd + - text: "Package Loading" + href: tip_packages.qmd - text: "Portfolio" href: portfolio.qmd right: diff --git a/best_practices.qmd b/best_practices.qmd deleted file mode 100644 index d3a3e18..0000000 --- a/best_practices.qmd +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: "Coding Tips" ---- - -```{r libraries} -#| include: false -#| message: false -#| echo: false - -library(librarian) -librarian::shelf(tidyverse, cran_repo = 'https://cran.r-project.org') -``` - -### Welcome! - -This page contains the collected best practice tips of our team. More will be added over time and feel free to post [an issue](https://github.com/lter/scicomp/issues) if you have a specific request for a section to add to this document. Please feel free to reach out to [our team](https://lter.github.io/scicomp/staff.html) if you have any questions about this best practices manual and/or need help implementing some of this content. - -Check the headings below or in the table of contents on the right of this page to see which tips and tricks we have included so far and we hope this page is a useful resource to you and your team! - -## R Scripts versus R Markdowns - -{{< include /modules_best-practices/markdown-vs-script.qmd >}} - -

-Photo of people walking sampling insects at a farm where a wildflower strip has been planted between two row-cropped areas -

- -## File Paths - -{{< include /modules_best-practices/file-paths.qmd >}} - -

-Photo of two penguins swimming in icy water -

- -## Good Naming Conventions - -{{< include /modules_best-practices/naming-conventions.qmd >}} - -## Package Loading - -{{< include /modules_best-practices/pkg-loading.qmd >}} diff --git a/modules_best-practices/naming-conventions.qmd b/tip_names.qmd similarity index 95% rename from modules_best-practices/naming-conventions.qmd rename to tip_names.qmd index 359944e..0a9674a 100644 --- a/modules_best-practices/naming-conventions.qmd +++ b/tip_names.qmd @@ -1,5 +1,13 @@ +--- +title: "Good Naming Conventions" +--- + +## Overview + When you first start working on a project with your group members, figuring out what to name your folders/files may not be at the top of your priority list. However, following a good naming convention will allow team members to quickly locate files and figure out what they contain. The organized naming structure will also allow new members of the group to be onboarded more easily! +## Naming Tips + Here is a summary of some naming tips that we recommend. These were taken from the [Reproducibility Best Practices module](https://lter.github.io/ssecr/mod_reproducibility.html#naming-tips) in the LTER's SSECR course. Please feel free to refer to the aforementioned link for more information. - Names should be informative @@ -9,4 +17,3 @@ Here is a summary of some naming tips that we recommend. These were taken from t - Spaces and special characters (e.g., é, ü, etc.) in folder/file names may cause errors when someone with a Windows computer tries to read those file paths. You can replace spaces with delimiters like underscores or hyphens to increase machine readability. - Follow a consistent naming convention throughout! - If you and your group members find a naming convention that works, stick with it! Having a consistent naming convention is key to getting new collaborators to follow it. 
- \ No newline at end of file diff --git a/modules_best-practices/markdown-vs-script.qmd b/tip_notebook-vs-script.qmd similarity index 87% rename from modules_best-practices/markdown-vs-script.qmd rename to tip_notebook-vs-script.qmd index 656fdf2..ede28b4 100644 --- a/modules_best-practices/markdown-vs-script.qmd +++ b/tip_notebook-vs-script.qmd @@ -1,22 +1,33 @@ +--- +title: "Notebooks versus Scripts" +--- When coding in R, either R scripts (.R files) or R markdowns (.Rmd files) are viable options but they have different advantages and disadvantages that we will cover below. -### R Scripts - Positives +## R Scripts + +### Positives R scripts' greatest strength is their flexibility. They allow you to format a file in whatever way is most intuitive to you. Additionally, R scripts can be cleaner for `for` loops insofar as they need not be concerned with staying within a given code chunk (as would be the case for a .Rmd). Developing a new workflow can be swiftly accomplished in an R script as some or all of the code in a script can be run by simply selecting the desired lines rather than manually running the desired chunks in a .Rmd file. Finally, R scripts can also be a better home for custom functions that can be `source`d by another file (even a .Rmd!) for making repeated operations simpler to read. -### R Scripts - Potential Weaknesses +### Potential Weaknesses The benefit of extreme flexibility in R scripts can sometimes be a disadvantage however. We've all seen (and written) R scripts that have few or no comments or where lines of code are densely packed without spacing or blank lines to help someone new to the code understand what is being done. R scripts can certainly be written in a way that is accessible to those without prior knowledge of what the script accomplishes but they do not *enforce* such structure. 
This can make it easy, especially when we're feeling pressed for time, to exclude structure that helps our code remain reproducible and understandable. -### R Markdowns - Positives +## R Markdowns + +### Positives R markdown files' ability to "knit" as HTML or PDF documents makes them extremely useful in creating outward-facing reports. This is particularly the case when the specific code is less important to communicate than visualizations and/or analyses of the data but .Rmd files do facilitate `echo`ing the code so that report readers can see how background operations were accomplished. The code chunk structure of these files can also nudge users towards including valuable comments (both between chunks and within them) though of course .Rmd files do not enforce such non-code content. -### R Markdowns - Potential Weaknesses +### Potential Weaknesses R markdowns can fail to knit due to issues even when the code within the chunks works as desired. Duplicate code chunk names or a failure to install LaTeX can be a frustrating hurdle to overcome between functioning code and a knit output file. When code must be re-run repeatedly (as is often the case when developing a new workflow) the stop-and-start nature of running each code chunk separately can also be a small irritation. -### Script vs. Markdown Summary +## Script vs. Markdown Summary Taken together, both R scripts and R markdown files can empower users to write reproducible, transparent code. However, both file types have some key limitations that should be taken into consideration when choosing which to use as you set out to create a new code product. + +

+Photo of people walking sampling insects at a farm where a wildflower strip has been planted between two row-cropped areas +

diff --git a/modules_best-practices/pkg-loading.qmd b/tip_packages.qmd similarity index 77% rename from modules_best-practices/pkg-loading.qmd rename to tip_packages.qmd index 3fc7658..097a087 100644 --- a/modules_best-practices/pkg-loading.qmd +++ b/tip_packages.qmd @@ -1,21 +1,32 @@ +--- +title: "Streamlined R Package Loading" +--- + +## Overview Loading packages / libraries in R can be cumbersome when working collaboratively because there is no guarantee that you all have the same packages installed. While you could comment-out an `install.packages()` line for every package you need for a given script, we recommend using the R package `librarian` to greatly simplify this process! -`librarian::shelf()` accepts the names of all of the packages--either CRAN or GitHub--installs those that are missing in that particular R session and then attaches all of them. See below for an example: +`librarian::shelf()` accepts the names of all of the packages--either CRAN or GitHub--installs those that are missing in that particular R session and then attaches all of them. + +## Traditional Package Loading To load packages typically you'd have something like the following in your script: -```{r library_og_method, eval = FALSE} -## Install packages (if needed) -# install.packages("tidyverse") -# install.packages("devtools") -# devtools::install_github("NCEAS/scicomptools") +```{r library_og_method} +#| eval: false + +# Install packages (if needed) +install.packages("tidyverse") +install.packages("devtools") +devtools::install_github("NCEAS/scicomptools") # Load libraries library(tidyverse); library(scicomptools) ``` -With `librarian::shelf()` however this becomes *much* cleaner! In addition to being fewer lines, using `librarian` also removes the possibility that someone running your code misses one of the packages that your script depends on and then the script breaks for them later on. 
`librarian::shelf()` automatically detects whether a package is installed, installs it if necessary, and then attaches the package. +## Package Loading with `librarian` + +With `librarian::shelf()` however this becomes _much_ cleaner! In addition to being fewer lines, using `librarian` also removes the possibility that someone running your code misses one of the packages that your script depends on and then the script breaks for them later on. `librarian::shelf()` automatically detects whether a package is installed, installs it if necessary, and then attaches the package. In essence, `librarian::shelf()` wraps `install.packages()`, `devtools::install_github()`, and `library()` into a single, human-readable function. diff --git a/modules_best-practices/file-paths.qmd b/tip_paths.qmd similarity index 64% rename from modules_best-practices/file-paths.qmd rename to tip_paths.qmd index 1eaf4dd..5f2fb42 100644 --- a/modules_best-practices/file-paths.qmd +++ b/tip_paths.qmd @@ -1,9 +1,14 @@ +--- +title: "Reproducible File Paths" +--- -This section contains our recommendations for handling **file paths**. When you code collaboratively (e.g., with GitHub), accounting for the difference between your folder structure and those of your colleagues becomes critical. Ideally your code should be completely agnostic about (1) the operating system of the computer it is running on (i.e., Windows vs. Mac) and (2) the folder structure of the computer. We can--fortunately--handle these two considerations relatively simply. +## Overview -This may seem somewhat dry but it is worth mentioning that failing to use relative file paths is a significant hindrance to reproducibility (see [Trisovic et al. 2022](https://www.nature.com/articles/s41597-022-01143-6)). +This section contains our recommendations for handling file paths. When you code collaboratively (e.g., with GitHub), accounting for the difference between your folder structure and those of your colleagues becomes critical. 
**Ideally your code should be completely agnostic about (1) the operating system of the computer it is running on (i.e., Windows vs. Mac) and (2) the folder structure of the computer**. We can--fortunately--handle these two considerations relatively simply. This may seem somewhat dry but it is worth mentioning that failing to use relative file paths is a significant hindrance to reproducibility (see [Trisovic _et al._ 2022](https://www.nature.com/articles/s41597-022-01143-6)). -### 1. Preserve File Paths as Objects Using `file.path` +You may also find our [tutorial on storing user-specific information](https://lter.github.io/scicomp/tutorial_json.html) valuable in this context. + +## 1. Preserve File Paths as Objects Depending on the operating system of the computer, the slashes between folder names are different (`\` versus `/`). The `file.path` function automatically detects the computer operating system and inserts the correct slash. We recommend using this function and assigning your file path to an object. @@ -22,7 +27,7 @@ my_raw_data <- read.csv(file = file.path(my_path, "raw_data.csv")) write.csv(x = data_object, file = file.path(my_path, "tidy_data.csv")) ``` -### 2. Create Necessary Sub-Folders in the Code with `dir.create` +## 2. Create Necessary Sub-Folders in the Code Using `file.path` guarantees that your code will work regardless of the upstream folder structure but what about the folders that you need to export or import things to/from? For example, say your `graphs.R` script saves a couple of useful exploratory graphs to the "Plots" folder, how would you guarantee that everyone running `graphs.R` *has* a "Plots" folder? You can use the `dir.create` function to create the folder in the code (and include your path object from step 1!). @@ -36,8 +41,10 @@ ggplot2::ggsave(filename = file.path(my_path, "Plots", "my_plot.png")) The `showWarnings` argument of `dir.create` controls whether R warns you when folder creation fails, for example because the folder already exists. 
There is no negative to "creating" a folder that already exists (nothing is overwritten!!) but the warning can be confusing so we can silence it ahead of time. -### File Paths Summary +## File Paths Summary We strongly recommend following these guidelines so that your scripts work regardless of (1) the operating system, (2) folders "upstream" of the working directory, and (3) folders within the project. This will help your code be flexible and reproducible when others are attempting to re-run your scripts! -Also, for more information on how to read files in cloud storage locations such as Google Drive, Box, Dropbox, etc., please refer to our [Other Tutorials](https://nceas.github.io/scicomp.github.io/tutorials.html). \ No newline at end of file +

+Photo of two penguins swimming in icy water +