diff --git a/NAMESPACE b/NAMESPACE index b2020df..366c797 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,6 +1,7 @@ # Generated by roxygen2: do not edit by hand export(catch) +export(create_object_list) export(loupe) export(release) importFrom(lifecycle,deprecated) diff --git a/R/catch.R b/R/catch.R index 5afe165..24206f8 100644 --- a/R/catch.R +++ b/R/catch.R @@ -5,6 +5,8 @@ #' which contains only rows that have changed compared to previous data. It will #' not return any new rows. #' +#' The underlying functionality is handled by `create_object_list()`. +#' #' @param df_current data.frame, the newest/current version of dataset x. #' @param df_previous data.frame, the old version of dataset, for example x - t1. #' @param datetime_variable character, which variable to use as unique ID to join `df_current` and `df_previous`. Usually a "datetime" variable. @@ -13,6 +15,7 @@ #' also returns a waldo object as in `loupe()`. #' #' @seealso [loupe()] +#' @seealso [create_object_list()] #' #' @examples #' df_caught <- butterfly::catch( @@ -25,92 +28,29 @@ #' #' @export catch <- function(df_current, df_previous, datetime_variable) { - # Check input is as expected - stopifnot("`df_current` must be a data.frame" = is.data.frame(df_current)) - stopifnot("`df_previous` must be a data.frame" = is.data.frame(df_previous)) - - # Check if `datetime_variable` is in both `df_current` and `df_previous` - if (!datetime_variable %in% names(df_current) || !datetime_variable %in% names(df_previous)) { - stop( - "`datetime_variable` must be present in both `df_current` and `df_previous`" - ) - } - - # Using semi_join to extract rows with matching datetime_variables - # (ie previously generated data) - df_current_without_new_row <- dplyr::semi_join( + butterfly_object_list <- create_object_list( df_current, df_previous, - by = datetime_variable + datetime_variable ) - # Compare the current data with the previous data, without "new" values - waldo_object <- waldo::compare( - 
df_current_without_new_row, - df_previous - ) - - # Obtaining the new rows to provide in feedback - df_current_new_rows <- dplyr::anti_join( - df_current, - df_previous, - by = datetime_variable - ) - - if (nrow(df_current_new_rows) == 0) { - warning( - "There are no new rows. Check '", - deparse(substitute(df_current)), - "' is your most recent data, and '", - deparse(substitute(df_previous)), - "' is your previous data." - ) - } else { - # Tell the user which rows are new, regardless of previous data changing - cli::cat_line( - paste0( - "The following rows are new in '", - deparse(substitute(df_current)), - "': " - ), - col = "green" - ) - - cli::cat_print( - df_current_new_rows + # By using an anti join, we keep only rows which do not match in + # df_previous, i.e. rows whose values have changed. + df_rows_changed_from_previous <- suppressMessages( + dplyr::anti_join( + butterfly_object_list$df_current_without_new_row, + df_previous ) - } - - # Return a simple message if there are no changes in previous data - if (length(waldo_object) == 0) { - stop( - "There are no differences between current and previous data." - ) - } else { - # Return detailed breakdown and warning if previous data have changed. - if (length(waldo_object) > 0) { - cli::cat_line() + ) - cli::cat_bullet( - "The following rows have changed from the previous data, and will be returned:", - bullet = "info", - col = "orange", - bullet_col = "orange" - ) + cli::cat_line() - cli::cat_print( - waldo_object - ) + cli::cat_bullet( + "Only these rows are returned.", + bullet = "info", + col = "orange", + bullet_col = "orange" + ) - # By using an inner join, we drop any row which does not match in - # df_previous. 
- df_rows_changed_from_previous <- suppressMessages( - dplyr::anti_join( - df_current_without_new_row, - df_previous - ) - ) - } - } return(df_rows_changed_from_previous) } diff --git a/R/create_object_list.R b/R/create_object_list.R new file mode 100644 index 0000000..b14295d --- /dev/null +++ b/R/create_object_list.R @@ -0,0 +1,139 @@ +#' create_object_list: creates a list of objects used in all butterfly functions +#' +#' This function creates a list of objects which is used by all of `loupe()`, +#' `catch()` and `release()`. +#' +#' This function matches two dataframe objects by their unique identifier +#' (usually "time" or "datetime" in a timeseries). +#' +#' It informs the user of new (unmatched) rows which have appeared, and then +#' returns a `waldo::compare()` call to give a detailed breakdown of changes. +#' +#' The main assumption is that `df_current` and `df_previous` are newer and +#' older versions of the same data, and that the `datetime_variable` variable name always +#' remains the same. Elsewhere new columns can appear, and these will be +#' returned in the report. +#' +#' @param df_current data.frame, the newest/current version of dataset x. +#' @param df_previous data.frame, the old version of dataset, for example x - t1. +#' @param datetime_variable string, which variable to use as unique ID to join +#' `df_current` and `df_previous`. Usually a "datetime" variable. 
+#' +#' @returns A list containing boolean where TRUE indicates no changes to +#' previous data and FALSE indicates unexpected changes, a dataframe of +#' the current data without new rows and a dataframe of new rows only +#' +#' @examples +#' butterfly_object_list <- butterfly::create_object_list( +#' butterflycount$february, +#' butterflycount$january, +#' datetime_variable = "time" +#' ) +#' +#' butterfly_object_list +#' +#' @export +create_object_list <- function(df_current, df_previous, datetime_variable) { + # Check input is as expected + stopifnot("`df_current` must be a data.frame" = is.data.frame(df_current)) + stopifnot("`df_previous` must be a data.frame" = is.data.frame(df_previous)) + + # Check if `datetime_variable` is in both `df_current` and `df_previous` + if (!datetime_variable %in% names(df_current) || !datetime_variable %in% names(df_previous)) { + stop( + "`datetime_variable` must be present in both `df_current` and `df_previous`" + ) + } + + # Initialise list to store objects used by `loupe()`, `catch()` and `release()` + list_butterfly <- list( + "waldo_object" = character(), + "df_current_without_new_row" = data.frame(), + "df_current_new_rows" = data.frame() + ) + + # Using semi_join to extract rows with matching datetime_variables + # (ie previously generated data) + df_current_without_new_row <- dplyr::semi_join( + df_current, + df_previous, + by = datetime_variable + ) + + # Obtaining the new rows to provide in feedback + df_current_new_rows <- dplyr::anti_join( + df_current, + df_previous, + by = datetime_variable + ) + + # Compare the current data with the previous data, without "new" values + waldo_object <- waldo::compare( + df_current_without_new_row, + df_previous + ) + + # Creating a feedback message depending on the waldo object's output + # First checking if there are new rows at all: + if (nrow(df_current_new_rows) == 0) { + stop( + "There are no new rows. 
Check '", + deparse(substitute(df_current)), + "' is your most recent data, and '", + deparse(substitute(df_previous)), + "' is your previous data. If comparing like for like, try waldo::compare()." + ) + } else { + # Tell the user which rows are new, regardless of previous data changing + cli::cat_line( + "The following rows are new in '", + deparse(substitute(df_current)), + "': ", + col = "green" + ) + cli::cat_print( + df_current_new_rows + ) + } + + # Return a simple message if there are no changes in previous data + if (length(waldo_object) == 0) { + cli::cat_bullet( + "And there are no differences with previous data.", + bullet = "tick", + col = "green", + bullet_col = "green" + ) + + butterfly_status <- TRUE + + } else { + # Return detailed breakdown and warning if previous data have changed. + if (length(waldo_object) > 0) { + cli::cat_line() + + cli::cat_bullet( + "The following values have changes from the previous data.", + bullet = "info", + col = "orange", + bullet_col = "orange" + ) + + cli::cat_print( + waldo_object + ) + + butterfly_status <- FALSE + + } + } + + # Populate list with objects + list_butterfly <- list( + butterfly_status = butterfly_status, + df_current_without_new_row = df_current_without_new_row, + df_current_new_rows = df_current_new_rows + ) + + return(list_butterfly) +} diff --git a/R/loupe.R b/R/loupe.R index d0ea25a..5212893 100644 --- a/R/loupe.R +++ b/R/loupe.R @@ -18,11 +18,15 @@ #' remains the same. Elsewhere new columns can of appear, and these will be #' returned in the report. #' +#' The underlying functionality is handled by `create_object_list()`. +#' #' @param df_current data.frame, the newest/current version of dataset x. #' @param df_previous data.frame, the old version of dataset, for example x - t1. #' @param datetime_variable string, which variable to use as unique ID to join `df_current` and `df_previous`. Usually a "datetime" variable. 
#' -#' @returns A waldo object containing a message on differences or 'And there are no differences with previous data'. +#' @returns A boolean where TRUE indicates no changes to previous data and FALSE indicates unexpected changes. +#' +#' @seealso [create_object_list()] #' #' @examples #' # This example contains no differences with previous data @@ -41,81 +45,12 @@ #' #' @export loupe <- function(df_current, df_previous, datetime_variable) { - # Check input is as expected - stopifnot("`df_current` must be a data.frame" = is.data.frame(df_current)) - stopifnot("`df_previous` must be a data.frame" = is.data.frame(df_previous)) - - # Check if `datetime_variable` is in both `df_current` and `df_previous` - if (!datetime_variable %in% names(df_current) || !datetime_variable %in% names(df_previous)) { - stop( - "`datetime_variable` must be present in both `df_current` and `df_previous`" - ) - } - - # Using semi_join to extract rows with matching datetime_variables - # (ie previously generated data) - df_current_without_new_row <- dplyr::semi_join( + butterfly_object_list <- create_object_list( df_current, df_previous, - by = datetime_variable - ) - - # Compare the current data with the previous data, without "new" values - waldo_object <- waldo::compare( - df_current_without_new_row, - df_previous + datetime_variable ) - # Obtaining the new rows to provide in feedback - df_current_new_rows <- dplyr::anti_join( - df_current, - df_previous, - by = datetime_variable - ) - - # Creating a feedback message depending on the waldo object's output - # First checking if there are new rows at all: - if (nrow(df_current_new_rows) == 0) { - stop( - "There are no new rows. Check '", - deparse(substitute(df_current)), - "' is your most recent data, and '", - deparse(substitute(df_previous)), - "' is your previous data. If comparing like for like, try waldo::compare()." 
- ) - } else { - # Tell the user which rows are new, regardless of previous data changing - cli::cat_line( - "The following rows are new in '", - deparse(substitute(df_current)), - "': ", - col = "green" - ) - cli::cat_print( - df_current_new_rows - ) - } - - # Return a simple message if there are no changes in previous data - if (length(waldo_object) == 0) { - cli::cat_bullet( - "And there are no differences with previous data.", - bullet = "tick", - col = "green", - bullet_col = "green" - ) - } else { - # Return detailed breakdown and warning if previous data have changed. - if (length(waldo_object) > 0) { - cli::cat_line() + return(butterfly_object_list$butterfly_status) - cli::cat_bullet( - "But the following values have changes from the previous data:", - bullet = "info", - col = "orange", - bullet_col = "orange" - ) - return(waldo_object) - } - } } diff --git a/R/release.R b/R/release.R index 356cba9..e65ba48 100644 --- a/R/release.R +++ b/R/release.R @@ -8,11 +8,13 @@ #' @param df_current data.frame, the newest/current version of dataset x. #' @param df_previous data.frame, the old version of dataset, for example x - t1. #' @param datetime_variable string, which variable to use as unique ID to join `df_current` and `df_previous`. Usually a "datetime" variable. +#' @param include_new boolean, should new rows be included? Default is TRUE. #' #' @returns A dataframe which contains only rows of `df_current` that have not changed from `df_previous`, and includes new rows. #' also returns a waldo object as in `loupe()`. 
#' #' @seealso [loupe()] +#' @seealso [create_object_list()] #' #' @examples #' df_released <- butterfly::release( @@ -24,102 +26,52 @@ #' df_released #' #' @export -release <- function(df_current, df_previous, datetime_variable) { - # Check input is as expected - stopifnot("`df_current` must be a data.frame" = is.data.frame(df_current)) - stopifnot("`df_previous` must be a data.frame" = is.data.frame(df_previous)) - - # Check if `datetime_variable` is in both `df_current` and `df_previous` - if (!datetime_variable %in% names(df_current) || !datetime_variable %in% names(df_previous)) { - stop( - "`datetime_variable` must be present in both `df_current` and `df_previous`" - ) - } - - # Using semi_join to extract rows with matching datetime_variables - # (ie previously generated data) - df_current_without_new_row <- dplyr::semi_join( +release <- function(df_current, df_previous, datetime_variable, include_new = TRUE) { + butterfly_object_list <- create_object_list( df_current, df_previous, - by = datetime_variable + datetime_variable ) - # Compare the current data with the previous data, without "new" values - waldo_object <- waldo::compare( - df_current_without_new_row, - df_previous - ) - - # Obtaining the new rows to provide in feedback - df_current_new_rows <- dplyr::anti_join( - df_current, - df_previous, - by = datetime_variable + # By using an inner join, we drop any row which does not match in + # df_previous. + df_current_without_changed_rows <- suppressMessages( + dplyr::inner_join( + butterfly_object_list$df_current_without_new_row, + df_previous + ) ) - if (nrow(df_current_new_rows) == 0) { - warning( - "There are no new rows. Check '", - deparse(substitute(df_current)), - "' is your most recent data, and '", - deparse(substitute(df_previous)), - "' is your previous data." 
- ) - } else { - # Tell the user which rows are new, regardless of previous data changing - cli::cat_line( - paste0( - "The following rows are new in '", - deparse(substitute(df_current)), - "': " - ), - col = "green" + # Returning the dataframe with or without new rows added + if (include_new == TRUE) { + # Then we add the new rows back in and return the dataframe as such + df_release <- dplyr::bind_rows( + butterfly_object_list$df_current_new_rows, + df_current_without_changed_rows ) - cli::cat_print( - df_current_new_rows - ) - } - # Return a simple message if there are no changes in previous data - if (length(waldo_object) == 0) { - warning( - "There are no differences between current and previous data. Returning object identical to: ", - deparse(substitute(df_current)) - ) + cli::cat_line() - df_release <- df_current - } else { - # Return detailed breakdown and warning if previous data have changed. - if (length(waldo_object) > 0) { - cli::cat_line() + cli::cat_bullet( + "These will be dropped, but new rows are included.", + bullet = "info", + col = "orange", + bullet_col = "orange" + ) - cli::cat_bullet( - "The following rows have changed from the previous data, and will be dropped: ", - bullet = "info", - col = "orange", - bullet_col = "orange" - ) + return(df_release) - cli::cat_print( - waldo_object - ) + } else if (include_new == FALSE) { + cli::cat_line() - # By using an inner join, we drop any row which does not match in - # df_previous. - df_current_without_changed_rows <- suppressMessages( - dplyr::inner_join( - df_current_without_new_row, - df_previous - ) - ) + cli::cat_bullet( + "These will be dropped, along with new rows.", + bullet = "info", + col = "orange", + bullet_col = "orange" + ) - # Using inner_join does mean that the new rows will need to be added - # back in. 
- df_release <- dplyr::bind_rows( - df_current_new_rows, - df_current_without_changed_rows - ) - } + # If new rows are not included, simply return the df without changed rows + return(df_current_without_changed_rows) } - return(df_release) } diff --git a/README.Rmd b/README.Rmd index d027d6f..1981d77 100644 --- a/README.Rmd +++ b/README.Rmd @@ -46,9 +46,10 @@ devtools::install_github("thomaszwagerman/butterfly") ## Overview The butterfly package contains the following: - * `butterfly::loupe()` - examines in detail whether previous values have changed, and reports them using `waldo::compare()`. + * `butterfly::loupe()` - examines in detail whether previous values have changed, and returns TRUE/FALSE for no change/change. * `butterfly::catch()` - returns rows which contain previously changed values in a dataframe. * `butterfly::release()` - drops rows which contain previously changed values, and returns a dataframe containing new and unchanged rows. + * `butterfly::create_object_list()` - returns a list of objects required by all of `loupe()`, `catch()` and `release()`. Contains underlying functionality. * `butterflycount` - a list of monthly dataframes, which contain fictional butterfly counts for a given date. ## Examples diff --git a/README.md b/README.md index cd35727..ec89c31 100644 --- a/README.md +++ b/README.md @@ -46,11 +46,14 @@ devtools::install_github("thomaszwagerman/butterfly") The butterfly package contains the following: - `butterfly::loupe()` - examines in detail whether previous values have - changed, and reports them using `waldo::compare()`. + changed, and returns TRUE/FALSE for no change/change. - `butterfly::catch()` - returns rows which contain previously changed values in a dataframe. - `butterfly::release()` - drops rows which contain previously changed values, and returns a dataframe containing new and unchanged rows. +- `butterfly::create_object_list()` - returns a list of objects required + by all of `loupe()`, `catch()` and `release()`. 
Contains underlying + functionality. - `butterflycount` - a list of monthly dataframes, which contain fictional butterfly counts for a given date. @@ -96,21 +99,22 @@ butterfly::loupe( butterflycount$january, datetime_variable = "time" ) -#> The following rows are new in 'butterflycount$february': +#> The following rows are new in 'df_current': #> time count #> 1 2024-02-01 17 #> ✔ And there are no differences with previous data. +#> [1] TRUE butterfly::loupe( butterflycount$march, butterflycount$february, datetime_variable = "time" ) -#> The following rows are new in 'butterflycount$march': +#> The following rows are new in 'df_current': #> time count #> 1 2024-03-01 23 #> -#> ℹ But the following values have changes from the previous data: +#> ℹ The following values have changes from the previous data. #> old vs new #> count #> old[1, ] 17 @@ -121,6 +125,7 @@ butterfly::loupe( #> #> `old$count`: 17 22 55 18 #> `new$count`: 17 22 55 11 +#> [1] FALSE ``` `butterfly::loupe()` uses `dplyr::semi_join()` to match the new and old @@ -147,11 +152,11 @@ df_caught <- butterfly::catch( butterflycount$february, datetime_variable = "time" ) -#> The following rows are new in 'butterflycount$march': +#> The following rows are new in 'df_current': #> time count #> 1 2024-03-01 23 #> -#> ℹ The following rows have changed from the previous data, and will be returned: +#> ℹ The following values have changes from the previous data. #> old vs new #> count #> old[1, ] 17 @@ -162,6 +167,8 @@ df_caught <- butterfly::catch( #> #> `old$count`: 17 22 55 18 #> `new$count`: 17 22 55 11 +#> +#> ℹ Only these rows are returned. 
df_caught #> time count @@ -177,11 +184,11 @@ df_released <- butterfly::release( butterflycount$february, datetime_variable = "time" ) -#> The following rows are new in 'butterflycount$march': +#> The following rows are new in 'df_current': #> time count #> 1 2024-03-01 23 #> -#> ℹ The following rows have changed from the previous data, and will be dropped: +#> ℹ The following values have changes from the previous data. #> old vs new #> count #> old[1, ] 17 @@ -192,6 +199,8 @@ df_released <- butterfly::release( #> #> `old$count`: 17 22 55 18 #> `new$count`: 17 22 55 11 +#> +#> ℹ These will be dropped, but new rows are included. df_released #> time count diff --git a/codemeta.json b/codemeta.json index 052e16e..31830e1 100644 --- a/codemeta.json +++ b/codemeta.json @@ -128,8 +128,9 @@ }, "SystemRequirements": null }, - "fileSize": "330.734KB", + "fileSize": "337.771KB", "readme": "https://github.com/thomaszwagerman/butterfly/blob/main/README.md", "contIntegration": ["https://github.com/thomaszwagerman/butterfly/actions/workflows/R-CMD-check.yaml", "https://app.codecov.io/gh/thomaszwagerman/butterfly?branch=main"], + "developmentStatus": "https://lifecycle.r-lib.org/articles/stages.html#experimental", "keywords": ["qaqc", "timeseries"] } diff --git a/man/catch.Rd b/man/catch.Rd index 337ba5b..ee0ea5a 100644 --- a/man/catch.Rd +++ b/man/catch.Rd @@ -23,6 +23,9 @@ This function matches two dataframe objects by their unique identifier which contains only rows that have changed compared to previous data. It will not return any new rows. } +\details{ +The underlying functionality is handled by \code{create_object_list()}. 
+} \examples{ df_caught <- butterfly::catch( butterflycount$march, @@ -35,4 +38,6 @@ df_caught } \seealso{ \code{\link[=loupe]{loupe()}} + +\code{\link[=create_object_list]{create_object_list()}} } diff --git a/man/create_object_list.Rd b/man/create_object_list.Rd new file mode 100644 index 0000000..d1a6fa0 --- /dev/null +++ b/man/create_object_list.Rd @@ -0,0 +1,47 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/create_object_list.R +\name{create_object_list} +\alias{create_object_list} +\title{create_object_list: creates a list of objects used in all butterfly functions} +\usage{ +create_object_list(df_current, df_previous, datetime_variable) +} +\arguments{ +\item{df_current}{data.frame, the newest/current version of dataset x.} + +\item{df_previous}{data.frame, the old version of dataset, for example x - t1.} + +\item{datetime_variable}{string, which variable to use as unique ID to join +\code{df_current} and \code{df_previous}. Usually a "datetime" variable.} +} +\value{ +A list containing boolean where TRUE indicates no changes to +previous data and FALSE indicates unexpected changes, a dataframe of +the current data without new rows and a dataframe of new rows only +} +\description{ +This function creates a list of objects which is used by all of \code{loupe()}, +\code{catch()} and \code{release()}. +} +\details{ +This function matches two dataframe objects by their unique identifier +(usually "time" or "datetime" in a timeseries). + +It informs the user of new (unmatched) rows which have appeared, and then +returns a \code{waldo::compare()} call to give a detailed breakdown of changes. + +The main assumption is that \code{df_current} and \code{df_previous} are newer and +older versions of the same data, and that the \code{datetime_variable} variable name always +remains the same. Elsewhere new columns can appear, and these will be +returned in the report. 
+} +\examples{ +butterfly_object_list <- butterfly::create_object_list( + butterflycount$february, + butterflycount$january, + datetime_variable = "time" +) + +butterfly_object_list + +} diff --git a/man/loupe.Rd b/man/loupe.Rd index 6c7eea2..cb59b2e 100644 --- a/man/loupe.Rd +++ b/man/loupe.Rd @@ -14,7 +14,7 @@ loupe(df_current, df_previous, datetime_variable) \item{datetime_variable}{string, which variable to use as unique ID to join \code{df_current} and \code{df_previous}. Usually a "datetime" variable.} } \value{ -A waldo object containing a message on differences or 'And there are no differences with previous data'. +A boolean where TRUE indicates no changes to previous data and FALSE indicates unexpected changes. } \description{ A loupe is a simple, small magnification device used to examine small details @@ -35,6 +35,8 @@ The main assumption is that \code{df_current} and \code{df_previous} are a newer older versions of the same data, and that the \code{datetime_variable} variable name always remains the same. Elsewhere new columns can of appear, and these will be returned in the report. + +The underlying functionality is handled by \code{create_object_list()}. 
} \examples{ # This example contains no differences with previous data @@ -52,3 +54,6 @@ butterfly::loupe( ) } +\seealso{ +\code{\link[=create_object_list]{create_object_list()}} +} diff --git a/man/release.Rd b/man/release.Rd index ebd91c8..ed62da8 100644 --- a/man/release.Rd +++ b/man/release.Rd @@ -4,7 +4,7 @@ \alias{release} \title{Release: return current dataframe without changed old rows} \usage{ -release(df_current, df_previous, datetime_variable) +release(df_current, df_previous, datetime_variable, include_new = TRUE) } \arguments{ \item{df_current}{data.frame, the newest/current version of dataset x.} @@ -12,6 +12,8 @@ release(df_current, df_previous, datetime_variable) \item{df_previous}{data.frame, the old version of dataset, for example x - t1.} \item{datetime_variable}{string, which variable to use as unique ID to join \code{df_current} and \code{df_previous}. Usually a "datetime" variable.} + +\item{include_new}{boolean, should new rows be included? Default is TRUE.} } \value{ A dataframe which contains only rows of \code{df_current} that have not changed from \code{df_previous}, and includes new rows. @@ -35,4 +37,6 @@ df_released } \seealso{ \code{\link[=loupe]{loupe()}} + +\code{\link[=create_object_list]{create_object_list()}} } diff --git a/tests/testthat/test-catch.R b/tests/testthat/test-catch.R index 12f3054..311dd72 100644 --- a/tests/testthat/test-catch.R +++ b/tests/testthat/test-catch.R @@ -1,28 +1,3 @@ -test_that("warning when no new rows", { - # And when the previous/current dfs have been swapped. 
- expect_warning( - catch( - butterflycount$january, - butterflycount$february, - datetime_variable = "time" - ) - ) -}) - -test_that("error when rows are identical", { - # This should occur when dfs are identical - expect_error( - # Suppressing warning, as this is also given (see above) - suppressWarnings( - catch( - butterflycount$january, - butterflycount$january, - datetime_variable = "time" - ) - ) - ) -}) - test_that("correct message is fed back", { expect_output( catch( @@ -39,7 +14,7 @@ test_that("correct message is fed back", { butterflycount$february, datetime_variable = "time" ), - "The following rows have changed from the previous data, and will be returned:" + "Only these rows are returned" ) }) diff --git a/tests/testthat/test-create_object_list.R b/tests/testthat/test-create_object_list.R new file mode 100644 index 0000000..3523b13 --- /dev/null +++ b/tests/testthat/test-create_object_list.R @@ -0,0 +1,70 @@ +test_that("error when no new rows", { + # This should occur when dfs are identical + expect_error( + create_object_list( + butterflycount$january, + butterflycount$january, + datetime_variable = "time" + ) + ) + # And when the previous/current dfs have been swapped. + expect_error( + create_object_list( + butterflycount$january, + butterflycount$february, + datetime_variable = "time" + ) + ) +}) + +test_that("correct message is fed back", { + expect_output( + create_object_list( + butterflycount$february, + butterflycount$january, + datetime_variable = "time" + ), + "The following rows are new in" + ) + expect_output( + create_object_list( + butterflycount$february, + butterflycount$january, + datetime_variable = "time" + ), + "And there are no differences with previous data." + ) + expect_output( + create_object_list( + butterflycount$march, + butterflycount$february, + datetime_variable = "time" + ), + "The following values have changes from the previous data." 
+ ) +}) + +test_that("a list of three objects is returned", { + expect_length( + create_object_list( + butterflycount$february, + butterflycount$january, + datetime_variable = "time" + ), + 3 + ) +}) + +test_that("comparison object is returned when not equal", { + create_object_list_output <- create_object_list( + butterflycount$march, + butterflycount$february, + datetime_variable = "time" + ) + expect_gt( + length( + create_object_list_output + ), + 0 + ) +}) diff --git a/tests/testthat/test-loupe.R b/tests/testthat/test-loupe.R index 7082059..eced681 100644 --- a/tests/testthat/test-loupe.R +++ b/tests/testthat/test-loupe.R @@ -1,70 +1,19 @@ -test_that("error when no new rows", { - # This should occur when dfs are identical - expect_error( +test_that("TRUE is returned when equal", { + expect_true( loupe( - butterflycount$january, - butterflycount$january, - datetime_variable = "time" - ) - ) - # And when the previous/current dfs have been swapped. - expect_error( - loupe( - butterflycount$january, butterflycount$february, + butterflycount$january, datetime_variable = "time" ) ) }) -test_that("correct message is fed back", { - expect_output( - loupe( - butterflycount$february, - butterflycount$january, - datetime_variable = "time" - ), - "The following rows are new in" - ) - expect_output( - loupe( - butterflycount$february, - butterflycount$january, - datetime_variable = "time" - ), - "And there are no differences with previous data." 
- ) - expect_output( +test_that("FALSE is returned when NOT equal", { + expect_false( loupe( butterflycount$march, butterflycount$february, datetime_variable = "time" - ), - "But the following values have changes from the previous data:" - ) -}) - -test_that("comparison object is not returned when equal", { - expect_length( - loupe( - butterflycount$february, - butterflycount$january, - datetime_variable = "time" - ), - 0 - ) -}) - -test_that("comparison object is returned when not equal", { - loupe_output <- loupe( - butterflycount$march, - butterflycount$february, - datetime_variable = "time" - ) - expect_gt( - length( - loupe_output - ), - 0 + ) ) }) diff --git a/tests/testthat/test-release.R b/tests/testthat/test-release.R index ee60ec4..fab33f0 100644 --- a/tests/testthat/test-release.R +++ b/tests/testthat/test-release.R @@ -1,33 +1,22 @@ -test_that("warning when no new rows", { - # And when the previous/current dfs have been swapped. - expect_warning( - release( - butterflycount$january, - butterflycount$february, - datetime_variable = "time" - ) - ) -}) - -test_that("warning when there are no different rows to drop", { - # This should occur when dfs are identical - expect_warning( +test_that("correct message is fed back", { + expect_output( release( + butterflycount$march, butterflycount$february, - butterflycount$january, - datetime_variable = "time" - ) + datetime_variable = "time", + include_new = TRUE + ), + "These will be dropped, but new rows are included" ) -}) -test_that("correct message is fed back", { expect_output( release( butterflycount$march, butterflycount$february, - datetime_variable = "time" + datetime_variable = "time", + include_new = FALSE ), - "The following rows have changed from the previous data, and will be dropped:" + "These will be dropped, along with new rows" ) }) diff --git a/vignettes/butterfly.Rmd b/vignettes/butterfly.Rmd index a1e22f5..a06a16c 100644 --- a/vignettes/butterfly.Rmd +++ b/vignettes/butterfly.Rmd @@ -38,9 
+38,10 @@ devtools::install_github("thomaszwagerman/butterfly") ## Overview The butterfly package contains the following: - * `butterfly::loupe()` - examines in detail whether previous values have changed, and reports them using `waldo::compare()`. + * `butterfly::loupe()` - examines in detail whether previous values have changed, and returns TRUE/FALSE for no change/change. * `butterfly::catch()` - returns rows which contain previously changed values in a dataframe. * `butterfly::release()` - drops rows which contain previously changed values, and returns a dataframe containing new and unchanged rows. + * `butterfly::create_object_list()` - returns a list of objects required by all of `loupe()`, `catch()` and `release()`. Contains underlying functionality. * `butterflycount` - a list of monthly dataframes, which contain fictional butterfly counts for a given date. ## How to use butterfly