function for calculating scales #126

wibeasley · 2022-10-17T15:24:20Z

inputs:

vector of column names
minimum count of nonmissing columns
weights vector
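
For discussion, here is a minimal base-R sketch of a function that accepts those three inputs. It is not the package's implementation: the name `weighted_row_mean()`, its argument names, and the choice to return a weighted mean (rather than a weighted sum) are all assumptions made for illustration.

```r
# Hypothetical helper (not part of any package): weighted mean across columns,
# returning NA for rows with too few nonmissing cells.
weighted_row_mean <- function(d, columns, min_nonmissing,
                              weights = rep(1, length(columns))) {
  stopifnot(
    length(columns) == length(weights),
    all(columns %in% names(d))
  )

  weighted_sum <- numeric(nrow(d))  # running sum of weight * value
  weight_total <- numeric(nrow(d))  # running sum of weights for observed cells
  nonmissing   <- integer(nrow(d))  # running count of observed cells

  # Loop over columns so the data.frame is never cast to a matrix.
  for (i in seq_along(columns)) {
    x        <- d[[columns[i]]]
    observed <- !is.na(x)
    weighted_sum <- weighted_sum + ifelse(observed, x * weights[i], 0)
    weight_total <- weight_total + observed * weights[i]
    nonmissing   <- nonmissing + observed
  }

  score <- weighted_sum / weight_total
  score[nonmissing < min_nonmissing] <- NA_real_
  score
}

d_example <- data.frame(
  a = c(1, 2, NA),
  b = c(3, NA, NA),
  c = c(5, 6, 7)
)
scores <- weighted_row_mean(
  d_example,
  columns        = c("a", "b", "c"),
  min_nonmissing = 2,
  weights        = c(1, 1, 2)
)
```

Weights of missing cells are dropped from the denominator, so each row's score is renormalized over the cells that were actually observed.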

wibeasley · 2022-10-18T14:58:17Z

@genevamarshall, @yutiantang, and others are using sjstats::mean_n(). It doesn't support nonuniform weights, and (at least currently) it uses a slow approach that involves casting the data.frame to a matrix.

wibeasley · 2023-10-22T21:53:29Z

I've been working on something that meets all these requirements except for the nonuniform weights.
https://github.com/LiveOak/vasquez-border-reentry-1

row_sum <- function(
    d,
    columns_to_average        = character(0),
    pattern,
    new_column_name           = "row_sum",
    threshold_proportion      = .75,
    verbose                   = FALSE
) {

  if (length(columns_to_average) == 0L) {
    columns_to_average <-
      d |>
      colnames() |>
      grep(
        x         = _,
        pattern   = pattern,
        value     = TRUE,
        perl      = TRUE
      )

    if (verbose) {
      message(
        "The following columns will be summed:\n- ",
        paste(columns_to_average, collapse = "\n- ")
      )
    }
  }

  d |>
    dplyr::mutate(
      # Temporary columns are dot-prefixed so they don't collide with the
      # default `new_column_name` ("row_sum") and get dropped by the select() below.
      .row_sum = # Sum of the nonmissing cells
        rowSums(
          dplyr::across(!!columns_to_average),
          na.rm = TRUE
        ),
      .nonmissing_count =
        rowSums(
          dplyr::across(
            !!columns_to_average,
            .fns = \(x) { !is.na(x) }
          )
        ),
      .nonmissing_proportion = .nonmissing_count / length(columns_to_average),
      {{new_column_name}} :=
        dplyr::if_else(
          threshold_proportion <= .nonmissing_proportion,
          .row_sum,
          # .row_sum / .nonmissing_count,
          NA_real_
        )
    ) |>
    dplyr::select(
      -.row_sum,
      -.nonmissing_count,
      -.nonmissing_proportion,
    )
  # Alternatively, return just the new columns
  # dplyr::pull({{new_column_name}})
}

DavidBard · 2024-10-03T13:08:17Z

@wibeasley Feature request and questions:
FR: Would be nice to have a row_mean function as well, which averages across all nonmissing items.
Q1: For row_sum, should 'columns_to_average' argument be 'columns_to_sum' instead?
Q2: Can you provide an example of how this function might be used inside a dplyr::mutate statement?
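
To make the feature request concrete, here is a rough base-R sketch of what a row_mean-style helper could look like. The name `row_mean_sketch()` and its arguments are hypothetical, and keeping a `threshold_proportion` argument that mirrors row_sum() is an assumption, not something the request specifies.

```r
# Hypothetical sketch: mean across the nonmissing items of each row,
# set to NA when too few items are observed.
row_mean_sketch <- function(d, columns, threshold_proportion = 0.75) {
  items <- d[, columns, drop = FALSE]

  # Proportion of nonmissing cells per row
  nonmissing_proportion <- rowMeans(!is.na(items))

  # Average only the observed cells
  means <- rowMeans(items, na.rm = TRUE)

  means[nonmissing_proportion < threshold_proportion] <- NA_real_
  unname(means)
}

d_items <- data.frame(
  q1 = c(4, NA, NA),
  q2 = c(2, 3, NA),
  q3 = c(3, 5, NA),
  q4 = c(3, 4, 1)
)
item_means <- row_mean_sketch(d_items, columns = c("q1", "q2", "q3", "q4"))
```

Setting `threshold_proportion = 0` would average whatever items are present; a row with no observed items then yields NaN rather than NA.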

wibeasley · 2024-10-03T14:08:06Z

@DavidBard,

sure: add row_mean() #142
good catch: change parameter from columns_to_average() to columns_to_process() #141

see https://ouhscbbmc.github.io/OuhscMunge/reference/row_sum.html#examples
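
For readers who want something inline, the same thresholded row sum can be sketched directly inside dplyr::mutate() without calling the package function. This is only an illustration of the idea, not the documented row_sum() example; the data, the `items` vector, and the column names `nonmissing_count` and `scale_sum` are made up, and `dplyr::pick()` assumes dplyr 1.1.0 or later.

```r
# Illustrative data: four items, with increasing missingness by row
d <- data.frame(
  id     = 1:3,
  item_1 = c(1, NA, NA),
  item_2 = c(2, 2, NA),
  item_3 = c(3, 3, NA),
  item_4 = c(4, 4, 4)
)

items                <- c("item_1", "item_2", "item_3", "item_4")
threshold_proportion <- 0.75

d_scored <-
  d |>
  dplyr::mutate(
    # Count of observed items in each row
    nonmissing_count =
      unname(rowSums(!is.na(dplyr::pick(dplyr::all_of(items))))),
    # Sum of observed items, or NA when too few items are observed
    scale_sum =
      dplyr::if_else(
        threshold_proportion <= nonmissing_count / length(items),
        unname(rowSums(dplyr::pick(dplyr::all_of(items)), na.rm = TRUE)),
        NA_real_
      )
  )
```

The third row has only one of four items observed, so it falls below the threshold and its `scale_sum` is NA.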

wibeasley self-assigned this Oct 17, 2022

wibeasley added a commit that referenced this issue Oct 28, 2023

starting row_sum()

a0e35dc

ref #126

wibeasley added a commit that referenced this issue Oct 28, 2023

row_sum() checks

4457547

ref #126

wibeasley added a commit that referenced this issue Oct 28, 2023

change to verbose

628fbe7

I'm not sure why it was producing errors before ref #126

wibeasley added a commit that referenced this issue Oct 28, 2023

tests for row_sum()

bef38cf

ref #126

wibeasley added the potential-new-function label Oct 28, 2023

wibeasley added a commit that referenced this issue Oct 28, 2023

include count of nonmissing cells

e1c187f

ref #126

wibeasley added a commit that referenced this issue Oct 28, 2023

improve doc

82164f4

ref #126

wibeasley mentioned this issue Oct 28, 2023

row_sum() #132

Merged

This was referenced Oct 3, 2024

change parameter from columns_to_average() to columns_to_process() #141

Closed

add row_mean() #142

Closed
