eteppo

Nesting Dataframes with Different Sets of Values in R

Published: 2023-08-04
Updated: 2023-08-04

tidyr::nest splits a dataframe into a list of dataframes, one for each unique value (or combination of values) of the nesting variables. But sometimes you'd like subsets where the nesting variable takes different, possibly overlapping, sets of values. For example, I wanted to compare three classes together against a control class, and then each of the three classes against the control class one by one. So you can generalize the nesting operation a little bit.
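For contrast, here is a minimal sketch of what plain tidyr::nest does. The `toy` tibble and its `category` and `value` columns are made up for illustration and are not from the original analysis.

```r
library(tibble)
library(tidyr)

# Hypothetical toy data: four classes, with "D" playing the role of the control.
toy <- tibble(
  category = c("A", "A", "B", "C", "D", "D"),
  value = c(1, 2, 3, 4, 5, 6)
)

# Nest `value` by the remaining column, `category`.
# The result has one row per unique category, and each row of the input
# ends up in exactly one nested dataframe.
nested_by_category <- nest(toy, data = value)
```

Each input row lands in exactly one subset here, so there is no way to get overlapping subsets such as "A and D" and "B and D" from this call alone.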

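library(dplyr) # filter() and %>%
library(purrr) # map()

# Returns a tibble with one row per subgroup: the subgroup name and the rows
# of `data` whose `variable` value belongs to that subgroup's value set.
nest_with <- function(data, variable, subgroups) {
  assertthat::assert_that(base::is.data.frame(data))
  assertthat::assert_that(rlang::is_list(subgroups))
  # The list names become the subgroup labels, so every element needs a name.
  assertthat::assert_that(rlang::is_named(subgroups))

  subset_names <- names(subgroups)
  data_subsets <- subgroups %>%
    unname() %>%
    map(
      function(value_set) { filter(data, {{ variable }} %in% value_set) }
    )
  output <- tibble::tibble(subgroup = subset_names, data = data_subsets)
  return(output)
}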

Now you can call it as follows to get different subsets of the data for different comparisons.

data %>%
  nest_with(
    variable = category,
    subgroups = list(
      "ABCD" = c("A", "B", "C", "D"),
      "AD" = c("A", "D"),
      "BD" = c("B", "D"),
      "CD" = c("C", "D")
    )
  )
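To see what the call returns, here is a small self-contained sketch on made-up data. The `toy` tibble and the `n_rows` column are assumptions for illustration, and `nest_with` is repeated in compact form only so the block runs on its own.

```r
library(dplyr)
library(purrr)
library(tibble)

# Compact copy of nest_with from above (assertions omitted for brevity).
nest_with <- function(data, variable, subgroups) {
  data_subsets <- map(unname(subgroups), function(value_set) {
    filter(data, {{ variable }} %in% value_set)
  })
  tibble(subgroup = names(subgroups), data = data_subsets)
}

# Hypothetical toy data: "D" plays the role of the control class.
toy <- tibble(
  category = c("A", "A", "B", "C", "D", "D"),
  value = c(1, 2, 3, 4, 5, 6)
)

nested <- nest_with(
  toy,
  variable = category,
  subgroups = list(
    "ABCD" = c("A", "B", "C", "D"),
    "AD" = c("A", "D"),
    "BD" = c("B", "D"),
    "CD" = c("C", "D")
  )
)

# Count the rows in each subset. The control rows appear in every subset.
nested <- mutate(nested, n_rows = map_int(data, nrow))
```

The result has one row per named subgroup, and the same input rows can show up in several subsets, which is the point of the generalization. From here, each comparison can be run by mapping a model or a summary function over the `data` column.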