Nesting Dataframes with Different Sets of Values in R
tidyr::nest gives you a list-column of data frames, one for each unique combination of values of the nesting variables. Sometimes, though, you want subsets in which the nesting variable takes several values at once, and those subsets may overlap. For example, I wanted to compare three classes to a control class all together, and then each of the three classes to the control class one at a time. That calls for generalizing the nesting operation a little.
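library(dplyr)  # filter() and the %>% pipe
library(purrr)  # map()

nest_with <- function(data, variable, subgroups) {
  assertthat::assert_that(base::is.data.frame(data))
  assertthat::assert_that(rlang::is_list(subgroups))
  # One output row per subgroup, labelled by the name given in `subgroups`.
  subset_names <- names(subgroups)
  # For each set of values, keep the rows where `variable` is in that set.
  data_subsets <- subgroups %>%
    unname() %>%
    map(
      function(value_set) { filter(data, {{ variable }} %in% value_set) }
    )
  output <- tibble::tibble(subgroup = subset_names, data = data_subsets)
  return(output)
}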
Now you can call it as follows to get a different subset of the data for each comparison. Here data is assumed to be a data frame with a category column, and D plays the role of the control class.
data %>%
  nest_with(
    variable = category,
    subgroups = list(
      "ABCD" = c("A", "B", "C", "D"),
      "AD" = c("A", "D"),
      "BD" = c("B", "D"),
      "CD" = c("C", "D")
    )
  )
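To see what comes back, here is a minimal, self-contained sketch on made-up data. The toy tibble, its score column, and the choice of three rows per class are assumptions for illustration only; the function body is a condensed copy of nest_with() so that the snippet runs on its own.

```r
library(dplyr)
library(purrr)
library(tibble)

# Condensed copy of nest_with(), repeated so this snippet is self-contained.
nest_with <- function(data, variable, subgroups) {
  stopifnot(is.data.frame(data), is.list(subgroups))
  data_subsets <- map(
    unname(subgroups),
    function(value_set) filter(data, {{ variable }} %in% value_set)
  )
  tibble(subgroup = names(subgroups), data = data_subsets)
}

# Hypothetical toy data: three rows per class, with D acting as the control.
toy <- tibble(
  category = rep(c("A", "B", "C", "D"), each = 3),
  score = 1:12
)

nested <- nest_with(
  toy,
  variable = category,
  subgroups = list(
    "ABCD" = c("A", "B", "C", "D"),
    "AD" = c("A", "D")
  )
)

# Number of rows in each subset: all 12 rows for "ABCD", 6 rows for "AD".
row_counts <- map_int(nested$data, nrow)
```

Because the result is an ordinary tibble with a list-column, the usual nested workflow applies from here, for example mapping a model-fitting function over nested$data. Note that, unlike with tidyr::nest, the same row can appear in several subsets.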