
A helper function for developing new Evaluator functions that summarize results over pre-specified groups in a grouped data.frame (e.g., over multiple experimental replicates).


summarize_eval_results(
  eval_data,
  eval_id = NULL,
  value_col,
  summary_funs = c("mean", "median", "min", "max", "sd", "raw"),
  custom_summary_funs = NULL,
  na_rm = FALSE
)



eval_data: A grouped data.frame of evaluation results to summarize.


eval_id: Character string. ID to be used as a suffix when naming result columns. The default, NULL, adds no ID to the column names.


value_col: Character string. Name of the column in eval_data with values to summarize.


summary_funs: Character vector specifying how to summarize evaluation metrics. Each element must be one of the built-in summary functions: "mean", "median", "min", "max", "sd", "raw".


custom_summary_funs: Named list of custom functions to summarize results. Each name in the list is the name of the corresponding summary function. Each value should be a function that takes one argument, the values of the evaluated metric.
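To make the expected shape of custom_summary_funs concrete, here is a minimal base R sketch. The names `range` and `iqr`, the `values` vector, and the `vapply()` call are illustrative assumptions. They show only that each list element is a one-argument function applied to a set of metric values, not how summarize_eval_results() applies the list internally.

```r
# Hypothetical custom summaries: a named list of one-argument functions.
custom_funs <- list(
  range = function(x) max(x) - min(x),
  iqr = function(x) stats::IQR(x)
)

# Illustrative metric values for a single group.
values <- c(1, 2, 3, 4)

# Apply every custom function to the same values; the list names carry over.
out <- vapply(custom_funs, function(f) f(values), numeric(1))
print(out)
```

The list names are what identify each summary, so they should be unique and descriptive.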


na_rm: A logical value indicating whether NA values should be stripped before the computation proceeds.
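The effect of stripping NA values can be seen with base R's own `na.rm` argument. This sketch assumes that na_rm behaves analogously for the built-in summaries. That is an assumption based on the wording above, not something confirmed here.

```r
# Illustrative values containing one missing entry.
x <- c(1, NA, 3)

# Without stripping, a single NA makes the summary NA.
m_default <- mean(x)

# With stripping, the summary is computed over the non-missing values only.
m_stripped <- mean(x, na.rm = TRUE)

print(m_default)
print(m_stripped)
```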


A tibble containing the summarized results aggregated over the given groups. The summary columns correspond to the requested statistics in summary_funs and custom_summary_funs, and their names end with the suffix specified by eval_id. The group IDs are also retained in the returned tibble.
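As a rough picture of this kind of output, the following sketch computes a grouped mean and standard deviation directly with dplyr. It is not the function's implementation. The column names `mean_res` and `sd_res` are assumed here purely to illustrate a suffix derived from an eval_id of "res", and the exact naming used by summarize_eval_results() may differ.

```r
library(dplyr)

# Example grouped evaluation results: two replicates per DGP.
eval_data <- tibble::tibble(
  .rep = rep(1:2, times = 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = "Method",
  result = 1:4
) %>%
  group_by(.dgp_name, .method_name)

# One row per group; the grouping columns are kept alongside the summaries.
sketch <- eval_data %>%
  summarise(
    mean_res = mean(result), # assumed name: statistic plus "res" suffix
    sd_res = sd(result),
    .groups = "keep"
  )
print(sketch)
```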


# create example eval_data to summarize
eval_data <- tibble::tibble(.rep = rep(1:2, times = 2), 
                            .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
                            .method_name = "Method",
                            result = 1:4) %>%
  dplyr::group_by(.dgp_name, .method_name)
# summarize `result` column in eval_data
results <- summarize_eval_results(eval_data = eval_data, eval_id = "res",
                                  value_col = "result")
# only compute mean and sd of `result` column in eval_data over given groups
results <- summarize_eval_results(eval_data = eval_data, eval_id = "res",
                                  value_col = "result",
                                  summary_funs = c("mean", "sd"))
# summarize `result` column using a custom summary function
range_fun <- function(x) return(max(x) - min(x))
results <- summarize_eval_results(eval_data = eval_data, value_col = "result",
                                  custom_summary_funs = list(range = range_fun))