Developer function for summarizing evaluation results.
Source: R/evaluator-lib-utils.R
A helper function for developing new Evaluator functions that summarize results over pre-specified groups in a grouped data.frame (e.g., over multiple experimental replicates). This is often used in conjunction with eval_constructor().
Usage
eval_summarizer(
  eval_data,
  eval_id = NULL,
  value_col,
  summary_funs = c("mean", "median", "min", "max", "sd", "raw"),
  custom_summary_funs = NULL,
  na_rm = FALSE
)
Arguments
- eval_data
A grouped data.frame of evaluation results to summarize.
- eval_id
Character string. ID to be used as a suffix when naming result columns. The default, NULL, does not add any ID to the column names.
- value_col
Character string. Name of the column in eval_data with values to summarize.
- summary_funs
Character vector specifying how to summarize evaluation metrics. Each element must be one of the built-in summary functions: "mean", "median", "min", "max", "sd", "raw".
- custom_summary_funs
Named list of custom functions to summarize results. Each name in the list is used as the name of that summary function. Each value in the list should be a function that takes one argument, the values of the evaluated metric.
- na_rm
A logical value indicating whether NA values should be stripped before the computation proceeds.
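Because each entry of custom_summary_funs is a plain one-argument function, it can be checked on an ordinary vector before being handed to eval_summarizer(). The following is a minimal sketch in base R; the function names range and n_missing are hypothetical, and the functions handle NA values themselves because this page does not state whether na_rm is also applied to custom functions.

```r
# Hypothetical custom summary functions. Each takes one argument:
# the vector of metric values for a single group.
custom_funs <- list(
  # spread of the values, ignoring NA explicitly
  range = function(x) max(x, na.rm = TRUE) - min(x, na.rm = TRUE),
  # number of missing values in the group
  n_missing = function(x) sum(is.na(x))
)

# Apply each function to an example vector to check it in isolation.
values <- c(1, NA, 4)
checks <- vapply(custom_funs, function(f) f(values), numeric(1))
print(checks)
```

A list built this way could then be passed as custom_summary_funs = custom_funs.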
Value
A tibble containing the summarized results aggregated over the given groups. The summary columns correspond to the requested statistics in summary_funs and custom_summary_funs and end with the suffix specified by eval_id. Note that the group IDs are also retained in the returned tibble.
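To make the shape of the returned value concrete, the following is a rough hand-written equivalent using dplyr, not the function's actual implementation. It assumes dplyr and tibble are installed, and the column names mean_res and sd_res (a statistic name joined to the eval_id "res" by an underscore) are an assumption about the naming pattern; this page only states that the columns end with the eval_id suffix.

```r
library(dplyr)

# Same example data as in the Examples section, grouped by DGP and method.
eval_data <- tibble::tibble(
  .rep = rep(1:2, times = 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = "Method",
  result = 1:4
) %>%
  group_by(.dgp_name, .method_name)

# Hand-rolled grouped summary: one row per group, group IDs retained,
# one column per requested statistic (column names are assumed).
manual <- eval_data %>%
  summarise(
    mean_res = mean(result),
    sd_res = sd(result),
    .groups = "keep"
  )
print(manual)
```

The result has one row per (.dgp_name, .method_name) group, which matches the description above of results aggregated over the given groups with the group IDs retained.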
Examples
# create example eval_data to summarize
eval_data <- tibble::tibble(.rep = rep(1:2, times = 2),
                            .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
                            .method_name = "Method",
                            result = 1:4) %>%
  dplyr::group_by(.dgp_name, .method_name)

# summarize `result` column in eval_data
results <- eval_summarizer(eval_data = eval_data, eval_id = "res",
                           value_col = "result")

# only compute mean and sd of `result` column in eval_data over given groups
results <- eval_summarizer(eval_data = eval_data, eval_id = "res",
                           value_col = "result",
                           summary_funs = c("mean", "sd"))

# summarize `result` column using a custom summary function
range_fun <- function(x) return(max(x) - min(x))
results <- eval_summarizer(eval_data = eval_data, value_col = "result",
                           custom_summary_funs = list(range = range_fun))