Plot ROC/PR curves for feature selection.

Plot ROC/PR curves for feature selection or some summary thereof across experimental replicates.

Usage

plot_feature_selection_curve(
  fit_results = NULL,
  eval_results = NULL,
  eval_name = NULL,
  eval_fun = "summarize_feature_selection_curve",
  eval_fun_options = NULL,
  vary_params = NULL,
  curve = c("ROC", "PR"),
  show = c("line", "ribbon"),
  ...
)

Arguments

fit_results

A tibble, as returned by fit_experiment().

eval_results

A list of result tibbles, as returned by evaluate_experiment().

eval_name

Name of Evaluator containing results to plot. If NULL, the data used for plotting is computed from scratch via eval_fun.

eval_fun

Character string, specifying the function used to compute the data used for plotting if eval_name = NULL. If eval_name is not NULL, this argument is ignored.

eval_fun_options

List of named arguments to pass to eval_fun.

vary_params

A vector of DGP or Method parameter names that are varied across the Experiment.

curve

Either "ROC" or "PR" indicating whether to plot the ROC or Precision-Recall curve.

show

Character vector with elements being one of "boxplot", "point", "line", "bar", "errorbar", "ribbon", "violin", indicating what plot layer(s) to construct.

...

Arguments passed on to plot_eval_constructor

eval_id: (Optional) Character string. ID used as the suffix for naming columns in evaluation results tibble. If eval_summary_constructor() was used to construct the Evaluator, this should be the same as the eval_id argument in eval_summary_constructor(). Only used to assign default (i.e., "auto") aesthetics in ggplot.
x_str: (Optional) Name of column in data frame to plot on the x-axis. Default "auto" chooses what to plot on the x-axis automatically.
y_str: (Optional) Name of column in data frame to plot on the y-axis if show is anything but "boxplot". Default "auto" chooses what to plot on the y-axis automatically.
y_boxplot_str: (Optional) Name of column in data frame to plot on the y-axis if show is "boxplot". Default "auto" chooses what to plot on the y-axis automatically.
err_sd_str: (Optional) Name of column in data frame containing the standard deviations of y_str. Used for plotting the errorbar and ribbon ggplot layers. Default "auto" chooses what column to use for the standard deviations automatically.
color_str: (Optional) Name of column in data frame to use for the color and fill aesthetics when plotting. Default "auto" chooses what to use for the color and fill aesthetics automatically. Use NULL to avoid adding any color and fill aesthetic.
linetype_str: (Optional) Name of column in data frame to use for the linetype aesthetic when plotting. Used only when show = "line". Default "auto" chooses what to use for the linetype aesthetic automatically. Use NULL to avoid adding any linetype aesthetic.
facet_formula: (Optional) Formula for ggplot2::facet_wrap() or ggplot2::facet_grid() if need be.
facet_type: One of "grid" or "wrap" specifying whether to use ggplot2::facet_grid() or ggplot2::facet_wrap(), respectively, if need be.
plot_by: (Optional) Name of column in eval_tib to use for subsetting data and creating different plots for each unique value. Default "auto" chooses what column to use for the subsetting automatically. Use NULL to avoid creating multiple plots.
add_ggplot_layers: List of additional layers to add to a ggplot object via +.
boxplot_args: (Optional) Additional arguments to pass into ggplot2::geom_boxplot().
point_args: (Optional) Additional arguments to pass into ggplot2::geom_point().
line_args: (Optional) Additional arguments to pass into ggplot2::geom_line().
bar_args: (Optional) Additional arguments to pass into ggplot2::geom_bar().
errorbar_args: (Optional) Additional arguments to pass into ggplot2::geom_errorbar().
ribbon_args: (Optional) Additional arguments to pass into ggplot2::geom_ribbon().
violin_args: (Optional) Additional arguments to pass into ggplot2::geom_violin().
facet_args: (Optional) Additional arguments to pass into ggplot2::facet_grid() or ggplot2::facet_wrap().
interactive: Logical. If TRUE, returns interactive plotly plots. If FALSE, returns static ggplot plots.

Value

If interactive = TRUE, returns a plotly object if plot_by is NULL and a list of plotly objects if plot_by is not NULL. If interactive = FALSE, returns a ggplot object if plot_by is NULL and a list of ggplot objects if plot_by is not NULL.
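
Because the return value is either a single plot or a list of plots depending on plot_by, downstream code that saves or prints the result may want one uniform shape. The following is a minimal sketch of one way to do that; as_plot_list() is a hypothetical helper, not part of simChef, and the objects below are empty stand-ins that only carry the class of a real plot.

```r
# Hypothetical helper (not part of simChef): always return a list of plots.
as_plot_list <- function(plt) {
  # Check the class rather than is.list(), since plot objects can
  # themselves be list-like.
  if (inherits(plt, c("ggplot", "plotly"))) list(plt) else plt
}

# Stand-ins for real plot objects, used only to show the behaviour.
single_plot <- structure(list(), class = c("gg", "ggplot"))
plots_by_dgp <- list(DGP1 = single_plot, DGP2 = single_plot)

length(as_plot_list(single_plot))   # 1
length(as_plot_list(plots_by_dgp))  # 2
```

With the result normalized this way, the same loop can handle output produced with or without plot_by.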

Examples

# generate example fit_results data
# (seed set so that the random importance scores are reproducible)
set.seed(123)
fit_results <- tibble::tibble(
  .rep = rep(1:2, times = 2),
  .dgp_name = c("DGP1", "DGP1", "DGP2", "DGP2"),
  .method_name = c("Method"),
  feature_info = lapply(
    1:4,
    FUN = function(i) {
      tibble::tibble(
        # feature names
        feature = c("featureA", "featureB", "featureC"),
        # true feature support
        true_support = c(TRUE, FALSE, TRUE),
        # estimated feature support
        est_support = c(TRUE, FALSE, FALSE),
        # estimated feature importance scores
        est_importance = c(10, runif(2, min = -2, max = 2))
      )
    }
  )
)

# generate example eval_results data
eval_results <- list(
  ROC = summarize_feature_selection_curve(
    fit_results,
    curve = "ROC",
    nested_cols = "feature_info",
    truth_col = "true_support",
    imp_col = "est_importance"
  ),
  PR = summarize_feature_selection_curve(
    fit_results,
    curve = "PR",
    nested_cols = "feature_info",
    truth_col = "true_support",
    imp_col = "est_importance"
  )
)

# create summary ROC/PR plots using pre-computed evaluation results
roc_plt <- plot_feature_selection_curve(eval_results = eval_results,
                                        eval_name = "ROC", curve = "ROC",
                                        show = c("line", "ribbon"))
pr_plt <- plot_feature_selection_curve(eval_results = eval_results,
                                       eval_name = "PR", curve = "PR",
                                       show = c("line", "ribbon"))
# or alternatively, create the same plots directly from fit results
roc_plt <- plot_feature_selection_curve(fit_results = fit_results,
                                        show = c("line", "ribbon"),
                                        curve = "ROC",
                                        eval_fun_options = list(
                                          nested_cols = "feature_info",
                                          truth_col = "true_support",
                                          imp_col = "est_importance"
                                        ))
pr_plt <- plot_feature_selection_curve(fit_results = fit_results,
                                       show = c("line", "ribbon"),
                                       curve = "PR",
                                       eval_fun_options = list(
                                         nested_cols = "feature_info",
                                         truth_col = "true_support",
                                         imp_col = "est_importance"
                                       ))

# can customize plot (see plot_eval_constructor() for possible arguments)
roc_plt <- plot_feature_selection_curve(eval_results = eval_results,
                                        eval_name = "ROC", curve = "ROC",
                                        show = c("line", "ribbon"),
                                        plot_by = ".dgp_name")
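
To make concrete what a feature selection ROC curve measures, the base R sketch below computes ROC points for a single replicate from a vector of true supports and a vector of importance scores. It is an illustration only and is not how simChef computes its curves: roc_points() is a hypothetical helper, ties between importance scores are not handled, and the scores are fixed values rather than the random ones used above.

```r
# Hypothetical helper (not part of simChef): ROC points from importance scores.
roc_points <- function(truth, importance) {
  stopifnot(is.logical(truth), length(truth) == length(importance))
  # Rank features from most to least important.
  ord <- order(importance, decreasing = TRUE)
  truth <- truth[ord]
  # Point k corresponds to selecting the top-k features, k = 0, 1, ..., p.
  tpr <- c(0, cumsum(truth) / sum(truth))
  fpr <- c(0, cumsum(!truth) / sum(!truth))
  data.frame(n_selected = seq_along(tpr) - 1, FPR = fpr, TPR = tpr)
}

# Same structure as one feature_info tibble in the examples above.
truth <- c(TRUE, FALSE, TRUE)
importance <- c(10, 1.5, -0.5)
roc <- roc_points(truth, importance)
roc
```

Each row gives the false and true positive rates obtained by declaring the n_selected highest-scoring features as selected; a PR curve would replace these rates with precision and recall computed over the same ranking.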
