Plots summary of data distribution — plot_data

Plots a summary of the feature distributions in the data, either pooled across all features or separately for each feature. Only continuous (i.e., numeric) and categorical (i.e., character or factor) features are plotted.

plot_data_distribution(
  data,
  by_feature = NULL,
  plot_type = "auto",
  xlab = "Value",
  title = NULL,
  plot_heights = 1,
  theme_options = NULL,
  ...
)

Arguments

data: A data matrix, data frame, or vector.
by_feature: Logical. If TRUE, plots the distribution of each feature separately. If FALSE, plots the distribution of all features together. If NULL (the default), this is set to TRUE if there are fewer than 10 features and FALSE otherwise.
plot_type: Type of plot. Default is "auto", which uses a kernel density plot for continuous features and a bar plot for categorical features. If not "auto", `plot_type` should be a list with two named elements, `continuous` and `categorical`, giving the type of plot to use for continuous and categorical features, respectively. The `continuous` element must be one of "density", "histogram", or "boxplot", while the `categorical` element must be "bar" (with more options to come).
xlab: X-axis label.
title: Plot title.
plot_heights: (Optional) numeric vector of relative row heights of subplots. Only used if both continuous and categorical features are found in the data. For example, plot_heights = c(2, 1) would make the first row twice as tall as the second row.
theme_options: (Optional) list of arguments to pass to vthemes::theme_vmodern().
...: Additional arguments to pass to ggplot2::geom_*().
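
As an illustration of how these arguments fit together, the following is a minimal sketch rather than an official example. It assumes the function is exported from a package named vdocs (the package name is not stated on this page) and uses the built-in iris data, which has four numeric features and one factor. The call is guarded so that the snippet still runs if the package is not installed.

```r
# Sketch only: the package name `vdocs` is an assumption, not confirmed here.
data(iris)  # built-in dataset: 4 numeric columns and 1 factor column

# A non-"auto" plot_type is a list with the two named elements described above.
plot_type <- list(continuous = "histogram", categorical = "bar")

if (requireNamespace("vdocs", quietly = TRUE)) {
  p <- vdocs::plot_data_distribution(
    data = iris,
    by_feature = TRUE,       # one distribution per feature
    plot_type = plot_type,
    plot_heights = c(2, 1),  # first row twice as tall as the second
    title = "Iris feature distributions"
  )
  print(p)
}
```

Because iris contains both continuous and categorical features, plot_heights is expected to take effect here; with purely numeric data it would be ignored.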

Value

A ggplot object.
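
Since the return value is documented as a ggplot object, it should be possible to extend or save it with standard ggplot2 tools. The sketch below assumes the vdocs package name and uses only numeric columns so that a single plot type applies. The subtitle text and the temporary output file are illustrative choices.

```r
# Sketch only: assumes the vdocs and ggplot2 packages are installed.
X <- iris[, 1:4]  # numeric columns only

if (requireNamespace("vdocs", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  # by_feature = FALSE plots the distribution of all features together
  p <- vdocs::plot_data_distribution(X, by_feature = FALSE)

  # Add ggplot2 components to the returned object
  p <- p + ggplot2::labs(subtitle = "All numeric features together")

  # Save to a temporary file (illustrative location)
  out_file <- tempfile(fileext = ".png")
  ggplot2::ggsave(out_file, plot = p, width = 6, height = 4)
}
```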