plot_data_distribution.Rd
Plots a summary of the feature distributions (either together or separately per feature) in the data. Only continuous (i.e., numeric) and categorical (i.e., character or factor) features are used for plotting.
plot_data_distribution(
  data,
  by_feature = NULL,
  plot_type = "auto",
  xlab = "Value",
  title = NULL,
  plot_heights = 1,
  theme_options = NULL,
  ...
)
A data matrix, data frame, or vector.
Logical. If TRUE, plots the distribution of each feature separately. If FALSE, plots the distributions of all features together. Default (NULL) is TRUE if there are fewer than 10 features and FALSE otherwise.
Type of plot. Default is "auto", which uses a kernel density plot for continuous features and a bar plot for categorical features. If not "auto", `plot_type` must be a list with two named elements, `continuous` and `categorical`, giving the type of plot to use for continuous and categorical features, respectively. The `continuous` element must be one of "density", "histogram", or "boxplot"; the `categorical` element must be "bar" (with more options to come).
X-axis label.
Plot title.
(Optional) numeric vector of relative row heights of subplots. Only used if both continuous and categorical features are found in the data. For example, plot_heights = c(2, 1) would make the first row twice as tall as the second row.
(Optional) list of arguments to pass to vthemes::theme_vmodern().
Additional arguments to pass to ggplot2::geom_*().
A ggplot object.
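The following is a minimal usage sketch based only on the arguments documented above. It assumes that plot_data_distribution() is already available in the session (for example, because the package providing it has been attached); the guard variable `has_fun`, the object names, and the choice of the built-in `iris` data are illustrative and not part of the original documentation.

```r
# Assumption: plot_data_distribution() has been made available beforehand,
# e.g. by attaching the package that provides it. The guard lets the sketch
# run without error when the function is absent.
has_fun <- exists("plot_data_distribution", mode = "function")

# Explicit plot types: one named element per feature type, as required
# whenever plot_type is not "auto".
plot_type <- list(continuous = "histogram", categorical = "bar")

if (has_fun) {
  # iris has four numeric features and one factor (Species), so both
  # continuous and categorical features are present.
  p_default <- plot_data_distribution(iris)

  # Plot all features together instead of one subplot per feature.
  p_together <- plot_data_distribution(iris, by_feature = FALSE)

  # Histograms for continuous features, bars for categorical ones, with
  # the first row of subplots twice as tall as the second.
  p_custom <- plot_data_distribution(
    iris,
    plot_type = plot_type,
    plot_heights = c(2, 1),
    xlab = "Measurement",
    title = "Iris feature distributions"
  )
}
```

Because the return value is documented as a ggplot object, each result can presumably be printed or extended with further ggplot2 layers in the usual way.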