get_data_summary.Rd
Provides a summary of the given (X, y) data in table form. Serves as a wrapper function around skimr::skim(), which skims a data frame and returns a broad overview of useful summary statistics. This wrapper can currently handle columns of type "factor", "numeric", "character", "logical", "complex", "Date", and "POSIXct". All other column types are ignored.
get_data_summary(
  X,
  y = NULL,
  skim_out = NULL,
  digits = 2,
  sigfig = FALSE,
  features = NULL,
  max_features = 1000,
  html = knitr::is_html_output(),
  ...
)
Data matrix or data frame.
Response vector.
(Optional) cached output of `skimr::skim()`. Specify if the skim output has been pre-computed in order to reduce computation.
Number of digits to display for numeric values.
Logical. If TRUE, digits refers to the number of significant figures. If FALSE, digits refers to the number of decimal places.
(Optional) vector of features to include in summary. Default (NULL) is to include all features.
(Optional) maximum number of features to include in summary. Only used if features = NULL. Default is 1000. If the number of features in X exceeds `max_features`, the features kept in the summary are chosen randomly.
Logical indicating whether the output is an HTML table (TRUE) or a LaTeX table (FALSE).
Additional arguments to pass to vthemes::pretty_DT() if html = TRUE or vthemes::pretty_kable() if html = FALSE.
Returns an html table (i.e., the output of vthemes::pretty_DT()) or a latex table (i.e., the output of vthemes::pretty_kable()), containing a broad overview of summary statistics for each data column.
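The behavior of the `digits`/`sigfig` and `max_features` arguments described above can be illustrated with a small base R sketch. This is not the function's actual implementation, which is not shown on this page. It assumes that decimal-place rounding behaves like `round()`, that significant-figure rounding behaves like `signif()`, and that the random feature selection resembles a call to `sample()`. The toy objects `x`, `X_sub`, and `keep` are hypothetical names used only for illustration.

```r
# Illustrative sketch only; base R analogues, not the package's own code.

# 1) `digits` with sigfig = FALSE vs. sigfig = TRUE
x <- c(123.456, 0.012345)
digits <- 2
decimal_places <- round(x, digits)   # analogous to sigfig = FALSE
sig_figures <- signif(x, digits)     # analogous to sigfig = TRUE
print(decimal_places)
print(sig_figures)

# 2) `max_features` when features = NULL and X has too many columns
set.seed(1)                          # fixed seed so the random subset is reproducible
X <- as.data.frame(matrix(rnorm(5 * 20), nrow = 5))  # toy data: 5 rows, 20 features
features <- NULL
max_features <- 8

if (is.null(features) && ncol(X) > max_features) {
  # Assumed behavior: keep a random subset of max_features columns
  keep <- sample(colnames(X), max_features)
  X_sub <- X[, keep, drop = FALSE]
} else {
  X_sub <- X
}
print(ncol(X_sub))
```

Because the kept features are chosen randomly when `max_features` is exceeded, setting a seed beforehand (or passing `features` explicitly) is likely the simplest way to get the same summary columns across runs.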