filter_cols.Rd
Given data X, filters out columns in X according to various data preprocessing/cleaning procedures. `filter_cols_by_var` reduces the number of features in the data by keeping those with the largest variance.
filter_cols_by_var(X, min_var = NULL, max_p = NULL)
X: A data matrix or data frame.
min_var: (Optional) minimum variance threshold. All columns with variance lower than `min_var` are removed. If NULL (default), no variance threshold is applied.
max_p: (Optional) maximum number of features to keep. Only the `max_p` features with the highest variances are kept. If NULL (default), there is no limit on the number of features kept.
A cleaned data matrix or data frame.
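The behavior described above can be illustrated with a minimal base R sketch. The function `filter_cols_by_var_sketch` below is a hypothetical stand-in written for this example, not the package's implementation. It assumes all columns are numeric and contain no missing values, that columns with variance exactly equal to `min_var` are kept, and that `max_p` is applied to the columns that survive the `min_var` threshold; the real function may differ in these details.

```r
# Hypothetical re-implementation for illustration only (not the package code).
filter_cols_by_var_sketch <- function(X, min_var = NULL, max_p = NULL) {
  # Sample variance of each column (assumes numeric columns, no NAs).
  col_vars <- apply(as.matrix(X), 2, var)
  keep <- rep(TRUE, ncol(X))

  # Drop columns whose variance is lower than min_var.
  if (!is.null(min_var)) {
    keep <- keep & (col_vars >= min_var)
  }

  # Among the surviving columns, keep only the max_p with the largest variance.
  if (!is.null(max_p) && sum(keep) > max_p) {
    idx <- which(keep)
    top <- idx[order(col_vars[idx], decreasing = TRUE)][seq_len(max_p)]
    keep <- seq_along(keep) %in% top
  }

  # drop = FALSE preserves the matrix / data frame shape and column order.
  X[, keep, drop = FALSE]
}

# Toy data: column variances are a = 0, b = 1.67, c = 166.67, d = 0.33.
X <- data.frame(
  a = c(1, 1, 1, 1),
  b = c(1, 2, 3, 4),
  c = c(0, 10, 20, 30),
  d = c(5, 5, 6, 6)
)

names(filter_cols_by_var_sketch(X, min_var = 0.5))  # "b" "c"
names(filter_cols_by_var_sketch(X, max_p = 1))      # "c"
names(filter_cols_by_var_sketch(X))                 # "a" "b" "c" "d"
```

With both arguments left at NULL the input is returned unchanged, which matches the documented defaults.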