Generate independent Gaussian covariates and (binary) logistic response data.
logistic_gaussian_dgp.Rd
Generate independent normally-distributed covariates and logistic response data.
Usage
logistic_gaussian_dgp(
  n,
  p,
  s = p,
  betas = NULL,
  betas_sd = 1,
  intercept = 0,
  data_split = FALSE,
  train_prop = 0.5,
  return_values = c("X", "y", "support"),
  ...
)
Arguments
- n
Number of samples.
- p
Number of features.
- s
Sparsity level of features. Coefficients corresponding to features after the s-th position (i.e., positions i = s + 1, ..., p) are set to 0.
- betas
Coefficient vector for observed design matrix. If a scalar is provided, the coefficient vector is constant. If NULL (default), entries in the coefficient vector are drawn iid from N(0, betas_sd^2). Can also be a function that generates the coefficient vector; see generate_coef().
- betas_sd
(Optional) SD of normal distribution from which to draw betas. Only used if the betas argument is NULL or is a function, in which case betas_sd is optionally passed to the function as sd; see generate_coef().
- intercept
Scalar intercept term.
- data_split
Logical; if TRUE, splits data into training and test sets according to train_prop.
- train_prop
Proportion of data in training set if data_split = TRUE.
- return_values
Character vector indicating which objects to return in the list. Each element must be one of "X", "y", or "support".
- ...
Not used.
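To make the roles of n, p, s, betas, betas_sd, and intercept concrete, the data-generating process described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name sketch_logistic_gaussian is hypothetical, covariates are assumed to be iid standard normal, the support is assumed to be the first s feature indices, and only the NULL and scalar forms of betas are handled (the function form via generate_coef() is omitted).

```r
# Hypothetical helper for illustration only; not the package's code.
sketch_logistic_gaussian <- function(n, p, s = p, betas = NULL,
                                     betas_sd = 1, intercept = 0) {
  stopifnot(s >= 0, s <= p, is.null(betas) || length(betas) == 1)

  # Assumption: covariates are iid N(0, 1).
  X <- matrix(rnorm(n * p), nrow = n, ncol = p)

  # Coefficients for the first s features: drawn from N(0, betas_sd^2)
  # when betas is NULL, otherwise a constant vector.
  active <- if (is.null(betas)) {
    rnorm(s, mean = 0, sd = betas_sd)
  } else {
    rep(betas, s)
  }

  # Features at positions s + 1, ..., p get coefficient 0.
  beta <- c(active, rep(0, p - s))

  # Logistic link: P(y = 1 | x) = plogis(intercept + x' beta).
  prob <- plogis(intercept + drop(X %*% beta))
  y <- rbinom(n, size = 1, prob = prob)

  # Assumption: the support is the set of the first s feature indices.
  list(X = as.data.frame(X), y = y, support = seq_len(s))
}

set.seed(1)
sim <- sketch_logistic_gaussian(n = 100, p = 5, s = 2)
str(sim$support)
```

With s = 2 and p = 5, only the first two columns of X influence y in this sketch; the remaining three are pure noise features.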
Value
A list of the named objects that were requested in return_values. See brief descriptions below.
- X
A data.frame.
- y
A response vector of length nrow(X).
- support
A vector of feature indices indicating all features used in the true support of the DGP.
Note that if data_split = TRUE and "X", "y" are in return_values, then the returned list also contains slots for "Xtest" and "ytest".
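The shape of the returned list under data_split = TRUE can be illustrated with a small base R sketch. The helper name sketch_split is hypothetical, and assigning rows to the training set at random is an assumption; the page does not say how the function chooses which rows go where, only that train_prop sets the training proportion.

```r
# Hypothetical helper for illustration only; not the package's code.
sketch_split <- function(X, y, train_prop = 0.5) {
  n <- nrow(X)

  # Assumption: training rows are chosen uniformly at random.
  train_idx <- sample.int(n, size = round(train_prop * n))
  test_idx <- setdiff(seq_len(n), train_idx)

  # Same slot names as described above: X, y, Xtest, ytest.
  list(
    X = X[train_idx, , drop = FALSE],
    y = y[train_idx],
    Xtest = X[test_idx, , drop = FALSE],
    ytest = y[test_idx]
  )
}

set.seed(1)
X <- as.data.frame(matrix(rnorm(40), nrow = 10))
y <- rbinom(10, size = 1, prob = 0.5)
out <- sketch_split(X, y, train_prop = 0.7)
names(out)
```

Here 7 of the 10 rows land in X and y, and the remaining 3 in Xtest and ytest.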