
A DGP is a data-generating process that can generate() data in an Experiment.

Generally speaking, users won't interact with the DGP R6 class directly, but rather indirectly through create_dgp() and the Experiment helpers.

Public fields

name

The name of the DGP.

dgp_fun

The user-defined data-generating process function.

dgp_params

A (named) list of user-defined default arguments to input into the data-generating process function.

Methods


Method new()

Initialize a new DGP object.

Usage

DGP$new(.dgp_fun, .name = NULL, ...)

Arguments

.dgp_fun

The user-defined data-generating process function.

.name

(Optional) A name for the DGP, helpful for later identification.

...

User-defined default arguments to pass to .dgp_fun() when DGP$generate() is called.

Returns

A new instance of DGP.
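
The leading dots in .dgp_fun and .name deserve a note. One plausible motivation (an assumption; this page does not state it) is that R partially matches named arguments against formals that appear before ..., so an undotted formal such as name could silently capture a user parameter such as n. The base R sketch below uses two hypothetical constructors, toy_new_plain() and toy_new_dotted(), which are not part of the package, to show the difference.

```r
# Hypothetical constructor WITHOUT dot-prefixed formals (illustration only).
toy_new_plain <- function(dgp_fun, name = NULL, ...) {
  list(name = name, dgp_fun = dgp_fun, dgp_params = list(...))
}

# Hypothetical constructor mirroring the documented DGP$new() signature.
toy_new_dotted <- function(.dgp_fun, .name = NULL, ...) {
  list(name = .name, dgp_fun = .dgp_fun, dgp_params = list(...))
}

# Hypothetical DGP function with a parameter named `n`.
toy_fun <- function(n, sigma) rnorm(n, sd = sigma)

# `n` partially matches the formal `name`, so it never reaches `...`.
plain <- toy_new_plain(toy_fun, n = 50, sigma = 1)

# `n` cannot match `.name` or `.dgp_fun`, so it is stored as a default.
dotted <- toy_new_dotted(toy_fun, n = 50, sigma = 1)

names(plain$dgp_params)   # only "sigma"; `n` was absorbed by `name`
names(dotted$dgp_params)  # "n" and "sigma"
```

In the plain version the user's n ends up as the object's name, which is why keeping constructor formals out of the user's namespace is a sensible design for a function that forwards arbitrary arguments.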


Method generate()

Generate data from a DGP.

Usage

DGP$generate(...)

Arguments

...

User-defined arguments to pass to DGP$dgp_fun() that override the default parameters set at initialization. If no additional arguments are provided, data are generated using DGP$dgp_fun() with the parameters that were set when DGP$new() was called.

Returns

Result of DGP$dgp_fun(). If the result is not a list, it will be coerced to a list.
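
The override and coercion behavior described for generate() can be sketched in base R. The helper toy_generate() below is hypothetical and is not the package's implementation; in particular, the use of modifyList() for merging and is.list() for the coercion check are assumptions made for illustration, and how the real method treats objects such as data frames is not specified on this page.

```r
# Hypothetical stand-in for DGP$generate(); not the package's implementation.
toy_generate <- function(dgp_fun, dgp_params, ...) {
  overrides <- list(...)
  # Call-time arguments replace stored defaults of the same name.
  # Caveat: modifyList() drops entries overridden with NULL and merges
  # nested lists recursively, which may differ from the real method.
  params <- modifyList(dgp_params, overrides)
  out <- do.call(dgp_fun, params)
  # Assumption: a result that is not already a list is wrapped in one.
  if (!is.list(out)) out <- list(out)
  out
}

# Hypothetical DGP function that returns a bare numeric vector.
toy_fun <- function(n, sigma) rnorm(n, sd = sigma)
defaults <- list(n = 50, sigma = 1)

d_default <- toy_generate(toy_fun, defaults)        # uses the stored n = 50
d_small   <- toy_generate(toy_fun, defaults, n = 5) # n overridden, sigma kept

length(d_default[[1]])
length(d_small[[1]])
```

The stored defaults are left untouched by the call, so an override applies to that one generate() call only in this sketch.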


Method print()

Print a formatted summary of a DGP, showing its name, function, and parameters.

Usage

DGP$print()

Returns

The original DGP object, invisibly.
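
Returning the object invisibly is a common R convention for print methods: the object can be assigned or passed along, but it is not auto-printed a second time at the console. The sketch below illustrates the convention with a hypothetical S3 class, toy_dgp, rather than the R6 class documented here.

```r
# Hypothetical S3 object standing in for a DGP (illustration only).
toy <- structure(list(name = "Toy DGP"), class = "toy_dgp")

print.toy_dgp <- function(x, ...) {
  cat("DGP Name:", x$name, "\n")
  invisible(x)  # hand the object back without triggering auto-printing
}

# withVisible() reports both the returned value and its visibility flag.
res <- withVisible(print(toy))
res$visible
```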


Method clone()

The objects of this class are cloneable with this method.

Usage

DGP$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# create an example DGP function
dgp_fun <- function(n, beta, rho, sigma) {
  cov_mat <- matrix(c(1, rho, rho, 1), byrow = TRUE, nrow = 2, ncol = 2)
  X <- MASS::mvrnorm(n = n, mu = rep(0, 2), Sigma = cov_mat)
  y <- X %*% beta + rnorm(n, sd = sigma)
  return(list(X = X, y = y))
}

# create DGP (with uncorrelated features)
dgp <- DGP$new(.dgp_fun = dgp_fun,
               .name = "Linear Gaussian DGP",
               # additional named parameters to pass to dgp_fun() by default
               n = 50, beta = c(1, 0), rho = 0, sigma = 1)

print(dgp)
#> DGP Name: Linear Gaussian DGP 
#>    Function: function (n, beta, rho, sigma)  
#>    Parameters: List of 4
#>      $ n    : num 50
#>      $ beta : num [1:2] 1 0
#>      $ rho  : num 0
#>      $ sigma: num 1

data_uncorr <- dgp$generate()
cor(data_uncorr$X)
#>            [,1]       [,2]
#> [1,]  1.0000000 -0.3978365
#> [2,] -0.3978365  1.0000000

data_corr <- dgp$generate(rho = 0.7)
cor(data_corr$X)
#>           [,1]      [,2]
#> [1,] 1.0000000 0.7100391
#> [2,] 0.7100391 1.0000000