Setting Up Your Simulation Study
Source:vignettes/simulation-directory-setup.Rmd
simulation-directory-setup.Rmd
We recommend initializing your simulation directory as you would any new analysis: using a project. Doing so improves reproducibility and replicability; it keeps your code, data, and outputs neatly separated from other analyses.
In the same vein, we highly encourage the use of version control software — git in particular. It helps you keep track of changes, undo snafus, and seamlessly collaborate with your peers. Your git repositories can be linked to GitHub, a hosting service that lets you manage your projects. If you’re unfamiliar with git or GitHub, or haven’t yet set git up on you machine, we suggest reviewing Chapters 4 through 8 of Happy Git and GitHub for the useR.
While you could create a version-controlled R project manually, we
suggest using the create_sim()
function. Briefly, this
function takes as argument a path
indicating the location
of the new directory for your simulation study and creates the necessary
folders and files. It also initializes a new project and a git
repository. This repository can optionally be linked to GitHub using,
for example, usethis::use_github()
.
If you don’t want to use git, set create_sim()
’s
init_git
argument to FALSE
.
After creating your project with create_sim()
, you’ll
find the following directories and files at the designated
path
:
-
R/
: This folder will contain your simulation study’s R code.-
meal.R
: YoursimChef
experiment is assembled in this file. -
dgp/
: Data-generating process functions are stored here. -
method/
: Method functions are stored here. -
eval/
: Evaluator functions are stored here. -
viz/
: Visualizer functions are stored here.
-
-
results/
: Your simulation experiment’s output are automatically saved here. -
README.md
: This markdown file briefly summarizes your simulation study’s goals and results. Some guidance on navigating your project’s directory structure might also be appreciated by your collaborators — and even yourself at some point in the future.
We also suggest incorporating unit tests in your simulation study. Unit tests allow you to test individual units of code to make sure that they are working as expected. The trustworthiness of your simulation study’s results is in turn improved.
simChef
integrates with the testthat
R package,
the most commonly used unit testing framework for R software
development. The scaffolding required for testing your simulation
study’s code is generated automatically when setting
create_sim()
’s tests
argument to
TRUE
—the default. All of your tests can then be carried out
by running run_tests()
from the root directory of your
simulation project.
Here’s a rundown of the directories and files that are created by
create_sim()
:
-
tests/
: The folder containing all of the unit testing material.-
testthat.R
: This script callstestthat::test_dir()
on the following sub-directories to run the tests contained therein. All packages used by the functions in yourR/
directory must be included manually in this file. -
testthat/dgp-tests/
: Tests for data-generating process functions. -
testthat/method-tests/
: Tests for method functions. -
testthat/eval-tests/
: Tests for evaluator functions. -
testthat/viz-tests/
: Tests for visualizer functions.
-
If you know that your simulation study will require plenty of
computational resources, then you might consider running it on a
high-performance computing environment with a workload managers, like SLURM.
Directories for organizing high-power computing-related files can be
generated by setting hpc = TRUE
when calling
create_sim()
:
-
scripts/
: This directory folder contains scripts, like Bash files, for initiating simulation studies in high-power computing environments. For added flexibility, we recommend the R packageoptparse
to parse command-line options from your bash scripts withinmeal.R
. -
logs/
: The files output by these workload managers can be stored in this directory.
Again with reproducibility and replicability in mind, we encourage
the use of renv
for project-local dependency management. renv
can be
initialized when creating your project by setting
init_renv = TRUE
in your call to create_sim()
.
The general renv
workflow can be found here.