Yu Group Software: R

class: center, middle, inverse, title-slide

# Yu Group Software: R
## R dev <> simChef <> vdocs
### Tiffany Tang and James Duncan
### 2022/04/27

---

# Roadmap

### 1. R package development
- *Walkthrough*: creating a toy R package
- Useful tools: documentation, unit testing, GitHub actions

### 2. `simChef` for cooking up simulations in R

### 3. `vdocs` (short for veridical docs)

Resources available at: https://github.com/Yu-Group/r-dev

---

# Why bother with R packages?

- For sharing re-usable code with others

```r
devtools::install_github("Yu-Group/simChef")

library(simChef)
# You can now use functions from simChef!
```

- For organizing your code

- Lots of helpful tools that are tied to this package structure (e.g., [`usethis`](https://usethis.r-lib.org/), [`devtools`](https://devtools.r-lib.org/), [`roxygen2`](https://roxygen2.r-lib.org/), [`pkgdown`](https://pkgdown.r-lib.org/))

---

# Let's get started with a toy R package

```r
# create package with tidyverse conventions implemented
usethis::create_tidy_package("R.pkg.ex")  
```

This will

- create a new package named `R.pkg.ex`
- apply as many of the tidyverse conventions as possible
- issue a few reminders
- activate the new package.

```r
# or alternatively, a more barebones start
usethis::create_package("R.pkg.ex")
```

---

# Writing a reusable R function

All R functions belong in the `R/` directory.

Let's create a file called `R/example_function.R` and add the following to the file:

```r
remove_constant_columns <- function(X) {
  const_cols <- purrr::map_lgl(X, ~all(duplicated(.x)[-1L]))
  X_cleaned <- X %>%
    dplyr::select(which(!const_cols))
  return(X_cleaned)
}
```

What does this function do?
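
One way to answer this is to trace the logic on a small data frame. The sketch below is a base-R stand-in (the name `remove_constant_columns_base` is hypothetical and not part of the toy package) that applies the same `duplicated()` test and returns the cleaned data, assuming the intent is to drop columns whose entries are all identical:

```r
# Hypothetical base-R stand-in for the toy function (no purrr/dplyr needed).
remove_constant_columns_base <- function(X) {
  # A column is constant if every entry after the first duplicates an earlier one.
  const_cols <- vapply(X, function(col) all(duplicated(col)[-1L]), logical(1))
  X[!const_cols]
}

X <- data.frame(
  a = c(1, 1, 1),   # constant: dropped
  b = c(1, 2, 3),   # varies: kept
  c = c(1, NA, 1)   # duplicated() treats NA as its own value: kept
)
cleaned <- remove_constant_columns_base(X)
names(cleaned)  # "b" "c"
```

Note that column `c` survives: `duplicated()` does not skip missing values, so a column mixing a single value with `NA` is not treated as constant.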

---

# Documentation with `roxygen2`

```r
#' Remove constant columns in data
#'
#' @description Given some data X, removes all columns in the data that are a
#'   constant value. NAs are not ignored: an NA counts as a distinct value.
#'
#' @param X A data frame.
#'
#' @return A data frame where constant columns have been removed.
#'
#' @examples
#' X <- data.frame(a = c(1, 1, 1), b = c(1, 2, 3))
#' X_cleaned <- remove_constant_columns(X)
#' 
#' @export
remove_constant_columns <- function(X) {
  const_cols <- purrr::map_lgl(X, ~all(duplicated(.x)[-1L]))
  X_cleaned <- X %>%
    dplyr::select(which(!const_cols))
  return(X_cleaned)
}
```

We can then render the documentation easily via:

```r
devtools::document()  # or equivalently, roxygen2::roxygenise()
```

After this, we can see the function documentation via the usual:

```r
? remove_constant_columns
```

---

# Managing dependencies

Note however that `remove_constant_columns` depends on other packages.

```r
remove_constant_columns <- function(X) {
  const_cols <- purrr::map_lgl(X, ~all(duplicated(.x)[-1L]))  # needs purrr
  X_cleaned <- X %>%                                          # needs the pipe
    dplyr::select(which(!const_cols))                         # needs dplyr
  return(X_cleaned)
}
```

The `usethis` package makes it easy to manage these dependencies:

```r
usethis::use_package("purrr")
usethis::use_package("dplyr")
usethis::use_pipe()
```

---

# Additional documentation

### Readme

This is the first thing that people will see on GitHub (and on the package website).

The `README.Rmd` file has already been initialized by `usethis::create_tidy_package()`. We next need to render the README.Rmd into a README.md via:

```r
devtools::build_readme()
```

### Creating vignettes

Vignettes are a great way to teach others how to use your code/package.

To begin creating a vignette, `usethis` provides a nice starter template:

```r
usethis::use_vignette("example_vignette")
```

This initializes a vignette found at `vignettes/example_vignette.Rmd`.

---

# Creating a website

So far, we have created an R package with a toy R function, some documentation, and a vignette.

An easy way to organize all this documentation and to improve accessibility is through a package website.

```r
pkgdown::build_site()
```

---

# Other useful commands

To check whether or not the package can be built successfully:

```r
devtools::check()
```

[Note that `devtools::check()` rebuilds all the accompanying documentation, runs all unit tests, and goes through CRAN checks.]

<br>
To quickly load in the package under development and its most recent changes:

```r
devtools::load_all()
```

<br>
To automatically style source code according to the tidyverse style guide:

```r
usethis::use_tidy_style()
```

---

# testthat

Testing is an important part of the data science workflow!

🖇[`testthat`](https://testthat.r-lib.org/)  makes it simple.

```r
hitchhiker <- function(x) {
  return(x + 1)
}

testthat::expect_equal(hitchhiker(41), 42)
testthat::expect_equal(hitchhiker(42), 42)

## Error: hitchhiker(42) not equal to 42.
## 1/1 mismatches
## [1] 43 - 42 == 1
```

Other useful `testthat` verbs:

```r
expect_identical() # test if objects are the same
expect_true(); expect_false()
expect_lt(); expect_gt()
expect_error(); expect_warning()
expect_snapshot() # saves stdout and later compares with this gold standard
```
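
As a rough illustration of a few of these verbs, the sketch below uses a hypothetical variant of `hitchhiker()` with an added input check (the check is an assumption made for the example, not part of the function above). Each expectation passes silently when it holds and raises an error otherwise:

```r
library(testthat)

# Hypothetical variant of hitchhiker() that validates its input.
hitchhiker <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  x + 1
}

expect_identical(hitchhiker(41), 42)            # same value and same type
expect_true(hitchhiker(0) > 0)
expect_false(is.character(hitchhiker(0)))
expect_lt(hitchhiker(1), 3)                     # 2 < 3
expect_gt(hitchhiker(1), 1)                     # 2 > 1
expect_error(hitchhiker("a"), "must be numeric")
expect_warning(warning("careful"), "careful")
```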

---

# testthat

📁 `testthat` tests should be in a directory called `tests/testthat/` next to your
project's `R/` directory. (`usethis::use_testthat()` can set that up for you.)

📄 `testthat` test files should be named like `test-*.R`, e.g. `test-xyz.R`. Again,
there's a `usethis` function to create this: `usethis::use_test("xyz")`.

👍 Create one `test-*.R` file for every `*.R` file in `R/`, e.g., `tests/testthat/test-xyz.R` for `R/xyz.R`.

Here's an example from `simChef`:

```r
test_that("fit_experiment works properly with future::multicore", {

  # multicore plan isn't supported on windows
  skip_on_os("windows")

  # fit_experiment_fixture() returns sequential and parallel simulation results
  results <- fit_experiment_fixture(future::multicore)

  expect_identical(results[[1]], results[[2]])
})
```
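
The same pattern applies to the toy package. Here is a minimal sketch of what a test file for `remove_constant_columns()` might contain. To keep it runnable outside the package, it defines a base-R stand-in for the function first; that stand-in is an assumption for illustration, since in a real package `testthat` loads the function from `R/` for you.

```r
library(testthat)

# Base-R stand-in so the sketch runs on its own; inside the package this
# definition would live in R/example_function.R instead.
remove_constant_columns <- function(X) {
  const_cols <- vapply(X, function(col) all(duplicated(col)[-1L]), logical(1))
  X[!const_cols]
}

test_that("remove_constant_columns drops only constant columns", {
  X <- data.frame(a = c(1, 1, 1), b = c(1, 2, 3))
  X_cleaned <- remove_constant_columns(X)

  expect_identical(names(X_cleaned), "b")
  expect_identical(nrow(X_cleaned), nrow(X))
})

test_that("remove_constant_columns keeps all columns when none are constant", {
  X <- data.frame(a = c(1, 2), b = c("x", "y"))
  expect_equal(remove_constant_columns(X), X)
})
```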

---

# withr

🖇[`withr`](https://withr.r-lib.org/index.html) is another helpful package for
writing testing code and much more.

* Prevents unintended side effects (aka, hard to squash 🦟!): `with_environment()`

* Cleans up temporary state: `with_tempfile()`

* Helps with reproducibility: `with_seed()`

* Can run arbitrary code when exiting a frame in the call stack: `defer()`, `defer_parent()`
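
A small sketch of three of these helpers in action. The variable names and the `write_scratch()` helper are made up for the example:

```r
library(withr)

# with_seed(): the same seed gives the same draws, without touching the
# global RNG stream.
draws_a <- with_seed(2022, rnorm(3))
draws_b <- with_seed(2022, rnorm(3))
identical(draws_a, draws_b)  # TRUE

# with_tempfile(): `tmp` is a temporary path that is removed afterwards.
res <- with_tempfile("tmp", {
  writeLines("hello", tmp)
  list(path = tmp, existed = file.exists(tmp))
})
res$existed            # TRUE
file.exists(res$path)  # FALSE

# defer(): schedule cleanup that runs when the enclosing function exits.
write_scratch <- function() {
  path <- tempfile(fileext = ".txt")
  defer(unlink(path))
  writeLines("scratch", path)
  path
}
scratch_path <- write_scratch()
file.exists(scratch_path)  # FALSE
```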

---

# withr in simChef

We use `withr` to temporarily set a `future` plan and reset it when the code
inside of `with_plan` finishes.

```r
set_plan <- function(plan, ...) {
  # get the original plan so we can reset at the end
  old_plan <- future::plan()
  future::plan(plan, ...)
  return(old_plan)
}

reset_plan <- function(plan) {
  future::plan(plan)
}

with_plan <- withr::with_(set_plan, reset_plan)

fit_experiment_fixture <- function(plan) {
  ... # setup code

  parallel <- withr::with_seed( # set the random seed
    seed,
    with_plan( # set the plan
      plan,
      fit_experiment(experiment, n_reps = 10) # get results in parallel
    )
  )
  return(list(parallel, sequential))
}
```
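
To see the `withr::with_()` pattern in isolation, here is a minimal sketch that swaps the `future` plan for a simpler piece of global state, the `digits` option. The helper names `set_digits`, `reset_digits`, and `with_digits` are hypothetical. As with `set_plan()` above, the setter returns the old state, and `with_()` passes that state to the reset function once the code finishes:

```r
# Setter: change the option and return the previous value.
set_digits <- function(digits) {
  old <- getOption("digits")
  options(digits = digits)
  old
}

# Resetter: receives whatever the setter returned.
reset_digits <- function(old) {
  options(digits = old)
}

with_digits <- withr::with_(set_digits, reset_digits)

before <- getOption("digits")
shown <- with_digits(3, format(pi))  # runs format(pi) with digits = 3
shown                                # "3.14"
identical(getOption("digits"), before)  # TRUE: the option was restored
```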

---

# GitHub Actions

🖇[GitHub Actions](https://docs.github.com/en/actions): automated scripts that run
on GitHub's servers when you make changes to a repo.

They help automate:

* testing

* code "linting" (syntax / style checking)

* (documentation) website deployment

* many other repetitive tasks that run when code changes

---

# r-lib's GitHub actions

Many of the core R packages we've mentioned are maintained in the
🖇[`r-lib`](https://github.com/r-lib) GitHub organization, including:
* `devtools`
* `roxygen2`
* `testthat`
* `pkgdown`
* `usethis`

`r-lib` also maintains a very useful repo with many GitHub Actions scripts that
you can use in your workflows: 🖇[r-lib/actions](https://github.com/r-lib/actions).

Of course there's a `usethis` command:

```r
# creates an action to run R CMD check
usethis::use_github_action("check-standard")
```

You can swap out `"check-standard"` for any of the example actions in [`r-lib/actions/examples/`](https://github.com/r-lib/actions/tree/v2-branch/examples).

---

# Example: Automated grading of STAT 215A lab reports

🖇[auto215a](https://github.com/Yu-Group/auto215a/) (private repo)

```yaml
jobs:
  run-grader:

    ...

    strategy:
      fail-fast: false
      matrix:
        student:
          # add all the students here
          - {repo: 'student1-username/stat215a', id: 'anon-id1'}
          - {repo: 'student2-username/stat215a', id: 'anon-id2'}
        lab:
          # add the lab(s) / function(s) to test here
          # each function should be in a separate .R file of the same name
          - {R-dir: 'lab3/R', fun: 'similarity_fn'}

    steps:
      # get this repo and set as the default working directory
      - uses: actions/checkout@v3

      # setup R
      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: 'release'
          use-public-rspm: true

      - name: Install grader dependencies
        run: |
          ## --------------------------------------------------------------------
          install.packages(c("remotes", "lintr"))
          remotes::install_github("Yu-Group/auto215a/auto215aR")
        shell: Rscript {0}

      # checkout student repo
      - uses: actions/checkout@v3
        with:
          repository: ${{ matrix.student.repo }}
          path: ./${{ matrix.student.id }}

      - name: Install student dependencies, test their functions, and lint their code
        run: |
          ## --------------------------------------------------------------------
          # install
          remotes::install_deps()
          # source
          source("${{ matrix.lab.R-dir }}/${{ matrix.lab.fun }}.R")
          # test
          auto215aR::test_${{ matrix.lab.fun }}(${{ matrix.lab.fun }}, submit_id = "${{ matrix.student.id }}")
          # lint
          lintr::lint("${{ matrix.lab.R-dir }}/${{ matrix.lab.fun }}.R")
        shell: Rscript {0}
        working-directory: ./${{ matrix.student.id }}
```

---

# `Yu-Group` Actions

GitHub Actions are a key part of the group's project / software development
workflow, used by:

* `simChef`
* `vdocs`
* `vflow`
* `imodels`
* Many more and hopefully yours too!

---

# Additional resources

We created these slides in `Rmarkdown` via the package 🖇[`xaringan`](https://bookdown.org/yihui/rmarkdown/xaringan.html).

These extensions make `xaringan` even more powerful:

* 🖇[xaringanthemer](https://pkg.garrickadenbuie.com/xaringanthemer)
* 🖇[xaringanExtra](https://pkg.garrickadenbuie.com/xaringanExtra)

Here are some great bookmarks for software dev / productivity in R:

* 🖇[Advanced R (2nd edition)](https://adv-r.hadley.nz)
* 🖇[R Packages (2nd edition)](https://r-pkgs.org/)
* 🖇[bookdown: Authoring Books and Technical Documents with R Markdown](https://bookdown.org/yihui/bookdown/)
* 🖇[blogdown: Creating Websites with R Markdown](https://bookdown.org/yihui/blogdown/)

---

# simChef