Generative vignette #24

Open
wants to merge 10 commits into
base: main
5 changes: 5 additions & 0 deletions DESCRIPTION
@@ -22,6 +22,11 @@ Authors@R: c(
role = c("aut", "cph"),
email = "[email protected]",
comment = c(ORCID = "0000-0003-0422-7977")),
person(given = "Jitao David",
family = "Zhang",
role = c("aut", "cph"),
email = "[email protected]",
comment = c(ORCID="0000-0002-3085-0909")),
person(given = "F. Hoffman-La Roche", role = c("cph", "fnd")))
Description:
Reducing batch effect by intelligently assigning samples to batches.
134 changes: 94 additions & 40 deletions vignettes/basic_examples.Rmd
@@ -1,17 +1,30 @@
---
title: "Basic example"
output: rmarkdown::html_vignette
title: "Basic example of using designit: plate layout with two factors"
output:
rmarkdown::html_vignette:
html_document:
df_print: paged
mathjax: default
number_sections: true
toc: true
toc_depth: 2
vignette: >
%\VignetteIndexEntry{Basic example}
%\VignetteIndexEntry{Basic example of using designit: plate layout with two factors}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
This vignette demonstrates the use of the _designit_ package with a series
of examples derived from the same task, namely to randomize samples of a
two-factor experiment into plate layouts. We shall start with the most basic
use and gradually explore some basic yet useful utilities provided
by the package.

```{r include=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(echo = TRUE,
fig.height=6, fig.width=6,
collapse = TRUE,
comment = "#>")
```

```{r setup}
@@ -21,14 +34,11 @@ library(dplyr)
library(tidyr)
```

# Plate layout with two factors

## The samples
# The samples and the conditions

Samples of a 2-condition in-vivo experiment are to be
placed on 48 well plates.
Our task is to randomize samples of an in-vivo experiment with multiple conditions. Our aim is to place them in several 48-well plates.

These are the conditions
These are the conditions:

```{r}
# conditions to use
@@ -44,10 +54,10 @@ conditions <- data.frame(
gt::gt(conditions)
```

We will have 3 animals per groups with 4 replicates each
We will have 3 animals per group, with 4 replicates of each animal.

```{r}
# sample table (2 animals per group with 3 replicates)
# sample table
n_reps <- 4
n_animals <- 3
animals <- bind_rows(replicate(n_animals, conditions, simplify = FALSE),
@@ -64,14 +74,16 @@ samples <- bind_rows(replicate(n_reps, animals, simplify = FALSE),

samples %>%
head(10) %>%
arrange(animal, group, replicate) %>%
gt::gt()
```
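The replication pattern described above (conditions repeated per animal, animals repeated per replicate) can be sketched in base R. This is only an illustration: the `conditions_demo` table, its values, and all `*_demo` names are assumptions made for this example, not the vignette's actual data.

```r
# Hypothetical conditions table (assumption; not the vignette's real conditions).
conditions_demo <- data.frame(
  group = 1:3,
  treatment = c("vehicle", "TRT1", "TRT2"),
  dose = c(0, 25, 25)
)

n_animals_demo <- 3
n_reps_demo <- 4

# Repeat every condition once per animal and give each animal its own ID.
animal_rows <- rep(seq_len(nrow(conditions_demo)), times = n_animals_demo)
animals_demo <- conditions_demo[animal_rows, ]
animals_demo$animal <- seq_len(nrow(animals_demo))

# Repeat every animal once per replicate and number the replicates.
sample_rows <- rep(seq_len(nrow(animals_demo)), times = n_reps_demo)
samples_demo <- animals_demo[sample_rows, ]
samples_demo$replicate <- rep(seq_len(n_reps_demo), each = nrow(animals_demo))
rownames(samples_demo) <- NULL

# 3 conditions x 3 animals x 4 replicates
nrow(samples_demo)
```

With these assumed inputs, each of the 9 animals appears in exactly 4 rows, one per replicate.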
## Plate layout requirements

# Plate layout requirements

Corner wells of the plates should be left empty.
This means on a 48 well plate we can place 44 samples.
Since we have `r nrow(samples)` samples, they will fit on
`r ceiling(nrow(samples)/44)` plates
`r ceiling(nrow(samples)/44)` plates.

```{r}
n_samp <- nrow(samples)
@@ -81,9 +93,9 @@ n_plates <- ceiling(n_samp / n_loc_per_plate)
exclude_wells <- expand.grid(plate = seq(n_plates), column = c(1, 8), row = c(1, 6))
```
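The arithmetic behind the plate count can be checked in isolation. The sketch below assumes a hypothetical sample count of 100 (the vignette derives the real number from `nrow(samples)`) and uses `*_demo` names to avoid clashing with the vignette's variables.

```r
# Hypothetical sample count (assumption for illustration only).
n_samp_demo <- 100

# A 48-well plate has 6 rows and 8 columns; the 4 corner wells stay empty.
n_rows <- 6
n_cols <- 8
n_loc_per_plate_demo <- n_rows * n_cols - 4

# Number of plates needed to hold all samples.
n_plates_demo <- ceiling(n_samp_demo / n_loc_per_plate_demo)

# One row per excluded corner well: 4 corners on each plate.
corners_demo <- expand.grid(
  plate = seq_len(n_plates_demo),
  column = c(1, n_cols),
  row = c(1, n_rows)
)

n_loc_per_plate_demo  # 44
n_plates_demo         # 3
nrow(corners_demo)    # 12
```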

## Setting up a Batch container
# Setting up a BatchContainer object

Create a BatchContainer object that provides all possible locations
First, we create a BatchContainer object that provides all possible locations.

```{r}
bc <- BatchContainer$new(
@@ -97,9 +109,9 @@ bc$exclude
bc$get_locations() %>% head()
```

## Moving samples
# Moving samples

Use random assignment function to place samples to plate locations
Next, we use the random assignment function to assign samples to plate locations.

```{r}
bc <- assign_random(bc, samples)
@@ -108,15 +120,15 @@ bc$get_samples()
bc$get_samples(remove_empty_locations = TRUE)
```
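To make the idea of random assignment concrete, here is a minimal base R sketch that does not use the package. It is not how `assign_random` is implemented; the two-plate layout, the count of 60 samples, and all variable names are assumptions made for this example.

```r
set.seed(42)

# All wells on two hypothetical 48-well plates (6 rows x 8 columns).
wells <- expand.grid(plate = 1:2, row = 1:6, column = 1:8)

# Drop the four corner wells of each plate.
is_corner <- wells$row %in% c(1, 6) & wells$column %in% c(1, 8)
usable <- wells[!is_corner, ]
rownames(usable) <- NULL

# Place 60 hypothetical sample IDs into randomly chosen usable wells;
# the remaining wells stay empty (NA).
n_samples_demo <- 60
usable$sample_id <- NA_integer_
chosen <- sample(nrow(usable), n_samples_demo)
usable$sample_id[chosen] <- seq_len(n_samples_demo)

head(usable)
```

Wells left as `NA` in this sketch play the same role as the empty locations that `bc$get_samples(remove_empty_locations = TRUE)` filters out.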

Plot of the result using the `plot_plate` function
To check the results visually, we can plot the result using the `plot_plate` function.

```{r, fig.width=6, fig.height=3.5}
plot_plate(bc,
plate = plate, column = column, row = row,
.color = treatment, .alpha = dose
)
```
To not show empty wells, we can directly plot the sample table as well
To hide empty wells, we can plot the sample table directly instead.

```{r, fig.width=6, fig.height=3.5}
plot_plate(bc$get_samples(remove_empty_locations = TRUE),
@@ -125,12 +137,11 @@ plot_plate(bc$get_samples(remove_empty_locations = TRUE),
)
```

To move individual samples or manually assigning all locations we can use the
`batchContainer$move_samples()` method
Sometimes we may wish to move or swap individual samples, or to assign
locations manually. In all of these cases we can use the
`batchContainer$move_samples()` method.

To swap two or more samples use:

**Warning**: This will change your BatchContainer in-place.
To swap two or more samples, use:

```{r, fig.width=6, fig.height=3.5}
bc$move_samples(src = c(1L, 2L), dst = c(2L, 1L))
@@ -146,6 +157,7 @@ To assign all samples in one go, use the option `location_assignment`.
**Warning**: This will change your BatchContainer in-place.

The example below orders samples by ID and adds the empty locations afterwards.

```{r, fig.width=6, fig.height=3.5}
bc$move_samples(
location_assignment = c(
@@ -160,11 +172,28 @@ plot_plate(bc$get_samples(remove_empty_locations = TRUE, include_id = TRUE),
)
```

## Run an optimization
# Scoring a layout

In the context of randomization, a good layout means that known independent
variables and/or covariates that may affect the dependent variable(s) are
as uncorrelated as possible with the layout. To evaluate how good a layout is,
we need a scoring function, which we assign to the `BatchContainer` object.

In this example, the scoring function `osat_score_generator` will assess how
well treatment and dose are balanced across the plates.

```{r}
bc$scoring_f <- osat_score_generator(
batch_vars = "plate",
feature_vars = c("treatment", "dose")
)
```
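The intuition behind a balance score can be illustrated with a small base R function. This is a simplified sketch, not the implementation behind `osat_score_generator`: the hypothetical `balance_score` sums the squared differences between observed and expected sample counts per plate and feature level, so lower values mean better balance.

```r
# Hypothetical helper (assumption): squared deviation of observed counts
# from the counts expected if the feature were spread evenly over batches.
balance_score <- function(batch, feature) {
  observed <- table(batch, feature)
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2)
}

# Each plate holds two samples of each treatment: perfectly balanced.
balanced <- data.frame(
  plate = rep(1:2, each = 4),
  treatment = rep(c("A", "B"), times = 4)
)

# Each plate holds only one treatment: fully confounded.
confounded <- data.frame(
  plate = rep(1:2, each = 4),
  treatment = rep(c("A", "B"), each = 4)
)

balance_score(balanced$plate, balanced$treatment)      # 0
balance_score(confounded$plate, confounded$treatment)  # 16
```

In the confounded layout, treatment cannot be separated from plate, which is exactly the situation a good layout should avoid.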

# Running an optimization

The optimization procedure is invoked with e.g. `optimize_design`.
Here we use a simple shuffling schedule:
swap 10 samples for 100 times, then swap 2 samples for 400 times.
Once we have set up an initial layout, which may be suboptimal, we can optimize it in multiple ways,
for instance by sample shuffling. The optimization procedure is invoked with a function such as `optimize_design`.
Here we use a simple shuffling schedule: swap 10 samples 100 times, then swap 2 samples 400 times.

To evaluate how good a layout is, we need a scoring function.
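Optimization by shuffling can be sketched as a simple hill climb in base R: swap two samples at random and keep the swap only if the score does not get worse. This is a simplified illustration under stated assumptions, not the algorithm used by `optimize_design`; `balance_score` and `optimize_by_swaps` are hypothetical helpers, and only pairwise swaps are shown rather than the two-stage schedule described above.

```r
# Hypothetical score (assumption): lower values mean better balance.
balance_score <- function(batch, feature) {
  observed <- table(batch, feature)
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2)
}

# Hypothetical optimizer: repeated pairwise swaps, accepting a swap
# whenever the score stays the same or improves.
optimize_by_swaps <- function(batch, feature, n_iter = 500) {
  best <- balance_score(batch, feature)
  for (i in seq_len(n_iter)) {
    idx <- sample(length(feature), 2)
    candidate <- feature
    candidate[idx] <- candidate[rev(idx)]  # swap the two positions
    score <- balance_score(batch, candidate)
    if (score <= best) {
      feature <- candidate
      best <- score
    }
  }
  list(feature = feature, score = best)
}

set.seed(1)
plate_demo <- rep(1:2, each = 10)
treatment_demo <- rep(c("A", "B"), each = 10)  # fully confounded start

start_score <- balance_score(plate_demo, treatment_demo)
result <- optimize_by_swaps(plate_demo, treatment_demo)

start_score   # 100
result$score  # never higher than start_score
```

Because swaps only exchange positions, the number of samples per treatment is unchanged; only their distribution across plates differs.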

@@ -208,15 +237,15 @@ ggplot(
facet_wrap(~plate)
```

## Customizing the plate layout
# Customizing the plate layout

To properly distinguish between empty and excluded locations one can do the
following.

* Supply the BatchContainer directly
* set `add_excluded = TRUE`, set `rename_empty = TRUE`
* supply a custom color palette
* excluded wells have NA values and can be colored with `na.value`
* Supply the BatchContainer directly;
* set `add_excluded = TRUE` and set `rename_empty = TRUE`;
* supply a custom color palette;
* excluded wells have NA values and can be colored with `na.value`.

```{r, fig.width=6, fig.height=3.5}
color_palette <- c(
Expand All @@ -232,8 +261,8 @@ plot_plate(bc,
scale_fill_manual(values = color_palette, na.value = "darkgray")
```

To remove all empty wells from the plot, hand the pruned sample list.
to plot_plate rather than the whole BatchContainer.
To remove all empty wells from the plot, hand the pruned sample list
to `plot_plate` rather than the whole `BatchContainer` object.
You can still assign your own colors.

```{r, fig.width=6, fig.height=3.5}
@@ -255,3 +284,28 @@ plate = plate, column = column, row = row,
) +
scale_fill_viridis_d()
```

# Summary

To summarize:

1. In order to randomize the layout of samples from an experiment, create an
instance of `BatchContainer` with `BatchContainer$new()`.
2. Use functions `assign_random` and `plot_plate` to assign samples randomly
and to plot the plate layout. If necessary, you can retrieve the samples from
the BatchContainer instance `bc` with the method `bc$get_samples()`, or move
samples with the method `bc$move_samples()`.
3. The scoring function of `bc` can be set by assigning to `bc$scoring_f`. Once it is set,
we can optimize the design, for instance by shuffling the samples.
4. Various options are available to further customize the design.

Now that you have had a first experience of using _designit_ for randomization,
it is time to apply what you have learned to your own work. If you need more examples or
want to understand the package in more detail, please explore its other
vignettes and check out the documentation.

# Session information

```{r sessionInfo}
sessionInfo()
```