Merge pull request #364 from Olink-Proteomics/lod_tutorial
Lod tutorial documentation
kathy-nevola committed May 30, 2024
2 parents 9c0b9be + eefad69 commit 3faa92b
Showing 5 changed files with 273 additions and 0 deletions.
21 changes: 21 additions & 0 deletions OlinkAnalyze/R/olink_lod.R
@@ -10,6 +10,27 @@
#' `Normalization = "Plate Control"`, LOD and PCNormalizedLOD are identical.
#'
#' @export
#' @examples
#' \dontrun{
#' \donttest{
#' try({ # This will fail if the files do not exist.
#'
#' # Import NPX data
#' npx_data <- read_NPX("path/to/npx_file")
#'
#' # Estimate LOD from negative controls
#' npx_data_lod_NC <- olink_lod(data = npx_data, lod_method = "NCLOD")
#'
#' # Estimate LOD from fixed LOD
#' ## Locate the fixed LOD file
#' lod_file_path <- "path/to/lod_file"
#'
#' npx_data_lod_Fixed <- olink_lod(data = npx_data,
#' lod_file_path = lod_file_path,
#' lod_method = "FixedLOD")
#' })
#' }
#' }
#'
olink_lod <- function(data, lod_file_path = NULL, lod_method = "NCLOD"){

Binary file added OlinkAnalyze/man/figures/small_logo.png
23 changes: 23 additions & 0 deletions OlinkAnalyze/man/olink_lod.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

179 changes: 179 additions & 0 deletions OlinkAnalyze/vignettes/LOD.Rmd
@@ -0,0 +1,179 @@
---
title: "Calculating LOD from Olink® Explore data"
output:
  html_vignette:
    toc: true
    toc_depth: 3
vignette: >
  %\VignetteIndexEntry{Calculating LOD from Olink® Explore data}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
date: 'Compiled: `r format(Sys.Date(), "%B %d, %Y")`'
editor_options:
  markdown:
    wrap: 72
---
![](../man/figures/small_logo.png)

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  tidy = FALSE,
  tidy.opts = list(width.cutoff = 95),
  fig.width = 6,
  fig.height = 3,
  message = FALSE,
  warning = FALSE,
  time_it = TRUE,
  fig.align = "center"
)
library(OlinkAnalyze)
library(dplyr)
```


## Introduction

This tutorial describes how to use Olink® Analyze to integrate Limit of Detection (LOD) into Olink® Explore HT and Olink® Explore 384/3072 datasets. Although it is recommended to use all Olink Explore data in downstream analyses, LOD information can be useful when performing technical evaluations of a dataset.

In this tutorial, you will learn how to use `olink_lod()` to add LOD information to your Olink Explore dataset. Note that Olink Analyze does not contain example Olink Explore HT or Olink Explore 384/3072 datasets within the package, so external data will be necessary for the code below to work. All file paths should be replaced with a path to your data and fixed LOD reference file (if applicable).


## Integrating LOD

Limit of Detection (LOD) is a metric that indicates the lowest measurable value of a protein. LOD can be helpful when performing technical evaluations of NPX™ datasets, such as calculating CVs. LOD is less important in downstream statistical analyses, because values below LOD typically converge across groups, so including data below LOD is unlikely to increase the risk of false positive discoveries. Furthermore, data below LOD can be instrumental in downstream analyses such as biomarker discovery: a protein may be well expressed in one group and not measured in another group, which makes it a strong biomarker candidate for that group.

LOD can be added to Olink Explore NPX datasets using `olink_lod()`. This function can calculate LOD from an NPX dataset using the dataset's negative controls or a list of predetermined fixed LOD values (supplied by Olink Support upon request). As the default setting, `olink_lod()` will calculate LOD using a dataset's negative controls.

Olink Explore data is commonly delivered either plate control (PC) normalized or intensity normalized; the normalization type is indicated in the Normalization column of the NPX file. Intensity normalization assumes that the analyzed samples are randomized. NPX values are reported in two columns, PCNormalizedNPX and NPX. For PC normalized datasets the content of these two columns is identical, while for intensity normalized datasets the NPX column contains the intensity normalized values. Similarly, the `olink_lod()` function adds two columns to your dataset: PCNormalizedLOD and LOD. For a PC normalized dataset the content of these two columns is identical, while for an intensity normalized dataset the LOD column contains LOD values based on the intensity normalized NPX values. Examples of results for plate control and intensity normalization are shown in the tables below.

```{r, echo=FALSE}
set.seed(1234)
table1 <- npx_data1 |>
  head() |>
  dplyr::select(-c(Index, MissingFreq, Panel_Version, QC_Warning, Subject, Treatment, Site, Time, Project, Panel, PlateID)) |>
  dplyr::mutate(Count = round(NPX * (100+sample(seq(-5,15), size = 1)))) |>
  dplyr::mutate(SampleType = "SAMPLE") |>
  dplyr::mutate(Normalization = "Plate control") |>
  dplyr::mutate(NPX = round(NPX,digits = 2)) |>
  dplyr::mutate(LOD = round(LOD, digits = 2)) |>
  dplyr::mutate(PCNormalizedNPX = NPX) |>
  dplyr::mutate(PCNormalizedLOD = LOD) |>
  dplyr::select(SampleID, SampleType, OlinkID, UniProt, Assay, Count, NPX, Normalization, PCNormalizedNPX, LOD, PCNormalizedLOD)
table1 |>
  knitr::kable(caption = "Example results from Plate Control Normalized Project") |>
  kableExtra::kable_styling(font_size = 10)
table1 |>
  dplyr::mutate(Normalization = "Intensity") |>
  dplyr::mutate(NPX = round(NPX,digits = 2)) |>
  dplyr::mutate(LOD = round(LOD, digits = 2)) |>
  dplyr::mutate(PCNormalizedNPX = round(NPX + 4.16, digits = 2))|>
  dplyr::mutate(PCNormalizedLOD = round(LOD + 4.16, digits = 2)) |>
  dplyr::select(SampleID, SampleType, OlinkID, UniProt, Assay, Count, NPX, Normalization, PCNormalizedNPX, LOD, PCNormalizedLOD) |>
  knitr::kable(caption = "Example results from Intensity Normalized Project") |>
  kableExtra::kable_styling(font_size = 10)
```
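
As a rough illustration of how the two LOD columns relate, the sketch below builds a small toy data frame and checks, per normalization type, whether LOD and PCNormalizedLOD are identical. The values are invented (they are not real LOD values), and the object names `lod_data` and `identical_by_norm` are hypothetical.

```r
# Toy data: invented values, using the same 4.16 offset as the example tables.
lod_data <- data.frame(
  Normalization   = c("Plate control", "Plate control", "Intensity", "Intensity"),
  LOD             = c(1.20, 0.85, -2.96, -3.31),
  PCNormalizedLOD = c(1.20, 0.85, 1.20, 0.85)
)

# For each normalization type, are LOD and PCNormalizedLOD identical in all rows?
identical_by_norm <- tapply(
  lod_data$LOD == lod_data$PCNormalizedLOD,
  lod_data$Normalization,
  all
)
print(identical_by_norm)
```

For the plate control rows the check is expected to be `TRUE`, while for the intensity normalized rows the two columns differ.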


## Import Olink Explore datasets

Olink Explore datasets are standard Olink Explore HT and Olink Explore 384/3072 NPX tables. The `read_NPX()` function can be used to import an NPX file in parquet format as generated by Olink® NPX Explore Software. More information on using `read_NPX()` can be found in [the Olink Analyze Overview tutorial](Vignett.html).


```{r dataset_generation, eval = FALSE, message=FALSE, warning=FALSE}
explore_npx <- read_NPX("~/Explore_NPX_file.parquet")
```


## Integrating Negative Control LOD

The negative control (NC) LOD method requires at least 10 negative controls in a dataset. Negative control data is available in the standard exported Explore HT and Explore 384/3072 NPX parquet files. NCs can be identified through the SampleID and SampleType columns.

A negative control will not contribute to the minimum number of required NCs if either of the following applies:

+ The negative control contains an assay QC warning across all assays, excluding Olink's internal control assays
+ The negative control does not pass sample QC criteria (sample QC failure or warning) in all of the data (i.e. all Explore HT blocks, all Explore 3072 panels, or all Explore 384 panels that were measured)
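
The two exclusion rules above can be sketched on toy data as follows. This is only an illustration, not the code used inside `olink_lod()`: the column values (`"NEGATIVE_CONTROL"`, `"assay"`, `"PASS"`, `"WARN"`, `"FAIL"`) and the object names are assumptions made for the example.

```r
# Toy long-format data: 3 negative controls measured on 2 assays each.
# All column values below are assumptions for illustration.
nc <- data.frame(
  SampleID   = rep(c("NC1", "NC2", "NC3"), each = 2),
  SampleType = "NEGATIVE_CONTROL",
  AssayType  = "assay",
  AssayQC    = c("PASS", "PASS", "WARN", "WARN", "PASS", "WARN"),
  SampleQC   = c("PASS", "PASS", "PASS", "PASS", "FAIL", "FAIL"),
  stringsAsFactors = FALSE
)

# Keep negative controls only and drop internal control assays.
nc <- nc[nc$SampleType == "NEGATIVE_CONTROL" & nc$AssayType == "assay", ]

# Rule 1: the NC has an assay QC warning across all assays.
all_assay_warn <- tapply(nc$AssayQC != "PASS", nc$SampleID, all)
# Rule 2: the NC does not pass sample QC anywhere in the data.
all_sample_fail <- tapply(nc$SampleQC != "PASS", nc$SampleID, all)

# An NC counts towards the minimum only if neither rule applies.
eligible <- !(all_assay_warn | all_sample_fail)
n_eligible <- sum(eligible)
enough_ncs <- n_eligible >= 10
print(eligible)
```

In this toy example only `NC1` counts towards the minimum, so the NC LOD method would not be applicable.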

Negative controls are used to calculate LOD from either PC normalized NPX or counts. For assays with more than 150 counts in at least one of the negative controls, LOD is calculated as the median PC normalized NPX of the negative controls plus either 3 standard deviations or 0.2 NPX, whichever is larger. For assays with fewer than 150 counts in all negative controls, LOD is calculated from the count values and then converted into PC normalized NPX.
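
A minimal sketch of the NPX-based rule, assuming it reads as "median plus the larger of 3 standard deviations and 0.2 NPX". The helper `nc_lod_npx()` and the input values are hypothetical and are not part of Olink Analyze; the count-based branch is not shown because its conversion to NPX is not described here.

```r
# Hypothetical helper: LOD for one assay from the PC normalized NPX values
# of its negative controls (NPX-based branch only).
nc_lod_npx <- function(pc_norm_npx) {
  median(pc_norm_npx) + max(3 * sd(pc_norm_npx), 0.2)
}

# Invented PC normalized NPX values of 10 negative controls for one assay.
nc_values <- c(-0.50, -0.30, -0.40, -0.60, -0.20,
               -0.45, -0.35, -0.55, -0.25, -0.40)
print(nc_lod_npx(nc_values))

# With no variation between negative controls the 0.2 NPX floor applies.
print(nc_lod_npx(rep(1, 10)))
```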


The resulting LOD is the PC normalized negative control LOD. If the Explore dataset is intensity normalized, an intensity normalization adjustment factor is applied; the resulting intensity normalized LOD is reported in the LOD column, and the PC normalized LOD is reported in the PCNormalizedLOD column.


```{r NCLOD_example, eval = FALSE, message=FALSE, warning=FALSE}
# Integrating negative control LOD for intensity normalized data
explore_npx <- read_NPX("Path_to/Explore_NPX_file.parquet")
olink_lod(explore_npx, lod_method = "NCLOD")
```

## Integrating Fixed LOD

The fixed LOD method uses fixed LOD values that have been calculated on negative controls used in Olink reference runs using the method described above for negative control LOD. These values are specific to the Data Analysis Reference ID, which can be found in your dataset. The fixed LOD data is available in an external CSV file which can be provided by Olink Support (support\@olink.com). The fixed LOD values reported in this CSV file are the PC normalized LODs.

The fixed LOD file is read by the `olink_lod()` function and integrated into an Explore dataset. If the Explore dataset is intensity normalized, an intensity normalization adjustment factor is applied; the resulting intensity normalized LOD is reported in the LOD column, and the PC normalized LOD is reported in the PCNormalizedLOD column.

```{r FixedLOD, eval = FALSE, message=FALSE, warning=FALSE}
# Storing the path to the fixed LOD file
fixedLOD_filepath <- "Path_to/ExploreHT_fixedLOD.csv"
# Integrating Fixed LOD for intensity normalized data
explore_npx <- read_NPX("~/Explore_NPX_file.parquet")
olink_lod(explore_npx, lod_file_path = fixedLOD_filepath, lod_method = "FixedLOD")
```

## When to use Fixed LOD vs NC LOD
For smaller studies (fewer than 10 NCs), we recommend using fixed LOD to integrate LOD values into your NPX dataset, as LOD calculations based on fewer NCs may yield inaccurate values. However, keep in mind that fixed LOD values are not specific to your project; they are generated by Olink when a new lot of reagents is released.

For larger projects we recommend calculating LOD from NC to obtain LOD values that are specific to your project. However, this requires that the dataset has at least 10 NCs with passing SampleQC.
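
The recommendation above could be expressed as a small decision helper. `choose_lod_method()` is a hypothetical function written for this tutorial, not part of Olink Analyze; it only returns the string to pass to the `lod_method` argument, and it assumes you have already counted the negative controls that pass QC.

```r
# Hypothetical helper: pick the LOD method from the number of eligible NCs.
choose_lod_method <- function(n_eligible_nc, min_nc = 10) {
  if (n_eligible_nc >= min_nc) "NCLOD" else "FixedLOD"
}

print(choose_lod_method(4))   # small study
print(choose_lod_method(16))  # larger project
```

Remember that the fixed LOD method additionally needs the fixed LOD file supplied through `lod_file_path`.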

## Adjusting LOD for Intensity Normalized Data

If an Olink Explore dataset is intensity normalized, a normalization adjustment factor is applied to the PC normalized LOD within the `olink_lod()` function.

For each assay, this adjustment factor is calculated as the median NPX of all samples (excluding Olink's external controls) within each plate. For Olink Explore 3072, overlapping assays are assessed separately, within their respective panels. The intensity normalized negative control LOD is calculated by subtracting this adjustment factor from the PC normalized negative control LOD.

The intensity normalization LOD adjustment is applied to both the negative control and fixed LOD methods.
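
The adjustment can be sketched on toy data for a single assay measured on two plates. This is an illustration under assumptions, not the package implementation: the values are invented, and the sketch assumes the median is taken over the PC normalized NPX values of the study samples.

```r
# Toy PC normalized NPX of study samples for one assay on two plates.
samples <- data.frame(
  PlateID         = rep(c("P1", "P2"), each = 3),
  PCNormalizedNPX = c(3.0, 4.0, 5.0, 1.0, 2.0, 6.0)
)

# Adjustment factor: median of the samples within each plate.
adj_factor <- tapply(samples$PCNormalizedNPX, samples$PlateID, median)

# Invented PC normalized LOD for this assay.
pc_normalized_lod <- 1.5

# Intensity normalized LOD: PC normalized LOD minus the plate adjustment factor.
intensity_lod <- pc_normalized_lod - adj_factor
print(intensity_lod)
```

Because the adjustment factor is plate specific, the same assay can end up with a different intensity normalized LOD on each plate.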

## Export Olink Explore Data with LOD
Olink Explore data with LOD information can be exported as a parquet file in long format using `arrow::write_parquet()`.

```{r explore_npx_export, eval = FALSE, message=FALSE, warning=FALSE}
# Exporting Olink Explore data with LOD information as a parquet file
explore_npx <- read_NPX("Path_to/Explore_NPX_file.parquet")
explore_npx_NC_LOD <- explore_npx %>%
  olink_lod(lod_method = "NCLOD") %>%
  # write_parquet() takes the output path in `sink` and returns its input invisibly
  arrow::write_parquet(sink = "NPX_data_NC_LOD.parquet")
```


## Contact Us

We are always happy to help. Email us with any questions:

- biostat\@olink.com for statistical services and general stats questions


- support\@olink.com for Olink lab product and technical support

- info\@olink.com for more information

## Legal Disclaimer

© 2024 Olink Proteomics AB.

Olink products and services are For Research Use Only and not for Use in Diagnostic Procedures.

All information in this document is subject to change without notice. This document is not intended to convey any warranties, representations and/or recommendations of any kind, unless such warranties, representations and/or recommendations are explicitly stated.

Olink assumes no liability arising from a prospective reader’s actions based on this document.

OLINK, NPX, PEA, PROXIMITY EXTENSION, INSIGHT and the Olink logotype are trademarks registered, or pending registration, by Olink Proteomics AB. All third-party trademarks are the property of their respective owners.

Olink products and assay methods are covered by several patents and patent applications https://www.olink.com/patents/.
50 changes: 50 additions & 0 deletions OlinkAnalyze/vignettes/Vignett.Rmd
@@ -262,6 +262,56 @@ A tibble of NPX data in long format containing normalized NPX values, including
* Project: Name given from the dataframe of origin.
* Adj_factor: Adjustment factor, i.e. how much was added to or subtracted from the original NPX value.

## Integrating Explore NPX LOD (olink_lod)

The olink_lod function adds LOD information to an Explore HT or Explore 3072 NPX data frame. It can incorporate LOD based on either an Explore dataset’s negative controls or predetermined fixed LOD values, which Olink’s Support team can provide in an external file upon request. The default LOD calculation method is based on the negative controls. If an NPX file is intensity normalized, both intensity normalized and PC normalized LODs are provided.

### Function arguments

+ data: Explore HT or Explore 3072 tibble/data frame, such as one produced by the read_NPX function from a parquet file.
+ lod_file_path: Path to fixed LOD file provided by Olink Support. This is only applicable if using the fixed LOD method.
+ lod_method: String of LOD method name. Must be either "NCLOD" (default) or "FixedLOD".

```{r NCLOD_example, eval = F, echo = T}
# Integrating negative control LOD into Explore NPX dataset
explore_npx <- read_NPX("~/Explore_NPX_file.parquet")
olink_lod(explore_npx, lod_method = "NCLOD")
# Integrating fixed LOD into Explore NPX dataset - note that these are NOT real fixed LOD values
fixedLOD_filepath <- "~/ExploreHT_fixedLOD.csv"
explore_npx <- read_NPX("~/Explore_NPX_file.parquet")
olink_lod(explore_npx, lod_file_path = fixedLOD_filepath, lod_method = "FixedLOD")
```

### Function output

A tibble with the following columns:

+ **SampleID** _\<chr\>_: Sample names or IDs.
+ **SampleType** _\<chr\>_: Sample type. Indicates whether a sample is a study sample or a type of Olink control sample.
+ **WellID** _\<chr\>_: Well location in the plate.
+ **PlateID** _\<chr\>_: Name of the plate.
+ **DataAnalysisRefID** _\<chr\>_: Version of the panel (Explore 3072) or block (Explore HT). A new panel or block version might include some different or improved assays.
+ **OlinkID** _\<chr\>_: Unique ID for each assay assigned by Olink. If the assay is included in more than one panel, it will have a different OlinkID in each one.
+ **UniProt** _\<chr\>_: UniProt ID.
+ **Assay** _\<chr\>_: Common gene name for the assay.
+ **AssayType** _\<chr\>_: Assay type. Indicates whether an assay is a panel assay or an Olink control assay.
+ **Panel** _\<chr\>_: Olink Panel that samples ran on. Read more about Olink Panels here: https://olink.com/products-services/.
+ **Block** _\<chr\>_: Olink Block that samples ran on.
+ **Count** _\<int\>_: Raw counts generated during sequencing.
+ **ExtNPX** _\<num\>_: Extension normalized NPX value that is used in NPX calculation. Read more about ExtNPX here: https://olink.com/faq/how-is-the-npx-value-calculated-in-explore/
+ **NPX** _\<num\>_: Normalized Protein eXpression, Olink’s unit of protein expression level on a log2 scale. The majority of the functions of this package use NPX values for calculations. If Explore data is PC normalized, NPX reflects the PC normalized value. If Explore data is intensity normalized, NPX reflects the intensity normalized value. Read more about NPX here: https://olink.com/faq/what-is-npx/.
+ **Normalization** _\<chr\>_: The normalization method used.
+ **PCNormalizedNPX** _\<num\>_: Normalized Protein eXpression, Olink’s unit of protein expression level on a log2 scale. Regardless of normalization method, this column always reflects PC normalized NPX values. Read more about NPX here: https://olink.com/faq/what-is-npx/.
+ **AssayQC** _\<chr\>_: Indicates whether an assay contains a QC warning.
+ **SampleQC** _\<chr\>_: Indicates whether a sample contains a QC warning within a block.
+ **ExploreVersion** _\<chr\>_: The version of the Explore software library that was used to produce the data file.
+ **LOD** _\<num\>_: LOD added by olink_lod function. If Explore dataset is intensity normalized, this will reflect the intensity normalized LOD. If Explore dataset is PC normalized, this will reflect the PC normalized LOD.
+ **PCNormalizedLOD** _\<num\>_: LOD added by olink_lod function. This will always reflect the PC normalized LOD, regardless of the normalization method applied to the Explore dataset.
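
Once the LOD column is present, one possible technical check is to compare NPX against LOD per assay. The sketch below uses invented values and hypothetical object names (`explore_lod`, `share_below`); it is only an illustration of how the output columns might be used, not a recommended filtering step.

```r
# Toy output with NPX and LOD in matching normalization (invented values).
explore_lod <- data.frame(
  Assay = c("A", "A", "A", "B", "B", "B"),
  NPX   = c(2.1, 0.4, 1.8, -0.2, 0.1, 0.3),
  LOD   = c(0.5, 0.5, 0.5, 0.2, 0.2, 0.2)
)

# Flag measurements below LOD and summarise the share per assay.
explore_lod$BelowLOD <- explore_lod$NPX < explore_lod$LOD
share_below <- tapply(explore_lod$BelowLOD, explore_lod$Assay, mean)
print(share_below)
```

Note that NPX should be compared with LOD and PCNormalizedNPX with PCNormalizedLOD, so that both values use the same normalization.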


# Statistical analysis

## T-test analysis (olink_ttest)