From ceb0b75516bbbd12f73291628e891b9a63424e76 Mon Sep 17 00:00:00 2001 From: Daniel Bolin Date: Wed, 13 Dec 2023 17:05:26 -0500 Subject: [PATCH] Update readme --- README.md | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 92 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 99768db..77c93e8 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,93 @@ # hra-workflows -HRA workflows for processing experimental data +HRA workflows for annotating h5ad using different tools. + +## Table of Content +- [Requirements](#requirements) +- [Building docker images](#building-the-docker-images-locally) +- [Running the annotation tools](#running-the-annotation-tools) + - [Downloading models](#1-download-annotation-tool-models) + - [Creating jobs](#2-create-a-job-file) + - [Running jobs](#3-running-the-job) +- [Adding custom annotation tools](#adding-a-new-annotation-tool) + - [Folder structure](#1-tool-folder-structure) + - [Updating top level pipeline](#2-updating-top-level-steps) + + +## Requirements +- [cwl-runner](https://github.com/common-workflow-language/cwltool) +- Docker + +## Building the docker images locally +Docker images can be build locally by running `./scripts/build-containers.sh`. By default the script will build all containers when run. To build individual images or a set of images provide the container names as arguments to the build script, ex. `./scripts/build-containers.sh azimuth gene-expression`. + +## Running the annotation tools + +### 1. Download annotation tool models +Download model data by running `cwl-runner download-models.cwl`. + +### 2. Create a job file +The first step is to create a job file that will specify inputs to the pipeline. The file can be written as either a json or yaml file. + +#### Example job.yml running Azimuth +```yaml +matrix: + class: File + path: path/to/data.h5ad +organ: UBERON:0002048 # Uberon id for lung +algorithms: + # Algorithm specific options are documented in the container's options.yml + - azimuth: + referenceDataDir: + class: Directory + path: path/to/models/directory +``` + +### 3. Running the job +After creating a job file running the annotation tools is as simple as running `cwl-runner pipeline.cwl my-job.yml` (replace `my-job.yml` with your job file). + +## Adding a new annotation tool + +### 1. Tool folder structure +An annotation tool generally has the following file structure: +``` +containers/ + my-annotation-tool/ + Dockerfile + options.yml + pipeline.cwl + download-data.cwl (optional) + context/* + code and assets... +``` + +Where each file should perform the following function: +- `Dockerfile` + - Instructions for building a docker image. +- `options.yml` + - Cwl definition of tool specific options. +- `pipeline.cwl` + - Main cwl pipeline for running the tool. + - 3 inputs: "matrix", "organ", and "options" + - 3 outputs: "annotations", "annotated_matrix", and "report". +- `download-data.cwl` (optional) + - Download models and other data required for running the tool. + - Implement this pipeline when the model data is to large to embed directly in the docker image. +- `context/*` + - Directory containing the code and assets implementing the tool. + +### 2. Updating top level steps +After implementing a new algorithm a few changes have to be made to enable the tool from the main pipeline. The files that have to be updated are: `pipeline.cwl`, `./steps/annotate.cwl`, and `./steps/run-one.cwl`. After adding the new tool to the top level pipeline it can be used by specifying the tool in a job file. + +#### Example job for a new annotation tool +```yaml +matrix: + class: File + path: path/to/data.h5ad +organ: UBERON:0002048 # Uberon id for lung +algorithms: + - my-annotation-tool: + # Options specific to my-annotation-tool + option1: value1 + option2: value2 + ... +```