miRBooking

Implementation of the miRBooking algorithm and metrics in C

fast and memory efficient
usable from Python, JavaScript and Vala via GObject introspection
memory-mapped score tables, target and miRNAs FASTA for low memory footprint in parallel execution
binary with support for static linking for more portability
stdin/stdout for piping from and into other tools

Usage

mirbooking --targets targets.fa
           --mirnas mirnas.fa
           --seed-scores scores-7mer-3mismatch-ending
           [--accessibility-scores accessibility-scores[.gz]]
           [--supplementary-model none]
           [--supplementary-scores scores-3mer]
           [--input stdin]
           [--output stdout]
           [--output-format tsv]
           [--sparse-solver best-available]
           [--max-iterations 100]
           [--5prime-footprint 9]
           [--3prime-footprint 7]
           [--cutoff 100]
           [--relative-cutoff 0]
           [--blacklist blacklist.tsv]

To obtain detailed usage and options, launch mirbooking --help.

The command line program expects a number of inputs:

--targets, a FASTA containing RNA transcripts where the identifier is the accession with support for alternative flavours from NCBI RefSeq and GenBank via --ncbi-targets, and GENCODE via --gencode-targets
--mirnas, a FASTA containing mature miRNAs where the identifier is the accession with support for alternative flavour from miRBase via --mirbase-mirnas
--seed-scores, a sparse score table of seed free energies which can be generated using generate-score-table program described below
--accessibilitiy-scores contains entries with position-wise free energy contribution (or penalty) on the targets
--supplementary-scores contains either 4mer or 3mer
--input, a quantity file mapping target and miRNA accessions to expressed quantity in picomolars units

Tables for seed and supplementary scores are provided in the data folder. These were computed with RNAcofold binding energy from ViennaRNA package.

Note that Yan et al. (--supplementary-model=yan-et-al-2018) model requires a 3mer table whereas Zamore et al. (--supplementary-model=zamore-et-al-2012) require a 4mer table.

Tables for seed and supplementary bindings are automatically located (new in 2.3).

The --cutoff parameter can exploit a known upper bound on the complex concentration to adjust the granularity of the model. Only interaction that can ideally reach the specified picomolar concentration will be modeled.

The --relative-cutoff parameter is similar, but instead filter based on the ideal substrate bound fraction.

The output is a TSV with the following columns:

Column	Description
gene_accession	Gene accession with version (new in 2.3)
gene_name	Name of the gene or N/A if unknown (new in 2.3)
target_accession	Target accession with version
target_name	Name of the target or N/A if unknown
target_quantity	Total target concentration in picomolars
position	Site position on the target
mirna_accession	miRNA accession
mirna_name	Name of the miRNA or N/A if unknown
mirna_quantity	Total miRNA concentration in picomolars
score	Michalis-Menten constant of the miRNA::MRE duplex
quantity	miRNA::MRE duplex concentration this target position in picomolars

The detailed TSV output which expands the score structure in its constituents can be used with --output-format=tsv-detailed (new in 2.3). In this mode, the score column is replaced by kf, kr, kcleave, krelease, kcat, kother, kd and km.

The GFF3 output can be used with --output-format=gff3. The score will indicate the bound fraction of the position.

Wiggle output can also be produced with --output-format=wig. The score will be the position-wise bound fraction of substrate which properly account for overlapping microRNA.

The --blacklist parameter indicates a file that contains interactions that the model should ignore. This is particularly useful if you know beforehand they will be too weak at equilibrium to be worth modeling. The format is a three column TSV containing only the columns target_accession, position and mirna_accession from the output format.

Installation

You'll need Meson and Ninja as well as GLib development files installed on your system.

mkdir build && cd build
meson --buildtype=release
ninja
ninja install

To generate fast code, configure with meson -Doptimization=3.

You can perform a local installation using meson --prefix=$HOME/.local, but you'll need LD_LIBRARY_PATH set accordingly since the mirbooking program uses a shared library. Otherwise, a static linkage can be done by calling meson --default-library=static.

To generate introspection metadata, use meson -Dwith_introspection=true. To generate Vala bindings, use meson -Dwith_vapi=true.

CBLAS is required and you can alternatively opt for ATLAS with -Dwith_atlas=true or OpenBLAS -Dwith_openblas=true implementations instead of the default netlib CBLAS. If configured with -Dwith_mkl=true, MKL CBLAS will be used instead. The OpenMP flavour of OpenBLAS is used when configured with -Dwith_openmp=true.

FFTW can be optionally used to compute more accurate silencing by specifying meson -Dwith_fftw3=true. If you redistribute miRBooking source code, be careful not to enable this as a default because of the GPL license covering this dependency. If you have access to Intel MKL, you can alternatively use its FFTW3 implementation with -Dwith_mkl_fftw3=true.

OpenMP can be optionally used to parallelize the evaluation of partial derivatives and some supported solvers by specifying -Dwith_openmp=true.

MPI can be optionally used to distribute the computation across multiple machine on supported solvers (i.e. mkl-cluster) by specifying -Dwith_mpi=true.

Solver	Build Options
LAPACK	No option since this is the fallback solver.
SuperLU	`-Dwith_superlu=true`
SuperLU MT	`-Dwith_superlu_mt=true`
UMFPACK	`-Dwith_umfpack=true`
cuSOLVER	`-Dwith_cuda=<cuda_toolkit_api_version> -Dwith_cusolver=true`
MKL DSS	`-Dwith_mkl=true -Dmkl_root=<path to mkl> -Dwith_mkl_dss=true`
MKL Cluster	`-Dwith_mpi=true -Dwith_mkl=true -Dmkl_root=<path to mkl> -Dwith_mkl_cluster=true`
MKL LAPACK	`-Dwith_mkl=true -Dmkl_root=<path to mkl> -Dwith_mkl_lapack=true`
PARDISO	`-Dwith_pardiso=true`

LAPACK is not a sparse linear solver and thus will not handle typical workload very well, but it will perform orders of magnitude faster on dense jacobians.

cuSOLVER require CUDA toolkit whose API version is to be specified with -Dwith_cuda=<cuda_toolkit_api_version>.

MKL DSS and MKL Cluster can benefit from TBB instead of OpenMP, which can be enabled with -Dwith_mkl_tbb=true.

MKL DSS and MKL Cluster can be used with the 64 bit interface, allowing much larger systems to be solved with -Dwith_mkl_ilp64=true. However, this will break other solvers as it will load a 64 bit BLAS.

PARDISO cannot be used along with MKL DSS because they define common symbols.

By default, the best sparse solver available among the following will be used (new in 2.3):

MKL-DSS
PARDISO
UMFPACK
SuperLU
LAPACK

Numerical integration

In addition to determine the steady state, miRBooking can also perform numerical integration of the microtargetome using the programming API.

Other tools

In addition to the mirbooking binary, this package ship a number of utilities.

Te generate-score-table compute a hybridization energy table for a given seed mask. Either ViennaRNA or mcff is required to compute energies.

generate-score-table [--method=RNAcofold]
                     [--temperature=310.5]
                     [--mask=||||...]
                     [--hard-mask=||||...]
                      --output scores

The seed mask defines folding constraints on the target with | for a canonical match, x for a canonical mismatch and . for no constraint. It also determines the seed length. If a hard mask is provided, unsatisfying interactions are filtered out (new in 2.3).

It's also possible to ajust the folding temperature (new in 2.3).

The number of workers can be tuned by setting OMP_NUM_THREADS environment variable.

The mirbooking-iterative tool is a wrapper script around miRBooking which takes advantage of the --blacklist flag by solving the equilibrium gradually and excluding weak interactions in subsequent models.

It takes the same arguments as mirbooking with the slight distinction that the --cutoff now indicates the target cutoff.

C API

The API is conform to the GLib style and enable a wide range of use. It is fairly easy to use and a typical experimentation session is:

create a broker via mirbooking_broker_new
create some sequence objects with mirbooking_target_new and mirbooking_mirna_new
setup quantities via mirbooking_broker_set_sequence_quantity
call mirbooking_broker_evaluate and mirbooking_broker_step repeatedly to perform a full hybridization or numerical integration
retrieve and inspect the microtargetome with mirbooking_broker_get_target_sites

For a more detailed usage and code example, the main program source in bin/mirbooking.c is very explicit as it perform a full session and fully output the target sites.

Cite this software

Poirier-Morency, G. Modélisation des réseaux de régulation de l’expression des gènes par les microARN. (Université de Montréal, 2021). https://doi.org/1866/25104

Name		Name	Last commit message	Last commit date
Latest commit History 555 Commits
.github/workflows		.github/workflows
bin		bin
contrib		contrib
data		data
src		src
tests		tests
tools		tools
.gitattributes		.gitattributes
LICENSE		LICENSE
README.md		README.md
meson.build		meson.build
meson_options.txt		meson_options.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

miRBooking

Usage

Installation

Numerical integration

Other tools

C API

Cite this software

About

Releases 1

Packages

Languages

License

major-lab/mirbooking

Folders and files

Latest commit

History

Repository files navigation

miRBooking

Usage

Installation

Numerical integration

Other tools

C API

Cite this software

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages