fcomitani/tapir


Transcriptome Analysis in Python Imported from R

v 0.1


tapir is a Python 3 package for the analysis of gene expression data. It includes functions for statistical analysis, differential expression and gene set enrichment analysis.

WARNING: This library is still in active development and we hope to add more options in the future. Please feel free to leave feedback or suggestions, or to contribute to this repository.

This library includes

  • TMM normalization with edgeR
  • differential expression analysis with edgeR
  • gene set enrichment analysis with gseapy
  • survival analysis with lifelines
  • immune deconvolution with MCPcounter
  • dimensionality reduction with UMAP
  • plotting functions for distribution comparisons, heatmaps and gene set networks.

Detailed documentation, API references and tutorials can be found at this link.

Dependencies

Besides basic scientific and plotting libraries, the current version requires

- gseapy
- lifelines
- rpy2
- seaborn
- scikit-learn
- statsmodels
- umap-learn

**R, edgeR and MCPcounter need to be installed independently.**
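
Since the R side has to be set up separately, it can help to check the environment before running an analysis. The following is a minimal sketch, not part of tapir: the helper names `check_python_modules` and `check_r_packages` are hypothetical, the import names `sklearn` and `umap` are the usual ones for scikit-learn and umap-learn, and the R check assumes that `Rscript` is on the `PATH`.

```python
import importlib.util
import shutil
import subprocess

# Import names for the Python dependencies listed above
# (scikit-learn imports as "sklearn", umap-learn as "umap").
PYTHON_MODULES = ["gseapy", "lifelines", "rpy2", "seaborn",
                  "sklearn", "statsmodels", "umap"]
R_PACKAGES = ["edgeR", "MCPcounter"]


def check_python_modules(modules=PYTHON_MODULES):
    """Return {module name: True if it can be found by the import system}."""
    found = {}
    for name in modules:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found[name] = False
    return found


def check_r_packages(packages=R_PACKAGES):
    """Return {R package name: True if R reports it as installed}.

    All entries are False when Rscript cannot be found on the PATH.
    """
    rscript = shutil.which("Rscript")
    if rscript is None:
        return {name: False for name in packages}
    found = {}
    for name in packages:
        # Exit with status 0 if the package namespace can be loaded, 1 otherwise.
        expr = (f'quit(status = if (requireNamespace("{name}", '
                f'quietly = TRUE)) 0 else 1)')
        try:
            result = subprocess.run([rscript, "-e", expr],
                                    capture_output=True, timeout=60)
            found[name] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            found[name] = False
    return found


if __name__ == "__main__":
    for kind, status in (("python", check_python_modules()),
                         ("R", check_r_packages())):
        for name, ok in status.items():
            print(f"{kind:>6}  {name:<12} {'found' if ok else 'MISSING'}")
```

Anything reported as missing on the Python side can be installed with pip; missing R packages have to be installed from within R.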

Installation

tapir releases can be installed with pip, the standard Python package manager:

pip install tapir-rna

To install the latest (unreleased) version you can download it from this repository by running

git clone https://github.com/fcomitani/tapir
cd tapir
python setup.py install
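
To confirm that the installation worked, the installed version can be queried from Python. This is a small sketch using only the standard library; `installed_version` is a hypothetical helper, and it assumes the distribution is registered under the name `tapir-rna` used in the pip command above (a source install may or may not use the same name).

```python
from importlib import metadata


def installed_version(dist_name="tapir-rna"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("tapir-rna does not appear to be installed in this environment")
    else:
        print(f"tapir-rna {version} is installed")
```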

Basic usage

Given an input dataset in pandas-like format (samples × genes), the build_dgelist and diff_exp functions normalize the samples with TMM and fit a glmQL (quasi-likelihood GLM) model to test for differential expression.

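from tapir.edger import build_dgelist, diff_exp

# TMM normalization; input_table is a pandas-like table (samples x genes)
dgelist, tmmlog = build_dgelist(input_table)
# glmQL fit and differential expression test between the given groups
de              = diff_exp(dgelist, groups, filter=True)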
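
The snippet above assumes that `input_table` and `groups` already exist. As an illustration only, a toy input could be built as follows. The gene and sample names are invented, the counts are random, and representing `groups` as a pandas Series with one label per sample is an assumption; the documentation should be consulted for the format that `diff_exp` actually expects.

```python
import numpy as np
import pandas as pd

# Toy count matrix: 6 samples (rows) x 4 genes (columns), random values.
rng = np.random.default_rng(0)
samples = [f"sample_{i}" for i in range(6)]
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
input_table = pd.DataFrame(
    rng.poisson(lam=50, size=(len(samples), len(genes))),
    index=samples,
    columns=genes,
)

# Hypothetical group labels, one per sample, in the same order as the rows.
groups = pd.Series(["control"] * 3 + ["treated"] * 3,
                   index=samples, name="group")

# Basic sanity checks before handing the data to any downstream analysis.
assert input_table.index.equals(groups.index)
assert (input_table.to_numpy() >= 0).all()

# With R, edgeR and tapir installed, these objects would then be passed on
# as in the snippet above:
#   dgelist, tmmlog = build_dgelist(input_table)
#   de = diff_exp(dgelist, groups, filter=True)
```

Keeping the sample identifiers as the index of both objects makes it easy to check that labels and rows are aligned before fitting a model.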

Contact us

  • federico.comitani at sickkids.ca
  • josh.nash at sickkids.ca

Contributions

This library is still a work in progress and we are striving to improve it by adding more flexibility and increasing the memory and time efficiency of the code. If you would like to be part of this effort, please fork the master branch and work from there.

Contributions are always welcome.