Can read2tree be run offline and with custom marker gene list? #22

Open
masudermann opened this issue Apr 26, 2023 · 3 comments

Comments

@masudermann

Hello,

I tried out read2tree and was impressed. I have a quick follow-up question: is it possible for users to input their own custom list of marker genes?

I work with several different Phytophthora species. When I use the OMA browser, I can only obtain marker genes from 7 species. Being able to provide my own set of genes would be advantageous.

From my quick look at the paper, the documentation, and other questions people had, it looks like there isn't currently a way to run read2tree offline or with a custom set of marker genes. If this is the case, will a future update incorporate this option?

@alpae
Member

alpae commented Apr 26, 2023

Dear @masudermann

It is possible, but not entirely straightforward. You need to provide:

  1. A list of marker genes in FASTA format, given as protein sequences. Each sequence is expected to carry the species it belongs to, encoded as a [species tag] at the end of the FASTA header. There must be at most one sequence per species in each marker gene, and the sequences need to all be orthologous to one another.
  2. A FASTA file with the same headers containing all the coding sequences (CDS) corresponding to the protein sequences. You can provide all the CDS in a single FASTA file.

Then, you should be able to run read2tree with the command:

read2tree --tree --standalone_path <marker_genes>  --dna_reference <cds_file> ...
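
Before running the command, the two requirements above can be checked with a small script. The following is only a sketch and not part of read2tree: the function names `read_headers` and `check_marker` are made up for illustration, the species tag is assumed to be any bracketed text closing the header line, and "same headers" is interpreted as an exact match of the full header line. It does not test orthology.

```python
import re
from collections import Counter

# Assumption: the species tag is any bracketed text closing the header line.
SPECIES_TAG = re.compile(r"\[([^\[\]]+)\]\s*$")


def read_headers(fasta_text):
    """Return the header lines (without the leading '>') of a FASTA string."""
    return [line[1:].strip() for line in fasta_text.splitlines()
            if line.startswith(">")]


def check_marker(marker_fasta, cds_headers):
    """Return a list of human-readable problems found in one marker gene.

    marker_fasta: protein FASTA text of a single marker gene.
    cds_headers:  set of headers present in the CDS FASTA file.
    """
    problems = []
    species_seen = Counter()
    for header in read_headers(marker_fasta):
        match = SPECIES_TAG.search(header)
        if match is None:
            problems.append(f"no [species tag] at end of header: {header}")
        else:
            species_seen[match.group(1)] += 1
        # Assumption: protein and CDS entries share the exact same header.
        if header not in cds_headers:
            problems.append(f"no CDS entry with header: {header}")
    for species, count in species_seen.items():
        if count > 1:
            problems.append(f"{count} sequences for species {species}")
    return problems


# Toy data: gene2 has no matching CDS entry.
marker = ">gene1 [SPEC1]\nMKT\n>gene2 [SPEC2]\nMKA\n"
cds = ">gene1 [SPEC1]\nATGAAAACC\n"
print(check_marker(marker, set(read_headers(cds))))
```

In practice you would read each marker gene file and the CDS file from disk and call `check_marker` once per marker; an empty list means that marker passed these checks.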

If you observe any problems, we would be glad to hear about them. The tool should definitely also work with markers that do not come from OMA, but this is certainly much less tested.

Cheers Adrian

@masudermann
Author

Thank you! The instructions are helpful. I will keep you posted.

@sinamajidian
Contributor

For future reference: we also have some instructions here that work for NCBI RefSeq. We would be happy to generalise them to the specific format you are interested in.
