nhmmer failed to run - Error: Invalid alphabet type in target for nhmmer. Expect DNA or RNA #54

Open
minjinhan opened this issue Jan 16, 2021 · 6 comments

Comments

@minjinhan

Hello,
I am trying to run barrnap to identify rRNA genes in a eukaryotic genome. The command is as follows:
barrnap --kingdom euk --threads 20 --outseq rRNA.fasta < chr1.fasta

After running it, I got the following error. Could you suggest how to solve this problem? Thanks!
[barrnap] This is barrnap 0.9
[barrnap] Written by Torsten Seemann
[barrnap] Obtained from https://github.com/tseemann/barrnap
[barrnap] Detected operating system: linux
[barrnap] Adding /miniconda3/lib/barrnap/bin/../binaries/linux to end of PATH
[barrnap] Checking for dependencies:
[barrnap] Found nhmmer - /miniconda3/bin/nhmmer
[barrnap] Found bedtools -/miniconda3/bin/bedtools
[barrnap] Will use 20 threads
[barrnap] Setting evalue cutoff to 1e-06
[barrnap] Will tag genes < 0.8 of expected length.
[barrnap] Will reject genes < 0.25 of expected length.
[barrnap] Using database: /miniconda3/lib/barrnap/bin/../db/euk.hmm
[barrnap] Scanning chr1.fasta for euk rRNA genes... please wait
[barrnap] Command: nhmmer --cpu 20 -E 1e-06 --w_length 3878 -o /dev/null --tblout /dev/stdout '/miniconda3/lib/barrnap/bin/../db/euk.hmm' 'chr1.fasta'
[barrnap] ERROR: nhmmer failed to run - Error: Invalid alphabet type in target for nhmmer. Expect DNA or RNA.

I am sure the sequences in the FASTA file contain no characters other than A/T/C/G.
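One quick way to double-check a claim like this is to list whatever characters remain in the sequence lines once A, C, G and T are removed. The sketch below is only an illustration: the function name `list_unexpected_residues` is made up here (it is not part of barrnap or HMMER), and it assumes a plain FASTA file in which header lines start with `>`.

```shell
# list_unexpected_residues FASTA
# Hypothetical helper (not part of barrnap or HMMER): prints each distinct
# character found in the sequence lines other than A, C, G, T (either case).
# Prints nothing when the sequences are pure A/C/G/T.
list_unexpected_residues() {
  grep -v '^>' "$1" \
    | tr -d 'ACGTacgt\r\n' \
    | fold -w 1 \
    | LC_ALL=C sort -u
}
```

It could be called as `list_unexpected_residues chr1.fasta`. An empty result only rules out unexpected characters; it says nothing about how the four bases are distributed within each sequence.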

@snayfach

snayfach commented Feb 1, 2021

I've gotten the same error. In my case, it was caused by one sequence composed entirely of G and T nucleotides. Adding a single A and a single C nucleotide made the error go away. This should be an easy fix :-)
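If the cause is the same in other files, it may help to find which records lack one of the four bases. The following is a minimal sketch under that assumption; the function name `list_records_missing_bases` is hypothetical, and the check is case-insensitive and looks at each record as a whole.

```shell
# list_records_missing_bases FASTA
# Hypothetical helper: for every FASTA record that lacks at least one of
# A, C, G, T (case-insensitive), print "<record name><TAB><missing bases>".
list_records_missing_bases() {
  awk '
    # Print the current record if any of the four bases was never seen.
    function report(    missing, i, base) {
      if (!in_record) return
      missing = ""
      for (i = 1; i <= 4; i++) {
        base = substr("ACGT", i, 1)
        if (!(base in seen)) missing = missing base
      }
      if (missing != "") print name "\t" missing
    }
    # Header line: flush the previous record and start a new one.
    /^>/ {
      report()
      name = substr($1, 2)
      split("", seen)      # portable way to clear the array
      in_record = 1
      next
    }
    # Sequence line: remember which of the four bases occur.
    {
      line = toupper($0)
      for (i = 1; i <= 4; i++) {
        base = substr("ACGT", i, 1)
        if (index(line, base) > 0) seen[base] = 1
      }
    }
    END { report() }
  ' "$1"
}
```

For example, `list_records_missing_bases chr1.fasta` would print one line per affected record, and nothing if every record contains all four bases.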

@jdwinkler-lanzatech

I just ran into this problem as well.

@zxgsy520

I just ran into this problem as well. I added an A and still got the same error. From what I found online, it is a problem with the conda installation.

@ptrebert

ptrebert commented Mar 17, 2022

I just stumbled upon this. In case this is still relevant, @zxgsy520: there is a switch, --dna, that sets the alphabet type for the query, which is then used as a "guide" when the alphabet type of the target cannot be guessed. It was introduced in this PR:
EddyRivasLab/hmmer#252
The switch is available in nhmmer v3.3.2 installed via conda.

Correction: the fix in the PR has only been merged into the dev branch. The --dna switch exists in the latest release, but that release does not include the fix.

@ZeweiSong

I bypassed this issue by replacing all ambiguous bases (M, K, H, etc.) with N.
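The comment does not say how the replacement was done, so the following is only one possible sketch. It assumes "ambiguous bases" means the IUPAC ambiguity codes R, Y, S, W, K, M, B, D, H and V (only M, K and H are named above), and the function name `mask_ambiguous_bases` is hypothetical. Header lines are left untouched and letter case is preserved.

```shell
# mask_ambiguous_bases FASTA
# Hypothetical helper: writes the FASTA to stdout with IUPAC ambiguity
# codes replaced by N (or n for lowercase) on sequence lines only.
# Header lines (starting with ">") are passed through unchanged.
mask_ambiguous_bases() {
  sed -e '/^>/!s/[RYSWKMBDHV]/N/g' \
      -e '/^>/!s/[ryswkmbdhv]/n/g' "$1"
}
```

A possible use is `mask_ambiguous_bases chr1.fasta > chr1.masked.fasta` (the output name is just an example), then running barrnap on the masked file. Sequence lengths are unchanged, so reported coordinates should still refer to the original sequences.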

@cabbagesofdoom

I got this issue for a genome that started with a telomere repeat and did not have all four bases in the first few hundred characters. I got around it by replacing the first four characters of each sequence with GATC and then running on the temporary file:

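# Strip the directory and the .fasta extension to build the temp-file prefix
PREFIX=$(basename "$GENOME" .fasta)
# Overwrite only the first four bases after each header line, so wrapped sequence lines are left alone
sed '/^>/{n;s/^[ACGT]\{4\}/GATC/;}' "$GENOME" > "$PREFIX.tmp.fasta"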
