m6anet, eventalign, galaxy #155

Open
mocherry opened this issue Mar 6, 2024 · 3 comments
Comments

mocherry commented Mar 6, 2024

Dear m6anet-team,

I have a question regarding m6anet, which I am trying to use in Galaxy to reproduce an RNA-sequencing study in order to get familiar with this topic.
In your manual you write for data-prep:
"will need: * reads.fastq: fastq file generated from basecalling the raw .fast5 files * reads.sorted.bam: sorted bam file obtained from aligning reads.fastq to the reference transcriptome file * transcript.fa: reference transcriptome file"

I was told that when basecalling fast5/pod5 files with Dorado, information about modified bases is not preserved in fastq files; it is only stored as a tag in the bam files that Dorado produces.

So, what is the exact pipeline to produce the fastq files? Will eventalign in Galaxy get the modification information from the supplied fast5 files, i.e. is it sufficient to do basecalling (Dorado) with fastq as the output option?
How do you suggest producing the sorted bam file? What would you use for alignment? Minimap?

Please excuse if these questions sound somewhat naive, but I have had a hard time so far getting m6A-info from the sequencing data I have available and am not at all familiar with Python, which is why I want to do the analysis in galaxy.

Thanks for your help and consideration,
Matthias

Collaborator

yuukiiwa commented Mar 6, 2024

Hi Matthias (tagging you here @mocherry),

  1. You can pass the --emit-fastq flag to dorado basecaller to emit a fastq file. That is sufficient for running nanopolish and m6anet downstream.

  2. You can use minimap2 and samtools to get a sorted.bam file:

minimap2 -ax map-ont -uf -t 3 --secondary=no <MMI> <PATH/TO/FASTQ.GZ> > <PATH/TO/SAM> 2>> <PATH/TO/SAM_LOG>
samtools view -Sb <PATH/TO/SAM> > <PATH/TO/BAM>
samtools sort <PATH/TO/BAM> -o <PATH/TO/SORTED.BAM>
samtools index <PATH/TO/SORTED.BAM>
  3. You can then run nanopolish index with the fastq file and the fast5 files (if you have pod5 files, first convert them to fast5 with pod5 convert to_fast5).

  4. Then, you can run nanopolish eventalign with the fast5, fastq, and sorted.bam, which will give you an eventalign.txt file to input to m6anet dataprep.
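
For reference, the last two steps above plus the m6anet dataprep call could be sketched roughly as follows. This is only a sketch under assumptions: every path (fast5/, reads.fastq, reads.sorted.bam, transcript.fa, eventalign.txt, summary.txt, m6anet_dataprep/) is a placeholder, the thread counts are arbitrary, and the eventalign options (--scale-events, --signal-index, --summary) are the ones I believe the m6anet documentation suggests, so please check them against your installed versions. The run and run_to helpers are hypothetical wrappers added here so that the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: nanopolish index -> nanopolish eventalign -> m6anet dataprep.
# All paths below are placeholders (assumptions), override them via the environment.
set -eu

FAST5_DIR="${FAST5_DIR:-fast5/}"              # directory with the raw .fast5 files
FASTQ="${FASTQ:-reads.fastq}"                 # basecalled reads (e.g. dorado --emit-fastq)
SORTED_BAM="${SORTED_BAM:-reads.sorted.bam}"  # sorted, indexed alignment to the transcriptome
TRANSCRIPTOME="${TRANSCRIPTOME:-transcript.fa}"
EVENTALIGN="${EVENTALIGN:-eventalign.txt}"
SUMMARY="${SUMMARY:-summary.txt}"
OUT_DIR="${OUT_DIR:-m6anet_dataprep/}"
DRY_RUN="${DRY_RUN:-1}"                       # 1 = print commands only, 0 = execute them

# Hypothetical helper: print a command and execute it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Hypothetical helper: same as run, but the command's stdout goes to the file in $1.
run_to() {
  out_file=$1
  shift
  printf '+ %s > %s\n' "$*" "$out_file"
  if [ "$DRY_RUN" = "0" ]; then
    "$@" > "$out_file"
  fi
}

pipeline() {
  # Link the basecalled reads to their raw signal in the fast5 files.
  run nanopolish index -d "$FAST5_DIR" "$FASTQ"

  # Align raw signal events to the reference transcriptome.
  run_to "$EVENTALIGN" nanopolish eventalign \
    --reads "$FASTQ" --bam "$SORTED_BAM" --genome "$TRANSCRIPTOME" \
    --scale-events --signal-index --summary "$SUMMARY" --threads 4

  # Preprocess the eventalign output for m6anet.
  run m6anet dataprep --eventalign "$EVENTALIGN" --out_dir "$OUT_DIR" --n_processes 4
}

pipeline
```

By default the script only prints the three commands so they can be reviewed first; running it with DRY_RUN=0 executes them, assuming nanopolish and m6anet are installed and on the PATH.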

Not sure whether you are open to using the command line, but you can check out nf-core/nanoseq, which does all the steps for you.

Thanks!

Best wishes,
Yuk Kei

Author

mocherry commented Mar 7, 2024

Hi Yuk Kei,

thanks a lot.
I will give it a try. I am not too familiar with command line stuff, so I will look into nf-core/nanoseq and hope that I understand what I have to do there.
Maybe I can come back with more questions once I have tried it, if I get stuck.
Best,
Matthias

Author

mocherry commented Mar 9, 2024 via email
