Problems with bacteria_tradis mapping #136

Open
bethanyf3 opened this issue Feb 28, 2024 · 1 comment

@bethanyf3

Hi,

I am having problems with the bacteria_tradis mapping for two libraries. I used cutadapt to trim the transposon (and adapters) from the reads and wrote only the reads that contained the transposon to the output folder. Reads shorter than 15 bp were also discarded.

I have tried to align these reads using both BWA and smalt (using tagless mode for both). Using BWA with default parameters, I'm getting 100% matched reads for every sample, but about half of the samples have 0% mapping. The other half have varying levels of mapping, and all of these samples have at least twice as many unique insertion sites (UIS) as I am expecting.

Using smalt with -m 10 (all other parameters at default), I am getting 100% matched reads but very low levels of mapping, < 5% for every sample. The number of UIS is still higher than I would expect for one of the libraries, and I have very low coverage (average around 1.5 reads per UIS).

For both BWA and smalt I have seen that the number of UIS seems to be proportional to the number of reads rather than the number of mutants in the libraries.
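One way to put numbers on this observation is to summarise how many reads support each insertion site. The snippet below is only a minimal sketch: it assumes you have already extracted one read count per genome position for a sample (for example from an insert-site plot file), and the function name `uis_summary` is hypothetical, not part of Bio-Tradis. A high fraction of sites supported by a single read would be consistent with UIS tracking read depth rather than library complexity.

```python
def uis_summary(site_counts):
    """Summarise read support per unique insertion site (UIS).

    site_counts: iterable of integer read counts, one per genome position.
    Positions with a count of 0 are not insertion sites and are ignored.
    """
    supported = [c for c in site_counts if c > 0]
    uis = len(supported)
    reads = sum(supported)
    singletons = sum(1 for c in supported if c == 1)
    return {
        "uis": uis,
        "reads": reads,
        # Mean number of reads supporting each insertion site.
        "reads_per_uis": reads / uis if uis else 0.0,
        # Fraction of sites backed by exactly one read.
        "singleton_fraction": singletons / uis if uis else 0.0,
    }


# Toy example with six genome positions (made-up counts).
summary = uis_summary([0, 1, 1, 3, 0, 1])
print(summary)
```

Comparing `reads_per_uis` and `singleton_fraction` across samples of different depth may help show whether the extra sites are mostly single-read sites.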

Do you have any ideas of things I could try for either mapping tool? Thanks!

@lbarquist
Contributor

Hi,

First, I would try to follow the tutorial exactly, e.g. using smalt and not trimming tags yourself:

https://github.com/sanger-pathogens/Bio-Tradis/blob/master/BioTraDISTutorial.pdf

BWA has some problems with mismapping: it soft-clips reads, which leads to a lot of incorrect read assignments, so I would avoid using it. I've also never used the tagless mode, which was added later, so I don't know how it might affect the mapping.
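If you want to check how much soft-clipping is present in an existing BWA alignment, something like the following could be used. This is a rough sketch, not part of Bio-Tradis: it reads plain SAM text (e.g. the output of `samtools view -h`), the function names are made up for illustration, and it assumes that the insertion site is taken from the 5' end of the read, so only clipping at that end is counted.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def five_prime_soft_clip(flag, cigar):
    """Return the number of soft-clipped bases at the read's 5' end."""
    ops = CIGAR_OP.findall(cigar)
    # For reverse-strand alignments (flag 0x10) the read's 5' end is
    # described by the last CIGAR operation, so walk the list backwards.
    if flag & 0x10:
        ops = ops[::-1]
    for length, op in ops:
        if op == "H":  # hard clips sit outside any soft clip; skip them
            continue
        return int(length) if op == "S" else 0
    return 0


def soft_clip_summary(sam_lines):
    """Count primary mapped reads and those soft-clipped at the 5' end."""
    mapped = clipped = 0
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x4 or flag & 0x900:
            continue
        mapped += 1
        if five_prime_soft_clip(flag, fields[5]) > 0:
            clipped += 1
    return mapped, clipped
```

A large share of 5'-clipped reads would suggest that many reported positions are shifted away from the true insertion site.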

If you're still getting low mapping rates, I would try A) using FastQC to check that you don't have some obvious problems (e.g. additional adapter contamination, low quality, weird overrepresented sequences), and B) BLASTing some of your reads against the reference genome and through NCBI BLAST to see whether they actually match your reference genome and, if not, what their source might be. Depending on how the libraries were made, this could also arise from something like a delivery vector, e.g. if you're conjugating in a plasmid with the transposon on it and it hasn't been properly cleared.
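For the BLAST check, a small random subset of reads in FASTA format is usually enough. Below is a minimal sketch of one way to pull such a subset from an uncompressed FASTQ file; the function names and file names are placeholders, and it assumes standard four-line FASTQ records.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence) pairs from four-line FASTQ records."""
    handle = iter(handle)
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:  # end of input or truncated record
            return
        header, seq = record[0].strip(), record[1].strip()
        # Drop the leading '@' and anything after the first whitespace.
        yield header[1:].split()[0], seq


def reservoir_sample(records, n, seed=0):
    """Pick up to n records uniformly at random in a single pass."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < n:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                sample[j] = rec
    return sample


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Example usage (file names are placeholders):
# with open("sample.fastq") as fq, open("subset.fasta", "w") as out:
#     out.write(to_fasta(reservoir_sample(read_fastq(fq), 20)))
```

The resulting FASTA file can be pasted into NCBI BLAST or searched locally against the reference genome.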

Let me know if this helps.

-Lars
