PWM detection issue? [HELP] [QUESTION] #41
Comments
Hi @EdPym. Yes, you are totally right, but this is not a bug. The Rel Score is calculated, roughly, by summing the corresponding probabilities of each nucleotide at each position. So yes, for the same score, some of the patterns found are less relevant than others, or even false. And this is a very good example.
With a simpler PWM, for AGGAC: 1 + 0.5 + 0.5 + 0 + 1 = 3. In this example, let's assume that A at position 1 and G at position 4 are obligatory. You see that AGGAC has no G at position 4, yet its score is equal to 3. In ATTGC, A and G are both in place, but the score is also equal to 3.
I'm working on a way to discriminate between these cases more effectively. I created an LCS option. It lets you look at the number of similar consecutive nucleotides between the pattern found and the PWM. It requires a lot of resources, so it is possible that it will crash the software. I am also working on a standalone version, which will allow us to get rid of Streamlit and have good computing power. But for your example it works, and you will see that, ultimately, you may have other, more interesting targets.
It is important to understand that the RelScore is a global score. The LCS also calculates a RelScore, but only on the retained part, so the LCS gives a local score.
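The summed-probability scoring described above can be sketched in a few lines of Python. The simpler PWM itself is not shown in this thread, so the 5-position matrix below is a hypothetical one, chosen only so that it reproduces the two totals quoted (3 for AGGAC and 3 for ATTGC). The function name `summed_score` is likewise an assumption, and any normalization the tool applies on top of the raw sum is not modelled here.

```python
# Hypothetical 5-position PWM (one dict of base probabilities per position).
# This is NOT the matrix from the issue; it is an illustrative assumption
# in which A at position 1 and G at position 4 are obligatory.
PWM = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.5, "G": 0.5, "T": 0.0},
    {"A": 0.5, "C": 0.0, "G": 0.5, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
]


def summed_score(seq, pwm):
    """Sum the probability of the observed base at each position."""
    if len(seq) != len(pwm):
        raise ValueError("sequence and PWM must have the same length")
    return sum(column[base] for base, column in zip(seq.upper(), pwm))


print(summed_score("AGGAC", PWM))  # 3.0 (no G at position 4)
print(summed_score("ATTGC", PWM))  # 3.0 (A and G both in place)
```

Both sequences reach the same total even though only ATTGC carries the two obligatory bases, which is why a global sum alone cannot separate them.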
Describe the bug
The individual Motif finder appears to detect binding sites that don't match the PWM.
Here are two results from a search.
1518 | aatAAATCAGAGCTAaag | 0.769912 | + | → | n.d. | n.d | n.d
463 | gtcAAACTAAAGGACcgg | 0.769912 | + | → | n.d. | n.d | n.d
The G (7th position) and the C (10th position) are absolutely required in the PWM, so I am not sure why site 463 is found.
PWM = MA0451.1
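As a quick check of the two reported sites, here is a minimal Python sketch. It assumes that the upper-case letters are the matched motif and the lower-case letters are flanking sequence, and that positions are counted from 1 within the upper-case part. The helper names `core` and `mismatches` are hypothetical and not part of the tool.

```python
# Required bases as stated in the report (1-based positions in the motif).
REQUIRED = {7: "G", 10: "C"}


def core(site):
    """Return the upper-case part of a reported site, dropping lower-case flanks."""
    return "".join(ch for ch in site if ch.isupper())


def mismatches(site, required=REQUIRED):
    """Map each required position to the base actually found, when it differs."""
    motif = core(site)
    return {
        pos: motif[pos - 1]
        for pos, base in required.items()
        if motif[pos - 1] != base
    }


print(mismatches("aatAAATCAGAGCTAaag"))  # {} (site 1518 has both required bases)
print(mismatches("gtcAAACTAAAGGACcgg"))  # {7: 'A', 10: 'G'} (site 463 has neither)
```

Under these assumptions, site 1518 satisfies both required positions while site 463 misses both, even though the two sites share the same score of 0.769912.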