Protein sequence abundance #13

cmorganl · 2019-07-18T22:17:07Z

Hi there,

I've just started using PLASS, specifically plass assemble, and I really like it!

This isn't an issue but a potential enhancement. I was wondering how I might be able to recover abundance information for each protein assembled. It's not as simple as with nucleotide contigs, since I'm unable to align the reads back to the assembly in this case.

Could the abundance of proteins be included in the header, as SPAdes does? Alternatively, a table mapping each header to its respective abundance would be convenient as well. I hope I haven't misinterpreted the output and it's already provided :)

Thanks!
Connor

martin-steinegger · 2019-07-18T22:34:20Z

Yes, I would love to have this as well. 👍
But so far I have not come up with a solution for gathering this information.

It is hard to set rules for which alignments should be part of the abundance computation, since plass can assemble similar proteins and not just exact ones.

We estimated the abundance in the plass publication using the mmseqs map workflow. This workflow performs a six-frame translated search and has strict mapping thresholds.
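For anyone wanting to turn mapping hits into per-protein counts, here is a rough sketch. It is not the procedure from the publication; it only assumes the hits have been exported to a 12-column BLAST-style tab-separated file (read ID in column 1, protein ID in column 2, bit score in column 12). The function name `count_best_hits` is made up for illustration, and each read is simply assigned to its highest-scoring protein.

```python
import csv
from collections import Counter


def count_best_hits(m8_lines):
    """Count reads per protein, assigning each read to its best-scoring hit.

    Assumes 12-column BLAST-style tabular lines:
    column 1 = read ID, column 2 = protein ID, column 12 = bit score.
    """
    best = {}  # read ID -> (bit score, protein ID)
    for row in csv.reader(m8_lines, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        read_id, protein_id, bits = row[0], row[1], float(row[11])
        # Keep only the highest-scoring hit seen so far for this read.
        if read_id not in best or bits > best[read_id][0]:
            best[read_id] = (bits, protein_id)
    return Counter(protein_id for _, protein_id in best.values())


# Toy input (made-up IDs and scores): read2 hits two proteins.
example_lines = [
    "read1\tprotA\t0.98\t50\t1\t0\t1\t150\t10\t59\t1e-20\t95.0",
    "read2\tprotA\t0.90\t50\t5\t0\t1\t150\t10\t59\t1e-15\t80.0",
    "read2\tprotB\t0.95\t50\t2\t0\t1\t150\t30\t79\t1e-18\t90.0",
    "read3\tprotA\t1.00\t50\t0\t0\t1\t150\t60\t109\t1e-22\t99.0",
]
counts = count_best_hits(example_lines)
```

Best-hit assignment throws away the ambiguity of reads that match several similar proteins, which is exactly the difficulty described above, so the counts should be read as a crude first approximation.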

apcamargo · 2019-07-25T19:08:35Z

Wouldn't it be possible to use an EM algorithm to "distribute" each read among its multiple hits?
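To make the idea concrete, here is a minimal sketch of such an EM scheme. It is not part of plass or mmseqs; the function `em_abundance` and its input layout (a dict from read ID to the list of proteins it hits) are assumptions for illustration. It ignores protein length and alignment scores: each read is split among its hits in proportion to the current abundance estimates, and the estimates are then updated from those fractional counts.

```python
from collections import defaultdict


def em_abundance(read_hits, n_iter=50, tol=1e-9):
    """Estimate relative protein abundances from multi-mapped reads.

    read_hits: dict mapping read ID -> list of protein IDs the read hits.
    Returns a dict protein ID -> relative abundance (values sum to 1).
    """
    proteins = sorted({p for hits in read_hits.values() for p in hits})
    if not proteins:
        return {}
    # Start from a uniform abundance estimate.
    theta = {p: 1.0 / len(proteins) for p in proteins}
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: split each read among its hits, weighted by current abundance.
        for hits in read_hits.values():
            unique_hits = set(hits)
            if not unique_hits:
                continue
            total = sum(theta[p] for p in unique_hits)
            for p in unique_hits:
                counts[p] += theta[p] / total
        # M-step: new abundance = share of the fractional read counts.
        n_assigned = sum(counts.values())
        new_theta = {p: counts[p] / n_assigned for p in proteins}
        delta = max(abs(new_theta[p] - theta[p]) for p in proteins)
        theta = new_theta
        if delta < tol:
            break  # estimates have stopped changing
    return theta


# Toy example: r3 is ambiguous between proteins A and B.
toy_hits = {"r1": ["A"], "r2": ["A"], "r3": ["A", "B"], "r4": ["B"]}
abundance = em_abundance(toy_hits)
```

In the toy example the ambiguous read ends up mostly assigned to A, because A has more uniquely mapped reads. Each iteration is a single pass over all read-to-protein hits, so the cost scales with the number of hits times the number of iterations.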

martin-steinegger · 2019-07-25T21:31:17Z

@apcamargo yes, this is a good idea. But do you know if EM would be fast enough to handle this large amount of data?

apcamargo · 2019-07-25T23:28:55Z

I don't know. It would depend on the total amount of proteins and reads. I imagine it would be much slower than a typical RNA-Seq quantification, but not unfeasible. But that's just my impression, I've never done something like that.

mooreryan · 2019-11-08T15:16:10Z

I was just thinking about this as well...are there any current plans to include this in plass, or would it be better for now to try the mmseqs map workflow?

martin-steinegger added the enhancement New feature or request label Jul 18, 2019
