Protein FDR calculation #1

Open
ohickl opened this issue Jul 5, 2019 · 6 comments

@ohickl

ohickl commented Jul 5, 2019

Hi,
I am a bit confused by the values calculated for the final protein report. With my data, for example, it looks like this:

  • Numbers of proteins before filtering
    Decoy_Proteins_Before_Filtering = 241
    Target_Proteins_Before_Filtering = 37016

  • Numbers of proteins after filtering
    Decoy_Proteins_After_Filtering = 60
    Target_Proteins_After_Filtering = 12462

  • Protein FDR = Decoy_Proteins_After_Filtering / Target_Proteins_After_Filtering
    Protein_FDR = 0.96%

The ~12500 proteins, including the 60 decoys, are reported afterwards. But how does it end up with a 0.96% protein FDR? If it found only 241 decoys among almost 40k proteins before filtering, the ratio was already well below 1%. Or am I missing something?
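
As a quick check of the arithmetic, the ratios can be recomputed from the counts in the report. This is only a sketch: `ratio_percent` and its `factor` parameter are hypothetical names, and the doubling in the last line is an assumption used to show which ratio matches the reported 0.96%, not something confirmed from the Sipros source.

```python
# Counts copied from the protein report above.
decoy_before, target_before = 241, 37016
decoy_after, target_after = 60, 12462


def ratio_percent(decoys, targets, factor=1.0):
    """Return factor * decoys / targets, expressed as a percentage."""
    return 100.0 * factor * decoys / targets


# Plain decoy/target ratio, as written in the report's formula.
print("before filtering: %.2f%%" % ratio_percent(decoy_before, target_before))
print("after filtering:  %.2f%%" % ratio_percent(decoy_after, target_after))

# Assumption (not taken from the Sipros code): the same ratio doubled.
print("after, doubled:   %.2f%%" % ratio_percent(decoy_after, target_after, factor=2.0))
```

The plain ratio after filtering comes out near 0.48%, while the doubled ratio comes out near 0.96%, so the reported value may include a factor of two that the formula in the report does not show.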

Also, I get the following error when trying to produce a pepXML file:

python2.7 /opt/sipros/Scripts/sipros_psm_tabulating.py -i /scratch/maxquant/OH/Sipros/method_test/markert_strap_brp_01/output -o /scratch/maxquant/OH/Sipros/method_test/markert_strap_brp_01/output -c /scratch/maxquant/OH/Sipros/method_test/markert_strap_brp_01/20190703_method_test.cfg -x
[Fri Jul 5 11:11:30 2019] Beginning Sipros Ensemble Tabulating (1.0.1 (Alpha))
[Step 1] Parse options and get config file: Running -> Done!
[Step 2] Generate PSM table: Running -> Done!
[Step 3] Merge Protein list: Running -> Done!
[Step 4] Generate Pepxml: Running -> Traceback (most recent call last):
  File "/opt/sipros/Scripts/sipros_psm_tabulating.py", line 662, in <module>
    sys.exit(main())
  File "/opt/sipros/Scripts/sipros_psm_tabulating.py", line 647, in main
    writePepxml(base_out + '.tab', config_dict, modification_dict, element_modification_list_dict, output_folder)
  File "/opt/sipros/Scripts/sipros_psm_tabulating.py", line 406, in writePepxml
    psm_obj.score_process()
  File "/opt/sipros/Scripts/sipros_psm_tabulating.py", line 348, in score_process
    diff = (pep.scorelist[idx1]/l1[0].scorelist[idx1]) - 1
ZeroDivisionError: float division by zero
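
The traceback points at a division by the top-ranked score, which fails when that score is exactly zero. One possible local workaround, sketched here with a hypothetical helper `relative_diff`, is to guard the division. Returning 0.0 for a zero reference score is an assumption; whether that is the right fallback depends on how `score_process` uses `diff`, which the traceback does not show.

```python
def relative_diff(score, top_score):
    """Return score / top_score - 1, or 0.0 when top_score is zero."""
    if top_score == 0:
        # Assumption: treat a zero reference score as "no difference"
        # rather than raising ZeroDivisionError.
        return 0.0
    return float(score) / top_score - 1


# Hypothetical usage mirroring the failing line in the traceback:
# diff = relative_diff(pep.scorelist[idx1], l1[0].scorelist[idx1])
print(relative_diff(5.0, 10.0))
print(relative_diff(3.0, 0.0))
```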

Also, are you still actively working on Sipros Ensemble?

Love Sipros Ensemble and the results so far!

Cheers

Oskar

@guo-xuan
Owner

guo-xuan commented Jul 27, 2019 via email

@ohickl
Author

ohickl commented Jul 31, 2019

Hi Xuan,

got it. Thanks!
Do you plan on implementing protein-level FDR filtering? I think I read something about it in the readme or the publication. I tried it by setting FDR_Filtering = Protein in the config file, but it still seems to filter on 1% peptide FDR.
I would like to do that because I tend to get a protein-level FDR above 1% when requiring at least one unique peptide per protein. The effect is especially strong when searching large databases (e.g. the one I tried contained about 18*10^6 target sequences).
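
To make the request concrete, protein-level filtering could look roughly like the sketch below. It is a generic target-decoy cutoff search, not Sipros Ensemble's method: the function name `protein_score_cutoff`, the `(score, is_decoy)` input format, and the plain decoy/target ratio used as the FDR estimate are all assumptions, and tied scores are not handled specially.

```python
def protein_score_cutoff(proteins, max_fdr=0.01):
    """Find the lowest score cutoff that keeps decoys/targets <= max_fdr.

    proteins: iterable of (score, is_decoy) pairs, higher score = better.
    Returns the cutoff score, or None if no cutoff satisfies max_fdr.
    """
    ranked = sorted(proteins, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Assumption: FDR is estimated as the plain decoy/target ratio.
        if targets > 0 and float(decoys) / targets <= max_fdr:
            cutoff = score
    return cutoff


# Toy data for illustration only (not from the search results above).
toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
print(protein_score_cutoff(toy, max_fdr=0.01))
```

Proteins scoring at or above the returned cutoff would be kept. With the toy data, the single decoy at score 8.0 pushes the ratio above 1%, so the cutoff stays at 9.0.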
Thanks for your time!

Oskar

@guo-xuan
Owner

guo-xuan commented Aug 5, 2019 via email

@ohickl
Author

ohickl commented Aug 13, 2019

Hi Xuan,

sorry about that. I would like to filter on protein level.

@guo-xuan
Owner

guo-xuan commented Aug 22, 2019 via email

@ohickl
Author

ohickl commented Jan 23, 2020

Hey Xuan,

sorry for the late reply. I am still interested in your Python script. Could you send it to me at [email protected]? Your last reply went to GitHub and there was no file attached.
Is there any news regarding the development of Sipros Ensemble? I'd love to see it continued!

Cheers
Oskar
