Identification workflow needed. #346

ypriverol · 2024-01-21T08:24:43Z

Description of feature

First use case: We are researching how to control protein-level FDR when integrating multiple datasets at quantms.org. We currently have a method, mad_decoy, that we are now trying to improve with the entrapment approach. For the new method, we need to study the distribution of peptide probabilities for each identification.

Second use case: In addition, for AI dataset generation, we want to explore the impact of rank 1, 2, and 3 PSMs from search engines. Most AI datasets for training spectra prediction models are based on rank 1 spectra. We want to release a dataset with rank 1, 2, and 3 spectra to explore the impact on AI prediction methods.

Third use case: Large-scale identification for spectral library generation.

This subworkflow will provide a solution for peptide identification outside the quant part. It is related to #345. I recommend the following, @daichengxin @jpfeuffer @timosachsenberg:

Perform peptide identification with the three search engines (SAGE, MSGF+, COMET)
We should make sure that this works for immunopeptidomics datasets, when searching with no enzymatic restriction.
Percolator will be optional. This is needed because, if you want to study pure search engine results and all the search engine ranks, you may need to skip Percolator, which selects the first rank for each search engine.
ConsensusID should be applied on Percolator results or other types of search engines that do not remove the ranks.
We should have an ID filter that filters by PSM FDR.
Export to quantms.io PSM file.
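
To make two of the steps above concrete (keeping ranks 1 to 3 when Percolator is skipped, and the PSM-level FDR filter), here is a minimal Python sketch. It is only an illustration, not the quantms or OpenMS implementation: the `Psm` record, the function names, and the assumption that a higher score is better are all hypothetical, and the FDR is estimated with the plain target-decoy ratio (decoys / targets) converted to q-values.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Psm:
    # Hypothetical record; real PSM files carry many more fields.
    spectrum_id: str
    peptide: str
    score: float      # assumption: higher score means a better match
    is_decoy: bool


def keep_top_ranks(psms, max_rank=3):
    """Return (rank, psm) pairs for the best `max_rank` PSMs of each spectrum."""
    ranked = []
    by_spectrum = sorted(psms, key=lambda p: (p.spectrum_id, -p.score))
    for _, group in groupby(by_spectrum, key=lambda p: p.spectrum_id):
        for rank, psm in enumerate(group, start=1):
            if rank > max_rank:
                break
            ranked.append((rank, psm))
    return ranked


def filter_by_psm_fdr(psms, threshold=0.01):
    """Return target PSMs whose target-decoy q-value is <= threshold."""
    ordered = sorted(psms, key=lambda p: p.score, reverse=True)

    # Running FDR estimate at each score cutoff: decoys / targets.
    fdrs = []
    targets = decoys = 0
    for psm in ordered:
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the minimum FDR at this score or at any worse score.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [p for p, q in zip(ordered, qvalues)
            if not p.is_decoy and q <= threshold]
```

In this sketch the two functions are independent: `keep_top_ranks` would serve the rank 1, 2, 3 export, while `filter_by_psm_fdr` would typically run on whatever PSM set the FDR is meant to be controlled for. Whether lower-rank PSMs should share one FDR estimate with rank 1 PSMs is an open question that this sketch does not answer.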

Feedback from @jpfeuffer @daichengxin @timosachsenberg would be great.

ypriverol added the enhancement New feature or request label Jan 21, 2024

ypriverol assigned jpfeuffer and daichengxin Jan 21, 2024

ypriverol added this to the Release 1.3 milestone Jan 22, 2024

ypriverol linked a pull request Feb 1, 2024 that will close this issue

first PR about identification subworkflow #351

Merged

11 tasks

daichengxin closed this as completed in #351 Mar 24, 2024
