The charge min and max and missed cleavages are sometimes not working #364

Open
ypriverol opened this issue Mar 25, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@ypriverol
Member

Description of the bug

@jpfeuffer @timosachsenberg @daichengxin I found one dataset that we searched using MS-GF+. Here is the command:

#!/bin/bash -euo pipefail
MSGFPlusAdapter \
    -protocol automatic \
    -in 01086_C01_P010738_S00_N03_R1.mzML \
    -out 01086_C01_P010738_S00_N03_R1_msgf.idXML \
    -executable $(find /usr/local/share/msgf_plus-*/MSGFPlus.jar -maxdepth 0) \
    -threads 6 \
    -java_memory 30720 \
    -database "GRCh38r110_GCA97s_coding_proteins_19Jul23-decoy.fa" \
    -instrument high_res \
    -matches_per_spec 1 \
    -min_precursor_charge 2 \
    -max_precursor_charge 4 \
    -min_peptide_length 6 \
    -max_peptide_length 40 \
    -max_missed_cleavages 2 \
    -isotope_error_range 0,1 \
    -enzyme "Trypsin/P" \
    -tryptic fully \
    -precursor_mass_tolerance 40.0 \
    -precursor_error_units ppm \
    -fixed_modifications 'Carbamidomethyl (C)' \
    -variable_modifications 'Acetyl (Protein N-term)' 'Deamidated (N)' 'Deamidated (Q)' 'Oxidation (M)' \
    -max_mods 3 \
    -PeptideIndexing:IL_equivalent \
    -PeptideIndexing:unmatched_action warn \
    -debug 0 \
     \
    2>&1 | tee 01086_C01_P010738_S00_N03_R1_msgf.log

However, in the output file I found the following identification, reported with charge="6" even though -max_precursor_charge was set to 4:

<PeptideIdentification score_type="SpecEValue" higher_score_better="false" significance_threshold="0.0" MZ="664.68194580078125" RT="3378.397500000000036" spectrum_reference="controllerType=0 controllerNumber=1 scan=24975" >
			<PeptideHit score="1.4043417e-21" sequence="INNAHTIGC(Carbamidomethyl)NAVSWAPAVVPGSLIDHPSGQKPNYIKR" charge="6" aa_before="K K K K K K K K K K K K K K K" aa_after="F F F F F F F F F F F F F F F" start="130 147 144 130 190 130 147 144 130 190 130 147 144 130 190" end="166 183 180 166 226 166 183 180 166 226 166 183 180 166 226" protein_refs="PH_14293 PH_14294 PH_14295 PH_14296 PH_14297 PH_44721 PH_44722 PH_44723 PH_44724 PH_44725 PH_112619 PH_112620 PH_112621 PH_112622 PH_112623" >
				<UserParam type="float" name="MS:1002049" value="103.0"/>
				<UserParam type="float" name="MS:1002050" value="165.0"/>
				<UserParam type="float" name="MS:1002052" value="1.4043417e-21"/>
				<UserParam type="float" name="MS:1002053" value="6.614773000000001e-14"/>
				<UserParam type="string" name="AssumedDissociationMethod" value="HCD"/>
				<UserParam type="string" name="CTermIonCurrentRatio" value="0.3437819"/>
				<UserParam type="string" name="ExplainedIonCurrentRatio" value="0.39947474"/>
				<UserParam type="string" name="MS2IonCurrent" value="2429519.8"/>
				<UserParam type="string" name="MeanErrorAll" value="4.888304"/>
				<UserParam type="string" name="MeanErrorTop7" value="2.5796666"/>
				<UserParam type="string" name="MeanRelErrorAll" value="-0.8928608"/>
				<UserParam type="string" name="MeanRelErrorTop7" value="2.5497687"/>
				<UserParam type="string" name="NTermIonCurrentRatio" value="0.055692848"/>
				<UserParam type="string" name="NumMatchedMainIons" value="23"/>
				<UserParam type="string" name="StdevErrorAll" value="4.698519"/>
				<UserParam type="string" name="StdevErrorTop7" value="1.8443376"/>
				<UserParam type="string" name="StdevRelErrorAll" value="6.7211905"/>
				<UserParam type="string" name="StdevRelErrorTop7" value="1.885455"/>
				<UserParam type="float" name="calcMZ" value="664.51446533203125"/>
				<UserParam type="int" name="pass_threshold" value="1"/>
				<UserParam type="int" name="start" value="191"/>
				<UserParam type="int" name="end" value="227"/>
				<UserParam type="string" name="target_decoy" value="target"/>
				<UserParam type="string" name="isotope_error" value="1"/>
				<UserParam type="string" name="protein_references" value="non-unique"/>
			</PeptideHit>
			<UserParam type="string" name="MS:1001115" value="24975"/>
		</PeptideIdentification>

What could be the problem? This also happens with Comet.
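As a quick sanity check on the hit above, the reported values can be compared against each other. This is a minimal sketch, assuming that `isotope_error` counts 13C isotope shifts and that the precursor error is measured after removing that shift; the function name `ppm_error` is made up for illustration. The numbers are copied from the `MZ`, `calcMZ`, `charge` and `isotope_error` fields of the excerpt.

```python
# Mass difference between 13C and 12C in Da (assumed isotope spacing).
C13_C12_DIFF = 1.0033548


def ppm_error(observed_mz, calc_mz, charge, isotope_error):
    """Precursor error in ppm after removing the isotope offset."""
    # An isotope shift of n neutrons moves the m/z by n * 1.00335 / charge.
    corrected_mz = observed_mz - isotope_error * C13_C12_DIFF / charge
    return (corrected_mz - calc_mz) / calc_mz * 1e6


observed_mz = 664.68194580078125  # MZ attribute of PeptideIdentification
calc_mz = 664.51446533203125      # calcMZ UserParam of the PeptideHit
charge = 6                        # charge attribute of the PeptideHit
isotope_error = 1                 # isotope_error UserParam

error = ppm_error(observed_mz, calc_mz, charge, isotope_error)
print(f"precursor error at charge {charge}: {error:.2f} ppm")
```

Under these assumptions the error stays well inside the 40 ppm tolerance from the command, so the hit is at least internally consistent with charge 6 rather than being a mislabelled lower charge state.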

Command used and terminal output

No response

Relevant files

No response

System information

No response

@ypriverol ypriverol added the bug Something isn't working label Mar 25, 2024
@timosachsenberg

https://github.com/OpenMS/OpenMS/blob/079143800f7ed036a7c68ea6e124fe4f5cfc9569/src/topp/MSGFPlusAdapter.cpp#L166
According to this comment in our adapter, the charge range is only used if no charge is annotated in the mzML.
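One way to see whether this applies to a given file is to look for the precursor charge annotation in the mzML. The sketch below uses only the Python standard library and checks each `selectedIon` for the PSI-MS term `MS:1000041` ("charge state"). The helper names and the two-spectrum sample are hypothetical and not taken from the file in this issue. For a real mzML file, a streaming parser would be a better fit than loading the whole document.

```python
import xml.etree.ElementTree as ET

CHARGE_STATE = "MS:1000041"  # PSI-MS CV accession for "charge state"


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def annotated_charges(mzml_text):
    """Map spectrum id -> annotated precursor charge (None if absent)."""
    charges = {}
    root = ET.fromstring(mzml_text)
    for spectrum in root.iter():
        if local_name(spectrum.tag) != "spectrum":
            continue
        for ion in spectrum.iter():
            if local_name(ion.tag) != "selectedIon":
                continue
            charge = None
            for param in ion:
                if param.get("accession") == CHARGE_STATE:
                    charge = int(param.get("value"))
            charges[spectrum.get("id")] = charge
    return charges


# Hypothetical, heavily trimmed mzML fragment for illustration only.
SAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml"><run><spectrumList>
<spectrum id="scan=1"><precursorList><precursor><selectedIonList><selectedIon>
<cvParam accession="MS:1000744" name="selected ion m/z" value="664.68"/>
<cvParam accession="MS:1000041" name="charge state" value="6"/>
</selectedIon></selectedIonList></precursor></precursorList></spectrum>
<spectrum id="scan=2"><precursorList><precursor><selectedIonList><selectedIon>
<cvParam accession="MS:1000744" name="selected ion m/z" value="500.25"/>
</selectedIon></selectedIonList></precursor></precursorList></spectrum>
</spectrumList></run></mzML>"""

charges = annotated_charges(SAMPLE)
print(charges)
```

Spectra that map to `None` are the ones for which the configured charge range would come into play. Spectra with an annotated charge would be searched at that charge regardless of the range.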

@ypriverol
Member Author

@jpfeuffer @timosachsenberg would it make sense to add a parameter to filter the PSMs to that charge range?
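To make the idea concrete, such a filter could look roughly like the sketch below. It is not an existing OpenMS or pipeline option. It is a standalone illustration that drops `PeptideHit` elements whose `charge` attribute falls outside a given range, using only the Python standard library. The function name `filter_hits_by_charge`, the sample sequences and the trimmed idXML-like fragment are all hypothetical.

```python
import xml.etree.ElementTree as ET


def filter_hits_by_charge(root, min_charge, max_charge):
    """Remove PeptideHit elements with a charge outside [min_charge, max_charge].

    Returns the number of hits removed.
    """
    removed = 0
    # Materialize the list first so that removing children is safe.
    for pep_id in list(root.iter("PeptideIdentification")):
        for hit in pep_id.findall("PeptideHit"):
            if not min_charge <= int(hit.get("charge")) <= max_charge:
                pep_id.remove(hit)
                removed += 1
    return removed


# Hypothetical, heavily trimmed idXML-like fragment for illustration only.
SAMPLE = """<IdentificationRun>
<PeptideIdentification spectrum_reference="scan=24975">
<PeptideHit sequence="PEPTIDEA" charge="6"/>
</PeptideIdentification>
<PeptideIdentification spectrum_reference="scan=100">
<PeptideHit sequence="PEPTIDEK" charge="2"/>
</PeptideIdentification>
</IdentificationRun>"""

root = ET.fromstring(SAMPLE)
removed = filter_hits_by_charge(root, min_charge=2, max_charge=4)
kept = [hit.get("sequence") for hit in root.iter("PeptideHit")]
print(f"removed {removed} hit(s), kept {kept}")
```

The sketch leaves empty `PeptideIdentification` elements in place. A real filter would also have to decide whether to drop those.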

@timosachsenberg

Good question.
I think these high-charge peptides are potentially interesting, so one could argue that they should be reported.
On the other hand, you get more well-defined and consistent results with filtering.
I would probably keep them by default, but I could add an optional filter if we decide that we want to filter them.
