Variants shouldn't be mapped to reference sequences #1

Open
ypriverol opened this issue Jul 27, 2021 · 12 comments

Comments

@ypriverol
Member

@EamonnOCearnaigh

As I mentioned in the discussion the other day, it would be great to have logic in the tool so that, when the user sets -mm NUM to allow mismatches and the query peptide matches both with 0 mismatches and with 1, 2, etc. mismatches, the matches with 1, 2, etc. mismatches are discarded. The use case is:

Peptide query: AAAA. Protein A contains ..AAAA.. and Protein B contains ..AVAA.. We discard the second match because it is far more likely (I would say 100%) that this is only a reference peptide. James mentioned that he does a two-step search: first discard reference peptides, then search with multiple gaps.

It would be great if we could implement that feature.
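
The rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not pepgenome's actual code: the class `ExactMatchFilter`, the `Hit` type, and the methods `findHits` and `preferExact` are hypothetical names, and the naive window scan stands in for whatever k-mer lookup the tool really uses.

```java
import java.util.ArrayList;
import java.util.List;

public class ExactMatchFilter {

    /** One candidate mapping of a query peptide onto a protein (hypothetical type). */
    static final class Hit {
        final String protein;
        final int position;
        final int mismatches;

        Hit(String protein, int position, int mismatches) {
            this.protein = protein;
            this.position = position;
            this.mismatches = mismatches;
        }

        @Override
        public String toString() {
            return protein + "@" + position + " (mismatches=" + mismatches + ")";
        }
    }

    /** Counts the positions at which two equal-length strings differ. */
    static int countMismatches(String a, String b) {
        int n = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                n++;
            }
        }
        return n;
    }

    /** Naive scan: every window of the protein within maxMismatches of the peptide. */
    static List<Hit> findHits(String peptide, String proteinName, String sequence, int maxMismatches) {
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i + peptide.length() <= sequence.length(); i++) {
            int mm = countMismatches(peptide, sequence.substring(i, i + peptide.length()));
            if (mm <= maxMismatches) {
                hits.add(new Hit(proteinName, i, mm));
            }
        }
        return hits;
    }

    /** If any hit is exact, keep only the exact hits; otherwise keep all hits. */
    static List<Hit> preferExact(List<Hit> hits) {
        List<Hit> exact = new ArrayList<>();
        for (Hit h : hits) {
            if (h.mismatches == 0) {
                exact.add(h);
            }
        }
        return exact.isEmpty() ? hits : exact;
    }

    public static void main(String[] args) {
        // Toy sequences built around the AAAA / AVAA example from this comment.
        List<Hit> hits = new ArrayList<>();
        hits.addAll(findHits("AAAA", "ProteinA", "GGAAAAGG", 1));
        hits.addAll(findHits("AAAA", "ProteinB", "GGAVAAGG", 1));
        for (Hit h : preferExact(hits)) {
            System.out.println(h);
        }
    }
}
```

With these toy sequences only the exact hit on ProteinA survives. A peptide with no exact hit anywhere keeps its mismatch hits, so true variant peptides are still reported.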

@ypriverol ypriverol transferred this issue from bigbio/pgatk Aug 18, 2021
@EamonnOCearnaigh

Sounds great - I'll see if I can plan out another edit to implement that this week. Also, James passed a Maven command on to me, but I am still facing errors when running pepgenome from the terminal. Would you have any ideas?

@ypriverol
Member Author

Which error do you get?

@ypriverol
Member Author

This command should work:

$ java -jar pepgenome-1.1.1-bin.jar
usage: Arguments: -fasta TRANSL -gtf ANNO -in *.tsv[,*.tsv] [-format OUTF]
                  [-merge TRUE/FALSE] [-source SRC] [-mm NUM] [-mmmode
                  TRUE/FALSE] [-species SPECIES] [-chr 0/1]
 -ann <arg>              Filepath for file containing genome annotation in
                         GTF or GFF3 format
 -chr <arg>              Export chr prefix Allowed 0, 1  (default: 0)
 -exco <arg>             Use exon coordinates rather than CDS (Unannotated
                         peptides)
 -fasta <arg>            Filepath for file containing protein sequences in
                         FASTA format
 -format <arg>           Select the output formats from gtf, gct, bed,
                         ptmbed, all or combinations thereof separated by
                         ',' (default all)
 -genome <arg>           Filepath for file containing genome sequence in
                         FASTA format used to extract chromosome names and
                         order and differenciate between assembly and
                         scaffolds. If not set chromosome and scaffold
                         names and order is extracted from GTF input.
 -gff <arg>              Filepath for file containing genome annotation in
                         GFF3 format
 -gtf <arg>              Filepath for file containing genome annotation in
                         GTF format
 -h                      Print this help & exit
 -in <arg>               Comma(,) separated file paths for files
                         containing peptide identifications (Contents of
                         the file can tab separated format. i.e., File
                         format: four columns: SampleName
                         PeptideSequence
                         PSMs
                         Quant; or mzTab, and mzIdentML)
 -inf <arg>              Format of the input file (mztab, mzid, pavro, or
                         tsv). (default tsv)
 -inm <arg>              Compute the kmer algorithm in memory or using
                         database algorithm (default 0, database 1)
 -merge <arg>            Set 'true' to merge mappings from all files from
                         input (default 'false')
 -mm <arg>               Allowed mismatches (0, 1 or 2; default: 0)
 -mmmode <arg>           Mismatch mode (true or false): if true
                         mismatching with two mismatches will only allow 1
                         mismatch every kmersize (default: 5) positions.
                         (default: false)
 -source <arg>           Please give a source name which will be used in
                         the second column in the output gtf file
                         (default: PoGo)
 -variant_filter <arg>   Peptide filter mode.

@EamonnOCearnaigh

Hey, sorry for the delay - job hunting at the minute since my placement contract is ending.

I'm having trouble actually generating the JAR. The JAR wasn't working when generated from the IntelliJ Java application configuration that I was using for debugging, so I switched to a Maven configuration and I'm trying to get it to build. I ran "mvn install" on the command line, which downloaded some files but resulted in:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 06:38 min
[INFO] Finished at: 2021-08-26T03:10:49+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project pepgenome: Could not resolve dependencies for project org.bigbio.pgatk:pepgenome:jar:1.1.beta: Could not find
artifact com.sun.java:tools:jar:13.0.2 at specified path C:\Program Files\Java\jdk-13.0.2/../lib/tools.jar

Everything is set to use the same version of Java. Any idea what's causing this? I didn't get this error during the testing stages at all.

@ypriverol
Member Author

I removed that dependency. Can you try now, @EamonnOCearnaigh?

@EamonnOCearnaigh

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.108 s
[INFO] Finished at: 2021-08-26T23:54:46+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project pepgenome: Could not resolve dependencies for project io.github.bigbio:pepgenome:jar:1.1.1: Failed to collect
dependencies at uk.ac.ebi.jmzidml:jmzidentml:jar:1.2.11 -> uk.ac.ebi.pride.architectural:pride-xml-handling:pom:1.0.3 -> psidev.psi.tools:xxindex:jar:0.
23 -> net.sourceforge.cpdetector:cpdetector:jar:1.0.7 -> net.sourceforge.jargs:jargs:jar:1.0: Failed to read artifact descriptor for net.sourceforge.jar
gs:jargs:jar:1.0: Could not transfer artifact net.sourceforge.jargs:jargs:pom:1.0 from/to sonatype-release (https://oss.sonatype.org/service/local/stagi
ng/deploy/maven2): authentication failed for https://oss.sonatype.org/service/local/staging/deploy/maven2/net/sourceforge/jargs/jargs/1.0/jargs-1.0.pom,
status: 401 Unauthorized -> [Help 1]

@EamonnOCearnaigh

I'm running "mvn install" on the command line. Is that the right command?

@ypriverol
Member Author

Yes, mvn install is fine. Which Java version do you have? Most of these issues are related to the Java version you are using.

@ypriverol
Member Author

I have added the dependency to our internal Maven repo. Can you try again, @EamonnOCearnaigh?

@EamonnOCearnaigh

It got further this time.

Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/struts/struts-core/1.3.8/struts-core-1.3.8.jar (329 kB at 270 kB/s)
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/struts/struts-taglib/1.3.8/struts-taglib-1.3.8.jar (252 kB at 205 kB/s)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:00 min
[INFO] Finished at: 2021-08-27T18:29:02+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.1.1:jar (attach-javadocs) on project pepgenome: MavenReportException: Err
or while generating Javadoc: Unable to find javadoc command: The environment variable JAVA_HOME is not correctly set. -> [Help 1]

Also my Java version is:

java version "13.0.2" 2020-01-14
Java(TM) SE Runtime Environment (build 13.0.2+8)
Java HotSpot(TM) 64-Bit Server VM (build 13.0.2+8, mixed mode, sharing)
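
The javadoc error above names JAVA_HOME, so one way to narrow it down is to check what that variable points at. The sketch below assumes the plugin expects the javadoc executable under JAVA_HOME/bin, which is what the message suggests but is not confirmed in this thread; `JavaHomeCheck`, `javadocPath`, and `diagnose` are hypothetical names used only for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JavaHomeCheck {

    /** Returns the location where javadoc is assumed to live under a JDK home directory. */
    static Path javadocPath(String javaHome, boolean windows) {
        return Paths.get(javaHome, "bin", windows ? "javadoc.exe" : "javadoc");
    }

    /** Describes what is wrong (or right) with the given JAVA_HOME value. */
    static String diagnose(String javaHome, boolean windows) {
        if (javaHome == null || javaHome.isEmpty()) {
            return "JAVA_HOME is not set";
        }
        Path javadoc = javadocPath(javaHome, windows);
        if (!Files.isRegularFile(javadoc)) {
            return "JAVA_HOME is set to " + javaHome + " but " + javadoc + " does not exist";
        }
        return "OK: found " + javadoc;
    }

    public static void main(String[] args) {
        // Reads the same environment variable that the error message refers to.
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        System.out.println(diagnose(System.getenv("JAVA_HOME"), windows));
    }
}
```

If this reports that JAVA_HOME is unset, or that it points at a directory without bin/javadoc (for example a JRE rather than a JDK), fixing the variable and re-running "mvn install" would be the next thing to try.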

@ypriverol
Member Author

The build can't find the javadoc command. I have no idea why. Can you check on Stack Overflow?

@EamonnOCearnaigh

Will do. For now, do you have a working JAR you could post to GitHub or send by email? We just need to place it into a Nextflow pipeline.
