AssertionError: amino acid sequence length (80) less than mutation position 81 #27
Comments
The first thing I would check is that your references match the genome reference used to generate the VCF file. I have seen this error occur when there is a mismatch between the two, since the genomic positions then do not line up.
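A rough way to run the check suggested above is to compare each REF allele in the VCF with the base(s) found at that position in the genome reference. The following is only a minimal sketch: `find_ref_mismatches` is a hypothetical helper and not part of MuPeXI, and it assumes the genome has already been loaded into a dict mapping contig name to sequence and that the VCF lines are tab-separated with 1-based positions.

```python
def find_ref_mismatches(vcf_lines, genome):
    """Return (chrom, pos, vcf_ref, genome_ref) for every record whose REF
    allele disagrees with the genome. `genome` maps contig name -> sequence.
    genome_ref is None when the contig is missing from the reference."""
    mismatches = []
    for line in vcf_lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        seq = genome.get(chrom)
        if seq is None:
            # Contig naming differs (for example "chr1" versus "1").
            mismatches.append((chrom, pos, ref, None))
            continue
        start = pos - 1  # VCF positions are 1-based
        observed = seq[start:start + len(ref)].upper()
        if observed != ref.upper():
            mismatches.append((chrom, pos, ref, observed))
    return mismatches


# Toy data, made up for illustration only.
genome = {"1": "ACGTACGTAC"}
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT",
    "1\t2\t.\tC\tT",      # REF matches the genome
    "1\t5\t.\tG\tA",      # REF disagrees with the genome
    "chr1\t2\t.\tC\tT",   # contig name not present in the reference
]
for record in find_ref_mismatches(vcf, genome):
    print(record)
```

If many records come back as mismatches, the VCF and the references were probably built from different assemblies or use different contig naming.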
The reference used to make the VCF file was hg38, and the references I'm giving MuPeXI are GRCh38. As I understand it, they use the same positions, so would that make a difference?
No, this should be fine. I will look into this error soon and figure out why it occurs.
Thanks!
Dear d-henness,
MuPeXI is designed to take the VCF file from a variant caller, preferably MuTect2, so I cannot rule out that your error is caused by the processing through VEP. Do you see the same error if you run the VCF file obtained directly from MuTect2?
I do.
Can you send me a link to the original MuTect2 file that has not been processed with VEP?
I'll email you the file.
I encountered the same error. Has this been resolved?
I have encountered the same error with the example VCF and TSV files provided in the data/ folder of the MuPeXI GitHub repo. Has this been resolved?
I am getting the following error when I try to run MuPeXI on one of my VCF files:
Reading in data
Creating proteome reference dictionary
Creating genome reference dictionary
Creating cancer genes list
VEP: Starting process for running the Ensembl Variant Effect Predictor
Detecting variant caller
MuTect2
Change VCF to the VEP compatible
Extracting allele frequencies
Running VEP
Creating mutation information dictionary
MuPeX: Starting mutant peptide extraction
Extracting all possible peptides from reference
Peptides of 9 aa are being extracted
Peptide extraction begun
Traceback (most recent call last):
File "/home/arunimas/MuPeXI/MuPeXI.py", line 1807, in <module>
main(sys.argv[1:])
File "/home/arunimas/MuPeXI/MuPeXI.py", line 78, in main
peptide_info, peptide_counters, fasta_printout, pepmatch_file_names = peptide_extraction(peptide_length, vep_info, proteome_reference, genome_reference, reference_peptides, reference_peptide_file_names, input_.fasta_file_name, paths.peptide_match, tmp_dir, input_.webserver, input_.print_mismatch, input_.keep_temp, input_.prefix, input_.outdir, input_.num_mismatches)
File "/home/arunimas/MuPeXI/MuPeXI.py", line 730, in peptide_extraction
peptide_sequence_info = mutation_sequence_creation(mutation_info, proteome_reference, genome_reference, p_length)
File "/home/arunimas/MuPeXI/MuPeXI.py", line 763, in mutation_sequence_creation
peptide_sequence_info = insertion_peptide(proteome_reference, mutation_info, peptide_length, PeptideSequenceInfo)
File "/home/arunimas/MuPeXI/MuPeXI.py", line 789, in insertion_peptide
asserted_proteome = reference_assertion(proteome_reference, mutation_info, reference_type = 'proteome')
File "/home/arunimas/MuPeXI/MuPeXI.py", line 1073, in reference_assertion
assert len(seq) >= mutation_info.prot_pos, 'amino acid sequence length ({}) less than mutation position {}'.format(len(seq), mutation_info.prot_pos)
AssertionError: amino acid sequence length (80) less than mutation position 81
I run MuPeXI with
/home/arunimas/MuPeXI/MuPeXI.py -v header.vcf -a HLA-A01:01,HLA-A32:01,HLA-B08:01,HLA-B14:01,HLA-C07:01,HLA-C08:02 -c /home/arunimas/MuPeXI/config.ini -t
I've attached a minimal VCF file that reproduces this error:
header.vcf.gz
Is there anything I can do to fix this myself?
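One way to narrow the problem down is to repeat the failing assertion outside MuPeXI and list every mutation whose protein position lies beyond the end of its reference protein. This is a minimal sketch under stated assumptions: `parse_fasta` and `out_of_range` are hypothetical helpers and not MuPeXI functions, the proteome is assumed to be a FASTA whose header's first token is the protein ID, and the IDs and positions below are made up.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of {first header token: sequence}."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def out_of_range(mutations, proteome):
    """Return (protein_id, prot_pos, seq_len) for each mutation that would
    fail the check len(seq) >= prot_pos. seq_len is None if the ID is absent."""
    bad = []
    for protein_id, prot_pos in mutations:
        seq = proteome.get(protein_id)
        if seq is None:
            bad.append((protein_id, prot_pos, None))
        elif len(seq) < prot_pos:
            bad.append((protein_id, prot_pos, len(seq)))
    return bad


# Toy proteome and mutation list, for illustration only.
fasta_text = """>ENSP_A example protein
MKT
LLV
>ENSP_B
MK
"""
proteome = parse_fasta(fasta_text)
mutations = [("ENSP_A", 6), ("ENSP_A", 7), ("ENSP_B", 1), ("ENSP_C", 1)]
for entry in out_of_range(mutations, proteome):
    print(entry)
```

Any entry this reports is a candidate for the variant that triggers the AssertionError, which makes it easier to inspect that record in the VCF and the VEP output.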