arriba + variants key error and strange log #337

Closed
lydiayliu opened this issue Jan 13, 2022 · 7 comments · Fixed by #339

Comments

@lydiayliu
Collaborator

a=/hot/users/yiyangliu/MoPepGen/Parser/VEP/gencode/gsnp/CPCG0324.gencode.tsv.s.gvf
b=$(basename -- "$a"); echo ${b};
c="${b%%.*}"; echo ${c};
moPepGen callVariant \
    --input-variant /hot/users/yiyangliu/MoPepGen/Parser/Fusion/arriba-2.1.0/${c}.s.gvf \
        /hot/users/yiyangliu/MoPepGen/Parser/VEP/gencode/gsnp/${b} \
        /hot/users/yiyangliu/MoPepGen/Parser/VEP/gencode/gindel/${b} \
        /hot/users/yiyangliu/MoPepGen/Parser/VEP/gencode/somaticsniper/${b} \
        /hot/users/yiyangliu/MoPepGen/Parser/VEP/gencode/pindel/${b} \
    --index-dir /hot/users/yiyangliu/MoPepGen/Index/GRCh38-EBI-GENCODE34/ \
    --verbose-level 1 \
    --threads 16 \
    --output-fasta /hot/users/yiyangliu/MoPepGen/Variant/Fusion/arriba-2.1.0/ssm/${c}.fasta > /hot/users/yiyangliu/MoPepGen/Variant/Fusion/arriba-2.1.0/ssm/${c}.log
...
[ 2022-01-13 00:00:27 ] Exception raised from fusion FUSION-ENSG00000204177.10:11495-ENSG00000204179.10:53590
An error has occured during the function execution
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/ppft/__main__.py", line 111, in run
    __result = __f(*__args)
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 191, in wrapper
    return call_variant_peptides_wrapper(*dispatch)
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 160, in call_variant_peptides_wrapper
    _peptides = call_peptide_fusion(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 357, in call_peptide_fusion
    dgraph.create_variant_graph(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/svgraph/ThreeFrameTVG.py", line 916, in create_variant_graph
    cursors = self.apply_fusion(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/svgraph/ThreeFrameTVG.py", line 555, in apply_fusion
    insertion_variants = variant_pool.filter_variants(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/seqvar/VariantRecordPool.py", line 177, in filter_variants
    gene_id = self.anno.transcripts[tx_id].transcript.gene_id
KeyError: 'ENST00000506881.5'

Please also check the very end of the log. I believe the error is reported by two threads, but there are a lot of strange characters
^@^@^@^@^@^@^@^@^@^@^
in the log...
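
For anyone checking their own logs: `^@` is how many terminals and editors display a NUL byte (0x00). Below is a minimal sketch for counting and stripping them, assuming the log is small enough to read into memory. The helper names are made up for illustration and are not part of moPepGen.

```python
from pathlib import Path


def count_nul_bytes(data: bytes) -> int:
    """Return how many NUL (0x00) bytes appear in the data."""
    return data.count(b"\x00")


def strip_nul_bytes(data: bytes) -> bytes:
    """Return the data with every NUL byte removed."""
    return data.replace(b"\x00", b"")


def clean_log(src: Path, dst: Path) -> int:
    """Write a NUL-free copy of src to dst; return the number of bytes removed."""
    raw = src.read_bytes()
    dst.write_bytes(strip_nul_bytes(raw))
    return count_nul_bytes(raw)
```

This only cleans up the symptom so the log is readable; it says nothing about where the NUL bytes come from.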

Also with the new multiprocessing, error reporting always happens twice. The above error was written to the LOG file (so stdout), but on the terminal you also get this below (which is stderr). Is this split design intentional?

Traceback (most recent call last):
  File "/usr/local/bin/moPepGen", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/__main__.py", line 79, in main
    args.func(args)
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 279, in call_variant_peptide
    for peptides in peptide_series:
TypeError: 'NoneType' object is not iterable
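
On the split itself: `>` in the command above redirects only stdout, and Python writes the traceback of an uncaught exception to stderr, so the two messages end up in different places. Here is a small sketch that shows this with a throwaway child process. The child script is made up for illustration and has nothing to do with moPepGen's internals.

```python
import subprocess
import sys

# Hypothetical child script: prints a log line to stdout, then fails with
# an uncaught exception, whose traceback Python sends to stderr.
CHILD = 'print("worker log line"); raise TypeError("boom")'


def run_child() -> subprocess.CompletedProcess:
    """Run the child and capture stdout and stderr separately."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    result = run_child()
    print("stdout:", result.stdout.strip())
    print("stderr last line:", result.stderr.strip().splitlines()[-1])
```

In the shell, appending `2>&1` after the `> file` redirect would send both streams to the same log file.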
@zhuchcn
Member

zhuchcn commented Jan 13, 2022

My guess is that the first error message is printed by the worker process/thread and the second by the main thread. The binary characters might be because the data somehow gets garbled when it is pickled/unpickled and transferred between threads.
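
A rough sketch of that worker/main pattern, assuming the parallel map hands back `None` when a worker fails (which is what the `'NoneType' object is not iterable` traceback suggests). `fake_parallel_map`, `worker`, and `collect` are hypothetical stand-ins, not the pathos/ppft or moPepGen API.

```python
from typing import Callable, Iterable, List, Optional


def worker(i: int) -> List[str]:
    """Hypothetical worker that fails on one input, like the fusion case."""
    if i == 2:
        raise KeyError("ENST00000506881.5")
    return [f"PEPTIDE{i}"]


def fake_parallel_map(func: Callable[[int], List[str]],
                      items: Iterable[int]) -> Optional[List[List[str]]]:
    """Stand-in for a pool's map: report a worker error and return None."""
    results = []
    for item in items:
        try:
            results.append(func(item))
        except Exception as error:  # the "worker" reports the error itself
            print(f"Error in worker: {error!r}")
            return None
    return results


def collect(series: Optional[List[List[str]]]) -> List[str]:
    """Flatten results; fail with a clear message if the map returned None."""
    if series is None:
        raise RuntimeError("a worker failed; see the worker traceback above")
    return [peptide for peptides in series for peptide in peptides]
```

Checking for `None` in the main process would replace the second, less informative `TypeError` with a message that points back to the worker's traceback.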

@zhuchcn
Member

zhuchcn commented Jan 13, 2022

I opened an issue at uqfoundation/pathos#228

@lydiayliu
Collaborator Author

Adding another case here for
a=/hot/users/yiyangliu/MoPepGen/Parser/VEP/gencode/gsnp/CPCG0249.gencode.tsv.s.gvf

[ 2022-01-13 17:36:55 ] 16000 transcripts processed.
[ 2022-01-13 17:37:13 ] Exception raised from fusion FUSION-ENSG00000118260.15:47919-ENSG00000227308.2:22649
An error has occured during the function execution
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/ppft/__main__.py", line 111, in run
    __result = __f(*__args)
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 191, in wrapper
    return call_variant_peptides_wrapper(*dispatch)
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 160, in call_variant_peptides_wrapper
    _peptides = call_peptide_fusion(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/call_variant_peptide.py", line 357, in call_peptide_fusion
    dgraph.create_variant_graph(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/svgraph/ThreeFrameTVG.py", line 916, in create_variant_graph
    cursors = self.apply_fusion(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/svgraph/ThreeFrameTVG.py", line 555, in apply_fusion
    insertion_variants = variant_pool.filter_variants(
  File "/usr/local/lib/python3.8/site-packages/moPepGen/seqvar/VariantRecordPool.py", line 177, in filter_variants
    gene_id = self.anno.transcripts[tx_id].transcript.gene_id
KeyError: 'ENST00000607654.1'

The same transcript is hit in 4 threads, producing 4 errors, some with the strange symbols in between.
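
In both cases the `KeyError` comes from indexing `self.anno.transcripts[tx_id]` with a transcript ID that is not in the annotation. Below is a minimal sketch of that failure mode using a plain dict. The mapping, the IDs in it, and both helpers are made up for illustration; this is neither moPepGen's code nor the fix in #339.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the annotation: transcript ID -> gene ID.
TRANSCRIPT_TO_GENE: Dict[str, str] = {
    "ENST00000000001.1": "ENSG00000000001.1",
}


def gene_id_strict(tx_id: str) -> str:
    """Plain indexing: raises KeyError when the transcript is not annotated."""
    return TRANSCRIPT_TO_GENE[tx_id]


def gene_id_or_none(tx_id: str) -> Optional[str]:
    """Tolerant lookup: returns None when the transcript is not annotated."""
    return TRANSCRIPT_TO_GENE.get(tx_id)
```

Whether a missing transcript should be skipped or treated as a hard error depends on why it is missing, so the tolerant lookup is only one possible choice.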

@zhuchcn
Member

zhuchcn commented Jan 14, 2022

Case 1 (CPCG0324) also seems to be fixed by #339. Fun fact: for this fusion, the donor part has 1300 bases, the accepter's exonic sequence has 1710 bases, but the intronic region carried over from the accepter gene has 91093 bases 😂
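
To put those numbers in proportion, here is a quick calculation, assuming the three pieces simply add up to the fused sequence with no overlap (the variable names are just for this sketch):

```python
donor = 1300                # bases from the donor part
accepter_exonic = 1710      # accepter exonic bases
accepter_intronic = 91093   # accepter intronic bases carried over

total = donor + accepter_exonic + accepter_intronic
intronic_fraction = accepter_intronic / total

print(total)                        # 94103
print(f"{intronic_fraction:.1%}")   # 96.8%
```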

@zhuchcn
Member

zhuchcn commented Jan 14, 2022

Case 2 also fixed!

zhuchcn linked a pull request Jan 14, 2022 that will close this issue
@lydiayliu
Collaborator Author

the intronic region carried over from the accepter gene has 91093 bases

lmao! it's these introns that are making fusion run time super slow lolll

gimme a sec to double check both of these!

@lydiayliu
Collaborator Author

both cases confirmed resolved. wow #339 is the bomb
