Issue using --ranges #6

Closed
sa-andre opened this issue Jul 3, 2024 · 4 comments

Comments


sa-andre commented Jul 3, 2024

Hello, I am once again trying the analysis using ranges and it isn't working.

I installed syny via conda and tried the example file, which completed correctly. When I used my own files with --ranges, it didn't work. I tried the same files and scaffolds with --include (scaffold names only) and it worked, so the problem seems to be with the ranges I am using. I double-checked the ranges and couldn't find any error. I tried both mashmap and minimap: each run finishes but fails to generate any alignment (as far as I understand, the error is in the PAF alignment). I am attaching the error files for both mashmap and minimap. The genomes I am using are the ones you suggested that contain annotations. One of the error files suggests the process was killed by the Linux out-of-memory killer; however, I ran with --include without any OOM killer errors, and that is supposedly more demanding than --ranges, so I am not sure memory is the actual problem.

wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/904/425/465/GCF_904425465.1_Colossoma_macropomum/GCF_904425465.1_Colossoma_macropomum_genomic.gbff.gz
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/015/220/715/GCF_015220715.1_fPygNat1.pri/GCF_015220715.1_fPygNat1.pri_genomic.gbff.gz

ranges1.txt
minimapsyny.log
minimaperror.log
mashmapsyny.log
mashmaperror.log

@Pombert-JF
Member

The issue stems from list_maker.pl. Working on a fix (the code was missing an if (exists $ranges{$contig}){ } condition).
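To illustrate what a guard of this kind does, here is a minimal sketch in Python rather than the Perl of list_maker.pl. It is not the actual fix: the function name, the feature tuples, and the `ranges` layout (contig name mapped to a list of start/end pairs) are all assumptions made for the example, since the script's internals and the ranges file format are not shown in this thread.

```python
# Hypothetical data layout: contig name -> list of (start, end) subranges.
# This is an illustration only, not the code from list_maker.pl.
def filter_features(features, ranges):
    """Keep the features that fall inside a requested subrange of their contig."""
    kept = []
    for contig, start, end, locus in features:
        # Python analogue of Perl's `if (exists $ranges{$contig}) { ... }`:
        # contigs with no requested ranges are skipped instead of being
        # looked up blindly.
        if contig in ranges:
            for r_start, r_end in ranges[contig]:
                if start >= r_start and end <= r_end:
                    kept.append(locus)
                    break
    return kept


features = [
    ("chr1", 100, 500, "geneA"),
    ("chr1", 9000, 9500, "geneB"),  # outside the requested subrange
    ("chr2", 200, 800, "geneC"),    # chr2 has no entry in ranges
]
ranges = {"chr1": [(1, 1000)]}

print(filter_features(features, ranges))  # ['geneA']
```

Without the membership check, the lookup `ranges[contig]` would raise a KeyError in Python for `chr2`; in Perl, reading a missing hash key yields an undefined value instead, which can fail less visibly.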

The putative fix works fine on the mashmap alignments. Running the diamond searches is surprisingly slow however. Will likely have to run it overnight to test it properly. Might take a day or two to get a clean fix.

@Pombert-JF
Member

list_maker.pl is now fixed and works properly with subranges. Also had to fix an issue with isoforms that resulted in concatenated strings and abnormally long runtimes (it was messing up DIAMOND homology searches). The new version has been pushed to GitHub.

Running the new version on your data with run_syny.pl -a *.gbff.gz --ranges ranges1.txt --aligner mashmap --out SUBRANGESmash -g 0 1 5 resulted in:

[Plot images: GCF_015220715_vs_GCF_904425465 gap_5 and mmap (1e5, 19, 2x10, 8, blue); gap_5 and mmap barplots (19, 2x10, 8, Spectral)]

Author

sa-andre commented Jul 8, 2024

I tried using minimap (the default) without --gaps and it worked fine. Thanks again!

Why is it now generating two dotplot graphs, one that says minimap and the other that says gap?

@Pombert-JF
Member

The .mmap files are the plots for the minimap2/mashmap3 genome alignments.
The .gap files are the plots generated from the gene cluster inferences.

Will close this issue as resolved.
