Long-read nanopore sequencing uncovers population-specific structural variation in the Middle East and North Africa
This is a web-based app that overlaps patient structural variants (SVs) with several databases to reduce the number of actionable SVs and enhance diagnostics.
Patient SVs are overlapped with various databases based on the following parameters: (Write the method of overlap). Users can adjust the similarity threshold used for the database comparisons with the controls on the left-hand side of the app. A reciprocal overlap of 50% is generally recommended for deciding whether an SV has been reported previously.
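To illustrate what a 50% reciprocal overlap threshold means, here is a minimal Python sketch. It is not the app's actual overlap method, which is not yet documented above. The function names, the dictionary keys (`chrom`, `svtype`, `start`, `end`) and the requirement that both SVs share the same type are assumptions made for illustration only. Two SVs pass the test when the shared region covers at least the threshold fraction of both of them.

```python
def reciprocal_overlap(start_a, end_a, start_b, end_b):
    """Return the smaller of the two overlap fractions for two intervals.

    Coordinates are treated as half-open [start, end). Intervals with no
    length (for example insertions placed at a single reference position)
    are not handled by this sketch and return 0.0.
    """
    len_a = end_a - start_a
    len_b = end_b - start_b
    shared = min(end_a, end_b) - max(start_a, start_b)
    if len_a <= 0 or len_b <= 0 or shared <= 0:
        return 0.0
    return min(shared / len_a, shared / len_b)


def is_previously_reported(patient_sv, database_sv, threshold=0.5):
    """Hypothetical helper: compare one patient SV with one database SV."""
    # Assumption: only SVs on the same chromosome and of the same type match.
    if patient_sv["chrom"] != database_sv["chrom"]:
        return False
    if patient_sv["svtype"] != database_sv["svtype"]:
        return False
    fraction = reciprocal_overlap(
        patient_sv["start"], patient_sv["end"],
        database_sv["start"], database_sv["end"],
    )
    return fraction >= threshold


# Made-up example records, not taken from any of the databases.
patient = {"chrom": "chr1", "svtype": "DEL", "start": 1000, "end": 2000}
known = {"chrom": "chr1", "svtype": "DEL", "start": 1200, "end": 2100}
print(is_previously_reported(patient, known))
```

Raising the threshold makes matching stricter, so fewer patient SVs are treated as previously reported; lowering it has the opposite effect.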
The databases used in the app to reduce the number of actionable SVs are:
1- Human Genome Structural Variation Consortium, Phase 2 (HGSVC2). [https://www.internationalgenome.org/data-portal/data-collection/hgsvc2]
2- Sniffles-based calls from 1,019 individuals from the 1k-ONT project [https://www.nature.com/articles/s41586-025-09290-7#Sec2] merged with 61 individuals of MENA ancestry using SURVIVOR [https://github.com/fritzsedlazeck/SURVIVOR]
3- SVs from 61 MENA individuals, where an SV was called by at least three different SV callers, including SVIM, Delly, Sniffles and cuteSV.
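The "at least three different SV callers" criterion for the third database can be sketched as a simple support count. This is an illustrative Python example only: the function name, the input structure (a mapping from an SV identifier to the callers that reported it) and the SV identifiers are hypothetical and do not describe how the call set was actually built.

```python
def keep_multi_caller_svs(caller_support, min_callers=3):
    """Return the IDs of SVs reported by at least `min_callers` distinct callers.

    `caller_support` maps a (hypothetical) SV identifier to the names of the
    callers that reported it. Repeated caller names are counted once.
    """
    return sorted(
        sv_id
        for sv_id, callers in caller_support.items()
        if len(set(callers)) >= min_callers
    )


# Made-up support table for three SVs.
support = {
    "sv1": ["Sniffles", "cuteSV", "SVIM"],
    "sv2": ["Sniffles", "Sniffles", "Delly"],
    "sv3": ["Sniffles", "cuteSV", "SVIM", "Delly"],
}
print(keep_multi_caller_svs(support))
```

Counting distinct caller names rather than raw entries prevents an SV reported twice by the same caller from being treated as having extra support.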
The app aims to identify previously reported SVs from publicly available databases and to reduce the number of patient SVs overlapping with exons of OMIM genes (https://www.omim.org/) and GIAB medically relevant genes [https://www.nature.com/articles/s41467-024-53260-y#data-availability]. At the end of the analysis, users can download two outputs: the patient SVs annotated with various allele frequency metrics from the 1k-ONT project, and a list of SVs unique to that patient that were not reported in any of the databases, including SVs overlapping with OMIM exons and medically relevant genes.
Requirements can be found in requirements.txt. Alternatively, you can use our Docker container.
After installing the requirements, you can start the app by running:
streamlit run app.py
Alternatively, we provide a Docker container for easy execution:
tbd