Home
Welcome to the Roth Lab Wiki!
At the Roth lab, we currently have access to three HPC clusters: two at our University of Toronto location and one at the University of Pittsburgh.
- galen, located at the Lunenfeld-Tanenbaum Research Institute (LTRI, Mt Sinai Hospital)
  - galen uses the "Slurm" scheduling system.
  - After our move to Pittsburgh, we only have a grace period until approximately September 2024 to use this resource.
- dc, located at the Donnelly Centre
  - dc uses the "PBS" scheduling system.
  - After our move to Pittsburgh, we only have a grace period until approximately September 2024 to use this resource.
- cluster.csb.pitt.edu, located at the Department of Computational Systems Biology, University of Pittsburgh
  - cluster uses the "Slurm" scheduling system as well.
Despite the clusters using different scheduling systems, we have a custom abstraction layer called clusterutil that offers a unified interface.
- How to get started on the Pittsburgh cluster
- How to get started on the Galen cluster
- HPC jobs with clusterutil
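As a rough illustration of what a unified interface over two schedulers involves, the sketch below maps one function call onto either Slurm's sbatch or PBS's qsub. This is not clusterutil's actual code or interface: the function name build_submit_command and its parameters are hypothetical, and only the job-name flags (--job-name for sbatch, -N for qsub) come from the schedulers themselves. See the "HPC jobs with clusterutil" article for the real usage.

```python
def build_submit_command(scheduler, script, job_name):
    """Return the submit command (as an argument list) for the given scheduler.

    Hypothetical helper for illustration only; not part of clusterutil.
    """
    if scheduler == "slurm":
        return ["sbatch", f"--job-name={job_name}", script]
    if scheduler == "pbs":
        return ["qsub", "-N", job_name, script]
    raise ValueError(f"unknown scheduler: {scheduler!r}")


# Schedulers as listed for each cluster above.
CLUSTER_SCHEDULERS = {
    "galen": "slurm",
    "dc": "pbs",
    "cluster.csb.pitt.edu": "slurm",
}

for cluster, scheduler in CLUSTER_SCHEDULERS.items():
    # Only builds the command; nothing is submitted.
    print(cluster, build_submit_command(scheduler, "job.sh", "demo"))
```

The caller names the job and the script once, and the scheduler-specific flags stay in one place.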
We have two web servers at the LTRI in Toronto: dalai, our web-dev server, and yantra, the web production server.
Help articles:
We have two servers for the purpose of storing and demultiplexing our sequencing data.
- rothseq1, located at the LTRI (Mt Sinai)
- rothsequt, located at the Donnelly Centre
These servers do not have access to the HPC clusters. However, rothseq1 shares user home directories with galen, making it easy to share data between the two.
Help articles:
TileSeq is a method for analyzing variant effect libraries. Instead of using barcodes to establish clone identity, TileSeq reads out the genotypes of the clones directly, albeit only within short "tiles" of the mutagenized target sequence. These tiles are designed to be ~150bp in length, which is just short enough to be covered completely by a standard Illumina read. This way, information from the forward (R1) and reverse (R2) reads can be used to distinguish real variants from sequencing errors. TileSeq data is analyzed in two steps. First, the tileseq_mut pipeline aligns reads to the template, then calls and counts variants. Second, the tileseqMave pipeline calculates the enrichment of variants between conditions, applies an error model and filters, and generates QC outputs.
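The idea of using R1/R2 agreement to separate real variants from sequencing errors can be sketched as follows. This is a toy illustration, not the tileseq_mut implementation: it assumes both reads are already aligned to the same template coordinates and oriented on the same strand, and the function names and sequences are made up for the example.

```python
def call_mismatches(template, read):
    """Return {position: base} for every position where the read differs from the template."""
    return {i: b for i, (t, b) in enumerate(zip(template, read)) if b != t}


def consensus_variants(template, r1, r2):
    """Keep only mismatches that both reads report with the same base."""
    m1 = call_mismatches(template, r1)
    m2 = call_mismatches(template, r2)
    return {pos: base for pos, base in m1.items() if m2.get(pos) == base}


template = "ATGGCCAAGT"
r1 = "ATGGTCAAGA"  # differs at position 4 (C>T) and position 9 (T>A)
r2 = "ATGGTCAAGT"  # differs at position 4 (C>T) only

# The change at position 9 is seen in only one read and is discarded.
print(consensus_variants(template, r1, r2))  # {4: 'T'}
```

A mismatch seen in only one of the two reads is treated as a likely sequencing error, while one supported by both is kept as a candidate variant.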
BarSeq is another method for analyzing variant effect libraries. Here, clones carry (mostly) unique barcodes. The association between barcode and genotype is established using long-read sequencing (PacBio), which is analyzed via the Pacybara pipeline. After the selection assay, the barcodes in the condition-specific libraries are sequenced with Illumina short reads and analyzed via the bartender wrapper Pacybartender (also found in the pacybara repo).
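The counting step of a barcode-based assay can be illustrated with a small sketch. It does not reflect how Pacybara or Pacybartender work internally: real pipelines typically tolerate sequencing errors in barcodes, whereas this toy only counts exact matches. The barcode-to-genotype table, the reads and all names here are invented for illustration.

```python
from collections import Counter


def count_genotypes(barcode_reads, barcode_to_genotype):
    """Count reads per genotype using exact barcode matches; unknown barcodes are skipped."""
    counts = Counter()
    for barcode in barcode_reads:
        genotype = barcode_to_genotype.get(barcode)
        if genotype is not None:
            counts[genotype] += 1
    return counts


# Hypothetical barcode-genotype table (in practice derived from long-read data).
barcode_to_genotype = {"ACGTAC": "p.Ala2Val", "TTGCAA": "p.Gly5Asp"}

# Hypothetical barcode reads from one condition; "GGGGGG" is not in the table.
reads = ["ACGTAC", "TTGCAA", "ACGTAC", "GGGGGG"]

counts = count_genotypes(reads, barcode_to_genotype)
print(dict(counts))
```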
MaveVis is a visualization tool for variant effect maps. It displays maps as 'genophenograms', i.e. heatmaps of variant effect scores on a grid of all amino acid positions vs all possible amino acid changes at the given position. It also has the option of displaying additional information tracks, such as sequence conservation, protein domains, secondary structure, accessible surface area and interaction interfaces. A web interface for visualizing datasets from MaveDB via MaveVis is available here.
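The grid underlying such a heatmap can be sketched in a few lines. This is not MaveVis code: the variant notation (e.g. A2V), the scores and the function name are invented for the example, and variants without a score are left as None.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_grid(scores, n_positions):
    """Arrange {variant: score} into rows (amino acid changes) by columns (1-based positions)."""
    grid = {aa: [None] * n_positions for aa in AMINO_ACIDS}
    for variant, score in scores.items():
        # Variant strings look like "A2V": wild-type residue, position, new residue.
        position, alt = int(variant[1:-1]), variant[-1]
        grid[alt][position - 1] = score
    return grid


scores = {"A2V": 0.1, "A2G": 0.9, "K3E": 0.4}  # made-up scores
grid = build_grid(scores, n_positions=3)
print(grid["V"])  # [None, 0.1, None]
```

Each row of the grid corresponds to one heatmap row; a plotting library would then colour the cells by score.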
MaveQuest is an online database for querying literature-curated functional assays, phenotypes and clinical interests of human genes for Multiplexed Assays of Variant Effect (MAVE) studies.
Start at the main Wiki page: [MaveQuest] Main Page
Other Wiki pages:
- [MaveQuest] Update Source Data
- [MaveQuest] Upload Curated Data
- [MaveQuest] Maintenance Log
- [MaveQuest] Cloud Infrastructure
MaveRegistry is a collaborative resource for sharing progress on Multiplexed Assays of Variant Effect (MAVE).
Start at the main Wiki page: [MaveRegistry] Main Page
Other Wiki pages:
UK Biobank is a large long-term biobank study in the United Kingdom which is investigating the respective contributions of genetic predisposition and environmental exposure to the development of disease.
This section contains projects in the lab using UK Biobank data.
Start at the main Wiki page: [UKB Projects] Main Page