Releases · broadinstitute/long-read-pipelines

Fixed an issue where, when updating rows that include manual edits from Terra, an inconsistent sort order caused values to be randomly assigned.
Sample tables can be changed by uploading a new version, but re-uploading the sample set membership table appears to append redundant entries, which can cause downstream problems. Fixed by uploading only the changed membership entries rather than the full table.
When manual edits in the sample table cause sample_set entries to be invalidated, remove the invalid sample_set entries.
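To illustrate the sort-order fix above, here is a minimal sketch of carrying manual edits over by key rather than by row position. It is not the repository's code: the function `merge_manual_edits` and the column names `flowcell_id`, `sample_name`, and `notes` are hypothetical.

```python
def merge_manual_edits(generated, existing, key, editable):
    """Return generated rows with editable columns carried over from existing rows."""
    # Index existing rows by key so alignment never depends on row order.
    existing_by_key = {row[key]: row for row in existing}
    merged = []
    # Sort by key so the output order is deterministic across runs.
    for row in sorted(generated, key=lambda r: r[key]):
        out = dict(row)
        old = existing_by_key.get(row[key])
        if old is not None:
            for col in editable:
                if col in old:
                    out[col] = old[col]  # keep the value edited by hand
        merged.append(out)
    return merged


# Freshly generated rows and the table currently in the workspace,
# deliberately listed in different orders.
generated = [
    {"flowcell_id": "FC2", "sample_name": "NA12878", "notes": ""},
    {"flowcell_id": "FC1", "sample_name": "NA24385", "notes": ""},
]
existing = [
    {"flowcell_id": "FC1", "sample_name": "HG002", "notes": "renamed by hand"},
    {"flowcell_id": "FC2", "sample_name": "NA12878", "notes": ""},
]
merged = merge_manual_edits(
    generated, existing, key="flowcell_id", editable=["sample_name", "notes"]
)
```

Because each row is matched on its key, the hand-edited `sample_name` stays attached to `FC1` regardless of the order in which either table is read.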

More automation updates.

Updated update_nanopore_tables to allow for table editing and sample_set invalidation (similar to update_pacbio_tables script).
Added a verbosity argument to workflow automator.

Add fastq_pass and fast5_pass dirs to Nanopore sample table in Terra
Fixes for replacing NaN values
Add flowcell id (CellPac.Barcode) to Terra table

12 Jul 12:57

github-actions

3.0.3

6545773

lrp_3.0.3

Bug fixes for automatic population and updates to Terra tables. (#226)

Fixed an issue where, when updating rows that include manual edits from Terra, an inconsistent sort order caused values to be randomly assigned.
Sample tables can be changed by uploading a new version, but re-uploading the sample set membership table appears to append redundant entries, which can cause downstream problems. Fixed by uploading only the changed membership entries rather than the full table.
When manual edits in the sample table cause sample_set entries to be invalidated, remove the invalid sample_set entries.
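The membership fixes above can be sketched as a set difference between the memberships currently in the workspace and those implied by the sample table. This is an illustration under assumptions, not the repository's implementation: `desired_membership`, `membership_delta`, and the `sample_id` / `sample_name` columns are hypothetical names.

```python
def desired_membership(samples, set_column="sample_name"):
    """Derive (sample_set_id, sample_id) pairs from the sample table."""
    return {
        (row[set_column], row["sample_id"])
        for row in samples
        if row.get(set_column)
    }


def membership_delta(current, desired):
    """Return (to_add, to_remove) so only changed entries are uploaded."""
    current, desired = set(current), set(desired)
    to_add = sorted(desired - current)      # new memberships to upload
    to_remove = sorted(current - desired)   # entries invalidated by edits
    return to_add, to_remove


samples = [
    {"sample_id": "fc1", "sample_name": "HG002"},    # manually renamed from NA24385
    {"sample_id": "fc2", "sample_name": "NA12878"},
]
current = {("NA24385", "fc1"), ("NA12878", "fc2")}

to_add, to_remove = membership_delta(current, desired_membership(samples))
```

Unchanged memberships appear in neither list, so nothing is uploaded twice, and the entry invalidated by the manual rename shows up in `to_remove`.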

More automation updates.

Updated update_nanopore_tables to allow for table editing and sample_set invalidation (similar to update_pacbio_tables script).
Added a verbosity argument to workflow automator.
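One common way to expose a verbosity argument is a repeatable `-v` flag mapped to a logging level. The sketch below assumes that shape. The actual flag name and behavior of the workflow automator are not stated in these notes, so `build_parser` and `log_level` are hypothetical.

```python
import argparse
import logging


def build_parser():
    """Build a hypothetical CLI parser with a repeatable verbosity flag."""
    parser = argparse.ArgumentParser(description="Hypothetical workflow automator CLI")
    parser.add_argument(
        "-v", "--verbose", action="count", default=0,
        help="increase log detail (-v for INFO, -vv for DEBUG)",
    )
    return parser


def log_level(verbosity):
    """Map the number of -v flags to a logging level."""
    return {0: logging.WARNING, 1: logging.INFO}.get(verbosity, logging.DEBUG)


# Parse an example command line rather than sys.argv so the sketch is self-contained.
args = build_parser().parse_args(["-vv"])
level = log_level(args.verbose)
logging.basicConfig(level=level)
logging.debug("verbosity set to %d", args.verbose)
```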

Add fastq_pass and fast5_pass dirs to Nanopore sample table in Terra
Fixes for replacing NaN values
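As a rough illustration of NaN replacement before a table upload, the sketch below swaps missing values for a placeholder. The placeholder (an empty string) and the helper names `clean_value` and `clean_row` are assumptions. The notes do not say what the scripts substitute.

```python
import math


def clean_value(value, placeholder=""):
    """Replace float NaN or the literal string 'nan' with a placeholder."""
    if isinstance(value, float) and math.isnan(value):
        return placeholder
    if isinstance(value, str) and value.strip().lower() == "nan":
        return placeholder
    return value


def clean_row(row, placeholder=""):
    """Apply clean_value to every column of a table row."""
    return {key: clean_value(value, placeholder) for key, value in row.items()}


row = {"flowcell_id": "FC1", "fast5_pass_dir": float("nan"), "notes": "nan", "yield": 12.5}
cleaned = clean_row(row)
```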

09 Jul 17:56

github-actions

3.0.2

c244709

lrp_3.0.2

Relax Longshot task execution stringency (#234)

14 Jun 07:28

github-actions

3.0.1

a3d9130

lrp_3.0.1

Release of 3.0 pipeline (#224)

This marks the release of the 3.0.* workflow series, intended to massively increase the usability of these workflows in Terra. This release, therefore, contains substantial, breaking changes from the 2.1.* workflow series.

First, processing is now more logically divided between flowcell, sample, and cohort levels. Most workflows are now intended to run in series: for example, flowcell-level processing first, followed by sample-level processing (which presumes that flowcell-level outputs are sample-level inputs). Future improvements depend on this assumed behavior.
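The hand-off from flowcell level to sample level can be pictured as a grouping step: each flowcell row contributes its output to the sample it belongs to. The following is a sketch only. The column names `sample_name` and `aligned_bam` and the function `sample_inputs_from_flowcells` are hypothetical.

```python
from collections import defaultdict


def sample_inputs_from_flowcells(flowcell_rows):
    """Group flowcell-level outputs by sample to form sample-level inputs."""
    by_sample = defaultdict(list)
    for row in flowcell_rows:
        by_sample[row["sample_name"]].append(row["aligned_bam"])
    # Sort for a deterministic result regardless of input row order.
    return {sample: sorted(bams) for sample, bams in sorted(by_sample.items())}


flowcell_rows = [
    {"flowcell_id": "FC2", "sample_name": "HG002", "aligned_bam": "gs://bucket/FC2.bam"},
    {"flowcell_id": "FC1", "sample_name": "HG002", "aligned_bam": "gs://bucket/FC1.bam"},
    {"flowcell_id": "FC3", "sample_name": "NA12878", "aligned_bam": "gs://bucket/FC3.bam"},
]
sample_inputs = sample_inputs_from_flowcells(flowcell_rows)
```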

Second, all top-level workflows are now expected to return outputs that are compatible with Terra’s data model. To retain control over directory structure, we provide functionality for tasks to copy files to a destination bucket and return that storage path to the data model.
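A minimal sketch of the copy-and-return pattern described above: compute the destination path under a chosen output directory, copy the file there, and hand the destination path back for the data model. The names `finalized_path` and `finalize` are hypothetical, and the copy step is passed in as a callable so the example runs without cloud access. In a real task it would be a storage copy command.

```python
import posixpath


def finalized_path(outdir, local_file):
    """Return the destination path for local_file under outdir."""
    return posixpath.join(outdir.rstrip("/"), posixpath.basename(local_file))


def finalize(local_file, outdir, copy):
    """Copy local_file to outdir and return the path to record in the data model."""
    dest = finalized_path(outdir, local_file)
    copy(local_file, dest)
    return dest


# Stand-in for the real copy step: record what would be copied.
copied = []
dest = finalize(
    "/cromwell_root/out/sample1.bam",
    "gs://my-bucket/results/sample1/",
    copy=lambda src, dst: copied.append((src, dst)),
)
```

Returning the destination path, rather than the task's working-directory path, is what lets the caller keep control over where results live.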

Finally, we’ve made many improvements to the workflows themselves. Those changes include:

PB Sequel II/IIe

Updated to CCS 6.0.0 (so that Sequel II data will be compatible with Sequel IIe on-board basecalling).
Added the extracthifi tool to subset CCS reads to those with rq >= 0.99, so that all subsequent tasks process Sequel II and Sequel IIe CCS datasets in the same way.
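For intuition about the rq >= 0.99 threshold: rq is a read's predicted accuracy, so 0.99 corresponds to a Phred quality of about Q20. The sketch below shows the filter on plain dictionaries. It is not how extracthifi is implemented, and `phred_from_rq` and `keep_hifi` are hypothetical helpers.

```python
import math


def phred_from_rq(rq):
    """Convert predicted read accuracy (rq) to a Phred-scaled quality."""
    if rq >= 1.0:
        return math.inf  # a perfect predicted accuracy has no finite Phred value
    return -10 * math.log10(1 - rq)


def keep_hifi(reads, min_rq=0.99):
    """Keep only reads whose predicted accuracy meets the threshold."""
    return [read for read in reads if read["rq"] >= min_rq]


reads = [
    {"name": "read/1/ccs", "rq": 0.9995},
    {"name": "read/2/ccs", "rq": 0.98},
    {"name": "read/3/ccs", "rq": 0.99},
]
hifi = keep_hifi(reads)
```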

ONT Basecalling

Upgraded to Guppy 4.5.4
Improved basecalling on datasets where the sequencing_summary.txt and final_summary.txt files are not present
Added support for demultiplexing with standard barcoding kits. Because basecalling can be single-plex or multiplexed, its output can be 1-to-1 or 1-to-many, which severely breaks compatibility with Terra's data model. Thus, we now return only the directory to which the results are written, and rely on scripts that scan input directories to populate the data model instead.
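The directory-scanning approach mentioned above might look like the sketch below, which turns one basecalling output directory into one or many table rows. The layout is an assumption (per-barcode subdirectories named like `barcode01`, plus `unclassified`), and `scan_basecall_dir` is a hypothetical helper rather than one of the repository's scripts.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming of demultiplexed output subdirectories.
BARCODE_DIR = re.compile(r"^(barcode\d+|unclassified)$")


def scan_basecall_dir(root):
    """Return one row per barcode directory, or a single row if single-plex."""
    root = Path(root)
    barcode_dirs = sorted(
        p for p in root.iterdir() if p.is_dir() and BARCODE_DIR.match(p.name)
    )
    if not barcode_dirs:
        # Single-plex run: the output maps 1-to-1 onto the directory itself.
        return [{"barcode": None, "fastq_dir": str(root)}]
    # Multiplexed run: the output maps 1-to-many, one row per barcode.
    return [{"barcode": p.name, "fastq_dir": str(p)} for p in barcode_dirs]


# Build a throwaway multiplexed layout to demonstrate the scan.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("barcode02", "barcode01", "logs"):
        (Path(tmp) / name).mkdir()
    rows = scan_basecall_dir(tmp)

# An empty directory is treated as a single-plex run.
with tempfile.TemporaryDirectory() as tmp:
    single = scan_basecall_dir(tmp)
```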

Assembly

Added Canu assembly workflows
Added Flye assembly workflows
Added Hifiasm assembly workflows
Added Quast to assembly workflows for automatic assembly performance reports
Made Quast comparison against a reference sequence optional

Variant calling

Added variant calling subworkflows for PBCCS, PBCLR, and ONT data
Added DeepVariant-PEPPER to ONT calling
Parallelized most variant calling tasks
Allow variant calling to be turned off for whole genome workflows (useful if one wants to make CNV calls on the BAMs and doesn't need the SNV/SV calls immediately).