Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GRIDSS Assembly #179

Open
wants to merge 18 commits into
base: main
Choose a base branch
from
Open

Add GRIDSS Assembly #179

wants to merge 18 commits into from

Conversation

Faizal-Eeman
Copy link
Contributor

@Faizal-Eeman Faizal-Eeman commented Oct 4, 2024

Description

Add GRIDSS Assembly (step 2 of 3 in GRIDSS).

Closes #177

Testing Results

  • DNA A-mini
    • sample: HG002.N-0, S2.T-0
    • input yaml: /hot/software/pipeline/pipeline-call-sSV/Nextflow/development/unreleased/mmootor-add-GRIDSS-assembly/TWGSAMIN000001-T003-S03-F.yaml
    • config: /hot/software/pipeline/pipeline-call-sSV/Nextflow/development/unreleased/mmootor-add-GRIDSS-assembly/TWGSAMIN000001-T003-S03-F.config
    • output: /hot/software/pipeline/pipeline-call-sSV/Nextflow/development/unreleased/mmootor-add-GRIDSS-assembly/call-sSV-6.1.0/S2_v1.1.5/GRIDSS-2.13.2/output
    • intermediate: /hot/software/pipeline/pipeline-call-sSV/Nextflow/development/unreleased/mmootor-add-GRIDSS-assembly/call-sSV-6.1.0/S2_v1.1.5/GRIDSS-2.13.2/intermediate

Checklist

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • I have reviewed the Nextflow pipeline standards.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].

  • I have set up or verified the branch protection rule following the github standards before opening this pull request.

  • I have added my name to the contributors listings in the manifest block in the nextflow.config as part of this pull request, am listed
    already, or do not wish to be listed. (This acknowledgement is optional.)

  • I have added the changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

  • I have updated the version number in the metadata.yaml and manifest block of the nextflow.config file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)

  • I have tested the pipeline on at least one A-mini sample with algorithm = ['delly', 'manta']. The paths to the test config files and output directories are attached above.

@Faizal-Eeman Faizal-Eeman requested a review from a team as a code owner October 4, 2024 22:56
Copy link

@kiarod kiarod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Clean work @Faizal-Eeman

main.nf Outdated Show resolved Hide resolved
Comment on lines +63 to +68
publishDir "${params.workflow_output_dir}/output",
pattern: "${tumor_id}.assembly.bam",
mode: "copy",
saveAs: {
"${output_filename}_${sanitize_string(file(it).getName().replace("${tumor_id}.", ""))}"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment (blocking): What's in this BAM file? It seems like it should be an intermediate file rather than an output

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assembly.bam contains breakend assemblies that are used in conjunction with input T/N BAMs for variant calling.

The official GRIDSS documentation hasn't described the assembly.bam as an intermediate but it has described assembly.sv.bam.gridss.working as an intermediate.

assembly.bam is explicitly defined as a required parameter for step 3 in GRIDSS which is variant calling. For this reason I thought I'd put this in the output dir.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm from a pipeline perspective, the BAM should then be intermediate since it's used between steps within GRIDSS but not the actual output of GRIDSS (the variant calls). If the BAM is only intended for use for the third step in GRIDSS, we don't need to be saving the BAM every time we run the pipeline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering the same. The intermediate BAM files however can be used for SV visualization using IGV, so I'm thinking we add documentation to README suggesting users to set save_intermediate_files = true if they want these intermediate BAMs for visualization or any downstream steps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that's fine to document the option in the README for users who might want it

main.nf Outdated Show resolved Hide resolved
@Faizal-Eeman
Copy link
Contributor Author

F32 test - /hot/software/pipeline/pipeline-call-sSV/Nextflow/development/unreleased/mmootor-add-GRIDSS-assembly/ILHNLNEV000002-T001-P01-F.F32/ILHNLNEV000002-T001-P01-F_F32_assembly.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add GRIDSS assembly
3 participants