gatk4/markduplicates and picard/markduplicates output file extension confusion #9623

@pmoris

Have you checked the docs?

Description of the bug

I've discovered a few weird behaviours when comparing the gatk4/markduplicates and picard/markduplicates modules.

1. omitting a file extension in gatk4/markduplicates ext.prefix causes the output bam channel to be empty

When ext.prefix is set without a file extension (e.g., ext.prefix = { "${meta.id}.markdup" }), the module writes its output file inside the work directory, but the output channel remains empty because the file name doesn't match the glob pattern *bam. No files get published either.
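
The mismatch can be reproduced outside Nextflow. The sketch below is plain Groovy and assumes that the output glob *bam behaves like a Java NIO glob for such a simple pattern; `matchesOutputGlob` and the sample file names are hypothetical.

```groovy
import java.nio.file.FileSystems
import java.nio.file.Paths

// Hypothetical helper: tests a file name against the output glob "*bam".
// Assumption: Java NIO glob semantics stand in for Nextflow's output matching.
boolean matchesOutputGlob(String fileName) {
    def matcher = FileSystems.getDefault().getPathMatcher('glob:*bam')
    return matcher.matches(Paths.get(fileName))
}

// A prefix without an extension yields a file name the glob never picks up.
assert !matchesOutputGlob('sample.markdup')
// The same prefix with ".bam" appended is matched.
assert matchesOutputGlob('sample.markdup.bam')
```

Under that assumption, the file is written but never collected, which would explain both the empty channel and the missing published files.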

Expected Behavior: the module could add a safety check similar to the one in the picard module (see below), require a .bam/.cram extension, or auto-append .bam when no extension is detected.

Something like this might be enough?

if (!prefix.endsWith('.bam') && !prefix.endsWith('.cram')) {
    prefix = "${prefix}.bam"
}
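
For illustration, the same check can be written as a standalone Groovy function and exercised on a few prefixes. `normalizePrefix` is a hypothetical name, not something the module defines, and the sample prefixes are made up.

```groovy
// Hypothetical helper wrapping the suggested check; not part of the nf-core module.
String normalizePrefix(String prefix) {
    // Prefixes that already name a BAM or CRAM file are left alone.
    if (prefix.endsWith('.bam') || prefix.endsWith('.cram')) {
        return prefix
    }
    // Anything else falls back to BAM output.
    return "${prefix}.bam"
}

assert normalizePrefix('sample.markdup') == 'sample.markdup.bam'
assert normalizePrefix('sample.markdup.bam') == 'sample.markdup.bam'
assert normalizePrefix('sample.markdup.cram') == 'sample.markdup.cram'
```

With this fallback, a prefix without an extension would still produce a file that the *bam output glob can match.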

2. the general behaviour of these two modules is a bit at odds and confusing

When running picard/markduplicates without specifying an extension, an error is thrown by this check:

if ("$reads" == "${prefix}.${suffix}") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!"

What it expects is a prefix of the type ext.prefix = { "${meta.id}.markdup" }. Using something like ext.prefix = { "${meta.id}.markdup.bam" } leads to output files with a .bam.bam extension.
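
The naming behaviour follows from how the output name is composed as "${prefix}.${suffix}". The plain Groovy sketch below mimics that composition; `picardOutputName` is a hypothetical helper, and it assumes the suffix is the input file's extension (here bam).

```groovy
// Hypothetical helper mirroring the "${prefix}.${suffix}" composition from the check above.
String picardOutputName(String prefix, String suffix) {
    return "${prefix}.${suffix}"
}

// A prefix without an extension gives the intended name.
assert picardOutputName('sample.markdup', 'bam') == 'sample.markdup.bam'
// A prefix that already ends in ".bam" gets the suffix appended a second time.
assert picardOutputName('sample.markdup.bam', 'bam') == 'sample.markdup.bam.bam'
// A prefix equal to the input's base name reproduces the input name,
// which is the case the error message guards against.
assert picardOutputName('sample', 'bam') == 'sample.bam'
```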

In gatk4/markduplicates, on the other hand, not specifying a prefix seems to work just fine. I'm not sure why, since I'd expect the input and output files to clash when they have the same name; see the default:

prefix = task.ext.prefix ?: "${meta.id}.bam"

Regardless, if you do specify a custom prefix, it needs to include the .bam part, otherwise no output is emitted, as described above.
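
Taken together, the two conventions currently call for different settings per module. The configuration fragment below is a sketch that assumes the processes are selected as PICARD_MARKDUPLICATES and GATK4_MARKDUPLICATES; adjust the selectors to match your pipeline.

```groovy
// Sketch of a modules.config fragment; the process selectors are assumptions.
process {
    // picard/markduplicates appends the suffix itself, so the prefix has no extension.
    withName: 'PICARD_MARKDUPLICATES' {
        ext.prefix = { "${meta.id}.markdup" }
    }
    // gatk4/markduplicates uses the prefix as the full file name, so ".bam" is included.
    withName: 'GATK4_MARKDUPLICATES' {
        ext.prefix = { "${meta.id}.markdup.bam" }
    }
}
```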


This seems to be related to issue #8118?

Command used and terminal output

Relevant files

No response

System information

No response
