
Fixed issues #811 and #812, which lead to the pipeline break #813

Closed · wants to merge 7 commits

Conversation


@Dedaniya08 Dedaniya08 commented Dec 9, 2024

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/ampliseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).


@Dedaniya08 Dedaniya08 changed the base branch from master to dev December 9, 2024 08:39
@Dedaniya08 (Author)

The pipeline break occurs at two steps, corresponding to issues #811 and #812. I have added a retry with a maximum of 3 attempts, which resolves the error in both files.

[screenshot: second_step]

Both steps are resolved.
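(For context, a change of this kind would presumably sit inside the two affected local modules. A minimal sketch, assuming the retry-with-3-attempts approach described above; the actual diff may differ:)

process RENAME_RAW_DATA_FILES {
    errorStrategy 'retry'   // retry on any failure
    maxRetries    3         // give up after 3 attempts
    // ... rest of the module unchanged; DADA2_FILTNTRIM would get the same directives
}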

@d4straub (Collaborator) commented Dec 9, 2024

Hi there, thanks for the PR. However, retrying on any error code is a very bad idea; it can lead to a lot of confusion. I oppose that change.
We typically only retry on specific error codes, see

errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' }

I do not want the behavior to change such that, when there is a reproducible, justified error, a retry is launched for no reason. That's why we do not retry everything by default; in the overwhelming majority of cases that is just a waste of resources.

If that change solves your problem, that's great, but there is no need to change the pipeline code. You can achieve the same result by using -c azure.conf with a file that contains:

process {
    errorStrategy = 'retry'
}
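(To apply such a file, it is passed on the command line with Nextflow's -c option, e.g. nextflow run nf-core/ampliseq -profile docker -c azure.conf --outdir <OUTDIR>, where the profile and paths are placeholders. That way the retry behaviour stays on the user side instead of in the pipeline code.)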

or use a more process-targeted approach, e.g.

withName:QIIME2_EXTRACT {
    cpus   = { 12 * task.attempt }
    memory = { 12.GB * task.attempt }
    time   = { 24.h * task.attempt }
}
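(Combining the two suggestions, a user-side azure.conf restricted to the processes discussed in this PR and to the usual transient exit codes might look like the sketch below; it is untested on Azure, and if the Azure failures return exit codes outside that range, the plain errorStrategy = 'retry' above would still be needed:)

process {
    withName: 'RENAME_RAW_DATA_FILES|DADA2_FILTNTRIM' {
        errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' }
        maxRetries    = 3
    }
}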

@Dedaniya08 (Author)

The code is encountering frequent issues. In this PR, step #811 focuses on renaming the files, and step #812 maps the filtered files. I verified the files after a retry, and the output remains consistent. There are no discrepancies in the files before and after applying this PR.

The recurring issue is due to Azure's local environment and the process of extracting files from the data lake. This behavior will persist without the changes introduced here. Therefore, I strongly recommend including this PR to address the problem effectively.

@d4straub (Collaborator)

Thanks for explaining your reasoning. I do believe that the resulting files are fine and I acknowledge that this proposed change will solve the issue on Azure. But it will negatively affect the execution of the pipeline on any other platform (retry for all exit codes of RENAME_RAW_DATA_FILES & DADA2_FILTNTRIM, which is not desirable). I would prefer a fix that doesn't have a negative impact at all.
I'll try to lure some more people here to get more opinions.

@ewels (Member) left a comment

This issue seems related to the Azure infrastructure setup and is not general to all users of the pipeline, or really pipeline specific. I do not think that we should apply blanket retry for all users like this.

Alternatives would be setting this on user configs instead, or better still adding to the azurebatch institutional config as a pipeline-specific config.

The latter is already set up in this pipeline, so no changes are needed here; it is purely a change to the nf-core/configs repo. You would need to create a new config file there and then include it here.
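(As a rough illustration of that route: the exact path and inclusion mechanism follow the nf-core/configs conventions and should be checked there, but the pipeline-specific file might look roughly like this, so Azure Batch users pick it up automatically via the institutional profile while other platforms stay untouched:)

// hypothetical conf/pipeline/ampliseq/azurebatch.config in nf-core/configs
process {
    withName: 'RENAME_RAW_DATA_FILES|DADA2_FILTNTRIM' {
        errorStrategy = 'retry'
        maxRetries    = 3
    }
}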

Review comments on modules/local/dada2_filtntrim.nf and modules/local/rename_raw_data_files.nf (outdated, resolved).
@ewels (Member) commented Dec 11, 2024

It's also possible that the retry solves the issue here because the retry attempts give the task more memory (the task.attempt multiplier). This could be needed when pulling data from a data lake, which is a slightly non-standard method of data staging.

memory = { 6.GB * task.attempt }

So something else that you could try in the azure-specific pipeline config is to assign more memory to these tasks. Then they might not fail in the first place and you may not need the retry.
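(In that case, the azure-specific config could simply raise the base memory for the two processes, for example as below; 12.GB is an illustrative value only, not a tested figure:)

process {
    withName: 'RENAME_RAW_DATA_FILES|DADA2_FILTNTRIM' {
        memory = 12.GB   // illustrative value; the right figure would need testing on Azure
    }
}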

Dedaniya08 and others added 2 commits December 11, 2024 15:42
without setting maxRetries as 3, just considering the errorStrategy

Co-authored-by: Phil Ewels <[email protected]>
@Dedaniya08 Dedaniya08 requested a review from ewels December 11, 2024 10:14
@ewels (Member) commented Dec 11, 2024

@d4straub - suggest we close this PR and @Dedaniya08 you can open one to nf-core/configs with the relevant changes.

@d4straub (Collaborator)

Yes I agree @ewels , better solved in nf-core/configs. Thanks a lot!
