Edge case: Clustering with VSEARCH fails at QIIME2_INSEQ #668

Closed
d4straub opened this issue Nov 29, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@d4straub
Collaborator

Description of the bug

With --vsearch_cluster, in some (potentially rare) edge cases QIIME2_INSEQ complains that the FASTA file isn't valid. This happens because VSEARCH masks low-complexity regions (in this case, multiple G's in a row) by writing them in lowercase, while QIIME2 expects all nucleotide symbols to be capitalized.

Masking can be prevented with --qmask "none", so that a config that contains

process {
    withName: VSEARCH_CLUSTER {
        ext.args = '--id 0.97 --usersort --qmask "none"'
        ext.args2 = '--cluster_smallmem'
        ext.args3 = '--clusters'
    }
}

will fix the issue.
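
To check whether a FASTA file is affected, a small script along these lines could help. It is only a sketch, not part of the pipeline: `find_soft_masked` and `uppercase_sequences` are hypothetical helper names, and uppercasing the sequences is an assumed alternative workaround, whereas the config above avoids the masking in the first place.

```python
def find_soft_masked(fasta_text: str) -> list[str]:
    """Return the IDs of records whose sequence contains lowercase symbols."""
    masked: list[str] = []
    current_id = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
        elif current_id is not None and any(c.islower() for c in line):
            # Report each record once, even if several of its lines are masked.
            if not masked or masked[-1] != current_id:
                masked.append(current_id)
    return masked


def uppercase_sequences(fasta_text: str) -> str:
    """Uppercase sequence lines and leave header lines untouched."""
    lines = [
        line if line.startswith(">") else line.upper()
        for line in fasta_text.splitlines()
    ]
    trailing_newline = "\n" if fasta_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


# Hypothetical input: seq1 has a soft-masked run of G's, seq2 does not.
example = ">seq1\nACGTggggggACGT\n>seq2\nACGTACGT\n"
print(find_soft_masked(example))
print(uppercase_sequences(example), end="")
```

An empty list from `find_soft_masked` would suggest that lowercase masking is not the reason for a validation failure.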

Command used and terminal output

No response

Relevant files

No response

System information

No response

@d4straub d4straub added the bug Something isn't working label Nov 29, 2023
@d4straub d4straub changed the title Clustering with VSEARCH fails at QIIME2_INSEQ Edge case: Clustering with VSEARCH fails at QIIME2_INSEQ Nov 29, 2023
@d4straub
Collaborator Author

That's in dev, will be in 2.8.0
