assignTaxonomy() never ends with UNITE reference dataset #2069

Open
danielsangarci opened this issue Jan 3, 2025 · 3 comments

danielsangarci commented Jan 3, 2025

Hello,

I am analyzing 16S rRNA and ITS sequences from bacteria and fungi. The assignTaxonomy() function works as expected with the 16S sequences and the SILVA reference dataset (processing completes in a few hours), but when I run it on ITS sequences with the UNITE reference dataset, the process never completes, even after more than 24 hours.

I have tried running it with a small subset of data (2 samples with 10 sequences each), but it still never ends.

seqtab.nochim <- seqtab.nochim[5:6,1:10]
taxa <- assignTaxonomy(seqtab.nochim, "sh_general_release_dynamic_04.04.2024.fasta", multithread=TRUE)
UNITE fungal taxonomic reference detected.

Do you know what the problem could be, or how I could solve it?
Thank you so much in advance.

benjjneb (Owner) commented Jan 7, 2025

My guess is that you are hitting the memory ceiling, which slows assignTaxonomy down dramatically because the system has to start swapping. How much memory is available in the compute environment you are using? Do you have access to something with more available memory?
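
One rough way to answer the memory question from inside R, assuming a Linux machine that exposes /proc/meminfo (this will not work on Windows or macOS). The helper name `parse_meminfo` is hypothetical and is not part of dada2; it is only a sketch in base R:

```r
# Hypothetical helper (not part of dada2): extract total and available
# memory, in GB, from lines formatted like Linux's /proc/meminfo,
# e.g. "MemTotal:       16314520 kB".
parse_meminfo <- function(lines) {
  get_gb <- function(field) {
    hit <- grep(paste0("^", field, ":"), lines, value = TRUE)
    if (length(hit) == 0) return(NA_real_)
    kb <- as.numeric(gsub("[^0-9]", "", hit[1]))  # keep only the digits
    kb / 1024^2                                   # kB -> GB (binary units)
  }
  c(total_gb = get_gb("MemTotal"), available_gb = get_gb("MemAvailable"))
}

# Assumption: only attempt to read the file where it exists (Linux).
if (file.exists("/proc/meminfo")) {
  print(round(parse_meminfo(readLines("/proc/meminfo")), 1))
}
```

If `available_gb` is small before assignTaxonomy is even started, swapping is a plausible explanation for a run that never finishes.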

danielsangarci (Author) commented

My laptop has 8 GB of RAM, so it's possible that's the problem.

But why did it work with 16S and the SILVA reference dataset, then? Does 16S analysis require less RAM?

benjjneb (Owner) commented Jan 9, 2025

It's nothing specific to 16S; it's the size of the database, both in terms of the number of sequences and the number of unique terminal taxa, that determines the memory requirements.

32 GB is definitely enough to use UNITE (that's what I have on my machine). There's a good chance 16 GB is enough as well; it used to be with older versions of UNITE.
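
To compare two reference files on the two quantities mentioned above (number of sequences and number of unique taxa), a minimal base-R sketch could look like the following. The helper `summarize_reference` is hypothetical and not part of dada2. It assumes that UNITE-style headers carry the taxonomy in the last "|"-separated field and that other references use taxonomy-only headers; the count of distinct full taxonomy strings is only a proxy for "unique terminal taxa". The demo file is made up for illustration.

```r
# Hypothetical helper (not part of dada2): count the reference sequences
# and the distinct taxonomy strings in a FASTA file.
summarize_reference <- function(fasta_path) {
  lines <- readLines(fasta_path)
  headers <- sub("^>", "", lines[startsWith(lines, ">")])
  # Assumption: UNITE-style headers keep the taxonomy in the last
  # "|"-separated field; headers without "|" pass through unchanged.
  taxonomy <- sub("^.*\\|", "", headers)
  list(
    n_sequences = length(headers),
    n_unique_taxonomies = length(unique(taxonomy))
  )
}

# Toy demonstration on a made-up three-record file.
demo_path <- tempfile(fileext = ".fasta")
writeLines(c(
  ">seqA|k__Fungi;p__Ascomycota;g__Aspergillus",
  "ACGTACGT",
  ">seqB|k__Fungi;p__Ascomycota;g__Aspergillus",
  "ACGTTCGT",
  ">seqC|k__Fungi;p__Basidiomycota;g__Amanita",
  "TTGTACGA"
), demo_path)
demo_summary <- summarize_reference(demo_path)
print(demo_summary)
```

Running the same helper on the SILVA and UNITE files passed to assignTaxonomy gives a side-by-side view of how the two databases differ on these counts.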
