ReferenceSeeker and Metagenome-Assembled Genomes #23

Closed
padbc opened this issue Oct 29, 2021 · 2 comments
Labels
enhancement New feature or request

Comments


padbc commented Oct 29, 2021

This is neither a feature request nor a bug, although it's closer to the former.

Thank you for developing such a useful tool. If I understood correctly, ReferenceSeeker can be used with MAGs. If so, do you have approximate guidelines as to what would be appropriate in terms of, e.g., contamination and completeness? Thanks very much.

@padbc padbc added the enhancement New feature or request label Oct 29, 2021
Owner

oschwengers commented Nov 1, 2021

Hi @padbc,
Yes, in principle, ReferenceSeeker can be used with any genome of any taxon, though we only use it with and provide databases for prokaryotic genomes. This is due to the inherent combination of methodologies: both the k-mer profile-based lookup of candidate genomes and ANI calculations against reference genomes are taxon independent and work on any DNA sequence.

That said, in terms of methodology there is no difference between genomes of bacterial isolates and MAGs.
However, bear in mind that both contamination and completeness affect the results. Contamination will reduce the ANI value, and incomplete genomes will, of course, result in lower conserved DNA values.
For instance, contamination of more than 5% of a genome will make it impossible to detect a reference genome of the same species because of the 0.95 ANI threshold. The same holds true for incomplete genomes and the 69% conserved DNA threshold. In these cases you might want to adapt these values via --ani and --conserved-dna.

In situations where you cannot find any reference genome, you might also want to give --unfiltered a try.
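For readers who want to script this, here is a minimal sketch of how the options mentioned above could be assembled into a command line. It only builds the argument list and does not run anything. The helper name `build_referenceseeker_command`, the database and genome paths, and the values 0.9 and 0.5 are hypothetical placeholders chosen for illustration, not recommended thresholds. The sketch also assumes that both thresholds are passed as fractions between 0 and 1 and that the database and genome are given as the last two positional arguments; please check `referenceseeker --help` for your installed version.

```python
from typing import List, Optional


def build_referenceseeker_command(
    database: str,
    genome: str,
    ani: Optional[float] = None,
    conserved_dna: Optional[float] = None,
    unfiltered: bool = False,
) -> List[str]:
    """Assemble a ReferenceSeeker argument list (hypothetical helper).

    Thresholds left as None are not passed, so the tool's own defaults
    (0.95 ANI and 69% conserved DNA, per the comment above) apply.
    """
    # Assumption: both thresholds are expressed as fractions in [0, 1].
    for name, value in (("ani", ani), ("conserved_dna", conserved_dna)):
        if value is not None and not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    cmd = ["referenceseeker"]
    if ani is not None:
        cmd += ["--ani", str(ani)]
    if conserved_dna is not None:
        cmd += ["--conserved-dna", str(conserved_dna)]
    if unfiltered:
        # Report candidates without applying the ANI / conserved DNA filters.
        cmd.append("--unfiltered")
    # Assumption: database and genome are the trailing positional arguments.
    cmd += [database, genome]
    return cmd


# Placeholder paths and purely illustrative threshold values.
relaxed = build_referenceseeker_command(
    "db/bacteria-refseq", "mag_bin.fasta", ani=0.9, conserved_dna=0.5
)
fallback = build_referenceseeker_command(
    "db/bacteria-refseq", "mag_bin.fasta", unfiltered=True
)
print(" ".join(relaxed))
print(" ".join(fallback))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` once the paths point at a real database and MAG. Trying the relaxed thresholds first and falling back to `--unfiltered` only when nothing is found follows the order suggested in this comment.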

I'm sorry that I cannot provide any specific thresholds. I don't have much experience with MAGs and everything will highly depend on the completeness and contamination values.

@oschwengers oschwengers pinned this issue Nov 1, 2021
Author

padbc commented Nov 1, 2021

Excellent -- thanks very much for the detailed reply.

@padbc padbc closed this as completed Nov 1, 2021