High level RDP classification despite good BLAST hits #87

Open
pieterprovoost opened this issue Nov 15, 2023 · 0 comments

In some cases VSEARCH/BLAST return good and consistent hits (>97% identity), but RDP only classifies at a very high taxonomic level. This could be due to inconsistencies in the reference database taxonomy or to missing reference sequences.

For example (COI eDNA Expeditions sample S176):

3241    asv.3241    Eukaryota   Identification based on the RDP classifier at the confidence level 0.6: taxonomy Eukaryota;undef_Eukaryota;Cercozoa;Chlorarachniophyceae;undef_Chlorarachniophyceae;undef_undef_Chlorarachniophyceae;Chlorarachnion;Chlorarachnion_reptans, confidences 0.99;0.45;0.16;0.16;0.16;0.16;0.16;0.16. Confirmation with VSEARCH against the COI_ncbi_1_50000 database at 0.97 similarity: hits GQ896380, identities 97.1, taxonomy Bigelowiella_natans, consensus Eukaryota;Cercozoa;Chlorarachniophyceae;Bigelowiella;Bigelowiella_natans.
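
A rough sketch of how such cases could be flagged from the remark string, assuming the remark format shown above is stable. The function names, the "at most two supported ranks" rule, and the 97% identity cutoff are illustrative assumptions, not part of the pipeline. RDP ranks are kept from the root down until the first confidence falls below the stated confidence level, which is why this record ends up at Eukaryota only.

```python
import re

FLOAT = r"[0-9]+(?:\.[0-9]+)?"

def parse_remark(remark: str) -> dict:
    """Pull RDP ranks/confidences and VSEARCH identities out of a remark string.

    Assumes the wording seen in the example record; other records may differ.
    """
    threshold = float(re.search(rf"confidence level ({FLOAT})", remark).group(1))
    rdp = re.search(r"taxonomy ([^,\s]+), confidences ([0-9.;]+)", remark)
    names = rdp.group(1).split(";")
    confs = [float(c) for c in re.findall(FLOAT, rdp.group(2))]
    ident = re.search(r"identities ([0-9.;]+)", remark)
    identities = [float(i) for i in re.findall(FLOAT, ident.group(1))] if ident else []
    return {"threshold": threshold, "names": names, "confs": confs, "identities": identities}

def supported_ranks(names, confs, threshold):
    """Keep ranks from the root down until confidence drops below the threshold."""
    kept = []
    for name, conf in zip(names, confs):
        if conf < threshold:
            break
        kept.append(name)
    return kept

remark = (
    "Identification based on the RDP classifier at the confidence level 0.6: "
    "taxonomy Eukaryota;undef_Eukaryota;Cercozoa;Chlorarachniophyceae;"
    "undef_Chlorarachniophyceae;undef_undef_Chlorarachniophyceae;Chlorarachnion;"
    "Chlorarachnion_reptans, confidences 0.99;0.45;0.16;0.16;0.16;0.16;0.16;0.16. "
    "Confirmation with VSEARCH against the COI_ncbi_1_50000 database at 0.97 similarity: "
    "hits GQ896380, identities 97.1, taxonomy Bigelowiella_natans, consensus "
    "Eukaryota;Cercozoa;Chlorarachniophyceae;Bigelowiella;Bigelowiella_natans."
)

parsed = parse_remark(remark)
kept = supported_ranks(parsed["names"], parsed["confs"], parsed["threshold"])

# Hypothetical flagging rule: shallow RDP assignment despite a strong VSEARCH hit.
suspicious = len(kept) <= 2 and max(parsed["identities"], default=0.0) >= 97.0
print(kept, parsed["identities"], suspicious)
```

In this record the second rank (undef_Eukaryota, 0.45) already falls below 0.6, so everything beneath it is dropped even though VSEARCH agrees with the RDP lineage down to Chlorarachniophyceae.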

Input:

S1,/home/ubuntu/data/raw_sequences/eDNAexpeditions/batch1/concatenated/GC142030_TGCCGGTCAG-TTGTATCAGG_S176_R1.fastq.gz,forward
S1,/home/ubuntu/data/raw_sequences/eDNAexpeditions/batch1/concatenated/GC142030_TGCCGGTCAG-TTGTATCAGG_S176_R2.fastq.gz,reverse