Non-matching sequences achieving much higher scores than exact matching sequences #11

Open
hmusta opened this issue Dec 9, 2021 · 0 comments

hmusta commented Dec 9, 2021

I've noticed that in some cases, sequences that have no significant alignment to a reference achieve higher scores (and much lower e-values) than those that align perfectly.

In particular, this occurs with the NC_000913.3 reference sequence when aligning the following reads, stored in a file called test.fa:

>NZ_CP011331.1-92/2
GGCACGACTGATTCCGCCGACTCCTGTGTCCACTGCACAAAGTCCTGTTGCAGACGGTCACGGTTAATGCCGGTCAACAGCCGGGCGGCAGCAGGCGGTATATAACGCAGCGGCGAATGCAGCAGGGCCAGCAGCAACACGCCGCCGCGC
>NZ_CP031912.1-287/1
ACACGCCGGACTGACTCTGGCGGGTTGGGTGGCGAACGATGTTACGCCTCCGGGAAAACGTCACGCTGAATATATGACCACGCTCACCCGCATGATTCCCGCGCCGCTGCTGGGAGAGATCCCCTGGCTTGCAGAAAATCCAGAAAATGC

The first read has very few matches to the reference, but BlastFrost reports an e-value of 1e-427 with an underlying score of 334, whereas the second read matches exactly but gets an e-value of 1e-427 and a score of 136.
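
As a sanity check independent of BlastFrost, the match status of the two reads can be verified with a short script. The following is a minimal sketch, not part of BlastFrost or Bifrost: all function names are hypothetical, it assumes plain uncompressed FASTA files, and it uses k = 15 only to mirror the -k 15 passed to the tools. For each read it reports whether the read (or its reverse complement) occurs verbatim in the reference, and what fraction of its 15-mers are present in the reference.

```python
# Hypothetical helper script; nothing here is BlastFrost or Bifrost API.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_fasta(path):
    """Parse a FASTA file into a {name: sequence} dict (name = first word of header)."""
    records, name, chunks = {}, None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks)
                name, chunks = line[1:].split()[0], []
            else:
                chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(seq, k=15):
    """Set of canonical k-mers occurring in seq."""
    return {canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def kmer_hit_fraction(read, ref_kmers, k=15):
    """Fraction of the read's k-mers found (on either strand) in ref_kmers."""
    total = len(read) - k + 1
    if total <= 0:
        return 0.0
    hits = sum(canonical(read[i:i + k]) in ref_kmers for i in range(total))
    return hits / total


def matches_exactly(read, reference):
    """True if the read or its reverse complement is a substring of the reference."""
    return read in reference or reverse_complement(read) in reference


def report(reference_path, reads_path, k=15):
    """Print exact-match status and k-mer hit fraction for every read."""
    ref_records = read_fasta(reference_path)
    ref_kmers = set()
    for seq in ref_records.values():
        ref_kmers |= canonical_kmers(seq, k)
    for name, read in read_fasta(reads_path).items():
        exact = any(matches_exactly(read, seq) for seq in ref_records.values())
        fraction = kmer_hit_fraction(read, ref_kmers, k)
        print(f"{name}\texact={exact}\tkmer_hit_fraction={fraction:.3f}")
```

Calling `report("NC_000913.3.fa", "test.fa")` would print one line per read. A read that matches exactly should show `exact=True` and a hit fraction of 1.000, so any read with a low hit fraction that still outscores it points at the scoring rather than at the input data. Note that this treats the reference as linear, so a read spanning the origin of a circular genome would not be reported as an exact match.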

I built the graph using the command

./Bifrost build -k 15 -r NC_000913.3.fa -c -o NC_000913.3.bifrost

and aligned with

./BlastFrost -g NC_000913.3.bifrost.gfa -f NC_000913.3.bifrost.bfg_colors -q test.fa -k 15

I am using commit d425ac6 (BlastFrost) and 82adeedd5eb1007fdfae2324b339ae32c1c128d3 (bifrost).

Any guidance would be greatly appreciated.

Best regards,
Harun

@hmusta hmusta changed the title Non-matching sequences achieving high scores Non-matching sequences achieving much high scores than exact matching sequences Dec 9, 2021