Issue with my BlastFrost result #12

Open
davidmaimoun opened this issue Jun 13, 2022 · 5 comments

Comments

@davidmaimoun

Hi,
I ran BlastFrost on 5 Listeria strains (SRR18349609, SRR18349610 and SRR18349611 are in the same cluster according to NCBI Pathogen Detection).
When I query a sequence shared by all the strains, or a sequence from another strain (SRR18349613 or SRR18349615), the result is more or less the same, i.e., it shows a match for SRR18349609 only (only the numbers change with the query).

Can somebody help me, please?

Thank you!

fastas.txt list file:
../fastas_list/SRR18349609.fasta
../fastas_list/SRR18349610.fasta
../fastas_list/SRR18349611.fasta
../fastas_list/SRR18349613.fasta
../fastas_list/SRR18349615.fasta

Commands:
Bifrost build -t 4 -k 31 -i -d -s fastas.txt -c -o a_graph
BlastFrost -g a_graph.gfa -f a_graph.bfg_colors -q query.txt -o test

Messages from the terminal
Graph loading successful
QuerySearch initialized!
Goodbye!

Results:
test ../fastas_list/SRR18349609.fasta 1e-2006 1:1521,

@nluhmann
Owner

Hi David,
Sorry for my late reply. Could you also attach your query sequence? Then I can have a look from my end to see what is going on.
I assume your query should be present in all 5 Listeria strains?
Best,
Nina

@davidmaimoun
Author

Hi Nina,

Thanks for reaching out!

My queries:

SRR18349609
ACGTGGATTCGGATTCACTTAGCGATATGGACTCCGATTCACTCAATGACGTCGATTCAGACTCACTCAACGACGTGGATT
SRR18349615
GTCCCTTGACGGTATCTAACCAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGATTTATTGGGCGTAAAGCGCGCGCAGGCGGTCTTTTAAGTCTGATGTGAAAGCCCCCGGCTTAACCGGGGAGGGTCATTGGAAACTGGAAGACTGGAGTGCAGAAGAGGAGAGTGGAATTCCACGTGTAGCGGTGAAATGCGTAGATATGTGGAGGAACACCAGTGGCGAAGGCGACTCTCTGGTCTGTAACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCCCCTTAGTGCTGCAGCTAACGCATTAAGCACTCCGCCTGGGGAGTACGACCGCAAGGTTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGGTCTTGACATCCTTTGACCACTCTGGAGACAGAGCTTTCCCTTCGGGGACAAAGTGACAGGTGGTGCATGGTTGTC

The first one should be present in all the sequences, but the second one should not.
This is only a test, to learn the command lines and understand the outputs.

Thank you for your help Nina

Best,

@nluhmann
Owner

nluhmann commented Jun 28, 2022

Hi David,

I tested your queries on strains SRR18349609, SRR18349610 and SRR18349615 and found both queries in all three strains (as confirmed by blastn, but you'll know your data better), though in a rather fragmented fashion (see the search result below).

I have to admit that I have never really used BlastFrost on graphs built from sequencing reads before; we normally use genome assemblies so that we can add more strains to a single graph. Reads will obviously create a much more convoluted graph in the first place, though I am also getting a weird error regarding color IDs from Bifrost (@GuillaumeHolley did you change your API for Color IDs somehow?). I will look into this once I find some time.

BlastFrost search result; I marked your queries with '_q'.

SRR18349609_q SRR18349615.fastq 1e-400 1:4,0:4,1:2,0:2,1:3,0:5,1:1,0:2,1:3,0:1,1:12,0:2,1:1,0:7,1:2,
SRR18349609_q SRR18349610.fastq 1e-304 1:10,0:2,1:3,0:4,1:2,0:2,1:3,0:1,1:17,0:5,1:2,
SRR18349609_q SRR18349609.fastq 1e-308 1:10,0:2,1:3,0:4,1:2,0:2,1:3,0:1,1:20,0:2,1:2,
SRR18349615_q SRR18349615.fastq 0 1:1,0:1,1:3,0:1,1:2,0:2,1:2,0:1,1:3,0:1,1:3,0:3,1:4,0:5,1:3,0:1,1:2,0:1,1:4,0:2,1:3,0:2,1:1,0:2,1:3,0:1,1:7,0:1,1:1,0:2,1:8,0:1,1:5,0:2,1:8,0:1,1:1,0:1,1:2,0:2,1:5,0:3,1:2,0:1,1:1,0:2,1:1,0:2,1:1,0:1,1:5,0:2,1:3,0:1,1:7,0:1,1:3,0:3,1:3,0:1,1:2,0:1,1:3,0:1,1:3,0:1,1:3,0:2,1:2,0:1,1:1,0:1,1:5,0:1,1:1,0:1,1:3,0:1,1:2,0:1,1:1,0:3,1:2,0:2,1:4,0:1,1:1,0:2,1:1,0:1,1:1,0:2,1:7,0:1,1:2,0:1,1:2,0:2,1:3,0:2,1:6,0:2,1:7,0:2,1:2,0:2,1:1,0:1,1:2,0:3,1:2,0:1,1:1,0:1,1:2,0:1,1:11,0:1,1:1,0:3,1:4,0:2,1:1,0:1,1:1,0:1,1:2,0:1,1:5,0:2,1:3,0:3,1:1,0:1,1:1,0:4,1:2,0:2,1:2,0:2,1:18,0:3,1:1,0:3,1:1,0:1,1:3,0:1,1:1,0:2,1:1,0:1,1:1,0:1,1:3,0:1,1:4,0:2,1:3,0:1,1:3,0:1,1:6,0:1,1:2,0:1,1:3,0:1,1:2,0:5,1:1,0:2,1:4,0:1,1:4,0:4,1:4,0:2,1:1,0:1,1:4,0:4,1:1,0:2,1:5,0:2,1:1,0:1,1:2,0:1,1:3,0:1,1:4,0:1,1:4,0:2,1:1,0:2,1:5,0:1,1:7,0:2,1:1,0:2,1:9,0:1,1:4,0:2,1:1,0:1,1:2,0:6,1:9,0:1,1:2,0:1,1:15,0:2,1:1,0:1,1:3,0:1,1:10,0:1,1:3,0:1,1:5,0:1,1:1,
SRR18349615_q SRR18349610.fastq 0 1:5,0:1,1:2,0:2,1:2,0:1,1:3,0:1,1:5,0:1,1:4,0:5,1:3,0:1,1:2,0:1,1:4,0:1,1:4,0:2,1:1,0:2,1:3,0:1,1:7,0:1,1:1,0:2,1:14,0:2,1:5,0:1,1:2,0:1,1:1,0:1,1:3,0:1,1:5,0:3,1:2,0:1,1:1,0:2,1:1,0:2,1:1,0:1,1:18,0:1,1:3,0:3,1:3,0:1,1:2,0:1,1:3,0:1,1:3,0:1,1:3,0:2,1:2,0:1,1:1,0:1,1:5,0:1,1:1,0:1,1:3,0:1,1:2,0:1,1:1,0:4,1:7,0:1,1:1,0:4,1:1,0:2,1:5,0:1,1:1,0:1,1:3,0:1,1:1,0:2,1:3,0:2,1:6,0:2,1:7,0:1,1:3,0:4,1:3,0:2,1:2,0:1,1:1,0:1,1:2,0:1,1:11,0:1,1:1,0:3,1:4,0:2,1:1,0:1,1:1,0:1,1:2,0:1,1:5,0:2,1:4,0:2,1:1,0:1,1:1,0:4,1:2,0:2,1:2,0:2,1:1,0:2,1:15,0:3,1:1,0:1,1:1,0:1,1:1,0:1,1:3,0:1,1:1,0:2,1:1,0:1,1:1,0:1,1:3,0:1,1:4,0:3,1:2,0:1,1:3,0:1,1:5,0:2,1:2,0:1,1:3,0:1,1:2,0:5,1:1,0:2,1:9,0:4,1:4,0:2,1:1,0:1,1:2,0:4,1:10,0:2,1:1,0:1,1:2,0:1,1:3,0:1,1:4,0:1,1:4,0:2,1:1,0:2,1:16,0:2,1:8,0:2,1:7,0:1,1:2,0:6,1:9,0:1,1:2,0:1,1:2,0:1,1:12,0:2,1:1,0:1,1:3,0:1,1:10,0:1,1:3,0:1,1:1,0:1,1:3,0:1,1:1,
SRR18349615_q SRR18349609.fastq 0 1:5,0:1,1:6,0:1,1:3,0:1,1:5,0:1,1:4,0:5,1:6,0:1,1:9,0:2,1:1,0:2,1:3,0:1,1:7,0:1,1:1,0:2,1:4,0:1,1:9,0:2,1:8,0:1,1:1,0:1,1:3,0:1,1:5,0:3,1:4,0:2,1:1,0:2,1:1,0:1,1:18,0:2,1:2,0:3,1:6,0:2,1:2,0:1,1:3,0:1,1:3,0:2,1:2,0:1,1:1,0:1,1:7,0:1,1:3,0:1,1:2,0:1,1:1,0:4,1:7,0:1,1:1,0:4,1:1,0:2,1:5,0:1,1:1,0:1,1:2,0:2,1:1,0:2,1:3,0:2,1:6,0:2,1:7,0:1,1:2,0:5,1:3,0:2,1:2,0:1,1:1,0:1,1:2,0:1,1:11,0:1,1:1,0:3,1:4,0:2,1:1,0:1,1:1,0:1,1:2,0:1,1:5,0:2,1:4,0:2,1:1,0:1,1:1,0:4,1:2,0:2,1:2,0:2,1:18,0:3,1:1,0:3,1:1,0:1,1:5,0:2,1:1,0:1,1:1,0:1,1:3,0:1,1:4,0:3,1:2,0:1,1:3,0:2,1:8,0:1,1:3,0:1,1:2,0:2,1:2,0:1,1:1,0:2,1:9,0:4,1:4,0:2,1:1,0:1,1:2,0:6,1:8,0:2,1:1,0:2,1:1,0:1,1:3,0:1,1:4,0:1,1:4,0:2,1:1,0:2,1:14,0:1,1:1,0:2,1:8,0:2,1:7,0:1,1:2,0:3,1:12,0:1,1:2,0:1,1:2,0:1,1:11,0:3,1:1,0:1,1:3,0:1,1:10,0:1,1:3,0:1,1:1,0:1,1:3,0:1,1:1,
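
A note on reading the last column of the result lines above. It looks like a run-length encoded list of k-mer hits along the query, where "1:10" would mean 10 consecutive query k-mers found in that strain and "0:2" would mean 2 consecutive k-mers not found. This reading is an inference from the numbers in this thread, not something the maintainers confirm here: the runs for the 81 bp SRR18349609 query sum to 51, which is exactly 81 - 31 + 1, the number of 31-mers in that query. A minimal Python sketch under that assumption (the function names are made up for illustration and are not part of BlastFrost):

```python
from typing import Dict, List, Tuple

K = 31  # k-mer size taken from the `Bifrost build -k 31` command in this thread


def parse_runs(field: str) -> List[Tuple[int, int]]:
    """Parse a field such as '1:10,0:2,1:3,' into [(1, 10), (0, 2), (1, 3)]."""
    runs = []
    for token in field.strip().split(","):
        if not token:  # skip the empty piece left by the trailing comma
            continue
        flag, length = token.split(":")
        runs.append((int(flag), int(length)))
    return runs


def summarize(field: str) -> Dict[str, float]:
    """Count k-mers in total and k-mers flagged 1 (assumed to mean 'found')."""
    runs = parse_runs(field)
    total = sum(length for _, length in runs)
    hits = sum(length for flag, length in runs if flag == 1)
    return {"kmers": total, "hits": hits, "hit_fraction": hits / total}


# Query and result line copied from this thread.
query = (
    "ACGTGGATTCGGATTCACTTAGCGATATGGACTCCGATTCACTCAATGACGTCGATTCAGACTCACTCAACGACGTGGATT"
)
line = (
    "SRR18349609_q SRR18349610.fastq 1e-304 "
    "1:10,0:2,1:3,0:4,1:2,0:2,1:3,0:1,1:17,0:5,1:2,"
)

query_id, strain, evalue, runs_field = line.split()
stats = summarize(runs_field)
expected_kmers = len(query) - K + 1  # number of k-mers in a query of this length

print(query_id, strain, stats, "expected k-mers:", expected_kmers)
```

If the assumption holds, a query that is fully present in a strain would show up as a single run such as "1:1521," (as in the first result posted in this issue), while many short alternating runs would correspond to the fragmented hits described above.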

@GuillaumeHolley
Collaborator

Hey @nluhmann,

Yes, the Bifrost API has changed quite a bit over the past 2 years. Probably most significant for BlastFrost: a big change that breaks color compatibility with previous Bifrost versions was introduced in April this year, as reported in the API changelog. I wish I could have avoided this, but it was inevitable. The Bifrost version used by BlastFrost is from 2 years ago, so BlastFrost, as of now, cannot read any colored graph generated by Bifrost after April 2022. Two solutions:

  • Update the Bifrost version in BlastFrost. Only minor changes to function calls should be needed.
  • Build the Bifrost graph with any version of Bifrost released before April 28, 2022.

The first solution is by far the better one, given that Bifrost's performance improved considerably after this update.

@davidmaimoun
Author

Hi,
I understand now why I got weird error messages too, thank you.
And I see from your analysis, Nina, that it is working, so clearly I missed something.
I'll try it again on my assemblies.
One last question, if you don't mind explaining the output a little: I see that there were hits, and I know the e-value from BLAST, but what does the series of numbers represent?

Thanks for everything.
