mapping input contig names to output #10
Comments
For cycles that are composed of multiple contigs in the graph, the second output file is meant for this purpose. For plasmids that are isolated in the graph, I don't think the names are maintained, but you can also parse them out of the original fastg by searching for self-edges, whose headers are formatted as X:X;
And no worries about the comparison.
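The self-edge search suggested above could be sketched roughly as follows. This is only an illustration, not part of the tool: the function name `isolated_self_loops` is hypothetical, and it assumes SPAdes-style fastg headers of the form `>NAME:SUCCESSOR1,SUCCESSOR2;`, where a trailing `'` marks the reverse-complement strand.

```python
def isolated_self_loops(fastg_lines):
    """Return names of edges whose only successor is the edge itself (header 'X:X;').

    Hypothetical helper: assumes headers look like '>NAME:SUCC1,SUCC2;'.
    """
    found = {}  # dict used as an ordered set
    for line in fastg_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        header = line[1:].strip().rstrip(";")
        name, sep, neighbours = header.partition(":")
        if not sep:
            continue  # edge has no successors at all
        successors = [n.strip() for n in neighbours.split(",")]
        if successors == [name.strip()]:
            # Report both strands of the same edge under one name.
            found[name.strip().rstrip("'")] = None
    return list(found)


if __name__ == "__main__":
    example = [
        ">EDGE_1_length_10_cov_2.0:EDGE_1_length_10_cov_2.0;",
        "ACGTACGTAC",
        ">EDGE_2_length_5_cov_1.0:EDGE_3_length_5_cov_1.0;",
        "ACGTA",
        ">EDGE_3_length_5_cov_1.0;",
        "TTTTT",
    ]
    print(isolated_self_loops(example))
```

In practice the lines would come from iterating over the opened fastg file. An edge that links to itself and to other edges is deliberately not reported, since it is not isolated in the graph.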
On this particular run, I have a fasta file with extension
Together these outputs mean there was only one plasmid found, and that it was isolated in the original graph. The cov.txt file only gets written to when there are cycles made up of multiple nodes (edges in SPAdes' convention).
OK, but going back to the original question, would it be possible to add a way of mapping source contigs to output contigs?
I would appreciate this feature as well. It would make it possible to further analyse putative contigs in anvi'o by importing the full assembly and mapping the contigs into putative plasmid bins.
Given that I get a contig name like RNODE_1_length_2185_cov_1230.39000, how would I know which contig that is in the original file? I can search the coverage field 1230.39 to find the original contig name NODE_18_length_2240_cov_1230.39_component_1, but I hope there is a better way. Even in the stdout, the contig name is different, as EDGE_703_length_2240_cov_1230.39_component_1.
This output is from plasmidSPAdes. Sorry, I don't know if/when I would have the time to do a head-to-head comparison, regarding my previous issue #8.
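Until a proper mapping exists, the coverage-field search described above can at least be scripted. The sketch below is a stopgap only: the names `coverage` and `match_by_coverage` are hypothetical, it assumes every contig name contains a `_cov_<number>` field, and it returns several candidates when more than one input contig shares the same coverage value.

```python
import math
import re

# Assumption: coverage is encoded in the name as '_cov_<number>'.
COV_RE = re.compile(r"_cov_(\d+(?:\.\d+)?)")


def coverage(name):
    """Extract the coverage value from a contig name, or None if absent."""
    match = COV_RE.search(name)
    return float(match.group(1)) if match else None


def match_by_coverage(output_names, input_names, tol=1e-6):
    """Map each output contig name to the input names with equal coverage."""
    inputs = [(name, coverage(name)) for name in input_names]
    mapping = {}
    for out in output_names:
        out_cov = coverage(out)
        mapping[out] = [
            name
            for name, in_cov in inputs
            if out_cov is not None
            and in_cov is not None
            and math.isclose(out_cov, in_cov, abs_tol=tol)
        ]
    return mapping


if __name__ == "__main__":
    outputs = ["RNODE_1_length_2185_cov_1230.39000"]
    inputs = [
        "NODE_18_length_2240_cov_1230.39_component_1",
        "NODE_2_length_500_cov_12.5_component_0",
    ]
    for out, candidates in match_by_coverage(outputs, inputs).items():
        print(out, "->", candidates)
```

Comparing the values as numbers rather than strings is what lets 1230.39000 match 1230.39. An empty or multi-element candidate list signals that coverage alone cannot identify the source contig.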