modified header names in PHAMB #41

Open
ShailNair opened this issue Jul 7, 2022 · 4 comments

@ShailNair

Hi,

My assembled contigs have headers such as:

c_000003956504
c_000004841845
c_000004821562

which match the VAMB bin headers. But when I run PHAMB, I get bin headers such as:

1470111
816445
3021234
1094390

How can I get the PHAMB contig headers in the original VAMB bin header format?

@joacjo
Collaborator

joacjo commented Sep 27, 2022

Hi Shail

Can you send me a snapshot of your clusters.tsv file? And how many samples do you have?
I suspect there might be a problem in the naming of your contigs.

The framework is designed to use the sample IDs from the contig headers to keep track of which sample each viral bin comes from.

best,
Joachim

@ShailNair
Author

@joacjo I used a single co-assembled contig file for binning. Here is a snapshot of the clusters.tsv file and the PHAMB-generated .fna file:
[Attached screenshot: QQ图片20220928084217]

The clusters.tsv file has 1,101,140 records.

I followed the "How to Run - not in parallel - quick and dirty" tutorial.

Thank you.

@joacjo
Collaborator

joacjo commented Sep 29, 2022

Hi Shail

Ah, I see. The names of the entries in vamb_bins.fna match the VAMB cluster names. Remember that the bins in the .fna file are concatenations of the VAMB cluster sequences.

Example: In your clusters.tsv you might have a cluster with multiple contigs:

cluster contig
99999 c_000000123
99999 c_000000321

If this cluster is predicted to be putatively viral, the resulting name in the .fna file will be: 99999
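
To recover the original contig headers behind a given bin name, the clusters.tsv file can be used as a lookup table. The following is only a minimal sketch, not part of PHAMB: the function name `load_cluster_contigs` is hypothetical, and it assumes clusters.tsv is tab-separated with the cluster ID in the first column and the contig name in the second, as in the example above. It also assumes that an optional header row starts with `cluster` or `clustername`.

```python
import csv
from collections import defaultdict


def load_cluster_contigs(path):
    """Map each cluster ID to the list of contig names assigned to it.

    Assumes a tab-separated file: cluster ID in column 1, contig name in column 2.
    """
    clusters = defaultdict(list)
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank or malformed lines and comment lines.
            if len(row) < 2 or row[0].startswith("#"):
                continue
            # Skip an optional header row (the header names are an assumption).
            if row[0] in ("cluster", "clustername"):
                continue
            cluster_id, contig = row[0], row[1]
            clusters[cluster_id].append(contig)
    return dict(clusters)
```

With the two-contig example above, `load_cluster_contigs("clusters.tsv")["99999"]` would return `["c_000000123", "c_000000321"]`, i.e. the original contig headers that were concatenated into the bin named 99999.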

Does this make sense?

Best,
Joachim

@ShailNair
Author

ShailNair commented Sep 30, 2022

Thanks, that makes sense.
Thank you for this very helpful tool. With PHAMB we were able to extract three times as many complete viral contigs (as per CheckV's rule) as with VirSorter2, DeepVirFinder and viralVerify.
