Computation of the unique genes per assembly provided by the quality report #47

Open
olar785 opened this issue Oct 28, 2021 · 0 comments

olar785 commented Oct 28, 2021

Hi Mathew, this is more of a question about the quality report, but I thought I would post it here in case it is of interest to others.
The quality report produces the following:

***** UNIQUE GENES ORP ~~~~~~~~~~~~~~~~~> 10278
***** UNIQUE GENES TRINITY ~~~~~~~~~~~~~> 9393
***** UNIQUE GENES SPADES55 ~~~~~~~~~~~~> 9058
***** UNIQUE GENES SPADES75 ~~~~~~~~~~~~> 8617
***** UNIQUE GENES TRANSABYSS ~~~~~~~~~~> 10617

From reading the manuscript and the documentation, it is not clear to me what the number of unique genes represents or how it is computed. Transrate reports the number of unigenes (a term I believe is used interchangeably with contigs), so I first thought the figure was the number of unique contigs. However, the final assembly I get from ORP contains a little over 218,000 unique sequences. I therefore imagine that the unique genes in the report are contigs that could be assigned to a certain database? Clarification on how this output is generated would be greatly appreciated.
Many thanks,
Olivier
