Feature Request: Multi-database support options for sourmash gather
#536
Hi @tnmquann, agree! Please take a look at:
In recent releases (v0.9.8 and beyond, with important bug fixes through v0.9.11, the current release) we added full support for standalone manifests, which should address most of your use cases above. You can now create a single manifest CSV with
Please let us know how I can make this clearer in the documentation!
You can also use this to create a single RocksDB database from multiple inputs. Unfortunately, at the moment there is no way to use multiple RocksDB databases as a search target without loading them all into memory (which defeats the purpose of them, yes). This is not something I think will improve in the next few releases.
Hi @ctb, Sorry for the late response. I’ve been testing using a pathlist to create a general RocksDB. Everything is going smoothly except for a warning and a JSON load failure when using
Additionally, I’ve noticed issues when using
For now, loading all databases into memory is manageable, as they don’t consume too much memory.
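For readers following along, a pathlist here means a plain text file with one database or signature path per line. The sketch below shows one way to generate such a file with the Python standard library only. The function name `write_pathlist`, the `*.zip` pattern, and the file name `databases.txt` are illustrative assumptions, not part of sourmash or the branchwater plugin.

```python
from pathlib import Path
import tempfile


def write_pathlist(db_dir, out_file, pattern="*.zip"):
    """Write one absolute path per line for each file in db_dir matching pattern.

    Hypothetical helper for illustration; not a sourmash API.
    """
    paths = sorted(p.resolve() for p in Path(db_dir).glob(pattern) if p.is_file())
    Path(out_file).write_text("".join(f"{p}\n" for p in paths))
    return paths


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    for name in ("b.zip", "a.zip", "notes.txt"):
        (tmp / name).touch()
    listed = write_pathlist(tmp, tmp / "databases.txt")
    pathlist_lines = (tmp / "databases.txt").read_text().splitlines()
    print(pathlist_lines)
```

Sorting the paths keeps the file stable between runs, which makes it easier to tell whether the set of inputs changed.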
I'll take a look - thanks!
Hi @ctb,
I’m exploring options to make microbial profiling more comprehensive using sourmash gather/multigather. Specifically, I’m curious if it’s possible to use multiple databases simultaneously in a single command. The branchwater plugin documentation doesn’t mention this feature, but it would be invaluable for achieving a more holistic view of all microorganisms in a sample. This could also benefit users with limited storage or computational capacity who cannot retain all raw sequences for re-sketching later.
For example:
Additionally, I have a couple of feature suggestions:
*.zip database by unzipping and merging, but this failed for some samples with similar hashes. I encountered files differentiated only by a number suffix, such as sig1.siz.gz, sig1.siz.gz_1, sig1.siz.gz_2, which complicates merging and management.
Looking forward to hearing your thoughts on these ideas and any potential solutions! Thanks!
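One way to spot the naming problem described above before unzipping anything is to list the member names of each archive and report names that occur in more than one. The following is a minimal sketch using only Python's `zipfile` module. It compares file names only, not signature hashes, and the helpers `find_name_collisions` and `make_zip` as well as the archive labels `db1.zip` and `db2.zip` are hypothetical.

```python
import io
import zipfile
from collections import defaultdict


def find_name_collisions(archives):
    """Return {member_name: [archive labels]} for names found in more than one archive.

    `archives` maps a label to a path or file-like object readable by zipfile.
    """
    seen = defaultdict(list)
    for label, source in archives.items():
        with zipfile.ZipFile(source) as zf:
            # set() so a name repeated inside one archive is counted once
            for name in set(zf.namelist()):
                seen[name].append(label)
    return {name: labels for name, labels in seen.items() if len(labels) > 1}


def make_zip(names):
    """Build an in-memory zip with empty members, for demonstration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    return buf


collisions = find_name_collisions({
    "db1.zip": make_zip(["sig1.siz.gz", "sig2.siz.gz"]),
    "db2.zip": make_zip(["sig1.siz.gz", "sig3.siz.gz"]),
})
print(collisions)
```

Any name reported here would collide when both archives are extracted into the same directory, which is presumably where suffixed copies such as `sig1.siz.gz_1` come from.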