quantms failing with a big dataset (15K files) #339
Comments
Yes, it's related to library creation. How do we determine the random number of files? Or should it be a ratio?
I think it must be a parameter in this first implementation. We only have a few massive datasets like this; in fact, this step worked with a previous dataset of 6k files. So I would just suggest taking (in this first implementation) a parameter, which can
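One way to frame that choice is to support both an absolute count and a ratio and require the caller to pick exactly one. Below is a minimal Python sketch of that idea; the function name, parameter names, and the fixed seed are illustrative assumptions, not the quantms implementation.

```python
import random

def select_library_subset(mzml_files, n_files=None, ratio=None, seed=42):
    """Pick a reproducible random subset of mzML files for empirical
    library generation (hypothetical helper, not quantms code).
    Exactly one of n_files / ratio must be given."""
    if (n_files is None) == (ratio is None):
        raise ValueError("set exactly one of n_files or ratio")
    k = n_files if n_files is not None else max(1, round(len(mzml_files) * ratio))
    k = min(k, len(mzml_files))   # never request more files than exist
    rng = random.Random(seed)     # fixed seed keeps reruns reproducible
    return sorted(rng.sample(list(mzml_files), k))

# e.g. 200 files out of 15k, or 5% of the run list:
# subset = select_library_subset(files, n_files=200)
# subset = select_library_subset(files, ratio=0.05)
```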
@daichengxin I'm reopening this issue because the solution still doesn't work. In the current implementation, you selected a certain number of raw files for the empirical assembly (👌), but in the assembly step you are passing all the mzMLs, which will not work for any amount of data I'm trying with the
"but then in the assembly step all the raw files are used" - the idea would be to not use all raw files for empirical library generation |
Is this fixed in point 3 of this PR -> #355?
Yes, this issue is fixed. Let me close it to remove confusion.
Description of the bug
I'm running a big dataset and the DIA-NN assembly step fails (vdemichev/DiaNN#899). It looks like a memory issue; I have given it more than 1.8 TB of memory and 48 CPUs. Vadim has suggested using only a random subset of files for the library creation. @daichengxin, is this related to the previous PR #335?
Command used and terminal output
No response
Relevant files
No response
System information
No response