
quantms failing with a big dataset 15K files #339

Closed · ypriverol opened this issue Jan 12, 2024 · 6 comments · Fixed by #341
Labels: bug (Something isn't working), high-priority

Comments
@ypriverol (Member) commented Jan 12, 2024

Description of the bug

I'm running a big dataset and the DIA-NN assembly step fails (vdemichev/DiaNN#899). It looks like a memory issue; I have given it more than 1.8 TB of memory and 48 CPUs. Vadim has suggested using only a random subset of the files for the library creation. @daichengxin, is this related to the previous PR #335?

Command used and terminal output

No response

Relevant files

No response

System information

No response

@daichengxin (Collaborator)

Yes, it's related to the library creation. How do we determine the number of randomly selected files? Or should it be a ratio?

@ypriverol (Member, Author)

I think it should be a parameter in the first implementation. In fact, we only have a few massive datasets like this; with a previous dataset of 6k files this step works. So, for this first implementation, I suggest adding a parameter, e.g. empirical_assembly_ms_n = 200, that the user can easily configure on the command line. What do you think?
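For illustration, a minimal shell sketch of that idea, in the spirit of the pipeline's existing bash steps (a hypothetical sketch, not the actual quantms implementation; the file pattern and output list name are assumptions):

# Hypothetical sketch: randomly select empirical_assembly_ms_n mzML files
# so that empirical library generation runs on a subset of the dataset.
empirical_assembly_ms_n=200
shuf -e *.mzML -n "$empirical_assembly_ms_n" > assembly_subset.txt
# assembly_subset.txt now lists the randomly chosen files for the library step.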

daichengxin linked a pull request (#341) Jan 13, 2024 that will close this issue
@ypriverol (Member, Author) commented Feb 11, 2024

@daichengxin I'm reopening this issue, because the solution still doesn't work. In the current implementation you select a certain number of raw files for the empirical assembly (👌), but in the assembly step you pass all the mzMLs, which fails no matter what value of empirical_assembly_ms_n I try. I have tried 10 files, but then in the assembly step all the raw files are still used, and even if I give it 1 TB of memory the tool fails. What are the options here, @daichengxin @vdemichev?

#!/bin/bash -euo pipefail
# Precursor Tolerance value was: 20.0
# Fragment Tolerance value was: 50.0
# Precursor Tolerance unit was: ppm
# Fragment Tolerance unit was: ppm

ls -lcth

diann -f {all_mzml} \
        --lib lib.predicted.speclib \
        --threads 24 \
        --out-lib empirical_library.tsv \
        --verbose 3 \
        --rt-profiling \
        --temp ./quant/ \
        --use-quant \
        --quick-mass-acc --individual-mass-acc \
        --individual-windows \
        --gen-spec-lib \
        2>&1 | tee assemble_empirical_library.log
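For comparison, a hedged sketch of what the assembly command could look like once only the subsampled files are passed ({subset_mzml} is a hypothetical placeholder for the subset selected via empirical_assembly_ms_n; the real template variable in quantms may differ):

# Hypothetical fix sketch: pass only the subsampled mzMLs instead of {all_mzml}
# ({subset_mzml} is a placeholder name, not the actual quantms variable).
diann -f {subset_mzml} \
        --lib lib.predicted.speclib \
        --threads 24 \
        --out-lib empirical_library.tsv \
        --verbose 3 \
        --rt-profiling \
        --temp ./quant/ \
        --use-quant \
        --quick-mass-acc --individual-mass-acc \
        --individual-windows \
        --gen-spec-lib \
        2>&1 | tee assemble_empirical_library.log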

ypriverol reopened this Feb 11, 2024
@vdemichev

"but then in the assembly step all the raw files are used" - the idea would be to not use all raw files for empirical library generation.

@jspaezp (Contributor) commented Apr 16, 2024

@ypriverol (Member, Author)

Yes, this issue is fixed. Let me close it to remove confusion.
