Unexpected out of memory #113

Open
dalcsy opened this issue Jan 29, 2025 · 4 comments

Comments

@dalcsy

dalcsy commented Jan 29, 2025

Hi, I'm using scDRS with two traits and an scRNA-seq dataset (shape 426635 × 17688, ~29 GB). As stated, I requested 120 GB of memory in my Slurm run script, but the job ran out of memory, and it still failed with 500 GB.
My run script and the error log are below. I'd appreciate any suggestions if I did something wrong.

Thank you.

best,
Siyuan


#!/bin/bash
#SBATCH --job-name=scDRS
#SBATCH --output=scDRS.out
#SBATCH --error=scDRS.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=5:00:00
#SBATCH --mem=500G

# run core compute-score

python /home/sc2514/softwares/scDRS/scDRS-1.0.2/bin/scdrs compute-score \
    --h5ad-file /rds/user/sc2514/hpc-work//DATA/6_scDRS/scRNA_data/merged_data.h5ad \
    --h5ad-species human \
    --gs-file /rds/user/sc2514/hpc-work/CAD_progression_project/OUTPUT/6_scDRS/scDRS_GS/CAD.gs \
    --gs-species human \
    --out-folder /rds/user/sc2514/hpc-work/CAD_progression_project/OUTPUT/6_scDRS/scDRS/CAD \
    --flag-filter-data True \
    --flag-raw-count True \
    --n-ctrl 1000 \
    --flag-return-ctrl-raw-score False \
    --flag-return-ctrl-norm-score True


Computing control scores: 50%|████▉ | 497/1000 [1:03:38<1:04:24, 7.68s/it]
Traceback (most recent call last):
  File "/home/sc2514/softwares/scDRS/scDRS-1.0.2/bin/scdrs", line 740, in <module>
    fire.Fire()
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/sc2514/softwares/scDRS/scDRS-1.0.2/bin/scdrs", line 227, in compute_score
    df_res = scdrs.score_cell(
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/scdrs/method.py", line 178, in score_cell
    v_ctrl_raw_score, v_ctrl_weight = _compute_raw_score(
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/scdrs/method.py", line 395, in _compute_raw_score
    v_raw_score = adata[:, gene_list].X.dot(v_score_weight).reshape([-1])
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/anndata/_core/anndata.py", line 1109, in __getitem__
    return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/anndata/_core/anndata.py", line 289, in __init__
    self._init_as_view(X, oidx, vidx)
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/anndata/_core/anndata.py", line 361, in _init_as_view
    self._raw = adata_ref.raw[oidx]
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/anndata/_core/raw.py", line 110, in __getitem__
    X = _subset(self.X, (oidx, vidx))
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/functools.py", line 874, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/anndata/_core/index.py", line 140, in _subset_spmatrix
    return a[subset_idx]
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/scipy/sparse/_index.py", line 68, in __getitem__
    return self.copy()
  File "/home/sc2514/.conda/envs/scRNA/lib/python3.8/site-packages/scipy/sparse/_data.py", line 92, in copy
    return self._with_data(self.data.copy(), copy=True)
numpy.core._exceptions.MemoryError: Unable to allocate 9.54 GiB for an array with shape (1280291246,) and data type float64
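
A quick check of the numbers in the error message may help when reading the log. The traceback passes through `self._raw = adata_ref.raw[oidx]` in anndata and ends in a scipy `copy()`, which suggests (this is an inference from the trace, not something confirmed in the thread) that the failed allocation is a full copy of the stored values of `adata.raw.X`. The sketch below only reproduces the arithmetic for that one array:

```python
# Figures copied from the MemoryError message in the log above.
n_elements = 1_280_291_246   # length of the array numpy tried to allocate
bytes_per_element = 8        # float64 uses 8 bytes per value

size_gib = n_elements * bytes_per_element / 2**30
print(f"float64 copy: {size_gib:.2f} GiB")      # float64 copy: 9.54 GiB
print(f"float32 copy: {size_gib / 2:.2f} GiB")  # float32 copy: 4.77 GiB
```

A single copy is small next to the 500 GB request, so the failure presumably reflects memory already in use at that point in the run rather than this one allocation on its own.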

@martinjzhang
Owner

Have you tried storing the data in sparse format, e.g., setting adata.X to a scipy.sparse.csr_matrix? scDRS is compatible with the sparse format.
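
For reference, a minimal sketch of checking and converting a matrix to CSR with scipy. The helper name `ensure_csr` is hypothetical and not part of scDRS or anndata; with an AnnData object one would assign the result back, as in the final comment.

```python
import numpy as np
from scipy import sparse


def ensure_csr(x):
    """Return x as a CSR matrix, converting dense arrays or other sparse formats."""
    if sparse.issparse(x) and x.format == "csr":
        return x  # already CSR, nothing to do
    return sparse.csr_matrix(x)


# Toy dense matrix standing in for adata.X
dense = np.array([[0.0, 1.5, 0.0],
                  [0.0, 0.0, 2.0]])
x = ensure_csr(dense)
print(x.format, x.nnz)

# With an AnnData object (assumption: `adata` is already loaded):
# adata.X = ensure_csr(adata.X)
```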

@dalcsy
Author

dalcsy commented Jan 31, 2025

Thank you for your quick response. I checked adata.X: it is already in CSR format, and the normalized values are stored as float64. As this is my first time using this tool, I'm wondering whether I set some flags incorrectly and caused the issue. My dataset seems to be much smaller than the one in the published paper.

adata.X[0:5,0:5]
<5x5 sparse matrix of type '<class 'numpy.float64'>'
with 10 stored elements in Compressed Sparse Row format>

Thank you!
Siyuan

@martinjzhang
Owner

Hi Siyuan,
Did you use the normalized counts (real values) instead of the raw counts (integer values)? If so, can you use the raw counts as input?
In addition, consider changing the data type to np.float32 to halve the memory used by the values: adata.X = adata.X.astype(np.float32)
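
A small sketch of what the float32 cast does to a CSR matrix, using a random toy matrix rather than the dataset from this issue. The helper `csr_nbytes` is hypothetical. One caveat it illustrates: the cast halves the `data` array, while the `indices` and `indptr` arrays keep their size, so the total saving for a sparse matrix is somewhat less than half.

```python
import numpy as np
from scipy import sparse


def csr_nbytes(m):
    """Total bytes held by the three arrays of a CSR matrix."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes


# Toy CSR matrix standing in for adata.X
x64 = sparse.random(1000, 200, density=0.1, format="csr",
                    dtype=np.float64, random_state=0)
x32 = x64.astype(np.float32)

print("values :", x64.data.nbytes, "->", x32.data.nbytes)
print("total  :", csr_nbytes(x64), "->", csr_nbytes(x32))
```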

@dalcsy
Author

dalcsy commented Feb 3, 2025

Thank you, I understand. My current file holds the normalized data in adata.X, and the raw counts are in adata.raw.X. I did not find a flag that tells scDRS to use adata.raw.X for the raw counts, so I'm trying to overwrite the adata.X matrix with my integer raw counts.
I'll let you know if it works well.

best,
Siyuan
