Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LinAlgError when running dyn.tl.dynamics() #676

Open
nprashant2001 opened this issue May 11, 2024 Discussed in #423 · 7 comments
Open

LinAlgError when running dyn.tl.dynamics() #676

nprashant2001 opened this issue May 11, 2024 Discussed in #423 · 7 comments

Comments

@nprashant2001
Copy link

Discussed in #423

Originally posted by Ckenen September 21, 2022
When I run dyn.tl.dynamics(adata, group="group", NTR_vel=True), it will raise LinAlgError: SVD did not converge.
However, if I do not provide group parameter, not error will be raised.
But I need M_s and velocity_S in my downstream analysis.
How can I fixed this problem?

Detail as follow:
LinAlgError Traceback (most recent call last)
Cell In [150], line 2
1 # dyn.tl.dynamics(adata, NTR_vel=True)
----> 2 dyn.tl.dynamics(adata, group="group", NTR_vel=True)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/dynamo/tools/dynamics.py:521, in dynamics(adata, filter_gene_mode, use_smoothed, assumption_mRNA, assumption_protein, model, est_method, NTR_vel, group, protein_names, concat_data, log_unnormalized, one_shot_method, fraction_for_deg, re_smooth, sanity_check, del_2nd_moments, cores, tkey, **est_kwargs)
518 warnings.simplefilter("ignore")
520 if experiment_type.lower() in ["one-shot", "one_shot"]:
--> 521 est.fit(one_shot_method=one_shot_method, **est_kwargs)
522 else:
523 # experiment_type can be kin also and by default use
524 # conventional method to estimate k but correct for time
525 est.fit(**est_kwargs)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/dynamo/estimation/csc/velocity.py:1368, in ss_estimation.fit(self, intercept, perc_left, perc_right, clusters, one_shot_method)
1357 if cores == 1:
1358 for i in tqdm(range(n_genes), desc="estimating gamma"):
1359 (
1360 k[i],
1361 k_intercept[i],
1362 _,
1363 k_r2[i],
1364 _,
1365 k_logLL[i],
1366 bs[i],
1367 bf[i],
-> 1368 ) = self.fit_gamma_stochastic(
1369 self.est_method,
1370 U[i],
1371 S[i],
1372 US[i],
1373 S2[i],
1374 perc_left=perc_left,
1375 perc_right=perc_right,
1376 normalize=True,
1377 )
1378 else:
1379 pool = ThreadPool(cores)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/dynamo/estimation/csc/velocity.py:1690, in ss_estimation.fit_gamma_stochastic(self, est_method, u, s, us, ss, perc_left, perc_right, normalize)
1688 bs, bf = None, None
1689 if est_method.lower() == "gmm":
-> 1690 k = fit_stochastic_linreg(u[mask], s[mask], us[mask], ss[mask])
1691 elif est_method.lower() == "negbin":
1692 phi = compute_dispersion(s, ss)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/dynamo/estimation/csc/utils_velocity.py:420, in fit_stochastic_linreg(u, s, us, ss, fit_2_gammas, err_cov)
418 else:
419 cov = np.diag(E.var(1))
--> 420 cov_inv = np.linalg.pinv(cov)
422 # generalized least squares
423 xy, xx = 0, 0

File <array_function internals>:5, in pinv(*args, **kwargs)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/numpy/linalg/linalg.py:2002, in pinv(a, rcond, hermitian)
2000 return wrap(res)
2001 a = a.conjugate()
-> 2002 u, s, vt = svd(a, full_matrices=False, hermitian=hermitian)
2004 # discard small singular values
2005 cutoff = rcond[..., newaxis] * amax(s, axis=-1, keepdims=True)

File <array_function internals>:5, in svd(*args, **kwargs)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/numpy/linalg/linalg.py:1660, in svd(a, full_matrices, compute_uv, hermitian)
1657 gufunc = _umath_linalg.svd_n_s
1659 signature = 'D->DdD' if isComplexType(t) else 'd->ddd'
-> 1660 u, s, vh = gufunc(a, signature=signature, extobj=extobj)
1661 u = u.astype(result_t, copy=False)
1662 s = s.astype(_realType(result_t), copy=False)

File ~/miniconda3/envs/scanpy/lib/python3.10/site-packages/numpy/linalg/linalg.py:97, in _raise_linalgerror_svd_nonconvergence(err, flag)
96 def _raise_linalgerror_svd_nonconvergence(err, flag):
---> 97 raise LinAlgError("SVD did not converge")

LinAlgError: SVD did not converge

@migthms
Copy link

migthms commented May 17, 2024

Hello,

I am trying to follow the tutorial "scNT-seq human hematopoiesis dynamics" and estimate RNA Velocity with Model 1 instead of Model 2 from the adata_hsc_raw (dyn.sample_data.hematopoiesis_raw()).
I defined adata_hsc_raw.uns["pp"] like the one I found in the tutorial of Zebrafish pigmentation.
image

And I got the exact same error as mentioned above when trying to run the following line : dyn.tl.dynamics(adata_hsc_raw, model="stochastic").

dynamo==1.4.0
numpy==1.26.4

Thank you in advance for any response!

@Sichao25
Copy link
Collaborator

Thanks for raising the issue. Could you check whether there are NaNs or Infs in your processed data before dyn.tl.dynamics()?

@migthms
Copy link

migthms commented Aug 12, 2024

There are no NaNs nor Infs in any of those layers coming from the preprocessing and moments computation of dyn.sample_data.hematopoiesis_raw()... but they seem to contain a lot of zeros, could it be the problem ?

image

@migthms
Copy link

migthms commented Aug 12, 2024

I just want to estimate the velocities from unsplicing/splicing data only (Model 1).
Is there a tutorial that I could find and use to do it for the human hematopoiesis dataset ?

@Sichao25
Copy link
Collaborator

Given that most of the data is sparse, zeros shouldn't be an issue most of the time. I can reproduce the problem now. On my end, it happens only with the stochastic model when group-specific parameters are enabled. I'll investigate further and get back to you later.

The zebrafish here only uses the splicing information, which is a good reference.

@Sichao25
Copy link
Collaborator

Sichao25 commented Aug 12, 2024

This is a very good catch. Now I see where the problem is. When performing group-specific dynamics estimation, a NaN value will be generated if one gene vector is entirely composed of zeros within a specific group. Mon cell type in hematopoiesis data contains such a vector after processing.

The best way to resolve this issue is not clear so far. For now, I can add a more detailed error message before we implement an actual solution to handle or ignore such vectors. On your side, I suggest temporarily estimating velocities without the group parameter or manually filtering out those vectors.

@migthms
Copy link

migthms commented Aug 13, 2024

Thank you very much for your investigation and your detailed response!
I will give it a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants