RRuntimeError: Error in colSums(x) : 'x' must be numeric #681

Open
ahmed-agami opened this issue Nov 29, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@ahmed-agami

ahmed-agami commented Nov 29, 2024

Report

Hi Lukas,

I followed the steps in https://pertpy.readthedocs.io/en/latest/tutorials/notebooks/milo.html exactly, under the section "Milo - KNN based differential abundance analysis", using my own AnnData object. I got the above error when I tried to run the differential abundance testing with GLM.


RRuntimeError Traceback (most recent call last)
Cell In[114], line 1
----> 1 milo.da_nhoods(mdata, design="~condition", model_contrasts="conditionGAPDH-conditioncontrol")

File ~/yes/envs/pertpy/lib/python3.10/site-packages/pertpy/tools/_milo.py:371, in Milo.da_nhoods(self, mdata, design, model_contrasts, subset_samples, add_intercept, feature_key, solver)
369 # Fit NB-GLM
370 dge = edgeR.DGEList(counts=count_mat[keep_nhoods, :][:, keep_smp], lib_size=lib_size[keep_smp])
--> 371 dge = edgeR.calcNormFactors(dge, method="TMM")
372 dge = edgeR.estimateDisp(dge, model)
373 fit = edgeR.glmQLFit(dge, model, robust=True)

File ~/yes/envs/pertpy/lib/python3.10/site-packages/rpy2/robjects/functions.py:208, in SignatureTranslatedFunction.__call__(self, *args, **kwargs)
206 v = kwargs.pop(k)
207 kwargs[r_k] = v
--> 208 return (super(SignatureTranslatedFunction, self)
209 .__call__(*args, **kwargs))

File ~/yes/envs/pertpy/lib/python3.10/site-packages/rpy2/robjects/functions.py:131, in Function.__call__(self, *args, **kwargs)
129 else:
130 new_kwargs[k] = cv.py2rpy(v)
--> 131 res = super(Function, self).__call__(*new_args, **new_kwargs)
132 res = cv.rpy2py(res)
133 return res

File ~/yes/envs/pertpy/lib/python3.10/site-packages/rpy2/rinterface_lib/conversion.py:45, in _cdata_res_to_rinterface.<locals>._(*args, **kwargs)
44 def _(*args, **kwargs):
---> 45 cdata = function(*args, **kwargs)
46 # TODO: test cdata is of the expected CType
47 return _cdata_to_rinterface(cdata)

File ~/yes/envs/pertpy/lib/python3.10/site-packages/rpy2/rinterface.py:890, in SexpClosure.__call__(self, *args, **kwargs)
883 res = rmemory.protect(
884 openrlib.rlib.R_tryEval(
885 call_r,
886 call_context.sexp._cdata,
887 error_occured)
888 )
889 if error_occured[0]:
--> 890 raise embedded.RRuntimeError(_rinterface._geterrmessage())
891 return res

RRuntimeError: Error in colSums(x) : 'x' must be numeric

Image

Version information


anndata 0.10.8
anndata2ri 1.3.2
matplotlib 3.9.2
mudata 0.3.1
numba 0.60.0
numpy 1.26.4
pandas 2.2.3
pertpy 0.9.4
rpy2 3.5.17
scanpy 1.10.4
scipy 1.14.1
scvi 1.2.0
seaborn 0.13.2
session_info 1.0.0

PIL 11.0.0
absl NA
adjustText 1.3.0
anyio NA
arrow 1.3.0
asttokens NA
attr 24.2.0
attrs 24.2.0
babel 2.16.0
backports NA
blitzgsea NA
certifi 2024.08.30
cffi 1.17.1
charset_normalizer 3.4.0
chex 0.1.87
comm 0.2.2
cycler 0.12.1
cython_runtime NA
dateutil 2.9.0.post0
debugpy 1.8.9
decorator 5.1.1
decoupler 1.8.0
defusedxml 0.7.1
docrep 0.3.2
equinox 0.11.9
etils 1.11.0
exceptiongroup 1.2.2
executing 2.1.0
fastjsonschema NA
filelock 3.16.1
flax 0.10.2
fqdn NA
fsspec 2024.10.0
h5py 3.12.1
idna 3.10
igraph 0.11.8
ipykernel 6.29.5
ipywidgets 8.1.5
isoduration NA
jaraco NA
jax 0.4.35
jaxlib 0.4.35
jaxopt NA
jaxtyping 0.2.36
jedi 0.19.2
jinja2 3.1.4
joblib 1.4.2
json5 0.10.0
jsonpointer 3.0.0
jsonschema 4.23.0
jsonschema_specifications NA
jupyter_events 0.10.0
jupyter_server 2.14.2
jupyterlab_server 2.27.3
kiwisolver 1.4.7
lamin_utils 0.13.9
legacy_api_wrap NA
leidenalg 0.10.2
lightning 2.4.0
lightning_fabric 2.4.0
lightning_utilities 0.11.9
lineax 0.0.7
llvmlite 0.43.0
markupsafe 3.0.2
matplotlib_inline 0.1.7
ml_collections 1.0.0
ml_dtypes 0.5.0
more_itertools 10.3.0
mpl_toolkits NA
mpmath 1.3.0
msgpack 1.1.0
multipledispatch 0.6.0
natsort 8.4.0
nbformat 5.10.4
numpyro 0.16.0
nvidia NA
opt_einsum 3.4.0
optax 0.2.4
ott 0.4.9
overrides NA
packaging 24.2
parso 0.8.4
patsy 1.0.1
platformdirs 4.3.6
ply 3.11
prometheus_client NA
prompt_toolkit 3.0.48
psutil 6.1.0
pubchempy 1.0.4
pure_eval 0.2.3
pyarrow 18.1.0
pycparser 2.22
pydev_ipython NA
pydevconsole NA
pydevd 3.2.3
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.18.0
pynndescent 0.5.13
pyomo 6.8.2
pyparsing 3.2.0
pyro 1.9.1
pythonjsonlogger NA
pytorch_lightning 2.4.0
pytz 2024.2
referencing NA
requests 2.32.3
rfc3339_validator 0.1.4
rfc3986_validator 0.1.1
rich NA
rpds NA
send2trash NA
setuptools 75.6.0
simplejson 3.19.3
six 1.16.0
sklearn 1.5.2
skmisc 0.5.1
sniffio 1.3.1
sparse 0.15.4
sparsecca 0.3.1
stack_data 0.6.3
statsmodels 0.14.4
sympy 1.13.1
texttable 1.7.0
threadpoolctl 3.5.0
toolz 1.0.0
torch 2.5.1+cu124
torchgen NA
torchmetrics 1.6.0
tornado 6.4.2
tqdm 4.67.1
traitlets 5.14.3
triton 3.1.0
typing_extensions NA
tzlocal NA
umap 0.5.7
uri_template NA
urllib3 2.2.3
wcwidth 0.2.13
webcolors NA
websocket 1.8.0
xarray 2024.11.0
yaml 6.0.2
zmq 26.2.0
zoneinfo NA

IPython 8.30.0
jupyter_client 8.6.3
jupyter_core 5.7.2
jupyterlab 4.2.6
notebook 7.2.2

Python 3.10.15 | packaged by conda-forge | (main, Oct 16 2024, 01:24:24) [GCC 13.3.0]
Linux-5.15.0-124-generic-x86_64-with-glibc2.31

Session information updated at 2024-11-29 19:32

ahmed-agami added the bug label Nov 29, 2024
@emdann
Member

emdann commented Dec 10, 2024

Hi @ahmed-agami, replying here for emdann/milopy#52 (although the examples are different)

Can you check that you have no NA or inf in mdata['milo'].X? Or in mdata['rna'].identifier? What are the levels of condition? Do you get the same error if you use only da_nhoods(mdata, design="~condition") without specifying contrasts?
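For anyone wanting to run the first check, here is a minimal sketch that uses a toy NumPy array in place of mdata['milo'].X. The helper name `summarize_matrix` and the toy values are illustrative assumptions and not part of pertpy. The duck-typed `toarray()` call is only there so that the same helper also accepts a scipy sparse matrix.

```python
import numpy as np


def summarize_matrix(x):
    """Report the dtype and the number of NaN / inf entries of a matrix."""
    # Densify if the object looks like a scipy sparse matrix (has .toarray()).
    dense = x.toarray() if hasattr(x, "toarray") else np.asarray(x)
    input_dtype = str(dense.dtype)
    # Cast to float so that np.isnan / np.isinf also work on integer counts.
    dense = dense.astype(float)
    return {
        "dtype": input_dtype,
        "n_nan": int(np.isnan(dense).sum()),
        "n_inf": int(np.isinf(dense).sum()),
    }


# Toy stand-in for mdata["milo"].X (hypothetical values, samples x neighbourhoods).
toy_counts = np.array([[3.0, 0.0, 5.0], [np.nan, 2.0, np.inf]])
report = summarize_matrix(toy_counts)
print(report)
```

On real data one would pass mdata['milo'].X instead of the toy array. Nonzero NaN or inf counts, or a dtype that is not an integer or float type, would be worth reporting back in the thread.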

@ramadatta

Hi @emdann ,

Many thanks for your reply. We have been stuck on this for a while.

Replying on behalf of @ahmed-agami, as I am analysing the same data. I have simplified the dataset by subsetting it.

Please see below:

Image

Image

Additionally, I checked whether mdata['rna'].X has any issues, and everything seems to be okay with the input:

Image

The error occurs with or without specifying contrasts.

Image

Best Regards
Datta

@emdann
Member

emdann commented Dec 16, 2024

I am not sure whether this is what's causing the issue here, but it seems like you are trying to test for differential abundance on a condition for which you don't have replicates. There are only 3 observations in mdata['milo']. The model needs to estimate variance in cell abundance for multiple samples from each condition. In the first example posted you had 9 observations in mdata['milo']. How are you defining sample_col in milo.count_nhoods? And how does sample_col relate to condition? It would be helpful for debugging if you kept using and reporting the same example code.
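The relationship between sample_col and condition can be inspected with a short pandas check. This is only a sketch on a hypothetical cell-level metadata table: the column names `sample` and `condition` and the toy values are assumptions standing in for whatever is stored in the real obs table.

```python
import pandas as pd

# Hypothetical cell-level metadata; one row per cell.
obs = pd.DataFrame(
    {
        "sample": ["s1", "s1", "s2", "s3", "s3", "s4"],
        "condition": ["control", "control", "control", "GAPDH", "GAPDH", "GAPDH"],
    }
)

# Each sample should belong to exactly one condition.
conditions_per_sample = obs.groupby("sample")["condition"].nunique()

# Number of distinct samples (replicates) available for each condition.
replicates = obs.drop_duplicates("sample").groupby("condition").size()

print(conditions_per_sample)
print(replicates)
```

A condition with a single sample in `replicates` means there are no replicates for it, which is the situation described above.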

@Zach-Sten

I am also having the same issue when running the example data from the vignette:
https://pertpy.readthedocs.io/en/latest/tutorials/notebooks/milo.html

It seems there may be an issue with how R recognizes a dataframe vs. how Python represents it in mdata['milo'].X, even after converting the sparse matrix to a dense matrix.

@Zach-Sten

One solution I have found is downgrading pertpy to 0.6.0. This may also require downgrading jax and jaxlib to 0.4.13. Doing this resolves the issue and the vignette code runs.

I'm trying to find out what changed between versions of these packages. Previously I had been using pertpy (0.9.5) and jax (0.4.38), which are very close to what the original poster @ahmed-agami was using.

@emdann any ideas on what may have changed between versions?

@emdann
Member

emdann commented Jan 7, 2025

My guess is that there might be a mismatch between pertpy and rpy2 versions. Are you using the same rpy2 version in both environments? Which version is this? I am able to run vignettes with pertpy=0.10.0 and rpy2=3.5.10.
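To compare environments, the installed versions can be listed with the standard library alone. The sketch below is one way to do it; the helper name `installed_versions` and the package list are illustrative choices.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(["pertpy", "rpy2", "anndata2ri", "numpy"])
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in both the working and the failing environment gives a side-by-side list that can be pasted into the thread.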

@Zach-Sten

In the environment where I had the error, I was using pertpy=0.9.5 and rpy2=3.5.17. In the environment where I was previously able to run the vignette, I was using pertpy=0.6.0 and rpy2=3.5.11.

Too bad the rpy2 versions aren't more consistent. I would guess that what's causing the error is an issue with rpy2 > 3.5.11. It seems the error relates to how rpy2 treats a numpy array when converting it to a numeric matrix for R. Even after converting it from a sparse array to a dense matrix, I don't think it's applying the right transformation to the data.
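One way to rule out the Python side of this guess is to check that the matrix really is a plain dense float64 array before it is handed to R. The sketch below uses only NumPy and a toy array. The helper name `as_numeric_matrix` is hypothetical and this is not what pertpy does internally. It does not show whether rpy2 converts the array correctly, only that the input is numeric.

```python
import numpy as np


def as_numeric_matrix(x):
    """Return a dense, C-contiguous float64 copy of x, or raise if not numeric."""
    # Densify if the object looks like a scipy sparse matrix (has .toarray()).
    dense = x.toarray() if hasattr(x, "toarray") else np.asarray(x)
    if dense.dtype == object:
        # Object arrays can hold arbitrary Python objects; convert explicitly so
        # that non-numeric entries fail here rather than later inside R.
        dense = dense.astype(np.float64)
    if not np.issubdtype(dense.dtype, np.number):
        raise TypeError(f"expected a numeric matrix, got dtype {dense.dtype}")
    return np.ascontiguousarray(dense, dtype=np.float64)


# Toy stand-in for a count matrix that ended up with dtype=object.
toy_counts = np.array([[1, 2], [3, 4]], dtype=object)
clean = as_numeric_matrix(toy_counts)
print(clean.dtype, clean.shape)
```

If the real matrix passes this check and the error persists, the conversion layer between Python and R becomes the more likely suspect.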

If there have been major changes to the milo workflow in pertpy=0.10.0 I would consider making the changes you suggested above.

Unrelated, but a step that has saved me a lot of time is using rapids-singlecell to build the KNN graph.
https://rapids-singlecell.readthedocs.io/en/latest/api/generated/rapids_singlecell.pp.neighbors.html#rapids_singlecell.pp.neighbors

This is especially helpful when dealing with very large datasets. Maybe a future version of pertpy and milo could enable GPU usage for both the KNN graph and the GLMs available in cuML.
