Error - zero dimension #2

asmagen · 2017-04-18T16:23:37Z

Hello,
I get the following error after following the manual for a single-cell dataset I'm working with.

data_obj = featureConstruct(normalized,method = "SelfProjection")
Error in cor(fpkm_for_clust0, method = "pearson") :
'x' has a zero dimension

Why does it happen and how can I solve this?
Thanks, A

asmagen · 2017-04-18T17:20:37Z

Also I get this error:
Error in cor(fpkm_temp, method = "pearson") :
Missing values present in input variable 'x'. Consider using use = 'pairwise.complete.obs'.
I didn't have any NA values in my dataset. Any idea what might cause that?
Thank you.
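
A correlation matrix can contain missing values even when the input has no NA entries, for example when a cell (column) is constant, since its standard deviation is zero. The following is a minimal diagnostic sketch in base R on a toy matrix; the object `expr` is a hypothetical stand-in for the real expression table, and whether this is what happens inside RCA is an assumption, not something confirmed in this thread.

```r
# Toy genes-by-cells matrix standing in for the real expression data.
expr <- matrix(c(5, 0, 2,
                 0, 0, 1,
                 3, 0, 4),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("g1", "g2", "g3"), c("c1", "c2", "c3")))

# Explicit missing or non-finite entries in the input.
n_missing <- sum(is.na(expr))
n_nonfinite <- sum(!is.finite(expr))

# Cells with no counts at all, and cells whose values never vary.
empty_cells <- colnames(expr)[colSums(expr) == 0]
constant_cells <- colnames(expr)[apply(expr, 2, sd) == 0]

# Base R cor() returns NA for a zero-variance column (with a warning),
# so NAs can appear downstream although the input itself has none.
cor_has_na <- anyNA(suppressWarnings(cor(expr)))
```

If `empty_cells` or `constant_cells` is non-empty, removing those cells before running the pipeline is a cheap first thing to try.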

GIS-SP-Group · 2017-04-19T03:37:38Z

Dear Asmagen,

Have you followed the gene name requirement as stated in the manual?

#########################################################################
Input data:
A data frame of expression values (FPKM, TPM, UMI counts ...), with rows representing genes and columns representing cells. Note the current version of RCA only accepts gene names in the following format: "GenomeLocation_HGNCGeneName_EnsembleID", from which the "HGNCGeneName" is extracted for RCA analysis. For input data with only HGNC names, the users need to attach two strings to the HGNC names to make them into the "XXXX_HGNCGeneNames_YYYY" format.
#########################################################################
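
As a rough illustration of the naming requirement quoted above, the sketch below wraps plain HGNC symbols in two placeholder strings and then recovers the middle field. The helper names `wrap_gene_names` and `extract_symbol` are hypothetical, and splitting on the underscore is an assumption about how the middle field is extracted, not RCA's actual code.

```r
# Hypothetical helper: turn "BRCA1" into "XXXX_BRCA1_YYYY".
wrap_gene_names <- function(symbols, prefix = "XXXX", suffix = "YYYY") {
  paste(prefix, symbols, suffix, sep = "_")
}

# Hypothetical helper: take the second underscore-separated field.
# This breaks if prefix or symbol itself contains an underscore.
extract_symbol <- function(wrapped) {
  vapply(strsplit(wrapped, "_", fixed = TRUE),
         function(parts) parts[2], character(1))
}

symbols <- c("BRCA1", "TP53", "GAPDH")
wrapped <- wrap_gene_names(symbols)
recovered <- extract_symbol(wrapped)
```

On a real table the same call would be applied to the row names, e.g. `rownames(counts) <- wrap_gene_names(rownames(counts))`, where `counts` is a hypothetical name for the input data frame.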

asmagen · 2017-04-19T13:56:48Z

So for gene symbol ‘BRCA1’ I need to use ‘XXXX_BRCA1_YYYY’?


GIS-SP-Group · 2017-04-20T02:43:44Z

Correct. Sorry for the inconvenience and we will improve this in the next version.

Huipeng

asmagen · 2017-04-20T13:29:05Z

The same issue still occurs. It doesn't have anything to do with the gene names. What can be done about it?

GIS-SP-Group · 2017-04-20T14:13:31Z

Asmagen,

I wonder whether you followed the procedure in the vignettes.

Please paste your script here.

Huipeng

asmagen · 2017-04-22T17:30:20Z

library(RCA)

# construct data object
rownames(dataset$counts) = sapply(rownames(dataset$counts),function(v) paste('XXXX',v,'YYYY',sep='_'))
data_obj = dataConstruct(dataset$counts);

# filter out lowly expressed genes
data_obj = geneFilt(obj_in = data_obj);

# normalize gene expression data
data_obj = cellNormalize(data_obj,method='scQ');

# log transform the data
normalized = dataTransform(data_obj,"log10");

# project the expression data into Reference Component space
data_obj = featureConstruct(normalized,method = "SelfProjection")

# generate cell clusters
data_obj = cellClust(data_obj,method="hclust",deepSplit_wgcna=environment$cluster.param2,min_group_Size_wgcna=2)

cluster.association = data_obj$group_labels_color$groupLabel

GIS-SP-Group · 2017-04-24T02:25:26Z

Hi, Asmagen,

Could you provide the table of "normalized$fpkm_transformed" via email? It seems that the "featureConstruct" failed to select any features.

Huipeng

asmagen · 2017-04-24T02:38:08Z

It’s unpublished data, so I can’t. It also doesn’t make much sense that the issue would be specific to my dataset.


GIS-SP-Group · 2017-04-24T03:00:02Z

Ok, since your script works well on our data set, this issue is likely specific to your data set.

Let me know if you are ok with sharing the following information, which might help us to figure out what's going on.

dim(normalized$fpkm_raw)
dim(normalized$fpkm)
sum(normalized$geneFilter)
dim(normalized$fpkm_transformed)
max(normalized$fpkm_transformed)
min(normalized$fpkm_transformed)

asmagen · 2017-04-24T03:09:15Z

Sure.

> dim(normalized$fpkm_raw)
[1] 14919 1441
> dim(normalized$fpkm)
[1] 13389 1441
> sum(normalized$geneFilter)
[1] 13389
> dim(normalized$fpkm_transformed)
[1] 7724 1441
> max(normalized$fpkm_transformed)
[1] 2.045323
> min(normalized$fpkm_transformed)
[1] 0


asmagen · 2017-04-24T23:39:02Z

Any news?

GIS-SP-Group · 2017-04-26T07:58:25Z

Dear Asmagen,

My guess is that the size of your matrix is not compatible with some hard-coded parameters in the package. We need to explore more for a solid answer though.

You could try to run the package with a randomly chosen subset (~500 cells) and see if the problem still exists.
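
A minimal sketch of drawing such a random subset in base R follows. The matrix `counts` is simulated here purely for illustration; on real data it would be the genes-by-cells table passed to the first step of the pipeline.

```r
set.seed(1)  # fixed seed so the same cells are drawn on every run

# Simulated genes-by-cells matrix standing in for the real data.
counts <- matrix(rpois(20 * 1441, lambda = 1), nrow = 20,
                 dimnames = list(paste0("gene", 1:20),
                                 paste0("cell", 1:1441)))

# Draw up to 500 distinct column indices and keep only those cells.
n_keep <- min(500, ncol(counts))
keep <- sample(ncol(counts), n_keep)
counts_subset <- counts[, keep, drop = FALSE]
```

Repeating the draw with a few different seeds helps tell whether a failure depends on the number of cells or on particular cells in the subset.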

H

asmagen · 2017-04-27T14:42:27Z

Hello,
featureConstruct works when I select 500 random cells, which is a very small number compared with what recent scRNA-seq technologies produce. But the actual clustering fails:
Error in cor(fpkm_temp, method = "pearson") :
Missing values present in input variable 'x'. Consider using use = 'pairwise.complete.obs'.

Does the code have hard-coded parameters that depend on the matrix size? How can this be resolved as soon as possible?
Thanks, A

GIS-SP-Group · 2017-04-28T02:06:48Z

Hi, Asmagen,

We have tested our package on many data sets available on our side and it seems to work fine. We are indeed optimizing the package and will release the next version in the next couple of months.

But to have a quick solution for you, we really need something to mimic the difficulty you encountered. We don't need to see your full raw data set. But if you could generate a fake set that could be representative of the original one, that would be great.
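
One possible way to build such a stand-in, sketched below in base R, is to simulate Poisson counts with the same dimensions and per-gene means as the original table and to replace all labels. The function name `simulate_like` and the Poisson model are assumptions chosen for illustration; a simulated table is not guaranteed to reproduce the error, so it should be run through the pipeline before it is shared.

```r
# Hypothetical helper: simulate a table with the same shape and
# per-gene mean as `counts`, but with no original values or labels.
simulate_like <- function(counts, seed = 1) {
  set.seed(seed)
  gene_means <- rowMeans(counts)
  # matrix() fills column by column, so repeating the gene means once
  # per cell gives entry [i, j] the mean of gene i.
  sim <- matrix(rpois(length(counts),
                      lambda = rep(gene_means, times = ncol(counts))),
                nrow = nrow(counts), ncol = ncol(counts))
  dimnames(sim) <- list(paste0("XXXX_GENE", seq_len(nrow(counts)), "_YYYY"),
                        paste0("cell", seq_len(ncol(counts))))
  sim
}

# Toy original: 4 genes by 6 cells.
original <- matrix(c(0, 3, 1, 0, 2, 5,
                     0, 0, 0, 1, 0, 0,
                     9, 7, 8, 6, 9, 7,
                     0, 0, 0, 0, 0, 0),
                   nrow = 4, byrow = TRUE)
fake <- simulate_like(original)
```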

Let me know what you think.

H

asmagen · 2017-04-28T13:14:48Z

Attached is a subset of the 3k PBMC dataset published as an example for the Seurat package. RCA didn't work on this public dataset either. Please let me know the status when you have news.
example.data.RData.zip

asmagen · 2017-05-10T13:17:53Z

Hello,
What's the status?
Thanks, A

enhaofrank · 2017-08-02T09:28:57Z

Hi both,
Has the problem been solved?
I get the same error. My data were produced by the 10x Genomics Cell Ranger single-cell pipeline. The data frame of expression values contains UMI counts, with rows representing genes and columns representing cells, and the gene names were changed to the following format: "GenomeLocation_HGNCGeneName_EnsembleID". The error message:
data_obj = featureConstruct(normalized,method = "SelfProjection")
Error in cor(fpkm_for_clust0, method = "pearson") :
'x' has a zero dimension

Thank you very much!
Frank

wiseflying · 2017-08-07T03:02:27Z

Dear all,

We have been testing the performance of RCA on multiple datasets on our side. Data sets from the Drop-seq protocol are usually sequenced shallowly, so some cells may have very few expressed genes (FPKM or UMI count > 0). This causes problems for RCA.

So when running RCA on large data sets, please do a preliminary QC to filter out low-quality cells (those with sum(FPKM > 0) <= 1000, or sum(FPKM > 0) <= 500; the same applies to UMI count data).
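
The QC step described above could look roughly like the following base R sketch, applied to the raw table before it enters the pipeline. The function name `filter_cells` and the toy matrix are illustrative assumptions; the cutoff corresponds to the 500 or 1000 detected genes mentioned above.

```r
# Hypothetical helper: keep only cells (columns) in which more than
# `min_genes` genes have a value above zero.
filter_cells <- function(counts, min_genes = 500) {
  genes_detected <- colSums(counts > 0)
  counts[, genes_detected > min_genes, drop = FALSE]
}

# Toy example: 5 genes by 3 cells with 1, 3 and 5 detected genes.
toy <- matrix(c(1, 0, 0, 0, 0,
                1, 2, 3, 0, 0,
                4, 5, 6, 7, 8),
              nrow = 5,
              dimnames = list(paste0("g", 1:5), c("c1", "c2", "c3")))

# A cutoff of 2 is used here only because the toy matrix is tiny.
toy_filtered <- filter_cells(toy, min_genes = 2)
```

On a real data set the call would be `filter_cells(counts, min_genes = 500)` or `min_genes = 1000`, with `counts` standing for the genes-by-cells table.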

Please let me know if more stringent QC would solve the problem.

best
Huipeng
