Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

assay<- is too pedantic when the dimensions are named #51

Closed
LTLA opened this issue Jul 19, 2021 · 2 comments
Closed

assay<- is too pedantic when the dimensions are named #51

LTLA opened this issue Jul 19, 2021 · 2 comments

Comments

@LTLA
Copy link
Contributor

LTLA commented Jul 19, 2021

Identified in drisso/SingleCellExperiment#59:

mat <- matrix(rnorm(1000), ncol=20)
colnames(mat) <- LETTERS[1:20]

library(SummarizedExperiment)
se <- SummarizedExperiment(list(thingy=mat))

names(dim(mat)) <- c("Gene", "Cell")
assay(se) <- mat
## Error in `assays<-`(`*tmp*`, withDimnames = withDimnames, ..., value = `*vtmp*`) : 
##   please use 'assay(x, withDimnames=FALSE)) <- value' or 'assays(x,
##   withDimnames=FALSE)) <- value' when the dimnames on the supplied
##   assay(s) are not identical to the dimnames on SummarizedExperiment
##   object 'x'

I'll put aside the question of whether it's a good idea to rename the dim vector, because someone or something is clearly doing it, so assay<- should be robust to it. I'm guessing there's an identical() call on the old and new dimnames() that is getting tripped by the new names on the dimnames().

Session information
R version 4.1.0 Patched (2021-05-20 r80347)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/R-4-1-branch-dev/lib/libRblas.so
LAPACK: /home/luna/Software/R/R-4-1-branch-dev/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] SingleCellExperiment_1.15.0 SummarizedExperiment_1.23.0
 [3] Biobase_2.53.0              GenomicRanges_1.45.0       
 [5] GenomeInfoDb_1.29.0         IRanges_2.27.0             
 [7] S4Vectors_0.31.0            BiocGenerics_0.39.0        
 [9] MatrixGenerics_1.5.0        matrixStats_0.58.0         
[11] BiocFileCache_2.1.0         dbplyr_2.1.1               

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6             XVector_0.33.0         pillar_1.6.1          
 [4] compiler_4.1.0         zlibbioc_1.39.0        bitops_1.0-7          
 [7] tools_4.1.0            bit_4.0.4              lattice_0.20-44       
[10] RSQLite_2.2.7          memoise_2.0.0          lifecycle_1.0.0       
[13] tibble_3.1.2           pkgconfig_2.0.3        rlang_0.4.11          
[16] Matrix_1.3-3           DelayedArray_0.19.0    DBI_1.1.1             
[19] filelock_1.0.2         curl_4.3.1             fastmap_1.1.0         
[22] GenomeInfoDbData_1.2.6 withr_2.4.2            dplyr_1.0.6           
[25] httr_1.4.2             generics_0.1.0         vctrs_0.3.8           
[28] rappdirs_0.3.3         grid_4.1.0             bit64_4.0.5           
[31] tidyselect_1.1.1       glue_1.4.2             R6_2.5.0              
[34] fansi_0.4.2            purrr_0.3.4            blob_1.2.1            
[37] magrittr_2.0.1         ellipsis_0.3.2         assertthat_0.2.1      
[40] utf8_1.2.1             RCurl_1.98-1.3         cachem_1.0.5          
[43] crayon_1.4.1          
@LiNk-NY
Copy link
Contributor

LiNk-NY commented Jul 19, 2021

These look like the relevant lines. I agree that checking names on dimnames are a bit too much.

.assays_have_identical_dimnames <- function(assays, x_dimnames)
{
if (length(assays) == 0L)
return(TRUE)
## Like DelayedArray:::simplify_NULL_dimnames() but first replaces list
## elements that are equal to character(0) with NULLs.
normalize_dimnames <- function(dn) {
empty_idx <- which(lengths(dn) == 0L)
if (length(empty_idx) == length(dn))
return(NULL)
dn[empty_idx] <- list(NULL)
dn
}
x_dimnames <- normalize_dimnames(x_dimnames)
ok <- vapply(seq_along(assays),
function(i) {
a <- getListElement(assays, i)
a_dimnames <- normalize_dimnames(dimnames(a))
identical(a_dimnames, x_dimnames)
},
logical(1)
)
all(ok)
}

@hpages
Copy link
Contributor

hpages commented Aug 24, 2021

Addressed in SummarizedExperiment 1.23.3 (see commit 3b26fb6)

Your example @LTLA will still fail with this new version because it turns out that doing names(dim(mat)) <- c("Gene", "Cell") actually drops the dimnames of the matrix (I didn't know that and I find it very surprising) so you would need to do names(dimnames(mat)) <- c("Gene", "Cell") instead.

@hpages hpages closed this as completed Sep 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants