Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt SEraster for Seurat spatial objects #190

Open
wants to merge 20 commits into
base: master
Choose a base branch
from
Open

Adapt SEraster for Seurat spatial objects #190

wants to merge 20 commits into from

Conversation

zskylarli
Copy link

@zskylarli zskylarli commented May 1, 2024

This PR implements SEraster, specifically functions rasterizeGeneExpression, rasterizeCellType, and permutateByRotation, for Seurat objects. Functions can take either one object or a list of objects, and return in the same format.

Changes:

  • createRasterizedObjectis used across rasterization functions to take in 1) original Seurat object + 2) the outputs of rasterizeMatrix to create a new Seurat object of the same spatial class.
    • Currently implemented separately for VisiumV1, VisiumV2, and FOV classes since coordinates pulled for plotting are different by class; this can probably be simplified.
    • Scaling factors are estimated and probably not accurate.
  • Both rasterizeGeneExpression and rasterizeCellType now have additional arguments to specify the data to perform rasterization on, such as assay_name, image, and slot.

Notes:

  • The underlying rasterizeMatrix function used across rasterization methods is called from SEraster as of now, but can also be directly added to the script to avoid additional library requirements.
  • permutateByRotation currently correctly rotates the coordinates for spots but not the underlying H&E image for Visium data - at least not correctly (this is not relevant to Xenium or MERFISH, which are the main focuses in the SEraster paper).
    • This function also adds new FOVs to the original object instead of creating a new object using createRasterizedObject, and the code looks a little redundant.
    • Have tried to create a function rotate to also rotate the image when adding it back to the FOV, and while this does shift the image, it does not align with the rotated spots.

Resources:

@zskylarli zskylarli requested a review from dcollins15 May 15, 2024 15:06
Copy link
Contributor

@dcollins15 dcollins15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far! 🙌

Since this is still a WIP I'll just comment on the SpatialImage/FOV typing stuff - also happy to chat about eliminating code duplication if that would be helpful 🙂

tl;dr: Make sure to pass scale = NULL when calling GetTissueCoordinates and Radius and consider just outputting FOV and VisiumV2 instances (i.e update VisiumV1 inputs on the way through)

Just to recap the state of things: Seurat/SeuratObject define two interfaces (by which I mean structural types) that inherit the SpatialImage virtual class. All concrete SpatialImage classes implement a common set of generic functions - we'll focus on GetTissueCoordinates, Radius, and GetImage:

  1. The original spatial interface is implemented by SlideSeq, STARmap, VisiumV1, and VisiumV2
  • GetTissueCoordinates returns a data.frame whose first column contains X coordinates, second columns contains Y coordinates, and rownames contains unique barcodes from the corresponding assay
  • GetTissueCoordinates accepts a scale param, or else ignores it via ... (this was the case pre Seurat v5.1)
  • Radius accepts a scale param, or else ignores it via ... (added to accommodate hires images from Visium)
  • GetImage returns list of (potentially empty) image values
  • Relied on by SpatialPlot
  • I would strongly recommend against coupling this wrapper to this interface -
  1. FOV which supports various image-based spatial transcriptomic technologies - also implemented by VisiumV2
  • GetTissueCoordinates returns a data.frame with ordered columns "x", "y", and "cell" - rownames may or may not be set to the "cell" column which contains unique barcodes from the corresponding assay
  • Radius accepts a scale param, or else is not defined - Radius.Centroids ignores all params via ...
  • GetImage returns a list of image values, or else is not defined

Eventually, we'd like to deprecate the first interface in favor of the second but only after updating the latter so that:

  • GetTissueCoordinates and Radius can both be passed scale = {TRUE/FALSE}
  • GetImage returns list of (potentially empty) image values

All of that is to say, the main change you'll want to make is to pass scale = NULL into GetTissueCoordinates and Radius anytime you're accessing either property so that you don't have to muck around with any scaling logic inside createRasterizedObjects

I'd also suggest that you avoid trying to output VisiumV1 instances and then update createRasterizedObjects so that it builds an FOV by default and then updates it into a VisiumV2 instance in a tryCatch block alongside a call to ScaleFactors which is only defined for VisiumV1/VisiumV2 images.

In effect, VisiumV1 inputs would get rasterized into VisiumV2 ouputs and everything else would become a FOV. In the immediate term this would be bad because you can't give a FOV to SpatialPlot but that's only because the output of GetTissueCoordinates.FOV doesn't have rownames set which should be a straight-forward patch to make in SeuratObject (I was trying to make as few changes as possible for v5.0.2 but I kinda regret not including this one).

On a semi-related note we might want to consider limiting the output to just FOV - if we're also manipulating the underlying image values then this won't be practical but I'm not totally sure what the story with those is meant to be exaclty 🤷 It's easy enough to convert a FOV to a VisiumV2 given the missing info:

fov_to_visium <- function(fov, image, scale.factors) {
  # build the final `VisiumV2` - essentially just adding `image` and
  # `scale.factors` to the object
  visium.fov <- new(
    Class = "VisiumV2",
    boundaries = fov@boundaries,
    molecules = fov@molecules,
    assay = fov@assay,
    key = fov@key,
    image = image,
    scale.factors = scale.factors
  )
}

#' createRasterizedObject
#' @keyword internal
#'
createRasterizedObject <- function(input, out, name) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll eventually want to choose some more informative variable names 🙃

name can be an assay or an image?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, thanks for reminding! - and name should probably be new_assay_name and/or be optional since I just realized rasterizeCellType passes in the metadata column name but it's never used (the condition under it is just to pull the right image to associate the assay with).

R/SEraster.R Outdated
Comment on lines 8 to 84
data_rast <- out$data_rast
pos_rast_temp <- as.data.frame(out$pos_rast)

if(name %in% Assays(input)){
image <- Images(input, assay = name)[1]
} else{
image <- Images(input, assay = DefaultAssay(input))[1]
}

meta_rast <- out$meta_rast[,c("num_cell","type","resolution")]
resolution <- meta_rast[["resolution"]][1]
for (i in seq_along(out$meta_rast$cellID_list)) {
meta_rast$cellID_list[i] <- paste(unlist(out$meta_rast$cellID_list[[i]]), collapse = ", ")
}

output <- CreateSeuratObject(
counts = data_rast,
assay = name,
meta.data = meta_rast
)

class <- class(input[[image]])
if(class == 'VisiumV1') {
scale.use <- ScaleFactors(input[[image]])[['lowres']]
pos_rast <- as.data.frame(cbind(tissue = rep(1, nrow(pos_rast_temp)),
row = pos_rast_temp$x,
col = pos_rast_temp$y,
imagerow = ceiling(pos_rast_temp$x / scale.use),
imagecol = ceiling(pos_rast_temp$y / scale.use)))
rownames(pos_rast) <- rownames(pos_rast_temp)
input[[image]]@scale.factors$spot <- sqrt(nrow(input[[]])/nrow(meta_rast))*Radius(input[[image]])

new.fov <- new(
Class = "VisiumV1",
coordinates = pos_rast,
assay = name,
key = "rasterized_",
image = input[[image]]@image,
scale.factors = input[[image]]@scale.factors,
spot.radius = input[[image]]@scale.factors$spot
)
} else if(class == 'VisiumV2') {
input[[image]]@scale.factors$spot <- sqrt(nrow(input[[]])/nrow(meta_rast))*slot(input[[image]]$centroids, name='radius')
fov <- CreateFOV(
pos_rast_temp[, c("x", "y")],
type = "centroids",
radius = input[[image]]@scale.factors[["spot"]],
assay = name,
key = Key("rasterized_", quiet = TRUE)
)
new.fov <- new(
Class = "VisiumV2",
boundaries = fov@boundaries,
assay = fov@assay,
key = fov@key,
image = input[[image]]@image,
scale.factors = input[[image]]@scale.factors
)
} else if(class == 'FOV') {
radius <- sqrt(nrow(input[[]])/nrow(meta_rast))*slot(input[[image]]$centroids, name='radius')
pos_rast_temp$cell <- rownames(pos_rast_temp)
segmentations.data <- list(
"centroids" = CreateCentroids(coords = pos_rast_temp[, c("x", "y")], nsides = 4, radius = radius),
"segmentation" = CreateSegmentation(coords = pos_rast_temp)
)
new.fov <- CreateFOV(
coords = segmentations.data,
type = c("segmentation", "centroids"),
radius = radius,
molecules = input[[image]]$molecules,
assay = name,
key = Key("rasterized_", quiet = TRUE)
)
}
output@images[[paste0(name,".rasterized")]]<- new.fov
return(output)
}
Copy link
Contributor

@dcollins15 dcollins15 May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the refactor I described above - it makes assumes that neither the background image nor any cell-segmentations are being manipulated which I think is correct?

(FYI, I hacked this together without testing it so I'm not sure that this runs):

Suggested change
data_rast <- out$data_rast
pos_rast_temp <- as.data.frame(out$pos_rast)
if(name %in% Assays(input)){
image <- Images(input, assay = name)[1]
} else{
image <- Images(input, assay = DefaultAssay(input))[1]
}
meta_rast <- out$meta_rast[,c("num_cell","type","resolution")]
resolution <- meta_rast[["resolution"]][1]
for (i in seq_along(out$meta_rast$cellID_list)) {
meta_rast$cellID_list[i] <- paste(unlist(out$meta_rast$cellID_list[[i]]), collapse = ", ")
}
output <- CreateSeuratObject(
counts = data_rast,
assay = name,
meta.data = meta_rast
)
class <- class(input[[image]])
if(class == 'VisiumV1') {
scale.use <- ScaleFactors(input[[image]])[['lowres']]
pos_rast <- as.data.frame(cbind(tissue = rep(1, nrow(pos_rast_temp)),
row = pos_rast_temp$x,
col = pos_rast_temp$y,
imagerow = ceiling(pos_rast_temp$x / scale.use),
imagecol = ceiling(pos_rast_temp$y / scale.use)))
rownames(pos_rast) <- rownames(pos_rast_temp)
input[[image]]@scale.factors$spot <- sqrt(nrow(input[[]])/nrow(meta_rast))*Radius(input[[image]])
new.fov <- new(
Class = "VisiumV1",
coordinates = pos_rast,
assay = name,
key = "rasterized_",
image = input[[image]]@image,
scale.factors = input[[image]]@scale.factors,
spot.radius = input[[image]]@scale.factors$spot
)
} else if(class == 'VisiumV2') {
input[[image]]@scale.factors$spot <- sqrt(nrow(input[[]])/nrow(meta_rast))*slot(input[[image]]$centroids, name='radius')
fov <- CreateFOV(
pos_rast_temp[, c("x", "y")],
type = "centroids",
radius = input[[image]]@scale.factors[["spot"]],
assay = name,
key = Key("rasterized_", quiet = TRUE)
)
new.fov <- new(
Class = "VisiumV2",
boundaries = fov@boundaries,
assay = fov@assay,
key = fov@key,
image = input[[image]]@image,
scale.factors = input[[image]]@scale.factors
)
} else if(class == 'FOV') {
radius <- sqrt(nrow(input[[]])/nrow(meta_rast))*slot(input[[image]]$centroids, name='radius')
pos_rast_temp$cell <- rownames(pos_rast_temp)
segmentations.data <- list(
"centroids" = CreateCentroids(coords = pos_rast_temp[, c("x", "y")], nsides = 4, radius = radius),
"segmentation" = CreateSegmentation(coords = pos_rast_temp)
)
new.fov <- CreateFOV(
coords = segmentations.data,
type = c("segmentation", "centroids"),
radius = radius,
molecules = input[[image]]$molecules,
assay = name,
key = Key("rasterized_", quiet = TRUE)
)
}
output@images[[paste0(name,".rasterized")]]<- new.fov
return(output)
}
image_name <- ifelse(
name %in% Assays(input),
Images(input, assay = name)[1],
DefaultImage(input)
)
input_fov <- input[[image_name]]
input_centroids <- input.fov[["centroids"]]
input_segmentation <- input.fov[["segmentation"]]
input_molecules <- input.fov[["molecules"]]
output_counts <- out$data_rast
output_meta.data <- out$meta_rast[, c("num_cell", "type", "resolution")]
resolution <- output_meta.data[["resolution"]][1]
for (i in seq_along(out$meta_rast$cellID_list)) {
output_meta.data $cellID_list[i] <- paste(
unlist(out$meta_rast$cellID_list[[i]]),
collapse = ", "
)
}
# should the ouput assay/image names reflect `resolution` similar
# to the way `Load10X_Spatial` handles `bin.size`?
output_image_name <- paste0("rasterized.", image_name)
output_coordinates <- as.data.frame(out$pos_rast)
output_radius <- sqrt(
nrow(input[[]]) / nrow(meta_rast)
) * input.centroids@radius
output_centroids <- CreateCentroids(
coords = output_coordinates,
nsides = input_centroids@nsides,
radius = output_radius,
theta = input_centroids@theta
)
output_boundaries <- list(
"centroids" = output_centroids,
# from what I can tell SEraster does not operate on segmentations
# but will likely support this functionality in the future
# for now I think we can just pass this through unchanged?
"segmentation" = input_segmentation
)
output_fov <- CreateFOV(
boundaries = output_boundaries,
molecules = input_molecules,
assay = assay,
key = Key(output_image_name, quiet = TRUE)
)
# `scale_factors` will be set to `NULL` unless there a matching
# implementation of the `ScaleFactors` generic available for `input_fov`
# (i.e if `input_fov` is one of `VisiumV1`/`VisiumV2`)
scale_factors <- tryCatch(
ScaleFactors(input_fov),
error = function(e) {
return (NULL)
}
)
if (!is.null(scale_factors)) {
# if `scale_factors` is defined we should also be
background_image <- GetImage(input_fov)
# which is everything we need to build a `VisiumV2` instance
output_fov <- new(
Class = "VisiumV2",
boundaries = output_fov@boundaries,
molecules = output_fov@molecules,
assay = output_fov@assay,
key = output_fov@key,
image = background_image,
scale.factors = scale.factors
)
}
output <- CreateSeuratObject(
counts = output_counts,
assay = name,
meta.data = output_meta.data
)
# not sure if it's necessary to re-align the image's identifiers
# with the output object's, but just in case
output_fov <- output_fov[[Cells(output)]]
output[[output_image_name]] <- output_fov
return(output)
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I didn't do this was because existing VisiumV1 class objects (like stxbrain in SeuratData) don't have Centroids, and VisiumV2 objects don't have segmentations because that's only relevant to in-situ data, so you wouldn't be able to extract info with

input_fov <- input[[image_name]]
input_centroids <- input.fov[["centroids"]]
input_segmentation <- input.fov[["segmentation"]]
input_molecules <- input.fov[["molecules"]]

Not sure if there is really an alternative but if you have other ideas please let me know!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be pretty easy to essentially ignore the "segmentation"s and pass them straight through:

boundary_names <- Boundaries(input_fov)
input_boundaries <- input_fov@boundaries
input_centroids <- ifelse(
    "centroids" %in% boundary_names,
    input_boundaries[["centroids"]],
    NULL
)
input_molecules <- input_fov@molecules

Wrapping the whole thing in a tryCatch allows us to accomodate non-FOV images:

tryCatch(
    {
      boundary_names <- Boundaries(input_fov)
      input_boundaries <- input_fov@boundaries
      input_centroids <- ifelse(
        "centroids" %in% boundary_names,
        input_boundaries[["centroids"]],
        NULL
      )
      input_molecules <- input_fov@molecules
    },
    error = function(e) {
      input_boundaries <- list()
      input_centroids <- NULL
      input_molecules <- NULL
    }
  )

without having to know the actual type being handed around:

if (!is.null(input_centroids)) {
    output_nsides <- input_centroids@nsides
    output_theta <- input_centroids@theta

    output_radius <- sqrt(
      nrow(input[[]]) / nrow(meta_rast)
    ) * Radius(input_centroids) # unscaled
  } else {
    nsoutput_nsidesides <- NULL
    output_theta <- NULL

    output_radius <- Radius(input_fov)
  }

  output_centroids <- CreateCentroids(
    coords = output_coordinates,
    nsides = output_nsides,
    radius = output_radius,
    theta = output_theta
  )

I guess you could technically also get an FOV with multiple Molecules but all that means is we should avoid using CreateFOV and just call new directly (also note the way "segmentations" will be handed through if it's present):

output_boundaries <- input_boundaries
  output_boundaries[["centroids"]] <- output_centroids

  output_fov <- new(
    Class = "FOV",
    boundaries = output_boundaries,
    molecules = input_molecules,
    assay = assay,
    key = Key(output_image_name, quiet = TRUE)
  )

} else {
dataset <- i
}
return(data.frame(dataset = dataset, xmin = min(pos[,1]), xmax = max(pos[,1]), ymin = min(pos[,2]), ymax = max(pos[,2])))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - line length:

Suggested change
return(data.frame(dataset = dataset, xmin = min(pos[,1]), xmax = max(pos[,1]), ymin = min(pos[,2]), ymax = max(pos[,2])))
return(
data.frame(
dataset = dataset,
xmin = min(pos[,1]),
xmax = max(pos[,1]),
ymin = min(pos[,2]),
ymax = max(pos[,2])
)
)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from the original code so I'm not going to change it

R/SEraster.R Outdated
#' rasterizeGeneExpression
#' @export
#'
rasterizeGeneExpression <- function(input, assay_name = NULL, image = NULL, slot = "counts", resolution = 100, square = TRUE, fun = "mean", n_threads = 1, BPPARAM = NULL, verbose = FALSE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - line length:

Suggested change
rasterizeGeneExpression <- function(input, assay_name = NULL, image = NULL, slot = "counts", resolution = 100, square = TRUE, fun = "mean", n_threads = 1, BPPARAM = NULL, verbose = FALSE) {
rasterizeGeneExpression <- function(
input,
assay_name = NULL,
image = NULL,
slot = "counts",
resolution = 100,
square = TRUE,
fun = "mean",
n_threads = 1,
BPPARAM = NULL,
verbose = FALSE
) {

if (is.null(assay_name)) {
assay_name <- DefaultAssay(spe)
counts <- LayerData(spe[[assay_name]], layer = slot)
out <- SEraster::rasterizeMatrix(counts, coords, bbox = bbox_common, resolution = resolution, square = square, fun = fun, n_threads = n_threads, BPPARAM = BPPARAM, verbose = verbose)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: line length

Suggested change
out <- SEraster::rasterizeMatrix(counts, coords, bbox = bbox_common, resolution = resolution, square = square, fun = fun, n_threads = n_threads, BPPARAM = BPPARAM, verbose = verbose)
out <- SEraster::rasterizeMatrix(
counts,
coords,
bbox = bbox_common,
resolution = resolution,
square = square,
fun = fun,
n_threads = n_threads,
BPPARAM = BPPARAM,
verbose = verbose
)

@dcollins15
Copy link
Contributor

Another edge case to keep in mind is that not all FOV instances will have centroids and (correct me if I'm wrong) SEraster doesn't handle cell segmentations. Whenever we make a call to GetTissueCoordinates.FOV/GetTissueCoordinates.VisiumV2 we need to be specifying which = "centroids" (and check that this doesn't raise any errors).

What is the expected behavior in this case?

@zskylarli
Copy link
Author

zskylarli commented May 22, 2024

Another edge case to keep in mind is that not all FOV instances will have centroids and (correct me if I'm wrong) SEraster doesn't handle cell segmentations. Whenever we make a call to GetTissueCoordinates.FOV/GetTissueCoordinates.VisiumV2 we need to be specifying which = "centroids" (and check that this doesn't raise any errors).

What is the expected behavior in this case?

Will check with the main author but seems counterintuitive to rasterize segmentations.

Also, have tried integrating your fix, but it ends up being that there are too many differences between VisiumV2 instance vs a normal FOV that it is almost cleaner to just implement by class than it is to use a lot of different tryCatchs; for example, the lack of a segmentation for V2 becomes an issue with CreateFOV. Will continue to clean the code but I actually can't think of a better solution than just to handle different classes separately.

Copy link
Contributor

@dcollins15 dcollins15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, have tried integrating your fix, but it ends up being that there are too many differences between VisiumV2 instance vs a normal FOV that it is almost cleaner to just implement by class than it is to use a lot of different tryCatchs; for example, the lack of a segmentation for V2 becomes an issue with CreateFOV.

If we were able to use CreateFOV why wouldn't we just always hand in a named list to the coords param? I don't think we're able to use the function though - see my comment about the handling of molecules.

You also are using the tryCatch to handle VisiumV2 vs. FOV? In terms of the VisiumV1-specific logic, I am strongly against this. Why would we support VisiumV1 instances but not SlideSeq or STARmap?

Will check with the main author but seems counterintuitive to rasterize segmentations.

If this is the case, then shouldn't we be dropping the segmentations? Since new cell barcodes are being generated, the segmentations won't match and you eventually get an error if you try to set the FOV's DefaultBoundary to "segmentation":

Error in `[[<-`:
! Cannot add new cells with [[<-

Similarly if DefaultBoundary is set to "segmentation" for the input object, then you'll also get an error:

Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘10022’, ‘10023’ [... truncated] 

DESCRIPTION Outdated Show resolved Hide resolved
@zskylarli
Copy link
Author

zskylarli commented May 23, 2024

Okay, I think I understand what you mean now and have tried to implement the suggestions. One issue remaining is that I can't figure out how the radius is used for plotting FOV objects - I see that the Radius() function is called within GeomSpatial to get the spot.size for SpatialPlots, but I actually have no idea how this is handled for SingleImagePlots; assuming it's different since it's inheriting ggplot's geom_point instead of using our internal geom_spatial? Let me know if you have any ideas.

@dcollins15
Copy link
Contributor

Okay, I think I understand what you mean now and have tried to implement the suggestions.

At a quick glance createRasterizedObject is looking pretty slick now - nicely done.

One issue remaining is that I can't figure out how the radius is used for plotting FOV objects - I see that the Radius() function is called within GeomSpatial to get the spot.size for SpatialPlots, but I actually have no idea how this is handled for SingleImagePlots; assuming it's different since it's inheriting ggplot's geom_point instead of using our internal geom_spatial? Let me know if you have any ideas.

I think it may just be ignored? ImageFeaturePlot and ImageDimPlot both have a size parameter that appears to get handed down to geom_point inside SingleImagePlot. Are you running into plotting issues post-rasterization?

@zskylarli
Copy link
Author

Yeah, just that the radius isn't really passed into the plotting functions, so adjusting size will scale the points correctly but adjusting the radius for spots in the rasterized object doesn't actually seem to do anything.

@zskylarli zskylarli changed the title [WIP] Adapt SEraster for Seurat spatial objects Adapt SEraster for Seurat spatial objects Jun 3, 2024
@zskylarli zskylarli requested a review from dcollins15 June 3, 2024 16:20
@dcollins15
Copy link
Contributor

dcollins15 commented Jun 3, 2024

noice

The vignette is easy to follow and everything seems to be working as expected 👌 I'm still in the process of writing out some suggestions to tidy the code up some but I've got a couple of high-level comments that are worth considering in the meantime:

  1. Namespace collisions - we probably don't want to have to make calls to SeuratWrappers:: in the vignette. We wouldn't have to if we didn't import the SERaster package or otherwise updated the SeuratWrapper methods to use PascalCase instead of camelCase. Since Seurat usually uses PascalCase for function names this might be kind of an elegant way to go.

  2. permutateByRotation is currently clobbering the image cell names with simple indices. This means that until rasterizeGeneExpression or rasterizeCellType is called, the Seurat instances are invalid and cannot be passed to any function that checks that validObject passes (e.g. NormalizeData). Furthermore, the fact that the clobbered cell names still work with the downstream methods might suggest that rasterizeGeneExpression and rasterizeCellType will not work as expected if handed an object where the counts index differs from the coordinates index (e.g. the image has been previously cropped).

  3. I think it might be more ergonomic to have permutateByRotation return a single Seurat instance but with multiple FOVs. Then, instead of allowing a list to be passed to rasterizeGeneExpression and rasterizeCellType we just have to support multiple image names to be specified. This isn't possible to do with a SpatialExperiment so this would be a small departure from the underlying API, but I think this is a better way to capture the workflow with our data model.

Once I've got my final code suggestions pushed up we can touch base and settle on action items to try to get this merged in as soon as practical 👍

Copy link
Contributor

@dcollins15 dcollins15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a closer read through the underlying SEraster package and noticed some concerns regarding the way we’re essentially forking rasterizeGeneExpression, rasterizeCellType, and permutateByRotation. By implementing things this way, we are committing ourselves to manually keeping the two implementations in sync. For really simple changes (e.g., changing the default values of parameters), this is manageable but can be a bit tedious. For more complicated updates, resolving the differences between two slightly different versions of the same method can become very challenging.

On a positive note, your decision not to re-implement rasterizeMatrix was a good one! 👍

I understand you might not want to spend too much more time on this, so I won’t block you here. If I were you, I would consider either:

Decoupling the SeuratWrapper functions from the SEraster so that the wrapper constitutes a new interface (similar to how previous wrappers are implemented, see https://github.com/KlugerLab/ALRA/blob/master/R/alra.R)
Implementing rasterizeGeneExpression and rasterizeCellType as true wrappers that call the equivalent SEraster methods, which would involve converting to and from a SpatialExperiment instance.

Given the timing, I realize either option might not seem very attractive, but I think it’s still valuable to be mindful of any technical debt we knowingly incur.

All that said, I think I’d be ok with merging this in once permutateByRotation has been updated to avoid clobbering cell barcodes - I think it’s just a function of making sure that the rownames are set for the data.frame returned by GetTissueCoordinates 👌

All in all, nice work!

R/SEraster.R Outdated Show resolved Hide resolved
Copy link
Contributor

@dcollins15 dcollins15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our discussion last Friday, we'll rename the main methods to avoid conflicting with the underlying SEraster package:

  • rasterizeGeneExpression -> RunRasterizeGeneExpression
  • rasterizeCellType -> RunRasterizeCellType
  • permutateByRotation -> RunPermutateByRotation

In addition, I've hammered out a small update that consolidates the updateFOV and updateVisiumV2 👍

docs/SEraster.Rmd Outdated Show resolved Hide resolved
R/SEraster.R Outdated Show resolved Hide resolved
R/SEraster.R Outdated Show resolved Hide resolved
R/SEraster.R Outdated Show resolved Hide resolved
R/SEraster.R Outdated Show resolved Hide resolved
zskylarli and others added 3 commits June 12, 2024 15:05
Co-authored-by: David Collins <[email protected]>
Co-authored-by: David Collins <[email protected]>
Co-authored-by: David Collins <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants