Considerations for single cell splicing data #51

Open
@dbogdano

Hello. Thank you for creating and maintaining this easy to use tool.

When scoring cells, have you considered how single-cell splicing data, stored as a cell-by-intron matrix of percent-spliced-in (PSI) values, might be used as input? This data is generally sparser than gene expression data, with many values represented as NaN wherever a cell has no expression of the underlying gene from which to calculate PSI.

As is, score_cell returns NaN as the score for every cell, likely because of the missing values in the input.
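
To illustrate the suspected cause, here is a minimal NumPy sketch of how a single missing value propagates through an ordinary average. It is not scDRS code and makes no claim about what score_cell does internally; the array `psi_row` is made-up example data.

```python
import numpy as np

# Hypothetical PSI values for one cell across three introns;
# the middle intron has no supporting reads, so its PSI is NaN.
psi_row = np.array([0.2, np.nan, 0.9])

# An ordinary mean propagates the NaN to the result.
plain_mean = np.mean(psi_row)

# A NaN-aware mean skips the missing entry and averages the rest.
masked_mean = np.nanmean(psi_row)

print(plain_mean, masked_mean)
```

If any step of a scoring pipeline uses plain sums or means over the input matrix, one NaN per row or column would be enough to turn every downstream value into NaN.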

martinjzhang commented on Feb 15, 2023

@martinjzhang
Owner

Hi,

Thank you for the question. Does it make sense to replace the NaN values with 0 in the input data? Or does this require new method development?

Best,
Martin

dbogdano commented on Feb 15, 2023

@dbogdano
Author

Hi Martin,

Thanks for the quick response. Unfortunately, it isn't that simple. A PSI value of 0 means a 0 rate of intron inclusion, given the RNA-seq reads that either span a given splice junction (suggesting the intron is spliced out) or bypass the junction (suggesting it is retained). PSI values range from 0 to 1, with 1 meaning 100% intron inclusion given the evidence. NaN values indicate that the single cell has no RNA-seq reads of either type, so there is no evidence for inclusion or excision.
Using PSI rather than read pileups overlapping splice junctions lets splicing be represented without the measurement being confounded by the expression level of the underlying gene.
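
To make the distinction between 0 and NaN concrete, here is a small sketch of how PSI could be derived from per-cell read counts, assuming PSI is simply retained reads divided by all informative reads. The function name `psi_from_counts` and the count matrices are hypothetical, and real splicing pipelines may define or filter PSI differently.

```python
import numpy as np

def psi_from_counts(retained_reads, spliced_reads):
    """Return a cell-by-intron PSI matrix from two count matrices.

    retained_reads: reads supporting intron inclusion (retention).
    spliced_reads:  reads spanning the junction (intron spliced out).
    Entries with no reads of either type are left as NaN.
    """
    retained = np.asarray(retained_reads, dtype=float)
    spliced = np.asarray(spliced_reads, dtype=float)
    total = retained + spliced

    # Start from NaN everywhere, then fill only where evidence exists.
    psi = np.full(total.shape, np.nan)
    np.divide(retained, total, out=psi, where=total > 0)
    return psi

# Two cells (rows) by three introns (columns), made-up counts.
retained = np.array([[0, 3, 0],
                     [2, 0, 0]])
spliced = np.array([[5, 1, 0],
                    [2, 0, 4]])

psi = psi_from_counts(retained, spliced)
print(psi)
```

In this toy example, the first cell's first intron gets PSI 0 because five reads all support excision, while its third intron gets NaN because there are no reads at all. Replacing that NaN with 0 would assert evidence that does not exist.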

For now, I'm thinking of just using pseudo-bulked cells, each representing the mean PSI values of between 10 and 100 single cells grouped together by similar gene expression, before trying anything more sophisticated.
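
A rough sketch of that pseudo-bulking step, assuming the cells have already been assigned to groups (for example by clustering on gene expression, which is not shown here). The function `pseudobulk_mean_psi`, its `min_cells` parameter, and the example labels are all hypothetical.

```python
import numpy as np

def pseudobulk_mean_psi(psi, group_labels, min_cells=1):
    """Average PSI per group of cells, ignoring NaN entries.

    psi:          cell-by-intron matrix with NaN for missing evidence.
    group_labels: one label per cell (assumed to come from an
                  expression-based grouping done elsewhere).
    min_cells:    minimum number of observed cells needed to report
                  a mean for an intron; otherwise the entry stays NaN.
    """
    psi = np.asarray(psi, dtype=float)
    labels = np.asarray(group_labels)
    groups = np.unique(labels)
    threshold = max(int(min_cells), 1)  # avoid dividing by zero

    out = np.full((groups.size, psi.shape[1]), np.nan)
    for i, g in enumerate(groups):
        block = psi[labels == g]
        observed = ~np.isnan(block)
        n_obs = observed.sum(axis=0)
        sums = np.where(observed, block, 0.0).sum(axis=0)
        ok = n_obs >= threshold
        out[i, ok] = sums[ok] / n_obs[ok]
    return groups, out

# Four cells by three introns, two groups of two cells each.
psi = np.array([[0.0, np.nan, 1.0],
                [0.5, np.nan, np.nan],
                [np.nan, 0.2, 0.4],
                [np.nan, 0.6, 0.8]])
labels = np.array(["a", "a", "b", "b"])

groups, bulk = pseudobulk_mean_psi(psi, labels)
print(groups)
print(bulk)
```

Pseudo-bulking reduces NaN entries but does not necessarily remove them: an intron that no cell in a group covers stays NaN, so larger groups or intron filtering may still be needed before scoring. Note also that this averages per-cell PSI values, weighting every observed cell equally regardless of read depth.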

          Considerations for single cell splicing data · Issue #51 · martinjzhang/scDRS