Clarifications about .abundance_counts #95

Open
denvercal1234GitHub opened this issue Sep 8, 2023 · 4 comments


denvercal1234GitHub commented Sep 8, 2023

Hi @stemangiola - My apologies for re-asking these questions following up from #92. Would you mind giving me some pointers?

Q1. Could you confirm that the duplicated cell identifiers are created as a byproduct of the plotting function, and that the input data itself does not contain duplicated cell identifiers?

Q2. What exactly is the .abundance_ function calculating when there is one feature and when there are multiple features? Can we control how the expression is summarized for plotting, e.g., median, ...?

Q3. Is it possible to modify the code below so that we can split a cell cluster by some threshold of expression of a marker or set of markers and plot these cell splits next to each other in the bar plot?

For example, in the plot below, CATALYST28meta16 holds my cluster IDs on the x-axis. Instead of one bar containing all the cells per cluster, I would like two bars for each cluster: one for the cells that have "low" CD3 expression and one for the cells that have "high" CD3 expression.

[Screenshot 2023-09-08 at 18 03 37]

stemangiola (Owner) commented

Q1. Could you confirm that the duplicated cell identifiers are created as a byproduct of the plotting function, and that the input data itself does not contain duplicated cell identifiers?

join_features() with shape = "long" creates a long table, so each cell is repeated once for every feature (gene). The duplicated identifiers come from that long format, not from the input data.
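To make the point about repeated cell identifiers concrete, here is a minimal base R sketch. It does not call join_features(); the toy table and its values are made up, and the .cell column name is an assumption for illustration (.feature and .abundance_exprs follow the names used in this thread).

```r
# Toy wide table: one row per cell, one column per marker (made-up values).
wide <- data.frame(
  .cell = c("cell_1", "cell_2", "cell_3"),
  CD3 = c(0.2, 1.5, 3.1),
  CD4 = c(2.4, 0.1, 1.9)
)

# Long form: one row per cell-and-feature pair,
# so each cell identifier appears once per feature.
features <- c("CD3", "CD4")
long <- data.frame(
  .cell = rep(wide$.cell, times = length(features)),
  .feature = rep(features, each = nrow(wide)),
  .abundance_exprs = unlist(wide[features], use.names = FALSE)
)

anyDuplicated(wide$.cell) > 0  # FALSE: the input has unique cell identifiers
anyDuplicated(long$.cell) > 0  # TRUE: identifiers repeat only in the long table
table(long$.cell)              # each cell appears once per feature
```

With two features, every cell shows up in two rows of the long table, which is the kind of duplication asked about in Q1.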

Q2. What exactly is the .abundance_ function calculating when there is one feature and when there are multiple features? Can we control how the expression is summarized for plotting, e.g., median, ...?

.abundance_ is not a function but rather a column name. It simply holds the extracted value of a gene for an assay; nothing is calculated.

Q3. Is it possible to modify the code below so that we can split a cell cluster by some threshold of expression of a marker or set of markers and plot these cell splits next to each other in the bar plot?

Yes, you can adapt this code:

|> mutate(high_value = .abundance_<xxx> > my_threshold)

Then you can group by the high_value category in ggplot.
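As a rough illustration of that suggestion, the sketch below flags cells against a cutoff and draws the low and high groups as side-by-side bars per cluster. It runs on a made-up data frame rather than real join_features() output; the .cell column, the CD3_level name, the values, and the my_threshold cutoff of 2.5 are all assumptions chosen for the example.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for a long table with a single feature (values are made up).
set.seed(1)
toy <- data.frame(
  .cell = paste0("cell_", 1:40),
  CATALYST28meta16 = rep(c("14", "15"), each = 20),
  .feature = "CD3",
  .abundance_exprs = runif(40, min = 0, max = 5)
)

my_threshold <- 2.5  # hypothetical cutoff; choose one that suits your data

# Flag each cell as high or low, then count cells per cluster and flag.
cell_counts <- toy |>
  mutate(CD3_level = if_else(.abundance_exprs > my_threshold, "CD3 high", "CD3 low")) |>
  count(CATALYST28meta16, CD3_level)

# Dodged bars place the low and high groups side by side within each cluster.
p <- ggplot(cell_counts, aes(CATALYST28meta16, n, fill = CD3_level)) +
  geom_col(position = position_dodge())
```

Calling print(p) draws the plot. With real data, the mutate() step would presumably be applied to the table produced by join_features(), using the actual .abundance_ column in place of the toy one.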

denvercal1234GitHub (Author) commented Sep 10, 2023

Thank you @stemangiola for your response. If .abundance_ is simply a column name, then what does the code below (specifically join_features(features = c("CD4", "CD3"))) plot? Presumably the different features were aggregated in some way to produce the plot.

Or does it simply pull the expression value of each cell for every feature included and plot them all, without doing anything to the expression values across cells? If so, can we label which dots correspond to which feature?

?tidySingleCellExperiment::join_features just says that "This function extracts information for specified features and returns the information in either long or wide format," but it is not clear how the features are joined.

Thank you again.

F37_sce_backboneClustering |> dplyr::filter(CATALYST28meta16 %in% c("14", "15")) |> join_features(features = c("CD4", "CD3")) |>
  ggplot(aes(CATALYST28meta16, .abundance_exprs, fill = CATALYST28meta16)) +  geom_violin(position = position_dodge(0.75))
[Screenshot 2023-09-10 at 14 07 33]

stemangiola (Owner) commented

I think you should use facet_wrap(~.feature); your plot is currently ignoring the .feature column.

denvercal1234GitHub (Author) commented

Thanks @stemangiola, but I was wondering what the plot above represents for these genes without facet_wrap(~.feature), since a plot was still generated. Is it the sum of the expression of these 2 genes?
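A small sketch may help with this question. It uses a made-up long table in place of the real join_features() output (the values and the 20 cells per cluster are assumptions). When .feature is not mapped to any aesthetic or facet, ggplot2 groups rows only by the discrete variables it is given, so each violin is drawn from the rows of both features pooled together; nothing is summed.

```r
library(ggplot2)

# Toy long table: two clusters, two features, 20 cells per combination.
set.seed(1)
toy <- data.frame(
  CATALYST28meta16 = rep(c("14", "15"), times = 40),
  .feature = rep(c("CD3", "CD4"), each = 40),
  .abundance_exprs = c(rnorm(40, mean = 1), rnorm(40, mean = 4))
)

# .feature is not mapped: each violin is built from the rows of both features.
p_pooled <- ggplot(toy, aes(CATALYST28meta16, .abundance_exprs, fill = CATALYST28meta16)) +
  geom_violin()

# One panel per feature, so each violin shows a single marker.
p_faceted <- p_pooled + facet_wrap(~.feature)
```

In the pooled plot the two markers would be expected to show up as a bimodal violin per cluster, while the faceted plot separates them into one panel each.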
