How do I format my data as chi11 in the Get Started documentation? #4

Open
liyawang opened this issue Oct 5, 2023 · 1 comment

@liyawang

liyawang commented Oct 5, 2023

I have manually curated my populations in FlowJo.

@rohitfarmer
Collaborator

rohitfarmer commented Oct 6, 2023

Data preparation for HDStIM from manually gated populations

Hi @liyawang, chi11 is an R list. Its main element is the input data frame, with cells on the rows and markers and other variables on the columns. The list also contains a vector of state markers, the name of the column that holds the cell populations, a vector of stimulation types, and the label used for unstimulated samples.
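To make that structure concrete, here is a minimal sketch of a list with the same five pieces, built from toy data. The element names (`expr_data`, `state_markers`, `cluster_col`, `stim_label`, `unstim_label`), the marker names, and the labels are assumptions for illustration only; compare against `str(chi11)` for the actual names.

```r
# Toy expression data: one row per cell, markers and metadata on the columns.
# Marker names and values are made up for illustration.
expr_data <- data.frame(
  cell_population = c("CD4_T", "CD4_T", "B_cell", "B_cell"),
  stim_type       = c("U", "A", "U", "A"),
  subject         = c("s1", "s1", "s1", "s1"),
  pSTAT1          = c(0.2, 1.8, 0.1, 0.9),
  pSTAT3          = c(0.3, 1.1, 0.2, 0.4),
  stringsAsFactors = FALSE
)

# Element names below are assumed, not confirmed against the package.
my_input <- list(
  expr_data     = expr_data,           # main data frame
  state_markers = c("pSTAT1", "pSTAT3"), # columns treated as state markers
  cluster_col   = "cell_population",   # name of the cell population column
  stim_label    = c("A"),              # stimulation types present in stim_type
  unstim_label  = "U"                  # label of the unstimulated samples
)

# Inspect the top-level structure of the list.
str(my_input, max.level = 1)
```

Every entry in `state_markers` and the value of `cluster_col` should match a column name in the data frame, which is a quick thing to check before running anything else.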

When you say that you want to generate the input data from the manually gated populations from FlowJo, I assume you have FCS files per sample per population exported from FlowJo that you want to read in and generate the input data frame.

In that case, I first prepare a sample information file that lists the FCS files along with the stimulation type, cell population, subject, etc. that each file represents. Then I iterate over the FCS files and read in each expression matrix using flowCore. I attach the other variables (the cell population the cells belong to, subject, stimulation type, etc.) and concatenate everything into a single data frame.
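As a rough sketch of what such a sample information file could look like, the snippet below writes a two-row TSV and reads it back. The column names match the ones used in the code further down (`fcs_file`, `stim_type`, `cell_population`, `subject`); the file names and values are hypothetical.

```r
# One row per exported FCS file. File names and labels are made up.
sample_info <- data.frame(
  fcs_file        = c("subject1_unstim_cd4.fcs", "subject1_stimA_cd4.fcs"),
  stim_type       = c("U", "A"),
  cell_population = c("CD4_T", "CD4_T"),
  subject         = c("subject1", "subject1"),
  stringsAsFactors = FALSE
)

# Write to a temporary location so the example does not touch real data.
tsv_path <- file.path(tempdir(), "sample-information.tsv")
write.table(sample_info, tsv_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the layout survives the round trip.
round_trip <- read.delim(tsv_path, stringsAsFactors = FALSE)
print(round_trip)
```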

Once the data frame is ready, I arcsinh-transform the antigen channels. By convention, we use a cofactor of 5 for CyTOF data and 150 for Cytek data.
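For reference, the transform is simply `asinh(x / cofactor)`, which is close to linear near zero and close to logarithmic for large values. A small sketch on made-up intensities shows how the cofactor changes the scaling; the helper name `asinh_transform` is only for illustration.

```r
# Arcsinh transform with a configurable cofactor.
asinh_transform <- function(x, cofactor) asinh(x / cofactor)

# Made-up raw intensities.
raw <- c(0, 5, 50, 500)

cytof_scaled <- asinh_transform(raw, cofactor = 5)   # CyTOF convention
cytek_scaled <- asinh_transform(raw, cofactor = 150) # Cytek convention

print(round(cytof_scaled, 3))
print(round(cytek_scaled, 3))
```

A larger cofactor compresses the low end more strongly, so the same raw value maps to a smaller transformed value.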

I have uploaded an example dataset to this link https://app.box.com/s/o0gd6605z23bjpxyvl32tybkzqc4iyts with a sample information file. Try the code below to generate the data frame.

library(flowCore)
library(tidyverse)
library(arrow)

data_folder <- file.path("data")
results_folder <- file.path("results")
dir.create(results_folder, recursive = TRUE, showWarnings = FALSE) # No warning if the folder already exists

# Load the sample information
cytof_files_fcs <- read_tsv(file.path(data_folder, "sample-information.tsv"),
                            show_col_types = FALSE)

# Iterate over FCS files and concatenate data into a single data frame
cytof_dat_out <- tibble()
for (j in seq_len(nrow(cytof_files_fcs))) {
        cytof_file_path <- file.path(data_folder, cytof_files_fcs$fcs_file[j])
        cytof_fcs_dat <- read.FCS(cytof_file_path, transformation = FALSE, truncate_max_range = FALSE, alter.names=TRUE)
        cytof_expr_dat <- exprs(cytof_fcs_dat) %>%
                as_tibble()
        cytof_col_names <- as.character(cytof_fcs_dat@parameters@data$name) # Channel names in the FCS columns.
        cytof_col_desc <- as.character(cytof_fcs_dat@parameters@data$desc) # Name of the markers.
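        # Use marker names where available; otherwise fall back to channel names,
        # for example for time and forward and side scatter.
        c_names <- coalesce(cytof_col_desc, cytof_col_names) %>%
                str_split("_", simplify = TRUE)
        # Use the second underscore-separated field when present, and replace "-" with "_".
        c_names <- gsub("-", "_", coalesce(na_if(c_names[,2], ""), c_names[,1]))
        # NOTE: `panel` is not defined in this snippet. It is assumed to be a data frame
        # loaded beforehand, with an `antigen` column listing the markers to keep.
        features <- c(gsub("-", "_", panel$antigen), "Time")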
        colnames(cytof_expr_dat) <- c_names
        cytof_expr_dat <- dplyr::select(cytof_expr_dat, all_of(features))
        
        # Arcsinh transform every marker column (everything except Time)
        cofactor <- 5
        asi <- function(x) asinh(x/cofactor)
        markers <- setdiff(colnames(cytof_expr_dat), "Time")
        cytof_expr_dat <- mutate(cytof_expr_dat, across(all_of(markers), asi))

        cytof_dat_out <- bind_rows(cytof_dat_out, tibble("cell_population" = cytof_files_fcs$cell_population[j],
                                                         "stim_type" = cytof_files_fcs$stim_type[j],
                                                         "subject" = cytof_files_fcs$subject[j],
                                                         cytof_expr_dat))
}

arrow::write_feather(cytof_dat_out, file.path(results_folder, "cytof-dat-asinh-transform.feather"))
