diff --git a/content/04.body.md b/content/04.body.md
index 44cf8c6..ffb79b5 100644
--- a/content/04.body.md
+++ b/content/04.body.md
@@ -2,16 +2,21 @@
 The Human Cell Atlas (HCA) provides unprecedented characterization of molecular phenotypes across individuals, tissues and disease states -- resolving differences to the level of
-individual cells. This dataset provides an extraordinary opportunity for scientific advancement, enabled by new tools to rapidly query, characterize, and analyze these intrinsically
-high-dimensional data. To facilitate this, our seed network proposes to compress HCA data into fewer dimensions
-that preserve the important attributes of the original high dimensional data and yield
-interpretable, searchable features. For transcriptomic data, compressing on the gene
+individual cells. This dataset provides an extraordinary opportunity for scientific advancement,
+enabled by new tools to rapidly query, characterize, and analyze these intrinsically
+high-dimensional data. To facilitate this, our seed network proposes to compress HCA data into
+fewer dimensions that preserve the important attributes of the original high dimensional data
+and yield interpretable, searchable features. For transcriptomic data, compressing on the gene
 dimension is most attractive: it can be applied to single samples, and genes often provide
-information about other co-regulated genes or cellular attributes. We hypothesize that building an ensemble of low dimensional representations across latent space methods will provide a
-reduced dimensional space that captures biological sources of variability and is robust to measurement noise. Our seed network will
-incorporate biologists and computer scientists from five leading academic institutions who will work together to create foundational technologies
-and educational opportunities that promote effective interpretation of low dimensional representations of HCA data. We will continue our active collaborations with other
-members of the broader HCA network to integrate state of the art latent space tools, portals, and annotations to enable biological utilization of HCA data through latent spaces.
+information about other co-regulated genes or cellular attributes. We hypothesize that
+using latent space methods to identify low dimensional representations of HCA data
+will accurately capture biological sources of variability and will be robust to measurement
+noise. Our seed network incorporates biologists, computer scientists, statisticians, and
+data scientists from five leading academic institutions who will work collaboratively together to create
+foundational technologies and educational opportunities that promote effective interpretation
+of low dimensional representations of HCA data. We will continue our active collaborations with other
+members of the broader HCA network to integrate state of the art latent space tools,
+portals, and annotations to enable biological utilization of HCA data through latent spaces.
 ## Scientific Goals