Allen Institute Taxonomy (AIT)

To distribute Allen Institute Taxonomies (AIT), we define an anndata .h5ad file with a formalized schema that encapsulates the essential components of a taxonomy required for downstream analysis.

For information on how to build and work with AIT files, see the companion scrattch R libraries.

For a list of available taxonomies in AIT format, see this table of available taxonomies.

Overview

One major challenge in creating a cell type taxonomy schema is defining terms such as "taxonomy", "dataset", "annotation", "metadata", and "data". It is increasingly important to separate the data from the other components, and to compartmentalize all components so that users do not need to download, open, or upload huge and unwieldy files. AIT addresses this challenge by extending and modifying the popular CELLxGENE schema to better align with BICAN and Allen Institute needs.

(Brief description of AIT and its differences from CELLxGENE to be entered here. Also link to a version of the schema table with a new column indicating what is included in CELLxGENE.)

Taxonomy 'modes' are a key concept specific to AIT that allow multiple embedded subsets of the data to be stored in a single .h5ad file. More detail about taxonomy modes and a separate schema describing how they work can be found here.
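As a rough illustration of the idea, the sketch below treats a mode as a named subset of cells that is layered over a single shared cell table. It uses plain Python dictionaries as a stand-in for the .h5ad file. The mode names, cell IDs, cluster labels, and helper functions are all hypothetical; the actual encoding of modes is defined by the separate modes schema, not by this example.

```python
# Hypothetical sketch: names and layout are illustrative, not the AIT schema.

# Cell-level table shared by every mode (a stand-in for `obs` in an .h5ad file).
cells = {
    "cell_1": "L2/3 IT",
    "cell_2": "L2/3 IT",
    "cell_3": "Pvalb",
    "cell_4": "Astro",
}

# Each mode records only which cells it keeps; the cell table is stored once.
modes = {
    "all_cells": set(cells),
    "neurons_only": {"cell_1", "cell_2", "cell_3"},
}


def cells_in_mode(mode_name):
    """Return {cell_id: cluster} for the cells retained by one mode."""
    keep = modes[mode_name]
    return {cell_id: cluster for cell_id, cluster in cells.items() if cell_id in keep}


def clusters_in_mode(mode_name):
    """Return the sorted cluster labels that still have cells in a mode."""
    return sorted(set(cells_in_mode(mode_name).values()))


neuron_clusters = clusters_in_mode("neurons_only")
print(neuron_clusters)
```

Because each mode stores only membership, adding a mode in this sketch does not duplicate the underlying cell table, which mirrors the stated goal of keeping several subsets in one file.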

Related efforts

AIT is being developed alongside three complementary efforts for packaging of taxonomies, data sets, and associated metadata and annotations.

Cell Annotation Platform (CAP): CAP 'is a centralized, community-driven platform for the creation, exploration, and storage of cell annotations for single-cell RNA-sequencing (scRNA-seq) datasets.' The Allen Institute and BICAN are partnering with CAP for annotation of brain (including basal ganglia) and spinal cord taxonomies.
Cell Annotation Schema (CAS): Compatible with CAP and with Taxonomy Development Tools (TdT), CAS functions as a store of extended information about cell sets, including ontology term mappings and evidence for annotation (from annotation transfer and marker expression). CAS complements the more commonly used schemas, which are cell-centric and occasionally cluster-centric. CAS has both a general schema and a BICAN-associated schema, and can be embedded in the header (uns) of an AIT file.
Brain Knowledge Platform (BKP): Although the BKP schema does not appear to be publicly documented, it is the data model used for Jupyter Notebooks associated with the Allen Brain Cell (ABC) Atlas and will eventually power all novel content hosted on Allen Brain Map. Currently, any data sets to be included in ABC Atlas or MapMyCells must be ingested into BKP.
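To make the idea of embedding CAS in the header (uns) more concrete, here is a minimal sketch that serializes a CAS-like record to JSON and stores it under a key in a dictionary standing in for uns. The key name "cas", the record's field names, and the placeholder ontology identifier are assumptions made for illustration. They are not taken from the CAS or AIT specifications, and AIT may store CAS content differently.

```python
import json

# Hypothetical CAS-like record; field names are illustrative, not the CAS spec.
cas_record = {
    "annotations": [
        {
            "cell_set": "cluster_7",
            "label": "Pvalb interneuron",
            "ontology_term_id": "EXAMPLE:0000001",  # placeholder, not a real term
            "evidence": ["marker expression"],
        },
        {
            "cell_set": "cluster_12",
            "label": "Astrocyte",
            "ontology_term_id": "EXAMPLE:0000002",  # placeholder, not a real term
            "evidence": ["annotation transfer"],
        },
    ]
}

# Stand-in for the `uns` mapping of an AnnData object (assumed key name "cas").
uns = {}
uns["cas"] = json.dumps(cas_record, sort_keys=True)

# Reading the record back recovers the nested structure unchanged.
restored = json.loads(uns["cas"])
labels = {a["cell_set"]: a["label"] for a in restored["annotations"]}
print(labels)
```

Storing the record as a single JSON string keeps arbitrarily nested annotation data in one header entry. That is a design choice of this sketch rather than a documented property of AIT.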

