iReceptor Data Curation

This Git repository contains example files and documentation for loading data into iReceptor repositories. It provides examples of metadata files as well as rearrangement files from a number of widely used annotation tools. The README files in each of the subfolders contain more documentation. The Zenodo link for this release is here:

For more information on Repertoire metadata curation, please refer to:

For more details on Rearrangement data curation, please refer to:

For more details on Clone data curation, please refer to:

For more details on Cell and Expression (GEX) data curation, please refer to:

The AIRR Cell format example

The iReceptor Data Curation process

The iReceptor team follows a relatively strict data curation process. This process is documented on the iReceptor Curation page. We do not discuss this process in detail here, but instead suggest simple processes that can make data curation easier to manage.

The iReceptor curation process is focused around the curation of data for a single study. As such, we recommend that all data that is being curated for a specific study be stored in a single directory. As an example, we will use one of the IMGT example data sets.

As mentioned, we recommend that all files relevant to the curation of data from a single study be located in a single directory. This includes the Repertoire Metadata file for the study as well as all of the Rearrangement, Clone, Cell, and Expression files for each Repertoire. In the case of the IMGT example, there is a single metadata file (PRJNA248411_Palanichamy_2018-12-18.csv). We tend to build the metadata file name from the study's Study ID from NCBI, the principal (or contact) author, and the date the file was last modified. In addition, there is a single IMGT annotation file for each of the 8 sample repertoires in the study. Again, we use the NCBI accession number in the file name to help manage the data. Note that a single repertoire can have more than one file: both the Repertoire Metadata file and the iReceptor Data Loader support multiple files per repertoire sample.
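The metadata file naming convention described above (Study ID, author, last-modified date) can be sketched in Python as follows. This is only an illustrative helper, not part of the iReceptor tooling: the names `MetadataFileName`, `parse_metadata_filename`, and `build_metadata_filename` are hypothetical, and the sketch assumes the three parts are joined with underscores and that the date is written as YYYY-MM-DD, as in the example file name.

```python
import re
from datetime import date
from typing import NamedTuple


class MetadataFileName(NamedTuple):
    """Parts of a metadata file name (hypothetical helper type)."""
    study_id: str   # e.g. the NCBI Study ID
    author: str     # principal (or contact) author
    modified: date  # date the file was last modified


# Assumed layout: <study_id>_<author>_<YYYY-MM-DD>.csv
_PATTERN = re.compile(
    r"^(?P<study_id>[A-Za-z0-9]+)_(?P<author>[^_]+)_"
    r"(?P<modified>\d{4}-\d{2}-\d{2})\.csv$"
)


def parse_metadata_filename(name: str) -> MetadataFileName:
    """Split a metadata file name into Study ID, author, and date."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected metadata file name: {name!r}")
    return MetadataFileName(
        study_id=match["study_id"],
        author=match["author"],
        modified=date.fromisoformat(match["modified"]),
    )


def build_metadata_filename(study_id: str, author: str, modified: date) -> str:
    """Assemble a metadata file name from its three parts."""
    return f"{study_id}_{author}_{modified.isoformat()}.csv"


if __name__ == "__main__":
    parts = parse_metadata_filename("PRJNA248411_Palanichamy_2018-12-18.csv")
    print(parts.study_id, parts.author, parts.modified)
```

A check like this can be run over a study directory before loading to catch metadata files that do not follow the convention.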

Given the above structure, it is quite simple to use the iReceptor Data Loading code to load AIRR-seq data stored in this form. Please refer to the iReceptor Turnkey Documentation for examples of how to load these data sets.
