This repository is the home for the schema for the GA4GH Tool Registry API. The goal of the API is to provide a standardized way to describe the availability of tools and workflows. In this way, we can have multiple repositories that share tools and workflows of various types that are described in workflow languages (e.g. WDL, CWL, Nextflow, Galaxy, Snakemake), have their dependencies embedded as containers (e.g. Docker, Singularity) or suitable alternatives (e.g., Conda), and have a consistent way to interact, search, and retrieve information from these various registries. The end goal is to make it much easier to share scientific tools and workflows, enhancing our ability to make research reproducible, sharable, and transparent.
See the human-readable Reference Documentation. You can also explore the specification in the Swagger Editor; if you are working from a branch other than develop, load that branch's JSON manually. Documentation generated by gh-openapi-docs for the development branch is also available as a preview.
The Global Alliance for Genomics and Health (GA4GH) is an international coalition, formed to enable the sharing of genomic and clinical data.
The Cloud Work Stream is focused on creating specific standards for defining, sharing, and executing portable workflows and self-contained tasks, and accessing data across clouds.
We work with many different Driver Projects to develop, enhance, test, and use the Cloud Work Stream APIs.
This is the home of the schema for the GA4GH Tool Registry API. The GA4GH Tool Registry API is a standard for listing and describing available tools (both stand-alone, self-contained tools and workflows in CWL, WDL, Nextflow, Galaxy or Snakemake) in a given registry. This defines a minimal, common API describing tools that we propose for support by multiple tool/workflow registries like Dockstore, BioContainers, and Agora for the purposes of exchange, indexing, and searching.
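As a rough illustration of what "listing and describing" tools can look like to a client, the sketch below filters an already-fetched tool listing by workflow language. It assumes each tool object carries an `id` and a `versions` list whose entries have a `descriptor_type` array, as in TRS v2 responses; the sample records and the helper name `tools_with_descriptor` are invented for this example and are not taken from any real registry.

```python
from typing import Any


def tools_with_descriptor(tools: list[dict[str, Any]], descriptor_type: str) -> list[str]:
    """Return the IDs of tools that have at least one version in the given language."""
    matching = []
    for tool in tools:
        versions = tool.get("versions", [])
        # A version may list several descriptor types, so test membership.
        if any(descriptor_type in v.get("descriptor_type", []) for v in versions):
            matching.append(tool["id"])
    return matching


# Hypothetical records shaped like a TRS tool listing (made-up IDs).
sample_tools = [
    {"id": "example-aligner", "versions": [{"name": "1.0", "descriptor_type": ["CWL"]}]},
    {"id": "example-caller", "versions": [{"name": "2.3", "descriptor_type": ["WDL", "CWL"]}]},
    {"id": "example-qc", "versions": [{"name": "0.9", "descriptor_type": ["NFL"]}]},
]

cwl_tools = tools_with_descriptor(sample_tools, "CWL")
print(cwl_tools)
```

Because every conforming registry is expected to return the same shape, the same helper could in principle be pointed at listings from different registries without change.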
This repo uses the HubFlow scheme, which is closely based on GitFlow. In practice, this means that the master branch contains the last production release of the schema, whereas the develop branch contains the latest development changes, which will end up in the next production release. As of February 2022, the develop branch contains work which will accumulate and evolve into a 2.1 production release.
Our current iteration focuses on a read-only API due to potentially different views and approaches to registration/security.
Key features of the current API:
- Read-only API
- Serve tool and workflow resources via specifically designed schemas that encourage rich metadata annotation and help enable software FAIRification
- Download individual workflow descriptor files or an archive of all workflow and accessory files (e.g., test files)
- Allow integrators to interrogate the language versions of these workflows (e.g. CWL 1.1, CWL 1.2 or Nextflow DSL2) to identify compatible workflows
- Get specific versions of workflows and tools, potentially with immutable versions with checksums on their files
- Assign globally unique TRS URIs to specific versions of tool and workflow resources
- Provide more structure than a simple unformatted list of tools, while remaining a standard for registries to implement rather than a registry implementation itself
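The features listed above can be sketched from a client's point of view. The example below builds the URL for a specific tool version, the URL of its primary descriptor, and resolves a TRS URI to an HTTPS URL. It assumes the TRS v2 base path `/ga4gh/trs/v2`, the `trs://<server>/<id>/<version>` URI form, and that the tool ID inside a TRS URI is percent-encoded; the host `registry.example.org` and the function names are hypothetical. Check the Reference Documentation before relying on any of these details.

```python
from urllib.parse import quote, unquote, urlsplit

TRS_BASE_PATH = "/ga4gh/trs/v2"  # assumed base path for TRS v2 servers


def tool_version_url(host: str, tool_id: str, version_id: str) -> str:
    """Build the URL of one specific tool version on a TRS server."""
    # Tool IDs may contain '/' or '#', so percent-encode every reserved character.
    encoded_id = quote(tool_id, safe="")
    encoded_version = quote(version_id, safe="")
    return f"https://{host}{TRS_BASE_PATH}/tools/{encoded_id}/versions/{encoded_version}"


def descriptor_url(host: str, tool_id: str, version_id: str, descriptor_type: str) -> str:
    """Build the URL of the primary descriptor (e.g. CWL, WDL) for a tool version."""
    return f"{tool_version_url(host, tool_id, version_id)}/{descriptor_type}/descriptor"


def resolve_trs_uri(trs_uri: str) -> str:
    """Translate trs://<server>/<id>/<version> into the matching HTTPS URL."""
    parts = urlsplit(trs_uri)
    if parts.scheme != "trs" or not parts.netloc:
        raise ValueError(f"not a TRS URI: {trs_uri!r}")
    # Expect exactly two path segments: the (percent-encoded) ID and the version.
    segments = parts.path.lstrip("/").split("/")
    if len(segments) != 2 or not all(segments):
        raise ValueError(f"expected trs://<server>/<id>/<version>, got {trs_uri!r}")
    tool_id, version_id = (unquote(s) for s in segments)
    return tool_version_url(parts.netloc, tool_id, version_id)


# Hypothetical host and IDs, for illustration only.
print(descriptor_url("registry.example.org", "my-tool", "1.0", "CWL"))
print(resolve_trs_uri("trs://registry.example.org/%23workflow%2Fgithub.com%2Forg%2Frepo/v1.0"))
```

Keeping URL construction in pure functions like these makes it easy to test without network access; the actual HTTP request can then be made with whichever client library the integrator prefers.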
Questions TRS currently does not (comprehensively) address include the following:
- How do we track authorship? Should we track authorship of the tool metadata, the Docker image, the underlying algorithm, or all of the above?
- How to describe indexing and external services like an external SPARQL service?
- How to better interoperate with the GA4GH Workflow Execution Service (WES) and Task Execution Service (TES) APIs for triggering workflow and tool runs?
See the Swagger Editor to view our schema in progress.
Take cues for now from the CONTRIBUTING.md document.
At the very least, create an issue in our GitHub tracker.
Even better, fork the codebase, fix the issue, and create a pull request back to the project along with your ticket.
To add a registry that supports the GA4GH Tool Registry API:
- fork the repo
- modify registry.json
- submit a pull request back to the project
- we will confirm the site is valid then accept your pull request
See our registry.json for a list of known registries that conform to the Tool Registry API standard.
See the LICENSE.
- GA4GH Cloud Work Stream - the wiki and meeting notes for the workstream
- APIs that we co-ordinate/meet with
- Global Alliance for Genomics and Health - GA4GH's main page
- GA4GH Technical Alignment Sub Committee (TASC) - we try to co-ordinate GA4GH API decisions here
- GA4GH Slack - although you may need an invitation from a GA4GH administrator if your email domain name has not been allow-listed, see ga4gh/TASC#44