Registry of DraCor corpora
This registry provides a list of available and planned DraCor corpora with some metadata for each corpus. The data is collected from the corpus.xml files in the individual corpus repositories.
The list is available in JSON format (see corpora.json) or as a Node package:
npm install @dracor/registry
The package exports the corpora list as its default export:
import corpora from '@dracor/registry';
console.log(corpora);
The status metadata field indicates the availability and stability of each corpus. There are currently three recognized values:
- published: the corpus is considered stable and is available at https://dracor.org
- draft: the corpus is currently under development and can be previewed on https://staging.dracor.org
- proposed: the corpus is planned or in very early development. The repository may or may not be publicly available.
These values are a subset of the suggested values in the TEI specification (see https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.docStatus.html).
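As a rough illustration of how the status field might be used, the following sketch filters a list of corpora by status. The sample entries and the helper name namesByStatus are hypothetical; only the name and status fields and the three status values are taken from this document.

```javascript
// Hypothetical sample entries. The real list comes from corpora.json
// or from the default export of @dracor/registry.
const sampleCorpora = [
  { name: 'example-a', status: 'published' },
  { name: 'example-b', status: 'draft' },
  { name: 'example-c', status: 'proposed' },
];

// The three status values currently recognized by the registry.
const RECOGNIZED_STATUSES = ['published', 'draft', 'proposed'];

// Return the names of all corpora that have the given status.
// (namesByStatus is an illustrative helper, not part of the package.)
function namesByStatus(corpora, status) {
  if (!RECOGNIZED_STATUSES.includes(status)) {
    throw new Error(`Unrecognized status: ${status}`);
  }
  return corpora.filter((c) => c.status === status).map((c) => c.name);
}

console.log(namesByStatus(sampleCorpora, 'published'));
```

Rejecting unrecognized values early is a design choice of this sketch; the registry itself may handle them differently.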
The status of a corpus can be indicated in its corpus.xml file using the revisionDesc element:
<revisionDesc status="draft">
<change when="2018-12-12" status="proposed"/>
<change when="2020-07-17" status="draft"/>
</revisionDesc>
The update script uses either the status attribute of the revisionDesc element or, if this is not available, the status attribute of the latest change element (i.e. the one with the most recent date in @when).
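The fallback rule described above could be sketched roughly as follows. This is not the actual update script: it assumes the revisionDesc element has already been parsed into a simplified, hypothetical object of the form { status, changes: [{ when, status }] }, and the function name resolveStatus is made up for illustration.

```javascript
// Resolve the effective status from a simplified, hypothetical
// representation of <revisionDesc>.
function resolveStatus(revisionDesc) {
  // Prefer the status attribute on revisionDesc itself.
  if (revisionDesc.status) {
    return revisionDesc.status;
  }

  // Otherwise look at the change elements that carry a @when date.
  const dated = (revisionDesc.changes || []).filter((c) => c.when);
  if (dated.length === 0) {
    return undefined;
  }

  // ISO dates (YYYY-MM-DD) sort chronologically as plain strings,
  // so a string comparison finds the most recent change.
  const latest = dated.reduce((a, b) => (b.when > a.when ? b : a));
  return latest.status;
}

// Mirrors the XML example, but without a status on revisionDesc.
console.log(
  resolveStatus({
    changes: [
      { when: '2018-12-12', status: 'proposed' },
      { when: '2020-07-17', status: 'draft' },
    ],
  })
);
```

Comparing the dates rather than relying on document order means the result does not depend on how the change elements are sorted in the file.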
The entry for a single corpus can be extracted from corpora.json with jq:
jq '.[] | select(.name == "ger")' < corpora.json
The registry can be updated by running the update script (pnpm run update-corpora). This script retrieves the corpus.xml from each repository listed in corpora.json, extracts the relevant metadata and updates the respective fields in corpora.json. Fields that exist in corpora.json but have no equivalent in the corpus.xml are left untouched. You need to have node installed and corepack enabled (run corepack enable once after installing node).
cd dracor-registry
pnpm install
pnpm update-corpora
# or using a personal access token for the GitHub API
GITHUB_API_TOKEN=yourpersonalaccesstoken pnpm update-corpora
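The rule that fields without an equivalent in corpus.xml are left untouched can be pictured as a shallow merge. The sketch below is only an illustration under that assumption, not the update script's code; mergeEntry and the title and note fields are hypothetical.

```javascript
// Merge metadata extracted from corpus.xml into an existing
// corpora.json entry without touching unrelated fields.
// (mergeEntry is an illustrative helper, not the script's real API.)
function mergeEntry(existing, extracted) {
  // Start from a copy so the original entry is not mutated.
  const merged = { ...existing };
  for (const [key, value] of Object.entries(extracted)) {
    // Skip undefined values so that information missing from
    // corpus.xml does not erase what is already stored.
    if (value !== undefined) {
      merged[key] = value;
    }
  }
  return merged;
}

// Hypothetical entry: "note" has no counterpart in corpus.xml.
const existingEntry = { name: 'example', status: 'proposed', note: 'kept as is' };
const fromCorpusXml = { status: 'draft', title: 'Example Corpus' };

console.log(mergeEntry(existingEntry, fromCorpusXml));
```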
To release a new version to npmjs.com you need to be a member of the dracor organization.
pnpm release