This repository contains an overview of the corpora we are using, along with some metadata about them. In particular, it allows easy lookup of text IDs by title or author.
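As a rough illustration of such a lookup, the sketch below searches an overview table in R. The column names `id`, `author`, and `title`, the helper `lookup()`, and the sample rows are all placeholders made up for this example; the actual overview tables may use different columns and separators, so check the CSV header before adapting it.

```r
# Placeholder overview table. The column names and rows are hypothetical
# and do not come from the real overview CSV files.
overview_text <- "id,author,title
xx:0001,Author One,Example Play A
xx:0002,Author Two,Example Play B
xx:0003,Author One,Example Play C"

overview <- read.csv(text = overview_text, stringsAsFactors = FALSE)

# Return all rows whose author or title matches the pattern.
# The pattern is treated as a case-insensitive regular expression.
lookup <- function(table, pattern) {
  hit <- grepl(pattern, table$author, ignore.case = TRUE) |
    grepl(pattern, table$title, ignore.case = TRUE)
  table[hit, , drop = FALSE]
}

# Text IDs of everything attributed to "Author One"
lookup(overview, "author one")$id
```

To try this against a real overview table, replace the inline text with `read.csv("<file>.csv")` and adjust the column names to match the file.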
Our own corpus is currently a combination of TextGrid, GerDraCor, and some Shakespeare dramas. We also modified the data where necessary and added annotations.
- Overview table: qd.csv
- Collection prefix: qd
Installation (in the R console with loaded DramaAnalysis package):
installData("qd")
Extracted from the TextGrid repository. Dates were added from the DLINA corpus.
- Overview table: tg.csv
- Collection prefix: tg
- URL to preprocessed XMI files: http://www2.ims.uni-stuttgart.de/gcl/reiterns/quadrama/res/tg.zip
Installation (in the R console with loaded DramaAnalysis package):
installData("tg")
Extracted from a fork of the GerDraCor repository.
- Overview table: gdc.csv
- Collection prefix: gdc
Installation (in the R console with loaded DramaAnalysis package):
installData("gdc")