Raw annotated data for the treebanks in the Syntacticus collection.
Releases of the collection are hosted on GitHub.
The texts in the collection are available in two formats:
- PROIEL XML: These files are the authoritative source files and the only ones that contain all available annotation. They contain the complete morphological, syntactic and information-structure annotation, as well as the complete text, including punctuation, section headers etc. The schema is defined in proiel.xsd.