Added a script that converts SGML files to the XML format expected by the reader. The script is still rudimentary, but it works for all files of the following structure:
The script takes the name of the file to convert as its argument:
python muc7_SGML2XML.py training.tr.keys.980410
and produces a file with the same name plus an additional ".xml" ending.
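For illustration, a conversion of this kind could be sketched as follows. This is not the code of muc7_SGML2XML.py. It is a minimal sketch that assumes the SGML only needs three repairs to become well-formed XML: escaping bare ampersands, quoting unquoted attribute values, and wrapping everything in a single root element. The names sgml_to_xml, xml_path_for, convert_file and the root element DOCS are hypothetical, and so is the latin-1 input encoding.

```python
import re
import sys
from pathlib import Path

# "&" that does not start a predefined XML entity or a numeric character reference
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")
# Any tag, e.g. <COREF ID="1" TYPE=IDENT>
TAG = re.compile(r"<[^<>]+>")
# Either an already quoted string (kept unchanged) or an unquoted attribute value
ATTR = re.compile(r'''("[^"]*"|'[^']*')|(\s[A-Za-z_][\w.-]*)=([^\s"'<>]+)''')


def _quote(match: re.Match) -> str:
    """Leave quoted strings alone; wrap unquoted attribute values in double quotes."""
    if match.group(1) is not None:
        return match.group(1)
    return f'{match.group(2)}="{match.group(3)}"'


def sgml_to_xml(sgml: str, root: str = "DOCS") -> str:
    """Apply the three assumed repairs and return the XML text."""
    text = BARE_AMP.sub("&amp;", sgml)
    text = TAG.sub(lambda m: ATTR.sub(_quote, m.group(0)), text)
    return f'<?xml version="1.0" encoding="UTF-8"?>\n<{root}>\n{text}\n</{root}>\n'


def xml_path_for(src: Path) -> Path:
    """Same file name with an additional ".xml" ending."""
    return src.with_name(src.name + ".xml")


def convert_file(src: Path) -> Path:
    """Convert one file and return the path of the written XML file."""
    # Assumption: latin-1 input, chosen because it never fails to decode
    sgml = src.read_text(encoding="latin-1")
    dst = xml_path_for(src)
    dst.write_text(sgml_to_xml(sgml), encoding="utf-8")
    return dst


if __name__ == "__main__" and len(sys.argv) == 2:
    print(convert_file(Path(sys.argv[1])))
```

Entities other than the five predefined XML ones are escaped as literal text in this sketch, so a file that relies on SGML-specific entities would need extra handling.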
But:
The reader does not seem to annotate coreferences in the CAS. This needs to be investigated!
I noticed something else as well: the MUC7Reader currently annotates only the "Text" part of the document. If you want to annotate the whole document (which should be the normal case), you have to uncomment the commented-out method calls in the method getNext(CAS). Among the static variables, you also have to remove the comment markers at:
/**
 * XML elements comprised in an object list
 */
public static final String[] ELEMENT_TEXT_TO_BE_PROCESSED = { ELEMENT_SLUG, ELEMENT_DATE,
        ELEMENT_NWORDS,
        ELEMENT_PREAMBLE, ELEMENT_TEXT, ELEMENT_TRAILER };