Update README.md

chrisdrymon · Feb 10, 2022 · bf32f7a
1 parent 747f718
commit bf32f7a
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/README.md b/README.md
@@ -32,7 +32,7 @@ It is possible that an exceedingly large string may cause memory issues. If you
 perhaps split the text in half and try that. This is an issue that will be addressed in later releases.
 
 ## Design
-This novel architecture utilizes no rules or morphology lookup tables. Rather, it examines individual token morphology and each token's context within the sentence using a series of neural networks. Furthermore, because of the varying tendencies of the many human annotators which are found among the AGDT treebanks, Angel considered annotators as a feature during training. Consequently, while running inference, an annotator must be chosen for the tagger to emulate. "Vanessa Gorman" is the default choice as her annotation style is up to date and she is currently the single most prolific annotator. 
+This architecture utilizes no rules or morphology lookup tables. Rather, it examines individual token morphology and each token's context within the sentence using a series of neural networks. Furthermore, because of the varying tendencies of the many human annotators which are found among the AGDT treebanks, Angel considered annotators as a feature during training. Consequently, while running inference, an annotator must be chosen for the tagger to emulate. "Vanessa Gorman" is the default choice as her annotation style is up to date and she is currently the single most prolific annotator. 
 
 ## Accuracy
 Partially imitating the assessment criteria used by [Barbara McGillivray and Alessandro Vatri](https://www.researchgate.net/publication/328791830_The_Diorisis_Ancient_Greek_Corpus) in developing the state-of-the-art (91% POS accuracy) tagger used for their Diorisis corpus, Angel was trained on 26 works in the [AGDT 2.1 treebank](https://github.com/PerseusDL/treebank_data/tree/master/v2.1/Greek), while 7 works were reserved for validation during training. Though Diorisis trained on roughly 50% more data (from the [PROIEL treebanks](https://github.com/proiel/proiel-treebank/)), Angel outperformed it, scoring 95.5% accuracy on part-of-speech prediction in the validation set. That score was further confirmed by testing on the first five works within the [Gorman treebanks](https://github.com/perseids-publications/gorman-trees), where it scored 95.7% part-of-speech accuracy, earning it state-of-the-art status by a significant margin.