generated from greek-learner-texts/text-repository-template
Commit
0 parents
commit 1b3640a
Showing 9 changed files with 501 additions and 0 deletions.
@@ -0,0 +1,19 @@
name: Text validation

on: [push, pull_request]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v1
    - name: Set up Python 3.8
      uses: actions/setup-python@v1
      with:
        python-version: 3.8
    - name: Run text
      run: |
        pip install text-validator
        validate-text text-validator.toml text/*.txt
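The final workflow step can be approximated locally. The following is a minimal Python sketch, assuming `text-validator` is already installed and `validate-text` is on the PATH; the helper names `build_command` and `run_validation` are hypothetical and not part of the commit. In CI the shell expands `text/*.txt`; here the glob is expanded in Python and sorted.

```python
import glob
import subprocess


def build_command(config="text-validator.toml", pattern="text/*.txt"):
    """Return the validate-text argument list with the glob expanded and sorted."""
    files = sorted(glob.glob(pattern))
    return ["validate-text", config, *files]


def run_validation(config="text-validator.toml", pattern="text/*.txt"):
    """Run validate-text and return its exit status (0 when no files match)."""
    command = build_command(config, pattern)
    if len(command) == 2:
        # Nothing matched the pattern, so there is nothing to validate.
        return 0
    # Assumption: a non-zero exit status signals failure, which is what
    # would make the CI step fail.
    return subprocess.run(command).returncode
```

Calling `run_validation()` from the repository root before pushing should give roughly the same pass or fail signal as the workflow, under the assumptions above.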
@@ -0,0 +1,21 @@
# {{ NAME OF TEXT }}

{{ describe what is being done, the process being followed, and who is involved in the work }}

This text is being prepared as part of the [Greek Learner Texts Project](https://greek-learner-texts.org/).

## Contributors

* {{ list of people who have contributed to this repo }}

## Source

{{ indicate original source(s) of text: scans or existing transcriptions }}

## Progress

{{ indicate progress, or remove entire section if done }}

## License

This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).
@@ -0,0 +1 @@
Token-level analysis like lemmatisation or postagging can go here.
@@ -0,0 +1 @@
Generated HTML versions of the texts should go here.
@@ -0,0 +1 @@
Any original files (scans and transcriptions) can be placed here.
@@ -0,0 +1 @@
Any scripts used in the preparation of the texts can go here.
@@ -0,0 +1,28 @@
["text_validator.plugins.whitespace"]
CHECK_CRLF = true
CHECK_TABS = true
CHECK_TRAILING_WHITESPACE = true
CHECK_NO_EOF_NEWLINE = true

["text_validator.plugins.unicode"]
CONFIRM_UTF_8_NFC = true

["text_validator.plugins.ref_line_format"]
REF_REGEX = "\\d+\\.\\d+$"

["text_validator.plugins.characters"]
REPLACE_CHARS = [
    # bad character, suggested replacement
    ["\u02BC", "\u2019"],
    ["\u1FBF", "\u2019"],
    ["\u037E", "\u003B"],
    ["\u0387", "\u00B7"],
    ["\u0374", "\u02B9"],
    ["\u03D5", "\u03C6"],
    ["\u03D1", "\u03B8"],
]
TOKEN_REGEXES = [
    # each whitespace-separated token must match one of these regexes
    "\\d+\\.\\d+$",
    "[«(]*[\u0370-\u03FF\u1F00-\u1FFF]+\u2019?[.,:;»)·]*$",
]
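To make the intent of these settings concrete, here is a rough Python sketch of the kinds of checks the configuration describes. It is not the `text-validator` implementation: the function names `check_whitespace` and `check_line` are hypothetical, the patterns are assumed to be anchored at the start of each token (as with `re.match`), and the middle dot in the token pattern is assumed to be U+00B7.

```python
import re
import unicodedata

# Pairs copied from REPLACE_CHARS: (bad character, suggested replacement).
REPLACE_CHARS = [
    ["\u02BC", "\u2019"],
    ["\u1FBF", "\u2019"],
    ["\u037E", "\u003B"],
    ["\u0387", "\u00B7"],
    ["\u0374", "\u02B9"],
    ["\u03D5", "\u03C6"],
    ["\u03D1", "\u03B8"],
]
# Patterns copied from TOKEN_REGEXES (middle dot written as \u00B7, an assumption).
TOKEN_REGEXES = [
    r"\d+\.\d+$",
    "[«(]*[\u0370-\u03FF\u1F00-\u1FFF]+\u2019?[.,:;»)\u00B7]*$",
]


def check_whitespace(text):
    """Return whitespace problems for a whole file's text."""
    problems = []
    if "\r\n" in text:
        problems.append("CRLF line ending")
    if "\t" in text:
        problems.append("tab character")
    if any(line != line.rstrip() for line in text.splitlines()):
        problems.append("trailing whitespace")
    if text and not text.endswith("\n"):
        problems.append("no newline at end of file")
    return problems


def check_line(line):
    """Return Unicode, character, and token problems for one line."""
    problems = []
    if unicodedata.normalize("NFC", line) != line:
        problems.append("not NFC-normalised")
    for bad, good in REPLACE_CHARS:
        if bad in line:
            problems.append(f"U+{ord(bad):04X} found; suggest U+{ord(good):04X}")
    for token in line.split():
        # Assumption: a token passes if any pattern matches from its start.
        if not any(re.match(pattern, token) for pattern in TOKEN_REGEXES):
            problems.append(f"token {token!r} matches no TOKEN_REGEXES pattern")
    return problems
```

Under these assumptions, a line such as `"1.1 \u1F10\u03BD \u1F00\u03C1\u03C7\u1FC7,"` (a reference followed by two Greek words and a comma) produces no problems, while a line containing U+03D1 is reported with U+03B8 as the suggested replacement.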
@@ -0,0 +1,2 @@
* the prepared text in our textpart-per-line format with dotted ref
* can be multiple *.txt files but, if there is an inherent order to the files, this should be reflected in the sort order of the filenames
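As an illustration of the format described above, the following Python sketch reads such files in filename order. It assumes each line starts with a dotted reference (matching the `REF_REGEX` pattern from the validator config) followed by a single space and the text of that textpart; the names `parse_line` and `read_textparts` are hypothetical.

```python
import re
from pathlib import Path

# Same pattern as REF_REGEX in the validator config.
REF_PATTERN = re.compile(r"\d+\.\d+$")


def parse_line(line):
    """Split 'REF text' into (ref, text); raise ValueError on a malformed ref."""
    ref, _, text = line.rstrip("\n").partition(" ")
    if not REF_PATTERN.match(ref):
        raise ValueError(f"bad reference {ref!r}")
    return ref, text


def read_textparts(directory):
    """Yield (filename, ref, text) for each non-blank line, in filename sort order."""
    for path in sorted(Path(directory).glob("*.txt")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    ref, text = parse_line(line)
                    yield path.name, ref, text
```

Because the files are read in sorted order, zero-padded names such as `01.txt` and `02.txt` keep the textparts in their intended sequence.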