Commit d181905

committed
1 parent c924822 commit d181905

1 file changed, +2 -1 lines changed

src/sentences/sentence_splitting.jl

Lines changed: 2 additions & 1 deletion
@@ -120,7 +120,8 @@ function postproc_splits(sentences::AbstractString)
 sentences = replace(sentences, r"(\bMs\.)\n" => s"\1 ")
 sentences = replace(sentences, r"(\bMrs\.)\n" => s"\1 ")
 
-
+# no sentence break in between two words with no punctuation
+sentences=replace(sentences,r"([a-zA-Z0-9])\n([a-zA-Z0-9])"=>s"\1 \2")
 
 
 # possible TODO: filter excessively long / short sentences
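
The effect of the added rule can be checked in isolation. The following is a minimal sketch, assuming only Julia's Base `replace`; the helper name `join_unpunctuated` is hypothetical and is not part of the commit, which applies the same pattern inline inside `postproc_splits`.

```julia
# Hypothetical helper (not in the commit) wrapping the regex the diff adds.
function join_unpunctuated(text::AbstractString)
    # A newline with an ASCII letter or digit on both sides becomes a space.
    return replace(text, r"([a-zA-Z0-9])\n([a-zA-Z0-9])" => s"\1 \2")
end

# Newline between two words with no punctuation: the break is removed.
println(repr(join_unpunctuated("the quick\nbrown fox")))

# Newline preceded by a period: the pattern does not match, the break stays.
println(repr(join_unpunctuated("It ended.\nThen it began")))
```

Because the character class is limited to `a-zA-Z0-9`, a break that follows punctuation or a non-ASCII letter is left untouched by this rule.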
