Ability to (easily) disable sentence splitter #14

Open
graus opened this issue Nov 27, 2014 · 1 comment

Comments

@graus
graus commented Nov 27, 2014

Given the max n-gram param, we should be able to disable sentence splitting (which is in place to avoid extracting infinitely long n-grams). That could be useful for parsing the Chinese Wikipedia and the like.

@larsmans
Contributor

Actually, the sentence splitter is there to avoid generating cross-sentence n-grams. But I don't know how much of a problem those are. N-gram length is bounded anyway.
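
To make the trade-off concrete, here is a rough sketch of what an opt-out could look like. It is not this project's actual code: `split_sentences`, `extract_ngrams` and the `sentence_split` flag are hypothetical names, the regex splitter is a naive stand-in for a real one, and tokens are assumed to be whitespace-separated (which would not hold for Chinese text).

```python
import re


def split_sentences(text):
    """Naive, hypothetical splitter: break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def extract_ngrams(text, max_n, sentence_split=True):
    """Return all n-grams of 1..max_n whitespace tokens.

    With sentence_split=True, n-grams never cross a sentence boundary.
    With sentence_split=False, the whole text is treated as one unit,
    so cross-sentence n-grams appear, but length is still capped by max_n.
    """
    units = split_sentences(text) if sentence_split else [text]
    result = []
    for unit in units:
        tokens = unit.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                result.append(" ".join(tokens[i:i + n]))
    return result


text = "A b. C d."
with_split = extract_ngrams(text, max_n=2)
without_split = extract_ngrams(text, max_n=2, sentence_split=False)
```

In this sketch, turning the splitter off only adds n-grams that straddle a sentence boundary (here `"b. C"`). No n-gram gets longer than `max_n` tokens either way.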
