Tesseract output improvement #5

Open
halfguru opened this issue Oct 1, 2019 · 1 comment

halfguru commented Oct 1, 2019

Hi,

First of all, thank you for your work. I was looking for OCR projects because it's very difficult to find English subtitles for Chinese YouTube shows.

I'm wondering whether you've tried improving the Tesseract output with image preprocessing techniques, as illustrated here. The use_fullframe argument could be replaced by specific rectangular coordinates, so that only the subtitle region is sent to OCR. The Tesseract wiki also indicates that dark text on a light background is preferable, so an option to invert the colors could help. Binarisation could further isolate the subtitles. Finally, I believe adding the --psm 6 option to the Tesseract config, which tells Tesseract to assume a single uniform block of text, would be beneficial.
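To make the suggestions concrete, here is a rough sketch of the preprocessing steps (crop, invert, binarise) followed by a Tesseract call with `--psm 6`. It is not taken from this project's code: the function names, the `box` parameter, the fixed threshold of 128, and the `chi_sim` language code are all assumptions for illustration, and a real implementation would more likely use OpenCV or NumPy arrays than nested lists.

```python
from typing import List, Tuple

# A grayscale frame as rows of 0-255 pixel values (illustrative representation).
Gray = List[List[int]]


def crop(img: Gray, box: Tuple[int, int, int, int]) -> Gray:
    """Keep only the rectangle (left, top, right, bottom); right/bottom are exclusive."""
    left, top, right, bottom = box
    return [row[left:right] for row in img[top:bottom]]


def invert(img: Gray) -> Gray:
    """Swap light and dark so light subtitles become dark text."""
    return [[255 - p for p in row] for row in img]


def binarize(img: Gray, threshold: int = 128) -> Gray:
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


def preprocess(frame: Gray, box: Tuple[int, int, int, int],
               light_text: bool = True, threshold: int = 128) -> Gray:
    """Crop to the subtitle region, optionally invert, then binarise."""
    region = crop(frame, box)
    if light_text:
        # Hypothetical option: subtitles are light on a dark background.
        region = invert(region)
    return binarize(region, threshold)


def ocr(img: Gray, lang: str = "chi_sim") -> str:
    """Run Tesseract on a preprocessed region, treating it as one block of text."""
    # Imported lazily so the preprocessing helpers work without these packages.
    import pytesseract
    from PIL import Image

    height, width = len(img), len(img[0])
    pixels = bytes(p for row in img for p in row)
    pil_img = Image.frombytes("L", (width, height), pixels)
    # --psm 6: assume a single uniform block of text.
    return pytesseract.image_to_string(pil_img, lang=lang, config="--psm 6")
```

A call would look like `ocr(preprocess(frame, (left, top, right, bottom)))`, with the coordinates chosen to cover the area where the subtitles appear. A fixed threshold is the simplest choice; an adaptive method such as Otsu's may work better on frames with uneven backgrounds.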

@mongy910

@halfguru These are really good insights. In the year since you posted this, have you found any better solutions? I have the same use case as you (reading Chinese soft captions).
