This is a step-by-step guide to improving OCR accuracy through image enhancement tools.

jzou1995/Optimizing-PDF-OCR-Accuracy-through-Image-Enhancement


  1. First, convert your PDF into images (e.g. JPEG or PNG files) using Python or software such as Adobe Acrobat, then save all the images to one folder.

  2. Download a free open-source tool called ComicEnhancerPro_eng (for the download link, see the creator's blog at https://www.cnblogs.com/stronghorse/p/14594337.html).

  3. Drag one image into the ComicEnhancerPro_eng window.

  4. Try out the tool's enhancement features, including USM sharpening, Gamma, JPG QLT, bold, and auto level. Depending on the quality of the scan, you may need to adjust these settings. Overdoing the enhancement can also hurt quality, so aim for a result that is comfortable for human eyes to read, which tends to lead to high-quality OCR results.

  5. Click File, choose Set DPI, then choose the folder to set all images to 300 DPI.

  6. Once you have settled on the level of enhancement you need (step 4), click File, then Batch Process. Browse to the folder that contains all the images, click Process All, and agree to overwrite the existing image files. This applies all the enhancement settings from step 4 to every image.

  7. Now download a free open-source tool called "Image to pdf or xps". Import all your processed images into it to combine them into a single PDF, then choose the output location and file name. This software converts the images to PDF without degrading the enhancements. This step is critical to ensuring good OCR results.

  8. You now have an optimized PDF that will have higher OCR accuracy than before.
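
Step 1 mentions Python as one way to turn the PDF into images. The guide does not name a library, so the following is only a minimal sketch that assumes PyMuPDF is installed (`pip install pymupdf`). The helper names `page_image_name` and `pdf_to_images` are hypothetical. Pages are rendered at 300 DPI to match step 5, and file names are zero-padded so the images stay in page order when they are imported in step 7.

```python
from pathlib import Path


def page_image_name(page_number: int, total_pages: int, ext: str = "png") -> str:
    """Return a zero-padded file name (e.g. page_007.png) so files sort in page order."""
    width = max(3, len(str(total_pages)))
    return f"page_{page_number:0{width}d}.{ext}"


def pdf_to_images(pdf_path: str, out_dir: str, dpi: int = 300) -> list[Path]:
    """Render every page of pdf_path to a PNG file in out_dir and return the paths."""
    # Assumption: PyMuPDF is the rendering library; the guide does not specify one.
    import fitz  # PyMuPDF

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with fitz.open(pdf_path) as doc:
        total = doc.page_count
        for index, page in enumerate(doc, start=1):
            pix = page.get_pixmap(dpi=dpi)  # rasterize the page at the requested DPI
            target = out / page_image_name(index, total)
            pix.save(str(target))  # output format is taken from the .png extension
            written.append(target)
    return written
```

For example, `pdf_to_images("scan.pdf", "pages")` would write `pages/page_001.png`, `pages/page_002.png`, and so on, where `scan.pdf` and `pages` are placeholder names. PNG is used here because it is lossless, so no compression artifacts are introduced before the enhancement in step 4.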
