Download Tesseract 4.0 with "missing dlls" from here. Then, within the code, point pytesseract at your Tesseract executable by setting pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' (adjust the path if you installed Tesseract elsewhere). Remember to add Tesseract to your PATH.
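As a rough sketch of the setup step above, the snippet below looks for Tesseract on PATH first and only falls back to the hard-coded Windows path if nothing is found. The helper name `resolve_tesseract` and the constant `DEFAULT_TESSERACT` are illustrative and not part of this project's code; the guarded import is only there so the snippet still loads when pytesseract is not installed.

```python
import shutil

# Default install location mentioned above; change it if yours differs.
DEFAULT_TESSERACT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def resolve_tesseract(default=DEFAULT_TESSERACT):
    """Return the tesseract executable found on PATH, else the default path.

    Hypothetical helper for illustration only.
    """
    return shutil.which("tesseract") or default


try:
    import pytesseract

    # Tell pytesseract which executable to call.
    pytesseract.pytesseract.tesseract_cmd = resolve_tesseract()
except ImportError:
    # pytesseract is not installed; install it before running the main code.
    pytesseract = None
```

If Tesseract is already on your PATH, the fallback path is never used.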
- Import the image
- Crop the image
- Find the edges of the documents with Canny Edge Detection
- Resize the bounding box of the document edges
- Determine the passport country with Color Detection
- Crop and threshold images of the attributes from the documents
- Use PyTesseract to read the cropped attributes
- Compare the attributes
- If all attributes match, the traveler has correct information. Otherwise, the traveler has incorrect information
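The edge-detection and comparison steps above could look roughly like the sketch below. It is not the project's main code: the function names, the Canny thresholds (50 and 150), the choice of the contour with the largest bounding box, and the attribute keys are all assumptions made for illustration. The sketch also assumes OCR output may differ in case and spacing, so it normalizes text before comparing.

```python
def find_document_box(image_bgr, low=50, high=150):
    """Return (x, y, w, h) of the largest edge contour, or None.

    Hypothetical helper; the thresholds are illustrative, not tuned values.
    """
    import cv2  # imported here so the comparison helpers work without OpenCV

    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)
    # [-2] picks the contour list under both OpenCV 3 and OpenCV 4.
    contours = cv2.findContours(
        edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )[-2]
    if not contours:
        return None
    boxes = [cv2.boundingRect(c) for c in contours]
    # Assume the document is the contour with the largest bounding box.
    return max(boxes, key=lambda box: box[2] * box[3])


def normalize(text):
    """Collapse whitespace and ignore case (assumed OCR noise)."""
    return " ".join(text.split()).upper()


def compare_attributes(first_doc, second_doc):
    """Compare every attribute that both documents share."""
    shared = first_doc.keys() & second_doc.keys()
    return {
        key: normalize(first_doc[key]) == normalize(second_doc[key])
        for key in shared
    }


def traveler_is_valid(first_doc, second_doc):
    """True only if at least one attribute is shared and all of them match."""
    results = compare_attributes(first_doc, second_doc)
    return bool(results) and all(results.values())


# Example with made-up attribute values.
passport = {"name": "Doe, John", "id": "AB123"}
permit = {"name": " doe, john ", "id": "ab123"}
if traveler_is_valid(passport, permit):
    print("Traveler has correct information")
else:
    print("Traveler has incorrect information")
```

Returning a per-attribute dictionary from `compare_attributes` makes it easy to report which field failed, rather than only the overall verdict.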
Main code
Files to train Tesseract on BMMini font
Files to train Tesseract on MiniKylie font
Contains all 69 screenshots used to test the code
Contains images of correct seals for the entry permit
Program used to correct Tesseract box files
Text file with all print statements and results from images in "Final_Dataset"