This project demonstrates Optical Character Recognition (OCR) on images using Python, Tesseract, and OpenCV.
- Python
- OpenCV
- Tesseract
- Install the required Python packages (OpenCV and pytesseract).
- Download and install Tesseract for your operating system.
- Update the `tesseract_cmd` variable in `run.py` with the path to the Tesseract executable on your system.
- Place the image you want to perform OCR on in the `img` directory.
- Update the `cv2.imread` call in `run.py` with the path to your image.
- Run the script: `python run.py`
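
The `tesseract_cmd` step above can be made less error-prone by resolving the executable path before handing it to pytesseract. The following is a minimal sketch, not the contents of `run.py`: the helper names `find_tesseract` and `configure_pytesseract` are hypothetical, and it assumes the executable is either at a path you supply or discoverable on your `PATH`.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_tesseract(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return a path to the Tesseract executable, or None if not found.

    `explicit_path` is a hypothetical parameter: the location you would
    otherwise hard-code in `tesseract_cmd`.
    """
    # Prefer the explicit path when it points at an existing file.
    if explicit_path and Path(explicit_path).is_file():
        return explicit_path
    # Otherwise fall back to searching the PATH environment variable.
    return shutil.which("tesseract")


def configure_pytesseract(explicit_path: Optional[str] = None) -> str:
    """Point pytesseract at the resolved executable and return its path."""
    # Imported lazily so find_tesseract() works without pytesseract installed.
    import pytesseract

    path = find_tesseract(explicit_path)
    if path is None:
        raise FileNotFoundError(
            "Tesseract executable not found; install it or pass its full path."
        )
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling `configure_pytesseract()` once at the top of the script would replace a hard-coded assignment. Pass the full path as the argument if Tesseract is not on your `PATH`.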
The script will load the image, convert it to grayscale (optional, depending on the image), and then apply OCR using Tesseract. The resulting text will be printed to the console.
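
The load, grayscale, and OCR steps described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual `run.py`: the function name `ocr_image` and its parameters are hypothetical, and the default `lang="por"` only reflects that the script targets Portuguese.

```python
from pathlib import Path


def ocr_image(image_path: str, lang: str = "por", grayscale: bool = True) -> str:
    """Load an image, optionally convert it to grayscale, and return the OCR text."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")

    # Imported here so the path check above runs even if the OCR
    # dependencies are not installed yet.
    import cv2
    import pytesseract

    # cv2.imread returns None (it does not raise) when it cannot decode a file.
    image = cv2.imread(str(path))
    if image is None:
        raise ValueError(f"OpenCV could not read the image: {path}")

    if grayscale:
        # Grayscale often helps on noisy or colored backgrounds.
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    else:
        # OpenCV loads images as BGR; convert to RGB before passing a color image on.
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    return pytesseract.image_to_string(image, lang=lang)
```

For example, `print(ocr_image("img/example.png"))` would print the recognized text, where `img/example.png` is a placeholder file name. Whether grayscale conversion improves the result depends on the image, so it is left as a switch.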
- The current script is set to recognize Portuguese. To use another language, change the `lang` parameter in the `pytesseract.image_to_string` call to the appropriate language code. The list of supported language codes is in the Tesseract documentation.
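
As a rough sketch of how the `lang` value might be built: Tesseract uses three-letter codes such as `por` (Portuguese) and `eng` (English), and several codes can be combined with `+`. The helper names `build_lang` and `missing_languages` below are hypothetical, and the second one assumes a pytesseract version that provides `get_languages`.

```python
from typing import List


def build_lang(*codes: str) -> str:
    """Join one or more Tesseract language codes into a `lang` string."""
    cleaned = [code.strip() for code in codes if code and code.strip()]
    if not cleaned:
        raise ValueError("At least one language code is required.")
    # Tesseract accepts multiple languages separated by '+', e.g. "por+eng".
    return "+".join(cleaned)


def missing_languages(lang: str) -> List[str]:
    """Return the codes in `lang` that have no installed language data."""
    # Imported lazily; assumes a pytesseract release that has get_languages().
    import pytesseract

    installed = set(pytesseract.get_languages(config=""))
    return [code for code in lang.split("+") if code not in installed]
```

For instance, `build_lang("por", "eng")` produces a value that can be passed as `lang=` to `pytesseract.image_to_string`. Each language also needs its trained data file installed alongside Tesseract.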
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License.