Package Manager
- pip (the installer for packages hosted on PyPI)
- poetry
- pdm
- uv
- requests
- aiohttp
- beautifulsoup4
- m3u8
- selenium
- playwright
- pycrypto
- pycryptodome
OCR
- Tesseract (⭐58.1k). Tesseract is an open-source OCR engine written in C++; from Python it is usually driven through a wrapper such as pytesseract. It supports over 100 languages and can extract text from various image formats.
- pytesseract (⭐5.5k). pytesseract is a Python wrapper around the Tesseract OCR engine. It provides a simple interface for extracting text from images.
- OpenCV (⭐75.5k). OpenCV is a computer vision library that can be used for OCR tasks. It provides functions for image preprocessing, text detection, and character recognition.
- EasyOCR (⭐22k). EasyOCR is a deep-learning-based OCR library for Python. It supports more than 80 languages and ships pre-trained models for extracting text from images.
- ddddocr (⭐8.3k)
- Doctr (⭐3.1k)
- Keras-OCR (⭐1.3k)
- GOCR: GOCR is an OCR engine written in C. From Python it is typically invoked as an external command-line program (gocr).
- //OCRopus: OCRopus is a collection of document analysis and OCR tools. It includes the Tesseract OCR engine and provides additional features for document layout analysis and text extraction.
- //PyOCR: PyOCR is another Python OCR wrapper. It supports multiple OCR engines, including Tesseract and Cuneiform.
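To make the relationship between pytesseract and Tesseract concrete, here is a minimal sketch. It assumes the Tesseract binary and the pytesseract and Pillow packages are installed. The helper names `build_config` and `extract_text` are hypothetical, and the default segmentation and engine modes are illustrative assumptions. The third-party imports are deferred into the function so the file still loads where the OCR stack is missing.

```python
def build_config(psm: int = 6, oem: int = 3) -> str:
    """Build the extra command-line flags that pytesseract forwards to Tesseract.

    --psm selects the page segmentation mode, --oem the OCR engine mode.
    The defaults are illustrative assumptions, not recommendations.
    """
    return f"--oem {oem} --psm {psm}"


def extract_text(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on one image file and return the recognized text."""
    # Deferred imports: both packages are third-party and optional.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang, config=build_config())
```

A call such as `extract_text("scan.png")` (the file name is a placeholder) would return the recognized text as a string.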
API
- Google Cloud Vision API: a cloud-based image analysis service from Google. Alongside OCR, it offers image classification, object detection, and handwriting recognition.
- Microsoft Azure Computer Vision API: a cloud-based service from Microsoft. It provides OCR along with other computer vision features such as image tagging and face recognition.
- Amazon Textract: a machine-learning-based OCR service from Amazon Web Services. It extracts text and structured data from scanned documents such as invoices, forms, and tables.
- arrow
Connection Pool
- dbutils
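The idea behind a connection pool can be sketched with the standard library alone. This is not the dbutils API; `SimplePool` and its methods are hypothetical names, and sqlite3 stands in for a real database server purely so the example runs anywhere.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool:
    """A fixed-size pool that hands out and takes back open connections."""

    def __init__(self, database: str, size: int = 2) -> None:
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 1.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection so other callers can reuse it.
            self._idle.put(conn)

    def idle_count(self) -> int:
        """Number of connections currently waiting in the pool."""
        return self._idle.qsize()


pool = SimplePool(":memory:", size=2)
with pool.connection() as conn:
    value = conn.execute("SELECT 1 + 1").fetchone()[0]
```

The point of the design is that connections are opened once and reused, so callers pay the connection cost only at startup and the pool size caps concurrent load on the database.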
MySQL
- PyMySQL
- mysqlclient
- aiomysql
ORM
- peewee
- Django ORM
- SQLAlchemy
- PonyORM
- SQLObject
- Tortoise ORM
Elasticsearch
- elasticsearch. Official Python client for Elasticsearch.
- elasticsearch7
- elasticsearch8
- elasticsearch-dsl. A high-level library for writing and running queries against Elasticsearch, built on top of the official low-level client (elasticsearch-py).
- NumPy
- Polars
- Pandas
- PyTorch
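Returning to the Elasticsearch entries above, the following sketch contrasts a hand-written query body, as the low-level client would accept it, with the same query built through elasticsearch-dsl. The function names and the field name `title` are hypothetical. The elasticsearch-dsl import is deferred so the file loads without the package, and no cluster is contacted.

```python
def match_query_body(field: str, text: str) -> dict:
    """Raw request body in the Elasticsearch query DSL for a match query."""
    return {"query": {"match": {field: text}}}


def match_query_with_dsl(field: str, text: str) -> dict:
    """Build a match query via elasticsearch-dsl's Search object."""
    # Deferred import: elasticsearch-dsl is a third-party package.
    from elasticsearch_dsl import Search

    # to_dict() serializes the query without contacting a cluster.
    return Search().query("match", **{field: text}).to_dict()


body = match_query_body("title", "python")
```

With elasticsearch-dsl installed, `match_query_with_dsl("title", "python")` should serialize to the same structure as `body`; the high-level form becomes more useful as queries gain filters and aggregations.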