Python Libraries

Package Manager

pip (installs from PyPI, the Python Package Index)
poetry
pdm
uv

HTTP & Networking

requests
aiohttp

Parser

beautifulsoup4
m3u8

Browser Automation

selenium
playwright

Crypto

pycrypto (unmaintained; superseded by pycryptodome)
pycryptodome

OCR

Tesseract (⭐58.1k). An open-source OCR engine written in C++, commonly used from Python through wrappers such as pytesseract. It supports over 100 languages and can extract text from various image formats.
pytesseract (⭐5.5k). A Python wrapper around the Tesseract OCR engine. It provides a simple interface for extracting text from images and requires a separate Tesseract installation.
OpenCV (⭐75.5k). A computer vision library that can be used for OCR tasks. It provides functions for image preprocessing, text detection, and character recognition.
EasyOCR (⭐22k). A deep-learning OCR library for Python built on PyTorch. It supports more than 80 languages and ships pre-trained models for extracting text from images. It is known for its ease of use and accuracy.
ddddocr (⭐8.3k)
Doctr (⭐3.1k)
Keras-OCR (⭐1.3k)
GOCR. An open-source OCR engine written in C. It ships as a command-line program, so from Python it is typically run as a subprocess.
//OCRopus. A collection of document analysis and OCR tools, with features for document layout analysis and text extraction.
//PyOCR. Another Python wrapper for OCR engines. It supports Tesseract and CuneiForm.
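
To make the wrapper relationship between pytesseract and Tesseract concrete, here is a minimal sketch. It assumes the pytesseract and Pillow packages are installed and that the tesseract binary is on the PATH. The helper names (build_tesseract_config, extract_text) and the default page segmentation mode are illustrative choices, not part of either library.

```python
from pathlib import Path


def build_tesseract_config(psm: int = 6, oem: int = 3) -> str:
    """Build the extra command-line flags forwarded to the tesseract binary.

    --psm selects the page segmentation mode, --oem the OCR engine mode.
    The defaults here are an assumption for illustration only.
    """
    return f"--oem {oem} --psm {psm}"


def extract_text(image_path: str, lang: str = "eng", psm: int = 6) -> str:
    """Run Tesseract on one image and return the recognized text."""
    # Imported lazily so this module still loads when the OCR stack is absent.
    import pytesseract
    from PIL import Image

    with Image.open(Path(image_path)) as image:
        return pytesseract.image_to_string(
            image,
            lang=lang,
            config=build_tesseract_config(psm=psm),
        )
```

Calling extract_text("scan.png") with a hypothetical file name would return the recognized text as a string. The quality of the result depends heavily on image preprocessing, which is where a library such as OpenCV usually comes in.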

API

Google Cloud Vision API. A cloud-based image analysis service from Google that includes OCR. It also offers image classification, object detection, and handwriting recognition.
Microsoft Azure Computer Vision API. Another cloud-based service with OCR capabilities, alongside other computer vision features such as image tagging and face detection.
Amazon Textract. A machine-learning-based OCR service from Amazon Web Services. It extracts text and structured data, such as forms and tables, from scanned documents and invoices.
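
The cloud services return structured responses rather than plain strings. As a rough sketch of what that looks like with Amazon Textract, the code below calls the DetectDocumentText operation through boto3 and keeps only the LINE blocks. It assumes boto3 is installed and AWS credentials are configured. The function names and the default region are hypothetical choices for this example.

```python
from typing import Any


def lines_from_textract_response(response: dict[str, Any]) -> list[str]:
    """Collect the text of every LINE block in a DetectDocumentText response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def detect_lines(image_path: str, region_name: str = "us-east-1") -> list[str]:
    """Send one image to Textract and return its recognized lines of text."""
    # Imported lazily so the parsing helper works without boto3 installed.
    import boto3

    client = boto3.client("textract", region_name=region_name)
    with open(image_path, "rb") as handle:
        response = client.detect_document_text(Document={"Bytes": handle.read()})
    return lines_from_textract_response(response)
```

Keeping the response parsing in its own function makes it easy to test against a saved response without calling the service.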

Date and Time

arrow

Data Access

Connection Pool

dbutils

MySQL

PyMySQL
mysqlclient
aiomysql

ORM

peewee
Django ORM
SQLAlchemy
PonyORM
SQLObject
Tortoise ORM

Elasticsearch

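elasticsearch. The official low-level Python client for Elasticsearch.
elasticsearch7. The official client for the 7.x major version, published under its own package name so that several major versions can be installed side by side.
elasticsearch8. The official client for the 8.x major version, published under its own package name.
elasticsearch-dsl. A high-level library that helps with writing and running queries against Elasticsearch. It is built on top of the official low-level client (elasticsearch-py).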
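
To illustrate the difference between the low-level client and elasticsearch-dsl, the sketch below builds the same match query both ways. It assumes an 8.x client, a node at http://localhost:9200, and a hypothetical index named articles with a title field. None of these come from the list above.

```python
from typing import Any


def build_match_query(field: str, text: str) -> dict[str, Any]:
    """Build a raw match query clause, as the low-level client expects it."""
    return {"match": {field: text}}


def search_titles_low_level(text: str, index: str = "articles") -> list[dict]:
    """Run the query with the official low-level client (8.x keyword style)."""
    # Imported lazily so the query builder works without the client installed.
    from elasticsearch import Elasticsearch

    client = Elasticsearch("http://localhost:9200")  # assumed local node
    response = client.search(index=index, query=build_match_query("title", text))
    return [hit["_source"] for hit in response["hits"]["hits"]]


def search_titles_dsl(text: str, index: str = "articles") -> list[dict]:
    """Run the same query through the elasticsearch-dsl Search object."""
    from elasticsearch import Elasticsearch
    from elasticsearch_dsl import Search

    client = Elasticsearch("http://localhost:9200")  # assumed local node
    search = Search(using=client, index=index).query("match", title=text)
    return [hit.to_dict() for hit in search.execute()]
```

The DSL version avoids hand-written nested dictionaries, which tends to matter more as queries grow to include filters and aggregations.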

Scientific Computing

NumPy

Data Science

Polars
Pandas

Artificial Intelligence

PyTorch