Releases · felixdittrich92/OnnxTR

15 Jan 14:10

felixdittrich92

v0.6.2

a2cc042

v0.6.2 Latest

What's Changed

NOTE: OnnxTR v0.6.2 requires Python >=3.10

Bug Fixes

[Fix] pathlib issue on Windows with older onnxruntime versions by @felixdittrich92 in #62

Publicly available pre-built Docker images: here

OnnxTR demo:

OnnxTR model collection: OnnxTR Hugging Face collection

Full Changelog: v0.6.1...v0.6.2

Contributors

felixdittrich92

13 Jan 07:39

felixdittrich92

v0.6.1

21a6e14

v0.6.1

What's Changed

NOTE: OnnxTR v0.6.1 requires Python >=3.10

Small fix for custom loaded detection models where assume_straight_pages=False wasn't set correctly.
Maintenance updates

Publicly available pre-built Docker images: here

OnnxTR demo:

OnnxTR model collection: OnnxTR Hugging Face collection

Full Changelog: v0.6.0...v0.6.1

23 Nov 13:34

felixdittrich92

v0.6.0

d9f8230

v0.6.0

What's Changed

NOTE: OnnxTR v0.6.0 requires Python >=3.10

New version specifiers

To further enhance OnnxTR as a go-to solution for production environments, two new installation options are introduced, tailored for OpenVINO-powered deployments:

pip install "onnxtr[openvino]"
pip install "onnxtr[openvino-headless]"  # same as "onnxtr[openvino]" but with opencv-headless

OpenVINO™ (Open Visual Inference and Neural Network Optimization) is an open-source toolkit developed by Intel to optimize and deploy AI inference across a variety of hardware. It is specifically designed for Intel architectures but supports multiple hardware targets, including CPUs, GPUs, NPUs, and so on. OpenVINO is particularly well-suited for applications requiring high-performance inference, such as computer vision, natural language processing, and edge AI scenarios.

The release notes report a significant performance boost with OpenVINO (benchmark figure not reproduced here).

Publicly available pre-built Docker images added

Can be found here

OnnxTR demo

OnnxTR Hugging Face collection

Full Changelog: v0.5.1...v0.6.0

17 Oct 10:27

felixdittrich92

v0.5.1

d17146c

v0.5.1

What's Changed

Improved result.synthesize()
Updated Hugging Face demo

Full Changelog: v0.5.0...v0.5.1

27 Sep 11:16

felixdittrich92

v0.5.0

8285068

v0.5.0

What's Changed

New version specifiers

To further establish OnnxTR as the choice for production scenarios, two new installation options were added:

pip install "onnxtr[cpu-headless]"  # same as "onnxtr[cpu]" but with opencv-headless
pip install "onnxtr[gpu-headless]"  # same as "onnxtr[gpu]" but with opencv-headless
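
The extras named across these release notes (cpu, gpu, openvino, and their -headless variants) follow a simple naming pattern. As a minimal sketch, the hypothetical helper below (`onnxtr_requirement` is an illustration, not part of OnnxTR) builds the pip requirement string for a given backend. It assumes that only the extras listed in these notes exist.

```python
# Hypothetical helper, NOT part of OnnxTR: builds a pip requirement string
# from the extras named in these release notes.
BACKENDS = ("cpu", "gpu", "openvino")


def onnxtr_requirement(backend: str, headless: bool = False) -> str:
    """Return a requirement such as 'onnxtr[gpu-headless]'."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend {backend!r}; expected one of {BACKENDS}")
    # The "-headless" variants use opencv-headless, per the release notes.
    extra = f"{backend}-headless" if headless else backend
    return f"onnxtr[{extra}]"


if __name__ == "__main__":
    requirement = onnxtr_requirement("openvino", headless=True)
    print(f'pip install "{requirement}"')
```

The headless variants are typically the better fit for servers and containers, where no GUI libraries are needed.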

Disable page orientation classification

If you deal with documents that contain only small rotations (roughly -45 to 45 degrees), you can disable page orientation classification to speed up inference.
This only has an effect with assume_straight_pages=False and/or straighten_pages=True and/or detect_orientation=True.

from onnxtr.models import ocr_predictor
model = ocr_predictor(assume_straight_pages=False, disable_page_orientation=True)

Disable crop orientation classification

If you deal with documents that contain only horizontal text, you can disable crop orientation classification to speed up inference.
This only has an effect with assume_straight_pages=False and/or straighten_pages=True.

from onnxtr.models import ocr_predictor
model = ocr_predictor(assume_straight_pages=False, disable_crop_orientation=True)
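
The conditions stated for the two disable flags can be written down as plain boolean logic. The sketch below is illustrative only: both function names are hypothetical, and the code encodes nothing beyond the "only has an effect" sentences in these notes. It is not OnnxTR's internal implementation.

```python
def page_orientation_flag_matters(
    *, assume_straight_pages: bool, straighten_pages: bool, detect_orientation: bool
) -> bool:
    """True if disable_page_orientation can change behavior, per the notes."""
    return (not assume_straight_pages) or straighten_pages or detect_orientation


def crop_orientation_flag_matters(
    *, assume_straight_pages: bool, straighten_pages: bool
) -> bool:
    """True if disable_crop_orientation can change behavior, per the notes."""
    return (not assume_straight_pages) or straighten_pages


if __name__ == "__main__":
    # Straight pages, no straightening, no orientation detection:
    # neither flag would make a difference under the stated conditions.
    print(page_orientation_flag_matters(
        assume_straight_pages=True, straighten_pages=False, detect_orientation=False))
    print(crop_orientation_flag_matters(
        assume_straight_pages=True, straighten_pages=False))
```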

Loading custom exported orientation classification models

Synchronized with docTR:

from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor, mobilenet_v3_small_page_orientation, mobilenet_v3_small_crop_orientation
from onnxtr.models.classification.zoo import crop_orientation_predictor, page_orientation_predictor
custom_page_orientation_model = mobilenet_v3_small_page_orientation("<PATH_TO_CUSTOM_EXPORTED_ONNX_MODEL>")
custom_crop_orientation_model = mobilenet_v3_small_crop_orientation("<PATH_TO_CUSTOM_EXPORTED_ONNX_MODEL>")

predictor = ocr_predictor(assume_straight_pages=False, detect_orientation=True)

# Overwrite the default orientation models
predictor.crop_orientation_predictor = crop_orientation_predictor(custom_crop_orientation_model)
predictor.page_orientation_predictor = page_orientation_predictor(custom_page_orientation_model)

FP16 Support

GPU-only feature (OnnxTR needs to run on GPU)
Added a script that converts the default FP32 models to FP16 (inputs and outputs remain FP32); this further speeds up inference on GPU and lowers the required VRAM
The script is available at: https://github.com/felixdittrich92/OnnxTR/blob/main/scripts/convert_to_float16.py
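
To illustrate where the VRAM saving comes from, the following standard-library sketch compares the storage size of FP32 and FP16 values and shows the rounding that FP16 introduces. It is not the conversion script linked above, and the parameter count in the demo is a made-up number chosen only for illustration.

```python
import struct

FP32_BYTES = struct.calcsize("<f")  # IEEE 754 single precision
FP16_BYTES = struct.calcsize("<e")  # IEEE 754 half precision


def weights_size_mib(num_params: int, bytes_per_param: int) -> float:
    """Storage needed for the weights alone, in MiB."""
    return num_params * bytes_per_param / (1024 ** 2)


def to_fp16(value: float) -> float:
    """Round-trip a Python float through half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


if __name__ == "__main__":
    num_params = 10_000_000  # hypothetical model size, for illustration only
    print("FP32 MiB:", weights_size_mib(num_params, FP32_BYTES))
    print("FP16 MiB:", weights_size_mib(num_params, FP16_BYTES))
    # FP16 has fewer mantissa bits, so most values are stored only approximately.
    print("0.1 stored as FP16:", to_fp16(0.1))
```

Halving the bytes per weight halves the weight storage; the cost is reduced precision, which is why accuracy should be checked after converting a model.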

Full Changelog: v0.4.1...v0.5.0

21 Aug 06:50

felixdittrich92

v0.4.1

4bebdea

v0.4.1

What's Changed

Fix: results with straighten_pages=True are now displayed correctly with .show()
Added numpy 2.0 support

New Contributors

@dependabot made their first contribution in #17

Full Changelog: v0.4.0...v0.4.1

Contributors

dependabot

16 Aug 10:23

felixdittrich92

v0.4.0

679c5d6

v0.4.0

What's Changed

Sync with current docTR state
Hugging Face Hub integration

HuggingFace Hub integration

Now you can load and/or push models to the hub directly.

Loading

from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor, from_hub

img = DocumentFile.from_images(['<image_path>'])
# Load your model from the hub
model = from_hub('onnxtr/my-model')

# Pass it to the predictor
# If your model is a recognition model:
predictor = ocr_predictor(
    det_arch='db_mobilenet_v3_large',
    reco_arch=model
)

# If your model is a detection model:
predictor = ocr_predictor(
    det_arch=model,
    reco_arch='crnn_mobilenet_v3_small'
)

# Get your predictions
res = predictor(img)

Push

from onnxtr.models import parseq, linknet_resnet18, push_to_hf_hub, login_to_hub
from onnxtr.utils.vocabs import VOCABS

# Login to the hub
login_to_hub()

# Recognition model
model = parseq("~/onnxtr-parseq-multilingual-v1.onnx", vocab=VOCABS["multilingual"])
push_to_hf_hub(
    model,
    model_name="onnxtr-parseq-multilingual-v1",
    task="recognition",  # The task for which the model is intended [detection, recognition, classification]
    arch="parseq",  # The name of the model architecture
    override=False  # Set to `True` if you want to override an existing model / repository
)

# Detection model
model = linknet_resnet18("~/onnxtr-linknet-resnet18.onnx")
push_to_hf_hub(
    model,
    model_name="onnxtr-linknet-resnet18",
    task="detection",
    arch="linknet_resnet18",
    override=True
)

HF Hub search: here.

Collection: here

Full Changelog: v0.3.2...v0.4.0

09 Jul 09:46

felixdittrich92

v0.3.2

7391db8

v0.3.2

What's Changed

Fix: Resize transformation / interpolation adjusted to docTR (#10 #22)

Full Changelog: v0.3.1...v0.3.2

28 Jun 06:16

felixdittrich92

v0.3.1

890ae43

v0.3.1

What's Changed

Minor configuration fix for CUDAExecutionProvider
Adjusted default batch sizes
Avoid initializing EngineConfig multiple times

Full Changelog: v0.3.0...v0.3.1

27 Jun 10:13

felixdittrich92

v0.3.0

04f5744

v0.3.0

What's Changed

Sync with current docTR state
Added advanced options to configure the underlying execution engine
Added new db_mobilenet_v3_large converted models (fp32 & 8bit)

Advanced engine configuration

from onnxruntime import SessionOptions

from onnxtr.models import ocr_predictor, EngineConfig

general_options = SessionOptions()  # For configuration options see: https://onnxruntime.ai/docs/api/python/api_summary.html#sessionoptions
general_options.enable_cpu_mem_arena = False

# NOTE: The following forces inference to run only on the GPU; if no GPU is available, an error is raised
# List of strings e.g. ["CUDAExecutionProvider", "CPUExecutionProvider"] or a list of tuples with the provider and its options e.g.
# [("CUDAExecutionProvider", {"device_id": 0}), ("CPUExecutionProvider", {"arena_extend_strategy": "kSameAsRequested"})]
providers = [("CUDAExecutionProvider", {"device_id": 0})]  # For available providers see: https://onnxruntime.ai/docs/execution-providers/

engine_config = EngineConfig(
    session_options=general_options,
    providers=providers
)
# We use the default predictor with the custom engine configuration
# NOTE: You can define different engine configurations for detection, recognition and classification depending on your needs
predictor = ocr_predictor(
    det_engine_cfg=engine_config,
    reco_engine_cfg=engine_config,
    clf_engine_cfg=engine_config
)
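
The comments in the configuration above mention two accepted shapes for the providers list: plain provider names and (name, options) tuples. As a small sketch, the hypothetical helper `normalize_providers` (not part of OnnxTR or onnxruntime) converts either shape into uniform (name, options) tuples, which can make a configuration easier to log or compare before it is passed to EngineConfig.

```python
from typing import Any, Dict, List, Tuple, Union

# A provider is either a bare name or a (name, options) tuple.
ProviderSpec = Union[str, Tuple[str, Dict[str, Any]]]


def normalize_providers(
    providers: List[ProviderSpec],
) -> List[Tuple[str, Dict[str, Any]]]:
    """Hypothetical helper: return every provider as a (name, options) tuple."""
    normalized: List[Tuple[str, Dict[str, Any]]] = []
    for spec in providers:
        if isinstance(spec, str):
            normalized.append((spec, {}))  # bare name: no options given
        else:
            name, options = spec
            normalized.append((name, dict(options)))  # copy to avoid aliasing
    return normalized


if __name__ == "__main__":
    mixed = [("CUDAExecutionProvider", {"device_id": 0}), "CPUExecutionProvider"]
    for name, options in normalize_providers(mixed):
        print(name, options)
```

Listing "CPUExecutionProvider" after the GPU provider, as in the mixed example, is the usual way to express a fallback order.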

Full Changelog: v0.2.0...v0.3.0
