Usage of custom EasyOCR (TableFormer, DocLayNet, etc.) model weights to improve domain-specific documents #586

bit-scientist · 2024-12-13T02:15:04Z

I would like some advice on the steps needed to replace the default EasyOCR (TableFormer, DocLayNet, etc.) weights with custom fine-tuned or trained ones to improve performance. As I understand it, the models are downloaded automatically on first use by default.
What do I need to do in order to use my own EasyOCR model (or customized doclayout, tableformer models)?

The reason I need this is that I have prepared a custom dataset with ground-truth labels that follows EasyOCR's training guidelines, and I am about to train my own EasyOCR model. I would then like to integrate it into docling as well.
Will the custom model be structurally identical to the default model, so that it can be integrated easily?

dolfim-ibm · 2024-12-13T07:51:25Z

Maintainer

For EasyOCR, we expose the parameters which would allow you to bring your custom model, see https://ds4sd.github.io/docling/reference/pipeline_options/#docling.datamodel.pipeline_options.EasyOcrOptions.

For the Layout and TableFormer models, the model path can be specified as in https://ds4sd.github.io/docling/usage/#provide-specific-artifacts-path
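
To make the two pointers above concrete, here is a minimal sketch of how they might be combined in one converter. It assumes the `EasyOcrOptions` fields `model_storage_directory` and `download_enabled` and the `artifacts_path` field of `PdfPipelineOptions` as described on the linked pages; the helper names `easyocr_option_kwargs` and `build_converter` are made up for illustration, and field names may differ between docling versions, so check the reference for the version you have installed.

```python
def easyocr_option_kwargs(model_dir: str) -> dict:
    """Keyword arguments for EasyOcrOptions that point at a local model directory.

    Hypothetical helper, not part of docling.
    """
    return {
        "model_storage_directory": model_dir,
        # Assumption: disable downloads so only the local weights are used.
        "download_enabled": False,
    }


def build_converter(artifacts_path: str, easyocr_model_dir: str):
    """Build a DocumentConverter that uses local layout/table and OCR weights."""
    # Imported lazily so the helper above can be used without docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import EasyOcrOptions, PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    # artifacts_path: directory holding the Layout and TableFormer weights.
    pipeline_options = PdfPipelineOptions(artifacts_path=artifacts_path)
    pipeline_options.do_ocr = True
    pipeline_options.ocr_options = EasyOcrOptions(
        **easyocr_option_kwargs(easyocr_model_dir)
    )
    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
```

Calling `build_converter("path/to/artifacts", "path/to/easyocr_models")` would then return a converter whose `convert(...)` method can be used as usual; both paths here are placeholders.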

4 replies

ninedesu Dec 13, 2024

Could you explain more about what we can do with the artifacts path? I'm still new to this, so I don’t fully understand yet.

I noticed that the table I'm extracting has two rows detected as one, so two values end up in a single cell. I believe this is caused by the TableFormer model rather than the OCR, since I haven't found any errors in the values or characters themselves. For reference, I'm trying to extract a complex financial statement.

dolfim-ibm Dec 13, 2024
Maintainer

At the moment, the artifacts path only allows swapping the model weights; the replacement models must keep the same input/output format as the default ones.
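
One rough way to sanity-check that constraint before swapping weights is to compare parameter names and shapes between the default and the custom checkpoint. The sketch below is only an illustration: `diff_checkpoints` is a hypothetical helper, it works on plain `{parameter_name: shape}` mappings that you would have to extract yourself (how depends on the checkpoint format), and matching names and shapes is a necessary check, not a guarantee that the model behaves the same way.

```python
def diff_checkpoints(default_shapes: dict, custom_shapes: dict) -> list:
    """Return human-readable mismatches between two {param_name: shape} maps."""
    problems = []
    default_names = set(default_shapes)
    custom_names = set(custom_shapes)

    # Parameters the default model has but the custom checkpoint lacks.
    for name in sorted(default_names - custom_names):
        problems.append(f"missing in custom: {name}")

    # Parameters that only exist in the custom checkpoint.
    for name in sorted(custom_names - default_names):
        problems.append(f"unexpected in custom: {name}")

    # Shared parameters whose tensor shapes differ.
    for name in sorted(default_names & custom_names):
        expected = tuple(default_shapes[name])
        actual = tuple(custom_shapes[name])
        if expected != actual:
            problems.append(f"shape mismatch for {name}: {expected} vs {actual}")

    return problems
```

An empty list means the two checkpoints at least line up structurally; any entry points at a layer that was added, removed, or resized during fine-tuning.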

ninedesu Dec 14, 2024

Alright, I would like to try a custom fine-tuned model in the EasyOCR pipeline. Should I specify the path to the custom model like this?

ocr_options = EasyOcrOptions(model_storage_directory="path/to/custommodelpath")

From my understanding of the instructions for using a custom model in EasyOCR (as outlined here), I need to place the .pth, .yaml, and .py files for the custom model in the appropriate EasyOCR directory on my laptop. Then, I would call it using:
reader = easyocr.Reader(['en'], recog_network='custom_example')
Should I follow the same approach in this case?
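
For reference, the file layout described above can be checked with a small standard-library sketch. It assumes the layout from EasyOCR's custom model instructions (the .pth file in the model directory, the .yaml and .py files in the user network directory); `missing_custom_model_files` is a hypothetical helper, not part of EasyOCR or docling.

```python
from pathlib import Path


def missing_custom_model_files(name: str, model_dir: str, user_network_dir: str) -> list:
    """List the files a custom EasyOCR recognition network still lacks.

    Assumed layout: <model_dir>/<name>.pth plus <user_network_dir>/<name>.yaml
    and <user_network_dir>/<name>.py.
    """
    expected = [
        Path(model_dir) / f"{name}.pth",          # trained weights
        Path(user_network_dir) / f"{name}.yaml",  # network configuration
        Path(user_network_dir) / f"{name}.py",    # network definition
    ]
    return [str(path) for path in expected if not path.is_file()]
```

If the returned list is empty, the same two directories can be passed to `easyocr.Reader` through its `model_storage_directory` and `user_network_directory` parameters, together with `recog_network='custom_example'`.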

dolfim-ibm Dec 16, 2024
Maintainer

I think you are saying one would need to add the recog_network option for EasyOCR, correct?
