Usage of custom EasyOCR (TableFormer, DocLayNet, etc.) model weights to improve domain-specific documents #586
Closed
bit-scientist started this conversation in General
Replies: 1 comment 4 replies
For EasyOCR, we expose parameters that let you bring your own custom model; see https://ds4sd.github.io/docling/reference/pipeline_options/#docling.datamodel.pipeline_options.EasyOcrOptions. For the Layout and TableFormer models, the model path can be specified as described in https://ds4sd.github.io/docling/usage/#provide-specific-artifacts-path
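As a rough illustration of the two mechanisms mentioned above, the following sketch wires a local EasyOCR model directory and a local artifacts path into a PDF converter. It assumes that `EasyOcrOptions` accepts `lang`, `model_storage_directory`, and `download_enabled`, and that `PdfPipelineOptions` accepts `artifacts_path`, as the linked reference pages suggest; please check these against your installed docling version. The helper names `easyocr_option_kwargs` and `build_converter` are hypothetical and not part of docling.

```python
def easyocr_option_kwargs(model_dir, languages=("en",)):
    """Build keyword arguments for docling's EasyOcrOptions (hypothetical helper).

    Assumption: EasyOcrOptions exposes `lang`, `model_storage_directory`
    and `download_enabled`; verify against your docling version.
    """
    return {
        "lang": list(languages),
        # Directory that holds the custom EasyOCR weight files.
        "model_storage_directory": str(model_dir),
        # Do not fall back to downloading the default weights.
        "download_enabled": False,
    }


def build_converter(easyocr_model_dir, artifacts_path):
    """Create a DocumentConverter that uses local model files (hypothetical helper)."""
    # Imported lazily so the helper above can be used without docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import EasyOcrOptions, PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    # artifacts_path points at the Layout and TableFormer model files.
    pipeline_options = PdfPipelineOptions(artifacts_path=str(artifacts_path))
    pipeline_options.do_ocr = True
    pipeline_options.ocr_options = EasyOcrOptions(
        **easyocr_option_kwargs(easyocr_model_dir)
    )

    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
```

With placeholder paths, usage would look like `build_converter("/path/to/easyocr_models", "/path/to/docling_artifacts").convert("document.pdf")`. Whether fine-tuned weights load correctly depends on them matching the architecture that the corresponding loader expects.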
I would like some advice on the steps needed to replace the default EasyOCR (TableFormer, DocLayNet, etc.) weights with custom fine-tuned or newly trained ones to improve performance. As I understand it, the models are downloaded automatically on first use by default.
What do I need to do in order to use my own EasyOCR model (or customized layout and TableFormer models)?
The reason I need this is that I have prepared a custom dataset, with ground-truth labels that follow EasyOCR's training guidelines, and am about to train my own EasyOCR model. I would like to integrate it into docling as well.
Will the custom model be structurally identical to the default model, so that it can be integrated easily?