Key value pairs? Forms? #216
ericfeunekes
started this conversation in
Ideas
Replies: 2 comments 9 replies
-
This is a great idea. I think Docling could definitely tackle this by adding a dedicated pipeline for government forms, as you mentioned. The technology already exists; I believe it is mostly a matter of tailoring the output.
-
Here are some forms you can try from the Canadian tax authority (the Canada Revenue Agency): https://www.canada.ca/en/revenue-agency/services/forms-publications/forms.html
You can switch between the French and English versions with the language link in the top right of the page.
-
I'm curious whether you have thoughts on how to use Docling for things like completed government forms, e.g., extracting all key-value (KV) pairs from a page.
It seems like a difficult problem, because a table-recognition model could easily confuse KV pairs with tables. I'm bringing it up because the extensibility of this library seems to offer a great opportunity to build something that actually works, particularly given the JSON output.
I'm not aware of any specific model that can do this yet, but even a moderately powerful VLM could be inserted somewhere in the pipeline to predict the KV-pair elements.
So, have you thought about how to integrate this? How would you build it into the pipeline as a prediction step, even if the initial proof of concept used a powerful closed-source model?
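To make the question concrete, here is a minimal sketch of the shape such a step could take. It does not use Docling's actual API: `PageText`, `KeyValuePair`, `colon_heuristic_predictor`, and `enrich_with_kv_pairs` are all hypothetical names, and the colon-splitting heuristic is only a stand-in for a real VLM call. The point is that the predictor is a plain callable, so a closed-source model could be swapped in for a proof of concept and the result attached to the JSON output.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class KeyValuePair:
    """One predicted key-value pair (hypothetical structure, not a Docling type)."""
    key: str
    value: str
    page: int


@dataclass(frozen=True)
class PageText:
    """Text lines already extracted from one page (hypothetical structure)."""
    page: int
    lines: List[str]


# Any callable with this signature can act as the prediction step,
# e.g. a wrapper around a VLM. This interface is an assumption.
KVPredictor = Callable[[PageText], List[KeyValuePair]]


def colon_heuristic_predictor(page: PageText) -> List[KeyValuePair]:
    """Stand-in for a model: treat 'Label: value' lines as KV pairs."""
    pairs = []
    for line in page.lines:
        key, sep, value = line.partition(":")
        # Keep only lines that have a colon and non-empty text on both sides.
        if sep and key.strip() and value.strip():
            pairs.append(KeyValuePair(key.strip(), value.strip(), page.page))
    return pairs


def enrich_with_kv_pairs(
    document: Dict, pages: List[PageText], predictor: KVPredictor
) -> Dict:
    """Return a copy of the JSON-like document with predicted KV pairs added."""
    enriched = dict(document)  # shallow copy; the input dict is not mutated
    enriched["key_value_pairs"] = [
        asdict(pair) for page in pages for pair in predictor(page)
    ]
    return enriched


if __name__ == "__main__":
    pages = [PageText(page=1, lines=["Name: Jane Doe", "Total income", "Year: 2023"])]
    print(enrich_with_kv_pairs({"name": "sample-form"}, pages, colon_heuristic_predictor))
```

Keeping the predictor behind a function signature means the heuristic, a hosted VLM, or a dedicated form model could be compared on the same forms without changing the rest of the step.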