Bug
Docling parses some PDFs successfully but then fails to write the Markdown file with the results:
UnicodeEncodeError: 'charmap' codec can't encode character '\u2217' in position 51: character maps to <undefined>
I was able to resolve this for this specific PDF by changing line 1941 of docling_core\types\doc\document.py, but the tests failed.
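For context, a 'charmap' error like this usually appears when Python writes text with a legacy single-byte encoding such as cp1252, which is a common default on Windows and has no mapping for '\u2217'. The sketch below is not the docling_core code. It assumes the failing line opens the output file without an explicit encoding, and `write_markdown` and `roundtrip_utf8` are hypothetical helpers used only to illustrate the idea.

```python
import tempfile
from pathlib import Path


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_markdown(path: Path, markdown: str) -> None:
    """Hypothetical helper: write Markdown with an explicit UTF-8 encoding.

    Passing encoding="utf-8" means the result no longer depends on the
    platform's locale encoding.
    """
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(markdown)


def roundtrip_utf8(markdown: str) -> str:
    """Write `markdown` to a temporary file and read it back as UTF-8."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "result.md"
        write_markdown(out, markdown)
        return out.read_text(encoding="utf-8")


if __name__ == "__main__":
    sample = "a \u2217 b"  # contains the character named in the traceback
    print("cp1252 can encode:", can_encode(sample, "cp1252"))
    print("utf-8 can encode:", can_encode(sample, "utf-8"))
    print("round trip matches:", roundtrip_utf8(sample) == sample)
```

If patching the library is not an option, setting the environment variable PYTHONUTF8=1 before running the command enables Python's UTF-8 mode and may avoid the error as well. I have not verified this against the PDFs listed here.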
Steps to reproduce
docling computational-challenges-in-bounded-model-checking-44b7toabj9.pdf
I've encountered this on other PDFs as well:
https://batch.libretexts.org/print/url=https://math.libretexts.org/Bookshelves/Combinatorics_and_Discrete_Mathematics/Elementary_Foundations%3A_An_Introduction_to_Topics_in_Discrete_Mathematics_(Sylvestre)/03%3A_Boolean_algebra/3.02%3A_Disjunctive_Normal_Form.pdf
Docling version
Docling version: 2.12.0
Docling Core version: 2.9.0
Docling IBM Models version: 3.1.0
Docling Parse version: 3.0.0
Python version
Python 3.11.9