Replies: 1 comment
You cannot, except with your own code of course. PyMuPDF does not decide block segmentation; that is a result of MuPDF's algorithms. The next MuPDF version, 1.25.0, will bring significant improvements here: with a new text extraction option, MuPDF can be asked to search for recognizable page layout segments, each of which will be turned into a block for PyMuPDF.
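As a rough illustration of what "your own code" could look like, here is a minimal sketch that re-merges blocks after extraction. It assumes the tuples returned by `page.get_text("blocks")`, i.e. `(x0, y0, x1, y1, text, block_no, block_type)`. The function name `merge_blocks` and the `max_gap` threshold are hypothetical, and the heuristic (join blocks that overlap horizontally and are separated by only a small vertical gap) is an assumption of this sketch, not what MuPDF 1.24.6 did internally.

```python
def merge_blocks(blocks, max_gap=5.0):
    """Merge text blocks that overlap horizontally and sit close together vertically.

    `blocks` are tuples shaped like the items of page.get_text("blocks"):
    (x0, y0, x1, y1, text, block_no, block_type).
    Returns a list of (x0, y0, x1, y1, text) tuples.
    `max_gap` (in points) is an illustrative threshold; tune it per document.
    """
    # Keep text blocks only (block_type 0) and sort top-to-bottom, then left-to-right.
    text_blocks = sorted((b for b in blocks if b[6] == 0), key=lambda b: (b[1], b[0]))

    merged = []
    for x0, y0, x1, y1, text in (b[:5] for b in text_blocks):
        # Look for the most recent merged block this one can be attached to.
        for i in range(len(merged) - 1, -1, -1):
            mx0, my0, mx1, my1, mtext = merged[i]
            overlaps_horizontally = x0 < mx1 and mx0 < x1
            vertical_gap = y0 - my1
            if overlaps_horizontally and vertical_gap <= max_gap:
                # Grow the bounding box and append the text.
                merged[i] = (min(mx0, x0), min(my0, y0),
                             max(mx1, x1), max(my1, y1), mtext + text)
                break
        else:
            merged.append((x0, y0, x1, y1, text))
    return merged


# Hypothetical usage with PyMuPDF (the file name is a placeholder):
#   import pymupdf
#   page = pymupdf.open("input.pdf")[0]
#   paragraphs = merge_blocks(page.get_text("blocks"), max_gap=5.0)
```

A threshold that works for one document may merge too much or too little in another, so compare the result against the output you got from 1.24.6 before relying on it.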
Extraction of blocks in version 1.24.6 is perfect. How can I make version 1.24.13 work like 1.24.6? Thank you!
pdf1246-cut.pdf
pdf12413-cut.pdf