The bbox of the table has issues; the content of the current table is very long, but the bbox height is very short #95

Open
tahitimoon opened this issue Nov 26, 2024 · 2 comments

@tahitimoon

The bbox of the table has issues; the content of the current table is very long, but the bbox height is very short, as shown in the figure below.

{'chunk_type': 'table', 'page': 31, 'bbox': [438.73, 123.0, 960.6, 132.22]}

DOCUMENTS INCORPORATED BY REFERENCE > PART I > (In millions)

| | Years ended September 30, 2023 |
| --- | --- |
| Net income | $ 96,995 |
| Other comprehensive income/(loss): Change in foreign currency translation, net of tax | (765) |
| Change in unrealized gains/losses on derivative instruments, net of tax: Change in fair value of derivative instruments | 323 |
| Adjustment for net (gains)/losses realized and included in net income | (1,717) |
| Total change in unrealized gains/losses on derivative instruments | (1,394) |
| Change in unrealized gains/losses on marketable debt securities, net of tax: Change in fair value of marketable debt securities | 1,563 |
| Adjustment for net (gains)/losses realized and included in net income | 253 |
| Total change in unrealized gains/losses on marketable debt securities | 1,816 |
| Total other comprehensive income/(loss) | (343) |
| Total comprehensive income | $ 96,652 |

@irony

irony commented Nov 27, 2024

+1 on this. A workaround we use is to multiply the number of rows by an estimated rowHeight, which works but of course isn't ideal.
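
For anyone wanting to try the same thing, here is a rough sketch of that workaround in Python. It is not part of the library: `estimate_table_bbox` is a hypothetical helper, the bbox is assumed to be `[x0, y0, x1, y1]` with y growing downward, and the `row_height` of 12.0 is a guess you would need to tune for your document.

```python
def estimate_table_bbox(bbox, n_rows, row_height=12.0):
    """Return a bbox whose height covers n_rows rows of the table.

    Assumes bbox is [x0, y0, x1, y1] with y increasing downward
    (an assumption, not confirmed by the library's docs).
    """
    x0, y0, x1, y1 = bbox
    current_height = y1 - y0
    estimated_height = n_rows * row_height
    # Only grow the box; keep the reported height if it is already larger.
    return [x0, y0, x1, y0 + max(current_height, estimated_height)]


# Chunk metadata as reported in this issue.
chunk = {"chunk_type": "table", "page": 31, "bbox": [438.73, 123.0, 960.6, 132.22]}

# The table pasted above has 10 data rows; row_height is an estimate.
fixed_bbox = estimate_table_bbox(chunk["bbox"], n_rows=10, row_height=12.0)
print(fixed_bbox)
```

The reported box is only about 9 units tall, roughly one text line, so the estimate replaces it with a height of 10 × 12 = 120 units. This only approximates the real extent, since rows with wrapped text are taller than a single line.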

@tahitimoon
Author

> +1 on this. A workaround we use is to multiply the number of rows by an estimated rowHeight, which works but of course isn't ideal.

It seems this library might be outdated and no longer functional. I attempted the suggested method, but unfortunately, it doesn't work. I recommend trying IBM's Docling instead; it appears to perform effectively. I've already made the switch to that library.
