feat: Update for complex fonts, rendering, and experimental high-level API #82

Merged on Jan 17, 2025 (25 commits)
2ec42dc
updated the differences for fonts
PeterStaar-IBM Jan 11, 2025
715aafd
took care of encoding-name as stream and rescaled the height of the b…
PeterStaar-IBM Jan 12, 2025
225082e
working on the ParsedPaginatedDocument structure
PeterStaar-IBM Jan 13, 2025
e8b0771
fixed the reformatting
PeterStaar-IBM Jan 13, 2025
4c03a1c
added the pdf_parser python class
PeterStaar-IBM Jan 13, 2025
654185d
reformatted the pdf_parser
PeterStaar-IBM Jan 13, 2025
833c528
reformatted the code
PeterStaar-IBM Jan 14, 2025
6cbd931
Refactor and renaming high-level APIs (WIP)
cau-git Jan 14, 2025
3aea044
Establish high-level DoclingPdfParser and PdfDocument APIs
cau-git Jan 15, 2025
2950c3f
Establish high-level DoclingPdfParser and PdfDocument APIs
cau-git Jan 15, 2025
9e9abb0
fixed the tests and added PdfColoredElement
PeterStaar-IBM Jan 16, 2025
559c373
added the suppression of the QPDF warnings
PeterStaar-IBM Jan 16, 2025
49097d7
fixed the nulls in the stream
PeterStaar-IBM Jan 16, 2025
229c0c4
updated the visualize script to use the pdf-parser
PeterStaar-IBM Jan 16, 2025
132b663
Mark DoclingPdfParser API as experimental
cau-git Jan 16, 2025
b2b8848
refactoring the code
PeterStaar-IBM Jan 17, 2025
40cba22
refactoring the rendering
PeterStaar-IBM Jan 17, 2025
3d30a01
merged with branch
PeterStaar-IBM Jan 17, 2025
fe05ce4
updated the tests and refactored some data-structures
PeterStaar-IBM Jan 17, 2025
fb53be2
updated the tests
PeterStaar-IBM Jan 17, 2025
106a608
Update test GT for docling-parse-v2
cau-git Jan 17, 2025
2648938
Apply styling fixes
cau-git Jan 17, 2025
c76f6b0
updated the font-scaling with minimum capheight
PeterStaar-IBM Jan 17, 2025
ff2a1f1
Merge branch 'dev/add-support-for-complex-fonts' of github.com:DS4SD/…
PeterStaar-IBM Jan 17, 2025
c944245
Updated test again
cau-git Jan 17, 2025
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -20,7 +20,7 @@ repos:
     hooks:
       - id: autoflake
         name: autoflake
-        entry: poetry run autoflake docling_parse
+        entry: poetry run autoflake docling_parse tests
         pass_filenames: false
         language: system
         files: '\.py$'