Releases: docling-project/docling
Releases Β· docling-project/docling
v2.44.0
Feature
Fix
- html: Parse rawspan and colspan when they include non numerical values (#2048) (
ed56f2d
)
- Support new mlx-vlm module (#2001) (
0130e3a
)
- Extend error reporting when verbose logging is enabled (#2017) (
2eb760d
)
- HTML: Replace non-standard Unicode characters (#2006) (
86f7012
)
Documentation
v2.43.0
Feature
Fix
- markdown: Ensure correct parsing of nested lists (#1995) (
aec29a7
)
- HTML: Remove an unnecessary print command (#1988) (
945721a
)
v2.42.2
Fix
- HTML: Concatenation of child strings in table cells and list items (#1981) (
5132f06
)
- docx: Adding plain latex equations to table cells (#1986) (
0b83609
)
- Preserve PARTIAL_SUCCESS status when document timeout hits (#1975) (
98e2fcf
)
- Multi-page image support (tiff) (#1928) (
8d50a59
)
Documentation
v2.42.0
Feature
- Add option to control empty clusters in layout postprocessing (#1940) (
a436be7
)
Fix
- Safe pipeline init, use device_map in transformers models (#1917) (
cca05c4
)
- Fix HTML table parser and JATS backend bugs (#1948) (
e1e3053
)
- KeyError: 'fPr' when processing latex fractions in DOCX files (#1926) (
95e7096
)
- Change granite vision model URL from preview to stable version (#1925) (
c5fb353
)
Documentation
v2.41.0
Feature
Fix
- ocr-utils: Unit test and fix the
rotate_bounding_box
function (#1897) (931eb55
)
- Docs are missing osd packages for tesseract on RHEL (#1905) (
e25873d
)
- Use only backend for picture classifier (#1904) (
edd4356
)
- Typo in asr options (#1902) (
dd8fde7
)
v2.40.0
Feature
- Introduce LayoutOptions to control layout postprocessing behaviour (#1870) (
ec6cf6f
)
- Integrate ListItemMarkerProcessor into document assembly (#1825) (
56a0e10
)
Fix
- Secure torch model inits with global locks (#1884) (
598c9c5
)
- Ensure that TesseractOcrModel does not crash in case OSD is not installed (#1866) (
ae39a94
)
Performance
- msexcel: _find_table_bounds use iter_rows/iter_cols instead of Worksheet.cell (#1875) (
13865c0
)
- Move expensive imports closer to usage (#1863) (
3089cf2
)
v2.39.0
Feature
- Leverage new list modeling, capture default markers (#1856) (
0533da1
)
Fix
- markdown: Make parsing of rich table cells valid (#1821) (
e79e4f0
)
v2.38.1
Fix
- Updated granite vision model version for picture description (#1852) (
d337825
)
- markdown: Fix single-formatted headings & list items (#1820) (
7c5614a
)
- Fix response type of ollama (#1850) (
41e8cae
)
- Handle missing runs to avoid out of range exception (#1844) (
4002de1
)
v2.38.0
Feature
Fix
- docx: Ensure list items have a list parent (#1827) (
d26dac6
)
- msword_backend: Identify text in the same line after an image #1425 (#1610) (
1350a8d
)
- Ensure uninitialized pages are removed before assembling document (#1812) (
dd7f64f
)
- Formula conversion with page_range param set (#1791) (
dbab30e
)
Documentation