Commit 28d1c74

chore: update README (#13)
Signed-off-by: Panos Vagenas <[email protected]>
1 parent f09ffcc commit 28d1c74

1 file changed: +7 -6 lines changed

README.md

Lines changed: 7 additions & 6 deletions
@@ -1,5 +1,7 @@
 <p align="center">
-<a href="https://github.com/ds4sd/docling"> <img loading="lazy" alt="Docling" src="https://github.com/DS4SD/docling/raw/main/logo.png" width="150" />
+<a href="https://github.com/ds4sd/docling">
+<img loading="lazy" alt="Docling" src="https://github.com/DS4SD/docling/raw/main/logo.png" width="150" />
+</a>
 </p>
 
 # Docling
@@ -11,7 +13,7 @@
 [![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
 [![Pydantic v2](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/pydantic/pydantic/main/docs/badge/v2.json)](https://pydantic.dev)
 [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit)
-[![License MIT](https://img.shields.io/github/license/ds4sd/deepsearch-toolkit)](https://opensource.org/licenses/MIT)
+[![License MIT](https://img.shields.io/github/license/DS4SD/docling)](https://opensource.org/licenses/MIT)
 
 Docling bundles PDF document conversion to JSON and Markdown in an easy, self-contained package.
 
@@ -49,7 +51,7 @@ The output of the above command will be written to `./scratch`.
 
 ### Adjust pipeline features
 
-**Control pipeline options**
+#### Control pipeline options
 
 You can control if table structure recognition or OCR should be performed by arguments passed to `DocumentConverter`:
 ```python
@@ -62,16 +64,15 @@ doc_converter = DocumentConverter(
 )
 ```
 
-**Control table extraction options**
+#### Control table extraction options
 
 You can control if table structure recognition should map the recognized structure back to PDF cells (default) or use text cells from the structure prediction itself.
 This can improve output quality if you find that multiple columns in extracted tables are erroneously merged into one.
 
 
 ```python
-
 pipeline_options = PipelineOptions(do_table_structure=True)
-pipeline_options.table_structure_options.do_cell_matching = False # Uses text cells predicted from table structure model
+pipeline_options.table_structure_options.do_cell_matching = False # uses text cells predicted from table structure model
 
 doc_converter = DocumentConverter(
 artifacts_path=artifacts_path,
