issues Search Results · repo:docling-project/docling-eval language:Python
12 results
Hi Docling Team,
I really enjoyed reading your technical report, especially the section describing the 89-PDF benchmark dataset:
To enable a meaningful benchmark, we composed a test set of 89 PDF files ...
CHN-ChenYi
- Opened 4 days ago
- #117
The current design of docling-eval assumes the workflow:
1. create-gt: Create a Ground Truth dataset in HF parquet format.
2. create-eval: Create a prediction dataset in HF parquet format that contains ...
nikos-livathinos
- Opened 12 days ago
- #112
Currently, the TEDS metrics calculation only looks at td tag while building the tree for APTED algorithm. (Reference)
IMO, this will unfairly penalize any hyperscaler or even WDU/docling in case they ...
divekarsc
- 4
- Opened 15 days ago
- #110
Instantiating DoclingPredictionProvider with do_visualization=False as follows:
docling_provider = DoclingPredictionProvider(
do_visualization=False, ignore_missing_predictions=False
)
Will ...
bug
wai25
- Opened 16 days ago
- #107
In the current matching strategy, a point on a polyline is associated with the smallest bounding box that contains it.
https://github.com/docling-project/docling-eval/blob/b507977171780650860e74ae48f3edadd4a60b78/docling_eval/dataset_builders/cvat_dataset_builder.py#L225-L230 ...
Saidgurbuz
- Opened 16 days ago
- #106
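The matching rule described in #106 (a polyline point is assigned to the smallest bounding box that contains it) can be illustrated with a minimal sketch. This is not the code from cvat_dataset_builder.py; the `Box` type and the `smallest_containing_box` helper are hypothetical names introduced only for illustration, and "smallest" is assumed here to mean smallest area.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical type, not from docling-eval)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        # Points on the border count as inside (an assumption of this sketch).
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def area(self) -> float:
        return (self.right - self.left) * (self.bottom - self.top)


def smallest_containing_box(
    point: Tuple[float, float], boxes: Sequence[Box]
) -> Optional[int]:
    """Return the index of the smallest-area box containing `point`, or None."""
    x, y = point
    candidates = [i for i, box in enumerate(boxes) if box.contains(x, y)]
    if not candidates:
        return None
    return min(candidates, key=lambda i: boxes[i].area())


# Example: a large box with a smaller box nested inside it.
boxes = [Box(0, 0, 10, 10), Box(2, 2, 4, 4)]
inner = smallest_containing_box((3, 3), boxes)    # inside both boxes
outer = smallest_containing_box((8, 8), boxes)    # inside the large box only
missing = smallest_containing_box((20, 20), boxes)  # inside neither box
```

With nested boxes, a point inside the inner box is always attributed to the inner box, even when the polyline was meant to refer to the enclosing element.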
Docling, WDU Tables/OCR tests fail with the error: RuntimeError: Cannot visualize document without images
To reproduce, update test_tables_aws.py to use Docling and run.
poetry run pytest -v tests/test_tables_docling.py ...
bug
wai25
- 4
- Opened 17 days ago
- #105
The test_ocr_xfund_google.py test is failing and likely other tests too.
To reproduce the error: poetry run pytest -v tests/test_ocr_xfund_google.py
poetry run pytest -v tests/test_ocr_xfund_google.py ...
bug
samiuc
- 2
- Opened 17 days ago
- #104
Given that docling-eval is able to create ground truth and prediction datasets built around the DoclingDocument format
we may also want to export the entire GT/prediction dataset in another format. This ...
nikos-livathinos
- Opened on Apr 23
- #80
- Introduce DoclingDocumentDatasetBuilder to build Ground Truth datasets from lossless serializations of
DoclingDocument files (e.g. jsons).
- It is useful when DoclingDocument objects have been ...
nikos-livathinos
- Opened on Apr 23
- #79
In its current implementation docling-eval is focused on the standardization of the evaluation where the DoclingDocument
format is used as the interface to store both the ground-truth and the predictions. ...
nikos-livathinos
- Opened on Mar 10
- #43
