issues Search Results · repo:docling-project/docling-eval language:Python

12 results

Hi Docling Team, I really enjoyed reading your technical report, especially the section describing the 89-PDF benchmark dataset: To enable a meaningful benchmark, we composed a test set of 89 PDF files ...
  • CHN-ChenYi
  • Opened 4 days ago
  • #117

The current design of docling-eval assumes the workflow: 1. create-gt: Create a Ground Truth dataset in HF parquet format. 2. create-eval: Create a prediction dataset in HF parquet format that contains ...
  • nikos-livathinos
  • Opened 12 days ago
  • #112

Currently, the TEDS metric calculation only looks at the td tag while building the tree for the APTED algorithm. (Reference) IMO, this will unfairly penalize any hyperscaler or even WDU/docling in case they ...
  • divekarsc
  • 4
  • Opened 15 days ago
  • #110

Instantiating DoclingPredictionProvider with do_visualization=False as follows: docling_provider = DoclingPredictionProvider(do_visualization=False, ignore_missing_predictions=False) Will ...
bug
  • wai25
  • Opened 16 days ago
  • #107

In the current matching strategy, a point on a polyline is associated with the smallest bounding box that contains it. https://github.com/docling-project/docling-eval/blob/b507977171780650860e74ae48f3edadd4a60b78/docling_eval/dataset_builders/cvat_dataset_builder.py#L225-L230 ...
  • Saidgurbuz
  • Opened 16 days ago
  • #106
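
The matching rule described in #106 (each polyline point goes to the smallest bounding box that contains it) can be sketched roughly as follows. This is a minimal illustration only, not the code in cvat_dataset_builder.py: the Box class, its field names, and the smallest_containing_box function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box: (l, t) is top-left, (r, b) is bottom-right."""
    l: float
    t: float
    r: float
    b: float

    def contains(self, x: float, y: float) -> bool:
        # Points on the border count as inside.
        return self.l <= x <= self.r and self.t <= y <= self.b

    def area(self) -> float:
        return (self.r - self.l) * (self.b - self.t)


def smallest_containing_box(
    point: Tuple[float, float], boxes: Sequence[Box]
) -> Optional[int]:
    """Return the index of the smallest-area box containing the point, or None."""
    x, y = point
    best: Optional[int] = None
    for i, box in enumerate(boxes):
        if not box.contains(x, y):
            continue
        if best is None or box.area() < boxes[best].area():
            best = i
    return best


# Example: a large box with a smaller box nested inside it.
boxes = [Box(0, 0, 10, 10), Box(2, 2, 4, 4)]
inner = smallest_containing_box((3, 3), boxes)    # inside both boxes
outer = smallest_containing_box((8, 8), boxes)    # inside the large box only
nowhere = smallest_containing_box((20, 20), boxes)  # outside every box
```

With nested boxes, a point inside both is assigned to the inner one, which is the behavior the issue describes for the current strategy.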

Docling, WDU Tables/OCR tests fail with the error: RuntimeError: Cannot visualize document without images To reproduce, update test_tables_aws.py to use Docling and run. poetry run pytest -v tests/test_tables_docling.py ...
bug
  • wai25
  • 4
  • Opened 17 days ago
  • #105

The test_ocr_xfund_google.py test is failing and likely other tests too. To reproduce the error: poetry run pytest -v tests/test_ocr_xfund_google.py poetry run pytest -v tests/test_ocr_xfund_google.py ...
bug
  • samiuc
  • 2
  • Opened 17 days ago
  • #104

Given that docling-eval is able to create ground truth and prediction datasets built around the DoclingDocument format we may also want to export the entire GT/prediction dataset in another format. This ...
  • nikos-livathinos
  • Opened on Apr 23
  • #80

- Introduce DoclingDocumentDatasetBuilder to build Ground Truth datasets from lossless serializations of DoclingDocument files (e.g. jsons). - It is useful when DoclingDocument objects have been ...
  • nikos-livathinos
  • Opened on Apr 23
  • #79

In its current implementation docling-eval is focused on the standardization of the evaluation where the DoclingDocument format is used as the interface to store both the ground-truth and the predictions. ...
  • nikos-livathinos
  • Opened on Mar 10
  • #43