Different outputs across machines (mac/WSL2 ubuntu 24 vs ubuntu 24 on VM) #562
Hello, I have been experiencing an issue with the output of a PDF conversion to both markdown and JSON/dict. I have a particular PDF that I intend to convert to markdown. I have tried doing so on a Mac M4, on a WSL2 PC, and on a Linux VM on Azure running Ubuntu 24 (the same version as the WSL2). Also, just for testing, I ran it in a VM deployed on Hyperstack and in Docker images with different Ubuntu versions. The "bad" output was the same across all non-Mac/WSL2 tests.

When the code is run on Mac or WSL2, the output is the expected one, but when we run it in a Linux VM the output is totally different. I would like to know whether there is any setting that I am missing on my VM, or why I cannot replicate the same behaviour on all machines. Below are the code and the outputs.

Code

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption


def extract_pdf_with_docling(file_path: str) -> str:
    # OCR on CPU, with table structure recognition and cell matching enabled
    pipeline_options = PdfPipelineOptions()
    pipeline_options.do_ocr = True
    pipeline_options.ocr_options.use_gpu = False
    pipeline_options.do_table_structure = True
    pipeline_options.table_structure_options.do_cell_matching = True

    doc_converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
    # Convert the PDF and export the resulting document as markdown
    return doc_converter.convert(file_path).document.export_to_markdown()


print(
    extract_pdf_with_docling(
        ""
    )
)

Mac/WSL output

## Solicitaçıes de Compra
| Solicitaçªo 606 | Data da solicitaçªo 22/08/2000 | Data da œltima autorizaçªo 22/08/2000 |
|----------------------------------------------------------------------------|----------------------------------|-----------------------------------------|
| Obra 8 - Company LTDA 929.334234.23123 | Solicitante Lucas Silva | |
| Cotaçıes 661 | Pedidos 468 | |

| Insumo | Autorizaçªo | Und. | Qtd. prevista | Qtd. atendida | Saldo Dt. entrega | UC | ReferŒncia | Item apropriado | Qtd. aprop. |
|-------------------------------------------------------------------------------|---------------|--------|-----------------|-----------------|---------------------|------|----------------|--------------------------------------------------------------------------------|---------------|
| 5419 - RØgua de alumínio / 3 mts | Sim | pc | 1,0000 | 1,0000 | 0,0000 27/08/2024 | 5 | 24.001.002.001 | Ferramentas | 1,0000 |
| 5549 - Arame n" 18 galvanizado / rolo com 1 kg | Sim | rol | 6,0000 | 6,0000 | 0,0000 21/08/2024 | 5 | 18.001.001.002 | Emboço aplicado em paredes externas, argamassa industrializada | 6,0000 |
| 5802 - Rolo de papelªo / ondulado 1,20x50 mts | Sim | rol | 6,0000 | 6,0000 | 0,0000 21/08/2024 | 5 | 15.001.001.008 | Revestimento porcelanato em piso, placa 60x60cm, aplicaçªo em Æreas privativas | 6,0000 |
| 5942 - Escada dupla Ø reforçada com degraus e corrimªos duplos cap. atØ 120kg | Sim | un | 1,0000 | 1,0000 | 0,0000 27/08/2024 | 5 | 24.001.002.001 | Ferramentas | 1,0000 |
## Observaçıes Escada

Linux VM output

22/08/2000
Qtd. aprop. 1,0000
82 de 197
| Data da œltima autorizaçªo |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Item apropriado Ferramentas Emboço aplicado em paredes externas, argamassa industrializada Revestimento porcelanato em piso, placa 60x60cm, aplicaçªo em Æreas privativas Ferramentas |
Saldo 0,0000
Qtd. atendida 1,0000
| | Data da œltima autorizaçªo | Item apropriado Ferramentas | Emboço aplicado em paredes externas, argamassa industrializada Revestimento porcelanato em piso, placa 60x60cm, aplicaçªo em Æreas privativas | Ferramentas | |
|-----------------------------------------------------|-----------------------------------------------------------------------------------|-----------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------|-----------------------------------------------------|
| | Lucas Silva | ReferŒncia 24.001.002.001 | 18.001.001.002 15.001.001.008 | 24.001.002.001 | |
| | 21/08/2024 | | | | |
| Solicitaçıes de Compra Data da solicitaçªo | 8 - Company LTDA 929.334234.23123 Solicitante | Qtd. atendida Dt. entrega UC 1,0000 22/08/2000 5 | 6,0000 21/08/2024 5 6,0000 22/08/2000 5 | 1,0000 27/08/2024 5 | |
| Saldo 0,0000 0,0000 0,0000 0,0000 SIENGE / SOFTPLAN | Saldo 0,0000 0,0000 0,0000 0,0000 SIENGE / SOFTPLAN | Saldo 0,0000 0,0000 0,0000 0,0000 SIENGE / SOFTPLAN | Saldo 0,0000 0,0000 0,0000 0,0000 SIENGE / SOFTPLAN | Saldo 0,0000 0,0000 0,0000 0,0000 SIENGE / SOFTPLAN | Saldo 0,0000 0,0000 0,0000 0,0000 SIENGE / SOFTPLAN |
Qtd. prevista 1,0000
Und. pc
Insumo Autorizaçªo 5419 - RØgua de alumínio / 3 mts Sim
0,0000
6,0000
6,0000
rol
5549 - Arame n' 18 galvanizado / rolo com 1 kg Sim
0,0000
6,0000
6,0000
rol
Sim
5802 - Rolo de papelªo / ondulado 1,20x50 mts
0,0000
1,0000
1,0000
un
5942 - Escada dupla Ø reforçada com degraus e corrimªos duplos cap. atØ 120kg Sim
SIENGE / SOFTPLAN
03/12/2024 - 17:40:32
Solicitaçªo 606 Obra 8 - BROOKLIN EMPREENDIMENTOS IMOBILIARIOS SPE LTDA 46.388.309/0001-00
Cotaçıes 661
6,0000
6,0000
1,0000
Observaçıes Escada

I appreciate your attention.

Best,
Jon
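
When the same script produces different results on different machines, one way to narrow it down is to record each environment and compare the records. The following is only a minimal sketch using the Python standard library. The `PACKAGES` list is an assumption about which dependencies might be relevant; it was not confirmed anywhere in this thread, and `environment_report` is a hypothetical helper name.

```python
import json
import platform
import sys
from importlib import metadata

# Assumed list of packages worth comparing across machines; adjust as needed.
PACKAGES = [
    "docling",
    "docling-core",
    "docling-parse",
    "docling-ibm-models",
    "easyocr",
    "torch",
]


def package_version(name: str) -> str:
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=PACKAGES) -> dict:
    """Collect interpreter, OS, and package details for side-by-side comparison."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": {name: package_version(name) for name in packages},
    }


if __name__ == "__main__":
    # Print a stable, sorted JSON document that is easy to diff between machines.
    print(json.dumps(environment_report(), indent=2, sort_keys=True))
```

Running this on the Mac/WSL2 machine and on the Linux VM, then diffing the two JSON outputs, would show whether the package versions or the CPU architecture differ between the "good" and "bad" environments.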
Replies: 1 comment 8 replies
Can you please provide the output of
@jon-torres Yes, you can do this by setting up your converter the following way: