I have been using Layout Parser to analyze different types of documents. I get the expected results on true (digitally generated) PDFs, but not on scanned PDFs: for scanned pages, the page contents are detected as a Figure, or the results otherwise differ from what I expect.
The issue occurs only with scanned PDFs.
Checklist
I have searched related issues but cannot get the expected help.
To Reproduce
import layoutparser as lp
import cv2

# Load the page image and convert from OpenCV's BGR channel order to RGB
image = cv2.imread("test.png")
image = image[..., ::-1]

# PubLayNet Faster R-CNN model; keep only detections with score >= 0.8
model = lp.models.Detectron2LayoutModel(
    'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
)

# One outline color per block type
color_map = {
    'Text': 'red',
    'Title': 'blue',
    'List': 'green',
    'Table': 'purple',
    'Figure': 'pink',
}

# Detect layout blocks and draw them on the page image
layout = model.detect(image)
lp.draw_box(image, layout, box_width=3, color_map=color_map)
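To compare the scanned and true PDF pages without relying only on the drawn boxes, the detections can also be counted per label. This is a minimal sketch, assuming each block in `layout` exposes `type` and `score` attributes, as Layout Parser detection results do. `summarize_layout` and `FakeBlock` are hypothetical names used only for illustration, and the stand-in blocks are made-up values, not real model output.

```python
from collections import Counter, namedtuple


def summarize_layout(layout, min_score=0.0):
    """Count blocks per label, skipping blocks scored below min_score."""
    counts = Counter()
    for block in layout:
        score = getattr(block, "score", None)
        if score is not None and score < min_score:
            continue
        counts[getattr(block, "type", None)] += 1
    return dict(counts)


# Stand-in blocks (hypothetical values) so the sketch runs without a model.
FakeBlock = namedtuple("FakeBlock", ["type", "score"])
scanned_like = [FakeBlock("Figure", 0.97)]
digital_like = [
    FakeBlock("Title", 0.99),
    FakeBlock("Text", 0.95),
    FakeBlock("Text", 0.91),
    FakeBlock("Table", 0.88),
]

print(summarize_layout(scanned_like))
print(summarize_layout(digital_like))
```

With the real model, `summarize_layout(layout)` could be called right after `model.detect(image)` for each page. A scanned page that yields only `Figure` blocks would then stand out immediately.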
Environment
Two images are attached:
1: Result on the scanned PDF page
2: Result on the true PDF page