So I have a PDF with just one field on it, a field named "xxx.yyy". When I run pdf2json 3.0.5 on the PDF, it reports that the only field is "yyy". test.pdf demonstrates the problem.
pdftk 2.02 finds "xxx.yyy" when I run pdftk test.pdf dump_data_fields:
FieldType: Text
FieldName: xxx.yyy
FieldFlags: 0
FieldJustification: Left
Unfortunately, pdftk doesn't return the coordinates whereas pdf2json does.
According to qpdf test.pdf --json, the field's alternativename, fullname and mappingname are "xxx.yyy", whereas the partialname is "yyy", so maybe that's the issue?
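To make the fullname/partialname difference concrete, here is a minimal TypeScript sketch that picks out fields whose two names differ. The `QpdfField` and `QpdfJson` interfaces and the `nestedFields` helper are hypothetical names of mine, the `acroform.fields` layout is my assumption about the qpdf JSON structure, and the sample is hand-written from the values quoted above rather than real qpdf output.

```typescript
// Assumed shape of the relevant part of `qpdf --json` output; other keys omitted.
interface QpdfField {
  fullname: string;
  partialname: string;
}

interface QpdfJson {
  acroform: { fields: QpdfField[] };
}

// Return the fields whose full name differs from their partial name,
// i.e. the fields that inherit part of their name from an ancestor.
function nestedFields(doc: QpdfJson): QpdfField[] {
  return doc.acroform.fields.filter((f) => f.fullname !== f.partialname);
}

// Hand-written sample mirroring the values reported above, not real qpdf output.
const sample: QpdfJson = JSON.parse(
  '{"acroform":{"fields":[{"fullname":"xxx.yyy","partialname":"yyy"}]}}'
);

for (const f of nestedFields(sample)) {
  console.log(`${f.fullname} (partial: ${f.partialname})`);
}
```

Any field this reports is one where a tool that reads only the partial name would lose part of the name.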
So I used qpdf's QDF mode (qpdf test.pdf --qdf test.qdf) to dig into this further, and I guess the issue is that the dots are not stored in the field's own name: each dot-separated part comes from a different object in a parent/child hierarchy.
So if you look at the field's /T entry in isolation you get yyy. The xxx comes from the object that the field's /Parent 17 0 R entry points to:
%% Object stream: object 17, index 0; original object ID: 10
<<
/Kids [
7 0 R
]
/T (xxx)
>>
So I guess what pdf2json needs to do is walk up the /Parent chain until it reaches an object with no parent, prepending each ancestor's /T value to the field's own /T value, with dots separating the parts.
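The walk described above could look roughly like this. It is only a sketch in TypeScript: `FieldDict` and `fullyQualifiedName` are hypothetical names, not pdf2json's actual types or API, and it assumes the /Parent reference has already been resolved to an in-memory object.

```typescript
// Hypothetical in-memory model of a field dictionary; not pdf2json's real types.
interface FieldDict {
  T?: string;         // partial field name (/T)
  Parent?: FieldDict; // resolved /Parent reference, if any
}

// Build the full name by walking the /Parent links up to the root and
// joining every partial name with a dot.
function fullyQualifiedName(field: FieldDict): string {
  const parts: string[] = [];
  const seen = new Set<FieldDict>(); // guards against a cyclic /Parent chain
  let node: FieldDict | undefined = field;
  while (node !== undefined && !seen.has(node)) {
    seen.add(node);
    if (node.T !== undefined) {
      parts.unshift(node.T); // ancestors go in front of descendants
    }
    node = node.Parent;
  }
  return parts.join(".");
}

// Mirrors the objects in the QDF excerpt: object 17 is the parent with /T (xxx).
const parent: FieldDict = { T: "xxx" };
const child: FieldDict = { T: "yyy", Parent: parent };

console.log(fullyQualifiedName(child)); // prints "xxx.yyy"
```

Nodes without a /T entry are skipped rather than contributing an empty part, so a child that has no name of its own would simply get its parent's name.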