Hi @victorcasignia, I went through the document and the body seems to be processed correctly. You can double-check that the whole body is in the output. Even the section headings are numbered correctly.
Now, the issues are all in the Appendix, which is larger than the body of the article. The first part is handled correctly up to "blue" (where the document ends). After that, the model decided that the remaining content is body text, so everything after page 21 is actually appended after the "Future work" section.
From our end, we could flag this issue so that we can use the document as training data for the fulltext model, but that will be for version 0.8.2.
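To double-check where each section ended up, one option is to list the section headings found in the body and in the back matter of the output. The snippet below is only a minimal sketch: it assumes the service returns TEI XML, and the function name `list_section_heads` and the `SAMPLE` document are made up for illustration, not taken from the actual output.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace; assumes the output file is TEI XML.
TEI = "http://www.tei-c.org/ns/1.0"


def list_section_heads(tei_xml: str) -> dict:
    """Return the (number, title) of every <head> under <body> and <back>."""
    root = ET.fromstring(tei_xml)
    result = {}
    for part in ("body", "back"):
        heads = []
        for container in root.iter(f"{{{TEI}}}{part}"):
            for head in container.iter(f"{{{TEI}}}head"):
                number = head.get("n", "")  # empty when the head is unnumbered
                title = "".join(head.itertext()).strip()
                heads.append((number, title))
        result[part] = heads
    return result


# Hypothetical miniature output, only to show the shape of the check.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div><head n="1">Introduction</head><p>...</p></div>
      <div><head n="2">Future work</head><p>...</p></div>
      <div><head>Appendix material misplaced in body</head><p>...</p></div>
    </body>
    <back>
      <div type="annex"><div><head>Appendix A</head></div></div>
    </back>
  </text>
</TEI>"""

if __name__ == "__main__":
    for part, heads in list_section_heads(SAMPLE).items():
        print(part)
        for number, title in heads:
            print(f"  {number or '-'} {title}")
```

Headings that show up in the body after "Future work" would be the appendix content that was misclassified as body text.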
Using Docker to run the service.
Used on a 21-page PDF. It only extracts up to page 9, then it jumps to the bibliography. How do I resolve this?