itogaston opened this issue
Jan 16, 2025
· 1 comment
Labels
answered (🤖 The question has been answered. Will be closed automatically if no new comments), bug (Something isn't working), question (Further information is requested)
I checked the documentation and related resources and couldn't find an answer to my question.
Your Question
I have a really long PDF (475 pages), and it raises this error: "Documents appears to be too short (ie 100 tokens or less). Please provide longer documents."
I think the problem is not that the document is too short but the opposite.
Debugging the code that classifies documents by size, I found that my document's size is 390000. This exceeds the upper limit of the last bin, so there is no bin to place it in, and it falls through to the default condition, which raises the above exception.
I think the document should either be processed like those in the last bin, or an exception should be raised saying the document is too big.
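To make the two proposed behaviours concrete, here is a minimal sketch. It assumes sizes are measured in tokens. The bin names, the limits, and the `classify_by_size` function are hypothetical illustrations, not the library's actual code or values.

```python
# Hypothetical (name, upper limit in tokens) pairs, ordered from smallest to largest.
BIN_UPPER_LIMITS = [("short", 3_000), ("medium", 30_000), ("long", 100_000)]
MIN_TOKENS = 100  # matches the "100 tokens or less" wording of the error message


class DocumentTooShortError(ValueError):
    """Raised when a document has MIN_TOKENS tokens or fewer."""


class DocumentTooLongError(ValueError):
    """Raised when a document exceeds the upper limit of the last bin."""


def classify_by_size(num_tokens: int, clamp_to_last_bin: bool = False) -> str:
    """Return the name of the size bin for a document of num_tokens tokens."""
    if num_tokens <= MIN_TOKENS:
        raise DocumentTooShortError(
            f"Document has {num_tokens} tokens; more than {MIN_TOKENS} are required."
        )
    for name, upper_limit in BIN_UPPER_LIMITS:
        if num_tokens <= upper_limit:
            return name
    # The document is larger than every bin: either treat it like the last bin
    # or fail with a message that names the real problem.
    if clamp_to_last_bin:
        return BIN_UPPER_LIMITS[-1][0]
    raise DocumentTooLongError(
        f"Document has {num_tokens} tokens; the largest bin ends at "
        f"{BIN_UPPER_LIMITS[-1][1]}."
    )
```

With these assumed limits, `classify_by_size(390000)` raises `DocumentTooLongError`, while `classify_by_size(390000, clamp_to_last_bin=True)` returns the last bin's name.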
Hey, you're right about the error. It should have been the other way around. That said, feeding in a 475-page document would be very expensive on the extraction side of things; for example, it will try to extract headings from every page in the document. The smart thing to do is to split the document into several smaller documents and then process a subset of them for test generation. We understand this is a hack, and we will take it into consideration when we roll out the next big iteration on test generation.
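The workaround could look roughly like the sketch below. It assumes the PDF has already been loaded as a list of per-page strings. `split_into_chunks`, `sample_chunks`, and the chunk size of 25 pages are hypothetical choices for illustration, not part of the library.

```python
import random


def split_into_chunks(pages: list[str], pages_per_chunk: int = 25) -> list[str]:
    """Join consecutive pages into smaller documents of pages_per_chunk pages each."""
    return [
        "\n".join(pages[start:start + pages_per_chunk])
        for start in range(0, len(pages), pages_per_chunk)
    ]


def sample_chunks(chunks: list[str], k: int, seed: int = 0) -> list[str]:
    """Pick at most k chunks at random (seeded, so the selection is reproducible)."""
    rng = random.Random(seed)
    return rng.sample(chunks, min(k, len(chunks)))


# Stand-in for the text of a 475-page PDF, one string per page.
pages = [f"Text of page {number}" for number in range(1, 476)]
chunks = split_into_chunks(pages)      # 475 pages / 25 per chunk = 19 chunks
subset = sample_chunks(chunks, k=5)    # only these would be used for test generation
```

Each chunk can then be passed to test generation as a separate document, which keeps every document within the size bins and limits extraction cost to the sampled subset.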