Document too long at transforms #1852

Open · 1 task done
itogaston opened this issue Jan 16, 2025 · 1 comment
Labels: answered 🤖 (The question has been answered. Will be closed automatically if no new comments) · bug (Something isn't working) · question (Further information is requested)

Comments

@itogaston

  • I checked the documentation and related resources and couldn't find an answer to my question.

Your Question
I have a really long PDF (475 pages), and it raises this error: "Documents appears to be too short (ie 100 tokens or less). Please provide longer documents."
I think the problem is not that the document is too short but the opposite.

The code that classifies documents by size:

# from the transform: count how many documents fall into each token-length bin
bin_ranges = [(0, 100), (101, 500), (501, 100000)]
result = count_doc_length_bins(documents, bin_ranges)
result = {k: v / len(documents) for k, v in result.items()}

Debugging my code, I found the size of my document is 390,000. That exceeds the upper limit of the last bin, so there is no bin to place it in, and it falls through to the default condition, which raises the exception above.
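
A minimal sketch of how I understand the binning logic (the function body and token counting here are my assumptions, not the actual ragas source) reproduces the fall-through:

# Sketch only: assumed shape of count_doc_length_bins, not the real implementation.
def count_doc_length_bins(documents, bin_ranges):
    counts = {r: 0 for r in bin_ranges}
    for doc in documents:
        n_tokens = len(doc.page_content.split())  # rough token count for illustration
        for low, high in bin_ranges:
            if low <= n_tokens <= high:
                counts[(low, high)] += 1
                break
    # a 390,000-token document matches no (low, high) pair, so it is never
    # counted, and the default condition treats the input as "too short"
    return counts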

I think the document should be processed like those in the last bin, or an exception should be raised saying the document is too big.
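
For example, one way to express the first option (a sketch, assuming the range check is inclusive) is to leave the last bin open-ended so oversized documents still land somewhere:

import math
# Sketch of the suggested fix: an unbounded last bin means very long
# documents are counted there instead of hitting the default error path.
bin_ranges = [(0, 100), (101, 500), (501, math.inf)]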

@itogaston itogaston added the question Further information is requested label Jan 16, 2025
@dosubot dosubot bot added the bug Something isn't working label Jan 16, 2025
@shahules786 (Member)

Hey, you're right about the error; it should have been the other way around. That said, feeding in a document that is 475 pages long would be very expensive on the extraction side of things: for example, it will try to extract headings from every page of the document. The smart thing to do is to split the document into several smaller documents and then process a subset of them for test generation, as sketched below. We understand this is a hack; we will take it into consideration when we roll out the next big iteration of test generation.
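
A rough sketch of that workaround (the Document class and chunk size here are assumptions for illustration; adapt them to however your documents are loaded):

from langchain_core.documents import Document

def split_document(doc: Document, max_chars: int = 20_000) -> list[Document]:
    # Slice one long document into several smaller ones, carrying
    # the original metadata along with each chunk.
    text = doc.page_content
    return [
        Document(page_content=text[i:i + max_chars], metadata=dict(doc.metadata))
        for i in range(0, len(text), max_chars)
    ]

# Then run test generation on a subset of the chunks, e.g.:
# small_docs = split_document(long_doc)[:20]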

@shahules786 shahules786 added the answered 🤖 The question has been answered. Will be closed automatically if no new comments label Jan 18, 2025