HTTP Storage Plugin INDEX pagination mode is field-order sensitive and should not be #2895

hyperbolix · 2024-03-26T22:27:49Z

Describe the bug
HTTP Storage Plugin INDEX pagination mode is field-order sensitive and should not be (JSON is supposed to be unordered). When the pagination fields 'has_next' and 'next_page' come after the data path field 'results', the presence of a next page is ignored. The next page fields are only honored when they come before the data path field.

To Reproduce
Steps to reproduce the behavior:

Start Drill.
Start the Python3 HTTP server that demonstrates the issue:
https://gist.github.com/hyperbolix/b66f07d5d007ca2df321550241d748ac
Enable the http storage plugin and configure it using this configuration: https://gist.github.com/hyperbolix/7df108c8169cf54bcc72fdc499be1dce
Open the Drill query interface and issue this query:
select * from http.a_bug_test where total_results_val=300 and page_size_val=10 limit 15
See that it returns 10 results, but it should return 15.
Modify the Python3 HTTP server, moving lines 52-58 so they come before line 50, so that the 'has_next' and 'next_page' fields come before the 'results' field.
Restart the Python3 HTTP server and issue the query again from step 4.
Observe than now 15 results are returned.

Expected behavior
The INDEX pagination mode page token fields should be honored regardless of field ordering in the JSON response. Or, at a minimum, the documentation should clearly indicate that these fields must come before the results field (which is not the order in the example in the HTTP pagination readme today).

Error detail, log output or screenshots
When the 'has_next' and 'next_page' fields come after the data path 'results' field, they are ignored, meaning the result set is incorrectly trimmed short, as though there are not additional pages, yet there are additional pages.

Drill version
Observed in 1.21.1 and also in the latest commit as of 2024-03-26: 749772c

The text was updated successfully, but these errors were encountered:

cgivre · 2024-03-27T14:24:55Z

@hyperbolix I've been reading this ticket and I'm a little confused here. The index pagination is intended to be used when the API in question returns an index or an index link to indicate the next page.

With that said, the parameters page_size and total_results suggest that the correct pagination method would be page pagination. (https://github.com/apache/drill/blob/master/contrib/storage-http/Pagination.md#page-pagination). Page pagination is when the API provides paging information such as the page size and result count.

If you use the page pagination with a limit, that will work and there are unit tests for that.

hyperbolix · 2024-03-27T15:51:56Z

This test rig doesn't exactly represent the system I've been experimenting with using Drill that led to the discovery of this bug. It was simply the most expedient way to demonstrate the issue. It helped to be able to tune total result size and max results per page to prove the issue would occur at various page sizes, and it helped to keep the 'next_page' token simple.

I appreciate your guidance in the matter; I can assure you the I am actually using page tokens in my experiments thereby necessitating the use of the 'index' pagination method. I can't recall a single cloud computing list API that doesn't use page tokens, and if I came across one, I'd be worried about using it.

hyperbolix added the bug label Mar 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HTTP Storage Plugin INDEX pagination mode is field-order sensitive and should not be #2895

HTTP Storage Plugin INDEX pagination mode is field-order sensitive and should not be #2895

hyperbolix commented Mar 26, 2024 •

edited

Loading

cgivre commented Mar 27, 2024 •

edited

Loading

hyperbolix commented Mar 27, 2024 •

edited

Loading

HTTP Storage Plugin INDEX pagination mode is field-order sensitive and should not be #2895

HTTP Storage Plugin INDEX pagination mode is field-order sensitive and should not be #2895

Comments

hyperbolix commented Mar 26, 2024 • edited Loading

cgivre commented Mar 27, 2024 • edited Loading

hyperbolix commented Mar 27, 2024 • edited Loading

hyperbolix commented Mar 26, 2024 •

edited

Loading

cgivre commented Mar 27, 2024 •

edited

Loading

hyperbolix commented Mar 27, 2024 •

edited

Loading