[API Concept] - Infinite Query API #4393

Draft
wants to merge 46 commits into
base: master

Conversation

riqts
Contributor

@riqts riqts commented May 5, 2024

Status Update, 2024-10-27 (by Mark)

I've finally had time to seriously focus on figuring out how we're going to add official infinite query support to RTK Query, and we've got some good progress! 🎉

Earlier this year, @riqts submitted this PR to try implementing React Query's public API on top of RTKQ's internals. However, this PR had sat there untouched, since neither Lenz nor I had time to look at it.

In the last couple weeks I've had both the time and energy to prioritize understanding how infinite queries work in general, how React Query's API is designed and implemented, and how this draft PR is implemented and what it actually does thus far.

Over the last few days, I've done some significant work on that draft to fix issues with the TS types, add some tests, clean up some of the rough spots in the API design, and improve the functionality.

Also, Lenz and I met with Dominik (React Query maintainer), and discussed how their implementation works and why they made certain design decisions.

As of right now, the draft PR builds and passes some initial tests. I need to add a lot more tests and try it out in some actual meaningful examples, but I think what's there is actually ready for some brave folks to try it out and give us feedback.

You can try installing the PR preview build using the installation instructions from the "CodeSandbox CI" job listed at the bottom of the PR. Please leave comments and feedback over in that PR!

My current somewhat ambitious goal is to ship a final version of infinite query support for RTKQ by the end of this year. I am absolutely not going to guarantee that :) It's entirely dependent on how much free time I have to dedicate to this effort, how complicated this turns out to be, and how much polish is needed. But in terms of maintenance effort, shipping this is now my main priority!

PR Build Installation Instructions

From the "CodeSandbox CI" job in the PR checks list:

# yarn 1
yarn add https://pkg.csb.dev/reduxjs/redux-toolkit/commit/ffafe626/@reduxjs/toolkit 

# yarn 2, 3
yarn add @reduxjs/toolkit@https://pkg.csb.dev/reduxjs/redux-toolkit/commit/ffafe626/@reduxjs/toolkit/_pkg.tgz 

# npm
npm i https://pkg.csb.dev/reduxjs/redux-toolkit/commit/ffafe626/@reduxjs/toolkit 

I'll try to keep this updated as we do more pushes, but you may want to check the last commit hash and use that if necessary.

Todos

Jotting down some todos as I think of them:

Functionality

  • Refetching
  • Max pages
    • enforce both getNextPageParam and getPreviousPageParam (gN/PPP) when maxPages > 0
  • hasNext/PreviousPage flags
  • Investigate moving pageParams into some new metadata field in the cache entry, so that it's not directly exposed to the user in data (per discussion with Dominik)
  • Possibly some kind of combinePages option, so that you don't have to do selectFromResult: ({data}) => data.pages.flat() (and memoize it) for every endpoint
    • do we wrap this in createSelector by default?
  • See how much of the types and logic can be deduplicated
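
As a rough illustration of the combinePages idea in the list above, here is a minimal sketch in plain TypeScript. `createCombinePages` is a hypothetical helper name, not an existing RTK Query option, and the single-entry memoization is a hand-rolled stand-in for something like createSelector. It only assumes the `{ pages, pageParams }` data shape described in this PR.

```typescript
// Shape of infinite query data as described in this PR.
interface InfiniteData<Page, PageParam> {
  pages: Page[]
  pageParams: PageParam[]
}

// Hypothetical helper (not an RTK Query API): flattens `data.pages` into one
// list and memoizes on the identity of the `pages` array, so repeated calls
// with an unchanged cache entry return the same array reference.
function createCombinePages<Item, PageParam>() {
  const empty: Item[] = []
  let lastPages: Item[][] | undefined
  let lastResult: Item[] = empty

  return (data: InfiniteData<Item[], PageParam> | undefined): Item[] => {
    if (!data) return empty
    if (data.pages !== lastPages) {
      lastPages = data.pages
      // One-level flatten without relying on Array.prototype.flat
      lastResult = data.pages.reduce<Item[]>((all, page) => all.concat(page), [])
    }
    return lastResult
  }
}

// Usage with made-up data
const combinePages = createCombinePages<string, number>()
const exampleData = { pages: [['a', 'b'], ['c']], pageParams: [0, 1] }
const firstCall = combinePages(exampleData)
const secondCall = combinePages(exampleData)
```

Keying the memoization on `data.pages` means a component selecting the flattened list would only see a new array when the cache entry's pages actually change.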

Tests / Example Use Cases

React Query examples

ref: https://tanstack.com/query/latest/docs/framework/react/guides/infinite-queries

  • refetches
  • bi-directional list
  • reversed display
  • manual update
  • page limits
  • more pageParam examples
  • not directly listed, but: staleness / polling?

Other

  • optimistic updates? (what does this even look like, conceptually and usage-wise?)
  • upsertQueryData / upsertQueryEntries?
  • RN FlatList
  • CRUD edits to pages?
  • Tag invalidation of an infinite endpoint

Original PR Author Comments

This is a conceptual display of an API and how it would work inside RTKQ for an infinite Query. This is not a final implementation.
It is derived from React Query's infinite query API, but with RTKQ's useInfiniteQuery hook etc and implementation.

disclaimer:
The typing and actual code, as a library implementation, is bad: it basically just takes and repeats 90% of the query hook logic. The final implementation would more likely be an extension of the Query definition, but I wanted to keep it completely separate to make the API clearer for feedback.

Summary

useInfiniteQuery hook - works almost the same as useQuery

  • New args:

    • takes an initial page param
    • takes an argument for getNextPage
    • takes an optional argument function for getPreviousPage
  • New Returns:

    • data is now an object containing infinite query data: data.pages array containing the fetched pages and data.pageParams array containing the page params used to fetch the pages
    • fetchNextPage trigger function, similar to a lazyQuery trigger but combined with the querySubscription
    • fetchPreviousPage
    • hasNextPage - I haven't implemented it yet :D
    • hasPrevPage - I haven't implemented it yet :D
    • isFetchingNextPage
    • isFetchingPrevPage
  • InfiniteQuery is a new EndpointDefinition

  • Uses its own initiator, and then initiates a typical QueryThunk

  • Additional logic added to querySlice that adds direction/param to the querySubState and acts as the discriminator for an InfiniteQuery (different to arg which acts as the set cache key)

  • ExecuteEndpoint is changed to fetch every page from the pageParams that hasn't been fetched yet and add to the data object in the direction specified.
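
To make the summarized data shape concrete, here is a minimal sketch of how a "next page" step could update `data`. It is synchronous, plain TypeScript with no RTK Query imports. `fetchNextPage`, `hasNextPage`, and `InfiniteConfig` are illustrative names that mirror React Query's vocabulary; they are assumptions for this example, not the code in this PR.

```typescript
interface InfiniteData<Page, PageParam> {
  pages: Page[]
  pageParams: PageParam[]
}

// Hypothetical config shape, modeled on React Query's option names.
interface InfiniteConfig<Page, PageParam> {
  initialPageParam: PageParam
  // Returning undefined signals that there is no further page.
  getNextPageParam: (lastPage: Page, allPages: Page[]) => PageParam | undefined
}

// Appends one page in the "forward" direction and returns new data.
// Returns the same object when there is nothing more to fetch.
function fetchNextPage<Page, PageParam>(
  data: InfiniteData<Page, PageParam>,
  config: InfiniteConfig<Page, PageParam>,
  fetchPage: (pageParam: PageParam) => Page,
): InfiniteData<Page, PageParam> {
  const param =
    data.pages.length === 0
      ? config.initialPageParam
      : config.getNextPageParam(data.pages[data.pages.length - 1], data.pages)
  if (param === undefined) return data
  return {
    pages: data.pages.concat([fetchPage(param)]),
    pageParams: data.pageParams.concat([param]),
  }
}

// One possible definition of the hasNextPage flag.
function hasNextPage<Page, PageParam>(
  data: InfiniteData<Page, PageParam>,
  config: InfiniteConfig<Page, PageParam>,
): boolean {
  if (data.pages.length === 0) return false
  return config.getNextPageParam(data.pages[data.pages.length - 1], data.pages) !== undefined
}

// Usage: offset-based paging over five made-up items, two per page
const items = [1, 2, 3, 4, 5]
const size = 2
const config: InfiniteConfig<number[], number> = {
  initialPageParam: 0,
  getNextPageParam: (lastPage, allPages) =>
    lastPage.length < size ? undefined : allPages.reduce((n, p) => n + p.length, 0),
}
const fetchPage = (offset: number): number[] => items.slice(offset, offset + size)

const empty: InfiniteData<number[], number> = { pages: [], pageParams: [] }
const one = fetchNextPage(empty, config, fetchPage)
const two = fetchNextPage(one, config, fetchPage)
const three = fetchNextPage(two, config, fetchPage)
const four = fetchNextPage(three, config, fetchPage) // no further page
```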

Still needs to be done:

  • Tests - Lots
  • Turn repeated types/functions into extensions of Query logic
  • hasPrevPage & hasNextPage state
  • Didn't add prevPage yet but I did add nextPage and it's the same thing
  • Feedback - I have not used a middleware in any capacity, the querySlice just alters the substate and it uses a queryThunk otherwise. However, @phryneas mentioned a middleware is probably how it would be handled, and he's always right eventually, so I will need to be told what part is better handled there.

Open Questions

  • Middleware appropriate here?
  • Am I meant to be using merge still? Should the selector be handling the cache differently?
    • Currently it's just one query and the next page trigger continuously increments the cursor/pageParam but potentially we should be executing standard query for each PageParam and then selecting them all together?
  • Is this API appropriate at all for RTKQ?
  • Do I burn it all down?

@riqts riqts marked this pull request as draft May 5, 2024 05:40

codesandbox-ci bot commented May 5, 2024

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit ffafe62:

Sandbox Source
@examples-query-react/basic Configuration
@examples-query-react/advanced Configuration
@examples-action-listener/counter Configuration
rtk-esm-cra Configuration

@markerikson
Collaborator

Hi! I finally found some free time today to start doing some research on infinite query APIs in other libs like React Query, and am starting to wrap my brain around this space a bit.

I haven't tried to dig through this PR yet, but I definitely endorse the idea of ~~wholesale blatantly ripping off~~ taking inspiration from React Query's API here :) (Especially after having heard Tanner say that "we handle edge cases other libs don't", although I don't have specifics.)

I am trying to understand the space enough to have actual feedback, but as a very first starting point, one goal would be that this ought to work as part of the UI-agnostic RTKQ core, so that any UI layer can use it (React, Angular, etc).

Looking at React Query's impl, there's a core InfiniteQueryBehavior class and some associated InfiniteQueryObserver, and then the useInfiniteQuery hook just instantiates that.

I haven't even looked at your code yet, so I'm not sure what the specific implementation approach is, but that's the kind of ideal approach I'd like to end up with.

Let me see if I can rebase this PR so it's at least up to date and builds.

@markerikson
Collaborator

markerikson commented Sep 29, 2024

Okay, I just rebased this. I probably broke things somewhere in the process - the rebase was complicated due to us having drastically rearranged a lot of the internal typedefs a few weeks ago.

The one test added to the PR appears to run and pass. There's a dozen TS errors, which look to be related to places where we expect values like endpointDefinition.select to exist and TS no longer thinks they do.

FWIW, @phryneas separately commented that he happens to like SWR's infinite query API a bit better:

Mainly the fact that it treats all pages as individuals, gives you the ability to refetch only individual pages, and has a really compact api for doing so.
https://swr.vercel.app/docs/pagination#useswrinfinite

I don't know enough about this space yet to have an opinion here :)

I think it's worth pushing this branch forward and seeing if we can reach feature and test parity with React Query's behavior, but maybe also putting together another branch that takes more of an SWR-style approach and see how they compare.

Also, I'm going to be seeing Dominik ( @TkDodo ) at React Advanced and we'll try to chat about this topic as well.

@markerikson
Collaborator

markerikson commented Sep 29, 2024

Okay, I think this at least passes enough to let the CodeSandbox CI build succeed, so we should have a viable PR package preview to play with now!

Off the top of my head, the big open questions are:

  • does this draft PR actually work as intended, thus far?
  • what happens if we start porting tests from React Query?
  • does it work validly without React (ie just the thunks)?
  • what is missing to flesh this out?
  • assuming the whole approach runs as intended, is this actually a viable solution to the use case / feature requests? how many of the questions and use cases from the two large "infinite query" discussion threads are addressed via this implementation? how many edge cases does this not cover?
  • per Lenz's comment, how would this compare to an SWR-style API instead?

@kcrwfrd

kcrwfrd commented Oct 2, 2024

Hm, in my email notification there is a quote from @phryneas but it seems edited out now

From my experience with Apollo Client, merging multiple pages into one cache object is not a good choice.

Of course I defer to his experience and expertise, but it's kind of funny because it's something that seemed desirable for us...

We have a direct message UI that uses a paginated API request, which we further enhanced with WebSockets for update/delete of existing messages and the addition of new ones. Currently we're using the merge strategy with our RTK Query endpoint to merge multiple pages all together, also making use of createEntityAdapter to handle CRUD updates. The use of the single entity EntityAdapter automatically dedupes and sorts the messages for us.
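
For readers unfamiliar with this pattern, here is a minimal sketch of the "merge everything into one normalized collection" approach described above. It is plain TypeScript standing in for what createEntityAdapter provides. The `Message` shape, `upsertMessages`, and `removeMessage` are hypothetical names for this example, not the commenter's actual code.

```typescript
interface Message {
  id: string
  sentAt: number
  text: string
}

// Normalized state, similar in spirit to what an entity adapter maintains.
interface MessagesState {
  ids: string[]
  entities: { [id: string]: Message }
}

// Merges incoming messages (from a REST page or a WebSocket event) into the
// state. Entries with an existing id overwrite the old one, which dedupes
// overlapping pages, and ids are re-sorted by timestamp.
function upsertMessages(state: MessagesState, incoming: Message[]): MessagesState {
  const entities: { [id: string]: Message } = {}
  state.ids.forEach((id) => {
    entities[id] = state.entities[id]
  })
  incoming.forEach((message) => {
    entities[message.id] = message
  })
  const ids = Object.keys(entities).sort((a, b) => entities[a].sentAt - entities[b].sentAt)
  return { ids, entities }
}

// Removes one message by id, e.g. in response to a WebSocket delete event.
function removeMessage(state: MessagesState, id: string): MessagesState {
  const entities: { [id: string]: Message } = {}
  const ids = state.ids.filter((existing) => existing !== id)
  ids.forEach((existing) => {
    entities[existing] = state.entities[existing]
  })
  return { ids, entities }
}

// Usage: two overlapping pages, then a WebSocket edit, addition, and delete
let messages: MessagesState = { ids: [], entities: {} }
messages = upsertMessages(messages, [
  { id: 'm2', sentAt: 2, text: 'second' },
  { id: 'm3', sentAt: 3, text: 'third' },
])
messages = upsertMessages(messages, [
  { id: 'm1', sentAt: 1, text: 'first' },
  { id: 'm2', sentAt: 2, text: 'second' }, // overlaps with the first page
])
messages = upsertMessages(messages, [{ id: 'm1', sentAt: 1, text: 'first (edited)' }])
messages = upsertMessages(messages, [{ id: 'm4', sentAt: 4, text: 'fourth' }])
messages = removeMessage(messages, 'm3')
```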

If there are separate cache entries for each page, now we have to iterate through them all in order to find the entity to update/delete, and we're left with a dilemma of where to add new messages that came from the WebSockets instead of the paginated REST requests...now the page size will be out of sync if the user scrolls around and triggers new queries.

(I guess with our current merge implementation the pagination can already get out of sync... but perhaps this can be solved for with cursor-based pagination).

I'm super curious to understand what formed the opinion that "merging multiple pages into one cache object is not a good choice" :) and it would be great to have this use-case considered, as I imagine it could very well be a common one.

@phryneas
Member

phryneas commented Oct 2, 2024

I'm super curious to understand what formed the opinion that "merging multiple pages into one cache object is not a good choice" :) and it would be great to have this use-case considered, as I imagine it could very well be a common one.

Both ways are not fool-proof, but at this point I believe that having to maintain one big cache object can be a burden to the developer, and I would like to avoid going that route in a new implementation.

Some thoughts:

In cursor-based pagination, additions and deletions can be easier to handle with multiple pages that get stitched together.

  • Either you do a manual cache update, but don't refetch. In that case, your one page might get shorter or longer than the default, but all pages before and after can still normally refresh.
  • Or you refresh the "current" page, in which case the end cursor will move, which can automatically make RTKQ refetch all follow-up pages
  • On the other hand, if you manually have to maintain that list, you can now end up with duplicate rows (in the case of deletion), swallowed rows (in the case of addition), and you lose a clear signal of where you can actually refetch things - you have to kinda track that, but it's not exactly clear how.

With offset-based or page-based pagination, you kinda end up with the opposite problem, where the "big blob" approach seems simpler, but the SWR API helps here by passing in the lastData to determine the next page index. So you don't blindly go assuming the last page had 10 elements and stepping 10 forward, but you can actually see (assuming an optimistic update was made) that it has 9 or 11, and refetch that, too.
What we would need would be an additional mechanism to also tell RTKQ "invalidate all partial pages after this page" in case of a refetch where we detect that elements were deleted/inserted. But we'd need something similar in a "big blob" case, too.
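
The difference between "blindly stepping forward" and looking at what the previous pages actually contain can be sketched as follows. All function names are hypothetical, and the sketch assumes the server already reflects the same additions or deletions as the local pages. It is loosely inspired by the SWR idea of passing previous page data to the key function, not a copy of SWR's API.

```typescript
// Naive: assumes every cached page still holds exactly `pageSize` items.
function naiveNextOffset(pageCount: number, pageSize: number): number {
  return pageCount * pageSize
}

// Length-aware: sums the actual lengths of the cached pages.
function nextOffsetFromPages<Item>(pages: Item[][]): number {
  return pages.reduce((total, page) => total + page.length, 0)
}

// Indexes of pages whose length no longer matches the expected size.
// These are candidates for "refetch this page, and the ones after it".
function resizedPageIndexes<Item>(pages: Item[][], pageSize: number): number[] {
  const indexes: number[] = []
  pages.forEach((page, index) => {
    if (page.length !== pageSize) indexes.push(index)
  })
  return indexes
}

// Usage: page size 3, with "e" optimistically removed from the second page
const cachedPages = [
  ['a', 'b', 'c'],
  ['d', 'f'],
]
const naive = naiveNextOffset(cachedPages.length, 3)
const lengthAware = nextOffsetFromPages(cachedPages)
const resized = resizedPageIndexes(cachedPages, 3)
```

Here the naive calculation yields offset 6 and would skip one row, while the length-aware one yields 5.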

So... cursor-based is in my eyes easier, offset-based not necessarily more easy or difficult, just different.

we have to iterate through them all in order to find the entity to update/delete,

That could be done without iterating, with provides containing ids, and selectInvalidatedBy.

now the page size will be out of sync if the user scrolls around and triggers new queries.

That's why I like the SWR approach, where a page could grow or shrink without causing a lot of trouble.

@TkDodo

TkDodo commented Oct 4, 2024

@phryneas let’s discuss this in person in London if you’re there, just some quick, high level thoughts:

That's why I like the SWR approach, where a page could grow or shrink without causing a lot of trouble.

I don’t think we have problems with pages shrinking / growing when you have one cache entry, as each page is still stored separately. It’s fine to have one request return 10 pages, and then the next one just returns 9. You can also just delete one item from a page, that doesn’t mean an item from the next page must be moved manually - it can just stay the way it is.

One thing that’s a conscious decision for us to have one big cache entry is that it commits or errors (or retries) as one entity. That means the page only renders and receives an update after all refetches are done. Assume someone added an entry on the first page, which will “shift” all follow-up pages. If they are separate cache entries - wouldn’t you see the UI temporarily show duplicate entries? And wouldn’t it stay that way if fetching the 2nd or 3rd page fails?

Or you refresh the "current" page, in which case the end cursor will move, which can automatically make RTKQ refetch all follow-up pages

This seems great - the cursor is an input to the other cache entries, making them refresh automatically. But it also means that updates to pages before my page aren’t reflected. If someone renames an entry on page 1 of 5, and I rename something on page 3 of 5, only 3,4,5 will refetch and I won’t know about the change on page 1.

Also, refetches starting from a page in the middle can be weird with paginated queries. Suppose I change an entry on page 3 (pageSize=3) and someone deletes 3 entries before that:

A, B, C (page=1)
D, E, F (page=2)
G, H, I (page=3)
J, K, L (page=4)

Let’s rename H to X, while at the same time, C, D and E were deleted by others. If we only start to refetch with page=3, we will have an end result of:

A, B, C (page=1)
D, E, F (page=2)
J, K, L (page=3)

so the entry we updated isn’t even visible anymore, while the database actually has:

A, B, F  (page=1)
G, X, I  (page=2)
J, K, L  (page=3)

so I think the only safe thing to do is to refetch all pages when you refetch an infinite query, from the start. At least this is what we’re doing and I’m trying to talk sense into that approach 😂
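
The scenario above can be reproduced with a small simulation. This is a sketch in plain TypeScript: the "database" is an in-memory array, and `fetchPage` and `refetchFrom` are hypothetical helpers written only for this example.

```typescript
const pageSize = 3

// Page-number-based fetch against an in-memory "database" (pages start at 1).
function fetchPage(db: string[], page: number): string[] {
  return db.slice((page - 1) * pageSize, page * pageSize)
}

// Keeps cached pages before `startPage` as they are, refetches the rest,
// and drops pages that come back empty.
function refetchFrom(cachedPages: string[][], db: string[], startPage: number): string[][] {
  return cachedPages
    .map((page, index) => (index + 1 < startPage ? page : fetchPage(db, index + 1)))
    .filter((page) => page.length > 0)
}

// Initial database state and the four cached pages
const dbBefore = 'ABCDEFGHIJKL'.split('')
const cached = [1, 2, 3, 4].map((page) => fetchPage(dbBefore, page))

// Meanwhile: H is renamed to X, and C, D, E are deleted by others
const dbAfter = dbBefore
  .filter((entry) => 'CDE'.indexOf(entry) === -1)
  .map((entry) => (entry === 'H' ? 'X' : entry))

// Refetching only from page 3 keeps stale pages 1 and 2, and X is not visible
const partialRefetch = refetchFrom(cached, dbAfter, 3)

// Refetching from the start matches the database
const fullRefetch = refetchFrom(cached, dbAfter, 1)
```

In this simulation `partialRefetch` ends up as A, B, C / D, E, F / J, K, L, while `fullRefetch` ends up as A, B, F / G, X, I / J, K, L, matching the two tables above.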

@phryneas
Member

phryneas commented Oct 4, 2024

@TkDodo that sounds like you are much closer to the "multiple cache entries" than the "single cache entry" I'm arguing against - in our case, I'd use a selector to combine them into a single "outside visible cache entry", so I don't think we're far away from each other at all :)

But yeah, let's definitely talk about this in London!

@riqts
Contributor Author

riqts commented Oct 6, 2024

This draft is in a rougher state (especially around types) as it was mostly me forcing some things to get it working, but given the activity on it again, I'll hop in and clean it up, set up some tests, and update the discussion on the state of it. Honestly need to refresh myself with the problem space as I haven't done anything here since opening it :D

@markerikson
Collaborator

markerikson commented Oct 24, 2024

Okay, spent the last couple days doing some significant work on this PR to wrap my head around it, understand what it does so far, and try out both the TS types and actual implementation to see what's lacking.

I've pushed several updates:

  • The TS types now correctly reflect that the final data is an InfiniteData object of {pages: T[], pageParams: PageParam[]}
  • The original PR required that you always had to pass that data value into the thunk every time it got dispatched, meaning that you always had to retrieve and then pass in the actual contents from the cache entry. I've reworked it so that the thunk now looks up the cache entry as needed and falls back to an empty data value otherwise
  • The infinite query fetching logic only worked if the endpoint defined a query field, but not a queryFn. I've reworked the guts of executeEndpoint so that there's one function that does the work of calling either query + baseQuery or queryFn appropriately, and the infinite query logic calls that multiple times if needed
  • I've reworked the types to separate out the concept of a QueryArg vs a PageParam. As an example, you might want to fetch "fire" Pokemon as the string cache key, but the page params should be numbers.
  • I added an initialPageParam required option to match how React Query works
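
To illustrate the QueryArg vs PageParam split and the "look up the cache entry, fall back to empty data" behavior from the list above, here is a minimal sketch in plain TypeScript. The cache is a plain object, and `loadNextPage` and `fetchPokemonPage` are hypothetical stand-ins. None of this is the actual thunk code in the PR.

```typescript
interface InfiniteData<Page, PageParam> {
  pages: Page[]
  pageParams: PageParam[]
}

// QueryArg identifies the cache entry; PageParam identifies a page within it.
type QueryArg = string
type PageParam = number
type PokemonPage = string[]

type Cache = { [cacheKey: string]: InfiniteData<PokemonPage, PageParam> | undefined }

// Stand-in for a real request (made-up data, no network access).
function fetchPokemonPage(type: QueryArg, page: PageParam): PokemonPage {
  return [type + '-pokemon-' + page + 'a', type + '-pokemon-' + page + 'b']
}

// Looks up the entry for `queryArg`, falls back to empty data when there is
// none, fetches the next page, and returns an updated copy of the cache.
function loadNextPage(cache: Cache, queryArg: QueryArg, initialPageParam: PageParam): Cache {
  const existing: InfiniteData<PokemonPage, PageParam> =
    cache[queryArg] || { pages: [], pageParams: [] }
  const nextParam =
    existing.pageParams.length === 0
      ? initialPageParam
      : existing.pageParams[existing.pageParams.length - 1] + 1
  const updated: InfiniteData<PokemonPage, PageParam> = {
    pages: existing.pages.concat([fetchPokemonPage(queryArg, nextParam)]),
    pageParams: existing.pageParams.concat([nextParam]),
  }
  const nextCache: Cache = {}
  Object.keys(cache).forEach((key) => {
    nextCache[key] = cache[key]
  })
  nextCache[queryArg] = updated
  return nextCache
}

// Usage: "fire" and "water" are separate cache entries with numeric pages
let cache: Cache = {}
cache = loadNextPage(cache, 'fire', 1)
cache = loadNextPage(cache, 'fire', 1)
cache = loadNextPage(cache, 'water', 1)
```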

This desperately needs more tests and fleshing out, but I feel like it's getting close to a point where it works enough that we can try it out and say "is this the right API design in general?", knowing that it at least is working the way we think this API design ought to work.

aaaand as I say that I see that I borked the initialPageParam option type somehow. About to run off to hang out with conference folks the rest of the night, I'll fix this later! okay, should be fixed!

I do see this PR has been failing against TS 4.7 and 4.8 for a while. That'll need to be fixed eventually, but doesn't stop us from iterating on it and figuring out if this API design is the approach we want.

@markerikson
Collaborator

Pushed a bunch of fixes for useInfiniteQuery:

  • Has the correct {pages, pageParams} type for data
  • Has the correct QueryArg type for the argument
  • Also removed the need to pass through data, since the thunk now takes care of that
  • The query hook now accepts initialPageParam as an option, but no longer accepts getNext/PreviousPageParam. I can imagine it might be hypothetically useful to override those at the hook level even though they were defined in the endpoint, so I'm open to putting those back, but for now I've removed them from the hook.
  • Bunch of other types fixes, several courtesy of @aryaemami59

Also changed the pagination command 'backwards' to 'backward' to match the other 'forward', and added some more unit/type tests. It's not a full suite, but there's enough there to test basic loading, next/prev page behavior, and starting with an offset initialPageParam, for both the vanilla thunk and the hook.

@markerikson
Collaborator

Jotting down some todos as I think of them:

Functionality

  • Refetching
  • Max pages
    • enforce both getNextPageParam and getPreviousPageParam (gN/PPP) when maxPages > 0
  • hasNext/PreviousPage flags
  • Investigate moving pageParams into some new metadata field in the cache entry, so that it's not directly exposed to the user in data (per discussion with Dominik)
  • Possibly some kind of combinePages option, so that you don't have to do selectFromResult: ({data}) => data.pages.flat() (and memoize it) for every endpoint
    • do we wrap this in createSelector by default?

Tests / Example Use Cases

React Query examples

ref: https://tanstack.com/query/latest/docs/framework/react/guides/infinite-queries

  • refetches
  • bi-directional list
  • reversed display
  • manual update
  • page limits
  • more pageParam examples
  • not directly listed, but: staleness / polling?

Other

  • optimistic updates? (what does this even look like, conceptually and usage-wise?)

@markerikson
Collaborator

For notification purposes: just edited the original PR description comment with an info-dump on the status of this PR and a bunch of todos. Basically: "TRY THIS OUT AND GIVE US FEEDBACK, I'M WORKING ON THIS!"

@remus-selea

I've been testing this out, and I've noticed that upsertQueryData no longer behaves as it used to. I verified with the Redux DevTools extension and it now removes the data field entirely for the query.

Steps to reproduce

  1. Use "@reduxjs/toolkit": "2.3.0"
  2. Trigger upsertQueryData on a non-infinite query endpoint (e.g., by clicking a button):
     dispatch(apiSlice.util.upsertQueryData("getPosts", arg, newPosts));
  3. Verify that it works by checking that the data field is present with the changed values for the api/executeQuery/fulfilled action.
  4. Switch to "@reduxjs/toolkit": "https://pkg.csb.dev/reduxjs/redux-toolkit/commit/ffafe626/@reduxjs/toolkit"
  5. Repeat step 2
  6. Verify the api/executeQuery/fulfilled action. The data field is now removed entirely.

@jack-bliss

Would the team be open to supporting pagination through the link header? I think this is a pretty nice API because it means we don't have to maintain any frontend code - the backend just says how to get the next page and we're done. If the backend decides to change how pagination works (for example, switching from page=1 to offset=100 or cursor=abcd etc.) then the UI can just pick that up and roll with it. Plus it's already a standard so there may be many APIs out there already implementing this.

@remus-selea

I built a simple example showcasing the use of an infinite query with the Pokémon API for others to explore
https://codesandbox.io/p/sandbox/h5np4m

@markerikson
Collaborator

@jack-bliss how does that work with React Query's API today, if at all?

@jack-bliss

@jack-bliss how does that work with React Query's API today, if at all?

After learning more about how react-query works, I don't think it does at all. Mildly disappointing but we live and learn.

@TkDodo

TkDodo commented Nov 23, 2024

@jack-bliss how does that work with React Query's API today, if at all?

After learning more about how react-query works, I don't think it does at all. Mildly disappointing but we live and learn.

I wouldn’t say it’s impossible - it’s just that, as a tool that isn’t tied to HTTP, we don’t offer any integration per default.

But nothing stops you from:

  • reading headers in your queryFn, extract whatever information you want from the Link response header and attach that to the query data returned from the queryFn.
  • implement getNextPageParam to return that part of the data where the link rel is next
  • implement getPreviousPageParam to return that part of the data where the link rel is prev
  • in the queryFn, take the pageParam and use that as a url to fetch.

pseudo implementation:

useInfiniteQuery({
  queryKey: ['tasks'],
  // pageParam is the URL to fetch: initialPageParam first, then whatever
  // getNextPageParam / getPreviousPageParam returned
  queryFn: async ({ pageParam }) => {
    const response = await fetch(pageParam)
    if (!response.ok) {
      throw new Error('failed to fetch')
    }
    const data = await response.json()
    const links = parseLinkHeadersFromResponse(response.headers)

    return { data, links }
  },
  initialPageParam: '/tasks',
  getNextPageParam: (lastPage) => lastPage.links.next,
  getPreviousPageParam: (firstPage) => firstPage.links.prev,
})

parseLinkHeadersFromResponse is left out as an exercise to the reader (or chatGPT) :)
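
For anyone who wants a starting point for that exercise, here is one possible sketch of such a helper. It is an assumption about what `parseLinkHeadersFromResponse` could look like, not code from React Query or RTK Query. It handles the common `<url>; rel="next"` form of the Link header and is not a complete parser for every valid header. The `HeaderReader` interface is a hypothetical minimal type that the standard `Headers` object satisfies.

```typescript
// Minimal structural type; a fetch Response's `headers` object fits it.
interface HeaderReader {
  get(name: string): string | null
}

// Hypothetical helper: returns a map from link relation ("next", "prev", ...)
// to the target URL found in the Link response header.
function parseLinkHeadersFromResponse(headers: HeaderReader): { [rel: string]: string } {
  const links: { [rel: string]: string } = {}
  const header = headers.get('Link')
  if (!header) return links

  // Each entry looks like: <https://example.com/tasks?page=2>; rel="next"
  // Capture the URL, then everything up to the next "<" as its parameters.
  const entryPattern = /<([^>]*)>([^<]*)/g
  let match: RegExpExecArray | null
  while ((match = entryPattern.exec(header)) !== null) {
    const url = match[1]
    const relMatch = /;\s*rel\s*=\s*"?([^";,]+)"?/i.exec(match[2])
    if (!relMatch) continue
    // A rel value may list several relation types separated by spaces
    relMatch[1].split(/\s+/).forEach((rel) => {
      if (rel) links[rel] = url
    })
  }
  return links
}

// Usage with a fake headers object and made-up URLs
const fakeHeaders: HeaderReader = {
  get: (name) =>
    name.toLowerCase() === 'link'
      ? '<https://api.example.com/tasks?page=3>; rel="next", <https://api.example.com/tasks?page=1>; rel="prev"'
      : null,
}
const parsedLinks = parseLinkHeadersFromResponse(fakeHeaders)
const noLinks = parseLinkHeadersFromResponse({ get: () => null })
```

When a relation is absent, the corresponding property is `undefined`, which is what `getNextPageParam` needs to return to signal that there is no further page.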

@markerikson
Collaborator

@TkDodo thanks! yeah, that was roughly the train of thought in my own head. At its core, neither R-Q nor RTKQ actually "know" about HTTP headers even if that is the dominant use case, so it's up to the query function logic to extract that info from the response and make use of it in the data.

@jack-bliss

jack-bliss commented Nov 25, 2024

I wouldn’t say it’s impossible - it’s just that, as a tool that isn’t tied to HTTP, we don’t offer any integration per default.

But nothing stops you from:

  • reading headers in your queryFn, extract whatever information you want from the Link response header and attach that to the query data returned from the queryFn.
  • implement getNextPageParam to return that part of the data where the link rel is next
  • implement getPreviousPageParam to return that part of the data where the link rel is prev
  • in the queryFn, take the pageParam and use that as a url to fetch.

pseudo implementation:

...snip...

parseLinkHeadersFromResponse is left out as an exercise to the reader (or chatGPT) :)

we currently use codegen from an openapi spec, and we also return the full URL in the next link header (since that's what the spec suggests) rather than just the next params. not sure if that's compatible with the setup you're proposing!
