feat(import): simpler import + state restore logic #625

Draft
wants to merge 1 commit into
base: streaming-base

floryst
Collaborator

@floryst floryst commented Jul 16, 2024

This commit contains two changes that could probably be split into two separate PRs against the base branch. I'm putting this up as a draft for now so @PaulHax can get a sense of the changes.

The pipeline code has been removed in favor of a simpler chain-of-responsibility approach, using evaluateChain and asyncSelect.

evaluateChain is responsible for evaluating a data source against a chain of import handlers until one of them returns a new data source.
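For readers skimming the description, here is a rough sketch of the chain-of-responsibility idea. It is not the code in this PR: the `DataSource` shape, the `ImportHandler` signature, and the `resolveUri` handler are hypothetical stand-ins, and it assumes a handler signals "not mine" by returning `undefined`.

```typescript
// Hypothetical types: the real signatures in the PR may differ.
type DataSource = { uri?: string; fileName?: string; parent?: DataSource };

// Assumption: a handler returns a new data source when it handles the input,
// or undefined to pass the input on to the next handler.
type ImportHandler = (src: DataSource) => Promise<DataSource | undefined>;

// Try each handler in order and stop at the first one that produces
// a new data source.
async function evaluateChain(
  src: DataSource,
  handlers: ImportHandler[]
): Promise<DataSource | undefined> {
  for (const handler of handlers) {
    const result = await handler(src);
    if (result !== undefined) return result;
  }
  return undefined; // no handler applied
}

// Illustrative handler: derive a file name from a URI-only data source.
const resolveUri: ImportHandler = async (src) =>
  src.uri && !src.fileName
    ? { fileName: src.uri.split('/').pop() ?? 'unknown', parent: src }
    : undefined;
```

Calling `evaluateChain({ uri: 'https://example.com/a/scan.zip' }, [resolveUri])` in this sketch resolves to a new data source whose `parent` is the input, which is one way to keep provenance without a pipeline object.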

To keep processing a data source, as the old pipeline code did through nested executions, evaluateChain is invoked inside a loop for every data source. asyncSelect drives the loop, selecting evaluateChain promises as they complete.
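A minimal sketch of how such a loop could look, under stated assumptions: this `asyncSelect` is a hypothetical helper built on `Promise.race` (the PR's version may have a different signature), `step` stands in for `evaluateChain` with its handlers already bound, and a data source counts as finished once `step` returns `undefined`.

```typescript
// Hypothetical helper: wait for whichever promise settles first and return
// its value together with the promises that are still pending.
// If the first promise to settle rejects, the rejection propagates.
async function asyncSelect<T>(
  pending: Promise<T>[]
): Promise<{ value: T; rest: Promise<T>[] }> {
  const tagged = pending.map((p, index) => p.then((value) => ({ value, index })));
  const { value, index } = await Promise.race(tagged);
  return { value, rest: pending.filter((_, i) => i !== index) };
}

// Stand-in data source type, for illustration only.
type Source = { name: string; depth: number };
type Step = (src: Source) => Promise<Source | undefined>;

async function importAll(sources: Source[], step: Step): Promise<Source[]> {
  type Outcome = { input: Source; output: Source | undefined };
  const run = (input: Source): Promise<Outcome> =>
    step(input).then((output) => ({ input, output }));

  let pending = sources.map(run);
  const finished: Source[] = [];
  while (pending.length > 0) {
    const { value, rest } = await asyncSelect(pending);
    pending = rest;
    if (value.output) {
      // A handler produced a new data source: feed it back into the loop.
      pending.push(run(value.output));
    } else {
      // No handler applied: treat the input as fully processed.
      finished.push(value.input);
    }
  }
  return finished;
}
```

The loop replaces nested pipeline executions with a flat work list, so all in-flight data sources make progress concurrently and results are handled in completion order.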

The state schema is updated to generically operate on serialized data sources. Instead of special-casing for remote files, the serialized DataSource type encodes this state.
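To illustrate the idea only: the shapes below are hypothetical and are not the schema in this PR. The sketch assumes a serialized data source is a tagged union, so restore code can branch on the tag instead of keeping a separate remote-file list.

```typescript
// Hypothetical serialized shapes; the actual schema may differ.
type SerializedDataSource =
  | { type: 'uri'; uri: string } // remote: re-fetch on restore
  | { type: 'file'; path: string }; // local: stored inside the state file

// Decide how to restore a data source purely from its serialized form.
function restorePlan(src: SerializedDataSource): string {
  switch (src.type) {
    case 'uri':
      return `fetch ${src.uri}`;
    case 'file':
      return `read ${src.path} from state file`;
  }
}
```

With a union like this, adding a new kind of data source means adding one variant and one `case`, and the compiler flags any restore path that forgets it.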

@floryst floryst force-pushed the simplified-state-load-restore branch from 1849857 to 382c45e Compare July 30, 2024 18:30
@floryst floryst force-pushed the streaming-base branch 2 times, most recently from c7c1a32 to 19e8af5 Compare August 8, 2024 00:41
The pipeline code has been removed in favor of a simpler
chain-of-responsibility approach, using `evaluateChain` and
`asyncSelect`.

`evaluateChain` is responsible for evaluating a data source against a
chain of import handlers until one of them returns a new data source.

To keep processing a data source, as the old pipeline code did through
nested executions, `evaluateChain` is invoked inside a loop for every
data source. `asyncSelect` drives the loop, selecting `evaluateChain`
promises as they complete.

The state schema is updated to generically operate on serialized data
sources. Instead of special-casing for remote files, the serialized
DataSource type encodes this state.
@floryst floryst force-pushed the simplified-state-load-restore branch from 382c45e to f59f77d Compare August 8, 2024 00:44