-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interceptors #31
base: main
Are you sure you want to change the base?
Interceptors #31
Conversation
Signed-off-by: Matthias Wahl <[email protected]>
source: | ||
- id: ws-in | ||
type: ws | ||
interceptors: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we keep in and out separated what's the difference to using pre- and postprocesssors? i.e. what is the benefit we'd get from interceptors if a user still has to configure the chains separately. It seems to just move the current system into a more nested format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, i first went with only 1 chain. Why this isn't satisfying, there are two elements to this:
- There is the minor issue, that users might not always want to have the exact reversed operations. Using interceptors that are strictly bidirectional and reverse their operation from the other direction, we take that possibility from users.
- Also, there more general and important (to me) is that interceptor operations are fundamentally different for the two directions. They need to be configured differently. Putting this into the same entity, conflates and confuses
The most obvious example is the interceptor splitting input bytes by\n
. The reverse operation on outgoing data could be appending a\n
to each incoming byte array. But this is not doing exactly the reverse, it should also join those lines, to be the exact opposite. In our current implementation, some sinks do batch, some dont.
Adding this joining/batching to the interceptor who splits in one direction and joins in the other, would require to configure the max batch size and/or match batch wait time. To me it is cleaner and more flexible to separate it out into two chains like described in the RFC. Each interceptor (split
andjoin
) has a cleaner operation, that does different things.
benefits compared to pre-/postprocessors
- We have a clearer api. we have one entity for which the direction of processing is irrelevant.
- Clearer purpose for each interceptor due to more generic concept/operation.
split
andjoin
instead oflines
. - They are configurable
- Having two chains does not block people who want to do different operations on incoming and outgoing data.
If you have a good idea, to smunch both chains into one and stuff is still clear and maintains the level of flexibility this RFC exposes, I am all for it.
|
||
A `Codec` is stateless and bidirectional. All existing pre- and postprocessors define the same operation (e.g. split a given byte sequence into lines). Interceptors would join those dual operations into one entity. | ||
|
||
It might be interesting at some point to not only allow a simple chain of interceptors, feeding bytes through them, but allow graphs with branching and joining, decisions where to continue based on investigation of the current chunk. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried about this, it seems to add a lot of complexity. Mentioning it in three spots somewhat sets the expectation of that being implemented.
For example how would a graph based set of processors be different from a pipeline? Would users now be able to define whole pipelines in the connector? How would we guarantee that data that flows through a graph in the inbound follows the right graph outbound? How can we make sure this is easy to understand for a user? How do we make sure it's debug-gable? Last but not least what would be use-cases for this?
Perhaps we can just take that out and include it in a new RFC if it becomes relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense!
related: tremor-rs/tremor-runtime#529 |
Rendered