Routing API #70

tombentley · 2025-06-07T00:34:41Z

No description provided.

Signed-off-by: Tom Bentley <[email protected]>

gunnarmorling

This looks great overall! A few comments and suggestions inline.

proposals/004-routing-api.md

gunnarmorling · 2025-06-14T08:55:03Z

proposals/004-routing-api.md

+
+A route object has a `name`, an optional list of `filters` (being the names of the filters to be applied to requests/responses that traverse this route) and either a 
+`cluster` or a `router` property, which names the receiver which will handle requests after any filters have been applied. 
+Exactly one of `cluster` or `router` must be specified.


Could this also be receiver instead? This conveys the role of that entity in this context more clearly, IMO. On the downside, it would require names to be unique across cluster and router objects.

There's two points there:

Is the property better named for the role or the type which fulfills the role? I agree that role is better in general. But is receiver, specifically, a good name? I'm not sure an abstract noun makes the best choice, especially when the things which fulfill that role won't be associated with that word in the config file (how is a newbie to know that a router is a receiver? We're forcing them to read the docs and grok a bunch of concepts). Perhaps connectsTo would be a bit more obvious?

Should the names of things that fulfill the role of a receiver be in a single namespace, or a per-type one?

We already have filterDefinitions and virtualClusters as collections of named things, each in their own namespace. So it would be least surprising to stick with a per-type pattern.

OTOH, filters are not "receivers" as I've defined them here, so we could probably wrangle a compromise.

However, I think that compromise is a bit grubby. The problem is that a list of items of mixed type (either a router or a cluster) either makes the type of the item or explicit, or not.

Hiding the type of an item is not desirable because it makes it much harder to build a model picture of the overall graph. E.g. when I'm looking an an item that could be a router or a cluster then it's important to know that something is a router, because I know a router should have routes, and those routes define the next node in the DAG. Likewise I know cluster should not have routes, and that what I'm looking is is conceptually a leaf node of the DAG.

But making the type of an item explicit isn't great either. Here's a fully worked example (note the indirection properties like virtualCluster, cluster and router, which make the type of each node explicit):

nodeDefinitions: - name: my-simple-vc virtualCluster: portIdentifiesNode: ... filters: ... connectsTo: my-backing-cluster - name: my-clever-vc virtualCluster: portIdentifiesNode: ... filters: ... connectsTo: my-router - name: my-backing-cluster backingCluster: bootstrapServer: ... tls: ... - name: my-router router: type: MyRouter config: ... routes: - name: to-backing-cluster filters: # a list of filter names ... connectsTo: my-backing-cluster

Personally I find that quite hard to read because:

It could be in any order

The referring properties (connectsTo) can appear either a either a child of a virtualCluster, or a grandchild of a router (via a route).

A different kind of compromise might work better:

clusterDefinitions: - name: my-cluster ... # as defined in this proposal currently virtualClusterDefinitions: - name: my-vc filters: ... connectsTo: cluster: my-cluster - name: my-vc filters: ... connectsTo: router: my-router routerDefinions: - name: my-router type: ... config: ... routes: - name: to-backing-cluster filters: # a list of filter names connectsTo: cluster: my-backing-cluster

I find this easier to read:

There's a single place to look for all the backing clusters that can be connected to (clusterDefinitions)

There's a single place to look for all virtual cluster presented to clients (virtualClusterDefinitions)

All the intermediate nodes a jumbled together, but they're necessarily all Routers, so the list is easier to read because of the repeating pattern to the items.

The connectsTo role is used for both routers and virtual clusters.

The nature of the thing being connected-to is explicit (it's either a cluster or a router), so I know which top level property to look in to find it.

Wow, that's gone quite deep! But wdyt?

Is the property better named for the role or the type which fulfills the role? I agree that role is better in general. But is receiver, specifically, a good name?

Yes, also not a huge fan of receiver, only went with it because it was the originally used term. connectsTo is ok, but it's also kinda irregular, as all the other terms seem to be nouns, and now suddenly we switch to an imperative verb. Perhaps destination, output, sink could work?

As for the type vs. role-based naming, I only meant this in the context of this particular polymorphic reference, not for the entity definitions themselves (totally agreed that the type-based naming is the right thing to do there). To me, it's comparable to naming variables in code: I generally dislike things like customersList, i.e. encoding the type, rather the focus should be on the semantics of what it is: customers.

If you feel that something like destination: my-backing-cluster is not ideal in terms of discoverability when authoring the config (which is a good point, I'd hope schema-based tooling would offer the right completions, though?), then I like that last proposal of connectsTo: cluster: my-backing-cluster, which encodes the role and then makes the available types very explicit 👍. Just perhaps with a different name, as suggested above.

proposals/004-routing-api.md

gunnarmorling · 2025-06-14T09:00:06Z

proposals/004-routing-api.md

+This will prevent the possibility of a request getting stuck in a router loop. 
+
+In order for non-trivial router graphs to be useful, `Router` authors will need to follow the same principle of composition as `Filter` authors.
+That is, a `Router` implementation should only talk to its receivers, and not, for example, make direct connections of its own to a backing cluster.


I agree that routers should not make "out of bands" requests, but this doesn't follow from the previous "in order to be useful" to me. I mean, out of bands requests can be useful, but we're discouraging them for the sake of validation, error handling, etc., I suppose.

The idea is that all communication with the cluster should be exclusively through the router/filter graph. Routers and filters should not make their own TCP connections directly to brokers.
Routers and filters may originate their own requests through the FilterContext/RouterContext API.

Will reword.

proposals/004-routing-api.md

gunnarmorling · 2025-06-14T09:08:22Z

proposals/004-routing-api.md

+      filters: # a list of filter names
+        ...
+      cluster: my-backing-cluster
+virtualClusters:


Strictly speaking, is the notion of virtual clusters even needed at this point? IIUC, it's equivalent to having a router definition with a no-op router and one single defined route with the filter chain. There probably is value in having this more concise way of modeling that common case, but it also goes a bit against the principle of providing exactly one way to accomplish a given task. Some food for thought, not sure what's best.

I think there's some value in modelling the source and sink nodes of the graph as first-class entities, rather than as special cases of Router:

Such special casing is not easily discoverable and so can be surprising to users.

the proxy runtime itself needs to know about them because it has to bind server sockets or open client connections etc. All the other types of Routers are treated as black boxes, without the runtime interpreting their configuration.

for many users the "point of entry" is a different kind of thing, and for similar reasons. Having one place to look for all the networking-relevant configuration is better than it being obscured amonst a lot of non-networking stuff.

Obviously either way of modelling it works, and I don't think either is more or less powerful than the other. So I think this choice boils down to criteria/opinions about what makes a tasteful API.

Yes, agreed. +1 on keeping virtual clusters a separate first-class concept.

Co-authored-by: Gunnar Morling <[email protected]> Signed-off-by: Tom Bentley <[email protected]>

Signed-off-by: Tom Bentley <[email protected]>

prabhamanepalli · 2025-07-15T05:17:30Z

proposals/004-routing-api.md

+Only one backing topic is writable at any given logical time.
+
+* **Principal-aware routing**.
+A natural variation on basic SASL termination is to use the identity of the authenticated client to drive the decision about which backing cluster to route requests to.


Routing could happen on other different parameters too like topic name, attribute value in the payload or header. But, has some constraints in each like handling duplicates.

prabhamanepalli

I really like this proposal and ability to present multiple backing Kafka clusters as one single cluster to clients.

Can you please look into different types of routing other than identity/principal based routing.

Routing based on topic name, some embedded attribute in the header or payload itself.

I would like to know if the API being proposed is extensible for other types of routing other than principal based routing.

tombentley · 2025-07-16T05:01:26Z

@prabhamanepalli thanks for taking a look!

Can you please look into different types of routing other than identity/principal based routing.
Routing based on topic name, some embedded attribute in the header or payload itself.

Those should absolutely be possible, and I consider them in-scope for this proposal. The "Union cluster" idea alluded to in the text is really a special case of this, where the mapping between topics (or more generally named broker-side entities like consumer groups) is defined statically in the configuration file. More generally, that mapping could be supplied over the network (e.g. from a database, metadata store or whatever).

It's up to the router how each request gets routed. Most common would be routing a request to a single target cluster or next router. But I don't think the API should enforce that. If a router implementation wanted to route a produce request, for example, to multiple clusters then that should be possible at the API level, even if dual writes are not usually a good idea.

What's not in scope for this proposal currently is making the set of routers, or the set of target clusters, dynamic during the lifetime of the proxy process. Dynamic configuration update is a separate item on the backlog.

Routing API

a2bdb40

Signed-off-by: Tom Bentley <[email protected]>

This was referenced Jun 7, 2025

Client certificate principal propagation to filter context. kroxylicious/kroxylicious#2270

Closed

Client ssl principal propagation kroxylicious/kroxylicious#2294

Closed

gunnarmorling reviewed Jun 14, 2025

View reviewed changes

tombentley and others added 2 commits June 16, 2025 22:56

Apply suggestions from code review

10ea8dc

Co-authored-by: Gunnar Morling <[email protected]> Signed-off-by: Tom Bentley <[email protected]>

Clarify wording

8378acd

Signed-off-by: Tom Bentley <[email protected]>

tombentley mentioned this pull request Jul 4, 2025

Authentication API #71

Closed

prabhamanepalli reviewed Jul 15, 2025

View reviewed changes

Uh oh!

Routing API #70

Are you sure you want to change the base?

Routing API #70

Uh oh!

Conversation

tombentley commented Jun 7, 2025

Uh oh!

gunnarmorling left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

prabhamanepalli left a comment

Choose a reason for hiding this comment

Uh oh!

tombentley commented Jul 16, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants