
Conversation

Member
@k-wall k-wall commented Sep 19, 2025

The proposed Authorizer filter adds the ability to place authorisation checks into a Kafka system, which will be enforced by the proxy.

why: We are identifying use-cases where making authorization decisions at the proxy is desirable. Examples include restricting a virtual cluster to a subset of the resources (say, topics) of the cluster.

@k-wall k-wall requested a review from a team as a code owner September 19, 2025 10:44
franvila

This comment was marked as outdated.

@k-wall k-wall changed the title Authorizer Authorizer FIlter Sep 19, 2025
@k-wall k-wall force-pushed the authorizer branch 2 times, most recently from 1182603 to b9bd217 Compare September 19, 2025 11:34
@k-wall k-wall changed the title Authorizer FIlter 011 - Authorizer FIlter Sep 19, 2025
@k-wall k-wall changed the title 011 - Authorizer FIlter 010 - Authorizer FIlter Sep 19, 2025
Signed-off-by: Keith Wall <[email protected]>
@tombentley tombentley changed the title 010 - Authorizer FIlter 010 - Authorizer Filter Sep 21, 2025
Member

@tombentley tombentley left a comment

I think this is definitely heading in the right direction. I'm currently experimenting with a few ideas based on what you describe. I'd like to carry on doing that for a few days and then use the knowledge gained to revise this, if that's OK with you @k-wall?

```java
// Inspired by org.apache.kafka.server.authorizer.AuthorizableRequestContext
interface AuthorizableRequestContext {
    String principal();
```
Member
I think this is a little too simplistic. Currently the proxy runtime may know any of

  • A SASL authorization id
  • A TLS subject DN
  • A set of TLS SANs

What you're saying here is that something knows how to turn those things into a single principal string. In general there is not a single answer to that, so it needs to be pluggable.

Furthermore, there's a (relatively) easy opportunity to make this API more general, allowing other information to be loaded from a remote system to augment what's known about the client (think LDAP lookup).

Moreover, we should decide explicitly whether we are following:

  • the "Kafka way", where there's a single principal but this context can convey additional information like the client id, remote peer address, etc,
  • or the "JAAS way", where there is a Subject object which can hold a set of principals which can be used to model that additional information. In other words, you could have ClientId and ClientAddress as principals, and to make authorization decisions using that info.

As motivation, a use case for making an access control decision which takes into account clientId is: Suppose there is some team A in Example Corp which runs the internal KaaS, and some team B which runs a bunch of applications. Team B just has a single service account and uses different client ids for each of their apps. Team A find that one of Team B's applications is causing trouble in the cluster, and need to disable it from connecting until they can get Team B to take a look. So they need to deny the (authorizedId, clientId)-pair, so that B's other (well-behaved) applications still run.

The difference between the two ways of exposing client id in the API crops up mainly in the configuration language you propose for granting access. It's more natural to be able to say "deny CONNECT on CLUSTER to User:A and ClientId:naughty" (i.e. you just need support for and and or on principal predicates in your configuration language). In contrast, exposing it via additional properties makes the language less elegant, because you lose the regularity of the ${PrincipalType}:${PrincipalName} syntax.
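To make the comparison concrete, here is a minimal sketch of the "JAAS way" (all type and rule names here are hypothetical, not part of the proposal): a subject carries several principals, so a rule can require both User:A and ClientId:naughty to match.

```java
import java.security.Principal;
import java.util.Set;

// Hypothetical principal type carrying a ${PrincipalType}:${PrincipalName} pair.
record TypedPrincipal(String type, String name) implements Principal {
    @Override
    public String getName() { return name; }
}

// Hypothetical subject: just a bag of principals.
record Subject(Set<Principal> principals) {
    boolean has(String type, String name) {
        return principals.contains(new TypedPrincipal(type, name));
    }
}

class DenyRule {
    // Models "deny CONNECT on CLUSTER to User:A and ClientId:naughty":
    // an "and" over principal predicates — both must be present on the subject.
    static boolean matches(Subject s) {
        return s.has("User", "A") && s.has("ClientId", "naughty");
    }
}
```

With this model the configuration language only needs "and"/"or" over principal predicates; the ${PrincipalType}:${PrincipalName} syntax stays regular.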

Member Author

What you're saying here is that something knows how to turn those things into a single principal string. In general there is not a single answer to that, so it needs to be pluggable.

I don't disagree at all. However, isn't it basically the same conversation that we had during the Authenticate API work (#71) (whether to use Subject, Principal, Set)? There we decided to be content with a String until such time as we need something richer. This proposal takes the same approach.

We've got the principalType in the rules yaml, so that gives us wiggle room to refactor the APIs and implementation later, without breaking user-config.

Member

However, isn't it basically the same conversation that we had during the Authenticate API work (#71) (whether to use Subject, Principal, Set).

Yes, more or less.

There we decided to be content with a String until such times we need something richer.

I think now could be such a time, or at least it's worth discussing it again now.

Let me put my cards on the table: I like the JAAS model, where you have a Subject which can have 0 or more Principals associated with it, and where the framework itself is agnostic about the concrete Principals which applications might define. I don't know why something different was done for authz in Kafka. They ended up having to solve similar problems anyway, such as exposing things like client id to authorizers, when they rewrote their Authorizer API.

I'm not advocating using JAAS for authorization. I don't think JAAS itself gives us anything useful, especially with the removal of the security manager and all the associated classes. But we could easily reuse the model (Subject is a trivial record type, and Principal is a trivial interface), and doing so would seem to give us a route to making access control decisions on properties of subjects which are held externally, such as their roles.

But stepping away from JAAS specifically, my point is that:

  • authN currently gives us an identity as a string,
  • and different users will want to base their authZ decisions on different sources of identity (SASL, X509)

So we need some way of saying which identity should be used to grant access:

  • This could be as simple as an enum in the config which says USE_SASL or USE_X509,
  • It could be a PrincipalBuilder interface if we want a system which only admits a single kind of principal (the Kafka way),
  • It could be a SubjectBuilder if we want a system which admits multiple kinds of principal (the JAAS way)

Both the latter two have the option of supporting a lookup of some additional information in some external system (e.g. introspection endpoint, or AD/LDAP or whatever).
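The three options could be sketched roughly as follows (all names here are invented for illustration; only javax.security.auth.Subject is a real JDK type):

```java
import java.util.Optional;
import java.util.Set;
import javax.security.auth.Subject;

// Hypothetical: the identity inputs the proxy runtime may hold for a connection.
record ClientCredentials(Optional<String> saslAuthorizationId,
                         Optional<String> tlsSubjectDn,
                         Set<String> tlsSans) {}

// Option 1: a simple config enum naming which identity source grants access.
enum IdentitySource { USE_SASL, USE_X509 }

// Option 2, the "Kafka way": pluggable mapping to a single principal.
interface PrincipalBuilder {
    String buildPrincipal(ClientCredentials credentials);
}

// Option 3, the "JAAS way": pluggable mapping to a Subject holding many
// principals, with room to augment from an external system (LDAP, introspection
// endpoint, or whatever).
interface SubjectBuilder {
    Subject buildSubject(ClientCredentials credentials);
}
```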

Member Author

I think we should go the JAAS way. We should add a method FilterContext#authenticatedSubject which will return the SASL and/or X509 Principals established on the channel. The subject could be mutable, allowing, say, an LDAP plugin to augment the Subject with Group principals.

The #authorize method will accept a Subject rather than a String. It will be up to the authorizer service implementation which Principal types it supports. The implementation proposed by this proposal will initially handle the SASL principals, but the door should be open to support X509 principals later. We'll probably need something analogous to Kafka's `ssl.principal.mapping.rules`.
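A rough sketch of what that revised signature might look like (method and type names are illustrative, not settled API):

```java
import java.util.concurrent.CompletionStage;
import javax.security.auth.Subject;

// Hypothetical authorization decision type.
enum Decision { ALLOW, DENY }

interface Authorizer {
    // Takes a Subject rather than a String principal. Each implementation
    // decides which Principal types in the Subject it understands; the
    // initially proposed implementation would handle SASL principals only,
    // leaving the door open for X509 principals later.
    CompletionStage<Decision> authorize(Subject subject, String operation, String resourceName);
}
```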

Member

I've taken a look at adding:

```java
interface FilterContext {
    // ...
    CompletionStage<Subject> authenticatedSubject();
}
```

It needs to be a CompletionStage if we wanted to support augmentation. But this has consequences in terms of the FilterHandler's state machine. Specifically, when a Filter impl's on*() method asks for the authenticatedSubject() and the stage is not done then we should be toggling autoread off. Doing that means autoread might be toggled for multiple different reasons, and we need to keep track of what the state ought to be. For example, a filter might do both a call to authenticatedSubject() and return a deferred response. In this case the completion of the stage for the subject should not turn autoread back on. That should only happen when the deferred response is completed too. I don't think it's terribly complicated, but it's not entirely trivial either.

I think to keep things simplish we should avoid making the creation of the subjects pluggable for now. This means that although authenticatedSubject() can return CompletionStage<Subject> in the API, we can assume for now that those futures are, in fact, always completed.
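A sketch of that simplification (class name hypothetical): the API is async to leave room for pluggable augmentation later, but for now the returned stage is always already complete, so the FilterHandler never actually has to toggle autoread for it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import javax.security.auth.Subject;

class DefaultFilterContext {
    // Empty, mutable subject; filters/plugins could add principals as they
    // learn more about the connection.
    private final Subject subject = new Subject();

    // CompletionStage in the API, but always completed in this non-pluggable sketch.
    CompletionStage<Subject> authenticatedSubject() {
        return CompletableFuture.completedFuture(subject);
    }
}
```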

Member Author

I don't understand why authenticatedSubject needs to return a CompletionStage. Won't there always be a mutable Subject, which might be empty? Filters or plugins will add to the Subject as they learn more about the connection.

aside: is it worth separating out the new FilterContext API work from the authorizer? That would allow the Audit Filter work to continue.

Member

The problem with that approach is that it introduces non-determinism into how the proxy can behave. For example, a different ordering of events between the subject being updated and the subject being queried by a filter (and making an authz decision based on that subject) might lead to false negative (or, worse, false positive) access control decisions:

  1. A filter provides an authorized id, which initiates a request to AD to get some roles.
  2. A filter queries the authenticated subject. Finds only a SaslId, and makes a decision based on that.
  3. The AD response is received.

We cannot block (because netty), nor otherwise prevent a filter making progress at step 1. That's because the API for providing the SASL authorized id is not async. Pausing invocation of the rest of the filters in the pipeline after the authenticating filter returns does not work, because nothing prevents the same filter, having provided the id, from immediately asking for the subject.

Therefore to avoid the badness in step 2 we have to avoid actually providing the subject until we have received the response from AD. But we can't simply block the thread (because netty), so returning a CompletionStage seems to me to be the only choice we have.

```java
    String operationName();
}

enum CoreAclOperation implements AclOperation {
```
Member

One thing I don't like about this is that there's no safety mechanism to ensure that the operation is compatible with the resource type.

I was playing with a pattern last week where we would represent a resource type simply as an enum of the operations it supports (your ResourceType becomes a Class of that enum). Thus TopicResource ends up with READ, WRITE, DESCRIBE etc and ClusterResource ends up with CONNECT, DESCRIBE etc, and you can construct a type safe API where asking for an authorization decision on TopicResource.CONNECT is a compile time error.
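If I've understood the pattern, a sketch might look like the following (enum and class names are invented; only the AclOperation interface comes from the proposal): the resource type is the operation enum's Class, and the authorize signature is generic over that enum, so an operation from the wrong resource type doesn't compile.

```java
interface AclOperation {
    String operationName();
}

// A resource type is simply an enum of the operations it supports.
enum TopicOperation implements AclOperation {
    READ, WRITE, DESCRIBE;
    @Override
    public String operationName() { return name(); }
}

enum ClusterOperation implements AclOperation {
    CONNECT, DESCRIBE;
    @Override
    public String operationName() { return name(); }
}

class Authorizer {
    // Asking for a decision on a mismatched pair, e.g.
    // authorize(TopicOperation.class, ClusterOperation.CONNECT, "t"),
    // is rejected by the compiler.
    <O extends Enum<O> & AclOperation> boolean authorize(Class<O> resourceType, O operation, String resourceName) {
        return true; // stub decision, for illustration only
    }
}
```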

Member Author

@k-wall k-wall Sep 22, 2025

I like the sound of that suggestion. I was intending to iterate on the Java interfaces in a pull request. You've started that already. That's absolutely fine, of course.


```java
// Inspired by org.apache.kafka.server.authorizer.AuthorizableRequestContext
interface AuthorizableRequestContext {
```
Member
You need to consider how you handle an anonymous client. In particular, what is their principal?

Member Author

@k-wall k-wall Sep 22, 2025

Good point. I hadn't thought about that at all. I suppose you might have two gateways, one attached to a SASL listener and one attached to listener using anonymous. You might want to give the anonymous connections fewer privileges.

I need to think about this one.

Two things come to mind immediately which I want to note down:

  • As things stand now, as Kafka doesn't use https://www.rfc-editor.org/rfc/rfc4505, there's no SASL negotiation, so there's nothing to actually call io.kroxylicious.proxy.filter.FilterContext#clientSaslAuthenticationSuccess. #clientSaslContext would return an empty optional.
  • Kafka uses User:ANONYMOUS in its ACLs, but that feels weak. Somebody could define a user called ANONYMOUS and get the rights ascribed to ANONYMOUS. How useful is such an attack vector? I'm not sure.


For the `CLUSTER` `CONNECT` authorization check, this will be implemented implicitly. The check will return `ALLOW` if there is at least one resource rule for the principal. If there are no resource rules for the principal, the authorizer will return `DENY`.
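That implicit check could be sketched like this (class and record names are hypothetical), assuming the allow-list rules are indexed by principal:

```java
import java.util.List;
import java.util.Map;

class AllowListAuthorizer {
    enum Decision { ALLOW, DENY }

    // Hypothetical shape of a parsed allow-list rule.
    record ResourceRule(String resourceType, String resourceName, List<String> operations) {}

    private final Map<String, List<ResourceRule>> rulesByPrincipal;

    AllowListAuthorizer(Map<String, List<ResourceRule>> rulesByPrincipal) {
        this.rulesByPrincipal = rulesByPrincipal;
    }

    // CONNECT on CLUSTER is implicit: ALLOW iff at least one resource rule
    // exists for the principal, otherwise DENY.
    Decision authorizeConnect(String principal) {
        List<ResourceRule> rules = rulesByPrincipal.get(principal);
        return (rules != null && !rules.isEmpty()) ? Decision.ALLOW : Decision.DENY;
    }
}
```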

Member

I wonder whether YAML really is the right choice here. By saying it's YAML we end up in the space of "programming in YAML", which is at best painful. I think what you've specified is not bad, but it's only not bad because it's limited and inflexible. For example, it wouldn't cope with the clientId use case I mentioned earlier, because:

  • there's no way to say "ALLOW all these things EXCEPT for all those things". That can end up being very limiting, because the only way around it is to enumerate exactly what you want to allow, which forces expansion of patterns, which could require very, very many literals.
  • principals means "a subject with any of these principals". There's no way to match only subjects with principal serviceAccount:A and also clientId:naughty.

Of course you can model something more sophisticated using YAML, but really my point is that evolving the schema for YAML DSLs like this is just really really painful, and the resulting YAML gets less and less easy to read and write.

Perhaps it would be better to have a proper grammar (e.g. using ANTLR), so that future evolution is relatively easy:

```
allow READ on Topic with name='foo' to subjects with UserPrincipal in ('bob', 'grace')
allow * on Topic with name startingWith 'bar' to subjects with UserPrincipal in ('bob', 'grace')
allow * on Topic with name matching /baz*/ to subjects with UserPrincipal in ('bob', 'grace')
```


```yaml
config:
  authorizer: FileBasedAllowListAuthorizer
  authorizerConfig:
    rulesFile: /path/to/allow-list.yaml
```
Member

For the operator, I guess the user would need to put the file in a ConfigMap which we'd need to mount. I think that should all just work via the interpolation syntax, right?

Member Author

Yeah, I believe it will.

Comment on lines +68 to +69
It will also use the same implied operation semantics as implemented by Kafka itself, such as where `ALTER` implies `DESCRIBE`, as described by
`org.apache.kafka.common.acl.AclOperation`.
Member

We should define what has responsibility for this. It could be:

  • A part of the contract of interface Authorizer
  • An implementation detail of the particular Authorizer implementation being proposed, so that other Authorizers are not required to respect this implication

If the former then it's natural to model the implication relation on the operation enum. If the latter then probably we're saying that the Authorizer implementation has to special case those topic operations.

| DELETE | Topic | DeleteTopics |
| ALTER | Topic | AlterConfigs, IncrementalAlterConfigs, CreatePartitions |
| DESCRIBE | Topic | ListOffsets, OffsetFetch, OffsetForLeaderEpoch, DescribeProducers, ConsumerGroupHeartbeat, ConsumerGroupDescribe, ShareGroupHeartbeat, ShareGroupDescribe, Metadata, DescribeTopicPartitions |
| CONNECT | Cluster | SaslAuthenticate |
Member

Is Cluster here a virtual cluster or a target cluster?

Member

If this was the target cluster, it's hard to identify currently. We don't name our target cluster in the proxy configuration; it's an object embedded in a VirtualCluster, containing a bootstrapServers. Also, how would the Filter know which Target Cluster it's connected to? Maybe it implies we would have to elevate Target Cluster to a named entity in its own right and give Filters knowledge of the Target Cluster they are connected to.

If it's virtual cluster, we can identify it by name in config and via the Filter API.

Member

I think for current purposes it would be fine if this were the virtual cluster. But we should have explicit names (i.e. a `VIRTUAL_CLUSTER` resource type) so that it's unambiguous and we could add a `TARGET_CLUSTER` resource type in the future without upset.

Member Author

My intent was to allow access to the virtual cluster to be authorized. So VIRTUAL_CLUSTER resource type is sensible. Do we actually have a use-case for applying permissions to the TARGET_CLUSTER?

@franvila franvila self-requested a review September 25, 2025 09:00
Member Author
k-wall commented Sep 25, 2025

We said during the sync call that @tombentley was going to iterate on the Java APIs in a PR, we'd then update the proposal with the outcome of that.

@tombentley
Member

We said during the sync call that @tombentley was going to iterate on the Java APIs in a PR, we'd then update the proposal with the outcome of that.

See kroxylicious/kroxylicious#2702
