010 - Authorizer Filter #79
Signed-off-by: Keith Wall <[email protected]>
I think this is definitely heading in the right direction. I'm currently experimenting with a few ideas based on what you describe. I'd like to carry on doing that for a few days and then use the knowledge gained to revise this, if that's OK with you @k-wall?
```java
// Inspired by org.apache.kafka.server.authorizer.AuthorizableRequestContext
interface AuthorizableRequestContext {
    String principal();
```
I think this is a little too simplistic. Currently the proxy runtime may know any of:
- a SASL authorization id
- a TLS subject DN
- a set of TLS SANs

What you're saying here is that something knows how to turn those things into a single principal string. In general there is not a single answer to that, so it needs to be pluggable.
Furthermore, there's a (relatively) easy opportunity to make this API more general, allowing other information to be loaded from a remote system to augment what's known about the client (think LDAP lookup).
Moreover, we should decide explicitly whether we are following:
- the "Kafka way", where there's a single principal but this context can convey additional information like the client id, remote peer address, etc.,
- or the "JAAS way", where there is a Subject object which can hold a set of principals which can be used to model that additional information. In other words, you could have `ClientId` and `ClientAddress` as principals, and make authorization decisions using that info.

As motivation, a use case for making an access control decision which takes the client id into account is: suppose there is some team A in Example Corp which runs the internal KaaS, and some team B which runs a bunch of applications. Team B has just a single service account and uses different client ids for each of their apps. Team A finds that one of team B's applications is causing trouble in the cluster, and needs to disable it from connecting until they can get team B to take a look. So they need to deny the (authorizedId, clientId) pair, so that B's other (well-behaved) applications still run.
The difference between the two ways of exposing client id in the API crops up mainly in the configuration language you propose for granting access. It's more natural to be able to say "deny CONNECT on CLUSTER to User:A and ClientId:naughty" (i.e. you just need support for `and` and `or` on principal predicates in your configuration language). In contrast, exposing it via additional properties makes the language less elegant, because you've lost the regularity of the `${PrincipalType}:${PrincipalName}` syntax.
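To make the "and/or on principal predicates" idea concrete, here is a minimal sketch. The `Principal` record and `has` helper are illustrative placeholders, not part of the proposal's API; the point is only that `Predicate` composition naturally expresses the "deny CONNECT to User:A and ClientId:naughty" rule.

```java
import java.util.Set;
import java.util.function.Predicate;

public class PrincipalPredicates {
    // illustrative principal: a (type, name) pair like User:A or ClientId:naughty
    record Principal(String type, String name) { }

    // predicate matching any subject that carries the given principal
    static Predicate<Set<Principal>> has(String type, String name) {
        return subject -> subject.contains(new Principal(type, name));
    }

    public static void main(String[] args) {
        // "deny CONNECT on CLUSTER to User:A and ClientId:naughty":
        // matches only subjects carrying BOTH principals
        Predicate<Set<Principal>> deny = has("User", "A").and(has("ClientId", "naughty"));

        Set<Principal> naughtyApp = Set.of(new Principal("User", "A"), new Principal("ClientId", "naughty"));
        Set<Principal> wellBehaved = Set.of(new Principal("User", "A"), new Principal("ClientId", "good"));

        System.out.println(deny.test(naughtyApp));   // true: this connection is denied
        System.out.println(deny.test(wellBehaved));  // false: B's other apps still run
    }
}
```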
> What you're saying here is that something knows how to turn those things into a single principal string. In general there is not a single answer to that, so it needs to be pluggable.

I don't disagree at all. However, isn't it basically the same conversation that we had during the Authenticate API work (#71) (whether to use Subject, Principal, or Set)? There we decided to be content with a String until such time as we need something richer. The proposal takes the same approach.
We've got the principalType in the rules YAML, so that gives us wiggle room to refactor the APIs and implementation later without breaking user config.
> However, isn't it basically the same conversation that we had during the Authenticate API work (#71) (whether to use Subject, Principal, Set).

Yes, more or less.

> There we decided to be content with a String until such times we need something richer.

I think now could be such a time, or at least it's worth discussing it again now.
Let me put my cards on the table: I like the JAAS model, where you have a `Subject` which can have 0 or more `Principal`s associated with it, and where the framework itself is ambivalent about the concrete Principals which applications might define. I don't know why something different was done for authz in Kafka. They ended up having to solve similar problems anyway, such as exposing things like client id to authorizers, when they rewrote their Authorizer API.
I'm not advocating using JAAS for authorization. I don't think JAAS itself gives us anything useful, especially with the removal of the security manager and all the associated classes. But we could easily reuse the model (`Subject` is a trivial record type, and `Principal` is a trivial interface), and doing so would seem to give us a route to making access control decisions on properties of subjects which are held externally, such as their roles.
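A minimal sketch of what "reuse the model" could look like. All names here are illustrative (this is not the Kroxylicious API): `Principal` is a trivial interface, `Subject` a trivial record holding zero or more principals, and the framework stays ambivalent about which concrete principal types exist.

```java
import java.util.Optional;
import java.util.Set;

public class SubjectModel {
    // trivial interface; applications define the concrete kinds
    interface Principal {
        String name();
    }

    // illustrative concrete principals
    record SaslUserPrincipal(String name) implements Principal { }
    record ClientIdPrincipal(String name) implements Principal { }

    // trivial record: a subject is just a bag of principals
    record Subject(Set<Principal> principals) {
        // convenience lookup by principal type
        <P extends Principal> Optional<P> principal(Class<P> type) {
            return principals.stream().filter(type::isInstance).map(type::cast).findFirst();
        }
    }

    public static void main(String[] args) {
        Subject subject = new Subject(Set.of(
                new SaslUserPrincipal("alice"),
                new ClientIdPrincipal("billing-app")));
        // an authorizer could now base a decision on the client id principal too
        System.out.println(subject.principal(ClientIdPrincipal.class)
                .map(Principal::name).orElse("none"));  // prints "billing-app"
    }
}
```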
But stepping away from JAAS specifically, my point is that:
- authN currently gives us an identity as a string,
- and different users will want to base their authZ decisions on different sources of identity (SASL, X509).

So we need some way of saying which identity should be used to grant access:
- It could be as simple as an enum in the config which says `USE_SASL` or `USE_X509`,
- It could be a `PrincipalBuilder` interface, if we want a system which only admits a single kind of principal (the Kafka way),
- It could be a `SubjectBuilder`, if we want a system which admits multiple kinds of principal (the JAAS way).

Both the latter two have the option of supporting a lookup of additional information in some external system (e.g. an introspection endpoint, or AD/LDAP, or whatever).
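The latter two options above could be sketched as follows. `ClientContext`, `Subject` and `Principal` here are hypothetical placeholder types; the `CompletionStage` on `SubjectBuilder` reflects that augmentation from an external system cannot block a Netty event loop.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class BuilderOptions {
    interface Principal { String name(); }
    record Subject(Set<Principal> principals) { }

    // illustrative: what the proxy runtime knows about the connection
    record ClientContext(String saslId, String tlsSubjectDn) { }

    // "Kafka way": exactly one principal per connection
    interface PrincipalBuilder {
        Principal build(ClientContext context);
    }

    // "JAAS way": a Subject that may carry several principals, possibly
    // augmented asynchronously from LDAP/AD or an introspection endpoint
    interface SubjectBuilder {
        CompletionStage<Subject> build(ClientContext context);
    }

    public static void main(String[] args) {
        // trivial builder deriving the single principal from the SASL id
        PrincipalBuilder saslOnly = ctx -> () -> ctx.saslId();
        System.out.println(saslOnly.build(new ClientContext("alice", null)).name()); // prints "alice"

        // trivial subject builder: already-complete future, no augmentation
        SubjectBuilder simple = ctx ->
                CompletableFuture.completedFuture(new Subject(Set.of(() -> ctx.saslId())));
        simple.build(new ClientContext("alice", null));
    }
}
```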
I think we should go the JAAS way. We should add a method `FilterContext#authenticatedSubject` which will return the SASL and/or X509 principals established on the channel. The subject could be mutable, allowing, say, an LDAP plugin to augment the Subject with group principals.
The `#authorize` method will accept a Subject rather than a String. It will be up to the authorizer service implementation which Principal types it supports. The implementation proposed by this proposal will initially handle SASL principals, but the door should be open to supporting X509 principals later. We'll probably need something analogous to Kafka's `ssl.principal.mapping.rules`.
I've taken a look at adding:

```java
interface FilterContext {
    // ...
    CompletionStage<Subject> authenticatedSubject();
}
```

It needs to be a `CompletionStage` if we want to support augmentation. But this has consequences in terms of the `FilterHandler`'s state machine. Specifically, when a `Filter` impl's `on*()` method asks for the `authenticatedSubject()` and the stage is not done, then we should be toggling autoread off. Doing that means autoread might be toggled for multiple different reasons, and we need to keep track of what the state ought to be. For example, a filter might both call `authenticatedSubject()` and return a deferred response. In this case the completion of the stage for the subject should not turn autoread back on; that should only happen when the deferred response is completed too. I don't think it's terribly complicated, but it's not entirely trivial either.
I think to keep things simplish we should avoid making the creation of the subjects pluggable for now. This means that although `authenticatedSubject()` can return `CompletionStage<Subject>` in the API, we can assume for now that those futures are, in fact, always completed.
I don't understand why `authenticatedSubject` needs to return a `CompletionStage`. Won't there always be a mutable Subject, which might be empty? Filters or plugins will add to the Subject as they learn more about the connection.
Aside: is it worth separating out the new FilterContext API work from the authorizer? That would allow the Audit Filter work to continue.
The problem with that approach is that it introduces non-determinism into how the proxy can behave. For example, a different ordering of events between the subject being updated and the subject being queried by a filter (and making an authz decision based on that subject) might lead to false negative (or, worse, false positive) access control decisions:

1. A filter provides an authorized id, which initiates a request to AD to get some roles.
2. A filter queries the authenticated subject, finds only a `SaslId`, and makes a decision based on that.
3. The AD response is received.

We cannot block (because Netty), nor otherwise prevent a filter making progress at step 1. That's because the API for providing the SASL authorized id is not async. Pausing invocation of the rest of the filters in the pipeline after the authenticating filter returns does not work, because nothing prevents the same filter, having provided the id, from immediately asking for the subject.
Therefore, to avoid the badness in step 2 we have to avoid actually providing the subject until we have received the response from AD. But we can't simply block the thread (because Netty), so returning a `CompletionStage` seems to me to be the only choice we have.
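The ordering guarantee argued for here can be sketched with a future that only completes once the external lookup has returned, so a filter can never observe a subject missing its roles. All names (`authenticatedSubject`, the role lookup, the string-principal Subject) are illustrative, not the real API.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

public class DeferredSubject {
    // illustrative subject: principals as plain strings
    record Subject(Set<String> principals) { }

    static CompletableFuture<Subject> authenticatedSubject(String saslId,
                                                           CompletableFuture<List<String>> roleLookup) {
        // completes only after the roles arrive; callers chain rather than block
        return roleLookup.thenApply(roles -> {
            var principals = new java.util.HashSet<String>();
            principals.add("SaslId:" + saslId);
            roles.forEach(r -> principals.add("Role:" + r));
            return new Subject(Set.copyOf(principals));
        });
    }

    public static void main(String[] args) {
        var roles = new CompletableFuture<List<String>>();      // pretend AD request in flight
        var subject = authenticatedSubject("alice", roles);
        System.out.println(subject.isDone());                    // prints "false": no premature decision possible
        roles.complete(List.of("admin"));                        // AD response arrives
        System.out.println(subject.join().principals().contains("Role:admin")); // prints "true"
    }
}
```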
```java
    String operationName();
}

enum CoreAclOperation implements AclOperation {
```
One thing I don't like about this is that there's no safety mechanism to ensure that the operation is compatible with the resource type.
I was playing with a pattern last week where we would represent a resource type simply as an `enum` of the operations it supports (your `ResourceType` becomes a `Class` of that `enum`). Thus `TopicResource` ends up with `READ`, `WRITE`, `DESCRIBE` etc. and `ClusterResource` ends up with `CONNECT`, `DESCRIBE` etc., and you can construct a type-safe API where asking for an authorization decision on `TopicResource.CONNECT` is a compile-time error.
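A sketch of that pattern, under the assumption described above (the class names and the `Authorizer` shape are illustrative): each resource type is an enum of its supported operations, the resource type is recovered from the operation's enum class, and `TopicResource.CONNECT` simply does not exist.

```java
public class TypeSafeOperations {
    interface AclOperation { String operationName(); }

    // Topic supports READ, WRITE, DESCRIBE... but not CONNECT
    enum TopicResource implements AclOperation {
        READ, WRITE, DESCRIBE;
        public String operationName() { return name(); }
    }

    // Cluster supports CONNECT, DESCRIBE... but not READ/WRITE
    enum ClusterResource implements AclOperation {
        CONNECT, DESCRIBE;
        public String operationName() { return name(); }
    }

    interface Authorizer {
        // operation.getDeclaringClass() identifies the resource type
        <O extends Enum<O> & AclOperation> boolean authorize(O operation, String resourceName);
    }

    public static void main(String[] args) {
        Authorizer allowAll = new Authorizer() {
            public <O extends Enum<O> & AclOperation> boolean authorize(O op, String name) {
                return true;
            }
        };
        System.out.println(allowAll.authorize(TopicResource.READ, "foo"));     // prints "true"
        System.out.println(allowAll.authorize(ClusterResource.CONNECT, "my")); // prints "true"
        // allowAll.authorize(TopicResource.CONNECT, "foo"); // compile-time error: no such constant
    }
}
```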
I like the sound of that suggestion. I was intending to iterate on the Java interfaces in a pull request. You've started that already. That's absolutely fine, of course.
```java
// Inspired by org.apache.kafka.server.authorizer.AuthorizableRequestContext
interface AuthorizableRequestContext {
```
You need to consider how you handle an anonymous client. In particular, what is their principal?
Good point. I hadn't thought about that at all. I suppose you might have two gateways, one attached to a SASL listener and one attached to a listener using anonymous. You might want to give the anonymous connections fewer privileges.
I need to think about this one.
Two things come to mind immediately which I want to note down:
- As things stand now, as Kafka doesn't use https://www.rfc-editor.org/rfc/rfc4505, there's no SASL negotiation, so there's nothing to actually call `io.kroxylicious.proxy.filter.FilterContext#clientSaslAuthenticationSuccess`. `#clientSaslContext` would return an empty optional.
- Kafka uses `User:ANONYMOUS` in its ACLs, but that feels weak. Somebody could define a user called `ANONYMOUS` and get the rights ascribed to ANONYMOUS. How useful is such an attack vector? I'm not sure.
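One way the JAAS-style model discussed above could close that hole: if anonymity is a distinct principal *type* rather than the magic string `User:ANONYMOUS`, an authenticated user who happens to pick the name "ANONYMOUS" cannot collide with it. This is only a hypothetical sketch, not the proposal's design.

```java
public class AnonymousPrincipals {
    // anonymity is a type, not a reserved user name
    sealed interface Principal permits UserPrincipal, AnonymousPrincipal { }
    record UserPrincipal(String name) implements Principal { }
    record AnonymousPrincipal() implements Principal { }

    static boolean isAnonymous(Principal p) {
        return p instanceof AnonymousPrincipal;
    }

    public static void main(String[] args) {
        System.out.println(isAnonymous(new AnonymousPrincipal()));        // prints "true"
        // a user literally named "ANONYMOUS" is still just a user:
        System.out.println(isAnonymous(new UserPrincipal("ANONYMOUS")));  // prints "false"
    }
}
```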
For the `CLUSTER` `CONNECT` authorization check, this will be implemented implicitly. The check will return `ALLOW` if there is at least one resource rule for the principal. If there are no resource rules for the principal, the authorizer will return `DENY`.
I wonder whether YAML really is the right choice here. By saying it's YAML we end up in the space of "programming in YAML", which is at best painful. I think what you've specified is not bad, but it's not bad because it's limited and inflexible. For example, it wouldn't cope with the `clientId` use case I mentioned earlier, because:
- there's no way to say "ALLOW all these things EXCEPT for all those things". That can end up being very limiting, because the only way around it is to enumerate exactly what you want to allow, which forces expansion of patterns, which could require very, very many literals.
- `principals` means "a subject with any of these principals". There's no way to match only subjects with principal `serviceAccount:A` and also `clientId:naughty`.

Of course you can model something more sophisticated using YAML, but really my point is that evolving the schema for YAML DSLs like this is just really, really painful, and the resulting YAML gets less and less easy to read and write.
Perhaps it would be better to have a proper grammar (e.g. using Antlr), so that future evolution is relatively easy:

```
allow READ on Topic with name='foo' to subjects with UserPrincipal in ('bob', 'grace')
allow * on Topic with name startingWith 'bar' to subjects with UserPrincipal in ('bob', 'grace')
allow * on Topic with name matching /baz*/ to subjects with UserPrincipal in ('bob', 'grace')
```
```yaml
config:
  authorizer: FileBasedAllowListAuthorizer
  authorizerConfig:
    rulesFile: /path/to/allow-list.yaml
```
For the operator, I guess the user would need to put the file in a `ConfigMap` which we'd need to mount. I think that should all just work via the interpolation syntax, right?
Yeah, I believe it will.
It will also use the same implied operation semantics as implemented by Kafka itself, such as where `ALTER` implies `DESCRIBE`, as described by `org.apache.kafka.common.acl.AclOperation`.
We should define what has responsibility for this. It could be:
- a part of the contract of `interface Authorizer`,
- an implementation detail of the particular Authorizer implementation being proposed, so that other Authorizers are not required to respect this implication.

If the former, then it's natural to model the implication relation on the operation enum. If the latter, then probably we're saying that the Authorizer implementation has to special-case those topic operations.
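If the implication relation were part of the contract, it could live on the operation enum itself rather than being special-cased inside one implementation. A hypothetical sketch (enum and method names are illustrative), mirroring Kafka's behaviour where READ, WRITE, DELETE and ALTER each imply DESCRIBE:

```java
public class ImpliedOperations {
    enum TopicOperation {
        DESCRIBE, READ, WRITE, ALTER, DELETE;

        // the implication relation as part of the operation's own contract,
        // so every Authorizer implementation sees the same semantics
        boolean implies(TopicOperation other) {
            if (this == other) {
                return true;
            }
            // READ, WRITE, ALTER and DELETE all imply DESCRIBE
            return other == DESCRIBE && this != DESCRIBE;
        }
    }

    public static void main(String[] args) {
        System.out.println(TopicOperation.ALTER.implies(TopicOperation.DESCRIBE)); // prints "true"
        System.out.println(TopicOperation.DESCRIBE.implies(TopicOperation.ALTER)); // prints "false"
    }
}
```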
| DELETE | Topic | DeleteTopics |
| ALTER | Topic | AlterConfigs, IncrementalAlterConfigs, CreatePartitions |
| DESCRIBE | Topic | ListOffsets, OffsetFetch, OffsetForLeaderEpoch, DescribeProducers, ConsumerGroupHeartbeat, ConsumerGroupDescribe, ShareGroupHeartbeat, ShareGroupDescribe, Metadata, DescribeTopicPartitions |
| CONNECT | Cluster | SaslAuthenticate |
Is `Cluster` here a virtual cluster or a target cluster?
If this was the target cluster, it's hard to identify currently. We don't name our target cluster in the proxy configuration; it's an object embedded in a VirtualCluster, containing a `bootstrapServers`. Also, how would the Filter know which target cluster it's connected to? Maybe it implies we would have to elevate the target cluster to a named entity in its own right and give Filters knowledge of the target cluster they are connected to.
If it's the virtual cluster, we can identify it by name in config and via the Filter API.
I think for current purposes it would be fine if this were the virtual cluster. But we should have explicit names (i.e. a `VIRTUAL_CLUSTER` resource type) so that it's unambiguous, and we could add `TARGET_CLUSTER` in the future without upset.
My intent was to allow access to the virtual cluster to be authorized, so a `VIRTUAL_CLUSTER` resource type is sensible. Do we actually have a use-case for applying permissions to the `TARGET_CLUSTER`?
We said during the sync call that @tombentley was going to iterate on the Java APIs in a PR; we'd then update the proposal with the outcome of that.
The proposed Authorizer filter gives the ability to add authorisation checks into a Kafka system which will be enforced by the proxy.
Why: we are identifying use-cases where making authorization decisions at the proxy is desirable. Examples include where one wishes to restrict a virtual cluster to a subset of the resources (say, topics) of the cluster.