009 - Add azure kms proposal #78
Conversation
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
proposals/007-azure-kms.md
Outdated
> 2. Support RSA and HSM-RSA key types, wrapping using `RSA-OAEP-256` but emit a warning that it is not quantum-resistant.
> 3. Support HSM-AES key type and AES-GCM wrapping
On this, I think the EDEK should encode which key type we used, so that we can use the appropriate algorithm at key unwrapping time (which will be 1:1 with whether the key type is RSA or AES).
Curious, would querying the key's type also be an option?
Yes, you can discover the key type using the key name via the Key Vault APIs.
In my prototype I'm using this to determine which algorithm to use (and to fail fast if the key is an unsupported type) at resolveAlias time.
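For illustration, a minimal sketch of that lookup using the az CLI (the vault and key names are hypothetical; the filter itself would use the equivalent REST call):

```shell
# Discover the key type (kty) for a key, given only its name.
# Possible values include RSA, RSA-HSM and oct-HSM.
az keyvault key show --vault-name my-kv --name my-topic-key --query key.kty -o tsv
```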
Yeah, maybe it's not such a big deal to make the extra request as the unwrapped DEK will be cached.
> fail fast if the key is an unsupported type
I'd do that. We ought to do that in the other KMS implementations too :)
Another thing you can discover up front is whether `unwrapKey` and `wrapKey` are supported by the key; users are able to do things like create an RSA key that doesn't support unwrap/wrap. And you can determine whether the key is enabled.
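A sketch of how those up-front checks might look with the az CLI (hypothetical names; `key.keyOps` lists the operations the key permits and `attributes.enabled` its enabled state):

```shell
# Check up front that the key permits wrapKey/unwrapKey and is enabled.
az keyvault key show --vault-name my-kv --name my-topic-key \
  --query "{kty: key.kty, ops: key.keyOps, enabled: attributes.enabled}"
```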
Looks like it is heading in a good direction.
Something new that cropped up while working on the integration tests: the Azure key name restrictions are tighter than what Kafka topic names allow (see the sketch below).
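To illustrate the clash (hypothetical names; the pattern in the comment reflects my reading of the Key Vault docs, not something stated in this thread):

```shell
# Kafka topic names commonly contain '.' or '_', but Key Vault object
# names may only use alphanumerics and hyphens (1-127 chars), so a key
# named directly after such a topic is rejected:
az keyvault key create --vault-name my-kv --name "payments.priority_1" --kty RSA   # rejected
az keyvault key create --vault-name my-kv --name "payments-priority-1" --kty RSA   # accepted
```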
Our tests so far have expected that a kek selector based on the topic name will resolve directly to a key name. What should we do? A few ideas come to mind:
Options 1, 2 & 3 make it possible to keep the existing integration suite passing; for option 3 we would have to make some changes to the IT setup. With option 3 I don't really like that the thing picked by the kekSelector isn't a name in Azure; that's its job, to pick an alias corresponding to something upstream. Option 1 is a bit annoying in that it surfaces Azure specifics in the RecordEncryption config.
I'm tempted to say, let's just document this wrinkle for now and create an issue for it. Adapt any tests to cope with the tighter rules. TopicNameBasedKekSelector is good enough that it allows a user to try out Azure KMS. I think we know that TopicNameBasedKekSelector isn't really what we want. Tying the topic name to the key name is inflexible. The knowledge about which key should be used to encrypt which topic belongs somewhere else, but I don't think we have a clear idea exactly where. Aside: I did wonder if we could have a KekSelector that is driven from metadata (tags) on the keys themselves.
Yeah, this dovetails with the Label API discussions: whether we should make it possible to label entities that are completely unrelated to Kafka, or currently unanticipated.
Sounds good to me. I wonder what a clean way to handle the test updates will be. We are depending on the topic names produced by the test extension. I guess we could move topic creation into the tests. Or maybe we could let the test influence it more with a naming strategy. Edit: I've implemented something like that here: kroxylicious/kroxylicious-junit5-extension#517
Have you looked at how we'll ensure that the filter operates with least privilege? In the case of the other KMS, we've documented what privileges the filter needs and given guidance about how to achieve that. It would be good to include the approach we'll take to least privilege in the proposal.
Thanks @robobario. Almost there, I think. Can I ask you to restructure to use a Rejected Alternatives section and pull down the detail about things we are not doing to that section? This might seem pedantic, but I think there is value in driving towards a consistent approach for proposals. Also remember that a contributor wishing to implement support for KMS X is likely to clone this proposal as a starting point, so let's make sure it is a great example to follow.
I'm also wondering if it is wise to give the code the responsibility to emit the quantum warning. I think it might be better if we made it a responsibility of the documentation. PQC is an area of active research and the situation could change. The warnings (or lack of them) baked into code could mislead users. We can always update documentation to change PQC statements.
Keen for you to have a look too @tombentley. The bit that gives me most pause is supporting RSA. Without it users have no way to use the affordable tiers of Key Vault (which I suspect are commonly used). The product docs endorse it for this key-wrapping use-case, but it doesn't have the quantum-resistant nod. We will support AES keys with AES-GCM wrapping, but it's only available on the hellaciously expensive Managed HSM.
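For context, a hedged sketch of how the tiers map to key types with the az CLI (names are hypothetical; my understanding is that standard vaults offer software RSA, premium vaults add RSA-HSM, and only Managed HSM offers oct-HSM/AES):

```shell
# Standard vault: software-protected RSA only
az keyvault create --name my-kv --resource-group my-rg --sku standard
az keyvault key create --vault-name my-kv --name wrap-key --kty RSA --size 2048

# Premium vault: HSM-protected RSA
az keyvault create --name my-kv-prem --resource-group my-rg --sku premium
az keyvault key create --vault-name my-kv-prem --name wrap-key --kty RSA-HSM

# AES keys need a Managed HSM (assuming one named my-mhsm already exists)
az keyvault key create --hsm-name my-mhsm --name wrap-key --kty oct-HSM --size 256
```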
@robobario did you have a chance to think about this?
Will think about how to document it. In my exploration I used a policy granting the minimum privileges, and then when creating the key enabled only the minimum operations the key needs.
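A sketch of how that might look with the az CLI (principal, vault and key names are hypothetical; the exact minimum set is what the proposal needs to pin down):

```shell
# Grant the filter's service principal only what wrapping requires:
az keyvault set-policy --name my-kv --spn <app-id> \
  --key-permissions get wrapKey unwrapKey

# Create the KEK allowing only the operations the filter performs:
az keyvault key create --vault-name my-kv --name my-topic-key \
  --kty RSA --ops wrapKey unwrapKey
```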
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Thanks @robobario, I had a few questions, but overall this is a great effort and I think it's nearly there.
The API usually returns it as a `kid`: https://learn.microsoft.com/en-us/azure/key-vault/general/about-keys-secrets-certificates#object-identifiers. Like:
> For Vaults: https://{vault-name}.vault.azure.net/{object-type}/{object-name}/{object-version}
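For example, fetching a key's `kid` with the az CLI (hypothetical vault and key names):

```shell
az keyvault key show --vault-name my-vault --name my-key --query key.kid -o tsv
# => https://my-vault.vault.azure.net/keys/my-key/<version>
```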
Let's be explicit about which parts of this can vary and how. I've not read the docs, but I'm imagining that:
- `{object-name}` will obviously vary
- `{object-version}` will obviously vary
- `{object-type}` is/may always be `keys`
- `{vault-name}` ??
Let's consider too the point you made above:

> Users can also install it on-premise by negotiation.

I guess that would imply that the host name cannot be assumed to end with `.vault.azure.net`. That means that if we are going to "compress" the start of the URL it needs to be uncompressed by prefixing with the configured `keyVaultBaseUri`. But doing this comes with a commitment: that the `keyVaultBaseUri` will never be changed between when any given record is encrypted and when it's decrypted. If it does change then there is a period of time during which it's ambiguous how to uncompress a compressed key id (with the old URI or the new URI).
Can we assume that keys will absolutely never migrate between instances? If this is possible then again there are potential issues for users. They would need the key ids using the old domain name to continue to work for the lifetime of those records. We developers cannot put an upper bound on that.
> I guess that would imply that the host name cannot be assumed to end with `.vault.azure.net`.
Correct, it could also differ for other national/sovereign clouds like China's Azure instance, or the US government instance.
> `{vault-name}` ??
This is the user-defined name for their Key Vault.
> That means that if we are going to "compress" the start of the URL it needs to be uncompressed by prefixing with the configured `keyVaultBaseUri`
Correct, that's what is proposed
Some relevant capabilities:
Key Vaults can be backed up and restored within an Azure Geography and Subscription (docs). Meaning that within a geo region and subscription they could clone a key/version to a new `{vault-name}`; potentially the objects can exist in both vaults simultaneously.
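A sketch of that clone path with the az CLI (hypothetical vault/key names; restore is only permitted within the same subscription and geography):

```shell
# Download an encrypted backup of the key from the old vault...
az keyvault key backup --vault-name my-old-kv --name topic-key --file topic-key.backup
# ...and restore it, versions and all, into a new vault
az keyvault key restore --vault-name my-new-kv --file topic-key.backup
```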
Contrasting with AWS, there is no intermediate layer similar to the named Key Vaults; in AWS you only work with keys. You can't back up and restore keys, but AWS has a concept of multi-region keys to use keys across regions. In the AWS EDEK we encode just the key id, which looks to have some limitations (kroxylicious/kroxylicious#1217).
So maybe we should record more details in the records, like the vault-name? That would let us support the user moving to a new key vault and keys, while continuing to decrypt from the old vault-name. To support a migration of existing keys to a new vault-name maybe we could do it by configuration; the user could configure something like `alias(from=my-old-kv, to=my-new-kv)` and we'd handle the substitution.
There are only a very limited number of public clouds, so maybe we could get some compression there by using an enum to record if they are using public cloud and infer the host from that. Else, store the whole thing.
```shell
for cloud in $(az cloud list --query "[].name" -o tsv); do
  endpoint=$(az cloud show --name "$cloud" --query suffixes.keyvaultDns -o tsv)
  echo "$cloud: $endpoint"
done
```

```
AzureCloud: .vault.azure.net
AzureChinaCloud: .vault.azure.cn
AzureUSGovernment: .vault.usgovcloudapi.net
AzureGermanCloud: .vault.microsoftazure.de
```
I think we should serialize the object name, to enable the user to move from one key vault to another within the same cloud. We would still operate on the assumption that there is a single Key Vault in use for encryption purposes, but support decryption from multiple.
It is theoretically possible that a user could have the same vault name in multiple national clouds, but it seems like an unlikely setup. Entra is also tied to a particular cloud, so we can't auth across multiple clouds with our one token. I'm wondering if we are better off saying we are limited to working with a single cloud and externalizing the vault host in proxy configuration.
Are there other considerations? Like is it good to have the vault hostname in the record as some kind of record about where the key originated, even if it's taking up some bytes?
So some options are:
1. We serialize the object name and vault hostname as 2 fields in the serialized EDEK. We don't attempt to compress the `{object-name}` and vault hostname.
2. The same as option 1, except we have some smarts to compress the vault hostname if it's a known public hosted vault address as above.
3. We serialize only the `{object-name}` and use a vault hostname from proxy configuration. Initially we would support only a single fixed vault hostname for all records, but we could in future enable users to configure a hostname per vault object name.
Have updated the proposal with option 3. The KMS will only support operating in the context of a single cloud for now. We could always evolve towards storing the vault host later if we had some request to support operating across national clouds.
Maybe multi-tenant is more likely, where an organization has some Keys under one Azure tenant and some under another tenant. Either way I think we should wait for a request first.
I think option 3 is the right approach for now. I agree, waiting for user requirements is the right way.
One more thing here: I think we should also serialize a byte for the key type (which implies the wrapping algorithm). This means at decrypt time we will have everything we need to make the unwrap request, and won't need to fetch the key to get its key type. Feels like a sensible thing to encode anyway, as a record of how it was wrapped.
have pushed that up
I don't think we need more details in the proposal beyond saying what minimum permissions the proxy will need to operate. Out of curiosity, if the KMS sniffs the key to check its type, does it need an additional permission for that?
Really thorough job @robobario. This doc will be a great foundation for future KMS proposals.
This should probably be proposal 009
Yes, I believe that's what that permission is for. I've realised this; will try it out.
Edit: there's actually a built-in role which contains exactly those permissions.
Edit: I've confirmed it works with RBAC configured with these permissions. Also found that the API docs for wrap are incorrect, as wrapping works even if you don't supply a parameter the docs mark as required.
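If the built-in role in question is Key Vault Crypto Service Encryption User (my assumption; to my knowledge it grants key read plus wrap/unwrap, matching the set above), the RBAC setup might look like:

```shell
# Assign the built-in role to the filter's principal, scoped to the vault.
# The role name is an assumption based on the permission set discussed above.
az role assignment create --assignee <app-id> \
  --role "Key Vault Crypto Service Encryption User" \
  --scope "$(az keyvault show --name my-kv --query id -o tsv)"
```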
Co-authored-by: Tom Bentley <[email protected]> Signed-off-by: Robert Young <[email protected]>
Co-authored-by: Tom Bentley <[email protected]> Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
Signed-off-by: Robert Young <[email protected]>
spelling nit but otherwise LGTM
Co-authored-by: Keith Wall <[email protected]> Signed-off-by: Robert Young <[email protected]>
Sounds like something worth flagging up to Microsoft. If we can avoid an API call, I think that's worthwhile.
LGTM
Signed-off-by: Robert Young <[email protected]>
This enables us to execute the unwrap without having to describe the key again to obtain its type. It also adds a useful record of how the key was encrypted, with the downside that it costs a byte per record. Signed-off-by: Robert Young <[email protected]>
Nudge to @kroxylicious/developers, keen for a second approval before merging this. Latest changes: I added the vault name and key type to the serialized EDEK written into each record, so that we have all the information we need to unwrap the key without having to make a separate API call to describe the key and learn its type on the decrypt path.