Re-encryption of database contents when rotating datastore keys #1618

Open
divergentdave opened this issue Jul 18, 2023 · 2 comments

Comments

@divergentdave
Contributor

In the long term, we should be able to rotate the datastore encryption key, re-encrypt all data with a new key, and destroy the old key once it is no longer in use. Currently, we rely exclusively on trial decryption to determine which key was used to encrypt any field. It seems impractical for maintenance tooling to rely on trial decryption, either to check whether a key is still used by at-rest data or to perform the re-encryption. (Note that most, but not all, encrypted data will be either re-encrypted or deleted over several weeks during the normal course of operation.) To more efficiently handle this, we may want to assign identifiers to our keys, and make a breaking change to the format of encrypted fields to include this identifier. Then, the `decrypt()` method could skip trial decryption, and maintenance tooling could quickly scan tables, only checking these identifiers. (In fact, we could even push this to the database with `substring(column FROM 1 FOR length)`; SQL substring positions are 1-based.)

(See also divviup/divviup-api#302 (comment))
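
To make the proposed field format concrete, here is a minimal sketch of a key-identifier prefix. Everything in it is an assumption for illustration: the 4-byte big-endian identifier, and the names `wrap`, `unwrap`, and `key_ids_in_use` are hypothetical and are not taken from the existing code.

```python
import struct
from typing import Iterable, Set, Tuple

# Hypothetical layout: a 4-byte big-endian key ID followed by the ciphertext.
KEY_ID_LEN = 4


def wrap(key_id: int, ciphertext: bytes) -> bytes:
    """Prefix a ciphertext with the ID of the key that produced it."""
    return struct.pack(">I", key_id) + ciphertext


def unwrap(field: bytes) -> Tuple[int, bytes]:
    """Split a stored field into (key ID, ciphertext) without decrypting."""
    if len(field) < KEY_ID_LEN:
        raise ValueError("encrypted field is shorter than the key ID prefix")
    (key_id,) = struct.unpack(">I", field[:KEY_ID_LEN])
    return key_id, field[KEY_ID_LEN:]


def key_ids_in_use(fields: Iterable[bytes]) -> Set[int]:
    """Report which key IDs still protect at-rest data, reading prefixes only."""
    return {unwrap(field)[0] for field in fields}
```

With a layout like this, a maintenance tool could decide whether an old key is safe to destroy by collecting prefixes alone, which is the same check the database-side `substring` idea would perform.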

@branlwyd
Member

> Note that most, but not all, encrypted data will be either re-encrypted or deleted over several weeks during the normal course of operation

Task-related data will not be naturally rotated, and task-related data is the majority (all?) of encrypted data. I think we'll need a tool for this.

> To more efficiently handle this, we may want to assign identifiers to our keys

I would prefer not to introduce key identifiers, given that: the overwhelming majority of the time we will have at most ~2 live keys at once; we are the only entity handling these keys; this decryption will (I believe) only be done once per task per aggregator process, due to the task-level caching; each task has a relatively tiny amount of encrypted data attached to it; and decryption happens at the application level. Trial decryption is OK here.
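
For readers unfamiliar with the term, trial decryption as described above can be sketched as follows. This is an illustration only: `toy_encrypt` and `toy_decrypt` are a hypothetical stand-in for a real AEAD, built from the standard library so the example runs on its own, and must not be used to protect real data. The function names and the key ordering are assumptions, not the project's actual API.

```python
import hashlib
import hmac
import os
from typing import Optional, Sequence, Tuple

NONCE_LEN = 16
TAG_LEN = 32


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy authenticated encryption: nonce || XORed body || HMAC tag."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_decrypt(key: bytes, blob: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if the tag does not verify under this key."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        return None
    nonce = blob[:NONCE_LEN]
    body = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def trial_decrypt(keys: Sequence[bytes], blob: bytes) -> Tuple[int, bytes]:
    """Try each configured key in order; return (index of the key that worked, plaintext)."""
    for index, key in enumerate(keys):
        plaintext = toy_decrypt(key, blob)
        if plaintext is not None:
            return index, plaintext
    raise ValueError("no configured key could decrypt this value")
```

With about two live keys, the worst case is one failed tag check before the successful decryption, which is why the cost is negligible when it happens once per task per process.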

@divergentdave
Contributor Author

Ah, I misremembered which fields we encrypt. The only encrypted fields are `task_aggregator_auth_tokens.token`, `task_collector_auth_tokens.token`, `task_hpke_keys.private_key`, and `task_vdaf_verify_keys.vdaf_verify_keys`. That is a much smaller volume than I was thinking.
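
Given how few columns are involved, a re-encryption tool built on trial decryption could be quite small. The sketch below is one possible shape, under stated assumptions: `reencrypt_rows` is a hypothetical helper, the `decrypt` and `encrypt` callables are injected placeholders for the real datastore crypto, `keys[0]` is assumed to be the current primary key, and rows are modeled as an in-memory dict rather than real database access.

```python
from typing import Callable, Dict, Optional, Sequence

# The four encrypted columns named in this thread, as (table, column) pairs.
ENCRYPTED_COLUMNS = [
    ("task_aggregator_auth_tokens", "token"),
    ("task_collector_auth_tokens", "token"),
    ("task_hpke_keys", "private_key"),
    ("task_vdaf_verify_keys", "vdaf_verify_keys"),
]

Decrypt = Callable[[bytes, bytes], Optional[bytes]]  # returns None on failure
Encrypt = Callable[[bytes, bytes], bytes]


def reencrypt_rows(
    rows: Dict[int, bytes],
    keys: Sequence[bytes],
    decrypt: Decrypt,
    encrypt: Encrypt,
) -> Dict[int, bytes]:
    """Return new ciphertexts for rows not yet encrypted under keys[0].

    Rows already readable with the primary key are left out of the result.
    """
    primary = keys[0]
    updates: Dict[int, bytes] = {}
    for row_id, blob in rows.items():
        if decrypt(primary, blob) is not None:
            continue  # already under the primary key
        for key in keys[1:]:
            plaintext = decrypt(key, blob)
            if plaintext is not None:
                updates[row_id] = encrypt(primary, plaintext)
                break
        else:
            raise ValueError(f"row {row_id}: no configured key could decrypt it")
    return updates
```

Running such a pass over each entry in `ENCRYPTED_COLUMNS` and getting an empty result for every column would indicate that the old key no longer protects any at-rest data.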
