Float contributions value in private aggregation API #1326

Open
mehdisebbar opened this issue Nov 6, 2024 · 1 comment

Comments

@mehdisebbar

Request:

We propose allowing contributions of float values (including negative ones) to contributeToHistogramOnEvent with a sensitivity of 2^16.

Background:

Currently, contributions are limited to integer values up to the L1 bound of 2^16. For use cases such as encoding gradients (for noised gradient descent methods), we need to encode decimal values that can be negative. The current implementation requires positive and negative integer values to be contributed separately into two different buckets, which doubles the privacy cost. Since the Laplace mechanism calibrates its noise to the L1 norm (the sum of absolute values), the restriction to non-negative integers is unnecessary from a differential privacy perspective.
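For readers following along, the two-bucket workaround described above can be sketched roughly as follows. This is only an illustration, not the API's behavior: the helper names `encode_signed` and `decode_signed`, the clipping range `clip`, and the choice to spend the full 2^16 budget on one value are all assumptions made for the example.

```python
L1_BUDGET = 2 ** 16  # L1 contribution bound mentioned in this issue


def encode_signed(value, clip=1.0, budget=L1_BUDGET):
    """Hypothetical helper: map a float in [-clip, clip] to a pair of
    non-negative integers (positive-bucket value, negative-bucket value)."""
    clipped = max(-clip, min(clip, value))
    # Quantize the magnitude onto the integer range [0, budget].
    scaled = round(abs(clipped) / clip * budget)
    return (scaled, 0) if clipped >= 0 else (0, scaled)


def decode_signed(pos_sum, neg_sum, clip=1.0, budget=L1_BUDGET):
    """Hypothetical helper: recover the signed sum from the two bucket sums."""
    return (pos_sum - neg_sum) * clip / budget


pos_a, neg_a = encode_signed(0.5)    # goes entirely to the positive bucket
pos_b, neg_b = encode_signed(-0.25)  # goes entirely to the negative bucket
total = decode_signed(pos_a + pos_b, neg_a + neg_b)
```

In this sketch the decoded value is the difference of two bucket sums. If each bucket is noised independently on the server, the difference carries the noise of both, which is one way to read the extra cost described above.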

@alexmturner
Contributor

Hi @mehdisebbar, thanks for the feedback! Some initial thoughts below.

For allowing negative values, totally agree that theoretically we should be able to support this. As you point out, we would still need to use up (positive) budget for those contributions. (Also note some similar previous feedback here.) This would require some changes to the report format and processing on the Aggregation Service, however.

For allowing non-integer values, we originally chose 2^16 as we believed that it would have sufficient granularity, especially given the scale of noise added on the server. To quickly comment on privacy, I do agree that for simple flows, floats should be fine privacy-wise; however, there are some more complex flows in the future that could depend on the quantization here, e.g. choosing a threshold for key discovery.

A switch to floats would be a pretty big change in the flow, so we'd need to consider that carefully. Another option for more granularity could be to choose a larger L1 limit (and increase the server-side noise proportionally). What level of granularity do you think you would need for your use case here? (Keeping in mind the magnitude of noise the server adds.)
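To make the granularity question concrete, here is a rough sketch of how a larger L1 limit would trade off against proportionally larger noise. It assumes Laplace noise with scale equal to the L1 limit divided by a privacy parameter `epsilon`, and a value clipped to `[-clip, clip]`; the function name `granularity` and both parameters are hypothetical and chosen only for illustration.

```python
def granularity(l1_limit, clip=1.0, epsilon=1.0):
    """Return (quantization step, noise scale), both in the units of the
    original value, for a given integer L1 limit.

    Assumes Laplace noise with scale l1_limit / epsilon on the integer sums.
    """
    step = clip / l1_limit                 # smallest representable change
    noise_scale_int = l1_limit / epsilon   # noise scale on the integer scale
    noise_scale_value = noise_scale_int * clip / l1_limit
    return step, noise_scale_value


step_16, noise_16 = granularity(2 ** 16)
step_20, noise_20 = granularity(2 ** 20)
```

Under these assumptions the quantization step shrinks as the limit grows, while the noise measured in the original units stays the same, so the extra granularity only helps if the step is already comparable to the noise.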

cc @csharrison
