Add text on DP formal analysis and its assumptions #271
@@ -2338,6 +2338,45 @@ for a number of reasons:
Without allocating [=privacy budget=] for new data,
sites could exhaust their budget forever.

### Formal Analysis of Privacy Properties and Their Limitations ### {#formal-analysis}
The paper [[PPA-DP-2]] provides a formal analysis of the mathematical privacy guarantees
afforded by *per-site budgets* and by *safety limits*. Per-site budgets include [=site=]
in the [=privacy unit=], whereas safety limits exclude it, thereby enforcing a global
individual DP guarantee.
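As a rough, informal illustration of the distinction (the notation below is ours, not the spec's or the paper's): let M denote the overall measurement mechanism and S any set of possible outputs.

```latex
% Informal sketch only; notation is illustrative and not taken from the spec or [[PPA-DP-2]].
% Per-site individual DP (per-site budgets): D and D' are neighboring datasets that differ
% only in the contribution of a single privacy unit, where the unit includes the site.
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon_{\mathrm{site}}} \cdot \Pr[\mathcal{M}(D') \in S] + \delta
% Global individual DP (safety limits): the site is removed from the privacy unit, so D and
% D' may differ in all of one person's contributions across every site, and the same bound
% must hold with the global budget \varepsilon_{\mathrm{global}} in place of \varepsilon_{\mathrm{site}}.
```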
The analysis shows that *per-site individual DP guarantees* hold under a restricted system
model that makes two assumptions, which may not always be satisfied in practice:
1. *No cross-site adaptivity in data generation.* A site’s queryable data stream (impressions
   and conversions) must be generated independently of past DP query results from other sites.
2. *No leakage through cross-site shared limits.* Queries from one site must not affect which
   reports are emitted to others.

Review comment: House style is that lists are all ones.
Assumption 1 is necessary because the system involves multiple sites that could interact
with the same user over time and change the ads they show to the user, or impact the
conversions the user has, based on each other’s DP measurements. For example, if one advertiser
learns from DP measurements how to make an ad more effective, a user may convert on their site
rather than on a competitor’s. In this case, the first site’s DP outputs -- counted only against
its own per-site budget -- alter the data (or absence of data) visible to the competitor, yet
this impact is not reflected in the competitor’s per-site budget. When Assumption 1 is violated,
the analysis shows that per-site guarantees cannot be achieved.
Comment on lines +2355 to +2362: This is part of the assumption, but I think that the main
challenge here is different. Sites might gain an understanding that a particular visitor to
each site in a set is the same person (due to federated login, same email address, or anything
including navigation tracking, which we can't stop), AND THEN they decide to pool their
per-site budgets to use the API to extract more information about that person. In that case,
we have no defense from the per-site budget. Sites are only limited by their ability to link
activity across sites (which is too easy, as noted) and then by the global budget. So we should
acknowledge that limitation as well as the more theoretical one here.
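To make concrete the adaptivity loop described in the draft text for Assumption 1, here is a hedged toy sketch in Python; the function names, numbers, and decision rule are invented for illustration and are not taken from the spec or from [[PPA-DP-2]].

```python
import random

def dp_count(true_count: int, epsilon: float) -> float:
    """Laplace-noised count (difference of two exponentials), charged to the
    querying site's per-site budget."""
    return true_count + random.expovariate(epsilon) - random.expovariate(epsilon)

# Site A measures (with DP) how well ad variant "v2" converted, and adapts its campaign.
noisy_v2_conversions = dp_count(true_count=130, epsilon=0.5)
site_a_shows_v2 = noisy_v2_conversions > 100        # decision driven by A's DP output

# If A now shows the more effective ad, the user is more likely to convert on A
# instead of on competitor B, so B's future impression/conversion stream depends on
# A's DP output even though only A's per-site budget was charged for that query.
p_convert_on_b = 0.2 if site_a_shows_v2 else 0.6
user_converts_on_b = random.random() < p_convert_on_b
```

The dependency runs through the real world (which ads are shown and where the user converts), not through any API surface, which is why it cannot be accounted for in B's per-site budget.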
Assumption 2 is necessary when there are shared limits that span multiple sites; the global
safety limits, which aim to provide a global DP guarantee, are one example. If queries from
some sites cause a shared limit to be reached, reports to other sites may be filtered, creating
dependencies across separate per-site privacy units and affecting the validity of the per-site
guarantees. Thus, care must be taken when introducing any new shared limit, such as cross-site
rate limiters on privacy loss. If only Assumption 2 is violated, it is not yet known whether
per-site guarantees can still be preserved, for example through careful design of the shared limits.
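The following is a minimal sketch, in Python, of how a shared cross-site limit can create such dependencies; the class, caps, and charging rule are illustrative assumptions, not the spec's actual budgeting algorithm.

```python
from dataclasses import dataclass, field

# Toy sketch of a shared (cross-site) limit; names and numbers are illustrative only.
@dataclass
class BudgetState:
    per_site_remaining: dict[str, float] = field(default_factory=dict)
    global_remaining: float = 1.0          # shared safety limit for this privacy unit

    def try_charge(self, site: str, epsilon: float, per_site_cap: float = 0.5) -> bool:
        """Charge a query; return False (emit a null report) if any limit is exhausted."""
        site_left = self.per_site_remaining.get(site, per_site_cap)
        if epsilon > site_left or epsilon > self.global_remaining:
            return False                    # report filtered
        self.per_site_remaining[site] = site_left - epsilon
        self.global_remaining -= epsilon    # every site's query draws on the shared limit
        return True

state = BudgetState()
# Queries from siteA and siteB exhaust the shared global limit...
state.try_charge("siteA", 0.5)   # True
state.try_charge("siteB", 0.5)   # True
# ...so siteC's first query is filtered even though its per-site budget is untouched:
assert state.try_charge("siteC", 0.1) is False
```

In this sketch siteC's report is suppressed purely because of queries issued by siteA and siteB, so its output carries information that crosses per-site privacy units; this is exactly the dependency that Assumption 2 rules out.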
These results suggest that per-site protections should be regarded as theoretically grounded
approximations of an ideal per-site individual DP guarantee, one that can be established only
under certain assumptions. How much the privacy protection offered by per-site budgets is
weakened in practice remains unknown.
By contrast, the analysis shows that *safety limits* -- which operate at the global level,
excluding [=site=] from the [=privacy unit=] -- can be implemented to deliver *sound global
individual DP guarantees* regardless of whether either assumption is satisfied.
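One intuition for why the global guarantee can hold unconditionally (again an informal sketch in our own notation): a safety limit can be enforced as a privacy filter on the whole privacy unit, charging every query from every site before any report is released.

```latex
% Informal sketch; notation is ours, not the spec's or the paper's.
% Before each release, the filter checks that the accounted privacy loss for the privacy
% unit, summed over all sites s and all of their queries q, stays within the safety limit:
\sum_{s}\;\sum_{q \in Q_s} \varepsilon_q \;\le\; \varepsilon_{\mathrm{global}}
% Because this check does not depend on which site issued a query or on how sites adapt to
% each other's outputs, standard filter and adaptive-composition arguments can bound the
% global individual privacy loss without relying on Assumption 1 or Assumption 2.
```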
### Browser Instances ### {#dp-instance}
@@ -3138,6 +3177,23 @@ spec:structured header; type:dfn; urlPrefix: https://httpwg.org/specs/rfc9651;
"href": "https://arxiv.org/abs/2405.16719", | ||||||||
"title": "Cookie Monster: Efficient On-device Budgeting for Differentially-Private Ad-Measurement Systems", | ||||||||
"publisher": "SOSP'24" | ||||||||
}, | ||||||||
"ppa-dp-2": { | ||||||||
"authors": [ | ||||||||
"Pierre Tholoniat", | ||||||||
"Alison Caulfield", | ||||||||
"Giorgio Cavicchioli", | ||||||||
"Mark Chen", | ||||||||
"Nikos Goutzoulias", | ||||||||
"Benjamin Case", | ||||||||
"Asaf Cidon", | ||||||||
"Roxana Geambasu", | ||||||||
"Mathias Lécuyer", | ||||||||
"Martin Thomson" | ||||||||
], | ||||||||
"href": "https://arxiv.org/abs/2506.05290", | ||||||||
"title": "Big Bird: Privacy Budget Management for W3C's Privacy-Preserving Attribution API", | ||||||||
"publisher": "arXiv" | ||||||||
},
"prio": {
"authors": [
Review comment: I think that you want to mention both papers in the introduction to this, not
just the Big Bird one. Point out that the first establishes the basic system that this document
relies on and the second expands that to be a more comprehensive analysis of the system as a
whole and the overall privacy guarantees.

This is a timely reminder that we need to implement safety limits in the spec.