Add text on DP formal analysis and its assumptions #271
base: main
DP assumptions for formal proofs
"title": "Big Bird: Privacy Budget Management for W3C's Privacy-Preserving Attribution API",
"publisher": "arXiv"
- "title": "Big Bird: Privacy Budget Management for W3C's Privacy-Preserving Attribution API",
- "publisher": "arXiv"
+ "title": "Big Bird: Privacy Budget Management for W3C's Privacy-Preserving Attribution API"
@@ -3138,6 +3177,23 @@ spec:structured header; type:dfn; urlPrefix: https://httpwg.org/specs/rfc9651;
"href": "https://arxiv.org/abs/2405.16719",
"title": "Cookie Monster: Efficient On-device Budgeting for Differentially-Private Ad-Measurement Systems",
"publisher": "SOSP'24"
},
"ppa-dp-2": {
- "ppa-dp-2": {
+ "ppa-dp-2": {
Assumption 1 is necessary because the system involves multiple sites that could interact
with the same user over time and change the ads they show to the user, or impact the
conversions the user has, based on each other’s DP measurements. For example, if one advertiser
learns, from DP measurements, to make an ad more effective, a user may convert on their site
rather than a competitor’s. In this case, the first site’s DP outputs -- counted only against
its own per-site budget -- alter the data (or absence of data) visible to the competitor, yet
this impact is not reflected in the competitor’s per-site budget. When Assumption 1 is violated,
the analysis shows that per-site guarantees cannot be achieved.
This is part of the assumption, but I think that the main challenge here is different. Sites might learn that a particular visitor to each site in a set is the same person (through federated login, a shared email address, or anything else, including navigation tracking, which we can't stop). They might then decide to pool their per-site budgets and use the API to extract more information about that person. In that case, the per-site budget gives us no defense. Sites are limited only by their ability to link activity across sites (which is too easy, as noted) and then by the global budget.
So we should acknowledge that limitation as well as the more theoretical one here.
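To make the pooling concern concrete, here is a rough sketch of how the effective bound degrades once sites link a user. It assumes pure epsilon-DP with basic sequential composition, which is a simplification and not necessarily how the spec or the papers account for budget. The function name and all epsilon values are hypothetical and chosen only for illustration.

```python
def pooled_privacy_loss(per_site_epsilon: float,
                        colluding_sites: int,
                        global_epsilon: float) -> float:
    """Upper bound on privacy loss about one linked user.

    Assumption (not from the spec): basic sequential composition, so the
    losses of colluding sites add up until the global budget caps them.
    """
    if per_site_epsilon < 0 or global_epsilon < 0 or colluding_sites < 0:
        raise ValueError("budgets and site counts must be non-negative")
    # Each colluding site spends its own per-site budget on the same person;
    # the global budget is the only remaining limit.
    return min(colluding_sites * per_site_epsilon, global_epsilon)


if __name__ == "__main__":
    # Hypothetical budgets: 1.0 per site, 10.0 globally.
    for sites in (1, 3, 20):
        print(sites, pooled_privacy_loss(1.0, sites, 10.0))
```

With a single site the bound is the per-site budget; once enough sites collude, only the global budget applies, which is the limitation the text should acknowledge.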
1. *No cross-site adaptivity in data generation.* A site’s queryable data stream (impressions
and conversions) must be generated independently of past DP query results from other sites.
2. *No leakage through cross-site shared limits.* Queries from one site must not affect which
- 2. *No leakage through cross-site shared limits.* Queries from one site must not affect which
+ 1. *No leakage through cross-site shared limits.* Queries from one site must not affect which
House style is that lists are all ones.
thereby enforcing a global individual DP guarantee.
The analysis shows that *per-site individual DP guarantees* hold under a restricted system
model that makes two assumptions, which may not always be satisfied in practice:
- model that makes two assumptions, which may not always be satisfied in practice:
+ model that makes two assumptions, which may not always be satisfied in practice:
@@ -2338,6 +2338,45 @@ for a number of reasons:
Without allocating [=privacy budget=] for new data,
sites could exhaust their budget forever.

### Formal Analysis of Privacy Properties and Their Limitations ### {#formal-analysis}
I think that you want to mention both papers in the introduction to this, not just the Big Bird one. Point out that the first establishes the basic system that this document relies on and that the second expands it into a more comprehensive analysis of the system as a whole and its overall privacy guarantees.
This is a timely reminder that we need to implement safety limits in the spec.
With @roxanageambasu and @tholop we have put together some text for the spec that states what assumptions are used in the system model in which the formal DP analysis is done.
The referenced paper is already available, but a few claims in the PR are not yet reflected there. We will update the arXiv version to align with the PR shortly. In the meantime, we confirm that the statements are accurate and supported by analyses we already have internally.