<pre class='metadata'>
Title: Private Click Measurement
Shortname: private-click-measurement
Level: None
Status: CG-DRAFT
Group: privacycg
Repository: privacycg/private-click-measurement
URL: https://privacycg.github.io/private-click-measurement/
Editor: John Wilander, w3cid 89478, Apple Inc. https://apple.com/, [email protected]
Status Text: This specification is intended to be merged into the HTML Living Standard. It is neither a WHATWG Living Standard nor is it on the standards track at W3C.
Text Macro: LICENSE <a href=https://creativecommons.org/licenses/by/4.0/>Creative Commons Attribution 4.0 International License</a>
Abstract: This specification defines a privacy-preserving way to measure clicks that result in cross-site navigations, such as ad clicks that result in a purchase or a sign-up.
Markup Shorthands: idl yes, markdown yes
Complain About: missing-example-ids yes
</pre>
<pre class="biblio">
{
"WELL-KNOWN": {
"aliasOf": "RFC8615"
}
}
</pre>
# Introduction # {#introduction}
<em>This section is non-normative.</em>
A popular business model for the web is to get attribution and payment for clicks, for instance ad clicks that result in purchases or sign-ups. Traditionally, such attribution has been facilitated by user- or device-identifying cookies sent in cross-site HTTP requests. However, the same technology can be, and has been, used for privacy-invasive cross-site tracking of users.
The technology described in this document is intended to allow for click attribution, such as ad click attribution, while disallowing arbitrary cross-site tracking.
## Goals ## {#goals}
* Support click attribution or measurement of clicks across websites.
* Preserve user privacy, specifically prevent cross-site tracking of users.
## Terminology ## {#terminology}
: click
:: This document will use the term “click” for any kind of user gesture on web content that invokes the navigation to a link destination, such as clicks, taps, and use of accessibility tools.
: <dfn>attribution</dfn>
:: A user action on one website is attributed to a preceding click, meaning that the source of the click should receive attribution for the user action.
The four parties involved in this technology are:
: user
:: They click, end up on a destination website, and perform an action that triggers [=attribution=], such as a purchase.
: user agent
:: The web browser that acts on behalf of the user and facilitates click attribution.
: <dfn>click source website</dfn>
:: The first-party website where the user clicks.
: <dfn>attribution destination website</dfn>
:: The destination website where the user performs an action that triggers attribution.
The data consumed by the user agent to support private click measurement is:
: <dfn>attribution source id</dfn>
:: An [=eight-bit decimal value=] identifying the content that was clicked to navigate to the destination. This means support for 256 unique pieces of content, such as ads, per attribution destination on the click source. Example: `merchant.example` can run up to 256 concurrent ad campaigns on `search.example`. The valid decimal values are 0 to 255.
: <dfn>trigger data</dfn>
:: A [=four-bit decimal value=] encoding the user action that triggers the attribution. This value may encode things like specific steps in a sales funnel or the value of the sale in buckets, such as less than $10, between $10 and $50, between $51 and $200, above $200, and so on. The valid decimal values are 00 to 15.
: <dfn>optional trigger priority</dfn>
:: An optional [=six-bit decimal value=] encoding the priority of the triggering event. The priority is only intended to let the user agent pick the most important trigger data if there are multiple. One such case may be after the user has taken steps 1 through 3 in a sales funnel and the third step is the most important to get attribution for. The valid decimal values are 00 to 63.
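As a rough illustration of how an [=attribution destination website=] might map a sale onto trigger data, the sketch below buckets a purchase amount using the example thresholds above. The function name `encode_sale_bucket` and the assignment of buckets to the values 00 through 03 are assumptions made for illustration; this specification does not define them.

```python
def encode_sale_bucket(amount_usd: float) -> str:
    """Map a sale amount to a hypothetical two-digit trigger data string."""
    if amount_usd < 0:
        raise ValueError("sale amount must not be negative")
    if amount_usd < 10:
        bucket = 0  # less than $10
    elif amount_usd <= 50:
        bucket = 1  # between $10 and $50
    elif amount_usd <= 200:
        bucket = 2  # above $50, up to $200
    else:
        bucket = 3  # above $200
    # Trigger data is written as exactly two decimal digits, 00 to 15.
    return f"{bucket:02d}"
```

Only four of the sixteen available values are used here; the remaining values could encode other funnel steps.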
The final step in private click measurement results in an:
: <dfn>attribution report</dfn>
:: A report comprising the [=click source website=], [=attribution source id=], [=attribution destination website=], and [=trigger data=].
## A High Level Scenario ## {#scenario}
The following high-level scenario shows where the described technology is intended to come into play:
1. A user searches for something on `search.example`'s website.
2. The user is shown an ad for a product and clicks it.
3. The [=click source website=], `search.example`, informs the user agent (see [[#linkformat]]):
- That it will want attribution for this click.
- What the intended [=attribution destination website=] is.
- What the [=attribution source id=] is.
4. The user agent navigates and takes note that the user landed on the intended [=attribution destination website=].
5. The user's activity on the [=attribution destination website=] leads to a [=triggering event=].
6. A third-party HTTP request is made on the [=attribution destination website=] to `https://search.example/.well-known/private-click-measurement/trigger-attribution/` with [=trigger data=].
7. The user agent checks for stored clicks for the [=click source-attribute-on pair=] and if there's a hit, makes or schedules an HTTP POST request to `https://search.example/.well-known/private-click-measurement/report-attribution/` with the corresponding [=attribution report=].
ISSUE: One thing to consider here is whether there should be an option to send the [=attribution report=] to the [=attribution destination website=] too.
# Click Source Link Format # {#linkformat}
This specification adds two attributes to the {{HTMLAnchorElement}} interface. Authors can use these attributes in HTML content like so (where `17` is an [=attribution source id=] and `https://destination.example/` is an [=attribution destination website=]):
<xmp class="highlight" highlight=html>
<a attributionsourceid="17" attributiondestination="https://destination.example/">
</xmp>
Formally:
<pre class="idl">
partial interface HTMLAnchorElement {
[CEReactions] attribute unsigned long attributionSourceId;
[CEReactions] attribute DOMString attributionDestination;
};
</pre>
The IDL attributes {{HTMLAnchorElement/attributionSourceId}} and {{HTMLAnchorElement/attributionDestination}} must [=reflect=] the <code>attributionsourceid</code> and <code>attributiondestination</code> content attributes, respectively.
ISSUE(1): Should these attributes be on {{HTMLHyperlinkElementUtils}} instead?
If an element with such attributes triggers a top frame navigation that lands, possibly after HTTP redirects, on the [=attribution destination website=], the user agent stores the request for click attribution as the [=tuple=] ( [=click source website=], [=attribution destination website=], [=attribution source id=] ). If any of the conditions do not hold, such as the [=attribution source id=] not being a valid [=eight-bit decimal value=], the request for click attribution is ignored.
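The storage step above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names `stored_clicks`, `site_of`, and `store_click` are hypothetical, and `site_of` approximates a site by the URL's host, whereas a real user agent would compute the registrable domain.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory store of (source site, destination site, source id).
stored_clicks = set()

def site_of(url: str) -> str:
    """Simplified stand-in for a site: the lowercased host of the URL."""
    return (urlsplit(url).hostname or "").lower()

def store_click(source_url, requested_destination, landed_url, source_id) -> bool:
    """Store the click if all conditions hold; otherwise ignore the request."""
    if isinstance(source_id, bool) or not isinstance(source_id, int):
        return False
    if not 0 <= source_id <= 255:
        return False  # not a valid eight-bit decimal value
    destination = site_of(requested_destination)
    if not destination or site_of(landed_url) != destination:
        return False  # navigation did not land on the declared destination
    stored_clicks.add((site_of(source_url), destination, source_id))
    return True
```

For example, `store_click("https://search.example/results", "https://destination.example/", "https://destination.example/product", 17)` stores the click, while a source id of 256 is ignored.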
# Triggering of Click Attribution # {#triggering}
Triggering of attribution is what happens when there is a [=triggering event=].
Existing ad click attribution relies on third-party HTTP requests to the [=click source website=] or other websites. These requests are typically the result of invisible image elements, or "tracking pixels", placed in the DOM solely to fire HTTP GET requests. To allow for a smooth transition from these old pixel requests to Private Click Measurement, a server-side redirect to a well-known location is used as the triggering mechanism. [[!WELL-KNOWN]]
<div algorithm>
To generate a <dfn export>triggering event</dfn>, the top frame context of an [=attribution destination website=] page needs to cause the following:
1. An HTTP GET request to the [=click source website=]. This HTTP request may be the result of an HTTP redirect, such as an HTTP 302 redirect from `searchUK.example` to `search.example`. The use of HTTP GET is intentional: existing “pixel requests” can be repurposed for this, and the HTTP request should be idempotent.
1. A secure HTTP GET redirect to the URL returned by [=generate a tiggering event URL|generating a triggering event URL=] for [=click source website=] with [=trigger data=] and [=optional trigger priority=]. This mandatory redirect ensures that the [=click source website=] is in control of who can trigger click attribution on its behalf and, optionally, what the priority is. When the user agent gets such an HTTP request, it checks its stored clicks; if there is a match for ([=click source website=], [=attribution destination website=]), attribution is triggered for that stored click.
</div>
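A minimal sketch of how a user agent might recognize the redirect target and look up stored clicks is shown below. The helper names are hypothetical, stored clicks are assumed to be (source site, destination site, source id) tuples, and treating a missing priority as 0 is an assumption rather than something this specification states.

```python
PREFIX = "/.well-known/private-click-measurement/trigger-attribution/"

def two_digit_value(text: str, maximum: int):
    """Return the value of a two-digit ASCII decimal string, or None."""
    if len(text) != 2 or not all(c in "0123456789" for c in text):
        return None
    value = int(text)
    return value if value <= maximum else None

def parse_trigger_path(path: str):
    """Return (trigger_data, priority), or None if the path is not a valid trigger."""
    if not path.startswith(PREFIX):
        return None
    parts = path[len(PREFIX):].strip("/").split("/")
    if len(parts) not in (1, 2):
        return None
    trigger_data = two_digit_value(parts[0], 15)
    if trigger_data is None:
        return None
    priority = 0  # assumed default when no priority segment is present
    if len(parts) == 2:
        priority = two_digit_value(parts[1], 63)
        if priority is None:
            return None
    return trigger_data, priority

def matching_clicks(stored_clicks, source_site, destination_site):
    """Stored clicks whose (source, destination) pair matches the trigger."""
    return [c for c in stored_clicks if c[:2] == (source_site, destination_site)]
```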
# Attribution Report Format # {#report-format}
An attribution report has this JSON format:
<xmp class="highlight" highlight=json>
{
  "source_engagement_type": "click",
  "source_site": "search.example",
  "source_id": 3,
  "attributed_on_site": "destination.example",
  "trigger_data": 12,
  "version": 1
}
</xmp>
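As an illustration, a report with these fields could be serialized as sketched below. The function name `build_attribution_report` is hypothetical; the field names and the version number are taken from the example above. Note that the trigger priority is not part of the report.

```python
import json

def build_attribution_report(source_site, source_id, attributed_on_site, trigger_data):
    """Serialize an attribution report using the field names shown above."""
    report = {
        "source_engagement_type": "click",
        "source_site": source_site,
        "source_id": source_id,
        "attributed_on_site": attributed_on_site,
        "trigger_data": trigger_data,
        "version": 1,
    }
    return json.dumps(report)
```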
# Receiving Attribution Reports # {#receving-reports}
1. The user agent makes or schedules a secure HTTP POST request to the URL returned by [=generate an attribution report URL|generating an attribution report URL=] for [=click source website=] with an [=attribution report=]. The use of HTTP POST is intentional in that it differs from the HTTP GET redirect used to trigger the attribution and in that it is not expected to be idempotent.
ISSUE: We may have to add a nonce to the HTTP POST request to prohibit double counting in cases where the user agent decides to retry the request.
If there are multiple ad click attribution requests for the same [=click source-attribute-on pair=], the one with the highest [=optional trigger priority=] will be the one sent and the rest discarded.
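The selection rule above can be sketched as follows, assuming each pending trigger is represented as a (trigger data, priority) tuple. The function name `select_trigger` is hypothetical, and keeping the earliest trigger when priorities tie is an assumption; this specification does not say how ties are broken.

```python
def select_trigger(triggers):
    """Pick the (trigger_data, priority) pair with the highest priority."""
    if not triggers:
        return None
    # max() returns the first maximal element, so ties keep the earliest trigger.
    return max(triggers, key=lambda t: t[1])
```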
ISSUE: This needs to be reworked to monkeypatch HTML's "follows a hyperlink" algorithm.
# Well-Known URLs # {#well-known-urls}
## Triggering Event URL ## {#triggering-event-url}
<div algorithm>
Clients <dfn lt="generate a triggering event URL|generate a tiggering event URL">generate a triggering event URL</dfn> for |source site| with |trigger data| and |optional trigger priority| by following these steps:
1. Let |url| be the result of [=concatenating=] the strings « `".well-known"`, `"private-click-measurement"`, `"trigger-attribution"`, |trigger data|, |optional trigger priority| » using the separator U+002F (/).
1. Return the result of calling {{URL(url, base)}} with url |url| and base |source site|.
</div>
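The steps above might look like this in practice. This sketch uses Python's `urljoin` as a stand-in for the URL constructor, takes the source site as an absolute URL string, and omits the final path segment when no priority is given; that last choice is an assumption, since the steps do not say how an absent priority is handled.

```python
from urllib.parse import urljoin

def generate_triggering_event_url(source_site, trigger_data, trigger_priority=None):
    """Join the well-known path segments and resolve them against the source site."""
    segments = [
        ".well-known",
        "private-click-measurement",
        "trigger-attribution",
        trigger_data,
    ]
    if trigger_priority is not None:
        segments.append(trigger_priority)  # assumed: omitted when absent
    relative = "/".join(segments)
    return urljoin(source_site, relative)
```

For example, calling it with `"https://search.example/"`, `"12"`, and `"03"` yields `https://search.example/.well-known/private-click-measurement/trigger-attribution/12/03`.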
## Attribution Report URL ## {#attribution-report-url}
<div algorithm>
User agents <dfn>generate an attribution report URL</dfn> for |source site| by following these steps:
1. Let |url| be `".well-known/private-click-measurement/report-attribution/"`.
1. Return the result of calling {{URL(url, base)}} with url |url| and base |source site|.
</div>
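A sketch of the report URL and the corresponding POST request follows. The request is only constructed, not sent. The function names are hypothetical, and the `Content-Type: application/json` header is an assumption based on the JSON report format; no cookies or other credentials are attached.

```python
import json
import urllib.request
from urllib.parse import urljoin

REPORT_PATH = ".well-known/private-click-measurement/report-attribution/"

def generate_attribution_report_url(source_site: str) -> str:
    """Resolve the well-known report path against the source site."""
    return urljoin(source_site, REPORT_PATH)

def build_report_request(source_site: str, report: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request carrying the report."""
    return urllib.request.Request(
        generate_attribution_report_url(source_site),
        data=json.dumps(report).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
```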
# Click source/attribute-on pairs # {#click-source-attribute-on-pairs}
A <dfn>click source-attribute-on pair</dfn> is a [=tuple=] of two [=sites=]: (source, attribute-on).
# N-bit decimal values # {#n-bit-decimal-values}
A <dfn>four-bit decimal value</dfn> is a [=string=] for which the [=extract a four-bit decimal value=] algorithm does not return failure.
A <dfn>six-bit decimal value</dfn> is a [=string=] for which the [=extract a six-bit decimal value=] algorithm does not return failure.
An <dfn>eight-bit decimal value</dfn> is an unsigned long between 0 and 255 inclusive.
<div class=example id="valid-four-bit-decimal-values">
The [=strings=] `"00"` and `"15"` are both [=four-bit decimal values=], whereas `"7"`, `"20"`, and `"!!11one"` are not.
</div>
<div algorithm>
Clients <dfn type=abstract-op>extract a four-bit decimal value</dfn> from a [=string=] |string| by running these steps:
1. If |string|'s [=string/length=] is not 2, return failure.
1. Let |tens| be the [=code unit=] at position 0 within |string|, and |ones| be the [=code unit=] at position 1.
1. If |tens| is less than U+0030 (0) or greater than U+0031 (1), return failure.
1. If |ones| is less than U+0030 (0) or greater than U+0039 (9), return failure.
1. If |tens| is U+0031 (1) and |ones| is greater than U+0035 (5), return failure.
1. Return (|tens| - 0x30) * 10 + (|ones| - 0x30).
</div>
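The steps above translate almost line for line into code. In this sketch, failure is represented by `None`, which is a choice made for illustration, and the digits are converted by subtracting 0x30, the code unit of the character 0.

```python
def extract_four_bit_decimal_value(string: str):
    """Return an integer from 0 to 15, or None to signal failure."""
    if len(string) != 2:
        return None
    tens, ones = ord(string[0]), ord(string[1])
    if tens < 0x30 or tens > 0x31:  # tens digit must be 0 or 1
        return None
    if ones < 0x30 or ones > 0x39:  # ones digit must be 0 to 9
        return None
    if tens == 0x31 and ones > 0x35:  # reject 16 to 19
        return None
    return (tens - 0x30) * 10 + (ones - 0x30)
```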
<div class=example id="valid-six-bit-decimal-values">
The [=strings=] `"00"` and `"63"` are both [=six-bit decimal values=], whereas `"7"`, `"98"`, and `"!!11one"` are not.
</div>
<div algorithm>
Clients <dfn type=abstract-op>extract a six-bit decimal value</dfn> from a [=string=] |string| by running these steps:
1. If |string|'s [=string/length=] is not 2, return failure.
1. Let |tens| be the [=code unit=] at position 0 within |string|, and |ones| be the [=code unit=] at position 1.
1. If |tens| is less than U+0030 (0) or greater than U+0036 (6), return failure.
1. If |ones| is less than U+0030 (0) or greater than U+0039 (9), return failure.
1. If |tens| is U+0036 (6) and |ones| is greater than U+0033 (3), return failure.
1. Return (|tens| - 0x30) * 10 + (|ones| - 0x30).
</div>
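The four-bit and six-bit algorithms differ only in their upper bound, so an implementation might share one helper. The sketch below is one such generalization; the helper name `extract_two_digit_value` and the use of `None` for failure are assumptions. It accepts the same strings as the six-bit steps above, because a two-digit value is at most 63 exactly when the tens digit is at most 6 and, for a tens digit of 6, the ones digit is at most 3.

```python
def extract_two_digit_value(string: str, maximum: int):
    """Return the value of a two-digit string if it is at most `maximum`, else None."""
    if len(string) != 2:
        return None
    tens, ones = ord(string[0]) - 0x30, ord(string[1]) - 0x30
    if not (0 <= tens <= 9 and 0 <= ones <= 9):
        return None  # at least one code unit is not an ASCII digit
    value = tens * 10 + ones
    return value if value <= maximum else None

def extract_six_bit_decimal_value(string: str):
    """Return an integer from 0 to 63, or None to signal failure."""
    return extract_two_digit_value(string, 63)
```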
# Modern Triggering of Attribution # {#modern-triggering}
We envision a JavaScript API and a same-site "pixel" redirect on the [=attribution destination website=] as a modern [=triggering event=]. This removes the need for third-party "pixels" and would allow the [=attribution destination website=] to fire a [=triggering event=] without direct dependencies on the exact [=click source website=]s that its incoming clicks come from.
# Privacy Considerations # {#privacy}
The total entropy in Private Click Measurement HTTP requests is 12 bits (8+4), which means 4096 unique values can be managed for each [=click source-attribute-on pair=].
With no other means of cross-site tracking, the [=click source website=] and the [=attribution destination website=] have no joint view of the user or the device. This restricts the entropy under control to 8 bits, or 256 values, at any moment.
We believe these restrictions avoid general cross-site tracking while still providing useful click attribution at web scale.
In the interest of user privacy, user agents are encouraged to deploy the following restrictions to when and how they make secure HTTP POST requests to an [[#attribution-report-url|Attribution Report URL]]:
* The user agent targets a delay of [=attribution report=] requests by 24–48 hours. However, the user agent might not be running or the user's device may be disconnected from the internet, in which case the request may be delayed further.
* The user agent only holds on to the [=tuple=] ([=click source website=], [=attribution destination website=], [=attribution source id=]) for 7 days, i.e. one week of potential click attribution.
* The user agent doesn't guarantee any specific order in which multiple [=attribution report=] requests for the same [=attribution destination website=] are sent, since the order itself could be abused to increase the entropy.
* The user agent uses an ephemeral session (a.k.a. private or incognito mode) to make [=attribution report=] requests.
* The user agent doesn't use or accept any credentials such as cookies, client certificates, or Basic Authentication in [=attribution report=] requests or responses.
* The user agent may use a central clearinghouse to further anonymize [=attribution report=] requests, should a trustworthy clearinghouse exist.
* The user agent offers users a way to turn Private Click Measurement on and off.
* The user agent doesn't support Private Click Measurement in private/incognito mode.
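The delay and retention restrictions in the list above could be sketched as follows. The names are hypothetical, and drawing the delay uniformly at random from the 24 to 48 hour window is an assumption; the list only states the target window.

```python
import random
from datetime import datetime, timedelta

CLICK_LIFETIME = timedelta(days=7)

def schedule_report(triggered_at: datetime, rng=random) -> datetime:
    """Pick a send time between 24 and 48 hours after the trigger."""
    delay_seconds = rng.uniform(24 * 3600, 48 * 3600)  # assumed distribution
    return triggered_at + timedelta(seconds=delay_seconds)

def click_is_expired(clicked_at: datetime, now: datetime) -> bool:
    """A stored click is dropped once it is older than seven days."""
    return now - clicked_at > CLICK_LIFETIME
```

The scheduled time is a target only: if the user agent is not running or the device is offline at that time, the request is sent later.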
# Performance Considerations # {#performance}
The user agent may want to limit the amount of stored click attribution data. Limitations can be set per [=click source website=], per [=attribution destination website=], and on the total amount of click attribution data.
# IANA considerations # {#iana}
This document defines the `".well-known"` URIs `"trigger-attribution"` and `"report-attribution"`.
These registrations will be submitted to the IESG for review, approval, and registration with IANA using the template defined in [[!WELL-KNOWN]] as follows:
: URI suffix
:: `"trigger-attribution"`
:: `"report-attribution"`
: Change controller
:: W3C
: Specification document(s)
:: This document is the relevant specification. (See [[#triggering-event-url]] [[#attribution-report-url]].)
: Related information
:: None.
# Related Work # {#relatedwork}
The Improving Web Advertising Business Group has related work that started in January 2019. It similarly uses a .well-known path with no cookies. [[METRICS]]
Brave published a security and privacy model for ad confirmations in March 2019. [[CONFIRMATIONS]]
Google Chrome published an explainer document on May 22, 2019, for a very similar technology. They cross-reference this spec in its earlier form on the WebKit wiki. [[EVENT-LEVEL]]
<pre class="biblio">
{
"METRICS": {
"href": "https://github.com/w3c/web-advertising/blob/master/admetrics.md",
"title": "Privacy protecting metrics for web audience measurement",
"publisher": "Improving Web Advertising Business Group"
},
"CONFIRMATIONS": {
"href": "https://github.com/brave/brave-browser/wiki/Security-and-privacy-model-for-ad-confirmations",
"title": "Security and privacy model for ad confirmations",
"publisher": "Brave"
},
"EVENT-LEVEL": {
"href": "https://github.com/WICG/conversion-measurement-api",
"title": "Click Through Conversion Measurement Event-Level API Explainer",
"publisher": "Google Chrome"
}
}
</pre>
# Acknowledgements # {#acknowledgements}
Thanks to
Brent Fulgham,
Ehsan Akhgari,
Erik Neuenschwander,
Jason Novak,
Maciej Stachowiak,
Mark Xue,
and
Steven Englehardt
for their feedback on this proposal.