Consider relaxing type of `matchValue` #58

andyleiserson · 2024-12-17T00:38:28Z

We currently specify that this be an integer, but if we want it to be extensible to more complex filtering in the future, we should possibly specify it as `any`.

apasel422 · 2025-02-11T22:05:00Z

I assume that we can start with integers for now and loosen that requirement in the future backwards-compatibly?

apasel422 · 2025-03-25T17:05:54Z

Making filtering more powerful than simple integer equality is useful, though it is worth pointing out what we learned from Attribution Reporting’s declarative filtering API: Users are always going to want something more general than the declarative API allows.

ARA added operations over time (first single-value equality, then “any-of” matching in lists of values, then disjunctions over multiple such lists, then negation of these matches). This has led to a complex specification and implementation that is still not completely general, and in which users struggle to express what should in essence be arbitrary boolean predicates.

It makes sense for PPA to support declarative filtering for simple operations like integer equality between values specified on the impression and conversion sides. But once aggregates (lists of values, key-value associations) or operations other than equality are involved, it quickly becomes complicated to specify how conversions express things like `filterData[someField].contains(someValue) && filterData[otherField] > 5`. That is effectively a domain-specific language representing the execution of a pure function that accepts metadata about the impression and returns true or false to indicate whether the conversion should match it.
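To make the "pure function" framing concrete, here is a minimal TypeScript sketch. The `ImpressionMetadata` shape, the helper names, and the constant `7` standing in for `someValue` are all hypothetical and do not come from any PPA specification; the sketch only contrasts a declarative equality match with the kind of general predicate quoted above.

```typescript
// Hypothetical shape of impression-side metadata (an assumption for illustration).
interface ImpressionMetadata {
  matchValue: number;
  filterData: Record<string, number | number[]>;
}

// Conceptually, a filter is a pure function from impression metadata to a boolean.
type ImpressionPredicate = (impression: ImpressionMetadata) => boolean;

// Declarative case: integer equality between impression and conversion sides.
function equalsMatchValue(expected: number): ImpressionPredicate {
  return (impression) => impression.matchValue === expected;
}

// General case: filterData[someField].contains(someValue) && filterData[otherField] > 5,
// with someValue fixed to 7 purely for the example.
const generalPredicate: ImpressionPredicate = (impression) => {
  const some = impression.filterData["someField"];
  const other = impression.filterData["otherField"];
  return (
    Array.isArray(some) &&
    some.includes(7) &&
    typeof other === "number" &&
    other > 5
  );
};
```

The first form can be specified as data (a single integer), while the second needs either an expression language or sandboxed code execution, which is where the concerns in this comment arise.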

Before deciding whether impressions should be able to set other data types as their filterData field, I think we should consider how exactly the conversion side will operate on those values, especially given the need to sandbox these expressions for security, performance (limitations on time/memory consumption), and to avoid retention of PPA-level impression data between operations (i.e. no side-channels involving network/storage). In other words, it doesn’t seem like we could simply allow measureConversion to accept a normal JS callback in its options, as there would be no way for PPA to execute it in a manner consistent with those restrictions (at the very least, callbacks could easily exfiltrate filter data for all matching impressions).

We will also need to consider how these predicates are provided in the HTTP API, if measureConversion is ever exposed through it, making an approach based on JS callbacks alone insufficient, even if they could be executed safely.

It might be possible to do something like this with worklets (perhaps similar to Shared Storage).

martinthomson · 2025-03-26T02:32:59Z

I have a strong desire to avoid the use of worklets in this API: hiding timing side channels in isolated processing is something that fenced frames and Protected Audience fail at; I don't want this API to be the reason we add another such system.

I do think that we need to have a discussion about what the future of the API is in terms of querying capabilities. I know that @michaelkleber expressed a desire to turn the decision-making process into a simple inner product. Or even a matrix product followed by a vector product. This gives you a lot more power than people often appreciate, even if it is not necessarily user-friendly. The advantage would be fixed expectations about running time.
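One possible reading of the inner-product idea, sketched in TypeScript. The one-hot encoding, the function names, and the "non-zero means selected" rule are assumptions made for illustration; nothing in this thread specifies them. The point is only that the work done depends on the vector length, not on the values involved.

```typescript
// Hypothetical encoding: an impression is a one-hot vector over a fixed set of categories.
function oneHot(index: number, length: number): number[] {
  return Array.from({ length }, (_, i) => (i === index ? 1 : 0));
}

// Plain inner product; the number of multiplications is fixed by the vector length.
function innerProduct(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// The conversion supplies one weight per category; a non-zero product selects the impression.
function selects(impression: readonly number[], weights: readonly number[]): boolean {
  return innerProduct(impression, weights) !== 0;
}
```

With weights `[0, 1, 1, 0]`, an impression in category 1 or 2 is selected and one in category 0 or 3 is not, which expresses an "any-of" match without any branching on impression data.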

(Having that discussion here might not be ideal. I would like to rename filterData to something better (#24) and then talk about the addition of other fields that can be used to more precisely select impressions for consideration. I had thought that this could be done incrementally, but it makes sense to plot out a bigger plan. Should we open an issue for discussion at our upcoming face-to-face? We might make some progress ahead of that time, but this probably isn't where people will expect to find that discussion.)

apasel422 · 2025-04-02T15:39:57Z

Maybe for now it's sufficient to allow the conversion side to specify a list of integers to match for disjunction.
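A minimal sketch of what list-based disjunction could look like. `StoredImpression`, `selectImpressions`, and the `matchValues` parameter are hypothetical names, and treating an empty list as "match nothing" is an assumption rather than something decided here.

```typescript
// Hypothetical stored impression; only the field relevant to matching is shown.
interface StoredImpression {
  id: string;
  matchValue: number;
}

// Keep an impression if its matchValue equals any of the integers the conversion lists.
function selectImpressions(
  impressions: readonly StoredImpression[],
  matchValues: readonly number[],
): StoredImpression[] {
  const allowed = new Set(matchValues);
  return impressions.filter((impression) => allowed.has(impression.matchValue));
}
```

Single-integer equality is then just the special case of a one-element list.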

apasel422 · 2025-04-04T12:17:03Z

A related question here:

Per the Web IDL standard:

> The `unsigned long` type corresponds to 32-bit unsigned integers.

Do we think that's sufficient, even in the integer-only use case? If not, there's further complexity due to JavaScript's Number.MAX_SAFE_INTEGER and structured header integers being limited to the range [-999,999,999,999,999, 999,999,999,999,999].
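To compare the three limits mentioned here side by side, a small TypeScript sketch follows. The constants are the documented maxima for each surface; the `Surface` type and `surfacesThatFit` helper are hypothetical names introduced only for this illustration.

```typescript
const UNSIGNED_LONG_MAX = 2 ** 32 - 1;       // Web IDL unsigned long (32-bit)
const SF_INTEGER_MAX = 999_999_999_999_999;  // structured field integer (15 decimal digits)
const JS_SAFE_MAX = Number.MAX_SAFE_INTEGER; // 2^53 - 1

type Surface = "safe integer" | "structured field" | "unsigned long";

// Report which representations can carry a given non-negative integer without loss.
function surfacesThatFit(value: number): Surface[] {
  if (!Number.isSafeInteger(value) || value < 0) {
    return [];
  }
  const fits: Surface[] = ["safe integer"];
  if (value <= SF_INTEGER_MAX) fits.push("structured field");
  if (value <= UNSIGNED_LONG_MAX) fits.push("unsigned long");
  return fits;
}
```

The ordering is `UNSIGNED_LONG_MAX < SF_INTEGER_MAX < JS_SAFE_MAX`, so any value that fits a structured field integer is also exactly representable as a JavaScript number, but not the other way around.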

csharrison · 2025-04-04T13:25:05Z

32 bits is not a lot. I would like to support the use-case where these integers can be hashes of strings if needed, so my preference would be to support either:

- Each API surface just specifies its own limits, and we internally support 64 bit integers
- We clamp JS API to the (more limited) structured header max / min

apasel422 · 2025-04-04T13:28:34Z

> 32 bits is not a lot. I would like to support the use-case where these integers can be hashes of strings if needed, so my preference would be to support either:
>
> - Each API surface just specifies its own limits, and we internally support 64 bit integers
> - We clamp JS API to the (more limited) structured header max / min

Another option is bigint, which implementations can clamp to their own limit. On the structured-header side, we would support either integers or string-encoded values above the integer limits.

csharrison · 2025-04-04T13:42:05Z

I would prefer avoiding string parsing on the structured header side unless we know it is needed. I think we could add bigint support later on in a backwards compatible way if needed?

martinthomson · 2025-04-10T08:39:45Z

If we only permit subsetting in the form of an inclusion list, why would having more bits in the integer be useful?

csharrison · 2025-04-10T13:33:08Z

Upthread I mentioned a use-case:

> I would like to support the use-case where these integers can be hashes of strings if needed

This is useful if you want to filter on some fields which are not already densely encoded.
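As an illustration of the hashing use case, here is a sketch that maps a string to a 32-bit integer using FNV-1a. The choice of FNV-1a and the function name are assumptions for the example; the thread does not propose a particular hash.

```typescript
// 32-bit FNV-1a over the UTF-8 bytes of the input.
// Chosen only for illustration; any stable hash agreed between sites would do.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  new TextEncoder().encode(input).forEach((byte) => {
    hash ^= byte;
    // Math.imul keeps the multiplication within 32 bits; >>> 0 makes it unsigned.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  });
  return hash >>> 0;
}

// Both sides derive the same integer from a sparsely encoded field, such as a campaign name.
const campaignMatchValue = fnv1a32("spring-sale-2025");
```

With only 32 bits of output, accidental collisions become likely once a site hashes tens of thousands of distinct strings, which is the motivation for wanting a wider integer.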

apasel422 · 2025-05-20T18:08:20Z

Now that the initial HTTP Save-Impression API has been specified, we should re-investigate whether we need 64 bits for match values, and, if so, figure out how to give the HTTP API parity with the IDL.

martinthomson · 2025-05-21T00:25:06Z

HTTP and RFC 9651 cannot express the full range of values in an unsigned 64-bit value. We could:

1. Constrain other interfaces to the same range of values.
2. Simply observe that HTTP cannot address part of the potential range of values.

I lean toward (2). Sites need to coordinate their use of these values and so can coordinate to avoid parts of the space their system can't address.
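A sketch of how a site might coordinate under option (2), assuming match values were widened to unsigned 64-bit integers (which this thread has not decided). `partitionByReach` and the constant names are hypothetical; the RFC 9651 integer maximum is the only figure taken from a specification.

```typescript
const U64_MAX = 2n ** 64n - 1n;                // assumed internal limit, not yet decided
const HTTP_INTEGER_MAX = 999_999_999_999_999n; // RFC 9651 integer maximum

// True when a match value can be written as a structured field integer.
function isAddressableOverHttp(value: bigint): boolean {
  return value >= 0n && value <= HTTP_INTEGER_MAX;
}

// Split a site's planned match values into those usable on every surface
// and those that only a wider (for example JS) interface could express.
function partitionByReach(values: readonly bigint[]): { everywhere: bigint[]; jsOnly: bigint[] } {
  const everywhere: bigint[] = [];
  const jsOnly: bigint[] = [];
  for (const value of values) {
    if (value < 0n || value > U64_MAX) {
      throw new RangeError(`match value out of range: ${value}`);
    }
    (isAddressableOverHttp(value) ? everywhere : jsOnly).push(value);
  }
  return { everywhere, jsOnly };
}
```

A site that needs both surfaces would simply restrict itself to the `everywhere` set, for example by reducing hashes into the structured field range before use.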

andyleiserson changed the title ~~Consider relaxing type of filterData.~~ Consider relaxing type of filterData Dec 17, 2024

andyleiserson mentioned this issue Dec 17, 2024

Initial detail of HTTP API #56

Merged

tholop mentioned this issue Mar 14, 2025

Enrich event type for PPA events columbia/pdslib#8

Closed

apasel422 mentioned this issue Apr 4, 2025

Consider allowing conversions to specify a set of integers to match filter data #132

Closed

martinthomson changed the title ~~Consider relaxing type of filterData~~ Consider relaxing type of matchValue May 21, 2025
