Figure out the default privacy boundary for the web #1

Closed
jyasskin opened this issue Jun 15, 2021 · 7 comments
Labels
agenda+ Add to the next call's agenda.

Comments

@jyasskin
Collaborator

The origin is the default security boundary for the web. That is, different things within the same origin are expected to be able to interfere with each other, while cross-origin things have to opt into communication. There are some exceptions, like cookies and Chrome's site isolation, but we have consensus that origin isolation is the goal.
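As a rough illustration of the origin comparison described above, here is a minimal Python sketch that treats an origin as a (scheme, host, port) tuple. The helper names `origin_of` and `same_origin` are made up for this example, the default-port table only covers http and https, and opaque origins and the other cases the HTML Standard handles are ignored.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch handles
# (assumption: every other scheme is rejected).
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin_of(url):
    """Return a (scheme, host, port) tuple for an http(s) URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    host = (parts.hostname or "").lower()
    # Fill in the default port so that an explicit :443 matches an implicit one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return (scheme, host, port)

def same_origin(a, b):
    """True when both URLs share scheme, host, and port."""
    return origin_of(a) == origin_of(b)
```

Under this sketch, two paths on the same host compare as same-origin, while a change of scheme or subdomain yields a different origin.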

A default privacy boundary would, roughly, be the point at which the user (rather than the page) needs to approve communication.

We're currently defaulting to the Public Suffix List for this, but that answer isn't working perfectly. Specifically, the PSL groups origins into sites, and we declare that the site is the privacy boundary. The PSL has known problems, and it's not currently funded well enough to handle wider adoption, especially in potentially-adversarial situations. We might be able to round up the needed extra investment if we so choose.

A variant of First Party Sets might be able to serve as an alternate default privacy boundary.

@dmarti
Collaborator

dmarti commented Jun 26, 2021

There is a collection of ad industry standards that might be worth looking at here. These are not adequate for saying that two origins are on the same side of a privacy boundary, but they do indicate a site's claims about business relationships that are relevant to user data handling.

Many sites have an incentive to maintain these files in a correct state. If an industry-standard file is present on two sites but different, or present on one site but not another, then the sites are signaling that they are probably on different sides of a privacy boundary.
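One way to read the signal described above, sketched in Python. The function name `boundary_signal` and its return labels are hypothetical, and the file contents are passed in as strings (or `None` when a site does not publish the file). Fetching and parsing any specific industry format is out of scope. Treating line order and surrounding whitespace as insignificant is also an assumption of this sketch. Matching files are labelled inconclusive, because these files are not adequate for saying that two origins are on the same side of a privacy boundary.

```python
def normalize(text):
    """Strip whitespace, drop blank lines, and ignore line order (an assumption)."""
    lines = (line.strip() for line in text.splitlines())
    return sorted(line for line in lines if line)

def boundary_signal(file_a, file_b):
    """Classify what two sites' copies of the same industry file suggest.

    file_a and file_b are the file contents as strings, or None if absent.
    """
    if file_a is None and file_b is None:
        return "no-signal"           # neither site publishes the file
    if file_a is None or file_b is None:
        return "probably-different"  # present on one site but not the other
    if normalize(file_a) != normalize(file_b):
        return "probably-different"  # present on both, but the contents differ
    return "inconclusive"            # identical files do not prove a shared boundary
```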

@jyasskin
Collaborator Author

I forgot to mention that the IETF has tried to address an adjacent problem, but didn't succeed: https://datatracker.ietf.org/wg/dbound/about/ and https://github.com/equalsJeffH/dbound#readme.

@torgo
Member

torgo commented Jul 7, 2021

As discussed today, we think building on the WHATWG definition of origin and writing some words around it might be the best approach here. We could write it from the perspective of how new technologies should behave.

@jyasskin
Collaborator Author

Looking at the discussion, I think we have some ambiguity about what "use the origin for the default privacy boundary" would mean. For example, @darobin said "Any company that can acquire many companies and put them under its domain can freely share data". To make this concrete, say the company owns bigcompany.com. If the origin is the privacy boundary, the company would have to put the acquired companies at https://bigcompany.com/acquired1, https://bigcompany.com/acquired2, etc., in order to avoid asking the user each time they want to share data. This is a security problem, because an attacker who exploits any vulnerability in acquiredN can now see and interfere with all the data for every piece of BigCo. There's some mention of creating sub-origin security boundaries in the notes, but we should check with WebAppSec before assuming that's plausible. It's also a big migration, and one that I don't see a plausible way to finish (see #21).

As @hober said, "The default privacy boundary in terms of deployed content is the site not the origin.", where "site" is defined using the Public Suffix List. That is, BigCo could put its acquisitions at https://acquired1.bigcompany.com/, https://acquired2.bigcompany.com/, etc., putting each on its own origin, which puts a security boundary between each but still allows them to communicate without asking the user. The privacy domain still builds on origins, but it includes more than one of them.
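The two BigCo layouts can be compared with a small Python sketch. `TOY_SUFFIXES` is a made-up stand-in for the Public Suffix List (the real list is far larger and has wildcard and exception rules that are ignored here), and the helper names `origin_of` and `site_of` are hypothetical. The site is taken to be the scheme plus the registrable domain, which is an assumption about how "site" is applied.

```python
from urllib.parse import urlsplit

# Toy stand-in for the Public Suffix List (assumption, not the real data).
TOY_SUFFIXES = {"com", "co.uk"}
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin_of(url):
    """Return (scheme, host, port) for an http(s) URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return (parts.scheme, parts.hostname, port)

def site_of(url):
    """Return (scheme, registrable domain): the longest listed suffix plus one label."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Try candidate suffixes from longest to shortest, always leaving
    # at least one label in front of the suffix.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in TOY_SUFFIXES:
            return (parts.scheme, ".".join(labels[i - 1:]))
    raise ValueError(f"no registrable domain for {parts.hostname!r}")

path_layout = ["https://bigcompany.com/acquired1", "https://bigcompany.com/acquired2"]
subdomain_layout = ["https://acquired1.bigcompany.com/", "https://acquired2.bigcompany.com/"]

# Path layout: one origin, so no security boundary between the acquisitions.
paths_share_origin = origin_of(path_layout[0]) == origin_of(path_layout[1])
# Subdomain layout: separate origins that still fall within one site.
subdomains_share_origin = origin_of(subdomain_layout[0]) == origin_of(subdomain_layout[1])
subdomains_share_site = site_of(subdomain_layout[0]) == site_of(subdomain_layout[1])
```

With the site as the privacy boundary, the subdomain layout keeps a security boundary between the acquisitions while leaving them inside a single privacy boundary.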

@darobin
Member

darobin commented Jul 21, 2021

My suggestion from the meeting discussion:

  • We define, in a principled way, when we see it as important to have a privacy boundary, even if we can't enforce it technically.
  • We encourage enforcing it technically where possible, and suggest a hand-off to policymakers for what we cannot enforce, with the hope that they use similar principles.
  • We admit that perfection is not required. You can write a scam with access to nothing more than a p element, which means that security is never perfect either.

jyasskin added a commit that referenced this issue Aug 19, 2021
… privacy/contextual boundary. (#28)

This incorporates some ideas from https://github.com/asankah/identity-domains,
which distinguishes separate profiles and boundaries a user creates by clearing
cookies/storage.

It explicitly says that browsers are free to separate contexts more finely than
this default and says there's controversy about separating contexts less finely.

It drops the blanket statement that automating recognition is always
inappropriate.

It also removes the explicit mention of email-based cross-context recognition
in favor of a more general statement about difficult-to-forge pieces of
people's identities.

This contributes to #1 but doesn't completely fix it.
@darobin darobin added this to the post-fpwd milestone Jan 5, 2022
@dmarti
Collaborator

dmarti commented Jan 5, 2022

Possibly related: what does a user understand as a single "thing" they are interacting with? #44

@darobin darobin removed this from the post-fpwd milestone Jun 27, 2022
@darobin darobin added the agenda+ Add to the next call's agenda. label Jun 27, 2022
@darobin
Member

darobin commented Jul 6, 2022

This feels overtaken by events and captured by the rest of the document.

4 participants