High volume of requests to add domains to the PSL #78

Open
benjaminsavage opened this issue Apr 6, 2021 · 44 comments

@benjaminsavage

benjaminsavage commented Apr 6, 2021

Hi John,

We have a problem. As this GitHub issue explains, there is an increased volume of requests to add entries to the public suffix list.

One problem is with resourcing. The PSL is maintained by volunteers, and they are not prepared for a large volume of requests. This is definitely a problem we should all discuss at the next Privacy-CG.

But that is not the only problem. This is being mis-characterized as a “workaround for [...] security measures”.

Firstly, anyone who thinks that they have discovered a “workaround” is going to be sorely disappointed. And as you yourself pointed out in a recent tweet: “Today might be a good day to remind folks that adding a domain as an eTLD to the Public Suffix List (PSL) affects cookies. E.g. if you register shoppe.example to the PSL so that coffeemug.shoppe.example can be an eTLD+1, shoppe.example can no longer use cookies.”

I think we would both agree that it’s not feasible to use the PSL as a “workaround” for tracking users - it makes it so that the subdomains act, in effect, like separate websites. I think it might help if you could chime in on the issue to explain that there is no “security workaround” here.
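The cookie effect described in the quoted tweet can be illustrated with a minimal sketch. It assumes a toy suffix set standing in for the real Public Suffix List and ignores the list's wildcard and exception rules; the function names are hypothetical and not taken from any browser.

```python
from typing import Optional, Set


def public_suffix(host: str, suffixes: Set[str]) -> str:
    """Return the longest listed suffix of host, falling back to the last label."""
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return candidate
    return labels[-1]


def registrable_domain(host: str, suffixes: Set[str]) -> Optional[str]:
    """Return the eTLD+1 of host, or None if host is itself a public suffix."""
    labels = host.lower().rstrip(".").split(".")
    suffix_len = len(public_suffix(host, suffixes).split("."))
    if len(labels) <= suffix_len:
        return None  # nothing in front of the suffix: no site, so no cookies
    return ".".join(labels[-(suffix_len + 1):])


before = {"example"}                      # toy list: only the TLD is a suffix
after = before | {"shoppe.example"}       # shoppe.example has been added

print(registrable_domain("coffeemug.shoppe.example", before))  # shoppe.example
print(registrable_domain("coffeemug.shoppe.example", after))   # coffeemug.shoppe.example
print(registrable_domain("shoppe.example", after))             # None
```

Under these assumptions, listing shoppe.example turns each subdomain into its own eTLD+1 and leaves shoppe.example with no registrable domain of its own.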

So what is going on? Here’s my understanding:

Once iOS14 starts enforcing the ATT prompt, a large number of people will likely “opt-out” of “tracking”. Advertisers will still need to measure the aggregate count of conversions driven by their paid advertising, including conversions from people who have “opted-out” of “tracking”.

This is not a Facebook-specific problem at all. Advertisers will face this same challenge everywhere they run ads. As such, businesses need to find alternative ways to measure their ads that are allowed by Apple’s policies, such as “Private Click Measurement”.

“Private Click Measurement” conflates “business entity” with “registrable domain”. There are many small businesses who buy ads (on Facebook and elsewhere) that do not operate on their own eTLD+1 and instead operate on subdomains of websites like “foo.myshopify.com”. Unless they do something - they will no longer be able to measure their paid advertising.

The blog post which introduces PCM provides a possible solution to their dilemma. The blog post explains in the documentation of the “attributeon” parameter that PCM only supports “registrable domains” and links to this page, which specifically makes mention of the “Public Suffix List”.

Websites like “myshopify.com” that offer hosting to many separate businesses, as subdomains of that root, need to go and register themselves on the PSL prior to ATT enforcement.

Now clearly, this is only appropriate if there is no need for data-sharing between subdomains, and they are OK without having any cookies on the root domain.

So what is Facebook’s involvement here?

Facebook finds itself in the position of trying to help advertisers navigate Apple’s ATT changes - answering a wide variety of questions. We didn’t originally have any guidance around the PSL in our help articles until we received questions from advertisers who noticed PCM was supporting it. We are just trying our best to provide guidance about how to work with PCM. If you’d like to suggest any changes to the wording of that guidance, or to offer your own guidance about the PSL and how it interacts with PCM, we’d be happy to just direct people there.

As for the increased number of requests - this is an important issue worth discussing. The underlying cause of this increased demand is Apple’s upcoming ATT changes - so I think it would be sensible for Apple to help provide the PSL maintainers with additional support.

I do understand that a volunteer-maintained group cannot be expected to meet SLAs for adding new entries here. But as you understand, with the uncertainty about when iOS’s ATT policy will be enforced, businesses are attempting to prepare in advance to avoid disruption. I’m open to discussing possible solutions to this capacity issue. Perhaps the next Privacy-CG is a good venue to discuss this.

@johnwilander
Collaborator

johnwilander commented Apr 6, 2021

Hi Ben! I’m not commenting on any policies on specific platforms, only on this proposed web standard.

To understand this a bit better, are merchants registering new eTLDs to be able to track clicks to specific product pages, such as 5kg-kettlebells-in-red.gymshop.example? If so, then we’ve quickly arrived at the known issue of potential cross-site tracking through personalized registrable domains. href and attributionDestination (new name) could be set to john-wilander-ID-abc123.gymshop.example by the click source and the resulting attribution report would reveal the user.
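As a rough illustration of that concern, the sketch below assumes that the destination is reduced to its registrable domain before it appears in a report, and that the reduction follows a toy suffix set rather than the real PSL. The function name is hypothetical and this is not WebKit's implementation.

```python
from urllib.parse import urlsplit


def reported_destination(url: str, suffixes: set) -> str:
    """Reduce a destination URL to the registrable domain a report would carry."""
    labels = urlsplit(url).hostname.split(".")  # hostname is lowercased by urlsplit
    suffix_len = 1  # default: treat the last label as the suffix
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            suffix_len = len(labels) - i
            break
    # Keep exactly one label in front of the suffix.
    return ".".join(labels[-(suffix_len + 1):])


url = "https://john-wilander-id-abc123.gymshop.example/offer"

# gymshop.example is an ordinary site: the personalized label is dropped.
print(reported_destination(url, {"example"}))
# gymshop.example

# gymshop.example is a listed suffix: the personalized label survives.
print(reported_destination(url, {"example", "gymshop.example"}))
# john-wilander-id-abc123.gymshop.example
```

In this toy model, the per-user label only reaches the report once gymshop.example is treated as a public suffix, which is the cross-site tracking risk raised here.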

When we’ve discussed this previously, the two defenses have been 1) the registrable domain showing in the URL bar, and 2) re-engagement on the personalized registrable domain would be very hard unless a link was again followed from the same click source or the user leveraged a bookmark or their browsing history. However, for triggering events that happen on first engagement (=direct conversion), only the URL bar would be the defense and there may be merchants who only expect or care about measuring direct conversions.

Flooding the PSL is clearly a problem given the link you provided. Browsers also don’t update their copy of the PSL on any guaranteed schedule that I’m aware of.

Perhaps the only solution is to not support eTLDs in PCM and only support TLDs. That is my current thinking. Thoughts?

@hober added the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) Apr 6, 2021
@benjaminsavage
Author

Perhaps the only solution is to not support eTLDs in PCM and only support TLDs. That is my current thinking. Thoughts?

That's a non-solution. That would cause tremendous harm to all the small businesses who operate on subdomains of TLDs like myshopify, and for what?

We need a less binary solution here. If there ought to be some sort of vetting process to determine who is using subdomains in a way that is aligned with the intended purpose of the PSL, then we should actually invest in making that happen, not just throw up our hands and let thousands of businesses suffer.

We both agree that cross-site tracking through personalized registrable domains is abusive, and is not an appropriate use of the PSL. I'm assuming you share the same desire to make this work only for those with a legit use-case.

I'd propose a "facts and circumstances" approach. Let's actually take a look at which subdomains of the TLD are in use. Let's do an analysis to see if they appear to represent wholly different businesses, or if they appear to be something personalized. Let's invest in some education about what is, and isn't an appropriate use. Let's invest in some kind of a form people fill in to generate a request, that can weed out requests that aren't going to fly. And let's invest in some kind of ongoing monitoring to automatically remove TLDs that change their behavior post addition to the PSL. Let's try to find a more even-handed approach that enables legitimate usage.

@johnwilander
Collaborator

Who will vet such a list continuously at a global scale? The people maintaining PSL have already rejected the assignment purely on the volume of requests to be added. They have not looked at any actual use in my understanding, and are probably not obliged to do so.

In my view, both 5kg-kettlebells-in-red.gymshop.example and john-wilander-ID-abc123.gymshop.example are problematic. The intent of PCM is to allow for measurement of clicks to a site, not to a product page, and definitely not to a personalized URL. Advertisers have the source ID to encode products/offers/campaigns with. The issue you've brought up in the "Etsy Case" is about sites but it's unclear to me if the new use of the PSL is for that or for things like 5kg-kettlebells-in-red.gymshop.example. Do you know?

@eriktaubeneck

There are three different examples to consider:

  1. gymshop.retail-as-a-service.example: a distinct business, "Gym Shop", which uses retail-as-a-service.example to run the website for their business, which provides them a subdomain distinct to their business
  2. 5kg-kettlebells-in-red.gymshop.example: a distinct business, "Gym Shop" is using subdomains to extract more entropy from PCM at an item level in their catalogue.
  3. john-wilander-ID-abc123.gymshop.example: a distinct business, "Gym Shop" is using subdomains to uniquely identify users.

I don't believe there is any disagreement that both the 2nd and 3rd cases are problematic in the view of the intentions stated by PCM.

However, the first case seems reasonable to support, in terms of the intention to allow for measurement of clicks to a site. It seems that the PSL was initially thought to be a way to support the first case, but that has caused the issue linked from their repository. While it may be difficult to technically differentiate the three examples above, it's at least worth acknowledging that it would be useful to solve the 1st case while preventing the 2nd and 3rd.

@johnwilander
Collaborator

johnwilander commented Apr 6, 2021

Agreed. Number one in your list is the "Etsy Case" tracked in #60. However, as discussed in that thread, doing it through PSL means users can not be logged in to, or potentially even browse to retail-as-a-service.example by itself. Being on the PSL means that retail-as-a-service.example is treated like com in .com. It's not a valid website.

I have always assumed that sites like Etsy are not looking to become services in that sense but Etsy is using www for its main site so as long as they are fine with www being completely isolated from any other "site" on their domain, they can make it work with the PSL. Everything from caches to storage to cookies to logins to password management will be isolated between every merchant on Etsy if they go that route.

@benjaminsavage
Author

Who will vet such a list continuously at a global scale?

Apple should.

  1. Apple created this issue in the first place. The need for multi-tenant websites to add themselves to the PSL exists only because of the PCM design decision to limit measurement to registrable domains. The urgency exists because of Apple's planned ATT enforcement.

  2. If Apple can develop a scaled process to review the millions of apps submitted to the App Store, surely it is also capable of reviewing the few dozen multi-tenant domains that exist on the internet. At the time of this writing I only see 9 issues linked to publicsuffix/list#1245 ("New interaction between IOS 14.5 PCM and Facebook Pixel causing increase in PSL inclusion requests"). I presume there are a few more multi-tenant websites out there, but how many can there really be? A few dozen perhaps?

@eriktaubeneck

I have always assumed that sites like Etsy are not looking to become services in that sense but Etsy is using www for its main site so as long as they are fine with www being completely isolated from any other "site" on their domain, they can make it work with the PSL.

I can't speak for any individual multi-tenant website, but the sites driving the current influx to the PSL are presumably making that decision. For example, this issue is from one such service, which describes its need as:

Cookie Security between subdomains. Each property is usually a complete separate business and we want to prevent cross-engine cookie pollination.

This specific pull request did get merged, but it seems that the PSL needs some support in handling the domains making a similar decision to "make it work with the PSL."

@LouisStAmour

There’s a fourth example to consider too, which is that nothing would stop retail-as-a-service.example from implementing a feature where they use their status on the PSL to set up a tracking feature that does exactly what gymshop.example does, using a prefix or suffix to distinguish tracked requests from ordinary requests?

Also the second example - 5kg-kettlebells-in-red.gymshop.example - seems a bit problematic to prevent: leaving out companies with their own TLD (I’m not sure how that works…), they might also buy standard top-level domains for specific campaigns, products, etc. Buying a domain for a user session would be extreme unless free trials or refunds are involved but doing so for a limited set of products seems like something that’s hard to prevent even if we exclude use of the PSL list? It’s just PSL makes it cheaper, of course…

@dnsguru

dnsguru commented Apr 8, 2021

Appreciate this discussion is happening. Any help in not flash-mobbing our volunteers and bloating the PSL is appreciated in how you address this.

Please also see this: https://github.com/sleevi/psl-problems

@bedfordsean

Looping back around on the original rationale behind this from the Facebook perspective ahead of our privacy-cg discussion later today:

  • Many small businesses run on top of "platform partners". Typical examples of those platforms include:
    • Etsy (which doesn't generally do eTLD+2 representation of each merchant instead opting for a "combined" view of all merchant products, searchable on the platform and fulfilled by merchants)
    • Shopify (which does do eTLD+2 representation of a merchant through their "myshopify.com" domain, separating out merchants). Myshopify.com is public suffix listed for this reason to give a distinction between platform hosted merchants vs their own "shopify.com" site. Looking at Google, there are over 12 million "myshopify.com" domains. Many of these will redirect to owned eTLD+1s, but some will not - I can see a few on the first page of these search results that don't redirect from "myshopify.com" for example.
    • There are numerous other examples. One such example that I supported the other day is the Australian government ("gov.au" is public suffix listed). In their case, they have separate sites for emergency services, civil services (parks, utilities, education, etc), the armed forces, and so on. It seems reasonable that each of these "entities" should be able to run their own form of measurement and reporting and that their goals and expectations for that measurement are distinct enough that it isn't easy to merge the whole lot together underneath "gov.au"
  • It seems we're in agreement that separate "business entities" should be able to measure and report separately in this regard, but we don't want to allow abuse of this mechanism to extend to product granular or user granular measurement.
  • One of the main privacy considerations here is the separation of cookies on an eTLD+1 basis to make that type of abuse more difficult.

It seems like we have one consensus point, one fact of how businesses operate, and three questions:
Consensus: It is reasonable for separate business entities to be able to independently measure using PCM-like solutions within the constraints of preventing user or product level tracking.
Fact: Some entities are hosted on "platform owned" domains. With the eTLD+1 restrictions, that creates a challenge since the individual merchants cannot independently run measurement under PCM-like approaches. Those entities often host in this way because they do not have the resources (technical or financial) to run their own eTLD+1s, so "buy a domain" is not always a viable option.
Question: What is the "right" way to support this? PSL is one way that can provide an element of transparency given it is a "public list", but due to the aforementioned issues, may not scale well if it results in large numbers of requests.
Follow-up Question: If PSL does feel like the right way to do this and also assure cookie separation on the browser level, how can we support the PSL volunteers to ensure that the correct domains end up on the PSL for the correct reasons?
Follow-up Question: If PSL is not the right approach, how do we propose we will (imminently due to ATT prompt enforcement) support all of these small businesses who don't have the capability to drive this themselves?

@dnsguru

dnsguru commented Apr 12, 2021

Hi - a plea from a PSL maintainer - what's the latest on this stuff?

I am now getting threatening calls from people freaking out that their Facebook Pixel stuff is going to break if they are not added and other visceral conversations out of band from the github repo for PSL PR's on this matter.

What's an update?

@bedfordsean

bedfordsean commented Apr 12, 2021

Hi @dnsguru,

Firstly, sorry that you're getting threatening calls over this... that should never happen.

The discussion in privacy-CG this week closed out that PSL is the "least bad" way of handling this today; due to the definition of eTLD+1 and the privacy requirement of cookie separation, there are going to be some use cases that fall into a space of needing PSL registration. Facebook could handle this on our side directly, however without the guarantee of cookie separation, this does not have the desired privacy guarantees.

We clarified our documentation on this last week. Our intent was always for this to be for platforms and multi-faceted organisations such as governments as described above and our expectation was that if you weren't already on the PSL, then it is likely you do not need to be added to the PSL. You can see our updated guidance here: https://www.facebook.com/business/help/331612538028890

Specifically we added the paragraph:

Our current efforts are designed to support clients with preexisting Public Suffix List domain registrations or eTLDs. This support is in line with Apple’s recent Private Click Measurement update. There are other technical implications if a domain is registered as a Public Suffix that a business should consider (for example, the domain that is registered on the Public Suffix List cannot have its own cookies) and as such, we do not recommend that clients register their domains on the Public Suffix List specifically for Facebook event configuration.

From the Facebook side, @benjaminsavage, @n8schloss, and myself can volunteer to help you vet the PSL applications. I'm hoping that @johnwilander can nominate someone to support from Apple as well.

I'm going to drop you an email for us to chat and agree how to proceed. In the meantime, please continue to reject iOS14/FB related PSL registrations as you have been doing and feel free to point advertisers who do apply in the direction of our updated help centre guidance.

@jonrburns

Sharing perspective from Shopify which is mentioned a few times in above discourse.

Shopify submitted myshopify.com to the PSL in advance of the PCM announcement, although we are still waiting for this to propagate to WebKit. Merchants who do not yet have a customized domain operate under a unique myshopify.com subdomain. These shops operate in isolation from one another. Operating on myshopify.com reduces the barrier to entry to opening a Shopify store. The primary purpose of our inclusion in the PSL was as a defence-in-depth mechanism, specifically guarding against CSRF.

Per the above nomenclature, on the Shopify platform, myshopify.com is exclusively used to provide a distinct website for businesses. There is no capability to route traffic on either a user or product/category subdomain under myshopify.com.

When Apple announced PCM and Facebook subsequently adopted it as a standard into their platform, we advocated to Facebook that they support our merchants operating on myshopify.com per the PCM PSL specifications. Advertising and related measurement is a critical requirement in the success and growth of businesses operating on Shopify; especially for many nascent merchants who are still operating on the myshopify.com domain.

Our hope is that there is no regression through these discussions for legitimate entities leveraging PSL appropriately and further looking to adopt privacy preserving mechanisms of advertising, such as PCM.

@johnwilander
Collaborator

johnwilander commented Apr 12, 2021

Sharing perspective from Shopify which is mentioned a few times in above discourse.

Shopify submitted myshopify.com to the PSL in advance of the PCM announcement, although we are still waiting for this to propagate to WebKit. Merchants who do not yet have a customized domain operate under a unique myshopify.com subdomain. These shops operate in isolation from one another.

Thanks! I went to a shopping site under myshopify.com and got four tracking cookies set for .myshopify.com.

@jonrburns

jonrburns commented Apr 13, 2021

Thanks! I went to a shopping site under myshopify.com and got four tracking cookies set for .myshopify.com.

Thanks John. We’d like to prevent this as well; your observed behaviour is likely from an embedded 3P integration. Once WebKit updates its implementation of the PSL, this should no longer be possible. I’ve filed a bug in WebKit and would greatly appreciate any support, but do not wish to hijack this issue.
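Why a cookie scoped to .myshopify.com should stop working once a browser's copy of the PSL includes myshopify.com can be sketched as follows. This is a simplified model loosely based on the Domain attribute handling in RFC 6265, section 5.3; the suffix sets are toy data, the function name is hypothetical, and real browsers apply further checks.

```python
def cookie_is_stored(request_host: str, cookie_domain: str, suffixes: set) -> bool:
    """Decide whether a Set-Cookie with a Domain attribute is kept at all."""
    domain = cookie_domain.lower().lstrip(".")
    host = request_host.lower()
    if domain in suffixes:
        # A public suffix is tolerated only when it equals the host itself,
        # in which case the cookie is stored as host-only.
        return domain == host
    # Otherwise the host must be the domain or one of its subdomains.
    return host == domain or host.endswith("." + domain)


stale_list = {"com"}                       # browser copy without myshopify.com
fresh_list = {"com", "myshopify.com"}      # browser copy with myshopify.com

print(cookie_is_stored("gymshop.myshopify.com", ".myshopify.com", stale_list))  # True
print(cookie_is_stored("gymshop.myshopify.com", ".myshopify.com", fresh_list))  # False
```

The difference between the two calls is only the browser's copy of the list, which is why the observed behaviour depends on when each browser ships an updated PSL.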

@coronado101

Hello, First of all thank you to all of the volunteers who actively participate to serve and maintain this list. Secondly, it's so irresponsible for Facebook to push this issue onto PSL volunteers. They have an internal tool for verifying subdomains which I have used, but the pixels installed on sites can only point to a root domain. It seems that if a root domain ownership is verified, the owner would be held accountable/responsible for the subdomains as well, and the functionality to tie an FB pixel to a subdomain would be simple to implement for FB devs. IMO Facebook isn't really invested in better security, only reacting to ridicule over malicious, misinformation-based content.

@johnwilander
Collaborator

Hello, First of all thank you to all of the volunteers who actively participate to serve and maintain this list. Secondly, it's so irresponsible for Facebook to push this issue onto PSL volunteers. They have an internal tool for verifying subdomains which I have used, but the pixels installed on sites can only point to a root domain. It seems that if a root domain ownership is verified, the owner would be held accountable/responsible for the subdomains as well, and the functionality to tie an FB pixel to a subdomain would be simple to implement for FB devs. IMO Facebook isn't really invested in better security, only reacting to ridicule over malicious, misinformation-based content.

Please refrain from accusations and speculation (the IMO part). There’s been a bit of that above too so you’re not alone.

Let’s stick to facts and the technology since this is a standards proposal conversation. There are plenty of places where you can discuss your opinion on certain actors. Thanks!

@TanviHacks removed the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) Apr 19, 2021
@dnsguru

dnsguru commented Apr 23, 2021

The PSL pull requests that are coming in are more iterative than the typical ones. We get requests like the one below, where the root domain name is requested, the requestors need extra hand-holding to avoid mucking up their root domain cookies, and they have clearly copy/pasted something and thrown it over the fence to the volunteers.
publicsuffix/list#1294

Some are just copy/pasting from (new or old), other PRs, or otherwise submitting things that would not typically get approved.

So the icing on the 'this ain't chocolate but still is brown' cake that we're getting served is that these are more process-expensive, iterative, high-maintenance walkthroughs.

Hoping to see some text for the wiki from FB folks to help direct people, but this likely needs to end up being something FB handles end to end, akin to how Let's Encrypt has its own list and request form for customers seeking to work around its limits.

@dnsguru

dnsguru commented Apr 27, 2021

From the PSL volunteers campside, we're still awaiting some sample text for an FAQ or wiki entry from Facebook to help folks understand who to talk to AT FACEBOOK to address this.

Meanwhile the team at Facebook are peeking at the PSL PRs and landing their helicopter from time to time and engaging with the request pool, but we'd like to see a bit more of that proactive spirit.

A lot of the requests, if they had just been approved and merged, would have mucked up the root domain in ways the requestor would not have wanted, and the breakage would have been slow to surface and slow to undo because of the time lag before different browsers (in this case Apple's) integrate the PSL and push their eTLD+1 derivative list during system updates.

It is best that the PSL volunteers are closing as wontfix the requests that aim to work around the 8 subdomain limit that FB has...

Those requestors would have been in bad shape if they had to wait for the first post-14.5 iOS update to discover the breakage, then make yet another PSL request, and then wait for the subsequent iOS update to fix the fix.

@johnwilander
Collaborator

We need to make sure the organizations wanting to register understand the full implications of getting on the PSL. Mozilla has this documentation: https://wiki.mozilla.org/Public_Suffix_List/Use_Cases

As you can see, getting on the PSL has consequences for wildcard TLS certificates, possibly invalidating existing ones and creating issues for existing certificate update procedures. Do the sites know about that?
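The wildcard certificate consequence can be sketched as a simple policy check. This assumes a certificate authority that refuses any wildcard whose base name is a listed public suffix, which is a simplification of actual issuance rules; the suffix sets are toy data and the function name is hypothetical.

```python
def wildcard_allowed(pattern: str, suffixes: set) -> bool:
    """Return False when the wildcard sits directly on a listed public suffix."""
    if not pattern.startswith("*."):
        raise ValueError("expected a pattern of the form *.example.com")
    base = pattern[2:].lower()
    return base not in suffixes


before = {"com"}
after = {"com", "myretailplatform.com"}   # platform domain added to the list

print(wildcard_allowed("*.myretailplatform.com", before))        # True
print(wildcard_allowed("*.myretailplatform.com", after))         # False
print(wildcard_allowed("*.jasper.myretailplatform.com", after))  # True
```

Under that assumption, a platform that relies on one wildcard certificate for all merchant subdomains would need per-merchant certificates (or wildcards one level down) after being listed.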

@bedfordsean

It's a mixed bag.

Some of the requests are legitimate and understand this, others are legitimate with no understanding of this, others still are inappropriate for a PSL addition.

We've just updated our guidance today found here that goes into more details. Specific section on PSL limitations below, and we've also clarified further that if you're not already on the PSL it most likely does not make sense to be on it.

The guidance provided below is Facebook’s understanding of Public Suffix List use cases:

Your business operates as a platform by providing subdomains to other separate businesses from your own, and you’d like to protect the user from being tracked across these independent subdomains. For example, if a business called myretailplatform offers their service to multiple businesses to sell their products by creating their own subdomains on myretailplatform.com. If this describes your business, then the following conditions should be met before attempting to request the registration of a domain on the Public Suffix List:

Your business doesn’t require functionality on the domain that you want to register on the Public Suffix List. If you add your domain to the Public Suffix List, cookies won’t be stored on your domain. In the myretailplatform example, myretailplatform.com hosts thousands of small businesses on subdomains of their website, but myretailplatform.com doesn’t have any functionality of its own. The business owner should not need cookies on myretailplatform.com, and doesn’t need to be able to configure events on myretailplatform.com.

The subdomains of the domain you want to register on the Public Suffix List are fully separated from one another, and you need to prevent data sharing between them. Once registered on the PSL, browsers that reference the PSL will treat each subdomain as a separate eTLD+1 website.

Each subdomain corresponds to a separate business. In the myretailplatform example, if myretailplatform.com wants to register on the Public Suffix List then jasper.myretailplatform.com and abc.myretailplatform.com should be two separate business entities. Attempting to use subdomains to represent different products, different people, or different devices isn’t recommended and your application for registering on the Public Suffix List will likely be rejected.
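The statement that browsers will treat each subdomain as a separate eTLD+1 website can be illustrated with a small same-site comparison. This is a sketch with a toy suffix set instead of the real PSL, and the helper names `site_of` and `same_site` are hypothetical.

```python
from typing import Set


def site_of(host: str, suffixes: Set[str]) -> str:
    """Return the suffix plus one label in front of it (the host's 'site')."""
    labels = host.lower().split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            return ".".join(labels[max(i - 1, 0):])
    return ".".join(labels[-2:])  # no listed suffix found: use the last two labels


def same_site(a: str, b: str, suffixes: Set[str]) -> bool:
    """Two hosts are same-site when they share a registrable domain."""
    return site_of(a, suffixes) == site_of(b, suffixes)


unlisted = {"com"}
listed = {"com", "myretailplatform.com"}

print(same_site("jasper.myretailplatform.com", "abc.myretailplatform.com", unlisted))  # True
print(same_site("jasper.myretailplatform.com", "abc.myretailplatform.com", listed))    # False
```

Before listing, both merchants share one site and could share cookies and storage; after listing, the toy model treats them as unrelated sites.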

@johnwilander
Collaborator

Thanks! I haven't seen it documented anywhere but I think it's reasonable to assume that browsers will not allow a user to even visit/load an entry on the PSL. It's treated like a TLD and afaik you can't visit https://com. That's something for them to take into consideration, including any existing links, subresource URLs, bookmarks, QR codes etc.

@bedfordsean

Yes, we were also not 100% certain on this for every browser, but it stands to reason. This is why we put in the line specifically around:

Your business doesn’t require functionality on the domain that you want to register on the Public Suffix List.

Many businesses don't understand this detail and the risk with different browser cadences for updating their own PSLs is that this may appear to be ok right now, but would be hard/nearly impossible to back out later, so definitely better to "measure twice, cut once" in this scenario

@bedfordsean

Hey @johnwilander, just looking into this a bit more and would like to understand exactly how PCM would handle something on the PSL.

From what I can see, the browser does in fact load a page that is PSL listed. The example I tried is http://gov.au which does not redirect, and is on the PSL. Now it could be a more recent addition to the PSL than the Safari list, but I believe it's been there quite a while so far as I can tell.

In this circumstance (beyond just the limited functionality that the PSL site may have), if we had a PCM flow with attributeon="http://gov.au" and if "gov.au" did later fire a trigger event, what would happen? Would this be accepted for attributeon, and then later the trigger would fail?

@johnwilander
Collaborator

johnwilander commented Apr 28, 2021

If it's WebKit specific, please file a bug on https://bugs.webkit.org and cc me (thanks!). If it works the same in all browsers, we should bring it up in the storage partitioning repo.

@bedfordsean

bedfordsean commented Apr 28, 2021

It isn't WebKit specific from what I can tell. Chrome and Firefox both allow for similar behaviours. How do we raise this with storage partitioning?

Separately I think we should figure out how PCM (and actually all other proposals that use eTLD+1) handle the edge case of something that's on the PSL since the domain seems to still actually load. Let's have the storage partitioning conversation first and go from there

@johnwilander
Collaborator

It isn't WebKit specific from what I can tell. Chrome and Firefox both allow for similar behaviours. How do we raise this with storage partitioning?

It breaks partitioning and cookie scoping. Please file here: https://github.com/privacycg/storage-partitioning and reference this bug. Thanks!

@bedfordsean

It breaks partitioning and cookie scoping. Please file here: https://github.com/privacycg/storage-partitioning and reference this bug. Thanks!

Raised and requested for next privacy-cg agenda: privacycg/storage-partitioning#24

@dnsguru

dnsguru commented May 14, 2021

Continuing to wontfix requests to the PSL, which are stacking up. Requestors have 'energy' to express about that and they are sharing with PSL. Really not appreciating this whole mess, folks. really. REALLY. R E A L L Y! The dialog seems fairly casual ...

While FB folks have been helping a bit with triage, the rate increase on PR due to this change is really dumping on the PSL volunteer pool, and I had planned to attempt to address some of the backlog of automation and other improvements, so this whole matter has been incredibly disruptive - ultimately to the PSL volunteers of which I currently am the heaviest lifter.

grrr to whomever put this in motion. And I want your cell number to pass along so that you're not denied the dignity of hearing from people affected and how expressive they are

@bedfordsean

bedfordsean commented May 14, 2021

FB are standing up a process to alleviate the burden of validation of requests from PSL maintainers, which will hopefully reduce the load for the time being, however as discussed here, PCM and all privacy preserving proposals that currently rely on eTLD+1 and the PSL to determine what a business entity is need to find a better way.

I don't see this problem going away, or diminishing in volume as more of these types of proposals progress from ideation to trial to production. It's clear from the situation here that this is not something easily taken on by the existing PSL structure.

cc @johnwilander @csharrison @michaelkleber @krgovind for visibility

@dnsguru

dnsguru commented May 16, 2021

> FB are standing up a process to alleviate the burden of validation of requests from PSL maintainers, which will hopefully reduce the load for the time being, however as discussed here, PCM and all privacy preserving proposals that currently rely on eTLD+1 and the PSL to determine what a business entity is need to find a better way.

@bedfordsean I plead... please 'stand up' faster.... perhaps at the very least immediately update this web page:
https://developers.facebook.com/docs/sharing/domain-verification/ as it currently seems to be worded in a way that says "PSL entry will solve", and I do not understand why that guidance would still remain in place after a month of good faith discussions about shoving facebook customers to the PSL.

Here is the exact text:

Enabling More Domain Verification Use Cases in Aggregated Event Measurement
Facebook introduced Aggregated Event Measurement (AEM) to support measurement of web events after Apple begins enforcing its App Tracking Transparency prompt requirement.

We will be supporting the Public Suffix List for domain verification and event configuration. This means that merchants using a registered domain on the Public Suffix List will be able to use that domain for verifying and configuring their top 8 events on the domain. For example, if "myplatform.com" is a registered domain on the Public Suffix List, then Jasper, a merchant with the subdomain "jasper.myplatform.com", would now qualify as an effective eTLD+1 and would be able to verify "jasper.myplatform.com" and use it to configure their top 8 events in the web events configuration tool. Please note this would not apply to URL paths (e.g. "myplatform.com/jasper") or an eTLD+2 domain (e.g. "abc.jasper.myplatform.com"). For the URL path and eTLD+2 use cases, businesses can alternatively consider moving to Landing Page Views or Link Click optimization, or purchasing their own domains to try and avoid disruption to their ad campaigns once Apple begins enforcing its prompt.

Advertisers who use your platform may benefit from this update if your platform domain is registered in the Public Suffix List. You can learn more about the Public Suffix List here.

For more information on verifying domains and configuring web events, visit our Help Center article: Facebook Pixel Updates for Apple's iOS 14 Requirements
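
To make the eTLD+1 wording in that guidance concrete, here is a minimal Python sketch of how a consumer of the list might derive a registrable domain. It is an illustration only: the `registrable_domain` helper and the two tiny suffix sets are hypothetical, and it ignores the wildcard rules, exception rules, and default `*` rule that the real PSL algorithm has.

```python
from typing import Optional, Set

# Hypothetical, tiny stand-ins for a local copy of the list.
NOT_LISTED: Set[str] = {"com"}
LISTED: Set[str] = {"com", "myplatform.com"}


def registrable_domain(host: str, suffixes: Set[str]) -> Optional[str]:
    """Return the eTLD+1 of host, or None if host is itself a public suffix
    (or matches no suffix in this simplified list)."""
    labels = host.lower().rstrip(".").split(".")
    # Scan from the longest candidate to the shortest, so the first hit
    # is the longest matching suffix.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the host itself is a public suffix
            # eTLD+1 is the suffix plus exactly one more label to its left
            return ".".join(labels[i - 1:])
    return None


# Before myplatform.com is listed, every merchant shares one eTLD+1.
print(registrable_domain("jasper.myplatform.com", NOT_LISTED))
# After listing, each merchant subdomain is its own eTLD+1 ...
print(registrable_domain("jasper.myplatform.com", LISTED))
# ... deeper hosts still collapse to that eTLD+1 ...
print(registrable_domain("abc.jasper.myplatform.com", LISTED))
# ... and the platform's own domain no longer has a registrable domain.
print(registrable_domain("myplatform.com", LISTED))
```

The last call is the hazard for requestors: once the platform domain is a public suffix, consumers treat it like a TLD rather than a site. A URL path such as "myplatform.com/jasper" never enters this calculation at all, since only the hostname is examined.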

It's really basic and folks continue to submit PRs that are going to break their stuff.

The guidance is absent any of the dialog or evolutions that we have been discussing in good faith to discourage the PSL as a workaround, and does not identify the hazards of inadvertently breaking their core domain, or how browsers do their own thing, per browser and at their own pace, on propagation of changes or rollbacks.

As worded, it will continue to push higher effort pull requests to the PSL in order for that Facebook customer to solve their issue.

I do not work for Facebook, I am not paid to work for Facebook. I do not want to do unpaid work for Facebook. Straight up, that guidance is abandoning Facebook's responsibility to THEIR customer, and causing me and other volunteers to work for Facebook for free and I feel that is just reprehensible.

While you and your colleagues have been monitoring some of the PR and walking people through their PR at the PSL repo on github, and that is appreciated, the root cause (the FB Guidance) needs to change.

> I don't see this problem going away, or diminishing in volume as more of these types of proposals progress from ideation to trial to production.

I could see it alleviated a bit by making the guidance less basic on that URL pretty please

> It's clear from the situation here that this is not something easily taken on by the existing PSL structure.

Just want FB to not hold a Fyre Festival on PSL Island

@bedfordsean

@dnsguru inbound form will be coming this week. I'll ensure that page is up to date ASAP

@voxpelli

@bedfordsean I think the wording on the Facebook page needs to reflect the reality of PSL:

That there are no obligations for any users of the PSL to update their copy of PSL in any set timeframe.

Hence a wording like:

We will be supporting the Public Suffix List for domain verification and event configuration.

Should get a strong clarification like:

We will be supporting historic suffixes in the Public Suffix List for domain verification and event configuration.

As well as new clarifying text about additions to it:

New additions to the Public Suffix List may eventually be included down the line, but no expectations of a time frame for such inclusions can be made.

Right @dnsguru?

Also: If there needs to be a registry for all of the *.myshopify.com, *.github.io and such, then maybe that should be separate from the documentation of historic such domains like e.g. *.co.uk and be something that companies pay to get added to, just like they pay for registering domain names and have often paid for HTTPS certificates?

Documenting all of the *.co.uk variants makes sense as a community effort, adding all of the *.myshopify.com, *.github.io etc doesn't.

@bedfordsean

@voxpelli that information is already detailed on our updated guidance that I posted about here #78 (comment)

This other page has been missed

@dnsguru

dnsguru commented May 18, 2021

The updated guidance pages have reduced the pace of requests already - which is helpful.

I do like @voxpelli suggestion on the 'historic' word being included to help further smush down the pace of requests. Once we see what kind of process FB has in place for "review, not refer" and can gauge the request quality, there will be a review of acceptance criteria and perhaps those caught in the queue of PR that accumulated can move forward.

I am especially intrigued by the potential of a registry for the private section stuff - we have avoided requiring payments to also avoid the associated expectations that come along with them, but given this has grown a lot since we made that choice it would make sense to re-explore it. The concern I have is that charging money may unravel the voluntary nature of browser participation as much as accepting any volume of workarounds for security or rate limits might.

I digress... We will likely see a few bursts again in cycles when affiliate types who are more 'set and forget' peek at their referral reports, scratch their heads as to what happened, and discover the iOS change impacts.

Any vetting pre PSL PR is going to help reduce the queue load, and mid PR interventions have been helpful to addressing the increased load.

Yes, I have been [understandably] cranky about the FB help page situation, due to how long it took for these pages to change from 'come to PSL for remedy' to the current wording, as FB's help for their clients adjusts away from punting and towards helping. But I wanted to flip around and instead frame things from a position of gratitude, and to point out that we have had members of the team at FB stepping in to triage PRs and reason out the requests so that they are well reasoned and proportionate. That has been helpful with the symptomatic increase in PRs and the load - and the recent updates to the help pages should prove to be curative or slow the herd.

The biggest issue I have had with all of this was that the voluntary nature of browser participation in use of the PSL has been powerful as a means to help namespace operators ensure their users' experience was happening in an intentional manner. Volunteering towards that objective has always been fulfilling because it allows for making real beneficial change and enablement of communities and innovation at scale.

I am biased in that I give fewer effs if some marketing affiliate entrepreneur wants to not spend $10 to buy their own domain to make thirty cents off a take out order they referred than I do in ensuring that remote learning functions as expected in say the entire country of Ecuador. Both are one line in a file. The latter feeds my soul a bit more for the same amount of effort as a volunteer on the PSL, and it helps principally to continue a legacy of graceful service that Gerv Markham represented while he was with us.

@bedfordsean

One further update here; we have updated our page at https://developers.facebook.com/docs/sharing/domain-verification/ to point to the Help Center article (https://www.facebook.com/business/help/126789292407737) where we had already listed much more detailed guidance on PSL usage.

> I do like @voxpelli suggestion on the 'historic' word being included to help further smush down the pace of requests. Once we see what kind of process FB has in place for "review, not refer" and can gauge the request quality, there will be a review of acceptance criteria and perhaps those caught in the queue of PR that accumulated can move forward.

That Help Center page also includes the wording "pre-existing Public Suffix List domain registrations or eTLDs" to try to encourage not blindly making a PSL addition.

You'll note that we also now have a request form present and linked on the Help Center page.

We're going to be finalising the process that happens after submitting that form with @dnsguru and others via email in the next 24h for what to do for pre-validation of PSL requests. Once we've agreed on that we're going to follow up and post on every PR linked to publicsuffix/list#1245 and handle them again through the FB centric flow.

I hope this will put us on to a more even keel and we can alleviate the burden on the PSL volunteers in the near term. In the longer term once we learn more about common business use cases that seem to require currently being on the PSL, we can bring these to future W3C meetings discussing PCM/privacy sandbox proposals and (hopefully) find a better way for the long term

@dnsguru

dnsguru commented May 18, 2021

@bedfordsean @johnwilander Excellent - so will PSL folks be invited to tomorrow or thursday's virtual f2f? I pledge to not darth vader lift anyone across zoom

@rubenskuhl

One possible way to get a volunteer-like payment speed bump is asking for a credit card, charging something ($100, perhaps) and later refunding it (less processing costs). This at least makes people think about whether they need it and qualify for it. If that's too much for the Global South, just use IP-based location to pick an appropriate value.

@bedfordsean

> @bedfordsean @johnwilander Excellent - so will PSL folks be invited to tomorrow or thursday's virtual f2f? I pledge to not darth vader lift anyone across zoom

I'm going to suggest we wait a couple more weeks for the following reasons:

  • I'd like us to collate some themes from the requests coming through as a result of publicsuffix/list#1245 ("New interaction between IOS 14.5 PCM and Facebook Pixel causing increase in PSL inclusion requests")
  • There is a Google Android developer conference this week and an Apple conference in the second week of June and there may be more announcements/proposals that impact PSL in either of these events
  • The F2F this week has a much larger audience with a wider range of interests so this more targeted conversation on browser vendor interactions with the PSL isn't of as much relevance to the entire privacy-CG group.

There's a privacy-CG meeting every 2-3 weeks and a web-adv meeting every week, so I suggest we collate this information and come in to have a smaller group chat with the parties this is most relevant to, as well as incorporating our learnings and understanding of any new proposals that come out of the Google and Apple developer events.

@dnsguru

dnsguru commented May 19, 2021

>   • ... there may be more announcements/proposals that impact PSL in either of these events

NO, THANK YOU

@dnsguru

dnsguru commented May 19, 2021

> One possible way to get a volunteer-like payment speed bump is asking for a credit card, charging something ($100, perhaps) and later refunding it (less processing costs). This at least makes people think about whether they need it and qualify for it. If that's too much for the Global South, just use IP-based location to pick an appropriate value.

We have avoided charging for changes, historically, but that was when there had been 1-2 PRs a month or so and we could manage to resource it.

@rubenskuhl

> > One possible way to get a volunteer-like payment speed bump is asking for a credit card, charging something ($100, perhaps) and later refunding it (less processing costs). This at least makes people think about whether they need it and qualify for it. If that's too much for the Global South, just use IP-based location to pick an appropriate value.

> We have avoided charging for changes, historically, but that was when there had been 1-2 PRs a month or so and we could manage to resource it.

The charge would be minimal after the refund, and the system would still have 0 income.

@dnsguru

dnsguru commented Jun 3, 2021

> The charge would be minimal after the refund, and the system would still have 0 income.

At this point, charging an astronomical price for a rollback on casual requestors is making more and more sense, as those are the worst waste of PSL volunteer cycles to generate zero net outcome

On the better news side of things, FB's lead is back from their two weeks off (sigh, they took that two weeks from ME ;) ) and they have now added an intake form to hopefully help qualify requestors, though the vetting process or acceptance criteria on those requests that come through their sieve are yet to be defined. I have created a label on PRs called 'IOS-FB?' to help identify PRs that were/are directed at the PSL for false salvation.

There have been a number of request/rollbacks in the past month - so qualifying requests is really, really fundamentally important.

It is also important to note that PSL inclusion is not any guarantee that anything propagates downstream to browsers, certs, DMARC, libs etc. They do what they do, when they do. So the rollback timing can have big ouch factors where the core business domain name is used and a ham-fisted request comes in on the domain.

@bulk88

bulk88 commented Nov 8, 2023

> “Private Click Measurement” conflates “business entity” with “registrable domain”. There are many small businesses who buy ads (on Facebook and elsewhere) that do not operate on their own eTLD+1 and instead operate on subdomains of websites like “foo.myshopify.com”. Unless they do something - they will no longer be able to measure their paid advertising.

A requirement for PSL should be that "foo.myshopify.com" has sub-user A record or NS record control, so sub-users can aim their site off anycastCDN.myshopify.com infrastructure. If all *.myshopify.com are edge proxies to anycastCDN.myshopify.com, Cookie/CA/Let's Encrypt details can be handled by myshopify.com enforcing rules at the edge proxy, or through static analysis/app store style controls over sub-user content (which is probably all WYSIWYG templates anyway). myshopify.com can also do "first party" GUID/user tracking/pixel integration for their sub-users through postMessage/iframe APIs/REST to magic root URLs, along with the "Domain" attribute in Set-Cookie. An edge proxy that bans a sub-user's CDN from ever returning files with Content-Type: text/html, and that sets ```X-Content-Type-Options: nosniff```, will ban localStorage/JS cookies in 1 shot. ```__Host-``` cookies from the edge proxy can also clean up the leakage. Unless "myshopify.com" shows to PSL that they host ```http://``` cleartext sub-users (no ```__Host-``` prefix b/c no SSL), IDK why the edge proxy can't enforce cookie separation, or implement a server side root domain pixel/ad track/global to ```myshopify.com``` client or eyeballs GUID. I'd bet these platforms redirect final payment to ```payments.myshopify.com``` for checkout/CC input anyway, or to ```myshopify.stripe.com``` or ```(amazon-payments.com|paypal.com)/order?uid=myshopify.com&orderid=1234567```.
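
For reference, the `__Host-` prefix mentioned above has mechanical requirements that an edge proxy could check before passing a `Set-Cookie` header through: the cookie must carry `Secure`, must not carry a `Domain` attribute, and must have `Path=/`. A minimal Python sketch, assuming a hypothetical `host_prefix_ok` helper and simplified header parsing (a browser additionally requires that the cookie was set over a secure connection, which cannot be seen from the header alone):

```python
def host_prefix_ok(set_cookie: str) -> bool:
    """Check a Set-Cookie header value against the __Host- prefix rules.

    Cookies without the prefix are passed through unchanged (True), since
    the prefix rules do not apply to them.
    """
    parts = [p.strip() for p in set_cookie.split(";")]
    name = parts[0].split("=", 1)[0]

    # Collect attributes case-insensitively; flag attributes such as
    # "Secure" have an empty value.
    attrs = {}
    for part in parts[1:]:
        if not part:
            continue
        key, _, value = part.partition("=")
        attrs[key.strip().lower()] = value.strip()

    if not name.startswith("__Host-"):
        return True

    return (
        "secure" in attrs           # must be marked Secure
        and "domain" not in attrs   # must be host-only, no Domain attribute
        and attrs.get("path") == "/"
    )


print(host_prefix_ok("__Host-sid=abc; Secure; Path=/"))
print(host_prefix_ok("__Host-sid=abc; Secure; Path=/; Domain=myshopify.com"))
```

Because a `__Host-` cookie cannot name a `Domain`, it stays bound to the exact host that set it, which is the separation between sub-users that this comment is arguing a platform can enforce itself.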

Only true webhosting/VMs/web dev playgrounds/free sub domains and dynamic DNS make sense on PSL. If the major cause for PSL is ad-tracking/e-commerce, the sub-users REALLY need to move to a proper eTLD/well known PSL TLD. If the sub-user is a commercial biz, they can afford $30-$60 a year for the well known TLD domain. It looks more professional to clients/eyeballs.

Most CDNs/platforms nowadays have sub-user-brand.com be SPA root/URL bar/Dynamic server side generated HTML, and pure static disk file assets, be (anycast1|anycast2|DFW|ATL|ORD|LAX).cdn.io so client browsers don't send or receive cookies over the wire to cdn.io, so cdn.io doesn't need to be on the PSL anyways.

The minimum future-TLD domain registration renewal requirement (years into the future) sounds smart, or, my idea, a one-time payment for PSL listing if the owner is a for-profit LLC, voided if the future-TLD has >1 year history (by Google search) of use as sub-domains, or voided on proof of non-profit status of the future-TLD owner.

PSL really needs to be free subdomains with A/NS/SOA/TXT records, not subdomains of "full stack" platforms to "1 company edge-proxies". The full stack, all of our parts/services, or none of our parts/services, platforms, control their sub-users so much anyways, and all use edge proxies/anti-DDOS/edge TLS termination anyways, the full-stack platforms don't really need to be on the PSL, or they can pay a PSL listing fee, that allocates resources for a PSL volunteer in the future to remove their eTLD from PSL if faux-eTLD domain owner goes bankrupt.
