Reporting #585

Closed
1 task done
clelland opened this issue Dec 8, 2020 · 10 comments
Labels
privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response. Resolution: satisfied The TAG is satisfied with this design

Comments

@clelland

clelland commented Dec 8, 2020

HIQaH! QaH! TAG!

I'm requesting a TAG review of the Reporting API.

The Reporting API is a mechanism for web servers to tell browsers where to send errors and other information about a browsing session.
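As a rough illustration of what "telling the browser where to send reports" can look like, the sketch below serializes a set of named endpoints into a `Reporting-Endpoints` response header value (the newer dictionary-style syntax; `Report-To` is the older header). The helper name, endpoint names, and URLs are made up for this example and are not taken from the spec text.

```typescript
// Hypothetical helper: build a Reporting-Endpoints header value from a map of
// endpoint name -> collector URL. Names and URLs are illustrative only.
function reportingEndpointsHeader(endpoints: { [name: string]: string }): string {
  return Object.keys(endpoints)
    .map((name) => {
      const url = endpoints[name];
      // Dictionary keys in structured headers are lowercase tokens.
      if (!/^[a-z*][a-z0-9_\-.*]*$/.test(name)) {
        throw new Error("invalid endpoint name: " + name);
      }
      // Assumption for this sketch: only accept https collectors, and reject
      // characters that would need escaping inside a quoted string.
      if (!/^https:\/\//.test(url) || /["\\]/.test(url)) {
        throw new Error("invalid endpoint URL: " + url);
      }
      return name + '="' + url + '"';
    })
    .join(", ");
}

const headerValue = reportingEndpointsHeader({
  default: "https://reports.example/default",
  "csp-endpoint": "https://reports.example/csp",
});
// A server would attach this to the document response, e.g.
// Reporting-Endpoints: <headerValue>
```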

Further details:

  • I have reviewed the TAG's API Design Principles
  • Relevant time constraints or deadlines: Chrome is preparing to ship changes to the API based on feedback from other browser vendors (see "Feedback from Mozilla", w3c/reporting#158), and hopes to land them in Chrome 89 soon.
  • The group where the work on this specification is currently being done: WebPerfWG
  • The group where standardization of this work is intended to be done (if current group is a community group or other incubation venue): WebPerfWG
  • Major unresolved issues with or opposition to this specification:
  • This work is being funded by: Google

You should also know that, while the Reporting API has shipped in Chrome for some time, and several features have integrated with it, and ReportingObserver has been reviewed by TAG, the API as a whole was never reviewed. This is relevant now as changes have been made to the scope of the API, as well as the header used and its syntax, which Chrome is looking to ship.

We'd prefer the TAG provide feedback as (please delete all but the desired option):

🐛 open issues in our GitHub repo for each point of feedback

@MattMenke2

MattMenke2 commented Dec 15, 2020

I think sending credentials here is a problem when it comes to cross-site tracking. Either we send only SameSite=None cookies (which will be removed in the foreseeable future), which will break everyone relying on credentialed reports, or we send SameSite=Lax/Strict cookies, which allows cross-site tracking.

We also need to specify the Network Partition Key used for the upload, and the scope of data stored by the reporting API (likely also keyed on the Network Partition Key, again, to protect against cross-site tracking - a site could use a unique ID in the Report-To URL to track individual users).

Chromium already has code, not yet enabled on stable, to both key reporting information and uploaded reports on the network partition key either of the original request (NEL, learning report-to information), or of the frame associated with a report.

@MattMenke2

MattMenke2 commented Dec 15, 2020

Sorry, I may have misunderstood the spec (influenced by Chromium's implementation). Is Report-To information now scoped to a document, so there's no global cache of Report-To information? That addresses all my concerns. It would inherit Same-Site-ness from the document, and could act like a normal subresource request.

@clelland
Author

That is correct, everything covered by the Reporting API should be document-scoped. Reporting endpoint configuration is ephemeral, and reports from separate documents should not be sent together in the same POST request, even if they are coming from the same origin. (There's an open question there — whether it's okay to bundle together reports from different same-origin documents, as long as they are in a single agent cluster, but currently the spec does not allow that.)
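A minimal sketch of the batching rule described above, assuming a hypothetical in-memory queue where each report remembers which document produced it: reports are grouped per document and endpoint, so two same-origin documents never share a POST body. The `QueuedReport` shape and `documentId` field are illustrative assumptions, not spec terminology.

```typescript
// Hypothetical queue entry; documentId stands in for "the document that
// generated this report".
interface QueuedReport {
  documentId: string;
  endpoint: string;
  type: string;
  body: unknown;
}

// Group reports so that each batch (one POST) contains reports from exactly
// one document, destined for exactly one endpoint.
function batchByDocument(queue: QueuedReport[]): QueuedReport[][] {
  const batches = new Map<string, QueuedReport[]>();
  for (const report of queue) {
    const key = JSON.stringify([report.documentId, report.endpoint]);
    const batch = batches.get(key);
    if (batch) {
      batch.push(report);
    } else {
      batches.set(key, [report]);
    }
  }
  return Array.from(batches.values());
}

// Two same-origin documents reporting to the same endpoint still yield two
// separate batches under this rule.
const exampleBatches = batchByDocument([
  { documentId: "doc-1", endpoint: "https://reports.example/default", type: "deprecation", body: {} },
  { documentId: "doc-2", endpoint: "https://reports.example/default", type: "deprecation", body: {} },
  { documentId: "doc-1", endpoint: "https://reports.example/default", type: "intervention", body: {} },
]);
```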

NEL is a different case, and isn't covered by the base reporting API. There is an extension to the spec, https://w3c.github.io/reporting/network-reporting.html, which allows for persistent configuration, but that will be a separate TAG review, and will certainly need to take your comments into consideration.

@MattMenke2

MattMenke2 commented Dec 15, 2020

That's great - thanks so much for setting me straight! Then this feature seems totally fine to me, as specced.

@atanassov

atanassov commented Jan 26, 2021

@LeaVerou, @plinss and I looked at this during our "Kronos" virtual f2f. It would be great to have the following issues addressed:

  • Can you provide an explicit list of the types of reports we are evaluating for?
  • Given this feature could be sending PII through such reports, can you please complete the Privacy & Security questionnaire?
  • Some of the report endpoints have to be sent by definition to different servers. That would mean we're exposing information to a 3rd party by design. Are we reading the proposal correctly?
  • What would the list of all types of endpoints be?
  • Would an event be more appropriate?
  • Using such reporting mechanism, do you expect to have user-generated events part of the same reports?
  • Should all reports be visible to script?
  • Currently the JSON object includes keys in both underscore case (user_agent) and camelCase (lineNumber). For consistency, it would be best to use underscore case only.

@clelland
Author

Thanks for taking the time to look at this, @atanassov! I'll try to address each of those questions here -- note that some of these points have already been addressed as part of the TAG review of ReportingObserver mentioned in the 'You should also know that' section.

As a general point, I'm not sure what you mean when you've used the word 'endpoint' -- in the spec, a reporting endpoint is the location of an HTTP server which can accept POST requests containing reports. I feel like you're using it to mean something different here, so if I've misunderstood your questions, let me know and I'll try to clarify.

  • Can you provide an explicit list of the types of reports we are evaluating for?

Probably not, and this is by design -- the API is a generic framework for other specs to use to achieve consistent reporting across features. I would prefer to evaluate this API on its own merits, based on what it can and cannot enable, rather than the specific list of current integrations. That said, I can understand that it's difficult to do that completely out of context. The current list of APIs which use this framework for reporting is:

(I hope I haven't missed any, but I'll comment if I find new ones)

  • Given this feature could be sending PII through such reports, can you please complete the Privacy & Security questionnaire?

Linked above at https://github.com/w3c/reporting/blob/master/security-and-privacy-questionnaire.md

  • Some of the report endpoints have to be sent by definition to different servers. That would mean we're exposing information to a 3rd party by design. Are we reading the proposal correctly?

The purpose of the reporting framework is to allow reports to be sent over HTTP to remote servers. Cross-origin and third-party servers are included by design, under the belief that:

  • It can be necessary, especially in the case of APIs such as network error logging, to have a report collector which is separate from the site which produces the error.
  • Organizations will probably want to have a central report collector, rather than hosting an endpoint on every origin.
  • Allowing third-party collectors enables an ecosystem of report hosting and analysis services, such as Report-uri.com and uriports.com, among others.
  • In general, the same or similar information could be communicated with third parties through other means (cross-origin subresources, XHR, beacons, etc).
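To make the collector side more concrete, here is a minimal sketch of how a first- or third-party endpoint might validate an incoming report POST. It assumes the `application/reports+json` content type and a top-level JSON array of reports with `age`, `type`, `url`, `user_agent`, and `body` keys; the function names are hypothetical, and the exact field set should be checked against the current spec.

```typescript
// Assumed shape of one serialized report. Note the mixed key casing discussed
// in this thread: user_agent at the top level, camelCase inside some bodies.
interface Report {
  age: number;
  type: string;
  url: string;
  user_agent: string;
  body: { [key: string]: unknown };
}

function isReport(value: unknown): value is Report {
  if (typeof value !== "object" || value === null) return false;
  const r = value as { [key: string]: unknown };
  return (
    typeof r["age"] === "number" &&
    typeof r["type"] === "string" &&
    typeof r["url"] === "string" &&
    typeof r["user_agent"] === "string" &&
    typeof r["body"] === "object" &&
    r["body"] !== null
  );
}

// Hypothetical collector helper: reject unexpected content types, parse the
// body, and keep only well-formed reports.
function parseReports(contentType: string, rawBody: string): Report[] {
  const mediaType = contentType.split(";")[0].trim().toLowerCase();
  if (mediaType !== "application/reports+json") {
    throw new Error("unsupported content type: " + contentType);
  }
  const parsed: unknown = JSON.parse(rawBody);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of reports");
  }
  return parsed.filter(isReport);
}

const sampleBody = JSON.stringify([
  {
    age: 420,
    type: "csp-violation",
    url: "https://site.example/page",
    user_agent: "ExampleBrowser/1.0",
    body: { lineNumber: 12 },
  },
  { malformed: true },
]);
const acceptedReports = parseReports("application/reports+json", sampleBody);
```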

  • What would the list of all types of endpoints be?

I'm not sure what you mean here -- endpoints are the URLs / servers to which reports are sent. They don't have types, unless you're drawing a 1p/3p distinction, or categorizing them by the types of reports which they choose to accept. Can you clarify this question?

  • Would an event be more appropriate?

There was discussion on this when ReportingObserver was designed, although I wasn't part of those discussions at the time (@juliatuttle or @RByers may have additional historical context). I do believe that an observer would be preferable to an event, to support batching (reports are delivered as a list), post hoc reporting (observers can be registered to retrieve reports from events which happened before they were created, which is useful for early-page-load events), and to make it easy to write handlers for just some subset of report types. Additionally, the event infrastructure introduces complications such as bubbling, and can require coordination between multiple listeners, neither of which is useful here.
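The three observer properties mentioned above (batching, buffered delivery, and filtering by type) can be sketched as follows. This uses the `ReportingObserver` constructor with its `types` and `buffered` options; the local interfaces and the `watchReports` wrapper are hypothetical, added only so the sketch does not depend on DOM typings and degrades gracefully where the API is absent.

```typescript
// Minimal local types so the sketch compiles without DOM typings.
interface ReportLike {
  type: string;
  url: string;
  body: unknown;
}
interface ObserverLike {
  observe(): void;
  disconnect(): void;
  takeRecords(): ReportLike[];
}
type ObserverCtor = new (
  callback: (reports: ReportLike[], observer: ObserverLike) => void,
  options?: { types?: string[]; buffered?: boolean }
) => ObserverLike;

// Hypothetical wrapper: subscribe to a subset of report types and receive
// them as batches. Returns null where ReportingObserver is unavailable.
function watchReports(
  types: string[],
  onBatch: (reports: ReportLike[]) => void
): ObserverLike | null {
  const Ctor = (globalThis as any).ReportingObserver as ObserverCtor | undefined;
  if (!Ctor) return null;
  const observer = new Ctor(
    // Reports arrive as a list, so one callback can handle a whole batch.
    (reports) => onBatch(reports),
    // buffered: true asks for reports queued before the observer existed.
    { types, buffered: true }
  );
  observer.observe();
  return observer;
}
```

A page might call `watchReports(["deprecation"], (batch) => console.log(batch.length))` early in its lifetime and still see reports generated during initial load.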

  • Using such reporting mechanism, do you expect to have user-generated events part of the same reports?

Not currently -- there are no free-form user-generated reports. The closest thing that exists is the test-driver-only generateTestReport API. However, it is not impossible for another spec to integrate in that way; nothing in this API prevents that. I suspect that would require a lot more scrutiny, for the possible XSS / injection issues at least. Is the TAG aware of other issues that would mean that we should strongly discourage such things?

  • Should all reports be visible to script?

No, by design there is an allowance for reports which are not visible. This can be just by necessity (crash reports, for instance), or it might be important for privacy, if there are cases where it's not advisable for the page to have immediate, real-time script access to user events.

  • Currently the JSON object includes keys in both underscore case (user_agent) and camelCase (lineNumber). For consistency, it would be best to use underscore case only.

This was discussed in w3c/reporting#72, and I think that the decision was made to follow the TAG's recommendations wherever possible, but that in some cases (CSP reports, for instance) backwards-compatibility issues meant keeping camelCase identifiers.

@torgo torgo added the privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response. label Feb 15, 2021
@kenchris

This seems related (we did an early review before) #419

@torgo
Member

torgo commented Feb 15, 2021

We discussed in today's breakout just now and agreed we need a dedicated breakout to delve into this issue further. Tentatively scheduled for breakout slot A next week. Action on me to make that happen.

@plinss
Member

plinss commented May 13, 2021

@ylafon and I took a look at this during our virtual F2F. Thank you for filling out the Security and Privacy questionnaire and for your response to our earlier comments.

At this point we don't have any further comments on this; it looks OK to proceed. Thanks!

@plinss plinss closed this as completed May 13, 2021
@plinss plinss added Resolution: satisfied The TAG is satisfied with this design and removed Progress: breakout Progress: in progress labels May 13, 2021
@clelland
Author

Thanks for the review!


7 participants