Complaint request filter: Match against endpoint URLs from adapter instead of hostname from HAR #6

Open
baltpeter opened this issue Apr 9, 2024 · 2 comments
Labels
enhancement New feature or request

@baltpeter
Member

In e9e5247, I implemented a filter that only includes requests to servers that the user's device also provably (through Tracker Control/the App Privacy Report) contacted.

I am currently doing that by checking the request's hostname from the HAR against the hostnames in the TC/APR export.

Instead of the HAR hostname, I think we should be checking against all endpoint URLs that the corresponding adapter accepts.

Imagine a tracking endpoint https://api\d.tracker.tld/ingest. If, during our analysis, we happened to find requests to https://api2.tracker.tld/ingest, but the user's device happened to use https://api5.tracker.tld/ingest instead, we would currently exclude those requests, even though both hosts belong to the same endpoint.
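To illustrate the difference, here is a minimal sketch of both checks for the example above. All variable names are hypothetical, and the host-only pattern is written by hand here as an assumption; the current adapters only provide full-URL patterns.

```typescript
// Hypothetical data: hostnames the user's device contacted according to the TC/APR export.
const exportedHostnames: string[] = ["api5.tracker.tld"];

// A request we recorded in the HAR during our own analysis.
const harRequestUrl = "https://api2.tracker.tld/ingest";

// Current approach: compare the HAR request's hostname against the export.
const harHostname = new URL(harRequestUrl).hostname;
const matchesByHarHostname = exportedHostnames.includes(harHostname);

// Proposed approach: compare the export against every host the adapter accepts.
// Assumption: a host-only pattern is available for the adapter.
const adapterHostPattern = /^api\d\.tracker\.tld$/;
const matchesByAdapterHost = exportedHostnames.some((h) => adapterHostPattern.test(h));

console.log({ matchesByHarHostname, matchesByAdapterHost });
```

With this data, the hostname comparison rejects the request while the adapter-level comparison accepts it.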

However, implementing it this way is surprisingly hard. We only get a hostname from the TC/APR export. Meanwhile, our adapters' endpoint URLs can be strings or regexes of full URLs.

How would we check whether android2-ads.adcolony.com matches /^https:\/\/(android|ios)?ads\d-?\d\.adcolony\.com\/configure$/? Maybe I'm missing something, but I really can't see an automated way that isn't hacky and error-prone.

I feel like the only (proper) way to implement this change would be to also manually add a hosts array to each adapter in TrackHAR.
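A rough sketch of what such a hosts array could look like. The `AdapterSketch` type, the `hosts` field, the slug, and the `adapterMatchesHost` helper are all hypothetical and heavily simplified; only the endpoint regex is taken from the example above.

```typescript
// Hypothetical, simplified adapter shape (not TrackHAR's actual Adapter type).
type AdapterSketch = {
    slug: string;
    endpointUrls: (string | RegExp)[];
    // Proposed addition: hostnames (or hostname patterns) maintained by hand.
    hosts: (string | RegExp)[];
};

const adcolonySketch: AdapterSketch = {
    slug: "adcolony-configure", // hypothetical slug
    endpointUrls: [/^https:\/\/(android|ios)?ads\d-?\d\.adcolony\.com\/configure$/],
    // Hand-written to mirror the host part of the endpoint regex above.
    hosts: [/^(android|ios)?ads\d-?\d\.adcolony\.com$/],
};

// Returns true if the hostname from the TC/APR export matches any host entry.
const adapterMatchesHost = (adapter: AdapterSketch, hostname: string): boolean =>
    adapter.hosts.some((h) => (typeof h === "string" ? h === hostname : h.test(hostname)));

console.log(adapterMatchesHost(adcolonySketch, "iosads1-2.adcolony.com"));
console.log(adapterMatchesHost(adcolonySketch, "example.org"));
```

The downside is that `hosts` and `endpointUrls` have to be kept in sync manually for every adapter.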

@baltpeter baltpeter added the enhancement New feature or request label Apr 9, 2024
@baltpeter
Member Author

@zner0L What do you think?

@zner0L
Contributor

zner0L commented Oct 3, 2024

I just thought of creating a kind of Endpoint object that would let us decompose the endpoint URL into components, similar to the JS URL object. For regexes, we could combine a host regex with path and protocol components (like this: https://stackoverflow.com/questions/9213237/combining-regular-expressions-in-javascript).
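One possible shape for this idea, as a sketch only. `EndpointSketch`, `hostRegex`, and `urlRegex` are hypothetical names, and the components are combined via `RegExp.prototype.source`, which drops any flags on the individual parts.

```typescript
// Hypothetical decomposed endpoint, loosely modelled on the fields of the JS URL object.
type EndpointSketch = {
    protocol: string; // e.g. "https:"
    host: string | RegExp; // literal hostname or unanchored hostname pattern
    path: string | RegExp; // literal path or unanchored path pattern
};

// Escape regex metacharacters so literal strings can be embedded in a pattern.
const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Turn a component into regex source text (flags of RegExp parts are ignored).
const toSource = (part: string | RegExp): string =>
    typeof part === "string" ? escapeRegExp(part) : part.source;

// Anchored pattern for the host only, to check hostnames from the TC/APR export.
const hostRegex = (e: EndpointSketch): RegExp => new RegExp(`^(?:${toSource(e.host)})$`);

// Anchored pattern for the full URL, to match requests from the HAR.
const urlRegex = (e: EndpointSketch): RegExp =>
    new RegExp(`^${escapeRegExp(e.protocol)}//(?:${toSource(e.host)})(?:${toSource(e.path)})$`);

const adcolonyEndpoint: EndpointSketch = {
    protocol: "https:",
    host: /(android|ios)?ads\d-?\d\.adcolony\.com/,
    path: "/configure",
};

console.log(hostRegex(adcolonyEndpoint).test("ads1-2.adcolony.com"));
console.log(urlRegex(adcolonyEndpoint).test("https://ads1-2.adcolony.com/configure"));
```

This way, the host check and the full-URL check are derived from the same definition, so they cannot drift apart.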
