Wait for clearURLs rewrite/how to address clearURLs' shortcoming? #1

Open
JaneJeon opened this issue Aug 29, 2021 · 3 comments

Comments

@JaneJeon
Collaborator

Some of the rule semantics might change... or we might have to abandon tracker stripping entirely

@JaneJeon
Collaborator Author

ClearURLs/Addon#144

@JaneJeon JaneJeon self-assigned this Sep 22, 2021
@JaneJeon JaneJeon added enhancement New feature or request help wanted Extra attention is needed labels Sep 23, 2021
@JaneJeon
Collaborator Author

JaneJeon commented Sep 23, 2021

An alternative I'm considering is to use clearURLs only for links that go through an intermediary page instead of automatically redirecting you to the destination (e.g. YouTube links, Skimlinks, etc.), i.e. rawRules.
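To make the "skip the intermediary page" idea concrete, here is a minimal sketch. It is not the clearURLs rule format: `REDIRECT_PATTERNS` and `skip_redirect` are hypothetical names, and the single pattern assumes the destination is carried percent-encoded in a `q` query parameter.

```python
import re
from urllib.parse import unquote

# Hypothetical patterns; each one captures the percent-encoded destination URL.
REDIRECT_PATTERNS = [
    re.compile(r"^https?://(?:www\.)?youtube\.com/redirect\?(?:.*&)?q=([^&]+)"),
]


def skip_redirect(url: str) -> str:
    """Return the embedded destination if a pattern matches, else the URL unchanged."""
    for pattern in REDIRECT_PATTERNS:
        match = pattern.match(url)
        if match:
            # The captured group is still percent-encoded, so decode it.
            return unquote(match.group(1))
    return url
```

Anything that matches none of the patterns falls through untouched, so this step can run before whatever does the generic tracker stripping.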

For "stripping" everything else, I think relying on existing UBO rulesets like https://github.com/uBlockOrigin/uAssets and https://github.com/AdguardTeam/FiltersRegistry/blob/master/filters/filter_17_TrackParam/filter.txt might be better?

Obviously UBO blocklists are 1. fucking huge and 2. do a LOT more than just "strip trackers from URLs", but preprocessing these lists by stripping off DOM-manipulating filters (or basically anything that doesn't touch the URL) and focusing on ones that are usually named "privacy" something should help.
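A rough sketch of that preprocessing step, assuming the only URL-touching filters worth keeping are the ones using the `removeparam` option (the option these lists use for stripping query parameters). `keep_url_filters` and the marker list are made up for illustration, and real lists have more syntax than this handles.

```python
import re

# Separators that mark cosmetic/scriptlet (DOM-manipulating) filters.
COSMETIC_MARKERS = ("##", "#@#", "#?#", "#$#", "#%#")

# Matches an options section that includes `removeparam`, with or without a value.
REMOVEPARAM = re.compile(r"\$(?:[^$]*,)?removeparam(?:=|,|$)")


def keep_url_filters(lines):
    """Keep only network filters (including @@ exceptions) that use removeparam."""
    kept = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("!", "[")):
            continue  # blank line, comment, or list header
        if any(marker in line for marker in COSMETIC_MARKERS):
            continue  # touches the DOM, not the URL
        if REMOVEPARAM.search(line):
            kept.append(line)
    return kept
```

Exception filters (`@@...`) are kept on purpose, since dropping them would throw away the anti-breakage information the lists already carry.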

In that case, I would need:

  1. Auto-updating list of "privacy" filters to strip trackers off of the URLs - AdGuard privacy list, UBO default & easyprivacy lists
  2. Implement anti-breakage shit (UBO list)
  3. Script to deduplicate filters
  4. Implement UBO parser only for URL transformations
  5. Run UBO blocklist-based tracker stripping alongside clearURLs
  6. Remove clearURLs implementation except for rawRules matching (which is what allows us to skip the intermediary "redirect" pages)
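Steps 3 and 4 could start out as small as the sketch below. It only understands generic filters of the form `$removeparam=VALUE` (no hostname pattern, no other options, no exceptions), and it assumes, as I understand the uBO docs, that a plain value matches the parameter name exactly while a `/regex/` value is tested against the `name=value` pair. All function names are hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def dedupe(filters):
    """Drop blanks and exact duplicates, keeping first-seen order."""
    return list(dict.fromkeys(f.strip() for f in filters if f.strip()))


def compile_removeparam(filter_line):
    """Turn `$removeparam=VALUE` into a predicate over (name, value), or None."""
    prefix = "$removeparam="
    if not filter_line.startswith(prefix):
        return None  # anything fancier is out of scope for this sketch
    value = filter_line[len(prefix):]
    if len(value) > 2 and value.startswith("/") and value.endswith("/"):
        regex = re.compile(value[1:-1])
        return lambda name, val: regex.search(f"{name}={val}") is not None
    return lambda name, val: name == value


def strip_params(url, filters):
    """Remove every query parameter matched by at least one filter."""
    matchers = [m for m in map(compile_removeparam, dedupe(filters)) if m]
    parts = urlsplit(url)
    kept = [
        (name, val)
        for name, val in parse_qsl(parts.query, keep_blank_values=True)
        if not any(m(name, val) for m in matchers)
    ]
    # Note: urlencode re-encodes the surviving parameters, so their
    # percent-encoding may not be byte-identical to the input.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For example, `strip_params("https://example.com/a?utm_source=x&id=42", ["$removeparam=/^utm_/"])` leaves only `id=42` in the query. Hostname-scoped filters and `@@` exceptions would need a real matcher on top of this.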

The motivation for skipping clearURLs for the actual tracker stripping is that the existing clearURLs-based approach falls flat on obscure/Chinese sites that don't provide any sort of "hints":

@JaneJeon JaneJeon changed the title Wait for clearURLs rewrite Wait for clearURLs rewrite/how to address clearURLs' shortcoming? Sep 23, 2021
@JaneJeon
Collaborator Author

Resources for directly integrating UBO to strip bullshit from URLs:

Also I have been made aware that even the annoyances rulesets contain tracker stripping magic? https://github.com/uBlockOrigin/uAssets/blob/02d16a221c276fe58bdd72cc947b26eaf9d1318e/filters/annoyances.txt#L4560-L4561
