PoToken implementation to solve 403 errors #11955

Stypox · 2025-01-25T12:55:43Z

What is it?

Bugfix (user facing)
Feature (user facing)
Codebase improvement (dev facing)
Meta improvement to the project (dev facing)

Description of the changes in your PR

General information about poTokens and about this PR structure:

YouTube now requires integrity checks to access their clients. The most "vulnerable" client is the WEB client, since they can't enforce integrity checks on all web browsers, so that's the only client (for now) that we have found a way to obtain an integrity token for.
In order to obtain a poToken, we need to run BotGuard, an obfuscated virtual machine implemented in JavaScript that performs the integrity checks and gives us an integrity token. In order to make the integrity checks succeed, we need to run this VM in an environment that resembles a browser as much as possible. The integrity token can be used to generate multiple poTokens. Two network requests are needed: Create to obtain the VM code, GenerateIT to obtain the integrity token after running the VM code. See the README here for the detailed steps.
PoTokenGenerator is the base class for all poToken generators. It has a factory method that asynchronously obtains a new PoTokenGenerator instance, plus two further methods: one to generate a poToken for a given identifier, and one to check whether the integrity token has expired.
PoTokenWebView is currently the only implementation of PoTokenGenerator, but we might want to add other implementations in the future, e.g. ones that do not rely on WebView.
PoTokenProviderImpl implements the extractor interface and is supposed to manage possibly multiple PoTokenGenerators (although right now there is only one, based on WebView). It retries in case of problems, creates a new PoTokenGenerator if the current one has expired, and finally returns a PoTokenResult. A PoTokenResult contains two poTokens: one for the specific requested video id (used to fetch the player), and another that is generated only once, as the very first thing, and is specific to a visitor data value (used in streaming URLs).
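
As a rough illustration of the provider behaviour described in the last point, here is a minimal sketch. It is written in JavaScript rather than the PR's Kotlin and is not the actual implementation: `FakeGenerator`, `PoTokenProviderSketch`, `createGenerator`, the token strings and the choice to pass `visitorData` as an argument are all assumptions made for the example. The single retry mirrors the "tried only once" note on the commit that recreates a broken generator.

```javascript
// Hypothetical stand-in for a PoTokenGenerator: it holds one integrity token
// (modelled only by its expiry time) and can produce many poTokens from it.
class FakeGenerator {
  constructor(lifetimeMs) {
    this.expiresAt = Date.now() + lifetimeMs;
  }
  isExpired() {
    return Date.now() >= this.expiresAt;
  }
  generatePoToken(identifier) {
    return `pot(${identifier})`; // placeholder, not a real poToken
  }
}

// Hypothetical provider: reuses the generator until it expires and retries
// once with a fresh generator if anything goes wrong.
class PoTokenProviderSketch {
  constructor(createGenerator) {
    this.createGenerator = createGenerator; // async factory (assumption)
    this.generator = null;
    this.visitorData = null;
    this.streamingPot = null;
  }

  async getPoToken(videoId, visitorData, retry = true) {
    try {
      if (this.generator === null || this.generator.isExpired()) {
        this.generator = await this.createGenerator();
        // The streaming poToken is generated once, first, per generator.
        this.visitorData = visitorData;
        this.streamingPot = await this.generator.generatePoToken(visitorData);
      }
      // The player poToken is generated for every requested video id.
      const playerPot = await this.generator.generatePoToken(videoId);
      return {
        playerPot,
        streamingPot: this.streamingPot,
        visitorData: this.visitorData,
      };
    } catch (error) {
      if (!retry) throw error; // second failure: give up
      this.generator = null; // treat the generator as broken, recreate it once
      return this.getPoToken(videoId, visitorData, false);
    }
  }
}
```

In this sketch, two consecutive calls with a long-lived generator share the same streaming poToken and only the player poToken changes; a factory that always fails is invoked twice before the error reaches the caller.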

TODO:

The JavaScript poToken implementation comes from https://github.com/LuanRT/BgUtils
Obtaining a poToken via WebView
~~Obtaining a poToken with something like HtmlUnit~~ not doable unfortunately
Handling devices that don't have a WebView (needs to be tested)
Passing the poToken to the extractor when requested
Passing the poToken to player network requests (not sure if needed?)
Understand whether we need to change user agent everywhere

You can also test whether the generated poTokens work using the latest yt-dlp commit from their git repo (older commits won't work!), like this (take PLAYER_POT, STREAMING_POT and VISITOR_DATA from logcat):

yt-dlp "https://www.youtube.com/watch?v=i_SsnRdgitA" --extractor-args 'youtube:player_client=web;player-skip=webpage,configs;po_token=web.player+PLAYER_POT,web.gvs+STREAMING_POT;visitor_data=VISITOR_DATA'

Fixes the following issue(s)

Fixes [YouTube] HTTP error 403 for playback or download #11803

Relies on the following changes

[YouTube] Potokens support implementation NewPipeExtractor#1247

APK testing

The APK can be found by going to the "Checks" tab below the title. On the left pane, click on "CI", scroll down to "artifacts" and click "app" to download the zip file which contains the debug APK of this PR. You can find more info and a video demonstration on this wiki page.

Due diligence

I read the contribution guidelines.

Stypox · 2025-01-26T15:32:58Z

The PR now builds fine on top of TeamNewPipe/NewPipeExtractor#1247, so you can download the APK, which uses poTokens! Let us know if you notice any issues.

gechoto · 2025-01-27T08:19:57Z

app/src/main/java/org/schabi/newpipe/util/potoken/PoTokenWebView.kt

+        private val TAG = PoTokenWebView::class.simpleName
+        private const val GOOGLE_API_KEY = "AIzaSyDyT5W0Jh49F30Pqqtyfdf7pDLFKLJoAnw"
+        private const val REQUEST_KEY = "O43z0dpjhgX20SCx4KAo"
+        private const val USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.3"


Could this be Firefox ESR like in DownloaderImpl?

Stypox · 2025-01-27

No, for some reason it does not work with the Firefox user agent. It would work with the curl user agent though, I don't know why...

gechoto · 2025-01-27T08:44:18Z

Would it be possible to move the po token implementation to a library?

Currently this is in NewPipe (the app repo) which makes it inaccessible by other apps which also have the need for po tokens.

This will lead to a lot of duplicate code because it needs to be implemented over and over again for each YT client app.

Would be cool if this can be maintained in just one place (and multiple apps could benefit like it is already the case with NewPipeExtractor).

Figim · 2025-01-27T08:47:58Z

> Would it be possible to move the po token implementation to a library?
>
> Currently this is in NewPipe (the app repo) which makes it inaccessible by other apps which also have the need for po tokens.
>
> This will lead to a lot of duplicate code because it needs to be implemented over and over again for each YT client app.
>
> Would be cool if this can be maintained in just one place (and multiple apps could benefit like it is already the case with NewPipeExtractor).

You can recreate this PR in your own application.

This simply connects to the extractor to support the Potoken stream. You will need to do this separately in your application. It should have been like this.

gechoto · 2025-01-27T09:25:54Z

> You can recreate this PR in your own application.

my point was this would be inefficient

If you want to implement this over and over again for each app - sure, go ahead.

Keep in mind that this will likely not be "done" after the initial implementation.
YT will probably try to break this solution every few months.

You will have to update the implementation in many places again. And again. And again...
What a great way to waste time.

If this was implemented in just one place as a library it would be easier for more developers to share efforts.
To me this sounds like a reasonable thing to discuss - if possible.

Profpatsch · 2025-01-27T10:39:44Z

app/src/main/assets/po_token.html

+    // an asynchronous function runs in the background and it will eventually call
+    // `vmFunctionsCallback`, however we need to manually tell JavaScript to pass
+    // control to the things running in the background by interrupting this async
+    // function in any way, e.g. with a delay of 1ms. The loop is most probably not
+    // needed but is there just because.
+    for (let i = 0; i < 10000 && !this.vmFunctions.asyncSnapshotFunction; ++i) {
+      await new Promise(f => setTimeout(f, 1))
+    }


I … don’t think this is how async works. The timeout is just gonna be scheduled on a new task, but the code before the loop still runs on a microtask on the previous task.

Stypox · 2025-01-27

Yes, but this.vm.a seems to start a standalone task in the background or something like that, and we need to explicitly pass control back to the event loop by pausing this async execution, so that the background task can finish executing.

Stypox · 2025-01-27

As far as I know the loop body actually executes only once; I still kept the loop because you never know.
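
To make the distinction discussed above concrete, here is a small standalone sketch. It assumes the background work reports back through a callback queued as a separate task; `startBackgroundWork` is a hypothetical stand-in using `setTimeout`, not the actual BotGuard VM call, whose scheduling is not documented in this thread. It shows that awaiting an already-resolved promise only drains microtasks, while awaiting a timer lets queued tasks run.

```javascript
// Hypothetical stand-in for the VM call: the callback is delivered on a
// later task (macrotask), not synchronously and not as a microtask.
function startBackgroundWork(callback) {
  setTimeout(() => callback("snapshot-fn"), 0);
}

async function waitForCallback() {
  const state = { result: null };
  startBackgroundWork((value) => {
    state.result = value;
  });

  // Awaiting a resolved promise yields only to the microtask queue, so the
  // task queued by startBackgroundWork has not had a chance to run yet.
  await Promise.resolve();
  const afterMicrotask = state.result;

  // Awaiting a timer yields to the event loop, which lets queued tasks run.
  // The loop is a bounded poll, in the same spirit as the one in the diff.
  for (let i = 0; i < 10000 && state.result === null; ++i) {
    await new Promise((resolve) => setTimeout(resolve, 1));
  }

  return { afterMicrotask, afterTimer: state.result };
}

waitForCallback().then(({ afterMicrotask, afterTimer }) => {
  console.log("after microtask yield:", afterMicrotask);
  console.log("after timer yield:", afterTimer);
});
```

Under this assumption the value is still unset after the microtask-only yield and is set once the timer has been awaited, which is why a short timer delay is enough to hand control back to the event loop.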

Profpatsch · 2025-01-27T10:44:50Z

Can there be an architecture overview of this somewhere? From a skim of the code I don’t get any idea of what problem this solves or how the solution is structured.


Stypox · 2025-01-27T11:45:59Z

YouTube now requires integrity checks to access their clients. The most "vulnerable" client is the WEB client, since they can't enforce integrity checks on all web browsers, so that's the only client (for now) that we have found a way to obtain an integrity token for.
In order to obtain a poToken, we need to run BotGuard, an obfuscated virtual machine implemented in JavaScript that performs the integrity checks and gives us an integrity token. In order to make the integrity checks succeed, we need to run this VM in an environment that resembles a browser as much as possible. The integrity token can be used to generate multiple poTokens. Two network requests are needed: Create to obtain the VM code, GenerateIT to obtain the integrity token after running the VM code. See the README here for the detailed steps.
PoTokenGenerator is the base class for all poToken generators. It has a factory method that asynchronously obtains a new PoTokenGenerator instance, plus two further methods: one to generate a poToken for a given identifier, and one to check whether the integrity token has expired.
PoTokenWebView is currently the only implementation of PoTokenGenerator, but we might want to add other implementations in the future, e.g. ones that do not rely on WebView.
PoTokenProviderImpl implements the extractor interface and is supposed to manage possibly multiple PoTokenGenerators (although right now there is only one, based on WebView). It retries in case of problems, creates a new PoTokenGenerator if the current one has expired, and finally returns a PoTokenResult. A PoTokenResult contains two poTokens: one for the specific requested video id (used to fetch the player), and another that is generated only once, as the very first thing, and is specific to a visitor data value (used in streaming URLs).

Let me know which places are not documented enough.

sonarqubecloud · 2025-01-27T12:40:00Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
1 Security Hotspot
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Profpatsch · 2025-01-27T13:01:00Z

@Stypox I think it would be good to include this documentation in the source code somewhere, maybe in the interface module.

Profpatsch · 2025-01-27T13:02:03Z

So that people who want to understand the code later don’t have to find this PR and look through lots of issues first.

Interfaces for poTokens + WebView implementation

46dbc43

github-actions bot added the "size/large" label (PRs with less than 750 changed lines) on Jan 25, 2025

This was referenced Jan 25, 2025

YouTube IP Ban / 403 Error MaintainTeam/LastPipeBender#12

Open

Video doesn't play beyond 50-60 seconds polymorphicshade/Tubular#189

Open

Connect poToken generation to extractor

b37a36e

Stypox mentioned this pull request Jan 26, 2025

Fix loading StreamInfo twice on first VideoDetailFragment opening #11959

Merged


Fix checkstyle

0801573

Figim mentioned this pull request Jan 27, 2025

Player fixes z-huang/InnerTune#1789

Open

gechoto reviewed Jan 27, 2025


Unify running on main thread

0caa96f

Profpatsch reviewed Jan 27, 2025


Stypox added 2 commits January 27, 2025 12:08

Recreate poToken generator if current is broken

8433203

This will be tried only once, and afterwards an error will be thrown

Wrap logs in BuildConfig.DEBUG

dde8b5f

Make sure downloadAndRunBotguard() is called after <script> loaded

e6f47ac
