Always 403 when trying to download the actual video segments in m3u8 #12

Open
moverlock1024 opened this issue Jul 10, 2019 · 13 comments

Comments

@moverlock1024

moverlock1024 commented Jul 10, 2019

The m3u8 file is correctly downloaded and decoded, but none of the segments are accessible. I've ruled out any proxies or ambiguous routes, and tried using ffmpeg to download the m3u8 directly. However, it still yields 403. I guess the Avgle site was updated again?

@vide0
Member

vide0 commented Jul 11, 2019

Can you provide more details here? I tested downloading a recent video from Avgle with AvgleDownloader, and it works fine.

Most 403 errors are caused by a network proxy or an expired resource.
For example, you ran the AvgleDownloader command long after you obtained it, or you set up a proxy in the browser but not in the terminal.

@JohnDoee

JohnDoee commented Jul 11, 2019

Avgle actually gives an "invalid/unusable" m3u8 file when it thinks you didn't solve the captcha. As far as I remember this was a lot more common with one of their file-hosting backends.

To add a bit about how Avgle works: when you visit the page from a non-validated IP (one that hasn't solved the CAPTCHA recently), the flow is like this:

  • Invalid video URL fetched
  • Captcha shown
  • Captcha solved
  • Valid video URL fetched

The video URL is, as far as I remember, not actually restricted to a specific IP, although it might matter how far down the rabbit hole you travel before you decide which URL to use.
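
The flow above suggests checking whether a fetched playlist is usable before trying its segments. Here is a minimal Python sketch. It assumes, without confirmation from this thread, that an unusable playlist either lacks the `#EXTM3U` header or lists no URIs. `looks_like_usable_playlist` is a hypothetical helper name, not part of AvgleDownloader.

```python
def looks_like_usable_playlist(text: str) -> bool:
    """Heuristic check only; this is not Avgle's actual validation logic."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # An HLS playlist must start with the #EXTM3U tag.
    if not lines or lines[0] != "#EXTM3U":
        return False
    # Lines that do not start with '#' are URIs (segments or variant playlists).
    return any(not line.startswith("#") for line in lines)
```

This only catches malformed responses. A playlist that parses fine can still point at segments that return 403, so a request for the first segment is the more reliable test.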

@moverlock1024
Author

> Avgle actually gives an "invalid/unusable" m3u8 file

But sometimes the CAPTCHA isn't shown at all, yet videos still play.

> • Invalid video URL fetched
> • Captcha shown
> • Captcha solved
> • Valid video URL fetched

I guess the issue is that we log the invalid video URL first but don't update it when the next valid URL comes up?
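
If that guess is right (the thread does not confirm how AvgleDownloader captures URLs), the fix would be to keep the most recently seen URL rather than the first one. A minimal sketch in Python follows; `LatestPlaylistUrl` is a hypothetical helper, not code from the project.

```python
from typing import Optional


class LatestPlaylistUrl:
    """Remembers only the most recently seen playlist URL (hypothetical helper)."""

    def __init__(self) -> None:
        self._url: Optional[str] = None

    def record(self, url: str) -> None:
        # Overwrite instead of keeping the first value, so a URL fetched
        # after the CAPTCHA is solved replaces the earlier invalid one.
        self._url = url

    def current(self) -> Optional[str]:
        return self._url
```

Calling `record()` for every playlist request the page makes, and reading `current()` only when the user asks for the download command, would always yield the last URL seen.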

@moverlock1024
Author

> Most 403 errors are caused by a network proxy or an expired resource.

I'm sure there is no proxy at all, and the AvgleDownloader command was run less than a minute after it was shown in the tab.

@JohnDoee

> But sometimes the CAPTCHA isn't shown at all, yet videos still play.

Isn't scraping Avgle fun?!
I've written and published my own Avgle scraper; that's how I know how it actually works. The behaviour seems to vary with which video backend you're using (there are two).

> I guess the issue is that we log the invalid video URL first but don't update it when the next valid URL comes up?

I'm not sure exactly how the project handles this. Looking through the code and README, I saw no mention of, or checks for, the CAPTCHA.

@freeleefly

freeleefly commented Nov 18, 2019

I always get a 403 when I use IDM to download the actual video segments through the addresses in the .m3u8 file obtained with AvgleDownloader.
But your aria2c command can download them, and pasting the address into Chrome and using "Save as" works too. **I really want to know why, and what is happening.**

By the way:
On avgle.com 😂, IDM (Internet Download Manager) can download the .ts file it sniffs in Chrome. That does work!

However, when I copied the address of the .ts file sniffed by the IDM extension in Chrome and pasted it into "Add URL" to download it, IDM failed with error code 403.

In Chrome's Network tab, I could find the .ts file and "Save as" it. At first I thought the difference between the two ways was the User-Agent, so I updated IDM to 6.35 and added a User-Agent string, but that didn't work either.

So:
1) How can I download files using URLs from avgle.com?
2) What exactly is the difference between the two ways of downloading?

The URL is something like "http://xxxx.com/key=lagjlasjdgl=/media=hlsA/xxxx.mp4/seg-001.ts"


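Oh, the maintainer can read Chinese. Then let me explain it again in Chinese, and use this issue to get to the bottom of what is going on.
On a given Avgle page, IDM sniffs the address, and clicking it directly downloads the file successfully. But if I copy the sniffed address straight into IDM and download it again, it fails.
Looking at your program, I see that after base64-decoding the m3u8 address, it just adds a User-Agent and can then download directly from the address with wget or aria2. How does that work? I also added a User-Agent in IDM (6.35), and the download still fails (403, no permission).

What I most want to understand is the difference between downloading via sniffing and downloading by adding the address directly. Why does your program manage to download successfully? Looking at the log file, I found many 403s there too. I'm very curious why it doesn't work for me.
Also, what I get after base64 decoding is a URL like https://gooqlevideo.xyz/playback/eyJ0eXxxx. How do you use that to find the m3u8 file? I read your source code and didn't quite follow it, as I'm not very good at bash. That is my other question (sorry, I have so many questions).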
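
For readers who don't read bash, the base64 decoding step mentioned above can be sketched in Python. This assumes the standard base64 alphabet; the thread does not show the exact encoding the site uses. `decode_playlist_address` and the sample URL are hypothetical.

```python
import base64


def decode_playlist_address(encoded: str) -> str:
    """Decode a base64-encoded address into text (hypothetical helper)."""
    # Re-add any '=' padding that may have been stripped from the value.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.b64decode(padded).decode("utf-8")


# Round trip with a made-up URL, purely for illustration.
sample = base64.b64encode(b"https://example.invalid/playback/abc").decode("ascii")
print(decode_playlist_address(sample))
```

This covers only the decoding step. How the decoded playback URL leads to the playlist itself is not explained in this thread.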

@vide0
Member

vide0 commented Nov 18, 2019

@freeleefly Honestly, this is rather mysterious. It could be caused by the User-Agent header, or it could be caused by the Referer header. It's best to add both.

@freeleefly

freeleefly commented Nov 19, 2019

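Thank you. I found it: it does seem to be a Referer problem.
Looking closely at the direct URL, it contains referer=force, which presumably means the Referer is enforced. Your program adds a Referer setting. IDM's sniffing presumably adds it by default, but when I add the address manually no Referer is sent, so the download fails. That should be what is happening.
Now the headache is how to make IDM add a Referer. When I use Python's requests module to fetch the file and write it out, it keeps reporting "no such file or address".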
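
Sending both headers from Python could look roughly like the sketch below, which uses only the standard library. The URLs and the User-Agent string are placeholders, and which Referer value the server actually expects is not stated in this thread, so the Avgle page URL here is a guess.

```python
import urllib.request

# Hypothetical values: substitute the real segment URL and the page it came from.
SEGMENT_URL = "http://example.invalid/media=hlsA/video.mp4/seg-001.ts"
PAGE_URL = "https://avgle.com/"
USER_AGENT = "Mozilla/5.0"


def build_segment_request(url: str, referer: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request carrying both headers suggested in the thread."""
    return urllib.request.Request(
        url,
        headers={"Referer": referer, "User-Agent": user_agent},
    )


def download(url: str, referer: str, user_agent: str, path: str) -> None:
    """Fetch one segment and write it to `path`; a 403 raises HTTPError."""
    request = build_segment_request(url, referer, user_agent)
    with urllib.request.urlopen(request, timeout=30) as response, open(path, "wb") as out:
        out.write(response.read())
```

The requests library takes the same dictionary through its `headers=` argument. In either case, `path` must point into a directory that already exists, since `open()` does not create parent directories.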

@freeleefly

Oh, one more question. On the page, I see that the video address is encrypted with blob. How did you get from this blob address to the m3u8 address?

@vide0
Member

vide0 commented Nov 19, 2019

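@freeleefly Which video is encrypted? So far I've only encountered base64-encoded ones. If you really have run into an encrypted one, you could take a look at this repository: https://github.com/video-dev/hls.js/#supported-m3u8-tags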

@freeleefly

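No, no. I mean what you can see in the HTML here.
How does this src lead to the m3u8 that is needed? That part is unclear to me. I do understand the base64 encoding step.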

@vide0
Member

vide0 commented Nov 19, 2019

@freeleefly I don't quite understand what you mean. Are you asking how to get the m3u8 link from a URL?

@freeleefly

Yes. The src attribute is blob:http...., and I don't know how to process a value like that to get the m3u8. That's what I want to know.
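
For context, a `blob:` URL is a handle to an object created inside the browser (for HLS players such as hls.js, typically a MediaSource). It does not encode the playlist address and cannot be resolved outside the page. One workaround is to look at the requests the page makes, for example in Chrome's Network tab, and pick out the playlist request. Below is a minimal sketch of that filtering step in Python, assuming you already have a list of captured request URLs. `find_playlist_urls` and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def find_playlist_urls(request_urls):
    """Return the URLs whose path ends in .m3u8, ignoring the query string."""
    return [
        url for url in request_urls
        if urlparse(url).path.lower().endswith(".m3u8")
    ]


captured = [
    "https://example.invalid/page.html",
    "https://example.invalid/video/index.m3u8?token=abc",
    "https://example.invalid/video/seg-001.ts",
]
print(find_playlist_urls(captured))
```

A playlist served from a path without that extension, such as the /playback/ URL mentioned earlier in the thread, would slip past this filter and would need a check on the response body (does it start with `#EXTM3U`?) instead.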


4 participants