Browser listening on: ws://127.0.0.1:59948 #35
Comments
Could you try running the code again with |
Nope, even with |
Just to check, are you on the latest version? |
Yep, v2.0.5 |
I'm not fully sure whether it's requests-html in the library or requests-html in your code that is causing the problem. If you disable proxy rotation entirely and use the default response_retriever, do you still get the same error? |
No luck. Exact same thing with the default response_retriever and no proxies. However, while it's running, I can see multiple, sometimes ten to fifteen, Chromium processes that just sit there. It seems like they're not getting closed properly, even though I added code to close them manually when the library wasn't doing it itself. Other than that, no ideas. |
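For the lingering Chromium processes, here is a minimal sketch of one way to make sure a session is always closed, even when a scrape raises or is cancelled. It uses only the standard library. `closing_session` and `session_factory` are hypothetical names, not part of PyPartPicker or requests-html; the sketch only assumes the session object has a `close()` method, which may be either sync or async.

```python
import contextlib
import inspect


@contextlib.asynccontextmanager
async def closing_session(session_factory):
    """Create a session and guarantee close() runs when the block exits.

    `session_factory` is a hypothetical zero-argument callable that
    returns an object with a close() method (sync or async).
    """
    session = session_factory()
    try:
        yield session
    finally:
        # close() may be a plain method or a coroutine function,
        # so only await the result when it is awaitable.
        result = session.close()
        if inspect.isawaitable(result):
            await result
```

Used as `async with closing_session(make_session) as session: ...`, the `finally` branch runs on normal exit, on exceptions, and on task cancellation. Reusing one session for the whole run, rather than opening one per part, may also reduce the number of browser processes, though whether that applies here depends on the program that was posted.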
The program without proxies, if you are curious:
|
I've had this issue a few times while developing an app utilising PyPartPicker, but it has usually resolved itself as I've fixed other problems. Now I'm in a bit of a stalemate.
I've just implemented proxy rotation and asyncio functions as the documentation suggests, and it seems to be working fine. I moved the proxy debugging to the main function to test the proxies, but had to move it back to the response retriever since it just broke with the debugging in the main function for some reason.
Anyway, it goes through about 10 parts and then just hangs on "2025-01-20 19:15:56,761 - INFO - Browser listening on: ws://127.0.0.1:59948/devtools/browser/f358eb04-9c23-4952-b9c6-a92e91b1fe9b". I know it's a localhost loopback IP, but I just can't see why it hangs. It should retry if the connection is unsuccessful anyway!
I've scoured pretty much all of StackOverflow and the Requests-HTML documentation but can't seem to find anything on it at all. At first I thought it had something to do with rate limiting, but now I'm having second thoughts. I left it sitting for about 45 minutes and eventually it closed the connection.
Then I thought that perhaps it had something to do with the Chromium processes not terminating after completing the scrape, so I added some logic that closes them manually. I don't think it's the custom response retriever either.
Anyway, I don't know if this is a library-wide thing or just an issue with my program, but since nothing else seems to have any answers, I thought I'd give it a shot here. Let me know if anyone has any ideas :)
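On the expectation that a stuck attempt should be retried: a hang is not the same as a failed connection, so nothing retries unless something imposes a deadline. Below is a minimal sketch of a per-attempt timeout with retries, using only `asyncio`. `fetch_with_timeout` and its parameters are hypothetical names, and `fetch` stands for whatever coroutine function does the actual page fetch; none of this is PyPartPicker or requests-html API.

```python
import asyncio


async def fetch_with_timeout(fetch, url, timeout=30.0, retries=3, backoff=1.0):
    """Await fetch(url), abandoning each attempt after `timeout` seconds.

    `fetch` is a hypothetical coroutine function taking a URL.
    Raises RuntimeError once every attempt has timed out.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            # wait_for cancels the pending coroutine when the deadline passes.
            return await asyncio.wait_for(fetch(url), timeout=timeout)
        except asyncio.TimeoutError as exc:
            last_error = exc
            if attempt < retries:
                # Linear backoff before the next attempt.
                await asyncio.sleep(backoff * attempt)
    raise RuntimeError(f"{url!r} timed out after {retries} attempts") from last_error
```

Two caveats. `asyncio.wait_for` can only interrupt code that is actually awaiting, so a blocking call inside `fetch` would still hang. And cancelling a render partway through may leave its browser process running, so this is best paired with explicit cleanup.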
Program below: