SOCKS4 and SOCKS5 proxies are stealthier than HTTP and HTTPS proxies, so please add support for SOCKS proxy URLs.
However, be cautious: at one point Playwright rejected non-anonymous SOCKS proxies, and I'm not sure about the current status.
Test code:
import asyncio
from crawlee.proxy_configuration import ProxyConfiguration

async def config_proxy() -> ProxyConfiguration:
    # Create and return a ProxyConfiguration object.
    proxy_configuration = ProxyConfiguration(
        tiered_proxy_urls=[
            # No proxy tier. (Not needed, but optional in case you do not want to use any proxy on lowest tier.)
            # [None],
            # lower tier, cheaper, preferred as long as they work
            ['https://example.com:8080', 'https://example.com:8080'],
            # higher tier, more expensive, used as a fallback
            ['socks4://example.com:8080', 'socks5://example.com:8080'],
        ]
    )
    return proxy_configuration


asyncio.run(config_proxy())
Terminal output:
/Users/matecsaj/PycharmProjects/wat-crawlee/venv/bin/python /Users/matecsaj/Library/Application Support/JetBrains/PyCharm2024.3/scratches/scratch_1.py
Traceback (most recent call last):
  File "/Users/matecsaj/Library/Application Support/JetBrains/PyCharm2024.3/scratches/scratch_1.py", line 19, in <module>
    asyncio.run(config_proxy())
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 720, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/matecsaj/Library/Application Support/JetBrains/PyCharm2024.3/scratches/scratch_1.py", line 6, in config_proxy
    proxy_configuration = ProxyConfiguration(
        tiered_proxy_urls=[
    ...<6 lines>...
        ]
    )
  File "/Users/matecsaj/PycharmProjects/wat-crawlee/venv/lib/python3.13/site-packages/crawlee/proxy_configuration.py", line 103, in __init__
    [[URL(url) for url in tier if self._url_validator.validate_python(url)] for tier in tiered_proxy_urls]
                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/Users/matecsaj/PycharmProjects/wat-crawlee/venv/lib/python3.13/site-packages/pydantic/type_adapter.py", line 412, in validate_python
    return self.validator.validate_python(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        object,
        ^^^^^^^
    ...<3 lines>...
        allow_partial=experimental_allow_partial,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
pydantic_core._pydantic_core.ValidationError: 1 validation error for function-wrap[wrap_val()]
  URL scheme should be 'http' or 'https' [type=url_scheme, input_value='socks4://example.com:8080', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/url_scheme
Process finished with exit code 1
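
For illustration, the traceback shows the rejection happens at URL validation, where only the 'http' and 'https' schemes are accepted. Below is a minimal stdlib-only sketch of what a more permissive scheme check could look like. It is not crawlee's actual validator: the names `ALLOWED_PROXY_SCHEMES` and `validate_proxy_url` are hypothetical, and whether the underlying HTTP clients and Playwright can actually use such proxies would still need to be verified separately.

```python
from urllib.parse import urlsplit

# Assumption: the schemes a proxy URL validator might accept once SOCKS is supported.
ALLOWED_PROXY_SCHEMES = frozenset({'http', 'https', 'socks4', 'socks5'})


def validate_proxy_url(url: str) -> str:
    """Return the URL unchanged if its scheme is allowed and it has a host.

    Hypothetical helper for illustration only; raises ValueError otherwise.
    """
    parts = urlsplit(url)
    # urlsplit lowercases the scheme, so 'SOCKS5://...' is treated like 'socks5://...'.
    if parts.scheme not in ALLOWED_PROXY_SCHEMES:
        raise ValueError(f'Unsupported proxy URL scheme {parts.scheme!r} in {url!r}')
    if not parts.hostname:
        raise ValueError(f'Proxy URL has no host: {url!r}')
    return url


if __name__ == '__main__':
    # The tier from the test code above that currently triggers the ValidationError.
    for proxy_url in ['socks4://example.com:8080', 'socks5://example.com:8080']:
        print(validate_proxy_url(proxy_url))
```

With a check along these lines, the second tier in the test code would pass validation instead of raising.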