Description
Describe the bug
clickhouse-connect seems to ignore the environment variable no_proxy/NO_PROXY in a local Apache Superset configuration on Windows 10 with Docker Desktop and WSL2.
A setup that requires a forward proxy (e.g. in a corporate network) needs the HTTP_PROXY/HTTPS_PROXY environment variables so that Apache Superset can download sample data and Python packages such as the clickhouse-connect driver. It may also be necessary to provide additional SSL certificates and to set the corresponding environment variables for Python and Node.js (details are in the configuration below; this is not the issue here).
In the particular scenario described below, a local Fiddler forward proxy is used; however, the choice of proxy does not seem to matter and only affects the text of the error message.
Installing the clickhouse-connect driver and connecting to ClickHouse via the connection string clickhousedb://default:@clickhouse:8123 should work without leaving the local Docker network, given that the ClickHouse container is named clickhouse. When trying to connect to the database, the following error message appears:
ERROR: :HTTPDriver for http://clickhouse:8123 returned response code 502)
[Fiddler] DNS Lookup for "clickhouse" failed. System.Net.Sockets.SocketException No such host is known
The error message indicates that the request to host clickhouse left the local Docker network and was sent to the forward proxy. This ignores the no_proxy environment variable, which lists host clickhouse as an exception.
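As a cross-check that the exception list itself is well-formed, the following minimal sketch asks Python's standard library whether it would bypass the proxy for a given host. It assumes the no_proxy value from this report and uses `urllib.request.proxy_bypass_environment`; the helper name `stdlib_bypasses_proxy` is made up for illustration. It says nothing about how clickhouse-connect itself resolves proxies, only that the value is one a no_proxy-aware client would honor.

```python
import os
import urllib.request

# Value copied from the .env-non-dev file in this report.
os.environ["no_proxy"] = (
    "localhost,127.0.0.1,db,redis,superset,superset-init,"
    "superset-worker,superset-beat,clickhouse"
)


def stdlib_bypasses_proxy(host):
    """Return True if urllib would skip the proxy for this host.

    Hypothetical helper: a thin wrapper around the standard library check,
    which reads no_proxy/NO_PROXY from the process environment.
    """
    return bool(urllib.request.proxy_bypass_environment(host))


if __name__ == "__main__":
    for host in ("clickhouse", "play.clickhouse.com"):
        print(host, "bypasses proxy:", stdlib_bypasses_proxy(host))
```

Running this inside the superset_app container with the real environment (without the hard-coded assignment) would show whether the variable actually reaches the Python process.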
Workaround:
Docker Compose creates a default network that bridges to the host. With the ClickHouse settings from the compose file in this report, the container is exposed on a port of the host OS, so connecting via the string clickhousedb://default:@localhost:8123 does work. This is sub-optimal because a later production setup should isolate the database while still allowing the database driver to be installed. That requires the proxy settings to be handled properly, so that connections to the database are not routed through the forward proxy.
Other considerations:
Connecting to clickhousedb://play:[email protected]:443, as suggested here, does work and serves as a sanity check of the proxy and certificate setup. The forward proxy can resolve the external URL, whereas the container/service name clickhouse is unknown to it, which leads to the error.
Steps to reproduce
- Set up Apache Superset on Windows 10 with Docker Desktop and WSL2 support, as described here
- Add a clickhouse container to docker-compose-non-dev.yml (add the following in the respective sections):

    x-clickhouse-volumes: &clickhouse-volumes
      - clickhouse_home:/var/lib/clickhouse

    services:
      clickhouse:
        image: clickhouse/clickhouse-server:23
        container_name: clickhouse
        env_file: docker/.env-non-dev
        user: "root"
        restart: unless-stopped
        ports:
          - "8123:8123"
          - "9000:9000"
        volumes: *superset-volumes

    volumes:
      clickhouse_home:
        external: false
- Add clickhouse-connect>=0.4.1 to ./docker/requirements-local.txt as described here and here
- Add proxy setup to ./docker/.env-non-dev (add the following to the file):

    # Local Proxy Settings
    http_proxy="http://host.docker.internal:8888"
    https_proxy="http://host.docker.internal:8888"
    no_proxy="localhost,127.0.0.1,db,redis,superset,superset-init,superset-worker,superset-beat,clickhouse"
    HTTP_PROXY="http://host.docker.internal:8888"
    HTTPS_PROXY="http://host.docker.internal:8888"
    NO_PROXY="localhost,127.0.0.1,db,redis,superset,superset-init,superset-worker,superset-beat,clickhouse"
    # SSL certificate for the Python part
    REQUESTS_CA_BUNDLE=/app/docker/ca-bundle.crt
    # The superset_init container uses urllib (instead of urllib3) and ignores the REQUESTS_CA_BUNDLE variable
    SSL_CERT_FILE=/app/docker/ca-bundle.crt
    # Eventually used for the node.js based UI components
    NODE_EXTRA_CA_CERTS=/app/docker/ca-bundle.crt
- Start the containers via TAG=1.5.3 docker-compose -f docker-compose-non-dev.yml up
- Log in to http://localhost:8088/
- Add the ClickHouse database as described here
Expected behaviour
Entering the connection string clickhousedb://default:@clickhouse:8123 and clicking "Test Connection" should result in a successful connection to the database.
Code example
The code of clickhouse-connect indicates support for the proxy environment variables, but not for the no_proxy exception list:
clickhouse-connect/clickhouse_connect/driver/__init__.py
Lines 65 to 66 in b1c78f8
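For comparison, here is a minimal sketch of what honoring the exception list during proxy selection could look like. This is not the clickhouse-connect implementation; the functions `host_matches_no_proxy` and `select_proxy` are hypothetical names, and the matching rules (exact host or dot-separated suffix, `*` as wildcard) are an assumption modeled on common no_proxy conventions.

```python
from urllib.parse import urlsplit


def host_matches_no_proxy(host, no_proxy):
    """Return True if host is covered by a comma-separated no_proxy list.

    Hypothetical helper: matches exact host names and domain suffixes,
    and treats a lone "*" as "never use a proxy".
    """
    host = host.lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")
        if not entry:
            continue
        if entry == "*" or host == entry or host.endswith("." + entry):
            return True
    return False


def select_proxy(url, env):
    """Pick the proxy URL for url from env, or None if it must be bypassed.

    env is any mapping of environment variables (for example os.environ).
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    no_proxy = env.get("no_proxy") or env.get("NO_PROXY") or ""
    if host_matches_no_proxy(host, no_proxy):
        return None
    key = f"{parts.scheme}_proxy"
    return env.get(key) or env.get(key.upper())


if __name__ == "__main__":
    # Values taken from the .env-non-dev file in this report.
    env = {
        "http_proxy": "http://host.docker.internal:8888",
        "https_proxy": "http://host.docker.internal:8888",
        "no_proxy": "localhost,127.0.0.1,db,redis,superset,superset-init,"
                    "superset-worker,superset-beat,clickhouse",
    }
    print(select_proxy("http://clickhouse:8123", env))
    print(select_proxy("https://play.clickhouse.com:443", env))
```

With the settings from this report, such a check would return no proxy for http://clickhouse:8123 and the Fiddler proxy for the external play host, which matches the expected behaviour described above.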
clickhouse-connect and/or ClickHouse server logs
Configuration
Environment - from within the superset_app container:
pip list | grep clickhouse
clickhouse-connect 0.5.18
python --version
Python 3.8.13
uname -a
Linux 8275ab37bfa1 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 GNU/Linux
ClickHouse server version 23 (see the Docker Compose file).