Describe the bug
clickhouse-connect seems to ignore the environment variable no_proxy/NO_PROXY in a local Apache Superset setup on Windows 10 with Docker Desktop and WSL2.
A setup that requires a forward proxy (e.g. in a corporate network) needs the HTTP_PROXY/HTTPS_PROXY environment variables so that Apache Superset can download Python packages such as the clickhouse-connect driver, as well as sample data. It may also require additional SSL certificates and the corresponding environment variables for Python and Node.js (details in the configuration below; this part is not the issue here).
In the scenario described below a local Fiddler forward proxy is used. Which proxy is used does not seem to matter; it only determines the text of the error message.
After installing the clickhouse-connect driver, connecting to ClickHouse via the connection string clickhousedb://default:@clickhouse:8123 should work without leaving the local Docker network, given that the ClickHouse container is named clickhouse. Instead, trying to connect to the database produces the following error message:
ERROR: :HTTPDriver for http://clickhouse:8123 returned response code 502)
[Fiddler] DNS Lookup for "clickhouse" failed. System.Net.Sockets.SocketException No such host is known
The error message indicates that the request to host clickhouse left the local Docker network and was sent to the forward proxy. This ignores the environment variable no_proxy, which lists the host clickhouse as an exception.
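As a cross-check that the exception list itself is well-formed, the value can be evaluated with Python's standard library, which has its own NO_PROXY matching in urllib.request.proxy_bypass_environment. This is only a sketch of how the standard library reads the list; it does not show what clickhouse-connect does internally, and the helper name is_exempt_from_proxy is made up for illustration.

```python
from urllib.parse import urlsplit
from urllib.request import proxy_bypass_environment

# Value of no_proxy/NO_PROXY as set in the .env file of this report.
NO_PROXY = ("localhost,127.0.0.1,db,redis,superset,superset-init,"
            "superset-worker,superset-beat,clickhouse")


def is_exempt_from_proxy(url: str, no_proxy: str = NO_PROXY) -> bool:
    """Return True if the standard library would bypass the proxy for url.

    Hypothetical helper for illustration; not part of clickhouse-connect.
    """
    # hostname drops credentials and port, e.g. "clickhouse"
    host = urlsplit(url).hostname or ""
    # Passing the list explicitly keeps the check independent of os.environ.
    return bool(proxy_bypass_environment(host, {"no": no_proxy}))


if __name__ == "__main__":
    for url in ("http://clickhouse:8123", "https://play.clickhouse.com:443"):
        route = "direct" if is_exempt_from_proxy(url) else "via proxy"
        print(url, "->", route)
```

If the standard library treats clickhouse as exempt but the driver still sends the request to the proxy, the exception list is most likely not the problem.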
Workaround:
Docker Compose creates a default network that bridges to the host. With the ClickHouse settings described below, the container's port is published on the host OS, so connecting via clickhousedb://default:@localhost:8123 does work. This is sub-optimal, since a later production setup should isolate the database while still allowing the database driver to be installed through the proxy. That requires the proxy settings to be handled properly, so that connections to the database are not rerouted via the forward proxy.
Other considerations:
Connecting to clickhousedb://play:clickhouse@play.clickhouse.com:443 as suggested here does work and serves as a sanity check of the proxy and certificate setup. The forward proxy can resolve the external host, whereas the container/service name clickhouse is unknown to it and leads to the error above.
Steps to reproduce
- Set up Apache Superset on Windows 10 with Docker Desktop with WSL2 support as described here
- Add a clickhouse container to docker-compose-non-dev.yml (add the following in the respective sections):
x-clickhouse-volumes:
  &clickhouse-volumes
  - clickhouse_home:/var/lib/clickhouse
services:
  clickhouse:
    image: clickhouse/clickhouse-server:23
    container_name: clickhouse
    env_file: docker/.env-non-dev
    user: "root"
    restart: unless-stopped
    ports:
      - "8123:8123"
      - "9000:9000"
    volumes: *superset-volumes
volumes:
  clickhouse_home:
    external: false
- Add clickhouse-connect>=0.4.1 to ./docker/requirements-local.txt as described here and here
- Add the proxy setup to ./docker/.env-non-dev (add the following to the file):
# Local Proxy Settings
http_proxy="http://host.docker.internal:8888"
https_proxy="http://host.docker.internal:8888"
no_proxy="localhost,127.0.0.1,db,redis,superset,superset-init,superset-worker,superset-beat,clickhouse"
HTTP_PROXY="http://host.docker.internal:8888"
HTTPS_PROXY="http://host.docker.internal:8888"
NO_PROXY="localhost,127.0.0.1,db,redis,superset,superset-init,superset-worker,superset-beat,clickhouse"
# SSL certificate for the Python part
REQUESTS_CA_BUNDLE=/app/docker/ca-bundle.crt
# The superset_init container uses urllib (instead of urllib3) and ignores the REQUESTS_CA_BUNDLE variable
SSL_CERT_FILE=/app/docker/ca-bundle.crt
# Possibly used for the Node.js based UI components
NODE_EXTRA_CA_CERTS=/app/docker/ca-bundle.crt
- Start the containers via TAG=1.5.3 docker-compose -f docker-compose-non-dev.yml up
- Log in to http://localhost:8088/
- Add the clickhouse database as described here
Expected behaviour
With the connection string clickhousedb://default:@clickhouse:8123, clicking "Test Connection" should connect to the database successfully.
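The same check can be made without the Superset UI. The following is a minimal sketch, assuming it is run inside the superset_app container with clickhouse-connect installed. Host, port and user mirror the connection string in this report, and the function name check_clickhouse_connection is only illustrative. If the driver routes the request through the forward proxy, the call should raise an error similar to the one quoted in this report.

```python
def check_clickhouse_connection(host="clickhouse", port=8123,
                                username="default", password=""):
    """Connect to ClickHouse and return the server version as a string."""
    # Imported lazily so the module loads even where the driver is missing.
    import clickhouse_connect

    client = clickhouse_connect.get_client(
        host=host, port=port, username=username, password=password
    )
    # A trivial query is enough to prove that the HTTP request got through.
    return client.command("SELECT version()")


if __name__ == "__main__":
    print(check_clickhouse_connection())
```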
Code example
The code of clickhouse-connect documents support for the proxy environment variables, but not for the no_proxy exception list (clickhouse_connect/driver/__init__.py, lines 65 to 66 in b1c78f8):
:param http_proxy http proxy address. Equivalent to setting the HTTP_PROXY environment variable
:param https_proxy https proxy address. Equivalent to setting the HTTPS_PROXY environment variable
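For comparison, the behaviour requested in this issue could look roughly like the following. It is a sketch using only the standard library, not the driver's actual code: select_proxy is a hypothetical helper, and the precedence of lowercase over uppercase variable names is an assumption.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlsplit
from urllib.request import proxy_bypass_environment


def select_proxy(url: str,
                 env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the proxy URL to use for url, or None to connect directly."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    def lookup(name: str) -> str:
        # Assumption: the lowercase variable wins if both spellings are set.
        return env.get(name.lower()) or env.get(name.upper()) or ""

    no_proxy = lookup("no_proxy")
    if no_proxy and proxy_bypass_environment(host, {"no": no_proxy}):
        return None  # host is on the exception list, connect directly

    # http_proxy for http:// targets, https_proxy for https:// targets
    return lookup(f"{parts.scheme}_proxy") or None
```

With the environment from this report, select_proxy("http://clickhouse:8123") would return None, while an external https:// URL would still be sent to the forward proxy.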
clickhouse-connect and/or ClickHouse server logs
Configuration
Environment - from within the superset_app container:
pip list | grep clickhouse
clickhouse-connect 0.5.18
python --version
Python 3.8.13
uname -a
Linux 8275ab37bfa1 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 GNU/Linux
ClickHouse server version 23 (see the Docker Compose file).