HTTP client ignores "no_proxy" environment variable #163

Closed
@karsten-wagner

Describe the bug

clickhouse-connect seems to ignore the no_proxy/NO_PROXY environment variable in a local Apache Superset setup on Windows 10 with Docker Desktop and WSL2.

A setup that requires a forward proxy (e.g. in a corporate network) needs the HTTP_PROXY/HTTPS_PROXY environment variables so that Apache Superset can download Python packages, such as the clickhouse-connect driver, and sample data. It may also require additional SSL certificates and the corresponding environment variables for Python and Node.js (details in the configuration below; they are not the issue here).
In the scenario described below, a local Fiddler forward proxy is used. The choice of proxy does not seem to matter; it only affects the text of the error message.

After installing the clickhouse-connect driver, connecting to ClickHouse via the connection string clickhousedb://default:@clickhouse:8123 should work without leaving the local Docker network, given that the ClickHouse container is named clickhouse. Instead, trying to connect to the database produces the following error message:

ERROR: :HTTPDriver for http://clickhouse:8123 returned response code 502)
[Fiddler] DNS Lookup for "clickhouse" failed. System.Net.Sockets.SocketException No such host is known

The error message indicates that the request to host clickhouse left the local Docker network and was sent to the forward proxy, even though the no_proxy environment variable lists clickhouse as an exception.
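For comparison, the following minimal sketch shows how Python's standard library would evaluate the no_proxy list from this report. It uses `urllib.request.proxy_bypass_environment` purely as a reference for the expected matching behaviour; it is not what clickhouse-connect does internally, and the helper name `bypasses_proxy` is made up for illustration.

```python
from urllib.request import proxy_bypass_environment

# no_proxy value taken from the .env file in this report.
NO_PROXY = (
    "localhost,127.0.0.1,db,redis,superset,superset-init,"
    "superset-worker,superset-beat,clickhouse"
)


def bypasses_proxy(host: str, no_proxy: str = NO_PROXY) -> bool:
    """Return True if the stdlib rules would connect to `host` directly.

    Hypothetical helper for illustration only.
    """
    # The stdlib helper expects the exception list under the key "no".
    return bool(proxy_bypass_environment(host, {"no": no_proxy}))


if __name__ == "__main__":
    for host in ("clickhouse", "clickhouse:8123", "example.com"):
        print(host, bypasses_proxy(host))
```

Under these rules the service name clickhouse, with or without the port, is expected to bypass the proxy, while an external host such as example.com is not.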

Workaround:
Docker Compose creates a default network that bridges to the host. With the ClickHouse settings described below, the container's port is published on the host OS, so connecting via clickhousedb://default:@localhost:8123 does work. This is sub-optimal because a later production setup should isolate the database while still allowing the database driver to be installed. That requires proper handling of the proxy settings, so that connections to the database are not routed through the forward proxy.

Other considerations:
Connecting to clickhousedb://play:[email protected]:443, as suggested here, does work and serves as a sanity check of the proxy and certificate setup. The forward proxy can resolve the external URL, whereas the container/service name clickhouse is unknown to it and leads to the error above.

Steps to reproduce

  1. Set up Apache Superset on Windows 10 with Docker Desktop and WSL2 support as described here

  2. Add clickhouse container to docker-compose-non-dev.yml (add the following in the respective sections):

    x-clickhouse-volumes:
      &clickhouse-volumes
      - clickhouse_home:/var/lib/clickhouse
    
    services:
      clickhouse:
        image: clickhouse/clickhouse-server:23
        container_name: clickhouse
        env_file: docker/.env-non-dev
        user: "root"
        restart: unless-stopped
        ports:
          - "8123:8123"
          - "9000:9000"
        volumes: *superset-volumes
    
    volumes:
      clickhouse_home:
        external: false

  3. Add clickhouse-connect>=0.4.1 to ./docker/requirements-local.txt as described here and here

  4. Add the proxy setup to ./docker/.env-non-dev (add the following to the file):

    # Local Proxy Settings
    http_proxy="http://host.docker.internal:8888"
    https_proxy="http://host.docker.internal:8888"
    no_proxy="localhost,127.0.0.1,db,redis,superset,superset-init,superset-worker,superset-beat,clickhouse"
    HTTP_PROXY="http://host.docker.internal:8888"
    HTTPS_PROXY="http://host.docker.internal:8888"
    NO_PROXY="localhost,127.0.0.1,db,redis,superset,superset-init,superset-worker,superset-beat,clickhouse"
    # SSL certificate for the Python part
    REQUESTS_CA_BUNDLE=/app/docker/ca-bundle.crt
    # The superset_init container uses urllib (instead of urllib3) and ignores the REQUESTS_CA_BUNDLE variable
    SSL_CERT_FILE=/app/docker/ca-bundle.crt
    # Possibly needed for the Node.js-based UI components
    NODE_EXTRA_CA_CERTS=/app/docker/ca-bundle.crt

  5. Start the containers via TAG=1.5.3 docker-compose -f docker-compose-non-dev.yml up

  6. Log in to http://localhost:8088/

  7. Add clickhouse database as described here
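
Before testing the connection, it can help to confirm that the proxy variables from step 4 actually reach the Superset container. The following is a small diagnostic sketch meant to be run with Python inside the container; the helper names `proxy_settings` and `no_proxy_hosts` are hypothetical and not part of Superset or clickhouse-connect.

```python
import os
from typing import Dict, List, Mapping

# Variable names as written in the .env file from step 4.
PROXY_VARS = (
    "http_proxy", "https_proxy", "no_proxy",
    "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
)


def proxy_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return only the proxy-related variables that are set."""
    return {name: environ[name] for name in PROXY_VARS if name in environ}


def no_proxy_hosts(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Split the no_proxy list (lowercase name checked first) into entries."""
    raw = environ.get("no_proxy") or environ.get("NO_PROXY") or ""
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


if __name__ == "__main__":
    for name, value in proxy_settings().items():
        print(f"{name}={value}")
    print("clickhouse listed in no_proxy:", "clickhouse" in no_proxy_hosts())
```

If clickhouse is listed here and the request still reaches the forward proxy, the variables are set correctly and the exception list is being dropped somewhere in the client.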

Expected behaviour

Entering the connection string clickhousedb://default:@clickhouse:8123 and clicking "Test Connection" should result in a successful connection to the database.

Code example

The clickhouse-connect source documents support for the proxy environment variables, but not for the no_proxy exception list:

:param http_proxy http proxy address. Equivalent to setting the HTTP_PROXY environment variable
:param https_proxy https proxy address. Equivalent to setting the HTTPS_PROXY environment variable
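
As a rough illustration of the behaviour requested here, the sketch below resolves the proxy for a URL and returns None when the host is on the no_proxy list. It relies only on the standard library (`urllib.request.getproxies_environment` and `proxy_bypass_environment`); `resolve_proxy` is a hypothetical helper and does not reflect how clickhouse-connect is actually implemented.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit
from urllib.request import getproxies_environment, proxy_bypass_environment


def resolve_proxy(
    url: str, proxies: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the proxy URL to use for `url`, or None for a direct connection.

    `proxies` has the shape returned by getproxies_environment(): lowercase
    scheme keys ("http", "https") plus "no" for the no_proxy list.
    Hypothetical helper for illustration only.
    """
    if proxies is None:
        # Read http_proxy/https_proxy/no_proxy from the process environment.
        proxies = getproxies_environment()
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Check the exception list first, so listed hosts never get a proxy.
    if host and proxy_bypass_environment(host, proxies):
        return None
    return proxies.get(parts.scheme)


if __name__ == "__main__":
    settings = {
        "http": "http://host.docker.internal:8888",
        "https": "http://host.docker.internal:8888",
        "no": "localhost,127.0.0.1,clickhouse",
    }
    print(resolve_proxy("http://clickhouse:8123", settings))
    print(resolve_proxy("https://example.com:443", settings))
```

With the settings from this report, such a lookup would yield a direct connection for http://clickhouse:8123 and the Fiddler proxy for external hosts.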

clickhouse-connect and/or ClickHouse server logs

Configuration

Environment - from within the superset_app container:

pip list | grep clickhouse
clickhouse-connect     0.5.18

python --version
Python 3.8.13

uname -a
Linux 8275ab37bfa1 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 GNU/Linux

ClickHouse server version 23 (see the Docker Compose file above).
