[Proxy] Unable to start offline with persistent cache #28305

Open
carlzogh opened this issue Sep 6, 2024 · 2 comments
carlzogh commented Sep 6, 2024

Describe the bug
Vault Proxy with a persistent static secret cache and auto-auth enabled cannot start up offline: on startup it tries to connect to the Vault server to look up and renew its token.

Unless this is a misconfiguration on our side, it prevents us from relying on Vault Proxy for static stability and availability when the Vault Proxy process is restarted while the Vault server is unreachable.

To Reproduce

  1. Run Vault Proxy with the provided configuration and request any secret, so that the cache is created and the token and secret are persisted.
  2. Stop the Vault Proxy process, ensuring that the cache database is still present and will be used by the next run.
  3. Disconnect from the internet (e.g. turn Wi-Fi off) to emulate an unreachable network.
  4. Run Vault Proxy again with the same configuration and observe that it fails to start because it cannot connect to the Vault server. Example logs:
Couldn't start vault with IPC_LOCK. Disabling IPC_LOCK, please use --cap-add IPC_LOCK
==> Vault Proxy started! Log data will stream in below:

==> Vault Proxy configuration:

           Api Address 1: http://0.0.0.0:8200
                     Cgo: disabled
               Log Level: trace
                 Version: Vault v1.17.5, built 2024-08-30T15:54:57Z
             Version Sha: 4d0c53e84094b8017d32b6e5b7f8142035c8837f

2024-09-06T09:39:38.514Z [INFO]  proxy.cache: cache configured: cache_static_secrets=true disable_caching_dynamic_secrets=true
2024-09-06T09:39:38.516Z [TRACE] proxy.cache.cacheboltdb: closing bolt db: path=/var/run/cache/vault-agent-cache.db
2024-09-06T09:39:38.519Z [TRACE] proxy.cache.leasecache: restored token: id=0EHrt
2024-09-06T09:39:38.519Z [TRACE] proxy.cache.leasecache: restored token: id=0eE2O
... other tokens
2024-09-06T09:39:38.521Z [TRACE] proxy.cache.leasecache: restored token: id=zBDQx
2024-09-06T09:39:38.521Z [TRACE] proxy.cache.leasecache: restoring static secret index: id=b0b444679a0300f2cf23b576d35637fae0b93dfda757e1de9c2874eb4a50a7f7 path={namespace}/{kvv2_name}/data/{secret_path}
2024-09-06T09:39:38.521Z [TRACE] proxy.cache.leasecache: restoring capability index: id=e477169eaf91f6a693570a52027f688cedf6ca986b6599ffabcec88bfe3281b0
2024-09-06T09:39:38.521Z [INFO]  proxy.cache: loaded memcache from persistent storage
2024-09-06T09:39:38.521Z [DEBUG] proxy.apiproxy: configuring inmem auto-auth sink
2024-09-06T09:39:38.521Z [DEBUG] proxy: would have sent systemd notification (systemd not present): notification=READY=1
2024-09-06T09:39:38.521Z [INFO]  proxy.cache.staticsecretcacheupdater: starting static secret cache updater subsystem
2024-09-06T09:39:38.521Z [INFO]  proxy.sink.server: starting sink server
2024-09-06T09:39:38.521Z [INFO]  proxy.auth.handler: starting auth handler
2024-09-06T09:39:38.521Z [DEBUG] proxy.auth.handler: using preloaded token
2024-09-06T09:39:38.521Z [DEBUG] proxy.auth.handler: lookup-self with preloaded token
2024-09-06T09:39:38.523Z [ERROR] proxy.auth.handler: could not look up token: err="Get \"https://{vault_server}:8200/v1/auth/token/lookup-self\": dial tcp: lookup {vault_server} on 192.168.65.7:53: no such host" backoff=860ms
2024-09-06T09:39:39.388Z [INFO]  proxy.auth.handler: authenticating
2024-09-06T09:39:39.392Z [ERROR] proxy.auth.handler: error authenticating: error="Put \"https://{vault_server}:8200/v1/auth/approle/login\": dial tcp: lookup {vault_server} on 192.168.65.7:53: no such host" backoff=860ms
2024-09-06T09:39:40.382Z [INFO]  proxy.auth.handler: authenticating
... similar logs
  5. Any request for a cached secret then fails, because it waits for Vault Proxy to validate its own token, which never succeeds offline.
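
Step 5 can be checked from a client with a bounded timeout rather than a request that hangs. The following is a minimal sketch, not part of the original report: it assumes the proxy listener from the configuration in this issue is reachable at `http://127.0.0.1:8200`, that the secret lives in a KV v2 mount (read path `/v1/<mount>/data/<path>`), and that no client token is needed because `use_auto_auth_token = "force"` is set. The helper names `kv2_read_url` and `probe_cached_secret` are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Optional

# Assumption: the proxy listener from this issue's config, reached locally.
PROXY_ADDR = "http://127.0.0.1:8200"


def kv2_read_url(proxy_addr: str, mount: str, secret_path: str) -> str:
    """Build the KV v2 read URL (/v1/<mount>/data/<path>) served through the proxy."""
    return (
        f"{proxy_addr.rstrip('/')}/v1/"
        f"{mount.strip('/')}/data/{secret_path.strip('/')}"
    )


def probe_cached_secret(
    url: str, namespace: Optional[str] = None, timeout: float = 5.0
) -> str:
    """Issue one GET through the proxy and classify the outcome.

    Returns "ok", "http <code>", or "unavailable".
    """
    headers = {"X-Vault-Namespace": namespace} if namespace else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return "ok" if response.status == 200 else f"http {response.status}"
    except urllib.error.HTTPError as err:
        # The proxy answered, but with an error status.
        return f"http {err.code}"
    except OSError:
        # Connection refused, DNS failure, or the request timed out
        # (URLError and TimeoutError are both OSError subclasses).
        return "unavailable"
```

Calling `probe_cached_secret(kv2_read_url(PROXY_ADDR, "<kvv2_name>", "<secret_path>"), namespace="<namespace>")` once while online and once after the offline restart makes the difference between the two runs explicit.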

Expected behavior
Vault Proxy is able to persist its authentication token and not need to perform a mandatory token lookup / refresh on startup if it is still valid.
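
To make the expected behavior concrete, here is a minimal sketch of the startup decision being asked for. It is purely illustrative and is not how Vault Proxy is implemented: the `PersistedToken` record, the `startup_action` function, and the action names are all hypothetical, and it assumes the token's expiry time was recorded locally when the token was obtained.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class PersistedToken:
    """Hypothetical record kept next to the cache; not Vault Proxy's real schema."""

    token_id: str
    expires_at: datetime  # issue time plus TTL, recorded when the token was obtained


def startup_action(
    token: Optional[PersistedToken], server_reachable: bool, now: datetime
) -> str:
    """Decide what to do with a persisted token when the proxy starts."""
    if token is None or now >= token.expires_at:
        # Nothing usable was persisted, so a fresh login is unavoidable.
        return "authenticate"
    if not server_reachable:
        # Requested behavior: trust the locally recorded expiry and keep serving.
        return "reuse_without_lookup"
    # Online: validating the token with lookup-self is cheap and safer.
    return "lookup_self"


now = datetime(2024, 9, 6, 9, 39, tzinfo=timezone.utc)
still_valid = PersistedToken("example-token", now + timedelta(hours=1))
print(startup_action(still_valid, server_reachable=False, now=now))
```

One trade-off of this approach is that a locally recorded expiry cannot detect a token that was revoked on the server while the proxy was offline.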

Environment:

  • Vault Server Version: 1.17.3 (Enterprise)
  • Vault Proxy Version: 1.17.5
  • Vault CLI Version: 1.17.5
  • Server Operating System/Architecture: Linux / amd64
  • Proxy Operating System/Architecture: macOS / arm64
  • Client Operating System/Architecture: macOS / arm64

Vault Proxy configuration file (vault-proxy-config.hcl):

# ref. https://developer.hashicorp.com/vault/docs/agent-and-proxy/autoauth
auto_auth {
    method "approle" {
        mount_path = "auth/approle"
        max_backoff = "10s"
        config = {
            role_id_file_path = "/etc/vault/role_id"  # secrets expected to be mounted as volumes
            secret_id_file_path = "/etc/vault/secret_id"  # secrets expected to be mounted as volumes
            remove_secret_id_file_after_reading = false
            exit_on_err = true
        }
    }
}

# ref. https://developer.hashicorp.com/vault/docs/agent-and-proxy/proxy/apiproxy
api_proxy {
    use_auto_auth_token = "force"
    enforce_consistency = "always"
    when_inconsistent = "retry"
}

# ref. https://developer.hashicorp.com/vault/docs/agent-and-proxy/proxy/caching
cache {
    disable_caching_dynamic_secrets = true
    # ref. https://developer.hashicorp.com/vault/docs/agent-and-proxy/proxy/caching/static-secret-caching
    cache_static_secrets = true
    static_secret_token_capability_refresh_interval = "1d"
    static_secret_token_capability_refresh_behavior = "optimistic"

    persist {
        type = "kubernetes"  # mocking k8s by providing a (secret) static service account JWT token as AAD
        path = "/var/run/cache"
        service_account_token_file = "/var/run/cache-persistence-token"
        exit_on_err = false
        keep_after_import = true
    }
}

# ref. https://developer.hashicorp.com/vault/docs/configuration/listener/tcp
listener "tcp" {
    address = "0.0.0.0:8200"
    tls_disable = true  # http local connections

    telemetry {
        unauthenticated_metrics_access = true
    }
}

# ref. https://developer.hashicorp.com/vault/docs/configuration/telemetry
telemetry {
    enable_hostname_label = true
}

# ref. https://developer.hashicorp.com/vault/docs/agent-and-proxy/proxy#vault-stanza
vault {
    # vault address is configured by means of the VAULT_ADDR environment variable
    # vault namespace is configured by means of the VAULT_NAMESPACE environment variable
    tls_skip_verify = true  # TODO: use actual certificate for instance through volume mounts
}

Running the Vault Proxy (version 1.17.5) with Docker:

docker run --rm -it \
  -p 8200:8200 \
  -v "$(pwd)/vault-proxy/secrets/role_id:/etc/vault/role_id" \
  -v "$(pwd)/vault-proxy/secrets/secret_id:/etc/vault/secret_id" \
  -v "$(pwd)/vault-proxy/config/vault-proxy-config.hcl:/etc/vault/vault-proxy-config.hcl" \
  -v "$(pwd)/vault-proxy/config/cache-persistence-token:/var/run/cache-persistence-token" \
  -v "$(pwd)/vault-proxy/cache:/var/run/cache" \
  -e VAULT_ADDR=https://{vault_server}:8200 \
  -e VAULT_NAMESPACE={namespace} \
  --hostname "$(hostname)" \
  hashicorp/vault:1.17.5 \
  proxy -log-level trace -config /etc/vault/vault-proxy-config.hcl

carlzogh (Author) commented

Hey @heatherezell @VioletHynes - could you please confirm if this is the expected behavior or if this is something we've misconfigured in the Proxy?

VioletHynes (Contributor) commented

This seems like expected behaviour today. However, I agree it's unfortunate, and it'd be great if we could avoid this. To your point here:

"Vault Proxy is able to persist its authentication token and not need to perform a mandatory token lookup / refresh on startup if it is still valid."

This is the crux of the issue, and I'd call this the problem as opposed to the persistent cache itself. This is something that would provide great value with or without persistent caching.

Another big benefit here is that if my Auto Auth token has a 1 day TTL and I restart Agent/Proxy five times a day, we make one token instead of five.
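
The arithmetic behind that example can be sketched as a small simulation. This is only an illustration under simplifying assumptions: restart times are given in hours, a token is reusable until its TTL elapses, and renewal is ignored. The function name `count_logins` is hypothetical.

```python
from typing import Iterable


def count_logins(start_hours: Iterable[float], ttl_hours: float, reuse_token: bool) -> int:
    """Count how many tokens are created across a series of process starts."""
    logins = 0
    expires_at = None  # expiry (in hours) of the most recently created token
    for start in sorted(start_hours):
        if reuse_token and expires_at is not None and start < expires_at:
            # A persisted token is still valid, so no new login is needed.
            continue
        logins += 1
        expires_at = start + ttl_hours
    return logins


restarts = [0, 4, 8, 12, 16]  # five starts within one day
print(count_logins(restarts, ttl_hours=24, reuse_token=False))  # 5
print(count_logins(restarts, ttl_hours=24, reuse_token=True))   # 1
```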

This has been a known potential enhancement for a while, but I'm going to give this a think and see if there's any smart, easy way we can address this. The challenge is of course safely persisting a token, and ideally we'd want to do this in a way that provides value to users whether or not they use caching. I can't promise anything here timeline-wise (like I've said, this has been a known FR for a while), but I will promise I'll give this a good think at the very least.

Thanks for the issue and food for thought!
