
TiDB-operator fails to start the tiproxy servers if spec.tiproxy.version not provided #5833

Open
kos-team opened this issue Nov 4, 2024 · 5 comments

Comments

@kos-team
Contributor

kos-team commented Nov 4, 2024

Bug Report

What version of Kubernetes are you using?
Client Version: v1.31.1
Kustomize Version: v5.4.2

What version of TiDB Operator are you using?
v1.6.0

What's the status of the TiDB cluster pods?
TiProxy pods are in the CrashLoopBackOff state.

What did you do?
We deployed a cluster with TiProxy.

How to reproduce

  1. Deploy a TiDB cluster with TiProxy enabled, for example:
```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: test-cluster
spec:
  configUpdateStrategy: RollingUpdate
  enableDynamicConfiguration: true
  helper:
    image: alpine:3.16.0
  pd:
    baseImage: pingcap/pd
    config: "[dashboard]\n  internal-proxy = true\n"
    maxFailoverCount: 0
    mountClusterClientSecret: true
    replicas: 3
    requests:
      storage: 10Gi
  pvReclaimPolicy: Retain
  ticdc:
    baseImage: pingcap/ticdc
    replicas: 3
  tidb:
    baseImage: pingcap/tidb
    config: "[performance]\n  tcp-keep-alive = true\ngraceful-wait-before-shutdown\
      \ = 30\n"
    maxFailoverCount: 0
    replicas: 3
    service:
      externalTrafficPolicy: Local
      type: NodePort
  tiflash:
    baseImage: pingcap/tiflash
    replicas: 3
    storageClaims:
    - resources:
        requests:
          storage: 10Gi
  tikv:
    baseImage: pingcap/tikv
    config: 'log-level = "info"

      '
    maxFailoverCount: 0
    mountClusterClientSecret: true
    replicas: 3
    requests:
      storage: 100Gi
    scalePolicy:
      scaleOutParallelism: 5
  timezone: UTC
  tiproxy:
    replicas: 5
    sslEnableTiDB: true
  version: v8.1.0
```
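As a workaround, explicitly setting `spec.tiproxy.version` avoids the fallback to the cluster-wide `spec.version`. The tag below is a placeholder; replace it with an actual published `pingcap/tiproxy` release:

```yaml
  tiproxy:
    replicas: 5
    sslEnableTiDB: true
    version: v1.2.0   # placeholder: use a tag that actually exists for pingcap/tiproxy
```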

What did you expect to see?
TiProxy pods should start successfully and be in the Healthy state.

What did you see instead?
The TiProxy pods kept crashing and stayed in the CrashLoopBackOff state due to ErrImagePull.

Root Cause
The root cause is that we set spec.version to v8.1.0, which is used as the image tag for all components. However, there is no pingcap/tiproxy:v8.1.0 image on Docker Hub, so the image pull fails for TiProxy.
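The fallback behavior described above can be sketched as follows (the function and parameter names here are hypothetical illustrations, not the operator's actual code):

```python
def resolve_image(base_image, component_version, cluster_version):
    """Sketch of the version fallback: a component-level version, if set,
    overrides the cluster-wide spec.version; otherwise spec.version is used."""
    tag = component_version if component_version else cluster_version
    return f"{base_image}:{tag}"

# With only spec.version set, TiProxy inherits the cluster version and
# resolves to a tag that does not exist on Docker Hub:
print(resolve_image("pingcap/tiproxy", None, "v8.1.0"))  # pingcap/tiproxy:v8.1.0

# An explicit component-level version takes precedence:
print(resolve_image("pingcap/tiproxy", "v1.2.0", "v8.1.0"))  # pingcap/tiproxy:v1.2.0
```

This is why the same spec.version works for TiKV and TiFlash (whose images share the cluster version tag) but not for TiProxy.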

How to fix
Since the image tag for TiProxy follows a different naming convention from other components such as TiKV and TiFlash, we recommend setting a default value of main for spec.tiproxy.version. This ensures the TiDB Operator overrides the version tag for TiProxy and pulls an image that exists.

@csuzhangxc
Member

The main tag may not be stable, and we recommend that users use the newest release version (vx.y.z) instead.

@kos-team
Contributor Author

kos-team commented Nov 5, 2024

@csuzhangxc The main usability issue here is that TiProxy follows a different version numbering scheme than the other TiDB components. If we set v8.1.0 in spec.version, all components use v8.1.0 as their image tag. This works for components such as TiFlash and TiKV, but since TiProxy does not share the same version numbers, its image pull fails.

@csuzhangxc
Member

> @csuzhangxc The main usability issue here is that, the TiProxy follows a different version numbering than the other TiDB components. And if we set a version v8.1.0 in the property spec.version, all TiDB components use v8.1.0 as the version. This works for all other components such as TiFlash, TiKV. However, since TiProxy does not have the same version number as the rest of the components, it would fail.

I know. I mean it's hard to choose a default value for TiProxy, as we always recommend that users use the newest version.

@kos-team
Contributor Author

kos-team commented Nov 8, 2024

We also reported a related issue to the tidb upstream repo, pingcap/tidb#56643, about the latest tag not pointing to the actual latest version. It seems those upstream images do not have a reliable tag for the latest version.
It would be nice if they had a tag that could be used as the default value here.

@kos-team
Contributor Author

To make deployment safer, I think spec.tiproxy.image could perhaps be made a required property of the spec.tiproxy object. This would require users to specify a TiProxy version when they enable it, since TiProxy cannot use the default value from spec.version anyway.
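The validation proposed above could be sketched like this (a hypothetical admission check for illustration, not the operator's actual validation code):

```python
def validate_tiproxy_spec(spec):
    """Sketch of the proposed rule: if spec.tiproxy is present, require an
    explicit image or version, since spec.version cannot apply to TiProxy."""
    errors = []
    tiproxy = spec.get("tiproxy")
    if tiproxy is not None and not tiproxy.get("image") and not tiproxy.get("version"):
        errors.append(
            "spec.tiproxy.image (or spec.tiproxy.version) is required: "
            "TiProxy does not share the cluster-wide spec.version tag"
        )
    return errors

# The manifest from the reproduction step would be rejected:
print(validate_tiproxy_spec({"version": "v8.1.0", "tiproxy": {"replicas": 5}}))
```

Rejecting such a spec at admission time would surface the misconfiguration immediately, instead of letting the pods enter CrashLoopBackOff with ErrImagePull.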
