Pod Autoscaler for tidb (SQL Processing) component in TiDB Cluster #5529

Open

varun-mishra opened this issue Jan 22, 2024 · 2 comments

@varun-mishra

Question

We are exploring the Horizontal Pod Autoscaler (HPA) for the TiDB component. We tested it by deploying a small TiDB cluster together with an HPA using a basic configuration.

The HPA was able to scale the TiDB component up and down. However, whenever the replica count did not match the value in the TidbCluster (TC) spec, the cluster went into a not-ready state and the tidb phase changed to Scale. This can hinder maintenance operations handled by the operator.
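For reference, here is a minimal sketch of the kind of HPA we used. The cluster name `basic` and the namespace are placeholders; the StatefulSet name follows the operator's `<cluster>-tidb` convention:

```yaml
# Minimal HPA sketch for the tidb StatefulSet (names and thresholds are placeholders).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tidb-hpa
  namespace: tidb-cluster
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: basic-tidb          # <cluster>-tidb StatefulSet managed by TiDB Operator
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```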

Regarding the cases discussed above, we have a few questions:

  1. What was the outcome of the HPA evaluation done by the PingCAP team? (As mentioned here, it did not work: doc)
  2. Do you have such a feature on your roadmap?
  3. If the TC status is not-ready and the tidb phase is Scale, how exactly does that affect the TiDB cluster?
  4. Can we maintain a healthy state based on scale actions performed by the HPA, or would that be an anti-pattern for the operator framework?
@csuzhangxc
Member

For a database, scaling a replica out or in causes data rebalancing, which may take a long time and may cause performance issues. The scaling operation also closes some connections or forces them to reconnect to another replica, so clients may need retry logic.

We tried to implement HPA before, but it didn't work well, and we don't have such a feature on the roadmap now.

Updates to the replicas of the StatefulSet may be overwritten by the TiDB Operator later.
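For example, the operator treats `spec.tidb.replicas` in the TidbCluster as the desired state and reconciles the StatefulSet back to it (the cluster name below is illustrative):

```yaml
# The operator reconciles the tidb StatefulSet to this value,
# so replicas changed directly on the StatefulSet can be reverted.
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic                 # example cluster name
spec:
  tidb:
    replicas: 3               # desired tidb replicas; the source of truth
```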

@varun-mishra
Author

Thanks for the response, @csuzhangxc.
The TiDB component is deployed as a StatefulSet in Kubernetes but is stateless, so scaling it in or out does not cause any data rebalancing.
The retry logic on the client side can be handled by the clients.

> Updates to the replicas of the StatefulSet may be overwritten by the TiDB Operator later.
I ran the experiment with the HPA's max-replica count equal to the tidb replica count in the cluster spec; with this configuration, the operator did not override the StatefulSet. I observed it for more than 10 hours.
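Roughly, the configuration that stayed stable pins the HPA's `maxReplicas` to the same value as `spec.tidb.replicas` in the TC spec (the numbers below are only an example):

```yaml
# HPA bounds from the experiment; maxReplicas matches spec.tidb.replicas (3 here).
spec:
  minReplicas: 1
  maxReplicas: 3
```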

If available, can you share the HPA/VPA experiment results from the PingCAP team?

Autoscaling can help TiDB users manage burst loads and lean periods. Please consider this a requirement; if any changes are needed, I can contribute.
Let me know your thoughts.
