Re-Creating node from scratch does not copy tables for the Postgres and Kafka engines #1455
We use your Operator to manage a ClickHouse cluster, thank you.
After a hardware failure we reset the PVC (and the ZooKeeper namespace) to re-create one ClickHouse node.
Most of the metadata, such as views, materialized views, and tables with most engines (MergeTree, ReplicatedMergeTree, etc.), was successfully re-created on the node and replication started. Meanwhile, none of the Postgres- and Kafka-based engine tables were recreated.
Is it a bug, or do we need to use some commands or hacks to sync all metadata across the cluster?

Comments
@Hubbitus, have you used the latest 0.23.6 or an earlier release?
@alex-zaitsev, thank you for the response. That was an older version; we have since updated the operator. What is the correct way to re-init a node? Is it enough to just delete the PVC of the failed node and delete the pod?
@Hubbitus, if you want to re-init the existing node, delete the STS, PVC, and PV, then start a reconcile. Do you have multiple replicas?
@alex-zaitsev, thank you for the reply. I understand how to delete the objects, but what do you mean by "start a reconcile"? I have two replicas.
@Hubbitus, we have released 0.23.7, which is more aggressive about re-creating the schema. So you may try to delete the PVC/PV completely and let it re-create the objects.
@alex-zaitsev, thank you very much!
And doing it in ArgoCD:
Then I see the pod is up and running. Checking the tables:
SELECT hostname() as node, COUNT(*)
FROM clusterAllReplicas('{cluster}', system.tables)
WHERE database NOT IN ('INFORMATION_SCHEMA', 'information_schema', 'system')
GROUP BY node
And also an error in the log like: So, I see only tables in
Notes:
It looks like, since you deleted only the PVC and the Pod, the recovery was handled by Kubernetes (the STS) and the Operator did not even know that the PVC had been recreated. So make sure you delete the STS as well. Also consider using operator-managed persistence.
Note that the order is important, but local_directory may be skipped if you are not using it. Keep it if there are already users defined with CREATE USER, otherwise they will disappear entirely.
The others should work, so the operator log is needed to check what went wrong. The correct PVC recovery sequence is:
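A minimal sketch of such a sequence, assuming the namespace (gidplatform-dev) and resource names used in the kubectl commands later in this thread; the PVC/PV names are placeholders:
# delete the StatefulSet of the broken replica so the operator, not Kubernetes, re-creates it
kubectl delete sts -n gidplatform-dev chi-gid-gid-0-0
# delete the PVC of replica 0-0-0 (and the PV, if the reclaim policy keeps it around)
kubectl delete pvc -n gidplatform-dev <pvc-name>
kubectl delete pv <pv-name>
# trigger a reconcile by changing spec.taskID in the CHI resource
kubectl edit chi -n gidplatform-dev gid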
@alex-zaitsev, thank you very much for the answer. First I would like to recover my tables, then I will try to deal with the users. Today, I eventually received the rights to see the operator pod in the kube-system namespace.
In the meantime, I have tried to reconcile the cluster by providing:
spec:
  taskID: "click-reconcile-1"
Indeed, that looks like it triggers a reconcile. Logs of the operator pod:
Not sure what is going wrong, but on host
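(As an aside, the same spec.taskID bump can be done non-interactively; a sketch, reusing the CHI name and namespace from the kubectl commands later in this thread, with an arbitrary taskID value:)
kubectl patch chi -n gidplatform-dev gid --type merge -p '{"spec":{"taskID":"click-reconcile-2"}}'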
@alex-zaitsev, could you please take a look at it?
@Hubbitus, does your cluster have 2 shards with only 1 replica inside each shard? Could you share:
@Slach, thanks for the response. Output of
@Hubbitus
Sure (limited to 10, total 269):
You shared logs for 29 Sep 2024 starting from 11:55 UTC.
Hello.
@Hubbitus, I don't see logs from 15 Oct 2024. I need to be sure you tried to reconcile after dropping the PVC and STS.
Yes. At the suggestion of @alex-zaitsev I had introduced there
Share the clickhouse-operator logs for 15 Oct related to your changes.
Hello. I do not have such old logs. But I've switched to the branch where that was set.
operator.2024-11-02T17:47:14+03:00.obfuscated.log
Output of
SELECT database, table, engine_full, count() c, hostname()
FROM
cluster('{cluster}',system.tables)
WHERE
database NOT IN ('system','INFORMATION_SCHEMA','information_schema')
GROUP BY ALL
HAVING c<2
contains 515 rows. The heading of it:
According to the logs, you just triggered a reconcile for -0-0-0 while the STS was not deleted. Try to
edit
Ok, thank you.
I think the relevant logs are:
So the operator can't resolve the hostname of the node:
Indeed, the hostname:
SELECT cluster, host_name
FROM system.clusters
WHERE cluster = 'gid'
You did not share the full logs, you just found the first error message. That error message is expected, because you deleted the STS. The hostname contains the SERVICE name, not the pod in the STS. Could you share the full operator logs?
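(A quick way to check that point, as a sketch; this assumes getent is available in the ClickHouse image and that the per-replica Service is named chi-gid-gid-0-0:)
# check DNS resolution of the 0-0 replica's Service from the surviving replica
kubectl exec -n gidplatform-dev chi-gid-gid-0-1-0 -- getent hosts chi-gid-gid-0-0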
Sure. I found many more errors in the log:
@sunsingerus, according to the shared logs the first reconcile was applied at 2024-11-02 14:44:41.
Second try after the STS + PVC deletion:
The STS and PVC were deleted.
The PVC was recreated.
Migration was forced to be applied.
Prepare for table migration, try:
Trying to drop data from ZK @sunsingerus
Get SQL object definitions:
Trying to restore, and it fails because the ZK data is still present.
We need to choose 0-1-0 for executing SYSTEM DROP REPLICA...
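A sketch of that step, using the same command that appears later in this thread, executed from the surviving replica 0-1-0:
# remove the dead replica's metadata from ZooKeeper, run on a healthy replica of the same shard
kubectl exec -n gidplatform-dev chi-gid-gid-0-1-0 -- clickhouse-client -q "SYSTEM DROP REPLICA 'chi-gid-gid-0-0'"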
@Hubbitus, after the reconcile:
in ZooKeeper
in local SQL
@Hubbitus, could you share
and
@Slach, sure (comments of the columns and the table are stripped):
The error looks reasonable: we got an error on table creation after deleting the STS and PVC, did we not?
add
Try to restore the table:
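Depending on the replica's state, a hedged sketch (db.table is a placeholder): SYSTEM RESTORE REPLICA applies when the local data survived but the Keeper metadata was lost; when the local data was wiped, as in this thread, replaying the definition from the healthy replica is the usual route.
# case 1: local data present, Keeper metadata lost (table is in read-only mode)
kubectl exec -n gidplatform-dev chi-gid-gid-0-0-0 -- clickhouse-client -q "SYSTEM RESTORE REPLICA db.table"
# case 2: local data wiped; replay the definition taken from the healthy replica
kubectl exec -n gidplatform-dev chi-gid-gid-0-1-0 -- clickhouse-client -q "SHOW CREATE TABLE db.table FORMAT TSVRaw" |
  kubectl exec -i -n gidplatform-dev chi-gid-gid-0-0-0 -- clickhouse-client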
@Slach, why is it not restored automatically? Actually, this table may simply be deleted. Is that the solution?
It tried to restore, and most of the tables were restored (am I right?)
Try to upgrade clickhouse-keeper to the latest version and clickhouse-server to the latest 24.8 LTS.
No. The query
SELECT database, table, engine_full, count() c, hostname()
FROM
cluster('{cluster}',system.tables)
WHERE
database NOT IN ('system','INFORMATION_SCHEMA','information_schema')
GROUP BY ALL
HAVING c<2
still returns 729 tables present only on host
It is not so fast. We will try to do it next week.
OK, let's try again.
# safe shutdown
kubectl exec -n gidplatform-dev chi-gid-gid-0-0-0 -- clickhouse-client -q "SYSTEM SHUTDOWN"
# on a different node
kubectl exec -n gidplatform-dev chi-gid-gid-0-1-0 -- clickhouse-client -q "SYSTEM DROP REPLICA 'chi-gid-gid-0-0'"
# delete sts to propagate schema during reconcile
kubectl delete sts -n gidplatform-dev chi-gid-gid-0-0
# change spec.taskID
kubectl edit chi -n gidplatform-dev gid
# wait when
watch -n 1 kubectl describe chi -n gidplatform-dev gid
# check tables
# share clickhouse-operator logs
OK, let's try! I have updated ClickHouse to version 24.8.6.70 (LiveView was dropped because of incompatibility). Still used
New attempt:
Wasn't
Sorry, what should I wait for in such output? For the command
Waiting some time. I suppose I need the line
I see again a bunch of authorization errors in the logs, but I do not want to make any assumptions.
I think we need to delete the PVC+PV for the dead 0-0-0, to completely erase the data for -0-0-0. I found in the logs
Let's have another try.
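A sketch of that cleanup; the PVC and PV names are placeholders to be read off the kubectl get output first:
# find the PVC of replica 0-0-0 and the PV it is bound to
kubectl get pvc -n gidplatform-dev | grep 0-0-0
# delete both so the dead replica's data is erased completely
kubectl delete pvc -n gidplatform-dev <pvc-name>
kubectl delete pv <pv-name>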
Yes, we use OpenEBS and a taint to pin to the node. OpenEBS is one of the fastest storage classes. And I hope there was no pod migration to another node.
Let's do it:
I've tried the last command several times with the same result.
OK, change the action sequence:
Another try of
The last line appeared:
Then:
Reconcile is done. Now 730 tables are present only on host
Comments stripped. I can drop that table if it helps. But what is the reason it is not restored?
Returns nothing.
Try to replace
And for 1 it is still empty:
/1/ is expected to be empty because {shard} has the value 0. This is really weird, why did you get
OK, let's try to create the schema on 0-0-0 completely manually (a sketch follows the list below).
Create databases
Create MergeTree tables
Create Distributed tables
Create other tables, MVs, Dictionaries
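A hedged sketch of the first step (databases), under the assumption that the definitions are read from the healthy replica 0-1-0 and applied on 0-0-0; databases with parameterized engines (PostgreSQL, MaterializedPostgreSQL, etc.) may still need manual editing:
# generate CREATE DATABASE statements on the healthy replica and apply them on the broken one
kubectl exec -n gidplatform-dev chi-gid-gid-0-1-0 -- clickhouse-client -q "
  SELECT concat('CREATE DATABASE IF NOT EXISTS ', name, ' ENGINE = ', engine, ';')
  FROM system.databases
  WHERE name NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema', 'default')
  FORMAT TSVRaw" |
  kubectl exec -i -n gidplatform-dev chi-gid-gid-0-0-0 -- clickhouse-client -mn
# then repeat the same idea for MergeTree tables, Distributed tables, and the remaining objects, in that order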
@Slach,
My attempt to fix:
Correcting the quotes would probably give:
That is without the last part. But what is the reason for doing this on the same node 0? Dumping from it and applying it right there. So the clause
Is it intended to dump the structure from node 1 (the working one) and apply it to node 0 (the broken one)?
Yes.
Let's change the approach.
Databases
Create MergeTree tables
Create Distributed tables
Create other tables, MVs, Dictionaries
Then it probably should be like:
But still, if there was not
But what is the purpose of such operations?
That will still be executed within the single node 0.
No, I provided the command with a pipeline inside bash -c.
Yes, I'm looking into the clickhouse-operator source code and trying to execute the schemer.go commands manually.
Yes, they are piped, but inside an interactive shell of node 0. There are no cross-node connections.
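To make the pipeline actually cross nodes, a hedged sketch of copying the table definitions one by one from the healthy replica 0-1-0 into 0-0-0 (identifiers with unusual characters would need quoting; this is an illustration, not the operator's exact schemer logic):
# list MergeTree-family tables on the healthy replica, then replay each definition on the broken one
for t in $(kubectl exec -n gidplatform-dev chi-gid-gid-0-1-0 -- clickhouse-client -q "
    SELECT concat(database, '.', name) FROM system.tables
    WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema')
      AND engine LIKE '%MergeTree%'"); do
  kubectl exec -n gidplatform-dev chi-gid-gid-0-1-0 -- clickhouse-client -q "SHOW CREATE TABLE ${t} FORMAT TSVRaw" |
    kubectl exec -i -n gidplatform-dev chi-gid-gid-0-0-0 -- clickhouse-client
done
# repeat with engine = 'Distributed' and then with the remaining engines (Kafka, PostgreSQL, MVs, dictionaries)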
Look at