Sink connector errors - cannot replicate mysql replica after import #927

kamil-bednarek · 2024-11-21T09:51:36Z

I have a problem with the connector. I dumped the MySQL database to Parquet, imported it into ClickHouse, and started the connector from the matching binlog position, which I set in the asc_offsets table. However, the connector does not work correctly and logs errors like:

{"source":{"server":"embeddedconnector"},"position":{"ts_sec":1732176573,"file":"mysql-bin-changelog.123278","pos":237,"gtids":"10f621a7-f3ba-11ea-8017-02015778a3f9:1-2209085961,4180c625-be60-11e8-8d56-a4bf010a41be:3652387912-3652388179:3652388182-4169697097,6d1e5f7e-b5c9-11e8-a514-a4bf011babaa:16049745253-16778132426,95612870-7801-11ee-81e9-02eaf296d7f1:1-1564,d6ddd381-73df-11ee-a092-120f624239bb:1-6125897030","snapshot":true},"ts_ms":1732176573908,"databaseName":"core","ddl":"DROP TABLE IF EXISTS `core`.`user_info`","tableChanges":[{"type":"DROP","id":"\"core\".\"user_info\""}]}

2024-11-21 08:14:54.329 INFO  - ClickHouse DDL: CREATE TABLE core.`task`(`task_id` Int64 NOT NULL ,`parent_id` Nullable(Int64),`name` String NOT NULL ,`ext_task_id` Nullable(String),`ext_parent_id` Nullable(String),`lvl` Int32 NOT NULL ,`add_date` DateTime64(0,'Europe/Paris') NOT NULL ,`archived` Int16 NOT NULL ,`color` String NOT NULL ,`tags` Nullable(String) Engine=ReplacingMergeTree(_version,is_deleted) ORDER BY (`task_id`)
2024-11-21 08:14:54.353 ERROR - Error running DDL Query: java.sql.SQLException: Code: 57. DB::Exception: Table core.task already exists. (TABLE_ALREADY_EXISTS) (version 24.5.3.5 (official build))
, server ClickHouseNode [uri=http://ch.svc.cluster.local:8123/system, options={custom_settings=allow_experimental_object_type=1,insert_allow_materialized_columns=1,client_name=Client_1}]@1909615670

My config:

logging.level.org.apache.kafka.connect.runtime.Worker: "DEBUG"
logging.level.io.debezium: "DEBUG"

metrics.enable: "true"
metrics.port: "8083"

# needed for table.include.list filtering
enable.snapshot.ddl: "true"
auto.create.tables: "false"
#snapshot.mode: "recovery"
snapshot.mode: "schema_only"
buffer.max.records: 20000
#skip_replica_start: "true"

#schema.history.internal.recovery.mode: "true"

transforms: "convertTimezone"
transforms.convertTimezone.type: "io.debezium.transforms.TimezoneConverter"
transforms.convertTimezone.converted.timezone: "Europe/Paris"
database.hostname: "instance.us-east-1.rds.amazonaws.com"
database.port: "3306"
database.user: "user"
database.password: "pass"

database.connectionTimeZone: "Europe/Paris"  # todo: need to recheck with datetime column and non UTC mysql timezone
clickhouse.datetime.timezone: "Europe/Paris"
clickhouse.server.url: "http://ch.clickhouse.svc.cluster.local"
clickhouse.server.port: "8123"
clickhouse.server.user: "connector"
clickhouse.server.password: "test"
clickhouse.server.database: "system"  # workaround! no such setting for mysql version starting from v2.1
snapshot.locking.mode: "none"
database.allowPublicKeyRetrieval: "true"
offset.flush.interval.ms: 5000
connector.class: "io.debezium.connector.mysql.MySqlConnector"
offset.storage: "io.debezium.storage.jdbc.offset.JdbcOffsetBackingStore"
offset.storage.jdbc.offset.table.name: "altinity.asc_offsets"
offset.storage.jdbc.offset.table.select: "SELECT id, offset_key, offset_val FROM %s"
offset.storage.jdbc.url: "jdbc:clickhouse://ch.clickhouse.svc.cluster.local:8123/altinity"
offset.storage.jdbc.user: "connector"
offset.storage.jdbc.password: "test"
offset.storage.jdbc.offset.table.ddl: "CREATE TABLE if not exists %s on cluster '{cluster}'
    (
        id String,
        offset_key String,
        offset_val String,
        record_insert_ts DateTime,
        record_insert_seq UInt64,
    ) ENGINE = EmbeddedRocksDB
    PRIMARY KEY offset_key"
offset.storage.jdbc.offset.table.delete: "select 1"
schema.history.internal: "io.debezium.storage.jdbc.history.JdbcSchemaHistory"
schema.history.internal.jdbc.schema.history.table.name: "altinity.asc_schema"
schema.history.internal.jdbc.url: "jdbc:clickhouse://ch.clickhouse.svc.cluster.local:8123/altinity"
schema.history.internal.jdbc.user: "connector"
schema.history.internal.jdbc.password: "test"
schema.history.internal.jdbc.schema.history.table.ddl: "CREATE TABLE if not exists %s on cluster '{cluster}'
    (
       id FixedString(36),
       history_data String,
       history_data_seq UInt32,
       record_insert_ts DateTime,
       record_insert_seq UInt32
    ) ENGINE=ReplicatedReplacingMergeTree(record_insert_seq)
    order by id"
schema.history.internal.skip.unparseable.ddl: "true"
schema.history.internal.store.only.captured.tables.ddl: "true"
table.include.list: "my_table_hidden_1,test_1" # anonymized

Is this error related to my configuration? Is there a way to start from a specific binlog position, or am I doing something wrong?
