Description
Describe the bug
Starting from Antalya 25.3.3.20183, Iceberg queries executed via DataLakeCatalog (REST catalog) no longer emit the ParquetMetaDataCacheHits ProfileEvent in system.query_log, even when Parquet metadata cache is explicitly enabled and the same query is executed repeatedly (warm runs).
This is a regression: the same behavior worked correctly in earlier Antalya 25.3 builds, and it is still failing in Antalya 25.8.
As a result, Parquet metadata caching cannot be observed or validated via ProfileEvents for Iceberg database engine queries, breaking cache validation and violating the documented SRS behavior.
To Reproduce
Option A — Reproduce via automation (recommended)
The issue can be reproduced using the existing Iceberg cache regression test.
The test will pause immediately after the failure, allowing live investigation of the ClickHouse instance.
python3 -u iceberg/regression.py \
--only "/iceberg/iceberg cache/rest catalog/iceberg database engine/cache/*" \
--local \
--clickhouse docker://altinity/clickhouse-server:25.8.12.20747.altinityantalya \
--clickhouse-version 25.8.12.20747 \
--pause-on-fail "/iceberg/iceberg cache/rest catalog/iceberg database engine/cache/"
Option B — Manual reproduction
Prerequisites
- ClickHouse Antalya build >= 25.3.3.20183
- S3-compatible object storage (e.g. MinIO)
- Iceberg REST catalog reachable from ClickHouse (may require authentication)
- An existing Iceberg table in the catalog/warehouse (created via Iceberg tooling; must include a date_col column used in the query)
- ClickHouse has access to:
  - the REST catalog endpoint (including required auth), and
  - the object storage warehouse location (credentials + endpoint)
Steps
- Enable Iceberg database engine:
SET allow_experimental_database_iceberg = 1;
- Create an Iceberg catalog database using the REST catalog:
CREATE DATABASE iceberg_db
ENGINE = DataLakeCatalog('http://ice-rest-catalog:5000', 'admin', 'password')
SETTINGS
catalog_type = 'rest',
storage_endpoint = 'http://minio:9000/warehouse',
warehouse = 's3://bucket1/',
auth_header = 'Authorization: <...>'; -- if your REST catalog requires it
- Enable Parquet metadata cache:
SET input_format_parquet_use_metadata_cache = 1;
- Execute the same query once (cold run) and then repeat it multiple times (warm runs):
SELECT *
FROM iceberg_db.`<namespace>.<table>`
WHERE date_col > '3030-01-01'
SETTINGS log_comment = 'repro_parquet_metadata_cache'
FORMAT TabSeparated;
(repeat the query ~20–100 times)
- Inspect ProfileEvents:
SELECT
count() AS total_rows,
sum(mapContains(ProfileEvents, 'ParquetMetaDataCacheHits')) AS rows_with_hits_key,
sum(ProfileEvents['ParquetMetaDataCacheHits']) AS hits_sum
FROM system.query_log
WHERE type = 'QueryFinish'
AND log_comment = 'repro_parquet_metadata_cache';
Expected behavior
When input_format_parquet_use_metadata_cache = 1 is enabled and the same Iceberg query is executed repeatedly:
- Parquet metadata should be cached
- ParquetMetaDataCacheHits must be emitted in system.query_log → ProfileEvents
- The counter should be > 0 for warm runs
This matches the documented SRS requirement:
ClickHouse SHALL track parquet metadata cache performance metrics in system.query_log via ParquetMetaDataCacheHits.
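The expectation above can be turned into a small automated check. The following is a minimal sketch, assuming the three aggregates from the inspection query (total_rows, rows_with_hits_key, hits_sum) have already been fetched by some client. The function name check_cache_hits and the "at least two finished queries" threshold are illustrative assumptions, not part of the regression suite.

```python
def check_cache_hits(total_rows: int, rows_with_hits_key: int, hits_sum: int) -> str:
    """Classify the aggregates returned by the system.query_log inspection query.

    Hypothetical helper: the parameter names mirror the column aliases
    used in the inspection query from the reproduction steps.
    """
    if total_rows < 2:
        # One cold run plus at least one warm run is needed to judge caching.
        return "inconclusive: need a cold run and at least one warm run"
    if rows_with_hits_key == 0:
        # The key never appears in ProfileEvents: the behavior reported here.
        return "fail: ParquetMetaDataCacheHits not emitted"
    if hits_sum <= 0:
        # The key is present but the counter never increased.
        return "fail: ParquetMetaDataCacheHits present but zero"
    return "pass"


if __name__ == "__main__":
    # Example values only: 21 finished queries, none carrying the key.
    print(check_cache_hits(total_rows=21, rows_with_hits_key=0, hits_sum=0))
```

A check of this shape separates "the event is missing" from "the event is present but zero", which are different failure modes when triaging.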
Actual behavior
In Antalya builds >= 25.3.3.20183:
- ParquetMetaDataCacheHits is not emitted
- The key does not appear in ProfileEvents, even after many warm runs
- The same query path emitted this event correctly in earlier builds
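For scripting around the affected range, a build string can be compared against the first failing build numerically. This is a rough sketch, assuming that the leading dot-separated numeric components order builds the same way releases are ordered, and that every build at or above 25.3.3.20183 is affected as stated above. The helper names parse_build and is_affected are hypothetical.

```python
# First build reported as failing in this issue.
FIRST_FAILING = "25.3.3.20183.altinityantalya"


def parse_build(version: str) -> tuple:
    """Return the leading numeric components of a build string as a tuple of ints."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            # Stop at the first non-numeric suffix, e.g. "altinityantalya".
            break
        parts.append(int(piece))
    return tuple(parts)


def is_affected(version: str) -> bool:
    """Assumption: any build >= FIRST_FAILING (compared component-wise) is affected."""
    return parse_build(version) >= parse_build(FIRST_FAILING)


if __name__ == "__main__":
    for build in ("25.3.3.20143.altinityantalya",
                  "25.3.3.20183.altinityantalya",
                  "25.8.12.20747.altinityantalya"):
        print(build, is_affected(build))
```

Comparing integer tuples rather than raw strings avoids lexicographic mistakes such as treating "25.8" as newer than "25.12".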
Key information
- Project Antalya Build Version
  - ✅ PASS: 25.3.3.20143.altinityantalya (Jun 13, 2025)
  - ❌ FAIL: 25.3.3.20183.altinityantalya (Jul 10, 2025)
  - ❌ FAIL: 25.8.12.20747.altinityantalya
- Cloud provider: N/A (local repro)
- Object storage: MinIO (S3-compatible)
- Iceberg catalog: REST catalog
- Iceberg access method: DataLakeCatalog database engine
Release notes of the first failing build:
https://github.com/Altinity/ClickHouse/releases/tag/v25.3.3.20183.altinityantalya
Additional context
- Historical test data and re-runs against multiple versions confirm this is not intermittent.
- Other Iceberg-related cache ProfileEvents (e.g. IcebergMetadataFilesCacheHits) may still be emitted.
- The issue seems specific to Parquet metadata cache ProfileEvents for Iceberg database engine queries.