Initialize ServerMetrics in PredownloadScheduler#18608
Open
pradeeee wants to merge 3 commits into
Open
Conversation
…ainer The PredownloadScheduler runs in a separate predownload container that starts before the main Pinot server. Previously, it only initialized PredownloadMetrics but did not set up the ServerMetrics registry. This caused PredownloadMetrics (which internally depends on ServerMetrics) to silently fall back to the NOOP metrics instance, resulting in no server-level metrics being emitted from the predownload container. This change initializes ServerMetrics via PinotMetricUtils in the initializeMetricsReporter() method. The initialization is wrapped in a try-catch so that if the underlying metrics factory cannot be created (e.g., because a vendor-specific metrics client is not yet available in the predownload lifecycle), the predownload process continues with the default NOOP ServerMetrics rather than crashing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three test cases covering the initializeMetricsReporter() change: 1. testInitializeMetricsReporterRegistersServerMetrics - verifies that ServerMetrics is properly initialized and registered when the metrics factory is available 2. testInitializeMetricsReporterFallsBackOnFailure - verifies that when PinotMetricUtils throws (e.g., vendor metrics client not ready), the scheduler continues with the NOOP ServerMetrics instance instead of crashing 3. testInitializeMetricsReporterAlwaysCreatesPredownloadMetrics - verifies that PredownloadMetrics is always created and registered regardless of whether ServerMetrics initialization succeeds or fails Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Changed catch(Throwable) to catch(Exception) to avoid swallowing JVM-level Errors like OutOfMemoryError - Check ServerMetrics.register() return value and log success or error if an instance was already registered
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The
PredownloadSchedulerruns in a separate predownload container that starts before the main Pinot server process. Its purpose is to pre-download segments from deep storage so the server can start faster.Previously, the
initializeMetricsReporter()method inPredownloadScheduleronly createdPredownloadMetricsbut never initialized theServerMetricsregistry. As a result:PredownloadMetricsinternally depends onServerMetrics.get()for reporting server-level meters and timers.ServerMetricswas never initialized in the predownload container lifecycle,ServerMetrics.get()always returned the NOOP instance.Call chain showing the gap
In contrast, the normal server startup path initializes
ServerMetricsviaPinotMetricUtils.getPinotMetricsRegistry()before any metrics are used.Changes
This PR adds
ServerMetricsinitialization toPredownloadScheduler.initializeMetricsReporter():ServerConffrom the existing_pinotConfigPinotMetricsRegistryviaPinotMetricUtils.getPinotMetricsRegistry()ServerMetricsinstance with the proper prefix, table-level metrics config, and allowed tablesGraceful fallback
The initialization is wrapped in a
try-catch(Throwable)so that if the configuredPinotMetricsFactorycannot be created (e.g., because a vendor-specific metrics client is not yet available during the predownload lifecycle), the predownload process continues with the default NOOPServerMetricsrather than crashing. This ensures the predownload container remains resilient regardless of the metrics backend availability.Before (no ServerMetrics)
After (ServerMetrics initialized with fallback)
Test Plan
ServerMetricsis properly initialized and metrics are emitted