Skip to content

ESQL: Added Sample operator NamedWritable to plugin #131541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 21, 2025

Conversation

ivancea
Copy link
Contributor

@ivancea ivancea commented Jul 18, 2025

SampleOperator.Status wasn't declared as a NamedWritable by the plugin, leading to serialization errors when SAMPLE is used with profile: true.

It leads to an IllegalArgumentException: Unknown NamedWriteable [org.elasticsearch.compute.operator.Operator$Status][sample]

Profiles will be tested in this PR: #131474, that's currently failing because of this bug

@ivancea ivancea requested a review from jan-elastic July 18, 2025 15:33
@ivancea ivancea added >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.2.0 v9.1.1 labels Jul 18, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @ivancea, I've created a changelog YAML for you.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a TransportVersion and a minimumSupportedVersion should make it skipped for versions that don't support it.

I would recommend making a 9.2 and 9.1 one.

Copy link
Contributor

@jan-elastic jan-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ivancea ivancea force-pushed the fix-sample-operator-serialization branch from cdf06ed to b321464 Compare July 21, 2025 10:44
@ivancea ivancea marked this pull request as ready for review July 21, 2025 10:50
@ivancea ivancea requested review from nik9000 and jan-elastic July 21, 2025 10:50
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@@ -109,7 +109,7 @@ private void createOutputPage(Page page) {
final int[] sampledPositions = new int[page.getPositionCount()];
int sampledIdx = 0;
for (int i = randomSamplingIterator.docID(); i - rowsReceived < page.getPositionCount(); i = randomSamplingIterator.nextDoc()) {
sampledPositions[sampledIdx++] = i - rowsReceived;
sampledPositions[sampledIdx++] = Math.toIntExact(i - rowsReceived);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RandomSamplingIterator uses an int for maxDoc. So if more than MAX_INT rows are process, something else breaks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really an issue introduced by this PR though

Copy link
Contributor

@jan-elastic jan-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ivancea ivancea merged commit aa77c4a into elastic:main Jul 21, 2025
33 checks passed
@ivancea ivancea deleted the fix-sample-operator-serialization branch July 21, 2025 12:27
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Jul 21, 2025
`SampleOperator.Status` wasn't declared as a NamedWritable by the plugin, leading to serialization errors when `SAMPLE` is used with `profile: true`.

It leads to an `IllegalArgumentException: Unknown NamedWriteable [org.elasticsearch.compute.operator.Operator$Status][sample]`

Profiles will be tested in this PR: elastic#131474, that's currently failing because of this bug
ivancea added a commit that referenced this pull request Jul 21, 2025
…131621)

Manual 9.1 backport of #131541

There's a TransportVersion change here
szybia added a commit to szybia/elasticsearch that referenced this pull request Jul 22, 2025
…king

* upstream/main: (100 commits)
  Term vector API on stateless search nodes (elastic#129902)
  TEST Fix ThreadPoolMergeSchedulerStressTestIT testMergingFallsBehindAndThenCatchesUp (elastic#131636)
  Add inference.put_custom rest-api-spec (elastic#131660)
  ESQL: Fewer serverless docs in tests (elastic#131651)
  Skip search on indices with INDEX_REFRESH_BLOCK (elastic#129132)
  Mute org.elasticsearch.indices.cluster.RemoteSearchForceConnectTimeoutIT testTimeoutSetting elastic#131656
  [jdk] Resolve EA OpenJDK builds to our JDK archive (elastic#131237)
  Add optimized path for intermediate values aggregator (elastic#131390)
  Correctly handling download_database_on_pipeline_creation within a pipeline processor within a default or final pipeline (elastic#131236)
  Refresh potential lost connections at query start for `_search` (elastic#130463)
  Add template_id to patterned-text type (elastic#131401)
  Integrate LIKE/RLIKE LIST with ReplaceStringCasingWithInsensitiveRegexMatch rule (elastic#131531)
  [ES|QL] Add doc for the COMPLETION command (elastic#131010)
  ESQL: Add times to topn status (elastic#131555)
  ESQL: Add asynchronous pre-optimization step for logical plan (elastic#131440)
  ES|QL: Improve generative tests for FORK [130015] (elastic#131206)
  Update index mapping update privileges (elastic#130894)
  ESQL: Added Sample operator NamedWritable to plugin (elastic#131541)
  update `kibana_system` to grant it access to `.chat-*` system index (elastic#131419)
  Clarify heap size configuration (elastic#131607)
  ...
szybia added a commit to szybia/elasticsearch that referenced this pull request Jul 22, 2025
…-tracking

* upstream/main: (44 commits)
  Term vector API on stateless search nodes (elastic#129902)
  TEST Fix ThreadPoolMergeSchedulerStressTestIT testMergingFallsBehindAndThenCatchesUp (elastic#131636)
  Add inference.put_custom rest-api-spec (elastic#131660)
  ESQL: Fewer serverless docs in tests (elastic#131651)
  Skip search on indices with INDEX_REFRESH_BLOCK (elastic#129132)
  Mute org.elasticsearch.indices.cluster.RemoteSearchForceConnectTimeoutIT testTimeoutSetting elastic#131656
  [jdk] Resolve EA OpenJDK builds to our JDK archive (elastic#131237)
  Add optimized path for intermediate values aggregator (elastic#131390)
  Correctly handling download_database_on_pipeline_creation within a pipeline processor within a default or final pipeline (elastic#131236)
  Refresh potential lost connections at query start for `_search` (elastic#130463)
  Add template_id to patterned-text type (elastic#131401)
  Integrate LIKE/RLIKE LIST with ReplaceStringCasingWithInsensitiveRegexMatch rule (elastic#131531)
  [ES|QL] Add doc for the COMPLETION command (elastic#131010)
  ESQL: Add times to topn status (elastic#131555)
  ESQL: Add asynchronous pre-optimization step for logical plan (elastic#131440)
  ES|QL: Improve generative tests for FORK [130015] (elastic#131206)
  Update index mapping update privileges (elastic#130894)
  ESQL: Added Sample operator NamedWritable to plugin (elastic#131541)
  update `kibana_system` to grant it access to `.chat-*` system index (elastic#131419)
  Clarify heap size configuration (elastic#131607)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.1 v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants