
Add l2_norm normalization support to linear retriever #128504

Open
wants to merge 20 commits into
base: main

mridula-s109
Contributor

Summary

This PR adds support for L2 normalization (l2_norm) to the linear retriever in Elasticsearch.

Changes

  • Implements a new L2ScoreNormalizer class under org.elasticsearch.xpack.rank.linear that normalizes scores so that their L2 norm is 1.
  • Registers l2_norm as a valid normalizer in the linear retriever configuration.
  • Updates YAML REST tests (10_linear_retriever.yml) to cover the new normalization method.
  • Updates documentation to include l2_norm as a supported normalizer option.

@mridula-s109 requested review from ioanatia, a team, and Copilot on May 27, 2025 11:33
@mridula-s109 added the >enhancement, auto-backport, :SearchOrg/Relevance, v8.19.0, v9.1.0, and Team:Search - Relevance labels on May 27, 2025
@elasticsearchmachine added the Team:SearchOrg label on May 27, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/search-eng (Team:SearchOrg)

@elasticsearchmachine
Collaborator

Pinging @elastic/search-relevance (Team:Search - Relevance)

@elasticsearchmachine
Collaborator

Hi @mridula-s109, I've created a changelog YAML for you.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Adds L2 (Euclidean) normalization support for scores in the linear retriever, registers it in the core normalizer lookup, updates REST tests, and expands documentation.

  • Implements L2ScoreNormalizer to normalize score vectors to unit L2 norm.
  • Registers "l2_norm" in ScoreNormalizer.valueOf.
  • Adds YAML REST tests and docs entries for the new normalizer.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| x-pack/plugin/rank-rrf/src/yamlRestTest/resources/rest-api-spec/test/linear/10_linear_retriever.yml | Adds a test scenario for l2_norm normalization |
| x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/ScoreNormalizer.java | Registers L2ScoreNormalizer in valueOf |
| x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/L2ScoreNormalizer.java | Implements the L2 normalization logic |
| docs/reference/elasticsearch/rest-apis/retrievers.md | Documents l2_norm as a valid normalizer option |

Comments suppressed due to low confidence (1)

x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/L2ScoreNormalizer.java:29

  • Add unit tests covering edge cases in normalizeScores, such as when the input array is empty, when all scores are NaN, and when the computed norm is below EPSILON, to ensure the fallback branches behave as expected.
    public ScoreDoc[] normalizeScores(ScoreDoc[] docs) {
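
The edge cases named in this comment (empty input, all-NaN scores, a norm below EPSILON) can be exercised with a small stand-alone sketch. Everything here is an assumption made for illustration: `Doc` is a hypothetical stand-in for Lucene's `ScoreDoc`, the `EPSILON` value is a guess because this page does not show it, and the fallback of returning the input array untouched is one plausible behavior rather than the PR's confirmed logic.

```java
// Hypothetical sketch of the edge cases; not the PR's L2ScoreNormalizer.
public final class L2EdgeCaseSketch {
    /** Stand-in for Lucene's ScoreDoc so the sketch has no dependencies. */
    public static final class Doc {
        public final int id;
        public final float score;

        public Doc(int id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    // Assumed threshold; the review mentions EPSILON but not its value.
    public static final double EPSILON = 1e-6;

    public static Doc[] normalizeScores(Doc[] docs) {
        if (docs.length == 0) {
            return docs; // nothing to normalize
        }
        double sumOfSquares = 0.0;
        boolean anyValidScore = false;
        for (Doc doc : docs) {
            if (!Float.isNaN(doc.score)) {
                anyValidScore = true;
                sumOfSquares += (double) doc.score * doc.score;
            }
        }
        double norm = Math.sqrt(sumOfSquares);
        if (!anyValidScore || norm < EPSILON) {
            return docs; // assumed fallback: leave the scores untouched
        }
        Doc[] normalized = new Doc[docs.length];
        for (int i = 0; i < docs.length; i++) {
            normalized[i] = new Doc(docs[i].id, (float) (docs[i].score / norm));
        }
        return normalized;
    }

    public static void main(String[] args) {
        Doc[] empty = new Doc[0];
        Doc[] allNaN = { new Doc(1, Float.NaN), new Doc(2, Float.NaN) };
        Doc[] tiny = { new Doc(1, 1e-9f) };
        // Each fallback branch returns the very same array instance, so each line prints true.
        System.out.println(normalizeScores(empty) == empty);
        System.out.println(normalizeScores(allNaN) == allNaN);
        System.out.println(normalizeScores(tiny) == tiny);
    }
}
```

Returning the input unchanged keeps the fallback branches easy to assert on in a unit test, since the result can be compared to the input by reference.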

Member

@kderusso kderusso left a comment

Nice work, @mridula-s109! Agreed with @ioanatia's suggestion on additional tests.

Does it make sense to add unit tests for the normalizeScores method too?

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds L2 normalization support to the linear retriever in Elasticsearch by implementing a new L2ScoreNormalizer, updating configuration resolution, and expanding tests and documentation.

  • Introduces L2ScoreNormalizer with L2 norm scaling
  • Updates ScoreNormalizer to recognize "l2_norm"
  • Adds YAML REST tests and documentation changes for the new normalizer

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| x-pack/plugin/rank-rrf/src/yamlRestTest/resources/rest-api-spec/test/linear/10_linear_retriever.yml | Added YAML tests for verifying L2 normalization behavior |
| x-pack/plugin/rank-rrf/src/test/java/org/elasticsearch/xpack/rank/linear/L2ScoreNormalizerTests.java | Created test cases to validate normalization with typical, zero, and NaN scores |
| x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/ScoreNormalizer.java | Updated to support lookup of the new L2 normalizer |
| x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/linear/L2ScoreNormalizer.java | New implementation for L2 normalization of scores |
| docs/reference/elasticsearch/rest-apis/retrievers.md | Updated documentation with the "l2_norm" option |
| docs/changelog/128504.yaml | Changelog entry for L2 normalization support |

Labels
  • auto-backport: Automatically create backport pull requests when merged
  • >enhancement
  • :SearchOrg/Relevance: Label for the Search (solution/org) Relevance team
  • Team:Search - Relevance: The Search organization Search Relevance team
  • Team:SearchOrg: Meta label for the Search Org (Enterprise Search)
  • v8.19.0
  • v9.1.0

4 participants