Add Weighted Reciprocal Rank Fusion (WRRF) to Python SDK #40822
Adds Weighted Reciprocal Rank Fusion (WRRF) to the Python SDK.
Pull Request Overview
This PR adds support for Weighted Reciprocal Rank Fusion (WRRF) to the Python Cosmos DB SDK to allow different weights for hybrid full text search queries. Key changes include:
- New query tests validating WRRF behavior for both weighted and non‐weighted queries.
- Updates in query planning and aggregator logic (both synchronous and asynchronous) to incorporate component weights.
- Documentation and changelog updates explaining the new WRRF feature.
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
sdk/cosmos/azure-cosmos/tests/test_query_hybrid_search_async.py | Added async test cases for WRRF with various weight scenarios and error conditions. |
sdk/cosmos/azure-cosmos/tests/test_query_hybrid_search.py | Added similar WRRF test cases for synchronous queries including missing weights error handling. |
sdk/cosmos/azure-cosmos/azure/cosmos/documents.py | Declared a new query feature, WeightedRankFusion, to support WRRF queries. |
sdk/cosmos/azure-cosmos/azure/cosmos/aio/_cosmos_client_connection_async.py, sdk/cosmos/azure-cosmos/azure/cosmos/_cosmos_client_connection.py | Updated query plan string to include WeightedRankFusion. |
sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/hybrid_search_aggregator.py, sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/aio/hybrid_search_aggregator.py | Modified aggregator functions to accept component weights and adjust sorting based on weight direction. |
sdk/cosmos/azure-cosmos/README.md | Documented the WRRF functionality, including examples and behavior of negative weights. |
sdk/cosmos/azure-cosmos/CHANGELOG.md | Updated changelog to include the WRRF feature announcement. |
API change check: API changes are not detected in this pull request.
sdk/cosmos/azure-cosmos/README.md
Outdated
@@ -867,6 +867,14 @@ All of these mentioned queries would look something like this:

- `SELECT TOP 10 c.id, c.text FROM c ORDER BY RANK RRF(FullTextScore(c.text, ['quantum', 'theory']), FullTextScore(c.text, ['model']), VectorDistance(c.embedding, {item_embedding}))"`

You can also use Weighted Reciprocal Rank Fusion to assign different weights to the different scores being used in the RRF function.
This is done by passing in a list of weights to the RRF function, which will be used to multiply the scores before the fusion is done.
A Negative value on a weight will reverse the ordering for that score. Usually it is descending, but a negative weight will order it ascending.
I don't think this information is something we should be sharing with users - it's an implementation detail more than an indication of anything. A user that is using weights knows that they are applying more or less importance to a given component. (ignore the suggestion below, just trying to point to the line in question)
A Negative value on a weight will reverse the ordering for that score. Usually it is descending, but a negative weight will order it ascending.
LGTM
Description
This PR adds Weighted RRF to the Python Cosmos DB SDK. Weights can be added at the end of the RRF function's arguments in a hybrid full-text query. The number of weights must match the number of components passed to the RRF function. A positive weight orders that component's scores in descending order, and a negative weight orders them in ascending order.
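To make the rules above concrete, here is a minimal standalone sketch of weighted rank fusion. It is not the SDK's aggregator code: the function name `weighted_rrf`, the constant `k = 60`, treating a zero weight as descending, and multiplying each reciprocal rank by the signed weight are all assumptions made for illustration.

```python
from typing import Dict, List

RRF_K = 60  # conventional RRF smoothing constant; assumed here, not taken from this PR


def weighted_rrf(
    component_scores: List[Dict[str, float]],
    weights: List[float],
    k: int = RRF_K,
) -> List[str]:
    """Fuse per-component scores into one ranking (hypothetical helper)."""
    # One weight is required per component, mirroring the rule in the description.
    if len(weights) != len(component_scores):
        raise ValueError("the number of weights must match the number of components")

    fused: Dict[str, float] = {}
    for scores, weight in zip(component_scores, weights):
        # Non-negative weight: highest score ranks first (descending).
        # Negative weight: lowest score ranks first (ascending).
        ordered = sorted(scores, key=scores.get, reverse=weight >= 0)
        for rank, doc_id in enumerate(ordered, start=1):
            # Assumption: the signed weight multiplies the reciprocal rank.
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (k + rank)

    # Final result: document ids by fused score, best first.
    return sorted(fused, key=fused.get, reverse=True)


if __name__ == "__main__":
    text_scores = {"a": 3.0, "b": 2.0, "c": 1.0}
    vector_scores = {"a": 0.1, "b": 0.9, "c": 0.5}
    print(weighted_rrf([text_scores, vector_scores], [1, 1]))  # ['b', 'a', 'c']
    print(weighted_rrf([text_scores, vector_scores], [3, 1]))  # ['a', 'b', 'c']
```

With equal weights, document `b` wins because it ranks well in both components; tripling the weight of the first component moves `a` to the top.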
Some examples of WRRF: