-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for ordering by distance (using <-> operator) #2989
Comments
Syntax could maybe be something like this:
|
It's getting too hard to extend the URL query grammar. Another case of a common query that uses the order by distance is this one for vector search: select
id,
content,
metadata,
1 - (documents.embedding <=> query_embedding) as similarity
from documents
where metadata @> filter
order by documents.embedding <=> query_embedding; (from https://python.langchain.com/docs/integrations/vectorstores/supabase) Instead of trying to shove these special cases into query params, I'm thinking we should take advantage of content types. Something like: SEARCH /table
Content-Type: application/vnd.pgrst.vector-search; distance-order;col1;col2; query=xx; filter=yy
Accept: application/json
{query: "..", filter: ".."} Then that would then execute that common query. (Another case that also fits here is the order by random PostgREST/postgrest-docs#671) If we can make this play nicely with #2826 then that would be ideal. I also think it'd be much clearer for users to have properly named content-types with parameters than to use more magic syntax in the URL grammar. |
I am not really sure if ordering by distance specifically is not too limiting. In SQL it is more general ORDER BY [expr] DESC/ASC. It is just that distance operator is most commonly used and supported by index access methods. In my case adding ORDER BY col <-> [constant] is just a workaround for lack of ORDER BY column support in GIST access method. I would actually envision syntax resembling horizontal filtering: ?order=[column].[operator].[argument] where argument might be required to be a constant value. |
In #2066 (comment) I argued, that we should:
I'm not sure anymore whether that's really the case. So... I don't think this is a good idea after all. |
I have an interesting case: we have a huge (one partitioned table with 10G rows) database that needs to be searched effectively using various criteria. It appears a single GIST index would meet all our demands as it is multi-column and supports multiple datatypes in a single index.
The problem is that GIST only supports ordering by distance:
ORDER BY column <-> [value]
.It is not a big deal in SQL as for btree_gist operations we can change from ORDER BY column to ORDER BY column <-> [very_small_value]
It would be great if we could use it from PostgREST without resorting to custom functions.
I guess it would also be very nice addition for all full text and vector searching needs (as most of the time you don't want to filter but rather to sort by rank).
The text was updated successfully, but these errors were encountered: