Description
When querying _field_caps, semantic_text fields are returned with the text field type. The intention is that these fields should "just be text fields", but there are differences: for example, the multi_match query does not work for semantic_text fields, and semantic_text is not queryable cross-cluster. Getting the index mappings directly has been suggested as an alternative, but this is very bad for performance (because of the payload size) and does not work cross-cluster.
Either semantic_text should be recognizable as such in _field_caps, or the feature gap needs to be closed.
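To illustrate the ambiguity, here is a minimal sketch of what a client sees. The response below is hand-written in the general shape of a _field_caps response, not captured from a real cluster; the index name and field names are hypothetical, and it assumes that "body" is mapped as semantic_text while "title" is a regular text field.

```python
from typing import Any, Dict, List

# Hypothetical response in the shape returned by _field_caps.
# Assumption: "body" is mapped as semantic_text, "title" as plain text.
SAMPLE_FIELD_CAPS: Dict[str, Any] = {
    "indices": ["search-connector-a"],
    "fields": {
        "title": {"text": {"type": "text", "searchable": True, "aggregatable": False}},
        "body": {"text": {"type": "text", "searchable": True, "aggregatable": False}},
        "created_at": {"date": {"type": "date", "searchable": True, "aggregatable": True}},
    },
}


def text_fields(field_caps: Dict[str, Any]) -> List[str]:
    """Return the sorted names of all fields reported with type "text"."""
    return sorted(
        name
        for name, caps_by_type in field_caps.get("fields", {}).items()
        if "text" in caps_by_type
    )


print(text_fields(SAMPLE_FIELD_CAPS))  # ['body', 'title']
```

Both fields come back with identical capabilities, so nothing in the response tells the caller that "body" needs different query handling.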
Use cases
For knowledge bases in the Obs AI Assistant, we leverage search connector indices. The user types in a prompt, and we then want to use that prompt to query these indices. For that to work, we need to understand which fields should be queried as text. We can do this by calling _field_caps on those indices with type: text. However, because we cannot tell which fields are semantic_text, we cannot use all text queries. This currently means that we have to create a match query for each text field instead of creating a single multi_match query. There are probably other queries as well that do not work on semantic_text fields. We can work around this on a case-by-case basis, but in general we need the capability to determine upfront which kinds of queries can be executed and which cannot, in order to prevent internal errors.
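The workaround could look roughly like the sketch below, which only builds query bodies and does not talk to a cluster. The function names are hypothetical, and combining the per-field match clauses in a bool/should is an assumption about how such a workaround might be assembled, not a description of the actual Obs AI Assistant code.

```python
from typing import Any, Dict, List


def per_field_match_query(prompt: str, fields: List[str]) -> Dict[str, Any]:
    """Workaround: one match clause per field, OR-ed together in a bool query."""
    return {
        "bool": {
            "should": [{"match": {field: {"query": prompt}}} for field in fields],
            "minimum_should_match": 1,
        }
    }


def multi_match_query(prompt: str, fields: List[str]) -> Dict[str, Any]:
    """What we would prefer to send if every text field supported it."""
    return {"multi_match": {"query": prompt, "fields": fields}}


# Field names are illustrative; in practice they would come from _field_caps.
fields = ["title", "body"]
print(per_field_match_query("how do I rotate API keys?", fields))
print(multi_match_query("how do I rotate API keys?", fields))
```

The per-field form grows with the number of text fields and has to be applied to every field, because the caller cannot single out the semantic_text ones.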