
[ML] Move to the Cohere V2 API for new inference endpoints #129884


Merged
16 commits merged into elastic:main on Jun 24, 2025

Conversation

davidkyle (Member)

The Cohere V2 API introduces two changes that must be accommodated:

  1. The model parameter is no longer optional.
  2. For embeddings, the input_type parameter is no longer optional.

Creating an endpoint without a model now causes a validation exception. input_type can be declared either in task_settings or in the inference call; if it is not set in either of these places, it defaults to search_query.
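
To make that concrete, here is a minimal sketch (not taken from this PR) of creating a Cohere embedding endpoint and then calling it through the _inference REST API. The host, API key placeholder, model name, and the exact task_settings values are assumptions for illustration; the input_type values shown at the Elasticsearch level (ingest/search) are assumed to map onto Cohere's search_document/search_query.

```python
# Minimal sketch, not from the PR: create a Cohere embedding endpoint and call it.
# Host, API key, model name and the exact input_type values are assumptions.
import requests

ES = "http://localhost:9200"  # assumed local cluster

# With the V2 API the model is mandatory: omitting "model_id" here would now
# fail endpoint validation instead of falling back to a service default.
create = requests.put(
    f"{ES}/_inference/text_embedding/cohere-embeddings",
    json={
        "service": "cohere",
        "service_settings": {
            "api_key": "<COHERE_API_KEY>",     # placeholder
            "model_id": "embed-english-v3.0",  # example model, assumed
        },
        # input_type can be fixed at endpoint creation time via task_settings...
        "task_settings": {"input_type": "ingest"},
    },
)
create.raise_for_status()

# ...or supplied per request in the inference call. If it is set in neither
# place, the PR describes the default sent to Cohere as search_query.
infer = requests.post(
    f"{ES}/_inference/text_embedding/cohere-embeddings",
    json={
        "input": ["How do I reset my password?"],
        "task_settings": {"input_type": "search"},
    },
)
infer.raise_for_status()
print(infer.json())
```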

New inference endpoints will use the V2 API; existing endpoints will continue to use the V1 API. The user does not have the option of picking the V1 API for new endpoints. One possibly controversial aspect is that the API version is not surfaced to the user: the version is persisted with the model config but is not included in the GET _inference response. I implemented this behaviour because the user cannot choose the API version, but in retrospect hiding the version seems confusing.

The existing request classes have been moved to org.elasticsearch.xpack.inference.services.cohere.request.v1 and renamed. The new V2 request classes (which are very similar) are in org.elasticsearch.xpack.inference.services.cohere.request.v2.

The upgrade test CohereServiceUpgradeIT verifies that existing V1 endpoints continue to work after an upgrade.

davidkyle added the >enhancement, :ml, auto-backport, v8.19.0 and v9.1.0 labels on Jun 23, 2025
elasticsearchmachine added the Team:ML label on Jun 23, 2025
elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine (Collaborator)

Hi @davidkyle, I've created a changelog YAML for you.

davidkyle enabled auto-merge (squash) on June 24, 2025 15:36
davidkyle (Member, Author)

Test this please

davidkyle merged commit 3a1551e into elastic:main on Jun 24, 2025
32 checks passed
elasticsearchmachine (Collaborator)

💔 Backport failed

Branch: 8.19
Result: Commit could not be cherry-picked due to conflicts

You can use sqren/backport to backport manually by running: backport --upstream elastic/elasticsearch --pr 129884
