Skip to content

[ML] Move to the Cohere V2 API for new inference endpoints #129884

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

davidkyle
Copy link
Member

The Cohere V2 API contains 2 changes that must be adapted for

  1. The model parameter is no longer optional
  2. For embeddings the input_type parameter is no longer optional

Creating an endpoint without a model now causes a validation exception. input_type is declared either in task_settings or in the inference call, if not set in either or these places input_type defaults to search_query.

New inference endpoints will use the V2 API, existing endpoints will continue to use the V1 API. The user does not have the option of picking the V1 API in new endpoints. One possibly controversial aspect is that the API version is not surfaced to the user, the version is persisted with the model config but not included in the GET _inference response. I implemented this behaviour because the user does not have the ability to pick the API, in retrospect hiding the version now seems confusing.

The request classes have been moved to org.elasticsearch.xpack.inference.services.cohere.request.v1 and renamed. The new V2 request classes are in org.elasticsearch.xpack.inference.services.cohere.request.v2 (they are very similar).

The upgrade test CohereServiceUpgradeIT tests that the old v1 endpoints still work after upgrading.

@davidkyle davidkyle added >enhancement :ml Machine learning auto-backport Automatically create backport pull requests when merged v8.19.0 v9.1.0 labels Jun 23, 2025
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Jun 23, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Copy link
Collaborator

Hi @davidkyle, I've created a changelog YAML for you.

@@ -166,24 +166,14 @@ private static CohereModel createModel(
return switch (taskType) {
case TEXT_EMBEDDING -> new CohereEmbeddingsModel(
inferenceEntityId,
taskType,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just removing the task type and service name parameters as both of these are known from model type.

builder.endObject();
return builder;
}

public XContentBuilder toXContentFragment(XContentBuilder builder, Params params) throws IOException {
return toXContentFragmentOfExposedFields(builder, params);
toXContentFragmentOfExposedFields(builder, params);
return builder.field(API_VERSION, apiVersion); // API version is persisted but not exposed to the user
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this to the toXContent will expose it to the user

}

@Override
public HttpRequest createHttpRequest() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving shared code to the base class

import java.util.List;
import java.util.Objects;

public class CohereV1CompletionRequest extends CohereRequest {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CohereCompletionRequest and CohereCompletionRequestEntity classes into a single class here. Same for the other request types, this is a refactoring not new code


@Override
protected List<String> pathSegments() {
return List.of(CohereUtils.VERSION_2, CohereUtils.CHAT_PATH);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the V2 API path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >enhancement :ml Machine learning Team:ML Meta label for the ML team v8.19.0 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants