-
Notifications
You must be signed in to change notification settings - Fork 25.3k
[ML] Move to the Cohere V2 API for new inference endpoints #129884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Pinging @elastic/ml-core (Team:ML) |
Hi @davidkyle, I've created a changelog YAML for you. |
@@ -166,24 +166,14 @@ private static CohereModel createModel( | |||
return switch (taskType) { | |||
case TEXT_EMBEDDING -> new CohereEmbeddingsModel( | |||
inferenceEntityId, | |||
taskType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just removing the task type and service name parameters as both of these are known from model type.
builder.endObject(); | ||
return builder; | ||
} | ||
|
||
public XContentBuilder toXContentFragment(XContentBuilder builder, Params params) throws IOException { | ||
return toXContentFragmentOfExposedFields(builder, params); | ||
toXContentFragmentOfExposedFields(builder, params); | ||
return builder.field(API_VERSION, apiVersion); // API version is persisted but not exposed to the user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to the toXContent
will expose it to the user
} | ||
|
||
@Override | ||
public HttpRequest createHttpRequest() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving shared code to the base class
import java.util.List; | ||
import java.util.Objects; | ||
|
||
public class CohereV1CompletionRequest extends CohereRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CohereCompletionRequest
and CohereCompletionRequestEntity
classes into a single class here. Same for the other request types, this is a refactoring not new code
|
||
@Override | ||
protected List<String> pathSegments() { | ||
return List.of(CohereUtils.VERSION_2, CohereUtils.CHAT_PATH); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the V2 API path.
The Cohere V2 API contains 2 changes that must be adapted for
model
parameter is no longer optionalinput_type
parameter is no longer optionalCreating an endpoint without a model now causes a validation exception.
input_type
is declared either in task_settings or in the inference call, if not set in either or these places input_type defaults to search_query.New inference endpoints will use the V2 API, existing endpoints will continue to use the V1 API. The user does not have the option of picking the V1 API in new endpoints. One possibly controversial aspect is that the API version is not surfaced to the user, the version is persisted with the model config but not included in the
GET _inference
response. I implemented this behaviour because the user does not have the ability to pick the API, in retrospect hiding the version now seems confusing.The request classes have been moved to
org.elasticsearch.xpack.inference.services.cohere.request.v1
and renamed. The new V2 request classes are inorg.elasticsearch.xpack.inference.services.cohere.request.v2
(they are very similar).The upgrade test
CohereServiceUpgradeIT
tests that the old v1 endpoints still work after upgrading.