InferenceClient: allow passing a pydantic model as response_format #2646

Open
@Wauplin

Description

(issue opened after discussion with @lhoestq)

In InferenceClient.chat_completion, one can pass a response_format that constrains the output format. It must be either a regex or a JSON schema. A common use case is having a dataclass or a Pydantic model and wanting the LLM to generate an instance of that class. This can currently be done like this:

client.chat_completion(..., response_format={"type": "json", "value": MyCustomModel.schema()})
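For illustration, here is a minimal end-to-end sketch of that workflow. The model checkpoint and the `Weather` class are hypothetical placeholders, and on Pydantic v2 `model_json_schema()` replaces the deprecated `schema()`:

```python
# Minimal sketch of the current workflow: the caller serializes the
# Pydantic model's JSON schema by hand before calling chat_completion.
from huggingface_hub import InferenceClient
from pydantic import BaseModel

class Weather(BaseModel):  # hypothetical example model
    city: str
    temperature_celsius: float

client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")  # placeholder checkpoint
response = client.chat_completion(
    messages=[{"role": "user", "content": "What is the weather in Paris? Answer in JSON."}],
    response_format={"type": "json", "value": Weather.schema()},  # Weather.model_json_schema() on Pydantic v2
    max_tokens=100,
)
print(response.choices[0].message.content)
```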

It would be good to either:

  1. document this particular use case for convenience,
  2. or even allow passing client.chat_completion(..., response_format=MyCustomModel) and handle the serialization automatically before making the HTTP call (a possible approach is sketched below). If we do so, pydantic should not become a required dependency.
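
One way to achieve option 2 without depending on pydantic is to duck-type on the schema methods that Pydantic model classes expose. A minimal sketch, with a hypothetical helper name:

```python
# Hypothetical helper: normalize response_format without importing pydantic.
# Duck-types on the schema methods Pydantic model classes expose.
from typing import Any, Dict

def _normalize_response_format(response_format: Any) -> Dict[str, Any]:
    if isinstance(response_format, dict):
        return response_format  # already a serialized grammar
    # Pydantic v2 classes expose model_json_schema(); v1 classes expose schema().
    for attr in ("model_json_schema", "schema"):
        schema_fn = getattr(response_format, attr, None)
        if callable(schema_fn):
            return {"type": "json", "value": schema_fn()}
    raise ValueError(
        f"response_format must be a dict or a Pydantic model class, got {type(response_format)!r}."
    )
```

With such a helper, client.chat_completion(..., response_format=MyCustomModel) and the existing dict form would both keep working.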

Note: the same should be done for client.text_generation(..., grammar=...).
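
For symmetry, the equivalent call with text_generation today might look like this (reusing the Weather model above; the grammar dict has the same shape):

```python
# Same manual serialization, via text_generation's grammar parameter.
result = client.text_generation(
    "What is the weather in Paris? Answer in JSON.",
    grammar={"type": "json", "value": Weather.schema()},
    max_new_tokens=100,
)
print(result)
```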


Note: it seems it is also possible to handle plain dataclasses with something like this. Unsure whether it's worth the hassle though. If we add that, we should not add a dependency; instead, copy the code and its license into a submodule, given how tiny and unmaintained the code is.
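
For context, a rough idea of what dataclass support could involve. This is an illustrative sketch that handles only flat dataclasses with primitive fields, not the referenced library:

```python
# Illustrative sketch: derive a JSON schema from a flat dataclass.
# Real support would need Optional, nested dataclasses, lists, defaults, etc.
import dataclasses
from typing import Any, Dict

_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def dataclass_to_json_schema(cls: type) -> Dict[str, Any]:
    fields = dataclasses.fields(cls)
    return {
        "type": "object",
        "properties": {f.name: {"type": _TYPE_MAP.get(f.type, "string")} for f in fields},
        "required": [f.name for f in fields],
    }

@dataclasses.dataclass
class WeatherDC:  # hypothetical example
    city: str
    temperature_celsius: float

schema = dataclass_to_json_schema(WeatherDC)  # usable as {"type": "json", "value": schema}
```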

Metadata

Assignees: no one assigned
Labels: enhancement (New feature or request)