
Structured Output & JSON mode response support #131


Closed

Conversation


@jayelkaake jayelkaake commented Apr 21, 2025

Features

✅ Supports all features of json_schema with minimal text.
✅ Automatically switches to json_mode. No need to think about it. Just ask for JSON.
✅ Does not require maintenance when the APIs inevitably expand what they support
✅ Thread-safe implementation
✅ Full test coverage
❌ To reduce initial complexity, Anthropic, Deepseek and Gemini are not yet supported. I can add this in another PR.

Sample Code

You can now ensure the responses follow a schema you define like this:

chat = RubyLLM.chat

chat.with_response_format(:integer).ask("What is 2 + 2?").content
# => 4

chat.with_response_format(:string).ask("Say 'Hello World' and nothing else.").content
# => "Hello World"

chat.with_response_format(:array, items: { type: :string })
chat.ask('What are the 2 largest countries? Only respond with country names.').content
# => ["Russia", "Canada"]

chat.with_response_format(:object, properties: { age: { type: :integer } })
chat.ask('Provide sample customer age between 10 and 100.').content
# => { "age" => 42 }

chat.with_response_format(
  :object,
  properties: { hobbies: { type: :array, items: { type: :string, enum: %w[Soccer Golf Hockey] } } }
)
chat.ask('Provide at least 1 hobby.').content
# => { "hobbies" => ["Soccer"] }

You can also provide the JSON schema you want directly to the method like this:

chat.with_response_format(type: :object, properties: { age: { type: :integer } })
# => { "age" => 31 }

In this example the code automatically switches to OpenAI's json_mode, since no object properties are requested:

chat.with_response_format(:json) # Don't care about structure, just give me JSON

chat.ask('Provide a sample customer data object with name and email keys.').content
# => { "name" => "Tobias", "email" => "[email protected]" }

chat.ask('Provide a sample customer data object with name and email keys.').content
# => { "first_name" => "Michael", "email_address" => "[email protected]" }

How was it tested?

  • Tested against all the models; however, it currently only works for OpenAI. I added specs for OpenAI.

Shout-outs

Related

@danielfriis

I like this approach! Very simple implementation.

@danielfriis

Take a look at #90. It was a schema builder I thought about adding, similar to how tools are defined.

@jayelkaake jayelkaake force-pushed the WithResponseFormat branch 3 times, most recently from 3b53b08 to cbcbc7f on April 22, 2025 15:40
@jayelkaake jayelkaake force-pushed the WithResponseFormat branch 2 times, most recently from b9b3d63 to 58d0b6a on April 25, 2025 15:56
connection: @connection,
&
)
response = @provider.with_response_schema(@response_schema) do
Author

@jayelkaake jayelkaake Apr 27, 2025

@crmne should we reset the @response_schema after the completion so it doesn't apply to subsequent messages in the chat?

Learning from your comment on my other PR about temperature, I realize now that your with_* pattern is meant to only apply to the next completion, so that's why I'm asking.

Owner

no, with_* applies to all subsequent messages

Author

Oh, I had actually already updated the code to reset the response schema after completion, following your comment about temperature on my other PR!

I've been battle-testing this code at my company Osello (which is now also a sponsor of this project), and I realized it does make more sense to reset the response_schema after completion, because in practice subsequent chat messages are most likely not meant to follow the same format:

chat.with_response_format(type: :string, enum: %w[Toronto Ottawa])
       .ask("What's the capital of Canada?")
       .content
# => "Ottawa"

chat.ask("How long has it been the capital?")
# => "Ottawa has been the capital of Canada since 1857."

chat.with_response_format(type: :integer).ask("How many years is that?")
# => 168

@sirwolfgang sirwolfgang Apr 30, 2025

Given the design of the with_* prefix, I think it would be best not to break the applies-to-all interface. What about splitting it into a different prefix?

# Applies to all messages
chat = RubyLLM.chat.with_response_format(type: :string)

chat.ask("What's the capital of Canada?")
# => "Ottawa"

# Applies to current message
chat.as(type: :integer).ask("How many years is that?")
# => 168

# Resets back
chat.ask("How long has it been the capital?")
# => "Ottawa has been the capital of Canada since 1857."

Author

@sirwolfgang good idea!

Although .as might be a bit too ambiguous. Maybe .with_next_response_in_format(...) or something like that?

Author

Another idea is that I could make with_response_format take a block, which can be used to reset the format afterwards:

chat.with_response_format(type: :string, enum: %w[Toronto Ottawa]) do
  chat.ask("What's the capital of Canada?").content
end
# => "Ottawa"

chat.ask("How long has it been the capital?")
# => "Ottawa has been the capital of Canada since 1857."

chat.with_response_format(type: :integer)
chat.ask("How many years ago is that?")
# => 168

chat.ask("How many years ago will that be next year?")
# => 169

@sirwolfgang sirwolfgang Apr 30, 2025

@jayelkaake I think linguistically/ergonomically it would be better to split and not use the .with prefix at all, making it a shorter scan/parse.

Totally open to ideas other than .as, but also curious what you think it might collide with. I don't think it's structurally more ambiguous than .with. The only other as convention I can think of is Rails routes, so I also don't think it's mnemonically overloaded.

Could extend it to .as_format, .as_type, .as_response_type or something else, if we want to preserve the root namespace.

Otherwise, if we could delay execution like ActiveRecord does, I can see the argument for making it a postfix setting; something like:

agent.ask("...?").in(type: :integer)
agent.ask("...?").as(type: :integer)
agent.ask("...?").structured_as(type: :integer)
agent.ask("...?").formatted_as(type: :integer)

Author

@jayelkaake jayelkaake Apr 30, 2025

I like the idea; it just doesn't read well in English ("ask someone to do something as..."), and it should also be clear you're not modifying the query like you are with ActiveRecord, you're modifying the response.

I like the postfix format better. Maybe something like agent.ask("...?").response_as(:integer) but I think that might require some major updates to this library to get there.

Most APIs don't let you mutate the response format in real-time, so LLMs are kind of introducing the need for a new pattern maybe? I've been scratching my head about these things a lot over the last couple weeks! 😅
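
For what it's worth, one way a postfix call could work without mutating anything in-flight is to defer sending until the format is known. A rough, purely hypothetical sketch (PendingAsk and the deferred ask are not part of RubyLLM; they only illustrate the idea):

```ruby
# Hypothetical: ask would return a lazy PendingAsk instead of a finished Message,
# which is the "major update" to the library mentioned above.
class PendingAsk
  def initialize(chat, prompt)
    @chat = chat
    @prompt = prompt
  end

  # Postfix format setter: the request is only sent once the format is known.
  def response_as(type, **schema)
    @chat.with_response_format(type, **schema).ask(@prompt)
  end

  # Reading content with no format set behaves like a plain ask.
  def content
    @chat.ask(@prompt).content
  end
end
```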

Yeah, I think refactoring to support more dynamic method chaining should be a different PR, but we could set up the expected syntax here and build towards that, since it should functionally work in either order.
agent.<token>.ask("...?") => agent.ask("...?").<token>

response/d feels a little weird to me. I could see this interface also making sense for loading personas, like respond_as(:support_agent), which might become a way of chaining assistants for processing. Like:

timekeeper = RubyLLM.chat.with_tool(TIME)
groot = RubyLLM.chat.with_instructions(GROOT)

timekeeper.ask("What time is it?").respond_as(groot) # => "I am grooooooot"

@jayelkaake
Author

FYI @crmne I've been battle-testing this code at my company Osello (which is also a sponsor now of this project) and making tweaks over the last week.

If you took a look early last week, please take a fresh look now (same with my PR #124)

@kieranklaassen
Contributor

You rock @jayelkaake !

@sirwolfgang

LMK if there's anything I can contribute to help get this over the line, @crmne @jayelkaake. I'd like to get this in along with #144 to move forward on the path of using structured responses with OpenRouter.

@message.try('tool_call_id=', tool_call_id)
@message.try('input_tokens=', message.input_tokens)
@message.try('output_tokens=', message.output_tokens)
@message.try('content_schema=', message.schema)
@tkoenig tkoenig May 5, 2025

@jayelkaake I think this should be @message.try('content_schema=', message.content_schema), right?

Author

Yes, thanks for pointing that out, @tkoenig.

It was actually a merge problem on my end from working with multiple branches while this one hasn't been reviewed yet 😛 (it was already fixed in my dev branch). I think I need to stop making changes for a little while until it gets a review so I'm not just writing vapourware.

@jayelkaake jayelkaake force-pushed the WithResponseFormat branch from 835327c to e325cba on May 5, 2025 14:25
@kieranklaassen
Contributor

kieranklaassen commented May 12, 2025

@jayelkaake is this one ready to use in production? Should I try out this branch? I'm still running my own version but would like to use this if it will be merged.

@Eric-Guo
Contributor

Eric-Guo commented May 13, 2025

@jayelkaake I just did a rebase onto main and the tricky part is like this, FYI. Hoping this PR gets merged soon.

@joshmlewis

Thank you for the work on this PR. It's definitely a common use case I was sad to see wasn't in the gem already. Anything I can do to help get it over the finish line?

Owner

@crmne crmne left a comment

Thanks for the work on this @jayelkaake, and for the sponsorship support. I've taken a close look at this PR and while I appreciate the effort, I don't think this API design fits with RubyLLM's philosophy.

The core issue: This approach is inconsistent with how RubyLLM handles structured data elsewhere. Compare your proposed API:

chat.with_response_format(:object, properties: { hobbies: { type: :array, items: { type: :string, enum: %w[Soccer Golf Hockey] } } })

To how we handle Tools:

class Weather < RubyLLM::Tool
  description "Gets current weather for a location"
  param :latitude, desc: "Latitude (e.g., 52.5200)"
  param :longitude, desc: "Longitude (e.g., 13.4050)"
  
  def execute(latitude:, longitude:)
    # implementation
  end
end

The Tools API is self-documenting, reusable, and feels Ruby-native. Your approach is essentially raw JSON Schema dumped into method calls, which I have to backtrack on.

Additional concerns:

  1. Parameter overloading is confusing - sometimes a symbol, sometimes symbol + hash, sometimes just hash
  2. Thread.current conflicts with Context - we just shipped Context in 1.3.0 specifically to avoid thread-local state
  3. Magic behavior switching between json_schema and json_object modes is too implicit
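
For context on point 3, the two modes differ only in the response_format fragment of the request payload; roughly, for OpenAI's Chat Completions API (shapes shown for illustration only):

```ruby
# json_object mode: the model must emit valid JSON, but no particular shape is enforced.
json_mode_format = { type: "json_object" }

# json_schema mode (structured outputs): the response must conform to the supplied schema.
json_schema_format = {
  type: "json_schema",
  json_schema: {
    name: "customer",
    strict: true,
    schema: {
      type: "object",
      properties: { age: { type: "integer" } },
      required: ["age"],
      additionalProperties: false
    }
  }
}
```

The PR picks between these automatically based on whether object properties were supplied, which is the implicit switching being flagged here.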

What structured responses could look like in RubyLLM:

(need to think this through)

class CustomerData < RubyLLM::Schema
  string :name, required: true
  string :email
  integer :age, range: 18..100
end

class HobbyList < RubyLLM::Schema
  array :hobbies, items: { type: :string, enum: %w[Soccer Golf Hockey] }
end

# Usage
response = chat.with_schema(CustomerData).ask("Generate a customer")
# or
response = chat.ask("Generate a customer", schema: CustomerData)

puts response.name      # => "John Doe"
puts response.email     # => "[email protected]" 
puts response.age       # => 42

This approach would be:

  • Consistent with our existing patterns
  • Reusable across conversations
  • Self-documenting
  • Properly integrated with Context instead of Thread.current
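
To make the class-based idea concrete, here is one hypothetical way such a DSL could compile down to the JSON Schema hash a provider expects. None of this is RubyLLM API; the names and behavior are illustrative only:

```ruby
# Hypothetical schema DSL: class-level declarations collected into a JSON Schema hash.
class Schema
  def self.properties
    @properties ||= {}
  end

  def self.required_keys
    @required_keys ||= []
  end

  def self.string(name, required: false, enum: nil)
    properties[name] = { type: "string", enum: enum }.compact
    required_keys << name if required
  end

  def self.integer(name, required: false, range: nil)
    prop = { type: "integer" }
    prop[:minimum], prop[:maximum] = range.first, range.last if range
    properties[name] = prop
    required_keys << name if required
  end

  # Compile the declarations into the hash structured-output APIs expect.
  def self.to_json_schema
    { type: "object", properties: properties, required: required_keys.map(&:to_s) }
  end
end

class CustomerData < Schema
  string :name, required: true
  string :email
  integer :age, range: 18..100
end

CustomerData.to_json_schema
# => { type: "object",
#      properties: { name: { type: "string" }, email: { type: "string" },
#                    age: { type: "integer", minimum: 18, maximum: 100 } },
#      required: ["name"] }
```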

I know this means starting over, but I'd rather ship something that feels like RubyLLM than rush a feature that fights the library's design. When structured responses land, they need to feel like they belong.

Given the strong community interest in this feature (as seen in #11), I'm prioritizing getting this right. Would you be interested in exploring this class-based approach, or would you prefer I handle the implementation?

Comment on lines 161 to 179
@message.update!(
role: message.role,
content: message.content,
model_id: message.model_id,
tool_call_id: tool_call_id,
input_tokens: message.input_tokens,
output_tokens: message.output_tokens
)
# These are required fields:
@message.role = message.role
@message.content = message.content

# These are optional fields:
@message.try('model_id=', message.model_id)
@message.try('tool_call_id=', tool_call_id)
@message.try('input_tokens=', message.input_tokens)
@message.try('output_tokens=', message.output_tokens)
@message.try('content_schema=', message.schema)

@message.save!

Owner

what has this to do with structured outputs?

Author

There are 2 things here depending on what you're asking about, @crmne:

Why use try?

The repo documentation here says that there are required fields and the other fields are optional.

Either the docs need to be changed and the major version bumped to v2 to indicate the backwards-incompatible change under semantic versioning, or using try lets those fields stay optional for now.

Why store the schema in the DB?

Since the LLMs are sent the entire thread on each request, they also require the schema for each message sent. Without it, you can't continue conversations that were stored in the DB at a later date.

In fact, I also later realized that the tool call param schemas need to be stored as well for the same reason.

There are probably ways to not store the schema and instead link it to a class locally, but that would break all past conversations if the response schema in the code changed.

@crmne let me know if you're wondering about something else.
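
For concreteness, persisting the schema alongside each message is just an extra column on the messages table. A hypothetical migration (the table name and the content_schema column follow this PR's acts_as wiring and are not released API):

```ruby
# Hypothetical migration: store the response schema used for each message so a
# conversation loaded from the DB can be replayed with its original formats.
class AddContentSchemaToMessages < ActiveRecord::Migration[7.1]
  def change
    add_column :messages, :content_schema, :jsonb
  end
end
```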

Owner

We don't need to use try, as the message referenced there is a Message instance, which sets these parameters to nil or a suitable default in case they don't exist.

Totally understand the schema needing to be in the message, and I'm up for that.

Author

@crmne Unless fallbacks were added to the ActsAs concern after this commit, you will get an error on save if you follow the docs exactly for the Rails models and don't add the output_tokens column to your Rails model.

That's why the try is needed. You can try it for yourself if you want.

@message.class != message.class in this method - maybe that's what's causing confusion? I'm trying to add more docs to methods to reduce that confusion in my fork (which I'll add into this PR).

Let me know if I missed something here.

Owner

Ah I see what you mean now! In the docs I have mistakenly kept the comment that says that these fields are optional. My bad. Fixing.

Author

Haha thanks - if you're updating the docs I'll just remove the trys, because others would not have been able to proceed with basic functionality anyway if they ran into the same issue as me. Do you agree?

Owner

I did c48c8eb

That's a good idea. Let's remove the trys.

Comment on lines +34 to +56
##
# @return [::RubyLLM::Schema, NilClass]
def response_schema
  Thread.current['RubyLLM::Provider::Methods.response_schema']
end

##
# @param response_schema [::RubyLLM::Schema]
def with_response_schema(response_schema)
  prev_response_schema = Thread.current['RubyLLM::Provider::Methods.response_schema']

  result = nil
  begin
    Thread.current['RubyLLM::Provider::Methods.response_schema'] = response_schema

    result = yield
  ensure
    Thread.current['RubyLLM::Provider::Methods.response_schema'] = prev_response_schema
  end

  result
end

Owner

what's all this about threads? we have contexts now

Author

This was a thread-safe way to set response schema.

I'll be able to refactor it to use context instead (assuming the context system is thread-safe; I haven't looked yet).
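
For comparison with the Thread.current version quoted above, the instance-scoped direction being discussed could look roughly like this (hypothetical sketch, assuming each chat or context holds its own provider state):

```ruby
# Hypothetical instance-scoped variant: the schema lives on the object itself,
# so separate Chat/Context instances stay isolated without thread-local storage.
def with_response_schema(response_schema)
  prev_response_schema = @response_schema
  @response_schema = response_schema
  yield
ensure
  @response_schema = prev_response_schema
end
```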

@jayelkaake
Author

Hey @crmne thanks for the feedback!

I'd be happy to update the PR with your suggestions.

Adding the context system is amazing. I was going to suggest that, so now that it exists it'll greatly simplify my implementation.

I'll work on it and submit my update for re-review.

@danielfriis

danielfriis commented Jun 3, 2025

FYI. Did a bit of work towards class-defined schemas here #90

@crmne crmne linked an issue Jun 10, 2025 that may be closed by this pull request
@crmne crmne added the enhancement New feature or request label Jul 16, 2025
crmne added a commit that referenced this pull request Jul 21, 2025
…lying API payload (#265)

## What this does

Implements `with_request_options` (renamed from `with_options` due to
ActiveRecord conflict -- see conversation) with @crmne's suggestions
from comment
#130 (review)
and tested against all providers.

This allows users to set arbitrary options on the payload before it's
sent to the provider's API endpoint. The render_payload takes
precedence.

Demo:

```ruby
chat = RubyLLM
  .chat(model: "qwen3", provider: :ollama)
  .with_request_options(response_format: {type: "json_object"})
  .with_instructions("Answer with a JSON object with the key `result` and a numerical value.")
response = chat.ask("What is the square root of 64?")
response.content
=> "{\n  \"result\": 8\n}"
```

This is a power-user feature, and is specific to each provider (and
model, to a lesser extent). I added a brief section to the docs.

For tests: different providers support different options, so tests are
divided by provider.

(Note that `deep_merge` is required for Gemini in particular because it
relies on a top-level `generationConfig` object.)

## Type of change

- [ ] Bug fix
- [X] New feature
- [ ] Breaking change
- [ ] Documentation
- [ ] Performance improvement

## Scope check

- [X] I read the [Contributing
Guide](https://github.com/crmne/ruby_llm/blob/main/CONTRIBUTING.md)
- [X] This aligns with RubyLLM's focus on **LLM communication**
- [X] This isn't application-specific logic that belongs in user code
- [X] This benefits most users, not just my specific use case

## Quality check

- [ ] I ran `overcommit --install` and all hooks pass
- [X] I tested my changes thoroughly
- [X] I updated documentation if needed
- [X] I didn't modify auto-generated files manually (`models.json`,
`aliases.json`)

## API changes

- [ ] Breaking change
- [X] New public methods/classes
- [ ] Changed method signatures
- [ ] No API changes

## Related issues

- #130
- #131
- #221

---------

Co-authored-by: Carmine Paolino <[email protected]>
@Eric-Guo
Contributor

Glad there's the new with_params as a workaround, and looking forward to full #11 support!

@crmne
Owner

crmne commented Jul 23, 2025

Thank you for this PR! Your ideas have been incorporated into the structured output implementation that just landed in be85257.

I've implemented many of your suggestions:

  • Automatic switching to JSON mode/structured output based on schema presence
  • Support for both manual schemas and RubyLLM::Schema objects
  • Provider-specific handling (OpenAI's json_schema, Gemini's responseSchema)
  • Automatic JSON parsing of responses

Key differences from the original PR:

  • Used with_schema method name for consistency with other with_* methods
  • Added a force flag to bypass model capability checks when needed
  • Integrated deeply with the Rails persistence layer

Full documentation: https://rubyllm.com/guides/chat#structured-output-with-json-schemas-with_schema

The feature will be available in v1.4.0. Thanks again for the initial implementation - it was very helpful in shaping the final design!
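
For readers landing here from search, a rough sketch of what usage looks like per the description above (the linked guide is authoritative; the schema below is illustrative):

```ruby
# Based on the comment above: with_schema accepts a JSON Schema hash (or a
# RubyLLM::Schema object) and the response content comes back parsed.
person_schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age: { type: "integer" }
  },
  required: %w[name age]
}

chat = RubyLLM.chat
response = chat.with_schema(person_schema).ask("Generate a sample person")
response.content # => e.g. { "name" => "Alice", "age" => 30 }

# force: true is described as bypassing the model capability check:
chat.with_schema(person_schema, force: true)
```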

@crmne crmne closed this Jul 23, 2025
Labels
enhancement New feature or request

Successfully merging this pull request may close these issues.

Structured output response support
10 participants