
[BUG] Bedrock models with largest context window are favored by aliases even though they do not support ON_DEMAND inference #232

@tpaulshippy

Description

Basic checks

  • I searched existing issues - this hasn't been reported
  • I can reproduce this consistently
  • This is a RubyLLM bug, not my application code

What's broken?

This is more of an annoyance than a bug, but I think we do want working with this library to be delightful. The issue is that the :200k versions of the Bedrock models tend not to support the "ON_DEMAND" inference type. That means the only way to use them is provisioned throughput, which is billed hourly. See this. Most customers will not want that, so the aliases are unusable for many of the Bedrock models.
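As a workaround sketch (assuming RubyLLM accepts a full Bedrock model ID in place of an alias; the exact ID string below is an assumption and may vary by model version and region), alias resolution can be bypassed entirely:

# Hypothetical workaround: pass the base on-demand model ID directly,
# so the alias never resolves to a :200k (provisioned-throughput-only) variant.
chat = RubyLLM.chat(
  model: 'anthropic.claude-3-haiku-20240307-v1:0',
  provider: :bedrock
)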

How to reproduce

chat = RubyLLM.chat(
  model: 'claude-3-haiku',
  provider: :bedrock
)
response = chat.ask("hello")
puts response.content

Expected behavior

A response from the LLM is shown.

What actually happened

/Users/paulshippy/Dev/ruby_llm/lib/ruby_llm/error.rb:78:in `parse_error': Model not found. (RubyLLM::Error)
        from /Users/paulshippy/Dev/ruby_llm/lib/ruby_llm/streaming.rb:101:in `handle_failed_response'
        from /Users/paulshippy/Dev/ruby_llm/lib/ruby_llm/providers/bedrock/streaming/base.rb:52:in `block in handle_stream'
        from /Users/paulshippy/.rbenv/versions/3.2.1/lib/ruby/gems/3.2.0/gems/faraday-2.13.1/lib/faraday/options/env.rb:176:in `block in stream_response'
        from /Users/paulshippy/.rbenv/versions/3.2.1/lib/ruby/gems/3.2.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:535:in `call_block'
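
A possible direction for a fix, sketched here with the aws-sdk-bedrock gem (an assumption: this is not necessarily how RubyLLM builds its model registry), would be to restrict alias candidates to models whose reported inference types include ON_DEMAND:

require 'aws-sdk-bedrock'

# List foundation models and keep only those usable with on-demand billing.
client = Aws::Bedrock::Client.new(region: 'us-east-1')
on_demand_ids = client.list_foundation_models.model_summaries
                      .select { |m| m.inference_types_supported.to_a.include?('ON_DEMAND') }
                      .map(&:model_id)

# An alias like 'claude-3-haiku' should resolve to one of these IDs,
# not to a :200k variant that requires provisioned throughput.
puts on_demand_ids.grep(/claude-3-haiku/)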

Environment

  • ruby 3.2.1 (2023-02-08 revision 31819e82c8) [arm64-darwin22]
  • RubyLLM 1.3
  • Bedrock
  • macOS
