
Added proper handling of streaming error responses across both Faraday V1 and V2 #273


Open · dansingerman wants to merge 5 commits into main

Conversation

dansingerman

What this does

When used within our app, streaming error responses threw an exception instead of being handled properly:

worker      | D, [2025-07-03T18:49:52.221013 #81269] DEBUG -- RubyLLM: Received chunk: event: error
worker      | data: {"type":"error","error":{"details":null,"type":"overloaded_error","message":"Overloaded"}               }
worker      | 
worker      | 
worker      | 2025-07-03 18:49:52.233610 E [81269:sidekiq.default/processor chat_agent.rb:42] {jid: 7382519287f08cfa7cd1e4e4, queue: default} Rails -- Error in ChatAgent#send_with_streaming: NoMethodError - undefined method `merge' for nil:NilClass
worker      | 
worker      |       error_response = env.merge(body: JSON.parse(error_data), status: status)
worker      |                           ^^^^^^
worker      | 2025-07-03 18:49:52.233852 E [81269:sidekiq.default/processor chat_agent.rb:43] {jid: 7382519287f08cfa7cd1e4e4, queue: default} Rails -- Backtrace: /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:91:in `handle_error_chunk'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:62:in `process_stream_chunk'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:70:in `block in legacy_stream_processor'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:113:in `block in perform_request'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:535:in `call_block'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:526:in `<<'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb

It looks like this error was introduced along with Faraday V1 support: the error handling relies on an `env` that is no longer passed. This should provide a fix for both V1 and V2.
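For context, the backtrace points at `handle_error_chunk` in `lib/ruby_llm/streaming.rb` calling `env.merge`. Below is a minimal sketch of the kind of guard this implies; everything other than `handle_error_chunk` and the `env.merge` call is an assumption for illustration, not the gem's actual code:

```ruby
require "json"

# Sketch only: under Faraday V2 the adapter passes the request env into the
# stream handler, but under Faraday V1 it does not, so `env` arrives as nil
# and the original `env.merge` call raises NoMethodError.
def handle_error_chunk(error_data, env:, status:)
  parsed = JSON.parse(error_data)
  error_response =
    if env
      # Faraday V2 path: enrich the existing request env.
      env.merge(body: parsed, status: status)
    else
      # Faraday V1 path: no env is available, so build a minimal
      # response-like object with just the fields the error handler reads.
      Struct.new(:body, :status, keyword_init: true)
            .new(body: parsed, status: status)
    end
  raise_provider_error(error_response) # hypothetical downstream handler
end
```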

One thing to note: I had to manually construct the VCR cassettes; I'm not sure of a better way to test an intermittent error response.

I have also only written the tests against anthropic/claude-3-5-haiku-20241022. It's possible that other models with a different error format are still not handled properly, but even in that case they won't raise for the reasons fixed here.

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Performance improvement

Scope check

  • I read the Contributing Guide
  • This aligns with RubyLLM's focus on LLM communication
  • This isn't application-specific logic that belongs in user code
  • This benefits most users, not just my specific use case

Quality check

  • I ran overcommit --install and all hooks pass
  • I tested my changes thoroughly
  • I updated documentation if needed
  • I didn't modify auto-generated files manually (models.json, aliases.json)

API changes

  • Breaking change
  • New public methods/classes
  • Changed method signatures
  • No API changes

Related issues


codecov bot commented Jul 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.34%. Comparing base (0f60067) to head (87381b1).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #273      +/-   ##
==========================================
+ Coverage   89.71%   90.34%   +0.62%     
==========================================
  Files          75       75              
  Lines        2811     2817       +6     
  Branches      555      559       +4     
==========================================
+ Hits         2522     2545      +23     
+ Misses        289      272      -17     

crmne (Owner)

These changes to the spec file are quite different from our current style for spec files. We also assume that cassettes can be removed, and in fact `rake vcr:record[anthropic]` would remove yours. I'd suggest mocking the messages coming back from the API instead.

It would be great to test with other providers too.

dansingerman (Author)


Ok, I've removed the VCR cassettes and replaced them with stubbed requests.

I've also added tests for the other providers (all except Bedrock).

I've verified the error format used in the mocks against the Anthropic and OpenAI APIs, so I'm reasonably confident in that mocking. I don't have access to Bedrock right now, and its error handling seems somewhat different as far as I can tell, so I haven't added tests for it yet. That would ideally be done by someone with access to Bedrock.
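For illustration, a stub along these lines reproduces the overloaded error from the log above. This is a sketch using WebMock; the status code and exact spec structure are assumptions, not copied from the actual spec file:

```ruby
require "webmock/rspec"

# Hypothetical stub of Anthropic's streaming endpoint returning the
# "overloaded" error event seen in the log above. 529 is the status
# Anthropic documents for overloaded_error.
stub_request(:post, "https://api.anthropic.com/v1/messages")
  .to_return(
    status: 529,
    headers: { "Content-Type" => "text/event-stream" },
    body: <<~SSE
      event: error
      data: {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}

    SSE
  )
```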

@crmne crmne added the bug Something isn't working label Jul 16, 2025
@dansingerman dansingerman requested a review from crmne July 17, 2025 11:52