Add global Rerank interface with Cohere Rerank model support #276

Open
wants to merge 33 commits into
base: main

Conversation

infinityrobot
Contributor

@infinityrobot infinityrobot commented Jul 7, 2025

Important

Note! This PR is branched off of, and depends on, the Cohere Provider implementation in #275.
Reranking has been submitted as a separate contribution to allow for a meaningful review independent of the core Cohere Provider implementation. The diff here might be a bit complicated until the core Cohere implementation is approved.

What this does

Note

This PR adds support for reranking to RubyLLM, enabling semantic reranking of search results and document collections. The initial implementation supports Cohere's Rerank models.

Why add reranking?

Reranking is a standard step in retrieval-augmented generation (RAG) pipelines that bridges retrieval and generation workflows. It's not application-specific logic but rather a foundational communication pattern with LLMs that most users implementing search, recommendation, or RAG systems will need.

Reranking broadly serves multiple common use cases:

  • RAG – Improving context relevance before chat-based text generation
  • Search enhancement – Semantically ranking search results
  • Recommendation systems – Ordering items by relevance to user queries
  • Content curation – Ranking documents, articles, or responses by semantic similarity

Implementation overview

Reranking is implemented primarily through a new RubyLLM::Rerank class, which follows the same pattern as the existing RubyLLM::Embedding implementation.

Note

To make reranking easy to use, a default_rerank_model attribute has been added to Configuration; it defaults to Cohere's rerank-v3.5.

The main entry point is RubyLLM.rerank(...) which delegates to RubyLLM::Rerank.rank(...). This method accepts:

  • query – The search query to evaluate documents against
  • documents – An array of text documents to rerank
  • Optional reranking parameters
    • top_n – Number of top-ranked results to return
    • max_tokens_per_doc – Max tokens per doc for chunking
  • Optional model parameters (as with embeddings and other implementations)
    • model, provider, context, assume_model_exists

The Provider's Reranking interface (e.g., RubyLLM::Providers::Cohere::Reranking) then prepares the API payload from the query, documents, and parameters and submits it.
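
As a rough illustration of the payload preparation step, here is a minimal sketch. The helper name build_rerank_payload is hypothetical and is not taken from this PR, and the field names are assumed to match the request body of Cohere's v2 Rerank endpoint; the actual provider code may differ.

```ruby
require "json"

# Hypothetical helper (not the PR's actual code): builds a request body
# shaped the way Cohere's v2 Rerank endpoint is assumed to expect it.
def build_rerank_payload(query, documents, model: "rerank-v3.5", top_n: nil, max_tokens_per_doc: nil)
  {
    model: model,
    query: query,
    documents: documents,
    top_n: top_n,
    max_tokens_per_doc: max_tokens_per_doc
  }.compact # drop optional parameters that were not supplied
end

payload = build_rerank_payload(
  "How do I handle exceptions in Ruby?",
  ["Ruby uses begin/rescue/end blocks.", "JavaScript has async/await."],
  top_n: 1
)

puts JSON.pretty_generate(payload)
```

Dropping nil values keeps the optional parameters out of the request entirely, so the provider's own defaults apply when the caller does not set them.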

If successful, the response is parsed and a Rerank object is returned with:

  • model – The ID of the model used
  • results – The results of the reranking, returned as an array of RerankResult objects sorted by relevance_score (highest first), each containing:
    • index – The original document index in documents
    • relevance_score – Score between 0 and 1, with 1 being the most relevant
    • document – The document content tested
  • search_units – The billable units returned by providers like Cohere, TogetherAI, etc.
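
The response handling described above could look roughly like the sketch below. The Struct definitions are stand-ins for the PR's RubyLLM::Rerank and RubyLLM::RerankResult classes, parse_rerank_response is a hypothetical name, and the response shape (results plus meta.billed_units.search_units) is an assumption based on Cohere's Rerank API rather than something confirmed by this PR.

```ruby
# Stand-in value objects; the PR's real RubyLLM::Rerank and
# RubyLLM::RerankResult classes may be structured differently.
RerankResult = Struct.new(:index, :relevance_score, :document, keyword_init: true)
Rerank = Struct.new(:model, :results, :search_units, keyword_init: true)

# Hypothetical parser for a Cohere-style response body (already JSON-decoded).
def parse_rerank_response(body, model:, documents:)
  results = body.fetch("results").map do |item|
    RerankResult.new(
      index: item.fetch("index"),
      relevance_score: item.fetch("relevance_score"),
      document: documents[item.fetch("index")] # look the text up by its original position
    )
  end

  Rerank.new(
    model: model,
    results: results.sort_by { |r| -r.relevance_score }, # highest score first
    search_units: body.dig("meta", "billed_units", "search_units")
  )
end

documents = ["Ruby uses begin/rescue/end blocks.", "JavaScript has async/await."]
body = {
  "results" => [
    { "index" => 1, "relevance_score" => 0.03 },
    { "index" => 0, "relevance_score" => 0.89 }
  ],
  "meta" => { "billed_units" => { "search_units" => 1 } }
}

rerank = parse_rerank_response(body, model: "rerank-v3.5", documents: documents)
rerank.results.each { |r| puts format("%.2f  [%d] %s", r.relevance_score, r.index, r.document) }
```

Keeping index alongside document lets callers map each result back to their own records (database rows, search hits) without comparing strings.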

Usage example

# Configure your API key
RubyLLM.configure do |config|
  config.cohere_api_key = ENV['COHERE_API_KEY']
end

# Your search query
query = "How do I handle exceptions in Ruby?"

# Candidate documents to rerank
documents = [
  "Ruby uses begin/rescue/end blocks to handle exceptions, similar to try/catch in other languages.",
  "JavaScript async/await syntax makes handling asynchronous operations much easier.",
  "The raise keyword in Ruby allows you to throw custom exceptions with specific error messages.",
  "Python dictionaries are similar to Ruby hashes but use different syntax for iteration.",
  "Ruby's ensure block always executes, making it perfect for cleanup operations like closing files."
]

# Rerank the documents
RubyLLM.rerank(query, documents)
=>
#<RubyLLM::Rerank:0x0000000122cb0d68
 @model="rerank-v3.5",
 @results=
  [#<RubyLLM::RerankResult:0x0000000122cb0ed0
    @document="Ruby uses begin/rescue/end blocks to handle exceptions, similar to try/catch in other languages.",
    @index=0,
    @relevance_score=0.8865877>,
   #<RubyLLM::RerankResult:0x0000000122cb0e80
    @document="The raise keyword in Ruby allows you to throw custom exceptions with specific error messages.",
    @index=2,
    @relevance_score=0.63750535>,
   #<RubyLLM::RerankResult:0x0000000122cb0e58
    @document="Ruby's ensure block always executes, making it perfect for cleanup operations like closing files.",
    @index=4,
    @relevance_score=0.08234999>,
   #<RubyLLM::RerankResult:0x0000000122cb0e30
    @document="JavaScript async/await syntax makes handling asynchronous operations much easier.",
    @index=1,
    @relevance_score=0.031288605>,
   #<RubyLLM::RerankResult:0x0000000122cb0e08
    @document="Python dictionaries are similar to Ruby hashes but use different syntax for iteration.",
    @index=3,
    @relevance_score=0.027483363>],
 @search_units=1>
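
One way such a result might be consumed in a RAG pipeline is sketched below. The Result struct is a stand-in for the RerankResult objects shown above (populated with the first three results from that output), and the 0.5 cutoff is an arbitrary value chosen for illustration, not a recommendation from this PR.

```ruby
# Stand-in for the RerankResult objects in the output above.
Result = Struct.new(:index, :relevance_score, :document, keyword_init: true)

results = [
  Result.new(index: 0, relevance_score: 0.8865877,
             document: "Ruby uses begin/rescue/end blocks to handle exceptions, similar to try/catch in other languages."),
  Result.new(index: 2, relevance_score: 0.63750535,
             document: "The raise keyword in Ruby allows you to throw custom exceptions with specific error messages."),
  Result.new(index: 4, relevance_score: 0.08234999,
             document: "Ruby's ensure block always executes, making it perfect for cleanup operations like closing files.")
]

MIN_SCORE = 0.5 # arbitrary cutoff, for illustration only

# Keep only the documents the reranker scored as relevant enough,
# then format them as a bulleted context block for a chat prompt.
relevant = results.select { |r| r.relevance_score >= MIN_SCORE }
context = relevant.map { |r| "- #{r.document}" }.join("\n")

puts context
```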

Implementation notes

  • Follows existing RubyLLM patterns for model communication
  • Added detailed documentation in docs/guides/rerank.md
  • Includes comprehensive error handling and response validation
  • Supports all Cohere Rerank model variants (rerank-v3.5, rerank-english-v3.0, rerank-multilingual-v3.0, etc.)
  • Ready to support other reranking providers, such as existing Ollama models or any new provider that offers a Rerank API (e.g., TogetherAI)
  • Maintains consistency with existing RubyLLM conventions
  • Supports context-specific implementations (e.g., rerank_context.rerank(...))
  • Full specs with VCRs

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Performance improvement

Scope check

  • I read the Contributing Guide
  • This aligns with RubyLLM's focus on LLM communication
  • This isn't application-specific logic that belongs in user code
  • This benefits most users, not just my specific use case

Quality check

  • I ran overcommit --install and all hooks pass
  • I tested my changes thoroughly
  • I updated documentation if needed
  • I didn't modify auto-generated files manually (models.json, aliases.json)

API changes

  • Breaking change
  • New public methods/classes
  • Changed method signatures
  • No API changes

Related issues

No related issues.

@infinityrobot infinityrobot mentioned this pull request Jul 7, 2025
@infinityrobot infinityrobot force-pushed the add-cohere-reranking-support branch from 1e2ae2a to 36669f1 on July 18, 2025 06:13