Create evaluation suite for benchmarking various models #15

@nikiburggraf

Description

Related to: #2

  • Define what we need from an evaluation suite
  • Collect the metrics we use to benchmark the various models with this evaluation suite
  • Update the deepeval evaluation code to use the modular retrieval / generation code
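
To make the third item concrete, here is a rough sketch of how an evaluation loop could take the retrieval and generation steps as pluggable callables, so the same suite can benchmark different models. It does not use the deepeval or TruLens APIs. Every name in it (`EvalCase`, `run_suite`, `retrieval_hit`, `exact_match`) is hypothetical and not taken from this repository, and the two metrics are placeholders for whatever metrics we end up collecting.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """One benchmark item (hypothetical schema, not from the repo)."""
    question: str
    expected_answer: str
    relevant_doc: str


# Pluggable pipeline stages: swap these to benchmark a different model.
Retriever = Callable[[str], List[str]]        # question -> retrieved contexts
Generator = Callable[[str, List[str]], str]   # question, contexts -> answer
Metric = Callable[[EvalCase, List[str], str], float]


def retrieval_hit(case: EvalCase, contexts: List[str], answer: str) -> float:
    """Placeholder metric: 1.0 if the known relevant document was retrieved."""
    return 1.0 if case.relevant_doc in contexts else 0.0


def exact_match(case: EvalCase, contexts: List[str], answer: str) -> float:
    """Placeholder metric: case-insensitive exact match on the answer."""
    same = answer.strip().lower() == case.expected_answer.strip().lower()
    return 1.0 if same else 0.0


def run_suite(
    cases: List[EvalCase],
    retriever: Retriever,
    generator: Generator,
    metrics: Dict[str, Metric],
) -> Dict[str, float]:
    """Run every case through retrieval and generation, then average each metric."""
    if not cases:
        raise ValueError("run_suite needs at least one evaluation case")
    totals = {name: 0.0 for name in metrics}
    for case in cases:
        contexts = retriever(case.question)
        answer = generator(case.question, contexts)
        for name, metric in metrics.items():
            totals[name] += metric(case, contexts, answer)
    return {name: total / len(cases) for name, total in totals.items()}
```

With this shape, comparing models would mean calling `run_suite` once per retriever/generator pair with the same cases and metrics, and a deepeval- or TruLens-backed metric could be wrapped to fit the `Metric` signature.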

Metadata

Labels

SL1: Evaluation

We expect to use TruLens for measuring performance of different RAG metrics
