What's Changed
- refactor: improve metrics code quality by @anistark in #2337
- chore: remove old analytics by @jjmachan in #2338
- Fix/query distribution robustness by @yatoyun in #2340
- Simplify earlier how-to guides in docs by @sanjeed5 in #2319
- docs: reorganize prompt evaluation guides in navigation by @sanjeed5 in #2346
- Metrics migration, migrate rouge + answer relevance by @rhlbhatnagar in #2335
- fix: streamline theme extraction from overlaps in MultiHopSpecificQue… by @kenzoyan in #2347
- Test/metric new compare by @anistark in #2349
- feat: bleu score migrated to collections by @anistark in #2352
- fix: Add List[List[str]] formats for overlapped items in theme extraction (Continuation in #2347) by @kenzoyan in #2355
- feat: string metrics migrated to collections by @anistark in #2356
- feat: answer similarity migrated to collections by @anistark in #2358
- fix: add missing props token_usage_parser for test generation methods #2359 by @bhkj9999 in #2360
- feat: add bypass_n option to LangchainLLMWrapper for n-completion control by @SimFG in #2354
- docs: Add how-to guide for aligning LLM-as-Judge by @sanjeed5 in #2348
New Contributors
- @yatoyun made their first contribution in #2340
- @kenzoyan made their first contribution in #2347
- @bhkj9999 made their first contribution in #2360
- @SimFG made their first contribution in #2354
Full Changelog: v0.3.6...v0.3.7