[experiment] generate a single final report instead of separate sections #114

Draft: wants to merge 11 commits into base: vb/evals-and-improvements

vbarda (Contributor) commented Jun 6, 2025

No description provided.

vbarda and others added 11 commits June 6, 2025 15:33
---------

Co-authored-by: Lance Martin
* add evals

* update input formatting

* add prompt caching

* gitignore

* allow returning source string

* add groundedness eval

* add option to summarize search results

* add option to split & rerank webpage chunks

* rename justification -> reasoning

* caching for summarization

* caching for summarization

* separate keys

* retry

* add generate report helper

* update evaluator & add date to system prompt

* add multi-agent helper

* add multi-agent helper

* propagate retrieved source from multi-agent

* fixes

* split evals files & add overall quality eval

* add MCP support (#112)

* bump requirements

* Add Question tool, update prompt

* nits

* Add file for testing

* Update

* rename & make question tool optional

* improve prompts

* Update README

* Improvements

* Minor updates

* Add evaluation script, change config default to sonnet-3-5

* Update tests, config, Anthropic version

* Updates

---------

Co-authored-by: vbarda
Co-authored-by: Vadym Barda
Co-authored-by: Nick Huang
3 participants