Releases: Ontotext-AD/graphrag-eval

5.2.0

21 Oct 14:16

Statnett-240: Compare SPARQL results with duplicated binding values
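
A minimal sketch of what duplicate-aware comparison means for SELECT results: bindings are compared as multisets, so a row that appears twice in one result must also appear twice in the other. The function name and binding structure below are illustrative assumptions, not the package's actual API.

    from collections import Counter

    def bindings_equal(actual, expected):
        """Compare two lists of SPARQL JSON bindings as multisets,
        so duplicated rows must also match in count."""
        def freeze(row):
            # Reduce each binding row to a hashable, order-independent form.
            return frozenset((var, (b.get("type"), b.get("value")))
                             for var, b in row.items())
        return Counter(map(freeze, actual)) == Counter(map(freeze, expected))

    actual = [{"x": {"type": "literal", "value": "1"}},
              {"x": {"type": "literal", "value": "1"}}]
    expected = [{"x": {"type": "literal", "value": "1"}}]
    print(bindings_equal(actual, expected))  # False: the duplicate row matters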

5.1.2

06 Oct 11:59

Bug fixes

Statnett-142: Cast micro and macro aggregation statistics to dict so that YAML serialization works as expected
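
A minimal sketch of the underlying issue: yaml.safe_dump cannot represent dict subclasses such as collections.Counter or defaultdict, so aggregation statistics are converted to plain dicts before serialization. The variable name below is an illustrative assumption, not the package's actual code.

    from collections import Counter
    import yaml

    micro_stats = Counter({"answer_correctness": 3, "answer_relevance": 2})

    # yaml.safe_dump(micro_stats)             # raises yaml.representer.RepresenterError
    print(yaml.safe_dump(dict(micro_stats)))  # serializes as expected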

5.1.1

06 Oct 06:58

Bug fixes

Statnett-142: Fix a bug in the calculation of aggregated results when the output of a DESCRIBE or CONSTRUCT query contains the string "results"

5.1.0

30 Sep 09:56

Bug fixes

TTYG-126: Fix ragas errors caused by incompatibilities between some libraries
TTYG-126: Rename the openai extra and dependency group to ragas

5.0.2

26 Sep 16:21
75cdbbe

Bug fixes

TTYG-130: Include the prompts in the package distribution

5.0.1

26 Sep 14:19

Bug fixes

The openai dependency is now an optional extra of the package
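
For example, the OpenAI-dependent features should then be installable with pip install "graphrag-eval[openai]" (assuming the distribution name graphrag-eval; the extra and dependency group were later renamed to ragas in 5.1.0, listed above).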

5.0.0

25 Sep 12:52

New features

  • TTYG-118: Retrieval correctness without reference
  • TTYG-119: Retrieval correctness using reference context texts
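
One generic way to read "retrieval correctness using reference context texts" is as overlap between retrieved and reference contexts, e.g. context recall. The sketch below illustrates that idea only; it is not necessarily the formulation implemented in graphrag-eval.

    def context_recall(retrieved_contexts, reference_contexts):
        # Fraction of reference context texts found among the retrieved ones.
        retrieved = set(retrieved_contexts)
        hits = sum(1 for ref in reference_contexts if ref in retrieved)
        return hits / len(reference_contexts) if reference_contexts else 0.0

    print(context_recall(
        ["Oslo is the capital of Norway.", "Norway borders Sweden."],
        ["Oslo is the capital of Norway."],
    ))  # 1.0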

Bug fixes

  • TTYG-127: Refactor the SPARQL results comparison
  • Statnett-217: Rename key 'steps' to 'actual_steps' in the source code

4.0.0

19 Sep 09:05

Version 4.0.0: First Release of graphrag-eval

This update includes major structural changes and feature additions in preparation for the first official release.

Project Updates

  • Version bumped to 4.0.0
  • Project structure updated with folder renaming and import fixes

Features

Answer Relevance Evaluation

  • Integrated LangEvals for improved relevance evaluation
  • Added exception handling and error reporting in evaluation results
  • Introduced system and unit tests to validate evaluations
  • Extended output with fields: answer_relevance, answer_relevance_cost, answer_relevance_error
  • Added aggregated metrics for relevance evaluation, including micro/macro statistics
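
The micro/macro distinction can be illustrated generically: micro statistics pool all per-question scores, while macro statistics first average within each group and then across groups. This is a sketch of the concept only, not the package's actual aggregation code, and grouping by question template is an assumed example.

    from statistics import mean

    scores_by_template = {
        "template_a": [1.0, 0.0, 1.0],
        "template_b": [1.0],
    }

    micro = mean(s for scores in scores_by_template.values() for s in scores)
    macro = mean(mean(scores) for scores in scores_by_template.values())
    print(micro, macro)  # 0.75 vs. ~0.83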

Answer Correctness Evaluation

  • Refined evaluation output fields (answer_eval_reason → answer_correctness_reason)
  • Added CLI support for evaluation with input/output file paths
  • Implemented flattened and aggregated evaluation metrics
  • Added support for claim-based metrics (answer_*_claims_count)
  • Improved YAML output examples and aggregate key documentation

SPARQL Result Comparison

  • Major refactoring and optimization of compare_sparql_results
  • Reworked SPARQL evaluation for efficiency and accuracy

Improvements

  • Extensive README enhancements with clarified input/output formats
  • Expanded documentation of aggregates, error handling, and evaluation examples
  • Added installation instructions with OpenAI extras
  • Improved explanations of relevance/correctness evaluation, costs, and result interpretation

Refactoring

  • Modularization of evaluation code (steps and answer evaluation separated)
  • Aggregation logic moved to dedicated modules
  • Import logic refactored to avoid unnecessary OpenAI dependencies
  • Standardized naming conventions for variables, keys, and test data

Testing & CI

  • Added system tests to run after PRs and before releases
  • Expanded unit tests for evaluation functions, error handling, and aggregation
  • Implemented mocking and dependency separation for tests involving OpenAI

Bug Fixes

  • Fixed aggregation key errors when no steps are available
  • Fixed relevance result property access in LangEvals
  • Corrected type hints, key naming mismatches, and parsed value checks
  • Multiple fixes in README formatting (indentation, anchors, examples)
  • Fixed duplicate or inconsistent test cases

3.0.0

16 Jul 12:39
4821d1f

2.2.0

01 Jul 06:36
76930f8

Relax the dependency version definitions