
feat(rubric-auto-grading): enhance usage experience #8009


Open · wants to merge 4 commits into master from ncduy0303/update-rubric-grading

Conversation

@ncduy0303 (Contributor) commented Jul 2, 2025

Description

  1. Any thoughts on how to deal with multiple AI-generated messages?
  2. When a student submits a blank answer for this question, the AI assigned a full-marks rubric (ping me if you need to view this question).
  3. Even when a student answers correctly and is assigned full marks, the AI still comments on how to improve, so as a grader I end up editing the response to omit the improvement part.
  4. If the AI fails to assign a rubric, should we just leave it blank, or silently re-request to ensure a graded state?

Changes made

  1. Update the latest unpublished draft instead of creating a new one.
  2. Update the user prompt to wrap each section in an open and close tag. This helps the LLM recognise blank sections and differentiate the sections better (see the prompt sketch after this list).
  3. Tweak the system prompt slightly to compliment answers when they are good, and remove the feedback part from the schema description. Note: the "custom prompts" feature might be used to further tweak this behaviour to some extent.
  4. Add a retry loop to chat with the OpenAI LLM again if parsing fails (see the Ruby sketch after this list):
    • OutputFixingParser already has a retry mechanism that sends a separate message asking the LLM to fix the output according to the schema. This is now done only after retrying with the OpenAI LLM fails, since that fix-up message would not include the question and rubric content.
    • The schema is now dynamically generated with enum constraints on the selected criterion for each category. This makes the LLM input and output longer, but it ensures that passing the parser means the rubric grading is valid: there are no non-existent category/criterion IDs, and each output criterion belongs to its category.
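For illustration, the tagged user prompt might look like the sketch below; the tag names and placeholders are assumptions, not the actual contents of rubric_auto_grading_user_prompt.json. An empty pair such as `<student_answer></student_answer>` makes a blank submission explicit to the LLM.

```text
<question>
{question_text}
</question>

<rubric>
{rubric_text}
</rubric>

<student_answer>
{student_answer}
</student_answer>
```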
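Below is a minimal Ruby sketch of the dynamic schema and the retry loop, assuming langchainrb's StructuredOutputParser and OutputFixingParser. The association names (`categories`, `criterions`), the retry limit, and `@llm` are assumptions; only `call_llm_with_retries`, `dynamic_schema`, `output_parser`, and the `category_grades` object map come from the actual change.

```ruby
MAX_LLM_RETRIES = 2 # assumed limit

# One enum-constrained entry per rubric category, so the parser rejects
# non-existent category/criterion IDs and criteria from the wrong category.
def dynamic_schema_for(question)
  category_properties = question.categories.each_with_object({}) do |category, props|
    props[category.id.to_s] = {
      type: 'object',
      properties: {
        criterion_id: { type: 'integer', enum: category.criterions.map(&:id) },
        explanation: { type: 'string' }
      },
      required: ['criterion_id']
    }
  end
  {
    type: 'object',
    properties: {
      # category_grades is an object map keyed by category ID, not an array.
      category_grades: {
        type: 'object',
        properties: category_properties,
        required: category_properties.keys
      }
    },
    required: ['category_grades']
  }
end

# output_parser is assumed to be built via
# Langchain::OutputParsers::StructuredOutputParser.from_json_schema(dynamic_schema).
def call_llm_with_retries(messages, dynamic_schema, output_parser)
  attempts = 0
  begin
    completion = @llm.chat(messages: messages).chat_completion
    output_parser.parse(completion) # raises if the output violates the schema
  rescue Langchain::OutputParsers::OutputParserException
    attempts += 1
    retry if attempts <= MAX_LLM_RETRIES # a full retry keeps the question and rubric in context

    # Last resort: OutputFixingParser re-asks the LLM to repair the previous
    # output against the schema alone, without the question and rubric.
    Langchain::OutputParsers::OutputFixingParser
      .from_llm(llm: @llm, parser: output_parser)
      .parse(completion)
  end
end
```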

@ncduy0303 force-pushed the ncduy0303/update-rubric-grading branch 2 times, most recently from 335b011 to 65f1e5f on July 2, 2025 05:56
@ncduy0303 requested a review from Copilot and removed the request for Copilot July 2, 2025 06:12

@ncduy0303 force-pushed the ncduy0303/update-rubric-grading branch from 65f1e5f to 33f816a on July 2, 2025 07:54
@ncduy0303 force-pushed the ncduy0303/update-rubric-grading branch 2 times, most recently from a861095 to 7dd1d22 on July 4, 2025 04:08
@ncduy0303 requested a review from Copilot July 4, 2025 04:10
@Copilot (Copilot AI) left a comment


Pull Request Overview

This PR enhances the rubric auto-grading flow by introducing dynamic schema generation, retry logic, draft-post updates, and prompt formatting improvements to better handle blank or multiple AI messages.

  • Refactor RubricLlmService to build a dynamic JSON schema, parse LLM responses with retries, and process category grades
  • Update the auto-grading service to find/update existing AI-generated draft posts instead of always creating new ones
  • Add tags in the user prompt, adjust system prompt wording, and tweak stub behavior for testing

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Summary per file:

  • spec/support/stubs/langchain/llm_stubs.rb: Fix stub to parse dynamic schema and handle grading output
  • spec/services/.../rubric_llm_service_spec.rb: Update symbol-key expectations and introduce output_parser
  • spec/services/.../rubric_auto_grading_service_spec.rb: Adjust tests for selection updates and add draft-post specs
  • client/app/.../reducers/topics.js: Prevent duplicate post IDs in Redux state
  • app/views/.../_rubric_based_response.json.jbuilder: Filter for AI-generated draft posts only
  • app/services/.../rubric_llm_service.rb: Generate dynamic schema, add retry logic, and process grades
  • app/services/.../rubric_auto_grading_service.rb: Fix class inheritance, update draft-post creation logic
  • app/services/.../prompts/rubric_auto_grading_user_prompt.json: Add XML-style tags around prompt sections
  • app/services/.../prompts/rubric_auto_grading_system_prompt.json: Remove unused format_instructions variable
  • app/services/.../prompts/rubric_auto_grading_output_format.json: Change category_grades from an array to an object map
Comments suppressed due to low confidence (3)

app/services/course/assessment/answer/rubric_llm_service.rb:41

  • There's a typo in the variable name llm_reponse; it should be llm_response for consistency and clarity.
    llm_reponse = call_llm_with_retries(messages, dynamic_schema, output_parser)

spec/services/course/assessment/answer/rubric_auto_grading_service_spec.rb:75

  • The tests for process_llm_grading_response return values (correct status, total grade, messages, feedback) were removed. Re-introduce tests to ensure that the service returns the expected tuple.
        end
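A hedged sketch of what a re-introduced spec could assert, assuming process_llm_grading_response still returns that four-part tuple; the fixtures, argument list, and expected values here are hypothetical:

```ruby
it 'returns the correct status, total grade, messages, and feedback' do
  # `answer` and `llm_response` are hypothetical spec fixtures.
  correct, grade, messages, feedback =
    subject.send(:process_llm_grading_response, answer, llm_response)

  expect(correct).to be(true)
  expect(grade).to eq(expected_total_grade) # hypothetical expected value
  expect(messages).not_to be_empty
  expect(feedback).to be_present
end
```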

app/services/course/assessment/answer/rubric_auto_grading_service.rb:2

  • The superclass has been removed from the class definition, causing a syntax error. It should remain < Course::Assessment::Answer::AutoGradingService before the rubocop disable comment.
class Course::Assessment::Answer::RubricAutoGradingService < # rubocop:disable Metrics/ClassLength
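That is, the corrected definition would read:

```ruby
class Course::Assessment::Answer::RubricAutoGradingService < Course::Assessment::Answer::AutoGradingService # rubocop:disable Metrics/ClassLength
```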

@ncduy0303 force-pushed the ncduy0303/update-rubric-grading branch from 7dd1d22 to e0d6e4b on July 4, 2025 05:19