self-rag-answer-grader
The self-rag-answer-grader evaluates whether a generated answer addresses a specific question, returning a binary score of 'yes' or 'no'. This tool is useful for assessing the quality and accuracy of automated responses in applications such as chatbots or educational platforms.
Prompt Text
You are a grader assessing whether an answer addresses / resolves a question.
Give a binary score 'yes' or 'no'. 'Yes' means that the answer resolves the question.
User question: {question}
LLM generation: {generation}

Evaluation Results
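The prompt above can be wired into a small grading helper. The sketch below is illustrative: `grade_answer`, `parse_binary_score`, and the `call_llm` callable are hypothetical names, and the lambda stands in for a real model client.

```python
# Minimal sketch of running the answer grader. `call_llm` is a hypothetical
# stand-in for your model client (e.g. a function wrapping a chat API call).
GRADER_PROMPT = (
    "You are a grader assessing whether an answer addresses / resolves a question.\n"
    "Give a binary score 'yes' or 'no'. 'Yes' means that the answer resolves "
    "the question.\n"
    "User question: {question}\n"
    "LLM generation: {generation}"
)


def parse_binary_score(raw: str) -> bool:
    """Map the model's free-text reply onto the binary 'yes'/'no' score."""
    verdict = raw.strip().lower()
    if verdict.startswith("yes"):
        return True
    if verdict.startswith("no"):
        return False
    raise ValueError(f"Unexpected grader output: {raw!r}")


def grade_answer(question: str, generation: str, call_llm) -> bool:
    """Format the grader prompt, send it to the model, and parse the verdict."""
    prompt = GRADER_PROMPT.format(question=question, generation=generation)
    return parse_binary_score(call_llm(prompt))


if __name__ == "__main__":
    # Stand-in model that always replies "yes", just to show the call shape.
    verdict = grade_answer(
        "What is the capital of France?",
        "The capital of France is Paris.",
        call_llm=lambda prompt: "yes",
    )
    print(verdict)  # True
```

In practice the parsing step matters: models sometimes pad the verdict with extra words, so matching only the leading token (and raising on anything else) keeps the binary contract strict.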
1/28/2026
Overall Score: 3.78/5 (average across all 3 models)
Best Performing Model: openai:gpt-5-mini, 4.82/5 (Low Confidence)
| Rank | Model                        | Score     | adh | cla | com | In tokens | Out tokens | Cost    |
|------|------------------------------|-----------|-----|-----|-----|-----------|------------|---------|
| #1   | openai:gpt-5-mini            | 4.82/5.00 | 4.9 | 4.9 | 4.7 | 270       | 1,970      | $0.0040 |
| #2   | google:gemini-2.5-flash-lite | 4.82/5.00 | 4.9 | 4.9 | 4.7 | 255       | 5          | $0.0000 |
| #3   | anthropic:claude-3-5-haiku   | 1.69/5.00 | 0.8 | 3.4 | 0.9 | 320       | 553        | $0.0025 |
Test Case:
