self-rag-answer-grader
The self-rag-answer-grader evaluates whether a generated answer addresses a specific question, returning a binary score of 'yes' or 'no'. This tool is useful for assessing the quality and accuracy of automated responses in applications such as chatbots or educational platforms.
Prompt Text
You are a grader assessing whether an answer addresses / resolves a question.
Give a binary score 'yes' or 'no'. 'Yes' means that the answer resolves the question.
User question: {question}
LLM generation: {generation}

Evaluation Results
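The prompt above can be wired into a small grading helper. The sketch below is illustrative: `grade_answer`, `parse_binary_score`, and the `call_llm` callable are hypothetical names, and the lambda stands in for a real model client.

```python
# Minimal sketch of running the answer grader. `call_llm` is a hypothetical
# stand-in for your model client (e.g. a function wrapping a chat API call).
GRADER_PROMPT = (
    "You are a grader assessing whether an answer addresses / resolves a question.\n"
    "Give a binary score 'yes' or 'no'. 'Yes' means that the answer resolves "
    "the question.\n"
    "User question: {question}\n"
    "LLM generation: {generation}"
)


def parse_binary_score(raw: str) -> bool:
    """Map the model's free-text reply onto the binary 'yes'/'no' score."""
    verdict = raw.strip().lower()
    if verdict.startswith("yes"):
        return True
    if verdict.startswith("no"):
        return False
    raise ValueError(f"Unexpected grader output: {raw!r}")


def grade_answer(question: str, generation: str, call_llm) -> bool:
    """Format the grader prompt, send it to the model, and parse the verdict."""
    prompt = GRADER_PROMPT.format(question=question, generation=generation)
    return parse_binary_score(call_llm(prompt))


if __name__ == "__main__":
    # Stand-in model that always replies "yes", just to show the call shape.
    verdict = grade_answer(
        "What is the capital of France?",
        "The capital of France is Paris.",
        call_llm=lambda prompt: "yes",
    )
    print(verdict)  # True
```

In practice the parsing step matters: models sometimes pad the verdict with extra words, so matching only the leading token (and raising on anything else) keeps the binary contract strict.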
1/28/2026
Overall Score: 3.78/5 (average across all 3 models)
Best Performing Model: openai:gpt-5-mini, 4.82/5 (Low Confidence)
| Rank | Model                        | Score     | adh | cla | com | In tokens | Out tokens | Cost    |
|------|------------------------------|-----------|-----|-----|-----|-----------|------------|---------|
| #1   | openai:gpt-5-mini            | 4.82/5.00 | 4.9 | 4.9 | 4.7 | 270       | 1,970      | $0.0040 |
| #2   | google:gemini-2.5-flash-lite | 4.82/5.00 | 4.9 | 4.9 | 4.7 | 255       | 5          | $0.0000 |
| #3   | anthropic:claude-3-5-haiku   | 1.69/5.00 | 0.8 | 3.4 | 0.9 | 320       | 553        | $0.0025 |
Test Case:
