rag-context-precision
Fraction of documents in the retrieved context that are used in / relevant to the answer. Inspired by the metric reported by RAGAS: https://github.com/explodinggradients/ragas/blob/main/src/ragas/metrics/_context_precision.py
Prompt Text
You are a teacher grading a quiz.
You will be given a STUDENT ANSWER and a set of FACTS.
The facts will be separated by a delimiter: {{delimiter}}
Here is the grade criteria to follow:
(1) Look at each fact, using {{delimiter}} to separate the full set of FACTS.
(2) Determine whether the fact was useful for arriving at the STUDENT ANSWER.
(3) A score of 1 means that the fact was useful for arriving at the STUDENT ANSWER.
(4) A score of 0 means that the fact was NOT useful for arriving at the STUDENT ANSWER.
Score:
Return the fraction: the number of facts that were useful for arriving at the STUDENT ANSWER (i.e., scored 1) divided by the total number of FACTS.
Explain your reasoning in a step-by-step manner to ensure your reasoning and conclusion are correct.
Avoid simply stating the correct answer at the outset.
STUDENT ANSWER: {{student_answer}}
FACTS: {{documents}}
Evaluation Results
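The scoring the prompt describes can be sketched in a few lines: fill the template's `{{delimiter}}`, `{{student_answer}}`, and `{{documents}}` slots, collect the judge's per-fact 0/1 usefulness labels, and take the fraction of 1s. This is an illustrative sketch, not the RAGAS implementation; the function names and the `---` delimiter are assumptions.

```python
def build_facts_block(documents: list[str], delimiter: str = "---") -> str:
    """Join retrieved documents into the FACTS block, separated by the delimiter.

    The delimiter here is an assumed default; the prompt leaves it as a template
    variable ({{delimiter}}).
    """
    return f"\n{delimiter}\n".join(documents)


def context_precision(fact_scores: list[int]) -> float:
    """Fraction of retrieved facts the judge scored 1 (useful for the answer).

    fact_scores holds one 0/1 label per fact, in the order the facts were
    presented. An empty retrieval yields 0.0 by convention.
    """
    if not fact_scores:
        return 0.0
    return sum(fact_scores) / len(fact_scores)


# Example: four retrieved facts, three judged useful.
print(context_precision([1, 0, 1, 1]))  # 0.75
```

In practice the 0/1 labels would be parsed out of the judge model's step-by-step response; the fraction itself is just this mean.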
1/28/2026
Overall Score: 2.81/5 (average across all 3 models)
Best Performing Model (Low Confidence): openai:gpt-5-mini, 3.06/5
Rank | Model                        | Score     | adh | cla | com | In tokens | Out tokens | Cost
#1   | openai:gpt-5-mini            | 3.06/5.00 | 2.1 | 4.9 | 1.8 | 1,100     | 2,944      | $0.0062
#2   | google:gemini-2.5-flash-lite | 2.81/5.00 | 2.1 | 4.9 | 1.8 | 1,085     | 2,000      | $0.0009
#3   | anthropic:claude-3-5-haiku   | 2.56/5.00 | 1.7 | 4.6 | 1.3 | 1,225     | 1,119      | $0.0055
Test Case:
