rag-context-precision
Fraction of documents in the retrieved context that are used in / relevant to the answer. Inspired by the metric reported by RAGAS: https://github.com/explodinggradients/ragas/blob/main/src/ragas/metrics/_context_precision.py
Prompt Text
You are a teacher grading a quiz.
You will be given a STUDENT ANSWER and a set of FACTS.
The facts will be separated by a delimiter: {{delimiter}}
Here is the grade criteria to follow:
(1) Look at each fact, using {{delimiter}} to separate the full set of FACTS.
(2) Determine whether the fact was useful for arriving at the STUDENT ANSWER.
(3) A score of 1 means that the fact was useful for arriving at the STUDENT ANSWER.
(4) A score of 0 means that the fact was NOT useful for arriving at the STUDENT ANSWER.
Score:
Return the fraction: the number of facts that were useful for arriving at the STUDENT ANSWER (i.e., scored 1) divided by the total number of FACTS.
Explain your reasoning in a step-by-step manner to ensure your reasoning and conclusion are correct.
Avoid simply stating the correct answer at the outset.
STUDENT ANSWER: {{student_answer}}
FACTS: {{documents}}
Evaluation Results
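The scoring the prompt describes can be sketched in a few lines: fill the template's `{{delimiter}}`, `{{student_answer}}`, and `{{documents}}` slots, collect the judge's per-fact 0/1 usefulness labels, and take the fraction of 1s. This is an illustrative sketch, not the RAGAS implementation; the function names and the `---` delimiter are assumptions.

```python
def build_facts_block(documents: list[str], delimiter: str = "---") -> str:
    """Join retrieved documents into the FACTS block, separated by the delimiter.

    The delimiter here is an assumed default; the prompt leaves it as a template
    variable ({{delimiter}}).
    """
    return f"\n{delimiter}\n".join(documents)


def context_precision(fact_scores: list[int]) -> float:
    """Fraction of retrieved facts the judge scored 1 (useful for the answer).

    fact_scores holds one 0/1 label per fact, in the order the facts were
    presented. An empty retrieval yields 0.0 by convention.
    """
    if not fact_scores:
        return 0.0
    return sum(fact_scores) / len(fact_scores)


# Example: four retrieved facts, three judged useful.
print(context_precision([1, 0, 1, 1]))  # 0.75
```

In practice the 0/1 labels would be parsed out of the judge model's step-by-step response; the fraction itself is just this mean.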
1/28/2026
Overall Score: 2.81/5 (average across all 3 models)
Best Performing Model (Low Confidence): openai:gpt-5-mini, 3.06/5
Rank | Model                        | Score     | adh | cla | com | In tokens | Out tokens | Cost
#1   | openai:gpt-5-mini            | 3.06/5.00 | 2.1 | 4.9 | 1.8 | 1,100     | 2,944      | $0.0062
#2   | google:gemini-2.5-flash-lite | 2.81/5.00 | 2.1 | 4.9 | 1.8 | 1,085     | 2,000      | $0.0009
#3   | anthropic:claude-3-5-haiku   | 2.56/5.00 | 1.7 | 4.6 | 1.3 | 1,225     | 1,119      | $0.0055
Test Case:
