self-ask-with-search
The self-ask-with-search prompt breaks a complex question into follow-up questions, pairs each with an intermediate answer, and then combines those answers into a concise final answer. It is particularly useful for multi-hop questions that compare entities or trace a relationship between them, as in the few-shot examples in the prompt text.
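The loop this prompt implies can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular framework: `parse_step`, `run_self_ask`, and the `llm` and `search` callables are hypothetical names introduced here, and the sketch assumes the model's reply ends with either a "Follow up:" line or a "So the final answer is:" line, as in the few-shot examples.

```python
from typing import Callable, Optional, Tuple

FOLLOW_UP = "Follow up:"
FINAL = "So the final answer is:"


def parse_step(completion: str) -> Tuple[str, str]:
    """Classify a completion by its last line: follow-up question or final answer."""
    last_line = completion.strip().split("\n")[-1]
    if FINAL in last_line:
        return "final", last_line.split(FINAL, 1)[1].strip()
    if FOLLOW_UP in last_line:
        return "follow_up", last_line.split(FOLLOW_UP, 1)[1].strip()
    raise ValueError(f"Unrecognized completion: {completion!r}")


def run_self_ask(
    question: str,
    llm: Callable[[str], str],      # hypothetical: prompt text -> completion text
    search: Callable[[str], str],   # hypothetical: follow-up question -> answer
    template: str,                  # must contain {input} and {agent_scratchpad}
    max_steps: int = 5,
) -> Optional[str]:
    """Alternate model calls and searches until a final answer appears."""
    scratchpad = ""
    for _ in range(max_steps):
        prompt = template.format(input=question, agent_scratchpad=scratchpad)
        completion = llm(prompt)
        kind, text = parse_step(completion)
        if kind == "final":
            return text
        # Record the follow-up and its searched answer so the next
        # model call sees the full chain of reasoning so far.
        scratchpad += f"{completion}\nIntermediate answer: {search(text)}\n"
    return None  # no final answer within max_steps
```

In this sketch the scratchpad grows by one follow-up and one intermediate answer per iteration, so every model call sees the whole exchange so far. The `max_steps` cap is a design choice added here to stop a model that never emits a final answer.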
Prompt Text
Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
Question: When was the founder of craigslist born?
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
Question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
Question: Are both the directors of Jaws and Casino Royale from the same country?
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate answer: New Zealand.
So the final answer is: No
Question: {input}
Are followup questions needed here:{agent_scratchpad}
Evaluation Results
1/28/2026
Overall Score: 1.77/5 (average across all 3 models)
Best Performing Model: openai:gpt-5-mini, 2.53/5 (Low Confidence)

Model                         Rank  Score (/5.00)  adh  cla  com  In     Out    Cost
openai:gpt-5-mini             #1    2.53           1.2  3.7  1.2  1,890  3,219  $0.0069
anthropic:claude-3-5-haiku    #2    1.71           1.2  3.0  0.8  2,250  231    $0.0027
google:gemini-2.5-flash-lite  #3    1.08           0.7  2.0  0.7  2,030  492    $0.0004
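As a quick consistency check, the overall score can be recomputed from the three per-model scores listed above. The sketch below assumes the overall score is an unweighted mean, which is how "average across all 3 models" reads; the variable names are illustrative only.

```python
# Per-model scores (out of 5) as reported in the evaluation results.
scores = {
    "openai:gpt-5-mini": 2.53,
    "anthropic:claude-3-5-haiku": 1.71,
    "google:gemini-2.5-flash-lite": 1.08,
}

# Unweighted mean across models (assumed aggregation).
overall = sum(scores.values()) / len(scores)

# Model with the highest score.
best = max(scores, key=scores.get)

print(f"{overall:.2f}/5", best)
```

Rounded to two decimals, the mean matches the reported 1.77/5, and the top-scoring model matches the one listed as best performing.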
Test Case:
