Prompt performance, not guesswork
Every prompt tested across the same models. Scored by independent AI judges.
Evaluating across GPT-4, Claude 3.5, and Gemini 1.5
All scores are aggregated using multi-judge consensus (GPT-4o Mini + Claude 3 Haiku).
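The page does not say how the judges' scores are combined, so the following is only a minimal sketch of one plausible scheme: each judge's score for a model is averaged into a consensus score, "Overall" is taken as the mean across models, and "Winner" as the best model's consensus score. The function names, the averaging rule, and the reading of "Overall" and "Winner" are all assumptions for illustration, not the site's documented method.

```python
from statistics import mean


def consensus_score(judge_scores: dict[str, float]) -> float:
    """Combine one model's per-judge scores into a single score.

    Assumption: consensus is a plain arithmetic mean, rounded to one decimal.
    """
    if not judge_scores:
        raise ValueError("at least one judge score is required")
    return round(mean(list(judge_scores.values())), 1)


def summarize_prompt(scores_by_model: dict[str, dict[str, float]]) -> dict:
    """Build a card-style summary for one prompt.

    Assumption: "overall" is the mean consensus score across models and
    "winner" is the consensus score of the best-scoring model.
    """
    if not scores_by_model:
        raise ValueError("at least one model is required")
    per_model = {
        model: consensus_score(judges)
        for model, judges in scores_by_model.items()
    }
    best_model = max(per_model, key=per_model.get)
    return {
        "best_model": best_model,
        "overall": round(mean(list(per_model.values())), 1),
        "winner": per_model[best_model],
    }
```

Called with hypothetical data such as `{"model-a": {"judge-1": 4.0, "judge-2": 3.0}, "model-b": {"judge-1": 3.0, "judge-2": 2.0}}`, this yields a best model of `model-a`, an overall score of 3.0, and a winner score of 3.5. A median or a disagreement-weighted scheme would be an equally reasonable choice when judges diverge.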
6 prompts found
Summarization
pre-top-3-summarization
Best Model: gpt-5-mini
Overall: 3.3
Winner: 3.5

Summarization
win-place-ci
Best Model: claude-3-5-haiku
Overall: 3.1
Winner: 3.8

Summarization
forecast-quinella-win-ci
Best Model: claude-3-5-haiku
Overall: 2.9
Winner: 3.2

Summarization
pre-next-5-summarization
Best Model: claude-3-5-haiku
Overall: 2.8
Winner: 3.1

Summarization
pre-reflection-summary
Best Model: claude-3-5-haiku
Overall: 2.7
Winner: 4.1

Summarization
reborn
Best Model: gemini-2.5-flash-lite
Overall: 2.5
Winner: 3.7
