Prompt performance, not guesswork
Every prompt is tested across the same models and scored by independent AI judges.
Evaluating across GPT-4, Claude 3.5, and Gemini 1.5
mitchell-compoze
All scores are aggregated using multi-judge consensus (GPT-4o Mini + Claude 3 Haiku).
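The page does not say how the two judges' scores are combined. As a minimal sketch, assuming "consensus" means a simple average of the per-judge scores rounded to one decimal place, the aggregation could look like the following. The function name `consensus_score`, the judge keys, and the example scores are hypothetical and do not come from the page.

```python
from statistics import mean


def consensus_score(judge_scores: dict[str, float]) -> float:
    """Combine per-judge scores into one value.

    Assumption: consensus is the arithmetic mean of the judges' scores,
    rounded to one decimal place. The real aggregation rule is not stated.
    """
    if not judge_scores:
        raise ValueError("at least one judge score is required")
    return round(mean(judge_scores.values()), 1)


# Hypothetical scores from the two judge models for one prompt/model pair.
example = {"gpt-4o-mini": 3.0, "claude-3-haiku": 2.0}
print(consensus_score(example))  # prints 2.5
```

A mean treats both judges equally. A different rule, such as a median or a weighted average, would change the numbers, so this is only an illustration of the idea.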
6 prompts found
Extraction
generate_prompt_nolinks
Best Model: claude-3-5-haiku
Overall: 2.7
Winner: 3.0
Extraction
generate_prompt_gpt4_1
Best Model: gemini-2.5-flash-lite
Overall: 2.6
Winner: 3.8
Extraction
agent_prompt_nolinks
Best Model: claude-3-5-haiku
Overall: 2.5
Winner: 3.1
Extraction
generate_prompt
Best Model: gemini-2.5-flash-lite
Overall: 2.3
Winner: 2.5
Extraction
agent_prompt_gpt4_1
Best Model: gemini-2.5-flash-lite
Overall: 2.1
Winner: 2.7
Extraction
agent_prompt
Best Model: claude-3-5-haiku
Overall: 1.7
Winner: 2.4
