Prompt performance, not guesswork
Every prompt is tested across the same models and scored by independent AI judges.
Evaluating across GPT-4, Claude 3.5, and Gemini 1.5
albint3r
All scores are aggregated using multi-judge consensus (GPT-4o Mini + Claude 3 Haiku).
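The page does not say how the multi-judge consensus is computed. As a minimal sketch, assuming the consensus is an unweighted mean of each judge's score rounded to one decimal place, the aggregation could look like the following. The function name `consensus_score`, the judge keys, and the sample scores are hypothetical and are not taken from the site.

```python
from statistics import mean


def consensus_score(judge_scores: dict[str, float]) -> float:
    """Combine per-judge scores into one value.

    Assumption: the consensus is an unweighted mean, rounded to one
    decimal place. The real aggregation rule is not documented here.
    """
    if not judge_scores:
        raise ValueError("at least one judge score is required")
    return round(mean(judge_scores.values()), 1)


# Hypothetical scores from the two judges for a single model response.
example = {"gpt-4o-mini": 3.0, "claude-3-haiku": 4.0}
print(consensus_score(example))
```

A weighted mean or a median would be equally plausible choices; the mean is used here only because it is the simplest rule that fits the word "aggregated".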
4 prompts found
Summarization
extract_contact_information
Best Model: gpt-5-mini
Overall: 3.3
Winner: 3.8
Extraction
recover_conversation
Best Model: gpt-5-mini
Overall: 3.1
Winner: 4.3
Summarization
analizer-json-format3
Best Model: gpt-5-mini
Overall: 2.7
Winner: 3.2
Extraction
extra_contact_key_info
Best Model: claude-3-5-haiku
Overall: 2.2
Winner: 2.6