Prompt performance, not guesswork
Every prompt tested across the same models. Scored by independent AI judges.
Evaluating across GPT-4, Claude 3.5, and Gemini 1.5
souzatharsis
All scores are aggregated using multi-judge consensus (GPT-4o Mini + Claude 3 Haiku).
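The page does not say how the judges' scores are combined, so the following is only a minimal sketch of one plausible reading of "multi-judge consensus": an unweighted mean of the per-judge scores. The function name `consensus_score`, the judge keys, and the sample values are hypothetical and not taken from the site.

```python
from statistics import mean


def consensus_score(judge_scores: dict[str, float]) -> float:
    """Combine per-judge scores into one value.

    Assumption: consensus is an unweighted mean. The actual
    aggregation rule used by the site is not documented here.
    """
    if not judge_scores:
        raise ValueError("at least one judge score is required")
    return mean(list(judge_scores.values()))


# Hypothetical scores for illustration only; not real results.
example = {"judge_a": 2.0, "judge_b": 1.0}
print(consensus_score(example))
```

A mean treats every judge equally; if the judges were known to differ in reliability, a weighted mean or a median would be a reasonable alternative.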
3 prompts found
Summarization
podcastfy_multimodal_cleanmarkup
Best Model: gemini-2.5-flash-lite
Overall: 1.8
Winner: 2.0
Summarization
podcastfy_multimodal
Best Model: gemini-2.5-flash-lite
Overall: 1.8
Winner: 1.9
Summarization
podcastfy_longform
Best Model: claude-3-5-haiku
Overall: 1.7
Winner: 2.1
