Prompt performance, not guesswork
Every prompt tested across the same models. Scored by independent AI judges.
Evaluating across GPT-4, Claude 3.5, and Gemini 1.5
All scores are aggregated using multi-judge consensus (GPT-4o Mini + Claude 3 Haiku).
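As a rough illustration of what multi-judge aggregation could look like, here is a minimal sketch. The page does not state the actual consensus rule, so the unweighted mean used here is an assumption. The reading of "Overall" as the average across models and "Winner" as the best model's score is also an assumption. The model names, judge names, and scores are hypothetical and are not taken from the leaderboard.

```python
from statistics import mean

# Hypothetical per-judge scores (1-5 scale) for one prompt, keyed by model.
# These values are illustrative only; they do not come from the leaderboard.
judge_scores = {
    "model-a": {"judge-1": 5.0, "judge-2": 4.0},
    "model-b": {"judge-1": 3.0, "judge-2": 4.0},
}


def consensus(scores_by_judge):
    """Assumed consensus rule: unweighted mean of the judges' scores."""
    return mean(list(scores_by_judge.values()))


def summarize(scores):
    """Collapse judge scores into per-prompt summary fields.

    Assumption: 'winner' is the best model's consensus score and
    'overall' is the mean consensus score across all models.
    """
    per_model = {model: consensus(judges) for model, judges in scores.items()}
    best_model = max(per_model, key=per_model.get)
    return {
        "best_model": best_model,
        "winner": per_model[best_model],
        "overall": mean(list(per_model.values())),
    }


summary = summarize(judge_scores)
print(summary)
```

Under these assumptions, a prompt whose "Winner" score sits well above its "Overall" score would be one that a single model handles well while the others struggle.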
21 prompts found
Extraction
City Extractor (Few-Shot)
Best Model: gemini-2.5-flash-lite
Overall: 4.9
Winner: 4.9
Extraction
Capital City Extractor
Best Model: gemini-2.5-flash-lite
Overall: 4.7
Winner: 4.8
Summarization
Legal Document Summarizer
Best Model: gemini-2.5-flash-lite
Overall: 4.6
Winner: 4.9
Extraction
RAG Query Answering
Best Model: gpt-5-mini
Overall: 4.0
Winner: 4.1
Classification
Social Media Comment Moderator
Best Model: gemini-2.5-flash-lite
Overall: 3.5
Winner: 4.9
Summarization
Guided Legal Summary Generator
Best Model: gemini-2.5-flash-lite
Overall: 3.4
Winner: 4.8
Summarization
Investment Memo Editor
Best Model: gpt-5-mini
Overall: 3.1
Winner: 4.0
Summarization
Portfolio Manager Investment Memo
Best Model: claude-3-5-haiku
Overall: 3.1
Winner: 3.7
Extraction
Citation Extraction Agent
Best Model: claude-3-5-haiku
Overall: 2.9
Winner: 3.7
Extraction
Recipe Ingredient Extractor
Best Model: gpt-5-mini
Overall: 2.9
Winner: 3.4
Extraction
Shopping List Organizer
Best Model: claude-3-5-haiku
Overall: 2.8
Winner: 3.2
Classification
Customer Support Ticket Classifier
Best Model: gpt-5-mini
Overall: 2.7
Winner: 3.3
Summarization
Long Document Sublease Summarizer
Best Model: gemini-2.5-flash-lite
Overall: 2.7
Winner: 4.4
Classification
Essay Grading Evaluator
Best Model: gpt-5-mini
Overall: 2.3
Winner: 2.4
Extraction
Text-to-SQL with Chain-of-Thought
Best Model: claude-3-5-haiku
Overall: 2.3
Winner: 2.6
Extraction
Text-to-SQL Converter
Best Model: gpt-5-mini
Overall: 2.3
Winner: 3.4
Extraction
Text-to-SQL with Few-Shot Examples
Best Model: gpt-5-mini
Overall: 2.2
Winner: 2.9
Summarization
Macro Strategy Report
Best Model: gemini-2.5-flash-lite
Overall: 1.9
Winner: 3.0
Classification
LLM Output Quality Judge
Best Model: gemini-2.5-flash-lite
Overall: 1.8
Winner: 3.2
Extraction
Quantitative Analysis Report
Best Model: gemini-2.5-flash-lite
Overall: 1.8
Winner: 2.9
Extraction
Fundamental Analysis Report
Best Model: gemini-2.5-flash-lite
Overall: 1.8
Winner: 2.8
