tnt-llm-taxonomy-update

classification

Adapted from: https://arxiv.org/abs/2403.12173

Prompt Text

# Instruction
## Context
- **Goal**: Your goal is to review the given reference table based on the input data for the specified use case, then update the reference table if needed.
  - You will be given a reference cluster table, which is built on existing data. The reference table will be used to classify new data points.
  - You will compare the input data with the reference table, output a rating score of the quality of the reference table, suggest potential edits, and update the reference table if needed.
- **Reference cluster table**: The input cluster table is in XML format with each cluster as a `<cluster>` element, containing the following sub-elements:
  - **id**: category index.
  - **name**: category name.
  - **description**: category description used to classify data points.
- **Data**: The input data will be a list of human-AI conversation summaries in XML format, including the following elements:
  - **id**: conversation index.
  - **text**: conversation summary.
- **Use case**: {use_case}

## Requirements
### Format
- Output clusters in **XML format** with each cluster as a `<cluster>` element, containing the following sub-elements:
  - **id**: category number starting from 1 in an incremental manner.
  - **name**: category name should be **within {cluster_name_length} words**. It can be either a verb phrase or a noun phrase, whichever is more appropriate.
  - **description**: category description should be **within {cluster_description_length} words**.

Here is an example of your output:
```xml
<clusters>
  <cluster>
    <id>category id</id>
    <name>category name</name>
    <description>category description</description>
  </cluster>
</clusters>
```
- Total number of categories should be **no more than {max_num_clusters}**.
- Output should be in **English** only.

### Quality
- **No overlap or contradiction** among the categories.
- **Name** is a concise and clear label for the category. Use only phrases that are specific to each category and avoid those that are common to all categories.
- **Description** differentiates one category from another.
- **Name** and **description** can **accurately** and **consistently** classify new data points **without ambiguity**.
- **Name** and **description** are *consistent with each other*.
- Output clusters match the data as closely as possible, without missing important categories or adding unnecessary ones.
- Output clusters should strive to be orthogonal, providing solid coverage of the target domain.
- Output clusters serve the given use case well.
- Output clusters should be specific and meaningful. Do not invent categories that are not in the data.

# Reference cluster table
<reference_table>
{cluster_table_xml}
</reference_table>

# Data
<conversations>
{data_xml}
</conversations>

# Questions
## Q1: Review the given reference table and the input data and provide a rating score of the reference table. The rating score should be an integer between 0 and 100; a higher score means better quality. You should consider the following factors when rating the reference cluster table:
- **Intrinsic quality**:
  - 1) if the cluster table meets the *Requirements* section, with clear and consistent category names and descriptions, and no overlap or contradiction among the categories;
  - 2) if the categories in the cluster table are relevant to the given use case;
  - 3) if the cluster table includes any vague categories such as "Other", "General", "Unclear", "Miscellaneous" or "Undefined".
- **Extrinsic quality**:
  - 1) if the cluster table can accurately and consistently classify the input data without ambiguity;
  - 2) if categories that appear in the input data are missing from the cluster table;
  - 3) if there are unnecessary categories in the cluster table that do not appear in the input data.
## Q2: Explain your rating score in Q1 **within {explanation_length} words**.
## Q3: Based on your review, decide if you need to edit the reference table to improve its quality. If yes, suggest potential edits **within {suggestion_length} words**. If no, output "N/A".

Tips:
- You can edit the category name, description, or remove a category. You can also merge or add new categories if needed. Your edits should meet the *Requirements* section.
- The cluster table should be a **flat list** of **mutually exclusive** categories. Sort them based on their semantic relatedness.
- You can have *fewer than {max_num_clusters} categories* in the cluster table, but **do not exceed the limit.**
- Be **specific** about each category. **Do not include vague categories** such as "Other", "General", "Unclear", "Miscellaneous" or "Undefined" in the cluster table.
- You can ignore low quality or ambiguous data points.
## Q4: If you decide to edit the reference table, please provide your updated reference table. If you decide not to edit the reference table, please output the original reference table.
## Provide your answers between the following tags:
<rating_score>integer between 0 and 100</rating_score>
<explanation>explanation of your rating score within {explanation_length} words</explanation>
<suggestions>suggested edits within {suggestion_length} words, or "N/A" if no edits needed</suggestions>
<updated_table>
your updated cluster table in XML format if you decided to edit the reference table, or the original reference table if no edits made
</updated_table>
# Output
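
The template above leaves `{use_case}`, `{max_num_clusters}`, `{cluster_table_xml}`, `{data_xml}` and the length limits to be filled in before the prompt is sent. The following is a minimal sketch of that step, not the original pipeline: the helper names, the `<conversation>` wrapper element, and the shortened `TEMPLATE_EXCERPT` are assumptions made for illustration. Only the `id`, `name`, `description` and `text` sub-elements come from the prompt itself.

```python
import xml.etree.ElementTree as ET

# Stand-in for the full prompt text; only a few of its placeholders are shown.
TEMPLATE_EXCERPT = (
    "- **Use case**: {use_case}\n"
    "- Total number of categories should be **no more than {max_num_clusters}**.\n"
    "<reference_table>\n{cluster_table_xml}\n</reference_table>\n"
    "<conversations>\n{data_xml}\n</conversations>\n"
)


def clusters_to_xml(clusters):
    """Serialize (name, description) pairs as <cluster> elements, ids starting at 1."""
    root = ET.Element("clusters")
    for idx, (name, description) in enumerate(clusters, start=1):
        cluster = ET.SubElement(root, "cluster")
        ET.SubElement(cluster, "id").text = str(idx)
        ET.SubElement(cluster, "name").text = name
        ET.SubElement(cluster, "description").text = description
    # ElementTree escapes &, < and > in element text during serialization.
    return ET.tostring(root, encoding="unicode")


def conversations_to_xml(summaries):
    """Serialize summaries with <id> and <text>; the <conversation> wrapper is assumed."""
    parts = []
    for idx, text in enumerate(summaries, start=1):
        conv = ET.Element("conversation")
        ET.SubElement(conv, "id").text = str(idx)
        ET.SubElement(conv, "text").text = text
        parts.append(ET.tostring(conv, encoding="unicode"))
    return "\n".join(parts)


def render_prompt(template, **values):
    """Fill the {placeholders}; str.format raises KeyError if one is missing."""
    return template.format(**values)


reference_xml = clusters_to_xml(
    [("Coding help", "Requests to write, explain, or debug source code.")]
)
data_xml = conversations_to_xml(["User asks how to estimate R&D budgets."])
prompt = render_prompt(
    TEMPLATE_EXCERPT,
    use_case="Identify the user's primary intent",
    max_num_clusters=10,
    cluster_table_xml=reference_xml,
    data_xml=data_xml,
)
print(prompt)
```

Building the XML with `xml.etree.ElementTree` rather than string concatenation keeps characters such as `&` in a summary from producing malformed input. `str.format` works here because the template contains no literal braces other than its placeholders.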

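The prompt asks for answers inside `<rating_score>`, `<explanation>`, `<suggestions>` and `<updated_table>` tags, which makes the reply straightforward to check in code. The sketch below is one possible way to do that, assuming the reply arrives as a plain string; `extract_tag` and `parse_update_response` are hypothetical helper names, and the checks cover only a few of the stated requirements (score range, category limit, incremental ids, no vague category names).

```python
import re
import xml.etree.ElementTree as ET

# Vague names the prompt explicitly asks the model to avoid.
VAGUE_NAMES = {"other", "general", "unclear", "miscellaneous", "undefined"}


def extract_tag(response, tag):
    """Return the stripped text between <tag> and </tag>, or None if absent."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", response, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def parse_update_response(response, max_num_clusters):
    """Parse the tagged reply and collect violations of the prompt's requirements."""
    problems = []

    score_text = extract_tag(response, "rating_score")
    if score_text is None:
        raise ValueError("missing <rating_score>")
    score = int(score_text)
    if not 0 <= score <= 100:
        problems.append(f"rating score {score} is outside 0-100")

    table_xml = extract_tag(response, "updated_table")
    if table_xml is None:
        raise ValueError("missing <updated_table>")
    # Wrapping in a dummy root accepts both a <clusters> wrapper and bare <cluster> items.
    root = ET.fromstring(f"<root>{table_xml}</root>")
    clusters = []
    for node in root.iter("cluster"):
        clusters.append({
            "id": int(node.findtext("id", default="").strip()),
            "name": node.findtext("name", default="").strip(),
            "description": node.findtext("description", default="").strip(),
        })

    if len(clusters) > max_num_clusters:
        problems.append(f"{len(clusters)} categories exceed the limit of {max_num_clusters}")
    if [c["id"] for c in clusters] != list(range(1, len(clusters) + 1)):
        problems.append("category ids are not incremental from 1")
    for c in clusters:
        if c["name"].lower() in VAGUE_NAMES:
            problems.append(f"vague category name: {c['name']}")

    return {
        "rating_score": score,
        "explanation": extract_tag(response, "explanation"),
        "suggestions": extract_tag(response, "suggestions"),
        "clusters": clusters,
        "problems": problems,
    }
```

A caller could retry the request or fall back to the previous reference table whenever `problems` is non-empty. Word-count limits such as `{explanation_length}` are not checked here and would need a separate rule.
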
Evaluation Results

1/28/2026

Overall Score: 2.99/5 (average across all 3 models)

Best Performing Model: google:gemini-2.5-flash-lite, 4.32/5 (Low Confidence)

| Rank | Model | Score (/5.00) | adh | cla | com | In | Out | Cost |
|------|-------|---------------|-----|-----|-----|----|-----|------|
| #1 | google:gemini-2.5-flash-lite | 4.32 | 3.9 | 5.0 | 3.9 | 7,548 | 2,864 | $0.0019 |
| #2 | openai:gpt-5-mini | 2.32 | 1.9 | 3.6 | 1.6 | 7,236 | 4,208 | $0.0102 |
| #3 | anthropic:claude-3-5-haiku | 2.31 | 1.4 | 4.4 | 1.1 | 7,836 | 1,227 | $0.0112 |