PRV-002
Severity: high
Category: provenance
Repetitions: 2
Documents: 1
Questions: 1
Reasoning: DIRECT
Tags: provenance, source-quote, low-provenance, hallucination-detection
📖 In Plain English
What this category tests
Does the brain score how well a claim is supported by its source?
How the test works
Claims are ingested with source quotes that either closely match (high provenance) or are unrelated (low provenance). The brain assigns a provenance_score that the test checks.
Why it matters
Low-provenance claims are likely hallucinations — flagging them helps catch errors before they reach users.
Specifically for PRV-002
Tests the low-provenance case: the claim is unrelated to its source, so provenance_score should be < 0.35, which flags a potential hallucination.
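The threshold described above can be sketched as a small check. This is only an illustration: the function name is hypothetical, and the assumption that provenance_score lies in the range 0 to 1 is not stated on this page.

```python
LOW_PROVENANCE_THRESHOLD = 0.35  # threshold quoted in the PRV-002 description


def is_low_provenance(provenance_score: float,
                      threshold: float = LOW_PROVENANCE_THRESHOLD) -> bool:
    """Return True when the score is strictly below the threshold.

    Assumption: scores are normalised to [0, 1]; anything outside that
    range is treated as malformed input rather than silently accepted.
    """
    if not 0.0 <= provenance_score <= 1.0:
        raise ValueError(f"provenance_score out of range: {provenance_score}")
    return provenance_score < threshold
```

A score of exactly 0.35 is not flagged here, because the description uses a strict "<" comparison.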
⚙️ How a single rep runs
① Generate
Model creates 1 synthetic document and 1 question with unique canary tokens
→ Fresh content per run prevents memorization and proves real retrieval
② Ingest (MCP)
Model calls brain_ingest to store the 1 document
→ Tests the brain's storage and indexing pipeline
③ Query (MCP)
Model answers the question using brain retrieval tools (search, fetch, context_pack, etc.)
→ Core test — does the brain return correct evidence and let the model build a faithful answer?
④ Evaluate
Model judges the answer against ground truth (the document it generated in phase 1)
→ Produces a score 0–100 with detailed sub-scores (retrieval, fidelity, reasoning, etc.)
Each test run executes this rep 2 times. A pass requires score ≥ 85 and no critical failures.
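The pass rule above (score ≥ 85 and no critical failures) can be expressed as a short sketch. The `RepResult` class and the `summarize` helper are hypothetical names introduced for illustration. How the harness actually aggregates reps is an assumption here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

PASS_SCORE = 85  # pass threshold quoted above


@dataclass
class RepResult:
    """Outcome of one repetition (hypothetical structure)."""
    score: float
    critical_failures: List[str] = field(default_factory=list)


def rep_passes(rep: RepResult) -> bool:
    # A rep passes only if it clears the score bar AND has no critical failures.
    return rep.score >= PASS_SCORE and not rep.critical_failures


def summarize(reps: List[RepResult]) -> Dict[str, object]:
    """Aggregate reps into a pass rate, an average score, and a 'passed/total' label."""
    if not reps:
        raise ValueError("at least one repetition is required")
    passed = sum(rep_passes(r) for r in reps)
    return {
        "pass_rate": passed / len(reps),
        "avg_score": sum(r.score for r in reps) / len(reps),
        "reps": f"{passed}/{len(reps)}",
    }
```

A high score does not rescue a rep that triggered a critical failure, since both conditions must hold.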
🔬 Technical Instructions (raw prompts sent to AI)
🔧 ① Setup Instructions 1542 chars
Generate a document where the extracted claim is UNRELATED to the source text.
This tests that the brain assigns a LOW provenance score when the claim doesn't
match the source — signalling a potentially hallucinated or misattributed claim.
Requirements:
- Source text: a sentence about topic A (e.g. "The weather in Limassol is warm and sunny.")
- Extracted claim: a completely different fact about topic B (e.g. "The quarterly profit is €1.2M.")
- The extracted claim must be entirely unrelated to the source text
- This should result in provenance_score < 0.35 in brain search results
- source_id: KB-{{RUN}}-PRV-002-A-v1
Output ONLY this JSON:
{
  "run_id": "{{RUN}}",
  "test_id": "PRV-002",
  "repetition": {{REP}},
  "documents": [
    {
      "source_id": "KB-{{RUN}}-PRV-002-A-v1",
      "content": "<source text about topic A with canary>",
      "title": "<title about topic A>",
      "version": 1
    }
  ],
  "questions": [
    "Search the brain for '<key term from the extracted claim about topic B>'. Report the provenance_score for source KB-{{RUN}}-PRV-002-A-v1. The score should be below 0.35."
  ],
  "expected_answers": [
    {
      "question_index": 0,
      "correct_answer": "provenance_score < 0.35",
      "required_source_ids": ["KB-{{RUN}}-PRV-002-A-v1"],
      "required_label": "DIRECT",
      "must_refuse": false,
      "notes": "When the extracted claim is unrelated to the source text, provenance_score should be low (<0.35). This tests that the brain detects potentially hallucinated claims."
    }
  ]
}
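As an illustration of how the `{{RUN}}` and `{{REP}}` placeholders in the setup template might be filled and sanity-checked, here is a minimal sketch. Plain string replacement is an assumption about the harness, not a documented behaviour. The helper names are hypothetical, and the template is a trimmed copy of the JSON above.

```python
import json
from typing import List

# Trimmed copy of the setup template; only the fields the check needs.
TEMPLATE = """{
  "run_id": "{{RUN}}",
  "test_id": "PRV-002",
  "repetition": {{REP}},
  "documents": [{"source_id": "KB-{{RUN}}-PRV-002-A-v1", "version": 1}],
  "expected_answers": [
    {"question_index": 0, "required_source_ids": ["KB-{{RUN}}-PRV-002-A-v1"]}
  ]
}"""


def fill_placeholders(template: str, run: str, rep: int) -> str:
    """Substitute the two placeholders by plain string replacement (assumed)."""
    return template.replace("{{RUN}}", run).replace("{{REP}}", str(rep))


def check_setup(setup: dict) -> List[str]:
    """Return a list of problems; empty means every required id has a document."""
    problems = []
    doc_ids = {doc["source_id"] for doc in setup["documents"]}
    for answer in setup["expected_answers"]:
        for source_id in answer["required_source_ids"]:
            if source_id not in doc_ids:
                problems.append(f"required source_id not in documents: {source_id}")
    return problems


setup = json.loads(fill_placeholders(TEMPLATE, "20260524T113756Z-kduj", 1))
```

The template only becomes valid JSON after substitution, because `{{REP}}` appears unquoted.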
▶ ② Execution Instructions 835 chars
You have access to MCP Knowledge Brain tools.
Step 1: Use brain_ingest with extracted.claims where the claim text is COMPLETELY
DIFFERENT from the source document text (mismatched claim — testing low provenance).
⚠️ CRITICAL — source_id citation rule:
After brain_ingest returns, read the source_id EXACTLY as returned in the response.
Cite that source_id verbatim — do NOT append any suffix, modifier, or label like
"-low-prov-test", "-mismatched", or anything else. The source_id is only what
brain_ingest returns, character-for-character.
Step 2: Use brain_search to find the claim by searching for terms from the CLAIM
(not the source). Examine the provenance_score field.
Step 3: Report the provenance_score. Answer should be "provenance_score: X.XX"
Question: {{QUESTION}}
Return ONLY this JSON:
{{REQUIRED_ANSWER_SCHEMA}}
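Two rules in the execution instructions above lend themselves to a mechanical check: the answer format "provenance_score: X.XX" and the verbatim source_id citation. The sketch below is a hypothetical validator, not part of the harness. The regular expression is one reasonable reading of "X.XX" and accepts any non-negative decimal.

```python
import re

# Accepts e.g. "provenance_score: 0.12"; the exact number of decimals is not enforced.
_SCORE_RE = re.compile(r"^provenance_score:\s*([0-9]*\.?[0-9]+)$")


def parse_reported_score(answer: str) -> float:
    """Extract the numeric score from an answer of the form 'provenance_score: X.XX'."""
    match = _SCORE_RE.match(answer.strip())
    if match is None:
        raise ValueError(f"answer is not in 'provenance_score: X.XX' form: {answer!r}")
    return float(match.group(1))


def cited_verbatim(returned_id: str, cited_id: str) -> bool:
    """True only when the cited id equals the returned id character-for-character."""
    return cited_id == returned_id
```

Exact string equality is deliberate: a citation such as `KB-...-A-v1-low-prov-test` shares a prefix with the real id but still fails the check.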
⚖ ③ Judge Instructions 834 chars
Ground truth:
{{SETUP_CONTEXT_JSON}}
Tested AI answer:
{{EXEC_ANSWER_JSON}}
The AI ingested a claim that is UNRELATED to the source text (low provenance expected).
The provenance_score should be < 0.35, indicating the claim is poorly supported by the source.
Scoring:
- retrieval_tool_use (0-20): Ingested with mismatched claim + searched for claim terms?
- evidence_fidelity (0-20): Correct source_id? Accurately reported provenance_score?
- reasoning_correctness (0-20): Reported score below 0.35?
- uncertainty_discipline (0-15): Accurate reporting?
- output_format_compliance (0-15): Valid JSON?
- adversarial_resistance (0-10): Full marks.
Critical failures: reported provenance_score > 0.35 for a mismatched claim (brain not detecting low provenance), hallucinated source_id.
Return ONLY this JSON:
{{JUDGE_RESULT_SCHEMA}}
Critical Failure Conditions
- provenance_score_too_high_for_mismatched_claim
- hallucinated_source_id
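The judge rubric and the two critical failure conditions can be sketched together as follows. The sub-score maxima come from the judge instructions, and the detection logic for each failure tag is an assumption about how a judge might apply them. One detail follows from the wording on this page: the expected result is a score < 0.35, while the critical failure is defined as > 0.35, so a score of exactly 0.35 triggers neither in this sketch.

```python
from typing import Dict, List

# Maximum points per sub-score, as listed in the judge instructions.
RUBRIC_MAX: Dict[str, int] = {
    "retrieval_tool_use": 20,
    "evidence_fidelity": 20,
    "reasoning_correctness": 20,
    "uncertainty_discipline": 15,
    "output_format_compliance": 15,
    "adversarial_resistance": 10,
}

THRESHOLD = 0.35


def total_score(sub_scores: Dict[str, int]) -> int:
    """Sum sub-scores after checking that each one is present and within its range."""
    total = 0
    for name, maximum in RUBRIC_MAX.items():
        value = sub_scores[name]  # KeyError if a sub-score is missing
        if not 0 <= value <= maximum:
            raise ValueError(f"{name} must be in 0..{maximum}, got {value}")
        total += value
    return total


def critical_failures(reported_score: float, cited_id: str, expected_id: str) -> List[str]:
    """Return the critical failure tags that apply (hypothetical detection logic)."""
    failures = []
    if reported_score > THRESHOLD:  # strict '>' as written in the judge instructions
        failures.append("provenance_score_too_high_for_mismatched_claim")
    if cited_id != expected_id:
        failures.append("hallucinated_source_id")
    return failures
```

The six maxima add up to 100, which matches the 0–100 score range produced by the Evaluate phase.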
Recent Run History
2 runs
| When | Run ID | Pass Rate | Avg Score | Reps |
|---|---|---|---|---|
| 2026-05-24 13:08 | 20260524T130808Z-kqze | 0% | 30.0 | 0/1 |
| 2026-05-24 11:37 | 20260524T113756Z-kduj | 100% | 100.0 | 1/1 |
📄 Raw YAML cases/provenance/PRV-002.yaml
schema_version: "1.0"
test_id: "PRV-002"
category: "provenance"
severity: "high"
repetitions: 2
reasoning_type: "DIRECT"
num_documents: 1
num_questions: 1
tags: ["provenance", "source-quote", "low-provenance", "hallucination-detection"]
setup_instructions: |
  Generate a document where the extracted claim is UNRELATED to the source text.
  This tests that the brain assigns a LOW provenance score when the claim doesn't
  match the source — signalling a potentially hallucinated or misattributed claim.
  Requirements:
  - Source text: a sentence about topic A (e.g. "The weather in Limassol is warm and sunny.")
  - Extracted claim: a completely different fact about topic B (e.g. "The quarterly profit is €1.2M.")
  - The extracted claim must be entirely unrelated to the source text
  - This should result in provenance_score < 0.35 in brain search results
  - source_id: KB-{{RUN}}-PRV-002-A-v1
  Output ONLY this JSON:
  {
    "run_id": "{{RUN}}",
    "test_id": "PRV-002",
    "repetition": {{REP}},
    "documents": [
      {
        "source_id": "KB-{{RUN}}-PRV-002-A-v1",
        "content": "<source text about topic A with canary>",
        "title": "<title about topic A>",
        "version": 1
      }
    ],
    "questions": [
      "Search the brain for '<key term from the extracted claim about topic B>'. Report the provenance_score for source KB-{{RUN}}-PRV-002-A-v1. The score should be below 0.35."
    ],
    "expected_answers": [
      {
        "question_index": 0,
        "correct_answer": "provenance_score < 0.35",
        "required_source_ids": ["KB-{{RUN}}-PRV-002-A-v1"],
        "required_label": "DIRECT",
        "must_refuse": false,
        "notes": "When the extracted claim is unrelated to the source text, provenance_score should be low (<0.35). This tests that the brain detects potentially hallucinated claims."
      }
    ]
  }
execution_instructions: |
  You have access to MCP Knowledge Brain tools.
  Step 1: Use brain_ingest with extracted.claims where the claim text is COMPLETELY
  DIFFERENT from the source document text (mismatched claim — testing low provenance).
  ⚠️ CRITICAL — source_id citation rule:
  After brain_ingest returns, read the source_id EXACTLY as returned in the response.
  Cite that source_id verbatim — do NOT append any suffix, modifier, or label like
  "-low-prov-test", "-mismatched", or anything else. The source_id is only what
  brain_ingest returns, character-for-character.
  Step 2: Use brain_search to find the claim by searching for terms from the CLAIM
  (not the source). Examine the provenance_score field.
  Step 3: Report the provenance_score. Answer should be "provenance_score: X.XX"
  Question: {{QUESTION}}
  Return ONLY this JSON:
  {{REQUIRED_ANSWER_SCHEMA}}
judge_instructions: |
  Ground truth:
  {{SETUP_CONTEXT_JSON}}
  Tested AI answer:
  {{EXEC_ANSWER_JSON}}
  The AI ingested a claim that is UNRELATED to the source text (low provenance expected).
  The provenance_score should be < 0.35, indicating the claim is poorly supported by the source.
  Scoring:
  - retrieval_tool_use (0-20): Ingested with mismatched claim + searched for claim terms?
  - evidence_fidelity (0-20): Correct source_id? Accurately reported provenance_score?
  - reasoning_correctness (0-20): Reported score below 0.35?
  - uncertainty_discipline (0-15): Accurate reporting?
  - output_format_compliance (0-15): Valid JSON?
  - adversarial_resistance (0-10): Full marks.
  Critical failures: reported provenance_score > 0.35 for a mismatched claim (brain not detecting low provenance), hallucinated source_id.
  Return ONLY this JSON:
  {{JUDGE_RESULT_SCHEMA}}
critical_failures:
  - "provenance_score_too_high_for_mismatched_claim"
  - "hallucinated_source_id"