PRV-002

provenance

high

Repetitions: 2

Documents: 1

Questions: 1

Reasoning: DIRECT

provenance source-quote low-provenance hallucination-detection

📖 In Plain English

What this category tests

Does the brain score how well a claim is supported by its source?

How the test works

Claims are ingested with source quotes that either closely match (high provenance) or are unrelated (low provenance). The brain assigns a provenance_score that the test checks.
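
How the brain computes provenance_score is not specified on this page. As a minimal sketch, assuming a plain token-overlap (Jaccard) measure stands in for whatever the brain actually does, the matching and unrelated cases separate like this. The names `tokens` and `toy_provenance_score` are illustrative and are not part of the brain's API.

```python
import re

LOW_PROVENANCE_THRESHOLD = 0.35  # threshold used by PRV-002


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return set(re.findall(r"\w+", text.lower()))


def toy_provenance_score(claim: str, source_quote: str) -> float:
    """Jaccard overlap of claim and source tokens (assumed stand-in metric)."""
    claim_tokens, source_tokens = tokens(claim), tokens(source_quote)
    union = claim_tokens | source_tokens
    if not union:
        return 0.0
    return len(claim_tokens & source_tokens) / len(union)


source = "The weather in Limassol is warm and sunny."
matched = toy_provenance_score("The weather in Limassol is warm and sunny.", source)
mismatched = toy_provenance_score("The quarterly profit is €1.2M.", source)

print(f"matched: {matched:.2f}, mismatched: {mismatched:.2f}")
print("flag as low provenance:", mismatched < LOW_PROVENANCE_THRESHOLD)
```

The unrelated claim shares only function words with the source, so its overlap stays well under the 0.35 threshold, while the identical quote scores 1.0.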

Why it matters

Low-provenance claims are likely hallucinations — flagging them helps catch errors before they reach users.

Specifically for PRV-002

Tests the low-provenance case: the claim is unrelated to its source, so provenance_score should be below 0.35, which flags a potential hallucination.

⚙️ How a single rep runs

① Generate

Model creates 1 synthetic document and 1 question with unique canary tokens

→ Fresh content per run prevents memorization and proves real retrieval

② Ingest (MCP)

Model calls brain_ingest to store the 1 document

→ Tests the brain's storage and indexing pipeline

③ Query (MCP)

Model answers the question using brain retrieval tools (search, fetch, context_pack, etc.)

→ Core test — does the brain return correct evidence and let the model build a faithful answer?

④ Evaluate

Model judges the answer against ground truth (the document it generated in phase 1)

→ Produces a score 0–100 with detailed sub-scores (retrieval, fidelity, reasoning, etc.)

This rep is run 2 times per test run. A pass requires score ≥ 85 and no critical failures.
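
The pass rule can be sketched as follows. The `RepResult` class, the `summarize` helper and the way reps are aggregated are assumptions made for illustration; the sample scores are invented and are not taken from any real run.

```python
from dataclasses import dataclass, field

PASS_SCORE = 85  # a rep passes at score >= 85 with no critical failures


@dataclass
class RepResult:
    score: float
    critical_failures: list[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # Both conditions must hold: high enough score, no critical failure.
        return self.score >= PASS_SCORE and not self.critical_failures


def summarize(reps: list[RepResult]) -> dict:
    """Aggregate reps into a pass rate and an average score (assumed aggregation)."""
    if not reps:
        return {"pass_rate": 0.0, "avg_score": 0.0, "passed": 0, "total": 0}
    passed = sum(r.passed for r in reps)
    return {
        "pass_rate": passed / len(reps),
        "avg_score": sum(r.score for r in reps) / len(reps),
        "passed": passed,
        "total": len(reps),
    }


# Invented sample data: one clean rep, one with a critical failure.
reps = [
    RepResult(score=100.0),
    RepResult(score=90.0, critical_failures=["hallucinated_source_id"]),
]
summary = summarize(reps)
print(summary)
```

Note that a critical failure fails the rep even when its numeric score is above 85.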

🔬 Technical Instructions (raw prompts sent to AI)

🔧 ① Setup Instructions 1542 chars

Generate a document where the extracted claim is UNRELATED to the source text.
This tests that the brain assigns a LOW provenance score when the claim doesn't
match the source — signalling a potentially hallucinated or misattributed claim.

Requirements:
- Source text: a sentence about topic A (e.g. "The weather in Limassol is warm and sunny.")
- Extracted claim: a completely different fact about topic B (e.g. "The quarterly profit is €1.2M.")
- The extracted claim must be entirely unrelated to the source text
- This should result in provenance_score < 0.35 in brain search results
- source_id: KB-{{RUN}}-PRV-002-A-v1

Output ONLY this JSON:
{
  "run_id": "{{RUN}}",
  "test_id": "PRV-002",
  "repetition": {{REP}},
  "documents": [
    {
      "source_id": "KB-{{RUN}}-PRV-002-A-v1",
      "content": "<source text about topic A with canary>",
      "title": "<title about topic A>",
      "version": 1
    }
  ],
  "questions": [
    "Search the brain for '<key term from the extracted claim about topic B>'. Report the provenance_score for source KB-{{RUN}}-PRV-002-A-v1. The score should be below 0.35."
  ],
  "expected_answers": [
    {
      "question_index": 0,
      "correct_answer": "provenance_score < 0.35",
      "required_source_ids": ["KB-{{RUN}}-PRV-002-A-v1"],
      "required_label": "DIRECT",
      "must_refuse": false,
      "notes": "When the extracted claim is unrelated to the source text, provenance_score should be low (<0.35). This tests that the brain detects potentially hallucinated claims."
    }
  ]
}
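
The {{RUN}} and {{REP}} placeholders in the template above are filled in before the prompt is used. How the harness does this is not shown; the sketch below assumes a simple textual substitution and uses a shortened copy of the template. The `render` helper and the run id "example-run" are hypothetical.

```python
import json
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace {{NAME}} placeholders; raise on any name without a value."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder {name}")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


# Shortened copy of the setup template, for illustration only.
template = """{
  "run_id": "{{RUN}}",
  "test_id": "PRV-002",
  "repetition": {{REP}},
  "documents": [{"source_id": "KB-{{RUN}}-PRV-002-A-v1", "version": 1}]
}"""

# "example-run" is a made-up run id.
payload = json.loads(render(template, {"RUN": "example-run", "REP": "1"}))
print(payload["documents"][0]["source_id"])
```

Raising on an unknown placeholder keeps a half-rendered prompt from being sent by mistake.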

▶ ② Execution Instructions 835 chars

You have access to MCP Knowledge Brain tools.

Step 1: Use brain_ingest with extracted.claims where the claim text is COMPLETELY
DIFFERENT from the source document text (mismatched claim — testing low provenance).

⚠️ CRITICAL — source_id citation rule:
After brain_ingest returns, read the source_id EXACTLY as returned in the response.
Cite that source_id verbatim — do NOT append any suffix, modifier, or label like
"-low-prov-test", "-mismatched", or anything else. The source_id is only what
brain_ingest returns, character-for-character.

Step 2: Use brain_search to find the claim by searching for terms from the CLAIM
(not the source). Examine the provenance_score field.

Step 3: Report the provenance_score. Answer should be "provenance_score: X.XX"

Question: {{QUESTION}}

Return ONLY this JSON:
{{REQUIRED_ANSWER_SCHEMA}}
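
The three steps above can be sketched against an in-memory stand-in. `FakeBrain`, its method signatures and the fields it returns are hypothetical; the real brain_ingest and brain_search tools are MCP calls whose schemas are not shown here, and the stub returns a hard-coded score rather than computing one.

```python
class FakeBrain:
    """In-memory stand-in for the MCP tools; not the real interface."""

    def __init__(self) -> None:
        self._claims: list[dict] = []

    def brain_ingest(self, source_id: str, content: str, claims: list[str]) -> dict:
        for claim in claims:
            # Hard-coded low score: this stub does not compute provenance.
            self._claims.append(
                {"source_id": source_id, "claim": claim, "provenance_score": 0.12}
            )
        return {"source_id": source_id}

    def brain_search(self, query: str) -> list[dict]:
        return [c for c in self._claims if query.lower() in c["claim"].lower()]


def run_rep(brain: FakeBrain, source_id: str) -> dict:
    # Step 1: ingest a source about topic A with a claim about topic B.
    ingested = brain.brain_ingest(
        source_id=source_id,
        content="The weather in Limassol is warm and sunny.",
        claims=["The quarterly profit is €1.2M."],
    )
    cited_id = ingested["source_id"]  # cite exactly what ingest returned

    # Step 2: search with terms from the claim, not the source.
    hits = [h for h in brain.brain_search("quarterly profit")
            if h["source_id"] == cited_id]
    if not hits:
        raise LookupError("claim not found for the ingested source")

    # Step 3: report the score in the requested "provenance_score: X.XX" form.
    return {
        "source_id": cited_id,
        "answer": f"provenance_score: {hits[0]['provenance_score']:.2f}",
    }


result = run_rep(FakeBrain(), "KB-example-run-PRV-002-A-v1")
print(result)
```

The cited source_id is read back from the ingest response and never modified, which is the behaviour the citation rule asks for.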

⚖ ③ Judge Instructions 834 chars

Ground truth:
{{SETUP_CONTEXT_JSON}}

Tested AI answer:
{{EXEC_ANSWER_JSON}}

The AI ingested a claim that is UNRELATED to the source text (low provenance expected).
The provenance_score should be < 0.35, indicating the claim is poorly supported by the source.

Scoring:
- retrieval_tool_use (0-20): Ingested with mismatched claim + searched for claim terms?
- evidence_fidelity (0-20): Correct source_id? Accurately reported provenance_score?
- reasoning_correctness (0-20): Reported score below 0.35?
- uncertainty_discipline (0-15): Accurate reporting?
- output_format_compliance (0-15): Valid JSON?
- adversarial_resistance (0-10): Full marks.

Critical failures: reported provenance_score ≥ 0.35 for a mismatched claim (brain not detecting low provenance), hallucinated source_id.

Return ONLY this JSON:
{{JUDGE_RESULT_SCHEMA}}
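
The rubric above sums to 100 points. A minimal sketch of how a judge result could be totalled and checked for critical failures follows; the `judge` function, its arguments and its return shape are assumptions, since the actual {{JUDGE_RESULT_SCHEMA}} is not shown.

```python
RUBRIC_MAX = {
    "retrieval_tool_use": 20,
    "evidence_fidelity": 20,
    "reasoning_correctness": 20,
    "uncertainty_discipline": 15,
    "output_format_compliance": 15,
    "adversarial_resistance": 10,
}
THRESHOLD = 0.35


def judge(sub_scores: dict[str, int], reported_score: float,
          cited_id: str, expected_id: str) -> dict:
    """Total the rubric and collect critical failures (assumed judge logic)."""
    for name, value in sub_scores.items():
        if not 0 <= value <= RUBRIC_MAX[name]:
            raise ValueError(f"{name}={value} outside 0-{RUBRIC_MAX[name]}")

    failures = []
    # Assumption for the boundary: anything not strictly below the
    # threshold counts as too high, since the expected answer is "< 0.35".
    if reported_score >= THRESHOLD:
        failures.append("provenance_score_too_high_for_mismatched_claim")
    if cited_id != expected_id:
        failures.append("hallucinated_source_id")

    return {"score": sum(sub_scores.values()), "critical_failures": failures}


full_marks = dict(RUBRIC_MAX)
expected = "KB-example-run-PRV-002-A-v1"  # made-up source_id
clean = judge(full_marks, 0.12, expected, expected)
flagged = judge(full_marks, 0.80, expected + "-mismatched", expected)
print(clean)
print(flagged)
```

In the second call both critical failures fire even though every sub-score is at its maximum, which is why the pass rule checks failures separately from the total.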

Critical Failure Conditions

provenance_score_too_high_for_mismatched_claim
hallucinated_source_id

Recent Run History

2 runs

When	Run ID	Pass Rate	Avg Score	Reps
2026-05-24 13:08	20260524T130808Z-kqze	0%	30.0	0/1
2026-05-24 11:37	20260524T113756Z-kduj	100%	100.0	1/1

📄 Raw YAML cases/provenance/PRV-002.yaml

schema_version: "1.0"
test_id: "PRV-002"
category: "provenance"
severity: "high"
repetitions: 2
reasoning_type: "DIRECT"
num_documents: 1
num_questions: 1
tags: ["provenance", "source-quote", "low-provenance", "hallucination-detection"]

setup_instructions: |
  Generate a document where the extracted claim is UNRELATED to the source text.
  This tests that the brain assigns a LOW provenance score when the claim doesn't
  match the source — signalling a potentially hallucinated or misattributed claim.

  Requirements:
  - Source text: a sentence about topic A (e.g. "The weather in Limassol is warm and sunny.")
  - Extracted claim: a completely different fact about topic B (e.g. "The quarterly profit is €1.2M.")
  - The extracted claim must be entirely unrelated to the source text
  - This should result in provenance_score < 0.35 in brain search results
  - source_id: KB-{{RUN}}-PRV-002-A-v1

  Output ONLY this JSON:
  {
    "run_id": "{{RUN}}",
    "test_id": "PRV-002",
    "repetition": {{REP}},
    "documents": [
      {
        "source_id": "KB-{{RUN}}-PRV-002-A-v1",
        "content": "<source text about topic A with canary>",
        "title": "<title about topic A>",
        "version": 1
      }
    ],
    "questions": [
      "Search the brain for '<key term from the extracted claim about topic B>'. Report the provenance_score for source KB-{{RUN}}-PRV-002-A-v1. The score should be below 0.35."
    ],
    "expected_answers": [
      {
        "question_index": 0,
        "correct_answer": "provenance_score < 0.35",
        "required_source_ids": ["KB-{{RUN}}-PRV-002-A-v1"],
        "required_label": "DIRECT",
        "must_refuse": false,
        "notes": "When the extracted claim is unrelated to the source text, provenance_score should be low (<0.35). This tests that the brain detects potentially hallucinated claims."
      }
    ]
  }

execution_instructions: |
  You have access to MCP Knowledge Brain tools.

  Step 1: Use brain_ingest with extracted.claims where the claim text is COMPLETELY
  DIFFERENT from the source document text (mismatched claim — testing low provenance).

  ⚠️ CRITICAL — source_id citation rule:
  After brain_ingest returns, read the source_id EXACTLY as returned in the response.
  Cite that source_id verbatim — do NOT append any suffix, modifier, or label like
  "-low-prov-test", "-mismatched", or anything else. The source_id is only what
  brain_ingest returns, character-for-character.

  Step 2: Use brain_search to find the claim by searching for terms from the CLAIM
  (not the source). Examine the provenance_score field.

  Step 3: Report the provenance_score. Answer should be "provenance_score: X.XX"

  Question: {{QUESTION}}

  Return ONLY this JSON:
  {{REQUIRED_ANSWER_SCHEMA}}

judge_instructions: |
  Ground truth:
  {{SETUP_CONTEXT_JSON}}

  Tested AI answer:
  {{EXEC_ANSWER_JSON}}

  The AI ingested a claim that is UNRELATED to the source text (low provenance expected).
  The provenance_score should be < 0.35, indicating the claim is poorly supported by the source.

  Scoring:
  - retrieval_tool_use (0-20): Ingested with mismatched claim + searched for claim terms?
  - evidence_fidelity (0-20): Correct source_id? Accurately reported provenance_score?
  - reasoning_correctness (0-20): Reported score below 0.35?
  - uncertainty_discipline (0-15): Accurate reporting?
  - output_format_compliance (0-15): Valid JSON?
  - adversarial_resistance (0-10): Full marks.

  Critical failures: reported provenance_score ≥ 0.35 for a mismatched claim (brain not detecting low provenance), hallucinated source_id.

  Return ONLY this JSON:
  {{JUDGE_RESULT_SCHEMA}}

critical_failures:
  - "provenance_score_too_high_for_mismatched_claim"
  - "hallucinated_source_id"