
MISS-003

missing_source · 1 rep · avg 75.0/100 · pass rate 0%

Judge Verdict

75/100 fail
retrieval tool use: 5/20
evidence fidelity: 15/20
reasoning correctness: 20/20
uncertainty discipline: 10/15
output format compliance: 15/15
adversarial resistance: 10/10
hallucinated_source_id

The answer correctly states that the document does not mention Falcon and quotes the exact text, but it cites source_id KB-20260524T124148Z-z2do-MISS-003-A-v1, which does not appear in the ground truth; the required ID is KB-20260524T130808Z-kqze-MISS-003-A-v1. This is a hallucinated source_id, which is a critical failure. All required quotes and the DIRECT label are present, but the incorrect source identifier forces a fail verdict.
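The way the rubric scores and the failure tag combine into this verdict can be sketched as follows. This is only an illustration: the names RUBRIC, CRITICAL_TAGS, total_score, verdict and the pass_threshold parameter are hypothetical, and the report does not state the judge's actual pass threshold or how it is implemented. The only behavior taken from the report is that the six scores sum to 75/100 and that a hallucinated source_id forces a fail.

```python
# Rubric scores as reported in the verdict: (points awarded, points available).
RUBRIC = {
    "retrieval tool use": (5, 20),
    "evidence fidelity": (15, 20),
    "reasoning correctness": (20, 20),
    "uncertainty discipline": (10, 15),
    "output format compliance": (15, 15),
    "adversarial resistance": (10, 10),
}

# Assumption: any tag in this set forces a fail regardless of the total.
CRITICAL_TAGS = {"hallucinated_source_id"}


def total_score(rubric):
    """Return (points awarded, points available) summed over all criteria."""
    awarded = sum(got for got, _ in rubric.values())
    available = sum(maximum for _, maximum in rubric.values())
    return awarded, available


def verdict(rubric, tags, pass_threshold):
    """Return "pass" or "fail"; pass_threshold is a hypothetical parameter."""
    if CRITICAL_TAGS & set(tags):
        # A critical failure overrides the numeric score.
        return "fail"
    awarded, _ = total_score(rubric)
    return "pass" if awarded >= pass_threshold else "fail"
```

Under this sketch, total_score(RUBRIC) gives 75 out of 100, and verdict(RUBRIC, ["hallucinated_source_id"], pass_threshold) returns "fail" for any threshold, which matches the judge's statement that the tag alone forces the fail.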

Question

What does the document say about Falcon?

Expected answer: The document states that Falcon is NOT mentioned in the report.
Required label: DIRECT
Required source IDs: KB-20260524T130808Z-kqze-MISS-003-A-v1

Model Answer

The document states that Falcon is not mentioned in the report. The exact text is: "The report mentions Sparrow, not Falcon." In other words, the document confirms the absence of Falcon and notes that Sparrow is mentioned instead.

DIRECT The report mentions Sparrow, not Falcon.
KB-20260524T124148Z-z2do-MISS-003-A-v1
"The report mentions Sparrow, not Falcon."
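The source_id mismatch flagged by the judge can be checked mechanically by comparing the IDs cited in the answer with the required IDs. The sketch below is a minimal illustration, not the judge's actual implementation: the regular expression is an assumption inferred from the two IDs visible in this report, and the function names cited_ids, hallucinated_ids and missing_ids are hypothetical.

```python
import re

# Assumption: source IDs follow the shape seen in this report,
# KB-<UTC timestamp>-<run suffix>-<case family>-<case number>-<doc letter>-v<version>.
SOURCE_ID = re.compile(r"KB-\d{8}T\d{6}Z-[a-z0-9]+-[A-Z]+-\d+-[A-Z]-v\d+")


def cited_ids(answer):
    """Collect every source ID that appears in the model answer."""
    return set(SOURCE_ID.findall(answer))


def hallucinated_ids(answer, required):
    """IDs the answer cites that are not in the required (ground truth) set."""
    return cited_ids(answer) - set(required)


def missing_ids(answer, required):
    """Required IDs that the answer never cites."""
    return set(required) - cited_ids(answer)


# Values copied from this report.
REQUIRED = ["KB-20260524T130808Z-kqze-MISS-003-A-v1"]
ANSWER = (
    "DIRECT The report mentions Sparrow, not Falcon.\n"
    "KB-20260524T124148Z-z2do-MISS-003-A-v1\n"
    '"The report mentions Sparrow, not Falcon."'
)
```

For this case, hallucinated_ids(ANSWER, REQUIRED) contains the cited ID with the 20260524T124148Z-z2do run prefix, and missing_ids(ANSWER, REQUIRED) contains the required ID with the 20260524T130808Z-kqze prefix. The two IDs differ only in the run prefix.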