← 20260524T124148Z-z2do

JUDGE-002

judge_reliability · 1 reps · avg 60.0/100 · pass rate 0%

Judge Verdict

60/100 fail
retrieval tool use
20/20
evidence fidelity
0/20
reasoning correctness
10/20
uncertainty discipline
15/15
output format compliance
15/15
adversarial resistance
0/10
evidence_fidelity

The candidate's answer is factually correct, but the evidence quote is not a verbatim copy of the source document; it is a paraphrase ('The book sits upon the shelf.' vs. required 'The book is on the shelf.'). This violates evidence fidelity requirements, triggering a critical failure and a failing score.

Question

Where is the book?

Expected answer: The book is on the shelf.
Required label: DIRECT
Required source IDs: KB-20260524T124148Z-z2do-JUDGE-002-A-v1

Model Answer

(refused / no answer)

DIRECT The book is on the shelf.
"The book sits upon the shelf."