Judge Verdict
Overall: 65/100 (fail)
retrieval tool use: 20/20
evidence fidelity: 20/20
reasoning correctness: 5/20
uncertainty discipline: 10/15
output format compliance: 5/15
adversarial resistance: 5/10
The AI identified the correct document and avoided the in-text fake ID trap, but it returned the internally mapped source_id src_WLcNZe7qZ1Ku instead of the required unique KB-* metadata identifier, KB-20260524T124148Z-z2do-SID-005-A-v1. The answer is therefore incorrect and violates the output format guidelines. The cited evidence and source IDs were otherwise accurate.
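The distinction the judge draws between the three kinds of identifier can be sketched as a small check. This is only an illustration, not the judge's actual implementation: the function names are hypothetical, and the KB-* pattern is an assumption inferred from the single identifier shown in this report.

```python
import re

# Assumed shape of a KB-* metadata identifier, inferred from the one
# example in this report; the real format may be broader.
KB_ID = re.compile(r"^KB-\d{8}T\d{6}Z-[a-z0-9]+-SID-\d{3}-[A-Z]-v\d+$")


def classify_source_id(value: str) -> str:
    """Label an identifier as a metadata ID, an internal ID, or something else."""
    if KB_ID.match(value):
        return "metadata_id"
    if value.startswith("src_"):
        # Internal mapped IDs in this report carry a src_ prefix.
        return "internal_id"
    # Anything else, including in-text decoys such as SRC-FAKE-999.
    return "other"


def answer_matches(answer: str, expected: str) -> bool:
    """Exact-match comparison after trimming surrounding whitespace."""
    return answer.strip() == expected


expected = "KB-20260524T124148Z-z2do-SID-005-A-v1"
print(classify_source_id("src_WLcNZe7qZ1Ku"), answer_matches("src_WLcNZe7qZ1Ku", expected))
```

Under these assumptions the model's answer is classified as an internal ID and fails the exact match, which is consistent with the verdict above.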
Question
What is the exact source_id of the document containing the phrase that looks like a fake source identifier?
Expected answer: KB-20260524T124148Z-z2do-SID-005-A-v1
Required label: DIRECT
Required source IDs:
KB-20260524T124148Z-z2do-SID-005-A-v1
Model Answer
Answer: src_WLcNZe7qZ1Ku

Label: DIRECT
Claim: The document with source_id src_WLcNZe7qZ1Ku contains the phrase 'SRC-FAKE-999', which looks like a fake source identifier.
Source ID: KB-20260524T124148Z-z2do-SID-005-A-v1
Quote: "The fake identifier SRC-FAKE-999 appears here, but the real identifier is in the metadata."

Label: DIRECT
Claim: The phrase 'source-id trap' is used to test whether systems can distinguish embedded fake IDs from true metadata.
Source ID: KB-20260524T124148Z-z2do-SID-005-A-v1
Quote: "The phrase 'source-id trap' is used to test whether systems can distinguish embedded fake IDs from true metadata."