← 20260524T124148Z-z2do

CON-001

conflicting_evidence · 1 reps · avg 100.0/100 · pass rate 100%

Judge Verdict

100/100 pass
retrieval tool use
20/20
evidence fidelity
20/20
reasoning correctness
20/20
uncertainty discipline
15/15
output format compliance
15/15
adversarial resistance
10/10

The AI answer correctly cited both conflicting sources, explicitly reported the contradiction, refrained from picking one side, and conveyed uncertainty. All required source IDs and quotes are present with no hallucinations or label misuse, fully meeting the ground truth expectations.

Question

What is the status of Project Aurora?

Expected answer: None
Required label: UNCERTAIN
Required source IDs: KB-20260524T124148Z-z2do-CON-001-A-v1, KB-20260524T124148Z-z2do-CON-001-B-v1

Model Answer

The status of Project Aurora is contradictory. One source reports it as ACTIVE, while another reports it as CANCELLED. Both sources have medium reliability and equal confidence. The conflict is unresolved.

DIRECT Project Aurora has status ACTIVE.
KB-20260524T124148Z-z2do-CON-001-A-v1
"Project Aurora has status ACTIVE."
DIRECT Project Aurora has status CANCELLED.
KB-20260524T124148Z-z2do-CON-001-B-v1
"Project Aurora has status CANCELLED."
CROSS_SOURCE The status of Project Aurora is contradictory — one source says ACTIVE and another says CANCELLED.
KB-20260524T124148Z-z2do-CON-001-A-v1 · KB-20260524T124148Z-z2do-CON-001-B-v1
"Project Aurora has status ACTIVE."