Test Cases
100 cases across 24 categories
critical
high
medium
low
adversarial (5 cases)
alias (3 cases)
conflicting evidence (3 cases)
context isolation (5 cases)
cross source (10 cases)
XSR-001 · 2 docs · 1 q · 5 reps · CROSS_SOURCE
XSR-002 · 2 docs · 1 q · 5 reps · CROSS_SOURCE
XSR-003 · 3 docs · 1 q · 5 reps · CROSS_SOURCE
XSR-004 · 3 docs · 1 q · 5 reps · CROSS_SOURCE
XSR-005 · 2 docs · 1 q · 5 reps · INFERRED
XSR-006 · 2 docs · 1 q · 5 reps · UNCERTAIN
XSR-007 · 5 docs · 1 q · 5 reps · CROSS_SOURCE
XSR-008 · 3 docs · 1 q · 5 reps · CROSS_SOURCE
XSR-009 · 2 docs · 1 q · 5 reps · UNCERTAIN
XSR-010 · 3 docs · 1 q · 5 reps · CROSS_SOURCE
deduplication (2 cases)
direct vs inferred (10 cases)
DVI-001 · 1 doc · 1 q · 5 reps · DIRECT
DVI-002 · 1 doc · 1 q · 5 reps · INFERRED
DVI-003 · 1 doc · 1 q · 5 reps · INFERRED
DVI-004 · 1 doc · 1 q · 5 reps · INFERRED
DVI-005 · 1 doc · 1 q · 5 reps · INFERRED
DVI-006 · 1 doc · 1 q · 5 reps · DIRECT
DVI-007 · 1 doc · 1 q · 5 reps · DIRECT
DVI-008 · 1 doc · 1 q · 5 reps · UNCERTAIN
DVI-009 · 1 doc · 1 q · 5 reps · UNCERTAIN · flaky-ok
DVI-010 · 1 doc · 1 q · 5 reps · INFERRED