Metric · v1.0 · 25 questions

Reliability

Adversarial robustness benchmark testing stale memory handling, entity separation, hallucination resistance, and deletion compliance.

What it measures

Robustness under adversarial conditions: does the system handle edge cases that trip up real-world memory systems?

Run 25 adversarial trap questions across 4 sub-dimensions:
- Stale Memory Handling: does the system stop returning outdated info once a fact has been updated?
- Entity Separation: does the system keep similar entities distinct?
- Hallucination Resistance: does the system abstain when it has no relevant memory?
- Deletion Compliance: does the system honor explicit forget/delete requests?
Each response is scored with the reliability trap method: to pass, it must contain the expected behavioral indicators.

Deterministic (reliability trap). Keyword-based pass/fail for behavioral indicators.
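As a rough illustration of keyword-based pass/fail scoring, here is a minimal Python sketch. The `TrapQuestion` type, the `passes_trap` function, the example indicators, and the rule that any one matching indicator is enough to pass are all assumptions made for illustration, not Bench'd's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrapQuestion:
    """Hypothetical shape of one adversarial trap question."""
    sub_dimension: str
    prompt: str
    indicators: tuple[str, ...]  # expected behavioral indicators (keywords)


def passes_trap(response: str, trap: TrapQuestion) -> bool:
    """Deterministic check: pass if any indicator appears in the response.

    Assumption: matching is a case-insensitive substring test and a single
    matching indicator is sufficient. The real matching rule may differ.
    """
    text = response.casefold()
    return any(indicator.casefold() in text for indicator in trap.indicators)


# Hypothetical hallucination-resistance trap: the system has no relevant
# memory, so the expected behavior is to abstain.
trap = TrapQuestion(
    sub_dimension="hallucination_resistance",
    prompt="What is my sister's middle name?",
    indicators=("don't know", "no record", "not sure"),
)
```

Because the check is a pure string test, the same response always yields the same verdict, which is what makes this kind of scoring deterministic.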

Dimensions tested: recall, temporal

How this metric relates to each track (v1.0):

See the full failure taxonomy for all 20+ reason codes.

The questions come from the Bench'd adversarial dataset of hand-crafted robustness scenarios.

Sub-dimension scoring can produce low overall scores when a system doesn't support certain capabilities (e.g., no delete API).
The interpretation system accounts for this with the 'capability_limited' label.
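The effect of an unsupported capability on the overall score can be sketched as follows. This is an assumed aggregation (overall score as the plain pass rate across all questions); the `summarize` function, its `unsupported` parameter, and the sub-dimension identifiers are hypothetical. Only the 'capability_limited' label itself comes from the text above.

```python
from collections import defaultdict


def summarize(results, unsupported=frozenset()):
    """Aggregate (sub_dimension, passed) pairs into a score summary.

    Assumption: the caller declares which sub-dimensions the system cannot
    support (for example, deletion compliance when there is no delete API).
    """
    per_dim = defaultdict(lambda: [0, 0])  # sub_dimension -> [passed, total]
    for sub_dimension, passed in results:
        per_dim[sub_dimension][0] += int(passed)
        per_dim[sub_dimension][1] += 1

    total_passed = sum(p for p, _ in per_dim.values())
    total = sum(t for _, t in per_dim.values())
    limited = any(dim in unsupported for dim in per_dim)

    return {
        "overall": total_passed / total if total else 0.0,
        "by_sub_dimension": {d: p / t for d, (p, t) in per_dim.items()},
        # The label flags that a low score reflects a missing capability.
        "label": "capability_limited" if limited else None,
    }


results = [
    ("stale_memory", True),
    ("stale_memory", True),
    ("deletion_compliance", False),  # no delete API, so these traps fail
    ("deletion_compliance", False),
]
summary = summarize(results, unsupported={"deletion_compliance"})
```

In this sketch the overall score is still pulled down by the failed deletion traps; the label only tells the reader why.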

Stable URL: benchd.ai/methodology/metrics/reliability
This URL is referenced in signed manifests. It will not change.