Run: run_5f8cac732bcf

VERIFIED

System: mem0-local

Benchmark: LongMemEval

Harness: v0.1.0

Verified Overall: 0%

Nuance Overall: 33.14333771999676%

Date: May 11, 2026

run_manifest.json

{
  "version": "1.0.0",
  "runId": "run_5f8cac732bcf",
  "systemId": "sys_mem0-local",
  "systemName": "mem0-local",
  "benchmarkId": "bench_longmemeval-v1",
  "benchmarkName": "LongMemEval",
  "benchmarkVersion": "1.0",
  "harnessVersion": "0.1.0",
  "judgeModel": "openai/gpt-4o-mini",
  "judgeTemperature": 0,
  "startedAt": "2026-05-11T21:08:54.626661+00:00",
  "completedAt": "2026-05-11T22:18:51.726560+00:00",
  "scores": {
    "verified": {
      "recall": 0,
      "temporal": 0,
      "reasoning": 0,
      "overall": 0
    },
    "nuance": {
      "recall": 47.43589743589743,
      "temporal": 36.95652173913043,
      "reasoning": 15.037593984962406,
      "overall": 33.14333771999676
    }
  },
  "questionCount": 500,
  "passCount": 162,
  "failCount": 338,
  "merkleRoot": "c4879d6f9dd98984bf1d982339c01b3c005b20f127d20a418800ebd492be93e7"
}
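
The manifest's numeric fields can be cross-checked against each other. The sketch below is a hypothetical helper, not part of the harness: `check_manifest` and the `MANIFEST` constant are names introduced here for illustration, with values copied from the manifest above. It checks that the pass and fail counts sum to the question count, and whether the nuance overall score matches the unweighted mean of the three category scores. That relationship holds for this run's numbers, but it is an observation about this manifest rather than a documented harness rule. It also derives the pass rate and the wall-clock duration from the two timestamps.

```python
import math
from datetime import datetime

# Values copied from run_manifest.json above (only the fields used here).
MANIFEST = {
    "startedAt": "2026-05-11T21:08:54.626661+00:00",
    "completedAt": "2026-05-11T22:18:51.726560+00:00",
    "scores": {
        "nuance": {
            "recall": 47.43589743589743,
            "temporal": 36.95652173913043,
            "reasoning": 15.037593984962406,
            "overall": 33.14333771999676,
        }
    },
    "questionCount": 500,
    "passCount": 162,
    "failCount": 338,
}


def check_manifest(manifest: dict) -> dict:
    """Run simple internal-consistency checks on a run manifest (illustrative)."""
    nuance = manifest["scores"]["nuance"]
    categories = [nuance[key] for key in ("recall", "temporal", "reasoning")]
    category_mean = sum(categories) / len(categories)

    started = datetime.fromisoformat(manifest["startedAt"])
    completed = datetime.fromisoformat(manifest["completedAt"])

    return {
        # passCount + failCount should account for every question.
        "counts_add_up": manifest["passCount"] + manifest["failCount"]
        == manifest["questionCount"],
        # Assumption under test: overall is the unweighted category mean.
        "overall_is_category_mean": math.isclose(
            category_mean, nuance["overall"], rel_tol=1e-9
        ),
        "pass_rate_percent": 100 * manifest["passCount"] / manifest["questionCount"],
        "duration_seconds": (completed - started).total_seconds(),
    }


if __name__ == "__main__":
    for name, value in check_manifest(MANIFEST).items():
        print(f"{name}: {value}")
```

The pass rate (162 of 500) and the nuance overall score are different quantities and need not agree. The check only compares the overall score with the category mean.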
