Latest scores:
llamaindex-memory 0.0 on LoCoMo
llm-baseline 0.0 on LoCoMo
mem0-local 0.0 on LongMemEval
mem0-local 0.0 on LongMemEval
llamaindex-memory 0.0 on LongMemEval
llm-baseline 0.0 on LongMemEval
langchain-memory 0.0 on LongMemEval
cognee 0.0 on LongMemEval

13 systems independently scored
64 systems indexed

Run a Benchmark

Test your memory system against standardized benchmarks. Paste your endpoint, pick a benchmark, get a signed receipt.

Live runs coming soon — UI preview
1. Connect your system
   Provide your MCP endpoint or REST API URLs.

2. Choose benchmark
   Select the evaluation suite to run.

3. Scoring
   Pick a judge model for answer evaluation.

Results will appear here after your run completes.

For now, use the CLI: pip install benchd-harness && benchd run

CLI Alternative

Prefer the command line? The harness runs locally and produces the same signed receipts.

pip install benchd-harness && benchd run --adapter mcp --benchmark longmemeval-v1
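To script the command above, for example from a CI job, a thin Python wrapper is one option. This is a minimal sketch, not part of the harness: the function names `build_benchd_command` and `run_benchmark` are hypothetical, and the only flags and values used are the ones shown on this page (`--adapter mcp`, `--benchmark longmemeval-v1`). Whether other adapter or benchmark names are accepted is not confirmed here.

```python
import shutil
import subprocess


def build_benchd_command(adapter: str = "mcp",
                         benchmark: str = "longmemeval-v1") -> list[str]:
    """Build the argv for the CLI invocation shown on this page.

    The defaults are the only values this page documents; any other
    adapter or benchmark name is an assumption on the caller's side.
    """
    return ["benchd", "run", "--adapter", adapter, "--benchmark", benchmark]


def run_benchmark(adapter: str = "mcp",
                  benchmark: str = "longmemeval-v1") -> int:
    """Run the harness if it is installed and return its exit code."""
    # Fail early with a clear message instead of a bare FileNotFoundError.
    if shutil.which("benchd") is None:
        raise RuntimeError(
            "benchd not found on PATH; install it with: "
            "pip install benchd-harness"
        )
    # Passing a list (no shell=True) avoids shell quoting issues.
    completed = subprocess.run(build_benchd_command(adapter, benchmark),
                               check=False)
    return completed.returncode
```

Calling `run_benchmark()` with no arguments is equivalent to the one-line command above once `benchd-harness` is installed. Keeping the argv construction in its own function makes it possible to log or inspect the exact command before anything runs.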

The harness is open source: github.com/benchdai/harness
