Run a Benchmark
Test your memory system against standardized benchmarks. Paste your endpoint, pick a benchmark, get a signed receipt.
Live runs are coming soon. This page is a UI preview.
1. Connect your system
Provide your MCP endpoint or REST API URLs.
2. Choose benchmark
Select the evaluation suite to run.
3. Scoring
Pick a judge model for answer evaluation.
Results will appear here after your run completes.
For now, use the CLI: pip install benchd-harness && benchd run
CLI Alternative
Prefer the command line? The harness runs locally and produces the same signed receipts.
pip install benchd-harness && benchd run --adapter mcp --benchmark longmemeval-v1
The harness is open source: github.com/benchdai/harness
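If you want to start a run from a script rather than typing the command by hand, the CLI invocation above can be wrapped in a few lines of Python. This is a minimal sketch, not part of the harness: the helper names `build_benchd_command` and `run_benchmark` are hypothetical, and the only flags and values used are the ones shown on this page (`--adapter mcp`, `--benchmark longmemeval-v1`). Whether other adapters or benchmarks are accepted is not stated here, so treat the parameters as placeholders.

```python
import shlex
import shutil
import subprocess
from typing import List


def build_benchd_command(adapter: str = "mcp",
                         benchmark: str = "longmemeval-v1") -> List[str]:
    """Build the argument list for the `benchd run` command shown on this page.

    Hypothetical helper: it only assembles the flags documented above and
    does not validate that the adapter or benchmark name is supported.
    """
    return ["benchd", "run", "--adapter", adapter, "--benchmark", benchmark]


def run_benchmark(adapter: str = "mcp",
                  benchmark: str = "longmemeval-v1") -> int:
    """Run the harness as a subprocess and return its exit code."""
    # Fail early with a helpful message if the CLI is not installed.
    if shutil.which("benchd") is None:
        raise RuntimeError(
            "benchd not found on PATH; install it with: pip install benchd-harness"
        )
    cmd = build_benchd_command(adapter, benchmark)
    print("Running:", shlex.join(cmd))
    # check=False so the caller can inspect a non-zero exit code itself.
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_benchmark()` with no arguments runs the same command as the one-liner above. Passing the arguments as a list, rather than a single shell string, avoids quoting problems if a benchmark name ever contains unusual characters.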