Run a Benchmark

Test your memory system against standardized benchmarks: paste your endpoint, pick a benchmark, and get a signed receipt.

Live runs are coming soon. This page is a UI preview.

Connect your system

Provide your MCP endpoint or REST API URLs

MCP Server URL

API Key (optional)

Select the evaluation suite to run

Pick a judge model for answer evaluation

Results will appear here after your run completes.

For now, use the CLI: pip install benchd-harness && benchd run

Prefer the command line? The harness runs locally and produces the same signed receipts.

pip install benchd-harness && benchd run --adapter mcp --benchmark longmemeval-v1
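
To script the same run, for example from CI, the command above can be wrapped in a few lines of Python. This is a minimal sketch, not part of the harness: the helper names build_benchd_command and run_benchmark are hypothetical, it assumes the benchd executable is on your PATH after the pip install, and it passes only the two flags shown on this page.

```python
import shlex
import subprocess
from typing import List


def build_benchd_command(adapter: str = "mcp",
                         benchmark: str = "longmemeval-v1") -> List[str]:
    """Return the argument list for the `benchd run` command shown above.

    Hypothetical helper: only the flags that appear on this page are used.
    """
    return ["benchd", "run", "--adapter", adapter, "--benchmark", benchmark]


def run_benchmark(adapter: str = "mcp",
                  benchmark: str = "longmemeval-v1") -> int:
    """Run the harness and return the process exit code.

    Assumes benchd-harness is already installed (pip install benchd-harness).
    """
    cmd = build_benchd_command(adapter, benchmark)
    print("Running:", shlex.join(cmd))
    # Passing an argument list (no shell=True) avoids shell-quoting problems.
    completed = subprocess.run(cmd, check=False)
    return completed.returncode
```

Calling run_benchmark() with no arguments runs the same command as the line above. Keeping the command builder separate from the runner lets you log or inspect the exact command before anything executes.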

The harness is open source: github.com/benchdai/harness