Mastra
Self-Reported
MCP Endpoint: https://memory.mastra.ai/mcp
TypeScript-native AI agent framework with integrated memory, RAG, and workflow orchestration. Offers pluggable memory backends and first-class MCP support for building production agents.
These scores are self-reported by the vendor and have not been independently verified by Bench'd.
Scores from 0–100. Higher is better. LLM Baseline (no memory system) scores 57.6%.
Track: Conversational Memory
Track Index: No results yet
Benchmark Results
| Benchmark | Score | Status | Receipt |
|---|---|---|---|
| LongMemEval | Pending | Pending | -- |
| LoCoMo | Pending | Pending | -- |
| Reliability | Pending | Pending | -- |
| Truth Arbitration | Pending | Pending | -- |
| Memory Poisoning | Pending | Pending | -- |
| Budget Curves | Pending | Pending | -- |
| Other Benchmarks | | | |
| Knowledge Retrieval | Not applicable (outside Conversational Memory track) | | |
| Knowledge Scale | Not applicable (outside Conversational Memory track) | | |
Relative Performance vs All Benchmarked Systems
Compared against 16 scored systems. Each dot is a system. The amber dot is Mastra, and the amber line is the LLM Baseline (no memory).
| Category | Score | Percentile | No memory |
|---|---|---|---|
| Overall | 84.2 | 75th | 57.6% |
| Recall | 84.7 | 69th | 57.6% |
| Temporal | 85.8 | 75th | 57.6% |
| Reasoning | 78.9 | 75th | 57.6% |
Bench'd Memory Index
The BMI combines accuracy (70%) and efficiency (30%) into a single production-weighted score. Formula is public and versioned.
84.2 / 100
#1 of 8 systems (Top 12%)
Accuracy (70%): 84.2
Efficiency (30%): --
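As a rough illustration of the 70/30 weighting described above, the following TypeScript sketch combines the two components on the page's 0–100 scale. Only the two weights come from this page. The function name `benchdMemoryIndex`, the helper `assertScore`, the linear weighted sum, and the fallback to accuracy alone when efficiency is missing are all assumptions; the fallback is inferred from the values shown here (Accuracy 84.2, Efficiency not yet reported, BMI 84.2) and may not match the published, versioned formula.

```typescript
// Weights as stated on this page; everything else here is an assumption.
const ACCURACY_WEIGHT = 0.7;
const EFFICIENCY_WEIGHT = 0.3;

// Hypothetical helper: rejects scores outside the page's 0-100 scale.
function assertScore(name: string, value: number): void {
  if (!Number.isFinite(value) || value < 0 || value > 100) {
    throw new RangeError(`${name} must be between 0 and 100, got ${value}`);
  }
}

// Hypothetical sketch of a BMI-style score, not Bench'd's published formula.
// Assumes a linear weighted sum, and falls back to accuracy alone when
// efficiency is null (inferred from the page showing BMI 84.2 alongside
// Accuracy 84.2 and Efficiency "--").
function benchdMemoryIndex(accuracy: number, efficiency: number | null): number {
  assertScore("accuracy", accuracy);
  if (efficiency === null) {
    return accuracy;
  }
  assertScore("efficiency", efficiency);
  return ACCURACY_WEIGHT * accuracy + EFFICIENCY_WEIGHT * efficiency;
}
```

Under these assumptions, calling `benchdMemoryIndex(84.2, null)` returns the accuracy score unchanged, which matches the figures displayed above. Once an efficiency score is reported, it would contribute 30% of the combined value.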
Per-Benchmark Breakdown
| Benchmark | Verified | Nuance |
|---|---|---|
| LoCoMo | 82.4 | 75.9 |
| PersonaMem | 81.5 | 74.9 |