LangMem
Community-Verified
Standalone long-term memory SDK from LangChain for extracting, storing, and retrieving memories across agent sessions. Supports semantic and episodic memory with configurable storage backends.
Scores range from 0 to 100; higher is better. The LLM Baseline (no memory system) scores 57.6%.
Track: Conversational Memory
Track Index: 60.0/100
Based on 1 benchmark; 5 pending.
Benchmark Results
| Benchmark | Score | Status | Receipt |
|---|---|---|---|
| LongMemEval | Pending | Pending | -- |
| LoCoMo | Pending | Pending | -- |
| Reliability | 60.0 | Verified | View |
| Truth Arbitration | Pending | Pending | -- |
| Memory Poisoning | Pending | Pending | -- |
| Budget Curves | Pending | Pending | -- |
| Other Benchmarks | | | |
| Knowledge Retrieval | Not applicable (outside Conversational Memory track) | | |
| Knowledge Scale | Not applicable (outside Conversational Memory track) | | |
Relative Performance vs All Benchmarked Systems
Compared against 16 scored systems. Each dot is a system; the amber dot is LangMem, and the amber line marks the LLM Baseline (no memory, 57.6%).
Overall: 60.0 (44th percentile)
Recall: 0.0 (0th percentile)
Temporal: 0.0 (0th percentile)
Reasoning: 0.0 (0th percentile)
Bench'd Memory Index
The BMI combines accuracy (70%) and efficiency (30%) into a single production-weighted score. Formula is public and versioned.
60.0 / 100
#3 of 8 systems (top 37%)
Accuracy (70%): 60.0
Efficiency (30%): --
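The 70/30 weighting described above can be sketched as a small function. This is an illustration only, not the published Bench'd formula: the function name `bmi` and the fallback for a missing efficiency score are assumptions. The fallback (use accuracy alone) is inferred solely from the figures shown here, where accuracy is 60.0, efficiency is unavailable, and the BMI is 60.0.

```python
from typing import Optional

ACCURACY_WEIGHT = 0.70    # weight stated on the page
EFFICIENCY_WEIGHT = 0.30  # weight stated on the page


def bmi(accuracy: float, efficiency: Optional[float] = None) -> float:
    """Combine accuracy and efficiency (both on a 0-100 scale) into one score.

    ASSUMPTION: when no efficiency score is available, fall back to accuracy
    alone. The real, versioned formula may handle this case differently.
    """
    if efficiency is None:
        return round(accuracy, 1)
    return round(ACCURACY_WEIGHT * accuracy + EFFICIENCY_WEIGHT * efficiency, 1)


if __name__ == "__main__":
    # Values shown on this page: accuracy 60.0, efficiency not yet measured.
    print(bmi(60.0))
    # Hypothetical system with both components measured.
    print(bmi(60.0, 80.0))
```

Under these assumptions, a hypothetical efficiency score of 80.0 would lift the combined score above the accuracy figure, since 30% of the weight would then come from the higher component.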
Per-Benchmark Breakdown
| Benchmark | Verified | Nuance |
|---|---|---|
Performance Over Time — LongMemEval
2026-05-11 to 2026-05-13
Add badge to your README
Show your Bench'd score on your GitHub repo.
Markdown
[![Bench'd Verified: 60.0 BMI](https://img.shields.io/badge/Bench'd_BMI-60.0-D9982B?style=flat)](https://benchd.ai/system/langmem-benchd)
HTML
<a href="https://benchd.ai/system/langmem-benchd"><img src="https://img.shields.io/badge/Bench'd_BMI-60.0-D9982B?style=flat" alt="Bench'd Verified: 60.0 BMI" /></a>
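If the score changes, the badge snippet has to be regenerated. The sketch below builds the Markdown form from a score, reusing the Shields.io static badge URL pattern, color, and system page URL from the HTML snippet above. The helper names `shields_escape` and `badge_markdown` are hypothetical, and the escaping follows the Shields.io static badge convention (a space is written as `_`, a literal dash as `--`, a literal underscore as `__`).

```python
def shields_escape(text: str) -> str:
    """Escape text for one segment of a Shields.io static badge path."""
    # Order matters: double existing dashes and underscores first,
    # then turn spaces into single underscores.
    return text.replace("-", "--").replace("_", "__").replace(" ", "_")


def badge_markdown(score: float, system_slug: str = "langmem-benchd") -> str:
    """Return a Markdown image link for the badge (hypothetical helper)."""
    label = shields_escape("Bench'd BMI")
    message = shields_escape(f"{score:.1f}")
    image_url = f"https://img.shields.io/badge/{label}-{message}-D9982B?style=flat"
    page_url = f"https://benchd.ai/system/{system_slug}"
    alt_text = f"Bench'd Verified: {score:.1f} BMI"
    return f"[![{alt_text}]({image_url})]({page_url})"


if __name__ == "__main__":
    print(badge_markdown(60.0))
```

Keeping the escaping in its own function means the label and the score are handled the same way if either one ever contains a dash or an underscore.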