llamaindex-memory 0.0 on LoCoMo | llm-baseline 0.0 on LoCoMo | mem0-local 0.0 on LongMemEval | llamaindex-memory 0.0 on LongMemEval | llm-baseline 0.0 on LongMemEval | langchain-memory 0.0 on LongMemEval | cognee 0.0 on LongMemEval
13 systems independently scored | 64 systems indexed

LangMem

Community-Verified
LangChain Inc | Website | GitHub (2.1k) | Docs | Last tested May 13, 2026

Standalone long-term memory SDK from LangChain for extracting, storing, and retrieving memories across agent sessions. Supports semantic and episodic memory with configurable storage backends.

Scores range from 0 to 100; higher is better. The LLM Baseline (no memory system) scores 57.6%.

Track: Conversational Memory
Track Index: 60.0/100

Based on 1 benchmark. 5 pending.

Benchmark Results

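| Benchmark | Score | Status | Receipt |
|---|---|---|---|
| LongMemEval | Pending | Pending | -- |
| LoCoMo | Pending | Pending | -- |
| Reliability | 60.0 | Verified | View |
| Truth Arbitration | Pending | Pending | -- |
| Memory Poisoning | Pending | Pending | -- |
| Budget Curves | Pending | Pending | -- |
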
Other Benchmarks
Knowledge Retrieval: Not applicable (outside Conversational Memory track)
Knowledge Scale: Not applicable (outside Conversational Memory track)

Relative Performance vs All Benchmarked Systems

vs 16 scored systems

Each dot is a system. The amber dot is LangMem; the amber line is the LLM Baseline (no memory).

Overall: 60.0, 44th percentile (No memory: 57.6%; chart label: gbrain)
Recall: 0.0, 0th percentile (No memory: 57.6%; chart label: gbrain)
Temporal: 0.0, 0th percentile (No memory: 57.6%; chart label: gbrain)
Reasoning: 0.0, 0th percentile (No memory: 57.6%; chart label: gbrain)

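
The page does not say how its percentiles are computed. As a rough illustration only, the sketch below uses one common definition: the share of scored systems whose score is strictly below the given one. The function name `percentile_rank` and the sample scores are hypothetical and are not taken from Bench'd.

```python
from typing import Sequence


def percentile_rank(score: float, all_scores: Sequence[float]) -> float:
    """Percentage of systems scoring strictly below `score`.

    Assumption: a "strictly below" definition; the site may use another.
    """
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


# Hypothetical field of 16 systems: 7 below 60.0, the system itself, 8 above.
hypothetical_scores = [50.0] * 7 + [60.0] + [70.0] * 8
rank = percentile_rank(60.0, hypothetical_scores)
print(f"{rank:.2f}")
```

Under this definition a system with 7 of 16 peers below it lands at 43.75, which would display as the 44th percentile after rounding. Whether the site rounds or counts ties this way is an assumption.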
Bench'd Memory Index
The BMI combines accuracy (70%) and efficiency (30%) into a single production-weighted score. Formula is public and versioned.
60.0 / 100
#3 of 8 systems (Top 37%)
Accuracy (70%): 60.0
Efficiency (30%): --
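
The 70/30 weighting described above can be sketched as a weighted sum. This is only a minimal sketch: the public formula itself is not reproduced on this page, so the linear combination, the function name `bmi`, and the fallback to accuracy alone when efficiency is pending are all assumptions. The fallback is chosen because it matches the values shown here (accuracy 60.0, efficiency pending, BMI 60.0).

```python
from typing import Optional

ACCURACY_WEIGHT = 0.70    # weight stated on the page
EFFICIENCY_WEIGHT = 0.30  # weight stated on the page


def bmi(accuracy: float, efficiency: Optional[float] = None) -> float:
    """Combine accuracy and efficiency (both on a 0-100 scale) into one score."""
    if efficiency is None:
        # Assumption: with no efficiency score yet, report accuracy alone.
        return round(accuracy, 1)
    # Assumption: a plain weighted sum of the two components.
    return round(ACCURACY_WEIGHT * accuracy + EFFICIENCY_WEIGHT * efficiency, 1)


print(bmi(60.0))        # efficiency pending
print(bmi(60.0, 80.0))  # hypothetical efficiency score of 80.0
```

The second call uses a made-up efficiency value purely to show how the weights would interact once that component is scored.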

Per-Benchmark Breakdown

Benchmark | Verified | Nuance

Performance Over Time — LongMemEval

2026-05-11 to 2026-05-13
[Chart: score (0 to 100) by test date, 05-11 to 05-13, with a baseline reference line]

Add badge to your README

Show your Bench'd score on your GitHub repo.

Bench'd Verified: 60.0 BMI
Markdown
[![Bench'd Verified: 60.0 BMI](https://img.shields.io/badge/Bench'd_BMI-60.0-D9982B?style=flat&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAzMiAzMiI+PHJlY3Qgd2lkdGg9IjMyIiBoZWlnaHQ9IjMyIiByeD0iNiIgZmlsbD0iIzExMSIvPjx0ZXh0IHg9IjgiIHk9IjIyIiBmb250LXNpemU9IjIwIiBmb250LWZhbWlseT0ic2VyaWYiIGZpbGw9IiNmZmYiIGZvbnQtd2VpZ2h0PSI2MDAiPkInPC90ZXh0Pjwvc3ZnPg==)](https://benchd.ai/system/langmem-benchd)
HTML
<a href="https://benchd.ai/system/langmem-benchd"><img src="https://img.shields.io/badge/Bench'd_BMI-60.0-D9982B?style=flat" alt="Bench'd Verified: 60.0 BMI" /></a>
