
LlamaIndex Memory

Community-Verified
LlamaIndex | Website | GitHub (38.0k) | Docs | Last tested May 11, 2026

Memory module within the LlamaIndex framework providing chat memory, composable memory buffers, and vector-backed long-term storage. Tightly integrated with LlamaIndex's retrieval and indexing pipeline.

Scores range from 0 to 100; higher is better. The LLM Baseline (no memory system) scores 57.6%.

Track: Conversational Memory
Track Index: 63.4/100

Based on 6 benchmarks.

Benchmark Results

| Benchmark | Score | Status |
|---|---|---|
| LongMemEval | 59.0 | Verified |
| LoCoMo | 65.3 | Verified |
| Reliability | 56.0 | Verified |
| Truth Arbitration | 100.0 | Verified |
| Memory Poisoning | 0.0 | Verified |
| Budget Curves | 100.0 | Verified |

Other Benchmarks

| Benchmark | Status |
|---|---|
| Knowledge Retrieval | Not applicable (outside Conversational Memory track) |
| Knowledge Scale | Not applicable (outside Conversational Memory track) |
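
The Track Index of 63.4 is consistent with a plain unweighted mean of the six verified benchmark scores listed under Benchmark Results. The page does not state the formula here, so the sketch below is an assumption that happens to reproduce the published number, not the site's documented method. The function name `track_index` is hypothetical.

```python
from statistics import mean

# Verified scores copied from the Benchmark Results table.
scores = {
    "LongMemEval": 59.0,
    "LoCoMo": 65.3,
    "Reliability": 56.0,
    "Truth Arbitration": 100.0,
    "Memory Poisoning": 0.0,
    "Budget Curves": 100.0,
}


def track_index(benchmark_scores):
    """Unweighted mean of the benchmark scores, rounded to one decimal.

    Assumption: equal weight per benchmark. The source only says the
    index is "based on 6 benchmarks".
    """
    return round(mean(benchmark_scores.values()), 1)


print(track_index(scores))
```

Under this reading, the two 100.0 scores and the single 0.0 score carry the same weight as LongMemEval and LoCoMo, which is why the index sits above both of those headline benchmarks.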

Relative Performance vs All Benchmarked Systems

vs 16 scored systems

Each dot is a system. Amber dot is LlamaIndex Memory. Amber line = LLM Baseline (no memory).

| Dimension | Score | Percentile | No-memory baseline |
|---|---|---|---|
| Overall | 59.0 | 31st | 57.6% |
| Recall | 59.0 | 25th | 57.6% |
| Temporal | 59.0 | 56th | 57.6% |
| Reasoning | 59.0 | 56th | 57.6% |
Bench'd Memory Index
The BMI combines accuracy (70%) and efficiency (30%) into a single production-weighted score. Formula is public and versioned.
Score: 59.0 / 100
Rank: #3 of 8 systems (top 37%)
Accuracy (70%): 59.0
Efficiency (30%): --
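
As a rough illustration of the 70/30 weighting described above, the sketch below combines an accuracy score and an efficiency score into one number. The handling of a missing efficiency score is an assumption: the page reports a BMI of 59.0 with accuracy 59.0 and no efficiency value, which would be consistent with falling back to accuracy alone, but the versioned formula is not reproduced here. The function name `bmi` and the efficiency value 80.0 in the usage lines are hypothetical.

```python
from typing import Optional

ACCURACY_WEIGHT = 0.7    # stated on the page
EFFICIENCY_WEIGHT = 0.3  # stated on the page


def bmi(accuracy: float, efficiency: Optional[float]) -> float:
    """Weighted combination of accuracy and efficiency on a 0-100 scale.

    Assumption: when no efficiency score exists, the index falls back
    to the accuracy score unchanged.
    """
    if efficiency is None:
        return round(accuracy, 1)
    return round(ACCURACY_WEIGHT * accuracy + EFFICIENCY_WEIGHT * efficiency, 1)


# Values from the page: accuracy 59.0, efficiency not reported.
print(bmi(59.0, None))

# Hypothetical efficiency score, only to show the weighting at work.
print(bmi(59.0, 80.0))
```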

Per-Capability Score Matrix

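| Dimension | Budget Curves | Knowledge Retrieval | LoCoMo | LongMemEval | Memory Poisoning | Reliability | Truth Arbitration |
|---|---|---|---|---|---|---|---|
| Recall | -- | -- | 44.4 | 81.2 | -- | -- | -- |
| Temporal | -- | -- | 40.0 | 42.9 | -- | -- | -- |
| Reasoning | -- | -- | 80.0 | 46.2 | -- | -- | -- |
| Hallucination | -- | -- | -- | -- | -- | 0.0 | -- |
| Stale Memory | -- | -- | -- | -- | -- | 100.0 | -- |
| Entity Confusion | -- | -- | -- | -- | -- | 100.0 | -- |
| Deletion | -- | -- | -- | -- | -- | 20.0 | -- |
| Budget 1000 | 100.0 | -- | -- | -- | -- | -- | -- |
| Budget 10000 | 100.0 | -- | -- | -- | -- | -- | -- |
| Budget 2000 | 100.0 | -- | -- | -- | -- | -- | -- |
| Budget 500 | 100.0 | -- | -- | -- | -- | -- | -- |
| Budget 5000 | 100.0 | -- | -- | -- | -- | -- | -- |
| Conflict resolution | -- | -- | -- | -- | -- | -- | 100.0 |
| Document retrieval | -- | 100.0 | -- | -- | -- | -- | -- |
| Injection resistance | -- | -- | -- | -- | 0.0 | -- | -- |
| Knowledge update | -- | 80.0 | -- | -- | -- | -- | -- |
| Multi page | -- | 100.0 | -- | -- | -- | -- | -- |
| Semantic search | -- | 100.0 | -- | -- | -- | -- | -- |
| Overall | 100.0 | 95.0 | 65.3 | 56.0 | 0.0 | 56.0 | 100.0 |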
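
When reading the matrix, it helps to know that the Overall row is not always the plain average of the dimension rows above it. The sketch below averages the per-dimension scores for the four columns that have more than one dimension and compares them with the published Overall values. The page does not say how Overall is aggregated; one possible explanation (an assumption, not confirmed by the source) is that dimensions are weighted by their number of test questions. The names `per_dimension`, `published_overall`, and `dimension_means` are hypothetical.

```python
from statistics import mean

# Per-dimension scores copied from the matrix, grouped by benchmark column.
per_dimension = {
    "Knowledge Retrieval": [100.0, 80.0, 100.0, 100.0],
    "LoCoMo": [44.4, 40.0, 80.0],
    "LongMemEval": [81.2, 42.9, 46.2],
    "Reliability": [0.0, 100.0, 100.0, 20.0],
}

# Overall row copied from the matrix.
published_overall = {
    "Knowledge Retrieval": 95.0,
    "LoCoMo": 65.3,
    "LongMemEval": 56.0,
    "Reliability": 56.0,
}


def dimension_means(columns):
    """Unweighted mean of each column's dimension scores, to one decimal."""
    return {name: round(mean(values), 1) for name, values in columns.items()}


means = dimension_means(per_dimension)
for name, value in means.items():
    gap = round(published_overall[name] - value, 1)
    print(f"{name}: mean {value}, published {published_overall[name]}, gap {gap}")
```

Only the Knowledge Retrieval column matches its unweighted mean exactly; the other three differ, so the per-dimension rows should not be averaged by hand to recover an Overall score.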

Per-Benchmark Breakdown

| Benchmark | Verified | Nuance |
|---|---|---|
| LongMemEval | 74.1 | 67.5 |
| LoCoMo | 72.8 | 66.3 |

Performance Over Time — LongMemEval

2026-05-11 to 2026-05-13
[Line chart: LongMemEval score (y-axis 0 to 100) by date (05-11, 05-12, 05-13), with the baseline shown as a reference line. LlamaIndex: 59.0]

Add badge to your README

Show your Bench'd score on your GitHub repo.

Bench'd Verified: 59.0 BMI
Markdown
[![Bench'd Verified: 59.0 BMI](https://img.shields.io/badge/Bench'd_BMI-59.0-D9982B?style=flat&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAzMiAzMiI+PHJlY3Qgd2lkdGg9IjMyIiBoZWlnaHQ9IjMyIiByeD0iNiIgZmlsbD0iIzExMSIvPjx0ZXh0IHg9IjgiIHk9IjIyIiBmb250LXNpemU9IjIwIiBmb250LWZhbWlseT0ic2VyaWYiIGZpbGw9IiNmZmYiIGZvbnQtd2VpZ2h0PSI2MDAiPkInPC90ZXh0PjwvcHZnPg==)](https://benchd.ai/system/llamaindex-memory)
HTML
<a href="https://benchd.ai/system/llamaindex-memory"><img src="https://img.shields.io/badge/Bench'd_BMI-59.0-D9982B?style=flat" alt="Bench'd Verified: 59.0 BMI" /></a>
