LlamaIndex Memory

Community-Verified

LlamaIndex | Website | GitHub (38.0k) | Docs | Last tested May 11, 2026

Memory module within the LlamaIndex framework providing chat memory, composable memory buffers, and vector-backed long-term storage. Tightly integrated with LlamaIndex's retrieval and indexing pipeline.

Scores range from 0 to 100; higher is better. The LLM Baseline (no memory system) scores 57.6%.

Track: Conversational Memory

Track Index: 63.4/100, based on 6 benchmarks.

Benchmark Results

Benchmark	Score	Status	Receipt
LongMemEval	59.0	Verified	View
LoCoMo	65.3	Verified	View
Reliability	56.0	Verified	View
Truth Arbitration	100.0	Verified	View
Memory Poisoning	0.0	Verified	View
Budget Curves	100.0	Verified	View

Other Benchmarks

Knowledge Retrieval	Not applicable — outside Conversational Memory track
Knowledge Scale	Not applicable — outside Conversational Memory track
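
The Track Index of 63.4 is consistent with an unweighted mean of the six Conversational Memory scores listed above. The sketch below checks that arithmetic; the equal weighting is an inference from the published numbers, not a documented formula.

```python
from statistics import mean

# Scores copied from the Benchmark Results table above.
scores = {
    "LongMemEval": 59.0,
    "LoCoMo": 65.3,
    "Reliability": 56.0,
    "Truth Arbitration": 100.0,
    "Memory Poisoning": 0.0,
    "Budget Curves": 100.0,
}

# Assumption: the Track Index is a plain (unweighted) average, rounded to one decimal.
track_index = round(mean(scores.values()), 1)
print(track_index)
```

Note how strongly the 0.0 on Memory Poisoning pulls the index down: an average of this kind hides very uneven per-benchmark results.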

Relative Performance vs All Benchmarked Systems

vs 16 scored systems

Each dot is a system. Amber dot is LlamaIndex Memory. Amber line = LLM Baseline (no memory).

Dimension	Score	Percentile	LLM Baseline (no memory)
Overall	59.0	31st	57.6%
Recall	59.0	25th	57.6%
Temporal	59.0	56th	57.6%
Reasoning	59.0	56th	57.6%

Bench'd Memory Index

The BMI combines accuracy (70%) and efficiency (30%) into a single production-weighted score. The formula is public and versioned.

59.0 / 100

#3 of 8 systems (top 37%)

Accuracy (70%): 59.0

Efficiency (30%): --
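
The 70/30 weighting described above can be sketched as a small function. This is an illustration only: the function name `bmi` is hypothetical, and the fallback to accuracy alone when no efficiency score exists is an assumption based on the figures shown here (BMI 59.0 with accuracy 59.0 and no efficiency value), not a statement of the published formula.

```python
from typing import Optional

ACCURACY_WEIGHT = 0.7
EFFICIENCY_WEIGHT = 0.3


def bmi(accuracy: float, efficiency: Optional[float] = None) -> float:
    """Combine accuracy and efficiency (both 0-100) into one score.

    Assumption: when efficiency has not been measured, the score
    falls back to accuracy alone, matching the numbers on this page.
    """
    if efficiency is None:
        return round(accuracy, 1)
    return round(ACCURACY_WEIGHT * accuracy + EFFICIENCY_WEIGHT * efficiency, 1)


print(bmi(59.0))        # efficiency not yet measured
print(bmi(80.0, 60.0))  # hypothetical system with both components
```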

Per-Capability Score Matrix

Dimension	Budget Curves	Knowledge Retrieval	LoCoMo	LongMemEval	Memory Poisoning	Reliability	Truth Arbitration
Recall	--	--	44.4	81.2	--	--	--
Temporal	--	--	40.0	42.9	--	--	--
Reasoning	--	--	80.0	46.2	--	--	--
Hallucination	--	--	--	--	--	0.0	--
Stale Memory	--	--	--	--	--	100.0	--
Entity Confusion	--	--	--	--	--	100.0	--
Deletion	--	--	--	--	--	20.0	--
Budget 1000	100.0	--	--	--	--	--	--
Budget 10000	100.0	--	--	--	--	--	--
Budget 2000	100.0	--	--	--	--	--	--
Budget 500	100.0	--	--	--	--	--	--
Budget 5000	100.0	--	--	--	--	--	--
Conflict resolution	--	--	--	--	--	--	100.0
Document retrieval	--	100.0	--	--	--	--	--
Injection resistance	--	--	--	--	0.0	--	--
Knowledge update	--	80.0	--	--	--	--	--
Multi page	--	100.0	--	--	--	--	--
Semantic search	--	100.0	--	--	--	--	--
Overall	100.0	95.0	65.3	56.0	0.0	56.0	100.0

Per-Benchmark Breakdown

Benchmark	Harness	Judge	Verified	Nuance	Completed	Receipt
LongMemEval	v0.9.4	claude-sonnet-4-20250514	74.1	67.5	Apr 27, 2026	a7b8c9d0...
LoCoMo	v0.9.4	claude-sonnet-4-20250514	72.8	66.3	Apr 26, 2026	b8c9d0e1...

Performance Over Time — LongMemEval

LlamaIndex: 59.0 (2026-05-11 to 2026-05-13)

Add badge to your README

Show your Bench'd score on your GitHub repo.

Markdown

[![Bench'd Verified: 59.0 BMI](https://img.shields.io/badge/Bench'd_BMI-59.0-D9982B?style=flat&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAzMiAzMiI+PHJlY3Qgd2lkdGg9IjMyIiBoZWlnaHQ9IjMyIiByeD0iNiIgZmlsbD0iIzExMSIvPjx0ZXh0IHg9IjgiIHk9IjIyIiBmb250LXNpemU9IjIwIiBmb250LWZhbWlseT0ic2VyaWYiIGZpbGw9IiNmZmYiIGZvbnQtd2VpZ2h0PSI2MDAiPkInPC90ZXh0PjwvcHZnPg==)](https://benchd.ai/system/llamaindex-memory)

HTML

<a href="https://benchd.ai/system/llamaindex-memory"><img src="https://img.shields.io/badge/Bench'd_BMI-59.0-D9982B?style=flat" alt="Bench'd Verified: 59.0 BMI" /></a>
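
If the score changes between test runs, the Markdown badge can be regenerated rather than edited by hand. The sketch below mirrors the URL pattern of the HTML snippet above (without the embedded logo); the function name `badge_markdown` and its parameters are hypothetical.

```python
def badge_markdown(score: float, system_slug: str, color: str = "D9982B") -> str:
    """Build a README badge in the same shape as the snippets above.

    Uses the shields.io static badge path: /badge/<label>-<message>-<color>.
    In that path an underscore renders as a space. A message containing a
    literal dash would need it doubled, which scores from 0 to 100 never do.
    """
    message = f"{score:.1f}"
    badge_url = (
        f"https://img.shields.io/badge/Bench'd_BMI-{message}-{color}?style=flat"
    )
    page_url = f"https://benchd.ai/system/{system_slug}"
    alt_text = f"Bench'd Verified: {message} BMI"
    return f"[![{alt_text}]({badge_url})]({page_url})"


print(badge_markdown(59.0, "llamaindex-memory"))
```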
