Mem0 OSS

Community-Verified

Mem0 Inc · Website · GitHub (24.8k) · Docs · Last tested May 12, 2026

Open-source version of the Mem0 memory platform. Community-verified benchmark results on LongMemEval using the OSS package with default configuration.

Scores range from 0 to 100; higher is better. The LLM Baseline (no memory system) scores 57.6%.

Track: Conversational Memory

Track Index

37.4/100

Based on 6 benchmarks.

Benchmark Results

Benchmark	Score	Status	Receipt
LongMemEval	32.4	Verified	View
LoCoMo	0.0	Verified	View
Reliability	52.0	Verified	View
Truth Arbitration	40.0	Verified	View
Memory Poisoning	0.0	Verified	View
Budget Curves	100.0	Verified	View

Other Benchmarks

Knowledge Retrieval	Not applicable (outside the Conversational Memory track)
Knowledge Scale	Not applicable (outside the Conversational Memory track)
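
The page does not state how the Track Index is derived from the six benchmark scores. The figures are, however, consistent with a plain unweighted mean, and the sketch below checks that arithmetic. The averaging rule is an assumption inferred from the numbers, not a documented formula.

```python
from statistics import mean

# Scores copied from the Benchmark Results table.
scores = {
    "LongMemEval": 32.4,
    "LoCoMo": 0.0,
    "Reliability": 52.0,
    "Truth Arbitration": 40.0,
    "Memory Poisoning": 0.0,
    "Budget Curves": 100.0,
}

# Assumption: the Track Index is the unweighted mean of the track's benchmarks,
# rounded to one decimal place.
track_index = round(mean(scores.values()), 1)
print(f"Track Index: {track_index}/100 over {len(scores)} benchmarks")
```

Under this reading the two 0.0 scores (LoCoMo and Memory Poisoning) pull the index down as much as the 100.0 on Budget Curves pulls it up.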

Relative Performance vs All Benchmarked Systems

vs 16 scored systems

[Chart: each dot is a system. The amber dot is Mem0 OSS and the amber line is the LLM Baseline (no memory) at 57.6%.]

Category	Score	Percentile
Overall	32.4	13th
Recall	47.4	19th
Temporal	32.2	25th
Reasoning	15.0	25th
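
The percentile definition is not given on the page. The 13th, 19th and 25th percentiles reported against 16 scored systems are consistent with k/16 for k = 2, 3 and 4, rounded half up. The sketch below reproduces that arithmetic; the function name `percentile_rank` and the meaning of `k` (for example, systems scoring at or below Mem0 OSS) are assumptions.

```python
def percentile_rank(k: int, total: int) -> int:
    """Return 100 * k / total rounded half up to a whole percentile.

    Assumption: k counts the systems at or below the one being ranked.
    The page does not say whether ties or the system itself are included.
    """
    return int(100 * k / total + 0.5)

TOTAL_SYSTEMS = 16  # "vs 16 scored systems"

for k in (2, 3, 4):
    print(k, percentile_rank(k, TOTAL_SYSTEMS))
```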

Bench'd Memory Index

The BMI combines accuracy (70%) and efficiency (30%) into a single production-weighted score. Formula is public and versioned.

28.4 / 100

#6 of 8 systems · Top 75%

Accuracy (70%): 32.4

Efficiency (30%): 94.6
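
The versioned BMI formula itself is not reproduced on this page. As a rough check, the sketch below combines the two displayed components with the stated 70/30 weights. A plain weighted sum is an assumption, and it does not reproduce the displayed 28.4, so the published formula evidently includes a step not shown here.

```python
ACCURACY_WEIGHT = 0.70
EFFICIENCY_WEIGHT = 0.30

accuracy = 32.4    # "Accuracy (70%)" as displayed
efficiency = 94.6  # "Efficiency (30%)" as displayed

# Assumption: a plain weighted sum of the two displayed components.
accuracy_term = ACCURACY_WEIGHT * accuracy
efficiency_term = EFFICIENCY_WEIGHT * efficiency
naive_bmi = accuracy_term + efficiency_term

print(f"accuracy term:   {accuracy_term:.2f}")
print(f"efficiency term: {efficiency_term:.2f}")
print(f"naive sum:       {naive_bmi:.2f}  (page shows 28.4)")
```

The efficiency term alone rounds to the displayed 28.4. The page does not explain why, so consult the public formula before relying on either reading.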

Efficiency Metrics

Avg Latency

Average time to retrieve memories and generate an answer. Lower is better.

7.6 s per recall query

Tokens / Correct

Average tokens consumed per correctly answered question. Lower means more efficient.

544 tokens per correct answer

Recall Tokens

Average tokens returned by the memory system per query. Lower means tighter retrieval.

167 tokens per retrieval (average)
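
To make the three definitions above concrete, here is a minimal sketch of how such metrics could be computed from per-query logs. The `QueryRecord` type, its field names and the sample values are hypothetical, not taken from the benchmark harness. Dividing all consumed tokens by the number of correct answers is one possible reading of "Tokens / Correct".

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QueryRecord:
    """Hypothetical per-query log entry; field names are illustrative."""
    latency_s: float    # time to retrieve memories and generate the answer
    total_tokens: int   # tokens consumed answering the question
    recall_tokens: int  # tokens returned by the memory system
    correct: bool       # whether the judge marked the answer correct

def avg_latency(records: list[QueryRecord]) -> float:
    return sum(r.latency_s for r in records) / len(records)

def tokens_per_correct(records: list[QueryRecord]) -> Optional[float]:
    # Assumption: all consumed tokens are charged against the correct answers.
    n_correct = sum(r.correct for r in records)
    if n_correct == 0:
        return None  # undefined when nothing was answered correctly
    return sum(r.total_tokens for r in records) / n_correct

def avg_recall_tokens(records: list[QueryRecord]) -> float:
    return sum(r.recall_tokens for r in records) / len(records)

# Illustrative data only; these are not benchmark results.
sample = [
    QueryRecord(6.0, 500, 150, True),
    QueryRecord(8.0, 600, 180, False),
    QueryRecord(10.0, 400, 210, True),
]
print(avg_latency(sample), tokens_per_correct(sample), avg_recall_tokens(sample))
```

Under this reading, wrong answers still cost tokens, so a system that retrieves tightly but answers poorly scores badly on Tokens / Correct.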

Per-Benchmark Breakdown

Benchmark	Harness	Judge	Verified	Nuance	Completed	Receipt

Performance Over Time — LongMemEval

2026-05-11 to 2026-05-13

Add badge to your README

Show your Bench'd score on your GitHub repo.

Markdown

[![Bench'd Verified: 28.4 BMI](https://img.shields.io/badge/Bench'd_BMI-28.4-D9982B?style=flat&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAzMiAzMiI+PHJlY3Qgd2lkdGg9IjMyIiBoZWlnaHQ9IjMyIiByeD0iNiIgZmlsbD0iIzExMSIvPjx0ZXh0IHg9IjgiIHk9IjIyIiBmb250LXNpemU9IjIwIiBmb250LWZhbWlseT0ic2VyaWYiIGZpbGw9IiNmZmYiIGZvbnQtd2VpZ2h0PSI2MDAiPkInPC90ZXh0Pjwvc3ZnPg==)](https://benchd.ai/system/mem0-oss)

HTML

<a href="https://benchd.ai/system/mem0-oss"><img src="https://img.shields.io/badge/Bench'd_BMI-28.4-D9982B?style=flat" alt="Bench'd Verified: 28.4 BMI" /></a>
