Cognee

Community-Verified

Website | GitHub (3.8k) | Docs | Last tested May 16, 2026

Memory management layer that builds dynamic knowledge graphs from conversations and documents. Combines semantic chunking with graph-based retrieval for contextual recall.

Scores range from 0 to 100; higher is better. The LLM Baseline (no memory system) scores 57.6%.

Track: Knowledge Graph

Track Index: No results yet

Benchmark Results

Benchmark	Score	Status	Receipt
Knowledge Retrieval	Pending	Pending	--
Knowledge Scale	Pending	Pending	--
Truth Arbitration	Pending	Pending	--
Budget Curves	Pending	Pending	--
Reliability	Pending	Pending	--
Other Benchmarks
LongMemEval	Not applicable — outside Knowledge Graph track
LoCoMo	Not applicable — outside Knowledge Graph track
Memory Poisoning	Not applicable — outside Knowledge Graph track

Relative Performance vs All Benchmarked Systems

vs 16 scored systems

Each dot is a system. The amber dot is Cognee; the amber line marks the LLM Baseline (no memory).

Dimension	No memory	Labeled system	Percentile
Overall	57.6%	gbrain	0.00th
Recall	57.6%	gbrain	0.00th
Temporal	57.6%	gbrain	0.00th
Reasoning	57.6%	gbrain	0.00th
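
The page does not say how the percentile is computed. As a rough sketch, assuming it is the share of scored systems that fall strictly below a given score, a 0.00th percentile means no system scored lower. The function name and the sample scores are hypothetical and are not taken from the leaderboard.

```python
from typing import Sequence


def percentile_rank(score: float, all_scores: Sequence[float]) -> float:
    """Percent of systems whose score is strictly below `score`.

    Assumed definition; the leaderboard may use a different convention
    (for example, counting ties as half).
    """
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


# Hypothetical scores for illustration only.
sample_scores = [0.0, 20.0, 57.6, 80.0]
print(f"{percentile_rank(0.0, sample_scores):.2f}th percentile")
```

Under this definition the lowest-scoring system always sits at the 0.00th percentile, regardless of how many systems are scored.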

Bench'd Memory Index

The BMI combines accuracy (70%) and efficiency (30%) into a single production-weighted score. Formula is public and versioned.

0.0 / 100

#7 of 8 systems (Top 87%)

Accuracy (70%): 0.0

Efficiency (30%): --
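
The 70/30 weighting can be illustrated with a short sketch. It assumes the BMI is a plain weighted sum of two 0 to 100 component scores and that a missing efficiency value (shown as "--") counts as 0.0. Both are assumptions; the published, versioned formula may differ.

```python
from typing import Optional

ACCURACY_WEIGHT = 0.70    # weight stated on the page
EFFICIENCY_WEIGHT = 0.30  # weight stated on the page


def bmi(accuracy: float, efficiency: Optional[float]) -> float:
    """Blend accuracy and efficiency into one 0-100 score.

    Assumption: a missing efficiency score is treated as 0.0.
    """
    eff = 0.0 if efficiency is None else efficiency
    for name, value in (("accuracy", accuracy), ("efficiency", eff)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return round(ACCURACY_WEIGHT * accuracy + EFFICIENCY_WEIGHT * eff, 1)


# Accuracy 0.0 with no efficiency score, as shown for Cognee.
print(bmi(0.0, None))
```

With an accuracy of 0.0 the result is 0.0 under any treatment of the missing efficiency score that does not award points for it.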

Per-Capability Score Matrix

Dimension	LongMemEval	Reliability
Recall	50.0	--
Temporal	0.0	--
Reasoning	0.0	--
Hallucination	--	0.0
Stale Memory	--	0.0
Entity Confusion	--	0.0
Deletion	--	0.0
Overall	20.0	0.0

Per-Benchmark Breakdown

Benchmark	Harness	Judge	Verified	Nuance	Completed	Receipt
LongMemEval	v0.1.0	openai/gpt-4o-mini	0.0	0.0	May 11, 2026	16795daf...
LongMemEval	v0.9.4	claude-sonnet-4-20250514	79.6	73.1	May 2, 2026	c3d4e5f6...
LoCoMo	v0.9.4	claude-sonnet-4-20250514	78.3	72.0	May 1, 2026	d4e5f6a7...

Performance Over Time — LongMemEval

2026-05-11 to 2026-05-13

Cognee: 20.0

Add badge to your README

Show your Bench'd score on your GitHub repo.

Markdown

[![Bench'd Verified: 0.0 BMI](https://img.shields.io/badge/Bench'd_BMI-0.0-D9982B?style=flat&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAzMiAzMiI+PHJlY3Qgd2lkdGg9IjMyIiBoZWlnaHQ9IjMyIiByeD0iNiIgZmlsbD0iIzExMSIvPjx0ZXh0IHg9IjgiIHk9IjIyIiBmb250LXNpemU9IjIwIiBmb250LWZhbWlseT0ic2VyaWYiIGZpbGw9IiNmZmYiIGZvbnQtd2VpZ2h0PSI2MDAiPkInPC90ZXh0Pjwvc3ZnPg==)](https://benchd.ai/system/cognee)

HTML

<a href="https://benchd.ai/system/cognee"><img src="https://img.shields.io/badge/Bench'd_BMI-0.0-D9982B?style=flat" alt="Bench'd Verified: 0.0 BMI" /></a>
