Most AI memory systems treat every stored fact the same way. A user preference from last week and a mention from six months ago get equal weight in retrieval. This doesn't match how useful memory actually works—and it's why your AI agent's context window fills up with stale, irrelevant facts.
The fix comes from a 140-year-old psychology experiment.
In 1885, Hermann Ebbinghaus published his research on memory retention. His key finding: memory retention decays exponentially over time, following a predictable curve. The stronger the initial encoding (how important the memory is), the slower it decays.
Retention R follows R = e^(-t/S), where t is time elapsed and S is the memory's stability. A strongly encoded memory fades slowly. A weakly encoded one fades fast.
This is how human memory works. You remember yesterday's important conversation clearly. You vaguely remember a passing comment from last month. You've completely forgotten what you ate for lunch three Tuesdays ago. Your brain doesn't treat all memories equally—it prioritizes by importance and recency.
AI memory systems should work the same way.
Here's a concrete example. Say your AI assistant has stored 500 memories about a user named Alice over six months. Alice mentions she's moving from New York to San Francisco. She asks: "What's a good neighborhood for me?"
With flat vector retrieval, the system ranks results purely by semantic similarity, so several of the top hits are old memories about New York. They're semantically close to the query but completely irrelevant now, and a flat retrieval system has no way to know that.

With decay-ranked retrieval, the same search surfaces the right context. The old New York memories are still there, but they've decayed, so the system naturally prioritizes the recent, relevant facts about the move to San Francisco.
Smara applies the forgetting curve at retrieval time. Every memory has two key attributes: importance (0.0 to 1.0, set when storing) and decay_score (calculated live).
```typescript
function ebbinghaus(createdAt: Date, importance: number): number {
  // Age of the memory in days.
  const days = (Date.now() - createdAt.getTime()) / (1000 * 60 * 60 * 24);
  // Higher importance means slower decay; the 0.1 floor keeps nothing from decaying instantly.
  const timeConstant = Math.max(importance, 0.1) * 10; // days until the score falls to ~0.37
  return Math.exp(-days / timeConstant);
}
```
A memory with importance: 1.0 has a 10-day time constant. After 10 days, its decay score drops to ~0.37 (1/e). After 20 days, ~0.14.

A memory with importance: 0.3 has a 3-day time constant. After 3 days, ~0.37. After a week, ~0.10. It fades much faster.
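To sanity-check those numbers against the ebbinghaus function above (the daysAgo helper is just for the example):

```typescript
// Build a Date n days in the past.
const daysAgo = (n: number) => new Date(Date.now() - n * 24 * 60 * 60 * 1000);

console.log(ebbinghaus(daysAgo(10), 1.0)); // ≈ 0.37
console.log(ebbinghaus(daysAgo(20), 1.0)); // ≈ 0.14
console.log(ebbinghaus(daysAgo(7), 0.3));  // ≈ 0.10
```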
The final retrieval score blends vector similarity with decay:
```typescript
function blendScore(similarity: number, decayScore: number): number {
  return similarity * 0.7 + decayScore * 0.3;
}
```
70% semantic relevance, 30% temporal freshness. A highly relevant old memory can still beat a vaguely relevant new one—but all else being equal, recent memories win.
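As a sketch of how the two functions combine into a retrieval pass (the Memory shape and the similarityFor callback are illustrative, not part of Smara's API):

```typescript
interface Memory {
  fact: string;
  importance: number; // 0.0–1.0, set when the memory was stored
  createdAt: Date;
}

// Sort memories by the blended score, highest first.
function rankMemories(
  memories: Memory[],
  similarityFor: (m: Memory) => number, // e.g. cosine similarity to the query embedding
): Memory[] {
  return [...memories].sort((a, b) => {
    const scoreA = blendScore(similarityFor(a), ebbinghaus(a.createdAt, a.importance));
    const scoreB = blendScore(similarityFor(b), ebbinghaus(b.createdAt, b.importance));
    return scoreB - scoreA;
  });
}
```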
"Alice is vegetarian" — sim: 0.92
"Alice loves sushi" — sim: 0.89
"Alice is trying vegan" — sim: 0.87
Contradictions unsorted. Which is current?
"Alice is trying vegan" — score: 0.894
"Alice is vegetarian" — score: 0.680
"Alice loves sushi" — score: 0.632
Most recent fact surfaces first.
Here's the same query through Smara's search API:
curl "https://api.smara.io/v1/memories/search?user_id=alice&q=food+preferences"
{
"results": [
{
"fact": "Alice is trying a vegan diet",
"similarity": 0.87,
"decay_score": 0.95,
"score": 0.894,
"created_at": "2026-04-25T..."
},
{
"fact": "Alice is vegetarian",
"similarity": 0.92,
"decay_score": 0.12,
"score": 0.680,
"created_at": "2025-11-03T..."
}
]
}
Smara goes further than decay scoring. When you store a new memory, it checks for near-duplicates and contradictions using cosine similarity bands:
| Cosine Range | Action |
|---|---|
| ≥ 0.985 | True duplicate — skip storage |
| 0.94 – 0.985 | Contradiction — store new, soft-delete old |
| < 0.94 | New fact — store alongside existing |
```bash
# This automatically handles the old "vegetarian" memory
curl -X POST https://api.smara.io/v1/memories \
  -H "Authorization: Bearer smara_..." \
  -d '{"user_id": "alice", "fact": "Alice is trying a vegan diet", "importance": 0.8}'
```

The response shows what happened:

```json
{
  "action": "replaced",
  "id": "new-memory-id",
  "replaced_id": "old-vegetarian-memory-id"
}
```
As a guide for choosing importance values:

| Importance | Time constant | Use for |
|---|---|---|
| 1.0 | 10 days | Core identity facts, strong preferences |
| 0.7 | 7 days | Current projects, active goals |
| 0.5 | 5 days | General preferences, casual mentions |
| 0.3 | 3 days | Temporary states, passing interests |
| 0.1 | 1 day | Ephemeral context, one-off mentions |
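For example, storing two facts from opposite ends of that table (the facts and the remember helper are made up for illustration; the endpoint and body fields match the earlier curl example, and SMARA_API_KEY is assumed to hold your key):

```typescript
// Store one fact via the POST /v1/memories endpoint shown above.
async function remember(userId: string, fact: string, importance: number): Promise<void> {
  await fetch("https://api.smara.io/v1/memories", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.SMARA_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ user_id: userId, fact, importance }),
  });
}

await remember("alice", "Alice is allergic to peanuts", 1.0);    // core fact, 10-day time constant
await remember("alice", "Alice might try a pottery class", 0.3); // passing interest, 3-day time constant
```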
For the fastest integration, use Smara's context endpoint. It returns decay-ranked memories pre-formatted for LLM system prompts:
curl "https://api.smara.io/v1/users/alice/context?q=weekend+plans&top_n=5"
{
"context": "[1] (importance: 0.8, decay: 0.95, source: api) Alice is planning a hike this Saturday\n[2] (importance: 0.6, decay: 0.72, source: api) Alice prefers morning activities",
"memories": [...]
}
Drop that context string directly into your system prompt. The LLM gets ranked, relevant context with zero post-processing.
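A minimal sketch of wiring that into a chat call (the buildSystemPrompt helper and SMARA_API_KEY are assumptions for the example; only the endpoint and response shape are from above):

```typescript
// Fetch decay-ranked context for a user and drop it straight into a system prompt.
async function buildSystemPrompt(userId: string, query: string): Promise<string> {
  const url =
    `https://api.smara.io/v1/users/${encodeURIComponent(userId)}/context` +
    `?q=${encodeURIComponent(query)}&top_n=5`;
  const res = await fetch(url, {
    headers: { Authorization: `Bearer ${process.env.SMARA_API_KEY}` },
  });
  const { context } = await res.json();
  return `You are a helpful assistant.\n\nWhat you know about this user:\n${context}`;
}
```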
Flat memory retrieval was a reasonable first approach, but it breaks down as memories accumulate. Ebbinghaus decay curves solve this by adding the dimension that vector search misses: time. Important recent facts surface. Stale facts fade. Contradictions resolve themselves.
This isn't a theoretical improvement. It's the difference between an AI assistant that confidently tells Alice about great restaurants in New York (where she used to live) and one that recommends spots in San Francisco (where she lives now).
Try Smara free — see decay scoring in action with your first 100 memories.