Your AI Agent Shouldn't Treat You Like a Stranger Every Morning
ALMA v0.9.0 scores R@5=0.964 on LongMemEval — beating every open-source competitor. No API keys. No cloud LLMs. Runs entirely on your laptop.
Every AI agent starts from zero. Every time.
You open Claude. You explain your project. You describe your preferences. You remind it of the bug you already fixed last week. Then the session ends, and tomorrow you do it all again.
Claude Code, ChatGPT, and Gemini all have “memory” now. But here’s the thing — their memory is a notepad. It remembers what you told it. It doesn’t learn from what happened.
That’s the difference between remembering and learning.
What is ALMA?
ALMA (Agent Learning Memory Architecture) is an open-source Python library that gives AI agents persistent memory that actually learns and improves over time.
It sits between your AI and a database you control. Before every task, it retrieves what the agent learned from past runs. After every task, it stores the outcome. Over time, the system builds a compounding knowledge base.
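The retrieve-before / store-after pattern can be sketched in a few lines of plain Python. This is an illustrative in-memory toy, not ALMA's actual API — the `Outcome` and `MemoryStore` names here are hypothetical:

```python
# Sketch of the retrieve-before / store-after loop. The store is a plain
# in-memory list with naive word-overlap "relevance"; ALMA's real storage
# and retrieval layers are far more capable.
from dataclasses import dataclass, field

@dataclass
class Outcome:
    task: str
    strategy: str
    success: bool

@dataclass
class MemoryStore:
    outcomes: list = field(default_factory=list)

    def retrieve(self, task: str, k: int = 5):
        # Naive relevance: the stored task shares a word with the new one.
        words = set(task.lower().split())
        hits = [o for o in self.outcomes if words & set(o.task.lower().split())]
        return hits[:k]

    def store(self, outcome: Outcome):
        self.outcomes.append(outcome)

store = MemoryStore()

# After a task: record what happened.
store.store(Outcome("deploy api service", "blue-green", True))

# Before the next task: pull what past runs learned.
context = store.retrieve("deploy payment service")
assert context[0].strategy == "blue-green"
```

The point is the loop, not the lookup: every task both consumes and produces memory, which is what makes the knowledge base compound.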
But here’s what makes ALMA different from every other memory solution:
It doesn’t just store text. It stores intelligence.
ALMA classifies memories into 5 types:
- Heuristics — Strategies that proved to work (“For forms with >5 fields, validate incrementally”)
- Outcomes — What happened when an agent tried something (success/failure, duration, strategy used)
- Anti-Patterns — What NOT to do, with an explanation of why and what to do instead
- Domain Knowledge — Facts the agent accumulated over time
- User Preferences — Your constraints and preferences
After 3+ similar successful outcomes, ALMA automatically creates a reusable heuristic. After 2+ similar failures, it creates an anti-pattern. Unused memories decay over time. Reinforced memories get stronger.
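The consolidation rules above (3+ successes promote a heuristic, 2+ failures create an anti-pattern, unused memories decay) can be sketched as follows. The thresholds come from the text; the code itself is an illustrative sketch, not ALMA's implementation:

```python
from collections import Counter

def consolidate(outcomes):
    """Promote repeated results into heuristics / anti-patterns.
    Thresholds follow the text: 3+ similar successes -> heuristic,
    2+ similar failures -> anti-pattern."""
    successes = Counter(o["strategy"] for o in outcomes if o["success"])
    failures = Counter(o["strategy"] for o in outcomes if not o["success"])
    heuristics = [s for s, n in successes.items() if n >= 3]
    anti_patterns = [s for s, n in failures.items() if n >= 2]
    return heuristics, anti_patterns

def decay(strength, periods_unused, rate=0.9):
    # Unused memories weaken each period; reinforcement resets the clock.
    return strength * rate ** periods_unused

outcomes = (
    [{"strategy": "blue-green", "success": True}] * 3
    + [{"strategy": "big-bang", "success": False}] * 2
)
heuristics, anti_patterns = consolidate(outcomes)
assert heuristics == ["blue-green"]
assert anti_patterns == ["big-bang"]
```

The decay rate is an assumed placeholder; the shape of the rule (exponential weakening of unreinforced memories) is what matters.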
This isn’t a vector database with a search function. It’s a learning loop.
The Benchmark: #1 on LongMemEval
We benchmarked ALMA against LongMemEval (ICLR 2025) — the standard benchmark for AI agent memory systems. 500 questions testing retrieval across ~53 conversation sessions per question.
Results:
- ALMA v0.9.0 — R@5 = 0.964 (no API keys needed)
- Hindsight — R@5 = 0.914 (requires Gemini-3 Pro API)
- Zep/Graphiti — R@5 = 0.638 (requires API keys)
- Mem0 — R@5 = 0.490 (no API keys)
R@5 = 0.964 means when your agent asks “what did we discuss about authentication?”, the correct answer is in the top 5 results 96.4% of the time.
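R@5 is simple to compute: for each query, check whether the relevant item appears in the top 5 ranked results, then average across queries. A minimal sketch with made-up ids:

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of queries whose relevant item appears in the top k."""
    hits = sum(
        1 for ranked, relevant in zip(ranked_ids, relevant_ids)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_ids)

# Three queries; the single relevant memory id is listed per query.
ranked = [["a", "b", "c", "d", "e", "f"],
          ["x", "y", "z", "a", "b", "c"],
          ["m", "n", "o", "p", "q", "r"]]
relevant = ["c", "b", "r"]  # "r" sits at rank 6, outside the top 5
assert recall_at_k(ranked, relevant, k=5) == 2 / 3
```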
No cloud APIs. No GPU. Runs entirely on your machine with a 90MB embedding model.
The Journey: 0.236 to 0.964
When we first ran the benchmark, ALMA scored R@5 = 0.236. Terrible.
But the data told us something interesting: R@50 = 1.000. ALMA found the correct answer every single time — it just couldn’t rank it in the top 5.
The problem wasn’t retrieval. It was ranking. FAISS was computing perfect similarity scores, but they were being silently discarded in the pipeline. Every memory got a default score of 1.0, making the ranking random.
36 lines of code fixed it. The similarity scores now flow end-to-end from FAISS through the storage layer to the scorer.
0.236 → 0.964. Same embeddings. Same data. Proper score propagation.
The lesson: in memory systems, the ranking matters more than the retrieval.
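The failure mode is easy to reproduce in miniature. This is an illustrative reconstruction of the bug class, not ALMA's actual pipeline code: when every item carries the same default score, sorting by score changes nothing, and the "ranking" is just insertion order.

```python
# Every memory got a default score of 1.0 -> sorting is a no-op
# (Python's sort is stable), so the order is whatever arrived first.
memories = [("irrelevant-1", 1.0), ("irrelevant-2", 1.0), ("correct", 1.0)]
broken = sorted(memories, key=lambda m: m[1], reverse=True)
assert [m[0] for m in broken] == ["irrelevant-1", "irrelevant-2", "correct"]

# Propagating the real similarity scores end-to-end restores the ranking.
memories = [("irrelevant-1", 0.31), ("irrelevant-2", 0.12), ("correct", 0.97)]
fixed = sorted(memories, key=lambda m: m[1], reverse=True)
assert fixed[0][0] == "correct"
```

This is also why R@50 was perfect while R@5 was terrible: the right answer was always retrieved, just never ordered to the top.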
“But Claude Already Has Memory…”
Yes. And ALMA works WITH it, not against it.
Use Claude’s memory for quick preferences (“I like TypeScript”). Use ALMA for:
- Strategy tracking — Which deployment approach worked for this service? Blue-green succeeded 8/10 times. Rolling updates caused 2 incidents.
- Failure prevention — Your agent won’t try the approach that failed last time. It knows why it failed and what to do instead.
- Cross-platform memory — Claude doesn’t know what ChatGPT learned. ALMA does.
- Multi-agent teams — Junior agents inherit knowledge from senior agents. Teams share across roles.
How to Start
pip install alma-memory[local]
That’s it. SQLite + FAISS + local embeddings. No servers, no accounts, no API keys. Your first memory in under 5 minutes.
Open Source, MIT Licensed
ALMA is free to use, modify, and distribute. We believe the memory layer for AI agents should be open infrastructure — like databases, not like SaaS.
2,121 tests passing. 7 storage backends (SQLite, PostgreSQL, Qdrant, Pinecone, Chroma, Azure Cosmos, file). 22 MCP tools for Claude Code integration. Full benchmark reproducible in 30 minutes.
- Code: github.com/RBKunnela/ALMA-memory
- Docs: alma-memory.pages.dev
- Install:
pip install alma-memory
Your AI should not treat you like a stranger every morning.
Author: Renata Baldissara-Kunnela