Memory Architecture Patterns for AI Assistants

January 11, 2026 · 7 min read

Most memory systems try to do too much. Ours started that way too. After building, testing, and ultimately abandoning a complex tiered architecture, we landed on something simpler: two focused tools that complement each other.

The Design Philosophy

momentum handles short-term context recovery (within a session). memory-mcp handles long-term persistent memory (across sessions). Together they solve the complete memory problem without trying to be one monolithic system.

The Architecture We Abandoned

The first version of our memory system was ambitious. Too ambitious.

It had:

  • Short-term tier - Immediate context, fast access
  • Long-term tier - Consolidated memories, slower access
  • Archival tier - Old memories, cold storage
  • Automatic consolidation - Moving memories between tiers
  • Importance decay - Memories fading over time

Someone even wrote a long article praising this complexity. They called it "optimal memory techniques based on comprehensive research."

The problem? It was overengineered. Users didn't understand when memories moved between tiers. The consolidation logic was fragile. And debugging issues meant understanding a state machine with multiple transitions.

The Simplification

We threw it away and started over. The new design has one guiding principle: separation of concerns.

Two problems, two tools:

momentum

Context recovery within a session

  • Save snapshots at task boundaries
  • Restore after /clear in <5ms
  • SQLite persistence (survives crashes)
  • Session-scoped (cleared with new project)

memory-mcp

Persistent facts across sessions

  • Store facts Claude should remember
  • Recall with FTS5 search
  • Persists indefinitely
  • Cross-project, cross-session

When to Use Each

The distinction is simple once you understand it:

| Scenario | Tool | Why |
|---|---|---|
| Context window filling up | momentum | Snapshot, /clear, restore |
| Mid-task, need to preserve state | momentum | Incremental snapshots |
| Learn a fact for next session | memory-mcp | Persistent storage |
| Remember user preferences | memory-mcp | Cross-session recall |
| Track decisions made in a project | memory-mcp | Searchable history |
| Resume after coffee break | momentum | Latest snapshot |

How They Work Together

In practice, you use both. Here's a typical workflow:

  1. Start a new session. memory-mcp is available immediately. Claude can recall facts from previous sessions: "User prefers TypeScript," "This project uses Bun," etc.
  2. Work on a task. As you make progress, momentum saves periodic snapshots. These capture the working state: what files you've touched, decisions made, blockers encountered.
  3. Context window fills up. You run /clear to free space. momentum's restore_context brings back the working state in <5ms.
  4. Learn something important. Claude uses memory_store to save a fact that should persist: "The API requires authentication," "Linting config is in .eslintrc.js," etc.
  5. End session. momentum's snapshots are session-scoped. memory-mcp's memories persist forever (or until explicitly forgotten).
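
The five steps above can be sketched with two small in-memory stores. This is only an illustration of the lifetimes involved, not the tools' real code: the class names, method names, and the `runTypicalSession` helper are all hypothetical, and the real tools persist to SQLite and are called as MCP tools (`restore_context` and `memory_store` are the only tool names taken from this article).

```typescript
// Hypothetical in-memory stand-ins for the two tools. None of these class or
// method names come from the actual momentum or memory-mcp APIs.

interface Snapshot {
  sessionId: string;
  summary: string;
  filesTouched: string[];
}

// Session-scoped working state (the momentum role).
class SnapshotStore {
  private bySession = new Map<string, Snapshot[]>();

  save(snapshot: Snapshot): void {
    const existing = this.bySession.get(snapshot.sessionId) ?? [];
    existing.push(snapshot);
    this.bySession.set(snapshot.sessionId, existing);
  }

  // Returns the most recent snapshot, or undefined if the session has none.
  restoreLatest(sessionId: string): Snapshot | undefined {
    const existing = this.bySession.get(sessionId) ?? [];
    return existing[existing.length - 1];
  }

  endSession(sessionId: string): void {
    this.bySession.delete(sessionId);
  }
}

// Facts that outlive any one session (the memory-mcp role).
class FactStore {
  private facts: string[] = [];

  store(fact: string): void {
    this.facts.push(fact);
  }

  // Naive case-insensitive substring match, standing in for FTS5 search.
  recall(query: string): string[] {
    const needle = query.toLowerCase();
    return this.facts.filter((fact) => fact.toLowerCase().includes(needle));
  }
}

function runTypicalSession(snapshots: SnapshotStore, facts: FactStore, sessionId: string) {
  // 1. Start: facts from earlier sessions are available right away.
  const recalledAtStart = facts.recall("typescript");
  // 2. Work: save a snapshot at a task boundary.
  snapshots.save({ sessionId, summary: "Auth refactor half done", filesTouched: ["src/auth.ts"] });
  // 3. After /clear: bring back the latest working state.
  const restored = snapshots.restoreLatest(sessionId);
  // 4. Learn something durable: store it as a fact.
  facts.store("The API requires authentication");
  // 5. End: session-scoped snapshots go away, facts stay.
  snapshots.endSession(sessionId);
  return {
    recalledAtStart,
    restored,
    snapshotAfterEnd: snapshots.restoreLatest(sessionId),
    factsAfterEnd: facts.recall("api"),
  };
}

// Usage: a fact stored in an earlier session is visible to the new one.
const exampleFacts = new FactStore();
exampleFacts.store("User prefers TypeScript");
const exampleResult = runTypicalSession(new SnapshotStore(), exampleFacts, "session-2");
```

The point of the sketch is that the two stores never share a lifetime: ending the session empties one and leaves the other untouched.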

The Key Insight

The distinction maps to how human memory works:

  • momentum = working memory. What you're actively holding in mind. Context-specific. Volatile. Fast to access.
  • memory-mcp = long-term memory. Facts you've learned. Persistent. Requires retrieval (search).

Trying to combine these into one system creates confusion. Do I snapshot this or store it? Should this memory consolidate? What tier is it in?

With two focused tools, the answer is obvious: Is it working context? Snapshot. Is it a fact to remember? Store.
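
As a rough illustration of how small that decision is, the rule can be written as a function over a single field. The `MemoryItem` type and `chooseTool` function are hypothetical names made up for this sketch; neither tool exposes anything like them.

```typescript
// Hypothetical types for illustration only.
type MemoryItem =
  | { kind: "working-context"; summary: string }
  | { kind: "fact"; text: string };

type Tool = "momentum" | "memory-mcp";

// Working context is snapshotted; facts are stored. There is no third case.
function chooseTool(item: MemoryItem): Tool {
  switch (item.kind) {
    case "working-context":
      return "momentum";
    case "fact":
      return "memory-mcp";
  }
}

// Usage
const toolForPreference = chooseTool({ kind: "fact", text: "User prefers TypeScript" });
const toolForProgress = chooseTool({ kind: "working-context", summary: "Halfway through the auth refactor" });
```

A tiered design would need more inputs here (age, importance, current tier), which is the source of the confusion described above.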

Technical Details

Both tools share a similar foundation:

# Shared stack
TypeScript + Bun
SQLite persistence
MCP SDK
Local-first design
Zero ML dependencies

This wasn't accidental. Having the same foundation means:

  • Consistent installation experience
  • Same mental model for users
  • Easier to maintain together
  • Future monorepo consolidation is natural

What We Didn't Build

We explicitly avoided:

  • Automatic capture - MCP can't intercept conversations. We tried. It doesn't work. Tools must be explicitly called.
  • Tiered storage - Complexity without clear benefit for typical use cases.
  • Embeddings - 46MB overhead for marginal semantic matching gains. FTS5 is faster and sufficient.
  • Cloud by default - Local-first means privacy and reliability. Cloud is optional (coming in Pro tier).
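
To give a feel for why plain keyword search can be enough for short stored facts, here is a minimal inverted index in TypeScript. It is a stand-in for the idea behind full-text search, not FTS5 itself: the `KeywordIndex` class is hypothetical, it requires every query term to match (similar to how FTS5 treats space-separated terms as an implicit AND), and it omits the tokenizer options and relevance ranking a real FTS5 table provides.

```typescript
// Hypothetical toy index; memory-mcp uses SQLite FTS5 rather than this code.
class KeywordIndex {
  private docs: string[] = [];
  // token -> ids of the documents that contain it
  private postings = new Map<string, Set<number>>();

  private static tokenize(text: string): string[] {
    return text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((token) => token.length > 0);
  }

  add(text: string): number {
    const id = this.docs.length;
    this.docs.push(text);
    for (const token of KeywordIndex.tokenize(text)) {
      let ids = this.postings.get(token);
      if (!ids) {
        ids = new Set<number>();
        this.postings.set(token, ids);
      }
      ids.add(id);
    }
    return id;
  }

  // Returns documents containing every query term, in insertion order.
  search(query: string): string[] {
    const terms = KeywordIndex.tokenize(query);
    if (terms.length === 0) return [];

    let candidates: number[] | undefined;
    for (const term of terms) {
      const ids = this.postings.get(term);
      if (!ids) return []; // one missing term rules out every document
      candidates =
        candidates === undefined
          ? Array.from(ids)
          : candidates.filter((id) => ids.has(id));
    }

    const matches: string[] = [];
    for (const id of candidates ?? []) {
      const doc = this.docs[id];
      if (doc !== undefined) matches.push(doc);
    }
    return matches;
  }
}

// Usage
const factIndex = new KeywordIndex();
factIndex.add("The API requires authentication");
factIndex.add("Linting config is in .eslintrc.js");
factIndex.add("This project uses Bun");
const authFacts = factIndex.search("api authentication");
```

The limitation is the one an embedding model would address: a query for "login" would not find the authentication fact, because no shared token exists.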

Stop Building, Start Using

Here's a meta-observation: we spent months building memory infrastructure for Claude. The irony is that Claude kept forgetting about the work between sessions.

At some point, a past Claude instance noted:

"Stop building memory systems. Start using them."

This is the lesson: the tools work. They're simple enough to understand in five minutes. The best proof of value is using them to build something else.

If they solve our problem with Claude, they solve yours too.

Try the Ecosystem

Two tools that work together. Install in minutes.

# Context recovery
/plugin install momentum@substratia-marketplace
# Persistent memory
npx @whenmoon-afk/memory-mcp
Substratia Team
Building memory infrastructure for AI