Productivity · Developer Experience

The Real Cost of Context Window Churn

January 11, 2026 · 6 min read

If you use AI coding assistants for real work, you know the feeling. You've been working on a complex feature for an hour. Claude understands your codebase, your decisions, your constraints. Then the context window fills up. And you start over.

The Hidden Tax

Context window churn isn't just annoying. It's expensive in three ways:

  • Time - Re-explaining your project state, decisions, and constraints. Often 5-10 minutes per reset.
  • Tokens - Paying for the same context multiple times. If you lose context mid-task, you're re-sending what the model already knew.
  • Quality - The AI loses nuance. Decisions made earlier in the session that informed current work are forgotten.

The Real Numbers

In a typical 2-hour Claude Code session, we measured:

  • 3-5 context resets
  • 20-30 minutes lost to re-explaining context
  • 15-25% of total tokens spent on redundant context

Traditional Approaches Don't Work

Most solutions to context management have significant drawbacks:

LLM Compaction

Claude Code's built-in compaction uses the model itself to summarize context. It's better than nothing, but:

  • Takes 30-60 seconds (an eternity when you're in flow)
  • Lossy - the model decides what's "important," not you
  • Still uses tokens for the summarization step

Manual Context Files

Some developers maintain README files or scratch pads with context. This helps, but:

  • Requires manual maintenance
  • Falls out of sync with actual state
  • Doesn't capture the nuanced back-and-forth that built understanding

Just Start a New Session

The "solution" many developers use by default. But you lose everything the AI learned about your project, your preferences, and your current task state.

A Different Approach: Snapshots

We built momentum to solve this differently. Instead of trying to compress context, we save it at task boundaries and restore it instantly after /clear.

# Traditional flow
[work] → [context full] → [compaction: 30-60s] → [continue]

# momentum flow
[work] → [save snapshot] → [/clear] → [restore: <5ms] → [continue]

The key insight: SQLite reads are instant. We don't need to re-process or re-summarize anything. Just read the snapshot and inject it into the new context.

Restore Speed

We measured restore times across different snapshot sizes:

| Stored Tokens | Restore Time | vs LLM Compaction |
| ------------- | ------------ | ----------------- |
| 10,000        | ~1ms         | ~26,000x faster   |
| 50,000        | ~2.5ms       | ~12,000x faster   |
| 100,000       | ~4ms         | ~7,000x faster    |
| 150,000       | ~5ms         | ~6,000x faster    |

Benchmarks on M1 MacBook Pro using Bun's native SQLite. Your results may vary but will be in the same ballpark.

What Gets Saved

A momentum snapshot captures:

  • Summary - What you were working on
  • Key files - Files relevant to the current task
  • Decisions made - Technical choices and their rationale
  • Blockers - What was stopping progress
  • Code state - Important variables, configurations
  • Recent messages - The last few exchanges for context

The Workflow

momentum integrates as a Claude Code plugin with these tools:

  1. Work normally - Claude saves snapshots automatically at task boundaries (configurable)
  2. Context fills up - You notice things getting slow or hit Claude's limit
  3. Run /clear - Clears the context window
  4. Claude calls restore_context - Latest snapshot is loaded instantly
  5. Continue where you left off - No re-explanation needed

Why Not Just Bigger Context Windows?

Context windows are getting larger. Claude supports 200K tokens. GPT-4 has 128K. Why not just use more context?

Three reasons:

  • Cost - A larger context means more tokens per request, which means more money. At scale, this matters.
  • Latency - Larger context windows are slower. The model has to process all that context for every response.
  • Attention degradation - Studies show models perform worse with very long contexts. Important information in the "middle" gets less attention.

Smart context management isn't about stuffing more tokens into the window. It's about having the right context available when you need it.

Try It

momentum is free and open source. Install it in Claude Code:

/plugin install momentum@substratia-marketplace

Requires Bun runtime. If you don't have it:

curl -fsSL https://bun.sh/install | bash

The Ecosystem

momentum handles short-term context (within a session). For long-term memory across sessions, use memory-mcp.

Substratia Team
Building memory infrastructure for AI