Why Your Agent Forgets (And How to Fix It)
Part 2 of the Agentic-Oriented Development series
I thought I was doing everything right.
I was building a SaaS product, a tool to help developers identify security risks in their AI-assisted code. I had read enough about agentic development to know that specialized agents outperform generalists. So I spun up a Frontend Developer agent, a Backend Developer agent, and set them to work.
The frontend came together beautifully. Clean React components. Polished UI. Responsive design. Then I went to connect it to the backend.
The backend connections didn't exist. The agent hadn't "forgotten" to build them. It had never known they were needed. Somewhere in the growing context of UI components, styling decisions, and state management, the architectural vision had been pushed out. The agent was doing exactly what I asked in each moment, but it had lost sight of the whole system.
I spent three times the effort putting the frontend and backend together. The lesson was expensive but clear: context isn't just a technical constraint. It's the difference between an agent that builds what you need and one that builds what it remembers.
In Chapter 1, I introduced four pillars that map OOP principles to agentic development. This chapter is the deep dive into the first: Encapsulation to Context Isolation. But to understand context isolation, you need to understand context itself. And the best analogy I've found is one every developer already knows: memory management.
The Memory Management Parallel
I learned about memory leaks the hard way.
When iPads first came out, I was building a stock trading application in Objective-C. We had a team of 10 testers running 2,000 regression tests on every release. For a full month before we felt confident enough to submit to the App Store, we were hunting memory leaks.
The pattern was always the same: variables declared but never cleaned up, memory allocated but never released. When the leaks accumulated past a threshold, the app crashed. And when it crashed mid-trade, the user's transaction was gone.
That month taught me the fundamentals of memory hygiene: clean up what you allocate, preserve critical state externally, and only allocate the memory you actually need.
And then I saw the pattern again, this time in AI.
Context windows have the same failure mode. Every message, every file read, every tool output consumes context. Unlike memory, you can't explicitly free it. The window fills. And when it fills, the agent doesn't crash. It does something worse: it forgets selectively. While the behavior varies by model (some use attention mechanisms, some apply summarization), the practical effect is similar. Early context becomes less accessible as the window fills. Requirements mentioned early in the conversation fade. Architectural decisions get overwritten by recent implementation details. The agent keeps working, confidently producing code that contradicts decisions you made an hour ago.
This is context pollution: the accumulation of low-value information that crowds out high-value context. And just like memory leaks, you don't notice it until the damage is done.

Here's how the concepts map:
┌───────────────────────────────────────────────────────────────────┐
│ MEMORY MANAGEMENT → CONTEXT MANAGEMENT │
├───────────────────────────────────────────────────────────────────┤
│ │
│ HEAP OVERFLOW → CONTEXT EXHAUSTION │
│ Program crashes when Agent degrades when │
│ memory is full context window fills │
│ │
│ MEMORY LEAK → CONTEXT POLLUTION │
│ Unreleased memory Irrelevant info │
│ accumulates silently crowds out decisions │
│ │
│ STACK FRAMES → SUB-AGENT CONTEXTS │
│ Each function call Each sub-agent gets │
│ isolated on stack isolated context window │
│ │
│ GARBAGE COLLECTION → COMPACTION/SUMMARIZATION │
│ Runtime reclaims Preserve decisions, │
│ unused memory discard deliberation │
│ │
│ MEMORY POOLS → SPECIALIZED AGENT POOLS │
│ Pre-allocated for Focused context for │
│ specific purposes specific domains │
│ │
│ POINTERS/REFERENCES → CROSS-AGENT RESULTS │
│ Pass references, Pass outputs only, │
│ not entire objects not full context │
│ │
└───────────────────────────────────────────────────────────────────┘
The parallels are exact. Heap overflow crashes your program, context exhaustion degrades your agent. Memory leaks accumulate silently, and context pollution crowds out decisions just as silently. The diagram above maps the rest, but the pattern you need to internalize is this: every technique we developed for memory has a context equivalent waiting to be applied.
The developers who mastered memory management in the 1990s didn't just write code that worked. They wrote code that scaled. The same will be true for context management in the 2020s.
The Governance Triad: Governance for Agent Teams
After that disaster, I rebuilt my approach. Specialized agents weren't enough. Isolation without direction is just faster failure.
The breakthrough came when I stopped thinking about agents as independent workers and started thinking about them as a team that needed leadership. Not project management. Product governance.
I introduced what I call the Governance Triad:
┌─────────────────┐
│ PM AGENT │
│ Holds Product │
│ Vision │
│ │
│ • WHAT/WHY │
│ • User Value │
│ • Priorities │
└────────┬────────┘
│
┌─────────────┴─────────────┐
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ ARCHITECT │ │ TEAM LEAD │
│ AGENT │◄───────►│ AGENT │
│ Holds the HOW │ │ Holds WHEN/WHO │
│ │ │ │
│ • System Design│ │ • Timeline │
│ • Tech Approach│ │ • Assignments │
│ • Integration │ │ │
└────────┬────────┘ └────────┬────────┘
│ │
└─────────────┬─────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ SPECIALIST AGENTS │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌──────┐ │
│ │Frontend│ │Backend │ │ DevOps │ │ QA │ │
│ └────────┘ └────────┘ └────────┘ └──────┘ │
│ Implementation Layer (DO) │
└─────────────────────────────────────────────┘
The PM Agent holds the product vision. It doesn't just store requirements, it actively validates that every decision serves the user and the product goals. Think of it as the "what" and "why" guardian.
The Architect Agent owns system design, technical approach, and integration patterns. When a specialist agent proposes a shortcut that violates your API boundaries, this is the agent that catches it.
The Team Lead Agent manages timelines and agent assignments. It decides who builds what, when, and whether the output meets your standards before it moves forward.
These aren't agents that write code. They're agents that govern. They're the product leadership layer that prevents the chaos I experienced on that first project.
Implementing the Governance Triad
How do you actually set this up? In Claude Code, Cursor, or similar tools, create three agent configurations with explicit boundaries:
PM Agent: System prompt focused on user stories, acceptance criteria, and prioritization. Load your PRD but exclude implementation details. It should answer "Should we build this?" and "What does success look like?"
Architect Agent: System prompt with your tech stack, API design principles, and system boundaries. Load architecture decisions but exclude sprint priorities. When a developer agent asks "Can I use WebSockets here?" the Architect is who decides.
Team Lead Agent: System prompt with coding standards, review checklists, and task breakdowns. Load current sprint scope but exclude strategic roadmap. This is your quality gate and scheduling engine rolled into one.
The key is explicit boundaries. Each agent knows what it needs. Nothing more. When they coordinate, you pass structured artifacts, not raw context.
Persistent Memory That Survives Context Decay
Here's what I learned the hard way: conversation history is unreliable memory. It decays. It gets summarized. It pushes out early decisions in favor of recent chatter.
Specifications don't decay.
I needed a framework that treated specifications as persistent memory. The pattern is simple: anything you can't afford to forget shouldn't live only in context. Write it down. Make it reloadable.
This is the core insight behind what I call the Spec-Kit pattern: a discipline of persistent specifications. Instead of jumping into code, you define what you're building in executable specifications. Then you plan, break work into tasks, and implement against validated specs. The specs become the source of truth, reloadable into any agent's context at any time. The principles apply whether you're using Claude Code, Cursor, or any other AI coding tool.
The core artifacts:
Product Requirements Document (PRD) - The source of truth for what we're building. A document that can be reloaded cleanly into any agent's context.
Architecture Decision Records - When technical decisions get made, they get written down. Not buried in chat history.
Constitution - Project principles and standards that govern all implementation decisions. Example: "All API responses must include correlation IDs for debugging" or "No direct database access from frontend agents."
This is the agentic equivalent of moving from stack allocation to heap allocation in C. The stack is fast but temporary. The heap persists. Specs are your persistent memory. Conversation is your working memory. Know which is which.

From Project to Product
The Spec-Kit pattern solves the memory problem. It doesn't solve the governance problem.
Project-based development: Define requirements in PRD. Build features against spec. Deliver and move on. Specs exist, but nobody enforces them.
Product-led development: PM Agent validates every decision against product vision. Architect Agent ensures technical approach serves product goals. Team Lead Agent verifies timeline and priorities reflect product strategy. Implementation happens within continuous governance.
This is what the Governance Triad adds: not just documentation, but governance. Active validation. Continuous alignment.
Practical Strategies for Context Hygiene

Let me give you the playbook I use now.
Strategy 1: Start Fresh, Start Focused
New major task? New chat. Don't let unrelated context accumulate. The cost of re-establishing context is lower than the cost of pollution from irrelevant history.
I start every significant feature with a clean context window and a reload of the relevant specs. The agent gets exactly what it needs. Nothing more.
Strategy 2: Delegate to Preserve
What happens when a task needs deep focus? You spawn a sub-agent. That sub-agent gets a fresh context window, focused instructions, and relevant specs. It goes deep without polluting the orchestrator's context.
The orchestrator stays strategic. It knows the Frontend Developer agent is working on the dashboard, but it doesn't need to hold every CSS decision in memory. It gets the result: "Dashboard complete, here's the API contract it expects."
When designing agent delegation, apply least-privilege principles. The Frontend Developer agent doesn't need database credentials. It needs the API contract. Information boundaries aren't just about context efficiency, they're about security.
This is encapsulation. The sub-agent's internal state is private. Only the interface is public.
┌─────────────────────────────────────────────────────────────────┐
│ ORCHESTRATOR AGENT │
│ Context: PRD | Arch Decisions | Goals | Agent Status │
│ [Clean, strategic - NO implementation details] │
└──────────────────────────┬──────────────────────────────────────┘
│
┌─────────────────┼─────────────────┐
│ TASK + Spec │ TASK + Spec │ TASK + Spec
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ FRONTEND │ │ BACKEND │ │ QA │
│ AGENT │ │ AGENT │ │ AGENT │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │ FRESH │ │ │ │ FRESH │ │ │ │ FRESH │ │
│ │ CONTEXT │ │ │ │ CONTEXT │ │ │ │ CONTEXT │ │
│ │ • UI Spec │ │ │ │ • API Spec │ │ │ │ • Test Plan │ │
│ │ • Styles │ │ │ │ • DB Schema │ │ │ │ • Coverage │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
└──────┬──────────┘ └──────┬──────────┘ └──────┬──────────┘
│ │ │
│ RESULTS ONLY │ RESULTS ONLY │ RESULTS ONLY
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ ORCHESTRATOR RECEIVES: │
│ ✓ Results ✓ Contracts ✓ Status │
│ ✗ CSS decisions ✗ Debug steps ✗ Internal reasoning │
└─────────────────────────────────────────────────────────────────┘
Strategy 3: Compact Strategically
When context gets heavy, summarize deliberately.
Preserve: decisions made, constraints discovered, interfaces defined. Discard: deliberation, dead ends, verbose explanations.
This is garbage collection for context.
Strategy 4: Treat Specs as Cache
Specs aren't just documentation. They're context cache. When an agent needs to remember the authentication architecture, it doesn't search conversation history. It reads the auth spec.
Reload specs at the start of every session and after every major context shift. It's the same pattern as warming a cache: accept the upfront cost to avoid the downstream misses.
Strategy 5: Design Information Flow Like a Product Team
Information is power. So is information architecture.
Ask yourself: who needs to know what? The Frontend Developer doesn't need the database schema. It needs the API contract. The Backend Developer doesn't need the component hierarchy. It needs the data requirements.
Information flows through interfaces, not shared memory. The Triad governs these flows. PM ensures requirements reach the right specialists. Architect ensures technical decisions propagate correctly. Team Lead ensures assignments match capacity.
Enforcing Boundaries
Strategies are only as good as your enforcement. For production-grade agentic systems, consider these mechanisms:
Tool whitelisting: Limit which tools each agent can access. Your PM Agent doesn't need file system access. Your Architect doesn't need to run tests.
Capability constraints: Define what actions each agent is allowed to take. Read-only access for analysts, write access for implementers. Explicit approval gates for destructive operations.
Audit logging: Track what context each agent accessed and what actions it took. When something goes wrong (and it will), you need the forensics to understand why.
These aren't optional for enterprise systems. They're the difference between proactive architecture and months of cleanup.
Why This Matters
Context mismanagement is why agents "forget" requirements. It's why they produce inconsistent code. It's why they lose architectural vision mid-project.
That failure wasn't a bug in the AI. It was a bug in my architecture. I spent three weeks rebuilding what the Governance Triad would have prevented in three days. That's the tax for not managing context.
The complete picture looks like this:
Layer 1: Context Management (the technical foundation) Understand context windows like you understand memory. Clean up pollution. Delegate to isolate. Compact to preserve decisions.
Layer 2: Spec-Driven Development (persistent memory) Externalize what matters: PRDs, architecture decisions, constitution. Reload them into context as needed. This is your persistent storage that survives session boundaries.
Layer 3: Governance Triad (active governance) PM Agent, Architect Agent, Team Lead Agent. Not just documentation, but governance. The leadership layer that ensures your specialists stay connected to the product vision.
The developers who master this combination will build agent systems that scale. Everyone else will keep wondering why their agents plateau, why the code gets inconsistent after a few hours, why the AI "forgot" what they told it yesterday.

Context windows ARE the new memory model. But memory management was never the goal. Building great software was.

What's Next
This is Part 2 of a series on Agentic-Oriented Development.
Coming next: Tools Are the New Methods
I'll explore how agent tools map to object methods and how to design tool interfaces that hide complexity while enabling capability. The same principles that made APIs powerful make agent tools powerful, if you design them right.
Want to implement spec-driven development? Check out Agentic-Oriented Development Kit, an open-source toolkit for persistent specifications and the Governance Triad.