Chapter 16

The Agent Factory

Spawning and Configuring Agents at Runtime

June 22, 2026 · 14 min read

The Spec-Kit Moment

I started my agentic learning journey on GitHub's spec-kit, a spec-driven harness that turns a constitution, specifications, and plans into executable instructions for whatever coding agent you wire in. On paper it was the right shape. In practice, every run kept doing more on its own. It would spawn additional subagents. It would invent new commands. It would load skills that weren't part of the original ask. I sat in front of the terminal and watched the thinking threads scroll past, reading them the way you read a code review. The workers were going in unexpected directions. They were touching files I had not asked them to touch. They were pulling in patterns I had not approved.

So I did the only thing you can do when an autonomous system goes off-script. I halted it. I read where it had wandered. I redirected. Then it would run for another stretch and the same thing would happen. Three hours into a single feature run I had halted and redirected the agents eight times. Each interruption cost ten to fifteen minutes of context reload, 80 to 120 minutes in all, on a system I was using to save hours. At that rate, a five-engineer team running this pattern unmodified loses roughly two engineer-days per sprint to recoverable friction. I tell you this story because every agent system I sit down to review now has the same hole in it. Not in the workers. In the contract on creation.

The workers were not broken. The harness had a thousand ways to instantiate work and zero contracts on instantiation. The decision I made next put governance agents in front of and behind the worker subagents. The governance agents would create the input and monitor the output. The workers could keep doing what workers do. But they would only spawn under a contract, and they would only return through a check.

This is Chapter 16 of the AOD series. Part III's third pattern, after Topology (Chapter 14) and the Protocol Stack (Chapter 15). The Topology chapter introduced the four-layer architecture for agentic systems (user, orchestrator, agent, capability). The Protocol Stack chapter showed which protocols govern which edges, with MCP for vertical tool calls (agent to capability) and A2A for horizontal agent traffic (agent to agent). This chapter shows what creates the agents, on what contract, and why the absence of that contract is the failure mode you keep mistaking for a worker problem.

The contract on creation is what was missing. The factory is what holds it.

1. The Failure Isn't the Workers

The harness was productive. It just was not disciplined. The workers were doing what the harness allowed them to do, which was almost anything. Naming the symptom as a worker problem misreads the architecture.

Workers will spawn what the harness lets them spawn. They will decide what the harness lets them decide. They will load skills the harness keeps available and invent commands the harness does not actively block. The architectural failure sits one layer up from where the workers run.

The pattern in OOP that maps here is familiar. When a class can be instantiated arbitrarily by anyone, you get the same shape of problem. Constructors get called from places the design never anticipated. State leaks. Lifecycles tangle. The fix in OOP was to wrap construction in a function or class that knows the right way to do it. The Gang of Four (GoF) Factory pattern, in short.

Same fix in agentic systems. Wrap creation in something that decides what to instantiate, with what configuration, on what contract. The workers do not need to be fixed. The boundary above them needs to exist.

The workers were never the bug. The harness was.

2. The OOP Map: From Class Factory to Agent Factory

The GoF Factory pattern is common knowledge by now. Wrap object creation behind a function or class. The caller stops deciding which concrete type to instantiate. The factory does, based on inputs and rules the caller does not need to know.

Three things change when the products are agents.

First, the type decision is no longer just about which class. It is about which kind of product entirely. A persistent agent that retains context across tasks, or an ephemeral subagent scoped to a single piece of work. The factory chooses the product category before it chooses the configuration.

Second, configuration injection at instantiation grows beyond constructor parameters. What gets injected when an agent is created includes the requirements the user gave the system, the architectural and product decisions made earlier in the lifecycle, the project knowledge the agent must respect, the tool and protocol bindings the agent is allowed to speak through, and the lifecycle hooks that govern its spawn and cleanup. Not a list of arguments. A loadout, to borrow the gaming term: the configuration package an entity carries into action, chosen for the mission rather than assembled mid-fight.

Third, the factory has to make a placement decision against whatever topology the system uses. In the AOD-Kit topology, persistent agents sit at the agent layer and ephemeral subagents drop to the capability layer. Other harnesses make different placement choices. Claude Code's Task-spawned subagents, for instance, live at the agent layer rather than the capability layer. The placement question matters less than the lifecycle question. The factory's job is to honor whichever choice your topology has made.

This is also where governance attaches. The contract on creation is governance, applied before the worker has executed anything. We saw the early version of this in Chapter 4, where MCP servers acted as dependency injection at runtime. Configuration injection at the factory is dependency injection at instantiation. Same principle. Earlier moment in the lifecycle.

What to instantiate. With what configuration. On what contract. Three decisions, one boundary, the factory.

3. The Two Product Types

The factory's first decision is which kind of product to instantiate. There are two, and the choice between them is structural, not stylistic.

A persistent agent retains context across tasks, communicates with peer agents, and holds accumulated judgment over the life of a project. It spawns once with full configuration injected, executes across many tasks, and cleans up at a session or project boundary. The Governance Triad (the persistent set of PM, Architect, and Team Lead Agents introduced earlier in the series, each holding a slice of the contract on creation) is the canonical persistent product set. In OOP terms, these behave like long-lived service objects whose state matters across many calls.

An ephemeral subagent is scoped to a single task. It spawns with scoped configuration, accepts a typed input, executes, returns a typed result, and discards its context. Backend, Frontend, Tester, Code Reviewer, Security Analyst, DevOps. These are the canonical ephemeral products. In OOP terms, they behave like method-scoped locals that exist for the duration of one call.

Chapter 8 framed the agent-vs-subagent decision as structural rather than stylistic, with the decision criterion of whether the role needs to remember or needs to execute. The factory turns the decision into the literal output of instantiation. Roles that need to remember become persistent agents. Roles that need to execute become ephemeral subagents.

The reason this is structural and not preference is mechanical. Making everything persistent floods context with history the role does not need, and the system pays the cost in tokens, drift, and overlapping memory. Making everything ephemeral loses the cross-phase judgment governance roles depend on.

Persistence is the substrate of governance. Ephemerality is the substrate of focused execution. The Architect Agent that has reviewed seventeen prior decisions draws on the prior sixteen when evaluating the seventeenth. An ephemeral one cannot. Strip the persistence and the contract becomes a checklist that has to be reread and re-explained every spawn.

The Governance Triad is the canonical persistent set. The PM holds product intent. The Architect holds technical standard. The Team Lead holds plan, authorization, and wave assignment. The worker pool is the canonical ephemeral set, and its members spawn under whatever contract the triad has just signed off on.

Two product types. One factory. The decision rule is what makes the contract real.
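The decision rule above can be sketched in a few lines. This is a minimal illustration, not AOD-Kit's implementation: `RoleSpec`, its `needs_memory` flag, `SpawnPlan`, and `plan_spawn` are hypothetical names, and the placement table only encodes the AOD-Kit choice described earlier (persistent at the agent layer, ephemeral at the capability layer).

```python
from dataclasses import dataclass
from enum import Enum


class ProductType(Enum):
    PERSISTENT = "persistent"  # retains context across tasks
    EPHEMERAL = "ephemeral"    # scoped to one task, context discarded


class Layer(Enum):
    AGENT = "agent"
    CAPABILITY = "capability"


# Placement as described for the AOD-Kit topology; other harnesses choose differently.
AOD_KIT_PLACEMENT = {
    ProductType.PERSISTENT: Layer.AGENT,
    ProductType.EPHEMERAL: Layer.CAPABILITY,
}


@dataclass(frozen=True)
class RoleSpec:
    name: str
    needs_memory: bool  # hypothetical flag: does the role need to remember?


@dataclass(frozen=True)
class SpawnPlan:
    role: str
    product_type: ProductType
    layer: Layer


def plan_spawn(role, placement=AOD_KIT_PLACEMENT):
    """Choose the product type first, then place it per the topology's rule."""
    if role.needs_memory:
        product = ProductType.PERSISTENT
    else:
        product = ProductType.EPHEMERAL
    return SpawnPlan(role.name, product, placement[product])


architect = plan_spawn(RoleSpec("architect", needs_memory=True))
tester = plan_spawn(RoleSpec("tester", needs_memory=False))
print(architect.product_type.value, architect.layer.value)
print(tester.product_type.value, tester.layer.value)
```

Passing a different `placement` mapping is how a harness with another topology would reuse the same rule. The lifecycle decision stays the same and only the layer changes.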

[!NOTE] Sidebar: Agent Pools and Warm-Starts

The ephemeral lifecycle has a hidden cost. Spawn time. Loading the model context, injecting the configuration, opening tool bindings: none of it is free. In low-volume systems the cost is invisible. In high-volume systems it dominates.

Production handles this by pooling. The factory keeps a warm set of pre-configured ephemeral subagents ready for invocation. When a worker is needed, the factory pulls from the pool rather than cold-spawning.

Two pool flavors. Typed pools keep one pool per worker type (code reviewers in one, security analysts in another). They optimize for spawn speed but cost memory. Generic pools keep unconfigured workers and inject configuration at checkout. They optimize for memory but pay a configuration step on every use.

Pooling does not change the contract. The same governance triad still injects requirements, decisions, project knowledge, and bindings at checkout. What pooling adds is amortization, not a new pattern. Project knowledge stays out of the pool. Persistence belongs to governance, not to pooled workers.

The cost of pooling is configuration drift. A pooled worker that has been alive for hours may hold residue from prior tasks. Accept-the-drift only works inside a single trust boundary. The moment pooled workers cross between tasks with different permission scopes or data classifications, the residue becomes a leakage path. Reset between checkouts when tasks do not share a trust boundary. Like the topology in Chapter 14, where the rule was to match the pattern to the constraint, pooling is a constraint-driven choice, not a default.
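A typed pool with the reset-on-boundary rule might look like the sketch below. Everything here is illustrative: `PooledWorker`, `TypedPool`, and the string-valued `trust_boundary` are assumed names, and the `residue` list simply stands in for whatever context a real worker would carry between tasks.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PooledWorker:
    worker_type: str
    last_trust_boundary: Optional[str] = None
    residue: List[str] = field(default_factory=list)  # stand-in for leftover context

    def reset(self):
        """Drop anything carried over from earlier tasks."""
        self.residue.clear()


class TypedPool:
    """One pool per worker type (the 'typed' flavor described above)."""

    def __init__(self, worker_type, size):
        self._idle = [PooledWorker(worker_type) for _ in range(size)]

    def checkout(self, trust_boundary):
        if not self._idle:
            raise RuntimeError("pool exhausted: cold-spawn or wait")
        worker = self._idle.pop()
        # Accept drift only inside one trust boundary; reset when it changes.
        if worker.last_trust_boundary != trust_boundary:
            worker.reset()
        worker.last_trust_boundary = trust_boundary
        return worker

    def checkin(self, worker):
        self._idle.append(worker)


pool = TypedPool("code-reviewer", size=1)

first = pool.checkout("tenant-a")
first.residue.append("notes from task 1")
pool.checkin(first)

same_boundary = pool.checkout("tenant-a")
kept = list(same_boundary.residue)   # residue survives within the boundary
pool.checkin(same_boundary)

other_boundary = pool.checkout("tenant-b")
cleared = list(other_boundary.residue)  # residue wiped on boundary change
```

The sketch keeps the reset decision inside `checkout` so callers cannot forget it. Injecting the contract at checkout is left out for brevity.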

4. Configuration Injection at Spawn Time

I keep coming back to this list because every agent system I review fails the same way when one of these items is missing. The factory does not just pick a type. It hands the new agent everything it needs to execute under the contract. Five things, every time:

Requirements. What the user actually asked for, in the form the harness has already captured. The new agent should not have to guess, and it should not have to re-derive.
Decisions. Architectural and product decisions made earlier in the lifecycle that the new agent has to honor. Not preferences to consider. Decisions to respect.
Project knowledge. The middle layer of Chapter 11's three-layer memory model (session context above it, practitioner knowledge below). Project knowledge is durable across spawns within a project, which is what lets governance enforce a contract that outlives any single session. It is read-only at spawn for ephemeral workers and read-write through an MCP binding for persistent governance agents.
Tool and protocol bindings. Which MCP servers the agent can reach, and which A2A endpoints it can speak to (Chapter 15). The bindings are also the permission allowlist. The worker has the permissions the bindings name and no more.
Lifecycle hooks. What runs on spawn, what runs on cleanup, what gets logged, what gets handed back, including the audit handle that later checks attach to.

A wave is a coordinated batch of work the team executes together, with explicit task dependencies and an authorization checkpoint between waves. The Team Lead Agent governs spawn timing across the wave plan, and the factory refuses to spawn workers for a wave that has not been authorized.
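The refusal to spawn for an unauthorized wave can be sketched as a simple guard. `WavePlan`, `WaveNotAuthorized`, and `spawn_worker` are hypothetical names for illustration, and the returned string stands in for a real worker handle.

```python
class WaveNotAuthorized(Exception):
    """Raised when a spawn is requested for a wave nobody has signed off on."""


class WavePlan:
    """Records which waves have been authorized, and by whom (hypothetical API)."""

    def __init__(self):
        self._authorized = {}  # wave number -> authorizing identity

    def authorize(self, wave, authorized_by):
        self._authorized[wave] = authorized_by

    def is_authorized(self, wave):
        return wave in self._authorized


def spawn_worker(plan, wave, task):
    """Factory-side guard: no authorization, no worker."""
    if not plan.is_authorized(wave):
        raise WaveNotAuthorized(f"wave {wave} has not been authorized")
    return f"worker-w{wave}-{task}"  # placeholder for a real worker handle


plan = WavePlan()
plan.authorize(2, authorized_by="team-lead-agent")

handle = spawn_worker(plan, wave=2, task="task-3")
try:
    spawn_worker(plan, wave=3, task="task-1")
    refused = False
except WaveNotAuthorized:
    refused = True
```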

The chapter has been talking about a contract for several pages without showing one. Here is what one looks like as the factory hands it to a worker on spawn:

```yaml
# Factory contract - injected at spawn
agent_id: backend-engineer-w2-task-3
contract:
  requirements:
    - REQ-104: "POST /reports returns 201 on success"
    - REQ-105: "Idempotency key required, 24h replay"
  decisions:
    - ADR-007: "Postgres for report storage, no Redis"
    - ADR-012: "Service-layer validation"
  project_knowledge:
    handle: pk://repo/reports-service@v3
    scope: [reports, idempotency, postgres-conventions]
  bindings:
    mcp_servers: [postgres-mcp, fs-mcp]
    a2a_endpoints: [tester-agent, code-reviewer-agent]
    permission_scope: [read:reports, write:reports]
  lifecycle:
    on_spawn: log-spawn-event
    on_cleanup: discard-context
    audit_handle: audit://feature-104/wave-2/task-3
    wave: 2
    authorized_by: team-lead-agent@2026-05-08
```

That YAML is illustrative, not production-grade. The shape is what matters. PMs own the requirements and decisions blocks. Engineering owns the bindings, lifecycle, and audit blocks. The factory enforces both. In production, the contract artifact carries a signature so the audit chain does not depend on the orchestrator's honesty.

This is dependency injection at the factory boundary. The agent does not pull its dependencies. The factory pushes them. Without injection, the worker starts with a request and a guess, and you are back at the terminal, scrolling threads, halting, and redirecting.
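As a rough sketch of push-style injection, the factory below refuses to spawn unless all five blocks of the contract are present, then hands the whole contract to the worker's constructor. `Contract`, `Worker`, `ContractIncomplete`, and `spawn` are assumed names, and the field types are simplified for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Contract:
    """The five injected blocks, in simplified form."""
    requirements: tuple
    decisions: tuple
    project_knowledge: str
    bindings: tuple
    lifecycle: dict


class ContractIncomplete(Exception):
    pass


class Worker:
    def __init__(self, agent_id, contract):
        self.agent_id = agent_id
        # Pushed in by the factory; the worker never looks up its own dependencies.
        self.contract = contract


def spawn(agent_id, contract):
    """Refuse to spawn when any of the five blocks is empty."""
    missing = [f.name for f in fields(contract) if not getattr(contract, f.name)]
    if missing:
        raise ContractIncomplete(f"missing blocks: {', '.join(missing)}")
    return Worker(agent_id, contract)


complete = Contract(
    requirements=("REQ-104", "REQ-105"),
    decisions=("ADR-007", "ADR-012"),
    project_knowledge="pk://repo/reports-service@v3",
    bindings=("postgres-mcp", "fs-mcp"),
    lifecycle={"on_spawn": "log-spawn-event", "on_cleanup": "discard-context"},
)
worker = spawn("backend-engineer-w2-task-3", complete)

no_decisions = Contract(("REQ-104",), (), "pk://repo/reports-service@v3",
                        ("postgres-mcp",), {"on_spawn": "log-spawn-event"})
try:
    spawn("backend-engineer-w2-task-4", no_decisions)
    error = ""
except ContractIncomplete as exc:
    error = str(exc)
```

Treating an empty block as a spawn failure is a design choice in this sketch. A real harness might allow an explicitly empty decisions list for greenfield work.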

The YAML is the contract artifact. The enforcement broker is the runtime piece that checks every tool call against the bindings list before the call goes out. Without the broker, the allowlist is documentation. The factory is where Chapter 4 (DI), Chapter 11 (project knowledge), and Chapter 15 (protocols) meet.
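A broker of that kind can be very small. The sketch below is an assumption-laden illustration: `EnforcementBroker` and `BindingViolation` are hypothetical names, and the lambdas stand in for real MCP clients, which are not modeled here.

```python
class BindingViolation(PermissionError):
    """Raised when a worker reaches for a tool its contract does not name."""


class EnforcementBroker:
    def __init__(self, allowed_servers, tools):
        self._allowed = frozenset(allowed_servers)  # the contract's bindings list
        self._tools = dict(tools)                   # server name -> callable stand-in

    def call(self, server, *args, **kwargs):
        # Check the allowlist before the call goes out, not after.
        if server not in self._allowed:
            raise BindingViolation(f"{server} is not in the worker's bindings")
        return self._tools[server](*args, **kwargs)


# Stand-ins for tool servers; a real harness would wrap actual clients here.
tools = {
    "postgres-mcp": lambda query: f"ran: {query}",
    "fs-mcp": lambda path: f"read: {path}",
    "redis-mcp": lambda key: f"got: {key}",
}
broker = EnforcementBroker(["postgres-mcp", "fs-mcp"], tools)

result = broker.call("postgres-mcp", "SELECT 1")
try:
    broker.call("redis-mcp", "session:42")
    blocked = False
except BindingViolation:
    blocked = True
```

The `redis-mcp` tool exists in the harness but is absent from this worker's bindings, so the call is refused. That is the difference between an allowlist that is enforced and one that is only documented.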

5. Governance as the Factory's Enforcement Mechanism

The Enforcement Layer

A factory without enforcement is a function with arguments. Anybody can call it with anything. The contract becomes a checklist nobody reads. This was the spec-kit experience exactly. There was a pattern available. Nothing was making it load-bearing.

The Governance Triad turns the factory into a contract that is actually enforced before workers spawn. Each member governs a different dimension at the moment of creation.

The PM Agent validates that the requirements being injected match the product intent. Wrong scope, missing acceptance criteria, drift from the spec, all blocked at the front of the factory.
The Architect Agent validates that the technical bindings (tools, protocols, prior decisions) match the architecture. A worker asking for a tool the architecture forbids does not get the binding.
The Team Lead Agent validates that the spawn sits inside the wave plan and that the worker has the permissions it needs and no more, with permission scope as its own gate that fires independently of the other two.

Each governance agent has veto authority before the worker spawns. After the worker returns, the same triad checks the output. Inputs gated. Outputs checked. The factory is the boundary on both sides.
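The three pre-spawn gates can be sketched as independent checks whose violations are collected rather than short-circuited, so the permission-scope gate fires regardless of what the others found. All names here (`SpawnRequest`, `Policy`, the gate functions, `govern_spawn`) are hypothetical, set membership stands in for the judgment a real governance agent would apply, and the post-execution output check is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpawnRequest:
    requirements: frozenset
    mcp_servers: frozenset
    permission_scope: frozenset
    wave: int


@dataclass(frozen=True)
class Policy:
    approved_requirements: frozenset
    allowed_servers: frozenset
    max_permission_scope: frozenset
    authorized_waves: frozenset


class SpawnVetoed(Exception):
    def __init__(self, violations):
        super().__init__("; ".join(violations))
        self.violations = list(violations)


def pm_gate(req, policy):
    extra = req.requirements - policy.approved_requirements
    return [f"pm: requirements outside the spec {sorted(extra)}"] if extra else []


def architect_gate(req, policy):
    extra = req.mcp_servers - policy.allowed_servers
    return [f"architect: forbidden bindings {sorted(extra)}"] if extra else []


def team_lead_gate(req, policy):
    violations = []
    if req.wave not in policy.authorized_waves:
        violations.append(f"team-lead: wave {req.wave} not authorized")
    extra = req.permission_scope - policy.max_permission_scope
    if extra:
        violations.append(f"team-lead: excess permissions {sorted(extra)}")
    return violations


GATES = (pm_gate, architect_gate, team_lead_gate)


def govern_spawn(req, policy):
    """Run every gate; any violation vetoes the spawn."""
    violations = [v for gate in GATES for v in gate(req, policy)]
    if violations:
        raise SpawnVetoed(violations)
    return True


policy = Policy(
    approved_requirements=frozenset({"REQ-104", "REQ-105"}),
    allowed_servers=frozenset({"postgres-mcp", "fs-mcp"}),
    max_permission_scope=frozenset({"read:reports", "write:reports"}),
    authorized_waves=frozenset({1, 2}),
)
ok = SpawnRequest(frozenset({"REQ-104"}), frozenset({"postgres-mcp"}),
                  frozenset({"read:reports"}), wave=2)
bad = SpawnRequest(frozenset({"REQ-104"}), frozenset({"redis-mcp"}),
                   frozenset({"read:reports"}), wave=3)

approved = govern_spawn(ok, policy)
try:
    govern_spawn(bad, policy)
    veto = None
except SpawnVetoed as exc:
    veto = exc
```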

What Veto Looks Like

When governance rejects a spawn (wrong scope, missing decision, unauthorized binding), the factory halts instantiation and surfaces the violation back to the orchestrator. The shape is the same when post-execution output checks fail. The worker's result is rejected, the violation is logged, and the orchestrator decides whether to retry, reroute, or escalate.

Wave authorization gates spawn but does not retroactively halt workers from a prior wave already in flight. Containment of in-flight workers under a later veto is taken up in Chapter 18.

Retry needs a bound. An orchestrator that retries indefinitely against a security veto is a denial-of-service vector against the governance plane and a grinding attack surface against the policy. The bound lives with the orchestrator, not the factory. The factory's job is to refuse spawns. The orchestrator's job is to decide when to stop asking. Chapter 18 takes up failure propagation through the topology, including how the factory's veto signals climb the layers without flooding the upstream caller.
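One way to keep the bound on the orchestrator side is a small retry wrapper that escalates once its budget is spent. This is a sketch under assumptions: `SpawnVetoed`, `EscalationRequired`, and `spawn_with_retry_bound` are hypothetical names, and the default of three attempts is arbitrary.

```python
class SpawnVetoed(Exception):
    """Raised by a (hypothetical) factory when governance rejects a spawn."""


class EscalationRequired(Exception):
    """Raised by the orchestrator once the retry budget is spent."""


def spawn_with_retry_bound(request_spawn, max_attempts=3):
    """Ask the factory at most max_attempts times, then stop asking.

    request_spawn receives the attempt number so the caller can revise the
    request between attempts instead of resubmitting it unchanged.
    """
    last_veto = None
    for attempt in range(1, max_attempts + 1):
        try:
            return request_spawn(attempt)
        except SpawnVetoed as veto:
            last_veto = veto
    raise EscalationRequired(
        f"gave up after {max_attempts} vetoed attempts: {last_veto}"
    )


attempts_seen = []


def always_vetoed(attempt):
    attempts_seen.append(attempt)
    raise SpawnVetoed("unauthorized binding: redis-mcp")


try:
    spawn_with_retry_bound(always_vetoed, max_attempts=3)
    outcome = "spawned"
except EscalationRequired as exc:
    outcome = str(exc)


def fixed_on_second_try(attempt):
    if attempt < 2:
        raise SpawnVetoed("missing decision: ADR-012")
    return "worker-handle"


handle = spawn_with_retry_bound(fixed_on_second_try)
```

The factory in this sketch knows nothing about the bound. It only refuses, which matches the division of responsibility described above.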

What Gets Logged

Every spawn is logged, not just every veto. The audit record at minimum names the spawn request, the dimensions checked, the dimension that failed (if any), the requesting orchestrator's identity, and the policy decision point, meaning the governance agent that ruled on the request. Output checks log the same fields. The integrity of that audit log is itself a governance question, taken up in Book 2 along with bootstrap root-of-trust and policy decision point hardening.
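The minimum audit record could be modeled as below. The field names are assumptions chosen to mirror the list in the text, the in-memory `AuditLog` is a stand-in for real storage, and nothing here addresses log integrity.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AuditRecord:
    spawn_request: str                  # e.g. the contract's audit handle
    dimensions_checked: Tuple[str, ...]
    failed_dimension: Optional[str]     # None when the spawn was approved
    orchestrator_id: str
    decision_point: str                 # the governance agent that ruled


class AuditLog:
    """Append-only, in-memory stand-in for a real audit store."""

    def __init__(self):
        self._lines = []

    def record(self, rec):
        # One JSON object per line; sorted keys keep entries easy to diff.
        self._lines.append(json.dumps(asdict(rec), sort_keys=True))

    def lines(self):
        return tuple(self._lines)


log = AuditLog()
dimensions = ("requirements", "bindings", "wave", "permission_scope")

# Approved spawns are logged too, with no failed dimension.
log.record(AuditRecord("audit://feature-104/wave-2/task-3", dimensions,
                       None, "orchestrator-1", "team-lead-agent"))
log.record(AuditRecord("audit://feature-104/wave-2/task-4", dimensions,
                       "bindings", "orchestrator-1", "architect-agent"))

entries = [json.loads(line) for line in log.lines()]
```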

The Human PM and the PM Agent

A clarification matters here because product leaders read this chapter looking for the actionable surface. The human PM is not replaced by the PM Agent. The human PM defines the requirements and writes the spec. The PM Agent encodes that intent and enforces it at spawn time and at output check, a persistent enforcer of human-defined intent rather than a substitute for the human PM. The human side lives in the Product Triad (Business User, UX/Product Designer, Agentic Engineer), where the spec actually gets written.

When the human PM updates a requirement, the Agentic Engineer in the Product Triad re-deploys the encoded contract, and the PM Agent picks up the new spec at the next spawn. When the human PM and the PM Agent disagree, the human wins by default, but the override is logged. Governance that cannot be overridden is brittle. Governance overridden without audit is theater.

For a human PM, the actionable surface is the spec. A vague spec lets the workers wander. A precise spec lets the factory hold the line.

Here is what one requirement looks like traveling through the factory. The PM writes acceptance criteria for REQ-104 in the spec. The harness captures the spec and turns it into the requirements block of the contract. The factory injects REQ-104 into the backend-engineer worker on spawn. If the worker's plan widens scope beyond REQ-104's acceptance criteria, the PM Agent vetoes before any code is written. On output check, the PM Agent verifies the worker's diff satisfies REQ-104 before the work is accepted. One requirement, five lifecycle moments, one contract holding the line.

The Bootstrap Problem

One detail sits at the bottom of this section because it is the structural caveat that frames everything above it. The persistent governance agents are themselves products of the factory. The factory instantiates the triad first. The triad then governs subsequent instantiations. Recursive but bounded. The bootstrapping happens once per project. Everything afterward runs under the contract the triad enforces.

Who governs that first instantiation is the root-of-trust question. The honest answer in this chapter is the human operator launching the harness. Book 2 (Securing Agentic Systems) takes up how to make that anchor verifiable rather than presumed, including signed contracts, attested triad bootstrap, and tamper-evident audit chains.

This is the spec-kit-to-AOD-Kit arc translated into mechanism. The harness was not broken. It was missing the layer that turns instantiation into a contract. I built AOD-Kit out of this pattern. The Triad on the input. The Triad on the output. The workers in between.

6. When the Factory Earns Its Keep

Not every system needs an agent factory. Right-sizing matters here as much as it did for topology in Chapter 14, where simpler systems were better off without the four-layer architecture imposed on them.

A single-agent system does not need one. The user calls the agent, the agent does the work, the session ends. There is no instantiation decision worth wrapping.

A two-agent system might not need one. If both are persistent, identical configuration is fine, and there is no governance gate, the factory collapses to a constructor with one branch in it. Fine. Skip it.

The factory earns its keep when these conditions stack:

Multiple product types exist. Persistent and ephemeral. The decision criterion is non-trivial because the lifecycles diverge.
Configuration must be injected. Different agents get different requirements, decisions, tool bindings, and lifecycle hooks. There is no one-size-fits-all loadout.
Governance is needed at spawn time. The contract has to be enforced before workers execute, not after. Post-hoc checks on output do not catch what gets started in the wrong scope.
The system is large enough that ad-hoc instantiation reproduces the spec-kit failure mode. If you can already see the workers wandering, you crossed the threshold.

When fewer of those conditions hold, the factory is overhead. When all four hold, the absence of the factory is the architectural failure you will spend the next six months working around. Match the pattern to the constraint, not to the convention.

7. Why This Matters Now

The factory pattern in agentic development is not the GoF Factory with new vocabulary. It is a contract that decides what to instantiate, with what configuration, on what governance, before the worker has done anything.

Two product types. Different lifecycles. One contract. The Governance Triad is the persistent product set. The worker pool is the ephemeral product set. The factory is the boundary that keeps the two from collapsing into a single role and reproducing the spec-kit failure under a different name.

The agent factory is a configure-your-harness pattern, not a buy-this-from-a-vendor pattern. AOD-Kit is one build-it-yourself implementation. Equivalent harness configuration in Claude Code, Cursor agents, or your own orchestrator works the same way. Build, don't buy, but build on top of an existing harness.

Workers are workers. They will execute whatever the contract makes legal and whatever no contract forbids. If you do not name the boundary above them, you are running spec-kit, and the only governance you have is your finger on the halt key.

I have named the boundary in three places now. Topology in Chapter 14 placed it. Protocol stack in Chapter 15 wired it. The factory in this chapter contracts it.

The factory is not the constructor. It is the contract on creation. Stop letting agents instantiate themselves.

Coming up next, Chapter 17 on the Facade pattern shows how the orchestrator hides Topology, Protocol Stack, and Factory behind one face.