plankit / pk / guide

Plan-driven development with AI

A methodology for directing non-deterministic systems toward deterministic outcomes.

The core idea

AI-assisted development is powerful but non-deterministic. Without constraints, LLMs reach for familiar patterns — regex for structured data, flattening hierarchies and then reconstructing them with heuristics, inventing plausible fallbacks instead of surfacing errors. These tendencies produce code that looks right but drifts from the developer's intent.

Plan-driven development is the countermeasure: explore the idea, agree on the approach, then execute against a stable reference. The developer directs; the AI executes. The plan is what you control outcomes against.

The workflow

  1. Enter plan mode and describe what you need. Claude explores the codebase, asks questions, proposes an approach.
  2. Review the plan. Adjust scope, push back on assumptions, iterate until confident.
  3. Approve. Claude executes against the agreed plan.
  4. Preserve. The plan is saved as a timestamped file — committed, protected, immutable.
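Step 4 can be sketched in a few lines. This is an illustrative assumption, not pk's actual implementation: the `plans/` directory and the filename format are hypothetical, and "immutable" here means the file is made read-only after writing.

```python
from datetime import datetime, timezone
from pathlib import Path


def preserve_plan(text: str, plans_dir: Path = Path("plans")) -> Path:
    """Save an approved plan as a timestamped, read-only file.

    Hypothetical sketch: directory layout and naming are assumptions.
    """
    plans_dir.mkdir(parents=True, exist_ok=True)
    # UTC timestamp so filenames sort chronologically across machines.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = plans_dir / f"{stamp}-plan.md"
    path.write_text(text)
    path.chmod(0o444)  # read-only for everyone: immutable in spirit
    return path
```

Committing the resulting file alongside the change it describes keeps the "why" next to the "what" in history.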

The plan captures why a change exists, not just what changed. The commit history shows the what; the preserved plan shows the why.

Review the plan, not the code. You're evaluating approach, scope, and whether the plan solves the right problem.

Why guidelines matter

Every convention in plankit's guidelines is a countermeasure to a specific LLM tendency. "Data-first, model-first" prevents discarding structure. "Fail fast, no silent fallbacks" prevents masking problems with invented defaults. "All-or-nothing consistency" prevents partial updates across related files.
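The "fail fast, no silent fallbacks" convention can be shown with a minimal sketch. The config key and loader are hypothetical; the point is that a missing or malformed value raises immediately instead of being papered over with an invented default like `config.get("timeout_seconds", 30)`.

```python
def load_timeout(config: dict) -> float:
    """Read a required timeout from config, failing fast on bad input."""
    try:
        return float(config["timeout_seconds"])
    except KeyError:
        # Surface the missing key rather than inventing a plausible default.
        raise KeyError("config missing required key 'timeout_seconds'")
    except (TypeError, ValueError):
        raise ValueError(
            f"timeout_seconds is not a number: {config['timeout_seconds']!r}"
        )
```

The silent-fallback version would hide a misconfigured deployment until it failed in some harder-to-diagnose way downstream.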

The developer's role shifts from writing code to directing outcomes: precision in plans, attention to testability, pushing back on assumptions during review. Deterministic outputs come from deliberate constraints.

When guidelines are ignored

Guidelines work — when they're read. The more common a pattern is in training data, the more likely it is to override project-specific instructions. Less common conventions — the ones that most need documenting — are the ones most at risk.

A rule learned in one conversation and recalled from memory doesn't carry the same weight as a rule in the project's CLAUDE.md that is read every session. Guidelines need to be present, not just remembered.

When exploration becomes editing

When the AI races to document an idea before you've said "document this," the documentation absorbs the exploration rather than reflecting it. Once an artifact exists as a file, every response becomes "what should I edit next?" instead of "what are we actually trying to figure out?"

Exploration ends when the developer says it ends. When a session is flailing on a draft, look for the first clear articulation of the idea from earlier in the conversation — it's usually cleaner than anything generated later.

Breaking the loop

Some failure modes are sticky — polite iteration won't pull the AI out. Direct intervention is what breaks them:

Say "stop" with authority. You aren't being rude — you're breaking a loop the model can't break from the inside.

Ask "what is the value?" instead of "how do we word this?" The loop survives on rephrasing. Substance breaks it.

Paste the earlier clear statement back into the conversation. The cleanest version of an idea is often the first unforced articulation, before formalization drained it.

After three pushbacks that each draw more pushback, stop correcting. The premise is wrong, not the current draft. Reset or start over.

The testing loop

Testing is not just verification — it's a collaboration accelerator. Test at session start to establish a baseline. Test before and after changes to catch regressions while context is fresh. Let the AI run tests itself to close the feedback loop. Each layer compounds: plan + guidelines + tests + documentation. The AI operates with confidence; the developer stays in control of direction.
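The baseline-then-recheck loop can be sketched as follows. The test command here is a stand-in for the project's real suite (`pytest`, `npm test`, whatever applies); the comparison logic is the part being illustrated.

```python
import subprocess
import sys


def suite_is_green(cmd: list[str]) -> bool:
    """Run the test command; green means exit code 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


# Stand-in for the real suite; replace with the project's test command.
TEST_CMD = [sys.executable, "-c", "assert 1 + 1 == 2"]

# Session start: establish a baseline before any AI edits.
baseline = suite_is_green(TEST_CMD)

# ... changes are made against the approved plan ...

# Recheck while context is fresh: a green-to-red flip is a regression.
after = suite_is_green(TEST_CMD)
if baseline and not after:
    raise SystemExit("regression: suite was green at baseline, red after the change")
```

Letting the AI run this loop itself closes the feedback cycle; the developer only needs to intervene when the baseline comparison flags a flip.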

The hard part was never the code. It was figuring out what to be opinionated about.

Chaining sessions

Long sessions accumulate context that no prompt can fully transfer. When a session starts losing sharpness, start a new one alongside it. Ask the old session to write a handoff prompt. Let the new session build a plan from it. Before approving, copy the plan back to the old session for review — it has the context to catch what the new session missed.

Sessions are disposable. Context is not. Chaining preserves the context while refreshing the capacity.


Ready to try the approach? Get started with pk in five minutes.

The full methodology, including code review patterns and detailed examples, is maintained in the plankit repository.