Build Log May 3, 2026 ⏱️ 7 min read

The Orchestra Now Asks Before Building

TL;DR

🤔 The gap: A user typed orchestra plan: finish the migration... expecting a plan to review before anything got built. The skill said plan mode wasn’t supported and offered a fallback.
🏗️ The constraint: Orchestra’s entire design is “every thinking step happens in a fresh, separate Claude session.” Bolting plan mode into the main session would have broken that rule.
✅ The fix: A new Planner worker fires once after research, writes the plan to a file, then the run pauses. Approve and the build starts. Send revisions and the Planner respawns with the feedback baked in. As many revisions as the user wants.

🎻 What Orchestra Actually Is

Orchestra is the protocol you reach for when a task is too big for one Claude session to hold in its head: migrations, audits, multi-phase builds with research, design, code, tests, and visual checks all stacked on top of each other.

The way it works is unusual. The Claude session you talk to is just a narrator. It doesn’t do any of the actual thinking.

Instead, it runs a small command-line program (the “runner”) that lives in the background. That runner is the thing that spawns brand-new Claude sessions to do every piece of real work.

Real-world analogy: picture the foreman on a building site. He doesn’t pour concrete, lay bricks, or wire the lights. He flips through clipboards, calls in the next specialist when one finishes, and tells you how it’s going. Orchestra’s narrator is the foreman. Each “worker” (research, build, test, audit, polish) is a different specialist arriving with empty hands and full attention.

That setup gives you two things at once. The narrator session stays small (no memory bloat across phases), and every specialist gets a full clean working memory of about a million tokens to focus on its one job.
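To make the split concrete, here is a minimal sketch of the narrator / runner / worker relationship. It is an illustration only: `WorkerSession`, `Runner`, and `narrate` are hypothetical names, and a real worker is a separate Claude session rather than a Python object.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerSession:
    """Hypothetical stand-in for one fresh Claude session."""
    role: str
    context: list[str] = field(default_factory=list)  # starts empty every time

    def run(self, brief: str) -> str:
        self.context.append(brief)      # only this worker's context grows
        return f"{self.role}: done"     # a short result goes back to the runner


class Runner:
    """Hypothetical runner: one new session per phase, keeps only the results."""

    def __init__(self) -> None:
        self.results: list[str] = []

    def run_phase(self, role: str, brief: str) -> str:
        worker = WorkerSession(role)    # brand-new session, nothing carried over
        result = worker.run(brief)
        self.results.append(result)
        return result


def narrate(runner: Runner, phases: list[tuple[str, str]]) -> list[str]:
    """The narrator relays status lines and never sees a worker's context."""
    return [f"[narrator] {runner.run_phase(role, brief)}" for role, brief in phases]


lines = narrate(Runner(), [("research", "survey the docs"), ("build", "apply the plan")])
print("\n".join(lines))
```

The point of the shape is that the only thing crossing back to the narrator is a short status string, so its memory stays flat no matter how many phases run.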

The narrator does no thinking.

Every reasoning step happens in a fresh worker session. The visible chat just narrates what the runner is doing in the background.

🚧 The Gap We Hit

The user typed something like this:

/one-shot-orchestra plan let’s finish the migration from
render to cloudflare pages...

The word plan there matters. It’s how you tell our other protocol, /one-shot-scripts, that you want to see and approve a written plan before any code gets touched. The user expected orchestra to behave the same way.

It didn’t. Orchestra responded that plan mode wasn’t supported and suggested falling back to /one-shot.

Reasonable enough as a fallback. But the user wanted the real thing: orchestra’s heavy-duty multi-phase machinery, with a chance to read the plan and shape it before the runner started spending hours on workers.

Three steps from a normal command to a missing feature.

🔁 Why The Easy Fix Was Wrong

In /one-shot-scripts, plan mode is simple. After the research phase, the same Claude session writes the plan, hands it to Claude Code’s built-in “approve plan” popup, and the user sees it inline.

That works because in one-shot-scripts, the same session does everything. The session that did the research is the session that writes the plan, so the plan comes straight out of its own working memory.

Orchestra is built around the opposite rule: the narrator session never does any thinking work.

So if we asked the narrator to “just write the plan,” we’d be putting planning logic right back where we’d spent the whole architecture trying to avoid it.

Orchestra’s rule

Every thinking step happens in a fresh, separate Claude session. The narrator just drives the runner.

The wrong shortcut

Have the narrator write the plan itself. Saves a spawn, breaks the design, bloats the narrator’s memory, and the next big task pays the price.

SHORTCUT

narrator writes the plan
thinking sneaks back into main session
memory bloats across phases
orchestra rule broken

RIGHT FIX

fresh Planner spawn writes the plan
narrator just opens the file and shows it
main session stays small
architecture intact

Why the cheap option was the expensive one.

📐 The Shape Of The Fix

The plan had to be written by a brand-new Claude session that the runner spawns specifically to write it. So we added a new role to orchestra’s lineup: the Planner.

The Planner is what we call a “decision worker.” That’s a session whose job is to make a decision and write a document, not to build anything.

We already had a few of these: the Conductor decides how to split work into parallel jobs, the Merger combines results, the Loop-Judge decides whether the run needs another pass. The Planner slots in next to them.

It fires exactly once per run, in one specific spot:

🔍 Diagnose: figure out what the task actually needs
↓
🗺️ Recon: survey the project files
↓
📚 Research: pull in outside context (docs, examples, APIs)
↓
🧠 Planner (new): synthesize all three into a written plan
↓
🛑 Pause for user approval
↓
🔨 Builder, Test, Harden, Document, Verify, Visual, Polish: business as usual

This is the only spot where an orchestra run stops and waits for a human. Everything else is autonomous from start to finish.
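The phase order above can be sketched as a small function. This is an assumption-laden illustration: the lowercase phase names and `build_pipeline` are made up for the example and are not the runner's real identifiers.

```python
# Phase names are illustrative labels taken from the list above.
EARLY_PHASES = ["diagnose", "recon", "research"]
LATE_PHASES = ["builder", "test", "harden", "document", "verify", "visual", "polish"]


def build_pipeline(plan_mode: bool) -> list[str]:
    """Return the phase order; the Planner and the gate appear only in plan mode."""
    gate = ["planner", "pause_for_approval"] if plan_mode else []
    return EARLY_PHASES + gate + LATE_PHASES


print(build_pipeline(plan_mode=True))
```

With `plan_mode=False` the list is the old pipeline unchanged, which is the property we cared about: plan mode adds one spawn and one gate and touches nothing else.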

OLD PIPELINE

Diagnose
Recon
Research
Builder fires immediately
(no human checkpoint)

→

WITH PLAN MODE

Diagnose
Recon
Research
Planner writes plan.md
Pause for user approval
Builder fires after approve

Same pipeline, one new spawn, one new gate.

📝 What The Plan Actually Looks Like

The Planner’s brief tells it to write a plan with eight specific sections. We pinned the structure down so the user always knows what they’re reading.

Section	What goes in it
Goal	One paragraph in plain language saying what success looks like.
Scope	What’s in, what’s out, and which fuzzy bits we made a call on.
Approach	The technical route chosen, with a sentence on why not the alternatives.
Decomposition	A numbered list of what the Builder will actually do. Specific files, endpoints, settings.
Risks & mitigations	What could go sideways and what we’ll do about each.
Verification strategy	How later phases will confirm the work is correct.
Estimated complexity	Small / medium / large, plus a note on how many workers will run in parallel.
Open questions	Anything unresolved, surfaced for the user to answer. “None” if there genuinely are none.

The Planner writes this whole document to a file called work/plan.md. The narrator’s job at the gate is to open that file and show it to the user word-for-word, not summarized.

The user needs to see exactly what they’re saying yes or no to.
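A pinned structure is also easy to check mechanically. The sketch below assumes each section begins with a markdown heading line such as `## Goal`; that heading format and the `missing_sections` helper are assumptions for illustration, not something the Planner's brief is known to specify.

```python
REQUIRED_SECTIONS = [
    "Goal", "Scope", "Approach", "Decomposition",
    "Risks & mitigations", "Verification strategy",
    "Estimated complexity", "Open questions",
]


def missing_sections(plan_text: str) -> list[str]:
    """Return the required section names that have no heading line in the plan."""
    # Assumption: sections are introduced by markdown headings ("# ..." or "## ...").
    headings = {
        line.lstrip("#").strip().lower()
        for line in plan_text.splitlines()
        if line.startswith("#")
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


# A sample plan that deliberately leaves out the last section.
sample = "\n".join(f"## {name}\n..." for name in REQUIRED_SECTIONS[:-1])
print(missing_sections(sample))
```

A check like this could run before the gate fires, so the user never gets asked to approve a plan with a hole in it.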

Eight required sections, never summarized.

Goal, Scope, Approach, Decomposition, Risks, Verification, Complexity, Open questions. The narrator opens work/plan.md and surfaces it word for word so the user is approving the actual document, not a paraphrase.

🎛️ Two New Buttons For The Narrator

Once the plan is on screen, the user replies in normal English. The narrator then picks one of two new commands to run.

Command	When the narrator uses it	What the runner does
`orchestra approve <run-id>`	User said yes (“approve”, “ship it”, “go”).	Flips a flag called `plan_approved` to true and resumes the run at the Builder phase.
`orchestra revise <run-id> "feedback"`	User wants changes (“move the migration before the deploy step”).	Writes the feedback to a notes file every worker reads, bumps a revision counter, and triggers a fresh Planner respawn.

If the user’s reply is ambiguous (“hmm, maybe”), the narrator is told not to guess. It asks again.
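The branching is simple enough to sketch. In the real skill the narrator reads the reply as natural language, so the keyword matching below is only a crude stand-in; `choose_command`, the phrase lists, and the argv-style return value are all assumptions for the example.

```python
from typing import Optional

# Crude stand-ins for the narrator's judgment, not an exhaustive vocabulary.
APPROVE_PHRASES = {"approve", "approved", "ship it", "go", "yes"}
HEDGE_MARKERS = ("hmm", "maybe", "not sure")


def choose_command(run_id: str, reply: str) -> Optional[list[str]]:
    """Map a user reply to a runner command, or None when the reply is ambiguous."""
    text = reply.strip().lower().rstrip(".!")
    if not text or any(marker in text for marker in HEDGE_MARKERS):
        return None                                   # do not guess: ask again
    if text in APPROVE_PHRASES:
        return ["orchestra", "approve", run_id]
    return ["orchestra", "revise", run_id, reply.strip()]  # anything else is feedback


print(choose_command("4f2", "ship it"))
print(choose_command("4f2", "hmm, maybe"))
```

Returning `None` for the ambiguous case keeps the "ask again" branch explicit instead of letting an unclear reply fall through to approve or revise.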

One reply, two branches, no guessing.

🔁 The Revision Loop

Revisions have no cap. The user can ask for changes once, three times, or ten times.

We don’t want a tool that gives up on you because you’re still figuring out the shape.

The trick is what happens on each revision. We don’t want the Planner to write a “v2 changes” section tacked onto the bottom.

We want the new plan to read as if the Planner had written it correctly the first time, with the user’s feedback baked in.

📥 User feedback arrives in plain English
↓
📝 Runner appends it to a shared notes file (running-notes.md)
↓
🗑️ Old result file deleted (so the runner sees the next one as new)
↓
🆕 Fresh Planner spawns, given the prior plan AND the user’s feedback
↓
✍️ New plan written, surfaced to user, gate fires again

Because the Planner is always a brand-new session, there’s no memory of the old plan getting in the way. It reads the prior plan as a document, reads the feedback as a document, and writes a new clean plan from those inputs.

Why this matters: revision rounds in a regular session tend to drift, because the old version is still sitting in the model’s context and keeps pulling the rewrite back toward it. A fresh session carries none of that. It reads the latest inputs and writes the new plan from those alone.
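The bookkeeping on the runner side can be sketched in a few lines. Only `running-notes.md`, the revision counter, and the `plan_revising` state come from this post; the `state.json` and `planner-result.json` file names, the notes heading, and the `revise` function are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def revise(run_dir: Path, feedback: str) -> int:
    """Record feedback, bump the counter, clear the old result; return the revision."""
    notes = run_dir / "running-notes.md"
    with notes.open("a", encoding="utf-8") as fh:      # append, never overwrite
        fh.write(f"\n## Plan feedback\n{feedback}\n")

    state_path = run_dir / "state.json"                # hypothetical state file name
    state = json.loads(state_path.read_text(encoding="utf-8"))
    state["revision_counter"] = state.get("revision_counter", 0) + 1
    state["state"] = "plan_revising"
    state_path.write_text(json.dumps(state), encoding="utf-8")

    # Hypothetical result file name: removing it lets the next result count as new.
    (run_dir / "planner-result.json").unlink(missing_ok=True)
    return state["revision_counter"]


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "state.json").write_text(json.dumps({"state": "plan_pending_approval"}))
    (run_dir / "planner-result.json").write_text("{}")
    revision = revise(run_dir, "move migration before deploy")
    notes_text = (run_dir / "running-notes.md").read_text(encoding="utf-8")
    result_exists = (run_dir / "planner-result.json").exists()

print(revision, result_exists)
```

Nothing here touches the Planner itself. The respawn is a separate step, which is what keeps each revision a clean session rather than a patch on the last one.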

~/orchestra — revision 02

$ orchestra revise 4f2 "move migration before deploy"

→ feedback appended to running-notes.md

→ revision_counter = 2

→ old work/plan.md deleted

→ fresh Planner spawned (run id 4f2-r2)

✍️ new plan written, gate fires again

One revision round, no caps, no drift.

🔀 The State Machine Got Three New Stops

Orchestra runs by cycling through “states” (named stages like routing, phase_running, scoring_pending). Plan mode added three new ones, plus a permanent flag.

planner_pending  →  planner_running  →  plan_pending_approval
                                                  ↓
                                          ( approve )  vs  ( revise )
                                                  ↓
                                              routing       plan_revising
                                            (Builder)            ↓
                                                          planner_running
                                                          (loops back)

Once the user approves, the plan_approved flag flips to true and stays true for the rest of the run.

Even if the Loop-Judge later decides to re-run an earlier phase to fix a quality issue, the plan gate doesn’t fire again. One approval, one run.
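Here is a minimal sketch of those transitions as a lookup table. The state names and the `plan_approved` flag come from the diagram above; the event names (`spawn`, `result`) and the `step` / `after_research` helpers are assumptions made for the example.

```python
# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    ("planner_pending", "spawn"): "planner_running",
    ("planner_running", "result"): "plan_pending_approval",
    ("plan_pending_approval", "approve"): "routing",
    ("plan_pending_approval", "revise"): "plan_revising",
    ("plan_revising", "spawn"): "planner_running",
}


def step(run: dict, event: str) -> dict:
    """Apply one event and return the updated run; reject illegal moves."""
    key = (run["state"], event)
    if key not in TRANSITIONS:
        raise ValueError(f"cannot handle {event!r} in state {run['state']!r}")
    updated = dict(run, state=TRANSITIONS[key])
    if event == "approve":
        updated["plan_approved"] = True      # sticky for the rest of the run
    return updated


def after_research(run: dict) -> str:
    """Where to go when Research finishes, including a Loop-Judge re-run."""
    if run.get("plan_mode") and not run.get("plan_approved"):
        return "planner_pending"
    return "routing"


run = {"plan_mode": True, "plan_approved": False,
       "state": after_research({"plan_mode": True})}
for event in ["spawn", "result", "revise", "spawn", "result", "approve"]:
    run = step(run, event)
print(run["state"], run["plan_approved"], after_research(run))
```

The last call is the no-double-fire case: once `plan_approved` is true, finishing Research again routes straight on instead of back to the Planner.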

Three new stops, one persistent flag, no double-fire.

🧪 How We Tested It Without Burning Tokens

We didn’t want to spawn real Claude sessions for every test (each one costs real money and takes minutes). So we built an offline test that fakes the worker outputs and just verifies the plumbing.

The test script lives at runner/test-plan-mode.sh. It runs six checks. All six pass.

#	What it verifies
1	`orchestra start --plan` writes `plan_mode=true` on the run’s state file.
2	After Research finishes, the runner routes to `planner_pending` instead of jumping straight to Builder.
3	When a Planner result file appears, the state advances to `plan_pending_approval` with a flag asking the narrator to step in.
4	`orchestra approve` flips `plan_approved=true` and resumes the run at Builder.
5	`orchestra revise` writes the feedback to the shared notes file, bumps the revision counter, and transitions to `plan_revising`.
6	If someone calls `approve` on a run that wasn’t started in plan mode, the runner refuses politely instead of corrupting state.

The one thing the test deliberately doesn’t do is launch a real Planner terminal. That path is one line in the spawn code, and we verify it by hand on real runs.
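The idea of faking worker output is roughly this: drop a file where a worker would have written one, then check what the plumbing does with it. The sketch below mirrors checks 3 and 6 in spirit only. The real test is a shell script, and `poll`, `approve`, `needs_narrator`, and `planner-result.json` are hypothetical names.

```python
import tempfile
from pathlib import Path


def poll(run: dict, run_dir: Path) -> dict:
    """Advance to the approval gate once a Planner result file exists."""
    if run["state"] == "planner_running" and (run_dir / "planner-result.json").exists():
        return {**run, "state": "plan_pending_approval", "needs_narrator": True}
    return run


def approve(run: dict) -> dict:
    """Refuse rather than mutate state when the run is not in plan mode."""
    if not run.get("plan_mode"):
        raise RuntimeError("run was not started in plan mode")
    return {**run, "plan_approved": True, "state": "routing"}


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    run = {"plan_mode": True, "state": "planner_running"}
    before = poll(run, run_dir)                          # no result yet: unchanged
    (run_dir / "planner-result.json").write_text("{}")   # fake the worker's output
    after = poll(run, run_dir)

try:
    approve({"plan_mode": False, "state": "routing"})
    refused = False
except RuntimeError:
    refused = True

print(before["state"], after["state"], refused)
```

No session is spawned anywhere in that flow, which is why a test shaped like this costs nothing to run.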

Six checks, all green, zero spawn cost.

The script at runner/test-plan-mode.sh fakes worker outputs and exercises every routing decision: flag set, route to Planner, approve resumes, revise loops, refuse-on-non-plan-mode. No real Claude sessions burned.

⚖️ Two Skills, Same Word, Different Implementations

This was the second skill we’ve added plan mode to. The other one, /one-shot-scripts, did it in a much simpler way, and the contrast shows why the orchestra version had to be different.

	one-shot-scripts	one-shot-orchestra
Where the plan is written	Inline, in the same Claude session running the protocol.	By a fresh-spawn Planner worker, in its own clean session.
How the user sees it	Through Claude Code’s built-in plan mode UI.	The narrator opens `work/plan.md` and surfaces it verbatim.
Approval mechanism	Built-in “approve plan” button in the Claude Code interface.	The narrator runs `orchestra approve <run-id>` when the user says yes.
Why this shape?	One session does everything anyway. Plan mode is a free addition.	The narrator does no thinking. Planning has to happen in a separate session.

The takeaway: the same feature can have totally different implementations depending on the architecture underneath. Both versions feel identical to use (“type plan, get a plan, approve it, build runs”), but the wiring behind them respects each skill’s rules.

Two roads, same destination, different wiring underneath.

🎁 What You Get

If you already have one-shot-orchestra, the plan-mode upgrade is part of the latest release. Type plan after the skill name and the gate fires automatically.

For huge tasks (migrations, audits, multi-day builds), the gate pays for itself the first time it catches a misread requirement before the runner spends an hour building the wrong thing.

~/orchestra — first run with plan mode

> /one-shot-orchestra plan finish the cloudflare migration

→ Diagnose, Recon, Research complete

→ Planner spawned (run id 9c1)

→ work/plan.md written (8 sections)

🛑 gate: review the plan, then approve or revise

> approve

→ plan_approved=true, Builder firing now

One-word command, one human checkpoint, autonomous after.

Get the orchestra (with plan mode baked in)

Multi-phase execution with a fresh Claude session for every step, plus an optional human gate before any code is written.
