23 KiB
| id | name | status | created |
|---|---|---|---|
| 20260529-claude-session-resume | Claude CLI Session Resume | proposed | 2026-05-29 |
Overview
Every chat turn in a conversation spawns a fresh, context-free claude -p
process and re-sends the entire conversation as a recomposed transcript. The
process never resumes the prior turn's Claude Code session, so each turn pays
to re-read project files, re-process the full history, and forfeits any
cross-turn prompt-cache hits. On long conversations the per-turn payload grows
O(N) in turns. (Reported externally: "every message starts a new Claude Code
session, re-sends prompt/history, re-reads project files → token waste.")
This is a deliberate trade-off, not a defect — the recompose-transcript model
buys uniform behavior across ~15 heterogeneous adapters, full daemon control of
per-turn prompt composition, and a single daemon-owned source of truth. But the
project's cited architectural inspiration, multica,
runs the same claude -p --output-format stream-json base invocation and yet
resumes the CLI's own session via --resume <session_id> with a small set of
guards. multica demonstrates that session resume is compatible with this daemon
architecture when the invalidation edges are handled explicitly.
Goal: add an opt-in, Claude-first, best-effort session-resume path that, on
a qualifying follow-up turn, passes --resume <session_id> and sends only the
latest user message instead of the full transcript — while never removing the
transcript path that remains the correct behavior for every other adapter and
for any turn that fails the resume guards.
Constraints:
- Do not regress any existing behavior. Resume is best-effort: when a guard
fails, the capability is absent, or
--resumeis rejected at runtime, the daemon falls back to today's full-transcript spawn for that turn. - Do not break the interactive
stream-json/AskUserQuestionmachinery (pendingHostAnswers,POST /api/runs/:id/tool-result). - Keep the daemon the source of truth: the stored session pointer is a cache keyed on daemon-owned conversation state, never the authoritative history.
- Claude-only in v1. Other adapters keep
resumesSessionViaCliunset and the transcript path unchanged. - Honor the UI/CLI dual-track rule: a force-fresh control must exist on both
the web composer and the
odrun path.
Open questions:
- Should resume default on for Claude, or ship behind a per-project opt-in for one release while we gather token-savings telemetry? (Leaning: default on, with a force-fresh escape hatch — see Design Decisions.)
- Should a resumed turn show a small "continuing session" affordance in the chat UI, or stay invisible? (Leaning: telemetry only in v1, no UI badge.)
- Do we extend the same machinery to other resumable CLIs (codex, cursor-agent) in a follow-up, gated per-adapter? (Out of scope here.)
Research
Existing System
- Claude's
buildArgsemits no session flag; every spawn is a cold start. Source:apps/daemon/src/runtimes/defs/claude.ts:45-74 - The daemon already parses
session_idfrom Claude'ssystem/initline and emits it as astatusevent, but it is only streamed to the client. Source:apps/daemon/src/claude-stream.ts:91 session_idis never persisted: the persisted-event mapper drops it, and themessagestable has no session column. Source:apps/daemon/src/db.ts:86-103(schema), the status-event branch ofdaemonAgentPayloadToPersistedAgentEventinapps/daemon/src/server.ts- Per-turn context is recomposed client-side into a markdown transcript and
sent as
message;currentPromptcarries only the latest user turn. Source:apps/web/src/providers/daemon.ts:171-185(buildDaemonTranscript),:326-330(call site) - The daemon already has the seam to send only the latest turn: when
def.resumesSessionViaCli === true,composeChatUserRequestForAgentskips the transcript and sendscurrentPrompt. Source:apps/daemon/src/server.ts:2542-2571,:10985-10989 RuntimeContext.hasPriorAssistantTurnis already computed per run and passed intobuildArgs— the hook a resume flag would read. Source:apps/daemon/src/runtimes/types.ts:19-29,apps/daemon/src/server.ts:11322,11376- The transcript is already scoped to the active agent and sanitized for prior
<question-form>markup — work that only applies to the daemon-owned transcript, not to a CLI-held session. Source:apps/web/src/providers/daemon.ts:127-136(scopeHistoryToAgent),:150-169(sanitizePriorAssistantTurnForTranscript) - The capability-probe pattern (probe
claude -p --help, set a capability flag, gate the arg) already exists for--include-partial-messages/--add-dir. Source:apps/daemon/src/runtimes/defs/claude.ts:16-24,58-71
Reference Implementation (multica)
multica runs the same base invocation and resumes via --resume, persisting
and re-injecting a per-(agent, issue) session id with guards that map directly
onto OD's edges:
- Same base argv, plus conditional
--resume. Source:multica/server/pkg/agent/claude.go:483-519(--resumeat:513-514) - Resume only when not a forced rerun, and only when the prior session ran on
the same runtime as the claiming task; excludes poisoned sessions; falls
back to the last task that recorded a session id.
Source:
multica/server/internal/handler/daemon.go:1306-1360 - session id captured from the
systemmessage and pinned mid-flight to the task row. Source:multica/server/internal/daemon/client.go:248(PinTaskSession) - A "Focus on THIS comment" prompt guard defends against the resumed session
inheriting the prior turn's completion marker.
Source:
multica/server/internal/daemon/prompt.go
Why OD diverged (product characteristics that shape this design)
OD is a synchronous, interactive design-chat, not an async issue/task runner. These traits are why resume must be guarded rather than unconditional, and why it stays Claude-first and opt-out-able:
- Interactive mid-turn tools. OD keeps
stream-jsonstdin open to answerAskUserQuestionwith a realtool_result. multica disables it (--disallowedTools AskUserQuestion). Resume must not disturb this path. - Per-turn prompt rewriting. OD recomposes the system prompt + skills +
memory + design system into the
# Instructionsblock of the stdin user message every turn. Because the instructions ride the stdin message (not a one-time--append-system-prompt), they still reach the model on a resumed turn — so OD keeps prompt control and gets resume. This is the key reason OD can adopt resume where the antigravity-cpath could not:agy -cactivated an internal agentic loop OD could not steer (apps/daemon/src/runtimes/defs/antigravity.ts:189-204);claude --resumestill consumes the fresh stdin turn. - Mid-conversation agent switching. OD lets a user switch active agent per conversation; a Claude session id is meaningless to Codex. Mirror multica's runtime-match guard.
- User-editable history. OD supports retry-from-message and editing prior turns. Claude's session is an immutable replay; once OD's history diverges from what produced the session, the session is stale. multica has no direct analog, so OD needs an extra epoch guard.
- Heterogeneous fleet. Most adapters have no compatible resume. Resume is a per-adapter opt-in; the transcript path stays the default everywhere else.
Constraints & Dependencies
- Contracts-first: any new request/response field lands in
packages/contractsbefore web/daemon wiring (AGENTS.mdcontract rule). - SQLite migration discipline: additive, backward-compatible schema only.
- Capability probe must gate
--resumeso forks lacking it (e.g. openclaude) never receive it (fallbackBinsincludesopenclaude).
Key References
apps/daemon/src/runtimes/defs/claude.tsapps/daemon/src/claude-stream.ts:91apps/daemon/src/server.ts:2542-2571,10985-10989,11322,11376apps/daemon/src/runtimes/types.ts:19-29,45-54,149-155apps/web/src/providers/daemon.ts:127-185,326-330apps/daemon/src/db.ts:86-103,888-908,1031-1050- multica:
server/pkg/agent/claude.go,server/internal/handler/daemon.go:1306-1360
Design
Architecture Overview
Three additions, all behind a Claude-only capability gate:
- Capture + persist the Claude
session_id(already parsed) keyed on(conversationId, agentId), plus ahistoryEpochsnapshot, the run'sworkDir, and aruntimeIdruntime-identity fingerprint. Update on run success. Two distinct guards ride these fields:workDiris a cwd-scope guard, not just the spawn cwd: Claude Code sessions live under a per-cwd path, soclaude --resume <id>is only meaningful from the same working directory that produced the session —resolveResumeDecision()rejects a pinned row whoseworkDirdiffers from the current run's.runtimeIdis the OD analog of multica's "same runtime" guard (multica's guard is about the producing runtime, not just the cwd). The Claude adapter can resolve through different binaries/forks (fallbackBinsincludesopenclaude, Research lines 140-141), and a session created by one executable is not guaranteed resumable by another.runtimeIdis a small, stable fingerprint of the resolved runtime (e.g. the resolved bin path plus its--resumecapability probe result);resolveResumeDecision()rejects a pinned row whoseruntimeIddiffers from the current run's resolved runtime, so a mid-conversation binary/fork swap falls back to a transcript spawn rather than attempting--resumeagainst a foreign session.
- Decide at run start whether to resume: a pure
resolveResumeDecision()helper that returns asessionIdonly when every guard passes. - Apply: when resuming,
buildArgsappends--resume <id>and the prompt composer sends onlycurrentPrompt(existingskipTranscriptseam) plus an anti-echo guard line. On runtime rejection, fall back to a transcript spawn.
turn N: claude system/init → session_id ──pin──▶ conversation_agent_session
(conv, agent, sid, epoch, wd, rt)
turn N+1: resolveResumeDecision(conv, agent, epoch, forceFresh, caps)
│ pass → buildArgs(+--resume sid) + skipTranscript + anti-echo guard
│ fail → today's full-transcript cold spawn
claude --resume rejects sid? → one retry: transcript cold spawn
Change Scope
packages/contracts: addforceFreshSession?: booleanto the chat/run request DTO; addsessionResumed?: booleanto run status/telemetry shape.apps/daemon/src/db.ts: add aconversation_agent_epochtable keyed on(conversation_id, agent_id)withhistory_epoch INTEGER NOT NULL DEFAULT 0(the live agent-scoped epoch, bumped only when a mutation affects that agent's scoped history — edit / truncate / retry of a turn insidescopeHistoryToAgent's slice for that agent); newconversation_agent_sessiontable + queries (itshistoryEpochcolumn records the epoch a session was pinned under, distinct from the liveconversation_agent_epoch.history_epoch; itsruntimeIdcolumn records the runtime fingerprint a session was produced by); stop strippingsession_idso it reaches the pin path.apps/daemon/src/runtimes/types.ts: extendRuntimeContextwithresumeSessionId?: string; addsupportsSessionResumecapability key.apps/daemon/src/runtimes/defs/claude.ts: probe--resume; emit it whenruntimeContext.resumeSessionIdis set.apps/daemon/src/server.ts:resolveResumeDecision(), pin-on-success, resume-awareskipTranscript, anti-echo guard line, runtime-rejection fallback.apps/web/src/: force-fresh control in the composer; on a history edit/truncate/retry, bump the live epoch of every agent whosescopeHistoryToAgentoutput changes as a result of the mutation — defined by diffing each agent's scoped slice before vs after, not by which agent's slice literally contains the mutated turn. This matters becausescopeHistoryToAgentcuts each agent's slice at the most recent assistant turn from a different agent (apps/web/src/providers/daemon.ts:127-135), so truncating or retrying that foreign-agent boundary turn (or any of its descendants) shifts or removes the active agent's resumed suffix even though the mutated turn is not inside the active agent's slice — that case must bump the active agent's epoch too. The diff-the-scoped-output rule still prevents a Codex-only edit that does not change Claude's scoped slice (e.g. editing the content of a Codex turn that stays the boundary) from invalidating Claude's pinned session.apps/daemon/src/cli.ts:--fresh-sessionflag on the run path (dual-track).
Design Decisions
- Claude-only, capability-gated, opt-in per adapter. Other adapters are
untouched. Note the existing
resumesSessionViaCliflag is a static, per-adapter skip-transcript switch (agy resumes on every follow-up or never). Claude resumes conditionally per turn, so it must NOT reuse that static flag to gate transcript-skipping — see Design Decision #1a. 1a. Transcript-skip is keyed to the per-run resume decision, never a static flag. Both--resumeand skip-transcript derive from the single per-runresumeSessionIdreturned byresolveResumeDecision(): present → emit--resumeAND skip the transcript; null → omit--resumeAND send the full transcript. They are coupled so a guard-fail / no-capability / force-fresh turn can never produce the broken state "cold spawn + no--resume+ no transcript" (which would silently drop all prior context and break the non-regression guarantee in Design Decision #4). For Claude, leave the staticresumesSessionViaCliflag unset; introducesupportsSessionResumeas a capability (can this CLI resume at all?) distinct from the per-turn decision (should THIS turn resume?). - Store a pointer keyed on
(conversationId, agentId)— the OD analog of multica's per-(agent, issue) session. A dedicated small table avoids bloatingconversationsand makes the agent-switch guard a natural key miss. historyEpochguard (agent-scoped). Each(conversation, agent)carries a monotonic epoch bumped whenever that agent's scoped history is edited, truncated, or retried. The epoch is agent-scoped, not conversation-wide, because the resumable state itself is agent-scoped: OD scopes the prompt transcript to the active agent (scopeHistoryToAgent, Research lines 79-83) and pins the session pointer by(conversationId, agentId)(Design Decision #2). A conversation-wide epoch would let an edit to a Codex-only turn bump the key and invalidate Claude's pinned session even though Claude's scoped history — the actual prompt source — never changed, making the invalidation model stricter than the prompt source. The live epoch therefore lives in the newconversation_agent_epochtable keyed on(conversationId, agentId)(the web edit/retry path bumps the epoch of every agent whosescopeHistoryToAgentoutput changes under the mutation — which includes the active agent when a truncate/retry lands on the most recent foreign-agent boundary turn or its descendants, sincescopeHistoryToAgentcuts the slice at that boundary (apps/web/src/providers/daemon.ts:127-135) and dropping/regenerating it shifts or removes the active agent's resumed suffix even though the mutated turn sits outside that agent's slice); the pinnedconversation_agent_session.historyEpochrecords the epoch a session was produced under. At run start the daemon readsconversation_agent_epoch.history_epochfor the active(conversation, agent)to formcurrentHistoryEpoch, thenresolveResumeDecision()compares it against the pinned row; a mismatch invalidates resume (Claude's immutable session would diverge from that agent's edited history). This is OD's addition over multica.- Best-effort with guaranteed fallback. If
claude --resumeexits early reporting an unknown/invalid session, retry the same turn once without--resumeand with the full transcript. Worst case equals today. - Anti-echo guard, not transcript sanitization. On a resumed turn the prior
<question-form>lives in Claude's session, beyond reach ofsanitizePriorAssistantTurnForTranscript. Replace that defense with a short instruction-block guard (multica's "Focus on THIS turn" analog) telling the model the prior form was already answered. - Force-fresh is first-class and dual-surface. Retry-from-message, history
edits, an explicit composer toggle, and
od --fresh-sessionall setforceFreshSession, mirroring multica'sForceFreshSession.
Why this design
It captures multica's token win on the dominant path (Claude, linear conversation) while preserving the four OD properties the transcript model was protecting: interactive tools, per-turn prompt control, agent switching, and editable history. The fallback and capability gate make it strictly non-regressing, so it can ship default-on without a flag day.
Test Strategy
Red-spec first at the cheapest layer that sees the symptom (daemon HTTP e2e),
per AGENTS.md:
- Resume happy path (red on main). Two turns, same conversation, same agent:
assert turn 2 spawns with
--resume <sid-from-turn-1>and that turn 2's stdin prompt does not contain the full transcript (onlycurrentPrompt). Every guard-fail case below asserts BOTH halves of the coupling (Design Decision #1a): no--resumeAND the full transcript is present in the stdin prompt — this is the explicit non-regression proof that a fallback turn never loses prior context. - Agent-switch guard. Turn 1 Claude, turn 2 after switching agent → no
--resumeAND turn 2's prompt contains the full transcript. - Force-fresh. Retry-from-message and
--fresh-session→ no--resumeAND full transcript present. - History-edit epoch guard. Editing a prior turn in the active agent's scoped
history bumps that agent's epoch → no
--resumeAND full transcript present. Conversely, editing the content of a turn that belongs only to a different agent's scoped history without changing Claude's scoped slice (e.g. a Codex-only turn that stays the boundary) must NOT bump Claude's epoch → Claude still resumes with--resume. - Cross-agent boundary retry/truncate guard. History contains a Codex
(foreign-agent) assistant turn followed by later Claude turns, so Claude's
scopeHistoryToAgentslice starts just after that Codex boundary turn. A retry-from-message or truncate on the Codex boundary turn (or its descendants) drops or shifts Claude's resumed suffix, so it MUST bump Claude's epoch → no--resumeAND full transcript present — even though the mutated turn sits outside Claude's own slice. (This is the case a "mutated turn is inside the slice" rule would miss: Claude could otherwise resume against stale history.) - Work-dir guard. A pinned session whose
workDirdiffers from the current run's (e.g. the conversation's project cwd moved) → no--resumeAND full transcript present. - Same-runtime guard. A pinned session whose
runtimeIddiffers from the current run's resolved runtime (e.g. the Claude bin resolved toopenclaudeon turn 2 while cwd and agent stayed the same) → no--resumeAND full transcript present. - Runtime-rejection fallback. Stub a
claudethat rejects--resume→ the daemon retries WITHOUT--resumeAND WITH the full transcript, and the run still succeeds. - Capability probe. A
claude --helpwithout--resume→ flag never sent AND every turn carries the full transcript (skip-transcript stays off becauseresolveResumeDecisionreturns null without the capability). - Interactive intact. An
AskUserQuestionturn still resolves viaPOST /api/runs/:id/tool-resultunder resume.
Human verification (per AGENTS.md, two namespaced runtimes: main vs branch):
drive a multi-turn chat through production HTTP only; confirm continuity holds
and compare per-turn input-token counts (resume turn should drop sharply).
Pseudocode
// server.ts — at run start, before buildArgs
function resolveResumeDecision(ctx): string | null {
if (!agentDef.capabilities.supportsSessionResume) return null;
if (chatBody.forceFreshSession) return null;
if (!hasPriorAssistantTurn) return null;
const row = getConversationAgentSession(db, conversationId, agentId);
if (!row || row.agentId !== agentId) return null; // agent-switch guard
// currentHistoryEpoch is read from conversation_agent_epoch for THIS
// (conversationId, agentId) at run start — agent-scoped, so a Codex-only edit
// never invalidates Claude's pinned session.
if (row.historyEpoch !== currentHistoryEpoch) return null; // edit guard
if (row.workDir !== currentWorkDir) return null; // cwd-scope guard
if (row.runtimeId !== currentRuntimeId) return null; // same-runtime guard
return row.sessionId;
}
// claude.ts buildArgs
if (runtimeContext.resumeSessionId && caps.sessionResume) {
args.push('--resume', runtimeContext.resumeSessionId);
}
// prompt composition — keyed ONLY to the per-run decision, NOT a static flag.
// `--resume` (above) and skipTranscript share the same source of truth, so a
// guard-fail turn (resumeSessionId == null) always sends the full transcript.
const skipTranscript = resumeSessionId != null;
// on run success (mirrors multica PinTaskSession)
if (finalSessionId) upsertConversationAgentSession(db, {
conversationId, agentId, sessionId: finalSessionId,
historyEpoch: currentHistoryEpoch, workDir, runtimeId: currentRuntimeId,
});
// runtime rejection
if (exitedWith('unknown session') && usedResume) {
return respawnWithoutResumeAndFullTranscript();
}
File Structure
specs/change/20260529-claude-session-resume/spec.md(this file)- (implementation, follow-up PR) contracts → db → runtime types → claude def → server orchestration → web composer → cli flag
Plan
- Land this spec (this PR).
- Contracts:
forceFreshSession,sessionResumed. - DB:
conversation_agent_session(withruntimeId) +conversation_agent_epochtables + queries; stop droppingsession_id. - Runtime:
RuntimeContext.resumeSessionId,supportsSessionResumeprobe,--resumeinclaude.ts. - Server:
resolveResumeDecision, pin-on-success, resume-aware skipTranscript, anti-echo guard, rejection fallback. - Web + CLI: force-fresh control on both surfaces; epoch bump on history edit.
- Tests per Test Strategy, then human verification.
Notes
Implementation
- Steps 2–6 ship as separate PRs after this spec is accepted; each carries its own red spec.
- Keep
resolveResumeDecisiona pure function for unit-testing the guard matrix without spawning a process.
Verification
pnpm guard,pnpm typecheck,pnpm --filter @open-design/daemon testper changed surface.- Token-savings telemetry (
sessionResumed, input-token delta) confirms the win that motivates the change.