* feat(claude): wire AskUserQuestion tool through chat + pin TodoWrite
Claude calls `AskUserQuestion` for mid-conversation clarifications when
the natural answer is one of a small finite set of choices. Until now
the tool round trip hit two dead ends in headless mode: claude-code -p
cannot prompt the user, so it auto-errored the tool and retried 4x;
the model then hedged by also writing the same options as a markdown
bulleted list. The host had no way to feed a real `tool_result` back.
This change makes the AskUserQuestion round trip work end to end:
* Switch Claude to `--input-format stream-json`. The daemon wraps the
prompt as a JSONL `user` message on stdin and keeps stdin OPEN, so
later writes (a `tool_result` for the open AskUserQuestion) feed
back into the same child instead of needing a fresh spawn.
* New `RuntimeAdapter.promptInputFormat()` ('text' default,
'stream-json' for Claude) so the spawn loop keeps the old close-on-
prompt behavior for every other agent.
* New `POST /api/runs/:id/tool-result` daemon endpoint and
`submitChatRunToolResult` web helper. Body carries `toolUseId` and
`content`; daemon writes a JSONL `user` message with the matching
`tool_result` content block.
* Track outstanding host answers on the run (`pendingHostAnswers`)
and close stdin on either a `usage` event or a synthesized
`turn_end` event (extracted from `assistant.message.stop_reason`
in `claude-stream`). Without the per-turn `turn_end` signal, stdin
would never close after the follow-up turn finished, and the run
would hang until the inactivity watchdog killed it.
* System prompt: tell Claude to use AskUserQuestion for follow-ups
with 2-4 finite choices, and to STOP after the tool call instead
of writing a markdown duplicate.
Web UI:
* New `AskUserQuestionCard` renders the tool input as labelled chip
buttons (single or multi select) with a Submit button styled like
the composer's Send. On submit the answer routes through
`submitChatRunToolResult` (live tool_result path) and falls back
to `onSubmitForm` (plain user message) only if the run has already
terminated. Selected chips persist across page reloads by
re-parsing the stored `tool_result.content`.
* Hide markdown text that follows an AskUserQuestion in the same
turn — defense in depth against the model emitting the duplicate.
* Collapse identical `AskUserQuestion` / `TodoWrite` retries inside
any tool group to a single card. TodoWrite is a snapshot tool,
so older calls are duplicates of state.
* Pinned TodoCard above the chat composer. The latest TodoWrite
snapshot across the conversation renders once, expandable /
collapsible header, count shows in-progress + completed (1/4),
Done button dismisses when all tasks finish, soft fade gradient
above so scrolling chat text dissolves into the panel instead of
hard clipping under the card.
* Composer gains a top shadow that only appears when the pinned
todo slot sits directly above it (dark mode strengthened).
* Accordion expand / collapse motion shared between TodoCard, the
ToolGroupCard disclosure, and BashCard output via
`grid-template-rows: 0fr -> 1fr` with `cubic-bezier(0.23, 1, 0.32, 1)`
and asymmetric durations (200ms enter, 140ms exit) per Emil
Kowalski's animation framework.
* Jump-to-latest button no longer unmounts on hide; slides up with
scale 0.9 -> 1 + fade on show, slides down with scale + fade on
hide. Always horizontally centered via `margin: 0 auto`.
i18n:
* `tool.askQuestion`, `tool.askQuestionSubmit`, `tool.askQuestionPending`,
`tool.askQuestionAnswered`, `tool.todosExpand`, `tool.todosCollapse`,
`tool.todosDone`, `tool.todosDismiss` added to all 18 locales.
Unblocker:
* Fix a pre-existing render loop in `ProjectView` when the user
clicks "New conversation". `handleNewConversation` now navigates
to the fresh conversation id synchronously after
`setActiveConversationId` so the route-sync effect at L512 and
the URL-sync effect at L851 do not ping-pong (route mismatch
triggered repeated reverts; React's nested-update guard fired).
* fix(claude): order turn_end after content blocks + cover chat switching
Two follow-up fixes to the AskUserQuestion + new-conversation work:
* `claude-stream.ts` emitted `turn_end` BEFORE iterating the assistant
message's content blocks. When claude-code lacks
`--include-partial-messages` (older builds), tool_use events surface
only from that loop, so the daemon's stdin-close handler saw an
empty `pendingHostAnswers` set and closed stdin before the
AskUserQuestion tool_use was even registered. The result: the model
retried, hit the same race, and gave up writing the questions in
prose. Emit `turn_end` AFTER the content loop so tool_use ids land
in `pendingHostAnswers` first.
* `server.ts` now ignores `turn_end` events with
`stop_reason: 'tool_use'`. That stop reason means the model paused
to wait for a tool execution (claude-code's internal tool runner
for Bash / Edit / Read, or a host-answered tool like
AskUserQuestion). Either way the conversation is still in flight —
closing stdin there would kill the follow-up response. Only the
natural turn-end stop reasons (`end_turn`, etc.) close stdin.
* `ProjectView.handleSelectConversation` now navigates to the picked
conversation id synchronously, mirroring the fix already in
handleNewConversation. The route-sync effect at L512 was reverting
the active conversation on every switch, ping-ponging with the
URL-sync effect at L851 until React's nested-update guard fired
with "Maximum update depth exceeded". Same bug class as the
pre-existing new-conversation render loop.
* docs(agents): capture AskUserQuestion runtime + chat UI conventions
Record the patterns this PR introduces so future contributors can find
them without spelunking server.ts:
* Agent runtime conventions — `RuntimeAgentDef.promptInputFormat`,
`run.pendingHostAnswers` / `run.stdinOpen` lifecycle, `turn_end`
ordering rule, `POST /api/runs/:id/tool-result` endpoint shape, the
Claude-only system prompt block that nudges AskUserQuestion, and the
`suppressAskUserQuestionFallbackText` defense in depth.
* Chat UI conventions — URL-load vs srcDoc render mode dispatch with
bridge disqualifiers, the dual iframe visibility swap pattern,
`isOurIframe` plus the active-iframe re-check for signals that must
only come from the visible iframe, pinned TodoCard via
`PinnedTodoSlot`, count includes `in_progress`,
`dedupeSnapshotToolRetries` for AskUserQuestion / TodoWrite stacks.
* i18n keys — 18 locale files, add the key to `types.ts` first.
* UI animation philosophy — `cubic-bezier(0.23, 1, 0.32, 1)` ease out,
asymmetric 200/140ms enter/exit, accordion via `grid-template-rows`,
no `transform: scale(0)`, keep mounted + toggle class for exit
transitions instead of relying on React unmount.
* fix(claude): read promptInputFormat as field, close stdin on deferred answer
Two PR review follow-ups on the AskUserQuestion stream-json wiring.
* server.ts:4616 referenced `runtimeAdapter.promptInputFormat()` — but
`runtimeAdapter` is not declared, imported, or assigned anywhere. The
prior adapter abstraction was deleted in #1656; when the changes
were folded back into the inline handler the format was moved onto
`RuntimeAgentDef.promptInputFormat`, but this call site was missed.
`server.ts` starts with `// @ts-nocheck` so typecheck never caught
it — every chat run hit `ReferenceError: runtimeAdapter is not
defined` the moment we wrote the prompt to a stdin-fed child, which
is every agent with `promptViaStdin: true` (claude, codex, copilot,
cursor-agent, gemini, opencode, pi, qoder). Read the format off the
in-scope `def` and default missing values to `'text'`.
* `submitToolResultToRun` cleared the answered id from
`pendingHostAnswers` but never closed stdin if a `turn_end` /
`usage` event had already fired with the set non-empty (deferred
by the event handler). The child then waited indefinitely for
further input until the inactivity watchdog killed it, losing the
model's follow-up response. Close stdin on the last-answer
transition when stream-json stdin is still open.
Test: pin `promptInputFormat` for every `promptViaStdin: true` agent
so future regressions of the field-vs-method contract fail at
typecheck-adjacent test time instead of in production. The new test
asserts `typeof def.promptInputFormat` is a string (or undefined),
not a function — exactly the shape mistake the original line made.
* fix(web): keep AskUserQuestion multi-select chips selected after reload when labels contain commas
`handleSubmit` joined multi-select answers with `', '` while the
reload parser split them on `','`. The pair is asymmetric: a valid
model-generated option like `"Yes, including images"` round-tripped
as `["Yes", "including images"]`, so after a page reload the locked
question card showed the user's pick as unselected — even though the
`tool_result` content the daemon actually wrote into the run was
correct, and the model saw the right answer. Bounded to post-reload
visual state, but silently confusing.
Switch to a `- ` bullet list per option, one per line, with the
parser stripping the leading `- ` back off. Newlines never appear
inside a label so the round trip is exact. The outer pairs separator
stays `\n\n` because individual answer bodies still never contain
that double-newline.
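A minimal sketch of the round trip described above, assuming hypothetical helper names (`serializeSelections`, `parseSelections`); the real `handleSubmit` and reload parser may be structured differently:

```typescript
// Hypothetical helpers illustrating the `- ` bullet encoding; not the shipped code.
function serializeSelections(labels: string[]): string {
  // One option per line, each prefixed with "- ".
  return labels.map((label) => `- ${label}`).join("\n");
}

function parseSelections(body: string): string[] {
  // Keep only bullet lines and strip the leading "- " back off.
  return body
    .split("\n")
    .filter((line) => line.startsWith("- "))
    .map((line) => line.slice(2));
}

const picked = ["Yes, including images", "No"];
const roundTripped = parseSelections(serializeSelections(picked));
console.log(roundTripped.length); // 2: the comma inside the label survives

// The old asymmetric pair: join on ", " and split on "," breaks the label apart.
const legacy = picked.join(", ").split(",").map((part) => part.trim());
console.log(legacy.length); // 3
```

The encoding relies on the stated invariant that a label never contains a newline; a label with an embedded newline would need escaping.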
* chore: drop accidental personal design-system file
`design-systems/foldar/DESIGN.md` was added to the AskUserQuestion
branch in 31ac531 by mistake — it's a personal brand spec that does
not belong in the upstream design-systems catalogue. Removing it
keeps the branch's surface area scoped to the feature.
Directory guide
This file is the single source of truth for agents entering this repository. Read this file first; after entering apps/, packages/, tools/, or e2e/, read that layer's AGENTS.md for module-level details. Do not copy module details back into the root file; root stays focused on cross-repository boundaries, workflow, and commands.
Core documentation index
- Product and onboarding: `README.md`, `README.zh-CN.md`, `QUICKSTART.md`.
- Contribution and environment: `CONTRIBUTING.md`, `CONTRIBUTING.zh-CN.md`.
- Architecture and protocols: `docs/spec.md`, `docs/architecture.md`, `docs/skills-protocol.md`, `docs/agent-adapters.md`, `docs/modes.md`.
- Roadmap and references: `docs/roadmap.md`, `docs/references.md`, `docs/code-review-guidelines.md`, `specs/current/maintainability-roadmap.md`.
- Directory-level agent guidance: `apps/AGENTS.md`, `packages/AGENTS.md`, `tools/AGENTS.md`, `e2e/AGENTS.md`.
Workspace directories
- Workspace packages come from `pnpm-workspace.yaml`: `apps/*`, `packages/*`, `tools/*`, and `e2e`.
- Top-level content directories: `skills/` (functional skills the agent invokes mid-task: utilities, briefs, packagers; see `skills/AGENTS.md`), `design-templates/` (rendering catalogue: decks, prototypes, image/video/audio templates; see `design-templates/AGENTS.md` and `specs/current/skills-and-design-templates.md`), `design-systems/` (brand `DESIGN.md` files), `craft/` (universal brand-agnostic craft rules a skill can opt into via `od.craft.requires`).
- `apps/web` is the Next.js 16 App Router + React 18 web runtime; do not restore `apps/nextjs`.
- `apps/daemon` is the local privileged daemon and `od` bin. It owns `/api/*`, agent spawning, skills, design systems, artifacts, and static serving.
- `apps/desktop` is the Electron shell; it discovers the web URL through sidecar IPC.
- `apps/packaged` is the thin packaged Electron runtime entry; it starts packaged sidecars and owns the `od://` entry glue only.
- `packages/contracts` is the pure TypeScript web/daemon app contract layer.
- `packages/sidecar-proto` owns the Open Design sidecar business protocol; `packages/sidecar` owns the generic sidecar runtime; `packages/platform` owns generic OS process primitives.
- `tools/dev` is the local development lifecycle control plane.
- `tools/pack` is the local packaged build/start/stop/logs control plane and mac beta release artifact preparation surface.
- `tools/pr` is the maintainer PR-duty control plane: a thin `gh` wrapper that encodes this repo's review-lane derivation, forbidden-surface flags, lane checklists, and validation-command suggestions.
- `e2e` owns user-level end-to-end smoke tests and Playwright UI automation; read `e2e/AGENTS.md` before editing its tests or commands.
Inactive or placeholder directories
- `apps/nextjs` and `packages/shared` have been removed; do not recreate or reference them.
- `.od/`, `.tmp/`, Playwright reports, and agent scratch directories are local runtime data and must stay out of git.
Development workflow
Environment baseline
- Runtime target is Node `~24` and `pnpm@10.33.2`; use Corepack so the pnpm version pinned in `package.json` is selected.
- New project-owned entrypoints, modules, scripts, tests, reporters, and configs should default to TypeScript.
- Residual JavaScript is limited to generated output, vendored dependencies, explicitly documented compatibility build artifacts, and the allowlist in `scripts/guard.ts`.
Local lifecycle
- Use `pnpm tools-dev` as the only local development lifecycle entry point.
- Do not add or restore root lifecycle aliases: `pnpm dev`, `pnpm dev:all`, `pnpm daemon`, `pnpm preview`, or `pnpm start`.
- Ports are governed by `tools-dev` flags: `--daemon-port` and `--web-port`.
- `tools-dev` exports `OD_PORT` for the web proxy target and `OD_WEB_PORT` for the web listener; do not use `NEXT_PORT`.
Root command boundary
- Keep root scripts reserved for true repo-level checks and tools control-plane entrypoints: `pnpm guard`, `pnpm typecheck`, `pnpm tools-dev`, `pnpm tools-pack`, and `pnpm tools-pr`.
- Do not add root aggregate `pnpm build` or `pnpm test` aliases. Build/test commands must stay package-scoped (`pnpm --filter <package> ...`) or tool-scoped (`pnpm tools-pack ...` / `pnpm tools-pr ...`).
- Do not add root e2e aliases; e2e package commands and ownership rules live in `e2e/AGENTS.md`.
Boundary constraints
- Tests under `apps/`, `packages/`, and `tools/` live in a package/app/tool-level `tests/` directory sibling to `src/`; keep `src/` source-only and do not add new `*.test.ts` or `*.test.tsx` files under `src/`. Playwright UI automation belongs to `e2e/ui/`, not app packages.
- App packages must not import another app's private `src/` or `tests/` implementation as a shared helper. In particular, `apps/web/**` must not import `apps/daemon/src/**`; web/daemon integration belongs behind HTTP APIs, `packages/contracts`, and app-local provider boundaries.
- Cross-app, cross-runtime, or repository-resource consistency checks belong in `e2e/tests/` when they need to observe more than one app/package boundary; promote reusable logic to a pure package instead of borrowing another app's private source.
- Keep shared API DTOs, SSE event unions, error shapes, task shapes, and example payloads in `packages/contracts`; update contracts before wiring divergent web/daemon request or response shapes.
- Keep `packages/contracts` pure TypeScript and free of Next.js, Express, Node filesystem/process APIs, browser APIs, SQLite, daemon internals, and sidecar control-plane dependencies.
- Keep project-owned entrypoints, modules, scripts, tests, reporters, and configs TypeScript-first; generated `dist/*.js` is runtime output, and source edits belong in `.ts` files.
- New `.js`, `.mjs`, or `.cjs` files need an explicit generated/vendor/compatibility reason and must pass `pnpm guard`.
- App business logic must not know about sidecar/control-plane concepts. Keep sidecar awareness in `apps/<app>/sidecar` or the desktop sidecar entry wrapper.
- Shared web/daemon app contracts belong in `packages/contracts`; that package must not depend on Next.js, Express, Node filesystem/process APIs, browser APIs, SQLite, daemon internals, or the sidecar control-plane protocol.
- Sidecar process stamps must have exactly five fields: `app`, `mode`, `namespace`, `ipc`, and `source`.
- Orchestration layers (`tools-dev`, `tools-pack`, packaged launchers) must call package primitives; do not hand-build `--od-stamp-*` args or process-scan regexes.
- Packaged runtime paths must be namespace-scoped and independent from daemon/web ports; ports are transient transport details only.
- Default runtime files live under `<project-root>/.tmp/<source>/<namespace>/...`; POSIX IPC sockets are fixed at `/tmp/open-design/ipc/<namespace>/<app>.sock`.
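To make the five-field stamp rule and the fixed IPC socket layout concrete, here is an illustrative sketch. The names `STAMP_FIELDS`, `hasExactlyStampFields`, and `posixIpcSocketPath` are hypothetical, and the sample values are made up; real code should call the primitives in `packages/sidecar-proto` and `packages/platform` rather than reimplementing them.

```typescript
// Illustration only: these helpers are assumptions, not the package primitives.
const STAMP_FIELDS = ["app", "mode", "namespace", "ipc", "source"] as const;
type StampField = (typeof STAMP_FIELDS)[number];
type ProcessStamp = Record<StampField, string>;

// True when the candidate has exactly the five stamp fields, no more and no fewer.
function hasExactlyStampFields(candidate: Record<string, unknown>): boolean {
  const keys = Object.keys(candidate).sort();
  const expected: string[] = [...STAMP_FIELDS].sort();
  return keys.length === expected.length && keys.every((key, i) => key === expected[i]);
}

// Mirrors the documented layout /tmp/open-design/ipc/<namespace>/<app>.sock.
function posixIpcSocketPath(namespace: string, app: string): string {
  return `/tmp/open-design/ipc/${namespace}/${app}.sock`;
}

// Sample values are placeholders chosen for the example.
const sampleStamp: ProcessStamp = {
  app: "web",
  mode: "dev",
  namespace: "default",
  ipc: posixIpcSocketPath("default", "web"),
  source: "tools-dev",
};

console.log(hasExactlyStampFields(sampleStamp)); // true
console.log(hasExactlyStampFields({ ...sampleStamp, port: "17456" })); // false: a sixth field
```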
Git commit policy
- Git commits must not include `Co-authored-by` trailers or any other co-author metadata.
Pull request expectations
- Opening a PR uses `.github/pull_request_template.md`; fill every section, not just the title.
- "Why" must answer both the author's use case (what made you write this PR) and the pain being addressed (user problem, technical debt, prod issue, or unblocker), not just a one-line restatement of the title.
- "What users will see" describes the change from a user's perspective (what they click, what new thing appears, what default behavior changed), not from a code perspective.
- The Surface area checklist must reflect actual surfaces touched; check every box that applies, including extension points (`skills/`, `design-systems/`, `design-templates/`, `craft/`), CLI flags, env vars, i18n keys, and new root `package.json` dependencies.
- If any UI surface is checked, attach screenshots showing the entry point (where users discover the change), not just the feature in isolation; before/after is best for behavior changes.
- For bug-fix PRs, link the red-spec test that reproduces the bug and confirm it went red on `main` and green on the branch, per the `Bug follow-up workflow` section below.
- `CONTRIBUTING.md` covers PR scope, title format, dependency policy, and the issue-first rule for non-trivial features; `docs/code-review-guidelines.md` is the reviewer-facing complement.
Code review guide
- Use `docs/code-review-guidelines.md` as the repository-wide review standard. That document is the operational guide; this `AGENTS.md` is the source of truth when the two disagree.
- Walk reviews top-down through `docs/code-review-guidelines.md`: Product relevance test → forbidden surfaces → ownership/scope → matching lane → checklist → comments → approval bar.
- Pick the matching review lane: default code/tests, contract and protocol changes, design-system additions, skill additions, or craft additions.
- Before reviewing changes under `apps/`, `packages/`, `tools/`, or `e2e/`, read that directory's `AGENTS.md` and apply its local boundaries.
- Blocking review feedback should focus on correctness, security/secrets, data integrity, repository boundary violations, contract/migration breakage, missing required validation, or high-risk maintainability issues.
- Only maintainers may close a PR instead of requesting changes, and only when the change is not salvageable on the existing branch (wrong target product, foreign test harness, DOM/API assumptions absent from this repo, or scripts that conflict with lifecycle rules).
PR-duty tooling
pnpm tools-pr is the maintainer-only control plane for PR-duty work on this repo. It is a thin gh wrapper that encodes repo-specific knowledge — review-lane derivation, forbidden-surface flags, per-lane checklists, validation-command suggestions, and a fixed dictionary of factual classify tags (bot-only-approval, needs-rebase, stale-approval, unresolved-changes-requested, awaiting-* timing, org-member, etc.). The tool is read-only on the PR surface: it never approves, merges, comments, or closes; those side effects stay in explicit gh invocations the maintainer runs.
Common subcommands:
- `pnpm tools-pr list`: triage the open queue by lane and review-state bucket.
- `pnpm tools-pr view <num>`: factual review brief for a single PR.
- `pnpm tools-pr classify --all`: script-level tag JSON for the whole open queue (entry point for cron / digest consumers); per-PR `classify <num>` for spot checks.
- `pnpm tools-pr assignment`: assigner-perspective ownership + idle-time / blocker view across the queue.
For the full tag dictionary, operational playbook (direct merge / duplicate-title / awaiting-author / org-member / agent-review flows), comment templates, language-detection rules, and tool-design constraints (precision boundaries, factual-output rule, retry + pagination strategy), see tools/pr/AGENTS.md.
Agent runtime conventions
- `RuntimeAgentDef.promptInputFormat` selects how the daemon writes the prompt to a child's stdin. The default `'text'` writes the composed prompt and ends stdin immediately. `'stream-json'` wraps the prompt as one JSONL `user` message and KEEPS stdin open so the daemon can stream further user messages back in mid-turn. Claude (`apps/daemon/src/runtimes/defs/claude.ts`) ships `'stream-json'` together with `--input-format stream-json` so the host can answer interactive tools like `AskUserQuestion` with a real `tool_result` block. Every other agent stays on `'text'`.
- `apps/daemon/src/server.ts` tracks `run.pendingHostAnswers` (a Set of `tool_use_id` strings) and `run.stdinOpen` on the run object. The `claude-stream-json` event handler adds AskUserQuestion ids to the set and closes stdin only when both the set is empty AND a `turn_end` (or `usage`) event arrives with a non-`tool_use` `stop_reason`. The `tool_use` stop reason means the model paused mid-tool (waiting on claude-code's internal runner or on a host answer); closing stdin there would truncate the follow-up response.
- `claude-stream.ts` emits the `turn_end` event AFTER iterating the assistant message's content blocks, not before. When `--include-partial-messages` is unsupported, tool_use events surface only from the assistant wrapper, so emitting `turn_end` first would let the daemon close stdin before the host had registered any pending answers.
- `POST /api/runs/:id/tool-result` is the daemon endpoint for feeding a `tool_result` block back into a still-running stream-json child. Body shape: `{ toolUseId: string, content: string, isError?: boolean }`. Web callers use `submitChatRunToolResult` from `apps/web/src/providers/daemon.ts`. The daemon writes a JSONL `user` message containing one `tool_result` content block, removes the id from `pendingHostAnswers`, and lets the next `turn_end` decide when to close stdin.
- AskUserQuestion specifically: Claude's system prompt section in `apps/daemon/src/prompts/system.ts` (Claude-only block at the bottom of `composeSystemPrompt`) tells the model to use the tool for 2 to 4 finite choices, and to stop generating tokens after the tool call instead of also writing a markdown duplicate. `AssistantMessage.suppressAskUserQuestionFallbackText` is the belt-and-suspenders guard that hides any trailing markdown text in the same turn.
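The stdin-close rule and the `tool_result` write can be sketched roughly as follows. This is a simplified model, not the code in `server.ts`: the event shapes, `onStreamEvent`, and `toolResultLine` are hypothetical, and the JSONL envelope is an assumption based on the usual `tool_result` content-block shape.

```typescript
// Sketch only: names other than pendingHostAnswers / stdinOpen are assumptions.
interface RunState {
  pendingHostAnswers: Set<string>; // tool_use ids still waiting on a host answer
  stdinOpen: boolean;
}

type StreamEvent =
  | { type: "tool_use"; id: string; name: string }
  | { type: "turn_end"; stopReason: string }
  | { type: "usage" };

// Records pending AskUserQuestion ids and reports whether stdin may close now.
function onStreamEvent(run: RunState, event: StreamEvent): boolean {
  if (!run.stdinOpen) return false;
  if (event.type === "tool_use") {
    if (event.name === "AskUserQuestion") run.pendingHostAnswers.add(event.id);
    return false;
  }
  // A 'tool_use' stop reason means the model is paused mid-tool, not finished.
  if (event.type === "turn_end" && event.stopReason === "tool_use") return false;
  return run.pendingHostAnswers.size === 0;
}

// Builds one JSONL line carrying a single tool_result block (envelope assumed).
function toolResultLine(toolUseId: string, content: string, isError = false): string {
  const message = {
    type: "user",
    message: {
      role: "user",
      content: [{ type: "tool_result", tool_use_id: toolUseId, content, is_error: isError }],
    },
  };
  return JSON.stringify(message) + "\n";
}

const demoRun: RunState = { pendingHostAnswers: new Set<string>(), stdinOpen: true };
onStreamEvent(demoRun, { type: "tool_use", id: "toolu_1", name: "AskUserQuestion" });
console.log(onStreamEvent(demoRun, { type: "turn_end", stopReason: "end_turn" })); // false: answer pending
demoRun.pendingHostAnswers.delete("toolu_1");
console.log(onStreamEvent(demoRun, { type: "turn_end", stopReason: "end_turn" })); // true
```

Because `turn_end` is emitted after the content blocks, the `tool_use` event in this model always reaches the handler before the close decision is evaluated.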
Chat UI conventions
- `apps/web/src/components/file-viewer-render-mode.ts` decides URL-load vs srcDoc for HTML previews. Bridges (deck, comment/inspect selection, palette, edit, tweaks) can ONLY inject through the srcDoc path. Add a new disqualifier to `UrlLoadDecision` whenever a feature needs a srcDoc-only bridge; pass it from `FileViewer.tsx` based on a source-content heuristic where appropriate (e.g. `hasTweaksTemplate`). The host keeps both iframes mounted simultaneously and swaps CSS visibility so toggling render mode does not cause an iframe reload flash; `iframeRef.current` stays aligned with the active iframe via `useEffect`. Receive filters use `isOurIframe(ev.source)` to accept messages from either iframe, but signals that should ONLY come from the active iframe (e.g. `od:tweaks-available`) re-check `ev.source === iframeRef.current?.contentWindow`.
- TodoWrite UI pins one canonical task list above the chat composer via `PinnedTodoSlot` in `ChatPane.tsx`. The slot reads the latest TodoWrite snapshot across the conversation through `latestTodoWriteInputFromMessages` (`apps/web/src/runtime/todos.ts`). `AssistantMessage.stripTodoToolGroups` removes any TodoWrite tool groups from per-message rendering so there is exactly one TodoCard on screen. The progress count includes both `completed` and `in_progress` items (1/4 reads "one underway" not "zero finished"). Dismissal via the Done button is keyed on the snapshot's JSON, so a fresh TodoWrite from the agent automatically re-shows the card.
- `AskUserQuestionCard` (in `ToolCard.tsx`) prefers the live `onAnswerToolUse(toolUseId, content)` route (POSTs to `/api/runs/:id/tool-result`) and falls back to the legacy `onSubmitForm(text)` path when the run has already terminated. Selected chips persist across reloads by parsing the stored `tool_result.content` back into the selections shape.
- Tool group rendering uses `dedupeSnapshotToolRetries` to collapse identical `AskUserQuestion` retries (one card per unique input, keeping the latest tool_use_id) and `TodoWrite` snapshots (only the most recent call, since each call is a state replace).
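The retry-collapsing rule in the last bullet might look roughly like this. It is a sketch under assumptions: the `ToolUse` shape and the `dedupeSketch` name are hypothetical, and comparing inputs via `JSON.stringify` is a simplification that treats differently ordered keys as different inputs.

```typescript
// Hypothetical shape of one tool call inside a tool group.
interface ToolUse {
  id: string;
  name: string;
  input: unknown;
}

// Snapshot tools share a dedupe key; every other tool returns null and is kept as-is.
function dedupeKey(call: ToolUse): string | null {
  if (call.name === "TodoWrite") return "TodoWrite"; // each call replaces the previous state
  if (call.name === "AskUserQuestion") return `AskUserQuestion:${JSON.stringify(call.input)}`;
  return null;
}

// Keeps only the latest call per key, preserving the original order of survivors.
function dedupeSketch(calls: ToolUse[]): ToolUse[] {
  const lastIndex = new Map<string, number>();
  calls.forEach((call, index) => {
    const key = dedupeKey(call);
    if (key !== null) lastIndex.set(key, index);
  });
  return calls.filter((call, index) => {
    const key = dedupeKey(call);
    return key === null || lastIndex.get(key) === index;
  });
}

const group: ToolUse[] = [
  { id: "a1", name: "AskUserQuestion", input: { question: "Include images?" } },
  { id: "a2", name: "AskUserQuestion", input: { question: "Include images?" } },
  { id: "t1", name: "TodoWrite", input: { todos: ["draft"] } },
  { id: "b1", name: "Bash", input: { command: "ls" } },
  { id: "t2", name: "TodoWrite", input: { todos: ["draft", "review"] } },
];

console.log(dedupeSketch(group).map((call) => call.id)); // [ 'a2', 'b1', 't2' ]
```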
i18n keys
- `apps/web/src/i18n/types.ts` is the typed `Dict`; every key must be defined in all 18 locale files under `apps/web/src/i18n/locales/*.ts` (`ar`, `de`, `en`, `es-ES`, `fa`, `fr`, `hu`, `id`, `ja`, `ko`, `pl`, `pt-BR`, `ru`, `th`, `tr`, `uk`, `zh-CN`, `zh-TW`). Add the key to `types.ts` first; missing translations produce a typecheck error.
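A reduced illustration of why adding the key to the typed `Dict` first surfaces missing translations at typecheck time. Only two keys and two locales are shown, the translated strings are placeholders rather than the shipped text, and the real `Dict` in `types.ts` may be declared differently.

```typescript
// Two keys from this change stand in for the full typed Dict (assumed shape).
interface Dict {
  "tool.askQuestion": string;
  "tool.todosDone": string;
}

// Placeholder strings, not the shipped translations.
const en: Dict = { "tool.askQuestion": "Question", "tool.todosDone": "Done" };
const de: Dict = { "tool.askQuestion": "Frage", "tool.todosDone": "Fertig" };

// Deleting a key from either object makes the compiler report a missing
// property, which is the failure mode the rule above relies on.
const locales: Record<string, Dict> = { en, de };

for (const [code, dict] of Object.entries(locales)) {
  console.log(code, Object.keys(dict).length);
}
```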
UI animation philosophy
- Default ease-out for UI transitions: `cubic-bezier(0.23, 1, 0.32, 1)`. Built-in `ease` is too weak; `ease-in` is forbidden for UI elements because it feels sluggish.
- Asymmetric durations: enter around 200ms, exit around 140ms. Exit reads as decisive because the user has already chosen to dismiss.
- Accordion expand and collapse uses `grid-template-rows: 0fr -> 1fr` (modern auto height pattern). Pair with opacity fade and the easing above. The shared `.accordion-collapsible` + `.accordion-collapsible-inner` class pair (defined in `apps/web/src/index.css`) is the canonical implementation; reuse it for new disclosure UI.
- Never animate from `transform: scale(0)`. Start from `scale(0.9)` or higher with `opacity: 0`.
- For elements that show conditionally, keep them mounted and toggle a CSS class (e.g. `.chat-jump-btn-active`). React unmounts skip the exit transition entirely.
Validation strategy
- After package, workspace, or command-entry changes, run `pnpm install` so workspace links and generated dist entries stay fresh.
- Before marking regular work ready, run at least `pnpm guard` and `pnpm typecheck`, plus the package-scoped tests/builds that match the files changed. Do not use or add root `pnpm test` / `pnpm build` aliases.
- For local web runtime loops, prefer `pnpm tools-dev run web --daemon-port <port> --web-port <port>`.
- On a GUI-capable machine, validate desktop by running `pnpm tools-dev`, then `pnpm tools-dev inspect desktop status`.
- Stamp/namespace changes must validate two concurrent namespaces and run desktop `inspect eval` plus `inspect screenshot` for each namespace.
- Path/log changes must run `pnpm tools-dev logs --namespace <name> --json` and confirm log paths are under `.tmp/tools-dev/<namespace>/...`.
Bug follow-up workflow
The following is a working playbook for routine bug follow-ups, distilled from recent practice. Treat it as a default action shape, not a contract — production reality always has edges these bullets can't anticipate, so use judgment when the situation doesn't fit cleanly.
- Lead with a red spec. Default to encoding the bug as a falsifiable test that goes red before any source change, so the fix is anchored in observable behavior rather than source-code intuition. If a red spec can't be written cheaply, that's usually a signal to clarify scope rather than push forward on a guess.
- Try the cheapest layer first. Reach for the lightest test layer that can still see the symptom (e2e Vitest at the daemon HTTP boundary → app-local Vitest → Playwright UI → platform-native harnesses), and drop down only when the cheaper layer can't.
- Hold the spec's scope. Defects discovered outside the bug's described boundary belong in a follow-up (their own red spec, their own PR), not in this fix. List them in the PR body's "Adjacent issues" section with the rationale and move on.
- Let the fix read as an invariant. Prefer a named helper whose docblock describes what must hold over a bolt-on `if` guard with apologetic history-comments. The call site should read as intent.
- Diff against the baseline. When neighboring suites have pre-existing failures, stash or check out upstream before claiming "no new failures."
- Link the issue from the PR body. Use `Fixes #N` / `Closes #N` / `Resolves #N` so the issue auto-closes on merge and the release-time reverse lookup (`gh issue view N --json closedByPullRequestsReferences` → `git tag --contains <merge sha>`) actually has a chain to follow. The repo's PR template prompts for this; deleting the prompt is fine when the PR genuinely closes nothing.
- Stage human verification for visible bugs. When the symptom needs an eye to confirm (UI, platform-native behavior, animations, race conditions a unit test can't see), green specs alone aren't acceptance. Stand up a buggy-vs-fix comparison the reviewer can drive themselves (typical shape: two namespaced runtimes, one on `main`, one on the fix branch), and seed any required data only through production HTTP APIs; source-level test backdoors invalidate the verification because they prove a fake flow rather than the real one.
For a worked example of one full loop (red e2e spec → fix → green), see e2e/tests/dialog/stop-reconciles-message.test.ts (issue #135).
Common commands
pnpm install
pnpm tools-dev
pnpm tools-dev start web
pnpm tools-dev run web --daemon-port 17456 --web-port 17573
pnpm tools-dev status --json
pnpm tools-dev logs --json
pnpm tools-dev inspect desktop status --json
pnpm tools-dev inspect desktop screenshot --path /tmp/open-design.png
pnpm tools-dev stop
pnpm tools-dev check
pnpm guard
pnpm typecheck
pnpm tools-pr list
pnpm tools-pr list --bucket=merge-ready,approved-blocked
pnpm tools-pr list --lane=skill,contract --json
pnpm tools-pr view 1180
pnpm tools-pr view 1180 --json
pnpm --filter @open-design/web typecheck
pnpm --filter @open-design/web test
pnpm --filter @open-design/web build
pnpm --filter @open-design/daemon test
pnpm --filter @open-design/daemon build
pnpm --filter @open-design/desktop build
pnpm --filter @open-design/tools-dev build
pnpm --filter @open-design/tools-pack build
pnpm --filter @open-design/tools-pr build
pnpm tools-pack mac build --to all
pnpm tools-pack mac install
pnpm tools-pack mac cleanup
pnpm tools-pack win build --to nsis
pnpm tools-pack win install
pnpm tools-pack win cleanup
pnpm tools-pack linux build --to appimage
pnpm tools-pack linux install
pnpm tools-pack linux build --containerized
FAQ
Why is there no root pnpm dev / pnpm start?
To avoid starting daemon, web, and desktop through inconsistent env, port, namespace, or log paths. All local lifecycle flows must go through pnpm tools-dev.
Why should apps/nextjs not be restored?
The current web runtime is apps/web. The historical apps/nextjs layout has been removed from the active repo shape; restoring it would reintroduce duplicate app boundaries and stale scripts.
How does desktop discover the web URL?
Desktop queries runtime status through sidecar IPC. The web URL comes from tools-dev launch status, not from desktop guessing ports or reading web internals.
How are sidecar-proto, sidecar, and platform split?
@open-design/sidecar-proto owns Open Design app/mode/source constants, namespace validation, stamp fields/flags, IPC message schema, status shapes, and error semantics. @open-design/sidecar provides only generic bootstrap, IPC transport, path/runtime resolution, launch env, and JSON runtime files. @open-design/platform provides only generic OS process stamp serialization, command parsing, and process matching/search primitives, consuming the proto descriptor.
Where is data written?
The daemon writes .od/ by default: SQLite at .od/app.sqlite, agent CWDs under .od/projects/<id>/, saved renders under .od/artifacts/, and credentials at .od/media-config.json. Two env vars override the storage root, in order:
- `OD_DATA_DIR=<dir>`: relocates all daemon runtime data to `<dir>` (used by Playwright for test isolation, and by the packaged daemon and the Home Manager / NixOS modules to point the daemon at a writable directory when the install root is read-only). The path is resolved with `~/` expansion and relative paths anchored to `<projectRoot>`.
- `OD_MEDIA_CONFIG_DIR=<dir>`: narrower override that relocates only `media-config.json`. Same resolution semantics. Most installs do not need this; it exists for setups that want to keep API credentials in a different location from the rest of the runtime data.
Default precedence is OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od.
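The precedence and resolution rules above can be sketched as below. The function names (`resolveDir`, `dataDir`, `mediaConfigPath`) are hypothetical and the daemon's actual resolver may differ in detail; only the env var names, the `.od` default, and the `~/` / project-root anchoring come from the text.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Expands a leading "~/" and anchors relative paths to the project root.
// Absolute paths pass through unchanged because path.resolve keeps them.
function resolveDir(raw: string, projectRoot: string): string {
  if (raw.startsWith("~/")) return path.join(os.homedir(), raw.slice(2));
  return path.resolve(projectRoot, raw);
}

// OD_DATA_DIR, when set, replaces the default <projectRoot>/.od.
function dataDir(env: Env, projectRoot: string): string {
  return env.OD_DATA_DIR ? resolveDir(env.OD_DATA_DIR, projectRoot) : path.join(projectRoot, ".od");
}

// OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od for media-config.json.
function mediaConfigPath(env: Env, projectRoot: string): string {
  const dir = env.OD_MEDIA_CONFIG_DIR
    ? resolveDir(env.OD_MEDIA_CONFIG_DIR, projectRoot)
    : dataDir(env, projectRoot);
  return path.join(dir, "media-config.json");
}

const root = path.resolve("/repo"); // placeholder project root for the example
console.log(mediaConfigPath({}, root));
console.log(mediaConfigPath({ OD_DATA_DIR: "data" }, root));
console.log(mediaConfigPath({ OD_DATA_DIR: "data", OD_MEDIA_CONFIG_DIR: "~/secrets" }, root));
```

Passing the environment in as a plain object, rather than reading `process.env` inside the helpers, keeps the precedence logic easy to exercise in isolation.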
When is pnpm install required?
Run pnpm install after changing package manifests, workspace layout, command entrypoints, bin/link-related content, or after adding/removing workspace packages.