* feat(runtimes): register AMR (vela) as an ACP stdio agent
AMR is the vela CLI's ACP runtime mode. `vela agent run --runtime opencode`
speaks ACP JSON-RPC over stdio (see vela's
`specs/current/runtime/manual-agent-run-openrouter.md`); per
`docs/new-agent-runtime-acp.md` we expose it through the same `streamFormat:
'acp-json-rpc'` transport that already powers Hermes, Devin, Kimi, etc.
The new `defs/amr.ts` is the entire wiring — `buildArgs` returns
`['agent', 'run', '--runtime', 'opencode']`, `fetchModels` reuses
`detectAcpModels`, and the fallback list seeds the OpenRouter ids vela's
e2e baseline uses. `executables.ts`/`app-config.ts`/`metadata.ts` get the
matching `VELA_BIN`/`VELA_LINK_URL`/`VELA_RUNTIME_KEY`/`VELA_OPENCODE_BIN`
allowlist + install/docs URLs, so users can configure the per-agent env in
Settings without leaking into other adapters.
Coverage: `tests/fixtures/fake-vela.mjs` is a minimal ACP stub that returns
the documented `initialize` / `session/new` / `session/set_model` /
`session/prompt` shapes; `tests/amr-acp-integration.test.ts` spawns it via
`child_process.spawn` and drives a full turn through `attachAcpSession` and
`detectAcpModels`, so the ACP transport contract for AMR is end-to-end
verified locally even before a real `vela` binary is installed.
Validated:
- pnpm guard
- pnpm typecheck (all workspace projects)
- pnpm --filter @open-design/daemon test (2881/2881)
Deferred: real OpenRouter-backed turn through a built `vela` binary —
the runtime def needs no changes for that path, only `VELA_RUNTIME_KEY`
and `VELA_LINK_URL` in env (or Settings).
* fix(runtimes/amr): pin a concrete default model and bare openai ids
End-to-end validation against a freshly-built `vela` (nexu-io/vela@main)
+ OpenRouter surfaced two contract details the first AMR runtime def
got wrong:
1. vela rejects `session/prompt` with `session/set_model must be called
before session/prompt`. attachAcpSession in apps/daemon/src/acp.ts
skips set_model whenever the picked model is the synthetic 'default'
id, so AMR's fallback list must NOT include DEFAULT_MODEL_OPTION. The
def now ships a concrete `gpt-5.4-mini` as both `fetchModels`'
default option and `fallbackModels[0]`, which makes attachAcpSession
always send a real `session/set_model` for AMR turns.
2. `vela --runtime opencode` auto-prepends `openai/` to whatever modelId
it forwards to opencode's openai provider. With OpenRouter-style ids
like `openai/gpt-5.4-mini`, opencode receives the double-prefixed
`openai/openai/gpt-5.4-mini` and replies `ProviderModelNotFoundError`.
The new fallback list ships the bare ids opencode's openai registry
actually knows about (gpt-5.4, gpt-5.4-mini, gpt-5.4-fast, etc.).
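As a rough illustration of the double-prefix problem in item 2, the sketch below strips leading `openai/` segments before an id is handed to vela. `toBareOpenAiId` and `velaForwardedId` are hypothetical names introduced here; the commit itself ships a hand-written list of bare ids rather than a normalizer.

```typescript
// Hypothetical helpers for illustration only; not the code in defs/amr.ts.
const OPENAI_PREFIX = "openai/";

/** Strip leading `openai/` segments so vela's own prefixing cannot double up. */
function toBareOpenAiId(modelId: string): string {
  let id = modelId.trim();
  while (id.startsWith(OPENAI_PREFIX)) {
    id = id.slice(OPENAI_PREFIX.length);
  }
  return id;
}

/** Mimics the behavior described above: vela prepends `openai/` before calling opencode. */
function velaForwardedId(modelId: string): string {
  return OPENAI_PREFIX + modelId;
}
```

Passing an OpenRouter-style id straight to `velaForwardedId` reproduces the `openai/openai/...` shape that opencode rejects, while passing the bare id yields a single prefix.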
Stub + tests:
- tests/fixtures/fake-vela.mjs now enforces the set_model gate the same
way real vela does, so a regression that silently goes back to
model: 'default' would surface as a fatal error in tests instead of a
hidden production failure.
- tests/amr-acp-integration.test.ts pins both contracts: no 'default' /
no 'openai/' prefix in fallbackModels, and a negative case that
asserts session/prompt fails when no model is set.
Adds `apps/daemon/scripts/verify-amr-real-vela.mjs` — a small dev-time
runner that drives `attachAcpSession` against a real `vela` binary and
prints the daemon's chat events, so future protocol drift can be checked
against an actual OpenRouter call.
Verified locally: `vela agent run --runtime opencode` + OpenRouter
returns the prompted string ("AMR-E2E-PASS") through the full daemon
pipeline; daemon test suite stays 2883/2883.
* fix(runtimes/amr): substitute concrete model when chat run sends 'default'
A plugin-driven AMR run from the UI surfaced a real-world hole in the
prior commit:
json-rpc id 3: session/set_model must be called before session/prompt
The Default-design-router plugin (and any caller that doesn't pin a
real model) sends `model: 'default'` straight through, which the AMR
runtime def cannot accept — vela rejects `session/prompt` without
`session/set_model` and attachAcpSession skips set_model whenever
model === 'default'. Just leaving DEFAULT_MODEL_OPTION out of the
adapter's `fallbackModels` is not enough: the chat-run handler in
server.ts still forwarded 'default' verbatim.
This adds `resolveModelForAgent(def, resolved, env?)` as the
single source of truth for the substitution:
1. If the caller picked a real id, pass it through.
2. Else, if `def.defaultModelEnvVar` is set and the daemon process
env has a non-empty value for it, return that (operator escape
hatch — see below).
3. Else, if the def's `fallbackModels` does NOT contain a 'default'
id, return `fallbackModels[0].id`.
4. Else, return the original value (the historic shape — defs that
list 'default' themselves are untouched).
AMR sets `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`, so when
opencode's openai-provider registry deprecates `gpt-5.4-mini`
upstream, an operator can swap the fallback id without a code change
by exporting `VELA_DEFAULT_MODEL=gpt-5.5` before launching tools-dev
/ od. Note that the env var must live in the daemon's `process.env`:
Settings-UI per-agent env values only reach the spawned child, not
the daemon's resolver. The new field's docblock spells this out.
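The four resolver branches can be sketched as below. This is a minimal approximation, not the daemon's actual code: the `AgentDefSketch` and `ModelOption` shapes are assumptions, null / empty input is treated like 'default' (also an assumption), and `env` is a plain parameter here where the real resolver reads the daemon's `process.env`.

```typescript
// Assumed shapes for illustration; the real def type lives in the daemon.
interface ModelOption { id: string }
interface AgentDefSketch {
  fallbackModels: ModelOption[];
  defaultModelEnvVar?: string;
}

const DEFAULT_ID = "default";

function resolveModelForAgent(
  def: AgentDefSketch,
  resolved: string | null | undefined,
  env: Record<string, string | undefined> = {},
): string | null | undefined {
  // 1. A real id picked by the caller passes through untouched.
  if (resolved && resolved !== DEFAULT_ID) return resolved;

  // 2. Operator escape hatch: a non-empty value in the configured env var wins.
  const envValue = def.defaultModelEnvVar
    ? env[def.defaultModelEnvVar]?.trim()
    : undefined;
  if (envValue) return envValue;

  // 3. Defs that do not list 'default' get their first concrete fallback id.
  const listsDefault = def.fallbackModels.some((m) => m.id === DEFAULT_ID);
  const first = def.fallbackModels[0];
  if (!listsDefault && first) return first.id;

  // 4. Historic shape: defs that list 'default' themselves are left alone.
  return resolved;
}
```

With an AMR-like def (`fallbackModels: [{ id: 'gpt-5.4-mini' }]`), a 'default' request resolves to `gpt-5.4-mini`, or to whatever `VELA_DEFAULT_MODEL` holds when it is set.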
Coverage:
- `tests/runtimes/resolve-model.test.ts` — 8 unit tests covering all
four resolver branches plus the env-override happy path / fallback /
ignore-when-user-picked-a-real-id case.
- `pnpm --filter @open-design/daemon typecheck` clean.
* chore(runtimes/amr): move AMR to the top of the base agent list
So `AMR (vela)` shows up first in the agent picker / status views,
ahead of claude / codex. Pure ordering change; no behavior delta.
* feat(amr): Sign-in / Sign-out button on the AMR Settings card
The first half of the AMR work assumed the operator would set
VELA_RUNTIME_KEY / VELA_LINK_URL on the daemon process and never
surfaced login state to users. This adds the missing UX so a fresh
install can drive the full path from Settings:
- GET /api/integrations/vela/status reads ~/.vela/config.json
for the active profile and returns { loggedIn, profile, user }
(without leaking the runtime/control keys themselves).
- POST /api/integrations/vela/login spawns `vela login` once
(409 if one is already in flight). The vela CLI opens the user's
browser to the device-authorization page itself — Open Design
only needs to kick the subprocess off.
- POST /api/integrations/vela/logout removes ~/.vela/config.json
so the next status read returns logged-out.
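A minimal sketch of the status projection follows. The on-disk layout of `~/.vela/config.json` is not described in this message, so `VelaConfigSketch` (an `activeProfile` name plus a `profiles` map) is purely an assumed shape; the point is only that the projection copies user fields and never the keys.

```typescript
// Assumed config layout for illustration; the real file format may differ.
interface VelaProfileSketch {
  runtimeKey?: string;
  controlKey?: string;
  user?: { email?: string; plan?: string };
}
interface VelaConfigSketch {
  activeProfile?: string;
  profiles?: Record<string, VelaProfileSketch>;
}
interface VelaStatusSketch {
  loggedIn: boolean;
  profile: string | null;
  user: { email: string | null; plan: string | null } | null;
}

function projectVelaStatus(config: VelaConfigSketch | null): VelaStatusSketch {
  const profile = config?.activeProfile ?? null;
  const entry = profile ? config?.profiles?.[profile] : undefined;

  // A profile counts as logged in only when it carries a runtime key.
  const key = entry?.runtimeKey;
  const loggedIn = typeof key === "string" && key.length > 0;

  // Copy user fields one by one so key material can never ride along.
  const user = entry?.user;
  return {
    loggedIn,
    profile,
    user: loggedIn && user
      ? { email: user.email ?? null, plan: user.plan ?? null }
      : null,
  };
}
```

Building the payload field by field, rather than spreading the profile and deleting secrets, keeps any key added to the config later out of the response by default.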
`AmrAgentCard` is a dedicated agent-card component for AMR because
the existing `<button>` row can't host an interactive sub-control
(nested interactive elements). It polls /status after a login click
until the daemon reports loggedIn=true (or 5 minutes elapse), and
exposes a Sign-out action on hover. Other adapters (claude, codex,
hermes, …) keep their existing `<button>` card.
i18n: 8 new keys (settings.amrLogin / Logout / LoggingIn / etc.)
added to en + zh-CN. Other locales spread `en` and inherit the
English copy until translations land.
Coverage:
- `tests/integrations/vela.test.ts` pins the config.json reader
against a tmp HOME — including the negative case where a profile
has user info but no runtimeKey (still logged-out), and the
secret-leak guard ("rt-secret-*" must not appear in the projection
payload).
- `tests/components/AmrAgentCard.test.tsx` covers all four UI
states (logged-out, logging-in, logged-in, logging-out) plus the
click-propagation invariant the divergent card was built to keep.
`pnpm --filter @open-design/daemon test` 2901 / 2901 passing.
`pnpm --filter @open-design/web test` 1719 / 1719 passing.
`pnpm typecheck` + `pnpm guard` clean.
Dev script side-effects: `apps/daemon/scripts/verify-amr-real-vela.mjs`
no longer requires both VELA_RUNTIME_KEY and VELA_LINK_URL — if
VELA_PROFILE is set, the vela CLI is allowed to resolve credentials
from `~/.vela/config.json`. Added the two AMR `.mjs` fixtures to
`scripts/guard.ts` allowlist with the executable-fixture / dev-runner
rationale.
* fix(connection-test): substitute model for AMR before attachAcpSession
The chat-run path in server.ts already routes the requested model through
`resolveModelForAgent` so AMR / vela (whose CLI demands an explicit
`session/set_model` before `session/prompt`) gets the def's first
concrete fallback id when the chat run ships `model: 'default'`.
`connectionTest.ts` was wiring `attachAcpSession({ ..., model: model ?? null })`
directly, which made the Test Connection button on the AMR Settings
card deadlock with the same `session/set_model must be called before
session/prompt` error the chat-run path already handles — surfaced as a
permanent "Testing connection…" spinner in the UI.
Reuse the same helper here so Test Connection mirrors chat-run behavior.
* test(amr): three-layer end-to-end coverage for the AMR login + turn flow
The PR up to this point shipped runtime + UI code with unit-level Vitest
coverage. This commit adds the cross-layer regression net the live demo
relied on:
1. apps/daemon/tests/integrations/vela.routes.test.ts (HTTP, Vitest)
Spins up the real daemon Express app via `startServer({port:0,...})`,
persists `agentCliEnv.amr.VELA_BIN = <fake>` into app-config.json,
and exercises every /api/integrations/vela/* endpoint against the
extended fake-vela stub:
- status reads ~/.vela/config.json under various states
- login spawns the fake, waits for config.json to appear, returns
pid + startedAt + profile
- 409 already-running guard with the stub's delay knob
- logout removes the file (idempotent)
- secrets (runtimeKey / controlKey) never leak in the projection
- login → status round-trip flips loggedIn=false → true
2. e2e/tests/amr/turn.test.ts (tools-dev orchestrated, Vitest)
Boots a namespaced daemon + web pair through `createSmokeSuite`,
inlines a self-contained fake `vela` binary that handles BOTH
`vela login` (writes ~/.vela/config.json) and
`vela agent run --runtime opencode` (ACP stdio with the
`session/set_model must precede session/prompt` gate the real binary
enforces), then drives a complete /api/runs lifecycle for
`agentId: 'amr', model: 'default'` and asserts the assistant message
captures the fake's streamed text. This is the test that would have
surfaced today's plugin-default-model regression (the `set_model
before prompt` error) at PR time instead of demo time.
3. e2e/ui/amr-login-pill.test.ts (Playwright)
Mocks /api/agents + /api/integrations/vela/{status,login,logout}
to drive the Settings AMR card through the full Sign in → Signed in
→ Sign out cycle. Pins the AmrLoginPill polling contract and the
aria-label semantics (the pill's accessible name is "Sign out" once
logged in, regardless of which label the hover-state text shows).
fake-vela.mjs extensions:
- Handles `vela login` argv by writing
~/.vela/config.json for the active VELA_PROFILE and exiting 0 —
mirrors real vela's on-disk side-effect without the device-auth
loop.
- FAKE_VELA_LOGIN_DELAY_MS knob so route tests can observe the
in-flight state of the spawn lifecycle.
- FAKE_VELA_LOGIN_USER_EMAIL / _USER_PLAN to assert the surfaced
user fields end-to-end.
Validated:
- `pnpm guard` + `pnpm typecheck` (all workspace projects)
- `pnpm --filter @open-design/daemon test`: 2998 / 2998 passing,
including the new 8-test integration suite.
- `cd e2e && pnpm test tests/amr`: 1 / 1 passing.
- `cd e2e && pnpm exec playwright test ui/amr-login-pill.test.ts`:
1 / 1 passing (6.7s).
* feat(amr): package native cli and refine login ui
* feat(amr): wire vela cli beta packaging
* docs(amr): document vela ci packaging review
* docs(amr): refine vela ci integration review
* fix(ci): refresh nix pnpm dependency hashes
* fix(pack): clean up Vela CLI packaging
* fix(pack): bundle Vela CLI support files
* fix(amr): recover login attempts from stale auth state
* test: expand AMR and automations coverage
* fix(amr): address review follow-ups
* test(web): align tasks fixtures with contracts
* fix(daemon): type wildcard route params
* fix(ci): refresh PR merge validation
* fix(amr): clear env credentials on logout
* feat(settings): inline local CLI model configuration
* fix(amr): recognize daemon env credentials
* [codex] Fix Vela companion packaging (#2979)
* Fix Vela companion packaging
* Update Nix pnpm dependency hashes
* [codex] Surface AMR account failures (#2980)
* fix: surface AMR account failures
* fix: cover AMR recovery error guidance
* chore: bump beta base version to 0.8.1 (#2990)
* Fix AMR profile and packaged runtime review issues
* Detect packaged AMR OpenCode companion tree
* feat(web): polish AMR frontend flows
* Polish AMR onboarding card
* fix: read AMR login state from dot-amr config (#3048)
* test: tighten AMR credential and packaging coverage
* test: restore AMR executable test env helper
* [codex] Fix packaged mac Dock identity and AMR label (#3076)
* Fix packaged mac sidecar Dock identity
* Rename AMR assistant label
* Fix AMR live models and dot-amr login state (#3073)
* fix: read AMR login state from dot-amr config
* fix: load live AMR models before runs
* fix: point AMR onboarding link to production wallet
* fix: address AMR model review feedback
* fix: persist live AMR model fallback
* [codex] Fix AMR link catalog model ids (#3088)
* Fix packaged mac sidecar Dock identity
* Rename AMR assistant label
* Fix AMR link catalog model ids
* Fix AMR model normalization typecheck
* Use live AMR model for default runs
* fix: polish AMR runtime settings UI
* Accelerate AMR startup defaults (#3092)
* Surface AMR insufficient balance wallet URL (#3099)
* fix(web): polish onboarding controls (#3112)
* fix(web): show CLI scan loading state
* Avoid duplicate AMR wallet recharge links (#3117)
* Avoid duplicate AMR wallet recharge links
* Use Vela CLI 0.0.3 test package
* chore(nix): refresh pnpm deps hash
* Fix AMR wallet guidance display
---------
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
* chore(pack): pin Vela CLI 0.0.3-test.1 (#3127)
* chore(nix): refresh pnpm deps hash
* chore(pack): pin Vela CLI 0.0.3
* chore(nix): refresh pnpm deps hash
* fix(web): suppress AMR exit 130 fallback (#3136)
* feat(web): nudge users to hosted AMR on model/auth/quota failures (#3083)
* feat(web): nudge users to hosted AMR on model/auth/quota failures
When a non-AMR agent run fails with an auth / quota / upstream model
error, surface an inline nudge under the error pill linking to Open
Design's hosted AMR gateway (https://open-design.ai/amr). The nudge
fires `surface_view` (element=run_failed_toast) on impression and
`ui_click` (element=go_amr) on the link.
Also teach the daemon to classify CLI-agent auth/quota/upstream failures
(Claude Code, codex, ...) into specific API error codes
(AGENT_AUTH_REQUIRED / RATE_LIMITED / UPSTREAM_UNAVAILABLE) instead of
the generic AGENT_EXECUTION_FAILED, so both the error message and the
nudge key off accurate codes. AMR's own runs are excluded from the
nudge — they keep the dedicated sign-in / recharge affordances.
* feat(web): rework failed-run AMR guidance into per-case error UI
Replace the single inline nudge with a per-case failed-run experience
driven by the run's error code + agent:
- The error card is now neutral gray (was red) and always carries a
retry button; it is driven by the persisted per-message error event so
it survives a reload.
- Non-AMR agent hitting a model/auth/quota wall: a theme-color promotion
card under the error card offers "switch to AMR & retry" — switches the
run to AMR, opens Settings on the AMR card, and auto-retries once the
account signs in (ProjectView polls vela login status, independent of
the Settings pill lifecycle, with success / 5-min-timeout / unmount
exits).
- AMR agent unauthorized: clearer copy + an "authorize & retry" button.
- AMR agent out of balance: clearer copy + a "top up" button to the AMR
wallet, with manual retry.
- Settings AMR card: when opened from the nudge, it scrolls into view and
pulses, and an authorize-button coachmark (a fake hand cursor that
rises in and dismisses on hover) points at the sign-in control when not
yet authorized.
analytics: surface_view (run_failed_toast) on the promotion card and
ui_click (go_amr) on its action are retained. i18n adds chat.amrCard.*
and chat.amrError.* (en / zh-CN / zh-TW translated; other locales fall
back to en) and drops the old chat.amrErrorGuidance keys.
* fix(daemon): require status context for numeric service-failure codes
Per review on #3083: the model-service classifier matched bare HTTP
status numbers (`500`, `502`, `429`, `401`), so ordinary CLI output like
`line 500`, `read 502 bytes`, or `exit code 401` could be misclassified
as a provider outage / auth wall and wrongly surface the AMR nudge. Now
a status number only counts when it carries explicit context (`HTTP 500`,
`status 503`, `code: 401`, `502 Bad Gateway`); textual provider phrases
(overloaded, bad gateway, service unavailable, rate limit, …) are
unchanged. Adds fixtures proving unrelated numeric output stays null.
* fix(web): keep error pill for failed runs ChatPane's card doesn't cover
Per review on #3083: the per-message gray error pill was suppressed for
every persisted error status event, but ChatPane only renders the
replacement top-level error card for `retryableAssistantMessage` (the
last failed assistant). So a failed turn that is no longer last (after a
follow-up) or an older failed run in history showed neither the pill nor
the card — its error detail vanished, undercutting reload/history
survival. ChatPane now passes `errorCardOwnerId` (the assistant id whose
error the card represents); AssistantMessage suppresses only that one
pill and keeps rendering StatusPill for all other error events.
* fix(daemon): don't treat a process exit code as an HTTP status
Follow-up to review on #3083: the status-context helper accepted a bare
`code` prefix, so `exit code 401` / `process exited with code 429` still
matched and got classified as AGENT_AUTH_REQUIRED / RATE_LIMITED (the
very `exit code 401` case the comment calls out as noise). `code` now
only counts when qualified (`status code` / `error code` / `response
code`) or punctuation-bound (`code: 401`); bare `exit code N` no longer
matches. Adds fixtures for exit-code lines returning null.
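The combined rule from this fix and the earlier status-context fix can be approximated with a few context-anchored patterns. This is a sketch under assumptions, not the daemon's classifier: the pattern list, the helper names, and the 5xx to UPSTREAM_UNAVAILABLE mapping are illustrative guesses, and the textual provider phrases handled elsewhere are left out.

```typescript
type FailureCode = "AGENT_AUTH_REQUIRED" | "RATE_LIMITED" | "UPSTREAM_UNAVAILABLE";

// A 3-digit number only counts when one of these contexts surrounds it.
const STATUS_PATTERNS: RegExp[] = [
  /\bhttp(?:\/\d(?:\.\d)?)?\s+(\d{3})\b/i,                         // HTTP 500
  /\b(?:status|error|response)\s+code\s*[:=]?\s*(\d{3})\b/i,       // status code 401
  /\bstatus\s*[:=]?\s*(\d{3})\b/i,                                 // status 503
  /\bcode\s*[:=]\s*(\d{3})\b/i,                                    // code: 401
  /\b(\d{3})\s+(?:bad gateway|service unavailable|too many requests|unauthorized)\b/i,
];

function extractHttpStatus(line: string): number | null {
  for (const pattern of STATUS_PATTERNS) {
    const match = pattern.exec(line);
    if (match && match[1]) return Number(match[1]);
  }
  return null;
}

function classifyStatusLine(line: string): FailureCode | null {
  const status = extractHttpStatus(line);
  if (status === null) return null;
  if (status === 401) return "AGENT_AUTH_REQUIRED";
  if (status === 429) return "RATE_LIMITED";
  if (status >= 500 && status <= 599) return "UPSTREAM_UNAVAILABLE"; // assumed mapping
  return null;
}
```

Lines such as `exit code 401` or `read 502 bytes` fall through every pattern, because `code` needs either a qualifier or punctuation and a bare number needs a reason phrase after it.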
* chore(web): translate AMR card / error keys for 16 remaining locales
PR #3083 added 10 new `chat.amrCard.*` / `chat.amrError.*` keys but only
provided en/zh-CN/zh-TW translations; the other 16 locales fell back to
English. Translate the card title/body, three chips, primary CTA, and
the AMR self-error (auth / balance) messages and buttons for ar, de,
es-ES, fa, fr, hu, id, it, ja, ko, pl, pt-BR, ru, th, tr, uk.
* fix(amr): address review feedback on #2355
Targeted fixes for the unresolved review threads on #2355. Each fix
includes / updates a focused test.
- runtimes/executables.ts: `packagedVelaOpenCodeCompanionTree` now
verifies the inner `opencode` executable exists + is runnable, not
just the directory. This closes the false-positive availability path
that let `detectAgents()` surface AMR as available even when the
packaged companion was empty / partially copied (mrcfps, 4 threads).
- runtimes/executables.ts: `resolveAmrOpenCodeExecutable` now prefers
the bundled `<OD_RESOURCE_ROOT>/bin/libexec/opencode/opencode` over a
stale `opencode` on the user's PATH, so packaged AMR builds can't be
hijacked by a global installation.
- web/EntryShell.tsx: when the Local CLI scan returns an available
agent and the previously-selected agent is AMR, switch the selection
to the first available local agent so the runtime and persisted
agent agree before Continue.
- server.ts (model-probe branch): for AMR, check `readVelaLoginStatus`
BEFORE rejecting on an empty live-model catalog — a signed-out user
was getting `AMR_MODEL_UNAVAILABLE` ("choose a model") instead of
the correct `AMR_AUTH_REQUIRED` (sign-in affordance).
- server.ts (default model fallback): if the user asked for the AMR
agent default and the cached id is no longer in the FRESH catalog,
fall back to `liveModels[0]` from the probe instead of rejecting the
run as `AMR_MODEL_UNAVAILABLE`.
- integrations/vela.ts: route `vela login` through
`createCommandInvocation` so an npm/Node-style `vela.cmd` / `.bat`
shim on Windows gets the correct `cmd.exe /d /s /c …` wrapping with
verbatim args (matches `execAgentFile` / chat-run spawning).
- tools/pack/src/linux.ts: in containerized Linux builds, bind-mount
the host directory of `OPEN_DESIGN_VELA_CLI_BIN` and rewrite the env
to the container-side path. The host path was being passed in as-is
even though the default container only mounts /project, /tools-pack
and cache/home — `copyOptionalVelaCliBinary` saw a missing path.
Deferred (out of scope for this PR):
- `od amr status/login/logout/cancel` CLI subcommands (AGENTS.md
UI/CLI dual-track rule, server.ts:5763) — sizable surface; tracked
for a separate focused PR.
- Strict `--require-vela-cli` for Windows + mac-x64 beta builds:
premature for now, because `@powerformer/vela-cli` only publishes the
`darwin-arm64` platform binary today; adding the flag elsewhere
would fail the builds. Revisit once win/x64/linux binaries ship.
* fix(amr): hoist sendAmrAccountFailure above the AMR catalog preflight (TDZ)
The new signed-out AMR branch in the catalog preflight at server.ts:10875
calls `sendAmrAccountFailure(...)` to emit AMR_AUTH_REQUIRED, but the
const declaration sat ~100 lines below at the outer function scope. Because
`const` bindings sit in the temporal dead zone (TDZ) until their declaration
runs, that branch would have thrown `ReferenceError: Cannot access
'sendAmrAccountFailure' before initialization` for the exact users it
tries to help, defeating the original intent.
Hoist the helper to just above the AMR preflight block so it's available
to every AMR code path in this function. Behavior elsewhere is unchanged.
Also rerun the daemon test suite: `launch.test.ts > resolveAgentLaunch
uses packaged built-in Vela for AMR` was creating the
`<resourceRoot>/bin/libexec/opencode/` companion *directory* only, but
this PR's earlier tightening of `packagedVelaOpenCodeCompanionTree`
also requires the inner `opencode` executable. Add it to that fixture
to match the new contract; the test was a sibling of the executables /
env-and-detection fixtures already updated in 13fc4f4.
Addresses #2355 review (mrcfps, 2026-05-28).
* feat(web): add hover cancel for AMR login (#3158)
* feat(web): add hover cancel for AMR login
* fix(web): don't bounce AmrLoginPill back to 'Signing in…' after local cancel
Both codex-connector (P2) and looper (CHANGES_REQUESTED) on this PR
flagged the same race in the new local-cancel path: `handleCancelLogin`
dispatches `notifyAmrLoginStatusChanged('login-canceled')` immediately
after `/login/cancel` returns, but the `AMR_LOGIN_STATUS_EVENT` listener
unconditionally re-enters `refresh()` and then restarts polling
whenever `/api/integrations/vela/status` still reports
`loginInFlight: true`.
That is a real race because the daemon's `cancelVelaLogin()` only sends
SIGTERM (escalating to SIGKILL after `LOGIN_CANCEL_KILL_GRACE_MS` =
2000 ms) and keeps the child in `activeLoginProcs` until it actually
exits — so the first `/status` read after a successful cancel can
legally still come back as in-flight. Under that window the pill flips
back to 'Signing in…' and can later surface the timeout/error path even
though the user already canceled, defeating the behavior promised in
the PR description.
Fix the listener instead of every dispatch site: in the
`login-canceled` branch, after the local reset (stopPolling +
setPending(null) + clear refs), optimistically mark every subscribed
pill instance as not-in-flight (`setStatus((c) => c ? { ...c,
loginInFlight: false } : c)`) and `return` — skip the
refresh-and-reconcile branch below entirely. The next explicit refresh
(component mount, user interaction, or a `status-changed` event) will
pick up the daemon's confirmed state once the child has actually
exited.
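The listener change can be pictured as a pure function over the pill's status. This is only a sketch: `PillStatus`, `ListenerOutcome`, and `handleLoginStatusEvent` are hypothetical names, and the real listener works through React state setters and polling refs rather than returning a value.

```typescript
// Hypothetical shapes; the real component keeps this in React state.
interface PillStatus { loggedIn: boolean; loginInFlight: boolean }
type LoginStatusReason = "login-canceled" | "status-changed";
interface ListenerOutcome {
  status: PillStatus | null;
  shouldRefresh: boolean; // whether to re-enter refresh() and maybe restart polling
}

function handleLoginStatusEvent(
  current: PillStatus | null,
  reason: LoginStatusReason,
): ListenerOutcome {
  if (reason === "login-canceled") {
    // Optimistically clear the in-flight flag and skip the refresh entirely,
    // so a /status read that still reports in-flight cannot restart polling.
    return {
      status: current ? { ...current, loginInFlight: false } : current,
      shouldRefresh: false,
    };
  }
  // Every other reason keeps the refresh-and-reconcile behavior.
  return { status: current, shouldRefresh: true };
}
```

Returning early in the `login-canceled` branch is what closes the window between SIGTERM and the child's actual exit.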
Add a focused regression test that holds `/api/integrations/vela/status`
at `loginInFlight: true` even after a successful `/login/cancel`,
asserting that the pill stays at the Canceled → Authorize sequence and
never bounces back to 'Signing in…'. This test fails on the pre-fix
listener and passes on the new behavior; existing
'cancels an in-flight AMR sign-in…' and 'reconciles late AMR browser
completion to Signed in after local cancel' tests continue to pass.
Addresses review feedback on #3158 (chatgpt-codex-connector, nettee).
---------
Co-authored-by: lefarcen <935902669@qq.com>
---------
Co-authored-by: a1chzt <chizblank@gmail.com>
Co-authored-by: Amy <1184569493@qq.com>
Co-authored-by: Mason <jinmeihong0201@gmail.com>
Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com>
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Use danger-full-access when WSL_DISTRO_NAME is set and pass
default_permissions=":workspace" so newer Codex builds can write
inside the project directory instead of staying read-only.
* fix(daemon): reconcile missing artifact manifests on run end (#2893)
When an agent writes HTML via write_file instead of create_artifact,
no .artifact.json manifest sidecar is created. If the run then
terminates (inactivity watchdog, user cancel, or process exit), the
HTML file exists on disk but the manifest is missing — breaking the
artifact panel, finalize, and export flows.
Add a best-effort reconciliation step in the child.on('close') handler
that lists project HTML files and calls reconcileHtmlArtifactManifest
for any missing sidecars. The IIFE runs asynchronously after
design.runs.finish() so it never blocks run finalisation.
* fix(daemon): scope run-end reconciliation to files modified during the run
The review on #3110 flagged that listing the entire project tree and
reconciling every HTML file without a sidecar is too broad — for
imported-folder projects (metadata.baseDir), pre-existing HTML files
would receive spurious manifests.
Record runStartTimeMs at the beginning of startChatRun and filter the
reconciliation loop to only touch HTML files whose mtime >= that
timestamp. Add a regression test that backdates a pre-existing HTML
file and verifies it is skipped while a new file is reconciled.
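The mtime filter can be sketched as a pure selection step. `HtmlFileInfo` and `selectFilesToReconcile` are hypothetical names for illustration; the real loop lists project files and calls `reconcileHtmlArtifactManifest` for each selected one.

```typescript
// Hypothetical listing entry; the daemon's real file records may differ.
interface HtmlFileInfo {
  path: string;
  mtimeMs: number;
  hasManifest: boolean;
}

/** Pick HTML files without a sidecar that were modified at or after run start. */
function selectFilesToReconcile(
  files: HtmlFileInfo[],
  runStartTimeMs: number,
): string[] {
  return files
    .filter((f) => f.path.toLowerCase().endsWith(".html"))
    .filter((f) => !f.hasManifest)
    .filter((f) => f.mtimeMs >= runStartTimeMs)
    .map((f) => f.path);
}
```

Capturing `runStartTimeMs` before the agent writes anything matters here: a timestamp taken after the write would push the new file below the threshold.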
* test(daemon): fix mtime ordering in reconciliation regression test
The runStartTimeMs was recorded after writing the new file, so its
mtime fell before the threshold and the reconciliation filter skipped
it. Move the timestamp capture to before the write to match the real
startChatRun semantics.
* fix(amr): close ACP stdin on abort so vela tears down OpenCode
When an AMR (vela) run is cancelled, attachAcpSession.abort() sent a
`session/cancel` RPC but left the child's stdin open. The vela ACP bridge
keeps running until it sees EOF (or is signaled), and it only shuts down
its private OpenCode `serve` process on a clean exit — so on abort the
OpenCode server lingered until the caller's SIGTERM fallback, and leaked
entirely if the parent was killed before cleanup ran.
End stdin after sending the cancel (mirroring the clean-completion path)
so the agent receives EOF and shuts down its own runtime promptly,
independent of signal timing.
* fix(amr): end stdin on abort even before session/new resolves
Addresses review on #3097: abort() still returned early when sessionId
was unset, so the stdin EOF only happened after session/new completed.
Cancelling during ACP startup (before the session exists) left the
OpenCode-teardown window open until the caller's SIGTERM fallback — and a
parent hard-kill before that could still strand the private OpenCode
process.
Move stdin.end() out of the sessionId guard so abort always closes stdin
when the pipe is writable; gate only the session/cancel RPC on sessionId.
Add a regression test that aborts during startup and asserts stdin is
ended with no session/cancel emitted.
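The reordered guard might look roughly like this. It is a sketch, not the code in acp.ts: `WritableLike` and `createAbort` are hypothetical, and the cancel message is written as a newline-delimited JSON-RPC notification, which is an assumption about the framing.

```typescript
// Minimal structural stand-in for the child's stdin stream.
interface WritableLike {
  writable: boolean;
  write(chunk: string): unknown;
  end(): unknown;
}

function createAbort(
  stdin: WritableLike,
  getSessionId: () => string | null,
): () => void {
  return function abort(): void {
    if (!stdin.writable) return; // pipe already closed, nothing to do

    // Only the cancel RPC depends on a session existing.
    const sessionId = getSessionId();
    if (sessionId) {
      const cancel = { jsonrpc: "2.0", method: "session/cancel", params: { sessionId } };
      stdin.write(JSON.stringify(cancel) + "\n");
    }

    // EOF is sent in every case, including aborts during ACP startup.
    stdin.end();
  };
}
```

Keeping `end()` outside the `sessionId` check is the whole fix: the agent sees EOF whether or not `session/new` has resolved.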
Packaged diagnostics bundles never contained the daemon or web
`latest.log` — the very logs that hold the agent/critique run flow — so
support exports could not explain "sent prompt to the agent, then
nothing happened" reports.
Root cause: the sidecar `base` means different things per launch path.
tools-dev passes the pre-namespace source root, so
`resolveNamespaceRoot(base, namespace)` is correct. But the packaged
orchestrator launches every child with `base = <namespaceRoot>/runtime`
(apps/packaged/src/{paths,sidecars}.ts) while logs live a level up at
`<namespaceRoot>/logs`. The diagnostics builders re-appended the
namespace and resolved every log to
`<namespaceRoot>/runtime/<namespace>/logs/...` → ENOENT. renderer.log
only survived by accident: the desktop main process wrote it to the
same wrong path the reader looked in.
Add `resolveRuntimeNamespaceRoot(runtime, contract, runtimeMode)` to
`@open-design/sidecar` which walks up out of the `runtime/` dir in
packaged (runtime-mode) launches and falls back to the dev layout
otherwise. Route the desktop renderer-log path and both diagnostics
exporters (desktop IPC + daemon HTTP) through it so writer and reader
stay in lockstep and renderer.log lands next to the desktop log dir.
Tests: sidecar unit specs for both layouts; a daemon export spec that
writes a real `<namespaceRoot>/logs/daemon/latest.log` and asserts the
bundle captures its contents (red on main → ENOENT placeholder, green
here).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(connectors): expire stale auth credentials
Mark connector credentials as expired when provider reads report auth-shaped failures so Memory stops presenting stale connected apps as healthy.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(connectors): avoid expiring grants on platform 401
Only delete connector credentials for provider tool errors attributable to the current connector so Composio platform auth failures do not wipe valid grants.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Treat Claude Code stdout like "Not logged in · Please run /login." as an
auth failure in diagnoseClaudeCliFailure so connection tests and chat
runs surface actionable login guidance instead of raw CLI text.
* fix(daemon): detect CodeWhale as DeepSeek TUI fallback binary
The renamed CodeWhale CLI installs the `codewhale` dispatcher instead of
`deepseek`. Probe it via fallbackBins so agent detection works without
requiring DEEPSEEK_BIN overrides.
Fixes #2983
* test(daemon): align deepseek docsUrl expectation with CodeWhale metadata
Update env-and-detection coverage to match the runtime metadata URL
changed for issue #2983.
Import/install routes compared bare directory slugs against catalog ids
prefixed with user:, causing a false 500 after a successful write and
duplicate entries on retry. Normalize lookup and reserved slug ids.
Fixes #2489
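A minimal sketch of the lookup normalization, assuming catalog entries carry an `id` of the form `user:<slug>`. `toCatalogId` and `findCatalogEntry` are hypothetical names; the actual routes and catalog types are not shown in this message.

```typescript
const USER_PREFIX = "user:";

/** Map a bare directory slug (or an already-prefixed id) to a catalog id. */
function toCatalogId(slugOrId: string): string {
  return slugOrId.startsWith(USER_PREFIX) ? slugOrId : USER_PREFIX + slugOrId;
}

/** Look an entry up by slug or id, comparing in the prefixed form. */
function findCatalogEntry<T extends { id: string }>(
  catalog: T[],
  slugOrId: string,
): T | undefined {
  const id = toCatalogId(slugOrId);
  return catalog.find((entry) => entry.id === id);
}
```

Comparing both sides in the same prefixed form means a successful write is found again on the follow-up lookup, which is what avoids the false 500 and the duplicate entry on retry.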
* fix(daemon): widen HTTP keep-alive so SSE survives idle gaps
The daemon's `/api/runs/:id/events` SSE stream emits an in-band
`: keepalive` comment every 25s (`SSE_KEEPALIVE_INTERVAL_MS`), but
Node's default `server.keepAliveTimeout` is 5_000ms. When a run is
quiet for more than five seconds — e.g. the agent is still composing,
or the user briefly walks away — Node closes the underlying TCP
connection from under the SSE writer, the next 25s ping lands on a
dead socket, and the browser surfaces it as a generic
"network error" mid-stream.
This is most visible behind any keep-alive-aware middlebox (the
nginx running in the desktop bundle, the socat/docker bridges users
set up for remote access, EC2 security-group idle timers): the
default 5s window is shorter than every reasonable in-band keepalive
cadence, so the connection dies before the application gets a chance
to assert it's still alive.
Set the listener to:
- `keepAliveTimeout = 120_000` — 4.8× the in-band keepalive, plenty
of slack for clock skew and slow flushes.
- `headersTimeout = 125_000` — must exceed `keepAliveTimeout` per the
Node docs, otherwise a misbehaving client can stall request parsing
indefinitely.
- `requestTimeout = 0` — disable the per-request timeout entirely;
an SSE response intentionally runs for as long as the agent runs.
Verified by curling
`/api/runs/<id>/events` from inside the daemon container and
watching the connection stay open through three full 25s keepalive
cycles where it previously RST'd at ~5s.
* fix(daemon): address PR #2557 review — drop requestTimeout, add regression test
Three changes responding to @PerishCode's review (#2557):
1. Drop `server.requestTimeout = 0`. The reviewer is correct: that knob
bounds how long the server waits to *receive* a complete request
(headers + body) and is cleared the moment the request is fully
parsed — it does not gate the duration of an SSE response. Setting
it to 0 only removes Node 18+'s default 300s slow-loris guard, which
is a real regression on a daemon that binds to 0.0.0.0 / Tailscale.
2. Rewrite the comment block. The previous comment claimed
`keepAliveTimeout` "closes any idle SSE connection." Per the Node
docs, `keepAliveTimeout` arms *after* a response finishes writing —
it bounds the between-request idle gap on a kept-alive socket, not
an in-flight streaming response. SSE drops mid-stream are almost
always middlebox idle timers (nginx, socat/docker, EC2 NAT), not
Node's own socket timeout, and this listener-side change cannot
extend a connection past those middleboxes.
What this PR actually fixes: routine kept-alive sockets used around
an SSE stream (status polls, run-status fetches, the initial GET
before the SSE upgrade) surviving normal client pauses. 120s gives
comfortable headroom over the 25s in-band cadence so chat clients
stop reconnect-storming between bursts.
3. Add `apps/daemon/tests/server-keepalive.test.ts` so a future
refactor cannot silently restore the Node defaults. The test uses
the existing `startServer({ port: 0, returnServer: true })` fixture
(mirroring version-route.test.ts) and asserts the listener's
`keepAliveTimeout` and `headersTimeout` invariants.
Verified:
- pnpm --filter @open-design/daemon run typecheck passes
- pnpm vitest run tests/server-keepalive.test.ts → 2 passed
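The listener settings that survive the review can be sketched as below. This is a minimal illustration rather than the daemon's actual code: `applyKeepAliveTimeouts` is a hypothetical helper name, and only the numeric values come from the commit messages above.

```typescript
import http from "node:http";

// Values quoted in the commit messages above.
const SSE_KEEPALIVE_INTERVAL_MS = 25_000;
const KEEP_ALIVE_TIMEOUT_MS = 120_000;
const HEADERS_TIMEOUT_MS = 125_000;

// Hypothetical helper: applies the listener timeouts to any http.Server.
function applyKeepAliveTimeouts(server: http.Server): http.Server {
  // Bounds the idle gap between requests on a kept-alive socket.
  server.keepAliveTimeout = KEEP_ALIVE_TIMEOUT_MS;
  // Kept above keepAliveTimeout, as the commit message requires.
  server.headersTimeout = HEADERS_TIMEOUT_MS;
  // requestTimeout is deliberately left at the Node default.
  return server;
}

const server = applyKeepAliveTimeouts(http.createServer());
```

A regression test in the spirit of `server-keepalive.test.ts` would then assert that `headersTimeout` stays above `keepAliveTimeout`, and that both stay above the 25s in-band cadence.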
Wire Aider (https://aider.chat) into the daemon agent registry alongside
the existing 16 CLIs. Aider is one of the most-used open-source coding
CLIs and routes through LiteLLM, so users can drive any provider they
already have a key for (OpenAI, Anthropic, DeepSeek, Gemini, OpenRouter,
local Ollama, etc.) without an extra adapter per provider.
Implementation follows the DeepSeek TUI pattern: prompt-via-argv with a
30 KB byte budget guard, plain stdout streaming, and the suppression
flags needed to keep aider runnable without a TTY (--yes-always,
--no-pretty, --no-git, --no-auto-commits, --no-suggest-shell-commands,
--no-show-model-warnings). `--message-file -` is not used because aider
treats `-` as a literal filename rather than a stdin sentinel.
Touchpoints mirror the other one-shot adapters:
- runtimes/defs/aider.ts new RuntimeAgentDef
- runtimes/registry.ts register in AGENT_DEFS
- runtimes/executables.ts AIDER_BIN override
- app-config.ts AIDER_BIN in agent env set
- web/utils/agentLabels.ts 'Aider' display label + aliases
- tests/runtimes/agent-args.test.ts buildArgs shape coverage
- tests/runtimes/env-and-detection bin override coverage
- tests/runtimes/helpers shared `aider` test helper
Validated with `pnpm guard`, `pnpm typecheck`, and
`pnpm --filter @open-design/daemon test tests/runtimes` (123 passing).
An end-to-end probe of the live aider binary, run against DeepSeek with the
exact argv the adapter produces, returned the expected output.
* feat(daemon): attach structured diagnostics to agent connection test results
Local agent connection-test failures currently flatten everything into
a single free-form `detail` string (e.g. "exit 1"). Settings UI and CLI
consumers can't tell what phase failed, which binary the daemon picked,
or what the child's exit metadata looked like — they have to scrape the
human-readable text.
Add an optional `diagnostics` block on the connection-test response so
callers can read structured fields instead. The existing `kind` and
`detail` strings are kept bit-for-bit identical, so older UIs keep
rendering unchanged.
- packages/contracts: add `ConnectionTestPhase`
(binary_resolution / version_probe / model_list / spawn /
connection_smoke_test / output_parse) and a `ConnectionTestDiagnostics`
interface with optional `binaryPath`, `binaryVersion`, `exitCode`,
`signal`, `stdoutTail`, `stderrTail`; extend
`ConnectionTestResponse.diagnostics?` to carry it.
- apps/daemon/connectionTest.ts: thread a `phase` tracker through
testAgentConnectionInternal, flip it at the meaningful boundaries
(binary_resolution → spawn → connection_smoke_test / output_parse),
and stamp diagnostics into every result return point — the four
result helpers plus both early returns. Tail data already buffered
by `createAgentSink` is reused; nothing new is captured.
- tests: three regressions per #2248 — success path attaches
phase='connection_smoke_test' + exitCode 0, exit-failed path
attaches phase='spawn' + the failing exitCode + the stderr tail,
and a missing-CLI path attaches an early-phase diagnostics block.
This is PR 1 of the #2248 plan (contracts + minimum daemon fill);
follow-ups will introduce a normalized failure classifier
(binary_not_found, unsupported_version, auth_failed, quota_exceeded,
network_failed, unsupported_flags, no_text_output, output_parse_failed,
spawn_failed), candidate-alternative reporting via
inspectAgentExecutableResolution, and the Settings "View details"
disclosure.
Refs #2248.
* fix(connectionTest): honor diagnostics contract on all local return paths
Two follow-ups from review of #2419:
- packages/contracts/src/api/connectionTest.ts advertises diagnostics
as 'Always set on local agent test responses', but three local
returns still bypassed buildDiagnostics(): the buildArgs failure
around 1295, the preflight probeAgentAuthStatus().status === 'missing'
branch around 1317, and the outer catch around 1566. Thread
buildDiagnostics() through all three; phase is still 'binary_resolution'
at the first two and whatever the runtime advanced to at the catch.
- resultFromAgentText() hard-coded exitCode: 0 even though
resultFromChildExit() routes ACP clean-SIGTERM completion through
this success helper (winner.code === null, winner.signal ===
'SIGTERM' with acpCleanCompletion). Add an optional exit argument
threaded from both call sites so the diagnostics reflect the actual
child code/signal pair instead of a synthesized 0 that masks the
SIGTERM teardown. Only synthesize 0 when no exit context is
available (theoretical text-without-exit path).
Tests:
- regression locking the diagnostics contract for the preflight auth
path on Cursor Agent (phase: binary_resolution, binaryPath set)
* docs(contracts): widen diagnostics contract to match early-failure paths
Reviewer flagged that the JSDoc-style comment on
ConnectionTestResponse.diagnostics still said 'Populated only when the
test actually spawned an agent CLI', but the previous follow-up made
the daemon stamp diagnostics on three pre-spawn local-agent failures
too: the unknown-agent and unresolved-binary branches around
connectionTest.ts:1123-1148 and the preflight auth return around
1338-1353. Reword the contract so Settings/CLI consumers do not
incorrectly special-case those early local failures as
diagnostics === undefined.
* fix(connectionTest): keep contracts browser-safe and fold probe output into preflight diagnostics
Two follow-ups from review of #2419:
- ConnectionTestDiagnostics.signal was typed as
`NodeJS.Signals | string | null`, which made the generated .d.ts of
the shared @open-design/contracts surface depend on ambient Node
types. Downstream consumers reading a plain HTTP response shape
should not need @types/node. Narrow to `string | null` (NodeJS.Signals
literals are strings, so the daemon write site is unchanged) and
document the boundary in the field comment.
- The Cursor-style preflight auth path stamped diagnostics built from
the smoke-test sink, which is always empty at that point because the
smoke spawn never happened. As a result the diagnostics block
silently dropped `cursor-agent status`'s own stderr/stdout/exit
context — the only structured failure information available on that
path. Thread the probe output back out of probeAgentAuthStatus()
via new optional stdoutTail/stderrTail/exitCode/signal fields, then
merge them into the diagnostics overrides in connectionTest.ts so
Settings/CLI consumers can render the auth-failure context instead
of just the guidance string.
Tests:
- extended the Cursor preflight regression to assert that diagnostics
carries the probe's stderr ("Not logged in") and exit code (1).
* chore(deps): upgrade express 4.22.1 -> 5.2.1 and @types/express
Breaking changes addressed:
- Renamed all bare wildcard route segments from * to *splat across
src/server.ts, src/static-resource-routes.ts, src/project-routes.ts,
src/import-export-routes.ts, and all three test stubs that define
app.get/options/delete routes using /raw/* patterns
- Updated wildcard param access from (req.params as any)[0] / req.params[0]
to Array.isArray(req.params.splat) ? req.params.splat.join('/') : String(...)
to handle the Express 5 / path-to-regexp v8 change where wildcard params
are now string[] instead of string
- Updated app.get('*') SPA fallback to app.get('/*splat') in server.ts
- Annotated five connector route handlers with Request<{ connectorId: string }>
so the typed param resolves as string, not string | string[], fixing the
10 TS2345 / TS2322 errors that surfaced when @types/express moved to 5.0.6
- Fixed two app.listen() beforeAll callbacks in origin-validation.test.ts to
accept and propagate the optional Error argument Express 5 now passes to
the listen callback, resolving TS2769 overload mismatch
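The wildcard-param change can be illustrated with a small pure helper. This is a sketch under assumptions: `splatToPath` is a hypothetical name, and the fallback for a missing param is a guess, since the original `String(...)` argument is elided in the message above.

```typescript
// Express 5 / path-to-regexp v8: a `*splat` wildcard param arrives as string[]
// instead of a single string.
type SplatParam = string | string[] | undefined;

// Hypothetical helper: normalizes the wildcard param back to a slash-joined path.
function splatToPath(splat: SplatParam): string {
  if (Array.isArray(splat)) return splat.join("/");
  // Assumption: treat a missing param as an empty path.
  return String(splat ?? "");
}
```

A handler registered on a `/raw/*splat` route would call `splatToPath(req.params.splat)` where it previously read `req.params[0]`.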
* chore(nix): refresh daemonHash for rebased lockfile
* fix(daemon): await res.sendFile() in async route handlers for Express 5 compatibility
Express 5 res.sendFile() returns a Promise. Without await, async route
handlers return before the response is sent, causing Express to call
next() and fall through to a 404. Add await to all res.sendFile() calls
in async handlers in static-resource-routes.ts and server.ts.
* fix(daemon): use readFile+send for spritesheet route instead of sendFile
Express 5 res.sendFile() returns undefined (not a Promise). ENOENT errors
call next() asynchronously after the route handler's try/catch has returned,
causing unhandled 404 responses. Replacing with fs.promises.readFile + res.send
keeps the error path fully within the handler's try/catch.
---------
Co-authored-by: Patrick A <259201958+eefynet@users.noreply.github.com>
A design system imported from a SwiftUI repo could never be published. The
import's file scorer was web-only, so every .swift file scored 0 while the
repo's config dotfiles (.zenflow, .vscode, .zed) scored on the generic text
bonus and won the top-N selection. Even when a source file was selected, the
snapshot writer's text-file allowlist didn't include .swift, so it was dropped.
The result: snapshots were all config dotfiles, which the project files API
hides, so the publish gate saw zero evidence snapshots and stayed blocked no
matter how many times the intake ran.
Three layers fixed in tools-connectors-cli.ts:
- scoreDesignFile now scores native/design-token source (Swift, Kotlin, Go,
Rust, etc.), with a high boost for token files like ColorSystem.swift,
Typography.swift, Spacing.swift.
- shouldSkipRepoPath skips editor/CI/agent tooling dirs (.vscode, .zed, .idea,
.zenflow, .github, .husky, ...) so their files stop crowding out real source.
- isTextSnapshotPath recognizes native source extensions so selected files are
actually written.
Also teach the design-system swatch extraction to read SwiftUI colors:
swift-colors.ts parses Color(red:green:blue:), Color(hue:saturation:brightness:)
(HSB), and Color(white:), evaluating decimal, hex-byte, and division component
expressions (0xF4 / 255, 220 / 360), and design-systems.ts uses it as a new
swatch form. scoreDesignFile, shouldSkipRepoPath, and isTextSnapshotPath are
exported for unit tests.
Verified against a real SwiftUI repo: the intake now captures ColorSystem,
TypographySystem, SpacingSystem and the view/model source instead of config
dotfiles.
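As a rough illustration of the component-expression evaluation described above, the sketch below handles only the `Color(red:green:blue:)` form. The function names and the exact regexes are assumptions, not the contents of swift-colors.ts.

```typescript
// Evaluate one component: a decimal, a hex literal, or a single `a / b` division.
function evalComponent(expr: string): number | null {
  const parts = expr.split("/").map((p) => p.trim());
  if (parts.length > 2 || parts.some((p) => p === "")) return null;
  const nums = parts.map((p) => {
    if (/^0x[0-9a-f]+$/i.test(p)) return parseInt(p, 16);
    if (/^\d*\.?\d+$/.test(p)) return Number(p);
    return NaN;
  });
  if (nums.some((n) => Number.isNaN(n))) return null;
  const [num, den] = nums;
  if (num === undefined) return null;
  if (den === undefined) return num;
  return den === 0 ? null : num / den;
}

// Parse `Color(red: ..., green: ..., blue: ...)` into a lowercase #rrggbb string.
function parseSwiftRgb(source: string): string | null {
  const m = /Color\(\s*red:\s*([^,]+),\s*green:\s*([^,]+),\s*blue:\s*([^,)]+)/.exec(source);
  if (!m) return null;
  const channels = [m[1], m[2], m[3]].map((s) => evalComponent(s ?? ""));
  let hex = "#";
  for (const c of channels) {
    if (c === null) return null;
    // Clamp to 0..1, then scale to a byte.
    const byte = Math.round(Math.min(1, Math.max(0, c)) * 255);
    hex += byte.toString(16).padStart(2, "0");
  }
  return hex;
}
```

The HSB and `Color(white:)` forms would reuse `evalComponent` with a different capture pattern and conversion step.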
* feat: rename editable design systems from Settings + od CLI
Editable (user-created) design systems can already be renamed via
PATCH /api/design-systems/:id, but the capability was not surfaced
in the UI or CLI.
- Settings -> Design Systems: editable cards show a hover-reveal pencil
next to the name that opens a rename modal; built-in cards stay
read-only. Reuses common.rename/save/cancel (no new i18n keys).
- CLI: 'od design-systems rename <id> --title <new> [--json]', backed by
a unit-tested pure arg parser (design-system-rename-args.ts).
Both surfaces call the existing PATCH endpoint.
* Route od design-systems --help and -h to the rename-aware usage
The dispatcher only special-cased the `help` subcommand, so
`od design-systems --help` and `-h` fell through to the generic library
list, which advertises only `list` and `show`. That left `rename` off the
main discovery path even though this PR ships it.
Pulled the usage text and the help-arg check into a small pure module so
`help`, `--help`, and `-h` all render the same rename-aware usage, and added
a test that asserts the flag forms route to help and that the text lists
rename. The pure module keeps the assertion off process.exit / console.log.
* Reject --title flag-as-value and keep the rename modal open on failure
Two rename edge cases from review.
CLI: parseDesignSystemRenameArgs took the next token after --title
unconditionally, so `rename user:acme --title --json` parsed the title as
"--json" and could rename the system to a flag name instead of failing usage
validation. A separate --title value must now be a real token; a title that
starts with a dash has to use the --title=<value> form. Malformed inputs return
null, which the CLI surfaces as a usage error.
Web: commitRename closed the modal unconditionally, but updateDesignSystemDraft
returns null on any non-OK response or fetch failure, so a transient error
dropped the typed title with no feedback. The modal now stays open with the
title intact and shows an inline error on failure, matching the existing import
error pattern in this component. Added tests for the flag-as-value rejection
and for the failed-update modal state.
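The flag-as-value rejection could look roughly like this. It is a simplified sketch, not the real design-system-rename-args.ts: `parseRenameArgs` and the `RenameArgs` shape are assumed names, and the real parser may accept more inputs.

```typescript
interface RenameArgs {
  id: string;
  title: string;
  json: boolean;
}

// Returns null for any malformed input so the caller can print usage.
function parseRenameArgs(argv: string[]): RenameArgs | null {
  let id: string | undefined;
  let title: string | undefined;
  let json = false;
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    if (arg === "--json") {
      json = true;
    } else if (arg.startsWith("--title=")) {
      // The = form may carry a value with a leading dash.
      title = arg.slice("--title=".length);
    } else if (arg === "--title") {
      const next = argv[i + 1];
      // A separate value must be a real token, not another flag.
      if (next === undefined || next.startsWith("-")) return null;
      title = next;
      i++;
    } else if (arg.startsWith("-")) {
      return null; // unknown flag
    } else if (id === undefined) {
      id = arg;
    } else {
      return null; // unexpected extra positional
    }
  }
  if (!id || title === undefined || title.trim() === "") return null;
  return { id, title, json };
}
```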
* Gate the rename completion on the active modal session
commitRename mutated the shared modal state after awaiting the PATCH, so a
slow rename for system A could resolve after the user cancelled and opened a
rename for system B, then close B's modal or show A's failure inside B's
dialog.
A monotonic session token (bumped whenever the modal opens or closes) is now
captured before the request and rechecked after it resolves. A stale
completion skips all modal-state updates. The list update for a successful
rename still applies, since that reflects a real server-side change regardless
of which modal is open. Added a regression test that opens a second rename
before the first PATCH settles and confirms the newer modal is untouched.
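The session-token gate can be sketched as a small counter plus a check after the await. All names here (`ModalSession`, `commitRename`, the `ui` callbacks) are illustrative assumptions; the real component keeps this state in React.

```typescript
// Monotonic token that changes whenever the modal opens or closes.
class ModalSession {
  private current = 0;
  bump(): number {
    this.current += 1;
    return this.current;
  }
  capture(): number {
    return this.current;
  }
  isCurrent(token: number): boolean {
    return token === this.current;
  }
}

// Hypothetical commit flow: capture before the request, recheck after it.
async function commitRename(
  session: ModalSession,
  patch: () => Promise<boolean>,
  ui: { close(): void; showError(): void },
): Promise<void> {
  const token = session.capture();
  const ok = await patch();
  // A stale completion must not touch the modal that is open now.
  if (!session.isCurrent(token)) return;
  if (ok) ui.close();
  else ui.showError();
}
```

In this sketch the list update for a successful rename would happen before the `isCurrent` check, since it reflects a server-side change regardless of which modal is open.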
* Localize the rename-failed error instead of hardcoding English
The inline rename error was hardcoded English on a Settings surface that
otherwise runs through useT(), so non-English users saw English while the
rest of the panel was localized.
Added settings.designSystemRenameFailed to the typed dictionary and all 19
locale files, and the modal now reads it through t(). The translations are
adapted from each locale's existing settings.rescanFailed string ("X failed.
Check the daemon and try again."), swapping the verb to rename, so the daemon
and retry wording matches what those locales already ship.
The composed chat prompt prepends a '# Instructions (read first)'
block in front of '# User request' so a single user message carries
both the system rules and the actual request — the shape every agent
CLI (Claude, Codex, OpenCode, Gemini) expects on stdin.
In practice claude-opus-4-7 (and a few other instruction-tuned
models, particularly with --include-partial-messages on the stream)
start their reply by echoing the top of that user message verbatim.
The chat UI then shows the system prompt as a literal block leading
the visible answer, e.g.:
Instructions
Always respond in Korean. Use Korean for all explanations…
…Maintain full orthographic correctness…
).네, 완료했습니다. 전달하신 4가지 보강 포인트를 …
(The closing token of the instructions block runs straight into the
real answer without a newline — the telltale of a model-side echo
rather than a UI render bug.)
Close every Instructions block with one trailing line:
(Do not quote, restate, or echo the # Instructions block above in
your reply. Begin your response with the answer to the # User
request below.)
This kills the regression in practice without changing the turn
shape (still one user message), so no agent CLI plumbing has to move.
Tested via tests/chat-route.test.ts — pins the literal guard string
so a future refactor cannot silently drop it.
Co-authored-by: nicejames <nicejames@gmail.com>
* feat: pin custom design systems to top and read swatches from color tables
Two changes to Settings -> Design Systems.
Custom (user-created) systems now sort to the top of the list instead of
sitting under the built-in catalog. A small pure helper
(orderDesignSystemGroups) floats any group that holds an editable system
above the rest; everything else keeps its order.
Swatches now show for systems whose DESIGN.md keeps colors in a markdown
table. extractSwatches only understood inline forms before, so table
palettes came back empty and the cards showed no color squares. Added a
table-row pass that reads the first hex in a row as the value and the
first plain text cell as the name. Inline forms still win when a file
mixes both.
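A table-row pass along these lines might look like the sketch below. `swatchFromTableRow` is a hypothetical name, and the definition of a "plain text cell" (contains letters and no hex color) is an assumption about how extractSwatches decides.

```typescript
interface Swatch {
  name: string;
  value: string;
}

// 3- or 6-digit hex colors only, for brevity.
const HEX = /#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b/;

function swatchFromTableRow(row: string): Swatch | null {
  const trimmed = row.trim();
  if (!trimmed.startsWith("|")) return null;
  const cells = trimmed
    .split("|")
    .map((c) => c.trim().replace(/`/g, ""))
    .filter((c) => c !== "");
  // First hex anywhere in the row becomes the value.
  const hex = cells.map((c) => HEX.exec(c)).find((m) => m !== null);
  if (!hex) return null;
  // First cell with letters and no hex becomes the name.
  const name = cells.find((c) => !HEX.test(c) && /[A-Za-z]/.test(c));
  if (name === undefined) return null;
  return { name, value: hex[0] ?? "" };
}
```

Header and separator rows carry no hex value, so they fall out without special handling.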
* Sort editable systems first within a category group
The group-level sort floated any category holding a user system to the top,
but items inside a group rendered in their incoming (alphabetized) order. A
user system that shares a category with built-ins (its DESIGN.md can set any
category) still landed below Apple/Airbnb in that group, which misses the
point of pinning custom systems to the top.
orderDesignSystemGroups now also sorts items editable-first within each
group, stable so built-ins keep their alphabetical order. The display order
comes from the helper output, so this covers the import path re-alphabetizing
before grouping without touching it.
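The two-level ordering can be sketched with stable sorts. The types and the name `orderGroups` below are assumptions for illustration; the real orderDesignSystemGroups may carry more fields.

```typescript
interface DesignSystemItem {
  id: string;
  editable: boolean;
}

interface DesignSystemGroup {
  category: string;
  items: DesignSystemItem[];
}

// true sorts before false; equal values compare as 0 so order is preserved.
const editableFirst = (a: boolean, b: boolean): number => Number(b) - Number(a);

function orderGroups(groups: DesignSystemGroup[]): DesignSystemGroup[] {
  return groups
    .map((g) => ({
      ...g,
      // Array.prototype.sort is stable, so built-ins keep their incoming order.
      items: [...g.items].sort((a, b) => editableFirst(a.editable, b.editable)),
    }))
    .sort((a, b) =>
      editableFirst(
        a.items.some((i) => i.editable),
        b.items.some((i) => i.editable),
      ),
    );
}
```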
* fix(packaged): honor OD_DATA_DIR in desktop runtime
Co-authored-by: multica-agent <github@multica.ai>
* fix(packaged): scope OD_DATA_DIR by namespace
Co-authored-by: multica-agent <github@multica.ai>
* fix(packaged): reject relative OD_DATA_DIR overrides
Co-authored-by: multica-agent <github@multica.ai>
* fix(packaged): preserve scoped OD_DATA_DIR overrides
Co-authored-by: multica-agent <github@multica.ai>
* fix(packaged): surface OD_DATA_DIR validation as PackagedPathAccessError
Relative OD_DATA_DIR in packaged mode now throws PackagedPathAccessError
instead of a plain Error. apps/packaged/src/index.ts main() only routes
PackagedPathAccessError to dialog.showErrorBox, so the prior plain Error
made the app exit silently for GUI launches with an invalid override.
Extract PackagedPathAccessError into apps/packaged/src/errors.ts so
paths.ts can throw it without an inter-module value cycle with launch.ts.
Co-authored-by: multica-agent <github@multica.ai>
* fix(packaged): make OD_DATA_DIR absolute-path guard platform-aware
The previous guard ran `win32.isAbsolute(expanded)` unconditionally on
every platform, so on macOS/Linux a value like `C:\Users\Fred\OD` passed
the check (win32 considers it absolute) and silently flowed into
`join(expanded, "namespaces", namespace, "data")`, producing a
cwd-relative POSIX path instead of throwing.
Branch the check on `process.platform === "win32"` so Windows paths are
only accepted on Windows. Update the existing Windows-themed test
fixtures to stub `process.platform = "win32"` (the omission was what
masked this bug) and add a regression that stubs `linux` and asserts
`C:\foo` and `\\server\share` are rejected as PackagedPathAccessError.
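The platform branch can be sketched as below. `isAbsoluteForPlatform` is a hypothetical helper that takes the platform as a parameter for testability; the real guard reads `process.platform` and throws PackagedPathAccessError on failure.

```typescript
import { posix, win32 } from "node:path";

// Only accept Windows-style absolute paths when actually running on Windows.
function isAbsoluteForPlatform(value: string, platform: string): boolean {
  return platform === "win32" ? win32.isAbsolute(value) : posix.isAbsolute(value);
}
```

With this shape, `C:\foo` is absolute only under `win32`, which is the distinction the earlier unconditional `win32.isAbsolute` call missed.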
Co-authored-by: multica-agent <github@multica.ai>
* fix(packaged): reject mismatched scoped OD_DATA_DIR
Co-authored-by: multica-agent <github@multica.ai>
---------
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: kami.c <kami.c@chative.com>
PR #2461 sync prep — resolves 14 conflicts merging 84 main-side commits
on top of 58 release-side commits accumulated during the 0.8.0 cycle.
Resolution summary:
Take main (theirs) where main carried deliberate forward progress:
- apps/web/src/components/PluginCard.tsx — 7 hunks, i18n migration:
hardcoded English aria-labels/titles replaced with t() calls keyed
on pluginCard.* (all 8 keys verified present in en.ts).
- apps/web/src/components/TasksView.tsx — 1 hunk, source-ingestion
feature: sortedRoutines (newest-first), sourceIngestionTemplates,
patchSourceForm, submitSourceIngestion. activeCount/pausedCount
semantics preserved (now keyed on sortedRoutines, count unchanged).
- e2e/ui/app.test.ts — new node:fs/promises + tmpdir + path + @/timeouts
imports needed by main-side test helpers.
- e2e/ui/settings-local-cli-codex-fallback.test.ts — menu-dismissal
helper block added by main.
Keep both sides where each added a different field to the same object
literal:
- apps/web/src/components/ProjectView.tsx (locale + analyticsHints
spread).
- apps/web/src/components/DesignSystemFlow.tsx (locale + analyticsHints).
Take release (ours) where release carried deliberate work that ships
0.8.0:
- CHANGELOG.md — release-side 0.8.0 entry + PR link refs; main's
Unreleased section was the same body of work, now finalized.
- apps/landing-page/public/{apple-touch-icon,favicon}.png +
apps/web/public/app-icon.svg — release-side visual refresh assets
consistent with 0.8.0 stable ship.
- tools/pack/src/linux.ts — packageVersion const required by line 466;
taking main's empty line would build-error.
- e2e/ui/project-management-flows.test.ts +
e2e/ui/settings-api-protocol.test.ts +
e2e/ui/settings-memory-routines.test.ts — release-side release-smoke
hardening (shangxinyu1 + PerishFire) takes precedence on overlap.
Closes-issue / unblocks: PR #2461 sync release/v0.8.0 → main.
* fix(prompt): instruct discovery form to follow user's chat language
The discovery form was reaching users in English even when their UI
language was Chinese (#1416). The form is generated by the LLM under
guidance from packages/contracts/src/prompts/discovery.ts, but the
prompt only mentioned that option labels MAY follow the user's
language. The example form embedded English text for title,
description, per-question labels, and placeholders, and the LLM
copied that text verbatim instead of localizing.
Two minimal changes to the prompt:
1. Add a sentence under RULE 1 making the language-match expectation
explicit before the example forms.
2. Expand the Form authoring rules bullet so it covers every
user-facing string (title, description, label, placeholder, option
label) and pins the unlocalized identifiers (id, type, option
value, branch values) for the runtime branch logic.
Fixes #1416
* fix(prompts): mirror discovery localization rule to daemon prompt copy
Apply the same 'Match the user's chat language' paragraph and the
expanded 'Localize every user-facing string' bullet to
apps/daemon/src/prompts/discovery.ts, which the daemon-backed chat
path uses (it imports ./discovery.js, not the contracts copy).
Also add apps/daemon/tests/prompts/discovery-localization-drift.test.ts,
which reads both prompt copies and asserts each one contains both rules,
so the contracts and daemon files cannot silently drift on this behavior.
Apply-anyway reason: pnpm install / pnpm vitest could not run locally
(registry DNS blocked in sandbox + node v26 vs required v24). Direct
Node content assertion over both files passes. CI will run vitest.
---------
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
* fix(daemon): isolate per-agent detection failures so one bad probe cannot blank the picker
`detectAgents` ran every adapter probe in a bare `Promise.all`, so a
synchronous throw from any single probe (e.g. a filesystem error
during PATH walking on a packaged Windows daemon, or an unhandled
async rejection from one of the post-launch probes) rejected the
whole batch. The `/api/agents` route's `catch(() => [])` then handed
the UI an empty list and the model picker collapsed to BYOK / Cloud
only, losing every installed CLI option — which matches what users
in issue #2297 observed after one or two app restarts on Windows.
Wrap each probe in `safeProbe` so a single failure degrades just that
adapter to `unavailableAgent(def)` while the rest of the registry
keeps its real availability. The new
`apps/daemon/tests/runtimes/detection-resilience.test.ts` pins both
synchronous failure sites that previously sat outside the existing
inner try/catch blocks (`resolveAgentLaunch` and
`applyAgentLaunchEnv`) so a future code change cannot regress the
isolation contract.
This is a defensive guard rather than a Windows-only diagnosis: it
fixes any scenario where a single probe blows up, including ones we
have not reproduced yet. If a user still hits #2297 after this lands,
the daemon log will identify which adapter failed instead of silently
returning an empty list.
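The isolation pattern can be sketched as follows. The types and the `detectAll` wrapper are simplified assumptions; only the names `safeProbe` and `unavailableAgent` come from the message above, and their real signatures may differ.

```typescript
interface AgentDef {
  id: string;
}

interface AgentStatus {
  id: string;
  available: boolean;
}

function unavailableAgent(def: AgentDef): AgentStatus {
  return { id: def.id, available: false };
}

// Catches both synchronous throws and async rejections from a single probe.
async function safeProbe(
  def: AgentDef,
  probe: (def: AgentDef) => AgentStatus | Promise<AgentStatus>,
): Promise<AgentStatus> {
  try {
    return await probe(def);
  } catch {
    return unavailableAgent(def);
  }
}

// Promise.all can no longer reject because every element handles its own failure.
async function detectAll(
  defs: AgentDef[],
  probe: (def: AgentDef) => AgentStatus | Promise<AgentStatus>,
): Promise<AgentStatus[]> {
  return Promise.all(defs.map((def) => safeProbe(def, probe)));
}
```

A real implementation would presumably also log the caught error, since the message above says the daemon log identifies the failing adapter.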
Fixes #2297
* ci: re-run checks (unrelated e2e baseline flake on previous run)
* feat(mcp): add write_file, delete_file, delete_project tools
External coding agents driving Open Design through MCP can create new
artifacts (create_artifact) but cannot iterate on a file once written
(create_artifact rejects existing targets), cannot remove a stale
file, and cannot tear down a throwaway project they just spun up via
create_project. Close that loop so the same agent can drive the full
file/project lifecycle end-to-end through MCP.
- write_file(path, content, encoding?): POSTs to /api/projects/:id/files
without `artifact: true`, which the daemon route writes as a plain
overwrite. Supports nested paths and base64 binaries.
- delete_file(path): DELETEs /api/projects/:id/raw/<path> so nested
paths work just like create_artifact's nested name argument.
- delete_project(project, confirm:true): DELETEs /api/projects/:id but
refuses to fall back to the active project and requires confirm:true,
since the operation purges the SQLite row and on-disk project dir
irreversibly. Marked destructiveHint:true on the annotation.
Tests cover each tool's success path, the active-context fallback for
write/delete_file, missing-argument rejection before any network call,
the daemon-error mapper, and the two delete_project guards.
* fix(mcp): echo resolvedProject from delete_project and cover the daemon error path
Two follow-ups from review of #2416:
- delete_project accepts a name substring per its inputSchema and the
server instructions block tells callers to verify which row was
matched via resolvedProject. write_file/delete_file already honor
that contract via withActiveEcho(json, active, resolved), but
deleteProject destructured only `id` and dropped the echo on the
one irreversible tool. Capture `resolved` and pass it through;
active is always null here because the active-context fallback is
intentionally disabled.
- formatDaemonError and the !resp.ok branches in writeFile/deleteFile/
deleteProject had zero coverage — all nine tests stubbed status: 200.
Add three regressions covering the structured-error reformat, the
raw-text fallthrough for non-JSON bodies, and the irreversible
delete_project surface, so a regression in the parse/fallthrough
logic will fail in CI instead of reaching agents.
Mirror the issue #398 fix the claude adapter already has: when
spawning Codex CLI without a custom OPENAI_BASE_URL, strip both
OPENAI_API_KEY and CODEX_API_KEY from the child env so Codex CLI's
own `~/.codex/auth.json` (codex login) wins.
Without this guard, a stale BYOK key left behind in
`agentCliEnv.codex.OPENAI_API_KEY` (e.g. after the user clears the
BYOK dialog and switches execution mode back to Local CLI) silently
flows through `spawnEnvForAgent` and trips 401 invalid_api_key.
The stripping is gated on OPENAI_BASE_URL so users who intentionally
route Codex CLI through a third-party OpenAI-compatible gateway keep
the credential that authenticates against it. Comparison is
case-insensitive to close the Windows mixed-case env name hole that
issue #398 already documents for ANTHROPIC_API_KEY.
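A minimal sketch of the gated, case-insensitive stripping is shown below. `stripCodexKeys` is a hypothetical name, and treating a blank `OPENAI_BASE_URL` as unset is an assumption rather than something the message above states.

```typescript
type Env = Record<string, string | undefined>;

const CODEX_KEY_NAMES = new Set(["OPENAI_API_KEY", "CODEX_API_KEY"]);

function stripCodexKeys(env: Env): Env {
  // Keep credentials when the user routes Codex through a custom gateway.
  const hasBaseUrl = Object.entries(env).some(
    ([name, value]) =>
      name.toUpperCase() === "OPENAI_BASE_URL" && value !== undefined && value.trim() !== "",
  );
  const out: Env = {};
  for (const [name, value] of Object.entries(env)) {
    // Compare upper-cased names so mixed-case Windows env keys are caught too.
    if (!hasBaseUrl && CODEX_KEY_NAMES.has(name.toUpperCase())) continue;
    out[name] = value;
  }
  return out;
}
```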
Fixes #2420
* fix(daemon): finish live-artifact chat runs via watchdog quiet-period handoff (#1451)
Live-artifact runs were staying in `Working` for the full 10-minute
inactivity window even after the deliverable had been registered, and
sometimes finishing as `failed` with `Agent stalled without emitting
any new output for 600s`. The agent process kept its stdin/stdout
alive (claude-code stream-json idle stdin, post-write reasoning that
never reaches the chat) so the existing watchdog could not tell the
deliverable was already in the user's hands.
Wire `/api/tools/live-artifacts/create` back into the chat run via a
small per-run handle registry: on the first `created` event, the run
flips a local `artifactRegistered` flag and rearms the watchdog with
the shorter `OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS` (default 60s)
instead of the 10-minute pre-artifact ceiling. When that quiet timer
trips, the watchdog no longer emits a stalled-error / `failed` finish;
it SIGTERMs the child and lets the existing child-exit handler do
final classification — the close handler now treats a SIGTERM exit
after a registered artifact as `succeeded`, matching what the user
actually got (a delivered artifact, not a failed run).
The handoff stays with the existing child-exit lifecycle, so tool
token revocation, cancel semantics, and exit-status classification
keep their current owner — addressing the PR #1543 review history
where finishing the run from the tool route bypassed those guarantees.
Closes #1451.
* fix(daemon): gate artifact quiet-period close on daemon-initiated flag (#1451 review follow-up)
Reviewer (#2585) found that the close-handler branch reclassifying
SIGTERM/SIGKILL as `succeeded` only checked `artifactRegistered`, so an
unrelated later termination (external `kill`, OOM, container shutdown)
after a successful artifact write would silently flip the run from
`failed` to `succeeded` — the exact "completed without producing
anything visible" failure mode the existing close handler is trying
to prevent.
Track the watchdog-initiated shutdown explicitly: set
`artifactQuietShutdownRequested = true` immediately before
`failForInactivity()` sends SIGTERM (covering the kill-grace SIGKILL
escalation under the same flag), and require that flag in the close
handler's quiet-period branch.
Extract the final-status decision into a pure
`classifyChatRunCloseStatus` so the daemon-initiated vs external
signal cases can be pinned with focused unit tests instead of
asserting closure-internal state via end-to-end timing.
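A pure classifier of this kind might look like the sketch below. It is deliberately reduced to the two flags discussed here; the input shape and the name `classifyClose` are assumptions, and the real `classifyChatRunCloseStatus` likely also handles cancellation and other statuses.

```typescript
interface CloseInput {
  code: number | null;
  signal: string | null;
  artifactRegistered: boolean;
  artifactQuietShutdownRequested: boolean;
}

type CloseStatus = "succeeded" | "failed";

function classifyClose(input: CloseInput): CloseStatus {
  if (input.code === 0) return "succeeded";
  const killed = input.signal === "SIGTERM" || input.signal === "SIGKILL";
  // Only a daemon-initiated quiet-period shutdown after a registered artifact
  // counts as success; an external kill after the artifact still fails.
  if (killed && input.artifactRegistered && input.artifactQuietShutdownRequested) {
    return "succeeded";
  }
  return "failed";
}
```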
* fix(daemon): treat OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS=0 as disabled (#1451 review follow-up)
Reviewer (#2585 non-blocking) found that an operator override of
`OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS=0` no longer behaved as a
"disable the quiet period" knob: once the artifact was registered,
`activeInactivityTimeoutMs()` dropped to 0, `noteAgentActivity()`
early-returned without clearing the prior timer, and the pre-artifact
10-minute timer kept running while further agent activity stopped
refreshing it.
Make the quiet-period switch conditional on a positive value. A 0
override now means "do not shorten after artifact registration" — the
pre-artifact ceiling stays active, subsequent activity continues to
reschedule it, and the existing pre-artifact stalled-error path still
fires when the agent genuinely hangs. Pin the resolver as a pure
`resolveActiveInactivityTimeoutMs` helper so the four quiet-vs-pre
matrix cases are unit-tested directly.
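A minimal sketch of the resolver rule, assuming a plain-object input. The parameter names and the 30-second quiet period used in the usage lines are hypothetical; only the helper name and the "positive value" rule come from the commit text.

```typescript
// Hypothetical input shape for the sketch.
interface TimeoutInput {
  artifactRegistered: boolean;
  preArtifactTimeoutMs: number; // 0 disables the pre-artifact watchdog
  artifactQuietPeriodMs: number; // 0 means "do not shorten after artifact"
}

function resolveActiveInactivityTimeoutMs(input: TimeoutInput): number {
  // Switch to the quiet period only when it is a positive value.
  if (input.artifactRegistered && input.artifactQuietPeriodMs > 0) {
    return input.artifactQuietPeriodMs;
  }
  // Otherwise the pre-artifact ceiling stays in force (possibly 0 = off).
  return input.preArtifactTimeoutMs;
}

// The four post-artifact combinations:
const TEN_MINUTES = 600_000;
const QUIET = 30_000; // arbitrary example value
const after = (pre: number, quiet: number): number =>
  resolveActiveInactivityTimeoutMs({
    artifactRegistered: true,
    preArtifactTimeoutMs: pre,
    artifactQuietPeriodMs: quiet,
  });

console.log(after(TEN_MINUTES, QUIET)); // 30000
console.log(after(0, QUIET)); // 30000
console.log(after(TEN_MINUTES, 0)); // 600000
console.log(after(0, 0)); // 0
```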
* fix(daemon): arm the quiet-period watchdog when pre-artifact timeout is disabled (#1451 review follow-up)
Reviewer (#2585 non-blocking, round 3) found that
`OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS=0` paired with
`OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS>0` left the watchdog disarmed
forever. The `noteAgentActivity()` call at run start exited early
because the pre-artifact delay was 0, so `inactivityTimer` was still
`null` when the artifact was registered, and the prior
`if (inactivityTimer) noteAgentActivity()` guard inside
`noteArtifactRegistered()` then skipped the re-arm. The
newly-positive quiet-period delay never armed a timer at all — a
chat run that went silent right after artifact creation would stay
`running` indefinitely.
Drop the guard. `noteAgentActivity()` is already the function that
decides whether to schedule (it bails when the active delay is 0),
so calling it unconditionally keeps the behavior coherent across the
four pre/quiet combinations: both non-zero (was already fine), pre=0
+ quiet>0 (now arms the quiet timer), pre>0 + quiet=0 (still falls
back to the pre-artifact ceiling via the existing resolver), both
zero (still no watchdog at all — operator opted out).
Pure-function coverage of the ceiling decision stays in
`resolveActiveInactivityTimeoutMs` — exercised across the same four
combinations in the existing unit suite.
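Sketch of why the unconditional call matters, under stated assumptions: `createWatchdog`, `isArmed`, and `stop` are hypothetical names for this example, and clearing the old timer before re-evaluating the delay is a choice of the sketch rather than something the commit confirms about the daemon.

```typescript
interface Watchdog {
  noteAgentActivity(): void;
  noteArtifactRegistered(): void;
  isArmed(): boolean;
  stop(): void;
}

function createWatchdog(
  preArtifactMs: number,
  quietPeriodMs: number,
  onTimeout: () => void,
): Watchdog {
  let timer: ReturnType<typeof setTimeout> | null = null;
  let artifactRegistered = false;

  const activeDelayMs = (): number =>
    artifactRegistered && quietPeriodMs > 0 ? quietPeriodMs : preArtifactMs;

  const clear = (): void => {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  };

  const noteAgentActivity = (): void => {
    clear();
    const delay = activeDelayMs();
    if (delay <= 0) return; // watchdog disabled for the current phase
    timer = setTimeout(onTimeout, delay);
  };

  return {
    noteAgentActivity,
    noteArtifactRegistered(): void {
      artifactRegistered = true;
      // No `if (timer)` guard: with pre=0 the timer is still null here,
      // and the guard would skip arming the quiet-period timer.
      noteAgentActivity();
    },
    isArmed: () => timer !== null,
    stop: clear,
  };
}
```

With `preArtifactMs = 0` and a positive `quietPeriodMs`, the watchdog is unarmed at run start and becomes armed as soon as `noteArtifactRegistered()` runs.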
* fix(daemon): restore full assistant turn after mid-flight reload reattach
When a daemon run is in progress and the browser reloads, the client
reattaches and the artifact recovers, but the restored chat turn drops
assistant text, thinking events, and producedFiles. Three independent
defects combine to cause this:
1. The reattach onDone never populated producedFiles. The pre-turn file
snapshot used as the diff baseline lived only in a closure. Now it is
persisted on the assistant message as preTurnFileNames so the reattach
path can rebuild the diff after reload.
2. The SSE replay used a strict `>` cursor comparison. A client that had
already persisted lastRunEventId equal to the final event id received
zero replay events on terminal-run reattach, fell into the status-only
REST fallback, and never fired a clean onDone. The server now replays
the final buffered event on terminal-run reattach when the cursor is
at or past the end, so the client always sees a terminal signal.
3. The text buffer flushed on visibilitychange but not on pagehide.
Hard reloads on browsers where visibilitychange does not fire before
teardown could lose the last ~250ms of streamed text from the
persisted message. A pagehide listener now flushes synchronously.
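The replay rule from item 2 can be sketched as a pure selection over the buffered events. This is an illustration only: `selectReplayEvents`, the `RunEvent` shape, and numeric event ids are assumptions, not the daemon's actual API.

```typescript
// Hypothetical event shape; ids are assumed to be increasing numbers.
interface RunEvent {
  id: number;
  type: string;
}

function selectReplayEvents(
  buffer: readonly RunEvent[],
  lastRunEventId: number,
  runIsTerminal: boolean,
): RunEvent[] {
  // Normal branch: replay only events strictly newer than the cursor,
  // so a live reattach never sees duplicates.
  const newer = buffer.filter((event) => event.id > lastRunEventId);
  if (newer.length > 0) return newer;

  // Safety branch: the cursor is at or past the end of a finished run.
  // Replay the final buffered event so the client still gets a terminal
  // signal instead of falling back to status-only polling.
  if (runIsTerminal) return buffer.slice(-1);

  return [];
}
```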
Refactor: extracted computeProducedFiles helper so the send and reattach
flows share the diff logic and cannot drift apart again.
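For illustration, a shared diff helper might look like the sketch below. The signature and the `ProducedFile` shape are assumptions; the commit only states that both flows diff the current files against the pre-turn baseline.

```typescript
// Hypothetical file shape; only `name` matters for the diff.
interface ProducedFile {
  name: string;
}

function computeProducedFiles(
  preTurnFileNames: readonly string[],
  nextFiles: readonly ProducedFile[],
): ProducedFile[] {
  const baseline = new Set(preTurnFileNames);
  // Anything not present before the turn started was produced by it.
  return nextFiles.filter((file) => !baseline.has(file.name));
}

const produced = computeProducedFiles(
  ['index.html'],
  [{ name: 'index.html' }, { name: 'about.html' }, { name: 'style.css' }],
);
console.log(produced.map((f) => f.name)); // [ 'about.html', 'style.css' ]
```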
Tests:
- apps/web/tests/components/ProjectView.reattach-restore.test.tsx
covers: reattach onDone populates producedFiles from preTurnFileNames;
reattach reaches succeeded via SSE even when only the end event replays;
computeProducedFiles unit cases.
- apps/daemon/tests/runs.test.ts adds replay-cursor coverage for both
the terminal-replay safety branch and the no-duplicate normal branch.
* fix(daemon): persist preTurnFileNames end-to-end on the messages table
Review on #2383 caught that `ChatMessage.preTurnFileNames` (added in
packages/contracts) had no daemon-side persistence: the messages
schema, upsertMessage, and normalizeMessage all ignored the field.
saveMessage() would PUT the field, the daemon would silently drop it,
and a real page reload would read a row without `preTurnFileNames`, so
the reattach onDone fell back to `new Set(nextFiles.map(...))` and
still missed files produced earlier in the turn.
This commit closes the round trip:
- New `pre_turn_file_names_json TEXT` column on the messages table,
with a forward-compatible ALTER for existing databases (same pattern
as agent_id / feedback_json / run_status).
- Both upsertMessage branches (UPDATE and INSERT) now serialize
m.preTurnFileNames into the new column.
- listMessages, the post-upsert readback SELECT, and normalizeMessage
surface the column back to callers.
Round-trip tests in apps/daemon/tests/db-pre-turn-file-names.test.ts
cover: write+listMessages, the UPDATE upsert path preserving the
baseline, and a legacy-row case returning undefined.
* fix(web): preserve terminal status + full multi-file diff on reattach
Two correctness issues caught in review of the prior reattach commits:
1. The reattach onDone path hard-coded `runStatus: 'succeeded'`, which
overwrote a 'failed' or 'canceled' status that the replayed terminal
event had already recorded via onRunStatus. Restored messages would
come back as success even when the run had actually failed or been
canceled. Now derives the final status from `prev.runStatus` via the
existing `resolveSucceededRunStatus` helper, mirroring the send path
at line 2333.
2. When `findExistingArtifactProjectFile()` recovered an existing
on-disk artifact, the produced-files list was replaced with that
single file, dropping any other files the turn had created earlier.
Now always computes the full diff against `preTurnFileNames`, then
appends the recovered artifact only if it isn't already in that
set. Extracted as `mergeRecoveredArtifact(diff, recovered)` so the
logic is a unit-testable invariant.
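A minimal sketch of the merge invariant from item 2. The `(diff, recovered)` parameter order follows the commit text; the `ArtifactFile` shape and matching by `name` are assumptions for the example.

```typescript
// Hypothetical file shape for the sketch.
interface ArtifactFile {
  name: string;
}

function mergeRecoveredArtifact(
  diff: readonly ArtifactFile[],
  recovered: ArtifactFile | null,
): ArtifactFile[] {
  // Passthrough when nothing was recovered from disk.
  if (recovered === null) return [...diff];
  // Append only if the diff does not already list the recovered file.
  const alreadyListed = diff.some((file) => file.name === recovered.name);
  return alreadyListed ? [...diff] : [...diff, recovered];
}
```

The three cases named in the tests map directly onto the three return paths: append, no duplication, and passthrough.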
Tests in ProjectView.reattach-restore.test.tsx:
- mergeRecoveredArtifact: three cases (recovered appended to pre-files,
no duplication when already in the diff, passthrough on no recovery).
- reattach failed-status: onRunStatus('failed') → onDone → final
saveMessage has runStatus 'failed', not 'succeeded'.
- reattach canceled-status: same shape for cancellation.
* fix(web): force keepalive PUT on pagehide so the last buffered chunk survives reload
Review on #2383 caught that onPageHide() only called flush(), which
updates React state then schedules persistSoon() — a 500ms debounce.
On a hard reload the page tears down before that timer fires, so the
final ~250ms of streamed text never reaches the daemon.
Threaded a new flushAndPersistNow() callback through
createBufferedTextUpdates(). Both buffer call sites (send-path +
reattach-path) supply it backed by persistMessageById(id, { keepalive:
true }). saveMessage in state/projects.ts forwards the new
SaveMessageOptions.keepalive flag onto fetch's keepalive option, which
the browser honors specifically for unload-time requests.
onPageHide now calls flush() followed by flushAndPersistNow?.(), so:
- flush() pushes the buffered delta into React state synchronously
- the immediate persistMessageById then PUTs the updated message with
keepalive:true, surviving document teardown
Regression test in ProjectView.reattach-restore.test.tsx: stream a
delta, dispatch pagehide, assert saveMessage was called with the
flushed content AND { keepalive: true } before the 500ms debounce
would otherwise have fired.
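Sketch of the pagehide path under stated assumptions. The `fetch` implementation is injected so the example runs without a browser; `FetchLike`, `createPageHideHandler`, the message shape, and the URL are hypothetical. Only `flush`, `flushAndPersistNow`, `SaveMessageOptions.keepalive`, and the `keepalive` fetch option come from the commit text.

```typescript
interface SaveMessageOptions {
  keepalive?: boolean;
}

interface FetchInit {
  method: string;
  headers: Record<string, string>;
  body: string;
  keepalive: boolean;
}

type FetchLike = (url: string, init: FetchInit) => Promise<unknown>;

function saveMessage(
  fetchImpl: FetchLike,
  url: string,
  message: { id: string; content: string },
  options: SaveMessageOptions = {},
): Promise<unknown> {
  return fetchImpl(url, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(message),
    // Forwarded to fetch so the request can outlive document teardown.
    keepalive: options.keepalive === true,
  });
}

function createPageHideHandler(
  flush: () => void,
  flushAndPersistNow?: () => void,
): () => void {
  return () => {
    flush(); // push the buffered delta into state synchronously
    flushAndPersistNow?.(); // then persist immediately, skipping the debounce
  };
}
```

In a browser the handler would be registered with `window.addEventListener('pagehide', handler)`, and `fetchImpl` would be the global `fetch`.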
Lands the v2 PostHog spec's P0 design-system event family: five new
result events covering source ingest, create, review, status, and
picker apply; the existing file_upload_result + run_created/run_finished
schemas widened to discriminate DS workspaces from regular chat runs.
Contract (packages/contracts/src/analytics/events.ts):
- AnalyticsEventName gains design_system_{source_ingest,create,review,
status,apply}_result.
- Props interfaces + bucket/origin/method/status enums per spec.
- TrackingProjectKind gains 'design_system' for DS-as-project runs.
- RunCreatedProps / RunFinishedProps widen page_name+area to discriminate
chat_panel vs design_system_project; entry_from union accepts DS values;
DS-variant context fields (ds_source_origin, source_count, brand
description length bucket, per-source counts, design_system_created,
preview_module_count, missing_font_count).
- FileUploadSurface union adds design_systems / design_system_source.
- Bucket helpers (designSystemLengthBucket, folderCountBucket,
totalSizeBucket), module slug + type derivation, repo host parser.
Web emission sites:
- DesignSystemFlow.generate(): create_result + threads
prepareCreatedDesignSystemProject with analyticsTrack so each of the
4 source paths emits source_ingest_result (success / partial / failed
/ empty), repo-host dominance, fallback type from connector status.
- DropZone onFiles handlers: file_upload_result with deriveUploadCohort.
- DesignSystemDetailView: status_result on togglePublished + Make-default,
review_result on Looks-good / Needs-work; module_id from markdown
section header slug (designSystemModuleSlug), module_type via keyword
heuristic.
- DesignSystemsTab: status_result on publish toggle, set/unset default,
delete (incl. cancelled when window.confirm dismissed).
- NewProjectPanel: apply_result on DS picker change (manual select +
clear) plus an auto_select emit when the picker mounts with a default
DS not yet user-touched.
- ProjectView.streamViaDaemon: when project.metadata.importedFrom ===
'design-system', pass analyticsHints with entry_from
(onboarding_design_system for the auto-sent first message,
regenerate_from_review for subsequent sends), projectKind=design_system,
designSystemRunContext.
Daemon:
- ChatRequest gains optional analyticsHints (entryFrom / projectKind /
designSystemRunContext). Behavior never depends on these; only PostHog
props do.
- /api/runs handler reads analyticsHints to flip baseProps to the DS
variant (page_name=design_system_project, area=design_system_generation,
project_kind=design_system) when the run is DS-flagged, and spreads the
DS context fields onto run_created.
- run_finished mirrors the DS area + adds design_system_created (true iff
the run wrote DESIGN.md), preview_module_count (distinct preview/*.html
writes), missing_font_count (0 placeholder; pending font-audit hook).
- run-artifacts.ts: extracts collectWrittenPathsMatching as the shared
Write/Edit + isError-pair core; adds didRunCreateDesignSystemFile and
countDesignSystemPreviewModules using the same dedup + failure-skip
invariants as countNewHtmlArtifacts.
Tests:
- packages/contracts/tests/analytics-design-system-helpers.test.ts: 18
new test cases over the bucket helpers, module slug + type mapping,
repo host parser.
- apps/daemon/tests/run-artifacts.test.ts: 9 new tests for
didRunCreateDesignSystemFile + countDesignSystemPreviewModules covering
Write-then-Edit dedupe, case-insensitive DESIGN.md match, isError pair
skip, preview/index.html as a module, non-preview path rejection.
Targets release/v0.8.0.
* feat(daemon): add CTA hierarchy static QA pass
Introduce apps/daemon/src/qa/cta-hierarchy.ts exporting a pure
analyseCtaHierarchy(html) that parses generated prototypes with cheerio
and flags three precision-biased findings: multiple-primary CTAs in the
same section, ambiguous-weight (all CTAs share identical class + inline
style), and misleading-prominence (secondary-coded copy like "Learn
more" / "了解更多" styled with primary weight).
CTA candidates come from <button>, <a>, role="button" with btn/button/cta
class markers plus CTA copy keywords covering both English (Get started,
Sign up, Buy, Subscribe, Learn more, ...) and Chinese (立即购买,
立即下单, 了解更多, ...). Weight is inferred from class tokens
(primary/solid/filled/accent/cta) and from non-transparent inline
background-color, matching the inverse of the issue #2251 sample where
the header CTA was rendered with the neutral .btn style.
This PR only ships the pure function plus its tests. HTTP route, CLI
subcommand, and any auto-repair feedback loop are deliberate follow-ups
so the first cut can land without touching the daemon HTTP surface.
Refs #2251
* fix(qa): respect container boundaries in CTA hierarchy heuristics
Two precision fixes from review of #2427:
- computeContainerKey()'s parent fallback keyed by tag name alone, so
flat layouts like <div><a class=btn-primary>...</a></div> repeated
for sibling cards all landed in 'parent:div' and
detectMultiplePrimary() reported a fake shared-section conflict on
what is in fact one CTA per card. Switch to parent-node identity
(positional index of the matched parent within its tag group, same
trick the landmark branch already uses), so each sibling wrapper
gets its own bucket.
- detectAmbiguousWeight() compared signatures across the entire
document, so two unrelated sections each containing one '.btn' CTA
with matching style would trigger 'ambiguous-weight' despite neither
container having 2+ CTAs. The PR body's rule is narrower — 'every
CTA in a container shares the same class + inline style' — so bucket
by containerKey first and only emit the finding for containers with
2+ CTAs whose signatures are identical.
Tests lock both behaviors down:
- sibling <div> card-grid without a landmark ancestor stays under the
multiple-primary threshold;
- one-CTA-per-section pairs stay under the ambiguous-weight threshold.
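The per-container rule for ambiguous-weight can be sketched without cheerio by working on already-extracted candidates. `CtaCandidate` and `findAmbiguousWeightContainers` are hypothetical names for this example; the real detector operates on parsed HTML.

```typescript
// Hypothetical pre-extracted CTA: its container bucket plus a signature
// built from class and inline style.
interface CtaCandidate {
  containerKey: string;
  signature: string;
}

function findAmbiguousWeightContainers(
  ctas: readonly CtaCandidate[],
): string[] {
  // Bucket signatures by container first, instead of comparing them
  // across the whole document.
  const byContainer = new Map<string, string[]>();
  for (const cta of ctas) {
    const signatures = byContainer.get(cta.containerKey) ?? [];
    signatures.push(cta.signature);
    byContainer.set(cta.containerKey, signatures);
  }

  // Flag only containers with 2+ CTAs whose signatures are all identical.
  const flagged: string[] = [];
  byContainer.forEach((signatures, key) => {
    if (signatures.length >= 2 && new Set(signatures).size === 1) {
      flagged.push(key);
    }
  });
  return flagged;
}
```

Two sections that each hold a single `.btn` CTA produce two one-element buckets, so neither is flagged.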
* fix(daemon): remove 10-item cap from discovery TodoWrite plan prompt
The RULE 3 sentence in DISCOVERY_AND_PHILOSOPHY told the model to write
'a plan of 5–10 short imperative items'. That upper bound caused the agent
to cap every plan at exactly ten steps even when the task genuinely needed
more. The TodoWrite JSON schema imposes no maxItems constraint, so the cap
was entirely prompt-driven.
Replace '5–10 short imperative items' with 'short imperative items covering
the work'. The TodoWrite intent, the RULE 3 label, and the
planning-before-building requirement all survive unchanged.
Red spec: apps/daemon/tests/prompts/discovery-todo-cap.test.ts
* fix(prompts): remove 10-item cap from contracts discovery copy and harden tests
[pass-6,7 BLOCKER] packages/contracts/src/prompts/discovery.ts still had
the old '5-10 short imperative items' wording. apps/web imports
composeSystemPrompt from @open-design/contracts (ProjectView.tsx:43),
so web-originated chat runs were still subject to the cap.
[pass-8 WARNING] discovery-todo-cap.test.ts did not cover the contracts
copy, leaving that path unguarded. There was also no guard against a
semantically equivalent re-introduction via 'at most / maximum / no more than'.
Changes:
- packages/contracts/src/prompts/discovery.ts: apply same wording fix as
apps/daemon; add inline rationale comment
- apps/daemon/src/prompts/discovery.ts: add inline rationale comment
- apps/daemon/tests/prompts/discovery-todo-cap.test.ts: add 4th assertion
blocking 'at most|maximum|no more than N item' re-introduction
- packages/contracts/tests/system-prompt.test.ts: add 5-assertion suite
guarding the contracts copy and composed prompt output
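A guard of the kind described above could be a single regular expression over the prompt text. The pattern below is an assumption written for this example, not the one in discovery-todo-cap.test.ts.

```typescript
// Hypothetical pattern: "at most N items", "maximum (of) N items",
// "no more than N items", case-insensitive.
const ITEM_CAP_PATTERN =
  /\b(?:at most|maximum(?: of)?|no more than)\s+\d+\s+items?\b/i;

function hasItemCap(prompt: string): boolean {
  return ITEM_CAP_PATTERN.test(prompt);
}

console.log(hasItemCap('write a plan of at most 10 items')); // true
console.log(hasItemCap('a maximum of 8 items')); // true
console.log(hasItemCap('short imperative items covering the work')); // false
```

A test would assert `hasItemCap(composedPrompt) === false` for both the daemon copy and the contracts copy.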
The post-onboarding disclosure modal (App.tsx:349 — `showPrivacyConsent =
... && config.onboardingCompleted === true`) only renders after the user
completes the welcome flow. Before that, `config.telemetry` is undefined
and both the daemon-side gate
(`analytics.ts: if (cfg.telemetry?.metrics !== true) return`) and the
web-side `/api/analytics/config` drop every event the onboarding view
fires.
E2E on nightly.10 (QA, 2026-05-22 06:15+ UTC) confirmed the symptom: a
real user completed the full Connect → About you → Design system → Generate
flow but PostHog received zero `page_view pn=onboarding` / `ui_click` /
`onboarding_runtime_scan_result` / `onboarding_complete_result` rows from
their distinct_id. Other events (post-onboarding home, settings, project
creation) flowed normally because by then the disclosure had been
accepted and `telemetry.metrics` was true.
Product decision (2026-05-22): default telemetry ON. The disclosure
modal stays disclosure-style ("I get it") and Settings → Privacy
remains the one-click opt-out — same UX, only the pre-decision default
changes from off to on.
Changes:
- `apps/web/src/state/config.ts`: `DEFAULT_CONFIG.telemetry =
{ metrics: true, content: true, artifactManifest: false }` so fresh
`loadConfig()` calls emit during the first onboarding render.
- `apps/web/src/types.ts`: comment now documents the default-on
semantics + the opt-out path.
- `apps/daemon/src/app-config.ts`: new `applyTelemetryDefaults` helper
fills in the same defaults when `readAppConfig` finds no telemetry
field on disk. Helper runs on BOTH the
installation-file-shadowing path and the fallback path. An
explicit user opt-out (`metrics: false`) is preserved untouched —
defaults only fill `undefined`, never overwrite a saved value.
- `apps/daemon/tests/app-config.test.ts`: 49 → 51 tests. Updated 9
existing assertions that expected `cfg.telemetry === undefined` /
`cfg === {}` to expect the new default; added 2 regression guards:
- "preserves an explicit telemetry opt-out across reads" pins the
`metrics: false` invariant so a future refactor can't silently
re-enable opted-out users.
- "preserves a partial explicit telemetry (metrics on, content off)"
pins per-field user choices against the default fill.
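Sketch of the default-fill rule, assuming defaults apply only when the whole `telemetry` field is missing (the commit says the helper runs when no telemetry field is found on disk). The `AppConfig` shape here is trimmed to what the example needs.

```typescript
interface TelemetryConfig {
  metrics?: boolean;
  content?: boolean;
  artifactManifest?: boolean;
}

// Trimmed, hypothetical config shape for the sketch.
interface AppConfig {
  telemetry?: TelemetryConfig;
  onboardingCompleted?: boolean;
}

const DEFAULT_TELEMETRY: TelemetryConfig = {
  metrics: true,
  content: true,
  artifactManifest: false,
};

function applyTelemetryDefaults(cfg: AppConfig): AppConfig {
  // A saved value, including an explicit opt-out, is never overwritten.
  if (cfg.telemetry !== undefined) return cfg;
  return { ...cfg, telemetry: { ...DEFAULT_TELEMETRY } };
}

console.log(applyTelemetryDefaults({}).telemetry?.metrics); // true
console.log(
  applyTelemetryDefaults({ telemetry: { metrics: false } }).telemetry?.metrics,
); // false
```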
Validation:
- `pnpm --filter @open-design/daemon exec vitest run tests/app-config.test.ts` ✅ 51/51
- `pnpm --filter @open-design/web typecheck` ✅
- `pnpm --filter @open-design/daemon typecheck` ✅
- `pnpm --filter @open-design/web test` ✅ 201 files / 1839 tests
* refactor(daemon): introduce HTTP Request Adapter + typed Deps (proof on active-context-routes)
Adds a typed HTTP boundary Adapter under apps/daemon/src/http/ that replaces
the untyped ServerContext service-locator pattern (30+ fields, mostly any)
for route handlers. Routes become pure (input, deps) -> Result<output>
functions, unit-testable without Express or supertest.
Six new modules under apps/daemon/src/http/:
- types.ts Result<T,E>, ok(), err(), JsonRouteSpec, Handler,
RouteInputContext, HttpMethod, InputParser
- parse.ts rawInput(req), validationError(message, issues?)
- response.ts sendJson(), sendApiError(), statusForError() +
ERROR_STATUS_BY_CODE map
- origin-guard.ts guardSameOrigin(req, origin) wrapping isLocalSameOrigin
as a Result
- adapter.ts defineJsonRoute(), mountJsonRoute() (only place that
knows about req/res)
- index.ts barrel
active-context-routes.ts migrated as proof of pattern. parsePostActive(),
handlePostActive(), handleGetActive() are now pure functions; postActiveRoute
and getActiveRoute are exported route specs. The wire signature
registerActiveContextRoutes(app, ctx) is preserved so server.ts is untouched.
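A rough sketch of the (input, deps) -> Result shape, to show why such handlers need no Express in tests. Everything below is an assumption for illustration: the `Result` layout, the `BAD_REQUEST` code, the `projectId` field, the `ActiveContextDeps` interface, and `toResponse` are hypothetical. Only the names `Result`, `ok`, `err`, `parsePostActive`, `handlePostActive`, `ERROR_STATUS_BY_CODE`, and the structured `{ error: { code, message } }` wire shape appear in the commit text.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

function ok<T>(value: T): Result<T, never> {
  return { ok: true, value };
}

function err<E>(error: E): Result<never, E> {
  return { ok: false, error };
}

interface ApiError {
  code: 'BAD_REQUEST' | 'FORBIDDEN';
  message: string;
}

const ERROR_STATUS_BY_CODE: Record<ApiError['code'], number> = {
  BAD_REQUEST: 400,
  FORBIDDEN: 403,
};

// Typed dependencies replace the untyped service locator.
interface ActiveContextDeps {
  setActive(projectId: string): void;
}

function parsePostActive(raw: unknown): Result<{ projectId: string }, ApiError> {
  if (typeof raw === 'object' && raw !== null && 'projectId' in raw) {
    const projectId = (raw as { projectId: unknown }).projectId;
    if (typeof projectId === 'string' && projectId.length > 0) {
      return ok({ projectId });
    }
  }
  return err({ code: 'BAD_REQUEST', message: 'projectId must be a non-empty string' });
}

function handlePostActive(
  input: { projectId: string },
  deps: ActiveContextDeps,
): Result<{ active: string }, ApiError> {
  deps.setActive(input.projectId);
  return ok({ active: input.projectId });
}

// The adapter is the only layer that turns a Result into status + body.
function toResponse<T>(result: Result<T, ApiError>): { status: number; body: unknown } {
  if (result.ok) return { status: 200, body: result.value };
  return {
    status: ERROR_STATUS_BY_CODE[result.error.code],
    body: { error: { code: result.error.code, message: result.error.message } },
  };
}
```

A unit test can call `parsePostActive` and `handlePostActive` with a stub `deps` object and assert on the returned `Result` directly.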
Spec at specs/current/daemon-http-adapter.md captures the strangler migration
order for the remaining route files (mcp-routes, chat-routes, artifact
routes, etc.) and a StreamRoute follow-up where the Run Orchestrator lands.
Wire-format note: cross-origin response moves from the legacy
{ error: 'cross-origin request rejected' } shape to the structured
{ error: { code: 'FORBIDDEN', message: ... } } shape. Backwards-compatible
via the existing CompatibleErrorResponse = ApiErrorResponse | LegacyErrorResponse
union in @open-design/contracts.
Validation:
- pnpm install (post-rebase, exit 0)
- pnpm --filter @open-design/daemon typecheck (both tsconfig.json and
tsconfig.tests.json silent => pass)
- pnpm --filter @open-design/daemon test: 15 new tests pass
(tests/http/adapter.test.ts + tests/active-context-routes.test.ts).
84 pre-existing failures across 23 files are unchanged and unrelated
to this PR (Windows symlink / short-name / colon-in-filename, upstream
behavior drift, missing plugin marketplace fixtures, and a freshly-
added tools-connectors-cli suite of 38 failures that landed during
the rebase).
Sharpens W4/W5 of specs/current/maintainability-roadmap.md and unlocks
W6 (Run Orchestrator).
* chore: add core-js, electron-winstaller, protobufjs, sharp to pnpm.onlyBuiltDependencies