open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-05-31 19:04:39 +07:00

Author	SHA1	Message	Date
lefarcen	df8a0faff6	feat(runtimes): register AMR (vela) as an ACP stdio agent (#2355 ) * feat(runtimes): register AMR (vela) as an ACP stdio agent AMR is the vela CLI's ACP runtime mode. `vela agent run --runtime opencode` speaks ACP JSON-RPC over stdio (see vela's `specs/current/runtime/manual-agent-run-openrouter.md`); per `docs/new-agent-runtime-acp.md` we expose it through the same `streamFormat: 'acp-json-rpc'` transport that already powers Hermes, Devin, Kimi, etc. The new `defs/amr.ts` is the entire wiring — `buildArgs` returns `['agent', 'run', '--runtime', 'opencode']`, `fetchModels` reuses `detectAcpModels`, and the fallback list seeds the OpenRouter ids vela's e2e baseline uses. `executables.ts`/`app-config.ts`/`metadata.ts` get the matching `VELA_BIN`/`VELA_LINK_URL`/`VELA_RUNTIME_KEY`/`VELA_OPENCODE_BIN` allowlist + install/docs URLs, so users can configure the per-agent env in Settings without leaking into other adapters. Coverage: `tests/fixtures/fake-vela.mjs` is a minimal ACP stub that returns the documented `initialize` / `session/new` / `session/set_model` / `session/prompt` shapes; `tests/amr-acp-integration.test.ts` spawns it via `child_process.spawn` and drives a full turn through `attachAcpSession` and `detectAcpModels`, so the ACP transport contract for AMR is end-to-end verified locally even before a real `vela` binary is installed. Validated: - pnpm guard - pnpm typecheck (all workspace projects) - pnpm --filter @open-design/daemon test (2881/2881) Deferred: real OpenRouter-backed turn through a built `vela` binary — the runtime def needs no changes for that path, only `VELA_RUNTIME_KEY` and `VELA_LINK_URL` in env (or Settings). * fix(runtimes/amr): pin a concrete default model and bare openai ids End-to-end validation against a freshly-built `vela` (nexu-io/vela@main) + OpenRouter surfaced two contract details the first AMR runtime def got wrong: 1. 
vela rejects `session/prompt` with `session/set_model must be called before session/prompt`. attachAcpSession in apps/daemon/src/acp.ts skips set_model whenever the picked model is the synthetic 'default' id, so AMR's fallback list must NOT include DEFAULT_MODEL_OPTION. The def now ships a concrete `gpt-5.4-mini` as both `fetchModels`' default option and `fallbackModels[0]`, which makes attachAcpSession always send a real `session/set_model` for AMR turns. 2. `vela --runtime opencode` auto-prepends `openai/` to whatever modelId it forwards to opencode's openai provider. With OpenRouter-style ids like `openai/gpt-5.4-mini`, opencode receives the double-prefixed `openai/openai/gpt-5.4-mini` and replies `ProviderModelNotFoundError`. The new fallback list ships the bare ids opencode's openai registry actually knows about (gpt-5.4, gpt-5.4-mini, gpt-5.4-fast, etc.). Stub + tests: - tests/fixtures/fake-vela.mjs now enforces the set_model gate the same way real vela does, so a regression that silently goes back to model: 'default' would surface as a fatal error in tests instead of a hidden production failure. - tests/amr-acp-integration.test.ts pins both contracts: no 'default' / no 'openai/' prefix in fallbackModels, and a negative case that asserts session/prompt fails when no model is set. Adds `apps/daemon/scripts/verify-amr-real-vela.mjs` — a small dev-time runner that drives `attachAcpSession` against a real `vela` binary and prints the daemon's chat events, so future protocol drift can be checked against an actual OpenRouter call. Verified locally: `vela agent run --runtime opencode` + OpenRouter returns the prompted string ("AMR-E2E-PASS") through the full daemon pipeline; daemon test suite stays 2883/2883. 
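As a rough illustration of the two contract details above, the following sketch models the `session/set_model` gate and the `openai/` double prefix in memory. `FakeAcpSession`, `forwardToOpencode`, and `findInvalidFallbackIds` are hypothetical names invented for this example; they are not vela's or the daemon's actual code.

```typescript
// Hypothetical in-memory model of the two AMR contract details.
// None of these names exist in vela or the daemon; they only illustrate the behavior.

const SET_MODEL_ERROR = "session/set_model must be called before session/prompt";

// Assumed behavior: `vela --runtime opencode` prepends `openai/` to the id it forwards.
function forwardToOpencode(modelId: string): string {
  return `openai/${modelId}`;
}

class FakeAcpSession {
  private modelId: string | null = null;

  setModel(modelId: string): void {
    this.modelId = modelId;
  }

  // Rejects the prompt unless a model was set first, like the gate described above.
  prompt(text: string): { forwardedModel: string; text: string } {
    if (this.modelId === null) {
      throw new Error(SET_MODEL_ERROR);
    }
    return { forwardedModel: forwardToOpencode(this.modelId), text };
  }
}

// Flags fallback ids that would break an AMR turn: the synthetic 'default' id
// (set_model would be skipped) and OpenRouter-style ids (double prefix).
function findInvalidFallbackIds(ids: readonly string[]): string[] {
  return ids.filter((id) => id === "default" || id.startsWith("openai/"));
}
```

Under these assumptions, a fallback list of bare ids such as `gpt-5.4-mini` passes the check, while `default` and `openai/gpt-5.4-mini` are both flagged.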
* fix(runtimes/amr): substitute concrete model when chat run sends 'default' A plugin-driven AMR run from the UI surfaced a real-world hole in the prior commit: json-rpc id 3: session/set_model must be called before session/prompt The Default-design-router plugin (and any caller that doesn't pin a real model) sends `model: 'default'` straight through, which the AMR runtime def cannot accept — vela rejects `session/prompt` without `session/set_model` and attachAcpSession skips set_model whenever model === 'default'. Just leaving DEFAULT_MODEL_OPTION out of the adapter's `fallbackModels` is not enough: the chat-run handler in server.ts still forwarded 'default' verbatim. This adds `resolveModelForAgent(def, resolved, env?)` as the single source of truth for the substitution: 1. If the caller picked a real id, pass it through. 2. Else, if `def.defaultModelEnvVar` is set and the daemon process env has a non-empty value for it, return that (operator escape hatch — see below). 3. Else, if the def's `fallbackModels` does NOT contain a 'default' id, return `fallbackModels[0].id`. 4. Else, return the original value (the historic shape — defs that list 'default' themselves are untouched). AMR sets `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`, so when opencode's openai-provider registry deprecates `gpt-5.4-mini` upstream, an operator can swap the fallback id without a code change by exporting `VELA_DEFAULT_MODEL=gpt-5.5` before launching tools-dev / od. Worth noting the env var must live in the daemon's `process.env` (Settings-UI per-agent env values only reach the spawned child, not the daemon's resolver) — the new field's docblock spells this out. Coverage: - `tests/runtimes/resolve-model.test.ts` — 8 unit tests covering all four resolver branches plus the env-override happy path / fallback / ignore-when-user-picked-a-real-id case. - `pnpm --filter @open-design/daemon typecheck` clean. 
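The four resolver branches listed above can be sketched roughly as follows. This is a minimal reading of the commit message, not the daemon's actual implementation: the `AgentDef` and `ModelOption` shapes are assumptions, the `env` parameter defaults to an empty object here (the real helper presumably reads the daemon's `process.env`), and treating `null` / `undefined` the same as `'default'` is also an assumption.

```typescript
// Hypothetical shapes; the real definitions live in the daemon and may differ.
interface ModelOption {
  id: string;
}

interface AgentDef {
  id: string;
  fallbackModels: ModelOption[];
  defaultModelEnvVar?: string;
}

const DEFAULT_MODEL_ID = "default";

function resolveModelForAgent(
  def: AgentDef,
  resolved: string | null | undefined,
  env: Record<string, string | undefined> = {},
): string | null | undefined {
  // 1. A real id picked by the caller passes through untouched.
  if (resolved && resolved !== DEFAULT_MODEL_ID) return resolved;

  // 2. Operator escape hatch: a non-empty value in the def's env var wins.
  if (def.defaultModelEnvVar) {
    const override = env[def.defaultModelEnvVar]?.trim();
    if (override) return override;
  }

  // 3. Defs that do not list 'default' themselves get their first concrete fallback.
  const listsDefault = def.fallbackModels.some((m) => m.id === DEFAULT_MODEL_ID);
  const first = def.fallbackModels[0];
  if (!listsDefault && first) return first.id;

  // 4. Historic shape: defs that list 'default' keep the original value.
  return resolved;
}
```

With an AMR-like def (`fallbackModels: [{ id: 'gpt-5.4-mini' }]`, `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`), a `'default'` request resolves to `gpt-5.4-mini` unless the env override is set.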
* chore(runtimes/amr): move AMR to the top of the base agent list So `AMR (vela)` shows up first in the agent picker / status views, ahead of claude / codex. Pure ordering change; no behavior delta. * feat(amr): Sign-in / Sign-out button on the AMR Settings card The first half of the AMR work assumed the operator would set VELA_RUNTIME_KEY / VELA_LINK_URL on the daemon process and never surfaced login state to users. This adds the missing UX so a fresh install can drive the full path from Settings: - GET /api/integrations/vela/status reads ~/.vela/config.json for the active profile and returns { loggedIn, profile, user } (without leaking the runtime/control keys themselves). - POST /api/integrations/vela/login spawns `vela login` once (409 if one is already in flight). The vela CLI opens the user's browser to the device-authorization page itself — Open Design only needs to kick the subprocess off. - POST /api/integrations/vela/logout removes ~/.vela/config.json so the next status read returns logged-out. `AmrAgentCard` is a dedicated agent-card component for AMR because the existing `<button>` row can't host an interactive sub-control (nested interactive elements). It polls /status after a login click until the daemon reports loggedIn=true (or 5 minutes elapse), and exposes a Sign-out action on hover. Other adapters (claude, codex, hermes, …) keep their existing `<button>` card. i18n: 8 new keys (settings.amrLogin / Logout / LoggingIn / etc.) added to en + zh-CN. Other locales spread `en` and inherit the English copy until translations land. Coverage: - `tests/integrations/vela.test.ts` pins the config.json reader against a tmp HOME — including the negative case where a profile has user info but no runtimeKey (still logged-out), and the secret-leak guard ("rt-secret-" must not appear in the projection payload). 
- `tests/components/AmrAgentCard.test.tsx` covers all four UI states (logged-out, logging-in, logged-in, logging-out) plus the click-propagation invariant the divergent card was built to keep. `pnpm --filter @open-design/daemon test` 2901 / 2901 passing. `pnpm --filter @open-design/web test` 1719 / 1719 passing. `pnpm typecheck` + `pnpm guard` clean. Dev script side-effects: `apps/daemon/scripts/verify-amr-real-vela.mjs` no longer requires both VELA_RUNTIME_KEY and VELA_LINK_URL — if VELA_PROFILE is set, the vela CLI is allowed to resolve credentials from `~/.vela/config.json`. Added the two AMR `.mjs` fixtures to `scripts/guard.ts` allowlist with the executable-fixture / dev-runner rationale. fix(connection-test): substitute model for AMR before attachAcpSession The chat-run path in server.ts already routes the requested model through `resolveModelForAgent` so AMR / vela (whose CLI demands an explicit `session/set_model` before `session/prompt`) gets the def's first concrete fallback id when the chat run ships `model: 'default'`. `connectionTest.ts` was wiring `attachAcpSession({ ..., model: model ?? null })` directly, which made the Test Connection button on the AMR Settings card deadlock with the same `session/set_model must be called before session/prompt` error the chat-run path already handles — surfaced as a permanent "Testing connection…" spinner in the UI. Reuse the same helper here so Test Connection mirrors chat-run behavior. * test(amr): three-layer end-to-end coverage for the AMR login + turn flow The PR up to this point shipped runtime + UI code with unit-level Vitest coverage. This commit adds the cross-layer regression net the live demo relied on: 1. 
apps/daemon/tests/integrations/vela.routes.test.ts (HTTP, Vitest) Spins up the real daemon Express app via `startServer({port:0,...})`, persists `agentCliEnv.amr.VELA_BIN = <fake>` into app-config.json, and exercises every /api/integrations/vela/* endpoint against the extended fake-vela stub: - status reads ~/.vela/config.json under various states - login spawns the fake, waits for config.json to appear, returns pid + startedAt + profile - 409 already-running guard with the stub's delay knob - logout removes the file (idempotent) - secrets (runtimeKey / controlKey) never leak in the projection - login → status round-trip flips loggedIn=false → true 2. e2e/tests/amr/turn.test.ts (tools-dev orchestrated, Vitest) Boots a namespaced daemon + web pair through `createSmokeSuite`, inlines a self-contained fake `vela` binary that handles BOTH `vela login` (writes ~/.vela/config.json) and `vela agent run --runtime opencode` (ACP stdio with the `session/set_model must precede session/prompt` gate the real binary enforces), then drives a complete /api/runs lifecycle for `agentId: 'amr', model: 'default'` and asserts the assistant message captures the fake's streamed text. This is the test that would have surfaced today's plugin-default-model regression (the `set_model before prompt` error) at PR time instead of demo time. 3. e2e/ui/amr-login-pill.test.ts (Playwright) Mocks /api/agents + /api/integrations/vela/{status,login,logout} to drive the Settings AMR card through the full Sign in → Signed in → Sign out cycle. Pins the AmrLoginPill polling contract and the aria-label semantics (the pill's accessible name is "Sign out" once logged in, regardless of which label the hover-state text shows). fake-vela.mjs extensions: - Handles `vela login` argv by writing ~/.vela/config.json for the active VELA_PROFILE and exiting 0 — mirrors real vela's on-disk side-effect without the device-auth loop. 
- FAKE_VELA_LOGIN_DELAY_MS knob so route tests can observe the in-flight state of the spawn lifecycle. - FAKE_VELA_LOGIN_USER_EMAIL / _USER_PLAN to assert the surfaced user fields end-to-end. Validated: - `pnpm guard` + `pnpm typecheck` (all workspace projects) - `pnpm --filter @open-design/daemon test`: 2998 / 2998 passing, including the new 8-test integration suite. - `cd e2e && pnpm test tests/amr`: 1 / 1 passing. - `cd e2e && pnpm exec playwright test ui/amr-login-pill.test.ts`: 1 / 1 passing (6.7s). * feat(amr): package native cli and refine login ui * feat(amr): wire vela cli beta packaging * docs(amr): document vela ci packaging review * docs(amr): refine vela ci integration review * fix(ci): refresh nix pnpm dependency hashes * fix(pack): clean up Vela CLI packaging * fix(pack): bundle Vela CLI support files * fix(amr): recover login attempts from stale auth state * test: expand AMR and automations coverage * fix(amr): address review follow-ups * test(web): align tasks fixtures with contracts * fix(daemon): type wildcard route params * fix(ci): refresh PR merge validation * fix(amr): clear env credentials on logout * feat(settings): inline local CLI model configuration * fix(amr): recognize daemon env credentials * [codex] Fix Vela companion packaging (#2979) * Fix Vela companion packaging * Update Nix pnpm dependency hashes * [codex] Surface AMR account failures (#2980) * fix: surface AMR account failures * fix: cover AMR recovery error guidance * chore: bump beta base version to 0.8.1 (#2990) * Fix AMR profile and packaged runtime review issues * Detect packaged AMR OpenCode companion tree * feat(web): polish AMR frontend flows * Polish AMR onboarding card * fix: read AMR login state from dot-amr config (#3048) * test: tighten AMR credential and packaging coverage * test: restore AMR executable test env helper * [codex] Fix packaged mac Dock identity and AMR label (#3076) * Fix packaged mac sidecar Dock identity * Rename AMR assistant label * Fix AMR live 
models and dot-amr login state (#3073) * fix: read AMR login state from dot-amr config * fix: load live AMR models before runs * fix: point AMR onboarding link to production wallet * fix: address AMR model review feedback * fix: persist live AMR model fallback * [codex] Fix AMR link catalog model ids (#3088) * Fix packaged mac sidecar Dock identity * Rename AMR assistant label * Fix AMR link catalog model ids * Fix AMR model normalization typecheck * Use live AMR model for default runs * fix: polish AMR runtime settings UI * Accelerate AMR startup defaults (#3092) * Surface AMR insufficient balance wallet URL (#3099) * fix(web): polish onboarding controls (#3112) * fix(web): show CLI scan loading state * Avoid duplicate AMR wallet recharge links (#3117) * Avoid duplicate AMR wallet recharge links * Use Vela CLI 0.0.3 test package * chore(nix): refresh pnpm deps hash * Fix AMR wallet guidance display --------- Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com> * chore(pack): pin Vela CLI 0.0.3-test.1 (#3127) * chore(nix): refresh pnpm deps hash * chore(pack): pin Vela CLI 0.0.3 * chore(nix): refresh pnpm deps hash * fix(web): suppress AMR exit 130 fallback (#3136) * feat(web): nudge users to hosted AMR on model/auth/quota failures (#3083) * feat(web): nudge users to hosted AMR on model/auth/quota failures When a non-AMR agent run fails with an auth / quota / upstream model error, surface an inline nudge under the error pill linking to Open Design's hosted AMR gateway (https://open-design.ai/amr). The nudge fires `surface_view` (element=run_failed_toast) on impression and `ui_click` (element=go_amr) on the link. Also teach the daemon to classify CLI-agent auth/quota/upstream failures (Claude Code, codex, ...) into specific API error codes (AGENT_AUTH_REQUIRED / RATE_LIMITED / UPSTREAM_UNAVAILABLE) instead of the generic AGENT_EXECUTION_FAILED, so both the error message and the nudge key off accurate codes. 
AMR's own runs are excluded from the nudge — they keep the dedicated sign-in / recharge affordances. * feat(web): rework failed-run AMR guidance into per-case error UI Replace the single inline nudge with a per-case failed-run experience driven by the run's error code + agent: - The error card is now neutral gray (was red) and always carries a retry button; it is driven by the persisted per-message error event so it survives a reload. - Non-AMR agent hitting a model/auth/quota wall: a theme-color promotion card under the error card offers "switch to AMR & retry" — switches the run to AMR, opens Settings on the AMR card, and auto-retries once the account signs in (ProjectView polls vela login status, independent of the Settings pill lifecycle, with success / 5-min-timeout / unmount exits). - AMR agent unauthorized: clearer copy + an "authorize & retry" button. - AMR agent out of balance: clearer copy + a "top up" button to the AMR wallet, with manual retry. - Settings AMR card: when opened from the nudge, it scrolls into view and pulses, and an authorize-button coachmark (a fake hand cursor that rises in and dismisses on hover) points at the sign-in control when not yet authorized. analytics: surface_view (run_failed_toast) on the promotion card and ui_click (go_amr) on its action are retained. i18n adds chat.amrCard.* and chat.amrError.* (en / zh-CN / zh-TW translated; other locales fall back to en) and drops the old chat.amrErrorGuidance keys. * fix(daemon): require status context for numeric service-failure codes Per review on #3083: the model-service classifier matched bare HTTP status numbers (`500`, `502`, `429`, `401`), so ordinary CLI output like `line 500`, `read 502 bytes`, or `exit code 401` could be misclassified as a provider outage / auth wall and wrongly surface the AMR nudge. 
Now a status number only counts when it carries explicit context (`HTTP 500`, `status 503`, `code: 401`, `502 Bad Gateway`); textual provider phrases (overloaded, bad gateway, service unavailable, rate limit, …) are unchanged. Adds fixtures proving unrelated numeric output stays null. * fix(web): keep error pill for failed runs ChatPane's card doesn't cover Per review on #3083: the per-message gray error pill was suppressed for every persisted error status event, but ChatPane only renders the replacement top-level error card for `retryableAssistantMessage` (the last failed assistant). So a failed turn that is no longer last (after a follow-up) or an older failed run in history showed neither the pill nor the card — its error detail vanished, undercutting reload/history survival. ChatPane now passes `errorCardOwnerId` (the assistant id whose error the card represents); AssistantMessage suppresses only that one pill and keeps rendering StatusPill for all other error events. * fix(daemon): don't treat a process exit code as an HTTP status Follow-up to review on #3083: the status-context helper accepted a bare `code` prefix, so `exit code 401` / `process exited with code 429` still matched and got classified as AGENT_AUTH_REQUIRED / RATE_LIMITED (the very `exit code 401` case the comment calls out as noise). `code` now only counts when qualified (`status code` / `error code` / `response code`) or punctuation-bound (`code: 401`); bare `exit code N` no longer matches. Adds fixtures for exit-code lines returning null. * chore(web): translate AMR card / error keys for 16 remaining locales PR #3083 added 10 new `chat.amrCard.` / `chat.amrError.` keys but only provided en/zh-CN/zh-TW translations; the other 16 locales fell back to English. Translate the card title/body, three chips, primary CTA, and the AMR self-error (auth / balance) messages and buttons for ar, de, es-ES, fa, fr, hu, id, it, ja, ko, pl, pt-BR, ru, th, tr, uk. 
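The status-context rule from the two daemon fixes above (a number only counts with explicit context, and `code` only when qualified or punctuation-bound) might look something like the sketch below. The regular expressions, the status-to-code mapping, and the name `classifyStatusNumber` are all assumptions made for illustration; the textual provider phrases (overloaded, rate limit, and so on) are left out.

```typescript
type FailureCode = "AGENT_AUTH_REQUIRED" | "RATE_LIMITED" | "UPSTREAM_UNAVAILABLE";

// Assumed mapping from HTTP status to API error code.
const STATUS_TO_CODE: Readonly<Record<string, FailureCode | undefined>> = {
  "401": "AGENT_AUTH_REQUIRED",
  "429": "RATE_LIMITED",
  "500": "UPSTREAM_UNAVAILABLE",
  "502": "UPSTREAM_UNAVAILABLE",
  "503": "UPSTREAM_UNAVAILABLE",
};

// Hypothetical context patterns; each captures the three-digit status in group 1.
const STATUS_CONTEXT_PATTERNS: readonly RegExp[] = [
  /\bhttp(?:\/\d(?:\.\d)?)?\s+(\d{3})\b/i, // "HTTP 500", "HTTP/1.1 500"
  /\bstatus(?:\s+code)?\s*[:=]?\s*(\d{3})\b/i, // "status 503", "status code 429"
  /\b(?:error|response)\s+code\s*[:=]?\s*(\d{3})\b/i, // "error code 401"
  /\bcode\s*[:=]\s*(\d{3})\b/i, // "code: 401" (punctuation-bound only)
  /\b(\d{3})\s+(?:bad gateway|service unavailable|too many requests|unauthorized)\b/i,
];

// Returns a failure code only when the status number carries explicit context.
function classifyStatusNumber(line: string): FailureCode | null {
  for (const pattern of STATUS_CONTEXT_PATTERNS) {
    const status = pattern.exec(line)?.[1];
    if (status === undefined) continue;
    const code = STATUS_TO_CODE[status];
    if (code !== undefined) return code;
  }
  return null;
}
```

Because bare `code N` never matches without a qualifier or punctuation, lines such as `exit code 401` and `process exited with code 429` fall through to `null` in this sketch.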
* fix(amr): address review feedback on #2355 Targeted fixes for the unresolved review threads on #2355. Each fix includes / updates a focused test. - runtimes/executables.ts: `packagedVelaOpenCodeCompanionTree` now verifies the inner `opencode` executable exists + is runnable, not just the directory. This closes the false-positive availability path that let `detectAgents()` surface AMR as available even when the packaged companion was empty / partially copied (mrcfps, 4 threads). - runtimes/executables.ts: `resolveAmrOpenCodeExecutable` now prefers the bundled `<OD_RESOURCE_ROOT>/bin/libexec/opencode/opencode` over a stale `opencode` on the user's PATH, so packaged AMR builds can't be hijacked by a global installation. - web/EntryShell.tsx: when the Local CLI scan returns an available agent and the previously-selected agent is AMR, switch the selection to the first available local agent so the runtime and persisted agent agree before Continue. - server.ts (model-probe branch): for AMR, check `readVelaLoginStatus` BEFORE rejecting on an empty live-model catalog — a signed-out user was getting `AMR_MODEL_UNAVAILABLE` ("choose a model") instead of the correct `AMR_AUTH_REQUIRED` (sign-in affordance). - server.ts (default model fallback): if the user asked for the AMR agent default and the cached id is no longer in the FRESH catalog, fall back to `liveModels[0]` from the probe instead of rejecting the run as `AMR_MODEL_UNAVAILABLE`. - integrations/vela.ts: route `vela login` through `createCommandInvocation` so an npm/Node-style `vela.cmd` / `.bat` shim on Windows gets the correct `cmd.exe /d /s /c …` wrapping with verbatim args (matches `execAgentFile` / chat-run spawning). - tools/pack/src/linux.ts: in containerized Linux builds, bind-mount the host directory of `OPEN_DESIGN_VELA_CLI_BIN` and rewrite the env to the container-side path. 
The host path was being passed in as-is even though the default container only mounts /project, /tools-pack and cache/home — `copyOptionalVelaCliBinary` saw a missing path. Deferred (out of scope for this PR): - `od amr status/login/logout/cancel` CLI subcommands (AGENTS.md UI/CLI dual-track rule, server.ts:5763) — sizable surface; tracked for a separate focused PR. - Strict `--require-vela-cli` for Windows + mac-x64 beta builds: prematurely blocked — `@powerformer/vela-cli` only publishes the `darwin-arm64` platform binary today; adding the flag elsewhere would fail the builds. Revisit once win/x64/linux binaries ship. * fix(amr): hoist sendAmrAccountFailure above the AMR catalog preflight (TDZ) The new signed-out AMR branch in the catalog preflight at server.ts:10875 calls `sendAmrAccountFailure(...)` to emit AMR_AUTH_REQUIRED, but the const declaration sat ~100 lines below at the outer function scope. Because `const` is TDZ-aware, that branch would have thrown `ReferenceError: Cannot access 'sendAmrAccountFailure' before initialization` for the exact users it tries to help — defeating the original intent. Hoist the helper to just above the AMR preflight block so it's available to every AMR code path in this function. Behavior elsewhere is unchanged. Also rerun the daemon test suite: `launch.test.ts > resolveAgentLaunch uses packaged built-in Vela for AMR` was creating the `<resourceRoot>/bin/libexec/opencode/` companion directory only, but this PR's earlier tightening of `packagedVelaOpenCodeCompanionTree` also requires the inner `opencode` executable. Add it to that fixture to match the new contract; the test was a sibling of the executables / env-and-detection fixtures already updated in `13fc4f4`. Addresses #2355 review (mrcfps, 2026-05-28). 
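The temporal dead zone (TDZ) failure described in the hoisting fix above can be reproduced in a few lines. The functions below are a hypothetical reduction, not the code in `server.ts`: `sendFailure` stands in for `sendAmrAccountFailure`, and the preflight is collapsed to a single closure.

```typescript
// Hypothetical reduction of the bug; the real helper in server.ts does more.

function preflightBroken(signedOut: boolean): string {
  // The signed-out branch closes over `sendFailure`, which is declared further down.
  const runPreflight = (): string =>
    signedOut ? sendFailure("AMR_AUTH_REQUIRED") : "ok";

  // Runs while `sendFailure` is still in its temporal dead zone, so the
  // signed-out path throws (a ReferenceError on targets that keep `const`).
  const result = runPreflight();

  const sendFailure = (code: string): string => `failed: ${code}`;
  return result;
}

function preflightFixed(signedOut: boolean): string {
  // Hoisted above the preflight, so every branch can call it.
  const sendFailure = (code: string): string => `failed: ${code}`;
  const runPreflight = (): string =>
    signedOut ? sendFailure("AMR_AUTH_REQUIRED") : "ok";
  return runPreflight();
}
```

Note that the broken variant only fails for signed-out callers, which is why the regression is easy to miss: the signed-in path never touches the helper.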
* feat(web): add hover cancel for AMR login (#3158) * feat(web): add hover cancel for AMR login * fix(web): don't bounce AmrLoginPill back to 'Signing in…' after local cancel Both codex-connector (P2) and looper (CHANGES_REQUESTED) on this PR flagged the same race in the new local-cancel path: `handleCancelLogin` dispatches `notifyAmrLoginStatusChanged('login-canceled')` immediately after `/login/cancel` returns, but the `AMR_LOGIN_STATUS_EVENT` listener unconditionally re-enters `refresh()` and then restarts polling whenever `/api/integrations/vela/status` still reports `loginInFlight: true`. That is a real race because the daemon's `cancelVelaLogin()` only sends SIGTERM (escalating to SIGKILL after `LOGIN_CANCEL_KILL_GRACE_MS` = 2000 ms) and keeps the child in `activeLoginProcs` until it actually exits — so the first `/status` read after a successful cancel can legally still come back as in-flight. Under that window the pill flips back to 'Signing in…' and can later surface the timeout/error path even though the user already canceled, defeating the behavior promised in the PR description. Fix the listener instead of every dispatch site: in the `login-canceled` branch, after the local reset (stopPolling + setPending(null) + clear refs), optimistically mark every subscribed pill instance as not-in-flight (`setStatus((c) => c ? { ...c, loginInFlight: false } : c)`) and `return` — skip the refresh-and-reconcile branch below entirely. The next explicit refresh (component mount, user interaction, or a `status-changed` event) will pick up the daemon's confirmed state once the child has actually exited. Add a focused regression test that holds `/api/integrations/vela/status` at `loginInFlight: true` even after a successful `/login/cancel`, asserting that the pill stays at the Canceled → Authorize sequence and never bounces back to 'Signing in…'. 
This test fails on the pre-fix listener and passes on the new behavior; existing 'cancels an in-flight AMR sign-in…' and 'reconciles late AMR browser completion to Signed in after local cancel' tests continue to pass. Addresses review feedback on #3158 (chatgpt-codex-connector, nettee). --------- Co-authored-by: lefarcen <935902669@qq.com> --------- Co-authored-by: a1chzt <chizblank@gmail.com> Co-authored-by: Amy <1184569493@qq.com> Co-authored-by: Mason <jinmeihong0201@gmail.com> Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com> Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>	2026-05-28 05:09:55 +00:00
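The `login-canceled` listener fix from #3158 described above can be modelled without React as a small state transition. This is only a sketch: `PillState`, `handleLoginStatusEvent`, and the synchronous `fetchStatus` callback are hypothetical simplifications of the component's `setStatus` / `stopPolling` / `refresh()` logic.

```typescript
// Hypothetical, framework-free model of the listener; the real component uses React state.

interface VelaStatus {
  loggedIn: boolean;
  loginInFlight: boolean;
}

interface PillState {
  status: VelaStatus | null;
  pending: "login" | "logout" | null;
  polling: boolean;
}

type StatusEventReason = "login-canceled" | "status-changed";

function handleLoginStatusEvent(
  state: PillState,
  reason: StatusEventReason,
  fetchStatus: () => VelaStatus,
): PillState {
  if (reason === "login-canceled") {
    // Local reset plus an optimistic not-in-flight flag. Deliberately no refresh:
    // the daemon may still report the terminating child as in flight.
    return {
      status: state.status ? { ...state.status, loginInFlight: false } : state.status,
      pending: null,
      polling: false,
    };
  }
  // Other reasons reconcile against the daemon and poll while a login is in flight.
  const fresh = fetchStatus();
  return { status: fresh, pending: state.pending, polling: fresh.loginInFlight };
}
```

Returning early in the cancel branch is the design choice the commit describes: the daemon's confirmed state is picked up by the next explicit refresh instead of by the cancel event itself.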
jasonyang365	840019c8e2	Add Trae CLI as an ACP coding-agent adapter (#2729) * Add Trae CLI ACP adapter * Add Trae CLI binary override support * Update mature ACP MCP discovery test * Stabilize Orbit summary tracking test --------- Co-authored-by: AI Bot <bot@example.com>	2026-05-23 15:17:42 +00:00
lefarcen	c30f3fbfac	feat(analytics): default telemetry to on so onboarding events emit pre-disclosure (#2682 ) The post-onboarding disclosure modal (App.tsx:349 — `showPrivacyConsent = ... && config.onboardingCompleted === true`) only renders after the user completes the welcome flow. Before that, `config.telemetry` is undefined and both the daemon-side gate (`analytics.ts: if (cfg.telemetry?.metrics !== true) return`) and the web-side `/api/analytics/config` drop every event the onboarding view fires. E2E on nightly.10 (QA, 2026-05-22 06:15+ UTC) confirmed the symptom: a real user completed the full Connect → About you → Design system → Generate flow but PostHog received zero `page_view pn=onboarding` / `ui_click` / `onboarding_runtime_scan_result` / `onboarding_complete_result` rows from their distinct_id. Other events (post-onboarding home, settings, project creation) flowed normally because by then the disclosure had been accepted and `telemetry.metrics` was true. Product decision (2026-05-22): default telemetry ON. The disclosure modal stays disclosure-style ("I get it") and Settings → Privacy remains the one-click opt-out — same UX, only the pre-decision default changes from off to on. Changes: - `apps/web/src/state/config.ts`: `DEFAULT_CONFIG.telemetry = { metrics: true, content: true, artifactManifest: false }` so fresh `loadConfig()` calls emit during the first onboarding render. - `apps/web/src/types.ts`: comment now documents the default-on semantics + the opt-out path. - `apps/daemon/src/app-config.ts`: new `applyTelemetryDefaults` helper fills in the same defaults when `readAppConfig` finds no telemetry field on disk. Helper runs on BOTH the installation-file-shadowing path and the fallback path. An explicit user opt-out (`metrics: false`) is preserved untouched — defaults only fill `undefined`, never overwrite a saved value. - `apps/daemon/tests/app-config.test.ts`: 49 → 51 tests. 
Updated 9 existing assertions that expected `cfg.telemetry === undefined` / `cfg === {}` to expect the new default; added 2 regression guards: - "preserves an explicit telemetry opt-out across reads" pins the `metrics: false` invariant so a future refactor can't silently re-enable opted-out users. - "preserves a partial explicit telemetry (metrics on, content off)" pins per-field user choices against the default fill. Validation: - `pnpm --filter @open-design/daemon exec vitest run tests/app-config.test.ts` ✅ 51/51 - `pnpm --filter @open-design/web typecheck` ✅ - `pnpm --filter @open-design/daemon typecheck` ✅ - `pnpm --filter @open-design/web test` ✅ 201 files / 1839 tests	2026-05-22 15:42:20 +08:00
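The default-fill behavior of `applyTelemetryDefaults` described in this commit could be sketched as below. The types are assumptions, and so is the choice to leave an existing `telemetry` object completely untouched (the message only says defaults fill `undefined` and never overwrite a saved value); the default values themselves are taken from the commit text.

```typescript
// Hypothetical shapes; the real AppConfig in app-config.ts has many more fields.
interface TelemetryPrefs {
  metrics?: boolean;
  content?: boolean;
  artifactManifest?: boolean;
}

interface AppConfig {
  onboardingCompleted?: boolean;
  telemetry?: TelemetryPrefs;
}

// Defaults quoted in the commit message.
const TELEMETRY_DEFAULTS: Readonly<Required<TelemetryPrefs>> = {
  metrics: true,
  content: true,
  artifactManifest: false,
};

// Fills the defaults only when no telemetry field was saved.
// Assumption: any saved object, even a partial one, is returned as-is.
function applyTelemetryDefaults(cfg: AppConfig): AppConfig {
  if (cfg.telemetry !== undefined) return cfg;
  return { ...cfg, telemetry: { ...TELEMETRY_DEFAULTS } };
}
```

Under this reading, a fresh config gains `metrics: true`, while a saved `{ metrics: false }` opt-out survives every read.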
huyhoangnhh98	140a4e1ff6	Improve responsive preview and design handoff outputs (#1224)
* feat: improve responsive design handoff
* feat: refine cross-platform design outputs
Changelog:
- Add auto-fit responsive preview behavior for tablet/mobile frames.
- Add landing page and OS widgets metadata options with project header chips.
- Strengthen prompt contracts for modern breakpoints, app-specific modules, CJX-ready UX, and final product surfaces.
- Require cross-platform outputs to use separate platform files instead of tabbed demo selectors.
- Add DESIGN-MANIFEST.json plus richer handoff guidance to daemon/client exports.
- Update archive/export tests for manifest and responsive viewport matrix.
* feat: enforce screen-file design outputs
Changelog:
- Enforce screen-file-first generation for landing pages, app screens, platform surfaces, and OS widgets.
- Update design handoff and manifest exports so coding tools map each screen file to separate routes/surfaces.
- Strengthen minimal-brief visual guidance to avoid monochrome or unstyled design outputs.
* fix: address responsive handoff review feedback
* fix: address handoff review blockers
* fix: preserve proxy auth and normalized export entry
* fix: narrow frame wrapper filter to directory paths only
* fix: make artifact save failure banner generic
--------- Co-authored-by: Huy Hoàng <macos@MacBook-Pro-Hoang.local>	2026-05-12 14:18:33 +08:00
lefarcen	afb331a288	feat: add opt-in Langfuse telemetry (#800 ) * docs(specs): add langfuse telemetry change spec Captures the design for forwarding completed agent runs to Langfuse, including data-model mapping, field-budget caps, privacy gates, build-secret injection, GDPR right-to-deletion approach, and the resolved decisions on default consent, identifier shape, region, and ownership. * feat(daemon): add langfuse-trace module and telemetry prefs Adds the dependency-free building blocks for forwarding completed agent runs to Langfuse. Two layers: - AppConfigPrefs gains installationId and a TelemetryPrefs object with metrics / content / artifactManifest gates. The daemon validator treats telemetry like agentModels — replace-on-write, drop-when-empty, reject non-boolean inner values. - New langfuse-trace.ts builds a {trace-create, generation-create} pair from a ReportContext, capping prompt at 8 KB, output at 16 KB, artifacts at 50 entries, and dropping any batch larger than 1 MB before send. reportRunCompleted is no-op when LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY are unset (so dev runs and forks never emit) and short-circuits on prefs.metrics === false. Server-side wiring into the run-close path lands in a follow-up. * fix(langfuse): default to US Langfuse region End-to-end smoke against the project's actual dev key on 2026-05-07 returned 401 from cloud.langfuse.com (EU) and 207 from us.cloud.langfuse.com (US), confirming the org lives in US. Update the default base URL, the matching test, and the spec's Q3 decision row to match. Self-hosted or EU-region operators can still override via the LANGFUSE_BASE_URL env var. * feat(daemon): wire langfuse trace forwarding into run-close Adds the daemon-side glue to forward completed agent runs: - runs.ts gains an optional onTerminate hook fired once per run after it reaches a terminal state. Errors thrown from the hook are caught and logged, never propagated, so telemetry can never break the run path. 
- New langfuse-bridge.ts assembles a ReportContext from the in-memory run record, the conversation's persisted assistant message, and the user's app-config preferences. It tolerates a missing message (e.g. when web has not yet PUT the final delta) and a missing app-config. - server.ts stashes the original user prompt on the run object inside startChatRun so the bridge can include it without crossing the createChatRunService boundary, and registers the hook callback when building the run service. Behavior remains a no-op unless LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY are set in the daemon env AND telemetry.metrics is true in app-config. A live smoke against us.cloud.langfuse.com on 2026-05-07 confirmed the matching trace + generation schema is accepted (HTTP 207, both events 201 created). * fix(langfuse): address PR #800 review feedback P1 — Move trace forwarding off the daemon-internal run-close hook and onto the message-persistence path. The original onTerminate hook ran inside finish() the moment the SSE 'end' event was emitted, which is before the web client's onDone handler refreshes project files and PUTs producedFiles + final assistant content back to SQLite. Reading SQLite at that moment routinely missed both. The fix: drop the runs.ts hook entirely and trigger from PUT /api/projects/:id/conversations/:cid/ messages/:mid when the saved row carries a terminal runStatus. A reportedRuns Set guards against the multiple PUT calls web makes per turn (each retry / state update). Set entries auto-evict after the same 30 min TTL the runs map uses. Web persists a terminal-status message in all three completion paths — onDone (succeeded), onError (failed), and cancel (canceled) — so this catches every run shape. P2 — postLangfuseBatch now parses the 207 Multi-Status response body. Langfuse legacy ingestion always returns 207, and response.ok is true for 207, so per-event validation errors used to slip through silently. We now warn when body.errors is non-empty. 
Two new unit tests. P2 — truncate() and the HARD_BATCH cap now compare UTF-8 byte length, not String.length (which counts UTF-16 code units). A 4096-character CJK prompt occupies 12 KB, well over the 8 KB input cap. truncate also walks backwards to a UTF-8 leading byte so the cut never lands inside a multi-byte codepoint. New unit test covers '设'.repeat(4096). P2 — Spec R7 now lists the actual Langfuse trace deletion endpoint (DELETE /api/public/traces/{traceId} for single, DELETE /api/public/traces with body for batch). Verified by curl on us.cloud.langfuse.com: DELETE /api/public/traces/X → 200; the path the original spec named (POST /api/public/trace/X) returns 404. Reference link points at langfuse.com/docs/administration/data-deletion. P3 — Q4 (legacy ingestion vs OTel) moved from Open Questions to Resolved Decisions. The implementation already commits to legacy and the trade-off was discussed during design; the open-question status was stale. * feat(web): privacy consent surface + Settings → Privacy tab Adds the user-facing half of the telemetry feature so the daemon-side hook from PR #800 has something to talk to. - AppConfig gains optional `installationId` (anonymous v4 uuid generated on first opt-in; null after explicit decline; undefined when the user has never seen the consent surface) and `telemetry: TelemetryConfig` ({metrics, content, artifactManifest}). syncConfigToDaemon round-trips both fields so the bridge module sees the same prefs. - SettingsDialog grows a Privacy section with two states. When the user has never made a consent decision (typical first-run path), the section renders the GDPR-aligned consent card: a kicker, the disclosure body listing both metrics and conversation content as separate bullets, and two equally-prominent buttons ("Share usage data" / "Don't share"). The Don't-share path keeps the app fully usable (core app must work with all tracking declined). 
After a decision the same panel switches to three independent toggles + the anonymous ID + a "Delete my data" button that rotates the ID and turns everything off. - App.tsx points the welcome modal at the new Privacy section so the consent decision is the first thing a fresh installation sees. - 17 i18n keys land in en + zh-CN + zh-TW with hand-translated copy, and as English placeholders in the remaining 14 locales — enough for the parity check to pass while leaving room for proper localisation in a follow-up. Dict type updated. - Minimal index.css for the consent card + toggle rows so the panel is legible without depending on follow-up design polish. Telemetry remains a no-op end-to-end until the user clicks Share usage data: the daemon gate (prefs.metrics === true) keeps every code path short-circuited otherwise. * refactor(web): rebuild Privacy panel using project-native settings primitives The first cut used custom .settings-privacy-* classes + raw HTML checkboxes that didn't match any other Settings tab. Replace with the shell other sections already use: - settings-subsection containers with section-head + h4 + .hint - seg-control / seg-btn pill toggles ("active" / "offline") for each of the three telemetry preferences, mirroring NotificationsSection - a 2-cell seg-control for the consent card so Share usage data and Don't share carry identical visual weight (the GDPR equal-prominence requirement that the previous accent / outline split missed) - ghost button + readonly text input for the installation id row, mirroring the API-key field pattern elsewhere Drop the bespoke CSS block in favor of inheriting the existing settings-section / seg-control / ghost styling. The only privacy- specific style left is a tight definition list inside the consent card for the metrics + content disclosure rows. 
* refactor(web): use .toggle-row iOS switch for Privacy preferences Active/offline pills (the seg-control single-cell pattern that NotificationsSection uses) read awkwardly for a flat preference list. Switch the three telemetry toggles to .toggle-row — the same control NewProjectPanel uses for "speaker notes" / "animations": label + hint on the left, iOS-style sliding switch on the right, full-row click target. The consent card's two-button seg-control stays as-is — there the equal-weight pill pair is exactly what GDPR equal-prominence wants. * feat(web): standalone first-run privacy consent banner Replaces the Settings-dialog-as-onboarding hack with a dedicated bottom-right banner card that mounts whenever the user has never made a privacy decision (cfg.installationId === undefined). The banner is prominent (anchored to the corner with a soft shadow) but non-blocking, mirrors cookie-consent UX, and shares the project's panel styling — same .modal-elevated background, --radius-lg corners, --shadow-lg lift. Wiring: - App.tsx imports PrivacyConsentModal and renders it at the root, gated on installationId === undefined && !settingsOpen so it doesn't double up with the Privacy tab's own consent card when Settings is already showing. - Share / Don't share both go through handleConfigPersist, so the resulting installationId + telemetry prefs land in localStorage and the daemon at the same time, reusing the existing autosave plumbing. - The previous attempt that pinned the welcome SettingsDialog to the Privacy section is reverted; onboarding now stays focused on agent configuration, and the consent decision lives in its own surface. * fix(web): keep privacy banner visible while Settings welcome modal is open The banner gated itself on `!settingsOpen` to avoid double-rendering with the Privacy tab's consent card. 
But the first-run path opens the Settings welcome modal automatically when `onboardingCompleted=false`, which fired immediately after bootstrap — so the banner flashed for a moment and then vanished behind the modal backdrop. Drop the `!settingsOpen` clause so the banner stays mounted whenever the user has not yet made a privacy decision, and bump its z-index above the modal backdrop (200 vs 100) so first-run users can actually reach the consent buttons. The minor visual overlap with the Privacy tab's own card is fine: clicking either copy resolves both surfaces. * copy(privacy): soften consent button labels Banner action buttons now read "Help improve Open Design" / "Not now" (en, with hand translations in zh-CN / zh-TW and English placeholders in the other 13 locales) instead of "Share usage data" / "Don't share". The new wording aligns the affirmative action with the kicker copy ("Help us improve Open Design") and reads less alarming, while the disclosure list above still names both data categories explicitly so the consent stays informed under GDPR. The decline button stays as a soft "Not now" rather than an aggressive "Don't share" so the reject path doesn't read as hostile to the user. No structural change — the two-cell seg-control still gives the buttons identical visual weight, and the underlying side-effects are unchanged (installationId is generated on Help / nulled on Not now, and the telemetry prefs flip the same way). * feat(telemetry): expand trace fields for evals & dataset construction Each Langfuse trace now ships the full per-turn + per-install fact sheet that the eval/dataset workflow needs, instead of only the bare turn id + token count from before. Everything below is gated by `prefs.metrics === true`; nothing here is content (those gates remain separate). Per-turn: - model — first-class generation.model field, drives Langfuse cost lookup and model-grouping in the UI; also mirrored in trace.metadata and trace.tags so list-view filters work. 
- reasoning — generation.modelParameters.{ reasoning } so the Model Parameters card lights up; mirrored in metadata. - skillId / designSystemId — metadata + tags, so dataset slices can group by which skill/DS produced which output. Per-process / build (constant within one daemon run, cached at start): - appVersion / appChannel / packaged from app-version.ts - nodeVersion (process.version), os (platform()), osRelease, arch (os.arch()) - clientType — desktop vs web, derived from a new X-OD-Client header the web layer sets in providers/daemon.ts (with a User-Agent sniff fallback for third-party callers). Plumbing: - startChatRun stashes model / reasoning / skillId / designSystemId on the run object alongside the existing userPrompt stash. - POST /api/runs reads X-OD-Client and stores run.clientType. - langfuse-bridge collects RuntimeInfo once per process and merges per-run client carrier; ReportContext gains optional `turn` + `runtime` blocks; existing fields stay backward compatible. Spec gains a "Telemetry Fields Catalog" section enumerating every field, its source, and the gate it lives under, so the eval team has a single place to look up what's available without reading the trace schema by example. Tests: - new langfuse-trace tests cover turn tags, runtime tags, generation model/modelParameters promotion, modelParameters omission when reasoning is unset, and metadata mirroring. - langfuse-bridge gains an end-to-end "turn-level config" test that threads model/reasoning/skill/DS/clientType + appVersion through the bridge and asserts the Langfuse payload shape. - existing tests adjusted to tolerate host-dependent os tag. * copy(privacy): trim Share button to verb phrase only "Help improve Open Design" overflowed the equal-width 2-cell seg-control on the consent banner — the product name is already in the kicker + headline above the buttons, so the button itself only needs the verb phrase. 
Drop the product name from all locales: - en: Help improve Open Design → Help improve - zh-CN: 帮助改进 Open Design → 帮助改进 - zh-TW: 協助改進 Open Design → 協助改進 The decline button ("Not now" / "暂不" / "暫不") was already short, so the two buttons now have comparable length and the equal-prominence seg-control fits cleanly. Standalone Settings → Privacy panel uses the same labels for consistency. * fix(web): defer Settings welcome modal until privacy decision is made Previously bootstrap raced two surfaces against each other on first launch: the privacy consent banner (gated on installationId === undefined) and the Settings welcome modal (gated on onboardingCompleted === false). The banner's higher z-index kept it above the backdrop visually, but having two foreground surfaces at once is still confusing UX. Sequence them instead: bootstrap only opens the welcome modal when the user has already resolved consent (installationId !== undefined). Until then the banner owns the foreground alone. Once the user clicks Help improve / Not now, the corresponding handler hands off to the welcome modal if onboarding is still pending. End state matches what it was before — just without the simultaneous-render flash. * debug(privacy): log banner gate state to track sudden disappearance Two console.log points to find which setCfg call (or stale bundle) is flipping cfg.installationId from undefined to a value while the banner is visible. To remove once the regression is reproduced. * fix(privacy): keep installationId + telemetry out of localStorage Daemon is now the single source of truth for the privacy decision. 
Why this matters: the consent banner gates on `config.installationId === undefined`, but loadConfig() merges localStorage on top of the daemon's reply, so a stale uuid in `open-design:config` (left over from a previous opt-in) was re-hydrating the React state and immediately syncing back to the daemon — defeating "Delete my data" and re-suppressing the banner within milliseconds of every page load. The deeper reason to fix it here, not just patch the gate: a privacy identifier persisted in browser storage that the user can't see or clear without DevTools is a compliance liability. Anything users can revoke needs one canonical place to store it. Daemon `app-config.json` already serves that role for everything else gated through syncConfigToDaemon, so installationId + telemetry now ride that path exclusively: - saveConfig() strips both keys before writing localStorage. - loadConfig() strips both keys when reading older stale payloads, so existing installs migrate transparently on next launch. - syncConfigToDaemon() / mergeDaemonConfig still round-trip them, so the React state stays in sync with the daemon as before. Net effect: clearing app-config.json (or hitting "Delete my data") now fully resets the install identity, with no residual cohort key in browser storage. * feat(privacy): scrub secrets + PII from prompt/output before send When prefs.content is on, daemon now runs the prompt and assistant text through a regex scrubber (apps/daemon/src/redact.ts) before posting to Langfuse. The scrubber is the simplest thing that gives the user-facing copy a truthful claim — pure regex, zero new dependencies, fully auditable in this Apache-2.0 repo (vs. pulling a single-maintainer 5-month-old npm package into a core process). Categories covered (each replaced with [REDACTED:<kind>]): - Anthropic / OpenAI sk- keys (incl.
proj/live/test/ant variants) - Langfuse pk-lf- / sk-lf- (specific rule wins over generic sk-) - GitHub gh[opsur]_ tokens - AWS access key ids (AKIA + 16 uppercase) - Google API keys (AIza + 35) - Slack xox[abprs]- tokens - Stripe live/test keys - JWT header.payload.signature triples - Bearer-header values (scheme word stays readable) - Emails, IPv4, US-style phone numbers - Credit cards — 13–19 digit runs that pass a Luhn check, so order ids and unix-nanos timestamps that fail Luhn pass through unchanged Not covered, stated openly in spec + i18n: names, postal addresses, business-secret semantics, raw 40-hex tokens (too high a false-positive cost for artifact slugs). Those would require an ML layer. Wired in: - apps/daemon/src/redact.ts — exports redactSecrets() + redactSecretsWithCounts() helper for future audit-summary metadata. - apps/daemon/src/langfuse-bridge.ts — runs both prompt and output through redactSecrets() before they reach the trace builder. - 18 unit tests cover every pattern plus negative cases (Luhn-failing digit runs, out-of-range IPv4 octets, idempotence on re-redacted text, ordinary prose passthrough). - i18n privacyContentHint on en + zh-CN + zh-TW (plus 14 locale placeholders) enumerates the categories so the consent disclosure matches the implementation — the GDPR informed-consent requirement. - spec gains a Pre-send Redaction subsection with the regex shape table + intentional non-coverage list. Drive-by: dropped the [privacy] debug logs that traced the now-fixed bootstrap regression. * fix(telemetry): make Langfuse reporting resilient * feat(telemetry): nest Langfuse turn observations * feat(telemetry): emit Langfuse tool spans * fix(telemetry): report after finalized message writes * fix(telemetry): honor persisted terminal status * fix(web): let consent banner yield page clicks * fix(telemetry): report current turn prompt only	2026-05-09 10:06:01 +08:00
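The Luhn gate on credit-card redaction described in the commit above can be sketched as follows. This is a minimal illustration and not the contents of apps/daemon/src/redact.ts: the names `passesLuhn` and `redactCardNumbers` and the `card` kind label are assumptions. Only the 13–19 digit range, the Luhn check, and the `[REDACTED:<kind>]` replacement shape come from the commit message.

```typescript
// Hypothetical helper: true when a 13-19 digit string satisfies the Luhn checksum.
function passesLuhn(digits: string): boolean {
  if (!/^\d{13,19}$/.test(digits)) return false;
  let sum = 0;
  let doubleNext = false; // the rightmost digit is not doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9; // same as summing the two digits of the product
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Hypothetical scrubber: replaces only digit runs that pass Luhn.
// Runs that fail the checksum (order ids, timestamps) are returned unchanged.
function redactCardNumbers(text: string): string {
  return text.replace(/\b\d{13,19}\b/g, (run) =>
    passesLuhn(run) ? "[REDACTED:card]" : run,
  );
}

const sample = "card 4111111111111111, order 1234567890123456";
console.log(redactCardNumbers(sample));
```

Gating on the checksum is what lets long numeric ids pass through: a digit run is redacted only when it both matches the length range and validates.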
Marc Chan	e14b8092ea	feat: add Orbit activity summaries (#681 ) * feat: add Orbit activity summaries * fix(orbit): make runs navigable while agent continues * fix(web): widen minimum chat panel * feat: support Orbit template selection * fix(daemon): avoid bogus skill side-file preflight * fix(web): collapse orbit artifact project cards * fix(web): preserve orbit project card titles * fix: improve Orbit run daily briefing * fix: handle Orbit digest data failures * fix: load Orbit templates and connector tools reliably * fix: keep Orbit summary counts consistent Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: apply Orbit template skill context * fix: cache and curate connector tools for Orbit * fix: align Orbit defaults and connector discovery * fix: simplify Orbit template settings * fix: move connectors into settings * fix: compact connector settings catalog * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: prevent connector action button from stretching into pill The icon-only connect/disconnect buttons in the embedded connectors catalog inherited min-width: 92px / 106px from the non-embedded pill rules, overriding the 24px square sizing and causing the buttons to overlap the card head text. 
Reset min-width to 0 in the embedded icon-only rule so the compact square layout holds. * fix(web): align live artifact file rows * fix: clean up Orbit connector settings lifecycle Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address Orbit review regressions Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * feat(web): localize Orbit and connector settings * feat(web): gate Orbit runs without connectors * feat(web): refine connector settings UX * feat(web): safeguard Composio key clearing * fix(web): refresh Composio tool badges * feat(web): show connector logos * feat(daemon): localize Orbit prompt window * fix(daemon): clarify blocked connector callback closes * test(daemon): harden flaky async probes * fix(web): align Indonesian connector locale keys * test(web): align connector browser props * fix(web): preserve explicit credential clears Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): time out Composio logo proxy fetches Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): localize Indonesian connector settings copy Translate the new connector settings strings in the Indonesian locale and lock them with a regression test so this surface no longer silently falls back to English. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve discovered connector tools Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve onboarding autosave completion Keep settings autosave from clearing onboarding completion after the close gesture, and expose the desktop main types from source so workspace validation can typecheck packaged imports without a prior desktop build. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): defer Composio catalog cache hydration Load persisted Composio catalog data only after the runtime data directory is configured so startup cannot read another namespace's cache. 
Add a regression test that exercises the module-load singleton path. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): treat discovery completion independently Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve latest settings draft on close Use the latest persisted settings draft when the dialog closes so onboarding completion does not race a stale daemon sync and overwrite newer Orbit/template selections. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): avoid syncing draft Composio key on Orbit run Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): localize Orbit settings copy Translate the new Indonesian Orbit and autosave strings so the settings UI no longer falls back to English and the locale regression stays covered. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): prefer fresh connector catalog state Keep refetched connector status/auth data authoritative while retaining discovery-only tool metadata so the connectors UI stays consistent after refreshes. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): declare Indonesian locale fallback keys explicitly Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): inline Indonesian fallback strings for CI Replace the Indonesian locale's per-key English lookups with explicit strings so workspace typecheck no longer depends on brittle build-mode resolution in CI. Add a regression test that blocks those per-key English lookups from reappearing in the CI-sensitive fallback sections. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): restrict proxied connector logos to image MIME types Reject non-image upstream logo responses so the daemon never serves third-party HTML from its localhost origin. 
Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * test(e2e): align settings dialog regressions Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): decouple Orbit runs from media sync failures Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): keep SPA catch-all export-compatible Disable dynamic catch-all params for the exported SPA shell so Next.js static builds can emit the root route again. Add a regression test covering the route config against the web export mode. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve Orbit config and workspace routes Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): block SVG in connector logo proxy Reject SVG and other unsafe proxied logo responses so third-party logo content cannot execute under the daemon origin, while keeping raster logo fetches working and making rejected responses non-cacheable. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): fall back to static catalog for empty cache Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): disable Orbit run before connector gate resolves Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(desktop): export shipped desktop types Point the desktop ./main type export at the generated declaration so installed consumers resolve the published file set. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): restore persisted question form selections Render historical submitted answers directly so reloaded question forms keep their locked selections visible. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): retry forced media sync autosave Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): keep Composio logo timeout through body read Keep the Composio logo fetch timeout active until the response body is fully consumed so stalled body reads abort and clear the inflight cache entry. 
Add a regression test that proves a delayed body read times out and the next request can recover. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): refresh Orbit gate after connector auth Re-check connector availability when the settings window regains focus so Orbit unlocks as soon as a connector finishes authenticating in the same settings session. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): keep connector detail tool lists intact Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): ignore malformed Orbit summaries Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(e2e): stabilize design-system multi-select flow Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): cap Composio logo cache growth Bound the Composio logo cache with LRU eviction and expired-entry pruning so repeated untrusted logo requests cannot grow daemon memory without limit. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): bound proxied Composio logo payloads Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): align autosave settings tests Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): remove stray CSS conflict marker Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fixer: address PR #681 follow-up items Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): restore restart routes and connector flows * fix(web): keep SPA export route static * fix(web): stabilize chat scroll tests --------- Co-authored-by: lefarcen <935902669@qq.com>	2026-05-08 14:27:46 +08:00
VanJay	369d136d19	Add Docker Compose deployment workflow (#65 ) * Add Docker Compose deployment workflow * Address Docker deployment review feedback Harden publishing inputs and temporary credential handling, and tighten Docker runtime defaults requested by the PR review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix Docker publish build in CI mode Set CI=true during the image build so pnpm prune can run non-interactively inside Docker. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix Docker runtime dependency layout Use pnpm deploy for the daemon package so the runtime image includes production dependencies where Node resolves them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Use legacy pnpm deploy in Docker build Allow pnpm v10 deploy to package the daemon workspace without requiring injected workspace packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Align Docker runtime with Node 24 Use Node 24 for both build and runtime stages and update image verification for the workspace daemon dependency layout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Remove legacy OD_HOST Docker binding fallback Use OD_BIND_HOST as the single daemon bind-host setting for Docker deployment and origin validation. * Update Docker image verifier for daemon dist runtime Check the packaged daemon dist entrypoint and allow npm from the Node 24 runtime image while still rejecting build-only tools. * Allow private LAN browser origins for daemon * Share daemon origin validation helpers Move browser origin validation into a shared daemon module so tests exercise the production logic and cover the remaining private LAN edge cases. * Harden Docker Compose port exposure Bind the Compose deployment to localhost by default and pass the published port through to the daemon origin checks so host-port overrides remain same-origin. 
* Keep deployment hosts out of local-only no-origin checks Require an actual matching Origin before configured deployment origins can satisfy local-only daemon guards, preventing no-Origin remote clients from bypassing those checks. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: mrcfps <mrc@powerformer.com> Co-authored-by: lefarcen <935902669@qq.com>	2026-05-08 11:51:51 +08:00
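The private-LAN origin allowance and the "no Origin means no match" rule from the commit above might look roughly like this. It is a sketch under assumptions: the function name `isPrivateLanOrigin` is hypothetical, and the check is limited to the RFC 1918 IPv4 ranges, whereas the daemon's shared validation module may cover more cases (the commit mentions "remaining private LAN edge cases" without listing them).

```typescript
// Hypothetical check: does this Origin header point at an RFC 1918 IPv4 host?
// A missing Origin never matches, mirroring the rule that a configured origin
// must actually be presented before it can satisfy a guard.
function isPrivateLanOrigin(origin: string | undefined): boolean {
  if (!origin) return false;
  let url: URL;
  try {
    url = new URL(origin);
  } catch {
    return false; // not a parseable URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(url.hostname);
  if (!match) return false; // hostnames and IPv6 are out of scope for this sketch
  const octets = match.slice(1).map(Number);
  if (octets.some((o) => o > 255)) return false;
  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  if (a === 10) return true; // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  return a === 192 && b === 168; // 192.168.0.0/16
}

console.log(isPrivateLanOrigin("http://192.168.1.20:7456"));
```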
shangxinyu1	9b501f12a5	Support overriding the Codex executable path (#755 ) * Support overriding the Codex executable path * Replace save-as-template prompts with an in-app dialog * Seed local packaged app config from workspace * Fix packaged config and connection test overrides * Keep tools-pack mac config seeding self-contained * Require absolute CODEX_BIN overrides	2026-05-07 15:00:52 +08:00
Sid	33255a8fdf	Fix agent CLI config and workspace focus mode (#604 ) * fix agent CLI config and workspace focus mode * address CLI env review follow-ups	2026-05-06 16:06:56 +08:00
Justin Gao	cbe2baf596	feat(web): add skills & design systems management page in settings (#535 ) * feat(web): add skills & design systems management page in settings Add a new "Library" section in Settings that lets users browse, search, preview, and enable/disable skills and design systems. Disabled items are excluded from the create-project picker. Phase 1 — browse/toggle only. Closes #497 * fix(web): persist empty disabled lists and deduplicate DS preview Use empty array instead of undefined when all items are re-enabled so the daemon merge clears the key. Move DS preview panel outside the category group loop so it renders once, not per group. * fix(web): address review feedback on library settings Clear disabled lists on invalid daemon writes, memoize enabled item filters in App.tsx, and guard preview fetch against rapid-click race conditions. * fix(web): hydrate disabled lists from daemon and keep full lists in ProjectView Merge daemonConfig.disabledSkills/disabledDesignSystems during bootstrap so the values survive localStorage resets. Pass unfiltered skills and design systems to ProjectView so existing project metadata resolves correctly.	2026-05-05 22:50:25 +08:00
emilneander	33c3b94b42	feat(daemon): add od mcp - expose Open Design as an MCP server (#399 ) * feat(daemon): add `od mcp` subcommand for stdio MCP server Lets a coding agent in a different repo (Claude Code, Cursor, Zed) pull files from a locally-running OD project over the Model Context Protocol — no export/import zip dance. The MCP server is a thin stdio process that proxies read-only tool calls to the daemon's existing HTTP API; no daemon-side changes required. Exposes 8 tools: list_projects, get_project, list_files, get_file, list_skills, get_skill, list_design_systems, get_design_system Wired exactly like `od media`: a hoisted flag set, a SUBCOMMAND_MAP entry, a thin handler that resolves OD_DAEMON_URL and hands off to src/mcp.ts. Tool dispatch is a switch over the tool name; each branch fetches the matching daemon route and surfaces the response as MCP text content. Binary mimes return a clear error pending phase-2 support. Lifecycle gotcha worth flagging: Server.connect(transport) only starts the stdio reader; the promise resolves immediately. Without holding the function awaiting until transport/stdin close, cli.ts's top-level process.exit(0) kills the server before the first request arrives. The fix in src/mcp.ts holds until onclose / stdin EOF. Wire-up example for a consuming repo: { "mcpServers": { "open-design": { "command": "od", "args": ["mcp"], "env": { "OD_DAEMON_URL": "http://127.0.0.1:7456" } } } } New dep: @modelcontextprotocol/sdk (MIT, official Anthropic SDK). * feat(daemon): add MCP server instructions for zero-shot LLM context Hand the consuming LLM a system-prompt-style overview of the OD workflow so it picks the right tool without prompt-engineering on the user's side. Mentions get_artifact and project-name resolution ahead of their actual implementation; both ship in the same batch. * feat(daemon): resolve MCP project args by UUID, name, or substring Lets a consuming agent say `project: "recaptr"` instead of pasting a UUID. 
Match order: exact id → exact name (case-insensitive) → slug-normalized name (strips trailing " (N)", normalizes whitespace) → substring (errors if multiple). UUID inputs short-circuit and never hit the daemon. * feat(daemon): surface entryFile and kind on MCP get_project response Promote metadata.entryFile and metadata.kind to top-level fields so consumers (including get_artifact in this branch) can find the entry without digging through nested metadata blobs. * feat(daemon): add MCP get_artifact tool for bundle retrieval A design rarely lives in a single file. get_artifact pulls the entry HTML/JSX plus every sibling it references (tokens CSS, JSX modules, imported components) in one call, so a consuming agent doesn't need to parse HTML and round-trip per file. Three modes: auto (default): BFS over relative <script src>, <link href>, <img src>, <source/video src>, JSX import/from, CSS url(), with depth cap 3 and a visited set. CDN, data:, mailto:, anchors, and paths containing .. are skipped. all: every textual file in the project (mirror of /archive minus binaries). shallow: just the entry file (same as get_file). Output is a structured JSON blob with name/mime/size/content per file and the project's manifest metadata at the top. * feat(daemon): add /api/projects/:id/search route + MCP search_files Server-side substring search across textual project files. Returns file, 1-indexed line, and snippet, capped at 1000 matches. Exposed through the MCP layer as search_files(project, query, pattern?, max?). Treats the query as a literal substring (regex chars escaped) to avoid catastrophic-backtracking attacks from LLM-supplied input. Honors the project dir's existing path-safety guards via listFiles. * feat(daemon): add since= filter to /files route + MCP list_files arg Lets a consumer poll for "what's changed since I last looked" without re-walking every file. Daemon-side: parse since= as ms, filter listFiles output by mtime. MCP-side: forward as URL query. 
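The literal-substring search described above for the /api/projects/:id/search route can be sketched roughly as follows. This is a minimal illustration, not the daemon's code: `SearchMatch`, `escapeRegExp`, `searchFiles`, and the in-memory `files` map are hypothetical names, and case-sensitive matching is an assumption the commit text does not confirm. Only the behavior (literal match, 1-indexed line numbers, a match cap) is taken from the message.

```typescript
// Hypothetical shape of one search hit; field names follow the commit text
// ("file, 1-indexed line, and snippet").
interface SearchMatch {
  file: string;
  line: number; // 1-indexed
  snippet: string;
}

// Escape regex metacharacters so an LLM-supplied query is matched literally.
function escapeRegExp(query: string): string {
  return query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// `files` maps a file name to its text content (an assumption for this sketch).
function searchFiles(
  files: Record<string, string>,
  query: string,
  max = 1000,
): SearchMatch[] {
  const matches: SearchMatch[] = [];
  if (query === "") return matches;
  const re = new RegExp(escapeRegExp(query)); // case-sensitive by assumption
  for (const file of Object.keys(files)) {
    const lines = files[file].split("\n");
    for (let i = 0; i < lines.length; i++) {
      if (re.test(lines[i])) {
        matches.push({ file, line: i + 1, snippet: lines[i].trim() });
        if (matches.length >= max) return matches; // stop at the cap
      }
    }
  }
  return matches;
}
```

Because the pattern is fully escaped, a query such as `a.b` only matches the three literal characters and cannot trigger backtracking-heavy regex behavior.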
* feat(daemon): expose skills and design systems as MCP resources Catalog reads are stable reference material — they fit MCP's resources surface (LLM-passive) better than tools (LLM-active). Skills and design systems each become resources at od://skills/<id>/SKILL.md and od://design-systems/<id>/DESIGN.md; existing list_skills / get_skill / list_design_systems / get_design_system tools remain as fallbacks for clients that don't handle resources cleanly. * fix(daemon): tighten MCP correctness in get_artifact and resources Several silent-failure paths and minor footguns the first pass missed: - get_artifact auto: the entry's own fetch now raises a clear error instead of returning files: []. Previously a typo in `entry:` looked like an empty project. - get_artifact: invalid `include` value returns a clear error listing the valid modes instead of silently behaving as auto. - get_artifact all: includes binary files as metadata stubs to match auto's behavior. Both modes are now strict supersets of shallow. - extractRelativeRefs: gate JS-only patterns (import/from/require/ dynamic-import) by file mime/extension so prose in markdown or HTML doesn't generate spurious 404 round-trips on words like "imported from 'X'". - extractRelativeRefs: cover <iframe>, <audio>, srcset, and CSS @import — common in real OD output. - resources/list descriptions are collapsed to a single line (newlines + repeated whitespace -> one space) so MCP UIs that don't normalize whitespace render cleanly. - fetchProjectFile: 0-byte binary files no longer report size: null due to falsy short-circuit on Number(content-length). * perf(daemon): cache MCP project list for 5s in resolveProjectId A typical agent session calls list_files/get_file/get_artifact several times in a row, each with a project name argument. Each previously re-fetched /api/projects. Cache the list in module scope with a 5s TTL so back-to-back lookups are local; renames in the OD UI still propagate within a few seconds. 
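A module-scope cache with a short TTL, as the perf commit above describes, might look something like this. It is a sketch under assumptions: `makeTtlCache` and the injectable `now` clock are hypothetical and exist only to keep the example testable; the real resolveProjectId may be structured differently.

```typescript
// Wrap a loader so repeated calls within `ttlMs` reuse the last result.
// `now` is injectable purely so the sketch can be tested without real time.
function makeTtlCache<T>(
  load: () => T,
  ttlMs: number,
  now: () => number = Date.now,
): () => T {
  let entry: { value: T; at: number } | undefined;
  return () => {
    const t = now();
    if (entry && t - entry.at < ttlMs) return entry.value; // still fresh
    entry = { value: load(), at: t }; // expired or first call: reload
    return entry.value;
  };
}
```

If `load` returns a Promise (for example a fetch of /api/projects), the promise itself is what gets cached, so back-to-back lookups share one request. A rejected promise would also stay cached until the TTL expires, which a production version would probably want to handle.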
* feat(daemon): MCP UX polish — tool order, annotations, get_artifact maxBytes Three changes well-behaved MCP clients pick up automatically: - Tool ordering. list_projects + get_artifact are now first; LLMs that weight earlier entries surface the bundle path before per-file fetching. Catalog tools (list_skills, get_skill, list_design_systems, get_design_system) sit at the bottom; they are also exposed as MCP resources. - readOnlyHint / idempotentHint / openWorldHint annotations on every tool so clients can skip confirmation prompts on safe tools and let the LLM know re-running is fine. Per-tool `title` annotations give clients a friendlier display name than the snake_case tool id. - get_artifact gains a `maxBytes` arg (default 1.5MB). Once the accumulated textual content crosses the cap, remaining files are dropped and `truncated: true` is set on the bundle so the consumer knows to use list_files / get_file for the rest. * feat(daemon): expose user's active OD project/file via MCP The "what file are you on?" round-trip the agent had to do every session is now answered automatically. Three pieces: - Daemon: in-memory active-context slot with 5-minute TTL. POST /api/active sets {projectId, fileName}; GET /api/active returns the current value enriched with projectName, or {active:false} when the slot is empty/stale. Cleared on daemon restart. - Web: a small useEffect in App.tsx posts the active project + file to the daemon on every route change. Best-effort fire- and-forget; a missing daemon doesn't surface an error. - MCP: get_active_context tool (no args) and a matching MCP resource at od://focus/active. The tool is listed second, right after list_projects, so an LLM picks it up before asking for ids. Server instructions tell the model to call it FIRST when the user says "this file" / "the design I have open" / "what I'm looking at." 
End to end: user opens a project in OD, agent in another repo calls get_active_context() → gets {projectName: "recaptr", fileName: "recaptr-onboarding-4.html"}, then immediately calls get_artifact(project: "recaptr") with no further user input. * feat(daemon): make MCP project arg optional, fall back to active OD context get_artifact, get_project, get_file, search_files, and list_files now accept project as optional. When omitted, the MCP resolves project from /api/active so an agent in another repo can call search_files({ query: "Polaroid" }) without first asking the user "which project?". get_file and get_artifact also default their path/entry to the active file, so get_file({}) returns whatever the user is currently looking at. The implicit path stamps `usedActiveContext` on JSON responses (or a separate `[od:active-context …]` content block on get_file) so the agent can see exactly which project/file got chosen. Explicit project args pass through with zero added overhead. Cuts the common case from two MCP round trips (get_active_context → search_files) to one. Server instructions and get_active_context's own description are updated to point at the new default. * fix(daemon): require same-origin for /api/active POST and GET The active-context endpoint was added without isLocalSameOrigin guard. Since the daemon binds 0.0.0.0 by default, a LAN peer could GET it to learn what file the user has open, or POST it to redirect the MCP fallback to a project of their choice. Same-origin only is the right scope: the web app proxies its requests through Next.js on the daemon port, and the MCP runs over loopback in-process, so both legitimate callers pass. Pattern matches the existing /api/app-config etc. guards. 
* feat(daemon): add /api/mcp/install-info for cross-platform install snippets The Settings -> MCP server panel needs absolute paths to node and the daemon's built cli.js so it can render snippets that work on a fresh source clone (where `od` is not on PATH) and dodge the /usr/bin/od octal-dump tool that ships on macOS/Linux and would otherwise shadow ours. Endpoint returns: - command: process.execPath (the node binary running the daemon) - args: [<absolute path to dist/cli.js>, "mcp"] - daemonUrl: http://127.0.0.1:<port> - platform: process.platform (so the panel can localize ~/.cursor vs %USERPROFILE%\.cursor and Cmd vs Ctrl shortcuts) - cliExists / nodeExists: existsSync checks on both binaries - buildHint: human-readable build/reinstall instructions when either path is missing isLocalSameOrigin guard same as /api/active. Cached for 5s because the panel may re-fetch on every open and the paths cannot change without a daemon restart. Test file covers the happy path, cross-origin rejection, two allowed-Origin variants, and the cache by counting fresh resolves across rapid calls. 5/5 pass. * refactor(daemon): tighten MCP surface, trim descriptions, polish copy Three intertwined cleanups that all live in mcp.ts + cli.ts: 1. Drop catalog tools from MCP. list_skills / get_skill / list_design_systems / get_design_system are removed. The audience is a coding agent in a separate repo consuming Open Design's output; it cannot run skills (those are recipes Open Design uses to generate) and design-system DESIGN.md is reference material that already ships as an MCP resource. Keeping the catalog as tools cost ~350 token-overhead per turn for capabilities the agent could not act on. Tool count: 11 -> 7. 2. Trim tool descriptions. The active-context fallback explanation was repeated in 5 separate tool descriptions; hoisted into PROJECT_ARG and explained once in the server `instructions` block instead. Saves ~150-200 tokens per tools/list response. 3. User-facing branding pass. 
Tool titles, tool descriptions, resource names, error messages, comments, and `od mcp --help` now consistently use "Open Design" rather than "OD". Internal abbreviation `OD` is retained only inside the server instructions block where it is introduced inline as "Open Design (OD)" for compactness across multi-paragraph guidance. Em dashes replaced with hyphens throughout, per project style. * feat(web): add MCP server install panel in Settings New "MCP server" section in the Settings dialog, surfacing copy-paste install snippets for the major MCP-compatible coding agents (Claude Code, Cursor, VS Code, Antigravity, Zed, Windsurf). Highlights: - In-brand custom dropdown (reuses the existing .ds-picker pattern from the design-system / prompt-template pickers, click outside / Escape to close, chevron animates) instead of a native <select>. - Per-client snippet that uses absolute paths to node + cli.js fetched from /api/mcp/install-info on mount, so it works even when `od` is not on PATH. - Cursor gets a one-click "Install in Cursor" deeplink (cursor://anysphere.cursor-deeplink/mcp/install) that pops an approval dialog and writes the config for the user. UTF-8-safe base64 so paths with accented characters do not throw. - Per-OS path hints (~/.cursor on POSIX, %USERPROFILE%\.cursor on Windows) and keyboard shortcuts (Cmd vs Ctrl). - Build-required warning card when cli.js or the node binary does not exist on disk; deeplink button disables in that state. - Prominent "restart your client to pick up the new server" callout below the snippet, with per-client guidance. - Capability list ("what your agent can do") instead of a tool- name dump, so non-developer designers can also tell what is possible without reading MCP docs. README adds a short "Use Open Design from your coding agent" section that points at the panel and summarizes the per-client flow (one-click for Cursor, JSON merge elsewhere). Read-only by design; the daemon must be running locally. 
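The "UTF-8-safe base64" point in the install-panel commit above is worth a small illustration: calling `btoa` directly on a string throws for characters outside Latin-1, so the text is first encoded to UTF-8 bytes. The helper name `utf8ToBase64` is hypothetical, and this sketch does not attempt to reproduce the deeplink's parameter format.

```typescript
// Encode a string as base64 over its UTF-8 bytes rather than its UTF-16 code
// units, so accented or non-Latin path characters do not make btoa throw.
function utf8ToBase64(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]); // one char per byte, 0..255
  }
  return btoa(binary);
}
```

`TextEncoder` and `btoa` are available as globals in browsers and in recent Node.js releases, so the same helper should work in both the web panel and a test runner.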
* docs(readme): align MCP server section with the Settings panel The "Use Open Design from your coding agent" section had drifted from what the panel actually emits and lists. - Add Antigravity to the supported-client list (previously missing). - Drop the "(GitHub Copilot)" parenthetical from VS Code so the label matches the panel. - Fix the Claude Code line: we no longer emit a single `claude mcp add ...` shell command. The snippet is JSON; the panel additionally suggests `claude mcp add-json` as the safer way to apply it instead of hand-editing ~/.claude.json. - Swap the "find the Polaroid section" example for two more universal phrases ("build this in my app", "match these styles") that match what the panel surfaces. - Add a one-line "restart or reload your client after install" note - this was prominent in the panel and absent from the README. - Trim the /usr/bin/od octal-dump aside; it was technical detail that did not earn its space at the README intro level. * feat(web): add Codex CLI to the MCP server install panel Codex is a first-class supported coding agent (listed alongside Claude Code, Cursor, etc. in the README's PATH-detected agent table) but the install panel was missing it. Codex stores MCP server config at ~/.codex/config.toml (TOML, not JSON) under an `[mcp_servers.<name>]` table, and the same file is shared between the Codex CLI and the Codex IDE extension - so one install covers both. Added a 7th client entry that emits the right TOML snippet, expanded the snippet-lang union to include 'toml' (behaves like 'json' for whitespace handling, just a different syntax-highlight hint). For our minimal payload (just command + args), JSON.stringify happens to produce valid TOML literal values since TOML basic strings use the same double-quote escape rules as JSON, and TOML inline arrays match JSON array syntax. No new TOML serializer needed. README updated to list Codex among the supported clients. 
Schema verified against https://developers.openai.com/codex/mcp. * fix(daemon): accept any loopback origin in same-origin guard The previous port-pinned check required the request's Origin to match either the daemon's own port or OD_WEB_PORT. tools-dev does not pass OD_WEB_PORT to the daemon process, so any browser POST to /api/active proxied through the dev web (port 17573 etc.) was rejected with 403, and get_active_context always returned {active: false}. Relax to a loopback-prefix match: any http://127.0.0.1:*, http://localhost:*, or http://[::1]:* origin passes regardless of port. Cross-origin (https://evil.com) is still rejected. The trade-off is that another local web app on a different loopback port could now CSRF the daemon; same-origin checks are inherently a CSRF defense, not a network ACL. * fix(web): make Claude Code MCP snippet a real copyable one-liner claude mcp add-json open-design '<json>' takes only the inner server-config object, not the full {"mcpServers": ...} wrapper, and rejected the wrapped shape with "Invalid configuration: : Invalid input". Pass only the inner config, and inline the JSON into the command itself so the snippet is a real one-liner the user can copy and paste, no template substitution. * test(daemon): drop loopback-prefix assertions superseded by upstream origin policy The two proxy-flow allow tests were added in `ae13094` to cover our relaxed isLocalSameOrigin. Main's port-pinned implementation (from #365) now handles the dev-flow via the web sidecar proxy origin rewrite (#a719f02), making the relaxation -- and these tests -- unnecessary. Also replace the inline LOOPBACK__RE / isLocalSameOrigin replica in mcp-install-info.test.ts with a direct import from server.ts so both test files stay in sync with the production guard automatically. 
fix(daemon): bake daemon URL into MCP install-info args The install panel snippet previously emitted `od mcp` with no daemon URL, so the MCP server always fell back to the hardcoded default port 7456. When tools-dev starts the daemon on a non-default port the snippet silently targets the wrong daemon. Fix: include --daemon-url http://127.0.0.1:<port> as the third arg so the generated snippet is always tied to the running daemon's actual port. Update the matching mini-app and assertion in the install-info test. * fix(daemon): address MCP reviewer feedback - extractRelativeRefs: replace blanket `includes('..')` rejection with proper POSIX-style path normalization. `../tokens.css` in a nested project layout now resolves to `tokens.css` instead of being silently dropped. - getArtifact: add MAX_FILES=200 cap to BFS auto and include=all modes. Pass `remainingBytes` to fetchProjectFile so it can bail early when the server-advertised content-length would already exceed the budget. - resolveProjectId: return {id, name, source} instead of a bare id. Callers echo `resolvedProject` in the response when the match was by slug or substring, letting the agent confirm which project was chosen without an extra round-trip. - getFile: thread `resolved` through so substring matches surface the same `[od:resolved-project ...]` annotation. - @ts-nocheck: add a comment explaining the Zod-vs-JSON-Schema SDK mismatch so future contributors don't remove it accidentally. - get_active_context description: note the ~5-minute cache TTL. * test(daemon): restore @ts-nocheck on mcp-install-info test Dropped accidentally when replacing the import header. The directive suppresses expected test-file noise (baseUrl pre-assignment and res.json() unknown return type); keeping it avoids littering the test body with `as any` casts for zero real safety benefit. 
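The switch from a blanket `includes('..')` rejection to POSIX-style normalization, mentioned in the reviewer-feedback commit above, could be sketched as below. `resolveRelativeRef` is a hypothetical helper, not the actual extractRelativeRefs code, and it assumes external, absolute, and data: references were already filtered out by the caller.

```typescript
// Resolve `ref` against the directory of `fromFile` using POSIX-style
// segments. Returns null when the reference would climb above the project
// root, which is the case that still has to be rejected.
function resolveRelativeRef(fromFile: string, ref: string): string | null {
  const parts = fromFile.split("/").slice(0, -1); // directory of the referrer
  for (const seg of ref.split("/")) {
    if (seg === "" || seg === ".") continue; // no-op segments
    if (seg === "..") {
      if (parts.length === 0) return null; // escape attempt
      parts.pop();
    } else {
      parts.push(seg);
    }
  }
  return parts.join("/");
}
```

With this approach `../tokens.css` referenced from `pages/index.html` resolves to `tokens.css`, while the same reference from a root-level file is rejected rather than silently dropped along with every other `..` path.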
* docs(readme): expand MCP section with why-MCP, security model, and recovery note - Soften "No zip export, no copy-paste" to "Replaces the export-then-attach loop" per reviewer feedback. - Add "Why MCP?" paragraph explaining the structured-API benefit over zip exports. - Add daemon-not-running recovery note (clear error, not a crash; start with pnpm tools-dev and retry). - Add security model callout: read-only, loopback-only, Host/Origin guard rejects non-loopback requests. * docs: complete security model and daemon recovery notes for MCP section 8.3: Expand README security model to include stdio child process context, trust framing (treat like a VS Code extension), and OD_BIND_HOST opt-in for LAN exposure. 8.4: Replace terse "daemon not running" note in README with a full recovery sentence covering the start-agent-before-Open-Design case. Add the same recovery note as a footer paragraph in IntegrationsSection so users see it in the Settings panel without needing to read the README. * fix(daemon): pass resolved through get_artifact so substring matches echo resolvedProject * feat(daemon): add MCP unit tests and fill description/instructions gaps - Export extractRelativeRefs, resolveProjectId, resolveProjectArg, withActiveEcho, fetchProjectFile, getArtifact for testing - mcp-extract-refs.test.ts: 10 cases covering flat, nested, deep, escape attempts, external/data/anchor/mailto URLs, srcset - mcp-get-artifact.test.ts: MAX_FILES=200 cap, maxBytes cap, per-file content-length pre-check via fetchProjectFile - mcp-resolve-project.test.ts: uuid/exact/slug/substring source values, ambiguity error, withActiveEcho resolvedProject stamping - get_artifact maxBytes description now mentions the 200-file cap - Instructions block now mentions resolvedProject field and when it appears (slug or substring match) * docs(daemon): document MCP active-context TTL and surface wake-up hint Address PR #399 review item P2.5 (active-context TTL undocumented) plus the related UX gap where 
the agent had no way to tell the user that clicking around in Open Design refreshes the cache. - PROJECT_ARG, get_artifact entry, get_file path: append TTL note to argument descriptions so agents see the ~5-minute fallback window. - get_active_context: when /api/active reports active:false, return an explicit hint string explaining the recovery action ("ask the user to click into a project") instead of a bare {active:false} the agent can't act on. - get_active_context tool description: mention the new hint payload. - resolveProjectArg error: extend the missing-active-context message with the same TTL + recovery wording for tool calls that omit project= and have no fallback. * feat(daemon): add offset/limit pagination to MCP get_file Real-world MCP usage hit a wall on large files: get_file returned the full body, the agent decided the result was too large for its context budget, and recovered by spawning a sub-agent that ran Python with manual brace-matching for several minutes. That defeats the value proposition of skipping zip-export. Mirror Claude Code's Read tool semantics: get_file now accepts optional offset (0-indexed line) and limit (default 2000) args, slices the file in mcp.ts after fetching from the daemon, and stamps an [od:file-window offset=.. returnedLines=.. totalLines=..] marker on sliced or truncated responses so the agent can page by re-calling with the next offset. - Tool definition: add offset/limit args, expand description. - getFile helper: line-split, slice, marker, range clamp at EOF. - Instructions block: mention pagination in the get_file bullet. - Binary rejection unchanged. - New tests in mcp-get-file.test.ts cover default behavior, limit truncation, mid-file offset, offset past EOF, and binary rejection. 
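The offset/limit windowing described in the pagination commit above might be implemented along these lines. This is a sketch only: `sliceLines` and `FileWindow` are hypothetical names, and the exact marker text is modeled on the `[od:file-window ...]` shape quoted in the message rather than copied from mcp.ts.

```typescript
// Result of windowing a file: the selected text plus an optional marker that
// tells the caller how to request the next page.
interface FileWindow {
  text: string;
  marker: string | null;
}

// offset is a 0-indexed line; limit defaults to 2000 lines, per the commit.
function sliceLines(content: string, offset = 0, limit = 2000): FileWindow {
  const lines = content.split("\n");
  const total = lines.length;
  const start = Math.min(Math.max(0, offset), total); // clamp at EOF
  const end = Math.min(start + Math.max(0, limit), total);
  const page = lines.slice(start, end);
  const partial = start > 0 || end < total; // only stamp sliced responses
  return {
    text: page.join("\n"),
    marker: partial
      ? `[od:file-window offset=${start} returnedLines=${page.length} totalLines=${total}]`
      : null,
  };
}
```

An agent can page through a large file by re-calling with `offset` advanced by `returnedLines` until the two sum to `totalLines`.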
* fix(daemon): set truncated: true when per-file content-length pre-check fires When fetchProjectFile throws because a file's advertised content-length exceeds the remaining byte budget, both the include=all loop and the auto BFS loop silently skipped the file without setting truncated: true. The bundle could then report truncated: false even though files were dropped. Introduce BudgetExceededError as a sentinel so callers can distinguish a budget rejection (truncated: true) from a genuine fetch failure (404, network) that should just be skipped. Both getArtifact call sites now check instanceof BudgetExceededError and set truncated accordingly. Adds a regression test: 5 files of 250 bytes with explicit content-length, maxBytes=400. Only file 0 fits; files 1-4 each exceed the remaining 150 bytes. totalTextBytes never reaches maxBytes, so only the new path sets truncated=true. Previously the bundle reported truncated: false.	2026-05-04 22:34:17 +08:00
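The sentinel-error pattern from the truncation fix above can be illustrated with a small stand-in. `BudgetExceededError` is the name used in the commit; everything else here (`collect`, the `sizes` map standing in for advertised content-length, the synchronous fetch) is a hypothetical simplification of the real getArtifact loops.

```typescript
// Sentinel so callers can tell "dropped for budget" from "failed to fetch".
class BudgetExceededError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "BudgetExceededError";
    // Keeps instanceof working when compiled to ES5 targets.
    Object.setPrototypeOf(this, BudgetExceededError.prototype);
  }
}

interface Bundle {
  files: string[];
  truncated: boolean;
}

// `sizes` stands in for each file's advertised content-length.
function collect(
  names: string[],
  sizes: Record<string, number>,
  maxBytes: number,
): Bundle {
  const bundle: Bundle = { files: [], truncated: false };
  let used = 0;
  const fetchFile = (name: string, remaining: number): number => {
    const size = sizes[name];
    if (size === undefined) throw new Error(`404: ${name}`);
    if (size > remaining) throw new BudgetExceededError(name);
    return size;
  };
  for (const name of names) {
    try {
      used += fetchFile(name, maxBytes - used);
      bundle.files.push(name);
    } catch (err) {
      // Budget rejections mark the bundle; other failures are just skipped.
      if (err instanceof BudgetExceededError) bundle.truncated = true;
    }
  }
  return bundle;
}
```

Running it with five 250-byte files and a 400-byte budget mirrors the regression test in the commit: only the first file fits, and the bundle is marked truncated even though the accumulated byte count never reaches the cap.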
Ajay Satish	9d700ec74f	feat(daemon): persist code agent startup (#255 ) * feat(daemon): persist code agent startup * fix: complete all suggestions * fix: types for app config * chore: revert local origin * chore: format to single quotes * fix: duplicate headers * fix: isLocalSameOrigin rewriting issue --------- Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-03 12:14:04 +08:00

12 commits