open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Author	SHA1	Message	Date
lefarcen	df8a0faff6	feat(runtimes): register AMR (vela) as an ACP stdio agent (#2355)

* feat(runtimes): register AMR (vela) as an ACP stdio agent

AMR is the vela CLI's ACP runtime mode. `vela agent run --runtime opencode` speaks ACP JSON-RPC over stdio (see vela's `specs/current/runtime/manual-agent-run-openrouter.md`); per `docs/new-agent-runtime-acp.md` we expose it through the same `streamFormat: 'acp-json-rpc'` transport that already powers Hermes, Devin, Kimi, etc.

The new `defs/amr.ts` is the entire wiring: `buildArgs` returns `['agent', 'run', '--runtime', 'opencode']`, `fetchModels` reuses `detectAcpModels`, and the fallback list seeds the OpenRouter ids vela's e2e baseline uses. `executables.ts`/`app-config.ts`/`metadata.ts` get the matching `VELA_BIN`/`VELA_LINK_URL`/`VELA_RUNTIME_KEY`/`VELA_OPENCODE_BIN` allowlist + install/docs URLs, so users can configure the per-agent env in Settings without leaking into other adapters.

Coverage: `tests/fixtures/fake-vela.mjs` is a minimal ACP stub that returns the documented `initialize` / `session/new` / `session/set_model` / `session/prompt` shapes; `tests/amr-acp-integration.test.ts` spawns it via `child_process.spawn` and drives a full turn through `attachAcpSession` and `detectAcpModels`, so the ACP transport contract for AMR is end-to-end verified locally even before a real `vela` binary is installed.

Validated:
- pnpm guard
- pnpm typecheck (all workspace projects)
- pnpm --filter @open-design/daemon test (2881/2881)

Deferred: real OpenRouter-backed turn through a built `vela` binary; the runtime def needs no changes for that path, only `VELA_RUNTIME_KEY` and `VELA_LINK_URL` in env (or Settings).

* fix(runtimes/amr): pin a concrete default model and bare openai ids

End-to-end validation against a freshly-built `vela` (nexu-io/vela@main) + OpenRouter surfaced two contract details the first AMR runtime def got wrong:

1. vela rejects `session/prompt` with `session/set_model must be called before session/prompt`. attachAcpSession in apps/daemon/src/acp.ts skips set_model whenever the picked model is the synthetic 'default' id, so AMR's fallback list must NOT include DEFAULT_MODEL_OPTION. The def now ships a concrete `gpt-5.4-mini` as both `fetchModels`' default option and `fallbackModels[0]`, which makes attachAcpSession always send a real `session/set_model` for AMR turns.
2. `vela --runtime opencode` auto-prepends `openai/` to whatever modelId it forwards to opencode's openai provider. With OpenRouter-style ids like `openai/gpt-5.4-mini`, opencode receives the double-prefixed `openai/openai/gpt-5.4-mini` and replies `ProviderModelNotFoundError`. The new fallback list ships the bare ids opencode's openai registry actually knows about (gpt-5.4, gpt-5.4-mini, gpt-5.4-fast, etc.).

Stub + tests:
- tests/fixtures/fake-vela.mjs now enforces the set_model gate the same way real vela does, so a regression that silently goes back to model: 'default' would surface as a fatal error in tests instead of a hidden production failure.
- tests/amr-acp-integration.test.ts pins both contracts: no 'default' / no 'openai/' prefix in fallbackModels, and a negative case that asserts session/prompt fails when no model is set.

Adds `apps/daemon/scripts/verify-amr-real-vela.mjs`, a small dev-time runner that drives `attachAcpSession` against a real `vela` binary and prints the daemon's chat events, so future protocol drift can be checked against an actual OpenRouter call.

Verified locally: `vela agent run --runtime opencode` + OpenRouter returns the prompted string ("AMR-E2E-PASS") through the full daemon pipeline; daemon test suite stays 2883/2883.
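The double-prefix failure and the two fallback-list contracts described above can be illustrated with a small sketch. `forwardToOpencode` and `assertAmrFallbackIds` are hypothetical names introduced here for illustration. The first only models the `openai/` auto-prepend that the commit message attributes to `vela --runtime opencode`; neither is taken from the repository.

```typescript
// Hypothetical model of the behavior described above: vela prepends
// `openai/` to whatever model id it forwards to opencode.
function forwardToOpencode(modelId: string): string {
  return `openai/${modelId}`;
}

// Hypothetical invariant check mirroring the two pinned contracts:
// no synthetic 'default' id and no 'openai/' prefix in the fallback list.
function assertAmrFallbackIds(ids: string[]): void {
  for (const id of ids) {
    if (id === 'default') {
      throw new Error("fallback list must not contain 'default'");
    }
    if (id.indexOf('openai/') === 0) {
      throw new Error(`fallback id must be bare, got ${id}`);
    }
  }
}

// An OpenRouter-style id ends up double-prefixed; a bare id does not.
const doublePrefixed = forwardToOpencode('openai/gpt-5.4-mini');
const singlePrefixed = forwardToOpencode('gpt-5.4-mini');

// The bare ids named in the commit message satisfy both contracts.
assertAmrFallbackIds(['gpt-5.4', 'gpt-5.4-mini', 'gpt-5.4-fast']);
```

Here `doublePrefixed` is `openai/openai/gpt-5.4-mini`, the id the commit message says opencode rejects, while `singlePrefixed` is `openai/gpt-5.4-mini`.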
* fix(runtimes/amr): substitute concrete model when chat run sends 'default'

A plugin-driven AMR run from the UI surfaced a real-world hole in the prior commit:

    json-rpc id 3: session/set_model must be called before session/prompt

The Default-design-router plugin (and any caller that doesn't pin a real model) sends `model: 'default'` straight through, which the AMR runtime def cannot accept: vela rejects `session/prompt` without `session/set_model`, and attachAcpSession skips set_model whenever model === 'default'. Just leaving DEFAULT_MODEL_OPTION out of the adapter's `fallbackModels` is not enough: the chat-run handler in server.ts still forwarded 'default' verbatim.

This adds `resolveModelForAgent(def, resolved, env?)` as the single source of truth for the substitution:

1. If the caller picked a real id, pass it through.
2. Else, if `def.defaultModelEnvVar` is set and the daemon process env has a non-empty value for it, return that (operator escape hatch; see below).
3. Else, if the def's `fallbackModels` does NOT contain a 'default' id, return `fallbackModels[0].id`.
4. Else, return the original value (the historic shape; defs that list 'default' themselves are untouched).

AMR sets `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`, so when opencode's openai-provider registry deprecates `gpt-5.4-mini` upstream, an operator can swap the fallback id without a code change by exporting `VELA_DEFAULT_MODEL=gpt-5.5` before launching tools-dev / od. Note that the env var must live in the daemon's `process.env` (Settings-UI per-agent env values only reach the spawned child, not the daemon's resolver); the new field's docblock spells this out.

Coverage:
- `tests/runtimes/resolve-model.test.ts`: 8 unit tests covering all four resolver branches plus the env-override happy path / fallback / ignore-when-user-picked-a-real-id case.
- `pnpm --filter @open-design/daemon typecheck` clean.
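The four resolver branches listed above can be sketched as follows. This is a minimal reconstruction from the commit message, not the daemon's code: the `AgentDef` and `ModelOption` shapes are trimmed-down assumptions, treating an empty or missing value as "no real id" is an assumption, and `env` defaults to an empty object here so the sketch stays self-contained (the message says the real helper consults the daemon's process env).

```typescript
// Assumed, trimmed-down shapes; the real definitions live in the daemon.
interface ModelOption {
  id: string;
}
interface AgentDef {
  fallbackModels: ModelOption[];
  defaultModelEnvVar?: string;
}

const DEFAULT_MODEL_ID = 'default';

function resolveModelForAgent(
  def: AgentDef,
  resolved: string | null | undefined,
  env: Record<string, string | undefined> = {},
): string | null | undefined {
  // 1. The caller picked a real id: pass it through untouched.
  if (resolved && resolved !== DEFAULT_MODEL_ID) return resolved;

  // 2. Operator escape hatch: a non-empty env override wins.
  if (def.defaultModelEnvVar) {
    const override = env[def.defaultModelEnvVar];
    if (override) return override;
  }

  // 3. Defs that do not list 'default' get their first concrete fallback id.
  const listsDefault = def.fallbackModels.some((m) => m.id === DEFAULT_MODEL_ID);
  const first = def.fallbackModels[0];
  if (!listsDefault && first) return first.id;

  // 4. Historic shape: defs that list 'default' themselves are untouched.
  return resolved;
}

// Usage with an AMR-like def and a def that still lists 'default'.
const amrLike: AgentDef = {
  fallbackModels: [{ id: 'gpt-5.4-mini' }],
  defaultModelEnvVar: 'VELA_DEFAULT_MODEL',
};
const legacyLike: AgentDef = { fallbackModels: [{ id: 'default' }, { id: 'some-model' }] };

const picked = resolveModelForAgent(amrLike, 'gpt-5.4', { VELA_DEFAULT_MODEL: 'gpt-5.5' });
const fromEnv = resolveModelForAgent(amrLike, 'default', { VELA_DEFAULT_MODEL: 'gpt-5.5' });
const fromFallback = resolveModelForAgent(amrLike, 'default', {});
const untouched = resolveModelForAgent(legacyLike, 'default', {});
```

With these inputs, `picked` stays `gpt-5.4`, `fromEnv` becomes `gpt-5.5`, `fromFallback` becomes `gpt-5.4-mini`, and `untouched` remains `default`.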
* chore(runtimes/amr): move AMR to the top of the base agent list So `AMR (vela)` shows up first in the agent picker / status views, ahead of claude / codex. Pure ordering change; no behavior delta. * feat(amr): Sign-in / Sign-out button on the AMR Settings card The first half of the AMR work assumed the operator would set VELA_RUNTIME_KEY / VELA_LINK_URL on the daemon process and never surfaced login state to users. This adds the missing UX so a fresh install can drive the full path from Settings: - GET /api/integrations/vela/status reads ~/.vela/config.json for the active profile and returns { loggedIn, profile, user } (without leaking the runtime/control keys themselves). - POST /api/integrations/vela/login spawns `vela login` once (409 if one is already in flight). The vela CLI opens the user's browser to the device-authorization page itself — Open Design only needs to kick the subprocess off. - POST /api/integrations/vela/logout removes ~/.vela/config.json so the next status read returns logged-out. `AmrAgentCard` is a dedicated agent-card component for AMR because the existing `<button>` row can't host an interactive sub-control (nested interactive elements). It polls /status after a login click until the daemon reports loggedIn=true (or 5 minutes elapse), and exposes a Sign-out action on hover. Other adapters (claude, codex, hermes, …) keep their existing `<button>` card. i18n: 8 new keys (settings.amrLogin / Logout / LoggingIn / etc.) added to en + zh-CN. Other locales spread `en` and inherit the English copy until translations land. Coverage: - `tests/integrations/vela.test.ts` pins the config.json reader against a tmp HOME — including the negative case where a profile has user info but no runtimeKey (still logged-out), and the secret-leak guard ("rt-secret-" must not appear in the projection payload). 
- `tests/components/AmrAgentCard.test.tsx` covers all four UI states (logged-out, logging-in, logged-in, logging-out) plus the click-propagation invariant the divergent card was built to keep. `pnpm --filter @open-design/daemon test` 2901 / 2901 passing. `pnpm --filter @open-design/web test` 1719 / 1719 passing. `pnpm typecheck` + `pnpm guard` clean. Dev script side-effects: `apps/daemon/scripts/verify-amr-real-vela.mjs` no longer requires both VELA_RUNTIME_KEY and VELA_LINK_URL — if VELA_PROFILE is set, the vela CLI is allowed to resolve credentials from `~/.vela/config.json`. Added the two AMR `.mjs` fixtures to `scripts/guard.ts` allowlist with the executable-fixture / dev-runner rationale. fix(connection-test): substitute model for AMR before attachAcpSession The chat-run path in server.ts already routes the requested model through `resolveModelForAgent` so AMR / vela (whose CLI demands an explicit `session/set_model` before `session/prompt`) gets the def's first concrete fallback id when the chat run ships `model: 'default'`. `connectionTest.ts` was wiring `attachAcpSession({ ..., model: model ?? null })` directly, which made the Test Connection button on the AMR Settings card deadlock with the same `session/set_model must be called before session/prompt` error the chat-run path already handles — surfaced as a permanent "Testing connection…" spinner in the UI. Reuse the same helper here so Test Connection mirrors chat-run behavior. * test(amr): three-layer end-to-end coverage for the AMR login + turn flow The PR up to this point shipped runtime + UI code with unit-level Vitest coverage. This commit adds the cross-layer regression net the live demo relied on: 1. 
apps/daemon/tests/integrations/vela.routes.test.ts (HTTP, Vitest) Spins up the real daemon Express app via `startServer({port:0,...})`, persists `agentCliEnv.amr.VELA_BIN = <fake>` into app-config.json, and exercises every /api/integrations/vela/* endpoint against the extended fake-vela stub: - status reads ~/.vela/config.json under various states - login spawns the fake, waits for config.json to appear, returns pid + startedAt + profile - 409 already-running guard with the stub's delay knob - logout removes the file (idempotent) - secrets (runtimeKey / controlKey) never leak in the projection - login → status round-trip flips loggedIn=false → true 2. e2e/tests/amr/turn.test.ts (tools-dev orchestrated, Vitest) Boots a namespaced daemon + web pair through `createSmokeSuite`, inlines a self-contained fake `vela` binary that handles BOTH `vela login` (writes ~/.vela/config.json) and `vela agent run --runtime opencode` (ACP stdio with the `session/set_model must precede session/prompt` gate the real binary enforces), then drives a complete /api/runs lifecycle for `agentId: 'amr', model: 'default'` and asserts the assistant message captures the fake's streamed text. This is the test that would have surfaced today's plugin-default-model regression (the `set_model before prompt` error) at PR time instead of demo time. 3. e2e/ui/amr-login-pill.test.ts (Playwright) Mocks /api/agents + /api/integrations/vela/{status,login,logout} to drive the Settings AMR card through the full Sign in → Signed in → Sign out cycle. Pins the AmrLoginPill polling contract and the aria-label semantics (the pill's accessible name is "Sign out" once logged in, regardless of which label the hover-state text shows). fake-vela.mjs extensions: - Handles `vela login` argv by writing ~/.vela/config.json for the active VELA_PROFILE and exiting 0 — mirrors real vela's on-disk side-effect without the device-auth loop. 
- FAKE_VELA_LOGIN_DELAY_MS knob so route tests can observe the in-flight state of the spawn lifecycle. - FAKE_VELA_LOGIN_USER_EMAIL / _USER_PLAN to assert the surfaced user fields end-to-end. Validated: - `pnpm guard` + `pnpm typecheck` (all workspace projects) - `pnpm --filter @open-design/daemon test`: 2998 / 2998 passing, including the new 8-test integration suite. - `cd e2e && pnpm test tests/amr`: 1 / 1 passing. - `cd e2e && pnpm exec playwright test ui/amr-login-pill.test.ts`: 1 / 1 passing (6.7s). * feat(amr): package native cli and refine login ui * feat(amr): wire vela cli beta packaging * docs(amr): document vela ci packaging review * docs(amr): refine vela ci integration review * fix(ci): refresh nix pnpm dependency hashes * fix(pack): clean up Vela CLI packaging * fix(pack): bundle Vela CLI support files * fix(amr): recover login attempts from stale auth state * test: expand AMR and automations coverage * fix(amr): address review follow-ups * test(web): align tasks fixtures with contracts * fix(daemon): type wildcard route params * fix(ci): refresh PR merge validation * fix(amr): clear env credentials on logout * feat(settings): inline local CLI model configuration * fix(amr): recognize daemon env credentials * [codex] Fix Vela companion packaging (#2979) * Fix Vela companion packaging * Update Nix pnpm dependency hashes * [codex] Surface AMR account failures (#2980) * fix: surface AMR account failures * fix: cover AMR recovery error guidance * chore: bump beta base version to 0.8.1 (#2990) * Fix AMR profile and packaged runtime review issues * Detect packaged AMR OpenCode companion tree * feat(web): polish AMR frontend flows * Polish AMR onboarding card * fix: read AMR login state from dot-amr config (#3048) * test: tighten AMR credential and packaging coverage * test: restore AMR executable test env helper * [codex] Fix packaged mac Dock identity and AMR label (#3076) * Fix packaged mac sidecar Dock identity * Rename AMR assistant label * Fix AMR live 
models and dot-amr login state (#3073) * fix: read AMR login state from dot-amr config * fix: load live AMR models before runs * fix: point AMR onboarding link to production wallet * fix: address AMR model review feedback * fix: persist live AMR model fallback * [codex] Fix AMR link catalog model ids (#3088) * Fix packaged mac sidecar Dock identity * Rename AMR assistant label * Fix AMR link catalog model ids * Fix AMR model normalization typecheck * Use live AMR model for default runs * fix: polish AMR runtime settings UI * Accelerate AMR startup defaults (#3092) * Surface AMR insufficient balance wallet URL (#3099) * fix(web): polish onboarding controls (#3112) * fix(web): show CLI scan loading state * Avoid duplicate AMR wallet recharge links (#3117) * Avoid duplicate AMR wallet recharge links * Use Vela CLI 0.0.3 test package * chore(nix): refresh pnpm deps hash * Fix AMR wallet guidance display --------- Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com> * chore(pack): pin Vela CLI 0.0.3-test.1 (#3127) * chore(nix): refresh pnpm deps hash * chore(pack): pin Vela CLI 0.0.3 * chore(nix): refresh pnpm deps hash * fix(web): suppress AMR exit 130 fallback (#3136) * feat(web): nudge users to hosted AMR on model/auth/quota failures (#3083) * feat(web): nudge users to hosted AMR on model/auth/quota failures When a non-AMR agent run fails with an auth / quota / upstream model error, surface an inline nudge under the error pill linking to Open Design's hosted AMR gateway (https://open-design.ai/amr). The nudge fires `surface_view` (element=run_failed_toast) on impression and `ui_click` (element=go_amr) on the link. Also teach the daemon to classify CLI-agent auth/quota/upstream failures (Claude Code, codex, ...) into specific API error codes (AGENT_AUTH_REQUIRED / RATE_LIMITED / UPSTREAM_UNAVAILABLE) instead of the generic AGENT_EXECUTION_FAILED, so both the error message and the nudge key off accurate codes. 
AMR's own runs are excluded from the nudge; they keep the dedicated sign-in / recharge affordances.

* feat(web): rework failed-run AMR guidance into per-case error UI

Replace the single inline nudge with a per-case failed-run experience driven by the run's error code + agent:

- The error card is now neutral gray (was red) and always carries a retry button; it is driven by the persisted per-message error event so it survives a reload.
- Non-AMR agent hitting a model/auth/quota wall: a theme-color promotion card under the error card offers "switch to AMR & retry", which switches the run to AMR, opens Settings on the AMR card, and auto-retries once the account signs in (ProjectView polls vela login status, independent of the Settings pill lifecycle, with success / 5-min-timeout / unmount exits).
- AMR agent unauthorized: clearer copy + an "authorize & retry" button.
- AMR agent out of balance: clearer copy + a "top up" button to the AMR wallet, with manual retry.
- Settings AMR card: when opened from the nudge, it scrolls into view and pulses, and an authorize-button coachmark (a fake hand cursor that rises in and dismisses on hover) points at the sign-in control when not yet authorized.

analytics: surface_view (run_failed_toast) on the promotion card and ui_click (go_amr) on its action are retained. i18n adds chat.amrCard.* and chat.amrError.* (en / zh-CN / zh-TW translated; other locales fall back to en) and drops the old chat.amrErrorGuidance keys.

* fix(daemon): require status context for numeric service-failure codes

Per review on #3083: the model-service classifier matched bare HTTP status numbers (`500`, `502`, `429`, `401`), so ordinary CLI output like `line 500`, `read 502 bytes`, or `exit code 401` could be misclassified as a provider outage / auth wall and wrongly surface the AMR nudge. Now a status number only counts when it carries explicit context (`HTTP 500`, `status 503`, `code: 401`, `502 Bad Gateway`); textual provider phrases (overloaded, bad gateway, service unavailable, rate limit, …) are unchanged. Adds fixtures proving unrelated numeric output stays null.

* fix(web): keep error pill for failed runs ChatPane's card doesn't cover

Per review on #3083: the per-message gray error pill was suppressed for every persisted error status event, but ChatPane only renders the replacement top-level error card for `retryableAssistantMessage` (the last failed assistant). So a failed turn that is no longer last (after a follow-up) or an older failed run in history showed neither the pill nor the card; its error detail vanished, undercutting reload/history survival. ChatPane now passes `errorCardOwnerId` (the assistant id whose error the card represents); AssistantMessage suppresses only that one pill and keeps rendering StatusPill for all other error events.

* fix(daemon): don't treat a process exit code as an HTTP status

Follow-up to review on #3083: the status-context helper accepted a bare `code` prefix, so `exit code 401` / `process exited with code 429` still matched and got classified as AGENT_AUTH_REQUIRED / RATE_LIMITED (the very `exit code 401` case the comment calls out as noise). `code` now only counts when qualified (`status code` / `error code` / `response code`) or punctuation-bound (`code: 401`); bare `exit code N` no longer matches. Adds fixtures for exit-code lines returning null.

* chore(web): translate AMR card / error keys for 16 remaining locales

PR #3083 added 10 new `chat.amrCard.` / `chat.amrError.` keys but only provided en/zh-CN/zh-TW translations; the other 16 locales fell back to English. Translate the card title/body, three chips, primary CTA, and the AMR self-error (auth / balance) messages and buttons for ar, de, es-ES, fa, fr, hu, id, it, ja, ko, pl, pt-BR, ru, th, tr, uk.
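The status-context rule from the two daemon fixes above can be sketched with a handful of regular expressions. This is an illustrative reconstruction, not the daemon's classifier: `classifyServiceFailure`, the exact patterns, the reason phrases beyond `Bad Gateway`, and the status-to-code mapping (401 to auth, 429 to rate limit, 5xx to upstream) are all assumptions.

```typescript
type FailureCode = 'AGENT_AUTH_REQUIRED' | 'RATE_LIMITED' | 'UPSTREAM_UNAVAILABLE';

// A 3-digit number only counts when it carries explicit status context.
const STATUS_IN_CONTEXT: RegExp[] = [
  /\bhttp\s+(\d{3})\b/i, // "HTTP 500"
  /\bstatus(?:\s+code)?\s*[:=]?\s*(\d{3})\b/i, // "status 503", "status code 429"
  /\b(?:error|response)\s+code\s*[:=]?\s*(\d{3})\b/i, // "error code 401"
  /\bcode\s*[:=]\s*(\d{3})\b/i, // punctuation-bound "code: 401"
  /\b(\d{3})\s+(?:bad gateway|service unavailable|too many requests|unauthorized)\b/i,
];

// Assumed mapping from HTTP status to the API error codes named above.
function statusToFailureCode(status: number): FailureCode | null {
  if (status === 401) return 'AGENT_AUTH_REQUIRED';
  if (status === 429) return 'RATE_LIMITED';
  if (status >= 500 && status <= 599) return 'UPSTREAM_UNAVAILABLE';
  return null;
}

function classifyServiceFailure(output: string): FailureCode | null {
  for (const pattern of STATUS_IN_CONTEXT) {
    const match = pattern.exec(output);
    if (match && match[1]) {
      const code = statusToFailureCode(Number(match[1]));
      if (code) return code;
    }
  }
  // Textual provider phrases still classify without any number.
  if (/\brate limit/i.test(output)) return 'RATE_LIMITED';
  if (/\b(?:overloaded|bad gateway|service unavailable)\b/i.test(output)) {
    return 'UPSTREAM_UNAVAILABLE';
  }
  return null;
}

// Contextual numbers classify; bare numbers and exit codes stay null.
const samples = ['HTTP 500', 'code: 401', 'line 500', 'exit code 401'].map(classifyServiceFailure);
```

Under these assumptions `samples` holds `UPSTREAM_UNAVAILABLE`, `AGENT_AUTH_REQUIRED`, `null`, `null`, which matches the fixtures the commit messages describe.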
* fix(amr): address review feedback on #2355 Targeted fixes for the unresolved review threads on #2355. Each fix includes / updates a focused test. - runtimes/executables.ts: `packagedVelaOpenCodeCompanionTree` now verifies the inner `opencode` executable exists + is runnable, not just the directory. This closes the false-positive availability path that let `detectAgents()` surface AMR as available even when the packaged companion was empty / partially copied (mrcfps, 4 threads). - runtimes/executables.ts: `resolveAmrOpenCodeExecutable` now prefers the bundled `<OD_RESOURCE_ROOT>/bin/libexec/opencode/opencode` over a stale `opencode` on the user's PATH, so packaged AMR builds can't be hijacked by a global installation. - web/EntryShell.tsx: when the Local CLI scan returns an available agent and the previously-selected agent is AMR, switch the selection to the first available local agent so the runtime and persisted agent agree before Continue. - server.ts (model-probe branch): for AMR, check `readVelaLoginStatus` BEFORE rejecting on an empty live-model catalog — a signed-out user was getting `AMR_MODEL_UNAVAILABLE` ("choose a model") instead of the correct `AMR_AUTH_REQUIRED` (sign-in affordance). - server.ts (default model fallback): if the user asked for the AMR agent default and the cached id is no longer in the FRESH catalog, fall back to `liveModels[0]` from the probe instead of rejecting the run as `AMR_MODEL_UNAVAILABLE`. - integrations/vela.ts: route `vela login` through `createCommandInvocation` so an npm/Node-style `vela.cmd` / `.bat` shim on Windows gets the correct `cmd.exe /d /s /c …` wrapping with verbatim args (matches `execAgentFile` / chat-run spawning). - tools/pack/src/linux.ts: in containerized Linux builds, bind-mount the host directory of `OPEN_DESIGN_VELA_CLI_BIN` and rewrite the env to the container-side path. 
The host path was being passed in as-is even though the default container only mounts /project, /tools-pack and cache/home — `copyOptionalVelaCliBinary` saw a missing path. Deferred (out of scope for this PR): - `od amr status/login/logout/cancel` CLI subcommands (AGENTS.md UI/CLI dual-track rule, server.ts:5763) — sizable surface; tracked for a separate focused PR. - Strict `--require-vela-cli` for Windows + mac-x64 beta builds: prematurely blocked — `@powerformer/vela-cli` only publishes the `darwin-arm64` platform binary today; adding the flag elsewhere would fail the builds. Revisit once win/x64/linux binaries ship. * fix(amr): hoist sendAmrAccountFailure above the AMR catalog preflight (TDZ) The new signed-out AMR branch in the catalog preflight at server.ts:10875 calls `sendAmrAccountFailure(...)` to emit AMR_AUTH_REQUIRED, but the const declaration sat ~100 lines below at the outer function scope. Because `const` is TDZ-aware, that branch would have thrown `ReferenceError: Cannot access 'sendAmrAccountFailure' before initialization` for the exact users it tries to help — defeating the original intent. Hoist the helper to just above the AMR preflight block so it's available to every AMR code path in this function. Behavior elsewhere is unchanged. Also rerun the daemon test suite: `launch.test.ts > resolveAgentLaunch uses packaged built-in Vela for AMR` was creating the `<resourceRoot>/bin/libexec/opencode/` companion directory only, but this PR's earlier tightening of `packagedVelaOpenCodeCompanionTree` also requires the inner `opencode` executable. Add it to that fixture to match the new contract; the test was a sibling of the executables / env-and-detection fixtures already updated in `13fc4f4`. Addresses #2355 review (mrcfps, 2026-05-28). 
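The temporal dead zone (TDZ) failure described in the hoisting fix can be reproduced in a few lines. The function and helper names below are hypothetical stand-ins, not the code in server.ts; the sketch only shows why a `const` helper declared below its first runtime use throws, and why moving the declaration up fixes it.

```typescript
// Broken shape: the helper is declared with `const` below the code path
// that reaches it, so calling that path hits the temporal dead zone.
function handleRunBroken(signedOut: boolean): string {
  const preflight = (): string => (signedOut ? sendFailure('AMR_AUTH_REQUIRED') : 'ok');
  // When signedOut is true, this call touches sendFailure before it is
  // initialized. On ES2015+ targets that is a ReferenceError.
  const result = preflight();
  const sendFailure = (code: string): string => `failed:${code}`;
  return result;
}

// Fixed shape: hoist the helper above every path that may call it.
function handleRunFixed(signedOut: boolean): string {
  const sendFailure = (code: string): string => `failed:${code}`;
  const preflight = (): string => (signedOut ? sendFailure('AMR_AUTH_REQUIRED') : 'ok');
  return preflight();
}

let brokenError: unknown = null;
try {
  handleRunBroken(true);
} catch (error) {
  brokenError = error;
}
const fixedResult = handleRunFixed(true);
```

The broken variant still works for the signed-in path, because the closure never touches the helper there, which is how a bug of this kind can slip past tests that only cover the common case.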
* feat(web): add hover cancel for AMR login (#3158) * feat(web): add hover cancel for AMR login * fix(web): don't bounce AmrLoginPill back to 'Signing in…' after local cancel Both codex-connector (P2) and looper (CHANGES_REQUESTED) on this PR flagged the same race in the new local-cancel path: `handleCancelLogin` dispatches `notifyAmrLoginStatusChanged('login-canceled')` immediately after `/login/cancel` returns, but the `AMR_LOGIN_STATUS_EVENT` listener unconditionally re-enters `refresh()` and then restarts polling whenever `/api/integrations/vela/status` still reports `loginInFlight: true`. That is a real race because the daemon's `cancelVelaLogin()` only sends SIGTERM (escalating to SIGKILL after `LOGIN_CANCEL_KILL_GRACE_MS` = 2000 ms) and keeps the child in `activeLoginProcs` until it actually exits — so the first `/status` read after a successful cancel can legally still come back as in-flight. Under that window the pill flips back to 'Signing in…' and can later surface the timeout/error path even though the user already canceled, defeating the behavior promised in the PR description. Fix the listener instead of every dispatch site: in the `login-canceled` branch, after the local reset (stopPolling + setPending(null) + clear refs), optimistically mark every subscribed pill instance as not-in-flight (`setStatus((c) => c ? { ...c, loginInFlight: false } : c)`) and `return` — skip the refresh-and-reconcile branch below entirely. The next explicit refresh (component mount, user interaction, or a `status-changed` event) will pick up the daemon's confirmed state once the child has actually exited. Add a focused regression test that holds `/api/integrations/vela/status` at `loginInFlight: true` even after a successful `/login/cancel`, asserting that the pill stays at the Canceled → Authorize sequence and never bounces back to 'Signing in…'. 
This test fails on the pre-fix listener and passes on the new behavior; existing 'cancels an in-flight AMR sign-in…' and 'reconciles late AMR browser completion to Signed in after local cancel' tests continue to pass. Addresses review feedback on #3158 (chatgpt-codex-connector, nettee). --------- Co-authored-by: lefarcen <935902669@qq.com> --------- Co-authored-by: a1chzt <chizblank@gmail.com> Co-authored-by: Amy <1184569493@qq.com> Co-authored-by: Mason <jinmeihong0201@gmail.com> Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com> Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>	2026-05-28 05:09:55 +00:00
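The listener fix for the local-cancel race in #3158 can be modeled as a pure state transition. This is a simplified sketch, not the component's code: `PillState`, `onLoginStatusEvent`, and the lazily invoked `refresh` callback are hypothetical, and the real pill uses React state setters and refs rather than a reducer.

```typescript
interface VelaStatus {
  loggedIn: boolean;
  loginInFlight: boolean;
}
interface PillState {
  status: VelaStatus | null;
  pending: 'login' | null;
  polling: boolean;
}
type LoginStatusEvent = 'login-canceled' | 'status-changed';

function onLoginStatusEvent(
  state: PillState,
  event: LoginStatusEvent,
  refresh: () => VelaStatus, // stands in for a /status read
): PillState {
  if (event === 'login-canceled') {
    // Local reset plus optimistic "not in flight", then return early:
    // the daemon may still report loginInFlight until the child exits.
    return {
      status: state.status ? { ...state.status, loginInFlight: false } : state.status,
      pending: null,
      polling: false,
    };
  }
  // Every other event reconciles against the daemon and resumes polling
  // while a login is still in flight.
  const fresh = refresh();
  return { status: fresh, pending: state.pending, polling: fresh.loginInFlight };
}

// The daemon still reports an in-flight login right after a cancel.
let refreshCalls = 0;
const staleDaemon = (): VelaStatus => {
  refreshCalls += 1;
  return { loggedIn: false, loginInFlight: true };
};
const signingIn: PillState = {
  status: { loggedIn: false, loginInFlight: true },
  pending: 'login',
  polling: true,
};
const afterCancel = onLoginStatusEvent(signingIn, 'login-canceled', staleDaemon);
const callsAfterCancel = refreshCalls;
const afterStatusChange = onLoginStatusEvent(signingIn, 'status-changed', staleDaemon);
```

Because the canceled branch never calls `refresh`, a stale in-flight report cannot push the pill back into the signing-in state; only a later explicit refresh reconciles with the daemon.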
lefarcen	c14baf07d3	Merge origin/main into release/v0.8.0

PR #2461 sync prep: resolves 14 conflicts merging 84 main-side commits on top of 58 release-side commits accumulated during the 0.8.0 cycle.

Resolution summary:

Take main (theirs) where main carried deliberate forward progress:
- apps/web/src/components/PluginCard.tsx (7 hunks, i18n migration): hardcoded English aria-labels/titles replaced with t() calls keyed on pluginCard.* (all 8 keys verified present in en.ts).
- apps/web/src/components/TasksView.tsx (1 hunk, source-ingestion feature): sortedRoutines (newest-first), sourceIngestionTemplates, patchSourceForm, submitSourceIngestion. activeCount/pausedCount semantics preserved (now keyed on sortedRoutines, count unchanged).
- e2e/ui/app.test.ts: new node:fs/promises + tmpdir + path + @/timeouts imports needed by main-side test helpers.
- e2e/ui/settings-local-cli-codex-fallback.test.ts: menu-dismissal helper block added by main.

Keep both sides where each added a different field to the same object literal:
- apps/web/src/components/ProjectView.tsx (locale + analyticsHints spread).
- apps/web/src/components/DesignSystemFlow.tsx (locale + analyticsHints).

Take release (ours) where release carried deliberate work that ships 0.8.0:
- CHANGELOG.md: release-side 0.8.0 entry + PR link refs; main's Unreleased section was the same body of work, now finalized.
- apps/landing-page/public/{apple-touch-icon,favicon}.png + apps/web/public/app-icon.svg: release-side visual refresh assets consistent with 0.8.0 stable ship.
- tools/pack/src/linux.ts: packageVersion const required by line 466; taking main's empty line would build-error.
- e2e/ui/project-management-flows.test.ts + e2e/ui/settings-api-protocol.test.ts + e2e/ui/settings-memory-routines.test.ts: release-side release-smoke hardening (shangxinyu1 + PerishFire) takes precedence on overlap.

Closes-issue / unblocks: PR #2461 sync release/v0.8.0 → main.	2026-05-23 12:17:18 +08:00
lefarcen	9912fa899a	feat(analytics): full design-system event family + DS run variant (#2706 ) Lands the v2 PostHog spec's P0 design-system event family: five new result events covering source ingest, create, review, status, and picker apply; the existing file_upload_result + run_created/run_finished schemas widened to discriminate DS workspaces from regular chat runs. Contract (packages/contracts/src/analytics/events.ts): - AnalyticsEventName gains design_system_{source_ingest,create,review, status,apply}_result. - Props interfaces + bucket/origin/method/status enums per spec. - TrackingProjectKind gains 'design_system' for DS-as-project runs. - RunCreatedProps / RunFinishedProps widen page_name+area to discriminate chat_panel vs design_system_project; entry_from union accepts DS values; DS-variant context fields (ds_source_origin, source_count, brand description length bucket, per-source counts, design_system_created, preview_module_count, missing_font_count). - FileUploadSurface union adds design_systems / design_system_source. - Bucket helpers (designSystemLengthBucket, folderCountBucket, totalSizeBucket), module slug + type derivation, repo host parser. Web emission sites: - DesignSystemFlow.generate(): create_result + threads prepareCreatedDesignSystemProject with analyticsTrack so each of the 4 source paths emits source_ingest_result (success / partial / failed / empty), repo-host dominance, fallback type from connector status. - DropZone onFiles handlers: file_upload_result with deriveUploadCohort. - DesignSystemDetailView: status_result on togglePublished + Make-default, review_result on Looks-good / Needs-work; module_id from markdown section header slug (designSystemModuleSlug), module_type via keyword heuristic. - DesignSystemsTab: status_result on publish toggle, set/unset default, delete (incl. cancelled when window.confirm dismissed). 
- NewProjectPanel: apply_result on DS picker change (manual select + clear) plus an auto_select emit when the picker mounts with a default DS not yet user-touched. - ProjectView.streamViaDaemon: when project.metadata.importedFrom === 'design-system', pass analyticsHints with entry_from (onboarding_design_system for the auto-sent first message, regenerate_from_review for subsequent sends), projectKind=design_system, designSystemRunContext. Daemon: - ChatRequest gains optional analyticsHints (entryFrom / projectKind / designSystemRunContext). Behavior never depends on these; only PostHog props do. - /api/runs handler reads analyticsHints to flip baseProps to the DS variant (page_name=design_system_project, area=design_system_generation, project_kind=design_system) when the run is DS-flagged, and spreads the DS context fields onto run_created. - run_finished mirrors the DS area + adds design_system_created (true iff the run wrote DESIGN.md), preview_module_count (distinct preview/*.html writes), missing_font_count (0 placeholder; pending font-audit hook). - run-artifacts.ts: extracts collectWrittenPathsMatching as the shared Write/Edit + isError-pair core; adds didRunCreateDesignSystemFile and countDesignSystemPreviewModules using the same dedup + failure-skip invariants as countNewHtmlArtifacts. Tests: - packages/contracts/tests/analytics-design-system-helpers.test.ts: 18 new test cases over the bucket helpers, module slug + type mapping, repo host parser. - apps/daemon/tests/run-artifacts.test.ts: 9 new tests for didRunCreateDesignSystemFile + countDesignSystemPreviewModules covering Write-then-Edit dedupe, case-insensitive DESIGN.md match, isError pair skip, preview/index.html as a module, non-preview path rejection. Targets release/v0.8.0.	2026-05-22 17:18:57 +08:00
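The dedup and failure-skip invariants that the run-artifacts helpers in #2706 share can be sketched as below. The function names come from the commit message, but everything else is an assumption made for illustration: the `RunEvent` shape, the way a failed result is paired with its tool call, the forward-slash paths, and the two path patterns.

```typescript
// Assumed event shapes; the daemon's real run events are not shown in the log.
type RunEvent =
  | { kind: 'tool_use'; id: string; name: string; filePath: string }
  | { kind: 'tool_result'; toolUseId: string; isError: boolean };

// Shared core: distinct paths written by Write/Edit calls that did not fail.
function collectWrittenPathsMatching(
  events: RunEvent[],
  matches: (filePath: string) => boolean,
): Set<string> {
  const failedCalls = new Set<string>();
  for (const event of events) {
    if (event.kind === 'tool_result' && event.isError) failedCalls.add(event.toolUseId);
  }
  const paths = new Set<string>();
  for (const event of events) {
    if (event.kind !== 'tool_use') continue;
    if (event.name !== 'Write' && event.name !== 'Edit') continue;
    if (failedCalls.has(event.id)) continue; // skip the isError pair
    if (matches(event.filePath)) paths.add(event.filePath); // the Set dedupes
  }
  return paths;
}

// True iff the run wrote DESIGN.md (matched case-insensitively).
function didRunCreateDesignSystemFile(events: RunEvent[]): boolean {
  return collectWrittenPathsMatching(events, (p) => /(^|\/)design\.md$/i.test(p)).size > 0;
}

// Number of distinct preview/*.html files the run wrote.
function countDesignSystemPreviewModules(events: RunEvent[]): number {
  return collectWrittenPathsMatching(events, (p) => /(^|\/)preview\/[^/]+\.html$/i.test(p)).size;
}

const sampleEvents: RunEvent[] = [
  { kind: 'tool_use', id: '1', name: 'Write', filePath: 'preview/colors.html' },
  { kind: 'tool_use', id: '2', name: 'Edit', filePath: 'preview/colors.html' },
  { kind: 'tool_use', id: '3', name: 'Write', filePath: 'preview/index.html' },
  { kind: 'tool_use', id: '4', name: 'Write', filePath: 'preview/type.html' },
  { kind: 'tool_result', toolUseId: '4', isError: true },
  { kind: 'tool_use', id: '5', name: 'Write', filePath: 'src/page.html' },
  { kind: 'tool_use', id: '6', name: 'Write', filePath: 'design.md' },
];
const previewModuleCount = countDesignSystemPreviewModules(sampleEvents);
const createdDesignSystem = didRunCreateDesignSystemFile(sampleEvents);
```

In this sample the Write-then-Edit pair counts once, the failed write and the non-preview path are ignored, and `preview/index.html` counts as a module, so `previewModuleCount` is 2 and `createdDesignSystem` is `true`.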
lefarcen	6690dbd5bb	feat(analytics): PostHog + Langfuse instrumentation for assistant feedback (#1558 ) * feat(analytics): PostHog + Langfuse instrumentation for assistant feedback Re-bases the original three-commit PR onto release/v0.8.0. The web-side feedback UI instrumentation (surface_view / ui_click / feedback_submit_result) landed on main while this branch was open, so on this rebase that wiring is taken from main; the remaining net additions are: - Contracts: TrackingFeedback* enums and the four dedicated assistant_feedback_* event payload types (click, reason_view, reason_click, reason_submit), plus normalizeCustomReason helper. The new event-name variants are added to TrackingEventName and the AnalyticsEventPayload discriminated union next to the existing surface_view/ui_click variants — both wire formats coexist. - POST /api/runs/:id/feedback in apps/daemon/src/chat-routes.ts: thin route that validates rating, allowlists reasonCodes through a simple string filter, and fire-and-forgets into the daemon's reportFeedback hook. - apps/daemon/src/langfuse-bridge.ts reportRunFeedbackFromDaemon forwards the rating + reasonCodes into Langfuse as user_rating (NUMERIC ±1) + user_rating_reason (CATEGORICAL, one per code) score-create entries. Gates on telemetry.metrics + telemetry.content. - apps/web/src/providers/daemon.ts reportChatRunFeedback (fire-and-forget fetch) and apps/web/src/components/ProjectView.tsx wiring so each thumbs-up/down + reason submission posts the side-channel. Conflicts resolved (release/v0.8.0 vs the branch's old base): - packages/contracts/src/analytics/events.ts: keep main's file_upload_result / feedback_submit_result / settings_* event variants alongside the new assistant_feedback_* additions. - apps/daemon/src/server.ts: keep DNS-aware validateExternalApiBaseUrl, add reportFeedback closure wired into registerChatRoutes telemetry. 
- apps/daemon/src/chat-routes.ts: keep both /tool-result and the new /feedback routes; merge RegisterChatRoutesDeps to include both 'paths' and 'telemetry'. Drop PR's chat-routes-local reconcileAssistantMessageOnRunEnd helper (main has the equivalent in server.ts). - apps/web/src/components/ChatPane.tsx & AssistantMessage.tsx & ProjectView.tsx: keep main's projectKindForTracking prop name and its existing emission of surface_view / ui_click / feedback_submit_result; the PR's analyticsCtx-based reason_view/click/submit emission is dropped in this rebase since it would duplicate the existing wire format. - apps/web/tests/components/: rename projectKind → projectKindForTracking to match ChatPane's current prop name. Outstanding review feedback (from the pre-rebase round, will be addressed in a follow-up commit): - AssistantMessage tests not yet passing the new feedback context to the direct render path. - ProjectView clear-feedback path skips reportChatRunFeedback, leaving stale Langfuse user_rating scores. - buildFeedbackPayload has no deletion path for previously-submitted user_rating_reason scores when the user switches thumbs. - POST /api/runs/:id/feedback always returns {status:'accepted'} even when consent is off; needs to surface skipped_consent / skipped_no_sink. - reasonCodes are filtered to string[] but not allowlisted against ChatMessageFeedbackReasonCode or deduped. fix(analytics): address review on assistant feedback rebase Picks up the in-scope correctness items from the prior review round and the rebase residue without rewriting history: - chat-routes.ts: `/feedback` now awaits the daemon's preflight outcome and echoes it as the response. The contract was already shaped as `accepted \| skipped_consent \| skipped_no_sink`, but the previous handler always returned `accepted` because the network send was fire-and-forget. 
The consent + sink decision is local (a small file read and an env-var lookup); the actual Langfuse upload still runs as a detached promise. - chat-routes.ts: reasonCodes are now allowlisted against the contract's reason-code union and deduplicated before reaching Langfuse, so a stale or replayed client can't poison the Langfuse score table with unknown categorical values or duplicate stable ids in the same batch. - langfuse-bridge.ts: split the consent + sink resolution from the fire-and-forget network send so the route can claim `accepted` honestly. The legacy `skipped_no_sink` return on app-config read failure is preserved. Contracts + comment hygiene: - TrackingFeedbackReasonCode in packages/contracts/src/analytics/events.ts drifted from ChatMessageFeedbackReasonCode in packages/contracts/src/api/chat.ts; add `followed_design_system` and `missed_design_system` so the analytics wire format stays aligned with the persistence shape. - langfuse-trace.ts buildFeedbackPayload: the docblock claimed the raw custom-reason text is bucketed before send. Product reversed that on 2026-05-13 (raw text now ships, consent-gated). Replace the stale comment with the real semantics + a note that there is no tombstone path for reason codes the user removes in a follow-up submission (left as scope for a later PR). - AssistantMessage.tsx: remove the now-unused `AssistantFeedbackAnalyticsCtx` interface and a stray blank-line delete from the rebase; restore the analytics-context comment above the feedback hook. Left as follow-up (intentional, documented in code): - Sending a tombstone score when the user clears their rating — ProjectView still skips reportChatRunFeedback on `change===null`, so Langfuse retains the previous rating until the user re-submits. The PostHog event captures the clear separately. - Removing reason-code scores when the user re-submits with a smaller set — buildFeedbackPayload only overwrites the codes present in the current payload. 
* feat(analytics): wire PR's dedicated assistant_feedback_* events The four dedicated event types (`assistant_feedback_click` / `_reason_view` / `_reason_click` / `_reason_submit`) the PR added to contracts were sitting unused after the rebase because main's umbrella `surface_view` / `ui_click` / `feedback_submit_result` emissions covered the same user gestures. Wire the dedicated events alongside the umbrella ones so both wire formats fire on every feedback action — dashboards / evals can pick whichever schema they were built against without losing signal. Each dedicated event has stricter typing than its umbrella sibling (`project_id` / `project_kind` / `conversation_id` are non-null), so the new emissions are guarded behind a presence check and skipped on test renders that mount AssistantMessage without project context. The umbrella emissions retain their nullable fallbacks unchanged. Pairing: - surface_view (feedback reason panel) ↔ assistant_feedback_reason_view - ui_click (feedback button) ↔ assistant_feedback_click - ui_click (reason submit button) ↔ assistant_feedback_reason_click - feedback_submit_result ↔ assistant_feedback_reason_submit Reason click + submit share the existing `requestId` so PostHog can stitch click→result across both schemas, matching the spec.	2026-05-21 19:28:51 +08:00
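The `/feedback` hardening described above (allowlist plus dedupe of `reasonCodes`, an honest preflight status, and one Langfuse score per code) might look roughly like this. It is a sketch under stated assumptions: only `followed_design_system` and `missed_design_system` are reason codes named in the log, the other codes are placeholders, and the `Rating` values, the `preflight` check order, and the score field names are illustrative rather than taken from the daemon or the Langfuse SDK.

```typescript
// Only the last two codes appear in the commit message; the first two are placeholders.
const REASON_CODES = [
  "placeholder_reason_a",
  "placeholder_reason_b",
  "followed_design_system",
  "missed_design_system",
] as const;
type ReasonCode = (typeof REASON_CODES)[number];

// Keep only known codes, in first-seen order, without duplicates.
function sanitizeReasonCodes(input: unknown): ReasonCode[] {
  if (!Array.isArray(input)) return [];
  const allowed = new Set<string>(REASON_CODES);
  const seen = new Set<string>();
  const out: ReasonCode[] = [];
  for (const value of input) {
    if (typeof value !== "string" || !allowed.has(value) || seen.has(value)) continue;
    seen.add(value);
    out.push(value as ReasonCode);
  }
  return out;
}

type FeedbackStatus = "accepted" | "skipped_consent" | "skipped_no_sink";

// Local decision made before any network send; the check order is an assumption.
function preflight(opts: {
  metricsConsent: boolean;
  contentConsent: boolean;
  sinkConfigured: boolean;
}): FeedbackStatus {
  if (!opts.metricsConsent || !opts.contentConsent) return "skipped_consent";
  if (!opts.sinkConfigured) return "skipped_no_sink";
  return "accepted";
}

type Rating = "positive" | "negative"; // hypothetical rating values
interface Score {
  name: "user_rating" | "user_rating_reason";
  dataType: "NUMERIC" | "CATEGORICAL";
  value: number | string;
}

// One NUMERIC user_rating (+1 or -1) plus one CATEGORICAL entry per reason code.
function buildScores(rating: Rating, codes: ReasonCode[]): Score[] {
  const scores: Score[] = [
    { name: "user_rating", dataType: "NUMERIC", value: rating === "positive" ? 1 : -1 },
  ];
  for (const code of codes) {
    scores.push({ name: "user_rating_reason", dataType: "CATEGORICAL", value: code });
  }
  return scores;
}
```

Because `preflight` is synchronous and local here, a route could return its result immediately and still leave the actual upload as a detached promise, which is the split the second commit describes.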
Siri-Ray	3a33a7b475	fix(web): localize quick brief prompt (#2520) * fix(web): localize quick brief prompt Generated-By: looper 0.8.1 (runner=worker, agent=codex) * fix(web): pass locale from design system chat Generated-By: looper 0.8.1 (runner=fixer, agent=codex) * fix(web): preserve task-type routing options Generated-By: looper 0.8.1 (runner=fixer, agent=codex) * fix(web): preserve task-type routing options Generated-By: looper 0.8.1 (runner=fixer, agent=codex)	2026-05-21 19:18:13 +08:00
lefarcen	6bb0f0fd91	feat(observability): web lifecycle telemetry + stable installationId migration (#2527 ) * feat(observability): web lifecycle telemetry + stable installationId migration Two intertwined safety-telemetry additions for the 0.8.0 release. Web lifecycle observability --------------------------- New `apps/web/src/observability/` module installed at module load via client-app.tsx — alongside the existing error-tracking exception hooks from #2521. Reuses error-tracking's direct-fetch transport (the same consent-bypass + early-buffer guarantees) so every event flows even when the user has opted out of general analytics: - client_long_task PerformanceObserver longtask >100ms (real "feels janky" signal, FPS proxy) - client_white_screen app fails to mount after 5s; MutationObserver cancels the timer the moment the React root renders so a normal boot is zero events - client_resource_error capture-phase window.error catches failed <script>/<link>/<img>/<iframe> loads (chunk-load failures, broken artifact refs) - client_boot_timing navigationStart → load timings via Navigation Timing v2 - client_visibility_change visibilitychange + page lifetime - client_session_summary real foreground duration emitted on pagehide - client_run_stuck 5min watchdog on SSE runs that don't progress (#2464 / #2405 / #1451 in data form) - client_iframe_error FileViewer iframe load failures (iframe errors don't bubble to window, so the global resource-error observer can't see them) - desktop_renderer_crash Electron main observes render-process-gone and forwards to daemon /api/observability/event - daemon_uncaught_exception daemon_unhandled_rejection process-level handlers on the daemon error-tracking.ts is generalised: `reportSafetyEvent(name, props)` now exposes the same buffer + direct-fetch transport that `reportHandledException` used, with identical $exception wire shape preserved for the existing exception path. 
Daemon cross-process bridge --------------------------- New `AnalyticsService.captureSafety()` skips the consent re-check and posts via posthog-node with installationId as distinct_id. Wired into: - `POST /api/observability/event` for desktop main and any future helper process that needs to ship a safety event (no consent check — same contract as web's direct-fetch path) - `process.on('uncaughtException')` / `unhandledRejection` on the daemon itself Stable installationId across reinstalls (critical for 0.8.0 rollout) -------------------------------------------------------------------- installationId previously lived in `<namespace>/data/app-config.json`, so a packaged reinstall that churned the namespace token (or any future namespace-scoped data wipe) rotated the id and the user showed up as a brand-new PostHog person. This is the immediate trigger: when 0.8.0 ships, every 0.7.x user upgrading would silently double the user count. New module `apps/daemon/src/installation.ts` reads/writes `<installationDir>/installation.json` at the channel root. The daemon gets the path from `OD_INSTALLATION_DIR`, set by `apps/packaged/src/sidecars.ts` to `paths.installationRoot` (one level above `namespaces/` — e.g. `~/Library/Application Support/Open Design Nightly/` on mac). `readAppConfig` transparently merges: if installation.json has an id it wins; if only app-config.json has one (the 0.7.x state), it gets mirrored to installation.json on the next read. `writeAppConfig` mirrors any explicit installationId write, including the null-clear path used by Settings → "Delete my data". 7 call sites of readAppConfig keep their signatures unchanged. 
Survives: - same-channel reinstall (DMG drag-replace, NSIS reinstall) - namespace churn between packaged builds - per-namespace data reset (future installer that clears `<ns>/data/`) Still rotates (intentionally): - explicit "Delete my data" - manual `rm -rf "~/Library/Application Support/Open Design <Channel>/"` - different channel (Stable vs Nightly stay distinct because userData paths differ; that's the existing channel-isolation contract) What this changes for posthog-js -------------------------------- client.ts had `capture_exceptions: false` from #2521; nothing else changes. autocapture / $pageview / $autocapture / track() / daemon analyticsService.capture() — all unchanged. New events are additive. Validation ---------- - pnpm guard pass - pnpm typecheck whole repo pass - pnpm --filter @open-design/web test 200 files / 1824 tests - pnpm --filter @open-design/daemon test 251 files / 2981 tests (includes 10 new tests in installation.test.ts pinning the 0.7.x → 0.8.0 migration, namespace-wipe survival, delete-my-data clear, and fresh-id rotation) - pnpm --filter @open-design/packaged test 9 files / 89 tests - Pre-existing baseline: apps/desktop/src/main/updater.ts has typecheck references to RELEASE_CHANNEL_NAMES.PREVIEW/NIGHTLY on release/v0.8.0; unrelated to this PR. * fix(observability): preserve fatal exit on uncaught + skip loading shell in white-screen check Addresses codex review on PR #2527 (Siri-Ray). 1) Daemon process handlers must keep Node fatal semantics Installing an uncaughtException listener silences Node's default crash/exit; Node 15+ does the same for unhandledRejection when a listener is present. The previous handlers logged telemetry and let control return to the event loop, leaving a corrupted daemon serving requests instead of letting the supervisor restart it cleanly. 
triggerFatalShutdown() now: - dispatches captureSafety once (guarded against re-entry from cascading faults) - races posthog-node's shutdown against a 1s bounded timeout so a slow flush can't keep the process alive - calls process.exit(1) after the race resolves Both uncaughtException and unhandledRejection route through it. apps/daemon/tests/uncaught-fatal-shutdown.test.ts pins: - captureSafety is invoked exactly once even on repeated faults - exit(1) fires on the happy path - exit(1) still fires when shutdown hangs past the timeout - exit(1) still fires when captureSafety itself throws 2) White-screen detector treated the loading shell as a successful mount apps/web/app/[[...slug]]/client-app.tsx renders the dynamic-import fallback as <div class="od-loading-shell">Loading Open Design…</div> whose visible text (19 chars) exceeded the previous 10-char floor. monitorMount() would therefore cancel the 5s timer the instant Next swapped the loading shell in, completely missing the white-screen signal the observer is meant to add. isAppMounted() now: - primary signal: <html data-od-app-mounted="1"> set by App.tsx's first useEffect — authoritative because once App has mounted at least once, any later tree crash is an $exception story, not a white-screen story - fallback: only counts children of the root container whose classList does NOT include known loading-shell markers (od-loading-shell). Their visible text drives the > MIN_VISIBLE_TEXT check, so the loading sentinel can never be mistaken for a mount. 
apps/web/tests/observability/white-screen.test.ts pins: - fires client_white_screen when only the loading shell is present after the timeout - does NOT fire when data-od-app-mounted is set before the timeout - cancels the timer the moment a real workspace-shell child appears alongside the loading shell - still fires when only sub-MIN_VISIBLE_TEXT non-shell content is present (effectively blank) Validation: - pnpm guard pass - pnpm typecheck pass - pnpm --filter @open-design/daemon test 252 files / 2985 tests - pnpm --filter @open-design/web test 201 files / 1828 tests * fix(observability): await captureSafety enqueue before fatal shutdown flush Addresses second-pass codex review on PR #2527 (Siri-Ray, 3279268246). The previous fatal-shutdown path called `analyticsService.captureSafety()` synchronously and immediately raced `analyticsService.shutdown()` against the bounded timeout. captureSafety in apps/daemon/src/analytics.ts does its real `client.capture()` call only inside an async IIFE after `await readInstallationIdSafe()` — so shutdown could win the race, drain an empty posthog-node queue, and let `process.exit(1)` run BEFORE the daemon crash event ever got enqueued. We'd then preserve the process-lifecycle contract but lose the exact signal this PR is adding. Changes: - AnalyticsService.captureSafety now returns Promise<void>. The async IIFE is gone; the body awaits readInstallationIdSafe directly so the returned promise resolves only AFTER client.capture() has been invoked (which is when posthog-node's local buffer contains the event). - server.ts triggerFatalShutdown awaits captureSafety, then calls shutdown, and races that whole sequence against the 1s bounded timeout. Capture failures still don't block exit (try/catch around the await). - NOOP_SERVICE.captureSafety becomes `async () => undefined` to match the new signature. - Fire-and-forget callers (/api/observability/event) are unaffected; voiding the returned promise keeps them non-blocking. 
apps/daemon/tests/uncaught-fatal-shutdown.test.ts adds the reviewer- requested fixture: - 'waits for the captureSafety promise to settle before invoking shutdown' — gives capture a 50ms delay and shutdown a separate 50ms delay so the intermediate "capture done / shutdown not yet" state is observable. - 'still aborts and exits if captureSafety hangs past the bounded timeout' — captureSafety never resolves; the outer 1s timeout still forces process.exit(1). Validation: - pnpm guard pass - pnpm typecheck whole repo pass - pnpm --filter @open-design/daemon test 252 files / 2987 tests	2026-05-21 15:37:48 +08:00
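The final shape of the fatal-shutdown path described in this commit (capture once, await the enqueue, then flush, all raced against a bounded timeout, then exit) can be sketched as below. The dependency-injected `FatalDeps` interface and the `createFatalShutdown` factory are illustrative assumptions so the logic can run without a real `posthog-node` client; `triggerFatalShutdown`, `captureSafety`, the 1s bound, and the event names are from the commit message.

```typescript
// Hypothetical dependency bundle; the real daemon wires AnalyticsService and process.exit.
interface FatalDeps {
  captureSafety: (name: string, props: Record<string, unknown>) => Promise<void>;
  shutdown: () => Promise<void>;
  exit: (code: number) => void;
  timeoutMs?: number; // defaults to the 1s bound mentioned in the commit message
}

function createFatalShutdown(deps: FatalDeps) {
  let triggered = false;

  return async function triggerFatalShutdown(name: string, error: unknown): Promise<void> {
    // Re-entry guard: cascading faults must not capture or exit twice.
    if (triggered) return;
    triggered = true;

    // Capture first, then flush. Awaiting captureSafety ensures the event is
    // enqueued before shutdown drains the queue.
    const flush = (async () => {
      try {
        await deps.captureSafety(name, { message: String(error) });
      } catch {
        // A failing capture must not block the exit.
      }
      await deps.shutdown();
    })().catch(() => undefined);

    // Bounded timeout so a hung capture or flush cannot keep the process alive.
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<void>((resolve) => {
      timer = setTimeout(resolve, deps.timeoutMs ?? 1000);
    });

    await Promise.race([flush, timeout]);
    if (timer !== undefined) clearTimeout(timer);
    deps.exit(1);
  };
}
```

Under these assumptions, both process handlers would call the same function, for example `process.on('uncaughtException', (err) => void fatal('daemon_uncaught_exception', err))`, so Node's fatal semantics are preserved by the explicit `exit(1)`.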
kami	dea07840f3	fix: stop stale pinned todos after terminal runs (#2321) Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 11:13:20 +08:00
Tom Huang	86ec951fb9	[codex] Add automation templates and proposal workflows (#2193 ) * feat(web): introduce Automations tab with dual-track capability for routines This commit adds a new Automations tab that consolidates routines, schedules, and live artifacts, allowing users to manage automations seamlessly. The tab features a modal for creating and editing automations, which supports various scheduling options (hourly, daily, weekdays, weekly) and project modes (create_each_run, reuse). The CLI is also updated to expose automation commands, ensuring consistency between the web UI and CLI interfaces. Key changes include: - New `NewAutomationModal` component for automation creation and editing. - Updated `TasksView` to integrate the new Automations functionality. - Enhanced styling for the Automations tab to improve user experience. This implementation aligns with the dual-track capability exposure policy, ensuring all features are accessible via both the web UI and CLI. * feat(daemon): enhance automation context handling and CLI commands This commit introduces several improvements to the automation context management and updates the CLI commands accordingly. Key changes include: - Added support for new context fields (`plugin`, `mcp`, `connector`) in automation commands. - Updated the CLI to reflect new target options (`new-project`). - Enhanced error messages for invalid target inputs. - Introduced functions to handle context selection and normalization for routines, including the ability to parse and store context data in the database. - Updated the database schema to include a new `context_json` field for routines. - Improved the handling of context in routine routes and the web interface, ensuring that selected contexts are properly managed and displayed. These changes aim to provide a more robust and flexible automation experience, aligning with the recent enhancements in the web UI. 
* feat(web): enhance TasksView with automation run history and status indicators This commit introduces several new features to the TasksView component, including: - Added functionality to display automation run history for each routine, showing metadata such as status, timestamps, and project details. - Implemented status indicators for routine runs, providing visual feedback on their current state (succeeded, failed, running, queued). - Enhanced the UI to allow users to expand and view detailed run history, including the ability to open the corresponding project conversation. - Updated styles to improve the presentation of automation statuses and history. These changes aim to provide users with better insights into their automation routines and improve overall usability. * feat(daemon): implement automation ingestion and proposal management This commit introduces several new features related to automation ingestion and proposal management within the daemon. Key changes include: - Added new modules for handling automation source packets and proposals, allowing for the storage, retrieval, and management of automation-related data. - Implemented functions to list, create, and apply automation proposals, enhancing the automation workflow. - Introduced new CLI commands for interacting with memory entries and automation sources, providing users with more control over their automation processes. - Enhanced the server routes to support automation source and proposal APIs, enabling seamless integration with the existing system. These changes aim to improve the overall automation experience, making it easier for users to manage and utilize automation proposals and ingestions effectively.	2026-05-19 16:35:28 +08:00
Eli	4376d8a8ec	[codex] Add pet task center and desktop pet (#1833) * feat: add pet task center and desktop pet * Fix pet task center review regressions	2026-05-19 15:38:39 +08:00
kami	6990291217	fix: escape chat transcript role delimiters (#2156) Co-authored-by: multica-agent <github@multica.ai>	2026-05-19 15:37:35 +08:00
kami	68b0e0258d	fix(web): scope daemon transcript to active agent (#2157)	2026-05-19 13:49:34 +08:00
Chris Seifert	9cf265e520	feat(claude): wire AskUserQuestion tool through chat + pin TodoWrite (#1743 ) * feat(claude): wire AskUserQuestion tool through chat + pin TodoWrite Claude calls `AskUserQuestion` for mid-conversation clarifications when the natural answer is one of a small finite set of choices. Until now the tool round trip hit two dead ends in headless mode: claude-code -p cannot prompt the user, so it auto-errored the tool and retried 4x; the model then hedged by also writing the same options as a markdown bulleted list. The host had no way to feed a real `tool_result` back. This change makes the AskUserQuestion round trip work end to end: * Switch Claude to `--input-format stream-json`. The daemon wraps the prompt as a JSONL `user` message on stdin and keeps stdin OPEN, so later writes (a `tool_result` for the open AskUserQuestion) feed back into the same child instead of needing a fresh spawn. * New `RuntimeAdapter.promptInputFormat()` ('text' default, 'stream-json' for Claude) so the spawn loop keeps the old close-on- prompt behavior for every other agent. * New `POST /api/runs/:id/tool-result` daemon endpoint and `submitChatRunToolResult` web helper. Body carries `toolUseId` and `content`; daemon writes a JSONL `user` message with the matching `tool_result` content block. * Track outstanding host answers on the run (`pendingHostAnswers`) and close stdin on either a `usage` event or a synthesized `turn_end` event (extracted from `assistant.message.stop_reason` in `claude-stream`). Without the per-turn `turn_end` signal stdin would never close after the follow up turn finished and the run would hang until the inactivity watchdog killed it. * System prompt: tell Claude to use AskUserQuestion for follow ups with 2-4 finite choices, and to STOP after the tool call instead of writing a markdown duplicate. 
Web UI: * New `AskUserQuestionCard` renders the tool input as labelled chip buttons (single or multi select) with a Submit button styled like the composer's Send. On submit the answer routes through `submitChatRunToolResult` (live tool_result path) and falls back to `onSubmitForm` (plain user message) only if the run has already terminated. Selected chips persist across page reloads by re parsing the stored `tool_result.content`. * Hide markdown text that follows an AskUserQuestion in the same turn — defense in depth against the model emitting the duplicate. * Collapse identical `AskUserQuestion` / `TodoWrite` retries inside any tool group to a single card. TodoWrite is a snapshot tool, so older calls are duplicates of state. * Pinned TodoCard above the chat composer. The latest TodoWrite snapshot across the conversation renders once, expandable / collapsible header, count shows in-progress + completed (1/4), Done button dismisses when all tasks finish, soft fade gradient above so scrolling chat text dissolves into the panel instead of hard clipping under the card. * Composer gains a top shadow that only appears when the pinned todo slot sits directly above it (dark mode strengthened). * Accordion expand / collapse motion shared between TodoCard, the ToolGroupCard disclosure, and BashCard output via `grid-template-rows: 0fr -> 1fr` with `cubic-bezier(0.23, 1, 0.32, 1)` and asymmetric durations (200ms enter, 140ms exit) per Emil Kowalski's animation framework. * Jump-to-latest button no longer unmounts on hide; slides up with scale 0.9 -> 1 + fade on show, slides down with scale + fade on hide. Always horizontally centered via `margin: 0 auto`. i18n: * `tool.askQuestion`, `tool.askQuestionSubmit`, `tool.askQuestionPending`, `tool.askQuestionAnswered`, `tool.todosExpand`, `tool.todosCollapse`, `tool.todosDone`, `tool.todosDismiss` added to all 18 locales. Unblocker: * Fix a pre-existing render loop in `ProjectView` when the user clicks "New conversation". 
`handleNewConversation` now navigates to the fresh conversation id synchronously after `setActiveConversationId` so the route-sync effect at L512 and the URL-sync effect at L851 do not ping pong (route mismatch triggered repeated reverts; React's nested-update guard fired). * fix(claude): order turn_end after content blocks + cover chat switching Two follow-up fixes to the AskUserQuestion + new-conversation work: * `claude-stream.ts` emitted `turn_end` BEFORE iterating the assistant message's content blocks. When claude-code lacks `--include-partial-messages` (older builds), tool_use events surface only from that loop, so the daemon's stdin-close handler saw an empty `pendingHostAnswers` set and closed stdin before the AskUserQuestion tool_use was even registered. The result: the model retried, hit the same race, and gave up writing the questions in prose. Emit `turn_end` AFTER the content loop so tool_use ids land in `pendingHostAnswers` first. * `server.ts` now ignores `turn_end` events with `stop_reason: 'tool_use'`. That stop reason means the model paused to wait for a tool execution (claude-code's internal tool runner for Bash / Edit / Read, or a host-answered tool like AskUserQuestion). Either way the conversation is still in flight — closing stdin there would kill the follow-up response. Only the natural turn-end stop reasons (`end_turn`, etc.) close stdin. * `ProjectView.handleSelectConversation` now navigates to the picked conversation id synchronously, mirroring the fix already in handleNewConversation. The route-sync effect at L512 was reverting the active conversation on every switch, ping-ponging with the URL-sync effect at L851 until React's nested-update guard fired with "Maximum update depth exceeded". Same bug class as the pre-existing new-conversation render loop. 
* docs(agents): capture AskUserQuestion runtime + chat UI conventions Record the patterns this PR introduces so future contributors can find them without spelunking server.ts: * Agent runtime conventions — `RuntimeAgentDef.promptInputFormat`, `run.pendingHostAnswers` / `run.stdinOpen` lifecycle, `turn_end` ordering rule, `POST /api/runs/:id/tool-result` endpoint shape, the Claude only system prompt block that nudges AskUserQuestion, and the `suppressAskUserQuestionFallbackText` defense in depth. * Chat UI conventions — URL-load vs srcDoc render mode dispatch with bridge disqualifiers, the dual iframe visibility swap pattern, `isOurIframe` plus the active-iframe re-check for signals that must only come from the visible iframe, pinned TodoCard via `PinnedTodoSlot`, count includes `in_progress`, `dedupeSnapshotToolRetries` for AskUserQuestion / TodoWrite stacks. * i18n keys — 18 locale files, add the key to `types.ts` first. * UI animation philosophy — `cubic-bezier(0.23, 1, 0.32, 1)` ease out, asymmetric 200/140ms enter/exit, accordion via `grid-template-rows`, no `transform: scale(0)`, keep mounted + toggle class for exit transitions instead of relying on React unmount. * fix(claude): read promptInputFormat as field, close stdin on deferred answer Two PR review follow-ups on the AskUserQuestion stream-json wiring. * server.ts:4616 referenced `runtimeAdapter.promptInputFormat()` — but `runtimeAdapter` is not declared, imported, or assigned anywhere. The prior adapter abstraction was deleted in #1656; when the changes were folded back into the inline handler the format was moved onto `RuntimeAgentDef.promptInputFormat`, but this call site was missed. `server.ts` starts with `// @ts-nocheck` so typecheck never caught it — every chat run hit `ReferenceError: runtimeAdapter is not defined` the moment we wrote the prompt to a stdin-fed child, which is every agent with `promptViaStdin: true` (claude, codex, copilot, cursor-agent, gemini, opencode, pi, qoder). 
Read the format off the in-scope `def` and default missing values to `'text'`. * `submitToolResultToRun` cleared the answered id from `pendingHostAnswers` but never closed stdin if a `turn_end` / `usage` event had already fired with the set non-empty (deferred by the event handler). The child then waited indefinitely for further input until the inactivity watchdog killed it, losing the model's follow-up response. Close stdin on the last-answer transition when stream-json stdin is still open. Test: pin `promptInputFormat` for every `promptViaStdin: true` agent so future regressions of the field-vs-method contract fail at typecheck-adjacent test time instead of in production. The new test asserts `typeof def.promptInputFormat` is a string (or undefined), not a function — exactly the shape mistake the original line made. * fix(web): keep AskUserQuestion multi-select chips selected after reload when labels contain commas `handleSubmit` joined multi-select answers with `', '` while the reload parser split them on `','`. The pair is asymmetric: a valid model-generated option like `"Yes, including images"` round-tripped as `["Yes", "including images"]`, so after a page reload the locked question card showed the user's pick as unselected — even though the `tool_result` content the daemon actually wrote into the run was correct, and the model saw the right answer. Bounded to post-reload visual state, but silently confusing. Switch to a `- ` bullet list per option, one per line, with the parser stripping the leading `- ` back off. Newlines never appear inside a label so the round trip is exact. The outer pairs separator stays `\n\n` because individual answer bodies still never contain that double-newline. * chore: drop accidental personal design-system file `design-systems/foldar/DESIGN.md` was added to the AskUserQuestion branch in `31ac531` by mistake — it's a personal brand spec that does not belong in the upstream design-systems catalogue. 
Removing it keeps the branch's surface area scoped to the feature.	2026-05-15 15:50:27 +08:00
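The stdin lifecycle that the AskUserQuestion commits converge on can be summarized as a small state machine. The sketch below is an illustration only: the `RunStdinState` class and its method names are hypothetical, and the JSONL envelope in `buildToolResultLine` is an assumption about the stream-json input shape rather than something this log spells out. The rules themselves (register tool_use ids before `turn_end`, ignore `turn_end` with `stop_reason: 'tool_use'`, defer the close while answers are pending, close on the last answer) follow the commit messages.

```typescript
// Hypothetical per-run state; the daemon tracks similar fields on the run object.
class RunStdinState {
  readonly pendingHostAnswers = new Set<string>();
  stdinOpen = true;

  // A host-answered tool_use (such as AskUserQuestion) appeared in the stream.
  onHostToolUse(toolUseId: string): void {
    this.pendingHostAnswers.add(toolUseId);
  }

  // Called for a turn_end (emitted after the content blocks) or a usage event.
  onTurnEnd(stopReason: string | undefined): void {
    if (stopReason === "tool_use") return; // model is waiting on a tool; turn still in flight
    if (this.pendingHostAnswers.size > 0) return; // defer until the host has answered
    this.stdinOpen = false;
  }

  // The host submitted an answer. Returns false when the id was not outstanding.
  onToolResult(toolUseId: string): boolean {
    if (!this.pendingHostAnswers.delete(toolUseId)) return false;
    // Last-answer transition: nothing else will be written, so close stdin now.
    if (this.pendingHostAnswers.size === 0) this.stdinOpen = false;
    return true;
  }
}

// Assumed JSONL envelope for feeding a tool_result back to the child on stdin.
function buildToolResultLine(toolUseId: string, content: string): string {
  const message = {
    type: "user",
    message: {
      role: "user",
      content: [{ type: "tool_result", tool_use_id: toolUseId, content }],
    },
  };
  return JSON.stringify(message) + "\n";
}
```

In this model a caller would write `buildToolResultLine(...)` to the child before calling `onToolResult`, so the answer reaches stdin ahead of the close.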
nettee	6b3cc61714	Revert "Refactor agent runtime stream handling behind adapter (#1622)" (#1656) This reverts commit `8cb9cdb593`.	2026-05-14 15:23:19 +08:00
nettee	8cb9cdb593	Refactor agent runtime stream handling behind adapter (#1622)	2026-05-14 14:12:24 +08:00
Rocky	6c3fd86642	fix(daemon/acp): terminate ACP child after clean prompt completion (#1286 ) * fix(daemon/acp): terminate ACP child after clean prompt completion (Bug B / #1265) Some ACP agents (notably Devin for Terminal) keep the child process alive after stdin closes, waiting for the next prompt. Open Design spawns a fresh agent per chat turn and relies on child.on('close') to finalize the run, so without an explicit signal-driven shutdown the chat sits stuck in the 'working' state indefinitely. Three small, targeted changes: - apps/daemon/src/acp.ts: After a clean session/prompt response we schedule a 500ms grace period and then SIGTERM the child. This mirrors the pattern detectAcpModels() already uses after model discovery. The grace period leaves well-behaved agents that exit on stdin.end() unaffected. - apps/daemon/src/acp.ts: New completedSuccessfully() method on the session handle reports whether the prompt resolved without a fatal error or abort, so the consumer can distinguish 'clean signal exit' from 'genuine signal failure'. - apps/daemon/src/server.ts: child.on('close') now treats a SIGTERM exit as 'succeeded' when acpSession.completedSuccessfully() is true. - apps/web/src/providers/daemon.ts: Trust the server's authoritative endStatus; the signal/non-zero-code safety net no longer overrides an explicit 'succeeded' status, so the chat doesn't surface a fake 'agent exited with signal SIGTERM' error after a clean ACP run. Daemon tests cover the SIGTERM grace timer, clean early-exit (timer cleared), and completedSuccessfully() abort/error states. Manual UI test on plain main + this fix confirms Devin chats now return to ready automatically after Done · ... * fix(daemon/connectionTest): treat ACP clean SIGTERM as success Codex review on #1286 caught that the new SIGTERM in attachAcpSession breaks ACP connection tests for agents that don't shut down on stdin.end() (the exact Devin behavior the patch targets). 
attachAgentStreamHandlers() in connectionTest.ts now also respects acpSession.completedSuccessfully(), mirroring the same check we apply in server.ts. Without this, a clean prompt response followed by our SIGTERM would set winner.signal === 'SIGTERM', flip exitedCleanly to false, and the connection test would report 'agent_spawn_failed' even when the agent had returned a healthy response. Also widened the AgentSpawnHandle type so completedSuccessfully is visible on the structural type used inside connectionTest.ts. All 56 daemon tests still pass; typecheck + guard clean. * fix(daemon/acp): narrow ACP success-on-signal override to forced-SIGTERM Looper review on #1286 caught that the success predicate was broader than the SIGTERM case it was meant to handle. `completedSuccessfully()` flips to true as soon as the ACP `session/prompt` response is processed, but it does not say why the child later closed. With the broad predicate, an ACP agent that returned a prompt result and then exited with code 1 (or was killed by SIGKILL/SIGSEGV) was still marked 'succeeded', regressing the existing close-status behavior for genuine post-response process failures. Scope the override to the exact forced-shutdown shape this PR introduces: code === null && signal === 'SIGTERM' && acpCleanCompletion Applied to both `server.ts` (chat run finalization) and `connectionTest.ts` (connection-test classification). Any other post-response failure now falls through to 'failed' / 'agent_spawn_failed' as before. All 59 daemon tests still pass; typecheck + guard clean. * fix(web/daemon): only bypass exit-code safety net on explicit server success Looper review on #1286 caught that the previous web change trusted `endStatus === 'succeeded'` absolutely, but `endStatus` can become 'succeeded' in two distinct ways: 1. The SSE end event explicitly carries `status: 'succeeded'` (authoritative server declaration). 2. 
The end event omits or has an invalid `status` field and the handler silently falls back to 'succeeded' as a local default. Both produced `endStatus === 'succeeded'` in the existing code, so the new safety-net bypass treated them identically. That regressed backward compat: a compatible or older daemon emitting an end event like `{code:1}` or `{code:null,signal:"SIGTERM"}` with no `status` would suddenly skip the failure banner. Track explicit success separately via `serverDeclaredSuccess`, set true only when: - The SSE end event has `status === 'succeeded'`, or - The fallback `fetchChatRunStatus` REST path returns `status === 'succeeded'` (which the existing `isChatRunStatus()` guard already proves is explicit). The safety net is now bypassed only on that explicit signal; the local-fallback success path still reaches the exit-code/signal check so real failures surface as before. Adds three web-side regression tests in `apps/web/tests/providers/sse.test.ts`: - Explicit `status: 'succeeded'` + SIGTERM → onDone called, no error - End event with `{code:1}` and no `status` → onError surfaces 'agent exited with code 1' as before - End event with `{code:null,signal:'SIGTERM'}` and no `status` → onError surfaces 'agent exited with signal SIGTERM' as before `pnpm guard` + daemon typecheck clean; 27/27 SSE tests pass (up from 24).	2026-05-12 17:13:10 +08:00
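The two predicates described in the commit above (#1286) can be sketched as follows. This is a minimal illustration, not the daemon's or web client's actual code: `classifyAcpClose`, `exitErrorFor`, `ChildClose`, and `EndEvent` are hypothetical names, and treating a plain zero exit as success is an assumption about the pre-existing behavior.

```typescript
// Hypothetical sketch; names are illustrative, not the project's real API.
type RunStatus = 'succeeded' | 'failed';

interface ChildClose {
  code: number | null;   // exit code, or null when the child was killed by a signal
  signal: string | null; // e.g. 'SIGTERM', or null on a normal exit
}

// Daemon side: only the forced-shutdown shape overrides a signal exit.
function classifyAcpClose(close: ChildClose, acpCleanCompletion: boolean): RunStatus {
  // Assumed baseline: a plain zero exit counts as success.
  if (close.code === 0 && close.signal === null) return 'succeeded';
  const forcedShutdown =
    close.code === null && close.signal === 'SIGTERM' && acpCleanCompletion;
  return forcedShutdown ? 'succeeded' : 'failed';
}

interface EndEvent {
  status?: unknown;
  code?: number | null;
  signal?: string | null;
}

// Web side: bypass the exit-code/signal safety net only when the server
// explicitly declared success; a missing or invalid status still gets checked.
function exitErrorFor(end: EndEvent): string | null {
  const serverDeclaredSuccess = end.status === 'succeeded';
  if (serverDeclaredSuccess) return null;
  if (end.signal) return `agent exited with signal ${end.signal}`;
  if (typeof end.code === 'number' && end.code !== 0) {
    return `agent exited with code ${end.code}`;
  }
  return null;
}
```

Under these assumptions, a child that answers the prompt and then exits with code 1 or dies from SIGKILL still classifies as failed, and an end event such as `{code:1}` with no `status` still surfaces the failure banner.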
Caprika	5bd9763181	[codex] Improve Claude Code exit diagnostics (#1267) * fix daemon claude diagnostics * fix claude custom endpoint auth diagnostics * fix project view api empty response test props * fix claude diagnostic review gaps * fix silent custom endpoint claude diagnostics * fix claude diagnostic credential redaction * fix quoted api key redaction * fix claude diagnostic tail redaction * fix silent claude configured profile diagnostics	2026-05-12 00:08:31 +08:00
Caprika	fb079d8115	Add reliable agent-browser skill (#1284) * Add reliable agent browser skill * Fix ProjectView delete conversation test props	2026-05-11 20:09:12 +08:00
Tom Huang	b5eb8c1647	feat: generic skills + split skills/design-templates + finalize-design API (#955 ) * feat: general-purpose skills with @-mention composition and user import Lift skills from "one mode-bound skill per project" to a generic capability the user can compose per turn: - Daemon: scan multiple skill roots (user-skills under runtime data, then the bundled `skills/`); user-imported skills can shadow built-ins by id. - New `POST /api/skills/import` and `DELETE /api/skills/:id` endpoints, with CONFLICT/BAD_REQUEST/NOT_FOUND error codes and built-in delete protection. - ChatRequest gains `skillIds: string[]`; the chat run concatenates each picked skill's body (and merges craftRequires) into the system prompt for that turn only — the project's persistent `skillId` is untouched. - Web composer: `@` popover now lists skills alongside project files; picks render as removable chips above the textarea and ride along with the request as `skillIds`. - Settings → Library: import form (name/description/triggers/body), per-card delete for user skills, "user" origin badge. * chore(web): drop welcome pet teaser + add ds→prompt-template mapping util - SettingsDialog: remove the inline pet adoption teaser from the welcome panel so the first-run modal stays focused on configuration. - New `inferPromptTemplateCategoriesForDs(ds)` helper that maps a design system's authored metadata to prompt-template gallery categories. Imported by the design-system gallery wiring on a sibling branch; no callers in this branch yet. * feat: split skills/design-templates and add finalize-design API Phase 0 of the skills/design-templates refactor (specs/current/ skills-and-design-templates.md): - Move ~104 rendering catalogue entries from skills/ to design-templates/ and keep skills/ for the small set of functional skills that do work on user input (utilities, briefs, packagers). 
- Add design-templates/AGENTS.md and skills/AGENTS.md describing the contract, and a brand-agnostic craft/ surface for opt-in craft rules. - Daemon: add DESIGN_TEMPLATES_DIR / USER_DESIGN_TEMPLATES_DIR roots and an /api/design-templates surface mirroring /api/skills. Asset/example routes still span both registries so existing srcdoc URLs keep resolving across the rename. - Web: split LibrarySection into SkillsSection + DesignSystemsSection, rename the EntryView "Examples" tab to "Templates", and update locales + the New-project picker accordingly. Adds the finalize-design endpoint: - New apps/daemon/src/finalize-design.ts and packages/contracts/src/api/ finalize.ts — one-shot synthesis of a project's transcript + active design system + current artifact into <projectDir>/DESIGN.md via the Anthropic Messages API. Per-project .finalize.lock mirrors the transcript-export hygiene from PR #493; provider credentials are not persisted by the daemon. Other supporting changes: - README + AGENTS.md updates to document the new directory split and craft/ surface, plus i18n strings across 13 locales. - Test refactors and new coverage (finalize-design, runs, sidecar server, plus refreshed daemon integration tests). - .gitignore: scope the .exe ignore to /OpenDesign.exe so legitimate vendor binaries are no longer hidden. fix(merge): move clinical-case-report to design-templates/ Origin/main added the clinical-case-report skill under skills/ before the skills/design-templates split landed. Its od.mode is prototype, so per specs/current/skills-and-design-templates.md it is a design template and belongs alongside the other rendering catalogue entries — not under the slimmed-down functional skills/ root. Moving it keeps the EntryView Templates tab consistent with origin/main's intent. 
* feat(skills): curated design/creative catalogue + collapsible Settings rows Seed ~100 curated design/creative skill stubs under skills/ sourced from awesome-claude-skills (ComposioHQ) and awesome-agent-skills (VoltAgent). Each stub carries an od.category tag so the new filter pill row in Settings -> Skills can group them. The seed script (scripts/seed-curated-design-skills.ts, pnpm seed:curated-design-skills) is idempotent: it only creates folders that don't already exist, so hand-edited stubs are never overwritten. - Daemon: parse and surface od.category on SkillInfo with a strict slug normaliser; mirror the field on SkillSummary in @open-design/contracts. Category is purely a UI hint — system-prompt composition is unchanged. - Web: rewrite SkillsSection from a left-list / right-detail grid into a vertical stack of collapsible rows mirroring the External MCP panel (header always visible with name + mode/source/category pills + per-row enable toggle; SKILL.md preview, file tree and inline edit form expand on demand). Add a Category filter row above the list. Reorder Settings nav so Skills + External MCP sit above the Composio/MCP cluster. Update composer placeholder/hint across 17 locales to advertise '@ files or skills · / for commands'. - Docs: extend skills/AGENTS.md with the curated catalogue rules (idempotency, category vocabulary, no upstream vendoring). 
Co-authored-by: Cursor <cursoragent@cursor.com> * test(skills): teach localized-content + system-prompt tests about the skills/design-templates split mrcfps blocking review on PR #955: the skills/design-templates split (`b5993385`) moved ~110 SKILL.md entries out of `skills/` and into `design-templates/`, but two repo-level tests still hard-coded the single-root layout, so CI gates went red on the merged branch: - `e2e/tests/localized-content.test.ts` only scanned `<repo>/skills` while the locale `skillCopy` map keeps id-keyed entries spanning both roots (ExamplesTab/Templates uses one lookup regardless of origin). Teach the helper to read both `skills/` and `design-templates/`, deduplicating ids so the union matches the localized claim. - `apps/daemon/tests/prompts/system.test.ts` read `skills/live-artifact/SKILL.md`, which now lives under `design-templates/live-artifact/`. Update the absolute path so composeSystemPrompt's coverage of the live-artifact preamble is exercised again. Also enroll the curated design/creative catalogue (PR #955, ~91 stubs sourced from awesome-claude-skills / awesome-agent-skills) in the DE / FR / RU `_SKILL_IDS_WITH_EN_FALLBACK` lists. The stubs are English-only by design (frontmatter advertises an upstream URL); the fallback list is exactly the place to acknowledge "we know this id exists, English copy is fine here" so the localized-content coverage gate passes without forcing a translation task per locale. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(skills): always quote frontmatter name so importUserSkill round-trips numeric / boolean ids mrcfps PR #955 review: `buildSkillMarkdown` emitted `name: ${escapeYamlString(name)}` without quotes, so YAML coerced names like `123`, `true`, `false`, or `null` into non-string scalars on re-parse. 
listSkills() then read `data.name` as a number/boolean and the import flow's follow-up `findSkillById(skills, result.id)` missed it, falling into `/api/skills/import`'s "imported skill could not be re-read" 500 path for those ids. Switch the emitter to a quoted scalar (`name: "..."`) — the double-escape already in `escapeYamlString` makes the quoted form safe — and add a round-trip test covering `123`, `true`, `false`, `null`, and `0` to lock in the contract. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): drop staged-skill chips when the matching @<id> token leaves the draft mrcfps PR #955 review: `submit()` always forwarded every id in `stagedSkills`, but that state was only mutated on picker click and chip removal. Hand-deleting an `@<id>` token from the textarea left the chip staged, so the request still carried `skillIds: [<id>]` and the daemon composed a skill the prompt no longer referenced. Sync the chips with the draft inside `handleChange()` by pruning `stagedSkills` whenever the new value no longer contains the `@<id>` token (using the same whitespace boundary as `removeStagedSkill`'s strip regex). Comment explains why this prune does not run for `staged` file attachments — users frequently add files via the upload button without leaving an `@<path>` token, so a symmetric prune there would erase legitimate uploads. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): stage @-composed skills' side files alongside the active skill codex PR #955 review: composing a per-turn `@`-picked skill into the system prompt appended its body (with the `withSkillRootPreamble` guidance pointing at relative paths under `<cwd>/.od-skills/<folder>/`) but never staged the actual folder. 
`startChatRun` only copied `activeSkillDir`, so when the project's primary skill was different (or absent) the composed skill's references/, examples/, and scripts/ files lived only at their absolute repo path — agents that honour the cwd-relative form (or that don't get `--add-dir`, e.g. Codex with allowlisted gpt-image projects) couldn't reach them. Thread the composed skills' dirs out of `composeDaemonSystemPrompt` as `extraSkillDirs` and stage each one through the same `stageActiveSkill` API used for the primary skill. Dedupe by folder basename so a project whose primary skill is also `@`-composed isn't copied twice. Each preamble already advertises its own folder, so the prompt and the staged tree stay aligned without further changes. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): respect the Library disable toggle in the project @-mention picker codex PR #955 review: only `EntryView` received `enabledSkills` (filtered against `config.disabledSkills`); active projects still got `skills={skills}` raw, so a skill the user disabled in Settings kept appearing in the project's `@`-mention popover and could ride along to the daemon via `skillIds`. That broke the Library toggle for any project opened on the post-split branch. Compute a functional-skills-only enabled subset (`enabledFunctionalSkills`) and pass it into `<ProjectView>` instead. Templates stay separate — design-templates are filtered through their own `enabledDesignTemplates` memo for the Templates gallery — so ProjectView's chat composer still only sees skills, never templates, matching the pre-split prop surface. Co-authored-by: Cursor <cursoragent@cursor.com> * test(e2e): mock /api/design-templates for example-use-prompt flow The Templates tab in EntryView fetches from /api/design-templates after the skills/design-templates split (specs/current/skills-and-design-templates.md). 
The example-use-prompt Playwright scenario only mocked /api/skills, so the gallery card never appeared and the test timed out waiting on example-card-warm-utility-example. Serve the same fixture summary on both endpoints so the templates gallery renders the card the test clicks. Co-authored-by: Cursor <cursoragent@cursor.com> * test(tools-pack): create design-templates fixture for resources test The packaging resources copy now bundles the new design-templates tree alongside skills (see resources.ts BUNDLED_RESOURCE_TREES). The copyBundledResourceTrees fixture only created skills, design-systems, craft, etc., so the recursive copy crashed with ENOENT on design-templates before it could check the prompt-templates assertion. Add the missing fixture directory so the test exercises the same set of resource trees the packaged build does. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(skills): clone built-in side files into the shadow on first edit mrcfps PR #955 review: editing a built-in skill wrote a USER_SKILLS_DIR shadow folder that contained only a new SKILL.md. The next listSkills() pass surfaced the shadow as the active dir, but every side-file resolver (/api/skills/:id/files, /example, /assets/, the system-prompt preamble, and the per-turn cwd staging) reads through skill.dir. With nothing but SKILL.md in the shadow, the bundled assets/, references/, scripts/, and examples/ disappeared the moment the user hit save — a built-in like last30days or live-artifact would break immediately after edit instead of just having its body overridden. Teach updateUserSkill() to take a `sourceDir` and clone every entry except SKILL.md / dotfiles into the shadow on the very first edit. The shadow stays self-contained, so all the resolvers keep working without fallback bookkeeping. Subsequent edits detect the existing shadow and skip the clone, so user tweaks under the side tree survive a re-save. 
Wire `sourceDir: skill.dir` from server.ts's PUT /api/skills/:id handler and add two regression tests: - 'clones built-in side files into the shadow on the first edit' walks the file tree after save and asserts assets/template.html, references/ notes.md, and scripts/helper.sh all round-trip from the built-in. - 'preserves user-edited side files on subsequent edits' edits the staged assets/template.html, re-saves, and confirms the user content is still there. Co-authored-by: Cursor <cursoragent@cursor.com> test(e2e): rename home tab from Examples to Templates The Examples tab was renamed to Templates in EntryView (b5993385's skills/design-templates split — entry.tabExamples became entry.tabTemplates and the tab value moved from 'examples' to 'templates'), but entry-chrome-flows still asserted the old label and testId. Update both. * fix(skills+web): preserve template body in API mode and dir-based skill delete Two follow-ups from PR #955 review: 1. ProjectView only received `enabledFunctionalSkills`, but `composedSystemPrompt()` still resolved `project.skillId` through that prop and `fetchSkill()`. Projects created from the new `/api/design-templates` surface keep a template id in `project.skillId`, so opening one in API mode dropped the template body from the system prompt and the upstream request ran without the project's primary template instructions. Now ProjectView takes a separate `designTemplates` prop (the unfiltered template list, so a later-disabled template still loads for projects already created from it) and `composedSystemPrompt()` plus the metadata / `isDeck` lookups fall back to that list, with `fetchDesignTemplate()` as the body-fetch fallback to `fetchSkill()`. The chat composer's `@`-picker keeps receiving only the enabled functional skills. 2. `DELETE /api/skills/:id` used `deleteUserSkill(USER_SKILLS_DIR, skill.id)` which re-slugified the frontmatter id and removed `<userSkillsDir>/<slug>/`. 
That matched the import shape but missed the install shape — `installFromTarget` writes the folder at `sanitizeRepoName(url)` (GitHub) or `path.basename(realpath)` (local symlink), neither of which is guaranteed to equal the slugified frontmatter `name`. A duplicate `app.delete('/api/skills/:id', ...)` handler at the install routes never fired because Express resolved the earlier registration first, leaving the install/uninstall path without working teardown. The handler now removes `skill.dir` (the absolute path listSkills already discovered) under a USER_SKILLS_DIR safety check, using `lstat` + `unlinkSync` so symlinked local installs unlink cleanly without recursing into the user's source tree. The dead duplicate handler is removed; `deleteUserSkill` is dropped from the server.ts import set (still exported and unit-tested in skills.ts). Regression coverage in `apps/daemon/tests/skills-delete-route.test.ts` pins both shapes plus the symlink-preserves-source case. * test(daemon): point hyperframes system-prompt test at design-templates The merge with main brought in a hyperframes system-prompt test that reads `skills/hyperframes/SKILL.md`, but this branch's split moved `hyperframes` into `design-templates/` (same migration as `live-artifact` already handled above in this file). CI was failing with ENOENT on the old path. --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 17:48:34 +08:00
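One fix in the commit above (#955) prunes staged skill chips once their `@<id>` token leaves the draft. A minimal sketch of that pruning step, under stated assumptions: `pruneStagedSkills` and `escapeRegExp` are hypothetical helpers, and the whitespace-boundary pattern is an assumed reading of the rule the commit describes, not the composer's actual regex.

```typescript
// Hypothetical sketch; the real composer code may differ.

// Escape regex metacharacters so a skill id is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Keep only the staged skill ids whose `@<id>` token is still present in the
// draft as a standalone, whitespace-delimited token (assumed boundary rule).
function pruneStagedSkills(draft: string, stagedSkills: string[]): string[] {
  return stagedSkills.filter((id) => {
    const token = new RegExp(`(^|\\s)@${escapeRegExp(id)}(?=\\s|$)`);
    return token.test(draft);
  });
}
```

Calling this from the textarea change handler would keep the `skillIds` sent with the request aligned with what the prompt still references. File attachments are deliberately not pruned the same way, since the commit notes that uploads often have no `@<path>` token in the draft.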
bojie.hbj	bb578b3dca	fix: Support OpenCode Write tool display as card (#1126) The Write tool from OpenCode AI wasn't being displayed correctly as a card. This fix addresses two issues: 1. Tool name normalization: Added support for lowercase 'write' in addition to 'Write' 2. Field naming normalization: Added support for camelCase 'filePath' in addition to snake_case 'file_path' Changes made: - Added `normalizeToolInput()` function in daemon.ts for root-level field normalization - Updated ToolCard.tsx to recognize both tool name variants and field naming conventions - Updated AssistantMessage.tsx for tool name recognition - Updated ProjectView.tsx for file path parsing in auto-open feature This ensures consistent behavior across different AI providers regardless of their tool naming conventions.	2026-05-10 11:49:00 +08:00
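The two normalizations named in the commit above (#1126) could look roughly like this. The commit confirms that a `normalizeToolInput()` function exists in daemon.ts, but the body below is an assumption written for illustration, and `normalizeToolName` is a hypothetical helper.

```typescript
// Hypothetical sketch; only the function name normalizeToolInput comes from
// the commit message, the implementation here is assumed.
type ToolInput = Record<string, unknown>;

// Map the lowercase OpenCode variant onto the canonical tool name.
function normalizeToolName(name: string): string {
  return name === 'write' ? 'Write' : name;
}

// Rename a root-level camelCase `filePath` to snake_case `file_path`,
// leaving inputs that already use `file_path` untouched.
function normalizeToolInput(input: ToolInput): ToolInput {
  if (!('filePath' in input) || 'file_path' in input) return input;
  const { filePath, ...rest } = input;
  return { ...rest, file_path: filePath };
}
```

Normalizing once at the boundary would let components such as the tool card read a single field name rather than checking both conventions at every call site.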
lefarcen	afb331a288	feat: add opt-in Langfuse telemetry (#800 ) * docs(specs): add langfuse telemetry change spec Captures the design for forwarding completed agent runs to Langfuse, including data-model mapping, field-budget caps, privacy gates, build-secret injection, GDPR right-to-deletion approach, and the resolved decisions on default consent, identifier shape, region, and ownership. * feat(daemon): add langfuse-trace module and telemetry prefs Adds the dependency-free building blocks for forwarding completed agent runs to Langfuse. Two layers: - AppConfigPrefs gains installationId and a TelemetryPrefs object with metrics / content / artifactManifest gates. The daemon validator treats telemetry like agentModels — replace-on-write, drop-when-empty, reject non-boolean inner values. - New langfuse-trace.ts builds a {trace-create, generation-create} pair from a ReportContext, capping prompt at 8 KB, output at 16 KB, artifacts at 50 entries, and dropping any batch larger than 1 MB before send. reportRunCompleted is no-op when LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY are unset (so dev runs and forks never emit) and short-circuits on prefs.metrics === false. Server-side wiring into the run-close path lands in a follow-up. * fix(langfuse): default to US Langfuse region End-to-end smoke against the project's actual dev key on 2026-05-07 returned 401 from cloud.langfuse.com (EU) and 207 from us.cloud.langfuse.com (US), confirming the org lives in US. Update the default base URL, the matching test, and the spec's Q3 decision row to match. Self-hosted or EU-region operators can still override via the LANGFUSE_BASE_URL env var. * feat(daemon): wire langfuse trace forwarding into run-close Adds the daemon-side glue to forward completed agent runs: - runs.ts gains an optional onTerminate hook fired once per run after it reaches a terminal state. Errors thrown from the hook are caught and logged, never propagated, so telemetry can never break the run path. 
- New langfuse-bridge.ts assembles a ReportContext from the in-memory run record, the conversation's persisted assistant message, and the user's app-config preferences. It tolerates a missing message (e.g. when web has not yet PUT the final delta) and a missing app-config. - server.ts stashes the original user prompt on the run object inside startChatRun so the bridge can include it without crossing the createChatRunService boundary, and registers the hook callback when building the run service. Behavior remains a no-op unless LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY are set in the daemon env AND telemetry.metrics is true in app-config. A live smoke against us.cloud.langfuse.com on 2026-05-07 confirmed the matching trace + generation schema is accepted (HTTP 207, both events 201 created). * fix(langfuse): address PR #800 review feedback P1 — Move trace forwarding off the daemon-internal run-close hook and onto the message-persistence path. The original onTerminate hook ran inside finish() the moment the SSE 'end' event was emitted, which is before the web client's onDone handler refreshes project files and PUTs producedFiles + final assistant content back to SQLite. Reading SQLite at that moment routinely missed both. The fix: drop the runs.ts hook entirely and trigger from PUT /api/projects/:id/conversations/:cid/ messages/:mid when the saved row carries a terminal runStatus. A reportedRuns Set guards against the multiple PUT calls web makes per turn (each retry / state update). Set entries auto-evict after the same 30 min TTL the runs map uses. Web persists a terminal-status message in all three completion paths — onDone (succeeded), onError (failed), and cancel (canceled) — so this catches every run shape. P2 — postLangfuseBatch now parses the 207 Multi-Status response body. Langfuse legacy ingestion always returns 207, and response.ok is true for 207, so per-event validation errors used to slip through silently. We now warn when body.errors is non-empty. 
Two new unit tests. P2 — truncate() and the HARD_BATCH cap now compare UTF-8 byte length, not String.length (which counts UTF-16 code units). A 4096-character CJK prompt occupies 12 KB, well over the 8 KB input cap. truncate also walks backwards to a UTF-8 leading byte so the cut never lands inside a multi-byte codepoint. New unit test covers '设'.repeat(4096). P2 — Spec R7 now lists the actual Langfuse trace deletion endpoint (DELETE /api/public/traces/{traceId} for single, DELETE /api/public/traces with body for batch). Verified by curl on us.cloud.langfuse.com: DELETE /api/public/traces/X → 200; the path the original spec named (POST /api/public/trace/X) returns 404. Reference link points at langfuse.com/docs/administration/data-deletion. P3 — Q4 (legacy ingestion vs OTel) moved from Open Questions to Resolved Decisions. The implementation already commits to legacy and the trade-off was discussed during design; the open-question status was stale. * feat(web): privacy consent surface + Settings → Privacy tab Adds the user-facing half of the telemetry feature so the daemon-side hook from PR #800 has something to talk to. - AppConfig gains optional `installationId` (anonymous v4 uuid generated on first opt-in; null after explicit decline; undefined when the user has never seen the consent surface) and `telemetry: TelemetryConfig` ({metrics, content, artifactManifest}). syncConfigToDaemon round-trips both fields so the bridge module sees the same prefs. - SettingsDialog grows a Privacy section with two states. When the user has never made a consent decision (typical first-run path), the section renders the GDPR-aligned consent card: a kicker, the disclosure body listing both metrics and conversation content as separate bullets, and two equally-prominent buttons ("Share usage data" / "Don't share"). The Don't-share path keeps the app fully usable (core app must work with all tracking declined). 
After a decision the same panel switches to three independent toggles + the anonymous ID + a "Delete my data" button that rotates the ID and turns everything off. - App.tsx points the welcome modal at the new Privacy section so the consent decision is the first thing a fresh installation sees. - 17 i18n keys land in en + zh-CN + zh-TW with hand-translated copy, and as English placeholders in the remaining 14 locales — enough for the parity check to pass while leaving room for proper localisation in a follow-up. Dict type updated. - Minimal index.css for the consent card + toggle rows so the panel is legible without depending on follow-up design polish. Telemetry remains a no-op end-to-end until the user clicks Share usage data: the daemon gate (prefs.metrics === true) keeps every code path short-circuited otherwise. * refactor(web): rebuild Privacy panel using project-native settings primitives The first cut used custom .settings-privacy-* classes + raw HTML checkboxes that didn't match any other Settings tab. Replace with the shell other sections already use: - settings-subsection containers with section-head + h4 + .hint - seg-control / seg-btn pill toggles ("active" / "offline") for each of the three telemetry preferences, mirroring NotificationsSection - a 2-cell seg-control for the consent card so Share usage data and Don't share carry identical visual weight (the GDPR equal-prominence requirement that the previous accent / outline split missed) - ghost button + readonly text input for the installation id row, mirroring the API-key field pattern elsewhere Drop the bespoke CSS block in favor of inheriting the existing settings-section / seg-control / ghost styling. The only privacy- specific style left is a tight definition list inside the consent card for the metrics + content disclosure rows. 
* refactor(web): use .toggle-row iOS switch for Privacy preferences Active/offline pills (the seg-control single-cell pattern that NotificationsSection uses) read awkwardly for a flat preference list. Switch the three telemetry toggles to .toggle-row — the same control NewProjectPanel uses for "speaker notes" / "animations": label + hint on the left, iOS-style sliding switch on the right, full-row click target. The consent card's two-button seg-control stays as-is — there the equal-weight pill pair is exactly what GDPR equal-prominence wants. * feat(web): standalone first-run privacy consent banner Replaces the Settings-dialog-as-onboarding hack with a dedicated bottom-right banner card that mounts whenever the user has never made a privacy decision (cfg.installationId === undefined). The banner is prominent (anchored to the corner with a soft shadow) but non-blocking, mirrors cookie-consent UX, and shares the project's panel styling — same .modal-elevated background, --radius-lg corners, --shadow-lg lift. Wiring: - App.tsx imports PrivacyConsentModal and renders it at the root, gated on installationId === undefined && !settingsOpen so it doesn't double up with the Privacy tab's own consent card when Settings is already showing. - Share / Don't share both go through handleConfigPersist, so the resulting installationId + telemetry prefs land in localStorage and the daemon at the same time, reusing the existing autosave plumbing. - The previous attempt that pinned the welcome SettingsDialog to the Privacy section is reverted; onboarding now stays focused on agent configuration, and the consent decision lives in its own surface. * fix(web): keep privacy banner visible while Settings welcome modal is open The banner gated itself on `!settingsOpen` to avoid double-rendering with the Privacy tab's consent card. 
But the first-run path opens the Settings welcome modal automatically when `onboardingCompleted=false`, which fired immediately after bootstrap — so the banner flashed for a moment and then vanished behind the modal backdrop. Drop the `!settingsOpen` clause so the banner stays mounted whenever the user has not yet made a privacy decision, and bump its z-index above the modal backdrop (200 vs 100) so first-run users can actually reach the consent buttons. The minor visual overlap with the Privacy tab's own card is fine: clicking either copy resolves both surfaces. * copy(privacy): soften consent button labels Banner action buttons now read "Help improve Open Design" / "Not now" (en, with hand translations in zh-CN / zh-TW and English placeholders in the other 13 locales) instead of "Share usage data" / "Don't share". The new wording aligns the affirmative action with the kicker copy ("Help us improve Open Design") and reads less alarming, while the disclosure list above still names both data categories explicitly so the consent stays informed under GDPR. The decline button stays as a soft "Not now" rather than an aggressive "Don't share" so the reject path doesn't read as hostile to the user. No structural change — the two-cell seg-control still gives the buttons identical visual weight, and the underlying side-effects are unchanged (installationId is generated on Help / nulled on Not now, and the telemetry prefs flip the same way). * feat(telemetry): expand trace fields for evals & dataset construction Each Langfuse trace now ships the full per-turn + per-install fact sheet that the eval/dataset workflow needs, instead of only the bare turn id + token count from before. Everything below is gated by `prefs.metrics === true`; nothing here is content (those gates remain separate). Per-turn: - model — first-class generation.model field, drives Langfuse cost lookup and model-grouping in the UI; also mirrored in trace.metadata and trace.tags so list-view filters work. 
- reasoning — generation.modelParameters.{ reasoning } so the Model Parameters card lights up; mirrored in metadata. - skillId / designSystemId — metadata + tags, so dataset slices can group by which skill/DS produced which output. Per-process / build (constant within one daemon run, cached at start): - appVersion / appChannel / packaged from app-version.ts - nodeVersion (process.version), os (platform()), osRelease, arch (os.arch()) - clientType — desktop vs web, derived from a new X-OD-Client header the web layer sets in providers/daemon.ts (with a User-Agent sniff fallback for third-party callers). Plumbing: - startChatRun stashes model / reasoning / skillId / designSystemId on the run object alongside the existing userPrompt stash. - POST /api/runs reads X-OD-Client and stores run.clientType. - langfuse-bridge collects RuntimeInfo once per process and merges per-run client carrier; ReportContext gains optional `turn` + `runtime` blocks; existing fields stay backward compatible. Spec gains a "Telemetry Fields Catalog" section enumerating every field, its source, and the gate it lives under, so the eval team has a single place to look up what's available without reading the trace schema by example. Tests: - new langfuse-trace tests cover turn tags, runtime tags, generation model/modelParameters promotion, modelParameters omission when reasoning is unset, and metadata mirroring. - langfuse-bridge gains an end-to-end "turn-level config" test that threads model/reasoning/skill/DS/clientType + appVersion through the bridge and asserts the Langfuse payload shape. - existing tests adjusted to tolerate host-dependent os tag. * copy(privacy): trim Share button to verb phrase only "Help improve Open Design" overflowed the equal-width 2-cell seg-control on the consent banner — the product name is already in the kicker + headline above the buttons, so the button itself only needs the verb phrase. 
Drop the product name from all locales: - en: Help improve Open Design → Help improve - zh-CN: 帮助改进 Open Design → 帮助改进 - zh-TW: 協助改進 Open Design → 協助改進 The decline button ("Not now" / "暂不" / "暫不") was already short, so the two buttons now have comparable length and the equal-prominence seg-control fits cleanly. Standalone Settings → Privacy panel uses the same labels for consistency. * fix(web): defer Settings welcome modal until privacy decision is made Previously bootstrap raced two surfaces against each other on first launch: the privacy consent banner (gated on installationId === undefined) and the Settings welcome modal (gated on onboardingCompleted === false). The banner's higher z-index kept it above the backdrop visually, but having two foreground surfaces at once is still confusing UX. Sequence them instead: bootstrap only opens the welcome modal when the user has already resolved consent (installationId !== undefined). Until then the banner owns the foreground alone. Once the user clicks Help improve / Not now, the corresponding handler hands off to the welcome modal if onboarding is still pending. End state matches what it was before — just without the simultaneous-render flash. * debug(privacy): log banner gate state to track sudden disappearance Two console.log points to find which setCfg call (or stale bundle) is flipping cfg.installationId from undefined to a value while the banner is visible. To remove once the regression is reproduced. * fix(privacy): keep installationId + telemetry out of localStorage Daemon is now the single source of truth for the privacy decision. 
Why this matters: the consent banner gates on `config.installationId === undefined`, but loadConfig() merges localStorage on top of the daemon's reply, so a stale uuid in `open-design:config` (left over from a previous opt-in) was re-hydrating the React state and immediately syncing back to the daemon — defeating "Delete my data" and re-suppressing the banner within milliseconds of every page load. The deeper reason to fix it here, not just patch the gate: a privacy identifier persisted in browser storage that the user can't see or clear without DevTools is a compliance liability. Anything users can revoke needs one canonical place to store it. Daemon `app-config.json` already serves that role for everything else gated through syncConfigToDaemon, so installationId + telemetry now ride that path exclusively: - saveConfig() strips both keys before writing localStorage. - loadConfig() strips both keys when reading older stale payloads, so existing installs migrate transparently on next launch. - syncConfigToDaemon() / mergeDaemonConfig still round-trip them, so the React state stays in sync with the daemon as before. Net effect: clearing app-config.json (or hitting "Delete my data") now fully resets the install identity, with no residual cohort key in browser storage. * feat(privacy): scrub secrets + PII from prompt/output before send When prefs.content is on, daemon now runs the prompt and assistant text through a regex scrubber (apps/daemon/src/redact.ts) before posting to Langfuse. The scrubber is the simplest thing that gives the user-facing copy a truthful claim — pure regex, zero new dependencies, fully auditable in this Apache-2.0 repo (vs. pulling a single-maintainer 5-month-old npm package into a core process). Categories covered (each replaced with [REDACTED:<kind>]): - Anthropic / OpenAI sk- keys (incl. 
proj/live/test/ant variants) - Langfuse pk-lf- / sk-lf- (specific rule wins over generic sk-) - GitHub gh[opsur]_ tokens - AWS access key ids (AKIA + 16 uppercase) - Google API keys (AIza + 35) - Slack xox[abprs]- tokens - Stripe live/test keys - JWT header.payload.signature triples - Bearer-header values (scheme word stays readable) - Emails, IPv4, US-style phone numbers - Credit cards — 13–19 digit runs that pass a Luhn check, so order ids and unix-nanos timestamps that fail Luhn pass through unchanged Not covered, stated openly in spec + i18n: names, postal addresses, business-secret semantics, raw 40-hex tokens (too high a false-positive cost for artifact slugs). Those would require an ML layer. Wired in: - apps/daemon/src/redact.ts — exports redactSecrets() + redactSecretsWithCounts() helper for future audit-summary metadata. - apps/daemon/src/langfuse-bridge.ts — runs both prompt and output through redactSecrets() before they reach the trace builder. - 18 unit tests cover every pattern plus negative cases (Luhn-failing digit runs, out-of-range IPv4 octets, idempotence on re-redacted text, ordinary prose passthrough). - i18n privacyContentHint on en + zh-CN + zh-TW (plus 14 locale placeholders) enumerates the categories so the consent disclosure matches the implementation — the GDPR informed-consent requirement. - spec gains a Pre-send Redaction subsection with the regex shape table + intentional non-coverage list. Drive-by: dropped the [privacy] debug logs that traced the now-fixed bootstrap regression. * fix(telemetry): make Langfuse reporting resilient * feat(telemetry): nest Langfuse turn observations * feat(telemetry): emit Langfuse tool spans * fix(telemetry): report after finalized message writes * fix(telemetry): honor persisted terminal status * fix(web): let consent banner yield page clicks * fix(telemetry): report current turn prompt only	2026-05-09 10:06:01 +08:00
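The Luhn-gated credit-card rule from the redaction commit above can be illustrated with a minimal sketch. This is not the contents of `apps/daemon/src/redact.ts`: the helper names `passesLuhn` and `redactCardNumbers`, the `card` kind label, and the exact regex are assumptions made for illustration, and the sketch only handles contiguous digit runs.

```typescript
// Hypothetical helpers; names and regex are assumptions, not the repo's code.

/** Returns true when the digit string satisfies the Luhn checksum. */
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  // Walk right to left, doubling every second digit.
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

/** Replaces 13-19 digit runs that pass Luhn; other digit runs pass through. */
function redactCardNumbers(text: string): string {
  return text.replace(/\b\d{13,19}\b/g, (run) =>
    passesLuhn(run) ? "[REDACTED:card]" : run,
  );
}
```

Because the replacement label contains no digits, running the function a second time over already-redacted text leaves it unchanged, which matches the idempotence case the commit lists among its unit tests.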
Tom Huang	56bf6ee1b6	feat: agent-callable research command and /search (#615 ) * feat: pre-generation research (Tavily) for grounded generation Adds an optional pre-generation research step so the agent can produce slides / prototypes / decks grounded in real sources instead of guessing. User flow: 1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY). 2. Click the new Research button in the chat composer. 3. On send, the daemon runs a Tavily search, prepends the findings as a <research_context> block ahead of the system prompt, and spawns the agent. Research progress shows up as status pills in the chat stream; the agent cites sources inline as [1]/[2]/... Phase 1 surface: - Single provider (Tavily), single depth ('shallow'), no LLM synthesis pass (Tavily's `answer` is the summary). - Composer toggle only; no popover / depth picker yet. - Reuses the existing `status` SSE agent payload + StatusPill UI so no new event variants or renderer code are needed. Layers touched: - contracts: ResearchOptions / Source / Findings DTOs; ChatRequest.research; export from index. - daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator + provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook in startChatRun before prompt assembly. - web: ChatComposer toggle + ChatSendMeta; threaded through ChatPane / ProjectView / streamViaDaemon into ChatRequest. Side fix (required to land the feature, but useful on its own): contracts internal relative imports lacked the `.js` suffix that NodeNext module resolution requires. This was already breaking `pnpm --filter @open-design/daemon typecheck` on main; without the fix, none of the new research types were visible to the daemon. All internal contracts imports now carry `.js`. Spec: specs/current/research-feature.md (phases 2-4 outlined for follow-up: composer popover, multi-provider, deep recursion, example skills with research_recommends). 
Verified: - pnpm --filter @open-design/contracts typecheck/test - pnpm --filter @open-design/daemon typecheck (the chokidar project-watchers test is a pre-existing flake, unrelated) - pnpm --filter @open-design/web typecheck - node scripts/verify-media-models.mjs * fix(daemon): clamp Tavily max_results to 20 Tavily's /search endpoint requires `max_results` in [0, 20]; sending a larger value (e.g. when `research.depth: "deep"` resolves to 30) returns 400 and `runResearch` silently falls back to no-research. Clamp at the provider boundary so Phase 2 depth tiers above 20 still produce results instead of failing the request. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * Remove stale research merge leftovers * Add agent-callable research search * Fix Indonesian locale typecheck * Fix research command invocation edge cases * Harden slash search prompt expansion * Honor research source caps in command contract * Require search reports in design files * Add research data provider settings * Wire web research provider fallback order * Update research provider fallback wording * Revert "Update research provider fallback wording" This reverts commit `86fb6001e3`. * Revert "Wire web research provider fallback order" This reverts commit `4c9e16036b`. * Revert "Add research data provider settings" This reverts commit `23630d1746`. * Add Dexter and Last30Days research skills * Add DCF and Last30Days OD skills * Add Last30Days and Dexter skills * Resolve research review threads --------- Co-authored-by: a1chzt <chizblank@gmail.com>	2026-05-08 10:33:44 +08:00
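The `max_results` clamp described in the Tavily fix above amounts to bounding the requested count at the provider boundary. A minimal sketch, assuming a hypothetical helper name `clampMaxResults`; the fallback for non-finite input is also an assumption, not something the commit message specifies.

```typescript
// Hypothetical helper; the real provider code lives in apps/daemon/src/research/tavily.ts.
const TAVILY_MAX_RESULTS_LIMIT = 20; // upper bound stated in the commit message

/** Bounds a requested result count to the [0, 20] range the commit describes. */
function clampMaxResults(requested: number): number {
  // Assumption: non-finite input falls back to the upper bound.
  if (!Number.isFinite(requested)) return TAVILY_MAX_RESULTS_LIMIT;
  return Math.min(TAVILY_MAX_RESULTS_LIMIT, Math.max(0, Math.trunc(requested)));
}
```

With this in place, a depth tier that resolves to 30 sends 20 instead of triggering the 400 response and the silent no-research fallback.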
Marc Chan	c3d9136a0c	Add live artifacts and Composio connector catalog (#381) * docs: add live artifacts implementation spec * docs: align live artifacts implementation plan * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 7: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 10: work in progress * Ralph iteration 11: work in progress * Ralph iteration 12: work in progress * Ralph iteration 13: work in progress * Ralph iteration 14: work in progress * Ralph iteration 15: work in progress * Ralph iteration 16: work in progress * Ralph iteration 17: work in progress * Ralph iteration 18: work in progress * Ralph iteration 19: work in progress * Ralph iteration 20: work in progress * Ralph iteration 21: work in progress * Ralph iteration 22: work in progress * Ralph iteration 23: work in progress * Ralph iteration 24: work in progress * Ralph iteration 25: work in progress * Ralph iteration 26: work in progress * Ralph iteration 27: work in progress * Ralph iteration 28: work in progress * Ralph iteration 29: work in progress * Ralph iteration 30: work in progress * Ralph iteration 31: work in progress * Ralph iteration 32: work in progress * Ralph iteration 33: work in progress * Ralph iteration 34: work in progress * Ralph iteration 35: work in progress * Ralph iteration 36: work in progress * Ralph iteration 37: work in progress * Ralph iteration 38: work in progress * Ralph iteration 39: work in progress * Ralph iteration 40: work in progress * Ralph iteration 41: work in progress * Ralph iteration 42: work in progress * Ralph iteration 43: work in progress * Ralph iteration 44: work in progress * Ralph iteration 45: work in progress * Ralph iteration 46: work in progress * Ralph iteration 47: work in progress 
* Ralph iteration 48: work in progress * Ralph iteration 49: work in progress * Ralph iteration 50: work in progress * Ralph iteration 51: work in progress * Ralph iteration 52: work in progress * Ralph iteration 53: work in progress * Ralph iteration 54: work in progress * Ralph iteration 55: work in progress * Ralph iteration 56: work in progress * Ralph iteration 57: work in progress * Ralph iteration 58: work in progress * Ralph iteration 59: work in progress * Ralph iteration 60: work in progress * Ralph iteration 61: work in progress * Ralph iteration 62: work in progress * Ralph iteration 63: work in progress * Ralph iteration 64: work in progress * Ralph iteration 65: work in progress * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 17: work in progress * Add Composio-backed connectors * Add Composio-backed connector catalog * Fix connector callback flow * Update live artifact connector refresh * Fix live artifact refresh updates * Improve live artifact viewer toolbar * Refine live artifact source tabs * Expand Composio connector catalog * Improve Composio connector browsing * Fix artifact refresh source safety checks Generated-By: looper 0.4.1 (runner=fixer, agent=opencode) * Fix live artifacts PR feedback Generated-By: looper 0.5.0 (runner=fixer, agent=opencode) * Fix live artifact preview CORS validation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix connector OAuth IPv6 loopback hosts Allow bracketed IPv6 loopback Host headers when deriving connector OAuth callback URLs so IPv6-bound daemons can complete connection flow. 
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Preserve live artifact refresh permissions Respect explicit refresh permission choices during live artifact create and update flows so revoked connector sources remain gated. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact preview cache freshness Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact refresh validation Guard manual refreshes with local daemon checks and reject daemon_tool sources without a toolName before refresh execution. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix Composio credential invalidation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact CORS methods Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix workspace validation Restore media config test isolation under Vitest setup data-dir overrides and add the missing French live artifact display copy so the workspace test suite stays aligned. Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector safety filtering Keep agent-preview connector listings aligned with execution safety policy and prune stale Composio OAuth state records before they accumulate. 
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix agent runtime cleanup Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix live artifact daemon access Validate local-only live artifact routes against the peer socket address and pass daemon-resolved CLI paths to ACP MCP descriptors. Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector run limit pruning Evict stale connector rate-limit buckets so long-lived daemon processes do not retain per-run entries indefinitely. Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector compact schemas Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Improve connector connection feedback * Adjust connector gate positioning * Fix live artifact refresh commits Avoid marking refresh candidates failed after snapshot or state persistence errors by deferring live artifact mutations until the durable refresh metadata is written. Also align connector OAuth callback host validation with daemon loopback handling. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Improve connector search relevance * fix(daemon): harden connector connection state Require loopback daemon validation before connector connect side effects and only clear provider-owned connector statuses during credential reset. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard connector disconnect route Require local daemon request validation before connector disconnect side effects. 
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard composio config updates Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): dispatch live artifacts mcp first Route the live-artifacts MCP server before the generic MCP CLI so od mcp live-artifacts starts the dedicated server instead of failing generic argument parsing. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): handle integer connector schemas Allow JSON Schema integer connector inputs while preserving fractional-value validation so generated connector tool schemas accept valid page sizes and limits. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix: align live artifact refresh error codes Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact connector refresh flow * Update live artifact design cards * Add beta badge to live artifact form * Remove live artifact tile model * Fix live artifact refresh sync * Fix live artifact MCP refresh durability Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact refresh safety Enforce persisted refresh opt-out and connector auto-read gating before refresh sources execute. Generated-By: looper 0.5.5 (runner=fixer, agent=opencode)	2026-05-05 16:42:11 +08:00
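The "handle integer connector schemas" fix above distinguishes JSON Schema `integer` from `number`. A minimal sketch of that distinction, assuming a hypothetical `NumericSchema` shape and helper name `matchesNumericSchema`; the daemon's actual validator is not shown in the commit message and may differ.

```typescript
// Hypothetical types and helper; not the daemon's actual connector validator.
type NumericSchema = { type: "integer" | "number" };

/** Accepts whole numbers for "integer" and any finite number for "number". */
function matchesNumericSchema(schema: NumericSchema, value: unknown): boolean {
  if (typeof value !== "number" || !Number.isFinite(value)) return false;
  // Fractional values are rejected only when the schema asks for an integer.
  return schema.type === "integer" ? Number.isInteger(value) : true;
}
```

Under this sketch a page size of `20` satisfies an `integer` schema, while `2.5` is rejected for `integer` and still accepted for `number`.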
Caprika	0c00f241e7	Add preview comment attachments (#284)	2026-05-02 19:23:46 +08:00
nettee	3fb849d047	Fix chat runs surviving web disconnects (#146 ) * fix chat runs surviving web disconnects * fix chat run create abort propagation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon keepalive reconnect budget Generated-By: looper 0.0.0-dev (runner=fixer, agent=gpt-5.5) * fix daemon stream disconnect cancellation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon stream abort cancellation race Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon run cancellation semantics * fix load * doc * 2 * add run refresh recovery * fix active run refresh status * fix reattach abort handling * fix * fix chat initial scroll * fix daemon start failures Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix stop run status Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * extract daemon run service * move prompt composition to daemon * fix prompt module resolution * fix project id generation * add project run status * add designs kanban view with awaiting_input status - add grid/kanban view toggle on Designs tab; persist choice in localStorage - introduce awaiting_input project display status (daemon-derived from unanswered <question-form>) so projects asking the user aren't shown as Completed; ordered between Running and Completed with amber accent - hide transient queued state from users: coerce queued/starting to running in daemon /api/projects projection and drop the queued kanban column - a11y polish on Designs cards: Space activation, aria-labels on delete, focus-visible outlines, reveal delete on focus-within and touch, prefers-reduced-motion handling - kanban layout uses flex sizing instead of viewport math; scoped icon- only pill button rule fixes 
view-toggle icon alignment --------- Co-authored-by: mrcfps <mrc@powerformer.com>	2026-04-30 20:16:46 +08:00
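The display-status rules in the kanban commit above (coerce `queued`/`starting` to `running`, and order `awaiting_input` between Running and Completed) can be sketched as follows. The status union here is an assumption limited to the values named in the commit message; the real projection in `/api/projects` likely covers more states, and the helper names are hypothetical.

```typescript
// Assumed status set, limited to values the commit message names.
type RunStatus = "queued" | "starting" | "running" | "awaiting_input" | "completed";
type DisplayStatus = Exclude<RunStatus, "queued" | "starting">;

// Column order described in the commit: Running, then awaiting_input, then Completed.
const DISPLAY_ORDER: readonly DisplayStatus[] = ["running", "awaiting_input", "completed"];

/** Hides transient states by reporting them as running. */
function toDisplayStatus(status: RunStatus): DisplayStatus {
  return status === "queued" || status === "starting" ? "running" : status;
}

/** Comparator that sorts projects by their user-facing status column. */
function compareByDisplayStatus(a: RunStatus, b: RunStatus): number {
  return DISPLAY_ORDER.indexOf(toDisplayStatus(a)) - DISPLAY_ORDER.indexOf(toDisplayStatus(b));
}
```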
PerishFire	c6d11018a0	Refresh desktop integration control plane (#123) * feat(dev): add desktop tools-dev control plane * refactor(sidecar): split Open Design contracts Move Open Design-specific sidecar protocol definitions into @open-design/contracts so sidecar and platform can remain descriptor-driven primitives. * refactor(daemon): organize package sources Keep daemon app code, tests, and sidecar entrypoints in separate package directories so each layer can be built and verified independently. * chore(repo): streamline maintenance entrypoints Centralize agent guidance by directory and reduce root command chains while preserving the existing build scope. * docs: translate agent guidance to English * fix(sidecar): tolerate stale IPC sockets Remove stale Unix socket files only after confirming no listener is active, so tools-dev can restart after unclean shutdowns.	2026-04-30 14:23:53 +08:00
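The stale-socket handling in the sidecar fix above follows a common probe-then-unlink pattern. A minimal sketch using Node's `net` and `fs` modules; the function names, the 500 ms timeout, and the choice to treat only `ECONNREFUSED` as proof of staleness are assumptions, not the sidecar's actual code.

```typescript
import * as fs from "node:fs";
import * as net from "node:net";

/** Resolves true only when connecting is refused, i.e. no listener owns the path. */
function isSocketStale(socketPath: string, timeoutMs = 500): Promise<boolean> {
  return new Promise((resolve) => {
    const probe = net.createConnection({ path: socketPath });
    const finish = (stale: boolean): void => {
      probe.destroy();
      resolve(stale);
    };
    // A successful connect or a timeout is treated as "possibly live": keep the file.
    probe.setTimeout(timeoutMs, () => finish(false));
    probe.once("connect", () => finish(false));
    probe.once("error", (err: Error) => {
      finish((err as NodeJS.ErrnoException).code === "ECONNREFUSED");
    });
  });
}

/** Removes the socket file if it exists and nothing is listening; reports whether it did. */
async function removeStaleSocket(socketPath: string): Promise<boolean> {
  if (!fs.existsSync(socketPath)) return false;
  if (!(await isSocketStale(socketPath))) return false;
  fs.unlinkSync(socketPath);
  return true;
}
```

Erring toward keeping the file on ambiguous results means a slow but live listener is never unlinked, at the cost of occasionally leaving a stale file for the next attempt.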
nettee	56d08b8c5f	Add shared contracts and migrate project code to TypeScript (#118)	2026-04-30 13:01:15 +08:00
Caprika	5c45c3b967	fix sse keepalive behind nginx (#111)	2026-04-30 11:31:18 +08:00
PerishFire	cfebff9653	Align app directories and isolate e2e tests (#102) * chore: align app directories * test: consolidate external suites under e2e	2026-04-30 09:47:03 +08:00

28 commits