Commit graph

237 commits

Author SHA1 Message Date
Siri-Ray
170a05f5d2
Formalize skill artifacts into plugins (#3085)
* Add skill-to-plugin candidate flow

* Fix skill plugin candidate card reuse

Generated-By: looper 0.9.1 (runner=fixer, agent=codex)

* Fix skill plugin candidate dismiss and URL gates

Generated-By: looper 0.9.1 (runner=fixer, agent=codex)

* Polish skill plugin candidate copy
2026-05-27 08:26:00 +00:00
吴杨帆
916438d919
fix(daemon): hide agent executable paths from chat status (#2874) (#3046)
Stop emitting resolved filesystem paths in chat start events and
inactivity-timeout diagnostics; surface agent ids instead.
Complements web-side redaction in #2894.
2026-05-27 06:22:56 +00:00
chaoxiaoche
fce444bcab
Consolidate chat comments preview on main (#2906)
* feat(web): queue chat sends

* feat(web): render code comment directives

* feat(web): add preview comments and manual edits

* fix(web): polish shared chrome controls

* fix(web): align queued send loading state

* feat(web): open primary project artifacts

* fix(web): keep queued sends and tests aligned

* fix(web): restore docked comment tools layout

* fix(web): align preview comment toolbar

* fix(web): place local cli beside handoff

* fix(web): move agent menu beside handoff

* fix(web): make project instructions a direct header action

* fix(web): compact handoff and toolbar labels

* fix(web): clarify handoff menu and annotation label

* fix(web): restore compact cursor handoff trigger

* fix(web): align agent menu trigger with handoff

* fix(web): add draw toolbar close action

* fix(web): move inspect editing into edit mode

* fix(web): avoid reserving comment sidebar in annotation mode

* fix(web): float preview comments panel

* fix(web): keep edit canvas full width

* fix(web): polish preview annotation tools

* fix(web): highlight active preview comments

* fix(web): open comments panel after annotation save

* fix(web): polish comment handoff controls

* fix(web): remove palette preview tool

* fix(web): simplify draw annotation toolbar

* fix(web): restore queued tasks into composer

* fix(web): restore queued send strip styling

* fix(web): hide internal comment target ids

* fix(web): align manual edit panel header

* test(web): cover visual interaction contracts

* fix(web): address PR feedback regressions

* fix(web): preserve artifact chrome state

* fix(daemon): restore project raw file routes

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-26 10:31:19 +00:00
Yuhao Chen
fb1e0c819f
fix(plugins): reject symlinked plugin assets (#2036)
* fix(plugins): reject symlinked plugin assets

* test(plugins): cover asset directory symlink escapes

* fix(plugins): reject symlinked asset path segments
2026-05-26 07:03:22 +00:00
李晏丞
53b9d779ac
fix(daemon): widen HTTP keep-alive on the daemon listener (#2557)
* fix(daemon): widen HTTP keep-alive so SSE survives idle gaps

The daemon's `/api/runs/:id/events` SSE stream emits an in-band
`: keepalive` comment every 25s (`SSE_KEEPALIVE_INTERVAL_MS`), but
Node's default `server.keepAliveTimeout` is 5_000ms. When a run is
quiet for more than five seconds — e.g. the agent is still composing,
or the user briefly walks away — Node closes the underlying TCP
connection from under the SSE writer, the next 25s ping lands on a
dead socket, and the browser surfaces it as a generic
"network error" mid-stream.

This is most visible behind any keep-alive-aware middlebox (the
nginx running in the desktop bundle, the socat/docker bridges users
set up for remote access, EC2 security-group idle timers): the
default 5s window is shorter than every reasonable in-band keepalive
cadence, so the connection dies before the application gets a chance
to assert it's still alive.

Set the listener to:

- `keepAliveTimeout = 120_000` — 4.8× the in-band keepalive, plenty
  of slack for clock skew and slow flushes.
- `headersTimeout = 125_000` — must exceed `keepAliveTimeout` per the
  Node docs, otherwise a misbehaving client can stall request parsing
  indefinitely.
- `requestTimeout = 0` — disable the per-request timeout entirely;
  an SSE response intentionally runs for as long as the agent runs.

Verified by curling
`/api/runs/<id>/events` from inside the daemon container and
watching the connection stay open through three full 25s keepalive
cycles where it previously RST'd at ~5s.

* fix(daemon): address PR #2557 review — drop requestTimeout, add regression test

Three changes responding to @PerishCode's review (#2557):

1. Drop `server.requestTimeout = 0`. The reviewer is correct: that knob
   bounds how long the server waits to *receive* a complete request
   (headers + body) and is cleared the moment the request is fully
   parsed — it does not gate the duration of an SSE response. Setting
   it to 0 only removes Node 18+'s default 300s slow-loris guard, which
   is a real regression on a daemon that binds to 0.0.0.0 / Tailscale.

2. Rewrite the comment block. The previous comment claimed
   `keepAliveTimeout` "closes any idle SSE connection." Per the Node
   docs, `keepAliveTimeout` arms *after* a response finishes writing —
   it bounds the between-request idle gap on a kept-alive socket, not
   an in-flight streaming response. SSE drops mid-stream are almost
   always middlebox idle timers (nginx, socat/docker, EC2 NAT), not
   Node's own socket timeout, and this listener-side change cannot
   extend a connection past those middleboxes.

   What this PR actually fixes: routine kept-alive sockets used around
   an SSE stream (status polls, run-status fetches, the initial GET
   before the SSE upgrade) surviving normal client pauses. 120s gives
   comfortable headroom over the 25s in-band cadence so chat clients
   stop reconnect-storming between bursts.

3. Add `apps/daemon/tests/server-keepalive.test.ts` so a future
   refactor cannot silently restore the Node defaults. The test uses
   the existing `startServer({ port: 0, returnServer: true })` fixture
   (mirroring version-route.test.ts) and asserts the listener's
   `keepAliveTimeout` and `headersTimeout` invariants.

Verified:
- pnpm --filter @open-design/daemon run typecheck passes
- pnpm vitest run tests/server-keepalive.test.ts → 2 passed
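
A minimal sketch of the final listener settings described above. The `ListenerTimeouts` type and the `widenKeepAlive` / `checkTimeoutInvariants` helpers are hypothetical names for illustration, not the daemon's actual code. Node's `http.Server` exposes both properties, so a real server could be passed in unchanged. The invariant check is only a guess at the kind of assertion the regression test makes.

```typescript
// Hypothetical structural type: any object with these two knobs works,
// including a Node http.Server instance.
interface ListenerTimeouts {
  keepAliveTimeout: number;
  headersTimeout: number;
}

// Values quoted in the commit message.
const SSE_KEEPALIVE_INTERVAL_MS = 25_000;
const KEEP_ALIVE_TIMEOUT_MS = 120_000;
const HEADERS_TIMEOUT_MS = 125_000;

// Apply the widened timeouts. requestTimeout is deliberately not touched,
// so Node's default slow-request guard stays in place.
function widenKeepAlive(server: ListenerTimeouts): void {
  server.keepAliveTimeout = KEEP_ALIVE_TIMEOUT_MS;
  server.headersTimeout = HEADERS_TIMEOUT_MS;
}

// Return a list of violated invariants (empty when the listener is healthy).
function checkTimeoutInvariants(server: ListenerTimeouts): string[] {
  const problems: string[] = [];
  if (server.keepAliveTimeout <= SSE_KEEPALIVE_INTERVAL_MS) {
    problems.push("keepAliveTimeout must exceed the in-band keepalive cadence");
  }
  if (server.headersTimeout <= server.keepAliveTimeout) {
    problems.push("headersTimeout must exceed keepAliveTimeout");
  }
  return problems;
}
```

Keeping the check as a pure function over a plain object means it can be exercised without binding a port.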
2026-05-26 04:03:44 +00:00
Patrick A
7bc11b398d
chore(deps): upgrade express 4 -> 5 in daemon (#2311)
* chore(deps): upgrade express 4.22.1 -> 5.2.1 and @types/express

Breaking changes addressed:
- Renamed all bare wildcard route segments from * to *splat across
  src/server.ts, src/static-resource-routes.ts, src/project-routes.ts,
  src/import-export-routes.ts, and all three test stubs that define
  app.get/options/delete routes using /raw/* patterns
- Updated wildcard param access from (req.params as any)[0] / req.params[0]
  to Array.isArray(req.params.splat) ? req.params.splat.join('/') : String(...)
  to handle the Express 5 / path-to-regexp v8 change where wildcard params
  are now string[] instead of string
- Updated app.get('*') SPA fallback to app.get('/*splat') in server.ts
- Annotated five connector route handlers with Request<{ connectorId: string }>
  so the typed param resolves as string, not string | string[], fixing the
  10 TS2345 / TS2322 errors that surfaced when @types/express moved to 5.0.6
- Fixed two app.listen() beforeAll callbacks in origin-validation.test.ts to
  accept and propagate the optional Error argument Express 5 now passes to
  the listen callback, resolving TS2769 overload mismatch
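
The wildcard-param change above can be sketched as a small normalizer. `splatToPath` and `SplatParam` are hypothetical names, and the empty-string fallback for a missing param is an assumption (the commit message elides the `String(...)` argument).

```typescript
// With a named wildcard such as `*splat`, Express 5 / path-to-regexp v8
// delivers the matched segments as string[]; Express 4 exposed a single
// string at req.params[0]. This helper accepts either shape.
type SplatParam = string | string[] | undefined;

function splatToPath(splat: SplatParam): string {
  if (Array.isArray(splat)) {
    // Rejoin the segments into the relative path the route matched.
    return splat.join("/");
  }
  // Assumption: treat a missing param as an empty path.
  return String(splat ?? "");
}
```

A handler would call it as `splatToPath(req.params.splat)` before resolving the file on disk.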

* chore(nix): refresh daemonHash for rebased lockfile

* fix(daemon): await res.sendFile() in async route handlers for Express 5 compatibility

Express 5 res.sendFile() returns a Promise. Without await, async route
handlers return before the response is sent, causing Express to call
next() and fall through to a 404. Add await to all res.sendFile() calls
in async handlers in static-resource-routes.ts and server.ts.

* fix(daemon): use readFile+send for spritesheet route instead of sendFile

Express 5 res.sendFile() returns undefined (not a Promise). ENOENT errors
call next() asynchronously after the route handler's try/catch has returned,
causing unhandled 404 responses. Replacing with fs.promises.readFile + res.send
keeps the error path fully within the handler's try/catch.

---------

Co-authored-by: Patrick A <259201958+eefynet@users.noreply.github.com>
2026-05-26 03:16:48 +00:00
chaoxiaoche
2b7b6590ae
feat(comments): add comment attachment API (#2869)
* feat(comments): add comment attachment API

* ci: add fork PR workflow approval script

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-25 07:24:21 +00:00
kami
024e6d86a9
fix: validate plugin connector refs in doctor (#2164)
* fix: validate plugin connector refs in doctor

Co-authored-by: multica-agent <github@multica.ai>

* chore: refresh pool review queue

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-24 16:28:00 +00:00
leessju
8e3d1360bd
fix(daemon): close # Instructions block with an explicit do-not-echo guard (#2827)
The composed chat prompt prepends a '# Instructions (read first)'
block in front of '# User request' so a single user message carries
both the system rules and the actual request — the shape every agent
CLI (Claude, Codex, OpenCode, Gemini) expects on stdin.

In practice claude-opus-4-7 (and a few other instruction-tuned
models, particularly with --include-partial-messages on the stream)
start their reply by echoing the top of that user message verbatim.
The chat UI then shows the system prompt as a literal block leading
the visible answer, e.g.:

  Instructions
  Always respond in Korean. Use Korean for all explanations…
  …Maintain full orthographic correctness…
  ).네, 완료했습니다. 전달하신 4가지 보강 포인트를 …

(The closing token of the instructions block runs straight into the
real answer without a newline — the telltale of a model-side echo
rather than a UI render bug.)

Close every Instructions block with one trailing line:

  (Do not quote, restate, or echo the # Instructions block above in
  your reply. Begin your response with the answer to the # User
  request below.)

This kills the regression in practice without changing the turn
shape (still one user message), so no agent CLI plumbing has to move.

Tested via tests/chat-route.test.ts — pins the literal guard string
so a future refactor cannot silently drop it.
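
A minimal sketch of the prompt shape described above, assuming a hypothetical `composeChatPrompt` helper. The two headings and the guard wording come from the commit message; the exact whitespace and line breaks of the real literal are not shown there and are assumed here.

```typescript
// Guard text as quoted in the commit message, joined into one line (assumption).
const DO_NOT_ECHO_GUARD =
  "(Do not quote, restate, or echo the # Instructions block above in " +
  "your reply. Begin your response with the answer to the # User " +
  "request below.)";

// Build a single user message: instructions first, then the guard as the
// closing line of the instructions block, then the actual request.
function composeChatPrompt(instructions: string, userRequest: string): string {
  return [
    "# Instructions (read first)",
    "",
    instructions.trim(),
    "",
    DO_NOT_ECHO_GUARD,
    "",
    "# User request",
    "",
    userRequest.trim(),
  ].join("\n");
}
```

The turn shape stays a single message, so nothing downstream of the composer needs to change.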

Co-authored-by: nicejames <nicejames@gmail.com>
2026-05-24 14:30:58 +00:00
999axel999
db90cb0bdb
fix(daemon): reject unsafe plugin manifest names (#2757)
Co-authored-by: Zerocracy Assistant <zerocracy-assistant@example.com>
2026-05-23 12:53:39 +08:00
lefarcen
c14baf07d3
Merge origin/main into release/v0.8.0
PR #2461 sync prep — resolves 14 conflicts merging 84 main-side commits
on top of 58 release-side commits accumulated during the 0.8.0 cycle.

Resolution summary:

Take main (theirs) where main carried deliberate forward progress:
- apps/web/src/components/PluginCard.tsx — 7 hunks, i18n migration:
  hardcoded English aria-labels/titles replaced with t() calls keyed
  on pluginCard.* (all 8 keys verified present in en.ts).
- apps/web/src/components/TasksView.tsx — 1 hunk, source-ingestion
  feature: sortedRoutines (newest-first), sourceIngestionTemplates,
  patchSourceForm, submitSourceIngestion. activeCount/pausedCount
  semantics preserved (now keyed on sortedRoutines, count unchanged).
- e2e/ui/app.test.ts — new node:fs/promises + tmpdir + path + @/timeouts
  imports needed by main-side test helpers.
- e2e/ui/settings-local-cli-codex-fallback.test.ts — menu-dismissal
  helper block added by main.

Keep both sides where each added a different field to the same object
literal:
- apps/web/src/components/ProjectView.tsx (locale + analyticsHints
  spread).
- apps/web/src/components/DesignSystemFlow.tsx (locale + analyticsHints).

Take release (ours) where release carried deliberate work that ships
0.8.0:
- CHANGELOG.md — release-side 0.8.0 entry + PR link refs; main's
  Unreleased section was the same body of work, now finalized.
- apps/landing-page/public/{apple-touch-icon,favicon}.png +
  apps/web/public/app-icon.svg — release-side visual refresh assets
  consistent with 0.8.0 stable ship.
- tools/pack/src/linux.ts — packageVersion const required by line 466;
  taking main's empty line would build-error.
- e2e/ui/project-management-flows.test.ts +
  e2e/ui/settings-api-protocol.test.ts +
  e2e/ui/settings-memory-routines.test.ts — release-side release-smoke
  hardening (shangxinyu1 + PerishFire) takes precedence on overlap.

Closes-issue / unblocks: PR #2461 sync release/v0.8.0 → main.
2026-05-23 12:17:18 +08:00
YOMXXX
48ed23c72f
fix(daemon): finish live-artifact chat runs via watchdog quiet-period handoff (#1451) (#2585)
* fix(daemon): finish live-artifact chat runs via watchdog quiet-period handoff (#1451)

Live-artifact runs were staying in `Working` for the full 10-minute
inactivity window even after the deliverable had been registered, and
sometimes finishing as `failed` with `Agent stalled without emitting
any new output for 600s`. The agent process kept its stdin/stdout
alive (claude-code stream-json idle stdin, post-write reasoning that
never reaches the chat) so the existing watchdog could not tell the
deliverable was already in the user's hands.

Wire `/api/tools/live-artifacts/create` back into the chat run via a
small per-run handle registry: on the first `created` event, the run
flips a local `artifactRegistered` flag and rearms the watchdog with
the shorter `OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS` (default 60s)
instead of the 10-minute pre-artifact ceiling. When that quiet timer
trips, the watchdog no longer emits a stalled-error / `failed` finish;
it SIGTERMs the child and lets the existing child-exit handler do
final classification — the close handler now treats a SIGTERM exit
after a registered artifact as `succeeded`, matching what the user
actually got (a delivered artifact, not a failed run).

The handoff stays with the existing child-exit lifecycle, so tool
token revocation, cancel semantics, and exit-status classification
keep their current owner — addressing the PR #1543 review history
where finishing the run from the tool route bypassed those guarantees.

Closes #1451.

* fix(daemon): gate artifact quiet-period close on daemon-initiated flag (#1451 review follow-up)

Reviewer (#2585) found that the close-handler branch reclassifying
SIGTERM/SIGKILL as `succeeded` only checked `artifactRegistered`, so an
unrelated later termination (external `kill`, OOM, container shutdown)
after a successful artifact write would silently flip the run from
`failed` to `succeeded` — the exact "completed without producing
anything visible" failure mode the existing close handler is trying
to prevent.

Track the watchdog-initiated shutdown explicitly: set
`artifactQuietShutdownRequested = true` immediately before
`failForInactivity()` sends SIGTERM (covering the kill-grace SIGKILL
escalation under the same flag), and require that flag in the close
handler's quiet-period branch.

Extract the final-status decision into a pure
`classifyChatRunCloseStatus` so the daemon-initiated vs external
signal cases can be pinned with focused unit tests instead of
asserting closure-internal state via end-to-end timing.
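
The close-status decision above can be sketched as a pure function. The input shape and the body are assumptions for illustration; the real `classifyChatRunCloseStatus` likely handles more cases (cancellation, for example) that the commit message does not spell out.

```typescript
type CloseStatus = "succeeded" | "failed";

// Hypothetical input shape; field names follow the flags named in the commit.
interface CloseInput {
  exitCode: number | null;
  signal: string | null;
  artifactRegistered: boolean;
  artifactQuietShutdownRequested: boolean;
}

function classifyChatRunCloseStatus(input: CloseInput): CloseStatus {
  const killedBySignal = input.signal === "SIGTERM" || input.signal === "SIGKILL";
  if (killedBySignal) {
    // Only a watchdog-initiated shutdown after a registered artifact counts
    // as success; an external kill or OOM after the artifact stays failed.
    return input.artifactRegistered && input.artifactQuietShutdownRequested
      ? "succeeded"
      : "failed";
  }
  // Assumption: a plain exit is classified by its exit code alone.
  return input.exitCode === 0 ? "succeeded" : "failed";
}
```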

* fix(daemon): treat OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS=0 as disabled (#1451 review follow-up)

Reviewer (#2585 non-blocking) found that an operator override of
`OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS=0` no longer behaved as a
"disable the quiet period" knob: once the artifact was registered,
`activeInactivityTimeoutMs()` dropped to 0, `noteAgentActivity()`
early-returned without clearing the prior timer, and the pre-artifact
10-minute timer kept running while further agent activity stopped
refreshing it.

Make the quiet-period switch conditional on a positive value. A 0
override now means "do not shorten after artifact registration" — the
pre-artifact ceiling stays active, subsequent activity continues to
reschedule it, and the existing pre-artifact stalled-error path still
fires when the agent genuinely hangs. Pin the resolver as a pure
`resolveActiveInactivityTimeoutMs` helper so the four quiet-vs-pre
matrix cases are unit-tested directly.

* fix(daemon): arm the quiet-period watchdog when pre-artifact timeout is disabled (#1451 review follow-up)

Reviewer (#2585 non-blocking, round 3) found that
`OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS=0` paired with
`OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS>0` left the watchdog disarmed
forever. The `noteAgentActivity()` call at run start exited early
because the pre-artifact delay was 0, so `inactivityTimer` was still
`null` when the artifact was registered, and the prior
`if (inactivityTimer) noteAgentActivity()` guard inside
`noteArtifactRegistered()` then skipped the re-arm. The
newly-positive quiet-period delay never armed a timer at all — a
chat run that went silent right after artifact creation would stay
`running` indefinitely.

Drop the guard. `noteAgentActivity()` is already the function that
decides whether to schedule (it bails when the active delay is 0),
so calling it unconditionally keeps the behavior coherent across the
four pre/quiet combinations: both non-zero (was already fine), pre=0
+ quiet>0 (now arms the quiet timer), pre>0 + quiet=0 (still falls
back to the pre-artifact ceiling via the existing resolver), both
zero (still no watchdog at all — operator opted out).

Pure-function coverage of the ceiling decision stays in
`resolveActiveInactivityTimeoutMs` — exercised across the same four
combinations in the existing unit suite.
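
A sketch of the resolver behind the four pre/quiet combinations, assuming a hypothetical `WatchdogConfig` shape and a hypothetical `shouldArmWatchdog` helper; only the function name `resolveActiveInactivityTimeoutMs` and the described behavior come from the commit messages.

```typescript
// Hypothetical config shape mirroring the two environment overrides.
interface WatchdogConfig {
  inactivityTimeoutMs: number;   // OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS
  artifactQuietPeriodMs: number; // OD_CHAT_RUN_ARTIFACT_QUIET_PERIOD_MS
}

// Pick the delay the watchdog should use right now. A quiet period of 0
// means "do not shorten after artifact registration".
function resolveActiveInactivityTimeoutMs(
  config: WatchdogConfig,
  artifactRegistered: boolean,
): number {
  if (artifactRegistered && config.artifactQuietPeriodMs > 0) {
    return config.artifactQuietPeriodMs;
  }
  return config.inactivityTimeoutMs;
}

// A delay of 0 means the operator opted out, so no timer is scheduled.
function shouldArmWatchdog(activeDelayMs: number): boolean {
  return activeDelayMs > 0;
}
```

Because the scheduling decision lives in one place, the caller can re-arm unconditionally after an artifact is registered and let the resolved delay decide whether a timer exists.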
2026-05-22 19:06:13 +08:00
lefarcen
9912fa899a
feat(analytics): full design-system event family + DS run variant (#2706)
Lands the v2 PostHog spec's P0 design-system event family: five new
result events covering source ingest, create, review, status, and
picker apply; the existing file_upload_result + run_created/run_finished
schemas widened to discriminate DS workspaces from regular chat runs.

Contract (packages/contracts/src/analytics/events.ts):
- AnalyticsEventName gains design_system_{source_ingest,create,review,
  status,apply}_result.
- Props interfaces + bucket/origin/method/status enums per spec.
- TrackingProjectKind gains 'design_system' for DS-as-project runs.
- RunCreatedProps / RunFinishedProps widen page_name+area to discriminate
  chat_panel vs design_system_project; entry_from union accepts DS values;
  DS-variant context fields (ds_source_origin, source_count, brand
  description length bucket, per-source counts, design_system_created,
  preview_module_count, missing_font_count).
- FileUploadSurface union adds design_systems / design_system_source.
- Bucket helpers (designSystemLengthBucket, folderCountBucket,
  totalSizeBucket), module slug + type derivation, repo host parser.

Web emission sites:
- DesignSystemFlow.generate(): create_result + threads
  prepareCreatedDesignSystemProject with analyticsTrack so each of the
  4 source paths emits source_ingest_result (success / partial / failed
  / empty), repo-host dominance, fallback type from connector status.
- DropZone onFiles handlers: file_upload_result with deriveUploadCohort.
- DesignSystemDetailView: status_result on togglePublished + Make-default,
  review_result on Looks-good / Needs-work; module_id from markdown
  section header slug (designSystemModuleSlug), module_type via keyword
  heuristic.
- DesignSystemsTab: status_result on publish toggle, set/unset default,
  delete (incl. cancelled when window.confirm dismissed).
- NewProjectPanel: apply_result on DS picker change (manual select +
  clear) plus an auto_select emit when the picker mounts with a default
  DS not yet user-touched.
- ProjectView.streamViaDaemon: when project.metadata.importedFrom ===
  'design-system', pass analyticsHints with entry_from
  (onboarding_design_system for the auto-sent first message,
  regenerate_from_review for subsequent sends), projectKind=design_system,
  designSystemRunContext.

Daemon:
- ChatRequest gains optional analyticsHints (entryFrom / projectKind /
  designSystemRunContext). Behavior never depends on these; only PostHog
  props do.
- /api/runs handler reads analyticsHints to flip baseProps to the DS
  variant (page_name=design_system_project, area=design_system_generation,
  project_kind=design_system) when the run is DS-flagged, and spreads the
  DS context fields onto run_created.
- run_finished mirrors the DS area + adds design_system_created (true iff
  the run wrote DESIGN.md), preview_module_count (distinct preview/*.html
  writes), missing_font_count (0 placeholder; pending font-audit hook).
- run-artifacts.ts: extracts collectWrittenPathsMatching as the shared
  Write/Edit + isError-pair core; adds didRunCreateDesignSystemFile and
  countDesignSystemPreviewModules using the same dedup + failure-skip
  invariants as countNewHtmlArtifacts.

Tests:
- packages/contracts/tests/analytics-design-system-helpers.test.ts: 18
  new test cases over the bucket helpers, module slug + type mapping,
  repo host parser.
- apps/daemon/tests/run-artifacts.test.ts: 9 new tests for
  didRunCreateDesignSystemFile + countDesignSystemPreviewModules covering
  Write-then-Edit dedupe, case-insensitive DESIGN.md match, isError pair
  skip, preview/index.html as a module, non-preview path rejection.

Targets release/v0.8.0.
2026-05-22 17:18:57 +08:00
Siri-Ray
e6da01e998
Add i18n metadata for official content (#2692)
2026-05-22 16:39:32 +08:00
shangxinyu1
cc6edb9afe
Proxy GitHub metadata through the daemon (#2654)
* Proxy GitHub metadata through the daemon

* fix(contracts): share GitHub metadata responses

Generated-By: looper 0.6.0 (runner=fixer, agent=codex)

* fix(contracts): align GitHub fetchedAt payload types

Generated-By: looper 0.6.0 (runner=fixer, agent=codex)

* Proxy GitHub metadata through the daemon

Generated-By: looper 0.6.0 (runner=fixer, agent=codex)
2026-05-22 14:06:07 +08:00
lefarcen
e1818f2677
feat(analytics): onboarding ui_click + lifecycle events + update_popover surface_view (#2590)
* feat(analytics): onboarding ui_click + lifecycle + update_popover surface_view

Spec rows 1-3 of the Onboarding family (ui_click,
onboarding_runtime_scan_result, onboarding_complete_result) and the
home `update_popover` surface_view were all listed as P0 in the v2
doc but unwired — PostHog showed 0 events for every onboarding
ui_click, 0 for the scan/complete result events, and 0 for the
update-popover exposure.

Contract (`packages/contracts/src/analytics/events.ts`):
- Adds event names `onboarding_runtime_scan_result` /
  `onboarding_complete_result` and wires them into
  `AnalyticsEventPayload`.
- Adds `OnboardingClickProps` (page_name=onboarding, area/element/
  action discriminators + optional runtime/about_you/source rider
  fields) and threads it into `UiClickProps`.
- Adds `OnboardingRuntimeScanResultProps` and
  `OnboardingCompleteResultProps` with the doc's full field set —
  enums for runtime_type / scan result / completion result /
  completion_type, plus the lifecycle context (has_about_you,
  has_design_system_request, source_count, exit_step_name).
- Extends `TrackingFileUploadSurface` with an `onboarding /
  design_system_source` shape so the design-system-step source ingest
  can ride the same `file_upload_result` event the file_manager /
  chat composer already use. `source_type` is required on this shape
  so the dashboard can split by `local_code|fig|assets` without
  inspecting `file_type`.
- Adds `UpdatePopoverSurfaceViewProps` for the home toolbar's
  "Update ready" panel.

Onboarding wiring (`apps/web/src/components/EntryShell.tsx`):
- Centralises step/runtime-context derivation in `emitOnboardingClick`
  + `emitOnboardingComplete` helpers; every interactive control inside
  OnboardingView now fires through one of them so a future spec tweak
  changes one place.
- Click rows for runtime cards (local_coding_agent / byok), design-
  source cards (github_repo / local_code / fig_upload), about_you
  selects (organization_size / use_case / hear_about_us), and the
  Continue / Back / Skip navigation buttons. Multi-select use_case
  emits one row per added value, not per render.
- `scanCliAgents` now emits `onboarding_runtime_scan_result` with
  detected/available counts on every terminal state — success when
  any CLI is available, failed when scan returned zero or threw.
  `duration_ms` measures wall-clock from start to terminal.
- `onboarding_complete_result` fires from the Skip / last-step
  Continue / Generate paths with the right `completion_type`. The
  Generate path uses a new `DesignSystemCreationFlow.onBeforeGenerate`
  callback so the embedded flow can expose its local source-count
  state to the wrapper.

DS creation flow (`apps/web/src/components/DesignSystemFlow.tsx`):
- New `onBeforeGenerate(snapshot)` prop with a typed
  `DesignSystemGenerateSnapshot` shape. Fired right before the async
  generate() work; OnboardingView consumes it for both the `generate`
  ui_click (with source_type derived from which-counts-equal-total)
  and the completion lifecycle event.
- `renderDesignSystemCreation` in `EntryView` / `EntryShell` / `App`
  grows a second `hooks` arg that plumbs `onBeforeGenerate` through.

Update popover (`apps/web/src/components/UpdaterPopup.tsx`):
- Fires `surface_view page_name=home area=update_popover` once per
  panel-open transition, deduped by `app_version_before ->
  app_version_after` so a re-render of the same offer doesn't
  inflate the count.

Validation:
- `pnpm guard`
- `pnpm --filter @open-design/web typecheck`
- `pnpm --filter @open-design/web test`: 203 files / 1828 tests
- `pnpm --filter @open-design/daemon test`: 249 files / 2977 tests

* fix(analytics): generation_progress fires from chat_panel + complete_result uses snapshot

E2E (2026-05-21, distinct_id=e2e-onboarding-test-001) drove the full
welcome flow and exposed two issues in the previous commit:

1. `page_view page_name=onboarding area=generation_progress` (step 4)
   never fired. PR #2590's commit wired this from
   `DesignSystemDetailView`, but the Generate path actually navigates
   to ProjectView (`page_name=chat_panel`), not to the DS detail
   surface. PostHog showed `chat_panel` and `file_manager` page_views
   landing right after the Generate click but no
   `area=generation_progress` row.

   Fix: fire `area=generation_progress` from `ProjectView` right
   alongside its `chat_panel` page_view when an onboarding session
   id is still in sessionStorage. Clear the session id immediately
   after so a later unrelated project visit doesn't inherit the
   onboarding attribution. The `DesignSystemDetailView` site can
   stay as defense-in-depth: same dedup guard, no double-fire.

2. `onboarding_complete_result` from the Generate path shipped with
   `has_design_system_request: false` and `source_count: 0`. The
   `emitOnboardingComplete` helper read `designSource` (the click
   state on the three source-type cards), but E2E showed users
   click Generate without clicking those cards — they type a brand
   description and add a GitHub URL directly in the embedded form,
   so `designSource` stays null even when a request is clearly in
   flight.

   Fix: thread `DesignSystemGenerateSnapshot` from the
   `onBeforeGenerate` callback into `emitOnboardingComplete` via a
   new `extra.sourceSnapshot` option. When present, derive
   `has_design_system_request` from `sourceCount > 0 ||
   hasBrandDescription` and `source_count` from the snapshot's
   `sourceCount`. Skip / last-step Continue paths still fall back
   to the `designSource` heuristic since no snapshot exists there.

* fix(analytics): emit artifact_count from new-html count + remove unmount session-id clear

Cherry-picked from the orphaned `fix/analytics-app-version-zero` HEAD
(commit 5b5a7ed5 — pushed after PR #2453 had already squash-merged,
never made it into release/v0.8.0). Two P0 data bugs:

1. `run_finished.artifact_count` was hard-coded `0` at
   `server.ts:11061` (now `:11394`). Every run on PostHog reported
   zero artifacts, breaking the "generation success → artifact
   produced" funnel.

   Fix: count incremental `.html` paths the run wrote or edited,
   deduped per path so a Write-then-Edit cycle on the same file
   counts as one artifact. Pure helper in
   `apps/daemon/src/run-artifacts.ts` with 10 unit tests covering
   empty / no-html runs, single Write, dedup across Write+Edit+
   MultiEdit, distinct paths, Codex aliases (create_file,
   str_replace_edit), both `file_path` and `path` input shapes,
   case-insensitive extension, non-agent / malformed payloads, and
   Read/Grep/Bash always ignored. Wired into server.ts's
   `run_finished` properties block.

2. `OnboardingView` cleared `onboardingSessionId` on unmount. The
   Generate path unmounts OnboardingView *before* the post-Generate
   page_view fires elsewhere, so an unmount-clear consistently
   wiped the id before the 4th-step emission could read it.
   PostHog showed zero `area=generation_progress` events.

   Fix: drop the unmount cleanup effect entirely. Skip / Back /
   last-step Continue paths clear inline in their respective
   handlers (already in place from this PR's earlier instrumentation
   commit). The Generate path's clear now lives in `ProjectView`
   right after the `chat_panel` page_view (and the
   `generation_progress` page_view that rides with it). Abandoned
   sessions clear when the tab closes and sessionStorage is discarded.

* fix(analytics): emit onboarding complete after generate settles + text source_type

Two review fixes on PR #2590 from mrcfps (2026-05-21 14:11):

1. `onboarding_complete_result` was emitted from `onBeforeGenerate`,
   which fires synchronously BEFORE
   `DesignSystemCreationFlow.generate()` runs the async draft-create
   / workspace-open work. Both of those have failure branches that
   bounce the user back to the setup form with an error. In that
   case the lifecycle row would have shipped as
   `result=completed` / `completion_type=completed_with_design_system`
   even though no design system was actually generated.

   Fix: add a new `onGenerateSettled(snapshot, outcome)` callback to
   `DesignSystemCreationFlow` and fire it from each branch of the
   `generate()` function (success after `onCreated` / failed on
   draft-create returning null / failed on workspace-open returning
   null / failed on catch). OnboardingView keeps the `onBeforeGenerate`
   hook for the intent-only `generate` ui_click row, and moves the
   lifecycle complete emit into `onGenerateSettled`. Failed outcomes
   ship as `result=failed` + `completion_type=completed_without_design_system`
   + the daemon's error code, and clear the onboarding session id
   since the user stays in the wrapper.

2. The `source_type` ternary in OnboardingView's `generate` ui_click
   mapped `sourceCount === 0` to `'none'` unconditionally, so a
   prompt-only generate ("user only typed a brand description, no
   GitHub / local / fig / assets sources") was indistinguishable on
   PostHog from "no input at all". The v2 contract reserves the
   `'text'` literal precisely for that prompt-only path.

   Fix: extract a `deriveOnboardingSourceType(snapshot)` helper that
   returns `'text'` when `sourceCount === 0 && hasBrandDescription`,
   `'none'` only when both are absent, single-source literal when one
   kind dominates, `'mixed'` otherwise. Single source of truth for
   the mapping so the ui_click and any future complete-row tagging
   stay consistent.
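
The mapping above can be sketched roughly as follows. This is a minimal illustration, not the helper as committed: the `OnboardingSnapshot` shape and the source-kind literals (`'github'`, `'local'`, `'fig'`, `'assets'`) are assumptions, and "one kind dominates" is read here as "exactly one kind is present".

```typescript
// Hypothetical literals and snapshot shape; the real v2 contract may differ.
type SourceKind = 'github' | 'local' | 'fig' | 'assets';
type OnboardingSourceType = SourceKind | 'text' | 'none' | 'mixed';

interface OnboardingSnapshot {
  sources: SourceKind[];     // one entry per attached source
  brandDescription: string;  // free-text prompt, may be empty
}

function deriveOnboardingSourceType(snapshot: OnboardingSnapshot): OnboardingSourceType {
  const sourceCount = snapshot.sources.length;
  const hasBrandDescription = snapshot.brandDescription.trim().length > 0;

  // Prompt-only generate is 'text'; 'none' only when nothing was provided.
  if (sourceCount === 0) {
    return hasBrandDescription ? 'text' : 'none';
  }

  // Assumption: a single distinct kind maps to its own literal, else 'mixed'.
  const kinds = Array.from(new Set(snapshot.sources));
  const only = kinds[0];
  return kinds.length === 1 && only !== undefined ? only : 'mixed';
}
```

Keeping the decision in one pure function means the ui_click emitter and any later complete-row tagging can call the same code path.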

* fix(analytics): countNewHtmlArtifacts skips failed tool ops

Review fix on PR #2590 from mrcfps (2026-05-21 14:30, on commit
9e9a0019). `countNewHtmlArtifacts` counted every `Write` / `Edit`
tool_use on a `.html` path regardless of whether the matching
`tool_result` came back with `isError: true`. A permission-denied
`Write index.html`, a path-outside-cwd refusal, or a
parent-missing failure all still bumped `run_finished.artifact_count`
to 1 — which is exactly the corruption pattern this helper was
introduced to fix (hard-coded zero → spuriously > 0 is the same
class of broken funnel signal).

Fix: mirror the web-side `apps/web/src/runtime/file-ops.ts` pattern.
Build a `resultByToolUseId` map in a first pass, then in the
second pass only count a tool_use whose paired result exists AND
`isError !== true`. A tool_use with no matching result is treated
as "still in flight" and not counted; the dashboard would rather
under-count attempts than promise artifacts we can't confirm
landed.
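
A minimal sketch of the two-pass pairing described above. The `ToolEvent` shape (`id`, `toolUseId`, `input.file_path`) is an assumption made for illustration, and the path handling is simplified: the alias and case normalization that the real helper's tests cover are not reproduced here.

```typescript
// Hypothetical event shape; the daemon's real transcript types may differ.
type ToolEvent =
  | { type: 'tool_use'; id: string; name: string; input: { file_path?: string } }
  | { type: 'tool_result'; toolUseId: string; isError?: boolean };

function countNewHtmlArtifacts(events: ToolEvent[]): number {
  // Pass 1: index every tool_result by the tool_use it answers.
  const resultByToolUseId = new Map<string, { isError?: boolean }>();
  for (const event of events) {
    if (event.type === 'tool_result') resultByToolUseId.set(event.toolUseId, event);
  }

  // Pass 2: count distinct .html paths whose Write/Edit has a non-error result.
  const confirmedPaths = new Set<string>();
  for (const event of events) {
    if (event.type !== 'tool_use') continue;
    if (event.name !== 'Write' && event.name !== 'Edit') continue;
    const path = event.input.file_path;
    if (typeof path !== 'string' || !path.toLowerCase().endsWith('.html')) continue;

    const result = resultByToolUseId.get(event.id);
    // No result yet means "still in flight": under-count rather than guess.
    if (result === undefined || result.isError === true) continue;
    confirmedPaths.add(path);
  }
  return confirmedPaths.size;
}
```

Because confirmed paths go into a set, a successful `Write` followed by a failed `Edit` on the same path still counts once.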

Tests grow 3 → 13:
- successful Write pair counts (canonical path)
- isError=true result does NOT count
- unpaired tool_use does NOT count
- Write-success-then-Edit-fail on same path still counts (artifact
  is on disk; later edit failure doesn't unmake it)
- existing dedup / distinct-paths / alias / case / malformed /
  read-skip cases all updated to use the new pair() helper

* fix(analytics): re-arm onboarding lifecycle on generate failure for retry

Review fix on PR #2590 from mrcfps (2026-05-21 14:45, on commit
2cd05f09). The previous `onGenerateSettled` failure branch did two
things that together broke the retry path:

1. Flipped `lifecycleReportedRef.current` to `true` (via
   `emitOnboardingComplete`), which the same guard then uses to
   short-circuit every subsequent complete emit.
2. Called `clearOnboardingSessionId()`, wiping the sessionStorage id
   that downstream surfaces (ProjectView's `generation_progress`
   page_view, subsequent ui_click rows) need to attribute under the
   same funnel session.

But `DesignSystemCreationFlow.generate()` doesn't bail out on
failure; it calls `setStep('setup')` and leaves the user in the same
embedded form to try again. So the retry sequence used to look
like:

  click Generate → fails → complete(failed) → flag locked + id cleared
  user fixes input → click Generate again
    ui_click `generate` row → fires under the STALE in-memory ref
      (sessionStorage was cleared but `onboardingSessionIdRef.current`
       still holds the old uuid)
    generate succeeds → onGenerateSettled(success)
      → emitOnboardingComplete → lifecycleReportedRef guard returns
        early → second complete row never lands
    navigate to ProjectView → peekOnboardingSessionId() = null
      → step-4 `area=generation_progress` row never lands

Fix: the failure handler keeps the session id intact and just
re-arms `lifecycleReportedRef.current = false`. A retry then
emits a fresh complete row under the same `onboarding_session_id`
(useful for "N retries until success" analysis) and an eventual
success can still hand off through ProjectView with the id available
for the step-4 emission. The Skip / last-step Continue paths still
clear via the inline `clearOnboardingSessionId()` next to their
`onFinish()` because those terminate the flow explicitly.
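
The re-arm behaviour can be sketched as below. This is an illustration under assumptions, not the OnboardingView code: `createOnboardingLifecycle`, the `CompleteRow` shape, and the plain object standing in for the React ref are all hypothetical.

```typescript
type Outcome = 'success' | 'failed';

// Hypothetical row shape; the real complete row carries more fields.
interface CompleteRow {
  result: 'completed' | 'failed';
  onboarding_session_id: string;
}

function createOnboardingLifecycle(sessionId: string, emit: (row: CompleteRow) => void) {
  // Stand-in for the React ref that guards against duplicate complete rows.
  const lifecycleReportedRef = { current: false };

  function emitOnboardingComplete(result: CompleteRow['result']): void {
    if (lifecycleReportedRef.current) return;
    lifecycleReportedRef.current = true;
    emit({ result, onboarding_session_id: sessionId });
  }

  function onGenerateSettled(outcome: Outcome): void {
    emitOnboardingComplete(outcome === 'success' ? 'completed' : 'failed');
    if (outcome === 'failed') {
      // Re-arm for the retry; the session id is deliberately left intact.
      lifecycleReportedRef.current = false;
    }
  }

  return { onGenerateSettled };
}
```

A failed attempt followed by a successful retry then yields two complete rows under the same session id, while a repeated success is still deduplicated by the guard.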
2026-05-21 22:50:46 +08:00
nettee
052f8097de
fix(daemon): inject @-mention skills into system prompt (#2552)
* fix(daemon): inject @-mention skills into system prompt

Generated-By: looper 0.8.1 (runner=worker, agent=opencode)

* fix(daemon): compose ad-hoc skill mode and aliases

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(daemon): lazily load and stage ad-hoc skills

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* test(daemon): assert staged skill files before spawn

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(daemon): compose skill metadata across @ mentions

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* test(daemon): cover ad-hoc critique skill policy

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(daemon): preserve plugin skill composition

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(daemon): resolve conflicting composed skill surfaces

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(daemon): preserve primary skill surface

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(daemon): share resolved critique surface

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)
2026-05-21 22:20:21 +08:00
lefarcen
6690dbd5bb
feat(analytics): PostHog + Langfuse instrumentation for assistant feedback (#1558)
* feat(analytics): PostHog + Langfuse instrumentation for assistant feedback

Re-bases the original three-commit PR onto release/v0.8.0. The web-side
feedback UI instrumentation (surface_view / ui_click / feedback_submit_result)
landed on main while this branch was open, so on this rebase that wiring
is taken from main; the remaining net additions are:

- Contracts: TrackingFeedback* enums and the four dedicated
  assistant_feedback_* event payload types (click, reason_view,
  reason_click, reason_submit), plus normalizeCustomReason helper.
  The new event-name variants are added to TrackingEventName and the
  AnalyticsEventPayload discriminated union next to the existing
  surface_view/ui_click variants — both wire formats coexist.
- POST /api/runs/:id/feedback in apps/daemon/src/chat-routes.ts:
  thin route that validates rating, allowlists reasonCodes through a
  simple string filter, and fire-and-forgets into the daemon's
  reportFeedback hook.
- apps/daemon/src/langfuse-bridge.ts reportRunFeedbackFromDaemon
  forwards the rating + reasonCodes into Langfuse as user_rating
  (NUMERIC ±1) + user_rating_reason (CATEGORICAL, one per code)
  score-create entries. Gates on telemetry.metrics + telemetry.content.
- apps/web/src/providers/daemon.ts reportChatRunFeedback (fire-and-forget
  fetch) and apps/web/src/components/ProjectView.tsx wiring so each
  thumbs-up/down + reason submission posts the side-channel.

Conflicts resolved (release/v0.8.0 vs the branch's old base):
- packages/contracts/src/analytics/events.ts: keep main's
  file_upload_result / feedback_submit_result / settings_* event
  variants alongside the new assistant_feedback_* additions.
- apps/daemon/src/server.ts: keep DNS-aware validateExternalApiBaseUrl,
  add reportFeedback closure wired into registerChatRoutes telemetry.
- apps/daemon/src/chat-routes.ts: keep both /tool-result and the new
  /feedback routes; merge RegisterChatRoutesDeps to include both
  'paths' and 'telemetry'. Drop PR's chat-routes-local
  reconcileAssistantMessageOnRunEnd helper (main has the equivalent in
  server.ts).
- apps/web/src/components/ChatPane.tsx & AssistantMessage.tsx & ProjectView.tsx:
  keep main's projectKindForTracking prop name and its existing
  emission of surface_view / ui_click / feedback_submit_result; the
  PR's analyticsCtx-based reason_view/click/submit emission is dropped
  in this rebase since it would duplicate the existing wire format.
- apps/web/tests/components/*: rename projectKind → projectKindForTracking
  to match ChatPane's current prop name.

Outstanding review feedback (from the pre-rebase round, will be
addressed in a follow-up commit):
- AssistantMessage tests not yet passing the new feedback context to
  the direct render path.
- ProjectView clear-feedback path skips reportChatRunFeedback, leaving
  stale Langfuse user_rating scores.
- buildFeedbackPayload has no deletion path for previously-submitted
  user_rating_reason scores when the user switches thumbs.
- POST /api/runs/:id/feedback always returns {status:'accepted'} even
  when consent is off; needs to surface skipped_consent / skipped_no_sink.
- reasonCodes are filtered to string[] but not allowlisted against
  ChatMessageFeedbackReasonCode or deduped.

* fix(analytics): address review on assistant feedback rebase

Picks up the in-scope correctness items from the prior review round
and the rebase residue without rewriting history:

- chat-routes.ts: `/feedback` now awaits the daemon's preflight
  outcome and echoes it as the response. The contract was already
  shaped as `accepted | skipped_consent | skipped_no_sink`, but the
  previous handler always returned `accepted` because the network
  send was fire-and-forget. The consent + sink decision is local
  (a small file read and an env-var lookup); the actual Langfuse
  upload still runs as a detached promise.
- chat-routes.ts: reasonCodes are now allowlisted against the
  contract's reason-code union and deduplicated before reaching
  Langfuse, so a stale or replayed client can't poison the
  Langfuse score table with unknown categorical values or
  duplicate stable ids in the same batch.
- langfuse-bridge.ts: split the consent + sink resolution from the
  fire-and-forget network send so the route can claim `accepted`
  honestly. The legacy `skipped_no_sink` return on app-config read
  failure is preserved.
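
A rough sketch of the allowlist-and-dedupe step for `reasonCodes`. The function name `sanitizeReasonCodes` is hypothetical, and the allowlist below contains only the two codes named in this commit message; the contract's full reason-code union is larger and is not reproduced here.

```typescript
// Illustrative subset only; the real union lives in packages/contracts.
const KNOWN_REASON_CODES = ['followed_design_system', 'missed_design_system'] as const;
type ReasonCode = (typeof KNOWN_REASON_CODES)[number];

function sanitizeReasonCodes(input: unknown): ReasonCode[] {
  if (!Array.isArray(input)) return [];
  const allowed = new Set<string>(KNOWN_REASON_CODES);
  const unique = new Set<ReasonCode>();
  for (const value of input) {
    // Drop non-strings and unknown codes; the Set drops duplicates.
    if (typeof value === 'string' && allowed.has(value)) {
      unique.add(value as ReasonCode);
    }
  }
  return Array.from(unique);
}
```

Filtering before the payload reaches the bridge keeps unknown categorical values and duplicate ids out of a single batch, whatever the client sends.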

Contracts + comment hygiene:
- TrackingFeedbackReasonCode in packages/contracts/src/analytics/events.ts
  drifted from ChatMessageFeedbackReasonCode in packages/contracts/src/api/chat.ts;
  add `followed_design_system` and `missed_design_system` so the
  analytics wire format stays aligned with the persistence shape.
- langfuse-trace.ts buildFeedbackPayload: the docblock claimed the
  raw custom-reason text is bucketed before send. Product reversed
  that on 2026-05-13 (raw text now ships, consent-gated). Replace
  the stale comment with the real semantics + a note that there is
  no tombstone path for reason codes the user removes in a
  follow-up submission (left as scope for a later PR).
- AssistantMessage.tsx: remove the now-unused
  `AssistantFeedbackAnalyticsCtx` interface and a stray blank-line
  delete from the rebase; restore the analytics-context comment
  above the feedback hook.

Left as follow-up (intentional, documented in code):
- Sending a tombstone score when the user clears their rating —
  ProjectView still skips reportChatRunFeedback on `change===null`,
  so Langfuse retains the previous rating until the user re-submits.
  The PostHog event captures the clear separately.
- Removing reason-code scores when the user re-submits with a
  smaller set — buildFeedbackPayload only overwrites the codes
  present in the current payload.

* feat(analytics): wire PR's dedicated assistant_feedback_* events

The four dedicated event types (`assistant_feedback_click` /
`_reason_view` / `_reason_click` / `_reason_submit`) the PR added to
contracts were sitting unused after the rebase because main's
umbrella `surface_view` / `ui_click` / `feedback_submit_result`
emissions covered the same user gestures. Wire the dedicated events
alongside the umbrella ones so both wire formats fire on every
feedback action — dashboards / evals can pick whichever schema they
were built against without losing signal.

Each dedicated event has stricter typing than its umbrella sibling
(`project_id` / `project_kind` / `conversation_id` are non-null), so
the new emissions are guarded behind a presence check and skipped on
test renders that mount AssistantMessage without project context. The
umbrella emissions retain their nullable fallbacks unchanged.

Pairing:
- surface_view (feedback reason panel) ↔ assistant_feedback_reason_view
- ui_click (feedback button)           ↔ assistant_feedback_click
- ui_click (reason submit button)      ↔ assistant_feedback_reason_click
- feedback_submit_result               ↔ assistant_feedback_reason_submit

Reason click + submit share the existing `requestId` so PostHog can
stitch click→result across both schemas, matching the spec.
2026-05-21 19:28:51 +08:00
shangxinyu1
10e2019c59
Fix plugin publish and Open Design PR workflow UX (#2564)
* Fix plugin publish and PR workflow UX

* Update plugin workflow test expectations

* Fix fake gh repo view verification path

* Fix plugin publish headless tests and preserve PATH in shell wrappers.

The publish-repo flow needs real git commits and fake gh auth output that
matches gh auth status parsing. Login shells no longer drop PATH so test
fakes and agent wrappers stay visible to nested gh/git calls.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Restore plugin action card when share-task startup fails.

If startGeneratedPluginShareTask rejects before a task is created, clear
hiddenAssistantPluginActionPaths so the assistant action card reappears.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Make daemon vitest self-contained for publish-github CLI shell-outs.

Build dist/cli.js in tests/setup.ts when missing and set OD_DAEMON_CLI_PATH
before server.ts resolves OD_BIN, so headless plugin tests pass from a clean
checkout without a prior manual daemon build.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-21 19:21:17 +08:00
Siri-Ray
3a33a7b475
fix(web): localize quick brief prompt (#2520)
* fix(web): localize quick brief prompt

Generated-By: looper 0.8.1 (runner=worker, agent=codex)

* fix(web): pass locale from design system chat

Generated-By: looper 0.8.1 (runner=fixer, agent=codex)

* fix(web): preserve task-type routing options

Generated-By: looper 0.8.1 (runner=fixer, agent=codex)

* fix(web): preserve task-type routing options

Generated-By: looper 0.8.1 (runner=fixer, agent=codex)
2026-05-21 19:18:13 +08:00
Marc Chan
047d2bdd95
fix(orbit): respect selected app language (#2522)
* fix(orbit): respect selected app language

Generated-By: looper 0.8.1 (runner=worker, agent=opencode)

* fix(orbit): respect selected app language

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)
2026-05-21 19:17:53 +08:00
lefarcen
6bb0f0fd91
feat(observability): web lifecycle telemetry + stable installationId migration (#2527)
* feat(observability): web lifecycle telemetry + stable installationId migration

Two intertwined safety-telemetry additions for the 0.8.0 release.

Web lifecycle observability
---------------------------
New `apps/web/src/observability/` module installed at module load via
client-app.tsx — alongside the existing error-tracking exception hooks
from #2521. Reuses error-tracking's direct-fetch transport (the same
consent-bypass + early-buffer guarantees) so every event flows even when
the user has opted out of general analytics:

  - client_long_task         PerformanceObserver longtask >100ms (real
                             "feels janky" signal, FPS proxy)
  - client_white_screen      app fails to mount after 5s; MutationObserver
                             cancels the timer the moment the React root
                             renders so a normal boot is zero events
  - client_resource_error    capture-phase window.error catches failed
                             <script>/<link>/<img>/<iframe> loads
                             (chunk-load failures, broken artifact refs)
  - client_boot_timing       navigationStart → load timings via
                             Navigation Timing v2
  - client_visibility_change visibilitychange + page lifetime
  - client_session_summary   real foreground duration emitted on pagehide
  - client_run_stuck         5min watchdog on SSE runs that don't progress
                             (#2464 / #2405 / #1451 in data form)
  - client_iframe_error      FileViewer iframe load failures (iframe
                             errors don't bubble to window, so the global
                             resource-error observer can't see them)
  - desktop_renderer_crash   Electron main observes render-process-gone
                             and forwards to daemon /api/observability/event
  - daemon_uncaught_exception
    daemon_unhandled_rejection
                             process-level handlers on the daemon

error-tracking.ts is generalised: `reportSafetyEvent(name, props)` now
exposes the same buffer + direct-fetch transport that `reportHandledException`
used, with identical $exception wire shape preserved for the existing
exception path.

Daemon cross-process bridge
---------------------------
New `AnalyticsService.captureSafety()` skips the consent re-check and
posts via posthog-node with installationId as distinct_id. Wired into:

  - `POST /api/observability/event` for desktop main and any future
    helper process that needs to ship a safety event (no consent check —
    same contract as web's direct-fetch path)
  - `process.on('uncaughtException')` / `unhandledRejection` on the
    daemon itself

Stable installationId across reinstalls (critical for 0.8.0 rollout)
--------------------------------------------------------------------
installationId previously lived in `<namespace>/data/app-config.json`,
so a packaged reinstall that churned the namespace token (or any future
namespace-scoped data wipe) rotated the id and the user showed up as a
brand-new PostHog person. This is the immediate trigger: when 0.8.0
ships, every 0.7.x user upgrading would silently double the user count.

New module `apps/daemon/src/installation.ts` reads/writes
`<installationDir>/installation.json` at the channel root. The daemon
gets the path from `OD_INSTALLATION_DIR`, set by
`apps/packaged/src/sidecars.ts` to `paths.installationRoot`
(one level above `namespaces/` — e.g.
`~/Library/Application Support/Open Design Nightly/` on mac).

`readAppConfig` transparently merges: if installation.json has an id it
wins; if only app-config.json has one (the 0.7.x state), it gets mirrored
to installation.json on the next read. `writeAppConfig` mirrors any
explicit installationId write, including the null-clear path used by
Settings → "Delete my data". 7 call sites of readAppConfig keep their
signatures unchanged.
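
The merge rule can be sketched as a pure decision function. The names `resolveInstallationId`, `IdSources`, and `IdResolution` are hypothetical, and file I/O is left out so that only the precedence logic is shown.

```typescript
// Hypothetical inputs: the id found in each file, or null when absent.
interface IdSources {
  installationJsonId: string | null; // <installationDir>/installation.json
  appConfigId: string | null;        // <namespace>/data/app-config.json
}

interface IdResolution {
  installationId: string | null;
  mirrorToInstallationJson: boolean; // true when the 0.7.x id must be copied up
}

function resolveInstallationId(sources: IdSources): IdResolution {
  // installation.json wins whenever it already holds an id.
  if (sources.installationJsonId !== null) {
    return { installationId: sources.installationJsonId, mirrorToInstallationJson: false };
  }
  // 0.7.x state: only app-config.json has the id, so mirror it on this read.
  if (sources.appConfigId !== null) {
    return { installationId: sources.appConfigId, mirrorToInstallationJson: true };
  }
  // Fresh install or after "Delete my data": nothing to merge.
  return { installationId: null, mirrorToInstallationJson: false };
}
```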

Survives:
  - same-channel reinstall (DMG drag-replace, NSIS reinstall)
  - namespace churn between packaged builds
  - per-namespace data reset (future installer that clears `<ns>/data/`)

Still rotates (intentionally):
  - explicit "Delete my data"
  - manual `rm -rf ~/"Library/Application Support/Open Design <Channel>/"`
  - different channel (Stable vs Nightly stay distinct because userData
    paths differ; that's the existing channel-isolation contract)

What this changes for posthog-js
--------------------------------
client.ts already has `capture_exceptions: false` from #2521; nothing else
changes. autocapture / $pageview / $autocapture / track() / daemon
analyticsService.capture() — all unchanged. New events are additive.

Validation
----------
  - pnpm guard                              pass
  - pnpm typecheck                          whole repo pass
  - pnpm --filter @open-design/web test     200 files / 1824 tests
  - pnpm --filter @open-design/daemon test  251 files / 2981 tests
    (includes 10 new tests in installation.test.ts pinning the 0.7.x →
    0.8.0 migration, namespace-wipe survival, delete-my-data clear, and
    fresh-id rotation)
  - pnpm --filter @open-design/packaged test 9 files / 89 tests
  - Pre-existing baseline: apps/desktop/src/main/updater.ts has typecheck
    references to RELEASE_CHANNEL_NAMES.PREVIEW/NIGHTLY on release/v0.8.0;
    unrelated to this PR.

* fix(observability): preserve fatal exit on uncaught + skip loading shell in white-screen check

Addresses codex review on PR #2527 (Siri-Ray).

1) Daemon process handlers must keep Node fatal semantics

Installing an uncaughtException listener silences Node's default
crash/exit; Node 15+ does the same for unhandledRejection when a
listener is present. The previous handlers logged telemetry and let
control return to the event loop, leaving a corrupted daemon serving
requests instead of letting the supervisor restart it cleanly.

triggerFatalShutdown() now:
  - dispatches captureSafety once (guarded against re-entry from
    cascading faults)
  - races posthog-node's shutdown against a 1s bounded timeout so a
    slow flush can't keep the process alive
  - calls process.exit(1) after the race resolves
Both uncaughtException and unhandledRejection route through it.

apps/daemon/tests/uncaught-fatal-shutdown.test.ts pins:
  - captureSafety is invoked exactly once even on repeated faults
  - exit(1) fires on the happy path
  - exit(1) still fires when shutdown hangs past the timeout
  - exit(1) still fires when captureSafety itself throws

2) White-screen detector treated the loading shell as a successful mount

apps/web/app/[[...slug]]/client-app.tsx renders the dynamic-import
fallback as <div class="od-loading-shell">Loading Open Design…</div>
whose visible text (19 chars) exceeded the previous 10-char floor.
monitorMount() would therefore cancel the 5s timer the instant Next
swapped the loading shell in, completely missing the white-screen
signal the observer is meant to add.

isAppMounted() now:
  - primary signal: <html data-od-app-mounted="1"> set by App.tsx's
    first useEffect — authoritative because once App has mounted at
    least once, any later tree crash is an $exception story, not a
    white-screen story
  - fallback: only counts children of the root container whose
    classList does NOT include known loading-shell markers
    (od-loading-shell). Their visible text drives the > MIN_VISIBLE_TEXT
    check, so the loading sentinel can never be mistaken for a mount.
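
A DOM-free sketch of the two-signal check, written against plain data so it can run outside a browser. The `MountSnapshot` and `RootChild` types are hypothetical stand-ins for the document and root container, and the `MIN_VISIBLE_TEXT` value of 10 is an assumption carried over from the earlier floor mentioned above.

```typescript
// Hypothetical plain-data stand-ins for the DOM nodes the real check reads.
interface RootChild {
  classList: string[];
  textContent: string;
}

interface MountSnapshot {
  appMountedAttr: string | null; // value of <html data-od-app-mounted>
  rootChildren: RootChild[];     // direct children of the root container
}

const LOADING_SHELL_CLASSES = ['od-loading-shell'];
const MIN_VISIBLE_TEXT = 10; // assumption: same floor as before the fix

function isAppMounted(snapshot: MountSnapshot): boolean {
  // Primary signal: App.tsx has mounted at least once.
  if (snapshot.appMountedAttr === '1') return true;

  // Fallback: only text from non-loading-shell children counts.
  const visibleText = snapshot.rootChildren
    .filter((child) => !child.classList.some((cls) => LOADING_SHELL_CLASSES.indexOf(cls) !== -1))
    .map((child) => child.textContent.trim())
    .join('');
  return visibleText.length > MIN_VISIBLE_TEXT;
}
```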

apps/web/tests/observability/white-screen.test.ts pins:
  - fires client_white_screen when only the loading shell is present
    after the timeout
  - does NOT fire when data-od-app-mounted is set before the timeout
  - cancels the timer the moment a real workspace-shell child appears
    alongside the loading shell
  - still fires when only sub-MIN_VISIBLE_TEXT non-shell content is
    present (effectively blank)

Validation:
  - pnpm guard pass
  - pnpm typecheck pass
  - pnpm --filter @open-design/daemon test  252 files / 2985 tests
  - pnpm --filter @open-design/web test     201 files / 1828 tests

* fix(observability): await captureSafety enqueue before fatal shutdown flush

Addresses second-pass codex review on PR #2527 (Siri-Ray, 3279268246).

The previous fatal-shutdown path called `analyticsService.captureSafety()`
synchronously and immediately raced `analyticsService.shutdown()` against
the bounded timeout. captureSafety in apps/daemon/src/analytics.ts does
its real `client.capture()` call only inside an async IIFE after
`await readInstallationIdSafe()` — so shutdown could win the race,
drain an empty posthog-node queue, and let `process.exit(1)` run BEFORE
the daemon crash event ever got enqueued. We'd then preserve the
process-lifecycle contract but lose the exact signal this PR is adding.

Changes:

  - AnalyticsService.captureSafety now returns Promise<void>. The async
    IIFE is gone; the body awaits readInstallationIdSafe directly so the
    returned promise resolves only AFTER client.capture() has been
    invoked (which is when posthog-node's local buffer contains the
    event).
  - server.ts triggerFatalShutdown awaits captureSafety, then calls
    shutdown, and races that whole sequence against the 1s bounded
    timeout. Capture failures still don't block exit (try/catch around
    the await).
  - NOOP_SERVICE.captureSafety becomes `async () => undefined` to
    match the new signature.
  - Fire-and-forget callers (/api/observability/event) are unaffected;
    voiding the returned promise keeps them non-blocking.
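
The capture-then-shutdown sequence raced against a bounded timeout might look like the sketch below. It is not the server.ts implementation: `createFatalShutdown` and the `FatalDeps` injection shape are hypothetical, chosen so the exit call and timeout can be substituted in tests.

```typescript
// Hypothetical dependency bag; the real code calls analyticsService and process.exit.
interface FatalDeps {
  captureSafety: () => Promise<void>;
  shutdown: () => Promise<void>;
  exit: (code: number) => void;
  timeoutMs?: number; // defaults to the 1s bound described above
}

function createFatalShutdown(deps: FatalDeps): () => Promise<void> {
  let triggered = false; // guards against re-entry from cascading faults

  return async function triggerFatalShutdown(): Promise<void> {
    if (triggered) return;
    triggered = true;

    // Enqueue the event first, then flush; failures must not block exit.
    const flush = (async () => {
      try { await deps.captureSafety(); } catch { /* ignore capture failure */ }
      try { await deps.shutdown(); } catch { /* ignore flush failure */ }
    })();

    let cancelTimer: () => void = () => {};
    const timeout = new Promise<void>((resolve) => {
      const timer = setTimeout(resolve, deps.timeoutMs ?? 1000);
      cancelTimer = () => clearTimeout(timer);
    });

    // Whichever settles first wins; a hung capture or flush cannot stall exit.
    await Promise.race([flush, timeout]);
    cancelTimer();
    deps.exit(1);
  };
}
```

Awaiting `captureSafety` inside the flush sequence is what ensures the event is in the client's buffer before `shutdown` drains it.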

apps/daemon/tests/uncaught-fatal-shutdown.test.ts adds the reviewer-
requested fixture:

  - 'waits for the captureSafety promise to settle before invoking
    shutdown' — gives capture a 50ms delay and shutdown a separate 50ms
    delay so the intermediate "capture done / shutdown not yet" state
    is observable.
  - 'still aborts and exits if captureSafety hangs past the bounded
    timeout' — captureSafety never resolves; the outer 1s timeout still
    forces process.exit(1).

Validation:
  - pnpm guard                                pass
  - pnpm typecheck                            whole repo pass
  - pnpm --filter @open-design/daemon test    252 files / 2987 tests
2026-05-21 15:37:48 +08:00
lefarcen
88dee44892
feat(analytics): always-on $exception capture with early window hooks (#2521)
PostHog Error tracking was missing the vast majority of real exceptions:

  1. posthog-js's capture_exceptions: true is silenced by opt_out_capturing,
     so every opted-out user vanished from the error feed even though we
     could safely keep collecting their stacks (the consent
     toggle's user copy gates analytics, not safety telemetry).
  2. posthog-js is dynamically imported only after /api/analytics/config
     resolves AND the user has consented. Errors thrown during the first
     1-2 seconds (React hydration, early effects) had no listener to
     catch them.

Net effect: 14d $exception count was 54 events / 10 users across ~5k DAU,
producing the misleading 99.93% crash-free curve in PostHog's dashboard.

This PR makes exception capture independent of both gates:

  - apps/web/src/analytics/error-tracking.ts (new): own window.error +
    unhandledrejection handlers, in-memory buffer (capped at 50 entries),
    direct fetch to https://<host>/i/v0/e/ with the public phc_ key. Same
    scrub layer as the posthog-js path so file paths still get redacted.
  - apps/web/app/[[...slug]]/client-app.tsx: installErrorHandlers() at
    module-load, before React or any feature code can throw.
  - apps/web/src/analytics/provider.tsx: bootstrapExceptionTracking() in
    the identity useEffect, parallel to getAnalyticsClient() — runs
    regardless of consent state, fetches /api/analytics/config, hands the
    phc_ key + host + distinctId to the error tracker so buffered events
    can flush.
  - apps/web/src/analytics/client.ts: capture_exceptions: false so
    posthog-js stops also emitting $exception (would have produced
    duplicate events server-side); also re-bridges the error-tracking
    context inside the loaded() callback so future events inherit the
    fully-resolved appVersion / sessionId.
  - apps/daemon/src/server.ts + packages/contracts: /api/analytics/config
    now returns key + host even when consent=false. enabled still reflects
    only the analytics consent toggle (posthog-js full autocapture stays
    off when enabled=false), but the always-on error tracker can read key
    directly. Forks without POSTHOG_KEY still get key=null and the whole
    pipeline becomes a no-op — fork-safe by construction.
  - apps/web/src/analytics/scrub.ts: regex fix so packaged-mac paths like
    /Applications/Open Design.app/Contents/Resources/apps/web/... (which
    contain a space) get fully rewritten to app://apps/web/...; previously
    the [^\s] guard stopped at 'Open' and leaked the install dir.
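
To illustrate the scrub fix in the last bullet, here is a minimal sketch of a rewrite that tolerates spaces in the bundle name. The function name `scrubInstallPath` and the pattern are assumptions for illustration, not the regex in scrub.ts.

```typescript
// Hypothetical pattern: lazily match up to the first ".app/Contents/Resources/",
// allowing spaces in the bundle name (e.g. "Open Design.app").
const PACKAGED_MAC_PREFIX = /\/Applications\/[^\n]*?\.app\/Contents\/Resources\//g;

function scrubInstallPath(text: string): string {
  return text.replace(PACKAGED_MAC_PREFIX, 'app://');
}
```

A guard such as `[^\s]` would stop at the space after `Open`, which is the leak described above; excluding only newlines keeps the match within one stack-frame line while still covering the whole install prefix.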

Validation:

  - pnpm --filter @open-design/web typecheck: pass
  - pnpm --filter @open-design/web test: 199 files / 1823 tests pass
    (includes 8 new error-tracking.test.ts cases for buffer cap, hook
    install, scrub, and direct dispatch)
  - pnpm --filter @open-design/daemon test: 250 files / 2971 tests pass
  - pnpm guard: pass

After release/v0.8.0 ships and rolls out, expect the crash-free curve to
drop from the artificial 99.93% to a realistic 95-98% — that's not a
regression, it's the first time we're measuring it.
2026-05-21 13:07:26 +08:00
lefarcen
f5f8937421
Merge origin/main into release/v0.8.0
Conflict resolved by taking origin/main:

- apps/web/src/components/EntryNavRail.tsx: design-systems rail
  button icon name palette-filled (release-side) -> blocks (main);
  main's icon swap is part of the more recent design-systems rail
  pass.
2026-05-21 10:52:08 +08:00
Eli-tangerine
ce95266586
[codex] Polish home composer working-directory controls (#2468)
* Polish design system home flows

* Polish home prompt presets

* Polish home working directory controls

* test: align home hero chrome smoke

* fix: stabilize home composer ci checks

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-05-21 00:22:46 +08:00
lefarcen
722ddfa235
Merge origin/main into release/v0.8.0
Conflicts resolved by taking origin/main on both files. Root cause:
main's PR #2460 (fix(landing): align logo.webp with brand icon) changed
HomeHero.tsx's .home-hero__brand-mark to render <img src=/app-icon.svg>
instead of an inlined <HeroBrandIcon /> SVG, and bundled the matching
CSS (26px round badge with bg-panel + border + padding 2px) plus a
gap/font-size tune. The release-side visual-refresh CSS still targeted
the SVG layout (38px square, transparent, inset SVG selector). Keeping
release's CSS would leave main's <img> unstyled.

- apps/web/src/styles/home/home-hero.css: three blocks, all taken from
  main: .home-hero__brand gap 8px, .home-hero__brand-mark redesigned for
  <img> child, .home-hero__brand-name font-size 16px.
- apps/web/src/index.css: two blocks, both taken from main: workspace
  tab close column 22px and .workspace-tab__close 18x18 (paired
  tune-down of tab UI spacing).
2026-05-20 22:28:38 +08:00
Eli-tangerine
8193981511
Keep PR 2400 changes without folder pickers (#2462)
* feat(daemon): add project working directory management and editor hand-off functionality

- Introduced new flags for project commands to manage working directories, including `--working-dir` and `--dir`.
- Implemented API routes for listing available editors and opening projects in selected editors.
- Added a hand-off button in the ChatPane header to facilitate opening project folders in local applications.
- Enhanced the HomeHero component to include working directory and design system settings, improving user experience in project creation.
- Created HomeHeroSettingsChips component for inline management of working directory and design system selection.

* feat(chat): implement voice transcription proxy and enhance UI components

- Added a new API route for voice transcription using OpenAI's `/audio/transcriptions` endpoint, allowing users to send audio blobs directly for transcription.
- Integrated multer for handling audio file uploads in memory, ensuring efficient processing without disk storage.
- Updated the HomeHero component to include example prompt suggestions for plugins, enhancing user interaction.
- Introduced the EditorIcon component to visually represent different editors in the hand-off menu, improving the user experience.
- Refined the HandoffButton component to utilize the new EditorIcon, providing a more cohesive interface for selecting editors.
- Enhanced CSS styles for various components to improve layout and responsiveness, including adjustments to tab and button sizes for better usability.

* style(workspace-shell): enhance layout and overflow handling

- Updated CSS for .workspace-shell to ensure full viewport width and height, with proper overflow management.
- Adjusted grid layout to prevent content overflow and maintain responsiveness.
- Modified styles for .workspace-tabs-chrome to improve width handling and prevent overflow issues.

* refactor(chat): remove voice transcription proxy and related components

- Deleted the voice transcription proxy implementation, including the associated API route and multer configuration.
- Removed the MicButton component from the ChatComposer and HomeHero components to streamline the UI.
- Updated HomeHero to include example suggestions without the voice input functionality.
- Adjusted CSS styles for various components to maintain layout consistency after the removal of the MicButton.

* feat(daemon): implement minting of HMAC tokens for working directory management

- Added a new function `mintImportTokenFromCurrentSecret` to generate HMAC tokens bound to a specified base directory, enhancing security for working directory operations.
- Updated the `desktop-auth.ts` file to include the new token minting functionality, which returns structured errors when the desktop auth secret is cleared.
- Introduced new IPC message types for minting import tokens in the sidecar protocol, allowing seamless integration with the daemon's working directory management.
- Enhanced the `WorkingDirPill` component to utilize the new token minting flow for secure directory selection in desktop builds.
- Updated CSS styles for the HomeHero component to accommodate new example suggestion features and maintain layout consistency.

* fix(HomeView): import HOME_HERO_CHIPS constant for improved chip management

- Updated the HomeView component to import the HOME_HERO_CHIPS constant from the chips module, enhancing the management of hero chips within the component.

* feat(daemon): implement mintImportTokenViaSidecar for secure working directory management

- Introduced the `mintImportTokenViaSidecar` function to facilitate the minting of HMAC tokens for desktop-import operations via the daemon's sidecar IPC. This allows CLI commands to bypass authentication when the desktop-auth gate is active.
- Updated the CLI to utilize the new token minting function when setting the working directory, ensuring secure access to trust-gated API endpoints.
- Enhanced the sidecar server to handle minting requests and return structured error messages for improved user feedback.
- Added tests to validate the new token minting functionality and its integration with the working directory management process.
- Refactored related components to support the new token flow, improving overall security and user experience.

* feat(HomeHero): enhance UI components and styles for improved user experience

- Updated HomeHero component to replace active dot indicators with Plug icons for better visual representation of active plugins.
- Adjusted CSS styles for various elements, including padding and dimensions, to enhance layout consistency and responsiveness.
- Introduced new styles for active type icons and improved hover effects for buttons.
- Updated HomeHeroSettingsChips to change button titles and icons for clarity.
- Added tests to ensure proper rendering and functionality of updated components.

* feat(ProjectDesignSystemPicker): enhance design system selection with preview functionality

- Updated the ProjectDesignSystemPicker component to include a preview feature for design systems, allowing users to see a preview of the selected design system.
- Implemented hover functionality to update the preview based on the hovered design system.
- Added fullscreen preview capability for a more immersive experience.
- Enhanced CSS styles for the design system picker to improve layout and responsiveness.
- Introduced tests to validate the new preview functionality and ensure proper interaction within the component.

* feat: refactor project metadata handling and enhance design system picker

- Updated the default scenario plugin ID retrieval to use project metadata, improving the logic for determining the appropriate plugin based on project intent.
- Enhanced the ProjectDesignSystemPicker and related components to support localized design system summaries and categories, improving user experience.
- Introduced new translations for working directory and design system picker components, ensuring better accessibility and usability across different locales.
- Added a new 'live-artifact' project type to the HomeHero chips, expanding the functionality for users creating refreshable artifacts.
- Updated tests to validate the new project metadata handling and design system picker functionalities.

* feat: enhance localization and styling for design system components

- Added French translations for working directory and design system picker components, improving accessibility for French-speaking users.
- Updated CSS styles for the pet task item to ensure consistent padding and layout.
- Introduced a new test suite for HomeHeroSettingsChips to validate localization and design system selection functionality.
- Enhanced ProjectDesignSystemPicker tests to ensure proper localization and interaction with design system categories.

* fix: update .gitignore to include all claude-sessions directories and remove specific session files

- Modified .gitignore to ensure all claude-sessions directories are ignored by using a wildcard pattern.
- Deleted two specific claude-sessions markdown files to clean up unnecessary session data.

* fix: repair home automation ci regressions

* fix: stabilize artifact consistency e2e

* Remove folder picker changes from PR 2400

---------

Co-authored-by: pftom <1043269994@qq.com>
Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-05-20 22:07:30 +08:00
lefarcen
255c3058c5
fix(analytics): app_version=0.0.0 + media providers clicks + lock run_finished error_code (#2453)
* fix(analytics): use state for runtime app version so PostHog gets the real value

`useAppVersion()` stored the fetched `/api/version` result in a `useRef`,
but ref writes do NOT trigger a re-render. The hook therefore kept
returning '0.0.0' forever and the downstream `useEffect` that calls
`client.register({ app_version, ui_version })` never re-ran with the
real version. PostHog dashboards then showed `app_version=0.0.0` and
`ui_version=0.0.0` on every event ever shipped from the web client.

Switching to `useState` lets the resolved version flow through React's
render cycle so the register-on-change effect picks it up. The boot
placeholder still ships as '0.0.0' for the first events before the
fetch resolves (we don't re-emit those), but every event after init
now carries the real daemon-pinned version.

Adds a red-spec at apps/web/tests/analytics-app-version.test.tsx that
went red on the `useRef` shape (`expected '0.0.0' to be '1.2.3'`) and
green on the `useState` shape, so a future refactor can't silently
regress it.

* feat(analytics): wire media providers click events + lock run_finished error_code invariant

Two analytics gaps shipped together because both came out of the same
PostHog spot-check after PR #2390 landed:

1. Settings → Media providers (CSV row "client_type=desktop / mason /
   media_providers") wasn't emitting any ui_click events. The contract
   type `SettingsMediaProvidersClickProps` and helper
   `trackSettingsMediaProvidersClick` were defined but no call site
   used them, so the dashboard showed zero traffic on every element.
   Added the four v2 elements:
   - `reload` on the "Reload from daemon" button
   - `key_input` on every per-provider API key field (onFocus, mirrors
     the BYOK key field pattern in this same dialog)
   - `url_input` on every per-provider base-URL field
   - `clear` on each row's Clear button (fires before the confirm
     dialog so the intent signal is recorded even if the user backs
     out)
   Each event carries `providers_id` (provider.id) and `is_configured`
   (truthy when the row has a stored entry).

2. `run_finished` with `result=failed` was reported as missing
   `error_code` on PostHog. Audited every failure path: the daemon's
   `child.on('close', ...)` handler has several branches that call
   `runs.finish('failed', code, signal)` directly without first
   emitting an SSE `error` event (ACP fatal, agentStreamError fall
   through, child close without diagnostic), leaving
   `run.errorCode === null` in the status body. The existing fallback
   in `server.ts` already derives `AGENT_SIGNAL_*` / `AGENT_EXIT_*` /
   `AGENT_TERMINATED_UNKNOWN` from `signal` / `exitCode` for those
   cases, so the wire emission should never blank out — but the logic
   was inline and had no unit coverage.

   Extracted the result/error_code derivation into
   `apps/daemon/src/run-result.ts` and added 12 unit tests covering:
   - explicit errorCode forwarding
   - signal-only failures
   - exit-code-only failures
   - clean (code=0) failures (ACP fatal shape)
   - cancelled runs (with and without stamped code)
   - empty-string errorCode defensive case
   - status→result mapping for succeeded/canceled/failed/unknown

   All 12 pass, confirming that the invariant "result=failed always
   carries error_code" holds for every failure shape the daemon
   produces. The refactor pins that invariant so a future change
   fails a unit test rather than silently regressing on PostHog.

   If `error_code` still looks empty on a live event, share the
   PostHog event JSON + the agent id and I'll dig further — at this
   point the daemon emission itself is exercised end-to-end.
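
The fallback order described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the contents of `apps/daemon/src/run-result.ts`: the function name `deriveErrorCode`, the `FailedRunInfo` shape, and the exact suffix format after each prefix are assumptions; only the `AGENT_SIGNAL_*` / `AGENT_EXIT_*` / `AGENT_TERMINATED_UNKNOWN` families come from this message.

```typescript
// Hypothetical input shape; the real run record in the daemon may differ.
interface FailedRunInfo {
  errorCode: string | null;
  signal: string | null;
  exitCode: number | null;
}

// Fallback order as described above:
// explicit errorCode -> AGENT_SIGNAL_* -> AGENT_EXIT_* -> AGENT_TERMINATED_UNKNOWN.
function deriveErrorCode(run: FailedRunInfo): string {
  // Treat an empty string like a missing code (the "defensive case").
  if (run.errorCode !== null && run.errorCode !== "") return run.errorCode;
  if (run.signal !== null && run.signal !== "") return `AGENT_SIGNAL_${run.signal}`;
  if (run.exitCode !== null) return `AGENT_EXIT_${run.exitCode}`;
  return "AGENT_TERMINATED_UNKNOWN";
}
```

Under these assumptions the function always returns a non-empty string, which is the invariant the unit tests are meant to pin.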
2026-05-20 21:50:11 +08:00
lefarcen
1cfe274a90 Merge origin/main into release/v0.8.0
Conflicts resolved by taking origin/main on all six points:

- apps/web/src/components/HomeHero.tsx:479-487  brand div removed
  (main dropped the .home-hero__brand wrapper; the release-side visual
  refresh still had it).
- apps/web/src/components/HomeHero.tsx:894-898  attach Icon size
  18 (main's update) replaces 20 from release.
- apps/web/src/components/HomeHero.tsx:913-927  submit button uses
  <Icon name="arrow-up" size={22} /> (main's component refactor)
  instead of the release-side inline SVG.
- apps/web/src/components/EntryShell.tsx:578-582  Discord Icon size
  14 (main) instead of 16 (release).
- apps/web/src/styles/home/home-hero.css  drop .home-hero__brand /
  __brand-mark / __brand-name rules — main removed both the component
  div and these CSS rules together; keeping the CSS would be dead code.
- apps/web/src/styles/home/entry-layout.css  Discord badge icon color
  #5865f2 (main, the brand color introduced by PR #2386) instead of
  release's neutral var(--text-strong).
2026-05-20 20:59:00 +08:00
PerishFire
31ca20f2c6
Add packaged update apply observations (#2429)
2026-05-20 19:11:36 +08:00
lefarcen
c80acfefeb
fix(daemon,web): block pitch-deck placeholder publishes and unbreak framework decks (#2384)
Two preview-time bugs surfaced ahead of 0.8.0:

1. Pitch-deck example (#2215): the official html-ppt-pitch-deck prompt asked
   the agent to confirm three facts first, but the manifest had no
   structured `od.inputs`, so the platform's required-input gate had no
   fields to enforce and the run could publish HTML that still contained
   unresolved fundraising placeholders (`Name to confirm`, `$X.XM`,
   `Replace this panel with`, ...). Add structured required inputs to the
   manifest and a daemon-side publication guard that rejects HTML/deck
   artifact writes whose body still contains those placeholders. Scope is
   the file-write boundary only (no assistant-text scanning), so the
   guard cannot trip on the agent's chat prose mid-clarification.

2. Framework deck preview off-screen: `injectDeckBridge` injected
   `place-content: center !important` on `.deck-shell` for every deck-mode
   srcdoc, which forced the framework's `display: grid` shell to re-center
   its implicit track. The framework's `fit()` already centers a
   `transform-origin: top left` stage with an explicit `translate(tx, ty)`
   that assumes the stage's natural layout position is (0, 0); the two
   centerings stacked and the scaled stage landed ~1000px off-screen, so
   the preview showed a sliver of slide content in the top-left with the
   rest black. Skip the override when the framework's `id="deck-stage"`
   marker is in the doc, and drop the dead `display: grid; place-items:
   center` from the deck framework template so future drift can't
   re-introduce the same stack.
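
The publication guard from item 1 might look something like the sketch below. The function names and the thrown error are hypothetical; the three marker strings are the ones quoted above, and the real list is longer (the message elides the rest with `...`).

```typescript
// Marker strings quoted in the commit message; the real list is longer.
const PLACEHOLDER_MARKERS: readonly string[] = [
  "Name to confirm",
  "$X.XM",
  "Replace this panel with",
];

// Returns the markers still present in an artifact body (empty array = publishable).
function findUnresolvedPlaceholders(html: string): string[] {
  return PLACEHOLDER_MARKERS.filter((marker) => html.includes(marker));
}

// Hypothetical guard for the file-write boundary: throws instead of writing.
function assertPublishable(html: string): void {
  const hits = findUnresolvedPlaceholders(html);
  if (hits.length > 0) {
    throw new Error(`unresolved placeholders: ${hits.join(", ")}`);
  }
}
```

Because the check runs only on the artifact body handed to the write path, assistant chat text never reaches it, which matches the scoping described above.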
2026-05-20 16:20:34 +08:00
Xinmin Zeng
6ca4491294
fix(mcp): forward external MCP servers to OpenCode (#2174)
* fix(mcp): forward external MCP servers to OpenCode

OpenCode (and 5 other non-Claude/non-ACP runtimes) silently dropped
the user's `.od/mcp-config.json` entries at spawn time because
`server.ts` only branched on `def.id === 'claude'` and
`def.streamFormat === 'acp-json-rpc'`. The UI happily saved the
servers and the user never learned the agent process never received
them — the "ghost MCP" UX called out in #2142.

Replace the two hardcoded checks with a single `def.externalMcpInjection`
discriminator on `RuntimeAgentDef` (`claude-mcp-json` / `acp-merge` /
`opencode-env-content`). The Claude `.mcp.json` write and ACP
`mcpServers` merge paths keep their existing behavior; OpenCode now
gets its config layered in via `OPENCODE_CONFIG_CONTENT`, which
OpenCode merges on top of the user's saved `~/.config/opencode
/opencode.json` (verified against opencode-ai 1.15.5 — `opencode mcp
list` shows the injected server as `connected`).

Surface the same discriminator through `AgentInfo` so the Settings →
External MCP panel renders a banner naming the agents that DO receive
the servers and the ones that don't, with a hint to configure those
agents' own config files instead. Replaces the silent-failure UX with
explicit, actionable information.
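
As a rough illustration of the discriminator, the sketch below switches on the three values named above. `AgentDefSketch` and `describeInjection` are hypothetical stand-ins; the real `RuntimeAgentDef` and the spawn logic in `server.ts` carry more fields and do actual work rather than returning a description.

```typescript
// The three discriminator values named in the commit message.
type ExternalMcpInjection = "claude-mcp-json" | "acp-merge" | "opencode-env-content";

// Hypothetical, trimmed-down agent definition.
interface AgentDefSketch {
  id: string;
  externalMcpInjection?: ExternalMcpInjection;
}

// Describes how servers would reach the agent; returns null when the runtime
// has no injection path (the "ghost MCP" case).
function describeInjection(def: AgentDefSketch): string | null {
  switch (def.externalMcpInjection) {
    case "claude-mcp-json":
      return "write .mcp.json";
    case "acp-merge":
      return "merge into ACP mcpServers";
    case "opencode-env-content":
      return "set OPENCODE_CONFIG_CONTENT";
    default:
      return null;
  }
}
```

A single field on the definition means a new runtime opts in by declaring a value instead of adding another `def.id === ...` branch.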

Fixes #2142

* fix(web): scope external MCP banner to installed agents only

mrcfps's review on #2174 pointed out that `/api/agents` returns every
runtime def — including ones the user hasn't installed (those carry
`available: false`) — so the support banner was happily listing
Devin / Kimi / Kiro / Mistral Vibe under "Forwarded to" and DeepSeek /
Pi / Qoder / Qwen under "Not forwarded to" on a machine where none
of those CLIs were even present. Misleading at best, since the banner
copy reads as "agents on your system."

Filter the agents array by `available: true` before grouping by
`externalMcpInjection`. The "no enabled MCP servers" and "daemon
unreachable" short-circuits stay; add one more guard for "the user
hasn't installed a single supported CLI yet" so the banner just
disappears instead of showing two empty lines.

Verified against the local dev runtime: on a host with 7 of the 16
known agents installed, the banner now shows the actual 3+4 split
(Claude Code · Hermes · OpenCode forwarded; Codex CLI · Cursor Agent ·
Gemini CLI · GitHub Copilot CLI not), down from the previous 8+8 that
included CLIs that don't exist on the machine.

* fix(web): mark ACP runtimes as stdio-only in MCP support banner

Second review pass on #2174 (mrcfps) caught that the banner was treating
every `acp-merge` runtime as fully forwarded, even though
`buildAcpMcpServers()` in `apps/daemon/src/mcp-config.ts:386` drops
every non-stdio server before spawn. Save a Higgsfield HTTP MCP and
pick Hermes: the daemon hands Hermes nothing, but the banner still
listed Hermes under "Forwarded to". That is the exact silent-failure UX
the banner was supposed to remove.

Tag ACP runtimes inline with `(stdio only)` in `renderNames`, and
when at least one ACP adapter shows up in the supported group, add a
one-sentence sibling explaining the limit. Pure presentation change
in `McpAgentSupportBanner` — no new state, no transport-aware
filtering, no contract change. Dropping the warning will be cheap to
do later if ACP grows HTTP support.

Verified against the local dev runtime with both a stdio
(`basic-memory`) and an HTTP server (Higgsfield) saved: banner now
renders "Forwarded to: Claude Code · Hermes (stdio only) · OpenCode.
ACP adapters marked stdio only receive stdio MCP servers from this
list; HTTP and SSE entries are dropped at spawn time."
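
A minimal sketch of the "Forwarded to" line, combining the installed-only filter with the `(stdio only)` tag. `AgentInfoSketch` and `renderForwardedNames` are hypothetical names; the real `renderNames` in `McpAgentSupportBanner` may be structured differently.

```typescript
// Hypothetical subset of the AgentInfo fields the banner would need.
interface AgentInfoSketch {
  name: string;
  available: boolean;
  externalMcpInjection?: "claude-mcp-json" | "acp-merge" | "opencode-env-content";
}

// Lists installed agents that receive external MCP servers, tagging ACP
// runtimes inline because non-stdio servers are dropped for them at spawn.
function renderForwardedNames(agents: AgentInfoSketch[]): string {
  return agents
    .filter((a) => a.available && a.externalMcpInjection !== undefined)
    .map((a) => (a.externalMcpInjection === "acp-merge" ? `${a.name} (stdio only)` : a.name))
    .join(" · ");
}
```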
2026-05-20 15:22:09 +08:00
shangxinyu1
5fc27f8923
Fix daemon run recovery across reloads (#2374)
* Fix daemon run recovery across reloads

Pin daemon-created runs to assistant messages immediately so hard reloads before the create response can reattach.

Replay terminal and active run events from the beginning on reload so restored turns keep assistant text, thinking events, produced files, and artifacts.

Fixes #2366

Fixes #2368

Fixes #2371

* Fix ProjectView daemon run recovery tests
2026-05-20 15:10:23 +08:00
Bryan
c530d163f8
feat(web): "Resume conversation in new chat" UI — #462 Commit B (companion to #1718) (#2264)
* feat(contracts): add handoff request/response DTOs

Adds HandoffRequest, HandoffResponse, and HANDOFF_SCHEMA_VERSION for
the upcoming POST /api/projects/:id/handoff synthesis endpoint. Mirrors
the finalize.ts subpath pattern (package.json#exports + esbuild entry +
index re-export) so daemon and web can import
@open-design/contracts/api/handoff.

Refs nexu-io/open-design#462.

* feat(daemon): add handoff synthesis pipeline (buildHandoffPrompt + synthesizeHandoffPrompt)

Adds `apps/daemon/src/handoff-design.ts` exposing the resume-conversation
synthesis primitives the upcoming `POST /api/projects/:id/handoff` route will
call into.

- `buildHandoffPrompt({ projectId, transcriptJsonl, transcriptMessageCount,
  now })` returns the system + user prompts. System prompt asks Claude to
  emit a structured Markdown body with Context / Decisions made / Open
  questions / Current focus / Provenance, with Provenance bullets explicitly
  flat (no Markdown emphasis on labels) to preempt the PR #1584 round-2
  parser bug.
- `synthesizeHandoffPrompt(db, projectsRoot, projectId, options)` reuses the
  existing finalize-design pipeline pieces: `exportProjectTranscript` →
  `truncateTranscriptForPrompt` → `buildHandoffPrompt` →
  `callAnthropicWithRetry` → `extractDesignMd`, but without the lockfile,
  disk write, design-system, or artifact-resolution paths.
- Promotes `DEFAULT_TIMEOUT_MS` in finalize-design.ts to `export const` so
  handoff shares the same 120s upstream-call bound.

Refs nexu-io/open-design#462.

* feat(daemon): wire POST /api/projects/:id/handoff route

Adds the handoff HTTP route and registers it in server.ts. Validation
block + error-mapping shape mirror registerFinalizeRoutes (BYOK payload,
upstream-error → ApiErrorCode mapping, redactSecrets on the raw upstream
body). Handoff has no lockfile, so the CONFLICT branch is omitted.

`res.on('close')` is wired to flip an AbortController whose signal is
threaded into synthesizeHandoffPrompt, so a UI-side cancel actually
aborts the daemon-side Anthropic call rather than letting it keep
running after the client walks away (mirrors the PR #974 fix for
finalize).

- `apps/daemon/src/handoff-routes.ts` — new, exports registerHandoffRoutes
  + RegisterHandoffRoutesDeps.
- `apps/daemon/src/server-context.ts` — adds handoff slot to ServerContext.
- `apps/daemon/src/route-context-contract.ts` — adds RegisterHandoffRoutesDeps
  to the compile-time coverage assertion.
- `apps/daemon/src/server.ts` — imports synthesizeHandoffPrompt +
  registerHandoffRoutes, builds handoffDeps, registers the route next
  to finalize.
- `apps/daemon/tests/handoff-route.test.ts` — 12 HTTP-layer tests:
  validation (400/403/404), happy path, upstream error mapping
  (401/429/502/502 non-JSON), api-key redaction.
- `apps/daemon/tests/handoff-route-abort.test.ts` — client-disconnect
  aborts the daemon-side controller.

Refs nexu-io/open-design#462.

* fix(daemon): map TranscriptExportLockedError to 409 CONFLICT on handoff route

`exportProjectTranscript` acquires a per-project `.transcript.lock`
internally (apps/daemon/src/transcript-export.ts:131-163) and throws
`TranscriptExportLockedError` on EEXIST. Concurrent handoff requests —
or a handoff that races `/api/projects/:id/finalize/anthropic` — lost
that lock and surfaced as 500 INTERNAL_ERROR through the route's
generic catch.

- `apps/daemon/src/handoff-routes.ts` — catch `TranscriptExportLockedError`
  and return `409 CONFLICT` ahead of the generic 500 branch, mirroring
  the existing `FinalizePackageLockedError → 409 CONFLICT` mapping at
  `apps/daemon/src/import-export-routes.ts:603-605`.
- `apps/daemon/src/server.ts` — thread `TranscriptExportLockedError`
  through `handoffDeps` so the route can match without a direct import.
- `apps/daemon/src/handoff-design.ts` — correct the module header
  comment that incorrectly claimed "no lockfile (concurrent handoff
  calls are safe)" — handoff does not add its own lock, but it does
  transitively acquire `.transcript.lock` via the transcript-export
  call.
- `apps/daemon/tests/handoff-route.test.ts` — regression test that
  pre-acquires `.transcript.lock` on disk via `fs.openSync(lockPath, 'wx')`
  before firing a handoff request, asserts 409 CONFLICT.

Refs nexu-io/open-design#462 — addresses @nettee's blocking review on
PR #1718 (comment 3242251338).

* fix(daemon): keep handoff request timeout armed through the response body read

`synthesizeHandoffPrompt` cleared the upstream-call timeout in a `finally`
that ran as soon as `callAnthropicWithRetry` returned. But `fetch()`
resolves once the upstream sends *headers* — so the subsequent
`await response.json()` body read ran with no timeout. A response that
sends headers and then stalls its body could hang `/api/projects/:id/handoff`
indefinitely instead of failing.

- `apps/daemon/src/handoff-design.ts` — move `clearTimeout(timeoutId)` into a
  single outer `finally` spanning both the call and the `response.json()`
  body parse, so the timeout stays armed until the body is fully consumed.
- `apps/daemon/src/handoff-design.ts` — the body-parse catch now re-throws
  `AbortError` as-is, mirroring the call-phase catch. Without this a
  body-phase timeout would surface as `502` "non-JSON body"; re-throwing
  lets the route map it to the intended `503` "handoff timed out"
  (`handoff-routes.ts:122-124`).
- `apps/daemon/tests/handoff-design.test.ts` — regression test: a `fetchImpl`
  returning a `Response` whose body never closes after headers, raced
  against a 500ms deadline, asserts the call aborts (not hangs) and rejects
  with `AbortError`.

Refs nexu-io/open-design#462 — addresses @nettee's round-2 blocking review
on PR #1718 (`handoff-design.ts:196`).

* fix(daemon): map upstream 400 to 400 BAD_REQUEST on handoff route

`callAnthropicWithRetry` preserves a non-retryable upstream status, so an
Anthropic HTTP 400 (`invalid_request_error` — unknown model, invalid
maxTokens, malformed body) reached the route's `FinalizeUpstreamError`
branch and fell through to `502 UPSTREAM_UNAVAILABLE`. That reported
deterministic caller input as a transient server outage, inviting
pointless retries and hiding which field was wrong.

- `apps/daemon/src/handoff-routes.ts` — special-case `err.status === 400`
  to `400 BAD_REQUEST` with the redacted upstream detail, ahead of the
  generic 502. Also refresh the route docblock: it claimed the 409 branch
  was omitted (stale since the R1 TranscriptExportLockedError fix) and
  that error mapping fully mirrors finalize (now diverges on 400).
- `apps/daemon/tests/handoff-route.test.ts` — route test driving an
  Anthropic `400 invalid_request_error`: asserts 400 BAD_REQUEST, the
  upstream detail is surfaced, and an echoed key is redacted.
- `packages/contracts/tests/package-runtime.test.ts` — import
  `@open-design/contracts/api/handoff` through the package `exports` map
  and assert `HANDOFF_SCHEMA_VERSION`, covering the built publish surface
  (esbuild entry + exports map + root re-export) that the source-only
  `handoff-contract.test.ts` does not exercise.

Refs nexu-io/open-design#462 — addresses @nettee's round-3 blocking
review on PR #1718.

* fix(daemon): await the now-async external base-URL validator on handoff route

Main's #1176 (`9a64fccd`) made `validateExternalApiBaseUrl` DNS-aware and
asynchronous (`validateBaseUrlResolved`) and updated the proxy and finalize
callers to `await` it. The handoff route — added on this branch in parallel,
against the old synchronous validator — still called it without `await`, so
`validated` was a Promise: `validated.error` / `validated.forbidden` were
`undefined`, the SSRF / malformed-URL guard silently no-opped, and a bad
`baseUrl` fell through to the upstream call and surfaced as 502.

A semantic merge break — no textual conflict, green on the branch in
isolation, red once CI re-merged latest main.

- `apps/daemon/src/handoff-routes.ts` — `await validateExternalApiBaseUrl(...)`,
  mirroring the finalize route (`import-export-routes.ts:561`). The handler
  is already `async`.

The existing `handoff-route.test.ts` cases "400 BAD_REQUEST when baseUrl is
not a valid URL" and "403 FORBIDDEN when baseUrl points at a private internal
IP" already encode this — red against branch + latest main, green now.
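
The failure mode is easy to reproduce in isolation. In the sketch below, `validateBaseUrlSketch` and `ValidationResult` are hypothetical stand-ins for the real validator and its result type; the cast mimics a call site whose types did not flag the missing `await`.

```typescript
// Hypothetical result shape and validator; the real validateExternalApiBaseUrl differs.
interface ValidationResult {
  error?: string;
  forbidden?: boolean;
}

async function validateBaseUrlSketch(raw: string): Promise<ValidationResult> {
  try {
    new URL(raw); // throws on a malformed URL
    return {};
  } catch {
    return { error: "baseUrl is not a valid URL" };
  }
}

// Without `await`, the caller holds a Promise, so `.error` is undefined
// and a guard like `if (validated.error)` never fires.
const pending = validateBaseUrlSketch("not a url") as unknown as ValidationResult;
const guardFiredWithoutAwait = pending.error !== undefined;

// With `await`, the guard sees the real result.
async function guardFires(raw: string): Promise<boolean> {
  const validated = await validateBaseUrlSketch(raw);
  return validated.error !== undefined;
}
```

Here `guardFiredWithoutAwait` is `false` even though the input is malformed, which is the silent no-op described above.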

Refs nexu-io/open-design#462 — PR #1718 CI fix.

* chore(daemon): list handoff in the assertServerContextSatisfiesRoutes literal

The `assertServerContextSatisfiesRoutes({...})` call in `server.ts` enumerates
every route registrar's deps but omitted `handoff`. Adding `handoff: handoffDeps`
makes the literal complete and consistent with the other route deps.

This was not a typecheck break: route-dep coverage is guaranteed by the
`Assert<ServerContext extends AllRegisteredRouteDeps>` type in
`route-context-contract.ts` — and `AllRegisteredRouteDeps` already includes
`RegisterHandoffRoutesDeps` — not by this assertion-call literal. The literal
has omitted `handoff` since this branch's first push (`806db576`) through green
CI throughout; `tsc -p tsconfig.json --noEmit` is clean before and after.

Refs nexu-io/open-design#462 — addresses @nettee's round-4 review note on PR #1718.

* feat(web): add "Resume conversation in new chat" action (#462)

Adds a Resume control to the chat header, next to "New conversation".
Clicking it synthesizes a handoff prompt from the current transcript
via POST /api/projects/:id/handoff, opens a fresh conversation, and
auto-sends the synthesized prompt as its first user message — so a
drifted session resumes without the user replaying context by hand.
The old conversation is preserved.

- synthesizeHandoff() web-state wrapper in apps/web/src/state/projects.ts
- resume-conversation icon button in ChatPane (onResumeConversation /
  resumeConversationDisabled props)
- handleResumeConversation + pendingResumeRef + auto-send effect in
  ProjectView; effect gates on messagesConversationId so the prompt
  cannot fire before the new conversation's message read settles
- chat.resumeConversation i18n key across all 19 locales

Commit B of #462; Commit A is the daemon endpoint (PR #1718). This
branch is stacked on feat/handoff-endpoint so the web code resolves
@open-design/contracts/api/handoff.

* fix(daemon): scope handoff to one conversation + reject empty transcripts (#462)

Addresses the review on #1718 and #2264:

- mrcfps (#2264): the handoff endpoint exported the whole project's
  transcript, so a multi-conversation project blended unrelated chats
  into the synthesized prompt. HandoffRequest now carries a required
  conversationId; the route validates it belongs to the project
  (404 CONVERSATION_NOT_FOUND), and exportProjectTranscript takes an
  optional conversationId filter so only that conversation is exported.
- nettee (#1718): a zero-message conversation still called Anthropic and
  fabricated a handoff. synthesizeHandoffPrompt now throws
  EmptyTranscriptError on messageCount === 0; the route maps it to
  400 EMPTY_TRANSCRIPT before any BYOK tokens are spent.

HANDOFF_SCHEMA_VERSION bumped to 2 (conversationId is a new required
request field). Regression tests: a two-conversation scoping test, an
empty-conversation route + pipeline test, and a transcript-export
conversationId-filter unit test.
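
The two fixes compose naturally: filter to one conversation first, then refuse to continue if nothing is left. The sketch below assumes a flat in-memory message list and a hypothetical `selectConversation` helper; the real `exportProjectTranscript` reads from the database and the real error class may take different arguments.

```typescript
// Hypothetical message shape for illustration.
interface TranscriptMessage {
  conversationId: string;
  role: "user" | "assistant";
  text: string;
}

class EmptyTranscriptError extends Error {
  constructor(conversationId: string) {
    super(`conversation ${conversationId} has no messages`);
    this.name = "EmptyTranscriptError";
  }
}

// Keeps only the requested conversation and throws before any upstream call
// would be made when that conversation has zero messages.
function selectConversation(
  messages: TranscriptMessage[],
  conversationId: string,
): TranscriptMessage[] {
  const scoped = messages.filter((m) => m.conversationId === conversationId);
  if (scoped.length === 0) throw new EmptyTranscriptError(conversationId);
  return scoped;
}
```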

* feat(web): send conversationId with the resume handoff request (#462)

Follows the handoff endpoint becoming conversation-scoped. The resume
flow now passes the active conversationId to POST /handoff so the
synthesized prompt summarizes only the conversation being resumed.
handleResumeConversation bails when there is no active conversation;
synthesizeHandoff and the resume tests carry the new field.

* feat(daemon): add `od project handoff` CLI + register handoff error codes (#462)

Addresses the second-round review on #1718 and #2264:

- mrcfps (#2264): per AGENTS.md "Capability exposure (UI/CLI dual-track)",
  a user-facing capability must be reachable through the `od` CLI, not
  only the web UI. Adds `od project handoff <id> --conversation <id>
  --api-key <key> --model <model> [--base-url] [--max-tokens] [--json]`,
  driving the same POST /api/projects/:id/handoff endpoint. The logic
  lives in a testable handoff-cli.ts sibling module (mirrors
  artifacts-cli.ts) so cli.ts's import-time dispatch stays out of tests.
- nettee (#1718): the route emitted CONVERSATION_NOT_FOUND and
  EMPTY_TRANSCRIPT, which were absent from the shared API_ERROR_CODES
  union. Both are now registered in packages/contracts/src/errors.ts,
  with a contract test pinning them so the route and contract cannot
  drift again.

A CLI contract test covers the conversation-scoped request shape,
--json output, flag validation, and daemon-error surfacing.

* fix(daemon): fail `od project handoff` on a malformed 2xx response (#462)

Addresses nettee's review on #1718: runProjectHandoff treated any 2xx
response as success, so a broken daemon/proxy 200 with malformed or
shape-invalid JSON would print `undefined` (or `{}` under --json) and
still exit 0 — breaking the fail-fast contract scripts rely on. It now
validates the body is a well-formed HandoffResponse via an
isHandoffResponse type guard and fails fast otherwise. Regression tests
cover a shape-invalid and an unparseable 200 body.
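
A type guard of this kind could be sketched as below. The field names `schemaVersion` and `prompt` are assumptions made for illustration; the real `HandoffResponse` DTO lives in `@open-design/contracts/api/handoff` and its fields are not shown in this message.

```typescript
// Assumed shape for illustration only; the real DTO may carry different fields.
interface HandoffResponseSketch {
  schemaVersion: number;
  prompt: string;
}

function isHandoffResponseSketch(value: unknown): value is HandoffResponseSketch {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.schemaVersion === "number" && typeof record.prompt === "string";
}

// Parses a 2xx body and fails fast on unparseable or shape-invalid JSON,
// instead of printing `undefined` and exiting 0.
function parseHandoffBody(body: string): HandoffResponseSketch {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    throw new Error("handoff response is not valid JSON");
  }
  if (!isHandoffResponseSketch(parsed)) {
    throw new Error("handoff response has an unexpected shape");
  }
  return parsed;
}
```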

* feat(web): surface the daemon's classified handoff error in the resume toast (#462)

Addresses mrcfps's non-blocking note on #2264: synthesizeHandoff returned
null for every non-2xx response, so RATE_LIMITED, EMPTY_TRANSCRIPT, and an
upstream 400 with provider detail all collapsed into one generic "check
your API key" toast — even though handoff-routes.ts had already classified
and sanitized them.

synthesizeHandoff now returns the daemon's structured `{ error }` on a
classified failure; `null` stays reserved for a transport failure or an
unparseable body. handleResumeConversation surfaces error.message plus
redacted details for the `{ error }` case, and a distinct
daemon-unreachable message for null.

* fix(web): omit empty baseUrl from the resume handoff request (#462)

Addresses mrcfps's review on #2264: the default Anthropic config
normalizes baseUrl to '' (config.ts), and the handoff route 400s an
explicit empty baseUrl — so the Resume action failed before synthesis
for every user who never set a custom base URL.

handleResumeConversation now forwards baseUrl only when config.baseUrl
is a non-empty string, matching the contract's optional-field semantics.
Tests: the default-config path asserts baseUrl is absent from the
request, and a new case covers a custom baseUrl being forwarded.
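
One way to express the optional-field rule is to add `baseUrl` only after checking it is a non-empty string. The request field names below (`apiKey`, `model`) and the `buildHandoffRequest` helper are assumptions for illustration; the actual `HandoffRequest` contract is not reproduced in this message.

```typescript
// Hypothetical request shape; only conversationId and the optional baseUrl
// are confirmed by the surrounding commit messages.
interface HandoffRequestSketch {
  conversationId: string;
  apiKey: string;
  model: string;
  baseUrl?: string;
}

function buildHandoffRequest(
  config: { apiKey: string; model: string; baseUrl?: string },
  conversationId: string,
): HandoffRequestSketch {
  const request: HandoffRequestSketch = {
    conversationId,
    apiKey: config.apiKey,
    model: config.model,
  };
  // The default config normalizes baseUrl to '', which the route rejects,
  // so the key is left off entirely unless a real value is present.
  const baseUrl = config.baseUrl;
  if (typeof baseUrl === "string" && baseUrl !== "") {
    request.baseUrl = baseUrl;
  }
  return request;
}
```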

* refactor(daemon): dispatch `od project handoff` before the generic project parser (#462)

Addresses nettee's non-blocking note on #1718: runProject ran the shared
parseFlags(PROJECT_*) before reaching the handoff switch case, so a
malformed `od project handoff` invocation (`--unknown`, `--max-tokens`
with no value) threw out of the generic parser instead of hitting
handoff-cli's structured fail() — the entrypoint behaved differently
from the unit-tested runProjectHandoff helper.

The handoff sub now short-circuits before parseFlags / projectDaemonUrl,
so `od project handoff` runs exactly runProjectHandoff with no
intervening parsing. handoff-cli.test.ts gains unknown-flag and
missing-value cases covering the structured fail path.

---------

Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
2026-05-20 13:28:27 +08:00
lefarcen
204599a7ae
feat(analytics): ship PostHog v2 event schema (#2285)
* feat(analytics): ship PostHog v2 event schema

Aligns the PostHog wire format with the product team's v2 tracking
spec (Open Design 埋点文档 2.0, the event-tracking document). The previous v1 catalogue defined a
flat per-page event name (home_view / studio_click / settings_view…);
v2 collapses everything to four core events identified through the
page_name + area + element triplet so dashboards can group by surface
without owning a separate event per page.

Key changes
- packages/contracts/src/analytics: collapse to page_view / ui_click /
  surface_view / *_result event names; bump EVENT_SCHEMA_VERSION to 2;
  rename the wire field anonymous_id → device_id (value unchanged);
  promote the configure-state triplet (has_available_configure_cli /
  configure_type / configure_availability) to a global PostHog register
  so every event inherits it without per-helper boilerplate.
- apps/web/src/analytics: rewrite the 43 trackXxx helpers behind the
  new typed catalogue; opt out of PostHog's built-in UA bot filter so
  legitimate embedded webviews, fingerprinted browsers, and the
  Playwright-based e2e runs ingest captures (the Privacy → "Share
  usage data" toggle remains the single consent gate).
- apps/web components: wire P0/P1/P2 click + view + result surfaces
  end-to-end — left nav, toolbar, home chat composer, recent projects,
  new project modal, plugins / design systems / integrations /
  automations pages, file manager, artifact toolbar/header/share popup,
  feedback panel, settings sidebar / language / appearance /
  notifications / pets / privacy / connectors. Fixes the v1 feedback
  bug where action=clear_feedback_rating shipped rating=null instead of
  the rating being cleared.
- apps/daemon: extend run_created / run_finished with the v2 context
  (entry_from / project_kind / target_platforms / fidelity /
  connectors / etc.), add explicit error_code classification on
  result=failed (run.errorCode → AGENT_SIGNAL_* → AGENT_EXIT_* →
  AGENT_TERMINATED_UNKNOWN), and read device_id from the new
  x-od-analytics-device-id header. Also moves the run_created /
  run_finished emission to the canonical /api/runs handler in
  server.ts; the chat-routes copy was shadowed by Express's earlier
  registration and never executed, which also meant run.clientType
  never made it to Langfuse — fixed in the same move.
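
The error_code precedence in the apps/daemon bullet can be sketched roughly as follows. The `FinishedRun` shape and the `AGENT_SIGNAL_<name>` suffix format are assumptions; `AGENT_EXIT_1` is the only concrete value confirmed by the smoke test in this message.

```typescript
// Hypothetical run shape; only the precedence order is taken from the text above.
interface FinishedRun {
  errorCode?: string | null;
  signal?: string | null; // e.g. 'SIGKILL' (assumed representation)
  exitCode?: number | null;
}

function classifyErrorCode(run: FinishedRun): string {
  // 1. An explicit error code recorded on the run wins.
  if (run.errorCode) return run.errorCode;
  // 2. Otherwise a terminating signal, if one was observed.
  if (run.signal) return 'AGENT_SIGNAL_' + run.signal;
  // 3. Otherwise the process exit code.
  if (typeof run.exitCode === 'number') return 'AGENT_EXIT_' + run.exitCode;
  // 4. Nothing usable was recorded.
  return 'AGENT_TERMINATED_UNKNOWN';
}
```

Ordering the checks from most to least specific keeps the fallback bucket small, so a dashboard grouping by error_code only sees `AGENT_TERMINATED_UNKNOWN` when no signal or exit code survived.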

Verification
- pnpm guard / pnpm typecheck clean for daemon, web, and contracts.
- pnpm --filter @open-design/web test: 1645/1645 passing.
- End-to-end smoke through Playwright + local PostHog ingest project
  420348: every page_view (home/projects/automations/design_systems/
  plugins/integrations/chat_panel/file_manager), every nav element,
  the new_project_modal surface_view + tab + create flow, the
  plugin_replacement_modal surface_view, settings_view across nine
  sections, settings_cli_test_result (codex CLI), the
  project_create_result success path, and run_created + run_finished
  (result=failed, error_code=AGENT_EXIT_1) all reached PostHog with
  the v2 schema and the expected device_id / page_name / area /
  element / fidelity / target_platforms props. The remaining
  *_result events (artifact_export / feedback_submit / file_upload /
  plugin_replacement / settings_byok_test / settings_connector_auth)
  are wired in code; production traffic will trigger them.

* fix(analytics): preserve style category on design-systems surface chip switch

The merge resolution in DesignSystemsTab incorrectly re-introduced a
`setCategory('All')` call alongside the new `trackDesignSystemsTopClick`
emit. main intentionally keeps the active style category when the surface
filter refines within it; the regression was caught by the existing
"keeps the style category when a surface chip refines within it" test
in tests/components/DesignSystemsTab.test.tsx.

* fix(analytics): address review — senseaudio passthrough + daemon-side configure-state

Two follow-ups from the v2 schema review on #2285:

1. `byokProtocolToTracking()` was still falling through to `null` for
   `senseaudio` even though the v2 BYOK provider enum now lists it. Every
   `SettingsDialog` BYOK call site guards on `if (byokProviderId)`, so a
   user on SenseAudio was silently dropping the provider-option,
   field-focus, and test-result captures. Added the missing case so
   SenseAudio gets the same analytics coverage as the other providers.

2. The daemon-authoritative `run_created` / `run_finished` events were
   missing the configure-state triplet (`has_available_configure_cli` /
   `configure_type` / `configure_availability`) that v2 promotes to a
   global register on the web side. Daemon captures don't go through the
   PostHog global register, so dashboards couldn't segment run lifecycle
   by execution setup after the migration.

   The fix derives the triplet server-side from `detectAgents()` and the
   request's `agentId` before `design.analytics.capture(...)`:
     - has_available_configure_cli: any CLI on PATH reports installed
     - configure_type: 'local_cli' when the run targets an installed CLI,
       otherwise 'unknown' (daemon can't see BYOK keys, which live in
       web-client storage)
     - configure_availability: 'available' / 'unavailable' / 'unknown'
       based on the requested agent's install status, with a fallback to
       'available' when any CLI is installed

   This keeps the v2 schema consistent across both daemon-side and
   web-side captures.

* fix(analytics): wire setConfigureGlobals so browser events carry fresh state

Third follow-up from the v2 schema review on #2285. The previous fix
addressed senseaudio + daemon-side configure-state, but reviewer flagged
that `setConfigureGlobals` was still defined-only — no caller — so every
browser-side capture inherited the boot defaults
(`has_available_configure_cli=false`, `configure_type='unknown'`,
`configure_availability='unknown'`). PostHog dashboards therefore could
not segment the new `page_view` / `ui_click` / `surface_view` events by
execution setup after a user configured their environment.

Changes:

- `packages/contracts/src/analytics/events.ts` — add a pure
  `deriveConfigureGlobals(mode, agentId, agents, byokConfigured)` helper
  so the web client and the daemon can derive the triplet from the same
  source of truth. The helper covers all 5 `configure_type` buckets
  (`local_cli` / `byok` / `both` / `none` / `unknown`) and the 3
  `configure_availability` buckets (`available` / `unavailable` /
  `unknown`).
- `apps/web/src/App.tsx` — add a useEffect that re-derives the triplet
  whenever the user changes execution mode, selects a new CLI, saves a
  BYOK key, or the detected-agent list refreshes, then pushes it to
  PostHog via `analytics.setConfigureGlobals(...)`. The setter goes
  through the provider so the analytics module stays the single source
  of truth.
- `apps/web/src/analytics/provider.tsx` — expose
  `setConfigureGlobals` on the analytics context and the test stub so
  consumers route through the provider boundary.
- `apps/daemon/src/server.ts` — switch the daemon-side derive in
  `/api/runs` to the shared `deriveConfigureGlobals` helper so the
  authoritative run_created/run_finished captures match the web-side
  payload. BYOK credentials live in the web client and stay invisible
  to the daemon, so the daemon arm passes `byokConfigured: undefined`
  and falls back to the installed-CLI signal.
- `apps/web/tests/analytics-configure-globals.test.ts` — new regression
  test that pins the derive behavior across all branches and confirms
  the setter actually mutates the client-side store. Locks the wire-up
  so a future refactor can't silently turn the setter back into a
  no-op.

Verification: pnpm guard clean; daemon / web typecheck clean; web tests
1703/1703 passing (up from 1696 — 7 new tests in the configure-globals
suite).

* fix(analytics): emit projects page_view + drop misattributed chat_panel source

Fourth review pass on PR #2285. Two follow-ups from mrcfps:

1. DesignsTab (projects landing) was emitting click events but no
   matching page_view. Opening /projects without clicking anything left
   the surface invisible in PostHog. Added a once-per-mount
   trackPageView({ page_name: 'projects' }) with the same ref-keyed
   pattern HomeView / PluginsView use.

2. ChatComposer was hard-coding source: 'recent_project' on every
   chat_panel page_view. The web router currently only carries
   projectId / conversationId / fileName, so we cannot distinguish a
   New-project launch from a template-pick or a Recent-projects click
   from this layer. A false constant would over-attribute every chat
   launch to 'recent_project' and break the funnel slice this schema
   was meant to unlock. Dropped the field for now — better no source
   than the wrong source — until the router grows a launch-source
   channel; the field is still defined as optional on PageViewProps so
   the channel can land in a follow-up PR.

Verification: web typecheck clean; web tests 1703/1703 passing.

* fix(analytics): correct plugin-replacement async result + heterogeneous upload + missing requestId

Three follow-ups from the fifth review pass on PR #2285:

1. **plugin_replacement_result emitted before the apply settled**
   (`apps/web/src/components/HomeView.tsx`). The modal's confirm action
   was a synchronous wrapper around an async `usePlugin(...)` call, so
   the surrounding try/catch never observed real failures and every
   attempt was reported as `result=success`. Changed
   `PendingReplacement.confirm` to return `Promise<void>`, made the
   wrapper return the underlying promise, and moved the analytics emit
   into an async IIFE in the click handler so the success/failure
   branches reflect the actual outcome.

2. **file_upload_result mis-typed heterogeneous batches**
   (`apps/web/src/components/FileWorkspace.tsx`). The earlier
   implementation only inspected `picked[0]`, so a mixed batch like
   `image.png + demo.mp4` reported `file_type=image`. Per the comment
   above the block ("mixed batches collapse to other"), the
   implementation now maps every file to a tracking type, collapses to
   `other` when more than one distinct type is present, and falls
   back to the single type otherwise.

3. **project_create_result lost the click→result correlation id**
   (`apps/web/src/components/NewProjectPanel.tsx`). The click event
   no longer carried the locally-generated `requestId` that
   `project_create_result` keeps, so the two could not be joined.
   `trackNewProjectModalElementClick()` now accepts an optional
   `{ requestId }`, mirroring the other helpers, and the create-button
   click threads the same id used for the result.
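
For item 2 above, the "mixed batches collapse to other" rule might look like this. The MIME-prefix mapping and the three-value `TrackingFileType` union are assumptions; the commit message only names `image` and `other`.

```typescript
type TrackingFileType = 'image' | 'video' | 'other';

// Assumed mapping from MIME type; the real mapping is not shown in the message.
function toTrackingType(mime: string): TrackingFileType {
  if (mime.indexOf('image/') === 0) return 'image';
  if (mime.indexOf('video/') === 0) return 'video';
  return 'other';
}

function batchFileType(mimes: string[]): TrackingFileType {
  const types = mimes.map(toTrackingType);
  const first = types[0];
  // Empty batch: nothing to classify (an assumption, not stated in the message).
  if (first === undefined) return 'other';
  // One distinct type keeps that type; any mix collapses to 'other'.
  return types.every((t) => t === first) ? first : 'other';
}
```

Inspecting every file instead of `picked[0]` is what makes `image.png + demo.mp4` report `other` rather than `image`.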

Verification: web typecheck clean; web tests 1703/1703 passing.

* fix(analytics): gate configure-state on agents probe + drop unsent run_created fields

Two follow-ups from the sixth review pass on PR #2285:

1. **Cold-start configure-state was stamped before fetchAgents() landed**
   (`apps/web/src/App.tsx`). The useEffect that pushes the v2 triplet
   into the PostHog global register fired on first paint with
   `agents=[]`, so the first home/projects/plugins page_view reported
   `has_available_configure_cli=false` /
   `configure_availability=unavailable` even on machines that did have
   an installed CLI. The effect now waits on `agentsLoading === false`
   and leaves the boot defaults ('unknown'/'unknown') in place until the
   probe resolves.

2. **Daemon read run-context fields the web never sends**
   (`apps/daemon/src/server.ts`). The daemon-side run_created /
   run_finished baseProps read `projectKind`, `entryFrom`,
   `projectSource`, `targetPlatforms`, `companionSurfaces`, `fidelity`,
   `connectors`, `useSpeakerNotes`, `includeAnimations`,
   `referenceTemplate`, and `aspect` from `req.body`, but
   `packages/contracts/src/api/chat.ts` and
   `apps/web/src/providers/daemon.ts` don't carry those keys on the
   wire. Reading them therefore always produced null/undefined.
   Dropped the unsent fields from the daemon capture; a follow-up can
   extend the create payload to thread the real context through. The
   `design_system_id` field stays because the chat contract does send
   it.

Tests: added 3 regression tests in
`tests/analytics-configure-globals.test.ts` covering the boot-time
gating contract (empty agents + daemon mode → unavailable / local_cli;
installed agent → available; undefined agents list → unavailable).
Verification: web typecheck
clean; daemon typecheck clean; web tests 1706/1706 passing (up from
1703 — 3 new cold-start tests).

* fix(analytics): pin mode='daemon' so missing-agent run reports unavailable

Eleventh review pass on PR #2285. mrcfps flagged that
`apps/daemon/src/server.ts` was calling `deriveConfigureGlobals(...)`
without `mode`, so the helper fell through to the generic branch.
Result: a run for an uninstalled agent was tagged
`configure_availability: 'available'` whenever any OTHER CLI was on
PATH, because the generic branch only looks at the cohort-wide
"any installed?" signal. That precisely undermines the slice the
daemon emit is trying to power.

The daemon's /api/runs handler is always a daemon-mode capture
(daemon is the local CLI runner — BYOK lives in the web layer), so we
now pin `mode: 'daemon'` on the call site. The helper then judges
`configure_availability` from the REQUESTED agent's install status and
reports `unavailable` when the user picked an agent that is not
installed, even if peers are.

Added a regression case in `tests/analytics-configure-globals.test.ts`:
`{ mode: 'daemon', agentId: 'codex', agents: [{claude,true},{codex,false}] }`
→ `{ has_available_configure_cli: true, configure_type: 'local_cli',
configure_availability: 'unavailable' }`.
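
A rough sketch of the daemon-mode branch only, consistent with the regression case above. The `DetectedAgent` shape (`id`, `installed`) and the function name are assumptions; the shared `deriveConfigureGlobals` helper also covers other modes that are not reproduced here.

```typescript
// Assumed agent shape; the message abbreviates entries as {claude,true}.
interface DetectedAgent {
  id: string;
  installed: boolean;
}

interface ConfigureGlobals {
  has_available_configure_cli: boolean;
  configure_type: 'local_cli' | 'byok' | 'both' | 'none' | 'unknown';
  configure_availability: 'available' | 'unavailable' | 'unknown';
}

// Hypothetical stand-in for the mode === 'daemon' path of the shared helper.
function deriveDaemonConfigureGlobals(
  agentId: string,
  agents: DetectedAgent[] | undefined,
): ConfigureGlobals {
  const list = agents || [];
  // Cohort-wide signal: is any CLI installed at all?
  const anyInstalled = list.some((a) => a.installed);
  // Per-run signal: is the REQUESTED agent installed?
  const requestedInstalled = list.some((a) => a.id === agentId && a.installed);
  return {
    has_available_configure_cli: anyInstalled,
    configure_type: 'local_cli',
    configure_availability: requestedInstalled ? 'available' : 'unavailable',
  };
}
```

Keeping the two signals separate is the point of the fix: an installed peer CLI still sets `has_available_configure_cli`, but it no longer makes a run for an uninstalled agent look available.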

Verification: daemon typecheck clean; web tests 1707/1707 passing
(up from 1706 — 1 new regression test).

* fix(analytics): hoist chat_panel page_view + thread requestId

- Move chat_panel page_view emit from ChatComposer to ProjectView so
  it survives activeConversationId-driven ChatPane remounts. ProjectView
  keys the dedupe ref by project.id; the composer drops its duplicate.
- Thread { requestId } into trackAssistantFeedbackReasonSubmitClick so
  the click pairs with the existing feedback_submit_result on the same
  request id (mirrors the trackNewProjectModalElementClick pattern).

* fix(analytics): keep v2 super-props alive across reset and stamp design_system_source

- Snapshot the register payload in client.ts on PostHog init and
  re-register it from applyConsent(true) and applyIdentity() so a
  privacy-toggle or Delete-my-data rotation does not resume capture
  without event_schema_version / device_id / session_id / locale /
  configure-state globals. setConfigureGlobals() also patches the
  cache so a later restore picks up the current configure state.
- Stamp design_system_source on daemon-side run_created / run_finished
  (it is required by RunCreatedProps / RunFinishedProps). Daemon
  can't tell default vs user_selected vs inherited from the wire, so
  it derives 'unknown' when designSystemId is present, 'not_applicable'
  otherwise — a follow-up that threads designSystemSource through
  CreateRunRequest can replace this with the precise source.
2026-05-20 13:04:20 +08:00
kami
59a9867cf3
fix(daemon): surface discovery form answers to agents (#2071) 2026-05-20 10:58:51 +08:00
lefarcen
80d305858b
feat(diagnostics): add one-click log export from Settings → About (#798)
* feat(diagnostics): add one-click log export from Settings → About

Adds a new "Export diagnostics" entry under the About section that bundles
daemon/web/desktop logs, machine info, and recent macOS crash reports into
a zip the user can share when reporting issues.

- Browser hits a new daemon HTTP endpoint and triggers a download.
- Electron uses an IPC bridge with the native save dialog and reveals the
  saved file in Finder/Explorer; the Help menu also exposes it as a
  fallback when the daemon is unresponsive.

Packaging + redaction lives in a new @open-design/diagnostics package so
both surfaces share it. Sensitive JSON keys, URL query secrets, and the
current user's home path are redacted before packaging.

* build(nix): include packages/diagnostics in daemon build targets

The Nix daemon derivation builds workspace siblings in dependency order
before compiling apps/daemon. Without @open-design/diagnostics in that
list, the daemon TypeScript build fails inside the Nix sandbox with
`Cannot find module '@open-design/diagnostics'` because pnpm install
only creates the symlink — the dist output that the package.json
exports point at isn't produced until each sibling's build script runs.

* build(tools-pack): include @open-design/diagnostics in packaged INTERNAL_PACKAGES

Without this, packaged win/mac/linux builds fail with `npm error 404` when
the post-build `npm install --omit=dev --no-package-lock` step in the
assembled app tries to resolve `@open-design/diagnostics@0.2.0` from the
public npm registry. The package is workspace-private, so it has to be
tarballed via `pnpm pack` and file:-referenced from the assembled
package.json like every other internal workspace dep that daemon/desktop
depend on.

Also wires the package's `pnpm --filter ... build` into the pre-pack
workspace build step so the dist/ exists before pnpm pack runs, and
updates the two test fixtures (`win-app.test.ts`, `workspace-build.test.ts`)
that mirror INTERNAL_PACKAGES.

The diagnostics package itself is repinned to exact dependency versions
already used elsewhere in the workspace (`jszip 3.10.1`, `@types/node
20.19.39`, `esbuild 0.28.0`, `typescript 5.9.3`, `vitest 4.1.6`) so it
passes the new `pnpm guard` exact-version rule and produces a minimal
lockfile diff vs main (additions only, no resolution-string churn).

* fix(diagnostics): include `~` in bearer-token redaction char class

RFC 6750's `b64token` syntax (the same character set as `token68` in
RFC 7235) allows `~`, so tokens like `Authorization: Bearer abcd~efgh`
were only partially matched by `HTTP_AUTH_SCHEME_RE`. The
regex stopped at the first `~`, leaving the tail (`~efgh`) un-redacted in
the exported diagnostics zip — a clear leak since this feature explicitly
generates support bundles for external sharing.

Add `~` to the character class and a regression test.
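
To illustrate the leak and the fix, here is a small sketch. `BEARER_TOKEN_RE` and `redactBearer` are hypothetical names; the real `HTTP_AUTH_SCHEME_RE` pattern is not shown in this message and may differ.

```typescript
// Hypothetical pattern. The character class covers letters, digits,
// '-', '.', '_', '~', '+', '/', followed by optional '=' padding.
const BEARER_TOKEN_RE = /\bBearer\s+[A-Za-z0-9\-._~+\/]+=*/g;

// Same pattern without '~', to show how the tail used to leak.
const BEARER_TOKEN_RE_WITHOUT_TILDE = /\bBearer\s+[A-Za-z0-9\-._+\/]+=*/g;

function redactBearer(text: string, pattern: RegExp = BEARER_TOKEN_RE): string {
  return text.replace(pattern, 'Bearer [REDACTED]');
}
```

With the narrower class, the match stops at the first `~` and the remainder of the token stays in the output, which is the regression the added test guards against.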

* fix(diagnostics): only collect renderer.log from desktop

`buildSidecarLogSources` unconditionally added `logs/${app}/renderer.log`
for daemon/web/desktop, but only the desktop runtime writes a renderer
log (see apps/desktop/src/main/runtime.ts) — daemon and web are pure
Node services with no Electron renderer. Every export therefore produced
missing-file placeholders and manifest warnings for the two phantom
paths, polluting the bundle.

Gate the renderer.log source on APP_KEYS.DESKTOP so the daemon-side
collector matches the desktop-side collector in
apps/desktop/src/main/diagnostics.ts:63.

* fix(diagnostics): mirror desktop-side renderer.log gate

The previous fix only updated the daemon-side `buildSidecarLogSources`
in `apps/daemon/src/diagnostics-export.ts`. The desktop-side collector
at `apps/desktop/src/main/diagnostics.ts` had an identical copy of the
same bug that I overlooked: it also unconditionally added
`logs/${appKey}/renderer.log` for daemon/web/desktop, producing
missing-file placeholders + manifest warnings for the two phantom paths
on every desktop-initiated export.

Apply the same `appKey === APP_KEYS.DESKTOP` gate here so both export
entry points (browser via daemon HTTP, Electron via native save dialog)
emit the same clean manifest.

* feat(diagnostics): add `od diagnostics export` CLI subcommand

AGENTS.md's dual-track capability-exposure contract requires every
user-facing feature to ship on both the web UI and the `od` CLI. The
diagnostics export was only reachable through Settings → About and the
desktop Help menu; this commit closes the loop with an `od diagnostics
export [<path>] [--json]` subcommand registered in SUBCOMMAND_MAP.

The CLI is a thin shell over the existing GET /api/diagnostics/export
endpoint — same zip output, same redaction, same crash-report scope.
Defaults to writing `open-design-diagnostics-<timestamp>.zip` in the
current directory; `--output <path>` or a positional arg overrides.
`--json` prints `{path, sizeBytes}` for shell pipelines.

Use cases this unlocks:
- A CI script can `od diagnostics export ~/artifacts/bundle.zip` after
  a failed run.
- Bug reporters on headless boxes can grab a bundle without booting
  the web UI.
- `od doctor` follow-ups can collect a full snapshot when a probe fails.

* fix(diagnostics): surface non-sidecar launch in manifest warnings

`buildSidecarLogSources()` returns `[]` when the daemon has no sidecar
runtime context, which is the standard `od` (plain) launch path —
`runDaemonCliStartup()` -> `startDaemonRuntime()` does not pass a
runtime. Settings → About and the new `od diagnostics export` previously
reported success but produced a bundle with only the summary JSONs, so
operators could not tell "no logs because plain launch" from "no logs
because something genuinely broke."

- Extend `DiagnosticsContext` with an optional upstream `warnings:
  string[]` that `buildManifest` merges into the manifest warnings.
- Emit STANDALONE_LAUNCH_WARNING from the daemon handler when
  `options.runtime == null`. The warning names the limitation and
  points the user at the sidecar entry points that DO capture logs.
- Add a regression spec at `apps/daemon/tests/diagnostics-export.test.ts`
  that drives the handler with `runtime: null` and asserts the warning
  surfaces in `summary/manifest.json` (and that `files` is empty so a
  user reading the bundle does not confuse "no log sources" with
  "missing files").
2026-05-20 09:10:51 +08:00
mzl163
210b94069a
feat(senseaudio): BYOK chat with image + video generation tools (#2065)
* feat(senseaudio): BYOK chat with image + video generation tools

Adds SenseAudio as a first-class BYOK chat protocol and wires the daemon's
chat proxy with a tool loop so BYOK users can generate images and videos
without dropping to a CLI agent.

- BYOK protocol: new senseaudio tab + /api/proxy/senseaudio/stream route +
  connection-test + provider-models discovery (OpenAI-compatible wire)
- Tool loop: generate_image (synchronous /v1/image/sync) and generate_video
  (async /v1/video/create + 5s polling /v1/video/status, 10-min ceiling,
  periodic progress log every 30s)
- Settings dropdown + chat-composer dropdown for the BYOK image model
  default; generate_image's model enum lets the LLM override per call
- Seed-on-success: a successful BYOK chat call idempotently mirrors the
  key into media-config (preserves env-resolved + already-stored keys)
- Generated artifacts land in <projectsRoot>/<projectId>/ so FileViewer,
  DesignFilesPanel, and project export pick them up automatically;
  legacy /api/byok-image/:id route kept for old conversation links
- Markdown renderer learns ![alt](url) image syntax with a scheme
  allowlist (http(s) / data:image/ / blob: / relative paths)
- i18n key settings.byokImageModel across all 19 locales
- 3 SenseAudio image models registered (2.0, 1.0, doubao-seedream-5.0);
  1 video model (doubao-seedance-2.0)
- Tests: byok-tools (29), media-senseaudio-image (8), media-config seed
  (7), proxy-routes (47), markdown image rendering (8)

* fix(senseaudio): unblock image gen + design file preview switching

- SenseAudio /v1/image/sync rejected the previous size mapping with
  `参数错误:size` ("parameter error: size"): 1664x936, 936x1664, 1280x960,
  and 960x1280 are not in the gateway's accepted set. Switched to standard HD / SD sizes that
  every aspect bucket can hit: 1024×1024, 1280×720, 720×1280,
  1024×768, 768×1024. Kept the byok-tools and media.ts tables in sync
  so the BYOK chat tool and the CLI agent path both stop failing on
  non-square aspects.

- DesignFilesPanel's <DfPreview> was missing a key prop, so React
  reused the same iframe DOM node when the user picked a different
  file — the src prop changed but the iframe never navigated. Added
  key={previewFile.name} so the previous preview unmounts cleanly.

- Updated byok-tools + media-senseaudio-image tests for the new size
  expectations.
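
The size mapping from the first bullet could be kept in one shared table, sketched below. The aspect-bucket keys and the `WxH` string format are assumptions (the rejected sizes above are written the same way); only the five pixel sizes come from the commit message.

```typescript
// Assumed bucket names; the message lists the sizes but not the keys.
type AspectBucket = '1:1' | '16:9' | '9:16' | '4:3' | '3:4';

const SENSEAUDIO_IMAGE_SIZES: Record<AspectBucket, string> = {
  '1:1': '1024x1024',
  '16:9': '1280x720',
  '9:16': '720x1280',
  '4:3': '1024x768',
  '3:4': '768x1024',
};

function sizeForAspect(aspect: AspectBucket): string {
  return SENSEAUDIO_IMAGE_SIZES[aspect];
}
```

Importing a single table from both the chat-tool path and the CLI media path would avoid the two copies drifting apart again.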

* docs(senseaudio): clear stale provider hint + update README

- Settings → Media → SenseAudio: clear the auto-promoted
  "Image · TTS · 70+ voices · clone" hint; the provider label alone is
  enough now that the BYOK chat surface covers image + video tooling.
- README: list the new senseaudio (and missing ollama) proxy routes so
  the BYOK section reflects what the daemon actually serves, and
  mention the generate_image / generate_video chat tools that ship
  with the SenseAudio path.

* fix(senseaudio): address PR #2065 review feedback

Three non-blocking review notes from @PerishCode on PR #2065:

1. Drop the dead /api/byok-image/:id route. The PR description claimed
   it was "legacy fallback for old chat history" but that storage
   layout never existed on main, so the route can only ever 400 or
   404 — never 200. Removed the handler, the isSafeByokImageId
   export, the unused createReadStream / stat / path / Request /
   Response imports, and the two byok-image regression tests.

2. Add rejectProxyPluginContext guard to the senseaudio proxy
   handler so it matches the invariant the other five proxy paths
   already enforce (plugin runs must go through /api/runs for
   snapshot pinning). Extended the existing "API fallback rejects
   plugin runs" describe to also cover /api/proxy/senseaudio/stream
   with the 409 PLUGIN_REQUIRES_DAEMON expectation.

3. Wrap the secondary image / video downloads (the URLs the
   SenseAudio gateway hands back in /v1/image/sync .url and
   /v1/video/status .video_url) in validateBaseUrlResolved so a
   malicious gateway can't point us at 169.254.169.254 (AWS / Azure
   metadata) or RFC1918 hosts via the response payload. Also passed
   `redirect: 'error'` on both fetches to match the SSRF posture
   the primary proxy fetch already uses. The new
   assertExternalAssetUrl helper lives next to executeGenerateImage
   so future tool downloads can reuse it.
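
For item 3 above, the address ranges involved can be sketched as a literal-IPv4 check. This is not the real `assertExternalAssetUrl`: that helper goes through `validateBaseUrlResolved`, which is not reproduced here, and the function name below is hypothetical.

```typescript
// Returns true when `host` is a literal IPv4 address in an RFC 1918 range
// or in 169.254.0.0/16 (link-local, which contains the metadata address).
// Hostnames and other input return false and would still need resolving.
function isPrivateOrLinkLocalIPv4(host: string): boolean {
  const parts = host.split('.');
  if (parts.length !== 4) return false;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return false;
    const n = Number(part);
    if (n > 255) return false;
    octets.push(n);
  }
  const [a = -1, b = -1] = octets;
  if (a === 10) return true; // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  if (a === 192 && b === 168) return true; // 192.168.0.0/16
  if (a === 169 && b === 254) return true; // 169.254.0.0/16
  return false;
}
```

A literal check like this is only the last step: a hostname that resolves to one of these ranges has to be caught after DNS resolution, and `redirect: 'error'` covers the case where a public URL redirects into private space.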

Tests: 120/120 daemon tests pass; guard + typecheck green.

* fix(senseaudio): mirror SSRF guard onto renderSenseAudioImage CLI path

Follow-up to 01b1260a — the chat-tool fix in byok-tools.ts wasn't
mirrored onto the parallel renderSenseAudioImage path in media.ts.
Same attacker-controllable shape (gateway-returned `data.url`),
same one-line fix.

- Hoist assertExternalAssetUrl from byok-tools.ts into
  connectionTest.ts next to validateBaseUrlResolved so both call
  sites (the BYOK chat tool loop AND the CLI agent media dispatcher)
  share one helper. Made the error strings provider-agnostic so a
  future caller doesn't get a misleading "senseaudio" attribution
  for a Volcengine / Grok / etc. download.
- renderSenseAudioImage now runs the response url through
  assertExternalAssetUrl before fetching bytes, and passes
  redirect: 'error' to block a 3xx hop into private space.

Scope intentionally limited to the senseaudio path PerishCode
flagged; the other unguarded fetch(entry.url) call sites in
media.ts (OpenAI / Volcengine / Grok / Nano-Banana) are pre-existing
patterns and belong in a separate follow-up if the daemon wants
defense-in-depth across every provider.

Tests: 127/127 daemon tests pass; guard + typecheck green.

---------

Co-authored-by: unknown <mazeliang@sensetime.com>
2026-05-19 23:14:56 +08:00
Eli
e94663bfbd
Add connector memory extraction flow (#2265) 2026-05-19 21:27:41 +08:00
Jiannanya
555bc5e7ed
fix(daemon): ensure node binary dir is on PATH for agent sub-processes on Windows (#1989)
* fix(daemon): ensure node binary dir is on PATH for agent sub-processes on Windows

When pnpm tools-dev launches the daemon via a full path to node.exe the
nodejs directory may not appear in process.env.PATH. Agent .cmd shims on
Windows (e.g. the npm copilot shim) call `"node" script.js`, so cmd.exe
fails to find node and the run exits immediately with:

  '"node"' is not recognized as an internal or external command

Fix: in createAgentRuntimeEnv, prepend path.dirname(nodeBin) to PATH
when that directory is not already present. This is a no-op on systems
where the node directory is already in PATH.

Also extend the stdin 'error' swallow to cover the Windows equivalent of
EPIPE: err.code === 'EOF' / err.message === 'write EOF' (libuv UV_EOF).
These fire when the child process exits before reading stdin and were
incorrectly surfaced to the user as AGENT_EXECUTION_FAILED.
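
The stdin error filter described above amounts to a small predicate. The `StdinWriteError` shape and the function name are illustrative; the `EPIPE`, `EOF`, and `write EOF` values come from the commit message.

```typescript
// Minimal error shape for illustration; real Node errors carry more fields.
interface StdinWriteError {
  code?: string;
  message?: string;
}

// True when the error only means "the child exited before reading stdin".
function isBenignStdinError(err: StdinWriteError): boolean {
  if (err.code === 'EPIPE') return true; // POSIX form
  return err.code === 'EOF' || err.message === 'write EOF'; // Windows form
}
```

Anything the predicate rejects should still surface as a real failure, so the swallow stays narrow.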

Affected file: apps/daemon/src/server.ts

* fix(daemon): anchor PATH-prepend test assertions and add coverage for both branches

- Update the existing 'injects daemon URL and run-scoped tool token' test to
  expect PATH = `/opt/open-design/bin${path.delimiter}/bin` after the prepend,
  matching the behaviour introduced in the Windows PATH fix.

- Add 'prepends node binary directory to PATH when not already present': given
  PATH='/bin' and nodeBin='/opt/node/node', asserts PATH becomes
  `/opt/node${path.delimiter}/bin`.

- Add 'does not duplicate node binary directory when already present in PATH':
  given PATH already contains the bin dir, asserts it is unchanged (idempotent).

- Replace parts.includes(nodeBinDir) with a normalised some() predicate that
  strips trailing path separators and performs a case-insensitive comparison on
  win32, producing true idempotence when PATH contains the same directory with a
  different case or trailing slash.
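
The normalized, idempotent prepend can be sketched as a pure function. The delimiter and case-sensitivity are passed in explicitly so the sketch runs the same on every platform; the function names and parameters are assumptions, not the daemon's actual signature.

```typescript
// Strip trailing separators and, where requested, fold case before comparing.
function normalizeDir(dir: string, caseInsensitive: boolean): string {
  const trimmed = dir.replace(/[\\/]+$/, '');
  return caseInsensitive ? trimmed.toLowerCase() : trimmed;
}

function prependNodeBinDir(
  pathValue: string,
  nodeBinDir: string,
  delimiter: string, // ':' on POSIX, ';' on Windows
  caseInsensitive: boolean, // true on win32
): string {
  if (nodeBinDir === '') return pathValue; // nothing to add
  const target = normalizeDir(nodeBinDir, caseInsensitive);
  const parts = pathValue.split(delimiter).filter((p) => p !== '');
  const present = parts.some((p) => normalizeDir(p, caseInsensitive) === target);
  if (present) return pathValue; // idempotent: leave PATH unchanged
  return parts.length > 0 ? nodeBinDir + delimiter + pathValue : nodeBinDir;
}
```

For example, `prependNodeBinDir('/bin', '/opt/node', ':', false)` yields `/opt/node:/bin`, matching the expectation in the second bullet, while a PATH that already holds the directory with a trailing slash or different case is returned untouched.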

* case-insensitive key path finding

* fix Case-insensitive key lookup and add test

* fix(daemon): inject Node binary dir into all agent launch/probe envs on Windows

applyAgentLaunchEnv() is the shared terminal step for every agent spawn
path — detectAgents(), connection tests, and chat runs.  Add an optional
`nodeBinDir` parameter (default: path.dirname(process.execPath)) so the
running Node directory is prepended to PATH at every call site without
requiring changes to detection.ts or connectionTest.ts.

Add two regression tests: a cross-platform test covering the Windows-style
'Path' key-casing fix, and a Windows-only test using proper C:\ paths and
';' delimiter verifying the full end-to-end Windows behavior.

* test(daemon): fix applyAgentLaunchEnv tests broken by nodeBinDir default

The 4 tests added in agent-runtime-env.test.ts used the 2-argument form of
applyAgentLaunchEnv, picking up path.dirname(process.execPath) as the new
default and making assertions non-deterministic across CI runners.  Also,
three of the fixtures mixed Windows C:\...;... paths with POSIX path.delimiter
(':'), which silently produced wrong splits on POSIX runners.

Pass '' as nodeBinDir to all 4 tests so the early-return and path-prepend
logic is exercised in isolation.  Replace Windows-style fixtures with
POSIX paths + path.delimiter; the real C:\... end-to-end shape is already
covered by winTest in tests/runtimes/launch.test.ts.
2026-05-19 16:58:15 +08:00
chaoxiaoche
6a08dfe111
Add design system package quality guard (#2224)
* Add design system import manifest schema

* Generate hybrid design system imports

* Read design system usage and cached manifests

* Add design system pull-file tool

* Show design system package evidence

* Wire design system import semantics

* Add design system package quality guard

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-19 16:53:29 +08:00
Tom Huang
86ec951fb9
[codex] Add automation templates and proposal workflows (#2193)
* feat(web): introduce Automations tab with dual-track capability for routines

This commit adds a new Automations tab that consolidates routines, schedules, and live artifacts, allowing users to manage automations seamlessly. The tab features a modal for creating and editing automations, which supports various scheduling options (hourly, daily, weekdays, weekly) and project modes (create_each_run, reuse). The CLI is also updated to expose automation commands, ensuring consistency between the web UI and CLI interfaces.

Key changes include:
- New `NewAutomationModal` component for automation creation and editing.
- Updated `TasksView` to integrate the new Automations functionality.
- Enhanced styling for the Automations tab to improve user experience.

This implementation aligns with the dual-track capability exposure policy, ensuring all features are accessible via both the web UI and CLI.

* feat(daemon): enhance automation context handling and CLI commands

This commit introduces several improvements to the automation context management and updates the CLI commands accordingly. Key changes include:

- Added support for new context fields (`plugin`, `mcp`, `connector`) in automation commands.
- Updated the CLI to reflect new target options (`new-project`).
- Enhanced error messages for invalid target inputs.
- Introduced functions to handle context selection and normalization for routines, including the ability to parse and store context data in the database.
- Updated the database schema to include a new `context_json` field for routines.
- Improved the handling of context in routine routes and the web interface, ensuring that selected contexts are properly managed and displayed.

These changes aim to provide a more robust and flexible automation experience, aligning with the recent enhancements in the web UI.

* feat(web): enhance TasksView with automation run history and status indicators

This commit introduces several new features to the TasksView component, including:

- Added functionality to display automation run history for each routine, showing metadata such as status, timestamps, and project details.
- Implemented status indicators for routine runs, providing visual feedback on their current state (succeeded, failed, running, queued).
- Enhanced the UI to allow users to expand and view detailed run history, including the ability to open the corresponding project conversation.
- Updated styles to improve the presentation of automation statuses and history.

These changes aim to provide users with better insights into their automation routines and improve overall usability.

* feat(daemon): implement automation ingestion and proposal management

This commit introduces several new features related to automation ingestion and proposal management within the daemon. Key changes include:

- Added new modules for handling automation source packets and proposals, allowing for the storage, retrieval, and management of automation-related data.
- Implemented functions to list, create, and apply automation proposals, enhancing the automation workflow.
- Introduced new CLI commands for interacting with memory entries and automation sources, providing users with more control over their automation processes.
- Enhanced the server routes to support automation source and proposal APIs, enabling seamless integration with the existing system.

These changes aim to improve the overall automation experience, making it easier for users to manage and utilize automation proposals and ingestions effectively.
2026-05-19 16:35:28 +08:00
Eli
18b947c25f
[codex] Land design system GitHub intake handoff (#2187)
* Add Claude-style design system workflow

* Merge design system workflow into main

* Restore design system workflow UI styles

* Fix design system setup scrolling

* Fix design system setup connector button

* Preserve connector auth link after popup block

* Simplify connected GitHub setup state

* Open generated design system workspace project

* Summarize design system auto prompt in chat

* Add bounded GitHub connector design intake

* Prefer path-scoped GitHub intake tools

* Restore branch GitHub design context intake

* Restore design system review workspace

* Restore design system manager tab

* Let design system workflow routes own details

* Open editable design systems as projects

* Restore design system workspace coverage

* Fix bounded GitHub connector intake

* Hide design system review while generating

* Suppress design system generation questions

* Constrain GitHub design intake to bounded command

* Tolerate oversized GitHub metadata during intake

* Rebuild daemon CLI when sources change

* Fallback when GitHub connector snapshots are rate limited

* Allow GitHub intake without Composio

* Use native GitHub auth for design intake

* Remove design system review group heading

* Improve design system extraction evidence

* Align design system scaffold with Claude output

* Add evidence inventory for design system intake

* Add local design system evidence intake

* Add design system package audit gate

* Allow auditing Claude Design reference packages

* Audit design system package content quality

* Migrate legacy design system artifacts

* Clean migrated design system artifacts

* Require modular design system UI kits

* Reject thin design system UI kits

* Prioritize core design evidence intake

* Require role-based design system UI kits

* Clean stale design system manifest references

* Require representative preserved design assets

* Warn on generic design system visuals

* Enforce design system quality warnings

* Audit connected design system UI kits

* Require mounted design system UI kits

* Require composed design system app shells

* Require runnable JSX design system kits

* Require browser globals for design system components

* Infer design system names from source URLs

* Require source examples in design system packages

* Bind preserved fonts in design system tokens

* Require skill frontmatter in design system packages

* Preserve build icons in design system packages

* Require real assets in brand previews

* Require substantive source examples

* Require product overview in design system README

* Require reusable UI kit README

* Require reusable design system skill docs

* Seed Claude-style UI kit entry contract

* Preserve runtime build assets in design packages

* Audit design system packages after generation

* Audit design system first-run output

* Audit source-backed preview cards

* Align design system UI kit scaffolds

* Materialize design evidence package artifacts

* Show project chat during design system setup

* Hand off design system setup to project chat

* Auto-repair design system audit failures

* Harden design system evidence preservation

* Tighten design system package guidance

* Add targeted design system repair guidance

* Bound design system audit auto repair

* Use connector statuses in design system setup

* Audit design system preview manifests

* Require README preview manifests for design systems

* Fix design system GitHub intake handoff

* Fix daemon prompt CI assertions
2026-05-19 14:30:17 +08:00
kami
a83cfe9a0c
fix(daemon): preserve Windows chat attachments (#2158)
2026-05-19 12:01:33 +08:00
Joey-nexu
56988e406c
feat: integrate xAI SuperGrok subscription as a credential source for Grok media + X search (#2134)
* feat(daemon): add xAI OAuth client with PKCE + token storage

Wraps mcp-oauth.ts PKCE primitives for xAI's auth.x.ai OAuth server.
xAI doesn't speak MCP and doesn't expose Dynamic Client Registration,
so issuer / endpoints / client_id / scope / loopback :56121 are
hardcoded constants.

Adds xai-tokens.ts for persistent storage, mirroring mcp-tokens.ts:
atomic write + chmod 0600 + per-dataDir in-memory mutex. Simplified
for the single-token case (no per-server-id map).

Reference: NousResearch/hermes-agent hermes_cli/auth.py:93-100.
PoC reuses Hermes client_id (b1a00492-...); replace before stable
release once Open Design has its own.

Tests: 11 + 20, all green. tsc --noEmit clean. pnpm guard clean.
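
For readers unfamiliar with PKCE, the primitives the client wraps can be sketched as follows. This assumes the S256 method from RFC 7636; the function names are hypothetical and are not the ones exported by mcp-oauth.ts.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical helper: 32 random bytes encode to 43 base64url characters,
// the minimum verifier length RFC 7636 allows.
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// Hypothetical helper: S256 challenge = BASE64URL(SHA-256(ASCII(verifier))),
// without padding.
function createCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

const verifier = createCodeVerifier();
const challenge = createCodeChallenge(verifier);
console.log({ verifier, challenge });
```

The challenge goes into the authorize URL, and the verifier is sent later with the token request so the server can confirm both came from the same client.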

* feat(daemon): expose xAI Grok models in Hermes runtime fallbackModels

Lists grok-4.3, grok-4.20-reasoning, grok-4.20-non-reasoning, and
grok-4.20-multi-agent-0309 as discoverable Hermes fallback models.
A user who has not installed Hermes yet now sees these xAI options
in the model picker, signalling that `hermes auth add xai-oauth`
(SuperGrok subscription) or XAI_API_KEY unlocks Grok in Open Design
without OD itself implementing OAuth-for-chat.

`fetchModels` (which calls `hermes acp` to enumerate the user's
actually-installed providers) is unchanged; this list only kicks
in when probing fails (e.g. Hermes off PATH).

Reference: xAI × Nous Research grok-hermes integration announcement,
2026-05-15. https://x.ai/news/grok-hermes

* feat(media): route Grok Imagine through xAI OAuth credentials

Adds resolveXAIBearer() — a refresh-aware helper on top of the
xai-tokens.json store written by the daemon's OAuth client. Returns
a fresh access_token, transparently refreshing in-place when the
stored token enters the 120 s expiry skew window.

Wires it into media-config.ts so the existing Grok provider gets the
same OAuth-fallback treatment OD already gives the OpenAI provider:
env keys win, then stored Settings keys, then OD-native xAI OAuth,
then a borrowed Hermes-side xai-oauth token from ~/.hermes/auth.json.
SuperGrok subscribers who already authorized Hermes get OD image /
video generation routed through their subscription with zero extra
setup.
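
The skew check and the precedence order above can be sketched roughly as below. Only the 120 s window and the ordering come from this commit; the type names, field names, and the millisecond `expires_at` are assumptions made for illustration.

```typescript
// Hypothetical token shape; expires_at is assumed to be epoch milliseconds.
interface StoredToken {
  access_token: string;
  refresh_token?: string;
  expires_at: number;
}

const EXPIRY_SKEW_MS = 120_000; // the 120 s skew window

// True once the token is inside the skew window (or already expired).
function needsRefresh(token: StoredToken, now: number): boolean {
  return token.expires_at - now <= EXPIRY_SKEW_MS;
}

type CredentialSource = "env" | "settings" | "od-oauth" | "hermes-oauth";

// Hypothetical bag of whatever each source managed to produce.
interface Candidates {
  env?: string;
  settings?: string;
  odOAuth?: string;
  hermesOAuth?: string;
}

// Precedence: env keys, then stored Settings keys, then OD-native OAuth,
// then a borrowed Hermes token.
function pickCredential(c: Candidates): { source: CredentialSource; bearer: string } | null {
  if (c.env) return { source: "env", bearer: c.env };
  if (c.settings) return { source: "settings", bearer: c.settings };
  if (c.odOAuth) return { source: "od-oauth", bearer: c.odOAuth };
  if (c.hermesOAuth) return { source: "hermes-oauth", bearer: c.hermesOAuth };
  return null;
}
```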

Updates the "no xAI API key" error in renderGrokImage / renderGrokVideo
to point at the new OAuth path so users hitting it know they have a
zero-cost option.

Also exposes mediaConfigDir() so credential helpers next to
media-config.json (like xai-tokens.json) reuse the same precedence:
OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od.
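
As a rough sketch of that precedence (the function name and signature here are hypothetical, not the exported `mediaConfigDir()` itself):

```typescript
import * as path from "node:path";

// Hypothetical stand-in: first non-empty value wins, in the order
// OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od.
function resolveMediaConfigDir(
  env: Record<string, string | undefined>,
  projectRoot: string,
): string {
  return env["OD_MEDIA_CONFIG_DIR"] || env["OD_DATA_DIR"] || path.join(projectRoot, ".od");
}

console.log(resolveMediaConfigDir({ OD_DATA_DIR: "/data" }, "/repo"));
```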

Tests: 7 new xai-credentials cases (refresh on expiry, refresh
failure, missing refresh_token, response without refresh_token) +
8 new media-config Grok OAuth fallback cases (OD-native, Hermes
borrow, OD vs Hermes precedence, env precedence, stored precedence,
unconfigured, expired-without-refresh). All green; tsc / guard clean.

* feat(media): add xAI Grok TTS provider

Registers grok-tts in the speech model catalog and wires up
renderXAITTS to dispatch (provider=grok, surface=audio, kind=speech)
to https://api.x.ai/v1/tts. xAI exposes a dedicated /tts endpoint
that returns raw audio bytes — distinct from OpenAI's /audio/speech
JSON shape — so TTS gets its own renderer rather than reusing
renderOpenAISpeech.

Credentials route through the same OAuth-aware path as Grok image
and video (PR follow-up to media-config.ts), so a SuperGrok
subscriber gets TTS for free once they have authorized once.

Default request body matches the documented minimal shape
(text / voice_id / language); sample_rate / bit_rate / codec are
left unset so the server applies its mp3 / 24 kHz / 128 kbps
defaults. Plumbing for explicit overrides is left for a later PR
once the agent-facing contract grows the corresponding flags.
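
A minimal sketch of that request shape follows. The endpoint and the three body fields come from this commit; the helper name, the header set, and the sample voice and language values are assumptions, not the renderer's actual code.

```typescript
// Endpoint taken from the commit message above.
const XAI_TTS_URL = "https://api.x.ai/v1/tts";

interface TtsHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical builder; a bearer-token Authorization header and a JSON
// content type are assumed.
function buildTtsRequest(
  bearer: string,
  text: string,
  voiceId: string,
  language: string,
): TtsHttpRequest {
  return {
    url: XAI_TTS_URL,
    method: "POST",
    headers: {
      Authorization: `Bearer ${bearer}`,
      "Content-Type": "application/json",
    },
    // sample_rate / bit_rate / codec are left out so the server defaults apply.
    body: JSON.stringify({ text, voice_id: voiceId, language }),
  };
}

// Placeholder values, for illustration only.
const request = buildTtsRequest("example-token", "Hello from Open Design", "example-voice", "en");
console.log(request.body);
```

Because the endpoint returns raw audio bytes rather than JSON, a caller would read the response as a binary buffer instead of parsing it.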

Tests: 5 cases covering documented body shape, voice / language
override, env-key fallback, server-error surfacing, and the
no-credentials error. All green; tsc / guard clean.

Reference: https://docs.x.ai/developers/model-capabilities/audio/text-to-speech

* feat(daemon, web): expose xAI OAuth flow in Settings UI

Closes the loop on the Grok integration: a SuperGrok subscriber can
now authorize Open Design directly from Settings → Media Providers →
Grok, with no API key and no Hermes install. After authorizing, image,
video, and TTS routes pick up the bearer through the OAuth fallback
chain added in 'route Grok Imagine through xAI OAuth credentials'.

Daemon side
- xai-oauth-server.ts opens a one-shot HTTP listener on
  127.0.0.1:56121 to receive the OAuth callback. The redirect URI is
  hard-locked to that port because the PoC reuses the Hermes-issued
  client_id. Listener self-closes on first matching callback or after
  a 30 min timeout.
- xai-routes.ts wires three endpoints onto the daemon's HTTP app:
    POST /api/xai/oauth/start       — mint state, open listener,
                                       return authorize URL
    GET  /api/xai/auth/status       — has-token / expiry / in-flight
    POST /api/xai/oauth/disconnect  — wipe stored token, stop listener
- server.ts registers xai-routes alongside the existing mcp-routes.

Web side
- XaiOAuthControl.tsx renders a Sign in / Reconnect / Disconnect
  surface mirroring McpOAuthControl, but polls /api/xai/auth/status
  exclusively because the :56121 callback page lives in a separate
  process and can't postMessage back to the OD UI. SettingsDialog
  embeds it inside the Grok provider row.

Tests: 9 listener cases (bind / state mismatch / replay / favicon /
EADDRINUSE / timeout / explicit error param / one-shot consume /
early stop) + 8 route cases (start mints PKCE URL, second start
replaces in-flight listener, status reports listening + connected,
callback ok stores token, callback error skips storage, disconnect
wipes, cross-origin guard rejects all three endpoints). All 17 +
the 74 from prior commits pass; tsc / web typecheck / pnpm guard
clean.

PoC client_id stays Hermes-issued; user-visible strings are
hardcoded English pending an i18n pass before stable.

* fix(daemon, web): xAI OAuth follow-up — paste-back, X search, UX polish

PoC testing surfaced four real-world rough edges in the Sign in flow
that were not obvious before getting an actual SuperGrok subscription
in front of it. None alter the architecture in 'expose xAI OAuth flow
in Settings UI'; they round it off so the path the user actually walks
matches the one the design assumed.

1. Layout. XaiOAuthControl was a grid item inside .media-provider-body
   and got squeezed into the API-key column. Moves it out of the body
   so the row's flex-column layout gives it the full width — matches
   what every other Settings provider OAuth surface gets.

2. Paste-back. xAI's `auth.x.ai` page often shows a "cannot connect to
   your application" fallback that hands the user a code instead of
   redirecting back to 127.0.0.1:56121, even when the loopback listener
   is reachable (browser DOES quietly redirect in the background, but
   the page lies and shows the manual-paste UI anyway). Adds:
     - POST /api/xai/oauth/complete that takes {state, code} and runs
       completeXAIAuth + setXAIToken + stops the listener.
     - A paste-back input row in XaiOAuthControl that surfaces while
       the dance is in flight; submitting either via Enter or the
       button calls /complete and falls through to the same connected
       state the loopback path lands on.

3. X search. New POST /api/xai/search wraps Grok's native x_search tool
   through the Responses API, gated on the same OAuth-first credential
   chain as Grok image / video / TTS. Body accepts query (required),
   allowed_x_handles, excluded_x_handles, from_date, to_date, model.
   Returns { answer, citations[], model } parsed from the Responses
   payload via two newly exported helpers (extractAnswerText,
   extractUrlCitations).

4. State machine + warning banner. Three issues collapsed into one:
     - Polling that flipped busy → 'idle' the moment the loopback
       listener self-closed disabled the paste-back input even though
       the dance was still recoverable. Removed that branch; awaiting
       state now only ends on connected=true or explicit cancel.
     - paste-input `disabled` was over-eager (`busy !== 'awaiting' &&
       busy !== 'refreshing'`); now it's only blocked while a submit
       is in flight (`busy === 'refreshing'`).
     - Added a heads-up banner inside the awaiting region explaining
       that xAI's "cannot connect" page is a UX bug on their side and
       the OD panel is the source of truth for sign-in success. The
       connected message picks up the cue too: "You can close any
       open xAI browser tabs now."
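
The request body accepted by the search endpoint in item 3 might be validated along these lines. The field names come from the list above; the value types (string arrays for handles, strings for dates) and the parser itself are assumptions made for illustration.

```typescript
// Hypothetical body type; value types are assumed, field names are from the commit.
interface XSearchBody {
  query: string;
  allowed_x_handles?: string[];
  excluded_x_handles?: string[];
  from_date?: string;
  to_date?: string;
  model?: string;
}

type ParseResult = { ok: true; body: XSearchBody } | { ok: false; error: string };

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Rejects a missing or blank query and copies over only well-typed optional fields.
function parseSearchBody(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const raw = input as Record<string, unknown>;
  const q = raw["query"];
  const query = typeof q === "string" ? q.trim() : "";
  if (query === "") return { ok: false, error: "query is required" };

  const body: XSearchBody = { query };
  const allowed = raw["allowed_x_handles"];
  if (isStringArray(allowed)) body.allowed_x_handles = allowed;
  const excluded = raw["excluded_x_handles"];
  if (isStringArray(excluded)) body.excluded_x_handles = excluded;
  const from = raw["from_date"];
  if (typeof from === "string") body.from_date = from;
  const to = raw["to_date"];
  if (typeof to === "string") body.to_date = to;
  const model = raw["model"];
  if (typeof model === "string") body.model = model;
  return { ok: true, body };
}
```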

Tests: +12 cases on top of the existing 17. The complete endpoint
covers happy path, blank-field rejection, and unknown-state error.
The search endpoint covers blank-query rejection, no-credentials 401,
full bearer / x_search-options forwarding with response parsing, and
upstream-error pass-through. Two helper functions get four direct
parser cases. All 29 in the file pass; 225 across the daemon test
suite pass; tsc / web tsc / pnpm guard all clean.

* fix(daemon): satisfy tsconfig.tests.json strictness in xai test files

The CI workspace typecheck step runs tsconfig.tests.json (which extends
tsconfig.json's strict + exactOptionalPropertyTypes settings and adds
the tests/ directory to the include set) — but the local
`tsc -p tsconfig.json --noEmit` I ran while iterating only covered
src/. That gap let three classes of strict-mode errors slip into the
PR's CI:

- `let outcome: CallbackOutcome | null = null` mutated from inside an
  async callback narrowed to `never` after `outcome?.kind` because TS
  doesn't track cross-function mutation. Switched the seven sites in
  xai-oauth-server.test.ts to a `{ current: CallbackOutcome | null }`
  ref object — TS does narrow .current correctly, so `kind` / `error`
  field access stops collapsing to `never`.
- `await r.json()` returns `Promise<unknown>` in the lib.dom typings
  shipped with TS 5.x, so every `body.field` / `status.connected`
  access in xai-routes.test.ts tripped TS18046. Added a one-line
  `jsonOf<T = any>` helper at the top of the file and switched all
  call sites (both `await r.json()` and `.then((r) => r.json())`).
- The cross-origin guard test iterated `for (const [method, path] of
  [...])` — under noUncheckedIndexedAccess that destructures to
  `string | undefined`, which RequestInit.method (a `string` under
  exactOptionalPropertyTypes) won't accept. Hoisted the cases to a
  typed `ReadonlyArray<readonly [string, string]>` so the elements
  stay non-optional.

Behaviour is unchanged; vitest still reports 29/29 across these two
files. tsc -p tsconfig.tests.json --noEmit now passes locally,
matching what CI will run.
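
Two of the patterns above, the ref object and the typed tuple list, can be sketched in isolation. This is an illustrative reduction, not the test files themselves; `CallbackOutcome`'s fields and `runListener` are assumed stand-ins.

```typescript
// Assumed shape of the outcome union; the real type may differ.
type CallbackOutcome =
  | { kind: "ok"; code: string }
  | { kind: "error"; error: string };

// Stand-in for the listener: it simply invokes the callback once.
function runListener(onCallback: (o: CallbackOutcome) => void): void {
  onCallback({ kind: "error", error: "access_denied" });
}

// Ref object: the callback mutates a property instead of reassigning a
// `let` binding, so the read below keeps the full declared union type.
const outcome: { current: CallbackOutcome | null } = { current: null };
runListener((o) => {
  outcome.current = o;
});
const observedKind = outcome.current?.kind;

// Typed tuple list: destructured elements are plain `string`, even with
// noUncheckedIndexedAccess enabled.
const cases: ReadonlyArray<readonly [string, string]> = [
  ["POST", "/api/xai/oauth/start"],
  ["GET", "/api/xai/auth/status"],
  ["POST", "/api/xai/oauth/disconnect"],
];
const methods: string[] = [];
for (const [method, route] of cases) {
  methods.push(`${method} ${route}`);
}

console.log(observedKind, methods);
```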

* fix(xai-oauth): preserve refresh_token + release :56121 on cancel

Two lifecycle issues Looper flagged on the prior commit:

1. resolveXAIBearer dropped the existing refresh_token whenever the
   refresh response omitted one. RFC 6749 §6 explicitly allows the
   server to skip refresh_token rotation and keep the old one valid;
   xAI's behaviour is currently to rotate, but a future change could
   silently break OD users. With the old code the first refresh
   succeeded but persisted a token with no refresh credential, so the
   next expiry forced the user back through Sign in even though their
   grant was still good. Carries the previous refresh_token forward
   when fresh.refresh_token is absent. Updates the matching
   xai-credentials test to assert the carried-forward value instead of
   the previous (incorrect) "drop it" assertion.

2. The Cancel button in XaiOAuthControl only cleared React-side
   pending state; the daemon's one-shot 127.0.0.1:56121 listener kept
   running for the full 30 min server timeout. /api/xai/auth/status
   would still report listening=true, and that singleton port could
   block the next Sign in (or a Hermes session on the same machine).
   Adds POST /api/xai/oauth/cancel that calls stopActiveListener()
   without touching the stored token (Disconnect is the destructive
   path; this is the narrow "release the port" affordance), wires the
   UI Cancel handler to fire it, and adds two route tests covering
   the listener-stopped-but-token-preserved invariant and the no-op
   behaviour when no listener is in flight.

All 38 xai tests + tsconfig.tests.json typecheck + web typecheck +
pnpm guard pass.
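
The carry-forward rule from item 1 can be sketched as a small merge step. The type and function names are hypothetical, and `expires_at` is assumed to be epoch milliseconds.

```typescript
// Hypothetical stored token shape.
interface TokenSet {
  access_token: string;
  refresh_token?: string;
  expires_at: number;
}

// Hypothetical refresh response shape (standard OAuth 2.0 field names).
interface RefreshResponse {
  access_token: string;
  refresh_token?: string;
  expires_in: number; // seconds
}

// Keep the previous refresh_token when the refresh response omits one.
function mergeRefreshedToken(previous: TokenSet, fresh: RefreshResponse, now: number): TokenSet {
  const refreshToken = fresh.refresh_token ?? previous.refresh_token;
  const merged: TokenSet = {
    access_token: fresh.access_token,
    expires_at: now + fresh.expires_in * 1000,
  };
  if (refreshToken !== undefined) merged.refresh_token = refreshToken;
  return merged;
}
```

Setting `refresh_token` only when a value exists keeps the object valid under `exactOptionalPropertyTypes`.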

* fix(xai-oauth): close two more lifecycle gaps Looper flagged

Both are non-blocking but cheap and right.

1. window.open used 'noopener=no,noreferrer=no' (carried over from the
   sibling McpOAuthControl), which deliberately KEEPS the auth.x.ai
   tab's window.opener reference back to the Settings tab. Reverse
   tabnabbing risk if the auth page or any redirect target along the
   OAuth chain ever turns hostile, with no upside — the xAI flow
   doesn't use postMessage, the daemon receives the code through the
   :56121 listener (or paste-back), so opener access buys nothing.
   Switched to 'noopener,noreferrer'.

2. PendingAuthCache was constructed with its default 10 min TTL while
   the loopback listener self-closes at 30 min and the UI shows a
   pending state for the same 30 min. After 10 min, a user looking at
   a live paste-back input would hit `xAI OAuth state not found or
   expired` even though everything visible (and the daemon socket)
   still claimed the dance was live. Constructed the cache with
   30 * 60 * 1000 so the PKCE state, the open :56121 socket, and the
   paste-back UI all expire together.

The third inline comment (XaiOAuthControl.tsx:248 — "Cancel only
clears React-side state") was a stale reference: the previous commit
fd04887 wired the Cancel button to fire `cancelInFlightOAuth()` which
hits the new `POST /api/xai/oauth/cancel` endpoint. Looper carried
the old comment forward when re-reviewing the rebased file; no code
change needed.

All 38 xai tests still green; tsconfig.tests.json clean; web tsc
clean; pnpm guard clean.

* fix(xai-oauth): keep loopback listener open on stale-tab callbacks

The one-shot listener marked itself consumed at the top of every
/callback request, then closed itself in the finally block whether
or not the state actually matched. A stray browser tab replaying an
old /callback?state=… (real-world scenario: user re-clicked Sign in
before closing the previous tab) would therefore close the singleton
:56121 listener with a state-mismatch error before the real xAI
redirect could arrive.

Now we only tear the listener down on outcomes that actually
terminate the dance:
  - ok callback (matched state, code present)
  - explicit ?error= from xAI (auth provider terminated; we should
    propagate, not wait for the 30 min timeout). xAI's error
    redirects may or may not echo state, but a stale tab can't
    fabricate ?error= without colluding with the auth server, so
    this branch is safe to consume.

Stale tabs / browser prefetches / malformed redirects still get the
HTTP 400 / "Sign-in failed" page, but the listener stays open and
the matching xAI redirect that arrives next is what closes it.

Tests: replaces the previous "rejects state mismatch with kind=error"
test with the recovery scenario (stale-then-real callbacks both hit
the listener; only the real one fires onCallback). Adds a sibling
case for missing-code / missing-state callbacks. xai-oauth-server
suite is now 10/10; full xai sweep 39/39.

* fix(xai-oauth): scope error-callback consume to matching/missing state

c00252c simplified the consume rule to "any explicit ?error= closes
the listener", which was broader than the stale-tab protection added
in the same commit. A browser history replay of an old
`/callback?error=access_denied&state=stale` would set `consumed`,
fire `onCallback`, and tear down the singleton 127.0.0.1:56121 socket
before the current dance's real callback could land — undoing the
defence the commit was supposed to add.

Tighten the rule so error-callbacks consume only when:
  - the URL carries no state (xAI rejected before issuing one, so
    there's nothing to compare against — safe to terminate), or
  - the carried state matches our expectedState (xAI explicitly
    rejected this dance; propagate immediately rather than wait for
    the 30 min timeout).

An ?error= replay carrying a *different* state is now treated like
the stale success replay above: returns the 400 page to the browser,
keeps the listener live, lets the real callback close it.

Tests: adds two cases — error+wrong-state followed by real success
must still resolve to ok; error+matching-state still consumes the
listener and surfaces the error to onCallback. xai-oauth-server
suite goes 10 → 12; full xai sweep 39 → 41.
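
The tightened consume rule can be expressed as a pure decision function. This is a sketch under assumptions: the names and return shape are hypothetical, and the real listener also handles HTTP responses and socket teardown.

```typescript
import { URLSearchParams } from "node:url";

// Hypothetical result type: "consume" closes the listener and fires
// onCallback; "ignore" returns the 400 page but keeps the listener open.
type CallbackAction =
  | { action: "consume"; kind: "ok"; code: string }
  | { action: "consume"; kind: "error"; error: string }
  | { action: "ignore"; reason: string };

function classifyCallback(params: URLSearchParams, expectedState: string): CallbackAction {
  const state = params.get("state");
  const error = params.get("error");
  const code = params.get("code");

  if (error !== null) {
    // Error callbacks consume only with no state or the matching state.
    if (state === null || state === expectedState) {
      return { action: "consume", kind: "error", error };
    }
    return { action: "ignore", reason: "error replay with a different state" };
  }
  if (state === expectedState && code !== null) {
    return { action: "consume", kind: "ok", code };
  }
  return { action: "ignore", reason: "state mismatch or missing code" };
}

console.log(classifyCallback(new URLSearchParams("error=access_denied&state=stale"), "live"));
```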
2026-05-19 11:10:34 +08:00
kami
7b3b7c3b74
Fix finalize provider routing for Gemini BYOK (#1964)
Routes Finish Design/finalize requests through the selected BYOK provider, including Gemini, while preserving the Anthropic fallback. Validation: CI and nix-check were green on PR head 6c334e08d1.
2026-05-18 18:03:44 +08:00
Caprika
832bdeb535
Centralize daemon startup (#2054)
2026-05-18 17:08:17 +08:00
chaoxiaoche
1f66c53203
feat(daemon): consume component manifests (#2053)
* feat(design-systems): extract component manifests

* feat(daemon): consume component manifests

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-18 16:50:52 +08:00
kami
2e97ac9e68
Pin daemon data dir in agent runtime env (#2005)
Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 14:02:11 +08:00
kami
91be2696dd
Persist routine failure reasons (#1963)
Co-authored-by: multica-agent <github@multica.ai>
2026-05-17 23:22:00 +08:00