open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Author	SHA1	Message	Date
Denis Redozubov	c847ace554	Add run-scoped media execution policy (#3106) * feat(contracts): add run media execution policy * feat(daemon): enforce run media execution policy * test(daemon): cover media execution policy gates	2026-05-28 09:19:40 +00:00
Siri-Ray	170a05f5d2	Formalize skill artifacts into plugins (#3085) * Add skill-to-plugin candidate flow * Fix skill plugin candidate card reuse Generated-By: looper 0.9.1 (runner=fixer, agent=codex) * Fix skill plugin candidate dismiss and URL gates Generated-By: looper 0.9.1 (runner=fixer, agent=codex) * Polish skill plugin candidate copy	2026-05-27 08:26:00 +00:00
shangxinyu1	cc6edb9afe	Proxy GitHub metadata through the daemon (#2654) * Proxy GitHub metadata through the daemon * fix(contracts): share GitHub metadata responses Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(contracts): align GitHub fetchedAt payload types Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * Proxy GitHub metadata through the daemon Generated-By: looper 0.6.0 (runner=fixer, agent=codex)	2026-05-22 14:06:07 +08:00
Eli-tangerine	8193981511	Keep PR 2400 changes without folder pickers (#2462 ) * feat(daemon): add project working directory management and editor hand-off functionality - Introduced new flags for project commands to manage working directories, including `--working-dir` and `--dir`. - Implemented API routes for listing available editors and opening projects in selected editors. - Added a hand-off button in the ChatPane header to facilitate opening project folders in local applications. - Enhanced the HomeHero component to include working directory and design system settings, improving user experience in project creation. - Created HomeHeroSettingsChips component for inline management of working directory and design system selection. * feat(chat): implement voice transcription proxy and enhance UI components - Added a new API route for voice transcription using OpenAI's `/audio/transcriptions` endpoint, allowing users to send audio blobs directly for transcription. - Integrated multer for handling audio file uploads in memory, ensuring efficient processing without disk storage. - Updated the HomeHero component to include example prompt suggestions for plugins, enhancing user interaction. - Introduced the EditorIcon component to visually represent different editors in the hand-off menu, improving the user experience. - Refined the HandoffButton component to utilize the new EditorIcon, providing a more cohesive interface for selecting editors. - Enhanced CSS styles for various components to improve layout and responsiveness, including adjustments to tab and button sizes for better usability. * style(workspace-shell): enhance layout and overflow handling - Updated CSS for .workspace-shell to ensure full viewport width and height, with proper overflow management. - Adjusted grid layout to prevent content overflow and maintain responsiveness. - Modified styles for .workspace-tabs-chrome to improve width handling and prevent overflow issues. 
* refactor(chat): remove voice transcription proxy and related components - Deleted the voice transcription proxy implementation, including the associated API route and multer configuration. - Removed the MicButton component from the ChatComposer and HomeHero components to streamline the UI. - Updated HomeHero to include example suggestions without the voice input functionality. - Adjusted CSS styles for various components to maintain layout consistency after the removal of the MicButton. * feat(daemon): implement minting of HMAC tokens for working directory management - Added a new function `mintImportTokenFromCurrentSecret` to generate HMAC tokens bound to a specified base directory, enhancing security for working directory operations. - Updated the `desktop-auth.ts` file to include the new token minting functionality, which returns structured errors when the desktop auth secret is cleared. - Introduced new IPC message types for minting import tokens in the sidecar protocol, allowing seamless integration with the daemon's working directory management. - Enhanced the `WorkingDirPill` component to utilize the new token minting flow for secure directory selection in desktop builds. - Updated CSS styles for the HomeHero component to accommodate new example suggestion features and maintain layout consistency. * fix(HomeView): import HOME_HERO_CHIPS constant for improved chip management - Updated the HomeView component to import the HOME_HERO_CHIPS constant from the chips module, enhancing the management of hero chips within the component. * feat(daemon): implement mintImportTokenViaSidecar for secure working directory management - Introduced the `mintImportTokenViaSidecar` function to facilitate the minting of HMAC tokens for desktop-import operations via the daemon's sidecar IPC. This allows CLI commands to bypass authentication when the desktop-auth gate is active. 
- Updated the CLI to utilize the new token minting function when setting the working directory, ensuring secure access to trust-gated API endpoints. - Enhanced the sidecar server to handle minting requests and return structured error messages for improved user feedback. - Added tests to validate the new token minting functionality and its integration with the working directory management process. - Refactored related components to support the new token flow, improving overall security and user experience. * feat(HomeHero): enhance UI components and styles for improved user experience - Updated HomeHero component to replace active dot indicators with Plug icons for better visual representation of active plugins. - Adjusted CSS styles for various elements, including padding and dimensions, to enhance layout consistency and responsiveness. - Introduced new styles for active type icons and improved hover effects for buttons. - Updated HomeHeroSettingsChips to change button titles and icons for clarity. - Added tests to ensure proper rendering and functionality of updated components. * feat(ProjectDesignSystemPicker): enhance design system selection with preview functionality - Updated the ProjectDesignSystemPicker component to include a preview feature for design systems, allowing users to see a preview of the selected design system. - Implemented hover functionality to update the preview based on the hovered design system. - Added fullscreen preview capability for a more immersive experience. - Enhanced CSS styles for the design system picker to improve layout and responsiveness. - Introduced tests to validate the new preview functionality and ensure proper interaction within the component. * feat: refactor project metadata handling and enhance design system picker - Updated the default scenario plugin ID retrieval to use project metadata, improving the logic for determining the appropriate plugin based on project intent. 
- Enhanced the ProjectDesignSystemPicker and related components to support localized design system summaries and categories, improving user experience. - Introduced new translations for working directory and design system picker components, ensuring better accessibility and usability across different locales. - Added a new 'live-artifact' project type to the HomeHero chips, expanding the functionality for users creating refreshable artifacts. - Updated tests to validate the new project metadata handling and design system picker functionalities. * feat: enhance localization and styling for design system components - Added French translations for working directory and design system picker components, improving accessibility for French-speaking users. - Updated CSS styles for the pet task item to ensure consistent padding and layout. - Introduced a new test suite for HomeHeroSettingsChips to validate localization and design system selection functionality. - Enhanced ProjectDesignSystemPicker tests to ensure proper localization and interaction with design system categories. * fix: update .gitignore to include all claude-sessions directories and remove specific session files - Modified .gitignore to ensure all claude-sessions directories are ignored by using a wildcard pattern. - Deleted two specific claude-sessions markdown files to clean up unnecessary session data. * fix: repair home automation ci regressions * fix: stabilize artifact consistency e2e * Remove folder picker changes from PR 2400 --------- Co-authored-by: pftom <1043269994@qq.com> Co-authored-by: qiongyu1999 <2694684348@qq.com>	2026-05-20 22:07:30 +08:00
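The commit above (8193981511) describes `mintImportTokenFromCurrentSecret` as generating HMAC tokens bound to a specified base directory. A minimal sketch of that general idea follows, assuming Node's built-in `crypto` module. The names `mintToken` and `verifyToken`, the `<expiry>.<signature>` token layout, and the expiry field are hypothetical illustrations and are not taken from the repository.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical layout: "<expiresAtMs>.<hex HMAC-SHA256 over baseDir + expiry>".
// The real token format in desktop-auth.ts may differ.
function sign(secret: string, baseDir: string, expiresAtMs: number): string {
  return createHmac("sha256", secret)
    .update(`${baseDir}\n${expiresAtMs}`)
    .digest("hex");
}

// Mint a token that is only valid for one base directory until expiresAtMs.
function mintToken(secret: string, baseDir: string, expiresAtMs: number): string {
  return `${expiresAtMs}.${sign(secret, baseDir, expiresAtMs)}`;
}

// Recompute the HMAC for the claimed directory and compare in constant time.
function verifyToken(
  secret: string,
  baseDir: string,
  token: string,
  nowMs: number,
): boolean {
  const dot = token.indexOf(".");
  if (dot <= 0) return false;
  const expiresAtMs = Number(token.slice(0, dot));
  if (!Number.isSafeInteger(expiresAtMs) || expiresAtMs <= nowMs) return false;
  const expected = Buffer.from(sign(secret, baseDir, expiresAtMs), "hex");
  const actual = Buffer.from(token.slice(dot + 1), "hex");
  // timingSafeEqual throws on a length mismatch, so check lengths first.
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

Because the directory is part of the signed payload, a token minted for one directory does not verify against another, which is the property the commit message attributes to binding the token to a base directory.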
Bryan	c530d163f8	feat(web): "Resume conversation in new chat" UI — #462 Commit B (companion to #1718 ) (#2264 ) * feat(contracts): add handoff request/response DTOs Adds HandoffRequest, HandoffResponse, and HANDOFF_SCHEMA_VERSION for the upcoming POST /api/projects/:id/handoff synthesis endpoint. Mirrors the finalize.ts subpath pattern (package.json#exports + esbuild entry + index re-export) so daemon and web can import @open-design/contracts/api/handoff. Refs nexu-io/open-design#462. * feat(daemon): add handoff synthesis pipeline (buildHandoffPrompt + synthesizeHandoffPrompt) Adds `apps/daemon/src/handoff-design.ts` exposing the resume-conversation synthesis primitives the upcoming `POST /api/projects/:id/handoff` route will call into. - `buildHandoffPrompt({ projectId, transcriptJsonl, transcriptMessageCount, now })` returns the system + user prompts. System prompt asks Claude to emit a structured Markdown body with Context / Decisions made / Open questions / Current focus / Provenance, with Provenance bullets explicitly flat (no Markdown emphasis on labels) to preempt the PR #1584 round-2 parser bug. - `synthesizeHandoffPrompt(db, projectsRoot, projectId, options)` reuses the existing finalize-design pipeline pieces: `exportProjectTranscript` → `truncateTranscriptForPrompt` → `buildHandoffPrompt` → `callAnthropicWithRetry` → `extractDesignMd`, but without the lockfile, disk write, design-system, or artifact-resolution paths. - Promotes `DEFAULT_TIMEOUT_MS` in finalize-design.ts to `export const` so handoff shares the same 120s upstream-call bound. Refs nexu-io/open-design#462. * feat(daemon): wire POST /api/projects/:id/handoff route Adds the handoff HTTP route and registers it in server.ts. Validation block + error-mapping shape mirror registerFinalizeRoutes (BYOK payload, upstream-error → ApiErrorCode mapping, redactSecrets on the raw upstream body). Handoff has no lockfile, so the CONFLICT branch is omitted. 
`res.on('close')` is wired to flip an AbortController whose signal is threaded into synthesizeHandoffPrompt, so a UI-side cancel actually aborts the daemon-side Anthropic call rather than letting it keep running after the client walks away (mirrors the PR #974 fix for finalize). - `apps/daemon/src/handoff-routes.ts` — new, exports registerHandoffRoutes + RegisterHandoffRoutesDeps. - `apps/daemon/src/server-context.ts` — adds handoff slot to ServerContext. - `apps/daemon/src/route-context-contract.ts` — adds RegisterHandoffRoutesDeps to the compile-time coverage assertion. - `apps/daemon/src/server.ts` — imports synthesizeHandoffPrompt + registerHandoffRoutes, builds handoffDeps, registers the route next to finalize. - `apps/daemon/tests/handoff-route.test.ts` — 12 HTTP-layer tests: validation (400/403/404), happy path, upstream error mapping (401/429/502/502 non-JSON), api-key redaction. - `apps/daemon/tests/handoff-route-abort.test.ts` — client-disconnect aborts the daemon-side controller. Refs nexu-io/open-design#462. * fix(daemon): map TranscriptExportLockedError to 409 CONFLICT on handoff route `exportProjectTranscript` acquires a per-project `.transcript.lock` internally (apps/daemon/src/transcript-export.ts:131-163) and throws `TranscriptExportLockedError` on EEXIST. Concurrent handoff requests — or a handoff that races `/api/projects/:id/finalize/anthropic` — lost that lock and surfaced as 500 INTERNAL_ERROR through the route's generic catch. - `apps/daemon/src/handoff-routes.ts` — catch `TranscriptExportLockedError` and return `409 CONFLICT` ahead of the generic 500 branch, mirroring the existing `FinalizePackageLockedError → 409 CONFLICT` mapping at `apps/daemon/src/import-export-routes.ts:603-605`. - `apps/daemon/src/server.ts` — thread `TranscriptExportLockedError` through `handoffDeps` so the route can match without a direct import. 
- `apps/daemon/src/handoff-design.ts` — correct the module header comment that incorrectly claimed "no lockfile (concurrent handoff calls are safe)" — handoff does not add its own lock, but it does transitively acquire `.transcript.lock` via the transcript-export call. - `apps/daemon/tests/handoff-route.test.ts` — regression test that pre-acquires `.transcript.lock` on disk via `fs.openSync(lockPath, 'wx')` before firing a handoff request, asserts 409 CONFLICT. Refs nexu-io/open-design#462 — addresses @nettee's blocking review on PR #1718 (comment 3242251338). * fix(daemon): keep handoff request timeout armed through the response body read `synthesizeHandoffPrompt` cleared the upstream-call timeout in a `finally` that ran as soon as `callAnthropicWithRetry` returned. But `fetch()` resolves once the upstream sends headers — so the subsequent `await response.json()` body read ran with no timeout. A response that sends headers and then stalls its body could hang `/api/projects/:id/handoff` indefinitely instead of failing. - `apps/daemon/src/handoff-design.ts` — move `clearTimeout(timeoutId)` into a single outer `finally` spanning both the call and the `response.json()` body parse, so the timeout stays armed until the body is fully consumed. - `apps/daemon/src/handoff-design.ts` — the body-parse catch now re-throws `AbortError` as-is, mirroring the call-phase catch. Without this a body-phase timeout would surface as `502` "non-JSON body"; re-throwing lets the route map it to the intended `503` "handoff timed out" (`handoff-routes.ts:122-124`). - `apps/daemon/tests/handoff-design.test.ts` — regression test: a `fetchImpl` returning a `Response` whose body never closes after headers, raced against a 500ms deadline, asserts the call aborts (not hangs) and rejects with `AbortError`. Refs nexu-io/open-design#462 — addresses @nettee's round-2 blocking review on PR #1718 (`handoff-design.ts:196`). 
* fix(daemon): map upstream 400 to 400 BAD_REQUEST on handoff route `callAnthropicWithRetry` preserves a non-retryable upstream status, so an Anthropic HTTP 400 (`invalid_request_error` — unknown model, invalid maxTokens, malformed body) reached the route's `FinalizeUpstreamError` branch and fell through to `502 UPSTREAM_UNAVAILABLE`. That reported deterministic caller input as a transient server outage, inviting pointless retries and hiding which field was wrong. - `apps/daemon/src/handoff-routes.ts` — special-case `err.status === 400` to `400 BAD_REQUEST` with the redacted upstream detail, ahead of the generic 502. Also refresh the route docblock: it claimed the 409 branch was omitted (stale since the R1 TranscriptExportLockedError fix) and that error mapping fully mirrors finalize (now diverges on 400). - `apps/daemon/tests/handoff-route.test.ts` — route test driving an Anthropic `400 invalid_request_error`: asserts 400 BAD_REQUEST, the upstream detail is surfaced, and an echoed key is redacted. - `packages/contracts/tests/package-runtime.test.ts` — import `@open-design/contracts/api/handoff` through the package `exports` map and assert `HANDOFF_SCHEMA_VERSION`, covering the built publish surface (esbuild entry + exports map + root re-export) that the source-only `handoff-contract.test.ts` does not exercise. Refs nexu-io/open-design#462 — addresses @nettee's round-3 blocking review on PR #1718. * fix(daemon): await the now-async external base-URL validator on handoff route Main's #1176 (`9a64fccd`) made `validateExternalApiBaseUrl` DNS-aware and asynchronous (`validateBaseUrlResolved`) and updated the proxy and finalize callers to `await` it. 
The handoff route — added on this branch in parallel, against the old synchronous validator — still called it without `await`, so `validated` was a Promise: `validated.error` / `validated.forbidden` were `undefined`, the SSRF / malformed-URL guard silently no-opped, and a bad `baseUrl` fell through to the upstream call and surfaced as 502. A semantic merge break — no textual conflict, green on the branch in isolation, red once CI re-merged latest main. - `apps/daemon/src/handoff-routes.ts` — `await validateExternalApiBaseUrl(...)`, mirroring the finalize route (`import-export-routes.ts:561`). The handler is already `async`. The existing `handoff-route.test.ts` cases "400 BAD_REQUEST when baseUrl is not a valid URL" and "403 FORBIDDEN when baseUrl points at a private internal IP" already encode this — red against branch + latest main, green now. Refs nexu-io/open-design#462 — PR #1718 CI fix. * chore(daemon): list handoff in the assertServerContextSatisfiesRoutes literal The `assertServerContextSatisfiesRoutes({...})` call in `server.ts` enumerates every route registrar's deps but omitted `handoff`. Adding `handoff: handoffDeps` makes the literal complete and consistent with the other route deps. This was not a typecheck break: route-dep coverage is guaranteed by the `Assert<ServerContext extends AllRegisteredRouteDeps>` type in `route-context-contract.ts` — and `AllRegisteredRouteDeps` already includes `RegisterHandoffRoutesDeps` — not by this assertion-call literal. The literal has omitted `handoff` since this branch's first push (`806db576`) through green CI throughout; `tsc -p tsconfig.json --noEmit` is clean before and after. Refs nexu-io/open-design#462 — addresses @nettee's round-4 review note on PR #1718. * feat(web): add "Resume conversation in new chat" action (#462) Adds a Resume control to the chat header, next to "New conversation". 
Clicking it synthesizes a handoff prompt from the current transcript via POST /api/projects/:id/handoff, opens a fresh conversation, and auto-sends the synthesized prompt as its first user message — so a drifted session resumes without the user replaying context by hand. The old conversation is preserved. - synthesizeHandoff() web-state wrapper in apps/web/src/state/projects.ts - resume-conversation icon button in ChatPane (onResumeConversation / resumeConversationDisabled props) - handleResumeConversation + pendingResumeRef + auto-send effect in ProjectView; effect gates on messagesConversationId so the prompt cannot fire before the new conversation's message read settles - chat.resumeConversation i18n key across all 19 locales Commit B of #462; Commit A is the daemon endpoint (PR #1718). This branch is stacked on feat/handoff-endpoint so the web code resolves @open-design/contracts/api/handoff. * fix(daemon): scope handoff to one conversation + reject empty transcripts (#462) Addresses the review on #1718 and #2264: - mrcfps (#2264): the handoff endpoint exported the whole project's transcript, so a multi-conversation project blended unrelated chats into the synthesized prompt. HandoffRequest now carries a required conversationId; the route validates it belongs to the project (404 CONVERSATION_NOT_FOUND), and exportProjectTranscript takes an optional conversationId filter so only that conversation is exported. - nettee (#1718): a zero-message conversation still called Anthropic and fabricated a handoff. synthesizeHandoffPrompt now throws EmptyTranscriptError on messageCount === 0; the route maps it to 400 EMPTY_TRANSCRIPT before any BYOK tokens are spent. HANDOFF_SCHEMA_VERSION bumped to 2 (conversationId is a new required request field). Regression tests: a two-conversation scoping test, an empty-conversation route + pipeline test, and a transcript-export conversationId-filter unit test. 
* feat(web): send conversationId with the resume handoff request (#462) Follows the handoff endpoint becoming conversation-scoped. The resume flow now passes the active conversationId to POST /handoff so the synthesized prompt summarizes only the conversation being resumed. handleResumeConversation bails when there is no active conversation; synthesizeHandoff and the resume tests carry the new field. * feat(daemon): add `od project handoff` CLI + register handoff error codes (#462) Addresses the second-round review on #1718 and #2264: - mrcfps (#2264): per AGENTS.md "Capability exposure (UI/CLI dual-track)", a user-facing capability must be reachable through the `od` CLI, not only the web UI. Adds `od project handoff <id> --conversation <id> --api-key <key> --model <model> [--base-url] [--max-tokens] [--json]`, driving the same POST /api/projects/:id/handoff endpoint. The logic lives in a testable handoff-cli.ts sibling module (mirrors artifacts-cli.ts) so cli.ts's import-time dispatch stays out of tests. - nettee (#1718): the route emitted CONVERSATION_NOT_FOUND and EMPTY_TRANSCRIPT, which were absent from the shared API_ERROR_CODES union. Both are now registered in packages/contracts/src/errors.ts, with a contract test pinning them so the route and contract cannot drift again. A CLI contract test covers the conversation-scoped request shape, --json output, flag validation, and daemon-error surfacing. * fix(daemon): fail `od project handoff` on a malformed 2xx response (#462) Addresses nettee's review on #1718: runProjectHandoff treated any 2xx response as success, so a broken daemon/proxy 200 with malformed or shape-invalid JSON would print `undefined` (or `{}` under --json) and still exit 0 — breaking the fail-fast contract scripts rely on. It now validates the body is a well-formed HandoffResponse via an isHandoffResponse type guard and fails fast otherwise. Regression tests cover a shape-invalid and an unparseable 200 body. 
* feat(web): surface the daemon's classified handoff error in the resume toast (#462) Addresses mrcfps's non-blocking note on #2264: synthesizeHandoff returned null for every non-2xx response, so RATE_LIMITED, EMPTY_TRANSCRIPT, and an upstream 400 with provider detail all collapsed into one generic "check your API key" toast — even though handoff-routes.ts had already classified and sanitized them. synthesizeHandoff now returns the daemon's structured `{ error }` on a classified failure; `null` stays reserved for a transport failure or an unparseable body. handleResumeConversation surfaces error.message plus redacted details for the `{ error }` case, and a distinct daemon-unreachable message for null. * fix(web): omit empty baseUrl from the resume handoff request (#462) Addresses mrcfps's review on #2264: the default Anthropic config normalizes baseUrl to '' (config.ts), and the handoff route 400s an explicit empty baseUrl — so the Resume action failed before synthesis for every user who never set a custom base URL. handleResumeConversation now forwards baseUrl only when config.baseUrl is a non-empty string, matching the contract's optional-field semantics. Tests: the default-config path asserts baseUrl is absent from the request, and a new case covers a custom baseUrl being forwarded. * refactor(daemon): dispatch `od project handoff` before the generic project parser (#462) Addresses nettee's non-blocking note on #1718: runProject ran the shared parseFlags(PROJECT_*) before reaching the handoff switch case, so a malformed `od project handoff` invocation (`--unknown`, `--max-tokens` with no value) threw out of the generic parser instead of hitting handoff-cli's structured fail() — the entrypoint behaved differently from the unit-tested runProjectHandoff helper. The handoff sub now short-circuits before parseFlags / projectDaemonUrl, so `od project handoff` runs exactly runProjectHandoff with no intervening parsing. 
handoff-cli.test.ts gains unknown-flag and missing-value cases covering the structured fail path. --------- Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>	2026-05-20 13:28:27 +08:00
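The commit above (c530d163f8) notes that `runProjectHandoff` now validates a 2xx body with an `isHandoffResponse` type guard and fails fast otherwise. A minimal sketch of that pattern follows. The `HandoffResponse` fields shown here (`schemaVersion`, `prompt`) and the helper `parseHandoffBody` are assumptions for illustration; the real shape is defined in `@open-design/contracts/api/handoff` and may differ.

```typescript
// Hypothetical response shape, assumed for this sketch only.
interface HandoffResponse {
  schemaVersion: number;
  prompt: string;
}

// Narrow an unknown JSON value to HandoffResponse by checking each field.
function isHandoffResponse(value: unknown): value is HandoffResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.schemaVersion === "number" &&
    typeof v.prompt === "string" &&
    v.prompt.length > 0
  );
}

// Parse a 2xx body and throw instead of letting `undefined` or `{}` pass as success.
function parseHandoffBody(rawBody: string): HandoffResponse {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    throw new Error("handoff: response body is not valid JSON");
  }
  if (!isHandoffResponse(parsed)) {
    throw new Error("handoff: response body is not a HandoffResponse");
  }
  return parsed;
}
```

A caller would map either thrown error to a non-zero exit code, so a broken daemon or proxy 200 cannot be mistaken for a successful handoff.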
Tom Huang	86ec951fb9	[codex] Add automation templates and proposal workflows (#2193 ) * feat(web): introduce Automations tab with dual-track capability for routines This commit adds a new Automations tab that consolidates routines, schedules, and live artifacts, allowing users to manage automations seamlessly. The tab features a modal for creating and editing automations, which supports various scheduling options (hourly, daily, weekdays, weekly) and project modes (create_each_run, reuse). The CLI is also updated to expose automation commands, ensuring consistency between the web UI and CLI interfaces. Key changes include: - New `NewAutomationModal` component for automation creation and editing. - Updated `TasksView` to integrate the new Automations functionality. - Enhanced styling for the Automations tab to improve user experience. This implementation aligns with the dual-track capability exposure policy, ensuring all features are accessible via both the web UI and CLI. * feat(daemon): enhance automation context handling and CLI commands This commit introduces several improvements to the automation context management and updates the CLI commands accordingly. Key changes include: - Added support for new context fields (`plugin`, `mcp`, `connector`) in automation commands. - Updated the CLI to reflect new target options (`new-project`). - Enhanced error messages for invalid target inputs. - Introduced functions to handle context selection and normalization for routines, including the ability to parse and store context data in the database. - Updated the database schema to include a new `context_json` field for routines. - Improved the handling of context in routine routes and the web interface, ensuring that selected contexts are properly managed and displayed. These changes aim to provide a more robust and flexible automation experience, aligning with the recent enhancements in the web UI. 
* feat(web): enhance TasksView with automation run history and status indicators This commit introduces several new features to the TasksView component, including: - Added functionality to display automation run history for each routine, showing metadata such as status, timestamps, and project details. - Implemented status indicators for routine runs, providing visual feedback on their current state (succeeded, failed, running, queued). - Enhanced the UI to allow users to expand and view detailed run history, including the ability to open the corresponding project conversation. - Updated styles to improve the presentation of automation statuses and history. These changes aim to provide users with better insights into their automation routines and improve overall usability. * feat(daemon): implement automation ingestion and proposal management This commit introduces several new features related to automation ingestion and proposal management within the daemon. Key changes include: - Added new modules for handling automation source packets and proposals, allowing for the storage, retrieval, and management of automation-related data. - Implemented functions to list, create, and apply automation proposals, enhancing the automation workflow. - Introduced new CLI commands for interacting with memory entries and automation sources, providing users with more control over their automation processes. - Enhanced the server routes to support automation source and proposal APIs, enabling seamless integration with the existing system. These changes aim to improve the overall automation experience, making it easier for users to manage and utilize automation proposals and ingestions effectively.	2026-05-19 16:35:28 +08:00
chaoxiaoche	46a64edce3	feat(design-systems): extract component manifests (#2051) Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>	2026-05-18 16:48:59 +08:00
lefarcen	53997990b7	Merge origin/main (post-0.7.0) into reconciled garnet branch Second-pass merge layering 41+ new commits from origin/main on top of the first reconcile commit. Headline upstream additions absorbed: - 0.7.0 release: redesigned chat bubble user-text styling, neutralised palette, lucide icons, ElevenLabs audio voice option discovery in the prompt composer, analytics tracking (PostHog) wired across home / studio / create surfaces, Prometheus `/api/metrics` endpoint, critique-theater drop-in mount with a settings toggle. - Misc upstream fixes (titlebar padding, release header layout, deck preview chrome, feedback form auto-scroll, conversation-created SSE on routine runs, etc.) Conflict resolutions (12 files, ~22 hunks): - contracts barrel + prompts/system: union of both sides; new analytics exports (`./analytics/events`, `./analytics/public-params`) added alongside garnet's plugin/atom/genui exports. Both ElevenLabs voice fields (audioVoiceOptions/audioVoiceOptionsError, main) and pluginBlock/activeStageBlocks (garnet) preserved on ComposeInput. - daemon/server.ts: Prometheus `/api/metrics` route inserted after garnet's `/api/daemon/shutdown`. main's `createAnalyticsService` call added before the chat-run service init alongside the prior reconcile note about the dropped legacy POST /api/projects body. - App.tsx: handleCreateProject now consumes both garnet's plugin fields (pluginId / appliedPluginSnapshotId / pluginInputs / autoSendFirstMessage) and main's analytics requestId. Tracking fires success + failure paths; PluginLoopHome auto-send sessionStorage flag is preserved. - ProjectView.tsx: the garnet auto-send useEffect coexists with main's `useCritiqueTheaterEnabled()` hook. - ChatComposer.tsx: imports merged (drop now-unused fetchSkills, add analytics provider + tracking + buildVisualAnnotationAttachment). 
- index.css: main's redesigned `.msg.user .user-text` chat bubble styling wins over garnet's plain text rule; garnet's `.msg-plugin-chip*` rules preserved alongside. - EntryView.tsx: accepted HEAD (garnet wrapper) — consistent with reconcile decision #2. main's added PetRail / TopTab / analytics view tracking is intentionally NOT brought into the wrapper; the follow-up to re-integrate PetRail / image-templates / video-templates into EntryShell still stands and now also covers analytics view-tracking hooks. - daemon/package.json + pnpm-lock: merged dep set (tar + posthog-node + prom-client coexist). - Test fixtures (FileWorkspace.test): kept garnet's plugin-folders describe block intact; main's projectKind="prototype" addition is dropped where it conflicted with garnet's plugin-folder fixture files. Verification: `pnpm install` (after lockfile reconciled), `pnpm typecheck` exits 0 across all workspace packages. Follow-up not done in this commit: - PetRail / image-templates / video-templates / 0.7.0 analytics view-tracking hooks need to be added to EntryShell. - Critique-theater settings toggle UX (added on main) lives in the SettingsDialog hierarchy; the reconcile state preserves the SettingsDialog so this should work without changes, but no end-to-end verification yet.	2026-05-13 23:29:56 +08:00
lefarcen	d3602be666	Merge origin/main into garnet-hemisphere (reconcile) Merge of `origin/main` (`03ed3960`, 2026-05-13 pre-0.7.0) into the 161-commit garnet-hemisphere line, reconciling the product-vibe-coded plugin/marketplace/EntryShell surfaces from garnet with the routines / skills / live-artifacts feature work landed on main since the fork point. Headline decisions (full rationale + side-by-side screenshots in `specs/change/20260513-garnet-skills-automations/reconcile-result-vs-garnet.md`): - #1 SettingsDialog: keep main's Memory / Skills / External MCP / Connectors / Routines / MCP server nav items even though the top-level /integrations + /automations routes also cover them. Two entries coexist for now; revisit once Track A/B fill in the placeholder content. - #2 EntryView: accept garnet's thin wrapper delegating to EntryShell. Main's PetRail sidebar + image-templates/video-templates tabs are intentionally deferred to a follow-up that re-integrates them into the new EntryShell layout. - #3 /integrations + /automations top-level routes: kept (garnet's product intent). Skills tab is still a "Coming soon" placeholder awaiting Track A; Routines/Schedules/Live-artifacts cards on /automations are still mock awaiting Track B. - #5 DesignFilesPanel: hybrid — main's pagination as primary list, garnet's Plugin folders section preserved between the live-artifacts block and the pagination block. (by-kind sections drop in favour of pagination; plugin-folders rendering stays because it is a garnet-specific product addition.) - #7 server.ts (10 hunks, ~5400 conflict lines): manual hunk-by-hunk merge. Both daemon admin routes + plugin/genui routes (garnet) and routines/memory/skills upgrades (main) preserved. Garnet's inline project route block kept alongside main's `registerProjectRoutes` / `registerProjectUploadRoutes` modular wiring — duplicate route audit is a follow-up. 
Garnet's POST /api/projects plugin-snapshot resolution + default-scenario fallback is intentionally dropped from the inline body (now handled by registerProjectRoutes) and listed for follow-up re-integration into `project-routes.ts`. Verification (worktree at /Users/elian/Documents/open-design-garnet): - `pnpm typecheck` exits 0 across all workspace packages - daemon (`pnpm tools-dev run web --namespace reconcile-shots`) boots, serves `/api/daemon/status` healthy, and survives a Playwright walkthrough of /integrations / /automations / home / projects / design-systems / plugins / settings dialog - `@open-design/plugin-runtime` package built (was missing dist/ on garnet); without it the daemon's plugins/* imports fail at boot Track A (Skills tab → real SkillsSection) and Track B (Automations cards → real routines / live-artifacts backend) are the two remaining follow-ups blocking the placeholder/mock content from going live. See `spec.md` and `track-skills.md` in the same directory.	2026-05-13 22:29:21 +08:00
lefarcen	e1bc83a476	feat(analytics): PostHog product analytics (P0 events, consent-gated, packaged) (#1428 ) * feat(analytics): scaffold PostHog product-analytics integration - Add @open-design/contracts/analytics subpath with the 17 P0 event payload types, header constants, and code↔CSV enum mapping helpers. - Add apps/daemon/src/analytics.ts with env-gated posthog-node client, request-scoped analytics context reader, and artifact-id anonymizer. - Expose GET /api/analytics/config so the web bundle never embeds the PostHog key at build time; daemon owns POSTHOG_KEY / POSTHOG_HOST. - Add apps/web/src/analytics module (identity + lazy posthog-js client + React provider) and mount it under <I18nProvider> in app/layout. No event wiring yet — that lands in the next commit alongside trigger points (App.tsx, EntryView, NewProjectPanel, SettingsDialog, FileViewer, runs.ts). * feat(analytics): wire app_launch, home_view, home_click, project_create_result - App.tsx: fire app_launch once after first effect tick. handleCreateProject now emits project_create_result on both success and failure paths. - EntryView.tsx: home_view (page) gated on agents loading so has_available_cli isn't transiently false; home_view (asset_panel) fires per top-tab change with the right result_count. - NewProjectPanel.tsx: home_click create_button fires before delegating to the parent; a fresh request_id is generated here and threaded through onCreate so the matching project_create_result stitches via $insert_id. - contracts/analytics: tighten createTabToTracking and topTabToTracking for the worktree branch's renamed tabs (live-artifact, templates). * feat(analytics): wire settings_view + 3 settings_click events - settings_view fires on dialog mount and on every section switch, carrying the active section (mapped via settingsSectionToTracking for the 16-section worktree layout), execution_mode, and the selected CLI provider id when present. 
- settings_click execution_mode_tab: setMode now emits before/after values whenever the user toggles between Local CLI and BYOK. - settings_click cli_provider_card: agent card onClick reports cli_provider_id via agentIdToTracking (kiro → other). - settings_click byok_field: onFocus added to api_key, model select, and base_url inputs; provider_id widened to include google so the worktree's Gemini protocol slot type-checks. * feat(analytics): wire studio_view + studio_click chat, studio_view artifact - packages/contracts/src/analytics/artifact-id.ts: FNV-1a 64-bit helper produces a 16-hex anonymized id for (projectId, fileName). Stable cross-platform so the daemon and the web bundle resolve the same id without a Web Crypto round-trip; daemon now re-exports it. - ChatComposer: studio_view chat_panel fires once per project mount, studio_click chat_composer fires on attachment + send buttons with estimated user_query_tokens (length/4) and has_attachment. - FileViewer: studio_view artifact fires once per (project, file) at the dispatcher level, before any sub-viewer renders, with artifact_kind derived from the renderer registry / file.kind table. - Widen TrackingExportFormat to include markdown and cloudflare_pages so the worktree branch's full share menu can emit verbatim. * feat(analytics): wire studio_click share_option + artifact_export_result HtmlViewer's share menu now emits both events per click via a fireShareExport helper: - studio_click share_option fires immediately on click with the chosen export_format and a fresh request_id. - artifact_export_result fires when the export resolves — success for sync exporters (html, markdown, template) the moment the call returns, success/failed for async exporters (pdf, zip, deploy) via .then/.catch. The same request_id threads both events so PostHog stitches click → result via $insert_id. DEPLOY_PROVIDER_OPTIONS maps to the CSV's vercel / cloudflare_pages slots; markdown is now a first-class export_format value. 
Also ignore .env.local so local POSTHOG_KEY / .env-style secrets don't get committed. * feat(analytics): emit run_created and run_finished from the daemon POST /api/runs now reads the analytics context off the x-od-analytics-* headers the web client sets on every fetch, then: - Captures run_created with project_id, conversation_id, run_id, model_id, agent_provider_id (mapped via agentIdToTracking), skill_id, design_system_id, plus the token_count_source marker. - Schedules a run_finished capture on runs.wait(run) resolution, mapping succeeded/canceled/failed to success/cancelled/failed and reporting total_duration_ms. Both events use a stable insert_id derived from the same uuid so PostHog dedupes the daemon-side mirror against any future web-side capture without double-counting. Token sub-fields (user_query_tokens/system_prompt_tokens/...) stay omitted in v1 — the claude-stream parser only exposes input/output totals today. See tracking-doc-issues.md §3.2. * feat(analytics): emit settings_cli_test_result + settings_byok_test_result The original BLOCKING-list assumed these CSV P0 events were not implementable in this branch because main lacked Test buttons. The worktree HEAD actually wires `handleTestAgent` and `handleTestProvider` in SettingsDialog, so both events are now in scope. - handleTestAgent emits settings_cli_test_result on success and failure paths with cli_provider_id mapped via agentIdToTracking, result drawn from result.ok / catch branch, error_code from result.kind or the thrown error name, and duration_ms timed via performance.now(). - handleTestProvider emits settings_byok_test_result analogously, using apiProtocol (anthropic\|openai\|azure\|ollama\|google) directly as provider_id — wider than the CSV's 5-value enum, documented in tracking-doc-issues.md §2.5. Contracts: add SettingsCliTestResultProps / SettingsByokTestResultProps plus matching track* helpers. AnalyticsEventName union now covers all 14 P0 events this branch supports. 
* feat(analytics): gate PostHog on the existing telemetry.metrics consent The integration now reuses the same first-launch privacy banner + Settings → Privacy toggle that gates Langfuse, so a single user decision controls both telemetry sinks. - /api/analytics/config now consults the persisted AppConfigPrefs: it returns enabled=true only when POSTHOG_KEY is set AND the user has chosen "Share usage data" (telemetry.metrics === true). The response also echoes installationId so the web client uses the same anonymous id Langfuse keys off of — one identity per install, shared across both sinks. - Web AnalyticsProvider: - Bootstrap fetch resolves installationId and threads it through the x-od-analytics-anonymous-id header on every /api/* fetch, so daemon-side captures (run_created / run_finished / project_create_result) land on the same person record. - Exposes a setConsent(granted) method that calls posthog-js's opt_in_capturing / opt_out_capturing, wired from App.tsx via a useEffect watching config.telemetry?.metrics. Toggling Privacy → metrics now stops/resumes events immediately, no reload. - app_launch additionally gates on telemetry.metrics so a freshly- declined user fires nothing, and a freshly-opted-in user fires on the next reload. * feat(packaging): bake POSTHOG_KEY into packaged daemon spawn env Wires PostHog product analytics through the same Langfuse-style build- secret pipeline so official Open Design builds ship with the key while fork builds compile without it (the integration short-circuits cleanly when POSTHOG_KEY is absent). tools/pack - resolveToolPackConfig reads POSTHOG_KEY / POSTHOG_HOST from process.env at packaging time, validates them (no whitespace in the key, http(s) URL for host, trailing-slash strip), and stamps them on ToolPackConfig. Fork builds without the env vars simply omit the fields; the daemon-side gate keeps things off in that case. 
- Mac, Windows, and Linux packaged-config writers each append the two fields to open-design-config.json next to the existing telemetryRelayUrl entry. apps/packaged - RawPackagedConfig / PackagedConfig surface posthogKey / posthogHost so the Electron entry and headless entry both forward them to the daemon sidecar. - buildPackagedDaemonSpawnEnv emits POSTHOG_KEY / POSTHOG_HOST into the daemon child env when present. The daemon's existing analytics module reads these via process.env — no daemon-side changes needed. - The headless packaged path falls back to process.env for fields the builder hasn't injected, mirroring how OPEN_DESIGN_TELEMETRY_RELAY_URL is read there. CI - release-beta.yml and release-stable.yml expose POSTHOG_KEY (secret) and POSTHOG_HOST (var) at workflow-env scope so every packaging job inherits them. PR / fork builds without these set simply skip the bake step. Tests - tools/pack: config.test.ts covers bake-through, fork-build omission, whitespace rejection, invalid-URL rejection, and trailing-slash normalization. - apps/packaged: sidecars.test.ts covers buildPackagedDaemonSpawnEnv forwarding the keys when present and omitting them when null. * feat(analytics): enable PostHog autocapture + perf + exceptions Flip on the PostHog SDK's automatic diagnostic features so we capture click paths, page transitions, web vitals, dead clicks, and browser exceptions without scattering instrumentation through the codebase. Privacy defense lives in one place — apps/web/src/analytics/scrub.ts — wired in via posthog-js's `before_send` hook so every outgoing event passes through the same audit point: - $autocapture / $rageclick / $dead_click / $copy_autocapture: strips $el_text and value/placeholder/aria-label attrs from any input, textarea, password input, or contenteditable element. 
PostHog autocapture does not capture input.value by default, but $el_text on a <textarea> reflects the typed content — that's the prompt body for us, so it has to be scrubbed every time. - $pageview / $pageleave: drops query string and fragment from $current_url / $referrer so any future ?q=… can't leak. - $exception: rewrites file:// and absolute filesystem paths in stack frames to app://apps/<repo-relative> so we don't ship the user's home directory. - Suppresses $opt_in entirely — duplicate of our explicit setConsent toggle in App.tsx. Element-level defense in depth is limited to the single most sensitive surface: the chat composer textarea gets `ph-no-capture` so PostHog never even generates an event for clicks inside that subtree. Every other input relies on scrub.ts — sprinkling the class through every form would be noisy and easy to forget on new surfaces. The existing Privacy → "Share usage data" toggle continues to gate every new feature: posthog-js's opt_out_capturing() halts autocapture, $pageview, $exception, web vitals, and dead clicks alongside the explicit capture() calls — one global switch. 11 unit tests pin the scrub rules in apps/web/tests/analytics-scrub.test.ts. * ci(nix): bump pnpmDepsHash for posthog-js + posthog-node additions Adding posthog-js to apps/web and posthog-node to apps/daemon changed pnpm-lock.yaml, which Nix's fixed-output pnpmDeps derivation pins by sha256. The CI nix flake check failed with: specified: sha256-KF3Mld72/iau+pJmA7HvnanRx8VLtDP0N624SKrtrrc= got: sha256-PGFgX4lYyeH2TRAXfUq52A3EOa6bb1gO59hPsXhEk3s= Copy the new hash into both nix/package-web.nix and nix/package-daemon.nix per the procedure documented in nix/README.md §"First-build hash pinning". * feat(analytics): unify PostHog identity with Langfuse installationId PostHog's distinct_id is the installationId stamped by /api/analytics/ config; Langfuse already reads the same id off app-config.json to populate trace.userId. 
With both sinks keying off the same anonymous identity, dashboards can correlate user actions (PostHog events) with LLM runs (Langfuse traces) without re-identifying. Two gaps closed: 1. applyConsent(false) — clear posthog-js's persisted ph__posthog localStorage entry on opt-out via posthog.reset(). Without this, a user who opts out, then clicks Delete my data, then re-opts in would see PostHog stitch their new session to the deleted identity because bootstrap.distinctID only takes effect on first init. 2. applyIdentity(newInstallationId) — Delete my data rotates the installationId in app-config; App.tsx now watches config.installationId and calls posthog.reset() then identify(newId) so the next event batch is fully decoupled from the deleted one. Idempotent on same-id re-renders so benign config refreshes don't churn PostHog identities. The fetch wrapper's x-od-analytics-anonymous-id header also flips to the new id on rotation so daemon-side captures (run_created / run_finished) land on the same person record from the very next API call, not after a reload. The end-to-end rotation flow is verified against a live PostHog project; these unit tests pin the safety guards (no-client paths, null inputs) since stubbing posthog-js's init-loaded callback chain is brittle. fix(langfuse): require both metrics AND content consent for trace reports Tightens the Langfuse gate so a user who shares anonymous metrics but NOT conversation content stops emitting Langfuse traces entirely — Langfuse is used for turn-quality evals which only make sense with prompt/output bodies. PostHog (product analytics, content-free) stays gated on `metrics` alone and is unaffected. i18n: "Conversation content" → "Conversation and tool content" with hints expanded to mention tool inputs/outputs so the consent surface matches what the trace actually carries (en + zh-CN). 
Bundled here per PR scope — change originated outside this PostHog PR but lands cleanly on the same files; gating Langfuse strictly on `content` makes the dual-sink consent model (PostHog = metrics, Langfuse = metrics + content) symmetric across both i18n locales and the daemon-side gate. * feat(analytics): wire byok_provider_option + fix PR review P1s Adds the BYOK protocol-chip click event (5-value provider_id mirroring the apiProtocol Settings UI) and resolves four P1 review threads on PR #1428. byok_provider_option: - New SettingsClickByokProviderOptionProps in contracts (provider_id = anthropic\|openai\|azure\|google\|ollama; maps to CSV's 5 values per tracking-doc-issues.md §2.5). - trackSettingsClickByokProviderOption helper in apps/web/src/analytics. - SettingsDialog hooks it on the protocol-chip onClick alongside the existing setApiProtocol call; is_selected reflects whether the chip was already active. Review fixes: 1. client.ts (Siri-Ray): clear `initPromise` when the resolution is null so a Privacy → metrics opt-in after a previous decline triggers a fresh /api/analytics/config fetch. Without this, the disabled response was cached forever — first-session opt-in needed a reload to start sending PostHog events. 2. provider.tsx (Siri-Ray): replace `url.includes('/api/')` with a strict same-origin + /api/ pathname check (shared `isSameOriginApiCall` helper). Outbound third-party URLs containing `/api/` (e.g. provider.example.com/api/x) no longer receive our x-od-analytics-* headers. 3. provider.tsx (codex-connector, lefarcen): gate header injection on `resolvedAnonId` being non-null. When Privacy → metrics is off, /api/analytics/config returns enabled=false → resolvedAnonId stays null → wrapper never installs → daemon can't read consent-bearing headers → no daemon-side PostHog event. setConsent now also clears resolvedAnonId on opt-out and re-fetches on opt-in. 4. 
daemon/analytics.ts (defense in depth): createAnalyticsService now takes dataDir and capture() re-reads app-config to check telemetry.metrics inside the fire-and-forget wrapper. Even if a stale header somehow reaches the daemon after opt-out, the capture is dropped before posthog-node.capture is called. * fix(web): place "Share usage data" on the right in privacy consent banner Swap button order in PrivacyConsentModal and the in-settings ConsentCard so the affirmative "Share usage data" lands on the right and "Not now" on the left. Matches the OK-on-the-right pattern users expect for primary actions. Both buttons keep equal visual prominence (same .privacy-consent-action styling) so the swap doesn't change the EDPB equal-prominence stance called out in the original Langfuse telemetry spec. * feat(analytics): populate run_finished token totals from claude-stream usage Daemon's claude-stream parser already emits agent usage events with input_tokens / output_tokens totals; the run service buffers them in run.events and Langfuse reads them out the same way. The run_finished PostHog event was leaving these fields empty. Scan run.events for the most recent agent usage frame on terminal transition and emit input_tokens / output_tokens / total_tokens when present. token_count_source flips to 'provider_usage' only when at least one count landed; runs without provider-side usage data keep 'unknown'. Provider does not break the input down into the 7 sub-fields the tracking doc lists (memory / context / attachment / system_prompt / …); those stay omitted until a parser change exposes them. * feat(analytics): estimate user_query_tokens from prompt length The user_query_tokens field for run_created / run_finished was hardcoded to 0. We can't tokenize without bundling a model-specific tokenizer, but the character/4 heuristic is the industry-standard estimate when one isn't available and is enough for funnel analysis (prompt-length cohorts, short-vs-long-query conversion rates). 
Extracted from req.body via the same telemetryPromptFromRunRequest pattern the daemon already uses for langfuse-bridge (currentPrompt then message fallback). Only the integer count goes to PostHog — the prompt text itself never leaves the daemon. token_count_source flips appropriately: - run_created with a prompt: 'estimated' (was 'unknown') - run_created with no prompt: 'unknown' - run_finished with provider usage: 'provider_usage' (overrides baseProps' 'estimated' value) - run_finished without provider usage: inherits 'estimated' or 'unknown' from baseProps so input/output absent doesn't mask the estimate.	2026-05-12 22:32:42 +08:00
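The `token_count_source` rules listed at the end of this commit can be sketched as below. This is a minimal illustration and not the daemon's code: the helper names, the `RunTokenProps` shape, rounding the character/4 estimate up, and computing `total_tokens` as the sum of the counts present are all assumptions.

```typescript
// Sketch only: names and shapes are hypothetical, not the daemon's actual types.
type TokenCountSource = 'unknown' | 'estimated' | 'provider_usage';

interface RunTokenProps {
  user_query_tokens?: number;
  input_tokens?: number;
  output_tokens?: number;
  total_tokens?: number;
  token_count_source: TokenCountSource;
}

interface ProviderUsage {
  input_tokens?: number;
  output_tokens?: number;
}

// character/4 heuristic; rounding up is an assumption of this sketch.
function estimateUserQueryTokens(prompt?: string): number {
  if (!prompt) return 0;
  return Math.ceil(prompt.length / 4);
}

// run_created: 'estimated' when a prompt exists, otherwise 'unknown'.
function runCreatedTokenProps(prompt?: string): RunTokenProps {
  const tokens = estimateUserQueryTokens(prompt);
  if (tokens === 0) return { token_count_source: 'unknown' };
  return { user_query_tokens: tokens, token_count_source: 'estimated' };
}

// run_finished: provider usage overrides the base source; without usage,
// the base props (and their 'estimated' or 'unknown' source) are inherited.
function runFinishedTokenProps(base: RunTokenProps, usage?: ProviderUsage): RunTokenProps {
  if (!usage || (usage.input_tokens === undefined && usage.output_tokens === undefined)) {
    return { ...base };
  }
  const result: RunTokenProps = { ...base, token_count_source: 'provider_usage' };
  if (usage.input_tokens !== undefined) result.input_tokens = usage.input_tokens;
  if (usage.output_tokens !== undefined) result.output_tokens = usage.output_tokens;
  // Assumption: total is the sum of whichever counts the provider reported.
  result.total_tokens = (usage.input_tokens ?? 0) + (usage.output_tokens ?? 0);
  return result;
}
```

Only the integer counts appear in the returned props, which matches the commit's note that the prompt text itself never leaves the daemon.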
Tom Huang	e254d1280b	feat(memory): auto-memory store with chat-protocol-aware extraction (#999 ) * feat(memory): auto-memory store with chat-protocol-aware extraction Markdown memory store at <dataDir>/memory/ with two extractors — heuristic regex for explicit "remember:" / "我是 X" markers, and a small-model LLM pass after each turn — folded into the system prompt so cross-chat preferences, role, and ongoing-work context survive restarts. Settings UI: - Memory tab lists entries, exposes a hand-edited MEMORY.md index, and shows an extraction history with per-attempt phase/skip/failure rows. - Memory model picker is inline next to the chat model picker (CLI and BYOK) so the choice "which fast model mines facts each turn?" sits next to the chat-model decision instead of a separate panel. The picker reuses the same SUGGESTED_MODELS table and "Custom..." pattern the chat picker uses. LLM extractor supports all four protocols (anthropic / openai / azure / google); pickProvider takes the chat agent id from the chat handler and constrains its auto-pick to the chat's protocol family — Claude Code chats no longer surprise users by silently extracting on whatever OpenAI key happens to be in media-config. When no matching key is configured the attempt records as 'skipped: no-provider' instead of quietly switching vendors. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): keep hint outside <label> and disambiguate Model selectors The inline Memory model picker wrapped its hint paragraph inside the <label>, which made the hint's "API key" / "model" wording bleed into the <select>'s accessible name and broke Playwright's getByLabel('API key') / getByLabel('Model') strict-mode matching in the existing settings-api-protocol e2e suite. - Move the hint <p> out of the <label> in MemoryModelInline so the select's accessible name is just "Memory model". 
- Switch the chat-Model selectors in settings-api-protocol.test.ts from getByLabel('Model') to getByRole('combobox', { name: 'Model', exact: true }) so they no longer collide with the new "Memory model" select that sits next to the chat Model picker. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): address review changes — BYOK wiring, MEMORY.md index, /v1, label wrapper Addresses the four blocking review threads on PR #999. 1. MemoryModelInline accessibility (mrcfps) The inline picker still wrapped its select + custom input + flash + hint inside a single <label>, which made the select's accessible name absorb every text descendant — including the "API key" / "model" hint copy. The previous fix moved only the hint outside; the reviewer asked for a non-label wrapper. Switch to <div className="field"> and associate just the short title with the controls via `aria-labelledby` / `aria-label`. The select's accessible name is now exactly "Memory model" so `getByLabel` strict-mode locators on the surrounding chat form stop cross-matching the memory copy. 2. Respect the hand-edited MEMORY.md index (mrcfps + codex) `composeMemoryBody()` was reading every .md file in the memory dir, ignoring the index. Removing a `- [Name](id.md)` line had no effect on future prompts. Parse the index's `INDEX_LINK_RE` bullets and filter `listMemoryEntries()` to the linked id set, so the editor's "delete this line to disable injection" promise actually holds. 3. Versioned OpenAI-compatible base URLs (codex) `callOpenAI` and `callAnthropic` hard-coded `/v1` onto `provider.baseUrl`, breaking custom endpoints whose saved URL already includes `/v1` (`/v1/v1/chat/completions`). Apply the same conditional `appendVersionedApiPath` helper the chat proxy and connection-test routes already use. 4. Wire memory into BYOK / API-mode chats (mrcfps + codex) The previous PR's daemon-only memory hook never fired for BYOK, leaving the Memory tab + model picker as a no-op for that mode. 
Add the missing surface and wire it through ProjectView: - contracts: extend `composeSystemPrompt` with `memoryBody`, mirroring the daemon's local composer; add `MemorySystemPromptResponse` and the `attemptedLLM` flag on `ExtractMemoryResponse`. - daemon: expose `GET /api/memory/system-prompt` (returns the composed body) and turn `POST /api/memory/extract` into a two-phase endpoint — heuristic-only when only userMessage is supplied (pre-turn), LLM-only when assistantMessage is also supplied (post-turn), so the extraction-history doesn't double up. - web: ProjectView's BYOK branch now fetches the memory body before composing the system prompt, runs the heuristic extractor before the run (so "remember:" markers in this turn reach this turn's prompt), accumulates assistant text during streaming, and queues the LLM extractor on `onDone` — fire-and- forget so it never blocks the chat round-trip. Co-authored-by: Cursor <cursoragent@cursor.com> fix(memory): re-sync BYOK memory override when chat config drifts The inline memory-model picker captured `apiProtocol` / `chatApiKey` / `chatBaseUrl` / `chatApiVersion` into the saved override only at the moment the user clicked a model. If they later swapped the BYOK protocol tab, rotated the API key, or edited the base URL in the same settings flow, the daemon's background extractor kept calling the old vendor / credential — directly contradicting the picker's "borrows the surrounding chat picker's protocol, key, base URL, and api-version automatically" promise. Add a debounced effect that compares the persisted (masked) shape against the live chat props and re-PATCHes /api/memory/config when they drift. The masked config exposes `apiKeyTail` (last 4 chars), so key rotation is detectable without ever round-tripping the secret back to the browser. 
The 300 ms debounce coalesces the keystroke- granularity prop updates the parent settings dialog streams during its autosave loop, so a user editing the base URL doesn't trigger one PATCH per character. Background re-syncs are silent — the "Saved!" flash only fires for explicit user clicks, so the picker doesn't feel like it's fighting them as they edit unrelated chat fields. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): thread BYOK chat config through /api/memory/extract default path Leaving the BYOK memory picker on "Same as chat" still broke the default LLM extraction path: `MemoryModelInline` clears the override for that option, both `/api/memory/extract` calls in `ProjectView` only sent the messages, and the daemon never persists BYOK creds, so `extractWithLLM(..., { chatAgentId: null })` always reached `pickProvider()` with no chat context and fell through to env / media-config — the wrong vendor for a BYOK chat that works for inference. Thread the live BYOK chat config through the extract endpoint as a per-call snapshot: - contracts: extend `ExtractMemoryRequest` with an optional `chatProvider` (provider/apiKey/baseUrl/apiVersion/model) and add `'chat-byok'` to the credentialSource enum. - daemon: parse + validate `chatProvider` on `/api/memory/extract` (provider must be one of the five known shapes) and forward to `extractWithLLM` as a new option. `pickProvider()` gets a new path 2 that uses the snapshot directly with the per-protocol fast-model default — so a memory pass on `gpt-4o` / `claude-sonnet-4-5` silently turns into a cheap `gpt-4o-mini` / `claude-haiku-4-5` call instead of paying chat-tier rates for sediment work. Override and CLI-agent-constrained paths still win when they apply. - web: `ProjectView` snapshots `apiProtocol` / `apiKey` / `baseUrl` / `apiVersion` from the live `AppConfig` on each BYOK extract call (both pre-turn heuristic-only and post-turn LLM phases). 
The picker's existing drift-resync effect already covers explicit overrides; this snapshot covers the implicit "Same as chat" default that the override flow can't reach. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): treat empty apiKey on PATCH as a real clear MemoryModelInline silently re-PATCHes /api/memory/config whenever the surrounding BYOK chat creds drift. The previous reuse branch lumped `apiKey === ''` together with `apiKey === undefined`, so clearing the chat API key from the picker quietly preserved the old daemon-side secret and kept calling the provider on a stale credential. Distinguish four states for the apiKey field: - absent -> preserve stored secret (form re-save without re-typing) - '' -> clear stored secret (user removed it from the picker) - 'sk-...' -> replace - new provider -> ignore stored secret entirely Add tests/memory-config-route.test.ts covering all four cases. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 15:45:42 +08:00
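The four `apiKey` states described in the last fix above could be resolved roughly as follows. This is a sketch under stated assumptions: `resolveApiKey`, `StoredMemoryConfig`, and `MemoryConfigPatch` are hypothetical names, and the real PATCH handler for `/api/memory/config` may be structured differently.

```typescript
// Sketch only: hypothetical shapes, not the daemon's actual route types.
interface StoredMemoryConfig {
  provider: string;
  apiKey?: string;
}

interface MemoryConfigPatch {
  provider: string;
  apiKey?: string; // absent, '' or a new secret
}

// Returns the secret that should be persisted after applying the patch.
function resolveApiKey(stored: StoredMemoryConfig | null, patch: MemoryConfigPatch): string | undefined {
  // New provider: ignore the stored secret entirely.
  if (stored === null || stored.provider !== patch.provider) {
    return patch.apiKey ? patch.apiKey : undefined;
  }
  // Absent: preserve the stored secret (form re-save without re-typing).
  if (patch.apiKey === undefined) return stored.apiKey;
  // Empty string: the user removed the key, so clear the stored secret.
  if (patch.apiKey === '') return undefined;
  // Anything else replaces the stored secret.
  return patch.apiKey;
}
```

Keeping `undefined` and `''` as separate branches is the point of the fix: the earlier behavior treated both as "preserve", which left a stale credential in use.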
Cursor Agent	e6eaa62294	feat(plugins): handoff atom + ArtifactManifest provenance fields Plan N3 / spec §11.5.1 / §21.5. @open-design/contracts ArtifactManifest gains the spec §11.5.1 provenance + downstream-distribution surface as additive optional fields: sourcePluginSnapshotId / sourcePluginId / sourcePluginVersion / sourceTaskKind / sourceRunId / sourceProjectId / parentArtifactId artifactKind / renderKind / handoffKind exportTargets[] / deployTargets[] Spec §11.5.1 invariants: - sourcePluginSnapshotId NEVER changes after first write. - exportTargets[] / deployTargets[] are append-only. - handoffKind promotes monotonically along design-only < implementation-plan < patch < deployable-app. apps/daemon/src/plugins/atoms/handoff.ts ships the daemon-side helper: recordHandoff({ manifest, exportTarget?, deployTarget?, handoffKind?, enforceMonotonicHandoff? }) → { manifest, changed } - Idempotent: a (surface, target) pair only ever lands once on exportTargets[]; same for (provider, location) on deployTargets[]. - handoffKind defaults to monotonic; pass enforceMonotonicHandoff: false on a rollback path. isDeployableAppEligible({ manifest, buildPassing, testsPassing }) → boolean Spec §11.5.1 promotion rule for the deployable-app tier: requires build.passing + tests.passing AND at least one exportTargets[] entry on docker / cli surface. Centralises the rule so plugins don't reimplement it. packages/contracts/src/index.ts now uses .js extensions on every re-export so the daemon's NodeNext moduleResolution picks up the new types end-to-end. Daemon tests: 1534 → 1543 (+9 cases on plugins-handoff: appends exportTargets / deployTargets, idempotency, monotonic handoffKind promotion, downgrade refusal vs. rollback escape, deployable-app eligibility rule). Co-authored-by: Tom Huang <1043269994@qq.com>	2026-05-09 14:48:29 +00:00
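The invariants in this commit (idempotent appends, monotonic `handoffKind` promotion, and the deployable-app rule) can be illustrated with a small sketch. It is not the code in `apps/daemon/src/plugins/atoms/handoff.ts`: the `*Sketch` names and the reduced manifest shape are hypothetical, and treating a refused downgrade as a silent no-op (rather than an error) is an assumption.

```typescript
// Sketch only: a reduced, hypothetical manifest shape for illustration.
type HandoffKind = 'design-only' | 'implementation-plan' | 'patch' | 'deployable-app';
const HANDOFF_ORDER: HandoffKind[] = ['design-only', 'implementation-plan', 'patch', 'deployable-app'];

interface ExportTarget { surface: string; target: string }
interface DeployTarget { provider: string; location: string }

interface ManifestSketch {
  handoffKind?: HandoffKind;
  exportTargets?: ExportTarget[];
  deployTargets?: DeployTarget[];
}

interface HandoffInput {
  manifest: ManifestSketch;
  exportTarget?: ExportTarget;
  deployTarget?: DeployTarget;
  handoffKind?: HandoffKind;
  enforceMonotonicHandoff?: boolean;
}

function recordHandoffSketch(input: HandoffInput): { manifest: ManifestSketch; changed: boolean } {
  const exportTargets = [...(input.manifest.exportTargets ?? [])];
  const deployTargets = [...(input.manifest.deployTargets ?? [])];
  let handoffKind = input.manifest.handoffKind;
  let changed = false;

  // Append-only and idempotent: a (surface, target) pair lands at most once.
  const e = input.exportTarget;
  if (e && !exportTargets.some((t) => t.surface === e.surface && t.target === e.target)) {
    exportTargets.push(e);
    changed = true;
  }
  // Same rule for (provider, location) on deploy targets.
  const d = input.deployTarget;
  if (d && !deployTargets.some((t) => t.provider === d.provider && t.location === d.location)) {
    deployTargets.push(d);
    changed = true;
  }
  // Monotonic by default; a rollback path passes enforceMonotonicHandoff: false.
  const next = input.handoffKind;
  if (next && next !== handoffKind) {
    const enforce = input.enforceMonotonicHandoff ?? true;
    const isPromotion =
      handoffKind === undefined || HANDOFF_ORDER.indexOf(next) > HANDOFF_ORDER.indexOf(handoffKind);
    if (isPromotion || !enforce) {
      handoffKind = next;
      changed = true;
    }
  }
  return { manifest: { ...input.manifest, exportTargets, deployTargets, handoffKind }, changed };
}

// Deployable-app tier: build and tests pass AND an export on a docker / cli surface exists.
function isDeployableAppEligibleSketch(args: {
  manifest: ManifestSketch;
  buildPassing: boolean;
  testsPassing: boolean;
}): boolean {
  const hasRunnableExport = (args.manifest.exportTargets ?? []).some(
    (t) => t.surface === 'docker' || t.surface === 'cli',
  );
  return args.buildPassing && args.testsPassing && hasRunnableExport;
}
```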
Demoniooo	617fb043fe	feat(settings): add fetch models button for BYOK providers (#1034) * feat(settings): add fetch models button for BYOK providers * fix(settings): exclude Ollama from fetch models, add manual-entry hint * fix(provider-models): classify non-JSON upstream errors by HTTP status * fix(i18n): drop redundant English overrides from non-English locales * fix(provider-models): allow ollama through allowlist, return unsupported_protocol --------- Co-authored-by: haolin122 <hl6593@nyu.edu>	2026-05-09 22:28:03 +08:00
Cursor Agent	847304ebc5	feat(plugins): atom SKILL.md body loader + renderActiveStageBlock (spec §23.4) Plan J3 / spec §23.3.2 patch 2 / §23.4. Lays the substrate slice for migrating prompt fragments out of `apps/daemon/src/prompts/system.ts` and into the bundled atom SKILL.md bodies registered by §3.I3. apps/daemon/src/plugins/atom-bodies.ts owns the daemon-side loader: loadAtomBodies(db, atomIds) → AtomBodyEntry[] The function looks each atom id up in installed_plugins (bundled rows win), reads the matching fsPath/SKILL.md, strips front-matter, and returns the raw body. Atoms with no installed plugin or unreadable SKILL.md are silently skipped; the caller drops empty entries from the prompt. packages/contracts/src/prompts/atom-block.ts ships the pure renderer: renderActiveStageBlock({ stageId, bodies, iteration? }) → string Mirrors spec §23.4's composeSystemPrompt sketch. Empty bodies return ''; multiple bodies are separated by '---' with no trailing separator. Lives in contracts so the daemon-side composer and any future contracts-side composer share one definition (§11.8 PB1 single-import guarantee). The composeSystemPrompt() rewiring itself is the next PR; this commit leaves that PR no scaffolding to build: the helpers are reachable, tested, and the bundled atom plugins from §3.I3 already have the matching SKILL.md bodies on disk. Tests: contracts 8 → 12 (+4 cases on atom-block); daemon 1482 → 1486 (+4 cases on plugins-atom-bodies covering the end-to-end loadAtomBodies → renderActiveStageBlock path). Co-authored-by: Tom Huang <1043269994@qq.com>	2026-05-09 13:15:52 +00:00
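The two behaviors this commit pins down (front-matter stripping in the loader, and '---' separation with no trailing separator in the renderer) can be sketched as follows. The helpers `stripFrontMatter` and `joinAtomBodies` are hypothetical stand-ins, not the real `loadAtomBodies` / `renderActiveStageBlock`; the blank lines around the separator and the omission of any stage header are assumptions.

```typescript
// Sketch only: hypothetical helpers illustrating the documented behavior.

// Drops a leading YAML front-matter block delimited by '---' lines, if present.
function stripFrontMatter(markdown: string): string {
  return markdown.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, '');
}

// Empty input renders as ''; multiple bodies are joined by '---'
// with no trailing separator. Blank bodies are dropped first.
function joinAtomBodies(bodies: string[]): string {
  const nonEmpty = bodies.map((b) => b.trim()).filter((b) => b.length > 0);
  if (nonEmpty.length === 0) return '';
  return nonEmpty.join('\n\n---\n\n');
}
```

Using `join` rather than appending a separator after each body is what guarantees there is no trailing '---'.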
Tom Huang	643d0cf637	feat: add scheduled routines for unattended agent runs (#1033 ) * feat: add scheduled routines for unattended agent runs Generalizes Orbit's single hard-coded daily-digest scheduler into user-defined routines: each one fires on a schedule (hourly / daily / weekdays / weekly with IANA timezone) and starts a fresh agent conversation, either inside an existing project or in a new project minted on the spot. Backend: - New RoutineService with timezone-aware nextRunAt computed via Intl.DateTimeFormat (no new dependency); two-pass tzWallToUtc so DST transitions stay correct. Each fire chains rescheduleOne in finally() to keep the cadence alive. - routines + routine_runs SQLite tables; schedule_json is the authoritative form, with legacy schedule_kind/value kept populated. - /api/routines CRUD + /api/routines/:id/run + /api/routines/:id/runs. - Run handler resolves agent (routine override -> app config -> first available), creates project (or reuses configured one) and a fresh conversation per fire, then dispatches into startChatRun. UI (Settings -> Routines): - Pill-chip schedule kind picker, time + timezone fields, weekday picker for Weekly. Live preview line ("Runs daily at 9:00 AM GMT+8"). - Routine list with inline status pill, next/last meta, expandable run history; each history row links into the project the run wrote to via the existing router primitive. * fix(daemon): swallow trailing finally rejection for inflight cleanup Without a terminal `.catch`, the promise returned by `promise.finally(...)` mirrors the original rejection and produces an unhandled rejection — fatal in modern Node — when the run handler rejects before producing a start handle. Callers still see the rejection on the returned `promise`. 
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(daemon): handle DST spring-forward gap in tzWallToUtc The two-pass conversion picked the pre-gap candidate when the requested wall time fell inside a spring-forward gap (e.g. 02:30 in America/New_York on 2026-03-08), so the resulting instant rendered back as 01:30 local and a 02:30 routine fired an hour early on the transition day. Routines are local wall-clock schedules, so firing before the requested time breaks the contract. Now we round-trip both candidates through partsInTimezone, return the one whose wall-clock matches the request, and on a gap day where neither matches return the later candidate so the routine fires at the first valid post-gap instant on the same day. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(daemon): preserve both wall-time candidates on DST fall-back day On a fall-back day, the requested wall time inside the repeated hour (e.g. 01:30 America/New_York) maps to two distinct UTC instants. The previous tzWallToUtc collapsed them to the first (pre-transition) one, so a daemon that woke between the two instants would skip the second 01:30 entirely and fire a day late once per fall-back. Replace it with tzWallToUtcCandidates (returns all valid instants, ascending) plus a gap-only fallback for spring-forward, and have nextWallTimeMatching walk both ambiguous candidates before advancing to the next day. Adds fixtures for the repeated-hour case so the intended behavior stays locked in. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(web): make routine timezone picker IANA-complete and DST-truthful The timezone dropdown was a hardcoded subset, but the backend validator accepts any IANA zone — so users could not pick zones like `America/Phoenix` or `Africa/Johannesburg` unless they happened to be local. 
And `gmtLabel()` always derived the offset from `new Date()`, which drifted seasonally for DST-observing zones (a New York routine created in winter rendered `GMT-5` while it would actually fire on `GMT-4` after DST started). Source the picker from `Intl.supportedValuesOf('timeZone')` (with a curated fallback for older runtimes) and anchor the GMT label to the routine's next fire time. When the next fire time is unknown (e.g. the live preview while the form is open) and in the dropdown itself, fall back to the IANA city, which is stable year-round. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(web): always include UTC in routine timezone picker `Intl.supportedValuesOf('timeZone')` returns only canonical region names on current runtimes (Node 24, recent browsers) and omits `UTC`, so the previous picker dropped the most common non-local zone unless the runtime itself was already UTC. The backend validator and the contract examples still accept `UTC`, so a user on a non-UTC machine could not create a documented UTC routine from Settings. Prepend `UTC` inside `listSupportedTimezones()` when the runtime list omits it, so the picker stays aligned with the supported schedule surface. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)	2026-05-09 19:30:22 +08:00
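The DST handling described in this commit (return every UTC instant whose wall clock matches the request, and fall back to the later candidate inside a spring-forward gap) can be sketched with `Intl.DateTimeFormat` alone. The helper names `wallPartsAt` and `wallToUtcCandidates` are stand-ins for the daemon's `partsInTimezone` and `tzWallToUtcCandidates`; their bodies here are a guess at the approach, and probing the zone offset one day either side of the target is an assumption of this sketch rather than something the commit states.

```typescript
interface WallTime {
  year: number;
  month: number; // 1-12
  day: number;
  hour: number;
  minute: number;
}

// Wall-clock fields of a UTC instant as seen in an IANA zone.
function wallPartsAt(utcMs: number, timeZone: string): WallTime {
  const fmt = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour12: false,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    hour: '2-digit',
    minute: '2-digit',
  });
  const fields: Record<string, number> = {};
  for (const part of fmt.formatToParts(new Date(utcMs))) {
    if (part.type !== 'literal') fields[part.type] = Number(part.value);
  }
  return {
    year: fields['year'] ?? 0,
    month: fields['month'] ?? 0,
    day: fields['day'] ?? 0,
    hour: (fields['hour'] ?? 0) % 24, // some runtimes print midnight as "24"
    minute: fields['minute'] ?? 0,
  };
}

function wallAsUtcMs(w: WallTime): number {
  return Date.UTC(w.year, w.month - 1, w.day, w.hour, w.minute);
}

function sameWall(a: WallTime, b: WallTime): boolean {
  return wallAsUtcMs(a) === wallAsUtcMs(b);
}

// All UTC instants (ascending) whose wall clock in `timeZone` equals `wall`.
// Inside a spring-forward gap nothing matches, so the later candidate is returned.
function wallToUtcCandidates(wall: WallTime, timeZone: string): number[] {
  const naive = wallAsUtcMs(wall);
  const DAY_MS = 24 * 60 * 60 * 1000;
  const candidates: number[] = [];
  // Probe the zone offset on both sides of any transition near the target.
  for (const probe of [naive - DAY_MS, naive + DAY_MS]) {
    const offsetMs = wallAsUtcMs(wallPartsAt(probe, timeZone)) - probe;
    const candidate = naive - offsetMs;
    if (candidates.indexOf(candidate) === -1) candidates.push(candidate);
  }
  candidates.sort((a, b) => a - b);
  const valid = candidates.filter((c) => sameWall(wallPartsAt(c, timeZone), wall));
  if (valid.length > 0) return valid;
  const latest = candidates[candidates.length - 1];
  return latest === undefined ? [] : [latest];
}
```

For 01:30 in `America/New_York` on the 2026-11-01 fall-back day this yields two instants an hour apart, so a scheduler that walks the list in order can still fire on the second 01:30 if it woke after the first. For 02:30 on 2026-03-08 neither candidate round-trips, and the later one (03:30 local) is returned, so the routine does not fire before its requested time.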
pftom	89be57b2c4	feat(genui): introduce GenUI surface management and event handling - Added a new GenUI module for managing user interface surfaces, including creation, response handling, and state synchronization. - Implemented API endpoints for listing and responding to GenUI surfaces associated with runs and projects. - Introduced event types and payload helpers for GenUI surface events, enhancing the interaction model for headless operations. - Established a persistent state writer for GenUI surfaces, ensuring reliable data management and retrieval. - Enhanced the plugin system to support auto-derived OAuth prompts for required connectors, improving user experience during plugin application.	2026-05-09 18:44:04 +08:00
pftom	4c7cd5d9f2	feat(plugins): introduce plugin system with installation and management capabilities - Added support for a new plugin system, allowing users to install, uninstall, and manage plugins through the daemon. - Implemented API endpoints for listing installed plugins, retrieving plugin details, and applying plugins with input validation. - Introduced a plugin doctor feature to validate plugin manifests and check for issues before application. - Established a plugin persistence layer with SQLite migrations for managing installed plugins and their metadata. - Enhanced the CLI with commands for plugin operations, improving user interaction with the plugin ecosystem.	2026-05-09 18:24:44 +08:00
初晨	9ef136ced5	fix: sync Orbit last run with selected prompt template (#937) * fix(orbit): scope last run to selected template * fix(orbit): preserve legacy last run on upgrade * fix(orbit): pin legacy last-run fallback on refresh * fix(orbit): pin template id at run start * test(web): sync orbit fixtures with skill summary	2026-05-09 11:19:59 +08:00
Bryan A	e13adf2e63	feat(daemon): finalize design package endpoint (closes #450 ) (#832 ) * feat(daemon): scaffold /api/projects/:id/finalize/anthropic (refs #450) Phase C of the PR 2 plan for issue #450: scaffold the route + module shape so subsequent phases (D-I) land function bodies and tests against a stable surface that already passes typecheck. What lands here: - apps/daemon/src/finalize-design.ts: module-level constants (DEFAULT_BASE_URL, DEFAULT_MAX_TOKENS=16000, INPUT_BODY_CAP_BYTES=384KiB, LOCK_FILENAME=.finalize.lock, OUTPUT_FILENAME=DESIGN.md, DEFAULT_TIMEOUT_MS=120s); inline interfaces for the request/response shape (kept out of packages/contracts per scope rules); two error classes - FinalizePackageLockedError (mirrors PR #493's TranscriptExportLockedError) and FinalizeUpstreamError (carries upstream HTTP status for the route's error mapping); function stub that throws "not yet implemented". - apps/daemon/tests/finalize-design.test.ts: vitest harness with describe.skip placeholder so the file imports cleanly. Real cases land in phases D-I. Default-import of node:fs (per memory: vi.spyOn cannot redefine on the frozen ESM Module Namespace; CJS exports object is mutable). - apps/daemon/src/server.ts: route handler at POST /api/projects/:id/finalize/anthropic, slotted next to the existing :id/deploy* family. Validates apiKey/model non-empty, optional baseUrl via the existing validateExternalApiBaseUrl closure (forbidden -> 403, invalid -> 400), optional maxTokens positive number; calls getProject (404 on miss); calls finalizeDesignPackage (which throws, caught and mapped to 500 for now); maps known error classes (FinalizePackageLockedError -> 409, FinalizeUpstreamError -> 502) pre-emptively. Path shape rationale (Bryan-confirmed): project-scoped path matches every sibling /api/projects/:id/* route in server.ts (deploy, deployments, deploy/preflight); provider-namespaced segment leaves a clean expansion line for /api/projects/:id/finalize/openai etc. 
as follow-ups. Field-name rationale: apiKey, baseUrl, model, maxTokens match ProxyStreamRequest verbatim (packages/contracts/src/api/proxy.ts:8-19) so a future caller can reuse the same body shape. baseUrl is optional here (intentional divergence from the proxy at server.ts which requires it) so standard Anthropic users do not need to set it; Bedrock / self-hosted-proxy users still can. Verification: pnpm --filter @open-design/daemon typecheck exits 0; finalize-design.test.ts loads cleanly with 1 skipped placeholder; no other tests touched. Refs nexu-io/open-design#450 (PR 2 scaffold; pipeline body in subsequent commits) * feat(daemon): transcript truncation helper for /finalize prompt Phase D of the PR 2 plan for issue #450: lands the helper that bounds the transcript section of the synthesis prompt. Why this exists: real-world signal at authoring time was a local project transcript already at 3.95 MB. Anthropic's claude-opus-4-7 context cap is roughly 200K tokens (~700 KB at typical density). Inserting an unbounded transcript would 4xx upstream on the first real call. This helper keeps the on-disk .transcript.jsonl lossless (PR #493's contract) while making the prompt-inclusion bounded. Strategy: - Cap output at INPUT_BODY_CAP_BYTES (384 KiB) so the prompt has room for the system prompt + design system body + current artifact + room for the synthesis output. - Always preserve the header line - it carries projectId, schemaVersion, conversation/message counts, attachment counts; synthesis quality depends on knowing the original sizes. - Split equal byte budgets between head and tail so both project genesis and most-recent intent survive. Two thinking segments separated only by mid-session truncation lose the same kind of boundary that PR #493 preserves between thinking blocks - that's accepted; smarter semantic chunking is a follow-up. 
- Insert a single `{"kind":"truncated","reason":"size","omittedBytes":N}` sentinel JSON line between the head and tail so a synthesis consumer can detect the gap. omittedBytes is the difference between the original UTF-8 byte length and the output's UTF-8 byte length. - If the head + tail budgets together cover the whole body (e.g. all message lines are tiny), no marker is emitted - the output is the input verbatim. Tests: - "returns the input verbatim when the JSONL fits under the 384 KiB cap" pins that small transcripts pass through unchanged with no marker. - "head+tail truncates with a single marker line when the JSONL exceeds the 384 KiB cap" pins that output is bounded, header survives, exactly one marker emitted with non-zero omittedBytes, both ends of the body preserved, and at least one middle message omitted. Suite delta: +2 tests in finalize-design.test.ts. Refs nexu-io/open-design#450 * fix(daemon): resolve noUncheckedIndexedAccess in truncateTranscriptForPrompt D1 (0eaa123) shipped with `body[headIndex]` and `body[i]` typed as `string | undefined` under TypeScript's `noUncheckedIndexedAccess` strict mode. Local typecheck would have caught it but the prior verification piped through `tail` which masked the non-zero exit code of `tsc`. Coalesce each access via `?? ''` (the array is from `String.split('\n')` so undefined elements are not actually reachable; the coalesce is a type-narrowing convenience, not a behavior change). Verification: `pnpm --filter @open-design/daemon typecheck` exits 0; `pnpm --filter @open-design/daemon test finalize-design` shows 2/2 + 1 skipped, identical to the pre-fix run. Refs nexu-io/open-design#450 * feat(daemon): current-artifact resolver for /finalize Phase E of the PR 2 plan for issue #450: resolves which artifact (if any) accompanies the transcript + design system in the synthesis prompt. Priority order (Bryan-locked in plan §6): 1.
The file referenced by tabs.is_active = 1 IF an <name>.artifact.json sidecar exists on disk. Sidecar presence is the discriminator: an inferred manifest from `inferLegacyManifest` (e.g. for a bare .html with no sidecar) does NOT count, and an active tab pointing at a non-artifact file (.md, .txt) falls through. 2. Newest project file with a real .artifact.json sidecar, sorted by manifest.updatedAt descending. Files without an updatedAt sort last so legacy pre-streaming manifests do not get accidentally promoted. 3. Returns null - "no artifact in scope". The Phase H caller will emit `artifact: null` in the response and the prompt's "Current artifact" section will read "none". Sidecar presence is checked via `existsSync` on the on-disk path, NOT via the `artifactManifest` field returned by readProjectFile/listFiles (those run inferLegacyManifest as a fallback for known kinds, which would otherwise cause a bare .html with no sidecar to look like an artifact). Tests: - "returns the active-tab artifact when its sidecar is present, even if a newer artifact exists elsewhere": pinned.html (older updatedAt) is in the active tab; newer.html (newer updatedAt) is not. Resolver returns pinned.html - intent (active tab) beats recency. - "falls through to newest .artifact.json when active tab points at a non-artifact file": README.md is the active tab (no sidecar); design.html has a real sidecar. Resolver falls through and returns design.html. - "returns null when no active tab and no .artifact.json sidecars exist": only a README.md is in the project; no tabs row. Resolver returns null. Suite delta: +3 tests in finalize-design.test.ts (5 active total). Refs nexu-io/open-design#450 * feat(daemon): synthesis prompt construction for /finalize Phase F of the PR 2 plan for issue #450: builds the system + user prompts that get sent to Anthropic's Messages API in the synthesis call. Pure function; no IO, no side effects. 
System prompt (literal, stored as a module-level constant): instructs Claude to emit a DESIGN.md document with a fixed 7-heading structure (# DESIGN.md / ## Summary / ## Brand & Voice / ## Information Architecture / ## Components & Patterns / ## Visual System / ## Open Questions / ## Provenance). The Provenance section is required to list project ID, design system, current artifact, transcript message count, and the UTC generation timestamp. User prompt (built at runtime): structured payload with the truncated transcript JSONL, the design system body, and the current artifact body, each under a ## heading. Missing inputs (no design system selected, no artifact in scope) produce explicit "none" headings + parenthetical placeholder body so Claude does not hallucinate content for absent sections. Truncation is the caller's concern - this function does not re-truncate. The caller (Phase H pipeline) feeds in a JSONL that has already been bounded by truncateTranscriptForPrompt. Tests: - "includes the transcript JSONL verbatim and the generation context": pins all section headings, the transcript body verbatim, the design system body verbatim, the artifact body verbatim, and every generation-context line. - "falls back to \"none\" + parenthetical when no design system is selected": designSystemId=null and designSystemBody=null -> heading reads "## Active design system: none" with the parenthetical body. - "falls back to \"none\" + parenthetical when no artifact is in scope": artifact=null -> heading reads "## Current artifact: none" with the parenthetical body. Suite delta: +3 tests in finalize-design.test.ts (8 active total). Refs nexu-io/open-design#450 * feat(daemon): Anthropic call + retry strategy for /finalize Phase G of the PR 2 plan for issue #450: lands the upstream Claude Messages API call with a single transient-error retry, plus the response extractor that turns Anthropic's content array into the DESIGN.md body. 
What lands here: - appendVersionedApiPath: inlined from the connectionTest helper at apps/daemon/src/connectionTest.ts:188-195 (it is not exported there). Appends /v1/messages when the base URL has no /vN segment, otherwise appends /messages directly. Same semantics; ~5 lines. - callAnthropicWithRetry: POSTs to <base>/v1/messages with the canonical Anthropic headers (content-type, x-api-key, anthropic-version: 2023-06-01) and body shape ({ model, max_tokens, system, messages, stream:false }). One retry on transient (HTTP 429 or 5xx); on terminal failure throws FinalizeUpstreamError carrying the upstream HTTP status and raw body text. The route handler in Phase I maps status to AUTH_FAILED / RATE_LIMITED / UPSTREAM_FAILED and runs the body through redactSecrets before exposing it as `details`. - extractDesignMd: concatenates content[].text for every block where type === 'text', preserving order. Throws FinalizeUpstreamError(502) on three malformed-response shapes: non-object payload, missing content array, zero text blocks. The route handler maps the throw to 502 UPSTREAM_FAILED so synthesis cannot land a half-empty DESIGN.md on disk. - Test-only `_sleepMs` injection on the call params so the retry-delay sleep is instant under vitest. Default sleep uses setTimeout. Retry posture (1 retry on transient) is opinionated; the maintainer's "standard exponential backoff" answer was directional and a single retry matches the existing daemon's posture (transcript export and connectionTest do zero retries) while staying inside the daemon's blocking-fast posture for /finalize. Tests: - callAnthropicWithRetry: throws on 401 with no retry; retries once on 429 and resolves on second 200; throws after both 5xx attempts; propagates AbortError when signal is pre-aborted. - extractDesignMd: concatenates ordered text blocks; throws on missing content array; throws on content with zero text blocks. 
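The `extractDesignMd` behaviour listed above (concatenate `content[].text` for `type === 'text'` blocks in order, and fail with a 502-style error on a non-object payload, a missing content array, or zero text blocks) might look roughly like this. It is a sketch under stated assumptions: `UpstreamShapeError` and `extractTextBlocks` are hypothetical stand-ins for the daemon's `FinalizeUpstreamError` and `extractDesignMd`, and the error messages are invented for the example.

```typescript
// Hypothetical stand-in for FinalizeUpstreamError; only the status field is modelled.
class UpstreamShapeError extends Error {
  readonly status: number;

  constructor(status: number, message: string) {
    super(message);
    this.name = 'UpstreamShapeError';
    this.status = status;
  }
}

// Concatenates every text block in order; rejects the three malformed shapes.
function extractTextBlocks(payload: unknown): string {
  if (typeof payload !== 'object' || payload === null) {
    throw new UpstreamShapeError(502, 'upstream payload is not an object');
  }
  const content = (payload as { content?: unknown }).content;
  if (!Array.isArray(content)) {
    throw new UpstreamShapeError(502, 'upstream payload has no content array');
  }

  const texts: string[] = [];
  for (const block of content) {
    if (typeof block !== 'object' || block === null) continue;
    const { type, text } = block as { type?: unknown; text?: unknown };
    // Non-text blocks are skipped rather than treated as errors.
    if (type === 'text' && typeof text === 'string') texts.push(text);
  }

  if (texts.length === 0) {
    throw new UpstreamShapeError(502, 'upstream payload contains no text blocks');
  }
  return texts.join('');
}
```

Throwing instead of returning an empty string keeps the failure visible to the caller, which matches the stated goal that synthesis must not write a half-empty DESIGN.md to disk.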
A spurious typecheck error from `exactOptionalPropertyTypes` (signal typed as AbortSignal | undefined where RequestInit expects AbortSignal | null) was resolved by conditionally spreading signal into the RequestInit literal. Suite delta: +7 tests in finalize-design.test.ts (15 active total). Refs nexu-io/open-design#450 * feat(daemon): wire /finalize pipeline end-to-end Phase H of the PR 2 plan for issue #450: stitches together every phase D-G primitive into the full finalizeDesignPackage pipeline that the route handler in Phase I will expose over HTTP. Pipeline (in execution order, all inside a try/finally that always releases the lockfile): 1. getProject(db, projectId): defensive 404 (the route validates first; this throw catches direct CLI/script callers). 2. mkdirSync(<projectDir>, { recursive: true }): some projects have DB rows but no on-disk dir yet (PR #493's same fix). 3. fs.openSync(.finalize.lock, 'wx'): EEXIST -> FinalizePackageLockedError (mirror PR #493's TranscriptExportLockedError). 4. exportProjectTranscript(db, projectsRoot, projectId, { now }): produces .transcript.jsonl on disk; we read the body and run it through truncateTranscriptForPrompt to bound the prompt-inclusion size. 5. readDesignSystem(designSystemsRoot, designSystemId): returns null when the project has no design_system_id selected, when the design system directory does not exist, or when the DESIGN.md file is missing. 6. resolveCurrentArtifact(db, projectsRoot, projectId): active tab -> newest .artifact.json by manifest.updatedAt -> null. 7. buildSynthesisPrompt({...}): system + user prompt (per Phase F). 8. callAnthropicWithRetry({...}): one retry on 429/5xx; throws FinalizeUpstreamError on terminal failure. 9. extractDesignMd(payload): concatenates content[].text blocks; throws FinalizeUpstreamError(502) on malformed shape. 10. Atomic write: writeFileSync({flag:'wx'}) -> reopen for fsync -> rename. Errors unlink tmp before rethrowing. 11.
Lock release in finally (always closeSync + unlinkSync). Bounded blocking: the function uses its own AbortController + 120s timeout when the caller does not supply a signal. Caller-supplied signal takes precedence. Type tightening: switched the local Db interface to `type Db = Database.Database` (better-sqlite3) so the function signature is compatible with `exportProjectTranscript`'s typed parameter. Source file already had a `better-sqlite3` import in claude-design-import area of the daemon, so no new dependency. Tests: - "writes DESIGN.md atomically on the happy path": end-to-end with seeded project + conversation + 2 messages + design system on disk; asserts file at exact path + body bytes match the fetch mock. - "response carries every documented field with correct types": designMdPath/bytesWritten/model/inputTokens/outputTokens/artifact/ transcriptMessageCount/designSystemId all present and typed. - "emits design system 'none' in the prompt when no design_system_id is set": fetch mock asserts on the body it receives. - "throws FinalizePackageLockedError when .finalize.lock is already held": pre-create lockfile; assert throw + DESIGN.md not written + pre-existing lock NOT unlinked (we did not own it). - "replaces an existing DESIGN.md atomically on a second finalize": inject a sentinel between two finalize calls; assert sentinel is gone after second run. - "cleans up tmp file AND lock file on every error path": mock fs.writeFileSync to throw on the tmp path; assert no DESIGN.md.tmp.* remain, no DESIGN.md, no .finalize.lock. - "uses the default https://api.anthropic.com baseUrl when baseUrl is omitted": fetch URL begins with the default; baseUrl=undefined path. vi.restoreAllMocks() now runs in afterEach so the writeFileSync spy from the cleanup test does not leak into subsequent tests. Suite delta: +7 tests in finalize-design.test.ts (22 active total). 
Refs nexu-io/open-design#450 * feat(daemon): /finalize HTTP route handler + error mapping Phase I of the PR 2 plan for issue #450: replaces the Phase C stub's catch-all 500 with status-aware error mapping that surfaces the right HTTP status + error code for each documented failure mode, and adds HTTP-layer tests that boot startServer to exercise the route's validation branches. Route handler changes: - :id format guard: an inline regex matching isSafeId at apps/daemon/src/projects.ts:556-558 rejects unsafe ids with 400 BAD_REQUEST before any DB or filesystem work. Without this, an id like 'bad!id' would either fail getProject as 404 (wrong code) or reach the function and throw 'invalid project id' (mapped to 500). - FinalizeUpstreamError mapping is now status-aware: - upstream 401 -> 401 AUTH_FAILED - upstream 429 -> 429 RATE_LIMITED - upstream 5xx (or our own 502 sentinel for malformed responses) -> 502 UPSTREAM_FAILED In all cases the upstream raw text is run through redactSecrets so the apiKey cannot leak through `details` even if the upstream echoes the inbound headers. - AbortError mapping: when the 120s AbortController fires (or the caller pre-aborted the signal), surface as 503 TIMEOUT. - Default case: console.error the error per daemon convention; client sees 500 INTERNAL with the message routed through redactSecrets. - Imported redactSecrets alongside the existing connectionTest imports (apps/daemon/src/server.ts:51). HTTP-layer tests (boot startServer({port:0,returnServer:true}) once in beforeAll, mirror the proxy-routes.test.ts pattern): - "400 BAD_REQUEST when baseUrl is not a valid URL (test #13)": baseUrl='not-a-url'. - "403 FORBIDDEN when baseUrl points at a private internal IP (test #14)": baseUrl='http://10.0.0.1'. Note: validateBaseUrl explicitly allows loopback (for local OpenAI-compatible servers) and only blocks non-loopback private IPs (10/8, 172.16/12, 192.168/16, fc00::/7, fe80::/10). 
- "400 BAD_REQUEST when apiKey is missing (test #15)": apiKey omitted. - "400 BAD_REQUEST when :id contains characters outside the safe-id regex (test #16)": id='bad!id' contains '!' which is not in [A-Za-z0-9._-]. Suite delta: +4 tests (26 active in finalize-design.test.ts). Full daemon suite: 1078/1078 pass; baseline+26 (the +5 above plan target reflects retry+extract split into more granular unit tests than originally enumerated; all real, none skipped). Refs nexu-io/open-design#450 * fix(daemon): tighten isSafeId to reject pure-dot project ids Addresses the P1 path-traversal finding from @lefarcen on PR #832 (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512644). The pre-fix `isSafeId` at apps/daemon/src/projects.ts:556-558 used regex `/^[A-Za-z0-9._-]{1,128}$/` which permitted pure-dot ids (`.`, `..`, `...`) because `.` is in the character class. `projectDir` and `resolveProjectDir` both delegated to `isSafeId`, so an id of `..` would resolve to the PARENT of `.od/projects/` via `path.join`. Threat model (per @lefarcen): - An attacker creates a project row whose stored id is `..` (or another pure-dot variant) — for instance via a workflow that writes the row directly without going through the API. Subsequent finalize/write ops keyed by that id then escape the project tree. - A direct CLI / scripted caller passing `..` as the project id reaches the function without HTTP normalization saving us. (Express normalizes %2e%2e to .. and collapses path segments, which yields 404 for the URL `/api/projects/%2e%2e/...` in practice — but that's Express's protection, not ours.) Fix: - isSafeId now explicitly rejects pure-dot ids (`/^\.+$/.test(id)`) before the char-class regex check. Empty string and inputs longer than 128 chars are also rejected explicitly so the function fails closed on edge cases. 
- isSafeId is now exported from apps/daemon/src/projects.ts so the /finalize route handler in apps/daemon/src/server.ts can use the same validator instead of re-implementing the regex inline. This prevents drift between the route guard and the projectDir guard, which was how this hole originally appeared. Tests (in finalize-design.test.ts because that's where the threat was flagged; isSafeId is daemon-wide so a dedicated test file would also work): - isSafeId rejects `.`, `..`, `...`, `....` - isSafeId rejects ids with `/`, `\`, `!`, leading whitespace - isSafeId rejects empty string and >128 chars - isSafeId rejects non-string inputs (null/undefined/number) - isSafeId accepts plain ids, ids with mid-string dots, UUIDs, single chars Suite delta: +7 tests (33 active in finalize-design.test.ts). Full daemon suite: 1085/1085. Refs nexu-io/open-design#832 * fix(daemon): address PR #832 P1 findings — imported folders + network 502 Addresses two of the three P1 findings from @lefarcen on PR #832: 1. Imported-folder projects route DESIGN.md to metadata.baseDir (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512656, also flagged independently by @chatgpt-codex-connector at #discussion_r3202430470) The pipeline previously called `projectDir(projectsRoot, projectId)` unconditionally, which resolves to `.od/projects/<id>`. For projects created via /api/import/folder the project row's `metadata.baseDir` carries the user's actual folder; without threading metadata through, finalize would silently land DESIGN.md in the hidden daemon data dir and the current-artifact resolver would miss the user's real files. Fix: switch from `projectDir` to `resolveProjectDir(projectsRoot, projectId, metadata)` in both `finalizeDesignPackage` and `resolveCurrentArtifact`. Thread `project.metadata` (from `getProject`'s normalized row) through both call paths. The resolver gets a new optional `metadata` parameter; native projects pass null and get identical behavior. 2. 
Network failures and JSON parse errors now map to 502 UPSTREAM_FAILED (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512661) Pre-fix, only HTTP-non-OK responses were wrapped as FinalizeUpstreamError. DNS failures (ECONNREFUSED, ENOTFOUND), fetch TypeErrors, and `response.json()` SyntaxErrors fell through to the route's catch-all and surfaced as 500 INTERNAL — incorrect: those are upstream-level failures, not daemon bugs. Fix: - Wrap callAnthropicWithRetry in a try/catch that passes FinalizeUpstreamError and AbortError through verbatim, but rewraps any other thrown error as FinalizeUpstreamError(502, '', message). - Wrap response.json() in a try/catch that rewraps SyntaxError as FinalizeUpstreamError(502, '', "upstream Anthropic returned non-JSON body: ..."). - The route handler's existing FinalizeUpstreamError mapping then correctly maps these to 502 with the message in `details` (run through redactSecrets first). Tests: - "writes DESIGN.md under metadata.baseDir for imported-folder projects": inserts a project row with metadata.baseDir pointing at a user-folder temp dir; asserts result.designMdPath lands there AND the hidden .od/projects/<id> dir does NOT contain a DESIGN.md. - "rewraps fetch network rejection as FinalizeUpstreamError(502)": fetchImpl throws TypeError with cause.code='ENOTFOUND'; assert thrown error has name=FinalizeUpstreamError and status=502. - "rewraps 200 with non-JSON body as FinalizeUpstreamError(502)": fetchImpl returns 200 with text/html body; response.json() throws SyntaxError internally; assert FinalizeUpstreamError(502). Suite delta: +3 tests (36 active in finalize-design.test.ts). Full daemon suite: green at last check; will re-verify before push. 
Refs nexu-io/open-design#832 * refactor(daemon): move /finalize DTOs to contracts + map error codes + validate active-tab Addresses the P2 and P3 findings from @lefarcen on PR #832: P2 — Error codes + DTOs not in packages/contracts https://github.com/nexu-io/open-design/pull/832#discussion_r3202512673 Reverses my plan's locked decision #10 ("no contracts changes in this PR; inline the request/response types"). That rule came from the predecessor PROMPT brief's anti-pattern table; @lefarcen's review is fresher signal and supersedes it. Drift risk between the daemon's inline types and any future PR 3 web client is real. - New contracts module: packages/contracts/src/api/finalize.ts with FinalizeAnthropicRequest / FinalizeArtifactRef / FinalizeAnthropicResponse. Re-exported from the package root and made addressable via `@open-design/contracts/api/finalize` subpath. - Daemon source imports the canonical types from contracts and re-exports the public type names so internal references keep working without touching every call site. - Daemon-local error codes remapped to existing ApiErrorCode union members (apps/daemon/src/server.ts), per @lefarcen's suggested mapping: FINALIZE_IN_PROGRESS -> CONFLICT AUTH_FAILED -> UNAUTHORIZED UPSTREAM_FAILED -> UPSTREAM_UNAVAILABLE TIMEOUT -> UPSTREAM_UNAVAILABLE (status 503) INTERNAL -> INTERNAL_ERROR HTTP status codes are unchanged; only the `code` field in the error JSON body changed. P3 — Active-tab name not validated before sidecar probe https://github.com/nexu-io/open-design/pull/832#discussion_r3202512684 resolveCurrentArtifact now runs the active tab's name through validateProjectPath BEFORE composing it into a path.join expression. An invalid tab (traversal segments, absolute path, null byte, reserved segment) causes resolveCurrentArtifact to fall through to the newest-artifact branch rather than abort or probe outside the project directory. 
Tests: - "falls through (does not throw) when active tab name contains traversal segments": injects a malformed `tabs.name = '../../../etc/passwd'` row directly via SQL (bypassing production tab-creation validation), seeds a real artifact, asserts the resolver returns the real artifact rather than the malformed name. Suite delta: +1 test (37 active in finalize-design.test.ts). Full daemon suite: 1089/1089 green. Refs nexu-io/open-design#832 * fix(contracts): publish /api/finalize as standalone runtime entrypoint Addresses @mrcfps's CI-red review on PR #832 (https://github.com/nexu-io/open-design/pull/832, inline comment on packages/contracts/package.json). The previous J3 commit added `./api/finalize` as a type-only subpath: the entry had only a `types` field, no `default`. That broke the contracts package-runtime gate (packages/contracts/tests/package- runtime.test.ts:38-47) which asserts every exports entry exposes both a `.mjs` runtime and a `.d.ts` types target. mrcfps proposed two fixes; this commit takes path B — make finalize a first-class published module rather than a type-only re-export from the package root. Path B vs path A (a peer-AI second opinion via /collaborate confirmed): under NodeNext + ESM with exports-map semantics, TypeScript validates re-exported symbols against the published module-identity surface. Because the previous J3 had `./api/finalize` neither declared as an exports-map entry nor materialized as a standalone .mjs, TS omitted the re-exported names during package boundary analysis. Even at runtime `import('@open-design/contracts').FINALIZE_SCHEMA_VERSION` worked from the bundled index.mjs but the type-checker rejected it. Path B aligns the runtime and declaration surfaces. Changes: - packages/contracts/esbuild.config.mjs: add `./src/api/finalize.ts` to entryPoints so dist/api/finalize.mjs is generated as a standalone module rather than only inlined into the bundled root. 
- packages/contracts/package.json: re-add `./api/finalize` to the exports map with both `default: ./dist/api/finalize.mjs` AND `types: ./dist/api/finalize.d.ts`. Mirrors `./api/connectionTest`'s shape (the canonical pattern for first-class submodule entries). - packages/contracts/src/api/finalize.ts: keep the runtime export `FINALIZE_SCHEMA_VERSION = 1` (giving the standalone module a real value to emit beyond the type-only interfaces) and update the doc-comment now that the standalone .mjs is wired. - apps/daemon/src/finalize-design.ts: switch the type import from the inline declarations introduced in the prior J3 fallback to `import type { ... } from '@open-design/contracts/api/finalize'`. Re-export the names so internal references inside finalize-design.ts keep working without touching every call site. Verified: - node --input-type=module -e "import('@open-design/contracts/api/finalize').then(m=>console.log(JSON.stringify(Object.keys(m))))" prints ["FINALIZE_SCHEMA_VERSION"] — runtime resolution clean. - pnpm --filter @open-design/contracts test: 6/6 (including both package-runtime.test.ts cases on the rebuilt exports map). - pnpm --filter @open-design/daemon typecheck: exits 0. - pnpm --filter @open-design/daemon test: 1089/1089 (no regression vs the prior J3 number). Refs nexu-io/open-design#832 --------- Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>	2026-05-08 19:52:11 +08:00
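The `isSafeId` tightening described in this commit (reject non-strings, the empty string, ids longer than 128 characters, and pure-dot ids before applying the `[A-Za-z0-9._-]` character class) can be illustrated as below. The function name carries a `Sketch` suffix because this is a reconstruction from the commit message, not the code in `apps/daemon/src/projects.ts`; the order of the checks is an assumption.

```typescript
// Reconstruction of the checks the commit message lists; check order is assumed.
function isSafeIdSketch(id: unknown): id is string {
  if (typeof id !== 'string') return false;
  // Fail closed on the empty string and on over-long ids.
  if (id.length === 0 || id.length > 128) return false;
  // '.', '..', '...' pass the character class but would escape the project
  // tree once joined onto a directory path, so reject them up front.
  if (/^\.+$/.test(id)) return false;
  return /^[A-Za-z0-9._-]+$/.test(id);
}

// Usage: guard an id before it is joined into a filesystem path.
const requestedId: unknown = '..';
const verdict = isSafeIdSketch(requestedId) ? 'accepted' : 'rejected';
console.log(`project id ${String(requestedId)} ${verdict}`);
```

Exporting one validator and calling it from both the route guard and the directory resolver, as the commit does, avoids the drift between two hand-copied regexes that let the pure-dot case through originally.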
Tom Huang	d592f6087f	feat(mcp): external MCP client with daemon-managed OAuth and 39 design-focused templates (#898 ) * feat(mcp): add external MCP client with daemon-managed OAuth and 17 design-focused templates Open Design now acts as an MCP CLIENT and surfaces tools from third-party MCP servers to the underlying agent (Claude Code, Hermes, Kimi). Daemon - New mcp-config / mcp-oauth / mcp-tokens modules: persist server entries to .od/mcp-config.json, run the OAuth dance for HTTP/SSE servers end-to-end on the daemon (so cloud deployments work and tokens survive across turns), and inject Authorization: Bearer headers into the per-spawn .mcp.json the daemon writes for Claude Code (or the ACP mcpServers map for Hermes/Kimi). - /api/mcp/servers and /api/mcp/oauth/{start,status,disconnect} endpoints, plus spawn-time wiring in agents that hands the configured servers to the active agent CLI. - System-prompt directive for connected external MCPs so the model does not chase Claude Code's synthetic _authenticate / _complete_authentication tools when the Bearer is already pinned. Web - Settings -> External MCP servers panel with per-row OAuth Connect / Disconnect / Refresh affordances and per-row template hints. - New "Add server" picker categorized into 7 groups (image-generation, image-editing, web-capture, ui-components, data-viz, publishing, utilities) with a search box, sticky close button, collapsible <details> sections (auto-expand on search), 60vh capped scroll region, and a pinned Custom-server footer. - ChatComposer /mcp slash and MCP picker button forward to the new Settings tab; AssistantMessage renders MCP tool calls inline; markdown autolinker handles bare http(s) URLs (incl. OAuth links) before italic markers so OAuth callback URLs do not get italic-fragmented mid-token. 
Contracts - packages/contracts/src/api/mcp.ts owns the wire shapes (McpServerConfig, McpTemplate with stable McpTemplateCategory enum, McpServersResponse, OAuth start/status/disconnect bodies, the postMessage payload from the OAuth callback). Templates (17 built-in) - image-generation: Higgsfield (OpenClaw, OAuth HTTP), Pollinations, Allyson (animated SVG), AWS Bedrock Image (uvx). - image-editing: Imagician, ImageSorcery. - web-capture: just-every screenshot-website-fast, ScreenshotOne. - ui-components: 21st.dev Magic, shadcn/ui, FlyonUI. - data-viz: AntV Chart, Mermaid. - publishing: EdgeOne Pages. - utilities: Filesystem, GitHub, Fetch. Tests - apps/daemon/tests/mcp-{config,oauth,tokens,spawn}.test.ts cover storage round-trip, OAuth helpers, token persistence, spawn-time wiring, every template's transport / command / args / env-field invariants, and the canonical category enum. - apps/web/tests/runtime/markdown.test.tsx covers the new autolinker ordering rules. Co-authored-by: Cursor <cursoragent@cursor.com> * feat(mcp): add 21 more design-focused templates and a `design-systems` category Expands the built-in MCP picker from 17 to 38 templates so users can compose the full Open Design craft loop (design-system intake → generate → edit → audit → publish) without leaving the Settings dialog. Every install spec is verified live against the upstream README; templates that needed Go binaries, multi-step `init` ceremonies, or massive runtime stacks (PostgreSQL + Redis + Ollama) are intentionally deferred so picking a template still resolves to a working server in one click. New `design-systems` category between `web-capture` and `ui-components` (reflects the upstream-of-components position in the workflow). Mirrored in `McpTemplateCategory` on both contracts and daemon, and `CATEGORY_ORDER` on the web side. 
New templates by category: - image-generation (+4): prompt-to-asset (icons / favicons / OG / logos with free-tier routing across Cloudflare AI / NVIDIA NIM / HF / Stable Horde), Nano Banana (hosted streamable HTTP, virtual try-on + product placement), Seedream (hosted streamable HTTP, ByteDance Seedream v3-v5 + SeedEdit), fal.ai (uvx, 600+ models incl. FLUX / Kling / Hunyuan / MusicGen). - image-editing (+3): Photopea (34 layered-editor tools — closes the PSD gap), Topaz Labs (AI upscale / denoise / sharpen), Transloadit (86+ media pipeline robots). - web-capture (+1): Pagecast (browser → demo GIF / MP4 with auto-zoom). - design-systems (+4, NEW category): Figma-Context (Framelink, designs → code), Design Token Bridge (Tailwind ⇄ CSS ⇄ Figma ⇄ M3 / SwiftUI / W3C DTCG + WCAG contrast), Design System Extractor (Storybook scrape), Aesthetics Wiki (cottagecore / dark-academia / y2k / … moodboards). - data-viz (+2): MCP Dashboards (45+ chart types + KPI dashboards), Excalidraw Architect (hand-drawn architecture diagrams). - publishing (+6): PageDrop, PDFSpark, OGForge, QRMint, Slideshot (HTML → PDF / PPTX / PNG with 7 themes), Deckrun (Markdown → PDF / video, hosted free tier with no key required). - utilities (+1): A11y axe-core (WCAG 2.0/2.1/2.2 + color-contrast + ARIA). Tests cover every new template's wiring (command, args, env / header required-vs-optional, secret flag), the category enum invariant, and in-category declaration order for image-generation, design-systems and publishing buckets where the order is what users see in the picker. 21 new test cases pass; full mcp-config suite is green. 
Templates intentionally deferred (documented in PR body): figma-use (needs Figma desktop with --remote-debugging-port=9222), m-moire (multi-step `memi suite init` + daemon ceremony), gemini-media-mcp + trident-mcp (Go binaries — no npx / uvx path), Pixelle-MCP (full app with web UI + ComfyUI backend), storybook-addon-mcp (lives inside user's Storybook, not standalone), primitiv (multi-step init / build / serve), ReftrixMCP (PostgreSQL + Redis + Ollama + DINOv2), narasimhaponnada/mermaid (overlap with peng-shawn). Co-authored-by: Cursor <cursoragent@cursor.com> * feat(mcp): add figma-use template (write designs from chat) under design-systems figma-use is the natural counterpart to Figma-Context already in this PR: where Framelink reads Figma designs into the model, figma-use writes back into the canvas (90+ tools — create frames / text / components / variants, render JSX into Figma, export PNG/SVG, query nodes via XPath, lint for WCAG / auto-layout / hardcoded colors, analyze design systems). Wired as an HTTP MCP template (`http://localhost:38451/mcp`) because `figma-use mcp serve` only exposes HTTP — there's no stdio mode in the upstream `serve.ts`. No API key. Two prerequisites the user owns are spelled out in the description so picking the template still resolves to a working server: (1) start Figma with `--remote-debugging-port=9222` (or `figma-use daemon start --pipe` on Figma 126+), and (2) leave `npx figma-use mcp serve` running in a terminal. Inserted between `design-system-extractor` and `aesthetics-wiki` so the design-systems category reads as a workflow: read existing design (Figma Context) → translate tokens (Token Bridge) → extract from Storybook (Extractor) → write back to Figma (figma-use) → break creative block (Aesthetics Wiki). Tests cover the new template's transport (`http`), endpoint URL, the empty header-fields invariant (no auth required), and bump the design-systems group order to include it. 
Co-authored-by: Cursor <cursoragent@cursor.com> * feat(settings): i18n the External MCP / MCP server / Connectors sidebar entries and make the dialog header track the active section The External MCP sidebar entry this PR introduces was hardcoded English ("External MCP / Add MCP tools (Higgsfield, GitHub…)"). Same for the adjacent Connectors and MCP server entries. The dialog header was also pinned to "Execution & model" copy, so opening Settings → External MCP showed a header that lied about which section the user was on. Adds six translation keys — `settings.connectorsTitle/Hint`, `settings.mcpServerTitle/Hint`, `settings.externalMcpTitle/Hint` — and translates them across all 17 locales (ar, de, en, es-ES, fa, fr, hu, id, ja, ko, pl, pt-BR, ru, tr, uk, zh-CN, zh-TW). `SettingsDialog` now derives the header title/subtitle from the active section (11 sections total) instead of a single hardcoded pair, so each section renders an honest header. Co-authored-by: Cursor <cursoragent@cursor.com> * test(e2e): pin level: 3 on dialog heading lookups for Pets and Connectors CI's Validate workspace job (#1479) failed two Playwright cases with the strict-mode violation: getByRole('dialog').getByRole('heading', { name: 'Pets' }) resolved to 2 elements: 1) <h2>Pets</h2> 2) <h3>Pets</h3> Same root cause as the unit-test fix already in this PR: the dynamic dialog `<h2>` now echoes the section's own `<h3>` because the dialog header tracks the active section. Disambiguate to `level: 3` so each assertion still pins the section heading specifically (which is what the test intends to verify). Audit of the rest of e2e/ for `dialog.getByRole('heading', ...)` — settings-api-protocol.test.ts looks for "OpenAI API" / "Anthropic API" section h3s which never appear in the dialog `<h2>` (always "Execution & model"), so those stay safe. 
Co-authored-by: Cursor <cursoragent@cursor.com> * fix(mcp): bind OAuth refresh to the issuing client and skip stale tokens Persist the OAuth client context (token endpoint, client_id, client_secret, issuer, redirect_uri, resource) alongside the bearer token so refresh hits the same client the refresh_token was bound to (RFC 6749 §6). The previous refresh path re-ran beginAuth with a dummy OOB redirect URI, which kept getOrRegisterClient from finding the original DCR client and made providers reject the refresh on the next chat turn. Refreshes now reuse the persisted endpoint/client pair directly. Also stop injecting expired access tokens at spawn time when refresh is unavailable or fails. Pinning a stale Bearer made every Claude MCP call 401 while the prompt still treated the server as connected; on that path we now skip the entry and let the UI surface a reconnect. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-08 17:59:20 +08:00
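The spawn-time rule from the last fix in the commit above (do not pin an expired access token when refresh is unavailable or fails) might look like the sketch below. The `StoredToken` type, its `expiresAtMs` field, and the `bearerForSpawn` helper are hypothetical names chosen for illustration. The daemon's actual token storage and refresh flow are not shown in the commit message.

```typescript
// Hypothetical token record; expiresAtMs is an epoch timestamp in milliseconds.
type StoredToken = { accessToken: string; expiresAtMs: number };

// Returns the Authorization header value to inject, or null to skip the server entry.
// `refreshed` stands in for the outcome of a refresh attempt (null when unavailable or failed).
function bearerForSpawn(
  token: StoredToken,
  nowMs: number,
  refreshed: StoredToken | null,
): string | null {
  if (token.expiresAtMs > nowMs) {
    return `Bearer ${token.accessToken}`; // still valid, use as-is
  }
  if (refreshed !== null && refreshed.expiresAtMs > nowMs) {
    return `Bearer ${refreshed.accessToken}`; // refresh succeeded
  }
  return null; // stale and not refreshable: skip so the UI can surface a reconnect
}

const now = 1_000_000;
const stale: StoredToken = { accessToken: "old", expiresAtMs: now - 1 };
const fresh: StoredToken = { accessToken: "new", expiresAtMs: now + 60_000 };

console.log(bearerForSpawn(stale, now, null));
console.log(bearerForSpawn(stale, now, fresh));
```

Returning `null` rather than the stale header matches the stated intent: the entry is left out of the spawn config instead of producing a 401 on every MCP call.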
Tom Huang	56bf6ee1b6	feat: agent-callable research command and /search (#615 ) * feat: pre-generation research (Tavily) for grounded generation Adds an optional pre-generation research step so the agent can produce slides / prototypes / decks grounded in real sources instead of guessing. User flow: 1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY). 2. Click the new Research button in the chat composer. 3. On send, the daemon runs a Tavily search, prepends the findings as a <research_context> block ahead of the system prompt, and spawns the agent. Research progress shows up as status pills in the chat stream; the agent cites sources inline as [1]/[2]/... Phase 1 surface: - Single provider (Tavily), single depth ('shallow'), no LLM synthesis pass (Tavily's `answer` is the summary). - Composer toggle only; no popover / depth picker yet. - Reuses the existing `status` SSE agent payload + StatusPill UI so no new event variants or renderer code are needed. Layers touched: - contracts: ResearchOptions / Source / Findings DTOs; ChatRequest.research; export from index. - daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator + provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook in startChatRun before prompt assembly. - web: ChatComposer toggle + ChatSendMeta; threaded through ChatPane / ProjectView / streamViaDaemon into ChatRequest. Side fix (required to land the feature, but useful on its own): contracts internal relative imports lacked the `.js` suffix that NodeNext module resolution requires. This was already breaking `pnpm --filter @open-design/daemon typecheck` on main; without the fix, none of the new research types were visible to the daemon. All internal contracts imports now carry `.js`. Spec: specs/current/research-feature.md (phases 2-4 outlined for follow-up: composer popover, multi-provider, deep recursion, example skills with research_recommends). 
Verified: - pnpm --filter @open-design/contracts typecheck/test - pnpm --filter @open-design/daemon typecheck (the chokidar project-watchers test is a pre-existing flake, unrelated) - pnpm --filter @open-design/web typecheck - node scripts/verify-media-models.mjs * fix(daemon): clamp Tavily max_results to 20 Tavily's /search endpoint requires `max_results` in [0, 20]; sending a larger value (e.g. when `research.depth: "deep"` resolves to 30) returns 400 and `runResearch` silently falls back to no-research. Clamp at the provider boundary so Phase 2 depth tiers above 20 still produce results instead of failing the request. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * Remove stale research merge leftovers * Add agent-callable research search * Fix Indonesian locale typecheck * Fix research command invocation edge cases * Harden slash search prompt expansion * Honor research source caps in command contract * Require search reports in design files * Add research data provider settings * Wire web research provider fallback order * Update research provider fallback wording * Revert "Update research provider fallback wording" This reverts commit `86fb6001e3`. * Revert "Wire web research provider fallback order" This reverts commit `4c9e16036b`. * Revert "Add research data provider settings" This reverts commit `23630d1746`. * Add Dexter and Last30Days research skills * Add DCF and Last30Days OD skills * Add Last30Days and Dexter skills * Resolve research review threads --------- Co-authored-by: a1chzt <chizblank@gmail.com>	2026-05-08 10:33:44 +08:00
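The `max_results` clamp described in the commit above can be sketched as a small helper applied at the provider boundary. The [0, 20] range comes from the commit message. The `clampMaxResults` name, the truncation of fractional values, and the handling of NaN are assumptions made for this illustration.

```typescript
// Upper bound taken from the commit message: Tavily requires max_results in [0, 20].
const TAVILY_MAX_RESULTS = 20;

// Hypothetical helper: coerce a requested result count into the accepted range.
function clampMaxResults(requested: number): number {
  if (Number.isNaN(requested)) {
    return 0; // assumption: treat an unparseable request as "no results"
  }
  // Truncate fractional values, then clamp to [0, TAVILY_MAX_RESULTS].
  return Math.min(TAVILY_MAX_RESULTS, Math.max(0, Math.trunc(requested)));
}

console.log(clampMaxResults(30)); // a "deep" tier resolving to 30 is capped
console.log(clampMaxResults(7));
console.log(clampMaxResults(-5));
```

Clamping at the provider boundary keeps depth tiers above 20 working without each caller needing to know the provider's limit.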
monshunter	e6e5928be1	feat(web): add connection tests for execution settings (#507 ) * feat(settings): add connection test for providers and CLI agents Adds a "Test" action in the Settings dialog that verifies the configured provider (Anthropic/OpenAI/Azure/Google) or CLI agent without sending a real chat. Backed by a new daemon endpoint and shared contracts, with categorized inline statuses and i18n strings across all supported locales. * fix(settings): address connection test review feedback * fix(daemon): pass empty MCP servers for connection probes * fix(connection-test): address review blockers * fix(daemon): fail json stream runs on structured errors * fix(contracts): build connection test subpath export * Use draft CLI env in agent connection tests * fix(i18n): add fallback ids for new curated content	2026-05-07 11:25:37 +08:00
Marc Chan	c3d9136a0c	Add live artifacts and Composio connector catalog (#381 ) * docs: add live artifacts implementation spec * docs: align live artifacts implementation plan * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 7: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 10: work in progress * Ralph iteration 11: work in progress * Ralph iteration 12: work in progress * Ralph iteration 13: work in progress * Ralph iteration 14: work in progress * Ralph iteration 15: work in progress * Ralph iteration 16: work in progress * Ralph iteration 17: work in progress * Ralph iteration 18: work in progress * Ralph iteration 19: work in progress * Ralph iteration 20: work in progress * Ralph iteration 21: work in progress * Ralph iteration 22: work in progress * Ralph iteration 23: work in progress * Ralph iteration 24: work in progress * Ralph iteration 25: work in progress * Ralph iteration 26: work in progress * Ralph iteration 27: work in progress * Ralph iteration 28: work in progress * Ralph iteration 29: work in progress * Ralph iteration 30: work in progress * Ralph iteration 31: work in progress * Ralph iteration 32: work in progress * Ralph iteration 33: work in progress * Ralph iteration 34: work in progress * Ralph iteration 35: work in progress * Ralph iteration 36: work in progress * Ralph iteration 37: work in progress * Ralph iteration 38: work in progress * Ralph iteration 39: work in progress * Ralph iteration 40: work in progress * Ralph iteration 41: work in progress * Ralph iteration 42: work in progress * Ralph iteration 43: work in progress * Ralph iteration 44: work in progress * Ralph iteration 45: work in progress * Ralph iteration 46: work in progress * Ralph iteration 47: work in progress 
* Ralph iteration 48: work in progress * Ralph iteration 49: work in progress * Ralph iteration 50: work in progress * Ralph iteration 51: work in progress * Ralph iteration 52: work in progress * Ralph iteration 53: work in progress * Ralph iteration 54: work in progress * Ralph iteration 55: work in progress * Ralph iteration 56: work in progress * Ralph iteration 57: work in progress * Ralph iteration 58: work in progress * Ralph iteration 59: work in progress * Ralph iteration 60: work in progress * Ralph iteration 61: work in progress * Ralph iteration 62: work in progress * Ralph iteration 63: work in progress * Ralph iteration 64: work in progress * Ralph iteration 65: work in progress * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 17: work in progress * Add Composio-backed connectors * Add Composio-backed connector catalog * Fix connector callback flow * Update live artifact connector refresh * Fix live artifact refresh updates * Improve live artifact viewer toolbar * Refine live artifact source tabs * Expand Composio connector catalog * Improve Composio connector browsing * Fix artifact refresh source safety checks Generated-By: looper 0.4.1 (runner=fixer, agent=opencode) * Fix live artifacts PR feedback Generated-By: looper 0.5.0 (runner=fixer, agent=opencode) * Fix live artifact preview CORS validation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix connector OAuth IPv6 loopback hosts Allow bracketed IPv6 loopback Host headers when deriving connector OAuth callback URLs so IPv6-bound daemons can complete connection flow. 
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Preserve live artifact refresh permissions Respect explicit refresh permission choices during live artifact create and update flows so revoked connector sources remain gated. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact preview cache freshness Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact refresh validation Guard manual refreshes with local daemon checks and reject daemon_tool sources without a toolName before refresh execution. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix Composio credential invalidation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact CORS methods Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix workspace validation Restore media config test isolation under Vitest setup data-dir overrides and add the missing French live artifact display copy so the workspace test suite stays aligned. Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector safety filtering Keep agent-preview connector listings aligned with execution safety policy and prune stale Composio OAuth state records before they accumulate. 
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix agent runtime cleanup Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix live artifact daemon access Validate local-only live artifact routes against the peer socket address and pass daemon-resolved CLI paths to ACP MCP descriptors. Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector run limit pruning Evict stale connector rate-limit buckets so long-lived daemon processes do not retain per-run entries indefinitely. Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector compact schemas Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Improve connector connection feedback * Adjust connector gate positioning * Fix live artifact refresh commits Avoid marking refresh candidates failed after snapshot or state persistence errors by deferring live artifact mutations until the durable refresh metadata is written. Also align connector OAuth callback host validation with daemon loopback handling. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Improve connector search relevance * fix(daemon): harden connector connection state Require loopback daemon validation before connector connect side effects and only clear provider-owned connector statuses during credential reset. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard connector disconnect route Require local daemon request validation before connector disconnect side effects. 
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard composio config updates Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): dispatch live artifacts mcp first Route the live-artifacts MCP server before the generic MCP CLI so od mcp live-artifacts starts the dedicated server instead of failing generic argument parsing. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): handle integer connector schemas Allow JSON Schema integer connector inputs while preserving fractional-value validation so generated connector tool schemas accept valid page sizes and limits. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix: align live artifact refresh error codes Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact connector refresh flow * Update live artifact design cards * Add beta badge to live artifact form * Remove live artifact tile model * Fix live artifact refresh sync * Fix live artifact MCP refresh durability Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact refresh safety Enforce persisted refresh opt-out and connector auto-read gating before refresh sources execute. Generated-By: looper 0.5.5 (runner=fixer, agent=opencode)	2026-05-05 16:42:11 +08:00
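The "handle integer connector schemas" fix in the commit above accepts JSON Schema `integer` inputs while still rejecting fractional values for them. A minimal sketch of that distinction follows, assuming a validator that only looks at the schema's `type` keyword. The `NumericSchema` type and `isValidNumericInput` helper are hypothetical and do not reflect the daemon's real connector validation.

```typescript
// Hypothetical subset of a JSON Schema node: only the numeric `type` keyword is considered.
type NumericSchema = { type: "integer" | "number" };

// Accepts finite numbers for "number" and whole finite numbers for "integer".
function isValidNumericInput(schema: NumericSchema, value: unknown): boolean {
  if (typeof value !== "number" || !Number.isFinite(value)) {
    return false;
  }
  if (schema.type === "integer") {
    return Number.isInteger(value); // fractional values are still rejected
  }
  return true;
}

console.log(isValidNumericInput({ type: "integer" }, 25)); // e.g. a page size
console.log(isValidNumericInput({ type: "integer" }, 2.5));
console.log(isValidNumericInput({ type: "number" }, 2.5));
```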
Nagendhra Madishetti	76e6c7a9f6	feat: Critique Theater Phase 4 (persistence + transcript + orchestrator) (#481 ) * docs(specs): add Critique Theater design spec for panel-tempered artifacts * docs(specs): add Critique Theater implementation plan * docs(specs): rename UI to Design Jury, add lane-density modes, ship-rule explainer, label sizing * feat(contracts): add CritiqueConfig schema and defaults * fix(contracts): apply Task 1.1 review (CRITIQUE_PROTOCOL_VERSION rename, descriptions, RoleWeights export) * feat(contracts): add PanelEvent discriminated union and isPanelEvent guard * fix(contracts): apply Task 1.2 review (exhaustive event-type list, runId guard, import order) * feat(contracts): add CritiqueSseEvent variants and panelEventToSse mapper * test(daemon): add v1 wire-protocol golden fixtures for Critique Theater parser * feat(daemon): add v1 streaming parser for Critique Theater wire protocol * chore(contracts): add .js extensions to relative imports for NodeNext consumers * fix(daemon): satisfy noUncheckedIndexedAccess in v1 parser regex match access * test(daemon): cover parser failure modes; fix unclosed-PANELIST swallow bug * fix(daemon,contracts): address PR #387 review - parser now clamps panelist + DIM scores against the run-declared scale captured from <CRITIQUE_RUN scale=...>, not a hardcoded 100 - PANELIST appearing before any <ROUND n=...> opens now throws MalformedBlockError rather than emitting events with NaN round - DIM_RE and MUST_FIX_RE hoisted to module scope and lastIndex reset per call so the parser hot path stops recompiling regex per artifact - overflow check after drain simplified to a plain buf.length > cap test (the prior compound condition was always true on the right side and obscured intent) - scoreThreshold <= scoreScale refine gains a 1e-9 epsilon so floating slack does not reject semantically valid configs - round-1 designer ARTIFACT guard gains a comment naming the spec invariant and the v2 relaxation path - 3 new 
regression tests cover the panelist-without-round, scale=10 clamp, and scale=20 plumbing cases * docs(specs): rationale for non-goals, failure-mode rate targets, Phase 10 matrix, Phase 14 doc layout * Merge branch 'main' into feat/critique-theater Resolves the contracts/index.ts conflict by keeping the .js extensions added by chore(contracts) `2d6e8d6` and slotting in the new export for ./api/app-config introduced upstream by #255 (`9d700ec`). Critique Theater additions (./sse/critique, ./critique) preserved in their original positions. Verified after merge: pnpm --filter @open-design/contracts test -> 10/10 pass pnpm --filter @open-design/contracts typecheck -> exit 0 pnpm --filter @open-design/daemon typecheck -> exit 0 pnpm --filter @open-design/web typecheck -> exit 0 Two daemon tests in tests/media-config.test.ts fail both before and after the merge because they read real OAuth credentials from the developer machine instead of using mock fixtures. That's an upstream isolation issue on origin/main, not something this branch introduces. * fix: unblock web build and address mrcfps PANELIST oversize bypass The chore commit that added .js extensions to satisfy daemon's nodenext typecheck broke apps/web's Next.js build, because webpack tried to resolve the literal ./common.js when only common.ts exists on disk. Replaced with a subpath approach: contracts/exports gains a './critique' entry pointing straight at src/critique.ts (which has no relative imports), and daemon imports route through @open-design/contracts/critique instead of the barrel. Web keeps the bundler-friendly barrel; daemon's nodenext walks only the leaf module. All 13 contracts source files reverted to no-.js. Separately, mrcfps flagged that parserMaxBlockBytes was only enforced on the leftover buffer after drain returned, so a complete oversized block arriving in one chunk slipped past the cap. 
Added an explicit per-block size check inside drain for every buffered block type (PANELIST, ROUND_END, SHIP). Three regression tests yield the whole stream as a single chunk and assert OversizeBlockError fires before any events emit. * fix(daemon): close three v1 parser invariant gaps from mrcfps review Three independent gaps that all let malformed or oversized protocol output pass the v1 envelope contract: (1) Envelope guard. ROUND, PANELIST, ROUND_END, and SHIP now throw MalformedBlockError when state.inRun is false. Without this, a stream that omits <CRITIQUE_RUN> could still emit panelist_* events without the run_started handshake, leaving downstream reducers with no run-level config. (2) UTF-8 byte length. Both the per-block size check and the post-drain buf-size check now compare Buffer.byteLength(text, 'utf8') against parserMaxBlockBytes. The previous string-length comparison let multibyte content (CJK, emoji) inside <NOTES>/<SUMMARY> exceed the configured byte cap while staying under the JS string length cap, bypassing the daemon's resource guard. (3) Header-end ordering. PANELIST, ROUND_END, and SHIP now require the opener's > to appear before the matched closing tag. A malformed opener like <PANELIST role="x" score="8"</PANELIST> previously fell through to the closing tag's > and emitted events for an invalid block. Four regression tests cover each gap (ROUND-without-run, SHIP-without-run, multibyte-byte-cap, malformed-opener). * feat(daemon): add critique_runs persistence (Task 4.1) Introduces a new SQLite table critique_runs to back the orchestrator's run lifecycle. Plan called for ALTER TABLE artifacts ADD COLUMN ..., but artifacts is not a DB concept in this repo; runs get their own table. - migrateCritique(db) creates the table + two indexes idempotently and is wired into the existing migrate(db) flow on daemon boot. 
- CRUD helpers (insertCritiqueRun, getCritiqueRun, updateCritiqueRun, listCritiqueRunsByProject, deleteCritiqueRun) round-trip rounds_json through helpers so callers see typed CritiqueRunRow. - reconcileStaleRuns flips stale 'running' rows to 'interrupted' with a recoveryReason='daemon_restart' marker, supporting the spec's daemon-restart-mid-run failure mode. - Public CritiqueRunStatus union excludes the in-flight 'running' value but the runtime CHECK accepts it, matching the spec's lifecycle. - 11 vitest cases cover migration idempotence, round-trip, default rounds, status validation, update + list ordering, deletion, and reconciliation, plus FK CASCADE on project deletion. * feat(daemon): add Critique Theater transcript writer (Task 4.2) Streams PanelEvent sequences to .ndjson on disk under the artifact dir, gzipping to .ndjson.gz when the cumulative UTF-8 byte size crosses gzipThresholdBytes (default 256 KiB). Uses Node fs streams plus zlib.createGzip so the writer never holds the full transcript in memory. readTranscript inverts the path and streams events back, picking the right pipeline by file extension. Covers happy path, large multibyte, empty input, mid-stream failure cleanup, and unknown-extension reject. * feat(daemon): add Critique Theater orchestrator (Task 4.3) Drives one run end-to-end: parses stdout via parseCritiqueStream, scores each round through scoreboard helpers, persists lifecycle to critique_runs, and emits CritiqueSseEvent variants on the existing project event bus. Honors per-round and total timeouts, applies fallbackPolicy when no <SHIP> arrives, and tees events into writeTranscript so transcripts stream to disk without buffering the whole run in memory. Defensive entry validation throws RangeError on invalid CritiqueConfig before any side effect. Also adds scoreboard.ts (computeComposite, decideRound, selectFallbackRound) and re-exports panelEventToSse/CritiqueSseEvent from the critique subpath so daemon imports never touch the barrel. 
Fixes missing .js extensions in sse/critique.ts that caused NodeNext module resolution errors. * feat(daemon): wire Critique Theater orchestrator into spawn path (Task 4.4) Adds loadCritiqueConfigFromEnv to read OD_CRITIQUE_* keys with strict validation at boot. Branches the existing CLI spawn flow on cfg.enabled: when false (the M0 default) the legacy single-pass generation runs unchanged; when true the orchestrator owns the run end-to-end. Same SSE bus, same artifact dir, no behavior change for users until they flip the flag. * fix(lockfile): regenerate to include contracts zod + vitest entries The earlier conflict resolution took main's lockfile and ran pnpm install, but the install pass on Windows didn't write the contracts package's zod and vitest entries back into the lockfile. CI's --frozen-lockfile install rejected the resulting state. Re-running pnpm install with --no-frozen-lockfile rewrites the lockfile so it now matches every package.json across the workspace, including contracts/zod ^3.23.8 and contracts/vitest ^2.1.8. Verified locally: pnpm install --frozen-lockfile passes. * fix(daemon): parser ship envelope, SHIP-before-round guard, real artifactRef (Defects 3 + 5) - ParserOptions gains projectId + artifactId; the parser threads them into every emitted ship event's artifactRef so downstream consumers see the real run identity instead of empty placeholders. - <SHIP> now requires at least one closed <ROUND_END> in the same run; malformed streams that emit SHIP before any round complete now throw MalformedBlockError instead of bypassing the round-1 artifact invariant. - The SHIP handler validates the inner <ARTIFACT> block is present and non-empty; missing artifact raises MissingArtifactError. - Three new regressions: SHIP-before-round, SHIP-without-artifact, artifactRef populated from parser options. - Orchestrator threads projectId + artifactId into parserOpts. - Test fixtures updated to include <ARTIFACT> inside <SHIP> blocks. 
* fix(daemon): orchestrator owns lifecycle, gzip atomicity, fallback on timeout (Defects 2,4,7,8) - Orchestrator now accepts child + childExitPromise, races parser / child-exit / abort / timeout in one awaited flow, and SIGTERMs the child on every non-clean termination. Server awaits the result so the run lifecycle has a single owner. - ChildExitError surfaces when child exits non-zero mid-stream; the run is classified as failed with cause cli_exit_nonzero. - Timeout / abort with at least one completed round elects a fallback via selectFallbackRound and emits a synthetic ship event with status=timed_out or interrupted; the score persists to critique_runs instead of staying null. - applyTimeouts includes childExitRace in every Promise.race so early child exits are classified without waiting for the total timeout. iter.return() cleanup is capped at 200ms to prevent hang on stalling generators. - writeTranscript writes gzip output to transcript.ndjson.gz.tmp, fsyncs, then atomic-renames. Crashes mid-write leave no partial .gz or .gz.tmp on disk. * fix(daemon): plain-stream gating, per-run artifact dir, boot reconcile (Defects 1, 2, 6) - Spawn-path branch now inspects def.streamFormat and only routes through runOrchestrator when format === 'plain'. Adapters emitting wrapper formats (claude-stream-json, copilot-stream-json, json-event-stream, acp-json-rpc, pi-rpc) fall through to legacy single-pass with a one-time stderr warning per format. Per-format decoding into the orchestrator is reserved for v2. - critiqueArtifactDir is now path.join(ARTIFACTS_DIR, projectId, runId) so concurrent or sequential runs in the same project never overwrite each other's transcript or final HTML. Persistence stores the relative per-run path. - reconcileStaleRuns is now invoked after openDatabase on every daemon boot with staleAfterMs = critiqueCfg.totalTimeoutMs. Stale running rows from a prior crash flip to interrupted with rounds_json.recoveryReason='daemon_restart'. 
Logs a one-line warning naming the flipped count when greater than zero. - Spawn now passes child + childExitPromise to runOrchestrator so the orchestrator can race child exit against the parser, abort signal, and timeouts in one awaited flow. Server awaits the orchestrator's result and surfaces failures through the existing run lifecycle. * fix(daemon): daemon-authoritative scoring, lifecycle status, stderr ordering, insert type Round 2 review feedback on PR #481. 1. CritiqueRunInsert.status now accepts 'running' so the boot-reconcile tests (and any caller seeding an in-flight row) typecheck without casting. The runtime check in insertCritiqueRun already accepted 'running' against the DB constraint set, only the public type was stricter than the DB. 2. round_end keeps the daemon-computed composite authoritative. The agent's <ROUND_END composite=...> attribute is advisory: a divergence beyond COMPOSITE_TOLERANCE emits a composite_mismatch parser_warning so the discrepancy is observable, but the daemon value is what scores and persists. Same policy for must_fix. 3. SHIP-handling derives the final status from decideRound(...) using the daemon's scored round rather than trusting <SHIP composite=... status=...>. A run that the agent claims as shipped but whose daemon composite is below threshold now finalizes as below_threshold, so a malformed or adversarial stream cannot force a ship. 4. server.ts captures the orchestrator's result and maps the critique terminal status to the chat run lifecycle. shipped/below_threshold finalize as 'succeeded'; timed_out/interrupted/degraded/failed finalize as 'failed'. cancelRequested is honored. 5. stderr forwarding and child.on('error') registrations moved BEFORE the orchestrator await so a CLI that floods stderr cannot fill the OS pipe and deadlock until the total timeout, and so an early child error fired during the run is observed by the same listener used after. 
Tests: - tests/critique-authority.test.ts: 3 new regressions (lying ship downgraded to below_threshold, mismatch warning emitted, aligned composites stay quiet). - All four affected suites green: 14 orchestrator + 10 spawn-wiring + 3 boot-reconcile + 3 authority = 30/30. Workspace typechecks: contracts, daemon, web all exit 0. * fix(daemon,contracts): inline critique SSE, signal-terminated child, null shipped artifactPath Round 3 review feedback on PR #481. 1. packages/contracts/src/critique.ts inlines CritiqueSseEvent + panelEventToSse + CRITIQUE_SSE_EVENT_NAMES + a local mirror of SseTransportEvent. The previous re-export from './sse/critique.js' broke the workspace web build (Turbopack cannot rewrite .js to .ts on a relative source import) while removing the .js extension broke daemon's NodeNext typecheck (it walks this leaf via the './critique' subpath export which requires explicit .js extensions). Inlining removes the cross-file relative import entirely so both consumers walk one self-contained file. packages/contracts/src/sse/critique.ts is removed and its co-located test moves up to packages/contracts/src/critique.test.ts. The barrel packages/contracts/src/index.ts drops the redundant './sse/critique' re-export since './critique' already exports the same symbols. 2. apps/daemon/src/critique/orchestrator.ts treats a signal-terminated child as a terminal race rejection. Previously the race only caught non-zero numeric exit codes and treated code === null as indefinitely pending, so a SIGTERM from /api/runs/:id/cancel resolved childExitPromise as { code: null, signal: 'SIGTERM' } and the orchestrator fell through to the no-SHIP fallback path, persisting below_threshold instead of interrupted. 
The race now rejects with a new ChildSignaledError when signal !== null, and a new catch branch classifies the run as 'interrupted' and (if at least one round closed) emits a synthetic ship event with status='interrupted' so the persisted row and the SSE transcript reflect the actual cause. 3. Same file, ship-handling: artifactPath is now persisted as null on shipped runs until a future phase actually extracts the <SHIP><ARTIFACT> body to disk. Previously the orchestrator wrote ${artifactDir}/${artifactId} even though no file existed at that path, so any later replay/export/UI code that trusted critique_runs.artifact_path would dereference a missing file. The transcript still records the ship event with the artifact reference so consumers can find the run. Tests: - apps/daemon/tests/critique-lifecycle.test.ts: 2 new regressions (SIGTERM-terminated child after one closed round persists 'interrupted' with a synthetic ship event of the same status; shipped run leaves artifactPath null in result and DB row). - 43 critique-suite tests pass: 14 orchestrator + 11 transcript + 10 spawn-wiring + 3 boot-reconcile + 3 authority + 2 lifecycle. Workspace typechecks: contracts, daemon, web all exit 0. * fix(daemon): buffer raw SHIP, emit only normalized; reject SHIP for unclosed round Round 4 review feedback on PR #481. The parser-event loop used to unconditionally collectedEvents.push(event) and bus.emit(panelEventToSse(event)) for every event, including raw <SHIP>. SSE clients and the transcript could see the agent's forged status="shipped" / composite="9.5" before decideRound(...) ran, even when the daemon later corrected the persisted DB row to below_threshold. The loop now skips ship events entirely; the orchestrator buffers the raw shipEvent, runs daemon-authoritative scoring, and emits a single normalized ship payload built from the daemon's computed composite, selectFallbackRound's mustFix, and decideRound's status. 
The transcript and SSE bus now only ever see the daemon-scored ship. The unknown-round fallback used to make agent-claimed status/composite authoritative when SHIP referenced a round that was never closed: a malformed stream could close low round 1, then send <SHIP round="2" status="shipped" composite="10">, completedRounds.find(r => r.n === 2) was undefined, and the orchestrator persisted the agent's value. That re-opened the scoring-integrity hole the previous round was meant to close. The orchestrator now drops a SHIP whose round isn't in completedRounds, emits a parser_warning, and falls through to the no-SHIP fallback policy. The synthetic ship from selectFallbackRound gets emitted instead, with daemon-authoritative round/composite/status. Tests: - tests/critique-authority.test.ts: extended the lying-ship regression to also assert the emitted critique.ship payload is downgraded (status='below_threshold', composite < threshold), so the SSE bus cannot see the agent's claim. Added a new regression where SHIP references an unclosed round 2: the agent ship is dropped, a parser_warning fires, the fallback selects round 1, and the only emitted critique.ship has round=1 and status=below_threshold. - 44 critique-suite tests pass: 14 orchestrator + 11 transcript + 10 spawn-wiring + 3 boot-reconcile + 4 authority + 2 lifecycle. Workspace daemon typecheck exits 0. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com> Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-05 15:50:35 +08:00
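The daemon-authoritative ship handling described in the commit above (rounds 2 and 4 of the PR #481 review) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the type shapes, `normalizeShip`, and `pickFallbackRound` are hypothetical names, the "highest composite wins" fallback rule is an assumption (the real `selectFallbackRound` policy is not shown in this log), and treating `composite >= threshold` as shipped is likewise assumed.

```typescript
// Hypothetical shapes for illustration; the real contracts types are not shown in this log.
type ShipStatus = "shipped" | "below_threshold";

interface ScoredRound {
  n: number; // round number
  composite: number; // daemon-computed composite, never the agent's claim
}

interface AgentShipClaim {
  round: number;
  composite: number; // advisory only
  status: string; // advisory only
}

interface NormalizedShip {
  round: number;
  composite: number;
  status: ShipStatus;
  warnings: string[];
}

// Assumed fallback rule: pick the highest-scoring closed round.
function pickFallbackRound(rounds: ScoredRound[]): ScoredRound {
  return rounds.reduce((best, r) => (r.composite > best.composite ? r : best));
}

function normalizeShip(
  claim: AgentShipClaim,
  completedRounds: ScoredRound[],
  threshold: number,
): NormalizedShip {
  if (completedRounds.length === 0) {
    throw new Error("SHIP requires at least one closed round");
  }
  const warnings: string[] = [];
  // Look up the claimed round among rounds the daemon actually closed and scored.
  let round = completedRounds.find((r) => r.n === claim.round);
  if (round === undefined) {
    // The agent's SHIP is dropped; a warning is recorded and the fallback decides.
    warnings.push(`ship references unclosed round ${claim.round}`);
    round = pickFallbackRound(completedRounds);
  }
  // Status comes from the daemon's composite; claim.status and claim.composite are ignored.
  const status: ShipStatus =
    round.composite >= threshold ? "shipped" : "below_threshold";
  return { round: round.n, composite: round.composite, status, warnings };
}
```

With one closed round scored 4 against a threshold of 8, a claim of `round=2, status="shipped", composite=10` comes back as round 1, composite 4, `below_threshold`, plus one warning, which is the shape of the unclosed-round regression the commit describes.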
Nagendhra Madishetti	47eeaf445d	feat: Critique Theater foundation (contracts + parser, Phases 0-2) (#387 ) * docs(specs): add Critique Theater design spec for panel-tempered artifacts * docs(specs): add Critique Theater implementation plan * docs(specs): rename UI to Design Jury, add lane-density modes, ship-rule explainer, label sizing * feat(contracts): add CritiqueConfig schema and defaults * fix(contracts): apply Task 1.1 review (CRITIQUE_PROTOCOL_VERSION rename, descriptions, RoleWeights export) * feat(contracts): add PanelEvent discriminated union and isPanelEvent guard * fix(contracts): apply Task 1.2 review (exhaustive event-type list, runId guard, import order) * feat(contracts): add CritiqueSseEvent variants and panelEventToSse mapper * test(daemon): add v1 wire-protocol golden fixtures for Critique Theater parser * feat(daemon): add v1 streaming parser for Critique Theater wire protocol * chore(contracts): add .js extensions to relative imports for NodeNext consumers * fix(daemon): satisfy noUncheckedIndexedAccess in v1 parser regex match access * test(daemon): cover parser failure modes; fix unclosed-PANELIST swallow bug * fix(daemon,contracts): address PR #387 review - parser now clamps panelist + DIM scores against the run-declared scale captured from <CRITIQUE_RUN scale=...>, not a hardcoded 100 - PANELIST appearing before any <ROUND n=...> opens now throws MalformedBlockError rather than emitting events with NaN round - DIM_RE and MUST_FIX_RE hoisted to module scope and lastIndex reset per call so the parser hot path stops recompiling regex per artifact - overflow check after drain simplified to a plain buf.length > cap test (the prior compound condition was always true on the right side and obscured intent) - scoreThreshold <= scoreScale refine gains a 1e-9 epsilon so floating slack does not reject semantically valid configs - round-1 designer ARTIFACT guard gains a comment naming the spec invariant and the v2 relaxation path - 3 new regression 
tests cover the panelist-without-round, scale=10 clamp, and scale=20 plumbing cases * docs(specs): rationale for non-goals, failure-mode rate targets, Phase 10 matrix, Phase 14 doc layout * Merge branch 'main' into feat/critique-theater Resolves the contracts/index.ts conflict by keeping the .js extensions added by chore(contracts) `2d6e8d6` and slotting in the new export for ./api/app-config introduced upstream by #255 (`9d700ec`). Critique Theater additions (./sse/critique, ./critique) preserved in their original positions. Verified after merge: pnpm --filter @open-design/contracts test -> 10/10 pass pnpm --filter @open-design/contracts typecheck -> exit 0 pnpm --filter @open-design/daemon typecheck -> exit 0 pnpm --filter @open-design/web typecheck -> exit 0 Two daemon tests in tests/media-config.test.ts fail both before and after the merge because they read real OAuth credentials from the developer machine instead of using mock fixtures. That's an upstream isolation issue on origin/main, not something this branch introduces. * fix: unblock web build and address mrcfps PANELIST oversize bypass The chore commit that added .js extensions to satisfy daemon's nodenext typecheck broke apps/web's Next.js build, because webpack tried to resolve the literal ./common.js when only common.ts exists on disk. Replaced with a subpath approach: contracts/exports gains a './critique' entry pointing straight at src/critique.ts (which has no relative imports), and daemon imports route through @open-design/contracts/critique instead of the barrel. Web keeps the bundler-friendly barrel; daemon's nodenext walks only the leaf module. All 13 contracts source files reverted to no-.js. Separately, mrcfps flagged that parserMaxBlockBytes was only enforced on the leftover buffer after drain returned, so a complete oversized block arriving in one chunk slipped past the cap. Added an explicit per-block size check inside drain for every buffered block type (PANELIST, ROUND_END, SHIP). 
Three regression tests yield the whole stream as a single chunk and assert OversizeBlockError fires before any events emit. * fix(daemon): close three v1 parser invariant gaps from mrcfps review Three independent gaps that all let malformed or oversized protocol output pass the v1 envelope contract: (1) Envelope guard. ROUND, PANELIST, ROUND_END, and SHIP now throw MalformedBlockError when state.inRun is false. Without this, a stream that omits <CRITIQUE_RUN> could still emit panelist_* events without the run_started handshake, leaving downstream reducers with no run-level config. (2) UTF-8 byte length. Both the per-block size check and the post-drain buf-size check now compare Buffer.byteLength(text, 'utf8') against parserMaxBlockBytes. The previous string-length comparison let multibyte content (CJK, emoji) inside <NOTES>/<SUMMARY> exceed the configured byte cap while staying under the JS string length cap, bypassing the daemon's resource guard. (3) Header-end ordering. PANELIST, ROUND_END, and SHIP now require the opener's > to appear before the matched closing tag. A malformed opener like <PANELIST role="x" score="8"</PANELIST> previously fell through to the closing tag's > and emitted events for an invalid block. Four regression tests cover each gap (ROUND-without-run, SHIP-without-run, multibyte-byte-cap, malformed-opener). * fix(lockfile): regenerate to include contracts zod + vitest entries The earlier conflict resolution took main's lockfile and ran pnpm install, but the install pass on Windows didn't write the contracts package's zod and vitest entries back into the lockfile. CI's --frozen-lockfile install rejected the resulting state. Re-running pnpm install with --no-frozen-lockfile rewrites the lockfile so it now matches every package.json across the workspace, including contracts/zod ^3.23.8 and contracts/vitest ^2.1.8. Verified locally: pnpm install --frozen-lockfile passes. 
--------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-04 20:28:28 +08:00
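The UTF-8 byte-length gap closed in the parser commit above (item 2 of the mrcfps review) can be illustrated with a small sketch. The commit message names `Buffer.byteLength(text, 'utf8')`; this sketch swaps in the standard `TextEncoder` so it runs outside Node as well. `assertBlockWithinCap` and the error class shape are hypothetical stand-ins, not the parser's actual code.

```typescript
// Hypothetical stand-in for the parser's oversize error.
class OversizeBlockError extends Error {
  bytes: number;
  cap: number;
  constructor(bytes: number, cap: number) {
    super(`block is ${bytes} bytes, cap is ${cap}`);
    this.name = "OversizeBlockError";
    this.bytes = bytes;
    this.cap = cap;
  }
}

const encoder = new TextEncoder();

// Size of the text once encoded as UTF-8, which is what a byte cap should measure.
function utf8ByteLength(text: string): number {
  return encoder.encode(text).length;
}

// Rejects a block whose encoded size exceeds the cap.
// Comparing block.length instead would count UTF-16 code units and undercount
// multibyte content such as CJK text or emoji.
function assertBlockWithinCap(block: string, maxBlockBytes: number): void {
  const bytes = utf8ByteLength(block);
  if (bytes > maxBlockBytes) {
    throw new OversizeBlockError(bytes, maxBlockBytes);
  }
}
```

For example, the string `日本語` has a JS string length of 3 but encodes to 9 bytes, so a cap of 8 passes a string-length check and fails the byte-length check.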
Ajay Satish	9d700ec74f	feat(daemon): persist code agent startup (#255 ) * feat(daemon): persist code agent startup * fix: complete all suggestions * fix: types for app config * chore: revert local origin * chore: format to single quotes * fix: duplicate headers * fix: isLocalSameOrigin rewriting issue --------- Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-03 12:14:04 +08:00
Caprika	0c00f241e7	Add preview comment attachments (#284 )	2026-05-02 19:23:46 +08:00
Aresdgi	59e4966dda	feat(version): add app version awareness (#204 ) * feat(version): add app version awareness * fix(version): detect packaged sidecars across platforms	2026-05-01 17:26:54 +08:00
nettee	3fb849d047	Fix chat runs surviving web disconnects (#146 ) * fix chat runs surviving web disconnects * fix chat run create abort propagation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon keepalive reconnect budget Generated-By: looper 0.0.0-dev (runner=fixer, agent=gpt-5.5) * fix daemon stream disconnect cancellation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon stream abort cancellation race Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon run cancellation semantics * fix load * doc * 2 * add run refresh recovery * fix active run refresh status * fix reattach abort handling * fix * fix chat initial scroll * fix daemon start failures Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix stop run status Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * extract daemon run service * move prompt composition to daemon * fix prompt module resolution * fix project id generation * add project run status * add designs kanban view with awaiting_input status - add grid/kanban view toggle on Designs tab; persist choice in localStorage - introduce awaiting_input project display status (daemon-derived from unanswered <question-form>) so projects asking the user aren't shown as Completed; ordered between Running and Completed with amber accent - hide transient queued state from users: coerce queued/starting to running in daemon /api/projects projection and drop the queued kanban column - a11y polish on Designs cards: Space activation, aria-labels on delete, focus-visible outlines, reveal delete on focus-within and touch, prefers-reduced-motion handling - kanban layout uses flex sizing instead of viewport math; scoped icon- only pill button rule fixes 
view-toggle icon alignment --------- Co-authored-by: mrcfps <mrc@powerformer.com>	2026-04-30 20:16:46 +08:00
nettee	56d08b8c5f	Add shared contracts and migrate project code to TypeScript (#118 )	2026-04-30 13:01:15 +08:00

30 commits