mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
30 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
c847ace554
|
Add run-scoped media execution policy (#3106)
* feat(contracts): add run media execution policy
* feat(daemon): enforce run media execution policy
* test(daemon): cover media execution policy gates |
||
|
|
170a05f5d2
|
Formalize skill artifacts into plugins (#3085)
* Add skill-to-plugin candidate flow
* Fix skill plugin candidate card reuse
Generated-By: looper 0.9.1 (runner=fixer, agent=codex)
* Fix skill plugin candidate dismiss and URL gates
Generated-By: looper 0.9.1 (runner=fixer, agent=codex)
* Polish skill plugin candidate copy |
||
|
|
cc6edb9afe
|
Proxy GitHub metadata through the daemon (#2654)
* Proxy GitHub metadata through the daemon
* fix(contracts): share GitHub metadata responses
Generated-By: looper 0.6.0 (runner=fixer, agent=codex)
* fix(contracts): align GitHub fetchedAt payload types
Generated-By: looper 0.6.0 (runner=fixer, agent=codex)
* Proxy GitHub metadata through the daemon
Generated-By: looper 0.6.0 (runner=fixer, agent=codex) |
||
|
|
8193981511
|
Keep PR 2400 changes without folder pickers (#2462)
* feat(daemon): add project working directory management and editor hand-off functionality
- Introduced new flags for project commands to manage working directories, including `--working-dir` and `--dir`.
- Implemented API routes for listing available editors and opening projects in selected editors.
- Added a hand-off button in the ChatPane header to facilitate opening project folders in local applications.
- Enhanced the HomeHero component to include working directory and design system settings, improving user experience in project creation.
- Created HomeHeroSettingsChips component for inline management of working directory and design system selection.
* feat(chat): implement voice transcription proxy and enhance UI components
- Added a new API route for voice transcription using OpenAI's `/audio/transcriptions` endpoint, allowing users to send audio blobs directly for transcription.
- Integrated multer for handling audio file uploads in memory, ensuring efficient processing without disk storage.
- Updated the HomeHero component to include example prompt suggestions for plugins, enhancing user interaction.
- Introduced the EditorIcon component to visually represent different editors in the hand-off menu, improving the user experience.
- Refined the HandoffButton component to utilize the new EditorIcon, providing a more cohesive interface for selecting editors.
- Enhanced CSS styles for various components to improve layout and responsiveness, including adjustments to tab and button sizes for better usability.
* style(workspace-shell): enhance layout and overflow handling
- Updated CSS for .workspace-shell to ensure full viewport width and height, with proper overflow management.
- Adjusted grid layout to prevent content overflow and maintain responsiveness.
- Modified styles for .workspace-tabs-chrome to improve width handling and prevent overflow issues.
* refactor(chat): remove voice transcription proxy and related components
- Deleted the voice transcription proxy implementation, including the associated API route and multer configuration.
- Removed the MicButton component from the ChatComposer and HomeHero components to streamline the UI.
- Updated HomeHero to include example suggestions without the voice input functionality.
- Adjusted CSS styles for various components to maintain layout consistency after the removal of the MicButton.
* feat(daemon): implement minting of HMAC tokens for working directory management
- Added a new function `mintImportTokenFromCurrentSecret` to generate HMAC tokens bound to a specified base directory, enhancing security for working directory operations.
- Updated the `desktop-auth.ts` file to include the new token minting functionality, which returns structured errors when the desktop auth secret is cleared.
- Introduced new IPC message types for minting import tokens in the sidecar protocol, allowing seamless integration with the daemon's working directory management.
- Enhanced the `WorkingDirPill` component to utilize the new token minting flow for secure directory selection in desktop builds.
- Updated CSS styles for the HomeHero component to accommodate new example suggestion features and maintain layout consistency.
* fix(HomeView): import HOME_HERO_CHIPS constant for improved chip management
- Updated the HomeView component to import the HOME_HERO_CHIPS constant from the chips module, enhancing the management of hero chips within the component.
* feat(daemon): implement mintImportTokenViaSidecar for secure working directory management
- Introduced the `mintImportTokenViaSidecar` function to facilitate the minting of HMAC tokens for desktop-import operations via the daemon's sidecar IPC. This allows CLI commands to bypass authentication when the desktop-auth gate is active.
- Updated the CLI to utilize the new token minting function when setting the working directory, ensuring secure access to trust-gated API endpoints.
- Enhanced the sidecar server to handle minting requests and return structured error messages for improved user feedback.
- Added tests to validate the new token minting functionality and its integration with the working directory management process.
- Refactored related components to support the new token flow, improving overall security and user experience.
* feat(HomeHero): enhance UI components and styles for improved user experience
- Updated HomeHero component to replace active dot indicators with Plug icons for better visual representation of active plugins.
- Adjusted CSS styles for various elements, including padding and dimensions, to enhance layout consistency and responsiveness.
- Introduced new styles for active type icons and improved hover effects for buttons.
- Updated HomeHeroSettingsChips to change button titles and icons for clarity.
- Added tests to ensure proper rendering and functionality of updated components.
* feat(ProjectDesignSystemPicker): enhance design system selection with preview functionality
- Updated the ProjectDesignSystemPicker component to include a preview feature for design systems, allowing users to see a preview of the selected design system.
- Implemented hover functionality to update the preview based on the hovered design system.
- Added fullscreen preview capability for a more immersive experience.
- Enhanced CSS styles for the design system picker to improve layout and responsiveness.
- Introduced tests to validate the new preview functionality and ensure proper interaction within the component.
* feat: refactor project metadata handling and enhance design system picker
- Updated the default scenario plugin ID retrieval to use project metadata, improving the logic for determining the appropriate plugin based on project intent.
- Enhanced the ProjectDesignSystemPicker and related components to support localized design system summaries and categories, improving user experience.
- Introduced new translations for working directory and design system picker components, ensuring better accessibility and usability across different locales.
- Added a new 'live-artifact' project type to the HomeHero chips, expanding the functionality for users creating refreshable artifacts.
- Updated tests to validate the new project metadata handling and design system picker functionalities.
* feat: enhance localization and styling for design system components
- Added French translations for working directory and design system picker components, improving accessibility for French-speaking users.
- Updated CSS styles for the pet task item to ensure consistent padding and layout.
- Introduced a new test suite for HomeHeroSettingsChips to validate localization and design system selection functionality.
- Enhanced ProjectDesignSystemPicker tests to ensure proper localization and interaction with design system categories.
* fix: update .gitignore to include all claude-sessions directories and remove specific session files
- Modified .gitignore to ensure all claude-sessions directories are ignored by using a wildcard pattern.
- Deleted two specific claude-sessions markdown files to clean up unnecessary session data.
* fix: repair home automation ci regressions
* fix: stabilize artifact consistency e2e
* Remove folder picker changes from PR 2400
---------
Co-authored-by: pftom <1043269994@qq.com>
Co-authored-by: qiongyu1999 <2694684348@qq.com> |
||
|
|
c530d163f8
|
feat(web): "Resume conversation in new chat" UI — #462 Commit B (companion to #1718) (#2264)
* feat(contracts): add handoff request/response DTOs
Adds HandoffRequest, HandoffResponse, and HANDOFF_SCHEMA_VERSION for
the upcoming POST /api/projects/:id/handoff synthesis endpoint. Mirrors
the finalize.ts subpath pattern (package.json#exports + esbuild entry +
index re-export) so daemon and web can import
@open-design/contracts/api/handoff.
Refs nexu-io/open-design#462.
* feat(daemon): add handoff synthesis pipeline (buildHandoffPrompt + synthesizeHandoffPrompt)
Adds `apps/daemon/src/handoff-design.ts` exposing the resume-conversation
synthesis primitives the upcoming `POST /api/projects/:id/handoff` route will
call into.
- `buildHandoffPrompt({ projectId, transcriptJsonl, transcriptMessageCount,
now })` returns the system + user prompts. System prompt asks Claude to
emit a structured Markdown body with Context / Decisions made / Open
questions / Current focus / Provenance, with Provenance bullets explicitly
flat (no Markdown emphasis on labels) to preempt the PR #1584 round-2
parser bug.
- `synthesizeHandoffPrompt(db, projectsRoot, projectId, options)` reuses the
existing finalize-design pipeline pieces: `exportProjectTranscript` →
`truncateTranscriptForPrompt` → `buildHandoffPrompt` →
`callAnthropicWithRetry` → `extractDesignMd`, but without the lockfile,
disk write, design-system, or artifact-resolution paths.
- Promotes `DEFAULT_TIMEOUT_MS` in finalize-design.ts to `export const` so
handoff shares the same 120s upstream-call bound.
Refs nexu-io/open-design#462.
* feat(daemon): wire POST /api/projects/:id/handoff route
Adds the handoff HTTP route and registers it in server.ts. Validation
block + error-mapping shape mirror registerFinalizeRoutes (BYOK payload,
upstream-error → ApiErrorCode mapping, redactSecrets on the raw upstream
body). Handoff has no lockfile, so the CONFLICT branch is omitted.
`res.on('close')` is wired to flip an AbortController whose signal is
threaded into synthesizeHandoffPrompt, so a UI-side cancel actually
aborts the daemon-side Anthropic call rather than letting it keep
running after the client walks away (mirrors the PR #974 fix for
finalize).
- `apps/daemon/src/handoff-routes.ts` — new, exports registerHandoffRoutes
+ RegisterHandoffRoutesDeps.
- `apps/daemon/src/server-context.ts` — adds handoff slot to ServerContext.
- `apps/daemon/src/route-context-contract.ts` — adds RegisterHandoffRoutesDeps
to the compile-time coverage assertion.
- `apps/daemon/src/server.ts` — imports synthesizeHandoffPrompt +
registerHandoffRoutes, builds handoffDeps, registers the route next
to finalize.
- `apps/daemon/tests/handoff-route.test.ts` — 12 HTTP-layer tests:
validation (400/403/404), happy path, upstream error mapping
(401/429/502/502 non-JSON), api-key redaction.
- `apps/daemon/tests/handoff-route-abort.test.ts` — client-disconnect
aborts the daemon-side controller.
Refs nexu-io/open-design#462.
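The disconnect wiring described in this commit can be sketched roughly as follows. This is a minimal illustration, not the route's actual code: `wireAbortOnClose` is a hypothetical helper name, and a plain `EventEmitter` stands in for the HTTP response object, which emits `close` in the same way.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical helper: returns a signal that aborts when the response closes.
function wireAbortOnClose(res: EventEmitter): AbortSignal {
  const controller = new AbortController();
  // `close` also fires after a normal finish; aborting a completed call is harmless.
  res.once("close", () => controller.abort());
  return controller.signal;
}

// Stand-in for an HTTP response; a real route would pass `res` itself
// and thread the signal into the upstream call.
const fakeRes = new EventEmitter();
const disconnectSignal = wireAbortOnClose(fakeRes);
const abortedBeforeClose = disconnectSignal.aborted;
fakeRes.emit("close"); // simulate the client walking away
const abortedAfterClose = disconnectSignal.aborted;
```

Passing the signal down, rather than the controller, keeps the ability to cancel with the route while letting the synthesis code observe cancellation.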
* fix(daemon): map TranscriptExportLockedError to 409 CONFLICT on handoff route
`exportProjectTranscript` acquires a per-project `.transcript.lock`
internally (apps/daemon/src/transcript-export.ts:131-163) and throws
`TranscriptExportLockedError` on EEXIST. Concurrent handoff requests —
or a handoff that races `/api/projects/:id/finalize/anthropic` — lost
that lock and surfaced as 500 INTERNAL_ERROR through the route's
generic catch.
- `apps/daemon/src/handoff-routes.ts` — catch `TranscriptExportLockedError`
and return `409 CONFLICT` ahead of the generic 500 branch, mirroring
the existing `FinalizePackageLockedError → 409 CONFLICT` mapping at
`apps/daemon/src/import-export-routes.ts:603-605`.
- `apps/daemon/src/server.ts` — thread `TranscriptExportLockedError`
through `handoffDeps` so the route can match without a direct import.
- `apps/daemon/src/handoff-design.ts` — correct the module header
comment that incorrectly claimed "no lockfile (concurrent handoff
calls are safe)" — handoff does not add its own lock, but it does
transitively acquire `.transcript.lock` via the transcript-export
call.
- `apps/daemon/tests/handoff-route.test.ts` — regression test that
pre-acquires `.transcript.lock` on disk via `fs.openSync(lockPath, 'wx')`
before firing a handoff request, asserts 409 CONFLICT.
Refs nexu-io/open-design#462 — addresses @nettee's blocking review on
PR #1718 (comment 3242251338).
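A small sketch of the lock-to-409 mapping, under stated assumptions: `LockedError`, `acquireLock`, `statusFor`, and `tryExport` are hypothetical stand-ins for the daemon's own error class and route logic. Only the `fs.openSync(path, "wx")` behaviour (fail with `EEXIST` when the file already exists) is standard Node.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for TranscriptExportLockedError.
class LockedError extends Error {}

// "wx" creates the file exclusively and throws EEXIST if it already exists.
function acquireLock(lockPath: string): number {
  try {
    return fs.openSync(lockPath, "wx");
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") {
      throw new LockedError(`lock held: ${lockPath}`);
    }
    throw err;
  }
}

// Match the lock error ahead of the generic 500 branch.
function statusFor(err: unknown): number {
  return err instanceof LockedError ? 409 : 500;
}

function tryExport(lockPath: string): number {
  let fd: number;
  try {
    fd = acquireLock(lockPath);
  } catch (err) {
    return statusFor(err);
  }
  // ... the transcript export would run here ...
  fs.closeSync(fd);
  fs.unlinkSync(lockPath); // release the lock
  return 200;
}

const lockDir = fs.mkdtempSync(path.join(os.tmpdir(), "handoff-lock-"));
const lockFile = path.join(lockDir, ".transcript.lock");
const heldFd = fs.openSync(lockFile, "wx"); // pre-acquire, as the regression test does
const statusWhileHeld = tryExport(lockFile);
fs.closeSync(heldFd);
fs.unlinkSync(lockFile);
const statusAfterRelease = tryExport(lockFile);
fs.rmSync(lockDir, { recursive: true, force: true });
```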
* fix(daemon): keep handoff request timeout armed through the response body read
`synthesizeHandoffPrompt` cleared the upstream-call timeout in a `finally`
that ran as soon as `callAnthropicWithRetry` returned. But `fetch()`
resolves once the upstream sends *headers* — so the subsequent
`await response.json()` body read ran with no timeout. A response that
sends headers and then stalls its body could hang `/api/projects/:id/handoff`
indefinitely instead of failing.
- `apps/daemon/src/handoff-design.ts` — move `clearTimeout(timeoutId)` into a
single outer `finally` spanning both the call and the `response.json()`
body parse, so the timeout stays armed until the body is fully consumed.
- `apps/daemon/src/handoff-design.ts` — the body-parse catch now re-throws
`AbortError` as-is, mirroring the call-phase catch. Without this a
body-phase timeout would surface as `502` "non-JSON body"; re-throwing
lets the route map it to the intended `503` "handoff timed out"
(`handoff-routes.ts:122-124`).
- `apps/daemon/tests/handoff-design.test.ts` — regression test: a `fetchImpl`
returning a `Response` whose body never closes after headers, raced
against a 500ms deadline, asserts the call aborts (not hangs) and rejects
with `AbortError`.
Refs nexu-io/open-design#462 — addresses @nettee's round-2 blocking review
on PR #1718 (`handoff-design.ts:196`).
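The timeout fix can be illustrated with a short sketch. It assumes a fetch-like function whose result exposes `json()`; `fetchJsonWithDeadline`, `FetchLike`, and `stalledBody` are hypothetical names, and the fake upstream rejects with a plain `Error` instead of a real `AbortError`.

```typescript
// Minimal structural types so the sketch does not depend on a particular fetch.
type JsonResponse = { json(): Promise<unknown> };
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<JsonResponse>;

async function fetchJsonWithDeadline(
  fetchImpl: FetchLike,
  url: string,
  timeoutMs: number,
): Promise<unknown> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Resolves once headers arrive; the body may still be in flight.
    const response = await fetchImpl(url, { signal: controller.signal });
    // Still inside the try, so the timer stays armed during the body read.
    return await response.json();
  } finally {
    // Cleared only after both phases have settled.
    clearTimeout(timeoutId);
  }
}

// Fake upstream: headers arrive immediately, the body stalls until aborted.
const stalledBody: FetchLike = async (_url, init) => ({
  json: () =>
    new Promise<unknown>((_resolve, reject) => {
      init.signal.addEventListener("abort", () => reject(new Error("aborted")));
    }),
});
```

Clearing the timer in a `finally` that wraps only the call phase would let `stalledBody` hang forever, which is the failure mode this commit describes.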
* fix(daemon): map upstream 400 to 400 BAD_REQUEST on handoff route
`callAnthropicWithRetry` preserves a non-retryable upstream status, so an
Anthropic HTTP 400 (`invalid_request_error` — unknown model, invalid
maxTokens, malformed body) reached the route's `FinalizeUpstreamError`
branch and fell through to `502 UPSTREAM_UNAVAILABLE`. That reported
deterministic caller input as a transient server outage, inviting
pointless retries and hiding which field was wrong.
- `apps/daemon/src/handoff-routes.ts` — special-case `err.status === 400`
to `400 BAD_REQUEST` with the redacted upstream detail, ahead of the
generic 502. Also refresh the route docblock: it claimed the 409 branch
was omitted (stale since the R1 TranscriptExportLockedError fix) and
that error mapping fully mirrors finalize (now diverges on 400).
- `apps/daemon/tests/handoff-route.test.ts` — route test driving an
Anthropic `400 invalid_request_error`: asserts 400 BAD_REQUEST, the
upstream detail is surfaced, and an echoed key is redacted.
- `packages/contracts/tests/package-runtime.test.ts` — import
`@open-design/contracts/api/handoff` through the package `exports` map
and assert `HANDOFF_SCHEMA_VERSION`, covering the built publish surface
(esbuild entry + exports map + root re-export) that the source-only
`handoff-contract.test.ts` does not exercise.
Refs nexu-io/open-design#462 — addresses @nettee's round-3 blocking
review on PR #1718.
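As a rough sketch of the special case described here: `mapUpstreamError` and `redactKey` are hypothetical names (the latter a simplified stand-in for the daemon's `redactSecrets`), and only the 400 branch and the generic 502 fallback are taken from the commit message. Other branches, such as 401 or 429, are left out.

```typescript
type RouteError = { status: number; code: string; detail: string };

// Simplified stand-in for redactSecrets: masks a key echoed back by the upstream.
function redactKey(detail: string, apiKey: string): string {
  return apiKey ? detail.split(apiKey).join("[REDACTED]") : detail;
}

function mapUpstreamError(upstreamStatus: number, detail: string, apiKey: string): RouteError {
  const safeDetail = redactKey(detail, apiKey);
  if (upstreamStatus === 400) {
    // Deterministic caller input: report it as the caller's problem, not an outage.
    return { status: 400, code: "BAD_REQUEST", detail: safeDetail };
  }
  return { status: 502, code: "UPSTREAM_UNAVAILABLE", detail: safeDetail };
}

const badRequest = mapUpstreamError(400, "invalid model for key sk-test-123", "sk-test-123");
const outage = mapUpstreamError(503, "overloaded", "sk-test-123");
```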
* fix(daemon): await the now-async external base-URL validator on handoff route
Main's #1176 (`9a64fccd`) made `validateExternalApiBaseUrl` DNS-aware and
asynchronous (`validateBaseUrlResolved`) and updated the proxy and finalize
callers to `await` it. The handoff route — added on this branch in parallel,
against the old synchronous validator — still called it without `await`, so
`validated` was a Promise: `validated.error` / `validated.forbidden` were
`undefined`, the SSRF / malformed-URL guard silently no-opped, and a bad
`baseUrl` fell through to the upstream call and surfaced as 502.
A semantic merge break — no textual conflict, green on the branch in
isolation, red once CI re-merged latest main.
- `apps/daemon/src/handoff-routes.ts` — `await validateExternalApiBaseUrl(...)`,
mirroring the finalize route (`import-export-routes.ts:561`). The handler
is already `async`.
The existing `handoff-route.test.ts` cases "400 BAD_REQUEST when baseUrl is
not a valid URL" and "403 FORBIDDEN when baseUrl points at a private internal
IP" already encode this — red against branch + latest main, green now.
Refs nexu-io/open-design#462 — PR #1718 CI fix.
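The failure mode is easy to reproduce in isolation. The sketch below is illustrative only: `validateBaseUrl` is a stand-in that merely parses the URL and rejects one loopback literal (the real validator is DNS-aware), and the cast simulates a call site that still treats the result as a plain object.

```typescript
type Validation = { error?: string; forbidden?: boolean };

// Stand-in async validator; only parses the URL and checks one loopback literal.
async function validateBaseUrl(raw: string): Promise<Validation> {
  try {
    const url = new URL(raw);
    return url.hostname === "127.0.0.1" ? { forbidden: true } : {};
  } catch {
    return { error: "baseUrl is not a valid URL" };
  }
}

// Stale call site, written as if the validator were synchronous.
function isRejectedWithoutAwait(raw: string): boolean {
  const validated = validateBaseUrl(raw) as unknown as Validation;
  // A Promise has no `error` or `forbidden` property, so the guard never trips.
  return Boolean(validated.error || validated.forbidden);
}

// Fixed call site.
async function isRejected(raw: string): Promise<boolean> {
  const validated = await validateBaseUrl(raw);
  return Boolean(validated.error || validated.forbidden);
}

const staleResult = isRejectedWithoutAwait("not a url");
```

Here `staleResult` is `false` even though the input is malformed, while the awaited version rejects it.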
* chore(daemon): list handoff in the assertServerContextSatisfiesRoutes literal
The `assertServerContextSatisfiesRoutes({...})` call in `server.ts` enumerates
every route registrar's deps but omitted `handoff`. Adding `handoff: handoffDeps`
makes the literal complete and consistent with the other route deps.
This was not a typecheck break: route-dep coverage is guaranteed by the
`Assert<ServerContext extends AllRegisteredRouteDeps>` type in
`route-context-contract.ts` — and `AllRegisteredRouteDeps` already includes
`RegisterHandoffRoutesDeps` — not by this assertion-call literal. The literal
has omitted `handoff` since this branch's first push (`806db576`) through green
CI throughout; `tsc -p tsconfig.json --noEmit` is clean before and after.
Refs nexu-io/open-design#462 — addresses @nettee's round-4 review note on PR #1718.
* feat(web): add "Resume conversation in new chat" action (#462)
Adds a Resume control to the chat header, next to "New conversation".
Clicking it synthesizes a handoff prompt from the current transcript
via POST /api/projects/:id/handoff, opens a fresh conversation, and
auto-sends the synthesized prompt as its first user message — so a
drifted session resumes without the user replaying context by hand.
The old conversation is preserved.
- synthesizeHandoff() web-state wrapper in apps/web/src/state/projects.ts
- resume-conversation icon button in ChatPane (onResumeConversation /
resumeConversationDisabled props)
- handleResumeConversation + pendingResumeRef + auto-send effect in
ProjectView; effect gates on messagesConversationId so the prompt
cannot fire before the new conversation's message read settles
- chat.resumeConversation i18n key across all 19 locales
Commit B of #462; Commit A is the daemon endpoint (PR #1718). This
branch is stacked on feat/handoff-endpoint so the web code resolves
@open-design/contracts/api/handoff.
* fix(daemon): scope handoff to one conversation + reject empty transcripts (#462)
Addresses the review on #1718 and #2264:
- mrcfps (#2264): the handoff endpoint exported the whole project's
transcript, so a multi-conversation project blended unrelated chats
into the synthesized prompt. HandoffRequest now carries a required
conversationId; the route validates it belongs to the project
(404 CONVERSATION_NOT_FOUND), and exportProjectTranscript takes an
optional conversationId filter so only that conversation is exported.
- nettee (#1718): a zero-message conversation still called Anthropic and
fabricated a handoff. synthesizeHandoffPrompt now throws
EmptyTranscriptError on messageCount === 0; the route maps it to
400 EMPTY_TRANSCRIPT before any BYOK tokens are spent.
HANDOFF_SCHEMA_VERSION bumped to 2 (conversationId is a new required
request field). Regression tests: a two-conversation scoping test, an
empty-conversation route + pipeline test, and a transcript-export
conversationId-filter unit test.
* feat(web): send conversationId with the resume handoff request (#462)
Follows the handoff endpoint becoming conversation-scoped. The resume
flow now passes the active conversationId to POST /handoff so the
synthesized prompt summarizes only the conversation being resumed.
handleResumeConversation bails when there is no active conversation;
synthesizeHandoff and the resume tests carry the new field.
* feat(daemon): add `od project handoff` CLI + register handoff error codes (#462)
Addresses the second-round review on #1718 and #2264:
- mrcfps (#2264): per AGENTS.md "Capability exposure (UI/CLI dual-track)",
a user-facing capability must be reachable through the `od` CLI, not
only the web UI. Adds `od project handoff <id> --conversation <id>
--api-key <key> --model <model> [--base-url] [--max-tokens] [--json]`,
driving the same POST /api/projects/:id/handoff endpoint. The logic
lives in a testable handoff-cli.ts sibling module (mirrors
artifacts-cli.ts) so cli.ts's import-time dispatch stays out of tests.
- nettee (#1718): the route emitted CONVERSATION_NOT_FOUND and
EMPTY_TRANSCRIPT, which were absent from the shared API_ERROR_CODES
union. Both are now registered in packages/contracts/src/errors.ts,
with a contract test pinning them so the route and contract cannot
drift again.
A CLI contract test covers the conversation-scoped request shape,
--json output, flag validation, and daemon-error surfacing.
* fix(daemon): fail `od project handoff` on a malformed 2xx response (#462)
Addresses nettee's review on #1718: runProjectHandoff treated any 2xx
response as success, so a broken daemon/proxy 200 with malformed or
shape-invalid JSON would print `undefined` (or `{}` under --json) and
still exit 0 — breaking the fail-fast contract scripts rely on. It now
validates the body is a well-formed HandoffResponse via an
isHandoffResponse type guard and fails fast otherwise. Regression tests
cover a shape-invalid and an unparseable 200 body.
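A hedged sketch of the fail-fast check follows. The real `HandoffResponse` fields are not shown in this log, so the shape below (`schemaVersion`, `prompt`) is an assumption, and `isHandoffResponseSketch` and `parseHandoffBody` are hypothetical names.

```typescript
// ASSUMED shape: the actual HandoffResponse contract is not visible here.
interface HandoffResponseSketch {
  schemaVersion: number;
  prompt: string;
}

function isHandoffResponseSketch(value: unknown): value is HandoffResponseSketch {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["schemaVersion"] === "number" && typeof record["prompt"] === "string";
}

// Parse a 2xx body and fail fast instead of printing `undefined` or `{}`.
function parseHandoffBody(bodyText: string): HandoffResponseSketch {
  let parsed: unknown;
  try {
    parsed = JSON.parse(bodyText);
  } catch {
    throw new Error("handoff: response body is not valid JSON");
  }
  if (!isHandoffResponseSketch(parsed)) {
    throw new Error("handoff: response body is not a HandoffResponse");
  }
  return parsed;
}
```

A CLI entrypoint would catch these errors and exit non-zero, which is what lets calling scripts rely on the exit status.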
* feat(web): surface the daemon's classified handoff error in the resume toast (#462)
Addresses mrcfps's non-blocking note on #2264: synthesizeHandoff returned
null for every non-2xx response, so RATE_LIMITED, EMPTY_TRANSCRIPT, and an
upstream 400 with provider detail all collapsed into one generic "check
your API key" toast — even though handoff-routes.ts had already classified
and sanitized them.
synthesizeHandoff now returns the daemon's structured `{ error }` on a
classified failure; `null` stays reserved for a transport failure or an
unparseable body. handleResumeConversation surfaces error.message plus
redacted details for the `{ error }` case, and a distinct
daemon-unreachable message for null.
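The three outcomes can be modelled as a small union. This is a sketch under assumptions: the type names, field names, and toast strings are hypothetical, and only the distinction between a classified `{ error }` and a `null` transport failure comes from the commit message.

```typescript
type HandoffError = { code: string; message: string };
type HandoffResult = { prompt: string } | { error: HandoffError } | null;

// Hypothetical toast text; the real copy lives in the web app's i18n keys.
function toastFor(result: HandoffResult): string {
  if (result === null) return "Could not reach the daemon.";
  if ("error" in result) return `${result.error.code}: ${result.error.message}`;
  return "Resumed in a new conversation.";
}

const rateLimitedToast = toastFor({ error: { code: "RATE_LIMITED", message: "try again later" } });
const unreachableToast = toastFor(null);
```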
* fix(web): omit empty baseUrl from the resume handoff request (#462)
Addresses mrcfps's review on #2264: the default Anthropic config
normalizes baseUrl to '' (config.ts), and the handoff route 400s an
explicit empty baseUrl — so the Resume action failed before synthesis
for every user who never set a custom base URL.
handleResumeConversation now forwards baseUrl only when config.baseUrl
is a non-empty string, matching the contract's optional-field semantics.
Tests: the default-config path asserts baseUrl is absent from the
request, and a new case covers a custom baseUrl being forwarded.
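One way to express the optional-field rule is a conditional spread. The request shape and the `buildHandoffRequest` name below are assumptions for illustration; only the rule itself (forward `baseUrl` when it is a non-empty string) comes from the commit message.

```typescript
// ASSUMED request shape, loosely following the CLI flags listed in this log.
interface HandoffRequestSketch {
  conversationId: string;
  apiKey: string;
  model: string;
  baseUrl?: string;
}

function buildHandoffRequest(
  config: { apiKey: string; model: string; baseUrl: string },
  conversationId: string,
): HandoffRequestSketch {
  return {
    conversationId,
    apiKey: config.apiKey,
    model: config.model,
    // An empty string means "not configured": leave the key out entirely.
    ...(config.baseUrl !== "" ? { baseUrl: config.baseUrl } : {}),
  };
}

const defaultRequest = buildHandoffRequest({ apiKey: "k", model: "m", baseUrl: "" }, "c1");
const customRequest = buildHandoffRequest(
  { apiKey: "k", model: "m", baseUrl: "https://proxy.example.com" },
  "c1",
);
```

Omitting the key, rather than sending `baseUrl: ""`, means the serialized JSON carries no `baseUrl` field at all for default configurations.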
* refactor(daemon): dispatch `od project handoff` before the generic project parser (#462)
Addresses nettee's non-blocking note on #1718: runProject ran the shared
parseFlags(PROJECT_*) before reaching the handoff switch case, so a
malformed `od project handoff` invocation (`--unknown`, `--max-tokens`
with no value) threw out of the generic parser instead of hitting
handoff-cli's structured fail() — the entrypoint behaved differently
from the unit-tested runProjectHandoff helper.
The handoff sub now short-circuits before parseFlags / projectDaemonUrl,
so `od project handoff` runs exactly runProjectHandoff with no
intervening parsing. handoff-cli.test.ts gains unknown-flag and
missing-value cases covering the structured fail path.
---------
Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
|
||
|
|
86ec951fb9
|
[codex] Add automation templates and proposal workflows (#2193)
* feat(web): introduce Automations tab with dual-track capability for routines
This commit adds a new Automations tab that consolidates routines, schedules, and live artifacts, allowing users to manage automations seamlessly. The tab features a modal for creating and editing automations, which supports various scheduling options (hourly, daily, weekdays, weekly) and project modes (create_each_run, reuse). The CLI is also updated to expose automation commands, ensuring consistency between the web UI and CLI interfaces.
Key changes include:
- New `NewAutomationModal` component for automation creation and editing.
- Updated `TasksView` to integrate the new Automations functionality.
- Enhanced styling for the Automations tab to improve user experience.
This implementation aligns with the dual-track capability exposure policy, ensuring all features are accessible via both the web UI and CLI.
* feat(daemon): enhance automation context handling and CLI commands
This commit introduces several improvements to the automation context management and updates the CLI commands accordingly.
Key changes include:
- Added support for new context fields (`plugin`, `mcp`, `connector`) in automation commands.
- Updated the CLI to reflect new target options (`new-project`).
- Enhanced error messages for invalid target inputs.
- Introduced functions to handle context selection and normalization for routines, including the ability to parse and store context data in the database.
- Updated the database schema to include a new `context_json` field for routines.
- Improved the handling of context in routine routes and the web interface, ensuring that selected contexts are properly managed and displayed.
These changes aim to provide a more robust and flexible automation experience, aligning with the recent enhancements in the web UI.
* feat(web): enhance TasksView with automation run history and status indicators
This commit introduces several new features to the TasksView component, including:
- Added functionality to display automation run history for each routine, showing metadata such as status, timestamps, and project details.
- Implemented status indicators for routine runs, providing visual feedback on their current state (succeeded, failed, running, queued).
- Enhanced the UI to allow users to expand and view detailed run history, including the ability to open the corresponding project conversation.
- Updated styles to improve the presentation of automation statuses and history.
These changes aim to provide users with better insights into their automation routines and improve overall usability.
* feat(daemon): implement automation ingestion and proposal management
This commit introduces several new features related to automation ingestion and proposal management within the daemon.
Key changes include:
- Added new modules for handling automation source packets and proposals, allowing for the storage, retrieval, and management of automation-related data.
- Implemented functions to list, create, and apply automation proposals, enhancing the automation workflow.
- Introduced new CLI commands for interacting with memory entries and automation sources, providing users with more control over their automation processes.
- Enhanced the server routes to support automation source and proposal APIs, enabling seamless integration with the existing system.
These changes aim to improve the overall automation experience, making it easier for users to manage and utilize automation proposals and ingestions effectively. |
||
|
|
46a64edce3
|
feat(design-systems): extract component manifests (#2051)
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local> |
||
|
|
53997990b7 |
Merge origin/main (post-0.7.0) into reconciled garnet branch
Second-pass merge layering 41+ new commits from origin/main on top of the first reconcile commit.
Headline upstream additions absorbed:
- 0.7.0 release: redesigned chat bubble user-text styling, neutralised palette, lucide icons, ElevenLabs audio voice option discovery in the prompt composer, analytics tracking (PostHog) wired across home / studio / create surfaces, Prometheus `/api/metrics` endpoint, critique-theater drop-in mount with a settings toggle.
- Misc upstream fixes (titlebar padding, release header layout, deck preview chrome, feedback form auto-scroll, conversation-created SSE on routine runs, etc.)
Conflict resolutions (12 files, ~22 hunks):
- contracts barrel + prompts/system: union of both sides; new analytics exports (`./analytics/events`, `./analytics/public-params`) added alongside garnet's plugin/atom/genui exports. Both ElevenLabs voice fields (audioVoiceOptions/audioVoiceOptionsError, main) and pluginBlock/activeStageBlocks (garnet) preserved on ComposeInput.
- daemon/server.ts: Prometheus `/api/metrics` route inserted after garnet's `/api/daemon/shutdown`. main's `createAnalyticsService` call added before the chat-run service init alongside the prior reconcile note about the dropped legacy POST /api/projects body.
- App.tsx: handleCreateProject now consumes both garnet's plugin fields (pluginId / appliedPluginSnapshotId / pluginInputs / autoSendFirstMessage) and main's analytics requestId. Tracking fires success + failure paths; PluginLoopHome auto-send sessionStorage flag is preserved.
- ProjectView.tsx: the garnet auto-send useEffect coexists with main's `useCritiqueTheaterEnabled()` hook.
- ChatComposer.tsx: imports merged (drop now-unused fetchSkills, add analytics provider + tracking + buildVisualAnnotationAttachment).
- index.css: main's redesigned `.msg.user .user-text` chat bubble styling wins over garnet's plain text rule; garnet's `.msg-plugin-chip*` rules preserved alongside.
- EntryView.tsx: accepted HEAD (garnet wrapper), consistent with reconcile decision #2. main's added PetRail / TopTab / analytics view tracking is intentionally NOT brought into the wrapper; the follow-up to re-integrate PetRail / image-templates / video-templates into EntryShell still stands and now also covers analytics view-tracking hooks.
- daemon/package.json + pnpm-lock: merged dep set (tar + posthog-node + prom-client coexist).
- Test fixtures (FileWorkspace.test): kept garnet's plugin-folders describe block intact; main's projectKind="prototype" addition is dropped where it conflicted with garnet's plugin-folder fixture files.
Verification: `pnpm install` (after lockfile reconciled), `pnpm typecheck` exits 0 across all workspace packages.
Follow-up not done in this commit:
- PetRail / image-templates / video-templates / 0.7.0 analytics view-tracking hooks need to be added to EntryShell.
- Critique-theater settings toggle UX (added on main) lives in the SettingsDialog hierarchy; the reconcile state preserves the SettingsDialog so this should work without changes, but no end-to-end verification yet. |
||
|
|
d3602be666 |
Merge origin/main into garnet-hemisphere (reconcile)
Merge of `origin/main` (`03ed3960`, 2026-05-13 pre-0.7.0) into the 161-commit garnet-hemisphere line, reconciling the product-vibe-coded plugin/marketplace/EntryShell surfaces from garnet with the routines / skills / live-artifacts feature work landed on main since the fork point.
Headline decisions (full rationale + side-by-side screenshots in `specs/change/20260513-garnet-skills-automations/reconcile-result-vs-garnet.md`):
- #1 SettingsDialog: keep main's Memory / Skills / External MCP / Connectors / Routines / MCP server nav items even though the top-level /integrations + /automations routes also cover them. Two entries coexist for now; revisit once Track A/B fill in the placeholder content.
- #2 EntryView: accept garnet's thin wrapper delegating to EntryShell. Main's PetRail sidebar + image-templates/video-templates tabs are intentionally deferred to a follow-up that re-integrates them into the new EntryShell layout.
- #3 /integrations + /automations top-level routes: kept (garnet's product intent). Skills tab is still a "Coming soon" placeholder awaiting Track A; Routines/Schedules/Live-artifacts cards on /automations are still mock awaiting Track B.
- #5 DesignFilesPanel: hybrid: main's pagination as primary list, garnet's Plugin folders section preserved between the live-artifacts block and the pagination block. (by-kind sections drop in favour of pagination; plugin-folders rendering stays because it is a garnet-specific product addition.)
- #7 server.ts (10 hunks, ~5400 conflict lines): manual hunk-by-hunk merge. Both daemon admin routes + plugin/genui routes (garnet) and routines/memory/skills upgrades (main) preserved. Garnet's inline project route block kept alongside main's `registerProjectRoutes` / `registerProjectUploadRoutes` modular wiring; duplicate route audit is a follow-up. Garnet's POST /api/projects plugin-snapshot resolution + default-scenario fallback is intentionally dropped from the inline body (now handled by registerProjectRoutes) and listed for follow-up re-integration into `project-routes.ts`.
Verification (worktree at /Users/elian/Documents/open-design-garnet):
- `pnpm typecheck` exits 0 across all workspace packages
- daemon (`pnpm tools-dev run web --namespace reconcile-shots`) boots, serves `/api/daemon/status` healthy, and survives a Playwright walkthrough of /integrations / /automations / home / projects / design-systems / plugins / settings dialog
- `@open-design/plugin-runtime` package built (was missing dist/ on garnet); without it the daemon's plugins/* imports fail at boot
Track A (Skills tab → real SkillsSection) and Track B (Automations cards → real routines / live-artifacts backend) are the two remaining follow-ups blocking the placeholder/mock content from going live. See `spec.md` and `track-skills.md` in the same directory. |
||
|
|
e1bc83a476
|
feat(analytics): PostHog product analytics (P0 events, consent-gated, packaged) (#1428)
* feat(analytics): scaffold PostHog product-analytics integration
- Add @open-design/contracts/analytics subpath with the 17 P0 event
payload types, header constants, and code↔CSV enum mapping helpers.
- Add apps/daemon/src/analytics.ts with env-gated posthog-node client,
request-scoped analytics context reader, and artifact-id anonymizer.
- Expose GET /api/analytics/config so the web bundle never embeds the
PostHog key at build time; daemon owns POSTHOG_KEY / POSTHOG_HOST.
- Add apps/web/src/analytics module (identity + lazy posthog-js client
+ React provider) and mount it under <I18nProvider> in app/layout.
No event wiring yet — that lands in the next commit alongside trigger
points (App.tsx, EntryView, NewProjectPanel, SettingsDialog, FileViewer,
runs.ts).
* feat(analytics): wire app_launch, home_view, home_click, project_create_result
- App.tsx: fire app_launch once after first effect tick. handleCreateProject
now emits project_create_result on both success and failure paths.
- EntryView.tsx: home_view (page) gated on agents loading so
has_available_cli isn't transiently false; home_view (asset_panel) fires
per top-tab change with the right result_count.
- NewProjectPanel.tsx: home_click create_button fires before delegating to
the parent; a fresh request_id is generated here and threaded through
onCreate so the matching project_create_result stitches via $insert_id.
- contracts/analytics: tighten createTabToTracking and topTabToTracking
for the worktree branch's renamed tabs (live-artifact, templates).
* feat(analytics): wire settings_view + 3 settings_click events
- settings_view fires on dialog mount and on every section switch,
carrying the active section (mapped via settingsSectionToTracking
for the 16-section worktree layout), execution_mode, and the
selected CLI provider id when present.
- settings_click execution_mode_tab: setMode now emits before/after
values whenever the user toggles between Local CLI and BYOK.
- settings_click cli_provider_card: agent card onClick reports
cli_provider_id via agentIdToTracking (kiro → other).
- settings_click byok_field: onFocus added to api_key, model select,
and base_url inputs; provider_id widened to include google so the
worktree's Gemini protocol slot type-checks.
* feat(analytics): wire studio_view + studio_click chat, studio_view artifact
- packages/contracts/src/analytics/artifact-id.ts: FNV-1a 64-bit helper
produces a 16-hex anonymized id for (projectId, fileName). Stable
cross-platform so the daemon and the web bundle resolve the same id
without a Web Crypto round-trip; daemon now re-exports it.
- ChatComposer: studio_view chat_panel fires once per project mount,
studio_click chat_composer fires on attachment + send buttons with
estimated user_query_tokens (length/4) and has_attachment.
- FileViewer: studio_view artifact fires once per (project, file) at
the dispatcher level, before any sub-viewer renders, with
artifact_kind derived from the renderer registry / file.kind table.
- Widen TrackingExportFormat to include markdown and cloudflare_pages
so the worktree branch's full share menu can emit verbatim.
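A minimal sketch of the FNV-1a 64-bit anonymizer described above, not the contents of artifact-id.ts. The function names and the NUL separator used to join projectId and fileName are assumptions made for illustration; only the hash family and the 16-hex output width come from the commit message.

```typescript
// FNV-1a 64-bit parameters (standard offset basis and prime).
const FNV_OFFSET_BASIS = 0xcbf29ce484222325n;
const FNV_PRIME = 0x100000001b3n;

// Hash the UTF-8 bytes of `input` and render the result as 16 lowercase hex chars.
function fnv1a64Hex(input: string): string {
  const bytes = new TextEncoder().encode(input);
  let hash = FNV_OFFSET_BASIS;
  for (let i = 0; i < bytes.length; i++) {
    hash ^= BigInt(bytes[i]);
    // Keep the product inside 64 bits, as a fixed-width implementation would.
    hash = BigInt.asUintN(64, hash * FNV_PRIME);
  }
  return hash.toString(16).padStart(16, "0");
}

// Hypothetical wrapper: joining with "\u0000" is an assumption, chosen so that
// ("a", "bc") and ("ab", "c") cannot collapse into the same input string.
function anonymizeArtifactId(projectId: string, fileName: string): string {
  return fnv1a64Hex(`${projectId}\u0000${fileName}`);
}
```

Because the helper is pure and uses no Web Crypto call, the same code can run synchronously in both the daemon and the web bundle, which is the property the commit relies on.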
* feat(analytics): wire studio_click share_option + artifact_export_result
HtmlViewer's share menu now emits both events per click via a
fireShareExport helper:
- studio_click share_option fires immediately on click with the chosen
export_format and a fresh request_id.
- artifact_export_result fires when the export resolves — success for
sync exporters (html, markdown, template) the moment the call
returns, success/failed for async exporters (pdf, zip, deploy)
via .then/.catch. The same request_id threads both events so
PostHog stitches click → result via $insert_id.
DEPLOY_PROVIDER_OPTIONS maps to the CSV's vercel / cloudflare_pages
slots; markdown is now a first-class export_format value.
Also ignore .env.local so local POSTHOG_KEY / .env-style secrets
don't get committed.
* feat(analytics): emit run_created and run_finished from the daemon
POST /api/runs now reads the analytics context off the
x-od-analytics-* headers the web client sets on every fetch, then:
- Captures run_created with project_id, conversation_id, run_id,
model_id, agent_provider_id (mapped via agentIdToTracking),
skill_id, design_system_id, plus the token_count_source marker.
- Schedules a run_finished capture on runs.wait(run) resolution,
mapping succeeded/canceled/failed to success/cancelled/failed and
reporting total_duration_ms.
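The status mapping above can be sketched as a lookup table. This is an illustration only: the `result` property name and the `buildRunFinishedProps` helper are assumptions, while the three status pairs and `total_duration_ms` come from the commit message.

```typescript
type RunStatus = "succeeded" | "canceled" | "failed";
type TrackingResult = "success" | "cancelled" | "failed";

// Daemon run status -> tracking value (note the "canceled" / "cancelled" spelling change).
const RUN_STATUS_TO_RESULT: Record<RunStatus, TrackingResult> = {
  succeeded: "success",
  canceled: "cancelled",
  failed: "failed",
};

// Hypothetical helper: builds the duration-bearing part of a run_finished payload.
function buildRunFinishedProps(status: RunStatus, startedAtMs: number, finishedAtMs: number) {
  return {
    result: RUN_STATUS_TO_RESULT[status], // property name is an assumption
    total_duration_ms: Math.max(0, finishedAtMs - startedAtMs),
  };
}
```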
Both events use a stable insert_id derived from the same uuid so
PostHog dedupes the daemon-side mirror against any future
web-side capture without double-counting.
Token sub-fields (user_query_tokens/system_prompt_tokens/...) stay
omitted in v1 — the claude-stream parser only exposes input/output
totals today. See tracking-doc-issues.md §3.2.
* feat(analytics): emit settings_cli_test_result + settings_byok_test_result
The original BLOCKING-list assumed these CSV P0 events were not
implementable in this branch because main lacked Test buttons. The
worktree HEAD actually wires `handleTestAgent` and `handleTestProvider`
in SettingsDialog, so both events are now in scope.
- handleTestAgent emits settings_cli_test_result on success and
failure paths with cli_provider_id mapped via agentIdToTracking,
result drawn from result.ok / catch branch, error_code from
result.kind or the thrown error name, and duration_ms timed via
performance.now().
- handleTestProvider emits settings_byok_test_result analogously,
using apiProtocol (anthropic|openai|azure|ollama|google) directly
as provider_id — wider than the CSV's 5-value enum, documented in
tracking-doc-issues.md §2.5.
Contracts: add SettingsCliTestResultProps / SettingsByokTestResultProps
plus matching track* helpers. AnalyticsEventName union now covers all
14 P0 events this branch supports.
* feat(analytics): gate PostHog on the existing telemetry.metrics consent
The integration now reuses the same first-launch privacy banner +
Settings → Privacy toggle that gates Langfuse, so a single user
decision controls both telemetry sinks.
- /api/analytics/config now consults the persisted AppConfigPrefs:
it returns enabled=true only when POSTHOG_KEY is set AND the user
has chosen "Share usage data" (telemetry.metrics === true). The
response also echoes installationId so the web client uses the
same anonymous id Langfuse keys off of — one identity per install,
shared across both sinks.
- Web AnalyticsProvider:
- Bootstrap fetch resolves installationId and threads it through
the x-od-analytics-anonymous-id header on every /api/* fetch,
so daemon-side captures (run_created / run_finished /
project_create_result) land on the same person record.
- Exposes a setConsent(granted) method that calls posthog-js's
opt_in_capturing / opt_out_capturing, wired from App.tsx via a
useEffect watching config.telemetry?.metrics. Toggling Privacy
→ metrics now stops/resumes events immediately, no reload.
- app_launch additionally gates on telemetry.metrics so a freshly-
declined user fires nothing, and a freshly-opted-in user fires on
the next reload.
* feat(packaging): bake POSTHOG_KEY into packaged daemon spawn env
Wires PostHog product analytics through the same Langfuse-style build-
secret pipeline so official Open Design builds ship with the key while
fork builds compile without it (the integration short-circuits cleanly
when POSTHOG_KEY is absent).
tools/pack
- resolveToolPackConfig reads POSTHOG_KEY / POSTHOG_HOST from
process.env at packaging time, validates them (no whitespace in the
key, http(s) URL for host, trailing-slash strip), and stamps them on
ToolPackConfig. Fork builds without the env vars simply omit the
fields; the daemon-side gate keeps things off in that case.
- Mac, Windows, and Linux packaged-config writers each append the two
fields to open-design-config.json next to the existing
telemetryRelayUrl entry.
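The validation rules listed for resolveToolPackConfig can be sketched as follows. This is a standalone illustration, not the tools/pack code: the function name `resolvePosthogPackConfig`, the error messages, and the choice to treat an empty string as unset are assumptions.

```typescript
interface PosthogPackConfig {
  posthogKey?: string;
  posthogHost?: string;
}

// Reads POSTHOG_KEY / POSTHOG_HOST from an env-like object at packaging time.
function resolvePosthogPackConfig(env: Record<string, string | undefined>): PosthogPackConfig {
  const config: PosthogPackConfig = {};

  const key = env.POSTHOG_KEY;
  if (key !== undefined && key !== "") {
    // Rule from the commit message: no whitespace in the key.
    if (/\s/.test(key)) throw new Error("POSTHOG_KEY must not contain whitespace");
    config.posthogKey = key;
  }

  const host = env.POSTHOG_HOST;
  if (host !== undefined && host !== "") {
    let parsed: URL;
    try {
      parsed = new URL(host);
    } catch {
      throw new Error("POSTHOG_HOST must be a valid URL");
    }
    // Rule from the commit message: http(s) URL only.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      throw new Error("POSTHOG_HOST must use http or https");
    }
    // Rule from the commit message: strip trailing slashes.
    config.posthogHost = host.replace(/\/+$/, "");
  }

  // Fork builds without the env vars fall through with both fields omitted.
  return config;
}
```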
apps/packaged
- RawPackagedConfig / PackagedConfig surface posthogKey / posthogHost
so the Electron entry and headless entry both forward them to the
daemon sidecar.
- buildPackagedDaemonSpawnEnv emits POSTHOG_KEY / POSTHOG_HOST into
the daemon child env when present. The daemon's existing analytics
module reads these via process.env — no daemon-side changes needed.
- The headless packaged path falls back to process.env for fields the
builder hasn't injected, mirroring how OPEN_DESIGN_TELEMETRY_RELAY_URL
is read there.
CI
- release-beta.yml and release-stable.yml expose POSTHOG_KEY (secret)
and POSTHOG_HOST (var) at workflow-env scope so every packaging job
inherits them. PR / fork builds without these set simply skip the
bake step.
Tests
- tools/pack: config.test.ts covers bake-through, fork-build omission,
whitespace rejection, invalid-URL rejection, and trailing-slash
normalization.
- apps/packaged: sidecars.test.ts covers buildPackagedDaemonSpawnEnv
forwarding the keys when present and omitting them when null.
* feat(analytics): enable PostHog autocapture + perf + exceptions
Flip on the PostHog SDK's automatic diagnostic features so we capture
click paths, page transitions, web vitals, dead clicks, and browser
exceptions without scattering instrumentation through the codebase.
Privacy defense lives in one place — apps/web/src/analytics/scrub.ts —
wired in via posthog-js's `before_send` hook so every outgoing event
passes through the same audit point:
- $autocapture / $rageclick / $dead_click / $copy_autocapture:
strips $el_text and value/placeholder/aria-label attrs from any
input, textarea, password input, or contenteditable element. PostHog
autocapture does not capture input.value by default, but $el_text
on a <textarea> reflects the typed content — that's the prompt
body for us, so it has to be scrubbed every time.
- $pageview / $pageleave: drops query string and fragment from
$current_url / $referrer so any future ?q=… can't leak.
- $exception: rewrites file:// and absolute filesystem paths in
stack frames to app://apps/<repo-relative> so we don't ship the
user's home directory.
- Suppresses $opt_in entirely — duplicate of our explicit
setConsent toggle in App.tsx.
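As an illustration of the $pageview / $pageleave rule above, the URL scrub could look roughly like this. It is a sketch under assumptions, not scrub.ts itself: the helper names are hypothetical and the event is modelled as a plain property bag rather than posthog-js's own event type.

```typescript
// Drop the query string and fragment from a URL, keeping scheme, host, and path.
function stripQueryAndFragment(rawUrl: string): string {
  try {
    const url = new URL(rawUrl);
    url.search = "";
    url.hash = "";
    return url.toString();
  } catch {
    // Not an absolute URL (e.g. a bare path): cut at the first "?" or "#".
    const cut = rawUrl.search(/[?#]/);
    return cut === -1 ? rawUrl : rawUrl.slice(0, cut);
  }
}

// Hypothetical scrub step for page events: returns a copy with both URL fields cleaned.
function scrubPageProperties(properties: Record<string, unknown>): Record<string, unknown> {
  const scrubbed: Record<string, unknown> = { ...properties };
  for (const field of ["$current_url", "$referrer"]) {
    const value = scrubbed[field];
    if (typeof value === "string") scrubbed[field] = stripQueryAndFragment(value);
  }
  return scrubbed;
}
```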
Element-level defense in depth is limited to the single most sensitive
surface: the chat composer textarea gets `ph-no-capture` so PostHog
never even generates an event for clicks inside that subtree. Every
other input relies on scrub.ts — sprinkling the class through every
form would be noisy and easy to forget on new surfaces.
The existing Privacy → "Share usage data" toggle continues to gate
every new feature: posthog-js's opt_out_capturing() halts autocapture,
$pageview, $exception, web vitals, and dead clicks alongside the
explicit capture() calls — one global switch.
11 unit tests pin the scrub rules in apps/web/tests/analytics-scrub.test.ts.
* ci(nix): bump pnpmDepsHash for posthog-js + posthog-node additions
Adding posthog-js to apps/web and posthog-node to apps/daemon changed
pnpm-lock.yaml, which Nix's fixed-output pnpmDeps derivation pins by
sha256. The CI nix flake check failed with:
specified: sha256-KF3Mld72/iau+pJmA7HvnanRx8VLtDP0N624SKrtrrc=
got: sha256-PGFgX4lYyeH2TRAXfUq52A3EOa6bb1gO59hPsXhEk3s=
Copy the new hash into both nix/package-web.nix and
nix/package-daemon.nix per the procedure documented in nix/README.md
§"First-build hash pinning".
* feat(analytics): unify PostHog identity with Langfuse installationId
PostHog's distinct_id is the installationId stamped by /api/analytics/
config; Langfuse already reads the same id off app-config.json to
populate trace.userId. With both sinks keying off the same anonymous
identity, dashboards can correlate user actions (PostHog events) with
LLM runs (Langfuse traces) without re-identifying.
Two gaps closed:
1. applyConsent(false) — clear posthog-js's persisted ph_*_posthog
localStorage entry on opt-out via posthog.reset(). Without this, a
user who opts out, then clicks Delete my data, then re-opts in
would see PostHog stitch their new session to the deleted identity
because bootstrap.distinctID only takes effect on first init.
2. applyIdentity(newInstallationId) — Delete my data rotates the
installationId in app-config; App.tsx now watches config.installationId
and calls posthog.reset() then identify(newId) so the next event
batch is fully decoupled from the deleted one. Idempotent on
same-id re-renders so benign config refreshes don't churn PostHog
identities.
The fetch wrapper's x-od-analytics-anonymous-id header also flips to
the new id on rotation so daemon-side captures (run_created /
run_finished) land on the same person record from the very next API
call, not after a reload.
The end-to-end rotation flow is verified against a live PostHog
project; these unit tests pin the safety guards (no-client paths, null
inputs) since stubbing posthog-js's init-loaded callback chain is
brittle.
* fix(langfuse): require both metrics AND content consent for trace reports
Tightens the Langfuse gate so a user who shares anonymous metrics but
NOT conversation content stops emitting Langfuse traces entirely —
Langfuse is used for turn-quality evals which only make sense with
prompt/output bodies. PostHog (product analytics, content-free) stays
gated on `metrics` alone and is unaffected.
i18n: "Conversation content" → "Conversation and tool content" with
hints expanded to mention tool inputs/outputs so the consent surface
matches what the trace actually carries (en + zh-CN).
Bundled here per PR scope — change originated outside this PostHog
PR but lands cleanly on the same files; gating Langfuse strictly
on `content` makes the dual-sink consent model (PostHog = metrics,
Langfuse = metrics + content) symmetric across both i18n locales and
the daemon-side gate.
* feat(analytics): wire byok_provider_option + fix PR review P1s
Adds the BYOK protocol-chip click event (5-value provider_id mirroring
the apiProtocol Settings UI) and resolves four P1 review threads on
PR #1428.
byok_provider_option:
- New SettingsClickByokProviderOptionProps in contracts (provider_id =
anthropic|openai|azure|google|ollama; maps to CSV's 5 values per
tracking-doc-issues.md §2.5).
- trackSettingsClickByokProviderOption helper in apps/web/src/analytics.
- SettingsDialog hooks it on the protocol-chip onClick alongside the
existing setApiProtocol call; is_selected reflects whether the chip
was already active.
Review fixes:
1. client.ts (Siri-Ray): clear `initPromise` when the resolution is
null so a Privacy → metrics opt-in after a previous decline triggers
a fresh /api/analytics/config fetch. Without this, the disabled
response was cached forever — first-session opt-in needed a reload
to start sending PostHog events.
2. provider.tsx (Siri-Ray): replace `url.includes('/api/')` with a
strict same-origin + /api/ pathname check (shared
`isSameOriginApiCall` helper). Outbound third-party URLs containing
`/api/` (e.g. provider.example.com/api/x) no longer receive our
x-od-analytics-* headers.
3. provider.tsx (codex-connector, lefarcen): gate header injection on
`resolvedAnonId` being non-null. When Privacy → metrics is off,
/api/analytics/config returns enabled=false → resolvedAnonId stays
null → wrapper never installs → daemon can't read consent-bearing
headers → no daemon-side PostHog event. setConsent now also clears
resolvedAnonId on opt-out and re-fetches on opt-in.
4. daemon/analytics.ts (defense in depth): createAnalyticsService now
takes dataDir and capture() re-reads app-config to check
telemetry.metrics inside the fire-and-forget wrapper. Even if a
stale header somehow reaches the daemon after opt-out, the capture
is dropped before posthog-node.capture is called.
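Review fix 2 can be illustrated with a small sketch of a same-origin + /api/ check. The real `isSameOriginApiCall` signature is not shown in this log, so the two-argument form below (request URL plus page origin) is an assumption.

```typescript
// Returns true only for requests that resolve to the page's own origin
// AND whose pathname starts with /api/. A substring match on "/api/" is not enough.
function isSameOriginApiCall(requestUrl: string, pageOrigin: string): boolean {
  let url: URL;
  try {
    // Relative URLs such as "/api/runs" resolve against the page origin.
    url = new URL(requestUrl, pageOrigin);
  } catch {
    return false;
  }
  if (url.origin !== pageOrigin) return false;
  return url.pathname.startsWith("/api/");
}
```

With this shape, `provider.example.com/api/x` fails the origin comparison and a same-origin path such as `/docs/api/intro` fails the pathname check, so neither would receive the x-od-analytics-* headers.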
* fix(web): place "Share usage data" on the right in privacy consent banner
Swap button order in PrivacyConsentModal and the in-settings ConsentCard
so the affirmative "Share usage data" lands on the right and "Not now"
on the left. Matches the OK-on-the-right pattern users expect for
primary actions.
Both buttons keep equal visual prominence (same .privacy-consent-action
styling) so the swap doesn't change the EDPB equal-prominence stance
called out in the original Langfuse telemetry spec.
* feat(analytics): populate run_finished token totals from claude-stream usage
Daemon's claude-stream parser already emits agent usage events with
input_tokens / output_tokens totals; the run service buffers them in
run.events and Langfuse reads them out the same way. The run_finished
PostHog event was leaving these fields empty.
Scan run.events for the most recent agent usage frame on terminal
transition and emit input_tokens / output_tokens / total_tokens when
present. token_count_source flips to 'provider_usage' only when at
least one count landed; runs without provider-side usage data keep
'unknown'.
Provider does not break the input down into the 7 sub-fields the
tracking doc lists (memory / context / attachment / system_prompt /
…); those stay omitted until a parser change exposes them.
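The scan described above might be sketched like this. The event shape (`type: "usage"` with a `usage` object) is an assumption for illustration; the real run.events frames may be structured differently. Only the field names input_tokens / output_tokens / total_tokens and the two token_count_source values come from the commit message.

```typescript
// Hypothetical shape of a buffered agent event.
interface AgentEvent {
  type: string;
  usage?: { input_tokens?: number; output_tokens?: number };
}

interface TokenTotals {
  input_tokens?: number;
  output_tokens?: number;
  total_tokens?: number;
  token_count_source: "provider_usage" | "unknown";
}

// Walk the events newest-first and report totals from the most recent usage frame.
function tokenTotalsFromEvents(events: AgentEvent[]): TokenTotals {
  for (const event of [...events].reverse()) {
    if (event.type !== "usage" || !event.usage) continue;
    const { input_tokens, output_tokens } = event.usage;
    // A usage frame with no counts does not qualify as provider usage.
    if (input_tokens === undefined && output_tokens === undefined) break;
    const totals: TokenTotals = { token_count_source: "provider_usage" };
    if (input_tokens !== undefined) totals.input_tokens = input_tokens;
    if (output_tokens !== undefined) totals.output_tokens = output_tokens;
    totals.total_tokens = (input_tokens ?? 0) + (output_tokens ?? 0);
    return totals;
  }
  return { token_count_source: "unknown" };
}
```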
* feat(analytics): estimate user_query_tokens from prompt length
The user_query_tokens field for run_created / run_finished was hardcoded
to 0. We can't tokenize without bundling a model-specific tokenizer, but
the character/4 heuristic is the industry-standard estimate when one
isn't available and is enough for funnel analysis (prompt-length cohorts,
short-vs-long-query conversion rates).
Extracted from req.body via the same telemetryPromptFromRunRequest
pattern the daemon already uses for langfuse-bridge (currentPrompt then
message fallback). Only the integer count goes to PostHog — the prompt
text itself never leaves the daemon.
token_count_source flips appropriately:
- run_created with a prompt: 'estimated' (was 'unknown')
- run_created with no prompt: 'unknown'
- run_finished with provider usage: 'provider_usage' (overrides
baseProps' 'estimated' value)
- run_finished without provider usage: inherits 'estimated' or 'unknown'
from baseProps so input/output absent doesn't mask the estimate.
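A rough sketch of the estimate and the token_count_source rules listed above. Rounding up with Math.ceil and reporting 0 for a missing prompt are assumptions (the commit only specifies length/4); the helper names are hypothetical.

```typescript
type TokenCountSource = "unknown" | "estimated" | "provider_usage";

// character/4 heuristic; Math.ceil is an assumption about rounding.
function estimateUserQueryTokens(prompt: string | undefined): number {
  if (!prompt) return 0;
  return Math.ceil(prompt.length / 4);
}

// run_created: 'estimated' when a prompt exists, 'unknown' otherwise.
function runCreatedTokenProps(prompt: string | undefined) {
  const source: TokenCountSource = prompt ? "estimated" : "unknown";
  return { user_query_tokens: estimateUserQueryTokens(prompt), token_count_source: source };
}

// run_finished: provider usage overrides the base value; otherwise inherit it.
function runFinishedTokenSource(baseSource: TokenCountSource, hasProviderUsage: boolean): TokenCountSource {
  return hasProviderUsage ? "provider_usage" : baseSource;
}
```

Only the integer leaves the function, which matches the commit's point that the prompt text itself is never sent to PostHog.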
|
||
|
|
e254d1280b
|
feat(memory): auto-memory store with chat-protocol-aware extraction (#999)
* feat(memory): auto-memory store with chat-protocol-aware extraction
Markdown memory store at <dataDir>/memory/ with two extractors —
heuristic regex for explicit "remember:" / "我是 X" markers, and a
small-model LLM pass after each turn — folded into the system prompt
so cross-chat preferences, role, and ongoing-work context survive
restarts.
Settings UI:
- Memory tab lists entries, exposes a hand-edited MEMORY.md index, and
shows an extraction history with per-attempt phase/skip/failure rows.
- Memory model picker is inline next to the chat model picker (CLI and
BYOK) so the choice "which fast model mines facts each turn?" sits
next to the chat-model decision instead of a separate panel. The
picker reuses the same SUGGESTED_MODELS table and "Custom..." pattern
the chat picker uses.
LLM extractor supports all four protocols (anthropic / openai / azure /
google); pickProvider takes the chat agent id from the chat handler
and constrains its auto-pick to the chat's protocol family — Claude
Code chats no longer surprise users by silently extracting on whatever
OpenAI key happens to be in media-config. When no matching key is
configured the attempt records as 'skipped: no-provider' instead of
quietly switching vendors.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): keep hint outside <label> and disambiguate Model selectors
The inline Memory model picker wrapped its hint paragraph inside the
<label>, which made the hint's "API key" / "model" wording bleed into
the <select>'s accessible name and broke Playwright's getByLabel('API
key') / getByLabel('Model') strict-mode matching in the existing
settings-api-protocol e2e suite.
- Move the hint <p> out of the <label> in MemoryModelInline so the
select's accessible name is just "Memory model".
- Switch the chat-Model selectors in settings-api-protocol.test.ts from
getByLabel('Model') to getByRole('combobox', { name: 'Model', exact:
true }) so they no longer collide with the new "Memory model" select
that sits next to the chat Model picker.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): address review changes — BYOK wiring, MEMORY.md index, /v1, label wrapper
Addresses the four blocking review threads on PR #999.
1. MemoryModelInline accessibility (mrcfps)
The inline picker still wrapped its select + custom input + flash +
hint inside a single <label>, which made the select's accessible
name absorb every text descendant — including the "API key" / "model"
hint copy. The previous fix moved only the hint outside; the
reviewer asked for a non-label wrapper. Switch to <div className="field">
and associate just the short title with the controls via
`aria-labelledby` / `aria-label`. The select's accessible name is
now exactly "Memory model" so `getByLabel` strict-mode locators
on the surrounding chat form stop cross-matching the memory copy.
2. Respect the hand-edited MEMORY.md index (mrcfps + codex)
`composeMemoryBody()` was reading every *.md file in the memory
dir, ignoring the index. Removing a `- [Name](id.md)` line had no
effect on future prompts. Parse the index's `INDEX_LINK_RE` bullets
and filter `listMemoryEntries()` to the linked id set, so the
editor's "delete this line to disable injection" promise actually
holds.
3. Versioned OpenAI-compatible base URLs (codex)
`callOpenAI` and `callAnthropic` hard-coded `/v1` onto
`provider.baseUrl`, breaking custom endpoints whose saved URL
already includes `/v1` (`/v1/v1/chat/completions`). Apply the same
conditional `appendVersionedApiPath` helper the chat proxy and
connection-test routes already use.
4. Wire memory into BYOK / API-mode chats (mrcfps + codex)
The previous PR's daemon-only memory hook never fired for BYOK,
leaving the Memory tab + model picker as a no-op for that mode.
Add the missing surface and wire it through ProjectView:
- contracts: extend `composeSystemPrompt` with `memoryBody`,
mirroring the daemon's local composer; add
`MemorySystemPromptResponse` and the `attemptedLLM` flag on
`ExtractMemoryResponse`.
- daemon: expose `GET /api/memory/system-prompt` (returns the
composed body) and turn `POST /api/memory/extract` into a
two-phase endpoint — heuristic-only when only userMessage is
supplied (pre-turn), LLM-only when assistantMessage is also
supplied (post-turn), so the extraction-history doesn't double
up.
- web: ProjectView's BYOK branch now fetches the memory body
before composing the system prompt, runs the heuristic
extractor before the run (so "remember:" markers in this turn
reach this turn's prompt), accumulates assistant text during
streaming, and queues the LLM extractor on `onDone` — fire-and-
forget so it never blocks the chat round-trip.
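Item 2 above (respecting the hand-edited MEMORY.md index) can be illustrated with a short sketch. The regular expression below is a stand-in for the real `INDEX_LINK_RE`, whose exact pattern is not shown in this log, and the helper names and entry shape are assumptions.

```typescript
// Hypothetical pattern for index bullets of the form "- [Name](id.md)".
const INDEX_LINK_PATTERN = /^\s*[-*]\s+\[[^\]]*\]\(([^)\s]+)\.md\)/;

// Collect the entry ids that the index still links to.
function linkedIdsFromIndex(indexMarkdown: string): Set<string> {
  const ids = new Set<string>();
  for (const line of indexMarkdown.split("\n")) {
    const match = INDEX_LINK_PATTERN.exec(line);
    if (match) ids.add(match[1]);
  }
  return ids;
}

// Keep only entries whose id is linked from the index, so deleting a bullet
// removes that entry from future prompts.
function filterEntriesByIndex<T extends { id: string }>(entries: T[], indexMarkdown: string): T[] {
  const linked = linkedIdsFromIndex(indexMarkdown);
  return entries.filter((entry) => linked.has(entry.id));
}
```

In this sketch an empty index injects nothing; how the daemon treats a missing index file is not stated in the commit message.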
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): re-sync BYOK memory override when chat config drifts
The inline memory-model picker captured `apiProtocol` / `chatApiKey` /
`chatBaseUrl` / `chatApiVersion` into the saved override only at the
moment the user clicked a model. If they later swapped the BYOK
protocol tab, rotated the API key, or edited the base URL in the same
settings flow, the daemon's background extractor kept calling the
*old* vendor / credential — directly contradicting the picker's
"borrows the surrounding chat picker's protocol, key, base URL, and
api-version automatically" promise.
Add a debounced effect that compares the persisted (masked) shape
against the live chat props and re-PATCHes /api/memory/config when
they drift. The masked config exposes `apiKeyTail` (last 4 chars), so
key rotation is detectable without ever round-tripping the secret
back to the browser. The 300 ms debounce coalesces the keystroke-
granularity prop updates the parent settings dialog streams during
its autosave loop, so a user editing the base URL doesn't trigger one
PATCH per character. Background re-syncs are silent — the "Saved!"
flash only fires for explicit user clicks, so the picker doesn't feel
like it's fighting them as they edit unrelated chat fields.
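The drift comparison described above can be sketched as a pure function. The interface and field names other than `apiKeyTail` are assumptions for illustration; in the commit, the comparison runs inside a 300 ms debounced effect and triggers a PATCH to /api/memory/config when it reports drift.

```typescript
// Hypothetical masked shape returned by the daemon: the secret is reduced to its last 4 chars.
interface MaskedMemoryOverride {
  provider: string;
  baseUrl: string;
  apiVersion: string;
  apiKeyTail: string;
}

// Hypothetical live chat props held by the settings dialog.
interface LiveChatConfig {
  apiProtocol: string;
  apiKey: string;
  baseUrl: string;
  apiVersion: string;
}

// True when the persisted override no longer matches the live chat config.
function hasChatConfigDrifted(persisted: MaskedMemoryOverride, live: LiveChatConfig): boolean {
  return (
    persisted.provider !== live.apiProtocol ||
    persisted.baseUrl !== live.baseUrl ||
    persisted.apiVersion !== live.apiVersion ||
    // Compare tails only, so the full secret never has to come back from the daemon.
    persisted.apiKeyTail !== live.apiKey.slice(-4)
  );
}
```

One limitation of a tail comparison is that a rotated key sharing the same last four characters would not be detected.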
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): thread BYOK chat config through /api/memory/extract default path
Leaving the BYOK memory picker on "Same as chat" still broke the
default LLM extraction path: `MemoryModelInline` clears the override
for that option, both `/api/memory/extract` calls in `ProjectView`
only sent the messages, and the daemon never persists BYOK creds, so
`extractWithLLM(..., { chatAgentId: null })` always reached
`pickProvider()` with no chat context and fell through to env /
media-config — the wrong vendor for a BYOK chat that works for
inference.
Thread the live BYOK chat config through the extract endpoint as a
per-call snapshot:
- contracts: extend `ExtractMemoryRequest` with an optional
`chatProvider` (provider/apiKey/baseUrl/apiVersion/model) and add
`'chat-byok'` to the credentialSource enum.
- daemon: parse + validate `chatProvider` on `/api/memory/extract`
(provider must be one of the five known shapes) and forward to
`extractWithLLM` as a new option. `pickProvider()` gets a new
path 2 that uses the snapshot directly with the per-protocol
fast-model default — so a memory pass on `gpt-4o` / `claude-sonnet-4-5`
silently turns into a cheap `gpt-4o-mini` / `claude-haiku-4-5` call
instead of paying chat-tier rates for sediment work. Override and
CLI-agent-constrained paths still win when they apply.
- web: `ProjectView` snapshots `apiProtocol` / `apiKey` / `baseUrl` /
`apiVersion` from the live `AppConfig` on each BYOK extract call
(both pre-turn heuristic-only and post-turn LLM phases). The
picker's existing drift-resync effect already covers explicit
overrides; this snapshot covers the implicit "Same as chat"
default that the override flow can't reach.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): treat empty apiKey on PATCH as a real clear
MemoryModelInline silently re-PATCHes /api/memory/config whenever the
surrounding BYOK chat creds drift. The previous reuse branch lumped
`apiKey === ''` together with `apiKey === undefined`, so clearing the
chat API key from the picker quietly preserved the old daemon-side
secret and kept calling the provider on a stale credential.
Distinguish four states for the apiKey field:
- absent -> preserve stored secret (form re-save without re-typing)
- '' -> clear stored secret (user removed it from the picker)
- 'sk-...' -> replace
- new provider -> ignore stored secret entirely
Add tests/memory-config-route.test.ts covering all four cases.
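The four states can be sketched as a single resolver. This illustrates the rule only and is not the route handler itself; the function name and argument shape are assumptions.

```typescript
interface ApiKeyPatchInput {
  storedKey: string | undefined; // secret currently held by the daemon
  incomingKey: string | undefined; // apiKey field of the PATCH body (undefined = absent)
  providerChanged: boolean; // PATCH switches to a different provider
}

// Returns the secret that should be stored after the PATCH (undefined = none).
function resolveApiKeyAfterPatch({ storedKey, incomingKey, providerChanged }: ApiKeyPatchInput): string | undefined {
  // New provider: the old secret belongs to another vendor, never reuse it.
  if (providerChanged) return incomingKey ? incomingKey : undefined;
  // Absent: preserve the stored secret (form re-save without re-typing).
  if (incomingKey === undefined) return storedKey;
  // Empty string: an explicit clear.
  if (incomingKey === "") return undefined;
  // Anything else replaces the stored secret.
  return incomingKey;
}
```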
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
e6eaa62294
|
feat(plugins): handoff atom + ArtifactManifest provenance fields
Plan N3 / spec §11.5.1 / §21.5.
@open-design/contracts ArtifactManifest gains the spec §11.5.1
provenance + downstream-distribution surface as additive optional
fields:
sourcePluginSnapshotId / sourcePluginId / sourcePluginVersion /
sourceTaskKind / sourceRunId / sourceProjectId / parentArtifactId
artifactKind / renderKind / handoffKind
exportTargets[] / deployTargets[]
Spec §11.5.1 invariants:
- sourcePluginSnapshotId NEVER changes after first write.
- exportTargets[] / deployTargets[] are append-only.
- handoffKind promotes monotonically along
design-only < implementation-plan < patch < deployable-app.
apps/daemon/src/plugins/atoms/handoff.ts ships the daemon-side
helper:
recordHandoff({ manifest, exportTarget?, deployTarget?,
handoffKind?, enforceMonotonicHandoff? })
→ { manifest, changed }
- Idempotent: a (surface, target) pair only ever lands once on
exportTargets[]; same for (provider, location) on deployTargets[].
- handoffKind defaults to monotonic; pass enforceMonotonicHandoff:
false on a rollback path.
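A minimal sketch of the recordHandoff behaviour described above, not the code in handoff.ts. The manifest is reduced to the three fields involved, and treating a refused downgrade as a silent no-op (rather than an error) is an assumption.

```typescript
type HandoffKind = "design-only" | "implementation-plan" | "patch" | "deployable-app";
const HANDOFF_ORDER: readonly HandoffKind[] = ["design-only", "implementation-plan", "patch", "deployable-app"];

interface ExportTarget { surface: string; target: string }
interface DeployTarget { provider: string; location: string }
interface ManifestSketch {
  handoffKind?: HandoffKind;
  exportTargets?: ExportTarget[];
  deployTargets?: DeployTarget[];
}
interface RecordHandoffInput {
  manifest: ManifestSketch;
  exportTarget?: ExportTarget;
  deployTarget?: DeployTarget;
  handoffKind?: HandoffKind;
  enforceMonotonicHandoff?: boolean;
}

function recordHandoff(input: RecordHandoffInput): { manifest: ManifestSketch; changed: boolean } {
  const { exportTarget, deployTarget, handoffKind, enforceMonotonicHandoff = true } = input;
  const exportTargets = [...(input.manifest.exportTargets ?? [])];
  const deployTargets = [...(input.manifest.deployTargets ?? [])];
  let kind = input.manifest.handoffKind;
  let changed = false;

  // Append-only and idempotent: a (surface, target) pair lands at most once.
  if (exportTarget && !exportTargets.some((t) => t.surface === exportTarget.surface && t.target === exportTarget.target)) {
    exportTargets.push(exportTarget);
    changed = true;
  }
  // Same rule for (provider, location) on deployTargets[].
  if (deployTarget && !deployTargets.some((t) => t.provider === deployTarget.provider && t.location === deployTarget.location)) {
    deployTargets.push(deployTarget);
    changed = true;
  }
  // Promote monotonically unless the caller opts out for a rollback.
  if (handoffKind && handoffKind !== kind) {
    const isPromotion = kind === undefined || HANDOFF_ORDER.indexOf(handoffKind) > HANDOFF_ORDER.indexOf(kind);
    if (isPromotion || !enforceMonotonicHandoff) {
      kind = handoffKind;
      changed = true;
    }
  }
  return { manifest: { ...input.manifest, exportTargets, deployTargets, handoffKind: kind }, changed };
}
```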
isDeployableAppEligible({ manifest, buildPassing, testsPassing })
→ boolean
Spec §11.5.1 promotion rule for the deployable-app tier: requires
build.passing + tests.passing AND at least one exportTargets[]
entry on docker / cli surface. Centralises the rule so plugins
don't reimplement it.
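The promotion rule reads naturally as a predicate. The sketch below is illustrative: the manifest is reduced to exportTargets[], and the literal surface strings "docker" and "cli" are assumed spellings of the surfaces named above.

```typescript
interface ExportTargetEntry { surface: string; target: string }
interface EligibilityInput {
  manifest: { exportTargets?: ExportTargetEntry[] };
  buildPassing: boolean;
  testsPassing: boolean;
}

// deployable-app tier: build and tests must pass AND at least one
// export target must sit on a docker or cli surface.
function isDeployableAppEligible({ manifest, buildPassing, testsPassing }: EligibilityInput): boolean {
  if (!buildPassing || !testsPassing) return false;
  return (manifest.exportTargets ?? []).some((t) => t.surface === "docker" || t.surface === "cli");
}
```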
packages/contracts/src/index.ts now uses .js extensions on every
re-export so the daemon's NodeNext moduleResolution picks up the
new types end-to-end.
Daemon tests: 1534 → 1543 (+9 cases on plugins-handoff: appends
exportTargets / deployTargets, idempotency, monotonic handoffKind
promotion, downgrade refusal vs. rollback escape, deployable-app
eligibility rule).
Co-authored-by: Tom Huang <1043269994@qq.com>
|
||
|
|
617fb043fe
|
feat(settings): add fetch models button for BYOK providers (#1034)
* feat(settings): add fetch models button for BYOK providers
* fix(settings): exclude Ollama from fetch models, add manual-entry hint
* fix(provider-models): classify non-JSON upstream errors by HTTP status
* fix(i18n): drop redundant English overrides from non-English locales
* fix(provider-models): allow ollama through allowlist, return unsupported_protocol
---------
Co-authored-by: haolin122 <hl6593@nyu.edu>
|
||
|
|
847304ebc5
|
feat(plugins): atom SKILL.md body loader + renderActiveStageBlock (spec §23.4)
Plan J3 / spec §23.3.2 patch 2 / §23.4.
Lays the substrate slice for migrating prompt fragments out of
`apps/daemon/src/prompts/system.ts` and into the bundled atom
SKILL.md bodies registered by §3.I3.
apps/daemon/src/plugins/atom-bodies.ts owns the daemon-side loader:
loadAtomBodies(db, atomIds) → AtomBodyEntry[]
The function looks each atom id up in installed_plugins (bundled
rows win), reads the matching fsPath/SKILL.md, strips
front-matter, and returns the raw body. Atoms with no installed
plugin or unreadable SKILL.md are silently skipped — the caller
drops empty entries from the prompt.
packages/contracts/src/prompts/atom-block.ts ships the pure
renderer:
renderActiveStageBlock({ stageId, bodies, iteration? }) → string
Mirrors spec §23.4's composeSystemPrompt sketch. Empty bodies
return ''; multiple bodies are separated by '---' with no trailing
separator. Lives in contracts so the daemon-side composer and any
future contracts-side composer share one definition (§11.8 PB1
single-import guarantee).
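The renderer's separator contract can be sketched as below. The `AtomBody` shape and the heading line are assumptions made for illustration (the actual block format follows spec §23.4, which is not reproduced in this log); only the empty-input and '---' separator behaviour come from the commit message.

```typescript
// Hypothetical entry shape; the real AtomBodyEntry may carry more fields.
interface AtomBody { atomId: string; body: string }

interface ActiveStageInput {
  stageId: string;
  bodies: AtomBody[];
  iteration?: number;
}

function renderActiveStageBlock({ stageId, bodies, iteration }: ActiveStageInput): string {
  const texts = bodies.map((entry) => entry.body.trim()).filter((text) => text.length > 0);
  // Empty bodies return '' so the caller can drop the block from the prompt.
  if (texts.length === 0) return "";
  // Heading format is an assumption, not the spec's wording.
  const heading = iteration === undefined ? `## Active stage: ${stageId}` : `## Active stage: ${stageId} (iteration ${iteration})`;
  // join() places '---' between bodies only, so there is no trailing separator.
  return `${heading}\n\n${texts.join("\n\n---\n\n")}`;
}
```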
The composeSystemPrompt() rewiring itself is the next PR. This
commit leaves that PR no scaffolding to build: the helpers are
reachable, tested, and the bundled atom plugins from §3.I3 already
have the matching SKILL.md bodies on disk.
Tests: contracts 8 → 12 (+4 cases on atom-block); daemon
1482 → 1486 (+4 cases on plugins-atom-bodies covering the
end-to-end loadAtomBodies → renderActiveStageBlock path).
Co-authored-by: Tom Huang <1043269994@qq.com>
|
||
|
|
643d0cf637
|
feat: add scheduled routines for unattended agent runs (#1033)
* feat: add scheduled routines for unattended agent runs
Generalizes Orbit's single hard-coded daily-digest scheduler into
user-defined routines: each one fires on a schedule (hourly / daily /
weekdays / weekly with IANA timezone) and starts a fresh agent
conversation, either inside an existing project or in a new project
minted on the spot.
Backend:
- New RoutineService with timezone-aware nextRunAt computed via
Intl.DateTimeFormat (no new dependency); two-pass tzWallToUtc so
DST transitions stay correct. Each fire chains rescheduleOne in
finally() to keep the cadence alive.
- routines + routine_runs SQLite tables; schedule_json is the
authoritative form, with legacy schedule_kind/value kept populated.
- /api/routines CRUD + /api/routines/:id/run + /api/routines/:id/runs.
- Run handler resolves agent (routine override -> app config -> first
available), creates project (or reuses configured one) and a fresh
conversation per fire, then dispatches into startChatRun.
UI (Settings -> Routines):
- Pill-chip schedule kind picker, time + timezone fields, weekday
picker for Weekly. Live preview line ("Runs daily at 9:00 AM GMT+8").
- Routine list with inline status pill, next/last meta, expandable run
history; each history row links into the project the run wrote to
via the existing router primitive.
* fix(daemon): swallow trailing finally rejection for inflight cleanup
Without a terminal `.catch`, the promise returned by `promise.finally(...)`
mirrors the original rejection and produces an unhandled rejection — fatal
in modern Node — when the run handler rejects before producing a start
handle. Callers still see the rejection on the returned `promise`.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
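The pattern behind this fix can be sketched as follows. The `inflight` map and `trackInflight` helper are hypothetical names; the point is only that `finally()` returns a second promise that needs its own terminal `.catch`.

```typescript
// Hypothetical registry of in-flight runs, keyed by an id.
const inflight = new Map<string, Promise<string>>();

function trackInflight(
  id: string,
  start: () => Promise<string>,
): Promise<string> {
  const promise = start();
  inflight.set(id, promise);

  // `finally` returns a NEW promise that mirrors a rejection of `promise`.
  // Nothing awaits that derived promise, so without the no-op catch a
  // rejected run would surface as an unhandled rejection.
  promise
    .finally(() => {
      inflight.delete(id);
    })
    .catch(() => {});

  // Callers still observe the original rejection on this promise.
  return promise;
}
```

The swallow applies only to the cleanup chain; the promise handed back to the caller is untouched.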
* fix(daemon): handle DST spring-forward gap in tzWallToUtc
The two-pass conversion picked the pre-gap candidate when the requested
wall time fell inside a spring-forward gap (e.g. 02:30 in
America/New_York on 2026-03-08), so the resulting instant rendered back
as 01:30 local and a 02:30 routine fired an hour early on the
transition day. Routines are local wall-clock schedules, so firing
before the requested time breaks the contract.
Now we round-trip both candidates through partsInTimezone, return the
one whose wall-clock matches the request, and on a gap day where
neither matches return the later candidate so the routine fires at the
first valid post-gap instant on the same day.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
* fix(daemon): preserve both wall-time candidates on DST fall-back day
On a fall-back day, the requested wall time inside the repeated hour
(e.g. 01:30 America/New_York) maps to two distinct UTC instants. The
previous tzWallToUtc collapsed them to the first (pre-transition) one,
so a daemon that woke between the two instants would skip the second
01:30 entirely and fire a day late once per fall-back. Replace it with
tzWallToUtcCandidates (returns all valid instants, ascending) plus a
gap-only fallback for spring-forward, and have nextWallTimeMatching
walk both ambiguous candidates before advancing to the next day. Adds
fixtures for the repeated-hour case so the intended behavior stays
locked in.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
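A rough sketch of the candidate search described in the two DST fixes above, using only `Intl.DateTimeFormat`. The probing strategy (sampling the zone offset one day either side of the requested wall time) is an assumption for illustration, and the function names carry a `Sketch` suffix because the real helpers may differ.

```typescript
interface WallTime {
  year: number;
  month: number; // 1-12
  day: number;
  hour: number;
  minute: number;
}

// Reads the wall-clock fields a UTC instant has in the given IANA zone.
function partsInTimezoneSketch(utcMs: number, timeZone: string): WallTime {
  const fmt = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
  });
  const out: Record<string, number> = {};
  for (const part of fmt.formatToParts(new Date(utcMs))) {
    if (part.type !== "literal") out[part.type] = Number(part.value);
  }
  return {
    year: out["year"] ?? 0,
    month: out["month"] ?? 0,
    day: out["day"] ?? 0,
    hour: out["hour"] ?? 0,
    minute: out["minute"] ?? 0,
  };
}

// Treats wall-clock fields as if they were UTC, giving a comparable number.
function wallAsUtcMs(w: WallTime): number {
  return Date.UTC(w.year, w.month - 1, w.day, w.hour, w.minute);
}

function offsetAtMs(utcMs: number, timeZone: string): number {
  return wallAsUtcMs(partsInTimezoneSketch(utcMs, timeZone)) - utcMs;
}

// Returns every UTC instant that renders as `wall` in `timeZone`, ascending.
// On a spring-forward gap nothing matches, so the later raw candidate is
// returned as the gap-only fallback.
function tzWallToUtcCandidatesSketch(wall: WallTime, timeZone: string): number[] {
  const guess = wallAsUtcMs(wall);
  const DAY_MS = 24 * 60 * 60 * 1000;
  const offsets = new Set<number>([
    offsetAtMs(guess - DAY_MS, timeZone),
    offsetAtMs(guess + DAY_MS, timeZone),
  ]);
  const raw = Array.from(offsets)
    .map((offset) => guess - offset)
    .sort((a, b) => a - b);
  // Round-trip each candidate and keep the ones whose wall clock matches.
  const valid = raw.filter(
    (ms) => wallAsUtcMs(partsInTimezoneSketch(ms, timeZone)) === guess,
  );
  return valid.length > 0 ? valid : [Math.max(...raw)];
}
```

With current IANA data, 01:30 on 2026-11-01 in America/New_York yields two instants an hour apart, while 02:30 on 2026-03-08 matches neither candidate and falls back to the later one.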
* fix(web): make routine timezone picker IANA-complete and DST-truthful
The timezone dropdown was a hardcoded subset, but the backend validator
accepts any IANA zone — so users could not pick zones like
`America/Phoenix` or `Africa/Johannesburg` unless they happened to be
local. And `gmtLabel()` always derived the offset from `new Date()`,
which drifted seasonally for DST-observing zones (a New York routine
created in winter rendered `GMT-5` while it would actually fire on
`GMT-4` after DST started).
Source the picker from `Intl.supportedValuesOf('timeZone')` (with a
curated fallback for older runtimes) and anchor the GMT label to the
routine's next fire time. When the next fire time is unknown (e.g.
the live preview while the form is open) and in the dropdown itself,
fall back to the IANA city, which is stable year-round.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
* fix(web): always include UTC in routine timezone picker
`Intl.supportedValuesOf('timeZone')` returns only canonical region
names on current runtimes (Node 24, recent browsers) and omits `UTC`,
so the previous picker dropped the most common non-local zone unless
the runtime itself was already UTC. The backend validator and the
contract examples still accept `UTC`, so a user on a non-UTC machine
could not create a documented UTC routine from Settings.
Prepend `UTC` inside `listSupportedTimezones()` when the runtime list
omits it, so the picker stays aligned with the supported schedule
surface.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
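The picker source described in the two timezone fixes above might look roughly like this. The fallback list is a hypothetical stand-in for the curated one, and the function carries a `Sketch` suffix because the real `listSupportedTimezones()` body is not shown here.

```typescript
// Hypothetical curated fallback for runtimes without Intl.supportedValuesOf.
const FALLBACK_TIMEZONES_SKETCH: string[] = [
  "UTC",
  "America/New_York",
  "Europe/London",
  "Asia/Shanghai",
];

function listSupportedTimezonesSketch(): string[] {
  // Read the API through a loose type so older lib targets still compile.
  const intl = Intl as unknown as {
    supportedValuesOf?: (key: string) => string[];
  };
  let zones: string[];
  try {
    zones =
      typeof intl.supportedValuesOf === "function"
        ? intl.supportedValuesOf("timeZone")
        : FALLBACK_TIMEZONES_SKETCH;
  } catch {
    zones = FALLBACK_TIMEZONES_SKETCH;
  }
  // Runtimes may omit UTC from the canonical list, so prepend it if missing.
  return zones.includes("UTC") ? zones : ["UTC", ...zones];
}
```

Checking for `UTC` before prepending keeps the list free of duplicates on runtimes that already report it.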
|
||
|
|
89be57b2c4 |
feat(genui): introduce GenUI surface management and event handling
- Added a new GenUI module for managing user interface surfaces, including creation, response handling, and state synchronization.
- Implemented API endpoints for listing and responding to GenUI surfaces associated with runs and projects.
- Introduced event types and payload helpers for GenUI surface events, enhancing the interaction model for headless operations.
- Established a persistent state writer for GenUI surfaces, ensuring reliable data management and retrieval.
- Enhanced the plugin system to support auto-derived OAuth prompts for required connectors, improving user experience during plugin application.
|
||
|
|
4c7cd5d9f2 |
feat(plugins): introduce plugin system with installation and management capabilities
- Added support for a new plugin system, allowing users to install, uninstall, and manage plugins through the daemon.
- Implemented API endpoints for listing installed plugins, retrieving plugin details, and applying plugins with input validation.
- Introduced a plugin doctor feature to validate plugin manifests and check for issues before application.
- Established a plugin persistence layer with SQLite migrations for managing installed plugins and their metadata.
- Enhanced the CLI with commands for plugin operations, improving user interaction with the plugin ecosystem.
|
||
|
|
9ef136ced5
|
fix: sync Orbit last run with selected prompt template (#937)
* fix(orbit): scope last run to selected template
* fix(orbit): preserve legacy last run on upgrade
* fix(orbit): pin legacy last-run fallback on refresh
* fix(orbit): pin template id at run start
* test(web): sync orbit fixtures with skill summary
|
||
|
|
e13adf2e63
|
feat(daemon): finalize design package endpoint (closes #450) (#832)
* feat(daemon): scaffold /api/projects/:id/finalize/anthropic (refs #450) Phase C of the PR 2 plan for issue #450: scaffold the route + module shape so subsequent phases (D-I) land function bodies and tests against a stable surface that already passes typecheck. What lands here: - apps/daemon/src/finalize-design.ts: module-level constants (DEFAULT_BASE_URL, DEFAULT_MAX_TOKENS=16000, INPUT_BODY_CAP_BYTES=384KiB, LOCK_FILENAME=.finalize.lock, OUTPUT_FILENAME=DESIGN.md, DEFAULT_TIMEOUT_MS=120s); inline interfaces for the request/response shape (kept out of packages/contracts per scope rules); two error classes - FinalizePackageLockedError (mirrors PR #493's TranscriptExportLockedError) and FinalizeUpstreamError (carries upstream HTTP status for the route's error mapping); function stub that throws "not yet implemented". - apps/daemon/tests/finalize-design.test.ts: vitest harness with describe.skip placeholder so the file imports cleanly. Real cases land in phases D-I. Default-import of node:fs (per memory: vi.spyOn cannot redefine on the frozen ESM Module Namespace; CJS exports object is mutable). - apps/daemon/src/server.ts: route handler at POST /api/projects/:id/finalize/anthropic, slotted next to the existing :id/deploy* family. Validates apiKey/model non-empty, optional baseUrl via the existing validateExternalApiBaseUrl closure (forbidden -> 403, invalid -> 400), optional maxTokens positive number; calls getProject (404 on miss); calls finalizeDesignPackage (which throws, caught and mapped to 500 for now); maps known error classes (FinalizePackageLockedError -> 409, FinalizeUpstreamError -> 502) pre-emptively. Path shape rationale (Bryan-confirmed): project-scoped path matches every sibling /api/projects/:id/* route in server.ts (deploy, deployments, deploy/preflight); provider-namespaced segment leaves a clean expansion line for /api/projects/:id/finalize/openai etc. as follow-ups. 
Field-name rationale: apiKey, baseUrl, model, maxTokens match ProxyStreamRequest verbatim (packages/contracts/src/api/proxy.ts:8-19) so a future caller can reuse the same body shape. baseUrl is optional here (intentional divergence from the proxy at server.ts which requires it) so standard Anthropic users do not need to set it; Bedrock / self-hosted-proxy users still can. Verification: pnpm --filter @open-design/daemon typecheck exits 0; finalize-design.test.ts loads cleanly with 1 skipped placeholder; no other tests touched. Refs nexu-io/open-design#450 (PR 2 scaffold; pipeline body in subsequent commits) * feat(daemon): transcript truncation helper for /finalize prompt Phase D of the PR 2 plan for issue #450: lands the helper that bounds the transcript section of the synthesis prompt. Why this exists: real-world signal at authoring time was a local project transcript already at 3.95 MB. Anthropic's claude-opus-4-7 context cap is roughly 200K tokens (~700 KB at typical density). Inserting an unbounded transcript would 4xx upstream on the first real call. This helper keeps the on-disk .transcript.jsonl lossless (PR #493's contract) while making the prompt-inclusion bounded. Strategy: - Cap output at INPUT_BODY_CAP_BYTES (384 KiB) so the prompt has room for the system prompt + design system body + current artifact + room for the synthesis output. - Always preserve the header line - it carries projectId, schemaVersion, conversation/message counts, attachment counts; synthesis quality depends on knowing the original sizes. - Split equal byte budgets between head and tail so both project genesis and most-recent intent survive. Two thinking segments separated only by mid-session truncation lose the same kind of boundary that PR #493 preserves between thinking blocks - that's accepted; smarter semantic chunking is a follow-up. 
- Insert a single `{"kind":"truncated","reason":"size","omittedBytes":N}` sentinel JSON line between the head and tail so a synthesis consumer can detect the gap. omittedBytes is the difference between the original UTF-8 byte length and the output's UTF-8 byte length. - If the head + tail budgets together cover the whole body (e.g. all message lines are tiny), no marker is emitted - the output is the input verbatim. Tests: - "returns the input verbatim when the JSONL fits under the 384 KiB cap" pins that small transcripts pass through unchanged with no marker. - "head+tail truncates with a single marker line when the JSONL exceeds the 384 KiB cap" pins that output is bounded, header survives, exactly one marker emitted with non-zero omittedBytes, both ends of the body preserved, and at least one middle message omitted. Suite delta: +2 tests in finalize-design.test.ts. Refs nexu-io/open-design#450 * fix(daemon): resolve noUncheckedIndexedAccess in truncateTranscriptForPrompt D1 (0eaa123) shipped with `body[headIndex]` and `body[i]` typed as `string | undefined` under TypeScript's `noUncheckedIndexedAccess` strict mode. Local typecheck would have caught it but the prior verification piped through `tail` which masked the non-zero exit code of `tsc`. Coalesce each access via `?? ''` (the array is from `String.split('\n')` so undefined elements are not actually reachable; the coalesce is a type-narrowing convenience, not a behavior change). Verification: `pnpm --filter @open-design/daemon typecheck` exits 0; `pnpm --filter @open-design/daemon test finalize-design` shows 2/2 + 1 skipped, identical to the pre-fix run. Refs nexu-io/open-design#450 * feat(daemon): current-artifact resolver for /finalize Phase E of the PR 2 plan for issue #450: resolves which artifact (if any) accompanies the transcript + design system in the synthesis prompt. Priority order (Bryan-locked in plan §6): 1. 
The file referenced by tabs.is_active = 1 IF an <name>.artifact.json sidecar exists on disk. Sidecar presence is the discriminator: an inferred manifest from `inferLegacyManifest` (e.g. for a bare .html with no sidecar) does NOT count, and an active tab pointing at a non-artifact file (.md, .txt) falls through. 2. Newest project file with a real .artifact.json sidecar, sorted by manifest.updatedAt descending. Files without an updatedAt sort last so legacy pre-streaming manifests do not get accidentally promoted. 3. Returns null - "no artifact in scope". The Phase H caller will emit `artifact: null` in the response and the prompt's "Current artifact" section will read "none". Sidecar presence is checked via `existsSync` on the on-disk path, NOT via the `artifactManifest` field returned by readProjectFile/listFiles (those run inferLegacyManifest as a fallback for known kinds, which would otherwise cause a bare .html with no sidecar to look like an artifact). Tests: - "returns the active-tab artifact when its sidecar is present, even if a newer artifact exists elsewhere": pinned.html (older updatedAt) is in the active tab; newer.html (newer updatedAt) is not. Resolver returns pinned.html - intent (active tab) beats recency. - "falls through to newest .artifact.json when active tab points at a non-artifact file": README.md is the active tab (no sidecar); design.html has a real sidecar. Resolver falls through and returns design.html. - "returns null when no active tab and no .artifact.json sidecars exist": only a README.md is in the project; no tabs row. Resolver returns null. Suite delta: +3 tests in finalize-design.test.ts (5 active total). Refs nexu-io/open-design#450 * feat(daemon): synthesis prompt construction for /finalize Phase F of the PR 2 plan for issue #450: builds the system + user prompts that get sent to Anthropic's Messages API in the synthesis call. Pure function; no IO, no side effects. 
System prompt (literal, stored as a module-level constant): instructs Claude to emit a DESIGN.md document with a fixed 7-heading structure (# DESIGN.md / ## Summary / ## Brand & Voice / ## Information Architecture / ## Components & Patterns / ## Visual System / ## Open Questions / ## Provenance). The Provenance section is required to list project ID, design system, current artifact, transcript message count, and the UTC generation timestamp. User prompt (built at runtime): structured payload with the truncated transcript JSONL, the design system body, and the current artifact body, each under a ## heading. Missing inputs (no design system selected, no artifact in scope) produce explicit "none" headings + parenthetical placeholder body so Claude does not hallucinate content for absent sections. Truncation is the caller's concern - this function does not re-truncate. The caller (Phase H pipeline) feeds in a JSONL that has already been bounded by truncateTranscriptForPrompt. Tests: - "includes the transcript JSONL verbatim and the generation context": pins all section headings, the transcript body verbatim, the design system body verbatim, the artifact body verbatim, and every generation-context line. - "falls back to \"none\" + parenthetical when no design system is selected": designSystemId=null and designSystemBody=null -> heading reads "## Active design system: none" with the parenthetical body. - "falls back to \"none\" + parenthetical when no artifact is in scope": artifact=null -> heading reads "## Current artifact: none" with the parenthetical body. Suite delta: +3 tests in finalize-design.test.ts (8 active total). Refs nexu-io/open-design#450 * feat(daemon): Anthropic call + retry strategy for /finalize Phase G of the PR 2 plan for issue #450: lands the upstream Claude Messages API call with a single transient-error retry, plus the response extractor that turns Anthropic's content array into the DESIGN.md body. 
What lands here:
- appendVersionedApiPath: inlined from the connectionTest helper at apps/daemon/src/connectionTest.ts:188-195 (it is not exported there). Appends /v1/messages when the base URL has no /vN segment, otherwise appends /messages directly. Same semantics; ~5 lines.
- callAnthropicWithRetry: POSTs to <base>/v1/messages with the canonical Anthropic headers (content-type, x-api-key, anthropic-version: 2023-06-01) and body shape ({ model, max_tokens, system, messages, stream:false }). One retry on transient (HTTP 429 or 5xx); on terminal failure throws FinalizeUpstreamError carrying the upstream HTTP status and raw body text. The route handler in Phase I maps status to AUTH_FAILED / RATE_LIMITED / UPSTREAM_FAILED and runs the body through redactSecrets before exposing it as `details`.
- extractDesignMd: concatenates content[].text for every block where type === 'text', preserving order. Throws FinalizeUpstreamError(502) on three malformed-response shapes: non-object payload, missing content array, zero text blocks. The route handler maps the throw to 502 UPSTREAM_FAILED so synthesis cannot land a half-empty DESIGN.md on disk.
- Test-only `_sleepMs` injection on the call params so the retry-delay sleep is instant under vitest. Default sleep uses setTimeout.
Retry posture (1 retry on transient) is opinionated; the maintainer's "standard exponential backoff" answer was directional and a single retry matches the existing daemon's posture (transcript export and connectionTest do zero retries) while staying inside the daemon's blocking-fast posture for /finalize.
Tests:
- callAnthropicWithRetry: throws on 401 with no retry; retries once on 429 and resolves on second 200; throws after both 5xx attempts; propagates AbortError when signal is pre-aborted.
- extractDesignMd: concatenates ordered text blocks; throws on missing content array; throws on content with zero text blocks.
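The two pure helpers described above can be sketched like this. The signatures, the `Sketch`-suffixed names, the error constructor order, and the choice to join text blocks with no separator are all assumptions; only the behaviors (version-segment detection, text-block concatenation, 502 on malformed shapes) come from the commit message.

```typescript
// Stand-in for FinalizeUpstreamError; the constructor shape is an assumption.
class FinalizeUpstreamErrorSketch extends Error {
  status: number;
  rawBody: string;
  constructor(status: number, rawBody: string, message: string) {
    super(message);
    this.name = "FinalizeUpstreamError";
    this.status = status;
    this.rawBody = rawBody;
  }
}

// Appends `/v1<path>` unless the base URL already carries a /vN segment.
function appendVersionedApiPathSketch(baseUrl: string, path: string): string {
  const trimmed = baseUrl.replace(/\/+$/, "");
  // Drop scheme and host so a host such as v1.example.com is not mistaken
  // for a version segment.
  const pathname = trimmed.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]+/i, "");
  const hasVersion = /\/v\d+(\/|$)/.test(pathname);
  return hasVersion ? `${trimmed}${path}` : `${trimmed}/v1${path}`;
}

// Concatenates the text of every `type === 'text'` block, in order.
function extractDesignMdSketch(payload: unknown): string {
  if (typeof payload !== "object" || payload === null) {
    throw new FinalizeUpstreamErrorSketch(502, "", "non-object payload");
  }
  const content = (payload as { content?: unknown }).content;
  if (!Array.isArray(content)) {
    throw new FinalizeUpstreamErrorSketch(502, "", "missing content array");
  }
  const texts: string[] = [];
  for (const block of content) {
    if (typeof block !== "object" || block === null) continue;
    const candidate = block as { type?: unknown; text?: unknown };
    if (candidate.type === "text" && typeof candidate.text === "string") {
      texts.push(candidate.text);
    }
  }
  if (texts.length === 0) {
    throw new FinalizeUpstreamErrorSketch(502, "", "zero text blocks");
  }
  return texts.join("");
}
```

Throwing on an empty result rather than returning `''` is what keeps a half-empty DESIGN.md from reaching disk.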
A spurious typecheck error from `exactOptionalPropertyTypes` (signal typed as AbortSignal | undefined where RequestInit expects AbortSignal | null) was resolved by conditionally spreading signal into the RequestInit literal. Suite delta: +7 tests in finalize-design.test.ts (15 active total). Refs nexu-io/open-design#450 * feat(daemon): wire /finalize pipeline end-to-end Phase H of the PR 2 plan for issue #450: stitches together every phase D-G primitive into the full finalizeDesignPackage pipeline that the route handler in Phase I will expose over HTTP. Pipeline (in execution order, all inside a try/finally that always releases the lockfile): 1. getProject(db, projectId): defensive 404 (the route validates first; this throw catches direct CLI/script callers). 2. mkdirSync(<projectDir>, { recursive: true }): some projects have DB rows but no on-disk dir yet (PR #493's same fix). 3. fs.openSync(.finalize.lock, 'wx'): EEXIST -> FinalizePackageLockedError (mirror PR #493's TranscriptExportLockedError). 4. exportProjectTranscript(db, projectsRoot, projectId, { now }): produces .transcript.jsonl on disk; we read the body and run it through truncateTranscriptForPrompt to bound the prompt-inclusion size. 5. readDesignSystem(designSystemsRoot, designSystemId): returns null when the project has no design_system_id selected, when the design system directory does not exist, or when the DESIGN.md file is missing. 6. resolveCurrentArtifact(db, projectsRoot, projectId): active tab -> newest .artifact.json by manifest.updatedAt -> null. 7. buildSynthesisPrompt({...}): system + user prompt (per Phase F). 8. callAnthropicWithRetry({...}): one retry on 429/5xx; throws FinalizeUpstreamError on terminal failure. 9. extractDesignMd(payload): concatenates content[].text blocks; throws FinalizeUpstreamError(502) on malformed shape. 10. Atomic write: writeFileSync({flag:'wx'}) -> reopen for fsync -> rename. Errors unlink tmp before rethrowing. 11. 
Lock release in finally (always closeSync + unlinkSync).
Bounded blocking: the function uses its own AbortController + 120s timeout when the caller does not supply a signal. Caller-supplied signal takes precedence.
Type tightening: switched the local Db interface to `type Db = Database.Database` (better-sqlite3) so the function signature is compatible with `exportProjectTranscript`'s typed parameter. Source file already had a `better-sqlite3` import in claude-design-import area of the daemon, so no new dependency.
Tests:
- "writes DESIGN.md atomically on the happy path": end-to-end with seeded project + conversation + 2 messages + design system on disk; asserts file at exact path + body bytes match the fetch mock.
- "response carries every documented field with correct types": designMdPath/bytesWritten/model/inputTokens/outputTokens/artifact/transcriptMessageCount/designSystemId all present and typed.
- "emits design system 'none' in the prompt when no design_system_id is set": fetch mock asserts on the body it receives.
- "throws FinalizePackageLockedError when .finalize.lock is already held": pre-create lockfile; assert throw + DESIGN.md not written + pre-existing lock NOT unlinked (we did not own it).
- "replaces an existing DESIGN.md atomically on a second finalize": inject a sentinel between two finalize calls; assert sentinel is gone after second run.
- "cleans up tmp file AND lock file on every error path": mock fs.writeFileSync to throw on the tmp path; assert no DESIGN.md.tmp.* remain, no DESIGN.md, no .finalize.lock.
- "uses the default https://api.anthropic.com baseUrl when baseUrl is omitted": fetch URL begins with the default; baseUrl=undefined path.
vi.restoreAllMocks() now runs in afterEach so the writeFileSync spy from the cleanup test does not leak into subsequent tests.
Suite delta: +7 tests in finalize-design.test.ts (22 active total).
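The lockfile and atomic-write steps of the pipeline above can be sketched in isolation. This is a simplified illustration, not the pipeline itself: the function names, the tmp-file naming, and the error class are hypothetical, and everything between lock acquisition and the write (transcript export, upstream call) is omitted.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Stand-in for FinalizePackageLockedError.
class FinalizeLockedErrorSketch extends Error {}

// Creates the lockfile exclusively; 'wx' fails with EEXIST if it exists.
function acquireLockSketch(lockPath: string): number {
  try {
    return fs.openSync(lockPath, "wx");
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") {
      throw new FinalizeLockedErrorSketch("finalize already in progress");
    }
    throw err;
  }
}

function writeDesignMdLockedSketch(projectDir: string, body: string): string {
  fs.mkdirSync(projectDir, { recursive: true });
  const lockPath = path.join(projectDir, ".finalize.lock");
  // If this throws we never owned the lock, so nothing below unlinks it.
  const lockFd = acquireLockSketch(lockPath);
  try {
    const finalPath = path.join(projectDir, "DESIGN.md");
    const tmpPath = `${finalPath}.tmp.${process.pid}.${Date.now()}`;
    try {
      fs.writeFileSync(tmpPath, body, { flag: "wx" });
      // Reopen to flush the bytes to disk before the rename publishes them.
      const fd = fs.openSync(tmpPath, "r+");
      try {
        fs.fsyncSync(fd);
      } finally {
        fs.closeSync(fd);
      }
      fs.renameSync(tmpPath, finalPath);
    } catch (err) {
      fs.rmSync(tmpPath, { force: true }); // drop the tmp file, then rethrow
      throw err;
    }
    return finalPath;
  } finally {
    // Always release the lock we own.
    fs.closeSync(lockFd);
    fs.unlinkSync(lockPath);
  }
}

// Usage against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "finalize-sketch-"));
const written = writeDesignMdLockedSketch(demoDir, "# DESIGN.md\n");
```

Acquiring the lock outside the `try`/`finally` is what leaves a pre-existing lock in place when another run holds it.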
Refs nexu-io/open-design#450 * feat(daemon): /finalize HTTP route handler + error mapping Phase I of the PR 2 plan for issue #450: replaces the Phase C stub's catch-all 500 with status-aware error mapping that surfaces the right HTTP status + error code for each documented failure mode, and adds HTTP-layer tests that boot startServer to exercise the route's validation branches. Route handler changes: - :id format guard: an inline regex matching isSafeId at apps/daemon/src/projects.ts:556-558 rejects unsafe ids with 400 BAD_REQUEST before any DB or filesystem work. Without this, an id like 'bad!id' would either fail getProject as 404 (wrong code) or reach the function and throw 'invalid project id' (mapped to 500). - FinalizeUpstreamError mapping is now status-aware: - upstream 401 -> 401 AUTH_FAILED - upstream 429 -> 429 RATE_LIMITED - upstream 5xx (or our own 502 sentinel for malformed responses) -> 502 UPSTREAM_FAILED In all cases the upstream raw text is run through redactSecrets so the apiKey cannot leak through `details` even if the upstream echoes the inbound headers. - AbortError mapping: when the 120s AbortController fires (or the caller pre-aborted the signal), surface as 503 TIMEOUT. - Default case: console.error the error per daemon convention; client sees 500 INTERNAL with the message routed through redactSecrets. - Imported redactSecrets alongside the existing connectionTest imports (apps/daemon/src/server.ts:51). HTTP-layer tests (boot startServer({port:0,returnServer:true}) once in beforeAll, mirror the proxy-routes.test.ts pattern): - "400 BAD_REQUEST when baseUrl is not a valid URL (test #13)": baseUrl='not-a-url'. - "403 FORBIDDEN when baseUrl points at a private internal IP (test #14)": baseUrl='http://10.0.0.1'. Note: validateBaseUrl explicitly allows loopback (for local OpenAI-compatible servers) and only blocks non-loopback private IPs (10/8, 172.16/12, 192.168/16, fc00::/7, fe80::/10). 
- "400 BAD_REQUEST when apiKey is missing (test #15)": apiKey omitted. - "400 BAD_REQUEST when :id contains characters outside the safe-id regex (test #16)": id='bad!id' contains '!' which is not in [A-Za-z0-9._-]. Suite delta: +4 tests (26 active in finalize-design.test.ts). Full daemon suite: 1078/1078 pass; baseline+26 (the +5 above plan target reflects retry+extract split into more granular unit tests than originally enumerated; all real, none skipped). Refs nexu-io/open-design#450 * fix(daemon): tighten isSafeId to reject pure-dot project ids Addresses the P1 path-traversal finding from @lefarcen on PR #832 (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512644). The pre-fix `isSafeId` at apps/daemon/src/projects.ts:556-558 used regex `/^[A-Za-z0-9._-]{1,128}$/` which permitted pure-dot ids (`.`, `..`, `...`) because `.` is in the character class. `projectDir` and `resolveProjectDir` both delegated to `isSafeId`, so an id of `..` would resolve to the PARENT of `.od/projects/` via `path.join`. Threat model (per @lefarcen): - An attacker creates a project row whose stored id is `..` (or another pure-dot variant) — for instance via a workflow that writes the row directly without going through the API. Subsequent finalize/write ops keyed by that id then escape the project tree. - A direct CLI / scripted caller passing `..` as the project id reaches the function without HTTP normalization saving us. (Express normalizes %2e%2e to .. and collapses path segments, which yields 404 for the URL `/api/projects/%2e%2e/...` in practice — but that's Express's protection, not ours.) Fix: - isSafeId now explicitly rejects pure-dot ids (`/^\.+$/.test(id)`) before the char-class regex check. Empty string and inputs longer than 128 chars are also rejected explicitly so the function fails closed on edge cases. 
- isSafeId is now exported from apps/daemon/src/projects.ts so the /finalize route handler in apps/daemon/src/server.ts can use the same validator instead of re-implementing the regex inline. This prevents drift between the route guard and the projectDir guard, which was how this hole originally appeared. Tests (in finalize-design.test.ts because that's where the threat was flagged; isSafeId is daemon-wide so a dedicated test file would also work): - isSafeId rejects `.`, `..`, `...`, `....` - isSafeId rejects ids with `/`, `\`, `!`, leading whitespace - isSafeId rejects empty string and >128 chars - isSafeId rejects non-string inputs (null/undefined/number) - isSafeId accepts plain ids, ids with mid-string dots, UUIDs, single chars Suite delta: +7 tests (33 active in finalize-design.test.ts). Full daemon suite: 1085/1085. Refs nexu-io/open-design#832 * fix(daemon): address PR #832 P1 findings — imported folders + network 502 Addresses two of the three P1 findings from @lefarcen on PR #832: 1. Imported-folder projects route DESIGN.md to metadata.baseDir (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512656, also flagged independently by @chatgpt-codex-connector at #discussion_r3202430470) The pipeline previously called `projectDir(projectsRoot, projectId)` unconditionally, which resolves to `.od/projects/<id>`. For projects created via /api/import/folder the project row's `metadata.baseDir` carries the user's actual folder; without threading metadata through, finalize would silently land DESIGN.md in the hidden daemon data dir and the current-artifact resolver would miss the user's real files. Fix: switch from `projectDir` to `resolveProjectDir(projectsRoot, projectId, metadata)` in both `finalizeDesignPackage` and `resolveCurrentArtifact`. Thread `project.metadata` (from `getProject`'s normalized row) through both call paths. The resolver gets a new optional `metadata` parameter; native projects pass null and get identical behavior. 2. 
Network failures and JSON parse errors now map to 502 UPSTREAM_FAILED (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512661) Pre-fix, only HTTP-non-OK responses were wrapped as FinalizeUpstreamError. DNS failures (ECONNREFUSED, ENOTFOUND), fetch TypeErrors, and `response.json()` SyntaxErrors fell through to the route's catch-all and surfaced as 500 INTERNAL — incorrect: those are upstream-level failures, not daemon bugs. Fix: - Wrap callAnthropicWithRetry in a try/catch that passes FinalizeUpstreamError and AbortError through verbatim, but rewraps any other thrown error as FinalizeUpstreamError(502, '', message). - Wrap response.json() in a try/catch that rewraps SyntaxError as FinalizeUpstreamError(502, '', "upstream Anthropic returned non-JSON body: ..."). - The route handler's existing FinalizeUpstreamError mapping then correctly maps these to 502 with the message in `details` (run through redactSecrets first). Tests: - "writes DESIGN.md under metadata.baseDir for imported-folder projects": inserts a project row with metadata.baseDir pointing at a user-folder temp dir; asserts result.designMdPath lands there AND the hidden .od/projects/<id> dir does NOT contain a DESIGN.md. - "rewraps fetch network rejection as FinalizeUpstreamError(502)": fetchImpl throws TypeError with cause.code='ENOTFOUND'; assert thrown error has name=FinalizeUpstreamError and status=502. - "rewraps 200 with non-JSON body as FinalizeUpstreamError(502)": fetchImpl returns 200 with text/html body; response.json() throws SyntaxError internally; assert FinalizeUpstreamError(502). Suite delta: +3 tests (36 active in finalize-design.test.ts). Full daemon suite: green at last check; will re-verify before push. 
Refs nexu-io/open-design#832
* refactor(daemon): move /finalize DTOs to contracts + map error codes + validate active-tab
Addresses the P2 and P3 findings from @lefarcen on PR #832:
P2: Error codes + DTOs not in packages/contracts
https://github.com/nexu-io/open-design/pull/832#discussion_r3202512673
Reverses my plan's locked decision #10 ("no contracts changes in this PR;
inline the request/response types"). That rule came from the predecessor
PROMPT brief's anti-pattern table; @lefarcen's review is fresher signal and
supersedes it. Drift risk between the daemon's inline types and any future
PR 3 web client is real.
- New contracts module: packages/contracts/src/api/finalize.ts with
FinalizeAnthropicRequest / FinalizeArtifactRef / FinalizeAnthropicResponse.
Re-exported from the package root and made addressable via
`@open-design/contracts/api/finalize` subpath.
- Daemon source imports the canonical types from contracts and re-exports
the public type names so internal references keep working without
touching every call site.
- Daemon-local error codes remapped to existing ApiErrorCode union members
(apps/daemon/src/server.ts), per @lefarcen's suggested mapping:
FINALIZE_IN_PROGRESS -> CONFLICT
AUTH_FAILED -> UNAUTHORIZED
UPSTREAM_FAILED -> UPSTREAM_UNAVAILABLE
TIMEOUT -> UPSTREAM_UNAVAILABLE (status 503)
INTERNAL -> INTERNAL_ERROR
HTTP status codes are unchanged; only the `code` field in the error JSON
body changed.
P3: Active-tab name not validated before sidecar probe
https://github.com/nexu-io/open-design/pull/832#discussion_r3202512684
resolveCurrentArtifact now runs the active tab's name through
validateProjectPath BEFORE composing it into a path.join expression. An
invalid tab (traversal segments, absolute path, null byte, reserved
segment) causes resolveCurrentArtifact to fall through to the
newest-artifact branch rather than abort or probe outside the project
directory.
Tests:
- "falls through (does not throw) when active tab name contains traversal
segments": injects a malformed `tabs.name = '../../../etc/passwd'` row
directly via SQL (bypassing production tab-creation validation), seeds a
real artifact, asserts the resolver returns the real artifact rather than
the malformed name.
Suite delta: +1 test (37 active in finalize-design.test.ts).
Full daemon suite: 1089/1089 green.
Refs nexu-io/open-design#832
* fix(contracts): publish /api/finalize as standalone runtime entrypoint
Addresses @mrcfps's CI-red review on PR #832
(https://github.com/nexu-io/open-design/pull/832, inline comment on
packages/contracts/package.json).
The previous J3 commit added `./api/finalize` as a type-only subpath: the
entry had only a `types` field, no `default`. That broke the contracts
package-runtime gate (packages/contracts/tests/package-runtime.test.ts:38-47)
which asserts every exports entry exposes both a `.mjs` runtime and a
`.d.ts` types target.
mrcfps proposed two fixes; this commit takes path B: make finalize a
first-class published module rather than a type-only re-export from the
package root.
Path B vs path A (a peer-AI second opinion via /collaborate confirmed):
under NodeNext + ESM with exports-map semantics, TypeScript validates
re-exported symbols against the published module-identity surface. Because
the previous J3 had `./api/finalize` neither declared as an exports-map
entry nor materialized as a standalone .mjs, TS omitted the re-exported
names during package boundary analysis. Even at runtime
`import('@open-design/contracts').FINALIZE_SCHEMA_VERSION` worked from the
bundled index.mjs but the type-checker rejected it. Path B aligns the
runtime and declaration surfaces.
Changes:
- packages/contracts/esbuild.config.mjs: add `./src/api/finalize.ts` to
entryPoints so dist/api/finalize.mjs is generated as a standalone module
rather than only inlined into the bundled root.
- packages/contracts/package.json: re-add `./api/finalize` to the exports
map with both `default: ./dist/api/finalize.mjs` AND
`types: ./dist/api/finalize.d.ts`. Mirrors `./api/connectionTest`'s shape
(the canonical pattern for first-class submodule entries).
- packages/contracts/src/api/finalize.ts: keep the runtime export
`FINALIZE_SCHEMA_VERSION = 1` (giving the standalone module a real value
to emit beyond the type-only interfaces) and update the doc-comment now
that the standalone .mjs is wired.
- apps/daemon/src/finalize-design.ts: switch the type import from the
inline declarations introduced in the prior J3 fallback to
`import type { ... } from '@open-design/contracts/api/finalize'`.
Re-export the names so internal references inside finalize-design.ts keep
working without touching every call site.
Verified:
- node --input-type=module -e "import('@open-design/contracts/api/finalize').then(m=>console.log(JSON.stringify(Object.keys(m))))"
prints ["FINALIZE_SCHEMA_VERSION"], so runtime resolution is clean.
- pnpm --filter @open-design/contracts test: 6/6 (including both
package-runtime.test.ts cases on the rebuilt exports map).
- pnpm --filter @open-design/daemon typecheck: exits 0.
- pnpm --filter @open-design/daemon test: 1089/1089 (no regression vs the
prior J3 number).
Refs nexu-io/open-design#832
---------
Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
|
||
|
|
d592f6087f
|
feat(mcp): external MCP client with daemon-managed OAuth and 39 design-focused templates (#898)
* feat(mcp): add external MCP client with daemon-managed OAuth and 17 design-focused templates
Open Design now acts as an MCP CLIENT and surfaces tools from third-party
MCP servers to the underlying agent (Claude Code, Hermes, Kimi).
Daemon
- New mcp-config / mcp-oauth / mcp-tokens modules: persist server entries
to .od/mcp-config.json, run the OAuth dance for HTTP/SSE servers
end-to-end on the daemon (so cloud deployments work and tokens
survive across turns), and inject Authorization: Bearer headers into
the per-spawn .mcp.json the daemon writes for Claude Code (or the
ACP mcpServers map for Hermes/Kimi).
- /api/mcp/servers and /api/mcp/oauth/{start,status,disconnect}
endpoints, plus spawn-time wiring in agents that hands the configured
servers to the active agent CLI.
- System-prompt directive for connected external MCPs so the model
does not chase Claude Code's synthetic *_authenticate /
*_complete_authentication tools when the Bearer is already pinned.
Web
- Settings -> External MCP servers panel with per-row OAuth Connect /
Disconnect / Refresh affordances and per-row template hints.
- New "Add server" picker categorized into 7 groups
(image-generation, image-editing, web-capture, ui-components,
data-viz, publishing, utilities) with a search box, sticky close
button, collapsible <details> sections (auto-expand on search),
60vh capped scroll region, and a pinned Custom-server footer.
- ChatComposer /mcp slash and MCP picker button forward to the new
Settings tab; AssistantMessage renders MCP tool calls inline;
markdown autolinker handles bare http(s) URLs (incl. OAuth links)
before italic markers so OAuth callback URLs do not get
italic-fragmented mid-token.
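The autolinker ordering described in the last Web bullet can be sketched
roughly as below. This is an illustration only, not the code in apps/web:
`renderInline`, `applyItalics`, the URL regex and the HTML output are all
assumptions made for the example.

```typescript
// Hypothetical sketch: linkify bare http(s) URLs first, then apply italic
// markers only to the text between links, so underscores inside a URL
// (for example `client_id` and `redirect_uri` in an OAuth link) are never
// treated as italic delimiters.
const URL_RE = /https?:\/\/[^\s<>()]+/g;

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Runs on escaped, URL-free text only.
function applyItalics(text: string): string {
  return text.replace(/_([^_\s][^_]*?)_/g, "<em>$1</em>");
}

function renderInline(source: string): string {
  let out = "";
  let cursor = 0;
  for (const match of source.matchAll(URL_RE)) {
    const start = match.index ?? 0;
    // Text before the URL may contain italics.
    out += applyItalics(escapeHtml(source.slice(cursor, start)));
    // The URL itself is emitted verbatim (escaped), with no italic pass.
    const url = escapeHtml(match[0]);
    out += `<a href="${url}">${url}</a>`;
    cursor = start + match[0].length;
  }
  out += applyItalics(escapeHtml(source.slice(cursor)));
  return out;
}
```

Running the italic pass over the whole string first would instead wrap the
text between the two underscores in `<em>` and break the link.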
Contracts
- packages/contracts/src/api/mcp.ts owns the wire shapes
(McpServerConfig, McpTemplate with stable McpTemplateCategory
enum, McpServersResponse, OAuth start/status/disconnect bodies, the
postMessage payload from the OAuth callback).
Templates (17 built-in)
- image-generation: Higgsfield (OpenClaw, OAuth HTTP), Pollinations,
Allyson (animated SVG), AWS Bedrock Image (uvx).
- image-editing: Imagician, ImageSorcery.
- web-capture: just-every screenshot-website-fast, ScreenshotOne.
- ui-components: 21st.dev Magic, shadcn/ui, FlyonUI.
- data-viz: AntV Chart, Mermaid.
- publishing: EdgeOne Pages.
- utilities: Filesystem, GitHub, Fetch.
Tests
- apps/daemon/tests/mcp-{config,oauth,tokens,spawn}.test.ts cover
storage round-trip, OAuth helpers, token persistence, spawn-time
wiring, every template's transport / command / args / env-field
invariants, and the canonical category enum.
- apps/web/tests/runtime/markdown.test.tsx covers the new autolinker
ordering rules.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(mcp): add 21 more design-focused templates and a `design-systems` category
Expands the built-in MCP picker from 17 to 38 templates so users can compose
the full Open Design craft loop (design-system intake → generate → edit →
audit → publish) without leaving the Settings dialog. Every install spec is
verified live against the upstream README; templates that needed Go binaries,
multi-step `init` ceremonies, or massive runtime stacks (PostgreSQL + Redis
+ Ollama) are intentionally deferred so picking a template still resolves to
a working server in one click.
New `design-systems` category between `web-capture` and `ui-components`
(reflects the upstream-of-components position in the workflow). Mirrored in
`McpTemplateCategory` on both contracts and daemon, and `CATEGORY_ORDER` on
the web side.
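A minimal sketch of how a shared category order could drive the picker
grouping. The order itself comes from the two commit messages above (the
original seven groups plus `design-systems` between `web-capture` and
`ui-components`); `TemplateLike` and `groupByCategory` are hypothetical
names, not the real web-side code.

```typescript
// Category order as described in this PR; the const-tuple trick keeps the
// union type and the display order in one place.
const CATEGORY_ORDER = [
  "image-generation",
  "image-editing",
  "web-capture",
  "design-systems",
  "ui-components",
  "data-viz",
  "publishing",
  "utilities",
] as const;

type McpTemplateCategory = (typeof CATEGORY_ORDER)[number];

// Hypothetical minimal template shape for the example.
interface TemplateLike {
  id: string;
  category: McpTemplateCategory;
}

// Groups template ids by category in CATEGORY_ORDER, preserving the
// declaration order inside each category and dropping empty categories.
function groupByCategory(
  templates: TemplateLike[],
): Array<[McpTemplateCategory, string[]]> {
  const groups: Array<[McpTemplateCategory, string[]]> = [];
  for (const category of CATEGORY_ORDER) {
    const ids = templates
      .filter((t) => t.category === category)
      .map((t) => t.id);
    if (ids.length > 0) groups.push([category, ids]);
  }
  return groups;
}
```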
New templates by category:
- image-generation (+4): prompt-to-asset (icons / favicons / OG / logos with
free-tier routing across Cloudflare AI / NVIDIA NIM / HF / Stable Horde),
Nano Banana (hosted streamable HTTP, virtual try-on + product placement),
Seedream (hosted streamable HTTP, ByteDance Seedream v3-v5 + SeedEdit),
fal.ai (uvx, 600+ models incl. FLUX / Kling / Hunyuan / MusicGen).
- image-editing (+3): Photopea (34 layered-editor tools — closes the PSD
gap), Topaz Labs (AI upscale / denoise / sharpen), Transloadit (86+ media
pipeline robots).
- web-capture (+1): Pagecast (browser → demo GIF / MP4 with auto-zoom).
- design-systems (+4, NEW category): Figma-Context (Framelink, designs →
code), Design Token Bridge (Tailwind ⇄ CSS ⇄ Figma ⇄ M3 / SwiftUI / W3C
DTCG + WCAG contrast), Design System Extractor (Storybook scrape),
Aesthetics Wiki (cottagecore / dark-academia / y2k / … moodboards).
- data-viz (+2): MCP Dashboards (45+ chart types + KPI dashboards),
Excalidraw Architect (hand-drawn architecture diagrams).
- publishing (+6): PageDrop, PDFSpark, OGForge, QRMint, Slideshot
(HTML → PDF / PPTX / PNG with 7 themes), Deckrun (Markdown → PDF / video,
hosted free tier with no key required).
- utilities (+1): A11y axe-core (WCAG 2.0/2.1/2.2 + color-contrast + ARIA).
Tests cover every new template's wiring (command, args, env / header
required-vs-optional, secret flag), the category enum invariant, and
in-category declaration order for image-generation, design-systems and
publishing buckets where the order is what users see in the picker. 21 new
test cases pass; full mcp-config suite is green.
Templates intentionally deferred (documented in PR body): figma-use
(needs Figma desktop with --remote-debugging-port=9222), m-moire (multi-step
`memi suite init` + daemon ceremony), gemini-media-mcp + trident-mcp (Go
binaries — no npx / uvx path), Pixelle-MCP (full app with web UI + ComfyUI
backend), storybook-addon-mcp (lives inside user's Storybook, not standalone),
primitiv (multi-step init / build / serve), ReftrixMCP (PostgreSQL + Redis +
Ollama + DINOv2), narasimhaponnada/mermaid (overlap with peng-shawn).
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(mcp): add figma-use template (write designs from chat) under design-systems
figma-use is the natural counterpart to Figma-Context already in this PR:
where Framelink reads Figma designs into the model, figma-use writes back
into the canvas (90+ tools — create frames / text / components / variants,
render JSX into Figma, export PNG/SVG, query nodes via XPath, lint for
WCAG / auto-layout / hardcoded colors, analyze design systems).
Wired as an HTTP MCP template (`http://localhost:38451/mcp`) because
`figma-use mcp serve` only exposes HTTP — there's no stdio mode in the
upstream `serve.ts`. No API key. Two prerequisites the user owns are
spelled out in the description so picking the template still resolves to
a working server: (1) start Figma with `--remote-debugging-port=9222`
(or `figma-use daemon start --pipe` on Figma 126+), and (2) leave
`npx figma-use mcp serve` running in a terminal.
Inserted between `design-system-extractor` and `aesthetics-wiki` so the
design-systems category reads as a workflow: read existing design (Figma
Context) → translate tokens (Token Bridge) → extract from Storybook
(Extractor) → write back to Figma (figma-use) → break creative block
(Aesthetics Wiki).
Tests cover the new template's transport (`http`), endpoint URL, the
empty header-fields invariant (no auth required), and bump the
design-systems group order to include it.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(settings): i18n the External MCP / MCP server / Connectors sidebar entries and make the dialog header track the active section
The External MCP sidebar entry this PR introduces was hardcoded English
("External MCP / Add MCP tools (Higgsfield, GitHub…)"). Same for the
adjacent Connectors and MCP server entries. The dialog header was also
pinned to "Execution & model" copy, so opening Settings → External MCP
showed a header that lied about which section the user was on.
Adds six translation keys — `settings.connectorsTitle/Hint`,
`settings.mcpServerTitle/Hint`, `settings.externalMcpTitle/Hint` — and
translates them across all 17 locales (ar, de, en, es-ES, fa, fr, hu, id,
ja, ko, pl, pt-BR, ru, tr, uk, zh-CN, zh-TW).
`SettingsDialog` now derives the header title/subtitle from the active
section (11 sections total) instead of a single hardcoded pair, so each
section renders an honest header.
Co-authored-by: Cursor <cursoragent@cursor.com>
* test(e2e): pin level: 3 on dialog heading lookups for Pets and Connectors
CI's Validate workspace job (#1479) failed two Playwright cases with the
strict-mode violation:
getByRole('dialog').getByRole('heading', { name: 'Pets' })
resolved to 2 elements:
1) <h2>Pets</h2>
2) <h3>Pets</h3>
Same root cause as the unit-test fix already in this PR: the dynamic
dialog `<h2>` now echoes the section's own `<h3>` because the dialog
header tracks the active section. Disambiguate to `level: 3` so each
assertion still pins the section heading specifically (which is what
the test intends to verify).
Audit of the rest of e2e/ for `dialog.getByRole('heading', ...)` —
settings-api-protocol.test.ts looks for "OpenAI API" / "Anthropic API"
section h3s which never appear in the dialog `<h2>` (always
"Execution & model"), so those stay safe.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(mcp): bind OAuth refresh to the issuing client and skip stale tokens
Persist the OAuth client context (token endpoint, client_id, client_secret,
issuer, redirect_uri, resource) alongside the bearer token so refresh hits
the same client the refresh_token was bound to (RFC 6749 §6). The previous
refresh path re-ran beginAuth with a dummy OOB redirect URI, which kept
getOrRegisterClient from finding the original DCR client and made
providers reject the refresh on the next chat turn. Refreshes now reuse
the persisted endpoint/client pair directly.
Also stop injecting expired access tokens at spawn time when refresh is
unavailable or fails. Pinning a stale Bearer made every Claude MCP call
401 while the prompt still treated the server as connected; on that path
we now skip the entry and let the UI surface a reconnect.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
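The spawn-time rule from this fix can be sketched as a pure decision
function. Everything here is an assumption for illustration: the
`StoredToken` fields, the `ConnectedServer` shape, the `type: "http"` entry
layout and the function names are hypothetical and are not taken from the
daemon source.

```typescript
// Hypothetical token record; the real persisted shape is not shown in this log.
interface StoredToken {
  accessToken: string;
  expiresAt?: number; // epoch milliseconds; undefined means no known expiry
}

// Hypothetical view of one connected server at spawn time. `refreshed` is
// the result of a refresh attempt made beforehand against the persisted
// token endpoint and client; null means refresh was unavailable or failed.
interface ConnectedServer {
  name: string;
  url: string;
  token: StoredToken;
  refreshed: StoredToken | null;
}

interface HttpServerEntry {
  type: "http";
  url: string;
  headers: Record<string, string>;
}

function isExpired(token: StoredToken, now: number): boolean {
  return token.expiresAt !== undefined && token.expiresAt <= now;
}

// Returns the bearer to pin, or null when only a stale token is available.
function selectBearer(server: ConnectedServer, now: number): string | null {
  if (!isExpired(server.token, now)) return server.token.accessToken;
  if (server.refreshed && !isExpired(server.refreshed, now)) {
    return server.refreshed.accessToken;
  }
  return null;
}

// Builds the per-spawn server map, skipping entries with no usable token
// so the UI can surface a reconnect instead of every call returning 401.
function buildSpawnServers(
  servers: ConnectedServer[],
  now: number,
): Record<string, HttpServerEntry> {
  const out: Record<string, HttpServerEntry> = {};
  for (const server of servers) {
    const bearer = selectBearer(server, now);
    if (bearer === null) continue;
    out[server.name] = {
      type: "http",
      url: server.url,
      headers: { Authorization: `Bearer ${bearer}` },
    };
  }
  return out;
}
```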
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
56bf6ee1b6
|
feat: agent-callable research command and /search (#615)
* feat: pre-generation research (Tavily) for grounded generation
Adds an optional pre-generation research step so the agent can produce
slides / prototypes / decks grounded in real sources instead of guessing.
User flow:
1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY).
2. Click the new Research button in the chat composer.
3. On send, the daemon runs a Tavily search, prepends the findings
as a <research_context> block ahead of the system prompt, and
spawns the agent. Research progress shows up as status pills in
the chat stream; the agent cites sources inline as [1]/[2]/...
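Step 3 could look roughly like the sketch below. The `<research_context>`
tag and the [1]/[2] numbering come from the description above; the field
names on `ResearchFindings` and `ResearchSource`, the line layout inside
the block, and both function names are assumptions for illustration.

```typescript
// Hypothetical DTO fields; only the DTO names appear in this commit message.
interface ResearchSource {
  title: string;
  url: string;
  snippet: string;
}

interface ResearchFindings {
  query: string;
  summary: string; // in Phase 1 this would be Tavily's `answer`
  sources: ResearchSource[];
}

// Renders findings as a block with 1-based source numbers, so the agent
// can cite them inline as [1], [2], and so on.
function buildResearchContext(findings: ResearchFindings): string {
  const lines = [
    "<research_context>",
    `Query: ${findings.query}`,
    `Summary: ${findings.summary}`,
    "Sources:",
    ...findings.sources.map(
      (s, i) => `[${i + 1}] ${s.title} (${s.url}): ${s.snippet}`,
    ),
    "</research_context>",
  ];
  return lines.join("\n");
}

// Prepends the block ahead of the system prompt; with no findings the
// prompt is returned unchanged (the no-research fallback).
function assemblePrompt(
  systemPrompt: string,
  findings: ResearchFindings | null,
): string {
  if (!findings) return systemPrompt;
  return `${buildResearchContext(findings)}\n\n${systemPrompt}`;
}
```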
Phase 1 surface:
- Single provider (Tavily), single depth ('shallow'), no LLM
synthesis pass (Tavily's `answer` is the summary).
- Composer toggle only; no popover / depth picker yet.
- Reuses the existing `status` SSE agent payload + StatusPill UI
so no new event variants or renderer code are needed.
Layers touched:
- contracts: ResearchOptions / Source / Findings DTOs;
ChatRequest.research; export from index.
- daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator
+ provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook
in startChatRun before prompt assembly.
- web: ChatComposer toggle + ChatSendMeta; threaded through
ChatPane / ProjectView / streamViaDaemon into ChatRequest.
Side fix (required to land the feature, but useful on its own):
contracts internal relative imports lacked the `.js` suffix that
NodeNext module resolution requires. This was already breaking
`pnpm --filter @open-design/daemon typecheck` on main; without the
fix, none of the new research types were visible to the daemon.
All internal contracts imports now carry `.js`.
Spec: specs/current/research-feature.md (phases 2-4 outlined for
follow-up: composer popover, multi-provider, deep recursion, example
skills with research_recommends).
Verified:
- pnpm --filter @open-design/contracts typecheck/test
- pnpm --filter @open-design/daemon typecheck (the chokidar
project-watchers test is a pre-existing flake, unrelated)
- pnpm --filter @open-design/web typecheck
- node scripts/verify-media-models.mjs
* fix(daemon): clamp Tavily max_results to 20
Tavily's /search endpoint requires `max_results` in [0, 20]; sending a
larger value (e.g. when `research.depth: "deep"` resolves to 30) returns
400 and `runResearch` silently falls back to no-research. Clamp at the
provider boundary so Phase 2 depth tiers above 20 still produce results
instead of failing the request.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
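A minimal sketch of the clamp at the provider boundary. The [0, 20] range
is the one stated above; `clampMaxResults`, the constant name, and the
handling of fractional or NaN inputs are assumptions for illustration.

```typescript
// Upper bound for `max_results` as stated in this commit message.
const TAVILY_MAX_RESULTS = 20;

// Clamps a requested result count into [0, TAVILY_MAX_RESULTS].
// Assumption: fractional values are truncated and NaN is treated as 0.
function clampMaxResults(requested: number): number {
  if (Number.isNaN(requested)) return 0;
  return Math.min(TAVILY_MAX_RESULTS, Math.max(0, Math.trunc(requested)));
}
```

With this in place a depth tier that resolves to 30 is sent as 20 rather
than triggering the 400 response and the silent no-research fallback.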
* Remove stale research merge leftovers
* Add agent-callable research search
* Fix Indonesian locale typecheck
* Fix research command invocation edge cases
* Harden slash search prompt expansion
* Honor research source caps in command contract
* Require search reports in design files
* Add research data provider settings
* Wire web research provider fallback order
* Update research provider fallback wording
* Revert "Update research provider fallback wording"
This reverts commit
|
||
|
|
e6e5928be1
|
feat(web): add connection tests for execution settings (#507)
* feat(settings): add connection test for providers and CLI agents Adds a "Test" action in the Settings dialog that verifies the configured provider (Anthropic/OpenAI/Azure/Google) or CLI agent without sending a real chat. Backed by a new daemon endpoint and shared contracts, with categorized inline statuses and i18n strings across all supported locales. * fix(settings): address connection test review feedback * fix(daemon): pass empty MCP servers for connection probes * fix(connection-test): address review blockers * fix(daemon): fail json stream runs on structured errors * fix(contracts): build connection test subpath export * Use draft CLI env in agent connection tests * fix(i18n): add fallback ids for new curated content |
||
|
|
c3d9136a0c
|
Add live artifacts and Composio connector catalog (#381)
* docs: add live artifacts implementation spec * docs: align live artifacts implementation plan * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 7: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 10: work in progress * Ralph iteration 11: work in progress * Ralph iteration 12: work in progress * Ralph iteration 13: work in progress * Ralph iteration 14: work in progress * Ralph iteration 15: work in progress * Ralph iteration 16: work in progress * Ralph iteration 17: work in progress * Ralph iteration 18: work in progress * Ralph iteration 19: work in progress * Ralph iteration 20: work in progress * Ralph iteration 21: work in progress * Ralph iteration 22: work in progress * Ralph iteration 23: work in progress * Ralph iteration 24: work in progress * Ralph iteration 25: work in progress * Ralph iteration 26: work in progress * Ralph iteration 27: work in progress * Ralph iteration 28: work in progress * Ralph iteration 29: work in progress * Ralph iteration 30: work in progress * Ralph iteration 31: work in progress * Ralph iteration 32: work in progress * Ralph iteration 33: work in progress * Ralph iteration 34: work in progress * Ralph iteration 35: work in progress * Ralph iteration 36: work in progress * Ralph iteration 37: work in progress * Ralph iteration 38: work in progress * Ralph iteration 39: work in progress * Ralph iteration 40: work in progress * Ralph iteration 41: work in progress * Ralph iteration 42: work in progress * Ralph iteration 43: work in progress * Ralph iteration 44: work in progress * Ralph iteration 45: work in progress * Ralph iteration 46: work in progress * Ralph iteration 47: work in progress * Ralph iteration 48: work in progress * Ralph iteration 49: work in progress * 
Ralph iteration 50: work in progress * Ralph iteration 51: work in progress * Ralph iteration 52: work in progress * Ralph iteration 53: work in progress * Ralph iteration 54: work in progress * Ralph iteration 55: work in progress * Ralph iteration 56: work in progress * Ralph iteration 57: work in progress * Ralph iteration 58: work in progress * Ralph iteration 59: work in progress * Ralph iteration 60: work in progress * Ralph iteration 61: work in progress * Ralph iteration 62: work in progress * Ralph iteration 63: work in progress * Ralph iteration 64: work in progress * Ralph iteration 65: work in progress * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 17: work in progress * Add Composio-backed connectors * Add Composio-backed connector catalog * Fix connector callback flow * Update live artifact connector refresh * Fix live artifact refresh updates * Improve live artifact viewer toolbar * Refine live artifact source tabs * Expand Composio connector catalog * Improve Composio connector browsing * Fix artifact refresh source safety checks Generated-By: looper 0.4.1 (runner=fixer, agent=opencode) * Fix live artifacts PR feedback Generated-By: looper 0.5.0 (runner=fixer, agent=opencode) * Fix live artifact preview CORS validation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix connector OAuth IPv6 loopback hosts Allow bracketed IPv6 loopback Host headers when deriving connector OAuth callback URLs so IPv6-bound daemons can complete connection flow. 
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Preserve live artifact refresh permissions Respect explicit refresh permission choices during live artifact create and update flows so revoked connector sources remain gated. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact preview cache freshness Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact refresh validation Guard manual refreshes with local daemon checks and reject daemon_tool sources without a toolName before refresh execution. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix Composio credential invalidation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact CORS methods Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix workspace validation Restore media config test isolation under Vitest setup data-dir overrides and add the missing French live artifact display copy so the workspace test suite stays aligned.\n\nGenerated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector safety filtering Keep agent-preview connector listings aligned with execution safety policy and prune stale Composio OAuth state records before they accumulate. 
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix agent runtime cleanup Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix live artifact daemon access Validate local-only live artifact routes against the peer socket address and pass daemon-resolved CLI paths to ACP MCP descriptors.\n\nGenerated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector run limit pruning Evict stale connector rate-limit buckets so long-lived daemon processes do not retain per-run entries indefinitely.\n\nGenerated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector compact schemas Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Improve connector connection feedback * Adjust connector gate positioning * Fix live artifact refresh commits Avoid marking refresh candidates failed after snapshot or state persistence errors by deferring live artifact mutations until the durable refresh metadata is written. Also align connector OAuth callback host validation with daemon loopback handling.\n\nGenerated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Improve connector search relevance * fix(daemon): harden connector connection state Require loopback daemon validation before connector connect side effects and only clear provider-owned connector statuses during credential reset. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard connector disconnect route Require local daemon request validation before connector disconnect side effects. 
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard composio config updates Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): dispatch live artifacts mcp first Route the live-artifacts MCP server before the generic MCP CLI so od mcp live-artifacts starts the dedicated server instead of failing generic argument parsing.\n\nGenerated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): handle integer connector schemas Allow JSON Schema integer connector inputs while preserving fractional-value validation so generated connector tool schemas accept valid page sizes and limits. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix: align live artifact refresh error codes Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact connector refresh flow * Update live artifact design cards * Add beta badge to live artifact form * Remove live artifact tile model * Fix live artifact refresh sync * Fix live artifact MCP refresh durability Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact refresh safety Enforce persisted refresh opt-out and connector auto-read gating before refresh sources execute. Generated-By: looper 0.5.5 (runner=fixer, agent=opencode) |
||
|
|
76e6c7a9f6
|
feat: Critique Theater Phase 4 (persistence + transcript + orchestrator) (#481)
* docs(specs): add Critique Theater design spec for panel-tempered artifacts * docs(specs): add Critique Theater implementation plan * docs(specs): rename UI to Design Jury, add lane-density modes, ship-rule explainer, label sizing * feat(contracts): add CritiqueConfig schema and defaults * fix(contracts): apply Task 1.1 review (CRITIQUE_PROTOCOL_VERSION rename, descriptions, RoleWeights export) * feat(contracts): add PanelEvent discriminated union and isPanelEvent guard * fix(contracts): apply Task 1.2 review (exhaustive event-type list, runId guard, import order) * feat(contracts): add CritiqueSseEvent variants and panelEventToSse mapper * test(daemon): add v1 wire-protocol golden fixtures for Critique Theater parser * feat(daemon): add v1 streaming parser for Critique Theater wire protocol * chore(contracts): add .js extensions to relative imports for NodeNext consumers * fix(daemon): satisfy noUncheckedIndexedAccess in v1 parser regex match access * test(daemon): cover parser failure modes; fix unclosed-PANELIST swallow bug * fix(daemon,contracts): address PR #387 review - parser now clamps panelist + DIM scores against the run-declared scale captured from <CRITIQUE_RUN scale=...>, not a hardcoded 100 - PANELIST appearing before any <ROUND n=...> opens now throws MalformedBlockError rather than emitting events with NaN round - DIM_RE and MUST_FIX_RE hoisted to module scope and lastIndex reset per call so the parser hot path stops recompiling regex per artifact - overflow check after drain simplified to a plain buf.length > cap test (the prior compound condition was always true on the right side and obscured intent) - scoreThreshold <= scoreScale refine gains a 1e-9 epsilon so floating slack does not reject semantically valid configs - round-1 designer ARTIFACT guard gains a comment naming the spec invariant and the v2 relaxation path - 3 new regression tests cover the panelist-without-round, scale=10 clamp, and scale=20 plumbing cases * docs(specs): rationale 
for non-goals, failure-mode rate targets, Phase 10 matrix, Phase 14 doc layout * Merge branch 'main' into feat/critique-theater Resolves the contracts/index.ts conflict by keeping the .js extensions added by chore(contracts) |
||
|
|
47eeaf445d
|
feat: Critique Theater foundation (contracts + parser, Phases 0-2) (#387)
* docs(specs): add Critique Theater design spec for panel-tempered artifacts * docs(specs): add Critique Theater implementation plan * docs(specs): rename UI to Design Jury, add lane-density modes, ship-rule explainer, label sizing * feat(contracts): add CritiqueConfig schema and defaults * fix(contracts): apply Task 1.1 review (CRITIQUE_PROTOCOL_VERSION rename, descriptions, RoleWeights export) * feat(contracts): add PanelEvent discriminated union and isPanelEvent guard * fix(contracts): apply Task 1.2 review (exhaustive event-type list, runId guard, import order) * feat(contracts): add CritiqueSseEvent variants and panelEventToSse mapper * test(daemon): add v1 wire-protocol golden fixtures for Critique Theater parser * feat(daemon): add v1 streaming parser for Critique Theater wire protocol * chore(contracts): add .js extensions to relative imports for NodeNext consumers * fix(daemon): satisfy noUncheckedIndexedAccess in v1 parser regex match access * test(daemon): cover parser failure modes; fix unclosed-PANELIST swallow bug * fix(daemon,contracts): address PR #387 review - parser now clamps panelist + DIM scores against the run-declared scale captured from <CRITIQUE_RUN scale=...>, not a hardcoded 100 - PANELIST appearing before any <ROUND n=...> opens now throws MalformedBlockError rather than emitting events with NaN round - DIM_RE and MUST_FIX_RE hoisted to module scope and lastIndex reset per call so the parser hot path stops recompiling regex per artifact - overflow check after drain simplified to a plain buf.length > cap test (the prior compound condition was always true on the right side and obscured intent) - scoreThreshold <= scoreScale refine gains a 1e-9 epsilon so floating slack does not reject semantically valid configs - round-1 designer ARTIFACT guard gains a comment naming the spec invariant and the v2 relaxation path - 3 new regression tests cover the panelist-without-round, scale=10 clamp, and scale=20 plumbing cases * docs(specs): rationale 
for non-goals, failure-mode rate targets, Phase 10 matrix, Phase 14 doc layout * Merge branch 'main' into feat/critique-theater Resolves the contracts/index.ts conflict by keeping the .js extensions added by chore(contracts) |
||
|
|
9d700ec74f
|
feat(daemon): persist code agent startup (#255)
* feat(daemon): persist code agent startup * fix: complete all suggestions * fix: types for app config * chore: revert local origin * chore: format to single quotes * fix: duplicate headers * fix: isLocalSameOrigin rewriting issue --------- Co-authored-by: mrcfps <mrc@powerformer.com> |
||
|
|
0c00f241e7
|
Add preview comment attachments (#284) | ||
|
|
59e4966dda
|
feat(version): add app version awareness (#204)
* feat(version): add app version awareness * fix(version): detect packaged sidecars across platforms |
||
|
|
3fb849d047
|
Fix chat runs surviving web disconnects (#146)
* fix chat runs surviving web disconnects * fix chat run create abort propagation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon keepalive reconnect budget Generated-By: looper 0.0.0-dev (runner=fixer, agent=gpt-5.5) * fix daemon stream disconnect cancellation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon stream abort cancellation race Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon run cancellation semantics * fix load * doc * 2 * add run refresh recovery * fix active run refresh status * fix reattach abort handling * fix * fix chat initial scroll * fix daemon start failures Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix stop run status Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * extract daemon run service * move prompt composition to daemon * fix prompt module resolution * fix project id generation * add project run status * add designs kanban view with awaiting_input status - add grid/kanban view toggle on Designs tab; persist choice in localStorage - introduce awaiting_input project display status (daemon-derived from unanswered <question-form>) so projects asking the user aren't shown as Completed; ordered between Running and Completed with amber accent - hide transient queued state from users: coerce queued/starting to running in daemon /api/projects projection and drop the queued kanban column - a11y polish on Designs cards: Space activation, aria-labels on delete, focus-visible outlines, reveal delete on focus-within and touch, prefers-reduced-motion handling - kanban layout uses flex sizing instead of viewport math; scoped icon- only pill button rule fixes view-toggle icon alignment --------- Co-authored-by: mrcfps 
<mrc@powerformer.com> |
||
|
|
56d08b8c5f
|
Add shared contracts and migrate project code to TypeScript (#118) |