open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Author	SHA1	Message	Date
Caprika	21dcd74abb	Increase agent inactivity timeout (#1071)	2026-05-09 17:07:47 +08:00
Marc Chan	223d35f073	fix: improve Orbit and packaged data-dir startup errors (#1067)	2026-05-09 16:47:01 +08:00
Herédi Áron	66f84972cf	feat(byok): Added Ollama Cloud as BYOK provider (#923) * feat: add Ollama Cloud to KNOWN_PROVIDERS as OpenAI-compatible BYOK provider * feat: add ollama.com to isOpenAICompatible base URL detection * feat: add Ollama Cloud models to SUGGESTED_MODELS_BY_PROTOCOL fallback list * fix: use full Ollama Cloud model list from /api/tags, drop -cloud suffix * feat: add Ollama Cloud as native protocol with NDJSON streaming and connection test support * fix: remove ollama.com from OpenAI compatibility check * feat: add token overrides for Ollama Cloud models to prevent truncation * fix: extend inferApiProtocol and legacy migration to recognize ollama.com base URLs * fix: normalize Ollama Cloud base URL by stripping /api suffix during migration and in daemon --------- Co-authored-by: herediaron <aronheredi346@gmail.com>	2026-05-09 11:21:16 +08:00
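The native-protocol commit above mentions NDJSON streaming. A minimal sketch of how a chunked text stream can be split into newline-delimited JSON records follows; `NdjsonSplitter` is a hypothetical helper written for illustration, not the project's actual code, and it assumes chunks have already been decoded to strings.

```typescript
// Hypothetical helper: buffers partial lines across chunks and returns parsed records.
class NdjsonSplitter {
  private buffer = "";

  // Feed one decoded text chunk; returns every complete JSON record the chunk closed.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is "" when the chunk ended on a newline, else a partial line.
    this.buffer = lines.pop() ?? "";
    const records: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue; // skip blank lines
      records.push(JSON.parse(trimmed));
    }
    return records;
  }

  // Call at end of stream to parse a final record that had no trailing newline.
  flush(): unknown[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? [JSON.parse(rest)] : [];
  }
}
```

The point of the buffer is that a record may straddle two network chunks, so parsing each chunk on its own would fail on the split line.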
lefarcen	afb331a288	feat: add opt-in Langfuse telemetry (#800 ) * docs(specs): add langfuse telemetry change spec Captures the design for forwarding completed agent runs to Langfuse, including data-model mapping, field-budget caps, privacy gates, build-secret injection, GDPR right-to-deletion approach, and the resolved decisions on default consent, identifier shape, region, and ownership. * feat(daemon): add langfuse-trace module and telemetry prefs Adds the dependency-free building blocks for forwarding completed agent runs to Langfuse. Two layers: - AppConfigPrefs gains installationId and a TelemetryPrefs object with metrics / content / artifactManifest gates. The daemon validator treats telemetry like agentModels — replace-on-write, drop-when-empty, reject non-boolean inner values. - New langfuse-trace.ts builds a {trace-create, generation-create} pair from a ReportContext, capping prompt at 8 KB, output at 16 KB, artifacts at 50 entries, and dropping any batch larger than 1 MB before send. reportRunCompleted is no-op when LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY are unset (so dev runs and forks never emit) and short-circuits on prefs.metrics === false. Server-side wiring into the run-close path lands in a follow-up. * fix(langfuse): default to US Langfuse region End-to-end smoke against the project's actual dev key on 2026-05-07 returned 401 from cloud.langfuse.com (EU) and 207 from us.cloud.langfuse.com (US), confirming the org lives in US. Update the default base URL, the matching test, and the spec's Q3 decision row to match. Self-hosted or EU-region operators can still override via the LANGFUSE_BASE_URL env var. * feat(daemon): wire langfuse trace forwarding into run-close Adds the daemon-side glue to forward completed agent runs: - runs.ts gains an optional onTerminate hook fired once per run after it reaches a terminal state. Errors thrown from the hook are caught and logged, never propagated, so telemetry can never break the run path. 
- New langfuse-bridge.ts assembles a ReportContext from the in-memory run record, the conversation's persisted assistant message, and the user's app-config preferences. It tolerates a missing message (e.g. when web has not yet PUT the final delta) and a missing app-config. - server.ts stashes the original user prompt on the run object inside startChatRun so the bridge can include it without crossing the createChatRunService boundary, and registers the hook callback when building the run service. Behavior remains a no-op unless LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY are set in the daemon env AND telemetry.metrics is true in app-config. A live smoke against us.cloud.langfuse.com on 2026-05-07 confirmed the matching trace + generation schema is accepted (HTTP 207, both events 201 created). * fix(langfuse): address PR #800 review feedback P1 — Move trace forwarding off the daemon-internal run-close hook and onto the message-persistence path. The original onTerminate hook ran inside finish() the moment the SSE 'end' event was emitted, which is before the web client's onDone handler refreshes project files and PUTs producedFiles + final assistant content back to SQLite. Reading SQLite at that moment routinely missed both. The fix: drop the runs.ts hook entirely and trigger from PUT /api/projects/:id/conversations/:cid/ messages/:mid when the saved row carries a terminal runStatus. A reportedRuns Set guards against the multiple PUT calls web makes per turn (each retry / state update). Set entries auto-evict after the same 30 min TTL the runs map uses. Web persists a terminal-status message in all three completion paths — onDone (succeeded), onError (failed), and cancel (canceled) — so this catches every run shape. P2 — postLangfuseBatch now parses the 207 Multi-Status response body. Langfuse legacy ingestion always returns 207, and response.ok is true for 207, so per-event validation errors used to slip through silently. We now warn when body.errors is non-empty. 
Two new unit tests. P2 — truncate() and the HARD_BATCH cap now compare UTF-8 byte length, not String.length (which counts UTF-16 code units). A 4096-character CJK prompt occupies 12 KB, well over the 8 KB input cap. truncate also walks backwards to a UTF-8 leading byte so the cut never lands inside a multi-byte codepoint. New unit test covers '设'.repeat(4096). P2 — Spec R7 now lists the actual Langfuse trace deletion endpoint (DELETE /api/public/traces/{traceId} for single, DELETE /api/public/traces with body for batch). Verified by curl on us.cloud.langfuse.com: DELETE /api/public/traces/X → 200; the path the original spec named (POST /api/public/trace/X) returns 404. Reference link points at langfuse.com/docs/administration/data-deletion. P3 — Q4 (legacy ingestion vs OTel) moved from Open Questions to Resolved Decisions. The implementation already commits to legacy and the trade-off was discussed during design; the open-question status was stale. * feat(web): privacy consent surface + Settings → Privacy tab Adds the user-facing half of the telemetry feature so the daemon-side hook from PR #800 has something to talk to. - AppConfig gains optional `installationId` (anonymous v4 uuid generated on first opt-in; null after explicit decline; undefined when the user has never seen the consent surface) and `telemetry: TelemetryConfig` ({metrics, content, artifactManifest}). syncConfigToDaemon round-trips both fields so the bridge module sees the same prefs. - SettingsDialog grows a Privacy section with two states. When the user has never made a consent decision (typical first-run path), the section renders the GDPR-aligned consent card: a kicker, the disclosure body listing both metrics and conversation content as separate bullets, and two equally-prominent buttons ("Share usage data" / "Don't share"). The Don't-share path keeps the app fully usable (core app must work with all tracking declined). 
After a decision the same panel switches to three independent toggles + the anonymous ID + a "Delete my data" button that rotates the ID and turns everything off. - App.tsx points the welcome modal at the new Privacy section so the consent decision is the first thing a fresh installation sees. - 17 i18n keys land in en + zh-CN + zh-TW with hand-translated copy, and as English placeholders in the remaining 14 locales — enough for the parity check to pass while leaving room for proper localisation in a follow-up. Dict type updated. - Minimal index.css for the consent card + toggle rows so the panel is legible without depending on follow-up design polish. Telemetry remains a no-op end-to-end until the user clicks Share usage data: the daemon gate (prefs.metrics === true) keeps every code path short-circuited otherwise. * refactor(web): rebuild Privacy panel using project-native settings primitives The first cut used custom .settings-privacy-* classes + raw HTML checkboxes that didn't match any other Settings tab. Replace with the shell other sections already use: - settings-subsection containers with section-head + h4 + .hint - seg-control / seg-btn pill toggles ("active" / "offline") for each of the three telemetry preferences, mirroring NotificationsSection - a 2-cell seg-control for the consent card so Share usage data and Don't share carry identical visual weight (the GDPR equal-prominence requirement that the previous accent / outline split missed) - ghost button + readonly text input for the installation id row, mirroring the API-key field pattern elsewhere Drop the bespoke CSS block in favor of inheriting the existing settings-section / seg-control / ghost styling. The only privacy- specific style left is a tight definition list inside the consent card for the metrics + content disclosure rows. 
* refactor(web): use .toggle-row iOS switch for Privacy preferences Active/offline pills (the seg-control single-cell pattern that NotificationsSection uses) read awkwardly for a flat preference list. Switch the three telemetry toggles to .toggle-row — the same control NewProjectPanel uses for "speaker notes" / "animations": label + hint on the left, iOS-style sliding switch on the right, full-row click target. The consent card's two-button seg-control stays as-is — there the equal-weight pill pair is exactly what GDPR equal-prominence wants. * feat(web): standalone first-run privacy consent banner Replaces the Settings-dialog-as-onboarding hack with a dedicated bottom-right banner card that mounts whenever the user has never made a privacy decision (cfg.installationId === undefined). The banner is prominent (anchored to the corner with a soft shadow) but non-blocking, mirrors cookie-consent UX, and shares the project's panel styling — same .modal-elevated background, --radius-lg corners, --shadow-lg lift. Wiring: - App.tsx imports PrivacyConsentModal and renders it at the root, gated on installationId === undefined && !settingsOpen so it doesn't double up with the Privacy tab's own consent card when Settings is already showing. - Share / Don't share both go through handleConfigPersist, so the resulting installationId + telemetry prefs land in localStorage and the daemon at the same time, reusing the existing autosave plumbing. - The previous attempt that pinned the welcome SettingsDialog to the Privacy section is reverted; onboarding now stays focused on agent configuration, and the consent decision lives in its own surface. * fix(web): keep privacy banner visible while Settings welcome modal is open The banner gated itself on `!settingsOpen` to avoid double-rendering with the Privacy tab's consent card. 
But the first-run path opens the Settings welcome modal automatically when `onboardingCompleted=false`, which fired immediately after bootstrap — so the banner flashed for a moment and then vanished behind the modal backdrop. Drop the `!settingsOpen` clause so the banner stays mounted whenever the user has not yet made a privacy decision, and bump its z-index above the modal backdrop (200 vs 100) so first-run users can actually reach the consent buttons. The minor visual overlap with the Privacy tab's own card is fine: clicking either copy resolves both surfaces. * copy(privacy): soften consent button labels Banner action buttons now read "Help improve Open Design" / "Not now" (en, with hand translations in zh-CN / zh-TW and English placeholders in the other 13 locales) instead of "Share usage data" / "Don't share". The new wording aligns the affirmative action with the kicker copy ("Help us improve Open Design") and reads less alarming, while the disclosure list above still names both data categories explicitly so the consent stays informed under GDPR. The decline button stays as a soft "Not now" rather than an aggressive "Don't share" so the reject path doesn't read as hostile to the user. No structural change — the two-cell seg-control still gives the buttons identical visual weight, and the underlying side-effects are unchanged (installationId is generated on Help / nulled on Not now, and the telemetry prefs flip the same way). * feat(telemetry): expand trace fields for evals & dataset construction Each Langfuse trace now ships the full per-turn + per-install fact sheet that the eval/dataset workflow needs, instead of only the bare turn id + token count from before. Everything below is gated by `prefs.metrics === true`; nothing here is content (those gates remain separate). Per-turn: - model — first-class generation.model field, drives Langfuse cost lookup and model-grouping in the UI; also mirrored in trace.metadata and trace.tags so list-view filters work. 
- reasoning — generation.modelParameters.{ reasoning } so the Model Parameters card lights up; mirrored in metadata. - skillId / designSystemId — metadata + tags, so dataset slices can group by which skill/DS produced which output. Per-process / build (constant within one daemon run, cached at start): - appVersion / appChannel / packaged from app-version.ts - nodeVersion (process.version), os (platform()), osRelease, arch (os.arch()) - clientType — desktop vs web, derived from a new X-OD-Client header the web layer sets in providers/daemon.ts (with a User-Agent sniff fallback for third-party callers). Plumbing: - startChatRun stashes model / reasoning / skillId / designSystemId on the run object alongside the existing userPrompt stash. - POST /api/runs reads X-OD-Client and stores run.clientType. - langfuse-bridge collects RuntimeInfo once per process and merges per-run client carrier; ReportContext gains optional `turn` + `runtime` blocks; existing fields stay backward compatible. Spec gains a "Telemetry Fields Catalog" section enumerating every field, its source, and the gate it lives under, so the eval team has a single place to look up what's available without reading the trace schema by example. Tests: - new langfuse-trace tests cover turn tags, runtime tags, generation model/modelParameters promotion, modelParameters omission when reasoning is unset, and metadata mirroring. - langfuse-bridge gains an end-to-end "turn-level config" test that threads model/reasoning/skill/DS/clientType + appVersion through the bridge and asserts the Langfuse payload shape. - existing tests adjusted to tolerate host-dependent os tag. * copy(privacy): trim Share button to verb phrase only "Help improve Open Design" overflowed the equal-width 2-cell seg-control on the consent banner — the product name is already in the kicker + headline above the buttons, so the button itself only needs the verb phrase. 
Drop the product name from all locales: - en: Help improve Open Design → Help improve - zh-CN: 帮助改进 Open Design → 帮助改进 - zh-TW: 協助改進 Open Design → 協助改進 The decline button ("Not now" / "暂不" / "暫不") was already short, so the two buttons now have comparable length and the equal-prominence seg-control fits cleanly. Standalone Settings → Privacy panel uses the same labels for consistency. * fix(web): defer Settings welcome modal until privacy decision is made Previously bootstrap raced two surfaces against each other on first launch: the privacy consent banner (gated on installationId === undefined) and the Settings welcome modal (gated on onboardingCompleted === false). The banner's higher z-index kept it above the backdrop visually, but having two foreground surfaces at once is still confusing UX. Sequence them instead: bootstrap only opens the welcome modal when the user has already resolved consent (installationId !== undefined). Until then the banner owns the foreground alone. Once the user clicks Help improve / Not now, the corresponding handler hands off to the welcome modal if onboarding is still pending. End state matches what it was before — just without the simultaneous-render flash. * debug(privacy): log banner gate state to track sudden disappearance Two console.log points to find which setCfg call (or stale bundle) is flipping cfg.installationId from undefined to a value while the banner is visible. To remove once the regression is reproduced. * fix(privacy): keep installationId + telemetry out of localStorage Daemon is now the single source of truth for the privacy decision. 
Why this matters: the consent banner gates on \`config.installationId === undefined\`, but loadConfig() merges localStorage on top of the daemon's reply, so a stale uuid in \`open-design:config\` (left over from a previous opt-in) was re-hydrating the React state and immediately syncing back to the daemon — defeating "Delete my data" and re-suppressing the banner within milliseconds of every page load. The deeper reason to fix it here, not just patch the gate: a privacy identifier persisted in browser storage that the user can't see or clear without DevTools is a compliance liability. Anything users can revoke needs one canonical place to store it. Daemon \`app-config.json\` already serves that role for everything else gated through syncConfigToDaemon, so installationId + telemetry now ride that path exclusively: - saveConfig() strips both keys before writing localStorage. - loadConfig() strips both keys when reading older stale payloads, so existing installs migrate transparently on next launch. - syncConfigToDaemon() / mergeDaemonConfig still round-trip them, so the React state stays in sync with the daemon as before. Net effect: clearing app-config.json (or hitting "Delete my data") now fully resets the install identity, with no residual cohort key in browser storage. * feat(privacy): scrub secrets + PII from prompt/output before send When prefs.content is on, daemon now runs the prompt and assistant text through a regex scrubber (apps/daemon/src/redact.ts) before posting to Langfuse. The scrubber is the simplest thing that gives the user-facing copy a truthful claim — pure regex, zero new dependencies, fully auditable in this Apache-2.0 repo (vs. pulling a single-maintainer 5-month-old npm package into a core process). Categories covered (each replaced with [REDACTED:<kind>]): - Anthropic / OpenAI sk- keys (incl. 
proj/live/test/ant variants) - Langfuse pk-lf- / sk-lf- (specific rule wins over generic sk-) - GitHub gh[opsur]_ tokens - AWS access key ids (AKIA + 16 uppercase) - Google API keys (AIza + 35) - Slack xox[abprs]- tokens - Stripe live/test keys - JWT header.payload.signature triples - Bearer-header values (scheme word stays readable) - Emails, IPv4, US-style phone numbers - Credit cards — 13–19 digit runs that pass a Luhn check, so order ids and unix-nanos timestamps that fail Luhn pass through unchanged Not covered, stated openly in spec + i18n: names, postal addresses, business-secret semantics, raw 40-hex tokens (too high a false-positive cost for artifact slugs). Those would require an ML layer. Wired in: - apps/daemon/src/redact.ts — exports redactSecrets() + redactSecretsWithCounts() helper for future audit-summary metadata. - apps/daemon/src/langfuse-bridge.ts — runs both prompt and output through redactSecrets() before they reach the trace builder. - 18 unit tests cover every pattern plus negative cases (Luhn-failing digit runs, out-of-range IPv4 octets, idempotence on re-redacted text, ordinary prose passthrough). - i18n privacyContentHint on en + zh-CN + zh-TW (plus 14 locale placeholders) enumerates the categories so the consent disclosure matches the implementation — the GDPR informed-consent requirement. - spec gains a Pre-send Redaction subsection with the regex shape table + intentional non-coverage list. Drive-by: dropped the [privacy] debug logs that traced the now-fixed bootstrap regression. * fix(telemetry): make Langfuse reporting resilient * feat(telemetry): nest Langfuse turn observations * feat(telemetry): emit Langfuse tool spans * fix(telemetry): report after finalized message writes * fix(telemetry): honor persisted terminal status * fix(web): let consent banner yield page clicks * fix(telemetry): report current turn prompt only	2026-05-09 10:06:01 +08:00
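Two of the techniques this commit describes lend themselves to a short sketch: truncating by UTF-8 byte length without cutting inside a multi-byte codepoint, and redacting digit runs only when they pass a Luhn check. The function names, the `credit_card` kind label, and the restriction to contiguous digit runs are assumptions made for illustration; the real `truncate()` and `redact.ts` may differ.

```typescript
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

// Truncate `text` to at most `maxBytes` UTF-8 bytes without splitting a codepoint.
function truncateUtf8(text: string, maxBytes: number): string {
  const bytes = utf8Encoder.encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = maxBytes;
  // Continuation bytes look like 0b10xxxxxx; walk back until the first
  // excluded byte is a leading byte, so the cut lands between codepoints.
  while (end > 0 && ((bytes[end] ?? 0) & 0xc0) === 0x80) end--;
  return utf8Decoder.decode(bytes.subarray(0, end));
}

// Luhn checksum: doubles every second digit from the right and sums the result.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return digits.length > 0 && sum % 10 === 0;
}

// Replace 13-19 digit runs only when they pass Luhn; other runs pass through.
function redactCards(text: string): string {
  return text.replace(/\b\d{13,19}\b/g, (run) =>
    passesLuhn(run) ? "[REDACTED:credit_card]" : run,
  );
}
```

With an 8192-byte cap, a prompt of 4096 three-byte CJK characters (12288 bytes) is cut to 2730 whole characters rather than ending on a partial one, and a digit run that fails the checksum, such as an order id, is left untouched.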
haocn-ops	5c67651bfd	fix: make Azure api version optional (#941) Co-authored-by: haocn-ops <259245673+haocn-ops@users.noreply.github.com>	2026-05-09 09:20:25 +08:00
jnalv414	6c0de72826	fix(daemon): write SSE events atomically in createSseResponse.send (#972) Three separate res.write() calls per event (id, event, data) split into three TCP chunks on localhost. Consumers that read chunk-by-chunk and look for the `event: <type>` marker can return before the matching `data:` payload arrives in the next chunk, producing partial events. Combine the three writes into one so each SSE event lands as a single TCP frame for sub-MTU events. Fixes the four chat-route SSE tests (`surfaces Qoder assistant error records...`, `fails Qoder runs when the result reports is_error...`, `fails stalled json-stream runs after the inactivity timeout elapses`, `marks stalled runs failed even when the child ignores SIGTERM`) which were racing against the chunk split in the readSseUntil test helper. Daemon test suite: 86/86 pass after the change.	2026-05-09 09:12:22 +08:00
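The fix described above can be sketched as follows: serialize the whole event into one string and hand it to the response in a single write. The `SseEvent` and `ChunkSink` types and both function names are illustrative assumptions; the actual `createSseResponse.send` signature is not shown in the log.

```typescript
// Assumed event shape for this sketch; the field layout follows the SSE wire format.
interface SseEvent {
  id: number;
  event: string;
  data: unknown;
}

// Minimal writable shape so the sketch does not depend on a specific HTTP framework.
interface ChunkSink {
  write(chunk: string): unknown;
}

// Serialize one event, including the blank line that terminates it.
function formatSseEvent(evt: SseEvent): string {
  return `id: ${evt.id}\nevent: ${evt.event}\ndata: ${JSON.stringify(evt.data)}\n\n`;
}

// One write per event instead of three, so a chunk-by-chunk reader never
// sees the `event:` line without its matching `data:` line.
function sendSseEvent(sink: ChunkSink, evt: SseEvent): void {
  sink.write(formatSseEvent(evt));
}
```

A single write does not guarantee a single TCP segment for large payloads, which matches the commit's own caveat that the guarantee holds for sub-MTU events.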
Palash S Patel	92d558d570	fix: reset inactivity watchdog on raw stdout bytes, not just parsed events (#976) Structured stream handlers (Codex item.completed, pi-rpc, ACP, and json-event-stream) only call noteAgentActivity() on parsed JSON-line events. During long model reasoning or buffered artifact generation (e.g. generating the full HTML of a landing page), the stdout pipe can stay silent for >120s even though the child process is working. The result is a false 'Agent stalled without emitting any new output' error. Add a universal child.stdout.on('data', noteAgentActivity) listener before any format-specific stream dispatch so every raw byte resets the inactivity watchdog. Covers all seven stream formats: plain, claude-stream-json, qoder-stream-json, copilot-stream-json, json-event-stream, pi-rpc, and acp-json-rpc.	2026-05-09 09:11:52 +08:00
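A minimal sketch of the watchdog pattern this commit describes, assuming a hypothetical `InactivityWatchdog` class with an injectable clock and a `DataSource` interface standing in for `child.stdout`. Only `noteAgentActivity` and the `'data'` listener come from the commit message; everything else is illustrative.

```typescript
// Tracks the time of the last observed activity; the clock is injectable for tests.
class InactivityWatchdog {
  private readonly timeoutMs: number;
  private readonly now: () => number;
  private lastActivity: number;

  constructor(timeoutMs: number, now: () => number = () => Date.now()) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastActivity = now();
  }

  // Equivalent in spirit to noteAgentActivity(): any call resets the timer.
  noteActivity(): void {
    this.lastActivity = this.now();
  }

  isStalled(): boolean {
    return this.now() - this.lastActivity > this.timeoutMs;
  }
}

// Stand-in for a readable stream such as child.stdout.
interface DataSource {
  on(event: "data", listener: (chunk: Uint8Array) => void): unknown;
}

// Attach before any format-specific parsing so every raw chunk counts as activity,
// whether or not it later parses into a complete structured event.
function watchRawOutput(stdout: DataSource, watchdog: InactivityWatchdog): void {
  stdout.on("data", () => watchdog.noteActivity());
}
```

Registering the listener on the raw stream decouples liveness detection from the parser, so a new stream format cannot forget to report activity.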
pmasadali20776-ui	b578b93f3f	Bug fix: Media generation task state is volatile and lost on daemon restart #648 (#884) * feat: implement media tasks persistence * fix(daemon): satisfy exactOptionalPropertyTypes in media-tasks-routes test `process.env.OD_DATA_DIR` is `string | undefined`, but `openDatabase`'s options accept `{ dataDir?: string }`. Under the daemon tsconfig's exactOptionalPropertyTypes the two are not assignable. Spread the key in only when defined. * fix(daemon): restore mcp-config / mcp-oauth / mcp-tokens imports in server.ts The earlier 'Merge branch main into main' resolved the import-block conflict by keeping only the new media-tasks import and dropping the three pre-existing import blocks. server.ts still uses 13+ symbols from those modules (PendingAuthCache, MCP_TEMPLATES, beginAuth, readMcpConfig, getToken, etc.), so the daemon crashed at startup with 'ReferenceError: PendingAuthCache is not defined' the moment Playwright booted it. Restore the three import blocks verbatim from main. --------- Co-authored-by: lefarcen <935902669@qq.com>	2026-05-09 00:00:18 +08:00
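The exactOptionalPropertyTypes fix in this commit ("spread the key in only when defined") can be sketched as below. The `openDatabase` here is a stand-in with the same options shape, and `optionsFromEnv` is a hypothetical helper that takes the value as a parameter rather than reading `process.env` directly.

```typescript
interface OpenDatabaseOptions {
  dataDir?: string;
}

// Stand-in for the real openDatabase; only the options shape matters here.
function openDatabase(options: OpenDatabaseOptions): string {
  return options.dataDir ?? "<default data dir>";
}

function optionsFromEnv(dataDir: string | undefined): OpenDatabaseOptions {
  // Writing `{ dataDir }` directly is rejected under exactOptionalPropertyTypes
  // because the value may be undefined, so the key is spread in only when set.
  return { ...(dataDir !== undefined ? { dataDir } : {}) };
}
```

Calling `openDatabase(optionsFromEnv(undefined))` produces an options object with no `dataDir` key at all, which is the distinction the compiler flag enforces.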
tenderpooh	109722de3a	feat(desktop): export artifacts directly to PDF (#532) * feat(desktop): export artifacts directly to PDF * fix(desktop): remove default margins from PDF export	2026-05-08 23:42:12 +08:00
Siri-Ray	208f09c60e	fix: settle completed runs and clean up shutdown children (#924) * fix: clean up completed and shutting down runs * fix: bound daemon CLI shutdown Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix: harden daemon shutdown cleanup Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix: harden daemon shutdown cleanup Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * test: align acp abort fake with typed child	2026-05-08 21:05:22 +08:00
nettee	ef9ca7baff	fix(daemon): typecheck core server paths (#952)	2026-05-08 20:43:51 +08:00
nettee	32d820e4ee	fix(daemon): typecheck leaf modules (#943) * update drift * fix(daemon): typecheck leaf modules * fix(daemon): decode Qoder stdout buffers Generated-By: looper 0.5.6 (runner=fixer, agent=opencode)	2026-05-08 20:01:25 +08:00
Bryan A	e13adf2e63	feat(daemon): finalize design package endpoint (closes #450 ) (#832 ) * feat(daemon): scaffold /api/projects/:id/finalize/anthropic (refs #450) Phase C of the PR 2 plan for issue #450: scaffold the route + module shape so subsequent phases (D-I) land function bodies and tests against a stable surface that already passes typecheck. What lands here: - apps/daemon/src/finalize-design.ts: module-level constants (DEFAULT_BASE_URL, DEFAULT_MAX_TOKENS=16000, INPUT_BODY_CAP_BYTES=384KiB, LOCK_FILENAME=.finalize.lock, OUTPUT_FILENAME=DESIGN.md, DEFAULT_TIMEOUT_MS=120s); inline interfaces for the request/response shape (kept out of packages/contracts per scope rules); two error classes - FinalizePackageLockedError (mirrors PR #493's TranscriptExportLockedError) and FinalizeUpstreamError (carries upstream HTTP status for the route's error mapping); function stub that throws "not yet implemented". - apps/daemon/tests/finalize-design.test.ts: vitest harness with describe.skip placeholder so the file imports cleanly. Real cases land in phases D-I. Default-import of node:fs (per memory: vi.spyOn cannot redefine on the frozen ESM Module Namespace; CJS exports object is mutable). - apps/daemon/src/server.ts: route handler at POST /api/projects/:id/finalize/anthropic, slotted next to the existing :id/deploy* family. Validates apiKey/model non-empty, optional baseUrl via the existing validateExternalApiBaseUrl closure (forbidden -> 403, invalid -> 400), optional maxTokens positive number; calls getProject (404 on miss); calls finalizeDesignPackage (which throws, caught and mapped to 500 for now); maps known error classes (FinalizePackageLockedError -> 409, FinalizeUpstreamError -> 502) pre-emptively. Path shape rationale (Bryan-confirmed): project-scoped path matches every sibling /api/projects/:id/* route in server.ts (deploy, deployments, deploy/preflight); provider-namespaced segment leaves a clean expansion line for /api/projects/:id/finalize/openai etc. 
as follow-ups. Field-name rationale: apiKey, baseUrl, model, maxTokens match ProxyStreamRequest verbatim (packages/contracts/src/api/proxy.ts:8-19) so a future caller can reuse the same body shape. baseUrl is optional here (intentional divergence from the proxy at server.ts which requires it) so standard Anthropic users do not need to set it; Bedrock / self-hosted-proxy users still can. Verification: pnpm --filter @open-design/daemon typecheck exits 0; finalize-design.test.ts loads cleanly with 1 skipped placeholder; no other tests touched. Refs nexu-io/open-design#450 (PR 2 scaffold; pipeline body in subsequent commits) * feat(daemon): transcript truncation helper for /finalize prompt Phase D of the PR 2 plan for issue #450: lands the helper that bounds the transcript section of the synthesis prompt. Why this exists: real-world signal at authoring time was a local project transcript already at 3.95 MB. Anthropic's claude-opus-4-7 context cap is roughly 200K tokens (~700 KB at typical density). Inserting an unbounded transcript would 4xx upstream on the first real call. This helper keeps the on-disk .transcript.jsonl lossless (PR #493's contract) while making the prompt-inclusion bounded. Strategy: - Cap output at INPUT_BODY_CAP_BYTES (384 KiB) so the prompt has room for the system prompt + design system body + current artifact + room for the synthesis output. - Always preserve the header line - it carries projectId, schemaVersion, conversation/message counts, attachment counts; synthesis quality depends on knowing the original sizes. - Split equal byte budgets between head and tail so both project genesis and most-recent intent survive. Two thinking segments separated only by mid-session truncation lose the same kind of boundary that PR #493 preserves between thinking blocks - that's accepted; smarter semantic chunking is a follow-up. 
- Insert a single `{"kind":"truncated","reason":"size","omittedBytes":N}` sentinel JSON line between the head and tail so a synthesis consumer can detect the gap. omittedBytes is the difference between the original UTF-8 byte length and the output's UTF-8 byte length. - If the head + tail budgets together cover the whole body (e.g. all message lines are tiny), no marker is emitted - the output is the input verbatim. Tests: - "returns the input verbatim when the JSONL fits under the 384 KiB cap" pins that small transcripts pass through unchanged with no marker. - "head+tail truncates with a single marker line when the JSONL exceeds the 384 KiB cap" pins that output is bounded, header survives, exactly one marker emitted with non-zero omittedBytes, both ends of the body preserved, and at least one middle message omitted. Suite delta: +2 tests in finalize-design.test.ts. Refs nexu-io/open-design#450 * fix(daemon): resolve noUncheckedIndexedAccess in truncateTranscriptForPrompt D1 (0eaa123) shipped with `body[headIndex]` and `body[i]` typed as `string \| undefined` under TypeScript's `noUncheckedIndexedAccess` strict mode. Local typecheck would have caught it but the prior verification piped through `tail` which masked the non-zero exit code of `tsc`. Coalesce each access via `?? ''` (the array is from `String.split('\n')` so undefined elements are not actually reachable; the coalesce is a type-narrowing convenience, not a behavior change). Verification: `pnpm --filter @open-design/daemon typecheck` exits 0; `pnpm --filter @open-design/daemon test finalize-design` shows 2/2 + 1 skipped, identical to the pre-fix run. Refs nexu-io/open-design#450 * feat(daemon): current-artifact resolver for /finalize Phase E of the PR 2 plan for issue #450: resolves which artifact (if any) accompanies the transcript + design system in the synthesis prompt. Priority order (Bryan-locked in plan §6): 1. 
The file referenced by tabs.is_active = 1 IF an <name>.artifact.json sidecar exists on disk. Sidecar presence is the discriminator: an inferred manifest from `inferLegacyManifest` (e.g. for a bare .html with no sidecar) does NOT count, and an active tab pointing at a non-artifact file (.md, .txt) falls through. 2. Newest project file with a real .artifact.json sidecar, sorted by manifest.updatedAt descending. Files without an updatedAt sort last so legacy pre-streaming manifests do not get accidentally promoted. 3. Returns null - "no artifact in scope". The Phase H caller will emit `artifact: null` in the response and the prompt's "Current artifact" section will read "none". Sidecar presence is checked via `existsSync` on the on-disk path, NOT via the `artifactManifest` field returned by readProjectFile/listFiles (those run inferLegacyManifest as a fallback for known kinds, which would otherwise cause a bare .html with no sidecar to look like an artifact). Tests: - "returns the active-tab artifact when its sidecar is present, even if a newer artifact exists elsewhere": pinned.html (older updatedAt) is in the active tab; newer.html (newer updatedAt) is not. Resolver returns pinned.html - intent (active tab) beats recency. - "falls through to newest .artifact.json when active tab points at a non-artifact file": README.md is the active tab (no sidecar); design.html has a real sidecar. Resolver falls through and returns design.html. - "returns null when no active tab and no .artifact.json sidecars exist": only a README.md is in the project; no tabs row. Resolver returns null. Suite delta: +3 tests in finalize-design.test.ts (5 active total). Refs nexu-io/open-design#450 * feat(daemon): synthesis prompt construction for /finalize Phase F of the PR 2 plan for issue #450: builds the system + user prompts that get sent to Anthropic's Messages API in the synthesis call. Pure function; no IO, no side effects. 
System prompt (literal, stored as a module-level constant): instructs Claude to emit a DESIGN.md document with a fixed 7-heading structure (# DESIGN.md / ## Summary / ## Brand & Voice / ## Information Architecture / ## Components & Patterns / ## Visual System / ## Open Questions / ## Provenance). The Provenance section is required to list project ID, design system, current artifact, transcript message count, and the UTC generation timestamp. User prompt (built at runtime): structured payload with the truncated transcript JSONL, the design system body, and the current artifact body, each under a ## heading. Missing inputs (no design system selected, no artifact in scope) produce explicit "none" headings + parenthetical placeholder body so Claude does not hallucinate content for absent sections. Truncation is the caller's concern - this function does not re-truncate. The caller (Phase H pipeline) feeds in a JSONL that has already been bounded by truncateTranscriptForPrompt. Tests: - "includes the transcript JSONL verbatim and the generation context": pins all section headings, the transcript body verbatim, the design system body verbatim, the artifact body verbatim, and every generation-context line. - "falls back to \"none\" + parenthetical when no design system is selected": designSystemId=null and designSystemBody=null -> heading reads "## Active design system: none" with the parenthetical body. - "falls back to \"none\" + parenthetical when no artifact is in scope": artifact=null -> heading reads "## Current artifact: none" with the parenthetical body. Suite delta: +3 tests in finalize-design.test.ts (8 active total). Refs nexu-io/open-design#450 * feat(daemon): Anthropic call + retry strategy for /finalize Phase G of the PR 2 plan for issue #450: lands the upstream Claude Messages API call with a single transient-error retry, plus the response extractor that turns Anthropic's content array into the DESIGN.md body. 
What lands here: - appendVersionedApiPath: inlined from the connectionTest helper at apps/daemon/src/connectionTest.ts:188-195 (it is not exported there). Appends /v1/messages when the base URL has no /vN segment, otherwise appends /messages directly. Same semantics; ~5 lines. - callAnthropicWithRetry: POSTs to <base>/v1/messages with the canonical Anthropic headers (content-type, x-api-key, anthropic-version: 2023-06-01) and body shape ({ model, max_tokens, system, messages, stream:false }). One retry on transient (HTTP 429 or 5xx); on terminal failure throws FinalizeUpstreamError carrying the upstream HTTP status and raw body text. The route handler in Phase I maps status to AUTH_FAILED / RATE_LIMITED / UPSTREAM_FAILED and runs the body through redactSecrets before exposing it as `details`. - extractDesignMd: concatenates content[].text for every block where type === 'text', preserving order. Throws FinalizeUpstreamError(502) on three malformed-response shapes: non-object payload, missing content array, zero text blocks. The route handler maps the throw to 502 UPSTREAM_FAILED so synthesis cannot land a half-empty DESIGN.md on disk. - Test-only `_sleepMs` injection on the call params so the retry-delay sleep is instant under vitest. Default sleep uses setTimeout. Retry posture (1 retry on transient) is opinionated; the maintainer's "standard exponential backoff" answer was directional and a single retry matches the existing daemon's posture (transcript export and connectionTest do zero retries) while staying inside the daemon's blocking-fast posture for /finalize. Tests: - callAnthropicWithRetry: throws on 401 with no retry; retries once on 429 and resolves on second 200; throws after both 5xx attempts; propagates AbortError when signal is pre-aborted. - extractDesignMd: concatenates ordered text blocks; throws on missing content array; throws on content with zero text blocks. 
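The `extractDesignMd` behavior described above can be sketched roughly as follows. This is a minimal illustration, not the daemon's code: the `FinalizeUpstreamError` constructor shape `(status, body, message)` is inferred from the commit text, and the error messages are made up for the example.

```typescript
// Hypothetical stand-in for the daemon's error class; the argument order
// (status, body, message) is an assumption inferred from the commit text.
class FinalizeUpstreamError extends Error {
  status: number;
  body: string;
  constructor(status: number, body: string, message: string) {
    super(message);
    this.name = "FinalizeUpstreamError";
    this.status = status;
    this.body = body;
  }
}

// Concatenate content[].text for every block with type === "text", in order.
// Throws a 502-tagged error for the three malformed shapes named above.
function extractDesignMd(payload: unknown): string {
  if (typeof payload !== "object" || payload === null) {
    throw new FinalizeUpstreamError(502, "", "upstream payload is not an object");
  }
  const content = (payload as { content?: unknown }).content;
  if (!Array.isArray(content)) {
    throw new FinalizeUpstreamError(502, "", "upstream payload has no content array");
  }
  const texts: string[] = [];
  for (const block of content) {
    if (typeof block !== "object" || block === null) continue;
    const candidate = block as { type?: unknown; text?: unknown };
    if (candidate.type === "text" && typeof candidate.text === "string") {
      texts.push(candidate.text);
    }
  }
  if (texts.length === 0) {
    throw new FinalizeUpstreamError(502, "", "upstream payload has zero text blocks");
  }
  return texts.join("");
}
```

Non-text blocks are skipped rather than rejected, so a response that mixes block types still yields a complete document as long as at least one text block is present.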
A spurious typecheck error from `exactOptionalPropertyTypes` (signal typed as AbortSignal \| undefined where RequestInit expects AbortSignal \| null) was resolved by conditionally spreading signal into the RequestInit literal. Suite delta: +7 tests in finalize-design.test.ts (15 active total). Refs nexu-io/open-design#450 * feat(daemon): wire /finalize pipeline end-to-end Phase H of the PR 2 plan for issue #450: stitches together every phase D-G primitive into the full finalizeDesignPackage pipeline that the route handler in Phase I will expose over HTTP. Pipeline (in execution order, all inside a try/finally that always releases the lockfile): 1. getProject(db, projectId): defensive 404 (the route validates first; this throw catches direct CLI/script callers). 2. mkdirSync(<projectDir>, { recursive: true }): some projects have DB rows but no on-disk dir yet (PR #493's same fix). 3. fs.openSync(.finalize.lock, 'wx'): EEXIST -> FinalizePackageLockedError (mirror PR #493's TranscriptExportLockedError). 4. exportProjectTranscript(db, projectsRoot, projectId, { now }): produces .transcript.jsonl on disk; we read the body and run it through truncateTranscriptForPrompt to bound the prompt-inclusion size. 5. readDesignSystem(designSystemsRoot, designSystemId): returns null when the project has no design_system_id selected, when the design system directory does not exist, or when the DESIGN.md file is missing. 6. resolveCurrentArtifact(db, projectsRoot, projectId): active tab -> newest .artifact.json by manifest.updatedAt -> null. 7. buildSynthesisPrompt({...}): system + user prompt (per Phase F). 8. callAnthropicWithRetry({...}): one retry on 429/5xx; throws FinalizeUpstreamError on terminal failure. 9. extractDesignMd(payload): concatenates content[].text blocks; throws FinalizeUpstreamError(502) on malformed shape. 10. Atomic write: writeFileSync({flag:'wx'}) -> reopen for fsync -> rename. Errors unlink tmp before rethrowing. 11. 
Lock release in finally (always closeSync + unlinkSync). Bounded blocking: the function uses its own AbortController + 120s timeout when the caller does not supply a signal. Caller-supplied signal takes precedence. Type tightening: switched the local Db interface to `type Db = Database.Database` (better-sqlite3) so the function signature is compatible with `exportProjectTranscript`'s typed parameter. Source file already had a `better-sqlite3` import in claude-design-import area of the daemon, so no new dependency. Tests: - "writes DESIGN.md atomically on the happy path": end-to-end with seeded project + conversation + 2 messages + design system on disk; asserts file at exact path + body bytes match the fetch mock. - "response carries every documented field with correct types": designMdPath/bytesWritten/model/inputTokens/outputTokens/artifact/ transcriptMessageCount/designSystemId all present and typed. - "emits design system 'none' in the prompt when no design_system_id is set": fetch mock asserts on the body it receives. - "throws FinalizePackageLockedError when .finalize.lock is already held": pre-create lockfile; assert throw + DESIGN.md not written + pre-existing lock NOT unlinked (we did not own it). - "replaces an existing DESIGN.md atomically on a second finalize": inject a sentinel between two finalize calls; assert sentinel is gone after second run. - "cleans up tmp file AND lock file on every error path": mock fs.writeFileSync to throw on the tmp path; assert no DESIGN.md.tmp.* remain, no DESIGN.md, no .finalize.lock. - "uses the default https://api.anthropic.com baseUrl when baseUrl is omitted": fetch URL begins with the default; baseUrl=undefined path. vi.restoreAllMocks() now runs in afterEach so the writeFileSync spy from the cleanup test does not leak into subsequent tests. Suite delta: +7 tests in finalize-design.test.ts (22 active total). 
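The lockfile and atomic-write steps of the pipeline above might look roughly like this. It is a sketch under stated assumptions: the helper names `acquireFinalizeLock`, `withFinalizeLock`, and `writeFileAtomic` are hypothetical, the tmp-file naming is illustrative, and only the `'wx'` open, the fsync-then-rename sequence, and the release-in-finally behavior come from the commit text.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical stand-in for the daemon's lock error class.
class FinalizePackageLockedError extends Error {
  constructor(lockPath: string) {
    super(`finalize already in progress: ${lockPath}`);
    this.name = "FinalizePackageLockedError";
  }
}

// 'wx' fails with EEXIST when the lockfile already exists.
function acquireFinalizeLock(lockPath: string): number {
  try {
    return fs.openSync(lockPath, "wx");
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") {
      throw new FinalizePackageLockedError(lockPath);
    }
    throw err;
  }
}

// Run fn while holding the lock; the lock is released only if we acquired it.
function withFinalizeLock<T>(projectDir: string, fn: () => T): T {
  fs.mkdirSync(projectDir, { recursive: true });
  const lockPath = path.join(projectDir, ".finalize.lock");
  const fd = acquireFinalizeLock(lockPath);
  try {
    return fn();
  } finally {
    fs.closeSync(fd);
    fs.unlinkSync(lockPath);
  }
}

// Write to a tmp file, fsync it, then rename over the final path.
function writeFileAtomic(finalPath: string, body: string): void {
  const tmpPath = `${finalPath}.tmp.${process.pid}.${Date.now()}`;
  try {
    fs.writeFileSync(tmpPath, body, { flag: "wx" });
    const fd = fs.openSync(tmpPath, "r+");
    try {
      fs.fsyncSync(fd);
    } finally {
      fs.closeSync(fd);
    }
    fs.renameSync(tmpPath, finalPath);
  } catch (err) {
    fs.rmSync(tmpPath, { force: true }); // unlink tmp before rethrowing
    throw err;
  }
}
```

Because the lock is acquired outside the `try`/`finally`, a caller that hits `FinalizePackageLockedError` never unlinks a lockfile it did not create, which matches the "pre-existing lock NOT unlinked" assertion in the tests above.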
Refs nexu-io/open-design#450 * feat(daemon): /finalize HTTP route handler + error mapping Phase I of the PR 2 plan for issue #450: replaces the Phase C stub's catch-all 500 with status-aware error mapping that surfaces the right HTTP status + error code for each documented failure mode, and adds HTTP-layer tests that boot startServer to exercise the route's validation branches. Route handler changes: - :id format guard: an inline regex matching isSafeId at apps/daemon/src/projects.ts:556-558 rejects unsafe ids with 400 BAD_REQUEST before any DB or filesystem work. Without this, an id like 'bad!id' would either fail getProject as 404 (wrong code) or reach the function and throw 'invalid project id' (mapped to 500). - FinalizeUpstreamError mapping is now status-aware: - upstream 401 -> 401 AUTH_FAILED - upstream 429 -> 429 RATE_LIMITED - upstream 5xx (or our own 502 sentinel for malformed responses) -> 502 UPSTREAM_FAILED In all cases the upstream raw text is run through redactSecrets so the apiKey cannot leak through `details` even if the upstream echoes the inbound headers. - AbortError mapping: when the 120s AbortController fires (or the caller pre-aborted the signal), surface as 503 TIMEOUT. - Default case: console.error the error per daemon convention; client sees 500 INTERNAL with the message routed through redactSecrets. - Imported redactSecrets alongside the existing connectionTest imports (apps/daemon/src/server.ts:51). HTTP-layer tests (boot startServer({port:0,returnServer:true}) once in beforeAll, mirror the proxy-routes.test.ts pattern): - "400 BAD_REQUEST when baseUrl is not a valid URL (test #13)": baseUrl='not-a-url'. - "403 FORBIDDEN when baseUrl points at a private internal IP (test #14)": baseUrl='http://10.0.0.1'. Note: validateBaseUrl explicitly allows loopback (for local OpenAI-compatible servers) and only blocks non-loopback private IPs (10/8, 172.16/12, 192.168/16, fc00::/7, fe80::/10). 
- "400 BAD_REQUEST when apiKey is missing (test #15)": apiKey omitted. - "400 BAD_REQUEST when :id contains characters outside the safe-id regex (test #16)": id='bad!id' contains '!' which is not in [A-Za-z0-9._-]. Suite delta: +4 tests (26 active in finalize-design.test.ts). Full daemon suite: 1078/1078 pass; baseline+26 (the +5 above plan target reflects retry+extract split into more granular unit tests than originally enumerated; all real, none skipped). Refs nexu-io/open-design#450 * fix(daemon): tighten isSafeId to reject pure-dot project ids Addresses the P1 path-traversal finding from @lefarcen on PR #832 (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512644). The pre-fix `isSafeId` at apps/daemon/src/projects.ts:556-558 used regex `/^[A-Za-z0-9._-]{1,128}$/` which permitted pure-dot ids (`.`, `..`, `...`) because `.` is in the character class. `projectDir` and `resolveProjectDir` both delegated to `isSafeId`, so an id of `..` would resolve to the PARENT of `.od/projects/` via `path.join`. Threat model (per @lefarcen): - An attacker creates a project row whose stored id is `..` (or another pure-dot variant) — for instance via a workflow that writes the row directly without going through the API. Subsequent finalize/write ops keyed by that id then escape the project tree. - A direct CLI / scripted caller passing `..` as the project id reaches the function without HTTP normalization saving us. (Express normalizes %2e%2e to .. and collapses path segments, which yields 404 for the URL `/api/projects/%2e%2e/...` in practice — but that's Express's protection, not ours.) Fix: - isSafeId now explicitly rejects pure-dot ids (`/^\.+$/.test(id)`) before the char-class regex check. Empty string and inputs longer than 128 chars are also rejected explicitly so the function fails closed on edge cases. 
- isSafeId is now exported from apps/daemon/src/projects.ts so the /finalize route handler in apps/daemon/src/server.ts can use the same validator instead of re-implementing the regex inline. This prevents drift between the route guard and the projectDir guard, which was how this hole originally appeared. Tests (in finalize-design.test.ts because that's where the threat was flagged; isSafeId is daemon-wide so a dedicated test file would also work): - isSafeId rejects `.`, `..`, `...`, `....` - isSafeId rejects ids with `/`, `\`, `!`, leading whitespace - isSafeId rejects empty string and >128 chars - isSafeId rejects non-string inputs (null/undefined/number) - isSafeId accepts plain ids, ids with mid-string dots, UUIDs, single chars Suite delta: +7 tests (33 active in finalize-design.test.ts). Full daemon suite: 1085/1085. Refs nexu-io/open-design#832 * fix(daemon): address PR #832 P1 findings — imported folders + network 502 Addresses two of the three P1 findings from @lefarcen on PR #832: 1. Imported-folder projects route DESIGN.md to metadata.baseDir (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512656, also flagged independently by @chatgpt-codex-connector at #discussion_r3202430470) The pipeline previously called `projectDir(projectsRoot, projectId)` unconditionally, which resolves to `.od/projects/<id>`. For projects created via /api/import/folder the project row's `metadata.baseDir` carries the user's actual folder; without threading metadata through, finalize would silently land DESIGN.md in the hidden daemon data dir and the current-artifact resolver would miss the user's real files. Fix: switch from `projectDir` to `resolveProjectDir(projectsRoot, projectId, metadata)` in both `finalizeDesignPackage` and `resolveCurrentArtifact`. Thread `project.metadata` (from `getProject`'s normalized row) through both call paths. The resolver gets a new optional `metadata` parameter; native projects pass null and get identical behavior. 2. 
Network failures and JSON parse errors now map to 502 UPSTREAM_FAILED (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512661) Pre-fix, only HTTP-non-OK responses were wrapped as FinalizeUpstreamError. DNS failures (ECONNREFUSED, ENOTFOUND), fetch TypeErrors, and `response.json()` SyntaxErrors fell through to the route's catch-all and surfaced as 500 INTERNAL — incorrect: those are upstream-level failures, not daemon bugs. Fix: - Wrap callAnthropicWithRetry in a try/catch that passes FinalizeUpstreamError and AbortError through verbatim, but rewraps any other thrown error as FinalizeUpstreamError(502, '', message). - Wrap response.json() in a try/catch that rewraps SyntaxError as FinalizeUpstreamError(502, '', "upstream Anthropic returned non-JSON body: ..."). - The route handler's existing FinalizeUpstreamError mapping then correctly maps these to 502 with the message in `details` (run through redactSecrets first). Tests: - "writes DESIGN.md under metadata.baseDir for imported-folder projects": inserts a project row with metadata.baseDir pointing at a user-folder temp dir; asserts result.designMdPath lands there AND the hidden .od/projects/<id> dir does NOT contain a DESIGN.md. - "rewraps fetch network rejection as FinalizeUpstreamError(502)": fetchImpl throws TypeError with cause.code='ENOTFOUND'; assert thrown error has name=FinalizeUpstreamError and status=502. - "rewraps 200 with non-JSON body as FinalizeUpstreamError(502)": fetchImpl returns 200 with text/html body; response.json() throws SyntaxError internally; assert FinalizeUpstreamError(502). Suite delta: +3 tests (36 active in finalize-design.test.ts). Full daemon suite: green at last check; will re-verify before push. 
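The rewrapping rule described in this fix can be illustrated with a small synchronous sketch. The helper names `toUpstreamError` and `parseUpstreamJson` are hypothetical, the error class is a stand-in with an assumed `(status, body, message)` constructor, and `JSON.parse` on a string is used here in place of `response.json()` so the example needs no network.

```typescript
// Hypothetical stand-in for the daemon's error class (assumed constructor shape).
class FinalizeUpstreamError extends Error {
  status: number;
  body: string;
  constructor(status: number, body: string, message: string) {
    super(message);
    this.name = "FinalizeUpstreamError";
    this.status = status;
    this.body = body;
  }
}

function isAbortError(err: unknown): boolean {
  return err instanceof Error && err.name === "AbortError";
}

// Pass FinalizeUpstreamError and AbortError through untouched;
// rewrap anything else as a 502 upstream failure.
function toUpstreamError(err: unknown): unknown {
  if (err instanceof FinalizeUpstreamError || isAbortError(err)) return err;
  const message = err instanceof Error ? err.message : String(err);
  return new FinalizeUpstreamError(502, "", message);
}

// Parse a response body, turning a SyntaxError into a 502 upstream failure.
function parseUpstreamJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch (err) {
    if (err instanceof SyntaxError) {
      throw new FinalizeUpstreamError(
        502,
        "",
        `upstream Anthropic returned non-JSON body: ${err.message}`,
      );
    }
    throw err;
  }
}
```

Keeping `AbortError` out of the rewrap is what lets the route handler continue to distinguish a timeout from an upstream failure.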
Refs nexu-io/open-design#832 * refactor(daemon): move /finalize DTOs to contracts + map error codes + validate active-tab Addresses the P2 and P3 findings from @lefarcen on PR #832: P2 — Error codes + DTOs not in packages/contracts https://github.com/nexu-io/open-design/pull/832#discussion_r3202512673 Reverses my plan's locked decision #10 ("no contracts changes in this PR; inline the request/response types"). That rule came from the predecessor PROMPT brief's anti-pattern table; @lefarcen's review is fresher signal and supersedes it. Drift risk between the daemon's inline types and any future PR 3 web client is real. - New contracts module: packages/contracts/src/api/finalize.ts with FinalizeAnthropicRequest / FinalizeArtifactRef / FinalizeAnthropicResponse. Re-exported from the package root and made addressable via `@open-design/contracts/api/finalize` subpath. - Daemon source imports the canonical types from contracts and re-exports the public type names so internal references keep working without touching every call site. - Daemon-local error codes remapped to existing ApiErrorCode union members (apps/daemon/src/server.ts), per @lefarcen's suggested mapping: FINALIZE_IN_PROGRESS -> CONFLICT AUTH_FAILED -> UNAUTHORIZED UPSTREAM_FAILED -> UPSTREAM_UNAVAILABLE TIMEOUT -> UPSTREAM_UNAVAILABLE (status 503) INTERNAL -> INTERNAL_ERROR HTTP status codes are unchanged; only the `code` field in the error JSON body changed. P3 — Active-tab name not validated before sidecar probe https://github.com/nexu-io/open-design/pull/832#discussion_r3202512684 resolveCurrentArtifact now runs the active tab's name through validateProjectPath BEFORE composing it into a path.join expression. An invalid tab (traversal segments, absolute path, null byte, reserved segment) causes resolveCurrentArtifact to fall through to the newest-artifact branch rather than abort or probe outside the project directory. 
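As a rough illustration of the remapped error codes listed above, the status-aware mapping could be expressed as a pair of small pure functions. The names `mapUpstreamStatus` and `mapTimeout` are hypothetical; `RATE_LIMITED` is assumed to be unchanged because the remap list does not mention it, and treating every other upstream status as 502 is a simplification of this sketch rather than something the commit states.

```typescript
// Shape of the status + code pair placed in the error response (illustrative).
type MappedError = { status: number; code: string };

// Upstream HTTP status -> daemon HTTP status + ApiErrorCode-style code.
function mapUpstreamStatus(upstreamStatus: number): MappedError {
  if (upstreamStatus === 401) return { status: 401, code: "UNAUTHORIZED" };
  if (upstreamStatus === 429) return { status: 429, code: "RATE_LIMITED" };
  // 5xx and the 502 sentinel for malformed responses; other statuses are
  // folded in here as an assumption of this sketch.
  return { status: 502, code: "UPSTREAM_UNAVAILABLE" };
}

// Timeouts keep HTTP 503 but share the UPSTREAM_UNAVAILABLE code.
function mapTimeout(): MappedError {
  return { status: 503, code: "UPSTREAM_UNAVAILABLE" };
}
```

Only the `code` strings differ from the earlier Phase I mapping; the HTTP statuses are the same as before.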
Tests: - "falls through (does not throw) when active tab name contains traversal segments": injects a malformed `tabs.name = '../../../etc/passwd'` row directly via SQL (bypassing production tab-creation validation), seeds a real artifact, asserts the resolver returns the real artifact rather than the malformed name. Suite delta: +1 test (37 active in finalize-design.test.ts). Full daemon suite: 1089/1089 green. Refs nexu-io/open-design#832 * fix(contracts): publish /api/finalize as standalone runtime entrypoint Addresses @mrcfps's CI-red review on PR #832 (https://github.com/nexu-io/open-design/pull/832, inline comment on packages/contracts/package.json). The previous J3 commit added `./api/finalize` as a type-only subpath: the entry had only a `types` field, no `default`. That broke the contracts package-runtime gate (packages/contracts/tests/package- runtime.test.ts:38-47) which asserts every exports entry exposes both a `.mjs` runtime and a `.d.ts` types target. mrcfps proposed two fixes; this commit takes path B — make finalize a first-class published module rather than a type-only re-export from the package root. Path B vs path A (a peer-AI second opinion via /collaborate confirmed): under NodeNext + ESM with exports-map semantics, TypeScript validates re-exported symbols against the published module-identity surface. Because the previous J3 had `./api/finalize` neither declared as an exports-map entry nor materialized as a standalone .mjs, TS omitted the re-exported names during package boundary analysis. Even at runtime `import('@open-design/contracts').FINALIZE_SCHEMA_VERSION` worked from the bundled index.mjs but the type-checker rejected it. Path B aligns the runtime and declaration surfaces. Changes: - packages/contracts/esbuild.config.mjs: add `./src/api/finalize.ts` to entryPoints so dist/api/finalize.mjs is generated as a standalone module rather than only inlined into the bundled root. 
- packages/contracts/package.json: re-add `./api/finalize` to the exports map with both `default: ./dist/api/finalize.mjs` AND `types: ./dist/api/finalize.d.ts`. Mirrors `./api/connectionTest`'s shape (the canonical pattern for first-class submodule entries). - packages/contracts/src/api/finalize.ts: keep the runtime export `FINALIZE_SCHEMA_VERSION = 1` (giving the standalone module a real value to emit beyond the type-only interfaces) and update the doc-comment now that the standalone .mjs is wired. - apps/daemon/src/finalize-design.ts: switch the type import from the inline declarations introduced in the prior J3 fallback to `import type { ... } from '@open-design/contracts/api/finalize'`. Re-export the names so internal references inside finalize-design.ts keep working without touching every call site. Verified: - node --input-type=module -e "import('@open-design/contracts/api/finalize').then(m=>console.log(JSON.stringify(Object.keys(m))))" prints ["FINALIZE_SCHEMA_VERSION"] — runtime resolution clean. - pnpm --filter @open-design/contracts test: 6/6 (including both package-runtime.test.ts cases on the rebuilt exports map). - pnpm --filter @open-design/daemon typecheck: exits 0. - pnpm --filter @open-design/daemon test: 1089/1089 (no regression vs the prior J3 number). Refs nexu-io/open-design#832 --------- Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>	2026-05-08 19:52:11 +08:00
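The `isSafeId` tightening described in this commit can be sketched as below. The two regular expressions are the ones quoted in the commit message; the surrounding function body and the order of the checks are an assumption, not a copy of apps/daemon/src/projects.ts.

```typescript
// Character-class check quoted in the commit message.
const SAFE_ID = /^[A-Za-z0-9._-]{1,128}$/;

// Fail closed: non-strings, empty strings, over-long ids, and pure-dot ids
// are rejected before the character-class regex is consulted.
function isSafeId(id: unknown): id is string {
  if (typeof id !== "string") return false;
  if (id.length === 0 || id.length > 128) return false;
  if (/^\.+$/.test(id)) return false; // ".", "..", "...", and so on
  return SAFE_ID.test(id);
}
```

The pure-dot check matters because `.` is inside the character class, so the original regex alone accepts `..`, which `path.join` then resolves to the parent directory.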
ferasbusiness666	1e8926271b	Harden security scan findings and upgrade dependencies (#806) * feat: add accent color control and launcher for Open Design * fix: remove launcher binary from PR * test: cover accent appearance edge cases * Harden security scan findings and upgrade deps * Address proxy security review * Pin jsdom for web test stability --------- Co-authored-by: ferasbusiness666 <ferasbusiness666@users.noreply.github.com> Co-authored-by: lefarcen <935902669@qq.com>	2026-05-08 19:46:34 +08:00
Tom Huang	d592f6087f	feat(mcp): external MCP client with daemon-managed OAuth and 39 design-focused templates (#898 ) * feat(mcp): add external MCP client with daemon-managed OAuth and 17 design-focused templates Open Design now acts as an MCP CLIENT and surfaces tools from third-party MCP servers to the underlying agent (Claude Code, Hermes, Kimi). Daemon - New mcp-config / mcp-oauth / mcp-tokens modules: persist server entries to .od/mcp-config.json, run the OAuth dance for HTTP/SSE servers end-to-end on the daemon (so cloud deployments work and tokens survive across turns), and inject Authorization: Bearer headers into the per-spawn .mcp.json the daemon writes for Claude Code (or the ACP mcpServers map for Hermes/Kimi). - /api/mcp/servers and /api/mcp/oauth/{start,status,disconnect} endpoints, plus spawn-time wiring in agents that hands the configured servers to the active agent CLI. - System-prompt directive for connected external MCPs so the model does not chase Claude Code's synthetic _authenticate / _complete_authentication tools when the Bearer is already pinned. Web - Settings -> External MCP servers panel with per-row OAuth Connect / Disconnect / Refresh affordances and per-row template hints. - New "Add server" picker categorized into 7 groups (image-generation, image-editing, web-capture, ui-components, data-viz, publishing, utilities) with a search box, sticky close button, collapsible <details> sections (auto-expand on search), 60vh capped scroll region, and a pinned Custom-server footer. - ChatComposer /mcp slash and MCP picker button forward to the new Settings tab; AssistantMessage renders MCP tool calls inline; markdown autolinker handles bare http(s) URLs (incl. OAuth links) before italic markers so OAuth callback URLs do not get italic-fragmented mid-token. 
Contracts - packages/contracts/src/api/mcp.ts owns the wire shapes (McpServerConfig, McpTemplate with stable McpTemplateCategory enum, McpServersResponse, OAuth start/status/disconnect bodies, the postMessage payload from the OAuth callback). Templates (17 built-in) - image-generation: Higgsfield (OpenClaw, OAuth HTTP), Pollinations, Allyson (animated SVG), AWS Bedrock Image (uvx). - image-editing: Imagician, ImageSorcery. - web-capture: just-every screenshot-website-fast, ScreenshotOne. - ui-components: 21st.dev Magic, shadcn/ui, FlyonUI. - data-viz: AntV Chart, Mermaid. - publishing: EdgeOne Pages. - utilities: Filesystem, GitHub, Fetch. Tests - apps/daemon/tests/mcp-{config,oauth,tokens,spawn}.test.ts cover storage round-trip, OAuth helpers, token persistence, spawn-time wiring, every template's transport / command / args / env-field invariants, and the canonical category enum. - apps/web/tests/runtime/markdown.test.tsx covers the new autolinker ordering rules. Co-authored-by: Cursor <cursoragent@cursor.com> * feat(mcp): add 21 more design-focused templates and a `design-systems` category Expands the built-in MCP picker from 17 to 38 templates so users can compose the full Open Design craft loop (design-system intake → generate → edit → audit → publish) without leaving the Settings dialog. Every install spec is verified live against the upstream README; templates that needed Go binaries, multi-step `init` ceremonies, or massive runtime stacks (PostgreSQL + Redis + Ollama) are intentionally deferred so picking a template still resolves to a working server in one click. New `design-systems` category between `web-capture` and `ui-components` (reflects the upstream-of-components position in the workflow). Mirrored in `McpTemplateCategory` on both contracts and daemon, and `CATEGORY_ORDER` on the web side. 
New templates by category: - image-generation (+4): prompt-to-asset (icons / favicons / OG / logos with free-tier routing across Cloudflare AI / NVIDIA NIM / HF / Stable Horde), Nano Banana (hosted streamable HTTP, virtual try-on + product placement), Seedream (hosted streamable HTTP, ByteDance Seedream v3-v5 + SeedEdit), fal.ai (uvx, 600+ models incl. FLUX / Kling / Hunyuan / MusicGen). - image-editing (+3): Photopea (34 layered-editor tools — closes the PSD gap), Topaz Labs (AI upscale / denoise / sharpen), Transloadit (86+ media pipeline robots). - web-capture (+1): Pagecast (browser → demo GIF / MP4 with auto-zoom). - design-systems (+4, NEW category): Figma-Context (Framelink, designs → code), Design Token Bridge (Tailwind ⇄ CSS ⇄ Figma ⇄ M3 / SwiftUI / W3C DTCG + WCAG contrast), Design System Extractor (Storybook scrape), Aesthetics Wiki (cottagecore / dark-academia / y2k / … moodboards). - data-viz (+2): MCP Dashboards (45+ chart types + KPI dashboards), Excalidraw Architect (hand-drawn architecture diagrams). - publishing (+6): PageDrop, PDFSpark, OGForge, QRMint, Slideshot (HTML → PDF / PPTX / PNG with 7 themes), Deckrun (Markdown → PDF / video, hosted free tier with no key required). - utilities (+1): A11y axe-core (WCAG 2.0/2.1/2.2 + color-contrast + ARIA). Tests cover every new template's wiring (command, args, env / header required-vs-optional, secret flag), the category enum invariant, and in-category declaration order for image-generation, design-systems and publishing buckets where the order is what users see in the picker. 21 new test cases pass; full mcp-config suite is green. 
Templates intentionally deferred (documented in PR body): figma-use (needs Figma desktop with --remote-debugging-port=9222), m-moire (multi-step `memi suite init` + daemon ceremony), gemini-media-mcp + trident-mcp (Go binaries — no npx / uvx path), Pixelle-MCP (full app with web UI + ComfyUI backend), storybook-addon-mcp (lives inside user's Storybook, not standalone), primitiv (multi-step init / build / serve), ReftrixMCP (PostgreSQL + Redis + Ollama + DINOv2), narasimhaponnada/mermaid (overlap with peng-shawn). Co-authored-by: Cursor <cursoragent@cursor.com> * feat(mcp): add figma-use template (write designs from chat) under design-systems figma-use is the natural counterpart to Figma-Context already in this PR: where Framelink reads Figma designs into the model, figma-use writes back into the canvas (90+ tools — create frames / text / components / variants, render JSX into Figma, export PNG/SVG, query nodes via XPath, lint for WCAG / auto-layout / hardcoded colors, analyze design systems). Wired as an HTTP MCP template (`http://localhost:38451/mcp`) because `figma-use mcp serve` only exposes HTTP — there's no stdio mode in the upstream `serve.ts`. No API key. Two prerequisites the user owns are spelled out in the description so picking the template still resolves to a working server: (1) start Figma with `--remote-debugging-port=9222` (or `figma-use daemon start --pipe` on Figma 126+), and (2) leave `npx figma-use mcp serve` running in a terminal. Inserted between `design-system-extractor` and `aesthetics-wiki` so the design-systems category reads as a workflow: read existing design (Figma Context) → translate tokens (Token Bridge) → extract from Storybook (Extractor) → write back to Figma (figma-use) → break creative block (Aesthetics Wiki). Tests cover the new template's transport (`http`), endpoint URL, the empty header-fields invariant (no auth required), and bump the design-systems group order to include it. 
Co-authored-by: Cursor <cursoragent@cursor.com> * feat(settings): i18n the External MCP / MCP server / Connectors sidebar entries and make the dialog header track the active section The External MCP sidebar entry this PR introduces was hardcoded English ("External MCP / Add MCP tools (Higgsfield, GitHub…)"). Same for the adjacent Connectors and MCP server entries. The dialog header was also pinned to "Execution & model" copy, so opening Settings → External MCP showed a header that lied about which section the user was on. Adds six translation keys — `settings.connectorsTitle/Hint`, `settings.mcpServerTitle/Hint`, `settings.externalMcpTitle/Hint` — and translates them across all 17 locales (ar, de, en, es-ES, fa, fr, hu, id, ja, ko, pl, pt-BR, ru, tr, uk, zh-CN, zh-TW). `SettingsDialog` now derives the header title/subtitle from the active section (11 sections total) instead of a single hardcoded pair, so each section renders an honest header. Co-authored-by: Cursor <cursoragent@cursor.com> * test(e2e): pin level: 3 on dialog heading lookups for Pets and Connectors CI's Validate workspace job (#1479) failed two Playwright cases with the strict-mode violation: getByRole('dialog').getByRole('heading', { name: 'Pets' }) resolved to 2 elements: 1) <h2>Pets</h2> 2) <h3>Pets</h3> Same root cause as the unit-test fix already in this PR: the dynamic dialog `<h2>` now echoes the section's own `<h3>` because the dialog header tracks the active section. Disambiguate to `level: 3` so each assertion still pins the section heading specifically (which is what the test intends to verify). Audit of the rest of e2e/ for `dialog.getByRole('heading', ...)` — settings-api-protocol.test.ts looks for "OpenAI API" / "Anthropic API" section h3s which never appear in the dialog `<h2>` (always "Execution & model"), so those stay safe. 
Co-authored-by: Cursor <cursoragent@cursor.com> * fix(mcp): bind OAuth refresh to the issuing client and skip stale tokens Persist the OAuth client context (token endpoint, client_id, client_secret, issuer, redirect_uri, resource) alongside the bearer token so refresh hits the same client the refresh_token was bound to (RFC 6749 §6). The previous refresh path re-ran beginAuth with a dummy OOB redirect URI, which kept getOrRegisterClient from finding the original DCR client and made providers reject the refresh on the next chat turn. Refreshes now reuse the persisted endpoint/client pair directly. Also stop injecting expired access tokens at spawn time when refresh is unavailable or fails. Pinning a stale Bearer made every Claude MCP call 401 while the prompt still treated the server as connected; on that path we now skip the entry and let the UI surface a reconnect. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-08 17:59:20 +08:00
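The refresh fix in this commit might be pictured with the following sketch. Every type and function name here (`StoredOAuthClient`, `StoredToken`, `buildRefreshBody`, `bearerForSpawn`) is hypothetical and not taken from the daemon's mcp-oauth or mcp-tokens modules. The form fields follow the refresh-token grant in RFC 6749 §6; sending `client_secret` in the body and including a `resource` parameter are assumptions of this example.

```typescript
// Client context persisted alongside the token (hypothetical field names).
interface StoredOAuthClient {
  tokenEndpoint: string;
  clientId: string;
  clientSecret?: string;
  resource?: string;
}

interface StoredToken {
  accessToken: string;
  refreshToken?: string;
  expiresAt?: number; // epoch milliseconds
  client: StoredOAuthClient;
}

// Build the refresh request body from the persisted client, so the refresh
// goes to the same client the refresh_token was issued to.
function buildRefreshBody(token: StoredToken): URLSearchParams | null {
  if (!token.refreshToken) return null; // refresh unavailable
  const body = new URLSearchParams({
    grant_type: "refresh_token",
    refresh_token: token.refreshToken,
    client_id: token.client.clientId,
  });
  if (token.client.clientSecret) body.set("client_secret", token.client.clientSecret);
  if (token.client.resource) body.set("resource", token.client.resource);
  return body;
}

// At spawn time, skip the entry instead of pinning a stale Bearer.
function bearerForSpawn(token: StoredToken, nowMs: number): string | null {
  const expired = token.expiresAt !== undefined && token.expiresAt <= nowMs;
  return expired ? null : `Bearer ${token.accessToken}`;
}
```

A caller would POST the body to `token.client.tokenEndpoint`, and treat a `null` from `bearerForSpawn` as the signal to let the UI surface a reconnect.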
Tom Huang	1d1df52f3b	feat(skills/live-artifact): add 7 example dashboards + contract demo (#716 ) * feat(skills/live-artifact): add 7 example dashboards + contract demo Seven self-contained HTML prototypes under skills/live-artifact/examples/, each with a distinct visual identity and built-in interactivity for video demos: stock-dashboard.html - Bloomberg-style trading floor (dark) crypto-dashboard.html - DeFi/web3 cyber terminal with on-chain ribbon crm-table-live.html - multi-dim CRM with Grid/Kanban/Gallery/Calendar view switcher (light productivity) monday-operator-live.html - editorial Monday-morning briefing (paper) competitor-radar-live.html - mission-control radar with rotating sweep and RGB threat tiers baby-health-live.html - soft pastel parental panel stock-portfolio-live/ - full live-artifact contract example: 102 escaped html_template_v1 bindings + 7 data-od-repeat blocks, ready to register via 'tools live-artifacts create' Each interactive HTML carries refresh-with-flash, view switching, AI panel regeneration, clickable rows/cards that mutate state, and toast notifications. Self-contained - only Google Fonts as external dep. stock-portfolio-live/ demonstrates the daemon contract: template.html + data.json + artifact.json + provenance.json. Refresh runners can rewrite data.json without re-authoring the template. * fix(skills/live-artifact): address PR #716 review feedback - Unroll data-od-repeat blocks into indexed data.* bindings so renderHtmlTemplateV1 can interpolate them (it does not expand data-od-repeat or repeat-local aliases like {{t.label}}). - Rename catalysts[].body to catalysts[].text to satisfy the bounded JSON validator's forbidden-key list (body is rejected case-insensitively); update template binding accordingly. 
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(skills/live-artifact): make stock-portfolio provenance.json contract-compliant - generatedBy: free-form string -> "agent" (LiveArtifactProvenanceGenerator enum) - sources[].kind -> sources[].type with LiveArtifactProvenanceSourceType enum values (connector for brokerage/quotes connectors, derived for AI recommendation) - Drop non-contract per-source `note` and top-level `summary`/`transformations`/`refreshContract`/`safetyNotes` fields; preserve their content under the contract-allowed `notes` field so the example survives schema validation. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(skills/live-artifact): use strict ISO-8601 generatedAt in provenance The daemon's `validateIsoDate` requires `Date.toISOString()` round-trip equality, so timezone-offset notation like `2026-05-06T14:32:18-05:00` fails validation even though it parses. Switch to the canonical UTC form `2026-05-06T19:32:18.000Z` (same instant), which the validator accepts. * feat(skills): surface examples/*.html as derived skill cards + Live filter A skill that ships hand-crafted samples under examples/*.html (e.g. live-artifact's stock dashboard, baby health monitor) now lights up one gallery card per file instead of a single parent card whose preview can only ever show one of them. The parent stays in the listing tagged aggregatesExamples=true so findSkillById and Use this prompt still resolve back to its SKILL.md body, but the Examples tab hides it so the derived <parent>:<child> cards aren't shadowed by a duplicate preview. Subfolder layouts (examples/<name>/template.html + data.json) are deliberately skipped: their templates still hold {{data.x}} placeholders that only the daemon-side renderer fills in, so showing the raw template would render visible braces in the gallery. Ship the baked output as examples/<name>.html alongside the folder to surface it. 
Adds an examples.modeLive filter pill (translated across all 21 locales) that selects skill.scenario === 'live', so refreshable / connector-backed samples are easy to find without scrolling through every desktop prototype. live-artifact's SKILL.md gains scenario: live so it (and every derived card) lights up there. Co-authored-by: Cursor <cursoragent@cursor.com> * perf(web): parallelize entry-view bootstrap so each tab renders independently Bootstrap used to wait on a single Promise.all behind a global 'Loading workspace…' placeholder, which made the slowest endpoint (typically /api/agents on cold start, since it probes CLI versions) gate every tab including the ones that don't need agents at all. Splits the global bootstrapping flag into per-resource loading flags (agentsLoading, skillsLoading, dsLoading, projectsLoading, promptTemplatesLoading) plus a daemonConfigLoaded flag for the merged daemon config. Each tab now blocks only on the data it actually needs: Examples renders as soon as skills land, Design Systems on dsList, Designs on projects+skills+designSystems, etc. Auto-selecting the first available agent and the default design system moves into dedicated effects gated on daemonConfigLoaded so they no longer race ahead of the daemon-stored choice and overwrite it with a freshly picked first-available pick. EntryView swaps its single loading prop for skillsLoading, designSystemsLoading, projectsLoading, promptTemplatesLoading so each inner tab can pick the right gate without leaking the parent's coarse state. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-08 17:38:29 +08:00
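Note on the strict ISO-8601 fix in the commit above: the message says the daemon's `validateIsoDate` requires `Date.toISOString()` round-trip equality. The sketch below shows what such a check could look like. `isStrictIsoDate` is a hypothetical name and the body is an assumption based only on that description, not the daemon's actual code.

```typescript
// Hypothetical helper: a sketch of a round-trip ISO check, assuming the
// validator compares the input against Date.toISOString() output.
function isStrictIsoDate(value: string): boolean {
  const parsed = new Date(value);
  // An unparseable string yields an invalid Date whose time is NaN;
  // calling toISOString() on it would throw, so bail out first.
  if (isNaN(parsed.getTime())) return false;
  // Accept only strings that survive parse -> toISOString() unchanged.
  return parsed.toISOString() === value;
}

// The two timestamps quoted in the commit message.
const offsetForm = "2026-05-06T14:32:18-05:00";
const canonicalForm = "2026-05-06T19:32:18.000Z";

console.log(isStrictIsoDate(offsetForm)); // false: parses, but does not round-trip
console.log(isStrictIsoDate(canonicalForm)); // true: canonical UTC form
// Both strings denote the same instant.
console.log(new Date(offsetForm).getTime() === new Date(canonicalForm).getTime()); // true
```

Under this reading, any offset notation fails even when it parses correctly, which matches the failure the commit describes.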
shangxinyu1	8fee22d358	Fix stuck chat runs and unintended cancels (#896) * Fix stuck chat runs and unintended cancels * Harden chat run stall watchdog	2026-05-08 15:47:44 +08:00
Marc Chan	e14b8092ea	feat: add Orbit activity summaries (#681 ) * feat: add Orbit activity summaries * fix(orbit): make runs navigable while agent continues * fix(web): widen minimum chat panel * feat: support Orbit template selection * fix(daemon): avoid bogus skill side-file preflight * fix(web): collapse orbit artifact project cards * fix(web): preserve orbit project card titles * fix: improve Orbit run daily briefing * fix: handle Orbit digest data failures * fix: load Orbit templates and connector tools reliably * fix: keep Orbit summary counts consistent Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: apply Orbit template skill context * fix: cache and curate connector tools for Orbit * fix: align Orbit defaults and connector discovery * fix: simplify Orbit template settings * fix: move connectors into settings * fix: compact connector settings catalog * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: prevent connector action button from stretching into pill The icon-only connect/disconnect buttons in the embedded connectors catalog inherited min-width: 92px / 106px from the non-embedded pill rules, overriding the 24px square sizing and causing the buttons to overlap the card head text. 
Reset min-width to 0 in the embedded icon-only rule so the compact square layout holds. * fix(web): align live artifact file rows * fix: clean up Orbit connector settings lifecycle Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address Orbit review regressions Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * feat(web): localize Orbit and connector settings * feat(web): gate Orbit runs without connectors * feat(web): refine connector settings UX * feat(web): safeguard Composio key clearing * fix(web): refresh Composio tool badges * feat(web): show connector logos * feat(daemon): localize Orbit prompt window * fix(daemon): clarify blocked connector callback closes * test(daemon): harden flaky async probes * fix(web): align Indonesian connector locale keys * test(web): align connector browser props * fix(web): preserve explicit credential clears Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): time out Composio logo proxy fetches Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): localize Indonesian connector settings copy Translate the new connector settings strings in the Indonesian locale and lock them with a regression test so this surface no longer silently falls back to English. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve discovered connector tools Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve onboarding autosave completion Keep settings autosave from clearing onboarding completion after the close gesture, and expose the desktop main types from source so workspace validation can typecheck packaged imports without a prior desktop build. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): defer Composio catalog cache hydration Load persisted Composio catalog data only after the runtime data directory is configured so startup cannot read another namespace's cache. 
Add a regression test that exercises the module-load singleton path. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): treat discovery completion independently Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve latest settings draft on close Use the latest persisted settings draft when the dialog closes so onboarding completion does not race a stale daemon sync and overwrite newer Orbit/template selections. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): avoid syncing draft Composio key on Orbit run Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): localize Orbit settings copy Translate the new Indonesian Orbit and autosave strings so the settings UI no longer falls back to English and the locale regression stays covered. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): prefer fresh connector catalog state Keep refetched connector status/auth data authoritative while retaining discovery-only tool metadata so the connectors UI stays consistent after refreshes. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): declare Indonesian locale fallback keys explicitly Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): inline Indonesian fallback strings for CI Replace the Indonesian locale's per-key English lookups with explicit strings so workspace typecheck no longer depends on brittle build-mode resolution in CI. Add a regression test that blocks those per-key English lookups from reappearing in the CI-sensitive fallback sections. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): restrict proxied connector logos to image MIME types Reject non-image upstream logo responses so the daemon never serves third-party HTML from its localhost origin. 
Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * test(e2e): align settings dialog regressions Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): decouple Orbit runs from media sync failures Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): keep SPA catch-all export-compatible Disable dynamic catch-all params for the exported SPA shell so Next.js static builds can emit the root route again. Add a regression test covering the route config against the web export mode. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve Orbit config and workspace routes Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): block SVG in connector logo proxy Reject SVG and other unsafe proxied logo responses so third-party logo content cannot execute under the daemon origin, while keeping raster logo fetches working and making rejected responses non-cacheable. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): fall back to static catalog for empty cache Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): disable Orbit run before connector gate resolves Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(desktop): export shipped desktop types Point the desktop ./main type export at the generated declaration so installed consumers resolve the published file set. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): restore persisted question form selections Render historical submitted answers directly so reloaded question forms keep their locked selections visible. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): retry forced media sync autosave Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): keep Composio logo timeout through body read Keep the Composio logo fetch timeout active until the response body is fully consumed so stalled body reads abort and clear the inflight cache entry. 
Add a regression test that proves a delayed body read times out and the next request can recover.\n\nGenerated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): refresh Orbit gate after connector auth Re-check connector availability when the settings window regains focus so Orbit unlocks as soon as a connector finishes authenticating in the same settings session. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): keep connector detail tool lists intact Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): ignore malformed Orbit summaries Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(e2e): stabilize design-system multi-select flow Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): cap Composio logo cache growth Bound the Composio logo cache with LRU eviction and expired-entry pruning so repeated untrusted logo requests cannot grow daemon memory without limit. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): bound proxied Composio logo payloads Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): align autosave settings tests Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): remove stray CSS conflict marker Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fixer: address PR #681 follow-up items Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): restore restart routes and connector flows * fix(web): keep SPA export route static * fix(web): stabilize chat scroll tests --------- Co-authored-by: lefarcen <935902669@qq.com>	2026-05-08 14:27:46 +08:00
Tom Huang	2df8b775ec	feat(skills): add 32 zhangzara HTML deck templates (#704 ) * feat(skills): add 32 zhangzara HTML deck templates Vendored from upstream MIT-licensed zarazhangrui/beautiful-html-templates — one Open Design skill per template (name prefix `html-ppt-zhangzara-`) so each template surfaces as its own entry in the Examples panel and renders its own preview. Each skill ships: - SKILL.md (frontmatter + workflow), description, triggers, and od.upstream pointing at the source folder - example.html (the self-contained deck; daemon's preview route looks for <skillDir>/example.html) - template.json (upstream metadata snapshot, with `slug` re-prefixed to `zhangzara-<base>` and a `source` URL) - assets/deck-stage.js / assets/styles.css for the 8 templates that ship a runtime; HTML refs rewritten so the daemon's iframe URL rewriter resolves them through /api/skills/<id>/assets/ scripts/guard.ts allowlist updated with the `html-ppt-zhangzara-` prefix so the vendored upstream JS runtimes pass the residual-JS check. * fix(skills, i18n): address PR #704 review feedback - Add the 32 new html-ppt-zhangzara-* skill ids to the de/ru/fr SKILL_IDS_WITH_EN_FALLBACK arrays so the localized-content coverage e2e test passes. The vendored upstream templates are English-only; falling back to the upstream English description is the right semantic for this batch. - Also add the pre-existing social-media-dashboard skill and totality-festival design system to the same fallback arrays (introduced in #678 without i18n coverage). Tagged with TODOs so localized copy can land in a follow-up. - Ship the upstream MIT LICENSE file in each skills/html-ppt-zhangzara-/ folder so the copyright/permission notice travels with the vendored copy, as MIT requires for redistributing substantial portions. Update each SKILL.md's Source section to reference the bundled LICENSE. 
- For the 8 runtime-backed templates (creative-mode, editorial-tri-tone, neo-grid-bold, peoples-platform, pin-and-paper, pink-script, soft-editorial, stencil-tablet), expand the workflow's clone step to instruct the agent to copy the assets/ folder alongside example.html — the skill HTML references assets/deck-stage.js (and assets/styles.css for pin-and-paper) as project-local paths, so cloning the HTML alone produces an artifact whose runtime 404s. Verified locally: - pnpm guard passes. - pnpm --filter @open-design/web typecheck passes. - pnpm --filter @open-design/web test passes (309/309). - pnpm --filter @open-design/e2e test passes (6/6 active, including localized-content coverage for de/ru/fr). fix(i18n): drop duplicate totality-festival fallback after merge with main Main already added 'totality-festival' to the design-system EN-fallback lists; the TODO entry from this branch became a duplicate after merge. * fix(skills, guard): address PR #704 follow-up review - Pin Chart.js CDN to 4.4.7 in coral and cartesian example.html so vendored decks no longer track the latest jsDelivr major. - Narrow scripts/guard.ts zhangzara allowlist to a regex that only permits skills/html-ppt-zhangzara-/assets/deck-stage.js, restoring the TypeScript-first guard for any other JS under those skill dirs. - Reconcile slide_count and 'Slides in demo' with actual <section class="slide"> counts: broadside 20 -> 16, monochrome 18 -> 16, neo-grid-bold 13 -> 12. Co-authored-by: Cursor <cursoragent@cursor.com> fix(daemon): keep resolveDataDir return path stable, canonicalize at compare site The realpathSync wrapper inside resolveDataDir was rewriting every /var/... result to /private/var/... on macOS, which broke 11 hermetic assertions in tests/resolve-data-dir.test.ts (absolute paths, relative paths, and \$HOME / \${HOME} / ~ expansions whose mkdtempSync roots live under /var/folders/...). 
It also changed the public OD_DATA_DIR resolution contract for any downstream caller that compared against the expanded user-supplied path. Restore resolveDataDir to return the expanded resolved path unchanged, and introduce RUNTIME_DATA_DIR_CANONICAL — a one-shot realpath of RUNTIME_DATA_DIR — used only at the narrow folder-import comparison site that needs to match against a user-supplied realpath() result. The import-path symlink protection from #624 still works (a /var-rooted data dir now compares against its /private/var canonical form), while resolveDataDir keeps its stable, user-shaped contract. Verified locally: pnpm --filter @open-design/daemon test (1083/1083), including all 12 resolve-data-dir.test.ts cases. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-08 12:02:59 +08:00
VanJay	369d136d19	Add Docker Compose deployment workflow (#65 ) * Add Docker Compose deployment workflow * Address Docker deployment review feedback Harden publishing inputs and temporary credential handling, and tighten Docker runtime defaults requested by the PR review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix Docker publish build in CI mode Set CI=true during the image build so pnpm prune can run non-interactively inside Docker. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix Docker runtime dependency layout Use pnpm deploy for the daemon package so the runtime image includes production dependencies where Node resolves them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Use legacy pnpm deploy in Docker build Allow pnpm v10 deploy to package the daemon workspace without requiring injected workspace packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Align Docker runtime with Node 24 Use Node 24 for both build and runtime stages and update image verification for the workspace daemon dependency layout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Remove legacy OD_HOST Docker binding fallback Use OD_BIND_HOST as the single daemon bind-host setting for Docker deployment and origin validation. * Update Docker image verifier for daemon dist runtime Check the packaged daemon dist entrypoint and allow npm from the Node 24 runtime image while still rejecting build-only tools. * Allow private LAN browser origins for daemon * Share daemon origin validation helpers Move browser origin validation into a shared daemon module so tests exercise the production logic and cover the remaining private LAN edge cases. * Harden Docker Compose port exposure Bind the Compose deployment to localhost by default and pass the published port through to the daemon origin checks so host-port overrides remain same-origin. 
* Keep deployment hosts out of local-only no-origin checks Require an actual matching Origin before configured deployment origins can satisfy local-only daemon guards, preventing no-Origin remote clients from bypassing those checks. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: mrcfps <mrc@powerformer.com> Co-authored-by: lefarcen <935902669@qq.com>	2026-05-08 11:51:51 +08:00
Nagendhra Madishetti	665e52b295	fix(daemon): pin OD_DATA_DIR in /api/mcp/install-info env so the macOS-packaged MCP server does not EPERM on .od/projects (#857 ) * fix(daemon): pin OD_DATA_DIR in /api/mcp/install-info env so spawned MCP processes do not fall back to .od inside the macOS app bundle Reporter (#848) ran a packaged Open Design 0.5.0 on macOS and pointed Antigravity's MCP config at the bundle's daemon-cli.mjs. The MCP process is launched by the IDE outside the packaged app's environment, so it does not inherit OD_DATA_DIR. The daemon-cli import path runs mkdirSync('<cwd>/.od/projects') before dispatching to MCP mode, and <cwd> resolves to the read-only macOS app bundle, hitting EPERM. The /api/mcp/install-info endpoint already serializes env into every client snippet (Cursor, Claude Code, VS Code, Zed, Windsurf, Antigravity, Codex). Add OD_DATA_DIR: RUNTIME_DATA_DIR to that env so the snippet pins the daemon's resolved data root, and the spawned MCP process writes to the same directory the daemon already uses regardless of how the IDE was launched. Test added asserts env.OD_DATA_DIR is propagated. * refactor(daemon): extract buildMcpInstallPayload so the test asserts the production helper, not a fixture mirror Reviewer flagged that the previous test asserted env.OD_DATA_DIR on a copy of the handler's payload-construction logic, which would silently pass if the real handler ever diverged from the fixture. Move the env / args / buildHint shape into a pure exported helper (apps/daemon/src/mcp-install-info.ts), wire both server.ts and the test fixture through it, and drop the inline duplicates. The test now exercises the same code path that ships, so any regression in the env block (missing OD_DATA_DIR, wrong format, lost ELECTRON_RUN_AS_NODE) fails it. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-08 11:35:23 +08:00
Nagendhra Madishetti	6de802ba70	feat(daemon): add critique interrupt endpoint + project-keyed run registry (Task 6.1) (#819 ) Phase 6.1 of the Critique Theater rollout: a single new endpoint and the in-process registry that backs it. POST /api/projects/:projectId/critique/:runId/interrupt cascades an AbortController to the orchestrator that owns the spawned CLI so the parser can flush best-so-far state and emit critique.interrupted before the process exits. Backed by a new in-process run registry that the orchestrator wiring registers each run into before runOrchestrator is invoked, and unregisters in a finally block. The registry is keyed by (projectId, runId), not just runId. A request to interrupt project p1's runId cannot find or abort a registry handle that belongs to project p2 even if their ids ever collide. The HTTP handler also performs its own DB-row projectId check before calling the registry, so cross-project leakage is blocked at two layers. The endpoint is idempotent on already-interrupted rows: a client that lost the first response and retries observes 202 with prevStatus "interrupted" rather than a 409 conflict. Other terminal statuses (shipped, failed, timed_out, degraded, below_threshold, legacy) still return 409 because those runs reached their real terminal state on their own and an interrupt is no longer meaningful. Recovery path for stale running rows: when registry.interrupt returns false (the in-process registry has no AbortController for this projectId/runId pair) but the DB still says 'running', the endpoint marks the row 'interrupted' directly with recoveryReason='no_live_handle' and returns 202 with recovered=true. This window opens after a daemon restart in the gap before reconcileStaleRuns sees the row old enough. Without the recovery branch the endpoint would lie: 202 accepted, no child signaled, no critique.interrupted event, row stuck running. 
The new persistence helper markRunInterruptedRecovery mirrors the per-row write reconcileStaleRuns already does, gated on status='running' so a row that just transitioned terminal is not overwritten. Task 6.2 (rerun endpoint) is intentionally not in this PR. The earlier draft conflated row insertion between the handler and runOrchestrator (primary key collision) and did not actually start a new agent spawn. Rerun needs a real chat-run path with prior-art context, an artifact-id validator, and SQL LIKE escaping that the row lookup path is missing today; it is cleaner shipped as a follow-up than wedged into this PR. Tests: - critique-run-registry: 17 cases covering register, get, interrupt, unregister, list, plus the new (projectId, runId) composite key invariants (cross-project register, cross-project get/interrupt isolation, unregister keying). - critique-interrupt-endpoint: 17 cases covering 202 happy path, 404 on unknown run, 404 on cross-project run, 404 cross-project leak guard at the registry layer, 409 on terminal statuses, 202 idempotent retry on already-interrupted, stale-handle defense, 202 + recovered on a stale running row with no live handle, 400 on bad params. Incidental: apps/web/src/i18n/locales/id.ts was missing 18 fileViewer deploy/Cloudflare keys after upstream landed PR #805 (R2 release publishing). Without those keys the workspace web typecheck fails on the i18n Dict equality check, blocking CI on every PR. Added Indonesian translations for the missing keys to unblock. Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-08 11:29:37 +08:00
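Note on the run registry in the commit above: the message describes an in-process registry keyed by (projectId, runId) whose interrupt call cascades an AbortController and returns false when no live handle exists. A minimal sketch of that idea follows. The class and method names are hypothetical, and the JSON-encoded key is an illustrative choice, not the daemon's implementation.

```typescript
// Hypothetical sketch of a registry keyed by (projectId, runId).
class CritiqueRunRegistry {
  private readonly handles = new Map<string, AbortController>();

  // JSON-encode the pair so ("a", "b:c") and ("a:b", "c") cannot collide.
  private static key(projectId: string, runId: string): string {
    return JSON.stringify([projectId, runId]);
  }

  // Called before the run starts; returns the controller the run should observe.
  register(projectId: string, runId: string): AbortController {
    const controller = new AbortController();
    this.handles.set(CritiqueRunRegistry.key(projectId, runId), controller);
    return controller;
  }

  // Returns false when there is no live handle for this exact pair.
  interrupt(projectId: string, runId: string): boolean {
    const controller = this.handles.get(CritiqueRunRegistry.key(projectId, runId));
    if (!controller) return false;
    controller.abort();
    return true;
  }

  // Intended to be called from a finally block once the run ends.
  unregister(projectId: string, runId: string): void {
    this.handles.delete(CritiqueRunRegistry.key(projectId, runId));
  }
}

const registry = new CritiqueRunRegistry();
const handle = registry.register("p1", "run-1");

// A request scoped to another project cannot reach p1's handle.
const crossProject = registry.interrupt("p2", "run-1");
const abortedAfterCrossProject = handle.signal.aborted;

// The owning project can interrupt its own run.
const sameProject = registry.interrupt("p1", "run-1");
registry.unregister("p1", "run-1");
const afterUnregister = registry.interrupt("p1", "run-1");

console.log(crossProject, abortedAfterCrossProject, sameProject, handle.signal.aborted, afterUnregister);
// false false true true false
```

A false return from `interrupt` while the stored row still says running is the case the commit's recovery branch handles by marking the row interrupted directly.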
Terence !_!	e52720aa12	feat(daemon): add language boost support for Minimax TTS (#773) * feat(daemon): add language boost support for Minimax TTS Add --language CLI flag to support language boost parameter for Minimax TTS. This enables better pronunciation for specific languages like Cantonese (Yue). * docs(media): add --language flag to media generation contract Document the language boost parameter for Minimax TTS, enabling better pronunciation for specific languages like Cantonese (Yue). * fix(media): correct Cantonese language_boost value and add input validation - Use correct MiniMax value 'Chinese,Yue' for Cantonese (no space) - Add type guard in server.ts to reject non-string language values - Trim language string before sending to MiniMax API --------- Co-authored-by: root <root@DELLN40.asiacredit.org>	2026-05-08 11:26:34 +08:00
kami	2eae7da24b	feat: support Cloudflare Pages custom domains (#851) * Support Cloudflare Pages custom domains without hiding pages.dev fallback Keep the default Pages preview as the first public link while optional owned-zone binding provisions DNS and Pages custom-domain state in parallel. Constraint: Cloudflare deploys must use the existing direct-upload API path with no Wrangler dependency. Constraint: pages.dev must stay visible even while custom-domain verification is pending. Rejected: Vercel custom-domain support | outside requested Cloudflare-only scope. Rejected: overwriting arbitrary CNAME records | risks taking over user-managed DNS. Confidence: high Scope-risk: moderate Directive: Do not expose providerMetadata through public deploy contracts; keep custom-domain DNS ownership checks conservative. Tested: pnpm --dir apps/daemon exec vitest run -c vitest.config.ts tests/deploy.test.ts tests/deploy-routes.test.ts Tested: pnpm --filter @open-design/contracts build && pnpm --filter @open-design/contracts typecheck && pnpm --filter @open-design/contracts test Tested: pnpm --filter @open-design/web typecheck && pnpm --filter @open-design/web test -- providers/registry.test.ts components/FileViewer.test.tsx i18n/locales.test.ts Tested: pnpm i18n:check && pnpm guard && pnpm typecheck Tested: pnpm --filter @open-design/daemon build && pnpm --filter @open-design/web build && git diff --check Not-tested: real Cloudflare account/token/domain smoke test * Preserve Cloudflare fallback correctness under large accounts and races Constraint: Cloudflare Pages keeps pages.dev as the primary usable fallback while custom domains remain optional typed metadata. Rejected: Treating custom-domain DNS or binding failure as a top-level deployment failure | pages.dev can still be ready and usable. Confidence: high Scope-risk: moderate Directive: Keep custom-domain finality tied to Cloudflare Pages API active status plus URL reachability; do not expose providerMetadata.
Tested: pnpm --dir apps/daemon exec vitest run -c vitest.config.ts tests/deploy.test.ts tests/deploy-routes.test.ts; pnpm --filter @open-design/web test -- components/FileViewer.test.tsx i18n/locales.test.ts providers/registry.test.ts; pnpm --filter @open-design/daemon typecheck; pnpm --filter @open-design/web typecheck; pnpm i18n:check; git diff --check; pnpm guard; pnpm typecheck; pnpm --filter @open-design/daemon build; pnpm --filter @open-design/web build Not-tested: Real Cloudflare token/account/zone smoke test. * Keep impeccable design notes local Constraint: .impeccable.md is local assistant/design context and should not be part of the PR diff. Rejected: Keeping the file tracked while adding it to .gitignore | tracked files are not ignored by Git. Confidence: high Scope-risk: narrow Directive: Keep .impeccable.md untracked and ignored; do not rely on it for required project documentation. Tested: git check-ignore -v .impeccable.md; git diff --check Not-tested: Full workspace tests not rerun for ignore-only metadata change.	2026-05-08 11:11:22 +08:00
emilneander	959bfaa817	fix(daemon): make MCP install snippet survive daemon port changes (#846 ) * fix(daemon): make MCP install snippet survive daemon port changes `od mcp` now discovers the live daemon URL via the sidecar IPC status socket on every spawn, so the Settings -> MCP server snippet no longer bakes in `--daemon-url <port>`. Pasted client configs stay valid across daemon restarts even when the daemon binds an ephemeral port (tools-dev, packaged). Resolution order is --daemon-url > OD_DAEMON_URL > IPC discovery > http://127.0.0.1:7456 so explicit overrides still win for direct `od` launches. * fix(daemon): MCP snippet works in non-default namespaces and direct launches Propagate OD_SIDECAR_NAMESPACE / OD_SIDECAR_IPC_BASE into the snippet env so non-default namespace daemons stay reachable; the spawned MCP client does not inherit the daemon's env, so without this it would probe the default-namespace socket and miss. Restore --daemon-url in the snippet for direct `od --port X` launches that have no IPC socket. Reword `od mcp --help` so it does not imply live URL tracking; each new spawn rediscovers, but a running MCP server caches the URL until the client restarts.	2026-05-08 10:59:09 +08:00
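Note on the resolution order in the commit above (`--daemon-url` > `OD_DAEMON_URL` > IPC discovery > `http://127.0.0.1:7456`): the sketch below shows one way to express that precedence. `resolveDaemonUrl`, its parameters, and the choice to treat empty strings as unset are assumptions for illustration; only the ordering and the default URL come from the commit message.

```typescript
// Hypothetical sketch of the precedence chain; not the actual `od mcp` code.
type Env = Record<string, string | undefined>;

const DEFAULT_DAEMON_URL = "http://127.0.0.1:7456";

function resolveDaemonUrl(
  flagUrl: string | undefined, // value of --daemon-url, if passed
  env: Env, // environment passed in explicitly so the function stays testable
  discoverViaIpc: () => string | undefined, // stand-in for the sidecar IPC lookup
): string {
  // `||` short-circuits, so IPC discovery only runs when neither explicit
  // override is present. Empty strings count as unset (an assumption).
  return flagUrl || env["OD_DAEMON_URL"] || discoverViaIpc() || DEFAULT_DAEMON_URL;
}

let ipcCalls = 0;
const ipc = (): string | undefined => {
  ipcCalls += 1;
  return "http://127.0.0.1:51234"; // made-up ephemeral port
};

const fromFlag = resolveDaemonUrl("http://127.0.0.1:9000", { OD_DAEMON_URL: "http://127.0.0.1:9001" }, ipc);
const fromEnv = resolveDaemonUrl(undefined, { OD_DAEMON_URL: "http://127.0.0.1:9001" }, ipc);
const callsBeforeDiscovery = ipcCalls; // still 0: explicit overrides skipped IPC
const fromIpc = resolveDaemonUrl(undefined, {}, ipc);
const fromDefault = resolveDaemonUrl(undefined, {}, () => undefined);

console.log(fromFlag, fromEnv, callsBeforeDiscovery, fromIpc, fromDefault);
```

Because the lookup happens once per call, a long-running process that stores the result keeps the URL it found at startup, which is consistent with the commit's note that a running MCP server caches the URL until the client restarts.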
Tom Huang	56bf6ee1b6	feat: agent-callable research command and /search (#615 ) * feat: pre-generation research (Tavily) for grounded generation Adds an optional pre-generation research step so the agent can produce slides / prototypes / decks grounded in real sources instead of guessing. User flow: 1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY). 2. Click the new Research button in the chat composer. 3. On send, the daemon runs a Tavily search, prepends the findings as a <research_context> block ahead of the system prompt, and spawns the agent. Research progress shows up as status pills in the chat stream; the agent cites sources inline as [1]/[2]/... Phase 1 surface: - Single provider (Tavily), single depth ('shallow'), no LLM synthesis pass (Tavily's `answer` is the summary). - Composer toggle only; no popover / depth picker yet. - Reuses the existing `status` SSE agent payload + StatusPill UI so no new event variants or renderer code are needed. Layers touched: - contracts: ResearchOptions / Source / Findings DTOs; ChatRequest.research; export from index. - daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator + provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook in startChatRun before prompt assembly. - web: ChatComposer toggle + ChatSendMeta; threaded through ChatPane / ProjectView / streamViaDaemon into ChatRequest. Side fix (required to land the feature, but useful on its own): contracts internal relative imports lacked the `.js` suffix that NodeNext module resolution requires. This was already breaking `pnpm --filter @open-design/daemon typecheck` on main; without the fix, none of the new research types were visible to the daemon. All internal contracts imports now carry `.js`. Spec: specs/current/research-feature.md (phases 2-4 outlined for follow-up: composer popover, multi-provider, deep recursion, example skills with research_recommends). 
Verified: - pnpm --filter @open-design/contracts typecheck/test - pnpm --filter @open-design/daemon typecheck (the chokidar project-watchers test is a pre-existing flake, unrelated) - pnpm --filter @open-design/web typecheck - node scripts/verify-media-models.mjs * fix(daemon): clamp Tavily max_results to 20 Tavily's /search endpoint requires `max_results` in [0, 20]; sending a larger value (e.g. when `research.depth: "deep"` resolves to 30) returns 400 and `runResearch` silently falls back to no-research. Clamp at the provider boundary so Phase 2 depth tiers above 20 still produce results instead of failing the request. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * Remove stale research merge leftovers * Add agent-callable research search * Fix Indonesian locale typecheck * Fix research command invocation edge cases * Harden slash search prompt expansion * Honor research source caps in command contract * Require search reports in design files * Add research data provider settings * Wire web research provider fallback order * Update research provider fallback wording * Revert "Update research provider fallback wording" This reverts commit `86fb6001e3`. * Revert "Wire web research provider fallback order" This reverts commit `4c9e16036b`. * Revert "Add research data provider settings" This reverts commit `23630d1746`. * Add Dexter and Last30Days research skills * Add DCF and Last30Days OD skills * Add Last30Days and Dexter skills * Resolve research review threads --------- Co-authored-by: a1chzt <chizblank@gmail.com>	2026-05-08 10:33:44 +08:00
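Note on the Tavily clamp in the commit above: the message says `max_results` must stay in [0, 20] and that the fix clamps at the provider boundary. A minimal sketch of such a clamp follows. The function and constant names are hypothetical, and rounding down fractional values and mapping NaN to 0 are assumptions the commit does not specify.

```typescript
// Upper bound taken from the commit message; the constant name is hypothetical.
const TAVILY_MAX_RESULTS_LIMIT = 20;

// Hypothetical helper: clamp a requested result count into [0, 20].
function clampMaxResults(requested: number): number {
  // Assumption: a NaN request is treated as "no results requested".
  if (isNaN(requested)) return 0;
  // Assumption: fractional requests are rounded down before clamping.
  const whole = Math.floor(requested);
  return Math.min(Math.max(whole, 0), TAVILY_MAX_RESULTS_LIMIT);
}

console.log(clampMaxResults(30)); // 20: the "deep" case from the commit message
console.log(clampMaxResults(5)); // 5: in-range values pass through
console.log(clampMaxResults(-3)); // 0: negative values are raised to the lower bound
```

Clamping at the provider boundary means callers can keep requesting larger depth tiers without each one needing to know the provider's limit.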
INFINITY	988fd6db5e	feat: import existing local folder as project (#597 ) (#624 ) * feat(contracts): types for folder-import endpoint Add ImportFolderRequest, ImportFolderResponse to the public contract surface. Extend ProjectMetadata with a baseDir field — when set, the project's files live at this absolute path instead of .od/projects/<id>/. Stored as the realpath() result so symlinks cannot redirect later writes. Refs nexu-io/open-design#597 * feat(daemon): support metadata.baseDir for folder-rooted projects Add resolveProjectDir() and metadata-aware variants of listFiles, readProjectFile, writeProjectFile, ensureProject so a project's files can live under metadata.baseDir (the user's chosen folder) instead of .od/projects/<id>/. metadata.baseDir is opt-in — projects without it keep the existing .od/projects/<id>/ behavior unchanged. When listFiles walks a baseDir-rooted project, it skips conventional build / install dirs (node_modules, .git, dist, build, .next, .nuxt, .turbo, .cache, .output, out, coverage, __pycache__, .venv, vendor, target, .od, .tmp) so the file panel stays focused on design content instead of being dominated by lockfiles and node_modules. Add detectEntryFile() — best-effort lookup for index.html or any .html at the folder root, used by the import endpoint to seed the initial active tab. Refs nexu-io/open-design#597 * feat(daemon): add POST /api/import/folder endpoint Creates a project rooted at the submitted local folder. metadata.baseDir points at that folder and OD reads / writes there directly — no copy, no shadow tree, mirroring how Cursor / Claude Code / Aider behave. The user owns the workspace and is responsible for their own version control. Safety: - baseDir is canonicalized via fs.promises.realpath() at import time so user-controlled symlinks can't redirect later writes. resolveSafe enforces the bounds check against the literal stored path; without realpath, a symlink (e.g. 
~/sneaky → /etc) would let writeProjectFile escape the project tree at every later call because the OS follows the symlink at open() time. - Post-realpath lstat ensures the canonical target is itself a real directory (defense-in-depth). - The data directory (RUNTIME_DATA_DIR) and its descendants are refused after symlink resolution so a redirect into the daemon's own state can't masquerade as a project import. The web client wires this through state/projects.ts → App.tsx, landing the user on the auto-detected entry file when present. Refs nexu-io/open-design#597 * feat(desktop): expose native folder picker to renderer Adds an Electron preload script that exposes window.electronAPI.pickFolder via contextBridge. Wires dialog.showOpenDialog through ipcMain so the web UI can open a native folder selector for project import. Browser-only users fall back to a text input for the absolute path (handled in the web layer); the picker stays an optional convenience on the desktop binary. ipcMain.handle() registers handlers in an internal map that is not exposed via eventNames(), so the natural-looking guard if (!ipcMain.eventNames().includes('dialog:pick-folder')) ipcMain.handle(...) is always true. On a second createDesktopRuntime() call (dev hot-reload, packaged-vs-electron mode swap) the body re-runs and ipcMain.handle() throws 'Attempted to register a second handler'. Use removeHandler() + handle() unconditionally — removeHandler() is a documented no-op when nothing is registered, making the pair idempotent. Includes .cts in the apps/desktop tsconfig so the preload script is typechecked. Refs nexu-io/open-design#597 feat(web): add 'From existing folder' option to New Project UI surface for the import flow: - A new 'Open folder' affordance in NewProjectPanel that uses the native picker on Electron (window.electronAPI.pickFolder) and falls back to an absolute-path text input in the browser. 
- importFolderProject() in state/projects.ts: typed wrapper around POST /api/import/folder using @open-design/contracts types. - App.tsx wires the response: prepend the new project to the list, navigate to it, and select the auto-detected entry file as the active tab. Skill / design-system pickers from the existing prototype tab are reused — folder import is a project-creation flow, not a separate project type. Refs nexu-io/open-design#597 * docs(architecture): document folder-import endpoint Adds POST /api/import/folder to the daemon API table and a 'Folder import' section explaining the single-mode design (direct read/write in metadata.baseDir, mirroring Cursor / Claude Code / Aider), the realpath() canonicalization, the RUNTIME_DATA_DIR refusal, and the SKIP_DIRS list applied to listFiles for baseDir-rooted projects. Refs nexu-io/open-design#597 * test(daemon): unit + integration tests for folder import Two new files: apps/daemon/tests/folder-import-projects.test.ts (13 unit tests): - resolveProjectDir behavior under all metadata combinations, including the fallback when baseDir is relative and the isSafeId-bypass when baseDir is set - detectEntryFile: index.html priority, .html fallback, null when no html, no descent into subdirs - listFiles with metadata.baseDir: walk, SKIP_DIRS hides node_modules / .git / dist, back-compat for projects without baseDir apps/daemon/tests/folder-import-route.test.ts (10 integration tests): - Happy path: baseDir stored in metadata, importedFrom='folder', conversation created, entry file detected - Error paths: missing baseDir, empty, relative, non-existent, pointing at a file - Security: realpath canonicalization (the symlink test was the one that surfaced the original /var vs /private/var mismatch in RUNTIME_DATA_DIR comparison on macOS) - Security: a symlink that resolves into RUNTIME_DATA_DIR is rejected after realpath, not before Refs nexu-io/open-design#597 * fix(daemon): wire baseDir metadata into chat + deploy reads Two 
bugs caught in Codex automated review of #624: 1. chat-route was passing the metadata object directly as the listFiles opts argument: `listFiles(PROJECTS_DIR, projectId, chatMeta)`. The listFiles contract reads opts.metadata, not opts itself, so this silently fell back to .od/projects/<id>/ instead of the imported folder. existingProjectFiles was empty for baseDir-rooted projects. Wrap as `{ metadata: chatMeta }`. 2. deploy.ts read project files via readProjectFile without the metadata third argument, so for baseDir-rooted projects the deploy and preflight endpoints would look in .od/projects/<id>/ and fail with file-not-found instead of reading the imported folder. Thread options.metadata through buildDeployFilePlan → readProjectFile and pass project?.metadata at the two server.ts callsites (`POST /api/projects/:id/deploy` and the preflight endpoint). Add a regression test that locks the listFiles contract: passing a bare metadata object as opts must NOT scan baseDir — it must fall back to the standard project dir, otherwise callers can leak the wrong folder by mistake. Refs nexu-io/open-design#597, #624 (Codex review) * fix(daemon): ensure correct metadata handling in folder import Addressed issues with metadata handling in folder import functionality. Updated the listFiles and readProjectFile methods to correctly utilize the metadata.baseDir, ensuring that project files are read from the intended directory. Added regression tests to verify that passing a bare metadata object does not inadvertently scan the baseDir, maintaining the integrity of project file access. Refs nexu-io/open-design#597 * fix(daemon): security hardening from Codex review of #624 P1 findings from automated review: 1. POST /api/projects + PATCH /api/projects/:id rejected client-supplied metadata.baseDir. baseDir is privileged: it lets a project root inside the user's filesystem, and the realpath() + RUNTIME_DATA_DIR reentry checks live only on /api/import/folder. 
Allowing it on the generic create/patch path lets an attacker smuggle e.g. /etc through and bypass every import-time guard. Both endpoints now refuse a baseDir field with 400. 2. resolveSafeReal() helper: realpath()s each candidate path (or its longest existing prefix for write paths) and re-validates against realpath(projectRoot). The original resolveSafe() only did a string-prefix check, which was fooled by symlinks inside a baseDir-rooted project. A repo containing 'assets -> /Users/me/.ssh' passed the literal prefix check but readFile() followed the link at open() time. resolveSafeReal() is now used by readProjectFile, writeProjectFile, and deleteProjectFile. 3. Multer chat-upload destination now resolves to metadata.baseDir for imported folder projects via a module-level lookup wired to db at startServer() boot. Previously attachments landed in .od/projects/<id>/ even for baseDir projects, so the agent (which runs with cwd=baseDir) couldn't open them. P2 findings: 4. searchProjectFiles threads metadata through listFiles + resolveProjectDir so /api/projects/:id/search hits the right tree. 5. buildProjectArchive + buildBatchArchive now accept metadata so 'Download .zip' works for imported folder projects. 6. Watcher subscribe() resolves to baseDir for imported projects so live-reload SSE actually fires when the user edits files in their own folder. Registry stays keyed by the canonical directory. 7. Template snapshotting reads source-project files with metadata so a template can be saved from a baseDir-rooted source. Tests: - Regression: POST /api/projects with metadata.baseDir → 400. - Regression: descendant symlink (assets/leak.txt -> /etc/hosts) is refused on the raw read endpoint. Refs nexu-io/open-design#597, #624 (Codex P1+P2 review) * fix(daemon): close two regressions found in #624 review round 2 @mrcfps caught two more correctness gaps: 1. 
Archive root symlink escape — buildProjectArchive accepts an optional ?root=<subdir> param to scope the zip to a subdirectory. The path was resolved with the string-only resolveSafe(), so a directory symlink inside an imported folder (docs -> /Users/me/.ssh) passed the prefix check and collectArchiveEntries() then walked outside the project tree. Switch to the symlink-aware resolveSafeReal() — the same one that already protects raw read/write/delete paths. The walker itself already skips dirent symlinks via !isDirectory && !isFile, so canonicalizing the root is the only missing piece. 2. PATCH metadata wiped baseDir — updateProject() replaces metadata wholesale. The previous guard only blocked an explicit baseDir change, but a normal patch that omits baseDir (a UI editing linkedDirs only sends { metadata: { kind, linkedDirs } }) silently detached imported projects from their folder root. Subsequent reads/writes/watch/deploy fell back to .od/projects/<id>. Re-stamp the immutable folder-import fields (baseDir, importedFrom='folder') from the existing project record onto the incoming patch when the project is imported. A patch that supplies a different baseDir still gets rejected as before; a patch that supplies the same baseDir is accepted as a no-op. A patch on a non-imported project that tries to set baseDir is also still rejected (preserves the POST /api/projects guard from the previous round). Tests: - archive endpoint: ?root=<symlink-to-/etc> → 400. - patch endpoint: PATCH that omits baseDir on an imported project keeps baseDir intact (project still resolves to the user's folder after). Refs nexu-io/open-design#597, #624 (Codex P1 round 2) * fix(web): add Indonesian deploy provider copy --------- Co-authored-by: INFINITY <valentyn.sotov@trendarena.app> Co-authored-by: Siri-Ray <2667192167@qq.com>	2026-05-07 20:43:31 +08:00
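The PATCH rule from review round 2 (re-stamp the immutable folder-import fields, reject a changed `baseDir`) might look something like the sketch below. The `ProjectMetadata` shape is simplified and `mergePatchedMetadata` is a hypothetical name; the real `updateProject()` code is not shown in the commit message, and the thrown error stands in for the 400 response the route would return.

```typescript
// Hypothetical, simplified shape; the real ProjectMetadata has more fields.
interface ProjectMetadata {
  kind?: string;
  linkedDirs?: string[];
  baseDir?: string;
  importedFrom?: string;
}

// Hypothetical helper: compute the metadata to store for a PATCH, given that
// metadata is replaced wholesale on update.
function mergePatchedMetadata(
  existing: ProjectMetadata,
  patch: ProjectMetadata,
): ProjectMetadata {
  const imported =
    existing.importedFrom === "folder" && typeof existing.baseDir === "string";

  if (patch.baseDir !== undefined) {
    // A non-imported project may never gain a baseDir, and an imported
    // project may not move to a different one. Supplying the same baseDir
    // is accepted as a no-op.
    if (!imported || patch.baseDir !== existing.baseDir) {
      throw new Error("baseDir cannot be set or changed via PATCH");
    }
  }

  if (!imported) return { ...patch };

  // Re-stamp the immutable fields so a patch that omits them does not
  // detach the project from its folder root.
  return { ...patch, baseDir: existing.baseDir, importedFrom: "folder" };
}
```

With this shape, a UI that sends only `{ kind, linkedDirs }` keeps the project rooted in the imported folder, which is the regression the commit's patch-endpoint test pins down.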
kami	09eb88f683	Add Cloudflare Pages artifact deployment Adds Cloudflare Pages artifact deployment support.	2026-05-07 20:04:22 +08:00
Tom	8630fd380a	feat(daemon): close pi adapter parity gaps Closes pi adapter parity gaps for image paths, extra allowed dirs, error events, and sendAgentEvent routing.	2026-05-07 20:03:46 +08:00
PerishFire	cb92c93ae0	Migrate beta release publishing to R2 (#805 ) * Prebundle standalone web packaged runtime * Harden mac standalone prebundle policy * Prebundle mac daemon packaged runtime * Prune mac Electron locales * Maximize mac release artifact compression * Publish beta mac artifacts to R2 * Use remote R2 uploads for beta releases * Fail fast on beta R2 access issues * Use S3-compatible uploads for beta R2 releases * Decouple beta versioning from GitHub releases * Remove legacy beta metadata source * Address release beta review notes	2026-05-07 19:13:52 +08:00
xuncha	a8418ac730	Fix Windows link code folder dialog (#698 ) * Fix Windows link code folder dialog * Add Windows folder dialog coverage * Complete Indonesian locale copy	2026-05-07 17:27:01 +08:00
PerishFire	6efac8887e	Improve Windows beta packaging and installer flow (#768 ) * Optimize Windows packaged web output * Fix packaged contracts runtime build * Optimize Windows packaged size pruning * Prune Windows root Next payload * Remove Windows bundled Node runtime * Prune Windows standalone duplicate Next * Add tools-pack cache foundation * Cache Windows packaged build layers * Cache Windows workspace builds * Cache Electron-ready Windows app * Split Windows tools-pack module * Cache Windows dir build outputs * Split Windows pack build modules * Document Windows NSIS smoke namespace limits * Move Windows NSIS smoke note to agents guide * Optimize Windows beta packaging * Bump packaged beta base version * Improve Windows installer namespace UX * Improve Windows tools-pack cache keys * Stabilize Windows beta cache version keys * Cache Windows workspace build outputs * Optimize windows release beta cache layers * Cache windows release dependencies * Trim windows release cache before save * Refresh windows tools-pack cache key * Improve windows installer preflight prompts * Fallback NSIS installer strings to English * Fix Windows installer cleanup and preflight * Improve Windows NSIS state logging * Fix system NSIS Persian language alias * Use long-path removal for Windows uninstall * Fix mac tools-pack tests on Windows * Address Windows packaging review feedback * Fix Windows installer cache namespace isolation * Include web output mode in Windows tarball cache key * Use unique Windows release cache save keys	2026-05-07 16:44:15 +08:00
Nagendhra Madishetti	bb2015766a	feat: Critique Theater Phase 5 (panel prompt template + system composer wiring)	2026-05-07 16:35:04 +08:00
JHR	c00f89dbe4	fix(daemon): allow portless Origin in CORS whitelist for Chrome compatibility	2026-05-07 16:29:06 +08:00
Nagendhra Madishetti	25a3ffd298	fix(daemon): add legacy data dir migrator Add a one-shot OD_LEGACY_DATA_DIR migrator so packaged Desktop users can recover 0.3.x repo .od data into the 0.4.x data root. The migrator stages payloads before promotion, refuses unsafe merges and symlinks, rolls back failed promotion or marker writes, and extends packaged daemon startup handling for long migrations while failing fast on daemon exits. Closes #710	2026-05-07 15:19:04 +08:00
shangxinyu1	9b501f12a5	Support overriding the Codex executable path (#755 ) * Support overriding the Codex executable path * Replace save-as-template prompts with an in-app dialog * Seed local packaged app config from workspace * Fix packaged config and connection test overrides * Keep tools-pack mac config seeding self-contained * Require absolute CODEX_BIN overrides	2026-05-07 15:00:52 +08:00
monshunter	e6e5928be1	feat(web): add connection tests for execution settings (#507 ) * feat(settings): add connection test for providers and CLI agents Adds a "Test" action in the Settings dialog that verifies the configured provider (Anthropic/OpenAI/Azure/Google) or CLI agent without sending a real chat. Backed by a new daemon endpoint and shared contracts, with categorized inline statuses and i18n strings across all supported locales. * fix(settings): address connection test review feedback * fix(daemon): pass empty MCP servers for connection probes * fix(connection-test): address review blockers * fix(daemon): fail json stream runs on structured errors * fix(contracts): build connection test subpath export * Use draft CLI env in agent connection tests * fix(i18n): add fallback ids for new curated content	2026-05-07 11:25:37 +08:00
Nagendhra Madishetti	832ea7d864	fix: batch of small bug fixes (#283 , #275 , #390 ) (#530 ) * fix(web): add hover tooltips to Design Files action buttons (#283) The batch-download, select-all, and clear-selection buttons in DesignFilesPanel had no title attribute, so users hovering them saw no tooltip. The other action buttons (refresh, new sketch, paste, upload) already had titles. Added titles to the three missing ones using the existing translation keys, so hover behavior is consistent across the panel. Closes #283. * docs: point pi-ai links to pi-mono packages (#275) The pi project moved from a standalone repo to the pi-mono monorepo. The old URL https://github.com/mariozechner/pi-ai now 404s. Replaced both shapes of reference: - The reference-style [piai]: definition now points at https://github.com/badlogic/pi-mono/tree/main/packages/ai (the multi-provider LLM API package). - Inline links whose visible text is the CLI tool 'pi' or 'Pi' now point at https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent (the interactive coding-agent CLI), so a reader clicking 'pi' in the daemon-discovery section lands on the actual binary's docs. Affected: README.md and 10 translated READMEs, plus docs/spec.md, docs/architecture.md, docs/references.md, docs/roadmap.md. Closes #275. * fix(daemon): expand $HOME / ${HOME} in OD_DATA_DIR (#390) Some launchers (systemd unit files, NixOS modules, certain Docker entrypoints) pass OD_DATA_DIR with a literal '$HOME' or '${HOME}' because no shell ever expands them. resolveDataDir previously only handled '~/' shorthand, so '$HOME/.open-design' fell through to path.resolve(PROJECT_ROOT, '$HOME/.open-design') and produced paths like /opt/open-design/$HOME/.open-design. resolveDataDir now expands '~', '~/...', '$HOME', '$HOME/...', '${HOME}', and '${HOME}/...' to os.homedir() before the absolute / relative branch runs. 
Rebuilds via path.join so the platform separator is correct on Windows even when the input used forward slashes. Tests: 7 unit tests cover empty/undefined, '~', '~/...', '$HOME', '$HOME/...', '${HOME}/...', absolute paths, and relative paths. Closes #390. * fix(daemon): accept backslash separators + hermetic resolve-data-dir tests Round 1 review feedback on PR #530. The previous regex only matched forward-slash separators, so a Windows launcher passing OD_DATA_DIR=$HOME\.open-design or ${HOME}\.open-design fell through to path.resolve(projectRoot, ...) and produced a directory named $HOME or ~ under projectRoot. The regex now accepts both forward and back slashes for the home-prefix separator. The previous tests called the real resolveDataDir against literal ~/od-test, $HOME/od-test, etc., which created and write-checked directories under the developer's or CI runner's actual home. The tests now stub os.homedir() with vi.spyOn to a per-test mkdtemp directory and remove it in afterEach, so no test ever writes outside its own sandbox. Added explicit fixtures for the Windows backslash forms ($HOME\od-test, ${HOME}\od-test, ~\od-test) so launcher coverage stays cross-platform. 12/12 resolve-data-dir tests pass, daemon typecheck clean. * fix(docs,daemon): apply pi-mono links to README.es and await test cleanup Round 2 review feedback on PR #530. README.es.md was added in upstream #552 after my pi-mono link sweep landed, so the daemon-discovery paragraph (line 222), the [piai] reference (line 684), and the Pi table row (line 709) still pointed at the broken https://github.com/mariozechner/pi-ai URL. Applied the same replacements: the [piai] ref now points at packages/ai, and the inline Pi link now points at packages/coding-agent. Spanish readers get the same coverage as the other 11 locales. The absolute-path test in tests/resolve-data-dir.test.ts dropped its fixture via void rm(abs, ...), so a failed async removal could leak rdd-abs-* directories from the suite. 
The test is now async and awaits the rm in the finally block, matching the awaited cleanup in afterEach. 12/12 resolve-data-dir tests still pass, daemon typecheck clean. * fix(daemon): share $HOME expander between OD_DATA_DIR and OD_MEDIA_CONFIG_DIR Round 3 review feedback on PR #530. resolveDataDir (server.ts) now expands $HOME / ${HOME} / ~, but media-config.ts had its own resolveOverrideDir that only handled ~/. Because configFile() falls back to OD_DATA_DIR when OD_MEDIA_CONFIG_DIR is unset, setting OD_DATA_DIR=$HOME/.open-design split state: SQLite, projects, and artifacts went to the expanded path while media-config.json stayed under <projectRoot>/$HOME/.open-design. Stored provider keys then appeared missing on the next read. Extracted the home-prefix expansion into apps/daemon/src/home-expansion.ts so resolveDataDir and resolveOverrideDir share one resolver. Both now recognize ~ / $HOME / ${HOME} (bare tokens) and ~/, ~\, $HOME/, $HOME\, ${HOME}/, ${HOME}\ (prefix forms with either separator). Three new media-config routing tests cover the OD_DATA_DIR fallback for $HOME/..., ${HOME}/..., and the OD_MEDIA_CONFIG_DIR explicit-override $HOME/... case so the co-location guarantee is locked down by tests. Daemon typecheck clean. Tests pass on Linux CI; the existing pattern in the file uses process.env.HOME which os.homedir() reads on POSIX. Resolve-data-dir tests stay hermetic via vi.spyOn. * docs(daemon): media-config comments reflect full $HOME / ${HOME} expansion Round 3 review feedback on PR #530 (lefarcen, P3 non-blocking). The file-header and resolveOverrideDir() function comment said only ~/ expands. Updated both to mention the shared expandHomePrefix() helper and the full set of forms it handles (~, $HOME, ${HOME} with either separator), so a future reader does not need to chase the implementation to understand what env values are accepted. * test(daemon): stub os.homedir() in media-config routing tests Round 4 review feedback on PR #530. 
The new $HOME / ${HOME} routing tests relied on process.env.HOME being read by os.homedir(), which works on POSIX but is unreliable on Windows (Node prefers USERPROFILE / profile APIs there). The tests would expand to the real user home while fixtures were written under the per-test homeDir, causing platform-specific failures in the same area this PR is making cross-platform. The inner describe block now stubs os.homedir() via vi.spyOn to return the per-test homeDir, matching the pattern in resolve-data-dir.test.ts. Restored in afterEach. The four $HOME-form routing tests now pass on both POSIX and Windows. Daemon typecheck clean. The two OAuth fallback test failures unrelated to this change (real ChatGPT/Codex tokens in the local env) remain out-of-scope. * fix(i18n): drop duplicate uk.ts promptTemplates keys after rebase Upstream #674 added the same Ukrainian translations my earlier commit added. The rebase landed both copies; tsc rejects duplicate property names. Drop my copies so #674 (which is now upstream) is the single source for these keys. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-07 11:17:02 +08:00
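As a rough illustration of the shared home-prefix expansion described in this commit, the sketch below handles the bare tokens and the prefix forms with either separator. It is not the contents of `home-expansion.ts`: the signature is an assumption, and `home` and `sep` are passed in as parameters here so the example stays pure, whereas the commit describes using `os.homedir()` and `path.join`.

```typescript
// Hypothetical sketch of a home-prefix expander. Recognizes ~, $HOME and
// ${HOME} as bare tokens, or followed by "/" or "\" and a remainder.
function expandHomePrefix(input: string, home: string, sep: string): string {
  const match = /^(?:~|\$HOME|\$\{HOME\})(?:[\\/](.*))?$/.exec(input);
  // Not a home-prefixed value: leave it for the absolute / relative branch.
  if (!match) return input;

  // Split the remainder on either separator and drop empty segments, so the
  // result uses the target platform's separator throughout.
  const rest = (match[1] ?? "")
    .split(/[\\/]+/)
    .filter((part) => part.length > 0);

  return [home, ...rest].join(sep);
}
```

Requiring a separator or end of string after the token keeps values such as `$HOMEDIR/x` from being mistaken for a `$HOME` prefix.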
Sid	1bd1f3a661	fix(daemon): surface OpenCode error frames + treat empty-output runs as failed (#700 ) * fix(daemon): surface OpenCode error frames + treat empty-output runs as failed Closes #691. OpenCode runs would silently complete in ~3 seconds without producing any visible chat output and still be rendered as a successful turn — three independent bugs along the structured-stream path conspired to produce this silent-failure shape. ## Bug 1 — `apps/daemon/src/json-event-stream.ts:85-91` OpenCode emits structured error frames on stdout (e.g. provider auth failures, network errors, schema mismatches) and still exits 0. The parser was downgrading these to `{type: 'raw', line: ...}`, which the chat UI does not render as an assistant message. The error string was discarded as "no-op output." Fix: emit a proper `{type: 'error', message, raw}` event matching the qoder-stream contract that the daemon's existing error-handling path already recognises. ## Bug 2 — `apps/daemon/src/server.ts:4199-4205` Even after Bug 1 was fixed, the json-event-stream branch wired the parser to a bare `(ev) => send('agent', ev)` lambda — bypassing the `sendAgentEvent` wrapper that interprets `type:'error'` events and sets the `agentStreamError` flag the close handler reads to flip the run to `failed`. So an emitted `error` event would just be forwarded as a no-op `agent` SSE event with no lifecycle effect. Fix: route json-event-stream through `sendAgentEvent`, mirroring the qoder-stream-json wiring at line 4175. ## Bug 3 — `apps/daemon/src/server.ts:4220-4234` Even after Bugs 1 and 2 are fixed, there's still a class of runs where OpenCode never emits any error frame, never emits any substantive event, and exits 0. Pre-fix this was marked `succeeded` and the user saw a blank chat with no diagnostic. 
Fix: track `agentProducedOutput` inside `sendAgentEvent` (set on `text_delta`, `thinking_delta`, `tool_use`, `tool_result`, `artifact` — deliberately NOT on `status` / `usage`, since a model can emit token-usage numbers for an empty completion). When the close handler sees `code === 0 && trackingSubstantiveOutput && !agentProducedOutput` the run is marked `failed` with an explicit AGENT_EXECUTION_FAILED SSE error so the chat shows a clear reason instead of a silent empty turn. The check is gated by `trackingSubstantiveOutput` so it only fires on streams that actually contribute to the output flag (currently qoder-stream-json and json-event-stream). ACP sessions and plain stdout streams keep their existing success/failure determination. ## Tests - 3 new unit tests in `apps/daemon/tests/json-event-stream.test.ts` pin the OpenCode error event shape: full repro (`error.data.message`), `error.name` fallback, and the generic-fallback shape when `error` is empty. - All 60 daemon test files (851 tests) pass on `pnpm --filter @open-design/daemon test`. All 42 web test files (309 tests) pass on `pnpm --filter @open-design/web test`. - Full repo `pnpm typecheck` clean. ## Live verification Verified end-to-end via a stub `opencode` binary that mimics each of the failure shapes against `pnpm tools-dev run web`: 1. Stub emits `{"type":"error",...}` then `exit 0` — run now ends as `failed` with the OpenCode error message surfaced as an SSE `error` event. Pre-fix this was `succeeded` with an empty chat. 2. Stub emits nothing then `exit 0` — run now ends as `failed` with "Agent completed without producing any output…" diagnostic. Pre-fix this was `succeeded` with an empty chat. 3. Stub emits a normal `step_start` / `text` / `step_finish` sequence then `exit 0` — run still succeeds. (Regression check.) 
## Out of scope (mentioned for the next person) - `claude-stream-json` and `copilot-stream-json` still wire to a bare `(ev) => send('agent', ev)` and don't currently parse `type:'error'` frames. If their CLIs ever start emitting structured error events the same pattern (route through `sendAgentEvent` + emit proper `type:'error'`) applies. Not in scope here because we have no evidence those CLIs do this today, and changing the wiring without a confirmed failure mode risks regressing currently-working flows. - ACP sessions (`pi-rpc`, `acp-json-rpc`) own their own success / failure determination via `acpSession?.hasFatalError()` and the empty-output guard explicitly skips them via `trackingSubstantiveOutput`. - Plain stdout streams have no event-level tracking, so the empty- output guard skips them too. Diagnosing a no-output plain-stream agent is a separate problem that needs different signals. * chore: retrigger CI on top of green main (post #697 i18n backfill)	2026-05-07 02:00:19 +08:00
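The empty-output guard from Bug 3 can be sketched as a small state tracker plus a close-time decision. The flag names `agentStreamError`, `agentProducedOutput`, and `trackingSubstantiveOutput` come from the commit message; `RunTracker`, `recordAgentEvent`, and `finalRunStatus` are hypothetical names, and treating any non-zero exit code as a failure is an assumption made to keep the example complete.

```typescript
type AgentEventType =
  | "text_delta" | "thinking_delta" | "tool_use" | "tool_result"
  | "artifact" | "status" | "usage" | "error";

// Event types that count as substantive output. `status` and `usage` are
// deliberately excluded: a model can report token usage for an empty completion.
const SUBSTANTIVE_EVENTS = new Set<AgentEventType>([
  "text_delta", "thinking_delta", "tool_use", "tool_result", "artifact",
]);

// Hypothetical per-run state, updated as events pass through the wrapper.
interface RunTracker {
  agentStreamError: boolean;
  agentProducedOutput: boolean;
}

// Hypothetical stand-in for the bookkeeping done inside sendAgentEvent.
function recordAgentEvent(tracker: RunTracker, type: AgentEventType): void {
  if (type === "error") tracker.agentStreamError = true;
  if (SUBSTANTIVE_EVENTS.has(type)) tracker.agentProducedOutput = true;
}

// Hypothetical close-handler decision.
function finalRunStatus(
  code: number,
  trackingSubstantiveOutput: boolean,
  tracker: RunTracker,
): "succeeded" | "failed" {
  // Assumption: a non-zero exit or a structured error frame fails the run.
  if (code !== 0 || tracker.agentStreamError) return "failed";
  // Exit 0 with no substantive output is a failure, but only for streams
  // that actually feed the output flag.
  if (trackingSubstantiveOutput && !tracker.agentProducedOutput) return "failed";
  return "succeeded";
}
```

Gating on `trackingSubstantiveOutput` is what lets ACP sessions and plain stdout streams keep their existing success and failure determination, as the commit notes.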
mamba	570d06419c	feat[qoder cli] add Qoder CLI agent support (#626 ) * chore(agent): add support for and detection of Qoder CLI - Add Qoder CLI to the QUICKSTART doc as an optional local agent CLI - Update the agents.ts code comments to cover Qoder CLI scanning - Add Qoder CLI to the list of available CLIs detected on first load - Add Qoder CLI support and the related badge counts to the README in multiple languages - Update code comments and docs for agent adapters and event parsing, including the qoder-stream-json parser - Adjust spawn behavior on Windows so the prompt can be supplied to Qoder CLI via stdin - Fix incorrect counts of supported CLIs in the multilingual docs so the numbers stay in sync Change-Id: I388f2f61c60ce8faa7cef5d84eb407950f8bdbfb Co-developed-by: Qoder <noreply@qoder.com> * chore(agent): add support for and detection of Qoder CLI - Add Qoder CLI to the QUICKSTART doc as an optional local agent CLI - Update the agents.ts code comments to cover Qoder CLI scanning - Add Qoder CLI to the list of available CLIs detected on first load - Add Qoder CLI support and the related badge counts to the README in multiple languages - Update code comments and docs for agent adapters and event parsing, including the qoder-stream-json parser - Adjust spawn behavior on Windows so the prompt can be supplied to Qoder CLI via stdin - Fix incorrect counts of supported CLIs in the multilingual docs so the numbers stay in sync Change-Id: Id33f125b7c0b1a1c0b0274073da74d1578c324f7 Co-developed-by: Qoder <noreply@qoder.com> * feat(agent-icon): add a new Qoder logo SVG glyph component - Add a qoderGlyph function that returns an SVG graphic at the requested size - The graphic is defined by multiple paths, filled in dark gray and green - The component can replace or complement the existing AgentIcon icons - Improves the application's brand identity and visual presentation Change-Id: I4eca18166b5e33bc6229b40b2531d5a54607a560 Co-developed-by: Qoder <noreply@qoder.com> * Translate to English: --- docs(readme): update to expand CLI agents to 16 - Increased the number of coding agent CLIs from 11 to 16 - New agents included: Devin for Terminal, Kiro CLI, Kilo, Mistral Vibe CLI, DeepSeek TUI docs(readme): update to expand supported coding agents to 16 - Increased the number of supported code agent CLIs from 11 to 16 - Added support for new CLI tools: Devin for Terminal, Kiro CLI, Kilo, Mistral Vibe CLI, DeepSeek CLI - Added automatic CLI detection and switching while maintaining support for more agents - Added BYOK proxy TUI - Expanded compatibility and support coverage in the README’s multiple language versions - Reflected changes across all README translations (Arabic, German, French, Japanese, Korean) - Updated badges and descriptions to reflect CLI count and 
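feature changes - Added event parsers and protocols for the new CLIs in the agent transport implementation - Updated the BYOK proxy and tool exploration features to be compatible with the expanded CLIs Change-Id: I89786b4a0b09bd279fb23265c2177076206fc5af Co-developed-by: Qoder <noreply@qoder.com> * feat(daemon): support passing imagePaths to Qoder as attachment paths - Update buildArgs to add --attachment arguments for absolute paths in imagePaths - Filter out and ignore imagePaths entries that are empty strings, non-strings, or relative paths - Cover imagePaths support and invalid-entry filtering in unit tests - Document the --attachment argument for the Qoder runtime adapter Change-Id: Ibfc3583ba86c6d258d524912559e97b77bf1dc87 Co-developed-by: Qoder <noreply@qoder.com> * docs(runtime): document that the Qoder adapter inherits the user-token environment variable - Document that agent detection is only an availability probe and performs no authentication - Explain that Qoder CLI account state is independent and that authentication failures surface through the runtime error path - Describe in detail how the child-process environment is inherited and how static environment variables differ from private user tokens - Clarify that QODER_PERSONAL_ACCESS_TOKEN is passed through the daemon environment and is not written to the static environment - Explain that Qoder authentication is handled by Qoder CLI, which supports persistent login and environment-variable injection for automation test(agent): add QODER_PERSONAL_ACCESS_TOKEN environment inheritance tests - Verify that the qoder adapter environment inherits QODER_PERSONAL_ACCESS_TOKEN from the daemon - Confirm that the qoder adapter does not define the user token in its static environment variables - Ensure private user tokens are never written to the static adapter environment configuration Change-Id: Ie61869afbe497df1b16879b4e47b35123f758ed8 Co-developed-by: Qoder <noreply@qoder.com> * fix(daemon): improve Qoder mode support and error handling - Update Qoder CLI arguments to use `--yolo` instead of `--permission-mode bypass_permissions` - Change the working-directory argument from `--cwd` to `-w` to match the Qoder docs - Capture errors during agent stream event handling and send them as SSE error events - Mark the run as failed when an agent stream error is detected at the end of the run - Update Qoder argument usage and the corresponding assertions in tests - fix(daemon): optimize Qoder agent arguments and error handling - Adjust Qoder launch arguments to use `--yolo` and `-w` in place of the old arguments, avoiding argv length limits - Strengthen agent stream event handling, capturing errors and sending them through the SSE error channel - Add end-to-end tests covering Qoder assistant errors reported through the SSE error channel and the failed run status handling - Add helper functions to assist tests with reading the event stream and polling run status Change-Id: I5d933745c3659e093b0d2d807f22726e7f83eb48 Co-developed-by: Qoder <noreply@qoder.com> * feat(qoder-stream): detect and report Qoder run error events - Add a messageFromResult function to extract the error message from the result object - Emit an error event based on the is_error field when handling result events - The error event carries the specific error message and the raw data - Add a test verifying that an error event is emitted when a Qoder run returns is_error with exit code 0 - Update the qoder stream parsing tests to check the error event mapping - Add an end-to-end scenario for failed Qoder runs to the chat route tests Change-Id: Ie98ac518135dbec3181c52de5a49afdea993e279 Co-developed-by: Qoder <noreply@qoder.com>	2026-05-06 19:54:03 +08:00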
Deheng Huang	09b78c2f9b	feat(daemon): let Codex image projects use built-in imagegen (#622 ) * feat(daemon): let Codex image projects avoid API-key setup Codex has a built-in image generation path available inside the agent runtime, while the generic media dispatcher still routes gpt-image models through the daemon OpenAI provider. Pass the active agent id into prompt composition so Codex-only gpt-image projects can use built-in imagegen first without changing non-Codex media behavior. Constraint: Existing media contract remains the default path for non-Codex agents and explicit provider fallback Rejected: Add a nested daemon Codex media provider \| heavier auth, streaming, timeout, cancellation, and output parsing surface for this parity fix Confidence: high Scope-risk: narrow Directive: Keep this override after the media contract so it can intentionally supersede dispatcher-only wording for Codex gpt-image projects Tested: pnpm --dir apps/daemon exec vitest run -c vitest.config.ts tests/system-prompt-template.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI * fix(daemon): harden Codex imagegen prompt routing PR review found the Codex override could be superseded by the web-supplied media contract, trusted unvalidated image model metadata, and assumed generated image paths outside the workspace were readable. This keeps the override daemon-owned, appends it last in the live prompt, validates against registered gpt-image model IDs, allowlists only Codex's generated_images folder, and tightens copy-failure instructions. Constraint: The web contracts composer still emits the generic media contract without agent identity. Rejected: Mirror Codex-specific prompt logic into contracts/web \| duplicates daemon model registry and still leaves final ordering fragile. 
Confidence: high Scope-risk: narrow Directive: Keep Codex imagegen override appended after client systemPrompt so it remains the final media instruction for Codex gpt-image projects. Tested: pnpm --dir apps/daemon exec vitest run -c vitest.config.ts tests/system-prompt-template.test.ts tests/agents.test.ts tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI * fix(daemon): keep Codex add-dir writable scope narrow PR review found Codex --add-dir grants writable workspace access, so passing skill, design-system, and linked reference directories through the same chat allowlist broke their documented read-only boundary. This routes chat extra directories by active agent: Codex receives only the validated generated_images output directory needed for built-in imagegen, while non-Codex adapters keep the existing resource and linked-directory read access behavior. Constraint: Codex CLI treats --add-dir as writable sandbox expansion. Constraint: The daemon still stages active skill files into the project cwd as Codex's read-safe path. Rejected: Keep one shared extraAllowedDirs list for all agents | grants Codex write access to read-only resources. Confidence: high Scope-risk: narrow Directive: Do not add read-only resource/reference directories to Codex --add-dir unless Codex gains a read-only allowlist flag. Tested: git diff --check -- apps/daemon/src/server.ts apps/daemon/tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon exec vitest run tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI * fix(daemon): validate Codex imagegen add-dir grants PR review found the generated_images grant still trusted symlinked paths and rendered the Codex override before proving the sandbox grant would be present.
This validates the generated_images directory before prompt assembly, rejects final-component symlinks and protected-root canonical escapes, passes Codex the canonical grant path, and only appends the Codex imagegen override when that same path is in extraAllowedDirs. Constraint: Codex --add-dir grants writable workspace access, so path aliases into read-only resource roots must be rejected. Rejected: Keep returning the nominal CODEX_HOME path after validation | leaves Codex operating through a symlink alias instead of the audited grant target. Confidence: high Scope-risk: narrow Directive: Keep Codex imagegen prompt rendering downstream of generated_images validation and grant resolution. Tested: git diff --check -- apps/daemon/src/server.ts apps/daemon/tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon exec vitest run -c vitest.config.ts tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon exec vitest run -c vitest.config.ts tests/agents.test.ts tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI	2026-05-06 18:28:16 +08:00
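The routing described in the entry above can be sketched as two small pure functions: Codex receives only the validated generated_images directory because its `--add-dir` is writable, other agents keep the resource and linked directories, and the imagegen override is appended only when that same canonical path is in the grant list. This is an illustrative sketch under assumptions: the `ChatDirs` shape, both function names, and the `"codex"` agent id string are hypothetical, and the gpt-image model check mentioned in the commit is omitted.

```typescript
// Hypothetical input shape; the real daemon types are not shown in the log.
interface ChatDirs {
  resourceDirs: string[]; // skill and design-system dirs, read-only by intent
  linkedDirs: string[]; // user-linked reference folders
  generatedImagesDir: string | null; // canonical path, or null if validation failed
}

// Codex treats --add-dir as a writable grant, so it gets only the
// validated image output directory; other agents keep read access.
function extraAllowedDirsFor(agentId: string, dirs: ChatDirs): string[] {
  if (agentId === "codex") {
    return dirs.generatedImagesDir === null ? [] : [dirs.generatedImagesDir];
  }
  return [...dirs.resourceDirs, ...dirs.linkedDirs];
}

// Render the imagegen override only after the grant is proven present.
function shouldAppendImagegenOverride(
  agentId: string,
  dirs: ChatDirs,
  extraAllowedDirs: string[],
): boolean {
  return (
    agentId === "codex" &&
    dirs.generatedImagesDir !== null &&
    extraAllowedDirs.includes(dirs.generatedImagesDir)
  );
}
```

Deriving the override decision from the final grant list, rather than from the validation result alone, keeps prompt rendering downstream of grant resolution as the Directive line asks.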
Sid	33255a8fdf	Fix agent CLI config and workspace focus mode (#604 ) * fix agent CLI config and workspace focus mode * address CLI env review follow-ups	2026-05-06 16:06:56 +08:00
Tom	5df04c29a3	feat(daemon): add model name to pi initial status and RPC abort on cancel (#618 ) * feat(daemon): add model name to pi initial status and RPC abort on cancel - Emit status:initializing with model name before pi responds so the UI shows 'pi · claude-sonnet-4-5' — matching Claude Code, Copilot, Gemini, and Cursor Agent model-name parity - Replace raw SIGTERM with RPC abort command on cancel, giving pi a chance to clean up gracefully before SIGTERM fallback - Wire run.acpSession onto the run object so cancel() can dispatch to session.abort() for pi and ACP adapters - Add stdinOpen guard so sendCommand is a no-op after stdin closes - Add 4 tests covering initializing status, abort wire format, and stdin-closed guard * fix(daemon): gate stdout parser after abort to prevent post-cancel events Once abort() sets finished=true, the stdout listener kept feeding chunks into mapPiRpcEvent, so text_delta/tool/status events could still be emitted during the PI_ABORT_GRACE_MS window. Add a finished guard at the top of the parser callback so no agent events are forwarded after abort, while still draining stdout cleanly. Adds a test that aborts mid-session, then feeds message_update and tool events, proving zero post-abort agent events are emitted. * refactor(daemon): own SIGTERM fallback in cancel, rewrite abort tests as integration - Move SIGTERM fallback from pi-rpc abort() to runs cancel() so the termination guarantee is centralized — a misbehaving session can't leave the child alive indefinitely (address lefarcen P3 on L130) - Remove the setTimeout/SIGTERM from abort(); it now only sends the RPC abort command, termination is the caller's responsibility - Rewrite initial-status and abort tests as integration tests that exercise attachPiRpcSession against mock child processes instead of duplicating private sendCommand/send helpers inline (address lefarcen P3 on L453 and L491) - All 28 tests pass	2026-05-06 12:20:40 +08:00
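The three guards described in the entry above (an RPC abort instead of a raw signal, a `stdinOpen` check in `sendCommand`, and a `finished` gate on the stdout parser) can be sketched with injected I/O so the behavior is easy to test. This is a rough illustration only: the class name, the `{"type":"abort"}` wire format, and the newline-delimited JSON framing are assumptions, and the SIGTERM fallback is deliberately left to the caller, as the refactor commit describes.

```typescript
type AgentEvent = { type: string; [key: string]: unknown };

class PiSessionSketch {
  private finished = false;
  private stdinOpen = true;
  private readonly write: (line: string) => void;
  private readonly emit: (event: AgentEvent) => void;

  constructor(write: (line: string) => void, emit: (event: AgentEvent) => void) {
    this.write = write;
    this.emit = emit;
  }

  markStdinClosed(): void {
    this.stdinOpen = false;
  }

  // No-op once stdin has closed; returns whether anything was written.
  sendCommand(command: Record<string, unknown>): boolean {
    if (!this.stdinOpen) return false;
    this.write(JSON.stringify(command) + "\n");
    return true;
  }

  // Only sends the RPC abort; terminating the child is the caller's job.
  abort(): void {
    this.finished = true;
    this.sendCommand({ type: "abort" }); // assumed wire format
  }

  // Keep draining stdout after abort, but forward no further events.
  onStdoutLine(line: string): void {
    if (this.finished) return;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      return; // ignore non-JSON noise in this sketch
    }
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as AgentEvent).type === "string"
    ) {
      this.emit(parsed as AgentEvent);
    }
  }
}
```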
Martin Atrin	79fcaef129	Add Tweaks mode for HTML previews with picker, pod selection, and batched chat attachments (#513 ) * Add tweaks mode for HTML preview comments * Fix tweaks geometry and restore critique migration * Harden tweaks mode reload sync * Guard tweaks batch sends during active runs --------- Co-authored-by: puma <puma@pumas-MacBook-Air.local>	2026-05-05 21:09:20 +08:00
Marc Chan	c3d9136a0c	Add live artifacts and Composio connector catalog (#381 ) * docs: add live artifacts implementation spec * docs: align live artifacts implementation plan * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 7: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 10: work in progress * Ralph iteration 11: work in progress * Ralph iteration 12: work in progress * Ralph iteration 13: work in progress * Ralph iteration 14: work in progress * Ralph iteration 15: work in progress * Ralph iteration 16: work in progress * Ralph iteration 17: work in progress * Ralph iteration 18: work in progress * Ralph iteration 19: work in progress * Ralph iteration 20: work in progress * Ralph iteration 21: work in progress * Ralph iteration 22: work in progress * Ralph iteration 23: work in progress * Ralph iteration 24: work in progress * Ralph iteration 25: work in progress * Ralph iteration 26: work in progress * Ralph iteration 27: work in progress * Ralph iteration 28: work in progress * Ralph iteration 29: work in progress * Ralph iteration 30: work in progress * Ralph iteration 31: work in progress * Ralph iteration 32: work in progress * Ralph iteration 33: work in progress * Ralph iteration 34: work in progress * Ralph iteration 35: work in progress * Ralph iteration 36: work in progress * Ralph iteration 37: work in progress * Ralph iteration 38: work in progress * Ralph iteration 39: work in progress * Ralph iteration 40: work in progress * Ralph iteration 41: work in progress * Ralph iteration 42: work in progress * Ralph iteration 43: work in progress * Ralph iteration 44: work in progress * Ralph iteration 45: work in progress * Ralph iteration 46: work in progress * Ralph iteration 47: work in progress 
* Ralph iteration 48: work in progress * Ralph iteration 49: work in progress * Ralph iteration 50: work in progress * Ralph iteration 51: work in progress * Ralph iteration 52: work in progress * Ralph iteration 53: work in progress * Ralph iteration 54: work in progress * Ralph iteration 55: work in progress * Ralph iteration 56: work in progress * Ralph iteration 57: work in progress * Ralph iteration 58: work in progress * Ralph iteration 59: work in progress * Ralph iteration 60: work in progress * Ralph iteration 61: work in progress * Ralph iteration 62: work in progress * Ralph iteration 63: work in progress * Ralph iteration 64: work in progress * Ralph iteration 65: work in progress * Ralph iteration 1: work in progress * Ralph iteration 2: work in progress * Ralph iteration 3: work in progress * Ralph iteration 4: work in progress * Ralph iteration 5: work in progress * Ralph iteration 6: work in progress * Ralph iteration 8: work in progress * Ralph iteration 9: work in progress * Ralph iteration 17: work in progress * Add Composio-backed connectors * Add Composio-backed connector catalog * Fix connector callback flow * Update live artifact connector refresh * Fix live artifact refresh updates * Improve live artifact viewer toolbar * Refine live artifact source tabs * Expand Composio connector catalog * Improve Composio connector browsing * Fix artifact refresh source safety checks Generated-By: looper 0.4.1 (runner=fixer, agent=opencode) * Fix live artifacts PR feedback Generated-By: looper 0.5.0 (runner=fixer, agent=opencode) * Fix live artifact preview CORS validation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix connector OAuth IPv6 loopback hosts Allow bracketed IPv6 loopback Host headers when deriving connector OAuth callback URLs so IPv6-bound daemons can complete connection flow. 
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Preserve live artifact refresh permissions Respect explicit refresh permission choices during live artifact create and update flows so revoked connector sources remain gated. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact preview cache freshness Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact refresh validation Guard manual refreshes with local daemon checks and reject daemon_tool sources without a toolName before refresh execution. Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix Composio credential invalidation Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix live artifact CORS methods Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * Fix workspace validation Restore media config test isolation under Vitest setup data-dir overrides and add the missing French live artifact display copy so the workspace test suite stays aligned.\n\nGenerated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector safety filtering Keep agent-preview connector listings aligned with execution safety policy and prune stale Composio OAuth state records before they accumulate. 
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix agent runtime cleanup Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix live artifact daemon access Validate local-only live artifact routes against the peer socket address and pass daemon-resolved CLI paths to ACP MCP descriptors.\n\nGenerated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector run limit pruning Evict stale connector rate-limit buckets so long-lived daemon processes do not retain per-run entries indefinitely.\n\nGenerated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Fix connector compact schemas Generated-By: looper 0.5.2 (runner=fixer, agent=opencode) * Improve connector connection feedback * Adjust connector gate positioning * Fix live artifact refresh commits Avoid marking refresh candidates failed after snapshot or state persistence errors by deferring live artifact mutations until the durable refresh metadata is written. Also align connector OAuth callback host validation with daemon loopback handling.\n\nGenerated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Improve connector search relevance * fix(daemon): harden connector connection state Require loopback daemon validation before connector connect side effects and only clear provider-owned connector statuses during credential reset. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard connector disconnect route Require local daemon request validation before connector disconnect side effects. 
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): guard composio config updates Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): dispatch live artifacts mcp first Route the live-artifacts MCP server before the generic MCP CLI so od mcp live-artifacts starts the dedicated server instead of failing generic argument parsing.\n\nGenerated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix(daemon): handle integer connector schemas Allow JSON Schema integer connector inputs while preserving fractional-value validation so generated connector tool schemas accept valid page sizes and limits. Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * fix: align live artifact refresh error codes Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact connector refresh flow * Update live artifact design cards * Add beta badge to live artifact form * Remove live artifact tile model * Fix live artifact refresh sync * Fix live artifact MCP refresh durability Generated-By: looper 0.5.4 (runner=fixer, agent=opencode) * Fix live artifact refresh safety Enforce persisted refresh opt-out and connector auto-read gating before refresh sources execute. Generated-By: looper 0.5.5 (runner=fixer, agent=opencode)	2026-05-05 16:42:11 +08:00
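One of the fixes in the entry above allows JSON Schema `integer` connector inputs while still rejecting fractional values. A minimal sketch of that check follows; the `NumericSchema` shape and the `validateNumericInput` name are hypothetical, and the real validator presumably handles many more schema keywords.

```typescript
// Hypothetical subset of a connector input schema.
interface NumericSchema {
  type: "integer" | "number";
  minimum?: number;
  maximum?: number;
}

// Returns null when the value is acceptable, otherwise a reason string.
function validateNumericInput(schema: NumericSchema, value: unknown): string | null {
  if (typeof value !== "number" || !Number.isFinite(value)) {
    return "expected a finite number";
  }
  // "integer" accepts whole numbers such as page sizes and limits,
  // but still rejects fractional values.
  if (schema.type === "integer" && !Number.isInteger(value)) {
    return "expected an integer";
  }
  if (schema.minimum !== undefined && value < schema.minimum) {
    return `must be >= ${schema.minimum}`;
  }
  if (schema.maximum !== undefined && value > schema.maximum) {
    return `must be <= ${schema.maximum}`;
  }
  return null;
}
```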
Nagendhra Madishetti	76e6c7a9f6	feat: Critique Theater Phase 4 (persistence + transcript + orchestrator) (#481 ) * docs(specs): add Critique Theater design spec for panel-tempered artifacts * docs(specs): add Critique Theater implementation plan * docs(specs): rename UI to Design Jury, add lane-density modes, ship-rule explainer, label sizing * feat(contracts): add CritiqueConfig schema and defaults * fix(contracts): apply Task 1.1 review (CRITIQUE_PROTOCOL_VERSION rename, descriptions, RoleWeights export) * feat(contracts): add PanelEvent discriminated union and isPanelEvent guard * fix(contracts): apply Task 1.2 review (exhaustive event-type list, runId guard, import order) * feat(contracts): add CritiqueSseEvent variants and panelEventToSse mapper * test(daemon): add v1 wire-protocol golden fixtures for Critique Theater parser * feat(daemon): add v1 streaming parser for Critique Theater wire protocol * chore(contracts): add .js extensions to relative imports for NodeNext consumers * fix(daemon): satisfy noUncheckedIndexedAccess in v1 parser regex match access * test(daemon): cover parser failure modes; fix unclosed-PANELIST swallow bug * fix(daemon,contracts): address PR #387 review - parser now clamps panelist + DIM scores against the run-declared scale captured from <CRITIQUE_RUN scale=...>, not a hardcoded 100 - PANELIST appearing before any <ROUND n=...> opens now throws MalformedBlockError rather than emitting events with NaN round - DIM_RE and MUST_FIX_RE hoisted to module scope and lastIndex reset per call so the parser hot path stops recompiling regex per artifact - overflow check after drain simplified to a plain buf.length > cap test (the prior compound condition was always true on the right side and obscured intent) - scoreThreshold <= scoreScale refine gains a 1e-9 epsilon so floating slack does not reject semantically valid configs - round-1 designer ARTIFACT guard gains a comment naming the spec invariant and the v2 relaxation path - 3 new 
regression tests cover the panelist-without-round, scale=10 clamp, and scale=20 plumbing cases * docs(specs): rationale for non-goals, failure-mode rate targets, Phase 10 matrix, Phase 14 doc layout * Merge branch 'main' into feat/critique-theater Resolves the contracts/index.ts conflict by keeping the .js extensions added by chore(contracts) `2d6e8d6` and slotting in the new export for ./api/app-config introduced upstream by #255 (`9d700ec`). Critique Theater additions (./sse/critique, ./critique) preserved in their original positions. Verified after merge: pnpm --filter @open-design/contracts test -> 10/10 pass pnpm --filter @open-design/contracts typecheck -> exit 0 pnpm --filter @open-design/daemon typecheck -> exit 0 pnpm --filter @open-design/web typecheck -> exit 0 Two daemon tests in tests/media-config.test.ts fail both before and after the merge because they read real OAuth credentials from the developer machine instead of using mock fixtures. That's an upstream isolation issue on origin/main, not something this branch introduces. * fix: unblock web build and address mrcfps PANELIST oversize bypass The chore commit that added .js extensions to satisfy daemon's nodenext typecheck broke apps/web's Next.js build, because webpack tried to resolve the literal ./common.js when only common.ts exists on disk. Replaced with a subpath approach: contracts/exports gains a './critique' entry pointing straight at src/critique.ts (which has no relative imports), and daemon imports route through @open-design/contracts/critique instead of the barrel. Web keeps the bundler-friendly barrel; daemon's nodenext walks only the leaf module. All 13 contracts source files reverted to no-.js. Separately, mrcfps flagged that parserMaxBlockBytes was only enforced on the leftover buffer after drain returned, so a complete oversized block arriving in one chunk slipped past the cap. 
Added an explicit per-block size check inside drain for every buffered block type (PANELIST, ROUND_END, SHIP). Three regression tests yield the whole stream as a single chunk and assert OversizeBlockError fires before any events emit. * fix(daemon): close three v1 parser invariant gaps from mrcfps review Three independent gaps that all let malformed or oversized protocol output pass the v1 envelope contract: (1) Envelope guard. ROUND, PANELIST, ROUND_END, and SHIP now throw MalformedBlockError when state.inRun is false. Without this, a stream that omits <CRITIQUE_RUN> could still emit panelist_* events without the run_started handshake, leaving downstream reducers with no run-level config. (2) UTF-8 byte length. Both the per-block size check and the post-drain buf-size check now compare Buffer.byteLength(text, 'utf8') against parserMaxBlockBytes. The previous string-length comparison let multibyte content (CJK, emoji) inside <NOTES>/<SUMMARY> exceed the configured byte cap while staying under the JS string length cap, bypassing the daemon's resource guard. (3) Header-end ordering. PANELIST, ROUND_END, and SHIP now require the opener's > to appear before the matched closing tag. A malformed opener like <PANELIST role="x" score="8"</PANELIST> previously fell through to the closing tag's > and emitted events for an invalid block. Four regression tests cover each gap (ROUND-without-run, SHIP-without-run, multibyte-byte-cap, malformed-opener). * feat(daemon): add critique_runs persistence (Task 4.1) Introduces a new SQLite table critique_runs to back the orchestrator's run lifecycle. Plan called for ALTER TABLE artifacts ADD COLUMN ..., but artifacts is not a DB concept in this repo; runs get their own table. - migrateCritique(db) creates the table + two indexes idempotently and is wired into the existing migrate(db) flow on daemon boot. 
- CRUD helpers (insertCritiqueRun, getCritiqueRun, updateCritiqueRun, listCritiqueRunsByProject, deleteCritiqueRun) round-trip rounds_json through helpers so callers see typed CritiqueRunRow. - reconcileStaleRuns flips stale 'running' rows to 'interrupted' with a recoveryReason='daemon_restart' marker, supporting the spec's daemon-restart-mid-run failure mode. - Public CritiqueRunStatus union excludes the in-flight 'running' value but the runtime CHECK accepts it, matching the spec's lifecycle. - 11 vitest cases cover migration idempotence, round-trip, default rounds, status validation, update + list ordering, deletion, and reconciliation, plus FK CASCADE on project deletion. * feat(daemon): add Critique Theater transcript writer (Task 4.2) Streams PanelEvent sequences to .ndjson on disk under the artifact dir, gzipping to .ndjson.gz when the cumulative UTF-8 byte size crosses gzipThresholdBytes (default 256 KiB). Uses Node fs streams plus zlib.createGzip so the writer never holds the full transcript in memory. readTranscript inverts the path and streams events back, picking the right pipeline by file extension. Covers happy path, large multibyte, empty input, mid-stream failure cleanup, and unknown-extension reject. * feat(daemon): add Critique Theater orchestrator (Task 4.3) Drives one run end-to-end: parses stdout via parseCritiqueStream, scores each round through scoreboard helpers, persists lifecycle to critique_runs, and emits CritiqueSseEvent variants on the existing project event bus. Honors per-round and total timeouts, applies fallbackPolicy when no <SHIP> arrives, and tees events into writeTranscript so transcripts stream to disk without buffering the whole run in memory. Defensive entry validation throws RangeError on invalid CritiqueConfig before any side effect. Also adds scoreboard.ts (computeComposite, decideRound, selectFallbackRound) and re-exports panelEventToSse/CritiqueSseEvent from the critique subpath so daemon imports never touch the barrel. 
Fixes missing .js extensions in sse/critique.ts that caused NodeNext module resolution errors. * feat(daemon): wire Critique Theater orchestrator into spawn path (Task 4.4) Adds loadCritiqueConfigFromEnv to read OD_CRITIQUE_* keys with strict validation at boot. Branches the existing CLI spawn flow on cfg.enabled: when false (the M0 default) the legacy single-pass generation runs unchanged; when true the orchestrator owns the run end-to-end. Same SSE bus, same artifact dir, no behavior change for users until they flip the flag. * fix(lockfile): regenerate to include contracts zod + vitest entries The earlier conflict resolution took main's lockfile and ran pnpm install, but the install pass on Windows didn't write the contracts package's zod and vitest entries back into the lockfile. CI's --frozen-lockfile install rejected the resulting state. Re-running pnpm install with --no-frozen-lockfile rewrites the lockfile so it now matches every package.json across the workspace, including contracts/zod ^3.23.8 and contracts/vitest ^2.1.8. Verified locally: pnpm install --frozen-lockfile passes. * fix(daemon): parser ship envelope, SHIP-before-round guard, real artifactRef (Defects 3 + 5) - ParserOptions gains projectId + artifactId; the parser threads them into every emitted ship event's artifactRef so downstream consumers see the real run identity instead of empty placeholders. - <SHIP> now requires at least one closed <ROUND_END> in the same run; malformed streams that emit SHIP before any round complete now throw MalformedBlockError instead of bypassing the round-1 artifact invariant. - The SHIP handler validates the inner <ARTIFACT> block is present and non-empty; missing artifact raises MissingArtifactError. - Three new regressions: SHIP-before-round, SHIP-without-artifact, artifactRef populated from parser options. - Orchestrator threads projectId + artifactId into parserOpts. - Test fixtures updated to include <ARTIFACT> inside <SHIP> blocks. 
* fix(daemon): orchestrator owns lifecycle, gzip atomicity, fallback on timeout (Defects 2,4,7,8) - Orchestrator now accepts child + childExitPromise, races parser / child-exit / abort / timeout in one awaited flow, and SIGTERMs the child on every non-clean termination. Server awaits the result so the run lifecycle has a single owner. - ChildExitError surfaces when child exits non-zero mid-stream; the run is classified as failed with cause cli_exit_nonzero. - Timeout / abort with at least one completed round elects a fallback via selectFallbackRound and emits a synthetic ship event with status=timed_out or interrupted; the score persists to critique_runs instead of staying null. - applyTimeouts includes childExitRace in every Promise.race so early child exits are classified without waiting for the total timeout. iter.return() cleanup is capped at 200ms to prevent hang on stalling generators. - writeTranscript writes gzip output to transcript.ndjson.gz.tmp, fsyncs, then atomic-renames. Crashes mid-write leave no partial .gz or .gz.tmp on disk. * fix(daemon): plain-stream gating, per-run artifact dir, boot reconcile (Defects 1, 2, 6) - Spawn-path branch now inspects def.streamFormat and only routes through runOrchestrator when format === 'plain'. Adapters emitting wrapper formats (claude-stream-json, copilot-stream-json, json-event-stream, acp-json-rpc, pi-rpc) fall through to legacy single-pass with a one-time stderr warning per format. Per-format decoding into the orchestrator is reserved for v2. - critiqueArtifactDir is now path.join(ARTIFACTS_DIR, projectId, runId) so concurrent or sequential runs in the same project never overwrite each other's transcript or final HTML. Persistence stores the relative per-run path. - reconcileStaleRuns is now invoked after openDatabase on every daemon boot with staleAfterMs = critiqueCfg.totalTimeoutMs. Stale running rows from a prior crash flip to interrupted with rounds_json. recoveryReason='daemon_restart'. 
Logs a one-line warning naming the flipped count when greater than zero. - Spawn now passes child + childExitPromise to runOrchestrator so the orchestrator can race child exit against the parser, abort signal, and timeouts in one awaited flow. Server awaits the orchestrator's result and surfaces failures through the existing run lifecycle. * fix(daemon): daemon-authoritative scoring, lifecycle status, stderr ordering, insert type Round 2 review feedback on PR #481. 1. CritiqueRunInsert.status now accepts 'running' so the boot-reconcile tests (and any caller seeding an in-flight row) typecheck without casting. The runtime check in insertCritiqueRun already accepted 'running' against the DB constraint set, only the public type was stricter than the DB. 2. round_end keeps the daemon-computed composite authoritative. The agent's <ROUND_END composite=...> attribute is advisory: a divergence beyond COMPOSITE_TOLERANCE emits a composite_mismatch parser_warning so the discrepancy is observable, but the daemon value is what scores and persists. Same policy for must_fix. 3. SHIP-handling derives the final status from decideRound(...) using the daemon's scored round rather than trusting <SHIP composite=... status=...>. A run that the agent claims as shipped but whose daemon composite is below threshold now finalizes as below_threshold, so a malformed or adversarial stream cannot force a ship. 4. server.ts captures the orchestrator's result and maps the critique terminal status to the chat run lifecycle. shipped/below_threshold finalize as 'succeeded'; timed_out/interrupted/degraded/failed finalize as 'failed'. cancelRequested is honored. 5. stderr forwarding and child.on('error') registrations moved BEFORE the orchestrator await so a CLI that floods stderr cannot fill the OS pipe and deadlock until the total timeout, and so an early child error fired during the run is observed by the same listener used after. 
Tests: - tests/critique-authority.test.ts: 3 new regressions (lying ship downgraded to below_threshold, mismatch warning emitted, aligned composites stay quiet). - All four affected suites green: 14 orchestrator + 10 spawn-wiring + 3 boot-reconcile + 3 authority = 30/30. Workspace typechecks: contracts, daemon, web all exit 0. * fix(daemon,contracts): inline critique SSE, signal-terminated child, null shipped artifactPath Round 3 review feedback on PR #481. 1. packages/contracts/src/critique.ts inlines CritiqueSseEvent + panelEventToSse + CRITIQUE_SSE_EVENT_NAMES + a local mirror of SseTransportEvent. The previous re-export from './sse/critique.js' broke the workspace web build (Turbopack cannot rewrite .js to .ts on a relative source import) while removing the .js extension broke daemon's NodeNext typecheck (it walks this leaf via the './critique' subpath export which requires explicit .js extensions). Inlining removes the cross-file relative import entirely so both consumers walk one self-contained file. packages/contracts/src/sse/critique.ts is removed and its co-located test moves up to packages/contracts/src/critique.test.ts. The barrel packages/contracts/src/index.ts drops the redundant './sse/critique' re-export since './critique' already exports the same symbols. 2. apps/daemon/src/critique/orchestrator.ts treats a signal-terminated child as a terminal race rejection. Previously the race only caught non-zero numeric exit codes and treated code === null as indefinitely pending, so a SIGTERM from /api/runs/:id/cancel resolved childExitPromise as { code: null, signal: 'SIGTERM' } and the orchestrator fell through to the no-SHIP fallback path, persisting below_threshold instead of interrupted. 
The race now rejects with a new ChildSignaledError when signal !== null, and a new catch branch classifies the run as 'interrupted' and (if at least one round closed) emits a synthetic ship event with status='interrupted' so the persisted row and the SSE transcript reflect the actual cause. 3. Same file, ship-handling: artifactPath is now persisted as null on shipped runs until a future phase actually extracts the <SHIP><ARTIFACT> body to disk. Previously the orchestrator wrote ${artifactDir}/${artifactId} even though no file existed at that path, so any later replay/export/UI code that trusted critique_runs.artifact_path would dereference a missing file. The transcript still records the ship event with the artifact reference so consumers can find the run. Tests: - apps/daemon/tests/critique-lifecycle.test.ts: 2 new regressions (SIGTERM-terminated child after one closed round persists 'interrupted' with a synthetic ship event of the same status; shipped run leaves artifactPath null in result and DB row). - 43 critique-suite tests pass: 14 orchestrator + 11 transcript + 10 spawn-wiring + 3 boot-reconcile + 3 authority + 2 lifecycle. Workspace typechecks: contracts, daemon, web all exit 0. * fix(daemon): buffer raw SHIP, emit only normalized; reject SHIP for unclosed round Round 4 review feedback on PR #481. The parser-event loop used to unconditionally collectedEvents.push(event) and bus.emit(panelEventToSse(event)) for every event, including raw <SHIP>. SSE clients and the transcript could see the agent's forged status="shipped" / composite="9.5" before decideRound(...) ran, even when the daemon later corrected the persisted DB row to below_threshold. The loop now skips ship events entirely; the orchestrator buffers the raw shipEvent, runs daemon-authoritative scoring, and emits a single normalized ship payload built from the daemon's computed composite, selectFallbackRound's mustFix, and decideRound's status. 
The transcript and SSE bus now only ever see the daemon-scored ship. The unknown-round fallback used to make agent-claimed status/composite authoritative when SHIP referenced a round that was never closed: a malformed stream could close low round 1, then send <SHIP round="2" status="shipped" composite="10">, completedRounds.find(r => r.n === 2) was undefined, and the orchestrator persisted the agent's value. That re-opened the scoring-integrity hole the previous round was meant to close. The orchestrator now drops a SHIP whose round isn't in completedRounds, emits a parser_warning, and falls through to the no-SHIP fallback policy. The synthetic ship from selectFallbackRound gets emitted instead, with daemon-authoritative round/composite/status. Tests: - tests/critique-authority.test.ts: extended the lying-ship regression to also assert the emitted critique.ship payload is downgraded (status='below_threshold', composite < threshold), so the SSE bus cannot see the agent's claim. Added a new regression where SHIP references an unclosed round 2: the agent ship is dropped, a parser_warning fires, the fallback selects round 1, and the only emitted critique.ship has round=1 and status=below_threshold. - 44 critique-suite tests pass: 14 orchestrator + 11 transcript + 10 spawn-wiring + 3 boot-reconcile + 4 authority + 2 lifecycle. Workspace daemon typecheck exits 0. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com> Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-05 15:50:35 +08:00
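The daemon-authoritative scoring policy from the review rounds above can be illustrated with a small sketch: the daemon recomputes the composite from panelist scores, treats the agent's claimed composite as advisory, records a `composite_mismatch` warning when the two diverge, and derives the final status from its own number. The names `computeComposite`, `decideRound`, and `COMPOSITE_TOLERANCE` are borrowed from the commit message, but their signatures, the weighted-average formula, and the tolerance value are assumptions; the must-fix handling mentioned in the commit is omitted.

```typescript
type RoleWeights = Record<string, number>;

interface PanelistScore {
  role: string;
  score: number;
}

interface RoundDecision {
  status: "shipped" | "below_threshold";
  composite: number; // always the daemon-computed value
  warnings: string[];
}

const COMPOSITE_TOLERANCE = 0.05; // assumed value

// Assumed formula: weighted average over the roles that have a weight.
function computeComposite(scores: PanelistScore[], weights: RoleWeights): number {
  let weighted = 0;
  let total = 0;
  for (const { role, score } of scores) {
    const weight = weights[role] ?? 0;
    weighted += weight * score;
    total += weight;
  }
  return total === 0 ? 0 : weighted / total;
}

// The agent's claim never changes the status; it can only raise a warning.
function decideRound(
  scores: PanelistScore[],
  weights: RoleWeights,
  scoreThreshold: number,
  agentClaimedComposite?: number,
): RoundDecision {
  const composite = computeComposite(scores, weights);
  const warnings: string[] = [];
  if (
    agentClaimedComposite !== undefined &&
    Math.abs(agentClaimedComposite - composite) > COMPOSITE_TOLERANCE
  ) {
    warnings.push("composite_mismatch");
  }
  return {
    status: composite >= scoreThreshold ? "shipped" : "below_threshold",
    composite,
    warnings,
  };
}
```

With this shape, a stream that claims `composite="9.5"` over panelist scores averaging 5 still finalizes as `below_threshold`, which is the "lying ship" case the regression tests describe.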
PerishFire	bbdd4e84b5	chore: enforce test directory conventions (#496 ) * chore: enforce test directory conventions Move package, app, and tool tests out of src and add guard enforcement so source directories stay source-only. * ci: use guard and package-scoped tests Run the new repository guard in CI and keep test execution aligned with package-scoped commands after removing root aliases. * ci: align stable release guard check Use the new repository guard in stable release verification after replacing the residual-JS-only script. * chore: tighten test layout enforcement Enforce sibling tests directories, typecheck moved test suites with dedicated configs, and refresh remaining guidance that pointed at src-based tests. * chore: clarify no-emit test tsconfigs Explicitly disable declaration-only emit in test tsconfigs so review tooling sees they are no-emit typecheck configs.	2026-05-05 15:34:22 +08:00
Chris Tam	a3a0ae6ced	fix(daemon): respect baseUrl path verbatim in OpenAI-compat proxy (#410 ) * fix(daemon): respect baseUrl path verbatim in OpenAI-compat proxy `appendVersionedApiPath` previously force-injected `/v1` unless the supplied baseUrl ended with `/vN`. That broke any provider whose OpenAI-compatible surface lives under a sub-path: https://api.deepinfra.com/v1/openai → ".../v1/openai/v1/chat/completions" https://openrouter.ai/api/v1 → ".../api/v1/chat/completions" (worked, by luck of /vN suffix) Now the auto-`/v1` only fires when the user supplied no path at all, so DeepInfra, OpenRouter, and any other sub-path-mounted compat surface route to the right endpoint while the canonical `https://api.openai.com` / `https://api.anthropic.com` shortcuts still work. Adds a regression test table covering the matrix. * Account for Anthropic style URLs	2026-05-05 13:22:54 +08:00
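The path rule this fix describes (inject `/v1` only when the user supplied no path at all) could look something like the sketch below. The signature shown for `appendVersionedApiPath` is an assumption for illustration; the daemon's real function may take different parameters, and the Anthropic-specific handling mentioned in the follow-up commit is not modeled here.

```typescript
// Sketch only: the (baseUrl, endpoint) signature is assumed, not taken from the repo.
function appendVersionedApiPath(baseUrl: string, endpoint: string): string {
  const url = new URL(baseUrl);
  // Drop trailing slashes so "https://host/" counts as "no path supplied".
  const userPath = url.pathname.replace(/\/+$/, "");
  // Only a bare origin gets the automatic "/v1"; any user path is kept verbatim.
  const prefix = userPath === "" ? "/v1" : userPath;
  url.pathname = `${prefix}/${endpoint.replace(/^\/+/, "")}`;
  return url.toString();
}

// Bare origin: "/v1" is added.
console.log(appendVersionedApiPath("https://api.openai.com", "chat/completions"));
// Sub-path-mounted surfaces: the path is respected as given.
console.log(appendVersionedApiPath("https://api.deepinfra.com/v1/openai", "chat/completions"));
console.log(appendVersionedApiPath("https://openrouter.ai/api/v1", "chat/completions"));
```

Under these assumptions the DeepInfra base URL resolves to `.../v1/openai/chat/completions` rather than the doubled `.../v1/openai/v1/chat/completions` that the commit message reports as the bug.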
Justin Gao	cc8add4f09	feat(daemon): add link code folder support for agent context (#455 ) * feat(daemon): add link code folder support for agent context Users can now link local code directories to a project so the AI agent reads their source code via --add-dir when generating designs. The import menu's "Link code folder" item opens a native OS folder picker, and linked folders appear as removable chips below the chat input. - Add linkedDirs field to ProjectMetadata contract - Add POST /api/dialog/open-folder endpoint (osascript/zenity/PowerShell) - Add validateLinkedDirs with path safety checks (absolute, exists, blocklist) - Append linked dirs to extraAllowedDirs in startChatRun - Add system prompt hint listing linked code folders - Render linked folder chips in ChatComposer with add/remove - Add i18n strings for all 16 locales - Add 8 unit tests for validateLinkedDirs * fix: address PR review feedback - Add JSDoc type annotations to validateLinkedDirs for strict mode - Check path.isAbsolute before resolve to catch relative inputs - Allow linking when projectMetadata is undefined (default to prototype) - Remove redundant PATCH in ProjectView callback * fix: use inline TS annotations instead of JSDoc in linked-dirs.ts * fix(daemon): harden linked-dirs validation against security bypasses - Resolve symlinks with realpathSync.native before checking blocklist - Reject filesystem root (/) and drive roots as linked dirs - Canonicalize blocklist entries to handle macOS /etc -> /private/etc - Validate linkedDirs on project creation, not just PATCH - Re-validate persisted linkedDirs in startChatRun before use - Add tests for root, symlink-to-blocked-dir, and realpath resolution * fix: narrow union type before accessing .error in tests	2026-05-05 12:46:39 +08:00
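The validation order listed in this commit (absolute check before resolving, symlink resolution before the blocklist, root rejection, canonicalized blocklist) might be sketched as below. The result type, the error strings, and the default blocklist entries are assumptions for the example; only the function name `validateLinkedDirs` and the checks themselves come from the commit message.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical result shape; callers narrow on `ok` before reading `error`.
type LinkedDirsResult =
  | { ok: true; dirs: string[] }
  | { ok: false; error: string };

// Canonicalize a blocklist entry so e.g. macOS /etc -> /private/etc still matches.
function canonicalize(p: string): string {
  try {
    return fs.realpathSync.native(p);
  } catch {
    return path.resolve(p); // entry does not exist on this machine
  }
}

// The default blocklist here is illustrative, not the daemon's actual list.
function validateLinkedDirs(
  dirs: string[],
  blocklist: string[] = ["/etc", "/proc", "/sys"],
): LinkedDirsResult {
  const blocked = blocklist.map(canonicalize);
  const accepted: string[] = [];
  for (const dir of dirs) {
    // Check absoluteness on the raw input, before any resolve() hides a relative path.
    if (!path.isAbsolute(dir)) return { ok: false, error: `not absolute: ${dir}` };
    let real: string;
    try {
      real = fs.realpathSync.native(dir); // resolves symlinks; throws if missing
    } catch {
      return { ok: false, error: `does not exist: ${dir}` };
    }
    if (!fs.statSync(real).isDirectory()) return { ok: false, error: `not a directory: ${dir}` };
    // Reject "/" and drive roots such as "C:\".
    if (real === path.parse(real).root) return { ok: false, error: `filesystem root: ${dir}` };
    if (blocked.some((b) => real === b || real.startsWith(b + path.sep))) {
      return { ok: false, error: `blocked path: ${dir}` };
    }
    accepted.push(real);
  }
  return { ok: true, dirs: accepted };
}
```

Because the blocklist comparison runs on the resolved path, a symlink pointing into a blocked directory is rejected in the same way as the directory itself.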
ChildhoodAndy	bc04d61903	feat(design-files): add batch ZIP download with multi-select (#405 ) * feat(design-files): add batch ZIP download with multi-select Add checkbox-based multi-select to the Design Files panel and a new POST /api/projects/:id/archive/batch endpoint that zips selected files using the project name as the archive filename. * fix(i18n): add missing batch-download keys to uk locale The upstream main branch has a uk.ts locale that didn't exist in the fork. Without these keys the web typecheck fails against the full locale set. * fix(design-files): harden batch archive and prune stale selections - Use lstat to reject symlinks, dotfiles, and .artifact.json sidecars in buildBatchArchive (mirror listFiles/collectFiles allowlist) - Reject invalid names explicitly instead of silently skipping - Prune stale filenames from selected set on files/projectId change * fix(daemon): tighten batch archive allowlist with segment-level checks - Check every path segment for hidden directories, not just basename - Walk intermediate directories with lstat to reject symlinks at any level - Fail-fast on any ineligible file instead of silently skipping	2026-05-05 00:10:26 +08:00
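The segment-level allowlist and fail-fast behavior described in the hardening commits could be approximated as in the following sketch. All helper names (`checkName`, `checkNoSymlinks`, `assertBatchEligible`) and the error strings are hypothetical; the real `buildBatchArchive` is not shown in the commit message and may be structured differently.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Pure name check: returns a rejection reason, or null when the name is eligible.
function checkName(name: string): string | null {
  if (name === "" || path.isAbsolute(name) || name.includes("\\")) return "invalid name";
  for (const segment of name.split("/")) {
    if (segment === "" || segment === "." || segment === "..") return "invalid path segment";
    // Every segment is checked, so hidden directories are caught, not just hidden basenames.
    if (segment.startsWith(".")) return "hidden segment";
  }
  if (name.endsWith(".artifact.json")) return "artifact sidecar";
  return null;
}

// Walk each intermediate path with lstat so a symlink at any level is rejected.
function checkNoSymlinks(projectRoot: string, name: string): string | null {
  let current = projectRoot;
  for (const segment of name.split("/")) {
    current = path.join(current, segment);
    let stats: fs.Stats;
    try {
      stats = fs.lstatSync(current); // lstat does not follow symlinks
    } catch {
      return "missing file";
    }
    if (stats.isSymbolicLink()) return "symlink in path";
  }
  return null;
}

// Fail fast: the first ineligible file aborts the whole batch instead of being skipped.
function assertBatchEligible(projectRoot: string, names: string[]): void {
  for (const name of names) {
    const reason = checkName(name) ?? checkNoSymlinks(projectRoot, name);
    if (reason) throw new Error(`${reason}: ${name}`);
  }
}
```

Throwing on the first bad entry means a client never receives an archive that silently omits files it asked for.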

90 commits