Compare commits

77 commits

Author SHA1 Message Date
xxiaoxiong
191f04ac4a
fix: preserve cursor position when inserting async references
Capture mention state snapshot before async operations (applyProjectSkill,
applyById) to prevent cursor position issues. Previously, the mention state
could be invalidated during async operations, causing the cursor to land in
the wrong position after inserting skills or plugins.

Changes:
- Add optional mentionSnapshot parameter to replaceMentionWithText()
- Capture mention state before async calls in insertSkillMention()
- Capture mention state before async calls in insertPluginMention()
- Use captured snapshot instead of potentially stale mention state

This ensures the cursor position is calculated from the original mention
context, not from state that may have changed during re-renders triggered
by async operations.
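
A minimal sketch of the snapshot idea described above, not the actual ChatComposer code. The `MentionState` shape, the `ReplaceResult` type, and the getter parameters are assumptions made for illustration; only the names `replaceMentionWithText`, `insertSkillMention`, `applyProjectSkill`, and `mentionSnapshot` come from the commit message.

```typescript
// Hypothetical shape of the mention state; the real type may differ.
interface MentionState {
  start: number; // index of the "@" in the draft
  end: number;   // index just past the typed query
}

interface ReplaceResult {
  draft: string;
  cursor: number;
}

// Replace the mention span with `text`, preferring the snapshot over live state.
function replaceMentionWithText(
  draft: string,
  liveMention: MentionState | null,
  text: string,
  mentionSnapshot?: MentionState,
): ReplaceResult {
  const mention = mentionSnapshot ?? liveMention;
  if (!mention) {
    // No mention context left: append and park the cursor at the end.
    const appended = draft + text;
    return { draft: appended, cursor: appended.length };
  }
  const next = draft.slice(0, mention.start) + text + draft.slice(mention.end);
  // The cursor is derived from the original mention position.
  return { draft: next, cursor: mention.start + text.length };
}

// Capture the mention state BEFORE awaiting, then pass it through.
async function insertSkillMention(
  getDraft: () => string,
  getMention: () => MentionState | null,
  applyProjectSkill: () => Promise<void>,
  text: string,
): Promise<ReplaceResult> {
  const snapshot = getMention() ?? undefined;
  await applyProjectSkill(); // live mention state may be invalidated here
  return replaceMentionWithText(getDraft(), getMention(), text, snapshot);
}
```

With the snapshot, the cursor lands right after the inserted text even if the live mention state has become `null` by the time the await resolves.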

Fixes #3195
2026-05-31 12:10:27 +08:00
蓝宙
e8c179d3a6
fix: show cumulative conversation duration (#3354)
* fix: show cumulative conversation duration

* fix: include usage-only run durations

---------

Co-authored-by: Lanzhou3 <217479610+Lanzhou3@users.noreply.github.com>
2026-05-31 03:52:12 +00:00
estelledc
0b493a66c0
fix(web): prevent caret reset on tools-menu picker mousedown (#3368)
The right-side @-button tools popover (ToolsPluginsPanel,
ToolsSkillsPanel, ToolsMcpPanel) inserts text into the composer
draft using the textarea's selectionStart at click time, but the
picker rows had `onClick` without `onMouseDown={(e) =>
e.preventDefault()}`. On a real mouse, mousedown fires first, the
textarea loses focus before the click handler runs, and
selectionStart resets — so the inserted token lands at offset 0
instead of at the user's cursor.

The @-mention popover already prevents this by calling
preventDefault on mousedown for every picker row (the comment at
ChatComposer.tsx:3039-3043 explains the reason). This change
mirrors that protection on the three tools-menu pickers.
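
The protection can be sketched as a small helper that builds the two handlers for a picker row. `pickerRowHandlers` and `MouseLikeEvent` are hypothetical names, and the event type is reduced to the one method the sketch needs so it runs without React or a DOM.

```typescript
// Minimal event shape; a real React.MouseEvent also satisfies it.
interface MouseLikeEvent {
  preventDefault(): void;
}

// Build the handlers for one picker row (hypothetical helper).
function pickerRowHandlers(onPick: () => void) {
  return {
    // Cancelling mousedown's default action keeps focus (and the caret)
    // in the textarea, so selectionStart is still valid at click time.
    onMouseDown: (e: MouseLikeEvent) => e.preventDefault(),
    onClick: () => onPick(),
  };
}
```

In a component, the returned object would be spread onto each row element so that every picker gets the same guard.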

The mention popover itself was unaffected, so design-file
mentions (which only flow through the @-popover via
`replaceMentionWithText`) are not impacted by this issue. The
reporter's mention of "design files" appears to refer to picking a
file via the @-popover, where the protection was already in place.

Closes #3195

Validation:
- pnpm exec vitest run tests/components/ChatComposer.tools-menu-caret.test.tsx
  → 3/3 passed (red on main, asserts each picker calls
  preventDefault on mousedown)
- pnpm --filter @open-design/web test → 2501/2501 passed (260 files)
- pnpm --filter @open-design/web typecheck → green
- pnpm guard → green
2026-05-31 03:50:45 +00:00
mehmet turac
8448b1105c
fix: preserve OpenClaude fallback credentials (#3361)
2026-05-31 03:49:25 +00:00
Jane
d66a463d62
feat(landing-page): 301 legacy /skills /systems /templates to /plugins (#3352)
The 2026-05 plugins library rebuild introduced /plugins/skills/,
/plugins/systems/, /plugins/templates/ and a unified detail route
/plugins/<manifest-slug>/, but the old /skills/, /systems/, /templates/
catalogs were left live in parallel. Two equivalent page trees split SEO
equity, and the homepage, footer, quickstart, agents, official and blog
pages all still linked to the old routes.

Retire the legacy generators and 301 every old URL to its new plugins
equivalent so inbound links and search equity are preserved:

- Remove the /skills, /systems, /templates page generators (English +
  [locale] wrappers) and the now-orphaned skill-row component, and prune
  the skills/systems/templates branches from the [locale]/[...path]
  catch-all (it now renders only craft + blog).
- Add the migration block to public/_redirects. Detail slugs differ from
  the old folder names (new slugs are manifest-name based, e.g.
  design-system-<x>, example-<x>), so systems/templates use a prefixed
  splat plus a short degrade list, and skills map the 27 with a template
  equivalent explicitly while the ~110 instruction-only skills and all
  mode/scenario/category facet pages degrade to the section landing.
  'replicate' is forced to the section to avoid colliding with the
  design-system of the same name. Locale variants (zh, zh-tw, ja, ko)
  strip to the section.
- Repoint in-site links to /plugins/* across page.tsx (footer, work,
  labs pills), info-page-i18n.ts (en + zh + sourceNames), official,
  quickstart, agents, blog and html-anything, and update the sitemap
  serialize priority list. The system-card keeps linking through
  /systems/<slug>/ so the 8 systems without a detail page ride the
  redirect's degrade rather than pointing at a missing page.

Verified with a full astro build: old routes no longer emit any HTML,
the new section pages exist, _redirects is copied verbatim, and no
in-site link targets a removed route (the remaining /systems/<slug>/
hrefs are the system cards that 301 by design). astro check passes.

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-31 01:04:20 +00:00
estelledc
1a6face04c
fix(web): prune draft tokens when the plugin chip strip clears (#2881) (#3356)
ChatComposer tracks the `@…` tokens this surface authored via the
@-mention popover plugin-pick path. When PluginsSection's chip strip
clears, we wire its `onCleared` and prune *only* those tracked
insertions from the draft so the textarea no longer holds orphaned
styled mentions whose chips just unmounted.

Architecture summary (rounds 1–9 collapsed; round 10 detailed below):

  - `Array<{token, start, pluginId, insertionId?}>` tracking with
    start offsets reconciled across each keystroke via an LCS+LCP
    edit-range diff in
    `apps/web/src/utils/pluginInsertionTracking.ts` (round 3-4).
    `insertionId` is forwarded by `reconcileInsertions` so the
    producer can locate its own entry across reconciles
    (round 10).
  - All draft mutations route through a single `updateDraft`
    chokepoint that runs `reconcileInsertions` outside the
    `setDraft` updater so React StrictMode's double-invoke is
    harmless (round 4-5).
  - Boundaries delegate to the shared
    `inlineMentions.isMentionBoundary` /
    `inlineMentions.isMentionRightBoundary` helpers so the
    tracker can never diverge from the parser (round 5).
  - `setActivePlugin` is a chokepoint for every applyById path,
    filtering tracked entries to those matching the new active
    plugin so a replace-plugin flow can never let stale entries
    survive (round 6).
  - Picker rollback double-snapshots draft + tracker so apply-
    failure restores the tracker but only rewrites the draft
    when no user keystrokes arrived during the await
    (round 7-8).
  - `stripPluginInsertedTokens` collapses whitespace seam-local
    so user-authored multi-space spans elsewhere are preserved
    (round 8).
  - `setActivePlugin` is deferred past `await applyById` on
    every path, and `onCleared` filters by
    `pluginsSectionRef.current?.getActiveRecord()?.id` so a
    pending-window clear scopes to the actually-mounted
    plugin's tokens (round 9).
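
To make the offset tracking concrete, here is a simplified sketch of reconciling tracked insertions across one draft edit. It swaps the LCS+LCP diff named above for a plain common-prefix/common-suffix edit range, so it is an approximation of the idea rather than the code in `pluginInsertionTracking.ts`. `editRange` is a hypothetical helper; the `TrackedInsertion` fields follow the shape listed in the summary.

```typescript
interface TrackedInsertion {
  token: string;
  start: number;
  pluginId: string;
  insertionId?: string;
}

// Locate the single changed range between two drafts
// using the longest common prefix and suffix.
function editRange(prev: string, next: string) {
  const max = Math.min(prev.length, next.length);
  let prefix = 0;
  while (prefix < max && prev.charAt(prefix) === next.charAt(prefix)) prefix++;
  let suffix = 0;
  while (
    suffix < max - prefix &&
    prev.charAt(prev.length - 1 - suffix) === next.charAt(next.length - 1 - suffix)
  ) {
    suffix++;
  }
  return { start: prefix, prevEnd: prev.length - suffix, delta: next.length - prev.length };
}

function reconcileInsertions(
  entries: TrackedInsertion[],
  prev: string,
  next: string,
): TrackedInsertion[] {
  if (prev === next) return entries;
  const { start, prevEnd, delta } = editRange(prev, next);
  const shifted: TrackedInsertion[] = [];
  for (const entry of entries) {
    const entryEnd = entry.start + entry.token.length;
    if (entryEnd <= start) {
      shifted.push(entry); // entirely before the edit: unchanged
    } else if (entry.start >= prevEnd) {
      // Entirely after the edit: shift, keeping insertionId and pluginId.
      shifted.push({ ...entry, start: entry.start + delta });
    }
    // Otherwise the edit overlapped the token, so the entry is dropped.
  }
  // Keep only entries whose token is still present at the tracked offset.
  return shifted.filter(
    (e) => next.slice(e.start, e.start + e.token.length) === e.token,
  );
}
```

Because `insertionId` rides along on the spread copy, a producer can still find its own entry after any number of offset shifts.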

Round 10: race in the picker rollback:

  Round 9 made `onCleared` mutate the tracker and the draft when
  it ran during a pending replace, and added the `getActiveRecord`
  filter so the strip targets the still-mounted plugin's entries
  only. The picker's failure-path rollback, however, still
  restored `prevEntries` / `prevActiveId` wholesale — assuming
  nothing else had touched the tracker during the await. If the
  user clicked the still-mounted original chip's × during the
  pending replace AND the deferred `applyById` then resolved
  with a 500, the wholesale restore (a) resurrected entries that
  `onCleared` had legitimately stripped (now stale offsets) and
  (b) left the optimistic `@<target>` orphaned in the draft with
  no chip ever having mounted — the original #2881 symptom
  recurring inside the failure window.

  Fix splits the failure rollback into two paths:

  1. **Detect "intervening clear" via `activePluginIdRef.current
     === null && prevActiveId !== null`.** `onCleared` always
     nulls the active id as its last action; our deferred
     `setActivePlugin` never ran in the failure branch. So the
     null-while-prev-not-null state is the smoking gun for an
     intervening clear during the await.

  2. **On detection, surgically remove only our optimistic
     entry and only its `@<target>`.** Locate the entry by
     `insertionId` (added to `TrackedInsertion` as an optional
     field, forwarded by `reconcileInsertions` so the id
     survives offset shifts) — this disambiguates the case
     where the user picked the same plugin from the @-popover
     more than once during the await window. Splice that entry
     out and run `updateDraft((d) => stripPluginInsertedTokens(
     d, [ourEntry]))` so the draft loses `@<target>` and any
     remaining tracked entries (the in-flight target would have
     no others, but a co-pending second pick could) get their
     offsets reconciled. `activePluginIdRef` stays at `null` —
     `onCleared`'s truth, since no chip is mounted.

  The "no intervening clear" branch is the round 7/8 path:
  restore `prevEntries`/`prevActiveId` wholesale and rewrite
  the draft only if `draftRef.current === postInsertDraft`
  (no user keystrokes during the await).
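
The two rollback paths can be sketched as a pure decision function. `planRollback`, `RollbackInput`, and `RollbackPlan` are hypothetical names introduced here; the conditions themselves are the ones stated above, with the ref reads (`activePluginIdRef.current`, `draftRef.current`) passed in as plain values.

```typescript
type RollbackPlan =
  | { kind: "surgical"; removeInsertionId: string }
  | { kind: "wholesale"; rewriteDraft: boolean };

interface RollbackInput {
  activePluginId: string | null; // value of activePluginIdRef.current
  prevActiveId: string | null;   // active id captured before the await
  currentDraft: string;          // value of draftRef.current
  postInsertDraft: string;       // draft right after the optimistic insert
  ourInsertionId: string;        // id of the optimistic entry
}

function planRollback(input: RollbackInput): RollbackPlan {
  // onCleared nulls the active id last, and the deferred setActivePlugin
  // never ran on failure, so null-while-prev-not-null means a clear happened.
  const interveningClear =
    input.activePluginId === null && input.prevActiveId !== null;
  if (interveningClear) {
    // Remove only our optimistic entry and its token; leave the id at null.
    return { kind: "surgical", removeInsertionId: input.ourInsertionId };
  }
  // No intervening clear: restore wholesale, and rewrite the draft only
  // when no user keystrokes arrived during the await.
  return {
    kind: "wholesale",
    rewriteDraft: input.currentDraft === input.postInsertDraft,
  };
}
```

Keeping the decision separate from the side effects makes each branch easy to cover with a unit test, independent of the component.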

Regression coverage (additions):

  - `apps/web/tests/components/ChatComposer.plugin-clear-prunes-draft.test.tsx`
    — 18 integration specs total (17 prior + 1 new round-10):
    * `@-popover pick A → @-popover pick B (apply pending) →
      clear A's chip → resolve B with 500 → assert no orphan
      @<target>, no orphan @A, no chip mounted, no stale
      tracker entries`. Uses a deferred `Promise<Response>` so
      the apply stays in flight while the chip-clear is fired,
      then resolves with a 500 to drive the failure path. Pre-
      fix this would resurrect Airbnb's stale entry AND leave
      `@SecondPlugin` orphaned in the draft.

PluginsSection.tsx is unchanged. The host-local tracking +
draft-update chokepoint + parser-aligned boundaries + deferred
active-plugin scoping + transactional applyById + intervening-
clear-aware rollback + filtered `onCleared` keep the cross-
component contract identical to main — only ChatComposer touches
behavior, plus the utils module and two `inlineMentions` exports.

Validation:
  - pnpm exec vitest run tests/utils/pluginInsertionTracking.test.ts → 36/36 passed
  - pnpm exec vitest run tests/components/ChatComposer.plugin-clear-prunes-draft.test.tsx → 18/18 passed
  - pnpm exec vitest run -c vitest.config.ts (full apps/web suite, 228 files) → 2202/2202 passed
  - pnpm --filter @open-design/web typecheck → green
  - pnpm guard → green
2026-05-30 17:16:24 +00:00
Denis Redozubov
f4c5d22f22
fix(daemon): confine sandbox project roots and host discovery (#3243)
* fix(daemon): confine sandbox project and host discovery

* fix(daemon): resolve sandbox data dir for toolchain discovery

* fix(daemon): resolve sandbox data dir for agent env

* fix(daemon): fail fast for sandbox imported folders

* test(daemon): assert sandbox imported folder rejection

* fix(daemon): keep sandbox import guard at run start

* fix(daemon): reject sandbox imported project file roots

* fix(daemon): preserve imported project detail roots

* test(daemon): expect sandbox profiles to stay scoped

* fix(daemon): bypass proxies for agent tool callbacks

* test(daemon): isolate media policy route memory extraction

* fix(daemon): keep loopback no-proxy scoped to sandbox
2026-05-30 16:57:04 +00:00
Denis Redozubov
9a3424d68c
feat(daemon): add sandbox runtime foundation (#3242)
* feat(daemon): add sandbox runtime foundation

* fix(daemon): preserve sandbox roots after agent env overrides

* fix(daemon): keep readiness probes pathless

* fix(daemon): harden headless run fallbacks

* fix(daemon): bootstrap sandbox runtime discovery

* fix(daemon): preserve explicit sandbox agent profile mounts

* fix(daemon): keep sandbox profile lookup run scoped

* fix(daemon): normalize sandbox data dir input

* fix(daemon): pin sandbox env roots to base data dir
2026-05-30 15:06:05 +00:00
open-design-bot[bot]
b9f0b69cf1
docs(readme): refresh contributors wall (#3339)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-30 14:02:17 +00:00
chaoxiaoche
df1535b7fd
feat(web): add staged preview feedback during generation (#3227)
* feat(web): wire generation preview stage into workspace

Show a 3-step progress overlay (understand → generate → prepare) in the
preview area while artifacts are being generated, replacing the blank
empty state. Displays elapsed time, an estimated duration hint, and a
retry button on failure.

- Add GenerationPreviewStage component + CSS module + runtime helpers
- Integrate buildGenerationPreviewState into FileWorkspace
- Pass messages/artifact/error/retry from ProjectView to FileWorkspace
- Register i18n keys for en and zh-CN locales

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): keep generation preview alive and persistent across waiting states

Address UX feedback on the generation preview surface:

- Make the waiting card feel alive instead of frozen: breathing mark,
  sweeping progress shimmer, pulsing running-step dot, and a live
  activity snippet pulled from streamed events (respects
  prefers-reduced-motion).
- Add an `awaiting-input` phase so the preview no longer reverts to the
  empty "design will appear here" placeholder when the agent asks the
  user a clarifying question (detects inline <question-form>).
- Add a `stopped` phase so a canceled/paused run keeps a contextual
  paused card instead of blanking the surface.
- Fix workspaceHasPreviewSurface live-artifact tab match (was reading a
  non-existent `tabId` field) and correct the unit assertion that
  contradicted the helper's `thinking` handling.
- Populate generationPreview.* keys (incl. new awaiting/stopped strings)
  across all locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): reveal generation steps progressively as the agent reaches them

- Only render steps the agent has actually reached (drop pending pills)
  with a slide/fade entrance, so the card visibly evolves 1->2->3 instead
  of always showing the same fully-populated row.
- Keep the "understand" step in progress during requesting/starting so a
  fresh run opens with a single step rather than a pre-filled set.
- Stop surfacing status detail (e.g. the model slug from `requesting`) as
  the live activity line; only genuine thinking/output text is shown.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): add dynamic sub-status to the generating step

Keep 3 high-level steps but give the long "generating" phase concrete,
moving feedback (option A) instead of splitting into more, less-reliable
steps:

- Derive a sub-status from the agent's TodoWrite plan: the in-progress
  task label (activeForm) plus a done/total count, falling back to the
  latest write/edit target file when no plan was emitted.
- The count counts the in-progress task toward `done` to match the
  chat-side todo card (e.g. 3/7 on both sides).
- Suppress the higher-level narration line while the sub-status is shown
  so only one dynamic line appears at a time (early phase = narration,
  writing phase = concrete task + count).
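
A sketch of how such a sub-status could be derived. The `TodoItem` shape (`content`, `activeForm`, `status`) and the `deriveSubStatus` name are assumptions for illustration, and falling back to the file when a plan has no in-progress task is a choice made for this sketch, not something the commit states.

```typescript
type TodoStatus = "pending" | "in_progress" | "completed";

interface TodoItem {
  content: string;
  activeForm?: string;
  status: TodoStatus;
}

interface SubStatus {
  label: string;
  done?: number;
  total?: number;
}

function deriveSubStatus(
  todos: TodoItem[],
  lastWrittenFile?: string,
): SubStatus | null {
  const active = todos.find((t) => t.status === "in_progress");
  if (active) {
    const completed = todos.filter((t) => t.status === "completed").length;
    return {
      label: active.activeForm ?? active.content,
      // The in-progress task counts toward `done`, matching the chat-side card.
      done: completed + 1,
      total: todos.length,
    };
  }
  // No usable plan: fall back to the latest write/edit target, if any.
  return lastWrittenFile ? { label: lastWrittenFile } : null;
}
```

For a plan with two completed tasks, one in progress, and four pending, this yields a count of 3/7.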

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): drop elapsed timer and duplicate estimate from generation preview

The "usually 2–5 minutes" estimate showed twice (lead footnote + meta row)
and the elapsed counter added little signal, so remove both: delete the
meta row, stop falling back to the estimate footnote in the generating
lead (render the lead only when live narration exists), and drop the now
unused elapsed timer/util.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-30 13:59:49 +00:00
leessju
cfde84b038
fix(web): make hand-off no-editors fallback perform a real reveal (#2494)
* fix(web): make hand-off no-editors fallback perform a real reveal

The Finder/Explorer/File Manager fallback button was only calling an
optional onRequestRevealInFinder prop that the actual caller never
passes, so the surface advertised an action it never performed.

finder, explorer, and file-manager are real entries in the daemon's
open-in catalogue (open / explorer / xdg-open), so route the fallback
through openProjectInEditor(projectId, fallbackId) for a genuine
reveal. Keep the renderer reveal bridge as a secondary fallback if
the daemon spawn fails, and disable the button while busy so a double
click can't queue two reveals.

Adjacent: PreviewDrawOverlay's Send-while-streaming behavior is
intentional (sending is queued downstream, not blocked), and the
button already carries sendDisabledReason as its tooltip. Cover that
contract with a regression test so a future change can't silently
re-disable the control or drop the localized reason.

Scope note: the i18n hand-off key migration that previously rode on
this branch landed on main via a different key set, so this PR is
narrowed to just the fallback wire-up and the two regression tests.

* fix(web): surface daemon spawn failure inline in zero-editors fallback

The zero-editors HandoffButton fallback called setError() on a rejected
openProjectInEditor but returned only the <button>, so the error never
rendered. Production callers (ProjectView) mount the component without
onRequestRevealInFinder, so a daemon spawn failure became a silent no-op
— exactly the failure mode the PR was meant to cover.

Wrap the solo button in a handoff-wrap container and render the error
inline next to it. Adds a regression test for the rejected-spawn path.

* fix(web): align preview draw send-disabled test

* fix(web): show handoff fallback for zero editors

---------

Co-authored-by: nicejames <nicejames@gmail.com>
Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-30 06:34:12 +00:00
RyanCheng77
b76e7196db
fix(daemon): dedupe Claude stream wrappers (#3334)
* fix(daemon): dedupe Claude stream wrappers

* fix(daemon): split Claude stream dedupe state

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 06:12:29 +00:00
mehmet turac
259295419a
fix(web): remove design file mentions with chips (#3204)
2026-05-30 04:50:50 +00:00
Weston Houghton
7a9dcf38d7
fix(memory): deliver OpenCode extraction prompt on stdin (#3238)
`opencode run`'s `-f, --file` is a yargs array option that greedily
consumes every trailing non-flag token, so the memory extractor's
`--file <prompt-file> "<message>"` invocation made OpenCode treat the
message text as a second attachment and exit 1 with "File not found".
Every LLM memory extraction failed for OpenCode Local CLI users.

Deliver the prompt on stdin like the chat-run path (def.promptViaStdin)
and drop the --file attachment. The connector-memory test now models
the real yargs --file array-greediness so it would catch a regression.
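
The greediness described above can be modelled in a few lines. This is a simplified stand-in for the behaviour the commit describes, not yargs itself; `parseGreedyFileOption` and its return shape are hypothetical.

```typescript
interface ParsedArgs {
  files: string[];
  positionals: string[];
  flags: string[];
}

// Simplified model: once --file / -f is seen, every following token that
// does not start with "-" is collected as another file.
function parseGreedyFileOption(argv: string[]): ParsedArgs {
  const parsed: ParsedArgs = { files: [], positionals: [], flags: [] };
  let collectingFiles = false;
  for (const arg of argv) {
    if (arg === "--file" || arg === "-f") {
      collectingFiles = true;
    } else if (arg.charAt(0) === "-") {
      collectingFiles = false; // another flag ends the greedy run
      parsed.flags.push(arg);
    } else if (collectingFiles) {
      parsed.files.push(arg);
    } else {
      parsed.positionals.push(arg);
    }
  }
  return parsed;
}
```

Under this model, `["run", "--file", "prompt.md", "extract memories"]` puts the message text into `files` next to `prompt.md`, which is the shape of the "File not found" failure. Dropping `--file` and sending the prompt on stdin leaves nothing for the option to swallow.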
2026-05-30 04:48:42 +00:00
RyanCheng77
f12679185c
fix(web): send Anthropic proxy image attachments (#3273)
* fix(web): send Anthropic proxy image attachments

* fix(web): omit image attachment stubs for Anthropic proxy

* fix(web): keep image fallback context aligned

* fix(web): align Anthropic image attachment omission

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 04:47:47 +00:00
RyanCheng77
653a3fcc70
fix(web): harden image export downloads (#3318)
* feat(web): export preview as image

* fix(web): harden image export downloads

* docs(skills): add PR feedback quality gate

* docs(skills): require critical review of Claude feedback

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 04:44:00 +00:00
YOMXXX
9305bd1cff
fix(web): truncate long project names in the automation project picker (#3274) (#3317)
Long project names in the "Existing projects" section of the
automation project picker rendered verbatim with no truncate styling,
so a single name like "A very long project name that would otherwise
wrap onto several lines" blew up the row height and made the dropdown
messy to scan. The expected behavior is a single-line label with
ellipsis, with the full name still discoverable on hover.

Add the standard truncate triad (`white-space: nowrap`,
`overflow: hidden`, `text-overflow: ellipsis`) to
`.automation-popover__label`. The parent
`.automation-popover__body` already sets `min-width: 0`, so the
ellipsis renders cleanly. Thread an optional `title` prop through
`PopoverItem` and pass each project's full name from the picker
call site, so the native hover tooltip carries the unclipped name.

Other PopoverItems with fixed in-product copy (e.g. "New project
each run") deliberately omit the title — they never exceed the row
width and the redundant tooltip would be noise.

Regression test covers the DOM contract (every project row has
`title=<full name>`, fixed rows do not); the CSS half is verified by
code review since jsdom does not apply stylesheets.
2026-05-30 04:42:21 +00:00
open-design-bot[bot]
e76eb6da63
Update docs/assets/github-metrics.svg (#3338)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-30 04:31:16 +00:00
Ramiro
ed5e8c147b
fix(web): keep pet composer menu expanded (#3336)
- apps/web/src/components/ChatComposer.tsx
- apps/web/tests/components/ChatComposer.context-pickers.test.tsx

Clear stale absolute anchors when the pet composer menu is positioned fixed so the popover wraps its content instead of collapsing over the composer textarea.
2026-05-30 04:19:02 +00:00
Ramiro
c33641e592
fix(daemon): normalize cumulative ACP message chunks (#3333)
* fix(daemon): normalize cumulative acp message chunks

- apps/daemon/src/acp.ts
- apps/daemon/tests/acp.test.ts
- apps/web/src/providers/daemon.ts
- apps/web/src/components/DesignSystemFlow.tsx

Convert cumulative ACP message snapshots into suffix deltas and keep temporary browser debug instrumentation for trace verification.
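
The conversion can be sketched as a small stateful normalizer. `createChunkNormalizer` is a hypothetical name, and treating "chunk starts with everything seen so far" as the sign of a cumulative snapshot is an assumption of this sketch; the daemon's actual detection rule may differ.

```typescript
// Returns a function that turns each incoming chunk into a suffix delta.
function createChunkNormalizer(): (chunk: string) => string {
  let seen = ""; // text delivered so far for this message
  return (chunk: string): string => {
    const isCumulative =
      seen.length > 0 &&
      chunk.length >= seen.length &&
      chunk.slice(0, seen.length) === seen;
    if (isCumulative) {
      // Snapshot of the whole message: emit only the part not yet seen.
      const delta = chunk.slice(seen.length);
      seen = chunk;
      return delta;
    }
    // Ordinary incremental delta: pass through and remember it.
    seen += chunk;
    return chunk;
  };
}
```

One limitation of this heuristic: a genuine delta that happens to begin with the full text seen so far would be misread as a snapshot.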

* chore(web): remove temporary stream debug hooks

- apps/web/src/providers/daemon.ts
- apps/web/src/components/DesignSystemFlow.tsx

Remove the browser debug accumulator after validating the ACP duplication trace.
2026-05-30 04:17:32 +00:00
xinsngx
41b1cd763e
fix(media): hide OpenAI OAuth-only image credentials (#3308)
* fix(media): ignore OpenAI OAuth tokens

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.1

* fix(media): hide unavailable model providers

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.2

* fix(media): clear unavailable picker models

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.3

* fix(media): keep missing-model projects executable

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.8

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 04:12:10 +00:00
JasonBroderick
0fbeaf829e
fix(#3247): Detect, terminate, and warn on fabricated role markers across all agent paths (#3303)
* fix(daemon): detect and strip fabricated role markers in model output (#3247)

Three-layer defence against models emitting `## user` / `## assistant` /
`## system` lines mid-response, which the chat host interprets as real
turn boundaries and acts on as unauthorised instruction:

1. **System prompt**: anti-roleplay instruction elevated from a bullet
   under "What you don't do" to a standalone `## CRITICAL` section in
   `official-system.ts`, with a REMINDER pinned at the end of the
   composed prompt for recency bias.

2. **Stream-level detection and truncation**: shared `role-marker-guard.ts`
   module (`createRoleMarkerGuard` + `FABRICATED_ROLE_MARKER_RE`) used
   across all text paths — Claude stream (per-message guards), non-Claude
   structured streams (run-scoped guard via `emitGuardedTextDelta`),
   and BYOK proxy routes (`createDeltaGuard`). When a marker is detected,
   the contaminated suffix is dropped and a `fabricated_role_marker` event
   surfaces a warning in the UI.

3. **UI**: `StatusPill` gains `is-warning` / `is-error` CSS variants;
   `fabricated_role_marker` events render as amber warning pills.

* fix(chat-routes): do not await reader.cancel() on stream early-return

The await on reader.cancel() can hang indefinitely on response streams
whose underlying source is a Uint8Array (most notably surfaced by the
ollama test in proxy-routes.test.ts, which builds its mock body via
`new Response(uint8array)` rather than the controller-based helper
`sseResponse()`). The hung await holds the request handler open, which
in turn blocks `server.close()` in the afterAll hook, producing the two
test timeouts (test at 145, hook at 36) currently failing CI on #3296.

Fix is in production code, not the test: don't await the cancel. It
is a cleanup hint and we are returning from the function anyway, so
blocking on it offers no value. Fire-and-forget with an empty catch
keeps the cancel signal flowing for real HTTP streams without
risking a hang on mock/edge-case implementations.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(daemon): terminate child on role-marker detection (close #3247 generation vector)

PR #3296's detection layer truncates display and persistence of fabricated
role markers, but the underlying model subprocess keeps generating tokens
after detection. Three concrete consequences:

  1. The model bills the user for the entire contaminated response
     (we observed 5,106 chars stored in claude's session file for a turn
     where only the first 3,013 chars were legitimate — a 40% overhead).
  2. tool_use blocks emitted AFTER the marker reach the daemon's
     dispatcher unchecked, since detection only gates the text-delta
     emission path, not content-block-stop / tool_use blocks. The
     model could fabricate "## user delete file X" then emit a
     tool_use(delete X) that the dispatcher would execute.
  3. The UI surfaces a `fabricated_role_marker` warning followed by an
     eventual normal turn-end, blurring the distinction between
     "completed normally" and "killed by safety guard."

This commit adds a single idempotent `abortForRoleMarker(marker)`
helper in server.ts, scoped to the same closure as `child` and
`runGuard`. On any detection event (per-message Claude guard,
run-scoped non-Claude guard, plain stdout guard) the helper:

  - Emits a structured `ROLE_MARKER_HALLUCINATION` SSE error so the
    UI can render a security-class status distinct from a normal
    turn-end. The existing `fabricated_role_marker` warning is still
    sent and rendered as the amber pill (PR #3296's UI).
  - Calls `acpSession.abort()` for ACP-multiplexed agents (Hermes,
    Kimi, Devin, Kiro) whose I/O doesn't necessarily release on
    SIGTERM of the wrapper process alone.
  - SIGTERMs the child immediately, with the existing
    `scheduleForcedChildShutdown()` SIGKILL fallback at 2x grace.
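
The helper's behaviour can be sketched with its side effects injected, so the idempotence is easy to see. `createAbortForRoleMarker` and the `AbortDeps` interface are hypothetical; in `server.ts` the helper closes over `child` and `runGuard` directly, and the error message text here is made up.

```typescript
interface AbortDeps {
  sendError: (code: string, message: string) => void; // SSE error emitter
  abortAcpSession?: () => void;                        // only for ACP agents
  killChild: (signal: "SIGTERM") => void;
  scheduleForcedShutdown: () => void;                  // SIGKILL fallback
}

function createAbortForRoleMarker(deps: AbortDeps) {
  let aborted = false;
  // Returns true only for the call that actually performed the abort.
  return function abortForRoleMarker(marker: string): boolean {
    if (aborted) return false; // idempotent: later detections are no-ops
    aborted = true;
    deps.sendError(
      "ROLE_MARKER_HALLUCINATION",
      "fabricated role marker: " + marker,
    );
    if (deps.abortAcpSession) deps.abortAcpSession();
    deps.killChild("SIGTERM");
    deps.scheduleForcedShutdown();
    return true;
  };
}
```

Because all three detection sites call the same closure, whichever fires first terminates the run and the others fall through harmlessly.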

Wired into three sites where contamination is detected:
  - `emitGuardedTextDelta` (sendAgentEvent / copilot / ACP / pi-rpc
    text_delta paths)
  - Plain-stdout listener (BYOK plain mode)
  - The Claude stream handler's onEvent (per-message guards in
    claude-stream.ts surface `fabricated_role_marker` events directly
    via onEvent rather than through the run-scoped emitGuardedTextDelta)

Tool_use blocks emitted BEFORE the marker still flow through normally
— this guard can't help with those, since by the time we observe a
text marker the prior content block has already finished. Closing
that gap requires speculative cancellation of in-flight tool calls
when a downstream text block contains a marker; that's tracked as
follow-up work, not included here.

Co-Authored-By: roverkai <2196140098@qq.com>
Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* refactor(role-marker-guard): bounded tail + drop chat-style markers

Addresses two review comments on #3303:

(1) O(1) memory + per-delta work (review r3323982225)
  Replace the unbounded `accumulated` string with a rolling tail capped
  at TAIL_BUFFER_SIZE (64 chars — comfortably exceeds the longest
  marker prefix `\n<whitespace>## assistant` ≈ 16–24 chars in practice).
  A 50 KB assistant response delivered in 1000 chunks of 50 bytes was
  previously O(n²) on string concatenation alone; now it is O(1) per
  delta regardless of message length. The `tail.length` value carries
  the "already emitted" offset that the cut-point math needs, so the
  offset semantics at L74–78 of the prior implementation are preserved
  without re-introducing the full-text buffer.

(2) Drop chat-style markers entirely (review r3323982234, option (a))
  `User:` / `Assistant:` / `Human:` / `AI:` are removed from the regex.
  Rationale:
    - The host parses ONLY `## user` / `## assistant` / `## system`
      lines as turn boundaries (see `buildDaemonTranscript` in
      apps/web/src/providers/daemon.ts). A model emitting chat-style
      markers does NOT cause the original #3247 security failure.
    - With kill-on-detection wired in this PR (`abortForRoleMarker`
      in server.ts), a false positive aborts the whole run — far
      more expensive than a stray unflagged `User:` line in chat
      scrollback. Chat-style markers collide with legitimate output
      (form labels, email contacts, JSDoc) often enough that pairing
      them with kill-semantics is the wrong tradeoff.
  The tradeoff is now documented in the regex docblock so the
  kill-on-match behaviour is justified against the false-positive
  surface.

Also aligns the prompt-side CRITICAL block in system.ts: drop the
"don't emit User: / Assistant: / Human: / AI:" bullet, since we no
longer enforce it. Less ambiguity for the model and the operators.

Test file updated:
  - Chat-style positive tests flipped to negative ("does NOT match
    User: — chat-style out of scope") so the intentional exclusion
    has a permanent regression test.
  - Two new tests cover the bounded-tail behaviour: a marker arriving
    after 10 KB of clean text in small chunks, and a marker
    straddling a chunk boundary after 100 prior chunks.
  - Added test for legitimate `User: bob@example.com`-style content
    not triggering contamination.
Test count is now 35 (up from 25); two of the new ones explicitly
exercise the new bounded-tail path.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): drop \`^\` anchor after first chunk (review r3324060995)

Blocking correctness bug introduced by commit 4 (bounded-tail refactor):
once \`tail\` is a rolling slice of mid-stream text, \`^\` in the
canonical regex \`(?:^|\\n)\\s*##\\s+(?:user|...)\` no longer represents
the genuine message start. As the rolling window slides forward chunk
by chunk, a sliced tail can begin with whitespace + \`##\` (or just
\`##\`), letting \`^\` anchor a match against text that the
full-buffer implementation correctly ignored. With kill-on-detection
wired in commit 3, that false positive now SIGTERMs the run and emits
a \`ROLE_MARKER_HALLUCINATION\` error — exactly the failure class
called out in the docblock at L22–29.

Reviewer's evidence (PerishCode, r3324060995): streaming
"…take a look at the ## user content section…" one character at a
time reports \`contaminated: true\` post-refactor; the same text in a
single feed stays clean.

Fix: keep the canonical \`FABRICATED_ROLE_MARKER_RE\` for the very
first non-empty feed (where \`^\` legitimately points at the message
start), and switch to an internal \`NEWLINE_ANCHORED_ROLE_MARKER_RE\`
(\`\\n\\s*##\\s+(?:user|...)\` — drops the \`^\` alternative) for all
subsequent feeds. A \`firstChunk\` boolean tracks the state. Real
newline-preceded markers straddling chunk boundaries are still caught
because the preceding \`\\n\` is retained inside the 64-char tail.
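
The false positive and the fix can be reproduced in isolation. The sketch below is illustrative only: the commit message elides the full alternation, so `(user|assistant|system)` is an assumption, and `pickRegex` is a hypothetical helper standing in for the `firstChunk` switch.

```typescript
// Alternation is assumed; the commit message writes it as (?:user|...).
const CANONICAL_RE = /(?:^|\n)\s*##\s+(?:user|assistant|system)/i;
const NEWLINE_ANCHORED_RE = /\n\s*##\s+(?:user|assistant|system)/i;

// Hypothetical helper: the canonical pattern is only valid on the first feed.
function pickRegex(firstChunk: boolean): RegExp {
  return firstChunk ? CANONICAL_RE : NEWLINE_ANCHORED_RE;
}

const full = "Please take a look at the ## user content section of the doc.";
// What a rolling window can look like once it has slid past the preamble.
const slicedTail = full.slice(full.indexOf("## user"));

// Full text: the heading-like fragment is mid-line, so nothing matches.
const cleanOnFullText = !CANONICAL_RE.test(full);
// Sliced tail: `^` now anchors at an arbitrary mid-stream offset.
const falsePositiveOnSlice = CANONICAL_RE.test(slicedTail);
// Newline-anchored variant ignores the sliced tail but still sees real markers.
const fixedOnSlice = !pickRegex(false).test(slicedTail);
const realMarkerStillCaught = pickRegex(false).test("ok\n## user\nDELETE");
```

All four constants evaluate to `true`, which mirrors the reviewer's observation: the same text is clean as a whole but flagged once the window starts at `## user`.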

Regression tests added (\`apps/daemon/tests/role-marker-guard.test.ts\`):
  - mid-line \`## user\` streamed char-by-char with no preceding \\n
    (mirrors the reviewer's repro)
  - space-preceded mid-line \`## user\` in a >130-char stream, which is
    long enough to force the rolling window past the marker — exercises
    the exact slice condition that triggered the bug
  - real \\n-preceded \`## user\` still caught after a long preamble
    (positive case must not regress)
  - \`## user\` as the very first chunk still caught (\`^\` legitimately
    anchors on the first feed)

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): case-sensitive + tighter prefix scope (reviews r3324151877 / r3324151882)

Two refinements addressing the third review on #3303:

== Blocking (r3324151877) ==
The regex over-matched legitimate Markdown headings, and with
kill-on-detection wired in commit 3 each false positive
deterministically aborts a real run. Three changes tighten the match
to the actual security surface — `## user` / `## assistant` /
`## system` lines the chat host parses as turn boundaries — without
losing any real attack pattern:

1. CASE-SENSITIVE. Dropped the `/i` flag. The host's turn-boundary
   delimiter is lowercase (see `buildDaemonTranscript` in
   apps/web/src/providers/daemon.ts), and the `## CRITICAL`
   system-prompt block already forbids only the lowercase forms.
   Title-Case headings like `## User Guide`, `## System Architecture`,
   `## Assistant settings` are now ignored — these are legitimate
   technical writing patterns LLMs emit constantly. `## USER NOTES`
   (all-caps) likewise no longer flags.

2. POSITIVE LOOKAHEAD `(?=[^a-z])` after the role keyword. Without it,
   `## userland`, `## userspace`, `## users guide`, `## systemd`,
   `## assistance` all match via prefix in the alternation. The
   lookahead requires the next character to exist and to not be a
   lowercase letter, so:
     - `## user\\n…`     → match (newline is not lowercase)
     - `## assistantR…` → match (R is uppercase; the glued-form
                          attack pattern still gets caught)
     - `## assistant.`  → match (. is not a letter)
     - `## users guide` → no match (s is lowercase letter)
     - `## userland`    → no match (l is lowercase letter)
   POSITIVE rather than NEGATIVE `(?![a-z])` because the negative
   form is satisfied at end-of-string, which in a streaming context
   means "we have `## user` but don't know what comes next yet" —
   would fire prematurely if `land` arrives in a later chunk. The
   positive form delays detection by one character in that edge
   case, traded for correctness.

3. `[ \\t]` instead of `\\s` for inner whitespace. Markdown role
   markers are single-line by convention; restricting to space/tab
   prevents oddities like `##\\nuser` from matching across lines.
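
The three changes can be checked against the examples above with a small sketch. The pattern below is reconstructed from this description (case-sensitive, `[ \t]` whitespace, positive lookahead) and may differ in detail from the one in the PR; `isMarker` and `NEGATIVE_VARIANT_RE` are names made up for the example.

```typescript
// Reconstructed from the description above; treat as an approximation.
const ROLE_MARKER_RE =
  /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;
// The rejected alternative, shown only to illustrate the end-of-string issue.
const NEGATIVE_VARIANT_RE =
  /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)(?![a-z])/;

function isMarker(text: string): boolean {
  return ROLE_MARKER_RE.test(text);
}

const shouldMatch = ["## user\nhi", "## assistantReading", "## assistant."];
const shouldNotMatch = [
  "## User Guide", // Title-Case heading
  "## USER NOTES", // all-caps heading
  "## users guide", // lowercase letter follows the keyword
  "## userland",
  "## systemd",
  "## assistance",
  "##\nuser", // newline is not allowed as inner whitespace
  "## user", // end of string: lookahead has no character to confirm yet
];

const allMatch = shouldMatch.every(isMarker);
const noneMatch = shouldNotMatch.every((t) => !isMarker(t));
// The negative form would already fire on the unconfirmed "## user".
const negativeFiresEarly = NEGATIVE_VARIANT_RE.test("## user");
```

With the positive lookahead, `## user` at the end of a chunk stays unmatched until the next character arrives, which is the one-character delay described in point 2.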

Test file: added Title-Case fixtures (`## User Guide`,
`## System Architecture`, `## Assistant settings`, `## USER NOTES`)
and prefix-of-longer-word fixtures (`## users guide`, `## userland`,
`## systemd`, `## assistance`) — each asserting NO contamination.
The existing `## usability` negative test gave false confidence as
the reviewer noted (only failed via alternation-miss, not via
word-boundary semantics); the new fixtures actually exercise the
lookahead. Also added a positive test for `## assistant.` (glued
punctuation) to balance the existing `## assistantReading`
(glued uppercase) coverage. Total tests: 35 → 50.

== Non-blocking (r3324151882) ==
Added `ROLE_MARKER_HALLUCINATION` to `API_ERROR_CODES` in
`packages/contracts/src/errors.ts` alongside the existing agent/AMR
codes, with a docblock comment explaining the emission contract:
emitted by `server.ts::abortForRoleMarker` alongside the existing
`fabricated_role_marker` warning event when the daemon detects a
fabricated Markdown role marker in agent output; retryable. The code
was already being emitted over the wire but unregistered — landing
the registration here keeps the contract and emitter in sync as
reviewer requested.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): defer complete-but-unconfirmed marker suffix

Addresses review r3324277xxx — the boundary case where a stream chunk
boundary lands between the role keyword and its lookahead character
violated the documented "everything from the marker onward is silently
dropped" contract. With (?=[^a-z]) as the lookahead, `feedText('## user')`
returned `## user` as safe (no char to satisfy the lookahead → no match
→ pass through), so the fabricated marker line leaked into UI and
app.sqlite before the next chunk confirmed contamination on the next
SIGTERM cycle.

Fix: introduce a `pending` state variable holding bytes that match the
COMPLETE-but-unconfirmed marker prefix at end of buffer
(/(?:^|\\n)[ \\t]*##[ \\t]+(?:user|assistant|assist|system)$/, no
lookahead, $ anchor instead). When the no-match branch detects this
suffix, withhold it from emission until the next feed either:
  - Confirms it (next char non-lowercase) → main regex matches →
    contaminated → withheld bytes dropped along with `## user`.
  - Denies it (next char lowercase, e.g. `userl…`) → main regex no
    longer matches the role keyword → withheld suffix is released
    and emitted alongside the new continuation.
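
A minimal sketch of the confirm/deny flow, assuming the end-anchored pattern quoted above. `createDeferringGuard` and `FeedResult` are hypothetical names, and the rolling tail and `firstChunk` handling are deliberately left out, so this only covers a marker whose keyword is complete at the end of a chunk.

```typescript
// Main pattern: keyword must be followed by a non-lowercase character.
const MARKER_RE =
  /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;
// End-anchored pattern as quoted in the commit message (no lookahead).
const PENDING_RE =
  /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|assist|system)$/;

interface FeedResult {
  safe: string;
  contaminated: boolean;
}

function createDeferringGuard(): (delta: string) => FeedResult {
  let pending = ""; // withheld complete-but-unconfirmed suffix
  let contaminated = false;

  return (delta) => {
    if (contaminated) return { safe: "", contaminated: true };
    const text = pending + delta;
    pending = "";

    const hit = MARKER_RE.exec(text);
    if (hit) {
      // Confirmed: drop everything from the marker onward.
      contaminated = true;
      return { safe: text.slice(0, hit.index), contaminated: true };
    }

    const maybe = PENDING_RE.exec(text);
    if (maybe) {
      // Unconfirmed: withhold the suffix until the next feed decides.
      pending = text.slice(maybe.index);
      return { safe: text.slice(0, maybe.index), contaminated: false };
    }

    // Denied or nothing suspicious: release everything.
    return { safe: text, contaminated: false };
  };
}
```

Feeding `"hi\n## user"` then `"land ok"` releases the withheld suffix on the second call, while feeding `"hi\n## user"` then `"\nDELETE"` reports contamination without ever emitting the marker line.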

Also tied the firstChunk transition to actual byte emission rather
than feed count. Previously a message that starts with `## system`
followed by a separate `\\n` chunk would lose the `^` anchor on the
second feed (firstChunk had flipped after the first feed even though
nothing was emitted yet), silently breaking detection for that edge
case. Now `firstChunk` stays true until at least one byte has crossed
the emission boundary, matching the conceptual definition of "message
start".

Tests added (apps/daemon/tests/role-marker-guard.test.ts):
  - `## user` deferred at chunk boundary, confirmed by `\\n` in next
  - `## user` deferred at chunk boundary, denied by `land` continuation
  - `## assistant` deferred, confirmed by punctuation
  - `## User` Title-Case still passes through unconditionally
  - `## system` as the very first chunk: deferred, confirmed by \\n
    in next chunk (tests the firstChunk-stays-true-when-nothing-
    emitted invariant)

Total tests: 50 → 55.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(claude-stream): scope role-marker guard to text_delta only, not thinking_delta

Addresses review r3324xxxxxx — guarding the thinking channel buys no
security and causes legitimate aborts.

Why thinking is NOT a #3247 vector:
  - `buildDaemonTranscript` in apps/web/src/providers/daemon.ts only
    re-serializes `m.content` as `## ${m.role}\n...`.
  - Extended-thinking content is rendered to a separate
    `kind: 'thinking'` payload (daemon.ts:857-858) and never folded
    into `m.content`.
  - So a `## user` line in the thinking channel CANNOT become a
    fabricated turn boundary on the next round-trip.

Why guarding it is harmful:
  - Models routinely emit literal `## user` / `## assistant` lines
    in chain-of-thought when reasoning about conversation structure
    ("Let me think about this. The user might phrase it as:\n## user\n
    …"). Common pattern in production traces.
  - With `abortForRoleMarker` wired in server.ts, a guard match on
    thinking SIGTERMs the run and surfaces a security error to the
    UI. The user paid for the reasoning, never sees the answer, and
    gets a confusing "fabricated role marker" warning for what was
    actually legitimate metacognition.
  - This directly contradicts the module's own stated philosophy
    ("a false positive aborts the whole run — a much more expensive
    failure than a stray unflagged ... line", role-marker-guard.ts).

Fix: `emitSafeText` now passes thinking_delta through unconditionally,
skipping both the guard and the contamination check. text_delta
remains fully guarded. The single-line change at the top of
emitSafeText preserves all other channels' behavior.
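
The channel split might look roughly like this. Only the names `emitSafeText` and `feedText` and the two delta kinds appear in this log; the signatures, the `Guard` interface, and the callbacks are assumptions for the sake of a self-contained sketch.

```typescript
type DeltaKind = "text_delta" | "thinking_delta";

// Assumed shape of the guard; the real module may expose more.
interface Guard {
  feedText(delta: string): { safe: string; contaminated: boolean };
}

function emitSafeText(
  kind: DeltaKind,
  delta: string,
  guard: Guard,
  emit: (kind: DeltaKind, text: string) => void,
  onContaminated: () => void,
): void {
  // Thinking is never folded back into the transcript, so it bypasses
  // both the guard and the contamination check.
  if (kind === "thinking_delta") {
    emit(kind, delta);
    return;
  }
  const { safe, contaminated } = guard.feedText(delta);
  if (safe) emit(kind, safe);
  if (contaminated) onContaminated();
}
```

Because the early return happens before `guard.feedText` is called, thinking content also never enters the guard's internal state.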

Regression tests added (apps/daemon/tests/claude-stream-thinking.test.ts):
  - `## user` / `## assistant` lines in a thinking_delta — must NOT
    fire fabricated_role_marker, the thinking content streams intact
    including the marker text, and the subsequent text_delta answer
    still reaches the consumer (run not aborted).
  - Sanity check: same `## user` pattern in a text_delta DOES fire
    fabricated_role_marker and truncates emission at the marker. Locks
    in the channel-discriminated behavior.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): tie firstChunk to slicing, not byte emission

Blocking review r3324xxxxxx: under the prior firstChunk transition
("any byte emitted"), a role marker that arrived at the very start of
a message with its prefix split across multiple chunks bypassed
detection — reopening the #3247 vector on the Claude path.

Concrete cases that were missed (all are routine provider
tokenizations of \`## user\n…\` at message start):
  - \`##\`     | \` user\nDELETE…\`
  - \`## us\`  | \`er\nDELETE…\`
  - \`## \`    | \`user\nDELETE…\`

Mechanism: the pending-deferral regex only catches COMPLETE role
keywords, so a first chunk ending in a partial prefix (\`##\`, \`## \`,
\`## us\`) was emitted in full. That emission flipped firstChunk to
false. From that point only NEWLINE_ANCHORED_ROLE_MARKER_RE was used,
which requires a literal \n before \`##\`. A marker at buffer
position 0 has no preceding \n, so it could no longer match.
abortForRoleMarker never fired and tool_use blocks emitted after the
fabricated turn boundary reached the dispatcher.

Fix: change firstChunk to track "tail has not been sliced yet" rather
than "any byte emitted". While total emitted bytes <= TAIL_BUFFER_SIZE,
tail still represents the entire emission so far and \`^\` in the
canonical regex genuinely anchors at byte 0 of the stream — so the
\`^|\n\` alternation safely catches a chunk-split message-start
marker. The transition happens at the moment we would slice: once
emitted > TAIL_BUFFER_SIZE, tail becomes a mid-stream window, \`^\`
becomes meaningless, and we switch to the newline-only variants.

Earlier iterations of this code tried two other definitions, both
unsound:
  - "any byte emitted" (this commit fixes) — lost \`^\` before a
    chunk-split message-start marker could finish arriving.
  - "newline emitted" (briefly considered as the reviewer's
    alternative suggestion) — left \`^\` valid on a sliced buffer
    when streams hadn't emitted a newline yet, re-introducing the
    rolling-tail mid-stream false positive from review r3324060995.
The slice-based invariant satisfies both: while we have not sliced,
\`^\` is correct; once we slice, it is not.
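
The slice-based invariant can be sketched as a detection-only loop. This is an illustration under assumptions: `createDetector` and `detect` are hypothetical names, the patterns are reconstructed from the earlier commit messages, and emission and pending-deferral logic are omitted.

```typescript
const TAIL_BUFFER_SIZE = 64;
// Patterns reconstructed from the commit messages; treat as approximations.
const CANONICAL_RE =
  /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;
const NEWLINE_ONLY_RE =
  /\n[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;

function createDetector(): (delta: string) => boolean {
  let tail = "";
  let sliced = false; // false while tail still starts at byte 0 of the stream

  return (delta) => {
    const window = tail + delta;
    // `^` is only meaningful while the window begins at the message start.
    const hit = (sliced ? NEWLINE_ONLY_RE : CANONICAL_RE).test(window);
    if (window.length > TAIL_BUFFER_SIZE) {
      tail = window.slice(-TAIL_BUFFER_SIZE);
      sliced = true; // from here on the window is a mid-stream slice
    } else {
      tail = window;
    }
    return hit;
  };
}

// Convenience wrapper: true if any feed in the sequence reports a marker.
function detect(chunks: readonly string[]): boolean {
  const feed = createDetector();
  let found = false;
  for (const chunk of chunks) {
    if (feed(chunk)) found = true;
  }
  return found;
}
```

Under this sketch, `detect(["##", " user\nDELETE"])` reports a marker because the window has not been sliced yet, while a mid-line `## user` streamed one character at a time after a long preamble does not.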

Regression tests added (apps/daemon/tests/role-marker-guard.test.ts):
  - \`##\`    | \` user\nDELETE…\`   → contaminated, marker=\`## user\`
  - \`## us\` | \`er\nDELETE…\`      → contaminated, marker=\`## user\`
  - \`## \`   | \`user\nDELETE…\`    → contaminated, marker=\`## user\`
  - \`#\`     | \`# user\nDELETE…\`  → contaminated, marker=\`## user\`
The fourth case (single \`#\` first chunk) exercises an even more
adversarial tokenization than the reviewer's examples; it is also
caught.

Total tests: 55 → 59.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(tests): wrap events in stream_event envelope in thinking test

feedJsonl was feeding raw events without the `{ type: 'stream_event',
event: ... }` wrapper that createClaudeStreamHandler requires (line 141
of claude-stream.ts). Events silently fell through all branches, making
both tests pass vacuously. Also fix TS2532 on warnings[0].marker with
non-null assertion (safe after the toHaveLength(1) guard).

Co-Authored-By: RoverKai <roverkai@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: roverkai <2196140098@qq.com>
Co-authored-by: JasonBroderick <jason@buddyboss.com>
Co-authored-by: RoverKai <roverkai@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-30 03:57:56 +00:00
xinsngx
c88a83cd5e
fix(web): preserve preview scroll across tools (#3313)
* fix(web): preserve preview scroll across tools

Capture URL-loaded preview scroll state before tool handoff and restore it through an opt-in raw HTML bridge to avoid jumping back to the top.

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.6

* test(daemon): cover scroll bridge injection paths

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.6

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 03:53:50 +00:00
xinsngx
778010bcf9
fix(web): theme home hero select menu (#3309)
* fix(web): theme home hero select menu

Use theme tokens for HomeHero footer select dropdown panels so dark theme menus do not render light-only white backgrounds with low-contrast text.

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.3

* test(web): cover dark model logo inversion

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.7

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 03:53:42 +00:00
Nicholas-Xiong
610aac4cc5
fix: insert skill reference when selecting from tools panel (#3220)
* fix: insert skill reference when selecting from tools panel

When selecting a skill from the tools panel (not via @ mention), the skill
reference was not being inserted into the input. The tools panel only called
applyProjectSkill() without inserting the text token.

This fix makes the tools panel skill picker behave consistently with other
reference types (MCP, connectors) by:
- Inserting the @skill token at the current cursor position
- Setting focus and cursor position after the inserted reference
- Closing the tools panel after successful insertion

Now users can:
1. Manually delete an @skill reference
2. Open the tools panel and select a skill
3. See the new @skill reference inserted at the cursor

Fixes #3188

* fix(web): preserve draft edits when inserting skill

---------

Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-30 03:52:35 +00:00
Patrick A
9146dc1c57
fix(web): persist design files view state across navigation (#2303)
* fix(web): persist design-files view state across navigation

pageSize, sortKey, sortDir, and kindFilter reset on every navigation
because DesignFilesPanel remounts via key={projectId}. Persist them to
localStorage under od:design-files:view-state:v1:<projectId> so each
project's view prefs survive tab-switching.

- Read persisted state via lazy useState initializers (SSR-safe try/catch)
- Write back in a single useEffect keyed on all four values
- Scoped per-project so proj-a settings never bleed into proj-b
- Schema-guarded: invalid/missing fields fall through to defaults
- Red spec: apps/web/tests/components/DesignFilesPanel.view-state-persist.test.tsx
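
A rough sketch of the read/write pair, for illustration only. The storage key format comes from the message above; the `KeyValueStore` interface (a stand-in for `localStorage`), the default values, and the field types are assumptions, and the real component wires this through `useState`/`useEffect`.

```typescript
// Minimal stand-in for window.localStorage so the sketch runs anywhere.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type SortDir = "asc" | "desc";
interface ViewState {
  pageSize: number;
  sortKey: string;
  sortDir: SortDir;
  kindFilter: string[];
}

// Default values are assumed for the example.
function defaults(): ViewState {
  return { pageSize: 25, sortKey: "name", sortDir: "asc", kindFilter: [] };
}

function storageKey(projectId: string): string {
  return `od:design-files:view-state:v1:${projectId}`;
}

function readViewState(store: KeyValueStore, projectId: string): ViewState {
  const d = defaults();
  try {
    const raw = store.getItem(storageKey(projectId));
    if (raw === null) return d;
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) return d;
    const p = parsed as Record<string, unknown>;
    const pageSize = p["pageSize"];
    const sortKey = p["sortKey"];
    const sortDir = p["sortDir"];
    const kinds = p["kindFilter"];
    // Schema guard: each invalid or missing field falls through to its default.
    return {
      pageSize: typeof pageSize === "number" ? pageSize : d.pageSize,
      sortKey: typeof sortKey === "string" ? sortKey : d.sortKey,
      sortDir: sortDir === "asc" || sortDir === "desc" ? sortDir : d.sortDir,
      kindFilter: Array.isArray(kinds)
        ? kinds.filter((k): k is string => typeof k === "string")
        : d.kindFilter,
    };
  } catch {
    return d; // unreadable storage or malformed JSON
  }
}

function writeViewState(store: KeyValueStore, projectId: string, state: ViewState): void {
  try {
    store.setItem(storageKey(projectId), JSON.stringify(state));
  } catch {
    // Persistence is best-effort; ignore quota or access errors.
  }
}
```

Keying on the project id is what keeps one project's preferences from leaking into another's.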

* fix(web): address review feedback on view-state persistence

- Add typeof window guard in readViewState for explicit SSR safety
- Consolidate 4 separate localStorage reads into a single useRef read at
  mount time; each lazy useState initializer now reads from savedViewState.current
  instead of re-parsing localStorage independently

* fix(web): harden design-files view-state persistence

- Validate restored kindFilter values against the current ProjectFileKind
  union via isProjectFileKind() so stale stored values from a prior schema
  are dropped silently instead of being cast unchecked.

- Introduce DEFAULT_SORT_KEY/SORT_DIR/PAGE_SIZE constants so the useState
  initialisers and the new validation guard share a single source of truth.

- Add viewStateHasMounted ref to skip the first-render write in the persist
  useEffect. Without this guard every project the user visits accumulates a
  default-value entry in localStorage on mount, growing stale-key garbage
  unboundedly and making future field additions silently inject defaults into
  every existing entry.

- Harden kindFilter test: replace the silent early-return-on-missing-trigger
  with expect(filterTrigger).not.toBeNull() so a render failure surfaces as
  a real test failure rather than a passing no-op.

* test(e2e): design files view state persists across navigation and reload

Adds a Playwright UI smoke test in e2e/ui/ that exercises the three key
guarantees of the view-state persistence fix:

  (a) Tab-away / tab-back: navigating to a file tab and returning remounts
      DesignFilesPanel (conditionally rendered); all four prefs (sortKey,
      sortDir, pageSize, kindFilter) are restored from localStorage.

  (b) Hard reload: localStorage survives page.reload(); prefs are intact on
      the next mount.

  (c) Per-project key isolation: a second project starts with defaults and
      does not inherit values from the first project's localStorage entry.

The test uses OD_PORT=18011 / OD_WEB_PORT=18012 to avoid port conflicts with
the default development ports.

Also fixes a race in DesignFilesPanel: the stale-kind cleanup useEffect was
running against an empty availableKinds set before the async file list arrived
on mount, which cleared a kindFilter correctly restored from localStorage.
Guard added: skip the cleanup when availableKinds is empty.

Red on origin/main (no persistence logic exists there); green on this branch.

* fix(e2e): address code-reviewer feedback on view-state-persist test

- Add data-testid='df-page-size-select' to per-page <select> in
  DesignFilesPanel (W2: decouple test from i18n string 'Show')
- Add StrictMode comment to viewStateHasMounted guard explaining
  the dev-mode double-write behaviour (W1: document the invariant)
- Switch nav-away from dblclick to single-click + Open button,
  matching the pattern used in app-design-files.test.ts (W4)
- Raise timeout from 60s to 90s for cold CI runners (W3)
- Unify seedTextFile/seedPngFile into shared seedFile helper (N3)
- Add home-hero-input assertion in gotoEntryHome (N2)
- Switch waitForPageSizeSelect to use data-testid (W2)

* test(e2e): split design-files persist into nav, reload, and per-project scenarios

* fix(web): tighten isPageSize to discrete option set, add invalid-value regression test

* fix(web): isolate DesignFilesPanel.test.tsx from persisted view-state key
2026-05-30 03:39:27 +00:00
Weston Houghton
65802542a2
fix(chat): surface OpenCode usage-limit/provider failures instead of a bare timeout (#3316)
* fix(chat): surface OpenCode provider failures from its log on a silent stall

OpenCode's headless `run --format json` mode swallows provider failures: a
429 usage-limit is marked retryable and retried silently with nothing on
stdout/stderr, so the chat run only dies via the inactivity watchdog and the
daemon shows a bare "request timed out" with no reason. The real error
(statusCode + "Monthly usage limit reached…") is recorded only in OpenCode's
own session log.

On a failed OpenCode close where stdout/stderr carry no signal, read the
newest OpenCode session log, extract the latest `service=llm` provider error
(scoped to that one line so the embedded request body can't contaminate the
classification), and emit a structured, retryable SSE error (RATE_LIMITED /
AGENT_AUTH_REQUIRED / UPSTREAM_UNAVAILABLE) carrying the provider's message.

Refs #982.

* fix(chat): emit recovered OpenCode failure from the watchdog path, bound to the run

Addresses review on #3316.

Blocking: the recovery previously ran only in the child-close handler, but in
the inactivity-watchdog stall path (the exact case this targets)
failForInactivity sends its error and finish()es the run — which clears
run.clients — before the child closes. So the structured error reached zero
live SSE clients and only surfaced on reload. Recover and send the OpenCode
failure inside failForInactivity, before finish(), on the same pre-teardown
send path the generic stall message already uses. Keep the close-handler
branch for the case where OpenCode exits non-zero on its own (clients still
attached).

Non-blocking: bind the log lookup to the current run via an mtime gate
(since=run.createdAt) so a stale or concurrent session's error can't be
misattributed — skip log files last written before the run started.
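
The mtime gate amounts to a filter before picking the newest file. A minimal sketch, assuming the caller has already listed the log directory into `{ path, mtimeMs }` entries; `LogFile` and `newestLogSince` are hypothetical names, not the PR's.

```typescript
interface LogFile {
  path: string;
  mtimeMs: number; // last-modified time in epoch milliseconds
}

// Returns the most recently written log that was touched at or after
// `sinceMs` (for example the run's creation time), or undefined if none.
function newestLogSince(files: readonly LogFile[], sinceMs: number): LogFile | undefined {
  let best: LogFile | undefined;
  for (const file of files) {
    if (file.mtimeMs < sinceMs) continue; // last written before the run started
    if (best === undefined || file.mtimeMs > best.mtimeMs) best = file;
  }
  return best;
}
```

As the follow-up commit notes, a gate like this cannot tell apart two runs that are active at the same time; it only rules out logs that predate the run.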

* docs(opencode-log): note the concurrent-run limitation of the mtime gate

* fix(chat): skip close-handler failure emit when the watchdog already finished the run

Non-blocking review follow-up on #3316: on the silent-stall path both
failForInactivity and the child-close handler fired for the same run, so the
recovered RATE_LIMITED error was sent twice and the events-log stream was
reopened after finish() had closed it. Guard the close-handler failure emit
with !design.runs.isTerminal(run.status) — the watchdog already sent the
error and finalized the run; finalization below still runs (finish() no-ops
once terminal).
2026-05-30 03:23:58 +00:00
YOMXXX
9b9a18af5b
fix(daemon): validate skillId on POST/PATCH /api/projects against runtime source-of-truth (#3293)
* fix(daemon): validate skillId on POST /api/projects against runtime source of truth

* fix(daemon): validate skillId on PATCH /api/projects/:id, sharing the POST validator

* test(daemon): cover skillId canonicalization, design-template ids, empty-string + null normalization, type rejection
2026-05-30 03:22:16 +00:00
Ramiro
e30a4a2202
fix(platform): search mise shims dir so mise-installed CLIs are detected (#3319)
* fix(platform): search mise shims dir so mise-installed CLIs are detected

- Add ~/.local/share/mise/shims (and MISE_DATA_DIR override + legacy ~/.mise/shims) to wellKnownUserToolchainBins.
- This makes Pi, Kimi, and other mise-managed coding agents visible to the daemon even when launched from GUI contexts with stripped PATH.
- Added tests for default and MISE_DATA_DIR cases.
- Also pinned pnpm@10.33.2 in root mise.toml for better mise ergonomics.

Before/after: more local CLIs now appear in the runtime picker (Kimi, Pi, Antigravity, Kilo, etc.).

Refs: discussion in session around improving detection for common mise users.

* fix(platform): address Copilot review on mise shims logic

- Generalize the shims comment (no hard-coded CLI examples).
- Make per-version Node toolchain scanning respect MISE_DATA_DIR
  (use the same mise root for installs as for shims).
- Avoid duplicate shims entries when MISE_DATA_DIR makes legacy path
  identical to the primary one.

Addresses the three inline comments from copilot-pull-request-reviewer
on PR #3319.

* test(platform): extend MISE_DATA_DIR test to cover installs scanning

Addresses non-blocking review feedback from @nettee on PR #3319.

The previous test only asserted shims behavior under a custom
MISE_DATA_DIR. This extends it to also create fixture trees under
customMise/installs/node/... and customMise/installs/npm-openai-codex/...
and assert that the install paths are discovered while default-root
paths are excluded.

This makes the test robust against regressions in the installs
scanning logic (existingMiseNpmPackageBinDirs + node version dirs).

* fix(platform): only fall back to ~/.mise/shims when no MISE_DATA_DIR is set

Addresses the remaining non-blocking review comment from @nettee on PR #3319.

When an explicit MISE_DATA_DIR is provided, we no longer inject the
legacy ~/.mise/shims path. This prevents stale shims from a previous
mise layout from being re-introduced into detection.
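
The resulting lookup order can be sketched as a pure function. `miseShimDirs` is a hypothetical name and the sketch joins paths with `/` for brevity; the actual change lives in `wellKnownUserToolchainBins` and presumably uses the platform's path helpers.

```typescript
// env is passed in explicitly so the sketch does not depend on process.env.
function miseShimDirs(env: Record<string, string | undefined>, home: string): string[] {
  const explicit = env["MISE_DATA_DIR"];
  if (explicit) {
    // An explicit data dir wins; the legacy path is not injected.
    return [`${explicit.replace(/\/+$/, "")}/shims`];
  }
  return [
    `${home}/.local/share/mise/shims`, // default mise data dir
    `${home}/.mise/shims`, // legacy layout, default case only
  ];
}
```

For example, `miseShimDirs({ MISE_DATA_DIR: "/opt/mise" }, "/home/dev")` yields only `/opt/mise/shims`.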

Also added a regression assertion in the MISE_DATA_DIR test.

* fix(daemon): make claude-stream dedup robust when final assistant wrapper lacks msgId

Prevents duplicated text and thinking output (especially visible during
design system generation with AMR/Vela).

Root cause: the textStreamed guard fell back to  whenever the
final  message arrived without a string `id` (common in some
AMR flows and design system tasks), causing the full content to be
re-emitted even if it had already been delivered via streaming deltas.

Fix: track whether any text or thinking was streamed via deltas for the
current message and use that as a reliable fallback for the final wrapper
instead of only trusting  presence.

* revert: remove dedup from claude-stream (PR #3319 should stay clean)
2026-05-30 03:21:04 +00:00
open-design-bot[bot]
51963cff78
docs(readme): refresh contributors wall (#3271)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-29 14:12:51 +00:00
open-design-bot[bot]
482e318afe
Update docs/assets/github-metrics.svg (#3267)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-29 14:12:36 +00:00
lefarcen
6f532ca35c
fix(web): snapshot the srcDoc bridge frame in Mark mode so deck capture works (#3304)
The Mark tool (#3081/#3277) captured the preview via the *active* iframe. For
URL-load previews — decks especially — the active frame is the bridgeless URL
iframe, while the snapshot bridge lives only in the (mounted but hidden) srcDoc
transport frame. So Send on a deck timed out and showed 'Could not capture the
preview. Try again to avoid sending only ink.'

Snapshot the srcDoc-render-mode frame instead (capture mode already keeps it on
full content, so it carries the bridge), with a short retry while it finishes
swapping to full content. Falls back to the active frame for the non-URL-load
case where they are the same.

Red spec: PreviewDrawOverlay.test 'snapshots the srcDoc bridge iframe, not the
visible URL-load frame' fails on main (targets the URL frame), passes here.
2026-05-29 11:50:37 +00:00
Jane
9f09d1b649
fix(landing-page): wire up mobile nav toggle on the homepage (#3295)
The homepage runs its own inline header enhancer instead of importing
the shared header-enhancer.astro component, and that inline copy only
ported the scroll-headroom and GitHub stars/version logic — it never
included the hamburger toggle handler. As a result the mobile menu
button rendered (and animated to an X via CSS) but clicking it did
nothing on / and /<locale>/, while sub-pages that do import the shared
enhancer worked fine.

Port the same toggle handler into the homepage inline enhancer: click
flips .is-open on header.nav (which CSS expands into the dropdown panel
below 1080px), and outside-click, Escape, and any in-menu link close it,
keeping aria-expanded in sync.

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-29 10:19:37 +00:00
Guillermo Alcántara
2ea2c91a92
fix(pack): add missing download and host packages to Linux INTERNAL_PACKAGES (#2837)
* fix(pack): add missing download and host packages to Linux INTERNAL_PACKAGES

The native (non-containerized) Linux AppImage build fails with npm 404
errors because @open-design/download and @open-design/host — runtime
dependencies of @open-design/desktop and @open-design/web — were not
included in the INTERNAL_PACKAGES list. Without tarballs for these two
packages, npm install in the assembled app directory tries to resolve
them from the public registry where they don't exist.

Add both packages to INTERNAL_PACKAGES and their build steps to
buildWorkspaceArtifacts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(pack): apply same download/host fix to mac and win lanes, add regression test

1. Extend the INTERNAL_PACKAGES fix to mac/constants.ts, mac/workspace.ts, win/constants.ts, and win/app.ts so all three pack lanes produce tarballs for @open-design/download and @open-design/host.

2. Add internal-packages-coverage.test.ts that derives required workspace runtime deps from apps/desktop and apps/web package.json files and asserts every pack lane's INTERNAL_PACKAGES includes them. This prevents the same drift from recurring when a new workspace dependency is added.

3. Update win-app.test.ts and workspace-build.test.ts mock directory lists to include the two new packages.
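
The coverage check in item 2 could be expressed along these lines. This is a sketch under assumptions: `missingInternalPackages` is a hypothetical helper, and identifying workspace packages by the `@open-design/` scope is inferred from the package names in this message rather than taken from the test itself.

```typescript
interface PackageJson {
  dependencies?: Record<string, string>;
}

// Returns workspace runtime deps of the given app manifests that are
// absent from a pack lane's INTERNAL_PACKAGES list.
function missingInternalPackages(
  appManifests: readonly PackageJson[],
  internalPackages: readonly string[],
  scope: string = "@open-design/", // assumed workspace scope
): string[] {
  const required = new Set<string>();
  for (const manifest of appManifests) {
    for (const name of Object.keys(manifest.dependencies ?? {})) {
      if (name.startsWith(scope)) required.add(name);
    }
  }
  const available = new Set(internalPackages);
  return Array.from(required)
    .filter((name) => !available.has(name))
    .sort();
}
```

A test per lane would then assert that the returned array is empty, so adding a new workspace dependency without updating a lane fails immediately.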

* fix(pack): include runtime packages in workspace build cache

* fix(pack): install platform with desktop prebundle packages

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 09:44:03 +00:00
Bassiiiii
0c4b7e50be
fix(web/router): defer popstate dispatch to microtask (#2490)
* fix(web/router): defer popstate dispatch to microtask

navigate() previously dispatched a synchronous popstate event after
mutating window.history, which caused React 18 to emit:

  Cannot update a component (Router) while rendering a different
  component (App). To locate the bad setState() call inside App,
  follow the stack trace as described in
  https://react.dev/link/setstate-in-render

This happens whenever a caller invokes navigate() from inside a
useState updater (e.g. App.tsx:479 routing first-run users through
the onboarding panel from inside the setConfig() update). The
synchronous popstate dispatch reaches useRoute() subscribers which
then call setRoute() while the parent component is still rendering.

Defer the popstate dispatch to a microtask. The window.history call
itself stays synchronous so the URL bar updates immediately; only
subscriber updates are pushed past the current render commit, which
removes the warning without changing observable behaviour for any
existing caller.
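
The timing described above can be sketched in isolation. This is a minimal illustration, not the router's actual code: `currentPath`, `subscribe`, and `seen` are hypothetical stand-ins for `window.history` and the `useRoute()` subscribers, and `Promise.resolve().then(...)` is only one way to schedule a microtask (the commit may use a different API).

```typescript
// Hypothetical stand-ins for window.history and the popstate subscribers.
type RouteListener = (path: string) => void;

let currentPath = "/";
const listeners: RouteListener[] = [];

function subscribe(listener: RouteListener): void {
  listeners.push(listener);
}

function navigate(path: string): void {
  // Synchronous part: the "URL" changes immediately.
  currentPath = path;
  // Deferred part: subscribers run in a microtask, after the current
  // synchronous work (for example a render pass) has finished.
  Promise.resolve().then(() => {
    for (const listener of listeners) listener(path);
  });
}

// Record every path a subscriber is told about.
const seen: string[] = [];
subscribe((path) => {
  seen.push(path);
});
```

Right after `navigate("/onboarding")` returns, `currentPath` already holds the new path while `seen` is still empty; the subscriber only runs once the calling code yields.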

* fix(web/router): cover deferred navigation timing

---------

Co-authored-by: Visionboost <contact@visionboost.fr>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 09:37:55 +00:00
koki
e71938767e
feat(community): add Showcase + Contribute + moderators, restructure nav and footer (#3291)
Adds two new entry points and reworks the community page chrome to
match the wider landing-page direction (PRs #3222 and #3230).

Showcase / Plugin Everything (above Ambassadors)
  Pitches Open Design as 'studio and gallery' in one address: anything
  shipped through the studio (content, products, templates, Skills,
  workflows) can return as a plugin, and the strongest pieces are
  carried out to the registry, to X, to Discord's #showcase channel,
  to the newsletter, and to the video reels. Right column holds a
  zero-code Contribute card with a curl installer, a copy-to-clipboard
  button, and a three-step flow for the od-contribute Skill.

Hero CTA row
  Three buttons, in a single row: Stage your masterpieces (Showcase),
  Become an ambassador (Ambassadors), Contributors hall of fame
  (Maintainers).

Top nav
  Pulls the breadcrumb out of the brand mark, surfaces Contributors /
  Ambassadors / Showcase as anchors, and adds GitHub + X icon buttons
  next to the Join Discord pill (mirrors PR #3230).

Footer
  Restructured into a columnar layout with brand summary plus Products,
  Plugins, and Community columns; copyright moves to a bottom rule.

Ambassadors
  Renaissance-voice three-column program (Vocation / Patronage /
  Covenant) with an Apply on Discord CTA to the ambassador channel.

Discord
  Card spans wider (max-width 1440px), copy reframed as 'the front
  line of the agent-design era', two moderator profiles on the right
  (Koki from the founding team, Victor as Discord steward), channel
  list and CTAs on the left.

Recent signal
  Kicker and headline framed as this week's leaderboard; backed by a
  hand-curated RANKING_SNAPSHOT. A real refresh pipeline remains a
  follow-up; data is hand-updated until then.

Other notes
  Punctuation pass: replaced most em-dashes in prose with colons,
  periods, commas, semicolons, parentheses; em-dashes only remain in
  data placeholders, page title, and HTML comments. Logo size bumped
  to 32px and now uses an alt of 'Open Design'.

Co-authored-by: koki yanlai xu <koki@kokideMacBook-Air.local>
2026-05-29 09:20:04 +00:00
Md Mushfiqur Rahim
8ec162bb26
fix: pet hover card gets cut off at screen edges (#2860)
* fix: pet hover card gets cut off at screen edges

* fix: address review feedback - viewport clamping + unadopted pet wake

- Add window.innerHeight check to prevent bottom-edge clipping
- Increase menuH estimate for safer positioning
- Open pet settings instead of no-op Wake for unadopted pets

* fix: address review feedback on pet menu positioning and wake action

- Add viewport height check (viewH) to prevent bottom-edge clipping
- Increase menuH estimate for safer positioning
- Open pet settings instead of no-op Wake for unadopted pets
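
A rough sketch of the viewport clamping mentioned above, assuming the menu is positioned by its top-left corner. `clampMenuPosition`, `Box`, `Corner`, and the 8px `margin` are hypothetical names and values chosen for illustration; the real component works from its own `menuH` / `viewH` estimates.

```typescript
interface Box { width: number; height: number }
interface Corner { x: number; y: number }

// Clamp a menu's top-left corner so the whole menu stays inside the viewport.
// `margin` is a hypothetical padding from each screen edge.
function clampMenuPosition(
  anchor: Corner,
  menu: Box,
  viewport: Box,
  margin: number = 8,
): Corner {
  // Largest top-left coordinates that still leave room for the full menu.
  const maxX = Math.max(margin, viewport.width - menu.width - margin);
  const maxY = Math.max(margin, viewport.height - menu.height - margin);
  return {
    x: Math.min(Math.max(anchor.x, margin), maxX),
    y: Math.min(Math.max(anchor.y, margin), maxY),
  };
}
```

Overestimating the menu height, as the commit does, only moves the card further from the bottom edge, so it errs on the safe side.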
2026-05-29 09:05:51 +00:00
byte92
cdf34897ba
add comment composer keyboard submit shortcut (#2941)
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:46:15 +00:00
Sriram Sivakumar
0bd07b2a3d
fix(daemon): grok-build — pass prompt inline as -p value, drop stdin (#2259)
* fix(daemon): grok-build runtime — pass prompt inline as -p value, drop stdin

Grok Build CLI 0.1.212 enforces `-p, --single <PROMPT>` as a value-requiring
flag — invoking with bare `-p` and piping the prompt to stdin now fails with:

  error: a value is required for '--single <PROMPT>' but none was supplied

The previous runtime def used `promptViaStdin: true` + `buildArgs` returning
`['-p']`, which only worked against earlier grok builds that read the prompt
from stdin when `-p` had no inline value.

This change inlines the prompt as the `-p` argument value and sets
`promptViaStdin` to `false`. Linux `MAX_ARG_STRLEN` (128 KB per argument)
is enough headroom for typical Open Design prompts; if we ever hit `E2BIG`
on a very large brief, a follow-up could switch to `--prompt-file <tempfile>`.

Verified against grok 0.1.212 (b7b8204a4) — single-turn invocations now
return clean text replies instead of exit 2.

* fix(daemon): declare grok-build argv prompt budget + regression coverage

@mrcfps' review on #2259 flagged a regression: moving the Grok Build
adapter from the (no-longer-working) stdin path to argv means oversized
composed prompts would fail with a raw spawn ENAMETOOLONG / E2BIG
instead of the actionable AGENT_PROMPT_TOO_LARGE error we already emit
for DeepSeek. Fixed by mirroring the DeepSeek argv-budget shape:

- grok-build.ts: `maxPromptArgBytes: 30_000` (same headroom as DeepSeek,
  ~2.7 KB under the Windows CreateProcess 32_767-char cap) so
  `checkPromptArgvBudget` pre-flights composed prompts (system + history
  + skills + design-system content + user message) before spawn.
- prompt-budget.ts: Grok-Build-specific message — names the `-p /
  --single` flag, the xAI CLI 0.1.212+ behavior change, and points the
  user at stdin-capable adapters (claude / codex / hermes) when they
  need to ship large local context.
- Tests: 3 new vitest cases in prompt-budget.test.ts — pin the budget
  field, exercise the strict-overrun + at-limit + CJK byte-count guards
  exactly like the DeepSeek regression set, and assert the Grok-named
  diagnostic copy. New `grokBuild` + `grokBuildMaxPromptArgBytes`
  helpers exported alongside the existing `deepseek*` ones.
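
The byte-budget idea can be sketched as follows. This is a simplified stand-in, not the code in prompt-budget.ts: `checkPromptBudget`, `utf8ByteLength`, and `BudgetResult` are hypothetical names, and the sketch assumes a prompt exactly at the limit passes while only a strict overrun is rejected. The point it illustrates is that the budget is measured in UTF-8 bytes, so CJK text exhausts it roughly three times faster than ASCII.

```typescript
// Count UTF-8 bytes by walking UTF-16 code units (no Buffer or TextEncoder).
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const next = text.charCodeAt(i + 1); // NaN past the end of the string
    if (unit <= 0x7f) {
      bytes += 1;
    } else if (unit <= 0x7ff) {
      bytes += 2;
    } else if (unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      bytes += 4; // a surrogate pair encodes one astral code point
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Budget value taken from the commit message.
const GROK_BUILD_MAX_PROMPT_ARG_BYTES = 30_000;

interface BudgetResult {
  ok: boolean;
  bytes: number;
  code?: "AGENT_PROMPT_TOO_LARGE";
}

// Assumption: at-limit passes, strict overrun fails.
function checkPromptBudget(prompt: string, maxBytes: number): BudgetResult {
  const bytes = utf8ByteLength(prompt);
  return bytes > maxBytes
    ? { ok: false, bytes, code: "AGENT_PROMPT_TOO_LARGE" }
    : { ok: true, bytes };
}
```

Running the check before spawn is what turns an opaque OS-level failure into an error the UI can explain.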

All 23 prompt-budget tests pass locally (`pnpm exec vitest run
tests/runtimes/prompt-budget.test.ts`).

---------

Co-authored-by: Sriram Sivakumar <sriram155@gmail.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:45:57 +00:00
saifullakhan
73b2dc853f
Fix project empty state create action (#3082)
Co-authored-by: saifulla-khan <saifulla-khan@users.noreply.github.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:30:43 +00:00
maybeyourking
881571dea7
fix(media): route custom-image edits through images API (#3087)
* fix(media): route custom-image edits through images API

* fix(media): normalize custom-image endpoint suffixes

---------

Co-authored-by: Artist Ning <dingkuake@yeah.net>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:09:44 +00:00
Aria Shishegaran
fe58db2ba1
fix(web): target comment picker elements precisely (#3263)
Resolve Comment picker hit testing against meaningful visible DOM leaves before falling back to annotated ancestors, while preserving Inspect mode's annotation-first selector behavior.

Filter generated React root annotations from Comment targets, keep real element bounds separate from hoverPoint, and avoid rendering the comments drawer inline when a configured dock portal is not mounted.
2026-05-29 07:47:08 +00:00
Mason
1006efa2f6
Improve onboarding AMR runtime card (#3276)
* Improve onboarding AMR runtime card

* Fix onboarding AMR test expectations
2026-05-29 07:45:23 +00:00
freshtemp-labs
593bf2f03c
fix(composer): ellipsis overflow for referenced filenames (#3269)
Inline @-mentioned filenames in the composer can be very long,
causing line wrapping and visual crowding in the input area.

- Switch display from inline to inline-block for max-width
- Cap width at min(240px, 25vw)
- Apply overflow: hidden + text-overflow: ellipsis + white-space: nowrap
- Remove box-decoration-break (unused since content won't wrap)
- Tweak vertical-align for consistent inline-block alignment

Closes #3261

Co-authored-by: freshtemp-labs <freshtemp-labs@users.noreply.github.com>
2026-05-29 07:41:16 +00:00
ziyan2006
071db7ca1b
[codex] Stabilize HTML deck navigation state (#3142)
* fix: stabilize html deck navigation state

* fix: avoid misclassifying transform decks as scroll decks

* fix: detect default root-scroller decks

---------

Co-authored-by: Nongzi <3051966228@qq.com>
2026-05-29 07:41:10 +00:00
elihahah666
be09fe92da
fix: keep settings/handoff/avatar buttons fixed to the right in project header (#3279)
Move the three buttons (settings, handoff, avatar) from fileActionsBefore
to the actions slot so they always stay pinned to the right edge of the
header, regardless of how many extra controls (Share, Present, etc.) are
injected via portal during HTML preview.

Co-authored-by: qiongyu1999 <2694684348@qq.com>
Co-authored-by: Claude Opus 4 <noreply@anthropic.com>
2026-05-29 07:33:57 +00:00
youcef zr
d6d42c3600
fix(pack): bundle download and host packages in Linux AppImage assembly (#2845)
The Linux AppImage path assembles INTERNAL_PACKAGES as `file:` tarballs
and runs `npm install --omit=dev` in an isolated app directory. `pnpm
pack` rewrites each tarball's `workspace:*` refs to a concrete version,
so any runtime @open-design/* dependency missing from INTERNAL_PACKAGES
is resolved from the public npm registry and 404s.

Linux ships webOutputMode "server" and tarball-installs every
INTERNAL_PACKAGES entry, including @open-design/desktop and
@open-design/web. @open-design/host (dep of web + desktop, added in
#2246) and @open-design/download (dep of desktop, added in #2677) landed
after the Linux package list was written and were never added to it, so
`pnpm exec tools-pack linux build --to appimage` fails with:

  npm error 404 Not Found - GET .../@open-design%2fdownload

mac/win default to "standalone", where desktop/web/packaged/daemon are
prebundled with esbuild and excluded from the tarball install
(shouldInstallInternalPackageFor{Mac,Win}Prebundle). The packages they
do install have no download/host dependency, so those lanes correctly
omit them and need no change — this fix stays scoped to linux.ts and
touches no mac/win or workspace-build code.

Add both packages to the Linux INTERNAL_PACKAGES and build them in
buildWorkspaceArtifacts (download depends on platform). Add a cross-lane
regression test that, for each lane, derives the set it actually installs
(honoring the standalone prebundle exclusion) and asserts that set is
closed under its runtime @open-design/* dependencies. The test is red on
the linux lane without this fix and green with it, while mac/win pass
either way — encoding why only Linux needs these packages.
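
The "closed under runtime dependencies" assertion can be sketched like this. It is an illustration only: `missingInternalDeps` and `exampleDeps` are hypothetical, and the graph below is a hand-written approximation of the relationships described in this message, not data read from the real package.json files.

```typescript
type DepGraph = Record<string, string[]>;

// Return every @open-design/* dependency of an installed package that is
// not itself installed. An empty result means the set is closed.
function missingInternalDeps(installed: string[], deps: DepGraph): string[] {
  const have = new Set(installed);
  const missing = new Set<string>();
  for (const pkg of installed) {
    for (const dep of deps[pkg] ?? []) {
      if (dep.startsWith("@open-design/") && !have.has(dep)) missing.add(dep);
    }
  }
  return Array.from(missing).sort();
}

// Illustrative graph (assumption): host is used by web and desktop,
// download by desktop, and download depends on platform.
const exampleDeps: DepGraph = {
  "@open-design/desktop": ["@open-design/host", "@open-design/download"],
  "@open-design/web": ["@open-design/host"],
  "@open-design/download": ["@open-design/platform"],
  "@open-design/host": [],
  "@open-design/platform": [],
};
```

Checking only direct dependencies is enough here: if every installed package has all of its direct internal dependencies installed, the whole set is transitively closed.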
2026-05-29 07:25:03 +00:00
lefarcen
da19ff3ca0
feat(mocks): replay-based mock CLIs for 14 of OD's supported agents (opencode/codex/claude/gemini/cursor-agent/deepseek/qwen/grok + ACP family devin/hermes/kilo/kimi/kiro/vibe) (#3241)
* feat(mocks): replay-based mock CLIs for opencode/claude/codex/deepseek/qwen/grok

Drops in a `mocks/` top-level dir that pretends to be the real agent
CLIs by streaming pre-recorded sessions in each CLI's native stdout
protocol. Zero LLM tokens.

## Use cases

- **E2E tests** in `apps/daemon/tests/` — exercise the full chat-server
  pipeline against a known trace, assert UI events / artifacts.
- **Self-validation during dev** — iterate on `claude-stream.ts` /
  `json-event-stream.ts` parser changes without burning provider budget.
- **Regression harness** — replay the same trace before and after a
  charter / parser change; diff the daemon events the UI surfaces.
- **Demo / onboarding** — show what a 17-tool claude editing session
  looks like end-to-end, offline.

## How

- 6 bash wrappers (`mocks/bin/`) shadow the real CLIs when PATH-overlaid.
- `mocks/mock-agent.mjs` reads `mocks/recordings/<trace>.jsonl`, picks
  one via env var (`SYNCLO_EXPLORE_MOCK_TRACE` / `_POOL` /
  `_BY_PROMPT_HASH`), streams the trace in the requested format.
- Each format renderer matches the EXACT JSON shape the OD daemon
  parser expects, verified line-by-line against
  `apps/daemon/src/{json-event-stream,claude-stream}.ts`:

  | CLI                       | streamFormat              | parser source                              |
  | ------------------------- | ------------------------- | ------------------------------------------ |
  | `opencode`                | `json-event-stream`       | `handleOpenCodeEvent`                      |
  | `codex`                   | `json-event-stream`       | `handleCodexEvent`                         |
  | `claude`                  | `claude-stream-json`      | `createClaudeStreamHandler`                |
  | `deepseek` `qwen` `grok`  | `plain`                   | `server.ts` (raw stdout)                   |

## Quick start

```bash
export PATH="$PWD/mocks/bin:$PATH"
export SYNCLO_EXPLORE_MOCK_TRACE=04097377   # 8-char prefix OK
export SYNCLO_EXPLORE_MOCK_NO_DELAY=1

echo "any prompt" | opencode run
echo "any prompt" | claude -p --output-format=stream-json
echo "any prompt" | codex exec
```

The mock binary announces the picked trace id on stderr:
`[mock-opencode] picked 04097377… via fixed`.

Recording selection (env, in priority order):
- `SYNCLO_EXPLORE_MOCK_TRACE=<id>` — fixed (prefix OK)
- `SYNCLO_EXPLORE_MOCK_BY_PROMPT_HASH=1` + stdin prompt — `sha256(prompt) % N`
- `SYNCLO_EXPLORE_MOCK_POOL=<tag>` — random within `agent:claude` /
  `skill:agent-browser` / `outcome:failed` / etc.
- (default) uniform random
- `SYNCLO_EXPLORE_MOCK_SEED=<str>` — reproducible "random"
- `SYNCLO_EXPLORE_MOCK_NO_DELAY=1` — skip inter-event waits
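
The priority order above can be sketched as a pure function. This is not the picker that ships in `mocks/`: `pickRecording`, `PickOptions`, and the `via` labels other than `fixed` are hypothetical, options are passed as an object instead of being read from env, and a 32-bit FNV-1a hash is swapped in for `sha256(prompt) % N` to keep the sketch dependency-free. It also assumes an unmatched trace or an empty pool falls through to the next rule.

```typescript
interface Recording { id: string; tags: string[] }

interface PickOptions {
  trace?: string;          // fixed id, prefix allowed
  byPromptHash?: boolean;  // hash the prompt to choose a stable index
  prompt?: string;
  pool?: string;           // tag such as "agent:claude"
  random?: () => number;   // injectable so tests are reproducible
}

// Stand-in 32-bit FNV-1a hash (the real selection uses sha256).
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function pickRecording(all: Recording[], opts: PickOptions): { rec: Recording; via: string } {
  const rand = opts.random ?? Math.random;
  const at = (items: Recording[], index: number): Recording => {
    const item = items[index];
    if (item === undefined) throw new Error("no recordings available");
    return item;
  };
  // 1. Fixed trace (prefix match).
  const trace = opts.trace;
  if (trace !== undefined && trace !== "") {
    const hit = all.find((r) => r.id.startsWith(trace));
    if (hit !== undefined) return { rec: hit, via: "fixed" };
  }
  // 2. Stable choice derived from the prompt.
  if (opts.byPromptHash === true && opts.prompt !== undefined && all.length > 0) {
    return { rec: at(all, fnv1a(opts.prompt) % all.length), via: "prompt-hash" };
  }
  // 3. Random within a tagged pool.
  const pool = opts.pool;
  if (pool !== undefined) {
    const tagged = all.filter((r) => r.tags.indexOf(pool) >= 0);
    if (tagged.length > 0) {
      return { rec: at(tagged, Math.floor(rand() * tagged.length)), via: "pool" };
    }
  }
  // 4. Uniform random over everything.
  return { rec: at(all, Math.floor(rand() * all.length)), via: "random" };
}
```

Injecting `random` plays the same role as the seed variable: a test can pin the "random" branches without touching global state.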

## Dataset

179 anonymized Langfuse traces from this project's own production
telemetry:

- 9 agents: claude 57 · opencode 41 · codex 38 · gemini 25 ·
  cursor-agent 11 · qwen 2 · copilot 2 · deepseek 2 · antigravity 1
- outcomes: succeeded 144 · failed 35
- skills: default 71 · ad-creative 50 · algorithmic-art 30 ·
  agent-browser 22 · video-hyperframes 2 · plus magazine-web-ppt /
  brainstorming / data-report / penpot-flutter-design-source 1 each
- 124 multi-turn (sessions with ≥2 turns)
- 18 produce `<artifact>` output
- ~4.5 MB on disk total

Anonymization: `/Users/<name>/` → `${HOME}/`,
`C:\Users\<name>\` → `%USERPROFILE%\`, project UUIDs →
stable `proj-001`, `proj-002`, …. Tool input/output payloads
preserved verbatim (templated UI, no cell-level PII).
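
A minimal sketch of these substitutions, under stated assumptions: `anonymize` is a hypothetical helper, the regular expressions are illustrative rather than the exporter's real patterns, and every UUID in the text is treated as a project UUID (the real tool presumably knows which ids are projects).

```typescript
// Replace home directories and UUIDs with stable placeholders.
// `projectIds` carries the UUID -> alias mapping across calls so the same
// project always maps to the same proj-NNN alias.
function anonymize(text: string, projectIds: Map<string, string>): string {
  const uuid = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;
  return text
    // macOS-style home directory.
    .replace(/\/Users\/[^\/\s]+\//g, () => "${HOME}/")
    // Windows-style home directory.
    .replace(/C:\\Users\\[^\\\s]+\\/g, () => "%USERPROFILE%\\")
    // UUIDs become proj-001, proj-002, ... in order of first appearance.
    .replace(uuid, (id) => {
      const key = id.toLowerCase();
      const known = projectIds.get(key);
      if (known !== undefined) return known;
      const n = String(projectIds.size + 1);
      const alias = "proj-" + (n.length >= 3 ? n : ("00" + n).slice(-3));
      projectIds.set(key, alias);
      return alias;
    });
}
```

Function replacers are used so that `$` in `${HOME}` is never interpreted as a replacement pattern.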

## Smoke test

`bash mocks/scripts/smoke-test.sh` — 6 checks across all 6 agents.
All pass on this branch (verified locally):

```
  ✓ opencode first event = step_start
  ✓ codex first event = thread.started
  ✓ claude first event = system
  ✓ deepseek emitted plain text (144 chars on first line)
  ✓ qwen emitted plain text (144 chars on first line)
  ✓ grok emitted plain text (144 chars on first line)
All mock CLIs working. 
```

## Adding more recordings

The exporter that produced this set lives in
[nexu-io/agent-pr-explore](https://github.com/nexu-io/agent-pr-explore)
(see `cli/src/local/orchestrator/langfuse-import.ts` + the `local
langfuse-import` CLI command). Operators with the Langfuse keys can pull
more by tag / outcome / artifact / multi-turn filter, then run
`local recordings anonymize --out-dir ~/Documents/open-design/mocks/recordings`.
`mocks/README.md` has the full instructions.

## Out of scope (follow-ups)

- **ACP agents** (`devin`, `hermes`, `kilo`, `kimi`, `kiro`, `vibe`) need
  a JSON-RPC server on stdio rather than a one-shot stream — separate
  `format-acp.mjs` module not yet written.
- **Per-agent json-event-stream variants** (`cursor-agent`, `gemini`,
  `qoder`, `copilot`, `pi`) currently fall back to the `plain` renderer;
  their parsers are in `apps/daemon/src/json-event-stream.ts` and follow
  the same template as `format-codex.mjs`.

## AGENTS.md updates

- Added `mocks/` to the top-level content directories listing
- Added a Validation strategy bullet pointing here for agent-stream /
  parser changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks): add opencode-cli/kiro-cli/vibe-acp bin aliases and unref ACP timeout

- Add mocks/bin/opencode-cli, kiro-cli, vibe-acp wrappers for the primary
  RuntimeAgentDef bin names OD resolves before any fallback. Without these,
  a PATH-overlaid OD daemon run bypasses the mock entirely (opencode-cli,
  kiro-cli) or cannot find the mock at all (vibe-acp, which has no fallback).
- Include opencode-cli, kiro-cli, vibe-acp in the smoke-test ACP/JSON loop
  so coverage is verified end-to-end.
- Call .unref() on the 30s safety timeout in format-acp.mjs so a completed
  ACP session exits promptly instead of waiting the full 30 seconds.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* feat(mocks): add vela (AMR) — login / models / ACP with strict set_model gate

Extends mocks/ to cover OD's own AMR runtime. `vela` is the bin name
`apps/daemon/src/runtimes/defs/amr.ts` specifies (`bin: 'vela'`,
`streamFormat: 'acp-json-rpc'`). It's richer than the generic ACP
agents — covers full login + models + chat-session lifecycle.

### What vela does (mirrored from apps/daemon/tests/fixtures/fake-vela.mjs)

1. `vela login` — writes ~/.amr/config.json with a fake profile (controlKey,
   runtimeKey, user{email,name,plan}, profile-specific apiUrl/linkUrl).
   The on-disk projection is what OD's daemon login route + AmrLoginPill
   poller read; production goes through device-auth, the mock skips
   straight to the file write.

2. `vela models` — prints the production-shaped public model catalog as
   newline-separated `public_model_*    vela` lines. Override via
   FAKE_VELA_MODELS env.

3. `vela agent run --runtime opencode` — ACP JSON-RPC server with three
   vela-specific protocol extensions:

   a. `initialize` response carries `agentCapabilities`
      (`promptCapabilities.embeddedContext`) + `models`
      (`currentModelId` + `availableModels`).
   b. `session/new` response carries the same `models` block.
   c. **Strict set_model gate**: `session/prompt` is rejected with
      JSON-RPC -32602 ("session/set_model must be called before
      session/prompt") UNLESS `session/set_model` (or
      `session/set_config_option`) has been called for the current
      sessionId. Mirrors real vela 0.0.1 contract; catches regressions
      in `attachAcpSession` that silently skip set_model.
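
The strict gate in (c) can be sketched as a small state machine keyed by sessionId. This is an illustration, not `format-vela.mjs`: `createSetModelGate`, `RpcRequest`, and `RpcResponse` are hypothetical, every other method simply returns an empty result, and the `requireSetModel` flag stands in for the legacy switch that disables the gate.

```typescript
interface RpcRequest {
  id: number;
  method: string;
  params?: { sessionId?: string };
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

function createSetModelGate(requireSetModel: boolean = true): (req: RpcRequest) => RpcResponse {
  // Sessions that have already selected a model.
  const modelSet = new Set<string>();
  return (req) => {
    const sessionId = req.params?.sessionId ?? "";
    if (req.method === "session/set_model" || req.method === "session/set_config_option") {
      modelSet.add(sessionId);
      return { jsonrpc: "2.0", id: req.id, result: {} };
    }
    if (req.method === "session/prompt" && requireSetModel && !modelSet.has(sessionId)) {
      return {
        jsonrpc: "2.0",
        id: req.id,
        error: {
          code: -32602,
          message: "session/set_model must be called before session/prompt",
        },
      };
    }
    return { jsonrpc: "2.0", id: req.id, result: {} };
  };
}
```

Because the gate is tracked per sessionId, selecting a model in one session does not unlock prompts in another.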

### Error injection envs (in sync with fake-vela.mjs)

  FAKE_VELA_SESSION_ID            - sessionId returned by session/new
  FAKE_VELA_TEXT                  - override assistant text
  FAKE_VELA_THOUGHT               - optional thought_chunk before text
  FAKE_VELA_SESSION_NEW_ERROR     - fail session/new
  FAKE_VELA_SET_MODEL_ERROR       - fail session/set_model
  FAKE_VELA_PROMPT_ERROR          - fail session/prompt
  FAKE_VELA_REQUIRE_SET_MODEL='0' - disable the strict gate (legacy)
  FAKE_VELA_LOGIN_USER_EMAIL      - email written into config profile
  FAKE_VELA_LOGIN_USER_PLAN       - plan written into config profile
  FAKE_VELA_LOGIN_DELAY_MS        - sleep before write (test in-flight)
  FAKE_VELA_LOGIN_FAIL            - print + exit 1
  FAKE_VELA_MODELS                - override models stdout
  VELA_PROFILE                    - profile slot (prod | test | local)

### Components

`mocks/lib/format-vela.mjs` (~205 LOC)
  - Full ACP server with vela protocol extensions
  - Strict set_model gate
  - Error injection plumbing

`mocks/lib/vela-subcommands.mjs` (~90 LOC)
  - runVelaLogin() — writes ~/.amr/config.json
  - runVelaModels() — prints catalog

`mocks/bin/vela` — dispatcher wrapper. Forwards `vela <subcmd>` to
mock-agent.mjs which routes to login/models or falls through to ACP.

`mocks/mock-agent.mjs` — parseArgs now collects positionals so the vela
dispatcher can read subcommand from there; switch case added for vela.

`mocks/scripts/smoke-test.sh` — +4 assertions:
  vela models prints ≥10 catalog lines
  vela login writes ~/.amr/config.json with the requested email
  vela agent run ACP roundtrip (initialize+models+set_model+stream+result)
  vela strict set_model gate rejects prompt without prior set_model

### Verified locally

  ✓ vela models printed 15 catalog lines
  ✓ vela login wrote ~/.amr/config.json with profile.prod.user.email
  ✓ vela agent run ACP roundtrip (initialize+models, set_model accepted, prompt streamed)
  ✓ vela strict set_model gate rejects session/prompt without prior set_model

All 21 smoke checks pass (up from 17 with previous P3 ACP commit).

### AGENTS.md + README updates

  AGENTS.md — mention `vela (AMR — vela CLI)` alongside ACP agents in
  the directory listing entry.
  mocks/README.md — protocol table row + dedicated vela section with
  subcommand contract, strict gate explanation, env-injection cheat
  sheet. Mock-tree listing updated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks): honor REPORT_FILE env when --report-file flag not given

Some harnesses spawn the mock without translating their report-path
contract to the mock's CLI flag. A notable case is the
nexu-io/agent-pr-explore orchestrator, which passes REPORT_FILE as env
per the existing opencode/claude/codex agent launchers. Those harnesses
never got a report file, so their "agent exit 0 but produced no report"
check always fired and marked mock runs as failures even though the
stdout stream was complete.

Fix: in mock-agent.mjs parseArgs, fall through to process.env.REPORT_FILE
when --report-file wasn't provided on argv. Each format renderer already
accepts opts.reportFile and writes the recording's final assistant text
to it (`format-*.mjs` already had this — only the wiring was missing).
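
The fallback amounts to "flag first, env second". A minimal sketch, assuming the flag takes its value as the next argv entry (the real parseArgs may also accept other spellings); `resolveReportFile` is a hypothetical name:

```typescript
// Resolve the report path: an explicit --report-file wins, REPORT_FILE is
// only consulted when the flag is absent.
function resolveReportFile(
  argv: string[],
  env: Record<string, string | undefined>,
): string | undefined {
  const flagIndex = argv.indexOf("--report-file");
  const fromFlag = flagIndex >= 0 ? argv[flagIndex + 1] : undefined;
  return fromFlag ?? env["REPORT_FILE"];
}
```

Passing `argv` and `env` in explicitly keeps the precedence rule testable without mutating `process.env`.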

Verified: synclo-explore run with `mock=true, mock_trace=04097377`
against the opencode wrapper now produces a plan.md with the recording's
17-tool claude editing session report. ~1.5s per run vs ~70s real opencode.

* mocks: move recordings to Cloudflare R2; PR→main→Action upload path

The 179-recording corpus (~4.5 MB raw, ~280 KB after compression) has
been moved off git into Cloudflare R2 at the bucket open-design-mocks
under recordings/v1/. The repo now ships:

- mocks/manifest.json — the canonical catalog (renamed from
  recordings/index.json) with sha256 + storage hints; consumers
  fetch this to discover what exists, then pull individual jsonl
  files on demand
- mocks/scripts/fetch-recordings.sh — parallel, sha256-verified,
  idempotent puller for the public r2.dev URL
- mocks/scripts/add-recording.sh — local maintainer helper that
  validates a new .jsonl and copies it into recordings-staging/
  (no R2 calls; no credentials needed)
- mocks/scripts/upload-to-r2.mjs — called only by the CI workflow
- mocks/scripts/lib/manifest-utils.mjs — shared sha256/meta/
  rebuild-histograms logic, used by both add-recording (preview)
  and upload-to-r2 (actual write) so the entry shape never drifts
- .github/workflows/sync-mocks-to-r2.yml — fires on push to main
  when mocks/recordings-staging/ changes; uploads to R2, updates
  manifest, commits cleanup back; serialized via concurrency group

Trust model: R2 write credentials (CLOUDFLARE_API_TOKEN,
CLOUDFLARE_ACCOUNT_ID) are repo secrets; nobody can push from a
laptop. Read stays public via the r2.dev URL.

Why not pnpm install integration: contributors who do not touch
agent code do not pay the fetch cost. Fetch happens on first
smoke-test run (auto-fallback) or when a mock spawn needs data.

Repo size: -4.55 MB net (delete 179 jsonl, +280 KB manifest +
scripts). Smoke test (21 checks) still green against the fetched
corpus.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: scope R2 write token to a dedicated secret name

Use CLOUDFLARE_R2_MOCKS_TOKEN (instead of reusing the shared
CLOUDFLARE_API_TOKEN that landing-page-*.yml uses for Pages deploys)
so the R2 write capability can be scoped to just the
open-design-mocks bucket without bleeding extra capability into the
Pages workflows.

Also hardcode the powerformer CF account_id directly in the workflow
(account IDs are not secret and the shared CLOUDFLARE_ACCOUNT_ID
secret may point at a different account).

Workflow now fails fast with an actionable error message + dashboard
link if the secret is unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: switch R2 sync to S3-compat API (wrangler getMemberships gate)

wrangler 4.x calls /memberships before any r2 action, requiring
user:read scope. R2 "Object Read & Write" tokens deliberately lack
that scope (defense in depth — a leaked token should not enumerate
account-level resources). The workflow now uses the aws CLI talking
straight to the R2 S3-compatible endpoint with SigV4, no membership
lookup.

Secret rotation: CLOUDFLARE_R2_MOCKS_TOKEN (Bearer) is replaced by
CLOUDFLARE_R2_MOCKS_AK / CLOUDFLARE_R2_MOCKS_SK (matching the
existing CLOUDFLARE_R2_RELEASES_AK/SK naming convention). End-to-end
tested locally: PUT recording → manifest rebuild → manifest PUT →
staging cleanup all green.

aws CLI is pre-installed on ubuntu-latest, so no install step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: scrub synclo namespace; use OD_MOCKS_* env prefix throughout

These mocks were copy-pasted from synclo-explore, where they
originated, and inherited the SYNCLO_EXPLORE_MOCK_* env-var
convention. That brand-bleed is not appropriate in OD: rename the
public env surface to OD_MOCKS_* (matching OD-native prefixes like
OD_MOCKS_CACHE_DIR, OD_TRACE_R2_UPLOAD, OD_EXPECT_TIMEOUT_SECONDS).

Renames:
  SYNCLO_EXPLORE_MOCK_TRACE             → OD_MOCKS_TRACE
  SYNCLO_EXPLORE_MOCK_BY_PROMPT_HASH    → OD_MOCKS_BY_PROMPT_HASH
  SYNCLO_EXPLORE_MOCK_POOL              → OD_MOCKS_POOL
  SYNCLO_EXPLORE_MOCK_SEED              → OD_MOCKS_SEED
  SYNCLO_EXPLORE_MOCK_NO_DELAY          → OD_MOCKS_NO_DELAY
  SYNCLO_EXPLORE_MOCK_RECORDINGS_DIR    → OD_MOCKS_RECORDINGS_DIR
  SYNCLO_EXPLORE_MOCK_SMOKE_TRACE       → OD_MOCKS_SMOKE_TRACE
  SYNCLO_OD_MOCKS_I_KNOW_WHAT_IM_DOING  → OD_MOCKS_ALLOW_LOCAL_UPLOAD

Also drop the inline harvester usage from README. The harvester is an
external CLI in nexu-io/agent-pr-explore — its README is the right
place for langfuse-import flags, anonymization options, etc. OD only
documents its own staging→PR→Action workflow.

Smoke test (21 checks) still green; OD_MOCKS_TRACE end-to-end
verified to route correctly.

Consumers of the OLD env names (notably the orchestrator in
nexu-io/agent-pr-explore) need a matching rename. No back-compat
shim here — the explore side has zero external users today and a
one-line follow-up is cleaner than a permanent deprecation layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* AGENTS.md: align mock env names with mocks/ rename (SYNCLO_* → OD_MOCKS_*)

Missed in the prior commit (a30b868a) — only grepped mocks/ subdir.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: drop staging dir + GH Action; back to local-script upload

The staging-dir + Action design (added earlier in this PR) had a flaw
the user caught: new recordings briefly entered the repo on their way
through staging, leaving them in git history forever even after the
Action cleanup commit removed them from HEAD. That defeats the whole
point of moving recordings to R2.

Replace with the simpler local-maintainer flow:

  bash mocks/scripts/upload-recording.sh /path/to/<trace>.jsonl
  # → validates, wrangler r2 put, updates manifest.json, wrangler r2 put manifest
  git add mocks/manifest.json && git commit && git push
  # → only the ~200B manifest delta enters git

The wrangler-OAuth gate replaces the CI secret + Action duo. For a
solo / small maintainer team this collapses the trust chain down to
"do you have wrangler login to the powerformer account?" — no GH
secrets to rotate, no concurrency window to worry about, no
inevitable repo-history bloat.

Deletes:
- .github/workflows/sync-mocks-to-r2.yml
- mocks/scripts/upload-to-r2.mjs   (CI-only)
- mocks/scripts/add-recording.sh   (staging helper, now obsolete)
- mocks/recordings-staging/        (empty dir, never to be repopulated)

Adds:
- mocks/scripts/upload-recording.sh

Kept:
- mocks/scripts/fetch-recordings.sh
- mocks/scripts/lib/manifest-utils.mjs (still used by upload-recording.sh)
- mocks/manifest.json (committed; the only mocks artifact in git)

End-to-end tested locally: re-uploading an existing recording is
idempotent, manifest math is stable, fetch + smoke test still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: address review — guard allowlist + safe ~/.amr + loud OD_MOCKS_TRACE typo

Three concrete issues raised across recent Siri-Ray (Looper) review
threads on #3241:

1. scripts/guard.ts only allowlisted mocks/lib/ + mocks/mock-agent.mjs,
   leaving mocks/scripts/lib/manifest-utils.mjs outside the residual-
   JS guard. Result: Preflight fail on every push. Extend the allowlist
   to mocks/scripts/ — same precedent as the lib/ entry directly above.

2. mocks/scripts/smoke-test.sh moved the caller's real ~/.amr to
   ~/.amr-smoke-backup, ran vela login (which writes a fake config),
   then ran rm -rf on the fake ~/.amr and restored the backup. Two
   failure modes: a crash mid-run loses the user's real config, and
   re-running before the restore overwrites the backup with the fake
   login. Fix: sandbox vela login into a mktemp -d HOME via env
   (HOME=$amr_sandbox vela login). This never touches the real ~/.amr,
   and a trap cleans up the sandbox.

3. mocks/lib/recording-picker.mjs silently fell through to
   prompt-hash → pool → random when OD_MOCKS_TRACE was set but did
   not match any recording (typo, prefix too short, corpus not
   fetched). Tests using a pinned trace would silently get a
   different trace, hiding regressions. Fix: throw an explicit error
   with the failing value + a pointer at fetch-recordings.sh.
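
The hard-fail behaviour from item 3 can be sketched as below. `resolveTrace` is a hypothetical helper and the error text only approximates the real message; the point is that a pinned trace either resolves or throws, and never degrades into a different selection mode.

```typescript
// Resolve a pinned trace id (prefix allowed) or fail loudly.
function resolveTrace(ids: string[], wanted: string, dir: string): string {
  const hit = ids.find((id) => id.startsWith(wanted));
  if (hit === undefined) {
    throw new Error(
      "mock-agent: OD_MOCKS_TRACE=" + wanted +
        " set but no matching recording in " + dir +
        "; try mocks/scripts/fetch-recordings.sh",
    );
  }
  return hit;
}
```

A test that pins a trace now fails at selection time when the corpus is missing, instead of passing against an unrelated recording.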

Verified locally: pnpm guard prints "Residual JavaScript check
passed", smoke-test still 21/21, ~/.amr mtime unchanged after run,
typo on OD_MOCKS_TRACE now produces "mock-agent: OD_MOCKS_TRACE=...
set but no matching recording in <dir>" on stderr.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fetch-recordings: detect empty filter result before line-counting

printf '%s\n' on an empty string emits a single empty line, so the
previous TOTAL=$(printf ... | grep -c "") math returned 1 on an
empty $ENTRIES_TSV — a typo like `--agent no-such-agent` printed
"Fetching up to 1 recordings", downloaded zero, and exited 0
("ready"). Check `-z $ENTRIES_TSV` first.

Reproduced + fix verified per the reviewer thread.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: address mrcfps review — goldens + provenance + contract check

Three durability improvements suggested in the PR #3241 top-level
review:

## 1. Golden daemon-event snapshots (mocks/golden/*.events.json + apps/daemon/tests/mocks-golden.test.ts)

Smoke-test verified that mocks RUN; that catches crashes but not a
parser change that semantically reshapes the events the daemon emits.

Commit the daemon-event sequence for 3 representative traces:
- claude  314d6833 — median-complexity agent-browser session
- codex   dcdff3b3 — 14-tool refactor
- opencode 9a9522ec — 7-tool data-report

apps/daemon/tests/mocks-golden.test.ts spawns the mock, feeds stdout
through the real createClaudeStreamHandler / createJsonEventStreamHandler,
normalizes per-spawn volatile fields (only sessionId today, only on
claude), and deep-equals against the committed snapshot. A parser
regression fails the test loudly.
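
The normalize-then-compare step can be sketched as follows. This is not the test file itself: `normalizeEvents` and `matchesGolden` are hypothetical, the `<session>` placeholder is an assumption, and a JSON string comparison stands in for the test framework's deep-equal.

```typescript
type DaemonEvent = Record<string, unknown>;

// Replace per-spawn volatile fields (only sessionId here) with a placeholder.
function normalizeEvents(events: DaemonEvent[]): DaemonEvent[] {
  return events.map((event) =>
    "sessionId" in event ? { ...event, sessionId: "<session>" } : event,
  );
}

// JSON comparison is sensitive to key order, which is acceptable when both
// sides are produced by the same code path.
function matchesGolden(actual: DaemonEvent[], golden: DaemonEvent[]): boolean {
  return JSON.stringify(normalizeEvents(actual)) === JSON.stringify(golden);
}
```

Keeping the volatile-field list short matters: every field that is normalized away is a field the snapshot can no longer protect.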

After an intentional parser change, regenerate:

  MOCKS_GOLDEN_UPDATE=1 pnpm --filter @open-design/daemon test mocks-golden
  git diff mocks/golden/
  # eyeball; commit if shapes match intent

## 2. Provenance fields on every manifest entry (mocks/scripts/lib/manifest-utils.mjs + mocks/manifest.json)

Augment inspectRecording() to write:

  captured_at         — ISO 8601 from existing meta.timestamp
  cli_version         — null until harvester writes it
  protocol_version    — null until harvester writes it
  anonymization_version — null until harvester writes it

captured_at is now populated for all 179 existing entries from the
meta event the harvester already emits. The harvester in
nexu-io/agent-pr-explore is the next step for cli_version /
protocol_version / anonymization_version — once those are
populated, consumers can detect when a recording is more than ~1
minor version behind the live CLI and flag it for re-harvest.

No matrix of (cli_version × agent) recordings — that explodes
maintenance. Just metadata per recording so trust decay is visible.

## 3. Real-CLI contract check (mocks/scripts/contract-check.sh + docs/MOCKS-CONTRACT-CHECK.md)

Mocks catch parser regressions against recordings; they do NOT
catch recordings drifting away from the live agent CLI as that CLI
evolves. The contract check spawns the real CLI alongside the mock
with a fixed deterministic prompt + diffs top-level event-type
distributions.

Deliberately human-driven, not cron-scheduled:
- costs real LLM tokens per invocation
- requires real CLI auth
- maintainer reads the output, not a regex

Suggested triggers per doc: real-CLI release notes mentioning
"output format" / "stream" / "JSON" / "events"; before a parser
refactor; ad-hoc when something looks off.

## Coverage note

README updated to position mocks as "deterministic protocol/parser
coverage" (not "e2e replacement") per mrcfps framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks-golden test): drop import of non-exported ParserKind

Use plain string (the type alias is `string` anyway) — Preflight
typecheck on a31fa71a failed:

  tests/mocks-golden.test.ts(29,8): error TS2459: Module
  "../src/json-event-stream.js" declares "ParserKind" locally, but
  it is not exported.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* recording-picker: structured OD_MOCKS_POOL + hard-fail no-match

Siri-Ray review: \`OD_MOCKS_POOL=outcome:failed\` was documented as a
supported selection knob, but the matcher only checked tags and
\`meta.agent\` — so the negative-path pool found 0 candidates and
silently fell through to global random, validating against any
recording instead of a failed trace.

Fix:
- Parse \`<dim>:<value>\` shape and route each dim to the right meta
  field: \`outcome\` → \`meta.outcome\`, \`agent\` → \`meta.agent\`,
  \`skill\` → \`tags[]\`. Bare values still fall back to tag substring.
- If the env was set and matched nothing, throw with the failing
  value and a jq one-liner for inspection. Same loud-fail policy as
  OD_MOCKS_TRACE — silent fallback was the original bug.

Verified locally: outcome:failed, agent:codex, skill:agent-browser
all route correctly; outcome:nonsense throws the explicit error.
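
The routing can be sketched in shell as below. This is illustrative
only: the real matcher lives in the recording picker, `parse_pool_spec`
is a hypothetical name, and treating an unknown `<dim>:` prefix as a
bare value is an assumption.

```shell
# parse_pool_spec is a hypothetical helper. It prints "<field> <value>"
# for a spec such as "outcome:failed", mirroring the routing above.
parse_pool_spec() {
  spec="$1"
  case "$spec" in
    outcome:*) echo "meta.outcome ${spec#*:}" ;;
    agent:*)   echo "meta.agent ${spec#*:}" ;;
    skill:*)   echo "tags[] ${spec#*:}" ;;
    # Bare values (and, by assumption, unknown dims) use tag substring.
    *)         echo "tag-substring $spec" ;;
  esac
}
```

A caller would then fail loudly when the routed field matches nothing,
rather than falling back to a random recording.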

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* contract-check.sh: fix lost $PROMPT in mock invocation

Siri-Ray review on e576074a: the mock side wrapped its pipeline in
`bash -c "printf %s \"\$PROMPT\" | ..."` — but $PROMPT was a parent
shell variable, not exported, so the child bash expanded it to an
empty string. Result: the contract check sent the real prompt to the
real CLI and an empty string to the mock, defeating the
same-input invariant the whole script rests on. It also let the mock
randomly select a different trace whenever a maintainer happened to
have OD_MOCKS_BY_PROMPT_HASH=1 in their env.

Fix: drop the inner bash -c entirely; use a subshell that scopes the
PATH overlay and pipes printf into the PATH-resolved mock binary
directly. The subshell limits the PATH change without var-passing.

Verified locally: with prompt-A the mock picks trace 54ec02ee via
hash; prompt-B → 2667e851 via hash; empty prompt (old broken
behavior) → random — confirms the prompt is now actually reaching
the mock under PATH overlay.
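
The difference between the two invocations can be reproduced in
isolation. The prompt text and the /tmp/mock-bin overlay path below are
placeholders, not values from contract-check.sh:

```shell
# Demonstrates why the inner `bash -c` lost the prompt. The variable is
# deliberately NOT exported, as in the original script.
unset PROMPT
PROMPT="fixed deterministic prompt"

# A child bash only inherits exported variables, so this expands to "".
child_out=$(bash -c 'printf %s "$PROMPT"')

# A subshell is a copy of the current shell: it still sees PROMPT, and
# the PATH change stays scoped to the parentheses.
sub_out=$( (PATH="/tmp/mock-bin:$PATH"; printf %s "$PROMPT") )

echo "child bash: [$child_out]"
echo "subshell:   [$sub_out]"
```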

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 07:17:20 +00:00
koki
4b7c018a9b
feat(contrib): add od-contribute skill for non-coder contributors (#3172)
* feat(contrib): add od-contribute skill for non-coder contributors

Adds a Claude Code skill at .claude/skills/od-contribute/ that walks any OD
user — including non-coders — through a first-PR contribution flow:

  - Ship a Skill / Design System made with OD
  - Translate README / QUICKSTART / CONTRIBUTING to a new language
  - Fix a typo / dead link / write a use-case blog post
  - Report a high-quality bug (issue path, no PR)

The skill replaces the test-driven dev-loop of auto-github-contributor with
type-specific no-code validators (frontmatter parse, markdown link check,
code-fence balance, structural overlap with reference DESIGN.md files), so
artifact-only contributions don't have to pretend to be code.

This commit only adds files under .claude/ — no product code, no build
config, no runtime dependencies. .gitignore is amended with three explicit
exceptions so the skill is tracked while personal Claude state (sessions,
settings, etc.) stays ignored as before.

Next steps (separate PRs):
  - Wire the OD app to mount this skill for its embedded agent
  - Add a "Ship to GitHub" UI button in OD that invokes /od-contribute

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* feat(contrib): English-by-default skill + zip installer for non-coders

Two follow-ups to the initial od-contribute skill:

1. Skill content is now English with an explicit instruction at the top
   telling the agent to mirror the user's chat language for every user-
   facing prompt. Generated artifacts (PR titles, commit messages, PR/
   issue body) stay English regardless — GitHub convention.

2. tools/od-contribute-installer/ ships a cross-platform installer that
   drops the skill into every supported agent's home dir without the
   user opening a terminal:

     install.command  macOS double-click
     install.bat      Windows double-click
     install.sh       Linux

   Targets covered:
     ~/.claude/skills/od-contribute/        Claude Code (native)
     ~/.claude/commands/od-contribute.md    Claude Code slash command
     ~/.agents/skills/od-contribute/        Codex CLI (canonical)
     ~/.codex/skills/od-contribute/         Codex CLI (legacy, only
                                            written if ~/.codex/ exists)

   Verified Codex CLI reads the same SKILL.md frontmatter format as
   Claude Code (source: openai/codex codex-rs/core-skills/src/loader.rs).
   Added agents/openai.yaml sidecar inside the skill for Codex picker UX.

3. build-zip.sh produces od-contribute-installer.zip (~37KB) from the
   in-repo skill. The zip is meant to be hosted as a GitHub Release
   asset; the marketing site button points at:
     github.com/nexu-io/open-design/releases/latest/download/od-contribute-installer.zip
   (See tools/od-contribute-installer/HOSTING.md for the manual release
   recipe; CI workflow can come later.)

   The zip itself is gitignored — distribute via Releases, not source.

Still no product code touched, no build config changed.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* refactor(contrib): drop zip installer; ship single curl one-liner

Replace tools/od-contribute-installer/ (4 install scripts + zip build
machinery) with a single self-bootstrapping tools/install-od-contribute.sh.

User flow becomes:

  1. Click button on opendesign.so
  2. Modal shows: paste this into your AI agent's chat:

       curl -sSL https://raw.githubusercontent.com/nexu-io/open-design/main/tools/install-od-contribute.sh | bash

  3. Agent runs it via its Bash tool. User never touches a terminal.
  4. /od-contribute is live in their next chat.

Why this is better than the zip approach:

  * Zero downloads visible to the user — no .zip in their Downloads folder
  * Zero unzip step
  * Zero terminal window flash (the agent's Bash tool runs in-process)
  * Zero per-OS installer files (.command/.bat/.sh) to maintain
  * Auto-updates: re-running the one-liner pulls the latest skill from main

The script downloads only the skill subtree (.claude/skills/od-contribute/
and .claude/commands/od-contribute.md) from a GitHub tarball — no `git`
dependency, just curl + tar (universally available).

Targets remain the same:
  ~/.claude/skills/od-contribute/
  ~/.claude/commands/od-contribute.md
  ~/.agents/skills/od-contribute/
  ~/.codex/skills/od-contribute/  (only if ~/.codex/ exists)

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(contrib): remove leftover zip artifact

Build artifact accidentally committed in the previous commit.
Cleaning up so the binary doesn't live in git history.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): make skill work in sandboxed agents (Codex.app, Cursor)

macOS App Sandbox apps like Codex.app cannot reach the system keychain
where `gh auth login` stores the GitHub token by default. Result: the
skill's check-prereqs.sh fails on `gh auth status` with a misleading
"not authenticated" error, even when gh works fine in the user's regular
shell.

Two changes:

1. config.sh: if GH_TOKEN isn't set in the env, fall back to reading a
   .gh-token file at the skill root. Lets a user (or the OD app, or a
   future OAuth Device Flow bootstrapper) drop a token there once and
   have every skill script pick it up automatically.

2. check-prereqs.sh: accept GH_TOKEN-from-env as a valid auth path
   alongside `gh auth status`. When neither works, the error hint now
   shows BOTH options:
     A) gh auth login from a regular terminal (any agent)
     B) gh auth token > <skill>/.gh-token (sandboxed agents)
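
A sketch of the fallback order, assuming a hypothetical function name
`load_gh_token`; only the `.gh-token` file name and the "env wins"
behavior come from this commit:

```shell
# load_gh_token is a hypothetical helper, not config.sh's actual code.
load_gh_token() {
  skill_root="$1"
  # An explicit GH_TOKEN in the environment always wins.
  if [ -n "${GH_TOKEN:-}" ]; then
    return 0
  fi
  if [ -r "$skill_root/.gh-token" ]; then
    # Strip whitespace so a trailing newline does not end up in the token.
    token=$(tr -d '[:space:]' < "$skill_root/.gh-token")
    [ -n "$token" ] || return 1
    GH_TOKEN="$token"
    export GH_TOKEN
    return 0
  fi
  return 1
}
```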

Verified: in my local Claude Code (where gh has keychain access), the
keychain path still wins and nothing changes. With GH_TOKEN exported,
check-prereqs.sh succeeds without even consulting gh auth status.

Future: implement OAuth Device Flow inside the skill so non-coder users
hitting this in Codex.app can authenticate by clicking a link, no
terminal involved. That's a separate PR.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(contrib): move install script into skill folder (CI policy fix)

The repo's tools/ directory has a strict allowlist policy enforced by
scripts/guard.ts — only AGENTS.md, dev/, pack/, and serve/ are permitted
top-level entries. Moving install-od-contribute.sh out of tools/ and into
.claude/skills/od-contribute/install.sh:

  - Satisfies the guard policy (no scripts/guard.ts edit needed)
  - Co-locates the install script with the skill it installs (cleaner
    mental model: skill folder is self-contained)
  - The install URL stays inside the gitignore exception we already
    established for .claude/skills/od-contribute/

Public install URL changes from
  raw.githubusercontent.com/.../main/tools/install-od-contribute.sh
to
  raw.githubusercontent.com/.../main/.claude/skills/od-contribute/install.sh

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): address @nettee/looper review feedback (3 blocking issues)

Three real bugs caught by the looper review bot, all fixed:

1) create-pr.sh:48 — git diff missed untracked files

   `git diff --quiet || git diff --cached --quiet` ignored untracked paths,
   so the most common contribution shape (a brand-new Skill folder, a new
   translation file, a new doc) hit the else branch and pushed an empty
   commit. Replaced with `git status --porcelain` which sees untracked,
   plus a post-stage sanity check via `git diff --cached --quiet` so we
   skip the commit cleanly if everything turned out to be in .gitignore.
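
For illustration, the untracked-aware check can be as small as the
following (`has_worktree_changes` is a hypothetical name; the script
may inline the test instead):

```shell
# has_worktree_changes is a hypothetical helper. `git status --porcelain`
# lists modified, staged AND untracked paths, one per line; empty output
# means there is nothing to commit.
has_worktree_changes() {
  [ -n "$(git status --porcelain)" ]
}
```

In a repo whose only change is a brand-new file, `git diff --quiet`
exits 0 while this helper reports a change.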

2) validate-skill-submission.sh:34 — frontmatter parse too lenient

   The awk fence-counter accepted `---` anywhere in the file as the
   opening fence. A SKILL.md with prose before the YAML block parsed as
   "valid frontmatter" by this script while the actual loaders (Claude
   Code + codex-rs/core-skills) required the fence on line 1 and would
   reject it. Added an explicit head -n 1 check so leading prose is
   rejected with a clear error before awk runs.
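
A sketch of the line-1 check, with `has_leading_fence` as a
hypothetical function name:

```shell
# has_leading_fence is a hypothetical name for the line-1 check.
# It succeeds only when the very first line of the file is exactly `---`.
has_leading_fence() {
  [ "$(head -n 1 "$1")" = "---" ]
}
```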

3) check-prereqs.sh:87 — gh api user failure swallowed

   `GH_USER="$(gh api user --jq .login 2>/dev/null || echo '?')"` set
   GH_USER to literal "?" when the API call failed (revoked token,
   missing 'repo' scope, network), then the script exited READY=1.
   Downstream that propagated to TARGET_FORK="?/open-design" and
   blew up at push time.

   Dropped the `|| echo '?'` fallback. An empty GH_USER now triggers a
   structured error with three common causes and the recovery command,
   and exits 2.

   While here, also fixed a related bug: this script sources config.sh
   which has `set -euo pipefail`, so -e leaked in and aborted the
   script silently the moment any check failed (instead of accumulating
   diagnostics like the original auto-github-contributor design
   intended). Added explicit `set +e; set -uo pipefail` after sourcing
   to restore the "keep checking past failures" behavior the comment
   on line 7 promised.

Smoke-tested all four fixes locally:
  - create-pr.sh: git status --porcelain correctly sees untracked files
  - validator: rejects SKILL.md starting with prose, passes well-formed
  - check-prereqs.sh: with stubbed gh that fails `gh api user`, now
    exits 2 with the structured error (was: silent exit 1)
  - check-prereqs.sh: happy path on real machine unchanged

Thanks @nettee for the careful review.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): macOS Bash 3.2 + over-strict link validator (review round 2)

Two more blocking issues from the looper review, plus one related bug I
caught while re-testing on real OD docs.

1) discover-i18n-gaps.sh: removed Bash 4 dep (declare -A)

   macOS still ships Bash 3.2.57 by default and most agent-spawned bash
   subprocesses inherit that. `declare -A SEEN_LANG=()` failed with
   `declare: -A: invalid option`, crashing Step 3b before any translation
   target could be shown.

   Replaced the associative array with a newline-delimited string set
   (\n<lang>\n bracket form to avoid prefix-overlap false matches like
   zh vs zh-CN). Verified end-to-end on /bin/bash 3.2.57 against the
   actual OD repo: returns the correct 28 stale-translation rows
   across the four English source docs.

   Also fixed a latent path-stripping bug in the same loop: `find`
   emits `./README.zh-CN.md` with leading `./`, so `${path#README.}`
   wasn't stripping the prefix at all. Switched to basename-first.

2) validate-markdown.sh: --reference flag for i18n / docs-edit flows

   The validator was treating every relative link target as a file path
   and failing on slugs like `skills/blog-post/` that are website
   router routes, not files in the checkout. A structure-preserving
   translation of README.md couldn't pass even when the user changed
   nothing except language.

   Added --reference <orig> flag. The validator now builds a "known
   already-broken" set of refs from the source file and excuses those
   in the new file. Newly-introduced broken refs still fail.

   Without --reference (e.g. brand-new blog file with no prior version),
   the relative-ref check is skipped entirely with a SKIP note — since
   we can't tell route slugs from file paths in isolation, failing
   would be wrong. Code-fence balance + external-link health still run.

   Updated SKILL.md so the i18n branch (3b.6) and the docs branch
   (3c.6) call validate-markdown.sh with --reference pointing at the
   English source / HEAD revision respectively.

3) (caught while testing) URL extraction regex too loose

   `grep -oE 'https?://[^) ]+'` was capturing trailing quotes from HTML
   <img src="..."> tags in OD's README, e.g.
     https://cms-assets.youmind.com/.../foo.jpg"
   The trailing `"` made the curl HEAD return 404. Tightened the
   character class to also stop at `"`, `'`, `<`, `>`, `[`, `]`.
   With this fix, README.md now passes all checks (20 external links
   verified 2xx/3xx).
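
One way to write the tightened character class, shown as a hypothetical
`extract_urls` filter (the exact expression in validate-markdown.sh may
differ):

```shell
# extract_urls is a hypothetical helper reading markdown/HTML on stdin.
# The bracket expression stops a URL at a space, `)`, either quote
# character, `<`, `>`, `[` or `]`. The `]` must come first inside the
# brackets to be taken literally.
extract_urls() {
  grep -oE "https?://[^] )\"'<>[]+"
}
```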

Smoke-tested on macOS /bin/bash 3.2.57 with the actual nexu-io/open-design
working copy. All four scenarios behave correctly:
  - README.md without --reference → SKIP relative-ref check, PASS overall
  - README.md with --reference itself → 34 refs excused as pre-existing, PASS
  - Newly-introduced broken ref → FAIL (regression catch preserved)
  - Old test cases (skill validator, prereq check) → still pass

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): preserve .gh-token across install.sh reruns

`install_skill_to()` did `rm -rf $dest` before copying in the new skill,
which wiped any user-local state files. The most consequential one is
`.gh-token` — sandboxed agents (Codex.app, Cursor) write a GitHub token
there because they can't reach the macOS keychain (see check-prereqs.sh's
hint and config.sh's fallback path).

Effect: the documented upgrade path ("re-run the curl one-liner to pull
the latest skill") would silently lose the token on every refresh, and
the very next /od-contribute run would fail at the prereq gate with
"no GitHub credentials available", forcing the user back through manual
token setup. This affects exactly the audience the PR is aimed at.

Fix: stash any file in PRESERVE=(.gh-token) to a tempdir before rm -rf,
restore after the copy, re-chmod 600 on the way back. Test:

  1. Pre-seed .gh-token in all three target dirs
  2. Run installer
  3. Verify all three tokens still present, contents unchanged, perms 600

Centralized the preserved-state list as PRESERVE=() so future per-user
state (e.g. an OAuth-flow-saved refresh token) only has to be added in
one place.
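
A simplified sketch of the stash-and-restore flow. The two-argument
signature is an assumption, and PRESERVE is written as a
space-separated string here rather than the array used in install.sh:

```shell
# Files that must survive a reinstall (space-separated in this sketch).
PRESERVE=".gh-token"

# install_skill_to <src_dir> <dest_dir>: hypothetical signature.
install_skill_to() {
  src="$1"
  dest="$2"
  stash=$(mktemp -d) || return 1

  # Stash user-local state before the destructive refresh.
  for f in $PRESERVE; do
    if [ -f "$dest/$f" ]; then
      cp -p "$dest/$f" "$stash/$f"
    fi
  done

  rm -rf "$dest"
  mkdir -p "$dest"
  cp -R "$src/." "$dest/"

  # Restore stashed files and tighten their permissions again.
  for f in $PRESERVE; do
    if [ -f "$stash/$f" ]; then
      cp -p "$stash/$f" "$dest/$f"
      chmod 600 "$dest/$f"
    fi
  done
  rm -rf "$stash"
}
```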

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): i18n stale false-positive + tier markdown link check (round 4)

Two more blocking issues from looper, both real.

1) discover-i18n-gaps.sh: false-stale on same-commit translations

   `git log --since=@<epoch>` is INCLUSIVE of the boundary epoch, so when
   the English source and a translation get touched in the SAME commit
   (a very common pattern: bulk i18n refresh, structural edits applied
   across all locales), the shared commit was counted toward
   english_commits_since_translation. Result: an already-current
   translation was reported with `status="stale", english_commits_since_
   translation=1`, and Step 3b would suggest it for refresh — driving
   users into no-op PRs.

   Reproduced exactly per looper's case: README.md and README.uk.md both
   have last commit 338cb4d at epoch 1779948707; the OLD predicate
   returned 1, the NEW predicate returns 0.

   Switched commits_between() from `--since=@<epoch>` math to commit
   ancestry: `git rev-list <tr_sha>..HEAD -- <newer>`. tr_sha..HEAD reads
   "commits reachable from HEAD but not from tr_sha", which correctly
   excludes the shared tip when both files were last touched together.
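
An illustrative reconstruction of the ancestry-based count. The
function name comes from this commit, but the body and argument order
below are assumptions:

```shell
# commits_between <translation_path> <source_path>
# Prints how many commits touched the source after the translation's
# last commit. Illustrative body, not the script's actual code.
commits_between() {
  tr_sha=$(git log -1 --format=%H -- "$1")
  [ -n "$tr_sha" ] || return 1
  # Commits reachable from HEAD but not from tr_sha, limited to the
  # source path. A commit shared by both files is excluded.
  git rev-list --count "${tr_sha}..HEAD" -- "$2"
}
```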

2) validate-markdown.sh: brand-new files bypassed local link check

   The previous fix skipped relative-ref validation entirely when
   --reference was absent. That covered slug-style refs (good) but also
   covered explicit `./foo.md` and `../bar/baz.md` style refs (bad).
   Step 3c (new blog post) doesn't pass --reference, so a contribution
   could ship with `[broken](./missing.md)` and pass the validator.

   Tiered the relative-ref check:
     - Image refs (`![alt](path)`) — ALWAYS validated. Markdown image
       syntax is never a website route.
     - Refs starting with `./` or `../` — ALWAYS validated. Explicit
       relative paths are unambiguous file references.
     - Other link refs (`skills/blog-post/` style) — only validated
       when --reference is supplied; otherwise skipped (could be route).

   In all cases, refs already broken in --reference (when supplied) are
   excused as pre-existing rather than reported as regressions.
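
The three tiers reduce to a small decision function. The name
`should_validate_ref` and its argument convention are hypothetical, and
the "already broken in --reference" excuse list is left out of this
sketch:

```shell
# should_validate_ref <kind> <target> <has_reference>
#   kind          "image" or "link" (the markdown syntax the ref came from)
#   target        the link target as written
#   has_reference "yes" when --reference was supplied
# Returns 0 when the target must exist on disk.
should_validate_ref() {
  kind="$1"
  target="$2"
  has_reference="$3"
  # Tier 1: image syntax is never a website route.
  if [ "$kind" = "image" ]; then
    return 0
  fi
  # Tier 2: explicit relative paths are unambiguous file references.
  case "$target" in
    ./*|../*) return 0 ;;
  esac
  # Tier 3: ambiguous slugs are only checked with --reference.
  [ "$has_reference" = "yes" ]
}
```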

   Verified against looper's exact repro (`[new broken](./missing.md)`
   in a brand-new file with no --reference): now correctly fails. Also
   verified ambiguous-slug test (`skills/blog-post/`) still skips
   without --reference, image refs always check, and README.md regression
   tests both with and without --reference still pass.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): catch bare-path refs in validators (review round 5)

Two narrow follow-ups to the round-4 tiered link checks:

- validate-skill-submission.sh: scan every non-URL, non-anchor markdown
  link target in SKILL.md (not just `./` / `../` prefixed paths). Plain
  intra-skill refs like `[ref](references/foo.md)` were previously
  ignored by the regex, letting broken bundles pass. Escape detection
  switches to lexical (segment count) instead of `cd … && pwd -P`, so a
  missing intermediate directory no longer masquerades as an escape.

- validate-markdown.sh: treat file-like targets (`*.md`, `*.png`,
  `*.svg`, image/asset/script extensions) as on-disk refs even without
  `--reference`. `[doc](missing.md)` is unambiguously a sibling file,
  not a website route, and Step 3c (new docs/blog) had no `--reference`
  to fall back on. Slug-style refs without an extension still get
  skipped without `--reference`.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(contrib): scratch leak + dedupe gate + workdir reuse (review round 6)

Three blocking issues from looper round 6, all fixed.

1) create-pr.sh + setup-workspace.sh: .od-contrib/ scratch leaked into PR

   `git add -A` in create-pr.sh staged everything in the worktree, including
   the skill's internal scratch dir (.od-contrib/type.txt, .od-contrib/
   slug.txt, .od-contrib/PR-BODY.md created by setup-workspace.sh and the
   render step). OD's .gitignore doesn't exclude .od-contrib/, so every PR
   opened through this flow shipped those bookkeeping files in the user's
   contribution diff.

   Two layers of defense:
   - setup-workspace.sh now writes `.od-contrib/` to .git/info/exclude
     when preparing the workdir (repo-local exclude, not committed).
   - create-pr.sh now uses an explicit pathspec `:!:.od-contrib` on its
     git status / git add calls. So even if a workdir was prepared
     differently, this script alone refuses to stage the scratch dir.

   Verified with a temp repo containing both .od-contrib/PR-BODY.md and a
   user file: only the user file lands in the index after `git add -A
   -- . :!:.od-contrib`.
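
The exclude pathspec can be tried in a throwaway repo. The wrapper name
`stage_without_scratch` is hypothetical; the pathspec itself is the one
quoted above:

```shell
# `:!:.od-contrib` is git's short form for an "exclude" pathspec, so
# everything under the current directory except .od-contrib is staged.
stage_without_scratch() {
  git add -A -- . ':!:.od-contrib'
}
```

After running it, `git diff --cached --name-only` should list the
user's files and nothing under .od-contrib/.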

2) create-issue.sh: dedupe gate didn't actually gate

   The --dedupe-keywords flag printed search hits to stderr but then
   unconditionally fell through to `gh issue create`. The `|| true` after
   the gh search pipeline also swallowed network/jq failures, so a broken
   search looked identical to "no duplicates found" — and the issue got
   created either way. The user never got a real chance to choose
   "comment on existing / open anyway / cancel".

   Now:
   - Run gh search and jq as separate steps; either failure exits 2 with
     a structured REASON=search_failed/parse_failed.
   - If matches > 0 AND --allow-duplicates was NOT passed, exit 3 with
     REASON=duplicates_found and MATCH_COUNT=N. Caller must explicitly
     re-run with --allow-duplicates after surfacing matches to the user.
   - The script now requires `jq` (added od::require jq) since we
     actually parse JSON.
   - Updated the docstring at the top so the caller contract (ask the
     user, then re-invoke with --allow-duplicates) is explicit.

   Verified: searching keyword "preview" against nexu-io/open-design
   matches 5 open issues; the script exits 3 and never calls
   `gh issue create`.

3) setup-workspace.sh: same-day workdir reuse leaked stale state

   `SESSION_DIR=<TYPE>-<SLUG>-<YYYYMMDD>` reused the same directory for
   every same-day, same-(type,slug) invocation. The most acute case:
   SKILL.md 3b.1 calls `setup-workspace.sh i18n translate` BEFORE the
   user has picked a doc/language, so every i18n attempt on the same
   day landed in `i18n-translate-<date>/` — and untracked files from an
   abandoned earlier translation survived `git checkout`/`pull` and
   leaked into the next user's run.

   Two changes:
   - Bumped tag to second precision: `<YYYYMMDD>-<HHMMSS>`. Two human-
     paced sessions in the same second are vanishingly rare. Verified
     two rapid runs produce different tags (114208 vs 114209).
   - When a workdir IS reused (same SESSION_TAG passed in explicitly,
     or rare clock collision), now does `git reset --hard HEAD` and
     `git clean -fdx` first so the run starts from a known-good base
     instead of inheriting prior occupant state.

   The branch name now also tracks the timestamp tag, so two runs can't
   accidentally end up on the same feature branch either.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: leilei926524-tech <leilei926524-tech@users.noreply.github.com>
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
2026-05-29 07:16:04 +00:00
Amy
937946c6fa
Improve model picker search and shared BYOK catalogs (#3262) (#3278) 2026-05-29 07:07:40 +00:00
lefarcen
755d84e64c
feat(web): merge Draw + Screenshot into one Studio mark tool (#3081) (#3277)
Forward-ports chaoxiaoche's Studio toolbar work from #3081 onto current
main. The preview toolbar drops to 4 controls — Comment, Mark (the merged
Draw/Screenshot tool with box-select + pen sub-tools), Edit, Comments —
matching the latest design. The standalone Screenshot button and its
copy-to-clipboard path are removed; capture now flows through the mark
overlay. Also carries #3081's comment select-all/clear-selection panel and
keeps the Draw send guard added in #3270 (Send disabled mid-run, Queue stays).

Reconciled with main work that postdates #3081's base so nothing is lost:
- Preserves #2190's preview iframe keep-alive pool and the AnnotationHoverPopover
  hover card (re-added on top of #3081's BoardComposerPopover, with its own
  anchor helper so it doesn't clash with the composer popover anchoring).
- i18n: keeps every locale key main added; adopts #3081's mark wording.

Behavior change: the comment side-panel Clear now deselects instead of
batch-deleting selected comments (per #3081); per-comment delete and
send-selected remain.

Validation: pnpm --filter @open-design/web typecheck (clean),
full web vitest (2354 passed), pnpm guard.

Co-authored-by: chaoxiaoche <fanzhen910412@gmail.com>
2026-05-29 06:51:38 +00:00
Caprika
76c7d31c53
chore: bump vela cli to 0.0.4 (#3239)
* chore: bump vela cli to 0.0.4-test.0

* chore: refresh lockfile for vela cli 0.0.4-test.0

* chore(nix): refresh pnpm deps hash

* fix: materialize electron before mac release checks

* fix: rebuild electron when mac framework links are invalid

* revert: drop release workflow experiments

* chore(nix): refresh pnpm deps hash

* fix: stop blocking beta mac release on electron symlink preflight

* fix: stop using custom electron dist for beta mac packaging

* fix: guard oversized chat images and opencode overflow

* chore: bump vela cli to 0.0.4

* chore(nix): refresh pnpm deps hash

* fix(daemon): surface prompt-image stat failures instead of dropping them

resolveSafePromptImagePaths only swallowed unresolvable path input; once a
path was confirmed inside UPLOAD_DIR and existed, a statSync failure
(EACCES/EPERM, a file vanishing mid-run) silently dropped the image and let
the run continue without that prompt context. Since this helper is now also
the 1 MB enforcement point, that turned an infra/validation failure into a
'successful' run with missing required context.

Collect those into a new failedImages bucket and fail the run with
INTERNAL_ERROR at the call site, mirroring the oversized-image guard. Add a
unit test covering statSync throwing.

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: lefarcen <935902669@qq.com>
2026-05-29 06:41:17 +00:00
Jane
3f4fd58937
feat(landing-page): surface Discord + X in header, restructure site footer (#3230)
* feat(landing-page): surface Discord + X in header, restructure site footer

Two related public-chrome adjustments:

- **Header gains compact Discord + X icon buttons.** Both community
  channels were previously buried in the footer, so the typical
  visitor never saw them on a page-deep scroll. They now sit before
  the Download / Star CTAs in `nav-side`, share the ghost-button
  outline language, and stay icon-only with `aria-label` so they
  read as social affordances rather than competing with the text
  CTAs. At ≤1080px the icon buttons hide alongside the existing
  ghost CTA, so the bar still collapses cleanly into the hamburger
  panel — Star stays in the bar at every breakpoint.

- **Footer restructured into 4 columns: Products / Plugins /
  Resources / Connect.** The old `Plugins / Open Design / Connect`
  three-column layout muddled three different things — sister
  products, the artifact catalogue, and contributor channels —
  under one roof, so visitors hunting for "the other thing this
  team makes" had nowhere obvious to go.
  - **Products** (new) lists the team's apps: Open Design (links
    to homepage) and HTML Anything. Two entries by design — adding
    more products without an editorial pass would dilute the
    column.
  - **Plugins** mirrors the topbar `Plugins` dropdown verbatim:
    Templates / Skills / Systems / Craft, with no count prefix on
    Systems / Craft so it reads identically to the nav.
  - **Resources** (renamed from `Open Design`) carries the
    docs-style links: Official source / Quickstart / Agents locaux
    / Compare / Claude Design alternative. The old column heading
    was confusing because the OD logo + brand name already sit
    under the column.
  - **Connect** gains an X / Twitter row pointing at
    `@nexudotio`. The brand entries in this column are
    contributor / community surfaces only — code, releases,
    chat, social, RSS, contact form.

Implementation:

- `_components/header.tsx` — `DISCORD` and `X_TWITTER` consts at
  the top alongside `REPO`. Two `<a class="nav-icon">` blocks with
  inline SVG before the existing Download / Star CTAs.
- `_components/site-footer.astro` — `HTML_ANYTHING` and `NEXU_IO`
  consts. `<div class="sub-footer-col">` re-ordered to put
  Products first, Plugins second (no longer carries `counts.*`
  values), Resources third, Connect fourth (with the new X / Twitter
  row).
- `globals.css` — `.nav-icon` rule cloned from the ghost CTA's
  visual language (transparent + 1px line, fills on hover) but
  square (36×36 round) so it reads as a social-icon affordance.
  Added `display: none` for `.nav-side .nav-icon` to the existing
  ≤1080px and ≤880px media queries so the icons follow the same
  collapse behaviour as the Download CTA.
- `sub-pages.css` — `.sub-footer-grid` switches from
  `1.6fr 1fr 1fr 1fr` to `1.4fr 1fr 1fr 1fr 1fr` (brand + 4
  columns). At ≤1080px it falls back to a 3-column shape so each
  column has room to breathe; at ≤720px it stays a single column
  (existing behaviour).
- `i18n.ts` — adds `products`, `resources`, `xTwitter`,
  `sisterProjects`, `htmlAnything`, `nexuIo` to `LandingUiCopy.footer`
  (the last three are kept around even though `sisterProjects` is no
  longer rendered after the column was renamed Products — they're
  harmless and avoid churning the type if a future iteration brings
  the Sister-projects framing back). All 17 non-English landing
  locales gain translations for the new keys via the existing
  `LOCALIZED_LANDING_FOOTER_COPY` map (and the `LANDING_UI_COPY_OVERRIDES`
  block for `zh` / `zh-tw`). Translations were generated with
  `claude-haiku-4-5` over OpenRouter, with explicit instructions
  to keep "Open Design", "HTML Anything", and "X / Twitter" in
  English and to render "Products" / "Resources" in sentence case
  per locale convention. Spot-checked against rendered pages on
  `/zh/`, `/zh-tw/`, `/ja/`, `/ko/`, `/de/`, `/fr/` (and `/ar/` for
  RTL) for natural phrasing.

Validation: `pnpm --filter @open-design/landing-page typecheck` ->
0 errors / 0 warnings; local dev server smoke-tested on en root
(`/html-anything/`) and 5 locale variants (`/zh/`, `/zh-tw/`,
`/ja/`, `/de/`, `/fr/`) — header renders 2 nav-icon buttons,
footer renders 4 localized column headings in the correct order
with the right link targets.

* fix(landing-page): address PR #3230 review — locale-aware HTML Anything link + drop unused const

Two non-blocking inline review points from @PerishCode on PR #3230:

- The HTML Anything entry in the new Products column hardcoded
  `https://open-design.ai/html-anything/` via a top-level
  `HTML_ANYTHING` const, but `/html-anything/` is a real localized
  route in this app (`pages/[locale]/html-anything/index.astro`)
  and `open-design.ai` is the same site's live domain. A visitor
  on `/zh/…` clicking through landed on the English route and lost
  locale context, and hardcoding the production domain meant a
  preview build would surface a link that bounces visitors back
  to prod. Switch to `href('/html-anything/')` so the locale prefix
  + the current site's domain (resolved by `localizedHref`) are
  honored, matching every other footer link.

- `NEXU_IO` was declared at the top of the component but never
  referenced — leftover from an earlier iteration that listed
  `nexu.io` as a Sister-projects entry before the column was
  renamed Products and reduced to OD + HTML Anything. Removed.

No behavior change beyond the locale routing fix; the i18n keys
and column structure stay as they landed in the original commit.
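
A minimal sketch of the locale-aware link resolution described above. The signature and the `DEFAULT_LOCALE` constant are assumptions for illustration; the real `localizedHref` in the landing page may take different arguments or resolve the domain differently.

```typescript
// Hypothetical sketch, not the landing page's actual helper.
const DEFAULT_LOCALE = "en"; // assumption: English is served without a prefix

function localizedHref(locale: string, path: string): string {
  const normalized = path.startsWith("/") ? path : `/${path}`;
  // A relative URL stays on whichever domain served the page (prod or preview).
  return locale === DEFAULT_LOCALE ? normalized : `/${locale}${normalized}`;
}
```

Returning a relative path rather than an absolute URL is what keeps a preview build from bouncing visitors back to production.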

* fix(landing-page): correct nav-icon comment to match actual responsive behaviour

The JSX comment introduced for the new Discord + X icon buttons in
PR #3230 claimed the icons "survive at narrow widths while text-only
nav items get pushed off". The CSS that shipped in the same PR does
the opposite: both `@media (max-width: 1080px)` and `@media (max-width:
880px)` blocks add `.nav-side .nav-icon { display: none; }`, so at
narrow widths the icons collapse alongside the ghost Download CTA
while the text nav <ul> moves into the hamburger panel — only the
Star CTA remains visible in the bar.

Rewrite the comment to describe the actual responsive contract so
the next reader of `header.tsx` doesn't have to cross-reference
`globals.css` to figure out which surface stays. Reviewer flag from
@PerishCode on PR #3230.

No code-path change; comment-only.

* fix(landing-page): correct sub-footer 1080px comment to describe actual 3-column grid

The CSS comment introduced for the new sub-footer grid claimed the
≤1080px breakpoint drops to "brand + 2x2 grid of columns" — but the
rule produces a 3-column grid, not a 2x2.

`.sub-footer-grid` has 5 children at this breakpoint (the brand
block + the four footer columns) and `.sub-footer-brand` carries
no `grid-column` span, so with `grid-template-columns: 1.6fr
repeat(2, 1fr)` they flow as: row 1 = brand · Products · Plugins,
row 2 = Resources · Connect · empty cell. The brand sits inline
with two columns rather than on its own, and the four content
columns are not a clean 2x2.

The layout itself is fine; only the comment misleads the next
reader about how the columns wrap. Same flavor as the `header.tsx`
icon comment fixed in 744daec — describe what the rule actually
does so the comment doesn't drift from the CSS. Reviewer flag
from @PerishCode on PR #3230.

Comment-only change.

---------

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-29 05:59:24 +00:00
lefarcen
98a2c63973
feat(daemon): add Antigravity agent adapter (#3157)
* feat(daemon): add Antigravity agent adapter

Adds Google Antigravity (`agy` CLI) as a coding-agent runtime. Detection
picks up `agy` on PATH, the daemon spawns `agy -p "<prompt>"` for a
single non-interactive turn, and the assistant text reply streams back
on stdout. OAuth is shared with the Antigravity IDE through the system
keyring, so users who have signed into the desktop app are authenticated
on first run with no extra step.

`agy` v1.0.3 has no JSON / stream-json / ACP output mode (upstream issue
#119), no `--model` flag (issue #35), and no MCP forwarding hook yet —
the adapter ships with `streamFormat: 'plain'` and a single `default`
fallback model so the model picker doesn't mislead users into thinking
their choice is wired through. We will upgrade buildArgs + add a
dedicated event parser when upstream ships structured output.

Also gitignores `.antigravitycli/`, the project-local config directory
`agy` auto-creates on every run (upstream issue #175).

* fix(daemon): Antigravity adapter — stdin prompt, brand icon, form loop, empty-output guard

- Switch prompt delivery from argv to stdin (`agy -p -`) to avoid the
  30KB maxPromptArgBytes limit that blocked real-world composed prompts
- Add official Antigravity brand SVG icon to agent picker
- Fix repeated question-form loop for plain agents by injecting an
  OVERRIDE block when form answers are already present in the transcript
- Add empty-output guard for plain agents so expired auth or silent
  failures surface a user-visible error instead of a blank "Done" turn

* feat(daemon): expand Antigravity adapter — model picker, form-loop fix, OAuth launcher, log-file classification

PR #3157 follow-up integrating four iterations from end-to-end manual
testing on Gemini 3.5 Flash + GPT-OSS 120B Medium through `agy` v1.0.3.
Each section is independently verifiable; combined they're what made
the first successful artifact generation work end-to-end.

## Model picker via settings.json (agy has no --model flag)

agy v1.0.3 ships no `--model` CLI flag (upstream issue #35), but the
TUI Switch-Model picker writes the chosen label to
`~/.gemini/antigravity-cli/settings.json`'s `"model"` field, and every
`-p` invocation re-reads that file on startup — verified by capturing
the `--log-file` line `Propagating selected model override to backend:
label="<model>"`. Antigravity's `fallbackModels` now lists the 8
labels its TUI exposes (Gemini 3.1 Pro / 3.5 Flash variants, Claude
Sonnet/Opus 4.6 Thinking, GPT-OSS 120B Medium) and `buildArgs`
persists the user's choice to settings.json right before spawn. The
synthetic `default` id is preserved — picking it leaves settings.json
untouched so a user who switches models from agy's own TUI keeps
their choice.

Introduces `RuntimeAgentDef.supportsCustomModel?: boolean`. The
hardcoded AMR blocklist in `SettingsDialog.tsx` migrates to the
declarative flag (AMR rejects free-form ids at the ACP layer), and
antigravity opts out of custom models as well because its label set is
a server-side enum that silently fails on unrecognised strings.

## Form-loop fix (transcript sanitizer + stronger OVERRIDE)

The discovery form loop on weak/medium plain-stream models (GPT-OSS
120B Medium, Gemini 3.5 Flash) had two reinforcing causes:

  1. `buildDaemonTranscript` packed the prior assistant turn's
     literal `<question-form>` markup into the user request on the
     next turn, giving the model a template to echo. New
     `sanitizePriorAssistantTurnForTranscript` strips
     `<question-form>...</question-form>` blocks and ```json fences
     that match form-schema shape, replacing them with a brief
     placeholder. User content is preserved verbatim (a user who
     legitimately mentions `<question-form>` in chat keeps their
     message intact).
  2. The OVERRIDE block on form-answered turns was 4 lines and only
     banned the bare `<question-form>` tag — models still emitted the
     fenced JSON, form-asking prose ("Got it — tell me the following"),
     and fake system events ("subagents stopped"). The new
     `FORM_ANSWERED_SYSTEM_OVERRIDE` enumerates each anti-pattern and
     pins them via tests, so silently weakening any line reintroduces
     the regression.
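
The first cause can be sketched as a small string transform. This is an illustration only: `stripQuestionForms` and the placeholder text are hypothetical names, the fenced-JSON handling is omitted, and the real `sanitizePriorAssistantTurnForTranscript` may work differently. It would be applied to prior assistant turns only, never to user content.

```typescript
// Hypothetical helper; the placeholder wording is an assumption.
const FORM_PLACEHOLDER = "[question form omitted]";

function stripQuestionForms(assistantText: string): string {
  // Non-greedy match so two separate forms in one turn are replaced individually.
  return assistantText.replace(
    /<question-form\b[^>]*>[\s\S]*?<\/question-form>/gi,
    FORM_PLACEHOLDER,
  );
}
```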

Also adds `RuntimeAgentDef.resumesSessionViaCli` and
`RuntimeContext.hasPriorAssistantTurn` as forward-looking abstractions
(with a `skipTranscript` option on `composeChatUserRequestForAgent`).
Antigravity does NOT opt in: agy's `-c` resume activates an internal
agentic loop with tool retries and fallback-to-cached-response on tool
errors that the OD system prompt cannot steer. The opt-in was reverted
after seeing byte-identical form re-emissions caused by agy's own retry
logic, not OD's transcript.

## One-click OAuth via system terminal

agy print mode can't complete Google Sign-In on its own (the OAuth
callback page asks the user to paste an auth code back into agy, but
`-p` has no input field). Before this commit the auth banner only
told the user to "open a terminal yourself."

Adds `POST /api/agents/antigravity/oauth-launch` and a cross-platform
launcher in `runtimes/terminal-launch.ts`:

  - macOS:    osascript → Terminal.app `do script "agy"` + activate
  - Linux:    tries x-terminal-emulator, gnome-terminal, konsole,
              xfce4-terminal, xterm in order
  - Windows:  `cmd /c start "Open Design" cmd /k agy`

The endpoint hardcodes the `agy` command (no user input → no shell
injection surface) and is loopback-gated like the other daemon
endpoints. The chat's `AGENT_AUTH_REQUIRED` banner now renders a
"Sign in via terminal" button next to Retry; clicking it spawns the
terminal so the user can finish OAuth in one click.

## Silent-failure classification (auth vs quota via --log-file)

agy print mode is silent on stdout/stderr for both missing-OAuth AND
quota-exhausted failures — the upstream
`RESOURCE_EXHAUSTED (code 429): Individual quota reached` and the
`not logged into Antigravity` line only surface in agy's
`--log-file`. Without log inspection the daemon misread quota as
"auth required" and showed the wrong banner.

`RuntimeContext.agentLogFilePath` carries a daemon-owned per-run temp
path that antigravity's buildArgs translates to `--log-file <path>`.
The empty-output guard now reads that log on a `code === 0 &&
!childStdoutSeen` exit, feeds the tail to
`classifyAgentServiceFailure`, and routes:

  - "not logged into Antigravity"     → AGENT_AUTH_REQUIRED with
                                        antigravityAuthGuidance
  - "RESOURCE_EXHAUSTED" / "quota" /  → RATE_LIMITED with
    "Individual quota reached"          antigravityQuotaGuidance
  - none of the above (rare)          → fall back to auth guidance
                                        as the most likely cause
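
The routing table above, sketched as a pure function. The function name, the check order (auth before quota), and the case-insensitive matching are assumptions; the daemon's `classifyAgentServiceFailure` may use different patterns and return a richer result.

```typescript
type SilentFailure = "AGENT_AUTH_REQUIRED" | "RATE_LIMITED";

// Hypothetical classifier over the tail of agy's --log-file.
function classifySilentFailure(logTail: string): SilentFailure {
  if (/not logged into Antigravity/i.test(logTail)) return "AGENT_AUTH_REQUIRED";
  if (/RESOURCE_EXHAUSTED|Individual quota reached|quota/i.test(logTail)) {
    return "RATE_LIMITED";
  }
  // Nothing recognisable in the log: fall back to auth as the most likely cause.
  return "AGENT_AUTH_REQUIRED";
}
```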

Both surface a terminal launcher in the auth banner: auth gets "Sign
in via terminal", quota gets "Switch model in terminal" — same
endpoint, contextual label. The handler is identical (open agy in a
terminal); the user either signs in or uses agy's Switch Model
picker to pick a model with available quota.

## Validation

- `pnpm guard` pass
- `pnpm --filter @open-design/daemon` runtime + telemetry suites:
  192 passed, 1 skipped (the 1 pre-existing `task-type` failure on
  origin/main is unrelated to this change)
- `pnpm --filter @open-design/web` typecheck pass; sse / amr-guidance
  / AgentIcon suites pass (51 web tests)
- Manual end-to-end on darwin + Gemini 3.5 Flash and GPT-OSS 120B
  Medium: turn-1 question-form rendered correctly, turn-2 produced
  `<artifact>` with full HTML (3.3KB Modern Minimal design) instead
  of re-emitting the form. agy `--log-file` content correctly
  classified as RATE_LIMITED when Gemini Pro quota was exhausted,
  and as AGENT_AUTH_REQUIRED when keychain was cleared.

* fix(web/test): align amrAgent fixture with supportsCustomModel contract

The AMR agent definition in the daemon ships `supportsCustomModel: false`
so the Settings model picker hides the free-text "Custom…" option. The
PR changed `allowCustomModel` from `selected.id !== 'amr'` (hardcoded)
to `selected.supportsCustomModel !== false` (declarative), but the test
fixture was not updated to carry the same field — causing the
`__custom__` sentinel to appear in the picker under test.
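
A small sketch of why the missing fixture field mattered. The `AgentInfo` shape is a reduced, hypothetical stand-in for the real agent type; only the comparison itself comes from the change described above.

```typescript
// Minimal shape for illustration; the real agent definition has more fields.
interface AgentInfo {
  id: string;
  supportsCustomModel?: boolean;
}

// Agents allow custom model ids unless they explicitly opt out.
function allowCustomModel(selected: AgentInfo): boolean {
  return selected.supportsCustomModel !== false;
}
```

With the declarative check, a fixture of `{ id: 'amr' }` without the flag is treated as allowing custom models, which is how the sentinel leaked into the picker under test.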

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): align formAnswerTransition wording with main + scope build directive to discovery

CI surfaced two failures on the merge with main:
- chat-route.test marks submitted discovery form answers ... expected
  the main-version wording 'Do not emit another <formId> form.'
- telemetry-message-finalization keeps non-discovery form answers
  active ... expected task-type to fall through the else branch
  ('Treat these form answers as the active user turn'), not the
  discovery RULE 2/RULE 3 build branch.

The colleague's earlier fba1e40b form-loop fix tightened both pieces
(stronger wording + grouped discovery|task-type into the build branch)
but didn't update the tests that pin the contract. Revert the
transition wording to main and re-scope the build directive to
'discovery' only. The aggressive form-loop suppression we added in
this PR now lives in the system-prompt FORM_ANSWERED_SYSTEM_OVERRIDE
block, which is far stronger than the user-request transition text
this commit reverts.

* fix(daemon): scope formOverride by form id, detach Linux terminal, move agy log cleanup to finally

- FORM_ANSWERED_GENERIC_OVERRIDE: new exported constant for non-discovery/
  non-task-type form ids; contains only the "do not re-ask" suppression
  without the RULE 2 / RULE 3 / artifact directive.
- formAnswerTransitionForCurrentPrompt: extend build-transition branch to
  include task-type alongside discovery, keeping user-turn and system
  override consistent.
- Prompt assembly (server.ts ~10848): derive formOverride from the parsed
  form id — FORM_ANSWERED_SYSTEM_OVERRIDE for discovery/task-type,
  FORM_ANSWERED_GENERIC_OVERRIDE for all other form ids, empty otherwise.
- launchOnLinux: replace execFileAsync (waited for terminal exit, 3 s cap)
  with spawn({ detached: true, stdio: 'ignore' }) + unref(); resolve on
  the 'spawn' event so long-lived interactive terminals (xterm, konsole)
  are not killed mid-OAuth-flow.
- Antigravity log cleanup: move fs.promises.unlink(agentLogFilePath) into
  a try/finally wrapper around the close handler so every exit path
  (success, failure, cancel, non-zero exit) cleans up the per-run temp
  file, preventing unbounded /tmp accumulation.
- Tests: rename task-type case to assert build-transition behaviour; add
  generic-form-id case (preferences) pinning the non-build path; add
  FORM_ANSWERED_GENERIC_OVERRIDE content assertions.
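
The form-id scoping from the prompt-assembly bullet, as a rough sketch. The two constants carry placeholder strings here and `selectFormOverride` is a hypothetical name; the actual logic in server.ts may be inlined or structured differently.

```typescript
// Placeholder strings; the real override blocks are much longer.
const FORM_ANSWERED_SYSTEM_OVERRIDE = "<build directive + do-not-re-ask rules>";
const FORM_ANSWERED_GENERIC_OVERRIDE = "<do-not-re-ask rules only>";

// `formId` is undefined when the turn carries no form answers.
function selectFormOverride(formId: string | undefined): string {
  if (!formId) return "";
  if (formId === "discovery" || formId === "task-type") {
    return FORM_ANSWERED_SYSTEM_OVERRIDE;
  }
  return FORM_ANSWERED_GENERIC_OVERRIDE;
}
```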

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): switch Antigravity buildArgs to chat subcommand invocation

Replace top-level `-p -` with `agy chat [--log-file …] -` so the adapter
uses the documented chat subcommand and stdin sentinel instead of the
unrecognised global -p flag.  Update the agent-args test description and
all four deepEqual assertions to assert the ['chat', '-'] shape.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* test(daemon): drop real-platform default-launch assertion from terminal-launch suite

The removed test called launchAgentInSystemTerminal('agy') with no
platform override, which invokes the real system terminal on every
developer machine running the daemon test suite (Terminal.app on macOS,
cmd.exe on Windows, xterm/gnome-terminal on Linux). That is an
unacceptable OS side effect for a unit test.

The behaviour being asserted — that omitting platform selects
process.platform — is a TypeScript default-parameter guarantee, not a
runtime invariant that needs an integration test. The remaining 'aix'
case continues to pin the unsupported-platform failure shape.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): buffer Antigravity stdout to suppress auth URL before close-time classifier

The plain-stream close handler at code===0 can detect an agy OAuth
prompt in agentStdoutTail and emit AGENT_AUTH_REQUIRED, but by the
time close fires the stdout chunk has already been forwarded to the
client via the plain-stream `send('stdout', { chunk })` path. This
leaves both the raw OAuth URL and the terminal-launch guidance visible
in chat.

Buffer all stdout chunks for the `antigravity` agent instead of
forwarding them immediately. The existing close-time auth-prompt guard
(code===0, !trackingSubstantiveOutput, childStdoutSeen) returns early
when it detects the auth pattern, leaving the buffer unflushed and the
OAuth URL out of the SSE stream. For legitimate assistant output the
buffer is flushed in order just before design.runs.finish so the
chunks still arrive before the run's finished event.
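
A minimal sketch of the buffer-then-decide pattern described above. `StdoutBuffer` and the `send` callback are hypothetical stand-ins; the daemon's actual buffering lives in the close handler and may not be a separate class.

```typescript
// Hypothetical buffer; `send` stands in for the daemon's SSE emitter.
class StdoutBuffer {
  private chunks: string[] = [];

  push(chunk: string): void {
    this.chunks.push(chunk);
  }

  // Forward buffered chunks in arrival order, then empty the buffer.
  flush(send: (chunk: string) => void): void {
    for (const chunk of this.chunks) send(chunk);
    this.chunks = [];
  }

  // Drop everything, e.g. when the close handler detects an auth prompt.
  discard(): void {
    this.chunks = [];
  }
}
```

The trade-off is that legitimate output no longer streams incrementally for this agent; it arrives in one ordered burst just before the run finishes.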

Adds a chat-route integration test using a fake `agy` that exits 0
after printing the canonical auth prompt; asserts that the run emits
AGENT_AUTH_REQUIRED with no event: stdout delta containing the URL.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* test(daemon): isolate antigravity buildArgs argv test from real settings file

Pass a temp antigravitySettingsPath in the RuntimeContext for the
withModel argv assertion so unit tests do not touch
~/.gemini/antigravity-cli/settings.json. Adds the optional
antigravitySettingsPath field to RuntimeContext and threads it
through buildArgs to writeAntigravityModelSelection; production
callers leave it undefined, preserving the existing default path.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): revert Antigravity buildArgs to `-p -` (the only working agy v1.0.3 invocation)

The looper-reviewer-bot reported `chat` as agy's headless subcommand
based on its environment's agy build, and looper-fixer applied that
shape. The installed CLI (`agy --version` reports `1.0.3`) does NOT
expose a `chat` subcommand — `agy --help`'s `Available subcommands`
section lists only `changelog / help / install / plugin / update`,
and `agy chat - < prompt` exits 0 with empty stdout (the daemon then
forwards it as a 'successful' empty reply, exactly the failure mode
the auth/quota guard at server.ts ~12090 is meant to catch — for the
wrong reason).

`-p` is the documented print-mode flag (`Short alias for --print`),
and `agy -p -` reads the prompt from stdin and prints the model
reply. The entire end-to-end test sequence in this PR was run
against this invocation: the form-loop fix, settings.json model
routing, and log-file classification were all confirmed working on
Gemini 3.5 Flash + GPT-OSS 120B Medium.

Updates the agent-args test to pin `['-p', '-']` instead of
`['chat', '-']` and adds an inline comment in antigravity.ts noting
that `chat` may exist in a future agy build but is not the contract
on the installed CLI today.

* fix(daemon): serialize Antigravity concrete-model spawns to dodge settings.json race

Reviewer (looper) flagged a concurrency race in the model-routing path:
~/.gemini/antigravity-cli/settings.json is process-global, so two OD
runs starting close together with different concrete models can race
the file — run A writes model A, run B writes model B, then A's agy
finally reads settings.json and executes on model B. The Settings
model picker becomes nondeterministic under parallel conversations.

Adds a per-process promise chain in antigravity.ts:
  - acquireAntigravityModelLock(): chain-await + return release fn
  - waitForAgyToReadModel(logPath, expected): polls agy's --log-file
    for the upstream signal
      'Propagating selected model override to backend: label="<X>"'
    which model_config_manager.go emits once agy has finished reading
    settings.json. Returns true on observed match, false on timeout.
    Regex-escapes the expected label so '(' / ')' in 'GPT-OSS 120B
    (Medium)' match literally, not as a capture group.
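
The label matching can be sketched as follows. `escapeRegExp` and `logConfirmsModel` are hypothetical names, and the polling loop around the check is omitted; only the log line format is taken from the description above.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical check for the upstream log line quoted above.
function logConfirmsModel(logText: string, expectedLabel: string): boolean {
  const pattern = new RegExp(
    `Propagating selected model override to backend: label="${escapeRegExp(expectedLabel)}"`,
  );
  return pattern.test(logText);
}
```

Including the closing quote in the pattern is what keeps a label from matching a different model that merely shares its prefix.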

server.ts spawn pipeline now acquires the lock BEFORE buildArgs (which
performs the settings.json write) and schedules a release-once handler
that fires when EITHER (a) the log-file confirms agy read the model
or (b) the child exits — the exit fallback prevents a stuck/crashed
agy from starving the queue for every subsequent antigravity spawn.

Default-model spawns bypass the lock entirely: their buildArgs doesn't
touch settings.json, so there's nothing to serialize.
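
A rough sketch of a promise-chain lock with FIFO ordering, under the assumption that a single module-level chain is enough. `acquireLock` is a hypothetical name; the real `acquireAntigravityModelLock` and its release-once wiring in server.ts may differ.

```typescript
// Tail of the chain: resolves once every earlier holder has released.
let tail: Promise<void> = Promise.resolve();

// Resolves with a release function once all earlier holders have released.
function acquireLock(): Promise<() => void> {
  let release!: () => void;
  const held = new Promise<void>((resolve) => {
    release = resolve;
  });
  const previous = tail;
  // The next acquirer waits for this holder's release as well.
  tail = previous.then(() => held);
  // Resolving a promise twice is a no-op, so calling release again is safe.
  return previous.then(() => release);
}
```

Because each acquirer chains onto the previous tail, waiters are served in the order they called `acquireLock`, with no explicit queue.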

Tests pin:
  - FIFO ordering across 2 / 3 concurrent acquirers
  - Wait helper's regex correctly matches parenthesized labels
  - Wait helper does NOT match a different model with shared prefix
  - Wait helper swallows missing-log-file errors and returns false on
    timeout (no spawn-pipeline crash if the log never appears)

194 → 198 passing runtime tests, 0 regressions.

* fix(daemon): close Antigravity lock release race on slow agy startup (looper #263fd2fe7)

Reviewer flagged that the previous serialization scheduled
`releaseOnce` in `.finally()` on waitForAgyToReadModel — meaning the
helper's `false` timeout return ALSO released the lock. If agy took
longer than the 15s polling window to read settings.json (cold start,
swap-thrash, slow network handshake to the upstream backend), run A's
lock dropped at 15s, run B rewrote settings.json with model B, and
run A's still-starting agy then read the wrong model. Same race the
original mutex was meant to close.

Fix the release semantics to be release-on-confirmation-only:

  - waitForAgyToReadModel: `false` now strictly means 'I gave up
    polling,' not 'agy definitely did not read this.' Document the
    contract so a future caller can't conflate the two. Add an
    optional AbortSignal so server.ts can stop polling when the child
    exits — without it, the leftover watcher could outlive the run
    and accidentally match a later concurrent run's log content,
    releasing the wrong lock.
  - server.ts: schedule `releaseOnce` only when waitForAgyToReadModel
    returns true. The exit handler (which fires for crashes, fast
    exits, normal completion) is now the canonical fallback that
    releases the lock no matter what — the queue can't starve
    permanently because agy always exits eventually. The exit
    handler also fires the AbortController so the watcher cleans up.

New tests pin:
  - timeout returns false WITHOUT any release-implying side effect
  - already-aborted signal short-circuits (no readFile calls)
  - abort mid-poll wakes the helper from its setTimeout (no
    multi-hundred-ms hang waiting out a poll interval that no longer
    matters)

198 → 201 passing runtime tests, 0 regressions.

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-05-29 05:43:37 +00:00
lefarcen
bf7152dbdc
fix(web): disable Draw direct-send during an active run, keep Queue (#3270)
Reinstates the Studio tool hardening from #3081 on top of current main:
while a task is streaming, the Draw/annotation primary Send action and its
Enter shortcut are disabled, so an annotation can no longer leak into the
active run while the button shows a disabled reason.

This is the synthesis of two stacked-merge-divergent changes rather than a
wholesale revert: Queue stays available, so the value from #1961 (kami) is
preserved — an annotation made during a run is still staged for the next
turn instead of being dropped. Only the button/Enter availability changes;
the downstream queue/streaming-staging handler in ChatComposer is untouched.

- PreviewDrawOverlay: send('send') and canSend now respect sendDisabled.
- Reframed the streaming Draw test to assert Send is disabled while Queue
  still emits a queued annotation (preserving the "annotate during a run"
  coverage).
- Added unit coverage for the Enter/Send guard and Queue availability while
  a task is running.
2026-05-29 05:28:18 +00:00
chaoxiaoche
912c7e380a
fix(plugin): infer semantic roles for token maps (#3231)
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-29 03:50:56 +00:00
Hashem Aldhaheri
bbf4809a7e
fix(web): use surface-appropriate noun in plugin/template preview unavailable copy (#3229)
After #2840 wired plugin and design-template 404s into the same
"no shipped preview" placeholder the skills tab uses, the placeholder
copy still hard-coded "skill" — so users opening a Community/Plugins
card whose manifest declares a preview entry that doesn't ship saw
"No shipped preview for this skill." on a card that is clearly not a
skill.

Adds a noun discriminator to PreviewView.unavailable so the placeholder
reads with the right word per surface — "this skill" on the Skills
tab, "this plugin" on Community/Plugins, "this template" on deck-mode
design-templates. Locales gain three new preview.noun* strings (with
appropriate per-language demonstrative+article) and the existing
unavailable title/body interpolate a {noun} placeholder.
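
A sketch of the `{noun}` interpolation, using English strings only. The `NOUN_COPY` map and `unavailableTitle` are hypothetical; the real locale keys and copy may be organised differently.

```typescript
type PreviewNoun = "skill" | "plugin" | "template";

// Hypothetical English strings standing in for the preview.noun* locale entries.
const NOUN_COPY: Record<PreviewNoun, string> = {
  skill: "this skill",
  plugin: "this plugin",
  template: "this template",
};

function unavailableTitle(noun: PreviewNoun): string {
  const template = "No shipped preview for {noun}.";
  return template.replace("{noun}", NOUN_COPY[noun]);
}
```

Keeping the demonstrative inside the noun string lets each locale supply the right article or demonstrative for its language.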

Also fixes a CSS gap in .ds-modal-unavailable surfaced by the same
path: the title and body divs were collapsing onto a single line under
.ds-modal-empty's default flex-row. Mirrors the existing
.ds-modal-error column+gap layout.

Refs #897, #2840.
2026-05-29 03:23:18 +00:00
kami
055680a67d
fix(daemon): dedupe scheduled routine slots (#1971)
* fix(daemon): dedupe scheduled routine slots

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): claim scheduled routine runs atomically

Co-authored-by: multica-agent <github@multica.ai>

* Fix routine loser snapshot rollback

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): defer scheduled routine side effects

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): terminate in-memory run on scheduled prepare failure

If `prepare()` throws after `persistPreparedRun()` has mutated the
routine run with real project/conversation/agentRunId values, the catch
in `RoutineService.start_` previously left the in-memory chat run
queued (no `discard()`), so its `completion` promise hung waiting on
`design.runs.wait(run)` forever, and the `routine_runs` row stayed
pinned to `routine-pending-*` placeholders even though the underlying
project/conversation rows for those real IDs had been created.

The catch now calls `handlerStart.discard?.()` so the in-memory run
terminates as `canceled`, releasing `completion`, and passes the real
IDs through `updateRun` so the persisted failed row reflects what was
attempted instead of the placeholder sentinels. A cleanup failure
inside `discard()` is logged via `console.error` rather than swallowed,
following the same surface-don't-swallow rule the loser cleanup path
uses. The original prepare error is still rethrown so the scheduler
advances to the next cadence (the slot claim is already terminal, so
retrying the same slot would just duplicate-claim and lose).

Added regression coverage in `apps/daemon/tests/routines.test.ts` for
both the normal prepare-failure path (real IDs persisted, discard
fired, completion resolved) and the case where the cleanup itself also
throws (failure surfaces via console.error, the row is still finalized
with the real IDs).

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): clear placeholder IDs on scheduled prepare failure

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): finalize routine prepare failures

* fix(daemon): defer manual routine setup cleanup

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): drop loser chat runs and rollback partial snapshot pins

Two follow-ups from the latest scheduler-claim review:

- Duplicate scheduled losers used to call `design.runs.finish(run, 'canceled')`,
  exposing a phantom canceled routine run on `/api/runs` even though no
  `routine_runs` row, conversation, or messages were ever committed. Split
  the handler tear-down into `discardUnstarted` (used for never-inserted
  paths — drops the in-memory run via the new `design.runs.drop()`) and the
  existing `discard` (used after `prepare()` runs — still finalizes as
  canceled and rolls back partial state).
- `resolvePluginSnapshot()` calls `linkSnapshotToProject()` before linking
  the conversation/run, so a failure mid-link could leave the reused project
  pinned to a snapshot the routine never durably claimed while
  `resolvedRoutineSnapshot` stayed null. Capture the intermediate snapshot
  id in `partiallyAppliedSnapshotId` when the resolver throws, and let
  `discard()` fall back to it for `restoreProjectSnapshotLink` so the
  previous project pin is restored either way.

Regression coverage added in `tests/routine-schedule-claims.test.ts`:

- A scheduled loser does not surface a phantom canceled chat run via
  `/api/runs` after the slot is lost.
- A resolver that throws after `linkSnapshotToProject()` (forced via a
  SQLite trigger on `conversations.applied_plugin_snapshot_id`) still
  restores the reused project's previous pin in `discard()`.

* fix(daemon): return prepared routine run ids

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: kami.c <kami.c@chative.com>
2026-05-29 03:20:47 +00:00
Jane
afc6e9a39f
feat(landing-page): localize templates subcategory chip labels across 16 locales (#3256)
The "scene" chip rail under each `/plugins/templates/<kind>/` page
shipped 23 chip labels in English (`UI & product mockups`,
`Brand & logo`, `Storyboards`, `Social & content`,
`Avatar & portrait`, `Illustration & style`, plus the rest of the
23-slug subcategory map covering all seven artifact kinds). Only
the `zh` override carried a translation; every other non-English
locale fell back to English on its scene rail. The result: a
visitor reading the rest of `/ja/plugins/templates/image/` in
Japanese (hero, kind chips, FAQ, card chrome — all localized in
PR #3218) hit a row of English chips at the bottom that read as
machine output rather than first-party copy.

This change fills `subcategory: { ... }` for the remaining 16
landing locales: `zh-tw`, `ja`, `ko`, `de`, `fr`, `ru`, `es`,
`pt-br`, `it`, `vi`, `pl`, `id`, `nl`, `ar`, `tr`, `uk`. The
existing `zh` translation is untouched. Brand-name tokens
(`UI`, `HyperFrames`, etc.) stay in English; localizable terms
(`Apps`, `Brand`, `Logo`, `Avatar`, `Storyboards`, …) are
translated where the language has a clean native equivalent.
Conjunctions follow locale convention — `&` for Latin-script
locales that read it as native chrome, `·` for CJK locales
where it works better than `&` next to ideographs, and
`و / & / และ`-style natural conjunctions for the rest.

Translations were generated with `claude-haiku-4-5` over OpenRouter
using a single batch script with explicit instructions on
chip-width budget (≈120px, target 1–4 native words), sentence
casing, and brand-token preservation. Output was validated for
JSON shape (every locale returns all 23 slugs) before splicing
into the override blocks.

Validation: pnpm --filter @open-design/landing-page typecheck ->
0 errors / 0 warnings; local dev (port 3067) renders the chip
rail in Japanese / Russian / Traditional Chinese / Arabic / German
/ French on `/<locale>/plugins/templates/image/` (and the same
rail on the other six artifact kinds, which share the subcategory
slug map).

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-29 03:19:29 +00:00
初晨
9c6a69490b
fix(web): localize mention picker copy (#3255) 2026-05-29 03:19:14 +00:00
Anurag Pappula
5319e14dc0
docs: sync README skill and design-system counts to 137 / 150 (#3254)
* docs: bump skill count to 137 in TL;DR and header badge

* docs: sync at-a-glance and comparison-table counts, drop broken arithmetic

* docs: sync remaining body references to 137 skills
2026-05-29 03:18:45 +00:00
Yuhao Chen
d0921ed335
fix(skills): avoid orphan web prototype files (#3253) 2026-05-29 03:18:19 +00:00
Yuhao Chen
4a0900ca81
fix(web): remove passive video play badge (#3252) 2026-05-29 03:17:57 +00:00
laihenyi
f67d245744
docs(i18n): fix zh-TW README parity drift from English (#3251)
- Comparison table: design systems 72 -> 129 (match EN README)
- Repository structure tree: add missing kami-deck.html template entry

Both were drift from the English README. The deeper EN-wide count
inconsistency (badge 149/131 vs body 72/31) is tracked in #3250.
2026-05-29 03:17:18 +00:00
Weston Houghton
20136c4da9
fix(skills): stream-copy fallback when skill staging hits cross-fs EPERM (#3249)
* fix(skills): fall back to a stream copy when skill staging hits EPERM

`fs.cp` copies each file with copy_file_range(2), which the kernel rejects
across some filesystem pairs — e.g. a container image layer (`/app`) copied
onto a ZFS/overlay bind mount (`/data`) — surfacing EPERM. Node doesn't fall
back to a userspace copy, so skill staging failed and degraded to absolute
paths, losing the `.od-skills` write barrier.

Retry recoverable copy errors (EPERM/EXDEV/ENOTSUP/EOPNOTSUPP) with a
dereferencing read/write copy that works across any source/dest filesystem;
non-recoverable errors still degrade as before. A test seam injects a
synthetic EPERM since the real errno only reproduces on those mounts.

* fix(skills): preserve source file mode in the EPERM stream-copy fallback

The cross-filesystem fallback copied contents with createWriteStream, which
opens the destination at the default 0644 and drops the source's exec bit.
Skills shell out to staged helper scripts (e.g.
skills/pptx-html-fidelity-audit/scripts/*.py), so on the EPERM/EXDEV path
that this fallback repairs, they would fail with EACCES.

chmod (masked to 0o777, so the agent-writable staging copy never inherits
setuid/setgid/sticky) + utimes each copied file from the source stat so the
fallback matches fs.cp's mode/timestamp preservation. Adds a regression test
that stages an executable fixture through the synthetic-EPERM seam and
asserts the exec bit survives.
2026-05-29 03:17:04 +00:00
lefarcen
08c350fb0f
fix(analytics): bucket feedback agent/model directly on the event (#3240)
* fix(analytics): bucket feedback agent/model directly on the event

Reason × agent / reason × model splits on
`assistant_feedback_reason_submit` were 25-74% `unknown` because the
event only carried `run_id` — analyses had to join back to
`run_created/run_finished`, which loses rows whenever the feedback is
given to a message whose run sits outside the query window (the common
case for feedback on older messages), and whose `model_id` was `null`
to begin with (the user didn't pick a specific model — went with the
agent's default).

Carry `agent_provider_id` and `model_id` directly on every feedback
event so the analyses no longer need to join. Replace `null/unknown`
with the `default` bucket via `modelIdForTracking` (and let
`agentIdToTracking` fall through to `other`) at every emit site —
`null` was an analyst-hostile mix of "no selection" and "join failed";
`default` is a real, analysable bucket. On `run_finished`, upgrade the
model to the agent-reported value from initializing/model status
events when the user did not pick one — covers ACP, claude-stream,
copilot-stream, json-event-stream, qoder, pi-rpc.

* fix(analytics): use feedbackAgentProviderIdToTracking and assistantFeedbackModelId for feedback events

Wire API-mode agent ids (anthropic-api → anthropic) and agentName-parsed
model ids through the feedback emit path. Previously the feedback props used
agentIdToTracking (no anthropic-api case) and assistantModelDetail (no
agentName fallback), causing model_id='default' and agent_provider_id='other'
for API-mode agents.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(analytics): extend feedback/run schema for full agent/model coverage

Layered on top of the conflict resolution and the v1 emit switchover
in 0c1b30440. Three things the prior commits did not cover:

1) The v2 `assistant_feedback_*` family (page='studio') shares
   `AssistantFeedbackBase`. Add `agent_provider_id` + `model_id` once on
   the base so all four derived emits (reason_view, click, reason_click,
   reason_submit) carry the same context as the v1 family, instead of
   leaving the v2 dashboard with the same `unknown` gap the v1 PR was
   trying to close.

2) Tighten `FeedbackSubmitResultProps.model_id` and
   `feedbackAgentProviderIdToTracking` from `string | null` /
   `TrackingFeedbackProviderId | null` to non-null. The web emit paths
   already bucket null/empty through `modelIdForTracking` and the
   `?? 'other'` fallback; collapsing that at the helper / contract
   layer means `null` becomes a TS error at every new emit site, so we
   can't regress the unknown bucket again in a future event.

3) Comment on `run_finished.model_id` so reviewers reading
   `finishedModelId` see why the agent-reported value upgrades the
   request-side one.

* fix(analytics): continue event scan past usage to find agent-reported model

The reverse scan for agentReportedModel was broken: the loop broke on
the first usage event (terminal) before ever reaching the status:initializing
or status:model event (emitted at run start, lower index). This meant
run_finished.model_id always fell through to modelIdForTracking(null) =
'default' for any run that reported usage tokens.

Fix: track haveUsageTokens as a flag and defer the break until both usage
tokens are found and either the model is not needed (user picked one) or
the agent-reported model has been captured. Extract the logic into
scanRunEventsForFinishedProps for unit testability.

Tests: six new cases in run-lifecycle-analytics.test.ts cover the
initializing→usage append order, ACP status:model, detail field fallback,
early exit when reqBodyModel is set, no-status event, and empty events.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(analytics): guard usage block with !haveUsageTokens to prevent early events overwriting terminal tokens

In the reverse-scan loop of scanRunEventsForFinishedProps, the usage block
lacked a !haveUsageTokens guard. When needAgentModel is true and the
agentReportedModel lives at the start of the run (lower index), the loop
walks all the way back past multiple usage events (one per step/turn in
multi-step runs), overwriting inputTokens/outputTokens on each pass. The
surviving values were those of the earliest step, not the terminal total.

Adding !haveUsageTokens to the usage block condition ensures only the first
(terminal) usage event seen in reverse sets the token counts; subsequent
earlier usage events are skipped while the scan continues for agentReportedModel.

Adds a test case for initializing(model) → usage(step1) → usage(terminal)
asserting both terminal token counts and agentReportedModel.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)
2026-05-29 03:06:06 +00:00
Nicholas-Xiong
45873a551b
fix: improve queue screenshot preview modal backdrop (#3215)
Increase the backdrop opacity from 44% to 75% and add a blur effect
to better separate the queue screenshot preview from the file-list
preview panel underneath. This prevents visual overlap and makes it
clearer which preview surface is active.

Fixes #3167
2026-05-29 03:04:28 +00:00
初晨
ef8f518b3b
Fix status detail URL parsing (#3208) 2026-05-29 03:03:46 +00:00
Nicholas-Xiong
98651ecae2
fix: localize queue UI strings in Chinese mode (#3213)
- Queued → 已排队
- to Send → 待发送
- Edit queued task → 编辑排队任务
- Save → 保存
- Cancel → 取消
- Edit → 编辑
- more queued → 个排队
- Queued follow-up → 已排队的后续任务

Fixes #3173
2026-05-29 03:03:27 +00:00
hahalolo
afc5f52445
Fix/issue/3149 (#3162)
* fix(docker): fix container startup crash due to missing OD_API_TOKEN

* fix(docker): forward OD_API_TOKEN to fix docker container boot loop

* fix(docker): enforce non-empty OD_API_TOKEN for docker-compose

* fix(deploy): automate OD_API_TOKEN generation in installer and close compose loop

* docs(readme): guide manual deployment users to configure OD_API_TOKEN

* docs(readme): align working directory paths for manual deployment instructions

* docs(readme): align working directory paths for manual deployment instructions

* docs(readme): restore git clone context for first-time users

* fix(web): add min-width constraints to plugin filter span and pill button

related issue 3149
2026-05-29 03:03:03 +00:00
open-design-bot[bot]
49573f031a
Update docs/assets/github-metrics.svg (#3159)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-29 03:02:19 +00:00
kami
1efa1dc7b5
Add preview iframe keep-alive pool (#2190)
* Add preview iframe keep-alive pool

* Fix active preview eviction on prompt context changes

* Evict preview iframes on skill/design-system registry edits

Bridge Settings → Skills / Design Systems to App.tsx so the keep-alive
pool drops any preview iframe whose project depends on the affected id
after every successful mutation. Without this, body-only edits leave
SkillSummary / DesignSystemSummary fields untouched and ProjectView's
signature-driven eviction never fires, so the active preview keeps
serving stale prompt context. The handler also re-fetches the App
shell's skill / design-system lists so summary-field changes propagate
to ProjectView's signature on the next render.

Also extend IframeKeepAlivePool.evictMatching with an includeActive
option so the new handler can drop the currently-visible iframe along
with parked ones; the fallback pool only ever holds active entries so
includeActive is a no-op there.

Regression tests:
- App.previewKeepAlive: clicking a Settings stub that fires
  onSkillsChanged / onDesignSystemsChanged drives evictMatching with
  includeActive=true and a predicate that matches projects using the
  affected id while skipping unrelated projects.
- SkillsSection: onSkillsChanged fires after a body-only edit and
  after a delete.

* fix: reattach active keep-alive iframe after eviction

* fix(web): refresh design systems after rename

---------

Co-authored-by: kami.c <kami.c@chative.com>
2026-05-29 03:01:17 +00:00
Amy
1c2a1c4459
Add launch review regression coverage and stabilize daemon tests (#3207)
* Add launch review E2E regression coverage

* Harden daemon launch review regressions

* Stabilize daemon runtime tests

* fix(tests): restore e2e preflight typing

Generated-By: looper 0.8.1 (runner=fixer, agent=codex)

* fix(tests): make fake plugin runtime ESM-safe

Generated-By: looper 0.8.1 (runner=fixer, agent=codex)

* Stabilize e2e fake agent and regression tests

* fix(tests): repair fake agent cjs runtime

Generated-By: looper 0.8.1 (runner=fixer, agent=codex)

* fix(review): harden plugin authoring checks

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)

* fix(tests): bind plugin authoring run to seeded conversation

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)
2026-05-29 02:39:33 +00:00
Jane
82203fe4a7
fix(landing-page): community page brand mark + license + nav trim + twitter handle (#3222)
* fix(landing-page): trim community page nav + correct twitter handle

Two small corrections to the static `apps/landing-page/public/community/index.html` page served at https://open-design.ai/community/:

- Drop the `Skills` and `Design systems` shortcuts from the page-level
  top nav. The site-wide topbar already routes to the unified
  `/plugins/` hub (Templates / Skills / Systems / Craft are all
  faceted from there since PR #2880 / #2926 / #2958), so the
  Contributors page nav exposing two of those four facets out-of-
  context reads as inconsistent — visitors who clicked through were
  bypassing the hub. Keep `Ambassadors` (in-page anchor), `GitHub`,
  and the Discord pill; everything else in this list is a
  contributor-facing destination.

- Update the footer X / Twitter link from `x.com/nexu_io` to
  `x.com/nexudotio`. The `@nexudotio` handle is the active product
  account; `@nexu_io` was a stale earlier handle.

No JS / build-pipeline change — the page is static HTML served from
`public/`, so the diff is three lines.

* fix(landing-page): swap community page brand mark from letter "O" to logo image

The Contributors page top nav rendered a hand-rolled black circle
with a white "O" letter inside as the brand mark, which doesn't
match the rest of the site (homepage / sub-page header both use the
same `/logo.webp` image). On a Contributors page where the goal is
to read as a first-party Open Design property, having a different
brand mark in the corner reads as a different site.

Replace the `<span class="brand-mark">O</span>` literal with an
`<img src="/logo.webp">` and rewrite the local `.brand-mark` /
`.brand-mark img` rules to match the homepage's pattern: an
inline-flex 22×22 wrapper with a 5px-radius image inside (≈22%
of side, the same app-icon silhouette convention `globals.css`
uses for the homepage 44×44 mark, scaled down).

The asset is the same `/logo.webp` already shipped in
`public/`, so no new file is added.

* fix(landing-page): correct community footer license string MIT → Apache-2.0

The Contributors page footer rendered `© 2026 Open Design · MIT-licensed
· Built by contributors, in public.` — but Open Design has shipped under
Apache-2.0 since launch (the repo `LICENSE`, every page footer elsewhere
on the site, and the in-product chrome all say Apache-2.0). MIT was a
copy-paste leftover from an older draft and is materially wrong: the
two licenses differ on patent grants and trademark / attribution
mechanics, so showing the wrong one to a contributor reading the page
could shape downstream reuse decisions.

Single-string change: `MIT-licensed` → `Apache-2.0`. Confirmed via grep
that no other reference to MIT remains in the landing-page tree.

---------

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-29 01:32:27 +00:00
Jane
6ac1450925
feat(landing-page): localize templates page chrome + FAQ + categories across 17 locales (#3218)
PR #3185 introduced 9 new copy keys for the templates grid chrome
(`templatesHeroEyebrow`, `templatesHeroLead`, `templatesCounterLabel`,
`cardFeaturedTag`, `cardReadFullPrompt`, `cardUseTemplate`,
`cardShareAria`, `faqHead`, `faqItems`) and used `pcopy.category[slug]`
labels and descriptions on the kind facets. The English base was
filled in but the per-locale `overrides` map was left as a follow-up,
so every non-English visitor saw English chrome on
`/<locale>/plugins/templates/` and English H1 + lead on
`/<locale>/plugins/templates/<kind>/` (except `zh`, which already
shipped a `category` override before PR #3185).

This change fills in all 17 non-English landing locales for those new
chrome keys, FAQ Q&A, and the artifact-category labels:
zh, zh-tw, ja, ko, de, fr, ru, es, pt-br, it, vi, pl, id, nl, ar,
tr, uk. Brand names (`Open Design`, `Claude`, `Claude Design`,
`Anthropic`, `OpenAI`, `HyperFrames`, `Cloudflare`, `Apache-2.0`,
`BYOK`, `PR`, `GitHub`) stay in English in every locale per the SEO
anchor strategy. Artifact category labels are localized with the
native-language word each design / dev community would actually
search for: `プロトタイプ` (ja), `프로토타입` (ko), `Prototyp` (de),
`Prototipo` (es), `Прототип` (ru), and so on. `zh` keeps its
existing `category` translation untouched since it was already
shipped — only the new chrome + FAQ keys land for that locale.

Translations were produced with `claude-haiku-4-5` via OpenRouter
and spot-checked against rendered pages on 5 locales (zh, ja, ko, de,
fr) for natural phrasing, brand-name preservation, and HTML-tag /
entity / variable integrity. The remaining 12 locales follow the
same prompt and are expected to be merge-ready as a v1; native
speakers in the community can refine wording later via small PRs
without coordinating across the whole grid.

Validation: pnpm --filter @open-design/landing-page typecheck ->
0 errors / 0 warnings; local dev (port 3062) renders 231 cards on
each of /zh/, /ja/, /ko/, /de/, /fr/ /plugins/templates/ with hero
eyebrow / H1 / counter / CTA / FAQ head / first FAQ Q all localized,
and /ja/plugins/templates/prototype/ H1 reads "プロトタイプ" with a
localized lead (was English on prod before this PR).

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-29 01:32:03 +00:00
Caprika
cd1790abab
Harden AMR Link startup model discovery (#3198)
2026-05-28 14:45:23 +00:00
Caprika
56fe5c5036
fix(amr): stage external image attachments into workspace (#3226)
* fix(daemon): forward AMR image attachments through ACP

* fix(amr): stage external image attachments into workspace

* fix(amr): stage prompt image paths safely
2026-05-28 14:44:00 +00:00
386 changed files with 37609 additions and 3963 deletions

@ -0,0 +1,23 @@
---
description: Open a first-contribution PR (or bug issue) on nexu-io/open-design — works for non-coders too.
argument-hint: "[skill | design-system | i18n | docs | bug — optional, otherwise the skill will ask]"
---
You are entering the **od-contribute** flow.
User input (may be empty): `$ARGUMENTS`
## What to do right now
1. **Load the skill** by invoking the `od-contribute` skill via the Skill tool. The skill owns the full execution playbook — do not reimplement it inline.
2. **Pass the user input forward**:
- If `$ARGUMENTS` is one of `skill`, `design-system`, `i18n`, `docs`, `bug` (or a recognizable Chinese / English equivalent), pre-select that branch and skip the type-picking `AskUserQuestion` in Step 2.
- Otherwise, the skill will ask the user via `AskUserQuestion`.
3. **Honor the interactive contract**:
- Always run the prerequisite check first (`gh` installed + authed). If it fails, surface the install/auth hint and stop — do not try workarounds.
- Always show the preview + require explicit confirmation before pushing or opening any PR/issue.
- At the end, print the PR or issue URL on its own line so the user can click through.
Begin by invoking the skill now.

@ -0,0 +1,320 @@
---
name: od-contribute
description: One-click contribution flow for Open Design (nexu-io/open-design) — even for non-coders. Pick one of four cards (ship a Skill or Design System you made with OD; translate docs; fix a typo / write a blog; report a bug), the agent validates and opens a PR (or issue) for you. Trigger words contribute to open design, ship my OD skill, ship my OD design system, translate OD docs, report an OD bug, od-contribute.
allowed-tools:
- Bash
- Read
- Write
- Edit
- AskUserQuestion
- TaskCreate
- TaskUpdate
- WebFetch
---
# od-contribute — first-contribution flow for Open Design
Locked to `nexu-io/open-design`. Branches by **contribution type**, not by issue. Replaces the dev-loop with type-specific no-code validators. Designed so a product user with zero coding background can ship a real PR.
## Language
Mirror the user's language in every user-facing message — `AskUserQuestion` labels and descriptions, status updates, error explanations. Detect from their first message; when uncertain, default to English.
**Generated artifacts (PR titles, commit messages, PR/issue body files, branch names) MUST be English** regardless of the user's chat language. GitHub conventions, maintainer review, and search all assume English. The templates under `templates/` are already English — keep them that way when rendering.
Scripts live under `scripts/`. Source the shared helpers from any script:
```bash
source "$(dirname "$0")/config.sh"
```
`SKILL_DIR` below = the directory that contains this `SKILL.md`.
---
## Step 1 — Prereq check (always first)
```bash
bash "$SKILL_DIR/scripts/check-prereqs.sh"
```
- Exit 0: capture `GH_USER=<login>` from stdout. Default `TARGET_FORK="${GH_USER}/open-design"`.
- Exit 2: surface the printed install / auth hint **verbatim** and stop. Do not attempt token workarounds.
If `gh repo view "$TARGET_FORK"` fails, ask the user (one `AskUserQuestion`) whether to fork now via `gh repo fork nexu-io/open-design --clone=false`. Default to yes.
## Step 2 — Pick contribution type
Single `AskUserQuestion` (header: "Contribution", multiSelect: false), four options. Translate option labels/descriptions into the user's chat language; the branch routing is unchanged.
1. **🎨 Ship something I made with OD** — _a Skill, Design System, HyperFrame, or template I want to contribute upstream_ → branch `3a`
2. **🌍 Translate OD docs** — _README / QUICKSTART / CONTRIBUTING into a new language_ → branch `3b`
3. **📝 Fix docs / write a blog / fix a typo** — _typo fix, dead link, use-case writeup_ → branch `3c`
4. **🐛 Report a bug** — _something broke; I'll help turn it into a high-quality issue_ → branch `3d` (issue path, no PR)
Each branch below is self-contained. Steps 7–8 (preview + push) are shared across branches `3a`/`3b`/`3c`. Branch `3d` skips them entirely.
---
### Step 3a — OD product submission (Skill / Design System)
**3a.1** Ask user: "What's the local path to the artifact you want to ship?" (single free-text, translated into the user's chat language). Common: a folder path (Skill) or a single `DESIGN.md` file (Design System).
**3a.2** Sniff type:
```bash
# Skill: folder containing SKILL.md with frontmatter.
# Design System: file matching DESIGN.md anatomy.
```
If ambiguous, ask the user to confirm.
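The sniffing rule above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual logic: `sniff_artifact_type` is a hypothetical helper name, and it assumes that "has frontmatter" can be approximated by a first line of `---` and that "matches DESIGN.md anatomy" can be approximated by the file name alone.

```bash
# Hypothetical helper (not part of the skill's scripts): classify a local
# artifact path as "skill", "design-system", or "ambiguous".
sniff_artifact_type() {
  local path="$1"
  if [ -d "$path" ] && [ -f "$path/SKILL.md" ] \
     && [ "$(head -n 1 "$path/SKILL.md")" = "---" ]; then
    # A folder whose SKILL.md opens with a frontmatter fence.
    echo "skill"
  elif [ -f "$path" ] && [ "$(basename "$path")" = "DESIGN.md" ]; then
    # A single file named DESIGN.md (assumption: the name is enough here).
    echo "design-system"
  else
    # Anything else falls through to the "ask the user" case.
    echo "ambiguous"
  fi
}
```

Anything reported as `ambiguous` still goes back to the user for confirmation, as the step requires.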
**3a.3** Run setup:
```bash
bash "$SKILL_DIR/scripts/setup-workspace.sh" skill <slug>
# or
bash "$SKILL_DIR/scripts/setup-workspace.sh" design-system <slug>
```
`<slug>` is `od::slugify` of the Skill `name` frontmatter field or of the brand name. Capture `WORKDIR` from stdout.
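The real `od::slugify` lives in the shared helpers and its implementation is not shown here. As a rough idea of what such a function typically does, here is a sketch under assumed rules (lowercase, collapse every run of non-alphanumerics into a single hyphen, trim hyphens at both ends). The name `slugify_sketch` is hypothetical.

```bash
# Hypothetical stand-in for od::slugify; the real rules may differ.
slugify_sketch() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

slugify_sketch "My Cool Skill!"   # prints: my-cool-skill
```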
**3a.4** Copy artifact into workspace at the right target dir:
- Skill → `$WORKDIR/skills/<slug>/`
- Design System → `$WORKDIR/design-systems/<brand-slug>/DESIGN.md` (+ any sibling assets in the same folder)
**3a.5** Validate:
```bash
bash "$SKILL_DIR/scripts/validate-skill-submission.sh" "$WORKDIR/skills/<slug>"
# or, with 1-2 reference DESIGN.md files passed in:
bash "$SKILL_DIR/scripts/validate-design-system.sh" \
"$WORKDIR/design-systems/<slug>/DESIGN.md" \
--reference "$WORKDIR/design-systems/airbnb/DESIGN.md" \
--reference "$WORKDIR/design-systems/apple/DESIGN.md"
```
If validation fails, surface the FAIL lines verbatim, ask the user to fix, retry. **Never push a failing artifact.**
**3a.6** Ask 3 short questions via `AskUserQuestion` (translate the labels into the user's chat language):
- "What name should we credit you under in the PR?" — free-text
- "One-line pitch for this Skill / Design System?" — free-text
- "Path to a screenshot (optional)?" — free-text
**3a.7** Render `templates/PR-BODY-skill.md` (or `PR-BODY-design-system.md`) with substitutions:
- `{{SKILL_NAME}}`, `{{SKILL_SLUG}}` (or `{{BRAND_NAME}}`, `{{BRAND_SLUG}}`)
- `{{PITCH}}` (the one-line)
- `{{MOTIVATION}}` (free-text — agent can offer to draft this from the skill body, but user confirms)
- `{{TRY_PROMPT}}` (a prompt they recommend trying — agent suggests a default, user confirms)
- `{{SCREENSHOT_BLOCK}}` (Markdown image block if a screenshot path was given, else empty)
- `{{DISCORD_INVITE}}` from `$OD_DISCORD_INVITE`
Write to `$WORKDIR/.od-contrib/PR-BODY.md`.
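How the `{{PLACEHOLDER}}` substitution is performed is not specified above. One possible approach, sketched here with a hypothetical `render_template` helper, is plain Bash pattern substitution over `KEY=VALUE` pairs. Quoting both the pattern and the replacement keeps characters such as `&`, `/`, or `*` in user-supplied text literal.

```bash
# Hypothetical helper: render_template <template-file> KEY=VALUE...
# Replaces every {{KEY}} in the template with VALUE and prints the result.
render_template() {
  local body pair key value needle
  body="$(cat "$1")" || return 1
  shift
  for pair in "$@"; do
    key="${pair%%=*}"      # text before the first '='
    value="${pair#*=}"     # text after the first '='
    needle="{{${key}}}"
    # Quoted pattern and replacement are both taken literally.
    body=${body//"$needle"/"$value"}
  done
  printf '%s\n' "$body"
}
```

A call might look like `render_template "$SKILL_DIR/templates/PR-BODY-skill.md" "PITCH=$pitch" > "$WORKDIR/.od-contrib/PR-BODY.md"`, where `$pitch` is an assumed variable holding the user's one-line answer.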
→ Jump to **Step 7**.
---
### Step 3b — i18n translation
**3b.1** Setup workspace (slug = `translate-<doc>-<lang>` if known, else `translate`):
```bash
bash "$SKILL_DIR/scripts/setup-workspace.sh" i18n translate
# capture WORKDIR
```
**3b.2** Discover gaps:
```bash
bash "$SKILL_DIR/scripts/discover-i18n-gaps.sh" "$WORKDIR" > /tmp/od-i18n-gaps.json
```
Each line is JSON. Rank by:
- `status: "missing"` first (missing language is highest leverage)
- then `status: "stale"` ordered by `english_commits_since_translation` desc
- README family before QUICKSTART before CONTRIBUTING
**3b.3** Take the top 3–4 gaps and present via `AskUserQuestion` (header: "Translation target"). Each option label like: `README → 한국어 (Korean)` / `QUICKSTART (zh-CN) refresh — 12 commits behind`. Translate the header text into the user's chat language but keep the option labels descriptive (the language names belong in their native script).
**3b.4** Once user picks, **rename branch** to be specific:
```bash
git -C "$WORKDIR" branch -m "od-contrib/i18n/<doc>-<lang>-<date>"
```
(or pre-set the slug in step 3b.1 if the user confirmed earlier.)
**3b.5** Translate. Read the English source. Translate **structure-preserving**:
- Code blocks: leave untranslated
- Brand / product names: leave untranslated
- Filenames in inline code: leave untranslated
- Image / link targets: leave untranslated; if a localized version of a linked doc exists, swap the link to the localized file
- Headings: translate, keep the heading depth identical
- Tables: translate cell text only, keep alignment / pipes
Write the result to `$WORKDIR/<TRANSLATED_PATH>` (e.g. `QUICKSTART.es.md`). Show user a unified diff vs. the English source for visual sanity-check (line-count delta within ±15% is a healthy signal).
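The ±15% line-count signal mentioned above can be checked with integer arithmetic alone. The sketch below uses a hypothetical `line_delta_ok` helper and treats the threshold as a soft sanity check, as the step describes, not as a hard validator rule.

```bash
# Hypothetical helper: succeed when the translated file's line count is
# within 15% of the English source's line count.
line_delta_ok() {
  local src="$1" dst="$2"
  local src_lines dst_lines delta
  # $(( ... )) also strips the padding some `wc` builds print.
  src_lines=$(( $(wc -l < "$src") ))
  dst_lines=$(( $(wc -l < "$dst") ))
  [ "$src_lines" -gt 0 ] || return 1
  delta=$(( dst_lines - src_lines ))
  if [ "$delta" -lt 0 ]; then delta=$(( -delta )); fi
  # |delta| / src <= 15%  is the same as  |delta| * 100 <= src * 15.
  [ $(( delta * 100 )) -le $(( src_lines * 15 )) ]
}
```

For example, a 100-line source paired with a 110-line translation passes, while a 120-line translation does not and is worth a second look before validation.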
**3b.6** Validate the translated file against the English source. The `--reference` flag tells the validator to ignore relative refs that were already broken in the source — OD docs frequently link to website route slugs (e.g. `skills/blog-post/`) that aren't files on disk; we don't want a structure-preserving translation to fail because of pre-existing dead refs.
```bash
bash "$SKILL_DIR/scripts/validate-markdown.sh" \
"$WORKDIR/<TRANSLATED_PATH>" \
--reference "$WORKDIR/<ENGLISH_PATH>"
```
If FAIL → surface verbatim, fix, retry.
**3b.7** Render `templates/PR-BODY-i18n.md` with `{{DOC_NAME}}`, `{{LANG_DISPLAY_NAME}}`, `{{LANG_CODE}}`, `{{TRANSLATED_PATH}}`, `{{ENGLISH_PATH}}`, `{{STATUS}}`, `{{TRANSLATION_NOTES}}` (one paragraph from the agent: anything tricky, untranslated terms it kept, etc.), `{{DISCORD_INVITE}}`.
→ Jump to **Step 7**.
---
### Step 3c — Docs / blog / typo
**3c.1** Setup workspace (slug `docs`):
```bash
bash "$SKILL_DIR/scripts/setup-workspace.sh" docs <slug>
```
**3c.2** Ask user (one `AskUserQuestion`):
1. **Auto-discover small fixes** (run discover-doc-gaps, pick something)
2. **I have a specific fix in mind** (free-text)
3. **I want to write a blog / case study** (free-text — what's the use case?)
**3c.3 (Auto-discover branch)** Run:
```bash
bash "$SKILL_DIR/scripts/discover-doc-gaps.sh" "$WORKDIR" > /tmp/od-doc-gaps.json
```
Group by `kind` (typo / deadlink / todo). Show the user up to 6 candidates via `AskUserQuestion`. Once picked, apply the fix in code (typo: replace word; deadlink: ask user for the new URL; todo: that's a proper task, ask user to write the missing prose).
**3c.4 (Specific-fix branch)** Read the file, apply user's described change. Confirm via diff.
**3c.5 (Blog branch)** First check whether OD has a blog directory:
```bash
ls "$WORKDIR/docs" 2>/dev/null
```
If a `docs/blog/` or similar exists, place the new post there. If not, ask the user where it should live, defaulting to `docs/<slug>.md`. Generate an outline → user fills in user-specific bits (their use case, screenshots, the prompt they used, the rendered output) → agent stitches it into a final Markdown file.
**3c.6** Validate every changed/added file. For files that already exist in the repo (typo fix, dead-link fix, doc edit), pass `--reference` pointing at HEAD's version so we only fail on relative refs the user *introduced*, not on pre-existing route slugs:
```bash
# For modifications to existing files:
git -C "$WORKDIR" show "HEAD:<path>" > "/tmp/od-contrib-orig-<basename>" 2>/dev/null
bash "$SKILL_DIR/scripts/validate-markdown.sh" \
"$WORKDIR/<changed-path>" \
--reference "/tmp/od-contrib-orig-<basename>"
# For brand-new files (e.g. a blog post the user is creating from scratch),
# omit --reference. The validator will skip the relative-ref check entirely
# (since it can't tell route slugs from real paths in isolation).
```
**3c.7** Render `templates/PR-BODY-docs.md` with `{{ONE_LINE_SUMMARY}}`, `{{DETAILS}}`, `{{FILES_LIST}}`, `{{DISCORD_INVITE}}`.
→ Jump to **Step 7**.
---
### Step 3d — Bug report (issue path, no PR)
**3d.1** Read OD's actual schema at runtime to make sure we mirror it:
```bash
gh api "repos/${TARGET_REPO}/contents/.github/ISSUE_TEMPLATE/bug-report.yml" --jq .content | base64 -d > /tmp/od-bug-report.yml
```
If the schema has drifted from the template (`templates/ISSUE-BODY-bug.md`), regenerate the body to match.
**3d.2** Ask the user via `AskUserQuestion`, one structured prompt per critical field. Use **plain language**, not the YAML field names:
| Bug-report field | Prompt to user |
|---|---|
| `description` | "What went wrong? One sentence is fine." |
| `steps` | "How can I reproduce it? Walk me through step by step." |
| `expected` | "What did you expect to happen?" |
| `version` | "Which OD version are you running? (About menu, or `od --version`)" |
| `platform` | dropdown: macOS (Apple Silicon) / macOS (Intel) / Windows / Linux / Other |
| `logs` | "Any error logs you can paste? Skip if you don't have them." |
| `screenshots` | "Path to a screenshot? Skip if you don't have one." |
Translate every prompt above into the user's chat language at runtime.
**3d.3** Auto-collect what we can (no need to ask the user for these):
- OS family from `uname`
- Node version from `node -v` if relevant
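The auto-collected OS family can also pre-fill the `platform` dropdown from 3d.2. Below is a sketch of one possible mapping from `uname -s` / `uname -m` to the dropdown labels. `platform_label` is a hypothetical helper, and the assumption that `arm64` means Apple Silicon while every other macOS architecture is Intel is a simplification.

```bash
# Hypothetical helper: map `uname` output to the bug-report platform labels.
# Optional arguments override the detected values, which keeps it testable.
platform_label() {
  local os="${1:-$(uname -s)}" arch="${2:-$(uname -m)}"
  case "$os" in
    Darwin)
      if [ "$arch" = "arm64" ]; then
        echo "macOS (Apple Silicon)"
      else
        echo "macOS (Intel)"
      fi
      ;;
    Linux) echo "Linux" ;;
    MINGW*|MSYS*|CYGWIN*) echo "Windows" ;;   # common Windows shell layers
    *) echo "Other" ;;
  esac
}
```

The user should still see the pre-selected value and be able to change it, since the machine running the agent is not necessarily the one where the bug occurred.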
**3d.4** Dedupe: extract 3–5 keywords from the description, run:
```bash
gh search issues "<keywords>" --repo "$TARGET_REPO" --state open --limit 5 --json number,title,url
```
If matches exist, present them to the user via `AskUserQuestion` (translate to user's language): "These existing issues look related. Do you want to: (a) comment on an existing one, (b) open a new issue anyway, (c) cancel?"
**3d.5** If proceeding with new issue, render `templates/ISSUE-BODY-bug.md` and submit:
```bash
bash "$SKILL_DIR/scripts/create-issue.sh" \
--title "$TITLE" \
--body-file "$WORKDIR_OR_TMP/.od-contrib/ISSUE-BODY.md" \
--dedupe-keywords "<keywords>"
```
**3d.6** Print the issue URL on its own line. **Do not** push branches or open PRs from this branch.
---
## Step 7 — Preview + confirm (shared, PR branches only)
Show the user a clean summary:
```text
About to commit:
Branch: od-contrib/<type>/<slug>-<date>
Files:
+ skills/foo/SKILL.md (1.2 KB)
+ skills/foo/preview.png (54 KB)
Push to: <fork or upstream>
Open PR: nexu-io/open-design:main ← <fork>:<branch>
```
Then `git -C "$WORKDIR" diff --stat` and a `head -40` of the rendered PR body for visual sanity.
Required `AskUserQuestion` confirmation (translate to user's language): **"Push this PR?"** with three options:
- **Ship it** — proceed to Step 8
- **Let me revise** — return to the relevant Step 3 sub-step
- **Cancel** — leave the workspace on disk, tell the user the path so they can return later, exit
Never push without an explicit "Ship it".
## Step 8 — Push & open PR
```bash
bash "$SKILL_DIR/scripts/create-pr.sh" \
--workdir "$WORKDIR" \
--type "<skill|design-system|i18n|docs>" \
--title "<PR title from references/newcomer-tone.md>" \
--body-file "$WORKDIR/.od-contrib/PR-BODY.md"
```
Print the PR URL on its own line. Done.
---
## Safety rails (mandatory)
- Never push to `main` / `master` / `develop`. The push scripts refuse.
- Never `--force` push. Just don't.
- All workspace activity stays under `$OD_WORK_ROOT` (default `$HOME/od-contrib-work`). `od::assert_in_workroot` enforces this.
- Bug-report path **always** runs the dedupe search before `gh issue create`.
- Honor user memory: skip GitHub user `xxiaoxiong` from any contributor lookup ([[feedback_no_outreach_xxiaoxiong]]).
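The workroot rail above boils down to a path-prefix check. A minimal sketch of that idea, assuming plain string paths; `in_workroot` is a hypothetical name, and the extra rejection of `..` segments is an addition for illustration, not something the shipped `od::assert_in_workroot` is known to do:

```bash
#!/usr/bin/env bash
# Hypothetical guard: succeed only when $2 is a path strictly inside root $1.
# Rejecting ".." segments is an assumption added here, not taken from the skill.
in_workroot() {
  local root="${1%/}" path="$2"
  case "$path" in
    *"/../"*|*"/..") return 1 ;;  # parent-directory segments could escape the root
  esac
  case "$path" in
    "$root"/*) return 0 ;;        # anything below the root is allowed
    *) return 1 ;;                # the root itself and everything else is refused
  esac
}
```

A caller would use it as `in_workroot "$OD_WORK_ROOT" "$WORKDIR" || exit 1` before touching any files.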
## When NOT to use this skill
- The user wants to fix a daemon / web bug or add a feature with code changes → use `auto-github-contributor` instead (it has the TDD loop). This skill deliberately doesn't run lint/typecheck/tests because content paths don't need them.
- The user wants to *generate* a Skill / Design System from scratch → that's Open Design itself. Run OD first, get an artifact, then come back here to ship it.

@ -0,0 +1,13 @@
# Codex CLI sidecar (optional). Adds a friendlier picker entry when this skill
# is loaded by Codex from ~/.agents/skills/od-contribute/ or .agents/skills/.
# Not required — Codex loads SKILL.md regardless.
interface:
display_name: "Open Design — Contribute"
short_description: "Ship a Skill / Design System / translation / typo fix to nexu-io/open-design without writing code."
default_prompt: "I want to contribute to Open Design."
policy:
# Allow Codex to surface this skill when the user mentions OD contribution
# without an explicit `$od-contribute` invocation. Keep on — it's the whole point.
allow_implicit_invocation: true

@ -0,0 +1,136 @@
#!/usr/bin/env bash
# OD Contribute installer — self-bootstrapping.
# Fetches the latest od-contribute skill from nexu-io/open-design and installs
# it into every supported AI agent's home directory.
#
# Two ways to run this:
#
# 1) Tell your AI agent (Claude Code / Codex / Cursor / etc.) in the chat:
#
# curl -sSL https://raw.githubusercontent.com/nexu-io/open-design/main/.claude/skills/od-contribute/install.sh | bash
#
# The agent's Bash tool runs this. You never open a terminal yourself.
#
# 2) Or paste that same one-liner into a terminal directly, if you prefer.
#
# Targets installed:
# ~/.claude/skills/od-contribute/ Claude Code (native skill format)
# ~/.claude/commands/od-contribute.md Claude Code slash command
# ~/.agents/skills/od-contribute/ Codex CLI (canonical path)
# ~/.codex/skills/od-contribute/ Codex CLI (legacy, only if ~/.codex exists)
#
# Override the source branch with OD_CONTRIBUTE_BRANCH=feat/foo (default: main).
set -euo pipefail
REPO="nexu-io/open-design"
BRANCH="${OD_CONTRIBUTE_BRANCH:-main}"
cyan() { printf '\033[36m%s\033[0m\n' "$*"; }
green() { printf '\033[32m%s\033[0m\n' "$*"; }
gray() { printf '\033[90m%s\033[0m\n' "$*"; }
die() { printf '\033[31m[error]\033[0m %s\n' "$*" >&2; exit 1; }
cyan "Installing OD Contribute skill from ${REPO}@${BRANCH}..."
command -v curl >/dev/null 2>&1 || die "curl is required."
command -v tar >/dev/null 2>&1 || die "tar is required."
TMPDIR="$(mktemp -d)"
trap 'rm -rf "$TMPDIR"' EXIT
# Tarball download — no `git clone` needed (works in env without git).
TARBALL="$TMPDIR/repo.tar.gz"
curl -fsSL "https://github.com/${REPO}/archive/refs/heads/${BRANCH}.tar.gz" -o "$TARBALL" \
|| die "failed to fetch ${REPO}@${BRANCH} (branch may not exist)"
# Extract just the two paths we need. GitHub tarballs name the root dir
# <repo>-<branch>/, with slashes in branch names converted to dashes.
TARBALL_ROOT="open-design-${BRANCH//\//-}"
tar -xzf "$TARBALL" -C "$TMPDIR" \
"${TARBALL_ROOT}/.claude/skills/od-contribute" \
"${TARBALL_ROOT}/.claude/commands/od-contribute.md" \
2>/dev/null || die "skill files not found in tarball — branch may have different layout"
SKILL_SRC="$TMPDIR/${TARBALL_ROOT}/.claude/skills/od-contribute"
CMD_SRC="$TMPDIR/${TARBALL_ROOT}/.claude/commands/od-contribute.md"
[[ -f "$SKILL_SRC/SKILL.md" ]] || die "SKILL.md missing at expected path"
[[ -f "$CMD_SRC" ]] || die "slash command missing at expected path"
install_skill_to() {
local dest="$1" label="$2"
# Preserve user-local state across reinstall/upgrade. Re-running this script
# is the documented upgrade path ("re-run to pull the latest skill from
# main"), so anything the user wrote here that ISN'T part of the skill
# itself must survive `rm -rf`. Today that's just `.gh-token` (sandboxed
# agents like Codex.app / Cursor write a GitHub token here when they can't
# reach the macOS keychain — see check-prereqs.sh's hint and config.sh's
# fallback). Add new state filenames to PRESERVE if we ever introduce more.
local PRESERVE=(.gh-token)
local stash=""
local f
for f in "${PRESERVE[@]}"; do
if [[ -f "$dest/$f" ]]; then
[[ -z "$stash" ]] && stash="$(mktemp -d)"
cp -p "$dest/$f" "$stash/$f"
fi
done
rm -rf "$dest"
mkdir -p "$dest"
cp -R "$SKILL_SRC/." "$dest/"
# Restore preserved state. The mode preservation (`cp -p` above + this
# explicit chmod) keeps tokens at 600.
if [[ -n "$stash" ]]; then
for f in "${PRESERVE[@]}"; do
if [[ -f "$stash/$f" ]]; then
cp -p "$stash/$f" "$dest/$f"
chmod 600 "$dest/$f" 2>/dev/null || true
fi
done
rm -rf "$stash"
fi
# Ensure scripts retain executable bit (tar usually preserves; defense in depth).
find "$dest" -name '*.sh' -exec chmod +x {} + 2>/dev/null || true
green "$label"
gray " $dest"
}
# --- Claude Code (native, always install) -----------------------------------
install_skill_to "$HOME/.claude/skills/od-contribute" "Claude Code skill"
mkdir -p "$HOME/.claude/commands"
cp "$CMD_SRC" "$HOME/.claude/commands/od-contribute.md"
green " ✓ Claude Code slash command (/od-contribute)"
gray " $HOME/.claude/commands/od-contribute.md"
# --- Codex CLI (canonical) --------------------------------------------------
install_skill_to "$HOME/.agents/skills/od-contribute" "Codex CLI skill (~/.agents/skills/)"
# --- Codex CLI (legacy) — only if user already has Codex --------------------
if [[ -d "$HOME/.codex" ]]; then
install_skill_to "$HOME/.codex/skills/od-contribute" "Codex CLI skill (legacy ~/.codex/skills/)"
fi
echo
green "Done."
echo
cyan "How to use it:"
cat <<'EOF'
In Claude Code: type /od-contribute in any chat.
In Codex CLI: type $od-contribute or pick "Open Design — Contribute" from /skills.
In other agents: ask the agent to follow ~/.claude/skills/od-contribute/SKILL.md
The skill walks you through one of:
* shipping a Skill or Design System you made with Open Design
* translating a doc to a new language
* fixing a typo or writing a use-case blog
* reporting a clean bug
Need help? Open Design Discord: https://discord.gg/qhbcCH8Am4
EOF

@ -0,0 +1,51 @@
# What an OD design-system folder looks like
Reference for the `od-contribute` skill's `validate-design-system.sh` step.
> **Authoritative source**: read 1–2 existing folders under `design-systems/` in `nexu-io/open-design` at runtime — the conventions evolve as new systems land.
## Minimum viable design system
```
design-systems/<brand-slug>/
└── DESIGN.md # required — the brand brief OD loads
```
A few systems include extras: `components.html`, `tokens.css`. These are optional, referenced from `DESIGN.md` if present.
## DESIGN.md structure (observed convention)
H1 with the brand name, then a blockquote with category + one-sentence pitch, then numbered H2 sections. Looking at established systems (`airbnb`, `apple`, etc.), the typical section list is:
```markdown
# Design System Inspired by <Brand>
> Category: <e.g. E-Commerce & Retail>
> <one-sentence pitch>
## 1. Visual Theme & Atmosphere
## 2. Color Palette & Roles
## 3. Typography
## 4. Layout & Spacing
## 5. Components
## 6. Motion & Interaction
## 7. Iconography & Imagery
## 8. Voice & Tone
## 9. Edge Cases & Variations
```
Section ordering and exact titles vary — the validator only checks **structural overlap with reference systems**, not exact heading text. ≥30% overlap with the union of headings from existing systems is enough to pass.
## What the validator actually enforces
1. File is non-empty and has at least one H1.
2. ≥30% heading overlap with reference DESIGN.md files (when references are passed in).
3. No `../` relative paths that would resolve outside `design-systems/<brand>/`.
That's deliberately loose — DESIGN.md is a creative brief, not a schema.
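The heading-overlap rule can be sketched as follows. This is an illustration under stated assumptions, not the code of `validate-design-system.sh`: it compares only H2 headings, normalizes them by lowercasing and stripping leading `N.` numbering, and defines overlap as the share of the candidate's headings that also appear in one reference file. `headings_of` and `overlap_pct` are hypothetical names.

```bash
#!/usr/bin/env bash
# Print the normalized H2 headings of a Markdown file, one per line, sorted and unique.
# Normalization (an assumption): drop "## " and any leading "N. ", then lowercase.
headings_of() {
  sed -nE 's/^##[[:space:]]+([0-9]+\.[[:space:]]*)?(.*)$/\2/p' "$1" \
    | tr '[:upper:]' '[:lower:]' | sort -u
}

# Print the integer percentage of the candidate's headings found in the reference.
overlap_pct() {
  local cand="$1" ref="$2" total shared ref_list
  total=$(( $(headings_of "$cand" | wc -l) ))
  if [ "$total" -eq 0 ]; then
    echo 0
    return 0
  fi
  ref_list="$(mktemp)"
  headings_of "$ref" > "$ref_list"
  # -F fixed strings, -x whole-line match, -c count of matching lines
  shared="$(headings_of "$cand" | grep -Fxc -f "$ref_list" || true)"
  rm -f "$ref_list"
  echo $(( shared * 100 / total ))
}
```

With this definition a candidate would pass the 30% rule when `overlap_pct candidate/DESIGN.md reference/DESIGN.md` prints 30 or more; comparing against the union of several references would mean concatenating their heading lists first.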
## Don'ts
- Don't reference assets outside the brand folder.
- Don't paste binary fonts; use a CSS `@font-face` reference and let OD resolve at runtime.
- Don't use real customer logos / proprietary brand assets you don't have rights to (the validator won't catch this — it's a maintainer-review concern).

@ -0,0 +1,42 @@
# Newcomer tone — voice rules for PR / issue text
Per user feedback ([[feedback_outreach_minimal]]), keep it minimal. The PR body is the **only** place we get to shape the maintainer's first impression of this contributor — make it warm, brief, and useful.
## Hard rules
1. **Always end the PR body with two things:**
- "👋 This is my first OD contribution." (or a similar one-line warmth)
- The OD Discord invite: <https://discord.gg/qhbcCH8Am4> (read from `OD_DISCORD_INVITE` env, never hardcode)
2. **Never claim more than the PR actually does.** A typo fix is a typo fix — don't dress it up as "improving documentation quality" or list 5 fake checkboxes.
3. **Plain language only.** No "ergonomic", "DX", "stakeholder", "stack rank". Talk like a friendly user, not a startup blog.
4. **No emojis except the closing 👋 and one optional 🎨 / 🌍 / 📝 / 🐛 in the title or first line.** OD is design-loving but the maintainers read a *lot* of PRs.
## Soft rules
- Lead with **what changed**, not why or how. Maintainers can read the diff for the how.
- "Why" gets at most 2–3 sentences. If it needs more, the work is too big for this skill — open an issue instead.
- One screenshot if the change is visible. Zero is fine.
- The "checklist" should reflect what the validator actually checked, not a generic ceremonial list.
## Anti-patterns (do not do these)
- **Don't** write an "ask" section. Don't say "please review when you have time" — the PR is the ask.
- **Don't** invite the maintainer to call / DM you. Discord is the channel.
- **Don't** apologize. ("Sorry if this isn't right" — the maintainer will tell you if it isn't.)
- **Don't** include a "TL;DR" — if the summary needs a TL;DR, the summary is too long.
## Title conventions (for `git commit` and `gh pr create --title`)
| Type | Format | Example |
|---|---|---|
| Skill | `Add Skill: <name>` | `Add Skill: invoice-template` |
| Design System | `Add Design System: <brand>` | `Add Design System: notion` |
| i18n | `Translate <doc> to <Lang>` | `Translate QUICKSTART to Spanish` |
| i18n (refresh) | `Update <Lang> translation of <doc>` | `Update zh-CN translation of README` |
| Docs typo | `Fix typo in <file>` | `Fix typo in README.md` |
| Docs other | `<verb> <noun> in <where>` | `Clarify daemon setup in QUICKSTART` |
| Bug (issue title) | `<observed> on <surface>` | `Preview iframe is blank on Safari 17` |
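The fixed-format rows of the table above can be expressed as a small lookup. A sketch only: `pr_title` and its kind names (`i18n-refresh`, `typo`) are hypothetical, and the free-form "Docs other" and bug-title rows are left to the author on purpose.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a title from the conventions table.
# $1 = kind, $2 = subject (skill name, brand, doc, or file), $3 = language (i18n kinds only)
pr_title() {
  local kind="$1" subject="$2" lang="${3:-}"
  case "$kind" in
    skill)         printf 'Add Skill: %s\n' "$subject" ;;
    design-system) printf 'Add Design System: %s\n' "$subject" ;;
    i18n)          printf 'Translate %s to %s\n' "$subject" "$lang" ;;
    i18n-refresh)  printf 'Update %s translation of %s\n' "$lang" "$subject" ;;
    typo)          printf 'Fix typo in %s\n' "$subject" ;;
    *)             return 1 ;;  # free-form titles are written by hand
  esac
}
```

For example, `pr_title i18n QUICKSTART Spanish` yields the same string as the table's i18n example.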
## When to ask before writing
If the user wants to ship something whose tone is unusual (a manifesto blog post, a contentious refactor, naming a brand after a real company without rights), pause and ask the user. Better to skip the PR than ship something the maintainer will close politely.

@ -0,0 +1,38 @@
# OD repo map — what goes where
Mirrors `nexu-io/open-design` `CONTRIBUTING.md` so the skill doesn't need to re-fetch it on every run. **If this drifts from upstream CONTRIBUTING.md, upstream wins** — re-read the live file when in doubt.
## Three high-leverage contribution surfaces (per OD's CONTRIBUTING.md)
| If you want to… | You're really adding | Where it lives | Ship size |
|---|---|---|---|
| Make OD render a new kind of artifact | a **Skill** | `skills/<your-skill>/` | one folder, ~2 files |
| Make OD speak a new brand's visual language | a **Design System** | `design-systems/<brand>/DESIGN.md` | one Markdown file |
| Hook up a new coding-agent CLI | an **Agent adapter** | `apps/daemon/src/agents.ts` | ~10 lines (code — out of scope for this skill) |
| Improve docs, port a section to fr / de / zh-CN, fix typos | docs | `README.md`, `README.fr.md`, `README.de.md`, `README.zh-CN.md`, `docs/`, `QUICKSTART.md` | one PR |
## Localized doc files we know about
| Doc family | English source | Translations seen on disk (as of plan time) |
|---|---|---|
| README | `README.md` | ar, de, es, fr, ja-JP, ko, pt-BR, ru, tr, uk, zh-CN, zh-TW |
| QUICKSTART | `QUICKSTART.md` | de, fr, ja-JP, pt-BR, zh-CN, zh-TW |
| CONTRIBUTING | `CONTRIBUTING.md` | de, fr, ja-JP, pt-BR, zh-CN |
| MAINTAINERS | `MAINTAINERS.md` | de, fr, ja-JP, pt-BR, zh-CN |
The script `discover-i18n-gaps.sh` does NOT trust this table — it scans the workspace at runtime. Use this list only when you need to seed an `AskUserQuestion` card without a workspace.
## Issue templates
- `bug-report.yml` — required fields: description, steps to reproduce, expected, version, platform.
- `feature-request.yml` — out of scope for this skill (feature requests should come from product, not be auto-routed).
- `preview-v0.8.0-feedback.yml` — branch-specific.
## Out-of-scope surfaces (don't touch from this skill)
- `apps/daemon/src/` — daemon code. Requires real review.
- `apps/web/src/` — web app code. Requires real review.
- `packages/`, `plugins/`, `tools/` — internal libs.
- `e2e/` — Playwright-driven; non-trivial to author.
If a user asks to contribute to those surfaces, suggest the original `auto-github-contributor` skill (TDD pipeline) instead.
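The in-scope and out-of-scope lists above can be turned into a quick path check. A sketch under assumptions: `contribution_scope` is a hypothetical helper, its patterns are copied from this map rather than from upstream, and anything not listed here is reported as `unknown` instead of being guessed.

```bash
#!/usr/bin/env bash
# Hypothetical classifier: decide whether a repo-relative path belongs to this skill.
contribution_scope() {
  case "$1" in
    skills/*|design-systems/*|docs/*|README*.md|QUICKSTART*.md)
      echo in-scope ;;
    apps/daemon/src/*|apps/web/src/*|packages/*|plugins/*|tools/*|e2e/*)
      echo out-of-scope ;;   # code surfaces: suggest auto-github-contributor instead
    *)
      echo unknown ;;        # not covered by this map; ask the user
  esac
}
```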

@ -0,0 +1,53 @@
# What an OD skill folder looks like
Reference for the `od-contribute` skill's `validate-skill-submission.sh` step and for guiding a user through assembling a Skill submission.
> **Authoritative source**: read 1–2 existing folders under `skills/` in `nexu-io/open-design` at runtime — conventions evolve faster than this doc.
## Minimum viable skill
```
skills/<your-skill>/
└── SKILL.md # required, must have YAML frontmatter
```
That's it. Many of the simplest skills in OD are exactly that: one Markdown file in one folder.
## Frontmatter — what `validate-skill-submission.sh` requires
```yaml
---
name: <kebab-case-slug> # required; usually matches the folder name
description: | # required; one paragraph; what the skill does in user terms
Generate and iterate ad creative including headlines, descriptions, and primary text.
triggers: # optional but strongly recommended
- "ad creative"
- "ad headline"
od: # optional; OD-specific metadata
mode: design-system # or other modes; check existing skills
category: <category-slug>
upstream: "https://github.com/..." # if the skill was lifted from somewhere
---
```
**Required by validator**: `name`, `description`. Everything else is convention.
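A check for the two required keys might look like the sketch below. It is not the code of `validate-skill-submission.sh`; it assumes the frontmatter opens with `---` on line 1 and that `name` and `description` are top-level keys, and `has_required_frontmatter` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Succeed when the file starts with a closed YAML frontmatter block that
# defines both a non-empty `name:` and a `description:` key.
has_required_frontmatter() {
  awk '
    NR == 1 { if ($0 != "---") exit 1; next }        # frontmatter must open on line 1
    $0 == "---" { closed = 1; exit }                  # closing delimiter ends the scan
    /^name:[[:space:]]*[^[:space:]]/ { name = 1 }     # name needs a value on the same line
    /^description:/ { desc = 1 }                      # value may be a block scalar ("|")
    END { exit !(closed && name && desc) }
  ' "$1"
}
```

The `END` rule still runs after an early `exit`, so a file without an opening `---` fails because `closed` was never set.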
## Body conventions (after the frontmatter)
Looking at existing skills, the typical body has:
1. `# <skill-name>` H1.
2. A one-line "what it does" sentence.
3. Optional `## Source` block when adapted from upstream (with attribution).
4. `## How to use` with one or two example prompts the user might type.
## When a skill folder needs more than `SKILL.md`
- **Reference assets** — long prompt fragments, example outputs, image references — go alongside `SKILL.md` in the same folder, referenced via relative paths in `SKILL.md`.
- **Subfolders** are fine: the validator only requires that every relative reference inside `SKILL.md` resolves and that no path escapes the skill folder.
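The "every relative reference resolves and none escapes the folder" rule can be sketched like this. It is an illustration, not the validator's code: it only looks at Markdown link and image targets of the form `](target)`, skips `http(s)` and `mailto` URLs, and treats absolute paths and `..` segments as escapes. `refs_stay_inside` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Succeed when every relative link target in the given SKILL.md exists
# and stays inside the folder that contains it.
refs_stay_inside() {
  local skill_md="$1" dir targets target
  dir="$(dirname "$skill_md")"
  # Collect link targets: the text right after "](" up to ")", "#", or whitespace.
  targets="$(grep -oE '\]\([^)#[:space:]]+' "$skill_md" | sed 's/^](//')" || true
  while IFS= read -r target; do
    [ -n "$target" ] || continue
    case "$target" in
      http://*|https://*|mailto:*) continue ;;  # external links are not checked here
      /*|../*|*/../*) return 1 ;;               # absolute or parent-relative: escapes the folder
    esac
    [ -e "$dir/$target" ] || return 1           # relative target must exist on disk
  done <<EOF
$targets
EOF
  return 0
}
```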
## Don'ts
- Don't put runtime code in here. Skills are *content* — Markdown + maybe assets. Code adapters live in `apps/daemon/src/`.
- Don't reference files outside `skills/<your-skill>/` — that breaks portability.
- Don't put binaries you don't need (the lighter the folder, the easier the review).

@ -0,0 +1,121 @@
#!/usr/bin/env bash
# Verify required tools + gh auth before the skill starts.
# Exit 0 = ready (prints GH_USER=... and READY=1 to stdout)
# Exit 2 = missing prereq, hint printed to stderr; skill should surface it verbatim.
set -uo pipefail
# shellcheck disable=SC1091
source "$(dirname "$0")/config.sh"
# config.sh runs with `set -e` for its own callers, but this script wants the
# OPPOSITE behavior: continue checking all prereqs even when one fails so we
# can surface the full diagnostic in one shot rather than aborting at the
# first miss. Restore -uo pipefail without -e after sourcing.
set +e
set -uo pipefail
# Skill root, used in the auth-failure hint below to tell the user where to
# drop a .gh-token file if they're stuck in a sandboxed agent.
_OD_SKILL_DIR_HINT="$(cd "$(dirname "$0")/.." && pwd)"
STATUS=0
MISSING=()
HINTS=()
check_bin() {
local bin="$1" install_hint="$2"
if command -v "$bin" >/dev/null 2>&1; then
printf ' ✓ %s\n' "$bin" >&2
else
printf ' ✗ %s (not installed)\n' "$bin" >&2
MISSING+=("$bin")
HINTS+=("$install_hint")
STATUS=2
fi
}
printf '[od-contrib] checking prerequisites...\n' >&2
OS="$(uname -s)"
case "$OS" in
Darwin) GH_HINT="brew install gh" ;;
Linux) GH_HINT="see https://github.com/cli/cli#installation (e.g. 'sudo apt install gh' or 'brew install gh')" ;;
*) GH_HINT="see https://github.com/cli/cli#installation" ;;
esac
check_bin gh "$GH_HINT"
check_bin git "install git for your OS"
check_bin jq "$( [[ $OS == Darwin ]] && echo 'brew install jq' || echo 'sudo apt install jq (or brew install jq)' )"
if ((${#MISSING[@]} > 0)); then
printf '\n[od-contrib][error] missing required tools: %s\n' "${MISSING[*]}" >&2
printf '\nInstall hints:\n' >&2
for i in "${!MISSING[@]}"; do
printf ' - %s: %s\n' "${MISSING[$i]}" "${HINTS[$i]}" >&2
done
exit 2
fi
# Two acceptable auth paths:
# 1. `gh auth status` succeeds (gh has a token in keychain or hosts.yml)
# 2. GH_TOKEN env var is set (config.sh loaded it from .gh-token, or caller exported it)
# Path 2 matters for sandboxed runtimes (Codex.app, Cursor, etc.) where gh
# CAN'T reach macOS keychain due to App Sandbox restrictions.
if [[ -n "${GH_TOKEN:-}" ]]; then
# Verify the token actually works against the API.
if ! gh api user --jq .login >/dev/null 2>&1; then
printf '[od-contrib][error] GH_TOKEN is set but gh api call failed (token expired?).\n' >&2
printf '[od-contrib][error] Refresh the token: from a terminal run gh auth refresh or replace the .gh-token file.\n' >&2
exit 2
fi
elif ! gh auth status >/dev/null 2>&1; then
cat >&2 <<EOF
[od-contrib][error] No GitHub credentials available.
Two ways to fix this:
Option A (one-time, works for any agent):
From a regular terminal, run:
gh auth login
Pick GitHub.com → HTTPS → browser login. Need 'repo' scope.
Option B (for sandboxed agents like Codex.app / Cursor that can't reach
the macOS keychain):
From a regular terminal where gh IS authenticated, run:
gh auth token > "$_OD_SKILL_DIR_HINT/.gh-token"
chmod 600 "$_OD_SKILL_DIR_HINT/.gh-token"
The skill will pick up the token automatically next run.
EOF
exit 2
fi
# Resolve the authenticated login. Fail closed if this can't be done — even
# with `gh auth status` green, `gh api user` can fail when the token has
# insufficient scopes, has been revoked, or GitHub is unreachable. Returning
# a fabricated GH_USER like `?` would propagate to TARGET_FORK and cause
# downstream pushes to point at `?/open-design`, so we'd rather stop here.
GH_USER="$(gh api user --jq .login 2>/dev/null)"
if [[ -z "$GH_USER" ]]; then
cat >&2 <<'EOF'
[od-contrib][error] gh auth check passed but `gh api user` could not resolve a login.
Common causes:
- The token has insufficient scopes (need at least 'repo')
- The token has been revoked or expired since the session started
- GitHub API is unreachable
Refresh the token with the right scopes and retry:
gh auth refresh -s repo
EOF
exit 2
fi
printf ' ✓ gh authed as %s\n' "$GH_USER" >&2
printf ' ✓ target locked to %s\n' "$OD_TARGET_REPO" >&2
printf 'GH_USER=%s\n' "$GH_USER"
printf 'READY=1\n'

@ -0,0 +1,66 @@
#!/usr/bin/env bash
# Shared config for the od-contribute skill.
# TARGET_REPO is hard-locked to nexu-io/open-design — this skill is OD-specific.
#
# Override via env vars before invoking a script:
# TARGET_FORK "<owner>/<name>" push branches here. Defaults to $GH_USER/open-design at runtime.
# OD_BASE_BRANCH default: main
# OD_WORK_ROOT default: $HOME/od-contrib-work
# OD_DISCORD_INVITE default: https://discord.gg/qhbcCH8Am4
set -euo pipefail
readonly OD_TARGET_REPO="nexu-io/open-design"
TARGET_REPO="$OD_TARGET_REPO"
: "${TARGET_FORK:=}"
: "${OD_BASE_BRANCH:=main}"
: "${OD_WORK_ROOT:="$HOME/od-contrib-work"}"
: "${OD_DISCORD_INVITE:=https://discord.gg/qhbcCH8Am4}"
# Sandboxed-agent fallback for gh auth.
# Codex.app, Cursor, and other macOS App Sandbox runtimes can't reach the
# system keychain where `gh auth login` stores the token by default. If
# GH_TOKEN isn't already set in the env, look for a token file shipped
# alongside the skill. The skill never *creates* this file automatically —
# it must be written by either:
# - a one-time `gh auth token > <skill>/.gh-token` from a non-sandboxed shell, or
# - the OAuth Device Flow bootstrap (TODO: implement for non-coder users).
_OD_SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
if [[ -z "${GH_TOKEN:-}" && -f "$_OD_SKILL_DIR/.gh-token" ]]; then
GH_TOKEN="$(tr -d '[:space:]' < "$_OD_SKILL_DIR/.gh-token")"
export GH_TOKEN
fi
unset _OD_SKILL_DIR
export TARGET_REPO TARGET_FORK OD_BASE_BRANCH OD_WORK_ROOT OD_DISCORD_INVITE
od::log() { printf '[od-contrib] %s\n' "$*" >&2; }
od::warn() { printf '[od-contrib][warn] %s\n' "$*" >&2; }
od::err() { printf '[od-contrib][error] %s\n' "$*" >&2; }
od::die() { od::err "$*"; exit 1; }
od::require() {
command -v "$1" >/dev/null 2>&1 || od::die "missing dependency: $1"
}
od::slugify() {
local s="${1:-}"
s="$(printf '%s' "$s" | tr '[:upper:]' '[:lower:]')"
s="$(printf '%s' "$s" | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')"
printf '%s' "${s:0:48}"
}
od::workdir_for() {
# $1 = a slug for this contribution session (e.g. "skill-foo-2026-05-28")
printf '%s/%s\n' "$OD_WORK_ROOT" "$1"
}
# Refuse to operate outside $OD_WORK_ROOT (defense against runaway scripts).
od::assert_in_workroot() {
local path="$1"
case "$path" in
"$OD_WORK_ROOT"/*) return 0 ;;
*) od::die "refusing to operate on path outside OD_WORK_ROOT: $path" ;;
esac
}

@ -0,0 +1,100 @@
#!/usr/bin/env bash
# Create a bug-report issue on nexu-io/open-design from a rendered body file.
# Usage:
# create-issue.sh --title "<issue title>" --body-file <rendered .md>
# [--allow-duplicates] [--dedupe-keywords "<kw>"]
#
# Dedupe gate (now actually a gate, not a print):
# - If --dedupe-keywords is supplied, the script runs `gh search issues`
# FIRST and writes the matches to stderr.
# - If any matches are found AND --allow-duplicates was NOT passed, the
# script EXITS NON-ZERO with a clear hint and refuses to call
# `gh issue create`. This lets the agent (per SKILL.md Step 3d.4) show
# the matches to the user and only re-invoke with --allow-duplicates
# after the user explicitly chose "open a new issue anyway".
# - If `gh search` ITSELF fails (network, rate limit, jq parse error),
# the script also exits non-zero. Failing closed is the right default
# for a bug-dedupe gate — we'd rather block creation than open
# potentially redundant issues silently.
#
# Caller contract (matches SKILL.md):
# 1. Run with --dedupe-keywords on first attempt; show output to user.
# 2. If exit is non-zero with REASON=duplicates_found, ask the user.
# 3. If user picks "open anyway", re-run WITHOUT --dedupe-keywords (or
# WITH --allow-duplicates). The script then creates the issue.
#
# Emits the issue URL on its own line (stdout) on success.
set -euo pipefail
source "$(dirname "$0")/config.sh"
TITLE=""
BODY_FILE=""
DEDUPE_KEYWORDS=""
ALLOW_DUPES=0
while (($#)); do
case "$1" in
--title) TITLE="$2"; shift 2 ;;
--body-file) BODY_FILE="$2"; shift 2 ;;
--dedupe-keywords) DEDUPE_KEYWORDS="$2"; shift 2 ;;
--allow-duplicates) ALLOW_DUPES=1; shift ;;
*) od::die "unknown flag: $1" ;;
esac
done
[[ -n "$TITLE" ]] || od::die "--title required"
[[ -f "$BODY_FILE" ]] || od::die "--body-file does not exist: $BODY_FILE"
od::require gh
od::require jq
if [[ -n "$DEDUPE_KEYWORDS" && "$ALLOW_DUPES" -eq 0 ]]; then
od::log "checking for duplicates: $DEDUPE_KEYWORDS"
# Run gh search and jq as separate steps so a failure in either is loud
# rather than swallowed by `|| true`. The previous implementation chained
# them with `|| true`, which let a network or jq error mask "no duplicates"
# vs "search broken" — both produced empty output and the script then
# created the issue regardless.
if ! SEARCH_JSON="$(gh search issues "$DEDUPE_KEYWORDS" \
--repo "$TARGET_REPO" \
--state open \
--limit 5 \
--json number,title,url 2>&1)"; then
od::err "gh search failed: $SEARCH_JSON"
printf 'REASON=search_failed\n' >&2
exit 2
fi
MATCH_COUNT="$(printf '%s' "$SEARCH_JSON" | jq -r 'length' 2>/dev/null || echo 'parse-error')"
if [[ "$MATCH_COUNT" == "parse-error" ]]; then
od::err "could not parse gh search output as JSON"
printf 'REASON=parse_failed\n' >&2
exit 2
fi
if (( MATCH_COUNT > 0 )); then
printf '%s' "$SEARCH_JSON" \
| jq -r '.[] | " #\(.number) \(.title)\n \(.url)"' >&2
od::err "${MATCH_COUNT} potentially duplicate open issue(s) found."
od::err "Refusing to create a new issue. Show these to the user and ask:"
od::err " (a) comment on an existing one — open the URL above"
od::err " (b) open a new issue anyway — re-run with --allow-duplicates"
od::err " (c) cancel — do nothing"
printf 'REASON=duplicates_found\n' >&2
printf 'MATCH_COUNT=%s\n' "$MATCH_COUNT" >&2
exit 3
fi
od::log "no duplicates found — proceeding with create"
fi
URL="$(gh issue create \
--repo "$TARGET_REPO" \
--title "$TITLE" \
--body-file "$BODY_FILE" \
--label bug)" || od::die "gh issue create failed"
printf '\n'
printf '%s\n' "$URL"

@ -0,0 +1,116 @@
#!/usr/bin/env bash
# Commit, push, and open a PR against nexu-io/open-design.
# Usage: create-pr.sh --workdir <dir> --type <skill|design-system|i18n|docs> \
# --title "<pr title>" --body-file <rendered PR body .md>
#
# Reads:
# <workdir>/.od-contrib/contributor.txt (display name; optional)
# <workdir>/.od-contrib/pitch.txt (one-line pitch; optional)
# Emits PR URL on its own line at the end (stdout).
set -euo pipefail
source "$(dirname "$0")/config.sh"
WORKDIR=""
TYPE=""
TITLE=""
BODY_FILE=""
DRAFT=""
while (($#)); do
case "$1" in
--workdir) WORKDIR="$2"; shift 2 ;;
--type) TYPE="$2"; shift 2 ;;
--title) TITLE="$2"; shift 2 ;;
--body-file) BODY_FILE="$2"; shift 2 ;;
--draft) DRAFT="--draft"; shift ;;
*) od::die "unknown flag: $1" ;;
esac
done
[[ -n "$WORKDIR" ]] || od::die "--workdir required"
[[ -n "$TYPE" ]] || od::die "--type required (skill|design-system|i18n|docs)"
[[ -n "$TITLE" ]] || od::die "--title required"
[[ -f "$BODY_FILE" ]] || od::die "--body-file does not exist: $BODY_FILE"
[[ -d "$WORKDIR/.git" ]] || od::die "not a git workdir: $WORKDIR"
od::require gh
od::require git
cd "$WORKDIR"
BRANCH="$(git rev-parse --abbrev-ref HEAD)"
case "$BRANCH" in
main|master|develop) od::die "refusing to push base branch '$BRANCH'" ;;
esac
# 1) Stage + commit if there are changes. Use a non-jargon commit message.
#
# Use `git status --porcelain` rather than `git diff --quiet` because the latter
# ignores untracked files. The most common contribution shape — a brand-new
# Skill folder, translation file, or doc — is 100% untracked at this point;
# any predicate that misses untracked paths would silently push an empty PR.
#
# Belt-and-suspenders against the skill's internal scratch dir leaking into
# the user's contribution PR: setup-workspace.sh adds `.od-contrib/` to
# .git/info/exclude, but in case this script is invoked against a workdir
# set up differently, also pass `:!.od-contrib` as a pathspec exclude so
# nothing under .od-contrib/ gets staged here.
SCRATCH_EXCLUDE=':!:.od-contrib'
if [[ -n "$(git status --porcelain -- . "$SCRATCH_EXCLUDE")" ]]; then
git add -A -- . "$SCRATCH_EXCLUDE"
# If even after `git add` the index is clean (e.g., changes were only in
# ignored paths or symlink mode bits), skip the commit instead of erroring.
if git diff --cached --quiet; then
od::log "no real changes after staging — skipping commit"
else
git commit -m "$TITLE"
od::log "created commit"
fi
else
od::log "nothing new to commit (assuming work was already committed)"
fi
# 2) Decide push remote. Prefer fork.
PUSH_REMOTE="origin"
if [[ -n "${TARGET_FORK}" ]] && git remote | grep -q '^fork$'; then
PUSH_REMOTE="fork"
else
od::warn "no usable fork (TARGET_FORK empty or 'fork' remote missing); pushing to upstream ${TARGET_REPO}. 3s to abort..."
sleep 3 || true
fi
od::log "pushing to ${PUSH_REMOTE}/${BRANCH}"
git push -u "$PUSH_REMOTE" "$BRANCH"
# 3) Pick label set per contribution type. (OD's labels: documentation, i18n, blog, enhancement, ...)
LABELS=()
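case "$TYPE" in
skill) LABELS+=("good first issue" "enhancement") ;;
design-system) LABELS+=("good first issue" "enhancement") ;;
i18n) LABELS+=("i18n" "documentation") ;;
docs) LABELS+=("documentation") ;;
# Reject unknown types: an empty LABELS array would trip `set -u` on older bash.
*) od::die "unknown --type: $TYPE (expected skill|design-system|i18n|docs)" ;;
esac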
LABEL_FLAGS=()
for L in "${LABELS[@]}"; do
LABEL_FLAGS+=(--label "$L")
done
# 4) Open the PR. Pass --head explicitly: for a fork push it must be `<owner>:<branch>`.
HEAD_REF="$BRANCH"
if [[ "$PUSH_REMOTE" == "fork" && -n "$TARGET_FORK" ]]; then
HEAD_REF="${TARGET_FORK%%/*}:${BRANCH}"
fi
PR_URL="$(gh pr create \
--repo "$TARGET_REPO" \
--base "$OD_BASE_BRANCH" \
--head "$HEAD_REF" \
--title "$TITLE" \
--body-file "$BODY_FILE" \
${DRAFT} \
"${LABEL_FLAGS[@]}")" || od::die "gh pr create failed"
printf '\n'
printf '%s\n' "$PR_URL"

@ -0,0 +1,118 @@
#!/usr/bin/env bash
# Find low-effort doc improvements in nexu-io/open-design.
# Usage: discover-doc-gaps.sh <workdir>
# Stdout: NDJSON rows. Three classes:
# {"kind":"todo","file":"docs/foo.md","line":42,"text":"TODO: explain the daemon"}
# {"kind":"typo","file":"README.md","line":17,"word":"recieve","suggested":"receive"}
# {"kind":"deadlink","file":"docs/bar.md","line":3,"url":"https://example.com/x","status":"404"}
#
# Dead-link checks are best-effort: timeout 8s, only reports 4xx/5xx responses; timeouts and network errors are skipped.
set -uo pipefail
source "$(dirname "$0")/config.sh"
WORKDIR="${1:?workdir required}"
[[ -d "$WORKDIR/.git" ]] || od::die "not a git workdir: $WORKDIR"
cd "$WORKDIR"
od::require jq
# Use ripgrep when present for speed; fall back to grep -rE.
if command -v rg >/dev/null 2>&1; then
GREP() { rg --no-heading --line-number --color never "$@"; }
else
GREP() {
# Translate the rg flags we use into their grep equivalents.
local args=()
while (($#)); do
case "$1" in
--no-heading|--color=never) shift ;;
--color) shift; (($#)) && shift ;; # drop the flag and its value ("never")
--line-number) args+=("-n"); shift ;;
-g) args+=("--include=${2:-}"); shift; (($#)) && shift ;; # rg glob -> grep --include
*) args+=("$1"); shift ;;
esac
done
grep -rE "${args[@]}"
}
fi
# 1) TODOs / FIXMEs in docs.
emit_todo() {
while IFS=: read -r file line rest; do
[[ -z "$file" ]] && continue
jq -nc --arg file "$file" --argjson line "$line" --arg text "$rest" \
'{kind:"todo", file:$file, line:$line, text:($text|sub("^[[:space:]]+";""))}'
done
}
# GREP resolves to rg or to the grep wrapper above, so one call covers both cases.
GREP --no-heading --line-number --color never -e 'TODO|FIXME|XXX' \
-g '*.md' docs/ README*.md QUICKSTART*.md CONTRIBUTING*.md 2>/dev/null \
| emit_todo || true
# 2) Common typos. Whole-word match, case-sensitive (avoid false positives in code/links).
TYPOS=(
"teh|the"
"recieve|receive"
"seperate|separate"
"occured|occurred"
"succesful|successful"
"untill|until"
"wich|which"
"thier|their"
"alot|a lot"
"definately|definitely"
"neccessary|necessary"
"enviroment|environment"
"transparant|transparent"
"appearence|appearance"
)
for entry in "${TYPOS[@]}"; do
bad="${entry%%|*}"
good="${entry##*|}"
while IFS=: read -r file line _rest; do
[[ -z "$file" ]] && continue
# NOTE: hits inside fenced code blocks are not filtered out here; review each row before fixing.
jq -nc --arg file "$file" --argjson line "$line" --arg word "$bad" --arg good "$good" \
'{kind:"typo", file:$file, line:$line, word:$word, suggested:$good}'
done < <(GREP --no-heading --line-number --color never -e "\\b${bad}\\b" -g '*.md' . 2>/dev/null \
|| grep -rEn "\\b${bad}\\b" --include='*.md' . 2>/dev/null \
|| true)
done
# 3) External link health (best-effort, capped).
# Cap to 50 links per run so we don't hammer arbitrary hosts.
MAX_LINKS=50
SEEN=0
extract_links() {
GREP --no-heading --line-number --color never -e '\]\(https?://[^) ]+\)' -g '*.md' . 2>/dev/null \
|| grep -rEn '\]\(https?://[^) ]+\)' --include='*.md' . 2>/dev/null
}
while IFS= read -r row; do
[[ "$SEEN" -ge "$MAX_LINKS" ]] && break
file="${row%%:*}"
rest="${row#*:}"
line="${rest%%:*}"
text="${rest#*:}"
# Extract first http(s) URL on the line.
url="$(printf '%s' "$text" | grep -oE 'https?://[^) ]+' | head -1)"
[[ -z "$url" ]] && continue
SEEN=$((SEEN+1))
# HEAD with 8s timeout, follow redirects, take final status.
status="$(curl -sS -o /dev/null -m 8 -L -w '%{http_code}' --head "$url" 2>/dev/null || echo "000")"
case "$status" in
2*|3*) ;; # OK
000) ;; # network/timeout — skip rather than spam false positives
*)
jq -nc --arg file "$file" --argjson line "$line" --arg url "$url" --arg status "$status" \
'{kind:"deadlink", file:$file, line:$line, url:$url, status:$status}'
;;
esac
done < <(extract_links | head -n "$MAX_LINKS")
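
The link scan above leans on two small pieces of shell logic: splitting a `file:line:text` row with parameter expansion, and bucketing the final HTTP status. The following is a minimal sketch of both, assuming paths contain no colon; the helper names `split_row` and `classify_status` are hypothetical and do not appear in the script.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the row parsing and status triage.

# Split "file:line:text" on the first two colons only, so colons inside
# the text (for example in a URL) survive. Assumes the path has no colon.
split_row() {
  local row="$1" rest
  rest="${row#*:}"          # drop "file:"
  ROW_FILE="${row%%:*}"     # everything before the first colon
  ROW_LINE="${rest%%:*}"    # everything before the second colon
  ROW_TEXT="${rest#*:}"     # the remainder, colons included
}

# 2xx/3xx count as healthy, 000 (curl got no response) is skipped,
# anything else is reported as a dead link.
classify_status() {
  case "$1" in
    2*|3*) echo ok ;;
    000)   echo skipped ;;
    *)     echo dead ;;
  esac
}

split_row 'docs/guide.md:42:See [spec](https://example.com/spec) for details'
printf 'file=%s line=%s\n' "$ROW_FILE" "$ROW_LINE"
printf 'status 404 -> %s\n' "$(classify_status 404)"
```

Skipping `000` trades recall for precision: a flaky network produces no rows at all rather than a burst of false dead-link reports.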

@ -0,0 +1,114 @@
#!/usr/bin/env bash
# Find translation gaps in nexu-io/open-design.
# Usage: discover-i18n-gaps.sh <workdir>
# Stdout: NDJSON, one row per gap:
# {"doc":"README","english":"README.md","lang":"es","translated":null,"status":"missing"}
# {"doc":"QUICKSTART","english":"QUICKSTART.md","lang":"zh-CN","translated":"QUICKSTART.zh-CN.md","status":"stale","english_mtime_epoch":...,"translated_mtime_epoch":...,"english_commits_since_translation":12}
#
# A "stale" translation is one whose last-touched commit is older than the most recent
# commit touching the English source. Ranking is left to the caller (the agent).
set -euo pipefail
source "$(dirname "$0")/config.sh"
WORKDIR="${1:?workdir required}"
[[ -d "$WORKDIR/.git" ]] || od::die "not a git workdir: $WORKDIR"
cd "$WORKDIR"
od::require git
od::require jq
# Translatable English source files we care about (top-level docs).
ENGLISH_DOCS=(README.md QUICKSTART.md CONTRIBUTING.md MAINTAINERS.md TRANSLATIONS.md PRIVACY.md)
# Common language suffixes seen in OD's tree (extend as the project grows).
LANGS=(zh-CN zh-TW ja-JP de fr es ko ru pt-BR tr uk ar)
# Languages already represented for a given doc are detected from disk;
# the LANGS array is what we *offer* to a contributor when no translation exists.
last_commit_epoch() {
# Last commit touching $1 — empty string if file has never been committed.
git log -1 --format=%ct -- "$1" 2>/dev/null || true
}
commits_between() {
# How many commits touched $newer that are NOT ancestors of $older_ref's tip
# commit. Uses commit ancestry rather than `--since=<epoch>` math because
# `--since` is inclusive of the boundary epoch — so when English source and
# translation are touched in the SAME commit (very common: bulk i18n
# refresh, structural change applied across all translations), `--since`
# would count that shared commit and mark the translation "stale" by 1.
#
# `tr_sha..HEAD -- $newer` reads as: "commits reachable from HEAD but not
# from tr_sha, that touched $newer". When tr_sha is HEAD's tip for $newer
# too (same-commit update), the answer is correctly 0.
local newer="$1" older_ref="$2"
local tr_sha
tr_sha="$(git log -1 --format=%H -- "$older_ref" 2>/dev/null)"
if [[ -z "$tr_sha" ]]; then
# Translation never committed; count all history of $newer.
git log --format=%H -- "$newer" 2>/dev/null | wc -l | tr -d ' '
else
git rev-list "${tr_sha}..HEAD" -- "$newer" 2>/dev/null | wc -l | tr -d ' '
fi
}
emit() {
jq -nc \
--arg doc "$1" --arg english "$2" --arg lang "$3" \
--arg translated "$4" --arg status "$5" \
--arg en_epoch "$6" --arg tr_epoch "$7" --arg en_commits_since "$8" \
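'{
doc: $doc, english: $english, lang: $lang,
# Each optional value is wrapped as ((...) // null): a select() that yields
# nothing inside an object constructor would suppress the whole row, which
# silently dropped every "missing" and "untracked" gap.
translated: (($translated | select(length>0)) // null),
status: $status,
english_mtime_epoch: (($en_epoch | select(length>0) | tonumber?) // null),
translated_mtime_epoch: (($tr_epoch | select(length>0) | tonumber?) // null),
english_commits_since_translation: ($en_commits_since | tonumber? // null)
}'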
}
for english in "${ENGLISH_DOCS[@]}"; do
[[ -f "$english" ]] || continue
doc="${english%.md}"
en_epoch="$(last_commit_epoch "$english")"
# Track observed languages for this doc as a newline-delimited string.
# Avoids `declare -A` (associative arrays), which requires Bash 4 — macOS
# ships with Bash 3.2 by default and most agent-spawned bash subprocesses
# inherit that. The leading + trailing newlines let us match `\n<lang>\n`
# without false positives on prefix overlap (e.g. zh vs zh-CN).
SEEN_LANGS=$'\n'
while IFS= read -r -d '' translated; do
# Filename pattern: <DOC>.<lang>.md (e.g. README.zh-CN.md).
# `find . ... -print0` emits paths with a leading `./`; strip that first
# and operate on the basename so the prefix-strip below works regardless.
base="${translated#./}"
base="$(basename "$base")"
lang_part="${base#${doc}.}"
lang_part="${lang_part%.md}"
[[ -z "$lang_part" || "$lang_part" == "$base" ]] && continue
SEEN_LANGS+="${lang_part}"$'\n'
tr_epoch="$(last_commit_epoch "$translated")"
if [[ -z "$tr_epoch" ]]; then
emit "$doc" "$english" "$lang_part" "$translated" "untracked" "$en_epoch" "" ""
continue
fi
en_commits_since="$(commits_between "$english" "$translated")"
if [[ "$en_commits_since" -gt 0 ]]; then
emit "$doc" "$english" "$lang_part" "$translated" "stale" "$en_epoch" "$tr_epoch" "$en_commits_since"
fi
# else: up-to-date, skip emission entirely.
done < <(find . -maxdepth 1 -type f -name "${doc}.*.md" -print0)
# Then, for each language in LANGS that we didn't see, emit a "missing" row.
for lang in "${LANGS[@]}"; do
case "$SEEN_LANGS" in
*$'\n'"$lang"$'\n'*) continue ;;
esac
emit "$doc" "$english" "$lang" "" "missing" "$en_epoch" "" ""
done
done
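
The newline-delimited `SEEN_LANGS` string above stands in for an associative array. A minimal sketch of that pattern follows, assuming filenames already match `<DOC>.<lang>.md`; the helper names `add_lang`, `has_lang`, and `lang_of` are made up for illustration.

```bash
#!/usr/bin/env bash
# Set membership without associative arrays (works on Bash 3.2).
SEEN_LANGS=$'\n'

# Append one code, keeping a newline on both sides of every entry.
add_lang() { SEEN_LANGS+="$1"$'\n'; }

# Succeeds only on an exact entry match, so "zh" does not match "zh-CN".
has_lang() {
  case "$SEEN_LANGS" in
    *$'\n'"$1"$'\n'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Derive the language code from a name already known to match
# "<DOC>.<lang>.md" (the find filter in the script guarantees that).
lang_of() {
  local doc="$1" base
  base="$(basename "$2")"
  base="${base#$doc.}"
  printf '%s\n' "${base%.md}"
}

add_lang "$(lang_of README ./README.zh-CN.md)"
for code in zh-CN zh; do
  if has_lang "$code"; then echo "$code: seen"; else echo "$code: missing"; fi
done
```

The delimiters on both sides of each entry are what prevent prefix overlap from producing a false match.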

@ -0,0 +1,92 @@
#!/usr/bin/env bash
# Clone (or reuse) nexu-io/open-design in an isolated workdir + create a feature branch.
# Usage: setup-workspace.sh <type> <slug>
# <type> one of: skill | design-system | i18n | docs
# <slug> short kebab-case identifier (e.g. "translate-readme-es", "fix-typo-quickstart")
#
# Env: TARGET_FORK optional (else pushes go to upstream — create-pr.sh warns first).
#
# Stdout (machine-readable):
# WORKDIR=<abs path>
# BRANCH=<branch name>
set -euo pipefail
source "$(dirname "$0")/config.sh"
TYPE="${1:?type required (skill|design-system|i18n|docs)}"
SLUG="${2:?slug required}"
case "$TYPE" in
skill|design-system|i18n|docs) ;;
*) od::die "unknown type: $TYPE (expected skill|design-system|i18n|docs)" ;;
esac
od::require gh
od::require git
# Use second-precision timestamp so two contribution sessions on the same day
# (or the SKILL.md i18n flow that calls setup-workspace.sh with a placeholder
# slug like "translate" before the user has picked a language) don't collide
# into the same workdir. Reusing a workdir would leak untracked / half-edited
# files from an earlier abandoned session into a later contribution.
SESSION_TAG="$(date +%Y%m%d-%H%M%S)"
SESSION_DIR="${TYPE}-${SLUG}-${SESSION_TAG}"
WORKDIR="$(od::workdir_for "$SESSION_DIR")"
BRANCH="od-contrib/${TYPE}/${SLUG}-${SESSION_TAG}"
mkdir -p "$OD_WORK_ROOT"
od::assert_in_workroot "$WORKDIR"
CLONE_URL="https://github.com/${TARGET_REPO}.git"
if [[ -d "$WORKDIR/.git" ]]; then
# We reach here only if the user explicitly resumed by passing the same
# SESSION_TAG, or if the wall clock somehow produced a duplicate. Clean any
# untracked/dirty state so the run starts from a known good base instead of
# inheriting whatever the previous occupant left behind.
od::log "reusing existing workdir: $WORKDIR"
git -C "$WORKDIR" fetch origin --prune
git -C "$WORKDIR" reset --hard HEAD
git -C "$WORKDIR" clean -fdx
else
od::log "cloning $CLONE_URL into $WORKDIR (depth 50)"
git clone --depth 50 "$CLONE_URL" "$WORKDIR"
fi
# Tell git to ignore our internal scratch dir so `git add -A` later (in
# create-pr.sh) doesn't accidentally stage type.txt, slug.txt, PR-BODY.md
# into the user's contribution PR. .git/info/exclude is repo-local and not
# committed, so we don't pollute the OD repo's .gitignore.
mkdir -p "$WORKDIR/.git/info"
if ! grep -qxF '.od-contrib/' "$WORKDIR/.git/info/exclude" 2>/dev/null; then
printf '\n# od-contribute scratch dir (added by setup-workspace.sh)\n.od-contrib/\n' \
>> "$WORKDIR/.git/info/exclude"
fi
git -C "$WORKDIR" checkout "$OD_BASE_BRANCH"
git -C "$WORKDIR" pull --ff-only origin "$OD_BASE_BRANCH"
# Configure fork remote if provided.
if [[ -n "${TARGET_FORK:-}" ]]; then
if git -C "$WORKDIR" remote | grep -q '^fork$'; then
git -C "$WORKDIR" remote set-url fork "https://github.com/${TARGET_FORK}.git"
else
git -C "$WORKDIR" remote add fork "https://github.com/${TARGET_FORK}.git"
fi
fi
# Create or reset branch off latest base.
if git -C "$WORKDIR" show-ref --verify --quiet "refs/heads/$BRANCH"; then
od::log "branch $BRANCH already exists — switching"
git -C "$WORKDIR" checkout "$BRANCH"
else
git -C "$WORKDIR" checkout -b "$BRANCH" "$OD_BASE_BRANCH"
fi
mkdir -p "$WORKDIR/.od-contrib"
printf '%s\n' "$TYPE" > "$WORKDIR/.od-contrib/type.txt"
printf '%s\n' "$SLUG" > "$WORKDIR/.od-contrib/slug.txt"
od::log "workspace ready"
printf 'WORKDIR=%s\n' "$WORKDIR"
printf 'BRANCH=%s\n' "$BRANCH"
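
A caller can consume the two machine-readable lines printed above by scanning for `KEY=VALUE` pairs. This is a rough sketch of one way to do that; `value_of` is a hypothetical helper and the sample paths are invented rather than taken from a real run.

```bash
#!/usr/bin/env bash
# Hypothetical caller-side helper: print the value of the first
# KEY=VALUE line on stdin whose key matches, or fail if there is none.
value_of() {
  local key="$1" line
  while IFS= read -r line; do
    case "$line" in
      "$key="*) printf '%s\n' "${line#*=}"; return 0 ;;
    esac
  done
  return 1
}

# Sample output standing in for a real run (paths are made up).
sample=$'WORKDIR=/tmp/od-work/docs-fix-typo-20260531-121027\nBRANCH=od-contrib/docs/fix-typo-20260531-121027'

workdir="$(printf '%s\n' "$sample" | value_of WORKDIR)"
branch="$(printf '%s\n' "$sample" | value_of BRANCH)"
printf 'workdir=%s\nbranch=%s\n' "$workdir" "$branch"
```

Stripping only up to the first `=` keeps values intact even when they contain further `=` characters.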

@ -0,0 +1,97 @@
#!/usr/bin/env bash
# Validate a user-supplied DESIGN.md (Open Design "design system" submission).
# Usage: validate-design-system.sh <DESIGN.md path> [--reference <existing-DESIGN.md>]...
#        (--reference may be repeated, one flag per sample file)
#
# Strategy: instead of hardcoding a schema, we read 1-3 existing DESIGN.md files
# from the OD repo at runtime to learn which top-level sections are conventional,
# then check the new file has at least those sections (case-insensitive H1/H2 match).
#
# Heuristic-only: warns rather than fails on missing optional sections; only fails
# when the file is empty, unparseable, or has zero structural overlap with samples.
set -uo pipefail
source "$(dirname "$0")/config.sh"
NEW_FILE="${1:?DESIGN.md path required}"
shift || true
REFERENCE_FILES=()
while (($#)); do
case "$1" in
--reference) REFERENCE_FILES+=("$2"); shift 2 ;;
*) od::die "unknown flag: $1" ;;
esac
done
[[ -f "$NEW_FILE" ]] || od::die "not a file: $NEW_FILE"
[[ -s "$NEW_FILE" ]] || od::die "file is empty: $NEW_FILE"
extract_headings() {
# Pull H1/H2 lines, lowercase, trim, dedupe.
awk '/^#{1,2}[[:space:]]+/ { sub(/^#{1,2}[[:space:]]+/, ""); print tolower($0) }' "$1" \
| sed -E 's/[[:space:]]+$//' | sort -u
}
new_headings="$(extract_headings "$NEW_FILE")"
[[ -n "$new_headings" ]] || { printf 'FAIL no H1/H2 headings found in %s — is this really a design system doc?\n' "$NEW_FILE"; printf 'RESULT=fail\n'; exit 1; }
# If references were supplied, build the union of their headings as the "expected" set.
EXPECTED=""
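# The ${arr[@]+...} guard keeps Bash < 4.4 from treating the empty array as
# unbound under `set -u` when no --reference flag was passed.
for ref in ${REFERENCE_FILES[@]+"${REFERENCE_FILES[@]}"}; do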
[[ -f "$ref" ]] || continue
EXPECTED+=$'\n'"$(extract_headings "$ref")"
done
EXPECTED="$(printf '%s' "$EXPECTED" | grep -v '^$' | sort -u || true)"
PASS=0
WARN=0
FAIL=0
if [[ -z "$EXPECTED" ]]; then
printf 'WARN no reference DESIGN.md provided — running structure-only checks\n'
WARN=$((WARN+1))
else
# Count overlap. >= 30% structural overlap = looks like a design system.
overlap=0
total=0
while IFS= read -r h; do
[[ -z "$h" ]] && continue
total=$((total+1))
if printf '%s\n' "$new_headings" | grep -Fxq "$h"; then
overlap=$((overlap+1))
fi
done <<< "$EXPECTED"
if [[ "$total" -eq 0 ]]; then
printf 'WARN references parsed but had no headings\n'; WARN=$((WARN+1))
else
pct=$(( overlap * 100 / total ))
if [[ "$pct" -ge 30 ]]; then
printf 'PASS structural overlap with reference DESIGN.md files: %d%% (%d/%d)\n' "$pct" "$overlap" "$total"
PASS=$((PASS+1))
else
printf 'FAIL structural overlap with reference DESIGN.md files only %d%% (%d/%d) — likely missing required sections\n' "$pct" "$overlap" "$total"
FAIL=$((FAIL+1))
fi
fi
fi
# Always-on lightweight checks:
if grep -qE '^(#)[[:space:]]+' "$NEW_FILE"; then
printf 'PASS has at least one H1 heading\n'; PASS=$((PASS+1))
else
printf 'WARN no H1 heading found — convention is one H1 with the brand/system name\n'; WARN=$((WARN+1))
fi
# No relative path escape (../).
if grep -nE '\(\.\./' "$NEW_FILE" >/dev/null; then
printf 'WARN contains ../ relative paths — make sure they resolve once placed at design-systems/<brand>/DESIGN.md\n'; WARN=$((WARN+1))
fi
if [[ "$FAIL" -eq 0 ]]; then
printf 'RESULT=pass (passes=%d warns=%d)\n' "$PASS" "$WARN"
exit 0
else
printf 'RESULT=fail (passes=%d warns=%d fails=%d)\n' "$PASS" "$WARN" "$FAIL"
exit 1
fi
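
The overlap heuristic above reduces to two steps: normalise the H1/H2 headings of each file, then count how many reference headings reappear in the candidate. A minimal sketch under simplifying assumptions (a single reference file, invented sample content, and a hypothetical `overlap_pct` helper) could look like this:

```bash
#!/usr/bin/env bash
# Sketch of the heading-overlap heuristic; file contents are invented.

extract_headings() {
  # H1/H2 text, lowercased, trailing whitespace trimmed, de-duplicated.
  awk '/^##?[ \t]+/ { sub(/^##?[ \t]+/, ""); print tolower($0) }' "$1" \
    | sed -E 's/[[:space:]]+$//' | sort -u
}

overlap_pct() {
  # $1 = candidate file, $2 = reference file. Prints an integer percentage.
  local new expected h overlap=0 total=0
  new="$(extract_headings "$1")"
  expected="$(extract_headings "$2")"
  while IFS= read -r h; do
    [[ -z "$h" ]] && continue
    total=$((total+1))
    if grep -Fxq -- "$h" <<< "$new"; then
      overlap=$((overlap+1))
    fi
  done <<< "$expected"
  if (( total == 0 )); then echo 0; else echo $(( overlap * 100 / total )); fi
}

ref="$(mktemp)"; new="$(mktemp)"
printf '%s\n' '# Brand' '## Colors' '## Typography' '## Spacing' > "$ref"
printf '%s\n' '# Acme' '## colors' '## Typography' '### Detail' > "$new"
printf 'overlap: %s%%\n' "$(overlap_pct "$new" "$ref")"
```

With these samples, two of the four reference headings (`colors`, `typography`) reappear, so the sketch reports 50%, which clears the 30% threshold used by the validator.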

@ -0,0 +1,205 @@
#!/usr/bin/env bash
# Lightweight Markdown validation for i18n / docs / blog contributions.
#
# Usage: validate-markdown.sh <file> [<file> ...] [--reference <orig>]
#
# Checks per file:
# - File is non-empty.
# - Code fences are balanced (count of ``` is even).
# - Newly-introduced relative refs that don't resolve on disk fail.
# Refs that ALREADY exist in the --reference file (the English source for
# a translation, or HEAD's version for a docs edit) are NOT failed even
# if they don't resolve — many OD docs reference website-router slugs
# like `skills/blog-post/` that aren't files in the checked-out repo.
# - External http(s) links return 2xx/3xx (best-effort, capped, 8s timeout).
#
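# Without --reference, only unambiguous relative refs are checked: image refs,
# explicit ./ or ../ paths, and targets with a file extension. Slug-style refs
# are skipped, since we can't tell route slugs from file paths in isolation.
# The other checks still run.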
set -uo pipefail
source "$(dirname "$0")/config.sh"
set +e
set -uo pipefail # restore the "accumulate diagnostics" stance after sourcing.
REFERENCE=""
FILES=()
while (($#)); do
case "$1" in
--reference) REFERENCE="$2"; shift 2 ;;
--) shift; while (($#)); do FILES+=("$1"); shift; done ;;
-*) od::die "unknown flag: $1" ;;
*) FILES+=("$1"); shift ;;
esac
done
(( ${#FILES[@]} >= 1 )) || od::die "usage: validate-markdown.sh <file> [<file> ...] [--reference <orig>]"
# Build the "already-broken in source" set of relative refs (newline-delimited
# string for Bash 3 compatibility — no associative arrays). Anything in this
# set is excused from failing the new-file check.
KNOWN_DEAD=$'\n'
if [[ -n "$REFERENCE" ]]; then
if [[ ! -f "$REFERENCE" ]]; then
od::warn "--reference $REFERENCE does not exist; ignoring."
else
ref_dir="$(cd "$(dirname "$REFERENCE")" && pwd -P)"
while IFS= read -r ref; do
[[ -z "$ref" ]] && continue
case "$ref" in http*|mailto:*|\#*|/*) continue ;; esac
target="${ref%%#*}"; target="${target%%\?*}"
[[ -z "$target" ]] && continue
if [[ ! -e "$ref_dir/$target" ]]; then
KNOWN_DEAD+="${ref}"$'\n'
fi
done < <(grep -oE '\!?\[[^]]*\]\([^)]+\)' "$REFERENCE" 2>/dev/null \
| sed -E 's/.*\(([^)]+)\).*/\1/' \
| sort -u)
fi
fi
OVERALL=0
MAX_HTTP_PER_FILE=20
check_file() {
local f="$1"
local fail=0
printf -- '--- %s ---\n' "$f"
if [[ ! -f "$f" ]]; then
printf 'FAIL not a file: %s\n' "$f"
return 1
fi
if [[ ! -s "$f" ]]; then
printf 'FAIL empty file: %s\n' "$f"
return 1
fi
printf 'PASS exists, non-empty\n'
# Code fence balance.
local fences
fences="$(grep -cE '^```' "$f" 2>/dev/null)"
if (( fences % 2 == 0 )); then
printf 'PASS code fences balanced (%d)\n' "$fences"
else
printf 'FAIL unbalanced code fences (%d ``` lines)\n' "$fences"
fail=1
fi
# Relative refs — tiered check:
#
# Image refs (![alt](path)) — always validate. No website route uses
# image-syntax markdown; if it doesn't resolve on disk, it's broken.
#
# Link refs starting with ./ or ../ — always validate. Explicit relative
# paths are unambiguously file references, not router slugs.
#
# Other link refs (e.g. `skills/blog-post/`) — only validated when
# --reference is supplied (we excuse refs already broken in the source).
# Without --reference we skip these because OD docs use slug-style refs
# for website routes that don't resolve to files in the checkout.
#
# In all cases, refs already broken in --reference (when supplied) are
# excused from failure rather than reported as regressions.
local dir rel_bad=0 rel_excused=0 rel_skipped_ambiguous=0
dir="$(cd "$(dirname "$f")" && pwd -P)"
while IFS= read -r entry; do
[[ -z "$entry" ]] && continue
# `!?` in grep keeps the leading `!` for image refs; case-detect here.
is_img=0
case "$entry" in '!'*) is_img=1 ;; esac
# Extract URL: between first `(` and last `)`.
ref="${entry#*\(}"
ref="${ref%\)*}"
case "$ref" in http*|mailto:*|\#*|/*) continue ;; esac
target="${ref%%#*}"; target="${target%%\?*}"
[[ -z "$target" ]] && continue
# Should we validate this ref?
if (( is_img == 0 )); then
case "$ref" in
./*|../*) ;; # explicit relative — always validate
*)
# File-like targets (have an obvious file extension) are unambiguously
# on-disk references — `[doc](missing.md)` is not a website route, it
# is a sibling file. Validate without --reference. Otherwise (no
# extension, looks like a slug), only validate when we have a
# reference to compare against.
case "${target##*/}" in
*.md|*.markdown|*.mdx \
|*.png|*.jpg|*.jpeg|*.gif|*.webp|*.svg|*.ico|*.bmp \
|*.pdf|*.txt|*.json|*.yaml|*.yml|*.toml \
|*.sh|*.ts|*.tsx|*.js|*.jsx|*.css|*.html|*.xml \
|*.csv|*.zip|*.gz)
;; # file-like — always validate
*)
if [[ -z "$REFERENCE" ]]; then
rel_skipped_ambiguous=$((rel_skipped_ambiguous+1))
continue
fi
;;
esac
;;
esac
fi
if [[ ! -e "$dir/$target" ]]; then
case "$KNOWN_DEAD" in
*$'\n'"$ref"$'\n'*) rel_excused=$((rel_excused+1)) ;;
*)
printf 'FAIL broken relative reference: %s\n' "$ref"
rel_bad=$((rel_bad+1))
fail=1
;;
esac
fi
done < <(grep -oE '!?\[[^]]*\]\([^)]+\)' "$f" 2>/dev/null | sort -u)
if (( rel_bad == 0 )); then
msg="PASS relative refs OK"
(( rel_excused > 0 )) && msg+=" (${rel_excused} pre-existing dead refs kept as-is)"
(( rel_skipped_ambiguous > 0 )) && msg+=" (${rel_skipped_ambiguous} slug-style refs skipped — pass --reference to check)"
printf '%s\n' "$msg"
fi
# External link health (best-effort).
local http_seen=0 http_bad=0
while IFS= read -r url; do
[[ -z "$url" ]] && continue
(( http_seen >= MAX_HTTP_PER_FILE )) && break
http_seen=$((http_seen+1))
local code
code="$(curl -sS -o /dev/null -m 8 -L -w '%{http_code}' --head "$url" 2>/dev/null)"
[[ -z "$code" ]] && code="000"
case "$code" in
2*|3*|000) ;; # OK, or network-flaky — don't punish.
*)
printf 'FAIL external link %s returned %s\n' "$url" "$code"
http_bad=$((http_bad+1))
fail=1
;;
esac
# URL extraction: stop at whitespace, ), ", ', <, >, [, ]. HTML <img src="..."> in
# OD docs would otherwise leak a trailing quote into the URL and cause false 404s.
done < <(grep -oE 'https?://[^][[:space:]"'\''<>)]+' "$f" 2>/dev/null | sort -u)
if (( http_bad == 0 && http_seen > 0 )); then
printf 'PASS %d external links return 2xx/3xx (or network-skipped)\n' "$http_seen"
fi
return "$fail"
}
for f in "${FILES[@]}"; do
if ! check_file "$f"; then
OVERALL=1
fi
done
if [[ "$OVERALL" -eq 0 ]]; then
printf 'RESULT=pass\n'
exit 0
else
printf 'RESULT=fail\n'
exit 1
fi
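
The tiered relative-ref policy in `check_file` can be read as a small classifier. The sketch below is illustrative only: `ref_kind` is a hypothetical function, the extension list is shortened, and URL detection uses `http://` and `https://` prefixes rather than the script's broader `http*` pattern.

```bash
#!/usr/bin/env bash
# Hypothetical classifier mirroring the tiered relative-ref policy.
# Input is one Markdown link or image, e.g. '[doc](guide.md)'.
ref_kind() {
  local entry="$1" ref target
  ref="${entry#*\(}"        # text after the first "("
  ref="${ref%\)*}"          # text before the last ")"
  case "$ref" in
    http://*|https://*|mailto:*|\#*|/*) echo skip; return 0 ;;
  esac
  target="${ref%%#*}"       # drop any fragment
  target="${target%%\?*}"   # drop any query string
  if [[ -z "$target" ]]; then echo skip; return 0; fi
  case "$entry" in '!'*) echo image; return 0 ;; esac
  case "$ref" in ./*|../*) echo explicit; return 0 ;; esac
  case "${target##*/}" in
    *.md|*.png|*.svg|*.json|*.sh) echo file ;;   # shortened extension list
    *) echo slug ;;                              # checked only with --reference
  esac
}

for sample in '![logo](assets/logo)' '[up](../README.md)' '[doc](missing.md#top)' \
              '[route](skills/blog-post/)' '[site](https://example.com)'; do
  printf '%-28s -> %s\n' "$sample" "$(ref_kind "$sample")"
done
```

Only the `slug` bucket depends on `--reference`; `image`, `explicit`, and `file` are always checked against the disk.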

@ -0,0 +1,138 @@
#!/usr/bin/env bash
# Validate a user-supplied OD skill folder before staging it for PR.
# Usage: validate-skill-submission.sh <skill-folder>
# Checks (each prints PASS/FAIL line on stdout):
# - SKILL.md exists
# - SKILL.md has frontmatter with `name` and `description`
# - `name` matches folder name (warn-only, since OD may rename on merge)
# - all relative paths in SKILL.md resolve to files inside the folder
# - no path escapes the skill folder (../ in references)
# Exit 0 = all PASS or only warnings. Exit 1 = at least one FAIL.
set -uo pipefail
source "$(dirname "$0")/config.sh"
SKILL_DIR="${1:?skill folder path required}"
[[ -d "$SKILL_DIR" ]] || od::die "not a directory: $SKILL_DIR"
ABS_SKILL_DIR="$(cd "$SKILL_DIR" && pwd -P)"
FAIL=0
pass() { printf 'PASS %s\n' "$1"; }
warn() { printf 'WARN %s\n' "$1"; }
fail() { printf 'FAIL %s\n' "$1"; FAIL=1; }
SKILL_MD="$ABS_SKILL_DIR/SKILL.md"
if [[ ! -f "$SKILL_MD" ]]; then
fail "SKILL.md missing — every OD skill folder must contain SKILL.md at its root"
printf 'RESULT=%s\n' "fail"
exit 1
fi
pass "SKILL.md exists"
# Frontmatter parse: extract YAML between the first two '---' lines.
#
# The opening fence MUST be on line 1 — both Claude Code's loader and Codex
# CLI's loader (codex-rs/core-skills) parse the top of the file, so a SKILL.md
# that starts with prose, a BOM, or whitespace and only contains a `---` block
# later will load as having no frontmatter, even if this validator picks it up.
# Reject leading content explicitly so the validator can't pass a file the
# real loaders will reject.
FIRST_LINE="$(head -n 1 "$SKILL_MD")"
if [[ ! "$FIRST_LINE" =~ ^---[[:space:]]*$ ]]; then
fail "SKILL.md must start with a YAML frontmatter fence ('---') on line 1 — found: $(printf '%q' "$FIRST_LINE" | head -c 80)"
printf 'RESULT=%s\n' "fail"
exit 1
fi
FRONT=$(awk '
BEGIN { in_fm=0; fence=0 }
/^---[[:space:]]*$/ {
fence++
if (fence==1) { in_fm=1; next }
if (fence==2) { exit }
}
in_fm { print }
' "$SKILL_MD")
if [[ -z "$FRONT" ]]; then
fail "SKILL.md has a leading '---' but no closing fence or empty frontmatter"
else
pass "SKILL.md frontmatter present"
name_line="$(printf '%s' "$FRONT" | grep -E '^name:' | head -1 || true)"
desc_line="$(printf '%s' "$FRONT" | grep -E '^description:' | head -1 || true)"
[[ -n "$name_line" ]] && pass "frontmatter has 'name'" || fail "frontmatter missing 'name:'"
[[ -n "$desc_line" ]] && pass "frontmatter has 'description'" || fail "frontmatter missing 'description:'"
# Sanity: name should look like a slug.
fm_name="$(printf '%s' "$name_line" | sed -E 's/^name:[[:space:]]*//; s/^["'\''"]//; s/["'\''"]$//')"
folder_name="$(basename "$ABS_SKILL_DIR")"
if [[ -n "$fm_name" && "$fm_name" != "$folder_name" ]]; then
warn "frontmatter name '$fm_name' differs from folder name '$folder_name' (maintainer may rename — OK)"
fi
fi
# Relative path scan: every non-URL, non-anchor markdown link target must
# resolve inside the skill folder.
#
# We extract ALL markdown links (`[label](target)`) and filter out URLs and
# anchors here, rather than only matching dot-prefixed paths in the regex.
# Plain intra-skill references like `[ref](references/foo.md)` or
# `[script](scripts/run.sh)` are common and must be validated too — the
# contract for SKILL.md says every relative path resolves on disk, regardless
# of whether the author wrote `./references/foo.md` or `references/foo.md`.
# A narrower `\(\.{1,2}/...\)` pattern would silently let bare paths through.
BAD_REFS=0
ESCAPE=0
# Lexical escape check: count path segments and ensure no prefix walks above
# the skill root. We do this on the literal target rather than from `cd … &&
# pwd -P` so that a missing intermediate directory (which is itself a fail
# we want to report) doesn't masquerade as an escape.
escapes_root() {
local p="$1" depth=0 seg
# Strip a leading "./" if present.
p="${p#./}"
IFS='/' read -r -a _segs <<< "$p"
for seg in "${_segs[@]}"; do
case "$seg" in
''|.) ;;
..) depth=$((depth-1)); (( depth < 0 )) && return 0 ;;
*) depth=$((depth+1)) ;;
esac
done
return 1
}
while IFS= read -r ref; do
# Skip protocol URLs, mailto, anchors-only, and absolute paths.
case "$ref" in
http://*|https://*|mailto:*|tel:*|\#*|/*) continue ;;
esac
# Strip query and fragment components before resolving.
target="${ref%%#*}"
target="${target%%\?*}"
[[ -z "$target" ]] && continue
if escapes_root "$target"; then
ESCAPE=1
fail "path escapes skill folder: $ref"
continue
fi
if [[ ! -e "$ABS_SKILL_DIR/$target" ]]; then
BAD_REFS=$((BAD_REFS+1))
fail "referenced file does not exist: $ref"
fi
done < <(grep -oE '\!?\[[^]]*\]\([^)]+\)' "$SKILL_MD" 2>/dev/null \
| sed -E 's/.*\(([^)]+)\).*/\1/' \
| sort -u)
if [[ "$BAD_REFS" -eq 0 && "$ESCAPE" -eq 0 ]]; then
pass "all relative references resolve inside the skill folder"
fi
if [[ "$FAIL" -eq 0 ]]; then
printf 'RESULT=%s\n' "pass"
exit 0
else
printf 'RESULT=%s\n' "fail"
exit 1
fi
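
The lexical escape check is easiest to follow on concrete inputs. The snippet below restates the same depth-counting walk as `escapes_root` and runs it over a few sample targets; the targets are invented for illustration.

```bash
#!/usr/bin/env bash
# Same lexical walk as escapes_root, exercised on sample targets.
escapes_root() {
  local p="${1#./}" depth=0 seg
  local -a segs
  IFS='/' read -r -a segs <<< "$p"
  for seg in "${segs[@]}"; do
    case "$seg" in
      ''|.) ;;                             # empty or "." leaves depth unchanged
      ..) depth=$((depth-1))
          (( depth < 0 )) && return 0 ;;   # climbed above the skill root
      *)  depth=$((depth+1)) ;;
    esac
  done
  return 1
}

for target in references/foo.md ./scripts/run.sh a/../b ../secret.md a/../../b; do
  if escapes_root "$target"; then
    echo "escape: $target"
  else
    echo "inside: $target"
  fi
done
```

Because the walk never touches the filesystem, a target such as `a/../b` is judged safe even when `a/` does not exist; the separate existence check reports that case.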

@ -0,0 +1,37 @@
### What happened?
{{WHAT_HAPPENED}}
### Steps to reproduce
{{STEPS}}
### Expected behavior
{{EXPECTED}}
### Open Design version
{{OD_VERSION}}
### Platform
{{PLATFORM}}
### Logs (optional)
```
{{LOGS}}
```
### Screenshots (optional)
{{SCREENSHOTS}}
### Additional context
{{CONTEXT}}
---
_Reported via the `od-contribute` skill. If you can reproduce or have more context, please add a comment — every signal helps narrow the fix._

@ -0,0 +1,37 @@
## What this PR adds
A new Design System — **{{BRAND_NAME}}** — at `design-systems/{{BRAND_SLUG}}/DESIGN.md`.
> {{PITCH}}
## What this design system covers
{{COVERAGE_NOTES}}
## How to try it
1. `cd open-design`
2. `pnpm tools-dev run web`
3. Start a new project and pick **{{BRAND_NAME}}** from the design system picker.
4. Ask the model: _"{{TRY_PROMPT}}"_
{{SCREENSHOT_BLOCK}}
## What's in this PR
- `design-systems/{{BRAND_SLUG}}/DESIGN.md` — the canonical design brief OD loads.
- Any supporting assets in `design-systems/{{BRAND_SLUG}}/` are referenced from `DESIGN.md`.
## Checklist
- [x] DESIGN.md has the conventional sections (compared against existing OD design systems)
- [x] No `../` path escapes outside the brand folder
- [ ] Maintainer review
---
👋 This is my first OD contribution. Hi! If anything looks off, tell me what to change and I'll happily push a fixup commit.
If you want to chat (or you're another newcomer reading this and want help shipping your first PR), come hang out in the OD Discord: {{DISCORD_INVITE}}
_Generated with the `od-contribute` skill._

@ -0,0 +1,32 @@
## What this PR fixes
{{ONE_LINE_SUMMARY}}
## Details
{{DETAILS}}
<!--
Use this for the body when there's nuance:
- which file/section
- the exact sentence/typo/dead link
- what you replaced it with and why
-->
## Files touched
{{FILES_LIST}}
## Checklist
- [x] Markdown still parses cleanly (no broken fences or structure)
- [x] All links and image paths still resolve
- [ ] Maintainer review
---
👋 This is my first OD contribution. Hi! Small fix, but I figured every typo / dead link costs the next reader 30 seconds, and this saves that.
If you want to chat or there's something you'd love help getting fixed, come find us in the OD Discord: {{DISCORD_INVITE}}
_Generated with the `od-contribute` skill._

@ -0,0 +1,41 @@
## What this PR translates
**{{DOC_NAME}}** → **{{LANG_DISPLAY_NAME}}** (`{{LANG_CODE}}`)
- New file: `{{TRANSLATED_PATH}}`
- Source: `{{ENGLISH_PATH}}`
- Status: {{STATUS}} <!-- "missing" (new translation) or "stale" (refreshed) -->
## What I preserved
- Every Markdown structure element (headings, lists, tables, callouts, link/image targets)
- Code blocks — left untranslated
- Brand names and product names — left untranslated
- Internal cross-links — adjusted to point to the localized file when one exists, else to the English source
## What I changed
{{TRANSLATION_NOTES}}
## How to verify
```bash
# Render preview locally
cd open-design
# (or just open the .md file in any Markdown viewer)
```
## Checklist
- [x] Markdown parses cleanly (code fences balanced, no broken structure)
- [x] All relative links and image paths still resolve
- [x] External links return 2xx/3xx
- [ ] Maintainer review
---
👋 This is my first OD contribution. I'm a native {{LANG_DISPLAY_NAME}} speaker (or close to it!) and want to help OD reach more people in my language.
If you want to chat or you're another translator reading this, come find us in the OD Discord: {{DISCORD_INVITE}}
_Generated with the `od-contribute` skill._

@ -0,0 +1,37 @@
## What this PR adds
A new Skill — **{{SKILL_NAME}}** — at `skills/{{SKILL_SLUG}}/`.
> {{PITCH}}
## Why I made it
{{MOTIVATION}}
## How to try it
1. `cd open-design`
2. Run OD locally: `pnpm tools-dev run web`
3. Open a project, start a chat, and ask: _"{{TRY_PROMPT}}"_
{{SCREENSHOT_BLOCK}}
## What's in this PR
- `skills/{{SKILL_SLUG}}/SKILL.md` — the skill itself (frontmatter + instructions)
- everything else inside `skills/{{SKILL_SLUG}}/` is referenced from `SKILL.md`
## Checklist
- [x] `SKILL.md` has a `name` and `description` in the frontmatter
- [x] Every relative path in `SKILL.md` resolves
- [x] No path escapes the skill folder
- [ ] Maintainer review
---
👋 This is my first OD contribution. Hi! If anything looks off, tell me what to change and I'll happily push a fixup commit.
If you want to chat (or you're another newcomer reading this and want help shipping your first PR), come hang out in the OD Discord: {{DISCORD_INVITE}}
_Generated with the `od-contribute` skill._
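
These templates use `{{NAME}}` placeholders. How the `od-contribute` skill fills them is not shown in this diff, so the following is only a sketch of one possible approach, assuming plain string substitution; `render_template` is a hypothetical function.

```bash
#!/usr/bin/env bash
# Hypothetical renderer: replace every {{KEY}} token with a value.
# Usage: render_template TEMPLATE KEY VALUE [KEY VALUE ...]
render_template() {
  local out="$1" token
  shift
  while (( $# >= 2 )); do
    token="{{$1}}"
    # Quoting the pattern makes it literal; quoting the replacement keeps
    # "&" from being treated specially by newer Bash releases.
    out=${out//"$token"/"$2"}
    shift 2
  done
  printf '%s\n' "$out"
}

template='A new Skill: **{{SKILL_NAME}}** at `skills/{{SKILL_SLUG}}/`.'
render_template "$template" SKILL_NAME 'Invoice Maker' SKILL_SLUG invoice-maker
```

Placeholders with no matching key are left in place, which makes an unfilled field easy to spot in the rendered PR body.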

@ -122,11 +122,12 @@ jobs:
- name: Install dependencies
run: pnpm install --frozen-lockfile
- - name: Verify mac Electron framework symlinks
+ - name: Inspect mac Electron framework symlinks
run: |
set -euo pipefail
electron_dist="$(node -e 'const path = require("node:path"); const { createRequire } = require("node:module"); const requireFromDesktop = createRequire(path.join(process.cwd(), "apps/desktop/package.json")); const electron = requireFromDesktop.resolve("electron"); process.stdout.write(path.join(path.dirname(electron), "dist"));')"
framework="$electron_dist/Electron.app/Contents/Frameworks/Electron Framework.framework"
+ missing_links=0
for link in \
"$framework/Electron Framework" \
"$framework/Helpers" \
@ -134,12 +135,15 @@ jobs:
"$framework/Resources" \
"$framework/Versions/Current"; do
if [ ! -L "$link" ]; then
- echo "Expected Electron framework symlink, got non-symlink: $link" >&2
- ls -la "$framework" >&2 || true
- ls -la "$framework/Versions" >&2 || true
- exit 1
+ echo "::warning::Expected Electron framework symlink, got non-symlink: $link"
+ missing_links=1
fi
done
+ if [ "$missing_links" -ne 0 ]; then
+ ls -la "$framework" >&2 || true
+ ls -la "$framework/Versions" >&2 || true
+ echo "Continuing into tools-pack because electron-builder is the source of truth for whether packaging actually works."
+ fi
- name: Prepare Apple signing certificate
env:
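
The change in this hunk turns a hard failure into a warn-and-continue inspection. A self-contained sketch of that pattern, using a throwaway directory in place of the Electron framework bundle, might look like the following; `inspect_symlinks` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Warn-only symlink inspection, modelled on the workflow step.
# Returns 1 if any path is not a symlink, but never exits the script.
inspect_symlinks() {
  local missing=0 link
  for link in "$@"; do
    if [ ! -L "$link" ]; then
      echo "::warning::Expected symlink, got non-symlink: $link"
      missing=1
    fi
  done
  return "$missing"
}

demo="$(mktemp -d)"
mkdir -p "$demo/Versions/A"
ln -s "Versions/A" "$demo/Current"     # a proper symlink
mkdir "$demo/Resources"                # a real directory where a link was expected

if ! inspect_symlinks "$demo/Current" "$demo/Resources"; then
  echo "Continuing: the packaging step decides whether this matters."
fi
```

Collecting every offending path before reporting gives one complete picture per run instead of stopping at the first mismatch.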

.gitignore

@ -42,9 +42,18 @@ tsconfig.tsbuildinfo
.cursor/
.agents/
.opencode/
- .claude/
+ .claude/*
+ # Exception: od-contribute skill ships with the repo so the OD app can mount it
+ # for non-coder contributors. Personal Claude state (sessions, settings, etc.) stays ignored.
+ !.claude/skills/
+ .claude/skills/*
+ !.claude/skills/od-contribute/
+ !.claude/commands/
+ .claude/commands/*
+ !.claude/commands/od-contribute.md
.codex/
.deepseek/
.antigravitycli/
# Commander task scratchpad; keep local task notes out of git by default.
.task/

@ -14,7 +14,7 @@ This file is the single source of truth for agents entering this repository. Rea
## Workspace directories
- Workspace packages come from `pnpm-workspace.yaml`: `apps/*`, `packages/*`, `tools/*`, and `e2e`.
- - Top-level content directories: `skills/` (functional skills the agent invokes mid-task — utilities, briefs, packagers; see `skills/AGENTS.md`), `design-templates/` (rendering catalogue: decks, prototypes, image/video/audio templates; see `design-templates/AGENTS.md` and `specs/current/skills-and-design-templates.md`), `design-systems/` (brand `DESIGN.md` files), `craft/` (universal brand-agnostic craft rules a skill can opt into via `od.craft.requires`).
+ - Top-level content directories: `skills/` (functional skills the agent invokes mid-task — utilities, briefs, packagers; see `skills/AGENTS.md`), `design-templates/` (rendering catalogue: decks, prototypes, image/video/audio templates; see `design-templates/AGENTS.md` and `specs/current/skills-and-design-templates.md`), `design-systems/` (brand `DESIGN.md` files), `craft/` (universal brand-agnostic craft rules a skill can opt into via `od.craft.requires`), `mocks/` (replay-based mock CLIs for `opencode`/`claude`/`codex`/`gemini`/`cursor-agent`/`deepseek`/`qwen`/`grok`, the ACP family `devin`/`hermes`/`kilo`/`kimi`/`kiro`/`vibe`, and the AMR `vela` CLI (login + models + ACP), built from anonymized Langfuse traces — PATH-overlay drop-in for tests and self-validation; see `mocks/README.md`).
- `apps/web` is the Next.js 16 App Router + React 18 web runtime; do not restore `apps/nextjs`.
- `apps/daemon` is the local privileged daemon and `od` bin. It owns `/api/*`, agent spawning, skills, design systems, artifacts, and static serving.
- `apps/desktop` is the Electron shell; it discovers the web URL through sidecar IPC.
@ -167,6 +167,7 @@ root `pnpm tools-pr` script without a new explicit maintainer decision.
## Validation strategy
- After package, workspace, or command-entry changes, run `pnpm install` so workspace links and generated dist entries stay fresh.
- For agent-stream / parser changes (`apps/daemon/src/claude-stream.ts`, `json-event-stream.ts`, `qoder-stream.ts`, etc.), replay a recorded session through the mock CLIs in `mocks/` to verify event shapes round-trip without burning provider budget. PATH-overlay activation: `export PATH="$PWD/mocks/bin:$PATH" OD_MOCKS_TRACE=<8-char-id> OD_MOCKS_NO_DELAY=1`. See `mocks/README.md` for the trace catalog and selection knobs.
- Treat every `pnpm-lock.yaml` change as requiring a Nix pnpm deps hash refresh check. `nix/pnpm-deps.nix` is a generated lock artifact; use `pnpm nix:update-hash` only when intentionally maintaining Nix packaging, then re-run `nix flake check --print-build-logs --keep-going`. Contributors without Nix can rely on the PR `Validate workspace` gate, which now uploads or auto-applies the generated hash-only fix when possible.
- Before marking regular work ready, run at least `pnpm guard` and `pnpm typecheck`, plus the package-scoped tests/builds that match the files changed. Do not use or add root `pnpm test`/`pnpm build` aliases.
- For local web runtime loops, prefer `pnpm tools-dev run web --daemon-port <port> --web-port <port>`.
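
The PATH-overlay activation described in the agent-stream bullet above can also be expressed programmatically. The following is a minimal sketch, assuming a POSIX `:` path separator; `buildMockEnv` and `MockReplayOptions` are hypothetical names introduced for illustration and are not part of the repository. Only `mocks/bin`, `OD_MOCKS_TRACE`, and `OD_MOCKS_NO_DELAY` come from the text above.

```typescript
// Hypothetical helper (not from the repository): builds the environment
// for a mock-CLI replay run by prepending mocks/bin to PATH.
interface MockReplayOptions {
  repoRoot: string; // absolute path to the repository checkout
  traceId: string;  // the 8-character trace id from mocks/README.md
  noDelay?: boolean; // when true, sets OD_MOCKS_NO_DELAY=1
}

function buildMockEnv(
  baseEnv: Record<string, string | undefined>,
  opts: MockReplayOptions,
): Record<string, string | undefined> {
  const overlay = `${opts.repoRoot}/mocks/bin`;
  const currentPath = baseEnv["PATH"] ?? "";
  const env: Record<string, string | undefined> = {
    ...baseEnv,
    // The overlay goes first so the mock binaries shadow the real CLIs.
    PATH: currentPath ? `${overlay}:${currentPath}` : overlay,
    OD_MOCKS_TRACE: opts.traceId,
  };
  if (opts.noDelay) {
    env["OD_MOCKS_NO_DELAY"] = "1";
  }
  return env;
}
```

Returning a new object rather than mutating the caller's environment keeps the overlay scoped to the spawned test process.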


@ -800,7 +800,7 @@ Issues و PRs و skills جديدة وأنظمة تصميم جديدة، كلّه
شكراً لكلّ من ساعد في دفع Open Design للأمام — بكود، بوثائق، بملاحظات، بـ skills جديدة، بأنظمة تصميم جديدة، أو حتى بـ issue حادّة. كلّ مساهمة حقيقية تهمّ، والجدار أدناه أسهل طريقة لقول ذلك علناً.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design contributors" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design contributors" />
</a>
إن شحنت أوّل PR — مرحباً. تصنيف [`good-first-issue`](https://github.com/nexu-io/open-design/labels/good-first-issue) هو نقطة الدخول.
@ -817,9 +817,9 @@ Issues و PRs و skills جديدة وأنظمة تصميم جديدة، كلّه
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -726,7 +726,7 @@ Vollständiger Walkthrough, Merge-Messlatte, Code Style und was wir nicht annehm
Danke an alle, die Open Design vorangebracht haben: durch Code, Docs, Feedback, neue Skills, neue Design Systems oder auch ein scharfes Issue. Jeder echte Beitrag zählt, und die Wand unten ist die einfachste Art, das laut zu sagen.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design contributors" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design contributors" />
</a>
Wenn Sie Ihren ersten PR gemergt haben: willkommen. Das Label [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) ist der Einstiegspunkt.
@ -743,9 +743,9 @@ Das SVG oben wird täglich von [`.github/workflows/metrics.yml`](.github/workflo
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -787,7 +787,7 @@ Walkthrough completo, estándar de merge, code style y lo que no aceptamos → [
Gracias a todas las personas que han ayudado a mover Open Design hacia adelante: con código, docs, feedback, nuevas skills, nuevos design systems o incluso un issue preciso. Toda contribución real cuenta, y el muro de abajo es la forma más simple de decirlo en voz alta.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Contribuidores de Open Design" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Contribuidores de Open Design" />
</a>
Si ya enviaste tu primer PR, bienvenido. La etiqueta [`good-first-issue`](https://github.com/nexu-io/open-design/labels/good-first-issue) es el punto de entrada.
@ -804,9 +804,9 @@ El SVG anterior se regenera diariamente mediante [`.github/workflows/metrics.yml
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Historial de estrellas de Open Design" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Historial de estrellas de Open Design" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -733,7 +733,7 @@ Guide complet, critères de merge, style de code et refus fréquents → [`CONTR
Merci à toutes les personnes qui font avancer Open Design : code, docs, retours, nouveaux Skills, nouveaux Design Systems ou issues bien ciblées. Chaque vraie contribution compte.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Contributeurs Open Design" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Contributeurs Open Design" />
</a>
Si vous avez livré votre première PR, bienvenue. Le label [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) est le point d'entrée.
@ -750,9 +750,9 @@ Le SVG ci-dessus est régénéré chaque jour par [`.github/workflows/metrics.ym
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Historique des stars Open Design" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Historique des stars Open Design" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -723,7 +723,7 @@ Issue、PR、新 Skill、新 Design System を歓迎します。最も効果の
コード、ドキュメント、フィードバック、新 Skill、新 Design System、あるいは鋭い Issue — あらゆる形で Open Design を前進させてくださったすべての方に感謝します。すべての実質的なコントリビューションは大切であり、以下のウォールは最もシンプルな感謝の表明です。
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design コントリビューター" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design コントリビューター" />
</a>
初めての PR を送った方 — ようこそ。[`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) ラベルがエントリポイントです。
@ -740,9 +740,9 @@ Issue、PR、新 Skill、新 Design System を歓迎します。最も効果の
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -726,7 +726,7 @@ daemon 부팅 시 `PATH`에서 자동 감지됩니다. 설정 필요 없음. 스
Open Design을 앞으로 나아가게 도와준 모든 분께 감사드립니다 — 코드, 문서, 피드백, 새 skill, 새 디자인 시스템, 또는 날카로운 이슈 하나라도. 모든 진짜 기여가 의미 있고, 아래의 벽이 가장 직접적인 "감사합니다"입니다.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design 컨트리뷰터" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design 컨트리뷰터" />
</a>
첫 PR을 보냈다면 — 환영합니다. [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) 레이블이 시작점입니다.
@ -743,9 +743,9 @@ Open Design을 앞으로 나아가게 도와준 모든 분께 감사드립니다
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -1,6 +1,6 @@
# Open Design — the open-source Claude Design alternative
> **Open Design is the open-source, local-first alternative to [Claude Design][cd].** Web-deployable, BYOK at every layer — **16 coding-agent CLIs** auto-detected on your `PATH` (Claude Code, Codex, Devin for Terminal, Cursor Agent, Gemini CLI, OpenCode, Qwen, Qoder CLI, GitHub Copilot CLI, Hermes, Kimi, Pi, Kiro, Kilo, Mistral Vibe, DeepSeek TUI) become the design engine, driven by **132 composable Skills** and **150 brand-grade Design Systems**. No CLI? An OpenAI-compatible BYOK proxy is the same loop minus the spawn.
> **Open Design is the open-source, local-first alternative to [Claude Design][cd].** Web-deployable, BYOK at every layer — **16 coding-agent CLIs** auto-detected on your `PATH` (Claude Code, Codex, Devin for Terminal, Cursor Agent, Gemini CLI, OpenCode, Qwen, Qoder CLI, GitHub Copilot CLI, Hermes, Kimi, Pi, Kiro, Kilo, Mistral Vibe, DeepSeek TUI) become the design engine, driven by **137 composable Skills** and **150 brand-grade Design Systems**. No CLI? An OpenAI-compatible BYOK proxy is the same loop minus the spawn.
> [!IMPORTANT]
> ### 🔥 `0.8.0-preview` is here. Design's old world ends here.
@ -31,7 +31,7 @@
<a href="LICENSE"><img alt="License" src="https://img.shields.io/badge/license-Apache%202.0-blue.svg?style=flat-square" /></a>
<a href="#supported-coding-agents"><img alt="Agents" src="https://img.shields.io/badge/agents-16%20CLIs%20%2B%20BYOK%20proxy-black?style=flat-square" /></a>
<a href="#design-systems"><img alt="Design systems" src="https://img.shields.io/badge/design%20systems-150-orange?style=flat-square" /></a>
<a href="#skills"><img alt="Skills" src="https://img.shields.io/badge/skills-132-teal?style=flat-square" /></a>
<a href="#skills"><img alt="Skills" src="https://img.shields.io/badge/skills-137-teal?style=flat-square" /></a>
<a href="https://discord.gg/qhbcCH8Am4"><img alt="Discord" src="https://img.shields.io/badge/discord-join-5865F2?style=flat-square&logo=discord&logoColor=white" /></a>
<a href="https://x.com/nexudotio"><img alt="Follow @nexudotio on X" src="https://img.shields.io/badge/follow-%40nexudotio-1DA1F2?style=flat-square&logo=x&logoColor=white" /></a>
<a href="QUICKSTART.md"><img alt="Quickstart" src="https://img.shields.io/badge/quickstart-3%20commands-green?style=flat-square" /></a>
@ -64,8 +64,8 @@ OD stands on four open-source shoulders:
|---|---|
| **Coding-agent CLIs (16)** | Claude Code · Codex CLI · Devin for Terminal · Cursor Agent · Gemini CLI · OpenCode · Qwen Code · Qoder CLI · GitHub Copilot CLI · Hermes (ACP) · Kimi CLI (ACP) · Pi (RPC) · Kiro CLI (ACP) · Kilo (ACP) · Mistral Vibe CLI (ACP) · DeepSeek TUI — auto-detected on `PATH`, swap with one click |
| **BYOK fallback** | Protocol-specific API proxy at `/api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` — paste `baseUrl` + `apiKey` + `model`, choose Anthropic / OpenAI / Azure OpenAI / Google Gemini / Ollama Cloud / SenseAudio, and the daemon normalizes SSE back to the same chat stream. SenseAudio chat additionally exposes `generate_image` and `generate_video` tools so the model can write rendered artifacts straight into the active project's folder. Internal-IP/SSRF blocked at the daemon edge. |
| **Design systems built-in** | **129** — 2 hand-authored starters + 70 product systems (Linear, Stripe, Vercel, Airbnb, Tesla, Notion, Anthropic, Apple, Cursor, Supabase, Figma, Xiaohongshu, …) from [`awesome-design-md`][acd2], plus 57 design skills from [`awesome-design-skills`][ads] added directly under `design-systems/` |
| **Skills built-in** | **132** — 27 in `prototype` mode (web-prototype, saas-landing, dashboard, mobile-app, gamified-app, social-carousel, magazine-poster, dating-web, sprite-animation, motion-frames, critique, tweaks, wireframe-sketch, pm-spec, eng-runbook, finance-report, hr-onboarding, invoice, kanban-board, team-okrs, …) + 4 in `deck` mode (`guizang-ppt` · `simple-deck` · `replit-deck` · `weekly-update`). Grouped in the picker by `scenario`: design / marketing / operation / engineering / product / finance / hr / sale / personal. |
| **Design systems built-in** | **150** — hand-authored starters plus product systems (Linear, Stripe, Vercel, Airbnb, Tesla, Notion, Anthropic, Apple, Cursor, Supabase, Figma, Xiaohongshu, …) from [`awesome-design-md`][acd2], with curated entries from [`awesome-design-skills`][ads] added directly under `design-systems/` |
| **Skills built-in** | **137** — across `prototype` (web-prototype, saas-landing, dashboard, mobile-app, gamified-app, social-carousel, magazine-poster, dating-web, sprite-animation, motion-frames, critique, tweaks, wireframe-sketch, pm-spec, eng-runbook, finance-report, hr-onboarding, invoice, kanban-board, team-okrs, …), `deck` (`guizang-ppt` · `simple-deck` · `replit-deck` · `weekly-update`), and `image` / `video` / `audio` / `template` / `design-system` / `utility` modes. Grouped in the picker by `scenario`: design / marketing / operation / engineering / product / finance / hr / sale / personal. |
| **Media generation** | Image · video · audio surfaces ship alongside the design loop. **gpt-image-2** (Azure / OpenAI) for posters, avatars, infographics, illustrated maps · **Seedance 2.0** (ByteDance) for cinematic 15s text-to-video and image-to-video · **HyperFrames** ([heygen-com/hyperframes](https://github.com/heygen-com/hyperframes)) for HTML→MP4 motion graphics (product reveals, kinetic typography, data charts, social overlays, logo outros). Other image generators can already plug in through **Custom Image API** / **ImageRouter** when they expose an OpenAI-compatible image endpoint; workflow-first local runtimes such as **ComfyUI** are tracked separately as planned adapters. **93** ready-to-replicate prompts gallery — 43 gpt-image-2 + 39 Seedance + 11 HyperFrames — under [`prompt-templates/`](prompt-templates/), with preview thumbnails and source attribution. Same chat surface as code; outputs a real `.mp4` / `.png` chip into the project workspace. |
| **Visual directions** | 5 curated schools (Editorial Monocle · Modern Minimal · Warm Soft · Tech Utility · Brutalist Experimental) — each ships a deterministic OKLch palette + font stack ([`apps/daemon/src/prompts/directions.ts`](apps/daemon/src/prompts/directions.ts)) |
| **Device frames** | iPhone 15 Pro · Pixel · iPad Pro · MacBook · Browser Chrome — pixel-accurate, shared across skills under [`assets/frames/`](assets/frames/) |
@ -129,9 +129,9 @@ Linux AppImage packaging is available through the optional release lane and is c
## Skills
**132 skills ship in the box.** Each is a folder under [`skills/`](skills/) following the Claude Code [`SKILL.md`][skill] convention with an extended `od:` frontmatter that the daemon parses verbatim — `mode`, `platform`, `scenario`, `preview.type`, `design_system.requires`, `default_for`, `featured`, `fidelity`, `speaker_notes`, `animations`, `example_prompt` ([`apps/daemon/src/skills.ts`](apps/daemon/src/skills.ts)).
**137 skills ship in the box.** Each is a folder under [`skills/`](skills/) following the Claude Code [`SKILL.md`][skill] convention with an extended `od:` frontmatter that the daemon parses verbatim — `mode`, `platform`, `scenario`, `preview.type`, `design_system.requires`, `default_for`, `featured`, `fidelity`, `speaker_notes`, `animations`, `example_prompt` ([`apps/daemon/src/skills.ts`](apps/daemon/src/skills.ts)).
Two **modes** anchor the interactive catalog: **`prototype`** (32 skills — anything that renders as a single-page artifact, from a magazine landing to a phone screen to a PM spec doc) and **`deck`** (9 skills — horizontal-swipe presentations with deck-framework chrome). The catalog also ships `image`, `video`, `audio`, `template`, `design-system`, and `utility` modes for media generation, catalog updaters, and post-export audit helpers. The **`scenario`** field is what the picker groups them by: `design` · `marketing` · `operation` · `engineering` · `product` · `finance` · `hr` · `sale` · `personal`.
Two **modes** anchor the interactive catalog: **`prototype`** (anything that renders as a single-page artifact, from a magazine landing to a phone screen to a PM spec doc) and **`deck`** (horizontal-swipe presentations with deck-framework chrome). The catalog also ships `image`, `video`, `audio`, `template`, `design-system`, and `utility` modes for media generation, catalog updaters, and post-export audit helpers. The **`scenario`** field is what the picker groups them by: `design` · `marketing` · `operation` · `engineering` · `product` · `finance` · `hr` · `sale` · `personal`.
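
To make the `mode` and `scenario` fields concrete, here is a minimal sketch of how a picker could group skills by `scenario`. The type and function names (`SkillSummary`, `groupByScenario`) are hypothetical and do not reflect the actual shape in `apps/daemon/src/skills.ts`; only the mode and scenario values are taken from the paragraph above, and the scenario assigned to each sample skill is an assumption.

```typescript
// Mode and scenario values as listed in the README text.
type SkillMode =
  | "prototype" | "deck" | "image" | "video"
  | "audio" | "template" | "design-system" | "utility";

type SkillScenario =
  | "design" | "marketing" | "operation" | "engineering"
  | "product" | "finance" | "hr" | "sale" | "personal";

// Hypothetical reduced view of a parsed SKILL.md `od:` frontmatter.
interface SkillSummary {
  name: string;
  mode: SkillMode;
  scenario: SkillScenario;
}

// Groups skill names by scenario, preserving input order within each group.
function groupByScenario(skills: SkillSummary[]): Map<SkillScenario, string[]> {
  const groups = new Map<SkillScenario, string[]>();
  for (const skill of skills) {
    const names = groups.get(skill.scenario) ?? [];
    names.push(skill.name);
    groups.set(skill.scenario, names);
  }
  return groups;
}

// Sample data: skill names appear in the README; scenarios are illustrative.
const sample: SkillSummary[] = [
  { name: "web-prototype", mode: "prototype", scenario: "design" },
  { name: "simple-deck", mode: "deck", scenario: "design" },
  { name: "finance-report", mode: "prototype", scenario: "finance" },
];
const grouped = groupByScenario(sample);
```

Because `mode` and `scenario` are independent fields, a single scenario group can mix prototype and deck skills, as the `design` group does here.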
### Showcase examples
@ -260,7 +260,7 @@ What you compose at send time isn't "system + user". It's:
DISCOVERY directives (turn-1 form, turn-2 brand branch, TodoWrite, 5-dim critique)
+ identity charter (OFFICIAL_DESIGNER_PROMPT, anti-AI-slop, junior-pass)
+ active DESIGN.md (150 systems available)
+ active SKILL.md (132 skills available)
+ active SKILL.md (137 skills available)
+ project metadata (kind, fidelity, speakerNotes, animations, inspiration ids)
+ skill side files (auto-injected pre-flight: read assets/template.html + references/*.md)
+ (deck kind, no skill seed) DECK_FRAMEWORK_DIRECTIVE (nav / counter / scroll / print)
@ -408,7 +408,7 @@ For desktop/background startup, fixed-port restarts, and media generation dispat
The first load:
1. Detects which agent CLIs you have on `PATH` and picks one automatically.
2. Loads 132 skills + 150 design systems.
2. Loads 137 skills + 150 design systems.
3. Pops the welcome dialog so you can paste an Anthropic key (only needed for the BYOK fallback path).
4. **Auto-creates `./.od/`** — the local runtime folder for the SQLite project DB, per-project artifacts, and saved renders. There is no `od init` step; the daemon `mkdir`s everything it needs on boot.
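
Step 1 above (detecting agent CLIs on `PATH`) can be sketched as follows. This is an illustration under stated assumptions, not the daemon's implementation: `detectAgents` and `pickAgent` are hypothetical names, the file-existence check is injected so the logic stays testable, a POSIX `:` separator is assumed, and picking the first match in candidate order is a guess at what "picks one automatically" means.

```typescript
// Returns the candidate binaries found in at least one PATH directory,
// in the order the candidates were given.
function detectAgents(
  pathVar: string,
  candidates: string[],
  exists: (file: string) => boolean, // injected check, e.g. backed by the file system
): string[] {
  const dirs = pathVar.split(":").filter((dir) => dir.length > 0);
  return candidates.filter((bin) =>
    dirs.some((dir) => exists(`${dir}/${bin}`)),
  );
}

// Assumption: the first detected candidate wins; the real priority is not
// documented in the text above.
function pickAgent(detected: string[]): string | undefined {
  return detected[0];
}

// Usage with an in-memory stand-in for the file system.
const fakeFiles = new Set(["/usr/local/bin/codex", "/opt/bin/gemini"]);
const found = detectAgents(
  "/usr/bin:/usr/local/bin:/opt/bin",
  ["claude", "codex", "gemini"],
  (file) => fakeFiles.has(file),
);
const chosen = pickAgent(found);
```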
@ -709,7 +709,7 @@ open-design/
│ ├── sidecar/ ← generic sidecar runtime primitives
│ └── platform/ ← generic process/platform primitives
├── skills/ ← 132 SKILL.md skill bundles (32 prototype + 9 deck + image / video / audio / template / design-system / utility)
├── skills/ ← 137 SKILL.md skill bundles (prototype, deck, image, video, audio, template, design-system, utility)
│ ├── web-prototype/ ← default for prototype mode
│ ├── saas-landing/ dashboard/ pricing-page/ docs-page/ blog-post/
│ ├── mobile-app/ mobile-onboarding/ gamified-app/
@ -895,7 +895,7 @@ The chat / artifact loop gets the spotlight, but a handful of less-visible capab
- **Claude Design ZIP import.** Drop an export from claude.ai onto the welcome dialog. `POST /api/import/claude-design` extracts it into a real `.od/projects/<id>/`, opens the entry file as a tab, and stages a continue-where-Anthropic-left-off prompt for your local agent. No re-prompting, no "ask the model to re-create what we just had". ([`apps/daemon/src/server.ts`](apps/daemon/src/server.ts) — `/api/import/claude-design`)
- **Multi-provider BYOK proxy.** `POST /api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` takes `{ baseUrl, apiKey, model, messages }`, builds the provider-specific upstream request, normalizes SSE chunks into `delta/end/error`, and allows loopback local LLM providers while rejecting non-loopback private, link-local, CGNAT, multicast, reserved, and redirect targets to head off SSRF. OpenAI-compatible covers OpenAI, Azure AI Foundry `/openai/v1`, DeepSeek, Groq, MiMo, OpenRouter, Ollama, LM Studio, and self-hosted vLLM; Azure OpenAI adds deployment URL + `api-version`; Google uses Gemini `:streamGenerateContent`.
- **User-saved templates.** Once you like a render, `POST /api/templates` snapshots the HTML + metadata into the SQLite `templates` table. The next project picks it from a "your templates" row in the picker — same surface as the shipped 132, but yours.
- **User-saved templates.** Once you like a render, `POST /api/templates` snapshots the HTML + metadata into the SQLite `templates` table. The next project picks it from a "your templates" row in the picker — same surface as the shipped 137, but yours.
- **Tab persistence.** Every project remembers its open files and active tab in the `tabs` table. Reopen the project tomorrow and the workspace looks exactly the way you left it.
- **Artifact lint API.** `POST /api/artifacts/lint` runs structural checks on a generated artifact (broken `<artifact>` framing, missing required side files, stale palette tokens) and returns findings the agent can read back into its next turn. The five-dim self-critique uses this to ground its score in real evidence, not vibes.
- **Sidecar protocol + desktop automation.** Daemon, web, and desktop processes carry typed five-field stamps (`app · mode · namespace · ipc · source`) and expose a JSON-RPC IPC channel at `/tmp/open-design/ipc/<namespace>/<app>.sock`. `tools-dev inspect desktop status \| eval \| screenshot` drives that channel, so headless E2E works against a real Electron shell without bespoke harnesses ([`packages/sidecar-proto/`](packages/sidecar-proto/), [`apps/desktop/src/main/`](apps/desktop/src/main/)).
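
The SSRF rule described in the BYOK proxy bullet above (allow loopback, reject other private, link-local, CGNAT, multicast, and reserved targets) can be illustrated for literal IPv4 addresses. This is a minimal sketch, not the daemon's code: `classifyIpv4` is a hypothetical name, IPv6 and redirect handling are out of scope, and hostnames are reported as needing resolution rather than being resolved here.

```typescript
type Verdict = "allow" | "reject" | "needs-resolution";

// Parses a dotted-quad string into four octets, or returns null.
function parseIpv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const value = Number(part);
    if (value > 255) return null;
    octets.push(value);
  }
  return octets;
}

function classifyIpv4(host: string): Verdict {
  const octets = parseIpv4(host);
  if (octets === null) return "needs-resolution"; // hostname or non-IPv4 input
  const a = octets[0] ?? 0;
  const b = octets[1] ?? 0;
  if (a === 127) return "allow";                          // loopback: local LLM providers
  if (a === 0) return "reject";                           // "this network" (0.0.0.0/8)
  if (a === 10) return "reject";                          // private 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return "reject";   // private 172.16.0.0/12
  if (a === 192 && b === 168) return "reject";            // private 192.168.0.0/16
  if (a === 169 && b === 254) return "reject";            // link-local 169.254.0.0/16
  if (a === 100 && b >= 64 && b <= 127) return "reject";  // CGNAT 100.64.0.0/10
  if (a >= 224 && a <= 239) return "reject";              // multicast 224.0.0.0/4
  if (a >= 240) return "reject";                          // reserved 240.0.0.0/4
  return "allow";                                         // public address
}
```

A real guard would also need to apply the same check to every address a hostname resolves to and to each redirect hop, since a public name can point at a private address.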
@ -921,8 +921,8 @@ The whole machinery below is the [`huashu-design`](https://github.com/alchaincyf
| Form factor | Web (claude.ai) | Desktop (Electron) | **Web app + local daemon** |
| Deployable on Vercel | ❌ | ❌ | **✅** |
| Agent runtime | Bundled (Opus 4.7) | Bundled ([`pi-ai`][piai]) | **Delegated to user's existing CLI** |
| Skills | Proprietary | 12 custom TS modules + `SKILL.md` | **132 file-based [`SKILL.md`][skill] bundles, droppable** |
| Design system | Proprietary | `DESIGN.md` (v0.2 roadmap) | **`DESIGN.md` × 129 systems shipped** |
| Skills | Proprietary | 12 custom TS modules + `SKILL.md` | **137 file-based [`SKILL.md`][skill] bundles, droppable** |
| Design system | Proprietary | `DESIGN.md` (v0.2 roadmap) | **`DESIGN.md` × 150 systems shipped** |
| Provider flexibility | Anthropic only | 7+ via [`pi-ai`][piai] | **16 CLI adapters + OpenAI-compatible BYOK proxy** |
| Init question form | ❌ | ❌ | **✅ Hard rule, turn 1** |
| Direction picker | ❌ | ❌ | **✅ 5 deterministic directions** |
@ -994,7 +994,7 @@ Long-form provenance write-up — what we take from each, what we deliberately d
- [x] Daemon + agent detection (16 CLI adapters) + skill registry + design-system catalog
- [x] Web app + chat + question form + 5-direction picker + todo progress + sandboxed preview
- [x] 132 skills + 150 design systems + 5 visual directions + 5 device frames
- [x] 137 skills + 150 design systems + 5 visual directions + 5 device frames
- [x] SQLite-backed projects · conversations · messages · tabs · templates
- [x] Multi-provider BYOK proxy (`/api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream`) with SSRF guard
- [x] Claude Design ZIP import (`/api/import/claude-design`)
@ -1040,7 +1040,7 @@ Full walkthrough, bar-for-merging, code style, and what we don't accept → [`CO
Thanks to everyone who has helped move Open Design forward — through code, docs, feedback, new skills, new design systems, or even a sharp issue. Every real contribution counts, and the wall below is the easiest way to say so out loud.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design contributors" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design contributors" />
</a>
If you've shipped your first PR — welcome. The [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) label is the entry point.
@ -1057,9 +1057,9 @@ The SVG above is regenerated daily by [`.github/workflows/metrics.yml`](.github/
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -730,7 +730,7 @@ Walkthrough completo, barra para mergear, estilo de código e o que não aceitam
Obrigado a todas as pessoas que ajudaram a empurrar o Open Design pra frente — via código, docs, feedback, novas skills, novos design systems ou até uma issue afiada. Toda contribuição real conta, e a parede abaixo é a forma mais simples de dizer isso em voz alta.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Contribuidoras e contribuidores do Open Design" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Contribuidoras e contribuidores do Open Design" />
</a>
Se você acabou de mandar seu primeiro PR — bem-vindo. A label [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) é o ponto de entrada.
@ -747,9 +747,9 @@ O SVG acima é regenerado diariamente por [`.github/workflows/metrics.yml`](.git
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Histórico de estrelas do Open Design" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Histórico de estrelas do Open Design" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -729,7 +729,7 @@ Issues, PR, новые skills и новые design systems приветству
Спасибо всем, кто помогает двигать Open Design вперёд — кодом, документацией, обратной связью, новыми skills, новыми design systems или просто точным issue. Вклад любого реального масштаба здесь важен, а стена ниже — самый простой способ сказать это вслух.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Contributors Open Design" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Contributors Open Design" />
</a>
Если вы только что отправили свой первый PR — добро пожаловать. Метка [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) — хорошая точка входа.
@ -746,9 +746,9 @@ SVG выше ежедневно пересобирается workflow [`.github/
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -887,7 +887,7 @@ Tam walkthrough, merge çıtası, code style ve kabul etmediklerimiz → [`CONTR
Thanks to everyone who has moved Open Design forward with code, docs, feedback, a new skill, a new design system, or a sharp issue. Every real contribution matters; the wall below is the easiest way to say so out loud.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design contributors" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design contributors" />
</a>
If you have sent your first PR, welcome. The [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) label is the entry point.
@ -904,9 +904,9 @@ Yukarıdaki SVG [`.github/workflows/metrics.yml`](.github/workflows/metrics.yml)
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -729,7 +729,7 @@ OD не зупиняється на коді. Та сама поверхня ч
Thank you to everyone who has helped push Open Design forward, through code, documentation, feedback, new skills, new design systems, or even a sharp issue. Every real contribution counts, and the wall below is the simplest way to say so out loud.
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design contributors" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design contributors" />
</a>
If you have merged your first PR, welcome. The [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) label is the entry point.
@ -746,9 +746,9 @@ SVG вище перегенерується щодня [`.github/workflows/metri
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -722,7 +722,7 @@ Daemon 启动时从 `PATH` 自动检测,无需配置。流式分发逻辑在 [
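Thank you to everyone who has made Open Design better. Whether you write code, fix docs, file issues, add a skill, or add a design system, every real contribution is remembered. The wall below is the most direct way to say "Thank you".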
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design contributors" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design contributors" />
</a>
Opening your first PR? The [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) labels are a good place to start.
@ -739,9 +739,9 @@ Daemon 启动时从 `PATH` 自动检测,无需配置。流式分发逻辑在 [
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -713,7 +713,8 @@ open-design/
│ └── browser-chrome.html
├── templates/
│ └── deck-framework.html ← deck baseline (nav / counter / print)
│ ├── deck-framework.html ← deck baseline (nav / counter / print)
│ └── kami-deck.html ← kami-style deck starter (parchment / ink-blue serif)
├── scripts/
│ └── sync-design-systems.ts ← re-imports from the upstream awesome-design-md tarball
@ -888,7 +889,7 @@ Chat / artifact 迴圈最顯眼,但這套倉庫裡還有幾個能力被埋得
| Deployable to Vercel | ❌ | ❌ | **✅** |
| Agent runtime | Built-in (Opus 4.7) | Built-in ([`pi-ai`][piai]) | **Delegated to the CLI the user already has installed** |
| Skills | Proprietary | 12 custom TS modules + `SKILL.md` | **31 file-based [`SKILL.md`][skill] bundles, drop-in** |
| Design systems | Proprietary | `DESIGN.md` (v0.2 roadmap) | **`DESIGN.md` × 72 systems, shipped out of the box** |
| Design systems | Proprietary | `DESIGN.md` (v0.2 roadmap) | **`DESIGN.md` × 129 systems, shipped out of the box** |
| Provider flexibility | Anthropic only | 7+ [`pi-ai`][piai] | **16 CLI adapters + OpenAI-compatible BYOK proxy** |
| Init question form | ❌ | ❌ | **✅ hard rule, turn 1** |
| Direction picker | ❌ | ❌ | **✅ 5 deterministic directions** |
@ -1005,7 +1006,7 @@ Daemon 啟動時從 `PATH` 自動檢測,無需配置。流式分發邏輯在 [
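Thank you to everyone who has made Open Design better. Whether you write code, fix docs, file issues, add a skill, or add a design system, every real contribution is remembered. The wall below is the most direct way to say "Thank you".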
<a href="https://github.com/nexu-io/open-design/graphs/contributors">
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-28" alt="Open Design contributors" />
<img src="https://contrib.rocks/image?repo=nexu-io/open-design&cache_bust=2026-05-30" alt="Open Design contributors" />
</a>
Opening your first PR? The [`good-first-issue`/`help-wanted`](https://github.com/nexu-io/open-design/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22%2C%22help+wanted%22) labels are a good place to start.
@ -1022,9 +1023,9 @@ Daemon 啟動時從 `PATH` 自動檢測,無需配置。流式分發邏輯在 [
<a href="https://star-history.com/#nexu-io/open-design&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-28" />
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&theme=dark&cache_bust=2026-05-30" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
<img alt="Open Design star history" src="https://api.star-history.com/svg?repos=nexu-io/open-design&type=Date&cache_bust=2026-05-30" />
</picture>
</a>


@ -65,6 +65,7 @@ interface AttachAcpSessionOptions {
prompt: string;
cwd?: string;
model?: string | null;
imagePaths?: string[];
mcpServers?: AcpMcpServerInput[];
send: (event: string, payload: unknown) => void;
clientName?: string;
@ -116,6 +117,15 @@ function sendRpcResult(writable: RpcWritable, id: JsonRpcId, result: unknown): v
writable.write(`${JSON.stringify({ jsonrpc: '2.0', id, result })}\n`);
}
function buildPromptBlocks(prompt: string, imagePaths: string[]): Array<Record<string, string>> {
const blocks: Array<Record<string, string>> = [{ type: 'text', text: prompt }];
for (const imagePath of imagePaths) {
if (typeof imagePath !== 'string' || imagePath.trim().length === 0) continue;
blocks.push({ type: 'resource_link', uri: imagePath });
}
return blocks;
}
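A quick usage sketch for `buildPromptBlocks`, with the function body copied from the hunk above so it runs standalone. The prompt text and paths are made-up inputs; the point is only that blank or empty entries are skipped and each remaining path becomes one `resource_link` block after the text block.

```typescript
// Copied from the hunk above so the example runs on its own.
function buildPromptBlocks(prompt: string, imagePaths: string[]): Array<Record<string, string>> {
  const blocks: Array<Record<string, string>> = [{ type: 'text', text: prompt }];
  for (const imagePath of imagePaths) {
    if (typeof imagePath !== 'string' || imagePath.trim().length === 0) continue;
    blocks.push({ type: 'resource_link', uri: imagePath });
  }
  return blocks;
}

// Hypothetical input: one usable path and two entries that get skipped.
const blocks = buildPromptBlocks('Describe this mockup', ['/tmp/mock.png', '   ', '']);
console.log(JSON.stringify(blocks));
```

With an empty `imagePaths` array (the default in `attachAcpSession`), the result is the single text block that the old inline literal produced.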
function isJsonRpcId(value: unknown): value is JsonRpcId {
return typeof value === 'number' || typeof value === 'string';
}
@ -422,6 +432,7 @@ export function attachAcpSession({
prompt,
cwd,
model,
imagePaths = [],
mcpServers,
send,
clientName = 'open-design',
@ -446,6 +457,7 @@ export function attachAcpSession({
let emittedThinkingStart = false;
let emittedFirstTokenStatus = false;
let emittedTextChunk = false;
let emittedTextBuffer = '';
let finished = false;
let fatal = false;
let aborted = false;
@ -525,7 +537,7 @@ export function attachAcpSession({
'session/prompt',
{
sessionId,
prompt: [{ type: 'text', text: prompt }],
prompt: buildPromptBlocks(prompt, imagePaths),
},
'session/prompt',
);
@ -607,7 +619,12 @@ export function attachAcpSession({
if (update.sessionUpdate === 'agent_message_chunk') {
const text = asObject(update.content)?.text;
if (typeof text === 'string' && text.length > 0) {
const delta = text.startsWith(emittedTextBuffer)
? text.slice(emittedTextBuffer.length)
: text;
if (delta.length > 0) {
emittedTextChunk = true;
emittedTextBuffer += delta;
if (!emittedFirstTokenStatus) {
emittedFirstTokenStatus = true;
send('agent', {
@ -616,7 +633,8 @@ export function attachAcpSession({
ttftMs: Date.now() - runStartedAt,
});
}
send('agent', { type: 'text_delta', delta: text });
send('agent', { type: 'text_delta', delta });
}
}
return;
}
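The `agent_message_chunk` change above appears aimed at agents that resend the whole message so far instead of only the new part. A minimal sketch of that prefix check in isolation; `createDeltaTracker` is a hypothetical name for illustration and does not exist in the diff.

```typescript
// Hypothetical helper mirroring the prefix check in the hunk above.
function createDeltaTracker(): (chunk: string) => string {
  let emitted = ''; // everything forwarded to the client so far
  return (chunk: string): string => {
    // A chunk that starts with the emitted text is treated as a cumulative
    // snapshot, so only its unseen tail is new. Anything else is forwarded whole.
    const delta = chunk.startsWith(emitted) ? chunk.slice(emitted.length) : chunk;
    emitted += delta;
    return delta;
  };
}

const next = createDeltaTracker();
// Two cumulative snapshots, one exact repeat, then a true incremental chunk.
const deltas = [next('Hel'), next('Hello'), next('Hello'), next(' world')];
console.log(deltas);
```

The exact repeat yields an empty delta, which is why the diff wraps the `send` call in `if (delta.length > 0)`. One limitation of the heuristic: a truly incremental chunk that happens to begin with everything emitted so far would be trimmed as if it were a snapshot.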


@ -0,0 +1,76 @@
import { randomUUID } from 'node:crypto';
import fs from 'node:fs';
import path from 'node:path';
const STAGING_DIRNAME = '.amr-attachments';
const STAGING_MAX_AGE_MS = 24 * 60 * 60 * 1000;
function isWithinRoot(root: string, candidate: string): boolean {
const relativePath = path.relative(root, candidate);
return (
relativePath === '' ||
(relativePath.length > 0 &&
!relativePath.startsWith('..') &&
!path.isAbsolute(relativePath))
);
}
async function pruneStagedAttachments(stagingDir: string): Promise<void> {
let entries;
try {
entries = await fs.promises.readdir(stagingDir, { withFileTypes: true });
} catch {
return;
}
const cutoff = Date.now() - STAGING_MAX_AGE_MS;
await Promise.all(entries.map(async (entry) => {
if (!entry.isFile()) return;
const filePath = path.join(stagingDir, entry.name);
try {
const stat = await fs.promises.stat(filePath);
if (stat.mtimeMs < cutoff) {
await fs.promises.rm(filePath, { force: true });
}
} catch {
// Best-effort cleanup only.
}
}));
}
export async function stageAmrImagePaths(
cwd: string | null | undefined,
imagePaths: string[],
uploadRoot?: string | null,
): Promise<string[]> {
if (!cwd || !Array.isArray(imagePaths) || imagePaths.length === 0) return [];
const root = path.resolve(cwd);
const uploadRootReal = uploadRoot
? await fs.promises.realpath(uploadRoot).catch(() => null)
: null;
const stagingDir = path.join(root, STAGING_DIRNAME);
await fs.promises.mkdir(stagingDir, { recursive: true });
await pruneStagedAttachments(stagingDir);
const staged: string[] = [];
for (const inputPath of imagePaths) {
if (typeof inputPath !== 'string' || inputPath.trim().length === 0) continue;
try {
const resolved = path.resolve(inputPath);
const real = await fs.promises.realpath(resolved);
if (uploadRootReal && !isWithinRoot(uploadRootReal, real)) continue;
const stat = await fs.promises.stat(real);
if (!stat.isFile()) continue;
if (isWithinRoot(root, real)) {
staged.push(real);
continue;
}
const basename = path.basename(real);
const destination = path.join(stagingDir, `${randomUUID()}-${basename}`);
await fs.promises.copyFile(real, destination);
staged.push(destination);
} catch {
// Ignore malformed or missing files; attachments are advisory input.
}
}
return staged;
}
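The containment test in `isWithinRoot` decides whether an attachment is used in place or copied into `.amr-attachments`. A small sketch of how the `path.relative` check behaves, assuming POSIX paths: it uses `path.posix` so results do not depend on the host OS, whereas the diff uses the platform `path`. The sample paths are invented.

```typescript
import * as path from 'node:path';

// Same shape as isWithinRoot in the diff, pinned to POSIX semantics.
function isWithinRoot(root: string, candidate: string): boolean {
  const rel = path.posix.relative(root, candidate);
  return rel === '' || (!rel.startsWith('..') && !path.posix.isAbsolute(rel));
}

const checks = {
  same: isWithinRoot('/work/proj', '/work/proj'),                 // relative path is ''
  child: isWithinRoot('/work/proj', '/work/proj/img/a.png'),      // 'img/a.png'
  sibling: isWithinRoot('/work/proj', '/work/proj-other/a.png'),  // '../proj-other/a.png'
  parent: isWithinRoot('/work/proj', '/work'),                    // '..'
  dotDotName: isWithinRoot('/work/proj', '/work/proj/..cache'),   // '..cache'
};
console.log(checks);
```

The last case is worth noting: an entry directly under the root whose name begins with `..` is reported as outside, because the check is a plain `startsWith('..')`. In `stageAmrImagePaths` that errs on the safe side (the file is copied or skipped rather than trusted), but a stricter variant could compare against `'..'` and a `'../'` prefix instead.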


@ -20,6 +20,7 @@ import { projectKindToTracking } from '@open-design/contracts/analytics';
import { proxyDispatcherRequestInit, validateBaseUrlResolved } from './connectionTest.js';
import { googleStreamGenerateContentUrl } from './google-models.js';
import { parseMediaExecutionPolicyInput } from './media-policy.js';
import { createRoleMarkerGuard } from './role-marker-guard.js';
// Allowlist for the `/feedback` route. Mirrors the
// ChatMessageFeedbackReasonCode union in packages/contracts/src/api/chat.ts.
@ -549,7 +550,16 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
if (!match || match.index === undefined) break;
const frame = buffer.slice(0, match.index);
buffer = buffer.slice(match.index + match[0].length);
if (await onFrame(collectSseFrame(frame))) return;
if (await onFrame(collectSseFrame(frame))) {
// Fire-and-forget cancel: awaiting hangs on some response-stream
// implementations (notably Response built from Uint8Array body,
// exposed by tests/proxy-routes.test.ts ollama case where the
// mock body's tee'd cancel() never resolves). The cancel signal
// is a hint; we're already returning from the function, so we
// don't gain anything by blocking on it.
void reader.cancel().catch(() => {});
return;
}
}
}
@ -575,7 +585,11 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
if (!line) continue;
try {
const data = JSON.parse(line);
if (await onFrame({ data })) return;
if (await onFrame({ data })) {
// See note in streamUpstreamSse — fire-and-forget cancel.
void reader.cancel().catch(() => {});
return;
}
} catch {
// Ignore malformed provider keepalive lines.
}
@ -644,6 +658,30 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
return '';
};
// Per-request role-marker guard for BYOK proxy streams (#3247).
function createDeltaGuard(sse: any) {
const guard = createRoleMarkerGuard('proxy');
return {
sendDelta(text: string) {
if (guard.contaminated || !text) return;
const safe = guard.feedText(text);
if (safe.length > 0) {
sse.send('delta', { delta: safe });
}
if (guard.contaminated) {
const warn = guard.warningEvent();
const markerText = warn?.marker ?? '## user';
sse.send('delta', {
delta: `\n\n---\n⚠ **Security warning:** The model attempted to emit a fabricated role marker (\`${markerText}\`). Response was truncated to prevent unauthorized instruction injection. See issue #3247.\n`,
});
}
},
get contaminated() {
return guard.contaminated;
},
};
}
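`createDeltaGuard` relies on `createRoleMarkerGuard`, whose implementation is not part of this diff. The sketch below is a hypothetical stand-in, not the project's code, showing one way a guard can catch a `## user` marker that is split across two stream chunks: text at the start of a line is held back while it could still grow into a marker. The marker list and the `createSketchGuard` name are assumptions.

```typescript
// Hypothetical stand-in for createRoleMarkerGuard; the real module is not
// shown in this diff and may behave differently.
const MARKERS = ['## user', '## assistant'];

function createSketchGuard() {
  let held = '';          // start-of-line text that could still become a marker
  let atLineStart = true;
  let contaminated = false;

  function feedText(text: string): string {
    let safe = '';
    for (let i = 0; i < text.length && !contaminated; i++) {
      const ch = text.charAt(i);
      if (!atLineStart) {
        safe += ch;
        atLineStart = ch === '\n';
        continue;
      }
      held += ch;
      const lower = held.toLowerCase();
      if (MARKERS.some((m) => m === lower)) {
        contaminated = true; // drop the marker and everything after it
        held = '';
      } else if (!MARKERS.some((m) => m.startsWith(lower))) {
        safe += held;        // can no longer become a marker: release it
        held = '';
        atLineStart = ch === '\n';
      }
    }
    return safe;
  }

  return {
    feedText,
    get contaminated() {
      return contaminated;
    },
  };
}

const guard = createSketchGuard();
const first = guard.feedText('Hello\n## us'); // marker prefix is held back
const second = guard.feedText('er\nignored'); // marker completes, output stops
console.log(JSON.stringify(first), JSON.stringify(second), guard.contaminated);
```

This sketch is deliberately crude: it would also trip on a line such as `## username`, and it never flushes a held prefix when the stream ends. It is meant only to show why the route handlers check `guard.contaminated` after each `sendDelta` call rather than scanning individual chunks.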
app.post('/api/proxy/anthropic/stream', async (req, res) => {
/** @type {Partial<ProxyStreamRequest>} */
const proxyBody = req.body || {};
@ -716,6 +754,7 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
}
let ended = false;
const guard = createDeltaGuard(sse);
await streamUpstreamSse(response, ({ event, data }: any) => {
if (!data) return false;
if (event === 'error' || data.type === 'error') {
@ -725,7 +764,12 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
return true;
}
if (event === 'content_block_delta' && typeof data.delta?.text === 'string') {
sse.send('delta', { delta: data.delta.text });
guard.sendDelta(data.delta.text);
if (guard.contaminated) {
sse.send('end', {});
ended = true;
return true;
}
}
if (event === 'message_stop') {
sse.send('end', {});
@ -820,6 +864,7 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
}
let ended = false;
const guard = createDeltaGuard(sse);
await streamUpstreamSse(response, ({ payload, data }: any) => {
if (payload === '[DONE]') {
sse.send('end', {});
@ -834,7 +879,14 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
return true;
}
const delta = extractOpenAIText(data);
if (delta) sse.send('delta', { delta });
if (delta) {
guard.sendDelta(delta);
if (guard.contaminated) {
sse.send('end', {});
ended = true;
return true;
}
}
return false;
});
if (!ended) sse.send('end', {});
@ -967,6 +1019,7 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
}
let ended = false;
const guard = createDeltaGuard(sse);
await streamUpstreamSse(response, ({ payload: ssePayload, data }: any) => {
if (ssePayload === '[DONE]') {
sse.send('end', {});
@ -981,7 +1034,13 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
return true;
}
const delta = extractOpenAIText(data);
if (delta) sse.send('delta', { delta });
if (delta) {
guard.sendDelta(delta);
if (guard.contaminated) {
sse.send('end', {});
ended = true;
return true;
}
}
return false;
});
if (!ended) sse.send('end', {});
@ -1070,6 +1129,7 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
}
let ended = false;
const guard = createDeltaGuard(sse);
await streamUpstreamSse(response, ({ data }: any) => {
if (!data) return false;
const streamError = extractStreamErrorMessage(data);
@ -1079,7 +1139,13 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
return true;
}
const delta = extractGeminiText(data);
if (delta) sse.send('delta', { delta });
if (delta) {
guard.sendDelta(delta);
if (guard.contaminated) {
sse.send('end', {});
ended = true;
return true;
}
}
const blockMessage = extractGeminiBlockMessage(data);
if (blockMessage) {
sendProxyError(sse, blockMessage, { details: data });
@ -1157,6 +1223,7 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
}
let ended = false;
const guard = createDeltaGuard(sse);
await streamUpstreamNdjson(response, ({ data }: any) => {
if (!data) return false;
if (data.done) {
@ -1165,7 +1232,14 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
return true;
}
const content = data.message?.content;
if (typeof content === 'string' && content) sse.send('delta', { delta: content });
if (typeof content === 'string' && content) {
guard.sendDelta(content);
if (guard.contaminated) {
sse.send('end', {});
ended = true;
return true;
}
}
return false;
});
if (!ended) sse.send('end', {});
@ -1335,6 +1409,7 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
let finishReason = '';
let providerError = '';
const guard = createDeltaGuard(sse);
await streamUpstreamSse(response, ({ payload, data }: any) => {
if (payload === '[DONE]') return true;
if (!data) return false;
@ -1356,7 +1431,11 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
// emit text before / after a tool_call in the same turn, and
// we want the user to see whatever the model decided to say.
if (typeof delta.content === 'string' && delta.content) {
sse.send('delta', { delta: delta.content });
guard.sendDelta(delta.content);
if (guard.contaminated) {
sse.send('end', {});
return true;
}
}
// Tool call deltas stream as fragments — `id` arrives once at


@ -1,3 +1,5 @@
import path from 'node:path';
import { redactSecrets } from './redact.js';
export interface ClaudeCliDiagnosticInput {
@ -7,6 +9,7 @@ export interface ClaudeCliDiagnosticInput {
stderrTail?: string | null;
stdoutTail?: string | null;
env?: Record<string, unknown> | null;
resolvedBin?: string | null;
}
export interface ClaudeCliDiagnostic {
@ -51,6 +54,15 @@ function withContext(
};
}
function selectedClaudeCompatibleRuntime(input: ClaudeCliDiagnosticInput): 'claude' | 'openclaude' {
if (typeof input.resolvedBin !== 'string' || !input.resolvedBin.trim()) return 'claude';
const base = path
.basename(input.resolvedBin.trim().replace(/\\/g, '/'))
.replace(/\.(exe|cmd|bat)$/i, '')
.toLowerCase();
return base === 'openclaude' ? 'openclaude' : 'claude';
}
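To make the normalisation in `selectedClaudeCompatibleRuntime` concrete, here is a path-free sketch of the same idea: take the last path segment, strip a Windows launcher extension, lowercase, and compare. `runtimeFromBin` is a hypothetical name, and splitting on `/` is a simplification of `path.basename` (for example, it treats a trailing slash differently).

```typescript
// Path-free variant of the basename normalisation above; `runtimeFromBin` is
// a hypothetical name used only for this sketch.
type Runtime = 'claude' | 'openclaude';

function runtimeFromBin(resolvedBin: string | null | undefined): Runtime {
  if (typeof resolvedBin !== 'string' || !resolvedBin.trim()) return 'claude';
  const segments = resolvedBin.trim().replace(/\\/g, '/').split('/');
  const base = (segments[segments.length - 1] ?? '')
    .replace(/\.(exe|cmd|bat)$/i, '')
    .toLowerCase();
  return base === 'openclaude' ? 'openclaude' : 'claude';
}

const runtimes = [
  runtimeFromBin('C:\\Tools\\OpenClaude.EXE'),  // Windows path, mixed case
  runtimeFromBin('/usr/local/bin/claude'),
  runtimeFromBin('/opt/openclaude/bin/claude'), // only the file name counts
  runtimeFromBin(null),                         // missing path defaults to claude
];
console.log(runtimes);
```

Only the executable's own name selects the OpenClaude wording, so a `claude` binary installed under an `openclaude` directory still gets the Claude Code diagnostics.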
export function diagnoseClaudeCliFailure(
input: ClaudeCliDiagnosticInput,
): ClaudeCliDiagnostic | null {
@ -61,6 +73,8 @@ export function diagnoseClaudeCliFailure(
const normalized = text.toLowerCase();
const hasCustomBaseUrl = envValue(input.env, 'ANTHROPIC_BASE_URL') !== null;
const hasConfigDir = envValue(input.env, 'CLAUDE_CONFIG_DIR') !== null;
const runtime = selectedClaudeCompatibleRuntime(input);
const isOpenClaude = runtime === 'openclaude';
const customEndpointConnectionFailure =
hasCustomBaseUrl &&
@ -90,6 +104,13 @@ export function diagnoseClaudeCliFailure(
);
}
if (authFailure) {
if (isOpenClaude) {
return withContext(
'OpenClaude could not authenticate with its configured endpoint.',
'The spawned OpenClaude process exited before producing a response. Check the OpenClaude API key, endpoint, and local configuration, then retry.',
input,
);
}
const configHint = hasConfigDir
? 'The configured Claude config directory may contain stale or expired auth state.'
: 'If you use multiple Claude profiles, set CLAUDE_CONFIG_DIR in Settings so Open Design spawns the same profile that works in your terminal.';
@ -147,6 +168,13 @@ export function diagnoseClaudeCliFailure(
}
if (!text.trim() && input.exitCode === 1) {
if (isOpenClaude) {
return withContext(
'OpenClaude exited before producing diagnostics.',
'Check the OpenClaude API key, endpoint, and local configuration, then retry.',
input,
);
}
const message = hasConfigDir
? 'Claude Code exited before producing diagnostics while using the configured Claude profile.'
: 'Claude Code exited before producing diagnostics.';


@ -19,6 +19,8 @@
* `tool_use` event when that block stops.
*/
import { createRoleMarkerGuard, type RoleMarkerGuard } from './role-marker-guard.js';
type StreamEvent = Record<string, unknown>;
type EventSink = (event: StreamEvent) => void;
type BlockState = { type?: unknown; name?: unknown; id?: unknown; input: string };
@ -39,18 +41,60 @@ export function createClaudeStreamHandler(onEvent: EventSink) {
// Most recent assistant message id so content_block_* events without an id
// can be attributed correctly.
let currentMessageId: string | null = null;
// Message ids that already streamed text via `stream_event` deltas.
// Message ids that already streamed assistant text/thinking via
// `stream_event` deltas.
// When `--include-partial-messages` is OFF (older Claude Code, e.g. 1.0.84
// pre-flag), no deltas arrive — only the final `assistant` wrapper carries
// text. The fallback below emits that text once, but we must skip it for
// content. The fallback below emits that content once, but we must skip it for
// newer builds that already streamed deltas, otherwise the message would
// duplicate.
const textStreamed = new Set<string>();
const thinkingStreamed = new Set<string>();
let currentMessageStreamedText = false;
let currentMessageStreamedThinking = false;
// Per-message role-marker guards for cross-chunk detection (#3247).
const roleGuards = new Map<string, RoleMarkerGuard>();
function blockKey(index: unknown): string {
return `${currentMessageId ?? 'anon'}:${index}`;
}
// Per-message role-marker guard (#3247). Covers text_delta ONLY.
//
// Why not thinking_delta: extended thinking is rendered to a
// separate `kind: 'thinking'` payload and is never folded into
// `m.content` by `buildDaemonTranscript` (apps/web/src/providers/daemon.ts),
// so it cannot be re-serialized as a turn boundary on the next
// round-trip — it is not a #3247 re-injection vector. Models
// routinely emit literal `## user` / `## assistant` lines in
// chain-of-thought when reasoning about conversation structure,
// and with kill-on-detection wired in server.ts a guard on the
// thinking channel would abort otherwise-legitimate runs without
// any compensating security benefit. See PR #3303 review
// r3324xxxxxx. Thinking is passed through unguarded; only the
// user-visible text channel is policed.
function emitSafeText(msgId: string | null, text: string, eventType: string = 'text_delta') {
if (eventType !== 'text_delta' || !msgId) {
onEvent({ type: eventType, delta: text });
return;
}
let guard = roleGuards.get(msgId);
if (!guard) {
guard = createRoleMarkerGuard(msgId);
roleGuards.set(msgId, guard);
}
if (guard.contaminated) return;
const safe = guard.feedText(text);
if (safe.length > 0) {
onEvent({ type: eventType, delta: safe });
}
if (guard.contaminated) {
const warn = guard.warningEvent();
if (warn) onEvent(warn);
}
}
function feed(chunk: string) {
buffer += chunk;
let nl;
@ -110,9 +154,12 @@ export function createClaudeStreamHandler(onEvent: EventSink) {
// covered it (older Claude Code without --include-partial-messages
// delivers text only here; newer builds stream it and would duplicate).
if (obj.type === 'assistant' && isRecord(obj.message) && Array.isArray(obj.message.content)) {
currentMessageId = typeof obj.message.id === 'string' ? obj.message.id : currentMessageId;
const msgId = typeof obj.message.id === 'string' ? obj.message.id : null;
const alreadyStreamed = msgId ? textStreamed.has(msgId) : false;
const explicitMsgId = typeof obj.message.id === 'string' ? obj.message.id : null;
const textMsgId = explicitMsgId ?? (currentMessageStreamedText ? currentMessageId : null);
const thinkingMsgId = explicitMsgId ?? (currentMessageStreamedThinking ? currentMessageId : null);
if (explicitMsgId) currentMessageId = explicitMsgId;
const textAlreadyStreamed = textMsgId ? textStreamed.has(textMsgId) : false;
const thinkingAlreadyStreamed = thinkingMsgId ? thinkingStreamed.has(thinkingMsgId) : false;
// Per-turn `stop_reason` is emitted as `turn_end` AFTER the content
// blocks have been processed (see below). When `--include-partial-
// messages` is unsupported, tool_use events surface only from the
@ -138,19 +185,19 @@ export function createClaudeStreamHandler(onEvent: EventSink) {
input: block.input ?? null,
});
} else if (
!alreadyStreamed &&
!textAlreadyStreamed &&
block.type === 'text' &&
typeof block.text === 'string' &&
block.text.length > 0
) {
onEvent({ type: 'text_delta', delta: block.text });
emitSafeText(textMsgId, block.text);
} else if (
!alreadyStreamed &&
!thinkingAlreadyStreamed &&
block.type === 'thinking' &&
typeof block.thinking === 'string' &&
block.thinking.length > 0
) {
onEvent({ type: 'thinking_delta', delta: block.thinking });
emitSafeText(thinkingMsgId, block.thinking, 'thinking_delta');
}
}
// Surface the turn_end signal now that every tool_use in this
@ -160,6 +207,8 @@ export function createClaudeStreamHandler(onEvent: EventSink) {
if (stopReason) {
onEvent({ type: 'turn_end', stopReason });
}
currentMessageStreamedText = false;
currentMessageStreamedThinking = false;
return;
}
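The hunk above replaces the single `alreadyStreamed` flag with separate text and thinking checks. A reduced sketch of why that matters, assuming a simplified sink with hypothetical `onDelta`/`onFinal` methods (the real handler parses Claude Code's stream-json lines instead):

```typescript
type Emitted = { type: 'text_delta' | 'thinking_delta'; delta: string };

// Hypothetical reduction of the handler: it tracks which message ids already
// streamed each channel and replays the final wrapper only for the rest.
function createDedupingSink(out: Emitted[]) {
  const textStreamed = new Set<string>();
  const thinkingStreamed = new Set<string>();
  return {
    onDelta(msgId: string, type: Emitted['type'], delta: string): void {
      (type === 'text_delta' ? textStreamed : thinkingStreamed).add(msgId);
      out.push({ type, delta });
    },
    onFinal(msgId: string, text: string, thinking: string): void {
      if (thinking && !thinkingStreamed.has(msgId)) out.push({ type: 'thinking_delta', delta: thinking });
      if (text && !textStreamed.has(msgId)) out.push({ type: 'text_delta', delta: text });
    },
  };
}

const events: Emitted[] = [];
const sink = createDedupingSink(events);
sink.onDelta('msg_1', 'text_delta', 'Hi');     // text streamed, thinking did not
sink.onFinal('msg_1', 'Hi', 'plan the reply'); // only thinking is replayed
sink.onFinal('msg_2', 'Done', '');             // nothing streamed: text is replayed
console.log(events);
```

With one shared set, as in the removed lines, the streamed text for `msg_1` would have suppressed the thinking fallback as well, and a message that streamed only thinking would have lost its final text.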
@ -194,7 +243,11 @@ export function createClaudeStreamHandler(onEvent: EventSink) {
function handleStreamEvent(ev: Record<string, unknown>) {
if (ev.type === 'message_start') {
// Clean up per-message role-marker guard from the previous message.
if (currentMessageId) roleGuards.delete(currentMessageId);
currentMessageId = isRecord(ev.message) && typeof ev.message.id === 'string' ? ev.message.id : null;
currentMessageStreamedText = false;
currentMessageStreamedThinking = false;
if (typeof ev.ttft_ms === 'number') {
onEvent({ type: 'status', label: 'streaming', ttftMs: ev.ttft_ms });
}
@ -217,12 +270,14 @@ export function createClaudeStreamHandler(onEvent: EventSink) {
if (delta.type === 'text_delta' && typeof delta.text === 'string') {
if (currentMessageId) textStreamed.add(currentMessageId);
onEvent({ type: 'text_delta', delta: delta.text });
currentMessageStreamedText = true;
emitSafeText(currentMessageId, delta.text);
return;
}
if (delta.type === 'thinking_delta' && typeof delta.thinking === 'string') {
if (currentMessageId) textStreamed.add(currentMessageId);
onEvent({ type: 'thinking_delta', delta: delta.thinking });
if (currentMessageId) thinkingStreamed.add(currentMessageId);
currentMessageStreamedThinking = true;
emitSafeText(currentMessageId, delta.thinking, 'thinking_delta');
return;
}
if (delta.type === 'input_json_delta' && typeof delta.partial_json === 'string') {


@ -1862,6 +1862,8 @@ async function testAgentConnectionInternal(
...(def.env || {}),
},
configuredAgentEnv,
undefined,
{ resolvedBin: executableResolution.selectedPath },
);
const env = applyAgentLaunchEnv(baseEnv, executableResolution);
const auth = await probeAgentAuthStatus(input.agentId, executableResolution.launchPath, env);
@ -2026,6 +2028,7 @@ async function testAgentConnectionInternal(
stderrTail,
stdoutTail: rawStdoutTail || buffered,
env,
resolvedBin: executableResolution.selectedPath,
});
if (claudeDiagnostic) {
console.warn(


@ -28,9 +28,11 @@
// source root so an environment that puts `skills/` itself behind a
// symlink (e.g. a content-addressable mount) is followed correctly.
import { createReadStream, createWriteStream } from 'node:fs';
import { createHash } from 'node:crypto';
import { cp, lstat, rm, stat } from 'node:fs/promises';
import { chmod, cp, lstat, mkdir, readdir, rm, stat, utimes } from 'node:fs/promises';
import path from 'node:path';
import { pipeline } from 'node:stream/promises';
export const SKILLS_CWD_ALIAS = '.od-skills';
@ -52,6 +54,46 @@ export function skillCwdAliasSegment(dir: string): string {
return `${folder}-${digest}`;
}
// copy_file_range(2) — used by fs.cp under the hood — is rejected with
// these errno codes when source and destination live on different
// filesystems (commonly EXDEV; a container image layer copied onto a
// ZFS/overlay bind mount surfaces EPERM). Node doesn't fall back to a
// userspace copy on any of them, so we do.
const RECOVERABLE_COPY_CODES = new Set(['EPERM', 'EXDEV', 'ENOTSUP', 'EOPNOTSUPP']);
type SkillCopyFn = (
source: string,
destination: string,
options: { recursive: boolean; dereference: boolean; preserveTimestamps: boolean },
) => Promise<void>;
// Recursive copy that mirrors `cp({ dereference: true })` without going
// through copy_file_range. `stat()` (not `lstat`) follows symlinks, so
// every staged entry lands as a real directory or regular file — keeping
// `.od-skills/` a self-contained write barrier even on the fallback path.
async function copyTreeDereferenced(srcDir: string, destDir: string): Promise<void> {
await mkdir(destDir, { recursive: true });
for (const entry of await readdir(srcDir, { withFileTypes: true })) {
const from = path.join(srcDir, entry.name);
const to = path.join(destDir, entry.name);
const entryStat = await stat(from);
if (entryStat.isDirectory()) {
await copyTreeDereferenced(from, to);
} else if (entryStat.isFile()) {
await pipeline(createReadStream(from), createWriteStream(to));
// createWriteStream opens the destination with the default 0644, so
// restore the source's permission bits (notably the exec bit on
// skill helper scripts) and mtime — `fs.cp` preserves these, and
// skills shell out to staged scripts. Mask to 0o777 so the
// agent-writable staging copy never inherits setuid/setgid/sticky.
await chmod(to, entryStat.mode & 0o777);
await utimes(to, entryStat.atime, entryStat.mtime);
}
// Sockets, FIFOs, and devices can't appear in a sane skill folder and
// copying them would hang or fail — skip them.
}
}
/**
* Copy `<sourceDir>` to `<cwd>/.od-skills/<folderName>/` so an agent can
* reach skill side files via a cwd-relative path. Idempotent and
@ -68,6 +110,11 @@ export async function stageActiveSkill(
folderName: string,
sourceDir: string,
log: SkillStagingLogger = () => {},
// Seam for tests: the real copy_file_range EPERM only reproduces on
// specific cross-filesystem mounts (ZFS/overlay), so tests inject a
// copy that rejects with a synthetic code to drive the fallback path.
nativeCopy: SkillCopyFn = (source, destination, options) =>
cp(source, destination, options),
): Promise<SkillStagingResult> {
if (!cwd) {
return { staged: false, reason: 'no project cwd' };
@ -123,7 +170,8 @@ export async function stageActiveSkill(
// reflected and a partially-failed previous run cannot leave junk
// behind.
await rm(stagedPath, { recursive: true, force: true });
await cp(sourceDir, stagedPath, {
try {
await nativeCopy(sourceDir, stagedPath, {
recursive: true,
// Resolve every symlink we find inside the skill so the staged
// copy is a fully self-contained set of regular files. This is
@ -133,6 +181,15 @@ export async function stageActiveSkill(
dereference: true,
preserveTimestamps: true,
});
} catch (err) {
const code = (err as NodeJS.ErrnoException).code ?? '';
if (!RECOVERABLE_COPY_CODES.has(code)) throw err;
log(
`[od] skill-stage: native copy failed (${code}); retrying with stream copy`,
);
await rm(stagedPath, { recursive: true, force: true });
await copyTreeDereferenced(sourceDir, stagedPath);
}
return { staged: true, stagedPath };
} catch (err) {
log(`[od] skill-stage failed: ${(err as Error).message}`);
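
Reviewer sketch of the fallback decision in this hunk, with the filesystem work left out so the control flow stands alone. Only the errno list is taken from the diff; `errnoCode`, `isRecoverableCopyError`, `copyWithFallback` and the injected `CopyFn` callbacks are hypothetical names used for illustration. It assumes the caller supplies both the native copy and the stream-based fallback.

```typescript
// Errno codes the diff treats as "native copy cannot work here, try userspace".
const RECOVERABLE_COPY_CODES = new Set(['EPERM', 'EXDEV', 'ENOTSUP', 'EOPNOTSUPP']);

// Hypothetical two-argument copy signature used only by this sketch.
type CopyFn = (source: string, destination: string) => Promise<void>;

// Pull an errno-style `code` string off an unknown thrown value.
function errnoCode(err: unknown): string {
  if (typeof err === 'object' && err !== null && 'code' in err) {
    const code = (err as { code?: unknown }).code;
    return typeof code === 'string' ? code : '';
  }
  return '';
}

function isRecoverableCopyError(err: unknown): boolean {
  return RECOVERABLE_COPY_CODES.has(errnoCode(err));
}

// Try the native copy first; switch to the fallback only for recoverable
// codes and rethrow everything else unchanged.
async function copyWithFallback(
  source: string,
  destination: string,
  nativeCopy: CopyFn,
  fallbackCopy: CopyFn,
): Promise<'native' | 'fallback'> {
  try {
    await nativeCopy(source, destination);
    return 'native';
  } catch (err) {
    if (!isRecoverableCopyError(err)) throw err;
    await fallbackCopy(source, destination);
    return 'fallback';
  }
}
```

The hunk additionally removes the partially written destination before retrying; a real fallback callback would need to do the same so a half-finished native copy cannot leak into the staged tree.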


@ -202,6 +202,14 @@ function migrate(db: SqliteDb): void {
FOREIGN KEY(routine_id) REFERENCES routines(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS routine_schedule_claims (
routine_id TEXT NOT NULL,
slot_at INTEGER NOT NULL,
claimed_at INTEGER NOT NULL,
PRIMARY KEY(routine_id, slot_at),
FOREIGN KEY(routine_id) REFERENCES routines(id) ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS idx_routine_runs_routine
ON routine_runs(routine_id, started_at DESC);
`);
@ -744,12 +752,23 @@ export function listConversations(db: SqliteDb, projectId: string) {
AND m.run_status IS NOT NULL
)
WHERE rn = 1
),
total_run_durations AS (
SELECT m.conversation_id AS conversationId,
SUM(${terminalRunDurationSql('m')}) AS totalDurationMs
FROM messages m
JOIN project_conversations c ON c.id = m.conversation_id
WHERE m.role = 'assistant'
AND m.run_status IN ('succeeded', 'failed', 'canceled')
GROUP BY m.conversation_id
)
SELECT c.id, c.projectId, c.title, c.createdAt, c.updatedAt,
lr.latestRunStatus, lr.latestRunStartedAt,
lr.latestRunEndedAt, lr.latestRunEventsJson
lr.latestRunEndedAt, lr.latestRunEventsJson,
trd.totalDurationMs
FROM project_conversations c
LEFT JOIN latest_runs lr ON lr.conversationId = c.id
LEFT JOIN total_run_durations trd ON trd.conversationId = c.id
ORDER BY c.updatedAt DESC`,
)
.all(projectId)).map(normalizeConversation);
@ -767,6 +786,7 @@ export function getConversation(db: SqliteDb, id: string) {
return {
...normalizeConversation(r),
latestRun: latestConversationRunSummary(db, r.id) ?? undefined,
...numberProperty('totalDurationMs', totalConversationRunDurationMs(db, r.id)),
};
}
@ -783,10 +803,16 @@ function normalizeConversation(r: DbRow) {
title: r.title ?? null,
createdAt: Number(r.createdAt),
updatedAt: Number(r.updatedAt),
...numberProperty('totalDurationMs', r.totalDurationMs),
latestRun: latestRun ?? undefined,
};
}
function numberProperty(key: string, value: unknown) {
const n = value == null ? undefined : Number(value);
return typeof n === 'number' && Number.isFinite(n) ? { [key]: n } : {};
}
function latestConversationRunSummary(db: SqliteDb, conversationId: string) {
const row = db
.prepare(
@ -805,6 +831,50 @@ function latestConversationRunSummary(db: SqliteDb, conversationId: string) {
return conversationRunSummaryFromRow(row);
}
function totalConversationRunDurationMs(db: SqliteDb, conversationId: string): number | undefined {
const row = db
.prepare(
`SELECT SUM(${terminalRunDurationSql()}) AS totalDurationMs
FROM messages
WHERE conversation_id = ?
AND role = 'assistant'
AND run_status IN ('succeeded', 'failed', 'canceled')`,
)
.get(conversationId) as DbRow | undefined;
return row?.totalDurationMs == null ? undefined : Number(row.totalDurationMs);
}
function terminalRunDurationSql(alias?: string) {
const p = alias ? `${alias}.` : '';
return `CASE
WHEN ${p}started_at IS NOT NULL AND ${p}ended_at IS NOT NULL THEN
CASE
WHEN CAST(${p}ended_at AS INTEGER) >= CAST(${p}started_at AS INTEGER)
THEN CAST(${p}ended_at AS INTEGER) - CAST(${p}started_at AS INTEGER)
ELSE 0
END
ELSE (
SELECT CASE
WHEN json_extract(usage_event.value, '$.durationMs') >= 0
THEN json_extract(usage_event.value, '$.durationMs')
ELSE 0
END
FROM json_each(
CASE
WHEN json_valid(${p}events_json) AND json_type(${p}events_json) = 'array'
THEN ${p}events_json
ELSE '[]'
END
) AS usage_event
WHERE usage_event.type = 'object'
AND json_extract(usage_event.value, '$.kind') = 'usage'
AND json_type(usage_event.value, '$.durationMs') IN ('integer', 'real')
ORDER BY CAST(usage_event.key AS INTEGER) DESC
LIMIT 1
)
END`;
}
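
Reviewer sketch: the per-run rule that `terminalRunDurationSql` encodes, restated as plain TypeScript to make the two branches easier to check. `RunRow`, `terminalRunDurationMs` and `totalConversationDurationMs` are hypothetical names. The sketch assumes the caller has already filtered rows to assistant messages with a terminal `run_status`, as the SQL `WHERE` clauses do.

```typescript
// Hypothetical row shape; field names mirror the columns used by the SQL.
interface RunRow {
  startedAt: number | null;
  endedAt: number | null;
  eventsJson: string | null;
}

// Duration of one terminal run, or undefined when nothing usable is recorded
// (the SQL yields NULL there, which SUM ignores).
function terminalRunDurationMs(row: RunRow): number | undefined {
  if (row.startedAt != null && row.endedAt != null) {
    // Timestamps win; a negative span is clamped to 0.
    return Math.max(0, Math.trunc(row.endedAt) - Math.trunc(row.startedAt));
  }
  let events: unknown;
  try {
    events = JSON.parse(row.eventsJson ?? '[]');
  } catch {
    return undefined; // invalid JSON behaves like an empty event list
  }
  if (!Array.isArray(events)) return undefined;
  // Walk backwards so the last usage event with a numeric durationMs wins.
  for (let i = events.length - 1; i >= 0; i -= 1) {
    const ev: unknown = events[i];
    if (typeof ev !== 'object' || ev === null || Array.isArray(ev)) continue;
    const rec = ev as Record<string, unknown>;
    if (rec.kind !== 'usage') continue;
    const d = rec.durationMs;
    if (typeof d !== 'number' || !Number.isFinite(d)) continue;
    return Math.max(0, d);
  }
  return undefined;
}

// Sum over runs; stays undefined when no run contributed a duration.
function totalConversationDurationMs(rows: RunRow[]): number | undefined {
  let total: number | undefined;
  for (const row of rows) {
    const d = terminalRunDurationMs(row);
    if (d !== undefined) total = (total ?? 0) + d;
  }
  return total;
}
```

Keeping the total `undefined` rather than `0` when nothing is recorded matches how `numberProperty` omits `totalDurationMs` from the response for a null value.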
function conversationRunSummaryFromRow(row: DbRow | undefined) {
if (!row || typeof row.runStatus !== 'string') return null;
const startedAt = row.startedAt == null ? undefined : Number(row.startedAt);
@ -1495,6 +1565,41 @@ export function insertRoutineRun(db: SqliteDb, r: DbRow) {
return getRoutineRun(db, r.id);
}
export function insertScheduledRoutineRun(db: SqliteDb, r: DbRow, slotAt: number) {
const insertClaim = db.prepare(
`INSERT OR IGNORE INTO routine_schedule_claims
(routine_id, slot_at, claimed_at)
VALUES (?, ?, ?)`,
);
const insertRun = db.prepare(
`INSERT INTO routine_runs
(id, routine_id, trigger, status, project_id, conversation_id,
agent_run_id, started_at, completed_at, summary, error, error_code)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
);
const tx = db.transaction(() => {
const claim = insertClaim.run(r.routineId, slotAt, Date.now());
if (claim.changes === 0) return false;
insertRun.run(
r.id,
r.routineId,
r.trigger,
r.status,
r.projectId,
r.conversationId,
r.agentRunId,
r.startedAt,
r.completedAt ?? null,
r.summary ?? null,
r.error ?? null,
r.errorCode ?? null,
);
return true;
});
if (!tx()) return null;
return getRoutineRun(db, r.id);
}
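
Reviewer sketch: the claim-then-insert pattern above, modelled in memory to show why a second scheduler tick for the same slot produces no run. `ScheduleClaims` and `insertScheduledRun` are hypothetical; a `Map` keyed on routine and slot stands in for the `(routine_id, slot_at)` primary key plus `INSERT OR IGNORE`, and the release-on-throw branch stands in for the transaction rolling back.

```typescript
// In-memory stand-in for the routine_schedule_claims table.
class ScheduleClaims {
  private readonly claimedAt = new Map<string, number>();

  private static key(routineId: string, slotAt: number): string {
    return `${routineId}\u0000${slotAt}`;
  }

  // Returns true only for the first caller of a given (routineId, slotAt).
  tryClaim(routineId: string, slotAt: number, now: number = Date.now()): boolean {
    const key = ScheduleClaims.key(routineId, slotAt);
    if (this.claimedAt.has(key)) return false;
    this.claimedAt.set(key, now);
    return true;
  }

  release(routineId: string, slotAt: number): void {
    this.claimedAt.delete(ScheduleClaims.key(routineId, slotAt));
  }
}

// Claim the slot, then insert the run; a lost claim yields null and no insert.
function insertScheduledRun<T>(
  claims: ScheduleClaims,
  routineId: string,
  slotAt: number,
  insertRun: () => T,
): T | null {
  if (!claims.tryClaim(routineId, slotAt)) return null;
  try {
    return insertRun();
  } catch (err) {
    // Mirror a rolled-back transaction: a failed insert must not keep the claim.
    claims.release(routineId, slotAt);
    throw err;
  }
}
```

Callers can treat `null` as "another tick already owns this slot" and skip starting the agent run.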
export function updateRoutineRun(db: SqliteDb, id: string, patch: DbRow) {
const existing = getRoutineRun(db, id);
if (!existing) return null;
@ -1504,10 +1609,14 @@ export function updateRoutineRun(db: SqliteDb, id: string, patch: DbRow) {
};
db.prepare(
`UPDATE routine_runs
SET status = ?, completed_at = ?, summary = ?, error = ?, error_code = ?
SET status = ?, project_id = ?, conversation_id = ?, agent_run_id = ?,
completed_at = ?, summary = ?, error = ?, error_code = ?
WHERE id = ?`,
).run(
merged.status,
merged.projectId,
merged.conversationId,
merged.agentRunId,
merged.completedAt ?? null,
merged.summary ?? null,
merged.error ?? null,


@ -6,6 +6,7 @@ import {
inlineRelativeAssets,
type InlineAssetReader,
} from './inline-assets.js';
import { isSandboxModeEnabled } from './sandbox-mode.js';
export interface RegisterImportRoutesDeps extends RouteDeps<'db' | 'http' | 'uploads' | 'node' | 'ids' | 'paths' | 'imports' | 'auth' | 'projectStore' | 'conversations' | 'projectFiles' | 'validation'> {}
@ -28,6 +29,11 @@ export function registerImportRoutes(app: Express, ctx: RegisterImportRoutesDeps
const { insertConversation } = ctx.conversations;
const { setTabs } = ctx.projectFiles;
const { validateProjectDesignSystemId } = ctx.validation;
const rejectSandboxFolderImport = () =>
isSandboxModeEnabled(process.env)
? 'folder imports are disabled when OD_SANDBOX_MODE is enabled'
: null;
app.post(
'/api/import/claude-design',
importUpload.single('file'),
@ -107,6 +113,10 @@ export function registerImportRoutes(app: Express, ctx: RegisterImportRoutesDeps
if (typeof baseDir !== 'string' || !baseDir.trim()) {
return sendApiError(res, 400, 'BAD_REQUEST', 'baseDir required');
}
const sandboxReason = rejectSandboxFolderImport();
if (sandboxReason) {
return sendApiError(res, 400, 'BAD_REQUEST', sandboxReason);
}
let trustedPickerImport = false;
if (isDesktopAuthGateActive()) {
const secret = desktopAuthSecret();
@ -204,6 +214,10 @@ export function registerImportRoutes(app: Express, ctx: RegisterImportRoutesDeps
if (typeof baseDir !== 'string' || !baseDir.trim()) {
return sendApiError(res, 400, 'BAD_REQUEST', 'baseDir required');
}
const sandboxReason = rejectSandboxFolderImport();
if (sandboxReason) {
return sendApiError(res, 400, 'BAD_REQUEST', sandboxReason);
}
let trustedPickerImport = false;
if (isDesktopAuthGateActive()) {
const secret = desktopAuthSecret();


@ -41,6 +41,7 @@ import path from 'node:path';
import { MEDIA_PROVIDERS } from './media-models.js';
import { expandHomePrefix } from './home-expansion.js';
import { resolveXAIBearer } from './xai-credentials.js';
import { isSandboxModeEnabled } from './sandbox-mode.js';
const PROVIDER_IDS = MEDIA_PROVIDERS.map((p) => p.id);
type ProviderEntry = { apiKey?: string; baseUrl?: string; model?: string };
@ -286,54 +287,19 @@ async function readJsonIfPresent(file: string): Promise<JsonRecord | null> {
}
}
function tokenFromHermesAuth(data: unknown): string {
const providerToken = readNestedString(data, [
'providers',
'openai-codex',
'tokens',
'access_token',
]);
if (providerToken) return providerToken;
const pool =
isRecord(data) && isRecord(data.credential_pool)
? data.credential_pool['openai-codex']
: null;
if (Array.isArray(pool)) {
for (const item of pool) {
const token = readNestedString(item, ['access_token']);
if (token) return token;
}
}
return '';
function apiKeyFromCodexAuth(data: unknown): string {
return readNestedString(data, ['OPENAI_API_KEY']);
}
function tokenFromCodexAuth(data: unknown): { token: string; source: string } | null {
const oauthToken = readNestedString(data, ['tokens', 'access_token']);
if (oauthToken) return { token: oauthToken, source: 'oauth-codex' };
const apiKey = readNestedString(data, ['OPENAI_API_KEY']);
if (apiKey) return { token: apiKey, source: 'codex-auth' };
return null;
}
async function resolveOpenAIOAuthCredential(): Promise<OAuthCredential | null> {
async function resolveOpenAIAuthFileCredential(): Promise<OAuthCredential | null> {
if (isSandboxModeEnabled(process.env)) return null;
const home = os.homedir();
const hermesAuth = await readJsonIfPresent(
path.join(home, '.hermes', 'auth.json'),
);
const hermesToken = tokenFromHermesAuth(hermesAuth);
if (hermesToken) {
return { apiKey: hermesToken, source: 'oauth-hermes' };
}
const codexAuth = await readJsonIfPresent(
path.join(home, '.codex', 'auth.json'),
);
const codexToken = tokenFromCodexAuth(codexAuth);
if (codexToken) {
return { apiKey: codexToken.token, source: codexToken.source };
const apiKey = apiKeyFromCodexAuth(codexAuth);
if (apiKey) {
return { apiKey, source: 'codex-auth' };
}
return null;
@ -354,10 +320,10 @@ async function resolveXAIOAuthCredential(
};
}
if (isSandboxModeEnabled(process.env)) return null;
// 2. Borrow the xAI OAuth token Hermes wrote to ~/.hermes/auth.json
// when the user ran `hermes auth add xai-oauth`. Mirrors how
// resolveOpenAIOAuthCredential already borrows the openai-codex
// token from the same file, so a user who has already authorized
// when the user ran `hermes auth add xai-oauth`. A user who has already authorized
// Hermes doesn't have to run a second OAuth dance inside OD.
// (No proactive refresh here — Hermes itself maintains the token,
// and we only borrow what is currently fresh.)
@ -380,23 +346,25 @@ async function resolveXAIOAuthCredential(
/**
* Resolve credentials for a provider. Env vars win, then stored config,
* then OpenAI/Codex OAuth for the OpenAI media provider.
* then provider-specific external credential stores. OpenAI only trusts
* explicit API keys from Codex auth files; Codex/Hermes OAuth tokens are
* not valid proof that the Images API can be called.
* Returns { apiKey, baseUrl } where either may be an empty string.
*/
export async function resolveProviderConfig(projectRoot: string, providerId: string): Promise<ProviderEntry> {
const stored = await readStored(projectRoot);
const entry = stored[providerId] || {};
const envKey = readEnvKey(providerId);
const needsOAuthFallback = !envKey && !entry.apiKey;
const oauth = needsOAuthFallback
const needsExternalCredential = !envKey && !entry.apiKey;
const externalCredential = needsExternalCredential
? providerId === 'openai'
? await resolveOpenAIOAuthCredential()
? await resolveOpenAIAuthFileCredential()
: providerId === 'grok'
? await resolveXAIOAuthCredential(projectRoot)
: null
: null;
return {
apiKey: envKey || entry.apiKey || oauth?.apiKey || '',
apiKey: envKey || entry.apiKey || externalCredential?.apiKey || '',
baseUrl: entry.baseUrl || '',
...(typeof entry.model === 'string' && entry.model.trim()
? { model: entry.model.trim() }
@ -427,20 +395,20 @@ export async function readMaskedConfig(projectRoot: string): Promise<MaskedConfi
const entry = stored[id] || {};
const envKey = readEnvKey(id);
const hasStoredKey = typeof entry.apiKey === 'string' && entry.apiKey.length > 0;
const needsOAuthFallback = !envKey && !hasStoredKey;
const oauth = needsOAuthFallback
const needsExternalCredential = !envKey && !hasStoredKey;
const externalCredential = needsExternalCredential
? id === 'openai'
? await resolveOpenAIOAuthCredential()
? await resolveOpenAIAuthFileCredential()
: id === 'grok'
? await resolveXAIOAuthCredential(projectRoot)
: null
: null;
providers[id] = {
configured: Boolean(envKey || hasStoredKey || oauth?.apiKey),
source: envKey ? 'env' : hasStoredKey ? 'stored' : oauth?.source || 'unset',
configured: Boolean(envKey || hasStoredKey || externalCredential?.apiKey),
source: envKey ? 'env' : hasStoredKey ? 'stored' : externalCredential?.source || 'unset',
// Show last 4 chars only when stored locally; never echo env-var
// or OAuth secrets so power users don't accidentally see them in
// the DOM.
// or borrowed auth-file/OAuth secrets so power users don't
// accidentally see them in the DOM.
apiKeyTail: hasStoredKey && entry.apiKey ? entry.apiKey.slice(-4) : '',
baseUrl: entry.baseUrl || '',
...(typeof entry.model === 'string' && entry.model.trim()
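
Reviewer sketch: the precedence that `resolveProviderConfig` and `readMaskedConfig` share after this change, reduced to a synchronous function. `resolveCredential`, `ExternalCredential` and the `loadExternal` callback are hypothetical. The diff's versions are async and pick the loader per provider. The external store is consulted only when neither an env var nor a stored key exists.

```typescript
// Hypothetical shape of a credential borrowed from an external store.
interface ExternalCredential {
  apiKey: string;
  source: string;
}

interface ResolvedCredential {
  apiKey: string;
  source: string;
  configured: boolean;
  apiKeyTail: string;
}

// Env var wins, then the stored key, then a lazily loaded external credential.
function resolveCredential(
  envKey: string,
  storedKey: string,
  loadExternal: () => ExternalCredential | null,
): ResolvedCredential {
  const external = !envKey && !storedKey ? loadExternal() : null;
  const apiKey = envKey || storedKey || external?.apiKey || '';
  const source = envKey ? 'env' : storedKey ? 'stored' : external?.source || 'unset';
  return {
    apiKey,
    source,
    configured: Boolean(apiKey),
    // Only a locally stored key ever exposes its last four characters.
    apiKeyTail: storedKey ? storedKey.slice(-4) : '',
  };
}
```

In sandbox mode the diff's OpenAI loader returns `null` before touching the home directory, which in this sketch corresponds to a `loadExternal` that always returns `null`.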


@ -37,7 +37,7 @@ export const MEDIA_PROVIDERS: MediaProvider[] = [
{ id: 'hyperframes', label: 'HyperFrames', hint: 'Local HTML -> MP4 renderer', integrated: true, credentialsRequired: false, settingsVisible: false },
{ id: 'nanobanana', label: 'Nano Banana', hint: 'Google official by default; custom gateway configurable', integrated: true, defaultBaseUrl: 'https://generativelanguage.googleapis.com', supportsCustomModel: true },
{ id: 'imagerouter', label: 'ImageRouter', hint: 'OpenAI-compatible image + video routing', integrated: true, defaultBaseUrl: 'https://api.imagerouter.io/v1/openai', docsUrl: 'https://docs.imagerouter.io/api-reference/image-generation/', supportsCustomModel: true, customModelPlaceholder: 'openai/gpt-image-2 or xAI/grok-imagine-video' },
{ id: 'custom-image', label: 'Custom Image API', hint: 'OpenAI-compatible /v1/images/generations (local or cloud)', integrated: true, docsUrl: 'https://platform.openai.com/docs/api-reference/images', supportsCustomModel: true, customModelPlaceholder: 'my-image-model' },
{ id: 'custom-image', label: 'Custom Image API', hint: 'OpenAI-compatible images/generations + images/edits (local or cloud)', integrated: true, docsUrl: 'https://platform.openai.com/docs/api-reference/images', supportsCustomModel: true, customModelPlaceholder: 'my-image-model' },
{ id: 'comfyui', label: 'ComfyUI', hint: 'Local JSON workflow server (planned adapter)', integrated: false, defaultBaseUrl: 'http://127.0.0.1:8188', docsUrl: 'https://docs.comfy.org/development/core-concepts/workflow' },
{ id: 'bfl', label: 'Black Forest Labs', hint: 'FLUX 1.1 Pro / FLUX Pro / Dev', integrated: false, defaultBaseUrl: 'https://api.bfl.ai' },
{ id: 'fal', label: 'Fal.ai', hint: 'Sora / Seedance / Veo / FLUX', integrated: false, defaultBaseUrl: 'https://fal.run' },
@ -93,7 +93,7 @@ export const IMAGE_MODELS: MediaModel[] = [
{ id: 'openai/gpt-image-1.5', label: 'openai/gpt-image-1.5', hint: 'ImageRouter · routed GPT Image', provider: 'imagerouter', caps: ['t2i'] },
{ id: 'black-forest-labs/FLUX-1.1-pro', label: 'FLUX-1.1-pro', hint: 'ImageRouter · Black Forest Labs', provider: 'imagerouter', caps: ['t2i'] },
{ id: 'custom-image', label: 'custom-image', hint: 'Custom · OpenAI-compatible endpoint', provider: 'custom-image', caps: ['t2i'] },
{ id: 'custom-image', label: 'custom-image', hint: 'Custom · OpenAI-compatible endpoint', provider: 'custom-image', caps: ['t2i', 'i2i'] },
{ id: 'flux-1.1-pro', label: 'flux-1.1-pro', hint: 'BFL · flagship', provider: 'bfl', caps: ['t2i', 'i2i'] },
{ id: 'flux-pro', label: 'flux-pro', hint: 'BFL', provider: 'bfl', caps: ['t2i'] },


@ -30,7 +30,8 @@
// * provider 'imagerouter'→ ImageRouter OpenAI-compatible image/video
// generation endpoints
// * provider 'custom-image'→ user-supplied OpenAI-compatible
// /v1/images/generations endpoint
// /v1/images/generations + /v1/images/edits
// endpoints
//
// The fallback stub handlers are gated behind OD_MEDIA_ALLOW_STUBS=1; in
// release builds they throw StubProviderDisabledError (mapped to HTTP
@ -709,7 +710,7 @@ function withMediaRequestInit(
async function renderOpenAIImage(ctx: MediaContext, credentials: ProviderConfig): Promise<RenderResult> {
if (!credentials.apiKey) {
throw new Error('no OpenAI credential — configure an API key in Settings, set OPENAI_API_KEY, or refresh Codex/Hermes OAuth');
throw new Error('no OpenAI credential — configure an API key in Settings or set OPENAI_API_KEY');
}
const rawBase = credentials.baseUrl || 'https://api.openai.com/v1';
const azure = detectAzureEndpoint(rawBase);
@ -866,7 +867,7 @@ async function renderCustomOpenAIImage(ctx: MediaContext, credentials: ProviderC
const baseUrl = (credentials.baseUrl || '').trim();
if (!baseUrl) {
throw new Error(
'Custom Image API base URL required — configure a /v1/images/generations compatible endpoint in Settings',
'Custom Image API base URL required — configure an OpenAI-compatible /v1/images/generations or /v1/images/edits endpoint in Settings',
);
}
const wireModel = (
@ -891,8 +892,14 @@ async function renderCustomOpenAIImage(ctx: MediaContext, credentials: ProviderC
n: 1,
size: openaiSizeFor('gpt-image-1', ctx.aspect),
};
let url = buildOpenAIImageUrl(baseUrl, false);
if (ctx.imageRef?.dataUrl) {
body.response_format = 'b64_json';
body.images = [{ image_url: ctx.imageRef.dataUrl }];
url = buildOpenAIImageEditUrl(baseUrl);
}
const resp = await fetch(buildOpenAIImageUrl(baseUrl, false), withMediaRequestInit(ctx, {
const resp = await fetch(url, withMediaRequestInit(ctx, {
method: 'POST',
headers,
body: JSON.stringify(body),
@ -988,19 +995,34 @@ function detectAzureEndpoint(baseUrl: string): boolean {
* appending the default api-version for Azure when the user didn't
* specify one. Returns a string ready for `fetch`.
*/
function normalizeOpenAICompatiblePath(pathname: string, endpoint: 'images' | 'videos', mode: 'generations' | 'edits'): string {
const strippedPath = pathname.replace(/\/+$/, '');
const generationsSuffix = `/${endpoint}/generations`;
const editsSuffix = endpoint === 'images' ? '/images/edits' : null;
if (strippedPath.endsWith(generationsSuffix)) {
if (mode === 'generations') return strippedPath;
return endpoint === 'images'
? `${strippedPath.slice(0, -generationsSuffix.length)}${editsSuffix}`
: strippedPath;
}
if (editsSuffix && strippedPath.endsWith(editsSuffix)) {
if (mode === 'edits') return strippedPath;
return `${strippedPath.slice(0, -editsSuffix.length)}${generationsSuffix}`;
}
return mode === 'edits' && editsSuffix
? `${strippedPath}${editsSuffix}`
: `${strippedPath}${generationsSuffix}`;
}
function buildOpenAICompatibleGenerationUrl(baseUrl: string, endpoint: 'images' | 'videos'): string {
const suffix = `/${endpoint}/generations`;
let parsed;
try {
parsed = new URL(baseUrl);
} catch {
const stripped = baseUrl.replace(/\/$/, '');
return stripped.endsWith(suffix) ? stripped : `${stripped}${suffix}`;
}
const strippedPath = parsed.pathname.replace(/\/+$/, '');
if (!strippedPath.endsWith(suffix)) {
parsed.pathname = `${strippedPath}${suffix}`;
return normalizeOpenAICompatiblePath(stripped, endpoint, 'generations');
}
parsed.pathname = normalizeOpenAICompatiblePath(parsed.pathname, endpoint, 'generations');
return parsed.toString();
}
@ -1019,6 +1041,18 @@ function buildOpenAIImageUrl(baseUrl: string, isAzure: boolean): string {
return parsed.toString();
}
function buildOpenAIImageEditUrl(baseUrl: string): string {
let parsed;
try {
parsed = new URL(baseUrl);
} catch {
const stripped = baseUrl.replace(/\/$/, '');
return normalizeOpenAICompatiblePath(stripped, 'images', 'edits');
}
parsed.pathname = normalizeOpenAICompatiblePath(parsed.pathname, 'images', 'edits');
return parsed.toString();
}
function buildOpenAIVideoUrl(baseUrl: string): string {
return buildOpenAICompatibleGenerationUrl(baseUrl, 'videos');
}
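
Reviewer sketch: what the path normalization does for the images endpoint, condensed into one helper so the suffix swapping can be checked by hand. `imagesPathFor` is a hypothetical restatement. It covers only `images` and leaves out the `videos` branch, which has no edits variant in the diff.

```typescript
// Normalize a base path so it ends in exactly one of the two image endpoints.
function imagesPathFor(pathname: string, mode: 'generations' | 'edits'): string {
  const stripped = pathname.replace(/\/+$/, '');
  const generations = '/images/generations';
  const edits = '/images/edits';
  const target = mode === 'edits' ? edits : generations;
  // If the user already pasted a full endpoint, swap its suffix for the
  // requested one instead of appending a second suffix.
  for (const suffix of [generations, edits]) {
    if (stripped.endsWith(suffix)) {
      return `${stripped.slice(0, -suffix.length)}${target}`;
    }
  }
  return `${stripped}${target}`;
}

// A bare base path gains the requested suffix.
console.log(imagesPathFor('/v1/', 'edits'));
// A pasted generations endpoint is rewritten for an edit request.
console.log(imagesPathFor('/v1/images/generations', 'edits'));
// A pasted edits endpoint still works for plain generation.
console.log(imagesPathFor('/v1/images/edits/', 'generations'));
```

This is what lets the Custom Image API setting hold a single base URL while `renderCustomOpenAIImage` switches endpoint depending on whether `ctx.imageRef` carries a data URL.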
@ -1083,7 +1117,7 @@ function openaiSpeechFormatFor(fileName: string): string {
async function renderOpenAISpeech(ctx: MediaContext, credentials: ProviderConfig, fileName: string): Promise<RenderResult> {
if (!credentials.apiKey) {
throw new Error('no OpenAI credential — configure an API key in Settings, set OPENAI_API_KEY, or refresh Codex/Hermes OAuth');
throw new Error('no OpenAI credential — configure an API key in Settings or set OPENAI_API_KEY');
}
const rawBase = credentials.baseUrl || 'https://api.openai.com/v1';
const azure = detectAzureEndpoint(rawBase);


@ -61,9 +61,6 @@ import {
} from './memory-extractions.js';
import { resolveProviderConfig } from './media-config.js';
import { spawn } from 'node:child_process';
import { promises as fsp } from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { createCommandInvocation } from '@open-design/platform';
import {
applyAgentLaunchEnv,
@ -789,16 +786,6 @@ function extractJsonEventText(kind, raw, agentName) {
.trim();
}
async function writeLocalCliPromptAttachment(agentId, prompt) {
const dir = await fsp.mkdtemp(path.join(os.tmpdir(), `od-memory-${agentId}-`));
const file = path.join(dir, 'prompt.md');
await fsp.writeFile(file, prompt, 'utf8');
return {
file,
cleanup: () => fsp.rm(dir, { recursive: true, force: true }).catch(() => {}),
};
}
async function callLocalCli(provider, system, user, options) {
if (typeof options?.localCliRunner === 'function') {
return options.localCliRunner({
@ -843,7 +830,6 @@ async function callLocalCli(provider, system, user, options) {
let args;
let stdinText = prompt;
let cleanupPromptAttachment = () => Promise.resolve();
let parseStdout = (raw) => raw.trim();
if (provider.agentId === 'claude') {
args = ['-p', '--input-format', 'text', '--output-format', 'text'];
@ -860,8 +846,12 @@ async function callLocalCli(provider, system, user, options) {
);
parseStdout = (raw) => extractJsonEventText(def.eventParser || def.id, raw, def.name);
} else if (provider.agentId === 'opencode') {
const attachment = await writeLocalCliPromptAttachment(provider.agentId, prompt);
cleanupPromptAttachment = attachment.cleanup;
// Deliver the prompt on stdin, matching the chat-run path
// (def.promptViaStdin). `opencode run`'s `-f, --file` is a yargs array
// option that greedily consumes every trailing non-flag token, so
// `--file <prompt-file> "<message>"` made OpenCode treat the message
// text as a second attachment and exit with "File not found". Bare
// `opencode run --format json` reads the message from stdin instead.
args = def.buildArgs(
'',
[],
@ -869,19 +859,19 @@ async function callLocalCli(provider, system, user, options) {
{ model: provider.model },
{ cwd },
);
args.push(
'--file',
attachment.file,
'Read the attached OpenDesign memory extraction prompt and return strict JSON only.',
);
stdinText = '';
parseStdout = (raw) => extractJsonEventText(def.eventParser || def.id, raw, def.name);
} else {
throw new Error(`Local CLI memory extraction is not supported for ${provider.agentId}`);
}
const env = applyAgentLaunchEnv(
spawnEnvForAgent(def.id, { ...process.env, ...(def.env || {}) }, configuredAgentEnv),
spawnEnvForAgent(
def.id,
{ ...process.env, ...(def.env || {}) },
configuredAgentEnv,
undefined,
{ resolvedBin: launch.selectedPath },
),
launch,
);
const invocation = createCommandInvocation({
@ -907,10 +897,8 @@ async function callLocalCli(provider, system, user, options) {
if (settled) return;
settled = true;
clearTimeout(timeout);
void cleanupPromptAttachment().finally(() => {
if (err) reject(err);
else resolve(text);
});
};
const timeout = setTimeout(() => {


@ -171,6 +171,48 @@ export function linkSnapshotToProject(db: SqliteDb, snapshotId: string, projectI
).run(snapshotId, projectId);
}
export function restoreProjectSnapshotLink(
db: SqliteDb,
projectId: string,
snapshotIdToDiscard: string,
previousSnapshotId: string | null | undefined,
discardedRunId?: string | null | undefined,
): void {
const previous = typeof previousSnapshotId === 'string' && previousSnapshotId.length > 0
? previousSnapshotId
: null;
db.prepare(
`UPDATE projects
SET applied_plugin_snapshot_id = ?
WHERE id = ?
AND applied_plugin_snapshot_id = ?`,
).run(previous, projectId, snapshotIdToDiscard);
const expiry = unreferencedSnapshotExpiry();
if (typeof discardedRunId === 'string' && discardedRunId.length > 0) {
const result = db.prepare(
`UPDATE applied_plugin_snapshots
SET run_id = NULL,
expires_at = ?
WHERE id = ?
AND project_id = ?
AND run_id = ?`,
).run(expiry, snapshotIdToDiscard, projectId, discardedRunId);
if (result.changes > 0) return;
}
db.prepare(
`UPDATE applied_plugin_snapshots
SET expires_at = ?
WHERE id = ?
AND run_id IS NULL
AND project_id = ?`,
).run(expiry, snapshotIdToDiscard, projectId);
}
function unreferencedSnapshotExpiry(): number | null {
const days = readPluginEnvKnobs().snapshotUnreferencedTtlDays;
return days > 0 ? Date.now() + days * 24 * 60 * 60 * 1000 : null;
}
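
Reviewer sketch: the guarded updates in `restoreProjectSnapshotLink`, modelled on plain objects so each `WHERE` condition is visible as an `if`. `ProjectRow`, `SnapshotRow` and `restoreSnapshotLink` are hypothetical, and the TTL is passed in directly where the diff reads it from `readPluginEnvKnobs()`.

```typescript
interface ProjectRow {
  id: string;
  appliedSnapshotId: string | null;
}

interface SnapshotRow {
  id: string;
  projectId: string;
  runId: string | null;
  expiresAt: number | null;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A TTL of 0 or less means "never expire", expressed as null.
function unreferencedExpiry(ttlDays: number, now: number): number | null {
  return ttlDays > 0 ? now + ttlDays * MS_PER_DAY : null;
}

function restoreSnapshotLink(
  project: ProjectRow,
  discarded: SnapshotRow,
  previousSnapshotId: string | null | undefined,
  discardedRunId: string | null | undefined,
  ttlDays: number,
  now: number,
): void {
  const previous =
    typeof previousSnapshotId === 'string' && previousSnapshotId.length > 0
      ? previousSnapshotId
      : null;
  // Compare-and-swap: only roll back if the project still points at the
  // snapshot being discarded, so a newer apply is never clobbered.
  if (project.appliedSnapshotId === discarded.id) {
    project.appliedSnapshotId = previous;
  }
  if (discarded.projectId !== project.id) return;
  const expiry = unreferencedExpiry(ttlDays, now);
  if (typeof discardedRunId === 'string' && discardedRunId.length > 0 && discarded.runId === discardedRunId) {
    // Detach the snapshot from the failed run and start its TTL clock.
    discarded.runId = null;
    discarded.expiresAt = expiry;
    return;
  }
  // Otherwise only a snapshot that no run references may be given an expiry.
  if (discarded.runId === null) discarded.expiresAt = expiry;
}
```

A snapshot still owned by a different run is left untouched by both branches, matching the SQL, where neither `UPDATE` would match that row.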
// Pin a snapshot to a conversation row. Same shape as
// `linkSnapshotToProject` but mutates `conversations.applied_plugin_snapshot_id`.
// Used when a plugin is applied inside an existing chat composer (§8.4).


@ -0,0 +1,23 @@
import path from 'node:path';
export function resolveProjectRoot(moduleDir: string): string {
const base = path.basename(moduleDir);
const daemonDir =
base === 'dist' || base === 'src' ? path.dirname(moduleDir) : moduleDir;
return path.resolve(daemonDir, '../..');
}
export function resolveProjectRootFromNestedModule(moduleDir: string): string {
let current = path.resolve(moduleDir);
while (true) {
const base = path.basename(current);
if (base === 'dist' || base === 'src') {
return resolveProjectRoot(current);
}
const parent = path.dirname(current);
if (parent === current) {
return resolveProjectRoot(moduleDir);
}
current = parent;
}
}
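
Reviewer sketch: worked inputs for the two resolvers in this new file. The functions are restated against `path.posix` so the results do not depend on the host OS. The `/repo/apps/daemon/...` and `/opt/tool/lib` layouts are invented for illustration and are not taken from the repository.

```typescript
import * as path from 'node:path';

const p = path.posix;

// Same logic as the diff, pinned to POSIX path semantics for the example.
function resolveProjectRoot(moduleDir: string): string {
  const base = p.basename(moduleDir);
  const daemonDir = base === 'dist' || base === 'src' ? p.dirname(moduleDir) : moduleDir;
  return p.resolve(daemonDir, '../..');
}

function resolveProjectRootFromNestedModule(moduleDir: string): string {
  let current = p.resolve(moduleDir);
  while (true) {
    const base = p.basename(current);
    if (base === 'dist' || base === 'src') return resolveProjectRoot(current);
    const parent = p.dirname(current);
    // Reached the filesystem root without finding dist/ or src/.
    if (parent === current) return resolveProjectRoot(moduleDir);
    current = parent;
  }
}

// Module directly inside dist/: strip dist, then go up two levels.
console.log(resolveProjectRoot('/repo/apps/daemon/dist'));
// Module nested below src/: walk up to src first, then resolve as above.
console.log(resolveProjectRootFromNestedModule('/repo/apps/daemon/src/routes/media'));
// No dist/ or src/ ancestor: fall back to two levels above the input itself.
console.log(resolveProjectRootFromNestedModule('/opt/tool/lib'));
```

The first two calls resolve to `/repo` and the last to `/opt`, which shows that the fallback branch silently assumes the same two-level nesting even when no `dist` or `src` ancestor exists.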


@ -1,4 +1,5 @@
import type { Express } from 'express';
import path from 'node:path';
import {
defaultScenarioPluginIdForProjectMetadata,
type PluginManifest,
@ -21,6 +22,125 @@ import { auditDesignSystemPackage } from './tools-connectors-cli.js';
export interface RegisterProjectRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'paths' | 'projectStore' | 'projectFiles' | 'conversations' | 'templates' | 'status' | 'events' | 'ids' | 'telemetry' | 'validation'> {}
function projectDetailResolvedDir(
projectsRoot: string,
project: any,
resolveProjectDir: (
projectsRoot: string,
projectId: string,
metadata?: unknown,
opts?: { allowUnavailableSandboxImportedProject?: boolean },
) => string,
): string {
const baseDir = typeof project?.metadata?.baseDir === 'string'
? path.normalize(project.metadata.baseDir)
: null;
if (baseDir && path.isAbsolute(baseDir)) return baseDir;
return resolveProjectDir(projectsRoot, project.id, project.metadata, {
allowUnavailableSandboxImportedProject: true,
});
}
const URL_PREVIEW_SCROLL_BRIDGE = `<script data-od-url-scroll-bridge>
(function(){
if (window.__odUrlScrollBridge) return;
window.__odUrlScrollBridge = true;
var pending = false;
function scrollElement(){
return document.querySelector('.design-canvas') || document.scrollingElement || document.documentElement;
}
function num(value){
var next = Number(value || 0);
return Number.isFinite(next) ? next : 0;
}
function post(){
var el = scrollElement();
if (!el) return;
var frame = document.scrollingElement || document.documentElement;
window.parent.postMessage({
type: 'od:preview-scroll',
canvasLeft: Math.round(el.scrollLeft || 0),
canvasTop: Math.round(el.scrollTop || 0),
frameLeft: Math.round(frame.scrollLeft || 0),
frameTop: Math.round(frame.scrollTop || 0)
}, '*');
}
function schedule(){
if (pending) return;
pending = true;
window.requestAnimationFrame(function(){
pending = false;
post();
});
}
function scrollTo(el, left, top){
if (!el) return;
if (typeof el.scrollTo === 'function') el.scrollTo(num(left), num(top));
else {
el.scrollLeft = num(left);
el.scrollTop = num(top);
}
}
function scrollBy(el, left, top){
if (!el) return;
var dx = num(left);
var dy = num(top);
if (!dx && !dy) return;
if (typeof el.scrollBy === 'function') el.scrollBy({ left: dx, top: dy, behavior: 'auto' });
else {
el.scrollLeft = (el.scrollLeft || 0) + dx;
el.scrollTop = (el.scrollTop || 0) + dy;
}
}
function requestRestore(){
window.parent.postMessage({ type: 'od:preview-scroll-request' }, '*');
}
window.addEventListener('message', function(ev){
var data = ev && ev.data;
if (!data || !data.type) return;
if (data.type === 'od:preview-scroll-restore') {
scrollTo(document.scrollingElement || document.documentElement, data.frameLeft, data.frameTop);
scrollTo(scrollElement(), data.canvasLeft, data.canvasTop);
setTimeout(post, 0);
return;
}
if (data.type === 'od:preview-scroll-by') {
scrollBy(scrollElement(), data.left, data.top);
schedule();
}
});
window.addEventListener('scroll', schedule, true);
document.addEventListener('scroll', schedule, true);
window.addEventListener('resize', schedule);
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', function(){
requestRestore();
schedule();
});
} else {
setTimeout(function(){
requestRestore();
schedule();
}, 0);
}
})();
</script>`;
function wantsUrlPreviewScrollBridge(value: unknown): boolean {
if (Array.isArray(value)) return value.some(wantsUrlPreviewScrollBridge);
if (typeof value !== 'string') return false;
return value === 'scroll' || value === '1' || value === 'true';
}
function injectUrlPreviewScrollBridge(html: string): string {
if (html.includes('data-od-url-scroll-bridge')) return html;
const bodyCloseIndex = html.search(/<\/body\s*>/i);
if (bodyCloseIndex >= 0) {
return `${html.slice(0, bodyCloseIndex)}${URL_PREVIEW_SCROLL_BRIDGE}${html.slice(bodyCloseIndex)}`;
}
return `${html}${URL_PREVIEW_SCROLL_BRIDGE}`;
}
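
Reviewer sketch: one possible parent-side counterpart to the injected bridge, kept free of DOM types so it runs anywhere. The message type strings and field names come from the script above. `createScrollMemory`, `ScrollState` and `RestoreMessage` are hypothetical and not part of this diff.

```typescript
interface ScrollState {
  canvasLeft: number;
  canvasTop: number;
  frameLeft: number;
  frameTop: number;
}

interface RestoreMessage extends ScrollState {
  type: 'od:preview-scroll-restore';
}

// Coerce untrusted message fields to finite numbers, defaulting to 0.
function finiteOrZero(value: unknown): number {
  const n = Number(value ?? 0);
  return Number.isFinite(n) ? n : 0;
}

// Remembers the last reported scroll offsets and answers restore requests.
function createScrollMemory() {
  let last: ScrollState = { canvasLeft: 0, canvasTop: 0, frameLeft: 0, frameTop: 0 };
  return {
    // Feed every message payload from the preview frame through this; a
    // non-null result is the reply to post back to that frame.
    handle(data: unknown): RestoreMessage | null {
      if (typeof data !== 'object' || data === null) return null;
      const msg = data as Record<string, unknown>;
      if (msg.type === 'od:preview-scroll') {
        last = {
          canvasLeft: finiteOrZero(msg.canvasLeft),
          canvasTop: finiteOrZero(msg.canvasTop),
          frameLeft: finiteOrZero(msg.frameLeft),
          frameTop: finiteOrZero(msg.frameTop),
        };
        return null;
      }
      if (msg.type === 'od:preview-scroll-request') {
        return { type: 'od:preview-scroll-restore', ...last };
      }
      return null;
    },
    current(): ScrollState {
      return { ...last };
    },
  };
}
```

In a browser host, the parent would call `handle(ev.data)` from its own `message` listener and post any non-null reply to the preview iframe's `contentWindow`. Because the bridge posts with a `'*'` target origin, the parent should first check that `ev.source` is the expected iframe.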
export function registerProjectRoutes(app: Express, ctx: RegisterProjectRoutesDeps) {
const { db, design } = ctx;
const { sendApiError, createSseResponse } = ctx.http;
@ -32,7 +152,7 @@ export function registerProjectRoutes(app: Express, ctx: RegisterProjectRoutesDe
const { listLatestProjectRunStatuses, listProjectsAwaitingInput, normalizeProjectDisplayStatus, composeProjectDisplayStatus, listProjects } = ctx.status;
const { subscribeFileEvents, activeProjectEventSinks } = ctx.events;
const { randomId } = ctx.ids;
const { validateProjectDesignSystemId } = ctx.validation;
const { validateProjectDesignSystemId, validateProjectSkillId } = ctx.validation;
async function loadPluginRegistryView() {
const [skills, designSystems] = await Promise.all([
listSkills(SKILLS_DIR),
@ -181,6 +301,11 @@ export function registerProjectRoutes(app: Express, ctx: RegisterProjectRoutesDe
);
}
const normalizedDesignSystemId = designSystemValidation.id;
const skillValidation = await validateProjectSkillId(skillId);
if (!skillValidation.ok) {
return sendApiError(res, 400, skillValidation.code, skillValidation.message);
}
const normalizedSkillId = skillValidation.id;
const projectMetadata =
metadata && typeof metadata === 'object'
? {
@ -200,7 +325,7 @@ export function registerProjectRoutes(app: Express, ctx: RegisterProjectRoutesDe
const project = insertProject(db, {
id,
name: name.trim(),
skillId: skillId ?? null,
skillId: normalizedSkillId,
designSystemId: normalizedDesignSystemId,
pendingPrompt: pendingPrompt || null,
metadata: projectMetadata,
@ -314,7 +439,7 @@ export function registerProjectRoutes(app: Express, ctx: RegisterProjectRoutesDe
const project = getProject(db, req.params.id);
if (!project)
return sendApiError(res, 404, 'PROJECT_NOT_FOUND', 'not found');
const resolvedDir = resolveProjectDir(PROJECTS_DIR, project.id, project.metadata);
const resolvedDir = projectDetailResolvedDir(PROJECTS_DIR, project, resolveProjectDir);
/** @type {import('@open-design/contracts').ProjectResponse} */
const body = { project, resolvedDir };
res.json(body);
@ -403,6 +528,13 @@ export function registerProjectRoutes(app: Express, ctx: RegisterProjectRoutesDe
}
patch.designSystemId = designSystemValidation.id;
}
if (Object.prototype.hasOwnProperty.call(patch, 'skillId')) {
const skillValidation = await validateProjectSkillId(patch.skillId);
if (!skillValidation.ok) {
return sendApiError(res, 400, skillValidation.code, skillValidation.message);
}
patch.skillId = skillValidation.id;
}
const project = updateProject(db, req.params.id, patch);
if (!project)
return sendApiError(res, 404, 'PROJECT_NOT_FOUND', 'not found');
@ -947,6 +1079,13 @@ export function registerProjectFileRoutes(app: Express, ctx: RegisterProjectFile
}
const file = await readProjectFile(PROJECTS_DIR, projectId, relPath, project?.metadata);
if (
wantsUrlPreviewScrollBridge(req.query.odPreviewBridge) &&
/^text\/html(?:;|$)/i.test(file.mime)
) {
res.type(file.mime).send(injectUrlPreviewScrollBridge(file.buffer.toString('utf8')));
return;
}
res.type(file.mime).send(file.buffer);
} catch (err: any) {
const status = err && err.code === 'ENOENT' ? 404 : 400;
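
Review note: a minimal sketch of the opt-in injection path this file adds, assuming a placeholder `BRIDGE` string in place of the real `URL_PREVIEW_SCROLL_BRIDGE` markup. `BRIDGE`, `wantsBridge`, and `injectBridge` are illustrative names, not the daemon's exports. The logic follows the diff: the query value opts in, the `data-od-url-scroll-bridge` marker makes injection idempotent, and the script goes before the first closing body tag or is appended when there is none.

```typescript
// Hypothetical stand-in for the real bridge markup; only the marker attribute matters here.
const BRIDGE = '<script data-od-url-scroll-bridge>/* bridge */</script>';

// Accepts 'scroll', '1', or 'true'; arrays (repeated query params) opt in if any entry does.
function wantsBridge(value: unknown): boolean {
  if (Array.isArray(value)) return value.some(wantsBridge);
  if (typeof value !== 'string') return false;
  return value === 'scroll' || value === '1' || value === 'true';
}

// Injects the bridge once, before the first closing body tag when one exists.
function injectBridge(html: string): string {
  if (html.includes('data-od-url-scroll-bridge')) return html;
  const bodyCloseIndex = html.search(/<\/body\s*>/i);
  if (bodyCloseIndex >= 0) {
    return html.slice(0, bodyCloseIndex) + BRIDGE + html.slice(bodyCloseIndex);
  }
  return html + BRIDGE;
}

const page = '<html><body><p>hi</p></BODY ></html>';
const once = injectBridge(page);
const twice = injectBridge(once); // second pass is a no-op
```

The idempotence check matters because a file that already carries the marker would otherwise receive a second copy of the script.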


@ -26,6 +26,7 @@ import {
isPublicationGuardedArtifactKind,
} from './artifact-publication-guard.js';
import { isIgnoredProjectDirName } from './project-ignored-dirs.js';
import { isSandboxModeEnabled } from './sandbox-mode.js';
const FORBIDDEN_SEGMENT = /^$|^\.\.?$/;
const RESERVED_PROJECT_FILE_SEGMENTS = new Set(['.live-artifacts']);
@ -40,13 +41,42 @@ export function projectDir(projectsRoot, projectId) {
return path.join(projectsRoot, projectId);
}
export class SandboxImportedProjectError extends Error {
code = 'SANDBOX_IMPORTED_PROJECT_UNAVAILABLE';
constructor() {
super(
'Imported-folder projects are not available in OD_SANDBOX_MODE until their files are mirrored into the managed project directory.',
);
this.name = 'SandboxImportedProjectError';
}
}
function hasExternalProjectRoot(metadata?) {
if (typeof metadata?.baseDir !== 'string') return false;
return path.isAbsolute(path.normalize(metadata.baseDir));
}
export function assertSandboxProjectRootAvailable(metadata?) {
if (isSandboxModeEnabled(process.env) && hasExternalProjectRoot(metadata)) {
throw new SandboxImportedProjectError();
}
}
function usesExternalProjectRoot(metadata?) {
if (isSandboxModeEnabled(process.env)) return false;
return hasExternalProjectRoot(metadata);
}
// Returns the folder a project's files live in. For git-linked projects
// (metadata.baseDir set), this is the user's own folder. Otherwise falls
// back to the standard computed path under projectsRoot.
export function resolveProjectDir(projectsRoot, projectId, metadata?) {
if (typeof metadata?.baseDir === 'string') {
const p = path.normalize(metadata.baseDir);
if (path.isAbsolute(p)) return p;
export function resolveProjectDir(projectsRoot, projectId, metadata?, opts = {}) {
if (!opts.allowUnavailableSandboxImportedProject) {
assertSandboxProjectRootAvailable(metadata);
}
if (usesExternalProjectRoot(metadata)) {
return path.normalize(metadata.baseDir);
}
if (!isSafeId(projectId)) throw new Error('invalid project id');
return path.join(projectsRoot, projectId);
@ -55,7 +85,7 @@ export function resolveProjectDir(projectsRoot, projectId, metadata?) {
export async function ensureProject(projectsRoot, projectId, metadata?) {
const dir = resolveProjectDir(projectsRoot, projectId, metadata);
// Git-linked folders already exist; skip mkdir to avoid side-effects.
if (typeof metadata?.baseDir !== 'string') {
if (!usesExternalProjectRoot(metadata)) {
await mkdir(dir, { recursive: true });
}
return dir;
@ -67,7 +97,7 @@ export async function listFiles(projectsRoot, projectId, opts = {}) {
const out = [];
// Skip build/install dirs for linked folders so node_modules doesn't stall
// the walk on large repos.
const skipDirs = metadata?.baseDir ? isIgnoredProjectDirName : undefined;
const skipDirs = usesExternalProjectRoot(metadata) ? isIgnoredProjectDirName : undefined;
await collectFiles(dir, '', out, skipDirs, dir);
// Newest first — matches the visual order users expect after generating.
out.sort((a, b) => b.mtime - a.mtime);
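
Review note: a simplified sketch of the sandbox decision table introduced in this file, under several stated assumptions. The sandbox flag is passed as a parameter instead of being read from `process.env`, `isAbsolutePosix` is a POSIX-only stand-in for `path.isAbsolute(path.normalize(...))`, paths are joined with a plain `/`, and the safe-id check is omitted. `resolveDir` and `hasExternalRoot` are illustrative names.

```typescript
class SandboxImportedProjectError extends Error {
  code = 'SANDBOX_IMPORTED_PROJECT_UNAVAILABLE';
  constructor() {
    super('Imported-folder projects are not available in sandbox mode.');
    this.name = 'SandboxImportedProjectError';
  }
}

type Metadata = { baseDir?: unknown } | undefined;

// Assumption: POSIX-only approximation of path.isAbsolute.
const isAbsolutePosix = (p: string): boolean => p.startsWith('/');

function hasExternalRoot(meta: Metadata): boolean {
  return typeof meta?.baseDir === 'string' && isAbsolutePosix(meta.baseDir);
}

function resolveDir(
  root: string,
  id: string,
  meta: Metadata,
  sandbox: boolean,
  allowUnavailable = false,
): string {
  // Sandbox + imported folder: refuse unless the caller explicitly tolerates it.
  if (sandbox && hasExternalRoot(meta) && !allowUnavailable) {
    throw new SandboxImportedProjectError();
  }
  // Outside the sandbox, an absolute baseDir wins.
  if (!sandbox && hasExternalRoot(meta)) {
    return (meta as { baseDir: string }).baseDir;
  }
  // Everything else resolves to the managed directory.
  return `${root}/${id}`;
}
```

With `allowUnavailable` set, a sandboxed imported project falls back to the managed path instead of throwing, which appears to be what the project-detail route relies on.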


@ -661,6 +661,23 @@ export function composeSystemPrompt({
);
}
// Pinned LAST so recency bias reinforces the role-marker prohibition.
// This is the canonical anti-roleplay instruction.
parts.push(
"\n\n---\n\n## CRITICAL: Never fabricate conversation turns\n\n" +
"The text you emit is processed by a chat host that interprets lines " +
"starting with \`## user\`, \`## assistant\`, or \`## system\` as real " +
"turn boundaries. Emitting these lines causes the host to treat your " +
"fabricated text as a real user request and execute unauthorised actions.\n\n" +
"**FORBIDDEN — you MUST NOT:**\n" +
"- Emit any line starting with \`## user\`, \`## assist\`, \`## assistant\`, or \`## system\`\n" +
"- Roleplay multiple turns inside a single response\n" +
"- Invent a user message and then reply to it\n\n" +
"The host will truncate your response at the first role-marker line — " +
"any text after it is lost. If you feel the urge to simulate a dialogue, " +
"stop and ask the user a real question instead.",
);
return parts.join('');
}


@ -0,0 +1,297 @@
/**
* Shared utility for detecting and stripping fabricated role-marker lines
* (`## user`, `## assistant`, `## system`) injected by the model into its
* own output (see #3247; same class as #2102 / #2464).
*
* `createRoleMarkerGuard()`: stateful per-message guard for structured
* stream handlers that can track message boundaries (Claude, Copilot,
* Qoder, OpenCode/Codex, Pi, ACP). Returns `{ feedText, contaminated,
* warningEvent }`.
*/
// Regex matching fabricated role-marker lines injected by the model into
// its own output. Anchored to start-of-line via (?:^|\n) so we don't
// false-positive on user prose like "here is the ## user content".
//
// Scope (deliberately narrow): Markdown-style `## user` / `## assistant`
// / `## assist` / `## system` only — these are the patterns the chat
// host actually parses as turn boundaries (see `buildDaemonTranscript`
// in apps/web/src/providers/daemon.ts). Chat-style markers like
// `User:` / `Assistant:` / `Human:` / `AI:` are intentionally NOT
// included, because:
// (1) The host never parses them as turn boundaries; a model emitting
// them does NOT cause the original #3247 security failure mode.
// (2) They collide with legitimate output far more often than the
// Markdown family (e.g., "User: bob@example.com", form labels,
// JSDoc lines). With kill-on-detection wired in server.ts
// (`abortForRoleMarker`), a false positive aborts the whole run
// — a much more expensive failure than a stray unflagged
// `User:` line in the chat scrollback.
// If a host frontend ever starts parsing chat-style markers as
// boundaries, narrow the additions to that frontend's specific
// path rather than the shared regex.
//
// Three deliberate refinements vs. a naive `## role` match:
//
// 1. CASE-SENSITIVE. The chat host's turn-boundary delimiter is
// lowercase (`## user` / `## assistant` / `## system` — see
// `buildDaemonTranscript` in apps/web/src/providers/daemon.ts), and
// the `## CRITICAL` system-prompt block forbids only the lowercase
// forms. Title-Case Markdown headings like `## User Guide`,
// `## System Architecture`, `## Assistant settings` are LEGITIMATE
// content (LLMs emit these constantly in technical writing) and
// must not contaminate. Matching with `/i` would deterministically
// abort any run that produced such a heading — exactly the
// "false positive aborts the whole run" cost the docblock cites
// as the reason to keep the regex narrow.
// (See PR #3303 review r3324151877.)
//
// 2. POSITIVE LOOKAHEAD `(?=[^a-z])`. Without it, `## userland`,
// `## userspace`, `## users guide`, `## systemd`, `## assistance`
// all match via prefix in the alternation. The positive lookahead
// requires the character after the role keyword to exist AND to NOT
// be a lowercase letter:
// - `## user\n…` → match (newline is not lowercase)
// - `## assistantR…` → match (R is uppercase; the glued-form
// attack pattern still gets caught)
// - `## assistant.` → match (. is not a letter)
// - `## users guide` → no match (s is lowercase letter)
// - `## userland` → no match (l is lowercase letter)
// Why POSITIVE `[^a-z]` rather than NEGATIVE `(?![a-z])`: the
// negative form is satisfied at end-of-string, which in a streaming
// context means "we have just received `## user` but don't know
// what comes next yet". A negative lookahead would fire prematurely
// if the rest of the role-keyword landed in a later chunk (e.g.
// the model emits `## user` then `land` arrives). The positive
// form requires an actual non-lowercase character to be present,
// so detection waits one more chunk in that edge case — a
// one-character latency traded for correctness.
//
// 3. `[ \t]` instead of `\s` for inner whitespace. `\s` matches
// newlines, which would let oddities like `##\nuser` match across
// lines. Markdown role markers are always single-line by
// convention; restricting to space/tab tightens the match without
// losing any real attack pattern.
//
// Alternation order: `assistant` is listed before `assist` so a
// fully-spelled `## assistant` consumes 9 chars (not 6) and the
// `(?=[^a-z])` check is applied at position 9 (after the full word)
// rather than position 6. Truncated forms (`## assist\n` from a
// stream cut mid-emission) still match via the `assist` branch.
export const FABRICATED_ROLE_MARKER_RE =
/(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|assist|system)(?=[^a-z])/;
// Internal-only variant used after the first chunk has been processed.
// Drops the `^` alternative: once `tail` is a rolling slice of
// mid-stream text, `^` no longer represents the genuine message start
// — applying it would let the regex anchor at an arbitrary cut point
// inside legitimate prose ("…take a look at the ## user content…"
// fed char-by-char would eventually slide a tail window onto leading
// whitespace + `## user` and false-positive). Only `\n`-preceded
// markers are real role boundaries on subsequent chunks; the preceding
// newline is retained inside the 64-char tail so genuine markers
// straddling a chunk boundary are still caught.
// (See PR #3303 review r3324060995.)
const NEWLINE_ANCHORED_ROLE_MARKER_RE =
/\n[ \t]*##[ \t]+(?:user|assistant|assist|system)(?=[^a-z])/;
// Pending-marker variants used in the no-match branch to detect a
// COMPLETE-but-unconfirmed marker prefix at the end of the buffer.
// Drop the `(?=[^a-z])` lookahead and anchor with `$` instead — the
// lookahead's whole purpose is to require a non-lowercase character
// AFTER the role keyword, which by definition can't be present when
// the chunk boundary fell exactly between the role keyword and its
// next byte. If one of these matches, the role keyword IS at the end
// of the current buffer; we withhold it and revisit on the next
// feed, where one of three things will happen:
// (1) The next char is non-lowercase → main regex matches →
// contaminated → withheld bytes dropped.
// (2) The next char is lowercase (e.g. `## userl…`) → main regex
// no longer matches the role keyword → withheld bytes are
// confirmed safe and emitted alongside the new chunk.
// (3) The role keyword is part of a longer word that itself is a
// role keyword (only `user` ⊂ `users`, etc. — none extend to
// a different role) → still case (2), since the extension is
// lowercase.
// This implements the suggested fix on review r3324277xxx —
// preserves the documented "everything from the marker onward is
// silently dropped" contract across chunk boundaries that fall
// inside the lookahead-detection window.
const FIRST_CHUNK_PENDING_MARKER_TAIL_RE =
/(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|assist|system)$/;
const NEWLINE_ANCHORED_PENDING_MARKER_TAIL_RE =
/\n[ \t]*##[ \t]+(?:user|assistant|assist|system)$/;
// Bounded tail size for cross-chunk matching. Must comfortably exceed
// the longest possible marker prefix:
// "\n" + whitespace run + "##" + whitespace + "assistant" ≈ 16-24
// chars in practice (LLMs rarely emit more than a couple newlines or a
// handful of spaces between sections). 64 leaves generous margin and
// keeps the guard's memory + per-delta work O(1) regardless of message
// length — important because a 50KB assistant response delivered in
// 1000 chunks of 50 bytes is otherwise O(n²) on string concatenation
// alone.
const TAIL_BUFFER_SIZE = 64;
export interface RoleMarkerGuard {
/** Feed a text delta for the current message. Returns the safe portion
* to emit (may be shorter than `text` if a marker was found mid-chunk,
* or empty string if the entire chunk is past the cut point). */
feedText(text: string): string;
/** Whether a fabricated marker was detected (further text is dropped). */
readonly contaminated: boolean;
/** If contaminated, the warning event to emit. `null` if clean. */
warningEvent(): { type: 'fabricated_role_marker'; marker: string; messageId: string } | null;
}
/**
* Create a stateful guard that detects fabricated role markers across
* chunk boundaries. Memory + per-call work is O(1): instead of
* accumulating the full message text, the guard retains only a small
* trailing suffix (TAIL_BUFFER_SIZE chars), enough for the matcher to
* see across chunk boundaries when a marker straddles them.
*
* Usage in a stream handler:
*
* const guard = createRoleMarkerGuard(messageId);
* for (const delta of deltas) {
* const safe = guard.feedText(delta.text);
* if (safe.length > 0) onEvent({ type: 'text_delta', delta: safe });
* if (guard.contaminated) {
* onEvent(guard.warningEvent()!);
* break; // stop emitting text for this message
* }
* }
*/
export function createRoleMarkerGuard(messageId: string): RoleMarkerGuard {
// Rolling tail of the bytes we have ALREADY EMITTED, capped at
// TAIL_BUFFER_SIZE. Used as the prefix when matching against new
// text so we catch markers that straddle a chunk boundary.
let tail = '';
// Bytes we have RECEIVED but DEFERRED — held back because they form
// a complete-but-unconfirmed marker suffix at the end of the buffer
// and we don't yet know whether the next chunk will confirm them
// (next char non-lowercase → contaminated, drop) or deny them
// (next char lowercase → suffix was part of a longer word, emit).
// Without this, a chunk boundary falling exactly between the role
// keyword and its lookahead char would leak the marker line itself
// into the UI / app.sqlite before we could classify it. See review
// r3324277xxx.
let pending = '';
// Tracks whether `tail` still represents the ENTIRE emission so
// far — i.e. no slicing has occurred yet and `^` in the canonical
// regex genuinely anchors at byte 0 of the message stream. While
// this holds, the `^|\n` alternation safely catches a role marker
// that arrives at the start of the stream even if its prefix is
// split across multiple chunks (`## ` | `user\n…`, `## us` | `er\n…`,
// `##` | ` user\n…`). The moment `tail` would exceed
// TAIL_BUFFER_SIZE, the slice turns `tail` into a mid-stream
// window and `^` no longer represents the stream start — we then
// switch to the newline-only variants so a sliding window cannot
// manufacture a match from prose. The transition is on slicing,
// not on first emission: earlier definitions ("any byte emitted",
// "newline emitted") both had failure modes — see PR #3303 reviews
// r3324060995 and r3324xxxxxx, and the regression tests below.
let firstChunk = true;
let _contaminated = false;
let markerText: string | null = null;
return {
get contaminated() {
return _contaminated;
},
feedText(text: string): string {
if (_contaminated) return '';
if (text.length === 0) return '';
// Combine `tail` (already-emitted suffix for cross-chunk matching),
// `pending` (deferred-from-prior-call suspicious suffix), and the
// new `text` into a single matching buffer.
const buffer = tail + pending + text;
const matchRe = firstChunk
? FABRICATED_ROLE_MARKER_RE
: NEWLINE_ANCHORED_ROLE_MARKER_RE;
const pendingRe = firstChunk
? FIRST_CHUNK_PENDING_MARKER_TAIL_RE
: NEWLINE_ANCHORED_PENDING_MARKER_TAIL_RE;
// `firstChunk` transitions are tied to slicing of `tail`, not to
// feed count or byte emission (see the comment above). It flips at
// the end of this function, only once the emitted text exceeds
// TAIL_BUFFER_SIZE and `tail` has to be sliced.
const match = matchRe.exec(buffer);
if (match) {
// Marker confirmed. Compute the safe-to-emit portion (bytes
// between previously-emitted `tail` and the marker), drop
// `pending` (the deferred portion sits inside the marker
// region by definition once the lookahead char arrives), and
// mark contaminated. Subsequent feeds early-return.
_contaminated = true;
markerText = match[0].trim();
pending = '';
const alreadyEmitted = tail.length;
const markerStart = match.index;
if (markerStart <= alreadyEmitted) return '';
return buffer.slice(alreadyEmitted, markerStart);
}
// No confirmed marker. Check whether the buffer ends with a
// complete-but-unconfirmed marker prefix (role keyword present,
// lookahead char not yet arrived). If so, withhold that suffix
// until the next feed; emit the rest.
const pendingMatch = pendingRe.exec(buffer);
const alreadyEmitted = tail.length;
const pendingStart = pendingMatch
// Never withhold bytes we have already emitted in a prior
// feed — the suspicious suffix could in pathological cases
// start inside `tail` (we held back `pending` correctly on
// the prior call, but the suffix-start position is upstream
// of where we hold). Clamp to alreadyEmitted so safeToEmit
// never goes negative.
? Math.max(pendingMatch.index, alreadyEmitted)
: buffer.length;
const safeToEmit = buffer.slice(alreadyEmitted, pendingStart);
pending = buffer.slice(pendingStart);
// Roll the emitted-bytes tail forward.
const fullEmitted = tail + safeToEmit;
const willSlice = fullEmitted.length > TAIL_BUFFER_SIZE;
tail = willSlice
? fullEmitted.slice(fullEmitted.length - TAIL_BUFFER_SIZE)
: fullEmitted;
// `firstChunk` is true exactly while `tail` still represents the
// entire emission so far — i.e. no slice has occurred and `^` in
// the canonical regex genuinely anchors at byte 0 of the stream.
// The moment we slice (emitted bytes exceed TAIL_BUFFER_SIZE),
// `tail` becomes a mid-stream window, `^` becomes meaningless,
// and we switch to the newline-only variants.
//
// Earlier iterations of this code used "any byte emitted" or
// "newline emitted" as the transition trigger. Both were wrong:
// - "any byte" lost the `^` anchor before a chunk-split
// message-start marker (e.g. `## ` | `user\n…`,
// `## us` | `er\n…`) could finish arriving — see PR #3303
// review r3324xxxxxx, and the new tests below.
// - "newline emitted" left `^` valid on a sliced buffer for
// streams that hadn't yet emitted a newline, which then
// false-positived the rolling-tail mid-stream case from
// review r3324060995.
// Slice-based is the invariant that satisfies both: while we
// haven't sliced, `^` is correct; once we slice, it isn't.
if (willSlice) firstChunk = false;
return safeToEmit;
},
warningEvent() {
if (!_contaminated || !markerText) return null;
return {
type: 'fabricated_role_marker',
marker: markerText,
messageId,
};
},
};
}
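
Review note: a small table-driven check of the canonical regex, copied from `FABRICATED_ROLE_MARKER_RE` above, against the cases the docblock walks through. The `samples` table and the `results` array are illustrative additions; the regex itself is unchanged.

```typescript
const ROLE_MARKER_RE =
  /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|assist|system)(?=[^a-z])/;

// [input, expected match]
const samples: Array<[string, boolean]> = [
  ['## user\nhello', true],               // newline satisfies the lookahead
  ['text\n## assistantReply', true],      // glued uppercase letter still matches
  ['## assistant.', true],                // punctuation is not a lowercase letter
  ['## assist\n', true],                  // truncated form via the `assist` branch
  ['## users guide', false],              // lowercase letter follows `user`
  ['## userland', false],
  ['## User Guide', false],               // case-sensitive: Title-Case headings pass
  ['here is the ## user content', false], // not at start of line
  ['##\nuser\n', false],                  // inner whitespace is space/tab only
  ['## user', false],                     // end of input: lookahead needs a character
];

const results = samples.map(([input, expected]) => ({
  input,
  expected,
  actual: ROLE_MARKER_RE.test(input),
}));
```

The last row is the case the pending-marker regexes exist for: the positive lookahead cannot confirm a marker that ends exactly at the chunk boundary, so the guard withholds it until the next feed.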


@ -77,6 +77,24 @@ export interface RoutineRunHandlerStart {
conversationId: string;
agentRunId: string;
completion: Promise<RoutineRunCompletion>;
prepare?: (run: RoutineRun) => void | Promise<void>;
start?: () => void;
// Tear-down for the case where the handler returned a start handle but
// `RoutineService` later reached `prepare()` and it failed — i.e. the
// routine_run row exists, prepare may have partially mutated project /
// conversation / snapshot state, and the in-memory chat run still needs
// to terminate as `canceled`. Callers MUST surface failures rather than
// swallow them (the loser-retry path depends on it).
discard?: () => void;
// Tear-down for the case where the run was NEVER durably inserted —
// either `insertRun()` threw, or `insertRun()` returned `false` because
// a sibling daemon already won the scheduled slot. Prepare has not run,
// so no project / conversation / snapshot writes need rolling back. The
// in-memory chat run must also be removed from the registry instead of
// being finalized as `canceled`, otherwise duplicate-loser slots would
// surface phantom canceled runs on `/api/runs`. Falls back to `discard`
// when the handler does not distinguish the two cases.
discardUnstarted?: () => void;
}
export interface RoutineRunCompletion {
@ -95,7 +113,7 @@ export type RoutineRunHandler = (input: {
export interface RoutinePersistence {
list(): Routine[];
insertRun(run: RoutineRun): void;
insertRun(run: RoutineRun, options?: { scheduledSlotAt?: number }): boolean | void;
updateRun(id: string, patch: Partial<RoutineRun>): void;
getLatestRun(routineId: string): RoutineRun | null;
}
@ -106,6 +124,25 @@ interface ScheduledTimer {
fireAt: Date;
}
function clearRoutinePlaceholderId(value: string): string {
return value.startsWith('routine-pending-') ? '' : value;
}
class ScheduledRunPersistenceError extends Error {
constructor(
readonly routineId: string,
readonly slotAt: number,
readonly originalError: unknown,
) {
super(`Routine ${routineId} scheduled slot ${slotAt} could not be persisted`);
this.name = 'ScheduledRunPersistenceError';
}
}
function isScheduledRunPersistenceError(error: unknown): error is ScheduledRunPersistenceError {
return error instanceof ScheduledRunPersistenceError;
}
// ---------- timezone math ----------
// Returns the wall-clock parts of `atUtc` rendered in `timezone`. Uses
@ -458,22 +495,43 @@ export class RoutineService {
if (!routine.enabled) return;
const fireAt = nextRunAtForSchedule(routine.schedule);
if (!fireAt) return;
this.scheduleRoutineAt(routine, fireAt);
}
private retryScheduledSlot(routineId: string, fireAt: Date): void {
if (!this.started) return;
const routine = this.persistence.list().find((candidate) => candidate.id === routineId);
if (!routine?.enabled) return;
this.scheduleRoutineAt(routine, fireAt);
}
private scheduleRoutineAt(routine: Routine, fireAt: Date): void {
// setTimeout can't carry past 2^31 ms (~24.8 days); we cap and use
// a chained re-schedule. Routines fire within hours/days, but a
// misconfigured "next month" weekly value could otherwise overflow.
const delay = Math.max(1_000, Math.min(2_000_000_000, fireAt.getTime() - Date.now()));
const timer = setTimeout(() => {
this.timers.delete(routine.id);
this.start_(routine.id, 'scheduled')
const slotAt = fireAt.getTime();
this.start_(routine.id, 'scheduled', { scheduledSlotAt: slotAt })
.then(() => {
// Always reschedule so a single fire keeps the cadence alive.
this.rescheduleOne(routine.id);
})
.catch((error) => {
console.error(
`[od] routine ${routine.id} scheduled run failed:`,
error instanceof Error ? error.message : error,
error instanceof ScheduledRunPersistenceError
? error.originalError instanceof Error
? error.originalError.message
: error.originalError
: error instanceof Error ? error.message : error,
);
})
.finally(() => {
// Always reschedule so a single fire keeps the cadence alive.
if (isScheduledRunPersistenceError(error)) {
this.retryScheduledSlot(routine.id, fireAt);
} else {
this.rescheduleOne(routine.id);
}
});
}, delay);
if (typeof timer.unref === 'function') timer.unref();
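
Review note: the clamp in `scheduleRoutineAt` in isolation, as a sketch. `clampDelay`, `MIN_DELAY_MS`, and `MAX_DELAY_MS` are illustrative names; the two bounds are the literals from the diff. This shows only the arithmetic, not the chained re-schedule the comment mentions.

```typescript
const MIN_DELAY_MS = 1_000;         // never fire sooner than one second out
const MAX_DELAY_MS = 2_000_000_000; // stays below setTimeout's 2^31 ms limit

function clampDelay(fireAtMs: number, nowMs: number): number {
  return Math.max(MIN_DELAY_MS, Math.min(MAX_DELAY_MS, fireAtMs - nowMs));
}

const DAY_MS = 24 * 60 * 60 * 1000;
const overdue = clampDelay(0, 5_000);      // fire time already passed
const soon = clampDelay(10_000, 0);        // within both bounds
const farOut = clampDelay(40 * DAY_MS, 0); // beyond the cap
```

A fire time in the past is pushed to the one-second floor, and anything further out than the cap is truncated to it.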
@ -491,6 +549,7 @@ export class RoutineService {
private async start_(
routineId: string,
trigger: RoutineRunTrigger,
options: { scheduledSlotAt?: number } = {},
): Promise<RoutineRunHandlerStart> {
if (!this.runHandler) throw new Error('Routine run handler is not configured');
const inflight = this.inflight.get(routineId);
@ -505,7 +564,7 @@ export class RoutineService {
const handler = this.runHandler;
if (!handler) throw new Error('Routine run handler is not configured');
const handlerStart = await handler({ routine, trigger, startedAt, runId });
this.persistence.insertRun({
const run: RoutineRun = {
id: runId,
routineId: routine.id,
trigger,
@ -518,7 +577,106 @@ export class RoutineService {
summary: null,
error: null,
errorCode: null,
};
const scheduledSlotAt = options.scheduledSlotAt;
const wasScheduled = scheduledSlotAt != null;
const publicProjectId = () => clearRoutinePlaceholderId(run.projectId);
const publicConversationId = () => clearRoutinePlaceholderId(run.conversationId);
const publicAgentRunId = () => clearRoutinePlaceholderId(run.agentRunId);
const scrubRoutinePlaceholders = () => {
run.projectId = publicProjectId();
run.conversationId = publicConversationId();
run.agentRunId = publicAgentRunId();
};
// Tear-down to use when the durable routine_run row was never
// inserted (insertRun threw, or another daemon already won the slot).
// Prefer the explicit `discardUnstarted` callback when the handler
// distinguishes the two cases — that one drops the in-memory chat run
// entirely instead of finalizing it as `canceled`, so duplicate
// scheduled losers do not surface phantom runs on `/api/runs`.
// Handlers that do not implement the split still see `discard`.
const discardUnstarted = handlerStart.discardUnstarted ?? handlerStart.discard;
let inserted = true;
try {
inserted = this.persistence.insertRun(run, options) !== false;
} catch (error) {
try {
discardUnstarted?.();
} catch (discardError) {
if (wasScheduled) {
throw new ScheduledRunPersistenceError(routine.id, scheduledSlotAt, discardError);
}
throw discardError;
}
if (wasScheduled) {
throw new ScheduledRunPersistenceError(routine.id, scheduledSlotAt, error);
}
throw error;
}
if (!inserted) {
try {
discardUnstarted?.();
} catch (discardError) {
if (wasScheduled) {
throw new ScheduledRunPersistenceError(routine.id, scheduledSlotAt, discardError);
}
throw discardError;
}
return handlerStart;
}
try {
await handlerStart.prepare?.(run);
const preparedIdsChanged =
run.projectId !== handlerStart.projectId
|| run.conversationId !== handlerStart.conversationId
|| run.agentRunId !== handlerStart.agentRunId;
handlerStart.projectId = run.projectId;
handlerStart.conversationId = run.conversationId;
handlerStart.agentRunId = run.agentRunId;
if (wasScheduled || preparedIdsChanged) {
this.persistence.updateRun(runId, {
projectId: run.projectId,
conversationId: run.conversationId,
agentRunId: run.agentRunId,
});
}
} catch (error) {
// Terminate the in-memory chat run created by `handler(...)` so its
// `completion` promise resolves instead of waiting forever on a
// run that will never start. Surface any cleanup failure rather
// than swallow it, but still finalize the persisted row.
let discardError: unknown = null;
try {
handlerStart.discard?.();
} catch (err) {
discardError = err;
}
if (discardError != null) {
console.error(
`[od] routine ${routine.id} prepare cleanup failed:`,
discardError instanceof Error ? discardError.message : discardError,
);
}
// Persist IDs only after `prepare()` has replaced routine
// placeholders with real resources. If preparation failed before
// enrichment, clear the sentinels so the terminal row does not point
// at fabricated project/conversation IDs. For scheduled runs the
// slot claim was already accepted at `insertRun()`, so retrying the
// same slot is not appropriate — let the error propagate so the
// scheduler advances to the next cadence.
scrubRoutinePlaceholders();
this.persistence.updateRun(runId, {
status: 'failed',
completedAt: Date.now(),
summary: null,
error: error instanceof Error ? error.message : String(error),
errorCode: null,
projectId: run.projectId,
conversationId: run.conversationId,
agentRunId: run.agentRunId,
});
throw error;
}
handlerStart.completion
.then((completion) => {
this.persistence.updateRun(runId, {
@ -538,6 +696,18 @@ export class RoutineService {
errorCode: null,
});
});
try {
handlerStart.start?.();
} catch (error) {
this.persistence.updateRun(runId, {
status: 'failed',
completedAt: Date.now(),
summary: null,
error: error instanceof Error ? error.message : String(error),
errorCode: null,
});
throw error;
}
return handlerStart;
})();
this.inflight.set(routineId, promise);
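
Review note: a reduced sketch of the slot-claim branch in `start_`, assuming a hypothetical `claimSlot` helper that keeps only the insert and tear-down decision. It omits `ScheduledRunPersistenceError` wrapping and `prepare()`. What it illustrates is that `insertRun()` returning `false` means another daemon won the slot, a `void` return counts as inserted, and `discardUnstarted` falls back to `discard` when the handler does not provide it.

```typescript
type StartHandle = { discard?: () => void; discardUnstarted?: () => void };
type InsertRun = () => boolean | void;

function claimSlot(insertRun: InsertRun, start: StartHandle): 'inserted' | 'lost' {
  // Prefer the explicit callback; handlers without the split still see `discard`.
  const discardUnstarted = start.discardUnstarted ?? start.discard;
  let inserted = true;
  try {
    // Only an explicit `false` means the slot was lost; `void` is success.
    inserted = insertRun() !== false;
  } catch (error) {
    discardUnstarted?.();
    throw error;
  }
  if (!inserted) {
    discardUnstarted?.();
    return 'lost';
  }
  return 'inserted';
}

const calls: string[] = [];
const handle: StartHandle = {
  discard: () => { calls.push('discard'); },
  discardUnstarted: () => { calls.push('discardUnstarted'); },
};
const legacyResult = claimSlot(() => undefined, handle); // void return: inserted
const loserResult = claimSlot(() => false, handle);      // sibling daemon won
```

Keeping `void` as success appears to preserve compatibility with persistence layers that never adopted the boolean return.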


@ -295,6 +295,29 @@ export function createChatRunService({
return new Promise((resolve) => run.waiters.add(resolve));
};
// Drop a run from the in-memory registry without emitting any terminal
// event. Used by callers that prepared a run optimistically (created the
// record before some external precondition was checked) and need to undo
// the create without surfacing the run via `/api/runs`. Only valid before
// the run reaches a terminal status — terminal runs use scheduleCleanup
// and would already have notified any subscribers.
const drop = (run) => {
if (!run) return;
if (TERMINAL_RUN_STATUSES.has(run.status)) return;
runs.delete(run.id);
for (const sse of run.clients) {
try { sse.end(); } catch { /* best-effort detach */ }
}
run.clients.clear();
// Resolve any pending waiters with a synthetic "canceled" status so
// they unblock instead of hanging forever — the run is being dropped
// because nothing will ever start.
run.status = 'canceled';
run.updatedAt = Date.now();
for (const waiter of run.waiters) waiter(statusBody(run));
run.waiters.clear();
};
return {
create,
start,
@ -307,6 +330,7 @@ export function createChatRunService({
emit,
finish,
fail,
drop,
statusBody,
isTerminal(status) {
return TERMINAL_RUN_STATUSES.has(status);
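
Review note: a stripped-down sketch of the `drop` semantics added above, assuming a hypothetical `Run` shape with only `id`, `status`, and `waiters` (the SSE client handling is left out). The status names in `TERMINAL` are assumptions for illustration; the real set is `TERMINAL_RUN_STATUSES`.

```typescript
type RunStatus = 'queued' | 'running' | 'succeeded' | 'failed' | 'canceled';

interface Run {
  id: string;
  status: RunStatus;
  waiters: Set<(status: RunStatus) => void>;
}

// Assumption: which statuses count as terminal.
const TERMINAL = new Set<RunStatus>(['succeeded', 'failed', 'canceled']);
const runs = new Map<string, Run>();

function drop(run: Run | undefined): void {
  if (!run) return;
  if (TERMINAL.has(run.status)) return; // terminal runs go through normal cleanup
  runs.delete(run.id);                  // the run disappears from the registry
  run.status = 'canceled';              // synthetic status so waiters unblock
  run.waiters.forEach((waiter) => waiter(run.status));
  run.waiters.clear();
}

const seen: RunStatus[] = [];
const pendingRun: Run = { id: 'r1', status: 'queued', waiters: new Set() };
pendingRun.waiters.add((status) => { seen.push(status); });
runs.set(pendingRun.id, pendingRun);
drop(pendingRun);
```

The run is removed from the registry before waiters are notified, so a waiter that looks the run up again finds nothing instead of a phantom canceled entry.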


@ -22,6 +22,33 @@ const CURSOR_AUTH_GUIDANCE =
const DEEPSEEK_AUTH_GUIDANCE =
'DeepSeek TUI is installed but is not authenticated. Add or verify your API key in `~/.deepseek/config.toml` as `api_key = "..."`, or expose DEEPSEEK_API_KEY to the Open Design daemon process, then retry. If Open Design is launched outside an interactive shell, shell rc files such as ~/.zshrc may not be loaded.';
// agy's print mode (`-p`) detects a missing OAuth token, prints the
// Google sign-in URL to stdout, waits 30s for completion, then exits
// "Error: authentication timed out." That URL points at a callback page
// that asks the user to paste the resulting auth code BACK into agy —
// which only works in the interactive TUI. So in OD's chat, surfacing
// the raw URL is a dead end (no input field to paste the code into).
// Instead we ask the user to run `agy` in a terminal once, which opens
// the browser, completes OAuth, and writes the credentials to the
// system keyring — both `-p` and TUI invocations read from there
// afterward, so the chat run can succeed on retry.
const ANTIGRAVITY_AUTH_GUIDANCE =
'Antigravity needs to sign in. The agy CLI\'s keyring entry has expired or been cleared, and `-p` print mode cannot complete OAuth on its own (it has no field to paste the auth code into).\n\nFix: open a terminal and run `agy` once — it will open Google sign-in in your browser, accept the redirect, and store the token in your system keyring. After you finish, return here and retry this chat. You only need to do this once; the keyring entry persists across both terminal and Open Design runs.';
// agy's account-level quota is per-model (consumer accounts get a
// separate quota for Gemini 3 Pro vs Flash vs Claude vs GPT-OSS). When
// a quota is exhausted, the upstream error
// RESOURCE_EXHAUSTED (code 429): Individual quota reached. Contact
// your administrator to enable overages. Resets in <H>h<M>m<S>s.
// is written only to the `--log-file`. Print mode emits nothing on
// stdout/stderr, so without log inspection the daemon misreads it as
// missing-OAuth. Guidance points the user at agy's TUI Switch-Model
// picker because (a) different models have separate quotas, and (b) OD
// can't drive that picker directly until upstream issue #35 ships a
// `--model` flag; see the antigravity.ts notes.
const ANTIGRAVITY_QUOTA_GUIDANCE =
'Antigravity returned "RESOURCE_EXHAUSTED: Individual quota reached" for the current model. Each Antigravity model (Gemini 3 Pro / Flash, Claude 4.6, GPT-OSS) has its own quota.\n\nFix: open `agy` in a terminal and use its Switch Model picker (the menu at the bottom of the TUI) to pick a model with available quota, then retry here. Open Design uses whatever model you pick in agy\'s TUI when the Settings model picker is left on "Default". Quotas reset automatically on Antigravity\'s schedule.';
const REASONIX_AUTH_GUIDANCE =
'DeepSeek Reasonix is installed but is not authenticated. Add your API key in `~/.reasonix/config.json` under `apiKey`, or expose DEEPSEEK_API_KEY to the Open Design daemon process, then retry. If Open Design is launched outside an interactive shell, shell rc files such as ~/.zshrc may not be loaded.';
@ -33,6 +60,14 @@ export function deepseekAuthGuidance(): string {
return DEEPSEEK_AUTH_GUIDANCE;
}
export function antigravityAuthGuidance(): string {
return ANTIGRAVITY_AUTH_GUIDANCE;
}
export function antigravityQuotaGuidance(): string {
return ANTIGRAVITY_QUOTA_GUIDANCE;
}
export function reasonixAuthGuidance(): string {
return REASONIX_AUTH_GUIDANCE;
}
@ -50,6 +85,27 @@ export function isCursorAuthFailureText(text: string): boolean {
);
}
// agy's print-mode (`-p`) output when no keyring credentials are available:
// - Top of stdout: "Authentication required. Please visit the URL to log in: <URL>"
// - Tail of stdout: "Waiting for authentication (timeout 30s)..."
// "Error: authentication timed out."
// The same condition is logged by `agy --log-file` as
// "You are not logged into Antigravity" and
// "error getting token source: You are not logged into Antigravity"
// (confirmed via the `--log-file` dump on a cleared keyring). Any of
// these is a sufficient signal. Match conservatively so the regexes
// don't fire on prose containing the word "authentication" by accident.
export function isAntigravityAuthFailureText(text: string): boolean {
const value = String(text || '');
if (!value.trim()) return false;
return (
/authentication required.*please visit/i.test(value) ||
/authentication timed out/i.test(value) ||
/not logged into antigravity/i.test(value) ||
/accounts\.google\.com\/o\/oauth2\/auth.*antigravity/i.test(value)
);
}
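A quick way to sanity-check the matcher above is to run it against the strings quoted in its own comment. This is a minimal sketch: the function body is copied from the hunk so the block runs standalone, and the `samples` array (including the negative prose case) is illustrative rather than taken from real agy output.

```typescript
// Copy of the matcher from the hunk above so this sketch is self-contained.
function isAntigravityAuthFailureText(text: string): boolean {
  const value = String(text || '');
  if (!value.trim()) return false;
  return (
    /authentication required.*please visit/i.test(value) ||
    /authentication timed out/i.test(value) ||
    /not logged into antigravity/i.test(value) ||
    /accounts\.google\.com\/o\/oauth2\/auth.*antigravity/i.test(value)
  );
}

// Illustrative inputs: the first three are quoted in the comment above,
// the last is ordinary prose that merely mentions "authentication".
const samples: Array<[string, boolean]> = [
  ['Authentication required. Please visit the URL to log in: <URL>', true],
  ['Error: authentication timed out.', true],
  ['error getting token source: You are not logged into Antigravity', true],
  ['The docs describe authentication in general terms.', false],
];

for (const [text, expected] of samples) {
  const verdict = isAntigravityAuthFailureText(text) === expected ? 'ok' : 'MISMATCH';
  console.log(verdict, text);
}
```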
export function isDeepSeekAuthFailureText(text: string): boolean {
const value = String(text || '');
if (!value.trim()) return false;
@ -92,6 +148,13 @@ export function classifyAgentAuthFailure(
message: deepseekAuthGuidance(),
};
}
if (agentId === 'antigravity') {
if (!isAntigravityAuthFailureText(text)) return null;
return {
status: 'missing',
message: antigravityAuthGuidance(),
};
}
if (agentId === 'reasonix') {
if (!isReasonixAuthFailureText(text)) return null;
return {


@ -1,6 +1,9 @@
import { execAgentFile } from './shared.js';
import type { RuntimeAgentDef, RuntimeModelOption } from '../types.js';
const AMR_MODELS_TIMEOUT_MS = 10_000;
const AMR_MODELS_RETRY_DELAYS_MS = [250, 750] as const;
const PREFERRED_AMR_CHAT_MODEL_ORDER = [
'deepseek-v4-flash',
'deepseek-v3.2',
@ -44,6 +47,7 @@ export function normalizeVelaModelId(rawId: string): string | null {
: withoutProvider;
if (!withoutPrefix) return null;
if (/^deepseek_v3_2$/i.test(withoutPrefix)) return 'deepseek-v3.2';
if (/^deepseek-v3-2$/i.test(withoutPrefix)) return 'deepseek-v3.2';
if (/^kimi_k2_6$/i.test(withoutPrefix)) return 'kimi-k2.6';
if (/^glm_5_1$/i.test(withoutPrefix)) return 'glm-5.1';
if (/^glm_5$/i.test(withoutPrefix)) return 'glm-5';
@ -128,24 +132,76 @@ function orderAmrChatModels(
.map(({ model }) => model);
}
function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
function velaModelsErrorMessage(error: unknown): string {
if (error instanceof Error) return error.message;
return String(error ?? '');
}
function isRetriableVelaModelsError(error: unknown): boolean {
const message = velaModelsErrorMessage(error).toLowerCase();
return [
'deadline exceeded',
'timed out',
'timeout',
'temporarily unavailable',
'temporary failure',
'econnreset',
'econnrefused',
'enotfound',
'502',
'503',
'504',
].some((pattern) => message.includes(pattern));
}
async function fetchVelaModelsWithRetry(
resolvedBin: string,
env: NodeJS.ProcessEnv,
): Promise<RuntimeModelOption[]> {
let lastError: unknown = null;
for (let attempt = 0; attempt <= AMR_MODELS_RETRY_DELAYS_MS.length; attempt += 1) {
try {
const { stdout } = await execAgentFile(resolvedBin, ['models'], {
env,
timeout: AMR_MODELS_TIMEOUT_MS,
maxBuffer: 1024 * 1024,
});
return parseVelaModels(String(stdout));
} catch (error) {
lastError = error;
if (
attempt === AMR_MODELS_RETRY_DELAYS_MS.length ||
!isRetriableVelaModelsError(error)
) {
throw error;
}
await sleep(AMR_MODELS_RETRY_DELAYS_MS[attempt] ?? 0);
}
}
throw lastError instanceof Error ? lastError : new Error(velaModelsErrorMessage(lastError));
}
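The retry loop above can be read as a generic "N delays means at most N + 1 attempts" helper. The sketch below restates that shape under stated assumptions: `retryWithDelays` and its injectable `sleep` parameter are hypothetical names introduced here for illustration, not part of the diff.

```typescript
// Hypothetical generic form of the retry loop (not in the diff).
// `delaysMs.length` retries are allowed, so at most length + 1 attempts run.
async function retryWithDelays<T>(
  task: () => Promise<T>,
  delaysMs: readonly number[],
  isRetriable: (error: unknown) => boolean,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      // Give up once the delays are used up or the error is not transient.
      if (attempt >= delaysMs.length || !isRetriable(error)) throw error;
      await sleep(delaysMs[attempt] ?? 0);
    }
  }
}
```

With delays of `[250, 750]`, a task that fails twice with a retriable error and then succeeds runs three times and sleeps 250 ms, then 750 ms. A non-retriable error is rethrown after the first attempt without sleeping.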
export const amrAgentDef = {
id: 'amr',
name: 'AMR',
bin: 'vela',
versionArgs: ['--version'],
fetchModels: async (resolvedBin, env) => {
const { stdout } = await execAgentFile(resolvedBin, ['models'], {
env,
timeout: 10_000,
maxBuffer: 1024 * 1024,
});
return parseVelaModels(String(stdout));
},
fetchModels: fetchVelaModelsWithRetry,
// Fail closed when Vela's live catalog is unavailable. Stale static
// fallbacks let users select models that link/opencode no longer accepts.
fallbackModels: [] as RuntimeModelOption[],
buildArgs: () => ['agent', 'run', '--runtime', 'opencode'],
streamFormat: 'acp-json-rpc',
// Vela routes model selection through ACP's `session/set_model` and only
// accepts ids that survived the `vela models` preflight check, so a
// free-text "Custom" id silently fails at spawn. The model picker
// surfaces the live Vela catalog instead.
supportsCustomModel: false,
supportsImagePaths: true,
// Daemon-process env override for emergency operator pinning. Normal UI
// selection comes from the live `vela models` catalog and is preflighted
// before spawn.


@ -0,0 +1,247 @@
import {
existsSync,
mkdirSync,
readFileSync,
writeFileSync,
} from 'node:fs';
import { readFile as fsReadFile } from 'node:fs/promises';
import { homedir } from 'node:os';
import { dirname, join } from 'node:path';
import { DEFAULT_MODEL_OPTION } from './shared.js';
import type { RuntimeAgentDef } from '../types.js';
// `agy` v1.0.3 still has no `--model` flag (upstream issue #35), but the
// TUI's Switch-Model picker writes the choice to its settings.json, and
// every `agy -p` invocation re-reads that file on startup — verified by
// capturing the `--log-file` line `Propagating selected model override to
// backend: label="<model>"`. So we can route OD's model picker through
// settings.json: when the user picks a concrete model in Settings, the
// daemon writes the label into agy's settings.json right before spawn,
// and the resulting print-mode run uses that model.
//
// Two ids the picker exposes are special:
// - 'default' : leave settings.json untouched, so agy keeps
// whatever the user last picked in its own TUI.
// (Respects user choice when they switch models
// from `agy` directly.)
// - any other id : the literal display label agy expects (e.g.
// "Gemini 3.1 Pro (High)", "Claude Sonnet 4.6
// (Thinking)"). We persist it before spawn.
//
// `supportsCustomModel: false` because the label set is a server-side
// enum — a typed id agy doesn't recognise resolves to a silent
// `availableModels` cache miss + empty print-mode output, which surfaces
// to the user as a generic "empty response" error.
//
// The 8 model labels mirror what `Switch Model` in agy's TUI lists for
// consumer-tier accounts as of 2026-05-28. The set is small and stable
// enough to ship statically until upstream adds a programmatic
// `agy models` subcommand (also tracked under issue #35).
const ANTIGRAVITY_SETTINGS_PATH = join(
homedir(),
'.gemini',
'antigravity-cli',
'settings.json',
);
export function writeAntigravityModelSelection(
label: string,
settingsPath: string = ANTIGRAVITY_SETTINGS_PATH,
): void {
let existing: Record<string, unknown> = {};
if (existsSync(settingsPath)) {
try {
const parsed = JSON.parse(readFileSync(settingsPath, 'utf8')) as unknown;
if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
existing = parsed as Record<string, unknown>;
}
} catch {
// Corrupt JSON — fall through and rewrite the file from scratch so
// the next spawn starts from a known-good state.
}
}
existing.model = label;
mkdirSync(dirname(settingsPath), { recursive: true });
writeFileSync(settingsPath, `${JSON.stringify(existing, null, 2)}\n`);
}
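The read-merge-write behaviour above is easier to reason about without the filesystem. The following is a sketch only: `mergeModelIntoSettings` is a hypothetical pure helper, and the `theme` key in the sample input is invented for illustration.

```typescript
// Hypothetical pure helper (not in the diff): returns the JSON text that
// would be written, given the current file contents (or null if absent).
function mergeModelIntoSettings(raw: string | null, label: string): string {
  let existing: Record<string, unknown> = {};
  if (raw !== null) {
    try {
      const parsed: unknown = JSON.parse(raw);
      if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
        existing = parsed as Record<string, unknown>;
      }
    } catch {
      // Corrupt JSON: start again from an empty object, as the hunk does.
    }
  }
  existing.model = label;
  return `${JSON.stringify(existing, null, 2)}\n`;
}

// Unrelated keys survive; corrupt input is replaced by a fresh object.
console.log(mergeModelIntoSettings('{"theme":"dark"}', 'Gemini 3.1 Pro (High)'));
console.log(mergeModelIntoSettings('{not json', 'Gemini 3.1 Pro (High)'));
```

Keeping the merge separate from the write makes the three input cases (valid object, corrupt text, non-object JSON such as an array) straightforward to test.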
// Per-process serialization for write-settings → spawn → agy-reads
// cycles on antigravity. `~/.gemini/antigravity-cli/settings.json` is
// process-global, so two OD runs that both pick concrete (non-default)
// models can race: run A writes model A, spawn A starts, run B writes
// model B before A's agy has read settings.json — A then executes on
// model B. The daemon serialises non-default antigravity spawns
// through this chain: each acquire awaits the previous release, and
// each release fires only after the spawned agy actually emits
// `Propagating selected model override to backend: label="<X>"` in
// its `--log-file` (which is the upstream signal that settings.json
// has been read).
let antigravityLockChain: Promise<void> = Promise.resolve();
export async function acquireAntigravityModelLock(): Promise<() => void> {
const previous = antigravityLockChain;
let release: () => void = () => {};
antigravityLockChain = new Promise<void>((resolve) => {
release = resolve;
});
await previous;
return release;
}
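The promise-chain lock above serialises callers in acquisition order. Here is a small sketch of that behaviour, assuming a simplified release policy: `runExclusive`, `pause`, and `demo` are hypothetical names, and the wrapper releases in a `finally` block, whereas the daemon releases on the log signal or on child exit.

```typescript
let lockChain: Promise<void> = Promise.resolve();

// Same chaining idea as the hunk: each caller waits on the previous
// holder's promise and hands back a function that resolves its own.
async function acquireLock(): Promise<() => void> {
  const previous = lockChain;
  let release: () => void = () => {};
  lockChain = new Promise<void>((resolve) => {
    release = resolve;
  });
  await previous;
  return release;
}

// Hypothetical convenience wrapper (not in the diff).
async function runExclusive<T>(task: () => Promise<T>): Promise<T> {
  const release = await acquireLock();
  try {
    return await task();
  } finally {
    release();
  }
}

const pause = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Two overlapping tasks: B is requested while A is still running.
async function demo(): Promise<string[]> {
  const events: string[] = [];
  await Promise.all([
    runExclusive(async () => {
      events.push('A start');
      await pause(20);
      events.push('A end');
    }),
    runExclusive(async () => {
      events.push('B start');
      events.push('B end');
    }),
  ]);
  return events;
}
```

Although B's work is shorter, `demo()` resolves to `['A start', 'A end', 'B start', 'B end']` because B cannot begin until A's release function has been called.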
// Visible for tests. Resets the module-level lock chain so a test that
// installed a hanging acquirer can release it without leaking state to
// subsequent test cases. Production code never calls this.
export function _resetAntigravityModelLockForTests(): void {
antigravityLockChain = Promise.resolve();
}
export interface WaitForAgyModelOptions {
timeoutMs?: number;
pollIntervalMs?: number;
// Override for tests; production reads the daemon-owned log file path.
readFile?: (path: string) => Promise<string>;
// Override `Date.now` for tests; production uses the wall clock.
now?: () => number;
// Stops polling when fired. Production wires this to `child.once('exit')`
// so the watcher cancels as soon as agy exits — the lock release is
// then driven by the exit handler rather than the helper's return
// value, eliminating the slow-startup race the looper review at
// 263fd2fe7 flagged: if a cold agy takes >timeoutMs to read its
// settings.json, we'd otherwise return false, the caller would
// release the lock, and a concurrent run B could rewrite
// settings.json before A's agy actually read it.
abortSignal?: AbortSignal;
}
// Polls agy's `--log-file` for the line
// `Propagating selected model override to backend: label="<expectedModel>"`
// which `model_config_manager.go` emits once agy has finished reading
// `~/.gemini/antigravity-cli/settings.json` and sent the model
// override to the upstream backend. Returns true on observed signal,
// false on timeout OR abort. Never throws — a missing log file is
// treated as "not yet seen" so the polling loop keeps retrying until
// either the deadline or the abort signal fires.
//
// IMPORTANT: callers MUST NOT use a `false` return as a "go ahead and
// release the settings.json lock" signal — false means "I gave up
// polling," not "agy definitely didn't read this." Release the lock
// only on (a) a `true` return, OR (b) child exit. See server.ts for
// the wiring.
export async function waitForAgyToReadModel(
logFilePath: string,
expectedModel: string,
options: WaitForAgyModelOptions = {},
): Promise<boolean> {
const timeoutMs = options.timeoutMs ?? 15_000;
const pollIntervalMs = options.pollIntervalMs ?? 250;
const readFile =
options.readFile ?? ((path: string) => fsReadFile(path, 'utf8'));
const now = options.now ?? Date.now;
const abortSignal = options.abortSignal;
if (abortSignal?.aborted) return false;
const escaped = expectedModel.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
const pattern = new RegExp(
`Propagating selected model override to backend: label="${escaped}"`,
);
const deadline = now() + timeoutMs;
while (now() < deadline) {
if (abortSignal?.aborted) return false;
try {
const content = await readFile(logFilePath);
if (pattern.test(content)) return true;
} catch {
// Log file may not have appeared yet; keep polling.
}
if (now() >= deadline) break;
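await new Promise<void>((resolve) => {
const onAbort = () => {
clearTimeout(timer);
resolve();
};
// Detach the abort listener when the timer fires so listeners do not
// accumulate on the signal across poll iterations.
const timer = setTimeout(() => {
abortSignal?.removeEventListener('abort', onAbort);
resolve();
}, pollIntervalMs);
abortSignal?.addEventListener('abort', onAbort, { once: true });
});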
}
return false;
}
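The escaping step in the poller matters because the model labels contain regex metacharacters such as `.` and `(`. The sketch below isolates that step; `escapeRegExp` and `overridePattern` are hypothetical helper names, and the sample log line is illustrative (real agy log lines may carry extra prefixes).

```typescript
// Escape regex metacharacters so a model label is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function overridePattern(expectedModel: string): RegExp {
  return new RegExp(
    `Propagating selected model override to backend: label="${escapeRegExp(expectedModel)}"`,
  );
}

// Illustrative log line.
const logLine =
  'Propagating selected model override to backend: label="Gemini 3.1 Pro (High)"';

const escapedPattern = overridePattern('Gemini 3.1 Pro (High)');
// Without escaping, "(High)" becomes a capture group and no longer
// matches the literal parentheses in the log line.
const unescapedPattern = new RegExp('label="Gemini 3.1 Pro (High)"');

console.log(escapedPattern.test(logLine)); // true
console.log(unescapedPattern.test(logLine)); // false
```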
export const antigravityAgentDef = {
id: 'antigravity',
name: 'Antigravity',
bin: 'agy',
versionArgs: ['--version'],
fallbackModels: [
DEFAULT_MODEL_OPTION,
{ id: 'Gemini 3.1 Pro (High)', label: 'Gemini 3.1 Pro (High)' },
{ id: 'Gemini 3.1 Pro (Low)', label: 'Gemini 3.1 Pro (Low)' },
{ id: 'Gemini 3.5 Flash (High)', label: 'Gemini 3.5 Flash (High)' },
{ id: 'Gemini 3.5 Flash (Medium)', label: 'Gemini 3.5 Flash (Medium)' },
{ id: 'Gemini 3.5 Flash (Low)', label: 'Gemini 3.5 Flash (Low)' },
{
id: 'Claude Sonnet 4.6 (Thinking)',
label: 'Claude Sonnet 4.6 (Thinking)',
},
{ id: 'Claude Opus 4.6 (Thinking)', label: 'Claude Opus 4.6 (Thinking)' },
{ id: 'GPT-OSS 120B (Medium)', label: 'GPT-OSS 120B (Medium)' },
],
supportsCustomModel: false,
// We deliberately do NOT opt into `resumesSessionViaCli` / agy's `-c`
// resume flag on follow-up turns. Tested both shapes; `-c` activates
// agy's internal agentic loop (multi-step model retries, tool calls,
// fallback-to-cached-response on tool errors) which can't be steered
// from OD's system-prompt OVERRIDE — even with the strongest wording
// we got an identical byte-for-byte form re-emission on turn 2 when
// turn 1's tool-call retry path returned the cached form response.
//
// Instead we treat agy as a stateless plain adapter like qwen /
// deepseek: every spawn gets the full OD-rendered transcript via
// `buildDaemonTranscript`, and that transcript's prior assistant
// turns are sanitized to strip `<question-form>` markup + form-schema
// JSON fences (see `sanitizePriorAssistantTurnForTranscript` in
// apps/web/src/providers/daemon.ts). The stronger OVERRIDE block
// composed in server.ts gives a second line of defense for weak
// plain-stream models like Gemini 3.5 Flash.
buildArgs: (
_prompt,
_imagePaths,
_extra = [],
options = {},
runtimeContext = {},
) => {
if (options.model && options.model !== DEFAULT_MODEL_OPTION.id) {
writeAntigravityModelSelection(
options.model,
runtimeContext.antigravitySettingsPath,
);
}
// We invoke agy via `-p -` (print mode + stdin sentinel), NOT
// `chat -`. Verified against `agy --help` on v1.0.3 — the
// `Available subcommands` list is `changelog / help / install /
// plugin / update`, and `chat` is NOT among them. `-p` is the
// documented print-mode flag (`Short alias for --print`) and
// `agy -p -` reads the prompt from stdin. The looper reviewer
// bot's environment runs a different agy build that may have
// renamed the entry point; until upstream confirms a stable
// headless subcommand (see google-antigravity/antigravity-cli#119)
// and the change actually ships in the auto-update channel that
// packaged OD users get, `-p -` is the contract that actually
// produces a print-mode reply on the installed CLI.
const args: string[] = ['-p'];
// Always opt into `--log-file` when the daemon supplied a path so
// it can post-exit grep for the actual upstream failure shape
// (auth missing vs quota reached vs upstream error) — without it
// the chat surfaces a generic "empty response" because print mode
// never echoes those errors on stdout. See server.ts empty-output
// guard for the consumer.
if (runtimeContext.agentLogFilePath) {
args.push('--log-file', runtimeContext.agentLogFilePath);
}
args.push('-');
return args;
},
promptViaStdin: true,
streamFormat: 'plain',
installUrl: 'https://antigravity.google/cli',
docsUrl: 'https://antigravity.google/docs/cli-overview',
} satisfies RuntimeAgentDef;


@ -49,11 +49,10 @@ export const grokBuildAgentDef = {
label: 'grok-4.20-multi-agent (xAI · orchestration)',
},
],
// Prompt delivered via stdin so Windows `spawn ENAMETOOLONG` and Linux
// `spawn E2BIG` can't truncate large composed prompts. `grok -p` with
// no positional argument reads from piped stdin.
buildArgs: (_prompt, _imagePaths, _extra = [], options = {}) => {
const args = ['-p'];
// Grok Build CLI v0.1.212 enforces `-p, --single <PROMPT>` as value-
// required — stdin piping no longer satisfies it. Inline the prompt.
buildArgs: (prompt, _imagePaths, _extra = [], options = {}) => {
const args = ['-p', prompt];
if (options.model && options.model !== DEFAULT_MODEL_OPTION.id) {
args.push('--model', options.model);
}
@ -69,7 +68,21 @@ export const grokBuildAgentDef = {
{ id: 'xhigh', label: 'xhigh' },
{ id: 'max', label: 'max' },
],
promptViaStdin: true,
promptViaStdin: false,
// Guard against prompts that would blow Windows' ~32 KB CreateProcess
// limit (or Linux MAX_ARG_STRLEN on extreme edges) before spawn. Same
// shape as the DeepSeek adapter. The previous stdin path is gone (CLI
// 0.1.212 enforces `-p <value>`), so the composed prompt now rides
// argv. A sufficiently large one (system text + history +
// skills/design-system content + user message) could otherwise surface
// as a generic spawn ENAMETOOLONG / E2BIG instead of a Grok-specific,
// user-actionable message. The /api/chat spawn path checks this byte
// budget against the composed prompt and emits AGENT_PROMPT_TOO_LARGE
// ("reduce skills/design-system context, or pick an adapter with
// stdin support") before calling `spawn`. 30_000 bytes leaves ~2.7 KB
// of argv headroom under the Windows command-line limit for `-p
// --model <id> --effort <level>` and internal quoting.
maxPromptArgBytes: 30_000,
streamFormat: 'plain',
installUrl: 'https://x.ai/cli',
docsUrl: 'https://x.ai/cli',
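The byte budget described in the comment above is measured in encoded bytes, not characters. The actual check in the /api/chat spawn path is not shown in this diff, so the following is a sketch under assumptions: `promptExceedsArgBudget` is a hypothetical helper, and the strict `>` comparison is inferred from the wording of the error message.

```typescript
// Hypothetical helper (not in the diff): measures the UTF-8 size of the
// composed prompt and compares it against an adapter's argv budget.
function promptExceedsArgBudget(prompt: string, maxPromptArgBytes: number): boolean {
  const bytes = new TextEncoder().encode(prompt).length;
  return bytes > maxPromptArgBytes;
}

const GROK_MAX_PROMPT_ARG_BYTES = 30_000; // value from the adapter above

// Exactly at the budget: allowed.
console.log(promptExceedsArgBudget('a'.repeat(30_000), GROK_MAX_PROMPT_ARG_BYTES)); // false
// Non-ASCII text costs more bytes than characters: "é" is 2 bytes in UTF-8.
console.log(promptExceedsArgBudget('é'.repeat(15_001), GROK_MAX_PROMPT_ARG_BYTES)); // true
```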


@ -151,6 +151,8 @@ async function probe(
...(def.env || {}),
},
configuredEnv,
undefined,
{ resolvedBin: launch.selectedPath },
),
launch,
);


@ -1,11 +1,27 @@
import path from 'node:path';
import { fileURLToPath } from 'node:url';
import { mergeProxyAwareEnv, resolveSystemProxyEnv } from '@open-design/platform';
import { resolveProjectRelativePath } from '../home-expansion.js';
import { expandConfiguredEnv } from './paths.js';
import { resolveAmrOpenCodeExecutable } from './executables.js';
import { amrVelaProfileEnv } from '../integrations/vela-profile.js';
import { resolveProjectRootFromNestedModule } from '../project-root.js';
import {
applySandboxRuntimeEnv,
isSandboxModeEnabled,
resolveSandboxRuntimeConfig,
type SandboxRuntimeConfig,
} from '../sandbox-mode.js';
type RuntimeEnvMap = NodeJS.ProcessEnv | Record<string, string>;
type SpawnEnvOptions = {
resolvedBin?: string | null;
};
const RUNTIME_MODULE_PROJECT_ROOT = resolveProjectRootFromNestedModule(
path.dirname(fileURLToPath(import.meta.url)),
);
// Build the env passed to spawn() for a given agent adapter.
//
@ -38,7 +54,9 @@ export function spawnEnvForAgent(
baseEnv: RuntimeEnvMap,
configuredEnv: unknown = {},
systemProxyEnv: RuntimeEnvMap = resolveSystemProxyEnv(),
options: SpawnEnvOptions = {},
): NodeJS.ProcessEnv {
const sandboxRuntime = sandboxRuntimeConfigForBaseEnv(baseEnv);
const env = mergeProxyAwareEnv(
process.platform,
systemProxyEnv,
@ -58,20 +76,52 @@ export function spawnEnvForAgent(
const opencodeBin = resolveAmrOpenCodeExecutable(env);
if (opencodeBin) env.VELA_OPENCODE_BIN = opencodeBin;
}
return env;
return reapplySandboxRuntimeEnv(env, sandboxRuntime);
}
if (agentId === 'claude') {
if (!isOpenClaudeExecutable(options.resolvedBin)) {
stripUnlessCustomBaseUrl(env, 'ANTHROPIC_BASE_URL', ['ANTHROPIC_API_KEY']);
return env;
}
return reapplySandboxRuntimeEnv(env, sandboxRuntime);
}
if (agentId === 'codex') {
stripUnlessCustomBaseUrl(env, 'OPENAI_BASE_URL', [
'OPENAI_API_KEY',
'CODEX_API_KEY',
]);
return env;
return reapplySandboxRuntimeEnv(env, sandboxRuntime);
}
return env;
return reapplySandboxRuntimeEnv(env, sandboxRuntime);
}
function isOpenClaudeExecutable(resolvedBin: string | null | undefined): boolean {
if (typeof resolvedBin !== 'string' || !resolvedBin.trim()) return false;
const base = path
.basename(resolvedBin.trim().replace(/\\/g, '/'))
.replace(/\.(exe|cmd|bat)$/i, '')
.toLowerCase();
return base === 'openclaude';
}
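The basename normalisation above is what lets one check cover both Windows and POSIX install paths. A minimal sketch, with the helper copied from the hunk so the block runs standalone; the sample paths are illustrative and do not come from the diff.

```typescript
import { basename } from 'node:path';

// Copy of the helper from the hunk above so the sketch runs standalone.
function isOpenClaudeExecutable(resolvedBin: string | null | undefined): boolean {
  if (typeof resolvedBin !== 'string' || !resolvedBin.trim()) return false;
  const base = basename(resolvedBin.trim().replace(/\\/g, '/'))
    .replace(/\.(exe|cmd|bat)$/i, '')
    .toLowerCase();
  return base === 'openclaude';
}

// Illustrative paths.
console.log(isOpenClaudeExecutable('C:\\Tools\\OpenClaude.exe')); // true
console.log(isOpenClaudeExecutable('/usr/local/bin/openclaude')); // true
console.log(isOpenClaudeExecutable('/usr/local/bin/claude')); // false
console.log(isOpenClaudeExecutable(null)); // false
```

Converting backslashes to forward slashes first means `basename` strips the directory part even when a Windows-style path is evaluated on a POSIX host.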
function sandboxRuntimeConfigForBaseEnv(
baseEnv: RuntimeEnvMap,
): SandboxRuntimeConfig | null {
if (!isSandboxModeEnabled(baseEnv)) return null;
const dataDir = baseEnv.OD_DATA_DIR?.trim();
if (!dataDir) return null;
const resolvedDataDir = resolveProjectRelativePath(
dataDir,
RUNTIME_MODULE_PROJECT_ROOT,
);
return resolveSandboxRuntimeConfig(true, resolvedDataDir);
}
function reapplySandboxRuntimeEnv(
env: NodeJS.ProcessEnv,
sandboxRuntime: SandboxRuntimeConfig | null,
): NodeJS.ProcessEnv {
if (!sandboxRuntime) return env;
return applySandboxRuntimeEnv(env, sandboxRuntime);
}
// Remove `secretKeys` from `env` unless `baseUrlKey` is set to a non-empty


@ -2,10 +2,17 @@ import { accessSync, constants, existsSync, statSync } from 'node:fs';
import { delimiter } from 'node:path';
import path from 'node:path';
import { homedir } from 'node:os';
import { fileURLToPath } from 'node:url';
import { wellKnownUserToolchainBins } from '@open-design/platform';
import { resolveSandboxRuntimeConfigFromEnv } from '../sandbox-mode.js';
import { expandHomePath } from './paths.js';
import type { RuntimeAgentDef } from './types.js';
const RUNTIME_PROJECT_ROOT = path.resolve(
path.dirname(fileURLToPath(import.meta.url)),
'../../../..',
);
const AGENT_BIN_ENV_KEYS = new Map<string, string>([
['amr', 'VELA_BIN'],
['aider', 'AIDER_BIN'],
@ -35,7 +42,12 @@ let cachedToolchainDirs: string[] | null = null;
let cachedToolchainDirsAt = 0;
function userToolchainDirs() {
const homeOverride = process.env.OD_AGENT_HOME;
const sandboxRuntime = resolveSandboxRuntimeConfigFromEnv(
process.env,
RUNTIME_PROJECT_ROOT,
);
const homeOverride =
sandboxRuntime?.roots.agentHomeDir ?? process.env.OD_AGENT_HOME;
const home = homeOverride || homedir();
const now = Date.now();
if (


@ -1,7 +1,13 @@
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
import { homedir } from 'node:os';
import path from 'node:path';
import {
isSandboxModeEnabled,
resolveSandboxRuntimeConfigFromEnv,
sandboxAgentProfilesConfigPath,
} from '../sandbox-mode.js';
import { DEFAULT_MODEL_OPTION, sanitizeCustomModel } from './models.js';
import type {
RuntimeAgentDef,
@ -9,10 +15,44 @@ import type {
RuntimeModelOption,
} from './types.js';
function localAgentProfilesFile(): string {
const RUNTIME_PROJECT_ROOT = path.resolve(
path.dirname(fileURLToPath(import.meta.url)),
'../../../..',
);
function isInsideDir(parent: string, child: string): boolean {
const relative = path.relative(parent, child);
return (
relative === '' ||
(!relative.startsWith('..') && !path.isAbsolute(relative))
);
}
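The containment check above relies on `path.relative` rather than a string prefix test. The sketch below shows why, assuming POSIX-style paths; the directory names are invented for illustration.

```typescript
import { isAbsolute, relative } from 'node:path';

// Same logic as the hunk above, reproduced so the sketch is self-contained.
function isInsideDir(parent: string, child: string): boolean {
  const rel = relative(parent, child);
  return rel === '' || (!rel.startsWith('..') && !isAbsolute(rel));
}

// Illustrative POSIX paths.
const agentHome = '/data/agent-home';
console.log(isInsideDir(agentHome, '/data/agent-home/agents.local.json')); // true
console.log(isInsideDir(agentHome, agentHome)); // true
console.log(isInsideDir(agentHome, '/data/agent-home-evil/agents.local.json')); // false
console.log(isInsideDir(agentHome, '/etc/passwd')); // false
```

A naive `child.startsWith(parent)` would accept the `agent-home-evil` sibling, since it shares the string prefix. One caveat of the `startsWith('..')` test is that it also rejects a genuine child whose name begins with two dots, such as `/data/agent-home/..cache`.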
function localAgentProfilesFile(): string | null {
const explicit = process.env.OD_AGENT_PROFILES_CONFIG;
if (typeof explicit === 'string' && explicit.trim()) {
return explicit.trim();
const explicitPath =
typeof explicit === 'string' && explicit.trim()
? path.resolve(explicit.trim())
: null;
if (isSandboxModeEnabled(process.env)) {
if (!process.env.OD_DATA_DIR?.trim()) return null;
const sandboxRuntime = resolveSandboxRuntimeConfigFromEnv(
process.env,
RUNTIME_PROJECT_ROOT,
);
if (!sandboxRuntime?.enabled) return null;
if (
explicitPath &&
isInsideDir(sandboxRuntime.roots.agentHomeDir, explicitPath)
) {
return explicitPath;
}
return sandboxAgentProfilesConfigPath(sandboxRuntime);
}
if (explicitPath) {
return explicitPath;
}
return path.join(homedir(), '.open-design', 'agents.local.json');
}
@ -152,9 +192,11 @@ function createLocalAgentDef(
export function readLocalAgentProfileDefs(
baseDefs: RuntimeAgentDef[],
): RuntimeAgentDef[] {
const profilesFile = localAgentProfilesFile();
if (profilesFile == null) return [];
let parsed: unknown;
try {
parsed = JSON.parse(readFileSync(localAgentProfilesFile(), 'utf8'));
parsed = JSON.parse(readFileSync(profilesFile, 'utf8'));
} catch {
return [];
}


@ -25,6 +25,18 @@ export function rememberLiveModels(agentId: string, models: RuntimeModelOption[]
liveModelOrder.set(agentId, ids);
}
export function getRememberedLiveModels(agentId: string): RuntimeModelOption[] {
const ids = liveModelOrder.get(agentId) ?? [];
return ids.map((id) => ({ id, label: id }));
}
export function preferFreshLiveModels(
freshModels: RuntimeModelOption[],
rememberedModels: RuntimeModelOption[],
): RuntimeModelOption[] {
return freshModels.length > 0 ? freshModels : rememberedModels;
}
export function isKnownModel(def: RuntimeAgentDef, modelId: string | null | undefined) {
if (!modelId) return false;
const live = liveModelCache.get(def.id);


@ -0,0 +1,170 @@
// OpenCode swallows provider failures in headless `run --format json` mode:
// on a 429 usage-limit (and similar), it marks the error retryable, retries
// silently, and emits NOTHING on stdout/stderr — so the daemon only sees an
// inactivity-watchdog timeout with no reason. The real error is recorded
// only in OpenCode's own session log (`service=llm … error={…}`). This
// module recovers that signal so the chat UI can show "usage limit reached"
// instead of a bare timeout. OpenCode-specific by design; see issue #982.
import { readdirSync, readFileSync, statSync } from 'node:fs';
import path from 'node:path';
import { classifyAgentServiceFailure, type AgentServiceFailureCode } from './auth.js';
export interface OpenCodeServiceFailure {
code: AgentServiceFailureCode;
message: string;
statusCode: number | null;
}
// OpenCode resolves its data dir as `$XDG_DATA_HOME/opencode` (when set) or
// `$HOME/.local/share/opencode`, with session logs under `log/`. Mirror that
// so we read the same files the spawned CLI wrote. Null when neither var is
// set (we have no basis to guess a path).
export function resolveOpenCodeLogDir(
env: Record<string, string | undefined>,
): string | null {
const xdg = typeof env.XDG_DATA_HOME === 'string' ? env.XDG_DATA_HOME.trim() : '';
const home = typeof env.HOME === 'string' ? env.HOME.trim() : '';
const base = xdg || (home ? path.join(home, '.local', 'share') : '');
if (!base) return null;
return path.join(base, 'opencode', 'log');
}
// Read the tail of OpenCode's most recent session log. Filenames are
// `<ISO-like-timestamp>.log`, so a lexicographic sort orders them by recency.
// `since` (when provided) binds the lookup to the current run: a file last
// written before the run started can only belong to an earlier session, so
// it is skipped rather than risk surfacing a stale provider error for this
// run. (This does not disambiguate two OpenCode runs writing into the same
// HOME concurrently — OpenCode only emits its session id on the stdout
// stream, which is empty in the silent-stall case, so mtime is the only
// run-binding signal available here.) The 2 MB tail comfortably holds the
// final error frame even though OpenCode embeds the entire request body
// (system prompt + tool schemas) in each `service=llm` line. Synchronous on
// purpose: the only callers are the (non-async) run close handler and the
// inactivity watchdog, once per failed OpenCode run. Returns null on any fs
// error (no dir yet, perms).
export function readLatestOpenCodeLogTail(
logDir: string,
options: { maxBytes?: number; since?: number } = {},
): string | null {
const { maxBytes = 2_000_000, since } = options;
let names: string[];
try {
names = readdirSync(logDir).filter((name) => name.endsWith('.log'));
} catch {
return null;
}
if (names.length === 0) return null;
names.sort().reverse(); // newest filename first
for (const name of names) {
const full = path.join(logDir, name);
if (since != null) {
try {
if (statSync(full).mtimeMs < since) continue;
} catch {
continue;
}
}
try {
const buf = readFileSync(full);
// Slice by bytes so `maxBytes` bounds the tail size as named.
return (buf.length > maxBytes ? buf.subarray(-maxBytes) : buf).toString('utf8');
} catch {
continue;
}
}
return null;
}
// Only treat a `"message":"…"` value as the failure reason when it reads
// like a service error. The embedded request body uses `"content":` for
// prompt text, but tool schemas and user prompts could still contain a
// stray `"message"` key, so this keyword gate keeps unrelated payload text
// from masquerading as the error.
const SERVICE_ERROR_MESSAGE_RE =
/usage limit|rate[ _-]?limit|quota|limit reached|insufficient|credit|balance|overloaded|unavailable|unauthor|authenticat|invalid[ _-]?(?:api[ _-]?)?key|api key|\/login|exhaust|too many requests/i;
function pickServiceErrorMessage(line: string): string | null {
const re = /"message":"((?:[^"\\]|\\.)*)"/g;
let fallback: string | null = null;
let match: RegExpExecArray | null;
while ((match = re.exec(line)) !== null) {
let value: string;
try {
value = JSON.parse(`"${match[1]}"`);
} catch {
value = match[1]!;
}
value = value.trim();
if (SERVICE_ERROR_MESSAGE_RE.test(value)) return value;
if (!fallback) fallback = value;
}
return fallback && SERVICE_ERROR_MESSAGE_RE.test(fallback) ? fallback : null;
}
function codeFromStatus(statusCode: number): AgentServiceFailureCode | null {
if (statusCode === 401 || statusCode === 403) return 'AGENT_AUTH_REQUIRED';
if (statusCode === 429) return 'RATE_LIMITED';
if (statusCode >= 500 && statusCode <= 599) return 'UPSTREAM_UNAVAILABLE';
return null;
}
function defaultMessageForCode(code: AgentServiceFailureCode): string {
switch (code) {
case 'AGENT_AUTH_REQUIRED':
return 'OpenCode could not authenticate with the model provider.';
case 'RATE_LIMITED':
return 'OpenCode hit a provider usage or rate limit.';
case 'UPSTREAM_UNAVAILABLE':
return "OpenCode's model provider is temporarily unavailable.";
}
}
// Classify the latest `service=llm` provider error in an OpenCode log tail.
// We scope to that single line so the huge request body of *other* lines
// can't leak in, key the classification on the unambiguous HTTP `statusCode`
// first, and fall back to keyword matching the extracted message only.
export function extractOpenCodeServiceFailure(
logTail: string,
): OpenCodeServiceFailure | null {
if (!logTail || !logTail.trim()) return null;
const lines = logTail.split(/\r?\n/);
let line: string | null = null;
for (let i = lines.length - 1; i >= 0; i -= 1) {
const candidate = lines[i]!;
if (
candidate.includes('service=llm') &&
/\bERROR\b/.test(candidate) &&
candidate.includes('error=')
) {
line = candidate;
break;
}
}
if (!line) return null;
const statusMatch = /"statusCode":\s*(\d{3})/.exec(line);
const statusCode = statusMatch ? Number(statusMatch[1]) : null;
const message = pickServiceErrorMessage(line);
let code: AgentServiceFailureCode | null =
statusCode != null ? codeFromStatus(statusCode) : null;
if (!code && message) code = classifyAgentServiceFailure(message);
if (!code) return null;
return { code, message: message || defaultMessageForCode(code), statusCode };
}
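The status-first classification above can be exercised on a small in-memory log tail. This is a sketch under assumptions: the `FailureCode` union is inferred from the `switch` in `defaultMessageForCode`, `classifyTail` is a hypothetical reduced version that omits the keyword fallback, and the sample log lines are invented because the exact OpenCode line layout is not shown in the diff.

```typescript
type FailureCode = 'AGENT_AUTH_REQUIRED' | 'RATE_LIMITED' | 'UPSTREAM_UNAVAILABLE';

function codeFromStatus(statusCode: number): FailureCode | null {
  if (statusCode === 401 || statusCode === 403) return 'AGENT_AUTH_REQUIRED';
  if (statusCode === 429) return 'RATE_LIMITED';
  if (statusCode >= 500 && statusCode <= 599) return 'UPSTREAM_UNAVAILABLE';
  return null;
}

// Find the newest line that looks like a `service=llm` provider error and
// classify it by HTTP status alone (the keyword fallback is omitted here).
function classifyTail(logTail: string): FailureCode | null {
  const lines = logTail.split(/\r?\n/);
  for (let i = lines.length - 1; i >= 0; i -= 1) {
    const line = lines[i] ?? '';
    if (
      line.includes('service=llm') &&
      /\bERROR\b/.test(line) &&
      line.includes('error=')
    ) {
      const match = /"statusCode":\s*(\d{3})/.exec(line);
      return match ? codeFromStatus(Number(match[1])) : null;
    }
  }
  return null;
}

// Hypothetical log tail: the line layout is an assumption.
const sampleTail = [
  'INFO  service=session starting',
  'ERROR service=llm error={"statusCode":429,"message":"usage limit reached"}',
  'INFO  service=session retrying',
].join('\n');

console.log(classifyTail(sampleTail)); // RATE_LIMITED
```

Scanning from the end means a later informational line does not hide the error, and only the single matching line is inspected for a status code.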
// Convenience for the run close handler / inactivity watchdog: resolve the
// log dir from the spawned agent's env, read the newest log tail (bound to
// the current run via `since`), and classify it.
export function readOpenCodeServiceFailure(
env: Record<string, string | undefined>,
options: { since?: number } = {},
): OpenCodeServiceFailure | null {
const logDir = resolveOpenCodeLogDir(env);
if (!logDir) return null;
const tail = readLatestOpenCodeLogTail(logDir, options);
if (!tail) return null;
return extractOpenCodeServiceFailure(tail);
}
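
The "status code first" rule described in the comments above can be sketched in isolation. This is a simplified stand-in, not the function from the diff: it omits the keyword fallback on the extracted message, the `*Sketch` names are hypothetical, and the log line format in the example is an assumption made for illustration.

```typescript
// Sketch only: simplified stand-in for the classifier in the diff.
// The keyword fallback is left out and the log format below is assumed.
type FailureCode = 'AGENT_AUTH_REQUIRED' | 'RATE_LIMITED' | 'UPSTREAM_UNAVAILABLE';

function codeFromStatusSketch(status: number): FailureCode | null {
  if (status === 401 || status === 403) return 'AGENT_AUTH_REQUIRED';
  if (status === 429) return 'RATE_LIMITED';
  if (status >= 500 && status <= 599) return 'UPSTREAM_UNAVAILABLE';
  return null;
}

// Walk the tail backwards so only the newest matching line is inspected.
function latestLlmErrorLine(logTail: string): string | null {
  const lines = logTail.split(/\r?\n/);
  for (let i = lines.length - 1; i >= 0; i -= 1) {
    const candidate = lines[i] ?? '';
    if (
      candidate.includes('service=llm') &&
      /\bERROR\b/.test(candidate) &&
      candidate.includes('error=')
    ) {
      return candidate;
    }
  }
  return null;
}

function classifyTailSketch(logTail: string): FailureCode | null {
  const line = latestLlmErrorLine(logTail);
  if (!line) return null;
  const match = /"statusCode":\s*(\d{3})/.exec(line);
  return match ? codeFromStatusSketch(Number(match[1])) : null;
}

// Hypothetical log tail: an older 401 followed by a newer 429.
const exampleTail = [
  'ERROR service=llm error={"statusCode": 401, "message": "bad key"}',
  'INFO service=session step=retry',
  'ERROR service=llm error={"statusCode":429,"message":"slow down"}',
].join('\n');

console.log(classifyTailSketch(exampleTail)); // RATE_LIMITED
```

Because the scan stops at the newest matching line, the older 401 never influences the result, which is the scoping behavior the comment on `extractOpenCodeServiceFailure` describes.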


@ -10,6 +10,12 @@ function promptArgvBudgetMessage(
'Reduce the selected skills/design-system context or conversation length, or use DeepSeek through an API/provider model connection for large contexts. Pick a stdin-capable adapter when the prompt must include large local context.'
);
}
if (def.id === 'grok-build') {
return (
`${def.name} requires the prompt as the value of -p / --single (xAI CLI 0.1.212+ no longer reads piped stdin), and this run's composed prompt exceeds the safe size (${bytes} > ${def.maxPromptArgBytes} bytes). ` +
'Reduce the selected skills/design-system context or conversation length, or pick an adapter with stdin support (e.g. claude, codex, hermes) when the prompt must include large local context.'
);
}
return (
`${def.name} requires the prompt as a command-line argument and this run's composed prompt exceeds the safe size (${bytes} > ${def.maxPromptArgBytes} bytes). ` +
'Reduce the selected skills/design-system context, shorten the conversation, or pick an adapter with stdin support.'


@ -18,6 +18,7 @@ import { kiloAgentDef } from './defs/kilo.js';
import { vibeAgentDef } from './defs/vibe.js';
import { deepseekAgentDef } from './defs/deepseek.js';
import { aiderAgentDef } from './defs/aider.js';
import { antigravityAgentDef } from './defs/antigravity.js';
import { reasonixAgentDef } from './defs/reasonix.js';
import { readLocalAgentProfileDefs as readLocalAgentProfileDefsFromFile } from './local-profiles.js';
import type { RuntimeAgentDef } from './types.js';
@ -43,6 +44,7 @@ const BASE_AGENT_DEFS: RuntimeAgentDef[] = [
vibeAgentDef,
deepseekAgentDef,
aiderAgentDef,
antigravityAgentDef,
reasonixAgentDef,
];


@ -0,0 +1,130 @@
import { execFile, spawn } from 'node:child_process';
import { promisify } from 'node:util';
const execFileAsync = promisify(execFile);
// Cross-platform spawn helper for "open a system terminal and run this
// command in it." Used by the antigravity adapter's `oauth-launch`
// endpoint: agy's print mode (`-p`) cannot complete the Google
// Sign-In OAuth flow (the upstream callback page asks the user to
// paste the auth code back into agy, but `-p` has no input field), so
// the user has to run `agy` interactively at least once to populate
// the system keyring. Spawning a terminal from inside OD makes that
// a one-click action instead of a "go open Terminal yourself" task.
//
// Each platform branch uses primitives that are safe against shell
// injection BECAUSE we never accept user input here — the `command`
// argument is always a hard-coded binary name like `agy`. Adding
// caller-supplied flags or env vars to this helper would invalidate
// that guarantee, so the signature is intentionally narrow.
export type TerminalLaunchResult =
| { ok: true; platform: NodeJS.Platform; via: string }
| { ok: false; platform: NodeJS.Platform; reason: string };
// macOS: AppleScript via osascript. Bringing Terminal.app to the
// foreground and creating a new shell that immediately runs the
// command is the canonical macOS pattern (same one VS Code uses for
// "Open in External Terminal").
async function launchOnDarwin(command: string): Promise<TerminalLaunchResult> {
// `do script "<cmd>"` opens a new Terminal window and runs <cmd>
// in it; activate brings Terminal.app to the foreground so the
// user actually sees the new window. Strict double-quote escaping
// protects us if `command` ever grows special characters (today
// it's just `agy`, so this is belt-and-suspenders).
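// Escape backslashes before quotes so the added escapes are not re-escaped.
const safe = command.replace(/\\/g, '\\\\').replace(/"/g, '\\"');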
const script = `tell application "Terminal" to do script "${safe}"\ntell application "Terminal" to activate`;
try {
await execFileAsync('osascript', ['-e', script], { timeout: 5_000 });
return { ok: true, platform: 'darwin', via: 'osascript' };
} catch (err) {
return {
ok: false,
platform: 'darwin',
reason: `osascript failed: ${err instanceof Error ? err.message : String(err)}`,
};
}
}
// Linux: try the Debian/Ubuntu meta-emulator first, then the common
// concrete terminals. Each attempt spawns detached so the terminal
// window's lifetime is independent from the daemon's process group.
// We resolve as soon as the child process starts (not when it exits),
// because terminals like xterm and x-terminal-emulator stay alive for
// the duration of the interactive session — waiting for exit would time
// out and kill the window mid-OAuth-flow.
async function launchOnLinux(command: string): Promise<TerminalLaunchResult> {
// Order matters: x-terminal-emulator is the Debian alternative that
// resolves to whichever terminal the distro chose. Otherwise try the
// common ones. Each requires a slightly different invocation syntax
// (`-e` vs `--` vs `-x`), captured in this table.
const attempts: Array<{ bin: string; args: string[] }> = [
{ bin: 'x-terminal-emulator', args: ['-e', command] },
{ bin: 'gnome-terminal', args: ['--', 'sh', '-c', `${command}; exec $SHELL`] },
{ bin: 'konsole', args: ['-e', command] },
{ bin: 'xfce4-terminal', args: ['-e', command] },
{ bin: 'xterm', args: ['-e', command] },
];
const errors: string[] = [];
for (const { bin, args } of attempts) {
try {
await new Promise<void>((resolve, reject) => {
const child = spawn(bin, args, { detached: true, stdio: 'ignore' });
child.unref();
child.once('spawn', resolve);
child.once('error', reject);
});
return { ok: true, platform: 'linux', via: bin };
} catch (err) {
errors.push(`${bin}: ${err instanceof Error ? err.message : String(err)}`);
}
}
return {
ok: false,
platform: 'linux',
reason: `no system terminal worked (${errors.join('; ')})`,
};
}
// Windows: `cmd /c start "<title>" cmd /k "<command>"` — the outer
// `start` opens a new console window (the first quoted "Open Design"
// is the window title, required by `start`'s positional-arg parser
// when the next token is also quoted), and the inner `cmd /k` keeps
// the window open after the command finishes so the user can see
// OAuth output and finish the flow before the window closes.
async function launchOnWindows(command: string): Promise<TerminalLaunchResult> {
try {
await execFileAsync(
'cmd.exe',
['/c', 'start', 'Open Design', 'cmd.exe', '/k', command],
{ timeout: 5_000 },
);
return { ok: true, platform: 'win32', via: 'cmd /c start' };
} catch (err) {
return {
ok: false,
platform: 'win32',
reason: `cmd /c start failed: ${err instanceof Error ? err.message : String(err)}`,
};
}
}
export async function launchAgentInSystemTerminal(
command: string,
platform: NodeJS.Platform = process.platform,
): Promise<TerminalLaunchResult> {
switch (platform) {
case 'darwin':
return launchOnDarwin(command);
case 'linux':
return launchOnLinux(command);
case 'win32':
return launchOnWindows(command);
default:
return {
ok: false,
platform,
reason: `system-terminal launch is not supported on ${platform}`,
};
}
}
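
The quoting concern in the macOS branch can be looked at on its own. AppleScript string literals treat both `\` and `"` as special, so a defensive helper escapes backslashes first and double quotes second. The sketch below is an illustration under that assumption; `escapeAppleScriptString` and `buildTerminalScript` are hypothetical names and are not part of the diff.

```typescript
// Sketch only: hypothetical helpers illustrating AppleScript string quoting.
function escapeAppleScriptString(value: string): string {
  // Backslashes first, so the backslashes added for quotes are not doubled.
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
}

function buildTerminalScript(command: string): string {
  const safe = escapeAppleScriptString(command);
  return [
    `tell application "Terminal" to do script "${safe}"`,
    'tell application "Terminal" to activate',
  ].join('\n');
}

// With the hard-coded binary name used today, nothing needs escaping.
console.log(buildTerminalScript('agy'));
// A contrived input showing both escapes at work.
console.log(escapeAppleScriptString('say "hi" \\ bye'));
```

Keeping the `command` argument limited to a hard-coded binary name, as the header comment in the diff insists, remains the primary safeguard; the escaping only covers the case where that name ever gains special characters.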


@ -18,8 +18,45 @@ export type RuntimeBuildOptions = {
export type RuntimeContext = {
cwd?: string;
// True when the current chat run has at least one prior persisted
// assistant message in the same conversation — i.e. this isn't the
// first user turn. Plain-streaming adapters that support a "continue
// the most recent conversation" CLI flag (e.g. `agy -c`) read this to
// decide whether to resume the upstream agent's own session state
// instead of spawning a fresh, context-free turn. Adapters that
// either have no resume flag or recompose history into the prompt
// themselves ignore this field.
hasPriorAssistantTurn?: boolean;
// Daemon-owned path to a temp file where the adapter should write
// its diagnostic log. Today only antigravity consumes this: agy in
// print mode is silent on stdout/stderr for both missing-auth AND
// quota-exhausted failures (verified via `agy --log-file` capture
// during PR #3157), so post-exit log inspection is the only way to
// tell them apart. Adapters that don't have a `--log-file` flag
// ignore this field; the daemon cleans the file up after reading.
agentLogFilePath?: string;
// Override for the antigravity model-selection settings file path.
// Production code leaves this undefined (falls back to the default
// ~/.gemini/antigravity-cli/settings.json). Tests pass a temp path
// so unit assertions against buildArgs do not touch the real home dir.
antigravitySettingsPath?: string;
};
// Marker on a RuntimeAgentDef declaring that the adapter's CLI maintains
// its own multi-turn conversation memory and the daemon should NOT also
// pack the rendered web transcript (the `## user` / `## assistant` blocks
// `buildDaemonTranscript` produces) into the user request. Today only
// `agy -c` qualifies; other plain-stream adapters have no upstream
// session storage and still rely on the daemon-side transcript injection
// for multi-turn coherence.
//
// Without this opt-out, agy with `-c` receives the same prior turn
// twice — once from its own conversation memory, once embedded in the
// composed user request — and the embedded copy includes the literal
// `<question-form>` markup it emitted on turn 1. The model then
// pattern-matches that and re-emits the form on turn 2, looking like
// the discovery loop never breaks.
export type RuntimeCapabilityMap = Record<string, boolean>;
export type RuntimeListModels = {
@ -101,6 +138,21 @@ export type RuntimeAgentDef = {
| 'opencode-env-content';
installUrl?: string;
docsUrl?: string;
// When `false`, the Settings model picker hides the "Custom (fill below)"
// option and the associated free-text input. Use this for agents whose
// CLI does not actually accept a model id (e.g. `agy` v1.0.3 has no
// `--model` flag yet — upstream issue #35 — and the model is chosen
// server-side; AMR routes model selection through ACP's
// `session/set_model` and rejects free-form ids). Defaults to allowing
// custom input (undefined === true) so most adapters keep today's UX.
supportsCustomModel?: boolean;
// When `true`, the daemon trusts this adapter's CLI to carry its own
// multi-turn conversation memory across spawn invocations (today only
// `agy -c`). The chat composer skips the rendered web transcript on
// follow-up turns and sends just the latest user message. See the
// marker comment above RuntimeCapabilityMap for why double context
// is the root cause of the discovery-form loop.
resumesSessionViaCli?: boolean;
// Optional name of a daemon-process environment variable that overrides
// the default model id when the chat run reaches the spawn layer with
// null or the synthetic 'default'. Used by adapters whose CLI rejects

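
How `resumesSessionViaCli` and `hasPriorAssistantTurn` might combine can be sketched as a small pure function. Only the two field names come from the diff; the types, the function name, the exact condition, and the transcript layout are assumptions made for illustration, not the daemon's actual composer.

```typescript
// Sketch only: hypothetical shapes, just the two field names are from the diff.
type AgentDefSketch = { resumesSessionViaCli?: boolean };
type ContextSketch = { hasPriorAssistantTurn?: boolean };

function composeUserRequestSketch(
  def: AgentDefSketch,
  ctx: ContextSketch,
  transcript: string,
  latestMessage: string,
): string {
  // Assumed rule: skip the transcript only when the CLI keeps its own
  // memory AND there is an earlier assistant turn for it to resume.
  const cliRemembers =
    def.resumesSessionViaCli === true && ctx.hasPriorAssistantTurn === true;
  if (cliRemembers || transcript.trim() === '') return latestMessage;
  // Assumed layout: prior turns first, then the new user message.
  return `${transcript}\n\n## user\n${latestMessage}`;
}

const priorTurns = '## user\nhello\n\n## assistant\n<question-form>...</question-form>';

// Adapter with CLI-side memory: the earlier form markup is not sent again.
console.log(
  composeUserRequestSketch(
    { resumesSessionViaCli: true },
    { hasPriorAssistantTurn: true },
    priorTurns,
    'use the dark theme',
  ),
);
```

Under these assumptions an adapter without the flag still receives the full transcript, which matches the comment's note that other plain-stream adapters rely on daemon-side injection.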

@ -0,0 +1,134 @@
import fs from 'node:fs';
import path from 'node:path';
import { resolveProjectRelativePath } from './home-expansion.js';
export const SANDBOX_MODE_ENV = 'OD_SANDBOX_MODE';
export interface SandboxRuntimeRoots {
agentHomeDir: string;
cacheDir: string;
configDir: string;
generatedFilesDir: string;
logsDir: string;
mcpConfigDir: string;
pluginStateDir: string;
previewStateDir: string;
skillsCacheDir: string;
tempDir: string;
toolConfigDir: string;
}
export interface SandboxRuntimeConfig {
enabled: boolean;
dataDir: string;
roots: SandboxRuntimeRoots;
}
const TRUTHY_VALUES = new Set(['1', 'true', 'yes', 'on']);
const FALSY_VALUES = new Set(['0', 'false', 'no', 'off', '']);
export function isSandboxModeEnabled(
env: Record<string, string | undefined> = process.env,
): boolean {
const raw = env[SANDBOX_MODE_ENV];
if (typeof raw !== 'string') return false;
const value = raw.trim().toLowerCase();
if (TRUTHY_VALUES.has(value)) return true;
if (FALSY_VALUES.has(value)) return false;
throw new Error(
`${SANDBOX_MODE_ENV} must be one of ${Array.from(TRUTHY_VALUES).join(', ')} ` +
`or ${Array.from(FALSY_VALUES).join(', ')}`,
);
}
export function resolveSandboxRuntimeConfig(
enabled: boolean,
dataDir: string,
): SandboxRuntimeConfig {
const sandboxRoot = path.join(dataDir, 'sandbox');
return {
enabled,
dataDir,
roots: {
agentHomeDir: path.join(sandboxRoot, 'agent-home'),
cacheDir: path.join(sandboxRoot, 'cache'),
configDir: path.join(sandboxRoot, 'config'),
generatedFilesDir: path.join(dataDir, 'generated-files'),
logsDir: path.join(dataDir, 'logs'),
mcpConfigDir: dataDir,
pluginStateDir: path.join(dataDir, 'plugins'),
previewStateDir: path.join(dataDir, 'previews'),
skillsCacheDir: path.join(dataDir, 'skills'),
tempDir: path.join(sandboxRoot, 'tmp'),
toolConfigDir: path.join(sandboxRoot, 'tools'),
},
};
}
export function resolveSandboxRuntimeConfigFromEnv(
env: Record<string, string | undefined>,
projectRoot: string,
): SandboxRuntimeConfig | null {
if (!isSandboxModeEnabled(env)) return null;
const rawDataDir = env.OD_DATA_DIR?.trim();
if (!rawDataDir) {
throw new Error('OD_DATA_DIR is required when OD_SANDBOX_MODE is enabled');
}
return resolveSandboxRuntimeConfig(
true,
resolveProjectRelativePath(rawDataDir, projectRoot),
);
}
export function sandboxAgentProfilesConfigPath(
config: SandboxRuntimeConfig,
): string {
return path.join(
config.roots.agentHomeDir,
'.open-design',
'agents.local.json',
);
}
export function ensureSandboxRuntimeDirs(config: SandboxRuntimeConfig): void {
if (!config.enabled) return;
for (const dir of new Set(Object.values(config.roots))) {
fs.mkdirSync(dir, { recursive: true });
}
}
export function applySandboxRuntimeEnv(
baseEnv: NodeJS.ProcessEnv,
config: SandboxRuntimeConfig,
): NodeJS.ProcessEnv {
if (!config.enabled) return baseEnv;
const env: NodeJS.ProcessEnv = { ...baseEnv };
const { roots } = config;
const codexHome = path.join(roots.agentHomeDir, '.codex');
const claudeConfigDir = path.join(roots.configDir, 'claude');
const opencodeHome = path.join(roots.agentHomeDir, '.opencode');
const npmUserConfig = path.join(roots.toolConfigDir, 'npmrc');
env[SANDBOX_MODE_ENV] = '1';
env.OD_DATA_DIR = config.dataDir;
env.OD_AGENT_HOME = roots.agentHomeDir;
env.HOME = roots.agentHomeDir;
env.USERPROFILE = roots.agentHomeDir;
env.XDG_CONFIG_HOME = roots.configDir;
env.XDG_CACHE_HOME = roots.cacheDir;
env.XDG_DATA_HOME = path.join(roots.configDir, 'data');
env.XDG_STATE_HOME = path.join(roots.configDir, 'state');
env.TMPDIR = roots.tempDir;
env.TEMP = roots.tempDir;
env.TMP = roots.tempDir;
env.CODEX_HOME = codexHome;
env.CLAUDE_CONFIG_DIR = claudeConfigDir;
env.OPENCODE_TEST_HOME = opencodeHome;
env.OD_AGENT_PROFILES_CONFIG = sandboxAgentProfilesConfigPath(config);
env.NPM_CONFIG_USERCONFIG = npmUserConfig;
env.npm_config_userconfig = npmUserConfig;
return env;
}
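
The three outcomes of the `OD_SANDBOX_MODE` parsing above (enabled, disabled, rejected) are easier to see with sample inputs. The standalone sketch below restates the rule from `isSandboxModeEnabled`; `parseFlagSketch` is a hypothetical name and the error text is simplified.

```typescript
// Sketch only: restates the tri-state flag parsing with sample inputs.
const TRUTHY = new Set(['1', 'true', 'yes', 'on']);
const FALSY = new Set(['0', 'false', 'no', 'off', '']);

function parseFlagSketch(raw: string | undefined, name: string): boolean {
  // An unset variable means "disabled", not an error.
  if (typeof raw !== 'string') return false;
  const value = raw.trim().toLowerCase();
  if (TRUTHY.has(value)) return true;
  if (FALSY.has(value)) return false;
  // Anything else is rejected loudly instead of being silently ignored.
  throw new Error(`${name} has an unrecognized value: ${JSON.stringify(raw)}`);
}

for (const sample of [undefined, '', '  ', ' ON ', 'No', 'maybe']) {
  try {
    console.log(JSON.stringify(sample), '->', parseFlagSketch(sample, 'OD_SANDBOX_MODE'));
  } catch (err) {
    console.log(JSON.stringify(sample), '-> throws:', (err as Error).message);
  }
}
```

Throwing on unknown values means a typo such as `OD_SANDBOX_MODE=ture` stops startup rather than quietly running without the sandbox.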


@ -1,5 +1,7 @@
import assert from 'node:assert/strict';
import { EventEmitter } from 'node:events';
import fs from 'node:fs';
import os from 'node:os';
import { PassThrough } from 'node:stream';
import path from 'node:path';
import { test, vi } from 'vitest';
@ -202,6 +204,107 @@ test('attachAcpSession keeps legacy session/set_model when no model config optio
assert.equal(requests.some((entry) => entry.method === 'session/set_config_option'), false);
});
test('attachAcpSession includes image attachments as ACP resource links', () => {
const child = new FakeAcpChild();
const writes: string[] = [];
child.stdin.on('data', (chunk) => writes.push(String(chunk)));
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'od-acp-image-'));
const imagePath = path.join(tmpDir, 'screenshot.png');
fs.writeFileSync(imagePath, 'png');
attachAcpSession({
child: child as never,
prompt: 'describe this image',
cwd: '/tmp/od-project',
model: null,
imagePaths: [imagePath],
mcpServers: [],
send: () => {},
});
writeAcpResult(child, 1, {});
writeAcpResult(child, 2, { sessionId: 'session-1' });
writeAcpResult(child, 3, {});
const requests = parseRpcWrites(writes);
const promptRequest = requests.find((entry) => entry.method === 'session/prompt');
assert.deepEqual(promptRequest?.params, {
sessionId: 'session-1',
prompt: [
{ type: 'text', text: 'describe this image' },
{ type: 'resource_link', uri: imagePath },
],
});
});
test('attachAcpSession converts cumulative ACP message snapshots into deltas', () => {
const child = new FakeAcpChild();
const events: Array<{ event: string; payload: unknown }> = [];
attachAcpSession({
child: child as never,
prompt: 'describe the project',
cwd: '/tmp/od-project',
model: null,
mcpServers: [],
send: (event, payload) => events.push({ event, payload }),
});
writeAcpResult(child, 1, {});
writeAcpResult(child, 2, { sessionId: 'session-1' });
writeAcpUpdate(child, {
sessionUpdate: 'agent_message_chunk',
content: { text: 'Agent Haven' },
});
writeAcpUpdate(child, {
sessionUpdate: 'agent_message_chunk',
content: { text: 'Agent Haven — managed AI agents' },
});
writeAcpUpdate(child, {
sessionUpdate: 'agent_message_chunk',
content: { text: 'Agent Haven — managed AI agents' },
});
writeAcpResult(child, 3, { usage: { inputTokens: 1, outputTokens: 2 } });
const textDeltas = events
.filter((entry) => entry.event === 'agent' && (entry.payload as { type?: unknown }).type === 'text_delta')
.map((entry) => (entry.payload as { delta?: unknown }).delta);
assert.deepEqual(textDeltas, ['Agent Haven', ' — managed AI agents']);
});
test('attachAcpSession keeps incremental ACP message chunks unchanged', () => {
const child = new FakeAcpChild();
const events: Array<{ event: string; payload: unknown }> = [];
attachAcpSession({
child: child as never,
prompt: 'describe the project',
cwd: '/tmp/od-project',
model: null,
mcpServers: [],
send: (event, payload) => events.push({ event, payload }),
});
writeAcpResult(child, 1, {});
writeAcpResult(child, 2, { sessionId: 'session-1' });
writeAcpUpdate(child, {
sessionUpdate: 'agent_message_chunk',
content: { text: 'Agent Haven' },
});
writeAcpUpdate(child, {
sessionUpdate: 'agent_message_chunk',
content: { text: ' — managed AI agents' },
});
writeAcpResult(child, 3, { usage: { inputTokens: 1, outputTokens: 2 } });
const textDeltas = events
.filter((entry) => entry.event === 'agent' && (entry.payload as { type?: unknown }).type === 'text_delta')
.map((entry) => (entry.payload as { delta?: unknown }).delta);
assert.deepEqual(textDeltas, ['Agent Haven', ' — managed AI agents']);
});
test('attachAcpSession exposes abort and sends session cancel after session creation', () => {
const child = new FakeAcpChild();
const writes: string[] = [];
@ -293,6 +396,10 @@ function writeAcpResult(child: FakeAcpChild, id: number, result: unknown): void
child.stdout.write(`${JSON.stringify({ id, result })}\n`);
}
function writeAcpUpdate(child: FakeAcpChild, update: unknown): void {
child.stdout.write(`${JSON.stringify({ method: 'session/update', params: { update } })}\n`);
}
function agentModelStatuses(events: Array<{ event: string; payload: unknown }>): unknown[] {
return events
.filter((entry) => {

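
The two `agent_message_chunk` tests above pin down one behavior: cumulative snapshots and plain incremental chunks must both yield the same stream of deltas. One way to get that result is a small prefix check, sketched below. `createDeltaNormalizer` is a hypothetical helper written for illustration; the actual logic inside `attachAcpSession` is not shown in this diff and may differ.

```typescript
// Sketch only: hypothetical normalizer, not the code under test.
function createDeltaNormalizer(): (chunk: string) => string {
  let seen = '';
  return (chunk: string): string => {
    // Cumulative snapshot: the chunk repeats everything seen so far,
    // so only the new suffix is a delta (possibly the empty string).
    if (seen !== '' && chunk.startsWith(seen)) {
      const delta = chunk.slice(seen.length);
      seen = chunk;
      return delta;
    }
    // Otherwise treat the chunk as a plain incremental delta.
    seen += chunk;
    return chunk;
  };
}

const fromSnapshots = createDeltaNormalizer();
const snapshotDeltas = ['Hello', 'Hello, world', 'Hello, world']
  .map((chunk) => fromSnapshots(chunk))
  .filter((delta) => delta !== '');

const fromIncrements = createDeltaNormalizer();
const incrementalDeltas = ['Hello', ', world']
  .map((chunk) => fromIncrements(chunk))
  .filter((delta) => delta !== '');

// Both sequences reduce to the same two deltas.
console.log(snapshotDeltas, incrementalDeltas);
```

The prefix heuristic has a known blind spot: an incremental chunk that happens to begin with the entire text seen so far would be mistaken for a snapshot, so a production implementation may need a stronger signal than this sketch uses.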

@ -87,6 +87,19 @@ describe('agent runtime tool environment', () => {
expect(env.OD_DATA_DIR).toBe(process.env.OD_DATA_DIR);
});
it('keeps non-sandbox NO_PROXY behavior unchanged', () => {
const env = createAgentRuntimeEnv(
{ PATH: '/bin', HTTP_PROXY: 'http://127.0.0.1:9', NO_PROXY: '' },
'http://127.0.0.1:7456',
{ token: 'fresh-token' },
'/opt/open-design/bin/node',
);
expect(env.HTTP_PROXY).toBe('http://127.0.0.1:9');
expect(env.NO_PROXY).toBe('');
expect(env.no_proxy).toBeUndefined();
});
it('passes the daemon sidecar IPC path from the explicit base env into agent wrapper sessions', () => {
const env = createAgentRuntimeEnv(
{ PATH: '/bin', [SIDECAR_ENV.IPC_PATH]: '/tmp/open-design/ipc/daemon.sock' },


@ -15,6 +15,8 @@
*/
import { spawn, type ChildProcess } from 'node:child_process';
import { chmodSync, existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { fileURLToPath } from 'node:url';
import path from 'node:path';
import { describe, expect, it } from 'vitest';
@ -104,6 +106,8 @@ describe('AMR runtime def', () => {
expect(normalizeVelaModelId('public_model_qwen3_235b_a22b')).toBe('qwen3-235b-a22b');
expect(normalizeVelaModelId('deepseek-v3.2')).toBe('deepseek-v3.2');
expect(normalizeVelaModelId('vela/deepseek-v3.2')).toBe('deepseek-v3.2');
expect(normalizeVelaModelId('deepseek-v3-2')).toBe('deepseek-v3.2');
expect(normalizeVelaModelId('vela/deepseek-v3-2')).toBe('deepseek-v3.2');
});
it('parses `vela models` output with fast chat defaults and plain canonical labels', () => {
@ -149,6 +153,61 @@ describe('AMR runtime def', () => {
{ id: 'qwen3-235b-a22b', label: 'qwen3-235b-a22b' },
]);
});
it('retries transient `vela models` failures before succeeding', async () => {
const tempDir = mkdtempSync(path.join(tmpdir(), 'od-amr-retry-'));
const stateFile = path.join(tempDir, 'retry-state.json');
const wrapperPath = path.join(tempDir, 'vela-wrapper');
const wrapperSource = `#!/usr/bin/env node
const { existsSync, readFileSync, writeFileSync } = require('node:fs');
const { spawn } = require('node:child_process');
const stateFile = process.env.RETRY_STATE_FILE;
const fakeVela = process.env.FAKE_VELA_PATH;
const args = process.argv.slice(2);
if (args[0] === 'models') {
const state = stateFile && existsSync(stateFile)
? JSON.parse(readFileSync(stateFile, 'utf8'))
: { attempts: 0 };
state.attempts += 1;
if (stateFile) writeFileSync(stateFile, JSON.stringify(state), 'utf8');
if (state.attempts < 3) {
process.stderr.write('Get "https://amr-link.open-design.ai/v1/models": context deadline exceeded\\n');
process.exit(1);
}
}
const child = spawn(process.execPath, [fakeVela, ...args], {
stdio: ['ignore', 'pipe', 'pipe'],
env: process.env,
});
let stdout = '';
let stderr = '';
child.stdout.on('data', (chunk) => { stdout += String(chunk); });
child.stderr.on('data', (chunk) => { stderr += String(chunk); });
child.on('exit', (code) => {
if (stdout) process.stdout.write(stdout);
if (stderr) process.stderr.write(stderr);
process.exit(code ?? 0);
});
`;
writeFileSync(wrapperPath, wrapperSource, 'utf8');
chmodSync(wrapperPath, 0o755);
try {
const models = await amrAgentDef.fetchModels?.(
wrapperPath,
{
...process.env,
FAKE_VELA_PATH: FAKE_VELA,
RETRY_STATE_FILE: stateFile,
},
);
expect(models?.[0]?.id).toBe('deepseek-v4-flash');
expect(existsSync(stateFile)).toBe(true);
const attempts = JSON.parse(readFileSync(stateFile, 'utf8')) as { attempts: number };
expect(attempts.attempts).toBe(3);
} finally {
rmSync(tempDir, { recursive: true, force: true });
}
});
});
describe('AMR ACP transport — end-to-end against fake vela stub', () => {


@ -0,0 +1,33 @@
import { mkdtemp, mkdir, rm, symlink, writeFile } from 'node:fs/promises';
import path from 'node:path';
import { tmpdir } from 'node:os';
import { afterEach, expect, test } from 'vitest';
import { stageAmrImagePaths } from '../src/amr-image-staging.js';
const tempDirs: string[] = [];
afterEach(async () => {
await Promise.all(tempDirs.splice(0).map((dir) => rm(dir, { recursive: true, force: true })));
});
test('stageAmrImagePaths rejects upload symlinks that resolve outside the upload root', async () => {
const root = await mkdtemp(path.join(tmpdir(), 'od-amr-stage-'));
tempDirs.push(root);
const projectDir = path.join(root, 'project');
const uploadRoot = path.join(root, 'uploads');
const outsideDir = path.join(root, 'outside');
await Promise.all([
mkdir(projectDir, { recursive: true }),
mkdir(uploadRoot, { recursive: true }),
mkdir(outsideDir, { recursive: true }),
]);
const outsideFile = path.join(outsideDir, 'secret.png');
const symlinkPath = path.join(uploadRoot, 'escape.png');
await writeFile(outsideFile, 'not-an-image');
await symlink(outsideFile, symlinkPath);
await expect(
stageAmrImagePaths(projectDir, [symlinkPath], uploadRoot),
).resolves.toEqual([]);
});
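
The test above expects a symlink that resolves outside the upload root to be dropped from the staged list. A common way to implement that kind of containment check, once both paths have been resolved through `fs.realpath`, is to inspect `path.relative`. The helper below is a sketch under that assumption; `isInsideRoot` is a hypothetical name and this is not necessarily how `stageAmrImagePaths` works.

```typescript
import * as path from 'node:path';

// Sketch only: hypothetical helper. Both arguments are expected to be
// real paths already (symlinks followed), e.g. results of fs.realpath.
function isInsideRoot(realRoot: string, realCandidate: string): boolean {
  const relative = path.relative(realRoot, realCandidate);
  if (relative === '') return false; // the root itself is not a file inside it
  if (relative === '..' || relative.startsWith(`..${path.sep}`)) return false;
  // On Windows a different drive yields an absolute "relative" path.
  return !path.isAbsolute(relative);
}

const uploadRoot = path.resolve('/srv/od/uploads');

console.log(isInsideRoot(uploadRoot, path.join(uploadRoot, 'shot.png')));
console.log(isInsideRoot(uploadRoot, path.resolve('/srv/od/outside/secret.png')));
```

Comparing against `..` plus the separator, rather than a bare `startsWith('..')`, keeps legitimately named files such as `..notes.png` inside the root from being rejected.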


@ -5,7 +5,7 @@
// OD_API_TOKEN is set.
// 2. When OD_API_TOKEN is set, every /api/* request from a non-loopback
// peer must carry `Authorization: Bearer <OD_API_TOKEN>`. The
-// health/version/status probes stay open for monitoring.
+// health/readiness/version probes stay open for monitoring.
//
// Tests force the bearer-required code path by stamping the env vars
// before startServer. The daemon listens on 127.0.0.1 throughout (so
@ -77,8 +77,8 @@ describe('bearer middleware', () => {
expect(resp.status).toBe(200);
});
-it('keeps health / version / daemon-status open without a bearer', async () => {
-for (const path of ['/api/health', '/api/version', '/api/daemon/status']) {
+it('keeps health / readiness / version probes open without a bearer', async () => {
+for (const path of ['/api/health', '/api/ready', '/api/version']) {
const resp = await fetch(`${baseUrl}${path}`);
expect(resp.status).toBe(200);
}


@ -29,6 +29,8 @@ import { getAgentDef } from '../src/agents.js';
import { readMemoryConfig, writeMemoryConfig } from '../src/memory.js';
import { renderCodexImagegenOverride } from '../src/prompts/system.js';
const FAKE_VELA_FIXTURE = resolve(process.cwd(), 'tests', 'fixtures', 'fake-vela.mjs');
function symlinkDir(target: string, link: string): void {
symlinkSync(target, link, process.platform === 'win32' ? 'junction' : 'dir');
}
@ -214,6 +216,173 @@ process.exit(0);
);
});
it('rewrites the OpenCode scanner overflow into a generic retry message', async () => {
const conversationId = `conv-${randomUUID()}`;
await withFakeAgent(
'opencode',
`
process.stderr.write('json-rpc id 4: opencode event stream: read opencode SSE: bufio.Scanner: token too long\\n');
process.exit(1);
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
conversationId,
message: 'hello',
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('AGENT_EXECUTION_FAILED');
expect(body).toContain('The run failed due to an unknown upstream streaming error. Please retry.');
expect(body).toContain('event: stderr');
expect(body).toContain('"status":"failed"');
},
);
});
it('retries transient AMR Link catalog failures before aborting startup', async () => {
const previousRuntimeKey = process.env.VELA_RUNTIME_KEY;
const previousLinkUrl = process.env.VELA_LINK_URL;
const stateFile = join(tmpdir(), `od-amr-model-retry-${randomUUID()}.json`);
try {
process.env.VELA_RUNTIME_KEY = 'fake-runtime-key';
process.env.VELA_LINK_URL = 'https://amr-link.open-design.ai/v1';
await withFakeAgent(
'vela',
`
const { existsSync, readFileSync, writeFileSync } = require('node:fs');
const { spawn } = require('node:child_process');
const fixture = ${JSON.stringify(FAKE_VELA_FIXTURE)};
const stateFile = ${JSON.stringify(stateFile)};
const args = process.argv.slice(2);
if (args[0] === 'models') {
const state = existsSync(stateFile)
? JSON.parse(readFileSync(stateFile, 'utf8'))
: { attempts: 0 };
state.attempts += 1;
writeFileSync(stateFile, JSON.stringify(state), 'utf8');
if (state.attempts < 3) {
process.stderr.write('Get "https://amr-link.open-design.ai/v1/models": context deadline exceeded\\n');
process.exit(1);
}
}
const child = spawn(process.execPath, [fixture, ...args], {
stdio: 'inherit',
env: process.env,
});
child.on('exit', (code, signal) => {
if (signal) process.kill(process.pid, signal);
process.exit(code ?? 0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'amr',
message: 'hello',
model: 'deepseek-v3.2',
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('"type":"text_delta","delta":"Hello from fake "');
expect(body).toContain('"type":"text_delta","delta":"vela."');
expect(body).not.toContain('model_catalog_unavailable');
const attempts = JSON.parse(readFileSync(stateFile, 'utf8')) as { attempts: number };
expect(attempts.attempts).toBe(3);
},
);
} finally {
rmSync(stateFile, { force: true });
if (previousRuntimeKey == null) delete process.env.VELA_RUNTIME_KEY;
else process.env.VELA_RUNTIME_KEY = previousRuntimeKey;
if (previousLinkUrl == null) delete process.env.VELA_LINK_URL;
else process.env.VELA_LINK_URL = previousLinkUrl;
}
});
it('does not report plugin authoring as succeeded when the agent only emits planning text without artifacts', async () => {
const projectId = `proj-plugin-authoring-${randomUUID()}`;
const createProjectResponse = await fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id: projectId,
name: 'Plugin authoring completion fixture',
skillId: null,
designSystemId: null,
}),
});
expect(createProjectResponse.status).toBe(200);
const conversationsResponse = await fetch(`${baseUrl}/api/projects/${projectId}/conversations`);
expect(conversationsResponse.status).toBe(200);
const conversationsBody = await conversationsResponse.json() as {
conversations: Array<{ id: string }>;
};
const conversationId = conversationsBody.conversations[0]?.id;
expect(conversationId).toBeTruthy();
await withFakeAgent(
'opencode',
`
process.stdin.resume();
process.stdin.on('end', () => {
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: '我来帮你创建一个通用的 Open Design 插件脚手架。先读取文档规范,再生成插件文件。' } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
projectId,
conversationId,
pluginId: 'od-plugin-authoring',
message: '请创建一个可刷新、可审计、由 API 驱动的 Open Design 插件脚手架。',
}),
});
expect(createResponse.status).toBe(202);
const {
runId,
pluginId,
appliedPluginSnapshotId,
} = await createResponse.json() as {
runId: string;
pluginId: string | null;
appliedPluginSnapshotId: string | null;
};
expect(pluginId).toBe('od-plugin-authoring');
expect(appliedPluginSnapshotId).toBeTruthy();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`);
const eventsBody = await readSseUntil(eventsResponse, 'event: final');
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('先读取文档规范,再生成插件文件');
expect(statusBody.status).not.toBe('succeeded');
const filesResponse = await fetch(`${baseUrl}/api/projects/${projectId}/files`);
expect(filesResponse.status).toBe(200);
const filesBody = await filesResponse.json() as { files: Array<{ name: string }> };
expect(filesBody.files.some((file) => file.name.startsWith('generated-plugin/'))).toBe(false);
},
);
});
it('closes the # Instructions block with an explicit "do not echo" guard so models do not parrot the prompt back', async () => {
// claude-opus-4-7 (and a few other instruction-tuned models) start
// their reply by echoing the # Instructions block verbatim, which
@ -1138,6 +1307,50 @@ process.exit(1);
);
});
it('suppresses Antigravity auth stdout and emits AGENT_AUTH_REQUIRED without an event: stdout delta', async () => {
await withFakeAgent(
'agy',
`
const args = process.argv.slice(2);
if (args[0] === '--version') {
console.log('1.107.0-test');
process.exit(0);
}
// Simulate agy chat - printing the OAuth prompt and exiting 0
process.stdout.write('Authentication required. Please visit the URL to log in: https://accounts.google.com/o/oauth2/auth?client_id=12345&redirect_uri=antigravity-redirect\\n');
process.stdout.write('Waiting for authentication (timeout 30s)...\\n');
process.stdout.write('Error: authentication timed out.\\n');
process.exit(0);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'antigravity',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'AGENT_AUTH_REQUIRED');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('AGENT_AUTH_REQUIRED');
expect(eventsBody).not.toContain('event: stdout');
expect(eventsBody).not.toContain('accounts.google.com');
expect(statusBody.status).toBe('failed');
},
);
});
it('surfaces Qoder assistant error records through the SSE error channel', async () => {
const qoderErrorLine = JSON.stringify({
type: 'assistant',


@ -0,0 +1,96 @@
/**
* Regression tests for the role-marker guard's scope in
* `claude-stream.ts`: specifically, that the guard is applied only to
* the user-visible `text_delta` channel and NOT to `thinking_delta`.
*
* Rationale (see role-marker-guard.ts docblock + PR #3303 review
* r3324xxxxxx): extended-thinking content is never folded into
* `m.content` by `buildDaemonTranscript`, so it cannot become a
* fabricated turn boundary on the next round-trip. Models routinely
* emit literal `## user` / `## assistant` lines in chain-of-thought
* when reasoning about conversation structure; guarding the thinking
* channel would abort otherwise-legitimate runs without buying any
* security.
*/
import { describe, expect, it } from 'vitest';
import { createClaudeStreamHandler } from '../src/claude-stream.js';
type Event = Record<string, unknown>;
function collect(): { events: Event[]; sink: (ev: Event) => void } {
const events: Event[] = [];
return { events, sink: (ev) => events.push(ev) };
}
function feedJsonl(handler: ReturnType<typeof createClaudeStreamHandler>, lines: object[]) {
for (const line of lines) {
handler.feed(JSON.stringify({ type: 'stream_event', event: line }) + '\n');
}
}
describe('claude-stream role-marker guard scope', () => {
it('does NOT contaminate or warn when ## user appears in thinking_delta', () => {
const { events, sink } = collect();
const handler = createClaudeStreamHandler(sink);
feedJsonl(handler, [
{ type: 'message_start', message: { id: 'msg-think-1' } },
{
type: 'content_block_delta',
index: 0,
delta: {
type: 'thinking_delta',
thinking:
'Let me think about this. The user might phrase it as a question like:\n## user\nWhat is the cost?\n## assistant\nIt is $X.\nBut they actually asked for a summary, so…',
},
},
{ type: 'content_block_delta', index: 1, delta: { type: 'text_delta', text: 'The cost is $X.' } },
]);
// No fabricated_role_marker event must fire.
const warnings = events.filter((e) => e.type === 'fabricated_role_marker');
expect(warnings).toHaveLength(0);
// The thinking_delta should reach the consumer intact (no truncation
// at the `## user` line — the entire reasoning passes through).
const thinking = events
.filter((e) => e.type === 'thinking_delta')
.map((e) => e.delta)
.join('');
expect(thinking).toContain('## user');
expect(thinking).toContain('## assistant');
expect(thinking).toContain('summary');
// The subsequent text_delta answer must still stream — the run
// was not aborted by the thinking-channel marker.
const answer = events
.filter((e) => e.type === 'text_delta')
.map((e) => e.delta)
.join('');
expect(answer).toBe('The cost is $X.');
});
it('DOES contaminate when ## user appears in text_delta (sanity check)', () => {
const { events, sink } = collect();
const handler = createClaudeStreamHandler(sink);
feedJsonl(handler, [
{ type: 'message_start', message: { id: 'msg-text-1' } },
{ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'OK.\n## user\nfabricated' } },
]);
// Real attack vector — must fire on the text channel.
const warnings = events.filter((e) => e.type === 'fabricated_role_marker');
expect(warnings).toHaveLength(1);
expect(warnings[0]!.marker).toBe('## user');
// Pre-marker prefix `OK.` emitted; everything from the marker
// onward suppressed.
const text = events
.filter((e) => e.type === 'text_delta')
.map((e) => e.delta)
.join('');
expect(text).toBe('OK.');
});
});
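
The scope rule these two tests pin down can be sketched as follows. This is a minimal illustration, not the project's `role-marker-guard.ts`: the names `guardVisibleText`, `applyGuard`, `GuardResult`, and `Delta` are hypothetical, and the sketch assumes a marker is a whole line reading `## user` or `## assistant`. It also ignores markers split across streamed chunks, which a real guard would have to buffer for.

```typescript
// Hypothetical sketch of a channel-scoped role-marker guard.
type GuardResult = { safeText: string; marker: string | null };

// Assumption: a fabricated turn boundary is a full line "## user" or "## assistant".
const ROLE_MARKER = /^## (user|assistant)\s*$/m;

function guardVisibleText(text: string): GuardResult {
  const match = ROLE_MARKER.exec(text);
  if (!match) return { safeText: text, marker: null };
  // Keep only what precedes the marker line, dropping the newline before it.
  const prefix = text.slice(0, match.index).replace(/\r?\n$/, '');
  return { safeText: prefix, marker: match[0].trim() };
}

type Delta = { type: 'text_delta' | 'thinking_delta'; value: string };

function applyGuard(delta: Delta): GuardResult {
  // Thinking content passes through untouched; only visible text is guarded.
  if (delta.type === 'thinking_delta') return { safeText: delta.value, marker: null };
  return guardVisibleText(delta.value);
}
```

With this shape, a `thinking_delta` that quotes `## user` is returned whole, while the same line in a `text_delta` truncates the output to the text before the marker.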


@ -86,6 +86,42 @@ async function withFakeAgent<T>(
}
}
async function withOnlyFakeAgent<T>(
binName: string,
script: string,
run: () => Promise<T>,
): Promise<T> {
const dir = await fsp.mkdtemp(path.join(os.tmpdir(), 'od-conn-test-bin-'));
const oldPath = process.env.PATH;
const oldAgentHome = process.env.OD_AGENT_HOME;
const oldClaudeBin = process.env.CLAUDE_BIN;
try {
if (process.platform === 'win32') {
const runner = path.join(dir, `${binName}-test-runner.cjs`);
await fsp.writeFile(runner, script);
await fsp.writeFile(
path.join(dir, `${binName}.cmd`),
`@echo off\r\nnode "${runner}" %*\r\n`,
);
} else {
const bin = path.join(dir, binName);
await fsp.writeFile(bin, `#!/usr/bin/env node\n${script}`);
await fsp.chmod(bin, 0o755);
}
process.env.PATH = dir;
process.env.OD_AGENT_HOME = dir;
delete process.env.CLAUDE_BIN;
return await run();
} finally {
if (oldPath === undefined) delete process.env.PATH;
else process.env.PATH = oldPath;
if (oldAgentHome === undefined) delete process.env.OD_AGENT_HOME;
else process.env.OD_AGENT_HOME = oldAgentHome;
if (oldClaudeBin === undefined) delete process.env.CLAUDE_BIN;
else process.env.CLAUDE_BIN = oldClaudeBin;
await fsp.rm(dir, { recursive: true, force: true });
}
}
async function withFakeCodex<T>(script: string, run: () => Promise<T>): Promise<T> {
return withFakeAgent('codex', script, run);
}
@ -94,6 +130,10 @@ async function withFakeClaude<T>(script: string, run: () => Promise<T>): Promise
return withFakeAgent('claude', script, run);
}
async function withOnlyFakeOpenClaude<T>(script: string, run: () => Promise<T>): Promise<T> {
return withOnlyFakeAgent('openclaude', script, run);
}
async function withFakeOpenCode<T>(script: string, run: () => Promise<T>): Promise<T> {
return withFakeAgent('opencode', script, run);
}
@ -2199,6 +2239,58 @@ process.stdin.on('end', () => {
);
});
it('preserves ANTHROPIC_API_KEY when Claude adapter launches the OpenClaude fallback', async () => {
const envFile = path.join(os.tmpdir(), `od-openclaude-env-${Date.now()}-${Math.random()}.json`);
const previousKey = process.env.ANTHROPIC_API_KEY;
try {
process.env.ANTHROPIC_API_KEY = 'sk-openclaude-test';
await withOnlyFakeOpenClaude(
`
const fs = require('node:fs');
fs.writeFileSync(${JSON.stringify(envFile)}, JSON.stringify({
ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY || null,
}));
let input = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => { input += chunk; });
process.stdin.on('end', () => {
try {
JSON.parse(input.trim());
console.log(JSON.stringify({
type: 'assistant',
message: {
id: 'msg_1',
content: [{ type: 'text', text: 'ok' }],
stop_reason: 'end_turn',
},
}));
} catch (err) {
console.error(err instanceof Error ? err.message : String(err));
process.exit(1);
}
});
`,
async () => {
const result = await testAgentConnection({ agentId: 'claude' });
expect(result).toMatchObject({
ok: true,
kind: 'success',
agentName: 'Claude Code',
});
await expect(fsp.readFile(envFile, 'utf8')).resolves.toBe(
JSON.stringify({ ANTHROPIC_API_KEY: 'sk-openclaude-test' }),
);
expect(result.diagnostics?.binaryPath ?? '').toMatch(/openclaude/i);
},
);
} finally {
if (previousKey === undefined) delete process.env.ANTHROPIC_API_KEY;
else process.env.ANTHROPIC_API_KEY = previousKey;
await fsp.rm(envFile, { force: true });
}
});
it('returns Claude /login guidance when the spawned CLI cannot authenticate', async () => {
await withFakeClaude(
`console.error(JSON.stringify({ apiKeySource: 'none', error_status: 401 })); process.exit(1);`,
@ -2974,16 +3066,18 @@ process.stdin.on('end', () => {
});
it('reports an early-phase diagnostics block when the agent CLI is missing (#2248)', async () => {
// Clear PATH so the daemon cannot locate `claude`. We restore the
// env in `finally` to avoid leaking the empty PATH to later tests.
// Depending on whether the resolver short-circuits or the spawn
// itself ENOENTs, the kind may be agent_not_installed or
// agent_spawn_failed and the phase may be 'binary_resolution' or
// 'spawn'. Both are valid "we never reached the smoke test" shapes
// — the actionable bit for the UI is that diagnostics arrived at
// all and that the phase is one of the two early values.
// Isolate every resolver input so the daemon truly cannot locate
// `claude`, even on machines that have a pinned CLAUDE_BIN or an
// alternate user toolchain home configured. PATH alone is no longer
// sufficient because runtime resolution also consults CLI env
// overrides and OD_AGENT_HOME-scoped toolchain bins.
const oldPath = process.env.PATH;
const oldClaudeBin = process.env.CLAUDE_BIN;
const oldAgentHome = process.env.OD_AGENT_HOME;
const emptyHome = await fsp.mkdtemp(path.join(os.tmpdir(), 'od-missing-claude-home-'));
process.env.PATH = '';
delete process.env.CLAUDE_BIN;
process.env.OD_AGENT_HOME = emptyHome;
try {
const result = await testAgentConnection({ agentId: 'claude' });
expect(result.ok).toBe(false);
@ -2992,6 +3086,11 @@ process.stdin.on('end', () => {
expect(['binary_resolution', 'spawn']).toContain(result.diagnostics?.phase);
} finally {
if (oldPath === undefined) delete process.env.PATH;
else process.env.PATH = oldPath;
if (oldClaudeBin === undefined) delete process.env.CLAUDE_BIN;
else process.env.CLAUDE_BIN = oldClaudeBin;
if (oldAgentHome === undefined) delete process.env.OD_AGENT_HOME;
else process.env.OD_AGENT_HOME = oldAgentHome;
await fsp.rm(emptyHome, { recursive: true, force: true });
}
});
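
The save, override, and restore pattern repeated across these tests for `PATH`, `CLAUDE_BIN`, and `OD_AGENT_HOME` could be factored into one helper. The sketch below is an assumption, not code from this change: `withEnv` and `Env` are hypothetical names, and the environment is passed in as a plain record so the example runs without Node typings. A caller would pass `process.env`.

```typescript
// Hypothetical helper: apply overrides to an env record, run, then restore.
type Env = Record<string, string | undefined>;

function withEnv<T>(env: Env, overrides: Env, run: () => T): T {
  const saved: Env = {};
  for (const key of Object.keys(overrides)) {
    saved[key] = env[key];
    const next = overrides[key];
    // An override of `undefined` means "unset this variable".
    if (next === undefined) delete env[key];
    else env[key] = next;
  }
  try {
    return run();
  } finally {
    for (const key of Object.keys(saved)) {
      const prev = saved[key];
      // Restore by deleting, never by assigning `undefined`.
      if (prev === undefined) delete env[key];
      else env[key] = prev;
    }
  }
}
```

Restoring with `delete` matters on `process.env`, because Node converts assigned values to strings, so assigning `undefined` leaves the literal text `undefined` behind. This version is synchronous; for an async callback the helper would need to be `async` and `await run()` inside the `try`, as the helpers in this file do.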


@ -1,8 +1,11 @@
import {
chmodSync,
existsSync,
lstatSync,
mkdirSync,
mkdtempSync,
readFileSync,
statSync,
symlinkSync,
writeFileSync,
} from 'node:fs';
@ -251,4 +254,111 @@ describe('stageActiveSkill', () => {
expect(result.reason).toContain(expectedReason);
},
);
it('falls back to a dereferenced stream copy when the native copy fails with EPERM', async () => {
// Repro for the Docker/ZFS report: `fs.cp` -> copy_file_range(2) is
// rejected with EPERM across the image-layer -> bind-mount boundary
// and Node doesn't fall back. The real errno only appears on those
// mounts, so inject a copy that rejects with a synthetic EPERM.
const fs = fresh();
const cwd = path.join(fs, 'project');
const sourceDir = writeSampleSkill(path.join(fs, 'skills'), 'blog-post');
// A symlinked side file proves the fallback still dereferences, so the
// staged copy stays a self-contained write barrier.
symlinkSync(
path.join(sourceDir, 'assets', 'template.html'),
path.join(sourceDir, 'assets', 'linked.html'),
);
mkdirSync(cwd);
const messages: string[] = [];
const eperm = Object.assign(
new Error('EPERM: operation not permitted, copyfile'),
{ code: 'EPERM' },
);
const result = await stageActiveSkill(
cwd,
'blog-post',
sourceDir,
(m) => messages.push(m),
() => Promise.reject(eperm),
);
expect(result.staged).toBe(true);
const staged = result.stagedPath!;
expect(readFileSync(path.join(staged, 'SKILL.md'), 'utf8')).toContain(
'original SKILL',
);
const linked = path.join(staged, 'assets', 'linked.html');
expect(lstatSync(linked).isSymbolicLink()).toBe(false);
expect(lstatSync(linked).isFile()).toBe(true);
expect(readFileSync(linked, 'utf8')).toContain('original');
expect(messages.some((m) => m.includes('stream copy'))).toBe(true);
});
it('preserves the source exec bit through the stream-copy fallback (EPERM path)', async () => {
// Regression for PR #3249 review: skills shell out to staged helper
// scripts, so the fallback copy must keep the source's exec bit. A
// plain stream copy would reset it to the default 0644 and the agent
// would hit EACCES on the exact cross-fs path this fallback repairs.
const fs = fresh();
const cwd = path.join(fs, 'project');
const sourceDir = writeSampleSkill(path.join(fs, 'skills'), 'blog-post');
const script = path.join(sourceDir, 'scripts', 'run.sh');
mkdirSync(path.dirname(script));
writeFileSync(script, '#!/usr/bin/env bash\necho hi\n');
chmodSync(script, 0o755);
mkdirSync(cwd);
const eperm = Object.assign(new Error('EPERM: operation not permitted'), {
code: 'EPERM',
});
const result = await stageActiveSkill(
cwd,
'blog-post',
sourceDir,
() => {},
() => Promise.reject(eperm),
);
expect(result.staged).toBe(true);
const stagedScript = path.join(result.stagedPath!, 'scripts', 'run.sh');
// Exec bit survives on the helper script…
expect(statSync(stagedScript).mode & 0o111).not.toBe(0);
// …while a non-executable sibling is not made executable.
expect(statSync(path.join(result.stagedPath!, 'SKILL.md')).mode & 0o111).toBe(
0,
);
});
it('degrades to the absolute-path fallback on a non-recoverable copy error', async () => {
const fs = fresh();
const cwd = path.join(fs, 'project');
const sourceDir = writeSampleSkill(path.join(fs, 'skills'), 'blog-post');
mkdirSync(cwd);
const enospc = Object.assign(
new Error('ENOSPC: no space left on device'),
{ code: 'ENOSPC' },
);
const messages: string[] = [];
const result = await stageActiveSkill(
cwd,
'blog-post',
sourceDir,
(m) => messages.push(m),
() => Promise.reject(enospc),
);
// Not a cross-filesystem rejection — propagates to the existing
// degrade path instead of attempting the stream-copy fallback.
expect(result.staged).toBe(false);
expect(result.reason).toMatch(/ENOSPC/);
expect(
existsSync(path.join(cwd, SKILLS_CWD_ALIAS, 'blog-post', 'SKILL.md')),
).toBe(false);
expect(messages.some((m) => m.includes('stream copy'))).toBe(false);
});
});
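
The three cases above imply a small decision rule: an `EPERM` from the native copy triggers the stream-copy fallback, any other error takes the existing degrade path, and the fallback reapplies the source permission bits. A minimal sketch of that rule follows. `chooseCopyStrategy`, `modeToPreserve`, and `isExecutable` are hypothetical names, and treating `EPERM` as the only recoverable code is an assumption drawn from these tests rather than from the `stageActiveSkill` source.

```typescript
// Hypothetical sketch of the fallback decision exercised by the tests.
type CopyStrategy = 'stream-copy' | 'degrade';

// Assumption: only EPERM is treated as the cross-filesystem rejection.
const RECOVERABLE_COPY_CODES = new Set(['EPERM']);

function chooseCopyStrategy(err: unknown): CopyStrategy {
  const code =
    typeof err === 'object' && err !== null && 'code' in err
      ? String((err as { code: unknown }).code)
      : '';
  return RECOVERABLE_COPY_CODES.has(code) ? 'stream-copy' : 'degrade';
}

// Permission bits to reapply to the copied file so exec bits survive.
function modeToPreserve(sourceMode: number): number {
  return sourceMode & 0o777;
}

// True when any of the owner, group, or other exec bits is set.
function isExecutable(mode: number): boolean {
  return (mode & 0o111) !== 0;
}
```

A stream copy creates the destination with default permissions, which is why the mode has to be carried over explicitly after the bytes are written.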


@ -38,6 +38,8 @@ describe('GET /api/daemon/status', () => {
version: unknown;
bindHost: unknown;
port: unknown;
sandboxMode: boolean;
sandbox: { enabled: boolean };
pid: unknown;
installedPlugins: unknown;
shuttingDown: boolean;
@ -49,11 +51,32 @@ describe('GET /api/daemon/status', () => {
expect(typeof body.port).toBe('number');
expect(typeof body.pid).toBe('number');
expect(typeof body.installedPlugins).toBe('number');
expect(body.sandboxMode).toBe(false);
expect(body.sandbox).toEqual({ enabled: false });
expect(body.shuttingDown).toBe(false);
expect(body).not.toHaveProperty('namespace');
});
});
describe('GET /api/ready', () => {
it('returns a readiness snapshot for headless launchers', async () => {
const resp = await fetch(`${baseUrl}/api/ready`);
expect(resp.status).toBe(200);
const body = (await resp.json()) as {
ok: boolean;
ready: boolean;
version: unknown;
};
expect(body.ok).toBe(true);
expect(body.ready).toBe(true);
expect(typeof body.version === 'string' || typeof body.version === 'object').toBe(true);
expect(body).not.toHaveProperty('dataDir');
expect(body).not.toHaveProperty('sandboxMode');
expect(body).not.toHaveProperty('sandbox');
});
});
describe('POST /api/daemon/shutdown', () => {
it('only accepts requests from local-daemon-allowed origins', async () => {
// Without the local-daemon header, the route is rejected. The


@ -127,4 +127,46 @@ describe('diagnostics export handler — packaged (runtime) layout', () => {
await rm(root, { recursive: true, force: true });
}
});
it('reports missing packaged log files under logical log paths without duplicating runtime segments', async () => {
const root = join(tmpdir(), `od-diag-missing-${randomUUID()}`);
const namespaceRoot = join(root, 'namespaces', 'release-beta');
const daemonLogPath = join(namespaceRoot, 'logs', APP_KEYS.DAEMON, 'latest.log');
try {
await mkdir(dirname(daemonLogPath), { recursive: true });
await writeFile(daemonLogPath, 'daemon ok\n', 'utf8');
const runtime: SidecarRuntimeContext<SidecarStamp> = {
app: APP_KEYS.DAEMON,
base: join(namespaceRoot, 'runtime'),
ipc: '/tmp/od-diag-missing.sock',
mode: SIDECAR_MODES.RUNTIME,
namespace: 'release-beta',
source: SIDECAR_SOURCES.PACKAGED,
};
const handler = createDiagnosticsExportHandler({ runtime, projectRoot: '/tmp/test-project' });
const res = mockResponse();
await handler({} as never, res as never, () => undefined);
expect(res.capturedStatus).toBe(200);
const zip = await JSZip.loadAsync(res.capturedPayload!);
const manifest = JSON.parse(await zip.file('summary/manifest.json')!.async('string')) as {
files: Array<{ name: string; bytes?: number; error?: string }>;
};
const fileNames = manifest.files.map((file) => file.name);
expect(fileNames).toContain('logs/daemon/latest.log');
expect(fileNames).toContain('logs/web/latest.log');
expect(fileNames).toContain('logs/desktop/latest.log');
expect(fileNames.some((name) => name.includes('runtime/release-beta/logs'))).toBe(false);
const webLog = manifest.files.find((file) => file.name === 'logs/web/latest.log');
const desktopLog = manifest.files.find((file) => file.name === 'logs/desktop/latest.log');
expect(webLog?.error).toBeTruthy();
expect(desktopLog?.error).toBeTruthy();
} finally {
await rm(root, { recursive: true, force: true });
}
});
});


@ -4,7 +4,24 @@ import { tmpdir } from 'node:os';
import path from 'node:path';
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { detectEntryFile, listFiles, resolveProjectDir } from '../src/projects.js';
import {
assertSandboxProjectRootAvailable,
detectEntryFile,
listFiles,
resolveProjectDir,
SandboxImportedProjectError,
} from '../src/projects.js';
function withSandboxMode<T>(run: () => T): T {
const previous = process.env.OD_SANDBOX_MODE;
process.env.OD_SANDBOX_MODE = '1';
try {
return run();
} finally {
if (previous == null) delete process.env.OD_SANDBOX_MODE;
else process.env.OD_SANDBOX_MODE = previous;
}
}
describe('resolveProjectDir', () => {
const projectsRoot = '/var/od/projects';
@ -50,6 +67,22 @@ describe('resolveProjectDir', () => {
}),
).not.toThrow();
});
it('rejects metadata.baseDir in sandbox mode before resolving a project file root', () => {
withSandboxMode(() => {
const baseDir = '/Users/me/projects/site';
expect(
() => resolveProjectDir(projectsRoot, projectId, { kind: 'prototype', baseDir }),
).toThrowError(SandboxImportedProjectError);
expect(() =>
assertSandboxProjectRootAvailable({ kind: 'prototype', baseDir }),
).toThrowError(SandboxImportedProjectError);
expect(() => resolveProjectDir(projectsRoot, '../escape', {
kind: 'prototype',
baseDir,
})).toThrowError();
});
});
});
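
The sandbox rule tested above, where a project carrying `metadata.baseDir` is refused before any file root is resolved, can be sketched like this. The names `ImportedProjectInSandboxError`, `ProjectMetadata`, and `assertProjectRootAllowed` are hypothetical stand-ins for the real `SandboxImportedProjectError` and `assertSandboxProjectRootAvailable`, whose bodies are not shown here. The sandbox flag is passed as a boolean because how `OD_SANDBOX_MODE` is parsed is not visible in this diff.

```typescript
// Hypothetical sketch of a sandbox guard for imported-folder projects.
class ImportedProjectInSandboxError extends Error {
  constructor(baseDir: string) {
    super(`imported-folder projects are unavailable in sandbox mode (baseDir: ${baseDir})`);
    this.name = 'ImportedProjectInSandboxError';
  }
}

type ProjectMetadata = { kind: string; baseDir?: string };

function assertProjectRootAllowed(metadata: ProjectMetadata, sandboxMode: boolean): void {
  // Only projects rooted at a host folder are rejected; managed projects pass.
  if (sandboxMode && typeof metadata.baseDir === 'string' && metadata.baseDir.length > 0) {
    throw new ImportedProjectInSandboxError(metadata.baseDir);
  }
}
```

Running the check before path resolution means a sandboxed daemon never touches the host path, rather than resolving it first and rejecting afterwards.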
describe('detectEntryFile', () => {


@ -45,6 +45,37 @@ describe('POST /api/import/folder', () => {
});
}
async function withSandboxMode<T>(run: () => Promise<T>): Promise<T> {
const previous = process.env.OD_SANDBOX_MODE;
process.env.OD_SANDBOX_MODE = '1';
try {
return await run();
} finally {
if (previous == null) delete process.env.OD_SANDBOX_MODE;
else process.env.OD_SANDBOX_MODE = previous;
}
}
async function waitForRunStatus(
runId: string,
): Promise<{ status: string; error?: string | null; errorCode?: string | null }> {
let lastStatus = 'unknown';
for (let attempt = 0; attempt < 200; attempt += 1) {
const statusResponse = await fetch(`${baseUrl}/api/runs/${runId}`);
const statusBody = (await statusResponse.json()) as {
status: string;
error?: string | null;
errorCode?: string | null;
};
lastStatus = statusBody.status;
if (statusBody.status !== 'queued' && statusBody.status !== 'running') {
return statusBody;
}
await new Promise((resolve) => setTimeout(resolve, 25));
}
throw new Error(`run did not reach a terminal status; last status: ${lastStatus}`);
}
it('creates a project rooted at the submitted folder', async () => {
const folder = makeFolder();
await writeFile(path.join(folder, 'index.html'), '<!doctype html>');
@ -62,6 +93,80 @@ describe('POST /api/import/folder', () => {
expect(body.entryFile).toBe('index.html');
});
it('rejects folder imports in sandbox mode', async () => {
await withSandboxMode(async () => {
const folder = makeFolder();
await writeFile(path.join(folder, 'index.html'), '<!doctype html>');
const resp = await importFolder({ baseDir: folder });
expect(resp.status).toBe(400);
const body = (await resp.json()) as { error?: { message?: string } };
expect(body.error?.message).toMatch(/OD_SANDBOX_MODE/i);
});
});
it('fails sandbox runs for imported folders instead of using an empty managed project', async () => {
const folder = makeFolder();
await writeFile(path.join(folder, 'index.html'), '<!doctype html>');
const importResp = await importFolder({ baseDir: folder });
expect(importResp.status).toBe(200);
const { project } = (await importResp.json()) as { project: { id: string } };
await withSandboxMode(async () => {
const runResp = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'claude',
projectId: project.id,
message: 'Inspect the imported project.',
}),
});
expect(runResp.status).toBe(202);
const { runId } = (await runResp.json()) as { runId: string };
const status = await waitForRunStatus(runId);
expect(status.status).toBe('failed');
expect(status.errorCode).toBe('BAD_REQUEST');
expect(status.error).toMatch(/imported-folder projects.*OD_SANDBOX_MODE/i);
});
});
it('still opens an imported-folder project record in sandbox mode', async () => {
const folder = makeFolder();
await writeFile(path.join(folder, 'index.html'), '<!doctype html>');
const importResp = await importFolder({ baseDir: folder });
expect(importResp.status).toBe(200);
const { project } = (await importResp.json()) as { project: { id: string } };
await withSandboxMode(async () => {
const resp = await fetch(`${baseUrl}/api/projects/${project.id}`);
expect(resp.status).toBe(200);
const body = (await resp.json()) as {
project?: { id?: string; metadata?: { baseDir?: string } };
};
expect(body.project?.id).toBe(project.id);
expect(body.project?.metadata?.baseDir).toBeTruthy();
});
});
it('rejects imported-folder project file listing in sandbox mode', async () => {
const folder = makeFolder();
await writeFile(path.join(folder, 'index.html'), '<!doctype html>');
const importResp = await importFolder({ baseDir: folder });
expect(importResp.status).toBe(200);
const { project } = (await importResp.json()) as { project: { id: string } };
await withSandboxMode(async () => {
const resp = await fetch(`${baseUrl}/api/projects/${project.id}/files`);
expect(resp.status).toBe(400);
const body = (await resp.json()) as { error?: { message?: string } };
expect(body.error?.message).toMatch(/imported-folder projects.*OD_SANDBOX_MODE/i);
});
});
it('auto-detects the entry file when present', async () => {
const folder = makeFolder();
await writeFile(path.join(folder, 'index.html'), '');


@ -0,0 +1,194 @@
import type http from 'node:http';
import { randomUUID } from 'node:crypto';
import { chmod, mkdtemp, rm, writeFile } from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';
import { afterEach, describe, expect, it } from 'vitest';
import { startServer } from '../src/server.js';
type StartedServer = {
url: string;
server: http.Server;
shutdown?: () => Promise<void> | void;
};
describe('POST /api/runs headless fallbacks', () => {
let started: StartedServer | null = null;
const oldPath = process.env.PATH;
const oldAgentHome = process.env.OD_AGENT_HOME;
afterEach(async () => {
await Promise.resolve(started?.shutdown?.());
if (started?.server) {
await new Promise<void>((resolve) => started?.server.close(() => resolve()));
}
started = null;
if (oldPath === undefined) delete process.env.PATH;
else process.env.PATH = oldPath;
if (oldAgentHome === undefined) delete process.env.OD_AGENT_HOME;
else process.env.OD_AGENT_HOME = oldAgentHome;
});
it('binds omitted conversationId to the seeded project conversation', async () => {
started = await startTestServer();
const { projectId, conversationId: seededConversationId } = await createProject(
started.url,
'Headless default conversation project',
);
await delay(5);
const newerConversationId = await createConversation(started.url, projectId, 'Newer user chat');
const conversationsResponse = await fetch(
`${started.url}/api/projects/${encodeURIComponent(projectId)}/conversations`,
);
expect(conversationsResponse.status).toBe(200);
const conversationsBody = await conversationsResponse.json() as {
conversations: Array<{ id: string }>;
};
expect(conversationsBody.conversations[0]?.id).toBe(newerConversationId);
const runResponse = await fetch(`${started.url}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: `missing-agent-${randomUUID()}`,
projectId,
message: 'Headless prompt',
}),
});
expect(runResponse.status).toBe(202);
const runBody = await runResponse.json() as { conversationId: string | null };
expect(runBody.conversationId).toBe(seededConversationId);
});
it('falls back past a stale saved agent to the first detected available runtime', async () => {
started = await startTestServer();
const binDir = await mkdtemp(path.join(os.tmpdir(), 'od-headless-run-bin-'));
const emptyAgentHome = await mkdtemp(path.join(os.tmpdir(), 'od-headless-run-home-'));
const priorConfig = await readAppConfigFromServer(started.url);
try {
const opencodeBin = await writeFakeOpencode(binDir);
process.env.PATH = '';
process.env.OD_AGENT_HOME = emptyAgentHome;
const configResponse = await fetch(`${started.url}/api/app-config`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'claude',
agentCliEnv: {
claude: { CLAUDE_BIN: path.join(binDir, 'missing-claude') },
opencode: { OPENCODE_BIN: opencodeBin },
},
}),
});
expect(configResponse.status).toBe(200);
const { projectId } = await createProject(started.url, 'Headless stale agent project');
const runResponse = await fetch(`${started.url}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
projectId,
message: 'Headless prompt',
}),
});
expect(runResponse.status).toBe(202);
const runBody = await runResponse.json() as { runId: string };
const statusResponse = await fetch(
`${started.url}/api/runs/${encodeURIComponent(runBody.runId)}`,
);
expect(statusResponse.status).toBe(200);
const statusBody = await statusResponse.json() as { agentId: string | null };
expect(statusBody.agentId).toBe('opencode');
} finally {
await restoreAppConfig(started.url, priorConfig);
await rm(binDir, { recursive: true, force: true });
await rm(emptyAgentHome, { recursive: true, force: true });
}
});
});
async function startTestServer(): Promise<StartedServer> {
return await startServer({ port: 0, returnServer: true }) as StartedServer;
}
async function createProject(url: string, name: string): Promise<{
projectId: string;
conversationId: string;
}> {
const projectId = `project_${randomUUID()}`;
const response = await fetch(`${url}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id: projectId,
name,
metadata: { kind: 'prototype' },
}),
});
expect(response.status).toBe(200);
const body = await response.json() as { conversationId: string };
return { projectId, conversationId: body.conversationId };
}
async function createConversation(
url: string,
projectId: string,
title: string,
): Promise<string> {
const response = await fetch(`${url}/api/projects/${encodeURIComponent(projectId)}/conversations`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ title }),
});
expect(response.status).toBe(200);
const body = await response.json() as { conversation: { id: string } };
return body.conversation.id;
}
async function readAppConfigFromServer(url: string): Promise<Record<string, unknown>> {
const response = await fetch(`${url}/api/app-config`);
expect(response.status).toBe(200);
const body = await response.json() as { config?: Record<string, unknown> };
return body.config ?? {};
}
async function restoreAppConfig(url: string, config: Record<string, unknown>): Promise<void> {
await fetch(`${url}/api/app-config`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: Object.hasOwn(config, 'agentId') ? config.agentId : null,
agentCliEnv: Object.hasOwn(config, 'agentCliEnv') ? config.agentCliEnv : null,
}),
});
}
async function writeFakeOpencode(dir: string): Promise<string> {
const bin = path.join(dir, 'opencode');
await writeFile(bin, `#!/usr/bin/env node
if (process.argv.includes('--version')) {
console.log('opencode 0.0.0');
process.exit(0);
}
if (process.argv[2] === 'models') {
console.log('test/model');
process.exit(0);
}
if (process.argv[2] === 'run') {
process.stdin.resume();
process.stdin.on('end', () => process.exit(0));
setTimeout(() => process.exit(0), 50);
} else {
process.exit(0);
}
`, 'utf8');
await chmod(bin, 0o755);
return bin;
}
function delay(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}


@ -21,6 +21,8 @@ import { startServer } from '../src/server.js';
async function withFakeClaude<T>(run: () => Promise<T>): Promise<T> {
const dir = await fsp.mkdtemp(join(tmpdir(), 'od-mcp-spawn-bin-'));
const oldPath = process.env.PATH;
const oldClaudeBin = process.env.CLAUDE_BIN;
const oldAgentHome = process.env.OD_AGENT_HOME;
// Fake `claude` that prints stream-json the daemon understands and exits 0.
// The single result frame is enough to drive the run to `succeeded`.
const script = `
@ -50,9 +52,15 @@ process.exit(0);
await fsp.chmod(bin, 0o755);
}
process.env.PATH = `${dir}${delimiter}${oldPath ?? ''}`;
delete process.env.CLAUDE_BIN;
process.env.OD_AGENT_HOME = dir;
return await run();
} finally {
if (oldPath === undefined) delete process.env.PATH;
else process.env.PATH = oldPath;
if (oldClaudeBin === undefined) delete process.env.CLAUDE_BIN;
else process.env.CLAUDE_BIN = oldClaudeBin;
if (oldAgentHome === undefined) delete process.env.OD_AGENT_HOME;
else process.env.OD_AGENT_HOME = oldAgentHome;
await fsp.rm(dir, { recursive: true, force: true });
}
}
@ -61,13 +69,13 @@ async function waitForRunStatus(
baseUrl: string,
runId: string,
): Promise<{ status: string }> {
for (let attempt = 0; attempt < 60; attempt += 1) {
for (let attempt = 0; attempt < 200; attempt += 1) {
const r = await fetch(`${baseUrl}/api/runs/${runId}`);
const body = (await r.json()) as { status: string };
if (body.status !== 'queued' && body.status !== 'running') return body;
await new Promise((resolve) => setTimeout(resolve, 25));
}
throw new Error('run did not finish');
throw new Error('run did not finish within 5s of polling');
}
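
The "within 5s" in the new error message follows from the loop bounds: 200 attempts with a 25 ms sleep give 5000 ms of waiting, up from 60 × 25 = 1500 ms before. A generic form of the loop might look like the sketch below, where `pollBudgetMs` and `pollUntil` are hypothetical names that do not appear in this change.

```typescript
// Hypothetical sketch: bounded polling with a computed sleep budget.
function pollBudgetMs(maxAttempts: number, intervalMs: number): number {
  return maxAttempts * intervalMs;
}

async function pollUntil<T>(
  probe: () => Promise<T | undefined>,
  maxAttempts: number,
  intervalMs: number,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const value = await probe();
    // `undefined` means "not finished yet"; anything else ends the loop.
    if (value !== undefined) return value;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`gave up after ${pollBudgetMs(maxAttempts, intervalMs)}ms of polling`);
}
```

The budget counts sleep time only, so the real wall-clock limit is somewhat longer once the duration of each probe request is added.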
describe('spawn writes external MCP config for Claude Code', () => {


@ -21,7 +21,7 @@ const OPENAI_ENV_KEYS = [
'AZURE_OPENAI_API_KEY',
];
describe('media-config OpenAI OAuth fallback', () => {
describe('media-config OpenAI auth-file fallback', () => {
let homeDir: string;
let projectRoot: string;
const originalHome = process.env.HOME;
@ -30,6 +30,7 @@ describe('media-config OpenAI OAuth fallback', () => {
);
const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
const originalDataDir = process.env.OD_DATA_DIR;
const originalSandboxMode = process.env.OD_SANDBOX_MODE;
let homedirSpy: ReturnType<typeof vi.spyOn>;
beforeEach(async () => {
@ -42,6 +43,7 @@ describe('media-config OpenAI OAuth fallback', () => {
}
delete process.env.OD_MEDIA_CONFIG_DIR;
delete process.env.OD_DATA_DIR;
delete process.env.OD_SANDBOX_MODE;
});
afterEach(async () => {
@ -67,6 +69,11 @@ describe('media-config OpenAI OAuth fallback', () => {
} else {
process.env.OD_DATA_DIR = originalDataDir;
}
if (originalSandboxMode == null) {
delete process.env.OD_SANDBOX_MODE;
} else {
process.env.OD_SANDBOX_MODE = originalSandboxMode;
}
homedirSpy.mockRestore();
await rm(homeDir, { recursive: true, force: true });
await rm(projectRoot, { recursive: true, force: true });
@ -88,7 +95,7 @@ describe('media-config OpenAI OAuth fallback', () => {
return (masked.providers as Record<string, unknown>).openai;
}
it('uses Hermes openai-codex OAuth when no API key is configured', async () => {
it('ignores Hermes openai-codex OAuth for media generation', async () => {
await writeHomeJson('.hermes/auth.json', {
providers: {
'openai-codex': {
@ -100,15 +107,15 @@ describe('media-config OpenAI OAuth fallback', () => {
const resolved = await resolveProviderConfig(projectRoot, 'openai');
const masked = await readMaskedConfig(projectRoot);
expect(resolved.apiKey).toBe('hermes-oauth-token');
expect(resolved.apiKey).toBe('');
expect(openaiProvider(masked)).toMatchObject({
configured: true,
source: 'oauth-hermes',
configured: false,
source: 'unset',
apiKeyTail: '',
});
});
it('uses Codex OAuth when Hermes has no OpenAI Codex credential', async () => {
it('ignores Codex OAuth tokens for media generation', async () => {
await writeHomeJson('.codex/auth.json', {
tokens: { access_token: 'codex-oauth-token' },
});
@@ -116,15 +123,56 @@ describe('media-config OpenAI OAuth fallback', () => {
const resolved = await resolveProviderConfig(projectRoot, 'openai');
const masked = await readMaskedConfig(projectRoot);
expect(resolved.apiKey).toBe('codex-oauth-token');
expect(resolved.apiKey).toBe('');
expect(openaiProvider(masked)).toMatchObject({
configured: true,
source: 'oauth-codex',
configured: false,
source: 'unset',
apiKeyTail: '',
});
});
it('keeps stored provider config ahead of OAuth fallbacks', async () => {
it('does not read host OpenAI auth files in sandbox mode', async () => {
process.env.OD_SANDBOX_MODE = '1';
await writeHomeJson('.hermes/auth.json', {
providers: {
'openai-codex': {
tokens: { access_token: 'hermes-oauth-token' },
},
},
});
await writeHomeJson('.codex/auth.json', {
tokens: { access_token: 'codex-oauth-token' },
OPENAI_API_KEY: 'host-codex-api-key',
});
const resolved = await resolveProviderConfig(projectRoot, 'openai');
const masked = await readMaskedConfig(projectRoot);
expect(resolved.apiKey).toBe('');
expect(openaiProvider(masked)).toMatchObject({
configured: false,
source: 'unset',
});
});
it('uses explicit OPENAI_API_KEY from Codex auth files', async () => {
await writeHomeJson('.codex/auth.json', {
tokens: { access_token: 'codex-oauth-token' },
OPENAI_API_KEY: 'codex-api-key',
});
const resolved = await resolveProviderConfig(projectRoot, 'openai');
const masked = await readMaskedConfig(projectRoot);
expect(resolved.apiKey).toBe('codex-api-key');
expect(openaiProvider(masked)).toMatchObject({
configured: true,
source: 'codex-auth',
apiKeyTail: '',
});
});
it('keeps stored provider config ahead of auth-file fallbacks', async () => {
await writeHomeJson('.hermes/auth.json', {
providers: {
'openai-codex': {


@@ -199,6 +199,113 @@ describe('OpenAI-compatible media providers', () => {
expect(fetchMock).toHaveBeenCalledTimes(1);
});
it('rewrites custom-image text-only requests back to /v1/images/generations when configured with an edits URL', async () => {
await writeConfig({
providers: {
'custom-image': {
apiKey: 'proxy-test-key',
baseUrl: 'https://proxy.example.test/v1/images/edits',
model: 'acme-image-model',
},
},
});
const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
expect(String(input)).toBe('https://proxy.example.test/v1/images/generations');
expect(init?.method).toBe('POST');
expect(init?.headers).toMatchObject({
authorization: 'Bearer proxy-test-key',
'content-type': 'application/json',
});
expect(JSON.parse(String(init?.body))).toEqual({
prompt: 'A matte product shot on a neutral backdrop',
model: 'acme-image-model',
n: 1,
size: '1024x1024',
});
return new Response(JSON.stringify({
data: [{ b64_json: PNG_BASE64 }],
}), {
status: 200,
headers: { 'content-type': 'application/json' },
});
});
vi.stubGlobal('fetch', fetchMock);
const result = await generateMedia({
projectRoot,
projectsRoot,
projectId: 'project-1',
surface: 'image',
model: 'custom-image',
prompt: 'A matte product shot on a neutral backdrop',
output: 'custom-from-edits-base.png',
});
expect(result.providerId).toBe('custom-image');
expect(result.providerNote).toContain('custom-image/acme-image-model');
expect(fetchMock).toHaveBeenCalledTimes(1);
});
it('routes custom-image reference-image requests through /v1/images/edits', async () => {
await writeConfig({
providers: {
'custom-image': {
apiKey: 'proxy-test-key',
baseUrl: 'https://proxy.example.test/v1',
model: 'acme-image-edit-model',
},
},
});
const projectDir = path.join(projectsRoot, 'project-1');
await mkdir(projectDir, { recursive: true });
await writeFile(
path.join(projectDir, 'reference.png'),
Buffer.from(PNG_BASE64, 'base64'),
);
const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
expect(String(input)).toBe('https://proxy.example.test/v1/images/edits');
expect(init?.method).toBe('POST');
expect(init?.headers).toMatchObject({
authorization: 'Bearer proxy-test-key',
'content-type': 'application/json',
});
const body = JSON.parse(String(init?.body));
expect(body.prompt).toBe('Turn this reference into a blueprint-style UI illustration');
expect(body.model).toBe('acme-image-edit-model');
expect(body.n).toBe(1);
expect(body.size).toBe('1024x1024');
expect(body.response_format).toBe('b64_json');
expect(body.images).toHaveLength(1);
expect(body.images[0]?.image_url).toMatch(/^data:image\/png;base64,/);
return new Response(JSON.stringify({
data: [{ b64_json: PNG_BASE64 }],
}), {
status: 200,
headers: { 'content-type': 'application/json' },
});
});
vi.stubGlobal('fetch', fetchMock);
const result = await generateMedia({
projectRoot,
projectsRoot,
projectId: 'project-1',
surface: 'image',
model: 'custom-image',
prompt: 'Turn this reference into a blueprint-style UI illustration',
image: 'reference.png',
output: 'edited.png',
});
expect(result.providerId).toBe('custom-image');
expect(result.providerNote).toContain('custom-image/acme-image-edit-model');
expect(fetchMock).toHaveBeenCalledTimes(1);
const bytes = await readFile(path.join(projectDir, 'edited.png'));
expect(bytes.length).toBeGreaterThan(0);
});
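The two custom-image tests above pin down a routing rule: text-only requests go to `/images/generations` even when the configured base URL already ends in `/images/edits`, while requests that carry a reference image go to `/images/edits`. A minimal sketch of that rule follows. The helper name `resolveImageEndpoint` and its normalization steps are assumptions for illustration; the daemon's actual implementation is not part of this diff.

```typescript
// Hypothetical helper, not the daemon's actual code.
// Strips a trailing /images/generations or /images/edits from the configured
// base URL, then appends the endpoint that matches the request shape.
function resolveImageEndpoint(baseUrl: string, hasReferenceImage: boolean): string {
  const root = baseUrl
    .replace(/\/+$/, '') // drop trailing slashes
    .replace(/\/images\/(generations|edits)$/, ''); // drop an endpoint suffix, if any
  return `${root}/images/${hasReferenceImage ? 'edits' : 'generations'}`;
}
```

Normalizing the base URL first means a proxy configured with either a bare `/v1` root or a full endpoint URL resolves to the same pair of endpoints.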
it('renders ImageRouter images through the OpenAI-compatible JSON endpoint', async () => {
process.env.OD_IMAGEROUTER_API_KEY = 'ir-test-key';


@@ -1,11 +1,12 @@
import type http from 'node:http';
import { chmod, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
import { chmod, mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
import { randomUUID } from 'node:crypto';
import os from 'node:os';
import path from 'node:path';
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { startServer } from '../src/server.js';
import { memoryDir, writeMemoryConfig } from '../src/memory.js';
type FakeMediaEndpoint = 'tool' | 'legacy';
@@ -19,6 +20,7 @@ describe('run-scoped media policy routes', () => {
let binDir: string;
let oldPath: string | undefined;
let oldCapture: string | undefined;
let oldMemoryConfigRaw: string | null = null;
let server: http.Server | null = null;
let shutdown: (() => Promise<void> | void) | undefined;
@@ -28,6 +30,12 @@ describe('run-scoped media policy routes', () => {
oldPath = process.env.PATH;
oldCapture = process.env.OD_CAPTURE_MEDIA_RESPONSE;
process.env.PATH = `${binDir}${path.delimiter}${oldPath ?? ''}`;
const memoryConfig = memoryConfigPath();
oldMemoryConfigRaw = await readFile(memoryConfig, 'utf8').catch(() => null);
await writeMemoryConfig(process.env.OD_DATA_DIR!, {
chatExtractionEnabled: false,
extraction: null,
});
});
afterEach(async () => {
@@ -41,6 +49,14 @@ describe('run-scoped media policy routes', () => {
else process.env.PATH = oldPath;
if (oldCapture === undefined) delete process.env.OD_CAPTURE_MEDIA_RESPONSE;
else process.env.OD_CAPTURE_MEDIA_RESPONSE = oldCapture;
const memoryConfig = memoryConfigPath();
if (oldMemoryConfigRaw === null) {
await rm(memoryConfig, { force: true });
} else {
await mkdir(path.dirname(memoryConfig), { recursive: true });
await writeFile(memoryConfig, oldMemoryConfigRaw);
}
oldMemoryConfigRaw = null;
await rm(tempDir, { recursive: true, force: true });
await rm(binDir, { recursive: true, force: true });
});
@@ -468,6 +484,10 @@ describe('run-scoped media policy routes', () => {
};
}
function memoryConfigPath(): string {
return path.join(memoryDir(process.env.OD_DATA_DIR!), '.config.json');
}
async function writeFakeAgent(
capturePath: string,
requestBody: unknown,


@@ -1023,7 +1023,7 @@ process.stdout.write(JSON.stringify({
}
});
it('runs OpenCode Local CLI with a message argument and attached prompt file', async () => {
it('runs OpenCode Local CLI memory extraction with the prompt on stdin', async () => {
await writeMemoryConfig(dataDir, { extraction: null });
const tempDir = await fsp.mkdtemp(path.join(tmpdir(), 'od-opencode-memory-'));
const binPath = path.join(tempDir, 'opencode-cli');
@@ -1031,16 +1031,33 @@ process.stdout.write(JSON.stringify({
const previousPath = process.env.PATH;
const previousCapture = process.env.OD_MEMORY_OPENCODE_ARGS_OUT;
// Model the real `opencode run` arg parser: `-f, --file` is a yargs
// *array* option, so it greedily swallows every following non-flag
// token as a file path. Any captured path that doesn't exist makes the
// real CLI exit 1 with "File not found: <token>" — which is exactly how
// a trailing positional message after `--file` crashed extraction. The
// supported one-shot shape is bare `run` with the prompt on stdin.
await fsp.writeFile(
binPath,
`#!/usr/bin/env node
const fs = require('node:fs');
const args = process.argv.slice(2);
const fileIndex = args.indexOf('--file');
const attachedFile = fileIndex >= 0 ? args[fileIndex + 1] : null;
const prompt = attachedFile ? fs.readFileSync(attachedFile, 'utf8') : '';
const stdin = fs.readFileSync(0, 'utf8');
fs.writeFileSync(process.env.OD_MEMORY_OPENCODE_ARGS_OUT, JSON.stringify({ args, attachedFile, prompt, stdin }));
const files = [];
const fileFlag = args.findIndex((a) => a === '--file' || a === '-f');
if (fileFlag >= 0) {
for (let i = fileFlag + 1; i < args.length; i += 1) {
if (args[i].startsWith('-')) break;
files.push(args[i]);
}
}
fs.writeFileSync(process.env.OD_MEMORY_OPENCODE_ARGS_OUT, JSON.stringify({ args, stdin, files }));
for (const f of files) {
if (!fs.existsSync(f)) {
process.stderr.write('Error: File not found: ' + f + '\\n');
process.exit(1);
}
}
process.stdout.write(JSON.stringify({
type: 'text',
part: {
@@ -1048,9 +1065,9 @@ process.stdout.write(JSON.stringify({
text: JSON.stringify({
entries: [{
type: 'project',
name: 'OpenCode prompt attachment',
description: 'OpenCode memory used a prompt file',
body: 'OpenDesign connector memory extraction should pass the compacted prompt to OpenCode as an attached file while sending a short message argument.'
name: 'OpenCode stdin prompt',
description: 'OpenCode memory used stdin',
body: 'OpenDesign connector memory extraction should pass the compacted prompt to OpenCode on stdin and parse the JSON event stream response.'
}]
})
}
@@ -1077,7 +1094,7 @@ process.stdout.write(JSON.stringify({
expect(result.suggestions).toEqual([
expect.objectContaining({
type: 'project',
name: 'OpenCode prompt attachment',
name: 'OpenCode stdin prompt',
}),
]);
@@ -1086,14 +1103,15 @@ process.stdout.write(JSON.stringify({
'run',
'--format',
'json',
'--file',
'Read the attached OpenDesign memory extraction prompt and return strict JSON only.',
'openai/gpt-5',
]));
expect(captured.args).toContain('openai/gpt-5');
expect(captured.prompt).toContain('You are a design-memory extractor');
expect(captured.prompt).toContain('OpenDesign connector memory should collect design preferences');
expect(captured.stdin).toBe('');
await expect(fsp.access(captured.attachedFile)).rejects.toThrow();
// The prompt rides on stdin like the chat-run path; no `--file`
// attachment (whose array option would swallow any trailing message).
expect(captured.args).not.toContain('--file');
expect(captured.args).not.toContain('-f');
expect(captured.files).toEqual([]);
expect(captured.stdin).toContain('You are a design-memory extractor');
expect(captured.stdin).toContain('OpenDesign connector memory should collect design preferences');
} finally {
if (previousPath == null) {
delete process.env.PATH;


@@ -0,0 +1,113 @@
// Golden daemon-event snapshots — addresses the regression-signal point
// from review on #3241: smoke-testing that mocks RUN catches only crashes
// or protocol-level garbage; it does NOT catch a parser change that
// semantically reshapes the events the daemon emits to the UI.
//
// This test replays representative recordings through the actual daemon
// stream handlers and asserts the emitted event sequence matches a
// committed `mocks/golden/<trace>.events.json`. A parser tweak that
// drops a tool_result, changes a usage shape, or renames an event type
// fails this test loudly.
//
// Update flow when a parser change is INTENTIONAL:
// MOCKS_GOLDEN_UPDATE=1 pnpm --filter @open-design/daemon test mocks-golden
// then `git diff mocks/golden/` and commit the new shapes.
//
// Auto-skips when the recording corpus hasn't been fetched yet (see
// `mocks/scripts/fetch-recordings.sh`); CI that exercises this test must
// fetch first.
import { describe, it, expect } from 'vitest';
import { existsSync, readFileSync, writeFileSync, mkdirSync } from 'node:fs';
import { spawnSync } from 'node:child_process';
import { dirname, join } from 'node:path';
import { fileURLToPath } from 'node:url';
import { createClaudeStreamHandler } from '../src/claude-stream.js';
import { createJsonEventStreamHandler } from '../src/json-event-stream.js';
const HERE = dirname(fileURLToPath(import.meta.url));
const REPO = join(HERE, '../../..');
const MOCK_AGENT = join(REPO, 'mocks/mock-agent.mjs');
const GOLDEN_DIR = join(REPO, 'mocks/golden');
const RECORDINGS_DIR = join(REPO, 'mocks/recordings');
// Median-tool-count successful traces per agent (selected from manifest
// 2026-05-29). Each one's `.jsonl` lives in `mocks/recordings/` after
// `bash mocks/scripts/fetch-recordings.sh`.
const CASES: Array<{ agent: 'claude' | 'codex' | 'opencode'; trace: string }> = [
{ agent: 'claude', trace: '314d6833-0377-4ac4-ba11-2b8d7eca5511' },
{ agent: 'codex', trace: 'dcdff3b3-cd39-4dcd-be83-372830a29639' },
{ agent: 'opencode', trace: '9a9522ec-575f-432f-aeed-efc491e900aa' },
];
// Replace per-spawn-volatile fields with stable sentinels so the
// snapshot stays diffable across runs. Currently only `sessionId` —
// claude's mock emits a fresh UUID every spawn. Opencode/codex carry
// the recording's own session/thread id so they're already stable.
function normalizeVolatile(events: unknown[]): unknown[] {
return events.map(e => {
if (!e || typeof e !== 'object') return e;
const rec = e as Record<string, unknown>;
const out: Record<string, unknown> = { ...rec };
if ('sessionId' in out) out.sessionId = '<normalized>';
return out;
});
}
function runMockAndCollectEvents(agent: string, trace: string): unknown[] {
// Force no-delay so the spawn returns quickly + deterministically.
const proc = spawnSync(
process.execPath,
[MOCK_AGENT, '--as', agent, '--no-delay'],
{
env: { ...process.env, OD_MOCKS_TRACE: trace, OD_MOCKS_NO_DELAY: '1' },
input: 'golden-test-prompt',
encoding: 'utf-8',
timeout: 30_000,
maxBuffer: 50 * 1024 * 1024,
},
);
if (proc.status !== 0) {
throw new Error(
`mock-agent --as ${agent} exit ${proc.status}: ${proc.stderr.slice(0, 500)}`,
);
}
const events: unknown[] = [];
const sink = (e: unknown) => events.push(e);
const handler =
agent === 'claude'
? createClaudeStreamHandler(sink)
: createJsonEventStreamHandler(agent, sink);
handler.feed(proc.stdout);
return normalizeVolatile(events);
}
const recordingsAvailable =
existsSync(RECORDINGS_DIR) &&
CASES.every(c => existsSync(join(RECORDINGS_DIR, `${c.trace}.jsonl`)));
describe.skipIf(!recordingsAvailable)(
'mocks goldens — daemon event shape regression',
() => {
for (const { agent, trace } of CASES) {
it(`${agent} ${trace.slice(0, 8)}`, () => {
const events = runMockAndCollectEvents(agent, trace);
const goldenPath = join(GOLDEN_DIR, `${trace}.events.json`);
if (process.env.MOCKS_GOLDEN_UPDATE === '1') {
mkdirSync(GOLDEN_DIR, { recursive: true });
writeFileSync(
goldenPath,
JSON.stringify({ agent, trace, events }, null, 2) + '\n',
);
return;
}
const golden = JSON.parse(readFileSync(goldenPath, 'utf-8'));
expect({ agent, trace, events }).toEqual(golden);
});
}
},
);


@@ -17,7 +17,9 @@ import {
createSnapshot,
getSnapshot,
linkSnapshotToRun,
linkSnapshotToProject,
markSnapshotStale,
restoreProjectSnapshotLink,
} from '../src/plugins/snapshots.js';
let db: Database.Database;
@@ -106,6 +108,37 @@ describe('snapshots writer', () => {
expect(after.expires_at).toBeNull();
});
it('restoreProjectSnapshotLink makes an unlinked discarded snapshot expirable again', () => {
db.prepare('INSERT INTO projects (id, name) VALUES (?, ?)').run('project-1', 'Project 1');
const previous = createSnapshot(db, baseInput({ query: 'Previous {{topic}}' }));
linkSnapshotToProject(db, previous.snapshotId, 'project-1');
const discarded = createSnapshot(db, baseInput({ query: 'Discarded {{topic}}' }));
linkSnapshotToProject(db, discarded.snapshotId, 'project-1');
restoreProjectSnapshotLink(
db,
'project-1',
discarded.snapshotId,
previous.snapshotId,
'run-that-was-never-linked',
);
const project = db.prepare(
`SELECT applied_plugin_snapshot_id AS appliedPluginSnapshotId
FROM projects
WHERE id = ?`,
).get('project-1') as { appliedPluginSnapshotId: string | null };
const discardedRow = db.prepare(
`SELECT run_id AS runId, expires_at AS expiresAt
FROM applied_plugin_snapshots
WHERE id = ?`,
).get(discarded.snapshotId) as { runId: string | null; expiresAt: number | null };
expect(project.appliedPluginSnapshotId).toBe(previous.snapshotId);
expect(discardedRow.runId).toBeNull();
expect(discardedRow.expiresAt).not.toBeNull();
});
it('markSnapshotStale flips status', () => {
db.prepare('INSERT INTO projects (id, name) VALUES (?, ?)').run('project-1', 'Project 1');
const snap = createSnapshot(db, baseInput());


@@ -165,6 +165,11 @@ describe('GET /api/projects/:id/raw/* range request route', () => {
await writeFile(path.join(dir, 'clip.mp4'), Buffer.alloc(FILE_SIZE, 0x42));
await writeFile(path.join(dir, 'audio.mp3'), Buffer.alloc(FILE_SIZE, 0x43));
await writeFile(path.join(dir, 'page.html'), Buffer.from('<html/>'));
await writeFile(path.join(dir, 'body.html'), Buffer.from('<html><body><main>Preview</main></body></html>'));
await writeFile(
path.join(dir, 'bridged.html'),
Buffer.from('<html><body><script data-od-url-scroll-bridge></script><main>Preview</main></body></html>'),
);
});
afterAll(() => new Promise<void>((resolve) => server.close(() => resolve())));
@@ -226,6 +231,32 @@ describe('GET /api/projects/:id/raw/* range request route', () => {
expect(text).toBe('<html/>');
});
it('injects the URL preview scroll bridge only when requested', async () => {
const plain = await fetch(rawUrl('page.html'));
expect(await plain.text()).toBe('<html/>');
const bridged = await fetch(`${rawUrl('page.html')}?odPreviewBridge=scroll`);
expect(bridged.status).toBe(200);
const html = await bridged.text();
expect(html).toContain('data-od-url-scroll-bridge');
expect(html).toContain("type: 'od:preview-scroll'");
});
it('injects the URL preview scroll bridge before the closing body tag', async () => {
const bridged = await fetch(`${rawUrl('body.html')}?odPreviewBridge=scroll`);
expect(bridged.status).toBe(200);
const html = await bridged.text();
expect(html.indexOf('data-od-url-scroll-bridge')).toBeGreaterThan(-1);
expect(html.indexOf('data-od-url-scroll-bridge')).toBeLessThan(html.indexOf('</body>'));
});
it('does not inject the URL preview scroll bridge twice', async () => {
const bridged = await fetch(`${rawUrl('bridged.html')}?odPreviewBridge=scroll`);
expect(bridged.status).toBe(200);
const html = await bridged.text();
expect(html.match(/data-od-url-scroll-bridge/g)?.length).toBe(1);
});
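The three scroll-bridge tests above describe an injection that is opt-in, placed before `</body>` when that tag exists, and idempotent. A minimal sketch of such an injector follows. Only the `data-od-url-scroll-bridge` marker and the `od:preview-scroll` message type come from the tests; the function name `injectScrollBridge` and the script body are assumptions, not the route's actual code.

```typescript
// Hypothetical sketch. The marker attribute and message type are taken from
// the tests; the function name and script body are assumed for illustration.
const BRIDGE_MARKER = 'data-od-url-scroll-bridge';

function injectScrollBridge(html: string): string {
  // Idempotent: a document that already carries the marker is returned as-is.
  if (html.includes(BRIDGE_MARKER)) return html;
  const script =
    `<script ${BRIDGE_MARKER}>` +
    `window.addEventListener('scroll', () => {` +
    ` window.parent.postMessage({ type: 'od:preview-scroll', y: window.scrollY }, '*');` +
    ` });` +
    `</script>`;
  const closeBody = html.lastIndexOf('</body>');
  // Prefer placing the script just before </body>; otherwise append it.
  return closeBody === -1
    ? html + script
    : html.slice(0, closeBody) + script + html.slice(closeBody);
}
```

In this sketch the caller would invoke the injector only when the request carries `?odPreviewBridge=scroll`, which keeps the plain raw route byte-for-byte unchanged.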
it('returns 404 for a missing file', async () => {
const res = await fetch(rawUrl('missing.mp4'));
expect(res.status).toBe(404);


@@ -0,0 +1,225 @@
import type http from 'node:http';
import { randomUUID } from 'node:crypto';
import { afterAll, beforeAll, describe, expect, it } from 'vitest';
import { startServer } from '../src/server.js';
describe('project skillId validation', () => {
let server: http.Server;
let baseUrl: string;
const projectsToClean: string[] = [];
beforeAll(async () => {
const started = (await startServer({ port: 0, returnServer: true })) as {
url: string;
server: http.Server;
};
baseUrl = started.url;
server = started.server;
});
afterAll(async () => {
for (const id of projectsToClean.splice(0)) {
await fetch(`${baseUrl}/api/projects/${encodeURIComponent(id)}`, {
method: 'DELETE',
}).catch(() => {});
}
await new Promise<void>((resolve) => server.close(() => resolve()));
});
function uniqueId(prefix: string): string {
return `${prefix}-${randomUUID()}`;
}
async function createProject(body: Record<string, unknown>) {
return fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
});
}
describe('POST /api/projects', () => {
it('rejects unknown skillId with 400 SKILL_NOT_FOUND', async () => {
const id = uniqueId('p');
const resp = await createProject({
id,
name: 'Skill id check',
skillId: 'definitely-not-a-real-skill',
});
expect(resp.status).toBe(400);
const body = (await resp.json()) as { error: { code: string } };
expect(body.error.code).toBe('SKILL_NOT_FOUND');
// Project must not have been persisted.
const getResp = await fetch(`${baseUrl}/api/projects/${encodeURIComponent(id)}`);
expect(getResp.status).toBe(404);
});
it('accepts a valid bundled skill id and stores it as-is', async () => {
const id = uniqueId('p');
const resp = await createProject({
id,
name: 'Bundled skill',
skillId: 'open-design-landing',
});
expect(resp.status).toBe(200);
projectsToClean.push(id);
const body = (await resp.json()) as { project: { skillId: string } };
expect(body.project.skillId).toBe('open-design-landing');
});
it('accepts a design-template id (source-of-truth = listAllSkillLikeEntries)', async () => {
const id = uniqueId('p');
const resp = await createProject({
id,
name: 'Template skill',
skillId: 'dashboard',
});
expect(resp.status).toBe(200);
projectsToClean.push(id);
const body = (await resp.json()) as { project: { skillId: string } };
expect(body.project.skillId).toBe('dashboard');
});
it('canonicalizes an aliased skill id (editorial-collage → open-design-landing)', async () => {
const id = uniqueId('p');
const resp = await createProject({
id,
name: 'Aliased skill',
skillId: 'editorial-collage',
});
expect(resp.status).toBe(200);
projectsToClean.push(id);
const body = (await resp.json()) as { project: { skillId: string } };
expect(body.project.skillId).toBe('open-design-landing');
});
it('normalizes empty string skillId to null', async () => {
const id = uniqueId('p');
const resp = await createProject({ id, name: 'Empty skill', skillId: '' });
expect(resp.status).toBe(200);
projectsToClean.push(id);
const body = (await resp.json()) as { project: { skillId: string | null } };
expect(body.project.skillId).toBeNull();
});
it('treats null skillId as no skill pinned', async () => {
const id = uniqueId('p');
const resp = await createProject({ id, name: 'Null skill', skillId: null });
expect(resp.status).toBe(200);
projectsToClean.push(id);
const body = (await resp.json()) as { project: { skillId: string | null } };
expect(body.project.skillId).toBeNull();
});
it('treats omitted skillId as no skill pinned', async () => {
const id = uniqueId('p');
const resp = await createProject({ id, name: 'Omitted skill' });
expect(resp.status).toBe(200);
projectsToClean.push(id);
const body = (await resp.json()) as { project: { skillId: string | null } };
expect(body.project.skillId).toBeNull();
});
it('rejects numeric skillId with 400 INVALID_SKILL_ID', async () => {
const id = uniqueId('p');
const resp = await createProject({ id, name: 'Bad type', skillId: 42 });
expect(resp.status).toBe(400);
const body = (await resp.json()) as { error: { code: string } };
expect(body.error.code).toBe('INVALID_SKILL_ID');
const getResp = await fetch(`${baseUrl}/api/projects/${encodeURIComponent(id)}`);
expect(getResp.status).toBe(404);
});
it('rejects object skillId with 400 INVALID_SKILL_ID', async () => {
const id = uniqueId('p');
const resp = await createProject({ id, name: 'Bad type', skillId: {} });
expect(resp.status).toBe(400);
const body = (await resp.json()) as { error: { code: string } };
expect(body.error.code).toBe('INVALID_SKILL_ID');
const getResp = await fetch(`${baseUrl}/api/projects/${encodeURIComponent(id)}`);
expect(getResp.status).toBe(404);
});
});
async function patchProject(id: string, patch: Record<string, unknown>) {
return fetch(`${baseUrl}/api/projects/${encodeURIComponent(id)}`, {
method: 'PATCH',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(patch),
});
}
describe('PATCH /api/projects/:id', () => {
it('rejects unknown skillId with 400 SKILL_NOT_FOUND', async () => {
const id = uniqueId('p');
const created = await createProject({ id, name: 'Patch target' });
expect(created.status).toBe(200);
projectsToClean.push(id);
const resp = await patchProject(id, { skillId: 'still-not-a-real-skill' });
expect(resp.status).toBe(400);
const body = (await resp.json()) as { error: { code: string } };
expect(body.error.code).toBe('SKILL_NOT_FOUND');
// skillId on the row stays unchanged (null since create).
const get = await fetch(`${baseUrl}/api/projects/${encodeURIComponent(id)}`);
const getBody = (await get.json()) as { project: { skillId: string | null } };
expect(getBody.project.skillId).toBeNull();
});
it('canonicalizes an aliased skillId on patch', async () => {
const id = uniqueId('p');
await createProject({ id, name: 'Patch alias' });
projectsToClean.push(id);
const resp = await patchProject(id, { skillId: 'editorial-collage' });
expect(resp.status).toBe(200);
const body = (await resp.json()) as { project: { skillId: string } };
expect(body.project.skillId).toBe('open-design-landing');
});
it('normalizes empty-string skillId on patch to null', async () => {
const id = uniqueId('p');
await createProject({ id, name: 'Patch empty', skillId: 'open-design-landing' });
projectsToClean.push(id);
const resp = await patchProject(id, { skillId: '' });
expect(resp.status).toBe(200);
const body = (await resp.json()) as { project: { skillId: string | null } };
expect(body.project.skillId).toBeNull();
});
it('treats null skillId on patch as unset', async () => {
const id = uniqueId('p');
await createProject({ id, name: 'Patch null', skillId: 'open-design-landing' });
projectsToClean.push(id);
const resp = await patchProject(id, { skillId: null });
expect(resp.status).toBe(200);
const body = (await resp.json()) as { project: { skillId: string | null } };
expect(body.project.skillId).toBeNull();
});
it('leaves skillId untouched when the field is omitted from patch', async () => {
const id = uniqueId('p');
await createProject({ id, name: 'Patch omit', skillId: 'open-design-landing' });
projectsToClean.push(id);
const resp = await patchProject(id, { name: 'Renamed' });
expect(resp.status).toBe(200);
const body = (await resp.json()) as { project: { skillId: string; name: string } };
expect(body.project.skillId).toBe('open-design-landing');
expect(body.project.name).toBe('Renamed');
});
it('rejects numeric skillId on patch with 400 INVALID_SKILL_ID', async () => {
const id = uniqueId('p');
await createProject({ id, name: 'Patch bad type' });
projectsToClean.push(id);
const resp = await patchProject(id, { skillId: 42 });
expect(resp.status).toBe(400);
const body = (await resp.json()) as { error: { code: string } };
expect(body.error.code).toBe('INVALID_SKILL_ID');
const get = await fetch(`${baseUrl}/api/projects/${encodeURIComponent(id)}`);
const getBody = (await get.json()) as { project: { skillId: string | null } };
expect(getBody.project.skillId).toBeNull();
});
});
});
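Taken together, the POST and PATCH cases above imply a single normalization step for `skillId`: empty, `null`, and omitted values mean "no skill pinned", non-string values are rejected with `INVALID_SKILL_ID`, aliases are canonicalized, and unknown ids are rejected with `SKILL_NOT_FOUND`. A minimal sketch under those assumptions follows. The names `normalizeSkillId`, `KNOWN_SKILLS`, and `SKILL_ALIASES` are hypothetical stand-ins; the real lookup goes through the daemon's skill registry, which this diff does not show.

```typescript
// Hypothetical sketch of the validation the tests describe.
type SkillIdResult =
  | { ok: true; skillId: string | null }
  | { ok: false; code: 'INVALID_SKILL_ID' | 'SKILL_NOT_FOUND' };

// Assumed stand-ins for the daemon's skill registry and alias table.
const KNOWN_SKILLS = new Set<string>(['open-design-landing', 'dashboard']);
const SKILL_ALIASES = new Map<string, string>([
  ['editorial-collage', 'open-design-landing'],
]);

function normalizeSkillId(raw: unknown): SkillIdResult {
  // undefined, null, and '' all mean "no skill pinned".
  if (raw === undefined || raw === null || raw === '') {
    return { ok: true, skillId: null };
  }
  // Anything that is not a string (numbers, objects) is a type error.
  if (typeof raw !== 'string') return { ok: false, code: 'INVALID_SKILL_ID' };
  // Resolve aliases before checking the registry so the stored id is canonical.
  const canonical = SKILL_ALIASES.get(raw) ?? raw;
  return KNOWN_SKILLS.has(canonical)
    ? { ok: true, skillId: canonical }
    : { ok: false, code: 'SKILL_NOT_FOUND' };
}
```

For PATCH, a caller would run this only when the `skillId` key is present in the request body, so that an omitted field leaves the stored value untouched, as the last-but-one PATCH test requires.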


@@ -124,6 +124,95 @@ test('conversation latest run follows assistant message position', () => {
assert.equal(getConversation(db, conversationId)?.latestRun?.status, 'running');
});
test('conversation summaries expose cumulative completed run duration', () => {
const db = createDb();
insertProject(db, {
id: 'project-duration',
name: 'project-duration',
createdAt: 1,
updatedAt: 1,
});
insertConversation(db, {
id: 'project-duration-conversation',
projectId: 'project-duration',
title: 'Duration test',
createdAt: 1,
updatedAt: 4,
});
upsertMessage(db, 'project-duration-conversation', {
id: 'project-duration-first',
role: 'assistant',
content: 'first done',
runId: 'project-duration-first-run',
runStatus: 'succeeded',
startedAt: 10_000,
endedAt: 40_000,
});
upsertMessage(db, 'project-duration-conversation', {
id: 'project-duration-running',
role: 'assistant',
content: 'still running',
runId: 'project-duration-running-run',
runStatus: 'running',
startedAt: 45_000,
});
upsertMessage(db, 'project-duration-conversation', {
id: 'project-duration-second',
role: 'assistant',
content: 'second done',
runId: 'project-duration-second-run',
runStatus: 'failed',
startedAt: 50_000,
endedAt: 125_000,
});
const listed = listConversations(db, 'project-duration')[0] as { totalDurationMs?: number };
const fetched = getConversation(db, 'project-duration-conversation') as { totalDurationMs?: number } | null;
assert.equal(listed.totalDurationMs, 105_000);
assert.equal(fetched?.totalDurationMs, 105_000);
});
test('conversation summaries include usage-only terminal run durations', () => {
const db = createDb();
insertProject(db, {
id: 'project-usage-duration',
name: 'project-usage-duration',
createdAt: 1,
updatedAt: 1,
});
insertConversation(db, {
id: 'project-usage-duration-conversation',
projectId: 'project-usage-duration',
title: 'Usage duration test',
createdAt: 1,
updatedAt: 4,
});
upsertMessage(db, 'project-usage-duration-conversation', {
id: 'project-usage-duration-imported',
role: 'assistant',
content: 'imported done',
runId: 'project-usage-duration-imported-run',
runStatus: 'succeeded',
events: [{ kind: 'usage', durationMs: 22_000 }],
});
upsertMessage(db, 'project-usage-duration-conversation', {
id: 'project-usage-duration-timestamped',
role: 'assistant',
content: 'timestamped done',
runId: 'project-usage-duration-timestamped-run',
runStatus: 'succeeded',
startedAt: 30_000,
endedAt: 60_000,
});
const listed = listConversations(db, 'project-usage-duration')[0] as { totalDurationMs?: number };
const fetched = getConversation(db, 'project-usage-duration-conversation') as { totalDurationMs?: number } | null;
assert.equal(listed.totalDurationMs, 52_000);
assert.equal(fetched?.totalDurationMs, 52_000);
});
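The two duration tests above fix the arithmetic for `totalDurationMs`: terminal runs (`succeeded` or `failed`) contribute `endedAt - startedAt`, a run that is still `running` contributes nothing, and a terminal run with no timestamps falls back to the `durationMs` of a `usage` event. A minimal sketch of that aggregation follows. The `RunMessage` shape and the function `sumRunDurations` are hypothetical, and the precedence of timestamps over usage events is an assumption; the real query lives in the daemon's database layer, which this hunk does not show.

```typescript
// Hypothetical message shape, reduced to the fields the tests exercise.
interface RunMessage {
  runStatus?: 'running' | 'succeeded' | 'failed';
  startedAt?: number;
  endedAt?: number;
  events?: Array<{ kind: string; durationMs?: number }>;
}

function sumRunDurations(messages: RunMessage[]): number {
  let total = 0;
  for (const m of messages) {
    // Only terminal runs count; a run still in progress has no final duration.
    if (m.runStatus !== 'succeeded' && m.runStatus !== 'failed') continue;
    if (typeof m.startedAt === 'number' && typeof m.endedAt === 'number') {
      total += Math.max(0, m.endedAt - m.startedAt);
      continue;
    }
    // Assumed fallback for runs without timestamps: the first usage event
    // that reports a numeric duration.
    const usage = m.events?.find(
      (e) => e.kind === 'usage' && typeof e.durationMs === 'number',
    );
    total += usage?.durationMs ?? 0;
  }
  return total;
}
```

With the first test's fixtures this gives 30 000 + 75 000 = 105 000 ms, and with the second test's fixtures 22 000 + 30 000 = 52 000 ms, matching the asserted totals.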
test('conversation listing batches latest run summaries for large projects', () => {
const db = createDb();
insertProject(db, {

Some files were not shown because too many files have changed in this diff.