1767 commits

Author SHA1 Message Date
mehmet turac
8448b1105c
fix: preserve OpenClaude fallback credentials (#3361)
2026-05-31 03:49:25 +00:00
Jane
d66a463d62
feat(landing-page): 301 legacy /skills /systems /templates to /plugins (#3352)
The 2026-05 plugins library rebuild introduced /plugins/skills/,
/plugins/systems/, /plugins/templates/ and a unified detail route
/plugins/<manifest-slug>/, but the old /skills/, /systems/, /templates/
catalogs were left live in parallel. Two equivalent page trees split SEO
equity, and the homepage, footer, quickstart, agents, official and blog
pages all still linked to the old routes.

Retire the legacy generators and 301 every old URL to its new plugins
equivalent so inbound links and search equity are preserved:

- Remove the /skills, /systems, /templates page generators (English +
  [locale] wrappers) and the now-orphaned skill-row component, and prune
  the skills/systems/templates branches from the [locale]/[...path]
  catch-all (it now renders only craft + blog).
- Add the migration block to public/_redirects. Detail slugs differ from
  the old folder names (new slugs are manifest-name based, e.g.
  design-system-<x>, example-<x>), so systems/templates use a prefixed
  splat plus a short degrade list, and skills map the 27 with a template
  equivalent explicitly while the ~110 instruction-only skills and all
  mode/scenario/category facet pages degrade to the section landing.
  'replicate' is forced to the section to avoid colliding with the
  design-system of the same name. Locale variants (zh, zh-tw, ja, ko)
  strip to the section.
- Repoint in-site links to /plugins/* across page.tsx (footer, work,
  labs pills), info-page-i18n.ts (en + zh + sourceNames), official,
  quickstart, agents, blog and html-anything, and update the sitemap
  serialize priority list. The system-card keeps linking through
  /systems/<slug>/ so the 8 systems without a detail page ride the
  redirect's degrade rather than pointing at a missing page.

Verified with a full astro build: old routes no longer emit any HTML,
the new section pages exist, _redirects is copied verbatim, and no
in-site link targets a removed route (the remaining /systems/<slug>/
hrefs are the system cards that 301 by design). astro check passes.

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-31 01:04:20 +00:00
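The redirect rules described in d66a463d62 can be modelled roughly as below. This is only a sketch of the mapping, not the contents of public/_redirects: `legacyToPlugins` is a hypothetical name, the explicit 27-skill table, the short degrade list and the 'replicate' special case are omitted, and every skill is assumed to degrade to the section landing.

```typescript
// Hypothetical model of the legacy-route mapping. The real rules live in
// public/_redirects; names and simplifications here are assumptions.
const LOCALES = ["zh", "zh-tw", "ja", "ko"];

// Detail slugs are manifest-name based, so old folder names get a prefix.
const DETAIL_PREFIX: Record<string, string> = {
  systems: "design-system-",
  templates: "example-",
};

function legacyToPlugins(path: string): string | null {
  const parts = path.split("/").filter(Boolean);
  const hasLocale = LOCALES.includes(parts[0] ?? "");
  const [section, slug] = hasLocale ? parts.slice(1) : parts;
  if (section !== "skills" && section !== "systems" && section !== "templates") {
    return null; // not a legacy catalog route
  }
  const landing = `/plugins/${section}/`;
  // Locale variants and bare catalog URLs go to the section landing.
  if (hasLocale || !slug) return landing;
  const prefix = DETAIL_PREFIX[section];
  // Simplification: skills always degrade to the section landing here.
  return prefix ? `/plugins/${prefix}${slug}/` : landing;
}
```

For example, `legacyToPlugins("/systems/foo/")` yields `/plugins/design-system-foo/`, while `/zh/templates/bar/` strips to `/plugins/templates/`.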
estelledc
1a6face04c
fix(web): prune draft tokens when the plugin chip strip clears (#2881) (#3356)
ChatComposer tracks the `@…` tokens this surface authored via the
@-mention popover plugin-pick path. When PluginsSection's chip strip
clears, we wire its `onCleared` and prune *only* those tracked
insertions from the draft so the textarea no longer holds orphaned
styled mentions whose chips just unmounted.

Architecture summary (rounds 1–9 collapsed; round 10 detailed below):

  - `Array<{token, start, pluginId, insertionId?}>` tracking with
    start offsets reconciled across each keystroke via an LCS+LCP
    edit-range diff in
    `apps/web/src/utils/pluginInsertionTracking.ts` (round 3-4).
    `insertionId` is forwarded by `reconcileInsertions` so the
    producer can locate its own entry across reconciles
    (round 10).
  - All draft mutations route through a single `updateDraft`
    chokepoint that runs `reconcileInsertions` outside the
    `setDraft` updater so React StrictMode's double-invoke is
    harmless (round 4-5).
  - Boundaries delegate to the shared
    `inlineMentions.isMentionBoundary` /
    `inlineMentions.isMentionRightBoundary` helpers so the
    tracker can never diverge from the parser (round 5).
  - `setActivePlugin` is a chokepoint for every applyById path,
    filtering tracked entries to those matching the new active
    plugin so a replace-plugin flow can never let stale entries
    survive (round 6).
  - Picker rollback double-snapshots draft + tracker so apply-
    failure restores the tracker but only rewrites the draft
    when no user keystrokes arrived during the await
    (round 7-8).
  - `stripPluginInsertedTokens` collapses whitespace seam-local
    so user-authored multi-space spans elsewhere are preserved
    (round 8).
  - `setActivePlugin` is deferred past `await applyById` on
    every path, and `onCleared` filters by
    `pluginsSectionRef.current?.getActiveRecord()?.id` so a
    pending-window clear scopes to the actually-mounted
    plugin's tokens (round 9).
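
A minimal sketch of the offset reconciliation in the first bullet, assuming "LCS+LCP" means longest common suffix plus longest common prefix. The function names and the choice to drop entries that overlap the edited range are assumptions, not the actual `pluginInsertionTracking.ts` code:

```typescript
interface TrackedInsertion {
  token: string;
  start: number;
  pluginId: string;
  insertionId?: string;
}

// Find the edited range by trimming the common prefix and common suffix.
function editRange(prev: string, next: string) {
  const max = Math.min(prev.length, next.length);
  let prefix = 0;
  while (prefix < max && prev[prefix] === next[prefix]) prefix++;
  let suffix = 0;
  while (
    suffix < max - prefix &&
    prev[prev.length - 1 - suffix] === next[next.length - 1 - suffix]
  ) {
    suffix++;
  }
  return { start: prefix, prevEnd: prev.length - suffix, nextEnd: next.length - suffix };
}

// Shift entries after the edit, keep entries before it, and (assumption)
// drop entries the edit overlapped. insertionId rides along via the spread.
function reconcileSketch(
  entries: TrackedInsertion[],
  prev: string,
  next: string,
): TrackedInsertion[] {
  const { start, prevEnd, nextEnd } = editRange(prev, next);
  const shift = nextEnd - prevEnd;
  const out: TrackedInsertion[] = [];
  for (const e of entries) {
    const end = e.start + e.token.length;
    if (end <= start) out.push(e);
    else if (e.start >= prevEnd) out.push({ ...e, start: e.start + shift });
  }
  return out;
}
```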

Round 10 race in the picker rollback:

  Round 9 made `onCleared` mutate the tracker and the draft when
  it ran during a pending replace, and added the `getActiveRecord`
  filter so the strip targets the still-mounted plugin's entries
  only. The picker's failure-path rollback, however, still
  restored `prevEntries` / `prevActiveId` wholesale — assuming
  nothing else had touched the tracker during the await. If the
  user clicked the still-mounted original chip's × during the
  pending replace AND the deferred `applyById` then resolved
  with a 500, the wholesale restore (a) resurrected entries that
  `onCleared` had legitimately stripped (now stale offsets) and
  (b) left the optimistic `@<target>` orphaned in the draft with
  no chip ever having mounted — the original #2881 symptom
  recurring inside the failure window.

  Fix splits the failure rollback into two paths:

  1. **Detect "intervening clear" via `activePluginIdRef.current
     === null && prevActiveId !== null`.** `onCleared` always
     nulls the active id as its last action; our deferred
     `setActivePlugin` never ran in the failure branch. So the
     null-while-prev-not-null state is the smoking gun for an
     intervening clear during the await.

  2. **On detection, surgically remove only our optimistic
     entry and only its `@<target>`.** Locate the entry by
     `insertionId` (added to `TrackedInsertion` as an optional
     field, forwarded by `reconcileInsertions` so the id
     survives offset shifts) — this disambiguates the case
     where the user picked the same plugin from the @-popover
     more than once during the await window. Splice that entry
     out and run `updateDraft((d) => stripPluginInsertedTokens(
     d, [ourEntry]))` so the draft loses `@<target>` and any
     remaining tracked entries (the in-flight target would have
     no others, but a co-pending second pick could) get their
     offsets reconciled. `activePluginIdRef` stays at `null` —
     `onCleared`'s truth, since no chip is mounted.

  The "no intervening clear" branch is the round 7/8 path:
  restore `prevEntries`/`prevActiveId` wholesale and rewrite
  the draft only if `draftRef.current === postInsertDraft`
  (no user keystrokes during the await).
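
The two rollback paths can be sketched as a pure function. The state and snapshot shapes below are hypothetical stand-ins for the refs in ChatComposer, and the token removal is a plain splice rather than the real `stripPluginInsertedTokens` (so no seam-local whitespace collapse):

```typescript
interface TrackedInsertion {
  token: string;
  start: number;
  pluginId: string;
  insertionId?: string;
}

// Hypothetical shapes; the real component keeps these in refs and React state.
interface ComposerState {
  draft: string;
  entries: TrackedInsertion[];
  activePluginId: string | null;
}
interface PickSnapshot {
  prevDraft: string;
  prevEntries: TrackedInsertion[];
  prevActiveId: string | null;
  postInsertDraft: string;
  insertionId: string;
}

function rollbackFailedApply(state: ComposerState, snap: PickSnapshot): ComposerState {
  // onCleared nulls the active id last, so null-while-prev-not-null
  // signals a clear that happened during the await.
  const interveningClear = state.activePluginId === null && snap.prevActiveId !== null;
  if (interveningClear) {
    const ours = state.entries.find((e) => e.insertionId === snap.insertionId);
    if (!ours) return state;
    const len = ours.token.length;
    const draft = state.draft.slice(0, ours.start) + state.draft.slice(ours.start + len);
    const entries = state.entries
      .filter((e) => e !== ours)
      .map((e) => (e.start > ours.start ? { ...e, start: e.start - len } : e));
    return { draft, entries, activePluginId: null };
  }
  // No intervening clear: restore wholesale, but only rewrite the draft
  // when the user typed nothing during the await.
  return {
    draft: state.draft === snap.postInsertDraft ? snap.prevDraft : state.draft,
    entries: snap.prevEntries,
    activePluginId: snap.prevActiveId,
  };
}
```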

Regression coverage (additions):

  - `apps/web/tests/components/ChatComposer.plugin-clear-prunes-draft.test.tsx`
    — 18 integration specs total (17 prior + 1 new round-10):
    * `@-popover pick A → @-popover pick B (apply pending) →
      clear A's chip → resolve B with 500 → assert no orphan
      @<target>, no orphan @A, no chip mounted, no stale
      tracker entries`. Uses a deferred `Promise<Response>` so
      the apply stays in flight while the chip-clear is fired,
      then resolves with a 500 to drive the failure path. Pre-
      fix this would resurrect Airbnb's stale entry AND leave
      `@SecondPlugin` orphaned in the draft.

PluginsSection.tsx is unchanged. The host-local tracking +
draft-update chokepoint + parser-aligned boundaries + deferred
active-plugin scoping + transactional applyById + intervening-
clear-aware rollback + filtered `onCleared` keep the cross-
component contract identical to main — only ChatComposer touches
behavior, plus the utils module and two `inlineMentions` exports.

Validation:
  - pnpm exec vitest run tests/utils/pluginInsertionTracking.test.ts → 36/36 passed
  - pnpm exec vitest run tests/components/ChatComposer.plugin-clear-prunes-draft.test.tsx → 18/18 passed
  - pnpm exec vitest run -c vitest.config.ts (full apps/web suite, 228 files) → 2202/2202 passed
  - pnpm --filter @open-design/web typecheck → green
  - pnpm guard → green
2026-05-30 17:16:24 +00:00
Denis Redozubov
f4c5d22f22
fix(daemon): confine sandbox project roots and host discovery (#3243)
* fix(daemon): confine sandbox project and host discovery

* fix(daemon): resolve sandbox data dir for toolchain discovery

* fix(daemon): resolve sandbox data dir for agent env

* fix(daemon): fail fast for sandbox imported folders

* test(daemon): assert sandbox imported folder rejection

* fix(daemon): keep sandbox import guard at run start

* fix(daemon): reject sandbox imported project file roots

* fix(daemon): preserve imported project detail roots

* test(daemon): expect sandbox profiles to stay scoped

* fix(daemon): bypass proxies for agent tool callbacks

* test(daemon): isolate media policy route memory extraction

* fix(daemon): keep loopback no-proxy scoped to sandbox
2026-05-30 16:57:04 +00:00
Denis Redozubov
9a3424d68c
feat(daemon): add sandbox runtime foundation (#3242)
* feat(daemon): add sandbox runtime foundation

* fix(daemon): preserve sandbox roots after agent env overrides

* fix(daemon): keep readiness probes pathless

* fix(daemon): harden headless run fallbacks

* fix(daemon): bootstrap sandbox runtime discovery

* fix(daemon): preserve explicit sandbox agent profile mounts

* fix(daemon): keep sandbox profile lookup run scoped

* fix(daemon): normalize sandbox data dir input

* fix(daemon): pin sandbox env roots to base data dir
2026-05-30 15:06:05 +00:00
open-design-bot[bot]
b9f0b69cf1
docs(readme): refresh contributors wall (#3339)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-30 14:02:17 +00:00
chaoxiaoche
df1535b7fd
feat(web): add staged preview feedback during generation (#3227)
* feat(web): wire generation preview stage into workspace

Show a 3-step progress overlay (understand → generate → prepare) in the
preview area while artifacts are being generated, replacing the blank
empty state. Displays elapsed time, an estimated duration hint, and a
retry button on failure.

- Add GenerationPreviewStage component + CSS module + runtime helpers
- Integrate buildGenerationPreviewState into FileWorkspace
- Pass messages/artifact/error/retry from ProjectView to FileWorkspace
- Register i18n keys for en and zh-CN locales

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): keep generation preview alive and persistent across waiting states

Address UX feedback on the generation preview surface:

- Make the waiting card feel alive instead of frozen: breathing mark,
  sweeping progress shimmer, pulsing running-step dot, and a live
  activity snippet pulled from streamed events (respects
  prefers-reduced-motion).
- Add an `awaiting-input` phase so the preview no longer reverts to the
  empty "design will appear here" placeholder when the agent asks the
  user a clarifying question (detects inline <question-form>).
- Add a `stopped` phase so a canceled/paused run keeps a contextual
  paused card instead of blanking the surface.
- Fix workspaceHasPreviewSurface live-artifact tab match (was reading a
  non-existent `tabId` field) and correct the unit assertion that
  contradicted the helper's `thinking` handling.
- Populate generationPreview.* keys (incl. new awaiting/stopped strings)
  across all locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): reveal generation steps progressively as the agent reaches them

- Only render steps the agent has actually reached (drop pending pills)
  with a slide/fade entrance, so the card visibly evolves 1->2->3 instead
  of always showing the same fully-populated row.
- Keep the "understand" step in progress during requesting/starting so a
  fresh run opens with a single step rather than a pre-filled set.
- Stop surfacing status detail (e.g. the model slug from `requesting`) as
  the live activity line; only genuine thinking/output text is shown.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): add dynamic sub-status to the generating step

Keep 3 high-level steps but give the long "generating" phase concrete,
moving feedback (option A) instead of splitting into more, less-reliable
steps:

- Derive a sub-status from the agent's TodoWrite plan: the in-progress
  task label (activeForm) plus a done/total count, falling back to the
  latest write/edit target file when no plan was emitted.
- The count counts the in-progress task toward `done` to match the
  chat-side todo card (e.g. 3/7 on both sides).
- Suppress the higher-level narration line while the sub-status is shown
  so only one dynamic line appears at a time (early phase = narration,
  writing phase = concrete task + count).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): drop elapsed timer and duplicate estimate from generation preview

The "usually 2–5 minutes" estimate showed twice (lead footnote + meta row)
and the elapsed counter added little signal, so remove both: delete the
meta row, stop falling back to the estimate footnote in the generating
lead (render the lead only when live narration exists), and drop the now
unused elapsed timer/util.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-30 13:59:49 +00:00
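The sub-status derivation described in df1535b7fd ("add dynamic sub-status to the generating step") might look roughly like this. The `Todo` shape, the status names and `deriveSubStatus` are assumptions for illustration; only the counting rule (the in-progress task counts toward `done`) and the file fallback come from the message:

```typescript
// Assumed shape of a TodoWrite plan item; field names are illustrative.
interface Todo {
  content: string;
  activeForm?: string;
  status: "pending" | "in_progress" | "completed";
}

function deriveSubStatus(todos: Todo[], lastWrittenFile?: string): string | null {
  // No plan emitted: fall back to the latest write/edit target file.
  if (todos.length === 0) return lastWrittenFile ?? null;
  const active = todos.find((t) => t.status === "in_progress");
  // The in-progress task counts toward `done`, matching the chat-side card.
  const done = todos.filter((t) => t.status !== "pending").length;
  const count = `${done}/${todos.length}`;
  if (!active) return count;
  return `${active.activeForm ?? active.content} (${count})`;
}
```

With two completed tasks, one in progress and four pending, this sketch reports 3/7 next to the active task label.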
leessju
cfde84b038
fix(web): make hand-off no-editors fallback perform a real reveal (#2494)
* fix(web): make hand-off no-editors fallback perform a real reveal

The Finder/Explorer/File Manager fallback button was only calling an
optional onRequestRevealInFinder prop that the actual caller never
passes, so the surface advertised an action it never performed.

finder, explorer, and file-manager are real entries in the daemon's
open-in catalogue (open / explorer / xdg-open), so route the fallback
through openProjectInEditor(projectId, fallbackId) for a genuine
reveal. Keep the renderer reveal bridge as a secondary fallback if
the daemon spawn fails, and disable the button while busy so a double
click can't queue two reveals.

Adjacent: PreviewDrawOverlay's Send-while-streaming behavior is
intentional (sending is queued downstream, not blocked), and the
button already carries sendDisabledReason as its tooltip. Cover that
contract with a regression test so a future change can't silently
re-disable the control or drop the localized reason.

Scope note: the i18n hand-off key migration that previously rode on
this branch landed on main via a different key set, so this PR is
narrowed to just the fallback wire-up and the two regression tests.

* fix(web): surface daemon spawn failure inline in zero-editors fallback

The zero-editors HandoffButton fallback called setError() on a rejected
openProjectInEditor but returned only the <button>, so the error never
rendered. Production callers (ProjectView) mount the component without
onRequestRevealInFinder, so a daemon spawn failure became a silent no-op
— exactly the failure mode the PR was meant to cover.

Wrap the solo button in a handoff-wrap container and render the error
inline next to it. Adds a regression test for the rejected-spawn path.

* fix(web): align preview draw send-disabled test

* fix(web): show handoff fallback for zero editors

---------

Co-authored-by: nicejames <nicejames@gmail.com>
Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-30 06:34:12 +00:00
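A rough sketch of the fallback flow from cfde84b038: try the daemon first, fall back to the renderer bridge, guard against double clicks, and return the error so the caller can render it inline. `RevealDeps`, `createRevealHandler` and the `Promise<void>` return types are assumptions; only `openProjectInEditor(projectId, fallbackId)` is named in the message:

```typescript
// Hypothetical dependency bag; return types are assumed.
interface RevealDeps {
  openProjectInEditor(projectId: string, editorId: string): Promise<void>;
  revealViaRenderer?: () => Promise<void>;
}

// Returns a click handler that resolves to an error message to show
// inline, or null when the reveal succeeded or the click was ignored.
function createRevealHandler(projectId: string, fallbackId: string, deps: RevealDeps) {
  let busy = false;
  return async (): Promise<string | null> => {
    if (busy) return null; // a double click cannot queue two reveals
    busy = true;
    try {
      await deps.openProjectInEditor(projectId, fallbackId);
      return null;
    } catch (err) {
      if (deps.revealViaRenderer) {
        try {
          await deps.revealViaRenderer();
          return null;
        } catch {
          // fall through and surface the original daemon error
        }
      }
      return err instanceof Error ? err.message : String(err);
    } finally {
      busy = false;
    }
  };
}
```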
RyanCheng77
b76e7196db
fix(daemon): dedupe Claude stream wrappers (#3334)
* fix(daemon): dedupe Claude stream wrappers

* fix(daemon): split Claude stream dedupe state

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 06:12:29 +00:00
mehmet turac
259295419a
fix(web): remove design file mentions with chips (#3204)
2026-05-30 04:50:50 +00:00
Weston Houghton
7a9dcf38d7
fix(memory): deliver OpenCode extraction prompt on stdin (#3238)
`opencode run`'s `-f, --file` is a yargs array option that greedily
consumes every trailing non-flag token, so the memory extractor's
`--file <prompt-file> "<message>"` invocation made OpenCode treat the
message text as a second attachment and exit 1 with "File not found".
Every LLM memory extraction failed for OpenCode Local CLI users.

Deliver the prompt on stdin like the chat-run path (def.promptViaStdin)
and drop the --file attachment. The connector-memory test now models
the real yargs --file array-greediness so it would catch a regression.
2026-05-30 04:48:42 +00:00
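The greedy-array behaviour behind 7a9dcf38d7 can be illustrated with a toy parser. This is a hand-written model of the behaviour the message describes, not yargs itself, and `parseGreedyFile` is a hypothetical name:

```typescript
// Toy model: an array-typed --file option keeps consuming tokens until
// it meets the next flag, so a trailing message is swallowed as a file.
function parseGreedyFile(argv: string[]): { files: string[]; positionals: string[] } {
  const files: string[] = [];
  const positionals: string[] = [];
  const queue = [...argv];
  while (queue.length > 0) {
    const tok = queue.shift() as string;
    if (tok === "--file" || tok === "-f") {
      while (queue.length > 0 && !(queue[0] as string).startsWith("-")) {
        files.push(queue.shift() as string);
      }
    } else {
      positionals.push(tok);
    }
  }
  return { files, positionals };
}
```

In this model, `["run", "--file", "prompt.txt", "extract memories"]` ends up with the message text listed as a second file, which matches the "File not found" failure; passing the prompt on stdin leaves no trailing tokens for `--file` to consume.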
RyanCheng77
f12679185c
fix(web): send Anthropic proxy image attachments (#3273)
* fix(web): send Anthropic proxy image attachments

* fix(web): omit image attachment stubs for Anthropic proxy

* fix(web): keep image fallback context aligned

* fix(web): align Anthropic image attachment omission

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 04:47:47 +00:00
RyanCheng77
653a3fcc70
fix(web): harden image export downloads (#3318)
* feat(web): export preview as image

* fix(web): harden image export downloads

* docs(skills): add PR feedback quality gate

* docs(skills): require critical review of Claude feedback

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 04:44:00 +00:00
YOMXXX
9305bd1cff
fix(web): truncate long project names in the automation project picker (#3274) (#3317)
Long project names in the "Existing projects" section of the
automation project picker rendered verbatim with no truncate styling,
so a single name like "A very long project name that would otherwise
wrap onto several lines" blew up the row height and made the dropdown
messy to scan. The expected behavior is a single-line label with
ellipsis, with the full name still discoverable on hover.

Add the standard truncate triad (`white-space: nowrap`,
`overflow: hidden`, `text-overflow: ellipsis`) to
`.automation-popover__label`. The parent
`.automation-popover__body` already sets `min-width: 0`, so the
ellipsis renders cleanly. Thread an optional `title` prop through
`PopoverItem` and pass each project's full name from the picker
call site, so the native hover tooltip carries the unclipped name.

Other PopoverItems with fixed in-product copy (e.g. "New project
each run") deliberately omit the title — they never exceed the row
width and the redundant tooltip would be noise.

Regression test covers the DOM contract (every project row has
`title=<full name>`, fixed rows do not); the CSS half is verified by
code review since jsdom does not apply stylesheets.
2026-05-30 04:42:21 +00:00
open-design-bot[bot]
e76eb6da63
Update docs/assets/github-metrics.svg (#3338)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-30 04:31:16 +00:00
Ramiro
ed5e8c147b
fix(web): keep pet composer menu expanded (#3336)
- apps/web/src/components/ChatComposer.tsx
- apps/web/tests/components/ChatComposer.context-pickers.test.tsx

Clear stale absolute anchors when the pet composer menu is positioned fixed so the popover wraps its content instead of collapsing over the composer textarea.
2026-05-30 04:19:02 +00:00
Ramiro
c33641e592
fix(daemon): normalize cumulative ACP message chunks (#3333)
* fix(daemon): normalize cumulative acp message chunks

- apps/daemon/src/acp.ts
- apps/daemon/tests/acp.test.ts
- apps/web/src/providers/daemon.ts
- apps/web/src/components/DesignSystemFlow.tsx

Convert cumulative ACP message snapshots into suffix deltas and keep temporary browser debug instrumentation for trace verification.

* chore(web): remove temporary stream debug hooks

- apps/web/src/providers/daemon.ts
- apps/web/src/components/DesignSystemFlow.tsx

Remove the browser debug accumulator after validating the ACP duplication trace.
2026-05-30 04:17:32 +00:00
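One way to picture the normalization in c33641e592 is a small stateful converter that turns cumulative snapshots into suffix deltas. `createChunkNormalizer` is a hypothetical name and this is not the code in apps/daemon/src/acp.ts:

```typescript
// Sketch: if a chunk starts with everything seen so far, treat it as a
// cumulative snapshot and emit only the new suffix. Otherwise treat it
// as an ordinary incremental delta.
function createChunkNormalizer(): (chunk: string) => string {
  let seen = "";
  return (chunk: string): string => {
    if (chunk.startsWith(seen)) {
      const delta = chunk.slice(seen.length);
      seen = chunk;
      return delta;
    }
    seen += chunk;
    return chunk;
  };
}
```

The prefix check is a heuristic: an incremental delta that happens to repeat the full text seen so far would be misread as a snapshot, so a real implementation may need a stronger signal.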
xinsngx
41b1cd763e
fix(media): hide OpenAI OAuth-only image credentials (#3308)
* fix(media): ignore OpenAI OAuth tokens

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.1

* fix(media): hide unavailable model providers

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.2

* fix(media): clear unavailable picker models

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.3

* fix(media): keep missing-model projects executable

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.8

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 04:12:10 +00:00
JasonBroderick
0fbeaf829e
fix(#3247): Detect, terminate, and warn on fabricated role markers across all agent paths (#3303)
* fix(daemon): detect and strip fabricated role markers in model output (#3247)

Three-layer defence against models emitting `## user` / `## assistant` /
`## system` lines mid-response, which the chat host interprets as real
turn boundaries and acts on as unauthorised instruction:

1. **System prompt**: anti-roleplay instruction elevated from a bullet
   under "What you don't do" to a standalone `## CRITICAL` section in
   `official-system.ts`, with a REMINDER pinned at the end of the
   composed prompt for recency bias.

2. **Stream-level detection and truncation**: shared `role-marker-guard.ts`
   module (`createRoleMarkerGuard` + `FABRICATED_ROLE_MARKER_RE`) used
   across all text paths — Claude stream (per-message guards), non-Claude
   structured streams (run-scoped guard via `emitGuardedTextDelta`),
   and BYOK proxy routes (`createDeltaGuard`). When a marker is detected,
   the contaminated suffix is dropped and a `fabricated_role_marker` event
   surfaces a warning in the UI.

3. **UI**: `StatusPill` gains `is-warning` / `is-error` CSS variants;
   `fabricated_role_marker` events render as amber warning pills.

* fix(chat-routes): do not await reader.cancel() on stream early-return

The await on reader.cancel() can hang indefinitely on response streams
whose underlying source is a Uint8Array (most notably surfaced by the
ollama test in proxy-routes.test.ts, which builds its mock body via
`new Response(uint8array)` rather than the controller-based helper
`sseResponse()`). The hung await holds the request handler open, which
in turn blocks `server.close()` in the afterAll hook, producing the two
test timeouts (test at 145, hook at 36) currently failing CI on #3296.

Fix is in production code, not the test: don't await the cancel. It
is a cleanup hint and we are returning from the function anyway, so
blocking on it offers no value. Fire-and-forget with an empty catch
keeps the cancel signal flowing for real HTTP streams without
risking a hang on mock/edge-case implementations.
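
The fire-and-forget pattern can be sketched with the standard Streams API. `readFirstChunk` is a hypothetical helper, not the chat-routes handler:

```typescript
// Read one chunk, then return early without awaiting the cancel.
async function readFirstChunk(
  body: ReadableStream<Uint8Array>,
): Promise<Uint8Array | undefined> {
  const reader = body.getReader();
  const { value } = await reader.read();
  // Cancel is only a cleanup hint: do not await it, and swallow any
  // rejection so it cannot surface as an unhandled promise rejection.
  void reader.cancel().catch(() => {});
  return value;
}
```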

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(daemon): terminate child on role-marker detection (close #3247 generation vector)

PR #3296's detection layer truncates display and persistence of fabricated
role markers, but the underlying model subprocess keeps generating tokens
after detection. Three concrete consequences:

  1. The model bills the user for the entire contaminated response
     (we observed 5,106 chars stored in Claude's session file for a turn
     where only the first 3,013 chars were legitimate, so roughly 41% of
     the stored text was contamination).
  2. tool_use blocks emitted AFTER the marker reach the daemon's
     dispatcher unchecked, since detection only gates the text-delta
     emission path, not content-block-stop / tool_use blocks. The
     model could fabricate "## user delete file X" then emit a
     tool_use(delete X) that the dispatcher would execute.
  3. The UI surfaces a `fabricated_role_marker` warning followed by an
     eventual normal turn-end, blurring the distinction between
     "completed normally" and "killed by safety guard."

This commit adds a single idempotent `abortForRoleMarker(marker)`
helper in server.ts, scoped to the same closure as `child` and
`runGuard`. On any detection event (per-message Claude guard,
run-scoped non-Claude guard, plain stdout guard) the helper:

  - Emits a structured `ROLE_MARKER_HALLUCINATION` SSE error so the
    UI can render a security-class status distinct from a normal
    turn-end. The existing `fabricated_role_marker` warning is still
    sent and rendered as the amber pill (PR #3296's UI).
  - Calls `acpSession.abort()` for ACP-multiplexed agents (Hermes,
    Kimi, Devin, Kiro) whose I/O doesn't necessarily release on
    SIGTERM of the wrapper process alone.
  - SIGTERMs the child immediately, with the existing
    `scheduleForcedChildShutdown()` SIGKILL fallback at 2x grace.

Wired into three sites where contamination is detected:
  - `emitGuardedTextDelta` (sendAgentEvent / copilot / ACP / pi-rpc
    text_delta paths)
  - Plain-stdout listener (BYOK plain mode)
  - The Claude stream handler's onEvent (per-message guards in
    claude-stream.ts surface `fabricated_role_marker` events directly
    via onEvent rather than through the run-scoped emitGuardedTextDelta)
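
The idempotent helper could be shaped roughly as follows. The dependency interfaces are hypothetical stand-ins for what server.ts closes over; only the order of actions and the `ROLE_MARKER_HALLUCINATION` code come from the message:

```typescript
// Hypothetical stand-ins for the closure state in server.ts.
interface Killable {
  kill(signal: "SIGTERM" | "SIGKILL"): void;
}
interface AbortDeps {
  child: Killable;
  acpSession?: { abort(): void };
  sendError(code: string, marker: string): void;
  scheduleForcedChildShutdown(): void;
}

function createAbortForRoleMarker(deps: AbortDeps): (marker: string) => void {
  let aborted = false;
  return (marker: string): void => {
    if (aborted) return; // several guards may fire; act only once
    aborted = true;
    deps.sendError("ROLE_MARKER_HALLUCINATION", marker);
    deps.acpSession?.abort();
    deps.child.kill("SIGTERM");
    deps.scheduleForcedChildShutdown(); // SIGKILL fallback after the grace period
  };
}
```

Keeping the `aborted` flag inside the closure lets all three detection sites call the same function without coordinating with each other.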

Tool_use blocks emitted BEFORE the marker still flow through normally
— this guard can't help with those, since by the time we observe a
text marker the prior content block has already finished. Closing
that gap requires speculative cancellation of in-flight tool calls
when a downstream text block contains a marker; that's tracked as
follow-up work, not included here.

Co-Authored-By: roverkai <2196140098@qq.com>
Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* refactor(role-marker-guard): bounded tail + drop chat-style markers

Addresses two review comments on #3303:

(1) O(1) memory + per-delta work (review r3323982225)
  Replace the unbounded `accumulated` string with a rolling tail capped
  at TAIL_BUFFER_SIZE (64 chars — comfortably exceeds the longest
  marker prefix `\n<whitespace>## assistant` ≈ 16–24 chars in practice).
  A 50 KB assistant response delivered in 1000 chunks of 50 bytes was
  previously O(n²) on string concatenation alone; now it is O(1) per
  delta regardless of message length. The `tail.length` value carries
  the "already emitted" offset that the cut-point math needs, so the
  offset semantics at L74–78 of the prior implementation are preserved
  without re-introducing the full-text buffer.

(2) Drop chat-style markers entirely (review r3323982234, option (a))
  `User:` / `Assistant:` / `Human:` / `AI:` are removed from the regex.
  Rationale:
    - The host parses ONLY `## user` / `## assistant` / `## system`
      lines as turn boundaries (see `buildDaemonTranscript` in
      apps/web/src/providers/daemon.ts). A model emitting chat-style
      markers does NOT cause the original #3247 security failure.
    - With kill-on-detection wired in this PR (`abortForRoleMarker`
      in server.ts), a false positive aborts the whole run — far
      more expensive than a stray unflagged `User:` line in chat
      scrollback. Chat-style markers collide with legitimate output
      (form labels, email contacts, JSDoc) often enough that pairing
      them with kill-semantics is the wrong tradeoff.
  The tradeoff is now documented in the regex docblock so the
  kill-on-match behaviour is justified against the false-positive
  surface.

Also aligns the prompt-side CRITICAL block in system.ts: drop the
"don't emit User: / Assistant: / Human: / AI:" bullet, since we no
longer enforce it. Less ambiguity for the model and the operators.

Test file updated:
  - Chat-style positive tests flipped to negative ("does NOT match
    User: — chat-style out of scope") so the intentional exclusion
    has a permanent regression test.
  - Two new tests cover the bounded-tail behaviour: a marker arriving
    after 10 KB of clean text in small chunks, and a marker
    straddling a chunk boundary after 100 prior chunks.
  - Added test for legitimate `User: bob@example.com`-style content
    not triggering contamination.
Test count is now 35 (up from 25); two of the new ones explicitly
exercise the new bounded-tail path.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): drop `^` anchor after first chunk (review r3324060995)

Blocking correctness bug introduced by commit 4 (bounded-tail refactor):
once `tail` is a rolling slice of mid-stream text, `^` in the
canonical regex `(?:^|\n)\s*##\s+(?:user|...)` no longer represents
the genuine message start. As the rolling window slides forward chunk
by chunk, a sliced tail can begin with whitespace + `##` (or just
`##`), letting `^` anchor a match against text that the
full-buffer implementation correctly ignored. With kill-on-detection
wired in commit 3, that false positive now SIGTERMs the run and emits
a `ROLE_MARKER_HALLUCINATION` error, exactly the failure class
called out in the docblock at L22–29.

Reviewer's evidence (PerishCode, r3324060995): streaming
"…take a look at the ## user content section…" one character at a
time reports `contaminated: true` post-refactor; the same text in a
single feed stays clean.

Fix: keep the canonical `FABRICATED_ROLE_MARKER_RE` for the very
first non-empty feed (where `^` legitimately points at the message
start), and switch to an internal `NEWLINE_ANCHORED_ROLE_MARKER_RE`
(`\n\s*##\s+(?:user|...)`, which drops the `^` alternative) for all
subsequent feeds. A `firstChunk` boolean tracks the state. Real
newline-preceded markers straddling chunk boundaries are still caught
because the preceding `\n` is retained inside the 64-char tail.
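
A detection-only sketch of the bounded tail plus first-chunk switch. The role alternation `user|assistant|system` is assumed from the markers named elsewhere in this PR, regex flags are left out, and the cut-point and truncation logic of the real `role-marker-guard.ts` is not reproduced:

```typescript
const TAIL_BUFFER_SIZE = 64;
// Assumed alternation; the message elides it as `user|...`.
const FIRST_FEED_RE = /(?:^|\n)\s*##\s+(?:user|assistant|system)/;
const LATER_FEED_RE = /\n\s*##\s+(?:user|assistant|system)/;

function createRoleMarkerGuardSketch() {
  let tail = "";
  let firstChunk = true;
  let contaminated = false;
  return {
    // Returns true once a marker has been seen in the stream.
    feed(delta: string): boolean {
      if (contaminated || delta.length === 0) return contaminated;
      const window = tail + delta;
      // `^` is only meaningful on the first non-empty feed; afterwards
      // the window starts mid-stream, so require a real newline.
      const re = firstChunk ? FIRST_FEED_RE : LATER_FEED_RE;
      firstChunk = false;
      if (re.test(window)) contaminated = true;
      // Keep a bounded tail so markers straddling chunks still match.
      tail = window.slice(-TAIL_BUFFER_SIZE);
      return contaminated;
    },
  };
}
```

In this sketch a start-of-message marker split across the first two feeds would go undetected; the message does not say how the real module treats that case.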

Regression tests added (`apps/daemon/tests/role-marker-guard.test.ts`):
  - mid-line `## user` streamed char-by-char with no preceding \n
    (mirrors the reviewer's repro)
  - space-preceded mid-line `## user` in a >130-char stream, which is
    long enough to force the rolling window past the marker; this
    exercises the exact slice condition that triggered the bug
  - real \n-preceded `## user` still caught after a long preamble
    (positive case must not regress)
  - `## user` as the very first chunk still caught (`^` legitimately
    anchors on the first feed)

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): case-sensitive + tighter prefix scope (reviews r3324151877 / r3324151882)

Two refinements addressing the third review on #3303:

== Blocking (r3324151877) ==
The regex over-matched legitimate Markdown headings, and with
kill-on-detection wired in commit 3 each false positive
deterministically aborts a real run. Three changes tighten the match
to the actual security surface — `## user` / `## assistant` /
`## system` lines the chat host parses as turn boundaries — without
losing any real attack pattern:

1. CASE-SENSITIVE. Dropped the `/i` flag. The host's turn-boundary
   delimiter is lowercase (see `buildDaemonTranscript` in
   apps/web/src/providers/daemon.ts), and the `## CRITICAL`
   system-prompt block already forbids only the lowercase forms.
   Title-Case headings like `## User Guide`, `## System Architecture`,
   `## Assistant settings` are now ignored — these are legitimate
   technical writing patterns LLMs emit constantly. `## USER NOTES`
   (all-caps) likewise no longer flags.

2. POSITIVE LOOKAHEAD `(?=[^a-z])` after the role keyword. Without it,
   `## userland`, `## userspace`, `## users guide`, `## systemd`,
   `## assistance` all match via prefix in the alternation. The
   lookahead requires the next character to exist and to not be a
   lowercase letter, so:
     - `## user\n…`      → match (newline is not lowercase)
     - `## assistantR…` → match (R is uppercase; the glued-form
                          attack pattern still gets caught)
     - `## assistant.`  → match (. is not a letter)
     - `## users guide` → no match (s is lowercase letter)
     - `## userland`    → no match (l is lowercase letter)
   POSITIVE rather than NEGATIVE `(?![a-z])` because the negative
   form is satisfied at end-of-string, which in a streaming context
   means "we have `## user` but don't know what comes next yet", so it
   would fire prematurely if `land` arrives in a later chunk. The
   positive form delays detection by one character in that edge
   case, a delay traded for correctness.

3. `[ \t]` instead of `\s` for inner whitespace. Markdown role
   markers are single-line by convention; restricting to space/tab
   prevents oddities like `##\nuser` from matching across lines.
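
The three changes above can be sketched as a single pattern. This is a minimal illustration rather than the daemon's actual source: the names `ROLE_MARKER_SKETCH_RE` and `flags` are hypothetical, and the alternation is assumed to be exactly `user|assistant|system`.

```typescript
// Hypothetical sketch of the tightened pattern described above.
// - no `i` flag: only lowercase role keywords match
// - `[ \t]` rather than `\s`: the marker must sit on a single line
// - `(?=[^a-z])`: the next character must exist and must not be a lowercase letter
const ROLE_MARKER_SKETCH_RE =
  /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;

// Returns true when the text contains a line the sketch treats as a role marker.
function flags(text: string): boolean {
  return ROLE_MARKER_SKETCH_RE.test(text);
}

// Fixtures taken from the commit message above.
const samples: string[] = [
  "## user\nhello",
  "## assistantReading",
  "## assistant.",
  "## users guide",
  "## userland",
  "## User Guide",
  "## USER NOTES",
  "##\nuser\n",
];
for (const sample of samples) {
  console.log(JSON.stringify(sample), flags(sample));
}
```

Note that a bare `## user` at end of input does not match in this sketch, because the positive lookahead needs one more character to inspect.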

Test file: added Title-Case fixtures (`## User Guide`,
`## System Architecture`, `## Assistant settings`, `## USER NOTES`)
and prefix-of-longer-word fixtures (`## users guide`, `## userland`,
`## systemd`, `## assistance`) — each asserting NO contamination.
The existing `## usability` negative test gave false confidence as
the reviewer noted (only failed via alternation-miss, not via
word-boundary semantics); the new fixtures actually exercise the
lookahead. Also added a positive test for `## assistant.` (glued
punctuation) to balance the existing `## assistantReading`
(glued uppercase) coverage. Total tests: 35 → 50.

== Non-blocking (r3324151882) ==
Added `ROLE_MARKER_HALLUCINATION` to `API_ERROR_CODES` in
`packages/contracts/src/errors.ts` alongside the existing agent/AMR
codes, with a docblock comment explaining the emission contract:
emitted by `server.ts::abortForRoleMarker` alongside the existing
`fabricated_role_marker` warning event when the daemon detects a
fabricated Markdown role marker in agent output; retryable. The code
was already being emitted over the wire but unregistered — landing
the registration here keeps the contract and emitter in sync as
reviewer requested.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): defer complete-but-unconfirmed marker suffix

Addresses review r3324277xxx — the boundary case where a stream chunk
boundary lands between the role keyword and its lookahead character
violated the documented "everything from the marker onward is silently
dropped" contract. With (?=[^a-z]) as the lookahead, `feedText('## user')`
returned `## user` as safe (no char to satisfy the lookahead → no match
→ pass through), so the fabricated marker line leaked into UI and
app.sqlite before the next chunk confirmed contamination on the next
SIGTERM cycle.

Fix: introduce a `pending` state variable holding bytes that match the
COMPLETE-but-unconfirmed marker prefix at end of buffer
(/(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|assist|system)$/, no
lookahead, $ anchor instead). When the no-match branch detects this
suffix, withhold it from emission until the next feed either:
  - Confirms it (next char non-lowercase) → main regex matches →
    contaminated → withheld bytes dropped along with `## user`.
  - Denies it (next char lowercase, e.g. `userl…`) → main regex no
    longer matches the role keyword → withheld suffix is released
    and emitted alongside the new continuation.
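
The confirm/deny flow above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the module's code: `createDeferringGuard` and its fields are hypothetical names, the alternation is assumed to be `user|assistant|system`, and the sketch keeps the whole text in memory instead of the bounded tail the real guard uses.

```typescript
// Assumed patterns: the main regex needs a confirming character after the keyword;
// the pending regex matches a complete keyword sitting at the very end of the buffer.
const MARKER_RE = /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;
const PENDING_RE = /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)$/;

function createDeferringGuard() {
  let emitted = "";        // text already released to the consumer
  let pending = "";        // withheld, complete-but-unconfirmed marker suffix
  let contaminated = false;

  return {
    // Returns the part of the stream that is safe to emit for this feed.
    feed(chunk: string): string {
      if (contaminated) return "";
      const all = emitted + pending + chunk;
      let releaseEnd = all.length;

      const hit = MARKER_RE.exec(all);
      if (hit) {
        // Confirmed: drop everything from the marker onward.
        contaminated = true;
        releaseEnd = hit.index;
      } else {
        // Not confirmed yet: withhold a complete keyword at end of buffer.
        const maybe = PENDING_RE.exec(all);
        if (maybe) releaseEnd = maybe.index;
      }

      const from = emitted.length;
      const to = Math.max(releaseEnd, from); // never un-emit released text
      const safe = all.slice(from, to);
      emitted = all.slice(0, to);
      pending = contaminated ? "" : all.slice(to);
      return safe;
    },
    isContaminated: (): boolean => contaminated,
  };
}
```

In this sketch a denied suffix (for example `## user` followed by `land`) is released together with the continuation on the next feed, while a confirmed one is never emitted.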

Also tied the firstChunk transition to actual byte emission rather
than feed count. Previously a message that starts with `## system`
followed by a separate `\n` chunk would lose the `^` anchor on the
second feed (firstChunk had flipped after the first feed even though
nothing was emitted yet), silently breaking detection for that edge
case. Now `firstChunk` stays true until at least one byte has crossed
the emission boundary, matching the conceptual definition of "message
start".

Tests added (apps/daemon/tests/role-marker-guard.test.ts):
  - `## user` deferred at chunk boundary, confirmed by `\n` in next
  - `## user` deferred at chunk boundary, denied by `land` continuation
  - `## assistant` deferred, confirmed by punctuation
  - `## User` Title-Case still passes through unconditionally
  - `## system` as the very first chunk: deferred, confirmed by \n
    in next chunk (tests the firstChunk-stays-true-when-nothing-
    emitted invariant)

Total tests: 50 → 55.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(claude-stream): scope role-marker guard to text_delta only, not thinking_delta

Addresses review r3324xxxxxx — guarding the thinking channel buys no
security and causes legitimate aborts.

Why thinking is NOT a #3247 vector:
  - `buildDaemonTranscript` in apps/web/src/providers/daemon.ts only
    re-serializes `m.content` as `## ${m.role}\n...`.
  - Extended-thinking content is rendered to a separate
    `kind: 'thinking'` payload (daemon.ts:857-858) and never folded
    into `m.content`.
  - So a `## user` line in the thinking channel CANNOT become a
    fabricated turn boundary on the next round-trip.

Why guarding it is harmful:
  - Models routinely emit literal `## user` / `## assistant` lines
    in chain-of-thought when reasoning about conversation structure
    ("Let me think about this. The user might phrase it as:\n## user\n
    …"). Common pattern in production traces.
  - With `abortForRoleMarker` wired in server.ts, a guard match on
    thinking SIGTERMs the run and surfaces a security error to the
    UI. The user paid for the reasoning, never sees the answer, and
    gets a confusing "fabricated role marker" warning for what was
    actually legitimate metacognition.
  - This directly contradicts the module's own stated philosophy
    ("a false positive aborts the whole run — a much more expensive
    failure than a stray unflagged ... line", role-marker-guard.ts).

Fix: `emitSafeText` now passes thinking_delta through unconditionally,
skipping both the guard and the contamination check. text_delta
remains fully guarded. The single-line change at the top of
emitSafeText preserves all other channels' behavior.
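
A minimal sketch of the channel split described above, assuming a simplified delta shape. The types `Delta` and `Guard` and the function `emitSafeTextSketch` are hypothetical stand-ins, not the actual signatures in claude-stream.ts.

```typescript
// Hypothetical, simplified shapes for illustration only.
type Delta =
  | { kind: "text_delta"; text: string }
  | { kind: "thinking_delta"; text: string };

interface Guard {
  // Returns the safe portion of the chunk (possibly empty).
  feed(chunk: string): string;
}

function emitSafeTextSketch(
  delta: Delta,
  guard: Guard,
  emit: (kind: Delta["kind"], text: string) => void,
): void {
  // Thinking content is never folded into m.content, so it bypasses the guard.
  if (delta.kind === "thinking_delta") {
    emit(delta.kind, delta.text);
    return;
  }
  // text_delta stays fully guarded: only the safe portion is emitted.
  const safe = guard.feed(delta.text);
  if (safe.length > 0) emit(delta.kind, safe);
}
```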

Regression tests added (apps/daemon/tests/claude-stream-thinking.test.ts):
  - `## user` / `## assistant` lines in a thinking_delta — must NOT
    fire fabricated_role_marker, the thinking content streams intact
    including the marker text, and the subsequent text_delta answer
    still reaches the consumer (run not aborted).
  - Sanity check: same `## user` pattern in a text_delta DOES fire
    fabricated_role_marker and truncates emission at the marker. Locks
    in the channel-discriminated behavior.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): tie firstChunk to slicing, not byte emission

Blocking review r3324xxxxxx: under the prior firstChunk transition
("any byte emitted"), a role marker that arrived at the very start of
a message with its prefix split across multiple chunks bypassed
detection — reopening the #3247 vector on the Claude path.

Concrete cases that were missed (all are routine provider
tokenizations of `## user\n…` at message start):
  - `##`     | ` user\nDELETE…`
  - `## us`  | `er\nDELETE…`
  - `## `    | `user\nDELETE…`

Mechanism: the pending-deferral regex only catches COMPLETE role
keywords, so a first chunk ending in a partial prefix (`##`, `## `,
`## us`) was emitted in full. That emission flipped firstChunk to
false. From that point only NEWLINE_ANCHORED_ROLE_MARKER_RE was used,
which requires a literal \n before `##`. A marker at buffer
position 0 has no preceding \n, so it could no longer match.
abortForRoleMarker never fired and tool_use blocks emitted after the
fabricated turn boundary reached the dispatcher.

Fix: change firstChunk to track "tail has not been sliced yet" rather
than "any byte emitted". While total emitted bytes <= TAIL_BUFFER_SIZE,
tail still represents the entire emission so far and `^` in the
canonical regex genuinely anchors at byte 0 of the stream, so the
`^|\n` alternation safely catches a chunk-split message-start
marker. The transition happens at the moment we would slice: once
emitted > TAIL_BUFFER_SIZE, tail becomes a mid-stream window, `^`
becomes meaningless, and we switch to the newline-only variants.

Earlier iterations of this code tried two other definitions, both
unsound:
  - "any byte emitted" (this commit fixes): lost `^` before a
    chunk-split message-start marker could finish arriving.
  - "newline emitted" (briefly considered as the reviewer's
    alternative suggestion): left `^` valid on a sliced buffer
    when streams hadn't emitted a newline yet, re-introducing the
    rolling-tail mid-stream false positive from review r3324060995.
The slice-based invariant satisfies both: while we have not sliced,
`^` is correct; once we slice, it is not.
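
The slice-based invariant can be sketched as below. This is an illustration under assumptions rather than the guard's real code: `createTailScanner` is a hypothetical name, the tail size of 64 is taken from the "64-char tail" mentioned in an earlier commit, and pending-suffix deferral is left out to keep the example short.

```typescript
const TAIL_BUFFER_SIZE = 64; // assumed from the "64-char tail" mentioned earlier

// Canonical form: `^` is only trustworthy while the tail is the whole stream.
const CANONICAL_RE = /(?:^|\n)[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;
// Newline-only form: used once the tail is a mid-stream window.
const NEWLINE_ONLY_RE = /\n[ \t]*##[ \t]+(?:user|assistant|system)(?=[^a-z])/;

function createTailScanner() {
  let tail = "";  // last TAIL_BUFFER_SIZE characters seen so far
  let total = 0;  // total characters seen so far

  return {
    // Returns true when the current window contains a role marker.
    scan(chunk: string): boolean {
      // If nothing has been sliced off yet, `tail` starts at byte 0 of the stream.
      const sliced = total > TAIL_BUFFER_SIZE;
      const re = sliced ? NEWLINE_ONLY_RE : CANONICAL_RE;
      const window = tail + chunk;
      const hit = re.test(window);
      total += chunk.length;
      tail = window.slice(-TAIL_BUFFER_SIZE);
      return hit;
    },
  };
}
```

Feeding `##` and then ` user\nDELETE…` reports a hit on the second call, because the unsliced tail still begins at the message start and `^` anchors correctly.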

Regression tests added (apps/daemon/tests/role-marker-guard.test.ts):
  - `##`    | ` user\nDELETE…`   → contaminated, marker=`## user`
  - `## us` | `er\nDELETE…`      → contaminated, marker=`## user`
  - `## `   | `user\nDELETE…`    → contaminated, marker=`## user`
  - `#`     | `# user\nDELETE…`  → contaminated, marker=`## user`
The fourth case (single `#` first chunk) exercises an even more
adversarial tokenization than the reviewer's examples; it is also
caught.

Total tests: 55 → 59.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(tests): wrap events in stream_event envelope in thinking test

feedJsonl was feeding raw events without the `{ type: 'stream_event',
event: ... }` wrapper that createClaudeStreamHandler requires (line 141
of claude-stream.ts). Events silently fell through all branches, making
both tests pass vacuously. Also fix TS2532 on warnings[0].marker with
non-null assertion (safe after the toHaveLength(1) guard).

Co-Authored-By: RoverKai <roverkai@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: roverkai <2196140098@qq.com>
Co-authored-by: JasonBroderick <jason@buddyboss.com>
Co-authored-by: RoverKai <roverkai@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-30 03:57:56 +00:00
xinsngx
c88a83cd5e
fix(web): preserve preview scroll across tools (#3313)
* fix(web): preserve preview scroll across tools

Capture URL-loaded preview scroll state before tool handoff and restore it through an opt-in raw HTML bridge to avoid jumping back to the top.

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.6

* test(daemon): cover scroll bridge injection paths

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.6

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 03:53:50 +00:00
xinsngx
778010bcf9
fix(web): theme home hero select menu (#3309)
* fix(web): theme home hero select menu

Use theme tokens for HomeHero footer select dropdown panels so dark theme menus do not render light-only white backgrounds with low-contrast text.

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.3

* test(web): cover dark model logo inversion

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.7

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 03:53:42 +00:00
Nicholas-Xiong
610aac4cc5
fix: insert skill reference when selecting from tools panel (#3220)
* fix: insert skill reference when selecting from tools panel

When selecting a skill from the tools panel (not via @ mention), the skill
reference was not being inserted into the input. The tools panel only called
applyProjectSkill() without inserting the text token.

This fix makes the tools panel skill picker behave consistently with other
reference types (MCP, connectors) by:
- Inserting the @skill token at the current cursor position
- Setting focus and cursor position after the inserted reference
- Closing the tools panel after successful insertion

Now users can:
1. Manually delete an @skill reference
2. Open the tools panel and select a skill
3. See the new @skill reference inserted at the cursor

Fixes #3188

* fix(web): preserve draft edits when inserting skill

---------

Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-30 03:52:35 +00:00
Patrick A
9146dc1c57
fix(web): persist design files view state across navigation (#2303)
* fix(web): persist design-files view state across navigation

pageSize, sortKey, sortDir, and kindFilter reset on every navigation
because DesignFilesPanel remounts via key={projectId}. Persist them to
localStorage under od:design-files:view-state:v1:<projectId> so each
project's view prefs survive tab-switching.

- Read persisted state via lazy useState initializers (SSR-safe try/catch)
- Write back in a single useEffect keyed on all four values
- Scoped per-project so proj-a settings never bleed into proj-b
- Schema-guarded: invalid/missing fields fall through to defaults
- Red spec: apps/web/tests/components/DesignFilesPanel.view-state-persist.test.tsx
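
A schema-guarded read of the persisted state might look like the sketch below. Only the storage key format and the four field names come from the commit message; the `ViewState` shape, the default values, and the name `readViewStateSketch` are assumptions for illustration.

```typescript
// Assumed shape; the real kindFilter and sortKey types are not shown in the source.
interface ViewState {
  pageSize: number;
  sortKey: string;
  sortDir: "asc" | "desc";
  kindFilter: string[];
}

// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface StorageLike {
  getItem(key: string): string | null;
}

// Hypothetical defaults.
const makeDefaults = (): ViewState => ({
  pageSize: 20,
  sortKey: "name",
  sortDir: "asc",
  kindFilter: [],
});

const viewStateKey = (projectId: string): string =>
  `od:design-files:view-state:v1:${projectId}`;

function readViewStateSketch(
  storage: StorageLike | undefined,
  projectId: string,
): ViewState {
  const defaults = makeDefaults();
  if (!storage) return defaults; // e.g. during server-side rendering
  try {
    const raw = storage.getItem(viewStateKey(projectId));
    if (raw === null) return defaults;
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) return defaults;
    const p = parsed as Record<string, unknown>;
    // Each invalid or missing field falls through to its default.
    return {
      pageSize:
        typeof p.pageSize === "number" && Number.isInteger(p.pageSize) && p.pageSize > 0
          ? p.pageSize
          : defaults.pageSize,
      sortKey: typeof p.sortKey === "string" ? p.sortKey : defaults.sortKey,
      sortDir: p.sortDir === "asc" || p.sortDir === "desc" ? p.sortDir : defaults.sortDir,
      kindFilter:
        Array.isArray(p.kindFilter) && p.kindFilter.every((k) => typeof k === "string")
          ? (p.kindFilter as string[])
          : defaults.kindFilter,
    };
  } catch {
    return defaults; // unreadable storage or malformed JSON
  }
}
```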

* fix(web): address review feedback on view-state persistence

- Add typeof window guard in readViewState for explicit SSR safety
- Consolidate 4 separate localStorage reads into a single useRef read at
  mount time; each lazy useState initializer now reads from savedViewState.current
  instead of re-parsing localStorage independently

* fix(web): harden design-files view-state persistence

- Validate restored kindFilter values against the current ProjectFileKind
  union via isProjectFileKind() so stale stored values from a prior schema
  are dropped silently instead of being cast unchecked.

- Introduce DEFAULT_SORT_KEY/SORT_DIR/PAGE_SIZE constants so the useState
  initialisers and the new validation guard share a single source of truth.

- Add viewStateHasMounted ref to skip the first-render write in the persist
  useEffect. Without this guard every project the user visits accumulates a
  default-value entry in localStorage on mount, growing stale-key garbage
  unboundedly and making future field additions silently inject defaults into
  every existing entry.

- Harden kindFilter test: replace the silent early-return-on-missing-trigger
  with expect(filterTrigger).not.toBeNull() so a render failure surfaces as
  a real test failure rather than a passing no-op.

* test(e2e): design files view state persists across navigation and reload

Adds a Playwright UI smoke test in e2e/ui/ that exercises the three key
guarantees of the view-state persistence fix:

  (a) Tab-away / tab-back: navigating to a file tab and returning remounts
      DesignFilesPanel (conditionally rendered); all four prefs (sortKey,
      sortDir, pageSize, kindFilter) are restored from localStorage.

  (b) Hard reload: localStorage survives page.reload(); prefs are intact on
      the next mount.

  (c) Per-project key isolation: a second project starts with defaults and
      does not inherit values from the first project's localStorage entry.

The test uses OD_PORT=18011 / OD_WEB_PORT=18012 to avoid port conflicts with
the default development ports.

Also fixes a race in DesignFilesPanel: the stale-kind cleanup useEffect was
running against an empty availableKinds set before the async file list arrived
on mount, which cleared a kindFilter correctly restored from localStorage.
Guard added: skip the cleanup when availableKinds is empty.

Red on origin/main (no persistence logic exists there); green on this branch.

* fix(e2e): address code-reviewer feedback on view-state-persist test

- Add data-testid='df-page-size-select' to per-page <select> in
  DesignFilesPanel (W2: decouple test from i18n string 'Show')
- Add StrictMode comment to viewStateHasMounted guard explaining
  the dev-mode double-write behaviour (W1: document the invariant)
- Switch nav-away from dblclick to single-click + Open button,
  matching the pattern used in app-design-files.test.ts (W4)
- Raise timeout from 60s to 90s for cold CI runners (W3)
- Unify seedTextFile/seedPngFile into shared seedFile helper (N3)
- Add home-hero-input assertion in gotoEntryHome (N2)
- Switch waitForPageSizeSelect to use data-testid (W2)

* test(e2e): split design-files persist into nav, reload, and per-project scenarios

* fix(web): tighten isPageSize to discrete option set, add invalid-value regression test

* fix(web): isolate DesignFilesPanel.test.tsx from persisted view-state key
2026-05-30 03:39:27 +00:00
Weston Houghton
65802542a2
fix(chat): surface OpenCode usage-limit/provider failures instead of a bare timeout (#3316)
* fix(chat): surface OpenCode provider failures from its log on a silent stall

OpenCode's headless `run --format json` mode swallows provider failures: a
429 usage-limit is marked retryable and retried silently with nothing on
stdout/stderr, so the chat run only dies via the inactivity watchdog and the
daemon shows a bare "request timed out" with no reason. The real error
(statusCode + "Monthly usage limit reached…") is recorded only in OpenCode's
own session log.

On a failed OpenCode close where stdout/stderr carry no signal, read the
newest OpenCode session log, extract the latest `service=llm` provider error
(scoped to that one line so the embedded request body can't contaminate the
classification), and emit a structured, retryable SSE error (RATE_LIMITED /
AGENT_AUTH_REQUIRED / UPSTREAM_UNAVAILABLE) carrying the provider's message.

Refs #982.

* fix(chat): emit recovered OpenCode failure from the watchdog path, bound to the run

Addresses review on #3316.

Blocking: the recovery previously ran only in the child-close handler, but in
the inactivity-watchdog stall path (the exact case this targets)
failForInactivity sends its error and finish()es the run — which clears
run.clients — before the child closes. So the structured error reached zero
live SSE clients and only surfaced on reload. Recover and send the OpenCode
failure inside failForInactivity, before finish(), on the same pre-teardown
send path the generic stall message already uses. Keep the close-handler
branch for the case where OpenCode exits non-zero on its own (clients still
attached).

Non-blocking: bind the log lookup to the current run via an mtime gate
(since=run.createdAt) so a stale or concurrent session's error can't be
misattributed — skip log files last written before the run started.
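
The mtime gate can be illustrated with a pure selection function. This is a sketch under assumptions: `LogFileInfo` and `newestLogSince` are hypothetical names, and the caller is assumed to have already collected each log file's modification time.

```typescript
// Hypothetical description of one candidate log file.
interface LogFileInfo {
  path: string;
  mtimeMs: number; // last-modified time in milliseconds since the epoch
}

// Picks the newest log written at or after `sinceMs` (for example run.createdAt).
// Files last written before the run started are skipped entirely.
function newestLogSince(
  files: readonly LogFileInfo[],
  sinceMs: number,
): LogFileInfo | undefined {
  let best: LogFileInfo | undefined;
  for (const file of files) {
    if (file.mtimeMs < sinceMs) continue; // stale: cannot belong to this run
    if (best === undefined || file.mtimeMs > best.mtimeMs) best = file;
  }
  return best;
}
```

As the follow-up commit notes, a gate like this cannot tell two concurrent runs apart, since both may write logs after `sinceMs`.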

* docs(opencode-log): note the concurrent-run limitation of the mtime gate

* fix(chat): skip close-handler failure emit when the watchdog already finished the run

Non-blocking review follow-up on #3316: on the silent-stall path both
failForInactivity and the child-close handler fired for the same run, so the
recovered RATE_LIMITED error was sent twice and the events-log stream was
reopened after finish() had closed it. Guard the close-handler failure emit
with !design.runs.isTerminal(run.status) — the watchdog already sent the
error and finalized the run; finalization below still runs (finish() no-ops
once terminal).
2026-05-30 03:23:58 +00:00
YOMXXX
9b9a18af5b
fix(daemon): validate skillId on POST/PATCH /api/projects against runtime source-of-truth (#3293)
* fix(daemon): validate skillId on POST /api/projects against runtime source of truth

* fix(daemon): validate skillId on PATCH /api/projects/:id, sharing the POST validator

* test(daemon): cover skillId canonicalization, design-template ids, empty-string + null normalization, type rejection
2026-05-30 03:22:16 +00:00
Ramiro
e30a4a2202
fix(platform): search mise shims dir so mise-installed CLIs are detected (#3319)
* fix(platform): search mise shims dir so mise-installed CLIs are detected

- Add ~/.local/share/mise/shims (and MISE_DATA_DIR override + legacy ~/.mise/shims) to wellKnownUserToolchainBins.
- This makes Pi, Kimi, and other mise-managed coding agents visible to the daemon even when launched from GUI contexts with stripped PATH.
- Added tests for default and MISE_DATA_DIR cases.
- Also pinned pnpm@10.33.2 in root mise.toml for better mise ergonomics.

Before/after: more local CLIs now appear in the runtime picker (Kimi, Pi, Antigravity, Kilo, etc.).

Refs: discussion in session around improving detection for common mise users.

* fix(platform): address Copilot review on mise shims logic

- Generalize the shims comment (no hard-coded CLI examples).
- Make per-version Node toolchain scanning respect MISE_DATA_DIR
  (use the same mise root for installs as for shims).
- Avoid duplicate shims entries when MISE_DATA_DIR makes legacy path
  identical to the primary one.

Addresses the three inline comments from copilot-pull-request-reviewer
on PR #3319.

* test(platform): extend MISE_DATA_DIR test to cover installs scanning

Addresses non-blocking review feedback from @nettee on PR #3319.

The previous test only asserted shims behavior under a custom
MISE_DATA_DIR. This extends it to also create fixture trees under
customMise/installs/node/... and customMise/installs/npm-openai-codex/...
and assert that the install paths are discovered while default-root
paths are excluded.

This makes the test robust against regressions in the installs
scanning logic (existingMiseNpmPackageBinDirs + node version dirs).

* fix(platform): only fall back to ~/.mise/shims when no MISE_DATA_DIR is set

Addresses the remaining non-blocking review comment from @nettee on PR #3319.

When an explicit MISE_DATA_DIR is provided, we no longer inject the
legacy ~/.mise/shims path. This prevents stale shims from a previous
mise layout from being re-introduced into detection.

Also added a regression assertion in the MISE_DATA_DIR test.
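
The resulting shims lookup can be sketched as follows. The function name `miseShimsDirsSketch` is hypothetical, paths are assumed to be POSIX-style, and the real `wellKnownUserToolchainBins` logic may differ in details such as existence checks.

```typescript
// Returns candidate mise shims directories, in lookup order.
function miseShimsDirsSketch(
  home: string,
  env: Record<string, string | undefined>,
): string[] {
  const custom = (env.MISE_DATA_DIR ?? "").trim();
  if (custom.length > 0) {
    // Explicit MISE_DATA_DIR: use only its shims dir, no legacy fallback.
    return [`${custom.replace(/\/+$/, "")}/shims`];
  }
  // No override: default data dir first, then the legacy location.
  return [`${home}/.local/share/mise/shims`, `${home}/.mise/shims`];
}
```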

* fix(daemon): make claude-stream dedup robust when final assistant wrapper lacks msgId

Prevents duplicated text and thinking output (especially visible during
design system generation with AMR/Vela).

Root cause: when the final assistant wrapper message arrived without a
string `id` (common in some AMR flows and design system tasks), the
textStreamed guard's fallback caused the full content to be
re-emitted even if it had already been delivered via streaming deltas.

Fix: track whether any text or thinking was streamed via deltas for the
current message and use that as a reliable fallback for the final wrapper
instead of only trusting msgId presence.

* revert: remove dedup from claude-stream (PR #3319 should stay clean)
2026-05-30 03:21:04 +00:00
open-design-bot[bot]
51963cff78
docs(readme): refresh contributors wall (#3271)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-29 14:12:51 +00:00
open-design-bot[bot]
482e318afe
Update docs/assets/github-metrics.svg (#3267)
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-29 14:12:36 +00:00
lefarcen
6f532ca35c
fix(web): snapshot the srcDoc bridge frame in Mark mode so deck capture works (#3304)
The Mark tool (#3081/#3277) captured the preview via the *active* iframe. For
URL-load previews — decks especially — the active frame is the bridgeless URL
iframe, while the snapshot bridge lives only in the (mounted but hidden) srcDoc
transport frame. So Send on a deck timed out and showed 'Could not capture the
preview. Try again to avoid sending only ink.'

Snapshot the srcDoc-render-mode frame instead (capture mode already keeps it on
full content, so it carries the bridge), with a short retry while it finishes
swapping to full content. Falls back to the active frame for the non-URL-load
case where they are the same.

Red spec: PreviewDrawOverlay.test 'snapshots the srcDoc bridge iframe, not the
visible URL-load frame' fails on main (targets the URL frame), passes here.
2026-05-29 11:50:37 +00:00
Jane
9f09d1b649
fix(landing-page): wire up mobile nav toggle on the homepage (#3295)
The homepage runs its own inline header enhancer instead of importing
the shared header-enhancer.astro component, and that inline copy only
ported the scroll-headroom and GitHub stars/version logic — it never
included the hamburger toggle handler. As a result the mobile menu
button rendered (and animated to an X via CSS) but clicking it did
nothing on / and /<locale>/, while sub-pages that do import the shared
enhancer worked fine.

Port the same toggle handler into the homepage inline enhancer: click
flips .is-open on header.nav (which CSS expands into the dropdown panel
below 1080px), and outside-click, Escape, and any in-menu link close it,
keeping aria-expanded in sync.

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-29 10:19:37 +00:00
Guillermo Alcántara
2ea2c91a92
fix(pack): add missing download and host packages to Linux INTERNAL_PACKAGES (#2837)
* fix(pack): add missing download and host packages to Linux INTERNAL_PACKAGES

The native (non-containerized) Linux AppImage build fails with npm 404
errors because @open-design/download and @open-design/host — runtime
dependencies of @open-design/desktop and @open-design/web — were not
included in the INTERNAL_PACKAGES list. Without tarballs for these two
packages, npm install in the assembled app directory tries to resolve
them from the public registry where they don't exist.

Add both packages to INTERNAL_PACKAGES and their build steps to
buildWorkspaceArtifacts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(pack): apply same download/host fix to mac and win lanes, add regression test

1. Extend the INTERNAL_PACKAGES fix to mac/constants.ts, mac/workspace.ts, win/constants.ts, and win/app.ts so all three pack lanes produce tarballs for @open-design/download and @open-design/host.

2. Add internal-packages-coverage.test.ts that derives required workspace runtime deps from apps/desktop and apps/web package.json files and asserts every pack lane's INTERNAL_PACKAGES includes them. This prevents the same drift from recurring when a new workspace dependency is added.

3. Update win-app.test.ts and workspace-build.test.ts mock directory lists to include the two new packages.
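
The coverage check in item 2 could be sketched like this. It is an assumption-laden illustration: the helper names are hypothetical, and workspace packages are assumed to be identified by the `@open-design/` scope, which the commit message does not confirm.

```typescript
// Minimal subset of a package.json manifest.
interface PackageJsonLike {
  dependencies?: Record<string, string>;
}

// Collects runtime dependencies that look like workspace packages (assumed: by scope).
function requiredWorkspaceDeps(
  manifests: readonly PackageJsonLike[],
  scope: string = "@open-design/",
): string[] {
  const names = new Set<string>();
  for (const manifest of manifests) {
    for (const name of Object.keys(manifest.dependencies ?? {})) {
      if (name.startsWith(scope)) names.add(name);
    }
  }
  return [...names].sort();
}

// Returns the required packages that a pack lane's INTERNAL_PACKAGES list lacks.
function missingInternalPackages(
  required: readonly string[],
  internalPackages: readonly string[],
): string[] {
  const have = new Set(internalPackages);
  return required.filter((name) => !have.has(name));
}
```

A test per pack lane would then assert that `missingInternalPackages(...)` is empty, so a newly added workspace dependency fails the suite until the lane's list is updated.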

* fix(pack): include runtime packages in workspace build cache

* fix(pack): install platform with desktop prebundle packages

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 09:44:03 +00:00
Bassiiiii
0c4b7e50be
fix(web/router): defer popstate dispatch to microtask (#2490)
* fix(web/router): defer popstate dispatch to microtask

navigate() previously dispatched a synchronous popstate event after
mutating window.history, which caused React 18 to emit:

  Cannot update a component (Router) while rendering a different
  component (App). To locate the bad setState() call inside App,
  follow the stack trace as described in
  https://react.dev/link/setstate-in-render

This happens whenever a caller invokes navigate() from inside a
useState updater (e.g. App.tsx:479 routing first-run users through
the onboarding panel from inside the setConfig() update). The
synchronous popstate dispatch reaches useRoute() subscribers which
then call setRoute() while the parent component is still rendering.

Defer the popstate dispatch to a microtask. The window.history call
itself stays synchronous so the URL bar updates immediately; only
subscriber updates are pushed past the current render commit, which
removes the warning without changing observable behaviour for any
existing caller.
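
A minimal sketch of the deferral, with the browser objects injected so the ordering is easy to observe. `createNavigate`, `HistoryLike`, and `dispatchPopState` are hypothetical names; in a browser one would presumably pass `window.history` and a callback that dispatches a `PopStateEvent` on `window`. A resolved promise is used here to schedule the microtask; `queueMicrotask` would be an equivalent choice.

```typescript
// Minimal subset of the History API used by the sketch.
interface HistoryLike {
  pushState(data: unknown, unused: string, url: string): void;
}

function createNavigate(history: HistoryLike, dispatchPopState: () => void) {
  return function navigate(url: string): void {
    // The history mutation stays synchronous, so the URL updates immediately.
    history.pushState(null, "", url);
    // Subscriber notification is pushed to a microtask, after the current
    // synchronous work (such as a state updater) has finished.
    void Promise.resolve().then(dispatchPopState);
  };
}
```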

* fix(web/router): cover deferred navigation timing

---------

Co-authored-by: Visionboost <contact@visionboost.fr>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 09:37:55 +00:00
koki
e71938767e
feat(community): add Showcase + Contribute + moderators, restructure nav and footer (#3291)
Adds two new entry points and reworks the community page chrome to
match the wider landing-page direction (PRs #3222 and #3230).

Showcase / Plugin Everything (above Ambassadors)
  Pitches Open Design as 'studio and gallery' in one address: anything
  shipped through the studio (content, products, templates, Skills,
  workflows) can return as a plugin, and the strongest pieces are
  carried out to the registry, to X, to Discord's #showcase channel,
  to the newsletter, and to the video reels. Right column holds a
  zero-code Contribute card with a curl installer, a copy-to-clipboard
  button, and a three-step flow for the od-contribute Skill.

Hero CTA row
  Three buttons, in a single row: Stage your masterpieces (Showcase),
  Become an ambassador (Ambassadors), Contributors hall of fame
  (Maintainers).

Top nav
  Pulls the breadcrumb out of the brand mark, surfaces Contributors /
  Ambassadors / Showcase as anchors, and adds GitHub + X icon buttons
  next to the Join Discord pill (mirrors PR #3230).

Footer
  Restructured into columnar layout with brand summary plus Products,
  Plugins, and Community columns; copyright moves to a bottom rule.

Ambassadors
  Renaissance-voice three-column program (Vocation / Patronage /
  Covenant) with an Apply on Discord CTA to the ambassador channel.

Discord
  Card spans wider (max-width 1440px), copy reframed as 'the front
  line of the agent-design era', two moderator profiles on the right
  (Koki from the founding team, Victor as Discord steward), channel
  list and CTAs on the left.

Recent signal
  Kicker and headline framed as this week's leaderboard; backed by a
  hand-curated RANKING_SNAPSHOT. A real refresh pipeline remains a
  follow-up; data is hand-updated until then.

Other notes
  Punctuation pass: replaced most em-dashes in prose with colons,
  periods, commas, semicolons, parentheses; em-dashes only remain in
  data placeholders, page title, and HTML comments. Logo size bumped
  to 32px and now uses an alt of 'Open Design'.

Co-authored-by: koki yanlai xu <koki@kokideMacBook-Air.local>
2026-05-29 09:20:04 +00:00
Md Mushfiqur Rahim
8ec162bb26
fix: pet hover card gets cut off at screen edges (#2860)
* fix: pet hover card gets cut off at screen edges

* fix: address review feedback - viewport clamping + unadopted pet wake

- Add window.innerHeight check to prevent bottom-edge clipping
- Increase menuH estimate for safer positioning
- Open pet settings instead of no-op Wake for unadopted pets

* fix: address review feedback on pet menu positioning and wake action

- Add viewport height check (viewH) to prevent bottom-edge clipping
- Increase menuH estimate for safer positioning
- Open pet settings instead of no-op Wake for unadopted pets
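
The viewport clamping described above can be sketched as a pure function. This is a minimal illustration, not the code from the PR: `clampMenuPosition` and the `margin` value are hypothetical, and in the browser `viewW` / `viewH` would come from `window.innerWidth` / `window.innerHeight`.

```javascript
// Hypothetical helper: keep a popup of size (menuW x menuH) inside the viewport.
// `margin` is an assumed padding from the screen edge, not a value from the commit.
function clampMenuPosition(x, y, menuW, menuH, viewW, viewH, margin = 8) {
  // Largest top-left coordinates that still keep the menu fully visible.
  const maxX = Math.max(margin, viewW - menuW - margin);
  const maxY = Math.max(margin, viewH - menuH - margin);
  return {
    left: Math.min(Math.max(x, margin), maxX),
    top: Math.min(Math.max(y, margin), maxY),
  };
}

// A pet near the bottom-right corner of a 1280x720 window:
const pos = clampMenuPosition(1250, 700, 200, 160, 1280, 720);
console.log(pos); // { left: 1072, top: 552 }
```

Overestimating `menuH`, as the commit does, errs on the side of moving the card further from the bottom edge.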
2026-05-29 09:05:51 +00:00
byte92
cdf34897ba
add comment composer keyboard submit shortcut (#2941)
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:46:15 +00:00
Sriram Sivakumar
0bd07b2a3d
fix(daemon): grok-build — pass prompt inline as -p value, drop stdin (#2259)
* fix(daemon): grok-build runtime — pass prompt inline as -p value, drop stdin

Grok Build CLI 0.1.212 enforces `-p, --single <PROMPT>` as a value-requiring
flag — invoking with bare `-p` and piping the prompt to stdin now fails with:

  error: a value is required for '--single <PROMPT>' but none was supplied

The previous runtime def used `promptViaStdin: true` + `buildArgs` returning
`['-p']`, which only worked against earlier grok builds that read the prompt
from stdin when `-p` had no inline value.

This change inlines the prompt as the `-p` argument value and flips
`promptViaStdin: false`. Linux `MAX_ARG_STRLEN` (128 KB) is enough headroom
for typical Open Design prompts; if we ever hit `E2BIG` on a very large
brief, a follow-up could shell out to `--prompt-file <tempfile>`.

Verified against grok 0.1.212 (b7b8204a4) — single-turn invocations now
return clean text replies instead of exit 2.

* fix(daemon): declare grok-build argv prompt budget + regression coverage

@mrcfps' review on #2259 flagged that moving the Grok Build adapter from
the (no-longer-working) stdin path to argv would regress oversized
composed prompts from the actionable AGENT_PROMPT_TOO_LARGE error we
already emit for DeepSeek to a raw spawn ENAMETOOLONG / E2BIG instead.
Fixed by mirroring the DeepSeek argv-budget shape:

- grok-build.ts: `maxPromptArgBytes: 30_000` (same headroom as DeepSeek,
  ~2.7 KB under the Windows CreateProcess 32_767-char cap) so
  `checkPromptArgvBudget` pre-flights composed prompts (system + history
  + skills + design-system content + user message) before spawn.
- prompt-budget.ts: Grok-Build-specific message — names the `-p /
  --single` flag, the xAI CLI 0.1.212+ behavior change, and points the
  user at stdin-capable adapters (claude / codex / hermes) when they
  need to ship large local context.
- Tests: 3 new vitest cases in prompt-budget.test.ts — pin the budget
  field, exercise the strict-overrun + at-limit + CJK byte-count guards
  exactly like the DeepSeek regression set, and assert the Grok-named
  diagnostic copy. New `grokBuild` + `grokBuildMaxPromptArgBytes`
  helpers exported alongside the existing `deepseek*` ones.

All 23 prompt-budget tests pass locally (`pnpm exec vitest run
tests/runtimes/prompt-budget.test.ts`).
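
A rough sketch of the byte-budget pre-flight described above, assuming the limit is compared against the UTF-8 byte length and that a prompt exactly at the limit is accepted. `checkArgvBudget` is a hypothetical stand-in for `checkPromptArgvBudget`, whose real signature is not shown in this commit.

```javascript
// Assumed budget, taken from the commit text (`maxPromptArgBytes: 30_000`).
const MAX_PROMPT_ARG_BYTES = 30_000;

// Hypothetical stand-in for the daemon's pre-flight check: measure the prompt in
// UTF-8 bytes (not UTF-16 code units) and reject only a strict overrun.
function checkArgvBudget(prompt, maxBytes = MAX_PROMPT_ARG_BYTES) {
  const bytes = new TextEncoder().encode(prompt).length;
  return bytes > maxBytes
    ? { ok: false, code: "AGENT_PROMPT_TOO_LARGE", bytes, maxBytes }
    : { ok: true, bytes, maxBytes };
}

console.log(checkArgvBudget("a".repeat(30_000)).ok); // true  (exactly at the limit)
console.log(checkArgvBudget("a".repeat(30_001)).ok); // false (one byte over)
// 10,001 CJK characters are 10,001 UTF-16 units but 30,003 UTF-8 bytes.
console.log(checkArgvBudget("設".repeat(10_001)).ok); // false
```

Counting bytes rather than `prompt.length` is what makes the CJK case fail before spawn instead of surfacing as a raw E2BIG.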

---------

Co-authored-by: Sriram Sivakumar <sriram155@gmail.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:45:57 +00:00
saifullakhan
73b2dc853f
Fix project empty state create action (#3082)
Co-authored-by: saifulla-khan <saifulla-khan@users.noreply.github.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:30:43 +00:00
maybeyourking
881571dea7
fix(media): route custom-image edits through images API (#3087)
* fix(media): route custom-image edits through images API

* fix(media): normalize custom-image endpoint suffixes

---------

Co-authored-by: Artist Ning <dingkuake@yeah.net>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:09:44 +00:00
Aria Shishegaran
fe58db2ba1
fix(web): target comment picker elements precisely (#3263)
Resolve Comment picker hit testing against meaningful visible DOM leaves before falling back to annotated ancestors, while preserving Inspect mode's annotation-first selector behavior.

Filter generated React root annotations from Comment targets, keep real element bounds separate from hoverPoint, and avoid rendering the comments drawer inline when a configured dock portal is not mounted.
2026-05-29 07:47:08 +00:00
Mason
1006efa2f6
Improve onboarding AMR runtime card (#3276)
* Improve onboarding AMR runtime card

* Fix onboarding AMR test expectations
2026-05-29 07:45:23 +00:00
freshtemp-labs
593bf2f03c
fix(composer): ellipsis overflow for referenced filenames (#3269)
Inline @-mentioned filenames in the composer can be very lengthy,

causing line wrapping and visual crowding in the input area.

- Switch display from inline to inline-block for max-width
- Cap width at min(240px, 25vw)
- Apply overflow: hidden + text-overflow: ellipsis + white-space: nowrap
- Remove box-decoration-break (unused since content won't wrap)
- Tweak vertical-align for consistent inline-block alignment

Closes #3261

Co-authored-by: freshtemp-labs <freshtemp-labs@users.noreply.github.com>
2026-05-29 07:41:16 +00:00
ziyan2006
071db7ca1b
[codex] Stabilize HTML deck navigation state (#3142)
* fix: stabilize html deck navigation state

* fix: avoid misclassifying transform decks as scroll decks

* fix: detect default root-scroller decks

---------

Co-authored-by: Nongzi <3051966228@qq.com>
2026-05-29 07:41:10 +00:00
elihahah666
be09fe92da
fix: keep settings/handoff/avatar buttons fixed to the right in project header (#3279)
Move the three buttons (settings, handoff, avatar) from fileActionsBefore
to the actions slot so they always stay pinned to the right edge of the
header, regardless of how many extra controls (Share, Present, etc.) are
injected via portal during HTML preview.

Co-authored-by: qiongyu1999 <2694684348@qq.com>
Co-authored-by: Claude Opus 4 <noreply@anthropic.com>
2026-05-29 07:33:57 +00:00
youcef zr
d6d42c3600
fix(pack): bundle download and host packages in Linux AppImage assembly (#2845)
The Linux AppImage path assembles INTERNAL_PACKAGES as `file:` tarballs
and runs `npm install --omit=dev` in an isolated app directory. `pnpm
pack` rewrites each tarball's `workspace:*` refs to a concrete version,
so any runtime @open-design/* dependency missing from INTERNAL_PACKAGES
is resolved from the public npm registry and 404s.

Linux ships webOutputMode "server" and tarball-installs every
INTERNAL_PACKAGES entry, including @open-design/desktop and
@open-design/web. @open-design/host (dep of web + desktop, added in
#2246) and @open-design/download (dep of desktop, added in #2677) landed
after the Linux package list was written and were never added to it, so
`pnpm exec tools-pack linux build --to appimage` fails with:

  npm error 404 Not Found - GET .../@open-design%2fdownload

mac/win default to "standalone", where desktop/web/packaged/daemon are
prebundled with esbuild and excluded from the tarball install
(shouldInstallInternalPackageFor{Mac,Win}Prebundle). The packages they
do install have no download/host dependency, so those lanes correctly
omit them and need no change — this fix stays scoped to linux.ts and
touches no mac/win or workspace-build code.

Add both packages to the Linux INTERNAL_PACKAGES and build them in
buildWorkspaceArtifacts (download depends on platform). Add a cross-lane
regression test that, for each lane, derives the set it actually installs
(honoring the standalone prebundle exclusion) and asserts that set is
closed under its runtime @open-design/* dependencies. The test is red on
the linux lane without this fix and green with it, while mac/win pass
either way — encoding why only Linux needs these packages.
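
The closure property that the regression test asserts can be sketched as follows. The dependency map is illustrative and only loosely based on the relationships named above; `missingInternalDeps` and the `@open-design/platform` package name are assumptions, not code from the repository.

```javascript
// Returns every runtime @open-design/* dependency that an installed package
// needs but the install set does not contain. An empty result means the set
// is closed under its internal dependencies.
function missingInternalDeps(installed, runtimeDeps) {
  const have = new Set(installed);
  const missing = new Set();
  for (const pkg of installed) {
    for (const dep of runtimeDeps[pkg] ?? []) {
      if (dep.startsWith("@open-design/") && !have.has(dep)) missing.add(dep);
    }
  }
  return [...missing].sort();
}

// Illustrative dependency map; the real package manifests may differ.
const runtimeDeps = {
  "@open-design/desktop": ["@open-design/host", "@open-design/download"],
  "@open-design/web": ["@open-design/host"],
  "@open-design/download": ["@open-design/platform"],
};
const before = ["@open-design/desktop", "@open-design/web", "@open-design/platform"];
const after = [...before, "@open-design/host", "@open-design/download"];

console.log(missingInternalDeps(before, runtimeDeps)); // [ '@open-design/download', '@open-design/host' ]
console.log(missingInternalDeps(after, runtimeDeps)); // []
```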
2026-05-29 07:25:03 +00:00
lefarcen
da19ff3ca0
feat(mocks): replay-based mock CLIs for 14 of OD's supported agents (opencode/codex/claude/gemini/cursor-agent/deepseek/qwen/grok + ACP family devin/hermes/kilo/kimi/kiro/vibe) (#3241)
* feat(mocks): replay-based mock CLIs for opencode/claude/codex/deepseek/qwen/grok

Drops in a `mocks/` top-level dir that pretends to be the real agent
CLIs by streaming pre-recorded sessions in each CLI's native stdout
protocol. Zero LLM tokens.

## Use cases

- **E2E tests** in `apps/daemon/tests/` — exercise the full chat-server
  pipeline against a known trace, assert UI events / artifacts.
- **Self-validation during dev** — iterate on `claude-stream.ts` /
  `json-event-stream.ts` parser changes without burning provider budget.
- **Regression harness** — replay the same trace before and after a
  charter / parser change; diff the daemon events the UI surfaces.
- **Demo / onboarding** — show what a 17-tool claude editing session
  looks like end-to-end, offline.

## How

- 6 bash wrappers (`mocks/bin/`) shadow the real CLIs when PATH-overlaid.
- `mocks/mock-agent.mjs` reads `mocks/recordings/<trace>.jsonl`, picks
  one via env var (`SYNCLO_EXPLORE_MOCK_TRACE` / `_POOL` /
  `_BY_PROMPT_HASH`), streams the trace in the requested format.
- Each format renderer matches the EXACT JSON shape the OD daemon
  parser expects, verified line-by-line against
  `apps/daemon/src/{json-event-stream,claude-stream}.ts`:

  | CLI                       | streamFormat              | parser source                              |
  | ------------------------- | ------------------------- | ------------------------------------------ |
  | `opencode`                | `json-event-stream`       | `handleOpenCodeEvent`                      |
  | `codex`                   | `json-event-stream`       | `handleCodexEvent`                         |
  | `claude`                  | `claude-stream-json`      | `createClaudeStreamHandler`                |
  | `deepseek` `qwen` `grok`  | `plain`                   | `server.ts` (raw stdout)                   |

## Quick start

```bash
export PATH="$PWD/mocks/bin:$PATH"
export SYNCLO_EXPLORE_MOCK_TRACE=04097377   # 8-char prefix OK
export SYNCLO_EXPLORE_MOCK_NO_DELAY=1

echo "any prompt" | opencode run
echo "any prompt" | claude -p --output-format=stream-json
echo "any prompt" | codex exec
```

The mock binary announces the picked trace id on stderr:
`[mock-opencode] picked 04097377… via fixed`.

Recording selection (env, in priority order):
- `SYNCLO_EXPLORE_MOCK_TRACE=<id>` — fixed (prefix OK)
- `SYNCLO_EXPLORE_MOCK_BY_PROMPT_HASH=1` + stdin prompt — `sha256(prompt) % N`
- `SYNCLO_EXPLORE_MOCK_POOL=<tag>` — random within `agent:claude` /
  `skill:agent-browser` / `outcome:failed` / etc.
- (default) uniform random
- `SYNCLO_EXPLORE_MOCK_SEED=<str>` — reproducible "random"
- `SYNCLO_EXPLORE_MOCK_NO_DELAY=1` — skip inter-event waits

## Dataset

179 anonymized Langfuse traces from this project's own production
telemetry:

- 9 agents: claude 57 · opencode 41 · codex 38 · gemini 25 ·
  cursor-agent 11 · qwen 2 · copilot 2 · deepseek 2 · antigravity 1
- outcomes: succeeded 144 · failed 35
- skills: default 71 · ad-creative 50 · algorithmic-art 30 ·
  agent-browser 22 · video-hyperframes 2 · plus magazine-web-ppt /
  brainstorming / data-report / penpot-flutter-design-source 1 each
- 124 multi-turn (sessions with ≥2 turns)
- 18 produce `<artifact>` output
- ~4.5 MB on disk total

Anonymization: `/Users/<name>/` → `${HOME}/`,
`C:\Users\<name>\` → `%USERPROFILE%\`, project UUIDs →
stable `proj-001`, `proj-002`, …. Tool input/output payloads
preserved verbatim (templated UI, no cell-level PII).
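
The three rewrite rules above could look roughly like this. It is a sketch only: the real anonymizer lives in the external exporter, `anonymize` is a hypothetical name, and treating every UUID-shaped string as a project id is an assumption.

```javascript
// Hypothetical anonymizer following the three rules listed above.
// `projectIds` maps each UUID to a stable placeholder across calls.
function anonymize(text, projectIds = new Map()) {
  const uuid = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;
  return text
    // macOS home directories: /Users/<name>/ becomes ${HOME}/
    .replace(/\/Users\/[^/\s]+\//g, () => "${HOME}/")
    // Windows home directories: C:\Users\<name>\ becomes %USERPROFILE%\
    .replace(/C:\\Users\\[^\\\s]+\\/g, () => "%USERPROFILE%\\")
    // UUIDs become proj-001, proj-002, ... in first-seen order.
    .replace(uuid, (id) => {
      const key = id.toLowerCase();
      if (!projectIds.has(key)) {
        projectIds.set(key, `proj-${String(projectIds.size + 1).padStart(3, "0")}`);
      }
      return projectIds.get(key);
    });
}

const ids = new Map();
console.log(anonymize("/Users/alice/od/3f2b8c1e-9a4d-4c6b-8e2f-1a2b3c4d5e6f/index.html", ids));
// ${HOME}/od/proj-001/index.html
console.log(anonymize("C:\\Users\\bob\\od\\notes.txt", ids));
// %USERPROFILE%\od\notes.txt
```

Sharing one `Map` across a whole trace is what keeps the `proj-NNN` ids stable within that recording.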

## Smoke test

`bash mocks/scripts/smoke-test.sh` — 6 checks across all 6 agents.
All pass on this branch (verified locally):

```
  ✓ opencode first event = step_start
  ✓ codex first event = thread.started
  ✓ claude first event = system
  ✓ deepseek emitted plain text (144 chars on first line)
  ✓ qwen emitted plain text (144 chars on first line)
  ✓ grok emitted plain text (144 chars on first line)
All mock CLIs working. 
```

## Adding more recordings

The exporter that produced this set lives in
[nexu-io/agent-pr-explore](https://github.com/nexu-io/agent-pr-explore)
(see `cli/src/local/orchestrator/langfuse-import.ts` + the `local
langfuse-import` CLI command). Operators with the Langfuse keys can pull
more by tag / outcome / artifact / multi-turn filter, then run
`local recordings anonymize --out-dir ~/Documents/open-design/mocks/recordings`.
`mocks/README.md` has the full instructions.

## Out of scope (follow-ups)

- **ACP agents** (`devin`, `hermes`, `kilo`, `kimi`, `kiro`, `vibe`) need
  a JSON-RPC server on stdio rather than a one-shot stream — separate
  `format-acp.mjs` module not yet written.
- **Per-agent json-event-stream variants** (`cursor-agent`, `gemini`,
  `qoder`, `copilot`, `pi`) currently fall back to the `plain` renderer;
  their parsers are in `apps/daemon/src/json-event-stream.ts` and follow
  the same template as `format-codex.mjs`.

## AGENTS.md updates

- Added `mocks/` to the top-level content directories listing
- Added a Validation strategy bullet pointing here for agent-stream /
  parser changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks): add opencode-cli/kiro-cli/vibe-acp bin aliases and unref ACP timeout

- Add mocks/bin/opencode-cli, kiro-cli, vibe-acp wrappers for the primary
  RuntimeAgentDef bin names OD resolves before any fallback. Without these,
  a PATH-overlaid OD daemon run bypasses the mock entirely (opencode-cli,
  kiro-cli) or cannot find the mock at all (vibe-acp, which has no fallback).
- Include opencode-cli, kiro-cli, vibe-acp in the smoke-test ACP/JSON loop
  so coverage is verified end-to-end.
- Call .unref() on the 30s safety timeout in format-acp.mjs so a completed
  ACP session exits promptly instead of waiting the full 30 seconds.
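
For context on the last bullet, a minimal Node.js sketch of the `.unref()` pattern. The timeout body is illustrative and not taken from `format-acp.mjs`.

```javascript
// Safety net: force-exit if the session never completes (30 s, as in the commit).
const safetyTimer = setTimeout(() => {
  console.error("mock ACP session timed out");
  process.exit(1);
}, 30_000);

// Without unref(), this pending timer alone keeps the event loop alive for 30 s.
// With it, the process exits as soon as all other work is finished.
safetyTimer.unref();

console.log(safetyTimer.hasRef()); // false
```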

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* feat(mocks): add vela (AMR) — login / models / ACP with strict set_model gate

Extends mocks/ to cover OD's own AMR runtime. `vela` is the bin name
`apps/daemon/src/runtimes/defs/amr.ts` specifies (`bin: 'vela'`,
`streamFormat: 'acp-json-rpc'`). It's richer than the generic ACP
agents — covers full login + models + chat-session lifecycle.

### What vela does (mirrored from apps/daemon/tests/fixtures/fake-vela.mjs)

1. `vela login` — writes ~/.amr/config.json with a fake profile (controlKey,
   runtimeKey, user{email,name,plan}, profile-specific apiUrl/linkUrl).
   The on-disk projection is what OD's daemon login route + AmrLoginPill
   poller read; production goes through device-auth, the mock skips
   straight to the file write.

2. `vela models` — prints the production-shaped public model catalog as
   newline-separated `public_model_*    vela` lines. Override via
   FAKE_VELA_MODELS env.

3. `vela agent run --runtime opencode` — ACP JSON-RPC server with three
   vela-specific protocol extensions:

   a. `initialize` response carries `agentCapabilities`
      (`promptCapabilities.embeddedContext`) + `models`
      (`currentModelId` + `availableModels`).
   b. `session/new` response carries the same `models` block.
   c. **Strict set_model gate**: `session/prompt` is rejected with
      JSON-RPC -32602 ("session/set_model must be called before
      session/prompt") UNLESS `session/set_model` (or
      `session/set_config_option`) has been called for the current
      sessionId. Mirrors real vela 0.0.1 contract; catches regressions
      in `attachAcpSession` that silently skip set_model.
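
The strict gate in (c) can be sketched as a small request handler. This is an assumption-laden reduction, not the contents of `format-vela.mjs`: `createGate` is a hypothetical name and the success result shapes are placeholders.

```javascript
// Minimal sketch of the gate: remember which sessions have selected a model and
// reject session/prompt for any session that has not.
function createGate() {
  const ready = new Set();
  return function handle(request) {
    const { id, method, params = {} } = request;
    if (method === "session/set_model" || method === "session/set_config_option") {
      ready.add(params.sessionId);
      return { jsonrpc: "2.0", id, result: {} };
    }
    if (method === "session/prompt" && !ready.has(params.sessionId)) {
      return {
        jsonrpc: "2.0",
        id,
        // -32602 is the JSON-RPC "Invalid params" code named in the commit.
        error: { code: -32602, message: "session/set_model must be called before session/prompt" },
      };
    }
    return { jsonrpc: "2.0", id, result: { accepted: true } }; // placeholder result shape
  };
}

const gate = createGate();
console.log(gate({ id: 1, method: "session/prompt", params: { sessionId: "s1" } }).error.code); // -32602
gate({ id: 2, method: "session/set_model", params: { sessionId: "s1" } });
console.log(gate({ id: 3, method: "session/prompt", params: { sessionId: "s1" } }).result); // { accepted: true }
```

Keying the set on `sessionId` means a model selected for one session does not unlock prompts on another.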

### Error injection envs (in sync with fake-vela.mjs)

  FAKE_VELA_SESSION_ID            - sessionId returned by session/new
  FAKE_VELA_TEXT                  - override assistant text
  FAKE_VELA_THOUGHT               - optional thought_chunk before text
  FAKE_VELA_SESSION_NEW_ERROR     - fail session/new
  FAKE_VELA_SET_MODEL_ERROR       - fail session/set_model
  FAKE_VELA_PROMPT_ERROR          - fail session/prompt
  FAKE_VELA_REQUIRE_SET_MODEL='0' - disable the strict gate (legacy)
  FAKE_VELA_LOGIN_USER_EMAIL      - email written into config profile
  FAKE_VELA_LOGIN_USER_PLAN       - plan written into config profile
  FAKE_VELA_LOGIN_DELAY_MS        - sleep before write (test in-flight)
  FAKE_VELA_LOGIN_FAIL            - print + exit 1
  FAKE_VELA_MODELS                - override models stdout
  VELA_PROFILE                    - profile slot (prod | test | local)

### Components

`mocks/lib/format-vela.mjs` (~205 LOC)
  - Full ACP server with vela protocol extensions
  - Strict set_model gate
  - Error injection plumbing

`mocks/lib/vela-subcommands.mjs` (~90 LOC)
  - runVelaLogin() — writes ~/.amr/config.json
  - runVelaModels() — prints catalog

`mocks/bin/vela` — dispatcher wrapper. Forwards `vela <subcmd>` to
mock-agent.mjs which routes to login/models or falls through to ACP.

`mocks/mock-agent.mjs` — parseArgs now collects positionals so the vela
dispatcher can read subcommand from there; switch case added for vela.

`mocks/scripts/smoke-test.sh` — +4 assertions:
  vela models prints ≥10 catalog lines
  vela login writes ~/.amr/config.json with the requested email
  vela agent run ACP roundtrip (initialize+models+set_model+stream+result)
  vela strict set_model gate rejects prompt without prior set_model

### Verified locally

  ✓ vela models printed 15 catalog lines
  ✓ vela login wrote ~/.amr/config.json with profile.prod.user.email
  ✓ vela agent run ACP roundtrip (initialize+models, set_model accepted, prompt streamed)
  ✓ vela strict set_model gate rejects session/prompt without prior set_model

All 21 smoke checks pass (up from 17 with previous P3 ACP commit).

### AGENTS.md + README updates

  AGENTS.md — mention `vela (AMR — vela CLI)` alongside ACP agents in
  the directory listing entry.
  mocks/README.md — protocol table row + dedicated vela section with
  subcommand contract, strict gate explanation, env-injection cheat
  sheet. Mock-tree listing updated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks): honor REPORT_FILE env when --report-file flag not given

Harnesses that pass their report path as the REPORT_FILE env var
instead of the mock's --report-file flag (notably
nexu-io/agent-pr-explore's orchestrator, per the existing
opencode/claude/codex agent launchers) never got a report file
written. The harness's "agent exit 0 but produced no report" check
therefore always fired and marked mock runs as failed, even though
the stdout stream was complete.

Fix: in mock-agent.mjs parseArgs, fall through to process.env.REPORT_FILE
when --report-file wasn't provided on argv. Each format renderer already
accepts opts.reportFile and writes the recording's final assistant text
to it (`format-*.mjs` already had this — only the wiring was missing).
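
The fallback amounts to something like the following. `resolveReportFile` is a hypothetical reduction of `parseArgs`, and the `--report-file <path>` two-token form is an assumption about how the flag is passed.

```javascript
// Hypothetical reduction of the mock's argument parsing: an explicit
// --report-file flag wins, otherwise the REPORT_FILE env var is used.
function resolveReportFile(argv, env) {
  const i = argv.indexOf("--report-file");
  if (i !== -1 && argv[i + 1] !== undefined) return argv[i + 1];
  return env.REPORT_FILE || null;
}

console.log(resolveReportFile(["run", "--report-file", "out.md"], { REPORT_FILE: "env.md" })); // out.md
console.log(resolveReportFile(["run"], { REPORT_FILE: "env.md" })); // env.md
console.log(resolveReportFile(["run"], {})); // null
```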

Verified: synclo-explore run with `mock=true, mock_trace=04097377`
against the opencode wrapper now produces a plan.md with the recording's
17-tool claude editing session report. ~1.5s per run vs ~70s real opencode.

* mocks: move recordings to Cloudflare R2; PR→main→Action upload path

The 179-recording corpus (~4.5 MB raw, ~280 KB after compression) has
been moved off git into Cloudflare R2 at the bucket open-design-mocks
under recordings/v1/. The repo now ships:

- mocks/manifest.json — the canonical catalog (renamed from
  recordings/index.json) with sha256 + storage hints; consumers
  fetch this to discover what exists, then pull individual jsonl
  files on demand
- mocks/scripts/fetch-recordings.sh — parallel, sha256-verified,
  idempotent puller for the public r2.dev URL
- mocks/scripts/add-recording.sh — local maintainer helper that
  validates a new .jsonl and copies it into recordings-staging/
  (no R2 calls; no credentials needed)
- mocks/scripts/upload-to-r2.mjs — called only by the CI workflow
- mocks/scripts/lib/manifest-utils.mjs — shared sha256/meta/
  rebuild-histograms logic, used by both add-recording (preview)
  and upload-to-r2 (actual write) so the entry shape never drifts
- .github/workflows/sync-mocks-to-r2.yml — fires on push to main
  when mocks/recordings-staging/ changes; uploads to R2, updates
  manifest, commits cleanup back; serialized via concurrency group

Trust model: R2 write credentials (CLOUDFLARE_API_TOKEN,
CLOUDFLARE_ACCOUNT_ID) are repo secrets; nobody can push from a
laptop. Read stays public via the r2.dev URL.

Why not pnpm install integration: contributors who do not touch
agent code do not pay the fetch cost. Fetch happens on first
smoke-test run (auto-fallback) or when a mock spawn needs data.

Repo size: -4.55 MB net (delete 179 jsonl, +280 KB manifest +
scripts). Smoke test (21 checks) still green against the fetched
corpus.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: scope R2 write token to a dedicated secret name

Use CLOUDFLARE_R2_MOCKS_TOKEN (instead of reusing the shared
CLOUDFLARE_API_TOKEN that landing-page-*.yml uses for Pages deploys)
so the R2 write capability can be scoped to just the
open-design-mocks bucket without bleeding extra capability into the
Pages workflows.

Also hardcode the powerformer CF account_id directly in the workflow
(account IDs are not secret and the shared CLOUDFLARE_ACCOUNT_ID
secret may point at a different account).

Workflow now fails fast with an actionable error message + dashboard
link if the secret is unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: switch R2 sync to S3-compat API (wrangler getMemberships gate)

wrangler 4.x calls /memberships before any r2 action, requiring
user:read scope. R2 "Object Read & Write" tokens deliberately lack
that scope (defense in depth — a leaked token should not enumerate
account-level resources). The workflow now uses the aws CLI talking
straight to the R2 S3-compatible endpoint with SigV4, no membership
lookup.

Secret rotation: CLOUDFLARE_R2_MOCKS_TOKEN (Bearer) is replaced by
CLOUDFLARE_R2_MOCKS_AK / CLOUDFLARE_R2_MOCKS_SK (matching the
existing CLOUDFLARE_R2_RELEASES_AK/SK naming convention). End-to-end
tested locally: PUT recording → manifest rebuild → manifest PUT →
staging cleanup all green.

aws CLI is pre-installed on ubuntu-latest, so no install step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: scrub synclo namespace; use OD_MOCKS_* env prefix throughout

These mocks were copy-pasted from synclo-explore, where they
originated, and inherited the SYNCLO_EXPLORE_MOCK_* env-var
convention. That brand-bleed is not appropriate in OD: rename the
public env surface to OD_MOCKS_* (matching OD-native prefixes like
OD_MOCKS_CACHE_DIR, OD_TRACE_R2_UPLOAD, OD_EXPECT_TIMEOUT_SECONDS).

Renames:
  SYNCLO_EXPLORE_MOCK_TRACE             → OD_MOCKS_TRACE
  SYNCLO_EXPLORE_MOCK_BY_PROMPT_HASH    → OD_MOCKS_BY_PROMPT_HASH
  SYNCLO_EXPLORE_MOCK_POOL              → OD_MOCKS_POOL
  SYNCLO_EXPLORE_MOCK_SEED              → OD_MOCKS_SEED
  SYNCLO_EXPLORE_MOCK_NO_DELAY          → OD_MOCKS_NO_DELAY
  SYNCLO_EXPLORE_MOCK_RECORDINGS_DIR    → OD_MOCKS_RECORDINGS_DIR
  SYNCLO_EXPLORE_MOCK_SMOKE_TRACE       → OD_MOCKS_SMOKE_TRACE
  SYNCLO_OD_MOCKS_I_KNOW_WHAT_IM_DOING  → OD_MOCKS_ALLOW_LOCAL_UPLOAD

Also drop the inline harvester usage from README. The harvester is an
external CLI in nexu-io/agent-pr-explore — its README is the right
place for langfuse-import flags, anonymization options, etc. OD only
documents its own staging→PR→Action workflow.

Smoke test (21 checks) still green; OD_MOCKS_TRACE end-to-end
verified to route correctly.

Consumers of the OLD env names (notably the orchestrator in
nexu-io/agent-pr-explore) need a matching rename. No back-compat
shim here — the explore side has zero external users today and a
one-line follow-up is cleaner than a permanent deprecation layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* AGENTS.md: align mock env names with mocks/ rename (SYNCLO_* → OD_MOCKS_*)

Missed in the prior commit (a30b868a) — only grepped mocks/ subdir.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: drop staging dir + GH Action; back to local-script upload

The staging-dir + Action design (added earlier in this PR) had a flaw
the user caught: new recordings briefly entered the repo on their way
through staging, leaving them in git history forever even after the
Action cleanup commit removed them from HEAD. That defeats the whole
point of moving recordings to R2.

Replace with the simpler local-maintainer flow:

  bash mocks/scripts/upload-recording.sh /path/to/<trace>.jsonl
  # → validates, wrangler r2 put, updates manifest.json, wrangler r2 put manifest
  git add mocks/manifest.json && git commit && git push
  # → only the ~200B manifest delta enters git

The wrangler-OAuth gate replaces the CI secret + Action duo. For a
solo / small maintainer team this collapses the trust chain down to
"do you have wrangler login to the powerformer account?" — no GH
secrets to rotate, no concurrency window to worry about, no
inevitable repo-history bloat.

Deletes:
- .github/workflows/sync-mocks-to-r2.yml
- mocks/scripts/upload-to-r2.mjs   (CI-only)
- mocks/scripts/add-recording.sh   (staging helper, now obsolete)
- mocks/recordings-staging/        (empty dir, never to be repopulated)

Adds:
- mocks/scripts/upload-recording.sh

Kept:
- mocks/scripts/fetch-recordings.sh
- mocks/scripts/lib/manifest-utils.mjs (still used by upload-recording.sh)
- mocks/manifest.json (committed; the only mocks artifact in git)

End-to-end tested locally: re-uploading an existing recording is
idempotent, manifest math is stable, fetch + smoke test still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: address review — guard allowlist + safe ~/.amr + loud OD_MOCKS_TRACE typo

Three concrete issues raised across recent Siri-Ray (Looper) review
threads on #3241:

1. scripts/guard.ts only allowlisted mocks/lib/ + mocks/mock-agent.mjs,
   leaving mocks/scripts/lib/manifest-utils.mjs outside the residual-
   JS guard. Result: Preflight fail on every push. Extend the allowlist
   to mocks/scripts/ — same precedent as the lib/ entry directly above.

2. mocks/scripts/smoke-test.sh moved the caller's real ~/.amr to
   ~/.amr-smoke-backup, ran vela login (which writes a fake config),
   then ran rm -rf on the .amr and restored the backup. Two failure
   modes: a crash mid-run loses the user's real config, and re-running
   before restore overwrites the backup with the fake login. Fix:
   sandbox vela login into a mktemp -d HOME via env
   (HOME=$amr_sandbox vela login), so the real ~/.amr is never
   touched. A trap cleans up afterwards.

3. mocks/lib/recording-picker.mjs silently fell through to
   prompt-hash → pool → random when OD_MOCKS_TRACE was set but did
   not match any recording (typo, prefix too short, corpus not
   fetched). Tests using a pinned trace would silently get a
   different trace, hiding regressions. Fix: throw an explicit error
   with the failing value + a pointer at fetch-recordings.sh.
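
The loud-fail policy from item 3 might look like this in outline. `pickPinnedTrace` is a hypothetical name, the recording objects are simplified to an `id` field, and first-match-wins on ambiguous prefixes is an assumption.

```javascript
// Hypothetical picker step: resolve a pinned trace id (prefix allowed) or throw,
// instead of silently falling through to another selection mode.
function pickPinnedTrace(recordings, pinned, dir) {
  const match = recordings.find((r) => r.id.startsWith(pinned));
  if (!match) {
    throw new Error(
      `mock-agent: OD_MOCKS_TRACE=${pinned} set but no matching recording in ${dir}; ` +
        "try mocks/scripts/fetch-recordings.sh"
    );
  }
  return match;
}

// Illustrative ids; only the 8-character prefixes appear in this log.
const recordings = [{ id: "04097377aaaa" }, { id: "314d6833bbbb" }];
console.log(pickPinnedTrace(recordings, "04097377", "mocks/recordings").id); // 04097377aaaa
try {
  pickPinnedTrace(recordings, "0409737x", "mocks/recordings");
} catch (err) {
  console.log(err.message.startsWith("mock-agent: OD_MOCKS_TRACE=0409737x")); // true
}
```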

Verified locally: pnpm guard prints "Residual JavaScript check
passed", smoke-test still 21/21, ~/.amr mtime unchanged after run,
typo on OD_MOCKS_TRACE now produces "mock-agent: OD_MOCKS_TRACE=...
set but no matching recording in <dir>" on stderr.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fetch-recordings: detect empty filter result before line-counting

printf '%s\n' on an empty string emits a single empty line, so the
previous TOTAL=$(printf ... | grep -c "") math returned 1 on an
empty $ENTRIES_TSV — a typo like `--agent no-such-agent` printed
"Fetching up to 1 recordings", downloaded zero, and exited 0
("ready"). Check `-z $ENTRIES_TSV` first.

Reproduced + fix verified per the reviewer thread.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* mocks: address mrcfps review — goldens + provenance + contract check

Three durability improvements suggested in the PR #3241 top-level
review:

## 1. Golden daemon-event snapshots (mocks/golden/*.events.json + apps/daemon/tests/mocks-golden.test.ts)

Smoke-test verified that mocks RUN; that catches crashes but not a
parser change that semantically reshapes the events the daemon emits.

Commit the daemon-event sequence for 3 representative traces:
- claude  314d6833 — median-complexity agent-browser session
- codex   dcdff3b3 — 14-tool refactor
- opencode 9a9522ec — 7-tool data-report

apps/daemon/tests/mocks-golden.test.ts spawns the mock, feeds stdout
through the real createClaudeStreamHandler / createJsonEventStreamHandler,
normalizes per-spawn volatile fields (only sessionId today, only on
claude), and deep-equals against the committed snapshot. A parser
regression fails the test loudly.
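
A simplified picture of the normalize-then-compare step. The event shapes below are invented for illustration, and comparing via `JSON.stringify` is a shortcut standing in for the deep-equal the real test performs.

```javascript
// Replace per-spawn volatile fields with a fixed placeholder so two runs of the
// same trace compare equal. Only sessionId is treated as volatile here.
function normalizeEvents(events) {
  return events.map((e) => ("sessionId" in e ? { ...e, sessionId: "<session>" } : e));
}

// Order-sensitive comparison of the normalized run against the committed snapshot.
function matchesGolden(events, golden) {
  return JSON.stringify(normalizeEvents(events)) === JSON.stringify(golden);
}

// Hypothetical snapshot and two replays of the same trace.
const golden = [
  { type: "status", sessionId: "<session>" },
  { type: "text_delta", text: "hello" },
];
const runA = [{ type: "status", sessionId: "a1b2" }, { type: "text_delta", text: "hello" }];
const runB = [{ type: "status", sessionId: "c3d4" }, { type: "text_delta", text: "hullo" }];

console.log(matchesGolden(runA, golden)); // true
console.log(matchesGolden(runB, golden)); // false (the parser output changed)
```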

After an intentional parser change, regenerate:

  MOCKS_GOLDEN_UPDATE=1 pnpm --filter @open-design/daemon test mocks-golden
  git diff mocks/golden/
  # eyeball; commit if shapes match intent

## 2. Provenance fields on every manifest entry (mocks/scripts/lib/manifest-utils.mjs + mocks/manifest.json)

Augment inspectRecording() to write:

  captured_at         — ISO 8601 from existing meta.timestamp
  cli_version         — null until harvester writes it
  protocol_version    — null until harvester writes it
  anonymization_version — null until harvester writes it

captured_at is now populated for all 179 existing entries from the
meta event the harvester already emits. The harvester in
nexu-io/agent-pr-explore is the next step for cli_version /
protocol_version / anonymization_version. Once those are
populated, consumers can detect when a recording is more than ~1
minor version behind the live CLI and flag it for re-harvest.

No matrix of (cli_version × agent) recordings — that explodes
maintenance. Just metadata per recording so trust decay is visible.

## 3. Real-CLI contract check (mocks/scripts/contract-check.sh + docs/MOCKS-CONTRACT-CHECK.md)

Mocks catch parser regressions against recordings; they do NOT
catch recordings drifting away from the live agent CLI as that CLI
evolves. The contract check spawns the real CLI alongside the mock
with a fixed deterministic prompt + diffs top-level event-type
distributions.
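
The comparison of top-level event-type distributions can be sketched as below. The actual check is a shell script; this is a hedged illustration that assumes each stdout line is a JSON object with a `type` field, and all event names other than `thread.started` are made up.

```javascript
// Count how often each top-level event type appears in a JSONL stream.
function eventTypeCounts(jsonl) {
  const counts = {};
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    let type;
    try {
      type = JSON.parse(line).type ?? "(no type)";
    } catch {
      type = "(non-JSON)";
    }
    counts[type] = (counts[type] ?? 0) + 1;
  }
  return counts;
}

// Report only the event types whose counts differ between real CLI and mock.
function diffCounts(real, mock) {
  const diff = {};
  for (const type of new Set([...Object.keys(real), ...Object.keys(mock)])) {
    const r = real[type] ?? 0;
    const m = mock[type] ?? 0;
    if (r !== m) diff[type] = { real: r, mock: m };
  }
  return diff;
}

const realOut = '{"type":"thread.started"}\n{"type":"item.done"}\n{"type":"item.done"}\n{"type":"turn.done"}\n';
const mockOut = '{"type":"thread.started"}\n{"type":"item.done"}\n';
console.log(diffCounts(eventTypeCounts(realOut), eventTypeCounts(mockOut)));
// { 'item.done': { real: 2, mock: 1 }, 'turn.done': { real: 1, mock: 0 } }
```

A non-empty diff is a prompt for a maintainer to read the streams, not an automatic failure, which matches the human-driven intent described next.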

Deliberately human-driven, not cron-scheduled:
- costs real LLM tokens per invocation
- requires real CLI auth
- maintainer reads the output, not a regex

Suggested triggers per doc: real-CLI release notes mentioning
"output format" / "stream" / "JSON" / "events"; before a parser
refactor; ad-hoc when something looks off.

## Coverage note

README updated to position mocks as "deterministic protocol/parser
coverage" (not "e2e replacement") per mrcfps framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mocks-golden test): drop import of non-exported ParserKind

Use plain string (the type alias is `string` anyway) — Preflight
typecheck on a31fa71a failed:

  tests/mocks-golden.test.ts(29,8): error TS2459: Module
  "../src/json-event-stream.js" declares "ParserKind" locally, but
  it is not exported.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* recording-picker: structured OD_MOCKS_POOL + hard-fail no-match

Siri-Ray review: `OD_MOCKS_POOL=outcome:failed` was documented as a
supported selection knob, but the matcher only checked tags and
`meta.agent`, so the negative-path pool found 0 candidates and
silently fell through to global random, validating against any
recording instead of a failed trace.

Fix:
- Parse `<dim>:<value>` shape and route each dim to the right meta
  field: `outcome` → `meta.outcome`, `agent` → `meta.agent`,
  `skill` → `tags[]`. Bare values still fall back to tag substring.
- If the env was set and matched nothing, throw with the failing
  value and a jq one-liner for inspection. Same loud-fail policy as
  OD_MOCKS_TRACE — silent fallback was the original bug.

Verified locally: outcome:failed, agent:codex, skill:agent-browser
all route correctly; outcome:nonsense throws the explicit error.
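
The routing and loud-fail rules above can be sketched in shell. This is an illustration only, not the repo's matcher: the function names `route_pool_selector` and `require_match`, the printed field notation, and the error wording are all hypothetical.

```shell
# Hypothetical helper (not the repo's matcher): print which metadata field
# a given OD_MOCKS_POOL value should be matched against.
route_pool_selector() {
  selector=$1
  case "$selector" in
    outcome:*) printf 'meta.outcome=%s\n' "${selector#outcome:}" ;;
    agent:*)   printf 'meta.agent=%s\n' "${selector#agent:}" ;;
    skill:*)   printf 'tags[]=%s\n' "${selector#skill:}" ;;
    *)         printf 'tags[]~%s\n' "$selector" ;;  # bare value: tag substring
  esac
}

# Hypothetical loud-fail gate: a selector that was set but matched nothing
# is an error, never a silent fallback to the global pool.
require_match() {
  # $1 = selector, $2 = number of candidate recordings found
  if [ "$2" -eq 0 ]; then
    printf 'OD_MOCKS_POOL=%s matched no recordings\n' "$1" >&2
    return 1
  fi
}

route_pool_selector outcome:failed   # prints meta.outcome=failed
route_pool_selector agent-browser    # prints tags[]~agent-browser
```

Keeping the "matched nothing" check separate from routing makes the failure explicit at the one place where a silent fallback used to happen.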

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* contract-check.sh: fix lost $PROMPT in mock invocation

Siri-Ray review on e576074a: the mock side wrapped its pipeline in
`bash -c "printf %s \"\$PROMPT\" | ..."` — but $PROMPT was a parent
shell variable, not exported, so the child bash expanded it to an
empty string. Result: the contract check sent the real prompt to the
real CLI and an empty string to the mock, defeating the
same-input invariant the whole script rests on. The empty prompt also
let the mock select a random trace whenever a maintainer happened to
have OD_MOCKS_BY_PROMPT_HASH=1 in their env.

Fix: drop the inner bash -c entirely; use a subshell that scopes the
PATH overlay and pipes printf into the PATH-resolved mock binary
directly. The subshell confines the PATH change and still sees the
parent's unexported $PROMPT, so no variable passing is needed.

Verified locally: with prompt-A the mock picks trace 54ec02ee via
hash; prompt-B → 2667e851 via hash; empty prompt (old broken
behavior) → random — confirms the prompt is now actually reaching
the mock under PATH overlay.
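
A minimal reproduction of the underlying shell behavior, assuming nothing about contract-check.sh beyond what is described above. The prompt text and the `/hypothetical/mock/bin` directory are placeholders, and `sh -c` stands in for the original `bash -c`.

```shell
unset PROMPT                      # make sure no exported copy is inherited
PROMPT='draw a login page'        # set, but never exported

# Child shell: expansion is deferred to the child, which never received PROMPT.
child_sees=$(sh -c 'printf %s "${PROMPT:-}"')

# Subshell: a copy of the current shell, so unexported variables stay visible,
# and the PATH overlay disappears when the parentheses close.
path_before=$PATH
subshell_sees=$(
  (
    PATH="/hypothetical/mock/bin:$PATH"
    printf %s "$PROMPT"
  )
)

printf 'child: [%s]\nsubshell: [%s]\n' "$child_sees" "$subshell_sees"
# prints:
#   child: []
#   subshell: [draw a login page]
[ "$PATH" = "$path_before" ] && echo 'PATH unchanged'
```

The subshell gives the same scoping as the inner shell did for PATH, without introducing a process boundary that drops unexported variables.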

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 07:17:20 +00:00
koki
4b7c018a9b
feat(contrib): add od-contribute skill for non-coder contributors (#3172)
* feat(contrib): add od-contribute skill for non-coder contributors

Adds a Claude Code skill at .claude/skills/od-contribute/ that walks any OD
user — including non-coders — through a first-PR contribution flow:

  - Ship a Skill / Design System made with OD
  - Translate README / QUICKSTART / CONTRIBUTING to a new language
  - Fix a typo / dead link / write a use-case blog post
  - Report a high-quality bug (issue path, no PR)

The skill replaces the test-driven dev-loop of auto-github-contributor with
type-specific no-code validators (frontmatter parse, markdown link check,
code-fence balance, structural overlap with reference DESIGN.md files), so
artifact-only contributions don't have to pretend to be code.

This commit only adds files under .claude/ — no product code, no build
config, no runtime dependencies. .gitignore is amended with three explicit
exceptions so the skill is tracked while personal Claude state (sessions,
settings, etc.) stays ignored as before.

Next steps (separate PRs):
  - Wire the OD app to mount this skill for its embedded agent
  - Add a "Ship to GitHub" UI button in OD that invokes /od-contribute

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* feat(contrib): English-by-default skill + zip installer for non-coders

Three follow-ups to the initial od-contribute skill:

1. Skill content is now English with an explicit instruction at the top
   telling the agent to mirror the user's chat language for every user-
   facing prompt. Generated artifacts (PR titles, commit messages, PR/
   issue body) stay English regardless — GitHub convention.

2. tools/od-contribute-installer/ ships a cross-platform installer that
   drops the skill into every supported agent's home dir without the
   user opening a terminal:

     install.command  macOS double-click
     install.bat      Windows double-click
     install.sh       Linux

   Targets covered:
     ~/.claude/skills/od-contribute/        Claude Code (native)
     ~/.claude/commands/od-contribute.md    Claude Code slash command
     ~/.agents/skills/od-contribute/        Codex CLI (canonical)
     ~/.codex/skills/od-contribute/         Codex CLI (legacy, only
                                            written if ~/.codex/ exists)

   Verified Codex CLI reads the same SKILL.md frontmatter format as
   Claude Code (source: openai/codex codex-rs/core-skills/src/loader.rs).
   Added agents/openai.yaml sidecar inside the skill for Codex picker UX.

3. build-zip.sh produces od-contribute-installer.zip (~37KB) from the
   in-repo skill. The zip is meant to be hosted as a GitHub Release
   asset; the marketing site button points at:
     github.com/nexu-io/open-design/releases/latest/download/od-contribute-installer.zip
   (See tools/od-contribute-installer/HOSTING.md for the manual release
   recipe; CI workflow can come later.)

   The zip itself is gitignored — distribute via Releases, not source.

Still no product code touched, no build config changed.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* refactor(contrib): drop zip installer; ship single curl one-liner

Replace tools/od-contribute-installer/ (4 install scripts + zip build
machinery) with a single self-bootstrapping tools/install-od-contribute.sh.

User flow becomes:

  1. Click button on opendesign.so
  2. Modal shows: paste this into your AI agent's chat:

       curl -sSL https://raw.githubusercontent.com/nexu-io/open-design/main/tools/install-od-contribute.sh | bash

  3. Agent runs it via its Bash tool. User never touches a terminal.
  4. /od-contribute is live in their next chat.

Why this is better than the zip approach:

  * Zero downloads visible to the user — no .zip in their Downloads folder
  * Zero unzip step
  * Zero terminal window flash (the agent's Bash tool runs in-process)
  * Zero per-OS installer files (.command/.bat/.sh) to maintain
  * Auto-updates: re-running the one-liner pulls the latest skill from main

The script downloads only the skill subtree (.claude/skills/od-contribute/
and .claude/commands/od-contribute.md) from a GitHub tarball — no `git`
dependency, just curl + tar (universally available).

Targets remain the same:
  ~/.claude/skills/od-contribute/
  ~/.claude/commands/od-contribute.md
  ~/.agents/skills/od-contribute/
  ~/.codex/skills/od-contribute/  (only if ~/.codex/ exists)

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(contrib): remove leftover zip artifact

Build artifact accidentally committed in the previous commit.
Cleaning up so the binary doesn't live in git history.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): make skill work in sandboxed agents (Codex.app, Cursor)

macOS App Sandbox apps like Codex.app cannot reach the system keychain
where `gh auth login` stores the GitHub token by default. Result: the
skill's check-prereqs.sh fails on `gh auth status` with a misleading
"not authenticated" error, even when gh works fine in the user's regular
shell.

Two changes:

1. config.sh: if GH_TOKEN isn't set in the env, fall back to reading a
   .gh-token file at the skill root. Lets a user (or the OD app, or a
   future OAuth Device Flow bootstrapper) drop a token there once and
   have every skill script pick it up automatically.

2. check-prereqs.sh: accept GH_TOKEN-from-env as a valid auth path
   alongside `gh auth status`. When neither works, the error hint now
   shows BOTH options:
     A) gh auth login from a regular terminal (any agent)
     B) gh auth token > <skill>/.gh-token (sandboxed agents)

Verified: in my local Claude Code (where gh has keychain access), the
keychain path still wins and nothing changes. With GH_TOKEN exported,
check-prereqs.sh succeeds without even consulting gh auth status.
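
The token fallback might look roughly like the sketch below. It is not the contents of config.sh: the function name `load_gh_token` and the whitespace trimming are assumptions made for illustration.

```shell
# Hypothetical sketch: prefer GH_TOKEN from the environment, otherwise fall
# back to a .gh-token file at the skill root.
load_gh_token() {
  skill_root=$1
  if [ -z "${GH_TOKEN:-}" ] && [ -r "$skill_root/.gh-token" ]; then
    # Trim whitespace so a trailing newline in the file does not end up
    # inside the token (assumed behavior, not confirmed by the source).
    GH_TOKEN=$(tr -d '[:space:]' < "$skill_root/.gh-token")
    export GH_TOKEN
  fi
  # Succeed only when some token is available after the fallback.
  [ -n "${GH_TOKEN:-}" ]
}

# Usage: load_gh_token "$HOME/.claude/skills/od-contribute" || echo "no token"
```

Because the environment variable is checked first, a token exported by the caller always wins over the file.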

Future: implement OAuth Device Flow inside the skill so non-coder users
hitting this in Codex.app can authenticate by clicking a link, no
terminal involved. That's a separate PR.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(contrib): move install script into skill folder (CI policy fix)

The repo's tools/ directory has a strict allowlist policy enforced by
scripts/guard.ts — only AGENTS.md, dev/, pack/, and serve/ are permitted
top-level entries. Moving install-od-contribute.sh out of tools/ and into
.claude/skills/od-contribute/install.sh:

  - Satisfies the guard policy (no scripts/guard.ts edit needed)
  - Co-locates the install script with the skill it installs (cleaner
    mental model: skill folder is self-contained)
  - The install URL stays inside the gitignore exception we already
    established for .claude/skills/od-contribute/

Public install URL changes from
  raw.githubusercontent.com/.../main/tools/install-od-contribute.sh
to
  raw.githubusercontent.com/.../main/.claude/skills/od-contribute/install.sh

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): address @nettee/looper review feedback (3 blocking issues)

Three real bugs caught by the looper review bot, all fixed:

1) create-pr.sh:48 — git diff missed untracked files

   `git diff --quiet || git diff --cached --quiet` ignored untracked paths,
   so the most common contribution shape (a brand-new Skill folder, a new
   translation file, a new doc) hit the else branch and pushed an empty
   commit. Replaced with `git status --porcelain` which sees untracked,
   plus a post-stage sanity check via `git diff --cached --quiet` so we
   skip the commit cleanly if everything turned out to be in .gitignore.

2) validate-skill-submission.sh:34 — frontmatter parse too lenient

   The awk fence-counter accepted `---` anywhere in the file as the
   opening fence. A SKILL.md with prose before the YAML block parsed as
   "valid frontmatter" by this script while the actual loaders (Claude
   Code + codex-rs/core-skills) required the fence on line 1 and would
   reject it. Added an explicit head -n 1 check so leading prose is
   rejected with a clear error before awk runs.

3) check-prereqs.sh:87 — gh api user failure swallowed

   `GH_USER="$(gh api user --jq .login 2>/dev/null || echo '?')"` set
   GH_USER to literal "?" when the API call failed (revoked token,
   missing 'repo' scope, network), then the script exited READY=1.
   Downstream that propagated to TARGET_FORK="?/open-design" and
   blew up at push time.

   Dropped the `|| echo '?'` fallback. An empty GH_USER now triggers a
   structured error with three common causes and the recovery command,
   and exits 2.

   While here, also fixed a related bug: this script sources config.sh
   which has `set -euo pipefail`, so -e leaked in and aborted the
   script silently the moment any check failed (instead of accumulating
   diagnostics like the original auto-github-contributor design
   intended). Added explicit `set +e; set -uo pipefail` after sourcing
   to restore the "keep checking past failures" behavior the comment
   on line 7 promised.
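
The `set -e` leak described in the last paragraph can be reproduced in isolation. This sketch uses a temporary file as a stand-in for config.sh and plain `set -eu` instead of `set -euo pipefail`, so that it also runs under a POSIX `sh`.

```shell
# Stand-in for config.sh: a sourced file that enables strict mode.
cfg=$(mktemp)
printf 'set -eu\n' > "$cfg"

# Without a reset, -e leaks into the sourcing script and the first failing
# check aborts it before anything else is reported.
without_reset=$(sh -c '. "$1"; false; echo "still running"' sh "$cfg" || true)

# With `set +e` right after sourcing, the script keeps going past failures
# and can accumulate diagnostics.
with_reset=$(sh -c '. "$1"; set +e; false; echo "still running"' sh "$cfg")

rm -f "$cfg"
printf 'without reset: [%s]\nwith reset: [%s]\n' "$without_reset" "$with_reset"
# prints:
#   without reset: []
#   with reset: [still running]
```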

Smoke-tested all four fixes locally:
  - create-pr.sh: git status --porcelain correctly sees untracked files
  - validator: rejects SKILL.md starting with prose, passes well-formed
  - check-prereqs.sh: with stubbed gh that fails `gh api user`, now
    exits 2 with the structured error (was: silent exit 1)
  - check-prereqs.sh: happy path on real machine unchanged
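
For the validator fix (item 2 above), the line-1 check can be sketched as follows. The function name is hypothetical, and stripping a trailing carriage return is an added assumption so that CRLF files are not rejected.

```shell
# Hypothetical sketch of the explicit first-line check.
has_leading_frontmatter_fence() {
  # $1 = path to SKILL.md. Only the very first line may open the block.
  first_line=$(head -n 1 "$1" | tr -d '\r')
  [ "$first_line" = "---" ]
}

# Usage:
#   has_leading_frontmatter_fence SKILL.md || echo "frontmatter must start on line 1"
```

Running this before any fence counting means a file with leading prose is rejected up front, which matches what the loaders do.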

Thanks @nettee for the careful review.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): macOS Bash 3.2 + over-strict link validator (review round 2)

Two more blocking issues from the looper review, plus one related bug I
caught while re-testing on real OD docs.

1) discover-i18n-gaps.sh: removed Bash 4 dep (declare -A)

   macOS still ships Bash 3.2.57 by default and most agent-spawned bash
   subprocesses inherit that. `declare -A SEEN_LANG=()` failed with
   `declare: -A: invalid option`, crashing Step 3b before any translation
   target could be shown.

   Replaced the associative array with a newline-delimited string set
   (\n<lang>\n bracket form to avoid prefix-overlap false matches like
   zh vs zh-CN). Verified end-to-end on /bin/bash 3.2.57 against the
   actual OD repo: returns the correct 28 stale-translation rows
   across the four English source docs.

   Also fixed a latent path-stripping bug in the same loop: `find`
   emits `./README.zh-CN.md` with leading `./`, so `${path#README.}`
   wasn't stripping the prefix at all. Switched to basename-first.

2) validate-markdown.sh: --reference flag for i18n / docs-edit flows

   The validator was treating every relative link target as a file path
   and failing on slugs like `skills/blog-post/` that are website
   router routes, not files in the checkout. A structure-preserving
   translation of README.md couldn't pass even when the user changed
   nothing except language.

   Added --reference <orig> flag. The validator now builds a "known
   already-broken" set of refs from the source file and excuses those
   in the new file. Newly-introduced broken refs still fail.

   Without --reference (e.g. brand-new blog file with no prior version),
   the relative-ref check is skipped entirely with a SKIP note — since
   we can't tell route slugs from file paths in isolation, failing
   would be wrong. Code-fence balance + external-link health still run.

   Updated SKILL.md so the i18n branch (3b.6) and the docs branch
   (3c.6) call validate-markdown.sh with --reference pointing at the
   English source / HEAD revision respectively.

3) (caught while testing) URL extraction regex too loose

   `grep -oE 'https?://[^) ]+'` was capturing trailing quotes from HTML
   <img src="..."> tags in OD's README, e.g.
     https://cms-assets.youmind.com/.../foo.jpg"
   The trailing `"` made the curl HEAD return 404. Tightened the
   character class to also stop at `"`, `'`, `<`, `>`, `[`, `]`.
   With this fix, README.md now passes all checks (20 external links
   verified 2xx/3xx).
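
The effect of the tightened character class from item 3 can be checked directly. The two function names are hypothetical and `example.com` is a placeholder URL; the loose pattern is the one quoted above, and the strict one adds the delimiters listed there.

```shell
# Old pattern: stops only at ')' or a space.
extract_urls_loose() {
  grep -oE 'https?://[^) ]+'
}

# Tightened pattern: also stops at " ' < > [ ]
# (']' must come first inside the bracket expression to be literal).
extract_urls() {
  grep -oE "https?://[^][) \"'<>]+"
}

line='<img src="https://example.com/a/foo.jpg">'
printf '%s\n' "$line" | extract_urls_loose   # prints https://example.com/a/foo.jpg">
printf '%s\n' "$line" | extract_urls         # prints https://example.com/a/foo.jpg
```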

Smoke-tested on macOS /bin/bash 3.2.57 with the actual nexu-io/open-design
working copy. All four scenarios behave correctly:
  - README.md without --reference → SKIP relative-ref check, PASS overall
  - README.md with --reference itself → 34 refs excused as pre-existing, PASS
  - Newly-introduced broken ref → FAIL (regression catch preserved)
  - Old test cases (skill validator, prereq check) → still pass
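
For reference, the Bash 3.2-safe replacement for `declare -A` from item 1 can be sketched like this. The helper names are hypothetical; only the newline-bracketed membership test and the basename-first stripping come from the description above.

```shell
NL='
'
SEEN_LANG="$NL"   # set of languages, each stored as "\n<lang>\n"

seen_add() { SEEN_LANG="${SEEN_LANG}$1${NL}"; }

seen_has() {
  # Match the whole entry between newlines, so "zh" does not match "zh-CN".
  case "$SEEN_LANG" in
    *"${NL}$1${NL}"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Derive the language tag from a path as emitted by `find` (leading "./").
lang_of() {
  name=$(basename "$1")   # ./README.zh-CN.md -> README.zh-CN.md
  name=${name#README.}    # -> zh-CN.md
  printf '%s\n' "${name%.md}"
}

seen_add "$(lang_of ./README.zh-CN.md)"
seen_has zh-CN && echo 'zh-CN seen'   # prints zh-CN seen
seen_has zh || echo 'zh not seen'     # prints zh not seen
```

Nothing here needs Bash 4, so the same code also runs under a plain POSIX `sh`.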

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): preserve .gh-token across install.sh reruns

`install_skill_to()` did `rm -rf $dest` before copying in the new skill,
which wiped any user-local state files. The most consequential one is
`.gh-token` — sandboxed agents (Codex.app, Cursor) write a GitHub token
there because they can't reach the macOS keychain (see check-prereqs.sh's
hint and config.sh's fallback path).

Effect: the documented upgrade path ("re-run the curl one-liner to pull
the latest skill") would silently lose the token on every refresh, and
the very next /od-contribute run would fail at the prereq gate with
"no GitHub credentials available", forcing the user back through manual
token setup. This affects exactly the audience the PR is aimed at.

Fix: stash any file in PRESERVE=(.gh-token) to a tempdir before rm -rf,
restore after the copy, re-chmod 600 on the way back. Test:

  1. Pre-seed .gh-token in all three target dirs
  2. Run installer
  3. Verify all three tokens still present, contents unchanged, perms 600

Centralized the preserved-state list as PRESERVE=() so future per-user
state (e.g. an OAuth-flow-saved refresh token) only has to be added in
one place.
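
A stash-and-restore installer along these lines might look like the sketch below. It is not the actual install.sh: the signature of `install_skill_to` is assumed, and `PRESERVE` is a space-separated string here rather than a bash array so the sketch stays POSIX-compatible.

```shell
PRESERVE=".gh-token"   # per-user state files that must survive a reinstall

install_skill_to() {
  src=$1 dest=$2
  stash=$(mktemp -d)
  # Stash preserved files before the destination is wiped.
  for f in $PRESERVE; do
    if [ -f "$dest/$f" ]; then cp -p "$dest/$f" "$stash/$f"; fi
  done
  rm -rf "$dest"
  mkdir -p "$dest"
  cp -R "$src/." "$dest/"
  # Restore and re-tighten permissions on the way back.
  for f in $PRESERVE; do
    if [ -f "$stash/$f" ]; then
      cp -p "$stash/$f" "$dest/$f"
      chmod 600 "$dest/$f"
    fi
  done
  rm -rf "$stash"
}
```

Files that are not in `PRESERVE` are still removed, so a stale file from an older skill version does not linger after an upgrade.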

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): i18n stale false-positive + tier markdown link check (round 4)

Two more blocking issues from looper, both real.

1) discover-i18n-gaps.sh: false-stale on same-commit translations

   `git log --since=@<epoch>` is INCLUSIVE of the boundary epoch, so when
   the English source and a translation get touched in the SAME commit
   (a very common pattern: bulk i18n refresh, structural edits applied
   across all locales), the shared commit was counted toward
   english_commits_since_translation. Result: an already-current
   translation was reported with `status="stale",
   english_commits_since_translation=1`, and Step 3b would suggest it
   for refresh, driving users into no-op PRs.

   Reproduced exactly per looper's case: README.md and README.uk.md both
   have last commit 338cb4d at epoch 1779948707; the OLD predicate
   returned 1, the NEW predicate returns 0.

   Switched commits_between() from `--since=@<epoch>` math to commit
   ancestry: `git rev-list <tr_sha>..HEAD -- <newer>`. tr_sha..HEAD reads
   "commits reachable from HEAD but not from tr_sha", which correctly
   excludes the shared tip when both files were last touched together.

2) validate-markdown.sh: brand-new files bypassed local link check

   The previous fix skipped relative-ref validation entirely when
   --reference was absent. That covered slug-style refs (good) but also
   covered explicit `./foo.md` and `../bar/baz.md` style refs (bad).
   Step 3c (new blog post) doesn't pass --reference, so a contribution
   could ship with `[broken](./missing.md)` and pass the validator.

   Tiered the relative-ref check:
     - Image refs (`![alt](path)`) — ALWAYS validated. Markdown image
       syntax is never a website route.
     - Refs starting with `./` or `../` — ALWAYS validated. Explicit
       relative paths are unambiguous file references.
     - Other link refs (`skills/blog-post/` style) — only validated
       when --reference is supplied; otherwise skipped (could be route).

   In all cases, refs already broken in --reference (when supplied) are
   excused as pre-existing rather than reported as regressions.

   Verified against looper's exact repro (`[new broken](./missing.md)`
   in a brand-new file with no --reference): now correctly fails. Also
   verified ambiguous-slug test (`skills/blog-post/`) still skips
   without --reference, image refs always check, and README.md regression
   tests both with and without --reference still pass.
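
The three tiers from item 2 reduce to a small decision function. This is a sketch of the policy only; `ref_policy` and its argument convention are hypothetical, and the excusing of refs already broken in `--reference` is left out.

```shell
# Prints "check" or "skip" for one link target.
# $1 = target, $2 = "image" or "link", $3 = "yes" if --reference was supplied
ref_policy() {
  target=$1 kind=$2 has_reference=$3
  # Tier 1: image syntax is never a website route.
  if [ "$kind" = "image" ]; then echo check; return; fi
  # Tier 2: explicit relative paths are unambiguous file references.
  case "$target" in
    ./*|../*) echo check; return ;;
  esac
  # Tier 3: anything else could be a route, so it needs a reference file.
  if [ "$has_reference" = "yes" ]; then echo check; else echo skip; fi
}

ref_policy ./missing.md link no        # prints check
ref_policy skills/blog-post/ link no   # prints skip
```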

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* fix(contrib): catch bare-path refs in validators (review round 5)

Two narrow follow-ups to the round-4 tiered link checks:

- validate-skill-submission.sh: scan every non-URL, non-anchor markdown
  link target in SKILL.md (not just `./` / `../` prefixed paths). Plain
  intra-skill refs like `[ref](references/foo.md)` were previously
  ignored by the regex, letting broken bundles pass. Escape detection
  switches to lexical (segment count) instead of `cd … && pwd -P`, so a
  missing intermediate directory no longer masquerades as an escape.

- validate-markdown.sh: treat file-like targets (`*.md`, `*.png`,
  `*.svg`, image/asset/script extensions) as on-disk refs even without
  `--reference`. `[doc](missing.md)` is unambiguously a sibling file,
  not a website route, and Step 3c (new docs/blog) had no `--reference`
  to fall back on. Slug-style refs without an extension still get
  skipped without `--reference`.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(contrib): scratch leak + dedupe gate + workdir reuse (review round 6)

Three blocking issues from looper round 6, all fixed.

1) create-pr.sh + setup-workspace.sh: .od-contrib/ scratch leaked into PR

   `git add -A` in create-pr.sh staged everything in the worktree, including
   the skill's internal scratch dir (.od-contrib/type.txt,
   .od-contrib/slug.txt, .od-contrib/PR-BODY.md created by
   setup-workspace.sh and the render step). OD's .gitignore doesn't
   exclude .od-contrib/, so every PR
   opened through this flow shipped those bookkeeping files in the user's
   contribution diff.

   Two layers of defense:
   - setup-workspace.sh now writes `.od-contrib/` to .git/info/exclude
     when preparing the workdir (repo-local exclude, not committed).
   - create-pr.sh now uses an explicit pathspec `:!:.od-contrib` on its
     git status / git add calls. So even if a workdir was prepared
     differently, this script alone refuses to stage the scratch dir.

   Verified with a temp repo containing both .od-contrib/PR-BODY.md and a
   user file: only the user file lands in the index after `git add -A
   -- . :!:.od-contrib`.

2) create-issue.sh: dedupe gate didn't actually gate

   The --dedupe-keywords flag printed search hits to stderr but then
   unconditionally fell through to `gh issue create`. The `|| true` after
   the gh search pipeline also swallowed network/jq failures, so a broken
   search looked identical to "no duplicates found" — and the issue got
   created either way. The user never got a real chance to choose
   "comment on existing / open anyway / cancel".

   Now:
   - Run gh search and jq as separate steps; either failure exits 2 with
     a structured REASON=search_failed/parse_failed.
   - If matches > 0 AND --allow-duplicates was NOT passed, exit 3 with
     REASON=duplicates_found and MATCH_COUNT=N. Caller must explicitly
     re-run with --allow-duplicates after surfacing matches to the user.
   - The script now requires `jq` (added od::require jq) since we
     actually parse JSON.
   - Updated the docstring at the top so the caller contract (ask the
     user, then re-invoke with --allow-duplicates) is explicit.

   Verified: searching keyword "preview" against nexu-io/open-design
   matches 5 open issues; the script exits 3 and never calls
   `gh issue create`.

3) setup-workspace.sh: same-day workdir reuse leaked stale state

   `SESSION_DIR=<TYPE>-<SLUG>-<YYYYMMDD>` reused the same directory for
   every same-day, same-(type,slug) invocation. The most acute case:
   SKILL.md 3b.1 calls `setup-workspace.sh i18n translate` BEFORE the
   user has picked a doc/language, so every i18n attempt on the same
   day landed in `i18n-translate-<date>/` — and untracked files from an
   abandoned earlier translation survived `git checkout`/`pull` and
   leaked into the next user's run.

   Two changes:
   - Bumped tag to second precision: `<YYYYMMDD>-<HHMMSS>`. Two
     human-paced sessions landing in the same second are vanishingly
     rare. Verified two rapid runs produce different tags (114208 vs
     114209).
   - When a workdir IS reused (same SESSION_TAG passed in explicitly,
     or rare clock collision), now does `git reset --hard HEAD` and
     `git clean -fdx` first so the run starts from a known-good base
     instead of inheriting prior occupant state.

   The branch name now also tracks the timestamp tag, so two runs can't
   accidentally end up on the same feature branch either.
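
The exit-code contract of the dedupe gate from item 2 can be sketched as a standalone function. The name `dedupe_gate` and its arguments are hypothetical; the exit code 3 and the `REASON` / `MATCH_COUNT` lines follow the description above.

```shell
# $1 = number of matching issues, $2 = "yes" when --allow-duplicates was passed
dedupe_gate() {
  match_count=$1 allow_duplicates=$2
  if [ "$match_count" -gt 0 ] && [ "$allow_duplicates" != "yes" ]; then
    echo "REASON=duplicates_found"
    echo "MATCH_COUNT=$match_count"
    return 3   # caller must surface matches, then re-run with --allow-duplicates
  fi
  return 0     # safe to proceed to issue creation
}

rc=0
dedupe_gate 5 no || rc=$?
echo "exit code: $rc"   # prints exit code: 3
```

A distinct non-zero code lets the calling agent tell "duplicates found" apart from a failed search, which exits 2 in the scheme described above.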

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: leilei926524-tech <leilei926524-tech@users.noreply.github.com>
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
2026-05-29 07:16:04 +00:00
Amy
937946c6fa
Improve model picker search and shared BYOK catalogs (#3262) (#3278)
2026-05-29 07:07:40 +00:00
lefarcen
755d84e64c
feat(web): merge Draw + Screenshot into one Studio mark tool (#3081) (#3277)
Forward-ports chaoxiaoche's Studio toolbar work from #3081 onto current
main. The preview toolbar drops to 4 controls — Comment, Mark (the merged
Draw/Screenshot tool with box-select + pen sub-tools), Edit, Comments —
matching the latest design. The standalone Screenshot button and its
copy-to-clipboard path are removed; capture now flows through the mark
overlay. Also carries #3081's comment select-all/clear-selection panel and
keeps the Draw send guard added in #3270 (Send disabled mid-run, Queue stays).

Reconciled with main work that postdates #3081's base so nothing is lost:
- Preserves #2190's preview iframe keep-alive pool and the AnnotationHoverPopover
  hover card (re-added on top of #3081's BoardComposerPopover, with its own
  anchor helper so it doesn't clash with the composer popover anchoring).
- i18n: keeps every locale key main added; adopts #3081's mark wording.

Behavior change: the comment side-panel Clear now deselects instead of
batch-deleting selected comments (per #3081); per-comment delete and
send-selected remain.

Validation: pnpm --filter @open-design/web typecheck (clean),
full web vitest (2354 passed), pnpm guard.

Co-authored-by: chaoxiaoche <fanzhen910412@gmail.com>
2026-05-29 06:51:38 +00:00
Caprika
76c7d31c53
chore: bump vela cli to 0.0.4 (#3239)
* chore: bump vela cli to 0.0.4-test.0

* chore: refresh lockfile for vela cli 0.0.4-test.0

* chore(nix): refresh pnpm deps hash

* fix: materialize electron before mac release checks

* fix: rebuild electron when mac framework links are invalid

* revert: drop release workflow experiments

* chore(nix): refresh pnpm deps hash

* fix: stop blocking beta mac release on electron symlink preflight

* fix: stop using custom electron dist for beta mac packaging

* fix: guard oversized chat images and opencode overflow

* chore: bump vela cli to 0.0.4

* chore(nix): refresh pnpm deps hash

* fix(daemon): surface prompt-image stat failures instead of dropping them

resolveSafePromptImagePaths only swallowed unresolvable path input; once a
path was confirmed inside UPLOAD_DIR and existed, a statSync failure
(EACCES/EPERM, a file vanishing mid-run) silently dropped the image and let
the run continue without that prompt context. Since this helper is now also
the 1 MB enforcement point, that turned an infra/validation failure into a
'successful' run with missing required context.

Collect those into a new failedImages bucket and fail the run with
INTERNAL_ERROR at the call site, mirroring the oversized-image guard. Add a
unit test covering statSync throwing.

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: lefarcen <935902669@qq.com>
2026-05-29 06:41:17 +00:00
Jane
3f4fd58937
feat(landing-page): surface Discord + X in header, restructure site footer (#3230)
* feat(landing-page): surface Discord + X in header, restructure site footer

Two related public-chrome adjustments:

- **Header gains compact Discord + X icon buttons.** Both community
  channels were previously buried in the footer, so the typical
  visitor never saw them unless they scrolled to the bottom of the
  page. They now sit before
  the Download / Star CTAs in `nav-side`, share the ghost-button
  outline language, and stay icon-only with `aria-label` so they
  read as social affordances rather than competing with the text
  CTAs. At ≤1080px the icon buttons hide alongside the existing
  ghost CTA, so the bar still collapses cleanly into the hamburger
  panel — Star stays in the bar at every breakpoint.

- **Footer restructured into 4 columns: Products / Plugins /
  Resources / Connect.** The old `Plugins / Open Design / Connect`
  three-column layout muddled three different things — sister
  products, the artifact catalogue, and contributor channels —
  under one roof, so visitors hunting for "the other thing this
  team makes" had nowhere obvious to go.
  - **Products** (new) lists the team's apps: Open Design (links
    to homepage) and HTML Anything. Two entries by design — adding
    more products without an editorial pass would dilute the
    column.
  - **Plugins** mirrors the topbar `Plugins` dropdown verbatim:
    Templates / Skills / Systems / Craft, with no count prefix on
    Systems / Craft so it reads identically to the nav.
  - **Resources** (renamed from `Open Design`) carries the
    docs-style links: Official source / Quickstart / Local agents
    / Compare / Claude Design alternative. The old column heading
    was confusing because the OD logo + brand name already sit
    under the column.
  - **Connect** gains an X / Twitter row pointing at
    `@nexudotio`. The brand entries on this column are
    contributor / community surfaces only — code, releases,
    chat, social, RSS, contact form.

Implementation:

- `_components/header.tsx` — `DISCORD` and `X_TWITTER` consts at
  the top alongside `REPO`. Two `<a class="nav-icon">` blocks with
  inline SVG before the existing Download / Star CTAs.
- `_components/site-footer.astro` — `HTML_ANYTHING` and `NEXU_IO`
  consts. `<div class="sub-footer-col">` re-ordered to put
  Products first, Plugins second (no longer carries `counts.*`
  values), Resources third, Connect fourth (with the new X / Twitter
  row).
- `globals.css` — `.nav-icon` rule cloned from the ghost CTA's
  visual language (transparent + 1px line, fills on hover) but
  sized as a 36×36 round button so it reads as a social-icon affordance.
  Added `display: none` for `.nav-side .nav-icon` to the existing
  ≤1080px and ≤880px media queries so the icons follow the same
  collapse behaviour as the Download CTA.
- `sub-pages.css` — `.sub-footer-grid` switches from
  `1.6fr 1fr 1fr 1fr` to `1.4fr 1fr 1fr 1fr 1fr` (brand + 4
  columns). At ≤1080px it falls back to a 3-column shape so each
  column has room to breathe; at ≤720px it stays a single column
  (existing behaviour).
- `i18n.ts` — adds `products`, `resources`, `xTwitter`,
  `sisterProjects`, `htmlAnything`, `nexuIo` to `LandingUiCopy.footer`
  (the last three are kept around even though `sisterProjects` is no
  longer rendered after the column was renamed Products — they're
  harmless and avoid churning the type if a future iteration brings
  the Sister-projects framing back). All 17 non-English landing
  locales gain translations for the new keys via the existing
  `LOCALIZED_LANDING_FOOTER_COPY` map (and the `LANDING_UI_COPY_OVERRIDES`
  block for `zh` / `zh-tw`). Translations were generated with
  `claude-haiku-4-5` over OpenRouter, with explicit instructions
  to keep "Open Design", "HTML Anything", and "X / Twitter" in
  English and to render "Products" / "Resources" in sentence case
  per locale convention. Spot-checked against rendered pages on
  `/zh/`, `/zh-tw/`, `/ja/`, `/ko/`, `/de/`, `/fr/` (and `/ar/` for
  RTL) for natural phrasing.

Validation: `pnpm --filter @open-design/landing-page typecheck` ->
0 errors / 0 warnings; local dev server smoke-tested on en root
(`/html-anything/`) and 5 locale variants (`/zh/`, `/zh-tw/`,
`/ja/`, `/de/`, `/fr/`) — header renders 2 nav-icon buttons,
footer renders 4 localized column headings in the correct order
with the right link targets.

* fix(landing-page): address PR #3230 review — locale-aware HTML Anything link + drop unused const

Two non-blocking inline review points from @PerishCode on PR #3230:

- The HTML Anything entry in the new Products column hardcoded
  `https://open-design.ai/html-anything/` via a top-level
  `HTML_ANYTHING` const, but `/html-anything/` is a real localized
  route in this app (`pages/[locale]/html-anything/index.astro`)
  and `open-design.ai` is the same site's live domain. A visitor
  on `/zh/…` clicking through landed on the English route and lost
  locale context, and hardcoding the production domain meant a
  preview build would surface a link that bounces visitors back
  to prod. Switch to `href('/html-anything/')` so the locale prefix
  + the current site's domain (resolved by `localizedHref`) are
  honored, matching every other footer link.

- `NEXU_IO` was declared at the top of the component but never
  referenced — leftover from an earlier iteration that listed
  `nexu.io` as a Sister-projects entry before the column was
  renamed Products and reduced to OD + HTML Anything. Removed.

No behavior change beyond the locale routing fix; the i18n keys
and column structure stay as they landed in the original commit.

* fix(landing-page): correct nav-icon comment to match actual responsive behaviour

The JSX comment introduced for the new Discord + X icon buttons in
PR #3230 claimed the icons "survive at narrow widths while text-only
nav items get pushed off". The CSS that shipped in the same PR does
the opposite: both `@media (max-width: 1080px)` and `@media (max-width:
880px)` blocks add `.nav-side .nav-icon { display: none; }`, so at
narrow widths the icons collapse alongside the ghost Download CTA
while the text nav <ul> moves into the hamburger panel — only the
Star CTA remains visible in the bar.

Rewrite the comment to describe the actual responsive contract so
the next reader of `header.tsx` doesn't have to cross-reference
`globals.css` to figure out which surface stays. Reviewer flag from
@PerishCode on PR #3230.

No code-path change; comment-only.

* fix(landing-page): correct sub-footer 1080px comment to describe actual 3-column grid

The CSS comment introduced for the new sub-footer grid claimed the
≤1080px breakpoint drops to "brand + 2x2 grid of columns" — but the
rule produces a 3-column grid, not a 2x2.

`.sub-footer-grid` has 5 children at this breakpoint (the brand
block + the four footer columns) and `.sub-footer-brand` carries
no `grid-column` span, so with `grid-template-columns: 1.6fr
repeat(2, 1fr)` they flow as: row 1 = brand · Products · Plugins,
row 2 = Resources · Connect · empty cell. The brand sits inline
with two columns rather than on its own, and the four content
columns are not a clean 2x2.

The layout itself is fine; only the comment misleads the next
reader about how the columns wrap. Same flavor as the `header.tsx`
icon comment fixed in 744daec — describe what the rule actually
does so the comment doesn't drift from the CSS. Reviewer flag
from @PerishCode on PR #3230.

Comment-only change.

---------

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-29 05:59:24 +00:00