Commit graph

630 commits

Author SHA1 Message Date
Caprika
6736310a01
Implement manual edit inspector (#1448)
* feat(web): tweaks palette popover with HSL hue-shift recoloring

Adds a Tweaks color-palette popover to the HTML preview toolbar.
Selecting a palette re-skins the iframe in place via a srcDoc-side
bridge that walks the DOM and shifts every chromatic paint to the
target hue while preserving each color's saturation and lightness —
pale tints stay pale, bold CTAs stay bold, just in the new color
family. Mono-noir desaturates instead of shifting.

- runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options
- file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected
- FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration
- PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir
- PreviewDrawOverlay: stub pass-through until the draw branch lands

* feat(web): hide finalize-design toolbar from project header

* test(e2e): skip project actions toolbar flow after toolbar removal

* Polish manual edit inspector

* Implement manual edit inspector

* Fix manual edit review regressions

* Fix FileViewer CI regressions

* Fix remaining manual edit review issues

* Flush manual edit styles before draw exit

* Restore Critique Theater styles

* Accept pixel line-height manual edits

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-05-13 13:25:58 +08:00
Joey-nexu
e3a848a33a
feat(landing-page): replace Ø wordmark with PNG logo across nav/footer/favicon (#1449)
* feat(landing-page): replace Ø wordmark with PNG logo across nav/footer/favicon

Switches the brand mark from the Unicode 'Ø' glyph to the new circular
gradient paper-plane PNG. Header nav and footer share the same image, and
the browser tab + iOS home screen icons are regenerated from the same
500x500 source.

- public/logo.png (500x500, brand source)
- public/favicon.png (32x32, replaces favicon.svg)
- public/apple-touch-icon.png (180x180, regenerated)
- header.tsx + page.tsx footer: <span>Ø</span> -> <img src=/logo.png />
- globals.css: simplify .brand-mark (drop Ø-era border/font, add object-fit
  contain on child img)
- index.astro: link rel=icon now points at favicon.png

* fix(landing-page): apply logo + favicon swap to sub-page layout too

Review on #1449 caught two cross-page consistency issues:

- P1: sub-page-layout.astro still linked /favicon.svg, which this PR
  deletes — every Skills/Systems/Templates/Craft page would request a
  missing asset. Updated to /favicon.png to match index.astro.
- P2: sub-page-layout.astro still rendered the Ø wordmark in its footer
  brand block, leaving the public site with mixed brand marks. Replaced
  with the same <img src=/logo.png /> wrapper pattern used on the
  homepage header and footer.

Repo-wide grep now shows 0 favicon.svg references and 0 Ø brand-mark
spans. typecheck still 25 files / 0 errors / 0 warnings.

---------

Co-authored-by: Joey-nexu <236967869+joeylee12629-star@users.noreply.github.com>
2026-05-13 12:30:32 +08:00
Prantik Medhi
0244a769cb
feat(landing-page): add blog routes (#1444)
* fix(landing-page): register blog collection config

* fix(landing-page): restore blog content config

* fix(blog): use content-layer ids
2026-05-13 12:20:45 +08:00
nettee
0f0d2879ff
Make de/fr/ru content i18n optional (#1511) 2026-05-13 12:17:17 +08:00
Nagendhra Madishetti
e2f409579d
docs: Critique Theater Phase 14 (user guide + 2 AGENTS module maps) (#1319)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)

Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.

Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
  sticky against further lifecycle events for the same run, except
  for parser_warning which can land late and is recorded in a side
  channel without changing phase.
- A new run_started for a different runId at any time discards the
  prior state and reboots, so the UI can launch consecutive runs
  without an explicit reset action.
- Events whose runId does not match the active run return the same
  state reference, so React's useReducer doesn't re-render
  subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
  so an out-of-order panelist_dim for round 1 arriving after a
  round 2 dim does not corrupt the round 2 bucket.

Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.

* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)

createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.

useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.

Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
  CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
  shape back to PanelEvent; malformed JSON is swallowed and does
  not stop the stream; exponential backoff schedule and ready-reset
  semantics are pinned with a setTimeout seam; close() cancels
  pending reconnects and shuts the live source; no-op fallback
  when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
  reducer driven by synthetic actions, no connection when disabled
  or projectId is null, clean close on unmount, projectId change
  reopens cleanly.

* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)

Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.

speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
  finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
  can watch the run unfold.
- paused parses the transcript but holds events back until the
  caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
  feature, currently treated as instant; replay timestamps are not
  yet persisted with each event so honest pacing requires a
  follow-up Phase 7+ task.

gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.

Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
  shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
  done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
  still land.
- .gz transcripts route through the gunzip seam.

Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.

* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)

Two P1 fixes from lefarcen's review on PR #1307:

SSE payload override

`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.

Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.

Replay pause/resume

`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).

Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
  reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
  the next scheduled timer, flips to instant, and confirms the
  remaining transcript drains exactly once (cursor was preserved)

Doc consistency

The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.

Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.

* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)

* feat(web): Theater PanelistLane component (Phase 8.1)

* feat(web): Theater ScoreTicker component (Phase 8.2)

* feat(web): Theater RoundDivider component (Phase 8.3)

* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)

* feat(web): Theater TheaterDegraded chip (Phase 8.5)

* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)

* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)

* feat(web): Theater TheaterStage top-level container (Phase 8.8)

* feat(web): Theater CSS using existing semantic tokens (no hex literals)

* feat(web): Theater public exports barrel

* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)

Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.

State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
   Host hooks dispatch it when their gating prop changes so a stale
   run from a prior project / transcript cannot bleed into the next
   context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
   connection effect, so a workspace switch from project A (which
   streamed a critique) to project B clears the reducer before the
   new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
   parse effect, so transcriptUrl swaps (including swap-to-null after
   a replay reached `shipped`) lift the reducer back to idle before
   the new fetch starts.

SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
   check after the cheap `isPanelEvent` predicate. A
   `critique.ship` frame missing `composite` / `round` / `status` /
   `artifactRef` is rejected before reaching the reducer, so
   TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
   Every variant's required fields are validated: run_started
   (protocolVersion, non-empty cast, maxRounds, threshold, scale),
   panelist_* (round, role, plus variant-specific shape), round_end
   (round, composite, mustFix, decision in {continue,ship}, reason),
   ship (round, composite, status, artifactRef.{projectId,artifactId},
   summary), degraded (reason, adapter), interrupted (bestRound,
   composite), failed (cause), parser_warning (kind, position).

Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
   view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
   the in-progress lane the instant the tag opens. Before this, a
   stream that emitted only `panelist_open` after `run_started` left
   `rounds = []` and the UI rendered no current round until a later
   `panelist_dim` arrived.

Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
   `var(--purple, var(--accent))`. `--purple` is actually defined
   across the design systems; `--magenta` is not, so Brand was
   silently falling through to `--accent` and looking identical to
   Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
   interrupted-collapse copy ("Interrupted at round N, best
   composite X.X"). Previously the interrupted branch reused
   `shippedSummary` and the UI read "Shipped at round..." for a run
   that specifically did not ship. Native value in en + zh-CN; other
   locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
   hardcoded `theater-degraded-heading`, so two chips rendered on
   the same page (chat history with multiple completed runs) keep
   their aria-labelledby references unambiguous.

Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.

Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales

* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)

* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)

* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)

Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.

Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
  /api/projects/:id/critique/:runId/interrupt alongside the
  optimistic local dispatch. Before this fix, clicking Interrupt
  only flipped the React state to interrupted while the daemon job
  kept running. The fetch is best-effort: a 404 (endpoint not wired
  yet, lands in Phase 15) is swallowed with a dev-mode console.warn
  so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
  and simulate the "daemon not ready yet" path. Two tests pin both:
  the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
  rejected fetch still flips the UI.

interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
  one we saw; when it changes, interruptPending is cleared. A user
  who interrupts run-1 and then triggers run-2 from the same mount
  now gets a fresh, enabled kill button instead of one stuck in
  "Interrupting…". Pinned by a new mount test.

Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
  input, textarea, select, or contenteditable element is ignored
  (and any ancestor of those via closest() is treated the same
  way). Body-level focus still fires the keybind so the Theater
  area's affordance keeps working. Four new tests cover textarea,
  input, contenteditable, and the body-focus positive case.

userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
  critiqueTheater.userFacingName key so the "Design Jury" label can
  be renamed without touching code. Phase 8 introduced
  critiqueTheater.title by mistake; renamed across types.ts, en.ts,
  zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
  TheaterStage.tsx. The locale alignment test stays green.

Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
  the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
  the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.

* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)

In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.

Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.

7/7 vitest cases green.

* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)

Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.

* feat(daemon): adapter conformance harness (Phase 10.3)

runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.

5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.

* test(e2e): Critique Theater Playwright suite (Phase 11)

Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.

Coverage:

  1. Happy path: a single mounted theater plays the full
     fixture (1 run_started, 5 panelists open / dim / must_fix /
     close, 1 round_end, 1 ship) and ends on the score badge.
  2. Interrupt mid-run: the panelist that is open at the time
     the interrupt button is clicked closes with an interrupted
     marker and the transcript freezes there.
  3. Visual regression at 375x720 mobile.
  4. Visual regression at 768x1024 tablet.
  5. Visual regression at 1280x800 desktop.
  6. A11y role tree: the theater region exposes a labelled
     landmark, each panelist lane is a group with an accessible
     name, the score is a status live region.

All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.

* test(web): reducer p99 bench at 10k iterations (Phase 13.1)

Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.

The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.

* test(web): critique surface coverage walker (Phase 13.2)

Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.

62 assertions, one per symbol per corpus.

* docs: Critique Theater user guide (Phase 14.1)

Seven sections aimed at end users (not contributors):

  1. What is Design Jury
  2. How it works (the five panelists, auto-converging rounds,
     the composite formula)
  3. Settings (the M1 toggle and what it does)
  4. Reading the score badge
  5. Replay surface
  6. Troubleshooting (degraded, interrupted, failed)
  7. FAQ

The composite formula is documented as
    designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.

* docs(daemon): critique module AGENTS map (Phase 14.2)

Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.

* docs(web): Theater module AGENTS map (Phase 14.3)

Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.

* docs: tighten Phase 14 reasoning from lefarcen review (PR #1319)

Four content gaps lefarcen flagged in the Phase 14 docs review,
addressed inline rather than deferred. The fifth item (scope-drift
between 'docs only' PR body and the cumulative stacked diff) is
handled by rewriting the PR body, not the docs.

1. Round exit conditions (lefarcen P2-1).
   docs/critique-theater.md §2 'Auto-converging rounds' now lists
   the five conditions that stop a run (threshold reached, round
   budget exhausted, per-round timeout, total timeout, user
   interrupt) with their default values. A user debugging a run
   that stopped at round 1 with composite 5.4 can read this list
   and find the matching cause without spelunking the orchestrator.

2. Prior-art comparison (lefarcen P2-2).
   New §1.5 'Why an in-CLI panel and not a third-party design lint'
   pre-answers the 'why not Figma lint / Adobe checker / Material
   You conformance' question. Three differences: rule engines vs
   generative reviewers, post-hoc vs in-loop, external service vs
   same-CLI-session.

3. Composite formula rationale (lefarcen P2-4).
   §2 now explains why each weight is set the way it is: critic
   gates correctness so it gets 0.4; brand / a11y / copy are
   secondary quality dimensions at 0.2 each; designer is at 0.0
   in v1 because aesthetic preference is not a ship gate. The slot
   stays in the schema so notes flow into the transcript and a v2
   config release can bump the weight without a wire-shape change.

4. v2 cast-config ownership (lefarcen P2-3).
   Both AGENTS.md files (daemon + web) now declare a 'Designer
   weight frozen at 0.0 until v2 cast config' invariant. The
   daemon side calls out where the SKILL.md frontmatter resolver
   lands (apps/daemon/src/critique/config.ts); the web side calls
   out where the Settings surface lands (apps/web/src/components/
   Settings/). A contributor reading either AGENTS.md before
   implementing v2 sees which module to touch first.

* docs(web): mirror the Designer-weight invariant in Theater AGENTS.md (PR #1319)

lefarcen P1 follow-up on PR #1319: the daemon AGENTS.md already
declares 'Designer weight is frozen at 0.0 until v2 cast config
lands' as an invariant, but the web AGENTS.md's parallel bullet
led with 'Composite weights are read-only on the web side' which
buried the Designer-specific constraint. A web contributor
reading that bullet would not realise the v1 weight distribution
is wire-shape (changing it mid-v1 invalidates persisted
critique_runs composite values).

Rewrote the bullet to lead with the same 'Designer weight is
frozen at 0.0 until v2 cast config lands' phrasing the daemon
side uses, and added an explicit cross-link to the daemon
AGENTS.md so the two halves of the invariant read as one rule.

Web-side specifics retained: ScoreTicker / TheaterCollapsed read
composite off the wire (no client recompute), v2 lands as a
Settings surface at apps/web/src/components/Settings/, do not
add a 'weights' prop to any component in this directory until
the contracts package carries the v2 cast type.

* docs: replace deferred metrics endpoint reference + refresh Theater module map (PR #1319)

Two carryover items lefarcen flagged across the PR #1319 + #1320
reviews.

1. docs/critique-theater.md was sending users to
   /api/metrics/critique as the conformance-status check on
   malformed_block, but the Phase 12 metrics endpoint is
   explicitly deferred until after orchestrator wiring lands.
   Replaced the link with the pnpm conformance-harness command
   that DOES exist today (pnpm --filter @open-design/daemon
   vitest run tests/critique-conformance.test.ts) and noted
   that the dashboard surfaces this status as a series once
   Phase 12 ships.

2. apps/web/src/components/Theater/AGENTS.md module map was
   stale after Phase 15: the index.ts row said 'only two hooks
   are exported' but the barrel now exports
   useCritiqueTheaterEnabled too (plus the setCritiqueTheaterEnabled
   setter). Updated the row to list all three hooks + the
   setter + the reducer-derived contract types, and added a
   new row for hooks/useCritiqueTheaterEnabled.ts in the file
   table so a web contributor scanning the table sees the new
   hook without inferring it from the index.ts blurb.

* fix(web): restore wait-for-daemon-ack pattern on Theater interrupt

Same regression as flagged on PR #1316 post-main-merge: the
optimistic local dispatch fired before the POST resolved, so a
daemon 404 / 409 still terminalized the UI and the real SSE
terminal event got ignored by the sticky interrupted phase.

Snapshot runId / bestRound / composite at click time, dispatch
interrupted only on res.ok, clear interruptPending on rejection or
non-2xx so the user can retry. Tests cover rejection + 404 leaving
the run on the live stage; the 204 path waits for the ack.

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
2026-05-13 12:11:48 +08:00
mehmet turac
313008933a
fix(memory): add loading state to Refresh button in Memory settings (#1477)
* fix(memory): add loading state to Refresh button in Memory settings

Addressed review feedback:
- Fixed icon class: use 'icon-spin' instead of 'spin'
- Added 'settings.memoryExtractionsRefreshing' to types.ts and all locale files
- Removed unrelated tools-pack changes (split to separate PR)

Fixes #1418

* fix(memory): remove duplicate translation and add missing Thai locale

Addressed review feedback:
- Removed duplicate settings.memoryExtractionsRefreshing in en.ts
- Added settings.memoryExtractionsRefreshing to th.ts

Fixes #1418
2026-05-13 12:06:30 +08:00
Samay
874b1e9db3
fix: treat Codex reconnect events as warnings not fatal errors (#1482)
* fix: treat Codex reconnect events as warnings not fatal errors

Reconnecting... x/5 events are recoverable — Codex eventually
completes successfully. Surface them as status events instead
of failing the entire run.

Fixes #1471

* test: add regression tests for codex reconnect warning handling (#1471)

* test: add regression tests for codex reconnect warning handling (#1471)
2026-05-13 12:00:55 +08:00
Prantik Medhi
03ed39602e
fix: preserve Claude tool inputs (#1476)
Fixes nexu-io/open-design#1475
2026-05-13 11:25:02 +08:00
Nicholas-Xiong
9db5dd8798
fix: close Settings modal when opening project from Routines (#1490)
* fix: close Settings modal when opening project from Routines

Fixes #1355

When clicking 'Open project' in Routine history, the Settings modal
now closes automatically, providing a clean transition to the project.

Changes:
- Added onClose prop to RoutinesSection component
- Call onClose after navigate() in Open project button handler
- Pass onClose from SettingsDialog to RoutinesSection

This ensures the modal dismisses after navigation, consistent with
other modal-based navigation actions in the app.

* fix: pass onClose prop to RunHistory component

Address review feedback: RunHistory component now receives and uses
the onClose prop to properly close Settings modal when navigating.
2026-05-13 11:15:36 +08:00
Rocky
1b84af44c0
fix(web/i18n): translate template platform selection + Companion surfaces to Chinese (#1491)
The template creation panel's Target platforms section was hardcoded in
English with no i18n keys at all, while the adjacent Companion surfaces
section had i18n keys in place but zh-CN.ts and zh-TW.ts stored English
placeholder values for them. Both sections appeared in English regardless
of the user's selected language, as reported for Chinese in #1415.

Fix:

- Add 14 new keys to `apps/web/src/i18n/types.ts` for the Target platforms
  section (`newproj.targetPlatformsLabel`, `newproj.targetPlatformsHint`,
  plus `label`/`hint` pairs for each of the six platform cards:
  responsive, webDesktop, mobileIos, mobileAndroid, tablet, desktopApp).
- Populate the new keys in `en.ts` (source of truth) and translate to
  Simplified Chinese in `zh-CN.ts` and Traditional Chinese in `zh-TW.ts`.
- Replace the English-placeholder values in `zh-CN.ts` and `zh-TW.ts`
  for the pre-existing Companion surfaces keys
  (`surfaceOptionsLabel`, `includeLandingPage`, `includeLandingPageHint`,
  `includeOsWidgets`, `includeOsWidgetsHint`,
  `includeOsWidgetsDisabledHint`).
- Refactor `DESIGN_PLATFORMS` in `NewProjectPanel.tsx` to store `keyof
  Dict` entries instead of hardcoded English strings; `PlatformPicker`
  now pulls translated labels and hints through `useT()` at render time.

Other locales (de, fr, ja, ko, ru, ar, pl, hu, uk, tr, th, es-ES, pt-BR,
fa, id) use `...en` spread in their locale files, so they inherit the
new English defaults automatically and can be translated by native
speakers in follow-up PRs.

Chinese translations follow Open Design's existing zh-CN.ts style
conventions: 应用 (app), 移动 (mobile), 营销 (marketing) are consistent
with prior entries in the file; 画框 (frame) matches Figma's Chinese UI
for the same mockup-frame concept.

Verified:

- `pnpm guard` clean
- `pnpm --filter @open-design/web typecheck` clean
- 945 web tests pass (`pnpm --filter @open-design/web test`)
- Manually verified in a locally-built `tools-pack` app with 简体中文
  selected: the template creation panel's Target platforms and
  Companion surfaces sections now render fully in Chinese with no
  layout breakage.

Fixes #1415
2026-05-13 11:10:57 +08:00
Nicholas-Xiong
0e3438731a
fix: add spacing between window controls and logo on macOS (#1480)
* fix: add spacing between window controls and logo on macOS

Fixes #1427

When hovering over the macOS window control area, the traffic light controls
appeared too close to the Open Design logo, creating visual crowding in the
title bar.

Changes:
- Added 8px margin-right to .app-chrome-traffic-space
- This creates breathing room between the window controls and the brand logo
- The spacing applies on top of the existing 12px gap in the header
- Added explanatory comment for future maintainers

Result:
-  Window controls and logo have proper spacing
-  Title bar feels more balanced and polished
-  No visual crowding around the brand mark
-  Maintains clean layout on macOS

This is a visual polish fix that improves the perceived quality of the app's
title bar, one of the most visible parts of the interface.

* fix: only apply traffic light spacing on macOS

Address review feedback: Use a separate CSS variable
--app-chrome-traffic-margin that defaults to 0px and is only
set to 8px on macOS via MAC_WINDOW_CHROME_CSS injection.

This prevents the spacing from affecting non-macOS platforms
where traffic-space is 0px.

Changes:
- Added --app-chrome-traffic-margin variable (default: 0px)
- Set it to 8px in MAC_WINDOW_CHROME_CSS for macOS only
- Removed unconditional margin-right from base style
2026-05-13 11:06:58 +08:00
Prantik Medhi
def9544996
fix(web): tighten routines project radios (#1493) 2026-05-13 11:05:18 +08:00
sukumarp2022
b167991d7c
feat: add project-level and user-level custom instructions (#1304)
* feat: add project-level and user-level custom instructions

Implements #510 — editable custom instructions that get injected into
every model message, at both user level (Settings → Memory) and
project level (pencil icon in project header).

- Add customInstructions to Project, AppConfigPrefs contracts
- Add custom_instructions column migration to projects table
- Inject user + project instructions into system prompt (after memory,
  before design system; project-level wins on conflict)
- Add Settings textarea for user-level instructions
- Add inline editor bar in ProjectView for project-level instructions
- Sync user-level instructions through daemon app-config round-trip

* fix: address PR review — validation, draft reset, length limit

- Reset instructionsDraft on Cancel and toggle close (stale draft bug)
- Thread customInstructions through POST /api/projects create handler
- Add type + length validation (5000 chars) in PATCH handler
- Enforce length cap in app-config applyConfigValue
- Add maxLength={5000} to both UI textareas
- Resync draft via useEffect when editor is closed
- Remove stray run.sh from commit

* fix: address maintainer review — save race condition, precedence wording

- Make handleSaveInstructions async with await + revert on failure
- Add instructionsSaving state to disable Save/Cancel/textarea during save
- Clarify precedence wording with concrete example in both prompt composers
- UpdateProjectRequest already has customInstructions (verified)

* fix: use server-returned project in save handler, drop optimistic update

The previous optimistic-update + revert approach captured a stale project
snapshot in the useCallback closure. On failure, reverting with the captured
object could clobber unrelated project fields that changed during the
async request.

Switch to pessimistic update: wait for patchProject to succeed, then call
onProjectChange(result) with the server-returned project object. The
instructionsSaving flag disables the editor UI during the round-trip.

* fix: align create/PATCH validation for customInstructions

Create endpoint now rejects invalid types and >5000 char values with
400 instead of silently truncating, matching the PATCH handler behavior.
2026-05-12 14:27:57 -04:00
Neha Prasad
342ba44383
fix memory extraction history affordance (#1447) 2026-05-12 13:35:34 -04:00
PerishFire
61163d6b92
Optimize Windows packaged prebundle flow (#1389) 2026-05-12 12:07:32 -04:00
Eli
49ea2499ac
[codex] Add draw annotation workflow (#1435)
* feat(web): tweaks palette popover with HSL hue-shift recoloring

Adds a Tweaks color-palette popover to the HTML preview toolbar.
Selecting a palette re-skins the iframe in place via a srcDoc-side
bridge that walks the DOM and shifts every chromatic paint to the
target hue while preserving each color's saturation and lightness —
pale tints stay pale, bold CTAs stay bold, just in the new color
family. Mono-noir desaturates instead of shifting.

- runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options
- file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected
- FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration
- PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir
- PreviewDrawOverlay: stub pass-through until the draw branch lands

* feat(web): hide finalize-design toolbar from project header

* test(e2e): skip project actions toolbar flow after toolbar removal

* Add draw annotation workflow

* Restore project actions toolbar
2026-05-12 21:54:59 +08:00
nettee
f621dbbfea
feat(web): Add Tailwind foundation (#1388) 2026-05-12 21:48:16 +08:00
mehmet turac
7e2168ed29
fix(picker): improve provider group header separation in Media model picker (#1441)
Added min-height and border-bottom to the sticky provider group header
to ensure it fully separates from the model content below.

Fixes #1434
2026-05-12 09:47:44 -04:00
Neha Prasad
2d405fae96
fix:align artifact preview exit button (#1445) 2026-05-12 21:39:31 +08:00
Nagendhra Madishetti
09a8fa8d64
feat(web): Critique Theater Phase 8 (8 Theater components, barrel, role-keyed CSS) (#1314)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)

Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.

Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
  sticky against further lifecycle events for the same run, except
  for parser_warning which can land late and is recorded in a side
  channel without changing phase.
- A new run_started for a different runId at any time discards the
  prior state and reboots, so the UI can launch consecutive runs
  without an explicit reset action.
- Events whose runId does not match the active run return the same
  state reference, so React's useReducer doesn't re-render
  subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
  so an out-of-order panelist_dim for round 1 arriving after a
  round 2 dim does not corrupt the round 2 bucket.

Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.

* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)

createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.

useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.

Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
  CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
  shape back to PanelEvent; malformed JSON is swallowed and does
  not stop the stream; exponential backoff schedule and ready-reset
  semantics are pinned with a setTimeout seam; close() cancels
  pending reconnects and shuts the live source; no-op fallback
  when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
  reducer driven by synthetic actions, no connection when disabled
  or projectId is null, clean close on unmount, projectId change
  reopens cleanly.

* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)

Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.

speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
  finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
  can watch the run unfold.
- paused parses the transcript but holds events back until the
  caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
  feature, currently treated as instant; replay timestamps are not
  yet persisted with each event so honest pacing requires a
  follow-up Phase 7+ task.

gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.

Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
  shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
  done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
  still land.
- .gz transcripts route through the gunzip seam.

Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.

* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)

Two P1 fixes from lefarcen's review on PR #1307:

SSE payload override

`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.

Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.

Replay pause/resume

`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).

Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
  reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
  the next scheduled timer, flips to instant, and confirms the
  remaining transcript drains exactly once (cursor was preserved)

Doc consistency

The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.

Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.

* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)

* feat(web): Theater PanelistLane component (Phase 8.1)

* feat(web): Theater ScoreTicker component (Phase 8.2)

* feat(web): Theater RoundDivider component (Phase 8.3)

* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)

* feat(web): Theater TheaterDegraded chip (Phase 8.5)

* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)

* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)

* feat(web): Theater TheaterStage top-level container (Phase 8.8)

* feat(web): Theater CSS using existing semantic tokens (no hex literals)

* feat(web): Theater public exports barrel

* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)

Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.

State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
   Host hooks dispatch it when their gating prop changes so a stale
   run from a prior project / transcript cannot bleed into the next
   context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
   connection effect, so a workspace switch from project A (which
   streamed a critique) to project B clears the reducer before the
   new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
   parse effect, so transcriptUrl swaps (including swap-to-null after
   a replay reached `shipped`) lift the reducer back to idle before
   the new fetch starts.

SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
   check after the cheap `isPanelEvent` predicate. A
   `critique.ship` frame missing `composite` / `round` / `status` /
   `artifactRef` is rejected before reaching the reducer, so
   TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
   Every variant's required fields are validated: run_started
   (protocolVersion, non-empty cast, maxRounds, threshold, scale),
   panelist_* (round, role, plus variant-specific shape), round_end
   (round, composite, mustFix, decision in {continue,ship}, reason),
   ship (round, composite, status, artifactRef.{projectId,artifactId},
   summary), degraded (reason, adapter), interrupted (bestRound,
   composite), failed (cause), parser_warning (kind, position).

Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
   view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
   the in-progress lane the instant the tag opens. Before this, a
   stream that emitted only `panelist_open` after `run_started` left
   `rounds = []` and the UI rendered no current round until a later
   `panelist_dim` arrived.

Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
   `var(--purple, var(--accent))`. `--purple` is actually defined
   across the design systems; `--magenta` is not, so Brand was
   silently falling through to `--accent` and looking identical to
   Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
   interrupted-collapse copy ("Interrupted at round N, best
   composite X.X"). Previously the interrupted branch reused
   `shippedSummary` and the UI read "Shipped at round..." for a run
   that specifically did not ship. Native value in en + zh-CN; other
   locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
   hardcoded `theater-degraded-heading`, so two chips rendered on
   the same page (chat history with multiple completed runs) keep
   their aria-labelledby references unambiguous.

Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.

Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales

* fix(web): tighten isPanelEvent in contracts so enum + numeric fields are checked end-to-end (Siri-Ray round-3 P1 on PR #1314)

The variant validator on the web SSE path previously accepted any
`typeof === 'string'` for closed-enum fields (ship.status,
panelist_*.role, degraded.reason, failed.cause, parser_warning.kind,
run_started.cast[]) and any `typeof === 'number'` for numeric fields,
which let NaN / Infinity through. Downstream components index i18n
tables by enum value, so an unknown status or role would land
`SHIP_BADGE_KEY[final.status]` on undefined and crash the translator.

The replay parser had a separate gap: `useCritiqueReplay.parseTranscript`
called the cheap `isPanelEvent` header check directly, so a recorded
line like `{"type":"ship","runId":"r"}` reached the reducer with
composite, status, round, artifactRef, summary all undefined and
TheaterCollapsed then called `final.composite.toFixed(1)` on undefined.

Resolution: move all wire-side validation into the contract guard.

- Export const arrays for the closed enums:
  SHIP_STATUSES, DEGRADED_REASONS, FAILED_CAUSES, PARSER_WARNING_KINDS,
  ROUND_DECISIONS (PANELIST_ROLES already existed).
- Rewrite `isPanelEvent` in packages/contracts/src/critique.ts to be the
  single deep validator: header (known type + non-empty runId) plus
  every variant-specific required field plus closed-enum membership
  plus Number.isFinite on every numeric field. Documented as the wire
  source of truth.
- Drop the local `hasValidVariantShape` from web/sse.ts; sseToPanelEvent
  now relies entirely on the contract guard, and parseTranscript in
  useCritiqueReplay (which already uses isPanelEvent) gets the deeper
  validation for free.

Tests (TDD, red-first):

- packages/contracts/tests/critique.test.ts: 13 new cases pinning the
  strict guard directly (well-formed across every variant, every
  rejection path: unknown type, empty/non-string runId, unknown enum,
  non-finite numeric, missing variant field).
- apps/web/tests/components/Theater/state/sse.test.ts: 9 new cases for
  each closed-enum rejection on the wire path plus a positive sweep
  across every legal enum value across every variant.
- apps/web/tests/components/Theater/hooks/useCritiqueReplay.test.tsx:
  2 new cases for incomplete and unknown-enum transcript lines.

Verified:
- pnpm --filter @open-design/contracts test 4 files / 30 tests green.
- pnpm --filter @open-design/contracts build clean.
- pnpm --filter @open-design/web typecheck clean.
- pnpm --filter @open-design/web test 107 files / 976 tests green.

* fix(contracts): enforce numeric domains in isPanelEvent (lefarcen P2 on PR #1314 round 4)

The strict guard from PR #1314 round 3 enforced enum membership and
Number.isFinite, but accepted any finite number where the contract
intends a specific domain: scale: 0 (ScoreTicker divides by it),
negative thresholds, fractional rounds, negative mustFix, etc.
ScoreTicker.tsx writes `var(--scale, ${state.scale})` into inline
CSS and divides by it for tick width, so a guard-passing scale: 0
shipped Infinity into the rendered style. Negative composite /
score values reached downstream code that assumes >= 0.

Resolution: mirror the daemon-side Zod domain constraints in the
runtime guard.

Three new helpers in packages/contracts/src/critique.ts:

  - isPositiveInt(v): integer with v > 0. Used for round, maxRounds,
    scale, protocolVersion (all 1-indexed in the orchestrator).
  - isNonNegativeInt(v): integer with v >= 0. Used for mustFix,
    position, bestRound. bestRound: 0 is the valid sentinel for
    'interrupted before any round closed'.
  - isNonNegativeFinite(v): finite number with v >= 0. Used for
    composite, score, dimScore, threshold. Threshold may be
    fractional (e.g. 8.5 on a scale of 10).

Cross-field check inside run_started: threshold <= scale (the daemon
Zod schema enforces this with an epsilon refine, the wire guard
matches the same intent).

Tests (TDD, red-first) added in packages/contracts/tests/critique.test.ts:

  - 22 new rejection cases across every numeric field that
    previously slipped through: scale: 0, negative scale, fractional
    scale, maxRounds: 0, fractional maxRounds, protocolVersion: 0,
    fractional protocolVersion, negative threshold, threshold > scale,
    round: 0, fractional round, negative dimScore / score, negative /
    fractional mustFix, negative composite, ship round: 0, negative /
    fractional bestRound, negative interrupted composite, negative /
    fractional parser_warning position.
  - 3 positive boundary cases that must still pass: threshold == scale,
    fractional threshold within [0, scale], interrupted with
    bestRound: 0 (no round completed before interrupt), parser_warning
    with position: 0 (start of stream).

Verified:
- pnpm --filter @open-design/contracts build clean.
- pnpm --filter @open-design/contracts test: 4 files / 59 tests green
  (was 37 before the new domain cases).
- pnpm --filter @open-design/web typecheck clean.
- pnpm --filter @open-design/web test: 110 files / 1004 tests green;
  no regression on Theater suite, sse validator, replay parser, or
  assistant-feedback widget tests.

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
2026-05-12 21:38:58 +08:00
Rocky
a7e6e0dc3d
fix(web/AssistantMessage): update status block detail to latest value instead of skipping (#1413)
When using Open Design with an ACP agent (e.g. Devin for Terminal),
after selecting a non-default model in the picker, the model badge
under the conversation header kept showing the agent's initial default
(e.g. `model swe-1-6-fast`) instead of the running model. The
conversation header text and the agent itself reflected the selected
model correctly — only the badge UI was stale.

Root cause: `buildBlocks()` in this file de-duplicated consecutive
status events with the same label by SKIPPING the new one rather than
updating the existing block. The daemon emits two `status:
label='model'` events per turn — once after `session/new` returns
with the agent's default model, then again after
`session/set_config_option` succeeds with the user-selected model
(see `apps/daemon/src/acp.ts` ~lines 495 and 587). The dedupe kept
the first and skipped the second, so the badge stayed stuck on the
default.

Fix: update the existing block's detail to the latest value instead.
"Most recent detail wins" is more accurate than "first one wins
forever" for every status label that reaches this code path
(`'model'`, `'initializing'`, etc.; the filter at lines 1156-1162
already drops the labels we don't want to surface as badges:
streaming, starting, requesting, thinking, empty_response).

Adds two regression tests in
apps/web/tests/components/AssistantMessage.test.tsx:

- Two sequential `status: 'model'` events with different details
  render the second detail and not the first (the Bug A scenario).
- Two sequential events with the SAME detail still collapse to a
  single badge (no regression in the existing dedupe behavior).

`pnpm guard` clean; AssistantMessage tests pass 22/22. Manually
verified working in a local build against Devin for Terminal
`2026.5.6-4` with #1208 applied — badge now updates to the
selected model on every turn.
2026-05-12 19:28:35 +08:00
Nagendhra Madishetti
d60c4521bf
fix(daemon): mark ghost CLIs unavailable when --version probe is rejected by the OS (#658) (#1301)
* fix(daemon): mark ghost CLIs unavailable when --version probe ENOENTs

Settings > Execution & model > Local CLI kept advertising Codex CLI
with a stale version number after the user had uninstalled the
binary (#658). Root cause was the probe in
apps/daemon/src/runtimes/detection.ts swallowing every `--version`
failure and returning `available: true` anyway.

The detection flow normally does:

  resolveAgentExecutable(def, env) -> resolved path or null
  if null: return available: false  (handles "binary not on PATH")
  else: spawn `<resolved> --version` and read the first line

On the uninstall path though, `resolveAgentExecutable` can still
return a non-null value:

  - macOS / Linux: a leftover wrapper shim (e.g. an npm bin shim or a
    homebrew alias) survives the uninstall, but the underlying
    interpreter / target the shim invokes is gone, so execFile
    rejects with ENOENT.
  - Windows: a `.cmd` shim file is still on PATH but points at a
    deleted target.
  - Permissions: the binary file is still there but the executable
    bit was stripped, so execFile rejects with EACCES.

The previous catch arm was unconditional:

    } catch {
      // binary exists but --version failed; still mark available
    }

That branch was correct for the original case it was written for
(adapters whose `--version` flag is not supported, so execFile
returns non-zero but the binary itself is fine). It was wrong for
the OS-level rejections above, where the binary cannot be invoked
at all.

Tighten the catch:

  - ENOENT, EACCES, ENOTDIR: return `available: false` immediately
    (the OS itself rejected the spawn; the CLI is not invocable).
  - Anything else: fall through to the old "available, version
    unknown" branch so adapters that have no `--version` flag keep
    working.

Coverage in apps/daemon/tests/runtimes/probe-ghost-cli.test.ts:
five vitest cases. Three pin each spawn-error code to
available=false, one pins ETIMEDOUT to available=true (the
adapter-with-no-version-flag contract), one happy-path check
asserts a clean --version returns the parsed version string.

* test(daemon): use .js specifier in ghost-cli mock so NodeNext tsc accepts the import

The mock specifier and the typeof import(...) query in
probe-ghost-cli.test.ts both passed `../../src/runtimes/executables.ts`.
Under apps/daemon/tsconfig.tests.json's NodeNext module setting,
allowImportingTsExtensions is off, so tsc rejects .ts-extension
imports and the test typecheck gate fails. Switching to the .js
specifier mirrors the rest of the daemon test suite and the
production side's import path, and tsc -p tsconfig.tests.json now
exits clean.

* fix(daemon): cover stale-wrapper exits and stale-override fallback in ghost-CLI probe

Two gaps left from the first revision of the #658 fix, both surfaced
by lefarcen's review on PR #1301:

1. Stale wrapper shims commonly exit 126 ("not executable") or 127
   ("command not found") instead of rejecting at spawn time with
   ENOENT. execFile reports those as a numeric err.code equal to the
   exit status, so the original string-only ENOENT/EACCES/ENOTDIR
   guard missed them and still advertised the agent as available.
   The probe now also classifies exit code 126 / 127 as not invocable
   while leaving every other non-zero exit on the legacy "available,
   version=null" path so adapters with no --version flag are still
   not regressed.

2. A user with a stale CODEX_BIN override and a working binary on
   PATH would be locked out of Settings' "adopt detected binary"
   repair flow (PR #1205), because that flow gates on
   agent.available === true. The probe now consults
   inspectAgentExecutableResolution directly: when the selected
   override is not invocable but a distinct PATH candidate exists,
   the probe retries against the PATH candidate before giving up.
   That keeps Settings able to surface the working binary and
   adopt-or-clear the bad override.

Extracted probeVersionAtPath() as a tagged-result helper so the
not-invocable / spawned discriminator lives in one place and the
retry path can re-use it without duplicating the classifier.

Tests: probe-ghost-cli.test.ts grows from 5 to 11 cases covering the
two new exit-code shapes, the override-to-PATH fallback (success and
both-broken), a generic non-126/127 exit (must stay available), and a
no-distinct-PATH-candidate guard that pins the no-retry contract.

* fix(daemon): keep ghost-CLI detection aligned with chat-run resolution

Siri-Ray's review on #1301 pointed out that the earlier configured-
override fallback broke a load-bearing invariant: the previous
revision had detection retry a PATH binary when the configured
override failed to spawn, then report that PATH binary as
`available: true` with `path` pointing at it. But chat/run
resolution still goes through `resolveAgentBin -> resolveAgentExecutable`,
whose `selectedPath` prefers the configured override whenever the
file exists. The result: Settings would advertise Codex as
available at `/usr/local/bin/codex` while every actual chat send
would spawn the stale `/stale/custom/codex` and fail with the same
ghost error #658 was meant to fix. The bug got swapped from
Settings to chat instead of fixed.

The fix here probes the same path the run-time resolver picks. If
the configured override is stale (ENOENT / EACCES / ENOTDIR or the
126/127 wrapper-shim exit signature) the agent is reported
unavailable, full stop, even when a different PATH candidate
exists. That keeps the daemon-internal invariant Siri-Ray flagged
intact: spawning at chat time uses the same executable detection
reported as available.

The Settings repair flow (PR #1205) still has all the signal it
needs via `inspectAgentExecutableResolution` directly, which
exposes both `configuredOverridePath` and `pathResolvedPath`
independently of the detection result. The UI can decide whether
to surface an adopt-or-clear affordance based on the divergence
without needing `available: true` as a permission gate. If that
flow currently does gate on `available`, a follow-up PR can rewire
Settings to read the resolution diagnostic instead.

The 126/127 wrapper-shim exit classification (the other lefarcen
P2 fix) stays in place: numeric exit codes 126 and 127 are POSIX-
shell "not executable" / "command not found" and reliably signal a
shim whose target has been removed; generic non-zero exits (1, 2,
ETIMEDOUT) keep the legacy "available, version=null" contract so
adapters with no `--version` flag are not regressed.

Tests
- 3 OS errno cases (ENOENT / EACCES / ENOTDIR) -> available: false
- 2 stale-wrapper exit cases (126 / 127) -> available: false
- 1 generic non-zero exit (1) -> available: true, version: null
- 1 timeout case (ETIMEDOUT) -> available: true, version: null
- 1 happy path -> parsed version string
- New regression test for the Siri-Ray invariant: a stale
  configured override + working PATH-environment combination drives
  the real `resolveAgentExecutable` through `vi.importActual`, and
  pins that detection's reported `available` / `path` match
  whatever the resolver returns for the same configured env. A
  future refactor that diverges detection from resolution trips
  this assertion before merge instead of in production.

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
2026-05-12 19:25:00 +08:00
Joey-nexu
5077a1cd38
feat(landing-page): split catalog into per-facet pages + auto-deploy on content changes (#1158)
* feat(landing-page): split catalog into per-facet pages + auto-deploy on content changes

Convert the single-page landing into a content-driven multi-page site
sourced directly from the canonical Markdown bundles in the repo root,
and close the deploy loop so contributor edits go live without manual
follow-up.

## What's new

- `/skills/`, `/systems/`, `/craft/`, `/templates/` index + detail
  pages, generated from `skills/<slug>/SKILL.md`,
  `design-systems/<slug>/DESIGN.md`, `craft/*.md`, and
  `templates/live-artifacts/<slug>/README.md` via Astro content
  collections (`app/content.config.ts`). No mirroring of content into
  the landing-page package — `glob` re-scans on every build.
- Faceted sub-routes generated from frontmatter:
    - `/skills/mode/<slug>/`     — 8 pages (deck, prototype, image, …)
    - `/skills/scenario/<slug>/` — 18 pages after alias collapse
    - `/systems/category/<slug>/` — 21 pages
  Each page owns its own `<title>`, meta description, and
  `CollectionPage` JSON-LD; chips on the parent index pages are now
  real anchors that link to these facet routes.
- Updated top-bar nav (`_components/header.tsx`) to point at the new
  internal routes with live counts pulled from the catalog. Counts in
  the homepage hero meta description likewise driven by
  `getCatalogCounts()` so they never drift.
- Per-skill / per-template thumbnails. A Playwright generator
  (`scripts/generate-previews.ts`) walks every `example.html` and
  `templates/live-artifacts/<slug>/index.html`, screenshots them at
  1440×900@2x, and writes PNGs to `public/previews/`. The catalog
  data layer auto-detects presence and degrades gracefully when an
  artifact has no renderable HTML.

## Plumbing the auto-update loop

- `landing-page-deploy.yml` and `landing-page-ci.yml` now trigger on
  changes under `skills/`, `design-systems/`, `craft/`, and
  `templates/`. Without this, a contributor adding a new SKILL.md to
  `main` would silently skip the deploy and the published site would
  fall behind.
- Both workflows now install Playwright Chromium (cached by version)
  and run `pnpm previews` before `astro build`, so generated
  thumbnails ship in `out/previews/` automatically. Preview generation
  is `continue-on-error: true` — a single broken example.html should
  not block the deploy of the rest of the catalog.
- `apps/landing-page/public/previews/` is gitignored: the directory
  is owned by CI and would otherwise add ~70MB of binary churn to the
  repo on every regeneration.

## Tag canonicalization

- `app/_lib/catalog.ts` adds a small per-scope alias table so
  authoring drift like `od.scenario: operation` vs `operations`, or
  `live` vs `live-artifacts`, collapses to a single canonical route
  instead of leaking two near-empty pages. Mode and category alias
  tables are scaffolded but currently empty.

## Validation

- `pnpm --filter @open-design/landing-page typecheck` — 0 errors,
  0 warnings, 0 hints across 25 Astro files
- `pnpm --filter @open-design/landing-page build` — 341 pages built
  (1 home + 8 mode + 18 scenario + 21 category + N detail pages +
  sitemap + RSS), zero external JS, ≥16 Cloudflare-resized hero
  image URLs intact

## Why this matters

After merge, any push to `main` that adds, removes, or edits a skill,
design system, craft principle, or live-artifact template
automatically triggers a fresh build that:

1. picks up the new Markdown via the content-collection glob,
2. regenerates thumbnails for any matching example.html,
3. emits new sitemap entries and JSON-LD,
4. and ships to Cloudflare Pages — no landing-page-side change
   required.

* fix(landing-page): address review feedback on PR #1158

Five fixes from the review pass — none change scope, all close the
"contradictory totals" / "stale data" / "silent CI failure" gaps the
reviewers flagged.

## Hero / catalog claims now read live counts everywhere

`apps/landing-page/app/page.tsx` previously hardcoded `31` skills and
`72` systems in the hero copy and stat rings, while the nav and meta
description had already moved to `getCatalogCounts()`. After this PR
every visible "X skills / Y systems" claim — hero lead, hero stat
rings, capabilities cards body copy, labs section meta + filter pills,
selected-work fractions, the labs CTA, and the footer Library — reads
from a single `counts` prop. `Header` and `Page` now both require
`counts` (no optional fallback) so a future caller can never silently
publish stale numbers.

The labs-section filter pills also stop being decorative buttons:
they now link to the actual `/skills/mode/<slug>/` and `/skills/`
catalog routes the new multi-page architecture exposes.

## Craft README no longer publishes

`apps/landing-page/app/_lib/catalog.ts` filtered out `e.id !== 'README'`,
but Astro normalizes `craft/README.md`'s id to lowercase `readme`, so
the published site shipped `/craft/readme/` as a public craft principle
and the nav badge counted 12 instead of 11. Compare case-insensitively
(`e.id.toLowerCase() !== 'readme'`) so any future README casing is
also filtered out. Verified locally: `apps/landing-page/out/craft/`
now contains exactly 11 entries.

## Preview URL preserves actual file extension

`listPreviews()` was already discovering `.png`, `.webp`, `.jpg`, and
`.jpeg`, but `previewUrlFor()` always emitted `.png`, so a future
sharp/webp post-processor (or a manually committed template asset)
would mark the record as available while the rendered `<img src>`
404'd. Switched the structure from `Set<slug>` to `Map<slug, filename>`
and emit the actual on-disk filename verbatim.

## Preview script: per-artifact soft, systemic hard

Previously any single failed `example.html` capture exited the script
non-zero, which forced both workflows to mark the entire preview step
`continue-on-error: true`. That blanket tolerance also masked
systemic generator failures — a chromium launch that never finds the
browser binary would silently ship a deploy with zero thumbnails.

`scripts/generate-previews.ts` now distinguishes:

- per-artifact failures → logged and skipped, exit 0 (catalog
  degrades gracefully for those skills),
- discoverJobs / chromium.launch / 100%-failure run → exit 1
  (systemic, must fail the build).

Both workflows drop their `continue-on-error: true` flags so a real
problem actually surfaces.

## AGENTS.md reflects the multi-page architecture

`apps/landing-page/AGENTS.md` previously declared the landing page
single-route ("Not multi-page. There is exactly one route ('/')").
That guidance is now wrong — there are six top-level route groups
(`/`, `/skills/`, `/systems/`, `/craft/`, `/templates/`, plus their
facet variants). Updated to describe content-collection sourcing, the
no-mirror rule, the auto-deploy workflow contract, and the
"never hardcode catalog claims" boundary.

## Validation

- `pnpm --filter @open-design/landing-page typecheck` — 0 errors,
  0 warnings, 0 hints across 25 Astro files
- `pnpm --filter @open-design/landing-page build` — 340 pages built
  (was 341 before the README filter; the README route is now
  correctly absent), live counts visible in the built `out/index.html`:
  `driven by 125 composable skills and 149 brand-grade design systems`
- Verified `out/craft/` no longer contains `readme/`
- Verified preview URLs resolve to the actual on-disk filename via
  the regenerated catalog index page

* fix(landing-page): clean up live-artifact template name + summary parsing

Address @mrcfps's follow-up review on `0715d8c`. The
`shapeLiveArtifactTemplate()` parser was passing the README's H1
verbatim (literal backticks intact) and using the first non-empty
post-H1 line as the summary, even when that line was the
`> Category: **Live Artifacts**` editorial blockquote. Result:
`/templates/live-otd-operations-brief/` was shipping a
`<meta name="description" content=">">` and a card title with raw
Markdown noise — a regression for both SEO snippets and the
templates catalog at-a-glance scan.

## Two new shared helpers

- `stripMarkdownInline()` — strip backticks, asterisks, and link
  wrappers so `# \`otd-operations-brief\` · live-artifact template`
  becomes `otd-operations-brief · live-artifact template` before any
  further trimming.
- `extractFirstProseParagraph()` — walk the body after the H1 and
  skip blockquotes (`>`), list markers, table rows, fenced code, and
  HR rules. Stop at the first contiguous prose paragraph and pass it
  through `stripMarkdownInline()` so the result is human-readable.

Both helpers live next to `titleizeSlug()` and are used by
`shapeCraft()` and `shapeLiveArtifactTemplate()` so they share one
implementation.

## Live-artifact title boilerplate trim

Live-artifact READMEs commonly title themselves
`# \`<slug>\` · live-artifact template`. After stripping the inline
backticks the trailing `· live-artifact template` is redundant
("Templates" already groups them) and adds a wide noisy suffix on
catalog cards. Removed it via a narrow regex tail-strip.

## Result on the existing fixture

Verified locally for `templates/live-artifacts/otd-operations-brief/`:

- before: `<title>\`otd-operations-brief\` · live-artifact template …</title>`,
  `<meta name="description" content=">">`
- after:  `<title>otd-operations-brief — Open Design template</title>`,
  `<meta name="description" content="A drop-in html_template_v1
  live-artifact template for an editorial On-Time Delivery brief.
  It ships:">`

Typecheck 0/0/0, build 340 pages.

---------

Co-authored-by: Joey <joey@cursor.so>
Co-authored-by: Joey-nexu <236967869+joeylee12629-star@users.noreply.github.com>
2026-05-12 19:24:50 +08:00
Prantik Medhi
060540f73c
fix(web): improve design system search affordance (#1437) 2026-05-12 19:15:16 +08:00
Joey-nexu
4f70bf80fb
feat(web): add Discord community link in entry sidebar footer (#1391)
Adds a Discord invite (https://discord.com/invite/qhbcCH8Am4) as a foot-pill
sibling to the existing X follow link in EntryView's left sidebar bottom row.
Introduces a 'discord' icon to the inline SVG icon set, rendered with
fill=currentColor so it adopts local text color like the rest of the system.

CSS unchanged: reuses .foot-pill .foot-pill-follow.

Co-authored-by: Joey-nexu <236967869+joeylee12629-star@users.noreply.github.com>
2026-05-12 18:51:06 +08:00
Matt Van Horn
23218eacd9
refactor(settings): use tiled language picker instead of dropdown (#1406)
The Language section in Settings rendered a single-button dropdown
trigger that opened a floating menu. With one visible label and lots of
empty panel space, the layout misled users into thinking only one
language existed. Replace the dropdown trigger + portaled menu with an
inline tile grid that shows every locale at a glance and clicks
directly to switch.

Side effects of the new layout: the languageOpen / languageMenuRect
state, the dynamic placement effect, the resize-close effect, the
mousedown click-outside handler, and the languageRef are gone. The
global Escape handler no longer needs to guard against the menu being
open. CSS for .settings-language-picker, .settings-language-button,
.settings-language-menu, and .settings-language-option is replaced by
.settings-language-grid (auto-fill 180px minmax columns) +
.settings-language-tile.

Tests in SettingsDialog.execution.test.tsx that drove the dropdown
(click trigger → click menuitemradio → assert menu closed) are
rewritten to drive the tiles directly via the radio role.

Refs #1347
2026-05-12 17:49:04 +08:00
shangxinyu1
0220124e0f
test: add Memory and Routines coverage (#1400)
* test: align extended Playwright coverage with current UI behavior

* test: address extended suite review feedback

* test: fix Codex fallback config hydration in e2e

* test: add Memory and Routines coverage

* test: fix Memory and Routines component test typing

* test: include Memory and Routines e2e in extended suite
2026-05-12 17:48:56 +08:00
nettee
28d3e5faf5
Fix Codex wrapper launch paths (#1395) 2026-05-12 17:20:32 +08:00
Rocky
6c3fd86642
fix(daemon/acp): terminate ACP child after clean prompt completion (#1286)
* fix(daemon/acp): terminate ACP child after clean prompt completion (Bug B / #1265)

Some ACP agents (notably Devin for Terminal) keep the child process
alive after stdin closes, waiting for the next prompt. Open Design
spawns a fresh agent per chat turn and relies on child.on('close') to
finalize the run, so without an explicit signal-driven shutdown the
chat sits stuck in the 'working' state indefinitely.

Three small, targeted changes:

- apps/daemon/src/acp.ts: After a clean session/prompt response we
  schedule a 500ms grace period and then SIGTERM the child. This
  mirrors the pattern detectAcpModels() already uses after model
  discovery. The grace period leaves well-behaved agents that exit on
  stdin.end() unaffected.
- apps/daemon/src/acp.ts: New completedSuccessfully() method on the
  session handle reports whether the prompt resolved without a fatal
  error or abort, so the consumer can distinguish 'clean signal exit'
  from 'genuine signal failure'.
- apps/daemon/src/server.ts: child.on('close') now treats a SIGTERM
  exit as 'succeeded' when acpSession.completedSuccessfully() is true.
- apps/web/src/providers/daemon.ts: Trust the server's authoritative
  endStatus; the signal/non-zero-code safety net no longer overrides
  an explicit 'succeeded' status, so the chat doesn't surface a fake
  'agent exited with signal SIGTERM' error after a clean ACP run.

Daemon tests cover the SIGTERM grace timer, clean early-exit (timer
cleared), and completedSuccessfully() abort/error states. Manual UI
test on plain main + this fix confirms Devin chats now return to ready
automatically after Done · ...

* fix(daemon/connectionTest): treat ACP clean SIGTERM as success

Codex review on #1286 caught that the new SIGTERM in attachAcpSession
breaks ACP connection tests for agents that don't shut down on
stdin.end() (the exact Devin behavior the patch targets).

attachAgentStreamHandlers() in connectionTest.ts now also respects
acpSession.completedSuccessfully(), mirroring the same check we apply
in server.ts. Without this, a clean prompt response followed by our
SIGTERM would set winner.signal === 'SIGTERM', flip exitedCleanly to
false, and the connection test would report 'agent_spawn_failed'
even when the agent had returned a healthy response.

Also widened the AgentSpawnHandle type so completedSuccessfully is
visible on the structural type used inside connectionTest.ts.

All 56 daemon tests still pass; typecheck + guard clean.

* fix(daemon/acp): narrow ACP success-on-signal override to forced-SIGTERM

Looper review on #1286 caught that the success predicate was broader
than the SIGTERM case it was meant to handle. `completedSuccessfully()`
flips to true as soon as the ACP `session/prompt` response is
processed, but it does not say why the child later closed. With the
broad predicate, an ACP agent that returned a prompt result and then
exited with code 1 (or was killed by SIGKILL/SIGSEGV) was still marked
'succeeded', regressing the existing close-status behavior for genuine
post-response process failures.

Scope the override to the exact forced-shutdown shape this PR
introduces:

  code === null && signal === 'SIGTERM' && acpCleanCompletion

Applied to both `server.ts` (chat run finalization) and
`connectionTest.ts` (connection-test classification). Any other
post-response failure now falls through to 'failed' / 'agent_spawn_failed'
as before.

All 59 daemon tests still pass; typecheck + guard clean.

* fix(web/daemon): only bypass exit-code safety net on explicit server success

Looper review on #1286 caught that the previous web change trusted
`endStatus === 'succeeded'` absolutely, but `endStatus` can become
'succeeded' in two distinct ways:

1. The SSE end event explicitly carries `status: 'succeeded'`
   (authoritative server declaration).
2. The end event omits or has an invalid `status` field and the
   handler silently falls back to 'succeeded' as a local default.

Both produced `endStatus === 'succeeded'` in the existing code, so
the new safety-net bypass treated them identically. That regressed
backward compat: a compatible or older daemon emitting an end event
like `{code:1}` or `{code:null,signal:"SIGTERM"}` with no
`status` would suddenly skip the failure banner.

Track explicit success separately via `serverDeclaredSuccess`,
set true only when:

- The SSE end event has `status === 'succeeded'`, or
- The fallback `fetchChatRunStatus` REST path returns
  `status === 'succeeded'` (which the existing `isChatRunStatus()`
  guard already proves is explicit).

The safety net is now bypassed only on that explicit signal; the
local-fallback success path still reaches the exit-code/signal
check so real failures surface as before.

Adds three web-side regression tests in `apps/web/tests/providers/sse.test.ts`:

- Explicit `status: 'succeeded'` + SIGTERM → onDone called, no error
- End event with `{code:1}` and no `status` → onError surfaces
  'agent exited with code 1' as before
- End event with `{code:null,signal:'SIGTERM'}` and no `status` →
  onError surfaces 'agent exited with signal SIGTERM' as before

`pnpm guard` + daemon typecheck clean; 27/27 SSE tests pass (up
from 24).
2026-05-12 17:13:10 +08:00
Yiang Yiyan
5ff578dc8d
fix: support object-style question-form options (#1293)
* fix: support object-style question-form options

* fix: preserve stable option values in form submissions
2026-05-12 17:03:45 +08:00
Mason
2f51f3c1ae
feature: refine assistant artifact feedback (#1379)
* feature: refine assistant artifact feedback

* fix: clear hidden custom feedback reason

* test: update assistant feedback expectations
2026-05-12 17:00:42 +08:00
Nagendhra Madishetti
4d0ea247a7
fix(desktop): swallow setTypeOfService EINVAL crashes in dev main (#647) (#1298)
* fix(desktop): swallow harmless setTypeOfService EINVAL crashes in dev main

The packaged Electron entry (apps/packaged/src/logging.ts) already
filters the undici "setTypeOfService EINVAL" crash that issue #895
introduced for the prod build, but the dev / source-built desktop
entry was missing the parallel guard. Result: switching settings
tabs in a from-source desktop run could fire a fresh fetch, undici
would try to set IP_TOS on the outbound socket, the kernel would
refuse on certain macOS / VPN configurations, and the rejection
bubbled to Electron's default handler as the "JavaScript error in
the main process" dialog reported in issue #647.

Add the same defensive filter to apps/desktop:

  - isHarmlessSocketOptionError matches only the canonical undici
    shape (syscall name AND EINVAL code). A contradicting code
    (EACCES, EPERM, etc) explicitly fails the match so real bugs
    don't get hidden.
  - The uncaughtException handler logs harmless cases at warn and
    returns silently. For anything else it removes itself from the
    listener list and re-throws via setImmediate, restoring Node's
    default crash path so Electron's native dialog renders exactly
    as it would without this filter.
  - unhandledRejection mirrors the same harmless / fall-through
    split.

The filter is installed BEFORE app.whenReady so it is armed by the
time the renderer fires its first fetch.

The helper is duplicated rather than imported from apps/packaged
because AGENTS.md forbids cross-app private-source imports. The file
header calls out the parallel and notes that the two copies should
stay in sync until the helper is promoted to a shared workspace
package (follow-up); the contract is identical so a regression in
one will surface in the other's test suite.

Tests in apps/desktop/tests/main/uncaught-exception.test.ts mirror
apps/packaged/tests/logging.test.ts: 8 cases pinning the matcher
shape, 2 cases pinning the handler's harmless-log-warn vs
fall-through-rethrow split.

Validated: pnpm guard, pnpm --filter @open-design/desktop typecheck,
pnpm --filter @open-design/desktop build, and pnpm --filter
@open-design/desktop test (14 passed, 10 new).

* fix(desktop,packaged): fail-fast on non-harmless unhandled rejections

The previous unhandledRejection listeners logged non-harmless reasons
and returned, which kept the main process alive after any rejected
promise. A real bug, a failed IPC registration, or any unexpected
async exception was reduced to a console line instead of surfacing
through Node/Electron's default crash path the filter was meant to
preserve.

Both copies now route non-harmless rejections through a parallel
factory (createDesktopUnhandledRejectionHandler /
createFatalUnhandledRejectionHandler) that mirrors the
uncaughtException policy: harmless setTypeOfService EINVAL shapes log
at warn and return, anything else logs at error, removes the
listener, and re-throws via setImmediate. Listener removal happens
before the scheduled throw, so the rethrown reason lands in the
uncaughtException path with no recursion.

Tests cover the harmless branch, the detach + ordered rethrow, and
non-Error / primitive rejection reasons (Promise.reject(42)) which
must fall through. Desktop suite: 13/13, packaged suite: 16/16.
Flagged on PR #1298 by Siri-Ray and the codex P2 review thread; the
two file copies stay in lockstep per the AGENTS.md sync invariant.

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
2026-05-12 16:50:57 +08:00
mrzhangkris
54516a9866
fix: truncate long template names on project cards (#1220) (#1302)
Add min-width: 0 to .design-card-name so text-overflow: ellipsis
works correctly in flex layouts. Long template names were pushing
the task execution status (Running, Failed, etc.) out of view on
project cards.

Closes #1220

Co-authored-by: laomo <laomo@openclaw.ai>
2026-05-12 16:49:35 +08:00
Bryan
1fb099916c
feat(daemon): export self-contained HTML via /export/*?inline=1 endpoint (#1312)
* test(daemon): add Red unit tests for inlineRelativeAssets helper

14 cases pinning the behavior contract for the upcoming
apps/daemon/src/inline-assets.ts helper:

- link/script inlining with verbatim body preservation
- non-src script attrs preserved (type=module, defer, crossorigin)
- relative path resolution (root + nested + deep-nested owners)
- self-closing and single-quoted attr forms
- negative cases: missing rel, rel=preload, absolute/data/blob/leading-slash
- escaping: </style and </script inside body
- null-fileReader graceful degradation
- duplicate identical tags fully replaced (diverges from
  apps/web/src/components/FileViewer.tsx:5313's first-match-only;
  locked decision per plan §3.3)
- HTML-escaped data-od-inline-asset attr

Tests intentionally Red — module ../src/inline-assets.js does not yet
exist. Phase B-G of plan declarative-roaming-gosling.md will turn them
green by porting FileViewer.tsx:5248-5354 server-side.

Refs nexu-io/open-design#368.

* feat(daemon): port inlineRelativeAssets server-side for export endpoint

Adds apps/daemon/src/inline-assets.ts — a pure helper that takes
(html, ownerFileName, fileReader closure) and returns the HTML with
every relative <link rel=stylesheet> and <script src> contents inlined
into <style data-od-inline-asset="…">/<script>…</script> blocks. The
fileReader closure keeps the helper free of fs/Express coupling so the
route handler owns the filesystem boundary.

Port source: apps/web/src/components/FileViewer.tsx:5248-5354 — five
functions (inlineRelativeAssets, resolveProjectRelativePath, baseDirFor,
readHtmlAttr, escapeHtmlAttr). The fetch hop becomes the fileReader
closure; replace-all replaces first-match-only per locked design
decision §3.3 (inline comment in inline-assets.ts cites the divergence
from FileViewer.tsx:5313 and notes the web inline path is on a
deprecation track since PR #384 made URL-load the default).

Phase B-G of plan declarative-roaming-gosling.md. All 14 unit cases
from the Red commit (a60a9023) now pass; tightens one case to use a
realistic '&'-only filename (the original `<`/`>`-bearing filename was
unreachable in real filesystems and exposed a regex limitation the
web client carries too).

Daemon delta: +14 tests (1704 → 1718). Typecheck clean.

Refs nexu-io/open-design#368.

* test(daemon): add Red integration tests for /export/*?inline=1 route

9 HTTP cases against GET /api/projects/:id/export/*?inline=1:

- 3-file React-ish layout returns self-contained HTML (wiring guard:
  body assertions catch removal of the await inlineRelativeAssets(...)
  line, not just helper-internals changes)
- missing inline / non-canonical values (0, false, foo, empty) → 400
- non-HTML file → 400 UNSUPPORTED_FILE_TYPE
- missing file → 404 FILE_NOT_FOUND
- invalid project id (..) → some 4xx (Express normalizes before route)
- null-origin OPTIONS preflight → 204 + Access-Control-Allow-Origin: *
- missing sibling asset → 200 with <link> tag intact, other asset inlined
- nested HTML entry (pages/index.html + ../shared/util.js) → 200 inlined

8 of 9 tests Red (404 / 403); the invalid-project-id case is tolerant
about how Express rejects .. so it accidentally passes Red — Green
will tighten to 400 BAD_REQUEST via isSafeId.

Phase C-R of plan declarative-roaming-gosling.md. C-G will register
the route in apps/daemon/src/import-export-routes.ts.

Refs nexu-io/open-design#368.

* feat(daemon): wire GET /api/projects/:id/export/*?inline=1 endpoint

Adds the export-inline endpoint into registerProjectExportRoutes
(import-export-routes.ts) alongside /export/pdf and /archive. The
route:

- Validates project id via ctx.validation.isSafeId
- Requires ?inline=1 (accept-list: 1 / true / yes / on, matching Part
  1's parseForceInline at file-viewer-render-mode.ts:59-66)
- Reads the owner HTML via ctx.projectFiles.readProjectFile; maps
  ENOENT to 404 FILE_NOT_FOUND, everything else to 400 BAD_REQUEST
- Gates non-HTML callers with 400 UNSUPPORTED_FILE_TYPE
- Builds a fileReader closure that silently returns null on any sibling
  read failure (failure-local, not fatal — matches the web client's
  null-filter at FileViewer.tsx:5311)
- Hands the buffer + relPath to inlineRelativeAssets and returns the
  result as text/html

DI: RegisterProjectExportRoutesDeps gains 'projectFiles' | 'validation';
server.ts:2879 passes the corresponding deps. Mirrors the dep shape of
RegisterFinalizeRoutesDeps used by PR #832's /finalize/anthropic.

Null-origin support intentionally omitted (decision §10 in the PR
description): the daemon's null-origin allowlist is /raw/* and
/codex-pets/.../spritesheet only, and export consumers are same-origin
UI or server-side tooling — sandboxed-iframe srcdoc previews fetch
/raw/* instead. Integration test #7 pins the 403 contract so a future
allowlist change is deliberate.

Phase C-G of plan declarative-roaming-gosling.md. All 23 tests green
(14 unit + 9 integration); full daemon suite 1727 passing (delta +9
over B-G's 1718). Typecheck clean.

Refs nexu-io/open-design#368.

* test(daemon): add Red regression for inlined-body tag-literal corruption

Reproduces the correctness bug Siri-Ray (looper) and codex-bot flagged
on PR #1312: the reduce/split-join approach in inlineRelativeAssets
re-scans the progressively mutated HTML, so a tag literal that happens
to appear inside an already-inlined asset body gets the inner literal
also replaced — corrupting the body and producing duplicate inlining.

Concrete reproducer (CSS, where </style escape doesn't touch <link>):

  HTML:    <link rel="stylesheet" href="a.css">
           <link rel="stylesheet" href="b.css">
  a.css:   /* see also <link rel="stylesheet" href="b.css"> */
  b.css:   body{color:red}

Under split/join the second pass splits on `<link rel="stylesheet"
href="b.css">` and matches BOTH the real outer tag AND the literal
inside a.css's comment. Result: b.css's <style> block is injected
inside a.css's comment, and b.css gets inlined twice.

Phase F-R of plan declarative-roaming-gosling.md (post-PR-#1312
review round). F-G will rewrite the helper to collect matches by
position in the original HTML and concat slices in a single pass,
so already-inlined content is never re-scanned.

Refs nexu-io/open-design#1312 review threads at
apps/daemon/src/inline-assets.ts:122 (Siri-Ray looper + codex bot).

* feat(daemon): replace inliner reduce/split-join with position-based concat

Fixes the inlined-body tag-literal corruption Siri-Ray (looper) +
codex-bot flagged on PR #1312. The previous `replaceAllOccurrences`
(`source.split(from).join(to)`) re-scanned the progressively mutated
HTML on each pass, so a tag literal that appeared inside an already-
inlined CSS/JS body got the inner literal replaced too, producing
duplicate inlining and corrupted bodies.

New shape: collect every match's {start, end} byte span from the
ORIGINAL html via `matchAll`, await the per-match replacements in
parallel, sort by start, and concat slices of the original html with
the replacement strings in a single pass. Text introduced by an
earlier replacement is never scanned for matches.

The dup-tag fix (decision §8 — replace every occurrence, not
first-match-only) is preserved: every original-tag position gets its
own slice, so all duplicates are inlined.

Also extracts buildInlineStyleBlock / buildInlineScriptBlock so the
match-collection loops stay readable.

Phase F-G of plan declarative-roaming-gosling.md. Regression test
(c809bccc) goes Green; all 24 unit + integration tests pass; daemon
suite still clean.

Refs nexu-io/open-design#1312.

* test(daemon): add Red CSP-sandbox test + P3 coverage gaps from PR #1312 review

Three tests covering lefarcen's review on PR #1312:

1. [Red] CSP sandbox header (P2, lefarcen @ import-export-routes.ts:423).
   Top-level browser navigation to /export/*?inline=1 sends no Origin
   header, so the daemon middleware lets it through and any JS in the
   exported document runs with daemon-origin privileges. Asserts the
   response sends `Content-Security-Policy: sandbox allow-scripts` so
   the browser treats it as a sandboxed iframe with an opaque origin
   (scripts still run, but no cookies / no /api/ access). This test
   fails until G1-G adds the header in the handler.

2. [Green-on-commit] Accept-list cases (P3, lefarcen @ test.ts:262).
   PR body decision §7 promises `inline=true/yes/on` case-insensitive,
   but round-1 tests only exercised inline=1. Pin the full accept list
   (true / yes / on + TRUE / Yes / ON). Already passes — the route's
   parser already implements the accept list; this just makes the
   contract testable.

3. [Green-on-commit] isSafeId guard (P3, lefarcen @ test.ts:287).
   Previous `..` test was normalized by Express before reaching the
   route. New input uses `bad!id` (URL-safe, but outside isSafeId's
   /^[A-Za-z0-9._-]+$/ char class), so Express passes it into
   req.params unchanged and isSafeId rejects with the documented
   400 BAD_REQUEST envelope.

Phase G1-R / H of plan declarative-roaming-gosling.md. Refs
nexu-io/open-design#1312 review comments.

* feat(daemon): send Content-Security-Policy: sandbox allow-scripts on /export

Closes the same-origin XSS surface lefarcen flagged on PR #1312 (P2
at import-export-routes.ts:423): top-level browser navigation to the
export URL sends no Origin header, so the daemon's /api middleware
admits the request and any JS in the exported document executes with
daemon-origin privileges (cookies, /api/, localStorage).

`Content-Security-Policy: sandbox allow-scripts` on the response
makes the browser treat the document as a sandboxed iframe with an
opaque origin. Scripts still execute (necessary for the screenshot
use case — the whole point of inlining JS), but they cannot read
cookies, hit /api/, or otherwise escalate to the daemon's origin.

Phase G1-G of plan declarative-roaming-gosling.md. Daemon delta: +3
tests (the Red CSP test from 58151356 turns Green; the P3 coverage
gap tests stay green).

Refs nexu-io/open-design#1312.

* test(daemon): add Red regression for <link> stylesheet attr preservation

Currently `<link rel="stylesheet" href="print.css" media="print">`
becomes a plain `<style data-od-inline-asset="print.css">…</style>`
with no media query — print-only styles apply unconditionally. Same
problem for `title` (alternate stylesheet sets), `disabled` (initial
disabled state), and `nonce` (CSP nonce). All four are valid on
both `<link rel=stylesheet>` and `<style>` per HTML spec, so the
inliner must carry them across.

PR #1312 round-2 review (lefarcen P2 @ inline-assets.ts:44). Phase
G2-R; G2-G will extend buildInlineStyleBlock to copy the four attrs
off the source <link>.

Refs nexu-io/open-design#1312.

* feat(daemon): preserve <link> stylesheet semantics on inlined <style>

Closes lefarcen's P2 review note on PR #1312 (inline-assets.ts:44):
`<link rel="stylesheet" href="print.css" media="print">` was becoming
a plain <style> with no media query, so print-only styles applied
unconditionally. Same issue for `title` (alternate stylesheet sets),
`disabled` (initial disabled state), and `nonce` (CSP nonce).

buildInlineStyleBlock now carries four attrs across from the source
<link>:
  - media, title, nonce  (value attrs, HTML-escaped via escapeHtmlAttr)
  - disabled             (boolean attr — copied as bare presence)

Other <link> attrs (rel, href, type, crossorigin, integrity,
referrerpolicy) don't apply to <style> and are intentionally dropped.

New `hasBooleanHtmlAttr` helper distinguishes presence-as-attr from
substring-inside-another-attr-value via a regex that requires a
word boundary after the name (whitespace, `=`, or `>`).

Phase G2-G of plan declarative-roaming-gosling.md. All 28 tests pass.

Refs nexu-io/open-design#1312.

* docs(daemon): narrow inliner contract claim + document size-limit policy

Closes lefarcen's P2 review notes on PR #1312:

1. "Self-contained" incomplete (inline-assets.ts:67): the helper
   only rewrites top-level <link rel=stylesheet> / <script src>.
   `<img src>`, CSS `url(...)`, CSS `@import`, ES module imports,
   font sources, and similar remain external in the response. The
   PR title/body claimed "self-contained HTML" which over-promised
   for screenshot tooling expecting bundled images/fonts.

   Module docstring now enumerates the full not-rewritten list and
   names the screenshot path as the primary use case (headless
   browser fetches each external asset on render, so inline-CSS-
   and-JS-only is sufficient). The route handler comment block
   mirrors the contract.

   A fully offline export with image/font bundling is filed as a
   follow-up — out of scope for this PR.

2. No response cap (inline-assets.ts:72): the helper does
   concurrent reads + multiple string copies and could spike daemon
   memory. The daemon is local-first (single-user, developer's
   machine — see open_design_architecture.md), so the effective
   ceiling is the size of the user's own project. The docstring now
   states this rationale and names the conditions under which a
   bounded-concurrency reader and output-size limit would be
   needed (non-trusted callers).

Docs-only — no behavior change, all 28 tests still pass.

Refs nexu-io/open-design#1312.

* test(daemon): add Red regression for hasBooleanHtmlAttr quoted-value match

PR #1312 round-2 review (lefarcen P3): `hasBooleanHtmlAttr` tests the
tag string with no attr-quoting awareness, so the literal text
`disabled` appearing inside any quoted attribute value followed by
another whitespace char satisfies `\sdisabled(?=\s|=|/?>)`.

  <link rel=stylesheet href=x.css data-note="content disabled stuff">

emits a <style disabled> block, silently disabling a stylesheet the
author wrote without that attr.

Also adds a counterweight test for the legitimate-disabled case
(<link … disabled>) so the next-commit fix doesn't over-correct
and start dropping real boolean attrs.

Phase I3-R of plan declarative-roaming-gosling.md (post-PR-#1312
round-2 review). I3-G will strip quoted attribute values from the
tag string before testing for the bare attr.

Refs nexu-io/open-design#1312.

* feat(daemon): make hasBooleanHtmlAttr quote-aware to avoid false positives

Closes lefarcen's P3 review note on PR #1312:
`hasBooleanHtmlAttr` previously ran `\sname(?=\s|=|/?>)` over the
full tag string, so the literal text `disabled` appearing inside any
quoted attribute value followed by whitespace satisfied the regex.
Source tags like `<link rel=stylesheet href=x.css
data-note="content disabled stuff">` were emitting a <style
disabled> block — silently disabling a stylesheet the author wrote
without that attr.

Fix: strip `="…"` and `='…'` substrings out of the tag with two
regex passes BEFORE testing for the bare attr. The lookahead still
requires `\s|=|/?>` after the attr name, so `<link disabled>`,
`<link disabled="">`, `<link disabled/>`, etc. all match — but the
attr name as a substring of any quoted value cannot match because
values have been stripped to `""` / `''`.

Phase I3-G of plan declarative-roaming-gosling.md. All 30 tests
green (28 prior + 2 round-3 regression cases: false-positive and
legitimate-disabled). Refs nexu-io/open-design#1312.

* test(daemon): add Red cap-enforcement tests + scaffold InlineOptions

PR #1312 round-2 review (lefarcen P2 — still open): round-2 only
documented that no cap is enforced. Reviewer pushed back: the helper
still builds unbounded candidate arrays + runs Promise.all over all
asset reads + concatenates the full output in memory. Need actual
limits in code.

This commit adds the Red test surface that drives the next commit's
enforcement:

  - InlineAssetsLimitError("owner") when owner HTML > maxOwnerBytes
  - InlineAssetsLimitError("candidates") when tag matches > maxCandidates
  - Per-asset graceful: oversized asset → tag stays as URL ref
  - InlineAssetsLimitError("total") when assembled output > maxTotalBytes
  - Bounded read concurrency: peak in-flight reads ≤ maxReadConcurrency
  - Integration: route maps the throw to 413 PAYLOAD_TOO_LARGE

InlineOptions interface is added to the helper signature as a no-op
test-door (per feedback_test_doors_over_fake_timers.md), so tests
can exercise tiny fixtures while production callers use module-level
defaults. The next commit (H3-G) wires the enforcement.

Phase H3-R of plan declarative-roaming-gosling.md. Daemon delta on
this commit: +6 tests (5 unit + 1 integration), all Red.

Refs nexu-io/open-design#1312.

* feat(daemon): enforce inliner caps + map limit errors to 413 PAYLOAD_TOO_LARGE

Closes lefarcen's still-open P2 review on PR #1312 round 2 ("the
code still builds unbounded candidate arrays + Promise.all over all
asset reads + concatenates the full output in memory"). Caps are now
enforced in code with the documented defaults:

  MAX_INLINE_OWNER_BYTES       = 2 MiB
  MAX_INLINE_ASSET_BYTES       = 5 MiB per sibling
  MAX_INLINE_CANDIDATES        = 500 link/script matches
  MAX_INLINE_TOTAL_BYTES       = 50 MiB assembled output
  MAX_INLINE_READ_CONCURRENCY  = 8 simultaneous fileReader calls

Enforcement points:

- Owner cap (input): fires immediately at function entry. Cheap —
  Buffer.byteLength of the already-decoded UTF-8 string.
- Candidate cap (planning): fires after matchAll, BEFORE any sibling
  read. Pathological HTML with thousands of <link>/<script src>
  tags is rejected without opening a single file descriptor.
- Asset cap (per-sibling): post-read length check; oversized assets
  return null from the wrapped reader, so the tag stays as a URL ref
  and the response is still 200. This is the only "graceful" cap —
  one bad asset doesn't fail the whole export.
- Total cap (output): tracked across the slice-and-concat loop,
  guarding both preserved-html slices AND injected replacements.
- Concurrency cap (planning): a tiny in-module runWithConcurrency
  worker-pool keeps at most maxReadConcurrency fileReader calls in
  flight, with order-preserving results.

`InlineAssetsLimitError` carries a `limit` discriminator so logs and
clients can disambiguate owner/asset/candidates/total. The route
handler catches it and emits 413 PAYLOAD_TOO_LARGE.

Drive-by error-envelope fix while in the route: UNSUPPORTED_FILE_TYPE
(an unregistered ApiErrorCode) → UNSUPPORTED_MEDIA_TYPE (the
canonical code) with HTTP 415. The round-1 string was a slip;
caught by reading packages/contracts/src/errors.ts:11 while wiring
PAYLOAD_TOO_LARGE.

Phase H3-G of plan declarative-roaming-gosling.md. All 36 tests
green (28 prior + 2 round-3 quoted-attr + 5 cap unit + 1 cap
integration). Refs nexu-io/open-design#1312.

* feat(daemon): enforce inliner caps pre-buffer via AssetHandle contract

Closes lefarcen's still-open P2 review on PR #1312 round 3 ("the
helper enforces maxTotalBytes only after all candidate assets have
already been read and converted to replacement strings" /
"maxAssetBytes is checked after fileReader fully buffers each
sibling"). Round-3 caps were defensive against the final output
size but did not bound peak memory during read fanout — 500 assets
at 5 MiB each could materialize ~2.5 GiB before the 413 fired.

Contract change: InlineAssetReader now returns `AssetHandle | null`
where AssetHandle is `{ readonly size: number; read(): Promise<...> }`.
Callers expose `size` from a cheap stat-equivalent (the route uses
`resolveProjectFilePath`) and defer the full materialization to
`read()`. The helper checks size against maxAssetBytes BEFORE
invoking read, and against the running total BEFORE the reservation
is committed.

Enforcement flow inside runWithConcurrency:

  1. await fileReader(p.resolved) → cheap stat-only call
  2. if (handle.size > maxAssetBytes) return null   ← pre-buffer
  3. if (runningBytes + handle.size > maxTotalBytes) ← pre-buffer
     totalAborted = true; return null
  4. runningBytes += handle.size                    ← reserve
  5. await handle.read()                            ← only now
  6. if (read returned null) runningBytes -= refund

`totalAborted` is a shared flag the workers check at entry, so
once the running total hits the cap, no new reads start. With
maxReadConcurrency = 8, at most ~8 stat-side calls finish after
abort — peak memory bounded.

The concat-time guard stays as the exact final assertion (the
pre-buffer reservation is approximate — it counts the original tag
bytes and skips wrapper overhead).

Route closure updated to do `resolveProjectFilePath` first, then
`readProjectFile` inside the deferred `read()`. Test reader helpers
(`readerFrom` + the concurrency-test reader) updated to the new
shape.

Two new unit tests pin the pre-buffer semantics:

  - `maxAssetBytes` is checked via handle.size BEFORE handle.read()
    (the reader's `read()` throws — must never run)
  - Running total abort stops further reads once exceeded (counting
    reader observes ≤ 2 reads when cap should fire after the first)

Phase K of plan declarative-roaming-gosling.md (post-PR-#1312
round-3 review). All 38 tests green (36 prior + 2 round-4
pre-buffer cases).

Refs nexu-io/open-design#1312.

* test(daemon): add Red test pinning owner pre-buffer 413 before mime 415

PR #1312 round-5 (lefarcen P2): the route currently reads the owner
file with readProjectFile() before any size check, so a 100 MiB owner
HTML is fully buffered into memory before the helper's ownerBytes
check fires. The fix is to stat with resolveProjectFilePath first,
reject pre-buffer with 413 PAYLOAD_TOO_LARGE on oversize, then fold
in the mime check (still 415 on mismatch, now pre-buffer), then
readProjectFile when both gates pass.

The Red→Green discriminator is the combination 'oversize AND
non-HTML': pre-fix the route reads the buffer first and the
text/plain mime check fires → 415; post-fix the route stats first
and the size check fires before the mime check → 413. Asserting
'got 413, not 415' pins both the pre-buffer property and the check
ordering (size before mime, per lefarcen's locked round-5 sequence).

2 MiB+1 byte fixture is acceptable in test setup; MAX_INLINE_OWNER_BYTES
is the production 2 MiB so no test-door is needed.

Red verified: AssertionError: expected 415 to be 413 (pre-fix flow
reads → mime → 415).

* feat(daemon): stat owner before readProjectFile in /export route to bound owner pre-buffer

PR #1312 round-5 (lefarcen P2 confirmed at PR-1312#issuecomment-4424868413
follow-up): the route previously called readProjectFile() unconditionally
on the owner, so a 100 MiB owner HTML was fully buffered into memory
before the helper's ownerBytes check fired with InlineAssetsLimitError
('owner'). That meant the 413 envelope returned to the caller but only
after peak memory had already hit the file size.

Fix mirrors the sibling-asset stat-then-read contract round 4 added via
the AssetHandle interface: call resolveProjectFilePath first (cheap
stat), reject pre-buffer with 413 PAYLOAD_TOO_LARGE on size >
MAX_INLINE_OWNER_BYTES, fold in the mime check (still 415
UNSUPPORTED_MEDIA_TYPE on mismatch, now also pre-buffer per lefarcen's
'fold-in is welcome'), then readProjectFile() only when both gates
pass. Size check fires before mime check, so an oversize non-HTML file
returns 413 rather than 415 — the observable Red→Green discriminator
for this round.

The helper's ownerBytes check (inline-assets.ts:127-133) stays as
defense-in-depth for direct in-process callers that skip the route and
for any drift between stat-reported size and the bytes returned by
readFile.

Verifies the round-5 Red at apps/daemon/tests/export-inline-route.ts
('returns 413 (not 415) for an oversize non-HTML file'). Daemon suite
1743/1743 passing.

* test(daemon): add Red test pinning stat-vs-actual byte reconciliation

PR #1312 round-5 (lefarcen P3 confirmed at PR-1312#issuecomment-4424868413
follow-up): the helper trusts handle.size for the running-total guard
and never reconciles with the actual byte length of content unless the
per-asset cap is exceeded. A reader that under-reports size (stale
stat, UTF-8 expansion at decode, sparse file, deliberate lie) can let
many strings materialize in memory before the concat-time guard at
the bottom of inlineRelativeAssets throws — defeating the round-4
pre-buffer cap intent.

Fix is lefarcen-confirmed path-a: post-read, the helper computes
actualBytes = Buffer.byteLength(content, 'utf8'), reconciles
runningBytes (add actualBytes, refund handle.size), and if running
total exceeds maxTotalBytes flips totalAborted = true and returns
null. Subsequent workers see totalAborted before invoking their own
read(). Helper still throws InlineAssetsLimitError('total') after
Promise.all settles — preserving the round-2/3/4 graceful-fallback
pattern instead of racing throws across in-flight workers.

Red→Green discriminator is read count. Pre-fix the helper trusts
the lying handle.size (10), so both reads complete (each returning
1000 bytes) under the reservation total of 56+10+10=76 < cap 500.
The concat-time guard then catches the 2000+-byte assembly and
throws 'total' — but only after both reads materialized in memory.
Post-fix worker 1's reconciliation trips totalAborted as soon as
actualBytes (1000) is folded into runningBytes; worker 2 skips its
read.

Red verified: AssertionError expected 1, received 2 (pre-fix flow
completes both reads before concat-guard fires).

* feat(daemon): reconcile inliner reservation with post-read actual bytes

PR #1312 round-5 (lefarcen P3 confirmed at PR-1312#issuecomment-4424868413
follow-up, path-a): the helper trusted handle.size for the running-
total guard and only reconciled with actual bytes for the per-asset
cap. A reader that under-reported size — stale stat, UTF-8 decode
expansion at read time, sparse file, deliberate lie — could let
many strings materialize before the concat-time guard at the bottom
of inlineRelativeAssets caught the excess. That defeated the
round-4 pre-buffer cap intent.

Fix: after a successful read(), compute actualBytes =
Buffer.byteLength(content, 'utf8'), reconcile runningBytes by
folding in (actualBytes - handle.size), and re-check the total cap.
If the reconciliation pushes runningBytes past maxTotalBytes,
drop the asset's inlining (tag stays as URL ref), set
totalAborted = true to block subsequent worker reads, and let
Promise.all settle. The helper then throws
InlineAssetsLimitError('total') below — matching the round-2/3/4
graceful-fallback pattern (no throw-before-settle race between
in-flight workers).

The per-asset cap check at line 228 is preserved for stat-lying
readers that blow a single asset past maxAssetBytes; that branch
refunds handle.size and drops without flipping totalAborted, so
sibling assets still get a fair shot.

Verifies the round-5 Red at apps/daemon/tests/export-inline-route.ts
('reconciles handle.size with actual content bytes'). Daemon suite
1744/1744 passing.

---------

Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
2026-05-12 16:48:16 +08:00
@aaronjmars
377d65b7e4
fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN SSRF bypass) (#1122)
* fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN bypass)

new URL('http://192.168.1.5./').hostname returns '192.168.1.5.' — the
trailing dot is the RFC 1034 absolute-FQDN form and resolves identically
to '192.168.1.5'. parseIpv4 fails on the dotted form, so 169.254.169.254.
slips past the metadata-service block, 192.168.1.5. slips past the LAN
block, and localhost. slips past the loopback identification.

Strip trailing dots in normalizeBracketedIpv6 so all downstream checks
(isLoopbackApiHost, isBlockedExternalApiHostname, isBlockedIpv4, IPv6
range tests) see the canonical form.

Adds 6 vitest cases covering loopback FQDN forms (localhost.,
foo.localhost., 127.0.0.1.) and SSRF FQDN bypasses (169.254.169.254.,
192.168.1.5., 10.0.0.5.).

Refs nexu-io/open-design#1119 review feedback (P2 from @lefarcen).

* test(connectionTest): tighten trailing-dot coverage per #1122 review

Two issues from #1122 review:

1. (P2 from @mrcfps + codex bot) The original `foo.localhost.` case
   asserted error===undefined on validateBaseUrl, which only proves the
   URL passed validation — not that the host is identified as loopback.
   Replaced with direct isLoopbackApiHost(...) assertions on the actual
   loopback FQDN forms (localhost., 127.0.0.1., 127.0.0.5.) so the test
   exercises the loopback path the comment claims.

2. (P3 from @lefarcen) Original blocked-FQDN tests covered only 3 of 7
   ranges that isBlockedIpv4 handles. Added a dedicated case per range
   (0.0.0.0/8, 10/8, 100.64/10, 169.254/16, 172.16/12, 192.168/16,
   multicast >=224) so future regressions in normalizeBracketedIpv6
   surface against the full coverage.

* docs: drop misleading foo.localhost./endsWith claim in normalizer comment

@lefarcen review feedback: isLoopbackApiHost only accepts exact 'localhost',
'::1', loopback IPv4, and mapped loopback IPv4 — there's no subdomain or
endsWith handling, so referencing 'foo.localhost.' overstates what the
trailing-dot strip enables. Rewrite the comment to match actual call sites
(isLoopbackApiHost equality + isBlockedIpv4 numeric parse).
2026-05-12 16:36:09 +08:00
eggward han
b6e4ae5e11
feat(daemon): make connection-test timeouts configurable (#1222)
* feat(daemon): make connection-test timeouts configurable

Provider and agent connection tests had hardcoded 12s / 45s budgets,
which are too tight for slow networks or distant providers (the user
sees "timeout" in Settings with no way to extend the budget).

- Add OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS (default 12_000)
- Add OD_CONNECTION_TEST_AGENT_TIMEOUT_MS (default 45_000)
- Invalid values (non-numeric, zero, negative, fractional) emit a
  console.warn and fall back to the default, so a typo in the env
  never silently disables the safety timeout.
- Export resolveConnectionTestTimeoutMs for unit testing; cover the
  three resolution paths (fallback / honored override / invalid).

41 connection-test tests pass (+3 new), full daemon suite 1170/1170.

* fix(daemon): reject connection-test timeout overrides above Node's setTimeout maximum

Node's `setTimeout` silently clamps any delay above `2^31-1` ms
(2_147_483_647) to ~1 ms with a TimeoutOverflowWarning. The previous
`Number.isInteger(n) && n >= 1` check accepted oversized values
unchanged and passed them straight to `setTimeout`, so an override
that *intended* to raise the budget — e.g.
`OD_CONNECTION_TEST_AGENT_TIMEOUT_MS=3000000000` — instead caused
every connection test to fail almost immediately. The safety
timeout was effectively disarmed.

Add `MAX_CONNECTION_TEST_TIMEOUT_MS = 2_147_483_647` and switch the
guard to `Number.isSafeInteger(n) && n >= 1 && n <= MAX...`. The
boundary value is still accepted; one millisecond past it falls
back with a warn. Regression test exercises `3_000_000_000`,
`2_147_483_647`, and `2_147_483_648`.

Addresses #1222 review feedback from @chatgpt-codex-connector,
@mrcfps, and @lefarcen.
2026-05-12 16:34:37 +08:00
chaoxiaoche
2ce6355558
feat(daemon): inject compiled design-system tokens + fixture into prompts (#1385)
* feat(daemon): inject compiled design-system tokens + fixture into prompts

Follow-up to #1231. The prior PR landed the structured form of two
brands (`default` + `kami`) and codified the schema; this PR teaches
the daemon to actually consume those files when assembling the system
prompt, so agents stop having to re-derive token names from DESIGN.md
prose every turn.

Gated behind `OD_DESIGN_TOKEN_CHANNEL=1` for the smoke-test phase —
flag-off keeps the daemon byte-equivalent to today's behavior, flag-on
appends two new prompt blocks (the brand's `tokens.css` :root contract
and its `components.html` reference fixture) right after the existing
DESIGN.md block. Brands without those sibling files (every brand
except `default` and `kami` today) skip silently in either mode.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(daemon): only swallow ENOENT/ENOTDIR in readFileOptional, rethrow rest

Reviewer feedback (nettee, #1385). The prior catch-all hid permission
errors, EISDIR, and broken packaged-resource paths behind the same
"undefined = absent" branch the legacy ~138-brand fallback uses,
which would let `OD_DESIGN_TOKEN_CHANNEL=1` silently degrade to the
DESIGN.md-only prompt while reporting success. That corrupts the
exact signal the smoke-test rollout depends on.

Now `readFileOptional` only returns undefined for ENOENT / ENOTDIR
(real "file does not exist" cases) and rethrows everything else.
Added a focused test that plants a directory at the tokens.css path
to exercise the EISDIR branch, plus a partial-presence regression
test to confirm the stricter contract preserves the legacy fallback.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 16:28:59 +08:00
Yuhao Chen
3790c00363
docs: add Windows troubleshooting guide (#478) (#1170)
* docs: add Windows troubleshooting guide (#478)

Add docs/windows-troubleshooting.md with step-by-step fixes for the
most common native-Windows setup errors:

- Node 24 / nvm-windows gotchas (fake nvm file in System32)
- pnpm not found after installation
- Build scripts blocked by pnpm 10 (better-sqlite3, sharp)
- Visual Studio / gyp build errors
- Starting the dev server
- Optional OpenCode CLI setup

Also update CONTRIBUTING.md and QUICKSTART.md to link to the new
guide instead of the vague "file an issue if it doesn't" note.

* docs: fix Windows guide command accuracy (#1170)

Address all 6 inline review comments from lefarcen:

- Pin npm-global pnpm install to @10.33.2 (matches packageManager field)
- Use where.exe instead of bare where (PowerShell alias conflict)
- Fix OpenCode package: opencode-ai (not opencode), binary is opencode
- Add EPERM fallback note for corepack enable on protected installs
- Add Python check for gyp ERR! find Python
- Expand diagnostic checklist with corepack, python, execution policy

Also remove redundant corepack pnpm --version from checklist.
2026-05-12 16:17:44 +08:00
Hesam
d97b6041eb
fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins (#1110)
* fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins

fnm legacy installations use ~/.fnm/node-versions. Closes #1102

* fix: remove stray .fnm token from type declaration
2026-05-12 16:15:52 +08:00
Nicholas-Xiong
d5812ee535
feat: add FAQ page skill (#1162)
* fix: set writable OD_DATA_DIR default for nix run

Fixes #1157

When running via 'nix run github:nexu-io/open-design', the daemon
attempted to create runtime state under the Nix store package path:

  /nix/store/.../lib/open-design/.od/projects

The Nix store is read-only at runtime, causing startup to fail with
ENOENT when mkdir() tried to create the projects directory.

This commit updates the nix run wrapper to export OD_DATA_DIR with
a writable default ($HOME/.od) when the variable is unset. Users
can still override it by setting OD_DATA_DIR before running.

The Home Manager and NixOS modules already set OD_DATA_DIR, so they
are unaffected by this change.

* feat: add FAQ page skill

Add a new skill for generating Frequently Asked Questions pages with:
- Collapsible accordion sections for Q&A pairs
- Real-time search functionality
- Category filtering (Billing, Account, Technical, General)
- Smooth animations and transitions
- Keyboard navigation support
- Mobile-friendly responsive design
- Semantic HTML with proper ARIA attributes

The skill includes:
- SKILL.md with triggers, workflow, and output contract
- example.html demonstrating a complete FAQ page with 12 questions

Use cases: help centers, support pages, product documentation

* fix: address PR review feedback for FAQ page skill

- Fix craft slugs: use accessibility-baseline and state-coverage instead of non-existent slugs
- Remove overly broad 'questions and answers' trigger
- Add edge case handling for insufficient/excessive FAQs
- Remove search highlighting requirement (XSS risk)
- Update self-check to reflect filtering instead of highlighting

Addresses review comments from @lefarcen and @chatgpt-codex-connector

* feat: add localized copy for faq-page skill

Add German, French, and Russian translations for the FAQ page skill
example prompt to fix validation test failure.

- DE: FAQ-Seite mit Akkordeon-Abschnitten, Suchfunktion und Kategoriefilterung
- FR: Page FAQ avec sections accordéon, recherche et filtrage par catégorie
- RU: Страница FAQ со складными секциями-аккордеонами, поиском и фильтрацией

* fix: escape apostrophe in French translation

Use double quotes to avoid syntax error with d'auth
2026-05-12 16:13:50 +08:00
이용진
aeb6cde923
prevent duplicate saves and add template deletion (#1294)
* prevent duplicate template entries on repeated save

* add delete button to saved template list

Templates can now be removed from the template picker via a hover x button, calling the existing DELETE /api/templates/:id endpoint.

* add missing onDeleteTemplate prop in test fixtures

* add template deletion flow test for NewProjectPanel

* reject template names longer than 100 characters

* preserve original createdAt on template update
2026-05-12 15:48:04 +08:00
Prantik Medhi
325d1d3ceb
docs: add NotebookLM GitHub export script (#1062)
* docs: add NotebookLM GitHub export script

* fix: make NotebookLM export TOC anchors work

* fix: escape TOC link text markdown chars

* fix: include merged PRs when exporting --prs all

* fix: allow --prs merged mode

* fix: treat --limit as total export budget

* fix: avoid starving buckets under global --limit

* fix: support --issues none and handle repos w/ issues disabled

* fix: avoid underfilling export when buckets empty

* fix: keep disabled-issues fallback quiet

* fix: silence disabled issues fallback

* fix: satisfy script typecheck
2026-05-12 15:47:32 +08:00
shangxinyu1
5d674410f2
test: stabilize extended Playwright coverage (#1341)
* test: align extended Playwright coverage with current UI behavior

* test: address extended suite review feedback

* test: restore Codex path hydration assertion
2026-05-12 15:11:34 +08:00
Eli
9c489aa045
feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select (#1161)
* feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select

- Render real previews on project cards: HTML iframe / image / video / hashed gradient fallback with project initial; lazily fetches the project's primary file when metadata.entryFile is unset, prefers index.html → newest html → image → video.
- Live artifact card thumbnails embed the rendered artifact URL via sandboxed iframe.
- Replace the per-card close button with a `…` overflow menu (Rename, Delete) that opens on hover/click; click-outside and Esc close it.
- Add multi-select mode (toolbar toggle → checkbox per card → "N selected · Delete · Cancel" pill) with batch delete via the existing onDelete prop.
- Add a category tag to every card (Prototype / Live Artifact / Slide / Media) derived from project.metadata.intent / kind / skillId.
- Replace browser prompt() and confirm() with custom modals (rename input + danger-confirm) reusing the existing .modal shell.
- Add `more-horizontal` icon and 16 new i18n keys across all 18 locales (zh-CN/zh-TW localized; others fall back to English).

* test(e2e): update home delete flow for overflow menu + custom confirm modal

The previous flow targeted a per-card X button labelled "delete project <name>"
and asserted on a native `dialog` event. The card UI now exposes a `…` overflow
menu and a styled confirm modal, so reach delete via the menu and assert against
the modal's Cancel / Delete buttons instead.

* fix(web): harden Designs tab preview sandbox

* fix(web): hide Designs select mode in kanban
2026-05-12 15:08:22 +08:00
Eli
77f69257a7
feat(web): in-context comment thread for the artifact preview (#1276)
* feat(web): free-pin fallback in comment mode for unannotated artifacts

When the artifact has no data-od-id annotations, clicking in Comment
mode now posts a synthetic position-based target so the host opens a
popover at the click location. Daemon upsert validation requires a
non-empty selector/label, so the pin uses [data-od-pin=ID] and label
'pin'. Coordinates are document-space (viewport + scrollY) so pins
stay anchored after scroll/reload. Clicks on interactive elements
(a/button/input/textarea/select/label/contenteditable) keep their
native behavior and are not pinned.

* feat(web): tighten comment popover layout for free-pin and element targets

The popover header used to dump the raw elementId verbatim — fine for
data-od-id targets like 'hero-cta' but jarring for free-pins where
elementId is a synthetic 'pin-...' string. Branch on the prefix and
show 'Pin · at X, Y' for free-pins; keep the label + selection kind
for real element / pod targets. Replace the text 'Close' button with
an icon-only close affordance to match the popover-as-card visual.

Action row is now two right-aligned buttons (Comment + Send to
Claude) for element targets and (Add note + Send to Claude) for pod
targets, eliminating the three-button row that wrapped onto two
lines at narrow widths. The 'Remove' affordance for existing
comments stays left-aligned.

* feat(web): drop comments tab from chat sidebar

The chat sidebar's 'Comments' tab listed saved/attached preview
comments but duplicates the per-element popover already shown in the
artifact viewer. Hide the tab and its content while the right-side
comment thread panel takes over the same surface in-context. The
CommentsPanel / CommentSection components stay defined as dead code
for the moment so callers and translation keys remain valid; a later
pass can delete them.

* feat(web): right-side comment thread panel in board mode

Render a 320px CommentSidePanel anchored to the right of the
artifact preview whenever board (comment) mode is on. The panel
lists every saved preview comment for the current file with an
avatar initial, the element label (or 'Pin' for free-pin synthetic
ids), an Xd/Xh/Xm-ago timestamp, the note body, a Reply link, and
a checkbox.

Reply focuses the comment's element via liveSnapshotForComment so
the popover opens at the right anchor. Selecting one or more
comments via the checkboxes surfaces a 'N selected · Clear · Send
to Claude' action bar above the list; Send to Claude reuses the
existing onSendBoardCommentAttachments pipeline via
commentsToAttachments. The panel takes the place of the chat
sidebar's removed Comments tab so the thread lives next to the
artifact instead of behind a tab switch.

* feat(web): styles for right-side comment thread panel

Floating 320px panel anchored to the right edge of the artifact
preview with a scrollable comment list and a coral selection bar
that appears when one or more comments are checked. Selected items
get a coral tint; the reply / check / send-to-claude controls
match the popover's coral primary tone.

* feat(web): toast confirmation on comment save, close popover

After savePersistentComment succeeds, close the popover via
clearBoardComposer and surface a transient 'Comment saved' (or
'Pin saved' for free-pin targets) toast for 2.2s. Replaces the
previous behavior where the popover stayed open with an empty
draft after save, which left users uncertain whether the save
landed and forced an extra click to dismiss.

* feat(web): position the comment-save toast at the top of the preview

* feat(web): allow editing saved comment notes via the side panel

Rename the per-item 'Reply' affordance to 'Edit' (no thread model
exists yet, so reply was misleading) and pre-fill the popover with
the existing note when clicked. The save path goes through
onSavePreviewComment which the daemon implements as an upsert keyed
on (project, conversation, filePath, elementId), so the edit
overwrites the existing row's note without spawning a duplicate.

Also fall back to a snapshot synthesized from the saved comment's
own fields when the corresponding live target is no longer in the
iframe DOM (e.g. free-pin parents that were re-rendered), so the
edit path still works after artifact reloads.

* feat(web): hide already-sent comments from the side panel

After Send to Claude, the daemon flips the comment status from
'open' to 'applying' (and then 'needs_review' / 'resolved' /
'failed' depending on the run). Filter the side panel to status
=== 'open' so sent comments visibly leave the list — the user
gets clear feedback that the send landed and the panel stays
focused on actionable, un-sent items.

* feat(web): drop single-tab bar and conversation count badge

After the Comments tab was removed the chat header still rendered
a one-tab 'tablist' just for the Chat tab, which read as visual
noise without a sibling to switch between. Drop the tabs wrapper
entirely; the chat content stays mounted and the header now hosts
only the conversation-history affordance.

Also drop the numeric badge that overlaid the conversation history
button: counting open conversations next to a generic history icon
was easy to mistake for an unread / notification count. The dropdown
itself remains the canonical place to see and switch between past
conversations.

* feat(web): right-align chat header actions after tab bar removal

With the tabs wrapper gone, chat-header-actions sat flush left
because nothing was pushing it across the header. Add margin-left:
auto so the history / new-conversation / collapse buttons land at
the right edge, matching the design files / index.html tab row's
own right-aligned controls.

* feat(web): rename board-mode toggle to Comment with comment icon

The artifact preview toolbar's board-mode entry was labeled 'Tweaks'
with the tweaks icon, which collided with the palette Tweaks button
next to it and hid the comment capability behind a generic label.
Rename to 'Comment' with the comment icon and switch to the
viewer-action class so the button matches the surrounding toolbar
items (Edit/Draw) and the coral active state lands on the right
surface.

* fix(web): pass designTemplates to ProjectView in api-empty-response test

The test props for ProjectView were missing the designTemplates
prop that was added to Props in #955 (generic skills split). CI's
strict typecheck (tsc -b --noEmit) caught it; local runs that hit
project references differently did not. Pass an empty SkillSummary
array — matches the empty skills fixture for the same reason.
2026-05-12 15:05:08 +08:00
Eli
928079daf5
feat(web): consolidate Image/Video/Audio entries into a Media tab (#1167)
Reduces the New Project panel's top-level tab count by collapsing the
three media surfaces into a single Media tab with an inner segmented
control, and polishes the controls inside that tab so they stop
dominating the panel:

- Media tab + segmented (Image / Video / Audio) inside the panel body.
  Underlying ProjectKind branches and submission contract unchanged —
  the daemon still receives kind=image/video/audio.
- Model picker rewritten as a combobox: one trigger row + searchable,
  provider-grouped popover with Recommended badges. Replaces the flat
  grid of provider-grouped cards that scrolled past the fold once the
  fourth provider landed.
- Aspect picker compressed from a 5-card grid to a single row of
  segmented pills with mini ratio glyphs.
- Image surface no longer carries a free-form Style notes field; it
  was redundant with the prompt template + main prompt input.
- Live artifact tab locks fidelity to high-fidelity (the wireframe
  option is now hidden) — a wireframe live artifact doesn't make
  sense and the picker added noise.

i18n: adds tabMedia / titleMedia / model* keys across all 18 locales,
removes imageStyleLabel / imageStylePlaceholder. Tests + e2e selectors
updated to drive the new Media tab + segmented surface flow.
2026-05-12 14:52:03 +08:00
Eli
1b307bf17f
feat(web): tweaks palette popover with HSL hue-shift recoloring (#1292)
* feat(web): tweaks palette popover with HSL hue-shift recoloring

Adds a Tweaks color-palette popover to the HTML preview toolbar.
Selecting a palette re-skins the iframe in place via a srcDoc-side
bridge that walks the DOM and shifts every chromatic paint to the
target hue while preserving each color's saturation and lightness —
pale tints stay pale, bold CTAs stay bold, just in the new color
family. Mono-noir desaturates instead of shifting.

- runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options
- file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected
- FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration
- PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir
- PreviewDrawOverlay: stub pass-through until the draw branch lands

* feat(web): hide finalize-design toolbar from project header

* test(e2e): skip project actions toolbar flow after toolbar removal
2026-05-12 14:38:00 +08:00
nettee
03da01a56f
ci: use open-design bot for contributors wall refresh (#1349) 2026-05-12 14:35:28 +08:00
Nicholas-Xiong
c0b679ecbc
fix: restore custom dropdown chevron for timezone selector in dark mode (#1368)
Fixes #1359

The timezone selector in the Routines form was showing repeated dropdown
icons and poor text readability in dark mode because:

1.  set  to remove the native
   chevron, but didn't restore a custom one via background-image
2. Missing  caused text to overlap with any chevron
3. No dark-mode-specific chevron color was defined

This commit adds the custom dropdown chevron styling (matching the global
select behavior) with proper padding and dark-mode color variants, ensuring:
- Single, correctly-positioned chevron icon
- Sufficient padding to prevent text overlap
- Proper contrast in both light and dark themes
- Consistent visual behavior with other form controls
2026-05-12 14:29:01 +08:00
Sid
fb47d0ae51
style(web): polish EntryView UI — sidebar layout, folder tabs, slim form, blue selected token (#1360)
* chore(web): upgrade radius scale + introduce blue --selected token

UI polish pass — design tokens for follow-up commits.

Radius scale was visually too square at the small end. Bump up so
buttons / inputs / cards feel rounded rather than boxy:

- `--radius-sm: 6px → 8px`  (buttons, inputs, small chips)
- `--radius: 10px → 12px`  (medium containers, Recent filter pill)
- `--radius-lg: 14px → 16px`  (project cards)
- `--radius-pill: 999px` unchanged (status chips)

Introduce a separate "selected" colour so selection indicators
(card borders, focus rings) read as blue instead of fighting with
the orange brand accent that drives primary CTAs:

- `--selected: #2563eb`  (Tailwind blue-600)
- `--selected-soft: rgba(37, 99, 235, 0.16)`  (soft tint for shadows)

No selectors are migrated to `--selected` in this commit — that
happens in a later "selected state" commit so the diff stays scoped.

* refactor(web): replace entry global header with sidebar brand + reorder bottom chips

Pre-existing layout: a global \`AppChromeHeader\` strip sat across the
whole top of EntryView (logo + settings gear), then a 2-column body
below it. Visual mass concentrated in a thin horizontal bar that did
not relate to the page's column structure, and the settings gear
duplicated the bottom Local-CLI chip.

New layout matches the two-column "brand-in-sidebar + tabs-in-main"
pattern: the brand block lives at the top of \`.entry-side\` (left
column), the right tabs live at the top of \`.entry-main\`, and the
vertical divider between them is the only horizontal seam.

EntryView:

- Drop \`<AppChromeHeader actions={avatarMenu} />\` from EntryView's
  render — the home page no longer renders the global chrome strip.
  (ProjectView still uses AppChromeHeader for back-nav / file
  actions, so the component itself stays in the codebase.)
- Add a sidebar brand block inside \`.entry-side\` using the
  already-defined \`.entry-brand\` / \`.entry-brand-mark\` /
  \`.entry-brand-title\` classes that were sitting dead in
  index.css.
- Reorder \`.entry-side-foot\` chips so that the env-critical
  Local CLI row sits on top of the row, with the secondary
  toggles (language picker, pet adoption, X follow icon) compact
  on a second row. The Follow @nexudotio chip drops its text
  label and becomes icon-only — pure marketing content, so it no
  longer earns a full-width pill.
- Settings access moves entirely to the Local CLI chip's existing
  click handler; the top-right gear is gone (it was a duplicate).

CSS:

- \`.entry-shell\` grid: \`auto 1fr\` → \`1fr\` (no header row).
- \`.entry-side\` background: \`var(--bg-panel)\` → \`transparent\`,
  so the sidebar shares the page beige and only the New-prototype
  card reads as white. Removes the "everything on the left is on
  one big white sheet" feeling.
- \`.entry-brand\` gets \`padding: 24px 20px 18px\` so the logo +
  title block has breathing room at the top of the sidebar.
- \`.entry-brand-mark\` width/height \`44 → 34\`. The previous
  44px gradient ring was visually heavier than the title text it
  sat next to.
- \`.entry-brand-title\` weight \`600 → 450\`, color
  \`var(--text-strong)\` → \`var(--text)\`. Serif title still
  reads as the page anchor without the chunky "bold black" stamp.
- \`.entry-brand-actions\` added for future right-aligned actions
  (carries no actual content in this commit — kept so re-adding a
  settings/avatar entry point doesn't need new CSS).
- \`.entry-side-foot .foot-pill\` slim pass: padding
  \`4px 10px → 3px 8px\`, font \`11.5px → 10.5px\`, gap \`6 → 5\`,
  plus \`justify-content: center\` and \`min-height: 24px\` so the
  icon-only Follow pill stays the same height as the text pills
  next to it.

* style(web): align right tabs row with brand row + strip hover/focus noise

Right column's tabs row ("Designs / Templates / Design systems /
Image templates / Video templates") needed three things:

1. Vertical center of tab text aligned with the brand logo on the
   left (both rows feel like one row, separated by the vertical
   divider only).
2. Active tab's underline sitting flush on the horizontal divider
   below the tabs (not floating mid-row).
3. No hover background, no focus outline, no transition — tabs are
   a navigation strip, not action buttons.

Changes:

- `.entry-header` padding `0 28px` → `24px 28px 0`, drop the
  `min-height: 52px`. Padding-top mirrors the brand block's
  padding-top (24px) so left logo top and right tabs top land on
  the same Y. Header height now content-driven; underline meets
  the `border-bottom` divider naturally.
- `.entry-tabs` gets `align-self: stretch` + `align-items: center`
  + `gap: 2px → 24px`. The stretch lets the tabs container fill
  header height; the bigger gap matches Claude Design's tab
  rhythm.
- `.entry-tab` becomes a "plain underline tab":
  - `border-radius: 6px 6px 0 0 → 0` (no folder-tab look — that's
    on the left tabs).
  - `padding: 14px 11px → 6px 4px 8px` so text + underline form a
    tight group, with the underline sitting at the bottom of the
    tab box right above the header divider.
  - `font-size: 14px → 12px` matches the left newproj tabs (set
    in commit 4) — both columns share the same tab type-size.
  - `transition: none` removes the inherited 120ms background /
    border / color transition.
- Hover / focus / active states explicitly zero out background,
  border-color, outline. Hover keeps a subtle color change
  (`text-muted → text`) so the tab still feels interactive
  without flashing a chip behind it.
- Active state colors are duplicated across `.active`,
  `.active:hover`, `.active:focus`, `.active:focus-visible` so the
  black underline never gets overwritten by the inactive-state
  rules above.

* style(web): folder-tab merge on left newproj tabs + flat card top corners

The left "Prototype / Live artifact / Slide deck / …" tabs sat as
plain underline tabs above a fully-rounded card. The active tab and
card looked like two stacked rectangles with a gap.

Folder-tab pattern:

- Active tab gets a white background + 12px top corners + a 1px
  border on top / left / right.
- Active tab's bottom border matches the card's background color
  (effectively invisible) — so where the tab sits, the card's top
  border is "broken" and tab + card read as one merged shape.
- Card top corners are square (`border-radius: 0 0 12px 12px`),
  bottom corners stay 12px. With the active tab's square bottom
  edge, the merge line at the tab/card seam is a clean horizontal,
  not a curve mismatch.

Implementation:

- `.newproj-tabs-shell`:
  - `overflow: hidden → visible` so the tab's overlap with the
    card below isn't clipped at the shell's bottom edge.
  - `margin-bottom: -1px` + `z-index: 2` so the shell renders on
    top of the card and the 1px tab/card overlap actually paints.
  - The `.can-left { padding-left: 40 }` / `.can-right` overrides
    used to reserve room for scroll arrows are removed (arrows
    are hidden, no extra padding needed).
- `.newproj-tabs` keeps its horizontal `overflow-x: auto` so the
  8 project-type tabs can still scroll inside the sidebar width.
- `.newproj-tabs-arrow` becomes `display: none`. The two
  chevron-circle buttons added clutter without much benefit —
  users with touchpads / wheels / keyboard already scroll the
  tabs row natively, and the `::before` / `::after` linear-
  gradient fades (now using `--bg` instead of `--bg-panel` so
  they fade into the page beige, not the sidebar panel that no
  longer exists) signal there are more tabs to the right.
- `.newproj-tab`:
  - Replace the plain bottom-underline (`border-bottom: 2px
    solid transparent`) with a full transparent 1px border so
    the active state can flip just the colors without changing
    layout.
  - `border-radius: 0 → 12px 12px 0 0`.
  - `position: relative` for z-index stacking.
  - `padding: 10px 6px → 7px 14px` (less vertical, more
    horizontal — tabs read as "labels" rather than chunky
    buttons).
  - Symmetric top/bottom padding (`7px`) so the text + folder-
    tab top corners stack cleanly.
  - `transition: none` — no animation between active/inactive
    states (tabs are nav, not action buttons).
- All hover / focus / focus-visible / active states zeroed out
  background and border-color so the inherited `button { … }`
  base style (which adds bg-subtle on hover) does not bleed in.
  Subtle color change on hover (`text-muted → text`) is the
  only affordance.
- `.newproj-tab.active` (+ active hover/focus combos so the
  base rules don't override): white bg, full var(--border) on
  three sides, bottom border = var(--bg-panel) (invisible
  against card), z-index 3 (above non-active tabs and shell
  pseudo-elements).
- `.newproj-body`:
  - `margin: 0 24px` so the card breathes inside the sidebar
    (and the active tab's left edge aligns with the card's left
    edge).
  - `padding: 18px 24px 28px → 16px 18px 18px` — tighter.
  - `border-radius: (full 12) → 0 0 12px 12px` for the
    flat-top merge with the active tab.
  - Adds explicit `border` + `background: var(--bg-panel)` +
    `box-shadow: var(--shadow-xs)` so the form reads as a card
    floating on the transparent sidebar.
  - `flex: 1 → 0 0 auto` (and `min-height: 0` / `overflow-y:
    auto` removed) — the card is content-sized, not stretched
    to fill the sidebar. Empty space below the card is now
    page beige, not a giant white sheet.
  - `gap: 14px → 12px` between form sections.

* style(web): slim NewProjectPanel form (title, fidelity, buttons, ds-picker)

The form inside the new white card felt overweight against the
compacted layout from the previous commits — fidelity cards were
~133px tall, the Create button + Open-folder secondary button both
had ~11px symmetric padding, the design-system trigger had a 32px
avatar in a 55px-tall row. Slim every element so the card reads as a
focused form, not a stack of beefy buttons.

Title:
- \`.newproj-title\` font \`14px / 600 → 13px / 550\`. Still
  visibly the section heading but no longer competing with the
  serif brand title above.

Fidelity:
- \`.fidelity-thumb { aspect-ratio: 12/7 → 16/7 }\`. The previous
  aspect made cards taller than they needed to be in the narrow
  sidebar column.
- \`.fidelity-card { gap: 8 → 6, padding: 10/10/12 → 8/8/10 }\`.
  Combined with the thumb aspect change, card height drops from
  ~133px → ~102px (visually close to the Claude Design reference
  while keeping the same content).

Primary / secondary buttons:
- \`.newproj-create\` padding \`11px (symmetric) → 8px 11px\`,
  margin-top \`4 → 2\` — primary CTA no longer towers over the
  fidelity cards above it.
- \`.newproj-import\` padding \`10px → 6px 10px\` — the secondary
  "Import Claude Design ZIP" button feels like an alt option, not
  a peer of Create.

Design system trigger:
- \`.ds-picker-trigger\` gap \`10 → 8\`, padding \`8/10 → 6/10\`.
- \`.ds-picker-title\` font \`13 → 12.5\` so name + subtitle stay
  legible in the slimmer row without overflowing the column.
- \`.ds-avatar\` width/height \`32 → 26\`, border-radius \`6 → 5\`.
  The thumbnail was the dominant element in the row; shrinking it
  pulls the row height from ~55px → ~50px.

Footer disclaimer:
- \`.newproj-footer\` padding-top \`0 → 12px\`. The "Only you can
  see your project by default." line was butting against the card
  bottom; 12px of air separates the disclaimer (page-bg context)
  from the card (panel-bg context) cleanly.

* style(web): blue selected indicators + Recent filter rounded + neutral input focus

Three small "selection state" tweaks driven by the new
\`--selected\` token introduced earlier in this branch:

1. Fidelity card selected border is now blue, not the brand
   accent. The orange Create button + the orange selected card
   border were fighting for the same visual role (primary
   action vs primary selection). Blue clearly says "this is
   the one that is selected" without competing with the CTA.

   - \`.fidelity-card.active\` border-color
     \`var(--accent) → var(--selected)\`.
   - Box-shadow ring + soft 0.04 drop swapped from the orange
     \`180/90/59\` rgba tuple to the blue \`37/99/235\` tuple.
   - \`.fidelity-card.active .fidelity-thumb\` border swapped
     from \`var(--accent-soft) → var(--selected-soft)\`.

2. Recent / Your designs filter is no longer a fully-rounded
   pill. The bottom-left settings chips deserve to be the only
   "999px pill" shape — those are tertiary status indicators.
   The Recent/Your designs toggle is a higher-importance
   inline filter, so it gets the medium radius instead.

   - \`.subtab-pill\` wrapper border-radius
     \`var(--radius-pill) → var(--radius)\` (12px).
   - Inner button border-radius
     \`var(--radius-pill) → var(--radius-sm)\` (8px).
   - Active state background \`var(--text) → var(--bg-panel)\`,
     color \`var(--bg) → var(--text)\`. The "black filled pill"
     read as a status badge; white-on-faint-gray reads as
     "selected toggle" — same shape as Claude Design's Recent
     pill.

3. Input focus is neutralised. The base \`input:focus\` rule
   added an orange border + a 3px orange-soft ring around the
   focused field — way too much visual weight for a quiet form
   ("Project name" → focus made it scream).

   - \`input:focus / textarea:focus / select:focus\` border-color
     \`var(--accent) → var(--border-strong)\` (light grey).
   - Box-shadow ring removed (\`none\`). Focused inputs now only
     darken their border by one step — barely visible but enough
     to confirm focus.

These three changes are grouped because they all migrate selection-
state styling off the brand accent and onto neutral / blue tokens.
The next pass (if any) can sweep the remaining \`var(--accent)\`
selection sites (\`.ds-row.active\`, \`.ds-picker-trigger.open\`,
\`.conv-pill.open\`, …) to use \`--selected\` too, but each of those
lives in a different surface and felt out of scope for the entry
view polish.

* refactor(web): pet rail toggle moves inside pet pill as split button

WHAT
- Convert the pet pill from a single `<button>` to a `<div>` containing
  two buttons separated by a 1px divider:
  * `.pet-pill-main` keeps the existing "Adopt a pet" / "Change pet"
    glyph + label + unadopted dot, still wired to `onAdoptPet`.
  * `.pet-pill-toggle` is a small icon-only button that flips
    `petRailHidden` — eye icon when the rail is hidden ("click to show"),
    eye-off when visible ("click to hide").
- Drop the old avatar-menu popover from EntryView entirely:
  `avatarMenuOpen` state, the outside-click / Escape effect, and the
  cog-popover trigger are all removed. The `Settings` entry of that
  popover was already redundant with the `Local CLI` chip; the
  `Hide/Show pet picker` entry now lives directly on the pet pill.
- CSS in the `.pet-pill` block:
  * `height: 24px` + `padding: 0` so the outer pill matches every other
    chip in the row vertically.
  * `.pet-pill-glyph` reduced from 14px to 12px and constrained to a
    14x14 inline-flex box so the unicorn / paw glyph stops pushing the
    chip taller than 24px.
  * Per-region hover (`.pet-pill-main:hover`, `.pet-pill-toggle:hover`)
    so each side of the split lights up independently, with the divider
    inheriting the accent tint while the chip is in `pet-pill-fresh`.

WHY
- After commit 5fe5721c removed the global `<AppChromeHeader>`, the only
  entrypoint to "Show pet picker" was the avatar-menu popover. Putting
  the avatar cog back next to the brand mark felt wrong: it elevates
  Settings (already on the `Local CLI` chip) to a primary affordance and
  sits next to the logo, where it doesn't belong by hierarchy.
- The pet-rail toggle is fundamentally a pet-area control — it belongs
  with the pet adoption chip, not in a popover. Putting both on the same
  chip via a split button gives the rail toggle a stable, discoverable
  home and keeps `.entry-brand` a brand-only row.

SCOPE
- `apps/web/src/components/EntryView.tsx` + `apps/web/src/index.css`.
  No new state, no new i18n keys (reuses `pet.railShow` / `pet.railHide`).
- The orphan i18n keys `entry.openSettingsTitle` and
  `entry.openSettingsAria` are no longer referenced by EntryView but are
  left in place — they're shared types that other locale files still
  declare; a focused cleanup belongs in a separate commit.

* test(e2e): update entry chrome + project mgmt assertions for new layout

WHAT
- entry-chrome-flows.test.ts:
  - Rename `entry chrome settings menu toggles pet rail visibility` →
    `pet pill toggle hides and shows the pet rail`. The flow no longer
    goes through an `Open settings` cog + `.avatar-popover` chain;
    instead it clicks the in-pill `.pet-pill-toggle` directly and
    verifies its `aria-label` flips between `Hide` / `Show pet picker`.
  - Replace `.app-chrome-header` / `.app-chrome-brand` assertions with
    `.entry-brand` + `.entry-brand-title` text checks. The global
    chrome strip no longer exists on EntryView.
  - The compact-width overflow guard now measures `.entry-brand` rather
    than `.app-chrome-header`, since the brand row replaced the chrome
    strip as the only top-of-page horizontal stack.

- project-management-flows.test.ts:
  - Drop the `Scroll project types right` arrow click. The
    `.newproj-tabs-arrow` buttons are hidden (the folder-tab pattern
    leans on shadow gradients on `.newproj-tabs-shell::before/::after`
    instead). Playwright's `locator.click()` auto-calls
    `scrollIntoViewIfNeeded()`, so clicking `new-project-tab-image`
    after a tab-switch still reaches the off-screen tab.

WHY
- These selectors / interactions are tied to UI affordances the earlier
  commits in this branch deliberately replaced. The behaviors they pin
  (pet rail toggle reachability, no horizontal overflow at 820px,
  draft preservation across tab switches) are still asserted — only
  the selectors needed to follow the new structure.

VERIFICATION
- `pnpm exec playwright test ui/entry-chrome-flows.test.ts ui/entry-configuration-flows.test.ts ui/project-management-flows.test.ts`
  → 17/17 passed (chromium project, single worker, fresh daemon).

* fix(web): restore .newproj-body as scroll container (P1 regression)

WHAT
Reintroduce `flex: 1 1 auto; min-height: 0; overflow-y: auto;` on
`.newproj-body`, alongside the `display: flex` + `padding` that
commit ba44e396 kept. The parent `.newproj` is still `overflow: hidden`,
so without these three lines the card can clip its own content with
no scroll recovery.

WHY
Reported by @lefarcen (P1) and @Siri-Ray in review on #1360. Before
this commit the slim-form pass made the body shrink-wrap (`flex: 0 0 auto`)
to keep the empty-state caption snug against the card edge. That works
when the form is short, but the card can grow well past the available
sidebar height in real scenarios:
- Compact-height windows (≤ 720 vertical px).
- Image / media tabs that add aspect + model rows.
- Validation / error text after a failed Create.
- Design-system popover opened with many systems.

In all four cases the Create / Import / Open-folder stack — or the
picker's bottom options — were sliding below the visible sidebar with
no scroll bar to recover them. This is a regression against the
behavior that landed in #1038, which made `.newproj-body` the scroll
container precisely to keep the form bounded.

SCOPE
- `apps/web/src/index.css` only, one ruleset.
- Visual cost: the empty-state caption (`.newproj-footer`) now sits at
  the bottom of the available sidebar height instead of hugging the
  card, which is the same behavior pre-#1167 / pre-this branch.
- A short comment in CSS now flags the invariant so a future refactor
  doesn't quietly flip the flex semantics again.

* fix(web): restore :focus-visible ring on entry-tab + newproj-tab (a11y)

WHAT
Split the prior `:focus, :focus-visible, :active` group on both tab
selectors so that `:focus-visible` no longer inherits the zero-out
that was added to scrub the orange mouse-focus halo:

- `.newproj-tab:focus-visible` → 2px inset blue ring (`--selected`)
  hugging the folder-tab's 8px top-corner radius, plus `--text`
  foreground so the label reads at full contrast while focused.
- `.entry-tab:focus-visible` → 2px solid outline in `--selected` with
  `outline-offset: 2px` and `border-radius: 4px`. Outline is used
  here instead of inset shadow because the tab has no padding to
  spare against the active 1px bottom border, and outline doesn't
  participate in layout.

Mouse-driven `:focus` and `:active` keep the prior transparent
treatment — there is no orange ring on click, which is the polish
the rest of this branch is going for.

WHY
Flagged by @Siri-Ray (changed-range) and @lefarcen (P2) on #1360:
the polish-tab commits stripped the focus indicator entirely instead
of just suppressing mouse focus, so keyboard users had no way to see
which tab was active during arrow-key navigation. Re-introducing
`:focus-visible` only restores keyboard reachability while keeping
the visual quiet for pointer users.

SCOPE
- `apps/web/src/index.css` only. Two rulesets touched, one new
  `:focus-visible` rule added per selector.
- No JS, no aria, no test churn — the rules trigger off the existing
  `:focus-visible` pseudo-class, which the same Playwright tests
  already exercise via Tab.

* fix(web): scope quieted input focus to .entry-side, restore global ring (a11y)

WHAT
Split the global input focus rule into two layers:

- `input:focus, textarea:focus, select:focus` now keeps a visible
  focus indicator on every input across the app — but in the new
  `--selected` blue (border + 3px `--selected-soft` ring) instead of
  the original `--accent` orange. This preserves accessibility for
  every settings page, dialog, project workspace, and right-column
  control that was previously losing its focus halo.

- `.entry-side input:focus` keeps the neutral treatment from this
  branch — `border-color: var(--border-strong)`, no ring. The orange
  "Create" CTA on the entry sidebar is already the loudest element in
  that panel, so a competing blue ring on the title / path inputs
  next to it pulled the eye in the wrong direction. Scoping the
  quieter focus to the sidebar keeps that intent without leaking out
  to the rest of the app.

WHY
Flagged by @lefarcen as a P2 a11y regression on #1360: the previous
version of this rule scrubbed the focus indicator (`box-shadow: none`,
border only one shade darker) for every input in the app, not just on
the entry surface this branch is targeting. Keyboard users on
settings forms and dialogs were left without a visible focus state.

SCOPE
- `apps/web/src/index.css` only, one global rule restored and one
  scoped override added. No JS, no template change.
- Color shift global focus orange → blue is intentional: it consumes
  the new `--selected` token introduced in commit 13dc8a65 and
  matches the active-state direction this PR is establishing.

* chore(web): drop dead AppChromeHeader / isMacPlatform imports + document --selected token

WHAT
- Remove the `AppChromeHeader` import from `EntryView.tsx`. The
  component itself is still used (and re-exported) by ProjectView;
  EntryView dropped its render site in commit 5fe5721c and the import
  has been a stale reference ever since.
- Remove the `isMacPlatform` import too. It was only used by the old
  avatar-menu popover (for the `⌘,` / `Ctrl+,` Settings hint) which
  was deleted along with the popover when the pet-pill split button
  replaced it.
- Add a docblock above the `--selected` / `--selected-soft` token pair
  in `index.css` so the cascade has a local explanation for why this
  blue is separate from the brand `--accent`. The note calls out which
  affordances should reach for `--selected` (active option, focused
  input ring, active filter pill) and pins the 16% soft fill role.

WHY
Both flagged by @lefarcen on #1360:
- P3 — dead import: the TS config doesn't fail on unused imports, so
  this was silently shipping as dead code and obscuring the deliberate
  removal of the global chrome header.
- P3 — token doc: the `--accent` vs `--selected` split was only
  explained in the PR body. Putting the rationale next to the token
  makes the contract durable beyond this discussion.

SCOPE
- `apps/web/src/components/EntryView.tsx`: two `import` lines removed.
- `apps/web/src/index.css`: one comment block added directly above the
  token declaration.
- Verified: `pnpm --filter @open-design/web typecheck` → exit 0.
2026-05-12 14:26:39 +08:00