open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Author	SHA1	Message	Date
Caprika	6736310a01	Implement manual edit inspector (#1448) * feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness: pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal * Polish manual edit inspector * Implement manual edit inspector * Fix manual edit review regressions * Fix FileViewer CI regressions * Fix remaining manual edit review issues * Flush manual edit styles before draw exit * Restore Critique Theater styles * Accept pixel line-height manual edits --------- Co-authored-by: qiongyu1999 <2694684348@qq.com>	2026-05-13 13:25:58 +08:00
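The hue-shift rule in the commit above (move chromatic colors to the target hue, keep saturation and lightness, desaturate for Mono noir) can be sketched as follows. This is a minimal illustration and not the code from runtime/srcdoc: the `Hsl` and `Palette` types, the `recolor` function, the saturation cutoff for "chromatic", and the example hue value are all assumptions made for the example.

```typescript
// Hypothetical HSL triple: h in degrees [0, 360), s and l in percent [0, 100].
interface Hsl {
  h: number;
  s: number;
  l: number;
}

// Hypothetical palette shape: either a target hue or the desaturating mono variant.
type Palette = { kind: "hue"; hue: number } | { kind: "mono" };

// Assumed cutoff: colors at or below this saturation count as neutral
// (greys, near-white, near-black) and are not hue-shifted.
const CHROMATIC_MIN_SATURATION = 5;

function recolor(color: Hsl, palette: Palette): Hsl {
  if (palette.kind === "mono") {
    // Mono variant: drop saturation, keep hue and lightness as they were.
    return { ...color, s: 0 };
  }
  if (color.s <= CHROMATIC_MIN_SATURATION) {
    // Neutral paint: leave untouched so greys stay grey.
    return color;
  }
  // Normalize the target hue into [0, 360) and keep saturation and lightness,
  // so pale tints stay pale and bold colors stay bold.
  const hue = ((palette.hue % 360) + 360) % 360;
  return { h: hue, s: color.s, l: color.l };
}
```

Under these assumptions, `recolor({ h: 210, s: 80, l: 50 }, { kind: "hue", hue: 16 })` keeps `s: 80` and `l: 50` and only changes the hue to 16.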
Joey-nexu	e3a848a33a	feat(landing-page): replace Ø wordmark with PNG logo across nav/footer/favicon (#1449) * feat(landing-page): replace Ø wordmark with PNG logo across nav/footer/favicon Switches the brand mark from the Unicode 'Ø' glyph to the new circular gradient paper-plane PNG. Header nav and footer share the same image, and the browser tab + iOS home screen icons are regenerated from the same 500x500 source. - public/logo.png (500x500, brand source) - public/favicon.png (32x32, replaces favicon.svg) - public/apple-touch-icon.png (180x180, regenerated) - header.tsx + page.tsx footer: <span>Ø</span> -> <img src=/logo.png /> - globals.css: simplify .brand-mark (drop Ø-era border/font, add object-fit contain on child img) - index.astro: link rel=icon now points at favicon.png * fix(landing-page): apply logo + favicon swap to sub-page layout too Review on #1449 caught two cross-page consistency issues: - P1: sub-page-layout.astro still linked /favicon.svg, which this PR deletes, so every Skills/Systems/Templates/Craft page would request a missing asset. Updated to /favicon.png to match index.astro. - P2: sub-page-layout.astro still rendered the Ø wordmark in its footer brand block, leaving the public site with mixed brand marks. Replaced with the same <img src=/logo.png /> wrapper pattern used on the homepage header and footer. Repo-wide grep now shows 0 favicon.svg references and 0 Ø brand-mark spans. typecheck still 25 files / 0 errors / 0 warnings. --------- Co-authored-by: Joey-nexu <236967869+joeylee12629-star@users.noreply.github.com>	2026-05-13 12:30:32 +08:00
Prantik Medhi	0244a769cb	feat(landing-page): add blog routes (#1444) * fix(landing-page): register blog collection config * fix(landing-page): restore blog content config * fix(blog): use content-layer ids	2026-05-13 12:20:45 +08:00
nettee	0f0d2879ff	Make de/fr/ru content i18n optional (#1511)	2026-05-13 12:17:17 +08:00
Nagendhra Madishetti	e2f409579d	docs: Critique Theater Phase 14 (user guide + 2 AGENTS module maps) (#1319 ) * feat(web): pure reducer for Critique Theater states (Phase 7.1) Pure CritiqueState reducer driven by the contracts-level PanelEvent (the same shape both the live SSE stream and the recorded transcript emit), so a single reducer powers both the in-flight panel and the rerun replay. Lifecycle covers run_started → running → (shipped / degraded / interrupted / failed), with panelist_open / dim / must_fix / close / round_end events building per-round CritiquePanelistView entries as they arrive. Defensive behaviour that surfaced while writing the spec tests: - Terminal phases (shipped / degraded / interrupted / failed) are sticky against further lifecycle events for the same run, except for parser_warning which can land late and is recorded in a side channel without changing phase. - A new run_started for a different runId at any time discards the prior state and reboots, so the UI can launch consecutive runs without an explicit reset action. - Events whose runId does not match the active run return the same state reference, so React's useReducer doesn't re-render subscribers on stray traffic. - Round bookkeeping keys by round number rather than "always last", so an out-of-order panelist_dim for round 1 arriving after a round 2 dim does not corrupt the round 2 bucket. Test coverage: 18 cases covering each transition, the runId guard, sticky-terminal behaviour, the out-of-order round invariant, and the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire SSE + replay into the same reducer. * feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2) createCritiqueEventsConnection is a pure connection manager that mirrors apps/web/src/providers/project-events.ts: opens an EventSource at /api/projects/:id/events, listens for every name in CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent (stripping the critique. 
prefix and merging the data payload), and hands it to the caller's onEvent. Reconnect uses exponential backoff (1s → 30s) and resets on `ready`; malformed payloads drop with a dev-mode warning rather than tearing the stream. useCritiqueStream wraps the manager in a useReducer that owns the CritiqueState. enabled=false or a null projectId tears down the connection cleanly; switching projectId closes the old connection and opens a fresh one. The returned dispatch lets local UI synthesise actions (e.g. an Esc keypress firing a synthetic interrupted while a kill request is in flight); production traffic comes from the SSE stream. Test coverage: - sse.test.ts (10 cases, node env): subscription set covers every CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire shape back to PanelEvent; malformed JSON is swallowed and does not stop the stream; exponential backoff schedule and ready-reset semantics are pinned with a setTimeout seam; close() cancels pending reconnects and shuts the live source; no-op fallback when EventSource is unavailable. - useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event, reducer driven by synthetic actions, no connection when disabled or projectId is null, clean close on unmount, projectId change reopens cleanly. * feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3) Fetches the per-run NDJSON transcript (one PanelEvent per line), parses every line via the shared isPanelEvent predicate, and dispatches into the same CritiqueState reducer the live SSE stream uses. A single reducer means the UI rendering a replay can be identical to the live panel, and a UI mounting both useCritiqueStream and useCritiqueReplay in parallel does not have to reconcile two state shapes. speed knob is `paused \| instant \| live \| { intervalMs: N }`. - instant flushes every event synchronously, useful for opening a finished run already at its terminal state. 
- intervalMs paces dispatches at a fixed cadence so the reviewer can watch the run unfold. - paused parses the transcript but holds events back until the caller advances speed (consumers can drive a scrubber later). - live is reserved for the future "playback at original cadence" feature, currently treated as instant; replay timestamps are not yet persisted with each event so honest pacing requires a follow-up Phase 7+ task. gunzip seam handles `.ndjson.gz` transcripts via DecompressionStream when present; the production fetch path picks between text and arrayBuffer based on the URL extension. Both seams are injectable so the unit tests don't need to spin up a real network or a real gzip pipeline. Test coverage (8 cases, jsdom env): - Idle status before any URL is provided. - speed=instant flushes the full transcript synchronously to shipped state. - speed={intervalMs:N} paces with the setTimeout seam, reaching done after the last tick. - speed=paused leaves status=playing with no dispatches. - Empty transcript reports done with state still idle. - Fetch rejection surfaces an error status with the message. - Malformed NDJSON lines are skipped; valid events around them still land. - .gz transcripts route through the gunzip seam. Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream + replay), all on one branch ready for review. Phases 8+ (Theater components) consume these from this PR. * fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review) Two P1 fixes from lefarcen's review on PR #1307: SSE payload override `sseToPanelEvent` previously spread `data` after the channel-derived `type`, so a payload-provided `type` could override the channel and route a `critique.run_started` frame into the reducer as a `ship` action. Reversed the spread so the channel-derived `type` is authoritative, and revalidated the resulting object through the contracts-level `isPanelEvent` predicate before returning. 
Frames that fail validation (missing runId, empty runId, unknown type) are dropped, so a malformed or compromised SSE frame can no longer dispatch a wrong-shape action into the reducer. Three new sse.test.ts cases pin the regression: hostile `type:'ship'` in the payload still resolves to `run_started`, missing runId is dropped, empty runId is dropped. Replay pause/resume `useCritiqueReplay` had one big effect keyed on `transcriptUrl` only, so flipping `speed` from `paused` to `instant` never re-fired and the held events sat undispatched. Split into a parse effect (depends on URL, fetches and stores events in state) and a pace effect (depends on parsed-events + speed, owns the cursor + timers). The playback cursor lives in a ref that survives pause/resume cycles, so flipping `paused` -> `instant` flushes from the current position rather than restarting (which would double-dispatch `run_started` and reset the reducer). Two new useCritiqueReplay.test.tsx cases: - paused-then-instant transitions from `playing` to `done` and reaches the shipped terminal phase - intervalMs paced playback dispatches one event, pauses to drain the next scheduled timer, flips to instant, and confirms the remaining transcript drains exactly once (cursor was preserved) Doc consistency The earlier source comment in useCritiqueReplay.ts claimed `live` "paces by recorded timestamps" while the impl used zero-delay timers and the PR body said it behaves like `instant`. Aligned to reality: `live` currently behaves like `{ intervalMs: 0 }` (events drain on successive microtasks via setTimeoutFn) because transcripts do not yet carry per-event timestamps. Honest timestamp-driven pacing is queued as a Phase 7+ follow-up. Validated: pnpm guard, pnpm --filter @open-design/web typecheck, Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite 96 files / 888 tests. 
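The payload-override fix described above comes down to spread order plus revalidation. The sketch below illustrates that idea only. `decodeFrame`, `isPanelEventLike`, and the simplified `PanelEvent` shape are stand-ins invented for the example; the real `sseToPanelEvent` and `isPanelEvent` live in the repository and may differ, and the per-variant field checks added later in this commit are not reproduced here.

```typescript
// Simplified stand-in for the contracts-level PanelEvent (assumption).
type PanelEvent = { type: string; runId: string; [key: string]: unknown };

const CHANNEL_PREFIX = "critique.";

// Event names taken from the commit text; the set itself is illustrative.
const KNOWN_TYPES = new Set([
  "run_started", "panelist_open", "panelist_dim", "panelist_must_fix",
  "panelist_close", "round_end", "ship", "degraded", "interrupted",
  "failed", "parser_warning",
]);

// Cheap shape check: known type plus a non-empty runId.
function isPanelEventLike(value: unknown): value is PanelEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const type = v["type"];
  const runId = v["runId"];
  return typeof type === "string" && KNOWN_TYPES.has(type) &&
    typeof runId === "string" && runId.length > 0;
}

function decodeFrame(channel: string, rawData: string): PanelEvent | null {
  if (!channel.startsWith(CHANNEL_PREFIX)) return null;
  const type = channel.slice(CHANNEL_PREFIX.length);
  let data: unknown;
  try {
    data = JSON.parse(rawData);
  } catch {
    return null; // malformed JSON is dropped, the stream keeps going
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) return null;
  // Spread the payload FIRST, then set type, so the channel-derived type
  // always wins over any `type` field smuggled into the payload.
  const candidate = { ...(data as Record<string, unknown>), type };
  return isPanelEventLike(candidate) ? candidate : null;
}
```

With this ordering, a frame on `critique.run_started` whose payload carries `type: "ship"` still decodes as `run_started`, and frames with a missing or empty `runId` decode to `null`.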
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread) * feat(web): Theater PanelistLane component (Phase 8.1) * feat(web): Theater ScoreTicker component (Phase 8.2) * feat(web): Theater RoundDivider component (Phase 8.3) * feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4) * feat(web): Theater TheaterDegraded chip (Phase 8.5) * feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6) * feat(web): Theater TheaterTranscript replay surface (Phase 8.7) * feat(web): Theater TheaterStage top-level container (Phase 8.8) * feat(web): Theater CSS using existing semantic tokens (no hex literals) * feat(web): Theater public exports barrel * fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314) Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen. State-lifecycle fixes (3 x P2) 1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`). Host hooks dispatch it when their gating prop changes so a stale run from a prior project / transcript cannot bleed into the next context. Reset is idempotent on idle (returns the same reference). 2. `useCritiqueStream` dispatches `__reset__` at the top of its connection effect, so a workspace switch from project A (which streamed a critique) to project B clears the reducer before the new EventSource opens. enabled=false also clears. 3. `useCritiqueReplay` dispatches `__reset__` at the top of its parse effect, so transcriptUrl swaps (including swap-to-null after a replay reached `shipped`) lift the reducer back to idle before the new fetch starts. SSE validation (1 x P2) 4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape` check after the cheap `isPanelEvent` predicate. A `critique.ship` frame missing `composite` / `round` / `status` / `artifactRef` is rejected before reaching the reducer, so TheaterCollapsed can no longer crash on `undefined.toFixed(1)`. 
Every variant's required fields are validated: run_started (protocolVersion, non-empty cast, maxRounds, threshold, scale), panelist_* (round, role, plus variant-specific shape), round_end (round, composite, mustFix, decision in {continue,ship}, reason), ship (round, composite, status, artifactRef.{projectId,artifactId}, summary), degraded (reason, adapter), interrupted (bestRound, composite), failed (cause), parser_warning (kind, position). Reducer correctness (1 x P2) 5. `panelist_open` now materializes the round + an empty panelist view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight the in-progress lane the instant the tag opens. Before this, a stream that emitted only `panelist_open` after `run_started` left `rounds = []` and the UI rendered no current round until a later `panelist_dim` arrived. Polish (3 x P3) 6. Brand role tint swaps from `var(--magenta, var(--accent))` to `var(--purple, var(--accent))`. `--purple` is actually defined across the design systems; `--magenta` is not, so Brand was silently falling through to `--accent` and looking identical to Designer. 7. New i18n key `critiqueTheater.interruptedSummary` for the interrupted-collapse copy ("Interrupted at round N, best composite X.X"). Previously the interrupted branch reused `shippedSummary` and the UI read "Shipped at round..." for a run that specifically did not ship. Native value in en + zh-CN; other locales fall back via `...en` spread. 8. `TheaterDegraded` heading id comes from `useId()` instead of a hardcoded `theater-degraded-heading`, so two chips rendered on the same page (chat history with multiple completed runs) keep their aria-labelledby references unambiguous. Tests (15 new cases) - reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data. 
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship. - useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false. - useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped. - TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...". - TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`. Validated - pnpm guard clean - pnpm --filter @open-design/web typecheck clean - Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new) - tests/i18n/locales.test.ts 5 of 5 across 18 locales * feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1) * feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2) * fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315) Addresses every blocker from codex, Siri-Ray, and lefarcen. The three state-lifecycle and SSE-validation issues they also flagged inherit fixes from PR #1314's review pass that this branch now sits on top of after rebase. Real daemon kill on Interrupt (P1) - CritiqueTheaterMount now POSTs to /api/projects/:id/critique/:runId/interrupt alongside the optimistic local dispatch. Before this fix, clicking Interrupt only flipped the React state to interrupted while the daemon job kept running. The fetch is best-effort: a 404 (endpoint not wired yet, lands in Phase 15) is swallowed with a dev-mode console.warn so the UI still moves to the collapsed badge. - New fetchInterrupt test seam lets RTL assert on the URL / method and simulate the "daemon not ready yet" path. 
Two tests pin both: the happy URL proj-42/critique/run-abc/interrupt POSTs, and a rejected fetch still flips the UI. interruptPending reset on new run (P2) - A ref-backed effect compares the current runId against the last one we saw; when it changes, interruptPending is cleared. A user who interrupts run-1 and then triggers run-2 from the same mount now gets a fresh, enabled kill button instead of one stuck in "Interrupting…". Pinned by a new mount test. Escape keybind scope (P2) - InterruptButton now checks the keydown target. Escape inside an input, textarea, select, or contenteditable element is ignored (and any ancestor of those via closest() is treated the same way). Body-level focus still fires the keybind so the Theater area's affordance keeps working. Four new tests cover textarea, input, contenteditable, and the body-focus positive case. userFacingName i18n key (P2) - The spec at specs/current/critique-theater.md:6 mandates a single critiqueTheater.userFacingName key so the "Design Jury" label can be renamed without touching code. Phase 8 introduced critiqueTheater.title by mistake; renamed across types.ts, en.ts, zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer TheaterStage.tsx. The locale alignment test stays green. Validated - pnpm guard clean - pnpm --filter @open-design/web typecheck clean - Theater suite: 14 files, 112 tests (was 101 before, +11 new for the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope; the rest were already in #1314's review fix). - tests/i18n/locales.test.ts 5 of 5 across 18 locales. * feat(daemon): adapter-degraded registry with TTL (Phase 10.1) In-memory registry recording adapters that produced malformed or oversize transcripts so the orchestrator can skip them for a TTL window (default 24h) instead of cycling through known-bad providers on every run. Records carry reason (malformed_block \| oversize_block \| missing_artifact), source label, and expiresAt. 
The test-only clock seam lets the suite advance time deterministically and prove that an expired entry stops counting as degraded without anyone calling clearDegraded. 7/7 vitest cases green. * feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2) Two test-only adapters that read the existing v1 transcript fixtures (happy-3-rounds and malformed-unbalanced) and replay them as either a full string or a 512-byte chunked stream. The chunked form is what the conformance harness uses to prove the parser holds together when the transcript arrives in arbitrary network slices, not as one buffered blob. * feat(daemon): adapter conformance harness (Phase 10.3) runAdapterConformance pulls a transcript through the same parseCritiqueStream pipeline the orchestrator uses and classifies the outcome as shipped, degraded, or failed. On a degraded outcome it forwards the matched reason to the adapter-degraded registry, so a single nightly conformance run is what populates the skip list rather than the orchestrator learning each adapter is broken at request time. 5/5 vitest cases green covering shipped, malformed degraded, oversize degraded, no-ship failure, and the harness-thrown failure path. * test(e2e): Critique Theater Playwright suite (Phase 11) Six tests, one viewport per visual case, deterministic SSE fixtures stubbed via page.route(). Adds the suite to test:ui:extended so the existing extended-UI lane picks it up. Coverage: 1. Happy path: a single mounted theater plays the full fixture (1 run_started, 5 panelists open / dim / must_fix / close, 1 round_end, 1 ship) and ends on the score badge. 2. Interrupt mid-run: the panelist that is open at the time the interrupt button is clicked closes with an interrupted marker and the transcript freezes there. 3. Visual regression at 375x720 mobile. 4. Visual regression at 768x1024 tablet. 5. Visual regression at 1280x800 desktop. 6. 
A11y role tree: the theater region exposes a labelled landmark, each panelist lane is a group with an accessible name, the score is a status live region. All SSE traffic is stubbed by page.route so the suite runs in CI without a daemon. The toggle is seeded via localStorage by bootAppWithCritiqueEnabled so the gate behaves as if Settings flipped it on. typecheck clean; playwright --list reports 6. * test(web): reducer p99 bench at 10k iterations (Phase 13.1) Locks the documented 2ms budget for the Critique Theater reducer on a representative SSE script (27 actions, one full happy run) behind a regression gate. Asserts p99 stays under 4ms (2x the documented budget) so CI runners with a noisy neighbour do not flake while a real regression to 20ms or 200ms still trips. The bench is a vitest case rather than a bare microbenchmark so it runs in the same CI lane as every other web test and does not need a parallel runner. * test(web): critique surface coverage walker (Phase 13.2) Walks the public critique surface (11 SSE event names, 5 panelist roles, 6 lifecycle phases, 9 named i18n keys) and asserts each named symbol appears in both the src corpus and the test corpus. The walker is the gate that catches a rename in one half of the codebase without a matching update in the other half: a future PR that drops 'panelist_must_fix' from the reducer without also removing its test reference fails this suite. 62 assertions, one per symbol per corpus. * docs: Critique Theater user guide (Phase 14.1) Seven sections aimed at end users (not contributors): 1. What is Design Jury 2. How it works (the five panelists, auto-converging rounds, the composite formula) 3. Settings (the M1 toggle and what it does) 4. Reading the score badge 5. Replay surface 6. Troubleshooting (degraded, interrupted, failed) 7. 
FAQ The composite formula is documented as designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2 because anyone trying to reverse-engineer the score is going to search for those weights and the docs are the place they should land first. * docs(daemon): critique module AGENTS map (Phase 14.2) Daemon-side wayfinder for the apps/daemon/src/critique directory. Tables every file, what owns what invariant, and the 'when you change anything here' guide so a future contributor does not have to reverse-engineer the rollout resolver before adding a new SSE event. * docs(web): Theater module AGENTS map (Phase 14.3) Web-side mirror of the daemon AGENTS map. Same file table, same invariants section, same change-impact guide, sized to the Theater component package. * docs: tighten Phase 14 reasoning from lefarcen review (PR #1319) Four content gaps lefarcen flagged in the Phase 14 docs review, addressed inline rather than deferred. The fifth item (scope-drift between 'docs only' PR body and the cumulative stacked diff) is handled by rewriting the PR body, not the docs. 1. Round exit conditions (lefarcen P2-1). docs/critique-theater.md §2 'Auto-converging rounds' now lists the five conditions that stop a run (threshold reached, round budget exhausted, per-round timeout, total timeout, user interrupt) with their default values. A user debugging a run that stopped at round 1 with composite 5.4 can read this list and find the matching cause without spelunking the orchestrator. 2. Prior-art comparison (lefarcen P2-2). New §1.5 'Why an in-CLI panel and not a third-party design lint' pre-answers the 'why not Figma lint / Adobe checker / Material You conformance' question. Three differences: rule engines vs generative reviewers, post-hoc vs in-loop, external service vs same-CLI-session. 3. Composite formula rationale (lefarcen P2-4). 
§2 now explains why each weight is set the way it is: critic gates correctness so it gets 0.4; brand / a11y / copy are secondary quality dimensions at 0.2 each; designer is at 0.0 in v1 because aesthetic preference is not a ship gate. The slot stays in the schema so notes flow into the transcript and a v2 config release can bump the weight without a wire-shape change. 4. v2 cast-config ownership (lefarcen P2-3). Both AGENTS.md files (daemon + web) now declare a 'Designer weight frozen at 0.0 until v2 cast config' invariant. The daemon side calls out where the SKILL.md frontmatter resolver lands (apps/daemon/src/critique/config.ts); the web side calls out where the Settings surface lands (apps/web/src/components/ Settings/). A contributor reading either AGENTS.md before implementing v2 sees which module to touch first. * docs(web): mirror the Designer-weight invariant in Theater AGENTS.md (PR #1319) lefarcen P1 follow-up on PR #1319: the daemon AGENTS.md already declares 'Designer weight is frozen at 0.0 until v2 cast config lands' as an invariant, but the web AGENTS.md's parallel bullet led with 'Composite weights are read-only on the web side' which buried the Designer-specific constraint. A web contributor reading that bullet would not realise the v1 weight distribution is wire-shape (changing it mid-v1 invalidates persisted critique_runs composite values). Rewrote the bullet to lead with the same 'Designer weight is frozen at 0.0 until v2 cast config lands' phrasing the daemon side uses, and added an explicit cross-link to the daemon AGENTS.md so the two halves of the invariant read as one rule. Web-side specifics retained: ScoreTicker / TheaterCollapsed read composite off the wire (no client recompute), v2 lands as a Settings surface at apps/web/src/components/Settings/, do not add a 'weights' prop to any component in this directory until the contracts package carries the v2 cast type. 
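The v1 composite weights quoted in the docs commits above can be checked with a few lines of arithmetic. This is an illustrative sketch only: the `Role` type, `V1_WEIGHTS`, and `composite` are names made up for the example, and the commit text states that the web side reads the composite off the wire rather than recomputing it, so nothing like this belongs in the Theater components.

```typescript
// Panelist roles named in the commit text.
type Role = "designer" | "critic" | "brand" | "a11y" | "copy";

// Weights as documented: designer 0, critic 0.4, brand / a11y / copy 0.2 each.
const V1_WEIGHTS: Record<Role, number> = {
  designer: 0,
  critic: 0.4,
  brand: 0.2,
  a11y: 0.2,
  copy: 0.2,
};

// Weighted sum of per-role scores (hypothetical helper, for arithmetic only).
function composite(scores: Record<Role, number>): number {
  let total = 0;
  for (const role of Object.keys(V1_WEIGHTS) as Role[]) {
    total += V1_WEIGHTS[role] * scores[role];
  }
  return total;
}

// Worked example with made-up scores:
// 0 * 9 + 0.4 * 8 + 0.2 * 7 + 0.2 * 6 + 0.2 * 5 = 3.2 + 1.4 + 1.2 + 1.0 = 6.8
const example = composite({ designer: 9, critic: 8, brand: 7, a11y: 6, copy: 5 });
console.log(example.toFixed(1));
```

Because the designer weight is 0, changing only the designer score leaves the result unchanged, which matches the "aesthetic preference is not a ship gate" rationale given above.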
* docs: replace deferred metrics endpoint reference + refresh Theater module map (PR #1319) Two carryover items lefarcen flagged across the PR #1319 + #1320 reviews. 1. docs/critique-theater.md was sending users to /api/metrics/critique as the conformance-status check on malformed_block, but the Phase 12 metrics endpoint is explicitly deferred until after orchestrator wiring lands. Replaced the link with the pnpm conformance-harness command that DOES exist today (pnpm --filter @open-design/daemon vitest run tests/critique-conformance.test.ts) and noted that the dashboard surfaces this status as a series once Phase 12 ships. 2. apps/web/src/components/Theater/AGENTS.md module map was stale after Phase 15: the index.ts row said 'only two hooks are exported' but the barrel now exports useCritiqueTheaterEnabled too (plus the setCritiqueTheaterEnabled setter). Updated the row to list all three hooks + the setter + the reducer-derived contract types, and added a new row for hooks/useCritiqueTheaterEnabled.ts in the file table so a web contributor scanning the table sees the new hook without inferring it from the index.ts blurb. * fix(web): restore wait-for-daemon-ack pattern on Theater interrupt Same regression as flagged on PR #1316 post-main-merge: the optimistic local dispatch fired before the POST resolved, so a daemon 404 / 409 still terminalized the UI and the real SSE terminal event got ignored by the sticky interrupted phase. Snapshot runId / bestRound / composite at click time, dispatch interrupted only on res.ok, clear interruptPending on rejection or non-2xx so the user can retry. Tests cover rejection + 404 leaving the run on the live stage; the 204 path waits for the ack. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-13 12:11:48 +08:00
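The reconnect behaviour described for createCritiqueEventsConnection in the commit above (exponential backoff from 1s up to 30s, reset on `ready`) can be sketched as a small schedule object. The class name and the doubling factor are assumptions; the commit text gives only the 1s and 30s bounds and the reset rule.

```typescript
const BASE_DELAY_MS = 1000;  // first retry after 1s (from the commit text)
const MAX_DELAY_MS = 30000;  // delays are capped at 30s (from the commit text)

// Hypothetical helper: tracks how many reconnects have been attempted in a row.
class ReconnectBackoff {
  private attempt = 0;

  // Delay before the next reconnect; doubles each time (assumed factor) up to the cap.
  nextDelayMs(): number {
    const delay = Math.min(BASE_DELAY_MS * 2 ** this.attempt, MAX_DELAY_MS);
    this.attempt += 1;
    return delay;
  }

  // Call when the stream signals `ready`, so the next failure starts at 1s again.
  reset(): void {
    this.attempt = 0;
  }
}
```

Under the doubling assumption, successive failures would wait 1s, 2s, 4s, 8s, 16s, then 30s for every further attempt until `reset()` is called.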
mehmet turac	313008933a	fix(memory): add loading state to Refresh button in Memory settings (#1477) * fix(memory): add loading state to Refresh button in Memory settings Addressed review feedback: - Fixed icon class: use 'icon-spin' instead of 'spin' - Added 'settings.memoryExtractionsRefreshing' to types.ts and all locale files - Removed unrelated tools-pack changes (split to separate PR) Fixes #1418 * fix(memory): remove duplicate translation and add missing Thai locale Addressed review feedback: - Removed duplicate settings.memoryExtractionsRefreshing in en.ts - Added settings.memoryExtractionsRefreshing to th.ts Fixes #1418	2026-05-13 12:06:30 +08:00
Samay	874b1e9db3	fix: treat Codex reconnect events as warnings not fatal errors (#1482) * fix: treat Codex reconnect events as warnings not fatal errors Reconnecting... x/5 events are recoverable; Codex eventually completes successfully. Surface them as status events instead of failing the entire run. Fixes #1471 * test: add regression tests for codex reconnect warning handling (#1471) * test: add regression tests for codex reconnect warning handling (#1471)	2026-05-13 12:00:55 +08:00
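The idea in the commit above, surfacing recoverable reconnect notices as status events instead of failing the run, can be sketched as a small classifier. Everything here is an assumption for illustration: the `classifyAgentMessage` name, the result shape, and the regular expression, which is modelled only on the "Reconnecting... x/5" wording quoted in the commit message and not on the actual Codex output format.

```typescript
// Hypothetical result shape: a non-fatal status update or a real error.
type Classified =
  | { kind: "status"; attempt: number; maxAttempts: number }
  | { kind: "error"; message: string };

// Assumed message shape, based on the "Reconnecting... x/5" wording above.
const RECONNECT_PATTERN = /^Reconnecting\.\.\.\s*(\d+)\/(\d+)/;

function classifyAgentMessage(message: string): Classified {
  const match = RECONNECT_PATTERN.exec(message.trim());
  if (match) {
    // Recoverable: report progress and let the run continue.
    return {
      kind: "status",
      attempt: Number(match[1]),
      maxAttempts: Number(match[2]),
    };
  }
  // Anything else is still treated as a genuine error.
  return { kind: "error", message };
}
```

A caller would keep the run alive on `kind: "status"` and fail it only on `kind: "error"`.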
Prantik Medhi	03ed39602e	fix: preserve Claude tool inputs (#1476) Fixes nexu-io/open-design#1475	2026-05-13 11:25:02 +08:00
Nicholas-Xiong	9db5dd8798	fix: close Settings modal when opening project from Routines (#1490) * fix: close Settings modal when opening project from Routines Fixes #1355 When clicking 'Open project' in Routine history, the Settings modal now closes automatically, providing a clean transition to the project. Changes: - Added onClose prop to RoutinesSection component - Call onClose after navigate() in Open project button handler - Pass onClose from SettingsDialog to RoutinesSection This ensures the modal dismisses after navigation, consistent with other modal-based navigation actions in the app. * fix: pass onClose prop to RunHistory component Address review feedback: RunHistory component now receives and uses the onClose prop to properly close Settings modal when navigating.	2026-05-13 11:15:36 +08:00
Rocky	1b84af44c0	fix(web/i18n): translate template platform selection + Companion surfaces to Chinese (#1491 ) The template creation panel's Target platforms section was hardcoded in English with no i18n keys at all, while the adjacent Companion surfaces section had i18n keys in place but zh-CN.ts and zh-TW.ts stored English placeholder values for them. Both sections appeared in English regardless of the user's selected language, as reported for Chinese in #1415. Fix: - Add 14 new keys to `apps/web/src/i18n/types.ts` for the Target platforms section (`newproj.targetPlatformsLabel`, `newproj.targetPlatformsHint`, plus `label`/`hint` pairs for each of the six platform cards: responsive, webDesktop, mobileIos, mobileAndroid, tablet, desktopApp). - Populate the new keys in `en.ts` (source of truth) and translate to Simplified Chinese in `zh-CN.ts` and Traditional Chinese in `zh-TW.ts`. - Replace the English-placeholder values in `zh-CN.ts` and `zh-TW.ts` for the pre-existing Companion surfaces keys (`surfaceOptionsLabel`, `includeLandingPage`, `includeLandingPageHint`, `includeOsWidgets`, `includeOsWidgetsHint`, `includeOsWidgetsDisabledHint`). - Refactor `DESIGN_PLATFORMS` in `NewProjectPanel.tsx` to store `keyof Dict` entries instead of hardcoded English strings; `PlatformPicker` now pulls translated labels and hints through `useT()` at render time. Other locales (de, fr, ja, ko, ru, ar, pl, hu, uk, tr, th, es-ES, pt-BR, fa, id) use `...en` spread in their locale files, so they inherit the new English defaults automatically and can be translated by native speakers in follow-up PRs. Chinese translations follow Open Design's existing zh-CN.ts style conventions: 应用 (app), 移动 (mobile), 营销 (marketing) are consistent with prior entries in the file; 画框 (frame) matches Figma's Chinese UI for the same mockup-frame concept. 
Verified: - `pnpm guard` clean - `pnpm --filter @open-design/web typecheck` clean - 945 web tests pass (`pnpm --filter @open-design/web test`) - Manually verified in a locally-built `tools-pack` app with 简体中文 selected: the template creation panel's Target platforms and Companion surfaces sections now render fully in Chinese with no layout breakage. Fixes #1415	2026-05-13 11:10:57 +08:00
Nicholas-Xiong	0e3438731a	fix: add spacing between window controls and logo on macOS (#1480 ) * fix: add spacing between window controls and logo on macOS Fixes #1427 When hovering over the macOS window control area, the traffic light controls appeared too close to the Open Design logo, creating visual crowding in the title bar. Changes: - Added 8px margin-right to .app-chrome-traffic-space - This creates breathing room between the window controls and the brand logo - The spacing applies on top of the existing 12px gap in the header - Added explanatory comment for future maintainers Result: - ✅ Window controls and logo have proper spacing - ✅ Title bar feels more balanced and polished - ✅ No visual crowding around the brand mark - ✅ Maintains clean layout on macOS This is a visual polish fix that improves the perceived quality of the app's title bar, one of the most visible parts of the interface. * fix: only apply traffic light spacing on macOS Address review feedback: Use a separate CSS variable --app-chrome-traffic-margin that defaults to 0px and is only set to 8px on macOS via MAC_WINDOW_CHROME_CSS injection. This prevents the spacing from affecting non-macOS platforms where traffic-space is 0px. Changes: - Added --app-chrome-traffic-margin variable (default: 0px) - Set it to 8px in MAC_WINDOW_CHROME_CSS for macOS only - Removed unconditional margin-right from base style	2026-05-13 11:06:58 +08:00
Prantik Medhi	def9544996	fix(web): tighten routines project radios (#1493)	2026-05-13 11:05:18 +08:00
sukumarp2022	b167991d7c	feat: add project-level and user-level custom instructions (#1304 ) * feat: add project-level and user-level custom instructions Implements #510 — editable custom instructions that get injected into every model message, at both user level (Settings → Memory) and project level (pencil icon in project header). - Add customInstructions to Project, AppConfigPrefs contracts - Add custom_instructions column migration to projects table - Inject user + project instructions into system prompt (after memory, before design system; project-level wins on conflict) - Add Settings textarea for user-level instructions - Add inline editor bar in ProjectView for project-level instructions - Sync user-level instructions through daemon app-config round-trip * fix: address PR review — validation, draft reset, length limit - Reset instructionsDraft on Cancel and toggle close (stale draft bug) - Thread customInstructions through POST /api/projects create handler - Add type + length validation (5000 chars) in PATCH handler - Enforce length cap in app-config applyConfigValue - Add maxLength={5000} to both UI textareas - Resync draft via useEffect when editor is closed - Remove stray run.sh from commit * fix: address maintainer review — save race condition, precedence wording - Make handleSaveInstructions async with await + revert on failure - Add instructionsSaving state to disable Save/Cancel/textarea during save - Clarify precedence wording with concrete example in both prompt composers - UpdateProjectRequest already has customInstructions (verified) * fix: use server-returned project in save handler, drop optimistic update The previous optimistic-update + revert approach captured a stale project snapshot in the useCallback closure. On failure, reverting with the captured object could clobber unrelated project fields that changed during the async request. 
Switch to pessimistic update: wait for patchProject to succeed, then call onProjectChange(result) with the server-returned project object. The instructionsSaving flag disables the editor UI during the round-trip. * fix: align create/PATCH validation for customInstructions Create endpoint now rejects invalid types and >5000 char values with 400 instead of silently truncating, matching the PATCH handler behavior.	2026-05-12 14:27:57 -04:00
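The create/PATCH validation described in this commit (reject a non-string or over-5000-character `customInstructions` with a 400 rather than truncating) could look roughly like the sketch below. The helper name `validateCustomInstructions`, the result shape, and the treatment of `undefined` as "field not provided" are assumptions for illustration; only the 5000-character cap and the 400 status come from the commit message.

```typescript
// Hypothetical helper: the name and result shape are illustrative,
// not taken from the repository. Only the 5000-char cap and the 400
// rejection come from the commit message.
const MAX_CUSTOM_INSTRUCTIONS_LENGTH = 5000;

type ValidationResult =
  | { ok: true; value: string | undefined }
  | { ok: false; status: 400; error: string };

function validateCustomInstructions(input: unknown): ValidationResult {
  // Assumption: an absent field is allowed and means "no change".
  if (input === undefined) return { ok: true, value: undefined };

  // Reject wrong types instead of coercing them.
  if (typeof input !== "string") {
    return { ok: false, status: 400, error: "customInstructions must be a string" };
  }

  // Reject over-long values instead of silently truncating them.
  if (input.length > MAX_CUSTOM_INSTRUCTIONS_LENGTH) {
    return {
      ok: false,
      status: 400,
      error: `customInstructions exceeds ${MAX_CUSTOM_INSTRUCTIONS_LENGTH} characters`,
    };
  }

  return { ok: true, value: input };
}
```

Sharing one validator between the create and PATCH handlers is one way to keep the two endpoints from drifting apart again.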
Neha Prasad	342ba44383	fix memory extraction history affordance (#1447)	2026-05-12 13:35:34 -04:00
PerishFire	61163d6b92	Optimize Windows packaged prebundle flow (#1389)	2026-05-12 12:07:32 -04:00
Eli	49ea2499ac	[codex] Add draw annotation workflow (#1435 ) * feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal * Add draw annotation workflow * Restore project actions toolbar	2026-05-12 21:54:59 +08:00
nettee	f621dbbfea	feat(web): Add Tailwind foundation (#1388)	2026-05-12 21:48:16 +08:00
mehmet turac	7e2168ed29	fix(picker): improve provider group header separation in Media model picker (#1441) Added min-height and border-bottom to the sticky provider group header to ensure it fully separates from the model content below. Fixes #1434	2026-05-12 09:47:44 -04:00
Neha Prasad	2d405fae96	fix:align artifact preview exit button (#1445)	2026-05-12 21:39:31 +08:00
Nagendhra Madishetti	09a8fa8d64	feat(web): Critique Theater Phase 8 (8 Theater components, barrel, role-keyed CSS) (#1314 ) * feat(web): pure reducer for Critique Theater states (Phase 7.1) Pure CritiqueState reducer driven by the contracts-level PanelEvent (the same shape both the live SSE stream and the recorded transcript emit), so a single reducer powers both the in-flight panel and the rerun replay. Lifecycle covers run_started → running → (shipped / degraded / interrupted / failed), with panelist_open / dim / must_fix / close / round_end events building per-round CritiquePanelistView entries as they arrive. Defensive behaviour that surfaced while writing the spec tests: - Terminal phases (shipped / degraded / interrupted / failed) are sticky against further lifecycle events for the same run, except for parser_warning which can land late and is recorded in a side channel without changing phase. - A new run_started for a different runId at any time discards the prior state and reboots, so the UI can launch consecutive runs without an explicit reset action. - Events whose runId does not match the active run return the same state reference, so React's useReducer doesn't re-render subscribers on stray traffic. - Round bookkeeping keys by round number rather than "always last", so an out-of-order panelist_dim for round 1 arriving after a round 2 dim does not corrupt the round 2 bucket. Test coverage: 18 cases covering each transition, the runId guard, sticky-terminal behaviour, the out-of-order round invariant, and the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire SSE + replay into the same reducer. 
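The defensive reducer rules listed above (sticky terminal phases, late `parser_warning` in a side channel, reboot on a new `runId`, same-reference return for stray traffic) can be sketched as follows. This is a heavily reduced model, not the real `CritiqueState` reducer: the state fields, the trimmed event union, and the decision to ignore a repeated `run_started` for the active run are all assumptions.

```typescript
// Reduced, hypothetical model of the reducer rules described above.
type Phase = "idle" | "running" | "shipped" | "degraded" | "interrupted" | "failed";

interface State {
  phase: Phase;
  runId: string | null;
  warnings: string[]; // side channel for late parser warnings
}

type Event =
  | { type: "run_started"; runId: string }
  | { type: "ship"; runId: string }
  | { type: "failed"; runId: string }
  | { type: "parser_warning"; runId: string; kind: string };

const TERMINAL_PHASES = new Set<Phase>(["shipped", "degraded", "interrupted", "failed"]);

const initialState: State = { phase: "idle", runId: null, warnings: [] };

function reduce(state: State, event: Event): State {
  if (event.type === "run_started") {
    // A run_started for a different runId discards prior state and reboots.
    if (event.runId !== state.runId) {
      return { phase: "running", runId: event.runId, warnings: [] };
    }
    // Assumption: a repeated run_started for the active run is a no-op.
    return state;
  }

  // Stray traffic for another run: return the same reference so that
  // React's useReducer does not re-render subscribers.
  if (event.runId !== state.runId) return state;

  // parser_warning may land late; record it without changing phase.
  if (event.type === "parser_warning") {
    return { ...state, warnings: [...state.warnings, event.kind] };
  }

  // Terminal phases are sticky against further lifecycle events.
  if (TERMINAL_PHASES.has(state.phase)) return state;

  return { ...state, phase: event.type === "ship" ? "shipped" : "failed" };
}
```

Returning the identical `state` object on ignored events is what gives the stable-identity guarantee mentioned in the test coverage notes.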
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2) createCritiqueEventsConnection is a pure connection manager that mirrors apps/web/src/providers/project-events.ts: opens an EventSource at /api/projects/:id/events, listens for every name in CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent (stripping the critique. prefix and merging the data payload), and hands it to the caller's onEvent. Reconnect uses exponential backoff (1s → 30s) and resets on `ready`; malformed payloads drop with a dev-mode warning rather than tearing the stream. useCritiqueStream wraps the manager in a useReducer that owns the CritiqueState. enabled=false or a null projectId tears down the connection cleanly; switching projectId closes the old connection and opens a fresh one. The returned dispatch lets local UI synthesise actions (e.g. an Esc keypress firing a synthetic interrupted while a kill request is in flight); production traffic comes from the SSE stream. Test coverage: - sse.test.ts (10 cases, node env): subscription set covers every CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire shape back to PanelEvent; malformed JSON is swallowed and does not stop the stream; exponential backoff schedule and ready-reset semantics are pinned with a setTimeout seam; close() cancels pending reconnects and shuts the live source; no-op fallback when EventSource is unavailable. - useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event, reducer driven by synthetic actions, no connection when disabled or projectId is null, clean close on unmount, projectId change reopens cleanly. * feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3) Fetches the per-run NDJSON transcript (one PanelEvent per line), parses every line via the shared isPanelEvent predicate, and dispatches into the same CritiqueState reducer the live SSE stream uses. 
A single reducer means the UI rendering a replay can be identical to the live panel, and a UI mounting both useCritiqueStream and useCritiqueReplay in parallel does not have to reconcile two state shapes. speed knob is `paused | instant | live | { intervalMs: N }`. - instant flushes every event synchronously, useful for opening a finished run already at its terminal state. - intervalMs paces dispatches at a fixed cadence so the reviewer can watch the run unfold. - paused parses the transcript but holds events back until the caller advances speed (consumers can drive a scrubber later). - live is reserved for the future "playback at original cadence" feature, currently treated as instant; replay timestamps are not yet persisted with each event so honest pacing requires a follow-up Phase 7+ task. gunzip seam handles `.ndjson.gz` transcripts via DecompressionStream when present; the production fetch path picks between text and arrayBuffer based on the URL extension. Both seams are injectable so the unit tests don't need to spin up a real network or a real gzip pipeline. Test coverage (8 cases, jsdom env): - Idle status before any URL is provided. - speed=instant flushes the full transcript synchronously to shipped state. - speed={intervalMs:N} paces with the setTimeout seam, reaching done after the last tick. - speed=paused leaves status=playing with no dispatches. - Empty transcript reports done with state still idle. - Fetch rejection surfaces an error status with the message. - Malformed NDJSON lines are skipped; valid events around them still land. - .gz transcripts route through the gunzip seam. Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream + replay), all on one branch ready for review. Phases 8+ (Theater components) consume these from this PR. 
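The transcript parsing described for Phase 7.3 (one event per NDJSON line, malformed lines skipped, valid lines around them kept) might be sketched like this. The function name `parseNdjsonTranscript` and the header-only predicate `hasEventHeader` are stand-ins invented for illustration; the real code uses the shared `isPanelEvent` predicate, which a later commit in this log tightens into a deep validator.

```typescript
// Hypothetical stand-in for the shared isPanelEvent predicate:
// checks only the header (string type + non-empty runId).
interface EventHeader {
  type: string;
  runId: string;
}

function hasEventHeader(value: unknown): value is EventHeader {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.type === "string" &&
    typeof record.runId === "string" &&
    record.runId.length > 0
  );
}

// Parse an NDJSON transcript: one JSON document per line.
function parseNdjsonTranscript(text: string): EventHeader[] {
  const events: EventHeader[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (hasEventHeader(parsed)) events.push(parsed);
      // Lines that parse but fail the predicate are dropped.
    } catch {
      // Malformed JSON is skipped; later valid lines still land.
    }
  }
  return events;
}
```

Because bad lines are dropped individually rather than aborting the parse, a single corrupt record does not prevent a replay from reaching its terminal event.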
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review) Two P1 fixes from lefarcen's review on PR #1307: SSE payload override `sseToPanelEvent` previously spread `data` after the channel-derived `type`, so a payload-provided `type` could override the channel and route a `critique.run_started` frame into the reducer as a `ship` action. Reversed the spread so the channel-derived `type` is authoritative, and revalidated the resulting object through the contracts-level `isPanelEvent` predicate before returning. Frames that fail validation (missing runId, empty runId, unknown type) are dropped, so a malformed or compromised SSE frame can no longer dispatch a wrong-shape action into the reducer. Three new sse.test.ts cases pin the regression: hostile `type:'ship'` in the payload still resolves to `run_started`, missing runId is dropped, empty runId is dropped. Replay pause/resume `useCritiqueReplay` had one big effect keyed on `transcriptUrl` only, so flipping `speed` from `paused` to `instant` never re-fired and the held events sat undispatched. Split into a parse effect (depends on URL, fetches and stores events in state) and a pace effect (depends on parsed-events + speed, owns the cursor + timers). The playback cursor lives in a ref that survives pause/resume cycles, so flipping `paused` -> `instant` flushes from the current position rather than restarting (which would double-dispatch `run_started` and reset the reducer). 
Two new useCritiqueReplay.test.tsx cases: - paused-then-instant transitions from `playing` to `done` and reaches the shipped terminal phase - intervalMs paced playback dispatches one event, pauses to drain the next scheduled timer, flips to instant, and confirms the remaining transcript drains exactly once (cursor was preserved) Doc consistency The earlier source comment in useCritiqueReplay.ts claimed `live` "paces by recorded timestamps" while the impl used zero-delay timers and the PR body said it behaves like `instant`. Aligned to reality: `live` currently behaves like `{ intervalMs: 0 }` (events drain on successive microtasks via setTimeoutFn) because transcripts do not yet carry per-event timestamps. Honest timestamp-driven pacing is queued as a Phase 7+ follow-up. Validated: pnpm guard, pnpm --filter @open-design/web typecheck, Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite 96 files / 888 tests. * feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread) * feat(web): Theater PanelistLane component (Phase 8.1) * feat(web): Theater ScoreTicker component (Phase 8.2) * feat(web): Theater RoundDivider component (Phase 8.3) * feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4) * feat(web): Theater TheaterDegraded chip (Phase 8.5) * feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6) * feat(web): Theater TheaterTranscript replay surface (Phase 8.7) * feat(web): Theater TheaterStage top-level container (Phase 8.8) * feat(web): Theater CSS using existing semantic tokens (no hex literals) * feat(web): Theater public exports barrel * fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314) Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen. State-lifecycle fixes (3 x P2) 1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`). 
Host hooks dispatch it when their gating prop changes so a stale run from a prior project / transcript cannot bleed into the next context. Reset is idempotent on idle (returns the same reference). 2. `useCritiqueStream` dispatches `__reset__` at the top of its connection effect, so a workspace switch from project A (which streamed a critique) to project B clears the reducer before the new EventSource opens. enabled=false also clears. 3. `useCritiqueReplay` dispatches `__reset__` at the top of its parse effect, so transcriptUrl swaps (including swap-to-null after a replay reached `shipped`) lift the reducer back to idle before the new fetch starts. SSE validation (1 x P2) 4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape` check after the cheap `isPanelEvent` predicate. A `critique.ship` frame missing `composite` / `round` / `status` / `artifactRef` is rejected before reaching the reducer, so TheaterCollapsed can no longer crash on `undefined.toFixed(1)`. Every variant's required fields are validated: run_started (protocolVersion, non-empty cast, maxRounds, threshold, scale), panelist_* (round, role, plus variant-specific shape), round_end (round, composite, mustFix, decision in {continue,ship}, reason), ship (round, composite, status, artifactRef.{projectId,artifactId}, summary), degraded (reason, adapter), interrupted (bestRound, composite), failed (cause), parser_warning (kind, position). Reducer correctness (1 x P2) 5. `panelist_open` now materializes the round + an empty panelist view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight the in-progress lane the instant the tag opens. Before this, a stream that emitted only `panelist_open` after `run_started` left `rounds = []` and the UI rendered no current round until a later `panelist_dim` arrived. Polish (3 x P3) 6. Brand role tint swaps from `var(--magenta, var(--accent))` to `var(--purple, var(--accent))`. 
`--purple` is actually defined across the design systems; `--magenta` is not, so Brand was silently falling through to `--accent` and looking identical to Designer. 7. New i18n key `critiqueTheater.interruptedSummary` for the interrupted-collapse copy ("Interrupted at round N, best composite X.X"). Previously the interrupted branch reused `shippedSummary` and the UI read "Shipped at round..." for a run that specifically did not ship. Native value in en + zh-CN; other locales fall back via `...en` spread. 8. `TheaterDegraded` heading id comes from `useId()` instead of a hardcoded `theater-degraded-heading`, so two chips rendered on the same page (chat history with multiple completed runs) keep their aria-labelledby references unambiguous. Tests (15 new cases) - reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data. - sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship. - useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false. - useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped. - TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...". - TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`. 
Validated - pnpm guard clean - pnpm --filter @open-design/web typecheck clean - Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new) - tests/i18n/locales.test.ts 5 of 5 across 18 locales * fix(web): tighten isPanelEvent in contracts so enum + numeric fields are checked end-to-end (Siri-Ray round-3 P1 on PR #1314) The variant validator on the web SSE path previously accepted any `typeof === 'string'` for closed-enum fields (ship.status, panelist_.role, degraded.reason, failed.cause, parser_warning.kind, run_started.cast[]) and any `typeof === 'number'` for numeric fields, which let NaN / Infinity through. Downstream components index i18n tables by enum value, so an unknown status or role would land `SHIP_BADGE_KEY[final.status]` on undefined and crash the translator. The replay parser had a separate gap: `useCritiqueReplay.parseTranscript` called the cheap `isPanelEvent` header check directly, so a recorded line like `{"type":"ship","runId":"r"}` reached the reducer with composite, status, round, artifactRef, summary all undefined and TheaterCollapsed then called `final.composite.toFixed(1)` on undefined. Resolution: move all wire-side validation into the contract guard. - Export const arrays for the closed enums: SHIP_STATUSES, DEGRADED_REASONS, FAILED_CAUSES, PARSER_WARNING_KINDS, ROUND_DECISIONS (PANELIST_ROLES already existed). - Rewrite `isPanelEvent` in packages/contracts/src/critique.ts to be the single deep validator: header (known type + non-empty runId) plus every variant-specific required field plus closed-enum membership plus Number.isFinite on every numeric field. Documented as the wire source of truth. - Drop the local `hasValidVariantShape` from web/sse.ts; sseToPanelEvent now relies entirely on the contract guard, and parseTranscript in useCritiqueReplay (which already uses isPanelEvent) gets the deeper validation for free. 
Tests (TDD, red-first): - packages/contracts/tests/critique.test.ts: 13 new cases pinning the strict guard directly (well-formed across every variant, every rejection path: unknown type, empty/non-string runId, unknown enum, non-finite numeric, missing variant field). - apps/web/tests/components/Theater/state/sse.test.ts: 9 new cases for each closed-enum rejection on the wire path plus a positive sweep across every legal enum value across every variant. - apps/web/tests/components/Theater/hooks/useCritiqueReplay.test.tsx: 2 new cases for incomplete and unknown-enum transcript lines. Verified: - pnpm --filter @open-design/contracts test 4 files / 30 tests green. - pnpm --filter @open-design/contracts build clean. - pnpm --filter @open-design/web typecheck clean. - pnpm --filter @open-design/web test 107 files / 976 tests green. fix(contracts): enforce numeric domains in isPanelEvent (lefarcen P2 on PR #1314 round 4) The strict guard from PR #1314 round 3 enforced enum membership and Number.isFinite, but accepted any finite number where the contract intends a specific domain: scale: 0 (ScoreTicker divides by it), negative thresholds, fractional rounds, negative mustFix, etc. ScoreTicker.tsx writes `var(--scale, ${state.scale})` into inline CSS and divides by it for tick width, so a guard-passing scale: 0 shipped Infinity into the rendered style. Negative composite / score values reached downstream code that assumes >= 0. Resolution: mirror the daemon-side Zod domain constraints in the runtime guard. Three new helpers in packages/contracts/src/critique.ts: - isPositiveInt(v): integer with v > 0. Used for round, maxRounds, scale, protocolVersion (all 1-indexed in the orchestrator). - isNonNegativeInt(v): integer with v >= 0. Used for mustFix, position, bestRound. bestRound: 0 is the valid sentinel for 'interrupted before any round closed'. - isNonNegativeFinite(v): finite number with v >= 0. Used for composite, score, dimScore, threshold. 
Threshold may be fractional (e.g. 8.5 on a scale of 10). Cross-field check inside run_started: threshold <= scale (the daemon Zod schema enforces this with an epsilon refine, the wire guard matches the same intent). Tests (TDD, red-first) added in packages/contracts/tests/critique.test.ts: - 22 new rejection cases across every numeric field that previously slipped through: scale: 0, negative scale, fractional scale, maxRounds: 0, fractional maxRounds, protocolVersion: 0, fractional protocolVersion, negative threshold, threshold > scale, round: 0, fractional round, negative dimScore / score, negative / fractional mustFix, negative composite, ship round: 0, negative / fractional bestRound, negative interrupted composite, negative / fractional parser_warning position. - 3 positive boundary cases that must still pass: threshold == scale, fractional threshold within [0, scale], interrupted with bestRound: 0 (no round completed before interrupt), parser_warning with position: 0 (start of stream). Verified: - pnpm --filter @open-design/contracts build clean. - pnpm --filter @open-design/contracts test: 4 files / 59 tests green (was 37 before the new domain cases). - pnpm --filter @open-design/web typecheck clean. - pnpm --filter @open-design/web test: 110 files / 1004 tests green; no regression on Theater suite, sse validator, replay parser, or assistant-feedback widget tests. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-12 21:38:58 +08:00
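The three numeric-domain helpers named in the last fix of this commit can be sketched from their descriptions alone. The bodies below are a plausible reading of "integer with v > 0", "integer with v >= 0", and "finite number with v >= 0", not the code in `packages/contracts/src/critique.ts`. The `hasValidRunStartedNumbers` function is a hypothetical illustration of the cross-field check, and it uses a plain `<=` where the commit says the daemon schema uses an epsilon refine.

```typescript
// Integer with v > 0 (round, maxRounds, scale, protocolVersion).
function isPositiveInt(v: unknown): v is number {
  return typeof v === "number" && Number.isInteger(v) && v > 0;
}

// Integer with v >= 0 (mustFix, position, bestRound).
function isNonNegativeInt(v: unknown): v is number {
  return typeof v === "number" && Number.isInteger(v) && v >= 0;
}

// Finite number with v >= 0 (composite, score, dimScore, threshold).
function isNonNegativeFinite(v: unknown): v is number {
  return typeof v === "number" && Number.isFinite(v) && v >= 0;
}

// Hypothetical: numeric part of a run_started check, including the
// cross-field rule threshold <= scale (no epsilon in this sketch).
function hasValidRunStartedNumbers(event: Record<string, unknown>): boolean {
  const { protocolVersion, maxRounds, threshold, scale } = event;
  return (
    isPositiveInt(protocolVersion) &&
    isPositiveInt(maxRounds) &&
    isPositiveInt(scale) &&
    isNonNegativeFinite(threshold) &&
    threshold <= scale
  );
}
```

`Number.isInteger` and `Number.isFinite` both return false for `NaN` and `Infinity`, so the non-finite cases from the earlier round are covered without a separate check.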
Rocky	a7e6e0dc3d	fix(web/AssistantMessage): update status block detail to latest value instead of skipping (#1413 ) When using Open Design with an ACP agent (e.g. Devin for Terminal), after selecting a non-default model in the picker, the model badge under the conversation header kept showing the agent's initial default (e.g. `model swe-1-6-fast`) instead of the running model. The conversation header text and the agent itself reflected the selected model correctly — only the badge UI was stale. Root cause: `buildBlocks()` in this file de-duplicated consecutive status events with the same label by SKIPPING the new one rather than updating the existing block. The daemon emits two `status: label='model'` events per turn — once after `session/new` returns with the agent's default model, then again after `session/set_config_option` succeeds with the user-selected model (see `apps/daemon/src/acp.ts` ~lines 495 and 587). The dedupe kept the first and skipped the second, so the badge stayed stuck on the default. Fix: update the existing block's detail to the latest value instead. "Most recent detail wins" is more accurate than "first one wins forever" for every status label that reaches this code path (`'model'`, `'initializing'`, etc.; the filter at lines 1156-1162 already drops the labels we don't want to surface as badges: streaming, starting, requesting, thinking, empty_response). Adds two regression tests in apps/web/tests/components/AssistantMessage.test.tsx: - Two sequential `status: 'model'` events with different details render the second detail and not the first (the Bug A scenario). - Two sequential events with the SAME detail still collapse to a single badge (no regression in the existing dedupe behavior). `pnpm guard` clean; AssistantMessage tests pass 22/22. Manually verified working in a local build against Devin for Terminal `2026.5.6-4` with #1208 applied — badge now updates to the selected model on every turn.	2026-05-12 19:28:35 +08:00
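The change from "skip the duplicate" to "update the existing block" can be illustrated with a small standalone function. This is not the real `buildBlocks()`: the `StatusEvent` and `StatusBlock` shapes and the name `mergeStatusEvents` are assumptions, and the sketch handles only consecutive status events with the same label, which is the case the commit message describes.

```typescript
// Hypothetical shapes; the real buildBlocks() handles many block kinds.
interface StatusEvent {
  label: string;
  detail: string;
}

interface StatusBlock {
  label: string;
  detail: string;
}

function mergeStatusEvents(events: StatusEvent[]): StatusBlock[] {
  const blocks: StatusBlock[] = [];
  for (const event of events) {
    const last = blocks[blocks.length - 1];
    if (last !== undefined && last.label === event.label) {
      // Same label as the previous block: keep one block, but let the
      // most recent detail win instead of skipping the new event.
      last.detail = event.detail;
      continue;
    }
    blocks.push({ label: event.label, detail: event.detail });
  }
  return blocks;
}
```

With two `model` events in a row, the first carrying the agent default and the second the user-selected model, the single resulting badge shows the second value; two events with the same detail still collapse to one badge.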
Nagendhra Madishetti	d60c4521bf	fix(daemon): mark ghost CLIs unavailable when --version probe is rejected by the OS (#658 ) (#1301 ) * fix(daemon): mark ghost CLIs unavailable when --version probe ENOENTs Settings > Execution & model > Local CLI kept advertising Codex CLI with a stale version number after the user had uninstalled the binary (#658). Root cause was the probe in apps/daemon/src/runtimes/detection.ts swallowing every `--version` failure and returning `available: true` anyway. The detection flow normally does: resolveAgentExecutable(def, env) -> resolved path or null if null: return available: false (handles "binary not on PATH") else: spawn `<resolved> --version` and read the first line On the uninstall path though, `resolveAgentExecutable` can still return a non-null value: - macOS / Linux: a leftover wrapper shim (e.g. an npm bin shim or a homebrew alias) survives the uninstall, but the underlying interpreter / target the shim invokes is gone, so execFile rejects with ENOENT. - Windows: a `.cmd` shim file is still on PATH but points at a deleted target. - Permissions: the binary file is still there but the executable bit was stripped, so execFile rejects with EACCES. The previous catch arm was unconditional: } catch { // binary exists but --version failed; still mark available } That branch was correct for the original case it was written for (adapters whose `--version` flag is not supported, so execFile returns non-zero but the binary itself is fine). It was wrong for the OS-level rejections above, where the binary cannot be invoked at all. Tighten the catch: - ENOENT, EACCES, ENOTDIR: return `available: false` immediately (the OS itself rejected the spawn; the CLI is not invocable). - Anything else: fall through to the old "available, version unknown" branch so adapters that have no `--version` flag keep working. Coverage in apps/daemon/tests/runtimes/probe-ghost-cli.test.ts: five vitest cases. 
Three pin each spawn-error code to available=false, one pins ETIMEDOUT to available=true (the adapter-with-no-version-flag contract), one happy-path check asserts a clean --version returns the parsed version string. * test(daemon): use .js specifier in ghost-cli mock so NodeNext tsc accepts the import The mock specifier and the typeof import(...) query in probe-ghost-cli.test.ts both passed `../../src/runtimes/executables.ts`. Under apps/daemon/tsconfig.tests.json's NodeNext module setting, allowImportingTsExtensions is off, so tsc rejects .ts-extension imports and the test typecheck gate fails. Switching to the .js specifier mirrors the rest of the daemon test suite and the production side's import path, and tsc -p tsconfig.tests.json now exits clean. * fix(daemon): cover stale-wrapper exits and stale-override fallback in ghost-CLI probe Two gaps left from the first revision of the #658 fix, both surfaced by lefarcen's review on PR #1301: 1. Stale wrapper shims commonly exit 126 ("not executable") or 127 ("command not found") instead of rejecting at spawn time with ENOENT. execFile reports those as a numeric err.code equal to the exit status, so the original string-only ENOENT/EACCES/ENOTDIR guard missed them and still advertised the agent as available. The probe now also classifies exit code 126 / 127 as not invocable while leaving every other non-zero exit on the legacy "available, version=null" path so adapters with no --version flag are still not regressed. 2. A user with a stale CODEX_BIN override and a working binary on PATH would be locked out of Settings' "adopt detected binary" repair flow (PR #1205), because that flow gates on agent.available === true. The probe now consults inspectAgentExecutableResolution directly: when the selected override is not invocable but a distinct PATH candidate exists, the probe retries against the PATH candidate before giving up. That keeps Settings able to surface the working binary and adopt-or-clear the bad override. 
Extracted probeVersionAtPath() as a tagged-result helper so the not-invocable / spawned discriminator lives in one place and the retry path can re-use it without duplicating the classifier. Tests: probe-ghost-cli.test.ts grows from 5 to 11 cases covering the two new exit-code shapes, the override-to-PATH fallback (success and both-broken), a generic non-126/127 exit (must stay available), and a no-distinct-PATH-candidate guard that pins the no-retry contract. * fix(daemon): keep ghost-CLI detection aligned with chat-run resolution Siri-Ray's review on #1301 pointed out that the earlier configured- override fallback broke a load-bearing invariant: the previous revision had detection retry a PATH binary when the configured override failed to spawn, then report that PATH binary as `available: true` with `path` pointing at it. But chat/run resolution still goes through `resolveAgentBin -> resolveAgentExecutable`, whose `selectedPath` prefers the configured override whenever the file exists. The result: Settings would advertise Codex as available at `/usr/local/bin/codex` while every actual chat send would spawn the stale `/stale/custom/codex` and fail with the same ghost error #658 was meant to fix. The bug got swapped from Settings to chat instead of fixed. The fix here probes the same path the run-time resolver picks. If the configured override is stale (ENOENT / EACCES / ENOTDIR or the 126/127 wrapper-shim exit signature) the agent is reported unavailable, full stop, even when a different PATH candidate exists. That keeps the daemon-internal invariant Siri-Ray flagged intact: spawning at chat time uses the same executable detection reported as available. The Settings repair flow (PR #1205) still has all the signal it needs via `inspectAgentExecutableResolution` directly, which exposes both `configuredOverridePath` and `pathResolvedPath` independently of the detection result. 
The UI can decide whether to surface an adopt-or-clear affordance based on the divergence without needing `available: true` as a permission gate. If that flow currently does gate on `available`, a follow-up PR can rewire Settings to read the resolution diagnostic instead. The 126/127 wrapper-shim exit classification (the other lefarcen P2 fix) stays in place: numeric exit codes 126 and 127 are POSIX- shell "not executable" / "command not found" and reliably signal a shim whose target has been removed; generic non-zero exits (1, 2, ETIMEDOUT) keep the legacy "available, version=null" contract so adapters with no `--version` flag are not regressed. Tests - 3 OS errno cases (ENOENT / EACCES / ENOTDIR) -> available: false - 2 stale-wrapper exit cases (126 / 127) -> available: false - 1 generic non-zero exit (1) -> available: true, version: null - 1 timeout case (ETIMEDOUT) -> available: true, version: null - 1 happy path -> parsed version string - New regression test for the Siri-Ray invariant: a stale configured override + working PATH-environment combination drives the real `resolveAgentExecutable` through `vi.importActual`, and pins that detection's reported `available` / `path` match whatever the resolver returns for the same configured env. A future refactor that diverges detection from resolution trips this assertion before merge instead of in production. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-12 19:25:00 +08:00
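The failure classification this commit settles on can be summarized as a pure function over the `code` of a rejected `--version` probe. The sketch assumes, as the commit message states, that spawn rejections arrive as string codes and wrapper-shim exits as numeric codes equal to the exit status. The names `classifyProbeFailure` and `ProbeOutcome` are illustrative and are not the repository's `probeVersionAtPath()` helper.

```typescript
// Hypothetical result shape for a failed --version probe.
type ProbeOutcome =
  | { available: false }
  | { available: true; version: null };

// OS-level spawn rejections: the binary cannot be invoked at all.
const SPAWN_REJECTION_CODES = new Set<string>(["ENOENT", "EACCES", "ENOTDIR"]);

// POSIX shell exit statuses for "not executable" / "command not found",
// the signature of a shim whose target has been removed.
const STALE_SHIM_EXIT_CODES = new Set<number>([126, 127]);

function isNotInvocable(code: unknown): boolean {
  if (typeof code === "string") return SPAWN_REJECTION_CODES.has(code);
  if (typeof code === "number") return STALE_SHIM_EXIT_CODES.has(code);
  return false;
}

function classifyProbeFailure(error: { code?: unknown }): ProbeOutcome {
  if (isNotInvocable(error.code)) {
    return { available: false };
  }
  // Any other failure (generic non-zero exit, timeout) keeps the legacy
  // contract: the binary is usable, its version is just unknown.
  return { available: true, version: null };
}
```

Keeping the classifier separate from the spawn call makes each branch in the test list above (errno, 126/127, generic exit, timeout) checkable without touching the filesystem.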
Joey-nexu	5077a1cd38	feat(landing-page): split catalog into per-facet pages + auto-deploy on content changes (#1158 ) * feat(landing-page): split catalog into per-facet pages + auto-deploy on content changes Convert the single-page landing into a content-driven multi-page site sourced directly from the canonical Markdown bundles in the repo root, and close the deploy loop so contributor edits go live without manual follow-up. ## What's new - `/skills/`, `/systems/`, `/craft/`, `/templates/` index + detail pages, generated from `skills/<slug>/SKILL.md`, `design-systems/<slug>/DESIGN.md`, `craft/.md`, and `templates/live-artifacts/<slug>/README.md` via Astro content collections (`app/content.config.ts`). No mirroring of content into the landing-page package — `glob` re-scans on every build. - Faceted sub-routes generated from frontmatter: - `/skills/mode/<slug>/` — 8 pages (deck, prototype, image, …) - `/skills/scenario/<slug>/` — 18 pages after alias collapse - `/systems/category/<slug>/` — 21 pages Each page owns its own `<title>`, meta description, and `CollectionPage` JSON-LD; chips on the parent index pages are now real anchors that link to these facet routes. - Updated top-bar nav (`_components/header.tsx`) to point at the new internal routes with live counts pulled from the catalog. Counts in the homepage hero meta description likewise driven by `getCatalogCounts()` so they never drift. - Per-skill / per-template thumbnails. A Playwright generator (`scripts/generate-previews.ts`) walks every `example.html` and `templates/live-artifacts/<slug>/index.html`, screenshots them at 1440×900@2x, and writes PNGs to `public/previews/`. The catalog data layer auto-detects presence and degrades gracefully when an artifact has no renderable HTML. ## Plumbing the auto-update loop - `landing-page-deploy.yml` and `landing-page-ci.yml` now trigger on changes under `skills/`, `design-systems/`, `craft/`, and `templates/`. 
Without this, a contributor adding a new SKILL.md to `main` would silently skip the deploy and the published site would fall behind. - Both workflows now install Playwright Chromium (cached by version) and run `pnpm previews` before `astro build`, so generated thumbnails ship in `out/previews/` automatically. Preview generation is `continue-on-error: true` — a single broken example.html should not block the deploy of the rest of the catalog. - `apps/landing-page/public/previews/` is gitignored: the directory is owned by CI and would otherwise add ~70MB of binary churn to the repo on every regeneration. ## Tag canonicalization - `app/_lib/catalog.ts` adds a small per-scope alias table so authoring drift like `od.scenario: operation` vs `operations`, or `live` vs `live-artifacts`, collapses to a single canonical route instead of leaking two near-empty pages. Mode and category alias tables are scaffolded but currently empty. ## Validation - `pnpm --filter @open-design/landing-page typecheck` — 0 errors, 0 warnings, 0 hints across 25 Astro files - `pnpm --filter @open-design/landing-page build` — 341 pages built (1 home + 8 mode + 18 scenario + 21 category + N detail pages + sitemap + RSS), zero external JS, ≥16 Cloudflare-resized hero image URLs intact ## Why this matters After merge, any push to `main` that adds, removes, or edits a skill, design system, craft principle, or live-artifact template automatically triggers a fresh build that: 1. picks up the new Markdown via the content-collection glob, 2. regenerates thumbnails for any matching example.html, 3. emits new sitemap entries and JSON-LD, 4. and ships to Cloudflare Pages — no landing-page-side change required. fix(landing-page): address review feedback on PR #1158 Five fixes from the review pass — none change scope, all close the "contradictory totals" / "stale data" / "silent CI failure" gaps the reviewers flagged. 
## Hero / catalog claims now read live counts everywhere `apps/landing-page/app/page.tsx` previously hardcoded `31` skills and `72` systems in the hero copy and stat rings, while the nav and meta description had already moved to `getCatalogCounts()`. After this PR every visible "X skills / Y systems" claim — hero lead, hero stat rings, capabilities cards body copy, labs section meta + filter pills, selected-work fractions, the labs CTA, and the footer Library — reads from a single `counts` prop. `Header` and `Page` now both require `counts` (no optional fallback) so a future caller can never silently publish stale numbers. The labs-section filter pills also stop being decorative buttons: they now link to the actual `/skills/mode/<slug>/` and `/skills/` catalog routes the new multi-page architecture exposes. ## Craft README no longer publishes `apps/landing-page/app/_lib/catalog.ts` filtered out `e.id !== 'README'`, but Astro normalizes `craft/README.md`'s id to lowercase `readme`, so the published site shipped `/craft/readme/` as a public craft principle and the nav badge counted 12 instead of 11. Compare case-insensitively (`e.id.toLowerCase() !== 'readme'`) so any future README casing is also filtered out. Verified locally: `apps/landing-page/out/craft/` now contains exactly 11 entries. ## Preview URL preserves actual file extension `listPreviews()` was already discovering `.png`, `.webp`, `.jpg`, and `.jpeg`, but `previewUrlFor()` always emitted `.png`, so a future sharp/webp post-processor (or a manually committed template asset) would mark the record as available while the rendered `<img src>` 404'd. Switched the structure from `Set<slug>` to `Map<slug, filename>` and emit the actual on-disk filename verbatim. ## Preview script: per-artifact soft, systemic hard Previously any single failed `example.html` capture exited the script non-zero, which forced both workflows to mark the entire preview step `continue-on-error: true`. 
That blanket tolerance also masked systemic generator failures — a chromium launch that never finds the browser binary would silently ship a deploy with zero thumbnails. `scripts/generate-previews.ts` now distinguishes: - per-artifact failures → logged and skipped, exit 0 (catalog degrades gracefully for those skills), - discoverJobs / chromium.launch / 100%-failure run → exit 1 (systemic, must fail the build). Both workflows drop their `continue-on-error: true` flags so a real problem actually surfaces. ## AGENTS.md reflects the multi-page architecture `apps/landing-page/AGENTS.md` previously declared the landing page single-route ("Not multi-page. There is exactly one route ('/')"). That guidance is now wrong — there are six top-level route groups (`/`, `/skills/`, `/systems/`, `/craft/`, `/templates/`, plus their facet variants). Updated to describe content-collection sourcing, the no-mirror rule, the auto-deploy workflow contract, and the "never hardcode catalog claims" boundary. ## Validation - `pnpm --filter @open-design/landing-page typecheck` — 0 errors, 0 warnings, 0 hints across 25 Astro files - `pnpm --filter @open-design/landing-page build` — 340 pages built (was 341 before the README filter; the README route is now correctly absent), live counts visible in the built `out/index.html`: `driven by 125 composable skills and 149 brand-grade design systems` - Verified `out/craft/` no longer contains `readme/` - Verified preview URLs resolve to the actual on-disk filename via the regenerated catalog index page * fix(landing-page): clean up live-artifact template name + summary parsing Address @mrcfps's follow-up review on `0715d8c`. The `shapeLiveArtifactTemplate()` parser was passing the README's H1 verbatim (literal backticks intact) and using the first non-empty post-H1 line as the summary, even when that line was the `> Category: Live Artifacts` editorial blockquote. 
Result: `/templates/live-otd-operations-brief/` was shipping a `<meta name="description" content=">">` and a card title with raw Markdown noise — a regression for both SEO snippets and the templates catalog at-a-glance scan. ## Two new shared helpers - `stripMarkdownInline()` — strip backticks, asterisks, and link wrappers so `# \`otd-operations-brief\` · live-artifact template` becomes `otd-operations-brief · live-artifact template` before any further trimming. - `extractFirstProseParagraph()` — walk the body after the H1 and skip blockquotes (`>`), list markers, table rows, fenced code, and HR rules. Stop at the first contiguous prose paragraph and pass it through `stripMarkdownInline()` so the result is human-readable. Both helpers live next to `titleizeSlug()` and are used by `shapeCraft()` and `shapeLiveArtifactTemplate()` so they share one implementation. ## Live-artifact title boilerplate trim Live-artifact READMEs commonly title themselves `# \`<slug>\` · live-artifact template`. After stripping the inline backticks the trailing `· live-artifact template` is redundant ("Templates" already groups them) and adds a wide noisy suffix on catalog cards. Removed it via a narrow regex tail-strip. ## Result on the existing fixture Verified locally for `templates/live-artifacts/otd-operations-brief/`: - before: `<title>\`otd-operations-brief\` · live-artifact template …</title>`, `<meta name="description" content=">">` - after: `<title>otd-operations-brief — Open Design template</title>`, `<meta name="description" content="A drop-in html_template_v1 live-artifact template for an editorial On-Time Delivery brief. It ships:">` Typecheck 0/0/0, build 340 pages. --------- Co-authored-by: Joey <joey@cursor.so> Co-authored-by: Joey-nexu <236967869+joeylee12629-star@users.noreply.github.com>	2026-05-12 19:24:50 +08:00
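The two helpers described in the commit above could look something like the sketch below. It follows only the behavior stated in the message (strip backticks, asterisks, and link wrappers; skip blockquotes, list markers, table rows, fenced code, and HR rules). The regexes and control flow are assumptions, and the real implementations in `catalog.ts` may differ.

```typescript
// Sketch only: helper names come from the commit message, bodies are assumed.
const FENCE = "`".repeat(3); // a Markdown code-fence marker

function stripMarkdownInline(text: string): string {
  return text
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1") // [label](url) -> label
    .replace(/[`*]/g, "") // drop backticks and asterisks
    .trim();
}

function extractFirstProseParagraph(body: string): string {
  const paragraph: string[] = [];
  let inFence = false;
  for (const raw of body.split("\n")) {
    const line = raw.trim();
    const isFence = line.startsWith(FENCE) || line.startsWith("~~~");
    if (isFence) inFence = !inFence;
    const isProse =
      !isFence &&
      !inFence &&
      line !== "" &&
      // headings, blockquotes, table rows, bullet and numbered list items
      !/^(#|>|\||[-*+]\s|\d+[.)]\s)/.test(line) &&
      // horizontal rules such as --- or * * *
      !/^([-*_]\s*){3,}$/.test(line);
    if (isProse) {
      paragraph.push(line);
    } else if (paragraph.length > 0) {
      break; // the first contiguous prose run has ended
    }
  }
  return stripMarkdownInline(paragraph.join(" "));
}
```

With a README body whose first post-H1 line is a `> Category: ...` blockquote, the sketch skips the blockquote and returns the following prose paragraph, which is the regression the commit addresses.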
Prantik Medhi	060540f73c	fix(web): improve design system search affordance (#1437)	2026-05-12 19:15:16 +08:00
Joey-nexu	4f70bf80fb	feat(web): add Discord community link in entry sidebar footer (#1391) Adds a Discord invite (https://discord.com/invite/qhbcCH8Am4) as a foot-pill sibling to the existing X follow link in EntryView's left sidebar bottom row. Introduces a 'discord' icon to the inline SVG icon set, rendered with fill=currentColor so it adopts local text color like the rest of the system. CSS unchanged: reuses .foot-pill .foot-pill-follow. Co-authored-by: Joey-nexu <236967869+joeylee12629-star@users.noreply.github.com>	2026-05-12 18:51:06 +08:00
Matt Van Horn	23218eacd9	refactor(settings): use tiled language picker instead of dropdown (#1406) The Language section in Settings rendered a single-button dropdown trigger that opened a floating menu. With one visible label and lots of empty panel space, the layout misled users into thinking only one language existed. Replace the dropdown trigger + portaled menu with an inline tile grid that shows every locale at a glance and clicks directly to switch. Side effects of the new layout: the languageOpen / languageMenuRect state, the dynamic placement effect, the resize-close effect, the mousedown click-outside handler, and the languageRef are gone. The global Escape handler no longer needs to guard against the menu being open. CSS for .settings-language-picker, .settings-language-button, .settings-language-menu, and .settings-language-option is replaced by .settings-language-grid (auto-fill 180px minmax columns) + .settings-language-tile. Tests in SettingsDialog.execution.test.tsx that drove the dropdown (click trigger → click menuitemradio → assert menu closed) are rewritten to drive the tiles directly via the radio role. Refs #1347	2026-05-12 17:49:04 +08:00
shangxinyu1	0220124e0f	test: add Memory and Routines coverage (#1400) * test: align extended Playwright coverage with current UI behavior * test: address extended suite review feedback * test: fix Codex fallback config hydration in e2e * test: add Memory and Routines coverage * test: fix Memory and Routines component test typing * test: include Memory and Routines e2e in extended suite	2026-05-12 17:48:56 +08:00
nettee	28d3e5faf5	Fix Codex wrapper launch paths (#1395)	2026-05-12 17:20:32 +08:00
Rocky	6c3fd86642	fix(daemon/acp): terminate ACP child after clean prompt completion (#1286 ) * fix(daemon/acp): terminate ACP child after clean prompt completion (Bug B / #1265) Some ACP agents (notably Devin for Terminal) keep the child process alive after stdin closes, waiting for the next prompt. Open Design spawns a fresh agent per chat turn and relies on child.on('close') to finalize the run, so without an explicit signal-driven shutdown the chat sits stuck in the 'working' state indefinitely. Three small, targeted changes: - apps/daemon/src/acp.ts: After a clean session/prompt response we schedule a 500ms grace period and then SIGTERM the child. This mirrors the pattern detectAcpModels() already uses after model discovery. The grace period leaves well-behaved agents that exit on stdin.end() unaffected. - apps/daemon/src/acp.ts: New completedSuccessfully() method on the session handle reports whether the prompt resolved without a fatal error or abort, so the consumer can distinguish 'clean signal exit' from 'genuine signal failure'. - apps/daemon/src/server.ts: child.on('close') now treats a SIGTERM exit as 'succeeded' when acpSession.completedSuccessfully() is true. - apps/web/src/providers/daemon.ts: Trust the server's authoritative endStatus; the signal/non-zero-code safety net no longer overrides an explicit 'succeeded' status, so the chat doesn't surface a fake 'agent exited with signal SIGTERM' error after a clean ACP run. Daemon tests cover the SIGTERM grace timer, clean early-exit (timer cleared), and completedSuccessfully() abort/error states. Manual UI test on plain main + this fix confirms Devin chats now return to ready automatically after Done · ... * fix(daemon/connectionTest): treat ACP clean SIGTERM as success Codex review on #1286 caught that the new SIGTERM in attachAcpSession breaks ACP connection tests for agents that don't shut down on stdin.end() (the exact Devin behavior the patch targets). 
attachAgentStreamHandlers() in connectionTest.ts now also respects acpSession.completedSuccessfully(), mirroring the same check we apply in server.ts. Without this, a clean prompt response followed by our SIGTERM would set winner.signal === 'SIGTERM', flip exitedCleanly to false, and the connection test would report 'agent_spawn_failed' even when the agent had returned a healthy response. Also widened the AgentSpawnHandle type so completedSuccessfully is visible on the structural type used inside connectionTest.ts. All 56 daemon tests still pass; typecheck + guard clean. * fix(daemon/acp): narrow ACP success-on-signal override to forced-SIGTERM Looper review on #1286 caught that the success predicate was broader than the SIGTERM case it was meant to handle. `completedSuccessfully()` flips to true as soon as the ACP `session/prompt` response is processed, but it does not say why the child later closed. With the broad predicate, an ACP agent that returned a prompt result and then exited with code 1 (or was killed by SIGKILL/SIGSEGV) was still marked 'succeeded', regressing the existing close-status behavior for genuine post-response process failures. Scope the override to the exact forced-shutdown shape this PR introduces: code === null && signal === 'SIGTERM' && acpCleanCompletion Applied to both `server.ts` (chat run finalization) and `connectionTest.ts` (connection-test classification). Any other post-response failure now falls through to 'failed' / 'agent_spawn_failed' as before. All 59 daemon tests still pass; typecheck + guard clean. * fix(web/daemon): only bypass exit-code safety net on explicit server success Looper review on #1286 caught that the previous web change trusted `endStatus === 'succeeded'` absolutely, but `endStatus` can become 'succeeded' in two distinct ways: 1. The SSE end event explicitly carries `status: 'succeeded'` (authoritative server declaration). 2. 
The end event omits or has an invalid `status` field and the handler silently falls back to 'succeeded' as a local default. Both produced `endStatus === 'succeeded'` in the existing code, so the new safety-net bypass treated them identically. That regressed backward compat: a compatible or older daemon emitting an end event like `{code:1}` or `{code:null,signal:"SIGTERM"}` with no `status` would suddenly skip the failure banner. Track explicit success separately via `serverDeclaredSuccess`, set true only when: - The SSE end event has `status === 'succeeded'`, or - The fallback `fetchChatRunStatus` REST path returns `status === 'succeeded'` (which the existing `isChatRunStatus()` guard already proves is explicit). The safety net is now bypassed only on that explicit signal; the local-fallback success path still reaches the exit-code/signal check so real failures surface as before. Adds three web-side regression tests in `apps/web/tests/providers/sse.test.ts`: - Explicit `status: 'succeeded'` + SIGTERM → onDone called, no error - End event with `{code:1}` and no `status` → onError surfaces 'agent exited with code 1' as before - End event with `{code:null,signal:'SIGTERM'}` and no `status` → onError surfaces 'agent exited with signal SIGTERM' as before `pnpm guard` + daemon typecheck clean; 27/27 SSE tests pass (up from 24).	2026-05-12 17:13:10 +08:00
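The two predicates this commit converges on can be sketched as below. The function names, the `EndEvent` shape, and the assumption that exit code 0 counts as success are illustrative. Only the forced-SIGTERM condition, the `serverDeclaredSuccess` idea, and the two error strings come from the commit message.

```typescript
type RunStatus = "succeeded" | "failed";

// Daemon side (sketch): only the forced-shutdown shape introduced by the PR
// is upgraded to success. Treating code 0 as success is an assumption here.
function classifyClose(
  code: number | null,
  signal: string | null,
  acpCleanCompletion: boolean,
): RunStatus {
  if (code === 0) return "succeeded";
  if (code === null && signal === "SIGTERM" && acpCleanCompletion) return "succeeded";
  return "failed";
}

// Hypothetical shape of the SSE end event payload.
interface EndEvent {
  status?: unknown;
  code?: number | null;
  signal?: string | null;
}

// Web side (sketch): the exit-code/signal safety net is bypassed only when
// the server explicitly declared success, never on a local default.
function resolveEnd(event: EndEvent): { error: string | null } {
  const serverDeclaredSuccess = event.status === "succeeded";
  if (serverDeclaredSuccess) return { error: null };
  if (typeof event.code === "number" && event.code !== 0) {
    return { error: `agent exited with code ${event.code}` };
  }
  if (typeof event.signal === "string" && event.signal !== "") {
    return { error: `agent exited with signal ${event.signal}` };
  }
  return { error: null }; // local-fallback success with a clean exit
}
```

Keeping the explicit-success flag separate from the resolved status is what lets an older daemon that omits `status` still surface real failures.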
Yiang Yiyan	5ff578dc8d	fix: support object-style question-form options (#1293) * fix: support object-style question-form options * fix: preserve stable option values in form submissions	2026-05-12 17:03:45 +08:00
Mason	2f51f3c1ae	feature: refine assistant artifact feedback (#1379) * feature: refine assistant artifact feedback * fix: clear hidden custom feedback reason * test: update assistant feedback expectations	2026-05-12 17:00:42 +08:00
Nagendhra Madishetti	4d0ea247a7	fix(desktop): swallow setTypeOfService EINVAL crashes in dev main (#647 ) (#1298 ) * fix(desktop): swallow harmless setTypeOfService EINVAL crashes in dev main The packaged Electron entry (apps/packaged/src/logging.ts) already filters the undici "setTypeOfService EINVAL" crash that issue #895 introduced for the prod build, but the dev / source-built desktop entry was missing the parallel guard. Result: switching settings tabs in a from-source desktop run could fire a fresh fetch, undici would try to set IP_TOS on the outbound socket, the kernel would refuse on certain macOS / VPN configurations, and the rejection bubbled to Electron's default handler as the "JavaScript error in the main process" dialog reported in issue #647. Add the same defensive filter to apps/desktop: - isHarmlessSocketOptionError matches only the canonical undici shape (syscall name AND EINVAL code). A contradicting code (EACCES, EPERM, etc) explicitly fails the match so real bugs don't get hidden. - The uncaughtException handler logs harmless cases at warn and returns silently. For anything else it removes itself from the listener list and re-throws via setImmediate, restoring Node's default crash path so Electron's native dialog renders exactly as it would without this filter. - unhandledRejection mirrors the same harmless / fall-through split. The filter is installed BEFORE app.whenReady so it is armed by the time the renderer fires its first fetch. The helper is duplicated rather than imported from apps/packaged because AGENTS.md forbids cross-app private-source imports. The file header calls out the parallel and notes that the two copies should stay in sync until the helper is promoted to a shared workspace package (follow-up); the contract is identical so a regression in one will surface in the other's test suite. 
Tests in apps/desktop/tests/main/uncaught-exception.test.ts mirror apps/packaged/tests/logging.test.ts: 8 cases pinning the matcher shape, 2 cases pinning the handler's harmless-log-warn vs fall-through-rethrow split. Validated: pnpm guard, pnpm --filter @open-design/desktop typecheck, pnpm --filter @open-design/desktop build, and pnpm --filter @open-design/desktop test (14 passed, 10 new). * fix(desktop,packaged): fail-fast on non-harmless unhandled rejections The previous unhandledRejection listeners logged non-harmless reasons and returned, which kept the main process alive after any rejected promise. A real bug, a failed IPC registration, or any unexpected async exception was reduced to a console line instead of surfacing through Node/Electron's default crash path the filter was meant to preserve. Both copies now route non-harmless rejections through a parallel factory (createDesktopUnhandledRejectionHandler / createFatalUnhandledRejectionHandler) that mirrors the uncaughtException policy: harmless setTypeOfService EINVAL shapes log at warn and return, anything else logs at error, removes the listener, and re-throws via setImmediate. Listener removal happens before the scheduled throw, so the rethrown reason lands in the uncaughtException path with no recursion. Tests cover the harmless branch, the detach + ordered rethrow, and non-Error / primitive rejection reasons (Promise.reject(42)) which must fall through. Desktop suite: 13/13, packaged suite: 16/16. Flagged on PR #1298 by Siri-Ray and the codex P2 review thread; the two file copies stay in lockstep per the AGENTS.md sync invariant. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-12 16:50:57 +08:00
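The harmless-versus-fatal split described in this commit might be sketched as follows. The matcher here is deliberately simplified (it checks only `syscall` and `code`), and `HandlerDeps` / `createFatalHandler` are hypothetical names. The real helpers in `apps/desktop` and `apps/packaged` may inspect more of the error shape.

```typescript
// Simplified matcher (assumption): requires the syscall name AND the EINVAL
// code, so a contradicting code such as EACCES or EPERM fails the match.
function isHarmlessSocketOptionError(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const err = reason as { syscall?: unknown; code?: unknown };
  return err.syscall === "setTypeOfService" && err.code === "EINVAL";
}

// Hypothetical dependency bag so the policy can be exercised without
// touching the real process object.
interface HandlerDeps {
  warn: (message: string) => void;
  detach: () => void; // removes this listener from the process
  rethrowLater: (reason: unknown) => void; // e.g. setImmediate(() => { throw reason; })
}

function createFatalHandler(deps: HandlerDeps): (reason: unknown) => void {
  return (reason: unknown): void => {
    if (isHarmlessSocketOptionError(reason)) {
      deps.warn("ignoring harmless setTypeOfService EINVAL");
      return;
    }
    // Detach before scheduling the throw so the rethrown reason reaches the
    // default crash path instead of recursing into this handler.
    deps.detach();
    deps.rethrowLater(reason);
  };
}
```

Injecting `detach` and `rethrowLater` is a design choice for the sketch: it makes the ordering (detach first, then rethrow) observable in a unit test, including for primitive rejection reasons such as `Promise.reject(42)`.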
mrzhangkris	54516a9866	fix: truncate long template names on project cards (#1220) (#1302) Add min-width: 0 to .design-card-name so text-overflow: ellipsis works correctly in flex layouts. Long template names were pushing the task execution status (Running, Failed, etc.) out of view on project cards. Closes #1220 Co-authored-by: laomo <laomo@openclaw.ai>	2026-05-12 16:49:35 +08:00
Bryan	1fb099916c	feat(daemon): export self-contained HTML via /export/?inline=1 endpoint (#1312 ) test(daemon): add Red unit tests for inlineRelativeAssets helper 14 cases pinning the behavior contract for the upcoming apps/daemon/src/inline-assets.ts helper: - link/script inlining with verbatim body preservation - non-src script attrs preserved (type=module, defer, crossorigin) - relative path resolution (root + nested + deep-nested owners) - self-closing and single-quoted attr forms - negative cases: missing rel, rel=preload, absolute/data/blob/leading-slash - escaping: </style and </script inside body - null-fileReader graceful degradation - duplicate identical tags fully replaced (diverges from apps/web/src/components/FileViewer.tsx:5313's first-match-only; locked decision per plan §3.3) - HTML-escaped data-od-inline-asset attr Tests intentionally Red — module ../src/inline-assets.js does not yet exist. Phase B-G of plan declarative-roaming-gosling.md will turn them green by porting FileViewer.tsx:5248-5354 server-side. Refs nexu-io/open-design#368. * feat(daemon): port inlineRelativeAssets server-side for export endpoint Adds apps/daemon/src/inline-assets.ts — a pure helper that takes (html, ownerFileName, fileReader closure) and returns the HTML with every relative <link rel=stylesheet> and <script src> contents inlined into <style data-od-inline-asset="…">/<script>…</script> blocks. The fileReader closure keeps the helper free of fs/Express coupling so the route handler owns the filesystem boundary. Port source: apps/web/src/components/FileViewer.tsx:5248-5354 — five functions (inlineRelativeAssets, resolveProjectRelativePath, baseDirFor, readHtmlAttr, escapeHtmlAttr). 
The fetch hop becomes the fileReader closure; replace-all replaces first-match-only per locked design decision §3.3 (inline comment in inline-assets.ts cites the divergence from FileViewer.tsx:5313 and notes the web inline path is on a deprecation track since PR #384 made URL-load the default). Phase B-G of plan declarative-roaming-gosling.md. All 14 unit cases from the Red commit (`a60a9023`) now pass; tightens one case to use a realistic '&'-only filename (the original `<`/`>`-bearing filename was unreachable in real filesystems and exposed a regex limitation the web client carries too). Daemon delta: +14 tests (1704 → 1718). Typecheck clean. Refs nexu-io/open-design#368. * test(daemon): add Red integration tests for /export/?inline=1 route 9 HTTP cases against GET /api/projects/:id/export/?inline=1: - 3-file React-ish layout returns self-contained HTML (wiring guard: body assertions catch removal of the await inlineRelativeAssets(...) line, not just helper-internals changes) - missing inline / non-canonical values (0, false, foo, empty) → 400 - non-HTML file → 400 UNSUPPORTED_FILE_TYPE - missing file → 404 FILE_NOT_FOUND - invalid project id (..) → some 4xx (Express normalizes before route) - null-origin OPTIONS preflight → 204 + Access-Control-Allow-Origin: * - missing sibling asset → 200 with <link> tag intact, other asset inlined - nested HTML entry (pages/index.html + ../shared/util.js) → 200 inlined 8 of 9 tests Red (404 / 403); the invalid-project-id case is tolerant about how Express rejects .. so it accidentally passes Red — Green will tighten to 400 BAD_REQUEST via isSafeId. Phase C-R of plan declarative-roaming-gosling.md. C-G will register the route in apps/daemon/src/import-export-routes.ts. Refs nexu-io/open-design#368. * feat(daemon): wire GET /api/projects/:id/export/?inline=1 endpoint Adds the export-inline endpoint into registerProjectExportRoutes (import-export-routes.ts) alongside /export/pdf and /archive. 
The route: - Validates project id via ctx.validation.isSafeId - Requires ?inline=1 (accept-list: 1 / true / yes / on, matching Part 1's parseForceInline at file-viewer-render-mode.ts:59-66) - Reads the owner HTML via ctx.projectFiles.readProjectFile; maps ENOENT to 404 FILE_NOT_FOUND, everything else to 400 BAD_REQUEST - Gates non-HTML callers with 400 UNSUPPORTED_FILE_TYPE - Builds a fileReader closure that silently returns null on any sibling read failure (failure-local, not fatal — matches the web client's null-filter at FileViewer.tsx:5311) - Hands the buffer + relPath to inlineRelativeAssets and returns the result as text/html DI: RegisterProjectExportRoutesDeps gains 'projectFiles' \| 'validation'; server.ts:2879 passes the corresponding deps. Mirrors the dep shape of RegisterFinalizeRoutesDeps used by PR #832's /finalize/anthropic. Null-origin support intentionally omitted (decision §10 in the PR description): the daemon's null-origin allowlist is /raw/ and /codex-pets/.../spritesheet only, and export consumers are same-origin UI or server-side tooling — sandboxed-iframe srcdoc previews fetch /raw/* instead. Integration test #7 pins the 403 contract so a future allowlist change is deliberate. Phase C-G of plan declarative-roaming-gosling.md. All 23 tests green (14 unit + 9 integration); full daemon suite 1727 passing (delta +9 over B-G's 1718). Typecheck clean. Refs nexu-io/open-design#368. * test(daemon): add Red regression for inlined-body tag-literal corruption Reproduces the correctness bug Siri-Ray (looper) and codex-bot flagged on PR #1312: the reduce/split-join approach in inlineRelativeAssets re-scans the progressively mutated HTML, so a tag literal that happens to appear inside an already-inlined asset body gets the inner literal also replaced — corrupting the body and producing duplicate inlining. 
Concrete reproducer (CSS, where </style escape doesn't touch <link>): HTML: <link rel="stylesheet" href="a.css"> <link rel="stylesheet" href="b.css"> a.css: /* see also <link rel="stylesheet" href="b.css"> */ b.css: body{color:red} Under split/join the second pass splits on `<link rel="stylesheet" href="b.css">` and matches BOTH the real outer tag AND the literal inside a.css's comment. Result: b.css's <style> block is injected inside a.css's comment, and b.css gets inlined twice. Phase F-R of plan declarative-roaming-gosling.md (post-PR-#1312 review round). F-G will rewrite the helper to collect matches by position in the original HTML and concat slices in a single pass, so already-inlined content is never re-scanned. Refs nexu-io/open-design#1312 review threads at apps/daemon/src/inline-assets.ts:122 (Siri-Ray looper + codex bot). feat(daemon): replace inliner reduce/split-join with position-based concat Fixes the inlined-body tag-literal corruption Siri-Ray (looper) + codex-bot flagged on PR #1312. The previous `replaceAllOccurrences` (`source.split(from).join(to)`) re-scanned the progressively mutated HTML on each pass, so a tag literal that appeared inside an already- inlined CSS/JS body got the inner literal replaced too, producing duplicate inlining and corrupted bodies. New shape: collect every match's {start, end} byte span from the ORIGINAL html via `matchAll`, await the per-match replacements in parallel, sort by start, and concat slices of the original html with the replacement strings in a single pass. Text introduced by an earlier replacement is never scanned for matches. The dup-tag fix (decision §8 — replace every occurrence, not first-match-only) is preserved: every original-tag position gets its own slice, so all duplicates are inlined. Also extracts buildInlineStyleBlock / buildInlineScriptBlock so the match-collection loops stay readable. Phase F-G of plan declarative-roaming-gosling.md. 
Regression test (`c809bccc`) goes Green; all 24 unit + integration tests pass; daemon suite still clean. Refs nexu-io/open-design#1312. * test(daemon): add Red CSP-sandbox test + P3 coverage gaps from PR #1312 review Three tests covering lefarcen's review on PR #1312: 1. [Red] CSP sandbox header (P2, lefarcen @ import-export-routes.ts:423). Top-level browser navigation to /export/?inline=1 sends no Origin header, so the daemon middleware lets it through and any JS in the exported document runs with daemon-origin privileges. Asserts the response sends `Content-Security-Policy: sandbox allow-scripts` so the browser treats it as a sandboxed iframe with an opaque origin (scripts still run, but no cookies / no /api/ access). This test fails until G1-G adds the header in the handler. 2. [Green-on-commit] Accept-list cases (P3, lefarcen @ test.ts:262). PR body decision §7 promises `inline=true/yes/on` case-insensitive, but round-1 tests only exercised inline=1. Pin the full accept list (true / yes / on + TRUE / Yes / ON). Already passes — the route's parser already implements the accept list; this just makes the contract testable. 3. [Green-on-commit] isSafeId guard (P3, lefarcen @ test.ts:287). Previous `..` test was normalized by Express before reaching the route. New input uses `bad!id` (URL-safe, but outside isSafeId's /^[A-Za-z0-9._-]+$/ char class), so Express passes it into req.params unchanged and isSafeId rejects with the documented 400 BAD_REQUEST envelope. Phase G1-R / H of plan declarative-roaming-gosling.md. Refs nexu-io/open-design#1312 review comments. feat(daemon): send Content-Security-Policy: sandbox allow-scripts on /export Closes the same-origin XSS surface lefarcen flagged on PR #1312 (P2 at import-export-routes.ts:423): top-level browser navigation to the export URL sends no Origin header, so the daemon's /api middleware admits the request and any JS in the exported document executes with daemon-origin privileges (cookies, /api/, localStorage). 
`Content-Security-Policy: sandbox allow-scripts` on the response makes the browser treat the document as a sandboxed iframe with an opaque origin. Scripts still execute (necessary for the screenshot use case — the whole point of inlining JS), but they cannot read cookies, hit /api/, or otherwise escalate to the daemon's origin. Phase G1-G of plan declarative-roaming-gosling.md. Daemon delta: +3 tests (the Red CSP test from `58151356` turns Green; the P3 coverage gap tests stay green). Refs nexu-io/open-design#1312. * test(daemon): add Red regression for <link> stylesheet attr preservation Currently `<link rel="stylesheet" href="print.css" media="print">` becomes a plain `<style data-od-inline-asset="print.css">…</style>` with no media query — print-only styles apply unconditionally. Same problem for `title` (alternate stylesheet sets), `disabled` (initial disabled state), and `nonce` (CSP nonce). All four are valid on both `<link rel=stylesheet>` and `<style>` per HTML spec, so the inliner must carry them across. PR #1312 round-2 review (lefarcen P2 @ inline-assets.ts:44). Phase G2-R; G2-G will extend buildInlineStyleBlock to copy the four attrs off the source <link>. Refs nexu-io/open-design#1312. * feat(daemon): preserve <link> stylesheet semantics on inlined <style> Closes lefarcen's P2 review note on PR #1312 (inline-assets.ts:44): `<link rel="stylesheet" href="print.css" media="print">` was becoming a plain <style> with no media query, so print-only styles applied unconditionally. Same issue for `title` (alternate stylesheet sets), `disabled` (initial disabled state), and `nonce` (CSP nonce). buildInlineStyleBlock now carries four attrs across from the source <link>: - media, title, nonce (value attrs, HTML-escaped via escapeHtmlAttr) - disabled (boolean attr — copied as bare presence) Other <link> attrs (rel, href, type, crossorigin, integrity, referrerpolicy) don't apply to <style> and are intentionally dropped. 
New `hasBooleanHtmlAttr` helper distinguishes presence-as-attr from substring-inside-another-attr-value via a regex that requires a word boundary after the name (whitespace, `=`, or `>`). Phase G2-G of plan declarative-roaming-gosling.md. All 28 tests pass. Refs nexu-io/open-design#1312. * docs(daemon): narrow inliner contract claim + document size-limit policy Closes lefarcen's P2 review notes on PR #1312: 1. "Self-contained" incomplete (inline-assets.ts:67): the helper only rewrites top-level <link rel=stylesheet> / <script src>. `<img src>`, CSS `url(...)`, CSS `@import`, ES module imports, font sources, and similar remain external in the response. The PR title/body claimed "self-contained HTML" which over-promised for screenshot tooling expecting bundled images/fonts. Module docstring now enumerates the full not-rewritten list and names the screenshot path as the primary use case (headless browser fetches each external asset on render, so inline-CSS- and-JS-only is sufficient). The route handler comment block mirrors the contract. A fully offline export with image/font bundling is filed as a follow-up — out of scope for this PR. 2. No response cap (inline-assets.ts:72): the helper does concurrent reads + multiple string copies and could spike daemon memory. The daemon is local-first (single-user, developer's machine — see open_design_architecture.md), so the effective ceiling is the size of the user's own project. The docstring now states this rationale and names the conditions under which a bounded-concurrency reader and output-size limit would be needed (non-trusted callers). Docs-only — no behavior change, all 28 tests still pass. Refs nexu-io/open-design#1312. 
* test(daemon): add Red regression for hasBooleanHtmlAttr quoted-value match PR #1312 round-2 review (lefarcen P3): `hasBooleanHtmlAttr` tests the tag string with no attr-quoting awareness, so the literal text `disabled` appearing inside any quoted attribute value followed by another whitespace char satisfies `\sdisabled(?=\s\|=\|/?>)`. <link rel=stylesheet href=x.css data-note="content disabled stuff"> emits a <style disabled> block, silently disabling a stylesheet the author wrote without that attr. Also adds a counterweight test for the legitimate-disabled case (<link … disabled>) so the next-commit fix doesn't over-correct and start dropping real boolean attrs. Phase I3-R of plan declarative-roaming-gosling.md (post-PR-#1312 round-2 review). I3-G will strip quoted attribute values from the tag string before testing for the bare attr. Refs nexu-io/open-design#1312. * feat(daemon): make hasBooleanHtmlAttr quote-aware to avoid false positives Closes lefarcen's P3 review note on PR #1312: `hasBooleanHtmlAttr` previously ran `\sname(?=\s\|=\|/?>)` over the full tag string, so the literal text `disabled` appearing inside any quoted attribute value followed by whitespace satisfied the regex. Source tags like `<link rel=stylesheet href=x.css data-note="content disabled stuff">` were emitting a <style disabled> block — silently disabling a stylesheet the author wrote without that attr. Fix: strip `="…"` and `='…'` substrings out of the tag with two regex passes BEFORE testing for the bare attr. The lookahead still requires `\s\|=\|/?>` after the attr name, so `<link disabled>`, `<link disabled="">`, `<link disabled/>`, etc. all match — but the attr name as a substring of any quoted value cannot match because values have been stripped to `""` / `''`. Phase I3-G of plan declarative-roaming-gosling.md. All 30 tests green (28 prior + 2 round-3 regression cases: false-positive and legitimate-disabled). Refs nexu-io/open-design#1312. 
* test(daemon): add Red cap-enforcement tests + scaffold InlineOptions PR #1312 round-2 review (lefarcen P2 — still open): round-2 only documented that no cap is enforced. Reviewer pushed back: the helper still builds unbounded candidate arrays + runs Promise.all over all asset reads + concatenates the full output in memory. Need actual limits in code. This commit adds the Red test surface that drives the next commit's enforcement: - InlineAssetsLimitError("owner") when owner HTML > maxOwnerBytes - InlineAssetsLimitError("candidates") when tag matches > maxCandidates - Per-asset graceful: oversized asset → tag stays as URL ref - InlineAssetsLimitError("total") when assembled output > maxTotalBytes - Bounded read concurrency: peak in-flight reads ≤ maxReadConcurrency - Integration: route maps the throw to 413 PAYLOAD_TOO_LARGE InlineOptions interface is added to the helper signature as a no-op test-door (per feedback_test_doors_over_fake_timers.md), so tests can exercise tiny fixtures while production callers use module-level defaults. The next commit (H3-G) wires the enforcement. Phase H3-R of plan declarative-roaming-gosling.md. Daemon delta on this commit: +6 tests (5 unit + 1 integration), all Red. Refs nexu-io/open-design#1312. * feat(daemon): enforce inliner caps + map limit errors to 413 PAYLOAD_TOO_LARGE Closes lefarcen's still-open P2 review on PR #1312 round 2 ("the code still builds unbounded candidate arrays + Promise.all over all asset reads + concatenates the full output in memory"). Caps are now enforced in code with the documented defaults: MAX_INLINE_OWNER_BYTES = 2 MiB MAX_INLINE_ASSET_BYTES = 5 MiB per sibling MAX_INLINE_CANDIDATES = 500 link/script matches MAX_INLINE_TOTAL_BYTES = 50 MiB assembled output MAX_INLINE_READ_CONCURRENCY = 8 simultaneous fileReader calls Enforcement points: - Owner cap (input): fires immediately at function entry. Cheap — Buffer.byteLength of the already-decoded UTF-8 string. 
- Candidate cap (planning): fires after matchAll, BEFORE any sibling read. Pathological HTML with thousands of <link>/<script src> tags is rejected without opening a single file descriptor. - Asset cap (per-sibling): post-read length check; oversized assets return null from the wrapped reader, so the tag stays as a URL ref and the response is still 200. This is the only "graceful" cap — one bad asset doesn't fail the whole export. - Total cap (output): tracked across the slice-and-concat loop, guarding both preserved-html slices AND injected replacements. - Concurrency cap (planning): a tiny in-module runWithConcurrency worker-pool keeps at most maxReadConcurrency fileReader calls in flight, with order-preserving results. `InlineAssetsLimitError` carries a `limit` discriminator so logs and clients can disambiguate owner/asset/candidates/total. The route handler catches it and emits 413 PAYLOAD_TOO_LARGE. Drive-by error-envelope fix while in the route: UNSUPPORTED_FILE_TYPE (an unregistered ApiErrorCode) → UNSUPPORTED_MEDIA_TYPE (the canonical code) with HTTP 415. The round-1 string was a slip; caught by reading packages/contracts/src/errors.ts:11 while wiring PAYLOAD_TOO_LARGE. Phase H3-G of plan declarative-roaming-gosling.md. All 36 tests green (28 prior + 2 round-3 quoted-attr + 5 cap unit + 1 cap integration). Refs nexu-io/open-design#1312. * feat(daemon): enforce inliner caps pre-buffer via AssetHandle contract Closes lefarcen's still-open P2 review on PR #1312 round 3 ("the helper enforces maxTotalBytes only after all candidate assets have already been read and converted to replacement strings" / "maxAssetBytes is checked after fileReader fully buffers each sibling"). Round-3 caps were defensive against the final output size but did not bound peak memory during read fanout — 500 assets at 5 MiB each could materialize ~2.5 GiB before the 413 fired. 
Contract change: InlineAssetReader now returns `AssetHandle \| null` where AssetHandle is `{ readonly size: number; read(): Promise<...> }`. Callers expose `size` from a cheap stat-equivalent (the route uses `resolveProjectFilePath`) and defer the full materialization to `read()`. The helper checks size against maxAssetBytes BEFORE invoking read, and against the running total BEFORE the reservation is committed. Enforcement flow inside runWithConcurrency: 1. await fileReader(p.resolved) → cheap stat-only call 2. if (handle.size > maxAssetBytes) return null ← pre-buffer 3. if (runningBytes + handle.size > maxTotalBytes) ← pre-buffer totalAborted = true; return null 4. runningBytes += handle.size ← reserve 5. await handle.read() ← only now 6. if (read returned null) runningBytes -= refund `totalAborted` is a shared flag the workers check at entry, so once the running total hits the cap, no new reads start. With maxReadConcurrency = 8, at most ~8 stat-side calls finish after abort — peak memory bounded. The concat-time guard stays as the exact final assertion (the pre-buffer reservation is approximate — it counts the original tag bytes and skips wrapper overhead). Route closure updated to do `resolveProjectFilePath` first, then `readProjectFile` inside the deferred `read()`. Test reader helpers (`readerFrom` + the concurrency-test reader) updated to the new shape. Two new unit tests pin the pre-buffer semantics: - `maxAssetBytes` is checked via handle.size BEFORE handle.read() (the reader's `read()` throws — must never run) - Running total abort stops further reads once exceeded (counting reader observes ≤ 2 reads when cap should fire after the first) Phase K of plan declarative-roaming-gosling.md (post-PR-#1312 round-3 review). All 38 tests green (36 prior + 2 round-4 pre-buffer cases). Refs nexu-io/open-design#1312. 
* test(daemon): add Red test pinning owner pre-buffer 413 before mime 415 PR #1312 round-5 (lefarcen P2): the route currently reads the owner file with readProjectFile() before any size check, so a 100 MiB owner HTML is fully buffered into memory before the helper's ownerBytes check fires. The fix is to stat with resolveProjectFilePath first, reject pre-buffer with 413 PAYLOAD_TOO_LARGE on oversize, then fold in the mime check (still 415 on mismatch, now pre-buffer), then readProjectFile when both gates pass. The Red→Green discriminator is the combination 'oversize AND non-HTML': pre-fix the route reads the buffer first and the text/plain mime check fires → 415; post-fix the route stats first and the size check fires before the mime check → 413. Asserting 'got 413, not 415' pins both the pre-buffer property and the check ordering (size before mime, per lefarcen's locked round-5 sequence). 2 MiB+1 byte fixture is acceptable in test setup; MAX_INLINE_OWNER_BYTES is the production 2 MiB so no test-door is needed. Red verified: AssertionError: expected 415 to be 413 (pre-fix flow reads → mime → 415). * feat(daemon): stat owner before readProjectFile in /export route to bound owner pre-buffer PR #1312 round-5 (lefarcen P2 confirmed at PR-1312#issuecomment-4424868413 follow-up): the route previously called readProjectFile() unconditionally on the owner, so a 100 MiB owner HTML was fully buffered into memory before the helper's ownerBytes check fired with InlineAssetsLimitError ('owner'). That meant the 413 envelope returned to the caller but only after peak memory had already hit the file size. 
Fix mirrors the sibling-asset stat-then-read contract round 4 added via the AssetHandle interface: call resolveProjectFilePath first (cheap stat), reject pre-buffer with 413 PAYLOAD_TOO_LARGE on size > MAX_INLINE_OWNER_BYTES, fold in the mime check (still 415 UNSUPPORTED_MEDIA_TYPE on mismatch, now also pre-buffer per lefarcen's 'fold-in is welcome'), then readProjectFile() only when both gates pass. Size check fires before mime check, so an oversize non-HTML file returns 413 rather than 415 — the observable Red→Green discriminator for this round. The helper's ownerBytes check (inline-assets.ts:127-133) stays as defense-in-depth for direct in-process callers that skip the route and for any drift between stat-reported size and the bytes returned by readFile. Verifies the round-5 Red at apps/daemon/tests/export-inline-route.ts ('returns 413 (not 415) for an oversize non-HTML file'). Daemon suite 1743/1743 passing. * test(daemon): add Red test pinning stat-vs-actual byte reconciliation PR #1312 round-5 (lefarcen P3 confirmed at PR-1312#issuecomment-4424868413 follow-up): the helper trusts handle.size for the running-total guard and never reconciles with the actual byte length of content unless the per-asset cap is exceeded. A reader that under-reports size (stale stat, UTF-8 expansion at decode, sparse file, deliberate lie) can let many strings materialize in memory before the concat-time guard at the bottom of inlineRelativeAssets throws — defeating the round-4 pre-buffer cap intent. Fix is lefarcen-confirmed path-a: post-read, the helper computes actualBytes = Buffer.byteLength(content, 'utf8'), reconciles runningBytes (add actualBytes, refund handle.size), and if running total exceeds maxTotalBytes flips totalAborted = true and returns null. Subsequent workers see totalAborted before invoking their own read(). 
Helper still throws InlineAssetsLimitError('total') after Promise.all settles — preserving the round-2/3/4 graceful-fallback pattern instead of racing throws across in-flight workers. Red→Green discriminator is read count. Pre-fix the helper trusts the lying handle.size (10), so both reads complete (each returning 1000 bytes) under the reservation total of 56+10+10=76 < cap 500. The concat-time guard then catches the 2000+-byte assembly and throws 'total' — but only after both reads materialized in memory. Post-fix worker 1's reconciliation trips totalAborted as soon as actualBytes (1000) is folded into runningBytes; worker 2 skips its read. Red verified: AssertionError expected 1, received 2 (pre-fix flow completes both reads before concat-guard fires). * feat(daemon): reconcile inliner reservation with post-read actual bytes PR #1312 round-5 (lefarcen P3 confirmed at PR-1312#issuecomment-4424868413 follow-up, path-a): the helper trusted handle.size for the running- total guard and only reconciled with actual bytes for the per-asset cap. A reader that under-reported size — stale stat, UTF-8 decode expansion at read time, sparse file, deliberate lie — could let many strings materialize before the concat-time guard at the bottom of inlineRelativeAssets caught the excess. That defeated the round-4 pre-buffer cap intent. Fix: after a successful read(), compute actualBytes = Buffer.byteLength(content, 'utf8'), reconcile runningBytes by folding in (actualBytes - handle.size), and re-check the total cap. If the reconciliation pushes runningBytes past maxTotalBytes, drop the asset's inlining (tag stays as URL ref), set totalAborted = true to block subsequent worker reads, and let Promise.all settle. The helper then throws InlineAssetsLimitError('total') below — matching the round-2/3/4 graceful-fallback pattern (no throw-before-settle race between in-flight workers). 
The per-asset cap check at line 228 is preserved for stat-lying readers that blow a single asset past maxAssetBytes; that branch refunds handle.size and drops without flipping totalAborted, so sibling assets still get a fair shot. Verifies the round-5 Red at apps/daemon/tests/export-inline-route.ts ('reconciles handle.size with actual content bytes'). Daemon suite 1744/1744 passing. --------- Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>	2026-05-12 16:48:16 +08:00
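The reserve-then-reconcile flow described in rounds 4 and 5 of this commit can be sketched roughly as below. This is a minimal illustration, not the code in inline-assets.ts: `readWithinBudget`, the `Limits` shape, and the return value are hypothetical names, and the real helper reportedly throws `InlineAssetsLimitError('total')` after the workers settle, where this sketch only returns the flag.

```typescript
// Shape taken from the commit message: size comes from a cheap stat,
// read() defers the full materialization.
interface AssetHandle {
  readonly size: number;
  read(): Promise<string | null>;
}
type AssetReader = (path: string) => Promise<AssetHandle | null>;

// Hypothetical options bag, mirroring the caps named in the commit.
interface Limits {
  maxAssetBytes: number;
  maxTotalBytes: number;
  maxReadConcurrency: number;
}

async function readWithinBudget(
  paths: string[],
  reader: AssetReader,
  limits: Limits,
  ownerBytes: number,
): Promise<{ contents: (string | null)[]; totalAborted: boolean }> {
  const contents: (string | null)[] = paths.map(() => null);
  let runningBytes = ownerBytes; // bytes already committed to the output
  let totalAborted = false;      // shared flag checked by every worker
  let next = 0;                  // index of the next path to claim

  async function worker(): Promise<void> {
    while (next < paths.length) {
      const i = next++;
      const path = paths[i];
      if (path === undefined) break;
      if (totalAborted) continue;                        // no new reads after abort
      const handle = await reader(path);                 // stat-only call
      if (!handle) continue;
      if (handle.size > limits.maxAssetBytes) continue;  // pre-buffer asset cap
      if (runningBytes + handle.size > limits.maxTotalBytes) {
        totalAborted = true;                             // pre-buffer total cap
        continue;
      }
      runningBytes += handle.size;                       // reserve
      const content = await handle.read();               // only now materialize
      if (content === null) {
        runningBytes -= handle.size;                     // refund
        continue;
      }
      const actualBytes = Buffer.byteLength(content, "utf8");
      if (actualBytes > limits.maxAssetBytes) {
        runningBytes -= handle.size;                     // stat lied about one asset
        continue;
      }
      runningBytes += actualBytes - handle.size;         // reconcile with real bytes
      if (runningBytes > limits.maxTotalBytes) {
        totalAborted = true;                             // blocks later reads
        continue;
      }
      contents[i] = content;                             // results keep input order
    }
  }

  const poolSize = Math.max(1, Math.min(limits.maxReadConcurrency, paths.length));
  await Promise.all(Array.from({ length: poolSize }, () => worker()));
  // The helper in the commit would throw InlineAssetsLimitError('total') here
  // when totalAborted is set; this sketch leaves that decision to the caller.
  return { contents, totalAborted };
}
```

With a reader that reports `size: 10` but returns 1000 bytes, a 500-byte total cap, and a concurrency of 1, the first worker's reconciliation sets `totalAborted` and the second asset is never read, which is the read-count discriminator the round-5 test describes.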
@aaronjmars	377d65b7e4	fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN SSRF bypass) (#1122 ) * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN bypass) new URL('http://192.168.1.5./').hostname returns '192.168.1.5.' — the trailing dot is the RFC 1034 absolute-FQDN form and resolves identically to '192.168.1.5'. parseIpv4 fails on the dotted form, so 169.254.169.254. slips past the metadata-service block, 192.168.1.5. slips past the LAN block, and localhost. slips past the loopback identification. Strip trailing dots in normalizeBracketedIpv6 so all downstream checks (isLoopbackApiHost, isBlockedExternalApiHostname, isBlockedIpv4, IPv6 range tests) see the canonical form. Adds 6 vitest cases covering loopback FQDN forms (localhost., foo.localhost., 127.0.0.1.) and SSRF FQDN bypasses (169.254.169.254., 192.168.1.5., 10.0.0.5.). Refs nexu-io/open-design#1119 review feedback (P2 from @lefarcen). * test(connectionTest): tighten trailing-dot coverage per #1122 review Two issues from #1122 review: 1. (P2 from @mrcfps + codex bot) The original `foo.localhost.` case asserted error===undefined on validateBaseUrl, which only proves the URL passed validation — not that the host is identified as loopback. Replaced with direct isLoopbackApiHost(...) assertions on the actual loopback FQDN forms (localhost., 127.0.0.1., 127.0.0.5.) so the test exercises the loopback path the comment claims. 2. (P3 from @lefarcen) Original blocked-FQDN tests covered only 3 of 7 ranges that isBlockedIpv4 handles. Added a dedicated case per range (0.0.0.0/8, 10/8, 100.64/10, 169.254/16, 172.16/12, 192.168/16, multicast >=224) so future regressions in normalizeBracketedIpv6 surface against the full coverage. 
* docs: drop misleading foo.localhost./endsWith claim in normalizer comment @lefarcen review feedback: isLoopbackApiHost only accepts exact 'localhost', '::1', loopback IPv4, and mapped loopback IPv4 — there's no subdomain or endsWith handling, so referencing 'foo.localhost.' overstates what the trailing-dot strip enables. Rewrite the comment to match actual call sites (isLoopbackApiHost equality + isBlockedIpv4 numeric parse).	2026-05-12 16:36:09 +08:00
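To illustrate why the trailing dot matters, here is a small sketch of a strict dotted-quad check with and without normalization. The function bodies are assumptions written for this example (the commit folds the strip into `normalizeBracketedIpv6`, and the real `parseIpv4` / `isBlockedIpv4` may differ); only the blocked ranges are taken from the commit message.

```typescript
// Hypothetical stand-in for the strip added to normalizeBracketedIpv6.
// Assumption: every trailing dot is removed, not just one.
function stripTrailingDots(hostname: string): string {
  return hostname.replace(/\.+$/, "");
}

// Strict parser: exactly four decimal octets in 0..255, otherwise null.
function parseIpv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const n = Number(part);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

// Ranges listed in the commit: 0/8, 10/8, 100.64/10, 169.254/16,
// 172.16/12, 192.168/16, and multicast and above (first octet >= 224).
function isBlockedIpv4(host: string): boolean {
  const octets = parseIpv4(host);
  if (!octets) return false; // unparseable input is not recognized as an IP
  const [a = -1, b = -1] = octets;
  return (
    a === 0 ||
    a === 10 ||
    (a === 100 && b >= 64 && b <= 127) ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    a >= 224
  );
}

// Without normalization the FQDN form splits into five parts and is missed.
const raw = "169.254.169.254.";
const missed = isBlockedIpv4(raw);
const caught = isBlockedIpv4(stripTrailingDots(raw));
```

Here `missed` is false because the dotted form fails the strict parse, while `caught` is true once the host is normalized first, which is why the strip has to run before every downstream range check.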
eggward han	b6e4ae5e11	feat(daemon): make connection-test timeouts configurable (#1222 ) * feat(daemon): make connection-test timeouts configurable Provider and agent connection tests had hardcoded 12s / 45s budgets, which are too tight for slow networks or distant providers (the user sees "timeout" in Settings with no way to extend the budget). - Add OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS (default 12_000) - Add OD_CONNECTION_TEST_AGENT_TIMEOUT_MS (default 45_000) - Invalid values (non-numeric, zero, negative, fractional) emit a console.warn and fall back to the default, so a typo in the env never silently disables the safety timeout. - Export resolveConnectionTestTimeoutMs for unit testing; cover the three resolution paths (fallback / honored override / invalid). 41 connection-test tests pass (+3 new), full daemon suite 1170/1170. * fix(daemon): reject connection-test timeout overrides above Node's setTimeout maximum Node's `setTimeout` silently clamps any delay above `2^31-1` ms (2_147_483_647) to ~1 ms with a TimeoutOverflowWarning. The previous `Number.isInteger(n) && n >= 1` check accepted oversized values unchanged and passed them straight to `setTimeout`, so an override that intended to raise the budget — e.g. `OD_CONNECTION_TEST_AGENT_TIMEOUT_MS=3000000000` — instead caused every connection test to fail almost immediately. The safety timeout was effectively disarmed. Add `MAX_CONNECTION_TEST_TIMEOUT_MS = 2_147_483_647` and switch the guard to `Number.isSafeInteger(n) && n >= 1 && n <= MAX...`. The boundary value is still accepted; one millisecond past it falls back with a warn. Regression test exercises `3_000_000_000`, `2_147_483_647`, and `2_147_483_648`. Addresses #1222 review feedback from @chatgpt-codex-connector, @mrcfps, and @lefarcen.	2026-05-12 16:34:37 +08:00
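The validation rule from the second commit above can be sketched as follows. This is an approximation under stated assumptions: the function name `resolveTimeoutMs`, its parameter list, and the warning text are invented for illustration, and the exported `resolveConnectionTestTimeoutMs` may have a different signature. The guard itself follows the commit message.

```typescript
// Largest delay accepted here, per the commit message: 2^31 - 1 ms.
const MAX_CONNECTION_TEST_TIMEOUT_MS = 2_147_483_647;

// Hypothetical signature; `raw` would be the env var value, e.g.
// process.env.OD_CONNECTION_TEST_AGENT_TIMEOUT_MS.
function resolveTimeoutMs(
  raw: string | undefined,
  fallbackMs: number,
  warn: (message: string) => void = console.warn,
): number {
  // Unset or blank: use the default silently.
  if (raw === undefined || raw.trim() === "") return fallbackMs;

  const n = Number(raw);
  const valid =
    Number.isSafeInteger(n) && n >= 1 && n <= MAX_CONNECTION_TEST_TIMEOUT_MS;

  if (!valid) {
    // Non-numeric, zero, negative, fractional, or oversized values fall back
    // loudly so a typo never disables the safety timeout.
    warn(`Invalid timeout override "${raw}"; using default ${fallbackMs} ms`);
    return fallbackMs;
  }
  return n;
}
```

For example, `resolveTimeoutMs("3000000000", 45_000)` falls back to the default with a warning, while the boundary value `"2147483647"` is honored.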
chaoxiaoche	2ce6355558	feat(daemon): inject compiled design-system tokens + fixture into prompts (#1385 ) * feat(daemon): inject compiled design-system tokens + fixture into prompts Follow-up to #1231. The prior PR landed the structured form of two brands (`default` + `kami`) and codified the schema; this PR teaches the daemon to actually consume those files when assembling the system prompt, so agents stop having to re-derive token names from DESIGN.md prose every turn. Gated behind `OD_DESIGN_TOKEN_CHANNEL=1` for the smoke-test phase — flag-off keeps the daemon byte-equivalent to today's behavior, flag-on appends two new prompt blocks (the brand's `tokens.css` :root contract and its `components.html` reference fixture) right after the existing DESIGN.md block. Brands without those sibling files (every brand except `default` and `kami` today) skip silently in either mode. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): only swallow ENOENT/ENOTDIR in readFileOptional, rethrow rest Reviewer feedback (nettee, #1385). The prior catch-all hid permission errors, EISDIR, and broken packaged-resource paths behind the same "undefined = absent" branch the legacy ~138-brand fallback uses, which would let `OD_DESIGN_TOKEN_CHANNEL=1` silently degrade to the DESIGN.md-only prompt while reporting success. That corrupts the exact signal the smoke-test rollout depends on. Now `readFileOptional` only returns undefined for ENOENT / ENOTDIR (real "file does not exist" cases) and rethrows everything else. Added a focused test that plants a directory at the tokens.css path to exercise the EISDIR branch, plus a partial-presence regression test to confirm the stricter contract preserves the legacy fallback. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 16:28:59 +08:00
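The stricter `readFileOptional` contract from the fix above might look roughly like this. It is a sketch based only on the commit message; the actual signature, encoding, and error typing in the daemon are assumptions.

```typescript
import { readFile } from "node:fs/promises";

// Returns undefined only when the file is absent; every other failure
// (EACCES, EISDIR, broken resource paths, ...) is rethrown to the caller.
async function readFileOptional(path: string): Promise<string | undefined> {
  try {
    return await readFile(path, "utf8");
  } catch (err) {
    const code = (err as { code?: string }).code;
    if (code === "ENOENT" || code === "ENOTDIR") {
      return undefined; // real "file does not exist" cases
    }
    throw err; // surface everything else instead of silently degrading
  }
}
```

Under this contract a brand without `tokens.css` still resolves to `undefined` and keeps the DESIGN.md-only prompt, whereas a directory planted at that path rejects instead of being mistaken for an absent file.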
Yuhao Chen	3790c00363	docs: add Windows troubleshooting guide (#478 ) (#1170 ) * docs: add Windows troubleshooting guide (#478) Add docs/windows-troubleshooting.md with step-by-step fixes for the most common native-Windows setup errors: - Node 24 / nvm-windows gotchas (fake nvm file in System32) - pnpm not found after installation - Build scripts blocked by pnpm 10 (better-sqlite3, sharp) - Visual Studio / gyp build errors - Starting the dev server - Optional OpenCode CLI setup Also update CONTRIBUTING.md and QUICKSTART.md to link to the new guide instead of the vague "file an issue if it doesn't" note. * docs: fix Windows guide command accuracy (#1170) Address all 6 inline review comments from lefarcen: - Pin npm-global pnpm install to @10.33.2 (matches packageManager field) - Use where.exe instead of bare where (PowerShell alias conflict) - Fix OpenCode package: opencode-ai (not opencode), binary is opencode - Add EPERM fallback note for corepack enable on protected installs - Add Python check for gyp ERR! find Python - Expand diagnostic checklist with corepack, python, execution policy Also remove redundant corepack pnpm --version from checklist.	2026-05-12 16:17:44 +08:00
Hesam	d97b6041eb	fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins (#1110 ) * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins fnm legacy installations use ~/.fnm/node-versions. Closes #1102 * fix: remove stray .fnm token from type declaration	2026-05-12 16:15:52 +08:00
Nicholas-Xiong	d5812ee535	feat: add FAQ page skill (#1162 ) * fix: set writable OD_DATA_DIR default for nix run Fixes #1157 When running via 'nix run github:nexu-io/open-design', the daemon attempted to create runtime state under the Nix store package path: /nix/store/.../lib/open-design/.od/projects The Nix store is read-only at runtime, causing startup to fail with ENOENT when mkdir() tried to create the projects directory. This commit updates the nix run wrapper to export OD_DATA_DIR with a writable default ($HOME/.od) when the variable is unset. Users can still override it by setting OD_DATA_DIR before running. The Home Manager and NixOS modules already set OD_DATA_DIR, so they are unaffected by this change. * feat: add FAQ page skill Add a new skill for generating Frequently Asked Questions pages with: - Collapsible accordion sections for Q&A pairs - Real-time search functionality - Category filtering (Billing, Account, Technical, General) - Smooth animations and transitions - Keyboard navigation support - Mobile-friendly responsive design - Semantic HTML with proper ARIA attributes The skill includes: - SKILL.md with triggers, workflow, and output contract - example.html demonstrating a complete FAQ page with 12 questions Use cases: help centers, support pages, product documentation * fix: address PR review feedback for FAQ page skill - Fix craft slugs: use accessibility-baseline and state-coverage instead of non-existent slugs - Remove overly broad 'questions and answers' trigger - Add edge case handling for insufficient/excessive FAQs - Remove search highlighting requirement (XSS risk) - Update self-check to reflect filtering instead of highlighting Addresses review comments from @lefarcen and @chatgpt-codex-connector * feat: add localized copy for faq-page skill Add German, French, and Russian translations for the FAQ page skill example prompt to fix validation test failure. 
- DE: FAQ-Seite mit Akkordeon-Abschnitten, Suchfunktion und Kategoriefilterung - FR: Page FAQ avec sections accordéon, recherche et filtrage par catégorie - RU: Страница FAQ со складными секциями-аккордеонами, поиском и фильтрацией * fix: escape apostrophe in French translation Use double quotes to avoid syntax error with d'auth	2026-05-12 16:13:50 +08:00
이용진	aeb6cde923	prevent duplicate saves and add template deletion (#1294 ) * prevent duplicate template entries on repeated save * add delete button to saved template list Templates can now be removed from the template picker via a hover x button, calling the existing DELETE /api/templates/:id endpoint. * add missing onDeleteTemplate prop in test fixtures * add template deletion flow test for NewProjectPanel * reject template names longer than 100 characters * preserve original createdAt on template update	2026-05-12 15:48:04 +08:00
Prantik Medhi	325d1d3ceb	docs: add NotebookLM GitHub export script (#1062 ) * docs: add NotebookLM GitHub export script * fix: make NotebookLM export TOC anchors work * fix: escape TOC link text markdown chars * fix: include merged PRs when exporting --prs all * fix: allow --prs merged mode * fix: treat --limit as total export budget * fix: avoid starving buckets under global --limit * fix: support --issues none and handle repos w/ issues disabled * fix: avoid underfilling export when buckets empty * fix: keep disabled-issues fallback quiet * fix: silence disabled issues fallback * fix: satisfy script typecheck	2026-05-12 15:47:32 +08:00
shangxinyu1	5d674410f2	test: stabilize extended Playwright coverage (#1341 ) * test: align extended Playwright coverage with current UI behavior * test: address extended suite review feedback * test: restore Codex path hydration assertion	2026-05-12 15:11:34 +08:00
Eli	9c489aa045	feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select (#1161 ) * feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select - Render real previews on project cards: HTML iframe / image / video / hashed gradient fallback with project initial; lazily fetches the project's primary file when metadata.entryFile is unset, prefers index.html → newest html → image → video. - Live artifact card thumbnails embed the rendered artifact URL via sandboxed iframe. - Replace the per-card close button with a `…` overflow menu (Rename, Delete) that opens on hover/click; click-outside and Esc close it. - Add multi-select mode (toolbar toggle → checkbox per card → "N selected · Delete · Cancel" pill) with batch delete via the existing onDelete prop. - Add a category tag to every card (Prototype / Live Artifact / Slide / Media) derived from project.metadata.intent / kind / skillId. - Replace browser prompt() and confirm() with custom modals (rename input + danger-confirm) reusing the existing .modal shell. - Add `more-horizontal` icon and 16 new i18n keys across all 18 locales (zh-CN/zh-TW localized; others fall back to English). * test(e2e): update home delete flow for overflow menu + custom confirm modal The previous flow targeted a per-card X button labelled "delete project <name>" and asserted on a native `dialog` event. The card UI now exposes a `…` overflow menu and a styled confirm modal, so reach delete via the menu and assert against the modal's Cancel / Delete buttons instead. * fix(web): harden Designs tab preview sandbox * fix(web): hide Designs select mode in kanban	2026-05-12 15:08:22 +08:00
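The preview fallback order mentioned above (index.html, then newest html, then image, then video) could be expressed along these lines. Everything here is illustrative: `ProjectFile`, `pickPrimaryFile`, and the extension lists are assumptions, not names from the web app.

```typescript
// Hypothetical file record; the real project file shape may differ.
interface ProjectFile {
  name: string;
  mtimeMs: number; // last-modified time, used to find the newest html
}

// Assumed extension lists, for illustration only.
const HTML_EXTS = [".html", ".htm"];
const IMAGE_EXTS = [".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg"];
const VIDEO_EXTS = [".mp4", ".webm", ".mov"];

function hasExt(file: ProjectFile, exts: string[]): boolean {
  const lower = file.name.toLowerCase();
  return exts.some((ext) => lower.endsWith(ext));
}

// Preference order: index.html, newest html, first image, first video.
function pickPrimaryFile(files: ProjectFile[]): ProjectFile | undefined {
  const index = files.find((f) => f.name.toLowerCase() === "index.html");
  if (index) return index;

  const newestHtml = files
    .filter((f) => hasExt(f, HTML_EXTS))
    .sort((a, b) => b.mtimeMs - a.mtimeMs)[0];
  if (newestHtml) return newestHtml;

  return (
    files.find((f) => hasExt(f, IMAGE_EXTS)) ??
    files.find((f) => hasExt(f, VIDEO_EXTS))
  );
}
```

When nothing matches, the function returns `undefined`, which is where the card would presumably fall back to the hashed gradient with the project initial.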
Eli	77f69257a7	feat(web): in-context comment thread for the artifact preview (#1276 ) * feat(web): free-pin fallback in comment mode for unannotated artifacts When the artifact has no data-od-id annotations, clicking in Comment mode now posts a synthetic position-based target so the host opens a popover at the click location. Daemon upsert validation requires a non-empty selector/label, so the pin uses [data-od-pin=ID] and label 'pin'. Coordinates are document-space (viewport + scrollY) so pins stay anchored after scroll/reload. Clicks on interactive elements (a/button/input/textarea/select/label/contenteditable) keep their native behavior and are not pinned. * feat(web): tighten comment popover layout for free-pin and element targets The popover header used to dump the raw elementId verbatim — fine for data-od-id targets like 'hero-cta' but jarring for free-pins where elementId is a synthetic 'pin-...' string. Branch on the prefix and show 'Pin · at X, Y' for free-pins; keep the label + selection kind for real element / pod targets. Replace the text 'Close' button with an icon-only close affordance to match the popover-as-card visual. Action row is now two right-aligned buttons (Comment + Send to Claude) for element targets and (Add note + Send to Claude) for pod targets, eliminating the three-button row that wrapped onto two lines at narrow widths. The 'Remove' affordance for existing comments stays left-aligned. * feat(web): drop comments tab from chat sidebar The chat sidebar's 'Comments' tab listed saved/attached preview comments but duplicates the per-element popover already shown in the artifact viewer. Hide the tab and its content while the right-side comment thread panel takes over the same surface in-context. The CommentsPanel / CommentSection components stay defined as dead code for the moment so callers and translation keys remain valid; a later pass can delete them. 
* feat(web): right-side comment thread panel in board mode Render a 320px CommentSidePanel anchored to the right of the artifact preview whenever board (comment) mode is on. The panel lists every saved preview comment for the current file with an avatar initial, the element label (or 'Pin' for free-pin synthetic ids), an Xd/Xh/Xm-ago timestamp, the note body, a Reply link, and a checkbox. Reply focuses the comment's element via liveSnapshotForComment so the popover opens at the right anchor. Selecting one or more comments via the checkboxes surfaces a 'N selected · Clear · Send to Claude' action bar above the list; Send to Claude reuses the existing onSendBoardCommentAttachments pipeline via commentsToAttachments. The panel takes the place of the chat sidebar's removed Comments tab so the thread lives next to the artifact instead of behind a tab switch. * feat(web): styles for right-side comment thread panel Floating 320px panel anchored to the right edge of the artifact preview with a scrollable comment list and a coral selection bar that appears when one or more comments are checked. Selected items get a coral tint; the reply / check / send-to-claude controls match the popover's coral primary tone. * feat(web): toast confirmation on comment save, close popover After savePersistentComment succeeds, close the popover via clearBoardComposer and surface a transient 'Comment saved' (or 'Pin saved' for free-pin targets) toast for 2.2s. Replaces the previous behavior where the popover stayed open with an empty draft after save, which left users uncertain whether the save landed and forced an extra click to dismiss. * feat(web): position the comment-save toast at the top of the preview * feat(web): allow editing saved comment notes via the side panel Rename the per-item 'Reply' affordance to 'Edit' (no thread model exists yet, so reply was misleading) and pre-fill the popover with the existing note when clicked. 
The save path goes through onSavePreviewComment which the daemon implements as an upsert keyed on (project, conversation, filePath, elementId), so the edit overwrites the existing row's note without spawning a duplicate. Also fall back to a snapshot synthesized from the saved comment's own fields when the corresponding live target is no longer in the iframe DOM (e.g. free-pin parents that were re-rendered), so the edit path still works after artifact reloads. * feat(web): hide already-sent comments from the side panel After Send to Claude, the daemon flips the comment status from 'open' to 'applying' (and then 'needs_review' / 'resolved' / 'failed' depending on the run). Filter the side panel to status === 'open' so sent comments visibly leave the list — the user gets clear feedback that the send landed and the panel stays focused on actionable, un-sent items. * feat(web): drop single-tab bar and conversation count badge After the Comments tab was removed the chat header still rendered a one-tab 'tablist' just for the Chat tab, which read as visual noise without a sibling to switch between. Drop the tabs wrapper entirely; the chat content stays mounted and the header now hosts only the conversation-history affordance. Also drop the numeric badge that overlaid the conversation history button: counting open conversations next to a generic history icon was easy to mistake for an unread / notification count. The dropdown itself remains the canonical place to see and switch between past conversations. * feat(web): right-align chat header actions after tab bar removal With the tabs wrapper gone, chat-header-actions sat flush left because nothing was pushing it across the header. Add margin-left: auto so the history / new-conversation / collapse buttons land at the right edge, matching the design files / index.html tab row's own right-aligned controls. 
* feat(web): rename board-mode toggle to Comment with comment icon The artifact preview toolbar's board-mode entry was labeled 'Tweaks' with the tweaks icon, which collided with the palette Tweaks button next to it and hid the comment capability behind a generic label. Rename to 'Comment' with the comment icon and switch to the viewer-action class so the button matches the surrounding toolbar items (Edit/Draw) and the coral active state lands on the right surface. * fix(web): pass designTemplates to ProjectView in api-empty-response test The test props for ProjectView were missing the designTemplates prop that was added to Props in #955 (generic skills split). CI's strict typecheck (tsc -b --noEmit) caught it; local runs that hit project references differently did not. Pass an empty SkillSummary array — matches the empty skills fixture for the same reason.	2026-05-12 15:05:08 +08:00
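The commit above describes two behaviors that a short sketch may make easier to picture: saving a comment is an upsert keyed on (project, conversation, filePath, elementId), and the side panel lists only comments whose status is still 'open'. The following is a minimal in-memory illustration under stated assumptions, not the daemon's actual storage layer. `CommentStore`, `CommentKey`, `upsert`, `setStatus`, and `openFor` are hypothetical names; only the key fields and the status values come from the commit message.

```typescript
// Status values as listed in the commit message.
type CommentStatus = "open" | "applying" | "needs_review" | "resolved" | "failed";

// The four fields the commit message names as the upsert key.
interface CommentKey {
  project: string;
  conversation: string;
  filePath: string;
  elementId: string;
}

interface PreviewComment extends CommentKey {
  note: string;
  status: CommentStatus;
}

// Hypothetical in-memory stand-in for the daemon's comment table.
class CommentStore {
  private rows = new Map<string, PreviewComment>();

  // Serialize the composite key so it can index a Map.
  private static keyOf(k: CommentKey): string {
    return JSON.stringify([k.project, k.conversation, k.filePath, k.elementId]);
  }

  // Insert a new row, or overwrite the note of the row with the same key.
  upsert(key: CommentKey, note: string): PreviewComment {
    const id = CommentStore.keyOf(key);
    const existing = this.rows.get(id);
    const row: PreviewComment = existing
      ? { ...existing, note }
      : { ...key, note, status: "open" };
    this.rows.set(id, row);
    return row;
  }

  // Change the status of an existing row; unknown keys are ignored.
  setStatus(key: CommentKey, status: CommentStatus): void {
    const id = CommentStore.keyOf(key);
    const existing = this.rows.get(id);
    if (existing) this.rows.set(id, { ...existing, status });
  }

  // What the side panel would list: un-sent comments for one file.
  openFor(filePath: string): PreviewComment[] {
    return Array.from(this.rows.values()).filter(
      (c) => c.filePath === filePath && c.status === "open",
    );
  }

  get size(): number {
    return this.rows.size;
  }
}
```

With this shape, editing a saved note calls `upsert` again with the same key, so the row count stays the same. Once the status moves to 'applying' (the Send to Claude step), the comment drops out of `openFor`.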
Eli	928079daf5	feat(web): consolidate Image/Video/Audio entries into a Media tab (#1167 ) Reduces the New Project panel's top-level tab count by collapsing the three media surfaces into a single Media tab with an inner segmented control, and polishes the controls inside that tab so they stop dominating the panel: - Media tab + segmented (Image / Video / Audio) inside the panel body. Underlying ProjectKind branches and submission contract unchanged — the daemon still receives kind=image/video/audio. - Model picker rewritten as a combobox: one trigger row + searchable, provider-grouped popover with Recommended badges. Replaces the flat grid of provider-grouped cards that scrolled past the fold once the fourth provider landed. - Aspect picker compressed from a 5-card grid to a single row of segmented pills with mini ratio glyphs. - Image surface no longer carries a free-form Style notes field; it was redundant with the prompt template + main prompt input. - Live artifact tab locks fidelity to high-fidelity (the wireframe option is now hidden) — a wireframe live artifact doesn't make sense and the picker added noise. i18n: adds tabMedia / titleMedia / model* keys across all 18 locales, removes imageStyleLabel / imageStylePlaceholder. Tests + e2e selectors updated to drive the new Media tab + segmented surface flow.	2026-05-12 14:52:03 +08:00
Eli	1b307bf17f	feat(web): tweaks palette popover with HSL hue-shift recoloring (#1292 ) * feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal	2026-05-12 14:38:00 +08:00
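The hue-shift recoloring described in the commit above (move each chromatic color to a target hue, keep its saturation and lightness, and desaturate for Mono noir) can be sketched roughly as follows. This is an illustration of the general HSL technique, not the code that `injectPaletteBridge` actually ships. The names `Rgb`, `Hsl`, `Palette`, `rgbToHsl`, `hslToRgb`, and `recolor` are assumptions, and leaving achromatic colors untouched is an assumed reading of "every chromatic paint".

```typescript
type Rgb = { r: number; g: number; b: number }; // channels in 0..255
type Hsl = { h: number; s: number; l: number }; // h in degrees, s and l in 0..1

// Hypothetical palette shape: a target hue, or the desaturating "mono" mode.
type Palette = { hue: number } | "mono";

function rgbToHsl({ r, g, b }: Rgb): Hsl {
  const rn = r / 255;
  const gn = g / 255;
  const bn = b / 255;
  const max = Math.max(rn, gn, bn);
  const min = Math.min(rn, gn, bn);
  const l = (max + min) / 2;
  const d = max - min;
  if (d === 0) return { h: 0, s: 0, l }; // grey: no hue, no saturation
  const s = d / (1 - Math.abs(2 * l - 1));
  let h: number;
  if (max === rn) h = ((gn - bn) / d) % 6;
  else if (max === gn) h = (bn - rn) / d + 2;
  else h = (rn - gn) / d + 4;
  h *= 60;
  if (h < 0) h += 360;
  return { h, s, l };
}

function hslToRgb({ h, s, l }: Hsl): Rgb {
  const c = (1 - Math.abs(2 * l - 1)) * s; // chroma
  const hp = (((h % 360) + 360) % 360) / 60; // hue sector in 0..6
  const x = c * (1 - Math.abs((hp % 2) - 1));
  let rgb: [number, number, number];
  if (hp < 1) rgb = [c, x, 0];
  else if (hp < 2) rgb = [x, c, 0];
  else if (hp < 3) rgb = [0, c, x];
  else if (hp < 4) rgb = [0, x, c];
  else if (hp < 5) rgb = [x, 0, c];
  else rgb = [c, 0, x];
  const m = l - c / 2; // lift all channels to the target lightness
  return {
    r: Math.round((rgb[0] + m) * 255),
    g: Math.round((rgb[1] + m) * 255),
    b: Math.round((rgb[2] + m) * 255),
  };
}

// Shift a color into the palette's hue family, keeping saturation and lightness.
function recolor(color: Rgb, palette: Palette): Rgb {
  const hsl = rgbToHsl(color);
  if (hsl.s === 0) return color; // achromatic paints are left alone
  if (palette === "mono") return hslToRgb({ ...hsl, s: 0 });
  return hslToRgb({ ...hsl, h: palette.hue });
}
```

For example, `recolor({ r: 255, g: 200, b: 200 }, { hue: 240 })` turns a pale pink into an equally pale blue, because only the hue component changes.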
nettee	03da01a56f	ci: use open-design bot for contributors wall refresh (#1349)	2026-05-12 14:35:28 +08:00
Nicholas-Xiong	c0b679ecbc	fix: restore custom dropdown chevron for timezone selector in dark mode (#1368 ) Fixes #1359 The timezone selector in the Routines form was showing repeated dropdown icons and poor text readability in dark mode because: 1. set to remove the native chevron, but didn't restore a custom one via background-image 2. Missing caused text to overlap with any chevron 3. No dark-mode-specific chevron color was defined This commit adds the custom dropdown chevron styling (matching the global select behavior) with proper padding and dark-mode color variants, ensuring: - Single, correctly-positioned chevron icon - Sufficient padding to prevent text overlap - Proper contrast in both light and dark themes - Consistent visual behavior with other form controls	2026-05-12 14:29:01 +08:00
Sid	fb47d0ae51	style(web): polish EntryView UI — sidebar layout, folder tabs, slim form, blue selected token (#1360 ) * chore(web): upgrade radius scale + introduce blue --selected token UI polish pass — design tokens for follow-up commits. Radius scale was visually too square at the small end. Bump up so buttons / inputs / cards feel rounded rather than boxy: - `--radius-sm: 6px → 8px` (buttons, inputs, small chips) - `--radius: 10px → 12px` (medium containers, Recent filter pill) - `--radius-lg: 14px → 16px` (project cards) - `--radius-pill: 999px` unchanged (status chips) Introduce a separate "selected" colour so selection indicators (card borders, focus rings) read as blue instead of fighting with the orange brand accent that drives primary CTAs: - `--selected: #2563eb` (Tailwind blue-600) - `--selected-soft: rgba(37, 99, 235, 0.16)` (soft tint for shadows) No selectors are migrated to `--selected` in this commit — that happens in a later "selected state" commit so the diff stays scoped. * refactor(web): replace entry global header with sidebar brand + reorder bottom chips Pre-existing layout: a global \`AppChromeHeader\` strip sat across the whole top of EntryView (logo + settings gear), then a 2-column body below it. Visual mass concentrated in a thin horizontal bar that did not relate to the page's column structure, and the settings gear duplicated the bottom Local-CLI chip. New layout matches the two-column "brand-in-sidebar + tabs-in-main" pattern: the brand block lives at the top of \`.entry-side\` (left column), the right tabs live at the top of \`.entry-main\`, and the vertical divider between them is the only horizontal seam. EntryView: - Drop \`<AppChromeHeader actions={avatarMenu} />\` from EntryView's render — the home page no longer renders the global chrome strip. (ProjectView still uses AppChromeHeader for back-nav / file actions, so the component itself stays in the codebase.) 
- Add a sidebar brand block inside \`.entry-side\` using the already-defined \`.entry-brand\` / \`.entry-brand-mark\` / \`.entry-brand-title\` classes that were sitting dead in index.css. - Reorder \`.entry-side-foot\` chips so that the env-critical Local CLI row sits on top of the row, with the secondary toggles (language picker, pet adoption, X follow icon) compact on a second row. The Follow @nexudotio chip drops its text label and becomes icon-only — pure marketing content, so it no longer earns a full-width pill. - Settings access moves entirely to the Local CLI chip's existing click handler; the top-right gear is gone (it was a duplicate). CSS: - \`.entry-shell\` grid: \`auto 1fr\` → \`1fr\` (no header row). - \`.entry-side\` background: \`var(--bg-panel)\` → \`transparent\`, so the sidebar shares the page beige and only the New-prototype card reads as white. Removes the "everything on the left is on one big white sheet" feeling. - \`.entry-brand\` gets \`padding: 24px 20px 18px\` so the logo + title block has breathing room at the top of the sidebar. - \`.entry-brand-mark\` width/height \`44 → 34\`. The previous 44px gradient ring was visually heavier than the title text it sat next to. - \`.entry-brand-title\` weight \`600 → 450\`, color \`var(--text-strong)\` → \`var(--text)\`. Serif title still reads as the page anchor without the chunky "bold black" stamp. - \`.entry-brand-actions\` added for future right-aligned actions (carries no actual content in this commit — kept so re-adding a settings/avatar entry point doesn't need new CSS). - \`.entry-side-foot .foot-pill\` slim pass: padding \`4px 10px → 3px 8px\`, font \`11.5px → 10.5px\`, gap \`6 → 5\`, plus \`justify-content: center\` and \`min-height: 24px\` so the icon-only Follow pill stays the same height as the text pills next to it. 
* style(web): align right tabs row with brand row + strip hover/focus noise Right column's tabs row ("Designs / Templates / Design systems / Image templates / Video templates") needed three things: 1. Vertical center of tab text aligned with the brand logo on the left (both rows feel like one row, separated by the vertical divider only). 2. Active tab's underline sitting flush on the horizontal divider below the tabs (not floating mid-row). 3. No hover background, no focus outline, no transition — tabs are a navigation strip, not action buttons. Changes: - `.entry-header` padding `0 28px` → `24px 28px 0`, drop the `min-height: 52px`. Padding-top mirrors the brand block's padding-top (24px) so left logo top and right tabs top land on the same Y. Header height now content-driven; underline meets the `border-bottom` divider naturally. - `.entry-tabs` gets `align-self: stretch` + `align-items: center` + `gap: 2px → 24px`. The stretch lets the tabs container fill header height; the bigger gap matches Claude Design's tab rhythm. - `.entry-tab` becomes a "plain underline tab": - `border-radius: 6px 6px 0 0 → 0` (no folder-tab look — that's on the left tabs). - `padding: 14px 11px → 6px 4px 8px` so text + underline form a tight group, with the underline sitting at the bottom of the tab box right above the header divider. - `font-size: 14px → 12px` matches the left newproj tabs (set in commit 4) — both columns share the same tab type-size. - `transition: none` removes the inherited 120ms background / border / color transition. - Hover / focus / active states explicitly zero out background, border-color, outline. Hover keeps a subtle color change (`text-muted → text`) so the tab still feels interactive without flashing a chip behind it. - Active state colors are duplicated across `.active`, `.active:hover`, `.active:focus`, `.active:focus-visible` so the black underline never gets overwritten by the inactive-state rules above. 
* style(web): folder-tab merge on left newproj tabs + flat card top corners The left "Prototype / Live artifact / Slide deck / …" tabs sat as plain underline tabs above a fully-rounded card. The active tab and card looked like two stacked rectangles with a gap. Folder-tab pattern: - Active tab gets a white background + 12px top corners + a 1px border on top / left / right. - Active tab's bottom border matches the card's background color (effectively invisible) — so where the tab sits, the card's top border is "broken" and tab + card read as one merged shape. - Card top corners are square (`border-radius: 0 0 12px 12px`), bottom corners stay 12px. With the active tab's square bottom edge, the merge line at the tab/card seam is a clean horizontal, not a curve mismatch. Implementation: - `.newproj-tabs-shell`: - `overflow: hidden → visible` so the tab's overlap with the card below isn't clipped at the shell's bottom edge. - `margin-bottom: -1px` + `z-index: 2` so the shell renders on top of the card and the 1px tab/card overlap actually paints. - The `.can-left { padding-left: 40 }` / `.can-right` overrides used to reserve room for scroll arrows are removed (arrows are hidden, no extra padding needed). - `.newproj-tabs` keeps its horizontal `overflow-x: auto` so the 8 project-type tabs can still scroll inside the sidebar width. - `.newproj-tabs-arrow` becomes `display: none`. The two chevron-circle buttons added clutter without much benefit — users with touchpads / wheels / keyboard already scroll the tabs row natively, and the `::before` / `::after` linear- gradient fades (now using `--bg` instead of `--bg-panel` so they fade into the page beige, not the sidebar panel that no longer exists) signal there are more tabs to the right. - `.newproj-tab`: - Replace the plain bottom-underline (`border-bottom: 2px solid transparent`) with a full transparent 1px border so the active state can flip just the colors without changing layout. - `border-radius: 0 → 12px 12px 0 0`. 
- `position: relative` for z-index stacking. - `padding: 10px 6px → 7px 14px` (less vertical, more horizontal — tabs read as "labels" rather than chunky buttons). - Symmetric top/bottom padding (`7px`) so the text + folder- tab top corners stack cleanly. - `transition: none` — no animation between active/inactive states (tabs are nav, not action buttons). - All hover / focus / focus-visible / active states zeroed out background and border-color so the inherited `button { … }` base style (which adds bg-subtle on hover) does not bleed in. Subtle color change on hover (`text-muted → text`) is the only affordance. - `.newproj-tab.active` (+ active hover/focus combos so the base rules don't override): white bg, full var(--border) on three sides, bottom border = var(--bg-panel) (invisible against card), z-index 3 (above non-active tabs and shell pseudo-elements). - `.newproj-body`: - `margin: 0 24px` so the card breathes inside the sidebar (and the active tab's left edge aligns with the card's left edge). - `padding: 18px 24px 28px → 16px 18px 18px` — tighter. - `border-radius: (full 12) → 0 0 12px 12px` for the flat-top merge with the active tab. - Adds explicit `border` + `background: var(--bg-panel)` + `box-shadow: var(--shadow-xs)` so the form reads as a card floating on the transparent sidebar. - `flex: 1 → 0 0 auto` (and `min-height: 0` / `overflow-y: auto` removed) — the card is content-sized, not stretched to fill the sidebar. Empty space below the card is now page beige, not a giant white sheet. - `gap: 14px → 12px` between form sections. * style(web): slim NewProjectPanel form (title, fidelity, buttons, ds-picker) The form inside the new white card felt overweight against the compacted layout from the previous commits — fidelity cards were ~133px tall, the Create button + Open-folder secondary button both had ~11px symmetric padding, the design-system trigger had a 32px avatar in a 55px-tall row. 
Slim every element so the card reads as a focused form, not a stack of beefy buttons. Title: - \`.newproj-title\` font \`14px / 600 → 13px / 550\`. Still visibly the section heading but no longer competing with the serif brand title above. Fidelity: - \`.fidelity-thumb { aspect-ratio: 12/7 → 16/7 }\`. The previous aspect made cards taller than they needed to be in the narrow sidebar column. - \`.fidelity-card { gap: 8 → 6, padding: 10/10/12 → 8/8/10 }\`. Combined with the thumb aspect change, card height drops from ~133px → ~102px (visually close to the Claude Design reference while keeping the same content). Primary / secondary buttons: - \`.newproj-create\` padding \`11px (symmetric) → 8px 11px\`, margin-top \`4 → 2\` — primary CTA no longer towers over the fidelity cards above it. - \`.newproj-import\` padding \`10px → 6px 10px\` — the secondary "Import Claude Design ZIP" button feels like an alt option, not a peer of Create. Design system trigger: - \`.ds-picker-trigger\` gap \`10 → 8\`, padding \`8/10 → 6/10\`. - \`.ds-picker-title\` font \`13 → 12.5\` so name + subtitle stay legible in the slimmer row without overflowing the column. - \`.ds-avatar\` width/height \`32 → 26\`, border-radius \`6 → 5\`. The thumbnail was the dominant element in the row; shrinking it pulls the row height from ~55px → ~50px. Footer disclaimer: - \`.newproj-footer\` padding-top \`0 → 12px\`. The "Only you can see your project by default." line was butting against the card bottom; 12px of air separates the disclaimer (page-bg context) from the card (panel-bg context) cleanly. * style(web): blue selected indicators + Recent filter rounded + neutral input focus Three small "selection state" tweaks driven by the new \`--selected\` token introduced earlier in this branch: 1. Fidelity card selected border is now blue, not the brand accent. The orange Create button + the orange selected card border were fighting for the same visual role (primary action vs primary selection). 
Blue clearly says "this is the one that is selected" without competing with the CTA. - \`.fidelity-card.active\` border-color \`var(--accent) → var(--selected)\`. - Box-shadow ring + soft 0.04 drop swapped from the orange \`180/90/59\` rgba tuple to the blue \`37/99/235\` tuple. - \`.fidelity-card.active .fidelity-thumb\` border swapped from \`var(--accent-soft) → var(--selected-soft)\`. 2. Recent / Your designs filter is no longer a fully-rounded pill. The bottom-left settings chips deserve to be the only "999px pill" shape — those are tertiary status indicators. The Recent/Your designs toggle is a higher-importance inline filter, so it gets the medium radius instead. - \`.subtab-pill\` wrapper border-radius \`var(--radius-pill) → var(--radius)\` (12px). - Inner button border-radius \`var(--radius-pill) → var(--radius-sm)\` (8px). - Active state background \`var(--text) → var(--bg-panel)\`, color \`var(--bg) → var(--text)\`. The "black filled pill" read as a status badge; white-on-faint-gray reads as "selected toggle" — same shape as Claude Design's Recent pill. 3. Input focus is neutralised. The base \`input:focus\` rule added an orange border + a 3px orange-soft ring around the focused field — way too much visual weight for a quiet form ("Project name" → focus made it scream). - \`input:focus / textarea:focus / select:focus\` border-color \`var(--accent) → var(--border-strong)\` (light grey). - Box-shadow ring removed (\`none\`). Focused inputs now only darken their border by one step — barely visible but enough to confirm focus. These three changes are grouped because they all migrate selection- state styling off the brand accent and onto neutral / blue tokens. The next pass (if any) can sweep the remaining \`var(--accent)\` selection sites (\`.ds-row.active\`, \`.ds-picker-trigger.open\`, \`.conv-pill.open\`, …) to use \`--selected\` too, but each of those lives in a different surface and felt out of scope for the entry view polish. 
* refactor(web): pet rail toggle moves inside pet pill as split button WHAT - Convert the pet pill from a single `<button>` to a `<div>` containing two buttons separated by a 1px divider: * `.pet-pill-main` keeps the existing "Adopt a pet" / "Change pet" glyph + label + unadopted dot, still wired to `onAdoptPet`. * `.pet-pill-toggle` is a small icon-only button that flips `petRailHidden` — eye icon when the rail is hidden ("click to show"), eye-off when visible ("click to hide"). - Drop the old avatar-menu popover from EntryView entirely: `avatarMenuOpen` state, the outside-click / Escape effect, and the cog-popover trigger are all removed. The `Settings` entry of that popover was already redundant with the `Local CLI` chip; the `Hide/Show pet picker` entry now lives directly on the pet pill. - CSS in the `.pet-pill` block: * `height: 24px` + `padding: 0` so the outer pill matches every other chip in the row vertically. * `.pet-pill-glyph` reduced from 14px to 12px and constrained to a 14x14 inline-flex box so the unicorn / paw glyph stops pushing the chip taller than 24px. * Per-region hover (`.pet-pill-main:hover`, `.pet-pill-toggle:hover`) so each side of the split lights up independently, with the divider inheriting the accent tint while the chip is in `pet-pill-fresh`. WHY - After commit `5fe5721c` removed the global `<AppChromeHeader>`, the only entrypoint to "Show pet picker" was the avatar-menu popover. Putting the avatar cog back next to the brand mark felt wrong: it elevates Settings (already on the `Local CLI` chip) to a primary affordance and sits next to the logo, where it doesn't belong by hierarchy. - The pet-rail toggle is fundamentally a pet-area control — it belongs with the pet adoption chip, not in a popover. Putting both on the same chip via a split button gives the rail toggle a stable, discoverable home and keeps `.entry-brand` a brand-only row. SCOPE - `apps/web/src/components/EntryView.tsx` + `apps/web/src/index.css`. 
No new state, no new i18n keys (reuses `pet.railShow` / `pet.railHide`). - The orphan i18n keys `entry.openSettingsTitle` and `entry.openSettingsAria` are no longer referenced by EntryView but are left in place — they're shared types that other locale files still declare; a focused cleanup belongs in a separate commit. * test(e2e): update entry chrome + project mgmt assertions for new layout WHAT - entry-chrome-flows.test.ts: - Rename `entry chrome settings menu toggles pet rail visibility` → `pet pill toggle hides and shows the pet rail`. The flow no longer goes through an `Open settings` cog + `.avatar-popover` chain; instead it clicks the in-pill `.pet-pill-toggle` directly and verifies its `aria-label` flips between `Hide` / `Show pet picker`. - Replace `.app-chrome-header` / `.app-chrome-brand` assertions with `.entry-brand` + `.entry-brand-title` text checks. The global chrome strip no longer exists on EntryView. - The compact-width overflow guard now measures `.entry-brand` rather than `.app-chrome-header`, since the brand row replaced the chrome strip as the only top-of-page horizontal stack. - project-management-flows.test.ts: - Drop the `Scroll project types right` arrow click. The `.newproj-tabs-arrow` buttons are hidden (the folder-tab pattern leans on shadow gradients on `.newproj-tabs-shell::before/::after` instead). Playwright's `locator.click()` auto-calls `scrollIntoViewIfNeeded()`, so clicking `new-project-tab-image` after a tab-switch still reaches the off-screen tab. WHY - These selectors / interactions are tied to UI affordances the earlier commits in this branch deliberately replaced. The behaviors they pin (pet rail toggle reachability, no horizontal overflow at 820px, draft preservation across tab switches) are still asserted — only the selectors needed to follow the new structure. 
VERIFICATION - `pnpm exec playwright test ui/entry-chrome-flows.test.ts ui/entry-configuration-flows.test.ts ui/project-management-flows.test.ts` → 17/17 passed (chromium project, single worker, fresh daemon). * fix(web): restore .newproj-body as scroll container (P1 regression) WHAT Reintroduce `flex: 1 1 auto; min-height: 0; overflow-y: auto;` on `.newproj-body`, alongside the `display: flex` + `padding` that commit `ba44e396` kept. The parent `.newproj` is still `overflow: hidden`, so without these three lines the card can clip its own content with no scroll recovery. WHY Reported by @lefarcen (P1) and @Siri-Ray in review on #1360. Before this commit the slim-form pass made the body shrink-wrap (`flex: 0 0 auto`) to keep the empty-state caption snug against the card edge. That works when the form is short, but the card can grow well past the available sidebar height in real scenarios: - Compact-height windows (≤ 720 vertical px). - Image / media tabs that add aspect + model rows. - Validation / error text after a failed Create. - Design-system popover opened with many systems. In all four cases the Create / Import / Open-folder stack — or the picker's bottom options — were sliding below the visible sidebar with no scroll bar to recover them. This is a regression against the behavior that landed in #1038, which made `.newproj-body` the scroll container precisely to keep the form bounded. SCOPE - `apps/web/src/index.css` only, one ruleset. - Visual cost: the empty-state caption (`.newproj-footer`) now sits at the bottom of the available sidebar height instead of hugging the card, which is the same behavior pre-#1167 / pre-this branch. - A short comment in CSS now flags the invariant so a future refactor doesn't quietly flip the flex semantics again. 
* fix(web): restore :focus-visible ring on entry-tab + newproj-tab (a11y) WHAT Split the prior `:focus, :focus-visible, :active` group on both tab selectors so that `:focus-visible` no longer inherits the zero-out that was added to scrub the orange mouse-focus halo: - `.newproj-tab:focus-visible` → 2px inset blue ring (`--selected`) hugging the folder-tab's 8px top-corner radius, plus `--text` foreground so the label reads at full contrast while focused. - `.entry-tab:focus-visible` → 2px solid outline in `--selected` with `outline-offset: 2px` and `border-radius: 4px`. Outline is used here instead of inset shadow because the tab has no padding to spare against the active 1px bottom border, and outline doesn't participate in layout. Mouse-driven `:focus` and `:active` keep the prior transparent treatment — there is no orange ring on click, which is the polish the rest of this branch is going for. WHY Flagged by @Siri-Ray (changed-range) and @lefarcen (P2) on #1360: the polish-tab commits stripped the focus indicator entirely instead of just suppressing mouse focus, so keyboard users had no way to see which tab was active during arrow-key navigation. Re-introducing `:focus-visible` only restores keyboard reachability while keeping the visual quiet for pointer users. SCOPE - `apps/web/src/index.css` only. Two rulesets touched, one new `:focus-visible` rule added per selector. - No JS, no aria, no test churn — the rules trigger off the existing `:focus-visible` pseudo-class, which the same Playwright tests already exercise via Tab. * fix(web): scope quieted input focus to .entry-side, restore global ring (a11y) WHAT Split the global input focus rule into two layers: - `input:focus, textarea:focus, select:focus` now keeps a visible focus indicator on every input across the app — but in the new `--selected` blue (border + 3px `--selected-soft` ring) instead of the original `--accent` orange. 
This preserves accessibility for every settings page, dialog, project workspace, and right-column control that was previously losing its focus halo. - `.entry-side input:focus` keeps the neutral treatment from this branch — `border-color: var(--border-strong)`, no ring. The orange "Create" CTA on the entry sidebar is already the loudest element in that panel, so a competing blue ring on the title / path inputs next to it pulled the eye in the wrong direction. Scoping the quieter focus to the sidebar keeps that intent without leaking out to the rest of the app. WHY Flagged by @lefarcen as a P2 a11y regression on #1360: the previous version of this rule scrubbed the focus indicator (`box-shadow: none`, border only one shade darker) for every input in the app, not just on the entry surface this branch is targeting. Keyboard users on settings forms and dialogs were left without a visible focus state. SCOPE - `apps/web/src/index.css` only, one global rule restored and one scoped override added. No JS, no template change. - Color shift global focus orange → blue is intentional: it consumes the new `--selected` token introduced in commit `13dc8a65` and matches the active-state direction this PR is establishing. * chore(web): drop dead AppChromeHeader / isMacPlatform imports + document --selected token WHAT - Remove the `AppChromeHeader` import from `EntryView.tsx`. The component itself is still used (and re-exported) by ProjectView; EntryView dropped its render site in commit `5fe5721c` and the import has been a stale reference ever since. - Remove the `isMacPlatform` import too. It was only used by the old avatar-menu popover (for the `⌘,` / `Ctrl+,` Settings hint) which was deleted along with the popover when the pet-pill split button replaced it. - Add a docblock above the `--selected` / `--selected-soft` token pair in `index.css` so the cascade has a local explanation for why this blue is separate from the brand `--accent`. 
The note calls out which affordances should reach for `--selected` (active option, focused input ring, active filter pill) and pins the 16% soft fill role. WHY Both flagged by @lefarcen on #1360: - P3 — dead import: the TS config doesn't fail on unused imports, so this was silently shipping as dead code and obscuring the deliberate removal of the global chrome header. - P3 — token doc: the `--accent` vs `--selected` split was only explained in the PR body. Putting the rationale next to the token makes the contract durable beyond this discussion. SCOPE - `apps/web/src/components/EntryView.tsx`: two `import` lines removed. - `apps/web/src/index.css`: one comment block added directly above the token declaration. - Verified: `pnpm --filter @open-design/web typecheck` → exit 0.	2026-05-12 14:26:39 +08:00


630 commits