open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Author	SHA1	Message	Date
PerishFire	11b4750677	Update release light background (#1540)	2026-05-13 15:36:22 +08:00
Siri-Ray	026e13b347	fix(web): restore release header layout (#1519) * fix(web): restore release header layout * fix(web): disambiguate entry settings button Generated-By: looper 0.7.4 (runner=fixer, agent=codex)	2026-05-13 14:57:25 +08:00
Sid	eda182c8a1	refactor(web): UI polish for v0.7.0 — neutralised palette, official brand glyphs, lucide (#1522 ) * refactor(web): adopt lucide-react for the inline Icon component The hand-rolled `<Icon>` set drifted in stroke weight and proportion across its 50+ glyphs as new icons were added. Swap the implementation to dispatch to `lucide-react` while keeping the same `<Icon name="..." size={X} />` API so the 246 existing call sites stay untouched. - Adds `lucide-react` as a dependency (tree-shaken; ~30KB gzipped for the ~50 icons we actually import). - `discord` and `x-brand` keep their bespoke inline SVG paths since lucide intentionally does not ship brand artwork. - `spinner` continues to use the existing `.icon-spin` className for its rotation; under the hood it now renders lucide's `Loader2`. - New `paw` glyph (lucide `PawPrint`) so the Pets nav item stops sharing the `sparkles` icon with External MCP. No behaviour change: the prop surface is identical, fill follows `currentColor` exactly as before, and aria-hidden / focusable defaults are preserved. Visual deltas are limited to the strokes themselves (slightly finer endcaps, more consistent baseline weights) — exactly the consistency upgrade lucide gives us. * feat(web): bundle official brand assets for agent icons `AgentIcon` previously approximated each agent's brand with hand-drawn SVG (orange Anthropic-ish sparkle, OpenAI-knot ellipses, etc). Replace those approximations with the real, vendor-published artwork shipped as static assets under `apps/web/public/agent-icons/`. - 13 SVG marks sourced from `@lobehub/icons-static-svg` (MIT) — color variants where the vendor published one (Claude, Codex, Gemini, Copilot, Qwen, Qoder, DeepSeek, Kimi, Mistral/Vibe), monochrome marks for the rest (Cursor, OpenCode, Hermes, MiMo, Pi, Kilo). - 1 PNG mark (Devin) sourced from devin.ai/icon.png, resized to 96×96 via `sips` since Cognition doesn't publish an SVG. 
- Each SVG was cleaned (stripped `<title>` brand text and the library's internal `style="flex:none;..."` ; dropped `width/height="1em"` so `viewBox` governs sizing) and run through `svgo --multipass`. Total bundle footprint: ~36 KB for all 17 files, only loaded on the agent cards that render them. - `AgentIcon` now resolves brands via a small `ICON_EXT` table and renders `<img src="/agent-icons/<id>.<ext>">`. Agents without an asset (`devin` is the lone outlier removed in this commit because PNG; new agents with no shipped artwork at all) fall back to an initial-letter pill that reads as "no official mark yet" rather than inventing brand artwork. - Removes the `simple-icons` dependency from a previous iteration since `AgentIcon` was its only consumer. Public-API stable: `<AgentIcon id={a.id} size={X} />` still accepts the same prop shape; `AvatarMenu`'s small-size usage continues to work. * refactor(web): polish entry view + Settings dialog UI for v0.7.0 A sweep over the two surfaces that have the most visual surface area in the app (the entry sidebar / New Project panel on the left, and the Settings modal). The work converged on a single neutral palette + a small set of shared dimensional standards documented in CSS, so future sections that get added slot into the same rhythm. New Project panel (apps/web/src/components/NewProjectPanel.tsx + .newproj* rules in index.css) - Adds a spec comment block at the top of the .newproj rules listing the canonical heights (input 30, dropdown 38, compact toggle 36, popover item 38) and the neutral colour rules. - Rebuilds PlatformPicker as a DS-picker-style dropdown trigger + popover (the previous 6-card 2×3 grid was ~280px tall; the dropdown collapses to a single 38px row with the same multi-select semantics). - Replaces SurfaceOptions' two heavy `ToggleRow` cards with the new compact one-line `CompactToggle`; the descriptive hint moves to a native `title` tooltip. 
- Compresses the Fidelity card grid (thumb aspect 16/7 → 16/5, tighter padding, smaller label). - Neutralises every selected/active state inside the panel: removes the orange accent fills and rings from `.newproj-card.active`, `.newproj-title-badge`, `.compact-toggle.on`, `.toggle-row.on`, the DS picker popover items + radio/check marks, the trigger open border and shadow, and the search-bar background. The Create CTA stays the only orange element on the panel. - Aligns the project-name input focus state across the sidebar: border `var(--text)` + 8% black halo (rgba is written out because the CSS pipeline collapses `color-mix(... 8%, transparent)` down to a solid `var(--text)`, which would render as a 3px solid black band). - Switches the body card from `flex: 1 1 auto` to `flex: 0 1 auto` so a short form variant doesn't leave a white void at the bottom of the card, and disables overscroll-bounce on the card so a fast scroll doesn't briefly expose the page-level gray under the white surface. - Pins the privacy footer below the card with a fixed 0 margin-top + shorter padding-top so it reads as a label of the card rather than a centred dialog footer. Entry sidebar footer (apps/web/src/components/EntryView.tsx + .entry-side-foot* rules) - Replaces the X social pill's `external-link` placeholder glyph with a bespoke filled `x-brand` SVG that mirrors the `discord` mark already in the icon set. - Wraps Discord + X in `.entry-side-foot-social` and lets that group flex-margin to the right of the row, so the two social pills read as a tight pair instead of a fourth pill stuck to the Pet pill. - Drops the "unadopted" red dot on the Pet pill (it duplicated the call to action that the label already carried). - Shrinks the footer icons to 10px and dims them to 55% / 75% opacity on hover so the labels are clearly the focal point — `currentColor` on the lucide-rendered SVGs would otherwise make the glyphs full black on hover. 
- Tightens the env-pill version text cap (180 → 142) so the top row ends close to the right edge of the Language + Pet group below it. Settings dialog (apps/web/src/components/SettingsDialog.tsx + .modal-settings / .settings-* / .seg-* / .agent-* rules) - Removes the "SETTINGS" kicker eyebrow above each section title (the big-typography title and modal context already make it redundant). - Switches the sidebar from a card-per-item layout to ChatGPT-style single-line pills: hides the `<small>` description, swaps the sidebar bg from gray to white, makes the active item a gray pill (no border, no shadow) so all items keep a consistent row height regardless of state. - Drops the modal-body's top border (already separated by the whitespace between modal-head and the body grid) and pins `.modal-settings { height: min(720px, 100vh - 64px) }` so the dialog no longer resizes when the user switches between short and long sections. - Compresses the Local CLI / BYOK seg-control from a 2-line ~52px card pair to a 1-line ~42px segmented pill that height-matches the active sidebar nav-item, and aligns the `.settings-content` padding-top with `.settings-sidebar` (22 → 16) so the first content row sits level with the first sidebar item. - Neutralises agent-card selected state, install/docs link colour, and protocol-chip active state — same accent-stripping pattern as the New Project panel. - Uniform agent-card height via `min-height: 64px` so installed cards (icon + name + version) align with unavailable cards (icon + name + not-installed + Install/Docs row). No prop-API changes, no business-logic edits — this is a pure visual refactor. Existing tests, providers and daemon contracts are untouched.	2026-05-13 13:59:19 +08:00
lefarcen	dc7791ef9d	feat(analytics): add project_id + project_kind to studio/artifact events (#1509) Product tracking doc 260513 added project_id + project_kind to studio_view (artifact), studio_click (share_option), and artifact_export_result. The Studio funnel can now group by project type without joining run_created on the back end. - contracts: 3 props gain required project_id + project_kind - ProjectView → FileWorkspace → FileViewer: thread projectKind down, converting metadata.kind via projectKindToTracking once at the top - FileViewer + HtmlViewer: populate the three call sites	2026-05-13 12:13:55 +08:00
Siri-Ray	c16297f10c	Refine preview and project dropdown controls (#1514) * Refine preview and project dropdown controls * fix(web): gate OS widget metadata Generated-By: looper 0.7.4 (runner=fixer, agent=codex) * fix(web): mark platform picker listbox multi-select Generated-By: looper 0.7.4 (runner=fixer, agent=codex)	2026-05-13 12:13:31 +08:00
lefarcen	e2952acd05	Revert "fix(web): restore consistent app header layout (#1432)" This reverts commit `3d3119333c`.	2026-05-13 11:20:16 +08:00
Siri-Ray	3d3119333c	fix(web): restore consistent app header layout (#1432 ) * docs: add NotebookLM GitHub export script (#1062) * docs: add NotebookLM GitHub export script * fix: make NotebookLM export TOC anchors work * fix: escape TOC link text markdown chars * fix: include merged PRs when exporting --prs all * fix: allow --prs merged mode * fix: treat --limit as total export budget * fix: avoid starving buckets under global --limit * fix: support --issues none and handle repos w/ issues disabled * fix: avoid underfilling export when buckets empty * fix: keep disabled-issues fallback quiet * fix: silence disabled issues fallback * fix: satisfy script typecheck * prevent duplicate saves and add template deletion (#1294) * prevent duplicate template entries on repeated save * add delete button to saved template list Templates can now be removed from the template picker via a hover x button, calling the existing DELETE /api/templates/:id endpoint. * add missing onDeleteTemplate prop in test fixtures * add template deletion flow test for NewProjectPanel * reject template names longer than 100 characters * preserve original createdAt on template update * feat: add FAQ page skill (#1162) * fix: set writable OD_DATA_DIR default for nix run Fixes #1157 When running via 'nix run github:nexu-io/open-design', the daemon attempted to create runtime state under the Nix store package path: /nix/store/.../lib/open-design/.od/projects The Nix store is read-only at runtime, causing startup to fail with ENOENT when mkdir() tried to create the projects directory. This commit updates the nix run wrapper to export OD_DATA_DIR with a writable default ($HOME/.od) when the variable is unset. Users can still override it by setting OD_DATA_DIR before running. The Home Manager and NixOS modules already set OD_DATA_DIR, so they are unaffected by this change. 
* feat: add FAQ page skill Add a new skill for generating Frequently Asked Questions pages with: - Collapsible accordion sections for Q&A pairs - Real-time search functionality - Category filtering (Billing, Account, Technical, General) - Smooth animations and transitions - Keyboard navigation support - Mobile-friendly responsive design - Semantic HTML with proper ARIA attributes The skill includes: - SKILL.md with triggers, workflow, and output contract - example.html demonstrating a complete FAQ page with 12 questions Use cases: help centers, support pages, product documentation * fix: address PR review feedback for FAQ page skill - Fix craft slugs: use accessibility-baseline and state-coverage instead of non-existent slugs - Remove overly broad 'questions and answers' trigger - Add edge case handling for insufficient/excessive FAQs - Remove search highlighting requirement (XSS risk) - Update self-check to reflect filtering instead of highlighting Addresses review comments from @lefarcen and @chatgpt-codex-connector * feat: add localized copy for faq-page skill Add German, French, and Russian translations for the FAQ page skill example prompt to fix validation test failure. - DE: FAQ-Seite mit Akkordeon-Abschnitten, Suchfunktion und Kategoriefilterung - FR: Page FAQ avec sections accordéon, recherche et filtrage par catégorie - RU: Страница FAQ со складными секциями-аккордеонами, поиском и фильтрацией * fix: escape apostrophe in French translation Use double quotes to avoid syntax error with d'auth * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins (#1110) * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins fnm legacy installations use ~/.fnm/node-versions. 
Closes #1102 * fix: remove stray .fnm token from type declaration * docs: add Windows troubleshooting guide (#478) (#1170) * docs: add Windows troubleshooting guide (#478) Add docs/windows-troubleshooting.md with step-by-step fixes for the most common native-Windows setup errors: - Node 24 / nvm-windows gotchas (fake nvm file in System32) - pnpm not found after installation - Build scripts blocked by pnpm 10 (better-sqlite3, sharp) - Visual Studio / gyp build errors - Starting the dev server - Optional OpenCode CLI setup Also update CONTRIBUTING.md and QUICKSTART.md to link to the new guide instead of the vague "file an issue if it doesn't" note. * docs: fix Windows guide command accuracy (#1170) Address all 6 inline review comments from lefarcen: - Pin npm-global pnpm install to @10.33.2 (matches packageManager field) - Use where.exe instead of bare where (PowerShell alias conflict) - Fix OpenCode package: opencode-ai (not opencode), binary is opencode - Add EPERM fallback note for corepack enable on protected installs - Add Python check for gyp ERR! find Python - Expand diagnostic checklist with corepack, python, execution policy Also remove redundant corepack pnpm --version from checklist. * feat(daemon): inject compiled design-system tokens + fixture into prompts (#1385) * feat(daemon): inject compiled design-system tokens + fixture into prompts Follow-up to #1231. The prior PR landed the structured form of two brands (`default` + `kami`) and codified the schema; this PR teaches the daemon to actually consume those files when assembling the system prompt, so agents stop having to re-derive token names from DESIGN.md prose every turn. Gated behind `OD_DESIGN_TOKEN_CHANNEL=1` for the smoke-test phase — flag-off keeps the daemon byte-equivalent to today's behavior, flag-on appends two new prompt blocks (the brand's `tokens.css` :root contract and its `components.html` reference fixture) right after the existing DESIGN.md block. 
Brands without those sibling files (every brand except `default` and `kami` today) skip silently in either mode. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): only swallow ENOENT/ENOTDIR in readFileOptional, rethrow rest Reviewer feedback (nettee, #1385). The prior catch-all hid permission errors, EISDIR, and broken packaged-resource paths behind the same "undefined = absent" branch the legacy ~138-brand fallback uses, which would let `OD_DESIGN_TOKEN_CHANNEL=1` silently degrade to the DESIGN.md-only prompt while reporting success. That corrupts the exact signal the smoke-test rollout depends on. Now `readFileOptional` only returns undefined for ENOENT / ENOTDIR (real "file does not exist" cases) and rethrows everything else. Added a focused test that plants a directory at the tokens.css path to exercise the EISDIR branch, plus a partial-presence regression test to confirm the stricter contract preserves the legacy fallback. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16> Co-authored-by: Cursor <cursoragent@cursor.com> * feat(daemon): make connection-test timeouts configurable (#1222) * feat(daemon): make connection-test timeouts configurable Provider and agent connection tests had hardcoded 12s / 45s budgets, which are too tight for slow networks or distant providers (the user sees "timeout" in Settings with no way to extend the budget). - Add OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS (default 12_000) - Add OD_CONNECTION_TEST_AGENT_TIMEOUT_MS (default 45_000) - Invalid values (non-numeric, zero, negative, fractional) emit a console.warn and fall back to the default, so a typo in the env never silently disables the safety timeout. - Export resolveConnectionTestTimeoutMs for unit testing; cover the three resolution paths (fallback / honored override / invalid). 41 connection-test tests pass (+3 new), full daemon suite 1170/1170. 
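The override rule for these two timeouts can be sketched roughly as below. This is an illustration only, not the daemon's code: `resolveTimeoutMs`, its signature, and the injected `warn` callback are hypothetical, the raw value is passed in rather than read from the environment, and the upper bound reflects the follow-up fix recorded in the next entry.

```typescript
// Hypothetical sketch; names and signature are assumptions, not the daemon's API.
// Largest delay setTimeout accepts before it overflows (2^31 - 1 ms).
const MAX_TIMEOUT_MS = 2_147_483_647;

function resolveTimeoutMs(
  raw: string | undefined,
  fallbackMs: number,
  warn: (msg: string) => void = console.warn,
): number {
  // Variable not set: use the default without warning.
  if (raw === undefined) return fallbackMs;

  const n = Number(raw);
  // Accept only whole milliseconds in the range [1, MAX_TIMEOUT_MS].
  const valid = Number.isSafeInteger(n) && n >= 1 && n <= MAX_TIMEOUT_MS;
  if (!valid) {
    // A typo must never silently disable the safety timeout.
    warn(`invalid timeout override "${raw}", falling back to ${fallbackMs} ms`);
    return fallbackMs;
  }
  return n;
}
```

Under these assumptions a call such as `resolveTimeoutMs("3000000000", 45_000)` warns and returns the default instead of handing an overflowing delay to `setTimeout`.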
* fix(daemon): reject connection-test timeout overrides above Node's setTimeout maximum Node's `setTimeout` silently clamps any delay above `2^31-1` ms (2_147_483_647) to ~1 ms with a TimeoutOverflowWarning. The previous `Number.isInteger(n) && n >= 1` check accepted oversized values unchanged and passed them straight to `setTimeout`, so an override that intended to raise the budget — e.g. `OD_CONNECTION_TEST_AGENT_TIMEOUT_MS=3000000000` — instead caused every connection test to fail almost immediately. The safety timeout was effectively disarmed. Add `MAX_CONNECTION_TEST_TIMEOUT_MS = 2_147_483_647` and switch the guard to `Number.isSafeInteger(n) && n >= 1 && n <= MAX...`. The boundary value is still accepted; one millisecond past it falls back with a warn. Regression test exercises `3_000_000_000`, `2_147_483_647`, and `2_147_483_648`. Addresses #1222 review feedback from @chatgpt-codex-connector, @mrcfps, and @lefarcen. * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN SSRF bypass) (#1122) * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN bypass) new URL('http://192.168.1.5./').hostname returns '192.168.1.5.' — the trailing dot is the RFC 1034 absolute-FQDN form and resolves identically to '192.168.1.5'. parseIpv4 fails on the dotted form, so 169.254.169.254. slips past the metadata-service block, 192.168.1.5. slips past the LAN block, and localhost. slips past the loopback identification. Strip trailing dots in normalizeBracketedIpv6 so all downstream checks (isLoopbackApiHost, isBlockedExternalApiHostname, isBlockedIpv4, IPv6 range tests) see the canonical form. Adds 6 vitest cases covering loopback FQDN forms (localhost., foo.localhost., 127.0.0.1.) and SSRF FQDN bypasses (169.254.169.254., 192.168.1.5., 10.0.0.5.). Refs nexu-io/open-design#1119 review feedback (P2 from @lefarcen). * test(connectionTest): tighten trailing-dot coverage per #1122 review Two issues from #1122 review: 1. 
(P2 from @mrcfps + codex bot) The original `foo.localhost.` case asserted error===undefined on validateBaseUrl, which only proves the URL passed validation — not that the host is identified as loopback. Replaced with direct isLoopbackApiHost(...) assertions on the actual loopback FQDN forms (localhost., 127.0.0.1., 127.0.0.5.) so the test exercises the loopback path the comment claims. 2. (P3 from @lefarcen) Original blocked-FQDN tests covered only 3 of 7 ranges that isBlockedIpv4 handles. Added a dedicated case per range (0.0.0.0/8, 10/8, 100.64/10, 169.254/16, 172.16/12, 192.168/16, multicast >=224) so future regressions in normalizeBracketedIpv6 surface against the full coverage. * docs: drop misleading foo.localhost./endsWith claim in normalizer comment @lefarcen review feedback: isLoopbackApiHost only accepts exact 'localhost', '::1', loopback IPv4, and mapped loopback IPv4 — there's no subdomain or endsWith handling, so referencing 'foo.localhost.' overstates what the trailing-dot strip enables. Rewrite the comment to match actual call sites (isLoopbackApiHost equality + isBlockedIpv4 numeric parse). 
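The trailing-dot bypass and its fix can be illustrated with a small sketch. The helper names below (`stripTrailingDots`, `parseIpv4`, `isBlockedIpv4`, `isLoopbackHost`) are written for this example and only mirror the behaviour the commit messages describe. The real normalizer also handles bracketed IPv6 and mapped loopback addresses, which are left out here.

```typescript
// Hypothetical sketch: simplified stand-ins, not the daemon's exports.

// Strip RFC 1034 trailing dots so "192.168.1.5." and "192.168.1.5" compare equal.
function stripTrailingDots(hostname: string): string {
  return hostname.replace(/\.+$/, "");
}

// Parse a dotted-quad IPv4 literal; returns null for anything else.
function parseIpv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// Blocks the seven ranges listed in the review notes, after normalizing.
function isBlockedIpv4(hostname: string): boolean {
  const octets = parseIpv4(stripTrailingDots(hostname));
  if (octets === null) return false;
  const [a, b] = octets;
  return (
    a === 0 || // 0.0.0.0/8
    a === 10 || // 10/8
    (a === 100 && b >= 64 && b <= 127) || // 100.64/10
    (a === 169 && b === 254) || // 169.254/16 (metadata service)
    (a === 172 && b >= 16 && b <= 31) || // 172.16/12
    (a === 192 && b === 168) || // 192.168/16
    a >= 224 // multicast and above
  );
}

// Exact-match loopback check: no subdomain or endsWith handling.
function isLoopbackHost(hostname: string): boolean {
  const host = stripTrailingDots(hostname).toLowerCase();
  if (host === "localhost" || host === "::1") return true;
  const octets = parseIpv4(host);
  return octets !== null && octets[0] === 127;
}
```

Without the strip, `parseIpv4("192.168.1.5.")` sees five dot-separated parts and returns `null`, which is how the dotted form slipped past the numeric range checks.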
* feat(daemon): export self-contained HTML via /export/?inline=1 endpoint (#1312) test(daemon): add Red unit tests for inlineRelativeAssets helper 14 cases pinning the behavior contract for the upcoming apps/daemon/src/inline-assets.ts helper: - link/script inlining with verbatim body preservation - non-src script attrs preserved (type=module, defer, crossorigin) - relative path resolution (root + nested + deep-nested owners) - self-closing and single-quoted attr forms - negative cases: missing rel, rel=preload, absolute/data/blob/leading-slash - escaping: </style and </script inside body - null-fileReader graceful degradation - duplicate identical tags fully replaced (diverges from apps/web/src/components/FileViewer.tsx:5313's first-match-only; locked decision per plan §3.3) - HTML-escaped data-od-inline-asset attr Tests intentionally Red — module ../src/inline-assets.js does not yet exist. Phase B-G of plan declarative-roaming-gosling.md will turn them green by porting FileViewer.tsx:5248-5354 server-side. Refs nexu-io/open-design#368. * feat(daemon): port inlineRelativeAssets server-side for export endpoint Adds apps/daemon/src/inline-assets.ts — a pure helper that takes (html, ownerFileName, fileReader closure) and returns the HTML with every relative <link rel=stylesheet> and <script src> contents inlined into <style data-od-inline-asset="…">/<script>…</script> blocks. The fileReader closure keeps the helper free of fs/Express coupling so the route handler owns the filesystem boundary. Port source: apps/web/src/components/FileViewer.tsx:5248-5354 — five functions (inlineRelativeAssets, resolveProjectRelativePath, baseDirFor, readHtmlAttr, escapeHtmlAttr). The fetch hop becomes the fileReader closure; replace-all replaces first-match-only per locked design decision §3.3 (inline comment in inline-assets.ts cites the divergence from FileViewer.tsx:5313 and notes the web inline path is on a deprecation track since PR #384 made URL-load the default). 
Phase B-G of plan declarative-roaming-gosling.md. All 14 unit cases from the Red commit (`a60a9023`) now pass; tightens one case to use a realistic '&'-only filename (the original `<`/`>`-bearing filename was unreachable in real filesystems and exposed a regex limitation the web client carries too). Daemon delta: +14 tests (1704 → 1718). Typecheck clean. Refs nexu-io/open-design#368. * test(daemon): add Red integration tests for /export/?inline=1 route 9 HTTP cases against GET /api/projects/:id/export/?inline=1: - 3-file React-ish layout returns self-contained HTML (wiring guard: body assertions catch removal of the await inlineRelativeAssets(...) line, not just helper-internals changes) - missing inline / non-canonical values (0, false, foo, empty) → 400 - non-HTML file → 400 UNSUPPORTED_FILE_TYPE - missing file → 404 FILE_NOT_FOUND - invalid project id (..) → some 4xx (Express normalizes before route) - null-origin OPTIONS preflight → 204 + Access-Control-Allow-Origin: * - missing sibling asset → 200 with <link> tag intact, other asset inlined - nested HTML entry (pages/index.html + ../shared/util.js) → 200 inlined 8 of 9 tests Red (404 / 403); the invalid-project-id case is tolerant about how Express rejects .. so it accidentally passes Red — Green will tighten to 400 BAD_REQUEST via isSafeId. Phase C-R of plan declarative-roaming-gosling.md. C-G will register the route in apps/daemon/src/import-export-routes.ts. Refs nexu-io/open-design#368. * feat(daemon): wire GET /api/projects/:id/export/?inline=1 endpoint Adds the export-inline endpoint into registerProjectExportRoutes (import-export-routes.ts) alongside /export/pdf and /archive. 
The route: - Validates project id via ctx.validation.isSafeId - Requires ?inline=1 (accept-list: 1 / true / yes / on, matching Part 1's parseForceInline at file-viewer-render-mode.ts:59-66) - Reads the owner HTML via ctx.projectFiles.readProjectFile; maps ENOENT to 404 FILE_NOT_FOUND, everything else to 400 BAD_REQUEST - Gates non-HTML callers with 400 UNSUPPORTED_FILE_TYPE - Builds a fileReader closure that silently returns null on any sibling read failure (failure-local, not fatal — matches the web client's null-filter at FileViewer.tsx:5311) - Hands the buffer + relPath to inlineRelativeAssets and returns the result as text/html DI: RegisterProjectExportRoutesDeps gains 'projectFiles' \| 'validation'; server.ts:2879 passes the corresponding deps. Mirrors the dep shape of RegisterFinalizeRoutesDeps used by PR #832's /finalize/anthropic. Null-origin support intentionally omitted (decision §10 in the PR description): the daemon's null-origin allowlist is /raw/ and /codex-pets/.../spritesheet only, and export consumers are same-origin UI or server-side tooling — sandboxed-iframe srcdoc previews fetch /raw/* instead. Integration test #7 pins the 403 contract so a future allowlist change is deliberate. Phase C-G of plan declarative-roaming-gosling.md. All 23 tests green (14 unit + 9 integration); full daemon suite 1727 passing (delta +9 over B-G's 1718). Typecheck clean. Refs nexu-io/open-design#368. * test(daemon): add Red regression for inlined-body tag-literal corruption Reproduces the correctness bug Siri-Ray (looper) and codex-bot flagged on PR #1312: the reduce/split-join approach in inlineRelativeAssets re-scans the progressively mutated HTML, so a tag literal that happens to appear inside an already-inlined asset body gets the inner literal also replaced — corrupting the body and producing duplicate inlining. 
Concrete reproducer (CSS, where </style escape doesn't touch <link>): HTML: <link rel="stylesheet" href="a.css"> <link rel="stylesheet" href="b.css"> a.css: /* see also <link rel="stylesheet" href="b.css"> */ b.css: body{color:red} Under split/join the second pass splits on `<link rel="stylesheet" href="b.css">` and matches BOTH the real outer tag AND the literal inside a.css's comment. Result: b.css's <style> block is injected inside a.css's comment, and b.css gets inlined twice. Phase F-R of plan declarative-roaming-gosling.md (post-PR-#1312 review round). F-G will rewrite the helper to collect matches by position in the original HTML and concat slices in a single pass, so already-inlined content is never re-scanned. Refs nexu-io/open-design#1312 review threads at apps/daemon/src/inline-assets.ts:122 (Siri-Ray looper + codex bot). feat(daemon): replace inliner reduce/split-join with position-based concat Fixes the inlined-body tag-literal corruption Siri-Ray (looper) + codex-bot flagged on PR #1312. The previous `replaceAllOccurrences` (`source.split(from).join(to)`) re-scanned the progressively mutated HTML on each pass, so a tag literal that appeared inside an already-inlined CSS/JS body got the inner literal replaced too, producing duplicate inlining and corrupted bodies. New shape: collect every match's {start, end} byte span from the ORIGINAL html via `matchAll`, await the per-match replacements in parallel, sort by start, and concat slices of the original html with the replacement strings in a single pass. Text introduced by an earlier replacement is never scanned for matches. The dup-tag fix (decision §8: replace every occurrence, not first-match-only) is preserved: every original-tag position gets its own slice, so all duplicates are inlined. Also extracts buildInlineStyleBlock / buildInlineScriptBlock so the match-collection loops stay readable. Phase F-G of plan declarative-roaming-gosling.md. 
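The single-pass shape described above might look roughly like the following sketch. `replaceByPosition` and the `Replacer` type are hypothetical names invented for this example. It handles one pattern only, so the explicit sort step from the commit message is unnecessary here because `matchAll` already yields matches in ascending order.

```typescript
// Hypothetical sketch of position-based replacement; not the daemon's helper.
// A replacer returns the new text for a tag, or null to leave the tag as-is.
type Replacer = (tag: string) => Promise<string | null>;

async function replaceByPosition(
  html: string,
  pattern: RegExp, // must carry the "g" flag, as matchAll requires
  replacer: Replacer,
): Promise<string> {
  // Collect every match span from the ORIGINAL string only.
  const matches = Array.from(html.matchAll(pattern)).map((m) => {
    const start = m.index ?? 0;
    return { start, end: start + m[0].length, tag: m[0] };
  });

  // Resolve all replacements in parallel.
  const bodies = await Promise.all(matches.map((m) => replacer(m.tag)));

  // Stitch original slices and replacements together in one pass.
  // Replacement text is appended, never re-scanned for further matches.
  let out = "";
  let cursor = 0;
  matches.forEach((m, i) => {
    out += html.slice(cursor, m.start);
    out += bodies[i] ?? m.tag;
    cursor = m.end;
  });
  return out + html.slice(cursor);
}
```

Because spans are taken from the untouched input, a `<link>` literal sitting inside an inlined CSS comment is copied through verbatim instead of being matched a second time.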
Regression test (`c809bccc`) goes Green; all 24 unit + integration tests pass; daemon suite still clean. Refs nexu-io/open-design#1312. * test(daemon): add Red CSP-sandbox test + P3 coverage gaps from PR #1312 review Three tests covering lefarcen's review on PR #1312: 1. [Red] CSP sandbox header (P2, lefarcen @ import-export-routes.ts:423). Top-level browser navigation to /export/?inline=1 sends no Origin header, so the daemon middleware lets it through and any JS in the exported document runs with daemon-origin privileges. Asserts the response sends `Content-Security-Policy: sandbox allow-scripts` so the browser treats it as a sandboxed iframe with an opaque origin (scripts still run, but no cookies / no /api/ access). This test fails until G1-G adds the header in the handler. 2. [Green-on-commit] Accept-list cases (P3, lefarcen @ test.ts:262). PR body decision §7 promises `inline=true/yes/on` case-insensitive, but round-1 tests only exercised inline=1. Pin the full accept list (true / yes / on + TRUE / Yes / ON). Already passes — the route's parser already implements the accept list; this just makes the contract testable. 3. [Green-on-commit] isSafeId guard (P3, lefarcen @ test.ts:287). Previous `..` test was normalized by Express before reaching the route. New input uses `bad!id` (URL-safe, but outside isSafeId's /^[A-Za-z0-9._-]+$/ char class), so Express passes it into req.params unchanged and isSafeId rejects with the documented 400 BAD_REQUEST envelope. Phase G1-R / H of plan declarative-roaming-gosling.md. Refs nexu-io/open-design#1312 review comments. feat(daemon): send Content-Security-Policy: sandbox allow-scripts on /export Closes the same-origin XSS surface lefarcen flagged on PR #1312 (P2 at import-export-routes.ts:423): top-level browser navigation to the export URL sends no Origin header, so the daemon's /api middleware admits the request and any JS in the exported document executes with daemon-origin privileges (cookies, /api/, localStorage). 
`Content-Security-Policy: sandbox allow-scripts` on the response makes the browser treat the document as a sandboxed iframe with an opaque origin. Scripts still execute (necessary for the screenshot use case — the whole point of inlining JS), but they cannot read cookies, hit /api/, or otherwise escalate to the daemon's origin. Phase G1-G of plan declarative-roaming-gosling.md. Daemon delta: +3 tests (the Red CSP test from `58151356` turns Green; the P3 coverage gap tests stay green). Refs nexu-io/open-design#1312. * test(daemon): add Red regression for <link> stylesheet attr preservation Currently `<link rel="stylesheet" href="print.css" media="print">` becomes a plain `<style data-od-inline-asset="print.css">…</style>` with no media query — print-only styles apply unconditionally. Same problem for `title` (alternate stylesheet sets), `disabled` (initial disabled state), and `nonce` (CSP nonce). All four are valid on both `<link rel=stylesheet>` and `<style>` per HTML spec, so the inliner must carry them across. PR #1312 round-2 review (lefarcen P2 @ inline-assets.ts:44). Phase G2-R; G2-G will extend buildInlineStyleBlock to copy the four attrs off the source <link>. Refs nexu-io/open-design#1312. * feat(daemon): preserve <link> stylesheet semantics on inlined <style> Closes lefarcen's P2 review note on PR #1312 (inline-assets.ts:44): `<link rel="stylesheet" href="print.css" media="print">` was becoming a plain <style> with no media query, so print-only styles applied unconditionally. Same issue for `title` (alternate stylesheet sets), `disabled` (initial disabled state), and `nonce` (CSP nonce). buildInlineStyleBlock now carries four attrs across from the source <link>: - media, title, nonce (value attrs, HTML-escaped via escapeHtmlAttr) - disabled (boolean attr — copied as bare presence) Other <link> attrs (rel, href, type, crossorigin, integrity, referrerpolicy) don't apply to <style> and are intentionally dropped. 
New `hasBooleanHtmlAttr` helper distinguishes presence-as-attr from substring-inside-another-attr-value via a regex that requires a word boundary after the name (whitespace, `=`, or `>`). Phase G2-G of plan declarative-roaming-gosling.md. All 28 tests pass. Refs nexu-io/open-design#1312. * docs(daemon): narrow inliner contract claim + document size-limit policy Closes lefarcen's P2 review notes on PR #1312: 1. "Self-contained" incomplete (inline-assets.ts:67): the helper only rewrites top-level <link rel=stylesheet> / <script src>. `<img src>`, CSS `url(...)`, CSS `@import`, ES module imports, font sources, and similar remain external in the response. The PR title/body claimed "self-contained HTML" which over-promised for screenshot tooling expecting bundled images/fonts. Module docstring now enumerates the full not-rewritten list and names the screenshot path as the primary use case (headless browser fetches each external asset on render, so inline-CSS- and-JS-only is sufficient). The route handler comment block mirrors the contract. A fully offline export with image/font bundling is filed as a follow-up — out of scope for this PR. 2. No response cap (inline-assets.ts:72): the helper does concurrent reads + multiple string copies and could spike daemon memory. The daemon is local-first (single-user, developer's machine — see open_design_architecture.md), so the effective ceiling is the size of the user's own project. The docstring now states this rationale and names the conditions under which a bounded-concurrency reader and output-size limit would be needed (non-trusted callers). Docs-only — no behavior change, all 28 tests still pass. Refs nexu-io/open-design#1312. 
* test(daemon): add Red regression for hasBooleanHtmlAttr quoted-value match PR #1312 round-2 review (lefarcen P3): `hasBooleanHtmlAttr` tests the tag string with no attr-quoting awareness, so the literal text `disabled` appearing inside any quoted attribute value followed by another whitespace char satisfies `\sdisabled(?=\s|=|/?>)`. <link rel=stylesheet href=x.css data-note="content disabled stuff"> emits a <style disabled> block, silently disabling a stylesheet the author wrote without that attr. Also adds a counterweight test for the legitimate-disabled case (<link … disabled>) so the next-commit fix doesn't over-correct and start dropping real boolean attrs. Phase I3-R of plan declarative-roaming-gosling.md (post-PR-#1312 round-2 review). I3-G will strip quoted attribute values from the tag string before testing for the bare attr. Refs nexu-io/open-design#1312. * feat(daemon): make hasBooleanHtmlAttr quote-aware to avoid false positives Closes lefarcen's P3 review note on PR #1312: `hasBooleanHtmlAttr` previously ran `\sname(?=\s|=|/?>)` over the full tag string, so the literal text `disabled` appearing inside any quoted attribute value followed by whitespace satisfied the regex. Source tags like `<link rel=stylesheet href=x.css data-note="content disabled stuff">` were emitting a <style disabled> block, silently disabling a stylesheet the author wrote without that attr. Fix: strip `="…"` and `='…'` substrings out of the tag with two regex passes BEFORE testing for the bare attr. The lookahead still requires `\s|=|/?>` after the attr name, so `<link disabled>`, `<link disabled="">`, `<link disabled/>`, etc. all match, but the attr name as a substring of any quoted value cannot match because values have been stripped to `""` / `''`. Phase I3-G of plan declarative-roaming-gosling.md. All 30 tests green (28 prior + 2 round-3 regression cases: false-positive and legitimate-disabled). Refs nexu-io/open-design#1312. 
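The quote-aware check described in the two commits above can be sketched roughly as follows. This is an illustration, not the code in inline-assets.ts: only the helper name and the lookahead shape come from the commit message. The function body, the case-insensitive flag, and the assumption that `name` is a plain attribute name with no regex metacharacters are all hypothetical.

```typescript
// Hypothetical re-creation of the quote-aware boolean-attribute check.
// Assumes `name` is a plain attribute name (no regex metacharacters).
function hasBooleanHtmlAttr(tag: string, name: string): boolean {
  // Blank out quoted attribute values first, so text inside a value
  // (e.g. data-note="content disabled stuff") can no longer match.
  const stripped = tag
    .replace(/="[^"]*"/g, '=""')
    .replace(/='[^']*'/g, "=''");
  // The name must be preceded by whitespace and followed by
  // whitespace, `=`, `>` or `/>`.
  const pattern = new RegExp(`\\s${name}(?=\\s|=|/?>)`, "i");
  return pattern.test(stripped);
}

// The false-positive case from the review: `disabled` only appears inside a value.
const falsePositive = hasBooleanHtmlAttr(
  '<link rel=stylesheet href=x.css data-note="content disabled stuff">',
  "disabled",
);
// The legitimate case: `disabled` is a real bare attribute.
const legitimate = hasBooleanHtmlAttr(
  '<link rel="stylesheet" href="x.css" disabled>',
  "disabled",
);
console.log({ falsePositive, legitimate });
```

Stripping values before matching keeps the final regex simple; the trade-off is that the tag string is copied twice per check, which is negligible for single tags.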
* test(daemon): add Red cap-enforcement tests + scaffold InlineOptions PR #1312 round-2 review (lefarcen P2 — still open): round-2 only documented that no cap is enforced. Reviewer pushed back: the helper still builds unbounded candidate arrays + runs Promise.all over all asset reads + concatenates the full output in memory. Need actual limits in code. This commit adds the Red test surface that drives the next commit's enforcement: - InlineAssetsLimitError("owner") when owner HTML > maxOwnerBytes - InlineAssetsLimitError("candidates") when tag matches > maxCandidates - Per-asset graceful: oversized asset → tag stays as URL ref - InlineAssetsLimitError("total") when assembled output > maxTotalBytes - Bounded read concurrency: peak in-flight reads ≤ maxReadConcurrency - Integration: route maps the throw to 413 PAYLOAD_TOO_LARGE InlineOptions interface is added to the helper signature as a no-op test-door (per feedback_test_doors_over_fake_timers.md), so tests can exercise tiny fixtures while production callers use module-level defaults. The next commit (H3-G) wires the enforcement. Phase H3-R of plan declarative-roaming-gosling.md. Daemon delta on this commit: +6 tests (5 unit + 1 integration), all Red. Refs nexu-io/open-design#1312. * feat(daemon): enforce inliner caps + map limit errors to 413 PAYLOAD_TOO_LARGE Closes lefarcen's still-open P2 review on PR #1312 round 2 ("the code still builds unbounded candidate arrays + Promise.all over all asset reads + concatenates the full output in memory"). Caps are now enforced in code with the documented defaults: MAX_INLINE_OWNER_BYTES = 2 MiB MAX_INLINE_ASSET_BYTES = 5 MiB per sibling MAX_INLINE_CANDIDATES = 500 link/script matches MAX_INLINE_TOTAL_BYTES = 50 MiB assembled output MAX_INLINE_READ_CONCURRENCY = 8 simultaneous fileReader calls Enforcement points: - Owner cap (input): fires immediately at function entry. Cheap — Buffer.byteLength of the already-decoded UTF-8 string. 
- Candidate cap (planning): fires after matchAll, BEFORE any sibling read. Pathological HTML with thousands of <link>/<script src> tags is rejected without opening a single file descriptor. - Asset cap (per-sibling): post-read length check; oversized assets return null from the wrapped reader, so the tag stays as a URL ref and the response is still 200. This is the only "graceful" cap — one bad asset doesn't fail the whole export. - Total cap (output): tracked across the slice-and-concat loop, guarding both preserved-html slices AND injected replacements. - Concurrency cap (planning): a tiny in-module runWithConcurrency worker-pool keeps at most maxReadConcurrency fileReader calls in flight, with order-preserving results. `InlineAssetsLimitError` carries a `limit` discriminator so logs and clients can disambiguate owner/asset/candidates/total. The route handler catches it and emits 413 PAYLOAD_TOO_LARGE. Drive-by error-envelope fix while in the route: UNSUPPORTED_FILE_TYPE (an unregistered ApiErrorCode) → UNSUPPORTED_MEDIA_TYPE (the canonical code) with HTTP 415. The round-1 string was a slip; caught by reading packages/contracts/src/errors.ts:11 while wiring PAYLOAD_TOO_LARGE. Phase H3-G of plan declarative-roaming-gosling.md. All 36 tests green (28 prior + 2 round-3 quoted-attr + 5 cap unit + 1 cap integration). Refs nexu-io/open-design#1312. * feat(daemon): enforce inliner caps pre-buffer via AssetHandle contract Closes lefarcen's still-open P2 review on PR #1312 round 3 ("the helper enforces maxTotalBytes only after all candidate assets have already been read and converted to replacement strings" / "maxAssetBytes is checked after fileReader fully buffers each sibling"). Round-3 caps were defensive against the final output size but did not bound peak memory during read fanout — 500 assets at 5 MiB each could materialize ~2.5 GiB before the 413 fired. 
Contract change: InlineAssetReader now returns `AssetHandle | null` where AssetHandle is `{ readonly size: number; read(): Promise<...> }`. Callers expose `size` from a cheap stat-equivalent (the route uses `resolveProjectFilePath`) and defer the full materialization to `read()`. The helper checks size against maxAssetBytes BEFORE invoking read, and against the running total BEFORE the reservation is committed. Enforcement flow inside runWithConcurrency: 1. await fileReader(p.resolved) → cheap stat-only call 2. if (handle.size > maxAssetBytes) return null ← pre-buffer 3. if (runningBytes + handle.size > maxTotalBytes) { totalAborted = true; return null } ← pre-buffer 4. runningBytes += handle.size ← reserve 5. await handle.read() ← only now 6. if (read returned null) runningBytes -= refund `totalAborted` is a shared flag the workers check at entry, so once the running total hits the cap, no new reads start. With maxReadConcurrency = 8, at most ~8 stat-side calls finish after abort, so peak memory is bounded. The concat-time guard stays as the exact final assertion (the pre-buffer reservation is approximate: it counts the original tag bytes and skips wrapper overhead). Route closure updated to do `resolveProjectFilePath` first, then `readProjectFile` inside the deferred `read()`. Test reader helpers (`readerFrom` + the concurrency-test reader) updated to the new shape. Two new unit tests pin the pre-buffer semantics: - `maxAssetBytes` is checked via handle.size BEFORE handle.read() (the reader's `read()` throws, so it must never run) - Running total abort stops further reads once exceeded (counting reader observes ≤ 2 reads when cap should fire after the first) Phase K of plan declarative-roaming-gosling.md (post-PR-#1312 round-3 review). All 38 tests green (36 prior + 2 round-4 pre-buffer cases). Refs nexu-io/open-design#1312. 
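The stat-then-read flow in the commit above could look something like the sketch below. It is a simplified stand-in for the real helper. `readWithinCaps` and `InlineLimits` are hypothetical names, the `Promise<string | null>` return type of `read()` is an assumption (the commit message elides it), and the shared-index worker pool only approximates what `runWithConcurrency` is described as doing.

```typescript
// Assumed shape: the commit message elides the type inside Promise<...>.
interface AssetHandle {
  readonly size: number;
  read(): Promise<string | null>;
}
type InlineAssetReader = (path: string) => Promise<AssetHandle | null>;

// Hypothetical bundle of the three limits this sketch uses.
interface InlineLimits {
  maxAssetBytes: number;
  maxTotalBytes: number;
  maxReadConcurrency: number;
}

async function readWithinCaps(
  paths: readonly string[],
  fileReader: InlineAssetReader,
  limits: InlineLimits,
): Promise<{ contents: Array<string | null>; totalAborted: boolean }> {
  const contents: Array<string | null> = paths.map(() => null);
  let runningBytes = 0;
  let totalAborted = false;
  let nextIndex = 0;

  async function worker(): Promise<void> {
    // Workers check the shared abort flag before taking new work.
    while (nextIndex < paths.length && !totalAborted) {
      const index = nextIndex++;
      const path = paths[index];
      if (path === undefined) continue;
      const handle = await fileReader(path); // 1. cheap stat-only call
      if (handle === null) continue;
      if (handle.size > limits.maxAssetBytes) continue; // 2. pre-buffer asset cap
      if (runningBytes + handle.size > limits.maxTotalBytes) {
        totalAborted = true; // 3. pre-buffer total cap
        continue;
      }
      runningBytes += handle.size; // 4. reserve
      const content = await handle.read(); // 5. only now materialize
      if (content === null) {
        runningBytes -= handle.size; // 6. refund the reservation
        continue;
      }
      contents[index] = content; // results stay in input order
    }
  }

  const workerCount = Math.min(limits.maxReadConcurrency, paths.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return { contents, totalAborted };
}
```

Because steps 3 and 4 run without an intervening `await`, the check and the reservation cannot interleave between workers on a single JavaScript thread.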
* test(daemon): add Red test pinning owner pre-buffer 413 before mime 415 PR #1312 round-5 (lefarcen P2): the route currently reads the owner file with readProjectFile() before any size check, so a 100 MiB owner HTML is fully buffered into memory before the helper's ownerBytes check fires. The fix is to stat with resolveProjectFilePath first, reject pre-buffer with 413 PAYLOAD_TOO_LARGE on oversize, then fold in the mime check (still 415 on mismatch, now pre-buffer), then readProjectFile when both gates pass. The Red→Green discriminator is the combination 'oversize AND non-HTML': pre-fix the route reads the buffer first and the text/plain mime check fires → 415; post-fix the route stats first and the size check fires before the mime check → 413. Asserting 'got 413, not 415' pins both the pre-buffer property and the check ordering (size before mime, per lefarcen's locked round-5 sequence). 2 MiB+1 byte fixture is acceptable in test setup; MAX_INLINE_OWNER_BYTES is the production 2 MiB so no test-door is needed. Red verified: AssertionError: expected 415 to be 413 (pre-fix flow reads → mime → 415). * feat(daemon): stat owner before readProjectFile in /export route to bound owner pre-buffer PR #1312 round-5 (lefarcen P2 confirmed at PR-1312#issuecomment-4424868413 follow-up): the route previously called readProjectFile() unconditionally on the owner, so a 100 MiB owner HTML was fully buffered into memory before the helper's ownerBytes check fired with InlineAssetsLimitError ('owner'). That meant the 413 envelope returned to the caller but only after peak memory had already hit the file size. 
Fix mirrors the sibling-asset stat-then-read contract round 4 added via the AssetHandle interface: call resolveProjectFilePath first (cheap stat), reject pre-buffer with 413 PAYLOAD_TOO_LARGE on size > MAX_INLINE_OWNER_BYTES, fold in the mime check (still 415 UNSUPPORTED_MEDIA_TYPE on mismatch, now also pre-buffer per lefarcen's 'fold-in is welcome'), then readProjectFile() only when both gates pass. Size check fires before mime check, so an oversize non-HTML file returns 413 rather than 415 — the observable Red→Green discriminator for this round. The helper's ownerBytes check (inline-assets.ts:127-133) stays as defense-in-depth for direct in-process callers that skip the route and for any drift between stat-reported size and the bytes returned by readFile. Verifies the round-5 Red at apps/daemon/tests/export-inline-route.ts ('returns 413 (not 415) for an oversize non-HTML file'). Daemon suite 1743/1743 passing. * test(daemon): add Red test pinning stat-vs-actual byte reconciliation PR #1312 round-5 (lefarcen P3 confirmed at PR-1312#issuecomment-4424868413 follow-up): the helper trusts handle.size for the running-total guard and never reconciles with the actual byte length of content unless the per-asset cap is exceeded. A reader that under-reports size (stale stat, UTF-8 expansion at decode, sparse file, deliberate lie) can let many strings materialize in memory before the concat-time guard at the bottom of inlineRelativeAssets throws — defeating the round-4 pre-buffer cap intent. Fix is lefarcen-confirmed path-a: post-read, the helper computes actualBytes = Buffer.byteLength(content, 'utf8'), reconciles runningBytes (add actualBytes, refund handle.size), and if running total exceeds maxTotalBytes flips totalAborted = true and returns null. Subsequent workers see totalAborted before invoking their own read(). 
Helper still throws InlineAssetsLimitError('total') after Promise.all settles — preserving the round-2/3/4 graceful-fallback pattern instead of racing throws across in-flight workers. Red→Green discriminator is read count. Pre-fix the helper trusts the lying handle.size (10), so both reads complete (each returning 1000 bytes) under the reservation total of 56+10+10=76 < cap 500. The concat-time guard then catches the 2000+-byte assembly and throws 'total' — but only after both reads materialized in memory. Post-fix worker 1's reconciliation trips totalAborted as soon as actualBytes (1000) is folded into runningBytes; worker 2 skips its read. Red verified: AssertionError expected 1, received 2 (pre-fix flow completes both reads before concat-guard fires). * feat(daemon): reconcile inliner reservation with post-read actual bytes PR #1312 round-5 (lefarcen P3 confirmed at PR-1312#issuecomment-4424868413 follow-up, path-a): the helper trusted handle.size for the running- total guard and only reconciled with actual bytes for the per-asset cap. A reader that under-reported size — stale stat, UTF-8 decode expansion at read time, sparse file, deliberate lie — could let many strings materialize before the concat-time guard at the bottom of inlineRelativeAssets caught the excess. That defeated the round-4 pre-buffer cap intent. Fix: after a successful read(), compute actualBytes = Buffer.byteLength(content, 'utf8'), reconcile runningBytes by folding in (actualBytes - handle.size), and re-check the total cap. If the reconciliation pushes runningBytes past maxTotalBytes, drop the asset's inlining (tag stays as URL ref), set totalAborted = true to block subsequent worker reads, and let Promise.all settle. The helper then throws InlineAssetsLimitError('total') below — matching the round-2/3/4 graceful-fallback pattern (no throw-before-settle race between in-flight workers). 
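The reconciliation step can be illustrated with the numbers from the Red test. The sketch below is hypothetical bookkeeping only: `ReservationState` and `reconcileAfterRead` are invented names, and the byte count is passed in as a number rather than computed with `Buffer.byteLength`, so the example stays independent of the runtime. The starting value of 66 assumes the 56 in the commit message is what was counted before the first asset, plus worker 1's reservation of 10.

```typescript
// Hypothetical state shared by the read workers.
interface ReservationState {
  runningBytes: number;
  totalAborted: boolean;
}

// Fold the difference between reserved and actual bytes into the running
// total. Returns false when the asset's inlining should be dropped.
function reconcileAfterRead(
  state: ReservationState,
  reservedSize: number, // what handle.size claimed
  actualBytes: number, // byte length of the content actually read
  maxTotalBytes: number,
): boolean {
  state.runningBytes += actualBytes - reservedSize;
  if (state.runningBytes > maxTotalBytes) {
    state.totalAborted = true; // blocks reads that have not started yet
    return false;
  }
  return true;
}

// Lying reader: claimed 10 bytes, returned 1000, cap is 500.
const lying: ReservationState = { runningBytes: 66, totalAborted: false };
const keptLying = reconcileAfterRead(lying, 10, 1000, 500);

// Honest reader: claimed 10 bytes, returned 10.
const honest: ReservationState = { runningBytes: 66, totalAborted: false };
const keptHonest = reconcileAfterRead(honest, 10, 10, 500);

console.log({ lying, keptLying, honest, keptHonest });
```

In the lying case the running total becomes 66 + (1000 - 10) = 1056, which exceeds 500, so the flag flips after the first read and a second worker would skip its own `read()`.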
The per-asset cap check at line 228 is preserved for stat-lying readers that blow a single asset past maxAssetBytes; that branch refunds handle.size and drops without flipping totalAborted, so sibling assets still get a fair shot. Verifies the round-5 Red at apps/daemon/tests/export-inline-route.ts ('reconciles handle.size with actual content bytes'). Daemon suite 1744/1744 passing. --------- Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai> * fix: truncate long template names on project cards (#1220) (#1302) Add min-width: 0 to .design-card-name so text-overflow: ellipsis works correctly in flex layouts. Long template names were pushing the task execution status (Running, Failed, etc.) out of view on project cards. Closes #1220 Co-authored-by: laomo <laomo@openclaw.ai> * fix(desktop): swallow setTypeOfService EINVAL crashes in dev main (#647) (#1298) * fix(desktop): swallow harmless setTypeOfService EINVAL crashes in dev main The packaged Electron entry (apps/packaged/src/logging.ts) already filters the undici "setTypeOfService EINVAL" crash that issue #895 introduced for the prod build, but the dev / source-built desktop entry was missing the parallel guard. Result: switching settings tabs in a from-source desktop run could fire a fresh fetch, undici would try to set IP_TOS on the outbound socket, the kernel would refuse on certain macOS / VPN configurations, and the rejection bubbled to Electron's default handler as the "JavaScript error in the main process" dialog reported in issue #647. Add the same defensive filter to apps/desktop: - isHarmlessSocketOptionError matches only the canonical undici shape (syscall name AND EINVAL code). A contradicting code (EACCES, EPERM, etc) explicitly fails the match so real bugs don't get hidden. - The uncaughtException handler logs harmless cases at warn and returns silently. 
For anything else it removes itself from the listener list and re-throws via setImmediate, restoring Node's default crash path so Electron's native dialog renders exactly as it would without this filter. - unhandledRejection mirrors the same harmless / fall-through split. The filter is installed BEFORE app.whenReady so it is armed by the time the renderer fires its first fetch. The helper is duplicated rather than imported from apps/packaged because AGENTS.md forbids cross-app private-source imports. The file header calls out the parallel and notes that the two copies should stay in sync until the helper is promoted to a shared workspace package (follow-up); the contract is identical so a regression in one will surface in the other's test suite. Tests in apps/desktop/tests/main/uncaught-exception.test.ts mirror apps/packaged/tests/logging.test.ts: 8 cases pinning the matcher shape, 2 cases pinning the handler's harmless-log-warn vs fall-through-rethrow split. Validated: pnpm guard, pnpm --filter @open-design/desktop typecheck, pnpm --filter @open-design/desktop build, and pnpm --filter @open-design/desktop test (14 passed, 10 new). * fix(desktop,packaged): fail-fast on non-harmless unhandled rejections The previous unhandledRejection listeners logged non-harmless reasons and returned, which kept the main process alive after any rejected promise. A real bug, a failed IPC registration, or any unexpected async exception was reduced to a console line instead of surfacing through Node/Electron's default crash path the filter was meant to preserve. Both copies now route non-harmless rejections through a parallel factory (createDesktopUnhandledRejectionHandler / createFatalUnhandledRejectionHandler) that mirrors the uncaughtException policy: harmless setTypeOfService EINVAL shapes log at warn and return, anything else logs at error, removes the listener, and re-throws via setImmediate. 
Listener removal happens before the scheduled throw, so the rethrown reason lands in the uncaughtException path with no recursion. Tests cover the harmless branch, the detach + ordered rethrow, and non-Error / primitive rejection reasons (Promise.reject(42)) which must fall through. Desktop suite: 13/13, packaged suite: 16/16. Flagged on PR #1298 by Siri-Ray and the codex P2 review thread; the two file copies stay in lockstep per the AGENTS.md sync invariant. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com> * feature: refine assistant artifact feedback (#1379) * feature: refine assistant artifact feedback * fix: clear hidden custom feedback reason * test: update assistant feedback expectations * fix: support object-style question-form options (#1293) * fix: support object-style question-form options * fix: preserve stable option values in form submissions * fix(daemon/acp): terminate ACP child after clean prompt completion (#1286) * fix(daemon/acp): terminate ACP child after clean prompt completion (Bug B / #1265) Some ACP agents (notably Devin for Terminal) keep the child process alive after stdin closes, waiting for the next prompt. Open Design spawns a fresh agent per chat turn and relies on child.on('close') to finalize the run, so without an explicit signal-driven shutdown the chat sits stuck in the 'working' state indefinitely. Three small, targeted changes: - apps/daemon/src/acp.ts: After a clean session/prompt response we schedule a 500ms grace period and then SIGTERM the child. This mirrors the pattern detectAcpModels() already uses after model discovery. The grace period leaves well-behaved agents that exit on stdin.end() unaffected. - apps/daemon/src/acp.ts: New completedSuccessfully() method on the session handle reports whether the prompt resolved without a fatal error or abort, so the consumer can distinguish 'clean signal exit' from 'genuine signal failure'. 
- apps/daemon/src/server.ts: child.on('close') now treats a SIGTERM exit as 'succeeded' when acpSession.completedSuccessfully() is true. - apps/web/src/providers/daemon.ts: Trust the server's authoritative endStatus; the signal/non-zero-code safety net no longer overrides an explicit 'succeeded' status, so the chat doesn't surface a fake 'agent exited with signal SIGTERM' error after a clean ACP run. Daemon tests cover the SIGTERM grace timer, clean early-exit (timer cleared), and completedSuccessfully() abort/error states. Manual UI test on plain main + this fix confirms Devin chats now return to ready automatically after Done · ... * fix(daemon/connectionTest): treat ACP clean SIGTERM as success Codex review on #1286 caught that the new SIGTERM in attachAcpSession breaks ACP connection tests for agents that don't shut down on stdin.end() (the exact Devin behavior the patch targets). attachAgentStreamHandlers() in connectionTest.ts now also respects acpSession.completedSuccessfully(), mirroring the same check we apply in server.ts. Without this, a clean prompt response followed by our SIGTERM would set winner.signal === 'SIGTERM', flip exitedCleanly to false, and the connection test would report 'agent_spawn_failed' even when the agent had returned a healthy response. Also widened the AgentSpawnHandle type so completedSuccessfully is visible on the structural type used inside connectionTest.ts. All 56 daemon tests still pass; typecheck + guard clean. * fix(daemon/acp): narrow ACP success-on-signal override to forced-SIGTERM Looper review on #1286 caught that the success predicate was broader than the SIGTERM case it was meant to handle. `completedSuccessfully()` flips to true as soon as the ACP `session/prompt` response is processed, but it does not say why the child later closed. 
With the broad predicate, an ACP agent that returned a prompt result and then exited with code 1 (or was killed by SIGKILL/SIGSEGV) was still marked 'succeeded', regressing the existing close-status behavior for genuine post-response process failures. Scope the override to the exact forced-shutdown shape this PR introduces: code === null && signal === 'SIGTERM' && acpCleanCompletion Applied to both `server.ts` (chat run finalization) and `connectionTest.ts` (connection-test classification). Any other post-response failure now falls through to 'failed' / 'agent_spawn_failed' as before. All 59 daemon tests still pass; typecheck + guard clean. * fix(web/daemon): only bypass exit-code safety net on explicit server success Looper review on #1286 caught that the previous web change trusted `endStatus === 'succeeded'` absolutely, but `endStatus` can become 'succeeded' in two distinct ways: 1. The SSE end event explicitly carries `status: 'succeeded'` (authoritative server declaration). 2. The end event omits or has an invalid `status` field and the handler silently falls back to 'succeeded' as a local default. Both produced `endStatus === 'succeeded'` in the existing code, so the new safety-net bypass treated them identically. That regressed backward compat: a compatible or older daemon emitting an end event like `{code:1}` or `{code:null,signal:"SIGTERM"}` with no `status` would suddenly skip the failure banner. Track explicit success separately via `serverDeclaredSuccess`, set true only when: - The SSE end event has `status === 'succeeded'`, or - The fallback `fetchChatRunStatus` REST path returns `status === 'succeeded'` (which the existing `isChatRunStatus()` guard already proves is explicit). The safety net is now bypassed only on that explicit signal; the local-fallback success path still reaches the exit-code/signal check so real failures surface as before. 
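The two narrowed checks from these review rounds can be sketched as small pure functions. Both are illustrative assumptions rather than the code in server.ts or daemon.ts: `isCleanForcedShutdown`, `endError`, and the `EndEvent` shape are hypothetical names, and the sketch models only the explicit-success and missing-status paths (explicit failed or canceled statuses are left out).

```typescript
// Hypothetical shape of the SSE end event payload.
interface EndEvent {
  status?: unknown;
  code?: number | null;
  signal?: string | null;
}

// Daemon side: only the forced-shutdown shape counts as success.
function isCleanForcedShutdown(
  code: number | null,
  signal: string | null,
  acpCleanCompletion: boolean,
): boolean {
  return code === null && signal === "SIGTERM" && acpCleanCompletion;
}

// Web side: bypass the exit-code safety net only when the server
// explicitly declared success. Returns an error message or null.
function endError(event: EndEvent): string | null {
  const serverDeclaredSuccess = event.status === "succeeded";
  if (serverDeclaredSuccess) return null;
  // Local-fallback path: status missing or invalid, so check code and signal.
  if (typeof event.code === "number" && event.code !== 0) {
    return `agent exited with code ${event.code}`;
  }
  if (typeof event.signal === "string" && event.signal.length > 0) {
    return `agent exited with signal ${event.signal}`;
  }
  return null;
}

console.log(endError({ status: "succeeded", code: null, signal: "SIGTERM" }));
console.log(endError({ code: 1 }));
console.log(endError({ code: null, signal: "SIGTERM" }));
```

Keeping the explicit-success flag separate from the resolved end status is what lets an older daemon that omits `status` still surface its failures.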
Adds three web-side regression tests in `apps/web/tests/providers/sse.test.ts`: - Explicit `status: 'succeeded'` + SIGTERM → onDone called, no error - End event with `{code:1}` and no `status` → onError surfaces 'agent exited with code 1' as before - End event with `{code:null,signal:'SIGTERM'}` and no `status` → onError surfaces 'agent exited with signal SIGTERM' as before `pnpm guard` + daemon typecheck clean; 27/27 SSE tests pass (up from 24). * Fix Codex wrapper launch paths (#1395) * test: add Memory and Routines coverage (#1400) * test: align extended Playwright coverage with current UI behavior * test: address extended suite review feedback * test: fix Codex fallback config hydration in e2e * test: add Memory and Routines coverage * test: fix Memory and Routines component test typing * test: include Memory and Routines e2e in extended suite * refactor(settings): use tiled language picker instead of dropdown (#1406) The Language section in Settings rendered a single-button dropdown trigger that opened a floating menu. With one visible label and lots of empty panel space, the layout misled users into thinking only one language existed. Replace the dropdown trigger + portaled menu with an inline tile grid that shows every locale at a glance and clicks directly to switch. Side effects of the new layout: the languageOpen / languageMenuRect state, the dynamic placement effect, the resize-close effect, the mousedown click-outside handler, and the languageRef are gone. The global Escape handler no longer needs to guard against the menu being open. CSS for .settings-language-picker, .settings-language-button, .settings-language-menu, and .settings-language-option is replaced by .settings-language-grid (auto-fill 180px minmax columns) + .settings-language-tile. Tests in SettingsDialog.execution.test.tsx that drove the dropdown (click trigger → click menuitemradio → assert menu closed) are rewritten to drive the tiles directly via the radio role. 
Refs #1347 * fix(web): restore consistent app header layout * fix(web): restore consistent app header layout Generated-By: looper 0.7.2 (runner=fixer, agent=opencode) * fix(web): restore consistent app header layout Generated-By: looper 0.7.2 (runner=fixer, agent=opencode) * fix(web): restore consistent app header layout Generated-By: looper 0.7.2 (runner=fixer, agent=opencode) * fix(web): hide project output chips in header --------- Co-authored-by: Prantik Medhi <140103052+prantikmedhi@users.noreply.github.com> Co-authored-by: 이용진 <90879448+Leesin0222@users.noreply.github.com> Co-authored-by: Nicholas-Xiong <2482929840@qq.com> Co-authored-by: Hesam <chngyzkhanwhsht@gmail.com> Co-authored-by: Yuhao Chen <godcorn001@outlook.com> Co-authored-by: chaoxiaoche <fanzhen910412@gmail.com> Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: eggward han <32223217+Eggwardhan@users.noreply.github.com> Co-authored-by: @aaronjmars <61592645+aaronjmars@users.noreply.github.com> Co-authored-by: Bryan <121247296+bankielewicz@users.noreply.github.com> Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai> Co-authored-by: mrzhangkris <92247501+mrzhangkris@users.noreply.github.com> Co-authored-by: laomo <laomo@openclaw.ai> Co-authored-by: Nagendhra Madishetti <nagendhra.madishetti24@gmail.com> Co-authored-by: Nagendhra <nagendhra405@gmail.com> Co-authored-by: Mason <jinmeihong0201@gmail.com> Co-authored-by: Yiang Yiyan <15089131836@163.com> Co-authored-by: Rocky <101849785+MrRockySL@users.noreply.github.com> Co-authored-by: nettee <nettee.liu@gmail.com> Co-authored-by: shangxinyu1 <shangxinyu@refly.ai> Co-authored-by: Matt Van Horn <mvanhorn@users.noreply.github.com>	2026-05-12 23:15:46 +08:00
lefarcen	e1bc83a476	feat(analytics): PostHog product analytics (P0 events, consent-gated, packaged) (#1428 ) * feat(analytics): scaffold PostHog product-analytics integration - Add @open-design/contracts/analytics subpath with the 17 P0 event payload types, header constants, and code↔CSV enum mapping helpers. - Add apps/daemon/src/analytics.ts with env-gated posthog-node client, request-scoped analytics context reader, and artifact-id anonymizer. - Expose GET /api/analytics/config so the web bundle never embeds the PostHog key at build time; daemon owns POSTHOG_KEY / POSTHOG_HOST. - Add apps/web/src/analytics module (identity + lazy posthog-js client + React provider) and mount it under <I18nProvider> in app/layout. No event wiring yet — that lands in the next commit alongside trigger points (App.tsx, EntryView, NewProjectPanel, SettingsDialog, FileViewer, runs.ts). * feat(analytics): wire app_launch, home_view, home_click, project_create_result - App.tsx: fire app_launch once after first effect tick. handleCreateProject now emits project_create_result on both success and failure paths. - EntryView.tsx: home_view (page) gated on agents loading so has_available_cli isn't transiently false; home_view (asset_panel) fires per top-tab change with the right result_count. - NewProjectPanel.tsx: home_click create_button fires before delegating to the parent; a fresh request_id is generated here and threaded through onCreate so the matching project_create_result stitches via $insert_id. - contracts/analytics: tighten createTabToTracking and topTabToTracking for the worktree branch's renamed tabs (live-artifact, templates). * feat(analytics): wire settings_view + 3 settings_click events - settings_view fires on dialog mount and on every section switch, carrying the active section (mapped via settingsSectionToTracking for the 16-section worktree layout), execution_mode, and the selected CLI provider id when present. 
- settings_click execution_mode_tab: setMode now emits before/after values whenever the user toggles between Local CLI and BYOK. - settings_click cli_provider_card: agent card onClick reports cli_provider_id via agentIdToTracking (kiro → other). - settings_click byok_field: onFocus added to api_key, model select, and base_url inputs; provider_id widened to include google so the worktree's Gemini protocol slot type-checks. * feat(analytics): wire studio_view + studio_click chat, studio_view artifact - packages/contracts/src/analytics/artifact-id.ts: FNV-1a 64-bit helper produces a 16-hex anonymized id for (projectId, fileName). Stable cross-platform so the daemon and the web bundle resolve the same id without a Web Crypto round-trip; daemon now re-exports it. - ChatComposer: studio_view chat_panel fires once per project mount, studio_click chat_composer fires on attachment + send buttons with estimated user_query_tokens (length/4) and has_attachment. - FileViewer: studio_view artifact fires once per (project, file) at the dispatcher level, before any sub-viewer renders, with artifact_kind derived from the renderer registry / file.kind table. - Widen TrackingExportFormat to include markdown and cloudflare_pages so the worktree branch's full share menu can emit verbatim. * feat(analytics): wire studio_click share_option + artifact_export_result HtmlViewer's share menu now emits both events per click via a fireShareExport helper: - studio_click share_option fires immediately on click with the chosen export_format and a fresh request_id. - artifact_export_result fires when the export resolves — success for sync exporters (html, markdown, template) the moment the call returns, success/failed for async exporters (pdf, zip, deploy) via .then/.catch. The same request_id threads both events so PostHog stitches click → result via $insert_id. DEPLOY_PROVIDER_OPTIONS maps to the CSV's vercel / cloudflare_pages slots; markdown is now a first-class export_format value. 
Also ignore .env.local so local POSTHOG_KEY / .env-style secrets don't get committed. * feat(analytics): emit run_created and run_finished from the daemon POST /api/runs now reads the analytics context off the x-od-analytics-* headers the web client sets on every fetch, then: - Captures run_created with project_id, conversation_id, run_id, model_id, agent_provider_id (mapped via agentIdToTracking), skill_id, design_system_id, plus the token_count_source marker. - Schedules a run_finished capture on runs.wait(run) resolution, mapping succeeded/canceled/failed to success/cancelled/failed and reporting total_duration_ms. Both events use a stable insert_id derived from the same uuid so PostHog dedupes the daemon-side mirror against any future web-side capture without double-counting. Token sub-fields (user_query_tokens/system_prompt_tokens/...) stay omitted in v1: the claude-stream parser only exposes input/output totals today. See tracking-doc-issues.md §3.2. * feat(analytics): emit settings_cli_test_result + settings_byok_test_result The original BLOCKING-list assumed these CSV P0 events were not implementable in this branch because main lacked Test buttons. The worktree HEAD actually wires `handleTestAgent` and `handleTestProvider` in SettingsDialog, so both events are now in scope. - handleTestAgent emits settings_cli_test_result on success and failure paths with cli_provider_id mapped via agentIdToTracking, result drawn from result.ok / catch branch, error_code from result.kind or the thrown error name, and duration_ms timed via performance.now(). - handleTestProvider emits settings_byok_test_result analogously, using apiProtocol (anthropic|openai|azure|ollama|google) directly as provider_id, which is wider than the CSV's 5-value enum, documented in tracking-doc-issues.md §2.5. Contracts: add SettingsCliTestResultProps / SettingsByokTestResultProps plus matching track* helpers. AnalyticsEventName union now covers all 14 P0 events this branch supports. 
* feat(analytics): gate PostHog on the existing telemetry.metrics consent The integration now reuses the same first-launch privacy banner + Settings → Privacy toggle that gates Langfuse, so a single user decision controls both telemetry sinks. - /api/analytics/config now consults the persisted AppConfigPrefs: it returns enabled=true only when POSTHOG_KEY is set AND the user has chosen "Share usage data" (telemetry.metrics === true). The response also echoes installationId so the web client uses the same anonymous id Langfuse keys off of — one identity per install, shared across both sinks. - Web AnalyticsProvider: - Bootstrap fetch resolves installationId and threads it through the x-od-analytics-anonymous-id header on every /api/* fetch, so daemon-side captures (run_created / run_finished / project_create_result) land on the same person record. - Exposes a setConsent(granted) method that calls posthog-js's opt_in_capturing / opt_out_capturing, wired from App.tsx via a useEffect watching config.telemetry?.metrics. Toggling Privacy → metrics now stops/resumes events immediately, no reload. - app_launch additionally gates on telemetry.metrics so a freshly- declined user fires nothing, and a freshly-opted-in user fires on the next reload. * feat(packaging): bake POSTHOG_KEY into packaged daemon spawn env Wires PostHog product analytics through the same Langfuse-style build- secret pipeline so official Open Design builds ship with the key while fork builds compile without it (the integration short-circuits cleanly when POSTHOG_KEY is absent). tools/pack - resolveToolPackConfig reads POSTHOG_KEY / POSTHOG_HOST from process.env at packaging time, validates them (no whitespace in the key, http(s) URL for host, trailing-slash strip), and stamps them on ToolPackConfig. Fork builds without the env vars simply omit the fields; the daemon-side gate keeps things off in that case. 
- Mac, Windows, and Linux packaged-config writers each append the two fields to open-design-config.json next to the existing telemetryRelayUrl entry. apps/packaged - RawPackagedConfig / PackagedConfig surface posthogKey / posthogHost so the Electron entry and headless entry both forward them to the daemon sidecar. - buildPackagedDaemonSpawnEnv emits POSTHOG_KEY / POSTHOG_HOST into the daemon child env when present. The daemon's existing analytics module reads these via process.env — no daemon-side changes needed. - The headless packaged path falls back to process.env for fields the builder hasn't injected, mirroring how OPEN_DESIGN_TELEMETRY_RELAY_URL is read there. CI - release-beta.yml and release-stable.yml expose POSTHOG_KEY (secret) and POSTHOG_HOST (var) at workflow-env scope so every packaging job inherits them. PR / fork builds without these set simply skip the bake step. Tests - tools/pack: config.test.ts covers bake-through, fork-build omission, whitespace rejection, invalid-URL rejection, and trailing-slash normalization. - apps/packaged: sidecars.test.ts covers buildPackagedDaemonSpawnEnv forwarding the keys when present and omitting them when null. * feat(analytics): enable PostHog autocapture + perf + exceptions Flip on the PostHog SDK's automatic diagnostic features so we capture click paths, page transitions, web vitals, dead clicks, and browser exceptions without scattering instrumentation through the codebase. Privacy defense lives in one place — apps/web/src/analytics/scrub.ts — wired in via posthog-js's `before_send` hook so every outgoing event passes through the same audit point: - $autocapture / $rageclick / $dead_click / $copy_autocapture: strips $el_text and value/placeholder/aria-label attrs from any input, textarea, password input, or contenteditable element. 
PostHog autocapture does not capture input.value by default, but $el_text on a <textarea> reflects the typed content — that's the prompt body for us, so it has to be scrubbed every time. - $pageview / $pageleave: drops query string and fragment from $current_url / $referrer so any future ?q=… can't leak. - $exception: rewrites file:// and absolute filesystem paths in stack frames to app://apps/<repo-relative> so we don't ship the user's home directory. - Suppresses $opt_in entirely — duplicate of our explicit setConsent toggle in App.tsx. Element-level defense in depth is limited to the single most sensitive surface: the chat composer textarea gets `ph-no-capture` so PostHog never even generates an event for clicks inside that subtree. Every other input relies on scrub.ts — sprinkling the class through every form would be noisy and easy to forget on new surfaces. The existing Privacy → "Share usage data" toggle continues to gate every new feature: posthog-js's opt_out_capturing() halts autocapture, $pageview, $exception, web vitals, and dead clicks alongside the explicit capture() calls — one global switch. 11 unit tests pin the scrub rules in apps/web/tests/analytics-scrub.test.ts. * ci(nix): bump pnpmDepsHash for posthog-js + posthog-node additions Adding posthog-js to apps/web and posthog-node to apps/daemon changed pnpm-lock.yaml, which Nix's fixed-output pnpmDeps derivation pins by sha256. The CI nix flake check failed with: specified: sha256-KF3Mld72/iau+pJmA7HvnanRx8VLtDP0N624SKrtrrc= got: sha256-PGFgX4lYyeH2TRAXfUq52A3EOa6bb1gO59hPsXhEk3s= Copy the new hash into both nix/package-web.nix and nix/package-daemon.nix per the procedure documented in nix/README.md §"First-build hash pinning". * feat(analytics): unify PostHog identity with Langfuse installationId PostHog's distinct_id is the installationId stamped by /api/analytics/ config; Langfuse already reads the same id off app-config.json to populate trace.userId. 
With both sinks keying off the same anonymous identity, dashboards can correlate user actions (PostHog events) with LLM runs (Langfuse traces) without re-identifying. Two gaps closed: 1. applyConsent(false) — clear posthog-js's persisted ph__posthog localStorage entry on opt-out via posthog.reset(). Without this, a user who opts out, then clicks Delete my data, then re-opts in would see PostHog stitch their new session to the deleted identity because bootstrap.distinctID only takes effect on first init. 2. applyIdentity(newInstallationId) — Delete my data rotates the installationId in app-config; App.tsx now watches config.installationId and calls posthog.reset() then identify(newId) so the next event batch is fully decoupled from the deleted one. Idempotent on same-id re-renders so benign config refreshes don't churn PostHog identities. The fetch wrapper's x-od-analytics-anonymous-id header also flips to the new id on rotation so daemon-side captures (run_created / run_finished) land on the same person record from the very next API call, not after a reload. The end-to-end rotation flow is verified against a live PostHog project; these unit tests pin the safety guards (no-client paths, null inputs) since stubbing posthog-js's init-loaded callback chain is brittle. fix(langfuse): require both metrics AND content consent for trace reports Tightens the Langfuse gate so a user who shares anonymous metrics but NOT conversation content stops emitting Langfuse traces entirely — Langfuse is used for turn-quality evals which only make sense with prompt/output bodies. PostHog (product analytics, content-free) stays gated on `metrics` alone and is unaffected. i18n: "Conversation content" → "Conversation and tool content" with hints expanded to mention tool inputs/outputs so the consent surface matches what the trace actually carries (en + zh-CN). 
Bundled here per PR scope — change originated outside this PostHog PR but lands cleanly on the same files; gating Langfuse strictly on `content` makes the dual-sink consent model (PostHog = metrics, Langfuse = metrics + content) symmetric across both i18n locales and the daemon-side gate. * feat(analytics): wire byok_provider_option + fix PR review P1s Adds the BYOK protocol-chip click event (5-value provider_id mirroring the apiProtocol Settings UI) and resolves four P1 review threads on PR #1428. byok_provider_option: - New SettingsClickByokProviderOptionProps in contracts (provider_id = anthropic\|openai\|azure\|google\|ollama; maps to CSV's 5 values per tracking-doc-issues.md §2.5). - trackSettingsClickByokProviderOption helper in apps/web/src/analytics. - SettingsDialog hooks it on the protocol-chip onClick alongside the existing setApiProtocol call; is_selected reflects whether the chip was already active. Review fixes: 1. client.ts (Siri-Ray): clear `initPromise` when the resolution is null so a Privacy → metrics opt-in after a previous decline triggers a fresh /api/analytics/config fetch. Without this, the disabled response was cached forever — first-session opt-in needed a reload to start sending PostHog events. 2. provider.tsx (Siri-Ray): replace `url.includes('/api/')` with a strict same-origin + /api/ pathname check (shared `isSameOriginApiCall` helper). Outbound third-party URLs containing `/api/` (e.g. provider.example.com/api/x) no longer receive our x-od-analytics-* headers. 3. provider.tsx (codex-connector, lefarcen): gate header injection on `resolvedAnonId` being non-null. When Privacy → metrics is off, /api/analytics/config returns enabled=false → resolvedAnonId stays null → wrapper never installs → daemon can't read consent-bearing headers → no daemon-side PostHog event. setConsent now also clears resolvedAnonId on opt-out and re-fetches on opt-in. 4. 
daemon/analytics.ts (defense in depth): createAnalyticsService now takes dataDir and capture() re-reads app-config to check telemetry.metrics inside the fire-and-forget wrapper. Even if a stale header somehow reaches the daemon after opt-out, the capture is dropped before posthog-node.capture is called. * fix(web): place "Share usage data" on the right in privacy consent banner Swap button order in PrivacyConsentModal and the in-settings ConsentCard so the affirmative "Share usage data" lands on the right and "Not now" on the left. Matches the OK-on-the-right pattern users expect for primary actions. Both buttons keep equal visual prominence (same .privacy-consent-action styling) so the swap doesn't change the EDPB equal-prominence stance called out in the original Langfuse telemetry spec. * feat(analytics): populate run_finished token totals from claude-stream usage Daemon's claude-stream parser already emits agent usage events with input_tokens / output_tokens totals; the run service buffers them in run.events and Langfuse reads them out the same way. The run_finished PostHog event was leaving these fields empty. Scan run.events for the most recent agent usage frame on terminal transition and emit input_tokens / output_tokens / total_tokens when present. token_count_source flips to 'provider_usage' only when at least one count landed; runs without provider-side usage data keep 'unknown'. Provider does not break the input down into the 7 sub-fields the tracking doc lists (memory / context / attachment / system_prompt / …); those stay omitted until a parser change exposes them. * feat(analytics): estimate user_query_tokens from prompt length The user_query_tokens field for run_created / run_finished was hardcoded to 0. We can't tokenize without bundling a model-specific tokenizer, but the character/4 heuristic is the industry-standard estimate when one isn't available and is enough for funnel analysis (prompt-length cohorts, short-vs-long-query conversion rates). 
Extracted from req.body via the same telemetryPromptFromRunRequest pattern the daemon already uses for langfuse-bridge (currentPrompt then message fallback). Only the integer count goes to PostHog — the prompt text itself never leaves the daemon. token_count_source flips appropriately: - run_created with a prompt: 'estimated' (was 'unknown') - run_created with no prompt: 'unknown' - run_finished with provider usage: 'provider_usage' (overrides baseProps' 'estimated' value) - run_finished without provider usage: inherits 'estimated' or 'unknown' from baseProps so input/output absent doesn't mask the estimate.	2026-05-12 22:32:42 +08:00
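The token fields described in the last two commits of this squash can be sketched as below. This is an illustration under stated assumptions, not the daemon's code: the helper names are hypothetical, and rounding up in the length/4 estimate is an assumption (the log only names the character/4 heuristic).

```typescript
type TokenCountSource = "unknown" | "estimated" | "provider_usage";

interface ProviderUsage {
  input_tokens?: number;
  output_tokens?: number;
}

// character/4 heuristic; Math.ceil is an assumption of this sketch.
function estimateUserQueryTokens(prompt: string | undefined): number {
  if (!prompt) return 0;
  return Math.ceil(prompt.length / 4);
}

// run_created: 'estimated' when a prompt exists, otherwise 'unknown'.
function runCreatedSource(prompt: string | undefined): TokenCountSource {
  return prompt ? "estimated" : "unknown";
}

// run_finished: provider usage overrides the base source only when at least
// one count landed; otherwise the base value ('estimated' or 'unknown') is kept.
function runFinishedTokenProps(baseSource: TokenCountSource, usage?: ProviderUsage) {
  const input = usage?.input_tokens;
  const output = usage?.output_tokens;
  if (input === undefined && output === undefined) {
    return { token_count_source: baseSource };
  }
  return {
    token_count_source: "provider_usage" as TokenCountSource,
    input_tokens: input,
    output_tokens: output,
    total_tokens: (input ?? 0) + (output ?? 0),
  };
}
```

Only integer counts appear in these props, consistent with the commit's note that the prompt text itself never leaves the daemon.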
lefarcen	7b191b5f85	fix: load Orbit templates from design templates (#1442) (cherry picked from commit `988e727927`) Co-authored-by: shangxinyu1 <shangxinyu@refly.ai>	2026-05-12 19:38:16 +08:00
lefarcen	4d8d233ce0	Fix Langfuse report finalization hook (#1402)	2026-05-12 19:22:49 +08:00
PerishFire	e6c5560884	Fix appearance accent color persistence (#1439)	2026-05-12 19:11:09 +08:00
lefarcen	2a0ebea50b	release: Open Design 0.7.0 - bump 14 monorepo package.json files to 0.7.0 (root + apps/{web,daemon,desktop,packaged,landing-page} + packages/{contracts,platform,sidecar,sidecar-proto} + tools/{dev,pack,pr} + e2e); apps/packaged was already at 0.6.1 from beta lane, all others at 0.6.0 - add CHANGELOG.md [0.7.0] - 2026-05-12 entry covering 97 merged PRs since 0.6.0: - Critique Theater: Phase 7 web client state machine (#1307) + Phase 6.2 daemon artifact extraction (#1085) - Web/UI: thumbs-up/down feedback widget (#1308), Cmd+, opens Settings (#1173), Finalize design package + Continue in CLI (#974), fetch models button for BYOK (#1034), provider models alphabetical sort (#1097), collapsible MCP JSON field-mapping (#1136), design file rename (#894) - Daemon: auto-memory store with chat-protocol-aware extraction (#999), install/uninstall skills & design systems (#1003), HTTP 206 range requests for video/audio (#1105), scheduled routines (#1033), agent runtime + route registration refactor (#1063, #1043) - HyperFrames: HTML-in-Canvas across web + skills (#866) - Skills/design systems: generic skills + design-templates split + finalize-design API (#955), agent-browser skill (#1284), WeChat design system + login-flow skill (#1083), hud/loom/trading-terminal design systems (#1069), release-notes-one-pager skill (#873), tokens.css schema (#1231) - Packaging: macOS Intel (x64) build (#759), official Nix flake (#402), beta packaging cache (#1095) - Maintainer ops: tools-pr PR-duty workspace (#1259), MAINTAINERS.md (#1290), contributor card bot (#932), PR→issue linking discipline (#1263) - Changed: conversation run isolation (#1271), default English i18n fallback (#1270), Codex CLI exit diagnostics / empty-response handling / path fallback (#1267, #1244, #1205) - Fixed: ~30 web + desktop + daemon + packaging bugfixes - Internal: nightly UI/desktop regression coverage (#1256), e2e/release report hardening (#1140), entry/settings automation (#954) - catch up 
[Unreleased] compare link to v0.7.0 and add missing [0.6.0] release link - add 97 PR footnote refs ([#402]..[#1330]) Verified locally: pnpm install + pre-build contracts/daemon/desktop dist + pnpm typecheck (exit 0 across all 14 packages on Node 22.22 with engine-warning). Release workflow validation runs after merge via release-stable.	2026-05-12 15:33:28 +08:00
Eli	9c489aa045	feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select (#1161 ) * feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select - Render real previews on project cards: HTML iframe / image / video / hashed gradient fallback with project initial; lazily fetches the project's primary file when metadata.entryFile is unset, prefers index.html → newest html → image → video. - Live artifact card thumbnails embed the rendered artifact URL via sandboxed iframe. - Replace the per-card close button with a `…` overflow menu (Rename, Delete) that opens on hover/click; click-outside and Esc close it. - Add multi-select mode (toolbar toggle → checkbox per card → "N selected · Delete · Cancel" pill) with batch delete via the existing onDelete prop. - Add a category tag to every card (Prototype / Live Artifact / Slide / Media) derived from project.metadata.intent / kind / skillId. - Replace browser prompt() and confirm() with custom modals (rename input + danger-confirm) reusing the existing .modal shell. - Add `more-horizontal` icon and 16 new i18n keys across all 18 locales (zh-CN/zh-TW localized; others fall back to English). * test(e2e): update home delete flow for overflow menu + custom confirm modal The previous flow targeted a per-card X button labelled "delete project <name>" and asserted on a native `dialog` event. The card UI now exposes a `…` overflow menu and a styled confirm modal, so reach delete via the menu and assert against the modal's Cancel / Delete buttons instead. * fix(web): harden Designs tab preview sandbox * fix(web): hide Designs select mode in kanban	2026-05-12 15:08:22 +08:00
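The primary-file fallback described in this commit (index.html, then newest html, then image, then video) might look something like the sketch below. The `ProjectFile` shape, its field names, and the function name are hypothetical; only the preference order comes from the commit message.

```typescript
// Hypothetical file record; the real project file type is not shown in the log.
interface ProjectFile {
  name: string;
  kind: "html" | "image" | "video" | "other";
  mtimeMs: number; // last-modified time in milliseconds
}

// Choose the file used for a card preview when metadata.entryFile is unset.
function pickPrimaryFile(files: ProjectFile[]): ProjectFile | undefined {
  // 1. An explicit index.html always wins.
  const index = files.find((f) => f.name === "index.html");
  if (index) return index;

  // 2. Otherwise the most recently modified HTML file.
  const htmlFiles = files.filter((f) => f.kind === "html");
  if (htmlFiles.length > 0) {
    return htmlFiles.reduce((a, b) => (b.mtimeMs > a.mtimeMs ? b : a));
  }

  // 3. Then the first image, then the first video.
  return files.find((f) => f.kind === "image") ?? files.find((f) => f.kind === "video");
}
```

When this returns `undefined`, the card would fall through to the hashed gradient with the project initial mentioned in the commit.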
Eli	77f69257a7	feat(web): in-context comment thread for the artifact preview (#1276 ) * feat(web): free-pin fallback in comment mode for unannotated artifacts When the artifact has no data-od-id annotations, clicking in Comment mode now posts a synthetic position-based target so the host opens a popover at the click location. Daemon upsert validation requires a non-empty selector/label, so the pin uses [data-od-pin=ID] and label 'pin'. Coordinates are document-space (viewport + scrollY) so pins stay anchored after scroll/reload. Clicks on interactive elements (a/button/input/textarea/select/label/contenteditable) keep their native behavior and are not pinned. * feat(web): tighten comment popover layout for free-pin and element targets The popover header used to dump the raw elementId verbatim — fine for data-od-id targets like 'hero-cta' but jarring for free-pins where elementId is a synthetic 'pin-...' string. Branch on the prefix and show 'Pin · at X, Y' for free-pins; keep the label + selection kind for real element / pod targets. Replace the text 'Close' button with an icon-only close affordance to match the popover-as-card visual. Action row is now two right-aligned buttons (Comment + Send to Claude) for element targets and (Add note + Send to Claude) for pod targets, eliminating the three-button row that wrapped onto two lines at narrow widths. The 'Remove' affordance for existing comments stays left-aligned. * feat(web): drop comments tab from chat sidebar The chat sidebar's 'Comments' tab listed saved/attached preview comments but duplicates the per-element popover already shown in the artifact viewer. Hide the tab and its content while the right-side comment thread panel takes over the same surface in-context. The CommentsPanel / CommentSection components stay defined as dead code for the moment so callers and translation keys remain valid; a later pass can delete them. 
* feat(web): right-side comment thread panel in board mode Render a 320px CommentSidePanel anchored to the right of the artifact preview whenever board (comment) mode is on. The panel lists every saved preview comment for the current file with an avatar initial, the element label (or 'Pin' for free-pin synthetic ids), an Xd/Xh/Xm-ago timestamp, the note body, a Reply link, and a checkbox. Reply focuses the comment's element via liveSnapshotForComment so the popover opens at the right anchor. Selecting one or more comments via the checkboxes surfaces a 'N selected · Clear · Send to Claude' action bar above the list; Send to Claude reuses the existing onSendBoardCommentAttachments pipeline via commentsToAttachments. The panel takes the place of the chat sidebar's removed Comments tab so the thread lives next to the artifact instead of behind a tab switch. * feat(web): styles for right-side comment thread panel Floating 320px panel anchored to the right edge of the artifact preview with a scrollable comment list and a coral selection bar that appears when one or more comments are checked. Selected items get a coral tint; the reply / check / send-to-claude controls match the popover's coral primary tone. * feat(web): toast confirmation on comment save, close popover After savePersistentComment succeeds, close the popover via clearBoardComposer and surface a transient 'Comment saved' (or 'Pin saved' for free-pin targets) toast for 2.2s. Replaces the previous behavior where the popover stayed open with an empty draft after save, which left users uncertain whether the save landed and forced an extra click to dismiss. * feat(web): position the comment-save toast at the top of the preview * feat(web): allow editing saved comment notes via the side panel Rename the per-item 'Reply' affordance to 'Edit' (no thread model exists yet, so reply was misleading) and pre-fill the popover with the existing note when clicked. 
The save path goes through onSavePreviewComment which the daemon implements as an upsert keyed on (project, conversation, filePath, elementId), so the edit overwrites the existing row's note without spawning a duplicate. Also fall back to a snapshot synthesized from the saved comment's own fields when the corresponding live target is no longer in the iframe DOM (e.g. free-pin parents that were re-rendered), so the edit path still works after artifact reloads. * feat(web): hide already-sent comments from the side panel After Send to Claude, the daemon flips the comment status from 'open' to 'applying' (and then 'needs_review' / 'resolved' / 'failed' depending on the run). Filter the side panel to status === 'open' so sent comments visibly leave the list — the user gets clear feedback that the send landed and the panel stays focused on actionable, un-sent items. * feat(web): drop single-tab bar and conversation count badge After the Comments tab was removed the chat header still rendered a one-tab 'tablist' just for the Chat tab, which read as visual noise without a sibling to switch between. Drop the tabs wrapper entirely; the chat content stays mounted and the header now hosts only the conversation-history affordance. Also drop the numeric badge that overlaid the conversation history button: counting open conversations next to a generic history icon was easy to mistake for an unread / notification count. The dropdown itself remains the canonical place to see and switch between past conversations. * feat(web): right-align chat header actions after tab bar removal With the tabs wrapper gone, chat-header-actions sat flush left because nothing was pushing it across the header. Add margin-left: auto so the history / new-conversation / collapse buttons land at the right edge, matching the design files / index.html tab row's own right-aligned controls. 
* feat(web): rename board-mode toggle to Comment with comment icon The artifact preview toolbar's board-mode entry was labeled 'Tweaks' with the tweaks icon, which collided with the palette Tweaks button next to it and hid the comment capability behind a generic label. Rename to 'Comment' with the comment icon and switch to the viewer-action class so the button matches the surrounding toolbar items (Edit/Draw) and the coral active state lands on the right surface. * fix(web): pass designTemplates to ProjectView in api-empty-response test The test props for ProjectView were missing the designTemplates prop that was added to Props in #955 (generic skills split). CI's strict typecheck (tsc -b --noEmit) caught it; local runs that hit project references differently did not. Pass an empty SkillSummary array — matches the empty skills fixture for the same reason.	2026-05-12 15:05:08 +08:00
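The side-panel behaviour described in this commit series (only `open` comments listed, 'Pin' shown for free-pin ids, Xd/Xh/Xm-ago timestamps) could be sketched as follows. The `PreviewComment` shape and helper names are hypothetical, and the minute/hour/day cut-offs are assumptions; the status values and the `pin-` prefix come from the commit messages.

```typescript
type CommentStatus = "open" | "applying" | "needs_review" | "resolved" | "failed";

// Hypothetical comment record for illustration only.
interface PreviewComment {
  elementId: string;
  label: string;
  note: string;
  status: CommentStatus;
  createdAtMs: number;
}

// Sent comments leave the list: only un-sent ('open') items stay visible.
function visibleComments(comments: PreviewComment[]): PreviewComment[] {
  return comments.filter((c) => c.status === "open");
}

// Free-pin targets carry a synthetic 'pin-...' id, so show a friendly label.
function displayLabel(comment: PreviewComment): string {
  return comment.elementId.startsWith("pin-") ? "Pin" : comment.label;
}

// Compact relative timestamp; the 60-minute and 24-hour thresholds are assumptions.
function formatAgo(createdAtMs: number, nowMs: number): string {
  const minutes = Math.max(0, Math.floor((nowMs - createdAtMs) / 60000));
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  return `${Math.floor(hours / 24)}d ago`;
}
```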
Eli	928079daf5	feat(web): consolidate Image/Video/Audio entries into a Media tab (#1167 ) Reduces the New Project panel's top-level tab count by collapsing the three media surfaces into a single Media tab with an inner segmented control, and polishes the controls inside that tab so they stop dominating the panel: - Media tab + segmented (Image / Video / Audio) inside the panel body. Underlying ProjectKind branches and submission contract unchanged — the daemon still receives kind=image/video/audio. - Model picker rewritten as a combobox: one trigger row + searchable, provider-grouped popover with Recommended badges. Replaces the flat grid of provider-grouped cards that scrolled past the fold once the fourth provider landed. - Aspect picker compressed from a 5-card grid to a single row of segmented pills with mini ratio glyphs. - Image surface no longer carries a free-form Style notes field; it was redundant with the prompt template + main prompt input. - Live artifact tab locks fidelity to high-fidelity (the wireframe option is now hidden) — a wireframe live artifact doesn't make sense and the picker added noise. i18n: adds tabMedia / titleMedia / model* keys across all 18 locales, removes imageStyleLabel / imageStylePlaceholder. Tests + e2e selectors updated to drive the new Media tab + segmented surface flow.	2026-05-12 14:52:03 +08:00
Eli	1b307bf17f	feat(web): tweaks palette popover with HSL hue-shift recoloring (#1292 ) * feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal	2026-05-12 14:38:00 +08:00
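The recoloring rule in this commit (move every chromatic color to the target hue while keeping its saturation and lightness, and desaturate for Mono noir) can be illustrated with the sketch below. It is not the srcDoc bridge itself: the `Palette` type, the function names, and the chroma threshold for deciding what counts as "chromatic" are assumptions.

```typescript
interface Hsl {
  h: number; // hue in degrees, 0..360
  s: number; // saturation, 0..1
  l: number; // lightness, 0..1
}

// Standard RGB (0..255 per channel) to HSL conversion.
function rgbToHsl(r255: number, g255: number, b255: number): Hsl {
  const r = r255 / 255, g = g255 / 255, b = b255 / 255;
  const max = Math.max(r, g, b), min = Math.min(r, g, b);
  const l = (max + min) / 2;
  if (max === min) return { h: 0, s: 0, l }; // achromatic grey
  const d = max - min;
  const s = l > 0.5 ? d / (2 - max - min) : d / (max + min);
  let h: number;
  if (max === r) h = (g - b) / d + (g < b ? 6 : 0);
  else if (max === g) h = (b - r) / d + 2;
  else h = (r - g) / d + 4;
  return { h: h * 60, s, l };
}

type Palette = { kind: "shift"; hue: number } | { kind: "desaturate" };

// Assumed cut-off below which a color is treated as neutral and left alone.
const CHROMA_THRESHOLD = 0.05;

function recolor(color: Hsl, palette: Palette): Hsl {
  if (palette.kind === "desaturate") return { ...color, s: 0 }; // Mono noir style
  if (color.s < CHROMA_THRESHOLD) return color;                 // greys stay grey
  // Replace only the hue; saturation and lightness are preserved,
  // so pale tints stay pale and bold colors stay bold.
  return { ...color, h: ((palette.hue % 360) + 360) % 360 };
}
```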
Nicholas-Xiong	c0b679ecbc	fix: restore custom dropdown chevron for timezone selector in dark mode (#1368) Fixes #1359. The timezone selector in the Routines form was showing repeated dropdown icons and poor text readability in dark mode because: 1. The native chevron was removed, but no custom one was restored via background-image. 2. Missing padding caused text to overlap with any chevron. 3. No dark-mode-specific chevron color was defined. This commit adds the custom dropdown chevron styling (matching the global select behavior) with proper padding and dark-mode color variants, ensuring: - Single, correctly-positioned chevron icon - Sufficient padding to prevent text overlap - Proper contrast in both light and dark themes - Consistent visual behavior with other form controls	2026-05-12 14:29:01 +08:00
Sid	fb47d0ae51	style(web): polish EntryView UI — sidebar layout, folder tabs, slim form, blue selected token (#1360 ) * chore(web): upgrade radius scale + introduce blue --selected token UI polish pass — design tokens for follow-up commits. Radius scale was visually too square at the small end. Bump up so buttons / inputs / cards feel rounded rather than boxy: - `--radius-sm: 6px → 8px` (buttons, inputs, small chips) - `--radius: 10px → 12px` (medium containers, Recent filter pill) - `--radius-lg: 14px → 16px` (project cards) - `--radius-pill: 999px` unchanged (status chips) Introduce a separate "selected" colour so selection indicators (card borders, focus rings) read as blue instead of fighting with the orange brand accent that drives primary CTAs: - `--selected: #2563eb` (Tailwind blue-600) - `--selected-soft: rgba(37, 99, 235, 0.16)` (soft tint for shadows) No selectors are migrated to `--selected` in this commit — that happens in a later "selected state" commit so the diff stays scoped. * refactor(web): replace entry global header with sidebar brand + reorder bottom chips Pre-existing layout: a global \`AppChromeHeader\` strip sat across the whole top of EntryView (logo + settings gear), then a 2-column body below it. Visual mass concentrated in a thin horizontal bar that did not relate to the page's column structure, and the settings gear duplicated the bottom Local-CLI chip. New layout matches the two-column "brand-in-sidebar + tabs-in-main" pattern: the brand block lives at the top of \`.entry-side\` (left column), the right tabs live at the top of \`.entry-main\`, and the vertical divider between them is the only horizontal seam. EntryView: - Drop \`<AppChromeHeader actions={avatarMenu} />\` from EntryView's render — the home page no longer renders the global chrome strip. (ProjectView still uses AppChromeHeader for back-nav / file actions, so the component itself stays in the codebase.) 
- Add a sidebar brand block inside \`.entry-side\` using the already-defined \`.entry-brand\` / \`.entry-brand-mark\` / \`.entry-brand-title\` classes that were sitting dead in index.css. - Reorder \`.entry-side-foot\` chips so that the env-critical Local CLI row sits on top of the row, with the secondary toggles (language picker, pet adoption, X follow icon) compact on a second row. The Follow @nexudotio chip drops its text label and becomes icon-only — pure marketing content, so it no longer earns a full-width pill. - Settings access moves entirely to the Local CLI chip's existing click handler; the top-right gear is gone (it was a duplicate). CSS: - \`.entry-shell\` grid: \`auto 1fr\` → \`1fr\` (no header row). - \`.entry-side\` background: \`var(--bg-panel)\` → \`transparent\`, so the sidebar shares the page beige and only the New-prototype card reads as white. Removes the "everything on the left is on one big white sheet" feeling. - \`.entry-brand\` gets \`padding: 24px 20px 18px\` so the logo + title block has breathing room at the top of the sidebar. - \`.entry-brand-mark\` width/height \`44 → 34\`. The previous 44px gradient ring was visually heavier than the title text it sat next to. - \`.entry-brand-title\` weight \`600 → 450\`, color \`var(--text-strong)\` → \`var(--text)\`. Serif title still reads as the page anchor without the chunky "bold black" stamp. - \`.entry-brand-actions\` added for future right-aligned actions (carries no actual content in this commit — kept so re-adding a settings/avatar entry point doesn't need new CSS). - \`.entry-side-foot .foot-pill\` slim pass: padding \`4px 10px → 3px 8px\`, font \`11.5px → 10.5px\`, gap \`6 → 5\`, plus \`justify-content: center\` and \`min-height: 24px\` so the icon-only Follow pill stays the same height as the text pills next to it. 
* style(web): align right tabs row with brand row + strip hover/focus noise Right column's tabs row ("Designs / Templates / Design systems / Image templates / Video templates") needed three things: 1. Vertical center of tab text aligned with the brand logo on the left (both rows feel like one row, separated by the vertical divider only). 2. Active tab's underline sitting flush on the horizontal divider below the tabs (not floating mid-row). 3. No hover background, no focus outline, no transition — tabs are a navigation strip, not action buttons. Changes: - `.entry-header` padding `0 28px` → `24px 28px 0`, drop the `min-height: 52px`. Padding-top mirrors the brand block's padding-top (24px) so left logo top and right tabs top land on the same Y. Header height now content-driven; underline meets the `border-bottom` divider naturally. - `.entry-tabs` gets `align-self: stretch` + `align-items: center` + `gap: 2px → 24px`. The stretch lets the tabs container fill header height; the bigger gap matches Claude Design's tab rhythm. - `.entry-tab` becomes a "plain underline tab": - `border-radius: 6px 6px 0 0 → 0` (no folder-tab look — that's on the left tabs). - `padding: 14px 11px → 6px 4px 8px` so text + underline form a tight group, with the underline sitting at the bottom of the tab box right above the header divider. - `font-size: 14px → 12px` matches the left newproj tabs (set in commit 4) — both columns share the same tab type-size. - `transition: none` removes the inherited 120ms background / border / color transition. - Hover / focus / active states explicitly zero out background, border-color, outline. Hover keeps a subtle color change (`text-muted → text`) so the tab still feels interactive without flashing a chip behind it. - Active state colors are duplicated across `.active`, `.active:hover`, `.active:focus`, `.active:focus-visible` so the black underline never gets overwritten by the inactive-state rules above. 
* style(web): folder-tab merge on left newproj tabs + flat card top corners The left "Prototype / Live artifact / Slide deck / …" tabs sat as plain underline tabs above a fully-rounded card. The active tab and card looked like two stacked rectangles with a gap. Folder-tab pattern: - Active tab gets a white background + 12px top corners + a 1px border on top / left / right. - Active tab's bottom border matches the card's background color (effectively invisible) — so where the tab sits, the card's top border is "broken" and tab + card read as one merged shape. - Card top corners are square (`border-radius: 0 0 12px 12px`), bottom corners stay 12px. With the active tab's square bottom edge, the merge line at the tab/card seam is a clean horizontal, not a curve mismatch. Implementation: - `.newproj-tabs-shell`: - `overflow: hidden → visible` so the tab's overlap with the card below isn't clipped at the shell's bottom edge. - `margin-bottom: -1px` + `z-index: 2` so the shell renders on top of the card and the 1px tab/card overlap actually paints. - The `.can-left { padding-left: 40 }` / `.can-right` overrides used to reserve room for scroll arrows are removed (arrows are hidden, no extra padding needed). - `.newproj-tabs` keeps its horizontal `overflow-x: auto` so the 8 project-type tabs can still scroll inside the sidebar width. - `.newproj-tabs-arrow` becomes `display: none`. The two chevron-circle buttons added clutter without much benefit — users with touchpads / wheels / keyboard already scroll the tabs row natively, and the `::before` / `::after` linear- gradient fades (now using `--bg` instead of `--bg-panel` so they fade into the page beige, not the sidebar panel that no longer exists) signal there are more tabs to the right. - `.newproj-tab`: - Replace the plain bottom-underline (`border-bottom: 2px solid transparent`) with a full transparent 1px border so the active state can flip just the colors without changing layout. - `border-radius: 0 → 12px 12px 0 0`. 
- `position: relative` for z-index stacking. - `padding: 10px 6px → 7px 14px` (less vertical, more horizontal — tabs read as "labels" rather than chunky buttons). - Symmetric top/bottom padding (`7px`) so the text + folder- tab top corners stack cleanly. - `transition: none` — no animation between active/inactive states (tabs are nav, not action buttons). - All hover / focus / focus-visible / active states zeroed out background and border-color so the inherited `button { … }` base style (which adds bg-subtle on hover) does not bleed in. Subtle color change on hover (`text-muted → text`) is the only affordance. - `.newproj-tab.active` (+ active hover/focus combos so the base rules don't override): white bg, full var(--border) on three sides, bottom border = var(--bg-panel) (invisible against card), z-index 3 (above non-active tabs and shell pseudo-elements). - `.newproj-body`: - `margin: 0 24px` so the card breathes inside the sidebar (and the active tab's left edge aligns with the card's left edge). - `padding: 18px 24px 28px → 16px 18px 18px` — tighter. - `border-radius: (full 12) → 0 0 12px 12px` for the flat-top merge with the active tab. - Adds explicit `border` + `background: var(--bg-panel)` + `box-shadow: var(--shadow-xs)` so the form reads as a card floating on the transparent sidebar. - `flex: 1 → 0 0 auto` (and `min-height: 0` / `overflow-y: auto` removed) — the card is content-sized, not stretched to fill the sidebar. Empty space below the card is now page beige, not a giant white sheet. - `gap: 14px → 12px` between form sections. * style(web): slim NewProjectPanel form (title, fidelity, buttons, ds-picker) The form inside the new white card felt overweight against the compacted layout from the previous commits — fidelity cards were ~133px tall, the Create button + Open-folder secondary button both had ~11px symmetric padding, the design-system trigger had a 32px avatar in a 55px-tall row. 
Slim every element so the card reads as a focused form, not a stack of beefy buttons. Title: - `.newproj-title` font `14px / 600 → 13px / 550`. Still visibly the section heading but no longer competing with the serif brand title above. Fidelity: - `.fidelity-thumb { aspect-ratio: 12/7 → 16/7 }`. The previous aspect made cards taller than they needed to be in the narrow sidebar column. - `.fidelity-card { gap: 8 → 6, padding: 10/10/12 → 8/8/10 }`. Combined with the thumb aspect change, card height drops from ~133px → ~102px (visually close to the Claude Design reference while keeping the same content). Primary / secondary buttons: - `.newproj-create` padding `11px (symmetric) → 8px 11px`, margin-top `4 → 2` — primary CTA no longer towers over the fidelity cards above it. - `.newproj-import` padding `10px → 6px 10px` — the secondary "Import Claude Design ZIP" button feels like an alt option, not a peer of Create. Design system trigger: - `.ds-picker-trigger` gap `10 → 8`, padding `8/10 → 6/10`. - `.ds-picker-title` font `13 → 12.5` so name + subtitle stay legible in the slimmer row without overflowing the column. - `.ds-avatar` width/height `32 → 26`, border-radius `6 → 5`. The thumbnail was the dominant element in the row; shrinking it pulls the row height from ~55px → ~50px. Footer disclaimer: - `.newproj-footer` padding-top `0 → 12px`. The "Only you can see your project by default." line was butting against the card bottom; 12px of air separates the disclaimer (page-bg context) from the card (panel-bg context) cleanly. * style(web): blue selected indicators + Recent filter rounded + neutral input focus Three small "selection state" tweaks driven by the new `--selected` token introduced earlier in this branch: 1. Fidelity card selected border is now blue, not the brand accent. The orange Create button + the orange selected card border were fighting for the same visual role (primary action vs primary selection). 
Blue clearly says "this is the one that is selected" without competing with the CTA. - `.fidelity-card.active` border-color `var(--accent) → var(--selected)`. - Box-shadow ring + soft 0.04 drop swapped from the orange `180/90/59` rgba tuple to the blue `37/99/235` tuple. - `.fidelity-card.active .fidelity-thumb` border swapped from `var(--accent-soft) → var(--selected-soft)`. 2. Recent / Your designs filter is no longer a fully-rounded pill. The bottom-left settings chips deserve to be the only "999px pill" shape — those are tertiary status indicators. The Recent/Your designs toggle is a higher-importance inline filter, so it gets the medium radius instead. - `.subtab-pill` wrapper border-radius `var(--radius-pill) → var(--radius)` (12px). - Inner button border-radius `var(--radius-pill) → var(--radius-sm)` (8px). - Active state background `var(--text) → var(--bg-panel)`, color `var(--bg) → var(--text)`. The "black filled pill" read as a status badge; white-on-faint-gray reads as "selected toggle" — same shape as Claude Design's Recent pill. 3. Input focus is neutralised. The base `input:focus` rule added an orange border + a 3px orange-soft ring around the focused field — way too much visual weight for a quiet form ("Project name" → focus made it scream). - `input:focus / textarea:focus / select:focus` border-color `var(--accent) → var(--border-strong)` (light grey). - Box-shadow ring removed (`none`). Focused inputs now only darken their border by one step — barely visible but enough to confirm focus. These three changes are grouped because they all migrate selection-state styling off the brand accent and onto neutral / blue tokens. The next pass (if any) can sweep the remaining `var(--accent)` selection sites (`.ds-row.active`, `.ds-picker-trigger.open`, `.conv-pill.open`, …) to use `--selected` too, but each of those lives in a different surface and felt out of scope for the entry view polish. 
* refactor(web): pet rail toggle moves inside pet pill as split button WHAT - Convert the pet pill from a single `<button>` to a `<div>` containing two buttons separated by a 1px divider: * `.pet-pill-main` keeps the existing "Adopt a pet" / "Change pet" glyph + label + unadopted dot, still wired to `onAdoptPet`. * `.pet-pill-toggle` is a small icon-only button that flips `petRailHidden` — eye icon when the rail is hidden ("click to show"), eye-off when visible ("click to hide"). - Drop the old avatar-menu popover from EntryView entirely: `avatarMenuOpen` state, the outside-click / Escape effect, and the cog-popover trigger are all removed. The `Settings` entry of that popover was already redundant with the `Local CLI` chip; the `Hide/Show pet picker` entry now lives directly on the pet pill. - CSS in the `.pet-pill` block: * `height: 24px` + `padding: 0` so the outer pill matches every other chip in the row vertically. * `.pet-pill-glyph` reduced from 14px to 12px and constrained to a 14x14 inline-flex box so the unicorn / paw glyph stops pushing the chip taller than 24px. * Per-region hover (`.pet-pill-main:hover`, `.pet-pill-toggle:hover`) so each side of the split lights up independently, with the divider inheriting the accent tint while the chip is in `pet-pill-fresh`. WHY - After commit `5fe5721c` removed the global `<AppChromeHeader>`, the only entrypoint to "Show pet picker" was the avatar-menu popover. Putting the avatar cog back next to the brand mark felt wrong: it elevates Settings (already on the `Local CLI` chip) to a primary affordance and sits next to the logo, where it doesn't belong by hierarchy. - The pet-rail toggle is fundamentally a pet-area control — it belongs with the pet adoption chip, not in a popover. Putting both on the same chip via a split button gives the rail toggle a stable, discoverable home and keeps `.entry-brand` a brand-only row. SCOPE - `apps/web/src/components/EntryView.tsx` + `apps/web/src/index.css`. 
No new state, no new i18n keys (reuses `pet.railShow` / `pet.railHide`). - The orphan i18n keys `entry.openSettingsTitle` and `entry.openSettingsAria` are no longer referenced by EntryView but are left in place — they're shared types that other locale files still declare; a focused cleanup belongs in a separate commit. * test(e2e): update entry chrome + project mgmt assertions for new layout WHAT - entry-chrome-flows.test.ts: - Rename `entry chrome settings menu toggles pet rail visibility` → `pet pill toggle hides and shows the pet rail`. The flow no longer goes through an `Open settings` cog + `.avatar-popover` chain; instead it clicks the in-pill `.pet-pill-toggle` directly and verifies its `aria-label` flips between `Hide` / `Show pet picker`. - Replace `.app-chrome-header` / `.app-chrome-brand` assertions with `.entry-brand` + `.entry-brand-title` text checks. The global chrome strip no longer exists on EntryView. - The compact-width overflow guard now measures `.entry-brand` rather than `.app-chrome-header`, since the brand row replaced the chrome strip as the only top-of-page horizontal stack. - project-management-flows.test.ts: - Drop the `Scroll project types right` arrow click. The `.newproj-tabs-arrow` buttons are hidden (the folder-tab pattern leans on shadow gradients on `.newproj-tabs-shell::before/::after` instead). Playwright's `locator.click()` auto-calls `scrollIntoViewIfNeeded()`, so clicking `new-project-tab-image` after a tab-switch still reaches the off-screen tab. WHY - These selectors / interactions are tied to UI affordances the earlier commits in this branch deliberately replaced. The behaviors they pin (pet rail toggle reachability, no horizontal overflow at 820px, draft preservation across tab switches) are still asserted — only the selectors needed to follow the new structure. 
VERIFICATION - `pnpm exec playwright test ui/entry-chrome-flows.test.ts ui/entry-configuration-flows.test.ts ui/project-management-flows.test.ts` → 17/17 passed (chromium project, single worker, fresh daemon). * fix(web): restore .newproj-body as scroll container (P1 regression) WHAT Reintroduce `flex: 1 1 auto; min-height: 0; overflow-y: auto;` on `.newproj-body`, alongside the `display: flex` + `padding` that commit `ba44e396` kept. The parent `.newproj` is still `overflow: hidden`, so without these three lines the card can clip its own content with no scroll recovery. WHY Reported by @lefarcen (P1) and @Siri-Ray in review on #1360. Before this commit the slim-form pass made the body shrink-wrap (`flex: 0 0 auto`) to keep the empty-state caption snug against the card edge. That works when the form is short, but the card can grow well past the available sidebar height in real scenarios: - Compact-height windows (≤ 720 vertical px). - Image / media tabs that add aspect + model rows. - Validation / error text after a failed Create. - Design-system popover opened with many systems. In all four cases the Create / Import / Open-folder stack — or the picker's bottom options — were sliding below the visible sidebar with no scroll bar to recover them. This is a regression against the behavior that landed in #1038, which made `.newproj-body` the scroll container precisely to keep the form bounded. SCOPE - `apps/web/src/index.css` only, one ruleset. - Visual cost: the empty-state caption (`.newproj-footer`) now sits at the bottom of the available sidebar height instead of hugging the card, which is the same behavior pre-#1167 / pre-this branch. - A short comment in CSS now flags the invariant so a future refactor doesn't quietly flip the flex semantics again. 
* fix(web): restore :focus-visible ring on entry-tab + newproj-tab (a11y) WHAT Split the prior `:focus, :focus-visible, :active` group on both tab selectors so that `:focus-visible` no longer inherits the zero-out that was added to scrub the orange mouse-focus halo: - `.newproj-tab:focus-visible` → 2px inset blue ring (`--selected`) hugging the folder-tab's 8px top-corner radius, plus `--text` foreground so the label reads at full contrast while focused. - `.entry-tab:focus-visible` → 2px solid outline in `--selected` with `outline-offset: 2px` and `border-radius: 4px`. Outline is used here instead of inset shadow because the tab has no padding to spare against the active 1px bottom border, and outline doesn't participate in layout. Mouse-driven `:focus` and `:active` keep the prior transparent treatment — there is no orange ring on click, which is the polish the rest of this branch is going for. WHY Flagged by @Siri-Ray (changed-range) and @lefarcen (P2) on #1360: the polish-tab commits stripped the focus indicator entirely instead of just suppressing mouse focus, so keyboard users had no way to see which tab was active during arrow-key navigation. Re-introducing `:focus-visible` only restores keyboard reachability while keeping the visual quiet for pointer users. SCOPE - `apps/web/src/index.css` only. Two rulesets touched, one new `:focus-visible` rule added per selector. - No JS, no aria, no test churn — the rules trigger off the existing `:focus-visible` pseudo-class, which the same Playwright tests already exercise via Tab. * fix(web): scope quieted input focus to .entry-side, restore global ring (a11y) WHAT Split the global input focus rule into two layers: - `input:focus, textarea:focus, select:focus` now keeps a visible focus indicator on every input across the app — but in the new `--selected` blue (border + 3px `--selected-soft` ring) instead of the original `--accent` orange. 
This preserves accessibility for every settings page, dialog, project workspace, and right-column control that was previously losing its focus halo. - `.entry-side input:focus` keeps the neutral treatment from this branch — `border-color: var(--border-strong)`, no ring. The orange "Create" CTA on the entry sidebar is already the loudest element in that panel, so a competing blue ring on the title / path inputs next to it pulled the eye in the wrong direction. Scoping the quieter focus to the sidebar keeps that intent without leaking out to the rest of the app. WHY Flagged by @lefarcen as a P2 a11y regression on #1360: the previous version of this rule scrubbed the focus indicator (`box-shadow: none`, border only one shade darker) for every input in the app, not just on the entry surface this branch is targeting. Keyboard users on settings forms and dialogs were left without a visible focus state. SCOPE - `apps/web/src/index.css` only, one global rule restored and one scoped override added. No JS, no template change. - Color shift global focus orange → blue is intentional: it consumes the new `--selected` token introduced in commit `13dc8a65` and matches the active-state direction this PR is establishing. * chore(web): drop dead AppChromeHeader / isMacPlatform imports + document --selected token WHAT - Remove the `AppChromeHeader` import from `EntryView.tsx`. The component itself is still used (and re-exported) by ProjectView; EntryView dropped its render site in commit `5fe5721c` and the import has been a stale reference ever since. - Remove the `isMacPlatform` import too. It was only used by the old avatar-menu popover (for the `⌘,` / `Ctrl+,` Settings hint) which was deleted along with the popover when the pet-pill split button replaced it. - Add a docblock above the `--selected` / `--selected-soft` token pair in `index.css` so the cascade has a local explanation for why this blue is separate from the brand `--accent`. 
The note calls out which affordances should reach for `--selected` (active option, focused input ring, active filter pill) and pins the 16% soft fill role. WHY Both flagged by @lefarcen on #1360: - P3 — dead import: the TS config doesn't fail on unused imports, so this was silently shipping as dead code and obscuring the deliberate removal of the global chrome header. - P3 — token doc: the `--accent` vs `--selected` split was only explained in the PR body. Putting the rationale next to the token makes the contract durable beyond this discussion. SCOPE - `apps/web/src/components/EntryView.tsx`: two `import` lines removed. - `apps/web/src/index.css`: one comment block added directly above the token declaration. - Verified: `pnpm --filter @open-design/web typecheck` → exit 0.	2026-05-12 14:26:39 +08:00
huyhoangnhh98	140a4e1ff6	Improve responsive preview and design handoff outputs (#1224) * feat: improve responsive design handoff * feat: refine cross-platform design outputs Changelog: - Add auto-fit responsive preview behavior for tablet/mobile frames. - Add landing page and OS widgets metadata options with project header chips. - Strengthen prompt contracts for modern breakpoints, app-specific modules, CJX-ready UX, and final product surfaces. - Require cross-platform outputs to use separate platform files instead of tabbed demo selectors. - Add DESIGN-MANIFEST.json plus richer handoff guidance to daemon/client exports. - Update archive/export tests for manifest and responsive viewport matrix. * feat: enforce screen-file design outputs Changelog: - Enforce screen-file-first generation for landing pages, app screens, platform surfaces, and OS widgets. - Update design handoff and manifest exports so coding tools map each screen file to separate routes/surfaces. - Strengthen minimal-brief visual guidance to avoid monochrome or unstyled design outputs. * fix: address responsive handoff review feedback * fix: address handoff review blockers * fix: preserve proxy auth and normalized export entry * fix: narrow frame wrapper filter to directory paths only * fix: make artifact save failure banner generic --------- Co-authored-by: Huy Hoàng <macos@MacBook-Pro-Hoang.local>	2026-05-12 14:18:33 +08:00
PerishFire	93865f71e7	fix(daemon): remove opencode stdin dash sentinel (#1365)	2026-05-12 14:15:46 +08:00
Krishna shakula	1ce7d6e8c5	fix: use ACP config options for model selection (#1208)	2026-05-12 14:07:20 +08:00
lefarcen	43f7fc536a	Add Langfuse telemetry relay (#1296) * Add Langfuse telemetry relay * Configure telemetry worker custom domain * Add telemetry relay health check * Harden telemetry relay config	2026-05-12 13:59:19 +08:00
Nagendhra Madishetti	dbc94b83ed	feat(web): add thumbs-up/down feedback widget under completed assistant turns (#1288 ) (#1308 ) * feat(web): add thumbs-up/down feedback widget under completed assistant turns (#1288) Adds a lightweight feedback widget that surfaces under each assistant turn whose run succeeded. Users can submit positive or negative feedback in one click; the negative path opens an optional free-text comment area. The widget never blocks the message composer and only mounts after the run has produced its final artifact, matching the acceptance criteria. What ships - `<MessageFeedback>` (apps/web/src/components/MessageFeedback.tsx) renders the three states: idle (prompt + thumbs), submitted positive (confirmation + Change), submitted negative (confirmation + optional comment textarea + Send + Change). - AssistantMessage.tsx slots the widget under AssistantFooter, gated on `runSucceeded && !hasEmptyResponse`, so failed and empty-response turns don't ask the user to rate something that never finished. - The full record shape leaves room for the future analytics metadata the issue calls out (rating, comment, submittedAt; artifactRef / runId derivable from the surrounding message whenever the analytics pipeline lands). Persistence (v1 = localStorage) Lefarcen's clarifying comment on the issue asked whether v1 should be daemon-persisted or in-memory while the analytics pipeline is defined. The daemon's messages table is column-strict, so daemon persistence would require a SQLite migration plus a contract bump on `ChatMessage`; locking that shape in before the analytics pipeline is designed risks reworking it twice. localStorage is the middle ground: feedback survives reload (so the "feedback state is visually clear after submission" criterion holds across tabs and sessions) without committing the wire shape. 
The hook surface is just `(value, setter)`, so a future PR can swap the storage layer for a daemon mirror or an analytics shipper without touching the React surface. The store handles corrupted JSON, unknown future rating values, disabled storage (private-mode browsers), and broadcasts changes across listeners in the same tab via a CustomEvent so two mounts of the hook for the same messageId stay in sync. i18n 11 new keys under `feedback.` (prompt, thumbsUp/Down, two confirmation chips, comment label/placeholder/submit/saved, change). English source values authored alongside the keys; zh-CN translations added in the same pass so the locale alignment test stays green and Chinese users see Chinese strings from day one. The other 16 locales pick up English fallbacks via their existing `...en` spread. Test coverage - `tests/state/message-feedback.test.ts` (8 jsdom cases) — round-trip, null-clear, corrupted JSON, missing rating, unknown rating, key collision across messages. - `tests/components/MessageFeedback.test.tsx` (7 jsdom cases) — idle state, positive submit, negative submit, comment save, blank-comment Send disabled, Change unsticks the rating, rehydration from pre-populated storage. The locale alignment test continues to enforce that every locale declares the new keys (5/5 across 18 locales). Validated - pnpm guard clean - pnpm --filter @open-design/web typecheck clean - tests/i18n/locales.test.ts 5/5 - tests/state/message-feedback.test.ts 8/8 - tests/components/MessageFeedback.test.tsx 7/7 - Full web suite: 98 files, 903 tests fix(web): tighten feedback widget gate + storage sync + textarea, add styles (PR #1308 review) Addresses every P2/P3 from the codex + Siri-Ray + lefarcen reviews on PR #1308, plus a couple of polish items the review surfaced indirectly. Visibility gate (lefarcen P2) The gate was `runSucceeded && !hasEmptyResponse`, which also matched text-only acknowledgements and question-form replies. 
The issue scopes feedback to turns that produced a final artifact, so the gate now also requires `produced.length > 0`. New AssistantMessage suite (5 jsdom cases) pins: artifact -> shown, no-artifact -> hidden, streaming -> hidden, failed run -> hidden, empty_response -> hidden. Storage sync (codex P2 + lefarcen P2) The previous broadcast contract was: write storage, dispatch a bare CustomEvent, listeners re-read storage. That had two failure modes: - setItem throwing (private mode / quota / disabled storage) left the listener seeing null and clobbering the in-memory state the user just confirmed. - The clear path early-returned after removeItem and never dispatched, so a second mount of the same messageId stayed in the submitted state when the user clicked Change. New contract: every successful OR failed write dispatches a CustomEvent whose `detail.value` carries the new feedback record (or null). Listeners apply the value directly without re-reading. Same-tab sync survives storage failures and the clear path no longer early-returns. Cross-tab still re-reads on the platform `storage` event since that event has no detail. Two new storage tests pin the new broadcast contract (positive + null) and the failed-setItem path; two new component tests pin in-session confirmation under setItem failure and two-mount Submit + Change synchrony. Textarea draft fix (lefarcen P3) The textarea used `draftComment || feedback.comment || ''` as its controlled value, so erasing a saved comment snapped it back. The draft is now exclusively the source of truth; a ref-backed effect re-seeds the draft from feedback.comment whenever the rating transitions (mount, idle -> negative, cross-mount sync). Send is now enabled when `draftComment !== savedComment`, which lets the user both edit and clear a saved comment. New component test pins erase+Send actually removing a previously-saved comment. 
Accessibility The confirmation chip and "Comment saved" tag both gain `role="status"` + `aria-live="polite"` so screen readers announce the state transition. The thumb buttons keep their `aria-label`. CSS (lefarcen P3) The widget's `.message-feedback*` class set had no rules in index.css, so it rendered with default browser controls. Added a ~130-line block that mirrors the surrounding chat pill/chip vocabulary: bg-subtle background, border-pill confirmation chip, accent-tinted positive state and amber-tinted negative state to match the assistant-footer's data-unfinished pattern. Comment area sits below the chip and wraps on narrow widths so the composer isn't pushed off-screen on small panes. Validated - pnpm guard clean - pnpm --filter @open-design/web typecheck clean - tests/state/message-feedback.test.ts 10/10 (was 8, +2 broadcast) - tests/components/MessageFeedback.test.tsx 10/10 (was 7, +3 sync / storage-failure / clear-saved-comment) - tests/components/AssistantMessage.test.tsx 5/5 (new file) - tests/i18n/locales.test.ts 5/5 - Full web suite: 866 tests --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-12 11:10:28 +08:00
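The "every successful OR failed write dispatches" contract from the storage sync fix above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: `FeedbackRecord`, `KeyValueStore`, `createFeedbackStore`, and the `feedback:` key prefix are hypothetical names, and a plain listener array stands in for the `CustomEvent` dispatch so the sketch runs outside a browser. The cross-tab `storage` event path is not covered.

```typescript
// Hypothetical record shape; the commit only names rating, comment, submittedAt.
type FeedbackRecord = {
  rating: "positive" | "negative";
  comment?: string;
  submittedAt: number;
};

// Assumed storage seam: the two localStorage methods this sketch needs.
interface KeyValueStore {
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

type Listener = (messageId: string, value: FeedbackRecord | null) => void;

function createFeedbackStore(storage: KeyValueStore) {
  let listeners: Listener[] = [];

  function write(messageId: string, value: FeedbackRecord | null): void {
    const key = "feedback:" + messageId; // hypothetical key format
    try {
      if (value === null) storage.removeItem(key);
      else storage.setItem(key, JSON.stringify(value));
    } catch {
      // Storage may throw (private mode, quota, disabled storage).
      // Fall through: the broadcast below must still happen.
    }
    // The new value travels with the notification, so listeners apply it
    // directly and never re-read storage. The clear path (null) broadcasts too.
    for (const listener of listeners) listener(messageId, value);
  }

  function subscribe(listener: Listener): () => void {
    listeners = listeners.concat(listener);
    return () => {
      listeners = listeners.filter((l) => l !== listener);
    };
  }

  return { write, subscribe };
}
```

Because the payload rides on the notification, a second mount for the same message stays in sync even when `setItem` throws, which is the failure mode the review called out.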
Nagendhra Madishetti	1df3eca161	feat(web): Critique Theater Phase 7 — reducer + useCritiqueStream + useCritiqueReplay (#1307 ) * feat(web): pure reducer for Critique Theater states (Phase 7.1) Pure CritiqueState reducer driven by the contracts-level PanelEvent (the same shape both the live SSE stream and the recorded transcript emit), so a single reducer powers both the in-flight panel and the rerun replay. Lifecycle covers run_started → running → (shipped / degraded / interrupted / failed), with panelist_open / dim / must_fix / close / round_end events building per-round CritiquePanelistView entries as they arrive. Defensive behaviour that surfaced while writing the spec tests: - Terminal phases (shipped / degraded / interrupted / failed) are sticky against further lifecycle events for the same run, except for parser_warning which can land late and is recorded in a side channel without changing phase. - A new run_started for a different runId at any time discards the prior state and reboots, so the UI can launch consecutive runs without an explicit reset action. - Events whose runId does not match the active run return the same state reference, so React's useReducer doesn't re-render subscribers on stray traffic. - Round bookkeeping keys by round number rather than "always last", so an out-of-order panelist_dim for round 1 arriving after a round 2 dim does not corrupt the round 2 bucket. Test coverage: 18 cases covering each transition, the runId guard, sticky-terminal behaviour, the out-of-order round invariant, and the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire SSE + replay into the same reducer. 
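The defensive rules listed for the Phase 7.1 reducer (sticky terminal phases, reboot on a new `run_started`, same-reference return for stray runIds, round buckets keyed by number) can be sketched as a small pure function. The shapes below are simplified assumptions: the real `PanelEvent` and `CritiqueState` live in the contracts package and carry more event types and fields, and treating a duplicate `run_started` for the active run as a no-op is a guess, not something the commit states.

```typescript
type Phase = "idle" | "running" | "shipped" | "degraded" | "interrupted" | "failed";

// Reduced event union for illustration only.
type PanelEvent =
  | { type: "run_started"; runId: string }
  | { type: "panelist_dim"; runId: string; round: number; note: string }
  | { type: "ship"; runId: string }
  | { type: "parser_warning"; runId: string; message: string };

type CritiqueState = {
  phase: Phase;
  runId: string | null;
  rounds: { [round: number]: string[] }; // keyed by round number, not "always last"
  warnings: string[]; // side channel for late parser warnings
};

const initialState: CritiqueState = { phase: "idle", runId: null, rounds: {}, warnings: [] };

function isTerminal(phase: Phase): boolean {
  return phase === "shipped" || phase === "degraded" || phase === "interrupted" || phase === "failed";
}

function reduce(state: CritiqueState, event: PanelEvent): CritiqueState {
  if (event.type === "run_started") {
    // Assumption: a repeat run_started for the active run is ignored.
    if (event.runId === state.runId) return state;
    // A run_started for a different runId discards prior state at any time.
    return { ...initialState, phase: "running", runId: event.runId };
  }
  // Stray traffic for another run: same reference, so subscribers do not re-render.
  if (event.runId !== state.runId) return state;
  if (event.type === "parser_warning") {
    // Recorded even after a terminal phase, without changing the phase.
    return { ...state, warnings: [...state.warnings, event.message] };
  }
  // Terminal phases are sticky against further lifecycle events.
  if (isTerminal(state.phase)) return state;
  if (event.type === "ship") return { ...state, phase: "shipped" };
  // panelist_dim: append into the bucket for the event's own round.
  const bucket = state.rounds[event.round] ?? [];
  return { ...state, rounds: { ...state.rounds, [event.round]: [...bucket, event.note] } };
}
```

Returning the identical `state` object on ignored events is what lets `useReducer` skip re-renders, since React bails out when the reducer result is referentially equal to the previous state.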
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2) createCritiqueEventsConnection is a pure connection manager that mirrors apps/web/src/providers/project-events.ts: opens an EventSource at /api/projects/:id/events, listens for every name in CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent (stripping the critique. prefix and merging the data payload), and hands it to the caller's onEvent. Reconnect uses exponential backoff (1s → 30s) and resets on `ready`; malformed payloads drop with a dev-mode warning rather than tearing the stream. useCritiqueStream wraps the manager in a useReducer that owns the CritiqueState. enabled=false or a null projectId tears down the connection cleanly; switching projectId closes the old connection and opens a fresh one. The returned dispatch lets local UI synthesise actions (e.g. an Esc keypress firing a synthetic interrupted while a kill request is in flight); production traffic comes from the SSE stream. Test coverage: - sse.test.ts (10 cases, node env): subscription set covers every CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire shape back to PanelEvent; malformed JSON is swallowed and does not stop the stream; exponential backoff schedule and ready-reset semantics are pinned with a setTimeout seam; close() cancels pending reconnects and shuts the live source; no-op fallback when EventSource is unavailable. - useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event, reducer driven by synthetic actions, no connection when disabled or projectId is null, clean close on unmount, projectId change reopens cleanly. * feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3) Fetches the per-run NDJSON transcript (one PanelEvent per line), parses every line via the shared isPanelEvent predicate, and dispatches into the same CritiqueState reducer the live SSE stream uses. 
A single reducer means the UI rendering a replay can be identical to the live panel, and a UI mounting both useCritiqueStream and useCritiqueReplay in parallel does not have to reconcile two state shapes. speed knob is `paused | instant | live | { intervalMs: N }`. - instant flushes every event synchronously, useful for opening a finished run already at its terminal state. - intervalMs paces dispatches at a fixed cadence so the reviewer can watch the run unfold. - paused parses the transcript but holds events back until the caller advances speed (consumers can drive a scrubber later). - live is reserved for the future "playback at original cadence" feature, currently treated as instant; replay timestamps are not yet persisted with each event so honest pacing requires a follow-up Phase 7+ task. gunzip seam handles `.ndjson.gz` transcripts via DecompressionStream when present; the production fetch path picks between text and arrayBuffer based on the URL extension. Both seams are injectable so the unit tests don't need to spin up a real network or a real gzip pipeline. Test coverage (8 cases, jsdom env): - Idle status before any URL is provided. - speed=instant flushes the full transcript synchronously to shipped state. - speed={intervalMs:N} paces with the setTimeout seam, reaching done after the last tick. - speed=paused leaves status=playing with no dispatches. - Empty transcript reports done with state still idle. - Fetch rejection surfaces an error status with the message. - Malformed NDJSON lines are skipped; valid events around them still land. - .gz transcripts route through the gunzip seam. Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream + replay), all on one branch ready for review. Phases 8+ (Theater components) consume these from this PR. 
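The transcript parsing step described for Phase 7.3 (one event per NDJSON line, malformed lines skipped, each line checked by a predicate) might look roughly like this. `PanelEventLike`, `isPanelEventLike`, and `parseTranscript` are hypothetical stand-ins: the real hook uses the shared `isPanelEvent` predicate, whose exact checks are not shown in the commit message.

```typescript
// Simplified event shape for illustration; the real PanelEvent is richer.
type PanelEventLike = { type: string; runId: string };

// Stand-in for the shared isPanelEvent predicate (assumed to check at least this much).
function isPanelEventLike(value: unknown): value is PanelEventLike {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { type?: unknown; runId?: unknown };
  return typeof v.type === "string" && typeof v.runId === "string" && v.runId.length > 0;
}

function parseTranscript(ndjson: string): PanelEventLike[] {
  const events: PanelEventLike[] = [];
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // blank lines and the trailing newline
    try {
      const parsed: unknown = JSON.parse(trimmed);
      // Valid JSON that is not a recognisable event is dropped as well.
      if (isPanelEventLike(parsed)) events.push(parsed);
    } catch {
      // Malformed JSON: skip this line, keep the events around it.
    }
  }
  return events;
}
```

Parsing up front into an array keeps the pacing logic separate: the speed setting only decides how quickly the already-parsed events are handed to the reducer.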
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review) Two P1 fixes from lefarcen's review on PR #1307: SSE payload override `sseToPanelEvent` previously spread `data` after the channel-derived `type`, so a payload-provided `type` could override the channel and route a `critique.run_started` frame into the reducer as a `ship` action. Reversed the spread so the channel-derived `type` is authoritative, and revalidated the resulting object through the contracts-level `isPanelEvent` predicate before returning. Frames that fail validation (missing runId, empty runId, unknown type) are dropped, so a malformed or compromised SSE frame can no longer dispatch a wrong-shape action into the reducer. Three new sse.test.ts cases pin the regression: hostile `type:'ship'` in the payload still resolves to `run_started`, missing runId is dropped, empty runId is dropped. Replay pause/resume `useCritiqueReplay` had one big effect keyed on `transcriptUrl` only, so flipping `speed` from `paused` to `instant` never re-fired and the held events sat undispatched. Split into a parse effect (depends on URL, fetches and stores events in state) and a pace effect (depends on parsed-events + speed, owns the cursor + timers). The playback cursor lives in a ref that survives pause/resume cycles, so flipping `paused` -> `instant` flushes from the current position rather than restarting (which would double-dispatch `run_started` and reset the reducer). 
Two new useCritiqueReplay.test.tsx cases: - paused-then-instant transitions from `playing` to `done` and reaches the shipped terminal phase - intervalMs paced playback dispatches one event, pauses to drain the next scheduled timer, flips to instant, and confirms the remaining transcript drains exactly once (cursor was preserved) Doc consistency The earlier source comment in useCritiqueReplay.ts claimed `live` "paces by recorded timestamps" while the impl used zero-delay timers and the PR body said it behaves like `instant`. Aligned to reality: `live` currently behaves like `{ intervalMs: 0 }` (events drain on successive microtasks via setTimeoutFn) because transcripts do not yet carry per-event timestamps. Honest timestamp-driven pacing is queued as a Phase 7+ follow-up. Validated: pnpm guard, pnpm --filter @open-design/web typecheck, Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite 96 files / 888 tests. --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-12 10:45:07 +08:00
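The payload-override fix described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `PanelEvent` shape, the `KNOWN_TYPES` subset and the function signatures are assumptions made for the example; only the names `sseToPanelEvent`, `isPanelEvent` and the `critique.` prefix come from the commit message.

```typescript
// Hypothetical, trimmed-down event shape; the real PanelEvent contract is richer.
type PanelEvent = { type: string; runId: string; [key: string]: unknown };

// Assumed subset of event types, for illustration only.
const KNOWN_TYPES = new Set(["run_started", "ship", "interrupted"]);

function isPanelEvent(value: unknown): value is PanelEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.type === "string" &&
    KNOWN_TYPES.has(v.type) &&
    typeof v.runId === "string" &&
    v.runId.length > 0
  );
}

function sseToPanelEvent(channel: string, rawData: string): PanelEvent | null {
  const prefix = "critique.";
  if (!channel.startsWith(prefix)) return null;

  let data: unknown;
  try {
    data = JSON.parse(rawData);
  } catch {
    return null; // malformed JSON is dropped; the stream itself keeps running
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) return null;

  // Spread the payload FIRST so the channel-derived type always wins,
  // then revalidate the merged object before handing it to the reducer.
  const candidate = {
    ...(data as Record<string, unknown>),
    type: channel.slice(prefix.length),
  };
  return isPanelEvent(candidate) ? candidate : null;
}
```

With this ordering, a frame on `critique.run_started` whose payload carries `type: 'ship'` still decodes as `run_started`, and frames with a missing or empty `runId` decode to `null`.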
Nagendhra Madishetti	64510b790b	fix(web): translate Design Files refresh strings instead of hardcoding English (#1254 ) (#1300 ) * fix(web): translate Design Files / live artifact refresh strings instead of hardcoding English When the app language was set to Chinese, the Design Files refresh flow showed Chinese for the surrounding chrome but kept English for every label and message originating in describeRefreshStatus, describeEventPhase, and the refresh-event timeline body of LiveArtifactRefreshHistoryPanel. Same-screen mixed-language UX, the exact symptom reported in #1254. Root cause: those three sites bypassed i18n entirely. describeRefreshStatus returned hardcoded English label + description strings for the running / succeeded / failed / idle / never statuses; describeEventPhase returned hardcoded Started / Succeeded / Failed labels; the timeline body inlined "Refresh started…", "<n> source(s) updated", and "Refresh failed." string literals; and the empty-timeline copy ("No refresh activity yet in this session. Trigger Refresh to record a timeline…") was hardcoded too. Fix: thread the existing TranslateFn through both helpers, swap every hardcoded string for a t() lookup, and pull the empty-timeline copy and the failure-fallback through the same path. Added 13 new keys under liveArtifact.refresh.* — statusRunning, the five Description keys, three event-phase labels (eventStarted/Succeeded/Failed), eventStartedDetail, sourcesUpdatedOne/Many with an {n} placeholder, and timelineEmpty. Status labels for succeeded / failed / ready / never already had keys (statusSucceeded / statusFailed / statusReady / statusNever) so those are reused unchanged. Locales: full Chinese translations added to zh-CN.ts (the locale directly named in the issue). 
The other 16 locales pick up English fallbacks through their existing ...en spread, so the locale-key alignment test stays green; native translations for those locales can land via the usual locale-team passes without re-touching the source code. fix(web): cover the rest of the refresh panel under i18n + add a zh-CN render test Lefarcen's review on #1254 / PR #1300 surfaced that the first pass only translated three helpers (describeRefreshStatus, describeEventPhase, session timeline body) and left the rest of the panel in English. Under a Chinese UI the panel still mixed languages, which was exactly the regression the issue was filed for. This commit threads t() through every user-visible refresh-panel string the user would see in the Chinese flow: - Hero block: "Last refreshed" label + "Never" empty state. - Created / Last updated facts + their "Unknown" empty label. - Persisted refresh history header, hint, empty-state copy. - Persisted timeline status badge: succeeded / running / failed / cancelled / skipped now resolve through describePersistedStatus, which uses an exhaustive switch off LiveArtifactRefreshLogEntry's status union so a future contract addition trips tsc. - Session activity header, hint. - Document source header, hint, Type / Tool / Connector field labels. - Advanced debug metadata summary + note line. - "just now" relative-time fallback in the persisted timeline. 22 new i18n keys total (23 with the new heroLastRefreshedNever distinct from statusNever); zh-CN strings authored alongside the English source, every other locale picks them up via its existing ...en spread and the locale-key alignment test stays green. Intentionally untranslated surfaces: raw daemon payloads inside the <details> debug panel (event.step / refreshId / error.message and the JSON.stringify dump), since those are agent / connector identifiers and stack-trace style strings, not localised copy. 
The debug summary heading itself is translated; if the debug section should be hidden in localised primary flows, that is a separate UX call worth its own issue. Test coverage: new render test wraps LiveArtifactRefreshHistoryPanel in I18nProvider initial="zh-CN" and pins the Chinese rendering of every translated label, plus negative assertions that the formerly hardcoded English literals are NOT present in the markup. With the no-provider fallback returning English, the existing static-markup tests can't observe the regression this PR is meant to fix; the zh-CN render test is the only one that would have caught the original gap and will catch the next one. Validated: pnpm guard, pnpm --filter @open-design/web typecheck, locales.test.ts (5/5), FileViewer.test.tsx (69/69, +1 new zh-CN test), full web suite (92 files, 841 tests). * fix(web): route formatRelativeTime through Intl.RelativeTimeFormat so units localise Lefarcen's second pass on PR #1300 caught the remaining hardcoded English path: formatRelativeTime() still emitted units like `5s ago` and `45m ago`, so Chinese users would see those strings inside the otherwise-translated refresh panel. The function now takes the active locale + TranslateFn and routes through Intl.RelativeTimeFormat with style: 'narrow', numeric: 'always'. That preserves the historical `5s ago` shape for English while producing locale-correct output for every other locale (zh-CN gets `5秒前` / `45分前`, with the right past / future suffix and word order). The `just now` carve-out (abs < 5s) keeps using t('liveArtifact.refresh.justNow') since Intl's narrow output for zero-delta reads awkwardly. A try/catch around the RTF constructor falls back to 'en' if the runtime rejects the locale, so the function is safe on engines with limited ICU data. 
Callsites threaded through: - LiveArtifactRefreshHistoryPanel hero metric (`lastRefreshedAt`) - Session timeline event row (`event.startedAt`) - Session timeline event time (`event.at`) - LiveArtifactRefreshFact for the created / last-updated facts; the component now accepts optional `locale` + `t` props and the panel passes them in. Test coverage extension: - The existing zh-CN render test sets a real lastRefreshedAt (now - 45s) and real session-event timestamps, then asserts the Chinese past-tense suffix `前` appears AND the legacy English `Xs ago` / `Xm ago` shapes do NOT. That was the gap lefarcen pointed at: setting `lastRefreshedAt: undefined` couldn't see the regression because no relative-time formatting ran. - Added a small second test for the lastRefreshedAt-undefined empty hero so the original `从未` coverage still pins. Validated: pnpm guard, pnpm --filter @open-design/web typecheck, FileViewer.test.tsx (70/70, +1 new test), locales.test.ts (5/5), full web suite (92 files, 842 tests). --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-12 10:38:07 +08:00
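A minimal sketch of the `formatRelativeTime` approach described above, assuming a simplified signature (explicit `nowMs`, a one-argument `TranslateFn`) and an assumed second / minute / hour / day bucketing that the commit message does not spell out. `Intl.RelativeTimeFormat` and its `style` / `numeric` options are standard; everything else is illustrative.

```typescript
// Hypothetical, simplified translate function type for this sketch.
type TranslateFn = (key: string) => string;

function formatRelativeTime(
  targetMs: number,
  nowMs: number,
  locale: string,
  t: TranslateFn,
): string {
  const deltaMs = targetMs - nowMs;

  // "just now" carve-out: Intl's narrow output for a near-zero delta reads awkwardly.
  if (Math.abs(deltaMs) < 5000) return t("liveArtifact.refresh.justNow");

  const options = { style: "narrow", numeric: "always" } as const;
  let rtf: Intl.RelativeTimeFormat;
  try {
    rtf = new Intl.RelativeTimeFormat(locale, options);
  } catch {
    // Fall back to English if the runtime rejects the locale tag.
    rtf = new Intl.RelativeTimeFormat("en", options);
  }

  // Assumed bucketing; negative values produce past-tense output.
  const deltaSec = Math.trunc(deltaMs / 1000);
  const abs = Math.abs(deltaSec);
  if (abs < 60) return rtf.format(deltaSec, "second");
  if (abs < 3600) return rtf.format(Math.trunc(deltaSec / 60), "minute");
  if (abs < 86400) return rtf.format(Math.trunc(deltaSec / 3600), "hour");
  return rtf.format(Math.trunc(deltaSec / 86400), "day");
}
```

The exact strings depend on the ICU data shipped with the runtime, which is why the tests described in the commit assert on the presence of the localized suffix rather than on a full string.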
Caprika	5bd9763181	[codex] Improve Claude Code exit diagnostics (#1267) * fix daemon claude diagnostics * fix claude custom endpoint auth diagnostics * fix project view api empty response test props * fix claude diagnostic review gaps * fix silent custom endpoint claude diagnostics * fix claude diagnostic credential redaction * fix quoted api key redaction * fix claude diagnostic tail redaction * fix silent claude configured profile diagnostics	2026-05-12 00:08:31 +08:00
nettee	87a95b7fb4	Fix conversation run isolation (#1271)	2026-05-11 21:13:54 +08:00
Kaelz31	3524a43d18	fix: pretty-print JSON file previews (#1206) * fix: pretty-print JSON file previews * fix: avoid formatting JSON with unsafe numbers * fix: preserve precision-sensitive JSON previews * fix: preserve signed zero in JSON previews * fix: scan JSON numbers without repeated slicing --------- Co-authored-by: Kael S <YOUR_GITHUB_EMAIL_HERE>	2026-05-11 20:52:55 +08:00
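The commit titles above suggest a preview formatter that declines to reformat JSON whose numbers would not survive a parse / stringify round trip. The sketch below is one conservative way to do that and is not taken from the PR: both function names are hypothetical, and the rule "leave the text untouched if any number token changes when round-tripped" is an assumption.

```typescript
// Returns true if any number token outside a string literal would change when
// round-tripped through Number (large integers, -0, 1.0, 1e21, ...).
function hasLossyNumber(text: string): boolean {
  // Sticky regex so each match is attempted exactly at lastIndex, without slicing.
  const numberRe = /-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/y;
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch === '"') {
      // Skip the string literal, honouring backslash escapes.
      i++;
      while (i < text.length && text[i] !== '"') {
        i += text[i] === "\\" ? 2 : 1;
      }
      i++;
    } else if (ch === "-" || (ch >= "0" && ch <= "9")) {
      numberRe.lastIndex = i;
      const m = numberRe.exec(text);
      if (m === null) return true; // not a token we can vouch for
      if (String(Number(m[0])) !== m[0]) return true;
      i = numberRe.lastIndex;
    } else {
      i++;
    }
  }
  return false;
}

// Pretty-prints valid JSON, but returns the input unchanged when it is invalid
// or when reformatting could alter how a number is displayed.
function prettyPrintJsonPreview(text: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return text;
  }
  if (hasLossyNumber(text)) return text;
  return JSON.stringify(parsed, null, 2);
}
```

This check is deliberately strict: a value such as `1.0` is also left unformatted, since `JSON.stringify` would print it as `1`.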
eggward han	a0316d2599	fix(web): suppress autosave indicator for draft-only Connector key edits (#1232 ) When the user typed a replacement Composio API key, the global Settings autosave loop persisted `buildPersistedConfig(cfg)` — which intentionally strips the in-flight secret — and then advanced the indicator through 'saving' -> 'saved' despite the key never actually being written. The "All changes saved" status then contradicted the section-local "Save key" gesture and eroded trust in the saved-state badge for a sensitive field. The autosave effect now tracks the snapshot at the last successful save (or the initial cfg on mount) and compares the next snapshot's persisted shape against it via a new `isAutosaveDraftOnlyChange` helper. When the only diffs since last save are fields that `buildPersistedConfig` strips (today the Composio API key, generalizing to any future save-on-explicit-confirm secret), the persist call is skipped and the indicator settles to 'idle' instead of flashing 'saved'. The forced media-provider sync path still runs because that is a real outbound effect even when the persisted shape hasn't changed. Refs #1187	2026-05-11 20:52:45 +08:00
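The draft-only comparison described above can be illustrated with a small sketch. The `SettingsDraft` shape and its field names are hypothetical; only `buildPersistedConfig` and `isAutosaveDraftOnlyChange` are names from the commit message, and their real signatures may differ.

```typescript
// Hypothetical settings shape; composioApiKey stands in for any
// save-on-explicit-confirm secret that autosave must not persist.
interface SettingsDraft {
  theme: string;
  language: string;
  composioApiKey?: string;
}

// Assumed behaviour: the persisted shape omits the in-flight secret.
function buildPersistedConfig(cfg: SettingsDraft): { theme: string; language: string } {
  return { theme: cfg.theme, language: cfg.language };
}

// True when the snapshot changed, but only in fields that are stripped before persisting.
function isAutosaveDraftOnlyChange(lastSaved: SettingsDraft, next: SettingsDraft): boolean {
  const sameRaw = JSON.stringify(lastSaved) === JSON.stringify(next);
  const samePersisted =
    JSON.stringify(buildPersistedConfig(lastSaved)) ===
    JSON.stringify(buildPersistedConfig(next));
  return !sameRaw && samePersisted;
}
```

When this returns `true`, the autosave effect can skip the persist call and settle the indicator to 'idle' rather than flashing 'saved'.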
donglrd	19f1ff7995	Reject filesystem root folder imports (#1266)	2026-05-11 20:52:35 +08:00
nettee	be77dc0394	Default English resource i18n fallback (#1270)	2026-05-11 20:29:05 +08:00
Caprika	fb079d8115	Add reliable agent-browser skill (#1284) * Add reliable agent browser skill * Fix ProjectView delete conversation test props	2026-05-11 20:09:12 +08:00
PerishFire	1eb20e3807	fix(web): keep tweaks selection usable without annotations (#1268)	2026-05-11 20:06:49 +08:00
Sebastian Westberg	8962088c75	feat(daemon): guard against agent-emitted stub artifact regressions (#1171 ) * feat(daemon): guard against agent-emitted stub artifact regressions When an agent emits an <artifact> block whose body is a placeholder ("see other-file.html in this project", a bare filename string, a tiny fallback page) instead of the full document, the daemon writes the placeholder to disk verbatim. Users see a 25-500 byte HTML file where their previous version had tens of kilobytes of real markup. Add a structural regression guard in writeProjectFile: before writing an html/deck artifact whose manifest carries metadata.identifier, scan the project dir for prior siblings matching <identifier>(-\d+)?\.html? and compare sizes. If the new body is below minRetainedRatio (default 0.2) of the largest prior sibling >= minPriorBytes (default 4096), flag a regression. Three modes via env: - OD_ARTIFACT_STUB_GUARD=warn (default) writes the file and attaches stubGuardWarning to the response so the frontend can surface it. - OD_ARTIFACT_STUB_GUARD=reject throws ArtifactRegressionError before fs.writeFile; the route returns 422 ARTIFACT_REGRESSION with the prior sibling's name and size in error.details. - OD_ARTIFACT_STUB_GUARD=off skips the guard entirely. Cross-agent by design: anchored on size delta + identifier match, no agent-specific stub-phrase regex, so works for any agent backend behind the agent-adapter abstraction. The body-then-manifest write order pre-dates this change; the reject path throws before fs.writeFile so rejections never leave a partial state behind. 24 unit + 8 HTTP tests cover happy paths, all three modes, deck kind, .htm extension sibling detection, ratio=1 edge case, and verify rejected writes leave neither the html nor its manifest sidecar on disk. * fix(stub-guard): close same-name, nested-dir, and non-slug bypasses Code review on PR #1171 (lefarcen, Codex, mrcfps) found three holes where the stub guard could be silently bypassed. 
All three are now closed with HTTP test coverage. Same-name overwrite (lefarcen P1): the writer's prior-sibling scan deliberately skipped the file at safeName, but for an in-session overwrite (persistArtifact reuses the same fileName when savedArtifactRef.current matches) that file is the prior content, not the new entry. Drop the exclude-by-name filter; the current on-disk size at scan time is always the prior because the overwrite happens after this check. Subdirectory scoping (Codex/mrcfps P2): writeProjectFile creates parent directories for nested paths like reports/overview.html, but the guard only scanned the project root. Pass path.dirname(target) as scanDir so nested artifacts are evaluated against their real sibling set. Non-slug identifier (Codex/lefarcen/mrcfps P2): the web's persistArtifact slugifies the filename basename but stores the raw identifier in the manifest, so an identifier like "Landing Page" yields filename landing-page.html with metadata.identifier="Landing Page". Build the sibling regex from both the raw identifier and a slugified variant (mirroring the frontend's slugifier) so either form matches the same priors. Also surface warn-mode warnings in the web UI: ProjectView now checks file.stubGuardWarning after writeProjectTextFile and renders the warning via setError. Reject-mode 422 surfacing requires restructuring writeProjectTextFile's return contract and is deferred. API change inside the daemon: evaluateArtifactStubGuard / findPriorArtifactSiblings drop excludeSafeName and rename projectDir to scanDir. Tests updated. Tests: 4 new HTTP cases (same-name overwrite preserves prior body, nested subdir rejects, slug-form match rejects, plus the existing warn/off/deck/.htm cases) and 1 new unit case (slug-form sibling match). 44 tests pass. * fix(stub-guard): empty-slug fallback + reject-mode UI surface Round 3 review on PR #1171 (lefarcen, mrcfps) found two remaining holes after `9cc82430` closed the same-name / subdir / non-slug bypasses. 
Empty-slug fallback bypass (lefarcen P2): an identifier like "测试" (all-non-ASCII) strips to empty through the web slugifier, and persistArtifact's `slice(0,60) \|\| 'artifact'` falls back to the literal "artifact" basename. The guard searched for raw identifier + slug only, so a later artifact-2.html stub bypassed the prior. Add EMPTY_SLUG_FALLBACK_NAME = 'artifact' as a sibling-name candidate when the slug is empty, mirroring the frontend fallback exactly. Reject-mode UI silence (mrcfps P2 + lefarcen P2): writeProjectTextFile collapses any non-OK response (including 422 ARTIFACT_REGRESSION) to null, and persistArtifact previously had no else branch. Users in reject mode saw the daemon log fire but the UI was silent. Add an else branch that surfaces a generic banner pointing at the most likely cause and mentions checking the daemon logs for structured details. Also clear savedArtifactRef.current on failure so retries re-enter the persistence path. Plumbing the structured 422 details through writeProjectTextFile itself remains out of scope (cross-cutting client contract change affecting 5+ call sites). The generic banner is the "at minimum" path mrcfps suggested. Tests: 1 new unit case (artifact.html sibling discovery for non-ASCII identifier) + 1 new HTTP case (empty-slug stub regression rejected end-to-end). 46 tests pass across stub-guard suites (was 44). * fix(stub-guard): verify sidecar identity to avoid cross-identifier false positives Round 4 review on PR #1171 (mrcfps inline + lefarcen review) caught a false-positive introduced by the round-3 empty-slug fallback. Two distinct identifiers that both slugify to empty (e.g. "测试" and "首页") share the artifact.html basename, so a brand-new save under the second identifier was being compared against — and falsely rejected because of — the unrelated first. 
The same shape exists symmetrically: a non-empty-slug identifier literally named "artifact" would falsely match empty-slug fallback files written under any other identifier. Fix: filename pattern matching is now a candidate generator, not the source of truth. For every candidate sibling, read its .artifact.json sidecar and verify metadata.identifier matches the input via artifactIdentifiersMatch (raw equality OR shared non-empty slug). Files without a sidecar are skipped — they weren't written through the artifact-tag path this guard targets, and treating them as priors was always a stretch. Empty-slug equivalence is intentionally NOT honored: 测试 != 首页 even though both slugify to empty. The whole bug was conflating distinct identifiers via the fallback name; slug-equivalence kicks in only for non-empty slugs (Landing Page <-> landing-page). Tests: unit fixtures now write file+sidecar pairs (mirrors prod); new artifactIdentifiersMatch suite covers the 5 equivalence cases; new HTTP test does NOT cross-reject distinct empty-slug identifiers asserts the second save returns 200 instead of 422; new unit test skips files without a sidecar. 42 tests pass across stub-guard suites. fix(stub-guard): require canonical-form anchor in identifier match to avoid 60-char truncation collisions Round 5 review on PR #1171 (mrcfps) caught another false-positive in artifactIdentifiersMatch: slugifyArtifactIdentifier truncates at 60 chars, so two distinct >60-char identifiers that share their first 60 chars (e.g. "A...A1" and "A...A2", 70 chars each) slugify to the same string and would falsely bridge. Same shape as the empty-slug fallback bug from round 4, just at the other end of the input range. Tighten the rule: slug-equivalence requires at least one input to BE its own canonical slug form. That keeps the legitimate bridge ("Landing Page" <-> "landing-page" — second input IS the slug) but rejects truncation collisions ("A...A1" <-> "A...A2" — neither is in canonical form). 
Side effect: two non-canonical forms that slugify to the same value no longer bridge (e.g. "Landing Page" vs "LANDING-PAGE"). This is correct: without one canonical anchor we can't safely call them the same lineage. Updated the slug-equivalence test to assert the new semantics explicitly with both directions and a negative case. Tests: 2 new cases (no bridge for >60-char truncation collision; raw 70-char to its 60-char truncated slug still bridges) + 1 negative test for the non-canonical-pair case. 45 tests pass. * fix(stub-guard): cover legacy sidecar-less HTML priors Round 6 review on PR #1171 (mrcfps, non-blocking) caught a real legacy bypass: round 4's sidecar-required policy skipped any HTML file without an .artifact.json companion, but readManifestForPath (projects.ts) treats those same files as legitimate artifacts via inferLegacyManifest. So a project with an older sidecar-less dashboard.html (pre-sidecar era, Write-tool-emitted, paste-text, manual import, etc.) let its first stub rewrite through as a supposed "first emission". Fix: when the sidecar is missing, derive a synthetic identifier from the filename (strip the (-N)?\.html? suffix) and run it through the same artifactIdentifiersMatch rules. Synthetic identifiers come from already-slugified filenames, so they bridge raw inputs only via the canonical-form rule established in round 5 — no truncation collisions, no empty-slug conflation, no unrelated cross-identifier matches. Tests: 3 new unit cases (legacy fallback finds the prior; bridges raw->slug under the same rules; does NOT bridge unrelated slug forms via inference) + 1 new HTTP test that seeds a sidecar-less prior via the artifact-manifest-less write path and asserts the stub rewrite is rejected with 422 ARTIFACT_REGRESSION. 48 tests pass across stub-guard suites (was 45). 
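The identifier-matching rules built up across rounds 4 and 5 (raw equality, no empty-slug equivalence, and a canonical-form anchor for slug equivalence) can be sketched as below. The slugifier here is an assumed approximation of the frontend one (lowercase, non-alphanumerics collapsed to hyphens, truncated at 60 characters); the real implementation may differ in detail.

```typescript
// Assumed approximation of the web slugifier; only the 60-char truncation
// and the "all non-ASCII strips to empty" behaviour are taken from the text above.
function slugifyArtifactIdentifier(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, 60);
}

function artifactIdentifiersMatch(a: string, b: string): boolean {
  if (a === b) return true; // raw equality always matches

  const slugA = slugifyArtifactIdentifier(a);
  const slugB = slugifyArtifactIdentifier(b);

  // Empty-slug equivalence is intentionally not honoured.
  if (slugA === "" || slugA !== slugB) return false;

  // Canonical-form anchor: at least one input must BE its own slug,
  // which rules out 60-char truncation collisions between two raw forms.
  return a === slugA || b === slugB;
}
```

Under these rules "Landing Page" bridges to "landing-page", while "Landing Page" and "LANDING-PAGE" do not bridge to each other because neither is in canonical form.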
* fix(stub-guard): try both interpretations for legacy filename inference Round 7 review on PR #1171 (mrcfps, non-blocking) caught a real ambiguity in the round-6 legacy fallback: a filename like `phase-2.html` is genuinely ambiguous without a sidecar. It could be the identifier "phase" with a -2 collision suffix, OR the standalone identifier "phase-2". The round-6 helper only stripped the suffix, so a sidecar-less `phase-2.html` followed by a stub emission with metadata.identifier="phase-2" bypassed the guard ("phase-2" doesn't match the inferred "phase"). Fix: when the sidecar is missing, generate both candidate identifiers (full basename and suffix-stripped basename) and accept the file as a prior if either matches. Visible false positives are preferable to silent false negatives — and the canonical-form anchor in artifactIdentifiersMatch still rules out truncation collisions and empty-slug conflations regardless of which candidate matched. Tests: 2 new unit cases (full-basename interpretation finds "phase-2"; suffix-stripped interpretation also finds "phase") and 1 new HTTP test that seeds a sidecar-less `phase-2.html` and asserts the stub rewrite is rejected with 422 ARTIFACT_REGRESSION. 51 tests pass across stub-guard suites (was 48). --------- Co-authored-by: Sebastian Westberg <sebastianwestberg@users.noreply.github.com>	2026-05-11 19:59:37 +08:00
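Two pieces of the stub guard lend themselves to a short illustration: the size-ratio check from the first commit and the two-candidate filename inference from round 7. Both helpers below are hypothetical sketches with assumed names and signatures; only the defaults (0.2 and 4096) and the `(-N)?\.html?` suffix shape come from the commit message.

```typescript
// Flags a regression when the new body is much smaller than the largest
// qualifying prior sibling. Defaults mirror the ones quoted above.
function isStubRegression(
  newBytes: number,
  largestPriorBytes: number,
  minRetainedRatio = 0.2,
  minPriorBytes = 4096,
): boolean {
  if (largestPriorBytes < minPriorBytes) return false; // prior too small to judge against
  return newBytes < largestPriorBytes * minRetainedRatio;
}

// For a sidecar-less file, return every identifier the filename could stand for:
// the full basename, plus the basename with a "-N" collision suffix stripped.
function legacyIdentifierCandidates(fileName: string): string[] {
  const base = fileName.replace(/\.html?$/i, "");
  const stripped = base.replace(/-\d+$/, "");
  return stripped === base ? [base] : [base, stripped];
}
```

A file is then accepted as a prior if either candidate matches the incoming identifier, so `phase-2.html` is considered for both "phase-2" and "phase".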
初晨	0f0d214298	fix(web): render static previews for sketch json files (#1060) * fix(web): render static previews for sketch json files * fix(web): tolerate malformed sketch text items * fix(web): harden sketch preview parsing * fix(web): preserve sketch items on round-trip * fix(web): clear sketch files destructively * fix(web): unblock unsupported sketch saves	2026-05-11 19:29:46 +08:00
Dongsen	fd67b680d7	fix(contracts): pin API-mode override above discovery layer (#313 ) (#1207 ) * fix(contracts): pin API-mode override above discovery layer (#313) The old streamFormat='plain' rule was appended at the BOTTOM of the composed prompt, but DISCOVERY_AND_PHILOSOPHY is pinned at the TOP with its own 'these override anything later' header — so its hard rules ('TodoWrite on turn 3', 'brand-spec extraction via Bash + Read + WebFetch') still won precedence in API mode. With no real tools wired through to the Anthropic Messages path, the agent narrated pseudo-tool markup (<todo-list>...</todo-list>, [读取 X]) instead of emitting structured tool_use events the UI could render. Move the API-mode override to the absolute top of the prompt so it beats the discovery layer, name every unavailable tool, and explicitly forbid the pseudo-tool / fake-protocol markup observed in #313. <artifact> output and <question-form> discovery are still allowed — both are markup the UI parses, not tool calls. * fix(daemon): mirror API-mode override above discovery layer (#313) Address Codex + mrcfps review on #1207: the daemon has its own copy of composeSystemPrompt that is hit by any adapter declaring streamFormat: 'plain' (e.g. DeepSeek) via server.ts:6190. That copy still appended the obsolete bottom '## API mode rule', which loses the precedence war against DISCOVERY_AND_PHILOSOPHY's 'these override anything later' header — so plain-stream daemon agents could still narrate <todo-list> / [读取 X] pseudo-tool markup. Mirror the same top-anchored API_MODE_OVERRIDE here (byte-identical to the contracts copy) so both code paths produce the same behaviour. Adds 8 daemon-side tests including the indexOf-based positional assertion that pins the override above the discovery layer header.	2026-05-11 19:29:34 +08:00
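The ordering fix and its positional assertion can be sketched as follows. The constant bodies are placeholder text, and the `composeSystemPrompt` options object and the non-plain `streamFormat` value are assumptions for the example; only the constant and function names and the `'plain'` value are taken from the commit message.

```typescript
// Placeholder bodies; the real prompt sections are much longer.
const API_MODE_OVERRIDE =
  "## API mode override\nNo tools are available in this mode. Do not narrate pseudo-tool markup.";
const DISCOVERY_AND_PHILOSOPHY =
  "## Discovery and philosophy\nThese rules override anything later in the prompt.";

// Hypothetical options shape for this sketch.
interface ComposeOptions {
  streamFormat?: "plain" | "structured";
  base: string;
}

function composeSystemPrompt(opts: ComposeOptions): string {
  const parts: string[] = [];
  // The override must come first so it outranks the discovery layer's
  // own "overrides anything later" header.
  if (opts.streamFormat === "plain") parts.push(API_MODE_OVERRIDE);
  parts.push(DISCOVERY_AND_PHILOSOPHY, opts.base);
  return parts.join("\n\n");
}
```

An `indexOf`-based test can then pin the ordering by asserting that the override's offset is non-negative and smaller than the discovery header's offset.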
Dongsen	12ce5ad38b	fix(web): ignore <artifact> tags inside markdown code spans and fences (#1132 ) * test(web): add failing parser cases for <artifact> recitation in markdown code Cover the three real-world prose contexts where the model legitimately quotes the artifact tag without intending to emit one: - inside an inline backtick span - inside a fenced code block - spread across streaming chunks crossing the fence boundary Establishes the RED baseline before parser code-fence awareness lands. * fix(web): ignore <artifact> tags inside markdown code spans and fences The streaming artifact parser scanned the buffer with a raw indexOf, guarded only by 'next char must be whitespace'. That meant any literal <artifact ...> the model recited while documenting the protocol — even inside backticks or a ```html fence — flipped the parser into artifact mode, swallowed the rest of the reply from the chat UI, and (when a matching </artifact> appeared in the recitation) silently wrote a spurious file to disk via persistArtifact. Replace findOpenTag with a linear scan that tracks fenced code blocks (```) and inline code spans (`), skipping any <artifact prefix found inside either. If the buffer ends mid-fence, return a partial match anchored at the fence start so the next streaming chunk can resolve the boundary without losing fence context. Closes #1130. * fix(web): match renderer fence/inline-code rules in artifact parser Codex review on PR #1132 caught that the previous fix toggled inFence on any triple-backtick run anywhere in the buffer, including mid-line, while the chat renderer (apps/web/src/runtime/markdown.tsx) only treats ``` as a fence when it occupies a whole line matching /^[ ]{0,3}```(\w[\w+-])?\s$/. That asymmetry would suppress a real <artifact> tag emitted after a prose sentence like "the opening marker is ```html and the response then writes:". Rework findOpenTag in three passes that mirror the renderer: 1. 
Walk \n-terminated lines; only a line that matches FENCE_LINE_RE toggles fence state. Open fences without a close (or with an unterminated tail line) return partial so the next chunk can resolve. 2. Collect inline code spans with /`[^`]+`/g — the same regex used by renderInline — so what the parser skips matches what the user sees as code. Unmatched trailing backticks after the last \n hold back. 3. Find the first <artifact …> outside any skip range; preserve the existing partial-prefix tail handling. Adds a regression test covering the exact case Codex reported. * test(web): pin parser behavior on double-backtick and in-fence string literal recitation Two cases raised in PR #1132 review: - a real artifact tag wrapped in '``<artifact …>``' (double-backtick inline code span) should not be treated as a real artifact - a fenced JS example whose body contains a string literal like 'const fence = "```";' should not pop fence state early and let a later literal <artifact> be parsed as real Both already pass on 96e88ca because the line-anchored fence regex and the renderer-aligned inline regex handle them correctly. Pinning the behavior so future regressions surface as test failures. * fix(web): make stripArtifact markdown-aware to stop truncating literal recitations The streaming artifact parser was hardened in 96e88ca to skip <artifact> recitations inside backticks and fences, but the post-stream stripper at AssistantMessage.tsx still ran a naive 'content.indexOf("<artifact")' over the same text events. As reported by lefarcen on PR #1132, that meant chat replies with literal protocol recitations could still get silently truncated mid-explanation — even though the parser preserved them in the text stream and the file panel was no longer polluted with ghost files. 
Extract the renderer-aligned classification (FENCE_LINE_RE, INLINE_CODE_RE, computeSkipRanges, rangeContains) into a single source of truth at apps/web/src/artifacts/markdown-context.ts so the parser and the stripper agree on what counts as code. Add apps/web/src/artifacts/strip.ts with a markdown-aware stripArtifact that: - ignores any <artifact open inside a fenced block or inline code span - looks for </artifact> with the same skip-range filter, so a real open paired with a literal close inside backticks does not strip a literal body that is meant to render - returns content unchanged when an open exists with no matching real close (the previous implementation sliced to end-of-string, which would nuke trailing prose on a malformed or still-streaming tag) Refactor parser.ts to import the shared helpers; behavior preserved (all seven existing parser tests still pass). New strip.test.ts covers six cases including the empirically-verified inline-backtick regression. * fix(web): align artifact stripper/parser fence rules with renderer exactly Two gaps surfaced in review at a0bf05f: - markdown-context.ts used a single FENCE_LINE_RE that allowed 0-3 leading spaces and reused the same pattern for opening and closing fences. The chat renderer (runtime/markdown.tsx:44 and :49) is asymmetric — opens with /^```(\w[\w+-])?\s$/, closes with /^```\s$/, and rejects any leading indentation on either side. Indented " ```html" was being treated as a code fence even though the renderer keeps it as a paragraph, and a literal "```html" line inside an open fenced example was closing the skip range early — both could expose a real or literal <artifact …> to the wrong handler. - stripArtifact discarded computeSkipRanges' unclosedFenceStart, so a fenced literal that ends at EOF without a trailing newline (very common for chat output) leaked the inner <artifact …> recitation to the stripper, reproducing the original #1130 truncation symptom on a narrower input shape. 
Split FENCE_LINE_RE into FENCE_OPEN_RE / FENCE_CLOSE_RE with no leading indentation, gate the fence state machine on the right side of the toggle, and have stripArtifact extend skip ranges to end-of-content when a fence is left open. Also tightened the parser's tail-line hold-back regex to match the renderer's no-leading-space rule. Added regression tests for the EOF-unclosed-fence case, the indented pseudo-fence (renderer treats as paragraph, stripper must strip the real artifact), and a "```html" line inside an open fence. Refs nexu-io/open-design#1130 refactor(web): align streaming tail-line fence guard with FENCE_OPEN_RE The streaming parser's tail-line hold-back used a stricter local regex (/^```\w$/) than the renderer's FENCE_OPEN_RE (/^```(\w[\w+-])?\s$/), missing valid opener tails like ```c++, ```ts-, or ``` (trailing space). In practice these tails are still held back by the unmatched-backtick parity scan that runs immediately after — three backticks in a tail line are odd, so firstUnmatched stays set and the parser holds from that position. So this wasn't a runtime correctness bug, just a regex divergence that future readers could trip on. Drop the local regex and reuse FENCE_OPEN_RE so the tail check matches the same shape the rest of the pipeline already uses. Pinned the behavior with three new parser tests (`+`/`-` info-string suffix and trailing-space tails arriving as the first chunk) — they pass at HEAD, proving the parity scan was already covering these cases. Refs nexu-io/open-design#1132 (lefarcen polish P2) fix(web): scope inline-code skip ranges per block and reject <artifact prefix-shared opens INLINE_CODE_RE previously ran over the whole buffer, so an unmatched backtick in one paragraph could pair with a backtick in a later paragraph and create a phantom inline span that swallowed any real <artifact …> between them. 
Mirror runtime/markdown.tsx by splitting the buffer on fence / blank / heading / list / hr boundaries and running INLINE_CODE_RE per block region instead. stripArtifact accepted any unskipped `<artifact` substring as a real open, while the streaming parser already required a following whitespace character — so prose like `<artifactual>demo</artifact>` was being truncated to `prefix suffix`. Extract the parser's real-open guard into isRealArtifactOpenAt and reuse it from both sides. While reordering findOpenTag for the shared guard, also fix the related hold-back ordering issue tracked at #1141: a stray tail-line backtick or fence-opener prefix used to suppress an artifact already complete earlier in the buffer. Scan for the earliest complete real open first, then pick the earliest hold-back position only when no complete tag was found. Regressions pinned in parser.test.ts and strip.test.ts for both new finding shapes. * fix(web): keep HR-shaped lines inside paragraph regions for inline-code scanning The previous walker closed inline-scan regions on lines matching the HR regex, but `parseBlocks()` in runtime/markdown.tsx does not break a paragraph on HR — its inner accumulation loop only breaks on blank / fence / heading / ul / ol (runtime/markdown.tsx:95-104). HR is only an HR block in the outer loop's first-look, never mid-paragraph. So inputs like `intro \`\n---\n<artifact …>…</artifact>\n---\nclosing \`` are one paragraph in the renderer, whose two stray backticks pair to cover the literal artifact recitation — but the walker was splitting on the `---` lines, leaving the recitation outside skip ranges, and the parser/stripper would treat it as a real tag. Drop HR from the paragraph-break list (HR-shaped lines carry no backticks of their own, so keeping them inside the surrounding region is benign either way) and document the renderer-mirror rationale. Regressions pinned on both sides.	2026-05-11 19:29:22 +08:00
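The fence rules described in this commit can be illustrated with a minimal sketch. It is not the repository's code: the regex quantifiers are reconstructed from the examples in the message, and `fenceSkipRanges` and `findRealArtifactOpen` are hypothetical names (only `isRealArtifactOpenAt`, `FENCE_OPEN_RE` and `FENCE_CLOSE_RE` appear in the message). Inline-code spans and streaming hold-back are left out.

```typescript
// Illustrative sketch only; shapes and names are assumptions, not the project's implementation.
// The fence marker is built with repeat() so the example can live inside a fenced block.
const FENCE = "`".repeat(3);
const FENCE_OPEN_RE = new RegExp("^" + FENCE + "(\\w[\\w+-]*)?\\s*$"); // no leading indentation
const FENCE_CLOSE_RE = new RegExp("^" + FENCE + "\\s*$");

type SkipRange = { start: number; end: number }; // end is exclusive

// Walk the buffer line by line and collect the spans covered by fenced code.
// A fence still open at end-of-input extends its range to the end of the content.
function fenceSkipRanges(content: string): SkipRange[] {
  const ranges: SkipRange[] = [];
  let openAt: number | null = null;
  let offset = 0;
  for (const line of content.split("\n")) {
    if (openAt === null) {
      if (FENCE_OPEN_RE.test(line)) openAt = offset;
    } else if (FENCE_CLOSE_RE.test(line)) {
      // Only a bare closing fence ends the range; an opener-shaped line inside a fence does not.
      ranges.push({ start: openAt, end: offset + line.length });
      openAt = null;
    }
    offset += line.length + 1; // +1 for the "\n" consumed by split
  }
  if (openAt !== null) ranges.push({ start: openAt, end: content.length });
  return ranges;
}

// A real open needs whitespace right after "<artifact", so "<artifactual>" stays prose.
function isRealArtifactOpenAt(content: string, index: number): boolean {
  const tag = "<artifact";
  return content.startsWith(tag, index) && /\s/.test(content.charAt(index + tag.length));
}

// Index of the first real open tag outside every fenced range, or -1 if there is none.
function findRealArtifactOpen(content: string): number {
  const skips = fenceSkipRanges(content);
  let from = 0;
  while (true) {
    const index = content.indexOf("<artifact", from);
    if (index === -1) return -1;
    const skipped = skips.some((r) => index >= r.start && index < r.end);
    if (!skipped && isRealArtifactOpenAt(content, index)) return index;
    from = index + 1;
  }
}
```

With these rules an indented pseudo-fence does not hide a following tag, while a fence that is never closed hides everything up to the end of the buffer.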
Sid	156bf5a34e	fix(web): refresh home projects after deleting a conversation (#1202) (#1219) The home design cards render their `Needs input` badge from the cached `/api/projects` payload: App.tsx owns the `projects` state and exposes a `refreshProjects` callback that ProjectView already fires from every other state-changing branch (run end, live-artifact events, project rename, etc.). The conversation-delete branch silently skipped it: deleting a conversation that owned an unanswered `<question-form>` flips the daemon-side flag, but the home view kept showing the stale badge until the next manual reload. Call `onProjectsRefresh()` immediately after a successful `deleteConversation` API response (and only then: if the request fails, the cached state is still the truth and we must not pretend otherwise). Adds `onProjectsRefresh` to the useCallback deps for exhaustive-deps correctness; matches the pattern at the four existing call sites in this file. New regression coverage in `apps/web/tests/components/ProjectView.deleteConversation.test.tsx`:
- triggers onProjectsRefresh after deleting a conversation (verified RED before this fix, GREEN after)
- does not trigger onProjectsRefresh when the delete request fails (defensive complement so a future "always refresh" refactor doesn't paper over a real failure with a stale-but-confident UI)	2026-05-11 19:29:09 +08:00
shangxinyu1	10802bb0b0	test: expand nightly UI and desktop regression coverage (#1256)
* e2e(ui): cover examples preview flows
* e2e(ui): cover Codex local CLI fallback UX
* test: expand desktop and connector regression coverage
* e2e(ui): cover workspace restoration flows
* e2e(ui): cover retry recovery workspace flow
* test: cover artifact and connector recovery flows
* e2e(ui): cover Continue in CLI stale provenance flow
* e2e(ui): cover BYOK model fetch caching
* test: expand Orbit and desktop connector coverage
* e2e(ui): cover workspace quick switcher recovery flows
* e2e(ui): cover connector pending authorization recovery
* e2e(ui): cover workspace and conversation restoration routes
* e2e(ui): cover conversation draft and attachment restoration
* e2e(ui): cover conversation history selection recovery
* e2e(ui): cover workspace surface conversation selection
* test: cover artifact presentation and orbit link behavior
* test: cover artifact external link restoration
* e2e(ui): cover root-route deep-link restoration
* e2e(specs): cover Orbit open-artifact desktop click
* e2e(specs): cover desktop artifact open link
* test: fix Orbit settings fixture type drift
* test: split Playwright critical and extended suites
* test: fix ProjectView design template fixtures
* ci: split workspace test stages
* guard: allow split Playwright suite scripts
* test: shrink Playwright critical suite
* test: restore omitted Playwright suites	2026-05-11 19:23:13 +08:00
PerishFire	8c0fb8dc01	feat(tools-pr): add maintainer PR-duty workspace (#1259 ) * feat(tools-pr): add maintainer PR-duty workspace Adds `tools/pr` as the maintainer-only control plane for PR-duty work on this repo. Thin `gh` wrapper that encodes repo-specific knowledge: review lanes, forbidden surfaces, lane-specific checklists, validation command derivation from touched packages. Subcommands: - `list` — triage open queue by lane and review-state bucket. - `view <num>` — agent-friendly review brief for a single PR. - `classify [num]` — emit script-level tags for one PR or the whole open queue; full-queue JSON output lands under `.tmp/tools-pr/classify/` with rate-limit telemetry per run. - `assignment` — assigner-perspective view of PR ownership, idle time, and blockers (derived from existing tags; no new judgments). Tag dictionary (13 tags) covers: bot-only-approval, needs-rebase, forbidden-surface, unlabeled, duplicate-title, non-ascii-slug, maintainer-edits-disabled, org-member, unresolved-changes-requested, stale-approval, and three awaiting-* timing tags. Each rule is expressible as one factual sentence over `gh` data + repo paths — see `tools/pr/AGENTS.md` for the full dictionary plus precision rules. Templates in `tools/pr/templates/.md` are aesthetic references for recurring maintainer comments (duplicate-title ask, awaiting-author nudge, agent-review brief shape). `templates/examples/` holds frozen-in-time agent-review snapshots for three PR shapes. Infrastructure: - `gh()` wraps `execFile` with minimum-touch retry (2 attempts at 1s + 2s backoff) on transient 5xx / network errors. Persistent failures still surface — retry is anti-jitter, not an exponential-backoff resilience layer. - Heavy chunks (`reviews`, `comments`, `commits`, assignment timelines) use cursor-paginated `gh api graphql` via `fetchPaginatedPrList` to stay under GitHub's GraphQL server-side timeout. Light chunks stay on `gh pr list --json`. 
- `fetchOrgMembers` cached per process via `gh api orgs/<owner>/members --paginate`. Wiring: - Root `package.json` adds `pnpm tools-pr` to the allowed root entry points. - `scripts/postinstall.mjs` builds `tools/pr` alongside other workspace packages. - `scripts/guard.ts` allowlists `tools/pr/bin/tools-pr.mjs` and `tools/pr/esbuild.config.mjs`, and adds `pr/` to the `tools/` top-level layout allowlist. - Root `AGENTS.md` and `tools/AGENTS.md` document the new command surface, root-command-boundary update, and per-tool ownership. docs(agents): brief tools-pr in root AGENTS.md, link to tools/pr/AGENTS.md Adds a `PR-duty tooling` section to the root AGENTS.md summarising what `pnpm tools-pr` is, listing the four common subcommands (list / view / classify / assignment), and pointing readers to `tools/pr/AGENTS.md` for the full tag dictionary, operational playbook, templates, and design rules. The section keeps root-level guidance to high-level orientation while details stay local to the tool's own AGENTS.md. * fix(tools-pr): drop overly broad touches-root-package.json forbidden hit `deriveForbidden` was flagging any change to root `package.json` as a forbidden-surface hit, but AGENTS.md §Root command boundary only forbids specific lifecycle aliases (pnpm dev / test / build / daemon / preview / start) — tools-control-plane entrypoints like `pnpm tools-pr` are explicitly allowed. Distinguishing "forbidden alias" from "allowed entry" requires reading the diff content, which is `pnpm guard`'s job rather than a path-derived classify tag. Dogfooded on this branch's own PR (#1259), which added the `pnpm tools-pr` script and was incorrectly flagged. Removing the hit aligns the `forbidden-surface` tag with what tools-pr can mechanically detect from file paths alone (apps/nextjs/, packages/shared/). 
* fix(tools-pr): paginate commits fetch, recognise ready-to-merge, escape title-index separator Three review follow-ups on #1259, all factual fixes: - `fetchOpenPrCommits` now uses `fetchPaginatedPrList` instead of a one-shot `pullRequests(first: $first)` query. GitHub GraphQL caps connection page size at 100, so the previous implementation would fail at runtime when callers passed `--limit > 100`. The paginated path makes the commits fetch consistent with the other heavy chunks (reviews, comments, assignment timelines) and removes the artificial ceiling entirely. The `limit` parameter is dropped from `fetchOpenPrCommits`; the CLI `--limit` continues to bound the `gh pr list --json` chunks. - `deriveStatus` in `assignment.ts` now reads `facts.reviewDecision` and `facts.mergeStateStatus`. When the PR is `APPROVED` with merge state `CLEAN` or `UNSTABLE` and carries no blockers, status renders as `ready to merge` instead of falling through to `in review`. The assignment view loses its main triage signal without this — a clean human-approved PR rendered identical to a REVIEW_REQUIRED one. - `tags.ts:tagDuplicateTitle` and `tags.ts:buildContext` both constructed the title-index key with a literal NUL byte between author and title, which made the file appear as binary in `git diff` / review tooling. Replaced the literal byte with a Unicode escape sequence in source; the runtime string value is identical, the source stays plain text and round-trips through review tooling cleanly. * fix(tools-pr): raise default --limit to 1000 to cover the live open queue mrcfps flagged that `tools-pr list` (and `classify --all`, `assignment`) defaults to `--limit 100`, which silently drops every PR past the first 100 in the open queue. The repo currently sits at 104 open PRs, so the out-of-the-box run was already omitting four PRs. 
Raise the default to 1000 in `list.ts`, `classify.ts`, and `assignment.ts`, and remove the now-pointless 200 ceiling — `gh pr list --limit N` paginates internally, so a high cap is cheap. Users can still pass `--limit <small>` for a truncated preview. CLI help text on the three subcommands updated to match. * fix(web): pass designTemplates to ProjectView render helper #955 made `designTemplates` a required Prop on ProjectView, but the test helper added in #1244 (`renderProjectView` in `ProjectView.api-empty-response.test.tsx`) was never updated. The two PRs landed on main without conflicting, leaving `apps/web` typecheck red for every PR that rebases past `b5eb8c16`. Pass `designTemplates={[] as SkillSummary[]}` alongside the existing `skills={[] as SkillSummary[]}` so the helper compiles. The component already treats the array shape (empty included) as a no-op fallback in the empty-response paths the test exercises. * fix(tools-pr): correct author signal + merge inline review comments Two correctness gaps in the awaiting-* signal pipeline surfaced during review of the new tools-pr commands: 1. `authorSignalAt` iterated every PR commit unconditionally. On `maintainerCanModify=true` PRs a maintainer's follow-up push would advance the author timestamp, masking a stalled author response. Filter commits to those whose `authorLogin` matches `facts.author`, mirroring the same filter already applied to comments. 2. `fetchOpenPrComments` (and `fetchView`) only fetched `pullRequest.comments` / `gh pr view --json comments`, which is the issue-conversation thread. Inline review-thread replies — where authors and reviewers actually exchange most fix-up replies — live in `reviewThreads.comments` / REST `pulls/{n}/comments`. Missing them let `humanReviewerSignalAt` / `authorSignalAt` and the `view` brief point at the wrong side after someone replied inline. 
Extend the list-mode GraphQL to also sweep `reviewThreads(last: 20).comments(first: 20)`, and add a parallel REST inline-comments fetch in `fetchView` that merges into `GhView.comments`.	2026-05-11 19:17:21 +08:00
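The author-signal fix above (count only the PR author's own commits and comments) can be sketched as follows. This is a hedged illustration: `PrFacts`, `TimedEvent` and the `at` field are hypothetical shapes, and only `authorSignalAt`, `authorLogin` and `facts.author` are names taken from the commit message. It assumes inline review-thread comments have already been merged into `comments`.

```typescript
// Illustrative sketch only; the fact shapes are assumptions, not the tool's real types.
interface TimedEvent {
  authorLogin: string;
  at: string; // ISO-8601 timestamp
}

interface PrFacts {
  author: string;
  commits: TimedEvent[];
  comments: TimedEvent[]; // assumed to already include inline review-thread replies
}

// Latest moment at which the PR author (not a maintainer pushing to the branch)
// committed or commented; null when the author has produced no signal at all.
function authorSignalAt(facts: PrFacts): string | null {
  const own = [...facts.commits, ...facts.comments].filter(
    (event) => event.authorLogin === facts.author,
  );
  if (own.length === 0) return null;
  return own.reduce((latest, event) =>
    Date.parse(event.at) > Date.parse(latest.at) ? event : latest,
  ).at;
}
```

Filtering before taking the maximum is the point: a later maintainer push on a `maintainerCanModify` PR no longer advances the author's timestamp.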
Tom Huang	b5eb8c1647	feat: generic skills + split skills/design-templates + finalize-design API (#955 ) * feat: general-purpose skills with @-mention composition and user import Lift skills from "one mode-bound skill per project" to a generic capability the user can compose per turn: - Daemon: scan multiple skill roots (user-skills under runtime data, then the bundled `skills/`); user-imported skills can shadow built-ins by id. - New `POST /api/skills/import` and `DELETE /api/skills/:id` endpoints, with CONFLICT/BAD_REQUEST/NOT_FOUND error codes and built-in delete protection. - ChatRequest gains `skillIds: string[]`; the chat run concatenates each picked skill's body (and merges craftRequires) into the system prompt for that turn only — the project's persistent `skillId` is untouched. - Web composer: `@` popover now lists skills alongside project files; picks render as removable chips above the textarea and ride along with the request as `skillIds`. - Settings → Library: import form (name/description/triggers/body), per-card delete for user skills, "user" origin badge. * chore(web): drop welcome pet teaser + add ds→prompt-template mapping util - SettingsDialog: remove the inline pet adoption teaser from the welcome panel so the first-run modal stays focused on configuration. - New `inferPromptTemplateCategoriesForDs(ds)` helper that maps a design system's authored metadata to prompt-template gallery categories. Imported by the design-system gallery wiring on a sibling branch; no callers in this branch yet. * feat: split skills/design-templates and add finalize-design API Phase 0 of the skills/design-templates refactor (specs/current/ skills-and-design-templates.md): - Move ~104 rendering catalogue entries from skills/ to design-templates/ and keep skills/ for the small set of functional skills that do work on user input (utilities, briefs, packagers). 
- Add design-templates/AGENTS.md and skills/AGENTS.md describing the contract, and a brand-agnostic craft/ surface for opt-in craft rules. - Daemon: add DESIGN_TEMPLATES_DIR / USER_DESIGN_TEMPLATES_DIR roots and an /api/design-templates surface mirroring /api/skills. Asset/example routes still span both registries so existing srcdoc URLs keep resolving across the rename. - Web: split LibrarySection into SkillsSection + DesignSystemsSection, rename the EntryView "Examples" tab to "Templates", and update locales + the New-project picker accordingly. Adds the finalize-design endpoint: - New apps/daemon/src/finalize-design.ts and packages/contracts/src/api/ finalize.ts — one-shot synthesis of a project's transcript + active design system + current artifact into <projectDir>/DESIGN.md via the Anthropic Messages API. Per-project .finalize.lock mirrors the transcript-export hygiene from PR #493; provider credentials are not persisted by the daemon. Other supporting changes: - README + AGENTS.md updates to document the new directory split and craft/ surface, plus i18n strings across 13 locales. - Test refactors and new coverage (finalize-design, runs, sidecar server, plus refreshed daemon integration tests). - .gitignore: scope the .exe ignore to /OpenDesign.exe so legitimate vendor binaries are no longer hidden. fix(merge): move clinical-case-report to design-templates/ Origin/main added the clinical-case-report skill under skills/ before the skills/design-templates split landed. Its od.mode is prototype, so per specs/current/skills-and-design-templates.md it is a design template and belongs alongside the other rendering catalogue entries — not under the slimmed-down functional skills/ root. Moving it keeps the EntryView Templates tab consistent with origin/main's intent. 
* feat(skills): curated design/creative catalogue + collapsible Settings rows Seed ~100 curated design/creative skill stubs under skills/ sourced from awesome-claude-skills (ComposioHQ) and awesome-agent-skills (VoltAgent). Each stub carries an od.category tag so the new filter pill row in Settings -> Skills can group them. The seed script (scripts/seed-curated-design-skills.ts, pnpm seed:curated-design-skills) is idempotent: it only creates folders that don't already exist, so hand-edited stubs are never overwritten. - Daemon: parse and surface od.category on SkillInfo with a strict slug normaliser; mirror the field on SkillSummary in @open-design/contracts. Category is purely a UI hint — system-prompt composition is unchanged. - Web: rewrite SkillsSection from a left-list / right-detail grid into a vertical stack of collapsible rows mirroring the External MCP panel (header always visible with name + mode/source/category pills + per-row enable toggle; SKILL.md preview, file tree and inline edit form expand on demand). Add a Category filter row above the list. Reorder Settings nav so Skills + External MCP sit above the Composio/MCP cluster. Update composer placeholder/hint across 17 locales to advertise '@ files or skills · / for commands'. - Docs: extend skills/AGENTS.md with the curated catalogue rules (idempotency, category vocabulary, no upstream vendoring). 
Co-authored-by: Cursor <cursoragent@cursor.com> * test(skills): teach localized-content + system-prompt tests about the skills/design-templates split mrcfps blocking review on PR #955: the skills/design-templates split (`b5993385`) moved ~110 SKILL.md entries out of `skills/` and into `design-templates/`, but two repo-level tests still hard-coded the single-root layout, so CI gates went red on the merged branch: - `e2e/tests/localized-content.test.ts` only scanned `<repo>/skills` while the locale `skillCopy` map keeps id-keyed entries spanning both roots (ExamplesTab/Templates uses one lookup regardless of origin). Teach the helper to read both `skills/` and `design-templates/`, deduplicating ids so the union matches the localized claim. - `apps/daemon/tests/prompts/system.test.ts` read `skills/live-artifact/SKILL.md`, which now lives under `design-templates/live-artifact/`. Update the absolute path so composeSystemPrompt's coverage of the live-artifact preamble is exercised again. Also enroll the curated design/creative catalogue (PR #955, ~91 stubs sourced from awesome-claude-skills / awesome-agent-skills) in the DE / FR / RU `_SKILL_IDS_WITH_EN_FALLBACK` lists. The stubs are English-only by design (frontmatter advertises an upstream URL); the fallback list is exactly the place to acknowledge "we know this id exists, English copy is fine here" so the localized-content coverage gate passes without forcing a translation task per locale. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(skills): always quote frontmatter name so importUserSkill round-trips numeric / boolean ids mrcfps PR #955 review: `buildSkillMarkdown` emitted `name: ${escapeYamlString(name)}` without quotes, so YAML coerced names like `123`, `true`, `false`, or `null` into non-string scalars on re-parse. 
listSkills() then read `data.name` as a number/boolean and the import flow's follow-up `findSkillById(skills, result.id)` missed it, falling into `/api/skills/import`'s "imported skill could not be re-read" 500 path for those ids. Switch the emitter to a quoted scalar (`name: "..."`) — the double-escape already in `escapeYamlString` makes the quoted form safe — and add a round-trip test covering `123`, `true`, `false`, `null`, and `0` to lock in the contract. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): drop staged-skill chips when the matching @<id> token leaves the draft mrcfps PR #955 review: `submit()` always forwarded every id in `stagedSkills`, but that state was only mutated on picker click and chip removal. Hand-deleting an `@<id>` token from the textarea left the chip staged, so the request still carried `skillIds: [<id>]` and the daemon composed a skill the prompt no longer referenced. Sync the chips with the draft inside `handleChange()` by pruning `stagedSkills` whenever the new value no longer contains the `@<id>` token (using the same whitespace boundary as `removeStagedSkill`'s strip regex). Comment explains why this prune does not run for `staged` file attachments — users frequently add files via the upload button without leaving an `@<path>` token, so a symmetric prune there would erase legitimate uploads. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): stage @-composed skills' side files alongside the active skill codex PR #955 review: composing a per-turn `@`-picked skill into the system prompt appended its body (with the `withSkillRootPreamble` guidance pointing at relative paths under `<cwd>/.od-skills/<folder>/`) but never staged the actual folder. 
`startChatRun` only copied `activeSkillDir`, so when the project's primary skill was different (or absent) the composed skill's references/, examples/, and scripts/ files lived only at their absolute repo path — agents that honour the cwd-relative form (or that don't get `--add-dir`, e.g. Codex with allowlisted gpt-image projects) couldn't reach them. Thread the composed skills' dirs out of `composeDaemonSystemPrompt` as `extraSkillDirs` and stage each one through the same `stageActiveSkill` API used for the primary skill. Dedupe by folder basename so a project whose primary skill is also `@`-composed isn't copied twice. Each preamble already advertises its own folder, so the prompt and the staged tree stay aligned without further changes. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): respect the Library disable toggle in the project @-mention picker codex PR #955 review: only `EntryView` received `enabledSkills` (filtered against `config.disabledSkills`); active projects still got `skills={skills}` raw, so a skill the user disabled in Settings kept appearing in the project's `@`-mention popover and could ride along to the daemon via `skillIds`. That broke the Library toggle for any project opened on the post-split branch. Compute a functional-skills-only enabled subset (`enabledFunctionalSkills`) and pass it into `<ProjectView>` instead. Templates stay separate — design-templates are filtered through their own `enabledDesignTemplates` memo for the Templates gallery — so ProjectView's chat composer still only sees skills, never templates, matching the pre-split prop surface. Co-authored-by: Cursor <cursoragent@cursor.com> * test(e2e): mock /api/design-templates for example-use-prompt flow The Templates tab in EntryView fetches from /api/design-templates after the skills/design-templates split (specs/current/skills-and-design-templates.md). 
The example-use-prompt Playwright scenario only mocked /api/skills, so the gallery card never appeared and the test timed out waiting on example-card-warm-utility-example. Serve the same fixture summary on both endpoints so the templates gallery renders the card the test clicks. Co-authored-by: Cursor <cursoragent@cursor.com> * test(tools-pack): create design-templates fixture for resources test The packaging resources copy now bundles the new design-templates tree alongside skills (see resources.ts BUNDLED_RESOURCE_TREES). The copyBundledResourceTrees fixture only created skills, design-systems, craft, etc., so the recursive copy crashed with ENOENT on design-templates before it could check the prompt-templates assertion. Add the missing fixture directory so the test exercises the same set of resource trees the packaged build does. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(skills): clone built-in side files into the shadow on first edit mrcfps PR #955 review: editing a built-in skill wrote a USER_SKILLS_DIR shadow folder that contained only a new SKILL.md. The next listSkills() pass surfaced the shadow as the active dir, but every side-file resolver (/api/skills/:id/files, /example, /assets/, the system-prompt preamble, and the per-turn cwd staging) reads through skill.dir. With nothing but SKILL.md in the shadow, the bundled assets/, references/, scripts/, and examples/ disappeared the moment the user hit save — a built-in like last30days or live-artifact would break immediately after edit instead of just having its body overridden. Teach updateUserSkill() to take a `sourceDir` and clone every entry except SKILL.md / dotfiles into the shadow on the very first edit. The shadow stays self-contained, so all the resolvers keep working without fallback bookkeeping. Subsequent edits detect the existing shadow and skip the clone, so user tweaks under the side tree survive a re-save. 
Wire `sourceDir: skill.dir` from server.ts's PUT /api/skills/:id handler and add two regression tests: - 'clones built-in side files into the shadow on the first edit' walks the file tree after save and asserts assets/template.html, references/ notes.md, and scripts/helper.sh all round-trip from the built-in. - 'preserves user-edited side files on subsequent edits' edits the staged assets/template.html, re-saves, and confirms the user content is still there. Co-authored-by: Cursor <cursoragent@cursor.com> test(e2e): rename home tab from Examples to Templates The Examples tab was renamed to Templates in EntryView (b5993385's skills/design-templates split — entry.tabExamples became entry.tabTemplates and the tab value moved from 'examples' to 'templates'), but entry-chrome-flows still asserted the old label and testId. Update both. * fix(skills+web): preserve template body in API mode and dir-based skill delete Two follow-ups from PR #955 review: 1. ProjectView only received `enabledFunctionalSkills`, but `composedSystemPrompt()` still resolved `project.skillId` through that prop and `fetchSkill()`. Projects created from the new `/api/design-templates` surface keep a template id in `project.skillId`, so opening one in API mode dropped the template body from the system prompt and the upstream request ran without the project's primary template instructions. Now ProjectView takes a separate `designTemplates` prop (the unfiltered template list, so a later-disabled template still loads for projects already created from it) and `composedSystemPrompt()` plus the metadata / `isDeck` lookups fall back to that list, with `fetchDesignTemplate()` as the body-fetch fallback to `fetchSkill()`. The chat composer's `@`-picker keeps receiving only the enabled functional skills. 2. `DELETE /api/skills/:id` used `deleteUserSkill(USER_SKILLS_DIR, skill.id)` which re-slugified the frontmatter id and removed `<userSkillsDir>/<slug>/`. 
That matched the import shape but missed the install shape — `installFromTarget` writes the folder at `sanitizeRepoName(url)` (GitHub) or `path.basename(realpath)` (local symlink), neither of which is guaranteed to equal the slugified frontmatter `name`. A duplicate `app.delete('/api/skills/:id', ...)` handler at the install routes never fired because Express resolved the earlier registration first, leaving the install/uninstall path without working teardown. The handler now removes `skill.dir` (the absolute path listSkills already discovered) under a USER_SKILLS_DIR safety check, using `lstat` + `unlinkSync` so symlinked local installs unlink cleanly without recursing into the user's source tree. The dead duplicate handler is removed; `deleteUserSkill` is dropped from the server.ts import set (still exported and unit-tested in skills.ts). Regression coverage in `apps/daemon/tests/skills-delete-route.test.ts` pins both shapes plus the symlink-preserves-source case. * test(daemon): point hyperframes system-prompt test at design-templates The merge with main brought in a hyperframes system-prompt test that reads `skills/hyperframes/SKILL.md`, but this branch's split moved `hyperframes` into `design-templates/` (same migration as `live-artifact` already handled above in this file). CI was failing with ENOENT on the old path. --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 17:48:34 +08:00
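The staged-skill pruning described in this commit (drop a chip once its `@<id>` token leaves the draft) can be sketched like this. `pruneStagedSkills` and `escapeRegExp` are hypothetical names, and the whitespace boundary is an assumption about what `removeStagedSkill`'s strip regex does rather than a copy of it.

```typescript
// Illustrative sketch only; names and the exact token boundary are assumptions.

// Escape regex metacharacters so a skill id is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Keep only the staged ids whose `@<id>` token still appears in the draft,
// bounded by whitespace or by the start/end of the text.
function pruneStagedSkills(draft: string, stagedSkillIds: string[]): string[] {
  return stagedSkillIds.filter((id) =>
    new RegExp(`(^|\\s)@${escapeRegExp(id)}(?=\\s|$)`).test(draft),
  );
}
```

Calling this from the composer's change handler would keep the `skillIds` sent with the request in step with what the prompt text still references. File attachments are deliberately not pruned this way, for the reason given in the commit message.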
PerishFire	a797e079b1	fix(desktop): exit fullscreen before hiding window on macOS close (#1249)
* fix(desktop): exit fullscreen before hiding window on macOS close (#1215) When a preview is in Present → Fullscreen (演示 → 全屏) mode, the macOS close handler called window.hide() directly, leaving the OS fullscreen Space orphaned as a black screen: the window vanished but the Space stayed up. Extract hideWindowExitingFullscreen as the named invariant ("hide, but first leave fullscreen so the OS Space tears down with the window") and route the darwin close handler through it. The hide is deferred until 'leave-full-screen' fires so we don't race the OS Space teardown. Bootstraps Vitest on apps/desktop with a single test under tests/main/hide-window-exiting-fullscreen.test.ts that exercises the helper through a structural mock; the bug shape is pure logic, no real Electron window required. Spec was red against a hide-only helper and green after the leave-full-screen sequencing.
* docs(agents): codify bug follow-up workflow Distill the spec-first / cheapest-layer / scope-discipline / invariant-shaped-fix / baseline-diff playbook used recently on #135 and #1215 into a top-level subsection of root AGENTS.md, framed as a default action shape with explicit room for case-by-case judgment rather than a hard rule. Includes a single pointer back to the worked example spec.
* docs(agents): require staged human verification for visible bugs Add the human-verification gate as a sixth bullet in the Bug follow-up workflow. UI / platform-native / animation symptoms can pass green specs and still ship the visible regression, as proven by #1215, where the desktop unit test green-lighted the helper logic but only a side-by-side buggy-vs-fix run on a real macOS Space proved the black screen actually went away. Reinforces the production-API-only seed constraint while we're there: source-level backdoors prove a fake flow, not the real one, so they invalidate the verification. 
* fix(desktop): defer hide across the fullscreen-enter transition (#1215) mrcfps observed on PR #1249 that the close handler only catches windows already in fullscreen — Electron's enter-full-screen event is async on macOS, so isFullScreen() can still read false during the OS Space transition triggered by requestFullscreen(). A close in that window took the plain hide() path and stranded the same black Space the fix was meant to eliminate. Track in-flight fullscreen entry from webContents.enter-html-full-screen (set) and BrowserWindow.{enter,leave}-full-screen (clear), and surface it through WindowFullscreenSurface.isEnteringFullscreen. The helper now parks on enter-full-screen until the OS confirms the Space, then runs the existing exit-then-hide path. Adds a regression test ("waits out a fullscreen-enter transition before exiting and hiding") that goes red against the previous helper.	2026-05-11 17:04:42 +08:00
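The sequencing described in these two desktop fixes can be sketched against a structural interface, much as the commit's own test is said to do. This is a hedged reconstruction, not the shipped helper: `hideWindowExitingFullscreen`, `WindowFullscreenSurface` and `isEnteringFullscreen` are named in the message, but the member list and control flow below are assumptions. `isFullScreen`, `setFullScreen`, `once` and `hide` correspond to Electron `BrowserWindow` members.

```typescript
// Illustrative sketch only; the interface members and control flow are assumptions.
interface WindowFullscreenSurface {
  isFullScreen(): boolean;
  isEnteringFullscreen(): boolean; // true while the OS is still animating into fullscreen
  setFullScreen(flag: boolean): void;
  once(event: "enter-full-screen" | "leave-full-screen", listener: () => void): void;
  hide(): void;
}

function hideWindowExitingFullscreen(win: WindowFullscreenSurface): void {
  // Leave fullscreen first and hide only once the OS confirms the Space is gone.
  const exitThenHide = (): void => {
    win.once("leave-full-screen", () => win.hide());
    win.setFullScreen(false);
  };

  if (win.isEnteringFullscreen()) {
    // Entry is still in flight: wait for it to settle, then run the normal exit path.
    win.once("enter-full-screen", exitThenHide);
  } else if (win.isFullScreen()) {
    exitThenHide();
  } else {
    win.hide();
  }
}
```

Because the helper depends only on this small surface, a plain object that records calls is enough to test the ordering without a real Electron window.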
Caprika	f7f2661bda	[codex] Handle empty API responses as no output (#1244) * Handle empty API responses as no output * Fix empty API response comment cleanup * Stabilize API empty response detection	2026-05-11 16:57:02 +08:00
nettee	e859c31574	fix(web): complete finished tool calls missing results (#1240)	2026-05-11 15:54:11 +08:00
Tom Huang	e254d1280b	feat(memory): auto-memory store with chat-protocol-aware extraction (#999 ) * feat(memory): auto-memory store with chat-protocol-aware extraction Markdown memory store at <dataDir>/memory/ with two extractors — heuristic regex for explicit "remember:" / "我是 X" markers, and a small-model LLM pass after each turn — folded into the system prompt so cross-chat preferences, role, and ongoing-work context survive restarts. Settings UI: - Memory tab lists entries, exposes a hand-edited MEMORY.md index, and shows an extraction history with per-attempt phase/skip/failure rows. - Memory model picker is inline next to the chat model picker (CLI and BYOK) so the choice "which fast model mines facts each turn?" sits next to the chat-model decision instead of a separate panel. The picker reuses the same SUGGESTED_MODELS table and "Custom..." pattern the chat picker uses. LLM extractor supports all four protocols (anthropic / openai / azure / google); pickProvider takes the chat agent id from the chat handler and constrains its auto-pick to the chat's protocol family — Claude Code chats no longer surprise users by silently extracting on whatever OpenAI key happens to be in media-config. When no matching key is configured the attempt records as 'skipped: no-provider' instead of quietly switching vendors. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): keep hint outside <label> and disambiguate Model selectors The inline Memory model picker wrapped its hint paragraph inside the <label>, which made the hint's "API key" / "model" wording bleed into the <select>'s accessible name and broke Playwright's getByLabel('API key') / getByLabel('Model') strict-mode matching in the existing settings-api-protocol e2e suite. - Move the hint <p> out of the <label> in MemoryModelInline so the select's accessible name is just "Memory model". 
- Switch the chat-Model selectors in settings-api-protocol.test.ts from getByLabel('Model') to getByRole('combobox', { name: 'Model', exact: true }) so they no longer collide with the new "Memory model" select that sits next to the chat Model picker. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): address review changes — BYOK wiring, MEMORY.md index, /v1, label wrapper Addresses the four blocking review threads on PR #999. 1. MemoryModelInline accessibility (mrcfps) The inline picker still wrapped its select + custom input + flash + hint inside a single <label>, which made the select's accessible name absorb every text descendant — including the "API key" / "model" hint copy. The previous fix moved only the hint outside; the reviewer asked for a non-label wrapper. Switch to <div className="field"> and associate just the short title with the controls via `aria-labelledby` / `aria-label`. The select's accessible name is now exactly "Memory model" so `getByLabel` strict-mode locators on the surrounding chat form stop cross-matching the memory copy. 2. Respect the hand-edited MEMORY.md index (mrcfps + codex) `composeMemoryBody()` was reading every .md file in the memory dir, ignoring the index. Removing a `- [Name](id.md)` line had no effect on future prompts. Parse the index's `INDEX_LINK_RE` bullets and filter `listMemoryEntries()` to the linked id set, so the editor's "delete this line to disable injection" promise actually holds. 3. Versioned OpenAI-compatible base URLs (codex) `callOpenAI` and `callAnthropic` hard-coded `/v1` onto `provider.baseUrl`, breaking custom endpoints whose saved URL already includes `/v1` (`/v1/v1/chat/completions`). Apply the same conditional `appendVersionedApiPath` helper the chat proxy and connection-test routes already use. 4. Wire memory into BYOK / API-mode chats (mrcfps + codex) The previous PR's daemon-only memory hook never fired for BYOK, leaving the Memory tab + model picker as a no-op for that mode. 
Add the missing surface and wire it through ProjectView: - contracts: extend `composeSystemPrompt` with `memoryBody`, mirroring the daemon's local composer; add `MemorySystemPromptResponse` and the `attemptedLLM` flag on `ExtractMemoryResponse`. - daemon: expose `GET /api/memory/system-prompt` (returns the composed body) and turn `POST /api/memory/extract` into a two-phase endpoint — heuristic-only when only userMessage is supplied (pre-turn), LLM-only when assistantMessage is also supplied (post-turn), so the extraction-history doesn't double up. - web: ProjectView's BYOK branch now fetches the memory body before composing the system prompt, runs the heuristic extractor before the run (so "remember:" markers in this turn reach this turn's prompt), accumulates assistant text during streaming, and queues the LLM extractor on `onDone` — fire-and-forget so it never blocks the chat round-trip. Co-authored-by: Cursor <cursoragent@cursor.com> fix(memory): re-sync BYOK memory override when chat config drifts The inline memory-model picker captured `apiProtocol` / `chatApiKey` / `chatBaseUrl` / `chatApiVersion` into the saved override only at the moment the user clicked a model. If they later swapped the BYOK protocol tab, rotated the API key, or edited the base URL in the same settings flow, the daemon's background extractor kept calling the old vendor / credential — directly contradicting the picker's "borrows the surrounding chat picker's protocol, key, base URL, and api-version automatically" promise. Add a debounced effect that compares the persisted (masked) shape against the live chat props and re-PATCHes /api/memory/config when they drift. The masked config exposes `apiKeyTail` (last 4 chars), so key rotation is detectable without ever round-tripping the secret back to the browser. 
The 300 ms debounce coalesces the keystroke-granularity prop updates the parent settings dialog streams during its autosave loop, so a user editing the base URL doesn't trigger one PATCH per character. Background re-syncs are silent — the "Saved!" flash only fires for explicit user clicks, so the picker doesn't feel like it's fighting them as they edit unrelated chat fields. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): thread BYOK chat config through /api/memory/extract default path Leaving the BYOK memory picker on "Same as chat" still broke the default LLM extraction path: `MemoryModelInline` clears the override for that option, both `/api/memory/extract` calls in `ProjectView` only sent the messages, and the daemon never persists BYOK creds, so `extractWithLLM(..., { chatAgentId: null })` always reached `pickProvider()` with no chat context and fell through to env / media-config — the wrong vendor for a BYOK chat that works for inference. Thread the live BYOK chat config through the extract endpoint as a per-call snapshot: - contracts: extend `ExtractMemoryRequest` with an optional `chatProvider` (provider/apiKey/baseUrl/apiVersion/model) and add `'chat-byok'` to the credentialSource enum. - daemon: parse + validate `chatProvider` on `/api/memory/extract` (provider must be one of the five known shapes) and forward to `extractWithLLM` as a new option. `pickProvider()` gets a new path 2 that uses the snapshot directly with the per-protocol fast-model default — so a memory pass on `gpt-4o` / `claude-sonnet-4-5` silently turns into a cheap `gpt-4o-mini` / `claude-haiku-4-5` call instead of paying chat-tier rates for sediment work. Override and CLI-agent-constrained paths still win when they apply. - web: `ProjectView` snapshots `apiProtocol` / `apiKey` / `baseUrl` / `apiVersion` from the live `AppConfig` on each BYOK extract call (both pre-turn heuristic-only and post-turn LLM phases). 
The picker's existing drift-resync effect already covers explicit overrides; this snapshot covers the implicit "Same as chat" default that the override flow can't reach. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): treat empty apiKey on PATCH as a real clear MemoryModelInline silently re-PATCHes /api/memory/config whenever the surrounding BYOK chat creds drift. The previous reuse branch lumped `apiKey === ''` together with `apiKey === undefined`, so clearing the chat API key from the picker quietly preserved the old daemon-side secret and kept calling the provider on a stale credential. Distinguish four states for the apiKey field: - absent -> preserve stored secret (form re-save without re-typing) - '' -> clear stored secret (user removed it from the picker) - 'sk-...' -> replace - new provider -> ignore stored secret entirely Add tests/memory-config-route.test.ts covering all four cases. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 15:45:42 +08:00
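The four apiKey states listed in the last fix of the commit above can be sketched as a small pure function. This is only an illustration under assumptions: `StoredMemoryConfig`, `MemoryConfigPatch`, and `resolveApiKey` are hypothetical names, not the daemon's actual types or route code.

```typescript
// Hypothetical shapes for illustration; the daemon's real types may differ.
interface StoredMemoryConfig {
  provider: string;
  apiKey?: string;
}

interface MemoryConfigPatch {
  provider: string;
  apiKey?: string; // absent, "" (explicit clear), or a new secret
}

// Returns the secret to persist after applying the patch,
// or undefined when no secret should be stored.
function resolveApiKey(
  stored: StoredMemoryConfig | undefined,
  patch: MemoryConfigPatch,
): string | undefined {
  if (patch.apiKey === "") {
    // '' -> the user removed the key, so clear the stored secret.
    return undefined;
  }
  if (patch.apiKey !== undefined) {
    // 'sk-...' -> replace with the newly supplied secret.
    return patch.apiKey;
  }
  // absent -> preserve the stored secret, but only for the same provider;
  // a new provider ignores the stored secret entirely.
  const sameProvider = stored?.provider === patch.provider;
  return sameProvider ? stored?.apiKey : undefined;
}
```

Keeping the empty string and `undefined` on separate branches is the point of the fix: a form re-save that omits the field keeps the secret, while an explicit empty value drops it.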
Tom Huang	e11e86d468	feat(hyperframes): land HTML-in-Canvas across web + skills (#866 ) * feat(hyperframes): land HTML-in-Canvas across web + skills Ships HTML-in-Canvas as a first-class HyperFrames video path: - 7 new video prompt templates (liquid glass, iPhone+MacBook, portal, shatter, magnetic, liquid background, text-cursor reveal). - skills/hyperframes/references/html-in-canvas.md, surfaced via SKILL.md description+triggers and the system-prompt pre-flight references list. - ChatPane starter prompts now branch by project kind and video model, so the hyperframes-html surface shows HTML-in-canvas-shaped prompts instead of the generic prototype trio. - NewProjectPanel propagates a picked template's model+aspect onto the project, and defaults videoModel to hyperframes-html when the hyperframes skill resolves for the video tab. Polish bundled in the same branch: - DesignFilesPanel empty state becomes a centered pill with a "New sketch" CTA; designFiles.empty copy simplified across 19 locales. - Topbar project title + meta render on one baseline row separated by a middot. - scripts/seed-test-projects.ts hardens daemon URL discovery against pnpm engine warnings on stdout. * fix(new-project): preserve explicit video model choice across tab revisits Latch a videoModelTouched guard once the user picks a model via the dropdown or via a template that declares one, so the hyperframes-html auto-default no longer silently overwrites the override when the Video tab is re-entered. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(i18n): register hyperframes html-in-canvas templates, category, and tags Adds the seven new prompt-template ids, the "VFX / HTML-in-Canvas" category, and the new tag set to the de/ru/fr i18n bundles so the e2e localized-content coverage test passes. 
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(daemon): inject html-in-canvas preflight for hyperframes runs The contracts-side derivePreflight() learned about references/html-in-canvas.md when this PR landed, but the daemon copy at apps/daemon/src/prompts/system.ts kept the older five-ref allowlist. server.ts:4138 wires composeSystemPrompt from the daemon copy into live chat runs, so the main HyperFrames flow this PR is meant to improve still wasn't auto-injecting the preflight directive in production. Mirror the html-in-canvas case into the daemon composer and lock it behind a daemon-side test so the two copies cannot drift again on this reference. The broader live-artifact preflight gap (artifact-schema / connector-policy / refresh-contract) is pre-existing drift and is intentionally out of scope here. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): restyle designs empty state as centered card on grid backdrop Swap the horizontal pill for a stacked card and add a faint grid backdrop so the empty designs surface reads as an intentional canvas rather than a gap. Title now wraps instead of truncating; container is taller. * fix(new-project): pin skillId to hyperframes when videoModel is hyperframes-html When the Video tab resolves its skill it used to fall back to `list[0]?.id` if no skill declared `default_for: video`. That list is built from an unsorted `readdir()` in apps/daemon/src/skills.ts, so a freshly mounted project could land on `video-shortform` even when the user had explicitly chosen the HyperFrames-HTML model (or one of the new `hyperframes-html-in-canvas-*` templates). The agent then ran without the hyperframes SKILL body or its `references/html-in-canvas.md` preflight — the exact regression PR #866 was meant to land. `skillIdForTab` now pins to `hyperframes` whenever the current video model is `hyperframes-html`, regardless of discovery order. 
Added a unit test that mounts both `video-shortform` and `hyperframes` (with hyperframes last, simulating the bad readdir order) and asserts the create payload routes through `hyperframes`. --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 15:45:12 +08:00
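The skill-pinning rule from the last fix in the commit above might look roughly like the sketch below. `SkillSummary` and `pickVideoSkillId` are hypothetical stand-ins rather than the real `skillIdForTab` signature, and the check that a `hyperframes` skill is actually present is an added assumption.

```typescript
// Hypothetical skill summary; the real discovery payload may carry more fields.
interface SkillSummary {
  id: string;
  defaultFor?: string[]; // stand-in for the `default_for` declaration
}

// Resolve the skill id for the Video tab without depending on list order
// when the HyperFrames-HTML model is selected.
function pickVideoSkillId(
  skills: SkillSummary[],
  videoModel: string | undefined,
): string | undefined {
  // Pin: hyperframes-html always routes through the hyperframes skill
  // (assumption: only when that skill was discovered at all).
  if (videoModel === "hyperframes-html" && skills.some((s) => s.id === "hyperframes")) {
    return "hyperframes";
  }
  // Otherwise prefer a skill that declares itself the video default...
  const declared = skills.find((s) => s.defaultFor?.includes("video"));
  // ...and only then fall back to discovery order.
  return declared?.id ?? skills[0]?.id;
}
```

With the pin evaluated first, an unsorted directory listing that happens to put `video-shortform` ahead of `hyperframes` no longer changes the result for `hyperframes-html` projects.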
PerishFire	31e57fd773	fix(daemon): persist runStatus/endedAt on chat run termination (#1230 ) * fix(daemon): persist runStatus/endedAt on chat run termination (#135) POST /api/runs created the run but never reconciled the messages row on terminal status. If the web failed to persist the cancel (refresh, dropped PUT), the row stayed at run_status='running' / ended_at=NULL, and on reload the elapsed timer kept climbing because the renderer fell back to now - startedAt. Mirror routine/orbit reconciliation: attach a wait-completion handler that updates run_status and ended_at, guarded by COALESCE and a run_status IN ('queued','running') filter so concurrent web persists are not clobbered. Adds cancelRun helper and two regression specs under e2e/tests/dialog/. * fix(daemon): annotate reconcile callback params for chat-routes The chat run reconciliation block landed in chat-routes.ts after the recent server-route split (#1043), where stricter type checking surfaces implicit `any` parameters. Annotate the wait/then callback as `{ status: string }` and the catch callback as `unknown`. * refactor(daemon): extract reconcileAssistantMessageOnRunEnd helper The inline if/wait/then/catch block in POST /api/runs read as a bolt-on patch. Lift it to a named file-scope helper so the route handler stays intent-level (start the run, arrange follow-up reconciliation) and the guard for missing assistantMessageId is an internal detail. The helper's docblock describes the invariant ("messages row reflects the run's terminal state even without web persist"); commit history keeps the issue context. * test(e2e): wait for any terminal status in stop-reconcile spec The earlier .catch fallback chained two waitForRunStatus calls (canceled then succeeded). waitForRunStatus throws on the first non-expected terminal, so a canceled run that resolves to failed (e.g. agent exits non-zero on SIGTERM) would still abort the test before reaching the messages-row assertion. 
Add waitForRunTerminal to e2e/lib/vitest/runs.ts: polls until any terminal status without throwing on mismatch, since this spec's claim is about the resulting messages row, not which terminal the run took. Addresses Codex inline review on PR #1230.	2026-05-11 15:37:52 +08:00
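The reconciliation guard described in the commit above (#1230) can be modelled in memory as follows. This is a sketch under assumptions, not the daemon's SQL: `MessageRow` and `reconcileOnRunEnd` are hypothetical names, and applying the COALESCE to `ended_at` is a guess at how the guard is wired.

```typescript
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "canceled";

// Hypothetical in-memory stand-in for a messages row.
interface MessageRow {
  runStatus: RunStatus | null;
  endedAt: number | null;
}

// Apply the run's terminal status to the row unless something else
// (for example a web-side persist) already moved it out of a live state.
function reconcileOnRunEnd(
  row: MessageRow,
  terminal: RunStatus,
  now: number,
): MessageRow {
  // Mirrors the `run_status IN ('queued','running')` filter:
  // rows that are already terminal are left untouched.
  if (row.runStatus !== "queued" && row.runStatus !== "running") {
    return row;
  }
  // Mirrors COALESCE(ended_at, now): keep an existing end time if one is set.
  return { runStatus: terminal, endedAt: row.endedAt ?? now };
}
```

Because the function is a no-op on rows that already reached a terminal state, running it after a successful web persist cannot overwrite that persist.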
nettee	ab922327f4	refactor(daemon): split agent runtime definitions (#1063 )	2026-05-11 15:01:55 +08:00
nettee	b1d440d2bd	refactor(daemon): split route registration (#1043 ) * spec * refactor(daemon): split server route registrars * refactor(daemon): group route registrar dependencies * refactor(daemon): move remaining domain routes out of server * update doc * revert spec * fix daemon route context contract Generated-By: looper 0.5.6 (runner=fixer, agent=opencode) * fix media task persistence Generated-By: looper 0.5.6 (runner=fixer, agent=opencode) * fix: restore daemon route registrations * fix: restore static resource mutation origin checks	2026-05-11 15:00:23 +08:00
PerishFire	976edaf38e	test: harden e2e smoke and release reports (#1140 ) * test: harden e2e inspect specs * test: wire e2e release reports * chore: bump packaged beta base to 0.6.1 * test: run release smoke vitest directly * test: add suite-owned tools-dev lifecycle * ci: harden stable release packaging * fix(release,e2e): gate stable signing on verify and harden suite cleanup - restore `needs: [metadata, verify]` on the stable release `build_mac`, `build_mac_intel`, `build_win`, and `build_linux` jobs so Apple signing/notarization and Windows release builds cannot run before pnpm guard, typecheck, and layout checks complete on the metadata commit. - in `runToolsDevSuite`, drop the `started` flag and always attempt `stopToolsDevWeb` in `finally`; record stop errors in diagnostics, and when the test body succeeded, escalate the stop failure to the suite result and rethrow — so orphan daemon/web processes from an interrupted `startToolsDevWeb` or a broken shutdown can no longer pass silently. Addresses PR #1140 review feedback from lefarcen and mrcfps.	2026-05-11 13:11:16 +08:00

437 commits