mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
141 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
bb13eee765
|
chore: optimize CI and beta release runtime (#2231)
* chore(ci): add runtime trace summaries * chore(ci): tighten measured workspace steps * chore(release): tighten beta setup steps * chore(release): slim beta windows smoke * chore(ci): shard daemon tests * chore(ci): harden runtime trace lookup * chore(release): avoid mac pnpm cache in beta * chore(ci): split critical playwright checks * chore(release): publish beta platforms from builders * test(e2e): update beta release workflow expectation * chore(ci): stop gating PRs on nix check * fix(release): keep beta latest complete |
||
|
|
86ec951fb9
|
[codex] Add automation templates and proposal workflows (#2193)
* feat(web): introduce Automations tab with dual-track capability for routines This commit adds a new Automations tab that consolidates routines, schedules, and live artifacts, allowing users to manage automations seamlessly. The tab features a modal for creating and editing automations, which supports various scheduling options (hourly, daily, weekdays, weekly) and project modes (create_each_run, reuse). The CLI is also updated to expose automation commands, ensuring consistency between the web UI and CLI interfaces. Key changes include: - New `NewAutomationModal` component for automation creation and editing. - Updated `TasksView` to integrate the new Automations functionality. - Enhanced styling for the Automations tab to improve user experience. This implementation aligns with the dual-track capability exposure policy, ensuring all features are accessible via both the web UI and CLI. * feat(daemon): enhance automation context handling and CLI commands This commit introduces several improvements to the automation context management and updates the CLI commands accordingly. Key changes include: - Added support for new context fields (`plugin`, `mcp`, `connector`) in automation commands. - Updated the CLI to reflect new target options (`new-project`). - Enhanced error messages for invalid target inputs. - Introduced functions to handle context selection and normalization for routines, including the ability to parse and store context data in the database. - Updated the database schema to include a new `context_json` field for routines. - Improved the handling of context in routine routes and the web interface, ensuring that selected contexts are properly managed and displayed. These changes aim to provide a more robust and flexible automation experience, aligning with the recent enhancements in the web UI. * feat(web): enhance TasksView with automation run history and status indicators This commit introduces several new features to the TasksView component, including: - Added functionality to display automation run history for each routine, showing metadata such as status, timestamps, and project details. - Implemented status indicators for routine runs, providing visual feedback on their current state (succeeded, failed, running, queued). - Enhanced the UI to allow users to expand and view detailed run history, including the ability to open the corresponding project conversation. - Updated styles to improve the presentation of automation statuses and history. These changes aim to provide users with better insights into their automation routines and improve overall usability. * feat(daemon): implement automation ingestion and proposal management This commit introduces several new features related to automation ingestion and proposal management within the daemon. Key changes include: - Added new modules for handling automation source packets and proposals, allowing for the storage, retrieval, and management of automation-related data. - Implemented functions to list, create, and apply automation proposals, enhancing the automation workflow. - Introduced new CLI commands for interacting with memory entries and automation sources, providing users with more control over their automation processes. - Enhanced the server routes to support automation source and proposal APIs, enabling seamless integration with the existing system. These changes aim to improve the overall automation experience, making it easier for users to manage and utilize automation proposals and ingestions effectively. |
||
|
|
eb127e0f79
|
Remove live artifact home chip (#2221) | ||
|
|
a38e09f931
|
fix(web): demote Plugins and Integrations to nav rail footer (#1806)
* fix(web): demote Plugins and Integrations to nav rail footer
Plugins and Integrations are platform-configuration surfaces, not
daily-use destinations. Moving them to the footer section of the
left nav rail — separated from primary items by a thin divider —
keeps them reachable while giving the primary four items
(Home, Projects, Automations, Design Systems) the visual weight
they deserve.
- EntryNavRail: remove Plugins/Integrations from the main __group
and place them in the __footer above the help launcher
- entry-layout.css: add __divider rule (1 px separator) to visually
mark the boundary between primary and secondary nav regions
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): remove settings dropdown, gear opens settings directly
The gear/cog button previously opened a dropdown that mixed three
unrelated concerns: community links (X, Discord), preference quick-
access (Language, Appearance), a feature shortcut (Use everywhere),
and a redundant Settings entry — creating two separate paths to the
same Settings dialog and duplicating Language/Appearance relative to
the Settings sidebar.
Changes:
- Gear button now directly opens the Settings dialog (no intermediate
dropdown layer)
- Follow @nexudotio on X and Join Discord moved to the Help menu at
the bottom of the nav rail, where community/external links belong
- Language and Appearance remain exclusively in the Settings dialog
- Use everywhere remains exclusively in the topbar chip
- Remove dead state (avatarMenuOpen, languageExpanded,
appearanceExpanded), ref, outside-click effect, and module-level
constants (APPEARANCE_THEMES, APPEARANCE_LABEL, describeModelChip)
that were introduced solely to support the dropdown
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(web): move output-type chips above chat input as tab bar
The home hero previously had two chip rows below the chat input that
mixed two unrelated dimensions: output type (Prototype, Slide deck,
Image…) and workflow source (Create plugin, From Figma, From folder,
From template). Users had no visual cue that these were different
categories.
This change separates the two dimensions clearly:
- Output type (create group) becomes a tab bar positioned above the
input card. Tabs share the same chip data and onPickChip dispatch,
so plugin selection, active state, and pending state are unchanged.
Active tab shows a colored underline; the bar border visually
connects to the input card below.
- Workflow source (migrate group) stays as the chip row below the
input card, now standing alone with unambiguous "how to start"
semantics.
- Subtitle updated from "Pick a plugin below" to "Pick a type"
to match the new placement.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): skip replace-prompt confirmation when switching output-type tabs
Output-type tab clicks (create group) are mode-selection gestures;
the user expects the prompt to update immediately without a dialog.
The confirmation is still shown for migrate-group chips (From Figma
etc.) where the replacement carries meaningful user-provided content.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): use brand logo as Home destination, drop redundant brand mentions
The entry nav rail previously rendered the brand logo and a separate
Home icon back-to-back; both invoked `onViewChange('home')`, so the
Home button was pure duplicate affordance. The hero pane also
displayed a third "Open Design" lockup that competed with the rail
logo and the rebranded title.
This collapses those affordances:
* `EntryNavRail` drops the dedicated Home `NavButton`. The brand
logo button now carries `aria-current="page"` and an `is-active`
visual state when the home view is showing, and its tooltip reads
"Open Design · Home" off-home so the navigation behavior stays
discoverable for new users.
* `entry-layout.css` adds the matching `.entry-nav-rail__logo.is-active`
accent treatment so the "you are here" cue reads at parity with
primary rail buttons.
* `HomeHero` removes the inline `home-hero__brand` lockup and the
associated CSS, then retunes the title/subtitle/type-tab spacing
so the headline group still pairs tightly with the type tabs and
input card below.
* `entry-chrome-flows` is updated to assert the logo carries the
active page treatment and that no `entry-nav-home` test id
resurfaces by mistake.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): auto-grow home chat input, disable internal scroll and drag-resize
The home hero textarea was capped at a fixed `min-height: 84px` with
`resize: vertical`, so users had to either drag the bottom-right
corner to enlarge it or scroll the textarea internally to read
longer prompts. That hid context (loaded plugin templates routinely
overflow three lines) and exposed a manual grip whose state was easy
to leave in an awkward height.
This change makes the chat box grow with its content:
* `HomeHero` adds a `useLayoutEffect` that, on every prompt change,
resets the textarea height to `auto` so the browser can measure a
smaller content, then writes back `scrollHeight` in pixels. The
effect uses layout phase (not effect phase) to avoid a one-frame
flash at the previous height when a plugin loads a long example
prompt.
* `home-hero.css` swaps `resize: vertical` for `resize: none` and
adds `overflow: hidden`, so the manual drag grip disappears and
the textarea never scrolls internally. `min-height: 84px` is kept
so the empty input still reads as a chunky chat box. The outer
page handles overflow when the prompt is genuinely very long.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): make output-type tabs read as folder-tabs attached to chat box
The previous tab bar used a small icon + label per tab and signaled
the active tab with a 2px accent underline sitting on a horizontal
divider line. That divider visually broke the relationship between
the tabs and the input card below — the active tab looked like a
free-floating header, not a flap of the chat surface, so users had
to do extra work to mentally connect "I picked Prototype" with the
prompt area immediately underneath.
This switches the tab bar to a folder-tab pattern:
* `HomeHero` drops the per-tab `Icon` element. The seven labels
(Prototype, Live artifact, Slide deck, Image, Video, HyperFrames,
Audio) already disambiguate at the type sizes used here, and the
icons were primarily decorative.
* `home-hero.css` rewrites the tab styling:
- The bar no longer paints its own baseline border; the input
card's top edge serves as the baseline.
- Each tab is a rounded-top container with a 1px transparent
border and a `-1px` bottom margin, so its bottom edge overlaps
the card's top border by exactly one pixel.
- The active tab borrows the card's panel background and border
color, and overrides its bottom border with the panel color
so it visually erases the card's top border for the tab's
width. The result reads as one continuous "Prototype is this
chat box" surface.
- The card's prior `margin-top: 8px` is removed so the tab bar
bottom and card top sit at the same y coordinate, which is the
geometric precondition for the 1px overlap to land cleanly.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): enlarge home chat attach and submit buttons for legibility
The attach (paper-clip) and submit (arrow-up) controls on the home
chat input rendered at 32px diameter with 14px and 18px glyphs
respectively. After stroke antialiasing the icons read at roughly
11–12px on typical displays — small enough that users reported the
glyphs were illegible and had to be discovered by trial-and-error.
This bumps both controls to 38px circles and grows their glyphs
(attach 14 → 18, submit 18 → 22). The primary call-to-action now
has clearly more visual weight than the surrounding muted hint
text, and the paper-clip is recognisable at a glance.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): paint a chevron on inline select slots in the home prompt
Inline select slots in the home prompt overlay (e.g. the
`high-fidelity` / `wireframe` fidelity picker on the Live artifact
template) rendered as plain pill highlights, visually identical to
free-text and read-only text slots. The slot's `appearance: none`
strips the browser's default chevron, so users had no affordance
hinting the value was switchable.
This wraps select-type slots in an `inline-flex` span and overlays
an explicit chevron-down `Icon` against the slot's trailing padding
(bumped from 18px to 22px to make room). The chevron inherits the
accent color via the wrap's `color` property so it themes correctly
in light and dark modes. Click semantics are unchanged because the
chevron is `pointer-events: none`.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): size inline number slots to their value, not the whole row
Number-type inline slots in the home prompt (e.g. the slide-count
spinner on the Slide deck template) inherited the generic slot
`min-width: 8ch` and then stretched to the browser's default
`<input type=number>` width, so a two-digit value like "10" ate
the entire remaining line and pushed the native spinner buttons
to the far right edge — far away from the value they control.
This sizes the number input to its actual content plus four
characters of trailing room for the spinner buttons (clamped to
6–14ch), and adds a `home-hero__prompt-slot-input--number`
modifier that overrides the slot `min-width` to 4ch. The
spinner now sits flush against the value and the slot reads as
a compact inline pill matching the surrounding text/select
slots.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): widen prompt line-height so slot pills do not collide vertically
Each interactive slot and mention in the home prompt paints a 2px
outline ring via `box-shadow`. At the previous `line-height: 1.55`
on a 15px font, two lines were ~23px apart while a single pill
occupied ~20px of vertical space (text box + ring on both sides),
leaving roughly 2px of clear space between rows. When the prompt
wrapped onto multiple lines — common for the Image template's
"Generate a {kind} of {subject}. Style: {style}. Aspect: {ratio}."
example — the rings from line N and line N+1 visually merged into
a single bar, making it ambiguous which pill the user was about to
click or edit.
This bumps the line-height of both the highlight overlay and the
underlying editable textarea to 1.85 (~28px per line), restoring
~8px of clear space between pill rows. The two values must stay
identical so the overlay glyphs continue to track the textarea
caret positions exactly; a brief comment in each rule documents
this coupling for future edits.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): paint dropdown chevron on select element with neutral color
The previous chevron treatment wrapped each select slot in an
inline-flex span and laid an accent-orange `<Icon>` over the
trailing padding. At the prompt's normal display size the orange
chevron blended into the orange pill ring corners, so users
perceived "a bunch of dropdown arrows" across every highlighted
slot, even though only the actual select rendered one.
Move the chevron onto the `<select>` itself as a small neutral-
gray `background-image` SVG. The grey contrasts clearly with the
pill's accent ring and the chevron lives inside the select's own
padding, so it can never visually overlap the value text and can
never appear on non-select slots.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): shrink select-slot dropdown caret so it stops reading as a check-mark
The 8×6 stroked chevron looked like a check-mark in zoomed views
because the stroke width was a large fraction of the glyph. Swap
for a small (9×5) filled triangle drawn as a `background-image`,
with `background-size` pinned so browsers can't scale it to its
intrinsic SVG box. The caret is now unambiguously a dropdown
indicator without crowding the value text.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): make read-only prompt slots visually distinct from editable pills
Multi-word context values are rendered as plain <span>s with
pointer-events: none, but they inherited the same orange pill +
ring treatment as the truly editable <input>/<select> slots.
Users couldn't tell which pills they could click into.
Strip the pill background, the ring, the radius, and the padding
from `.home-hero__prompt-slot-text` and replace them with a
subtle dashed bottom border. The orange foreground keeps the
slot family link, but read-only highlights now clearly read as
"context value spliced into the prompt" rather than as
"interactive control".
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): scale select-slot dropdown caret up to 12px wide for legibility
The 9×5 caret was too quiet at the prompt's font size to read as
a dropdown affordance. Bump to a 12×6 filled triangle and pad the
select's trailing space (24px) so the value text still has clear
breathing room from the caret.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): collapse plugins search and filter strips into one bar
The plugins-home gallery laid out search, count, mode (Featured),
total-in-catalog, clear-filters, the main category row, and the
sub-category row as four floating clusters. Search lived up in
the section header, far from the chips it actually scopes; the
mode strip wedged "Featured + 386 in catalog + Clear filters"
between the header and the category row with no obvious
ownership; and the two clear-filter affordances duplicated each
other (chip strip + empty-state).
Fold the mode strip into the category row: Featured is now the
leading chip on the same line as the category pills, and the
search field, the result count, and a compact Clear link sit as
a right-aligned tools cluster on that same row. The header
shrinks back to just the title, subtitle, and Browse registry
link. The sub-category row stays as the contextual second line
when a category is active.
Tests: keeps the existing data-testids (plugins-home-chip-featured,
plugins-home-row-category, plugins-home-clear, plugins-home-count,
plugins-home-search) so the existing 11-case section spec passes
without modification.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): hide Recent projects rail on first-run home when empty
A brand-new user landing on Home saw an empty dashed box with the
copy "No projects yet — type a prompt to start one." sitting
above the plugin gallery. The hero already prompts the user to
type, so the empty rail just adds vertical noise without telling
them anything new.
Return null from RecentProjectsStrip when the recent list is
empty (loading or not) so the rail appears only when there is
actually something to show. Update the entry-chrome e2e to
assert toHaveCount(0) on first run — codifying the new contract
that the rail is conditional, not always-mounted.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): hide structured inputs form when every field is in the prompt
The PluginInputsForm below the chat textarea surfaced every plugin
input — even the ones the prompt template already substitutes
inline as highlighted slot pills. For a plugin like Prototype
that's five identical labelled inputs duplicating the five slot
pills above them, making the chat box look like it has grown a
second composer.
Compute the set of placeholder keys actually referenced in the
prompt template (via INPUT_PLACEHOLDER_PATTERN) and skip those
when deciding what the structured form needs to render. The form
still appears for plugins that expose inputs not referenced in
the prompt text (e.g. a "Run in background" toggle), but
template-only plugins collapse the redundant second editor.
Update the picker spec accordingly: when every field is in the
template the form is now expected to be absent rather than to
render a duplicate of the inline slot.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): drop redundant filtered-count and Clear link beside the search
The combined plugins filter bar trailed the search field with
"59 / 386" and an inline "Clear" link, but both repeat what the
chip strip already shows: every chip carries its own count
(Slides 59, All 386, …) and clicking the All chip already resets
the filters. Users had no way to know what the bare "59 / 386"
fraction or "Clear" referred to without inference.
Strip the trailing tools cluster down to just the search input.
The empty-results message at the bottom of the gallery still
exposes a contextual "Clear filters" button when a stacked
filter yields no matches, so the affordance isn't lost — just
removed from a position where it didn't read as actionable.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): drop default subtitle copy from the Official starters strip
The Home gallery rendered a two-line explanatory subtitle under
"Official starters" — "Ready-to-use Open Design workflows bundled
with this runtime. Pick one to load a starter prompt, or browse
the registry for more." — every visit. Returning users skip it,
new users get the same message from the section title plus the
Browse registry link plus the visual card grid itself; the prose
read as filler chrome above the chip strip.
Default the subtitle prop to undefined and only render the <p>
when a caller passes an explicit string. Other surfaces that
mount PluginsHomeSection with their own copy keep their
subtitle; the bare Home gallery loses one row of vertical noise.
Co-authored-by: Cursor <cursoragent@cursor.com>
* Fade out empty workflow lanes in Plugins home filter
Empty top-level lanes (Deploy 0, Refine 0, etc.) and empty
sub-categories (Vercel 0 under Deploy, Figma 0 under Import, etc.)
used to look identical to populated ones, so users couldn't tell at
a glance which chips were real catalog buckets and which were
"contribute a plugin" invites. The strip kept all lanes visible by
design — the workflow shape (Import / Create / Export / Share /
Deploy / Refine / Extend) is part of what we want users to see —
so we keep them clickable, but tag count-zero chips with
data-empty='true' and give them a faded, dashed-border treatment.
"All" pills stay solid since their count reflects the parent lane,
not their own emptiness.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): update home nav test expectations
* fix(e2e): align critical smoke with entry chrome
---------
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
bd48c597b0
|
chore: pin dependency versions and harden CI caches (#2189)
* chore: pin dependency versions * ci: enforce pinned dependency specs * ci: fix pnpm executable invocation |
||
|
|
99b42726b8
|
Simplify CI PR gate (#2183) | ||
|
|
f650a043d9
|
test(e2e): align entry coverage with redesigned flows (#2101)
* Migrate entry E2E coverage and split CI * test(e2e): relax connectors auth error assertions * ci: route scenario registry changes to extended e2e * ci: decouple packaged smoke jobs from validate gate * ci: restore pre-split workflow * test(e2e): slim critical ui smoke coverage * test(e2e): move broader entry flows out of critical * test(e2e): restore entry chrome coverage to ci * ci: parallelize workspace validation jobs * test(web): stabilize media palette bridge assertion * ci: cache e2e playwright browsers |
||
|
|
4424f08be0
|
[codex] Add packaged desktop auto-update (#1375)
* Add packaged desktop auto-update * Handle counted beta nightly update versions * Refresh desktop auto-update branch for main * Serialize desktop updater operations * Refresh auto-update branch for packaged paths |
||
|
|
3f56395e2d
|
fix(web): restore Settings subtab-pill hover contrast in dark theme (#1815)
The .subtab-pill button hover rule hard-coded `rgba(255,255,255,0.6)` as the
hover background. In light theme this composites to a near-white wash that
sits cleanly under `--text`, but in dark theme it overlays a translucent
white on the dark panel and produces a bright, near-white surface beneath
the same light `--text`, dropping contrast to ~1.87 (well below WCAG AA 4.5).
The reporter (#1795, on 0.6.0) called this out on five Settings surfaces.
Four of them — BYOK / Appearance / Notifications / protocol-chip — were
incidentally fixed by the
|
||
|
|
91195bc4ac |
Merge origin/preview/v0.8.0 (PR #1830) into sync branch
Pulls in the teammate's PR #1830 — e2e align to redesigned entry flows + watcher and codex runtime test fixes. Conflicts resolved: - apps/daemon/tests/runtimes/registry-and-args.test.ts: imported both withPlatform (HEAD, used 6x) and withEnvSnapshot (theirs, used 1x); both helpers are needed in the file. - apps/web/src/components/ProjectView.tsx: kept both ChatPane-remount guards. Main's fix #1710 uses lastSyncedConversationIdRef tracking URL sync; the teammate added lastSeenRouteConversationIdRef tracking last observed route. Both refs are already defined in the file and the two checks are complementary defenses. - e2e/ui/settings-*.test.ts (5 files): took teammate's version wholesale — they replaced inline `page.goto('/') + click Execution mode` with `gotoEntryHome + openSettingsDialogFromEntry` helpers, which is exactly the PR's intent. |
||
|
|
2181eb376d
|
test(e2e): align UI suites with redesigned entry flows (#1830)
Some checks failed
nix-check / build (push) Failing after 2s
* test(e2e): align critical ui tests with new entry shell * test(e2e): align ui suites with redesigned entry flows * fix(e2e): correct entry chrome page helper type * fix(daemon): stabilize watcher and codex runtime tests * fix(daemon): satisfy strict watcher option typing |
||
|
|
22a3b99a47 |
Merge origin/main into preview/v0.8.0
Sync 49 commits from main. Conflicts resolved:
- .github/workflows/ci.yml: kept v0.8.0 granular per-area gating, added main's
linux specs + release-stable.yml + release-preview.yml triggers
- .github/workflows/release-preview.yml: kept v0.8.0's full workflow over main's placeholder
- apps/web/src/components/AssistantMessage.tsx: combined v0.8.0 file-ops
summary with main's stripTodoToolGroups + suppressAskUserQuestionFallbackText
- apps/web/src/components/ChatPane.tsx: kept both new imports
- apps/web/src/index.css: kept both .msg-plugin-chip and .user-copy-btn blocks
- e2e/ui/*.test.ts: kept v0.8.0 openEntrySettingsDialog helper over main's
inline dialog navigation (UI was redesigned in v0.8.0)
- nix/package-{daemon,web}.nix: kept v0.8.0 pnpmDepsHash; rerun nix build to refresh
|
||
|
|
74637f1cb5
|
Add Linux packaged client parity smoke coverage (#1204)
* docs: plan linux client issue 709 * fix: complete linux headless lifecycle routing * feat: add linux packaged inspect * test: add linux headless packaged smoke * ci: add linux headless packaged smoke * ci: smoke linux AppImage release artifacts * docs: document linux packaged client status * chore: finalize linux client audit remediation * docs: add linux client publication packet * test: harden linux client smoke coverage * ci: preserve linux smoke audit evidence * refactor: consolidate linux e2e helpers Move pathExists and the desktop/web/daemon app-key array out of linux.spec.ts into linux-helpers.ts, where expectPathInside and linuxUserHome already live. Keeps the spec file focused on tests and the helpers file as the canonical home for shared Linux e2e utilities. * fix: move linux e2e helpers to lib * fix: address linux release review blockers * fix: drop npm dependency from containerized linux build writeAssembledApp() previously called runNpmInstall() which executed `npm install` directly. Inside the containerized build path, electronuserland/builder:base strips npm/npx/corepack, so the inner tools-pack build would fail at the assembled-app install step. Route the install through OD_TOOLS_PACK_PNPM_BIN: buildDockerArgs sets the env to the standalone pnpm binary it bootstraps, and the new resolveProductionInstallCommand helper consumes that env to run `<bin> install --prod --no-lockfile --config.node-linker=hoisted`. Host invocations with no env set keep the prior npm behavior. --config.node-linker=hoisted preserves the flat node_modules layout that electron-builder packs the same way as npm-installed trees. New tests cover the resolver branches and assert the docker-arg-to- resolver chain end-to-end so reviewers can see the container's inner build receives the env that switches its install away from npm. * fix: harden linux container bootstrap * fix: validate desktop marker liveness in headless cleanup cleanup --headless previously skipped on any parseable desktop-root.json, trapping recovery when the AppImage had crashed and left a stale marker. Validate the marker the same way stopPackedLinuxApp does: if the PID is not in the live snapshot list, proceed through cleanup instead of skipping. Extract the validation into validateDesktopAppImageMarker so the stop and cleanup paths share one definition of live and owned. Tests cover both branches: a stale marker drives cleanup to remove the runtime/output roots, while a live marker drives cleanup to skip and preserve them. |
||
|
|
c5d77a03bd
|
Garnet hemisphere (#1769)
Some checks failed
nix-check / build (push) Failing after 2s
* feat(chat-composer): enhance mention handling and input overlay - Introduced a new overlay for inline mentions in the chat composer, improving user experience by visually indicating mentions as users type. - Updated the `ChatComposer` component to manage mention entities and integrate them into the input field, allowing for better context and interaction. - Enhanced the `AssistantMessage` component to support the display of plugin action panels based on the current project context, facilitating easier plugin management. - Refactored related components to ensure consistent handling of project files and mentions across the application. This update significantly improves the chat interaction model, making it more intuitive for users to engage with mentions and plugins. * feat(plugin-management): enhance plugin action panels and UI components - Updated the `AssistantMessage` component to include plugin action panels based on the latest project context, improving user interaction with generated plugins. - Refactored the `PluginsView` to support detailed views for available marketplace entries, allowing users to access more information and actions for each plugin. - Introduced new CSS styles for improved visual representation of plugin-related UI elements, enhancing overall user experience. - Enhanced the `listPlugins` function to include an option for fetching hidden plugins, providing more flexibility in plugin management. This update significantly improves the usability and functionality of the plugin management system, making it easier for users to interact with and manage their plugins. * fix(assistant-message): refine plugin folder candidate selection logic - Updated the `pluginFoldersTouchedThisTurn` function to improve the logic for selecting plugin folder candidates based on touched paths and message content. - Introduced a new helper function, `pathMatchesFolderFileBasename`, to enhance the matching criteria for folder candidates. - Added a check for explicit folder matches before falling back to a single candidate, improving accuracy in folder selection. - Modified the `shouldRenderSlotAsText` function in `HomeHero` to include the name parameter, refining the rendering logic for slot text. These changes enhance the functionality and reliability of the assistant message component in managing plugin folder candidates. * feat(plugin-folder-actions): implement agent-routed CLI actions for plugin management - Introduced a new `PluginFolderAgentAction` type to streamline actions related to plugin folders, including install, publish, and contribute. - Updated the `DesignFilesPanel`, `FileWorkspace`, and `AssistantMessage` components to utilize the new agent action handling, improving user interaction with generated plugins. - Refactored the action handling logic to send commands to the agent, enhancing the workflow for managing plugin folders. - Added corresponding tests to ensure the new functionality works as expected and integrates seamlessly with existing components. This update significantly enhances the plugin management experience by routing actions through the agent, allowing for a more cohesive and interactive user experience. * Fix PR 1702 CI blockers * Fix PR 1702 remaining CI checks * Prebuild AGUI adapter after install * Restore plugin project snapshot wiring * feat(marketplace): refactor marketplace URL handling and enhance fetching logic - Introduced new functions to normalize marketplace URLs and manage fetching of marketplace manifests, improving the reliability of marketplace integrations. - Updated the server and plugin logic to utilize the new fetching mechanisms, ensuring consistent handling of marketplace data. - Enhanced tests to cover new URL normalization and fetching scenarios, ensuring robustness in marketplace management. This update significantly improves the marketplace experience by streamlining URL handling and enhancing data fetching capabilities. * Fix project auto-send cleanup spec * Reconcile run messages on cancel * Use active design system as visual direction * Fix active design system prompt wording * feat(workspace-tabs): implement workspace tabs functionality and file attachment handling - Introduced a new `WorkspaceTabsBar` component to manage workspace tabs, allowing users to navigate between different views (projects, marketplace, etc.). - Enhanced file handling capabilities in the `HomeHero` and `EntryShell` components, enabling users to stage and attach files before project creation. - Updated the `App` component to support auto-sending attachments alongside the first message in a project. - Improved CSS styles for workspace tabs and attachment UI, ensuring a cohesive design and user experience. This update significantly enhances the workspace navigation and file management features, providing users with a more intuitive and efficient workflow. * refactor(workspace-tabs): streamline workspace tabs and UI components - Removed unused components and actions from the `WorkspaceTabsBar` and `AppChromeHeader`, simplifying the codebase. - Updated CSS styles for the workspace shell and tabs, enhancing visual consistency and reducing element sizes for a cleaner layout. - Introduced a new client type detection mechanism to dynamically adjust the workspace shell's class, improving responsiveness. - Added tests for the `WorkspaceTabsBar` to ensure proper navigation and tab management functionality. These changes improve the overall performance and user experience of the workspace navigation system. * Update critical e2e for entry modal flow * Stabilize entry critical e2e flows * fix(ui): adjust workspace tabs and header styles for improved layout - Updated the CSS for workspace tabs and the app header, reducing element sizes and padding for a cleaner appearance. - Introduced a new button in the `WorkspaceTabsBar` for quick access to the home tab, enhancing navigation. - Minor adjustments to the layout and styles to ensure consistency across components. These changes enhance the user interface and improve the overall user experience in the workspace navigation system. * feat(workspace-tabs): implement pinned home tab functionality - Added a new pinned home tab feature to the `WorkspaceTabsBar`, allowing the home tab to remain accessible during navigation. - Updated tab management logic to collapse duplicate home tabs into a single pinned instance when restoring from local storage. - Enhanced CSS styles for workspace tabs to accommodate the new pinned tab design. - Updated tests to verify the behavior of the pinned home tab and its interaction with other tabs. These changes improve navigation consistency and user experience within the workspace. * refactor(workspace-tabs): enhance tab management and styling - Updated CSS styles for workspace tabs, adjusting padding and flex properties for improved layout and consistency. - Refactored tab creation logic to ensure unique IDs for project and marketplace tabs, enhancing navigation clarity. - Removed deprecated functions related to pinned home tabs, streamlining the codebase. - Improved test cases to verify independent behavior of home tabs during navigation. These changes enhance the user experience by providing a more intuitive tab management system and a cleaner UI. * style(workspace-tabs): update CSS for improved layout and visibility - Adjusted CSS properties for workspace tabs, including overflow, position, and z-index to enhance layout and stacking context. - Ensured consistent styling across tab components for better visual hierarchy. These changes contribute to a more polished and user-friendly interface within the workspace. * style(entry-layout): update CSS variables for improved layout consistency - Replaced fixed width values with CSS variables for the entry rail to enhance flexibility. - Adjusted padding and height properties for better visual alignment and spacing. - Introduced a new background style for the entry main topbar to improve aesthetics. These changes contribute to a more responsive and visually appealing layout in the entry view. --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> Co-authored-by: Eli <129168833+qiongyu1999@users.noreply.github.com> |
||
|
|
bcc58af931
|
refactor(web): rename Execution mode and tighten settings dialog UI (#1568)
* refactor(web): rename Execution mode and tighten settings dialog UI - Rename "Settings → Execution & model" to "Settings → Execution mode" across the web UI, i18n keys, docs, and e2e selectors. - Redesign SettingsDialog: kicker + title row in the modal head, a flatMap-driven agent grid that renders the inline test-result row beside the selected card, compact unavailable cards with right-aligned install/docs links, and an install guide that only shows when the user has no working agent picked. - Trim verbose subtitle / hint copy across chat model, CLI proxy, media providers, custom instructions, and memory sections. - Add an `info` Icon variant for the redesigned settings hints. - Update e2e selectors and docs that referenced the old menu label. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(web): polish Settings dialog — media providers, skills, MCP Media providers - Hide internal Stub fixture provider (settingsVisible: false) - Split provider list into Available (integrated, editable) and Coming Soon (collapsed <details> drawer with name/hint/Docs link only) - Drop right-side Integrated/Configured badges from every row; all rows in the main list are integrated by definition; inline grey "Saved" chip next to the provider name is the only status indicator now - "Saved" badge moves inline to the right of the provider name and uses a neutral grey treatment (was a standalone green pill below the name) - "Reload from daemon" button shows a 2s green "✓ Reloaded" flash on success instead of leaving a permanent paragraph under the header; errors remain sticky Skills - Replace three pill-row filter banks (Source, Type, Category) with a compact single-row toolbar: search + three inline <select> dropdowns side by side; active filter highlighted with a stronger border MCP server - Shorten section hint to one line - Move WHAT YOUR AGENT CAN DO capabilities above the client dropdown (motivate before asking to act) - Move "Build the daemon first" warning below the code block where it contextually explains why the command might fail, not as a top-level error before the user has done anything - Downgrade "Restart your client" left-border from accent orange to border-strong grey — it is a next step, not a warning External MCP - Shorten section hint to one line Misc CSS - Add .sr-only utility for accessible off-screen live regions - Add button.ghost.is-success-flash for transient success feedback - Add .library-filter-selects / .library-filter-select for dropdown filter rows - Add .media-provider-coming-soon-* for the roadmap drawer Co-authored-by: Cursor <cursoragent@cursor.com> * [codex] Add Cursor Agent auth diagnostics (#1538) * Add Cursor Agent auth diagnostics * Handle Cursor not logged in auth status * Address Cursor auth review feedback * Classify Cursor stdout auth failures * test: expand Memory and Routines coverage (#1521) * test: expand settings and packaged coverage * test: extend memory settings coverage * test: cover routine settings failure states * test: cover routine operation failures * test: fix daemon test typing on CI * test: decouple packaged smoke from orbit bug * test: avoid live memory LLM calls in route tests * test: fix daemon fetch typing in CI * fix: restore preview comment and inspect toggles * test: align manual edit flow with current inspector UX * test: align comment attachment flow with current preview comments UI * fix: probe resolved Codex launch path during detection * fix: remove duplicate board activation helper after rebase * test: update ghost cli detection mock * test: align FileViewer toolbar expectation * ci: move full app tests to extended lane * ci: run app tests by changed scope * ci: cover shared app inputs in test scopes * ci: avoid setup-node cache in windows packaged smoke * test: align extended settings and manual edit flows * refactor(web): rename Execution mode and tighten settings dialog UI - Rename "Settings → Execution & model" to "Settings → Execution mode" across the web UI, i18n keys, docs, and e2e selectors. - Redesign SettingsDialog: kicker + title row in the modal head, a flatMap-driven agent grid that renders the inline test-result row beside the selected card, compact unavailable cards with right-aligned install/docs links, and an install guide that only shows when the user has no working agent picked. - Trim verbose subtitle / hint copy across chat model, CLI proxy, media providers, custom instructions, and memory sections. - Add an `info` Icon variant for the redesigned settings hints. - Update e2e selectors and docs that referenced the old menu label. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(web): settings dialog UX polish — layout, dedup, and interactions - Remove duplicate section headers from all settings sections (Notifications, Appearance, Privacy, About, Design Systems, Skills, MCP server, Connectors, Media providers, Routines) - Restructure Notifications cards: title + toggle on same row, hint below - Restructure Skills toolbar: search + New skill button in row 1, filter dropdowns in row 2 with left-aligned labels - Restructure Pet section: tabs and Wake button on same row - MCP server: group capabilities and setup into separate cards, remove nested double border on client picker - Connectors: show connect errors as toast instead of inline card text, position toast inside panel, hide single-provider tab - Media providers: move Reload button to left-aligned small ghost button - Memory: info icon shows path on hover, Path copied badge inline; Extraction history and MEMORY.md as standalone collapsible cards; group header hidden when only one type visible - Pet grid cards: Adopt button hidden until hover, icon-only when adopted, description truncated to 2 lines, text fills full width via abs positioning - Agent cards: selected state uses accent border only, no background change - Add sun/moon icons to Appearance theme buttons (Light/Dark) - Shorten several hint strings for clarity Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): resolve i18n review comments from PR #1568 - Update settings.title and settings.envConfigure to localized "Execution mode" in all 17 non-English locale files - Add settings.memoryFlashPathCopied to all locales and use t() in MemorySection instead of hardcoded English "Path copied" - Add settings.agentModelHead to all locales and use t() in SettingsDialog for "Model for:" agent model row header Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): update tests to match settings dialog redesign - Add role prop to Toast (alert/status) so error toasts from ConnectorsBrowser are announced immediately by screen readers - Clear connectErrorToast on successful connector retry - Update SettingsDialog.execution tests: - Remove heading assertions for About and MCP server (headers were intentionally removed as duplicate nav labels) - Rewrite CLI env test to use codex-only fields (per-agent filtering means only selected agent's fields are shown) - Update Composio key hint text assertion to match shortened copy - Replace filter button click with select change for Type filter - Replace Configured/Unsupported/Integrated badge checks with updated assertions matching the new media provider UI - Replace disabled BFL row test with coming-soon section check - Update SettingsDialog.media test: remove Fal.ai input assertions (non-integrated providers no longer have editable fields) Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): unblock CI for #1568 Three small fixes to get Playwright back to green on the settings dialog redesign: 1. `en.ts`: revert `settings.envConfigure` to "Configure execution mode". This PR collapsed both `settings.title` (header gear) and `settings.envConfigure` (entry-side foot pill) to the same string "Execution mode", so `getByRole('button', { name: 'Execution mode' })` resolved to two elements and tripped Playwright strict mode in the three Composio-flow tests (entry-configuration-flows.test.ts:174, 228, 285). Restoring the distinct label also gives screen readers a clearer hint for the pill, which doubles as a status display. Non-English locales still alias the two keys; happy to follow up on those, but they don't gate the (English-only) Playwright suite. 2. entry-configuration-flows.test.ts:167 — `Connectors` heading is now rendered at `<h2>` in the modal-head (SettingsDialog.tsx:1545), with the inner `<h3>` removed by design (see comment around line 1448). Updated the assertion from `level: 3` to `level: 2`. 3. project-management-flows.test.ts:360 — same change for the `Pets` heading. Verified locally with `pnpm --filter @open-design/web typecheck` and `pnpm --filter @open-design/e2e typecheck`. The actual Playwright specs need the dev server up; I didn't rerun them here, but the locator changes are mechanical and match the new DOM. * fix(web): use exact match for Execution mode button locator Playwright's `getByRole({ name })` defaults to substring matching, so `{ name: 'Execution mode' }` still resolved to both the header gear (aria-label "Execution mode") and the entry-side foot pill (aria-label "Configure execution mode" — substring contains "Execution mode"). Strict mode tripped in the three composio-flow tests at lines 202, 257, and 319. Adding `exact: true` makes each call resolve to just the header gear, which opens the same dialog the foot pill does — the test outcomes are unchanged. --------- Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com> Co-authored-by: shangxinyu1 <shangxinyu@refly.ai> Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
4f15c33595 | Merge remote-tracking branch 'origin/preview/0.8.0' into preview/v0.8.0 | ||
|
|
43b1b94c8e | Add preview release channel | ||
|
|
3b7f87c7ae | Merge remote-tracking branch 'origin/main' into reconcile/garnet-main-merge | ||
|
|
cba8bf151d | chore: align namespace lifecycle packaging | ||
|
|
b268bbe169 |
Merge origin/garnet-hemisphere (post-9e196d34) — Use Plugin handoff fix
Brings in 11 new garnet commits, most importantly: - |
||
|
|
40766ef1ba
|
test(web): Critique Theater Phase 13 (reducer p99 bench + surface coverage walker) (#1318)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* fix(web): tighten Phase 13 gates from lefarcen review (PR #1318)
Address the actionable items from lefarcen's review of the two
Phase 13 CI gates. The two questions about longer-term DX (pre-
commit hook to auto-update the symbol table, AST-walker swap)
are documented as deferred follow-ups rather than landed here.
reducer-bench:
- Describe renamed to 'reducer p99 regression gate (Phase 13.1)'
so it reads as a gate, not a comparative benchmark.
- Failure message now carries the full distribution
(p50 / p90 / p99 / max + ceiling), so triage on a tripped gate
can distinguish a real 20ms regression from a 4.001ms CI hiccup
without re-running locally (lefarcen Q3).
- Captured a baseline (p50=0.011ms p90=0.013ms p99=0.018ms
max=0.244ms on a local Node 24 / Win11 run, 2026-05-11) inside
the docblock so reviewers can see the actual reading sits ~222x
below the 4ms ceiling (lefarcen Q1).
- Replaced 'role as any' casts with PanelistRole-typed casts so
the fixture is typecheck-strict.
- Phase numbering corrected (13.2 → 13.1 to match the PR body).
critique-coverage:
- Symbols now grouped under four describe blocks (SSE events /
panelist roles / lifecycle phases / i18n keys) so a failure
points at the category that drifted at a glance (lefarcen nit).
- Docblock now explains the grep-over-AST trade-off (the bug
class is structural at the string level, not at the AST level)
and points at the future AST-walker work as a deferred follow-
up (lefarcen Q2).
- Docblock now walks a contributor through the four-step
maintenance flow (add to contract → add caller → add test →
add literal here), so the next person to add an SSE event or
i18n key knows the gate exists and what to update (lefarcen
Q4).
- Phase strings switched from 'phase: <name>' to bare-quoted
literals so the walker is robust against single vs double
quotes and ':' vs '===' source-shape changes.
- Dead try/catch around 'stack = [root]' removed (cannot throw).
- Per-symbol failure messages name the symbol AND which corpus
is missing it, so the gate is self-describing on the next
CI red.
- Phase numbering corrected (13.4 → 13.2 to match the PR body).
63 / 63 vitest cases green (1 bench + 62 coverage). Web
typecheck clean.
* fix(web): tighten coverage walker semantics from lefarcen P2/P3 (PR #1318)
Two follow-on findings on commit
|
||
|
|
6c16283850 |
Merge origin/main (post-7c8305f4) into reconcile branch
Brings in 10 new main commits: routine deep-link to specific conversations (#1508), Windows resource cache fix for Orbit templates, collapsible comment side panel (#1607), routines project radio polish, Copilot logo swap, and minor UI fixes. Conflicts resolved: - router.ts: garnet's home/view + marketplace routes + main's per-project conversationId deep-link field coexist on Route union - ProjectView.tsx: garnet's isPhantomDaemonRunMessage helper + main's isStoppableAssistantMessage helper both kept - ProjectView.run-cleanup.test.tsx: accepted HEAD (garnet's phantom-row regression test); main's three new tests for finalizeActiveAssistantMessagesOnStop / clearStreamingConversationMarker / shouldClearActiveRunRefs are queued as a follow-up TODO inline. |
||
|
|
2976c76fc3
|
test: expand Memory and Routines coverage (#1521)
* test: expand settings and packaged coverage * test: extend memory settings coverage * test: cover routine settings failure states * test: cover routine operation failures * test: fix daemon test typing on CI * test: decouple packaged smoke from orbit bug * test: avoid live memory LLM calls in route tests * test: fix daemon fetch typing in CI * fix: restore preview comment and inspect toggles * test: align manual edit flow with current inspector UX * test: align comment attachment flow with current preview comments UI * fix: probe resolved Codex launch path during detection * fix: remove duplicate board activation helper after rebase * test: update ghost cli detection mock * test: align FileViewer toolbar expectation * ci: move full app tests to extended lane * ci: run app tests by changed scope * ci: cover shared app inputs in test scopes * ci: avoid setup-node cache in windows packaged smoke * test: align extended settings and manual edit flows |
||
|
|
e508fa3fbd
|
test(e2e): Critique Theater Phase 11 activation (un-fixme suite, seeded-project nav, split SSE fixture) (#1483) | ||
|
|
f1d0dac58b |
feat(plugins): enhance plugin registry with new metadata and preview features
- Added new fields to the plugin and metadata types, including `mode`, `taskKind`, `surface`, and `preview`, to improve plugin description and categorization. - Implemented a `previewFrom` function to generate structured previews for plugins based on their metadata, enhancing the user experience. - Introduced a `visualKindFor` function to determine the visual representation of plugins based on their attributes, improving sorting and display logic. - Updated the `entryFromMarketplace` and `officialEntryFromManifest` functions to accommodate the new fields and ensure proper handling of plugin entries. - Created a new documentation file for the plugin system test suite, consolidating testing strategies and acceptance criteria for better clarity. This update significantly enhances the plugin registry's capabilities, providing richer metadata and improved user interactions with plugin previews. |
||
|
|
a6307bf658 |
chore(reconcile): remove ad-hoc Playwright screenshot scripts
These landed inside the e2e/ package during the screenshot-comparison work for product review; they were never meant to be tracked. |
||
|
|
53997990b7 |
Merge origin/main (post-0.7.0) into reconciled garnet branch
Second-pass merge layering 41+ new commits from origin/main on top of the first reconcile commit. Headline upstream additions absorbed: - 0.7.0 release: redesigned chat bubble user-text styling, neutralised palette, lucide icons, ElevenLabs audio voice option discovery in the prompt composer, analytics tracking (PostHog) wired across home / studio / create surfaces, Prometheus `/api/metrics` endpoint, critique-theater drop-in mount with a settings toggle. - Misc upstream fixes (titlebar padding, release header layout, deck preview chrome, feedback form auto-scroll, conversation-created SSE on routine runs, etc.) Conflict resolutions (12 files, ~22 hunks): - contracts barrel + prompts/system: union of both sides; new analytics exports (`./analytics/events`, `./analytics/public-params`) added alongside garnet's plugin/atom/genui exports. Both ElevenLabs voice fields (audioVoiceOptions/audioVoiceOptionsError, main) and pluginBlock/activeStageBlocks (garnet) preserved on ComposeInput. - daemon/server.ts: Prometheus `/api/metrics` route inserted after garnet's `/api/daemon/shutdown`. main's `createAnalyticsService` call added before the chat-run service init alongside the prior reconcile note about the dropped legacy POST /api/projects body. - App.tsx: handleCreateProject now consumes both garnet's plugin fields (pluginId / appliedPluginSnapshotId / pluginInputs / autoSendFirstMessage) and main's analytics requestId. Tracking fires success + failure paths; PluginLoopHome auto-send sessionStorage flag is preserved. - ProjectView.tsx: the garnet auto-send useEffect coexists with main's `useCritiqueTheaterEnabled()` hook. - ChatComposer.tsx: imports merged (drop now-unused fetchSkills, add analytics provider + tracking + buildVisualAnnotationAttachment). - index.css: main's redesigned `.msg.user .user-text` chat bubble styling wins over garnet's plain text rule; garnet's `.msg-plugin-chip*` rules preserved alongside. - EntryView.tsx: accepted HEAD (garnet wrapper) — consistent with reconcile decision #2. main's added PetRail / TopTab / analytics view tracking is intentionally NOT brought into the wrapper; the follow-up to re-integrate PetRail / image-templates / video-templates into EntryShell still stands and now also covers analytics view-tracking hooks. - daemon/package.json + pnpm-lock: merged dep set (tar + posthog-node + prom-client coexist). - Test fixtures (FileWorkspace.test): kept garnet's plugin-folders describe block intact; main's projectKind="prototype" addition is dropped where it conflicted with garnet's plugin-folder fixture files. Verification: `pnpm install` (after lockfile reconciled), `pnpm typecheck` exits 0 across all workspace packages. Follow-up not done in this commit: - PetRail / image-templates / video-templates / 0.7.0 analytics view-tracking hooks need to be added to EntryShell. - Critique-theater settings toggle UX (added on main) lives in the SettingsDialog hierarchy; the reconcile state preserves the SettingsDialog so this should work without changes, but no end-to-end verification yet. |
||
|
|
d3602be666 |
Merge origin/main into garnet-hemisphere (reconcile)
Merge of `origin/main` (`03ed3960`, 2026-05-13 pre-0.7.0) into the 161-commit garnet-hemisphere line, reconciling the product-vibe-coded plugin/marketplace/EntryShell surfaces from garnet with the routines / skills / live-artifacts feature work landed on main since the fork point. Headline decisions (full rationale + side-by-side screenshots in `specs/change/20260513-garnet-skills-automations/reconcile-result-vs-garnet.md`): - #1 SettingsDialog: keep main's Memory / Skills / External MCP / Connectors / Routines / MCP server nav items even though the top-level /integrations + /automations routes also cover them. Two entries coexist for now; revisit once Track A/B fill in the placeholder content. - #2 EntryView: accept garnet's thin wrapper delegating to EntryShell. Main's PetRail sidebar + image-templates/video-templates tabs are intentionally deferred to a follow-up that re-integrates them into the new EntryShell layout. - #3 /integrations + /automations top-level routes: kept (garnet's product intent). Skills tab is still a "Coming soon" placeholder awaiting Track A; Routines/Schedules/Live-artifacts cards on /automations are still mock awaiting Track B. - #5 DesignFilesPanel: hybrid — main's pagination as primary list, garnet's Plugin folders section preserved between the live-artifacts block and the pagination block. (by-kind sections drop in favour of pagination; plugin-folders rendering stays because it is a garnet-specific product addition.) - #7 server.ts (10 hunks, ~5400 conflict lines): manual hunk-by-hunk merge. Both daemon admin routes + plugin/genui routes (garnet) and routines/memory/skills upgrades (main) preserved. Garnet's inline project route block kept alongside main's `registerProjectRoutes` / `registerProjectUploadRoutes` modular wiring — duplicate route audit is a follow-up. Garnet's POST /api/projects plugin-snapshot resolution + default-scenario fallback is intentionally dropped from the inline body (now handled by registerProjectRoutes) and listed for follow-up re-integration into `project-routes.ts`. Verification (worktree at /Users/elian/Documents/open-design-garnet): - `pnpm typecheck` exits 0 across all workspace packages - daemon (`pnpm tools-dev run web --namespace reconcile-shots`) boots, serves `/api/daemon/status` healthy, and survives a Playwright walkthrough of /integrations / /automations / home / projects / design-systems / plugins / settings dialog - `@open-design/plugin-runtime` package built (was missing dist/ on garnet); without it the daemon's plugins/* imports fail at boot Track A (Skills tab → real SkillsSection) and Track B (Automations cards → real routines / live-artifacts backend) are the two remaining follow-ups blocking the placeholder/mock content from going live. See `spec.md` and `track-skills.md` in the same directory. |
||
|
|
38a5ab69e6
|
feat(daemon): Critique Theater Phase 12 (9 Prometheus metrics + 6 log events + OTel span + Grafana dashboard) (#1485)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* docs: Critique Theater user guide (Phase 14.1)
Seven sections aimed at end users (not contributors):
1. What is Design Jury
2. How it works (the five panelists, auto-converging rounds,
the composite formula)
3. Settings (the M1 toggle and what it does)
4. Reading the score badge
5. Replay surface
6. Troubleshooting (degraded, interrupted, failed)
7. FAQ
The composite formula is documented as
designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
* docs(daemon): critique module AGENTS map (Phase 14.2)
Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.
* docs(web): Theater module AGENTS map (Phase 14.3)
Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.
* feat(daemon): rollout flag resolver (Phase 15.1)
Single decision point every caller consults to know whether the
orchestrator should wire the critique pipeline for a given run.
Priority:
1. Skill-level policy (required wins, opt-out wins inversely)
2. Per-project override from the Settings toggle
3. OD_CRITIQUE_ENABLED env override
4. Rollout phase default
M0 dark-launch false
M1 settings only false (toggle is off until the user flips it)
M2 per-skill true if skill opted in
M3 global default true
OD_CRITIQUE_ROLLOUT_PHASE parser defaults to M0 on unknown input
so a fresh install never surprises a user with the feature on.
10/10 vitest cases green covering every cell of the matrix.
* feat(web): Settings toggle hook for Critique Theater (Phase 15.2)
React hook that reads critiqueTheaterEnabled from the existing
open-design:config localStorage blob and stays in sync via:
- the platform storage event (cross-tab)
- a open-design:critique-theater-toggle CustomEvent (same-tab)
Same-tab event is the one that fires when the Settings panel saves
in the current window: the toggle and every mounted theater update
without a page reload.
setCritiqueTheaterEnabled(next) is the imperative setter the Settings
panel calls. It preserves the rest of the stored config (mode, apiKey,
etc.) and dispatches the same-tab event after the localStorage write.
The web hook reflects what the user toggled; the daemon-side
isCritiqueEnabled is the final routing authority (project override,
env, rollout phase). When they disagree, the daemon wins for backend
gating and the web reflects the toggle state.
6/6 vitest cases green covering first read, stored read, same-tab
event flip, config preservation, corrupted JSON tolerance, and
cross-tab storage event.
* test(web): Phase 15 toggle hook failure-mode coverage (PR #1320)
lefarcen P2 on PR #1320 flagged that the PR body claimed safe
behavior for disabled localStorage, non-object JSON, and missing
CustomEvent shim, but the suite only covered corrupt JSON plus
happy-path storage events. Added four failure-mode tests so the
swallowed errors are not silently traded for a throw in a future
refactor:
1. Returns false on a stored JSON value that parses to an array
(non-object). Catches a regression where the guard treats
anything truthy as a config blob.
2. Returns false on a stored JSON value of literal 'null'.
typeof null === 'object' in JS, so the guard has to check null
explicitly; this test pins that check.
3. Returns false when localStorage.getItem throws (private mode /
disabled storage / SecurityError). The hook must swallow and
return false so the rest of the app keeps rendering.
4. setCritiqueTheaterEnabled still dispatches the same-tab
CustomEvent when localStorage.setItem throws (quota exceeded /
disabled storage). The dispatch path is the in-session
broadcast that keeps every mounted hook coherent even when
persistence is unavailable; verified by mounting two probes
and asserting both flip after the setter is called with a
throwing setItem.
10/10 vitest cases green (6 existing + 4 new).
* fix(web): honor CustomEvent payload in toggle hook listener (PR #1320)
Both Siri-Ray (blocking) and lefarcen (P2 new) caught the same
real bug in the failure-mode test I added in
|
||
|
|
5172e37217 |
Merge origin/main into release/v0.7.0 to prepare merge-back PR
Resolves 7 conflicts via hybrid strategy: - apps/web/src/components/EntryView.tsx: take main (Discord+X pills are forward feature) - apps/web/src/components/Icon.tsx: take main (switch-case refactor) - apps/web/src/components/NewProjectPanel.tsx: take release (preserve #1514 dropdown UX validated in 0.7.0 acceptance) - apps/web/src/index.css: take main (project-target-platforms / instructions chip styles) - apps/web/tests/components/FileViewer.inspect-empty-hint.test.tsx: accept main's deletion - nix/package-daemon.nix, nix/package-web.nix: take main pnpmDepsHash Non-conflicting hunks from #1519 (AppChromeHeader), #1428 (PostHog analytics call sites), and #1540 (release light background) are preserved via auto-merge. |
||
|
|
026e13b347
|
fix(web): restore release header layout (#1519)
* fix(web): restore release header layout * fix(web): disambiguate entry settings button Generated-By: looper 0.7.4 (runner=fixer, agent=codex) |
||
|
|
6736310a01
|
Implement manual edit inspector (#1448)
* feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal * Polish manual edit inspector * Implement manual edit inspector * Fix manual edit review regressions * Fix FileViewer CI regressions * Fix remaining manual edit review issues * Flush manual edit styles before draw exit * Restore Critique Theater styles * Accept pixel line-height manual edits --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> |
||
|
|
0f0d2879ff
|
Make de/fr/ru content i18n optional (#1511) | ||
|
|
e2f409579d
|
docs: Critique Theater Phase 14 (user guide + 2 AGENTS module maps) (#1319)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* docs: Critique Theater user guide (Phase 14.1)
Seven sections aimed at end users (not contributors):
1. What is Design Jury
2. How it works (the five panelists, auto-converging rounds,
the composite formula)
3. Settings (the M1 toggle and what it does)
4. Reading the score badge
5. Replay surface
6. Troubleshooting (degraded, interrupted, failed)
7. FAQ
The composite formula is documented as
designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
* docs(daemon): critique module AGENTS map (Phase 14.2)
Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.
* docs(web): Theater module AGENTS map (Phase 14.3)
Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.
* docs: tighten Phase 14 reasoning from lefarcen review (PR #1319)
Four content gaps lefarcen flagged in the Phase 14 docs review,
addressed inline rather than deferred. The fifth item (scope-drift
between 'docs only' PR body and the cumulative stacked diff) is
handled by rewriting the PR body, not the docs.
1. Round exit conditions (lefarcen P2-1).
docs/critique-theater.md §2 'Auto-converging rounds' now lists
the five conditions that stop a run (threshold reached, round
budget exhausted, per-round timeout, total timeout, user
interrupt) with their default values. A user debugging a run
that stopped at round 1 with composite 5.4 can read this list
and find the matching cause without spelunking the orchestrator.
2. Prior-art comparison (lefarcen P2-2).
New §1.5 'Why an in-CLI panel and not a third-party design lint'
pre-answers the 'why not Figma lint / Adobe checker / Material
You conformance' question. Three differences: rule engines vs
generative reviewers, post-hoc vs in-loop, external service vs
same-CLI-session.
3. Composite formula rationale (lefarcen P2-4).
§2 now explains why each weight is set the way it is: critic
gates correctness so it gets 0.4; brand / a11y / copy are
secondary quality dimensions at 0.2 each; designer is at 0.0
in v1 because aesthetic preference is not a ship gate. The slot
stays in the schema so notes flow into the transcript and a v2
config release can bump the weight without a wire-shape change.
4. v2 cast-config ownership (lefarcen P2-3).
Both AGENTS.md files (daemon + web) now declare a 'Designer
weight frozen at 0.0 until v2 cast config' invariant. The
daemon side calls out where the SKILL.md frontmatter resolver
lands (apps/daemon/src/critique/config.ts); the web side calls
out where the Settings surface lands (apps/web/src/components/
Settings/). A contributor reading either AGENTS.md before
implementing v2 sees which module to touch first.
* docs(web): mirror the Designer-weight invariant in Theater AGENTS.md (PR #1319)
lefarcen P1 follow-up on PR #1319: the daemon AGENTS.md already
declares 'Designer weight is frozen at 0.0 until v2 cast config
lands' as an invariant, but the web AGENTS.md's parallel bullet
led with 'Composite weights are read-only on the web side' which
buried the Designer-specific constraint. A web contributor
reading that bullet would not realise the v1 weight distribution
is wire-shape (changing it mid-v1 invalidates persisted
critique_runs composite values).
Rewrote the bullet to lead with the same 'Designer weight is
frozen at 0.0 until v2 cast config lands' phrasing the daemon
side uses, and added an explicit cross-link to the daemon
AGENTS.md so the two halves of the invariant read as one rule.
Web-side specifics retained: ScoreTicker / TheaterCollapsed read
composite off the wire (no client recompute), v2 lands as a
Settings surface at apps/web/src/components/Settings/, do not
add a 'weights' prop to any component in this directory until
the contracts package carries the v2 cast type.
* docs: replace deferred metrics endpoint reference + refresh Theater module map (PR #1319)
Two carryover items lefarcen flagged across the PR #1319 + #1320
reviews.
1. docs/critique-theater.md was sending users to
/api/metrics/critique as the conformance-status check on
malformed_block, but the Phase 12 metrics endpoint is
explicitly deferred until after orchestrator wiring lands.
Replaced the link with the pnpm conformance-harness command
that DOES exist today (pnpm --filter @open-design/daemon
vitest run tests/critique-conformance.test.ts) and noted
that the dashboard surfaces this status as a series once
Phase 12 ships.
2. apps/web/src/components/Theater/AGENTS.md module map was
stale after Phase 15: the index.ts row said 'only two hooks
are exported' but the barrel now exports
useCritiqueTheaterEnabled too (plus the setCritiqueTheaterEnabled
setter). Updated the row to list all three hooks + the
setter + the reducer-derived contract types, and added a
new row for hooks/useCritiqueTheaterEnabled.ts in the file
table so a web contributor scanning the table sees the new
hook without inferring it from the index.ts blurb.
* fix(web): restore wait-for-daemon-ack pattern on Theater interrupt
Same regression as flagged on PR #1316 post-main-merge: the
optimistic local dispatch fired before the POST resolved, so a
daemon 404 / 409 still terminalized the UI and the real SSE
terminal event got ignored by the sticky interrupted phase.
Snapshot runId / bestRound / composite at click time, dispatch
interrupted only on res.ok, clear interruptPending on rejection or
non-2xx so the user can retry. Tests cover rejection + 404 leaving
the run on the live stage; the 204 path waits for the ack.
---------
Co-authored-by: Nagendhra <nagendhra405@gmail.com>
|
||
|
|
e2952acd05 |
Revert "fix(web): restore consistent app header layout (#1432)"
This reverts commit
|
||
|
|
c36609c47d |
feat(daemon, web): implement plugin sharing project creation and enhance CLI functionality
- Added new flags for conversation, message, agent, and model in the CLI to support enhanced plugin sharing features. - Introduced a new API endpoint for creating share projects for plugins, allowing users to publish to GitHub or contribute to Open Design. - Updated the UI components to facilitate the new sharing functionalities, including prompts for user input during the sharing process. - Enhanced the project management system to handle new plugin share actions, improving user interaction and experience. - Added tests to ensure the reliability of the new sharing features and their integration within the existing plugin management system. This update significantly enhances the plugin ecosystem by enabling users to share their creations more effectively and streamline collaboration. |
||
|
|
3d3119333c
|
fix(web): restore consistent app header layout (#1432)
* docs: add NotebookLM GitHub export script (#1062) * docs: add NotebookLM GitHub export script * fix: make NotebookLM export TOC anchors work * fix: escape TOC link text markdown chars * fix: include merged PRs when exporting --prs all * fix: allow --prs merged mode * fix: treat --limit as total export budget * fix: avoid starving buckets under global --limit * fix: support --issues none and handle repos w/ issues disabled * fix: avoid underfilling export when buckets empty * fix: keep disabled-issues fallback quiet * fix: silence disabled issues fallback * fix: satisfy script typecheck * prevent duplicate saves and add template deletion (#1294) * prevent duplicate template entries on repeated save * add delete button to saved template list Templates can now be removed from the template picker via a hover x button, calling the existing DELETE /api/templates/:id endpoint. * add missing onDeleteTemplate prop in test fixtures * add template deletion flow test for NewProjectPanel * reject template names longer than 100 characters * preserve original createdAt on template update * feat: add FAQ page skill (#1162) * fix: set writable OD_DATA_DIR default for nix run Fixes #1157 When running via 'nix run github:nexu-io/open-design', the daemon attempted to create runtime state under the Nix store package path: /nix/store/.../lib/open-design/.od/projects The Nix store is read-only at runtime, causing startup to fail with ENOENT when mkdir() tried to create the projects directory. This commit updates the nix run wrapper to export OD_DATA_DIR with a writable default ($HOME/.od) when the variable is unset. Users can still override it by setting OD_DATA_DIR before running. The Home Manager and NixOS modules already set OD_DATA_DIR, so they are unaffected by this change. * feat: add FAQ page skill Add a new skill for generating Frequently Asked Questions pages with: - Collapsible accordion sections for Q&A pairs - Real-time search functionality - Category filtering (Billing, Account, Technical, General) - Smooth animations and transitions - Keyboard navigation support - Mobile-friendly responsive design - Semantic HTML with proper ARIA attributes The skill includes: - SKILL.md with triggers, workflow, and output contract - example.html demonstrating a complete FAQ page with 12 questions Use cases: help centers, support pages, product documentation * fix: address PR review feedback for FAQ page skill - Fix craft slugs: use accessibility-baseline and state-coverage instead of non-existent slugs - Remove overly broad 'questions and answers' trigger - Add edge case handling for insufficient/excessive FAQs - Remove search highlighting requirement (XSS risk) - Update self-check to reflect filtering instead of highlighting Addresses review comments from @lefarcen and @chatgpt-codex-connector * feat: add localized copy for faq-page skill Add German, French, and Russian translations for the FAQ page skill example prompt to fix validation test failure. - DE: FAQ-Seite mit Akkordeon-Abschnitten, Suchfunktion und Kategoriefilterung - FR: Page FAQ avec sections accordéon, recherche et filtrage par catégorie - RU: Страница FAQ со складными секциями-аккордеонами, поиском и фильтрацией * fix: escape apostrophe in French translation Use double quotes to avoid syntax error with d'auth * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins (#1110) * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins fnm legacy installations use ~/.fnm/node-versions. Closes #1102 * fix: remove stray .fnm token from type declaration * docs: add Windows troubleshooting guide (#478) (#1170) * docs: add Windows troubleshooting guide (#478) Add docs/windows-troubleshooting.md with step-by-step fixes for the most common native-Windows setup errors: - Node 24 / nvm-windows gotchas (fake nvm file in System32) - pnpm not found after installation - Build scripts blocked by pnpm 10 (better-sqlite3, sharp) - Visual Studio / gyp build errors - Starting the dev server - Optional OpenCode CLI setup Also update CONTRIBUTING.md and QUICKSTART.md to link to the new guide instead of the vague "file an issue if it doesn't" note. * docs: fix Windows guide command accuracy (#1170) Address all 6 inline review comments from lefarcen: - Pin npm-global pnpm install to @10.33.2 (matches packageManager field) - Use where.exe instead of bare where (PowerShell alias conflict) - Fix OpenCode package: opencode-ai (not opencode), binary is opencode - Add EPERM fallback note for corepack enable on protected installs - Add Python check for gyp ERR! find Python - Expand diagnostic checklist with corepack, python, execution policy Also remove redundant corepack pnpm --version from checklist. * feat(daemon): inject compiled design-system tokens + fixture into prompts (#1385) * feat(daemon): inject compiled design-system tokens + fixture into prompts Follow-up to #1231. The prior PR landed the structured form of two brands (`default` + `kami`) and codified the schema; this PR teaches the daemon to actually consume those files when assembling the system prompt, so agents stop having to re-derive token names from DESIGN.md prose every turn. Gated behind `OD_DESIGN_TOKEN_CHANNEL=1` for the smoke-test phase — flag-off keeps the daemon byte-equivalent to today's behavior, flag-on appends two new prompt blocks (the brand's `tokens.css` :root contract and its `components.html` reference fixture) right after the existing DESIGN.md block. Brands without those sibling files (every brand except `default` and `kami` today) skip silently in either mode. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): only swallow ENOENT/ENOTDIR in readFileOptional, rethrow rest Reviewer feedback (nettee, #1385). The prior catch-all hid permission errors, EISDIR, and broken packaged-resource paths behind the same "undefined = absent" branch the legacy ~138-brand fallback uses, which would let `OD_DESIGN_TOKEN_CHANNEL=1` silently degrade to the DESIGN.md-only prompt while reporting success. That corrupts the exact signal the smoke-test rollout depends on. Now `readFileOptional` only returns undefined for ENOENT / ENOTDIR (real "file does not exist" cases) and rethrows everything else. Added a focused test that plants a directory at the tokens.css path to exercise the EISDIR branch, plus a partial-presence regression test to confirm the stricter contract preserves the legacy fallback. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16> Co-authored-by: Cursor <cursoragent@cursor.com> * feat(daemon): make connection-test timeouts configurable (#1222) * feat(daemon): make connection-test timeouts configurable Provider and agent connection tests had hardcoded 12s / 45s budgets, which are too tight for slow networks or distant providers (the user sees "timeout" in Settings with no way to extend the budget). - Add OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS (default 12_000) - Add OD_CONNECTION_TEST_AGENT_TIMEOUT_MS (default 45_000) - Invalid values (non-numeric, zero, negative, fractional) emit a console.warn and fall back to the default, so a typo in the env never silently disables the safety timeout. - Export resolveConnectionTestTimeoutMs for unit testing; cover the three resolution paths (fallback / honored override / invalid). 41 connection-test tests pass (+3 new), full daemon suite 1170/1170. * fix(daemon): reject connection-test timeout overrides above Node's setTimeout maximum Node's `setTimeout` silently clamps any delay above `2^31-1` ms (2_147_483_647) to ~1 ms with a TimeoutOverflowWarning. The previous `Number.isInteger(n) && n >= 1` check accepted oversized values unchanged and passed them straight to `setTimeout`, so an override that *intended* to raise the budget — e.g. `OD_CONNECTION_TEST_AGENT_TIMEOUT_MS=3000000000` — instead caused every connection test to fail almost immediately. The safety timeout was effectively disarmed. Add `MAX_CONNECTION_TEST_TIMEOUT_MS = 2_147_483_647` and switch the guard to `Number.isSafeInteger(n) && n >= 1 && n <= MAX...`. The boundary value is still accepted; one millisecond past it falls back with a warn. Regression test exercises `3_000_000_000`, `2_147_483_647`, and `2_147_483_648`. Addresses #1222 review feedback from @chatgpt-codex-connector, @mrcfps, and @lefarcen. * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN SSRF bypass) (#1122) * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN bypass) new URL('http://192.168.1.5./').hostname returns '192.168.1.5.' — the trailing dot is the RFC 1034 absolute-FQDN form and resolves identically to '192.168.1.5'. parseIpv4 fails on the dotted form, so 169.254.169.254. slips past the metadata-service block, 192.168.1.5. slips past the LAN block, and localhost. slips past the loopback identification. Strip trailing dots in normalizeBracketedIpv6 so all downstream checks (isLoopbackApiHost, isBlockedExternalApiHostname, isBlockedIpv4, IPv6 range tests) see the canonical form. Adds 6 vitest cases covering loopback FQDN forms (localhost., foo.localhost., 127.0.0.1.) and SSRF FQDN bypasses (169.254.169.254., 192.168.1.5., 10.0.0.5.). Refs nexu-io/open-design#1119 review feedback (P2 from @lefarcen). * test(connectionTest): tighten trailing-dot coverage per #1122 review Two issues from #1122 review: 1. (P2 from @mrcfps + codex bot) The original `foo.localhost.` case asserted error===undefined on validateBaseUrl, which only proves the URL passed validation — not that the host is identified as loopback. Replaced with direct isLoopbackApiHost(...) assertions on the actual loopback FQDN forms (localhost., 127.0.0.1., 127.0.0.5.) so the test exercises the loopback path the comment claims. 2. (P3 from @lefarcen) Original blocked-FQDN tests covered only 3 of 7 ranges that isBlockedIpv4 handles. Added a dedicated case per range (0.0.0.0/8, 10/8, 100.64/10, 169.254/16, 172.16/12, 192.168/16, multicast >=224) so future regressions in normalizeBracketedIpv6 surface against the full coverage. * docs: drop misleading foo.localhost./endsWith claim in normalizer comment @lefarcen review feedback: isLoopbackApiHost only accepts exact 'localhost', '::1', loopback IPv4, and mapped loopback IPv4 — there's no subdomain or endsWith handling, so referencing 'foo.localhost.' overstates what the trailing-dot strip enables. Rewrite the comment to match actual call sites (isLoopbackApiHost equality + isBlockedIpv4 numeric parse). * feat(daemon): export self-contained HTML via /export/*?inline=1 endpoint (#1312) * test(daemon): add Red unit tests for inlineRelativeAssets helper 14 cases pinning the behavior contract for the upcoming apps/daemon/src/inline-assets.ts helper: - link/script inlining with verbatim body preservation - non-src script attrs preserved (type=module, defer, crossorigin) - relative path resolution (root + nested + deep-nested owners) - self-closing and single-quoted attr forms - negative cases: missing rel, rel=preload, absolute/data/blob/leading-slash - escaping: </style and </script inside body - null-fileReader graceful degradation - duplicate identical tags fully replaced (diverges from apps/web/src/components/FileViewer.tsx:5313's first-match-only; locked decision per plan §3.3) - HTML-escaped data-od-inline-asset attr Tests intentionally Red — module ../src/inline-assets.js does not yet exist. Phase B-G of plan declarative-roaming-gosling.md will turn them green by porting FileViewer.tsx:5248-5354 server-side. Refs nexu-io/open-design#368. * feat(daemon): port inlineRelativeAssets server-side for export endpoint Adds apps/daemon/src/inline-assets.ts — a pure helper that takes (html, ownerFileName, fileReader closure) and returns the HTML with every relative <link rel=stylesheet> and <script src> contents inlined into <style data-od-inline-asset="…">/<script>…</script> blocks. The fileReader closure keeps the helper free of fs/Express coupling so the route handler owns the filesystem boundary. Port source: apps/web/src/components/FileViewer.tsx:5248-5354 — five functions (inlineRelativeAssets, resolveProjectRelativePath, baseDirFor, readHtmlAttr, escapeHtmlAttr). The fetch hop becomes the fileReader closure; replace-all replaces first-match-only per locked design decision §3.3 (inline comment in inline-assets.ts cites the divergence from FileViewer.tsx:5313 and notes the web inline path is on a deprecation track since PR #384 made URL-load the default). Phase B-G of plan declarative-roaming-gosling.md. All 14 unit cases from the Red commit ( |
||
|
|
0220124e0f
|
test: add Memory and Routines coverage (#1400)
* test: align extended Playwright coverage with current UI behavior * test: address extended suite review feedback * test: fix Codex fallback config hydration in e2e * test: add Memory and Routines coverage * test: fix Memory and Routines component test typing * test: include Memory and Routines e2e in extended suite |
||
|
|
2a0ebea50b |
release: Open Design 0.7.0
- bump 14 monorepo package.json files to 0.7.0 (root + apps/{web,daemon,desktop,packaged,landing-page} + packages/{contracts,platform,sidecar,sidecar-proto} + tools/{dev,pack,pr} + e2e); apps/packaged was already at 0.6.1 from beta lane, all others at 0.6.0
- add CHANGELOG.md [0.7.0] - 2026-05-12 entry covering 97 merged PRs since 0.6.0:
- Critique Theater: Phase 7 web client state machine (#1307) + Phase 6.2 daemon artifact extraction (#1085)
- Web/UI: thumbs-up/down feedback widget (#1308), Cmd+, opens Settings (#1173), Finalize design package + Continue in CLI (#974), fetch models button for BYOK (#1034), provider models alphabetical sort (#1097), collapsible MCP JSON field-mapping (#1136), design file rename (#894)
- Daemon: auto-memory store with chat-protocol-aware extraction (#999), install/uninstall skills & design systems (#1003), HTTP 206 range requests for video/audio (#1105), scheduled routines (#1033), agent runtime + route registration refactor (#1063, #1043)
- HyperFrames: HTML-in-Canvas across web + skills (#866)
- Skills/design systems: generic skills + design-templates split + finalize-design API (#955), agent-browser skill (#1284), WeChat design system + login-flow skill (#1083), hud/loom/trading-terminal design systems (#1069), release-notes-one-pager skill (#873), tokens.css schema (#1231)
- Packaging: macOS Intel (x64) build (#759), official Nix flake (#402), beta packaging cache (#1095)
- Maintainer ops: tools-pr PR-duty workspace (#1259), MAINTAINERS.md (#1290), contributor card bot (#932), PR→issue linking discipline (#1263)
- Changed: conversation run isolation (#1271), default English i18n fallback (#1270), Codex CLI exit diagnostics / empty-response handling / path fallback (#1267, #1244, #1205)
- Fixed: ~30 web + desktop + daemon + packaging bugfixes
- Internal: nightly UI/desktop regression coverage (#1256), e2e/release report hardening (#1140), entry/settings automation (#954)
- catch up [Unreleased] compare link to v0.7.0 and add missing [0.6.0] release link
- add 97 PR footnote refs ([#402]..[#1330])
Verified locally: pnpm install + pre-build contracts/daemon/desktop dist + pnpm typecheck (exit 0 across all 14 packages on Node 22.22 with engine-warning).
Release workflow validation runs after merge via release-stable.
|
||
|
|
5d674410f2
|
test: stabilize extended Playwright coverage (#1341)
* test: align extended Playwright coverage with current UI behavior * test: address extended suite review feedback * test: restore Codex path hydration assertion |
||
|
|
9c489aa045
|
feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select (#1161)
* feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select - Render real previews on project cards: HTML iframe / image / video / hashed gradient fallback with project initial; lazily fetches the project's primary file when metadata.entryFile is unset, prefers index.html → newest html → image → video. - Live artifact card thumbnails embed the rendered artifact URL via sandboxed iframe. - Replace the per-card close button with a `…` overflow menu (Rename, Delete) that opens on hover/click; click-outside and Esc close it. - Add multi-select mode (toolbar toggle → checkbox per card → "N selected · Delete · Cancel" pill) with batch delete via the existing onDelete prop. - Add a category tag to every card (Prototype / Live Artifact / Slide / Media) derived from project.metadata.intent / kind / skillId. - Replace browser prompt() and confirm() with custom modals (rename input + danger-confirm) reusing the existing .modal shell. - Add `more-horizontal` icon and 16 new i18n keys across all 18 locales (zh-CN/zh-TW localized; others fall back to English). * test(e2e): update home delete flow for overflow menu + custom confirm modal The previous flow targeted a per-card X button labelled "delete project <name>" and asserted on a native `dialog` event. The card UI now exposes a `…` overflow menu and a styled confirm modal, so reach delete via the menu and assert against the modal's Cancel / Delete buttons instead. * fix(web): harden Designs tab preview sandbox * fix(web): hide Designs select mode in kanban |
||
|
|
928079daf5
|
feat(web): consolidate Image/Video/Audio entries into a Media tab (#1167)
Reduces the New Project panel's top-level tab count by collapsing the three media surfaces into a single Media tab with an inner segmented control, and polishes the controls inside that tab so they stop dominating the panel: - Media tab + segmented (Image / Video / Audio) inside the panel body. Underlying ProjectKind branches and submission contract unchanged — the daemon still receives kind=image/video/audio. - Model picker rewritten as a combobox: one trigger row + searchable, provider-grouped popover with Recommended badges. Replaces the flat grid of provider-grouped cards that scrolled past the fold once the fourth provider landed. - Aspect picker compressed from a 5-card grid to a single row of segmented pills with mini ratio glyphs. - Image surface no longer carries a free-form Style notes field; it was redundant with the prompt template + main prompt input. - Live artifact tab locks fidelity to high-fidelity (the wireframe option is now hidden) — a wireframe live artifact doesn't make sense and the picker added noise. i18n: adds tabMedia / titleMedia / model* keys across all 18 locales, removes imageStyleLabel / imageStylePlaceholder. Tests + e2e selectors updated to drive the new Media tab + segmented surface flow. |
||
|
|
1b307bf17f
|
feat(web): tweaks palette popover with HSL hue-shift recoloring (#1292)
* feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal |
||
|
|
fb47d0ae51
|
style(web): polish EntryView UI — sidebar layout, folder tabs, slim form, blue selected token (#1360)
* chore(web): upgrade radius scale + introduce blue --selected token
UI polish pass — design tokens for follow-up commits.
Radius scale was visually too square at the small end. Bump up so
buttons / inputs / cards feel rounded rather than boxy:
- `--radius-sm: 6px → 8px` (buttons, inputs, small chips)
- `--radius: 10px → 12px` (medium containers, Recent filter pill)
- `--radius-lg: 14px → 16px` (project cards)
- `--radius-pill: 999px` unchanged (status chips)
Introduce a separate "selected" colour so selection indicators
(card borders, focus rings) read as blue instead of fighting with
the orange brand accent that drives primary CTAs:
- `--selected: #2563eb` (Tailwind blue-600)
- `--selected-soft: rgba(37, 99, 235, 0.16)` (soft tint for shadows)
No selectors are migrated to `--selected` in this commit — that
happens in a later "selected state" commit so the diff stays scoped.
* refactor(web): replace entry global header with sidebar brand + reorder bottom chips
Pre-existing layout: a global \`AppChromeHeader\` strip sat across the
whole top of EntryView (logo + settings gear), then a 2-column body
below it. Visual mass concentrated in a thin horizontal bar that did
not relate to the page's column structure, and the settings gear
duplicated the bottom Local-CLI chip.
New layout matches the two-column "brand-in-sidebar + tabs-in-main"
pattern: the brand block lives at the top of \`.entry-side\` (left
column), the right tabs live at the top of \`.entry-main\`, and the
vertical divider between them is the only horizontal seam.
EntryView:
- Drop \`<AppChromeHeader actions={avatarMenu} />\` from EntryView's
render — the home page no longer renders the global chrome strip.
(ProjectView still uses AppChromeHeader for back-nav / file
actions, so the component itself stays in the codebase.)
- Add a sidebar brand block inside \`.entry-side\` using the
already-defined \`.entry-brand\` / \`.entry-brand-mark\` /
\`.entry-brand-title\` classes that were sitting dead in
index.css.
- Reorder \`.entry-side-foot\` chips so that the env-critical
Local CLI row sits on top of the row, with the secondary
toggles (language picker, pet adoption, X follow icon) compact
on a second row. The Follow @nexudotio chip drops its text
label and becomes icon-only — pure marketing content, so it no
longer earns a full-width pill.
- Settings access moves entirely to the Local CLI chip's existing
click handler; the top-right gear is gone (it was a duplicate).
CSS:
- \`.entry-shell\` grid: \`auto 1fr\` → \`1fr\` (no header row).
- \`.entry-side\` background: \`var(--bg-panel)\` → \`transparent\`,
so the sidebar shares the page beige and only the New-prototype
card reads as white. Removes the "everything on the left is on
one big white sheet" feeling.
- \`.entry-brand\` gets \`padding: 24px 20px 18px\` so the logo +
title block has breathing room at the top of the sidebar.
- \`.entry-brand-mark\` width/height \`44 → 34\`. The previous
44px gradient ring was visually heavier than the title text it
sat next to.
- \`.entry-brand-title\` weight \`600 → 450\`, color
\`var(--text-strong)\` → \`var(--text)\`. Serif title still
reads as the page anchor without the chunky "bold black" stamp.
- \`.entry-brand-actions\` added for future right-aligned actions
(carries no actual content in this commit — kept so re-adding a
settings/avatar entry point doesn't need new CSS).
- \`.entry-side-foot .foot-pill\` slim pass: padding
\`4px 10px → 3px 8px\`, font \`11.5px → 10.5px\`, gap \`6 → 5\`,
plus \`justify-content: center\` and \`min-height: 24px\` so the
icon-only Follow pill stays the same height as the text pills
next to it.
* style(web): align right tabs row with brand row + strip hover/focus noise
Right column's tabs row ("Designs / Templates / Design systems /
Image templates / Video templates") needed three things:
1. Vertical center of tab text aligned with the brand logo on the
left (both rows feel like one row, separated by the vertical
divider only).
2. Active tab's underline sitting flush on the horizontal divider
below the tabs (not floating mid-row).
3. No hover background, no focus outline, no transition — tabs are
a navigation strip, not action buttons.
Changes:
- `.entry-header` padding `0 28px` → `24px 28px 0`, drop the
`min-height: 52px`. Padding-top mirrors the brand block's
padding-top (24px) so left logo top and right tabs top land on
the same Y. Header height now content-driven; underline meets
the `border-bottom` divider naturally.
- `.entry-tabs` gets `align-self: stretch` + `align-items: center`
+ `gap: 2px → 24px`. The stretch lets the tabs container fill
header height; the bigger gap matches Claude Design's tab
rhythm.
- `.entry-tab` becomes a "plain underline tab":
- `border-radius: 6px 6px 0 0 → 0` (no folder-tab look — that's
on the left tabs).
- `padding: 14px 11px → 6px 4px 8px` so text + underline form a
tight group, with the underline sitting at the bottom of the
tab box right above the header divider.
- `font-size: 14px → 12px` matches the left newproj tabs (set
in commit 4) — both columns share the same tab type-size.
- `transition: none` removes the inherited 120ms background /
border / color transition.
- Hover / focus / active states explicitly zero out background,
border-color, outline. Hover keeps a subtle color change
(`text-muted → text`) so the tab still feels interactive
without flashing a chip behind it.
- Active state colors are duplicated across `.active`,
`.active:hover`, `.active:focus`, `.active:focus-visible` so the
black underline never gets overwritten by the inactive-state
rules above.
* style(web): folder-tab merge on left newproj tabs + flat card top corners
The left "Prototype / Live artifact / Slide deck / …" tabs sat as
plain underline tabs above a fully-rounded card. The active tab and
card looked like two stacked rectangles with a gap.
Folder-tab pattern:
- Active tab gets a white background + 12px top corners + a 1px
border on top / left / right.
- Active tab's bottom border matches the card's background color
(effectively invisible) — so where the tab sits, the card's top
border is "broken" and tab + card read as one merged shape.
- Card top corners are square (`border-radius: 0 0 12px 12px`),
bottom corners stay 12px. With the active tab's square bottom
edge, the merge line at the tab/card seam is a clean horizontal,
not a curve mismatch.
Implementation:
- `.newproj-tabs-shell`:
- `overflow: hidden → visible` so the tab's overlap with the
card below isn't clipped at the shell's bottom edge.
- `margin-bottom: -1px` + `z-index: 2` so the shell renders on
top of the card and the 1px tab/card overlap actually paints.
- The `.can-left { padding-left: 40 }` / `.can-right` overrides
used to reserve room for scroll arrows are removed (arrows
are hidden, no extra padding needed).
- `.newproj-tabs` keeps its horizontal `overflow-x: auto` so the
8 project-type tabs can still scroll inside the sidebar width.
- `.newproj-tabs-arrow` becomes `display: none`. The two
chevron-circle buttons added clutter without much benefit —
users with touchpads / wheels / keyboard already scroll the
tabs row natively, and the `::before` / `::after` linear-
gradient fades (now using `--bg` instead of `--bg-panel` so
they fade into the page beige, not the sidebar panel that no
longer exists) signal there are more tabs to the right.
- `.newproj-tab`:
- Replace the plain bottom-underline (`border-bottom: 2px
solid transparent`) with a full transparent 1px border so
the active state can flip just the colors without changing
layout.
- `border-radius: 0 → 12px 12px 0 0`.
- `position: relative` for z-index stacking.
- `padding: 10px 6px → 7px 14px` (less vertical, more
horizontal — tabs read as "labels" rather than chunky
buttons).
- Symmetric top/bottom padding (`7px`) so the text + folder-
tab top corners stack cleanly.
- `transition: none` — no animation between active/inactive
states (tabs are nav, not action buttons).
- All hover / focus / focus-visible / active states zeroed out
background and border-color so the inherited `button { … }`
base style (which adds bg-subtle on hover) does not bleed in.
Subtle color change on hover (`text-muted → text`) is the
only affordance.
- `.newproj-tab.active` (+ active hover/focus combos so the
base rules don't override): white bg, full var(--border) on
three sides, bottom border = var(--bg-panel) (invisible
against card), z-index 3 (above non-active tabs and shell
pseudo-elements).
- `.newproj-body`:
- `margin: 0 24px` so the card breathes inside the sidebar
(and the active tab's left edge aligns with the card's left
edge).
- `padding: 18px 24px 28px → 16px 18px 18px` — tighter.
- `border-radius: (full 12) → 0 0 12px 12px` for the
flat-top merge with the active tab.
- Adds explicit `border` + `background: var(--bg-panel)` +
`box-shadow: var(--shadow-xs)` so the form reads as a card
floating on the transparent sidebar.
- `flex: 1 → 0 0 auto` (and `min-height: 0` / `overflow-y:
auto` removed) — the card is content-sized, not stretched
to fill the sidebar. Empty space below the card is now
page beige, not a giant white sheet.
- `gap: 14px → 12px` between form sections.
* style(web): slim NewProjectPanel form (title, fidelity, buttons, ds-picker)
The form inside the new white card felt overweight against the
compacted layout from the previous commits — fidelity cards were
~133px tall, the Create button + Open-folder secondary button both
had ~11px symmetric padding, the design-system trigger had a 32px
avatar in a 55px-tall row. Slim every element so the card reads as a
focused form, not a stack of beefy buttons.
Title:
- \`.newproj-title\` font \`14px / 600 → 13px / 550\`. Still
visibly the section heading but no longer competing with the
serif brand title above.
Fidelity:
- \`.fidelity-thumb { aspect-ratio: 12/7 → 16/7 }\`. The previous
aspect made cards taller than they needed to be in the narrow
sidebar column.
- \`.fidelity-card { gap: 8 → 6, padding: 10/10/12 → 8/8/10 }\`.
Combined with the thumb aspect change, card height drops from
~133px → ~102px (visually close to the Claude Design reference
while keeping the same content).
Primary / secondary buttons:
- \`.newproj-create\` padding \`11px (symmetric) → 8px 11px\`,
margin-top \`4 → 2\` — primary CTA no longer towers over the
fidelity cards above it.
- \`.newproj-import\` padding \`10px → 6px 10px\` — the secondary
"Import Claude Design ZIP" button feels like an alt option, not
a peer of Create.
Design system trigger:
- \`.ds-picker-trigger\` gap \`10 → 8\`, padding \`8/10 → 6/10\`.
- \`.ds-picker-title\` font \`13 → 12.5\` so name + subtitle stay
legible in the slimmer row without overflowing the column.
- \`.ds-avatar\` width/height \`32 → 26\`, border-radius \`6 → 5\`.
The thumbnail was the dominant element in the row; shrinking it
pulls the row height from ~55px → ~50px.
Footer disclaimer:
- \`.newproj-footer\` padding-top \`0 → 12px\`. The "Only you can
see your project by default." line was butting against the card
bottom; 12px of air separates the disclaimer (page-bg context)
from the card (panel-bg context) cleanly.
* style(web): blue selected indicators + Recent filter rounded + neutral input focus
Three small "selection state" tweaks driven by the new
\`--selected\` token introduced earlier in this branch:
1. Fidelity card selected border is now blue, not the brand
accent. The orange Create button + the orange selected card
border were fighting for the same visual role (primary
action vs primary selection). Blue clearly says "this is
the one that is selected" without competing with the CTA.
- \`.fidelity-card.active\` border-color
\`var(--accent) → var(--selected)\`.
- Box-shadow ring + soft 0.04 drop swapped from the orange
\`180/90/59\` rgba tuple to the blue \`37/99/235\` tuple.
- \`.fidelity-card.active .fidelity-thumb\` border swapped
from \`var(--accent-soft) → var(--selected-soft)\`.
2. Recent / Your designs filter is no longer a fully-rounded
pill. The bottom-left settings chips deserve to be the only
"999px pill" shape — those are tertiary status indicators.
The Recent/Your designs toggle is a higher-importance
inline filter, so it gets the medium radius instead.
- \`.subtab-pill\` wrapper border-radius
\`var(--radius-pill) → var(--radius)\` (12px).
- Inner button border-radius
\`var(--radius-pill) → var(--radius-sm)\` (8px).
- Active state background \`var(--text) → var(--bg-panel)\`,
color \`var(--bg) → var(--text)\`. The "black filled pill"
read as a status badge; white-on-faint-gray reads as
"selected toggle" — same shape as Claude Design's Recent
pill.
3. Input focus is neutralised. The base \`input:focus\` rule
added an orange border + a 3px orange-soft ring around the
focused field — way too much visual weight for a quiet form
("Project name" → focus made it scream).
- \`input:focus / textarea:focus / select:focus\` border-color
\`var(--accent) → var(--border-strong)\` (light grey).
- Box-shadow ring removed (\`none\`). Focused inputs now only
darken their border by one step — barely visible but enough
to confirm focus.
These three changes are grouped because they all migrate selection-
state styling off the brand accent and onto neutral / blue tokens.
The next pass (if any) can sweep the remaining \`var(--accent)\`
selection sites (\`.ds-row.active\`, \`.ds-picker-trigger.open\`,
\`.conv-pill.open\`, …) to use \`--selected\` too, but each of those
lives in a different surface and felt out of scope for the entry
view polish.
* refactor(web): pet rail toggle moves inside pet pill as split button
WHAT
- Convert the pet pill from a single `<button>` to a `<div>` containing
two buttons separated by a 1px divider:
* `.pet-pill-main` keeps the existing "Adopt a pet" / "Change pet"
glyph + label + unadopted dot, still wired to `onAdoptPet`.
* `.pet-pill-toggle` is a small icon-only button that flips
`petRailHidden` — eye icon when the rail is hidden ("click to show"),
eye-off when visible ("click to hide").
- Drop the old avatar-menu popover from EntryView entirely:
`avatarMenuOpen` state, the outside-click / Escape effect, and the
cog-popover trigger are all removed. The `Settings` entry of that
popover was already redundant with the `Local CLI` chip; the
`Hide/Show pet picker` entry now lives directly on the pet pill.
- CSS in the `.pet-pill` block:
* `height: 24px` + `padding: 0` so the outer pill matches every other
chip in the row vertically.
* `.pet-pill-glyph` reduced from 14px to 12px and constrained to a
14x14 inline-flex box so the unicorn / paw glyph stops pushing the
chip taller than 24px.
* Per-region hover (`.pet-pill-main:hover`, `.pet-pill-toggle:hover`)
so each side of the split lights up independently, with the divider
inheriting the accent tint while the chip is in `pet-pill-fresh`.
WHY
- After commit
|
||
|
|
b55f171693 |
feat(web): redesign entry view with new navigation and home layout
- Introduced `EntryNavRail` for a streamlined left navigation experience, featuring primary actions and a brand logo. - Created `EntryShell` to manage the entire home view layout, integrating the centered hero, recent projects, and plugins section. - Developed `HomeHero` for user prompt input, allowing seamless interaction with plugins and project creation. - Replaced the previous `PluginLoopHome` with a more cohesive `HomeView` that orchestrates plugin interactions and project submissions. - Enhanced CSS styles for improved visual consistency across the redesigned components. This update significantly enhances the user experience by providing a more intuitive and visually appealing entry point into the application. |
||
|
|
a75d9938c7
|
feat(design-systems): add structured tokens.css schema (default + kami) (#1231)
* feat(design-systems): add structured tokens.css schema (default + kami) Compile each brand's DESIGN.md prose into a machine-readable :root block agents paste verbatim, removing the "Primary → --accent" translation step where most token misuse happens. Daemon prompt injection lands in a follow-up; lint-artifact already enforces the shared token vocabulary so no rule changes needed. Schema validated across two contrasting aesthetics: - default (sans-serif, cobalt, B2B utility) — stress test the shallow form, 2-level fg / 2-level surface - kami (serif, parchment, ink-blue, print-first) — stress test the rich form, 4-level fg ramp, 3-level surface, ring elevation, i18n font stacks, and solid-hex tag tints (print renderers double-paint alpha) Schema growth from kami's stress test (5 new optional slots, all backward-compatible — default aliases via var() to existing tokens): - --fg-2 / --meta (4-level fg ramp) - --surface-warm (3-level surface) - --border-soft (2-level border) - --elev-ring (ring elevation as first-class level) Brand-specific extensions live in tokens.css with explicit "NOT in shared schema" labels and a documented promotion path (≥2 brands need it → promote to schema slot). components.html in each brand is a self-contained reference fixture that exercises every token through real layouts. Both fixtures lint clean against apps/daemon/src/lint-artifact.ts. Co-authored-by: Cursor <cursoragent@cursor.com> * feat(design-systems): add token-fixture drift guard Each design system in design-systems/<brand>/ ships two files agents consume in tandem: tokens.css (canonical token bindings) and components.html (a self-contained fixture whose first <style> embeds the same :root paste so the file renders standalone). The fixture's :root block is a copy of tokens.css's :root block, kept in sync only by an inline comment. This adds scripts/check-tokens-fixture-sync.ts and registers it in pnpm guard. The check pairs each brand's tokens.css with its components.html and asserts the unscoped :root block is byte-equivalent after canonical normalization (CSS comments stripped, whitespace collapsed, separator spacing normalized). Brands missing one half of the pair, or with no :root rule in either file, fail the guard. Scoped overrides like :root[lang="zh-CN"] are not required to appear in the fixture (per the kami fixture's inline comment they are pasted only when an artifact's <html lang> matches), so the check only compares the unscoped :root block. Verified: pnpm guard passes for default + kami, fails on intentional value drift, fails on missing token, tolerates whitespace-only formatting differences. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(design-systems): point fixture CTAs to real files Both default and kami components.html advertised in-page anchors (#tokens, #spec, #surface, #accent, #type, #components) but defined no matching ids, so every CTA was a no-op when the fixture was opened locally — flagged by mrcfps in #1231. Re-point each link to a real artifact in the same brand directory: - "View tokens" / "Inspect tokens" / "Inspect typography" → ./tokens.css - "Read the spec" / "Read the rule" → ./DESIGN.md Browsers render these as raw source views, which is the desired UX for a reference fixture: clicking the CTA shows the underlying contract instead of jumping to nothing. Agents copying the fixture also learn the pattern of "buttons link to actual sibling resources". The :root token block is unchanged, so the token-fixture drift guard still passes for both brands. Co-authored-by: Cursor <cursoragent@cursor.com> * feat(design-systems): codify token schema (A1/A2/B/C layers) The two-brand pilot (default + kami) settled the shape of the shared token schema; this commit codifies it as a machine-readable contract and enforces it in pnpm guard, addressing lefarcen's review on #1231: > the optional-vs-required split won't generalize cleanly when brand > #3 needs different Layer A tokens or when multiple brands converge > on the same extension (promoting C→B→A). Consider surfacing that > limitation in the PR narrative or in a future SCHEMA.md. Schema lives under design-systems/_schema/ as three files: - tokens.schema.ts — TypeScript declaration of every shared token with its layer (A1-identity / A1-structure / A2 / B-slot), plus per-brand C-extension allowlists and a global C-prefix allowlist - defaults.css — CSS mirror of A2 fallback values, used as the human-readable contract reviewer's-eye copy and the future input to the derive script - AGENTS.md — schema layer model, C → B-slot → A2 promotion rules, when-not-to-add-a-token guidance Layer model: A1-identity 8 tokens — bg/surface/fg/muted/border/accent + font-display/font-body. The brand IS these values; no fallback is defensible. A1-structure 18 tokens — type scale (8), leading (2), tracking (1), section-y (3), container (4). Structural decisions vary per brand by design and have no cross-brand default. A2 26 tokens — accent states, semantic colors, motion, base spacing scale, radius, elevation, focus, font-mono. Required in every tokens.css; fallback lives in defaults.css for the future derive script to inline when DESIGN.md does not specify the value. B-slot 4 tokens — fg-2 / meta / surface-warm / border-soft. Brand may bind independently or alias the named sibling via var(...) for components that target the richer ramp. C-extension n tokens — brand-specific names (kami's tag-bg-*, leading-display, accent-light, etc.). Allowlisted per-brand in BRAND_EXTENSIONS or globally by prefix in BRAND_EXTENSION_PREFIXES. Promote when a second brand adopts the same name. Why A2 fails the guard today: Artifacts are generated by agents pasting one brand's :root block into a single <style>; there is no global stylesheet that supplies fallbacks at runtime. A tokens.css missing an A2 declaration would silently break any var() reference in the fixture. Until the derive script (PR-B) lands and inlines defaults, every brand's tokens.css must declare every A2 token directly. The guard enforces this strictly. Why --font-mono lands in A2 (not A1): 149 brands' DESIGN.md files were surveyed: 87 (58%) declare a monospace stack, 62 (42%) do not — including major brands like bmw / nike / apple / notion / mastercard / meta. Agent paste cannot rely on the brand author having written it down; a defaultable A2 fallback (with CJK brands like kami overriding) is safer than forcing every brand author to add a field they may not realize their kbd / code-block components need. Five guard checks, each registered as its own entry in scripts/guard.ts so failures attribute to a specific contract: 1. token-fixture sync — components.html :root ↔ tokens.css :root byte-equivalent (existing) 2. A1 required tokens — every brand declares every A1 token 3. A2 required tokens — every brand declares every A2 token 4. unknown token allowlist — every declared token is in schema or brand-extension allowlist 5. A2 defaults parity — defaults.css ↔ tokens.schema.ts fallback byte-equivalent Verified on default + kami: - 26 A1 tokens declared in both brands - 26 A2 tokens declared in both brands - 129 total declarations, all match shared schema or brand extensions - defaults.css ↔ tokens.schema.ts parity holds - sanity test: drifting --motion-fast in defaults.css fails check 5 with a clear divergence message The PR description originally listed "Dedicated SCHEMA.md" as explicitly NOT in this PR ("Once 3+ brands ship, extracting a single source of truth becomes worthwhile"). That boundary moves: lefarcen's review surfaced the schema-generalization risk, and the schema must exist as a machine-enforced contract before the derive script can read it. The TS file replaces the markdown that was deferred. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web/tests): pass missing designTemplates prop to ProjectView Pre-existing typecheck regression on main: PR #955 ( |
||
|
|
be77dc0394
|
Default English resource i18n fallback (#1270) | ||
|
|
10802bb0b0
|
test: expand nightly UI and desktop regression coverage (#1256)
* e2e(ui): cover examples preview flows * e2e(ui): cover Codex local CLI fallback UX * test: expand desktop and connector regression coverage * e2e(ui): cover workspace restoration flows * e2e(ui): cover retry recovery workspace flow * test: cover artifact and connector recovery flows * e2e(ui): cover Continue in CLI stale provenance flow * e2e(ui): cover BYOK model fetch caching * test: expand Orbit and desktop connector coverage * e2e(ui): cover workspace quick switcher recovery flows * e2e(ui): cover connector pending authorization recovery * e2e(ui): cover workspace and conversation restoration routes * e2e(ui): cover conversation draft and attachment restoration * e2e(ui): cover conversation history selection recovery * e2e(ui): cover workspace surface conversation selection * test: cover artifact presentation and orbit link behavior * test: cover artifact external link restoration * e2e(ui): cover root-route deep-link restoration * e2e(specs): cover Orbit open-artifact desktop click * e2e(specs): cover desktop artifact open link * test: fix Orbit settings fixture type drift * test: split Playwright critical and extended suites * test: fix ProjectView design template fixtures * ci: split workspace test stages * guard: allow split Playwright suite scripts * test: shrink Playwright critical suite * test: restore omitted Playwright suites |
||
|
|
b5eb8c1647
|
feat: generic skills + split skills/design-templates + finalize-design API (#955)
* feat: general-purpose skills with @-mention composition and user import
Lift skills from "one mode-bound skill per project" to a generic capability
the user can compose per turn:
- Daemon: scan multiple skill roots (user-skills under runtime data, then
the bundled `skills/`); user-imported skills can shadow built-ins by id.
- New `POST /api/skills/import` and `DELETE /api/skills/:id` endpoints,
with CONFLICT/BAD_REQUEST/NOT_FOUND error codes and built-in delete
protection.
- ChatRequest gains `skillIds: string[]`; the chat run concatenates each
picked skill's body (and merges craftRequires) into the system prompt
for that turn only — the project's persistent `skillId` is untouched.
- Web composer: `@` popover now lists skills alongside project files;
picks render as removable chips above the textarea and ride along with
the request as `skillIds`.
- Settings → Library: import form (name/description/triggers/body),
per-card delete for user skills, "user" origin badge.
* chore(web): drop welcome pet teaser + add ds→prompt-template mapping util
- SettingsDialog: remove the inline pet adoption teaser from the welcome
panel so the first-run modal stays focused on configuration.
- New `inferPromptTemplateCategoriesForDs(ds)` helper that maps a design
system's authored metadata to prompt-template gallery categories.
Imported by the design-system gallery wiring on a sibling branch; no
callers in this branch yet.
* feat: split skills/design-templates and add finalize-design API
Phase 0 of the skills/design-templates refactor (specs/current/
skills-and-design-templates.md):
- Move ~104 rendering catalogue entries from skills/ to design-templates/
and keep skills/ for the small set of functional skills that *do work*
on user input (utilities, briefs, packagers).
- Add design-templates/AGENTS.md and skills/AGENTS.md describing the
contract, and a brand-agnostic craft/ surface for opt-in craft rules.
- Daemon: add DESIGN_TEMPLATES_DIR / USER_DESIGN_TEMPLATES_DIR roots and
an /api/design-templates surface mirroring /api/skills. Asset/example
routes still span both registries so existing srcdoc URLs keep
resolving across the rename.
- Web: split LibrarySection into SkillsSection + DesignSystemsSection,
rename the EntryView "Examples" tab to "Templates", and update locales
+ the New-project picker accordingly.
Adds the finalize-design endpoint:
- New apps/daemon/src/finalize-design.ts and packages/contracts/src/api/
finalize.ts — one-shot synthesis of a project's transcript + active
design system + current artifact into <projectDir>/DESIGN.md via the
Anthropic Messages API. Per-project .finalize.lock mirrors the
transcript-export hygiene from PR #493; provider credentials are not
persisted by the daemon.
Other supporting changes:
- README + AGENTS.md updates to document the new directory split and
craft/ surface, plus i18n strings across 13 locales.
- Test refactors and new coverage (finalize-design, runs, sidecar
server, plus refreshed daemon integration tests).
- .gitignore: scope the *.exe ignore to /OpenDesign.exe so legitimate
vendor binaries are no longer hidden.
* fix(merge): move clinical-case-report to design-templates/
Origin/main added the clinical-case-report skill under skills/ before
the skills/design-templates split landed. Its od.mode is prototype, so
per specs/current/skills-and-design-templates.md it is a design template
and belongs alongside the other rendering catalogue entries — not under
the slimmed-down functional skills/ root. Moving it keeps the EntryView
Templates tab consistent with origin/main's intent.
* feat(skills): curated design/creative catalogue + collapsible Settings rows
Seed ~100 curated design/creative skill stubs under skills/ sourced from
awesome-claude-skills (ComposioHQ) and awesome-agent-skills (VoltAgent).
Each stub carries an od.category tag so the new filter pill row in
Settings -> Skills can group them. The seed script
(scripts/seed-curated-design-skills.ts, pnpm seed:curated-design-skills)
is idempotent: it only creates folders that don't already exist, so
hand-edited stubs are never overwritten.
- Daemon: parse and surface od.category on SkillInfo with a strict slug
normaliser; mirror the field on SkillSummary in @open-design/contracts.
Category is purely a UI hint — system-prompt composition is unchanged.
- Web: rewrite SkillsSection from a left-list / right-detail grid into a
vertical stack of collapsible rows mirroring the External MCP panel
(header always visible with name + mode/source/category pills + per-row
enable toggle; SKILL.md preview, file tree and inline edit form expand
on demand). Add a Category filter row above the list. Reorder Settings
nav so Skills + External MCP sit above the Composio/MCP cluster. Update
composer placeholder/hint across 17 locales to advertise '@ files or
skills · / for commands'.
- Docs: extend skills/AGENTS.md with the curated catalogue rules
(idempotency, category vocabulary, no upstream vendoring).
Co-authored-by: Cursor <cursoragent@cursor.com>
* test(skills): teach localized-content + system-prompt tests about the skills/design-templates split
mrcfps blocking review on PR #955: the skills/design-templates split
(
|
||
|
|
f7f2661bda
|
[codex] Handle empty API responses as no output (#1244)
* Handle empty API responses as no output * Fix empty API response comment cleanup * Stabilize API empty response detection |
||
|
|
421ddf553c
|
fix(pack/win): close running app before silent reinstall (#1238) | ||
|
|
e254d1280b
|
feat(memory): auto-memory store with chat-protocol-aware extraction (#999)
* feat(memory): auto-memory store with chat-protocol-aware extraction
Markdown memory store at <dataDir>/memory/ with two extractors —
heuristic regex for explicit "remember:" / "我是 X" markers, and a
small-model LLM pass after each turn — folded into the system prompt
so cross-chat preferences, role, and ongoing-work context survive
restarts.
Settings UI:
- Memory tab lists entries, exposes a hand-edited MEMORY.md index, and
shows an extraction history with per-attempt phase/skip/failure rows.
- Memory model picker is inline next to the chat model picker (CLI and
BYOK) so the choice "which fast model mines facts each turn?" sits
next to the chat-model decision instead of a separate panel. The
picker reuses the same SUGGESTED_MODELS table and "Custom..." pattern
the chat picker uses.
LLM extractor supports all four protocols (anthropic / openai / azure /
google); pickProvider takes the chat agent id from the chat handler
and constrains its auto-pick to the chat's protocol family — Claude
Code chats no longer surprise users by silently extracting on whatever
OpenAI key happens to be in media-config. When no matching key is
configured the attempt records as 'skipped: no-provider' instead of
quietly switching vendors.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): keep hint outside <label> and disambiguate Model selectors
The inline Memory model picker wrapped its hint paragraph inside the
<label>, which made the hint's "API key" / "model" wording bleed into
the <select>'s accessible name and broke Playwright's getByLabel('API
key') / getByLabel('Model') strict-mode matching in the existing
settings-api-protocol e2e suite.
- Move the hint <p> out of the <label> in MemoryModelInline so the
select's accessible name is just "Memory model".
- Switch the chat-Model selectors in settings-api-protocol.test.ts from
getByLabel('Model') to getByRole('combobox', { name: 'Model', exact:
true }) so they no longer collide with the new "Memory model" select
that sits next to the chat Model picker.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): address review changes — BYOK wiring, MEMORY.md index, /v1, label wrapper
Addresses the four blocking review threads on PR #999.
1. MemoryModelInline accessibility (mrcfps)
The inline picker still wrapped its select + custom input + flash +
hint inside a single <label>, which made the select's accessible
name absorb every text descendant — including the "API key" / "model"
hint copy. The previous fix moved only the hint outside; the
reviewer asked for a non-label wrapper. Switch to <div className="field">
and associate just the short title with the controls via
`aria-labelledby` / `aria-label`. The select's accessible name is
now exactly "Memory model" so `getByLabel` strict-mode locators
on the surrounding chat form stop cross-matching the memory copy.
2. Respect the hand-edited MEMORY.md index (mrcfps + codex)
`composeMemoryBody()` was reading every *.md file in the memory
dir, ignoring the index. Removing a `- [Name](id.md)` line had no
effect on future prompts. Parse the index's `INDEX_LINK_RE` bullets
and filter `listMemoryEntries()` to the linked id set, so the
editor's "delete this line to disable injection" promise actually
holds.
3. Versioned OpenAI-compatible base URLs (codex)
`callOpenAI` and `callAnthropic` hard-coded `/v1` onto
`provider.baseUrl`, breaking custom endpoints whose saved URL
already includes `/v1` (`/v1/v1/chat/completions`). Apply the same
conditional `appendVersionedApiPath` helper the chat proxy and
connection-test routes already use.
4. Wire memory into BYOK / API-mode chats (mrcfps + codex)
The previous PR's daemon-only memory hook never fired for BYOK,
leaving the Memory tab + model picker as a no-op for that mode.
Add the missing surface and wire it through ProjectView:
- contracts: extend `composeSystemPrompt` with `memoryBody`,
mirroring the daemon's local composer; add
`MemorySystemPromptResponse` and the `attemptedLLM` flag on
`ExtractMemoryResponse`.
- daemon: expose `GET /api/memory/system-prompt` (returns the
composed body) and turn `POST /api/memory/extract` into a
two-phase endpoint — heuristic-only when only userMessage is
supplied (pre-turn), LLM-only when assistantMessage is also
supplied (post-turn), so the extraction-history doesn't double
up.
- web: ProjectView's BYOK branch now fetches the memory body
before composing the system prompt, runs the heuristic
extractor before the run (so "remember:" markers in this turn
reach this turn's prompt), accumulates assistant text during
streaming, and queues the LLM extractor on `onDone` — fire-and-
forget so it never blocks the chat round-trip.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): re-sync BYOK memory override when chat config drifts
The inline memory-model picker captured `apiProtocol` / `chatApiKey` /
`chatBaseUrl` / `chatApiVersion` into the saved override only at the
moment the user clicked a model. If they later swapped the BYOK
protocol tab, rotated the API key, or edited the base URL in the same
settings flow, the daemon's background extractor kept calling the
*old* vendor / credential — directly contradicting the picker's
"borrows the surrounding chat picker's protocol, key, base URL, and
api-version automatically" promise.
Add a debounced effect that compares the persisted (masked) shape
against the live chat props and re-PATCHes /api/memory/config when
they drift. The masked config exposes `apiKeyTail` (last 4 chars), so
key rotation is detectable without ever round-tripping the secret
back to the browser. The 300 ms debounce coalesces the keystroke-
granularity prop updates the parent settings dialog streams during
its autosave loop, so a user editing the base URL doesn't trigger one
PATCH per character. Background re-syncs are silent — the "Saved!"
flash only fires for explicit user clicks, so the picker doesn't feel
like it's fighting them as they edit unrelated chat fields.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): thread BYOK chat config through /api/memory/extract default path
Leaving the BYOK memory picker on "Same as chat" still broke the
default LLM extraction path: `MemoryModelInline` clears the override
for that option, both `/api/memory/extract` calls in `ProjectView`
only sent the messages, and the daemon never persists BYOK creds, so
`extractWithLLM(..., { chatAgentId: null })` always reached
`pickProvider()` with no chat context and fell through to env /
media-config — the wrong vendor for a BYOK chat that works for
inference.
Thread the live BYOK chat config through the extract endpoint as a
per-call snapshot:
- contracts: extend `ExtractMemoryRequest` with an optional
`chatProvider` (provider/apiKey/baseUrl/apiVersion/model) and add
`'chat-byok'` to the credentialSource enum.
- daemon: parse + validate `chatProvider` on `/api/memory/extract`
(provider must be one of the five known shapes) and forward to
`extractWithLLM` as a new option. `pickProvider()` gets a new
path 2 that uses the snapshot directly with the per-protocol
fast-model default — so a memory pass on `gpt-4o` / `claude-sonnet-4-5`
silently turns into a cheap `gpt-4o-mini` / `claude-haiku-4-5` call
instead of paying chat-tier rates for sediment work. Override and
CLI-agent-constrained paths still win when they apply.
- web: `ProjectView` snapshots `apiProtocol` / `apiKey` / `baseUrl` /
`apiVersion` from the live `AppConfig` on each BYOK extract call
(both pre-turn heuristic-only and post-turn LLM phases). The
picker's existing drift-resync effect already covers explicit
overrides; this snapshot covers the implicit "Same as chat"
default that the override flow can't reach.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(memory): treat empty apiKey on PATCH as a real clear
MemoryModelInline silently re-PATCHes /api/memory/config whenever the
surrounding BYOK chat creds drift. The previous reuse branch lumped
`apiKey === ''` together with `apiKey === undefined`, so clearing the
chat API key from the picker quietly preserved the old daemon-side
secret and kept calling the provider on a stale credential.
Distinguish four states for the apiKey field:
- absent -> preserve stored secret (form re-save without re-typing)
- '' -> clear stored secret (user removed it from the picker)
- 'sk-...' -> replace
- new provider -> ignore stored secret entirely
Add tests/memory-config-route.test.ts covering all four cases.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
31e57fd773
|
fix(daemon): persist runStatus/endedAt on chat run termination (#1230)
* fix(daemon): persist runStatus/endedAt on chat run termination (#135) POST /api/runs created the run but never reconciled the messages row on terminal status. If the web failed to persist the cancel (refresh, dropped PUT), the row stayed at run_status='running' / ended_at=NULL, and on reload the elapsed timer kept climbing because the renderer fell back to now - startedAt. Mirror routine/orbit reconciliation: attach a wait-completion handler that updates run_status and ended_at, guarded by COALESCE and a run_status IN ('queued','running') filter so concurrent web persists are not clobbered. Adds cancelRun helper and two regression specs under e2e/tests/dialog/. * fix(daemon): annotate reconcile callback params for chat-routes The chat run reconciliation block landed in chat-routes.ts after the recent server-route split (#1043), where stricter type checking surfaces implicit `any` parameters. Annotate the wait/then callback as `{ status: string }` and the catch callback as `unknown`. * refactor(daemon): extract reconcileAssistantMessageOnRunEnd helper The inline if/wait/then/catch block in POST /api/runs read as a bolt-on patch. Lift it to a named file-scope helper so the route handler stays intent-level (start the run, arrange follow-up reconciliation) and the guard for missing assistantMessageId is an internal detail. The helper's docblock describes the invariant ("messages row reflects the run's terminal state even without web persist"); commit history keeps the issue context. * test(e2e): wait for any terminal status in stop-reconcile spec The earlier .catch fallback chained two waitForRunStatus calls (canceled then succeeded). waitForRunStatus throws on the first non-expected terminal, so a canceled run that resolves to failed (e.g. agent exits non-zero on SIGTERM) would still abort the test before reaching the messages-row assertion. Add waitForRunTerminal to e2e/lib/vitest/runs.ts: polls until any terminal status without throwing on mismatch, since this spec's claim is about the resulting messages row, not which terminal the run took. Addresses Codex inline review on PR #1230. |
||
|
|
976edaf38e
|
test: harden e2e smoke and release reports (#1140)
* test: harden e2e inspect specs * test: wire e2e release reports * chore: bump packaged beta base to 0.6.1 * test: run release smoke vitest directly * test: add suite-owned tools-dev lifecycle * ci: harden stable release packaging * fix(release,e2e): gate stable signing on verify and harden suite cleanup - restore `needs: [metadata, verify]` on the stable release `build_mac`, `build_mac_intel`, `build_win`, and `build_linux` jobs so Apple signing/notarization and Windows release builds cannot run before pnpm guard, typecheck, and layout checks complete on the metadata commit. - in `runToolsDevSuite`, drop the `started` flag and always attempt `stopToolsDevWeb` in `finally`; record stop errors in diagnostics, and when the test body succeeded, escalate the stop failure to the suite result and rethrow — so orphan daemon/web processes from an interrupted `startToolsDevWeb` or a broken shutdown can no longer pass silently. Addresses PR #1140 review feedback from lefarcen and mrcfps. |
||
|
|
d45bf3fb9a
|
test: expand entry and settings automation coverage (#954)
* test: harden new project panel metadata coverage * test: expand entry e2e coverage * test: drop e2e docs from the guarded package * test: cover examples gallery interactions * test: cover examples preview modal actions * test: cover examples preview escape fullscreen * test: cover examples template prompt filtering * test: cover updated settings and entry tabs * test: fix entry/settings coverage type drift * test: fix example preview fetch assertion * test: fix new project panel skill fixture |
||
|
|
84f768d4a2
|
feat: add WeChat design system, login-flow skill, and fix API mode tool_calls bug (#1083)
* feat: add WeChat design system, login-flow skill, and fix API mode tool_calls bug - Add WeChat design system (design-systems/wechat/) with full brand spec including color palette, typography, and component rules for chat UI - Add login-flow skill (skills/login-flow/) for mobile authentication flows with P0 checklist, example HTML, and i18n registration across 3 locales - Fix DeepSeek V4 bug: API/BYOK mode (streamFormat=plain) models now receive a directive to emit only <artifact> HTML blocks and suppress tool_calls, since plain adapters proxy to external providers that cannot execute tools * fix: restore full server.ts and WeChat DESIGN.md from ad46d8cd commit Restore files that were corrupted in PR #1083 head branch. The WeChat DESIGN.md was reduced to a single line (filename only) and server.ts was reduced to ~1 line. Both are restored to their original ad46d8cd state with full content. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: restore full server.ts and WeChat DESIGN.md from ad46d8cd Restore files corrupted in PR #1083: - apps/daemon/src/server.ts: restored 7106-line file - design-systems/wechat/DESIGN.md: restored 301-line WeChat design spec - skills/login-flow/SKILL.md: restored from local working state - skills/login-flow/example.html: restored 351-line example HTML * fix: only suppress tool_calls when streamFormat='plain' explicitly, remove nonexistent assets/template.html 1. streamFormat check now requires explicit 'plain' value instead of defaulting to 'plain' when undefined. This prevents normal tool-using chat runs from incorrectly inheriting the API/BYOK tool_calls suppression rule. 2. login-flow SKILL.md: removed reference to assets/template.html since that file does not exist in the skill bundle and derivePreflight() would inject a hard instruction to read it before any other tool, causing pre-flight to fail. * fix: thread streamFormat to composeSystemPrompt in server.ts call Previously the composeSystemPrompt call at line ~4940 omitted streamFormat, causing the composer to default to 'plain' and suppress tool_calls even for tool-using chat runs. Now streamFormat is passed through from the adapter definition so the API mode rule only fires when streamFormat='plain' is explicitly set. * fix: WeChat category metadata, font-family, and login-flow example interactivity WeChat DESIGN.md: - Add Category: Social & Messaging metadata so it appears correctly in picker - Fix font-family declaration: remove invalid -webkit-font-family prefix, use standard font-family so downstream CSS generation works correctly skills/login-flow/example.html: - Add password toggle click handler so show/hide actually works - Change Apple icon fill from hardcoded #fff to currentColor so it is visible on light backgrounds * fix: mirror streamFormat suppression in contracts composer and add WeChat i18n 1. packages/contracts/src/prompts/system.ts: Add streamFormat parameter to ComposeInput and ComposeInput interface, mirroring the same suppression rule from daemon prompts/system.ts. When streamFormat='plain' is passed, a directive is appended telling models not to emit tool_calls and to only output <artifact> HTML blocks. 2. apps/web/src/i18n/content.{ts,fr,ru}.ts: Add WeChat design system entries: - Add 'wechat' to DE/FR/RU_DESIGN_SYSTEM_IDS_WITH_EN_FALLBACK arrays - Add 'wechat' summary to DE/FR/RU_DESIGN_SYSTEM_SUMMARIES - Add 'Social & Messaging' category to DE/FR/RU_DESIGN_SYSTEM_CATEGORIES (matching the Category: Social & Messaging metadata in WeChat DESIGN.md) * fix: thread streamFormat='plain' into web composeSystemPrompt for api mode * test: focus localized content coverage on missing resources --------- Co-authored-by: Open Design Contributor <z@open-design.dev> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: mrcfps <mrc@powerformer.com> |
||
|
|
b03a504da6
|
release: Open Design 0.6.0 (#1080) | ||
|
|
0b039777b9
|
fix/Bug#772-DesignFilesSortButton-DesignFilesTableNowSortableColumns (#804)
* Disclaimer: Changes made using OpenCode with Big Pickle, an AI code assistant.
1. Removed the ↑ button from DesignFilesPanel.tsx (was a "back" button that
closed the preview pane — confusingly placed in the header as if it were for
sorting).
2. Converted the file list to a single <table> with sortable columns:
- Replaced the section-based grouping (Pages, Sketches, Scripts, Images,
Other) with a flat, sortable table
- Added column headers: Name, Kind, Modified — all clickable to toggle
sort direction
- Default sort is by modified time descending (same as previous behavior)
- Sort indicator arrows (↑/↓) show the active sort column and direction
- Live artifacts remain as a separate section above the table
3. Added i18n keys designFiles.colName, designFiles.colKind, designFiles.
colModified to all 17 locale files and the type definitions.
4. Updated CSS with table layout styles (.df-table, .df-file-row, column
width classes, sortable header styles).
Files modified:
- apps/web/src/components/DesignFilesPanel.tsx
- apps/web/src/index.css
- apps/web/src/i18n/types.ts
- apps/web/src/i18n/locales/en.ts
- apps/web/src/i18n/locales/*.ts (all 16 other locale files)
* Updated to preserve keyboard access to sorting
* Fixed keyboard to focus/activate/launch file from Design Files list. Single space bar will show preview, double spare bar will open the file as a tab
* Top pagination bar (above the table):
- "Show" dropdown with options 15, 30 (default), 45, 60, All
- Page range indicator (1–20 of 45)
- Previous / Next buttons
Bottom pagination bar (below the table):
- Previous / Next buttons
- "Go to page" dropdown listing all page numbers
- Same page range indicator
Implementation details:
- All controls use native <select> and <button> elements — fully keyboard
accessible (Tab, arrow keys, Enter/Space)
- Page resets to 0 when page size changes
- safePage clamps to valid bounds when file count changes (e.g. after delete)
- "All" sets page size to total file count (effectively one page)
- Prev/Next buttons show disabled state at boundaries with reduced opacity
* All 46 test files, all 385 tests pass. Here's what the regression test covers:
┌────────────────────┬──────────────────────────────────────────────────────┐
│Test │What it verifies │
├────────────────────┼──────────────────────────────────────────────────────┤
│default page size │500 files → only 30 .df-file-row elements in DOM │
├────────────────────┼──────────────────────────────────────────────────────┤
│page size All │changing per-page to "All" shows all 500 rows │
├────────────────────┼──────────────────────────────────────────────────────┤
│page size 60 │changing to 60 shows 60 rows │
├────────────────────┼──────────────────────────────────────────────────────┤
│Next navigation │clicking Next advances page and shows file-31 (sorted │
│ │by mtime desc) │
├────────────────────┼──────────────────────────────────────────────────────┤
│Prev/Next disabled │Prev disabled on page 0, Next disabled on last page │
│states │ │
├────────────────────┼──────────────────────────────────────────────────────┤
│jump to page │bottom dropdown jumps to page 3 (shows file-91) │
├────────────────────┼──────────────────────────────────────────────────────┤
│page info text │1–30 of 500 → after Next → 31–60 of 500 │
├────────────────────┼──────────────────────────────────────────────────────┤
│render time │renders 500 files in under 2s │
└────────────────────┴──────────────────────────────────────────────────────┘
* Fixed i18n for DesignFiles, and Fixed DesignFilesPanel Test
* Fixed - P3 — .df-thead rule defined but never applied
* Fixed keyboard use for file navigation, focus and button usage
* Fix i18n for x of y in design files pagination
* Fixed SafePage clamping
* Fixed dupe file total count
* Fixed x of y i18n
* Fixed DeleteSelected i18n and missing from Test
* fix effective pagesize issue, and change duplicate file kind to a filesize
* Readded page/everything selection and i18n
* Fixed i18n issues
* Resolved indonesian i18n issue with cloudflare keys
* Fixed unrelated cloudflare i18n issues as requested in Pull Request by reviewer
* Fix e2e test: click filename button instead of row for preview
The DesignFilesPanel was refactored from <button> rows to a <tr> with
a nested <button> for the filename. The e2e test was still clicking
the <tr> which has no onClick handler, so the preview never appeared.
* Remove duplicate formatSize helper, reuse humanBytes instead
|
||
|
|
dcfab797c2
|
[codex] Add stable nightly promotion gate (#962)
* Upload beta e2e spec reports to R2 * Expose beta report URLs in summary * Complete Indonesian deploy locale keys * chore: factor release workflow scripts * chore: bump packaged beta base version * test: wait for mac packaged runtime health * fix: capture mac packaged startup logs * chore: improve mac release build observability * fix: ad-hoc sign unsigned mac builds * chore: diagnose mac packaged startup * fix: relax unsigned mac launch signing * chore: improve mac launch diagnostics * chore: simplify beta mac release artifacts * fix: align packaged mac smoke launch config * fix: externalize mac daemon wasm dependency * chore: require signed stable mac releases * fix: use stable app version for nightly package builds * chore: clean release artifacts after publish * chore: publish beta reports as zip * ci: disable beta mac tools-pack cache * fix: skip mac framework binary symlinks when signing * fix: sign mac framework version bundles * ci: disable beta mac pnpm cache * chore: align stable release reports * ci: require matching nightly before stable release * ci: avoid mac pnpm cache for packaged smoke |
||
|
|
d592f6087f
|
feat(mcp): external MCP client with daemon-managed OAuth and 39 design-focused templates (#898)
* feat(mcp): add external MCP client with daemon-managed OAuth and 17 design-focused templates
Open Design now acts as an MCP CLIENT and surfaces tools from third-party
MCP servers to the underlying agent (Claude Code, Hermes, Kimi).
Daemon
- New mcp-config / mcp-oauth / mcp-tokens modules: persist server entries
to .od/mcp-config.json, run the OAuth dance for HTTP/SSE servers
end-to-end on the daemon (so cloud deployments work and tokens
survive across turns), and inject Authorization: Bearer headers into
the per-spawn .mcp.json the daemon writes for Claude Code (or the
ACP mcpServers map for Hermes/Kimi).
- /api/mcp/servers and /api/mcp/oauth/{start,status,disconnect}
endpoints, plus spawn-time wiring in agents that hands the configured
servers to the active agent CLI.
- System-prompt directive for connected external MCPs so the model
does not chase Claude Code's synthetic *_authenticate /
*_complete_authentication tools when the Bearer is already pinned.
Web
- Settings -> External MCP servers panel with per-row OAuth Connect /
Disconnect / Refresh affordances and per-row template hints.
- New "Add server" picker categorized into 7 groups
(image-generation, image-editing, web-capture, ui-components,
data-viz, publishing, utilities) with a search box, sticky close
button, collapsible <details> sections (auto-expand on search),
60vh capped scroll region, and a pinned Custom-server footer.
- ChatComposer /mcp slash and MCP picker button forward to the new
Settings tab; AssistantMessage renders MCP tool calls inline;
markdown autolinker handles bare http(s) URLs (incl. OAuth links)
before italic markers so OAuth callback URLs do not get
italic-fragmented mid-token.
Contracts
- packages/contracts/src/api/mcp.ts owns the wire shapes
(McpServerConfig, McpTemplate with stable McpTemplateCategory
enum, McpServersResponse, OAuth start/status/disconnect bodies, the
postMessage payload from the OAuth callback).
Templates (17 built-in)
- image-generation: Higgsfield (OpenClaw, OAuth HTTP), Pollinations,
Allyson (animated SVG), AWS Bedrock Image (uvx).
- image-editing: Imagician, ImageSorcery.
- web-capture: just-every screenshot-website-fast, ScreenshotOne.
- ui-components: 21st.dev Magic, shadcn/ui, FlyonUI.
- data-viz: AntV Chart, Mermaid.
- publishing: EdgeOne Pages.
- utilities: Filesystem, GitHub, Fetch.
Tests
- apps/daemon/tests/mcp-{config,oauth,tokens,spawn}.test.ts cover
storage round-trip, OAuth helpers, token persistence, spawn-time
wiring, every template's transport / command / args / env-field
invariants, and the canonical category enum.
- apps/web/tests/runtime/markdown.test.tsx covers the new autolinker
ordering rules.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(mcp): add 21 more design-focused templates and a `design-systems` category
Expands the built-in MCP picker from 17 to 38 templates so users can compose
the full Open Design craft loop (design-system intake → generate → edit →
audit → publish) without leaving the Settings dialog. Every install spec is
verified live against the upstream README; templates that needed Go binaries,
multi-step `init` ceremonies, or massive runtime stacks (PostgreSQL + Redis
+ Ollama) are intentionally deferred so picking a template still resolves to
a working server in one click.
New `design-systems` category between `web-capture` and `ui-components`
(reflects the upstream-of-components position in the workflow). Mirrored in
`McpTemplateCategory` on both contracts and daemon, and `CATEGORY_ORDER` on
the web side.
New templates by category:
- image-generation (+4): prompt-to-asset (icons / favicons / OG / logos with
free-tier routing across Cloudflare AI / NVIDIA NIM / HF / Stable Horde),
Nano Banana (hosted streamable HTTP, virtual try-on + product placement),
Seedream (hosted streamable HTTP, ByteDance Seedream v3-v5 + SeedEdit),
fal.ai (uvx, 600+ models incl. FLUX / Kling / Hunyuan / MusicGen).
- image-editing (+3): Photopea (34 layered-editor tools — closes the PSD
gap), Topaz Labs (AI upscale / denoise / sharpen), Transloadit (86+ media
pipeline robots).
- web-capture (+1): Pagecast (browser → demo GIF / MP4 with auto-zoom).
- design-systems (+4, NEW category): Figma-Context (Framelink, designs →
code), Design Token Bridge (Tailwind ⇄ CSS ⇄ Figma ⇄ M3 / SwiftUI / W3C
DTCG + WCAG contrast), Design System Extractor (Storybook scrape),
Aesthetics Wiki (cottagecore / dark-academia / y2k / … moodboards).
- data-viz (+2): MCP Dashboards (45+ chart types + KPI dashboards),
Excalidraw Architect (hand-drawn architecture diagrams).
- publishing (+6): PageDrop, PDFSpark, OGForge, QRMint, Slideshot
(HTML → PDF / PPTX / PNG with 7 themes), Deckrun (Markdown → PDF / video,
hosted free tier with no key required).
- utilities (+1): A11y axe-core (WCAG 2.0/2.1/2.2 + color-contrast + ARIA).
Tests cover every new template's wiring (command, args, env / header
required-vs-optional, secret flag), the category enum invariant, and
in-category declaration order for image-generation, design-systems and
publishing buckets where the order is what users see in the picker. 21 new
test cases pass; full mcp-config suite is green.
Templates intentionally deferred (documented in PR body): figma-use
(needs Figma desktop with --remote-debugging-port=9222), m-moire (multi-step
`memi suite init` + daemon ceremony), gemini-media-mcp + trident-mcp (Go
binaries — no npx / uvx path), Pixelle-MCP (full app with web UI + ComfyUI
backend), storybook-addon-mcp (lives inside user's Storybook, not standalone),
primitiv (multi-step init / build / serve), ReftrixMCP (PostgreSQL + Redis +
Ollama + DINOv2), narasimhaponnada/mermaid (overlap with peng-shawn).
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(mcp): add figma-use template (write designs from chat) under design-systems
figma-use is the natural counterpart to Figma-Context already in this PR:
where Framelink reads Figma designs into the model, figma-use writes back
into the canvas (90+ tools — create frames / text / components / variants,
render JSX into Figma, export PNG/SVG, query nodes via XPath, lint for
WCAG / auto-layout / hardcoded colors, analyze design systems).
Wired as an HTTP MCP template (`http://localhost:38451/mcp`) because
`figma-use mcp serve` only exposes HTTP — there's no stdio mode in the
upstream `serve.ts`. No API key. Two prerequisites the user owns are
spelled out in the description so picking the template still resolves to
a working server: (1) start Figma with `--remote-debugging-port=9222`
(or `figma-use daemon start --pipe` on Figma 126+), and (2) leave
`npx figma-use mcp serve` running in a terminal.
Inserted between `design-system-extractor` and `aesthetics-wiki` so the
design-systems category reads as a workflow: read existing design (Figma
Context) → translate tokens (Token Bridge) → extract from Storybook
(Extractor) → write back to Figma (figma-use) → break creative block
(Aesthetics Wiki).
Tests cover the new template's transport (`http`), endpoint URL, the
empty header-fields invariant (no auth required), and bump the
design-systems group order to include it.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(settings): i18n the External MCP / MCP server / Connectors sidebar entries and make the dialog header track the active section
The External MCP sidebar entry this PR introduces was hardcoded English
("External MCP / Add MCP tools (Higgsfield, GitHub…)"). Same for the
adjacent Connectors and MCP server entries. The dialog header was also
pinned to "Execution & model" copy, so opening Settings → External MCP
showed a header that lied about which section the user was on.
Adds six translation keys — `settings.connectorsTitle/Hint`,
`settings.mcpServerTitle/Hint`, `settings.externalMcpTitle/Hint` — and
translates them across all 17 locales (ar, de, en, es-ES, fa, fr, hu, id,
ja, ko, pl, pt-BR, ru, tr, uk, zh-CN, zh-TW).
`SettingsDialog` now derives the header title/subtitle from the active
section (11 sections total) instead of a single hardcoded pair, so each
section renders an honest header.
Co-authored-by: Cursor <cursoragent@cursor.com>
* test(e2e): pin level: 3 on dialog heading lookups for Pets and Connectors
CI's Validate workspace job (#1479) failed two Playwright cases with the
strict-mode violation:
getByRole('dialog').getByRole('heading', { name: 'Pets' })
resolved to 2 elements:
1) <h2>Pets</h2>
2) <h3>Pets</h3>
Same root cause as the unit-test fix already in this PR: the dynamic
dialog `<h2>` now echoes the section's own `<h3>` because the dialog
header tracks the active section. Disambiguate to `level: 3` so each
assertion still pins the section heading specifically (which is what
the test intends to verify).
Audit of the rest of e2e/ for `dialog.getByRole('heading', ...)` —
settings-api-protocol.test.ts looks for "OpenAI API" / "Anthropic API"
section h3s which never appear in the dialog `<h2>` (always
"Execution & model"), so those stay safe.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(mcp): bind OAuth refresh to the issuing client and skip stale tokens
Persist the OAuth client context (token endpoint, client_id, client_secret,
issuer, redirect_uri, resource) alongside the bearer token so refresh hits
the same client the refresh_token was bound to (RFC 6749 §6). The previous
refresh path re-ran beginAuth with a dummy OOB redirect URI, which kept
getOrRegisterClient from finding the original DCR client and made
providers reject the refresh on the next chat turn. Refreshes now reuse
the persisted endpoint/client pair directly.
Also stop injecting expired access tokens at spawn time when refresh is
unavailable or fails. Pinning a stale Bearer made every Claude MCP call
401 while the prompt still treated the server as connected; on that path
we now skip the entry and let the UI surface a reconnect.
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
b06f26a5fd
|
test: strengthen e2e PR coverage (#796)
* test: strengthen e2e PR coverage * fix: address e2e PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * ci: cache Windows packaged smoke builds * test: fake additional agent runtimes * fix: address e2e PR feedback Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Route tools-pack mac starts through a launch-time packaged config override so portable packaged smoke runs keep using the namespace runtime root that inspect and logs expect. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Fall back to the packaged app's embedded config when the build output config is missing so installed mac starts still work. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: align packaged mac PR smoke with tools-pack runtime mode Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Keep blake3-wasm out of the packaged mac daemon prebundle so the standalone runtime loads the Cloudflare asset hasher from node_modules instead of crashing in ESM. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address e2e PR feedback Skip the portable mac launch override when the bundled packaged config is missing so installed fallback app targets can still boot with packaged defaults. Add a regression test covering the missing-config start path. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(pack): remove duplicate mac prebundle dependency key |
||
|
|
e14b8092ea
|
feat: add Orbit activity summaries (#681)
* feat: add Orbit activity summaries * fix(orbit): make runs navigable while agent continues * fix(web): widen minimum chat panel * feat: support Orbit template selection * fix(daemon): avoid bogus skill side-file preflight * fix(web): collapse orbit artifact project cards * fix(web): preserve orbit project card titles * fix: improve Orbit run daily briefing * fix: handle Orbit digest data failures * fix: load Orbit templates and connector tools reliably * fix: keep Orbit summary counts consistent Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: apply Orbit template skill context * fix: cache and curate connector tools for Orbit * fix: align Orbit defaults and connector discovery * fix: simplify Orbit template settings * fix: move connectors into settings * fix: compact connector settings catalog * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: address Orbit PR feedback Generated-By: looper 0.6.1 (runner=fixer, agent=opencode) * fix: prevent connector action button from stretching into pill The icon-only connect/disconnect buttons in the embedded connectors catalog inherited min-width: 92px / 106px from the non-embedded pill rules, overriding the 24px square sizing and causing the buttons to overlap the card head text. Reset min-width to 0 in the embedded icon-only rule so the compact square layout holds. * fix(web): align live artifact file rows * fix: clean up Orbit connector settings lifecycle Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix: address Orbit review regressions Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * feat(web): localize Orbit and connector settings * feat(web): gate Orbit runs without connectors * feat(web): refine connector settings UX * feat(web): safeguard Composio key clearing * fix(web): refresh Composio tool badges * feat(web): show connector logos * feat(daemon): localize Orbit prompt window * fix(daemon): clarify blocked connector callback closes * test(daemon): harden flaky async probes * fix(web): align Indonesian connector locale keys * test(web): align connector browser props * fix(web): preserve explicit credential clears Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): time out Composio logo proxy fetches Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): localize Indonesian connector settings copy Translate the new connector settings strings in the Indonesian locale and lock them with a regression test so this surface no longer silently falls back to English. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve discovered connector tools Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve onboarding autosave completion Keep settings autosave from clearing onboarding completion after the close gesture, and expose the desktop main types from source so workspace validation can typecheck packaged imports without a prior desktop build. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): defer Composio catalog cache hydration Load persisted Composio catalog data only after the runtime data directory is configured so startup cannot read another namespace's cache. Add a regression test that exercises the module-load singleton path. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): treat discovery completion independently Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve latest settings draft on close Use the latest persisted settings draft when the dialog closes so onboarding completion does not race a stale daemon sync and overwrite newer Orbit/template selections. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): avoid syncing draft Composio key on Orbit run Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): localize Orbit settings copy Translate the new Indonesian Orbit and autosave strings so the settings UI no longer falls back to English and the locale regression stays covered. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): prefer fresh connector catalog state Keep refetched connector status/auth data authoritative while retaining discovery-only tool metadata so the connectors UI stays consistent after refreshes. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): declare Indonesian locale fallback keys explicitly Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): inline Indonesian fallback strings for CI Replace the Indonesian locale's per-key English lookups with explicit strings so workspace typecheck no longer depends on brittle build-mode resolution in CI. Add a regression test that blocks those per-key English lookups from reappearing in the CI-sensitive fallback sections. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): restrict proxied connector logos to image MIME types Reject non-image upstream logo responses so the daemon never serves third-party HTML from its localhost origin. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * test(e2e): align settings dialog regressions Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): decouple Orbit runs from media sync failures Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): keep SPA catch-all export-compatible Disable dynamic catch-all params for the exported SPA shell so Next.js static builds can emit the root route again. Add a regression test covering the route config against the web export mode. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): preserve Orbit config and workspace routes Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): block SVG in connector logo proxy Reject SVG and other unsafe proxied logo responses so third-party logo content cannot execute under the daemon origin, while keeping raster logo fetches working and making rejected responses non-cacheable. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): fall back to static catalog for empty cache Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): disable Orbit run before connector gate resolves Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(desktop): export shipped desktop types Point the desktop ./main type export at the generated declaration so installed consumers resolve the published file set. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): restore persisted question form selections Render historical submitted answers directly so reloaded question forms keep their locked selections visible. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): retry forced media sync autosave Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): keep Composio logo timeout through body read Keep the Composio logo fetch timeout active until the response body is fully consumed so stalled body reads abort and clear the inflight cache entry. Add a regression test that proves a delayed body read times out and the next request can recover.\n\nGenerated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): refresh Orbit gate after connector auth Re-check connector availability when the settings window regains focus so Orbit unlocks as soon as a connector finishes authenticating in the same settings session. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): keep connector detail tool lists intact Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): ignore malformed Orbit summaries Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(e2e): stabilize design-system multi-select flow Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): cap Composio logo cache growth Bound the Composio logo cache with LRU eviction and expired-entry pruning so repeated untrusted logo requests cannot grow daemon memory without limit. Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(daemon): bound proxied Composio logo payloads Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): align autosave settings tests Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): remove stray CSS conflict marker Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fixer: address PR #681 follow-up items Generated-By: looper 0.6.2 (runner=fixer, agent=opencode) * fix(web): restore restart routes and connector flows * fix(web): keep SPA export route static * fix(web): stabilize chat scroll tests --------- Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
7107623ee2
|
test: expand entry and settings automation coverage (#811)
* test: harden new project panel metadata coverage * test: add settings and connector sync coverage * test: expand entry e2e coverage * test: satisfy exact optional property types in entry connector flow * test: keep entry Playwright coverage under e2e/ui * test: tighten coverage docs and settings test cleanup * test: drop e2e docs from the guarded package * docs: move automation coverage docs out of e2e * test: restore clipboard cleanup without delete * test: match composio save dialog behavior * test: avoid placeholder assertion after composio save * test: expect closeModal on settings saves * test: align settings save assertions with closeModal flags * test: fix settings save mocks * test: align composio replacement hint |
||
|
|
2bb029cb58
|
release: Open Design 0.5.0 (#820)
0.5.0 已从
|
||
|
|
a383b4bd3a
|
Preserve beta e2e spec reports in R2 (#812)
* Upload beta e2e spec reports to R2 * Expose beta report URLs in summary * Complete Indonesian deploy locale keys |
||
|
|
6efac8887e
|
Improve Windows beta packaging and installer flow (#768)
* Optimize Windows packaged web output * Fix packaged contracts runtime build * Optimize Windows packaged size pruning * Prune Windows root Next payload * Remove Windows bundled Node runtime * Prune Windows standalone duplicate Next * Add tools-pack cache foundation * Cache Windows packaged build layers * Cache Windows workspace builds * Cache Electron-ready Windows app * Split Windows tools-pack module * Cache Windows dir build outputs * Split Windows pack build modules * Document Windows NSIS smoke namespace limits * Move Windows NSIS smoke note to agents guide * Optimize Windows beta packaging * Bump packaged beta base version * Improve Windows installer namespace UX * Improve Windows tools-pack cache keys * Stabilize Windows beta cache version keys * Cache Windows workspace build outputs * Optimize windows release beta cache layers * Cache windows release dependencies * Trim windows release cache before save * Refresh windows tools-pack cache key * Improve windows installer preflight prompts * Fallback NSIS installer strings to English * Fix Windows installer cleanup and preflight * Improve Windows NSIS state logging * Fix system NSIS Persian language alias * Use long-path removal for Windows uninstall * Fix mac tools-pack tests on Windows * Address Windows packaging review feedback * Fix Windows installer cache namespace isolation * Include web output mode in Windows tarball cache key * Use unique Windows release cache save keys |
||
|
|
a5aaeb7905
|
Fix: Remove element selector tooltip in Tweaks mode (#697)
* Fix: Remove element selector tooltip in Tweaks mode * fix: cover localized content fallbacks * fix: keep comment target labels testable --------- Co-authored-by: bojiehuang <bojiehuang@bojiehuangdeMacBook-Pro.local> Co-authored-by: mrcfps <mrc@powerformer.com> |
||
|
|
8301bcd46e
|
test: add desktop settings and project flow e2e coverage (#306)
* test: add desktop settings regression coverage * test: stabilize desktop smoke interactions on latest main * fixer: address PR #306 follow-up items Generated-By: looper 0.2.7 (runner=fixer, agent=codex) * test: expand ui e2e automation suite * fix: add missing Ukrainian prompt template labels * chore: align desktop e2e helpers with layout guard * chore: move settings protocol e2e into ui suite * fix: preserve api provider settings across protocol switches * fix: avoid leaking api keys across protocol drafts * test: fold desktop smoke coverage into mac spec * fix: dedupe Ukrainian prompt template labels |
||
|
|
ae4a08773a
|
chore(release): prepare 0.4.1 (#659)
- bump remaining monorepo package.json files to 0.4.1 after apps/packaged was already bumped in #637 - add CHANGELOG.md [0.4.1] - 2026-05-06 entry covering the startup hotfix and 19 merged PRs since 0.4.0: - Added: manual edit mode (#620), Cmd/Ctrl+P quick file switcher (#556), resizable chat panel (#563), PI status/cancel updates (#618), accessibility and RTL/Bidi craft modules (#587, #595), i18n structure checks (#608) - Changed: first-PR README links now surface help-wanted issues (#605) - Fixed: packaged contracts runtime exports (#577), packaged runtime beta gating (#637), ACP/MCP/agent fixes (#604, #612, #627), conversation error recovery (#623), native mac quit (#637) - Documentation/Internal: OD_DATA_DIR migration docs (#570), Simplified Chinese QUICKSTART (#578), zh-TW/ko README syncs (#586, #619), generated metrics (#592) Release workflow validation runs after merge via release-stable. |
||
|
|
f1cdb2844a
|
test(e2e): gate beta packaged runtime (#637)
* test(e2e): gate beta mac packaged runtime * test(e2e): separate ui automation layout * test(e2e): move localized content coverage * chore(release): prepare packaged 0.4.1 beta validation * test(e2e): keep ui lane playwright-only * fix(web): keep chat recoverable after conversation load failure * fix(desktop): honor native mac quit |
||
|
|
8eb9b1b506
|
Implement manual edit mode (#620) | ||
|
|
d1d63f9dae
|
Fix/error message persistence (#623)
* test(ProjectView): persists daemon errors on assistant messages Signed-off-by: nhancdt2602 <nhancu2602@gmail.com> * fix(ProjectView): persists daemon errors on assistant message Signed-off-by: nhancdt2602 <nhancu2602@gmail.com> * test(e2e): cover agent switch and persistence Signed-off-by: nhancdt2602 <nhancu2602@gmail.com> * chore(chat-event): handle falsy empty error detail Signed-off-by: nhancdt2602 <nhancu2602@gmail.com> --------- Signed-off-by: nhancdt2602 <nhancu2602@gmail.com> |
||
|
|
963bbf2500
|
release: Open Design 0.4.0 (#454) | ||
|
|
bbdd4e84b5
|
chore: enforce test directory conventions (#496)
* chore: enforce test directory conventions Move package, app, and tool tests out of src and add guard enforcement so source directories stay source-only. * ci: use guard and package-scoped tests Run the new repository guard in CI and keep test execution aligned with package-scoped commands after removing root aliases. * ci: align stable release guard check Use the new repository guard in stable release verification after replacing the residual-JS-only script. * chore: tighten test layout enforcement Enforce sibling tests directories, typecheck moved test suites with dedicated configs, and refresh remaining guidance that pointed at src-based tests. * chore: clarify no-emit test tsconfigs Explicitly disable declaration-only emit in test tsconfigs so review tooling sees they are no-emit typecheck configs. |
||
|
|
df2e8adba0
|
docs: add Brazilian Portuguese (pt-BR) translations (#460)
* docs: add Brazilian Portuguese (pt-BR) translations Translate README, CONTRIBUTING, QUICKSTART, and the e2e/cases, e2e/reports, skills/html-ppt, skills/guizang-ppt READMEs into pt-BR. Add the pt-BR entry to the language switcher in every existing locale variant (en, de, fr, ko, ja-JP, ru, uk, zh-CN, zh-TW for README; en, de, fr, ja-JP, zh-CN for CONTRIBUTING; en, de, fr, ja-JP for QUICKSTART) so readers in any language can jump to the Portuguese version. Code blocks, file paths, identifiers, license attribution links and brand names are kept verbatim with the source. * docs(pt-BR): use repo-relative links in e2e READMEs Replace author-local absolute paths (/Users/mac/...) with repo-relative links so the translated docs navigate correctly on GitHub and other checkouts. * docs(pt-BR): fix README badge anchor and skill reference link - Update Agents badge href to the Portuguese fragment (#agentes-de-código-suportados) so the badge jumps to the translated heading. - Restore the [`SKILL.md`][skill] reference-link syntax in the comparison table; the opening bracket was lost in translation, breaking the row. |
||
|
|
2473ab9567
|
fix(web): add copy buttons for FileViewer code blocks (#471)
* fix(web): add copy buttons for FileViewer code blocks * fix(web): harden FileViewer markdown copy controls * fix(web): restore focus after clipboard fallback * test(e2e): restore execCommand after markdown copy tests |
||
|
|
016c08183f
|
release: Open Design 0.3.0 | ||
|
|
0c65a0508e
|
fix(web): keep Design Files view active after deleting a file (#329)
* fix(web): keep Design Files view active after deleting a file (#115) Deleting a file from the Design Files panel was navigating the user into another file's tab (typically the previously-opened file such as `index.html`). Two unrelated symptoms contributed: - `FileWorkspace.handleDelete` always called `setActiveTab(nextActive ?? DESIGN_FILES_TAB)`, which yanked the user out of the Design Files panel whenever `tabsState.active` still pointed at a real file (clicking the Design Files tab only updates local `activeTab`; it does not touch persisted `tabsState.active`). - The row popover had `onMouseDown` propagation guard but no corresponding `onClick` guard on the destructive action path. Fix in `FileWorkspace.handleDelete`: - If the deleted file is the one being viewed (`activeTab === name`), fall back to another open tab (or Design Files if none remain) — the prior behavior, which is correct in this case. - Otherwise (deletion came from the Design Files panel, the typical path), leave `activeTab` alone so the user stays in the list view. Only null out `tabsState.active` when it referenced the deleted file to avoid leaving a dangling pointer for the hydration `useEffect` to resync against. Defensive cleanup in `DesignFilesPanel`: - Add `onClick` propagation guard on the popover wrapper (mirrors the existing `onMouseDown` guard). - Stop propagation on each popover button (Open / Download / Delete) and on the download `<a>` so destructive/menu actions can't bubble into row click handlers under any DOM arrangement. Closes #115 * fix(web): address review feedback + add multi-tab delete regression E2E Review feedback (#329): - Drop the redundant `onClick={(e) => e.stopPropagation()}` on the download `<a>` wrapper; the inner `<button>` is the actual click target and already stops propagation. - Expand the comment in `handleDelete`'s else branch to explicitly state that `activeTab` is preserved because the user is in a different context (Design Files panel or another tab) and shouldn't be navigated away. Regression coverage: - Extend `runDesignFilesDeleteFlow` to upload a sibling file (`keep-me.png`) before `trash-me.png`, so deletion has a fallback tab that the buggy code would have navigated to. After deletion, assert the Design Files tab still has `aria-selected="true"` and the sibling tab is still present (just not auto-activated). |
||
|
|
62b01a6dbf
|
release: Open Design 0.2.0 (#297) | ||
|
|
0c00f241e7
|
Add preview comment attachments (#284) | ||
|
|
a93246d892
|
chore(ci): add GitHub CI workflow (#271)
* Add GitHub CI workflow * Address CI workflow review feedback Generated-By: looper 0.3.0 (runner=fixer, agent=openai/gpt-5.5) |
||
|
|
5a0f954297
|
fix(daemon): emit tool_use from tool_execution_start in pi-rpc (#186)
* fix(daemon): emit tool_use from tool_execution_start in pi-rpc The pi-rpc adapter emitted tool_use from message_end, which fires before tool execution starts. The web UI pairs tool results to prior tool_use events, so receiving tool_result without a preceding tool_use broke tool card rendering and file auto-open behavior. Move tool_use emission to tool_execution_start, matching the pattern in copilot-stream.ts. Remove redundant tool call extraction from message_end (tool_use is now emitted at execution time, usage is already emitted from turn_end). Extract mapPiRpcEvent as a pure exported function so tests exercise the real event mapping logic instead of an inlined copy that can diverge from production. Ref: mrcfps review comments on PR #117 * docs(daemon): clarify mapPiRpcEvent mutability contract The function mutates ctx.sentFirstToken to track streaming state. Calling it "pure" is misleading; revised the doc comment to say no I/O or child process interaction instead. * fix(pi-rpc): remove redundant status(tool) emission from tool_execution_start Now that tool_use fires inline from tool_execution_start, the accompanying status(tool) event is redundant: tool_use already carries the tool name, and the UI renders running state from the tool card. The extra status pill breaks consecutive tool_use grouping in AssistantMessage.buildBlocks. Aligns with copilot-stream, which emits only tool_use from tool.execution_start with no status event. |
||
|
|
0bafc73d24
|
Add visible conversation timestamps (#120)
* Add visible conversation timestamps Generated-By: looper 0.2.7 (runner=worker, agent=codex) * Fix assistant message timestamps Generated-By: looper 0.2.7 (runner=fixer, agent=codex) |
||
|
|
f8af2cd875
|
fix(web): exit PreviewModal fullscreen on first Esc press (#168)
Closes #141. When the user clicked the Fullscreen button, requestFullscreen() put the stage element into native browser fullscreen and React's `fullscreen` state was set true. Pressing Esc was meant to exit the overlay, but in browsers like Firefox the browser consumes Esc to drop its native fullscreen element without delivering keydown to JS. The React state stayed true, the `ds-modal-fullscreen` class lingered, and only a second Esc reached the keydown handler that flipped the state. Subscribe to `fullscreenchange` so the React state mirrors the native state. When the browser exits its fullscreen element, the overlay drops on the same keystroke. The keydown handler is still needed for the fallback path (no native fullscreen API support, where requestFullscreen is undefined and only React state is set). Adds three regression tests in e2e/tests/preview-modal-fullscreen.test.tsx covering the bug fix path, the keydown fallback, and a non-collapse guard for transitions where another element is still fullscreen. Co-authored-by: d 🔹 <258577966+voidborne-d@users.noreply.github.com> |
||
|
|
3fb849d047
|
Fix chat runs surviving web disconnects (#146)
* fix chat runs surviving web disconnects * fix chat run create abort propagation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon keepalive reconnect budget Generated-By: looper 0.0.0-dev (runner=fixer, agent=gpt-5.5) * fix daemon stream disconnect cancellation Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon stream abort cancellation race Generated-By: looper 0.0.0-dev (runner=fixer, agent=openai/gpt-5.5) * fix daemon run cancellation semantics * fix load * doc * 2 * add run refresh recovery * fix active run refresh status * fix reattach abort handling * fix * fix chat initial scroll * fix daemon start failures Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix stop run status Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * fix background run recovery Generated-By: looper 0.2.7 (runner=fixer, agent=openai/gpt-5.5) * extract daemon run service * move prompt composition to daemon * fix prompt module resolution * fix project id generation * add project run status * add designs kanban view with awaiting_input status - add grid/kanban view toggle on Designs tab; persist choice in localStorage - introduce awaiting_input project display status (daemon-derived from unanswered <question-form>) so projects asking the user aren't shown as Completed; ordered between Running and Completed with amber accent - hide transient queued state from users: coerce queued/starting to running in daemon /api/projects projection and drop the queued kanban column - a11y polish on Designs cards: Space activation, aria-labels on delete, focus-visible outlines, reveal delete on focus-within and touch, prefers-reduced-motion handling - kanban layout uses flex sizing instead of viewport math; scoped icon- only pill button rule fixes view-toggle icon alignment --------- Co-authored-by: mrcfps <mrc@powerformer.com> |
||
|
|
8f34e39b7b
|
feat(daemon): add pi coding agent adapter (#117)
* feat(daemon): add pi coding agent adapter Add pi (https://pi.dev) as a supported coding agent, using its --mode rpc JSON-RPC protocol over stdio for structured event streaming. Changes: - apps/daemon/pi-rpc.js: new RPC session handler that drives pi's --mode rpc protocol, translating typed agent events (text_delta, thinking_delta, tool_use, tool_result, usage, status) into the daemon's UI event format. Auto-resolves extension UI requests (fire-and-forget consumed, dialogs auto-approved) so pi stays unblocked in the headless web UI. Kills the process after agent_end since pi's RPC process is designed for multi-prompt sessions. - apps/daemon/agents.js: add pi agent definition with custom fetchModels (pi --list-models outputs to stderr, not stdout), 575+ models from 20+ providers, reasoning/thinking level support via --thinking flag, and streamFormat 'pi-rpc'. - apps/daemon/server.js: wire pi-rpc stream format to attachPiRpcSession; skip stdin.end() for pi-rpc since the RPC session manages stdin bidirectionally. - apps/daemon/acp.js: export createJsonLineStream for reuse by pi-rpc.js. - apps/daemon/pi-rpc.test.mjs: 19 unit tests covering model list parsing (TSV, dedup, edge cases), RPC event translation (text, thinking, tools, usage, compaction, retry), sendCommand wire format, extension UI auto-resolution. - e2e/tests/structured-streams.test.ts: add pi RPC tool_use/tool_result event mapping test alongside existing Claude/Copilot fixtures. Verified end-to-end: daemon /api/chat → pi RPC → SSE stream with status, text_delta, usage, and tool events. Live E2E test passes (OD_E2E_RUNTIMES=pi). All 59 project tests green. * refactor(daemon): migrate pi-rpc to TypeScript Follow upstream #118 TypeScript migration convention: rename pi-rpc.js → pi-rpc.ts and pi-rpc.test.mjs → pi-rpc.test.ts with @ts-nocheck header (same as all other daemon modules). Import paths remain ./pi-rpc.js per NodeNext module resolution. * fix(daemon): avoid duplicate usage events in pi-rpc handler Pi emits both message_end and turn_end per turn, both carrying usage data. Emitting from both handlers caused double-counting in the UI and any consumer that aggregates usage. Remove usage emission from the message_end branch since turn_end is the canonical per-turn usage source. Keep tool call extraction in message_end (unique data not available in turn_end). Add regression test confirming exactly one usage event is emitted when both message_end and turn_end carry usage for the same turn. Addresses Copilot P2 review on PR #117. * fix(daemon): scope pi RPC id counter per session, bump graceful shutdown Move nextRpcId and sendCommand inside attachPiRpcSession as local state, matching the pattern in acp.ts where nextId is scoped per session. Prevents RPC id collisions across concurrent /api/chat requests. Bump post-agent_end SIGTERM grace period from 2s to 5s and make it configurable via PI_GRACEFUL_SHUTDOWN_MS env var for resource- constrained machines. Add test confirming concurrent sessions get independent id sequences. * fix(daemon): wrap parser.feed in try-catch in pi-rpc Catch errors from parser.feed and route them through the existing fail() handler instead of letting them propagate as unhandled exceptions. |
||
|
|
c6d11018a0
|
Refresh desktop integration control plane (#123)
* feat(dev): add desktop tools-dev control plane * refactor(sidecar): split Open Design contracts Move Open Design-specific sidecar protocol definitions into @open-design/contracts so sidecar and platform can remain descriptor-driven primitives. * refactor(daemon): organize package sources Keep daemon app code, tests, and sidecar entrypoints in separate package directories so each layer can be built and verified independently. * chore(repo): streamline maintenance entrypoints Centralize agent guidance by directory and reduce root command chains while preserving the existing build scope. * docs: translate agent guidance to English * fix(sidecar): tolerate stale IPC sockets Remove stale Unix socket files only after confirming no listener is active, so tools-dev can restart after unclean shutdowns. |
||
|
|
56d08b8c5f
|
Add shared contracts and migrate project code to TypeScript (#118) | ||
|
|
5c45c3b967
|
fix sse keepalive behind nginx (#111) | ||
|
|
cfebff9653
|
Align app directories and isolate e2e tests (#102)
* chore: align app directories * test: consolidate external suites under e2e |
||
|
|
751c9de56d
|
Add UI e2e automation suite and reporting (#64)
* test: add e2e ui automation suite * fix review feedback for ui e2e suite Resolved the FileWorkspace.tsx merge-marker issue and kept the intended combination of multiple, accept="image/*", and data-testid. Updated the e2e port handling so the test config no longer relies on a single hardcoded app port. It now resolves an available port first and passes the same port selection through the dev server and Playwright base URL. Since main has moved to the Next.js dev stack, this was also adapted from the old Vite-based flow to NEXT_PORT. Kept test:ui serialized so cleanup completes before Playwright starts. Updated reset-e2e-artifacts.mjs so cleanup failures are surfaced with a warning instead of being silently swallowed, except for the expected ENOENT case. |