Commit graph

840 commits

Author SHA1 Message Date
Marc Chan
0b551c8afe
Merge b0c0559198 into 53fb175855 2026-05-31 13:23:27 +08:00
estelledc
53fb175855
fix(web): truncate long filenames in file list (no-preview state) (#3370)
Some checks failed
visual-baseline / Capture visual baselines (push) Waiting to run
ci / Detect CI change scopes (push) Successful in 0s
landing-page-ci / Validate landing page (push) Failing after 1s
landing-page-staging / Deploy landing page to staging (push) Has been skipped
nix-check / build (push) Failing after 2s
ci / Validate Nix flake (push) Has been skipped
ci / Preflight (push) Failing after 2s
ci / Workspace unit tests (push) Failing after 2s
ci / Daemon workspace tests (push) Failing after 2s
ci / Web workspace tests (push) Failing after 2s
ci / Browser tests (push) Failing after 2s
ci / Build workspaces (push) Failing after 2s
ci / Validate workspace (push) Failing after 0s
ci / Runtime trace (push) Has been skipped
When the design files preview pane was closed, a very long filename
would expand its `<td.df-cell-name>` and push the kind / mtime / menu
columns off-screen — the auto-layout `<table>` had no width
constraint on the cell, so the existing `text-overflow: ellipsis`
on `.df-row-name` never engaged.

The `:not(.no-preview)` overrides in `routines.css` already pinned
the row's children to `width: 100%; max-width: 100%` when a preview
pane was open, but the no-preview state — the one shown in the issue
screenshot — had no equivalent guard.

CSS:
- `.df-cell-name`: add `max-width: 0; min-width: 0` so the table
  cell collapses to its column allocation in auto-layout.
- `.df-row-name-wrap`: add `max-width: 100%`.
- `.df-row-name`: add `max-width: 100%; min-width: 0` so the flex
  child clamps to the wrap and the existing ellipsis engages.

JSX (DesignFilesPanel.tsx):
- Add `title={f.name}` (and the equivalent for directory rows and
  live-artifact rows) on `.df-row-name`. The browser surfaces the
  full filename on hover even when the visible text is truncated, so
  users can read the leading characters without opening the preview
  pane. `<DfPreview>` already renders the full name with
  `word-break: break-word`.

Closes #3260

Validation:
- pnpm exec vitest run tests/components/DesignFilesPanel.long-name-truncate.test.tsx
  → 3/3 passed (1 was red on main: title attr was absent)
- pnpm --filter @open-design/web test → 2501/2501 passed (260 files)
- pnpm --filter @open-design/web typecheck → green
- pnpm guard → green

Note: jsdom does not measure layout, so the truncation itself can't
be asserted directly. The specs encode the structural contract the
CSS depends on (cell / wrap / name nesting + the `title` attr) so a
JSX shape change won't silently regress the fix.
2026-05-31 05:17:52 +00:00
BayesWang
af4a62b69a
Add configurable project locations (#2041)
* add daemon project location support

* wire project locations into web settings

* localize project location settings

* move default project location to settings

* polish project location selection cards

* fix project location i18n gaps

* fix external project validation cleanup
2026-05-31 04:47:45 +00:00
Dan Porat
3395d2c855
feat(daemon): implement fal.ai renderer for image + video generation (#1606)
* feat(daemon): implement fal.ai renderer for image + video generation

Adds renderFalImage and renderFalVideo backed by the fal queue API
(queue.fal.run). Any fal-ai/* model path can be used directly without
a catalog entry, enabling the full fal model library without code
changes. Catalogued shortcuts are mapped via FAL_ENDPOINTS to their
fal-ai/* paths; OD_FAL_MAX_POLL_MS controls the poll ceiling.

Expands the fal model catalog with flux-pro-ultra, flux-dev-fal,
flux-schnell-fal, ideogram-v3-fal, recraft-v3-fal (images) and
veo-3-fal, veo-2-fal, wan-2.1-t2v, wan-2.1-i2v, seedance-1-pro-fal,
kling-2.1-t2v-fal (video). Marks fal provider as integrated: true in
both daemon and web model registries.

* fix(daemon): address fal renderer review comments

- Correct Wan 2.1 endpoints: wan-video/v2.1/* → fal-ai/wan-t2v / fal-ai/wan-i2v
- Correct Kling 2.1 t2v endpoint: .../pro/... → .../master/text-to-video
- Add FAL_IMAGE_USES_ASPECT_RATIO: flux-pro-ultra sends aspect_ratio not image_size
- Add FAL_VIDEO_NO_DURATION: Wan models reject the duration field
- Add FAL_VIDEO_STRING_DURATION: Veo expects duration as "5s" not 5
- Fix falQueueBase() to use anchored regex replace, avoiding mangled
  custom base URLs
- Do not wrap payload under input — raw fal queue HTTP API expects flat
  body; the input wrapper is an SDK abstraction only (confirmed by 422
  validation error from fal showing prompt missing at body.prompt)

* fix(daemon): correct fal queue protocol comment (flat body, no SDK input wrapper)

* fix(daemon): clamp Veo duration to valid fal buckets (4s/6s/8s)

* fix(daemon): report effective fal Veo duration in providerNote (with snap warning)

* fix(daemon): reduce image generation latency from 4m37s to ~73s

Five layered fixes targeting the overhead that padded a ~10s fal API call
into a 4m37s user-facing wait:

1. Skip DISCOVERY_AND_PHILOSOPHY for media surfaces (image/video/audio).
   The ~3000-token HTML-artifact discovery layer is irrelevant for media
   generation and forced the agent to parse and override all its rules
   before dispatching. Removes it from the system prompt entirely for
   these surfaces; MEDIA_GENERATION_CONTRACT is the sole authority.

2. Broaden the wait-loop contract to cover ALL slow models, not just
   "Volcengine i2v / hyperframes-html". Any model whose generation
   exceeds 25s — including fal flux-pro-ultra, Veo, Sora — returns exit 2
   from od media generate. The contract now makes this universal and
   provides a python3-based bash pattern (jq is not guaranteed to be
   installed on all agent runtimes).

3. Increase od media wait polling budget from 25s to 120s. od media
   generate keeps its 25s budget for fast feedback; od media wait is
   purpose-built to sit and poll, so it can safely use the full 2-minute
   bash-tool window. Reduces re-entries for a 3-minute generation from
   ~7 to ~2.

4. First fal poll is now immediate instead of always sleeping 3s before
   the first status check. Saves 3s for all fal jobs.

5. Project metadata no longer emits "(unknown — ask)" for imageModel and
   aspectRatio when unset. Emits the actual defaults (gpt-image-2,
   aspect-ratio scene heuristic) so the agent can dispatch without
   extended reasoning about model selection. Also adds dispatch-immediately
   defaults and a brief-reply rule (2–3 sentences max after generation).

Measured end-to-end on the exact problem prompt before/after:
  Before: 4m37s (discovery form + 7x LLM re-entries + jq failure)
  After:  ~73s   (single bash loop, no question turn, image delivered)

* feat(daemon): inject media dispatch hint for non-media project surfaces

Agents running inside prototype, deck, and other non-image/video/audio
projects previously had no knowledge of `od media generate`, so when
asked to create an image with fal they would try to call provider REST
APIs directly and ask the user for API keys — even though the daemon
already holds credentials in .od/media-config.json.

Add MEDIA_DISPATCH_HINT to composeSystemPrompt for all non-media
surfaces. The hint tells the agent to always route media generation
through the daemon dispatcher, and explicitly forbids prompting for
API keys.

Verified end-to-end: a prototype project generates a 952 KB image via
flux-pro-ultra in ~52s with no key errors.

* fix(daemon): prevent agent from converting bash env vars to PowerShell syntax

MEDIA_DISPATCH_HINT now explicitly labels the shell as POSIX bash and
shows the correct $VAR form side-by-side with a warning NOT to use
PowerShell $env:VAR. Without this, claude-sonnet running on a Windows
host converts the example to PowerShell syntax (`& $env:OD_NODE_BIN`)
which then fails at the bash executor with 'syntax error near unexpected
token &'.

* fix(daemon): add generate→wait loop to MEDIA_DISPATCH_HINT for slow models

MEDIA_DISPATCH_HINT previously showed only a bare  call.
flux-pro-ultra and other slow models always exit 2 after ~25s — without
the wait loop the agent would treat exit 2 as a failure and report an
error to the user.

Replace the single-command example with the canonical generate→wait loop
(matching media-contract.ts), add an explicit note that exit 2 means
'keep polling', and reinforce the POSIX bash / no-PowerShell rule
directly inside the code block.

* fix(daemon): allow fal-ai/* passthrough in media-agent contract

The media-agent prompt instructed the agent to warn and substitute the
default model for any ID not in the catalogue. This blocked the custom
fal-ai/* passthrough path the daemon already supports, so users could
not reach uncatalogued fal models from the normal chat flow.

Carve out the fal-ai/* exception so the agent passes those IDs through
directly instead of warning or substituting.

* fix(daemon): align MEDIA_DISPATCH_HINT with exit-0 generate contract

media generate now always exits 0 (handoff included). The non-media
agent hint still checked ec==2 to decide whether to keep polling, so
slow fal models (flux-pro-ultra, veo-3-fal) would stop after printing
the handoff JSON instead of entering the wait loop.

- generate error check: drop the ec!=2 exception (exits 0 always)
- while loop: drive on taskId presence, not ec==2; stop on ec==0/5
- footer: remove --surface inference claim; CLI requires it explicitly

* fix(guard): add test-fal-webui.ts to e2e scripts allowlist

CI failed: guard flagged e2e/scripts/test-fal-webui.ts as an
unapproved package-owned entrypoint. Add it to allowedE2eScripts.

* fix(daemon): update prompt test expectations to match exit-0 handoff wording

The two stale assertions checked for the old generate-exits-2 copy which
no longer exists in the contract. Update them to match the current
always-exits-0 wording.

* fix(daemon): move skipDiscoveryBrief override before discovery block

* chore(e2e): remove ad-hoc fal webui test script

The script was a one-time developer helper used to manually validate fal
image generation through the live UI. It relied on a real fal API key and
hardcoded local port, so it cannot participate in the e2e package's
fixture/reporting/CI conventions. Removing it per reviewer feedback.

- Delete e2e/scripts/test-fal-webui.ts
- Remove its guard.ts allowlist entry
- Gitignore the file and its screenshots to prevent accidental re-addition

* chore: remove accidental local scratch files from branch

Remove bash.exe.stackdump (MSYS crash dump) and fix_loop.py (one-off
local rewrite helper) — neither is a repo-owned source artifact.

* fix(prompts): document fal-ai/* passthrough in non-media dispatch hint

Prototype/deck agents now know arbitrary fal-ai/* model ids are valid
--model values and should be forwarded as-is, mirroring the exception
already present in media-contract.ts. Adds a prompt regression test.

* fix(daemon): use renderMediaGenerationContract(mediaExecution) for media surfaces

---------

Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-31 04:44:44 +00:00
kami
def2e9fd2e
fix(web): dock comment side panel outside preview (#2073)
* fix: dock comment board without clipping inspect

When the comment-side dock falls back to the stacked layout in narrow
panes, collapsing the side panel now shrinks the bottom strip to a
horizontal rail height instead of keeping the full panel-height row.
commentPreviewCanvasSize() also stops over-deducting the expanded
panel height in the stacked-collapsed path so the canvas sizing stays
in sync with what is rendered.

* fix(web): address docked comment panel review follow-ups

* Fix non-docked comment tool tablet scaling

* test(web): align comment panel tests with collapse API
2026-05-31 04:36:15 +00:00
蓝宙
e8c179d3a6
fix: show cumulative conversation duration (#3354)
* fix: show cumulative conversation duration

* fix: include usage-only run durations

---------

Co-authored-by: Lanzhou3 <217479610+Lanzhou3@users.noreply.github.com>
2026-05-31 03:52:12 +00:00
estelledc
0b493a66c0
fix(web): prevent caret reset on tools-menu picker mousedown (#3368)
The right-side @-button tools popover (ToolsPluginsPanel,
ToolsSkillsPanel, ToolsMcpPanel) inserts text into the composer
draft using the textarea's selectionStart at click time, but the
picker rows had `onClick` without `onMouseDown={(e) =>
e.preventDefault()}`. On a real mouse, mousedown fires first, the
textarea loses focus before the click handler runs, and
selectionStart resets — so the inserted token lands at offset 0
instead of at the user's cursor.

The @-mention popover already prevents this by calling
preventDefault on mousedown for every picker row (the comment at
ChatComposer.tsx:3039-3043 explains the reason). This change
mirrors that protection on the three tools-menu pickers.

The mention popover itself was unaffected, so design-file
mentions (which only flow through the @-popover via
`replaceMentionWithText`) are not impacted by this issue. The
reporter's mention of "design files" appears to refer to picking a
file via the @-popover, where the protection was already in place.

Closes #3195

Validation:
- pnpm exec vitest run tests/components/ChatComposer.tools-menu-caret.test.tsx
  → 3/3 passed (red on main, asserts each picker calls
  preventDefault on mousedown)
- pnpm --filter @open-design/web test → 2501/2501 passed (260 files)
- pnpm --filter @open-design/web typecheck → green
- pnpm guard → green
2026-05-31 03:50:45 +00:00
estelledc
1a6face04c
fix(web): prune draft tokens when the plugin chip strip clears (#2881) (#3356)
Some checks failed
visual-baseline / Capture visual baselines (push) Waiting to run
ci / Detect CI change scopes (push) Successful in 0s
nix-check / build (push) Failing after 1s
ci / Validate Nix flake (push) Has been skipped
ci / Preflight (push) Failing after 1s
ci / Workspace unit tests (push) Failing after 1s
ci / Daemon workspace tests (push) Failing after 1s
ci / Web workspace tests (push) Failing after 1s
ci / Browser tests (push) Failing after 1s
ci / Build workspaces (push) Failing after 1s
ci / Validate workspace (push) Failing after 0s
ci / Runtime trace (push) Has been skipped
ChatComposer tracks the `@…` tokens this surface authored via the
@-mention popover plugin-pick path. When PluginsSection's chip strip
clears, we wire its `onCleared` and prune *only* those tracked
insertions from the draft so the textarea no longer holds orphaned
styled mentions whose chips just unmounted.

Architecture summary (rounds 1–9 collapsed; round 10 detailed below):

  - `Array<{token, start, pluginId, insertionId?}>` tracking with
    start offsets reconciled across each keystroke via an LCS+LCP
    edit-range diff in
    `apps/web/src/utils/pluginInsertionTracking.ts` (round 3-4).
    `insertionId` is forwarded by `reconcileInsertions` so the
    producer can locate its own entry across reconciles
    (round 10).
  - All draft mutations route through a single `updateDraft`
    chokepoint that runs `reconcileInsertions` outside the
    `setDraft` updater so React StrictMode's double-invoke is
    harmless (round 4-5).
  - Boundaries delegate to the shared
    `inlineMentions.isMentionBoundary` /
    `inlineMentions.isMentionRightBoundary` helpers so the
    tracker can never diverge from the parser (round 5).
  - `setActivePlugin` is a chokepoint for every applyById path,
    filtering tracked entries to those matching the new active
    plugin so a replace-plugin flow can never let stale entries
    survive (round 6).
  - Picker rollback double-snapshots draft + tracker so apply-
    failure restores the tracker but only rewrites the draft
    when no user keystrokes arrived during the await
    (round 7-8).
  - `stripPluginInsertedTokens` collapses whitespace seam-local
    so user-authored multi-space spans elsewhere are preserved
    (round 8).
  - `setActivePlugin` is deferred past `await applyById` on
    every path, and `onCleared` filters by
    `pluginsSectionRef.current?.getActiveRecord()?.id` so a
    pending-window clear scopes to the actually-mounted
    plugin's tokens (round 9).

race in the picker rollback:

  Round 9 made `onCleared` mutate the tracker and the draft when
  it ran during a pending replace, and added the `getActiveRecord`
  filter so the strip targets the still-mounted plugin's entries
  only. The picker's failure-path rollback, however, still
  restored `prevEntries` / `prevActiveId` wholesale — assuming
  nothing else had touched the tracker during the await. If the
  user clicked the still-mounted original chip's × during the
  pending replace AND the deferred `applyById` then resolved
  with a 500, the wholesale restore (a) resurrected entries that
  `onCleared` had legitimately stripped (now stale offsets) and
  (b) left the optimistic `@<target>` orphaned in the draft with
  no chip ever having mounted — the original #2881 symptom
  recurring inside the failure window.

  Fix splits the failure rollback into two paths:

  1. **Detect "intervening clear" via `activePluginIdRef.current
     === null && prevActiveId !== null`.** `onCleared` always
     nulls the active id as its last action; our deferred
     `setActivePlugin` never ran in the failure branch. So the
     null-while-prev-not-null state is the smoking gun for an
     intervening clear during the await.

  2. **On detection, surgically remove only our optimistic
     entry and only its `@<target>`.** Locate the entry by
     `insertionId` (added to `TrackedInsertion` as an optional
     field, forwarded by `reconcileInsertions` so the id
     survives offset shifts) — this disambiguates the case
     where the user picked the same plugin from the @-popover
     more than once during the await window. Splice that entry
     out and run `updateDraft((d) => stripPluginInsertedTokens(
     d, [ourEntry]))` so the draft loses `@<target>` and any
     remaining tracked entries (the in-flight target would have
     no others, but a co-pending second pick could) get their
     offsets reconciled. `activePluginIdRef` stays at `null` —
     `onCleared`'s truth, since no chip is mounted.

  The "no intervening clear" branch is the round 7/8 path:
  restore `prevEntries`/`prevActiveId` wholesale and rewrite
  the draft only if `draftRef.current === postInsertDraft`
  (no user keystrokes during the await).

Regression coverage (additions):

  - `apps/web/tests/components/ChatComposer.plugin-clear-prunes-draft.test.tsx`
    — 18 integration specs total (17 prior + 1 new round-10):
    * `@-popover pick A → @-popover pick B (apply pending) →
      clear A's chip → resolve B with 500 → assert no orphan
      @<target>, no orphan @A, no chip mounted, no stale
      tracker entries`. Uses a deferred `Promise<Response>` so
      the apply stays in flight while the chip-clear is fired,
      then resolves with a 500 to drive the failure path. Pre-
      fix this would resurrect Airbnb's stale entry AND leave
      `@SecondPlugin` orphaned in the draft.

PluginsSection.tsx is unchanged. The host-local tracking +
draft-update chokepoint + parser-aligned boundaries + deferred
active-plugin scoping + transactional applyById + intervening-
clear-aware rollback + filtered `onCleared` keep the cross-
component contract identical to main — only ChatComposer touches
behavior, plus the utils module and two `inlineMentions` exports.

Validation:
  - pnpm exec vitest run tests/utils/pluginInsertionTracking.test.ts → 36/36 passed
  - pnpm exec vitest run tests/components/ChatComposer.plugin-clear-prunes-draft.test.tsx → 18/18 passed
  - pnpm exec vitest run -c vitest.config.ts (full apps/web suite, 228 files) → 2202/2202 passed
  - pnpm --filter @open-design/web typecheck → green
  - pnpm guard → green
2026-05-30 17:16:24 +00:00
chaoxiaoche
df1535b7fd
feat(web): add staged preview feedback during generation (#3227)
* feat(web): wire generation preview stage into workspace

Show a 3-step progress overlay (understand → generate → prepare) in the
preview area while artifacts are being generated, replacing the blank
empty state. Displays elapsed time, an estimated duration hint, and a
retry button on failure.

- Add GenerationPreviewStage component + CSS module + runtime helpers
- Integrate buildGenerationPreviewState into FileWorkspace
- Pass messages/artifact/error/retry from ProjectView to FileWorkspace
- Register i18n keys for en and zh-CN locales

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): keep generation preview alive and persistent across waiting states

Address UX feedback on the generation preview surface:

- Make the waiting card feel alive instead of frozen: breathing mark,
  sweeping progress shimmer, pulsing running-step dot, and a live
  activity snippet pulled from streamed events (respects
  prefers-reduced-motion).
- Add an `awaiting-input` phase so the preview no longer reverts to the
  empty "design will appear here" placeholder when the agent asks the
  user a clarifying question (detects inline <question-form>).
- Add a `stopped` phase so a canceled/paused run keeps a contextual
  paused card instead of blanking the surface.
- Fix workspaceHasPreviewSurface live-artifact tab match (was reading a
  non-existent `tabId` field) and correct the unit assertion that
  contradicted the helper's `thinking` handling.
- Populate generationPreview.* keys (incl. new awaiting/stopped strings)
  across all locales.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): reveal generation steps progressively as the agent reaches them

- Only render steps the agent has actually reached (drop pending pills)
  with a slide/fade entrance, so the card visibly evolves 1->2->3 instead
  of always showing the same fully-populated row.
- Keep the "understand" step in progress during requesting/starting so a
  fresh run opens with a single step rather than a pre-filled set.
- Stop surfacing status detail (e.g. the model slug from `requesting`) as
  the live activity line; only genuine thinking/output text is shown.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): add dynamic sub-status to the generating step

Keep 3 high-level steps but give the long "generating" phase concrete,
moving feedback (option A) instead of splitting into more, less-reliable
steps:

- Derive a sub-status from the agent's TodoWrite plan: the in-progress
  task label (activeForm) plus a done/total count, falling back to the
  latest write/edit target file when no plan was emitted.
- The count counts the in-progress task toward `done` to match the
  chat-side todo card (e.g. 3/7 on both sides).
- Suppress the higher-level narration line while the sub-status is shown
  so only one dynamic line appears at a time (early phase = narration,
  writing phase = concrete task + count).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web): drop elapsed timer and duplicate estimate from generation preview

The "usually 2–5 minutes" estimate showed twice (lead footnote + meta row)
and the elapsed counter added little signal, so remove both: delete the
meta row, stop falling back to the estimate footnote in the generating
lead (render the lead only when live narration exists), and drop the now
unused elapsed timer/util.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-30 13:59:49 +00:00
leessju
cfde84b038
fix(web): make hand-off no-editors fallback perform a real reveal (#2494)
Some checks failed
visual-baseline / Capture visual baselines (push) Waiting to run
ci / Detect CI change scopes (push) Successful in 0s
landing-page-ci / Validate landing page (push) Failing after 3s
landing-page-staging / Deploy landing page to staging (push) Has been skipped
nix-check / build (push) Failing after 2s
ci / Validate Nix flake (push) Has been skipped
ci / Preflight (push) Failing after 1s
ci / Workspace unit tests (push) Failing after 2s
ci / Daemon workspace tests (push) Failing after 1s
ci / Web workspace tests (push) Failing after 2s
ci / Browser tests (push) Failing after 1s
ci / Build workspaces (push) Failing after 1s
ci / Validate workspace (push) Failing after 0s
ci / Runtime trace (push) Has been skipped
* fix(web): make hand-off no-editors fallback perform a real reveal

The Finder/Explorer/File Manager fallback button was only calling an
optional onRequestRevealInFinder prop that the actual caller never
passes, so the surface advertised an action it never performed.

finder, explorer, and file-manager are real entries in the daemon's
open-in catalogue (open / explorer / xdg-open), so route the fallback
through openProjectInEditor(projectId, fallbackId) for a genuine
reveal. Keep the renderer reveal bridge as a secondary fallback if
the daemon spawn fails, and disable the button while busy so a double
click can't queue two reveals.

Adjacent: PreviewDrawOverlay's Send-while-streaming behavior is
intentional (sending is queued downstream, not blocked), and the
button already carries sendDisabledReason as its tooltip. Cover that
contract with a regression test so a future change can't silently
re-disable the control or drop the localized reason.

Scope note: the i18n hand-off key migration that previously rode on
this branch landed on main via a different key set, so this PR is
narrowed to just the fallback wire-up and the two regression tests.

* fix(web): surface daemon spawn failure inline in zero-editors fallback

The zero-editors HandoffButton fallback called setError() on a rejected
openProjectInEditor but returned only the <button>, so the error never
rendered. Production callers (ProjectView) mount the component without
onRequestRevealInFinder, so a daemon spawn failure became a silent no-op
— exactly the failure mode the PR was meant to cover.

Wrap the solo button in a handoff-wrap container and render the error
inline next to it. Adds a regression test for the rejected-spawn path.

* fix(web): align preview draw send-disabled test

* fix(web): show handoff fallback for zero editors

---------

Co-authored-by: nicejames <nicejames@gmail.com>
Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-30 06:34:12 +00:00
mehmet turac
259295419a
fix(web): remove design file mentions with chips (#3204) 2026-05-30 04:50:50 +00:00
RyanCheng77
f12679185c
fix(web): send Anthropic proxy image attachments (#3273)
* fix(web): send Anthropic proxy image attachments

* fix(web): omit image attachment stubs for Anthropic proxy

* fix(web): keep image fallback context aligned

* fix(web): align Anthropic image attachment omission

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 04:47:47 +00:00
RyanCheng77
653a3fcc70
fix(web): harden image export downloads (#3318)
* feat(web): export preview as image

* fix(web): harden image export downloads

* docs(skills): add PR feedback quality gate

* docs(skills): require critical review of Claude feedback

---------

Co-authored-by: 116405 <116405@ky-tech.com.cn>
2026-05-30 04:44:00 +00:00
YOMXXX
9305bd1cff
fix(web): truncate long project names in the automation project picker (#3274) (#3317)
Long project names in the "Existing projects" section of the
automation project picker rendered verbatim with no truncate styling,
so a single name like "A very long project name that would otherwise
wrap onto several lines" blew up the row height and made the dropdown
messy to scan. The expected behavior is a single-line label with
ellipsis, with the full name still discoverable on hover.

Add the standard truncate triad (`white-space: nowrap`,
`overflow: hidden`, `text-overflow: ellipsis`) to
`.automation-popover__label`. The parent
`.automation-popover__body` already sets `min-width: 0`, so the
ellipsis renders cleanly. Thread an optional `title` prop through
`PopoverItem` and pass each project's full name from the picker
call site, so the native hover tooltip carries the unclipped name.

Other PopoverItems with fixed in-product copy (e.g. "New project
each run") deliberately omit the title — they never exceed the row
width and the redundant tooltip would be noise.

Regression test covers the DOM contract (every project row has
`title=<full name>`, fixed rows do not); the CSS half is verified by
code review since jsdom does not apply stylesheets.
2026-05-30 04:42:21 +00:00
Ramiro
ed5e8c147b
fix(web): keep pet composer menu expanded (#3336)
- apps/web/src/components/ChatComposer.tsx
- apps/web/tests/components/ChatComposer.context-pickers.test.tsx

Clear stale absolute anchors when the pet composer menu is positioned fixed so the popover wraps its content instead of collapsing over the composer textarea.
2026-05-30 04:19:02 +00:00
xinsngx
41b1cd763e
fix(media): hide OpenAI OAuth-only image credentials (#3308)
* fix(media): ignore OpenAI OAuth tokens

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.1

* fix(media): hide unavailable model providers

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.2

* fix(media): clear unavailable picker models

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.3

* fix(media): keep missing-model projects executable

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.8

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 04:12:10 +00:00
JasonBroderick
0fbeaf829e
fix(#3247): Detect, terminate, and warn on fabricated role markers across all agent paths (#3303)
* fix(daemon): detect and strip fabricated role markers in model output (#3247)

Three-layer defence against models emitting `## user` / `## assistant` /
`## system` lines mid-response, which the chat host interprets as real
turn boundaries and acts on as unauthorised instruction:

1. **System prompt**: anti-roleplay instruction elevated from a bullet
   under "What you don't do" to a standalone `## CRITICAL` section in
   `official-system.ts`, with a REMINDER pinned at the end of the
   composed prompt for recency bias.

2. **Stream-level detection and truncation**: shared `role-marker-guard.ts`
   module (`createRoleMarkerGuard` + `FABRICATED_ROLE_MARKER_RE`) used
   across all text paths — Claude stream (per-message guards), non-Claude
   structured streams (run-scoped guard via `emitGuardedTextDelta`),
   and BYOK proxy routes (`createDeltaGuard`). When a marker is detected,
   the contaminated suffix is dropped and a `fabricated_role_marker` event
   surfaces a warning in the UI.

3. **UI**: `StatusPill` gains `is-warning` / `is-error` CSS variants;
   `fabricated_role_marker` events render as amber warning pills.

* fix(chat-routes): do not await reader.cancel() on stream early-return

The await on reader.cancel() can hang indefinitely on response streams
whose underlying source is a Uint8Array (most notably surfaced by the
ollama test in proxy-routes.test.ts, which builds its mock body via
`new Response(uint8array)` rather than the controller-based helper
`sseResponse()`). The hung await holds the request handler open, which
in turn blocks `server.close()` in the afterAll hook, producing the two
test timeouts (test at 145, hook at 36) currently failing CI on #3296.

Fix is in production code, not the test: don't await the cancel. It
is a cleanup hint and we are returning from the function anyway, so
blocking on it offers no value. fire-and-forget with an empty catch
keeps the cancel signal flowing for real HTTP streams without
risking a hang on mock/edge-case implementations.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(daemon): terminate child on role-marker detection (close #3247 generation vector)

PR #3296's detection layer truncates display and persistence of fabricated
role markers, but the underlying model subprocess keeps generating tokens
after detection. Three concrete consequences:

  1. The model bills the user for the entire contaminated response
     (we observed 5,106 chars stored in claude's session file for a turn
     where only the first 3,013 chars were legitimate — a 40% overhead).
  2. tool_use blocks emitted AFTER the marker reach the daemon's
     dispatcher unchecked, since detection only gates the text-delta
     emission path, not content-block-stop / tool_use blocks. The
     model could fabricate "## user delete file X" then emit a
     tool_use(delete X) that the dispatcher would execute.
  3. The UI surfaces a `fabricated_role_marker` warning followed by an
     eventual normal turn-end, blurring the distinction between
     "completed normally" and "killed by safety guard."

This commit adds a single idempotent `abortForRoleMarker(marker)`
helper in server.ts, scoped to the same closure as `child` and
`runGuard`. On any detection event (per-message Claude guard,
run-scoped non-Claude guard, plain stdout guard) the helper:

  - Emits a structured `ROLE_MARKER_HALLUCINATION` SSE error so the
    UI can render a security-class status distinct from a normal
    turn-end. The existing `fabricated_role_marker` warning is still
    sent and rendered as the amber pill (PR #3296's UI).
  - Calls `acpSession.abort()` for ACP-multiplexed agents (Hermes,
    Kimi, Devin, Kiro) whose I/O doesn't necessarily release on
    SIGTERM of the wrapper process alone.
  - SIGTERMs the child immediately, with the existing
    `scheduleForcedChildShutdown()` SIGKILL fallback at 2x grace.

Wired into three sites where contamination is detected:
  - `emitGuardedTextDelta` (sendAgentEvent / copilot / ACP / pi-rpc
    text_delta paths)
  - Plain-stdout listener (BYOK plain mode)
  - The Claude stream handler's onEvent (per-message guards in
    claude-stream.ts surface `fabricated_role_marker` events directly
    via onEvent rather than through the run-scoped emitGuardedTextDelta)

Tool_use blocks emitted BEFORE the marker still flow through normally
— this guard can't help with those, since by the time we observe a
text marker the prior content block has already finished. Closing
that gap requires speculative cancellation of in-flight tool calls
when a downstream text block contains a marker; that's tracked as
follow-up work, not included here.

Co-Authored-By: roverkai <2196140098@qq.com>
Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* refactor(role-marker-guard): bounded tail + drop chat-style markers

Addresses two review comments on #3303:

(1) O(1) memory + per-delta work (review r3323982225)
  Replace the unbounded `accumulated` string with a rolling tail capped
  at TAIL_BUFFER_SIZE (64 chars — comfortably exceeds the longest
  marker prefix `\n<whitespace>## assistant` ≈ 16–24 chars in practice).
  A 50 KB assistant response delivered in 1000 chunks of 50 bytes was
  previously O(n²) on string concatenation alone; now it is O(1) per
  delta regardless of message length. The `tail.length` value carries
  the "already emitted" offset that the cut-point math needs, so the
  offset semantics at L74–78 of the prior implementation are preserved
  without re-introducing the full-text buffer.

(2) Drop chat-style markers entirely (review r3323982234, option (a))
  `User:` / `Assistant:` / `Human:` / `AI:` are removed from the regex.
  Rationale:
    - The host parses ONLY `## user` / `## assistant` / `## system`
      lines as turn boundaries (see `buildDaemonTranscript` in
      apps/web/src/providers/daemon.ts). A model emitting chat-style
      markers does NOT cause the original #3247 security failure.
    - With kill-on-detection wired in this PR (`abortForRoleMarker`
      in server.ts), a false positive aborts the whole run — far
      more expensive than a stray unflagged `User:` line in chat
      scrollback. Chat-style markers collide with legitimate output
      (form labels, email contacts, JSDoc) often enough that pairing
      them with kill-semantics is the wrong tradeoff.
  The tradeoff is now documented in the regex docblock so the
  kill-on-match behaviour is justified against the false-positive
  surface.

Also aligns the prompt-side CRITICAL block in system.ts: drop the
"don't emit User: / Assistant: / Human: / AI:" bullet, since we no
longer enforce it. Less ambiguity for the model and the operators.

Test file updated:
  - Chat-style positive tests flipped to negative ("does NOT match
    User: — chat-style out of scope") so the intentional exclusion
    has a permanent regression test.
  - Two new tests cover the bounded-tail behaviour: a marker arriving
    after 10 KB of clean text in small chunks, and a marker
    straddling a chunk boundary after 100 prior chunks.
  - Added test for legitimate `User: bob@example.com`-style content
    not triggering contamination.
Test count is now 35 (up from 25); two of the new ones explicitly
exercise the new bounded-tail path.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): drop \`^\` anchor after first chunk (review r3324060995)

Blocking correctness bug introduced by commit 4 (bounded-tail refactor):
once \`tail\` is a rolling slice of mid-stream text, \`^\` in the
canonical regex \`(?:^|\\n)\\s*##\\s+(?:user|...)\` no longer represents
the genuine message start. As the rolling window slides forward chunk
by chunk, a sliced tail can begin with whitespace + \`##\` (or just
\`##\`), letting \`^\` anchor a match against text that the
full-buffer implementation correctly ignored. With kill-on-detection
wired in commit 3, that false positive now SIGTERMs the run and emits
a \`ROLE_MARKER_HALLUCINATION\` error — exactly the failure class
called out in the docblock at L22–29.

Reviewer's evidence (PerishCode, r3324060995): streaming
"…take a look at the ## user content section…" one character at a
time reports \`contaminated: true\` post-refactor; the same text in a
single feed stays clean.

Fix: keep the canonical \`FABRICATED_ROLE_MARKER_RE\` for the very
first non-empty feed (where \`^\` legitimately points at the message
start), and switch to an internal \`NEWLINE_ANCHORED_ROLE_MARKER_RE\`
(\`\\n\\s*##\\s+(?:user|...)\` — drops the \`^\` alternative) for all
subsequent feeds. A \`firstChunk\` boolean tracks the state. Real
newline-preceded markers straddling chunk boundaries are still caught
because the preceding \`\\n\` is retained inside the 64-char tail.

Regression tests added (\`apps/daemon/tests/role-marker-guard.test.ts\`):
  - mid-line \`## user\` streamed char-by-char with no preceding \\n
    (mirrors the reviewer's repro)
  - space-preceded mid-line \`## user\` in a >130-char stream, which
    long enough to force the rolling window past the marker — exercises
    the exact slice condition that triggered the bug
  - real \\n-preceded \`## user\` still caught after a long preamble
    (positive case must not regress)
  - \`## user\` as the very first chunk still caught (\`^\` legitimately
    anchors on the first feed)

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): case-sensitive + tighter prefix scope (reviews r3324151877 / r3324151882)

Two refinements addressing the third review on #3303:

== Blocking (r3324151877) ==
The regex over-matched legitimate Markdown headings, and with
kill-on-detection wired in commit 3 each false positive
deterministically aborts a real run. Three changes tighten the match
to the actual security surface — `## user` / `## assistant` /
`## system` lines the chat host parses as turn boundaries — without
losing any real attack pattern:

1. CASE-SENSITIVE. Dropped the `/i` flag. The host's turn-boundary
   delimiter is lowercase (see `buildDaemonTranscript` in
   apps/web/src/providers/daemon.ts), and the `## CRITICAL`
   system-prompt block already forbids only the lowercase forms.
   Title-Case headings like `## User Guide`, `## System Architecture`,
   `## Assistant settings` are now ignored — these are legitimate
   technical writing patterns LLMs emit constantly. `## USER NOTES`
   (all-caps) likewise no longer flags.

2. POSITIVE LOOKAHEAD `(?=[^a-z])` after the role keyword. Without it,
   `## userland`, `## userspace`, `## users guide`, `## systemd`,
   `## assistance` all match via prefix in the alternation. The
   lookahead requires the next character to exist and to not be a
   lowercase letter, so:
     - `## user\\n…`     → match (newline is not lowercase)
     - `## assistantR…` → match (R is uppercase; the glued-form
                          attack pattern still gets caught)
     - `## assistant.`  → match (. is not a letter)
     - `## users guide` → no match (s is lowercase letter)
     - `## userland`    → no match (l is lowercase letter)
   POSITIVE rather than NEGATIVE `(?![a-z])` because the negative
   form is satisfied at end-of-string, which in a streaming context
   means "we have `## user` but don't know what comes next yet" —
   would fire prematurely if `land` arrives in a later chunk. The
   positive form delays detection by one character in that edge
   case, traded for correctness.

3. `[ \\t]` instead of `\\s` for inner whitespace. Markdown role
   markers are single-line by convention; restricting to space/tab
   prevents oddities like `##\\nuser` from matching across lines.

Test file: added Title-Case fixtures (`## User Guide`,
`## System Architecture`, `## Assistant settings`, `## USER NOTES`)
and prefix-of-longer-word fixtures (`## users guide`, `## userland`,
`## systemd`, `## assistance`) — each asserting NO contamination.
The existing `## usability` negative test gave false confidence as
the reviewer noted (only failed via alternation-miss, not via
word-boundary semantics); the new fixtures actually exercise the
lookahead. Also added a positive test for `## assistant.` (glued
punctuation) to balance the existing `## assistantReading`
(glued uppercase) coverage. Total tests: 35 → 50.

== Non-blocking (r3324151882) ==
Added `ROLE_MARKER_HALLUCINATION` to `API_ERROR_CODES` in
`packages/contracts/src/errors.ts` alongside the existing agent/AMR
codes, with a docblock comment explaining the emission contract:
emitted by `server.ts::abortForRoleMarker` alongside the existing
`fabricated_role_marker` warning event when the daemon detects a
fabricated Markdown role marker in agent output; retryable. The code
was already being emitted over the wire but unregistered — landing
the registration here keeps the contract and emitter in sync as
reviewer requested.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): defer complete-but-unconfirmed marker suffix

Addresses review r3324277xxx — the boundary case where a stream chunk
boundary lands between the role keyword and its lookahead character
violated the documented "everything from the marker onward is silently
dropped" contract. With (?=[^a-z]) as the lookahead, `feedText('## user')`
returned `## user` as safe (no char to satisfy the lookahead → no match
→ pass through), so the fabricated marker line leaked into UI and
app.sqlite before the next chunk confirmed contamination on the next
SIGTERM cycle.

Fix: introduce a `pending` state variable holding bytes that match the
COMPLETE-but-unconfirmed marker prefix at end of buffer
(/(?:^|\\n)[ \\t]*##[ \\t]+(?:user|assistant|assist|system)$/, no
lookahead, $ anchor instead). When the no-match branch detects this
suffix, withhold it from emission until the next feed either:
  - Confirms it (next char non-lowercase) → main regex matches →
    contaminated → withheld bytes dropped along with `## user`.
  - Denies it (next char lowercase, e.g. `userl…`) → main regex no
    longer matches the role keyword → withheld suffix is released
    and emitted alongside the new continuation.

Also tied the firstChunk transition to actual byte emission rather
than feed count. Previously a message that starts with `## system`
followed by a separate `\\n` chunk would lose the `^` anchor on the
second feed (firstChunk had flipped after the first feed even though
nothing was emitted yet), silently breaking detection for that edge
case. Now `firstChunk` stays true until at least one byte has crossed
the emission boundary, matching the conceptual definition of "message
start".

Tests added (apps/daemon/tests/role-marker-guard.test.ts):
  - `## user` deferred at chunk boundary, confirmed by `\\n` in next
  - `## user` deferred at chunk boundary, denied by `land` continuation
  - `## assistant` deferred, confirmed by punctuation
  - `## User` Title-Case still passes through unconditionally
  - `## system` as the very first chunk: deferred, confirmed by \\n
    in next chunk (tests the firstChunk-stays-true-when-nothing-
    emitted invariant)

Total tests: 50 → 55.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(claude-stream): scope role-marker guard to text_delta only, not thinking_delta

Addresses review r3324xxxxxx — guarding the thinking channel buys no
security and causes legitimate aborts.

Why thinking is NOT a #3247 vector:
  - `buildDaemonTranscript` in apps/web/src/providers/daemon.ts only
    re-serializes `m.content` as `## ${m.role}\n...`.
  - Extended-thinking content is rendered to a separate
    `kind: 'thinking'` payload (daemon.ts:857-858) and never folded
    into `m.content`.
  - So a `## user` line in the thinking channel CANNOT become a
    fabricated turn boundary on the next round-trip.

Why guarding it is harmful:
  - Models routinely emit literal `## user` / `## assistant` lines
    in chain-of-thought when reasoning about conversation structure
    ("Let me think about this. The user might phrase it as:\n## user\n
    …"). Common pattern in production traces.
  - With `abortForRoleMarker` wired in server.ts, a guard match on
    thinking SIGTERMs the run and surfaces a security error to the
    UI. The user paid for the reasoning, never sees the answer, and
    gets a confusing "fabricated role marker" warning for what was
    actually legitimate metacognition.
  - This directly contradicts the module's own stated philosophy
    ("a false positive aborts the whole run — a much more expensive
    failure than a stray unflagged ... line", role-marker-guard.ts).

Fix: `emitSafeText` now passes thinking_delta through unconditionally,
skipping both the guard and the contamination check. text_delta
remains fully guarded. The single-line change at the top of
emitSafeText preserves all other channels' behavior.

Regression tests added (apps/daemon/tests/claude-stream-thinking.test.ts):
  - `## user` / `## assistant` lines in a thinking_delta — must NOT
    fire fabricated_role_marker, the thinking content streams intact
    including the marker text, and the subsequent text_delta answer
    still reaches the consumer (run not aborted).
  - Sanity check: same `## user` pattern in a text_delta DOES fire
    fabricated_role_marker and truncates emission at the marker. Locks
    in the channel-discriminated behavior.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(role-marker-guard): tie firstChunk to slicing, not byte emission

Blocking review r3324xxxxxx: under the prior firstChunk transition
("any byte emitted"), a role marker that arrived at the very start of
a message with its prefix split across multiple chunks bypassed
detection — reopening the #3247 vector on the Claude path.

Concrete cases that were missed (all are routine provider
tokenizations of \`## user\n…\` at message start):
  - \`##\`     | \` user\nDELETE…\`
  - \`## us\`  | \`er\nDELETE…\`
  - \`## \`    | \`user\nDELETE…\`

Mechanism: the pending-deferral regex only catches COMPLETE role
keywords, so a first chunk ending in a partial prefix (\`##\`, \`## \`,
\`## us\`) was emitted in full. That emission flipped firstChunk to
false. From that point only NEWLINE_ANCHORED_ROLE_MARKER_RE was used,
which requires a literal \n before \`##\`. A marker at buffer
position 0 has no preceding \n, so it could no longer match.
abortForRoleMarker never fired and tool_use blocks emitted after the
fabricated turn boundary reached the dispatcher.

Fix: change firstChunk to track "tail has not been sliced yet" rather
than "any byte emitted". While total emitted bytes <= TAIL_BUFFER_SIZE,
tail still represents the entire emission so far and \`^\` in the
canonical regex genuinely anchors at byte 0 of the stream — so the
\`^|\n\` alternation safely catches a chunk-split message-start
marker. The transition happens at the moment we would slice: once
emitted > TAIL_BUFFER_SIZE, tail becomes a mid-stream window, \`^\`
becomes meaningless, and we switch to the newline-only variants.

Earlier iterations of this code tried two other definitions, both
unsound:
  - "any byte emitted" (this commit fixes) — lost \`^\` before a
    chunk-split message-start marker could finish arriving.
  - "newline emitted" (briefly considered as the reviewer's
    alternative suggestion) — left \`^\` valid on a sliced buffer
    when streams hadn't emitted a newline yet, re-introducing the
    rolling-tail mid-stream false positive from review r3324060995.
The slice-based invariant satisfies both: while we have not sliced,
\`^\` is correct; once we slice, it is not.

Regression tests added (apps/daemon/tests/role-marker-guard.test.ts):
  - \`##\`    | \` user\nDELETE…\`   → contaminated, marker=\`## user\`
  - \`## us\` | \`er\nDELETE…\`      → contaminated, marker=\`## user\`
  - \`## \`   | \`user\nDELETE…\`    → contaminated, marker=\`## user\`
  - \`#\`     | \`# user\nDELETE…\`  → contaminated, marker=\`## user\`
The fourth case (single \`#\` first chunk) exercises an even more
adversarial tokenization than the reviewer's examples; it is also
caught.

Total tests: 55 → 59.

Co-Authored-By: JasonBroderick <jason@buddyboss.com>

* fix(tests): wrap events in stream_event envelope in thinking test

feedJsonl was feeding raw events without the `{ type: 'stream_event',
event: ... }` wrapper that createClaudeStreamHandler requires (line 141
of claude-stream.ts). Events silently fell through all branches, making
both tests pass vacuously. Also fix TS2532 on warnings[0].marker with
non-null assertion (safe after the toHaveLength(1) guard).

Co-Authored-By: RoverKai <roverkai@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: roverkai <2196140098@qq.com>
Co-authored-by: JasonBroderick <jason@buddyboss.com>
Co-authored-by: RoverKai <roverkai@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-30 03:57:56 +00:00
xinsngx
c88a83cd5e
fix(web): preserve preview scroll across tools (#3313)
* fix(web): preserve preview scroll across tools

Capture URL-loaded preview scroll state before tool handoff and restore it through an opt-in raw HTML bridge to avoid jumping back to the top.

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.6

* test(daemon): cover scroll bridge injection paths

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.6

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 03:53:50 +00:00
xinsngx
778010bcf9
fix(web): theme home hero select menu (#3309)
* fix(web): theme home hero select menu

Use theme tokens for HomeHero footer select dropdown panels so dark theme menus do not render light-only white backgrounds with low-contrast text.

Agent-Model: gpt-5

Agent-Family: openai

Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197

Agent-Step: 0.0.3

* test(web): cover dark model logo inversion

Agent-Model: gpt-5
Agent-Family: openai
Agent-Session: 019e6ceb-c33d-7cd3-bff0-cbc20c642197
Agent-Step: 0.0.7

---------

Co-authored-by: Codex <gpt-5@openai.com>
2026-05-30 03:53:42 +00:00
Nicholas-Xiong
610aac4cc5
fix: insert skill reference when selecting from tools panel (#3220)
* fix: insert skill reference when selecting from tools panel

When selecting a skill from the tools panel (not via @ mention), the skill
reference was not being inserted into the input. The tools panel only called
applyProjectSkill() without inserting the text token.

This fix makes the tools panel skill picker behave consistently with other
reference types (MCP, connectors) by:
- Inserting the @skill token at the current cursor position
- Setting focus and cursor position after the inserted reference
- Closing the tools panel after successful insertion

Now users can:
1. Manually delete an @skill reference
2. Open the tools panel and select a skill
3. See the new @skill reference inserted at the cursor

Fixes #3188

* fix(web): preserve draft edits when inserting skill

---------

Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-30 03:52:35 +00:00
Patrick A
9146dc1c57
fix(web): persist design files view state across navigation (#2303)
* fix(web): persist design-files view state across navigation

pageSize, sortKey, sortDir, and kindFilter reset on every navigation
because DesignFilesPanel remounts via key={projectId}. Persist them to
localStorage under od:design-files:view-state:v1:<projectId> so each
project's view prefs survive tab-switching.

- Read persisted state via lazy useState initializers (SSR-safe try/catch)
- Write back in a single useEffect keyed on all four values
- Scoped per-project so proj-a settings never bleed into proj-b
- Schema-guarded: invalid/missing fields fall through to defaults
- Red spec: apps/web/tests/components/DesignFilesPanel.view-state-persist.test.tsx

* fix(web): address review feedback on view-state persistence

- Add typeof window guard in readViewState for explicit SSR safety
- Consolidate 4 separate localStorage reads into a single useRef read at
  mount time; each lazy useState initializer now reads from savedViewState.current
  instead of re-parsing localStorage independently

* fix(web): harden design-files view-state persistence

- Validate restored kindFilter values against the current ProjectFileKind
  union via isProjectFileKind() so stale stored values from a prior schema
  are dropped silently instead of being cast unchecked.

- Introduce DEFAULT_SORT_KEY/SORT_DIR/PAGE_SIZE constants so the useState
  initialisers and the new validation guard share a single source of truth.

- Add viewStateHasMounted ref to skip the first-render write in the persist
  useEffect. Without this guard every project the user visits accumulates a
  default-value entry in localStorage on mount, growing stale-key garbage
  unboundedly and making future field additions silently inject defaults into
  every existing entry.

- Harden kindFilter test: replace the silent early-return-on-missing-trigger
  with expect(filterTrigger).not.toBeNull() so a render failure surfaces as
  a real test failure rather than a passing no-op.

* test(e2e): design files view state persists across navigation and reload

Adds a Playwright UI smoke test in e2e/ui/ that exercises the three key
guarantees of the view-state persistence fix:

  (a) Tab-away / tab-back: navigating to a file tab and returning remounts
      DesignFilesPanel (conditionally rendered); all four prefs (sortKey,
      sortDir, pageSize, kindFilter) are restored from localStorage.

  (b) Hard reload: localStorage survives page.reload(); prefs are intact on
      the next mount.

  (c) Per-project key isolation: a second project starts with defaults and
      does not inherit values from the first project's localStorage entry.

The test uses OD_PORT=18011 / OD_WEB_PORT=18012 to avoid port conflicts with
the default development ports.

Also fixes a race in DesignFilesPanel: the stale-kind cleanup useEffect was
running against an empty availableKinds set before the async file list arrived
on mount, which cleared a kindFilter correctly restored from localStorage.
Guard added: skip the cleanup when availableKinds is empty.

Red on origin/main (no persistence logic exists there); green on this branch.

* fix(e2e): address code-reviewer feedback on view-state-persist test

- Add data-testid='df-page-size-select' to per-page <select> in
  DesignFilesPanel (W2: decouple test from i18n string 'Show')
- Add StrictMode comment to viewStateHasMounted guard explaining
  the dev-mode double-write behaviour (W1: document the invariant)
- Switch nav-away from dblclick to single-click + Open button,
  matching the pattern used in app-design-files.test.ts (W4)
- Raise timeout from 60s to 90s for cold CI runners (W3)
- Unify seedTextFile/seedPngFile into shared seedFile helper (N3)
- Add home-hero-input assertion in gotoEntryHome (N2)
- Switch waitForPageSizeSelect to use data-testid (W2)

* test(e2e): split design-files persist into nav, reload, and per-project scenarios

* fix(web): tighten isPageSize to discrete option set, add invalid-value regression test

* fix(web): isolate DesignFilesPanel.test.tsx from persisted view-state key
2026-05-30 03:39:27 +00:00
lefarcen
6f532ca35c
fix(web): snapshot the srcDoc bridge frame in Mark mode so deck capture works (#3304)
The Mark tool (#3081/#3277) captured the preview via the *active* iframe. For
URL-load previews — decks especially — the active frame is the bridgeless URL
iframe, while the snapshot bridge lives only in the (mounted but hidden) srcDoc
transport frame. So Send on a deck timed out and showed 'Could not capture the
preview. Try again to avoid sending only ink.'

Snapshot the srcDoc-render-mode frame instead (capture mode already keeps it on
full content, so it carries the bridge), with a short retry while it finishes
swapping to full content. Falls back to the active frame for the non-URL-load
case where they are the same.

Red spec: PreviewDrawOverlay.test 'snapshots the srcDoc bridge iframe, not the
visible URL-load frame' fails on main (targets the URL frame), passes here.
2026-05-29 11:50:37 +00:00
mrcfps
6acb12ef08 fix(pack): resolve win internal package merge conflict
Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)
2026-05-29 18:01:14 +08:00
Bassiiiii
0c4b7e50be
fix(web/router): defer popstate dispatch to microtask (#2490)
* fix(web/router): defer popstate dispatch to microtask

navigate() previously dispatched a synchronous popstate event after
mutating window.history, which caused React 18 to emit:

  Cannot update a component (Router) while rendering a different
  component (App). To locate the bad setState() call inside App,
  follow the stack trace as described in
  https://react.dev/link/setstate-in-render

This happens whenever a caller invokes navigate() from inside a
useState updater (e.g. App.tsx:479 routing first-run users through
the onboarding panel from inside the setConfig() update). The
synchronous popstate dispatch reaches useRoute() subscribers which
then call setRoute() while the parent component is still rendering.

Defer the popstate dispatch to a microtask. The window.history call
itself stays synchronous so the URL bar updates immediately; only
subscriber updates are pushed past the current render commit, which
removes the warning without changing observable behaviour for any
existing caller.

* fix(web/router): cover deferred navigation timing

---------

Co-authored-by: Visionboost <contact@visionboost.fr>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 09:37:55 +00:00
mrcfps
467d5c37d6 fix(web): resolve ChatComposer merge conflict
Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)
2026-05-29 17:22:00 +08:00
Md Mushfiqur Rahim
8ec162bb26
fix: pet hover card gets cut off at screen edges (#2860)
* fix: pet hover card gets cut off at screen edges

* fix: address review feedback - viewport clamping + unadopted pet wake

- Add window.innerHeight check to prevent bottom-edge clipping
- Increase menuH estimate for safer positioning
- Open pet settings instead of no-op Wake for unadopted pets

* fix: address review feedback on pet menu positioning and wake action

- Add viewport height check (viewH) to prevent bottom-edge clipping
- Increase menuH estimate for safer positioning
- Open pet settings instead of no-op Wake for unadopted pets
2026-05-29 09:05:51 +00:00
mrcfps
d0ea7a5823 fix(web): merge main to resolve remaining PR conflict
Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)
2026-05-29 17:00:33 +08:00
byte92
cdf34897ba
add comment composer keyboard submit shortcut (#2941)
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:46:15 +00:00
saifullakhan
73b2dc853f
Fix project empty state create action (#3082)
Co-authored-by: saifulla-khan <saifulla-khan@users.noreply.github.com>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:30:43 +00:00
maybeyourking
881571dea7
fix(media): route custom-image edits through images API (#3087)
* fix(media): route custom-image edits through images API

* fix(media): normalize custom-image endpoint suffixes

---------

Co-authored-by: Artist Ning <dingkuake@yeah.net>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-29 08:09:44 +00:00
Aria Shishegaran
fe58db2ba1
fix(web): target comment picker elements precisely (#3263)
Resolve Comment picker hit testing against meaningful visible DOM leaves before falling back to annotated ancestors, while preserving Inspect mode's annotation-first selector behavior.

Filter generated React root annotations from Comment targets, keep real element bounds separate from hoverPoint, and avoid rendering the comments drawer inline when a configured dock portal is not mounted.
2026-05-29 07:47:08 +00:00
Mason
1006efa2f6
Improve onboarding AMR runtime card (#3276)
* Improve onboarding AMR runtime card

* Fix onboarding AMR test expectations
2026-05-29 07:45:23 +00:00
freshtemp-labs
593bf2f03c
fix(composer): ellipsis overflow for referenced filenames (#3269)
Inline @-mentioned filenames in the composer can be very lengthy,
causing line wrapping and visual crowding in the input area.

- Switch display from inline to inline-block for max-width
- Cap width at min(240px, 25vw)
- Apply overflow: hidden + text-overflow: ellipsis + white-space: nowrap
- Remove box-decoration-break (unused since content won't wrap)
- Tweak vertical-align for consistent inline-block alignment

Closes #3261

Co-authored-by: freshtemp-labs <freshtemp-labs@users.noreply.github.com>
2026-05-29 07:41:16 +00:00
ziyan2006
071db7ca1b
[codex] Stabilize HTML deck navigation state (#3142)
* fix: stabilize html deck navigation state

* fix: avoid misclassifying transform decks as scroll decks

* fix: detect default root-scroller decks

---------

Co-authored-by: Nongzi <3051966228@qq.com>
2026-05-29 07:41:10 +00:00
elihahah666
be09fe92da
fix: keep settings/handoff/avatar buttons fixed to the right in project header (#3279)
Move the three buttons (settings, handoff, avatar) from fileActionsBefore
to the actions slot so they always stay pinned to the right edge of the
header, regardless of how many extra controls (Share, Present, etc.) are
injected via portal during HTML preview.

Co-authored-by: qiongyu1999 <2694684348@qq.com>
Co-authored-by: Claude Opus 4 <noreply@anthropic.com>
2026-05-29 07:33:57 +00:00
Amy
937946c6fa
Improve model picker search and shared BYOK catalogs (#3262) (#3278) 2026-05-29 07:07:40 +00:00
mrcfps
1a3caf04f8 fix(web): merge main to resolve BoardComposerPopover conflict
Generated-By: looper 0.9.2 (runner=fixer, agent=opencode)
2026-05-29 15:06:54 +08:00
lefarcen
755d84e64c
feat(web): merge Draw + Screenshot into one Studio mark tool (#3081) (#3277)
Forward-ports chaoxiaoche's Studio toolbar work from #3081 onto current
main. The preview toolbar drops to 4 controls — Comment, Mark (the merged
Draw/Screenshot tool with box-select + pen sub-tools), Edit, Comments —
matching the latest design. The standalone Screenshot button and its
copy-to-clipboard path are removed; capture now flows through the mark
overlay. Also carries #3081's comment select-all/clear-selection panel and
keeps the Draw send guard added in #3270 (Send disabled mid-run, Queue stays).

Reconciled with main work that postdates #3081's base so nothing is lost:
- Preserves #2190's preview iframe keep-alive pool and the AnnotationHoverPopover
  hover card (re-added on top of #3081's BoardComposerPopover, with its own
  anchor helper so it doesn't clash with the composer popover anchoring).
- i18n: keeps every locale key main added; adopts #3081's mark wording.

Behavior change: the comment side-panel Clear now deselects instead of
batch-deleting selected comments (per #3081); per-comment delete and
send-selected remain.

Validation: pnpm --filter @open-design/web typecheck (clean),
full web vitest (2354 passed), pnpm guard.

Co-authored-by: chaoxiaoche <fanzhen910412@gmail.com>
2026-05-29 06:51:38 +00:00
lefarcen
98a2c63973
feat(daemon): add Antigravity agent adapter (#3157)
* feat(daemon): add Antigravity agent adapter

Adds Google Antigravity (`agy` CLI) as a coding-agent runtime. Detection
picks up `agy` on PATH, the daemon spawns `agy -p "<prompt>"` for a
single non-interactive turn, and the assistant text reply streams back
on stdout. OAuth is shared with the Antigravity IDE through the system
keyring, so users who have signed into the desktop app are authenticated
on first run with no extra step.

`agy` v1.0.3 has no JSON / stream-json / ACP output mode (upstream issue
#119), no `--model` flag (issue #35), and no MCP forwarding hook yet —
the adapter ships with `streamFormat: 'plain'` and a single `default`
fallback model so the model picker doesn't mislead users into thinking
their choice is wired through. We will upgrade buildArgs + add a
dedicated event parser when upstream ships structured output.

Also gitignores `.antigravitycli/`, the project-local config directory
`agy` auto-creates on every run (upstream issue #175).

* fix(daemon): Antigravity adapter — stdin prompt, brand icon, form loop, empty-output guard

- Switch prompt delivery from argv to stdin (`agy -p -`) to avoid the
  30KB maxPromptArgBytes limit that blocked real-world composed prompts
- Add official Antigravity brand SVG icon to agent picker
- Fix repeated question-form loop for plain agents by injecting an
  OVERRIDE block when form answers are already present in the transcript
- Add empty-output guard for plain agents so expired auth or silent
  failures surface a user-visible error instead of a blank "Done" turn

* feat(daemon): expand Antigravity adapter — model picker, form-loop fix, OAuth launcher, log-file classification

PR #3157 follow-up integrating four iterations from end-to-end manual
testing on Gemini 3.5 Flash + GPT-OSS 120B Medium through `agy` v1.0.3.
Each section is independently verifiable; combined they're what made
the first successful artifact generation work end-to-end.

## Model picker via settings.json (agy has no --model flag)

agy v1.0.3 ships no `--model` CLI flag (upstream issue #35), but the
TUI Switch-Model picker writes the chosen label to
`~/.gemini/antigravity-cli/settings.json`'s `"model"` field, and every
`-p` invocation re-reads that file on startup — verified by capturing
the `--log-file` line `Propagating selected model override to backend:
label="<model>"`. Antigravity's `fallbackModels` now lists the 8
labels its TUI exposes (Gemini 3.1 Pro / 3.5 Flash variants, Claude
Sonnet/Opus 4.6 Thinking, GPT-OSS 120B Medium) and `buildArgs`
persists the user's choice to settings.json right before spawn. The
synthetic `default` id is preserved — picking it leaves settings.json
untouched so a user who switches models from agy's own TUI keeps
their choice.

Introduces `RuntimeAgentDef.supportsCustomModel?: boolean`. AMR's
hardcoded blocklist in `SettingsDialog.tsx` migrates to the
declarative flag (it rejects free-form ids at the ACP layer), and
antigravity opts out because its label set is a server-side enum that
silently fails on unrecognised strings.

## Form-loop fix (transcript sanitizer + stronger OVERRIDE)

The discovery form loop on weak/medium plain-stream models (GPT-OSS
120B Medium, Gemini 3.5 Flash) had two reinforcing causes:

  1. `buildDaemonTranscript` packed the prior assistant turn's
     literal `<question-form>` markup into the user request on the
     next turn, giving the model a template to echo. New
     `sanitizePriorAssistantTurnForTranscript` strips
     `<question-form>...</question-form>` blocks and ```json fences
     that match form-schema shape, replacing them with a brief
     placeholder. User content is preserved verbatim (a user who
     legitimately mentions `<question-form>` in chat keeps their
     message intact).
  2. The OVERRIDE block on form-answered turns was 4 lines and only
     banned the bare `<question-form>` tag — models still emitted the
     fenced JSON, form-asking prose ("Got it — tell me the following"),
     and fake system events ("subagents stopped"). The new
     `FORM_ANSWERED_SYSTEM_OVERRIDE` enumerates each anti-pattern and
     pins them via tests, so silently weakening any line reintroduces
     the regression.

Also adds RuntimeAgentDef.resumesSessionViaCli + RuntimeContext.
hasPriorAssistantTurn as forward-looking abstractions (skipTranscript
option on composeChatUserRequestForAgent). Antigravity does NOT opt
in — agy's `-c` resume activates an internal agentic loop with tool
retries and fallback-to-cached-response on tool errors that the OD
system prompt cannot steer; reverted after seeing byte-identical
form re-emissions caused by agy's own retry logic, not OD's transcript.

## One-click OAuth via system terminal

agy print mode can't complete Google Sign-In on its own (the OAuth
callback page asks the user to paste an auth code back into agy, but
`-p` has no input field). Before this commit the auth banner only
told the user to "open a terminal yourself."

Adds `POST /api/agents/antigravity/oauth-launch` and a cross-platform
launcher in `runtimes/terminal-launch.ts`:

  - macOS:    osascript → Terminal.app `do script "agy"` + activate
  - Linux:    tries x-terminal-emulator, gnome-terminal, konsole,
              xfce4-terminal, xterm in order
  - Windows:  `cmd /c start "Open Design" cmd /k agy`

The endpoint hardcodes the `agy` command (no user input → no shell
injection surface) and is loopback-gated like the other daemon
endpoints. The chat's `AGENT_AUTH_REQUIRED` banner now renders a
"Sign in via terminal" button next to Retry; clicking it spawns the
terminal so the user can finish OAuth in one click.

## Silent-failure classification (auth vs quota via --log-file)

agy print mode is silent on stdout/stderr for both missing-OAuth AND
quota-exhausted failures — the upstream
`RESOURCE_EXHAUSTED (code 429): Individual quota reached` and the
`not logged into Antigravity` line only surface in agy's
`--log-file`. Without log inspection the daemon misread quota as
"auth required" and showed the wrong banner.

`RuntimeContext.agentLogFilePath` carries a daemon-owned per-run temp
path that antigravity's buildArgs translates to `--log-file <path>`.
The empty-output guard now reads that log on a `code === 0 &&
!childStdoutSeen` exit, feeds the tail to
`classifyAgentServiceFailure`, and routes:

  - "not logged into Antigravity"     → AGENT_AUTH_REQUIRED with
                                        antigravityAuthGuidance
  - "RESOURCE_EXHAUSTED" / "quota" /  → RATE_LIMITED with
    "Individual quota reached"          antigravityQuotaGuidance
  - none of the above (rare)          → fall back to auth guidance
                                        as the most likely cause

Both surface a terminal launcher in the auth banner: auth gets "Sign
in via terminal", quota gets "Switch model in terminal" — same
endpoint, contextual label. The handler is identical (open agy in a
terminal); the user either signs in or uses agy's Switch Model
picker to pick a model with available quota.

## Validation

- `pnpm guard` pass
- `pnpm --filter @open-design/daemon` runtime + telemetry suites:
  192 passed, 1 skipped (the 1 pre-existing `task-type` failure on
  origin/main is unrelated to this change)
- `pnpm --filter @open-design/web` typecheck pass; sse / amr-guidance
  / AgentIcon suites pass (51 web tests)
- Manual end-to-end on darwin + Gemini 3.5 Flash and GPT-OSS 120B
  Medium: turn-1 question-form rendered correctly, turn-2 produced
  `<artifact>` with full HTML (3.3KB Modern Minimal design) instead
  of re-emitting the form. agy `--log-file` content correctly
  classified as RATE_LIMITED when Gemini Pro quota was exhausted,
  and as AGENT_AUTH_REQUIRED when keychain was cleared.

* fix(web/test): align amrAgent fixture with supportsCustomModel contract

The AMR agent definition in the daemon ships `supportsCustomModel: false`
so the Settings model picker hides the free-text "Custom…" option. The
PR changed `allowCustomModel` from `selected.id !== 'amr'` (hardcoded)
to `selected.supportsCustomModel !== false` (declarative), but the test
fixture was not updated to carry the same field — causing the
`__custom__` sentinel to appear in the picker under test.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): align formAnswerTransition wording with main + scope build directive to discovery

CI surfaced two failures on the merge with main:
- chat-route.test marks submitted discovery form answers ... expected
  the main-version wording 'Do not emit another <formId> form.'
- telemetry-message-finalization keeps non-discovery form answers
  active ... expected task-type to fall through the else branch
  ('Treat these form answers as the active user turn'), not the
  discovery RULE 2/RULE 3 build branch.

The colleague's earlier fba1e40b form-loop fix tightened both pieces
(stronger wording + grouped discovery|task-type into the build branch)
but didn't update the tests that pin the contract. Revert the
transition wording to main and re-scope the build directive to
'discovery' only. The aggressive form-loop suppression we added in
this PR now lives in the system-prompt FORM_ANSWERED_SYSTEM_OVERRIDE
block, which is far stronger than the user-request transition text
this commit reverts.

* fix(daemon): scope formOverride by form id, detach Linux terminal, move agy log cleanup to finally

- FORM_ANSWERED_GENERIC_OVERRIDE: new exported constant for non-discovery/
  non-task-type form ids; contains only the "do not re-ask" suppression
  without the RULE 2 / RULE 3 / artifact directive.
- formAnswerTransitionForCurrentPrompt: extend build-transition branch to
  include task-type alongside discovery, keeping user-turn and system
  override consistent.
- Prompt assembly (server.ts ~10848): derive formOverride from the parsed
  form id — FORM_ANSWERED_SYSTEM_OVERRIDE for discovery/task-type,
  FORM_ANSWERED_GENERIC_OVERRIDE for all other form ids, empty otherwise.
- launchOnLinux: replace execFileAsync (waited for terminal exit, 3 s cap)
  with spawn({ detached: true, stdio: 'ignore' }) + unref(); resolve on
  the 'spawn' event so long-lived interactive terminals (xterm, konsole)
  are not killed mid-OAuth-flow.
- Antigravity log cleanup: move fs.promises.unlink(agentLogFilePath) into
  a try/finally wrapper around the close handler so every exit path
  (success, failure, cancel, non-zero exit) cleans up the per-run temp
  file, preventing unbounded /tmp accumulation.
- Tests: rename task-type case to assert build-transition behaviour; add
  generic-form-id case (preferences) pinning the non-build path; add
  FORM_ANSWERED_GENERIC_OVERRIDE content assertions.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): switch Antigravity buildArgs to chat subcommand invocation

Replace top-level `-p -` with `agy chat [--log-file …] -` so the adapter
uses the documented chat subcommand and stdin sentinel instead of the
unrecognised global -p flag.  Update the agent-args test description and
all four deepEqual assertions to assert the ['chat', '-'] shape.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* test(daemon): drop real-platform default-launch assertion from terminal-launch suite

The removed test called launchAgentInSystemTerminal('agy') with no
platform override, which invokes the real system terminal on every
developer machine running the daemon test suite (Terminal.app on macOS,
cmd.exe on Windows, xterm/gnome-terminal on Linux). That is an
unacceptable OS side effect for a unit test.

The behaviour being asserted — that omitting platform selects
process.platform — is a TypeScript default-parameter guarantee, not a
runtime invariant that needs an integration test. The remaining 'aix'
case continues to pin the unsupported-platform failure shape.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): buffer Antigravity stdout to suppress auth URL before close-time classifier

The plain-stream close handler at code===0 can detect an agy OAuth
prompt in agentStdoutTail and emit AGENT_AUTH_REQUIRED, but by the
time close fires the stdout chunk has already been forwarded to the
client via the plain-stream `send('stdout', { chunk })` path. This
leaves both the raw OAuth URL and the terminal-launch guidance visible
in chat.

Buffer all stdout chunks for the `antigravity` agent instead of
forwarding them immediately. The existing close-time auth-prompt guard
(code===0, !trackingSubstantiveOutput, childStdoutSeen) returns early
when it detects the auth pattern, leaving the buffer unflushed and the
OAuth URL out of the SSE stream. For legitimate assistant output the
buffer is flushed in order just before design.runs.finish so the
chunks still arrive before the run's finished event.

Adds a chat-route integration test using a fake `agy` that exits 0
after printing the canonical auth prompt; asserts that the run emits
AGENT_AUTH_REQUIRED with no event: stdout delta containing the URL.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* test(daemon): isolate antigravity buildArgs argv test from real settings file

Pass a temp antigravitySettingsPath in the RuntimeContext for the
withModel argv assertion so unit tests do not touch
~/.gemini/antigravity-cli/settings.json. Adds the optional
antigravitySettingsPath field to RuntimeContext and threads it
through buildArgs to writeAntigravityModelSelection; production
callers leave it undefined, preserving the existing default path.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): revert Antigravity buildArgs to `-p -` (the only working agy v1.0.3 invocation)

The looper-reviewer-bot reported `chat` as agy's headless subcommand
based on its environment's agy build, and looper-fixer applied that
shape. The installed CLI (`agy --version` reports `1.0.3`) does NOT
expose a `chat` subcommand — `agy --help`'s `Available subcommands`
section lists only `changelog / help / install / plugin / update`,
and `agy chat - < prompt` exits 0 with empty stdout (the daemon then
forwards it as a 'successful' empty reply, exactly the failure mode
the auth/quota guard at server.ts ~12090 is meant to catch — for the
wrong reason).

`-p` is the documented print-mode flag (`Short alias for --print`)
and `agy -p -` reads the prompt from stdin and prints the model
reply, which the entire end-to-end test sequence in this PR has
verified against (form-loop fix, settings.json model routing,
log-file classification all confirmed working on Gemini 3.5 Flash
+ GPT-OSS 120B Medium with this invocation).

Updates the agent-args test to pin `['-p', '-']` instead of
`['chat', '-']` and adds an inline comment in antigravity.ts noting
that `chat` may exist in a future agy build but is not the contract
on the installed CLI today.

* fix(daemon): serialize Antigravity concrete-model spawns to dodge settings.json race

Reviewer (looper) flagged a concurrency race in the model-routing path:
~/.gemini/antigravity-cli/settings.json is process-global, so two OD
runs starting close together with different concrete models can race
the file — run A writes model A, run B writes model B, then A's agy
finally reads settings.json and executes on model B. The Settings
model picker becomes nondeterministic under parallel conversations.

Adds a per-process promise chain in antigravity.ts:
  - acquireAntigravityModelLock(): chain-await + return release fn
  - waitForAgyToReadModel(logPath, expected): polls agy's --log-file
    for the upstream signal
      'Propagating selected model override to backend: label="<X>"'
    which model_config_manager.go emits once agy has finished reading
    settings.json. Returns true on observed match, false on timeout.
    Regex-escapes the expected label so '(' / ')' in 'GPT-OSS 120B
    (Medium)' match literally, not as a capture group.

server.ts spawn pipeline now acquires the lock BEFORE buildArgs (which
performs the settings.json write) and schedules a release-once handler
that fires when EITHER (a) the log-file confirms agy read the model
or (b) the child exits — the exit fallback prevents a stuck/crashed
agy from starving the queue for every subsequent antigravity spawn.

Default-model spawns bypass the lock entirely: their buildArgs doesn't
touch settings.json, so there's nothing to serialize.

Tests pin:
  - FIFO ordering across 2 / 3 concurrent acquirers
  - Wait helper's regex correctly matches parenthesized labels
  - Wait helper does NOT match a different model with shared prefix
  - Wait helper swallows missing-log-file errors and returns false on
    timeout (no spawn-pipeline crash if the log never appears)

194 → 198 passing runtime tests, 0 regressions.

* fix(daemon): close Antigravity lock release race on slow agy startup (looper #263fd2fe7)

Reviewer flagged that the previous serialization scheduled
`releaseOnce` in `.finally()` on waitForAgyToReadModel — meaning the
helper's `false` timeout return ALSO released the lock. If agy took
longer than the 15s polling window to read settings.json (cold start,
swap-thrash, slow network handshake to the upstream backend), run A's
lock dropped at 15s, run B rewrote settings.json with model B, and
run A's still-starting agy then read the wrong model. Same race the
original mutex was meant to close.

Fix the release semantics to be release-on-confirmation-only:

  - waitForAgyToReadModel: `false` now strictly means 'I gave up
    polling,' not 'agy definitely did not read this.' Document the
    contract so a future caller can't conflate the two. Add an
    optional AbortSignal so server.ts can stop polling when the child
    exits — without it, the leftover watcher could outlive the run
    and accidentally match a later concurrent run's log content,
    releasing the wrong lock.
  - server.ts: schedule `releaseOnce` only when waitForAgyToReadModel
    returns true. The exit handler (which fires for crashes, fast
    exits, normal completion) is now the canonical fallback that
    releases the lock no matter what — the queue can't starve
    permanently because agy always exits eventually. The exit
    handler also fires the AbortController so the watcher cleans up.

New tests pin:
  - timeout returns false WITHOUT any release-implying side effect
  - already-aborted signal short-circuits (no readFile calls)
  - abort mid-poll wakes the helper from its setTimeout (no
    multi-hundred-ms hang waiting out a poll interval that no longer
    matters)

198 → 201 passing runtime tests, 0 regressions.

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-05-29 05:43:37 +00:00
lefarcen
bf7152dbdc
fix(web): disable Draw direct-send during an active run, keep Queue (#3270)
Reinstates the Studio tool hardening from #3081 on top of current main:
while a task is streaming, the Draw/annotation primary Send action and its
Enter shortcut are disabled, so an annotation can no longer leak into the
active run while the button shows a disabled reason.

This is the synthesis of two stacked-merge-divergent changes rather than a
wholesale revert: Queue stays available, so the value from #1961 (kami) is
preserved — an annotation made during a run is still staged for the next
turn instead of being dropped. Only the button/Enter availability changes;
the downstream queue/streaming-staging handler in ChatComposer is untouched.

- PreviewDrawOverlay: send('send') and canSend now respect sendDisabled.
- Reframed the streaming Draw test to assert Send is disabled while Queue
  still emits a queued annotation (preserving the "annotate during a run"
  coverage).
- Added unit coverage for the Enter/Send guard and Queue availability while
  a task is running.
2026-05-29 05:28:18 +00:00
Hashem Aldhaheri
bbf4809a7e
fix(web): use surface-appropriate noun in plugin/template preview unavailable copy (#3229)
After #2840 wired plugin and design-template 404s into the same
"no shipped preview" placeholder the skills tab uses, the placeholder
copy still hard-coded "skill" — so users opening a Community/Plugins
card whose manifest declares a preview entry that doesn't ship saw
"No shipped preview for this skill." on a card that is clearly not a
skill.

Adds a noun discriminator to PreviewView.unavailable so the placeholder
reads with the right word per surface — "this skill" on the Skills
tab, "this plugin" on Community/Plugins, "this template" on deck-mode
design-templates. Locales gain three new preview.noun* strings (with
appropriate per-language demonstrative+article) and the existing
unavailable title/body interpolate a {noun} placeholder.

Also fixes a CSS gap in .ds-modal-unavailable surfaced by the same
path: the title and body divs were collapsing onto a single line under
.ds-modal-empty's default flex-row. Mirrors the existing
.ds-modal-error column+gap layout.

Refs #897, #2840.
2026-05-29 03:23:18 +00:00
初晨
9c6a69490b
fix(web): localize mention picker copy (#3255) 2026-05-29 03:19:14 +00:00
Yuhao Chen
4a0900ca81
fix(web): remove passive video play badge (#3252) 2026-05-29 03:17:57 +00:00
lefarcen
08c350fb0f
fix(analytics): bucket feedback agent/model directly on the event (#3240)
* fix(analytics): bucket feedback agent/model directly on the event

Reason × agent / reason × model splits on
`assistant_feedback_reason_submit` were 25-74% `unknown` because the
event only carried `run_id` — analyses had to join back to
`run_created/run_finished`, which loses rows whenever the feedback is
given to a message whose run sits outside the query window (the common
case for feedback on older messages), and whose `model_id` was `null`
to begin with (the user didn't pick a specific model — went with the
agent's default).

Carry `agent_provider_id` and `model_id` directly on every feedback
event so the analyses no longer need to join. Replace `null/unknown`
with the `default` bucket via `modelIdForTracking` (and let
`agentIdToTracking` fall through to `other`) at every emit site —
`null` was an analyst-hostile mix of "no selection" and "join failed";
`default` is a real, analysable bucket. On `run_finished`, upgrade the
model to the agent-reported value from initializing/model status
events when the user did not pick one — covers ACP, claude-stream,
copilot-stream, json-event-stream, qoder, pi-rpc.

* fix(analytics): use feedbackAgentProviderIdToTracking and assistantFeedbackModelId for feedback events

Wire API-mode agent ids (anthropic-api → anthropic) and agentName-parsed
model ids through the feedback emit path. Previously the feedback props used
agentIdToTracking (no anthropic-api case) and assistantModelDetail (no
agentName fallback), causing model_id='default' and agent_provider_id='other'
for API-mode agents.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(analytics): extend feedback/run schema for full agent/model coverage

Layered on top of the conflict resolution and the v1 emit switchover
in 0c1b30440. Three things the prior commits did not cover:

1) The v2 `assistant_feedback_*` family (page='studio') shares
   `AssistantFeedbackBase`. Add `agent_provider_id` + `model_id` once on
   the base so all four derived emits (reason_view, click, reason_click,
   reason_submit) carry the same context as the v1 family, instead of
   leaving the v2 dashboard with the same `unknown` gap the v1 PR was
   trying to close.

2) Tighten `FeedbackSubmitResultProps.model_id` and
   `feedbackAgentProviderIdToTracking` from `string | null` /
   `TrackingFeedbackProviderId | null` to non-null. The web emit paths
   already bucket null/empty through `modelIdForTracking` and the
   `?? 'other'` fallback; collapsing that at the helper / contract
   layer means `null` becomes a TS error at every new emit site, so we
   can't regress the unknown bucket again in a future event.

3) Comment on `run_finished.model_id` so reviewers reading
   `finishedModelId` see why the agent-reported value upgrades the
   request-side one.

* fix(analytics): continue event scan past usage to find agent-reported model

The reverse scan for agentReportedModel was broken: the loop broke on
the first usage event (terminal) before ever reaching the status:initializing
or status:model event (emitted at run start, lower index). This meant
run_finished.model_id always fell through to modelIdForTracking(null) =
'default' for any run that reported usage tokens.

Fix: track haveUsageTokens as a flag and defer the break until both usage
tokens are found and either the model is not needed (user picked one) or
the agent-reported model has been captured. Extract the logic into
scanRunEventsForFinishedProps for unit testability.

Tests: six new cases in run-lifecycle-analytics.test.ts cover the
initializing→usage append order, ACP status:model, detail field fallback,
early exit when reqBodyModel is set, no-status event, and empty events.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(analytics): guard usage block with !haveUsageTokens to prevent early events overwriting terminal tokens

In the reverse-scan loop of scanRunEventsForFinishedProps, the usage block
lacked a !haveUsageTokens guard. When needAgentModel is true and the
agentReportedModel lives at the start of the run (lower index), the loop
walks all the way back past multiple usage events (one per step/turn in
multi-step runs), overwriting inputTokens/outputTokens on each pass. The
surviving values were those of the earliest step, not the terminal total.

Adding !haveUsageTokens to the usage block condition ensures only the first
(terminal) usage event seen in reverse sets the token counts; subsequent
earlier usage events are skipped while the scan continues for agentReportedModel.

Adds a test case for initializing(model) → usage(step1) → usage(terminal)
asserting both terminal token counts and agentReportedModel.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)
2026-05-29 03:06:06 +00:00
Nicholas-Xiong
45873a551b
fix: improve queue screenshot preview modal backdrop (#3215)
Increase the backdrop opacity from 44% to 75% and add a blur effect
to better separate the queue screenshot preview from the file-list
preview panel underneath. This prevents visual overlap and makes it
clearer which preview surface is active.

Fixes #3167
2026-05-29 03:04:28 +00:00
初晨
ef8f518b3b
Fix status detail URL parsing (#3208) 2026-05-29 03:03:46 +00:00
Nicholas-Xiong
98651ecae2
fix: localize queue UI strings in Chinese mode (#3213)
- Queued → 已排队
- to Send → 待发送
- Edit queued task → 编辑排队任务
- Save → 保存
- Cancel → 取消
- Edit → 编辑
- more queued → 个排队
- Queued follow-up → 已排队的后续任务

Fixes #3173
2026-05-29 03:03:27 +00:00
hahalolo
afc5f52445
Fix/issue/3149 (#3162)
* fix(docker): fix container startup crash due to missing OD_API_TOKEN

* fix(docker): forward OD_API_TOKEN to fix docker container boot loop

* fix(docker): enforce non-empty OD_API_TOKEN for docker-compose

* fix(deploy): automate OD_API_TOKEN generation in installer and close compose loop

* docs(readme): guide manual deployment users to configure OD_API_TOKEN

* docs(readme): align working directory paths for manual deployment instructions

* docs(readme): align working directory paths for manual deployment instructions

* docs(readme): restore git clone context for first-time users

* fix(web): add min-width constraints to plugin filter span and pill button

related issue 3149
2026-05-29 03:03:03 +00:00
kami
1efa1dc7b5
Add preview iframe keep-alive pool (#2190)
* Add preview iframe keep-alive pool

* Fix active preview eviction on prompt context changes

* Evict preview iframes on skill/design-system registry edits

Bridge Settings → Skills / Design Systems to App.tsx so the keep-alive
pool drops any preview iframe whose project depends on the affected id
after every successful mutation. Without this, body-only edits leave
SkillSummary / DesignSystemSummary fields untouched and ProjectView's
signature-driven eviction never fires, so the active preview keeps
serving stale prompt context. The handler also re-fetches the App
shell's skill / design-system lists so summary-field changes propagate
to ProjectView's signature on the next render.

Also extend IframeKeepAlivePool.evictMatching with an includeActive
option so the new handler can drop the currently-visible iframe along
with parked ones; the fallback pool only ever holds active entries so
includeActive is a no-op there.

Regression tests:
- App.previewKeepAlive: clicking a Settings stub that fires
  onSkillsChanged / onDesignSystemsChanged drives evictMatching with
  includeActive=true and a predicate that matches projects using the
  affected id while skipping unrelated projects.
- SkillsSection: onSkillsChanged fires after a body-only edit and
  after a delete.

* fix: reattach active keep-alive iframe after eviction

* fix(web): refresh design systems after rename

---------

Co-authored-by: kami.c <kami.c@chative.com>
2026-05-29 03:01:17 +00:00
chaoxiaoche
831208b823
Refine Studio preview interactions (#3000)
* Refine studio preview interactions

* Fix deck toolbar navigation for transform tracks

* Fix manual edit preview close

* Fix Simple Deck toolbar scrolling

* Fix preview screenshot capture

* Fix deck preview progress sync

* Refine edit target selection for grouped elements (#3068)

* Prefer child edit targets over grouped parents

* Keep edit inspector header and footer fixed

* Shorten floating edit inspector

* Show readable edit target names

* Allow dragging the floating edit inspector

* Add explicit edit inspector actions

* Show preview comment count in toolbar

* Separate annotation and comment toolbar groups

* Remove annotation toolbar divider

* Close edit inspector from footer actions

* Hide edit inspector until target hover

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>

* Fix manual edit iframe regression test

* Fix Studio interaction review feedback

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)

* Fix saved comment link classification

Generated-By: looper 0.9.2 (runner=fixer, agent=codex)

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Siri-Ray <2667192167@qq.com>
2026-05-28 12:52:37 +00:00