mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
69 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9146dc1c57
|
fix(web): persist design files view state across navigation (#2303)
* fix(web): persist design-files view state across navigation
pageSize, sortKey, sortDir, and kindFilter reset on every navigation
because DesignFilesPanel remounts via key={projectId}. Persist them to
localStorage under od:design-files:view-state:v1:<projectId> so each
project's view prefs survive tab-switching.
- Read persisted state via lazy useState initializers (SSR-safe try/catch)
- Write back in a single useEffect keyed on all four values
- Scoped per-project so proj-a settings never bleed into proj-b
- Schema-guarded: invalid/missing fields fall through to defaults
- Red spec: apps/web/tests/components/DesignFilesPanel.view-state-persist.test.tsx
* fix(web): address review feedback on view-state persistence
- Add typeof window guard in readViewState for explicit SSR safety
- Consolidate 4 separate localStorage reads into a single useRef read at
mount time; each lazy useState initializer now reads from savedViewState.current
instead of re-parsing localStorage independently
* fix(web): harden design-files view-state persistence
- Validate restored kindFilter values against the current ProjectFileKind
union via isProjectFileKind() so stale stored values from a prior schema
are dropped silently instead of being cast unchecked.
- Introduce DEFAULT_SORT_KEY/SORT_DIR/PAGE_SIZE constants so the useState
initialisers and the new validation guard share a single source of truth.
- Add viewStateHasMounted ref to skip the first-render write in the persist
useEffect. Without this guard every project the user visits accumulates a
default-value entry in localStorage on mount, growing stale-key garbage
unboundedly and making future field additions silently inject defaults into
every existing entry.
- Harden kindFilter test: replace the silent early-return-on-missing-trigger
with expect(filterTrigger).not.toBeNull() so a render failure surfaces as
a real test failure rather than a passing no-op.
* test(e2e): design files view state persists across navigation and reload
Adds a Playwright UI smoke test in e2e/ui/ that exercises the three key
guarantees of the view-state persistence fix:
(a) Tab-away / tab-back: navigating to a file tab and returning remounts
DesignFilesPanel (conditionally rendered); all four prefs (sortKey,
sortDir, pageSize, kindFilter) are restored from localStorage.
(b) Hard reload: localStorage survives page.reload(); prefs are intact on
the next mount.
(c) Per-project key isolation: a second project starts with defaults and
does not inherit values from the first project's localStorage entry.
The test uses OD_PORT=18011 / OD_WEB_PORT=18012 to avoid port conflicts with
the default development ports.
Also fixes a race in DesignFilesPanel: the stale-kind cleanup useEffect was
running against an empty availableKinds set before the async file list arrived
on mount, which cleared a kindFilter correctly restored from localStorage.
Guard added: skip the cleanup when availableKinds is empty.
Red on origin/main (no persistence logic exists there); green on this branch.
* fix(e2e): address code-reviewer feedback on view-state-persist test
- Add data-testid='df-page-size-select' to per-page <select> in
DesignFilesPanel (W2: decouple test from i18n string 'Show')
- Add StrictMode comment to viewStateHasMounted guard explaining
the dev-mode double-write behaviour (W1: document the invariant)
- Switch nav-away from dblclick to single-click + Open button,
matching the pattern used in app-design-files.test.ts (W4)
- Raise timeout from 60s to 90s for cold CI runners (W3)
- Unify seedTextFile/seedPngFile into shared seedFile helper (N3)
- Add home-hero-input assertion in gotoEntryHome (N2)
- Switch waitForPageSizeSelect to use data-testid (W2)
* test(e2e): split design-files persist into nav, reload, and per-project scenarios
* fix(web): tighten isPageSize to discrete option set, add invalid-value regression test
* fix(web): isolate DesignFilesPanel.test.tsx from persisted view-state key
|
||
|
|
1c2a1c4459
|
Add launch review regression coverage and stabilize daemon tests (#3207)
* Add launch review E2E regression coverage * Harden daemon launch review regressions * Stabilize daemon runtime tests * fix(tests): restore e2e preflight typing Generated-By: looper 0.8.1 (runner=fixer, agent=codex) * fix(tests): make fake plugin runtime ESM-safe Generated-By: looper 0.8.1 (runner=fixer, agent=codex) * Stabilize e2e fake agent and regression tests * fix(tests): repair fake agent cjs runtime Generated-By: looper 0.8.1 (runner=fixer, agent=codex) * fix(review): harden plugin authoring checks Generated-By: looper 0.9.2 (runner=fixer, agent=codex) * fix(tests): bind plugin authoring run to seeded conversation Generated-By: looper 0.9.2 (runner=fixer, agent=codex) |
||
|
|
4abc08bb17
|
fix(web): restore changes silently dropped by PR #2461 sync merge (#3210)
* fix(web): remove Ingest source panel from Automations tab (#2711) * fix(web): remove Ingest source panel from Automations tab The Automations tab carried a free-form "Ingest source" composer that let users paste arbitrary content (URL, repo path, connector event, chat snippet) and turn it into a source packet plus evolution proposals. The form was confusing next to the routine/template flow on the same page, exposed an internal canonicalization concept users don't need to think about, and shipped before the surrounding evolution-proposal flow was wired into a coherent end-to-end story. Drop the UI surface only: - Remove the <section className="automations-ingest"> block, the Template / Source / Compression / Connector selects, the title/source ref/content fields, the recent-packets list, and the Ingest button. - Drop the now-dead local state (sourcePackets / sourceForm / ingestingSource), the patchSourceForm and submitSourceIngestion helpers, the SOURCE_KIND_OPTIONS / COMPRESSION_OPTIONS constants, the SourceIngestionForm type and DEFAULT_SOURCE_FORM, the /api/automation-source-packets refresh leg, and the sourcePackets side-write inside crystallizeRun. - Remove the matching .automations-ingest / .automation-ingest-* CSS block (plus the two responsive overrides) from tasks.css. - Delete the test case that drove the form in TasksView.templates.test. Backend stays intact: apps/daemon/src/automation-ingestions.ts, the POST /api/automation-ingestions route, `od automation ingest` CLI, the routine-evolution call site, and the AutomationContentPacket / AutomationSourceKind / AutomationTokenCompressionMode contracts all remain, since routine scheduling still depends on them. * fix(web): drop crystallize test assertion on removed packet list The crystallize test was asserting that the new content packet's title shows up on the page. That assertion only passed because the daemon response was being side-written into the deleted sourcePackets state and rendered in the Ingest source recent-packets strip. With that UI removed, the packet title has no surface to land on; the proposal title (`Skill: Artifact polish loop run`) is still asserted and remains the real signal that crystallize succeeded. * test(e2e): restore #2305 / #2578 e2e regressions lost in PR #2461 merge Sync merge |
||
|
|
9a4816b101
|
feat(plugins): site-wide plugin detail pages, share-to-site links, landing deploy trigger (#2999)
* feat(plugins): site-wide plugin detail pages, share-to-site links, landing deploy trigger Why: a merged plugin PR didn't redeploy the landing site (plugins/** was missing from the deploy paths), and the desktop Share menu copied a local/404 link instead of the public marketplace URL. The landing plugin routing left by the detail-page rework also 404'd: the locale listing's cards used a multi-segment href while detail pages were single-segment, and only 388 bundled _official plugins had pages. What changed: - Deploy: landing-page deploy/ci trigger on plugins/**, and skip the slow previews step on an exact cache hit (cache key aligned across both workflows so a PR-built cache is reused by main). - Share URL: packages/contracts/plugin-url.ts owns the single-segment plugin URL scheme; the web Share menu and the landing site both derive links from it. Web links now point at https://open-design.ai/plugins/<slug>/. - Full detail coverage: detail pages now cover all 403 local plugins (_official incl. atoms + community), each rendered from its local manifest. Fixes the locale-listing 404s and the community manifest-name/catalog-id (- vs /) mismatch. - Self-host: daemon exposes OD_SITE_ORIGIN via /api/app-config; web falls back to the canonical origin until the daemon answers. Validation: pnpm guard, pnpm typecheck (all packages), contracts + web tests green, and a full build E2E confirming all 403 catalog ids and locale-listing cards resolve to built detail pages (0 missing). * chore: retrigger CI * ci(landing): carry plugins/** trigger + previews cache-hit into #2994 split workflows Merged origin/main, which split landing deploy into staging + manual production (#2994). git auto-migrated my landing-page-deploy.yml changes into landing-page-staging.yml via rename detection (plugins/** path, fallback-preview-card.ts cache key, cache-hit skip all carried). The new manual landing-page-production.yml didn't have them, so add the previews cache-key alignment + cache-hit skip there too (plugins/** path is N/A — production is workflow_dispatch only). * fix(ci): wrangler-action uses pnpm so it tolerates landing's workspace dep This PR added @open-design/contracts (workspace:*) to apps/landing-page/package.json so the landing site can share the plugin-url slug rules. But the landing deploy/preview steps run cloudflare/wrangler-action with packageManager: npm in workingDirectory apps/landing-page, and 'npm i wrangler' chokes on the workspace: protocol (EUNSUPPORTEDPROTOCOL), failing 'Validate landing page'. Switch all three landing wrangler-action steps (staging / ci preview / production) to packageManager: pnpm, which is workspace-aware. * test(e2e): bundled plugins now offer the README badge After this branch, buildPluginShareUrl returns a public open-design.ai link for bundled plugins (not just official-marketplace ones), so the home-starter share menu now shows 'Copy README badge'. Update the assertion from toHaveCount(0) to toBeVisible(). * fix(landing): drop @open-design/contracts dep, use a landing-local slug helper Per review on #2999: the marketing site must not import @open-design/contracts (AGENTS.md boundary — it's the web/daemon product-runtime contract layer). Move the slug/path helpers into landing-local app/_lib/plugin-slug.ts; the web client keeps contracts' plugin-url. The two derive the same scheme and are verified in lockstep by the e2e route check (403 share URLs -> 403 detail pages, 0 missing). landing no longer has a workspace dep, so revert the wrangler-action packageManager back to npm. * fix(landing): include plugins/_official in previews cache key Per review on #2999: generate-previews.ts builds bundled-plugin preview jobs from plugins/_official/**/open-design.json and renders fallback cards from manifest fields (title/description/mode/scenario/tags). With plugins/** now triggering the workflow but the cache key not hashing plugin inputs, a plugin-only PR/merge could exact-hit an old cache and skip the preview regen, shipping with a stale or missing /previews/plugins/<manifest-id>.png. Add plugins/_official/** to the cache key in all three landing workflows (ci, staging, production). community is not currently covered by generate-previews so its glob is omitted. * fix(plugins): include community marketplace installs in share gate hasPublicPage now covers sourceMarketplaceId === 'community' so the README badge and public detail link surface for community installs. Community manifest names carry a community- prefix that diverges from the landing-page route slug, so URL derivation uses sourceMarketplaceEntryName (community/<folder>) instead — pluginDetailSlug takes the last segment, matching the /plugins/<folder>/ route the landing page emits. Adds component tests for buildPluginShareUrl, badge copy, and the Open-in-marketplace link for a community/registry-starter record. Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code) --------- Co-authored-by: mrcfps <mrc@powerformer.com> |
||
|
|
df8a0faff6
|
feat(runtimes): register AMR (vela) as an ACP stdio agent (#2355)
* feat(runtimes): register AMR (vela) as an ACP stdio agent
AMR is the vela CLI's ACP runtime mode. `vela agent run --runtime opencode`
speaks ACP JSON-RPC over stdio (see vela's
`specs/current/runtime/manual-agent-run-openrouter.md`); per
`docs/new-agent-runtime-acp.md` we expose it through the same `streamFormat:
'acp-json-rpc'` transport that already powers Hermes, Devin, Kimi, etc.
The new `defs/amr.ts` is the entire wiring — `buildArgs` returns
`['agent', 'run', '--runtime', 'opencode']`, `fetchModels` reuses
`detectAcpModels`, and the fallback list seeds the OpenRouter ids vela's
e2e baseline uses. `executables.ts`/`app-config.ts`/`metadata.ts` get the
matching `VELA_BIN`/`VELA_LINK_URL`/`VELA_RUNTIME_KEY`/`VELA_OPENCODE_BIN`
allowlist + install/docs URLs, so users can configure the per-agent env in
Settings without leaking into other adapters.
Coverage: `tests/fixtures/fake-vela.mjs` is a minimal ACP stub that returns
the documented `initialize` / `session/new` / `session/set_model` /
`session/prompt` shapes; `tests/amr-acp-integration.test.ts` spawns it via
`child_process.spawn` and drives a full turn through `attachAcpSession` and
`detectAcpModels`, so the ACP transport contract for AMR is end-to-end
verified locally even before a real `vela` binary is installed.
Validated:
- pnpm guard
- pnpm typecheck (all workspace projects)
- pnpm --filter @open-design/daemon test (2881/2881)
Deferred: real OpenRouter-backed turn through a built `vela` binary —
the runtime def needs no changes for that path, only `VELA_RUNTIME_KEY`
and `VELA_LINK_URL` in env (or Settings).
* fix(runtimes/amr): pin a concrete default model and bare openai ids
End-to-end validation against a freshly-built `vela` (nexu-io/vela@main)
+ OpenRouter surfaced two contract details the first AMR runtime def
got wrong:
1. vela rejects `session/prompt` with `session/set_model must be called
before session/prompt`. attachAcpSession in apps/daemon/src/acp.ts
skips set_model whenever the picked model is the synthetic 'default'
id, so AMR's fallback list must NOT include DEFAULT_MODEL_OPTION. The
def now ships a concrete `gpt-5.4-mini` as both `fetchModels`'
default option and `fallbackModels[0]`, which makes attachAcpSession
always send a real `session/set_model` for AMR turns.
2. `vela --runtime opencode` auto-prepends `openai/` to whatever modelId
it forwards to opencode's openai provider. With OpenRouter-style ids
like `openai/gpt-5.4-mini`, opencode receives the double-prefixed
`openai/openai/gpt-5.4-mini` and replies `ProviderModelNotFoundError`.
The new fallback list ships the bare ids opencode's openai registry
actually knows about (gpt-5.4, gpt-5.4-mini, gpt-5.4-fast, etc.).
Stub + tests:
- tests/fixtures/fake-vela.mjs now enforces the set_model gate the same
way real vela does, so a regression that silently goes back to
model: 'default' would surface as a fatal error in tests instead of a
hidden production failure.
- tests/amr-acp-integration.test.ts pins both contracts: no 'default' /
no 'openai/' prefix in fallbackModels, and a negative case that
asserts session/prompt fails when no model is set.
Adds `apps/daemon/scripts/verify-amr-real-vela.mjs` — a small dev-time
runner that drives `attachAcpSession` against a real `vela` binary and
prints the daemon's chat events, so future protocol drift can be checked
against an actual OpenRouter call.
Verified locally: `vela agent run --runtime opencode` + OpenRouter
returns the prompted string ("AMR-E2E-PASS") through the full daemon
pipeline; daemon test suite stays 2883/2883.
* fix(runtimes/amr): substitute concrete model when chat run sends 'default'
A plugin-driven AMR run from the UI surfaced a real-world hole in the
prior commit:
json-rpc id 3: session/set_model must be called before session/prompt
The Default-design-router plugin (and any caller that doesn't pin a
real model) sends `model: 'default'` straight through, which the AMR
runtime def cannot accept — vela rejects `session/prompt` without
`session/set_model` and attachAcpSession skips set_model whenever
model === 'default'. Just leaving DEFAULT_MODEL_OPTION out of the
adapter's `fallbackModels` is not enough: the chat-run handler in
server.ts still forwarded 'default' verbatim.
This adds `resolveModelForAgent(def, resolved, env?)` as the
single source of truth for the substitution:
1. If the caller picked a real id, pass it through.
2. Else, if `def.defaultModelEnvVar` is set and the daemon process
env has a non-empty value for it, return that (operator escape
hatch — see below).
3. Else, if the def's `fallbackModels` does NOT contain a 'default'
id, return `fallbackModels[0].id`.
4. Else, return the original value (the historic shape — defs that
list 'default' themselves are untouched).
AMR sets `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`, so when
opencode's openai-provider registry deprecates `gpt-5.4-mini`
upstream, an operator can swap the fallback id without a code change
by exporting `VELA_DEFAULT_MODEL=gpt-5.5` before launching tools-dev
/ od. Worth noting the env var must live in the daemon's `process.env`
(Settings-UI per-agent env values only reach the spawned child, not
the daemon's resolver) — the new field's docblock spells this out.
Coverage:
- `tests/runtimes/resolve-model.test.ts` — 8 unit tests covering all
four resolver branches plus the env-override happy path / fallback /
ignore-when-user-picked-a-real-id case.
- `pnpm --filter @open-design/daemon typecheck` clean.
* chore(runtimes/amr): move AMR to the top of the base agent list
So `AMR (vela)` shows up first in the agent picker / status views,
ahead of claude / codex. Pure ordering change; no behavior delta.
* feat(amr): Sign-in / Sign-out button on the AMR Settings card
The first half of the AMR work assumed the operator would set
VELA_RUNTIME_KEY / VELA_LINK_URL on the daemon process and never
surfaced login state to users. This adds the missing UX so a fresh
install can drive the full path from Settings:
- GET /api/integrations/vela/status reads ~/.vela/config.json
for the active profile and returns { loggedIn, profile, user }
(without leaking the runtime/control keys themselves).
- POST /api/integrations/vela/login spawns `vela login` once
(409 if one is already in flight). The vela CLI opens the user's
browser to the device-authorization page itself — Open Design
only needs to kick the subprocess off.
- POST /api/integrations/vela/logout removes ~/.vela/config.json
so the next status read returns logged-out.
`AmrAgentCard` is a dedicated agent-card component for AMR because
the existing `<button>` row can't host an interactive sub-control
(nested interactive elements). It polls /status after a login click
until the daemon reports loggedIn=true (or 5 minutes elapse), and
exposes a Sign-out action on hover. Other adapters (claude, codex,
hermes, …) keep their existing `<button>` card.
i18n: 8 new keys (settings.amrLogin / Logout / LoggingIn / etc.)
added to en + zh-CN. Other locales spread `en` and inherit the
English copy until translations land.
Coverage:
- `tests/integrations/vela.test.ts` pins the config.json reader
against a tmp HOME — including the negative case where a profile
has user info but no runtimeKey (still logged-out), and the
secret-leak guard ("rt-secret-*" must not appear in the projection
payload).
- `tests/components/AmrAgentCard.test.tsx` covers all four UI
states (logged-out, logging-in, logged-in, logging-out) plus the
click-propagation invariant the divergent card was built to keep.
`pnpm --filter @open-design/daemon test` 2901 / 2901 passing.
`pnpm --filter @open-design/web test` 1719 / 1719 passing.
`pnpm typecheck` + `pnpm guard` clean.
Dev script side-effects: `apps/daemon/scripts/verify-amr-real-vela.mjs`
no longer requires both VELA_RUNTIME_KEY and VELA_LINK_URL — if
VELA_PROFILE is set, the vela CLI is allowed to resolve credentials
from `~/.vela/config.json`. Added the two AMR `.mjs` fixtures to
`scripts/guard.ts` allowlist with the executable-fixture / dev-runner
rationale.
* fix(connection-test): substitute model for AMR before attachAcpSession
The chat-run path in server.ts already routes the requested model through
`resolveModelForAgent` so AMR / vela (whose CLI demands an explicit
`session/set_model` before `session/prompt`) gets the def's first
concrete fallback id when the chat run ships `model: 'default'`.
`connectionTest.ts` was wiring `attachAcpSession({ ..., model: model ?? null })`
directly, which made the Test Connection button on the AMR Settings
card deadlock with the same `session/set_model must be called before
session/prompt` error the chat-run path already handles — surfaced as a
permanent "Testing connection…" spinner in the UI.
Reuse the same helper here so Test Connection mirrors chat-run behavior.
* test(amr): three-layer end-to-end coverage for the AMR login + turn flow
The PR up to this point shipped runtime + UI code with unit-level Vitest
coverage. This commit adds the cross-layer regression net the live demo
relied on:
1. apps/daemon/tests/integrations/vela.routes.test.ts (HTTP, Vitest)
Spins up the real daemon Express app via `startServer({port:0,...})`,
persists `agentCliEnv.amr.VELA_BIN = <fake>` into app-config.json,
and exercises every /api/integrations/vela/* endpoint against the
extended fake-vela stub:
- status reads ~/.vela/config.json under various states
- login spawns the fake, waits for config.json to appear, returns
pid + startedAt + profile
- 409 already-running guard with the stub's delay knob
- logout removes the file (idempotent)
- secrets (runtimeKey / controlKey) never leak in the projection
- login → status round-trip flips loggedIn=false → true
2. e2e/tests/amr/turn.test.ts (tools-dev orchestrated, Vitest)
Boots a namespaced daemon + web pair through `createSmokeSuite`,
inlines a self-contained fake `vela` binary that handles BOTH
`vela login` (writes ~/.vela/config.json) and
`vela agent run --runtime opencode` (ACP stdio with the
`session/set_model must precede session/prompt` gate the real binary
enforces), then drives a complete /api/runs lifecycle for
`agentId: 'amr', model: 'default'` and asserts the assistant message
captures the fake's streamed text. This is the test that would have
surfaced today's plugin-default-model regression (the `set_model
before prompt` error) at PR time instead of demo time.
3. e2e/ui/amr-login-pill.test.ts (Playwright)
Mocks /api/agents + /api/integrations/vela/{status,login,logout}
to drive the Settings AMR card through the full Sign in → Signed in
→ Sign out cycle. Pins the AmrLoginPill polling contract and the
aria-label semantics (the pill's accessible name is "Sign out" once
logged in, regardless of which label the hover-state text shows).
fake-vela.mjs extensions:
- Handles `vela login` argv by writing
~/.vela/config.json for the active VELA_PROFILE and exiting 0 —
mirrors real vela's on-disk side-effect without the device-auth
loop.
- FAKE_VELA_LOGIN_DELAY_MS knob so route tests can observe the
in-flight state of the spawn lifecycle.
- FAKE_VELA_LOGIN_USER_EMAIL / _USER_PLAN to assert the surfaced
user fields end-to-end.
Validated:
- `pnpm guard` + `pnpm typecheck` (all workspace projects)
- `pnpm --filter @open-design/daemon test`: 2998 / 2998 passing,
including the new 8-test integration suite.
- `cd e2e && pnpm test tests/amr`: 1 / 1 passing.
- `cd e2e && pnpm exec playwright test ui/amr-login-pill.test.ts`:
1 / 1 passing (6.7s).
* feat(amr): package native cli and refine login ui
* feat(amr): wire vela cli beta packaging
* docs(amr): document vela ci packaging review
* docs(amr): refine vela ci integration review
* fix(ci): refresh nix pnpm dependency hashes
* fix(pack): clean up Vela CLI packaging
* fix(pack): bundle Vela CLI support files
* fix(amr): recover login attempts from stale auth state
* test: expand AMR and automations coverage
* fix(amr): address review follow-ups
* test(web): align tasks fixtures with contracts
* fix(daemon): type wildcard route params
* fix(ci): refresh PR merge validation
* fix(amr): clear env credentials on logout
* feat(settings): inline local CLI model configuration
* fix(amr): recognize daemon env credentials
* [codex] Fix Vela companion packaging (#2979)
* Fix Vela companion packaging
* Update Nix pnpm dependency hashes
* [codex] Surface AMR account failures (#2980)
* fix: surface AMR account failures
* fix: cover AMR recovery error guidance
* chore: bump beta base version to 0.8.1 (#2990)
* Fix AMR profile and packaged runtime review issues
* Detect packaged AMR OpenCode companion tree
* feat(web): polish AMR frontend flows
* Polish AMR onboarding card
* fix: read AMR login state from dot-amr config (#3048)
* test: tighten AMR credential and packaging coverage
* test: restore AMR executable test env helper
* [codex] Fix packaged mac Dock identity and AMR label (#3076)
* Fix packaged mac sidecar Dock identity
* Rename AMR assistant label
* Fix AMR live models and dot-amr login state (#3073)
* fix: read AMR login state from dot-amr config
* fix: load live AMR models before runs
* fix: point AMR onboarding link to production wallet
* fix: address AMR model review feedback
* fix: persist live AMR model fallback
* [codex] Fix AMR link catalog model ids (#3088)
* Fix packaged mac sidecar Dock identity
* Rename AMR assistant label
* Fix AMR link catalog model ids
* Fix AMR model normalization typecheck
* Use live AMR model for default runs
* fix: polish AMR runtime settings UI
* Accelerate AMR startup defaults (#3092)
* Surface AMR insufficient balance wallet URL (#3099)
* fix(web): polish onboarding controls (#3112)
* fix(web): show CLI scan loading state
* Avoid duplicate AMR wallet recharge links (#3117)
* Avoid duplicate AMR wallet recharge links
* Use Vela CLI 0.0.3 test package
* chore(nix): refresh pnpm deps hash
* Fix AMR wallet guidance display
---------
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
* chore(pack): pin Vela CLI 0.0.3-test.1 (#3127)
* chore(nix): refresh pnpm deps hash
* chore(pack): pin Vela CLI 0.0.3
* chore(nix): refresh pnpm deps hash
* fix(web): suppress AMR exit 130 fallback (#3136)
* feat(web): nudge users to hosted AMR on model/auth/quota failures (#3083)
* feat(web): nudge users to hosted AMR on model/auth/quota failures
When a non-AMR agent run fails with an auth / quota / upstream model
error, surface an inline nudge under the error pill linking to Open
Design's hosted AMR gateway (https://open-design.ai/amr). The nudge
fires `surface_view` (element=run_failed_toast) on impression and
`ui_click` (element=go_amr) on the link.
Also teach the daemon to classify CLI-agent auth/quota/upstream failures
(Claude Code, codex, ...) into specific API error codes
(AGENT_AUTH_REQUIRED / RATE_LIMITED / UPSTREAM_UNAVAILABLE) instead of
the generic AGENT_EXECUTION_FAILED, so both the error message and the
nudge key off accurate codes. AMR's own runs are excluded from the
nudge — they keep the dedicated sign-in / recharge affordances.
* feat(web): rework failed-run AMR guidance into per-case error UI
Replace the single inline nudge with a per-case failed-run experience
driven by the run's error code + agent:
- The error card is now neutral gray (was red) and always carries a
retry button; it is driven by the persisted per-message error event so
it survives a reload.
- Non-AMR agent hitting a model/auth/quota wall: a theme-color promotion
card under the error card offers "switch to AMR & retry" — switches the
run to AMR, opens Settings on the AMR card, and auto-retries once the
account signs in (ProjectView polls vela login status, independent of
the Settings pill lifecycle, with success / 5-min-timeout / unmount
exits).
- AMR agent unauthorized: clearer copy + an "authorize & retry" button.
- AMR agent out of balance: clearer copy + a "top up" button to the AMR
wallet, with manual retry.
- Settings AMR card: when opened from the nudge, it scrolls into view and
pulses, and an authorize-button coachmark (a fake hand cursor that
rises in and dismisses on hover) points at the sign-in control when not
yet authorized.
analytics: surface_view (run_failed_toast) on the promotion card and
ui_click (go_amr) on its action are retained. i18n adds chat.amrCard.*
and chat.amrError.* (en / zh-CN / zh-TW translated; other locales fall
back to en) and drops the old chat.amrErrorGuidance keys.
* fix(daemon): require status context for numeric service-failure codes
Per review on #3083: the model-service classifier matched bare HTTP
status numbers (`500`, `502`, `429`, `401`), so ordinary CLI output like
`line 500`, `read 502 bytes`, or `exit code 401` could be misclassified
as a provider outage / auth wall and wrongly surface the AMR nudge. Now
a status number only counts when it carries explicit context (`HTTP 500`,
`status 503`, `code: 401`, `502 Bad Gateway`); textual provider phrases
(overloaded, bad gateway, service unavailable, rate limit, …) are
unchanged. Adds fixtures proving unrelated numeric output stays null.
* fix(web): keep error pill for failed runs ChatPane's card doesn't cover
Per review on #3083: the per-message gray error pill was suppressed for
every persisted error status event, but ChatPane only renders the
replacement top-level error card for `retryableAssistantMessage` (the
last failed assistant). So a failed turn that is no longer last (after a
follow-up) or an older failed run in history showed neither the pill nor
the card — its error detail vanished, undercutting reload/history
survival. ChatPane now passes `errorCardOwnerId` (the assistant id whose
error the card represents); AssistantMessage suppresses only that one
pill and keeps rendering StatusPill for all other error events.
* fix(daemon): don't treat a process exit code as an HTTP status
Follow-up to review on #3083: the status-context helper accepted a bare
`code` prefix, so `exit code 401` / `process exited with code 429` still
matched and got classified as AGENT_AUTH_REQUIRED / RATE_LIMITED (the
very `exit code 401` case the comment calls out as noise). `code` now
only counts when qualified (`status code` / `error code` / `response
code`) or punctuation-bound (`code: 401`); bare `exit code N` no longer
matches. Adds fixtures for exit-code lines returning null.
* chore(web): translate AMR card / error keys for 16 remaining locales
PR #3083 added 10 new `chat.amrCard.*` / `chat.amrError.*` keys but only
provided en/zh-CN/zh-TW translations; the other 16 locales fell back to
English. Translate the card title/body, three chips, primary CTA, and
the AMR self-error (auth / balance) messages and buttons for ar, de,
es-ES, fa, fr, hu, id, it, ja, ko, pl, pt-BR, ru, th, tr, uk.
* fix(amr): address review feedback on #2355
Targeted fixes for the unresolved review threads on #2355. Each fix
includes / updates a focused test.
- runtimes/executables.ts: `packagedVelaOpenCodeCompanionTree` now
verifies the inner `opencode` executable exists + is runnable, not
just the directory. This closes the false-positive availability path
that let `detectAgents()` surface AMR as available even when the
packaged companion was empty / partially copied (mrcfps, 4 threads).
- runtimes/executables.ts: `resolveAmrOpenCodeExecutable` now prefers
the bundled `<OD_RESOURCE_ROOT>/bin/libexec/opencode/opencode` over a
stale `opencode` on the user's PATH, so packaged AMR builds can't be
hijacked by a global installation.
- web/EntryShell.tsx: when the Local CLI scan returns an available
agent and the previously-selected agent is AMR, switch the selection
to the first available local agent so the runtime and persisted
agent agree before Continue.
- server.ts (model-probe branch): for AMR, check `readVelaLoginStatus`
BEFORE rejecting on an empty live-model catalog — a signed-out user
was getting `AMR_MODEL_UNAVAILABLE` ("choose a model") instead of
the correct `AMR_AUTH_REQUIRED` (sign-in affordance).
- server.ts (default model fallback): if the user asked for the AMR
agent default and the cached id is no longer in the FRESH catalog,
fall back to `liveModels[0]` from the probe instead of rejecting the
run as `AMR_MODEL_UNAVAILABLE`.
- integrations/vela.ts: route `vela login` through
`createCommandInvocation` so an npm/Node-style `vela.cmd` / `.bat`
shim on Windows gets the correct `cmd.exe /d /s /c …` wrapping with
verbatim args (matches `execAgentFile` / chat-run spawning).
- tools/pack/src/linux.ts: in containerized Linux builds, bind-mount
the host directory of `OPEN_DESIGN_VELA_CLI_BIN` and rewrite the env
to the container-side path. The host path was being passed in as-is
even though the default container only mounts /project, /tools-pack
and cache/home — `copyOptionalVelaCliBinary` saw a missing path.
Deferred (out of scope for this PR):
- `od amr status/login/logout/cancel` CLI subcommands (AGENTS.md
UI/CLI dual-track rule, server.ts:5763) — sizable surface; tracked
for a separate focused PR.
- Strict `--require-vela-cli` for Windows + mac-x64 beta builds:
prematurely blocked — `@powerformer/vela-cli` only publishes the
`darwin-arm64` platform binary today; adding the flag elsewhere
would fail the builds. Revisit once win/x64/linux binaries ship.
* fix(amr): hoist sendAmrAccountFailure above the AMR catalog preflight (TDZ)
The new signed-out AMR branch in the catalog preflight at server.ts:10875
calls `sendAmrAccountFailure(...)` to emit AMR_AUTH_REQUIRED, but the
const declaration sat ~100 lines below at the outer function scope. Because
`const` is TDZ-aware, that branch would have thrown `ReferenceError:
Cannot access 'sendAmrAccountFailure' before initialization` for the
exact users it tries to help — defeating the original intent.
Hoist the helper to just above the AMR preflight block so it's available
to every AMR code path in this function. Behavior elsewhere is unchanged.
Also rerun the daemon test suite: `launch.test.ts > resolveAgentLaunch
uses packaged built-in Vela for AMR` was creating the
`<resourceRoot>/bin/libexec/opencode/` companion *directory* only, but
this PR's earlier tightening of `packagedVelaOpenCodeCompanionTree`
also requires the inner `opencode` executable. Add it to that fixture
to match the new contract; the test was a sibling of the executables /
env-and-detection fixtures already updated in
|
||
|
|
e40947ac0d
|
Add social sharing for template previews (#2924)
* Add template social sharing menu * Update plugin share e2e expectations * Add additional template social share targets * Remove Bilibili template share target * Open social share destinations in new tabs * Address template share review feedback * Use canonical public plugin share URLs * Gate public plugin share links by marketplace provenance * Update plugin share e2e for local-only badges * Limit public share URLs to official marketplace |
||
|
|
5563e7eca6
|
test: expand home entry and html preview coverage (#2992)
* test: cover entry topbar and hero flows * test: expand entry and html preview coverage * test: isolate mocked github stars in home entry e2e Generated-By: looper 0.8.1 (runner=fixer, agent=codex) * chore: retrigger CI for PR 2992 |
||
|
|
fce444bcab
|
Consolidate chat comments preview on main (#2906)
* feat(web): queue chat sends * feat(web): render code comment directives * feat(web): add preview comments and manual edits * fix(web): polish shared chrome controls * fix(web): align queued send loading state * feat(web): open primary project artifacts * fix(web): keep queued sends and tests aligned * fix(web): restore docked comment tools layout * fix(web): align preview comment toolbar * fix(web): place local cli beside handoff * fix(web): move agent menu beside handoff * fix(web): make project instructions a direct header action * fix(web): compact handoff and toolbar labels * fix(web): clarify handoff menu and annotation label * fix(web): restore compact cursor handoff trigger * fix(web): align agent menu trigger with handoff * fix(web): add draw toolbar close action * fix(web): move inspect editing into edit mode * fix(web): avoid reserving comment sidebar in annotation mode * fix(web): float preview comments panel * fix(web): keep edit canvas full width * fix(web): polish preview annotation tools * fix(web): highlight active preview comments * fix(web): open comments panel after annotation save * fix(web): polish comment handoff controls * fix(web): remove palette preview tool * fix(web): simplify draw annotation toolbar * fix(web): restore queued tasks into composer * fix(web): restore queued send strip styling * fix(web): hide internal comment target ids * fix(web): align manual edit panel header * test(web): cover visual interaction contracts * fix(web): address PR feedback regressions * fix(web): preserve artifact chrome state * fix(daemon): restore project raw file routes --------- Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local> Co-authored-by: mrcfps <mrc@powerformer.com> |
||
|
|
80e685c656
|
fix(web): polish design system source flows (#2933)
Some checks failed
visual-baseline / Capture visual baselines (push) Waiting to run
ci / Detect CI change scopes (push) Successful in 0s
landing-page-ci / Validate landing page (push) Failing after 2s
landing-page-deploy / Deploy landing page (push) Has been skipped
nix-check / build (push) Failing after 2s
ci / Validate Nix flake (push) Has been skipped
ci / Preflight (push) Failing after 1s
ci / Workspace unit tests (push) Failing after 1s
ci / Daemon workspace tests (push) Failing after 1s
ci / Web workspace tests (push) Failing after 1s
ci / Browser tests (push) Failing after 1s
ci / Build workspaces (push) Failing after 1s
ci / Validate workspace (push) Failing after 1s
ci / Runtime trace (push) Has been skipped
Co-authored-by: qiongyu1999 <2694684348@qq.com> |
||
|
|
f93c886411 |
Fix conflict resolution from c14baf07
Two CI failures from PR #2461 root-caused to wrong picks in the
merge:
* apps/web/src/components/plugins-home/PluginCard.tsx — reverted to
release-side. The release-side version uses `localizePluginTitle
(locale, record)` / `localizePluginDescription(locale, record)`
to read each plugin's `titleI18n` / `descriptionI18n` fields,
driving the 'localizes plugin card titles' test in
plugins-home-section.test.tsx (which asserts '瑞士国际主义 Deck'
appears under zh-CN). The main-side version replaced that
record-level i18n with hardcoded English + `t()` aria-label
keys — a finer-grained i18n migration but a fundamental loss of
the record-level localization the test exercises. Taking
release-side keeps the test functionality; the aria-label i18n
keys are micro-optimisation we can re-port in a follow-up.
* e2e/ui/settings-local-cli-codex-fallback.test.ts — added the
`SETTINGS_MENU_LABEL` constant declaration that the menu-
dismissal helper (kept from main in
|
||
|
|
c14baf07d3 |
Merge origin/main into release/v0.8.0
PR #2461 sync prep — resolves 14 conflicts merging 84 main-side commits on top of 58 release-side commits accumulated during the 0.8.0 cycle. Resolution summary: Take main (theirs) where main carried deliberate forward progress: - apps/web/src/components/PluginCard.tsx — 7 hunks, i18n migration: hardcoded English aria-labels/titles replaced with t() calls keyed on pluginCard.* (all 8 keys verified present in en.ts). - apps/web/src/components/TasksView.tsx — 1 hunk, source-ingestion feature: sortedRoutines (newest-first), sourceIngestionTemplates, patchSourceForm, submitSourceIngestion. activeCount/pausedCount semantics preserved (now keyed on sortedRoutines, count unchanged). - e2e/ui/app.test.ts — new node:fs/promises + tmpdir + path + @/timeouts imports needed by main-side test helpers. - e2e/ui/settings-local-cli-codex-fallback.test.ts — menu-dismissal helper block added by main. Keep both sides where each added a different field to the same object literal: - apps/web/src/components/ProjectView.tsx (locale + analyticsHints spread). - apps/web/src/components/DesignSystemFlow.tsx (locale + analyticsHints). Take release (ours) where release carried deliberate work that ships 0.8.0: - CHANGELOG.md — release-side 0.8.0 entry + PR link refs; main's Unreleased section was the same body of work, now finalized. - apps/landing-page/public/{apple-touch-icon,favicon}.png + apps/web/public/app-icon.svg — release-side visual refresh assets consistent with 0.8.0 stable ship. - tools/pack/src/linux.ts — packageVersion const required by line 466; taking main's empty line would build-error. - e2e/ui/project-management-flows.test.ts + e2e/ui/settings-api-protocol.test.ts + e2e/ui/settings-memory-routines.test.ts — release-side release-smoke hardening (shangxinyu1 + PerishFire) takes precedence on overlap. Closes-issue / unblocks: PR #2461 sync release/v0.8.0 → main. |
||
|
|
32fd5286b5
|
chore(e2e): improve test framework quality (#2305)
* chore(e2e): improve test framework quality
- Add lib/timeouts.ts with CI-scaled short/medium/long/xlong constants
- Add lib/playwright/mock-factory.ts to centralise standard localStorage,
/api/agents, and /api/app-config mock setup; migrate critical-smoke and
workspace-keyboard-flows to use applyStandardMocks()
- Delete empty lib/shared.ts placeholder
- Replace waitFor({ state: 'detached' }).catch(() => {}) with
waitFor({ state: 'hidden' }) in all UI tests; 'hidden' resolves
immediately when the element was never in the DOM, eliminating the
silent error-swallowing catch
- Remove redundant .catch(() => false) from all isVisible() call sites
since isVisible() never throws in Playwright
- Convert .waitFor().then(() => true).catch(() => false) guards in
openDesignFile() to explicit try/catch blocks for clarity
- Simplify sendPrompt() in app.test.ts: replace the 3-attempt manual
retry loop with a single fill + pressSequentially fallback; the core
workaround for contenteditable unreliability is preserved but the
loop structure is gone
* fix(e2e): guard routeMockAgents to GET only
routeMockAgents was intercepting all HTTP methods and returning the mock
fixture, silently swallowing any agent mutation requests. Mirror the
GET-only guard from routeAppConfig so writes fall through to the daemon.
* fix(e2e): address code review findings
- sendPrompt() in app.test.ts, workspace-keyboard-flows.test.ts,
app-restoration.test.ts: drop fill() (unreliable on contenteditable,
inputValue() always returns '' for them) and go straight to
pressSequentially(), which types key-by-key and is authoritative
- Import T from timeouts.ts in app.test.ts and use T.short for the
input/button waits, making the timeouts module non-dead
* fix(e2e): resolve adversarial review findings
- Revert sendPrompt to fill(): chat-composer-input is a textarea, not
contenteditable; fill() is atomic and ~60x faster than pressSequentially
- Use T.medium in all waitForLoadingToClear calls: CI workers scale this
to 20s automatically via the CI env var, eliminating cold-runner flakes
- Add T import to 6 files that needed it for T.medium
- Fix openDesignFile try/catch scope in app-manual-edit: previously the
catch block only caught waitFor but click/expect errors were also swallowed;
now only waitFor is inside try, real interaction failures propagate
- Fix regex escaping: .replace('.', '\\.') -> .replace(/\./g, '\\.') in
app-manual-edit and app-design-files to handle multi-dot filenames
- Migrate entry-chrome-flows.test.ts to applyStandardMocks: it had the
identical 3-call setup pattern as the factory but was not migrated
- Add GET method guard to project-management-flows app-config route handler,
matching the pattern used by every other route handler in the suite
- Remove no-op 'as const' from timeouts.ts: Math.ceil returns number,
not a literal, so the assertion had no effect
- Update e2e/AGENTS.md: remove deleted lib/shared.ts entry, document
lib/timeouts.ts and lib/playwright/mock-factory.ts
* fix(e2e): scope openDesignFile try/catch to waitFor only
Move click and expect(preview).toBeVisible() outside the catch block so
that a regression in either open path (tab-click or file-list fallback)
fails loudly instead of being silently absorbed. The try now wraps only
the fileTabButton.waitFor existence probe; the subsequent click and final
assertion are unconditional.
---------
Co-authored-by: Patrick A <186436799+eefynet@users.noreply.github.com>
Co-authored-by: Patrick A <259201958+eefynet@users.noreply.github.com>
|
||
|
|
ce2e7a0e66
|
fix(web): chat pane preserves scroll position when todo card grows (#2299)
* fix(web): chat pane preserves scroll position when todo card grows The PinnedTodoSlot renders outside the chat-log scroll container. When the todo card grows (new tasks added via TodoWrite), the scroll container's clientHeight shrinks in the flex layout, drifting the user away from the bottom. The existing ResizeObserver only observed children of the chat-log div, so pinned-todo growth was invisible to followLatestIfPinned. Fix: pass a containerRef to PinnedTodoSlot and observe that element in the same ResizeObserver. syncPinnedTodo() is called on effect setup and from the MutationObserver callback so observation stays current as the slot appears and disappears across TodoWrite snapshots. Red spec: apps/web/tests/components/chat-todo-autoscroll.test.tsx * fixup! fix(web): chat pane preserves scroll position when todo card grows Clarify test comment: the second test confirms followLatestIfPinned snaps scroll to bottom when fired. The structural guarantee (pinned-todo element is observed) is separately asserted in test 1, which is the check that goes red on main without the fix. * fix(web): correctness extend MutationObserver to pane ancestor for PinnedTodoSlot mount detection The MutationObserver was only watching the .chat-log element. PinnedTodoSlot (.chat-pinned-todo) is a sibling of .chat-log-wrap inside .pane, outside the observed subtree. syncPinnedTodo inside the MutationObserver callback was therefore dead code for mount/unmount transitions of the slot. Add a second observation on paneEl (el.parentElement?.parentElement) with childList-only so the MutationObserver fires when PinnedTodoSlot mounts or unmounts and syncPinnedTodo can register/deregister the element with the ResizeObserver. * test(e2e): chat pane auto-scroll on todo card growth Add Playwright spec that goes red on origin/main and green on this fix branch. Scenario A asserts that a chat-log pinned to the bottom snaps back after the PinnedTodoCard grows (the ResizeObserver-on-pinned-todo path). Scenario B asserts that a deliberate scroll-up is not overridden. Also allow OD_WORKSPACE_ROOT env override in next.config.ts so Turbopack resolves node_modules correctly when the web app is booted from a worktree whose node_modules symlinks resolve outside the default workspace root. * docs(agents): note pinned-todo observer coverage in chat UI conventions PinnedTodoSlot sits outside the .chat-log scroll container, so the ResizeObserver and MutationObserver coverage that keeps auto-scroll working when the todo card grows is non-obvious to future implementers. Document the invariant in the Chat UI conventions section. * fix(web): validate OD_WORKSPACE_ROOT, harden autoscroll test precondition * fix(web): validate OD_WORKSPACE_ROOT existence, make autoscroll precondition unconditional * fix(web): throw on invalid OD_WORKSPACE_ROOT instead of warn-and-fallback * fix(web): require pnpm-workspace.yaml at OD_WORKSPACE_ROOT, drop dead test branch Three follow-ups to nettee's review feedback: 1. apps/web/next.config.ts gains a pnpm-workspace.yaml existence check after the relative-path validation. Without it, an override like '<repo>/apps' or '<repo>/apps/web' passes the relative(resolved, WEB_ROOT) check but the resolved path is missing the sibling packages/* directory that apps/web imports from (for example @open-design/contracts). Next would later fail deep inside file tracing / Turbopack with a much harder-to-diagnose error. Now we throw at config load with a clear message. 2. e2e/ui/chat-todo-autoscroll.test.ts drops the redundant 'if (scrollUpOccurred)' branch. The hard precondition above it already guarantees distanceAfterScroll > 80, so the if was dead code that read as a false-green path. The body now runs unconditionally. 3. Same test tightens the post-grow assertion. The previous toBeGreaterThan(60) would pass even if a regression dragged the log most of the way back to the bottom (e.g. before=150, after=61). Replaced with Math.abs(distanceAfterGrow - distanceAfterScroll) less than SCROLL_PRESERVATION_TOLERANCE_PX (20) — a delta check that actually verifies the comment's claim of 'within ~20px of where the user left it'. * fix(web): canonicalize workspace root with realpathSync and tighten scenario B assertion - Use realpathSync on both resolved and WEB_ROOT before the ancestor check so that symlinked paths (macOS /tmp vs /private/tmp, worktree checkouts) compare correctly instead of false-throwing on a physically valid override. - Add isAbsolute(rel) guard for the Windows cross-drive case where path.relative() returns an absolute path instead of a ..-prefixed string. - Scenario B: replace distance-to-bottom delta assertion with scrollTop preservation check. Growing the pinned todo naturally increases distance-to-bottom by ~extraPx (clientHeight shrinks while scrollTop is held fixed), so the old Math.abs(after - before) < 20 check would fail on correct behavior. asserting scrollTop directly catches the real regression: followLatestIfPinned incorrectly snapping a non-pinned user back to the bottom. - Add hard precondition that clientHeight actually changed so the test fails fast if the layout stops exercising the non-pinned path. * test(e2e,web): add clientHeight guard to scenario A and mount-wiring unit test --------- Co-authored-by: Patrick A <eefynet@users.noreply.github.com> Co-authored-by: Patrick A <259201958+eefynet@users.noreply.github.com> |
||
|
|
95bbdbb734
|
[codex] test(e2e): harden settings and entry regressions (#2578)
* test(e2e): harden settings and entry regressions * test(e2e): align entry chrome coverage with current UI Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(web): refresh saved media providers from daemon Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * test(web): align media provider reload expectations Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(web): keep daemon media-provider reloads authoritative Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(web): make media-provider reload precedence depend on dialog edits Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(web): preserve pending media-provider edits across stale autosaves Generated-By: looper 0.6.0 (runner=fixer, agent=codex) |
||
|
|
526c7f7c26
|
Fix packaged auto-update release validation (#2565)
* fix: tighten packaged updater flow * test: prune noisy extended ui coverage * fix: hide unpublished release artifacts * test: validate release updater channels * fix: align prerelease release namespaces |
||
|
|
10940ab7b6
|
[codex] Update home starter template categories (#2501)
* Update home starter template categories * fix(web): address PR #2501 review follow-ups - usePluginFacets.clearFacets() now also resets `mode` to 'all' so the empty-state "Clear filters" CTA escapes Saved mode in one click. A fresh browser clicking Saved (or Saved combined with a zero-match search) was previously stranded on the empty view because the CTA only cleared `selection`/`query`. - Restore the "Link code folder" item in the composer Tools -> Import panel. The earlier refactor dropped its `onLinkFolder` wiring and left every remaining ImportItem disabled, so chats without linked dirs lost the only enabled entry point to `openFolderDialog()` / `patchProject({ metadata.linkedDirs })`. Adds regressions: - plugins-home-section: Clear filters from the Saved empty state - ChatComposer.import-menu: folder-link click-through hits the folder dialog and patches `linkedDirs` --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
c45c5c9764
|
fix(ci): align visual selectors and nix hashes (#2471)
* fix(ci): align visual selectors and nix hashes * fix(ci): add strict PR visual verification * fix(ci): repair visual-home captures Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) |
||
|
|
ce95266586
|
[codex] Polish home composer working-directory controls (#2468)
Some checks failed
visual-baseline / Capture visual baselines (push) Waiting to run
ci / Detect CI change scopes (push) Successful in 1s
nix-check / build (push) Failing after 3s
ci / Preflight (push) Failing after 2s
ci / Core package tests (push) Failing after 1s
ci / Tools workspace tests (push) Failing after 1s
ci / Daemon workspace tests (1/2) (push) Failing after 1s
ci / Daemon workspace tests (2/2) (push) Failing after 1s
ci / Web workspace tests (push) Failing after 1s
ci / E2E vitest (push) Failing after 1s
ci / Playwright critical (starters) (push) Failing after 1s
ci / Playwright critical (core) (push) Failing after 1s
ci / Build workspaces (push) Failing after 1s
ci / App workspace tests (push) Failing after 0s
ci / Validate workspace (push) Failing after 0s
ci / Runtime trace (push) Has been skipped
* Polish design system home flows * Polish home prompt presets * Polish home working directory controls * test: align home hero chrome smoke * fix: stabilize home composer ci checks --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> |
||
|
|
e727168676
|
chore(ci): expand visual regression coverage (#2381)
Some checks failed
ci / Runtime trace (push) Blocked by required conditions
visual-baseline / Capture visual baselines (push) Waiting to run
ci / Detect CI change scopes (push) Successful in 0s
landing-page-ci / Validate landing page (push) Failing after 2s
landing-page-deploy / Deploy landing page (push) Has been skipped
nix-check / build (push) Failing after 2s
ci / Preflight (push) Failing after 1s
ci / Core package tests (push) Failing after 1s
ci / Tools workspace tests (push) Failing after 1s
ci / Daemon workspace tests (1/2) (push) Failing after 1s
ci / Daemon workspace tests (2/2) (push) Failing after 2s
ci / Web workspace tests (push) Failing after 1s
ci / E2E vitest (push) Failing after 2s
ci / Playwright critical (starters) (push) Failing after 1s
ci / Playwright critical (core) (push) Failing after 1s
ci / Build workspaces (push) Failing after 1s
ci / App workspace tests (push) Failing after 0s
ci / Validate workspace (push) Failing after 14m14s
* Improve visual diff annotations * Expand visual regression coverage * fix(ci): cap visual diff canvas pixels Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * Stabilize visual regression screenshots * test(e2e): stub routines for visual snapshot Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * Expand visual regression surfaces * fix(e2e): order design system visual mocks Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * fix(e2e): order design system visual mocks Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * Tune visual diff box stroke * fix(e2e): stabilize visual detail mocks Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * fix(e2e): harden visual diff box helpers Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * fix(web): preserve deep-linked project bootstrap Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * fix(e2e): stub automation task mocks Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) |
||
|
|
8193981511
|
Keep PR 2400 changes without folder pickers (#2462)
* feat(daemon): add project working directory management and editor hand-off functionality - Introduced new flags for project commands to manage working directories, including `--working-dir` and `--dir`. - Implemented API routes for listing available editors and opening projects in selected editors. - Added a hand-off button in the ChatPane header to facilitate opening project folders in local applications. - Enhanced the HomeHero component to include working directory and design system settings, improving user experience in project creation. - Created HomeHeroSettingsChips component for inline management of working directory and design system selection. * feat(chat): implement voice transcription proxy and enhance UI components - Added a new API route for voice transcription using OpenAI's `/audio/transcriptions` endpoint, allowing users to send audio blobs directly for transcription. - Integrated multer for handling audio file uploads in memory, ensuring efficient processing without disk storage. - Updated the HomeHero component to include example prompt suggestions for plugins, enhancing user interaction. - Introduced the EditorIcon component to visually represent different editors in the hand-off menu, improving the user experience. - Refined the HandoffButton component to utilize the new EditorIcon, providing a more cohesive interface for selecting editors. - Enhanced CSS styles for various components to improve layout and responsiveness, including adjustments to tab and button sizes for better usability. * style(workspace-shell): enhance layout and overflow handling - Updated CSS for .workspace-shell to ensure full viewport width and height, with proper overflow management. - Adjusted grid layout to prevent content overflow and maintain responsiveness. - Modified styles for .workspace-tabs-chrome to improve width handling and prevent overflow issues. * refactor(chat): remove voice transcription proxy and related components - Deleted the voice transcription proxy implementation, including the associated API route and multer configuration. - Removed the MicButton component from the ChatComposer and HomeHero components to streamline the UI. - Updated HomeHero to include example suggestions without the voice input functionality. - Adjusted CSS styles for various components to maintain layout consistency after the removal of the MicButton. * feat(daemon): implement minting of HMAC tokens for working directory management - Added a new function `mintImportTokenFromCurrentSecret` to generate HMAC tokens bound to a specified base directory, enhancing security for working directory operations. - Updated the `desktop-auth.ts` file to include the new token minting functionality, which returns structured errors when the desktop auth secret is cleared. - Introduced new IPC message types for minting import tokens in the sidecar protocol, allowing seamless integration with the daemon's working directory management. - Enhanced the `WorkingDirPill` component to utilize the new token minting flow for secure directory selection in desktop builds. - Updated CSS styles for the HomeHero component to accommodate new example suggestion features and maintain layout consistency. * fix(HomeView): import HOME_HERO_CHIPS constant for improved chip management - Updated the HomeView component to import the HOME_HERO_CHIPS constant from the chips module, enhancing the management of hero chips within the component. * feat(daemon): implement mintImportTokenViaSidecar for secure working directory management - Introduced the `mintImportTokenViaSidecar` function to facilitate the minting of HMAC tokens for desktop-import operations via the daemon's sidecar IPC. This allows CLI commands to bypass authentication when the desktop-auth gate is active. - Updated the CLI to utilize the new token minting function when setting the working directory, ensuring secure access to trust-gated API endpoints. - Enhanced the sidecar server to handle minting requests and return structured error messages for improved user feedback. - Added tests to validate the new token minting functionality and its integration with the working directory management process. - Refactored related components to support the new token flow, improving overall security and user experience. * feat(HomeHero): enhance UI components and styles for improved user experience - Updated HomeHero component to replace active dot indicators with Plug icons for better visual representation of active plugins. - Adjusted CSS styles for various elements, including padding and dimensions, to enhance layout consistency and responsiveness. - Introduced new styles for active type icons and improved hover effects for buttons. - Updated HomeHeroSettingsChips to change button titles and icons for clarity. - Added tests to ensure proper rendering and functionality of updated components. * feat(ProjectDesignSystemPicker): enhance design system selection with preview functionality - Updated the ProjectDesignSystemPicker component to include a preview feature for design systems, allowing users to see a preview of the selected design system. - Implemented hover functionality to update the preview based on the hovered design system. - Added fullscreen preview capability for a more immersive experience. - Enhanced CSS styles for the design system picker to improve layout and responsiveness. - Introduced tests to validate the new preview functionality and ensure proper interaction within the component. * feat: refactor project metadata handling and enhance design system picker - Updated the default scenario plugin ID retrieval to use project metadata, improving the logic for determining the appropriate plugin based on project intent. - Enhanced the ProjectDesignSystemPicker and related components to support localized design system summaries and categories, improving user experience. - Introduced new translations for working directory and design system picker components, ensuring better accessibility and usability across different locales. - Added a new 'live-artifact' project type to the HomeHero chips, expanding the functionality for users creating refreshable artifacts. - Updated tests to validate the new project metadata handling and design system picker functionalities. * feat: enhance localization and styling for design system components - Added French translations for working directory and design system picker components, improving accessibility for French-speaking users. - Updated CSS styles for the pet task item to ensure consistent padding and layout. - Introduced a new test suite for HomeHeroSettingsChips to validate localization and design system selection functionality. - Enhanced ProjectDesignSystemPicker tests to ensure proper localization and interaction with design system categories. * fix: update .gitignore to include all claude-sessions directories and remove specific session files - Modified .gitignore to ensure all claude-sessions directories are ignored by using a wildcard pattern. - Deleted two specific claude-sessions markdown files to clean up unnecessary session data. * fix: repair home automation ci regressions * fix: stabilize artifact consistency e2e * Remove folder picker changes from PR 2400 --------- Co-authored-by: pftom <1043269994@qq.com> Co-authored-by: qiongyu1999 <2694684348@qq.com> |
||
|
|
41a33aed9e
|
Reapply "fix(web): demote Plugins and Integrations to nav rail footer (#1806)" (#2360) (#2397)
* Reapply "fix(web): demote Plugins and Integrations to nav rail footer (#1806)" (#2360)
This reverts commit
|
||
|
|
a5e43ae2a4
|
add discord feedback entry points (#2386) | ||
|
|
71044bd3d6
|
test(e2e): harden extended coverage state assertions (#2245)
* test(e2e): harden extended coverage contracts * docs(testing): add e2e hardening status * fix(web): persist artifact chips after daemon runs * ci: install playwright browsers for e2e vitest * Fix daemon run recovery across reloads Pin daemon-created runs to assistant messages immediately so hard reloads before the create response can reattach. Replay terminal and active run events from the beginning on reload so restored turns keep assistant text, thinking events, produced files, and artifacts. Fixes #2366 Fixes #2368 Fixes #2371 * test(e2e): preserve fake runtime selection across reload * fix(web): scope daemon run recovery to daemon mode * fix(e2e): remove duplicate delayed smoke flag * fix(web): scope replay artifact recovery to current run * fix(daemon): remove duplicate run-create pin |
||
|
|
5fc27f8923
|
Fix daemon run recovery across reloads (#2374)
* Fix daemon run recovery across reloads Pin daemon-created runs to assistant messages immediately so hard reloads before the create response can reattach. Replay terminal and active run events from the beginning on reload so restored turns keep assistant text, thinking events, produced files, and artifacts. Fixes #2366 Fixes #2368 Fixes #2371 * Fix ProjectView daemon run recovery tests |
||
|
|
f294ab4915
|
chore(ci): add visual regression PR workflow (#2372)
* Add visual regression PR workflow * Allow manual visual PR comments * Post visual comments for same-repo PRs * fix(ci): surface R2 lookup failures in visual report Generated-By: looper 0.8.1 (runner=fixer, agent=opencode) * Align visual workflow names |
||
|
|
1ab8758045
|
Revert "fix(web): demote Plugins and Integrations to nav rail footer (#1806)" (#2360)
This reverts commit
|
||
|
|
d5eb6d17d7
|
test(e2e): harden connectors auth cancel recovery (#2176)
* test(e2e): harden connectors auth cancel recovery * test(e2e): target connector cancel alert |
||
|
|
86ec951fb9
|
[codex] Add automation templates and proposal workflows (#2193)
* feat(web): introduce Automations tab with dual-track capability for routines This commit adds a new Automations tab that consolidates routines, schedules, and live artifacts, allowing users to manage automations seamlessly. The tab features a modal for creating and editing automations, which supports various scheduling options (hourly, daily, weekdays, weekly) and project modes (create_each_run, reuse). The CLI is also updated to expose automation commands, ensuring consistency between the web UI and CLI interfaces. Key changes include: - New `NewAutomationModal` component for automation creation and editing. - Updated `TasksView` to integrate the new Automations functionality. - Enhanced styling for the Automations tab to improve user experience. This implementation aligns with the dual-track capability exposure policy, ensuring all features are accessible via both the web UI and CLI. * feat(daemon): enhance automation context handling and CLI commands This commit introduces several improvements to the automation context management and updates the CLI commands accordingly. Key changes include: - Added support for new context fields (`plugin`, `mcp`, `connector`) in automation commands. - Updated the CLI to reflect new target options (`new-project`). - Enhanced error messages for invalid target inputs. - Introduced functions to handle context selection and normalization for routines, including the ability to parse and store context data in the database. - Updated the database schema to include a new `context_json` field for routines. - Improved the handling of context in routine routes and the web interface, ensuring that selected contexts are properly managed and displayed. These changes aim to provide a more robust and flexible automation experience, aligning with the recent enhancements in the web UI. * feat(web): enhance TasksView with automation run history and status indicators This commit introduces several new features to the TasksView component, including: - Added functionality to display automation run history for each routine, showing metadata such as status, timestamps, and project details. - Implemented status indicators for routine runs, providing visual feedback on their current state (succeeded, failed, running, queued). - Enhanced the UI to allow users to expand and view detailed run history, including the ability to open the corresponding project conversation. - Updated styles to improve the presentation of automation statuses and history. These changes aim to provide users with better insights into their automation routines and improve overall usability. * feat(daemon): implement automation ingestion and proposal management This commit introduces several new features related to automation ingestion and proposal management within the daemon. Key changes include: - Added new modules for handling automation source packets and proposals, allowing for the storage, retrieval, and management of automation-related data. - Implemented functions to list, create, and apply automation proposals, enhancing the automation workflow. - Introduced new CLI commands for interacting with memory entries and automation sources, providing users with more control over their automation processes. - Enhanced the server routes to support automation source and proposal APIs, enabling seamless integration with the existing system. These changes aim to improve the overall automation experience, making it easier for users to manage and utilize automation proposals and ingestions effectively. |
||
|
|
eb127e0f79
|
Remove live artifact home chip (#2221) | ||
|
|
a38e09f931
|
fix(web): demote Plugins and Integrations to nav rail footer (#1806)
* fix(web): demote Plugins and Integrations to nav rail footer
Plugins and Integrations are platform-configuration surfaces, not
daily-use destinations. Moving them to the footer section of the
left nav rail — separated from primary items by a thin divider —
keeps them reachable while giving the primary four items
(Home, Projects, Automations, Design Systems) the visual weight
they deserve.
- EntryNavRail: remove Plugins/Integrations from the main __group
and place them in the __footer above the help launcher
- entry-layout.css: add __divider rule (1 px separator) to visually
mark the boundary between primary and secondary nav regions
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): remove settings dropdown, gear opens settings directly
The gear/cog button previously opened a dropdown that mixed three
unrelated concerns: community links (X, Discord), preference quick-
access (Language, Appearance), a feature shortcut (Use everywhere),
and a redundant Settings entry — creating two separate paths to the
same Settings dialog and duplicating Language/Appearance relative to
the Settings sidebar.
Changes:
- Gear button now directly opens the Settings dialog (no intermediate
dropdown layer)
- Follow @nexudotio on X and Join Discord moved to the Help menu at
the bottom of the nav rail, where community/external links belong
- Language and Appearance remain exclusively in the Settings dialog
- Use everywhere remains exclusively in the topbar chip
- Remove dead state (avatarMenuOpen, languageExpanded,
appearanceExpanded), ref, outside-click effect, and module-level
constants (APPEARANCE_THEMES, APPEARANCE_LABEL, describeModelChip)
that were introduced solely to support the dropdown
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(web): move output-type chips above chat input as tab bar
The home hero previously had two chip rows below the chat input that
mixed two unrelated dimensions: output type (Prototype, Slide deck,
Image…) and workflow source (Create plugin, From Figma, From folder,
From template). Users had no visual cue that these were different
categories.
This change separates the two dimensions clearly:
- Output type (create group) becomes a tab bar positioned above the
input card. Tabs share the same chip data and onPickChip dispatch,
so plugin selection, active state, and pending state are unchanged.
Active tab shows a colored underline; the bar border visually
connects to the input card below.
- Workflow source (migrate group) stays as the chip row below the
input card, now standing alone with unambiguous "how to start"
semantics.
- Subtitle updated from "Pick a plugin below" to "Pick a type"
to match the new placement.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): skip replace-prompt confirmation when switching output-type tabs
Output-type tab clicks (create group) are mode-selection gestures;
the user expects the prompt to update immediately without a dialog.
The confirmation is still shown for migrate-group chips (From Figma
etc.) where the replacement carries meaningful user-provided content.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): use brand logo as Home destination, drop redundant brand mentions
The entry nav rail previously rendered the brand logo and a separate
Home icon back-to-back; both invoked `onViewChange('home')`, so the
Home button was pure duplicate affordance. The hero pane also
displayed a third "Open Design" lockup that competed with the rail
logo and the rebranded title.
This collapses those affordances:
* `EntryNavRail` drops the dedicated Home `NavButton`. The brand
logo button now carries `aria-current="page"` and an `is-active`
visual state when the home view is showing, and its tooltip reads
"Open Design · Home" off-home so the navigation behavior stays
discoverable for new users.
* `entry-layout.css` adds the matching `.entry-nav-rail__logo.is-active`
accent treatment so the "you are here" cue reads at parity with
primary rail buttons.
* `HomeHero` removes the inline `home-hero__brand` lockup and the
associated CSS, then retunes the title/subtitle/type-tab spacing
so the headline group still pairs tightly with the type tabs and
input card below.
* `entry-chrome-flows` is updated to assert the logo carries the
active page treatment and that no `entry-nav-home` test id
resurfaces by mistake.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): auto-grow home chat input, disable internal scroll and drag-resize
The home hero textarea was capped at a fixed `min-height: 84px` with
`resize: vertical`, so users had to either drag the bottom-right
corner to enlarge it or scroll the textarea internally to read
longer prompts. That hid context (loaded plugin templates routinely
overflow three lines) and exposed a manual grip whose state was easy
to leave in an awkward height.
This change makes the chat box grow with its content:
* `HomeHero` adds a `useLayoutEffect` that, on every prompt change,
resets the textarea height to `auto` so the browser can measure a
smaller content, then writes back `scrollHeight` in pixels. The
effect uses layout phase (not effect phase) to avoid a one-frame
flash at the previous height when a plugin loads a long example
prompt.
* `home-hero.css` swaps `resize: vertical` for `resize: none` and
adds `overflow: hidden`, so the manual drag grip disappears and
the textarea never scrolls internally. `min-height: 84px` is kept
so the empty input still reads as a chunky chat box. The outer
page handles overflow when the prompt is genuinely very long.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): make output-type tabs read as folder-tabs attached to chat box
The previous tab bar used a small icon + label per tab and signaled
the active tab with a 2px accent underline sitting on a horizontal
divider line. That divider visually broke the relationship between
the tabs and the input card below — the active tab looked like a
free-floating header, not a flap of the chat surface, so users had
to do extra work to mentally connect "I picked Prototype" with the
prompt area immediately underneath.
This switches the tab bar to a folder-tab pattern:
* `HomeHero` drops the per-tab `Icon` element. The seven labels
(Prototype, Live artifact, Slide deck, Image, Video, HyperFrames,
Audio) already disambiguate at the type sizes used here, and the
icons were primarily decorative.
* `home-hero.css` rewrites the tab styling:
- The bar no longer paints its own baseline border; the input
card's top edge serves as the baseline.
- Each tab is a rounded-top container with a 1px transparent
border and a `-1px` bottom margin, so its bottom edge overlaps
the card's top border by exactly one pixel.
- The active tab borrows the card's panel background and border
color, and overrides its bottom border with the panel color
so it visually erases the card's top border for the tab's
width. The result reads as one continuous "Prototype is this
chat box" surface.
- The card's prior `margin-top: 8px` is removed so the tab bar
bottom and card top sit at the same y coordinate, which is the
geometric precondition for the 1px overlap to land cleanly.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): enlarge home chat attach and submit buttons for legibility
The attach (paper-clip) and submit (arrow-up) controls on the home
chat input rendered at 32px diameter with 14px and 18px glyphs
respectively. After stroke antialiasing the icons read at roughly
11–12px on typical displays — small enough that users reported the
glyphs were illegible and had to be discovered by trial-and-error.
This bumps both controls to 38px circles and grows their glyphs
(attach 14 → 18, submit 18 → 22). The primary call-to-action now
has clearly more visual weight than the surrounding muted hint
text, and the paper-clip is recognisable at a glance.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): paint a chevron on inline select slots in the home prompt
Inline select slots in the home prompt overlay (e.g. the
`high-fidelity` / `wireframe` fidelity picker on the Live artifact
template) rendered as plain pill highlights, visually identical to
free-text and read-only text slots. The slot's `appearance: none`
strips the browser's default chevron, so users had no affordance
hinting the value was switchable.
This wraps select-type slots in an `inline-flex` span and overlays
an explicit chevron-down `Icon` against the slot's trailing padding
(bumped from 18px to 22px to make room). The chevron inherits the
accent color via the wrap's `color` property so it themes correctly
in light and dark modes. Click semantics are unchanged because the
chevron is `pointer-events: none`.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): size inline number slots to their value, not the whole row
Number-type inline slots in the home prompt (e.g. the slide-count
spinner on the Slide deck template) inherited the generic slot
`min-width: 8ch` and then stretched to the browser's default
`<input type=number>` width, so a two-digit value like "10" ate
the entire remaining line and pushed the native spinner buttons
to the far right edge — far away from the value they control.
This sizes the number input to its actual content plus four
characters of trailing room for the spinner buttons (clamped to
6–14ch), and adds a `home-hero__prompt-slot-input--number`
modifier that overrides the slot `min-width` to 4ch. The
spinner now sits flush against the value and the slot reads as
a compact inline pill matching the surrounding text/select
slots.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): widen prompt line-height so slot pills do not collide vertically
Each interactive slot and mention in the home prompt paints a 2px
outline ring via `box-shadow`. At the previous `line-height: 1.55`
on a 15px font, two lines were ~23px apart while a single pill
occupied ~20px of vertical space (text box + ring on both sides),
leaving roughly 2px of clear space between rows. When the prompt
wrapped onto multiple lines — common for the Image template's
"Generate a {kind} of {subject}. Style: {style}. Aspect: {ratio}."
example — the rings from line N and line N+1 visually merged into
a single bar, making it ambiguous which pill the user was about to
click or edit.
This bumps the line-height of both the highlight overlay and the
underlying editable textarea to 1.85 (~28px per line), restoring
~8px of clear space between pill rows. The two values must stay
identical so the overlay glyphs continue to track the textarea
caret positions exactly; a brief comment in each rule documents
this coupling for future edits.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): paint dropdown chevron on select element with neutral color
The previous chevron treatment wrapped each select slot in an
inline-flex span and laid an accent-orange `<Icon>` over the
trailing padding. At the prompt's normal display size the orange
chevron blended into the orange pill ring corners, so users
perceived "a bunch of dropdown arrows" across every highlighted
slot, even though only the actual select rendered one.
Move the chevron onto the `<select>` itself as a small neutral-
gray `background-image` SVG. The grey contrasts clearly with the
pill's accent ring and the chevron lives inside the select's own
padding, so it can never visually overlap the value text and can
never appear on non-select slots.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): shrink select-slot dropdown caret so it stops reading as a check-mark
The 8×6 stroked chevron looked like a check-mark in zoomed views
because the stroke width was a large fraction of the glyph. Swap
for a small (9×5) filled triangle drawn as a `background-image`,
with `background-size` pinned so browsers can't scale it to its
intrinsic SVG box. The caret is now unambiguously a dropdown
indicator without crowding the value text.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): make read-only prompt slots visually distinct from editable pills
Multi-word context values are rendered as plain <span>s with
pointer-events: none, but they inherited the same orange pill +
ring treatment as the truly editable <input>/<select> slots.
Users couldn't tell which pills they could click into.
Strip the pill background, the ring, the radius, and the padding
from `.home-hero__prompt-slot-text` and replace them with a
subtle dashed bottom border. The orange foreground keeps the
slot family link, but read-only highlights now clearly read as
"context value spliced into the prompt" rather than as
"interactive control".
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): scale select-slot dropdown caret up to 12px wide for legibility
The 9×5 caret was too quiet at the prompt's font size to read as
a dropdown affordance. Bump to a 12×6 filled triangle and pad the
select's trailing space (24px) so the value text still has clear
breathing room from the caret.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): collapse plugins search and filter strips into one bar
The plugins-home gallery laid out search, count, mode (Featured),
total-in-catalog, clear-filters, the main category row, and the
sub-category row as four floating clusters. Search lived up in
the section header, far from the chips it actually scopes; the
mode strip wedged "Featured + 386 in catalog + Clear filters"
between the header and the category row with no obvious
ownership; and the two clear-filter affordances duplicated each
other (chip strip + empty-state).
Fold the mode strip into the category row: Featured is now the
leading chip on the same line as the category pills, and the
search field, the result count, and a compact Clear link sit as
a right-aligned tools cluster on that same row. The header
shrinks back to just the title, subtitle, and Browse registry
link. The sub-category row stays as the contextual second line
when a category is active.
Tests: keeps the existing data-testids (plugins-home-chip-featured,
plugins-home-row-category, plugins-home-clear, plugins-home-count,
plugins-home-search) so the existing 11-case section spec passes
without modification.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): hide Recent projects rail on first-run home when empty
A brand-new user landing on Home saw an empty dashed box with the
copy "No projects yet — type a prompt to start one." sitting
above the plugin gallery. The hero already prompts the user to
type, so the empty rail just adds vertical noise without telling
them anything new.
Return null from RecentProjectsStrip when the recent list is
empty (loading or not) so the rail appears only when there is
actually something to show. Update the entry-chrome e2e to
assert toHaveCount(0) on first run — codifying the new contract
that the rail is conditional, not always-mounted.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): hide structured inputs form when every field is in the prompt
The PluginInputsForm below the chat textarea surfaced every plugin
input — even the ones the prompt template already substitutes
inline as highlighted slot pills. For a plugin like Prototype
that's five identical labelled inputs duplicating the five slot
pills above them, making the chat box look like it has grown a
second composer.
Compute the set of placeholder keys actually referenced in the
prompt template (via INPUT_PLACEHOLDER_PATTERN) and skip those
when deciding what the structured form needs to render. The form
still appears for plugins that expose inputs not referenced in
the prompt text (e.g. a "Run in background" toggle), but
template-only plugins collapse the redundant second editor.
Update the picker spec accordingly: when every field is in the
template the form is now expected to be absent rather than to
render a duplicate of the inline slot.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): drop redundant filtered-count and Clear link beside the search
The combined plugins filter bar trailed the search field with
"59 / 386" and an inline "Clear" link, but both repeat what the
chip strip already shows: every chip carries its own count
(Slides 59, All 386, …) and clicking the All chip already resets
the filters. Users had no way to know what the bare "59 / 386"
fraction or "Clear" referred to without inference.
Strip the trailing tools cluster down to just the search input.
The empty-results message at the bottom of the gallery still
exposes a contextual "Clear filters" button when a stacked
filter yields no matches, so the affordance isn't lost — just
removed from a position where it didn't read as actionable.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): drop default subtitle copy from the Official starters strip
The Home gallery rendered a two-line explanatory subtitle under
"Official starters" — "Ready-to-use Open Design workflows bundled
with this runtime. Pick one to load a starter prompt, or browse
the registry for more." — every visit. Returning users skip it,
new users get the same message from the section title plus the
Browse registry link plus the visual card grid itself; the prose
read as filler chrome above the chip strip.
Default the subtitle prop to undefined and only render the <p>
when a caller passes an explicit string. Other surfaces that
mount PluginsHomeSection with their own copy keep their
subtitle; the bare Home gallery loses one row of vertical noise.
Co-authored-by: Cursor <cursoragent@cursor.com>
* Fade out empty workflow lanes in Plugins home filter
Empty top-level lanes (Deploy 0, Refine 0, etc.) and empty
sub-categories (Vercel 0 under Deploy, Figma 0 under Import, etc.)
used to look identical to populated ones, so users couldn't tell at
a glance which chips were real catalog buckets and which were
"contribute a plugin" invites. The strip kept all lanes visible by
design — the workflow shape (Import / Create / Export / Share /
Deploy / Refine / Extend) is part of what we want users to see —
so we keep them clickable, but tag count-zero chips with
data-empty='true' and give them a faded, dashed-border treatment.
"All" pills stay solid since their count reflects the parent lane,
not their own emptiness.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): update home nav test expectations
* fix(e2e): align critical smoke with entry chrome
---------
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
f650a043d9
|
test(e2e): align entry coverage with redesigned flows (#2101)
* Migrate entry E2E coverage and split CI * test(e2e): relax connectors auth error assertions * ci: route scenario registry changes to extended e2e * ci: decouple packaged smoke jobs from validate gate * ci: restore pre-split workflow * test(e2e): slim critical ui smoke coverage * test(e2e): move broader entry flows out of critical * test(e2e): restore entry chrome coverage to ci * ci: parallelize workspace validation jobs * test(web): stabilize media palette bridge assertion * ci: cache e2e playwright browsers |
||
|
|
3f56395e2d
|
fix(web): restore Settings subtab-pill hover contrast in dark theme (#1815)
The .subtab-pill button hover rule hard-coded `rgba(255,255,255,0.6)` as the
hover background. In light theme this composites to a near-white wash that
sits cleanly under `--text`, but in dark theme it overlays a translucent
white on the dark panel and produces a bright, near-white surface beneath
the same light `--text`, dropping contrast to ~1.87 (well below WCAG AA 4.5).
The reporter (#1795, on 0.6.0) called this out on five Settings surfaces.
Four of them — BYOK / Appearance / Notifications / protocol-chip — were
incidentally fixed by the
|
||
|
|
2181eb376d
|
test(e2e): align UI suites with redesigned entry flows (#1830)
Some checks failed
nix-check / build (push) Failing after 2s
* test(e2e): align critical ui tests with new entry shell * test(e2e): align ui suites with redesigned entry flows * fix(e2e): correct entry chrome page helper type * fix(daemon): stabilize watcher and codex runtime tests * fix(daemon): satisfy strict watcher option typing |
||
|
|
c5d77a03bd
|
Garnet hemisphere (#1769)
Some checks failed
nix-check / build (push) Failing after 2s
* feat(chat-composer): enhance mention handling and input overlay - Introduced a new overlay for inline mentions in the chat composer, improving user experience by visually indicating mentions as users type. - Updated the `ChatComposer` component to manage mention entities and integrate them into the input field, allowing for better context and interaction. - Enhanced the `AssistantMessage` component to support the display of plugin action panels based on the current project context, facilitating easier plugin management. - Refactored related components to ensure consistent handling of project files and mentions across the application. This update significantly improves the chat interaction model, making it more intuitive for users to engage with mentions and plugins. * feat(plugin-management): enhance plugin action panels and UI components - Updated the `AssistantMessage` component to include plugin action panels based on the latest project context, improving user interaction with generated plugins. - Refactored the `PluginsView` to support detailed views for available marketplace entries, allowing users to access more information and actions for each plugin. - Introduced new CSS styles for improved visual representation of plugin-related UI elements, enhancing overall user experience. - Enhanced the `listPlugins` function to include an option for fetching hidden plugins, providing more flexibility in plugin management. This update significantly improves the usability and functionality of the plugin management system, making it easier for users to interact with and manage their plugins. * fix(assistant-message): refine plugin folder candidate selection logic - Updated the `pluginFoldersTouchedThisTurn` function to improve the logic for selecting plugin folder candidates based on touched paths and message content. - Introduced a new helper function, `pathMatchesFolderFileBasename`, to enhance the matching criteria for folder candidates. - Added a check for explicit folder matches before falling back to a single candidate, improving accuracy in folder selection. - Modified the `shouldRenderSlotAsText` function in `HomeHero` to include the name parameter, refining the rendering logic for slot text. These changes enhance the functionality and reliability of the assistant message component in managing plugin folder candidates. * feat(plugin-folder-actions): implement agent-routed CLI actions for plugin management - Introduced a new `PluginFolderAgentAction` type to streamline actions related to plugin folders, including install, publish, and contribute. - Updated the `DesignFilesPanel`, `FileWorkspace`, and `AssistantMessage` components to utilize the new agent action handling, improving user interaction with generated plugins. - Refactored the action handling logic to send commands to the agent, enhancing the workflow for managing plugin folders. - Added corresponding tests to ensure the new functionality works as expected and integrates seamlessly with existing components. This update significantly enhances the plugin management experience by routing actions through the agent, allowing for a more cohesive and interactive user experience. * Fix PR 1702 CI blockers * Fix PR 1702 remaining CI checks * Prebuild AGUI adapter after install * Restore plugin project snapshot wiring * feat(marketplace): refactor marketplace URL handling and enhance fetching logic - Introduced new functions to normalize marketplace URLs and manage fetching of marketplace manifests, improving the reliability of marketplace integrations. - Updated the server and plugin logic to utilize the new fetching mechanisms, ensuring consistent handling of marketplace data. - Enhanced tests to cover new URL normalization and fetching scenarios, ensuring robustness in marketplace management. This update significantly improves the marketplace experience by streamlining URL handling and enhancing data fetching capabilities. * Fix project auto-send cleanup spec * Reconcile run messages on cancel * Use active design system as visual direction * Fix active design system prompt wording * feat(workspace-tabs): implement workspace tabs functionality and file attachment handling - Introduced a new `WorkspaceTabsBar` component to manage workspace tabs, allowing users to navigate between different views (projects, marketplace, etc.). - Enhanced file handling capabilities in the `HomeHero` and `EntryShell` components, enabling users to stage and attach files before project creation. - Updated the `App` component to support auto-sending attachments alongside the first message in a project. - Improved CSS styles for workspace tabs and attachment UI, ensuring a cohesive design and user experience. This update significantly enhances the workspace navigation and file management features, providing users with a more intuitive and efficient workflow. * refactor(workspace-tabs): streamline workspace tabs and UI components - Removed unused components and actions from the `WorkspaceTabsBar` and `AppChromeHeader`, simplifying the codebase. - Updated CSS styles for the workspace shell and tabs, enhancing visual consistency and reducing element sizes for a cleaner layout. - Introduced a new client type detection mechanism to dynamically adjust the workspace shell's class, improving responsiveness. - Added tests for the `WorkspaceTabsBar` to ensure proper navigation and tab management functionality. These changes improve the overall performance and user experience of the workspace navigation system. * Update critical e2e for entry modal flow * Stabilize entry critical e2e flows * fix(ui): adjust workspace tabs and header styles for improved layout - Updated the CSS for workspace tabs and the app header, reducing element sizes and padding for a cleaner appearance. - Introduced a new button in the `WorkspaceTabsBar` for quick access to the home tab, enhancing navigation. - Minor adjustments to the layout and styles to ensure consistency across components. These changes enhance the user interface and improve the overall user experience in the workspace navigation system. * feat(workspace-tabs): implement pinned home tab functionality - Added a new pinned home tab feature to the `WorkspaceTabsBar`, allowing the home tab to remain accessible during navigation. - Updated tab management logic to collapse duplicate home tabs into a single pinned instance when restoring from local storage. - Enhanced CSS styles for workspace tabs to accommodate the new pinned tab design. - Updated tests to verify the behavior of the pinned home tab and its interaction with other tabs. These changes improve navigation consistency and user experience within the workspace. * refactor(workspace-tabs): enhance tab management and styling - Updated CSS styles for workspace tabs, adjusting padding and flex properties for improved layout and consistency. - Refactored tab creation logic to ensure unique IDs for project and marketplace tabs, enhancing navigation clarity. - Removed deprecated functions related to pinned home tabs, streamlining the codebase. - Improved test cases to verify independent behavior of home tabs during navigation. These changes enhance the user experience by providing a more intuitive tab management system and a cleaner UI. * style(workspace-tabs): update CSS for improved layout and visibility - Adjusted CSS properties for workspace tabs, including overflow, position, and z-index to enhance layout and stacking context. - Ensured consistent styling across tab components for better visual hierarchy. These changes contribute to a more polished and user-friendly interface within the workspace. * style(entry-layout): update CSS variables for improved layout consistency - Replaced fixed width values with CSS variables for the entry rail to enhance flexibility. - Adjusted padding and height properties for better visual alignment and spacing. - Introduced a new background style for the entry main topbar to improve aesthetics. These changes contribute to a more responsive and visually appealing layout in the entry view. --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> Co-authored-by: Eli <129168833+qiongyu1999@users.noreply.github.com> |
||
|
|
b268bbe169 |
Merge origin/garnet-hemisphere (post-9e196d34) — Use Plugin handoff fix
Brings in 11 new garnet commits, most importantly: - |
||
|
|
6c16283850 |
Merge origin/main (post-7c8305f4) into reconcile branch
Brings in 10 new main commits: routine deep-link to specific conversations (#1508), Windows resource cache fix for Orbit templates, collapsible comment side panel (#1607), routines project radio polish, Copilot logo swap, and minor UI fixes. Conflicts resolved: - router.ts: garnet's home/view + marketplace routes + main's per-project conversationId deep-link field coexist on Route union - ProjectView.tsx: garnet's isPhantomDaemonRunMessage helper + main's isStoppableAssistantMessage helper both kept - ProjectView.run-cleanup.test.tsx: accepted HEAD (garnet's phantom-row regression test); main's three new tests for finalizeActiveAssistantMessagesOnStop / clearStreamingConversationMarker / shouldClearActiveRunRefs are queued as a follow-up TODO inline. |
||
|
|
2976c76fc3
|
test: expand Memory and Routines coverage (#1521)
* test: expand settings and packaged coverage * test: extend memory settings coverage * test: cover routine settings failure states * test: cover routine operation failures * test: fix daemon test typing on CI * test: decouple packaged smoke from orbit bug * test: avoid live memory LLM calls in route tests * test: fix daemon fetch typing in CI * fix: restore preview comment and inspect toggles * test: align manual edit flow with current inspector UX * test: align comment attachment flow with current preview comments UI * fix: probe resolved Codex launch path during detection * fix: remove duplicate board activation helper after rebase * test: update ghost cli detection mock * test: align FileViewer toolbar expectation * ci: move full app tests to extended lane * ci: run app tests by changed scope * ci: cover shared app inputs in test scopes * ci: avoid setup-node cache in windows packaged smoke * test: align extended settings and manual edit flows |
||
|
|
e508fa3fbd
|
test(e2e): Critique Theater Phase 11 activation (un-fixme suite, seeded-project nav, split SSE fixture) (#1483) | ||
|
|
f1d0dac58b |
feat(plugins): enhance plugin registry with new metadata and preview features
- Added new fields to the plugin and metadata types, including `mode`, `taskKind`, `surface`, and `preview`, to improve plugin description and categorization. - Implemented a `previewFrom` function to generate structured previews for plugins based on their metadata, enhancing the user experience. - Introduced a `visualKindFor` function to determine the visual representation of plugins based on their attributes, improving sorting and display logic. - Updated the `entryFromMarketplace` and `officialEntryFromManifest` functions to accommodate the new fields and ensure proper handling of plugin entries. - Created a new documentation file for the plugin system test suite, consolidating testing strategies and acceptance criteria for better clarity. This update significantly enhances the plugin registry's capabilities, providing richer metadata and improved user interactions with plugin previews. |
||
|
|
53997990b7 |
Merge origin/main (post-0.7.0) into reconciled garnet branch
Second-pass merge layering 41+ new commits from origin/main on top of the first reconcile commit. Headline upstream additions absorbed: - 0.7.0 release: redesigned chat bubble user-text styling, neutralised palette, lucide icons, ElevenLabs audio voice option discovery in the prompt composer, analytics tracking (PostHog) wired across home / studio / create surfaces, Prometheus `/api/metrics` endpoint, critique-theater drop-in mount with a settings toggle. - Misc upstream fixes (titlebar padding, release header layout, deck preview chrome, feedback form auto-scroll, conversation-created SSE on routine runs, etc.) Conflict resolutions (12 files, ~22 hunks): - contracts barrel + prompts/system: union of both sides; new analytics exports (`./analytics/events`, `./analytics/public-params`) added alongside garnet's plugin/atom/genui exports. Both ElevenLabs voice fields (audioVoiceOptions/audioVoiceOptionsError, main) and pluginBlock/activeStageBlocks (garnet) preserved on ComposeInput. - daemon/server.ts: Prometheus `/api/metrics` route inserted after garnet's `/api/daemon/shutdown`. main's `createAnalyticsService` call added before the chat-run service init alongside the prior reconcile note about the dropped legacy POST /api/projects body. - App.tsx: handleCreateProject now consumes both garnet's plugin fields (pluginId / appliedPluginSnapshotId / pluginInputs / autoSendFirstMessage) and main's analytics requestId. Tracking fires success + failure paths; PluginLoopHome auto-send sessionStorage flag is preserved. - ProjectView.tsx: the garnet auto-send useEffect coexists with main's `useCritiqueTheaterEnabled()` hook. - ChatComposer.tsx: imports merged (drop now-unused fetchSkills, add analytics provider + tracking + buildVisualAnnotationAttachment). - index.css: main's redesigned `.msg.user .user-text` chat bubble styling wins over garnet's plain text rule; garnet's `.msg-plugin-chip*` rules preserved alongside. - EntryView.tsx: accepted HEAD (garnet wrapper) — consistent with reconcile decision #2. main's added PetRail / TopTab / analytics view tracking is intentionally NOT brought into the wrapper; the follow-up to re-integrate PetRail / image-templates / video-templates into EntryShell still stands and now also covers analytics view-tracking hooks. - daemon/package.json + pnpm-lock: merged dep set (tar + posthog-node + prom-client coexist). - Test fixtures (FileWorkspace.test): kept garnet's plugin-folders describe block intact; main's projectKind="prototype" addition is dropped where it conflicted with garnet's plugin-folder fixture files. Verification: `pnpm install` (after lockfile reconciled), `pnpm typecheck` exits 0 across all workspace packages. Follow-up not done in this commit: - PetRail / image-templates / video-templates / 0.7.0 analytics view-tracking hooks need to be added to EntryShell. - Critique-theater settings toggle UX (added on main) lives in the SettingsDialog hierarchy; the reconcile state preserves the SettingsDialog so this should work without changes, but no end-to-end verification yet. |
||
|
|
d3602be666 |
Merge origin/main into garnet-hemisphere (reconcile)
Merge of `origin/main` (`03ed3960`, 2026-05-13 pre-0.7.0) into the 161-commit garnet-hemisphere line, reconciling the product-vibe-coded plugin/marketplace/EntryShell surfaces from garnet with the routines / skills / live-artifacts feature work landed on main since the fork point. Headline decisions (full rationale + side-by-side screenshots in `specs/change/20260513-garnet-skills-automations/reconcile-result-vs-garnet.md`): - #1 SettingsDialog: keep main's Memory / Skills / External MCP / Connectors / Routines / MCP server nav items even though the top-level /integrations + /automations routes also cover them. Two entries coexist for now; revisit once Track A/B fill in the placeholder content. - #2 EntryView: accept garnet's thin wrapper delegating to EntryShell. Main's PetRail sidebar + image-templates/video-templates tabs are intentionally deferred to a follow-up that re-integrates them into the new EntryShell layout. - #3 /integrations + /automations top-level routes: kept (garnet's product intent). Skills tab is still a "Coming soon" placeholder awaiting Track A; Routines/Schedules/Live-artifacts cards on /automations are still mock awaiting Track B. - #5 DesignFilesPanel: hybrid — main's pagination as primary list, garnet's Plugin folders section preserved between the live-artifacts block and the pagination block. (by-kind sections drop in favour of pagination; plugin-folders rendering stays because it is a garnet-specific product addition.) - #7 server.ts (10 hunks, ~5400 conflict lines): manual hunk-by-hunk merge. Both daemon admin routes + plugin/genui routes (garnet) and routines/memory/skills upgrades (main) preserved. Garnet's inline project route block kept alongside main's `registerProjectRoutes` / `registerProjectUploadRoutes` modular wiring — duplicate route audit is a follow-up. Garnet's POST /api/projects plugin-snapshot resolution + default-scenario fallback is intentionally dropped from the inline body (now handled by registerProjectRoutes) and listed for follow-up re-integration into `project-routes.ts`. Verification (worktree at /Users/elian/Documents/open-design-garnet): - `pnpm typecheck` exits 0 across all workspace packages - daemon (`pnpm tools-dev run web --namespace reconcile-shots`) boots, serves `/api/daemon/status` healthy, and survives a Playwright walkthrough of /integrations / /automations / home / projects / design-systems / plugins / settings dialog - `@open-design/plugin-runtime` package built (was missing dist/ on garnet); without it the daemon's plugins/* imports fail at boot Track A (Skills tab → real SkillsSection) and Track B (Automations cards → real routines / live-artifacts backend) are the two remaining follow-ups blocking the placeholder/mock content from going live. See `spec.md` and `track-skills.md` in the same directory. |
||
|
|
38a5ab69e6
|
feat(daemon): Critique Theater Phase 12 (9 Prometheus metrics + 6 log events + OTel span + Grafana dashboard) (#1485)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* docs: Critique Theater user guide (Phase 14.1)
Seven sections aimed at end users (not contributors):
1. What is Design Jury
2. How it works (the five panelists, auto-converging rounds,
the composite formula)
3. Settings (the M1 toggle and what it does)
4. Reading the score badge
5. Replay surface
6. Troubleshooting (degraded, interrupted, failed)
7. FAQ
The composite formula is documented as
designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
* docs(daemon): critique module AGENTS map (Phase 14.2)
Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.
* docs(web): Theater module AGENTS map (Phase 14.3)
Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.
* feat(daemon): rollout flag resolver (Phase 15.1)
Single decision point every caller consults to know whether the
orchestrator should wire the critique pipeline for a given run.
Priority:
1. Skill-level policy (required wins, opt-out wins inversely)
2. Per-project override from the Settings toggle
3. OD_CRITIQUE_ENABLED env override
4. Rollout phase default
M0 dark-launch false
M1 settings only false (toggle is off until the user flips it)
M2 per-skill true if skill opted in
M3 global default true
OD_CRITIQUE_ROLLOUT_PHASE parser defaults to M0 on unknown input
so a fresh install never surprises a user with the feature on.
10/10 vitest cases green covering every cell of the matrix.
* feat(web): Settings toggle hook for Critique Theater (Phase 15.2)
React hook that reads critiqueTheaterEnabled from the existing
open-design:config localStorage blob and stays in sync via:
- the platform storage event (cross-tab)
- a open-design:critique-theater-toggle CustomEvent (same-tab)
Same-tab event is the one that fires when the Settings panel saves
in the current window: the toggle and every mounted theater update
without a page reload.
setCritiqueTheaterEnabled(next) is the imperative setter the Settings
panel calls. It preserves the rest of the stored config (mode, apiKey,
etc.) and dispatches the same-tab event after the localStorage write.
The web hook reflects what the user toggled; the daemon-side
isCritiqueEnabled is the final routing authority (project override,
env, rollout phase). When they disagree, the daemon wins for backend
gating and the web reflects the toggle state.
6/6 vitest cases green covering first read, stored read, same-tab
event flip, config preservation, corrupted JSON tolerance, and
cross-tab storage event.
* test(web): Phase 15 toggle hook failure-mode coverage (PR #1320)
lefarcen P2 on PR #1320 flagged that the PR body claimed safe
behavior for disabled localStorage, non-object JSON, and missing
CustomEvent shim, but the suite only covered corrupt JSON plus
happy-path storage events. Added four failure-mode tests so the
swallowed errors are not silently traded for a throw in a future
refactor:
1. Returns false on a stored JSON value that parses to an array
(non-object). Catches a regression where the guard treats
anything truthy as a config blob.
2. Returns false on a stored JSON value of literal 'null'.
typeof null === 'object' in JS, so the guard has to check null
explicitly; this test pins that check.
3. Returns false when localStorage.getItem throws (private mode /
disabled storage / SecurityError). The hook must swallow and
return false so the rest of the app keeps rendering.
4. setCritiqueTheaterEnabled still dispatches the same-tab
CustomEvent when localStorage.setItem throws (quota exceeded /
disabled storage). The dispatch path is the in-session
broadcast that keeps every mounted hook coherent even when
persistence is unavailable; verified by mounting two probes
and asserting both flip after the setter is called with a
throwing setItem.
10/10 vitest cases green (6 existing + 4 new).
* fix(web): honor CustomEvent payload in toggle hook listener (PR #1320)
Both Siri-Ray (blocking) and lefarcen (P2 new) caught the same
real bug in the failure-mode test I added in
|
||
|
|
5172e37217 |
Merge origin/main into release/v0.7.0 to prepare merge-back PR
Resolves 7 conflicts via hybrid strategy: - apps/web/src/components/EntryView.tsx: take main (Discord+X pills are forward feature) - apps/web/src/components/Icon.tsx: take main (switch-case refactor) - apps/web/src/components/NewProjectPanel.tsx: take release (preserve #1514 dropdown UX validated in 0.7.0 acceptance) - apps/web/src/index.css: take main (project-target-platforms / instructions chip styles) - apps/web/tests/components/FileViewer.inspect-empty-hint.test.tsx: accept main's deletion - nix/package-daemon.nix, nix/package-web.nix: take main pnpmDepsHash Non-conflicting hunks from #1519 (AppChromeHeader), #1428 (PostHog analytics call sites), and #1540 (release light background) are preserved via auto-merge. |
||
|
|
026e13b347
|
fix(web): restore release header layout (#1519)
* fix(web): restore release header layout * fix(web): disambiguate entry settings button Generated-By: looper 0.7.4 (runner=fixer, agent=codex) |
||
|
|
6736310a01
|
Implement manual edit inspector (#1448)
* feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal * Polish manual edit inspector * Implement manual edit inspector * Fix manual edit review regressions * Fix FileViewer CI regressions * Fix remaining manual edit review issues * Flush manual edit styles before draw exit * Restore Critique Theater styles * Accept pixel line-height manual edits --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> |
||
|
|
e2f409579d
|
docs: Critique Theater Phase 14 (user guide + 2 AGENTS module maps) (#1319)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* docs: Critique Theater user guide (Phase 14.1)
Seven sections aimed at end users (not contributors):
1. What is Design Jury
2. How it works (the five panelists, auto-converging rounds,
the composite formula)
3. Settings (the M1 toggle and what it does)
4. Reading the score badge
5. Replay surface
6. Troubleshooting (degraded, interrupted, failed)
7. FAQ
The composite formula is documented as
designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
* docs(daemon): critique module AGENTS map (Phase 14.2)
Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.
* docs(web): Theater module AGENTS map (Phase 14.3)
Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.
* docs: tighten Phase 14 reasoning from lefarcen review (PR #1319)
Four content gaps lefarcen flagged in the Phase 14 docs review,
addressed inline rather than deferred. The fifth item (scope-drift
between 'docs only' PR body and the cumulative stacked diff) is
handled by rewriting the PR body, not the docs.
1. Round exit conditions (lefarcen P2-1).
docs/critique-theater.md §2 'Auto-converging rounds' now lists
the five conditions that stop a run (threshold reached, round
budget exhausted, per-round timeout, total timeout, user
interrupt) with their default values. A user debugging a run
that stopped at round 1 with composite 5.4 can read this list
and find the matching cause without spelunking the orchestrator.
2. Prior-art comparison (lefarcen P2-2).
New §1.5 'Why an in-CLI panel and not a third-party design lint'
pre-answers the 'why not Figma lint / Adobe checker / Material
You conformance' question. Three differences: rule engines vs
generative reviewers, post-hoc vs in-loop, external service vs
same-CLI-session.
3. Composite formula rationale (lefarcen P2-4).
§2 now explains why each weight is set the way it is: critic
gates correctness so it gets 0.4; brand / a11y / copy are
secondary quality dimensions at 0.2 each; designer is at 0.0
in v1 because aesthetic preference is not a ship gate. The slot
stays in the schema so notes flow into the transcript and a v2
config release can bump the weight without a wire-shape change.
4. v2 cast-config ownership (lefarcen P2-3).
Both AGENTS.md files (daemon + web) now declare a 'Designer
weight frozen at 0.0 until v2 cast config' invariant. The
daemon side calls out where the SKILL.md frontmatter resolver
lands (apps/daemon/src/critique/config.ts); the web side calls
out where the Settings surface lands (apps/web/src/components/
Settings/). A contributor reading either AGENTS.md before
implementing v2 sees which module to touch first.
* docs(web): mirror the Designer-weight invariant in Theater AGENTS.md (PR #1319)
lefarcen P1 follow-up on PR #1319: the daemon AGENTS.md already
declares 'Designer weight is frozen at 0.0 until v2 cast config
lands' as an invariant, but the web AGENTS.md's parallel bullet
led with 'Composite weights are read-only on the web side' which
buried the Designer-specific constraint. A web contributor
reading that bullet would not realise the v1 weight distribution
is wire-shape (changing it mid-v1 invalidates persisted
critique_runs composite values).
Rewrote the bullet to lead with the same 'Designer weight is
frozen at 0.0 until v2 cast config lands' phrasing the daemon
side uses, and added an explicit cross-link to the daemon
AGENTS.md so the two halves of the invariant read as one rule.
Web-side specifics retained: ScoreTicker / TheaterCollapsed read
composite off the wire (no client recompute), v2 lands as a
Settings surface at apps/web/src/components/Settings/, do not
add a 'weights' prop to any component in this directory until
the contracts package carries the v2 cast type.
* docs: replace deferred metrics endpoint reference + refresh Theater module map (PR #1319)
Two carryover items lefarcen flagged across the PR #1319 + #1320
reviews.
1. docs/critique-theater.md was sending users to
/api/metrics/critique as the conformance-status check on
malformed_block, but the Phase 12 metrics endpoint is
explicitly deferred until after orchestrator wiring lands.
Replaced the link with the pnpm conformance-harness command
that DOES exist today (pnpm --filter @open-design/daemon
vitest run tests/critique-conformance.test.ts) and noted
that the dashboard surfaces this status as a series once
Phase 12 ships.
2. apps/web/src/components/Theater/AGENTS.md module map was
stale after Phase 15: the index.ts row said 'only two hooks
are exported' but the barrel now exports
useCritiqueTheaterEnabled too (plus the setCritiqueTheaterEnabled
setter). Updated the row to list all three hooks + the
setter + the reducer-derived contract types, and added a
new row for hooks/useCritiqueTheaterEnabled.ts in the file
table so a web contributor scanning the table sees the new
hook without inferring it from the index.ts blurb.
* fix(web): restore wait-for-daemon-ack pattern on Theater interrupt
Same regression as flagged on PR #1316 post-main-merge: the
optimistic local dispatch fired before the POST resolved, so a
daemon 404 / 409 still terminalized the UI and the real SSE
terminal event got ignored by the sticky interrupted phase.
Snapshot runId / bestRound / composite at click time, dispatch
interrupted only on res.ok, clear interruptPending on rejection or
non-2xx so the user can retry. Tests cover rejection + 404 leaving
the run on the live stage; the 204 path waits for the ack.
---------
Co-authored-by: Nagendhra <nagendhra405@gmail.com>
|
||
|
|
e2952acd05 |
Revert "fix(web): restore consistent app header layout (#1432)"
This reverts commit
|
||
|
|
c36609c47d |
feat(daemon, web): implement plugin sharing project creation and enhance CLI functionality
- Added new flags for conversation, message, agent, and model in the CLI to support enhanced plugin sharing features. - Introduced a new API endpoint for creating share projects for plugins, allowing users to publish to GitHub or contribute to Open Design. - Updated the UI components to facilitate the new sharing functionalities, including prompts for user input during the sharing process. - Enhanced the project management system to handle new plugin share actions, improving user interaction and experience. - Added tests to ensure the reliability of the new sharing features and their integration within the existing plugin management system. This update significantly enhances the plugin ecosystem by enabling users to share their creations more effectively and streamline collaboration. |
||
|
|
3d3119333c
|
fix(web): restore consistent app header layout (#1432)
* docs: add NotebookLM GitHub export script (#1062) * docs: add NotebookLM GitHub export script * fix: make NotebookLM export TOC anchors work * fix: escape TOC link text markdown chars * fix: include merged PRs when exporting --prs all * fix: allow --prs merged mode * fix: treat --limit as total export budget * fix: avoid starving buckets under global --limit * fix: support --issues none and handle repos w/ issues disabled * fix: avoid underfilling export when buckets empty * fix: keep disabled-issues fallback quiet * fix: silence disabled issues fallback * fix: satisfy script typecheck * prevent duplicate saves and add template deletion (#1294) * prevent duplicate template entries on repeated save * add delete button to saved template list Templates can now be removed from the template picker via a hover x button, calling the existing DELETE /api/templates/:id endpoint. * add missing onDeleteTemplate prop in test fixtures * add template deletion flow test for NewProjectPanel * reject template names longer than 100 characters * preserve original createdAt on template update * feat: add FAQ page skill (#1162) * fix: set writable OD_DATA_DIR default for nix run Fixes #1157 When running via 'nix run github:nexu-io/open-design', the daemon attempted to create runtime state under the Nix store package path: /nix/store/.../lib/open-design/.od/projects The Nix store is read-only at runtime, causing startup to fail with ENOENT when mkdir() tried to create the projects directory. This commit updates the nix run wrapper to export OD_DATA_DIR with a writable default ($HOME/.od) when the variable is unset. Users can still override it by setting OD_DATA_DIR before running. The Home Manager and NixOS modules already set OD_DATA_DIR, so they are unaffected by this change. * feat: add FAQ page skill Add a new skill for generating Frequently Asked Questions pages with: - Collapsible accordion sections for Q&A pairs - Real-time search functionality - Category filtering (Billing, Account, Technical, General) - Smooth animations and transitions - Keyboard navigation support - Mobile-friendly responsive design - Semantic HTML with proper ARIA attributes The skill includes: - SKILL.md with triggers, workflow, and output contract - example.html demonstrating a complete FAQ page with 12 questions Use cases: help centers, support pages, product documentation * fix: address PR review feedback for FAQ page skill - Fix craft slugs: use accessibility-baseline and state-coverage instead of non-existent slugs - Remove overly broad 'questions and answers' trigger - Add edge case handling for insufficient/excessive FAQs - Remove search highlighting requirement (XSS risk) - Update self-check to reflect filtering instead of highlighting Addresses review comments from @lefarcen and @chatgpt-codex-connector * feat: add localized copy for faq-page skill Add German, French, and Russian translations for the FAQ page skill example prompt to fix validation test failure. - DE: FAQ-Seite mit Akkordeon-Abschnitten, Suchfunktion und Kategoriefilterung - FR: Page FAQ avec sections accordéon, recherche et filtrage par catégorie - RU: Страница FAQ со складными секциями-аккордеонами, поиском и фильтрацией * fix: escape apostrophe in French translation Use double quotes to avoid syntax error with d'auth * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins (#1110) * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins fnm legacy installations use ~/.fnm/node-versions. Closes #1102 * fix: remove stray .fnm token from type declaration * docs: add Windows troubleshooting guide (#478) (#1170) * docs: add Windows troubleshooting guide (#478) Add docs/windows-troubleshooting.md with step-by-step fixes for the most common native-Windows setup errors: - Node 24 / nvm-windows gotchas (fake nvm file in System32) - pnpm not found after installation - Build scripts blocked by pnpm 10 (better-sqlite3, sharp) - Visual Studio / gyp build errors - Starting the dev server - Optional OpenCode CLI setup Also update CONTRIBUTING.md and QUICKSTART.md to link to the new guide instead of the vague "file an issue if it doesn't" note. * docs: fix Windows guide command accuracy (#1170) Address all 6 inline review comments from lefarcen: - Pin npm-global pnpm install to @10.33.2 (matches packageManager field) - Use where.exe instead of bare where (PowerShell alias conflict) - Fix OpenCode package: opencode-ai (not opencode), binary is opencode - Add EPERM fallback note for corepack enable on protected installs - Add Python check for gyp ERR! find Python - Expand diagnostic checklist with corepack, python, execution policy Also remove redundant corepack pnpm --version from checklist. * feat(daemon): inject compiled design-system tokens + fixture into prompts (#1385) * feat(daemon): inject compiled design-system tokens + fixture into prompts Follow-up to #1231. The prior PR landed the structured form of two brands (`default` + `kami`) and codified the schema; this PR teaches the daemon to actually consume those files when assembling the system prompt, so agents stop having to re-derive token names from DESIGN.md prose every turn. Gated behind `OD_DESIGN_TOKEN_CHANNEL=1` for the smoke-test phase — flag-off keeps the daemon byte-equivalent to today's behavior, flag-on appends two new prompt blocks (the brand's `tokens.css` :root contract and its `components.html` reference fixture) right after the existing DESIGN.md block. Brands without those sibling files (every brand except `default` and `kami` today) skip silently in either mode. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): only swallow ENOENT/ENOTDIR in readFileOptional, rethrow rest Reviewer feedback (nettee, #1385). The prior catch-all hid permission errors, EISDIR, and broken packaged-resource paths behind the same "undefined = absent" branch the legacy ~138-brand fallback uses, which would let `OD_DESIGN_TOKEN_CHANNEL=1` silently degrade to the DESIGN.md-only prompt while reporting success. That corrupts the exact signal the smoke-test rollout depends on. Now `readFileOptional` only returns undefined for ENOENT / ENOTDIR (real "file does not exist" cases) and rethrows everything else. Added a focused test that plants a directory at the tokens.css path to exercise the EISDIR branch, plus a partial-presence regression test to confirm the stricter contract preserves the legacy fallback. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16> Co-authored-by: Cursor <cursoragent@cursor.com> * feat(daemon): make connection-test timeouts configurable (#1222) * feat(daemon): make connection-test timeouts configurable Provider and agent connection tests had hardcoded 12s / 45s budgets, which are too tight for slow networks or distant providers (the user sees "timeout" in Settings with no way to extend the budget). - Add OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS (default 12_000) - Add OD_CONNECTION_TEST_AGENT_TIMEOUT_MS (default 45_000) - Invalid values (non-numeric, zero, negative, fractional) emit a console.warn and fall back to the default, so a typo in the env never silently disables the safety timeout. - Export resolveConnectionTestTimeoutMs for unit testing; cover the three resolution paths (fallback / honored override / invalid). 41 connection-test tests pass (+3 new), full daemon suite 1170/1170. * fix(daemon): reject connection-test timeout overrides above Node's setTimeout maximum Node's `setTimeout` silently clamps any delay above `2^31-1` ms (2_147_483_647) to ~1 ms with a TimeoutOverflowWarning. The previous `Number.isInteger(n) && n >= 1` check accepted oversized values unchanged and passed them straight to `setTimeout`, so an override that *intended* to raise the budget — e.g. `OD_CONNECTION_TEST_AGENT_TIMEOUT_MS=3000000000` — instead caused every connection test to fail almost immediately. The safety timeout was effectively disarmed. Add `MAX_CONNECTION_TEST_TIMEOUT_MS = 2_147_483_647` and switch the guard to `Number.isSafeInteger(n) && n >= 1 && n <= MAX...`. The boundary value is still accepted; one millisecond past it falls back with a warn. Regression test exercises `3_000_000_000`, `2_147_483_647`, and `2_147_483_648`. Addresses #1222 review feedback from @chatgpt-codex-connector, @mrcfps, and @lefarcen. * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN SSRF bypass) (#1122) * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN bypass) new URL('http://192.168.1.5./').hostname returns '192.168.1.5.' — the trailing dot is the RFC 1034 absolute-FQDN form and resolves identically to '192.168.1.5'. parseIpv4 fails on the dotted form, so 169.254.169.254. slips past the metadata-service block, 192.168.1.5. slips past the LAN block, and localhost. slips past the loopback identification. Strip trailing dots in normalizeBracketedIpv6 so all downstream checks (isLoopbackApiHost, isBlockedExternalApiHostname, isBlockedIpv4, IPv6 range tests) see the canonical form. Adds 6 vitest cases covering loopback FQDN forms (localhost., foo.localhost., 127.0.0.1.) and SSRF FQDN bypasses (169.254.169.254., 192.168.1.5., 10.0.0.5.). Refs nexu-io/open-design#1119 review feedback (P2 from @lefarcen). * test(connectionTest): tighten trailing-dot coverage per #1122 review Two issues from #1122 review: 1. (P2 from @mrcfps + codex bot) The original `foo.localhost.` case asserted error===undefined on validateBaseUrl, which only proves the URL passed validation — not that the host is identified as loopback. Replaced with direct isLoopbackApiHost(...) assertions on the actual loopback FQDN forms (localhost., 127.0.0.1., 127.0.0.5.) so the test exercises the loopback path the comment claims. 2. (P3 from @lefarcen) Original blocked-FQDN tests covered only 3 of 7 ranges that isBlockedIpv4 handles. Added a dedicated case per range (0.0.0.0/8, 10/8, 100.64/10, 169.254/16, 172.16/12, 192.168/16, multicast >=224) so future regressions in normalizeBracketedIpv6 surface against the full coverage. * docs: drop misleading foo.localhost./endsWith claim in normalizer comment @lefarcen review feedback: isLoopbackApiHost only accepts exact 'localhost', '::1', loopback IPv4, and mapped loopback IPv4 — there's no subdomain or endsWith handling, so referencing 'foo.localhost.' overstates what the trailing-dot strip enables. Rewrite the comment to match actual call sites (isLoopbackApiHost equality + isBlockedIpv4 numeric parse). * feat(daemon): export self-contained HTML via /export/*?inline=1 endpoint (#1312) * test(daemon): add Red unit tests for inlineRelativeAssets helper 14 cases pinning the behavior contract for the upcoming apps/daemon/src/inline-assets.ts helper: - link/script inlining with verbatim body preservation - non-src script attrs preserved (type=module, defer, crossorigin) - relative path resolution (root + nested + deep-nested owners) - self-closing and single-quoted attr forms - negative cases: missing rel, rel=preload, absolute/data/blob/leading-slash - escaping: </style and </script inside body - null-fileReader graceful degradation - duplicate identical tags fully replaced (diverges from apps/web/src/components/FileViewer.tsx:5313's first-match-only; locked decision per plan §3.3) - HTML-escaped data-od-inline-asset attr Tests intentionally Red — module ../src/inline-assets.js does not yet exist. Phase B-G of plan declarative-roaming-gosling.md will turn them green by porting FileViewer.tsx:5248-5354 server-side. Refs nexu-io/open-design#368. * feat(daemon): port inlineRelativeAssets server-side for export endpoint Adds apps/daemon/src/inline-assets.ts — a pure helper that takes (html, ownerFileName, fileReader closure) and returns the HTML with every relative <link rel=stylesheet> and <script src> contents inlined into <style data-od-inline-asset="…">/<script>…</script> blocks. The fileReader closure keeps the helper free of fs/Express coupling so the route handler owns the filesystem boundary. Port source: apps/web/src/components/FileViewer.tsx:5248-5354 — five functions (inlineRelativeAssets, resolveProjectRelativePath, baseDirFor, readHtmlAttr, escapeHtmlAttr). The fetch hop becomes the fileReader closure; replace-all replaces first-match-only per locked design decision §3.3 (inline comment in inline-assets.ts cites the divergence from FileViewer.tsx:5313 and notes the web inline path is on a deprecation track since PR #384 made URL-load the default). Phase B-G of plan declarative-roaming-gosling.md. All 14 unit cases from the Red commit ( |
||
|
|
0220124e0f
|
test: add Memory and Routines coverage (#1400)
* test: align extended Playwright coverage with current UI behavior * test: address extended suite review feedback * test: fix Codex fallback config hydration in e2e * test: add Memory and Routines coverage * test: fix Memory and Routines component test typing * test: include Memory and Routines e2e in extended suite |