mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
439 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
38b59fa4d9
|
fix(web): confirm before deleting saved template in New Project (#2330)
* fix(web): confirm before deleting saved template in New Project EntryShell and NewProjectModal were dropping the onDeleteTemplate prop, so the trash button in the template picker was inert; thread it through and gate it behind an alertdialog confirm. Fixes #2237 * fix(web/i18n): translate deleteTemplate keys across 18 locales The initial commit shipped English placeholders for the 4 new newproj.deleteTemplate* keys; replace with localized strings matching each locale's existing designs.deleteConfirm style. * fix(web): gate confirm-dialog backdrop on in-flight delete + style error message Backdrop now no-ops while `deleting` is true (matching the Cancel and Delete button gating), preventing a false-cancellation race where the backgrounded DELETE keeps running and leaks a stale `deleteError=true` into the next dialog open; also adds the missing `.modal-confirm-error` rule so the inline failure copy renders as a destructive state instead of default body styling. |
||
|
|
c530d163f8
|
feat(web): "Resume conversation in new chat" UI — #462 Commit B (companion to #1718) (#2264)
* feat(contracts): add handoff request/response DTOs
Adds HandoffRequest, HandoffResponse, and HANDOFF_SCHEMA_VERSION for
the upcoming POST /api/projects/:id/handoff synthesis endpoint. Mirrors
the finalize.ts subpath pattern (package.json#exports + esbuild entry +
index re-export) so daemon and web can import
@open-design/contracts/api/handoff.
Refs nexu-io/open-design#462.
* feat(daemon): add handoff synthesis pipeline (buildHandoffPrompt + synthesizeHandoffPrompt)
Adds `apps/daemon/src/handoff-design.ts` exposing the resume-conversation
synthesis primitives the upcoming `POST /api/projects/:id/handoff` route will
call into.
- `buildHandoffPrompt({ projectId, transcriptJsonl, transcriptMessageCount,
now })` returns the system + user prompts. System prompt asks Claude to
emit a structured Markdown body with Context / Decisions made / Open
questions / Current focus / Provenance, with Provenance bullets explicitly
flat (no Markdown emphasis on labels) to preempt the PR #1584 round-2
parser bug.
- `synthesizeHandoffPrompt(db, projectsRoot, projectId, options)` reuses the
existing finalize-design pipeline pieces: `exportProjectTranscript` →
`truncateTranscriptForPrompt` → `buildHandoffPrompt` →
`callAnthropicWithRetry` → `extractDesignMd`, but without the lockfile,
disk write, design-system, or artifact-resolution paths.
- Promotes `DEFAULT_TIMEOUT_MS` in finalize-design.ts to `export const` so
handoff shares the same 120s upstream-call bound.
Refs nexu-io/open-design#462.
* feat(daemon): wire POST /api/projects/:id/handoff route
Adds the handoff HTTP route and registers it in server.ts. Validation
block + error-mapping shape mirror registerFinalizeRoutes (BYOK payload,
upstream-error → ApiErrorCode mapping, redactSecrets on the raw upstream
body). Handoff has no lockfile, so the CONFLICT branch is omitted.
`res.on('close')` is wired to flip an AbortController whose signal is
threaded into synthesizeHandoffPrompt, so a UI-side cancel actually
aborts the daemon-side Anthropic call rather than letting it keep
running after the client walks away (mirrors the PR #974 fix for
finalize).
- `apps/daemon/src/handoff-routes.ts` — new, exports registerHandoffRoutes
+ RegisterHandoffRoutesDeps.
- `apps/daemon/src/server-context.ts` — adds handoff slot to ServerContext.
- `apps/daemon/src/route-context-contract.ts` — adds RegisterHandoffRoutesDeps
to the compile-time coverage assertion.
- `apps/daemon/src/server.ts` — imports synthesizeHandoffPrompt +
registerHandoffRoutes, builds handoffDeps, registers the route next
to finalize.
- `apps/daemon/tests/handoff-route.test.ts` — 12 HTTP-layer tests:
validation (400/403/404), happy path, upstream error mapping
(401/429/502/502 non-JSON), api-key redaction.
- `apps/daemon/tests/handoff-route-abort.test.ts` — client-disconnect
aborts the daemon-side controller.
Refs nexu-io/open-design#462.
* fix(daemon): map TranscriptExportLockedError to 409 CONFLICT on handoff route
`exportProjectTranscript` acquires a per-project `.transcript.lock`
internally (apps/daemon/src/transcript-export.ts:131-163) and throws
`TranscriptExportLockedError` on EEXIST. Concurrent handoff requests —
or a handoff that races `/api/projects/:id/finalize/anthropic` — lost
that lock and surfaced as 500 INTERNAL_ERROR through the route's
generic catch.
- `apps/daemon/src/handoff-routes.ts` — catch `TranscriptExportLockedError`
and return `409 CONFLICT` ahead of the generic 500 branch, mirroring
the existing `FinalizePackageLockedError → 409 CONFLICT` mapping at
`apps/daemon/src/import-export-routes.ts:603-605`.
- `apps/daemon/src/server.ts` — thread `TranscriptExportLockedError`
through `handoffDeps` so the route can match without a direct import.
- `apps/daemon/src/handoff-design.ts` — correct the module header
comment that incorrectly claimed "no lockfile (concurrent handoff
calls are safe)" — handoff does not add its own lock, but it does
transitively acquire `.transcript.lock` via the transcript-export
call.
- `apps/daemon/tests/handoff-route.test.ts` — regression test that
pre-acquires `.transcript.lock` on disk via `fs.openSync(lockPath, 'wx')`
before firing a handoff request, asserts 409 CONFLICT.
Refs nexu-io/open-design#462 — addresses @nettee's blocking review on
PR #1718 (comment 3242251338).
* fix(daemon): keep handoff request timeout armed through the response body read
`synthesizeHandoffPrompt` cleared the upstream-call timeout in a `finally`
that ran as soon as `callAnthropicWithRetry` returned. But `fetch()`
resolves once the upstream sends *headers* — so the subsequent
`await response.json()` body read ran with no timeout. A response that
sends headers and then stalls its body could hang `/api/projects/:id/handoff`
indefinitely instead of failing.
- `apps/daemon/src/handoff-design.ts` — move `clearTimeout(timeoutId)` into a
single outer `finally` spanning both the call and the `response.json()`
body parse, so the timeout stays armed until the body is fully consumed.
- `apps/daemon/src/handoff-design.ts` — the body-parse catch now re-throws
`AbortError` as-is, mirroring the call-phase catch. Without this a
body-phase timeout would surface as `502` "non-JSON body"; re-throwing
lets the route map it to the intended `503` "handoff timed out"
(`handoff-routes.ts:122-124`).
- `apps/daemon/tests/handoff-design.test.ts` — regression test: a `fetchImpl`
returning a `Response` whose body never closes after headers, raced
against a 500ms deadline, asserts the call aborts (not hangs) and rejects
with `AbortError`.
Refs nexu-io/open-design#462 — addresses @nettee's round-2 blocking review
on PR #1718 (`handoff-design.ts:196`).
* fix(daemon): map upstream 400 to 400 BAD_REQUEST on handoff route
`callAnthropicWithRetry` preserves a non-retryable upstream status, so an
Anthropic HTTP 400 (`invalid_request_error` — unknown model, invalid
maxTokens, malformed body) reached the route's `FinalizeUpstreamError`
branch and fell through to `502 UPSTREAM_UNAVAILABLE`. That reported
deterministic caller input as a transient server outage, inviting
pointless retries and hiding which field was wrong.
- `apps/daemon/src/handoff-routes.ts` — special-case `err.status === 400`
to `400 BAD_REQUEST` with the redacted upstream detail, ahead of the
generic 502. Also refresh the route docblock: it claimed the 409 branch
was omitted (stale since the R1 TranscriptExportLockedError fix) and
that error mapping fully mirrors finalize (now diverges on 400).
- `apps/daemon/tests/handoff-route.test.ts` — route test driving an
Anthropic `400 invalid_request_error`: asserts 400 BAD_REQUEST, the
upstream detail is surfaced, and an echoed key is redacted.
- `packages/contracts/tests/package-runtime.test.ts` — import
`@open-design/contracts/api/handoff` through the package `exports` map
and assert `HANDOFF_SCHEMA_VERSION`, covering the built publish surface
(esbuild entry + exports map + root re-export) that the source-only
`handoff-contract.test.ts` does not exercise.
Refs nexu-io/open-design#462 — addresses @nettee's round-3 blocking
review on PR #1718.
* fix(daemon): await the now-async external base-URL validator on handoff route
Main's #1176 (`9a64fccd`) made `validateExternalApiBaseUrl` DNS-aware and
asynchronous (`validateBaseUrlResolved`) and updated the proxy and finalize
callers to `await` it. The handoff route — added on this branch in parallel,
against the old synchronous validator — still called it without `await`, so
`validated` was a Promise: `validated.error` / `validated.forbidden` were
`undefined`, the SSRF / malformed-URL guard silently no-opped, and a bad
`baseUrl` fell through to the upstream call and surfaced as 502.
A semantic merge break — no textual conflict, green on the branch in
isolation, red once CI re-merged latest main.
- `apps/daemon/src/handoff-routes.ts` — `await validateExternalApiBaseUrl(...)`,
mirroring the finalize route (`import-export-routes.ts:561`). The handler
is already `async`.
The existing `handoff-route.test.ts` cases "400 BAD_REQUEST when baseUrl is
not a valid URL" and "403 FORBIDDEN when baseUrl points at a private internal
IP" already encode this — red against branch + latest main, green now.
Refs nexu-io/open-design#462 — PR #1718 CI fix.
* chore(daemon): list handoff in the assertServerContextSatisfiesRoutes literal
The `assertServerContextSatisfiesRoutes({...})` call in `server.ts` enumerates
every route registrar's deps but omitted `handoff`. Adding `handoff: handoffDeps`
makes the literal complete and consistent with the other route deps.
This was not a typecheck break: route-dep coverage is guaranteed by the
`Assert<ServerContext extends AllRegisteredRouteDeps>` type in
`route-context-contract.ts` — and `AllRegisteredRouteDeps` already includes
`RegisterHandoffRoutesDeps` — not by this assertion-call literal. The literal
has omitted `handoff` since this branch's first push (`806db576`) through green
CI throughout; `tsc -p tsconfig.json --noEmit` is clean before and after.
Refs nexu-io/open-design#462 — addresses @nettee's round-4 review note on PR #1718.
* feat(web): add "Resume conversation in new chat" action (#462)
Adds a Resume control to the chat header, next to "New conversation".
Clicking it synthesizes a handoff prompt from the current transcript
via POST /api/projects/:id/handoff, opens a fresh conversation, and
auto-sends the synthesized prompt as its first user message — so a
drifted session resumes without the user replaying context by hand.
The old conversation is preserved.
- synthesizeHandoff() web-state wrapper in apps/web/src/state/projects.ts
- resume-conversation icon button in ChatPane (onResumeConversation /
resumeConversationDisabled props)
- handleResumeConversation + pendingResumeRef + auto-send effect in
ProjectView; effect gates on messagesConversationId so the prompt
cannot fire before the new conversation's message read settles
- chat.resumeConversation i18n key across all 19 locales
Commit B of #462; Commit A is the daemon endpoint (PR #1718). This
branch is stacked on feat/handoff-endpoint so the web code resolves
@open-design/contracts/api/handoff.
* fix(daemon): scope handoff to one conversation + reject empty transcripts (#462)
Addresses the review on #1718 and #2264:
- mrcfps (#2264): the handoff endpoint exported the whole project's
transcript, so a multi-conversation project blended unrelated chats
into the synthesized prompt. HandoffRequest now carries a required
conversationId; the route validates it belongs to the project
(404 CONVERSATION_NOT_FOUND), and exportProjectTranscript takes an
optional conversationId filter so only that conversation is exported.
- nettee (#1718): a zero-message conversation still called Anthropic and
fabricated a handoff. synthesizeHandoffPrompt now throws
EmptyTranscriptError on messageCount === 0; the route maps it to
400 EMPTY_TRANSCRIPT before any BYOK tokens are spent.
HANDOFF_SCHEMA_VERSION bumped to 2 (conversationId is a new required
request field). Regression tests: a two-conversation scoping test, an
empty-conversation route + pipeline test, and a transcript-export
conversationId-filter unit test.
* feat(web): send conversationId with the resume handoff request (#462)
Follows the handoff endpoint becoming conversation-scoped. The resume
flow now passes the active conversationId to POST /handoff so the
synthesized prompt summarizes only the conversation being resumed.
handleResumeConversation bails when there is no active conversation;
synthesizeHandoff and the resume tests carry the new field.
* feat(daemon): add `od project handoff` CLI + register handoff error codes (#462)
Addresses the second-round review on #1718 and #2264:
- mrcfps (#2264): per AGENTS.md "Capability exposure (UI/CLI dual-track)",
a user-facing capability must be reachable through the `od` CLI, not
only the web UI. Adds `od project handoff <id> --conversation <id>
--api-key <key> --model <model> [--base-url] [--max-tokens] [--json]`,
driving the same POST /api/projects/:id/handoff endpoint. The logic
lives in a testable handoff-cli.ts sibling module (mirrors
artifacts-cli.ts) so cli.ts's import-time dispatch stays out of tests.
- nettee (#1718): the route emitted CONVERSATION_NOT_FOUND and
EMPTY_TRANSCRIPT, which were absent from the shared API_ERROR_CODES
union. Both are now registered in packages/contracts/src/errors.ts,
with a contract test pinning them so the route and contract cannot
drift again.
A CLI contract test covers the conversation-scoped request shape,
--json output, flag validation, and daemon-error surfacing.
* fix(daemon): fail `od project handoff` on a malformed 2xx response (#462)
Addresses nettee's review on #1718: runProjectHandoff treated any 2xx
response as success, so a broken daemon/proxy 200 with malformed or
shape-invalid JSON would print `undefined` (or `{}` under --json) and
still exit 0 — breaking the fail-fast contract scripts rely on. It now
validates the body is a well-formed HandoffResponse via an
isHandoffResponse type guard and fails fast otherwise. Regression tests
cover a shape-invalid and an unparseable 200 body.
* feat(web): surface the daemon's classified handoff error in the resume toast (#462)
Addresses mrcfps's non-blocking note on #2264: synthesizeHandoff returned
null for every non-2xx response, so RATE_LIMITED, EMPTY_TRANSCRIPT, and an
upstream 400 with provider detail all collapsed into one generic "check
your API key" toast — even though handoff-routes.ts had already classified
and sanitized them.
synthesizeHandoff now returns the daemon's structured `{ error }` on a
classified failure; `null` stays reserved for a transport failure or an
unparseable body. handleResumeConversation surfaces error.message plus
redacted details for the `{ error }` case, and a distinct
daemon-unreachable message for null.
* fix(web): omit empty baseUrl from the resume handoff request (#462)
Addresses mrcfps's review on #2264: the default Anthropic config
normalizes baseUrl to '' (config.ts), and the handoff route 400s an
explicit empty baseUrl — so the Resume action failed before synthesis
for every user who never set a custom base URL.
handleResumeConversation now forwards baseUrl only when config.baseUrl
is a non-empty string, matching the contract's optional-field semantics.
Tests: the default-config path asserts baseUrl is absent from the
request, and a new case covers a custom baseUrl being forwarded.
* refactor(daemon): dispatch `od project handoff` before the generic project parser (#462)
Addresses nettee's non-blocking note on #1718: runProject ran the shared
parseFlags(PROJECT_*) before reaching the handoff switch case, so a
malformed `od project handoff` invocation (`--unknown`, `--max-tokens`
with no value) threw out of the generic parser instead of hitting
handoff-cli's structured fail() — the entrypoint behaved differently
from the unit-tested runProjectHandoff helper.
The handoff sub now short-circuits before parseFlags / projectDaemonUrl,
so `od project handoff` runs exactly runProjectHandoff with no
intervening parsing. handoff-cli.test.ts gains unknown-flag and
missing-value cases covering the structured fail path.
---------
Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
|
||
|
|
204599a7ae
|
feat(analytics): ship PostHog v2 event schema (#2285)
* feat(analytics): ship PostHog v2 event schema
Aligns the PostHog wire format with the product team's v2 tracking
spec (Open Design 埋点文档 2.0). The previous v1 catalogue defined a
flat per-page event name (home_view / studio_click / settings_view…);
v2 collapses everything to four core events identified through the
page_name + area + element triplet so dashboards can group by surface
without owning a separate event per page.
Key changes
- packages/contracts/src/analytics: collapse to page_view / ui_click /
surface_view / *_result event names; bump EVENT_SCHEMA_VERSION to 2;
rename the wire field anonymous_id → device_id (value unchanged);
promote the configure-state triplet (has_available_configure_cli /
configure_type / configure_availability) to a global PostHog register
so every event inherits it without per-helper boilerplate.
- apps/web/src/analytics: rewrite the 43 trackXxx helpers behind the
new typed catalogue; opt out of PostHog's built-in UA bot filter so
legitimate embedded webviews, fingerprinted browsers, and the
Playwright-based e2e runs ingest captures (the Privacy → "Share
usage data" toggle remains the single consent gate).
- apps/web components: wire P0/P1/P2 click + view + result surfaces
end-to-end — left nav, toolbar, home chat composer, recent projects,
new project modal, plugins / design systems / integrations /
automations pages, file manager, artifact toolbar/header/share popup,
feedback panel, settings sidebar / language / appearance /
notifications / pets / privacy / connectors. Fixes the v1 feedback
bug where action=clear_feedback_rating shipped rating=null instead of
the rating being cleared.
- apps/daemon: extend run_created / run_finished with the v2 context
(entry_from / project_kind / target_platforms / fidelity /
connectors / etc.), add explicit error_code classification on
result=failed (run.errorCode → AGENT_SIGNAL_* → AGENT_EXIT_* →
AGENT_TERMINATED_UNKNOWN), and read device_id from the new
x-od-analytics-device-id header. Also moves the run_created /
run_finished emission to the canonical /api/runs handler in
server.ts; the chat-routes copy was shadowed by Express's earlier
registration and never executed, which also meant run.clientType
never made it to Langfuse — fixed in the same move.
Verification
- pnpm guard / pnpm typecheck clean for daemon, web, and contracts.
- pnpm --filter @open-design/web test: 1645/1645 passing.
- End-to-end smoke through Playwright + local PostHog ingest project
420348: every page_view (home/projects/automations/design_systems/
plugins/integrations/chat_panel/file_manager), every nav element,
the new_project_modal surface_view + tab + create flow, the
plugin_replacement_modal surface_view, settings_view across nine
sections, settings_cli_test_result (codex CLI), the
project_create_result success path, and run_created + run_finished
(result=failed, error_code=AGENT_EXIT_1) all reached PostHog with
the v2 schema and the expected device_id / page_name / area /
element / fidelity / target_platforms props. The remaining
*_result events (artifact_export / feedback_submit / file_upload /
plugin_replacement / settings_byok_test / settings_connector_auth)
are wired in code; production traffic will trigger them.
* fix(analytics): preserve style category on design-systems surface chip switch
The merge resolution in DesignSystemsTab incorrectly re-introduced a
`setCategory('All')` call alongside the new `trackDesignSystemsTopClick`
emit. main intentionally keeps the active style category when the surface
filter refines within it; the regression was caught by the existing
"keeps the style category when a surface chip refines within it" test
in tests/components/DesignSystemsTab.test.tsx.
* fix(analytics): address review — senseaudio passthrough + daemon-side configure-state
Two follow-ups from the v2 schema review on #2285:
1. `byokProtocolToTracking()` was still falling through to `null` for
`senseaudio` even though the v2 BYOK provider enum now lists it. Every
`SettingsDialog` BYOK call site guards on `if (byokProviderId)`, so a
user on SenseAudio was silently dropping the provider-option,
field-focus, and test-result captures. Added the missing case so
SenseAudio gets the same analytics coverage as the other providers.
2. The daemon-authoritative `run_created` / `run_finished` events were
missing the configure-state triplet (`has_available_configure_cli` /
`configure_type` / `configure_availability`) that v2 promotes to a
global register on the web side. Daemon captures don't go through the
PostHog global register, so dashboards couldn't segment run lifecycle
by execution setup after the migration.
The fix derives the triplet server-side from `detectAgents()` and the
request's `agentId` before `design.analytics.capture(...)`:
- has_available_configure_cli: any CLI on PATH reports installed
- configure_type: 'local_cli' when the run targets an installed CLI,
otherwise 'unknown' (daemon can't see BYOK keys, which live in
web-client storage)
- configure_availability: 'available' / 'unavailable' / 'unknown'
based on the requested agent's install status, with a fallback to
'available' when any CLI is installed
This keeps the v2 schema consistent across both daemon-side and
web-side captures.
* fix(analytics): wire setConfigureGlobals so browser events carry fresh state
Third follow-up from the v2 schema review on #2285. The previous fix
addressed senseaudio + daemon-side configure-state, but reviewer flagged
that `setConfigureGlobals` was still defined-only — no caller — so every
browser-side capture inherited the boot defaults
(`has_available_configure_cli=false`, `configure_type='unknown'`,
`configure_availability='unknown'`). PostHog dashboards therefore could
not segment the new `page_view` / `ui_click` / `surface_view` events by
execution setup after a user configured their environment.
Changes:
- `packages/contracts/src/analytics/events.ts` — add a pure
`deriveConfigureGlobals(mode, agentId, agents, byokConfigured)` helper
so the web client and the daemon can derive the triplet from the same
source of truth. The helper covers all 5 `configure_type` buckets
(`local_cli` / `byok` / `both` / `none` / `unknown`) and the 3
`configure_availability` buckets (`available` / `unavailable` /
`unknown`).
- `apps/web/src/App.tsx` — add a useEffect that re-derives the triplet
whenever the user changes execution mode, selects a new CLI, saves a
BYOK key, or the detected-agent list refreshes, then pushes it to
PostHog via `analytics.setConfigureGlobals(...)`. The setter goes
through the provider so the analytics module stays the single source
of truth.
- `apps/web/src/analytics/provider.tsx` — expose
`setConfigureGlobals` on the analytics context and the test stub so
consumers route through the provider boundary.
- `apps/daemon/src/server.ts` — switch the daemon-side derive in
`/api/runs` to the shared `deriveConfigureGlobals` helper so the
authoritative run_created/run_finished captures match the web-side
payload. BYOK credentials live in the web client and stay invisible
to the daemon, so the daemon arm passes `byokConfigured: undefined`
and falls back to the installed-CLI signal.
- `apps/web/tests/analytics-configure-globals.test.ts` — new regression
test that pins the derive behavior across all branches and confirms
the setter actually mutates the client-side store. Locks the wire-up
so a future refactor can't silently turn the setter back into a
no-op.
Verification: pnpm guard clean; daemon / web typecheck clean; web tests
1703/1703 passing (up from 1696 — 7 new tests in the configure-globals
suite).
* fix(analytics): emit projects page_view + drop misattributed chat_panel source
Fourth review pass on PR #2285. Two follow-ups from mrcfps:
1. DesignsTab (projects landing) was emitting click events but no
matching page_view. Opening /projects without clicking anything left
the surface invisible in PostHog. Added a once-per-mount
trackPageView({ page_name: 'projects' }) with the same ref-keyed
pattern HomeView / PluginsView use.
2. ChatComposer was hard-coding source: 'recent_project' on every
chat_panel page_view. The web router currently only carries
projectId / conversationId / fileName, so we cannot distinguish a
New-project launch from a template-pick or a Recent-projects click
from this layer. A false constant would over-attribute every chat
launch to 'recent_project' and break the funnel slice this schema
was meant to unlock. Dropped the field for now — better no source
than the wrong source — until the router grows a launch-source
channel; the field is still defined as optional on PageViewProps so
the channel can land in a follow-up PR.
Verification: web typecheck clean; web tests 1703/1703 passing.
* fix(analytics): correct plugin-replacement async result + heterogeneous upload + missing requestId
Three follow-ups from the fifth review pass on PR #2285:
1. **plugin_replacement_result emitted before the apply settled**
(`apps/web/src/components/HomeView.tsx`). The modal's confirm action
was a synchronous wrapper around an async `usePlugin(...)` call, so
the surrounding try/catch never observed real failures and every
attempt was reported as `result=success`. Changed `PendingReplacement.
confirm` to return `Promise<void>`, made the wrapper return the
underlying promise, and moved the analytics emit into an async
IIFE in the click handler so the success/failure branches reflect
the actual outcome.
2. **file_upload_result mis-typed heterogeneous batches**
(`apps/web/src/components/FileWorkspace.tsx`). The earlier
implementation only inspected `picked[0]`, so a mixed batch like
`image.png + demo.mp4` reported `file_type=image`. Per the comment
above the block ("mixed batches collapse to other"), the
implementation now maps every file to a tracking type, collapses to
`other` when more than one distinct type is present, and falls
back to the single type otherwise.
3. **project_create_result lost the click→result correlation id**
(`apps/web/src/components/NewProjectPanel.tsx`). The click event
no longer carried the locally-generated `requestId` that
`project_create_result` keeps, so the two could not be joined.
`trackNewProjectModalElementClick()` now accepts an optional
`{ requestId }`, mirroring the other helpers, and the create-button
click threads the same id used for the result.
Verification: web typecheck clean; web tests 1703/1703 passing.
* fix(analytics): gate configure-state on agents probe + drop unsent run_created fields
Two follow-ups from the sixth review pass on PR #2285:
1. **Cold-start configure-state was stamped before fetchAgents() landed**
(`apps/web/src/App.tsx`). The useEffect that pushes the v2 triplet
into the PostHog global register fired on first paint with
`agents=[]`, so the first home/projects/plugins page_view reported
`has_available_configure_cli=false` / `configure_availability=
unavailable` even on machines that did have an installed CLI. The
effect now waits on `agentsLoading === false` and leaves the boot
defaults ('unknown'/'unknown') in place until the probe resolves.
2. **Daemon read run-context fields the web never sends**
(`apps/daemon/src/server.ts`). The daemon-side run_created /
run_finished baseProps read `projectKind`, `entryFrom`,
`projectSource`, `targetPlatforms`, `companionSurfaces`, `fidelity`,
`connectors`, `useSpeakerNotes`, `includeAnimations`,
`referenceTemplate`, and `aspect` from `req.body`, but
`packages/contracts/src/api/chat.ts` and
`apps/web/src/providers/daemon.ts` don't carry those keys on the
wire. Reading them therefore always produced null/undefined.
Dropped the unsent fields from the daemon capture; a follow-up can
extend the create payload to thread the real context through. The
`design_system_id` field stays because the chat contract does send
it.
Tests: added 3 regression tests in `tests/analytics-configure-globals.
test.ts` covering the boot-time gating contract (empty agents +
daemon mode → unavailable / local_cli; installed agent → available;
undefined agents list → unavailable). Verification: web typecheck
clean; daemon typecheck clean; web tests 1706/1706 passing (up from
1703 — 3 new cold-start tests).
* fix(analytics): pin mode='daemon' so missing-agent run reports unavailable
Eleventh review pass on PR #2285. mrcfps flagged that
`apps/daemon/src/server.ts` was calling `deriveConfigureGlobals(...)`
without `mode`, so the helper fell through to the generic branch.
Result: a run for an uninstalled agent was tagged
`configure_availability: 'available'` whenever any OTHER CLI was on
PATH, because the generic branch only looks at the cohort-wide
"any installed?" signal. That precisely undermines the slice the
daemon emit is trying to power.
The daemon's /api/runs handler is always a daemon-mode capture
(daemon is the local CLI runner — BYOK lives in the web layer), so we
now pin `mode: 'daemon'` on the call site. The helper then judges
`configure_availability` from the REQUESTED agent's install status and
reports `unavailable` when the user picked an agent that is not
installed, even if peers are.
Added a regression case in `tests/analytics-configure-globals.test.ts`:
`{ mode: 'daemon', agentId: 'codex', agents: [{claude,true},{codex,false}] }`
→ `{ has_available_configure_cli: true, configure_type: 'local_cli',
configure_availability: 'unavailable' }`.
Verification: daemon typecheck clean; web tests 1707/1707 passing
(up from 1706 — 1 new regression test).
* fix(analytics): hoist chat_panel page_view + thread requestId
- Move chat_panel page_view emit from ChatComposer to ProjectView so
it survives activeConversationId-driven ChatPane remounts. ProjectView
keys the dedupe ref by project.id; the composer drops its duplicate.
- Thread { requestId } into trackAssistantFeedbackReasonSubmitClick so
the click pairs with the existing feedback_submit_result on the same
request id (mirrors the trackNewProjectModalElementClick pattern).
* fix(analytics): keep v2 super-props alive across reset and stamp design_system_source
- Snapshot the register payload in client.ts on PostHog init and
re-register it from applyConsent(true) and applyIdentity() so a
privacy-toggle or Delete-my-data rotation does not resume capture
without event_schema_version / device_id / session_id / locale /
configure-state globals. setConfigureGlobals() also patches the
cache so a later restore picks up the current configure state.
- Stamp design_system_source on daemon-side run_created / run_finished
(it is required by RunCreatedProps / RunFinishedProps). Daemon
can't tell default vs user_selected vs inherited from the wire, so
it derives 'unknown' when designSystemId is present, 'not_applicable'
otherwise — a follow-up that threads designSystemSource through
CreateRunRequest can replace this with the precise source.
|
||
|
|
15d08d4158
|
feat: add windows packaged auto update flow (#2362) | ||
|
|
1ab8758045
|
Revert "fix(web): demote Plugins and Integrations to nav rail footer (#1806)" (#2360)
This reverts commit
|
||
|
|
41e0e26d86
|
fix(web): stop re-syncing composer scroll on every keystroke (#2282)
The `draft` dep on the overlay scroll-snap effect triggered setComposerScrollTop on every keystroke, looping through the v0.8.0 mention overlay state and crashing the chat composer with Maximum update depth exceeded. Fixes #2097 |
||
|
|
d72d689430
|
fix(web): preserve daemon run reattach across project switch (#2317)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
ba6497b399
|
fix(web): gate srcDoc transport activation on shell ready (#2320)
Opening Tweaks could leave the HTML preview iframe blank when the
host posted `od:srcdoc-transport-activate` before the lazy transport
shell had registered its message listener. The dropped activation
was then suppressed by the dedupe check in subsequent onLoad calls,
stranding the iframe on the 536-byte empty shell.
The race surfaces after the iframe re-mounts: closing Tweaks
increments `srcDocTransportResetKey`, which re-mounts the srcDoc
iframe with a fresh shell. Re-opening Tweaks immediately fires the
`useUrlLoadPreview`-driven `activateSrcDocTransport()` while the new
shell is still loading. The post lands in a window with no listener
yet, but `activatedSrcDocTransportHtmlRef.current` is marked to the
current srcDoc, so the iframe's onLoad call later short-circuits.
Fix:
- The lazy transport shell now posts `od:srcdoc-transport-ready`
to its parent once its activate-listener is installed.
- FileViewer tracks `srcDocShellReady`, reset on iframe re-mount
and set when ready arrives (with onLoad as belt-and-suspenders).
- Activation is funneled through a new pure helper
`canActivateSrcDocTransport()` that adds the ready gate to the
existing pre-conditions. When ready flips later, the
`activateSrcDocTransport` useCallback regenerates and the
enclosing useEffect re-fires the activation cleanly.
Tests cover the shell handshake (red on main: shell never posts
ready) and the gating helper (red on main: function did not exist).
Fixes #2253
|
||
|
|
dea07840f3
|
fix: stop stale pinned todos after terminal runs (#2321)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
8b16d21785
|
fix(web): coalesce chokidar rewrite bursts before refreshing files (#2326)
Agent rewrites surface to chokidar as `unlink` + `add` (+ optional
`change`) within a single tick. ProjectView refreshes the file list
on every event, so the open tab's active file vanishes for one frame
between the `unlink` and the `add`. FileWorkspace's `activeFile`
resolver returns null when the active name disappears from
`visibleFiles`, and the preview falls back to an empty state mid-run.
Add a trailing-coalesce around the file-changed refresh:
- First event arms an 80ms quiet window.
- Subsequent events inside the window reset it.
- A 250ms maxWait cap ensures a sustained edit storm still flushes.
- The intermediate `unlink` is absorbed by the next `add` and the
UI only sees a single, file-present refresh.
The coalesce is a small `useCoalescedCallback` hook so the timing
contract is unit-testable without standing up ProjectView. Tests
cover the rewrite burst, isolated single triggers, the maxWait cap
under sustained triggers, latest-callback semantics, and unmount
cleanup.
Fixes #2195
|
||
|
|
b426f614f5
|
fix(web): scope design-systems surface chip counts to active style filter (#2141)
On the Design systems page the built-in library surface chips (All / Web /
Image / …) counted the whole catalog, so applying a style category left
them showing stale totals while the grid shrank. Clicking a chip also
called setCategory('All'), discarding the user's style filter instead of
refining within it.
Surface chip counts and the grid now derive from a query-scoped set
(style category + search) over librarySystems; the chip click only
switches surface. An unscoped surfaceTotals count still backs the
"surface is empty" effect guard so it reacts to the catalog, not a
transient filter. The chip row always keeps "all" and the active surface
chip rendered, so a transient search can never hide the active filter.
Re-applied onto the DesignSystemsTab restructure from #2187. Adds 4 tests
covering scoped counts, category preservation on chip click, empty-chip
hiding, and active-chip persistence under a zero-result search.
Fixes #2062
Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
|
||
|
|
210b94069a
|
feat(senseaudio): BYOK chat with image + video generation tools (#2065)
* feat(senseaudio): BYOK chat with image + video generation tools
Adds SenseAudio as a first-class BYOK chat protocol and wires the daemon's
chat proxy with a tool loop so BYOK users can generate images and videos
without dropping to a CLI agent.
- BYOK protocol: new senseaudio tab + /api/proxy/senseaudio/stream route +
connection-test + provider-models discovery (OpenAI-compatible wire)
- Tool loop: generate_image (synchronous /v1/image/sync) and generate_video
(async /v1/video/create + 5s polling /v1/video/status, 10-min ceiling,
periodic progress log every 30s)
- Settings dropdown + chat-composer dropdown for the BYOK image model
default; generate_image's model enum lets the LLM override per call
- Seed-on-success: a successful BYOK chat call idempotently mirrors the
key into media-config (preserves env-resolved + already-stored keys)
- Generated artifacts land in <projectsRoot>/<projectId>/ so FileViewer,
DesignFilesPanel, and project export pick them up automatically;
legacy /api/byok-image/:id route kept for old conversation links
- Markdown renderer learns  image syntax with a scheme
allowlist (http(s) / data:image/ / blob: / relative paths)
- i18n key settings.byokImageModel across all 19 locales
- 3 SenseAudio image models registered (2.0, 1.0, doubao-seedream-5.0);
1 video model (doubao-seedance-2.0)
- Tests: byok-tools (29), media-senseaudio-image (8), media-config seed
(7), proxy-routes (47), markdown image rendering (8)
* fix(senseaudio): unblock image gen + design file preview switching
- SenseAudio /v1/image/sync rejected the previous size mapping with
`参数错误:size` (1664x936, 936x1664, 1280x960, 960x1280 are not in
the gateway's accepted set). Switched to standard HD / SD sizes that
every aspect bucket can hit: 1024×1024, 1280×720, 720×1280,
1024×768, 768×1024. Kept the byok-tools and media.ts tables in sync
so the BYOK chat tool and the CLI agent path both stop failing on
non-square aspects.
- DesignFilesPanel's <DfPreview> was missing a key prop, so React
reused the same iframe DOM node when the user picked a different
file — the src prop changed but the iframe never navigated. Added
key={previewFile.name} so the previous preview unmounts cleanly.
- Updated byok-tools + media-senseaudio-image tests for the new size
expectations.
* docs(senseaudio): clear stale provider hint + update README
- Settings → Media → SenseAudio: clear the auto-promoted
"Image · TTS · 70+ voices · clone" hint; the provider label alone is
enough now that the BYOK chat surface covers image + video tooling.
- README: list the new senseaudio (and missing ollama) proxy routes so
the BYOK section reflects what the daemon actually serves, and
mention the generate_image / generate_video chat tools that ship
with the SenseAudio path.
* fix(senseaudio): address PR #2065 review feedback
Three non-blocking review notes from @PerishCode on PR #2065:
1. Drop the dead /api/byok-image/:id route. The PR description claimed
it was "legacy fallback for old chat history" but that storage
layout never existed on main, so the route can only ever 400 or
404 — never 200. Removed the handler, the isSafeByokImageId
export, the unused createReadStream / stat / path / Request /
Response imports, and the two byok-image regression tests.
2. Add rejectProxyPluginContext guard to the senseaudio proxy
handler so it matches the invariant the other five proxy paths
already enforce (plugin runs must go through /api/runs for
snapshot pinning). Extended the existing "API fallback rejects
plugin runs" describe to also cover /api/proxy/senseaudio/stream
with the 409 PLUGIN_REQUIRES_DAEMON expectation.
3. Wrap the secondary image / video downloads (the URLs the
SenseAudio gateway hands back in /v1/image/sync .url and
/v1/video/status .video_url) in validateBaseUrlResolved so a
malicious gateway can't point us at 169.254.169.254 (AWS / Azure
metadata) or RFC1918 hosts via the response payload. Also passed
`redirect: 'error'` on both fetches to match the SSRF posture
the primary proxy fetch already uses. The new
assertExternalAssetUrl helper lives next to executeGenerateImage
so future tool downloads can reuse it.
Tests: 120/120 daemon tests pass; guard + typecheck green.
* fix(senseaudio): mirror SSRF guard onto renderSenseAudioImage CLI path
Follow-up to
|
||
|
|
431a5e2d79
|
[codex] Add global onboarding flow without AMR (#2272)
* Add global onboarding flow * Remove AMR from onboarding variant * Add onboarding role question |
||
|
|
ad37fd30cf
|
Add desktop updater UI flow (#2270) | ||
|
|
e94663bfbd
|
Add connector memory extraction flow (#2265) | ||
|
|
d1eb8b7bef
|
Fix editorial deck navigation (#2173) | ||
|
|
716f06cb73
|
fix(web): allow Comment/Inspect picker to select iframe-backed components (#2254) | ||
|
|
e6d0f4eeab
|
fix(web): restore scrolling in New project modal for tall tabs (#2032) | ||
|
|
2c128e0e91
|
refactor desktop host bridge (#2246) | ||
|
|
c6b7c44424
|
confirm before clearing memory extraction history (#2241) | ||
|
|
52b702648e
|
Enable LAN web dev access (#1947) | ||
|
|
86ec951fb9
|
[codex] Add automation templates and proposal workflows (#2193)
* feat(web): introduce Automations tab with dual-track capability for routines This commit adds a new Automations tab that consolidates routines, schedules, and live artifacts, allowing users to manage automations seamlessly. The tab features a modal for creating and editing automations, which supports various scheduling options (hourly, daily, weekdays, weekly) and project modes (create_each_run, reuse). The CLI is also updated to expose automation commands, ensuring consistency between the web UI and CLI interfaces. Key changes include: - New `NewAutomationModal` component for automation creation and editing. - Updated `TasksView` to integrate the new Automations functionality. - Enhanced styling for the Automations tab to improve user experience. This implementation aligns with the dual-track capability exposure policy, ensuring all features are accessible via both the web UI and CLI. * feat(daemon): enhance automation context handling and CLI commands This commit introduces several improvements to the automation context management and updates the CLI commands accordingly. Key changes include: - Added support for new context fields (`plugin`, `mcp`, `connector`) in automation commands. - Updated the CLI to reflect new target options (`new-project`). - Enhanced error messages for invalid target inputs. - Introduced functions to handle context selection and normalization for routines, including the ability to parse and store context data in the database. - Updated the database schema to include a new `context_json` field for routines. - Improved the handling of context in routine routes and the web interface, ensuring that selected contexts are properly managed and displayed. These changes aim to provide a more robust and flexible automation experience, aligning with the recent enhancements in the web UI. * feat(web): enhance TasksView with automation run history and status indicators This commit introduces several new features to the TasksView component, including: - Added functionality to display automation run history for each routine, showing metadata such as status, timestamps, and project details. - Implemented status indicators for routine runs, providing visual feedback on their current state (succeeded, failed, running, queued). - Enhanced the UI to allow users to expand and view detailed run history, including the ability to open the corresponding project conversation. - Updated styles to improve the presentation of automation statuses and history. These changes aim to provide users with better insights into their automation routines and improve overall usability. * feat(daemon): implement automation ingestion and proposal management This commit introduces several new features related to automation ingestion and proposal management within the daemon. Key changes include: - Added new modules for handling automation source packets and proposals, allowing for the storage, retrieval, and management of automation-related data. - Implemented functions to list, create, and apply automation proposals, enhancing the automation workflow. - Introduced new CLI commands for interacting with memory entries and automation sources, providing users with more control over their automation processes. - Enhanced the server routes to support automation source and proposal APIs, enabling seamless integration with the existing system. These changes aim to improve the overall automation experience, making it easier for users to manage and utilize automation proposals and ingestions effectively. |
||
|
|
eb127e0f79
|
Remove live artifact home chip (#2221) | ||
|
|
4376d8a8ec
|
[codex] Add pet task center and desktop pet (#1833)
* feat: add pet task center and desktop pet * Fix pet task center review regressions |
||
|
|
6990291217
|
fix: escape chat transcript role delimiters (#2156)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
387dc83b27
|
fix(plugins): wire Open Design "Open Design PR" button end-to-end (#2182)
Some checks failed
ci / Detect CI change scopes (push) Failing after 2s
landing-page-ci / Validate landing page (push) Failing after 1s
landing-page-deploy / Deploy landing page (push) Has been skipped
nix-check / build (push) Failing after 1s
ci / Preflight (push) Has been skipped
ci / Core package tests (push) Has been skipped
ci / Tools workspace tests (push) Has been skipped
ci / Daemon workspace tests (push) Has been skipped
ci / Web workspace tests (push) Has been skipped
ci / E2E vitest (push) Has been skipped
ci / Playwright critical (push) Has been skipped
ci / Build workspaces (push) Has been skipped
ci / App workspace tests (push) Has been skipped
ci / Validate workspace (push) Failing after 0s
Two-part fix for the Plugin folder card's "Open Design PR" button. The
flow was broken end-to-end: the CLI emitted a 404 URL, and the agent
prompt under-specified the contribution steps so the agent stalled
mid-turn or fell back to the legacy issue-URL path.
Catalog target — `apps/daemon/src/plugins/publish.ts`:
`buildPublishLink({ catalog: 'open-design' })` hardcoded a submission URL
at `github.com/open-design/plugin-registry`, the dedicated registry repo
proposed in docs/plans/plugin-registry.md §1.2. That repo doesn't exist
yet (P3.1 notes "creating the external GitHub repo is an operational
launch step, not a code blocker"), so every generated URL 404'd. Retarget
the catalog at the live `nexu-io/open-design` monorepo and update the
target-path hint in the PR body to `plugins/community/<plugin-name>/`
(the actual layout under main). Plan §1.2 stays the long-term goal — see
the code comment.
Also updates the matching `od plugin yank` issue URL in cli.ts.
Agent prompt — `apps/web/src/components/design-files/pluginFolderActions.ts`:
The `contribute` action prompt only said "use the supported `od plugin
publish` Open Design registry flow", which produces an issue URL (the
legacy path) and left the agent to invent the remaining steps. The agent
ended up calling `AskUserQuestion` mid-turn waiting for input that the
DesignFilesPanel buttons couldn't satisfy, then stalled for 600s.
Rewrite the `contribute` prompt to drive the full PR flow via raw `gh`
commands:
1. preflight `gh --version` / `gh auth status` (detect-and-instruct on
missing CLI, never auto-install anything)
2. read manifest, resolve author login
3. `gh repo fork nexu-io/open-design --remote=false`
4. clone fork, branch, cp plugin into `plugins/community/<name>/`,
commit/push using author's git identity (not a hardcoded bot)
5. `gh pr create ... --web` so the author reviews and clicks Create
Hard bans on `AskUserQuestion`, retry-on-failure, and the legacy
`od plugin publish --to open-design` CLI keep the turn fire-and-forget.
The `install` / `publish` action prompts are unchanged.
Tests:
* `apps/daemon/tests/plugins-publish.test.ts` — assert URL host +
catalogLabel match `nexu-io/open-design`, body contains the new path
hint.
* `apps/web/tests/components/pluginFolderActions.test.ts` (new) — lock
the contribute prompt's command surface, the `--web` review window,
the `AskUserQuestion` ban, the install-tool ban, and folder-path
interpolation. Nine assertions covering the prompt contract without
coupling to exact wording.
Validation:
* `pnpm --filter @open-design/daemon exec vitest run tests/plugins-publish.test.ts`
→ 11/11 passed
* `pnpm --filter @open-design/web exec vitest run tests/components/pluginFolderActions.test.ts`
→ 9/9 passed
* `pnpm --filter @open-design/web typecheck` clean
* E2E: button click in DesignFilesPanel under namespace=e2e-pr-test ran
`gh --version` → fork → clone → branch → commit/push → ended at
`gh pr create --web` opening the GitHub PR draft in browser. No
AskUserQuestion stall this time.
Doesn't touch: `5f71968f` server-side endpoints (`/contribute-open-design`
+ `/publish-github` still in main as dead code — separate cleanup),
plan/spec docs (per maintainer call to keep the dedicated-repo as the
long-term goal), or `registry-backends.test.ts` (plan terminal-state
fixture; no production caller).
|
||
|
|
9596a0ccd5
|
feat(privacy): collapse first-run consent banner to a single "I get it" button (#2202)
* feat(privacy): collapse first-run banner to a single "I get it" button
Replaces the first-run privacy disclosure's two-button decision picker
("Share usage data" / "Don't share") with a single "I get it"
acknowledgement. Clicking it accepts the same default telemetry surface
the previous "Share usage data" path enabled — the banner shifts from
binary consent picker to informed disclosure.
To keep the surface honest, the banner footer is rewritten to spell out
the new default and point at the off switch:
"Data sharing is on by default. You can turn it off any time in
Settings → Privacy. We never upload the contents of your generated
artifact files."
Settings → Privacy (PrivacySection.tsx) is unchanged — that surface still
exposes both Share and Don't share buttons so users who arrive there
later (or come back to flip the choice) keep the explicit picker.
Mechanics:
* `PrivacyConsentModal.tsx`: drop `onDecline` prop and the second button;
rename `onShare` → `onAccept` to match the new semantic. Footer hint
now reads from `settings.privacyConsentBannerFooter` (new key) so the
banner copy can speak in single-button voice without disturbing the
reused `settings.privacyConsentFooter` that PrivacySection still
displays.
* `App.tsx`: drop the `onDecline` handler. Single `onAccept` handler
applies the same opt-in payload as the previous `onShare` branch
(`telemetry.metrics = true`, `telemetry.content = true`, fresh
`installationId`), so the wire format daemon-side is unchanged.
* `i18n/types.ts` + `locales/en.ts`: two new keys —
`settings.privacyConsentAccept` ("I get it") and
`settings.privacyConsentBannerFooter` (the default-on disclosure copy).
* `i18n/locales/*.ts` (all 18 non-en dictionaries): added the two new
keys. zh-CN and zh-TW are translated; the remaining 16 locales follow
the project convention of leaving the EN string as a fallback for
later contributor passes (same shape used by privacyConsentShare /
Decline today).
* `tests/components/PrivacyConsentModal.test.tsx`: rewritten. The four
new tests lock the new contract — single "I get it" button, no
Share/Decline labels, default-on disclosure text in the footer, the
external privacy-policy link, and onAccept firing on click. Replaces
the prior "equal-prominence" tests, which only made sense for the
two-button shape.
Validation:
* `pnpm --filter @open-design/web exec vitest run tests/components/PrivacyConsentModal.test.tsx`
→ 4/4 passed
* `pnpm --filter @open-design/web exec vitest run tests/i18n/locales.test.ts`
→ 5/5 passed (every locale aligned with English keys + placeholders;
the new keys ship to all 19 dictionaries)
* `pnpm --filter @open-design/web typecheck` clean
* test(privacy): update App.connectors fixture to match single-button banner
`App.connectors.test.tsx > does not show first-run privacy consent until
daemon config hydration finishes` hardcoded the previous "Share usage
data" affirmative button label. The single-button banner now renders
"I get it", so the assertion was looking for a button that no longer
exists.
CI signal: 26081467446 → Web workspace tests → `× does not show
first-run privacy consent until daemon config hydration finishes`. The
App workspace tests + Validate workspace failures cascaded from this
one — both are aggregator jobs.
Local: vitest run tests/components/App.connectors.test.tsx → 5/5 passed.
|
||
|
|
3cecb1c881
|
Polish project workspace UI (#2201) | ||
|
|
a38e09f931
|
fix(web): demote Plugins and Integrations to nav rail footer (#1806)
* fix(web): demote Plugins and Integrations to nav rail footer
Plugins and Integrations are platform-configuration surfaces, not
daily-use destinations. Moving them to the footer section of the
left nav rail — separated from primary items by a thin divider —
keeps them reachable while giving the primary four items
(Home, Projects, Automations, Design Systems) the visual weight
they deserve.
- EntryNavRail: remove Plugins/Integrations from the main __group
and place them in the __footer above the help launcher
- entry-layout.css: add __divider rule (1 px separator) to visually
mark the boundary between primary and secondary nav regions
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): remove settings dropdown, gear opens settings directly
The gear/cog button previously opened a dropdown that mixed three
unrelated concerns: community links (X, Discord), preference quick-
access (Language, Appearance), a feature shortcut (Use everywhere),
and a redundant Settings entry — creating two separate paths to the
same Settings dialog and duplicating Language/Appearance relative to
the Settings sidebar.
Changes:
- Gear button now directly opens the Settings dialog (no intermediate
dropdown layer)
- Follow @nexudotio on X and Join Discord moved to the Help menu at
the bottom of the nav rail, where community/external links belong
- Language and Appearance remain exclusively in the Settings dialog
- Use everywhere remains exclusively in the topbar chip
- Remove dead state (avatarMenuOpen, languageExpanded,
appearanceExpanded), ref, outside-click effect, and module-level
constants (APPEARANCE_THEMES, APPEARANCE_LABEL, describeModelChip)
that were introduced solely to support the dropdown
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(web): move output-type chips above chat input as tab bar
The home hero previously had two chip rows below the chat input that
mixed two unrelated dimensions: output type (Prototype, Slide deck,
Image…) and workflow source (Create plugin, From Figma, From folder,
From template). Users had no visual cue that these were different
categories.
This change separates the two dimensions clearly:
- Output type (create group) becomes a tab bar positioned above the
input card. Tabs share the same chip data and onPickChip dispatch,
so plugin selection, active state, and pending state are unchanged.
Active tab shows a colored underline; the bar border visually
connects to the input card below.
- Workflow source (migrate group) stays as the chip row below the
input card, now standing alone with unambiguous "how to start"
semantics.
- Subtitle updated from "Pick a plugin below" to "Pick a type"
to match the new placement.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): skip replace-prompt confirmation when switching output-type tabs
Output-type tab clicks (create group) are mode-selection gestures;
the user expects the prompt to update immediately without a dialog.
The confirmation is still shown for migrate-group chips (From Figma
etc.) where the replacement carries meaningful user-provided content.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): use brand logo as Home destination, drop redundant brand mentions
The entry nav rail previously rendered the brand logo and a separate
Home icon back-to-back; both invoked `onViewChange('home')`, so the
Home button was pure duplicate affordance. The hero pane also
displayed a third "Open Design" lockup that competed with the rail
logo and the rebranded title.
This collapses those affordances:
* `EntryNavRail` drops the dedicated Home `NavButton`. The brand
logo button now carries `aria-current="page"` and an `is-active`
visual state when the home view is showing, and its tooltip reads
"Open Design · Home" off-home so the navigation behavior stays
discoverable for new users.
* `entry-layout.css` adds the matching `.entry-nav-rail__logo.is-active`
accent treatment so the "you are here" cue reads at parity with
primary rail buttons.
* `HomeHero` removes the inline `home-hero__brand` lockup and the
associated CSS, then retunes the title/subtitle/type-tab spacing
so the headline group still pairs tightly with the type tabs and
input card below.
* `entry-chrome-flows` is updated to assert the logo carries the
active page treatment and that no `entry-nav-home` test id
resurfaces by mistake.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): auto-grow home chat input, disable internal scroll and drag-resize
The home hero textarea was capped at a fixed `min-height: 84px` with
`resize: vertical`, so users had to either drag the bottom-right
corner to enlarge it or scroll the textarea internally to read
longer prompts. That hid context (loaded plugin templates routinely
overflow three lines) and exposed a manual grip whose state was easy
to leave in an awkward height.
This change makes the chat box grow with its content:
* `HomeHero` adds a `useLayoutEffect` that, on every prompt change,
resets the textarea height to `auto` so the browser can measure a
smaller content, then writes back `scrollHeight` in pixels. The
effect uses layout phase (not effect phase) to avoid a one-frame
flash at the previous height when a plugin loads a long example
prompt.
* `home-hero.css` swaps `resize: vertical` for `resize: none` and
adds `overflow: hidden`, so the manual drag grip disappears and
the textarea never scrolls internally. `min-height: 84px` is kept
so the empty input still reads as a chunky chat box. The outer
page handles overflow when the prompt is genuinely very long.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): make output-type tabs read as folder-tabs attached to chat box
The previous tab bar used a small icon + label per tab and signaled
the active tab with a 2px accent underline sitting on a horizontal
divider line. That divider visually broke the relationship between
the tabs and the input card below — the active tab looked like a
free-floating header, not a flap of the chat surface, so users had
to do extra work to mentally connect "I picked Prototype" with the
prompt area immediately underneath.
This switches the tab bar to a folder-tab pattern:
* `HomeHero` drops the per-tab `Icon` element. The seven labels
(Prototype, Live artifact, Slide deck, Image, Video, HyperFrames,
Audio) already disambiguate at the type sizes used here, and the
icons were primarily decorative.
* `home-hero.css` rewrites the tab styling:
- The bar no longer paints its own baseline border; the input
card's top edge serves as the baseline.
- Each tab is a rounded-top container with a 1px transparent
border and a `-1px` bottom margin, so its bottom edge overlaps
the card's top border by exactly one pixel.
- The active tab borrows the card's panel background and border
color, and overrides its bottom border with the panel color
so it visually erases the card's top border for the tab's
width. The result reads as one continuous "Prototype is this
chat box" surface.
- The card's prior `margin-top: 8px` is removed so the tab bar
bottom and card top sit at the same y coordinate, which is the
geometric precondition for the 1px overlap to land cleanly.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): enlarge home chat attach and submit buttons for legibility
The attach (paper-clip) and submit (arrow-up) controls on the home
chat input rendered at 32px diameter with 14px and 18px glyphs
respectively. After stroke antialiasing the icons read at roughly
11–12px on typical displays — small enough that users reported the
glyphs were illegible and had to be discovered by trial-and-error.
This bumps both controls to 38px circles and grows their glyphs
(attach 14 → 18, submit 18 → 22). The primary call-to-action now
has clearly more visual weight than the surrounding muted hint
text, and the paper-clip is recognisable at a glance.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): paint a chevron on inline select slots in the home prompt
Inline select slots in the home prompt overlay (e.g. the
`high-fidelity` / `wireframe` fidelity picker on the Live artifact
template) rendered as plain pill highlights, visually identical to
free-text and read-only text slots. The slot's `appearance: none`
strips the browser's default chevron, so users had no affordance
hinting the value was switchable.
This wraps select-type slots in an `inline-flex` span and overlays
an explicit chevron-down `Icon` against the slot's trailing padding
(bumped from 18px to 22px to make room). The chevron inherits the
accent color via the wrap's `color` property so it themes correctly
in light and dark modes. Click semantics are unchanged because the
chevron is `pointer-events: none`.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): size inline number slots to their value, not the whole row
Number-type inline slots in the home prompt (e.g. the slide-count
spinner on the Slide deck template) inherited the generic slot
`min-width: 8ch` and then stretched to the browser's default
`<input type=number>` width, so a two-digit value like "10" ate
the entire remaining line and pushed the native spinner buttons
to the far right edge — far away from the value they control.
This sizes the number input to its actual content plus four
characters of trailing room for the spinner buttons (clamped to
6–14ch), and adds a `home-hero__prompt-slot-input--number`
modifier that overrides the slot `min-width` to 4ch. The
spinner now sits flush against the value and the slot reads as
a compact inline pill matching the surrounding text/select
slots.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): widen prompt line-height so slot pills do not collide vertically
Each interactive slot and mention in the home prompt paints a 2px
outline ring via `box-shadow`. At the previous `line-height: 1.55`
on a 15px font, two lines were ~23px apart while a single pill
occupied ~20px of vertical space (text box + ring on both sides),
leaving roughly 2px of clear space between rows. When the prompt
wrapped onto multiple lines — common for the Image template's
"Generate a {kind} of {subject}. Style: {style}. Aspect: {ratio}."
example — the rings from line N and line N+1 visually merged into
a single bar, making it ambiguous which pill the user was about to
click or edit.
This bumps the line-height of both the highlight overlay and the
underlying editable textarea to 1.85 (~28px per line), restoring
~8px of clear space between pill rows. The two values must stay
identical so the overlay glyphs continue to track the textarea
caret positions exactly; a brief comment in each rule documents
this coupling for future edits.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): paint dropdown chevron on select element with neutral color
The previous chevron treatment wrapped each select slot in an
inline-flex span and laid an accent-orange `<Icon>` over the
trailing padding. At the prompt's normal display size the orange
chevron blended into the orange pill ring corners, so users
perceived "a bunch of dropdown arrows" across every highlighted
slot, even though only the actual select rendered one.
Move the chevron onto the `<select>` itself as a small neutral-
gray `background-image` SVG. The grey contrasts clearly with the
pill's accent ring and the chevron lives inside the select's own
padding, so it can never visually overlap the value text and can
never appear on non-select slots.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): shrink select-slot dropdown caret so it stops reading as a check-mark
The 8×6 stroked chevron looked like a check-mark in zoomed views
because the stroke width was a large fraction of the glyph. Swap
for a small (9×5) filled triangle drawn as a `background-image`,
with `background-size` pinned so browsers can't scale it to its
intrinsic SVG box. The caret is now unambiguously a dropdown
indicator without crowding the value text.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): make read-only prompt slots visually distinct from editable pills
Multi-word context values are rendered as plain <span>s with
pointer-events: none, but they inherited the same orange pill +
ring treatment as the truly editable <input>/<select> slots.
Users couldn't tell which pills they could click into.
Strip the pill background, the ring, the radius, and the padding
from `.home-hero__prompt-slot-text` and replace them with a
subtle dashed bottom border. The orange foreground keeps the
slot family link, but read-only highlights now clearly read as
"context value spliced into the prompt" rather than as
"interactive control".
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): scale select-slot dropdown caret up to 12px wide for legibility
The 9×5 caret was too quiet at the prompt's font size to read as
a dropdown affordance. Bump to a 12×6 filled triangle and pad the
select's trailing space (24px) so the value text still has clear
breathing room from the caret.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): collapse plugins search and filter strips into one bar
The plugins-home gallery laid out search, count, mode (Featured),
total-in-catalog, clear-filters, the main category row, and the
sub-category row as four floating clusters. Search lived up in
the section header, far from the chips it actually scopes; the
mode strip wedged "Featured + 386 in catalog + Clear filters"
between the header and the category row with no obvious
ownership; and the two clear-filter affordances duplicated each
other (chip strip + empty-state).
Fold the mode strip into the category row: Featured is now the
leading chip on the same line as the category pills, and the
search field, the result count, and a compact Clear link sit as
a right-aligned tools cluster on that same row. The header
shrinks back to just the title, subtitle, and Browse registry
link. The sub-category row stays as the contextual second line
when a category is active.
Tests: keeps the existing data-testids (plugins-home-chip-featured,
plugins-home-row-category, plugins-home-clear, plugins-home-count,
plugins-home-search) so the existing 11-case section spec passes
without modification.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): hide Recent projects rail on first-run home when empty
A brand-new user landing on Home saw an empty dashed box with the
copy "No projects yet — type a prompt to start one." sitting
above the plugin gallery. The hero already prompts the user to
type, so the empty rail just adds vertical noise without telling
them anything new.
Return null from RecentProjectsStrip when the recent list is
empty (loading or not) so the rail appears only when there is
actually something to show. Update the entry-chrome e2e to
assert toHaveCount(0) on first run — codifying the new contract
that the rail is conditional, not always-mounted.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): hide structured inputs form when every field is in the prompt
The PluginInputsForm below the chat textarea surfaced every plugin
input — even the ones the prompt template already substitutes
inline as highlighted slot pills. For a plugin like Prototype
that's five identical labelled inputs duplicating the five slot
pills above them, making the chat box look like it has grown a
second composer.
Compute the set of placeholder keys actually referenced in the
prompt template (via INPUT_PLACEHOLDER_PATTERN) and skip those
when deciding what the structured form needs to render. The form
still appears for plugins that expose inputs not referenced in
the prompt text (e.g. a "Run in background" toggle), but
template-only plugins collapse the redundant second editor.
Update the picker spec accordingly: when every field is in the
template the form is now expected to be absent rather than to
render a duplicate of the inline slot.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): drop redundant filtered-count and Clear link beside the search
The combined plugins filter bar trailed the search field with
"59 / 386" and an inline "Clear" link, but both repeat what the
chip strip already shows: every chip carries its own count
(Slides 59, All 386, …) and clicking the All chip already resets
the filters. Users had no way to know what the bare "59 / 386"
fraction or "Clear" referred to without inference.
Strip the trailing tools cluster down to just the search input.
The empty-results message at the bottom of the gallery still
exposes a contextual "Clear filters" button when a stacked
filter yields no matches, so the affordance isn't lost — just
removed from a position where it didn't read as actionable.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): drop default subtitle copy from the Official starters strip
The Home gallery rendered a two-line explanatory subtitle under
"Official starters" — "Ready-to-use Open Design workflows bundled
with this runtime. Pick one to load a starter prompt, or browse
the registry for more." — every visit. Returning users skip it,
new users get the same message from the section title plus the
Browse registry link plus the visual card grid itself; the prose
read as filler chrome above the chip strip.
Default the subtitle prop to undefined and only render the <p>
when a caller passes an explicit string. Other surfaces that
mount PluginsHomeSection with their own copy keep their
subtitle; the bare Home gallery loses one row of vertical noise.
Co-authored-by: Cursor <cursoragent@cursor.com>
* Fade out empty workflow lanes in Plugins home filter
Empty top-level lanes (Deploy 0, Refine 0, etc.) and empty
sub-categories (Vercel 0 under Deploy, Figma 0 under Import, etc.)
used to look identical to populated ones, so users couldn't tell at
a glance which chips were real catalog buckets and which were
"contribute a plugin" invites. The strip kept all lanes visible by
design — the workflow shape (Import / Create / Export / Share /
Deploy / Refine / Extend) is part of what we want users to see —
so we keep them clickable, but tag count-zero chips with
data-empty='true' and give them a faded, dashed-border treatment.
"All" pills stay solid since their count reflects the parent lane,
not their own emptiness.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(web): update home nav test expectations
* fix(e2e): align critical smoke with entry chrome
---------
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
18b947c25f
|
[codex] Land design system GitHub intake handoff (#2187)
* Add Claude-style design system workflow * Merge design system workflow into main * Restore design system workflow UI styles * Fix design system setup scrolling * Fix design system setup connector button * Preserve connector auth link after popup block * Simplify connected GitHub setup state * Open generated design system workspace project * Summarize design system auto prompt in chat * Add bounded GitHub connector design intake * Prefer path-scoped GitHub intake tools * Restore branch GitHub design context intake * Restore design system review workspace * Restore design system manager tab * Let design system workflow routes own details * Open editable design systems as projects * Restore design system workspace coverage * Fix bounded GitHub connector intake * Hide design system review while generating * Suppress design system generation questions * Constrain GitHub design intake to bounded command * Tolerate oversized GitHub metadata during intake * Rebuild daemon CLI when sources change * Fallback when GitHub connector snapshots are rate limited * Allow GitHub intake without Composio * Use native GitHub auth for design intake * Remove design system review group heading * Improve design system extraction evidence * Align design system scaffold with Claude output * Add evidence inventory for design system intake * Add local design system evidence intake * Add design system package audit gate * Allow auditing Claude Design reference packages * Audit design system package content quality * Migrate legacy design system artifacts * Clean migrated design system artifacts * Require modular design system UI kits * Reject thin design system UI kits * Prioritize core design evidence intake * Require role-based design system UI kits * Clean stale design system manifest references * Require representative preserved design assets * Warn on generic design system visuals * Enforce design system quality warnings * Audit connected design system UI kits * Require mounted design system UI kits * Require composed design system app shells * Require runnable JSX design system kits * Require browser globals for design system components * Infer design system names from source URLs * Require source examples in design system packages * Bind preserved fonts in design system tokens * Require skill frontmatter in design system packages * Preserve build icons in design system packages * Require real assets in brand previews * Require substantive source examples * Require product overview in design system README * Require reusable UI kit README * Require reusable design system skill docs * Seed Claude-style UI kit entry contract * Preserve runtime build assets in design packages * Audit design system packages after generation * Audit design system first-run output * Audit source-backed preview cards * Align design system UI kit scaffolds * Materialize design evidence package artifacts * Show project chat during design system setup * Hand off design system setup to project chat * Auto-repair design system audit failures * Harden design system evidence preservation * Tighten design system package guidance * Add targeted design system repair guidance * Bound design system audit auto repair * Use connector statuses in design system setup * Audit design system preview manifests * Require README preview manifests for design systems * Fix design system GitHub intake handoff * Fix daemon prompt CI assertions |
||
|
|
bd48c597b0
|
chore: pin dependency versions and harden CI caches (#2189)
* chore: pin dependency versions * ci: enforce pinned dependency specs * ci: fix pnpm executable invocation |
||
|
|
68b0e0258d
|
fix(web): scope daemon transcript to active agent (#2157) | ||
|
|
b838e94b88
|
fix(web): expand skill mention picker (#2170) | ||
|
|
f650a043d9
|
test(e2e): align entry coverage with redesigned flows (#2101)
* Migrate entry E2E coverage and split CI * test(e2e): relax connectors auth error assertions * ci: route scenario registry changes to extended e2e * ci: decouple packaged smoke jobs from validate gate * ci: restore pre-split workflow * test(e2e): slim critical ui smoke coverage * test(e2e): move broader entry flows out of critical * test(e2e): restore entry chrome coverage to ci * ci: parallelize workspace validation jobs * test(web): stabilize media palette bridge assertion * ci: cache e2e playwright browsers |
||
|
|
34f66113a0
|
Implement Home audio essential workflow (#2104)
Some checks failed
ci / Packaged mac smoke (push) Blocked by required conditions
ci / Packaged windows smoke (push) Blocked by required conditions
ci / Detect PR change scopes (push) Failing after 1s
ci / Validate workspace (push) Has been skipped
landing-page-ci / Validate landing page (push) Failing after 1s
landing-page-deploy / Deploy landing page (push) Has been skipped
nix-check / build (push) Failing after 1s
ci / Packaged linux headless smoke (push) Has been skipped
* fix(web): align Home prompt overlay with textarea so caret lands on click
Picking a chip such as Slide deck or Image loaded a default prompt
into the Home textarea and rendered an overlay with `{{key}}`
placeholders as interactive `<input>` / `<select>` controls. The
overlay controls and the underlying textarea text were laid out
independently:
- Inputs declared `min-width: 8ch` and `Math.max(displayValue.length
+ 1, 10)ch` of width.
- Selects added 18px of right-padding for the dropdown arrow.
- The textarea kept the raw substituted string in a proportional
font.
The two layouts no longer matched column-for-column, so every slot
shifted the textarea text to the left of where it appeared in the
overlay, compounding across the line. Clicking on visible prose to
position the caret hit a different character offset in the textarea
and subsequent typing or deletion landed in the wrong place.
For example, the Slide deck template
Create a {{slideCount}}-slide {{deckType}} for {{audience}} about
{{topic}}. ...
renders with slideCount=10 (~2 ch in the textarea) under a slot
input forced to 10 ch in the overlay — clicking right after the
literal `-slide` placed the caret several characters into `pitch
deck`. The Image / Video / Audio chips with their pre-filled
subject, style, aspect values reproduced the same drift.
Render the inline pills as read-only `<span>`s carrying the exact
substring the textarea shows at that position, mark them
`aria-hidden` so the textarea remains the single labelled control,
and surface every plugin input field — including the ones referenced
inline — in `PluginInputsForm` underneath. Editing flows through the
form, the parent's `updateActiveInputs` already re-renders the
prompt, and the pills stay aligned with the textarea on every
keystroke.
Also drop the now-unused inline helpers (updatePluginInput,
getTemplateInputNames, shouldRenderSlotAsText, inlineFieldType,
fileInputLabel, fileMetadata) and the dead
`.home-hero__prompt-slot-control/input/select/toggle/file/text` CSS
rules.
Verified:
- pnpm --filter @open-design/web typecheck
- vitest run on HomeHero.plugin-picker, HomeHero.rail, and
HomeView.prefill (29/29 pass; tests updated to reflect the new
read-only span + always-on form contract)
- Manual click-to-edit on Slide deck and Image chips in the
pnpm tools-dev web runtime — caret now lands where the user
clicked.
* Implement home audio essential workflow
* Fix Home media composer review issues
* Guard stale Home media apply results
---------
Co-authored-by: hahaplus <zmjdll@gmail.com>
|
||
|
|
1523defad1
|
feat(web): add plugin registry detail drawer (#2087)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
8a629eb999
|
fix: discover codex models from cli (#2082) | ||
|
|
ec53959601
|
feat(web): unify plugin trust badges (#2088)
Normalizes plugin trust badge rendering across Home, plugin details, registry entries, sources, and plugin share/install surfaces while mapping bundled and official sources to the user-facing Official tier.
Validation: CI and nix-check were green on PR head
|
||
|
|
ac748d53fc
|
Fix pointer HTML artifact previews (#2075)
Resolves short pointer-only HTML artifacts to the existing project HTML target before persisting preview artifacts, with helper and ProjectView regression coverage for #536. Validation: CI and nix-check were green on PR head
|
||
|
|
7b3b7c3b74
|
Fix finalize provider routing for Gemini BYOK (#1964)
Routes Finish Design/finalize requests through the selected BYOK provider, including Gemini, while preserving the Anthropic fallback. Validation: CI and nix-check were green on PR head
|
||
|
|
0101a09b10
|
fix(mcp): support no-auth local HTTP servers (#2008) | ||
|
|
25708fbe0f
|
fix(web): keep connector errors out of cards (#1740)
* fix(web): keep connector errors out of cards * fix(web): route panel-alert open-details through openConnectorDetails The new "open details" action on the panel-alert row called `setDetailConnectorId(alert.connectorId)` directly, bypassing the `toolPreviewFailedIds` clear that `openConnectorDetails()` does on every other entry point. Concrete regression: user opens a connector's drawer from the card, the tool-preview hydration fetch fails so `toolPreviewFailedIds[id]` gets stamped with the current `toolPreviewRetryToken`, user closes the drawer, and later the same connector throws a connect error and shows up as a panel alert. Clicking the alert's open-details button re-sets `detailConnectorId` but never clears the failed-fetch token, so the hydration effect at the previous bail-out check (`toolPreviewFailedIds[detailConnector.id] === toolPreviewRetryToken`) returns immediately and the drawer never retries loading tools. Routing the click through `openConnectorDetails(alert.connectorId)` restores the same clear-before-set behavior the card and search entry points already had, so the reopened drawer always gets a fresh hydration attempt. * test(web): cover connector panel alert tool retry * fix(web): clarify connector panel alert status text * fix(web): clear stale connector cancel alerts --------- Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
bd1b41a9d4
|
feat(web): highlight captured Pod component on chip hover (#1982)
* feat(web): highlight captured Pod component on chip hover Wires onPointerEnter/Leave on chip + onFocus/Blur on its × so the matching member overlay gets an outline-focused emphasis on canvas. * ci: trigger re-run for flaky palette dark-mode test * fix(web): skip non-mouse pointer events on pod chip hover Gate onPointerEnter/Leave to pointerType === 'mouse' so a touch tap on a chip no longer flickers the captured-member highlight; also add the CommentTargetOverlay class-wiring test, drop the redundant alpha on the hover outline, and document the non-member fallback path. |
||
|
|
92d91101ae
|
Fix inspect ancestor selection notice (#2039)
Co-authored-by: fuyizheng3120 <182291973+fuyizheng3120@users.noreply.github.com> |
||
|
|
0026dfa344
|
fix: support CODEX_API_KEY for Codex CLI (#1880)
Some checks failed
ci / Packaged mac smoke (push) Blocked by required conditions
ci / Packaged windows smoke (push) Blocked by required conditions
ci / Detect PR change scopes (push) Failing after 1s
ci / Validate workspace (push) Has been skipped
landing-page-ci / Validate landing page (push) Failing after 1s
landing-page-deploy / Deploy landing page (push) Has been skipped
nix-check / build (push) Failing after 3s
ci / Packaged linux headless smoke (push) Has been skipped
* fix(daemon): allow Codex API key env * fix(web): expose Codex API key setting * fix(web): clarify Codex key labels |
||
|
|
156191d565
|
fix(web): show unavailable state for missing examples (#1983) | ||
|
|
332e982a07
|
fix(web): stabilize HTML preview URL on tab switch (#1959)
Some checks failed
ci / Packaged mac smoke (push) Blocked by required conditions
ci / Packaged windows smoke (push) Blocked by required conditions
ci / Detect PR change scopes (push) Failing after 1s
ci / Validate workspace (push) Has been skipped
nix-check / build (push) Failing after 2s
ci / Packaged linux headless smoke (push) Has been skipped
Co-authored-by: GHX5T-SOL <200635707+GHX5T-SOL@users.noreply.github.com> |
||
|
|
91be2696dd
|
Persist routine failure reasons (#1963)
Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
89bf313782
|
fix(web): align Home prompt overlay with textarea so caret lands on click (#1958)
Picking a chip such as Slide deck or Image loaded a default prompt
into the Home textarea and rendered an overlay with `{{key}}`
placeholders as interactive `<input>` / `<select>` controls. The
overlay controls and the underlying textarea text were laid out
independently:
- Inputs declared `min-width: 8ch` and `Math.max(displayValue.length
+ 1, 10)ch` of width.
- Selects added 18px of right-padding for the dropdown arrow.
- The textarea kept the raw substituted string in a proportional
font.
The two layouts no longer matched column-for-column, so every slot
shifted the textarea text to the left of where it appeared in the
overlay, compounding across the line. Clicking on visible prose to
position the caret hit a different character offset in the textarea
and subsequent typing or deletion landed in the wrong place.
For example, the Slide deck template
Create a {{slideCount}}-slide {{deckType}} for {{audience}} about
{{topic}}. ...
renders with slideCount=10 (~2 ch in the textarea) under a slot
input forced to 10 ch in the overlay — clicking right after the
literal `-slide` placed the caret several characters into `pitch
deck`. The Image / Video / Audio chips with their pre-filled
subject, style, aspect values reproduced the same drift.
Render the inline pills as read-only `<span>`s carrying the exact
substring the textarea shows at that position, mark them
`aria-hidden` so the textarea remains the single labelled control,
and surface every plugin input field — including the ones referenced
inline — in `PluginInputsForm` underneath. Editing flows through the
form, the parent's `updateActiveInputs` already re-renders the
prompt, and the pills stay aligned with the textarea on every
keystroke.
Also drop the now-unused inline helpers (updatePluginInput,
getTemplateInputNames, shouldRenderSlotAsText, inlineFieldType,
fileInputLabel, fileMetadata) and the dead
`.home-hero__prompt-slot-control/input/select/toggle/file/text` CSS
rules.
Verified:
- pnpm --filter @open-design/web typecheck
- vitest run on HomeHero.plugin-picker, HomeHero.rail, and
HomeView.prefill (29/29 pass; tests updated to reflect the new
read-only span + always-on form contract)
- Manual click-to-edit on Slide deck and Image chips in the
pnpm tools-dev web runtime — caret now lands where the user
clicked.
|
||
|
|
0d2c87242f
|
fix(web): snapshot pending before reload so expired-auth cancel actually fires (#1972)
reloadConnectorStatuses commits its prune setState (and runs the connectorAuthorizationPendingRef mirror effect) between the two awaits in onFocus, so cancelStaleAuthorizations was reading a post-prune ref and never identified anyone as stuck. Capture the snapshot inside onFocus before reloadConnectorStatuses runs, and pass it explicitly so the expired-auth daemon cancel actually fires. Refs #1909 #1354 |
||
|
|
d36ceacf3a
|
feat(web): surface saved Project instructions for review and retrieval (#1933)
Saved Project instructions had no post-save read-back surface. The header control was a bare pencil icon that opened an editor and closed on Save, so after returning to the workspace a user could not confirm what was stored, revisit it, or tell whether it was still active (#1822). Make the saved state discoverable: once instructions exist, the header affordance becomes a persistent "Project instructions" chip. Opening it shows a read-only review panel with a preview of the saved text and an "Active - included in every message" status, plus a reopen-to-edit action. Saving lands back on the review panel so the stored value is read back immediately. The empty state keeps the pencil and opens the editor directly. Persistence and project-level system-prompt injection are unchanged. Closes #1822 Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
647433ccef
|
Fix Inspect preview transport flash (#1967)
Some checks failed
ci / Packaged mac smoke (push) Blocked by required conditions
ci / Packaged windows smoke (push) Blocked by required conditions
ci / Detect PR change scopes (push) Failing after 2s
ci / Validate workspace (push) Has been skipped
nix-check / build (push) Failing after 1s
ci / Packaged linux headless smoke (push) Has been skipped
* Fix inspect preview transport flash Co-authored-by: multica-agent <github@multica.ai> * Keep inactive srcdoc transport lazy Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
30ad8b8ac3
|
improve privacy consent modal: policy link, clearer CTAs, mobile layout (#1921)
The first-run consent banner had no link to a privacy policy, an
affirmative button ("Help improve") that didn't read as a consent
choice, and a fixed bottom-right card that crowded content on phones.
- Add a "Read the privacy policy" link (external-link icon, accent,
underlined) above the actions, plus a root PRIVACY.md it points to
documenting the telemetry behaviour the modal discloses.
- Rename the CTAs to "Share usage data" / "Don't share" so both name
the action; they stay equal-prominence per the EDPB/GDPR comment.
- Stretch the banner to a bottom-edge bar under 540px with a
safe-area inset so it clears mobile browser chrome.
- Add PrivacyConsentModal tests; sync the new i18n key to every
locale and update the consent-label assertion in App.connectors.
Refs #1756
|
||
|
|
9cc1fb28f3
|
fix(web): apply Tweaks palette shift to :root CSS custom properties (#1952)
Walk document.styleSheets so chromatic custom properties on :root/html/body/:host (including @media-nested rules) are hue-shifted via inline setProperty, not just the values returned by getComputedStyle. Fixes #1393 References: PR #1643 (open, in review) binds the Tweaks toolbar toggle to artifact `.tw-panel` panels via a new `injectTweaksBridge` in `srcdoc.ts` — adjacent surface but different problem. Our fix extends `injectPaletteBridge` to also shift `:root` CSS custom-property colors, which #1643 does not touch. |
||
|
|
b7fa1d9286
|
fix(web): recover stuck Composio connector when auth is cancelled (#1909)
* fix(web): recover stuck Composio connector when auth is cancelled When a user closes the Composio system-browser auth window without completing the flow, the connector card stayed in its loading state forever because nothing told the daemon the request was abandoned. The window-focus refresh now awaits the status poll and, for every connector still in connectorAuthorizationPending whose freshly fetched status is not 'connected', issues cancelConnectorAuthorizationRequest and clears the matching local pending/error UI state. The connected guard is read from the just-returned statuses so a postMessage success that lands just before focus is not racy. Fixes #1354 * fix(web): gate Composio focus auto-cancel on TTL + cancel response Require expiresAt to have elapsed before auto-cancelling a pending Composio auth on focus, and preserve the spinner on a failed cancel (matching the manual Cancel handler) so the UI never disagrees with the daemon. Refs https://github.com/nexu-io/open-design/pull/1909#discussion_r3254152045 Refs https://github.com/nexu-io/open-design/pull/1909#discussion_r3254152046 |
||
|
|
324eca27ea
|
feat(web): add manual removal for captured Pod components (#1951)
* feat(web): add manual removal for captured Pod components Adds `removePodMember` helper and a per-chip × in `BoardComposerPopover`; leaves `comments.ts` untouched (avoid-zone from #1127). Closes #802 Contract: runs/2026-05-16T08-08-52_open-design_issue-feat/contract.md * style(web): hide Pod chip × until chip hover Swaps the unicode × for the existing `Icon name="close"` SVG so the hit target stays centered, and fades the button in only on chip hover / keyboard focus for a quieter resting state. * fix(web): auto-close Pod composer when last chip is removed Removing the last chip leaves a stale anchor; close so Send cannot attach to elements no longer visible. * refactor(web): extract BoardComposerPopover and Pod-member reducer Moves the popover to its own module and lifts the chip-removal reducer into a pure `applyPodMemberRemoval` so unit tests exercise the real code path and the popover's export is no longer test-only. * fix(web): rebuild Pod anchor when a member is removed Without this, the popover keeps the original union bbox / selector / label after each chip removal, so a subsequent Send to chat anchors the comment to elements no longer in the Pod. * fix(web): render every captured chip and scroll on overflow The previous slice(0, 6) cap left chips beyond the sixth invisible and undeletable. Render the full list inside a 132px-tall scrollable strip. |
||
|
|
10c62bf6e4
|
fix(web): warn before editing a built-in skill creates a shadow (#1850)
Clicking Edit on a built-in skill silently issued PUT /api/skills/:id, which the daemon stores as a user-owned shadow and then hides the built-in entry from the Settings list. From the user's perspective the row they just edited disappears with no explanation. Arm an inline override-warning banner on Edit for source==='built-in' skills, mirroring the existing inline delete-confirm pattern. The edit form only opens after the user explicitly confirms. User skills bypass the warning and edit directly. No daemon or listSkills contract change. Fixes #1378 |
||
|
|
a6288cec3f
|
fix(web): allow editing existing routines from RoutinesSection (#1884)
The Routines list only exposed Run now / Pause / History / Delete, so any change to a routine required deleting and recreating it. The daemon already serves PATCH /api/routines/:id; only the web entry point was missing. Add an Edit button on each routine row that pre-populates the create form from the stored schedule and target via a new formFromRoutine helper, and branch the submit handler on editingId so it PATCHes /api/routines/:id with the updated fields instead of POSTing a new routine. Fixes #1373 |
||
|
|
bac56415a2
|
fix(web): surface daemon error messages for invalid folder imports (#1923)
* fix(web): surface daemon error messages for invalid folder imports importFolderProject() was swallowing non-2xx responses by returning null, so the UI could only show a generic "Open folder failed: <path>" message even though the daemon already returns specific errors like "cannot import the filesystem root" or "folder not found". Parse the daemon error body and throw so the panel displays the actual reason. Also show feedback for empty path input instead of silently returning. Fixes #1186 * test(web): update folder import test to match new error propagation The existing test expected a generic "Open folder failed: <path>" message from a boolean return. Update to match the new behavior where the daemon's error message is thrown and displayed directly. |
||
|
|
e64f1d8497
|
fix(desktop): export PDF saves to a file instead of the OS print dialog (#1920)
* fix(desktop): export PDF saves to a file instead of the OS print dialog The `od:print-pdf` IPC handler called `webContents.print()`, which opens the printer-first macOS system print dialog. That handler is the destination of the renderer's `window.__odDesktop.printPdf()` bridge, so on desktop "Export PDF" felt like a print flow rather than a file export — from both the PreviewModal share menu (which always uses this path) and the FileViewer share menu (whose daemon-route fallback lands here too). Route the handler through a direct Save-as-PDF flow instead: a native Save dialog, then `webContents.printToPDF()` straight to the chosen file — the same shape `exportPdfFromHtml` already uses for the daemon-backed export path. The flow is extracted into `savePrintReadyDocumentAsPdf` behind a structural `PrintReadyPdfTarget` surface that has no `print()` method, so the regression cannot be reintroduced. Fixes #1774 Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): preserve PDF page sizing in print bridge Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> |
||
|
|
2f553941d3
|
fix(runtime): auto-annotate imported HTML elements for Tweaks selection (#892) (#1169)
* fix(runtime): auto-annotate imported HTML elements for Tweaks selection (#892) * fix(runtime): always annotate missing od-ids and broaden selector (#892) - Remove the conditional gate so annotateMissingOdIds runs for every srcdoc, not just comment/inspect bridges. This fixes the persistence regression where saved Inspect tweaks reference synthesized data-od-id selectors that vanish when the bridge is rebuilt without annotations. - Expand the selector to cover div-based imported HTML (div[class], div[id]), headings, buttons, and links so Picker/Pods work on common anonymous wrapper markup. Addresses review feedback from lefarcen and mrcfps. * fix(runtime): narrow div selector, skip iframe/object/embed, add tests (#892) - Replace broad div[class] with direct-child combinators only under semantic containers, body, and [id] elements to avoid layout noise - Add iframe, object, embed to the skip list alongside script/style - Add tests for: direct-child divs, nested div skip, always-on behavior, skip list with id attributes, div children of [id] elements - Format selector as array join and skip tags as Set for readability * fix(runtime): let annotateManualEditSourcePaths coexist with data-od-id (#892) annotateMissingOdIds now runs unconditionally, so elements like main/h1 get data-od-id before annotateManualEditSourcePaths runs. The old skip condition (has data-od-id) incorrectly prevented source-path marking on those elements. Change the guard to skip only when the element already has data-od-source-path, allowing both attributes to coexist. |
||
|
|
26502ea124
|
test(web): decouple memory preview icon assertion (#1863) | ||
|
|
6b15236843
|
fix(web): distinguish expanded memory preview action (#1813) | ||
|
|
19d65678a3
|
fix(web): keep filter pill hover labels readable (#1828) | ||
|
|
9689dce1ad
|
fix(web): align comment marker numbering (#1826) | ||
|
|
444f7d03eb
|
fix(web): reveal memory editor after edit click (#1827) | ||
|
|
7d9488300e
|
fix(web): keep draw overlay scrollable (#1848) | ||
|
|
c4a8e92916
|
fix(web): add breathing room to plugin publish footer (#1849) | ||
|
|
22a3b99a47 |
Merge origin/main into preview/v0.8.0
Sync 49 commits from main. Conflicts resolved:
- .github/workflows/ci.yml: kept v0.8.0 granular per-area gating, added main's
linux specs + release-stable.yml + release-preview.yml triggers
- .github/workflows/release-preview.yml: kept v0.8.0's full workflow over main's placeholder
- apps/web/src/components/AssistantMessage.tsx: combined v0.8.0 file-ops
summary with main's stripTodoToolGroups + suppressAskUserQuestionFallbackText
- apps/web/src/components/ChatPane.tsx: kept both new imports
- apps/web/src/index.css: kept both .msg-plugin-chip and .user-copy-btn blocks
- e2e/ui/*.test.ts: kept v0.8.0 openEntrySettingsDialog helper over main's
inline dialog navigation (UI was redesigned in v0.8.0)
- nix/package-{daemon,web}.nix: kept v0.8.0 pnpmDepsHash; rerun nix build to refresh
|
||
|
|
15ebe8b266
|
fix(web): keep picker hint clear of comments panel (#1820) | ||
|
|
fd80d7c997
|
fix(web): clear draw ink when exiting mode (#1821) | ||
|
|
9cfb01e6b0
|
feat: add Italian (it) locale support (#1323)
* feat: add Italian (it) locale support - Add complete Italian translation dictionary (apps/web/src/i18n/locales/it.ts) - Register 'it' locale in types.ts (Locale type, LOCALES array, LOCALE_LABEL) - Import and register Italian dictionary in index.tsx - All 1352 translation keys translated to Italian - Follows the same structure as French locale (PR #326) Closes #1245 * test: Update locale tests to include Italian (it) 1. Add 'it' to EXPECTED_LOCALES array 2. Add LOCALE_LABEL.it assertion to verify 'Italiano' label This fixes the CI test failure and completes the Italian locale integration. --------- Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
4e19c3f4f3
|
Prevent imported Claude canvases from zooming on scroll (#1726)
* Preserve HTML preview state across mode toggles
HTML previews could rebuild their iframe when switching into Edit or Comment, which reset scroll/canvas state and caused visible churn for multi-file artifacts. The viewer now keeps URL-loaded previews mounted when the artifact owns the mode bridge, relays file-refreshes through frame navigation, and restores preview scroll/viewport state across bridge mode changes.
Constraint: Generic srcdoc-only bridges are still required for unbridged artifacts, inspect mode, palette tweaks, decks, draw overlays, and forceInline.
Rejected: Keep all Edit/Comment previews on srcdoc | causes unnecessary iframe replacement for bridge-capable URL-loaded artifacts.
Confidence: high
Scope-risk: moderate
Directive: Do not enable URL-load for bridge-dependent modes unless the artifact has an owned postMessage bridge.
Tested: pnpm guard
Tested: pnpm --filter @open-design/web typecheck
Tested: pnpm --filter @open-design/web test
Tested: Playwright verified Edit and Comment toggles preserve iframe src and DOM node while receiving comment targets.
* Prevent preview wheel gestures from escaping into zoom
Trackpad pinch-like wheel events arrive with ctrl/meta modifiers on some platforms, which can make a normal vertical scroll feel like the preview zoomed. The preview now consumes those modified wheel events inside the host preview shell and in injected srcdoc previews, then maps the delta back to scroll where a scroll target exists.
Constraint: URL-loaded sandbox iframes cannot always be inspected by the host, so srcdoc previews need their own in-frame guard.
Rejected: Add allow-same-origin to preview iframes | weakens the sandbox boundary for generated artifacts.
Confidence: medium
Scope-risk: narrow
Directive: Do not broaden iframe sandbox permissions to fix gesture handling without a security review.
Tested: pnpm guard
Tested: pnpm --filter @open-design/web typecheck
Tested: pnpm --filter @open-design/web exec vitest run tests/components/FileViewer.test.tsx tests/runtime/srcdoc.test.ts
Tested: playwright-cli verified ctrl-wheel in preview keeps app zoom at 100% and prevents default in the iframe context
* Revert "Prevent preview wheel gestures from escaping into zoom"
This reverts commit
|
||
|
|
d16acf6462
|
fix: Add error feedback for manual folder path import (#1666)
* fix: Add error feedback for manual folder path import Fixes #1408 When users manually enter a folder path and click 'Open folder' (non-Electron environment), the app now provides clear error feedback if the import fails. **Before:** - No error clearing before import - No error handling for failed imports - Silent failures left users confused **After:** - Clears previous errors before attempting import - Catches and displays import errors with clear messages - Success feedback is implicit (navigation to the opened project) **Why implicit success feedback:** The parent handler (Home.tsx) navigates to the newly opened project on success, which provides clear visual feedback by changing the entire view. An additional toast would be redundant. **Error handling:** - Catches all errors from onImportFolder - Displays user-friendly error messages - Preserves error details when available * fix: surface failed folder imports --------- Co-authored-by: Siri-Ray <2667192167@qq.com> |
||
|
|
cfcfbe0178
|
Inline attached file context for BYOK chats (#1730)
BYOK/API-mode chats bypass the daemon run path, so attached project files were saved as message metadata but their readable contents were not sent to the provider. This adds a web-side attachment context step for API-mode requests, reusing raw text reads and existing document preview extraction. Constraint: Docker PDF previews require pdftotext in the runtime image Confidence: high Scope-risk: moderate Tested: corepack pnpm --filter @open-design/web test -- tests/api-attachment-context.test.ts tests/components/ProjectView.api-empty-response.test.tsx Tested: corepack pnpm --filter @open-design/web typecheck Tested: corepack pnpm --filter @open-design/web build Tested: corepack pnpm guard Tested: corepack pnpm typecheck |
||
|
|
b2d2635360
|
fix(web): hide resolved comments from preview overlays (#1762) | ||
|
|
88db51521d
|
feat(web): add custom select primitive (#1714)
Some checks failed
ci / Packaged mac smoke (push) Blocked by required conditions
ci / Packaged windows smoke (push) Blocked by required conditions
ci / Detect PR change scopes (push) Failing after 2s
ci / Validate workspace (push) Has been skipped
nix-check / build (push) Failing after 1s
* feat(web): add custom select primitive * fix(web): harden custom select active option state |
||
|
|
c5d77a03bd
|
Garnet hemisphere (#1769)
Some checks failed
nix-check / build (push) Failing after 2s
* feat(chat-composer): enhance mention handling and input overlay - Introduced a new overlay for inline mentions in the chat composer, improving user experience by visually indicating mentions as users type. - Updated the `ChatComposer` component to manage mention entities and integrate them into the input field, allowing for better context and interaction. - Enhanced the `AssistantMessage` component to support the display of plugin action panels based on the current project context, facilitating easier plugin management. - Refactored related components to ensure consistent handling of project files and mentions across the application. This update significantly improves the chat interaction model, making it more intuitive for users to engage with mentions and plugins. * feat(plugin-management): enhance plugin action panels and UI components - Updated the `AssistantMessage` component to include plugin action panels based on the latest project context, improving user interaction with generated plugins. - Refactored the `PluginsView` to support detailed views for available marketplace entries, allowing users to access more information and actions for each plugin. - Introduced new CSS styles for improved visual representation of plugin-related UI elements, enhancing overall user experience. - Enhanced the `listPlugins` function to include an option for fetching hidden plugins, providing more flexibility in plugin management. This update significantly improves the usability and functionality of the plugin management system, making it easier for users to interact with and manage their plugins. * fix(assistant-message): refine plugin folder candidate selection logic - Updated the `pluginFoldersTouchedThisTurn` function to improve the logic for selecting plugin folder candidates based on touched paths and message content. - Introduced a new helper function, `pathMatchesFolderFileBasename`, to enhance the matching criteria for folder candidates. - Added a check for explicit folder matches before falling back to a single candidate, improving accuracy in folder selection. - Modified the `shouldRenderSlotAsText` function in `HomeHero` to include the name parameter, refining the rendering logic for slot text. These changes enhance the functionality and reliability of the assistant message component in managing plugin folder candidates. * feat(plugin-folder-actions): implement agent-routed CLI actions for plugin management - Introduced a new `PluginFolderAgentAction` type to streamline actions related to plugin folders, including install, publish, and contribute. - Updated the `DesignFilesPanel`, `FileWorkspace`, and `AssistantMessage` components to utilize the new agent action handling, improving user interaction with generated plugins. - Refactored the action handling logic to send commands to the agent, enhancing the workflow for managing plugin folders. - Added corresponding tests to ensure the new functionality works as expected and integrates seamlessly with existing components. This update significantly enhances the plugin management experience by routing actions through the agent, allowing for a more cohesive and interactive user experience. * Fix PR 1702 CI blockers * Fix PR 1702 remaining CI checks * Prebuild AGUI adapter after install * Restore plugin project snapshot wiring * feat(marketplace): refactor marketplace URL handling and enhance fetching logic - Introduced new functions to normalize marketplace URLs and manage fetching of marketplace manifests, improving the reliability of marketplace integrations. - Updated the server and plugin logic to utilize the new fetching mechanisms, ensuring consistent handling of marketplace data. - Enhanced tests to cover new URL normalization and fetching scenarios, ensuring robustness in marketplace management. This update significantly improves the marketplace experience by streamlining URL handling and enhancing data fetching capabilities. * Fix project auto-send cleanup spec * Reconcile run messages on cancel * Use active design system as visual direction * Fix active design system prompt wording * feat(workspace-tabs): implement workspace tabs functionality and file attachment handling - Introduced a new `WorkspaceTabsBar` component to manage workspace tabs, allowing users to navigate between different views (projects, marketplace, etc.). - Enhanced file handling capabilities in the `HomeHero` and `EntryShell` components, enabling users to stage and attach files before project creation. - Updated the `App` component to support auto-sending attachments alongside the first message in a project. - Improved CSS styles for workspace tabs and attachment UI, ensuring a cohesive design and user experience. This update significantly enhances the workspace navigation and file management features, providing users with a more intuitive and efficient workflow. * refactor(workspace-tabs): streamline workspace tabs and UI components - Removed unused components and actions from the `WorkspaceTabsBar` and `AppChromeHeader`, simplifying the codebase. - Updated CSS styles for the workspace shell and tabs, enhancing visual consistency and reducing element sizes for a cleaner layout. - Introduced a new client type detection mechanism to dynamically adjust the workspace shell's class, improving responsiveness. - Added tests for the `WorkspaceTabsBar` to ensure proper navigation and tab management functionality. These changes improve the overall performance and user experience of the workspace navigation system. * Update critical e2e for entry modal flow * Stabilize entry critical e2e flows * fix(ui): adjust workspace tabs and header styles for improved layout - Updated the CSS for workspace tabs and the app header, reducing element sizes and padding for a cleaner appearance. - Introduced a new button in the `WorkspaceTabsBar` for quick access to the home tab, enhancing navigation. - Minor adjustments to the layout and styles to ensure consistency across components. These changes enhance the user interface and improve the overall user experience in the workspace navigation system. * feat(workspace-tabs): implement pinned home tab functionality - Added a new pinned home tab feature to the `WorkspaceTabsBar`, allowing the home tab to remain accessible during navigation. - Updated tab management logic to collapse duplicate home tabs into a single pinned instance when restoring from local storage. - Enhanced CSS styles for workspace tabs to accommodate the new pinned tab design. - Updated tests to verify the behavior of the pinned home tab and its interaction with other tabs. These changes improve navigation consistency and user experience within the workspace. * refactor(workspace-tabs): enhance tab management and styling - Updated CSS styles for workspace tabs, adjusting padding and flex properties for improved layout and consistency. - Refactored tab creation logic to ensure unique IDs for project and marketplace tabs, enhancing navigation clarity. - Removed deprecated functions related to pinned home tabs, streamlining the codebase. - Improved test cases to verify independent behavior of home tabs during navigation. These changes enhance the user experience by providing a more intuitive tab management system and a cleaner UI. * style(workspace-tabs): update CSS for improved layout and visibility - Adjusted CSS properties for workspace tabs, including overflow, position, and z-index to enhance layout and stacking context. - Ensured consistent styling across tab components for better visual hierarchy. These changes contribute to a more polished and user-friendly interface within the workspace. * style(entry-layout): update CSS variables for improved layout consistency - Replaced fixed width values with CSS variables for the entry rail to enhance flexibility. - Adjusted padding and height properties for better visual alignment and spacing. - Introduced a new background style for the entry main topbar to improve aesthetics. These changes contribute to a more responsive and visually appealing layout in the entry view. --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> Co-authored-by: Eli <129168833+qiongyu1999@users.noreply.github.com> |
||
|
|
843b6fec4f
|
fix(web): fall back to srcDoc when HTML preview needs sandbox shim (#1306)
* fix(web): fall back to srcDoc preview when HTML needs the sandbox shim The URL-load HTML preview iframe is sandboxed with `allow-scripts` only — no `allow-same-origin` — so any artifact that reads `localStorage`/`sessionStorage` at startup throws SecurityError, its React tree unmounts, and the preview goes blank. The srcDoc path already polyfills both via `injectSandboxShim` (apps/web/src/runtime/srcdoc.ts) before any user script runs, but URL-load served raw HTML untouched. Agent-emitted React prototypes that read Web Storage at mount went blank until the user toggled Tweaks (which forces the srcDoc path). Detect the two reliable signals — `<script type="text/babel">` (Babel-standalone XHR-fetches and evals sibling `.jsx` files at runtime; those routinely read Web Storage from `useState` initializers) and direct `localStorage` / `sessionStorage` references in the source — and set `forceInline` automatically so those artifacts route back through the srcDoc path. Plain static HTML keeps the URL-load benefits (real source maps, per-asset HTTP caching, isolated per-script failures). No new daemon endpoint, no new contract, no sandbox loosening. Pure content sniff in the existing render-mode helper; reuses the same `forceInline` seam the `parseForceInline` opt-out already uses. Tests cover the new helper across positive (Babel-standalone variants with attribute reordering, quoting, whitespace, case) and negative (plain script tags, module type, JSON type, substring lookalikes) cases. * fix(web): address review feedback on sandbox-shim fallback - Accept unquoted `<script type=text/babel>` per HTML5 attribute syntax (regex `["']` → `["']?`). Adds a focused test covering bare unquoted, unquoted with unquoted `src=`, mixed unquoted/quoted, and the negative case `<script type=text/babelish>` to confirm the trailing word-boundary still rejects look-alikes. - Memoize `htmlNeedsSandboxShim(source)` on `source`. HtmlViewer re-renders on board/inspect/edit/slide state changes; the scan only changes when the source itself does. Cheap micro-opt, free correctness win. - Narrow the helper docstring's scope claim and add an explicit known limitation: external scripts (`<script src="./app.js">`, `<script type="module" src="./main.js">`) that read Web Storage during module eval are *not* covered — the helper only sees the document, not the linked subresource. Workaround documented: `?forceInline=1` or Tweaks. Catching this case would require fetching every script reference before deciding load strategy, duplicating browser work; not worth the cost until a real report surfaces. * fix(web): correct inline comment on `\b` boundary behavior The comment claimed `\b` rejects `text/babel-other`, but `\b` matches between `l` and `-` (hyphen is a non-word char), so the regex actually does match that input. The test asserts `text/babelish` as the negative case, which `\b` does correctly reject (`i` is a word char). Comment now matches the regex's actual behavior, with a note that hyphenated variants are a harmless false positive (srcDoc fallback is the safe direction) and a pointer to the `(?=[\s>"'])` lookahead tightening if a real case ever surfaces. No behavior change; existing tests still pass. * fix(web): align test comment with helper docstring on hyphenated variants Same class of inconsistency the previous commit fixed in the helper: the test comment claimed `type=text/babel-other` "remains a non-match", but the assertion actually covers `type=text/babelish`, and the helper docstring explicitly documents hyphenated variants as a safe false-positive that does match. Comment now describes both shapes correctly and explains why the hyphenated variant isn't asserted (it's the documented safe direction, not a regression). No behavior change; test count unchanged. * chore: trigger CI |
||
|
|
bcc58af931
|
refactor(web): rename Execution mode and tighten settings dialog UI (#1568)
* refactor(web): rename Execution mode and tighten settings dialog UI - Rename "Settings → Execution & model" to "Settings → Execution mode" across the web UI, i18n keys, docs, and e2e selectors. - Redesign SettingsDialog: kicker + title row in the modal head, a flatMap-driven agent grid that renders the inline test-result row beside the selected card, compact unavailable cards with right-aligned install/docs links, and an install guide that only shows when the user has no working agent picked. - Trim verbose subtitle / hint copy across chat model, CLI proxy, media providers, custom instructions, and memory sections. - Add an `info` Icon variant for the redesigned settings hints. - Update e2e selectors and docs that referenced the old menu label. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(web): polish Settings dialog — media providers, skills, MCP Media providers - Hide internal Stub fixture provider (settingsVisible: false) - Split provider list into Available (integrated, editable) and Coming Soon (collapsed <details> drawer with name/hint/Docs link only) - Drop right-side Integrated/Configured badges from every row; all rows in the main list are integrated by definition; inline grey "Saved" chip next to the provider name is the only status indicator now - "Saved" badge moves inline to the right of the provider name and uses a neutral grey treatment (was a standalone green pill below the name) - "Reload from daemon" button shows a 2s green "✓ Reloaded" flash on success instead of leaving a permanent paragraph under the header; errors remain sticky Skills - Replace three pill-row filter banks (Source, Type, Category) with a compact single-row toolbar: search + three inline <select> dropdowns side by side; active filter highlighted with a stronger border MCP server - Shorten section hint to one line - Move WHAT YOUR AGENT CAN DO capabilities above the client dropdown (motivate before asking to act) - Move "Build the daemon first" warning below the code block where it contextually explains why the command might fail, not as a top-level error before the user has done anything - Downgrade "Restart your client" left-border from accent orange to border-strong grey — it is a next step, not a warning External MCP - Shorten section hint to one line Misc CSS - Add .sr-only utility for accessible off-screen live regions - Add button.ghost.is-success-flash for transient success feedback - Add .library-filter-selects / .library-filter-select for dropdown filter rows - Add .media-provider-coming-soon-* for the roadmap drawer Co-authored-by: Cursor <cursoragent@cursor.com> * [codex] Add Cursor Agent auth diagnostics (#1538) * Add Cursor Agent auth diagnostics * Handle Cursor not logged in auth status * Address Cursor auth review feedback * Classify Cursor stdout auth failures * test: expand Memory and Routines coverage (#1521) * test: expand settings and packaged coverage * test: extend memory settings coverage * test: cover routine settings failure states * test: cover routine operation failures * test: fix daemon test typing on CI * test: decouple packaged smoke from orbit bug * test: avoid live memory LLM calls in route tests * test: fix daemon fetch typing in CI * fix: restore preview comment and inspect toggles * test: align manual edit flow with current inspector UX * test: align comment attachment flow with current preview comments UI * fix: probe resolved Codex launch path during detection * fix: remove duplicate board activation helper after rebase * test: update ghost cli detection mock * test: align FileViewer toolbar expectation * ci: move full app tests to extended lane * ci: run app tests by changed scope * ci: cover shared app inputs in test scopes * ci: avoid setup-node cache in windows packaged smoke * test: align extended settings and manual edit flows * refactor(web): rename Execution mode and tighten settings dialog UI - Rename "Settings → Execution & model" to "Settings → Execution mode" across the web UI, i18n keys, docs, and e2e selectors. - Redesign SettingsDialog: kicker + title row in the modal head, a flatMap-driven agent grid that renders the inline test-result row beside the selected card, compact unavailable cards with right-aligned install/docs links, and an install guide that only shows when the user has no working agent picked. - Trim verbose subtitle / hint copy across chat model, CLI proxy, media providers, custom instructions, and memory sections. - Add an `info` Icon variant for the redesigned settings hints. - Update e2e selectors and docs that referenced the old menu label. Co-authored-by: Cursor <cursoragent@cursor.com> * refactor(web): settings dialog UX polish — layout, dedup, and interactions - Remove duplicate section headers from all settings sections (Notifications, Appearance, Privacy, About, Design Systems, Skills, MCP server, Connectors, Media providers, Routines) - Restructure Notifications cards: title + toggle on same row, hint below - Restructure Skills toolbar: search + New skill button in row 1, filter dropdowns in row 2 with left-aligned labels - Restructure Pet section: tabs and Wake button on same row - MCP server: group capabilities and setup into separate cards, remove nested double border on client picker - Connectors: show connect errors as toast instead of inline card text, position toast inside panel, hide single-provider tab - Media providers: move Reload button to left-aligned small ghost button - Memory: info icon shows path on hover, Path copied badge inline; Extraction history and MEMORY.md as standalone collapsible cards; group header hidden when only one type visible - Pet grid cards: Adopt button hidden until hover, icon-only when adopted, description truncated to 2 lines, text fills full width via abs positioning - Agent cards: selected state uses accent border only, no background change - Add sun/moon icons to Appearance theme buttons (Light/Dark) - Shorten several hint strings for clarity Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): resolve i18n review comments from PR #1568 - Update settings.title and settings.envConfigure to localized "Execution mode" in all 17 non-English locale files - Add settings.memoryFlashPathCopied to all locales and use t() in MemorySection instead of hardcoded English "Path copied" - Add settings.agentModelHead to all locales and use t() in SettingsDialog for "Model for:" agent model row header Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): update tests to match settings dialog redesign - Add role prop to Toast (alert/status) so error toasts from ConnectorsBrowser are announced immediately by screen readers - Clear connectErrorToast on successful connector retry - Update SettingsDialog.execution tests: - Remove heading assertions for About and MCP server (headers were intentionally removed as duplicate nav labels) - Rewrite CLI env test to use codex-only fields (per-agent filtering means only selected agent's fields are shown) - Update Composio key hint text assertion to match shortened copy - Replace filter button click with select change for Type filter - Replace Configured/Unsupported/Integrated badge checks with updated assertions matching the new media provider UI - Replace disabled BFL row test with coming-soon section check - Update SettingsDialog.media test: remove Fal.ai input assertions (non-integrated providers no longer have editable fields) Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): unblock CI for #1568 Three small fixes to get Playwright back to green on the settings dialog redesign: 1. `en.ts`: revert `settings.envConfigure` to "Configure execution mode". This PR collapsed both `settings.title` (header gear) and `settings.envConfigure` (entry-side foot pill) to the same string "Execution mode", so `getByRole('button', { name: 'Execution mode' })` resolved to two elements and tripped Playwright strict mode in the three Composio-flow tests (entry-configuration-flows.test.ts:174, 228, 285). Restoring the distinct label also gives screen readers a clearer hint for the pill, which doubles as a status display. Non-English locales still alias the two keys; happy to follow up on those, but they don't gate the (English-only) Playwright suite. 2. entry-configuration-flows.test.ts:167 — `Connectors` heading is now rendered at `<h2>` in the modal-head (SettingsDialog.tsx:1545), with the inner `<h3>` removed by design (see comment around line 1448). Updated the assertion from `level: 3` to `level: 2`. 3. project-management-flows.test.ts:360 — same change for the `Pets` heading. Verified locally with `pnpm --filter @open-design/web typecheck` and `pnpm --filter @open-design/e2e typecheck`. The actual Playwright specs need the dev server up; I didn't rerun them here, but the locator changes are mechanical and match the new DOM. * fix(web): use exact match for Execution mode button locator Playwright's `getByRole({ name })` defaults to substring matching, so `{ name: 'Execution mode' }` still resolved to both the header gear (aria-label "Execution mode") and the entry-side foot pill (aria-label "Configure execution mode" — substring contains "Execution mode"). Strict mode tripped in the three composio-flow tests at lines 202, 257, and 319. Adding `exact: true` makes each call resolve to just the header gear, which opens the same dialog the foot pill does — the test outcomes are unchanged. --------- Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com> Co-authored-by: shangxinyu1 <shangxinyu@refly.ai> Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
a41d4f6126
|
fix(web): keep chat pinned during content growth (#1716) | ||
|
|
01e54700a2
|
fix(web): make file grouping by kind work (#1551)
* fix(web): group design files by kind * fix(web): unblock CI for #1551 - FileViewer test (line 434): add missing `projectKind="prototype"` to match every other instance; this was the source of the typecheck failure blocking workspace validation. - DesignFilesPanel "groups files by kind" test: assert against `.df-section-label` elements so the section header check is not ambiguous with the per-row kind cell text. - DesignFilesPanel batch-delete test: derive the expected file names from the rendered row testids and use `arrayContaining` so the assertion no longer depends on the (now kind-default) row order. * fix(web): satisfy strict-index typecheck in batch-delete test `onDeleteFiles.mock.calls[0][0]` tripped `noUncheckedIndexedAccess` ("Object is possibly 'undefined'"). Drop the separate length probe and assert the exact array instead — `selected` is a `Set`, `handleBatchDelete` spreads it with `[...selected]`, and the test clicks rows[0]/rows[1] in that order, so insertion order is deterministic and equals `[firstName, secondName]`. --------- Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
8aeedf368b
|
fix(web): localize accent controls in settings (#1565)
* fix(web): localize accent controls * fix(web): localize accent default label * fix(web): unblock CI for #1565 Add missing `projectKind="prototype"` to the FileViewer deck-render test (line 434) so workspace typecheck stops failing on the `Property 'projectKind' is missing` error. This mirrors every other FileViewer render in the same file and is unrelated to the accent localization changes in this PR — it's drift from a recent change on main that made `projectKind` required. --------- Co-authored-by: lefarcen <935902669@qq.com> |
||
|
|
b0963fd874
|
fix(web): allow downloads from preview iframes (#1732) | ||
|
|
76defffb93
|
Garnet hemisphere (#1702)
* feat(chat-composer): enhance mention handling and input overlay - Introduced a new overlay for inline mentions in the chat composer, improving user experience by visually indicating mentions as users type. - Updated the `ChatComposer` component to manage mention entities and integrate them into the input field, allowing for better context and interaction. - Enhanced the `AssistantMessage` component to support the display of plugin action panels based on the current project context, facilitating easier plugin management. - Refactored related components to ensure consistent handling of project files and mentions across the application. This update significantly improves the chat interaction model, making it more intuitive for users to engage with mentions and plugins. * feat(plugin-management): enhance plugin action panels and UI components - Updated the `AssistantMessage` component to include plugin action panels based on the latest project context, improving user interaction with generated plugins. - Refactored the `PluginsView` to support detailed views for available marketplace entries, allowing users to access more information and actions for each plugin. - Introduced new CSS styles for improved visual representation of plugin-related UI elements, enhancing overall user experience. - Enhanced the `listPlugins` function to include an option for fetching hidden plugins, providing more flexibility in plugin management. This update significantly improves the usability and functionality of the plugin management system, making it easier for users to interact with and manage their plugins. * fix(assistant-message): refine plugin folder candidate selection logic - Updated the `pluginFoldersTouchedThisTurn` function to improve the logic for selecting plugin folder candidates based on touched paths and message content. - Introduced a new helper function, `pathMatchesFolderFileBasename`, to enhance the matching criteria for folder candidates. - Added a check for explicit folder matches before falling back to a single candidate, improving accuracy in folder selection. - Modified the `shouldRenderSlotAsText` function in `HomeHero` to include the name parameter, refining the rendering logic for slot text. These changes enhance the functionality and reliability of the assistant message component in managing plugin folder candidates. * feat(plugin-folder-actions): implement agent-routed CLI actions for plugin management - Introduced a new `PluginFolderAgentAction` type to streamline actions related to plugin folders, including install, publish, and contribute. - Updated the `DesignFilesPanel`, `FileWorkspace`, and `AssistantMessage` components to utilize the new agent action handling, improving user interaction with generated plugins. - Refactored the action handling logic to send commands to the agent, enhancing the workflow for managing plugin folders. - Added corresponding tests to ensure the new functionality works as expected and integrates seamlessly with existing components. This update significantly enhances the plugin management experience by routing actions through the agent, allowing for a more cohesive and interactive user experience. * Fix PR 1702 CI blockers * Fix PR 1702 remaining CI checks * Prebuild AGUI adapter after install * Restore plugin project snapshot wiring * feat(marketplace): refactor marketplace URL handling and enhance fetching logic - Introduced new functions to normalize marketplace URLs and manage fetching of marketplace manifests, improving the reliability of marketplace integrations. - Updated the server and plugin logic to utilize the new fetching mechanisms, ensuring consistent handling of marketplace data. - Enhanced tests to cover new URL normalization and fetching scenarios, ensuring robustness in marketplace management. This update significantly improves the marketplace experience by streamlining URL handling and enhancing data fetching capabilities. * Fix project auto-send cleanup spec |
||
|
|
c4a67a7b3e
|
Fix Kimi CLI icon contrast in light mode (#1667)
* fix(web): improve Kimi CLI icon contrast * fix(web): render Kimi icon via theme-aware CSS mask Move Kimi to the MONO_ICONS set so it renders through CSS mask with currentColor adaptation, making it legible in both light and dark themes instead of baking a single dark fill that fails on dark backgrounds. * fix(web): adjust Kimi icon secondary mark for dual-theme contrast Keep Kimi as a baked two-tone asset: blue accent (#1783ff) for brand identity, mid-tone gray (#666666) secondary mark for acceptable contrast on both light and dark card surfaces. Revert from mask path to preserve the blue branding. * fix(web): correct corrupted Kimi SVG and strengthen asset validation test Remove extraneous PR discussion text that was accidentally included in the SVG file. Strengthen the test to validate the bundled asset is valid SVG with the expected fills (blue accent + gray secondary mark), catching asset corruption that would otherwise go undetected. |
||
|
|
640a332276 | Merge remote-tracking branch 'origin/garnet-hemisphere' into reconcile/garnet-main-merge | ||
|
|
3b7f87c7ae | Merge remote-tracking branch 'origin/main' into reconcile/garnet-main-merge | ||
|
|
8e3af79dea |
feat(github-installer): enhance GitHub content installation and error handling
- Introduced new interfaces for GitHub content entries and budgets to streamline content fetching. - Enhanced the `installFromGithub` function to support installation from GitHub contents, including subpath handling. - Implemented robust error handling and retry logic for fetching GitHub content, improving installation resilience. - Updated tests to validate the new content fetching logic and ensure correct behavior across various scenarios. This update significantly improves the GitHub installation process, making it more flexible and user-friendly. |
||
|
|
b268bbe169 |
Merge origin/garnet-hemisphere (post-9e196d34) — Use Plugin handoff fix
Brings in 11 new garnet commits, most importantly: - |
||
|
|
6614b9bf09 |
refactor(plugin-authoring): streamline prompt and input handling
- Removed the redundant `buildPluginAuthoringPrompt` function call in `startPluginAuthoring` for cleaner code. - Introduced new functions to build prompts and inputs based on user goals, enhancing the authoring experience. - Updated `HomeView` to manage authoring inputs and prompts more effectively, ensuring better state handling. - Adjusted the `PluginImportModal` to reflect changes in the import process, removing references to template creation. - Enhanced tests to cover new input handling and prompt generation logic, ensuring reliability in the authoring flow. This update improves the clarity and efficiency of the plugin authoring process, making it more intuitive for users. |
||
|
|
fbcf50382a |
feat(github-installer): enhance GitHub source parsing and error handling
- Updated the GitHub source regex to support more flexible parsing of repository references, including subpaths. - Introduced a new `parseGithubSource` function to handle the extraction of owner, repo, and potential subpaths, improving the robustness of the installation process. - Enhanced the `installFromGithub` function to retry fetching from multiple candidates and provide detailed error messages when installation fails. - Added tests to validate the new parsing logic and ensure correct behavior when handling various GitHub source formats. This update significantly improves the handling of GitHub sources, making the installation process more resilient and user-friendly. |
||
|
|
40766ef1ba
|
test(web): Critique Theater Phase 13 (reducer p99 bench + surface coverage walker) (#1318)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* fix(web): tighten Phase 13 gates from lefarcen review (PR #1318)
Address the actionable items from lefarcen's review of the two
Phase 13 CI gates. The two questions about longer-term DX (pre-
commit hook to auto-update the symbol table, AST-walker swap)
are documented as deferred follow-ups rather than landed here.
reducer-bench:
- Describe renamed to 'reducer p99 regression gate (Phase 13.1)'
so it reads as a gate, not a comparative benchmark.
- Failure message now carries the full distribution
(p50 / p90 / p99 / max + ceiling), so triage on a tripped gate
can distinguish a real 20ms regression from a 4.001ms CI hiccup
without re-running locally (lefarcen Q3).
- Captured a baseline (p50=0.011ms p90=0.013ms p99=0.018ms
max=0.244ms on a local Node 24 / Win11 run, 2026-05-11) inside
the docblock so reviewers can see the actual reading sits ~222x
below the 4ms ceiling (lefarcen Q1).
- Replaced 'role as any' casts with PanelistRole-typed casts so
the fixture is typecheck-strict.
- Phase numbering corrected (13.2 → 13.1 to match the PR body).
critique-coverage:
- Symbols now grouped under four describe blocks (SSE events /
panelist roles / lifecycle phases / i18n keys) so a failure
points at the category that drifted at a glance (lefarcen nit).
- Docblock now explains the grep-over-AST trade-off (the bug
class is structural at the string level, not at the AST level)
and points at the future AST-walker work as a deferred follow-
up (lefarcen Q2).
- Docblock now walks a contributor through the four-step
maintenance flow (add to contract → add caller → add test →
add literal here), so the next person to add an SSE event or
i18n key knows the gate exists and what to update (lefarcen
Q4).
- Phase strings switched from 'phase: <name>' to bare-quoted
literals so the walker is robust against single vs double
quotes and ':' vs '===' source-shape changes.
- Dead try/catch around 'stack = [root]' removed (cannot throw).
- Per-symbol failure messages name the symbol AND which corpus
is missing it, so the gate is self-describing on the next
CI red.
- Phase numbering corrected (13.4 → 13.2 to match the PR body).
63 / 63 vitest cases green (1 bench + 62 coverage). Web
typecheck clean.
* fix(web): tighten coverage walker semantics from lefarcen P2/P3 (PR #1318)
Two follow-on findings on commit
|
||
|
|
2ac5854432 |
feat(plugin-inputs): enhance plugin input handling with file upload support
- Added support for file input fields in the PluginInputsForm, allowing users to upload files with serializable metadata. - Updated the HomeHero component to improve the layout and interaction of input fields, enhancing user experience. - Adjusted CSS styles for better visual representation of input fields and their states. - Modified HomeView to reflect changes in authoring chip IDs for better clarity in plugin actions. - Enhanced tests to cover new file input functionality and ensure correct behavior in various scenarios. This update significantly improves the plugin input handling, enabling users to upload files seamlessly and enhancing the overall interaction model. |
||
|
|
1a90aef4a2 |
feat(plugin-use): implement plugin use handoff functionality
- Added support for using installed plugins directly from the PluginsView, allowing users to initiate plugin actions seamlessly. - Enhanced the HomeView to handle plugin use handoffs, managing state and user interactions effectively. - Introduced new types and functions to facilitate the creation and processing of plugin use handoffs, improving the overall user experience. - Updated tests to cover the new plugin use functionality, ensuring reliability and correctness in the application flow. This update significantly enhances the interaction model for plugins, enabling users to utilize plugins more intuitively within the application. |
||
|
|
9ea33e076b |
feat(context-plugins): add support for context plugins in project metadata and UI
- Introduced a new `contextPlugins` field in the `ProjectMetadata` type to accommodate plugins selected via `@` mentions, allowing for additive context in project creation. - Updated the `HomeHero` and `EntryShell` components to handle and display context plugins, enhancing user interaction with selected plugins. - Implemented rendering logic for context plugins in the metadata block, providing clear visibility of selected plugins and their descriptions. - Enhanced the UI to support the removal of context plugins and display additional details on hover, improving the overall user experience. This update significantly enriches the project creation process by allowing users to incorporate multiple context plugins seamlessly. |
||
|
|
6c16283850 |
Merge origin/main (post-7c8305f4) into reconcile branch
Brings in 10 new main commits: routine deep-link to specific conversations (#1508), Windows resource cache fix for Orbit templates, collapsible comment side panel (#1607), routines project radio polish, Copilot logo swap, and minor UI fixes. Conflicts resolved: - router.ts: garnet's home/view + marketplace routes + main's per-project conversationId deep-link field coexist on Route union - ProjectView.tsx: garnet's isPhantomDaemonRunMessage helper + main's isStoppableAssistantMessage helper both kept - ProjectView.run-cleanup.test.tsx: accepted HEAD (garnet's phantom-row regression test); main's three new tests for finalizeActiveAssistantMessagesOnStop / clearStreamingConversationMarker / shouldClearActiveRunRefs are queued as a follow-up TODO inline. |
||
|
|
2976c76fc3
|
test: expand Memory and Routines coverage (#1521)
* test: expand settings and packaged coverage * test: extend memory settings coverage * test: cover routine settings failure states * test: cover routine operation failures * test: fix daemon test typing on CI * test: decouple packaged smoke from orbit bug * test: avoid live memory LLM calls in route tests * test: fix daemon fetch typing in CI * fix: restore preview comment and inspect toggles * test: align manual edit flow with current inspector UX * test: align comment attachment flow with current preview comments UI * fix: probe resolved Codex launch path during detection * fix: remove duplicate board activation helper after rebase * test: update ghost cli detection mock * test: align FileViewer toolbar expectation * ci: move full app tests to extended lane * ci: run app tests by changed scope * ci: cover shared app inputs in test scopes * ci: avoid setup-node cache in windows packaged smoke * test: align extended settings and manual edit flows |
||
|
|
5cb0508790
|
fix(web): deep-link Routines history rows to their specific conversation (Fixes #1505) (#1508) | ||
|
|
2a8ebff11a
|
feat(web): add collapsible comment side panel (#1607) | ||
|
|
d2738924fb
|
fix(web): freeze completed run durations across conversations (#1351)
* fix(web): freeze completed run durations across conversations * fix(web): finalize stopped API runs Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(daemon): optimize conversation latest run lookup Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(web): scope streaming cleanup to conversation Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(web): capture streaming conversation cleanup Generated-By: looper 0.6.0 (runner=fixer, agent=codex) * fix(web): guard stale run ref cleanup Generated-By: looper 0.6.0 (runner=fixer, agent=codex) |
||
|
|
852a005b32
|
feat(web): add export as image screenshot to share menu (#1569)
Add an option to export the current preview viewport as a PNG image. - Add requestPreviewSnapshot() utility in exports.ts (reuses the existing srcdoc snapshot bridge via postMessage) - Add exportAsImage() and dataUrlToBlob() helpers for Blob download - Add Export as image menu item in the HTML viewer share menu, gated behind srcdoc mode (bridge only present in srcdoc, not URL-load mode) - Refactor PreviewDrawOverlay to delegate to the shared requestPreviewSnapshot() instead of duplicating the snapshot logic - Add fileViewer.exportImage i18n key across all 19 locale files - Add 7 unit tests covering snapshot request, timeout, error handling, and download filename sanitization Fixes #1500 |
||
|
|
54498f1ac5
|
fix(web): parse Provenance with Markdown-bold labels (#1584)
* fix(web): parse Provenance with Markdown-bold labels (#1580) The daemon's finalize synthesis prompt at apps/daemon/src/finalize-design.ts:560-565 lists the five Provenance fields without pinning field-label syntax, so Claude renders them with Markdown-bold labels per Markdown convention (`- **Field:** value`). The parser at apps/web/src/lib/parse-provenance.ts:32-36 uses `[:\s]+` as its label/value separator, which stops at the trailing `**` after the colon; the capture group then slurps the `**` and any following whitespace into the value. Downstream of that, transcriptMessageCount and generatedAt parse as null because the captured tokens don't start with digits or a valid ISO 8601 prefix, and the Continue in CLI clipboard prompt shows `Design system: ** ...`, `Transcript message count when DESIGN.md was generated: unknown`, `DESIGN.md generated at: unknown`. Fix: strip leading and trailing Markdown emphasis (`*`, `_`, whitespace) from every captured value via a single helper threaded through extractField / extractFieldOrNone / extractNumber / extractDate. Widen the transcriptMessageCount regex's capture from `(\d+)` to `([^\n]+)` so the strip step gets a chance to run on `** 4`. Add `[^:]*` between `count` and `[:\s]+` to mirror the other label-walking regexes for bolded label variants. Defense-in-depth: tell the synthesis prompt to emit plain `- Field: value` bullets with no emphasis on the labels. The parser hardening is the load-bearing fix; this is belt-and-suspenders for new model variants. Red-Green-Refactor: - Phase 1 (Red): 3 new parse-provenance tests covering bold labels with backticked values, bold labels with a short `Generated:` form, and bold labels with `none` sentinels. All 3 failed against pre-fix source. - Phase 2 (Green): strip + regex widening. All 7 parse-provenance tests + 1158 web tests pass. - Phase 3: empirically verified against a live finalized DESIGN.md — all five fields now parse correctly. - Phase 4 (defense-in-depth): one-line addendum to synthesis prompt. - Phase 5: bold-labelled Provenance fixture added to the hook test (useDesignMdState.test.tsx) so the round-7 `unknown-provenance` fail-closed path is regression-pinned end-to-end. Backticks in field values are intentionally kept (out of scope per the issue spec; rendered clipboard text reads fine with them). The variant `- **Field**: value` (colon outside emphasis) is not in the issue enumeration and is not handled. Fixes #1580 * fix(web): narrow Provenance strip to Markdown residue only Round-2 fix per lefarcen's review on PR #1584. The round-1 helper used `^[\s*_]+` / `[\s*_]+$`, which stripped a literal leading or trailing `*`/`_` from any captured value — `_draft.html` corrupted to `draft.html`, and a build id like `build_id_v1_` lost its trailing underscore. Narrow stripMarkdownEmphasis to three explicit passes: 1. Leading `*`/`_` tokens FOLLOWED BY WHITESPACE — only matches the `** ` residue left after `- **Field:** value` is captured starting at the `*`. 2. Trailing WHITESPACE followed by `*`/`_` tokens — mirror of (1) if the value closes with emphasis after whitespace. 3. A single balanced wrap around the remaining value (`**X**` / `*X*` / `__X__` / `_X_`) — handles the `- **Field:** **value**` shape and any plain-label `**value**` form. Asymmetric literal `*`/`_` characters in the value (no whitespace separator, no balanced closing token) are preserved by construction. Added regression tests: - plain label + `_draft.html` value - plain label + `build_id_v1_` value (trailing underscore) - bold label + `_draft.html` value (residue stripped, literal leading underscore preserved) - plain label + `**wrapped-id**` value (balanced residue stripped) All 11 parse-provenance tests + 1162 web tests pass. Empirically re-verified against a live finalized DESIGN.md — all five fields still parse correctly. --------- Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai> |
||
|
|
0c0da7cc23
|
fix(web): confirm continue-in-cli copy (#1604) | ||
|
|
3b167b6921 |
feat(plugins): add registry protocol and enhance plugin management features
- Introduced the `@open-design/registry-protocol` package, enabling improved interactions with plugin registries. - Updated the `typecheck` script in the daemon's `package.json` to include the new registry protocol. - Enhanced the CLI with new flags and commands for better plugin management, including `yank` and additional marketplace functionalities. - Implemented a plugin lockfile system to manage installed plugins and their versions, improving reliability during upgrades. - Added new marketplace doctor functionality to validate plugin entries and ensure compliance with registry standards. This update significantly enhances the plugin ecosystem by providing robust registry interactions and improved management capabilities. |
||
|
|
56c264c9bd |
feat(plugins): add login and whoami commands for GitHub CLI authentication
- Introduced `login` and `whoami` commands to the plugin CLI, enabling users to authenticate with the Open Design registry via GitHub CLI. - The `login` command wraps GitHub CLI authentication, allowing users to specify a host, defaulting to GitHub. - The `whoami` command retrieves and displays the authenticated GitHub account information, with an option for JSON output. - Updated the CLI help documentation to include usage instructions for the new commands. - Enhanced error handling for GitHub CLI dependencies and authentication status. This update improves the user experience by simplifying the authentication process for plugin publishing. |
||
|
|
d83b228c81 | Merge remote-tracking branch 'origin/garnet-hemisphere' into reconcile/garnet-main-merge | ||
|
|
7c48fbd902 |
feat(plugins): enhance PluginsView with new marketplace management features
- Updated the PluginsView component to include new tabs for 'Installed', 'Available', and 'Sources', improving the organization of plugin management. - Introduced functions for adding, refreshing, removing, and setting trust for plugin marketplaces, enhancing the marketplace interaction capabilities. - Enhanced the UI to reflect the new structure, including updated CSS styles for better visual consistency and usability. - Added tests to ensure the functionality of the new marketplace features and verify the correct rendering of available plugins. This update significantly improves the user experience in managing plugins and marketplaces, providing a more intuitive interface for users. |
||
|
|
53997990b7 |
Merge origin/main (post-0.7.0) into reconciled garnet branch
Second-pass merge layering 41+ new commits from origin/main on top of the first reconcile commit. Headline upstream additions absorbed: - 0.7.0 release: redesigned chat bubble user-text styling, neutralised palette, lucide icons, ElevenLabs audio voice option discovery in the prompt composer, analytics tracking (PostHog) wired across home / studio / create surfaces, Prometheus `/api/metrics` endpoint, critique-theater drop-in mount with a settings toggle. - Misc upstream fixes (titlebar padding, release header layout, deck preview chrome, feedback form auto-scroll, conversation-created SSE on routine runs, etc.) Conflict resolutions (12 files, ~22 hunks): - contracts barrel + prompts/system: union of both sides; new analytics exports (`./analytics/events`, `./analytics/public-params`) added alongside garnet's plugin/atom/genui exports. Both ElevenLabs voice fields (audioVoiceOptions/audioVoiceOptionsError, main) and pluginBlock/activeStageBlocks (garnet) preserved on ComposeInput. - daemon/server.ts: Prometheus `/api/metrics` route inserted after garnet's `/api/daemon/shutdown`. main's `createAnalyticsService` call added before the chat-run service init alongside the prior reconcile note about the dropped legacy POST /api/projects body. - App.tsx: handleCreateProject now consumes both garnet's plugin fields (pluginId / appliedPluginSnapshotId / pluginInputs / autoSendFirstMessage) and main's analytics requestId. Tracking fires success + failure paths; PluginLoopHome auto-send sessionStorage flag is preserved. - ProjectView.tsx: the garnet auto-send useEffect coexists with main's `useCritiqueTheaterEnabled()` hook. - ChatComposer.tsx: imports merged (drop now-unused fetchSkills, add analytics provider + tracking + buildVisualAnnotationAttachment). - index.css: main's redesigned `.msg.user .user-text` chat bubble styling wins over garnet's plain text rule; garnet's `.msg-plugin-chip*` rules preserved alongside. - EntryView.tsx: accepted HEAD (garnet wrapper) — consistent with reconcile decision #2. main's added PetRail / TopTab / analytics view tracking is intentionally NOT brought into the wrapper; the follow-up to re-integrate PetRail / image-templates / video-templates into EntryShell still stands and now also covers analytics view-tracking hooks. - daemon/package.json + pnpm-lock: merged dep set (tar + posthog-node + prom-client coexist). - Test fixtures (FileWorkspace.test): kept garnet's plugin-folders describe block intact; main's projectKind="prototype" addition is dropped where it conflicted with garnet's plugin-folder fixture files. Verification: `pnpm install` (after lockfile reconciled), `pnpm typecheck` exits 0 across all workspace packages. Follow-up not done in this commit: - PetRail / image-templates / video-templates / 0.7.0 analytics view-tracking hooks need to be added to EntryShell. - Critique-theater settings toggle UX (added on main) lives in the SettingsDialog hierarchy; the reconcile state preserves the SettingsDialog so this should work without changes, but no end-to-end verification yet. |
||
|
|
d3602be666 |
Merge origin/main into garnet-hemisphere (reconcile)
Merge of `origin/main` (`03ed3960`, 2026-05-13 pre-0.7.0) into the 161-commit garnet-hemisphere line, reconciling the product-vibe-coded plugin/marketplace/EntryShell surfaces from garnet with the routines / skills / live-artifacts feature work landed on main since the fork point. Headline decisions (full rationale + side-by-side screenshots in `specs/change/20260513-garnet-skills-automations/reconcile-result-vs-garnet.md`): - #1 SettingsDialog: keep main's Memory / Skills / External MCP / Connectors / Routines / MCP server nav items even though the top-level /integrations + /automations routes also cover them. Two entries coexist for now; revisit once Track A/B fill in the placeholder content. - #2 EntryView: accept garnet's thin wrapper delegating to EntryShell. Main's PetRail sidebar + image-templates/video-templates tabs are intentionally deferred to a follow-up that re-integrates them into the new EntryShell layout. - #3 /integrations + /automations top-level routes: kept (garnet's product intent). Skills tab is still a "Coming soon" placeholder awaiting Track A; Routines/Schedules/Live-artifacts cards on /automations are still mock awaiting Track B. - #5 DesignFilesPanel: hybrid — main's pagination as primary list, garnet's Plugin folders section preserved between the live-artifacts block and the pagination block. (by-kind sections drop in favour of pagination; plugin-folders rendering stays because it is a garnet-specific product addition.) - #7 server.ts (10 hunks, ~5400 conflict lines): manual hunk-by-hunk merge. Both daemon admin routes + plugin/genui routes (garnet) and routines/memory/skills upgrades (main) preserved. Garnet's inline project route block kept alongside main's `registerProjectRoutes` / `registerProjectUploadRoutes` modular wiring — duplicate route audit is a follow-up. Garnet's POST /api/projects plugin-snapshot resolution + default-scenario fallback is intentionally dropped from the inline body (now handled by registerProjectRoutes) and listed for follow-up re-integration into `project-routes.ts`. Verification (worktree at /Users/elian/Documents/open-design-garnet): - `pnpm typecheck` exits 0 across all workspace packages - daemon (`pnpm tools-dev run web --namespace reconcile-shots`) boots, serves `/api/daemon/status` healthy, and survives a Playwright walkthrough of /integrations / /automations / home / projects / design-systems / plugins / settings dialog - `@open-design/plugin-runtime` package built (was missing dist/ on garnet); without it the daemon's plugins/* imports fail at boot Track A (Skills tab → real SkillsSection) and Track B (Automations cards → real routines / live-artifacts backend) are the two remaining follow-ups blocking the placeholder/mock content from going live. See `spec.md` and `track-skills.md` in the same directory. |
||
|
|
828f8b93bf
|
fix(web): protect built-in skills from delete (#1583) | ||
|
|
38a5ab69e6
|
feat(daemon): Critique Theater Phase 12 (9 Prometheus metrics + 6 log events + OTel span + Grafana dashboard) (#1485)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* docs: Critique Theater user guide (Phase 14.1)
Seven sections aimed at end users (not contributors):
1. What is Design Jury
2. How it works (the five panelists, auto-converging rounds,
the composite formula)
3. Settings (the M1 toggle and what it does)
4. Reading the score badge
5. Replay surface
6. Troubleshooting (degraded, interrupted, failed)
7. FAQ
The composite formula is documented as
designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
* docs(daemon): critique module AGENTS map (Phase 14.2)
Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.
* docs(web): Theater module AGENTS map (Phase 14.3)
Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.
* feat(daemon): rollout flag resolver (Phase 15.1)
Single decision point every caller consults to know whether the
orchestrator should wire the critique pipeline for a given run.
Priority:
1. Skill-level policy (required wins, opt-out wins inversely)
2. Per-project override from the Settings toggle
3. OD_CRITIQUE_ENABLED env override
4. Rollout phase default
M0 dark-launch false
M1 settings only false (toggle is off until the user flips it)
M2 per-skill true if skill opted in
M3 global default true
OD_CRITIQUE_ROLLOUT_PHASE parser defaults to M0 on unknown input
so a fresh install never surprises a user with the feature on.
10/10 vitest cases green covering every cell of the matrix.
* feat(web): Settings toggle hook for Critique Theater (Phase 15.2)
React hook that reads critiqueTheaterEnabled from the existing
open-design:config localStorage blob and stays in sync via:
- the platform storage event (cross-tab)
- a open-design:critique-theater-toggle CustomEvent (same-tab)
Same-tab event is the one that fires when the Settings panel saves
in the current window: the toggle and every mounted theater update
without a page reload.
setCritiqueTheaterEnabled(next) is the imperative setter the Settings
panel calls. It preserves the rest of the stored config (mode, apiKey,
etc.) and dispatches the same-tab event after the localStorage write.
The web hook reflects what the user toggled; the daemon-side
isCritiqueEnabled is the final routing authority (project override,
env, rollout phase). When they disagree, the daemon wins for backend
gating and the web reflects the toggle state.
6/6 vitest cases green covering first read, stored read, same-tab
event flip, config preservation, corrupted JSON tolerance, and
cross-tab storage event.
* test(web): Phase 15 toggle hook failure-mode coverage (PR #1320)
lefarcen P2 on PR #1320 flagged that the PR body claimed safe
behavior for disabled localStorage, non-object JSON, and missing
CustomEvent shim, but the suite only covered corrupt JSON plus
happy-path storage events. Added four failure-mode tests so the
swallowed errors are not silently traded for a throw in a future
refactor:
1. Returns false on a stored JSON value that parses to an array
(non-object). Catches a regression where the guard treats
anything truthy as a config blob.
2. Returns false on a stored JSON value of literal 'null'.
typeof null === 'object' in JS, so the guard has to check null
explicitly; this test pins that check.
3. Returns false when localStorage.getItem throws (private mode /
disabled storage / SecurityError). The hook must swallow and
return false so the rest of the app keeps rendering.
4. setCritiqueTheaterEnabled still dispatches the same-tab
CustomEvent when localStorage.setItem throws (quota exceeded /
disabled storage). The dispatch path is the in-session
broadcast that keeps every mounted hook coherent even when
persistence is unavailable; verified by mounting two probes
and asserting both flip after the setter is called with a
throwing setItem.
10/10 vitest cases green (6 existing + 4 new).
* fix(web): honor CustomEvent payload in toggle hook listener (PR #1320)
Both Siri-Ray (blocking) and lefarcen (P2 new) caught the same
real bug in the failure-mode test I added in
|
||
|
|
385e1d111d
|
feat(web): Critique Theater Phase 9 (drop-in mount wrapper, native i18n for de, ja, ko, zh-TW) (#1315)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* fix(web): address PerishCode P3 polish items on PR #1315
Three non-blocking findings from PerishCode's latest review folded
in alongside the merge-with-main work.
1. bestRoundOf / bestCompositeOf could pair mismatched values.
The split helpers walked state.rounds independently: bestRoundOf
returned the LAST round with a numeric composite; bestCompositeOf
returned the MAX composite. With non-monotonic composites (round
1 at 8.5, round 2 at 6.0), the user-initiated interrupt shipped
{ bestRound: 2, composite: 8.5 } which is a pair that never
existed and the collapsed badge advertised forever (the reducer's
interrupted phase is sticky). Folded into a single
bestRoundAndComposite() that walks once and returns the matching
pair. Multi-round regression test in CritiqueTheaterMount.test.tsx
pins the fix.
2. critiqueTheater.interruptedSummary was missing from de.ts,
ja.ts, ko.ts, zh-TW.ts. The Phase 9 native-locale work translated
39 of the 40 critiqueTheater.* keys; this one slipped through
because the rest of the key block uses the locale-specific
interrupted line as an anchor. The collapsed badge would have
shown the English fallback string in an otherwise native UI on
every interrupted run. Added native translations to all four
locales (matching PerishCode's suggested copy).
3. InterruptButton's window-scope Escape handler could collide with
Esc-to-dismiss on modals / popovers elsewhere on the page. The
prior fix filtered text-entry focus but still fired from any
body-level Esc, which double-handles when a non-text dismissable
surface is open. New gate defers when a [role="dialog"] (or any
aria-modal="true" element) without aria-hidden="true" is on the
page AND the Escape didn't originate inside .theater-stage. Three
new InterruptButton.test.tsx cases pin the deferral, the
aria-hidden exemption, and the in-stage exemption.
Verified: typecheck clean, 111 / 1007 tests green.
* fix(web): restore wait-for-daemon-ack pattern on Theater interrupt
Same regression as flagged on PR #1316 post-main-merge: the
optimistic local dispatch fired before the POST resolved, so a
daemon 404 / 409 still terminalized the UI and the real SSE
terminal event got ignored by the sticky interrupted phase.
Pulled the corrected pattern from the Phase 10 commit:
- snapshot runId / bestRound / composite at click time
- dispatch interrupted only on res.ok
- on rejection or non-2xx, clear interruptPending and log in dev so
the user can retry and the real terminal event still wins
Tests updated: rejection and non-2xx cases now assert the UI stays
in running phase; the 204 cases wait for the async ack with waitFor.
Web typecheck clean.
* fix(web): drop shadow hasValidVariantShape; delegate to isPanelEvent (PerishCode follow-up on PR #1315)
The wire-layer guard in `apps/web/src/components/Theater/state/sse.ts` carried a `hasValidVariantShape` second-pass filter and a docstring claiming `isPanelEvent` from contracts was a header-only check ("missing runId, unknown type"). That stopped being true after the Siri-Ray round-3 fix on PR #1314: `isPanelEvent` is now the strict guard, validating every variant's required fields, closed-enum membership against PANELIST_ROLES / SHIP_STATUSES / DEGRADED_REASONS / FAILED_CAUSES / PARSER_WARNING_KINDS / ROUND_DECISIONS, finite numerics, and the threshold <= scale cross-field check.
Net effect: the shadow guard was strictly weaker than the contract guard above it. It could not reject any frame `isPanelEvent` already let through. The docstring also misled future readers into thinking the safety net was at the wire layer when it actually sits one layer up in contracts.
Two changes:
1. Drop `hasValidVariantShape` entirely. `sseToPanelEvent` collapses to the one-liner: spread the payload, pin `type` from the channel, return `isPanelEvent(candidate) ? candidate : null`. Docstring rewritten to describe `isPanelEvent` accurately as the strict guard, and to call out (with a forward reference to this commit) that the deletion was intentional.
2. Add three small wire-layer regression cases to `sse.test.ts` so a future accidental weakening of either layer fails loudly at the boundary the reducer actually sees:
- drops a `critique.ship` with an unknown status (closed-enum check)
- drops a `critique.panelist_close` with an unknown role (closed-enum check)
- drops a `critique.round_end` with `composite: NaN` (finite-numeric check)
Verification:
- pnpm --filter @open-design/web typecheck: clean
- pnpm --filter @open-design/web exec vitest run tests/components/Theater/state/sse.test.ts: 22 / 22 green (19 prior + 3 new)
- Net diff: -84 +41 lines (-43 net); the shadow function and its stale docstring are gone, the three regressions land at half the cost
* fix(test): add projectKind prop to FileViewer deck render after v0.7.0 merge
---------
Co-authored-by: Nagendhra <nagendhra405@gmail.com>
|
||
|
|
c326492203
|
feat: Critique Theater Phase 15 (rollout resolver + Settings toggle hook) (#1320)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* docs: Critique Theater user guide (Phase 14.1)
Seven sections aimed at end users (not contributors):
1. What is Design Jury
2. How it works (the five panelists, auto-converging rounds,
the composite formula)
3. Settings (the M1 toggle and what it does)
4. Reading the score badge
5. Replay surface
6. Troubleshooting (degraded, interrupted, failed)
7. FAQ
The composite formula is documented as
designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
* docs(daemon): critique module AGENTS map (Phase 14.2)
Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.
* docs(web): Theater module AGENTS map (Phase 14.3)
Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.
* feat(daemon): rollout flag resolver (Phase 15.1)
Single decision point every caller consults to know whether the
orchestrator should wire the critique pipeline for a given run.
Priority:
1. Skill-level policy (required wins, opt-out wins inversely)
2. Per-project override from the Settings toggle
3. OD_CRITIQUE_ENABLED env override
4. Rollout phase default
M0 dark-launch false
M1 settings only false (toggle is off until the user flips it)
M2 per-skill true if skill opted in
M3 global default true
OD_CRITIQUE_ROLLOUT_PHASE parser defaults to M0 on unknown input
so a fresh install never surprises a user with the feature on.
10/10 vitest cases green covering every cell of the matrix.
* feat(web): Settings toggle hook for Critique Theater (Phase 15.2)
React hook that reads critiqueTheaterEnabled from the existing
open-design:config localStorage blob and stays in sync via:
- the platform storage event (cross-tab)
- a open-design:critique-theater-toggle CustomEvent (same-tab)
Same-tab event is the one that fires when the Settings panel saves
in the current window: the toggle and every mounted theater update
without a page reload.
setCritiqueTheaterEnabled(next) is the imperative setter the Settings
panel calls. It preserves the rest of the stored config (mode, apiKey,
etc.) and dispatches the same-tab event after the localStorage write.
The web hook reflects what the user toggled; the daemon-side
isCritiqueEnabled is the final routing authority (project override,
env, rollout phase). When they disagree, the daemon wins for backend
gating and the web reflects the toggle state.
6/6 vitest cases green covering first read, stored read, same-tab
event flip, config preservation, corrupted JSON tolerance, and
cross-tab storage event.
* test(web): Phase 15 toggle hook failure-mode coverage (PR #1320)
lefarcen P2 on PR #1320 flagged that the PR body claimed safe
behavior for disabled localStorage, non-object JSON, and missing
CustomEvent shim, but the suite only covered corrupt JSON plus
happy-path storage events. Added four failure-mode tests so the
swallowed errors are not silently traded for a throw in a future
refactor:
1. Returns false on a stored JSON value that parses to an array
(non-object). Catches a regression where the guard treats
anything truthy as a config blob.
2. Returns false on a stored JSON value of literal 'null'.
typeof null === 'object' in JS, so the guard has to check null
explicitly; this test pins that check.
3. Returns false when localStorage.getItem throws (private mode /
disabled storage / SecurityError). The hook must swallow and
return false so the rest of the app keeps rendering.
4. setCritiqueTheaterEnabled still dispatches the same-tab
CustomEvent when localStorage.setItem throws (quota exceeded /
disabled storage). The dispatch path is the in-session
broadcast that keeps every mounted hook coherent even when
persistence is unavailable; verified by mounting two probes
and asserting both flip after the setter is called with a
throwing setItem.
10/10 vitest cases green (6 existing + 4 new).
* fix(web): honor CustomEvent payload in toggle hook listener (PR #1320)
Both Siri-Ray (blocking) and lefarcen (P2 new) caught the same
real bug in the failure-mode test I added in
|
||
|
|
8e6250f2c1
|
feat(web): render GFM tables in markdown artifact and chat renderers (#1496)
* feat(web): render GFM pipe tables in markdown artifact and chat renderers Both hand-rolled markdown parsers (artifact preview and assistant chat) ignored pipe-table syntax, so GFM tables collapsed into a single paragraph. Add a table block to each parser that: - detects header+alignment+body rows; - honors :--- / :---: / ---: column alignment; - preserves inline formatting (code, bold, italic, links) inside cells; - restores escaped \| as a literal pipe; - breaks a preceding paragraph at a table start without a blank line. Closes #1495 * fix(web): use scan-based table cell splitter, drop NUL placeholder Address review on #1496: - lefarcen (P1): the placeholder approach in both splitters was unsafe. artifacts/markdown.ts embedded literal NUL bytes around 'ODPIPE', which flipped the whole file to binary in GitHub diffs and hid the patch from review. runtime/markdown.tsx used a printable ' ODPIPE ' sentinel that could collide with real cell content. - codex (P2): both splitters ran before inline parsing, so a literal '|' inside a backtick code span (e.g. a TypeScript union cell like \`\"ready\" | \"done\"\`) was treated as a column boundary and shredded the row. Both issues collapse to one root cause — splitting by regex on '|' without knowing what's a code span and what's an escape. Replace the placeholder-then-split routine in both files with a single character- level state-machine scanner that: - treats '|' inside backticks as cell content (tracks inCode); - resolves '\\|' to a literal '|' as it accumulates each cell; - strips a single optional leading '|' and a single unescaped trailing '|' as row terminators rather than empty cells. No placeholders, no NUL bytes, no collision surface. Add a test in each test file covering the GFM-style pipe-inside-code-span cell, and the runtime test additionally asserts the row produces exactly two <td> cells (i.e. no phantom column from the embedded pipe). * style(web): make markdown table apply in artifact viewer + add header bg, borders The prior `.md-table` rules were scoped to `.prose-block`, which only wraps the chat assistant path (AssistantMessage). The artifact preview in FileViewer renders into `.markdown-rendered`, so tables in long-form documents (the most common case) inherited no styling and showed as borderless, structureless rows. Lift the rule out of `.prose-block` so both paths share it. Visual treatment matches the existing prose-block ecosystem (blockquote, markdown-status): - `var(--border)` for cell + outer borders, 6px outer radius. - `var(--bg-panel)` on `<th>` for the header band — same token already used by blockquote and code-copy chip. - Even-row zebra via `color-mix(in oklab, var(--bg-panel) 40%, transparent)` for a very faint stripe that holds up in both light and dark themes (no hardcoded color). - Padding bumped from 4×8 to 8×12; cells were touching before. - `display: block; overflow-x: auto` kept so wide tables scroll inside narrow columns instead of pushing layout. * fix(web): wrap md-table in scroll container so columns fill width Previously `.md-table` itself was `display: block; overflow-x: auto` to let wide tables scroll horizontally. The side effect: a block-mode table no longer behaves like `<table>` for layout, so columns collapse to natural content width and leave empty space on the right when the content is narrower than the container. Split the two concerns: the new `.md-table-wrap` <div> owns the scroll viewport (border, radius, overflow-x), and the inner `<table>` keeps its native `display: table` with `width: 100%` so column widths distribute across the available space again. Wide tables still scroll; narrow tables now fill the container. Both render paths (chat runtime + artifact HTML) emit the wrapper, and the corresponding unit tests assert the nested shape. |
||
|
|
d3d95121f3 |
feat(plugins): enhance visual score sorting and add new example templates
- Updated the `sortByVisualAppeal` function to prioritize featured ranks, ensuring that curated plugins are displayed prominently. - Added tests to verify the new sorting logic, ensuring that plugins with numeric featured ranks are sorted correctly ahead of others. - Introduced new example templates for a magazine article layout, a Twitter share card, and a Xiaohongshu card, expanding the available options for users. - Enhanced the overall plugin preview experience by integrating these new templates, providing users with more visually appealing and functional examples. This update significantly improves the plugin sorting mechanism and enriches the template offerings, enhancing user engagement and experience. |
||
|
|
8b2d48a258 |
feat(daemon, web): enhance plugin preview handling and add new templates
- Introduced logic to assemble example slides with a companion template when the declared entry is missing, improving the user experience for plugin previews. - Updated the server logic to handle special cases for `example-slides.html`, ensuring proper fallback to `template.html` when applicable. - Enhanced tests to verify the new preview assembly functionality and ensure correct rendering of fallback content. - Added new HTML and Markdown examples for various skills, including a magazine article layout and a Twitter share card, expanding the available templates for users. This update significantly improves the plugin preview experience, providing users with more robust and visually appealing fallback options. |
||
|
|
06dbde51f9
|
[codex] Add Cursor Agent auth diagnostics (#1538)
* Add Cursor Agent auth diagnostics * Handle Cursor not logged in auth status * Address Cursor auth review feedback * Classify Cursor stdout auth failures |
||
|
|
a3276ec542
|
[codex] Add visual draw annotation context (#1547)
* feat(web): add visual draw annotation context * Fix visual draw annotation staging * Fix concurrent visual annotation IDs |
||
|
|
fb545b8d21
|
fix(web): count nested .slide elements in deck preview bridge (#1542)
* fix(web): count nested .slide elements in deck preview bridge Generated HTML decks commonly nest .slide elements under an extra wrapper rather than placing them as direct children of .deck / .deck-stage / .deck-shell / body. The bridge then reported count: 0 and the preview toolbar showed 1 / 0 even though the deck visibly contained slides and its own keyboard handler navigated them. Keep the structured selectors first so decorative .slide markup in non-deck pages is not counted, and fall back to all .slide only when nothing structured matched. Pinned with a regression test that runs the extracted deck bridge script against a JSDOM nested-slide layout. Fixes #1530 * test(web): pin deck bridge structured-first precedence with decoy .slide Address review feedback on #1542: the existing structured-first case expected count === 3 but the fixture only contained the three real slides, so a regression that dropped the structured selector entirely and went straight to `.slide` would still pass. Add a decoy `.slide` in a sibling <header> outside `.deck`; the structured-first pass keeps count === 3, while a naive broad selector would now count 4. Verified by temporarily collapsing slides() to `document.querySelectorAll('.slide')` and observing the second case fail with `expected 4 to be 3`. |
||
|
|
086be271d4
|
fix: hide preview chrome in source view (#1556)
* fix: hide preview chrome in source view * fix: keep source-view edit controls |
||
|
|
660c5b88b4
|
fix(web): auto-scroll feedback form (#1566) | ||
|
|
fa2bb59ab3 |
fix(test): align FileViewer tests with merged component
- Drop FileViewer.inspect-empty-hint.test.tsx (deleted on main; should have been gone after the merge but the modify/delete conflict left it in the tree). - Take main's FileViewer.test.tsx so the manual-edit interaction tests match the merged FileViewer.tsx behaviour, then patch every <FileViewer> render with projectKind="prototype" so it satisfies the prop requirement release added in #1509. |
||
|
|
4096d65125 |
fix(web): align Icon names and FileViewer tests with merged state
- SettingsDialog: use 'sparkles' for Pet section nav icon (main's choice; the merge picked up release's 'paw' which is not in main's IconName union). - FileViewer.test.tsx: take release's version which passes projectKind on every render (release added that prop in #1509 analytics; main's tests predate that prop). - FileViewer.manual-edit{,-history}.test.tsx: keep main's tests but pass projectKind="prototype" so they satisfy the merged FileViewer Props. |
||
|
|
5172e37217 |
Merge origin/main into release/v0.7.0 to prepare merge-back PR
Resolves 7 conflicts via hybrid strategy: - apps/web/src/components/EntryView.tsx: take main (Discord+X pills are forward feature) - apps/web/src/components/Icon.tsx: take main (switch-case refactor) - apps/web/src/components/NewProjectPanel.tsx: take release (preserve #1514 dropdown UX validated in 0.7.0 acceptance) - apps/web/src/index.css: take main (project-target-platforms / instructions chip styles) - apps/web/tests/components/FileViewer.inspect-empty-hint.test.tsx: accept main's deletion - nix/package-daemon.nix, nix/package-web.nix: take main pnpmDepsHash Non-conflicting hunks from #1519 (AppChromeHeader), #1428 (PostHog analytics call sites), and #1540 (release light background) are preserved via auto-merge. |
||
|
|
9040088f1c
|
fix(web): remove redundant bulk-select button (#1550) | ||
|
|
4f76e836ae
|
feat(audio): add ElevenLabs audio support (#1384)
* docs: add ElevenLabs audio support design * docs: add ElevenLabs audio implementation plan * feat(daemon): add ElevenLabs speech renderer * feat(daemon): add ElevenLabs sound effects renderer * fix(daemon): preserve ElevenLabs sfx durations * feat(web): expose ElevenLabs media providers * feat(daemon): document ElevenLabs audio contract * feat(audio): add ElevenLabs voice selection * chore: ignore superpowers scratch docs * fix(daemon): cache ElevenLabs voice options * fix(audio): expand ElevenLabs voice and SFX selection * fix(audio): align ElevenLabs SFX controls * fix(audio): tighten ElevenLabs SFX prompt budget * fix(audio): preflight ElevenLabs SFX prompt length * fix(audio): surface ElevenLabs lookup failures * fix(audio): sanitize ElevenLabs prompt errors |
||
|
|
11b4750677
|
Update release light background (#1540) | ||
|
|
cd68c8a80a
|
fix(daemon+web): emit conversation-created SSE event when routine run starts (#1523)
* fix(daemon+web): emit conversation-created SSE event when routine run starts When a Routine fires in "Reuse an existing project" mode, the daemon creates a new conversation in the project and writes a queued/running assistant message to the database, but the open `ProjectView` has no way to learn that anything happened: the project events SSE stream only carries `file-changed` and `live_artifact*` events, and `ProjectView` reloads conversations only when `project.id` changes. The result is the user's own routine "Run now" appears to do nothing until they exit and re-enter the project (#1361). Fix: - Add a `conversation-created` payload type to the existing project events stream in `apps/web/src/providers/project-events.ts`. The payload carries `projectId`, `conversationId`, `title`, and `createdAt`. Mirror the existing `file-changed` listener pattern with explicit malformed-payload handling. - In `apps/daemon/src/server.ts`, after `insertConversation` runs in the routine `setRunHandler` (both reuse-an-existing-project and new-project paths), broadcast a `conversation-created` event through the existing `activeProjectEventSinks` map. The function body was already generic so it was renamed from `emitProjectLiveArtifactEvent` to `emitProjectEvent` and the two pre-existing callers updated. - In `apps/web/src/components/ProjectView.tsx`, when `handleProjectEvent` receives a `conversation-created` event whose `projectId` matches the currently-viewed project, refetch the conversation list via `listConversations`. The active conversation is intentionally NOT changed — per maintainer guidance on #1361, auto-switching is a separate UX decision left for a follow-up. - `projectEventToAgentEvent` returns null for `conversation-created` so it doesn't get routed into the live-artifact path. Tests (`apps/web/tests/providers/project-events.test.ts`): - A single `conversation-created` event reaches the consumer with the parsed payload. - Two consecutive `conversation-created` events from concurrent routine runs both reach the consumer (covers the multiple-concurrent-runs case reported in #1502). - Malformed `conversation-created` payloads are swallowed without throwing, matching the existing `file-changed` / `live_artifact` defensive behavior. Manual verification: - Built locally with `pnpm exec tools-pack mac build --to app --portable` and installed. - Created a routine in `Reuse an existing project` mode targeting an existing project. - With the project view open, clicked `Run now`. The new "Routine" conversation appeared in the project's conversation list within about a second, without exiting and re-entering the project, and the active conversation was not changed. - Clicked `Run now` twice in quick succession; both new conversations appeared in the list, covering the concurrent-runs case in #1502. - `pnpm guard` and `pnpm --filter @open-design/web typecheck` clean; full web test suite is 1016/1016 passing. Fixes #1361 * fix(#1523 review): share SSE type via contracts; guard conversation refresh against project-switch and reordering races Addresses Codex P1 + lefarcen P2 inline review on #1523 (#1361): 1. Move `ProjectConversationCreatedSsePayload` to `@open-design/contracts` (`packages/contracts/src/sse/chat.ts`) so the daemon producer and the web consumer share one type. The web provider re-exports it under the local `ProjectConversationCreatedEvent` name to keep the existing import shape stable for callers; the daemon emit site picks up the same shape via a JSDoc typedef so producer and consumer can't drift as this stream grows. (Addresses lefarcen P2 on project-events.ts:17.) 2. Guard the `conversation-created` async refresh in `ProjectView` against two distinct races: - Project-switch race: capture `project.id` at dispatch time and re-check it via a live `projectIdRef` after `listConversations` resolves; bail if the user switched projects while the request was in flight. The existing project-load effects use the same cancellation pattern. (Addresses Codex P1 on ProjectView.tsx:767.) - Concurrent-refresh re-ordering race: bump a monotonic `conversationsRefreshTokenRef` on every dispatch and capture each request's token; only the request whose captured token still equals the live ref at await-return applies its result. Two rapid `conversation-created` events (the #1502 concurrent Run-now case) can no longer drop the newest conversation when the earlier request resolves last with a stale, shorter list. (Addresses lefarcen P2 on ProjectView.tsx:767.) Both guards are documented inline with comments that point back at the review threads. The existing project-events tests (single delivery, concurrent delivery, malformed payloads) are unchanged — the new guards are defensive logic on the consumer, not new event shapes. `pnpm guard`, `pnpm --filter @open-design/web typecheck`, `pnpm --filter @open-design/daemon typecheck`, and the full web test suite (1016/1016) remain green. |
||
|
|
9e196d34af |
feat(daemon, web): enhance plugin sharing workflows and UI components
- Updated the plugin sharing prompts to utilize local daemon endpoints for publishing to GitHub and contributing to Open Design, streamlining the user experience. - Refactored the `PluginsView` and `PluginShareMenu` components to support new sharing functionalities, including confirmation modals and improved link handling. - Enhanced the CSS styles for the plugin share confirmation modal and related UI elements for better visual consistency. - Added tests to verify the functionality of the new sharing workflows and ensure proper integration within the existing plugin management system. This update significantly improves the plugin sharing experience, making it easier for users to publish and contribute their plugins effectively. |
||
|
|
eda182c8a1
|
refactor(web): UI polish for v0.7.0 — neutralised palette, official brand glyphs, lucide (#1522)
* refactor(web): adopt lucide-react for the inline Icon component
The hand-rolled `<Icon>` set drifted in stroke weight and proportion across
its 50+ glyphs as new icons were added. Swap the implementation to dispatch
to `lucide-react` while keeping the same `<Icon name="..." size={X} />` API
so the 246 existing call sites stay untouched.
- Adds `lucide-react` as a dependency (tree-shaken; ~30KB gzipped for the
~50 icons we actually import).
- `discord` and `x-brand` keep their bespoke inline SVG paths since lucide
intentionally does not ship brand artwork.
- `spinner` continues to use the existing `.icon-spin` className for its
rotation; under the hood it now renders lucide's `Loader2`.
- New `paw` glyph (lucide `PawPrint`) so the Pets nav item stops sharing
the `sparkles` icon with External MCP.
No behaviour change: the prop surface is identical, fill follows
`currentColor` exactly as before, and aria-hidden / focusable defaults are
preserved. Visual deltas are limited to the strokes themselves (slightly
finer endcaps, more consistent baseline weights) — exactly the
consistency upgrade lucide gives us.
* feat(web): bundle official brand assets for agent icons
`AgentIcon` previously approximated each agent's brand with hand-drawn
SVG (orange Anthropic-ish sparkle, OpenAI-knot ellipses, etc). Replace
those approximations with the real, vendor-published artwork shipped as
static assets under `apps/web/public/agent-icons/`.
- 13 SVG marks sourced from `@lobehub/icons-static-svg` (MIT) — color
variants where the vendor published one (Claude, Codex, Gemini,
Copilot, Qwen, Qoder, DeepSeek, Kimi, Mistral/Vibe), monochrome marks
for the rest (Cursor, OpenCode, Hermes, MiMo, Pi, Kilo).
- 1 PNG mark (Devin) sourced from devin.ai/icon.png, resized to 96×96
via `sips` since Cognition doesn't publish an SVG.
- Each SVG was cleaned (stripped `<title>` brand text and the library's
internal `style="flex:none;..."` ; dropped `width/height="1em"` so
`viewBox` governs sizing) and run through `svgo --multipass`. Total
bundle footprint: ~36 KB for all 17 files, only loaded on the agent
cards that render them.
- `AgentIcon` now resolves brands via a small `ICON_EXT` table and
renders `<img src="/agent-icons/<id>.<ext>">`. Agents without an asset
(`devin` is the lone outlier removed in this commit because PNG; new
agents with no shipped artwork at all) fall back to an initial-letter
pill that reads as "no official mark yet" rather than inventing
brand artwork.
- Removes the `simple-icons` dependency from a previous iteration since
`AgentIcon` was its only consumer.
Public-API stable: `<AgentIcon id={a.id} size={X} />` still accepts the
same prop shape; `AvatarMenu`'s small-size usage continues to work.
* refactor(web): polish entry view + Settings dialog UI for v0.7.0
A sweep over the two surfaces that have the most visual surface area in
the app (the entry sidebar / New Project panel on the left, and the
Settings modal). The work converged on a single neutral palette + a
small set of shared dimensional standards documented in CSS, so future
sections that get added slot into the same rhythm.
New Project panel (apps/web/src/components/NewProjectPanel.tsx +
.newproj* rules in index.css)
- Adds a spec comment block at the top of the .newproj rules listing
the canonical heights (input 30, dropdown 38, compact toggle 36,
popover item 38) and the neutral colour rules.
- Rebuilds PlatformPicker as a DS-picker-style dropdown trigger +
popover (the previous 6-card 2×3 grid was ~280px tall; the dropdown
collapses to a single 38px row with the same multi-select semantics).
- Replaces SurfaceOptions' two heavy `ToggleRow` cards with the new
compact one-line `CompactToggle`; the descriptive hint moves to a
native `title` tooltip.
- Compresses the Fidelity card grid (thumb aspect 16/7 → 16/5, tighter
padding, smaller label).
- Neutralises every selected/active state inside the panel: removes the
orange accent fills and rings from `.newproj-card.active`,
`.newproj-title-badge`, `.compact-toggle.on`, `.toggle-row.on`, the
DS picker popover items + radio/check marks, the trigger open border
and shadow, and the search-bar background. The Create CTA stays the
only orange element on the panel.
- Aligns the project-name input focus state across the sidebar:
border `var(--text)` + 8% black halo (rgba is written out because the
CSS pipeline collapses `color-mix(... 8%, transparent)` down to a
solid `var(--text)`, which would render as a 3px solid black band).
- Switches the body card from `flex: 1 1 auto` to `flex: 0 1 auto` so a
short form variant doesn't leave a white void at the bottom of the
card, and disables overscroll-bounce on the card so a fast scroll
doesn't briefly expose the page-level gray under the white surface.
- Pins the privacy footer below the card with a fixed 0 margin-top +
shorter padding-top so it reads as a label of the card rather than a
centred dialog footer.
Entry sidebar footer (apps/web/src/components/EntryView.tsx +
.entry-side-foot* rules)
- Replaces the X social pill's `external-link` placeholder glyph with a
bespoke filled `x-brand` SVG that mirrors the `discord` mark already
in the icon set.
- Wraps Discord + X in `.entry-side-foot-social` and lets that group
flex-margin to the right of the row, so the two social pills read as
a tight pair instead of a fourth pill stuck to the Pet pill.
- Drops the "unadopted" red dot on the Pet pill (it duplicated the call
to action that the label already carried).
- Shrinks the footer icons to 10px and dims them to 55% / 75% opacity
on hover so the labels are clearly the focal point — `currentColor`
on the lucide-rendered SVGs would otherwise make the glyphs full
black on hover.
- Tightens the env-pill version text cap (180 → 142) so the top row
ends close to the right edge of the Language + Pet group below it.
Settings dialog (apps/web/src/components/SettingsDialog.tsx +
.modal-settings / .settings-* / .seg-* / .agent-* rules)
- Removes the "SETTINGS" kicker eyebrow above each section title (the
big-typography title and modal context already make it redundant).
- Switches the sidebar from a card-per-item layout to ChatGPT-style
single-line pills: hides the `<small>` description, swaps the
sidebar bg from gray to white, makes the active item a gray pill (no
border, no shadow) so all items keep a consistent row height
regardless of state.
- Drops the modal-body's top border (already separated by the
whitespace between modal-head and the body grid) and pins
`.modal-settings { height: min(720px, 100vh - 64px) }` so the
dialog no longer resizes when the user switches between short and
long sections.
- Compresses the Local CLI / BYOK seg-control from a 2-line ~52px card
pair to a 1-line ~42px segmented pill that height-matches the active
sidebar nav-item, and aligns the `.settings-content` padding-top
with `.settings-sidebar` (22 → 16) so the first content row sits
level with the first sidebar item.
- Neutralises agent-card selected state, install/docs link colour, and
protocol-chip active state — same accent-stripping pattern as the
New Project panel.
- Uniform agent-card height via `min-height: 64px` so installed cards
(icon + name + version) align with unavailable cards (icon + name +
not-installed + Install/Docs row).
No prop-API changes, no business-logic edits — this is a pure visual
refactor. Existing tests, providers and daemon contracts are untouched.
|
||
|
|
b2841f6045
|
Fix user chat message bubble styling (#1517)
Co-authored-by: Haoyuan Yu <haoyuan.yu@shopee.com> |
||
|
|
6736310a01
|
Implement manual edit inspector (#1448)
* feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal * Polish manual edit inspector * Implement manual edit inspector * Fix manual edit review regressions * Fix FileViewer CI regressions * Fix remaining manual edit review issues * Flush manual edit styles before draw exit * Restore Critique Theater styles * Accept pixel line-height manual edits --------- Co-authored-by: qiongyu1999 <2694684348@qq.com> |
||
|
|
c9cc3b88c0 |
feat(web): standardize plugin terminology and enhance UI components
- Updated terminology from "Community" to "Official" across various components to reflect first-party plugin status. - Enhanced the ChatComposer, HomeHero, and PluginsHomeSection components to improve user experience and clarity in plugin management. - Improved CSS styles for better visual consistency and layout across plugin-related interfaces. - Added tests to ensure proper functionality and visibility of official plugins in the UI. This update reinforces the distinction between official and user-installed plugins, enhancing the overall user experience in plugin interactions. |
||
|
|
0f0d2879ff
|
Make de/fr/ru content i18n optional (#1511) | ||
|
|
dc7791ef9d
|
feat(analytics): add project_id + project_kind to studio/artifact events (#1509)
Product tracking doc 260513 added project_id + project_kind to studio_view (artifact), studio_click (share_option), and artifact_export_result. The Studio funnel can now group by project type without joining run_created on the back end. - contracts: 3 props gain required project_id + project_kind - ProjectView → FileWorkspace → FileViewer: thread projectKind down, converting metadata.kind via projectKindToTracking once at the top - FileViewer + HtmlViewer: populate the three call sites |
||
|
|
c16297f10c
|
Refine preview and project dropdown controls (#1514)
* Refine preview and project dropdown controls * fix(web): gate OS widget metadata Generated-By: looper 0.7.4 (runner=fixer, agent=codex) * fix(web): mark platform picker listbox multi-select Generated-By: looper 0.7.4 (runner=fixer, agent=codex) |
||
|
|
e2f409579d
|
docs: Critique Theater Phase 14 (user guide + 2 AGENTS module maps) (#1319)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* feat(web): CritiqueTheaterMount wires SSE + reducer into a single drop-in (Phase 9.1)
* feat(i18n): Critique Theater strings for de + ja + ko + zh-TW (Phase 9.2)
* fix(web): resolve P1 + P2 review feedback on Phase 9 (PR #1315)
Addresses every blocker from codex, Siri-Ray, and lefarcen. The
three state-lifecycle and SSE-validation issues they also flagged
inherit fixes from PR #1314's review pass that this branch now sits
on top of after rebase.
Real daemon kill on Interrupt (P1)
- CritiqueTheaterMount now POSTs to
/api/projects/:id/critique/:runId/interrupt alongside the
optimistic local dispatch. Before this fix, clicking Interrupt
only flipped the React state to interrupted while the daemon job
kept running. The fetch is best-effort: a 404 (endpoint not wired
yet, lands in Phase 15) is swallowed with a dev-mode console.warn
so the UI still moves to the collapsed badge.
- New fetchInterrupt test seam lets RTL assert on the URL / method
and simulate the "daemon not ready yet" path. Two tests pin both:
the happy URL proj-42/critique/run-abc/interrupt POSTs, and a
rejected fetch still flips the UI.
interruptPending reset on new run (P2)
- A ref-backed effect compares the current runId against the last
one we saw; when it changes, interruptPending is cleared. A user
who interrupts run-1 and then triggers run-2 from the same mount
now gets a fresh, enabled kill button instead of one stuck in
"Interrupting…". Pinned by a new mount test.
Escape keybind scope (P2)
- InterruptButton now checks the keydown target. Escape inside an
input, textarea, select, or contenteditable element is ignored
(and any ancestor of those via closest() is treated the same
way). Body-level focus still fires the keybind so the Theater
area's affordance keeps working. Four new tests cover textarea,
input, contenteditable, and the body-focus positive case.
userFacingName i18n key (P2)
- The spec at specs/current/critique-theater.md:6 mandates a single
critiqueTheater.userFacingName key so the "Design Jury" label can
be renamed without touching code. Phase 8 introduced
critiqueTheater.title by mistake; renamed across types.ts, en.ts,
zh-CN.ts, de.ts, ja.ts, ko.ts, zh-TW.ts, and the lone consumer
TheaterStage.tsx. The locale alignment test stays green.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 14 files, 112 tests (was 101 before, +11 new for
the Phase 9 review pass: 3 mount + 4 InterruptButton focus scope;
the rest were already in #1314's review fix).
- tests/i18n/locales.test.ts 5 of 5 across 18 locales.
* feat(daemon): adapter-degraded registry with TTL (Phase 10.1)
In-memory registry recording adapters that produced malformed or
oversize transcripts so the orchestrator can skip them for a TTL
window (default 24h) instead of cycling through known-bad providers
on every run.
Records carry reason (malformed_block | oversize_block |
missing_artifact), source label, and expiresAt. The test-only
clock seam lets the suite advance time deterministically and prove
that an expired entry stops counting as degraded without anyone
calling clearDegraded.
7/7 vitest cases green.
* feat(daemon): synthetic good + bad adapter fixtures (Phase 10.2)
Two test-only adapters that read the existing v1 transcript
fixtures (happy-3-rounds and malformed-unbalanced) and replay them
as either a full string or a 512-byte chunked stream. The chunked
form is what the conformance harness uses to prove the parser
holds together when the transcript arrives in arbitrary network
slices, not as one buffered blob.
* feat(daemon): adapter conformance harness (Phase 10.3)
runAdapterConformance pulls a transcript through the same
parseCritiqueStream pipeline the orchestrator uses and classifies
the outcome as shipped, degraded, or failed. On a degraded
outcome it forwards the matched reason to the adapter-degraded
registry, so a single nightly conformance run is what populates
the skip list rather than the orchestrator learning each adapter
is broken at request time.
5/5 vitest cases green covering shipped, malformed degraded,
oversize degraded, no-ship failure, and the harness-thrown
failure path.
* test(e2e): Critique Theater Playwright suite (Phase 11)
Six tests, one viewport per visual case, deterministic SSE
fixtures stubbed via page.route(). Adds the suite to
test:ui:extended so the existing extended-UI lane picks it up.
Coverage:
1. Happy path: a single mounted theater plays the full
fixture (1 run_started, 5 panelists open / dim / must_fix /
close, 1 round_end, 1 ship) and ends on the score badge.
2. Interrupt mid-run: the panelist that is open at the time
the interrupt button is clicked closes with an interrupted
marker and the transcript freezes there.
3. Visual regression at 375x720 mobile.
4. Visual regression at 768x1024 tablet.
5. Visual regression at 1280x800 desktop.
6. A11y role tree: the theater region exposes a labelled
landmark, each panelist lane is a group with an accessible
name, the score is a status live region.
All SSE traffic is stubbed by page.route so the suite runs in CI
without a daemon. The toggle is seeded via localStorage by
bootAppWithCritiqueEnabled so the gate behaves as if Settings
flipped it on. typecheck clean; playwright --list reports 6.
* test(web): reducer p99 bench at 10k iterations (Phase 13.1)
Locks the documented 2ms budget for the Critique Theater reducer
on a representative SSE script (27 actions, one full happy run)
behind a regression gate. Asserts p99 stays under 4ms (2x the
documented budget) so CI runners with a noisy neighbour do not
flake while a real regression to 20ms or 200ms still trips.
The bench is a vitest case rather than a bare microbenchmark so
it runs in the same CI lane as every other web test and does not
need a parallel runner.
* test(web): critique surface coverage walker (Phase 13.2)
Walks the public critique surface (11 SSE event names, 5 panelist
roles, 6 lifecycle phases, 9 named i18n keys) and asserts each
named symbol appears in both the src corpus and the test corpus.
The walker is the gate that catches a rename in one half of the
codebase without a matching update in the other half: a future
PR that drops 'panelist_must_fix' from the reducer without also
removing its test reference fails this suite.
62 assertions, one per symbol per corpus.
* docs: Critique Theater user guide (Phase 14.1)
Seven sections aimed at end users (not contributors):
1. What is Design Jury
2. How it works (the five panelists, auto-converging rounds,
the composite formula)
3. Settings (the M1 toggle and what it does)
4. Reading the score badge
5. Replay surface
6. Troubleshooting (degraded, interrupted, failed)
7. FAQ
The composite formula is documented as
designer * 0 + critic * 0.4 + brand * 0.2 + a11y * 0.2 + copy * 0.2
because anyone trying to reverse-engineer the score is going to
search for those weights and the docs are the place they should
land first.
* docs(daemon): critique module AGENTS map (Phase 14.2)
Daemon-side wayfinder for the apps/daemon/src/critique directory.
Tables every file, what owns what invariant, and the 'when you
change anything here' guide so a future contributor does not
have to reverse-engineer the rollout resolver before adding a
new SSE event.
* docs(web): Theater module AGENTS map (Phase 14.3)
Web-side mirror of the daemon AGENTS map. Same file table, same
invariants section, same change-impact guide, sized to the
Theater component package.
* docs: tighten Phase 14 reasoning from lefarcen review (PR #1319)
Four content gaps lefarcen flagged in the Phase 14 docs review,
addressed inline rather than deferred. The fifth item (scope-drift
between 'docs only' PR body and the cumulative stacked diff) is
handled by rewriting the PR body, not the docs.
1. Round exit conditions (lefarcen P2-1).
docs/critique-theater.md §2 'Auto-converging rounds' now lists
the five conditions that stop a run (threshold reached, round
budget exhausted, per-round timeout, total timeout, user
interrupt) with their default values. A user debugging a run
that stopped at round 1 with composite 5.4 can read this list
and find the matching cause without spelunking the orchestrator.
2. Prior-art comparison (lefarcen P2-2).
New §1.5 'Why an in-CLI panel and not a third-party design lint'
pre-answers the 'why not Figma lint / Adobe checker / Material
You conformance' question. Three differences: rule engines vs
generative reviewers, post-hoc vs in-loop, external service vs
same-CLI-session.
3. Composite formula rationale (lefarcen P2-4).
§2 now explains why each weight is set the way it is: critic
gates correctness so it gets 0.4; brand / a11y / copy are
secondary quality dimensions at 0.2 each; designer is at 0.0
in v1 because aesthetic preference is not a ship gate. The slot
stays in the schema so notes flow into the transcript and a v2
config release can bump the weight without a wire-shape change.
4. v2 cast-config ownership (lefarcen P2-3).
Both AGENTS.md files (daemon + web) now declare a 'Designer
weight frozen at 0.0 until v2 cast config' invariant. The
daemon side calls out where the SKILL.md frontmatter resolver
lands (apps/daemon/src/critique/config.ts); the web side calls
out where the Settings surface lands (apps/web/src/components/
Settings/). A contributor reading either AGENTS.md before
implementing v2 sees which module to touch first.
* docs(web): mirror the Designer-weight invariant in Theater AGENTS.md (PR #1319)
lefarcen P1 follow-up on PR #1319: the daemon AGENTS.md already
declares 'Designer weight is frozen at 0.0 until v2 cast config
lands' as an invariant, but the web AGENTS.md's parallel bullet
led with 'Composite weights are read-only on the web side' which
buried the Designer-specific constraint. A web contributor
reading that bullet would not realise the v1 weight distribution
is wire-shape (changing it mid-v1 invalidates persisted
critique_runs composite values).
Rewrote the bullet to lead with the same 'Designer weight is
frozen at 0.0 until v2 cast config lands' phrasing the daemon
side uses, and added an explicit cross-link to the daemon
AGENTS.md so the two halves of the invariant read as one rule.
Web-side specifics retained: ScoreTicker / TheaterCollapsed read
composite off the wire (no client recompute), v2 lands as a
Settings surface at apps/web/src/components/Settings/, do not
add a 'weights' prop to any component in this directory until
the contracts package carries the v2 cast type.
* docs: replace deferred metrics endpoint reference + refresh Theater module map (PR #1319)
Two carryover items lefarcen flagged across the PR #1319 + #1320
reviews.
1. docs/critique-theater.md was sending users to
/api/metrics/critique as the conformance-status check on
malformed_block, but the Phase 12 metrics endpoint is
explicitly deferred until after orchestrator wiring lands.
Replaced the link with the pnpm conformance-harness command
that DOES exist today (pnpm --filter @open-design/daemon
vitest run tests/critique-conformance.test.ts) and noted
that the dashboard surfaces this status as a series once
Phase 12 ships.
2. apps/web/src/components/Theater/AGENTS.md module map was
stale after Phase 15: the index.ts row said 'only two hooks
are exported' but the barrel now exports
useCritiqueTheaterEnabled too (plus the setCritiqueTheaterEnabled
setter). Updated the row to list all three hooks + the
setter + the reducer-derived contract types, and added a
new row for hooks/useCritiqueTheaterEnabled.ts in the file
table so a web contributor scanning the table sees the new
hook without inferring it from the index.ts blurb.
* fix(web): restore wait-for-daemon-ack pattern on Theater interrupt
Same regression as flagged on PR #1316 post-main-merge: the
optimistic local dispatch fired before the POST resolved, so a
daemon 404 / 409 still terminalized the UI and the real SSE
terminal event got ignored by the sticky interrupted phase.
Snapshot runId / bestRound / composite at click time, dispatch
interrupted only on res.ok, clear interruptPending on rejection or
non-2xx so the user can retry. Tests cover rejection + 404 leaving
the run on the live stage; the 204 path waits for the ack.
---------
Co-authored-by: Nagendhra <nagendhra405@gmail.com>
|
||
|
|
fcc3ae5838 |
feat(web): enhance HomeHero and related components for improved context selection and visibility handling
- Updated the HomeHero component to support skill and MCP server mentions, allowing users to select these options seamlessly. - Improved CSS styles for the HomeHero component, enhancing the visual presentation of active selections and context tabs. - Refactored visibility handling for slides in the deck framework, ensuring proper display logic and preventing visibility issues with variant classes. - Added tests to verify the functionality of context selection and visibility handling, ensuring a smoother user experience. This update significantly enhances the user interface and interaction capabilities within the HomeHero component, improving the overall experience for users managing skills and presentations. |
||
|
|
9006b74cde |
feat(web): enhance ChatComposer and related components with skill and MCP server integration
- Added support for skills and MCP servers in the ChatComposer, allowing users to apply skills and select MCP servers through mentions. - Updated the HomeHero and ProjectView components to manage skill selection and project skill changes. - Enhanced the EntryShell and other components to accommodate new skill and MCP server functionalities. - Improved CSS styles for better visual presentation of new features. - Added tests to ensure proper functionality of skill and MCP server integrations within the ChatComposer. This update significantly improves the user experience by enabling seamless integration of skills and MCP servers into the chat interface, enhancing project management capabilities. |
||
|
|
f7a13c7b15 |
feat(web): enhance HomeHero component to support IME composition handling
- Added a `composingRef` to track IME composition state, preventing submission during text composition. - Implemented `isImeComposing` function to check if the input is currently being composed. - Updated event handlers to ensure that submissions and plugin selections are blocked while composing. - Added tests to verify that submissions and plugin picks do not occur during IME composition, improving user experience for non-Latin input methods. This update enhances the input handling in the HomeHero component, ensuring a smoother experience for users utilizing IME for text input. |
||
|
|
e2952acd05 |
Revert "fix(web): restore consistent app header layout (#1432)"
This reverts commit
|
||
|
|
c36609c47d |
feat(daemon, web): implement plugin sharing project creation and enhance CLI functionality
- Added new flags for conversation, message, agent, and model in the CLI to support enhanced plugin sharing features. - Introduced a new API endpoint for creating share projects for plugins, allowing users to publish to GitHub or contribute to Open Design. - Updated the UI components to facilitate the new sharing functionalities, including prompts for user input during the sharing process. - Enhanced the project management system to handle new plugin share actions, improving user interaction and experience. - Added tests to ensure the reliability of the new sharing features and their integration within the existing plugin management system. This update significantly enhances the plugin ecosystem by enabling users to share their creations more effectively and streamline collaboration. |
||
|
|
342ba44383
|
fix memory extraction history affordance (#1447) | ||
|
|
3d3119333c
|
fix(web): restore consistent app header layout (#1432)
* docs: add NotebookLM GitHub export script (#1062) * docs: add NotebookLM GitHub export script * fix: make NotebookLM export TOC anchors work * fix: escape TOC link text markdown chars * fix: include merged PRs when exporting --prs all * fix: allow --prs merged mode * fix: treat --limit as total export budget * fix: avoid starving buckets under global --limit * fix: support --issues none and handle repos w/ issues disabled * fix: avoid underfilling export when buckets empty * fix: keep disabled-issues fallback quiet * fix: silence disabled issues fallback * fix: satisfy script typecheck * prevent duplicate saves and add template deletion (#1294) * prevent duplicate template entries on repeated save * add delete button to saved template list Templates can now be removed from the template picker via a hover x button, calling the existing DELETE /api/templates/:id endpoint. * add missing onDeleteTemplate prop in test fixtures * add template deletion flow test for NewProjectPanel * reject template names longer than 100 characters * preserve original createdAt on template update * feat: add FAQ page skill (#1162) * fix: set writable OD_DATA_DIR default for nix run Fixes #1157 When running via 'nix run github:nexu-io/open-design', the daemon attempted to create runtime state under the Nix store package path: /nix/store/.../lib/open-design/.od/projects The Nix store is read-only at runtime, causing startup to fail with ENOENT when mkdir() tried to create the projects directory. This commit updates the nix run wrapper to export OD_DATA_DIR with a writable default ($HOME/.od) when the variable is unset. Users can still override it by setting OD_DATA_DIR before running. The Home Manager and NixOS modules already set OD_DATA_DIR, so they are unaffected by this change. * feat: add FAQ page skill Add a new skill for generating Frequently Asked Questions pages with: - Collapsible accordion sections for Q&A pairs - Real-time search functionality - Category filtering (Billing, Account, Technical, General) - Smooth animations and transitions - Keyboard navigation support - Mobile-friendly responsive design - Semantic HTML with proper ARIA attributes The skill includes: - SKILL.md with triggers, workflow, and output contract - example.html demonstrating a complete FAQ page with 12 questions Use cases: help centers, support pages, product documentation * fix: address PR review feedback for FAQ page skill - Fix craft slugs: use accessibility-baseline and state-coverage instead of non-existent slugs - Remove overly broad 'questions and answers' trigger - Add edge case handling for insufficient/excessive FAQs - Remove search highlighting requirement (XSS risk) - Update self-check to reflect filtering instead of highlighting Addresses review comments from @lefarcen and @chatgpt-codex-connector * feat: add localized copy for faq-page skill Add German, French, and Russian translations for the FAQ page skill example prompt to fix validation test failure. - DE: FAQ-Seite mit Akkordeon-Abschnitten, Suchfunktion und Kategoriefilterung - FR: Page FAQ avec sections accordéon, recherche et filtrage par catégorie - RU: Страница FAQ со складными секциями-аккордеонами, поиском и фильтрацией * fix: escape apostrophe in French translation Use double quotes to avoid syntax error with d'auth * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins (#1110) * fix(platform): add legacy ~/.fnm path to wellKnownUserToolchainBins fnm legacy installations use ~/.fnm/node-versions. Closes #1102 * fix: remove stray .fnm token from type declaration * docs: add Windows troubleshooting guide (#478) (#1170) * docs: add Windows troubleshooting guide (#478) Add docs/windows-troubleshooting.md with step-by-step fixes for the most common native-Windows setup errors: - Node 24 / nvm-windows gotchas (fake nvm file in System32) - pnpm not found after installation - Build scripts blocked by pnpm 10 (better-sqlite3, sharp) - Visual Studio / gyp build errors - Starting the dev server - Optional OpenCode CLI setup Also update CONTRIBUTING.md and QUICKSTART.md to link to the new guide instead of the vague "file an issue if it doesn't" note. * docs: fix Windows guide command accuracy (#1170) Address all 6 inline review comments from lefarcen: - Pin npm-global pnpm install to @10.33.2 (matches packageManager field) - Use where.exe instead of bare where (PowerShell alias conflict) - Fix OpenCode package: opencode-ai (not opencode), binary is opencode - Add EPERM fallback note for corepack enable on protected installs - Add Python check for gyp ERR! find Python - Expand diagnostic checklist with corepack, python, execution policy Also remove redundant corepack pnpm --version from checklist. * feat(daemon): inject compiled design-system tokens + fixture into prompts (#1385) * feat(daemon): inject compiled design-system tokens + fixture into prompts Follow-up to #1231. The prior PR landed the structured form of two brands (`default` + `kami`) and codified the schema; this PR teaches the daemon to actually consume those files when assembling the system prompt, so agents stop having to re-derive token names from DESIGN.md prose every turn. Gated behind `OD_DESIGN_TOKEN_CHANNEL=1` for the smoke-test phase — flag-off keeps the daemon byte-equivalent to today's behavior, flag-on appends two new prompt blocks (the brand's `tokens.css` :root contract and its `components.html` reference fixture) right after the existing DESIGN.md block. Brands without those sibling files (every brand except `default` and `kami` today) skip silently in either mode. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): only swallow ENOENT/ENOTDIR in readFileOptional, rethrow rest Reviewer feedback (nettee, #1385). The prior catch-all hid permission errors, EISDIR, and broken packaged-resource paths behind the same "undefined = absent" branch the legacy ~138-brand fallback uses, which would let `OD_DESIGN_TOKEN_CHANNEL=1` silently degrade to the DESIGN.md-only prompt while reporting success. That corrupts the exact signal the smoke-test rollout depends on. Now `readFileOptional` only returns undefined for ENOENT / ENOTDIR (real "file does not exist" cases) and rethrows everything else. Added a focused test that plants a directory at the tokens.css path to exercise the EISDIR branch, plus a partial-presence regression test to confirm the stricter contract preserves the legacy fallback. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16> Co-authored-by: Cursor <cursoragent@cursor.com> * feat(daemon): make connection-test timeouts configurable (#1222) * feat(daemon): make connection-test timeouts configurable Provider and agent connection tests had hardcoded 12s / 45s budgets, which are too tight for slow networks or distant providers (the user sees "timeout" in Settings with no way to extend the budget). - Add OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS (default 12_000) - Add OD_CONNECTION_TEST_AGENT_TIMEOUT_MS (default 45_000) - Invalid values (non-numeric, zero, negative, fractional) emit a console.warn and fall back to the default, so a typo in the env never silently disables the safety timeout. - Export resolveConnectionTestTimeoutMs for unit testing; cover the three resolution paths (fallback / honored override / invalid). 41 connection-test tests pass (+3 new), full daemon suite 1170/1170. * fix(daemon): reject connection-test timeout overrides above Node's setTimeout maximum Node's `setTimeout` silently clamps any delay above `2^31-1` ms (2_147_483_647) to ~1 ms with a TimeoutOverflowWarning. The previous `Number.isInteger(n) && n >= 1` check accepted oversized values unchanged and passed them straight to `setTimeout`, so an override that *intended* to raise the budget — e.g. `OD_CONNECTION_TEST_AGENT_TIMEOUT_MS=3000000000` — instead caused every connection test to fail almost immediately. The safety timeout was effectively disarmed. Add `MAX_CONNECTION_TEST_TIMEOUT_MS = 2_147_483_647` and switch the guard to `Number.isSafeInteger(n) && n >= 1 && n <= MAX...`. The boundary value is still accepted; one millisecond past it falls back with a warn. Regression test exercises `3_000_000_000`, `2_147_483_647`, and `2_147_483_648`. Addresses #1222 review feedback from @chatgpt-codex-connector, @mrcfps, and @lefarcen. * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN SSRF bypass) (#1122) * fix(security): strip trailing dot in normalizeBracketedIpv6 (FQDN bypass) new URL('http://192.168.1.5./').hostname returns '192.168.1.5.' — the trailing dot is the RFC 1034 absolute-FQDN form and resolves identically to '192.168.1.5'. parseIpv4 fails on the dotted form, so 169.254.169.254. slips past the metadata-service block, 192.168.1.5. slips past the LAN block, and localhost. slips past the loopback identification. Strip trailing dots in normalizeBracketedIpv6 so all downstream checks (isLoopbackApiHost, isBlockedExternalApiHostname, isBlockedIpv4, IPv6 range tests) see the canonical form. Adds 6 vitest cases covering loopback FQDN forms (localhost., foo.localhost., 127.0.0.1.) and SSRF FQDN bypasses (169.254.169.254., 192.168.1.5., 10.0.0.5.). Refs nexu-io/open-design#1119 review feedback (P2 from @lefarcen). * test(connectionTest): tighten trailing-dot coverage per #1122 review Two issues from #1122 review: 1. (P2 from @mrcfps + codex bot) The original `foo.localhost.` case asserted error===undefined on validateBaseUrl, which only proves the URL passed validation — not that the host is identified as loopback. Replaced with direct isLoopbackApiHost(...) assertions on the actual loopback FQDN forms (localhost., 127.0.0.1., 127.0.0.5.) so the test exercises the loopback path the comment claims. 2. (P3 from @lefarcen) Original blocked-FQDN tests covered only 3 of 7 ranges that isBlockedIpv4 handles. Added a dedicated case per range (0.0.0.0/8, 10/8, 100.64/10, 169.254/16, 172.16/12, 192.168/16, multicast >=224) so future regressions in normalizeBracketedIpv6 surface against the full coverage. * docs: drop misleading foo.localhost./endsWith claim in normalizer comment @lefarcen review feedback: isLoopbackApiHost only accepts exact 'localhost', '::1', loopback IPv4, and mapped loopback IPv4 — there's no subdomain or endsWith handling, so referencing 'foo.localhost.' overstates what the trailing-dot strip enables. Rewrite the comment to match actual call sites (isLoopbackApiHost equality + isBlockedIpv4 numeric parse). * feat(daemon): export self-contained HTML via /export/*?inline=1 endpoint (#1312) * test(daemon): add Red unit tests for inlineRelativeAssets helper 14 cases pinning the behavior contract for the upcoming apps/daemon/src/inline-assets.ts helper: - link/script inlining with verbatim body preservation - non-src script attrs preserved (type=module, defer, crossorigin) - relative path resolution (root + nested + deep-nested owners) - self-closing and single-quoted attr forms - negative cases: missing rel, rel=preload, absolute/data/blob/leading-slash - escaping: </style and </script inside body - null-fileReader graceful degradation - duplicate identical tags fully replaced (diverges from apps/web/src/components/FileViewer.tsx:5313's first-match-only; locked decision per plan §3.3) - HTML-escaped data-od-inline-asset attr Tests intentionally Red — module ../src/inline-assets.js does not yet exist. Phase B-G of plan declarative-roaming-gosling.md will turn them green by porting FileViewer.tsx:5248-5354 server-side. Refs nexu-io/open-design#368. * feat(daemon): port inlineRelativeAssets server-side for export endpoint Adds apps/daemon/src/inline-assets.ts — a pure helper that takes (html, ownerFileName, fileReader closure) and returns the HTML with every relative <link rel=stylesheet> and <script src> contents inlined into <style data-od-inline-asset="…">/<script>…</script> blocks. The fileReader closure keeps the helper free of fs/Express coupling so the route handler owns the filesystem boundary. Port source: apps/web/src/components/FileViewer.tsx:5248-5354 — five functions (inlineRelativeAssets, resolveProjectRelativePath, baseDirFor, readHtmlAttr, escapeHtmlAttr). The fetch hop becomes the fileReader closure; replace-all replaces first-match-only per locked design decision §3.3 (inline comment in inline-assets.ts cites the divergence from FileViewer.tsx:5313 and notes the web inline path is on a deprecation track since PR #384 made URL-load the default). Phase B-G of plan declarative-roaming-gosling.md. All 14 unit cases from the Red commit ( |
||
|
|
1f4259a190 |
feat(web): update routing and terminology for automation features
- Modified the routing logic to recognize 'automations' as a valid entry point alongside 'tasks', ensuring backward compatibility. - Updated the EntryNavRail component to reflect the change from 'tasks' to 'automations' in labels and tooltips. - Renamed instances of 'tasks' to 'automations' in the TasksView component for consistency and clarity. - Enhanced HomeView and chip actions to include new input structures for better handling of automation scenarios. - Updated tests to validate the new routing and terminology changes, ensuring proper functionality across the application. This update improves the user experience by clarifying the distinction between tasks and automations, aligning the UI with the updated terminology. |
||
|
|
358105c154 |
feat(web): enhance PluginsHomeSection with contribution card and improved plugin creation flow
- Updated the `onCreatePlugin` prop to accept an optional goal parameter, allowing for more contextual plugin creation. - Introduced a contribution card that displays when no plugins match the current filters, providing users with a prompt to create new plugins. - Enhanced the `resolveContributionTarget` function to return detailed information about the contribution context. - Added new CSS styles for the contribution card to improve visual presentation. - Updated tests to cover new contribution scenarios and ensure proper functionality of the contribution card. This update significantly improves the user experience by guiding users to contribute plugins when no matches are found, fostering community engagement. |
||
|
|
5f71968f61 |
feat(daemon, web): implement plugin sharing features for GitHub and Open Design contributions
- Added new API endpoints for publishing plugins to GitHub and contributing to Open Design, enhancing the plugin sharing capabilities. - Introduced functions for handling plugin sharing actions, including `publishGeneratedPluginToGitHub` and `contributeGeneratedPluginToOpenDesign`. - Updated the `DesignFilesPanel` and `FileWorkspace` components to support new sharing functionalities, allowing users to publish or contribute plugins directly from the interface. - Enhanced the UI with new buttons for publishing and contributing plugins, improving user interaction and experience. - Added tests to ensure the reliability of the new sharing features and their integration within the existing plugin management system. This update significantly improves the plugin ecosystem by enabling users to share their creations with the community and streamline collaboration. |
||
|
|
e1bc83a476
|
feat(analytics): PostHog product analytics (P0 events, consent-gated, packaged) (#1428)
* feat(analytics): scaffold PostHog product-analytics integration
- Add @open-design/contracts/analytics subpath with the 17 P0 event
payload types, header constants, and code↔CSV enum mapping helpers.
- Add apps/daemon/src/analytics.ts with env-gated posthog-node client,
request-scoped analytics context reader, and artifact-id anonymizer.
- Expose GET /api/analytics/config so the web bundle never embeds the
PostHog key at build time; daemon owns POSTHOG_KEY / POSTHOG_HOST.
- Add apps/web/src/analytics module (identity + lazy posthog-js client
+ React provider) and mount it under <I18nProvider> in app/layout.
No event wiring yet — that lands in the next commit alongside trigger
points (App.tsx, EntryView, NewProjectPanel, SettingsDialog, FileViewer,
runs.ts).
* feat(analytics): wire app_launch, home_view, home_click, project_create_result
- App.tsx: fire app_launch once after first effect tick. handleCreateProject
now emits project_create_result on both success and failure paths.
- EntryView.tsx: home_view (page) gated on agents loading so
has_available_cli isn't transiently false; home_view (asset_panel) fires
per top-tab change with the right result_count.
- NewProjectPanel.tsx: home_click create_button fires before delegating to
the parent; a fresh request_id is generated here and threaded through
onCreate so the matching project_create_result stitches via $insert_id.
- contracts/analytics: tighten createTabToTracking and topTabToTracking
for the worktree branch's renamed tabs (live-artifact, templates).
* feat(analytics): wire settings_view + 3 settings_click events
- settings_view fires on dialog mount and on every section switch,
carrying the active section (mapped via settingsSectionToTracking
for the 16-section worktree layout), execution_mode, and the
selected CLI provider id when present.
- settings_click execution_mode_tab: setMode now emits before/after
values whenever the user toggles between Local CLI and BYOK.
- settings_click cli_provider_card: agent card onClick reports
cli_provider_id via agentIdToTracking (kiro → other).
- settings_click byok_field: onFocus added to api_key, model select,
and base_url inputs; provider_id widened to include google so the
worktree's Gemini protocol slot type-checks.
* feat(analytics): wire studio_view + studio_click chat, studio_view artifact
- packages/contracts/src/analytics/artifact-id.ts: FNV-1a 64-bit helper
produces a 16-hex anonymized id for (projectId, fileName). Stable
cross-platform so the daemon and the web bundle resolve the same id
without a Web Crypto round-trip; daemon now re-exports it.
- ChatComposer: studio_view chat_panel fires once per project mount,
studio_click chat_composer fires on attachment + send buttons with
estimated user_query_tokens (length/4) and has_attachment.
- FileViewer: studio_view artifact fires once per (project, file) at
the dispatcher level, before any sub-viewer renders, with
artifact_kind derived from the renderer registry / file.kind table.
- Widen TrackingExportFormat to include markdown and cloudflare_pages
so the worktree branch's full share menu can emit verbatim.
* feat(analytics): wire studio_click share_option + artifact_export_result
HtmlViewer's share menu now emits both events per click via a
fireShareExport helper:
- studio_click share_option fires immediately on click with the chosen
export_format and a fresh request_id.
- artifact_export_result fires when the export resolves — success for
sync exporters (html, markdown, template) the moment the call
returns, success/failed for async exporters (pdf, zip, deploy)
via .then/.catch. The same request_id threads both events so
PostHog stitches click → result via $insert_id.
DEPLOY_PROVIDER_OPTIONS maps to the CSV's vercel / cloudflare_pages
slots; markdown is now a first-class export_format value.
Also ignore .env.local so local POSTHOG_KEY / .env-style secrets
don't get committed.
* feat(analytics): emit run_created and run_finished from the daemon
POST /api/runs now reads the analytics context off the
x-od-analytics-* headers the web client sets on every fetch, then:
- Captures run_created with project_id, conversation_id, run_id,
model_id, agent_provider_id (mapped via agentIdToTracking),
skill_id, design_system_id, plus the token_count_source marker.
- Schedules a run_finished capture on runs.wait(run) resolution,
mapping succeeded/canceled/failed to success/cancelled/failed and
reporting total_duration_ms.
Both events use a stable insert_id derived from the same uuid so
PostHog dedupes the daemon-side mirror against any future
web-side capture without double-counting.
Token sub-fields (user_query_tokens/system_prompt_tokens/...) stay
omitted in v1 — the claude-stream parser only exposes input/output
totals today. See tracking-doc-issues.md §3.2.
* feat(analytics): emit settings_cli_test_result + settings_byok_test_result
The original BLOCKING-list assumed these CSV P0 events were not
implementable in this branch because main lacked Test buttons. The
worktree HEAD actually wires `handleTestAgent` and `handleTestProvider`
in SettingsDialog, so both events are now in scope.
- handleTestAgent emits settings_cli_test_result on success and
failure paths with cli_provider_id mapped via agentIdToTracking,
result drawn from result.ok / catch branch, error_code from
result.kind or the thrown error name, and duration_ms timed via
performance.now().
- handleTestProvider emits settings_byok_test_result analogously,
using apiProtocol (anthropic|openai|azure|ollama|google) directly
as provider_id — wider than the CSV's 5-value enum, documented in
tracking-doc-issues.md §2.5.
Contracts: add SettingsCliTestResultProps / SettingsByokTestResultProps
plus matching track* helpers. AnalyticsEventName union now covers all
14 P0 events this branch supports.
* feat(analytics): gate PostHog on the existing telemetry.metrics consent
The integration now reuses the same first-launch privacy banner +
Settings → Privacy toggle that gates Langfuse, so a single user
decision controls both telemetry sinks.
- /api/analytics/config now consults the persisted AppConfigPrefs:
it returns enabled=true only when POSTHOG_KEY is set AND the user
has chosen "Share usage data" (telemetry.metrics === true). The
response also echoes installationId so the web client uses the
same anonymous id Langfuse keys off of — one identity per install,
shared across both sinks.
- Web AnalyticsProvider:
- Bootstrap fetch resolves installationId and threads it through
the x-od-analytics-anonymous-id header on every /api/* fetch,
so daemon-side captures (run_created / run_finished /
project_create_result) land on the same person record.
- Exposes a setConsent(granted) method that calls posthog-js's
opt_in_capturing / opt_out_capturing, wired from App.tsx via a
useEffect watching config.telemetry?.metrics. Toggling Privacy
→ metrics now stops/resumes events immediately, no reload.
- app_launch additionally gates on telemetry.metrics so a freshly-
declined user fires nothing, and a freshly-opted-in user fires on
the next reload.
* feat(packaging): bake POSTHOG_KEY into packaged daemon spawn env
Wires PostHog product analytics through the same Langfuse-style build-
secret pipeline so official Open Design builds ship with the key while
fork builds compile without it (the integration short-circuits cleanly
when POSTHOG_KEY is absent).
tools/pack
- resolveToolPackConfig reads POSTHOG_KEY / POSTHOG_HOST from
process.env at packaging time, validates them (no whitespace in the
key, http(s) URL for host, trailing-slash strip), and stamps them on
ToolPackConfig. Fork builds without the env vars simply omit the
fields; the daemon-side gate keeps things off in that case.
- Mac, Windows, and Linux packaged-config writers each append the two
fields to open-design-config.json next to the existing
telemetryRelayUrl entry.
apps/packaged
- RawPackagedConfig / PackagedConfig surface posthogKey / posthogHost
so the Electron entry and headless entry both forward them to the
daemon sidecar.
- buildPackagedDaemonSpawnEnv emits POSTHOG_KEY / POSTHOG_HOST into
the daemon child env when present. The daemon's existing analytics
module reads these via process.env — no daemon-side changes needed.
- The headless packaged path falls back to process.env for fields the
builder hasn't injected, mirroring how OPEN_DESIGN_TELEMETRY_RELAY_URL
is read there.
CI
- release-beta.yml and release-stable.yml expose POSTHOG_KEY (secret)
and POSTHOG_HOST (var) at workflow-env scope so every packaging job
inherits them. PR / fork builds without these set simply skip the
bake step.
Tests
- tools/pack: config.test.ts covers bake-through, fork-build omission,
whitespace rejection, invalid-URL rejection, and trailing-slash
normalization.
- apps/packaged: sidecars.test.ts covers buildPackagedDaemonSpawnEnv
forwarding the keys when present and omitting them when null.
* feat(analytics): enable PostHog autocapture + perf + exceptions
Flip on the PostHog SDK's automatic diagnostic features so we capture
click paths, page transitions, web vitals, dead clicks, and browser
exceptions without scattering instrumentation through the codebase.
Privacy defense lives in one place — apps/web/src/analytics/scrub.ts —
wired in via posthog-js's `before_send` hook so every outgoing event
passes through the same audit point:
- $autocapture / $rageclick / $dead_click / $copy_autocapture:
strips $el_text and value/placeholder/aria-label attrs from any
input, textarea, password input, or contenteditable element. PostHog
autocapture does not capture input.value by default, but $el_text
on a <textarea> reflects the typed content — that's the prompt
body for us, so it has to be scrubbed every time.
- $pageview / $pageleave: drops query string and fragment from
$current_url / $referrer so any future ?q=… can't leak.
- $exception: rewrites file:// and absolute filesystem paths in
stack frames to app://apps/<repo-relative> so we don't ship the
user's home directory.
- Suppresses $opt_in entirely — duplicate of our explicit
setConsent toggle in App.tsx.
Element-level defense in depth is limited to the single most sensitive
surface: the chat composer textarea gets `ph-no-capture` so PostHog
never even generates an event for clicks inside that subtree. Every
other input relies on scrub.ts — sprinkling the class through every
form would be noisy and easy to forget on new surfaces.
The existing Privacy → "Share usage data" toggle continues to gate
every new feature: posthog-js's opt_out_capturing() halts autocapture,
$pageview, $exception, web vitals, and dead clicks alongside the
explicit capture() calls — one global switch.
11 unit tests pin the scrub rules in apps/web/tests/analytics-scrub.test.ts.
* ci(nix): bump pnpmDepsHash for posthog-js + posthog-node additions
Adding posthog-js to apps/web and posthog-node to apps/daemon changed
pnpm-lock.yaml, which Nix's fixed-output pnpmDeps derivation pins by
sha256. The CI nix flake check failed with:
specified: sha256-KF3Mld72/iau+pJmA7HvnanRx8VLtDP0N624SKrtrrc=
got: sha256-PGFgX4lYyeH2TRAXfUq52A3EOa6bb1gO59hPsXhEk3s=
Copy the new hash into both nix/package-web.nix and
nix/package-daemon.nix per the procedure documented in nix/README.md
§"First-build hash pinning".
* feat(analytics): unify PostHog identity with Langfuse installationId
PostHog's distinct_id is the installationId stamped by /api/analytics/
config; Langfuse already reads the same id off app-config.json to
populate trace.userId. With both sinks keying off the same anonymous
identity, dashboards can correlate user actions (PostHog events) with
LLM runs (Langfuse traces) without re-identifying.
Two gaps closed:
1. applyConsent(false) — clear posthog-js's persisted ph_*_posthog
localStorage entry on opt-out via posthog.reset(). Without this, a
user who opts out, then clicks Delete my data, then re-opts in
would see PostHog stitch their new session to the deleted identity
because bootstrap.distinctID only takes effect on first init.
2. applyIdentity(newInstallationId) — Delete my data rotates the
installationId in app-config; App.tsx now watches config.installationId
and calls posthog.reset() then identify(newId) so the next event
batch is fully decoupled from the deleted one. Idempotent on
same-id re-renders so benign config refreshes don't churn PostHog
identities.
The fetch wrapper's x-od-analytics-anonymous-id header also flips to
the new id on rotation so daemon-side captures (run_created /
run_finished) land on the same person record from the very next API
call, not after a reload.
The end-to-end rotation flow is verified against a live PostHog
project; these unit tests pin the safety guards (no-client paths, null
inputs) since stubbing posthog-js's init-loaded callback chain is
brittle.
* fix(langfuse): require both metrics AND content consent for trace reports
Tightens the Langfuse gate so a user who shares anonymous metrics but
NOT conversation content stops emitting Langfuse traces entirely —
Langfuse is used for turn-quality evals which only make sense with
prompt/output bodies. PostHog (product analytics, content-free) stays
gated on `metrics` alone and is unaffected.
i18n: "Conversation content" → "Conversation and tool content" with
hints expanded to mention tool inputs/outputs so the consent surface
matches what the trace actually carries (en + zh-CN).
Bundled here per PR scope — change originated outside this PostHog
PR but lands cleanly on the same files; gating Langfuse strictly
on `content` makes the dual-sink consent model (PostHog = metrics,
Langfuse = metrics + content) symmetric across both i18n locales and
the daemon-side gate.
* feat(analytics): wire byok_provider_option + fix PR review P1s
Adds the BYOK protocol-chip click event (5-value provider_id mirroring
the apiProtocol Settings UI) and resolves four P1 review threads on
PR #1428.
byok_provider_option:
- New SettingsClickByokProviderOptionProps in contracts (provider_id =
anthropic|openai|azure|google|ollama; maps to CSV's 5 values per
tracking-doc-issues.md §2.5).
- trackSettingsClickByokProviderOption helper in apps/web/src/analytics.
- SettingsDialog hooks it on the protocol-chip onClick alongside the
existing setApiProtocol call; is_selected reflects whether the chip
was already active.
Review fixes:
1. client.ts (Siri-Ray): clear `initPromise` when the resolution is
null so a Privacy → metrics opt-in after a previous decline triggers
a fresh /api/analytics/config fetch. Without this, the disabled
response was cached forever — first-session opt-in needed a reload
to start sending PostHog events.
2. provider.tsx (Siri-Ray): replace `url.includes('/api/')` with a
strict same-origin + /api/ pathname check (shared
`isSameOriginApiCall` helper). Outbound third-party URLs containing
`/api/` (e.g. provider.example.com/api/x) no longer receive our
x-od-analytics-* headers.
3. provider.tsx (codex-connector, lefarcen): gate header injection on
`resolvedAnonId` being non-null. When Privacy → metrics is off,
/api/analytics/config returns enabled=false → resolvedAnonId stays
null → wrapper never installs → daemon can't read consent-bearing
headers → no daemon-side PostHog event. setConsent now also clears
resolvedAnonId on opt-out and re-fetches on opt-in.
4. daemon/analytics.ts (defense in depth): createAnalyticsService now
takes dataDir and capture() re-reads app-config to check
telemetry.metrics inside the fire-and-forget wrapper. Even if a
stale header somehow reaches the daemon after opt-out, the capture
is dropped before posthog-node.capture is called.
* fix(web): place "Share usage data" on the right in privacy consent banner
Swap button order in PrivacyConsentModal and the in-settings ConsentCard
so the affirmative "Share usage data" lands on the right and "Not now"
on the left. Matches the OK-on-the-right pattern users expect for
primary actions.
Both buttons keep equal visual prominence (same .privacy-consent-action
styling) so the swap doesn't change the EDPB equal-prominence stance
called out in the original Langfuse telemetry spec.
* feat(analytics): populate run_finished token totals from claude-stream usage
Daemon's claude-stream parser already emits agent usage events with
input_tokens / output_tokens totals; the run service buffers them in
run.events and Langfuse reads them out the same way. The run_finished
PostHog event was leaving these fields empty.
Scan run.events for the most recent agent usage frame on terminal
transition and emit input_tokens / output_tokens / total_tokens when
present. token_count_source flips to 'provider_usage' only when at
least one count landed; runs without provider-side usage data keep
'unknown'.
Provider does not break the input down into the 7 sub-fields the
tracking doc lists (memory / context / attachment / system_prompt /
…); those stay omitted until a parser change exposes them.
* feat(analytics): estimate user_query_tokens from prompt length
The user_query_tokens field for run_created / run_finished was hardcoded
to 0. We can't tokenize without bundling a model-specific tokenizer, but
the character/4 heuristic is the industry-standard estimate when one
isn't available and is enough for funnel analysis (prompt-length cohorts,
short-vs-long-query conversion rates).
Extracted from req.body via the same telemetryPromptFromRunRequest
pattern the daemon already uses for langfuse-bridge (currentPrompt then
message fallback). Only the integer count goes to PostHog — the prompt
text itself never leaves the daemon.
token_count_source flips appropriately:
- run_created with a prompt: 'estimated' (was 'unknown')
- run_created with no prompt: 'unknown'
- run_finished with provider usage: 'provider_usage' (overrides
baseProps' 'estimated' value)
- run_finished without provider usage: inherits 'estimated' or 'unknown'
from baseProps so input/output absent doesn't mask the estimate.
|
||
|
|
72ecc09326 |
feat(daemon, web): enhance plugin input handling and subcategory filtering
- Updated the `pickPluginFields` function to support legacy input aliases, improving compatibility with existing plugin structures. - Added tests to ensure the new input handling works correctly with legacy inputs when resolving plugin snapshots. - Enhanced CSS styles for subcategory elements in the plugin home section, improving visual clarity and user experience. - Introduced new tests for subcategory filtering within the active workflow lane, ensuring accurate plugin categorization. This update significantly improves the plugin management experience by enhancing input handling and refining the categorization system. |
||
|
|
26d21a942e |
feat(web): enhance plugin input handling and categorization
- Added support for plugin inputs in the EntryShell and HomeView components, allowing for more dynamic plugin interactions. - Updated the PluginsHomeSection to include subcategory filtering, improving the user experience when navigating plugins. - Enhanced the PluginsView and related components to reflect the new categorization model, transitioning to a workflow-based approach. - Refactored tests to ensure coverage for new input handling and categorization features, maintaining reliability across the application. This update significantly improves the plugin management experience by providing clearer categorization and enhanced input handling for plugins. |
||
|
|
49ea2499ac
|
[codex] Add draw annotation workflow (#1435)
* feat(web): tweaks palette popover with HSL hue-shift recoloring Adds a Tweaks color-palette popover to the HTML preview toolbar. Selecting a palette re-skins the iframe in place via a srcDoc-side bridge that walks the DOM and shifts every chromatic paint to the target hue while preserving each color's saturation and lightness — pale tints stay pale, bold CTAs stay bold, just in the new color family. Mono-noir desaturates instead of shifting. - runtime/srcdoc: new injectPaletteBridge + paletteBridge / initialPalette options - file-viewer-render-mode: paletteActive flips URL-load back to srcDoc so the bridge can be injected - FileViewer: state, popover, postMessage wiring, srcDoc + useUrlLoadPreview integration - PaletteTweaks: popover UI with Original + Coral / Electric / Acid forest / Risograph / Mono noir - PreviewDrawOverlay: stub pass-through until the draw branch lands * feat(web): hide finalize-design toolbar from project header * test(e2e): skip project actions toolbar flow after toolbar removal * Add draw annotation workflow * Restore project actions toolbar |
||
|
|
67a109335d |
feat(web, daemon): enhance plugin categorization and introduce new export scenarios
- Updated the plugin categorization system to reflect a workflow-based model, replacing the previous category bar with a curated workflow bar (From source, Generate, Export). - Added new export scenarios for Next.js, React, and Vue, providing users with starter plugins for downstream integration. - Enhanced the HomeView and PluginsHomeSection components to support the new categorization and improve user interaction with plugins. - Updated tests to cover new scenarios and ensure proper functionality across the updated plugin management system. This update significantly improves the user experience by providing clearer categorization and new tools for exporting Open Design artifacts. |
||
|
|
2d405fae96
|
fix:align artifact preview exit button (#1445) | ||
|
|
09a8fa8d64
|
feat(web): Critique Theater Phase 8 (8 Theater components, barrel, role-keyed CSS) (#1314)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
* feat(i18n): seed Critique Theater key block (en + zh-CN; other locales fall back via spread)
* feat(web): Theater PanelistLane component (Phase 8.1)
* feat(web): Theater ScoreTicker component (Phase 8.2)
* feat(web): Theater RoundDivider component (Phase 8.3)
* feat(web): Theater InterruptButton component with Escape keybind (Phase 8.4)
* feat(web): Theater TheaterDegraded chip (Phase 8.5)
* feat(web): Theater TheaterCollapsed post-run summary (Phase 8.6)
* feat(web): Theater TheaterTranscript replay surface (Phase 8.7)
* feat(web): Theater TheaterStage top-level container (Phase 8.8)
* feat(web): Theater CSS using existing semantic tokens (no hex literals)
* feat(web): Theater public exports barrel
* fix(web): resolve P2 + P3 review feedback on Phase 8 (PR #1314)
Addresses all 4 P2 + 3 P3 items from codex, Siri-Ray, and lefarcen.
State-lifecycle fixes (3 x P2)
1. Reducer learns a synthetic `__reset__` action (`CritiqueResetAction`).
Host hooks dispatch it when their gating prop changes so a stale
run from a prior project / transcript cannot bleed into the next
context. Reset is idempotent on idle (returns the same reference).
2. `useCritiqueStream` dispatches `__reset__` at the top of its
connection effect, so a workspace switch from project A (which
streamed a critique) to project B clears the reducer before the
new EventSource opens. enabled=false also clears.
3. `useCritiqueReplay` dispatches `__reset__` at the top of its
parse effect, so transcriptUrl swaps (including swap-to-null after
a replay reached `shipped`) lift the reducer back to idle before
the new fetch starts.
SSE validation (1 x P2)
4. `sseToPanelEvent` now runs a per-variant `hasValidVariantShape`
check after the cheap `isPanelEvent` predicate. A
`critique.ship` frame missing `composite` / `round` / `status` /
`artifactRef` is rejected before reaching the reducer, so
TheaterCollapsed can no longer crash on `undefined.toFixed(1)`.
Every variant's required fields are validated: run_started
(protocolVersion, non-empty cast, maxRounds, threshold, scale),
panelist_* (round, role, plus variant-specific shape), round_end
(round, composite, mustFix, decision in {continue,ship}, reason),
ship (round, composite, status, artifactRef.{projectId,artifactId},
summary), degraded (reason, adapter), interrupted (bestRound,
composite), failed (cause), parser_warning (kind, position).
Reducer correctness (1 x P2)
5. `panelist_open` now materializes the round + an empty panelist
view (`{dims: [], mustFixes: []}`) so TheaterStage can highlight
the in-progress lane the instant the tag opens. Before this, a
stream that emitted only `panelist_open` after `run_started` left
`rounds = []` and the UI rendered no current round until a later
`panelist_dim` arrived.
Polish (3 x P3)
6. Brand role tint swaps from `var(--magenta, var(--accent))` to
`var(--purple, var(--accent))`. `--purple` is actually defined
across the design systems; `--magenta` is not, so Brand was
silently falling through to `--accent` and looking identical to
Designer.
7. New i18n key `critiqueTheater.interruptedSummary` for the
interrupted-collapse copy ("Interrupted at round N, best
composite X.X"). Previously the interrupted branch reused
`shippedSummary` and the UI read "Shipped at round..." for a run
that specifically did not ship. Native value in en + zh-CN; other
locales fall back via `...en` spread.
8. `TheaterDegraded` heading id comes from `useId()` instead of a
hardcoded `theater-degraded-heading`, so two chips rendered on
the same page (chat history with multiple completed runs) keep
their aria-labelledby references unambiguous.
Tests (15 new cases)
- reducer.test.ts (+5): __reset__ on running/terminal/idle, panelist_open materializes round, panelist_open does not stomp prior panelist data.
- sse.test.ts (+6): variant-level rejection for ship without required fields, degraded without adapter, run_started with empty cast, panelist_dim with non-numeric score, round_end with unknown decision, plus a positive fully-formed ship.
- useCritiqueStream.test.tsx (+2): state reset on projectId change, state reset on enabled flip false.
- useCritiqueReplay.test.tsx (+1): state reset on transcriptUrl swap to null after a replay reached shipped.
- TheaterCollapsed.test.tsx (text-pinning update): asserts the interrupted branch reads "Interrupted at round 1" + "best composite 7.9", and explicitly NOT "Shipped at round...".
- TheaterDegraded.test.tsx (+1): two chips on the same page get unique aria-labelledby ids that each resolve to an `<h3>`.
Validated
- pnpm guard clean
- pnpm --filter @open-design/web typecheck clean
- Theater suite: 13 files, 101 tests (was 86 on the first Phase 8 push, +15 new)
- tests/i18n/locales.test.ts 5 of 5 across 18 locales
* fix(web): tighten isPanelEvent in contracts so enum + numeric fields are checked end-to-end (Siri-Ray round-3 P1 on PR #1314)
The variant validator on the web SSE path previously accepted any
`typeof === 'string'` for closed-enum fields (ship.status,
panelist_*.role, degraded.reason, failed.cause, parser_warning.kind,
run_started.cast[]) and any `typeof === 'number'` for numeric fields,
which let NaN / Infinity through. Downstream components index i18n
tables by enum value, so an unknown status or role would land
`SHIP_BADGE_KEY[final.status]` on undefined and crash the translator.
The replay parser had a separate gap: `useCritiqueReplay.parseTranscript`
called the cheap `isPanelEvent` header check directly, so a recorded
line like `{"type":"ship","runId":"r"}` reached the reducer with
composite, status, round, artifactRef, summary all undefined and
TheaterCollapsed then called `final.composite.toFixed(1)` on undefined.
Resolution: move all wire-side validation into the contract guard.
- Export const arrays for the closed enums:
SHIP_STATUSES, DEGRADED_REASONS, FAILED_CAUSES, PARSER_WARNING_KINDS,
ROUND_DECISIONS (PANELIST_ROLES already existed).
- Rewrite `isPanelEvent` in packages/contracts/src/critique.ts to be the
single deep validator: header (known type + non-empty runId) plus
every variant-specific required field plus closed-enum membership
plus Number.isFinite on every numeric field. Documented as the wire
source of truth.
- Drop the local `hasValidVariantShape` from web/sse.ts; sseToPanelEvent
now relies entirely on the contract guard, and parseTranscript in
useCritiqueReplay (which already uses isPanelEvent) gets the deeper
validation for free.
Tests (TDD, red-first):
- packages/contracts/tests/critique.test.ts: 13 new cases pinning the
strict guard directly (well-formed across every variant, every
rejection path: unknown type, empty/non-string runId, unknown enum,
non-finite numeric, missing variant field).
- apps/web/tests/components/Theater/state/sse.test.ts: 9 new cases for
each closed-enum rejection on the wire path plus a positive sweep
across every legal enum value across every variant.
- apps/web/tests/components/Theater/hooks/useCritiqueReplay.test.tsx:
2 new cases for incomplete and unknown-enum transcript lines.
Verified:
- pnpm --filter @open-design/contracts test 4 files / 30 tests green.
- pnpm --filter @open-design/contracts build clean.
- pnpm --filter @open-design/web typecheck clean.
- pnpm --filter @open-design/web test 107 files / 976 tests green.
* fix(contracts): enforce numeric domains in isPanelEvent (lefarcen P2 on PR #1314 round 4)
The strict guard from PR #1314 round 3 enforced enum membership and
Number.isFinite, but accepted any finite number where the contract
intends a specific domain: scale: 0 (ScoreTicker divides by it),
negative thresholds, fractional rounds, negative mustFix, etc.
ScoreTicker.tsx writes `var(--scale, ${state.scale})` into inline
CSS and divides by it for tick width, so a guard-passing scale: 0
shipped Infinity into the rendered style. Negative composite /
score values reached downstream code that assumes >= 0.
Resolution: mirror the daemon-side Zod domain constraints in the
runtime guard.
Three new helpers in packages/contracts/src/critique.ts:
- isPositiveInt(v): integer with v > 0. Used for round, maxRounds,
scale, protocolVersion (all 1-indexed in the orchestrator).
- isNonNegativeInt(v): integer with v >= 0. Used for mustFix,
position, bestRound. bestRound: 0 is the valid sentinel for
'interrupted before any round closed'.
- isNonNegativeFinite(v): finite number with v >= 0. Used for
composite, score, dimScore, threshold. Threshold may be
fractional (e.g. 8.5 on a scale of 10).
Cross-field check inside run_started: threshold <= scale (the daemon
Zod schema enforces this with an epsilon refine, the wire guard
matches the same intent).
Tests (TDD, red-first) added in packages/contracts/tests/critique.test.ts:
- 22 new rejection cases across every numeric field that
previously slipped through: scale: 0, negative scale, fractional
scale, maxRounds: 0, fractional maxRounds, protocolVersion: 0,
fractional protocolVersion, negative threshold, threshold > scale,
round: 0, fractional round, negative dimScore / score, negative /
fractional mustFix, negative composite, ship round: 0, negative /
fractional bestRound, negative interrupted composite, negative /
fractional parser_warning position.
- 3 positive boundary cases that must still pass: threshold == scale,
fractional threshold within [0, scale], interrupted with
bestRound: 0 (no round completed before interrupt), parser_warning
with position: 0 (start of stream).
Verified:
- pnpm --filter @open-design/contracts build clean.
- pnpm --filter @open-design/contracts test: 4 files / 59 tests green
(was 37 before the new domain cases).
- pnpm --filter @open-design/web typecheck clean.
- pnpm --filter @open-design/web test: 110 files / 1004 tests green;
no regression on Theater suite, sse validator, replay parser, or
assistant-feedback widget tests.
---------
Co-authored-by: Nagendhra <nagendhra405@gmail.com>
|
||
|
|
6f818d971d |
feat(daemon, web): implement plugin folder installation and enhance atom worker registry
- Added a new API endpoint for installing plugins from specified folder paths, improving the plugin management experience. - Introduced functions for normalizing and validating project plugin folder paths, ensuring robust error handling. - Implemented a registry for built-in atom workers, allowing for dynamic signal aggregation during pipeline execution. - Enhanced the `runStageWithRegistry` function to support multiple atom workers, merging their outputs with pessimistic logic. - Updated the UI components to display plugin folder candidates and facilitate user interactions for plugin installation. - Added tests for the new atom worker registry and plugin folder installation features, ensuring reliability and correctness. This update significantly enhances the plugin installation process and the overall functionality of the atom worker system, providing users with better tools for managing plugins and their interactions. |
||
|
|
61a68a34f8 |
feat(web): implement Home intent rail for streamlined project creation
- Introduced a new Home intent rail in the HomeHero component, allowing users to select project categories or migration shortcuts via chips. - Enhanced the EntryShell and HomeView components to support project kind selection based on the chosen chip, improving the project creation flow. - Added functionality for folder import and template selection, integrating with existing project creation mechanisms. - Updated CSS styles for the Home intent rail to ensure a responsive and visually appealing layout. - Implemented tests for the new chip interactions and HomeHero functionality, ensuring reliability and user experience. This update significantly enhances the user experience by providing a more intuitive way to create projects and manage migrations directly from the Home interface. |
||
|
|
ed2cbe171b |
feat(daemon, web): implement media generation scenario and enhance plugin handling
- Introduced a new `od-media-generation` scenario plugin for handling image, video, and audio projects, providing a default pipeline for media generation. - Updated the `collectBundledScenarios` function to deduplicate scenarios and prefer canonical IDs for task kinds, improving plugin routing. - Enhanced the `PluginsView` and `HomeHero` components to better display community and user-installed plugins, improving user experience. - Refactored tests to accommodate the new media generation scenario and ensure proper functionality across plugin types. This update significantly enhances the media handling capabilities and overall plugin management experience, making it easier for users to work with various media projects. |
||
|
|
13d5598b0c |
feat(web, daemon): enhance plugin import functionality and UI components
- Added support for uploading plugins via zip files and folders, improving the plugin import process. - Introduced a new `PluginImportModal` for a streamlined user experience when importing plugins. - Updated the `PluginsView` to include disabled states for unfinished plugin areas, enhancing clarity for users. - Refactored various components to utilize the new `resolvePluginQueryFallback` function for improved localization handling. - Enhanced CSS styles for better visual feedback and responsiveness in the plugin import interface. This update significantly improves the plugin management experience, making it easier for users to import and manage plugins effectively. |
||
|
|
443aea72c5 |
feat(daemon, web): enhance plugin handling and UI integration
- Introduced a new plugin upload mechanism with file size limits and memory storage, allowing users to upload plugins directly. - Implemented fallback logic for plugin application, ensuring projects can be created without explicit plugin requests. - Enhanced the UI to support plugin selection and integration, including a new `PluginsView` component for managing plugins. - Updated various components to utilize localized text for plugin queries, improving user experience across different languages. - Added tests for new plugin functionalities and local skill loading, ensuring reliability and correctness. This update significantly improves the plugin management experience, providing users with better tools for plugin integration and interaction. |
||
|
|
7b191b5f85
|
fix: load Orbit templates from design templates (#1442)
(cherry picked from commit
|
||
|
|
a7e6e0dc3d
|
fix(web/AssistantMessage): update status block detail to latest value instead of skipping (#1413)
When using Open Design with an ACP agent (e.g. Devin for Terminal), after selecting a non-default model in the picker, the model badge under the conversation header kept showing the agent's initial default (e.g. `model swe-1-6-fast`) instead of the running model. The conversation header text and the agent itself reflected the selected model correctly — only the badge UI was stale. Root cause: `buildBlocks()` in this file de-duplicated consecutive status events with the same label by SKIPPING the new one rather than updating the existing block. The daemon emits two `status: label='model'` events per turn — once after `session/new` returns with the agent's default model, then again after `session/set_config_option` succeeds with the user-selected model (see `apps/daemon/src/acp.ts` ~lines 495 and 587). The dedupe kept the first and skipped the second, so the badge stayed stuck on the default. Fix: update the existing block's detail to the latest value instead. "Most recent detail wins" is more accurate than "first one wins forever" for every status label that reaches this code path (`'model'`, `'initializing'`, etc.; the filter at lines 1156-1162 already drops the labels we don't want to surface as badges: streaming, starting, requesting, thinking, empty_response). Adds two regression tests in apps/web/tests/components/AssistantMessage.test.tsx: - Two sequential `status: 'model'` events with different details render the second detail and not the first (the Bug A scenario). - Two sequential events with the SAME detail still collapse to a single badge (no regression in the existing dedupe behavior). `pnpm guard` clean; AssistantMessage tests pass 22/22. Manually verified working in a local build against Devin for Terminal `2026.5.6-4` with #1208 applied — badge now updates to the selected model on every turn. |
||
|
|
e6c5560884
|
Fix appearance accent color persistence (#1439) | ||
|
|
3d6d434f11 |
feat(web): enhance TasksView and plugin preview components with new features and styles
- Updated `TasksView` to include a title row with a "Coming soon" label and added a preview note for user guidance. - Enhanced `DesignSystemSurface` to fetch and display design system showcases, improving visual representation in the plugin home section. - Refactored `PreviewSurface` to pass the `inView` prop to `DesignSystemSurface`, optimizing rendering behavior. - Improved CSS styles for tasks and design system components, ensuring a cohesive and visually appealing layout. This update significantly enhances user experience by providing clearer information and improved visual elements in the tasks and plugin previews. |
||
|
|
9f4e76d507 |
feat(web): introduce integrations and tasks views with enhanced navigation and settings management
- Added new `IntegrationsView` and `TasksView` components to facilitate user interaction with integrations and task management. - Updated the `App` component to manage initial tab states for integrations and settings navigation. - Enhanced `EntryNavRail` to include navigation options for tasks and integrations, improving accessibility. - Refactored `EntryShell` to support dynamic rendering of the new views and manage integration tab states. - Improved CSS styles for the new views, ensuring a cohesive design and responsive layout. This update significantly enhances the user experience by providing dedicated views for integrations and tasks, streamlining navigation and settings management. |
||
|
|
23218eacd9
|
refactor(settings): use tiled language picker instead of dropdown (#1406)
The Language section in Settings rendered a single-button dropdown trigger that opened a floating menu. With one visible label and lots of empty panel space, the layout misled users into thinking only one language existed. Replace the dropdown trigger + portaled menu with an inline tile grid that shows every locale at a glance and clicks directly to switch. Side effects of the new layout: the languageOpen / languageMenuRect state, the dynamic placement effect, the resize-close effect, the mousedown click-outside handler, and the languageRef are gone. The global Escape handler no longer needs to guard against the menu being open. CSS for .settings-language-picker, .settings-language-button, .settings-language-menu, and .settings-language-option is replaced by .settings-language-grid (auto-fill 180px minmax columns) + .settings-language-tile. Tests in SettingsDialog.execution.test.tsx that drove the dropdown (click trigger → click menuitemradio → assert menu closed) are rewritten to drive the tiles directly via the radio role. Refs #1347 |
||
|
|
0220124e0f
|
test: add Memory and Routines coverage (#1400)
* test: align extended Playwright coverage with current UI behavior * test: address extended suite review feedback * test: fix Codex fallback config hydration in e2e * test: add Memory and Routines coverage * test: fix Memory and Routines component test typing * test: include Memory and Routines e2e in extended suite |
||
|
|
7464cb443a |
feat(web): enhance plugin media detail and modal components with improved layout and functionality
- Introduced a new custom stage container for media plugins, allowing for a unified modal experience across different media types (image, video, audio). - Updated the `PreviewModal` to support custom ReactNode stages, enhancing the flexibility of media previews. - Refactored the `PluginMetaSections` to include an optional heading for better clarity and organization of plugin information. - Improved the close button design in various modals for consistency and better user experience. - Enhanced CSS styles for modals and plugin details to ensure responsive design and visual consistency across the application. This update significantly improves the usability and aesthetics of media-related modals, making it easier for users to interact with and understand plugin content. |
||
|
|
6c3fd86642
|
fix(daemon/acp): terminate ACP child after clean prompt completion (#1286)
* fix(daemon/acp): terminate ACP child after clean prompt completion (Bug B / #1265) Some ACP agents (notably Devin for Terminal) keep the child process alive after stdin closes, waiting for the next prompt. Open Design spawns a fresh agent per chat turn and relies on child.on('close') to finalize the run, so without an explicit signal-driven shutdown the chat sits stuck in the 'working' state indefinitely. Three small, targeted changes: - apps/daemon/src/acp.ts: After a clean session/prompt response we schedule a 500ms grace period and then SIGTERM the child. This mirrors the pattern detectAcpModels() already uses after model discovery. The grace period leaves well-behaved agents that exit on stdin.end() unaffected. - apps/daemon/src/acp.ts: New completedSuccessfully() method on the session handle reports whether the prompt resolved without a fatal error or abort, so the consumer can distinguish 'clean signal exit' from 'genuine signal failure'. - apps/daemon/src/server.ts: child.on('close') now treats a SIGTERM exit as 'succeeded' when acpSession.completedSuccessfully() is true. - apps/web/src/providers/daemon.ts: Trust the server's authoritative endStatus; the signal/non-zero-code safety net no longer overrides an explicit 'succeeded' status, so the chat doesn't surface a fake 'agent exited with signal SIGTERM' error after a clean ACP run. Daemon tests cover the SIGTERM grace timer, clean early-exit (timer cleared), and completedSuccessfully() abort/error states. Manual UI test on plain main + this fix confirms Devin chats now return to ready automatically after Done · ... * fix(daemon/connectionTest): treat ACP clean SIGTERM as success Codex review on #1286 caught that the new SIGTERM in attachAcpSession breaks ACP connection tests for agents that don't shut down on stdin.end() (the exact Devin behavior the patch targets). attachAgentStreamHandlers() in connectionTest.ts now also respects acpSession.completedSuccessfully(), mirroring the same check we apply in server.ts. Without this, a clean prompt response followed by our SIGTERM would set winner.signal === 'SIGTERM', flip exitedCleanly to false, and the connection test would report 'agent_spawn_failed' even when the agent had returned a healthy response. Also widened the AgentSpawnHandle type so completedSuccessfully is visible on the structural type used inside connectionTest.ts. All 56 daemon tests still pass; typecheck + guard clean. * fix(daemon/acp): narrow ACP success-on-signal override to forced-SIGTERM Looper review on #1286 caught that the success predicate was broader than the SIGTERM case it was meant to handle. `completedSuccessfully()` flips to true as soon as the ACP `session/prompt` response is processed, but it does not say why the child later closed. With the broad predicate, an ACP agent that returned a prompt result and then exited with code 1 (or was killed by SIGKILL/SIGSEGV) was still marked 'succeeded', regressing the existing close-status behavior for genuine post-response process failures. Scope the override to the exact forced-shutdown shape this PR introduces: code === null && signal === 'SIGTERM' && acpCleanCompletion Applied to both `server.ts` (chat run finalization) and `connectionTest.ts` (connection-test classification). Any other post-response failure now falls through to 'failed' / 'agent_spawn_failed' as before. All 59 daemon tests still pass; typecheck + guard clean. * fix(web/daemon): only bypass exit-code safety net on explicit server success Looper review on #1286 caught that the previous web change trusted `endStatus === 'succeeded'` absolutely, but `endStatus` can become 'succeeded' in two distinct ways: 1. The SSE end event explicitly carries `status: 'succeeded'` (authoritative server declaration). 2. The end event omits or has an invalid `status` field and the handler silently falls back to 'succeeded' as a local default. Both produced `endStatus === 'succeeded'` in the existing code, so the new safety-net bypass treated them identically. That regressed backward compat: a compatible or older daemon emitting an end event like `{code:1}` or `{code:null,signal:"SIGTERM"}` with no `status` would suddenly skip the failure banner. Track explicit success separately via `serverDeclaredSuccess`, set true only when: - The SSE end event has `status === 'succeeded'`, or - The fallback `fetchChatRunStatus` REST path returns `status === 'succeeded'` (which the existing `isChatRunStatus()` guard already proves is explicit). The safety net is now bypassed only on that explicit signal; the local-fallback success path still reaches the exit-code/signal check so real failures surface as before. Adds three web-side regression tests in `apps/web/tests/providers/sse.test.ts`: - Explicit `status: 'succeeded'` + SIGTERM → onDone called, no error - End event with `{code:1}` and no `status` → onError surfaces 'agent exited with code 1' as before - End event with `{code:null,signal:'SIGTERM'}` and no `status` → onError surfaces 'agent exited with signal SIGTERM' as before `pnpm guard` + daemon typecheck clean; 27/27 SSE tests pass (up from 24). |
||
|
|
5ff578dc8d
|
fix: support object-style question-form options (#1293)
* fix: support object-style question-form options * fix: preserve stable option values in form submissions |
||
|
|
2f51f3c1ae
|
feature: refine assistant artifact feedback (#1379)
* feature: refine assistant artifact feedback * fix: clear hidden custom feedback reason * test: update assistant feedback expectations |
||
|
|
244e8b7981 |
feat(daemon): enhance plugin preview handling and add fallback mechanisms
- Implemented a new `collectPluginPreviewCandidates` function to gather potential HTML assets for plugins, improving the robustness of the preview endpoint. - Introduced `discoverPluginHtmlAssets` to scan common directories for HTML files, ensuring that plugins with missing declared entries can still provide a valid preview. - Updated the `/api/plugins/:id/preview` route to utilize the new candidate collection and discovery functions, enhancing the user experience by preventing blank tiles in the gallery. - Added comprehensive tests for the new fallback logic to ensure reliability and correctness in various scenarios. This update significantly improves the plugin preview functionality, ensuring users have access to valid previews even when manifest entries are outdated or missing. |
||
|
|
2e27745800 |
feat(web): enhance PluginsHomeSection with improved plugin card visuals and layout
- Updated the `PluginsHomeSection` to change the section title from "Plugins" to "Community" and revised the subtitle for clarity. - Refactored the `PluginCard` component to include a new `PreviewSurface` for dynamic rendering of plugin previews based on type (media, HTML, design). - Introduced lazy loading for plugin previews using the `useInView` hook to optimize performance. - Added new components for different preview types: `MediaSurface`, `HtmlSurface`, `DesignSystemSurface`, and `TextSurface`, enhancing the visual presentation of plugins. - Improved CSS styles for a more responsive and visually appealing grid layout, accommodating up to five tiles per row on larger screens. - Implemented a new `inferPluginPreview` function to classify and render plugins based on their manifest data. This update significantly enhances the user experience by providing a more engaging and visually rich plugin gallery. |
||
|
|
aeb6cde923
|
prevent duplicate saves and add template deletion (#1294)
* prevent duplicate template entries on repeated save * add delete button to saved template list Templates can now be removed from the template picker via a hover x button, calling the existing DELETE /api/templates/:id endpoint. * add missing onDeleteTemplate prop in test fixtures * add template deletion flow test for NewProjectPanel * reject template names longer than 100 characters * preserve original createdAt on template update |
||
|
|
6cccccdc56 |
feat(web): refactor PluginsHomeSection to implement 3-axis faceted filtering
- Replaced the previous tag-based filtering with a new 3-axis model (SURFACE, TYPE, SCENARIO) for enhanced plugin discovery. - Introduced a new `usePluginFacets` hook to manage the independent selection of facets and their AND-composition. - Updated the `PluginsHomeSection` component to render facet rows and a Featured chip, improving user interaction and layout. - Removed legacy categorization logic and associated files, streamlining the codebase. - Enhanced CSS styles for the new layout and improved visual consistency across the plugins home section. - Added tests to ensure the correctness of facet extraction and filtering logic. This update significantly enhances the user experience by providing a more flexible and intuitive way to filter and discover plugins. |
||
|
|
9c489aa045
|
feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select (#1161)
* feat(web): redesign Designs tab cards — covers, tags, overflow menu, multi-select - Render real previews on project cards: HTML iframe / image / video / hashed gradient fallback with project initial; lazily fetches the project's primary file when metadata.entryFile is unset, prefers index.html → newest html → image → video. - Live artifact card thumbnails embed the rendered artifact URL via sandboxed iframe. - Replace the per-card close button with a `…` overflow menu (Rename, Delete) that opens on hover/click; click-outside and Esc close it. - Add multi-select mode (toolbar toggle → checkbox per card → "N selected · Delete · Cancel" pill) with batch delete via the existing onDelete prop. - Add a category tag to every card (Prototype / Live Artifact / Slide / Media) derived from project.metadata.intent / kind / skillId. - Replace browser prompt() and confirm() with custom modals (rename input + danger-confirm) reusing the existing .modal shell. - Add `more-horizontal` icon and 16 new i18n keys across all 18 locales (zh-CN/zh-TW localized; others fall back to English). * test(e2e): update home delete flow for overflow menu + custom confirm modal The previous flow targeted a per-card X button labelled "delete project <name>" and asserted on a native `dialog` event. The card UI now exposes a `…` overflow menu and a styled confirm modal, so reach delete via the menu and assert against the modal's Cancel / Delete buttons instead. * fix(web): harden Designs tab preview sandbox * fix(web): hide Designs select mode in kanban |
||
|
|
77f69257a7
|
feat(web): in-context comment thread for the artifact preview (#1276)
* feat(web): free-pin fallback in comment mode for unannotated artifacts When the artifact has no data-od-id annotations, clicking in Comment mode now posts a synthetic position-based target so the host opens a popover at the click location. Daemon upsert validation requires a non-empty selector/label, so the pin uses [data-od-pin=ID] and label 'pin'. Coordinates are document-space (viewport + scrollY) so pins stay anchored after scroll/reload. Clicks on interactive elements (a/button/input/textarea/select/label/contenteditable) keep their native behavior and are not pinned. * feat(web): tighten comment popover layout for free-pin and element targets The popover header used to dump the raw elementId verbatim — fine for data-od-id targets like 'hero-cta' but jarring for free-pins where elementId is a synthetic 'pin-...' string. Branch on the prefix and show 'Pin · at X, Y' for free-pins; keep the label + selection kind for real element / pod targets. Replace the text 'Close' button with an icon-only close affordance to match the popover-as-card visual. Action row is now two right-aligned buttons (Comment + Send to Claude) for element targets and (Add note + Send to Claude) for pod targets, eliminating the three-button row that wrapped onto two lines at narrow widths. The 'Remove' affordance for existing comments stays left-aligned. * feat(web): drop comments tab from chat sidebar The chat sidebar's 'Comments' tab listed saved/attached preview comments but duplicates the per-element popover already shown in the artifact viewer. Hide the tab and its content while the right-side comment thread panel takes over the same surface in-context. The CommentsPanel / CommentSection components stay defined as dead code for the moment so callers and translation keys remain valid; a later pass can delete them. * feat(web): right-side comment thread panel in board mode Render a 320px CommentSidePanel anchored to the right of the artifact preview whenever board (comment) mode is on. The panel lists every saved preview comment for the current file with an avatar initial, the element label (or 'Pin' for free-pin synthetic ids), an Xd/Xh/Xm-ago timestamp, the note body, a Reply link, and a checkbox. Reply focuses the comment's element via liveSnapshotForComment so the popover opens at the right anchor. Selecting one or more comments via the checkboxes surfaces a 'N selected · Clear · Send to Claude' action bar above the list; Send to Claude reuses the existing onSendBoardCommentAttachments pipeline via commentsToAttachments. The panel takes the place of the chat sidebar's removed Comments tab so the thread lives next to the artifact instead of behind a tab switch. * feat(web): styles for right-side comment thread panel Floating 320px panel anchored to the right edge of the artifact preview with a scrollable comment list and a coral selection bar that appears when one or more comments are checked. Selected items get a coral tint; the reply / check / send-to-claude controls match the popover's coral primary tone. * feat(web): toast confirmation on comment save, close popover After savePersistentComment succeeds, close the popover via clearBoardComposer and surface a transient 'Comment saved' (or 'Pin saved' for free-pin targets) toast for 2.2s. Replaces the previous behavior where the popover stayed open with an empty draft after save, which left users uncertain whether the save landed and forced an extra click to dismiss. * feat(web): position the comment-save toast at the top of the preview * feat(web): allow editing saved comment notes via the side panel Rename the per-item 'Reply' affordance to 'Edit' (no thread model exists yet, so reply was misleading) and pre-fill the popover with the existing note when clicked. The save path goes through onSavePreviewComment which the daemon implements as an upsert keyed on (project, conversation, filePath, elementId), so the edit overwrites the existing row's note without spawning a duplicate. Also fall back to a snapshot synthesized from the saved comment's own fields when the corresponding live target is no longer in the iframe DOM (e.g. free-pin parents that were re-rendered), so the edit path still works after artifact reloads. * feat(web): hide already-sent comments from the side panel After Send to Claude, the daemon flips the comment status from 'open' to 'applying' (and then 'needs_review' / 'resolved' / 'failed' depending on the run). Filter the side panel to status === 'open' so sent comments visibly leave the list — the user gets clear feedback that the send landed and the panel stays focused on actionable, un-sent items. * feat(web): drop single-tab bar and conversation count badge After the Comments tab was removed the chat header still rendered a one-tab 'tablist' just for the Chat tab, which read as visual noise without a sibling to switch between. Drop the tabs wrapper entirely; the chat content stays mounted and the header now hosts only the conversation-history affordance. Also drop the numeric badge that overlaid the conversation history button: counting open conversations next to a generic history icon was easy to mistake for an unread / notification count. The dropdown itself remains the canonical place to see and switch between past conversations. * feat(web): right-align chat header actions after tab bar removal With the tabs wrapper gone, chat-header-actions sat flush left because nothing was pushing it across the header. Add margin-left: auto so the history / new-conversation / collapse buttons land at the right edge, matching the design files / index.html tab row's own right-aligned controls. * feat(web): rename board-mode toggle to Comment with comment icon The artifact preview toolbar's board-mode entry was labeled 'Tweaks' with the tweaks icon, which collided with the palette Tweaks button next to it and hid the comment capability behind a generic label. Rename to 'Comment' with the comment icon and switch to the viewer-action class so the button matches the surrounding toolbar items (Edit/Draw) and the coral active state lands on the right surface. * fix(web): pass designTemplates to ProjectView in api-empty-response test The test props for ProjectView were missing the designTemplates prop that was added to Props in #955 (generic skills split). CI's strict typecheck (tsc -b --noEmit) caught it; local runs that hit project references differently did not. Pass an empty SkillSummary array — matches the empty skills fixture for the same reason. |
||
|
|
928079daf5
|
feat(web): consolidate Image/Video/Audio entries into a Media tab (#1167)
Reduces the New Project panel's top-level tab count by collapsing the three media surfaces into a single Media tab with an inner segmented control, and polishes the controls inside that tab so they stop dominating the panel: - Media tab + segmented (Image / Video / Audio) inside the panel body. Underlying ProjectKind branches and submission contract unchanged — the daemon still receives kind=image/video/audio. - Model picker rewritten as a combobox: one trigger row + searchable, provider-grouped popover with Recommended badges. Replaces the flat grid of provider-grouped cards that scrolled past the fold once the fourth provider landed. - Aspect picker compressed from a 5-card grid to a single row of segmented pills with mini ratio glyphs. - Image surface no longer carries a free-form Style notes field; it was redundant with the prompt template + main prompt input. - Live artifact tab locks fidelity to high-fidelity (the wireframe option is now hidden) — a wireframe live artifact doesn't make sense and the picker added noise. i18n: adds tabMedia / titleMedia / model* keys across all 18 locales, removes imageStyleLabel / imageStylePlaceholder. Tests + e2e selectors updated to drive the new Media tab + segmented surface flow. |
||
|
|
140a4e1ff6
|
Improve responsive preview and design handoff outputs (#1224)
* feat: improve responsive design handoff * feat: refine cross-platform design outputs Changelog:\n- Add auto-fit responsive preview behavior for tablet/mobile frames.\n- Add landing page and OS widgets metadata options with project header chips.\n- Strengthen prompt contracts for modern breakpoints, app-specific modules, CJX-ready UX, and final product surfaces.\n- Require cross-platform outputs to use separate platform files instead of tabbed demo selectors.\n- Add DESIGN-MANIFEST.json plus richer handoff guidance to daemon/client exports.\n- Update archive/export tests for manifest and responsive viewport matrix. * feat: enforce screen-file design outputs Changelog:\n- Enforce screen-file-first generation for landing pages, app screens, platform surfaces, and OS widgets.\n- Update design handoff and manifest exports so coding tools map each screen file to separate routes/surfaces.\n- Strengthen minimal-brief visual guidance to avoid monochrome or unstyled design outputs. * fix: address responsive handoff review feedback * fix: address handoff review blockers * fix: preserve proxy auth and normalized export entry * fix: narrow frame wrapper filter to directory paths only * fix: make artifact save failure banner generic --------- Co-authored-by: Huy Hoàng <macos@MacBook-Pro-Hoang.local> |
||
|
|
5af84c09af |
feat(web): refactor PluginsHomeSection to use tag-based filtering and introduce PluginCard component
- Replaced the legacy tabbed categorization in `PluginsHomeSection` with a tag-driven approach, allowing dynamic filtering based on plugin tags. - Introduced a new `PluginCard` component to encapsulate the rendering of individual plugin cards, improving separation of concerns and maintainability. - Added a `usePluginCategories` hook to manage plugin visibility and filtering logic, enhancing the overall structure and testability of the component. - Implemented a "More" pill for overflow tags in the filter row, improving user interaction with a cleaner UI. - Updated CSS styles to support the new layout and improve visual consistency across the plugins home section. This update significantly enhances the user experience by providing a more flexible and intuitive way to discover and interact with plugins. |
||
|
|
9825b3ba1f |
feat(web): enhance entry view with responsive topbar and GitHub star integration
- Updated `EntryShell` to include a responsive topbar that collapses into a settings dropdown on narrow viewports, improving accessibility and user experience. - Integrated `GithubStarBadge` to display the current star count for the GitHub repository, encouraging user engagement. - Added a new `useGithubStars` hook to manage the star count fetching and caching, ensuring efficient updates without unnecessary API calls. - Enhanced CSS styles for the topbar and avatar items to improve visual consistency and interaction feedback. - Updated internationalization files to include translations for new UI elements related to the GitHub star feature. This update significantly improves the user interface by providing a more dynamic and engaging entry experience. |
||
|
|
45760a75aa |
feat(web): enhance entry view with API protocol and model switching
- Introduced `InlineModelSwitcher` to allow users to switch between CLI and BYOK modes, along with selecting the active model. - Updated `App` component to handle API protocol and model changes, ensuring seamless configuration updates. - Modified routing to support distinct views for home, projects, and design systems, improving navigation and user experience. - Removed legacy `PluginsSection` from `NewProjectPanel`, streamlining the project creation process. - Enhanced CSS styles for better visual consistency across updated components. This update significantly improves user interaction by providing intuitive controls for managing execution modes and models directly from the entry view. |
||
|
|
dbc94b83ed
|
feat(web): add thumbs-up/down feedback widget under completed assistant turns (#1288) (#1308)
* feat(web): add thumbs-up/down feedback widget under completed assistant turns (#1288) Adds a lightweight feedback widget that surfaces under each assistant turn whose run succeeded. Users can submit positive or negative feedback in one click; the negative path opens an optional free-text comment area. The widget never blocks the message composer and only mounts after the run has produced its final artifact, matching the acceptance criteria. What ships - `<MessageFeedback>` (apps/web/src/components/MessageFeedback.tsx) renders the three states: idle (prompt + thumbs), submitted positive (confirmation + Change), submitted negative (confirmation + optional comment textarea + Send + Change). - AssistantMessage.tsx slots the widget under AssistantFooter, gated on `runSucceeded && !hasEmptyResponse`, so failed and empty-response turns don't ask the user to rate something that never finished. - The full record shape leaves room for the future analytics metadata the issue calls out (rating, comment, submittedAt; artifactRef / runId derivable from the surrounding message whenever the analytics pipeline lands). Persistence (v1 = localStorage) Lefarcen's clarifying comment on the issue asked whether v1 should be daemon-persisted or in-memory while the analytics pipeline is defined. The daemon's messages table is column-strict, so daemon persistence would require a SQLite migration plus a contract bump on `ChatMessage`; locking that shape in before the analytics pipeline is designed risks reworking it twice. localStorage is the middle ground: feedback survives reload (so the "feedback state is visually clear after submission" criterion holds across tabs and sessions) without committing the wire shape. The hook surface is just `(value, setter)`, so a future PR can swap the storage layer for a daemon mirror or an analytics shipper without touching the React surface. The store handles corrupted JSON, unknown future rating values, disabled storage (private-mode browsers), and broadcasts changes across listeners in the same tab via a CustomEvent so two mounts of the hook for the same messageId stay in sync. i18n 11 new keys under `feedback.*` (prompt, thumbsUp/Down, two confirmation chips, comment label/placeholder/submit/saved, change). English source values authored alongside the keys; zh-CN translations added in the same pass so the locale alignment test stays green and Chinese users see Chinese strings from day one. The other 16 locales pick up English fallbacks via their existing `...en` spread. Test coverage - `tests/state/message-feedback.test.ts` (8 jsdom cases) — round-trip, null-clear, corrupted JSON, missing rating, unknown rating, key collision across messages. - `tests/components/MessageFeedback.test.tsx` (7 jsdom cases) — idle state, positive submit, negative submit, comment save, blank-comment Send disabled, Change unsticks the rating, rehydration from pre-populated storage. The locale alignment test continues to enforce that every locale declares the new keys (5/5 across 18 locales). Validated - pnpm guard clean - pnpm --filter @open-design/web typecheck clean - tests/i18n/locales.test.ts 5/5 - tests/state/message-feedback.test.ts 8/8 - tests/components/MessageFeedback.test.tsx 7/7 - Full web suite: 98 files, 903 tests * fix(web): tighten feedback widget gate + storage sync + textarea, add styles (PR #1308 review) Addresses every P2/P3 from the codex + Siri-Ray + lefarcen reviews on PR #1308, plus a couple of polish items the review surfaced indirectly. Visibility gate (lefarcen P2) The gate was `runSucceeded && !hasEmptyResponse`, which also matched text-only acknowledgements and question-form replies. The issue scopes feedback to turns that produced a final artifact, so the gate now also requires `produced.length > 0`. New AssistantMessage suite (5 jsdom cases) pins: artifact -> shown, no-artifact -> hidden, streaming -> hidden, failed run -> hidden, empty_response -> hidden. Storage sync (codex P2 + lefarcen P2) The previous broadcast contract was: write storage, dispatch a bare CustomEvent, listeners re-read storage. That had two failure modes: - setItem throwing (private mode / quota / disabled storage) left the listener seeing null and clobbering the in-memory state the user just confirmed. - The clear path early-returned after removeItem and never dispatched, so a second mount of the same messageId stayed in the submitted state when the user clicked Change. New contract: every successful OR failed write dispatches a CustomEvent whose `detail.value` carries the new feedback record (or null). Listeners apply the value directly without re-reading. Same- tab sync survives storage failures and the clear path no longer early-returns. Cross-tab still re-reads on the platform `storage` event since that event has no detail. Two new storage tests pin the new broadcast contract (positive + null) and the failed-setItem path; two new component tests pin in-session confirmation under setItem failure and two-mount Submit + Change synchrony. Textarea draft fix (lefarcen P3) The textarea used `draftComment || feedback.comment || ''` as its controlled value, so erasing a saved comment snapped it back. The draft is now exclusively the source of truth; a ref-backed effect re-seeds the draft from feedback.comment whenever the rating transitions (mount, idle -> negative, cross-mount sync). Send is now enabled when `draftComment !== savedComment`, which lets the user both edit and clear a saved comment. New component test pins erase+ Send actually removing a previously-saved comment. Accessibility The confirmation chip and "Comment saved" tag both gain `role="status"` + `aria-live="polite"` so screen readers announce the state transition. The thumb buttons keep their `aria-label`. CSS (lefarcen P3) The widget's `.message-feedback*` class set had no rules in index.css, so it rendered with default browser controls. Added a ~130-line block that mirrors the surrounding chat pill/chip vocabulary: bg-subtle background, border-pill confirmation chip, accent-tinted positive state and amber-tinted negative state to match the assistant-footer's data-unfinished pattern. Comment area sits below the chip and wraps on narrow widths so the composer isn't pushed off-screen on small panes. Validated - pnpm guard clean - pnpm --filter @open-design/web typecheck clean - tests/state/message-feedback.test.ts 10/10 (was 8, +2 broadcast) - tests/components/MessageFeedback.test.tsx 10/10 (was 7, +3 sync / storage-failure / clear-saved-comment) - tests/components/AssistantMessage.test.tsx 5/5 (new file) - tests/i18n/locales.test.ts 5/5 - Full web suite: 866 tests --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com> |
||
|
|
b55f171693 |
feat(web): redesign entry view with new navigation and home layout
- Introduced `EntryNavRail` for a streamlined left navigation experience, featuring primary actions and a brand logo. - Created `EntryShell` to manage the entire home view layout, integrating the centered hero, recent projects, and plugins section. - Developed `HomeHero` for user prompt input, allowing seamless interaction with plugins and project creation. - Replaced the previous `PluginLoopHome` with a more cohesive `HomeView` that orchestrates plugin interactions and project submissions. - Enhanced CSS styles for improved visual consistency across the redesigned components. This update significantly enhances the user experience by providing a more intuitive and visually appealing entry point into the application. |
||
|
|
1df3eca161
|
feat(web): Critique Theater Phase 7 — reducer + useCritiqueStream + useCritiqueReplay (#1307)
* feat(web): pure reducer for Critique Theater states (Phase 7.1)
Pure CritiqueState reducer driven by the contracts-level PanelEvent
(the same shape both the live SSE stream and the recorded transcript
emit), so a single reducer powers both the in-flight panel and the
rerun replay. Lifecycle covers run_started → running → (shipped /
degraded / interrupted / failed), with panelist_open / dim /
must_fix / close / round_end events building per-round
CritiquePanelistView entries as they arrive.
Defensive behaviour that surfaced while writing the spec tests:
- Terminal phases (shipped / degraded / interrupted / failed) are
sticky against further lifecycle events for the same run, except
for parser_warning which can land late and is recorded in a side
channel without changing phase.
- A new run_started for a different runId at any time discards the
prior state and reboots, so the UI can launch consecutive runs
without an explicit reset action.
- Events whose runId does not match the active run return the same
state reference, so React's useReducer doesn't re-render
subscribers on stray traffic.
- Round bookkeeping keys by round number rather than "always last",
so an out-of-order panelist_dim for round 1 arriving after a
round 2 dim does not corrupt the round 2 bucket.
Test coverage: 18 cases covering each transition, the runId guard,
sticky-terminal behaviour, the out-of-order round invariant, and
the stable-identity guarantee. Sets up Phase 7.2 and 7.3 to wire
SSE + replay into the same reducer.
* feat(web): useCritiqueStream hook subscribes to SSE and feeds reducer (Phase 7.2)
createCritiqueEventsConnection is a pure connection manager that
mirrors apps/web/src/providers/project-events.ts: opens an
EventSource at /api/projects/:id/events, listens for every name in
CRITIQUE_SSE_EVENT_NAMES, decodes each frame back into a PanelEvent
(stripping the critique. prefix and merging the data payload), and
hands it to the caller's onEvent. Reconnect uses exponential
backoff (1s → 30s) and resets on `ready`; malformed payloads drop
with a dev-mode warning rather than tearing the stream.
useCritiqueStream wraps the manager in a useReducer that owns the
CritiqueState. enabled=false or a null projectId tears down the
connection cleanly; switching projectId closes the old connection
and opens a fresh one. The returned dispatch lets local UI
synthesise actions (e.g. an Esc keypress firing a synthetic
interrupted while a kill request is in flight); production traffic
comes from the SSE stream.
Test coverage:
- sse.test.ts (10 cases, node env): subscription set covers every
CRITIQUE_SSE_EVENT_NAMES channel; payload decoding lifts the wire
shape back to PanelEvent; malformed JSON is swallowed and does
not stop the stream; exponential backoff schedule and ready-reset
semantics are pinned with a setTimeout seam; close() cancels
pending reconnects and shuts the live source; no-op fallback
when EventSource is unavailable.
- useCritiqueStream.test.tsx (6 cases, jsdom env): idle pre-event,
reducer driven by synthetic actions, no connection when disabled
or projectId is null, clean close on unmount, projectId change
reopens cleanly.
* feat(web): useCritiqueReplay hook drives reducer from transcript file (Phase 7.3)
Fetches the per-run NDJSON transcript (one PanelEvent per line),
parses every line via the shared isPanelEvent predicate, and
dispatches into the same CritiqueState reducer the live SSE stream
uses. A single reducer means the UI rendering a replay can be
identical to the live panel, and a UI mounting both
useCritiqueStream and useCritiqueReplay in parallel does not have
to reconcile two state shapes.
speed knob is `paused | instant | live | { intervalMs: N }`.
- instant flushes every event synchronously, useful for opening a
finished run already at its terminal state.
- intervalMs paces dispatches at a fixed cadence so the reviewer
can watch the run unfold.
- paused parses the transcript but holds events back until the
caller advances speed (consumers can drive a scrubber later).
- live is reserved for the future "playback at original cadence"
feature, currently treated as instant; replay timestamps are not
yet persisted with each event so honest pacing requires a
follow-up Phase 7+ task.
gunzip seam handles `.ndjson.gz` transcripts via
DecompressionStream when present; the production fetch path picks
between text and arrayBuffer based on the URL extension. Both seams
are injectable so the unit tests don't need to spin up a real
network or a real gzip pipeline.
Test coverage (8 cases, jsdom env):
- Idle status before any URL is provided.
- speed=instant flushes the full transcript synchronously to
shipped state.
- speed={intervalMs:N} paces with the setTimeout seam, reaching
done after the last tick.
- speed=paused leaves status=playing with no dispatches.
- Empty transcript reports done with state still idle.
- Fetch rejection surfaces an error status with the message.
- Malformed NDJSON lines are skipped; valid events around them
still land.
- .gz transcripts route through the gunzip seam.
Closes the Phase 7 plan tasks 7.1 / 7.2 / 7.3 (reducer + stream +
replay), all on one branch ready for review. Phases 8+ (Theater
components) consume these from this PR.
* fix(web): close payload-override gap + paused-resume bug in Critique Theater hooks (Phase 7 review)
Two P1 fixes from lefarcen's review on PR #1307:
SSE payload override
`sseToPanelEvent` previously spread `data` after the channel-derived
`type`, so a payload-provided `type` could override the channel and
route a `critique.run_started` frame into the reducer as a `ship`
action. Reversed the spread so the channel-derived `type` is
authoritative, and revalidated the resulting object through the
contracts-level `isPanelEvent` predicate before returning. Frames
that fail validation (missing runId, empty runId, unknown type) are
dropped, so a malformed or compromised SSE frame can no longer
dispatch a wrong-shape action into the reducer.
Three new sse.test.ts cases pin the regression: hostile `type:'ship'`
in the payload still resolves to `run_started`, missing runId is
dropped, empty runId is dropped.
Replay pause/resume
`useCritiqueReplay` had one big effect keyed on `transcriptUrl`
only, so flipping `speed` from `paused` to `instant` never re-fired
and the held events sat undispatched. Split into a parse effect
(depends on URL, fetches and stores events in state) and a pace
effect (depends on parsed-events + speed, owns the cursor + timers).
The playback cursor lives in a ref that survives pause/resume
cycles, so flipping `paused` -> `instant` flushes from the current
position rather than restarting (which would double-dispatch
`run_started` and reset the reducer).
Two new useCritiqueReplay.test.tsx cases:
- paused-then-instant transitions from `playing` to `done` and
reaches the shipped terminal phase
- intervalMs paced playback dispatches one event, pauses to drain
the next scheduled timer, flips to instant, and confirms the
remaining transcript drains exactly once (cursor was preserved)
Doc consistency
The earlier source comment in useCritiqueReplay.ts claimed `live`
"paces by recorded timestamps" while the impl used zero-delay
timers and the PR body said it behaves like `instant`. Aligned to
reality: `live` currently behaves like `{ intervalMs: 0 }` (events
drain on successive microtasks via setTimeoutFn) because transcripts
do not yet carry per-event timestamps. Honest timestamp-driven
pacing is queued as a Phase 7+ follow-up.
Validated: pnpm guard, pnpm --filter @open-design/web typecheck,
Theater suite 47/47 (up from 42, +3 sse + 2 replay), full web suite
96 files / 888 tests.
---------
Co-authored-by: Nagendhra <nagendhra405@gmail.com>
|
||
|
|
64510b790b
|
fix(web): translate Design Files refresh strings instead of hardcoding English (#1254) (#1300)
* fix(web): translate Design Files / live artifact refresh strings instead of hardcoding English When the app language was set to Chinese, the Design Files refresh flow showed Chinese for the surrounding chrome but kept English for every label and message originating in describeRefreshStatus, describeEventPhase, and the refresh-event timeline body of LiveArtifactRefreshHistoryPanel. Same-screen mixed-language UX, the exact symptom reported in #1254. Root cause: those three sites bypassed i18n entirely. describeRefreshStatus returned hardcoded English label + description strings for the running / succeeded / failed / idle / never statuses; describeEventPhase returned hardcoded Started / Succeeded / Failed labels; the timeline body inlined "Refresh started…", "<n> source(s) updated", and "Refresh failed." string literals; and the empty-timeline copy ("No refresh activity yet in this session. Trigger Refresh to record a timeline…") was hardcoded too. Fix: thread the existing TranslateFn through both helpers, swap every hardcoded string for a t() lookup, and pull the empty-timeline copy and the failure-fallback through the same path. Added 13 new keys under liveArtifact.refresh.* — statusRunning, the five *Description keys, three event-phase labels (eventStarted/Succeeded/Failed), eventStartedDetail, sourcesUpdatedOne/Many with an {n} placeholder, and timelineEmpty. Status labels for succeeded / failed / ready / never already had keys (statusSucceeded / statusFailed / statusReady / statusNever) so those are reused unchanged. Locales: full Chinese translations added to zh-CN.ts (the locale directly named in the issue). The other 16 locales pick up English fallbacks through their existing ...en spread, so the locale-key alignment test stays green; native translations for those locales can land via the usual locale-team passes without re-touching the source code. * fix(web): cover the rest of the refresh panel under i18n + add a zh-CN render test Lefarcen's review on #1254 / PR #1300 surfaced that the first pass only translated three helpers (describeRefreshStatus, describeEventPhase, session timeline body) and left the rest of the panel in English. Under a Chinese UI the panel still mixed languages, which was exactly the regression the issue was filed for. This commit threads t() through every user-visible refresh-panel string the user would see in the Chinese flow: - Hero block: "Last refreshed" label + "Never" empty state. - Created / Last updated facts + their "Unknown" empty label. - Persisted refresh history header, hint, empty-state copy. - Persisted timeline status badge: succeeded / running / failed / cancelled / skipped now resolve through describePersistedStatus, which uses an exhaustive switch off LiveArtifactRefreshLogEntry's status union so a future contract addition trips tsc. - Session activity header, hint. - Document source header, hint, Type / Tool / Connector field labels. - Advanced debug metadata summary + note line. - "just now" relative-time fallback in the persisted timeline. 22 new i18n keys total (23 with the new heroLastRefreshedNever distinct from statusNever); zh-CN strings authored alongside the English source, every other locale picks them up via its existing ...en spread and the locale-key alignment test stays green. Intentionally untranslated surfaces: raw daemon payloads inside the <details> debug panel (event.step / refreshId / error.message and the JSON.stringify dump), since those are agent / connector identifiers and stack-trace style strings, not localised copy. The debug summary heading itself is translated; if the debug section should be hidden in localised primary flows, that is a separate UX call worth its own issue. Test coverage: new render test wraps LiveArtifactRefreshHistoryPanel in I18nProvider initial="zh-CN" and pins the Chinese rendering of every translated label, plus negative assertions that the formerly hardcoded English literals are NOT present in the markup. With the no-provider fallback returning English, the existing static-markup tests can't observe the regression this PR is meant to fix; the zh-CN render test is the only one that would have caught the original gap and will catch the next one. Validated: pnpm guard, pnpm --filter @open-design/web typecheck, locales.test.ts (5/5), FileViewer.test.tsx (69/69, +1 new zh-CN test), full web suite (92 files, 841 tests). * fix(web): route formatRelativeTime through Intl.RelativeTimeFormat so units localise Lefarcen's second pass on PR #1300 caught the remaining hardcoded English path: formatRelativeTime() still emitted units like `5s ago` and `45m ago`, so Chinese users would see those strings inside the otherwise-translated refresh panel. The function now takes the active locale + TranslateFn and routes through Intl.RelativeTimeFormat with style: 'narrow', numeric: 'always'. That preserves the historical `5s ago` shape for English while producing locale-correct output for every other locale (zh-CN gets `5秒前` / `45分前`, with the right past / future suffix and word order). The `just now` carve-out (abs < 5s) keeps using t('liveArtifact.refresh.justNow') since Intl's narrow output for zero-delta reads awkwardly. A try/catch around the RTF constructor falls back to 'en' if the runtime rejects the locale, so the function is safe on engines with limited ICU data. Callsites threaded through: - LiveArtifactRefreshHistoryPanel hero metric (`lastRefreshedAt`) - Session timeline event row (`event.startedAt`) - Session timeline event time (`event.at`) - LiveArtifactRefreshFact for the created / last-updated facts; the component now accepts optional `locale` + `t` props and the panel passes them in. Test coverage extension: - The existing zh-CN render test sets a real lastRefreshedAt (now - 45s) and real session-event timestamps, then asserts the Chinese past-tense suffix `前` appears AND the legacy English `Xs ago` / `Xm ago` shapes do NOT. That was the gap lefarcen pointed at: setting `lastRefreshedAt: undefined` couldn't see the regression because no relative-time formatting ran. - Added a small second test for the lastRefreshedAt-undefined empty hero so the original `从未` coverage still pins. Validated: pnpm guard, pnpm --filter @open-design/web typecheck, FileViewer.test.tsx (70/70, +1 new test), locales.test.ts (5/5), full web suite (92 files, 842 tests). --------- Co-authored-by: Nagendhra <nagendhra405@gmail.com> |
||
|
|
5bd9763181
|
[codex] Improve Claude Code exit diagnostics (#1267)
* fix daemon claude diagnostics * fix claude custom endpoint auth diagnostics * fix project view api empty response test props * fix claude diagnostic review gaps * fix silent custom endpoint claude diagnostics * fix claude diagnostic credential redaction * fix quoted api key redaction * fix claude diagnostic tail redaction * fix silent claude configured profile diagnostics |
||
|
|
4983fc3ac4 |
feat(web): implement file operations summary in assistant messages
- Added a new `FileOpsSummary` component to display a summary of file operations (read, write, edit) performed during an agent's run, enhancing user visibility into file interactions. - Integrated the `FileOpsSummary` into the `AssistantMessage` component, allowing it to show a compact view while streaming and expand to a detailed list once the run completes. - Created a new `file-ops` CSS style to manage the presentation of the file operations summary, including hover effects and status indicators. - Developed utility functions in `runtime/file-ops.ts` to derive and count file operations from agent events, ensuring accurate aggregation of file interactions. - Added tests for the `FileOpsSummary` component and the file operations logic to ensure functionality and prevent regressions. This update improves the user experience by providing clear insights into file operations related to agent activities. |
||
|
|
070b8b07c6 |
feat(web): enhance PluginLoopHome with plugin details modal and refined plugin rail
- Introduced a new `PluginDetailsModal` component to display detailed information about plugins directly from the PluginLoopHome interface, allowing users to preview queries, inputs, and other relevant data before applying a plugin. - Updated the `ChatComposer`, `ChatPane`, and `InlinePluginsRail` components to support a single pinned plugin display when a project is created with a specific plugin, improving user experience by preventing unnecessary plugin selection prompts. - Enhanced CSS styles for better visual presentation of plugin actions and details. - Added tests to ensure the new functionality works as intended and to prevent regressions related to the plugin rail behavior. This update streamlines the plugin selection process and enhances the overall user experience within the application. |
||
|
|
b3dc3c3e0c |
feat(web): integrate applied plugin snapshot for enhanced user experience
- Added support for displaying an active plugin as a context chip in user messages when a project is created with a pinned plugin. This replaces the in-composer plugin rail to avoid re-prompting users for plugin selection. - Introduced `applied_plugin_snapshot_id` in the database schema and updated relevant components (ChatComposer, ChatPane, ProjectView) to handle the new functionality. - Implemented fetching of the applied plugin snapshot in ProjectView to ensure the active plugin is rendered correctly. - Enhanced CSS for the plugin chip to improve visual presentation. This change streamlines the user experience by providing context on previously selected plugins directly within the chat interface. |
||
|
|
87a95b7fb4
|
Fix conversation run isolation (#1271) | ||
|
|
3524a43d18
|
fix: pretty-print JSON file previews (#1206)
* fix: pretty-print JSON file previews * fix: avoid formatting JSON with unsafe numbers * fix: preserve precision-sensitive JSON previews * fix: preserve signed zero in JSON previews * fix: scan JSON numbers without repeated slicing --------- Co-authored-by: Kael S <YOUR_GITHUB_EMAIL_HERE> |
||
|
|
a0316d2599
|
fix(web): suppress autosave indicator for draft-only Connector key edits (#1232)
When the user typed a replacement Composio API key, the global Settings autosave loop persisted `buildPersistedConfig(cfg)` — which intentionally strips the in-flight secret — and then advanced the indicator through 'saving' -> 'saved' despite the key never actually being written. The "All changes saved" status then contradicted the section-local "Save key" gesture and eroded trust in the saved-state badge for a sensitive field. The autosave effect now tracks the snapshot at the last successful save (or the initial cfg on mount) and compares the next snapshot's persisted shape against it via a new `isAutosaveDraftOnlyChange` helper. When the only diffs since last save are fields that `buildPersistedConfig` strips (today the Composio API key, generalizing to any future save-on-explicit-confirm secret), the persist call is skipped and the indicator settles to 'idle' instead of flashing 'saved'. The forced media-provider sync path still runs because that is a real outbound effect even when the persisted shape hasn't changed. Refs #1187 |
||
|
|
be77dc0394
|
Default English resource i18n fallback (#1270) | ||
|
|
1bdf765cf2 |
feat(daemon): enrich API responses with surface specs and add new flags
- Implemented `--schema` flag for `od ui show` to return only the JSON Schema of the surface. - Enhanced the response of `GET /api/runs/:runId/genui/:surfaceId` to include the surface spec from the AppliedPluginSnapshot. - Introduced new flags for daemon and library commands to improve command handling and parsing. - Added tests for the new functionality, ensuring proper behavior of the enriched responses and flag handling. This change supports headless interactions by allowing code agents to inspect surface contracts before responding. |
||
|
|
fb079d8115
|
Add reliable agent-browser skill (#1284)
* Add reliable agent browser skill * Fix ProjectView delete conversation test props |
||
|
|
1eb20e3807
|
fix(web): keep tweaks selection usable without annotations (#1268) | ||
|
|
0f0d214298
|
fix(web): render static previews for sketch json files (#1060)
* fix(web): render static previews for sketch json files * fix(web): tolerate malformed sketch text items * fix(web): harden sketch preview parsing * fix(web): preserve sketch items on round-trip * fix(web): clear sketch files destructively * fix(web): unblock unsupported sketch saves |
||
|
|
12ce5ad38b
|
fix(web): ignore <artifact> tags inside markdown code spans and fences (#1132)
* test(web): add failing parser cases for <artifact> recitation in markdown code Cover the three real-world prose contexts where the model legitimately quotes the artifact tag without intending to emit one: - inside an inline backtick span - inside a fenced code block - spread across streaming chunks crossing the fence boundary Establishes the RED baseline before parser code-fence awareness lands. * fix(web): ignore <artifact> tags inside markdown code spans and fences The streaming artifact parser scanned the buffer with a raw indexOf, guarded only by 'next char must be whitespace'. That meant any literal <artifact ...> the model recited while documenting the protocol — even inside backticks or a ```html fence — flipped the parser into artifact mode, swallowed the rest of the reply from the chat UI, and (when a matching </artifact> appeared in the recitation) silently wrote a spurious file to disk via persistArtifact. Replace findOpenTag with a linear scan that tracks fenced code blocks (```) and inline code spans (`), skipping any <artifact prefix found inside either. If the buffer ends mid-fence, return a partial match anchored at the fence start so the next streaming chunk can resolve the boundary without losing fence context. Closes #1130. * fix(web): match renderer fence/inline-code rules in artifact parser Codex review on PR #1132 caught that the previous fix toggled inFence on any triple-backtick run anywhere in the buffer, including mid-line, while the chat renderer (apps/web/src/runtime/markdown.tsx) only treats ``` as a fence when it occupies a whole line matching /^[ ]{0,3}```(\w[\w+-]*)?\s*$/. That asymmetry would suppress a real <artifact> tag emitted after a prose sentence like "the opening marker is ```html and the response then writes:". Rework findOpenTag in three passes that mirror the renderer: 1. Walk \n-terminated lines; only a line that matches FENCE_LINE_RE toggles fence state. Open fences without a close (or with an unterminated tail line) return partial so the next chunk can resolve. 2. Collect inline code spans with /`[^`]+`/g — the same regex used by renderInline — so what the parser skips matches what the user sees as code. Unmatched trailing backticks after the last \n hold back. 3. Find the first <artifact …> outside any skip range; preserve the existing partial-prefix tail handling. Adds a regression test covering the exact case Codex reported. * test(web): pin parser behavior on double-backtick and in-fence string literal recitation Two cases raised in PR #1132 review: - a real artifact tag wrapped in '``<artifact …>``' (double-backtick inline code span) should not be treated as a real artifact - a fenced JS example whose body contains a string literal like 'const fence = "```";' should not pop fence state early and let a later literal <artifact> be parsed as real Both already pass on 96e88ca because the line-anchored fence regex and the renderer-aligned inline regex handle them correctly. Pinning the behavior so future regressions surface as test failures. * fix(web): make stripArtifact markdown-aware to stop truncating literal recitations The streaming artifact parser was hardened in 96e88ca to skip <artifact> recitations inside backticks and fences, but the post-stream stripper at AssistantMessage.tsx still ran a naive 'content.indexOf("<artifact")' over the same text events. As reported by lefarcen on PR #1132, that meant chat replies with literal protocol recitations could still get silently truncated mid-explanation — even though the parser preserved them in the text stream and the file panel was no longer polluted with ghost files. Extract the renderer-aligned classification (FENCE_LINE_RE, INLINE_CODE_RE, computeSkipRanges, rangeContains) into a single source of truth at apps/web/src/artifacts/markdown-context.ts so the parser and the stripper agree on what counts as code. Add apps/web/src/artifacts/strip.ts with a markdown-aware stripArtifact that: - ignores any <artifact open inside a fenced block or inline code span - looks for </artifact> with the same skip-range filter, so a real open paired with a literal close inside backticks does not strip a literal body that is meant to render - returns content unchanged when an open exists with no matching real close (the previous implementation sliced to end-of-string, which would nuke trailing prose on a malformed or still-streaming tag) Refactor parser.ts to import the shared helpers; behavior preserved (all seven existing parser tests still pass). New strip.test.ts covers six cases including the empirically-verified inline-backtick regression. * fix(web): align artifact stripper/parser fence rules with renderer exactly Two gaps surfaced in review at a0bf05f: - markdown-context.ts used a single FENCE_LINE_RE that allowed 0-3 leading spaces and reused the same pattern for opening and closing fences. The chat renderer (runtime/markdown.tsx:44 and :49) is asymmetric — opens with /^```(\w[\w+-]*)?\s*$/, closes with /^```\s*$/, and rejects any leading indentation on either side. Indented " ```html" was being treated as a code fence even though the renderer keeps it as a paragraph, and a literal "```html" line inside an open fenced example was closing the skip range early — both could expose a real or literal <artifact …> to the wrong handler. - stripArtifact discarded computeSkipRanges' unclosedFenceStart, so a fenced literal that ends at EOF without a trailing newline (very common for chat output) leaked the inner <artifact …> recitation to the stripper, reproducing the original #1130 truncation symptom on a narrower input shape. Split FENCE_LINE_RE into FENCE_OPEN_RE / FENCE_CLOSE_RE with no leading indentation, gate the fence state machine on the right side of the toggle, and have stripArtifact extend skip ranges to end-of-content when a fence is left open. Also tightened the parser's tail-line hold-back regex to match the renderer's no-leading-space rule. Added regression tests for the EOF-unclosed-fence case, the indented pseudo-fence (renderer treats as paragraph, stripper must strip the real artifact), and a "```html" line inside an open fence. Refs nexu-io/open-design#1130 * refactor(web): align streaming tail-line fence guard with FENCE_OPEN_RE The streaming parser's tail-line hold-back used a stricter local regex (/^```\w*$/) than the renderer's FENCE_OPEN_RE (/^```(\w[\w+-]*)?\s*$/), missing valid opener tails like ```c++, ```ts-, or ``` (trailing space). In practice these tails are still held back by the unmatched-backtick parity scan that runs immediately after — three backticks in a tail line are odd, so firstUnmatched stays set and the parser holds from that position. So this wasn't a runtime correctness bug, just a regex divergence that future readers could trip on. Drop the local regex and reuse FENCE_OPEN_RE so the tail check matches the same shape the rest of the pipeline already uses. Pinned the behavior with three new parser tests (`+`/`-` info-string suffix and trailing-space tails arriving as the first chunk) — they pass at HEAD, proving the parity scan was already covering these cases. Refs nexu-io/open-design#1132 (lefarcen polish P2) * fix(web): scope inline-code skip ranges per block and reject <artifact prefix-shared opens INLINE_CODE_RE previously ran over the whole buffer, so an unmatched backtick in one paragraph could pair with a backtick in a later paragraph and create a phantom inline span that swallowed any real <artifact …> between them. Mirror runtime/markdown.tsx by splitting the buffer on fence / blank / heading / list / hr boundaries and running INLINE_CODE_RE per block region instead. stripArtifact accepted any unskipped `<artifact` substring as a real open, while the streaming parser already required a following whitespace character — so prose like `<artifactual>demo</artifact>` was being truncated to `prefix suffix`. Extract the parser's real-open guard into isRealArtifactOpenAt and reuse it from both sides. While reordering findOpenTag for the shared guard, also fix the related hold-back ordering issue tracked at #1141: a stray tail-line backtick or fence-opener prefix used to suppress an artifact already complete earlier in the buffer. Scan for the earliest complete real open first, then pick the earliest hold-back position only when no complete tag was found. Regressions pinned in parser.test.ts and strip.test.ts for both new finding shapes. * fix(web): keep HR-shaped lines inside paragraph regions for inline-code scanning The previous walker closed inline-scan regions on lines matching the HR regex, but `parseBlocks()` in runtime/markdown.tsx does not break a paragraph on HR — its inner accumulation loop only breaks on blank / fence / heading / ul / ol (runtime/markdown.tsx:95-104). HR is only an HR block in the outer loop's first-look, never mid-paragraph. So inputs like `intro \`\n---\n<artifact …>…</artifact>\n---\nclosing \`` are one paragraph in the renderer, whose two stray backticks pair to cover the literal artifact recitation — but the walker was splitting on the `---` lines, leaving the recitation outside skip ranges, and the parser/stripper would treat it as a real tag. Drop HR from the paragraph-break list (HR-shaped lines carry no backticks of their own, so keeping them inside the surrounding region is benign either way) and document the renderer-mirror rationale. Regressions pinned on both sides. |
||
|
|
156bf5a34e
|
fix(web): refresh home projects after deleting a conversation (#1202) (#1219)
The home design cards render their `Needs input` badge from the cached `/api/projects` payload — App.tsx owns the `projects` state and exposes a `refreshProjects` callback that ProjectView already fires from every other state-changing branch (run end, live-artifact events, project rename, etc.). The conversation-delete branch silently skipped it: deleting a conversation that owned an unanswered `<question-form>` flips the daemon-side flag, but the home view kept showing the stale badge until the next manual reload. Call `onProjectsRefresh()` immediately after a successful `deleteConversation` API response (and only then — if the request fails the cached state is still the truth and we must not pretend otherwise). Adds `onProjectsRefresh` to the useCallback deps for exhaustive-deps correctness; matches the pattern at the four existing call sites in this file. New regression coverage in `apps/web/tests/components/ProjectView.deleteConversation.test.tsx`: - triggers onProjectsRefresh after deleting a conversation (verified RED before this fix, GREEN after) - does not trigger onProjectsRefresh when the delete request fails (defensive complement so a future "always refresh" refactor doesn't paper over a real failure with a stale-but-confident UI) |