* fix(daemon): add required env field to McpServerStdio in live-artifacts MCP descriptor
The ACP schema's McpServerStdio marks env as a required field
(List[EnvVariable] with no default). Omitting it causes Pydantic V2
Union validation to fail across all three variants (HttpMcpServer,
SseMcpServer, McpServerStdio), returning -32602 Invalid params on
session/new for agents with mcpDiscovery: 'mature-acp' (Hermes,
Devin, Kimi).
This bug is invisible when mcpServers resolves to an empty array
(no live-artifacts token), so it only manifests when the MCP
live-artifacts integration is enabled.
* fix(daemon): recover from -32602 Invalid params on session/set_model
Extend the existing -32603 Internal error recovery logic to also
handle -32602 (Invalid params) when the set_model request fails.
This allows the prompt to proceed with the default model instead of
hanging or timing out.
Some ACP agents may not support session/set_model or may reject the
model ID — treating this as a non-fatal condition and falling back to
the default model is more resilient than failing the entire run.
* fix(daemon): narrow -32602 handling and update test fixtures for env field
Address PR #627 review feedback:
1. Narrow -32602 Invalid params suppression to setModelRequestId only.
Unexpected-id -32602 errors are now treated as real protocol failures
and propagated via fail(), matching the reviewer's suggestion. Only
-32603 Internal errors from unexpected IDs are still suppressed as
cleanup noise.
2. Update all buildLiveArtifactsMcpServersForAgent test fixtures to
include the new required env: [] field.
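The narrowed handling in item 1 can be pictured roughly as follows. This is a minimal sketch, not the daemon's actual code: `classifyRpcError`, `RpcErrorAction`, and the `isExpectedId` flag are hypothetical names, and treating every other error as a failure is an assumption. Only the two error codes and the `setModelRequestId` distinction come from the description above.

```typescript
// Hypothetical sketch of the narrowed JSON-RPC error handling.
type RpcErrorAction = "fallback-to-default-model" | "suppress" | "fail";

const INVALID_PARAMS = -32602;
const INTERNAL_ERROR = -32603;

function classifyRpcError(
  code: number,
  responseId: number,
  setModelRequestId: number | null,
  isExpectedId: boolean, // assumption: caller knows whether the id matches a pending request
): RpcErrorAction {
  const isSetModelResponse =
    setModelRequestId !== null && responseId === setModelRequestId;

  // set_model may be unsupported or reject the model ID: continue with the default model.
  if (isSetModelResponse && (code === INVALID_PARAMS || code === INTERNAL_ERROR)) {
    return "fallback-to-default-model";
  }
  // Internal errors on unexpected ids are treated as cleanup noise.
  if (!isExpectedId && code === INTERNAL_ERROR) {
    return "suppress";
  }
  // Everything else, including -32602 on an unexpected id, is a real failure.
  return "fail";
}
```

With this shape, a -32602 that does not belong to the set_model request can no longer be swallowed silently.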
Kimi CLI 1.35.0 expects MCP stdio servers to include 'type', 'name',
'command', 'args', and 'env' fields. Open Design was passing only
'name', 'command', and 'args', which caused session/new to return
JSON-RPC -32602 Invalid params when MCP discovery was enabled.
This change normalizes every MCP server descriptor to the full ACP
stdio shape before sending it over the wire.
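A minimal sketch of what such a normalization step might look like. The function name `normalizeStdioServer`, the `{ name, value }` shape of `EnvVariable`, and the literal `"stdio"` for `type` are assumptions made for illustration; the five field names are the ones listed above.

```typescript
// Hypothetical sketch: fill in every field of the stdio descriptor before sending.
interface EnvVariable {
  name: string;
  value: string;
}

interface McpServerStdio {
  type: "stdio"; // assumed discriminator value
  name: string;
  command: string;
  args: string[];
  env: EnvVariable[];
}

type PartialStdioServer = {
  name: string;
  command: string;
  args?: string[];
  env?: EnvVariable[];
};

function normalizeStdioServer(server: PartialStdioServer): McpServerStdio {
  return {
    type: "stdio",
    name: server.name,
    command: server.command,
    args: server.args ?? [],
    // An empty array satisfies a required list field; leaving the key out does not.
    env: server.env ?? [],
  };
}
```

Emitting `env: []` explicitly keeps the descriptor valid for agents whose schema declares `env` without a default.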
* feat(craft): add rtl-and-bidi + opt-ins on blog-post, docs-page, finance-report
Module 4 of 5 in the behavioral craft series proposed in #501. Modules
1 (state-coverage, #502) and 2 (animation-discipline, #515) merged.
Module 3 (accessibility-baseline, #587) open at time of authoring.
Differentiating niche per the corpus prior-art survey: no existing
OSS RTL skill is Apache-2.0, framework-agnostic, and aligned with
UAX #9 rev 51. The closest comparators (idanlevi1/rtlify 5★, MIT;
skills-il/localization 7★, MIT) are LTR-web-skewed and don't cover
Flutter Directionality, RN I18nManager, Compose LocalLayoutDirection,
or iOS UIKit semanticContentAttribute / SwiftUI layoutDirection.
Three-loop adversarial review pass via Claude Opus 4.7 xhigh effort
(codex unavailable). Loop 1 caught five revisions (typography spin-out,
WebKit prose compression, mistakes-list trim 12→9, alreq letter-spacing
rename dropped, WebKit r94775 specific revision dropped). Loop 2 caught
one blocking SwiftUI 4 claim and three nits. Loop 3 said ship.
Skill opt-ins picked to avoid PR #587 merge surface: blog-post (long-form
text), docs-page (LTR code islands in RTL prose), finance-report
(numerals + IBAN + currency).
Refs #501.
* fix(craft): rtl-and-bidi review fixes (lefarcen 6 findings)
- P2 #1 WebKit #50949: bug is RESOLVED FIXED, not still open. Verified
directly against bugs.webkit.org. Removed the broken-WebKit framing;
the recommendation to prefer <bdi> over CSS now stands on UAX #9
§2.7 ("prefer markup over CSS or control characters") rather than a
WebKit bug. Source list updated to drop the dead reference.
- P2 #2 isolate vs embedding controls: U+202C PDF is the
embedding/override terminator, not an isolate terminator. Split into
two families: isolate controls (U+2066/2067/2068 + U+2069 PDI) for
modern code, embedding/override controls (U+202A/202B/202D/202E +
U+202C PDF) as legacy. Recommend isolates first.
- P2 #3 base direction and language: new section covering
<html dir lang>, mixed-language subtrees, dir=auto for UGC. Without
this, agents can follow every other rule and still ship an LTR
document containing Arabic.
- P2 #4 phone/IBAN/card values: bare <bdi> is unreliable for
weak/neutral character runs; updated must-mirror bullet and forms
section to require <bdi dir="ltr">. Added common-mistake entry.
- P3 #1 native mobile budget: added a one-line opt-out hint at the
top of the section so HTML-only skills know they can skim it. Full
split into web/native files deferred — the table is 16 lines on a
176-line file, the cost is bounded.
- P3 #2 lintability: restructured "common mistakes" into three groups
— mechanically lintable, needs script detection, HTML semantics —
with explicit exception language (chart axes, physical-object icons,
platform-pinned UI). Avoids false positives in future linting.
Reviewed via Claude CLI Opus 4.7 xhigh effort (3 loops on the
original draft); these fixes are explicit reviewer responses with
WebKit Bugzilla state verified live.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(craft): rtl-and-bidi mrcfps round-2 precision (lang+dir, isolate picks)
Two non-blocking precision items:
- lang-without-dir scope: previous wording implied English never needs
dir="ltr". True only at the document root in a default-LTR page.
lang does not reset an inherited bidi base direction, so an
<section lang="en"> inside an RTL ancestor still resolves RTL.
Reworded to "lang without dir is fine at the document root in a
default-LTR page; inside any opposite-direction ancestor, set both."
- Plain-text isolate picks: previous wording recommended U+2068 / U+2069
generically. U+2068 is FSI (first-strong auto-detect) — wrong default
for known-direction runs, especially weak/neutral-heavy values like
phone, IBAN, card numbers (the same class this file forces to LTR in
HTML). Split: LRI/PDI for known-LTR, RLI/PDI for known-RTL, FSI/PDI
reserved for unknown direction. Added an explicit "don't default to
FSI" callout.
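The isolate split above could be expressed in a small helper along these lines. This is an illustrative sketch only: `isolate` and `KnownDirection` are hypothetical names, while the code points are the standard Unicode isolate controls.

```typescript
// Unicode bidi isolate controls.
const LRI = "\u2066"; // LEFT-TO-RIGHT ISOLATE
const RLI = "\u2067"; // RIGHT-TO-LEFT ISOLATE
const FSI = "\u2068"; // FIRST STRONG ISOLATE (direction auto-detected)
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE (closes any of the three)

type KnownDirection = "ltr" | "rtl" | "unknown";

// Wrap a plain-text run in the isolate pair matching what is known about it.
function isolate(text: string, direction: KnownDirection): string {
  const opener =
    direction === "ltr" ? LRI : direction === "rtl" ? RLI : FSI;
  return opener + text + PDI;
}
```

A phone number or IBAN would be wrapped with `isolate(value, "ltr")`, since a run made mostly of weak and neutral characters gives FSI no first strong character to detect from.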
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(craft+skills): rtl-and-bidi mrcfps round-3 — skill-body conflicts + bidi semantic correction
P1 BLOCKING — skill-body physical-direction conflicts (mrcfps):
- skills/docs-page: "left nav" / "right-rail TOC" / "left-edge accent
stripe" survive in skill body even with the rtl-and-bidi opt-in,
because craft is injected ABOVE the skill body. An Arabic docs
request would still see "Left nav" and emit physical-direction
layout. Updated description, lay-out section, and self-check to
inline-start / inline-end vocabulary; added a self-check bullet
requiring logical CSS on rails and accent.
- skills/blog-post: pull-quote "accent rule on the left" updated to
"accent rule on the inline-start edge" with a matching note about
flipping under dir="rtl".
P1 craft semantic correction (mrcfps):
- HTML-semantics lint: previous wording equated <bdi dir="auto"> with
unicode-bidi: plaintext. Not equivalent. <bdi> isolates an inline
run from surrounding bidi resolution; unicode-bidi: plaintext
changes how base direction is *determined* for each plaintext
paragraph in a block. Different surfaces. Reworded the lint guidance
to "prefer semantic isolation in HTML for inline runs; reach for
unicode-bidi: plaintext only when that block-level paragraph
behavior is explicitly required and tested" — and explicitly flagged
that they are not drop-in equivalents to avoid future linters
flagging valid CSS with a non-equivalent fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(craft): rtl-and-bidi mrcfps round-4 — split progress-bar from media scrubber
Non-blocking precision: prior must-mirror bullet lumped "progress-bar
fill" together with sliders, which would have flipped a video / audio
scrubber under dir="rtl" — directly conflicting with the must-not-mirror
rule for media playback controls (play/pause/FF/rewind represent tape
direction, not reading direction). The two cases collide on every audio
or video player.
- Must-mirror progress bars now scoped to "non-media" (download, upload,
form-completion).
- Media scrubber / progress timeline added explicitly to the
must-not-mirror media bullet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(daemon): add model name to pi initial status and RPC abort on cancel
- Emit status:initializing with model name before pi responds so the UI
shows 'pi · claude-sonnet-4-5', for model-name parity with Claude Code,
Copilot, Gemini, and Cursor Agent
- Replace raw SIGTERM with RPC abort command on cancel, giving pi a
chance to clean up gracefully before SIGTERM fallback
- Wire run.acpSession onto the run object so cancel() can dispatch to
session.abort() for pi and ACP adapters
- Add stdinOpen guard so sendCommand is a no-op after stdin closes
- Add 4 tests covering initializing status, abort wire format, and
stdin-closed guard
* fix(daemon): gate stdout parser after abort to prevent post-cancel events
Once abort() sets finished=true, the stdout listener kept feeding
chunks into mapPiRpcEvent, so text_delta/tool/status events could
still be emitted during the PI_ABORT_GRACE_MS window. Add a finished
guard at the top of the parser callback so no agent events are
forwarded after abort, while still draining stdout cleanly.
Adds a test that aborts mid-session, then feeds message_update and
tool events, proving zero post-abort agent events are emitted.
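The guard described above might look roughly like this. It is a sketch under stated assumptions, not the pi-rpc code: `createStdoutHandler` is a hypothetical name, the stream is assumed to be newline-delimited JSON, and `JSON.parse` stands in for the real `mapPiRpcEvent` mapping.

```typescript
// Hypothetical sketch of a stdout parser that stops forwarding after abort.
type AgentEvent = { type: string };

function createStdoutHandler(emit: (event: AgentEvent) => void) {
  let finished = false;
  let buffer = "";

  return {
    abort(): void {
      finished = true;
    },
    onChunk(chunk: string): void {
      // Guard first: after abort the chunk is still consumed (stdout drains),
      // but nothing is parsed or forwarded.
      if (finished) return;

      buffer += chunk;
      let newline = buffer.indexOf("\n");
      while (newline !== -1) {
        const line = buffer.slice(0, newline).trim();
        buffer = buffer.slice(newline + 1);
        if (line) {
          try {
            emit(JSON.parse(line) as AgentEvent);
          } catch {
            // Skip lines that are not valid JSON.
          }
        }
        newline = buffer.indexOf("\n");
      }
    },
  };
}
```

Placing the check at the top of the callback means no later branch can emit an event by accident during the grace window.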
* refactor(daemon): own SIGTERM fallback in cancel, rewrite abort tests as integration
- Move SIGTERM fallback from pi-rpc abort() to runs cancel() so the
termination guarantee is centralized — a misbehaving session can't
leave the child alive indefinitely (address lefarcen P3 on L130)
- Remove the setTimeout/SIGTERM from abort(); it now only sends the
RPC abort command, termination is the caller's responsibility
- Rewrite initial-status and abort tests as integration tests that
exercise attachPiRpcSession against mock child processes instead
of duplicating private sendCommand/send helpers inline (address
lefarcen P3 on L453 and L491)
- All 28 tests pass
GitHub's `/contribute` page only renders the `good first issue` label,
so 12 open `help wanted` issues never reach newcomers via that entry.
Switch the link to an issues search URL covering both labels (OR), so
both pools surface from one click. Wording is unchanged across all 10
README locales.
* feat(web): add Cmd/Ctrl+P quick file switcher
A keyboard-driven file palette overlaid on the workspace. Press Cmd/Ctrl+P
anywhere in the project view; type to fuzzy-filter the file list, ↑/↓ to
navigate, Enter to open in a tab, Esc to dismiss. With an empty query the
palette surfaces recents (per-project, localStorage) followed by the rest
of the file list sorted by mtime.
Adds:
- apps/web/src/components/QuickSwitcher.tsx: palette UI and matcher
- apps/web/src/quickSwitcherRecents.ts: per-project recents store
- index.css: scoped .qs-* styles using existing design tokens
- i18n: 6 new keys translated across all 16 locale files
Wires into FileWorkspace's existing openFile() so recents and tab state
behave identically to opening from DesignFilesPanel. Capture-phase keydown
beats the browser's print dialog. No backend changes; uses the files prop
already passed to FileWorkspace.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(web): address QuickSwitcher review feedback
Three fixes from the PR review:
- z-index: bump .qs-overlay from 200 to 1500 so the palette renders in
the modal tier (alongside prompt-template-modal-overlay) instead of
behind context menus and popovers (which sit at 200).
- Arrow-key guard: skip setCursor when matches is empty. Without this,
pressing ↓ on a no-results query set the cursor to -1, making the
highlight selector miss every row on the next render.
- Tests: add 19 unit tests covering scoreMatch ranking tiers, render
output (empty state / row count / kbd hints / placeholder), and the
full recents lifecycle (cap at 6, dedupe-on-push, corrupt-JSON
recovery, per-project scoping, quota-exceeded no-op). Vitest stays
on the node env via a small in-memory localStorage stub.
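The recents behaviors exercised by those tests can be sketched as below. This is not the contents of quickSwitcherRecents.ts: the function names, the `StorageLike` interface, and the storage key format are hypothetical; only the cap of 6, dedupe-on-push, corrupt-JSON recovery, per-project scoping, and quota-exceeded no-op come from the list above.

```typescript
// Hypothetical sketch of a per-project recents store.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const MAX_RECENTS = 6;
const keyFor = (projectId: string): string => `qs-recents:${projectId}`; // assumed key format

function readRecents(storage: StorageLike, projectId: string): string[] {
  try {
    const parsed: unknown = JSON.parse(storage.getItem(keyFor(projectId)) ?? "[]");
    return Array.isArray(parsed)
      ? parsed.filter((p): p is string => typeof p === "string")
      : [];
  } catch {
    return []; // corrupt JSON: behave as if nothing was stored
  }
}

function pushRecent(storage: StorageLike, projectId: string, path: string): string[] {
  // Newest first, earlier duplicate removed, capped at MAX_RECENTS.
  const next = [path, ...readRecents(storage, projectId).filter((p) => p !== path)]
    .slice(0, MAX_RECENTS);
  try {
    storage.setItem(keyFor(projectId), JSON.stringify(next));
  } catch {
    // Quota exceeded: keep working with the in-memory result, persist nothing.
  }
  return next;
}
```

Taking the storage object as a parameter is what lets a small in-memory stub replace `localStorage` in a node test environment.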
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(web): QuickSwitcher review — wrap, IME, platform gate
Three follow-ups from @mrcfps's review on #556:
- ArrowUp/ArrowDown now wrap at list bounds (last → first, first → last)
via modulo arithmetic in a new pure helper `nextCursor(current, total,
direction)`. Previously they clamped, which contradicted the wrap
behavior the PR test plan promised. Pulled into a pure function so
boundary cases are unit-testable without simulating keyboard events.
- Palette's onKeyDown now early-returns on `e.nativeEvent.isComposing`,
so users typing CJK file names through an IME keep ↑/↓/Enter for
candidate navigation instead of having them steered by the palette.
The global Cmd/Ctrl+P opener already had the equivalent guard.
- Global keydown is now platform-gated: macOS responds only to metaKey,
win/linux only to ctrlKey. Previously both fired everywhere, which
meant Ctrl+P on macOS was stealing native readline "previous line" in
text fields (and the chat composer).
Tests: +6 unit tests for `nextCursor` covering forward/backward wrap,
mid-list moves, empty list (no division-by-zero), and single-item
no-op. Suite now 258 passing (up from 252).
Verified live: ↓ from last row → first row; ↑ from first row → last
row, in a mocked-project Playwright harness.
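For reference, a wrap-around helper with the `nextCursor(current, total, direction)` signature could be as small as the sketch below. The `1 | -1` direction type and the return value of 0 for an empty list are assumptions; the real helper may differ.

```typescript
// Sketch of modulo-based cursor wrapping for a list of `total` rows.
function nextCursor(current: number, total: number, direction: 1 | -1): number {
  if (total <= 0) return 0; // assumed: empty list, avoid modulo by zero
  // Double modulo keeps the result non-negative when stepping back from row 0.
  return (((current + direction) % total) + total) % total;
}
```

Stepping forward from the last row lands on row 0, stepping back from row 0 lands on the last row, and a single-item list always stays on row 0.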
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* feat(craft): add accessibility-baseline + opt-ins on dashboard, hr-onboarding, mobile-onboarding
Module 3 of 5 in the behavioral craft series proposed in #501. Modules 1 (state-coverage, #502) and 2 (animation-discipline, #515) merged earlier today.
The differentiator that survived the corpus review is native-mobile
parity. Existing OSS prior art (fecarrico/A11Y.md, awesome-copilot,
Community-Access) covers web ARIA well, but none covers Flutter Semantics,
Compose semantics, iOS UIKit/SwiftUI, or RN labelling APIs.
Secondary differentiator: jurisdictional legal-floor calibration. EAA
references WCAG 2.1 (via EN 301 549 v3.2.1), not 2.2. ADA Title II
2026-04-24 deadline slipped to 2027-04-26 via 2026-04-20 IFR. Most
existing OSS a11y prior art doesn't track either accurately.
Three-loop adversarial review pass before push (codex unavailable, ran
via substitute agent). Loop 1 caught nine cuts plus four factual fixes
including a wrong Android Compose API name. Loop 2 verified and flagged
two more trims. Loop 3 said ship.
Anchored citations: WCAG 2.2 Understanding pages, ISO/IEC 40500:2025,
ADA Title II 2024 + 2026-04-20 IFR, EN 301 549 v3.2.1, WAI-ARIA 1.3 +
AccName 1.2 + Core AAM 1.2, WebAIM Million 2025, A11yn (arXiv 2510.13914),
APCA W3C silver branch.
Refs #501.
* fix(craft): accessibility-baseline review fixes (lefarcen + mrcfps)
Address all P1/P2/P3 findings:
- P1 (lefarcen): add "Keyboard operability and semantic structure" section covering tab reachability (2.1.1), activation keys, no keyboard trap (2.1.2), focus order (2.4.3), native-control-first, document language (3.1.1), heading hierarchy (1.3.1, 2.4.6), landmarks (1.3.1, 2.4.1), text alternatives (1.1.1)
- P2 (lefarcen): expand jurisdiction scope with US Section 508 (WCAG 2.0 AA), ADA Title III caveat, EU WAD reference
- P2 (lefarcen + mrcfps): rename contrast-table row to "Normal text below 18 pt regular / 14 pt bold" so the table matches the threshold rule
- P2 (mrcfps): correct "exclusive" → "inclusive" — exact 4.5:1 / 3:1 passes; the no-rounding rule is what makes 2.999:1 fail
- P2 (lefarcen): add "Prior art and scope" note differentiating from existing OSS a11y agent docs
- P3 (lefarcen): narrow APCA framing to "not part of WCAG/EN/ADA/Section 508" and clarify size/weight-dependent thresholds
- P3 (lefarcen): expand WCAG 2.5.8 exceptions list (Spacing, Equivalent, Inline, User Agent Control, Essential)
- Common-mistakes additions: Section 508/2.1 confusion, tabindex>0 anti-pattern, modal-focus-trap distinction from 2.1.2, heading-size vs level confusion
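The inclusive, no-rounding threshold rule can be illustrated with the WCAG contrast-ratio formula. This is a sketch for illustration, not part of the craft file: the function names and the `Rgb` tuple type are hypothetical. The linearization threshold 0.04045 follows the current sRGB-aligned WCAG text; older WCAG text used 0.03928, which gives the same result for 8-bit colors.

```typescript
// Sketch of WCAG 2.x contrast checking with an inclusive, unrounded threshold.
type Rgb = [number, number, number]; // 8-bit sRGB channels, 0-255

function relativeLuminance([r, g, b]: Rgb): number {
  const linear = (channel: number): number => {
    const c = channel / 255;
    return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b);
}

function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Large text (18 pt regular / 14 pt bold and up) needs 3:1; normal text needs 4.5:1.
function meetsContrast(ratio: number, isLargeText: boolean): boolean {
  // Inclusive comparison on the unrounded ratio: exactly 4.5 passes, 2.999 fails 3:1.
  return ratio >= (isLargeText ? 3 : 4.5);
}
```

Comparing the raw ratio, rather than a value rounded to one decimal, is what keeps a near-miss from being reported as a pass.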
* fix(craft): accessibility-baseline mrcfps round-2 precision fixes
All three non-blocking precision items addressed:
- Update WebAIM Million benchmark from 2025 to 2026 (February 2026 crawl). Form labels: page-level 51% (was 48.2%), input-level 33.1% (was 34.2%) of 6.9M inputs (was 6.3M). ARIA: 59.1 errors on ARIA pages vs 42 on non-ARIA (was 57 vs 27); gap is ~17 in 2026, was 30 in 2025. ARIA usage 82.7% of pages (was 79.4%). Verified directly against webaim.org/projects/million/.
- Soften keyboard/semantic-structure intro: Level A items are still labeled Level A, but 2.4.6 Headings and Labels is correctly tagged AA, and the one-h1 / no-skipped-levels rules are now framed as OD craft conventions on top of WCAG's programmatic-structure floor (1.3.1).
- Tighten <a> activation note: bare <a> without href is not focusable, not a link, and not keyboard-operable. Use <a href="…"> for navigation or <button> for actions. Added a "common mistakes" entry to lock the rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(readme): document OD_DATA_DIR + migration from .od/ to Desktop app
The "Move it elsewhere" row in the First-run state table still said the
path is hard-coded; OD_DATA_DIR (resolveDataDir in apps/daemon/src/server.ts)
has supported relocation since the runtime-data refactor. Replace the
"not supported yet" note with the actual env-var usage and resolution
semantics, and add OD_MEDIA_CONFIG_DIR for the narrower credentials
override.
Add a Migrating a pre-desktop-app `.od/` section so users who started in
the repo and later installed the packaged Desktop app know:
- the two writers target different roots (repo .od/ vs.
~/Library/Application Support/.../namespaces/<channel>/data on macOS,
with the platform-equivalent paths on Windows and Linux),
- how to copy projects/SQLite/artifacts/media-config.json over after
quitting the app cleanly,
- how to keep both writers on the same dir going forward via
OD_DATA_DIR.
Documentation only; no code changes.
* docs(readme): address review feedback on .od/ migration section
Resolves the review on #570:
- chatgpt-codex P1 / mrcfps: replace the literal `<repo>` token in the
copy command (Bash parses `<repo>` as input redirection, so the
documented snippet would fail before any copy). Use a shell-safe
`REPO=` variable in the example.
- chatgpt-codex P2 / mrcfps / lefarcen P2: correct the cross-platform
Desktop data-root. The packaged runtime resolves
`app.getPath("userData")/namespaces/<namespace>/data` (see
apps/packaged/src/config.ts:106-107), and Electron's `userData`
default on Linux is `$XDG_CONFIG_HOME` / `~/.config`, not
`$XDG_DATA_HOME`. Replace the single macOS-only path with a per-OS
table, plus a hint to inspect the packaged daemon log for the
resolved `daemonDataRoot`.
- lefarcen P2: list platform-specific channel namespaces. The release
workflows append `-win` and `-linux` suffixes (release-stable-win,
release-beta-win, release-stable-linux, release-beta-linux); only
macOS uses the bare `release-stable`/`release-beta` strings.
- lefarcen P1 (data corruption): demote the "share one data dir between
repo dev-server and Desktop app" recommendation to an Advanced
callout with an explicit warning that the two writers must never run
at the same time. The daemon opens app.sqlite in WAL mode and writes
uncoordinated project/artifact files, so concurrent use can corrupt
SQLite or clobber artifacts.
- lefarcen P2 (downgrade risk): add a forward-only schema migration
warning. apps/daemon/src/db.ts applies `CREATE TABLE IF NOT EXISTS` /
`ALTER TABLE` without a version guard, so opening a migrated dir with
an older repo checkout can leave the workspace inconsistent. Advise
backing up app.sqlite* before the first launch.
- lefarcen P2 (failure-safety): replace the in-place `cp -R` with an
rsync-into-sibling-then-rename pattern so a partial copy cannot leave
the Desktop data dir in a half-populated state. Document the restore
path from the .fresh-baseline-* backup.
- lefarcen P2 (replace vs merge): add a preflight `ls` of the Desktop's
existing projects and a callout that this is a replace operation, so
users with projects on both sides can stop and choose which is
authoritative.
Documentation only.
* docs(readme): address second review round on .od/ migration section
Resolves the follow-up review on #570 from a72b35f:
- mrcfps (blocking): require stopping the repo dev-server too, not just
the Desktop app, before copying. Without that the source `$REPO/.od/`
may still receive SQLite/WAL writes mid-rsync, so the staged copy can
be inconsistent even though the Desktop target is clean. The
clean-state callout and the bash block both now name `pnpm tools-dev stop`
alongside the Desktop quit step.
- lefarcen P2 (fail-fast gap): the rsync block was not actually
fail-fast: a non-zero rsync exit would still let the subsequent `mv`
promote a partial staged copy. Added `set -euo pipefail` at the top
of the bash block plus an explicit `|| { echo …; exit 1; }` guard on
the rsync line so a failed copy aborts before any swap.
- lefarcen P3 (wording): "Electron's userData path" overlapped with the
per-OS table values, since `app.getPath("userData")` already appends
the `Open Design` segment. Renamed the table column to "<appData>
(Electron `appData` base)" and reworded the surrounding sentence so
the path components compose unambiguously: `<appData>/Open Design/
namespaces/<channel>/data/`.
Documentation only.
---------
Co-authored-by: StotheC90 <StotheC90@users.noreply.github.com>
* fix(daemon): remove --no-session from pi adapter to persist session files
The pi agent was the only adapter explicitly passing `--no-session`
in its `buildArgs`, preventing pi from writing session files.
All other adapters either run in single-shot mode by design or use
the ACP JSON-RPC session lifecycle without suppressing persistence.
Removing `--no-session` lets `--mode rpc` retain its default behavior
of writing session state, which is needed for multi-prompt continuity
and matches the rest of the harness ecosystem.
* test(daemon): add pi buildArgs regression tests; fix docs for --no-session removal
- Adds test for pi.buildArgs base shape: returns ["--mode", "rpc"]
and does not include --no-session (prevents regression).
- Adds test for --model and --thinking option passthrough.
- Updates pi-rpc.ts lifecycle comment to remove [--no-session].
- Updates README.md and all localized READMEs to reflect the
corrected pi CLI invocation.
Move sidecar source under src/ so a single tsconfig produces all daemon
output. Removes the parallel dist/src/ tree that was emitted by
tsconfig.sidecar.json (it included src/**/*.ts to type-check the
`../src/server.js` cross-tree import).
Build now emits:
- dist/<flat> (cli.js, server.js, app-version.js, ...)
- dist/sidecar/{index,server}.js
`dist/sidecar/server.js` reaches the main daemon via `../server.js`
instead of `../src/server.js`, so there is no second copy of the source
tree in the published tarball.
Background — issue #534 (already fixed by #537):
The packaged Settings → About panel showed 0.0.0 because the sidecar
chain loaded the duplicated `dist/src/app-version.js`, where the fixed
`new URL('../package.json', import.meta.url)` resolved to a non-existent
`dist/package.json`. #537 patched the symptom by walking parents until a
real `package.json` is found and by writing `appVersion` into the Linux
packaged config. Both stay in place — they're sound defenses — but the
underlying duplicate-emit was never addressed; any future relative
resource lookup (templates, schemas, prompts) anchored on
`import.meta.url` would have hit the same trap.
This change removes the trap.
* feat(web): add skills & design systems management page in settings
Add a new "Library" section in Settings that lets users browse, search,
preview, and enable/disable skills and design systems. Disabled items are
excluded from the create-project picker. Phase 1 — browse/toggle only.
Closes #497
* fix(web): persist empty disabled lists and deduplicate DS preview
Use empty array instead of undefined when all items are re-enabled so
the daemon merge clears the key. Move DS preview panel outside the
category group loop so it renders once, not per group.
* fix(web): address review feedback on library settings
Clear disabled lists on invalid daemon writes, memoize enabled item
filters in App.tsx, and guard preview fetch against rapid-click race
conditions.
* fix(web): hydrate disabled lists from daemon and keep full lists in ProjectView
Merge daemonConfig.disabledSkills/disabledDesignSystems during bootstrap
so the values survive localStorage resets. Pass unfiltered skills and
design systems to ProjectView so existing project metadata resolves
correctly.
Add NRG / template-driven README generation to TRANSLATIONS.md
"Deferred decisions" with explicit re-evaluation triggers (≥15 locales
or monthly+ README structural edits) and a record of the shared-structure
trade-off that surfaced in #195. Captures the rationale (zh-TW's
"上手體驗" section, pt-BR vs pt-PT content-level divergence precedent)
so future contributors don't relitigate it from scratch.
Settings -> About used to display 0.0.0 in packaged builds because
`readCurrentAppVersionInfo` resolved `'../package.json'` relative to
`import.meta.url`, which only points at the daemon package root from the
flat CLI build (`dist/app-version.js`). The sidecar build emits
`dist/src/app-version.js`, where the same relative path lands on the
non-existent `dist/package.json`, so `readPackageMetadata` returned null
and the version fell back to APP_VERSION_FALLBACK.
Walk up from `import.meta.url` to find the nearest real `package.json`
instead, so the daemon reports its actual version regardless of whether
it runs from TypeScript source (tools-dev), the flat CLI dist, or the
nested sidecar dist used by the packaged desktop app. The OD_APP_VERSION
env still wins inside `resolveAppVersionInfo`, so callers that already
inject it (mac/win packagers) keep working.
Also write `appVersion` into the Linux packaged config so Linux follows
the same env-injection path as mac/win and stays consistent with the new
fallback resolution.
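The parent-walk lookup can be sketched in a filesystem-independent way as below. This is not the code in app-version; `findNearestPackageDir`, `parentDir`, and the injected `hasPackageJson` predicate are hypothetical, and absolute POSIX-style paths are assumed. A real implementation would typically check the filesystem instead of taking a predicate.

```typescript
// Hypothetical sketch: walk up from a start directory until a package.json is found.
function parentDir(dir: string): string {
  const trimmed = dir.length > 1 && dir.endsWith("/") ? dir.slice(0, -1) : dir;
  const idx = trimmed.lastIndexOf("/");
  return idx <= 0 ? "/" : trimmed.slice(0, idx);
}

function findNearestPackageDir(
  startDir: string,
  hasPackageJson: (dir: string) => boolean, // injected so the walk itself stays pure
): string | null {
  let dir = startDir;
  while (true) {
    if (hasPackageJson(dir)) return dir;
    const parent = parentDir(dir);
    if (parent === dir) return null; // reached the root without a match
    dir = parent;
  }
}
```

Under these assumptions, starting from either `/pkg/dist` (flat build) or `/pkg/dist/src` (nested build) resolves to `/pkg`, which is the property a fixed `'../package.json'` lookup lacks.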
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(daemon): preserve ANTHROPIC_API_KEY when ANTHROPIC_BASE_URL is set
The claude adapter currently strips ANTHROPIC_API_KEY unconditionally
so that Claude Code's own auth resolution (claude login) wins instead
of silently falling back to API-key billing.
However, when ANTHROPIC_BASE_URL is set the user is intentionally
routing Claude Code to a custom endpoint (e.g. a Kimi/Moonshot proxy).
In that case claude login is meaningless, so preserve the API key so
the child can authenticate against the custom base URL.
* fix(daemon): guard against empty ANTHROPIC_BASE_URL values
Address review feedback: check that ANTHROPIC_BASE_URL contains a
non-empty, non-whitespace string before preserving ANTHROPIC_API_KEY.
This prevents the #398 billing guard from being bypassed when the
variable is set to '' or whitespace.
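The combined rule from this commit and the previous one might be sketched as follows. The function names are hypothetical and the real adapter code may be structured differently; only the two environment variable names and the non-empty, non-whitespace check come from the text above.

```typescript
// Hypothetical sketch of the API-key guard for the claude adapter's child env.
type Env = Record<string, string | undefined>;

function hasCustomBaseUrl(env: Env): boolean {
  const baseUrl = env.ANTHROPIC_BASE_URL;
  // '' and whitespace-only values do not count as a custom endpoint.
  return typeof baseUrl === "string" && baseUrl.trim().length > 0;
}

function buildChildEnv(env: Env): Env {
  const next: Env = { ...env };
  if (!hasCustomBaseUrl(env)) {
    // Default case: strip the key so the CLI's own login-based auth wins.
    delete next.ANTHROPIC_API_KEY;
  }
  return next;
}
```

Only a meaningful base URL keeps the key, so an accidentally empty variable cannot re-enable API-key billing.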
Desktop client v0.3.0 has shipped for macOS (Apple Silicon) and Windows
(x64) at https://open-design.ai/ and the GitHub releases page. Surface
that in 11 READMEs (en + 10 localized):
- Add a 'Download' CTA badge as the first item in the shields row,
linking to open-design.ai (orange #ff6b35, no platform-specific logo
so it doesn't imply Mac-only or Windows-only).
- Repoint the existing release badge link from /releases/latest to
/releases (the general entry page, more stable for users browsing
history).
- Add &display_name=tag to the release shields URL so the badge shows
the literal tag (e.g. open-design-v0.3.1-beta.5) instead of 'invalid'
— shields.io's semver parser doesn't recognize the open-design- prefix.
- Update the At-a-glance 'Deployable to' row: drop the 'placeholder,
in-flight' note and call out the actual macOS/Windows downloads with
links to open-design.ai and the releases page.
- Insert a 'Download the desktop app (no build required)' subsection at
the top of Quickstart, with a 'Run from source' subheading before the
existing bash block — splits the funnel between non-developer users
(download link) and developers (clone + pnpm).
- Flip the roadmap entry for apps/packaged/ from [ ] to [x] and append
the download links.
zh-TW does not carry the release badge in its shields row, so the
release-badge update is a no-op there; all other changes apply.
Add backward compatibility for historic OpenCode CLI data:
- Match both 'TodoWrite' and 'todowrite' tool names
- Fixes rendering of TodoWrite cards from persisted chat history
Scope: 1-line change, no other files modified
Co-authored-by: bojiehuang <bojiehuang@bojiehuangdeMacBook-Pro.local>
* feat(craft): add animation-discipline + opt-ins on mobile-app, mobile-onboarding, gamified-app
Animation discipline is the second behavioral craft module proposed in
#501 and explicitly invited in @mrcfps's post-merge comment on #502.
Differentiation from prior art (LottieFiles motion-design-skill, MIT,
96 stars): citation-grounded against primary sources rather than
asserted. Anchors:
- Tversky/Morrison/Bétrancourt 2002 (IJHCS) on the one demonstrated
win-condition for animation
- Heer & Robertson TVCG 2007 on staging (with the actual durations
they tested, not the laundered '300-1000ms' rule)
- Harrison/Yeo/Hudson CHI 2010 on perceived-duration scope (progress
bars only, not skeletons)
- Doherty & Thadani IBM 1982 productivity numbers
- Material 3 motion tokens (M3 standard vs M2 legacy delta)
- IBM @carbon/motion durations
- Apple SwiftUI Animation API published defaults
- W3C View Transitions API + WCAG 2.2.2/2.3.3 calibration
- WebKit 2017 prefers-reduced-motion rationale
The 'common mistakes (lint these)' section busts five specific
folklore claims that don't survive primary-source check, including
the Doherty-400ms attribution and the M2-vs-M3 standard easing
confusion.
Three skills opt in via od.craft.requires:
- mobile-app (animation-heavy mobile screens)
- mobile-onboarding (multi-screen flow with transitions)
- gamified-app (animations central to the format)
Refs #501.
* fix(craft): address review findings on animation-discipline
Six findings from @lefarcen's CHANGES_REQUESTED review on #515,
addressed in one pass. Reviewed by codex across three loops before
push.
P1 integration gaps:
- gamified-app and mobile-onboarding skills now require both
state-coverage and animation-discipline (both render stateful UI
with motion).
- craft/README.md silent-fallback example reframed as a
planned-but-not-yet-vendored placeholder rather than a hard-coded
next-to-ship slug. Note added pointing skill authors who arrive from
older guidance at animation-discipline as the equivalent of the
earlier 'motion' placeholder.
P2 reasoning completeness:
- > 500 ms duration row reframed: 'Reserved for cross-screen, staged,
or platform-native transitions (e.g. M3 long2-extraLong4, Heer &
Robertson 2007's per-stage recommendation)'. Surrounding paragraph
rewritten with an enumerated category — 'Non-navigation
microinteractions: hover, press, toggle, validation, chip selection,
row expansion' — rather than the vague 'routine' term.
- New 'Flashing limits' subsection added in the Reduced motion
section. WCAG 2.3.1 (Level A) three-flashes-in-any-one-second-period
rule with the area/brightness threshold qualifier; WCAG 2.3.2 (AAA)
unconditional rule. Photosensitive epilepsy framing.
- New 'Repeated and ambient motion' section added. Five rules covering
iteration cap, WCAG 2.2.2 pause control after 5s, cancel-on-route,
one-shot reward animations, and spinner timeout cross-referencing
state-coverage.md.
File length is now 154 lines (was 130; the craft target is 80-110). The
extra length buys citation density and the new sections demanded by the
integration context (gamified/onboarding skills with looping motion).
Refs #501, #515.
* docs: add live artifacts implementation spec
* docs: align live artifacts implementation plan
* Ralph iteration 1: work in progress
* Ralph iteration 2: work in progress
* Ralph iteration 3: work in progress
* Ralph iteration 4: work in progress
* Ralph iteration 5: work in progress
* Ralph iteration 6: work in progress
* Ralph iteration 7: work in progress
* Ralph iteration 8: work in progress
* Ralph iteration 9: work in progress
* Ralph iteration 10: work in progress
* Ralph iteration 11: work in progress
* Ralph iteration 12: work in progress
* Ralph iteration 13: work in progress
* Ralph iteration 14: work in progress
* Ralph iteration 15: work in progress
* Ralph iteration 16: work in progress
* Ralph iteration 17: work in progress
* Ralph iteration 18: work in progress
* Ralph iteration 19: work in progress
* Ralph iteration 20: work in progress
* Ralph iteration 21: work in progress
* Ralph iteration 22: work in progress
* Ralph iteration 23: work in progress
* Ralph iteration 24: work in progress
* Ralph iteration 25: work in progress
* Ralph iteration 26: work in progress
* Ralph iteration 27: work in progress
* Ralph iteration 28: work in progress
* Ralph iteration 29: work in progress
* Ralph iteration 30: work in progress
* Ralph iteration 31: work in progress
* Ralph iteration 32: work in progress
* Ralph iteration 33: work in progress
* Ralph iteration 34: work in progress
* Ralph iteration 35: work in progress
* Ralph iteration 36: work in progress
* Ralph iteration 37: work in progress
* Ralph iteration 38: work in progress
* Ralph iteration 39: work in progress
* Ralph iteration 40: work in progress
* Ralph iteration 41: work in progress
* Ralph iteration 42: work in progress
* Ralph iteration 43: work in progress
* Ralph iteration 44: work in progress
* Ralph iteration 45: work in progress
* Ralph iteration 46: work in progress
* Ralph iteration 47: work in progress
* Ralph iteration 48: work in progress
* Ralph iteration 49: work in progress
* Ralph iteration 50: work in progress
* Ralph iteration 51: work in progress
* Ralph iteration 52: work in progress
* Ralph iteration 53: work in progress
* Ralph iteration 54: work in progress
* Ralph iteration 55: work in progress
* Ralph iteration 56: work in progress
* Ralph iteration 57: work in progress
* Ralph iteration 58: work in progress
* Ralph iteration 59: work in progress
* Ralph iteration 60: work in progress
* Ralph iteration 61: work in progress
* Ralph iteration 62: work in progress
* Ralph iteration 63: work in progress
* Ralph iteration 64: work in progress
* Ralph iteration 65: work in progress
* Ralph iteration 1: work in progress
* Ralph iteration 2: work in progress
* Ralph iteration 3: work in progress
* Ralph iteration 4: work in progress
* Ralph iteration 5: work in progress
* Ralph iteration 6: work in progress
* Ralph iteration 8: work in progress
* Ralph iteration 9: work in progress
* Ralph iteration 17: work in progress
* Add Composio-backed connectors
* Add Composio-backed connector catalog
* Fix connector callback flow
* Update live artifact connector refresh
* Fix live artifact refresh updates
* Improve live artifact viewer toolbar
* Refine live artifact source tabs
* Expand Composio connector catalog
* Improve Composio connector browsing
* Fix artifact refresh source safety checks
Generated-By: looper 0.4.1 (runner=fixer, agent=opencode)
* Fix live artifacts PR feedback
Generated-By: looper 0.5.0 (runner=fixer, agent=opencode)
* Fix live artifact preview CORS validation
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
* Fix connector OAuth IPv6 loopback hosts
Allow bracketed IPv6 loopback Host headers when deriving connector OAuth callback URLs so IPv6-bound daemons can complete connection flow.
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
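A minimal sketch of the bracketed-IPv6 Host check described in the entry above. The helper name isLoopbackHost and the exact set of accepted hosts are assumptions for illustration; the daemon's real function is not shown in this log.

```typescript
// Hypothetical helper: decides whether a Host header names a loopback address.
// Accepts "localhost", "127.0.0.1", and the bracketed IPv6 form "[::1]",
// each with an optional ":port" suffix.
function isLoopbackHost(hostHeader: string): boolean {
  const host = hostHeader.trim().toLowerCase();

  // Bracketed IPv6 literal, e.g. "[::1]" or "[::1]:7456".
  const bracketed = /^\[([0-9a-f:]+)\](?::\d+)?$/.exec(host);
  if (bracketed) {
    return bracketed[1] === "::1";
  }

  // Hostname or IPv4 literal with an optional port.
  const plain = /^([^:]+)(?::\d+)?$/.exec(host);
  if (!plain) {
    return false;
  }
  return plain[1] === "localhost" || plain[1] === "127.0.0.1";
}
```

An unbracketed "::1" is rejected here on purpose, since a Host header carrying an IPv6 literal is expected to use brackets.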
* Preserve live artifact refresh permissions
Respect explicit refresh permission choices during live artifact create and update flows so revoked connector sources remain gated.
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
* Fix live artifact preview cache freshness
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
* Fix live artifact refresh validation
Guard manual refreshes with local daemon checks and reject daemon_tool sources without a toolName before refresh execution.
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
* Fix Composio credential invalidation
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
* Fix live artifact CORS methods
Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
* Fix workspace validation
Restore media config test isolation under Vitest setup data-dir overrides and add the missing French live artifact display copy so the workspace test suite stays aligned.
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode)
* Fix connector safety filtering
Keep agent-preview connector listings aligned with execution safety policy and prune stale Composio OAuth state records before they accumulate.
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode)
* Fix agent runtime cleanup
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode)
* Fix live artifact daemon access
Validate local-only live artifact routes against the peer socket address and pass daemon-resolved CLI paths to ACP MCP descriptors.
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode)
* Fix connector run limit pruning
Evict stale connector rate-limit buckets so long-lived daemon processes do not retain per-run entries indefinitely.
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode)
* Fix connector compact schemas
Generated-By: looper 0.5.2 (runner=fixer, agent=opencode)
* Improve connector connection feedback
* Adjust connector gate positioning
* Fix live artifact refresh commits
Avoid marking refresh candidates failed after snapshot or state persistence errors by deferring live artifact mutations until the durable refresh metadata is written. Also align connector OAuth callback host validation with daemon loopback handling.
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
* Improve connector search relevance
* fix(daemon): harden connector connection state
Require loopback daemon validation before connector connect side effects and only clear provider-owned connector statuses during credential reset.
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
* fix(daemon): guard connector disconnect route
Require local daemon request validation before connector disconnect side effects.
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
* fix(daemon): guard composio config updates
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
* fix(daemon): dispatch live artifacts mcp first
Route the live-artifacts MCP server before the generic MCP CLI so od mcp live-artifacts starts the dedicated server instead of failing generic argument parsing.
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
* fix(daemon): handle integer connector schemas
Allow JSON Schema integer connector inputs while preserving fractional-value validation so generated connector tool schemas accept valid page sizes and limits.
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
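The integer-versus-number rule above can be sketched as follows. The NumericSchema type and validateNumericInput name are hypothetical; the sketch only assumes that "integer" must reject fractional values while "number" accepts any finite value.

```typescript
// Hypothetical, reduced view of a JSON Schema numeric property.
type NumericSchema = { type: "integer" | "number" };

// Returns true when the value satisfies the schema's numeric type.
// "integer" accepts whole numbers only; "number" accepts any finite number.
function validateNumericInput(schema: NumericSchema, value: unknown): boolean {
  if (typeof value !== "number" || !Number.isFinite(value)) {
    return false;
  }
  if (schema.type === "integer") {
    return Number.isInteger(value);
  }
  return true;
}
```

A page size of 25 passes under "integer", while 2.5 is rejected there but still passes under "number".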
* fix: align live artifact refresh error codes
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
* Fix live artifact connector refresh flow
* Update live artifact design cards
* Add beta badge to live artifact form
* Remove live artifact tile model
* Fix live artifact refresh sync
* Fix live artifact MCP refresh durability
Generated-By: looper 0.5.4 (runner=fixer, agent=opencode)
* Fix live artifact refresh safety
Enforce persisted refresh opt-out and connector auto-read gating before refresh sources execute.
Generated-By: looper 0.5.5 (runner=fixer, agent=opencode)
* feat(craft): add state-coverage rules + opt-ins on dashboard, mobile-app, kanban-board
State coverage is the most reliable AI-design failure: agents ship only the
populated state. This adds craft/state-coverage.md (108 lines, matches the
existing craft format) covering the five required states (loading, empty,
error, populated, edge), three form-specific states, ARIA/focus rules, and
loading-duration thresholds.
Sources are public: WCAG 2.2, NN/g, Material Design 3, Apple HIG, Baymard.
Three skills with stateful UI opt in via od.craft.requires:
- dashboard
- mobile-app
- kanban-board
Decks, ppt, image-poster and other static-output skills do not opt in.
Refs: see issue body for the broader proposal (state-coverage is module 1
of 5 behavioral craft modules).
* fix(craft): address review findings on state-coverage
Four P2 findings from #502 review addressed in one pass.
- Edge state Test matrix added under the five-states table (dashboard,
mobile, form, search, detail-view scenarios with concrete thresholds).
- Server-driven empty pattern added as trailing note in the empty-state
composition section.
- Retry discipline subsection added after error severity tiers
(immediate first retry, exponential 2s/4s/8s backoff, 3-retry floor,
Last-attempted timestamp).
- README enforcement-levels subsection added distinguishing auto-checked
P0 rules from guidance; partial-stateful skill clarification added
after the Files table.
No rewrites. ~30 lines added. File stays inside the 80-110-line craft
target.
* fix(craft): correct lint enforcement claim + remove duplicate threshold message
Two findings from @mrcfps review (Looper-generated against ee95b909).
- README: rewrote Enforcement-levels P0 description. Verified against
apps/daemon/src/server.ts:1706-1727: /api/artifacts/save writes the
file first, then calls lintArtifact, then returns findings in the
response. Findings reach the UI (P0/P1 badges) and the agent (system
reminder for self-correction). Persistence is not hard-blocked on P0.
Original wording mischaracterized this as a generation gate.
- state-coverage: 30-60s duration-table bucket no longer duplicates the
'15 s taking longer than expected' message from the loading row.
Reworded to focus on cancel affordance and explicitly note the
longer-than-expected notice already fired at 15 s.
Both findings non-blocking per Looper but genuine factual issues. Fixed
in one pass.
* docs(specs): add Critique Theater design spec for panel-tempered artifacts
* docs(specs): add Critique Theater implementation plan
* docs(specs): rename UI to Design Jury, add lane-density modes, ship-rule explainer, label sizing
* feat(contracts): add CritiqueConfig schema and defaults
* fix(contracts): apply Task 1.1 review (CRITIQUE_PROTOCOL_VERSION rename, descriptions, RoleWeights export)
* feat(contracts): add PanelEvent discriminated union and isPanelEvent guard
* fix(contracts): apply Task 1.2 review (exhaustive event-type list, runId guard, import order)
* feat(contracts): add CritiqueSseEvent variants and panelEventToSse mapper
* test(daemon): add v1 wire-protocol golden fixtures for Critique Theater parser
* feat(daemon): add v1 streaming parser for Critique Theater wire protocol
* chore(contracts): add .js extensions to relative imports for NodeNext consumers
* fix(daemon): satisfy noUncheckedIndexedAccess in v1 parser regex match access
* test(daemon): cover parser failure modes; fix unclosed-PANELIST swallow bug
* fix(daemon,contracts): address PR #387 review
- parser now clamps panelist + DIM scores against the run-declared scale
captured from <CRITIQUE_RUN scale=...>, not a hardcoded 100
- PANELIST appearing before any <ROUND n=...> opens now throws
MalformedBlockError rather than emitting events with NaN round
- DIM_RE and MUST_FIX_RE hoisted to module scope and lastIndex reset per
call so the parser hot path stops recompiling regex per artifact
- overflow check after drain simplified to a plain buf.length > cap test
(the prior compound condition was always true on the right side and
obscured intent)
- scoreThreshold <= scoreScale refine gains a 1e-9 epsilon so floating
slack does not reject semantically valid configs
- round-1 designer ARTIFACT guard gains a comment naming the spec
invariant and the v2 relaxation path
- 3 new regression tests cover the panelist-without-round, scale=10
clamp, and scale=20 plumbing cases
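Two of the fixes above, sketched under stated assumptions: clamping a score against the run-declared scale, and the epsilon-tolerant threshold check. The function names and the lower bound of 0 are illustrative assumptions; only the 1e-9 epsilon and the "clamp to the declared scale" rule come from the entry.

```typescript
// Clamp an agent-reported score into [0, scale], where scale is the value
// captured from the run header rather than a hardcoded 100.
// The lower bound of 0 is an assumption of this sketch.
function clampScore(raw: number, scale: number): number {
  return Math.min(Math.max(raw, 0), scale);
}

// Epsilon from the entry above: tolerates floating-point slack in the
// scoreThreshold <= scoreScale refinement.
const THRESHOLD_EPSILON = 1e-9;

function thresholdFitsScale(scoreThreshold: number, scoreScale: number): boolean {
  return scoreThreshold <= scoreScale + THRESHOLD_EPSILON;
}
```

Without the epsilon, a threshold computed as 0.1 + 0.2 would be rejected against a scale of 0.3 even though the two are semantically equal.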
* docs(specs): rationale for non-goals, failure-mode rate targets, Phase 10 matrix, Phase 14 doc layout
* Merge branch 'main' into feat/critique-theater
Resolves the contracts/index.ts conflict by keeping the .js extensions added
by chore(contracts) 2d6e8d6 and slotting in the new export for ./api/app-config
introduced upstream by #255 (9d700ec). Critique Theater additions
(./sse/critique, ./critique) preserved in their original positions.
Verified after merge:
pnpm --filter @open-design/contracts test -> 10/10 pass
pnpm --filter @open-design/contracts typecheck -> exit 0
pnpm --filter @open-design/daemon typecheck -> exit 0
pnpm --filter @open-design/web typecheck -> exit 0
Two daemon tests in tests/media-config.test.ts fail both before and after the
merge because they read real OAuth credentials from the developer machine
instead of using mock fixtures. That's an upstream isolation issue on
origin/main, not something this branch introduces.
* fix: unblock web build and address mrcfps PANELIST oversize bypass
The chore commit that added .js extensions to satisfy daemon's nodenext
typecheck broke apps/web's Next.js build, because webpack tried to resolve
the literal ./common.js when only common.ts exists on disk. Replaced with
a subpath approach: contracts/exports gains a './critique' entry pointing
straight at src/critique.ts (which has no relative imports), and daemon
imports route through @open-design/contracts/critique instead of the
barrel. Web keeps the bundler-friendly barrel; daemon's nodenext walks
only the leaf module. All 13 contracts source files reverted to no-.js.
Separately, mrcfps flagged that parserMaxBlockBytes was only enforced on
the leftover buffer after drain returned, so a complete oversized block
arriving in one chunk slipped past the cap. Added an explicit per-block
size check inside drain for every buffered block type (PANELIST,
ROUND_END, SHIP). Three regression tests yield the whole stream as a
single chunk and assert OversizeBlockError fires before any events emit.
* fix(daemon): close three v1 parser invariant gaps from mrcfps review
Three independent gaps that all let malformed or oversized protocol
output pass the v1 envelope contract:
(1) Envelope guard. ROUND, PANELIST, ROUND_END, and SHIP now throw
MalformedBlockError when state.inRun is false. Without this, a stream
that omits <CRITIQUE_RUN> could still emit panelist_* events without
the run_started handshake, leaving downstream reducers with no run-level
config.
(2) UTF-8 byte length. Both the per-block size check and the post-drain
buf-size check now compare Buffer.byteLength(text, 'utf8') against
parserMaxBlockBytes. The previous string-length comparison let multibyte
content (CJK, emoji) inside <NOTES>/<SUMMARY> exceed the configured
byte cap while staying under the JS string length cap, bypassing the
daemon's resource guard.
(3) Header-end ordering. PANELIST, ROUND_END, and SHIP now require the
opener's > to appear before the matched closing tag. A malformed opener
like <PANELIST role="x" score="8"</PANELIST> previously fell through
to the closing tag's > and emitted events for an invalid block.
Four regression tests cover each gap (ROUND-without-run,
SHIP-without-run, multibyte-byte-cap, malformed-opener).
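Gap (2) is easy to see in a small sketch. The entry names Buffer.byteLength(text, 'utf8'); this sketch swaps in TextEncoder purely so it runs without Node-specific imports, and the helper names are hypothetical.

```typescript
// Count UTF-8 bytes. TextEncoder stands in for Buffer.byteLength here;
// both measure encoded bytes rather than UTF-16 code units.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Compare the encoded size, not text.length, against the configured cap.
function exceedsByteCap(block: string, parserMaxBlockBytes: number): boolean {
  return utf8ByteLength(block) > parserMaxBlockBytes;
}

// Four CJK characters: string length 4, but 12 bytes in UTF-8.
const sample = "设计评审";
const underStringCap = sample.length <= 8;        // old check would pass
const overByteCap = exceedsByteCap(sample, 8);    // byte check rejects
```

The same 8-byte cap that a string-length comparison lets through is correctly tripped once the comparison uses encoded bytes.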
* feat(daemon): add critique_runs persistence (Task 4.1)
Introduces a new SQLite table critique_runs to back the orchestrator's
run lifecycle. Plan called for ALTER TABLE artifacts ADD COLUMN ..., but
artifacts is not a DB concept in this repo; runs get their own table.
- migrateCritique(db) creates the table + two indexes idempotently and
is wired into the existing migrate(db) flow on daemon boot.
- CRUD helpers (insertCritiqueRun, getCritiqueRun, updateCritiqueRun,
listCritiqueRunsByProject, deleteCritiqueRun) round-trip rounds_json
through helpers so callers see typed CritiqueRunRow.
- reconcileStaleRuns flips stale 'running' rows to 'interrupted' with
a recoveryReason='daemon_restart' marker, supporting the spec's
daemon-restart-mid-run failure mode.
- Public CritiqueRunStatus union excludes the in-flight 'running' value
but the runtime CHECK accepts it, matching the spec's lifecycle.
- 11 vitest cases cover migration idempotence, round-trip, default
rounds, status validation, update + list ordering, deletion, and
reconciliation, plus FK CASCADE on project deletion.
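A rough in-memory illustration of the stale-run reconciliation described above. The real helper operates on the SQLite critique_runs table; the RunRow shape, the updatedAt field, and the function name below are stand-ins chosen for this sketch.

```typescript
// Status values as named in this log; 'running' is the in-flight state.
type RunStatus =
  | "running"
  | "shipped"
  | "below_threshold"
  | "timed_out"
  | "interrupted"
  | "degraded"
  | "failed";

// Hypothetical in-memory row; the real table stores more columns.
interface RunRow {
  id: string;
  status: RunStatus;
  updatedAt: number; // epoch milliseconds (assumed column)
  recoveryReason?: string;
}

// Flip rows that are still 'running' and older than staleAfterMs to
// 'interrupted', tagging them with the daemon_restart marker.
// Returns the number of rows flipped.
function reconcileStaleRunsInMemory(rows: RunRow[], now: number, staleAfterMs: number): number {
  let flipped = 0;
  for (const row of rows) {
    if (row.status === "running" && now - row.updatedAt >= staleAfterMs) {
      row.status = "interrupted";
      row.recoveryReason = "daemon_restart";
      flipped += 1;
    }
  }
  return flipped;
}
```

Rows that are still within the staleness window, or already in a terminal state, are left untouched.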
* feat(daemon): add Critique Theater transcript writer (Task 4.2)
Streams PanelEvent sequences to .ndjson on disk under the artifact dir,
gzipping to .ndjson.gz when the cumulative UTF-8 byte size crosses
gzipThresholdBytes (default 256 KiB). Uses Node fs streams plus
zlib.createGzip so the writer never holds the full transcript in memory.
readTranscript inverts the path and streams events back, picking the
right pipeline by file extension. Covers happy path, large multibyte,
empty input, mid-stream failure cleanup, and unknown-extension reject.
* feat(daemon): add Critique Theater orchestrator (Task 4.3)
Drives one run end-to-end: parses stdout via parseCritiqueStream, scores
each round through scoreboard helpers, persists lifecycle to critique_runs,
and emits CritiqueSseEvent variants on the existing project event bus.
Honors per-round and total timeouts, applies fallbackPolicy when no
<SHIP> arrives, and tees events into writeTranscript so transcripts
stream to disk without buffering the whole run in memory. Defensive entry
validation throws RangeError on invalid CritiqueConfig before any side
effect.
Also adds scoreboard.ts (computeComposite, decideRound, selectFallbackRound)
and re-exports panelEventToSse/CritiqueSseEvent from the critique subpath
so daemon imports never touch the barrel. Fixes missing .js extensions in
sse/critique.ts that caused NodeNext module resolution errors.
* feat(daemon): wire Critique Theater orchestrator into spawn path (Task 4.4)
Adds loadCritiqueConfigFromEnv to read OD_CRITIQUE_* keys with strict
validation at boot. Branches the existing CLI spawn flow on cfg.enabled:
when false (the M0 default) the legacy single-pass generation runs
unchanged; when true the orchestrator owns the run end-to-end. Same SSE
bus, same artifact dir, no behavior change for users until they flip the
flag.
* fix(lockfile): regenerate to include contracts zod + vitest entries
The earlier conflict resolution took main's lockfile and ran pnpm
install, but the install pass on Windows didn't write the contracts
package's zod and vitest entries back into the lockfile. CI's
--frozen-lockfile install rejected the resulting state. Re-running
pnpm install with --no-frozen-lockfile rewrites the lockfile so it
now matches every package.json across the workspace, including
contracts/zod ^3.23.8 and contracts/vitest ^2.1.8. Verified locally:
pnpm install --frozen-lockfile passes.
* fix(daemon): parser ship envelope, SHIP-before-round guard, real artifactRef (Defects 3 + 5)
- ParserOptions gains projectId + artifactId; the parser threads them into
every emitted ship event's artifactRef so downstream consumers see the
real run identity instead of empty placeholders.
- <SHIP> now requires at least one closed <ROUND_END> in the same run;
malformed streams that emit SHIP before any round completes now throw
MalformedBlockError instead of bypassing the round-1 artifact invariant.
- The SHIP handler validates the inner <ARTIFACT> block is present and
non-empty; missing artifact raises MissingArtifactError.
- Three new regressions: SHIP-before-round, SHIP-without-artifact,
artifactRef populated from parser options.
- Orchestrator threads projectId + artifactId into parserOpts.
- Test fixtures updated to include <ARTIFACT> inside <SHIP> blocks.
* fix(daemon): orchestrator owns lifecycle, gzip atomicity, fallback on timeout (Defects 2,4,7,8)
- Orchestrator now accepts child + childExitPromise, races parser /
child-exit / abort / timeout in one awaited flow, and SIGTERMs the
child on every non-clean termination. Server awaits the result so
the run lifecycle has a single owner.
- ChildExitError surfaces when child exits non-zero mid-stream; the
run is classified as failed with cause cli_exit_nonzero.
- Timeout / abort with at least one completed round elects a fallback
via selectFallbackRound and emits a synthetic ship event with
status=timed_out or interrupted; the score persists to
critique_runs instead of staying null.
- applyTimeouts includes childExitRace in every Promise.race so early
child exits are classified without waiting for the total timeout.
iter.return() cleanup is capped at 200ms to prevent hang on
stalling generators.
- writeTranscript writes gzip output to transcript.ndjson.gz.tmp,
fsyncs, then atomic-renames. Crashes mid-write leave no partial
.gz or .gz.tmp on disk.
* fix(daemon): plain-stream gating, per-run artifact dir, boot reconcile (Defects 1, 2, 6)
- Spawn-path branch now inspects def.streamFormat and only routes through
runOrchestrator when format === 'plain'. Adapters emitting wrapper
formats (claude-stream-json, copilot-stream-json, json-event-stream,
acp-json-rpc, pi-rpc) fall through to legacy single-pass with a
one-time stderr warning per format. Per-format decoding into the
orchestrator is reserved for v2.
- critiqueArtifactDir is now path.join(ARTIFACTS_DIR, projectId, runId)
so concurrent or sequential runs in the same project never overwrite
each other's transcript or final HTML. Persistence stores the relative
per-run path.
- reconcileStaleRuns is now invoked after openDatabase on every daemon
boot with staleAfterMs = critiqueCfg.totalTimeoutMs. Stale running
rows from a prior crash flip to interrupted with
rounds_json.recoveryReason='daemon_restart'. Logs a one-line warning
naming the flipped count when greater than zero.
- Spawn now passes child + childExitPromise to runOrchestrator so the
orchestrator can race child exit against the parser, abort signal,
and timeouts in one awaited flow. Server awaits the orchestrator's
result and surfaces failures through the existing run lifecycle.
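The plain-stream gating above, sketched as a pure routing function. The format names come from the entry; routeRun, the injected warn callback, and the warning text are hypothetical.

```typescript
// Stream formats named in the entry above.
type StreamFormat =
  | "plain"
  | "claude-stream-json"
  | "copilot-stream-json"
  | "json-event-stream"
  | "acp-json-rpc"
  | "pi-rpc";

// Tracks which wrapper formats have already produced a warning.
const warnedFormats = new Set<StreamFormat>();

// Only enabled + 'plain' reaches the orchestrator; every wrapper format
// falls back to the legacy single-pass path and warns once per format.
function routeRun(
  enabled: boolean,
  format: StreamFormat,
  warn: (message: string) => void,
): "orchestrator" | "legacy" {
  if (!enabled) {
    return "legacy";
  }
  if (format === "plain") {
    return "orchestrator";
  }
  if (!warnedFormats.has(format)) {
    warnedFormats.add(format);
    warn(`critique orchestrator skipped for stream format ${format}`);
  }
  return "legacy";
}
```

Passing the warn sink in as a parameter keeps the sketch testable; the daemon itself is described as writing the warning to stderr.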
* fix(daemon): daemon-authoritative scoring, lifecycle status, stderr ordering, insert type
Round 2 review feedback on PR #481.
1. CritiqueRunInsert.status now accepts 'running' so the boot-reconcile
tests (and any caller seeding an in-flight row) typecheck without
casting. The runtime check in insertCritiqueRun already accepted
'running' against the DB constraint set, only the public type was
stricter than the DB.
2. round_end keeps the daemon-computed composite authoritative. The
agent's <ROUND_END composite=...> attribute is advisory: a divergence
beyond COMPOSITE_TOLERANCE emits a composite_mismatch parser_warning
so the discrepancy is observable, but the daemon value is what scores
and persists. Same policy for must_fix.
3. SHIP-handling derives the final status from decideRound(...) using the
daemon's scored round rather than trusting <SHIP composite=... status=...>.
A run that the agent claims as shipped but whose daemon composite is
below threshold now finalizes as below_threshold, so a malformed or
adversarial stream cannot force a ship.
4. server.ts captures the orchestrator's result and maps the critique
terminal status to the chat run lifecycle. shipped/below_threshold
finalize as 'succeeded'; timed_out/interrupted/degraded/failed
finalize as 'failed'. cancelRequested is honored.
5. stderr forwarding and child.on('error') registrations moved BEFORE
the orchestrator await so a CLI that floods stderr cannot fill the
OS pipe and deadlock until the total timeout, and so an early
child error fired during the run is observed by the same listener
used after.
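Items 2 and 4 can be sketched as two small pure functions. The COMPOSITE_TOLERANCE value of 0.05 is an assumed placeholder (the log does not state it), and both function names are hypothetical.

```typescript
type CritiqueStatus =
  | "shipped"
  | "below_threshold"
  | "timed_out"
  | "interrupted"
  | "degraded"
  | "failed";

// Item 4: map the critique terminal status onto the chat run lifecycle.
function toChatRunStatus(status: CritiqueStatus): "succeeded" | "failed" {
  return status === "shipped" || status === "below_threshold" ? "succeeded" : "failed";
}

// Assumed tolerance; the real constant's value is not given in this log.
const COMPOSITE_TOLERANCE = 0.05;

interface CompositeResult {
  composite: number;
  warning?: "composite_mismatch";
}

// Item 2: the daemon-computed composite always wins. The agent's value is
// advisory and only produces a warning when it diverges beyond tolerance.
function reconcileComposite(daemonComposite: number, agentComposite?: number): CompositeResult {
  if (
    agentComposite !== undefined &&
    Math.abs(daemonComposite - agentComposite) > COMPOSITE_TOLERANCE
  ) {
    return { composite: daemonComposite, warning: "composite_mismatch" };
  }
  return { composite: daemonComposite };
}
```

Note that below_threshold still maps to 'succeeded': the run completed normally, it just did not clear the ship bar.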
Tests:
- tests/critique-authority.test.ts: 3 new regressions (lying ship
downgraded to below_threshold, mismatch warning emitted, aligned
composites stay quiet).
- All four affected suites green: 14 orchestrator + 10 spawn-wiring +
3 boot-reconcile + 3 authority = 30/30.
Workspace typechecks: contracts, daemon, web all exit 0.
* fix(daemon,contracts): inline critique SSE, signal-terminated child, null shipped artifactPath
Round 3 review feedback on PR #481.
1. packages/contracts/src/critique.ts inlines CritiqueSseEvent +
panelEventToSse + CRITIQUE_SSE_EVENT_NAMES + a local mirror of
SseTransportEvent. The previous re-export from './sse/critique.js'
broke the workspace web build (Turbopack cannot rewrite .js to .ts
on a relative source import) while removing the .js extension broke
daemon's NodeNext typecheck (it walks this leaf via the './critique'
subpath export which requires explicit .js extensions). Inlining
removes the cross-file relative import entirely so both consumers
walk one self-contained file. packages/contracts/src/sse/critique.ts
is removed and its co-located test moves up to
packages/contracts/src/critique.test.ts. The barrel
packages/contracts/src/index.ts drops the redundant
'./sse/critique' re-export since './critique' already exports the
same symbols.
2. apps/daemon/src/critique/orchestrator.ts treats a signal-terminated
child as a terminal race rejection. Previously the race only caught
non-zero numeric exit codes and treated code === null as
indefinitely pending, so a SIGTERM from /api/runs/:id/cancel
resolved childExitPromise as { code: null, signal: 'SIGTERM' } and
the orchestrator fell through to the no-SHIP fallback path,
persisting below_threshold instead of interrupted. The race now
rejects with a new ChildSignaledError when signal !== null, and a
new catch branch classifies the run as 'interrupted' and (if at
least one round closed) emits a synthetic ship event with
status='interrupted' so the persisted row and the SSE transcript
reflect the actual cause.
3. Same file, ship-handling: artifactPath is now persisted as null on
shipped runs until a future phase actually extracts the
<SHIP><ARTIFACT> body to disk. Previously the orchestrator wrote
${artifactDir}/${artifactId} even though no file existed at that
path, so any later replay/export/UI code that trusted
critique_runs.artifact_path would dereference a missing file. The
transcript still records the ship event with the artifact reference
so consumers can find the run.
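Item 2's classification rule, reduced to a pure function for illustration. The ChildExit shape mirrors the { code, signal } pair quoted above; classifyChildExit and the returned labels are hypothetical names, and treating { code: null, signal: null } as clean is an assumption of this sketch.

```typescript
// Shape of the resolved child-exit value quoted in the entry above.
interface ChildExit {
  code: number | null;
  signal: string | null;
}

type ExitClass = "clean" | "cli_exit_nonzero" | "interrupted";

// A signal-terminated child is checked first, so a SIGTERM with
// code === null is classified as interrupted instead of falling through.
function classifyChildExit(exit: ChildExit): ExitClass {
  if (exit.signal !== null) {
    return "interrupted";
  }
  if (exit.code !== null && exit.code !== 0) {
    return "cli_exit_nonzero";
  }
  return "clean";
}
```

Checking the signal before the exit code is the point of the fix: the earlier logic only looked at non-zero numeric codes and so never saw the cancel.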
Tests:
- apps/daemon/tests/critique-lifecycle.test.ts: 2 new regressions
(SIGTERM-terminated child after one closed round persists
'interrupted' with a synthetic ship event of the same status; shipped
run leaves artifactPath null in result and DB row).
- 43 critique-suite tests pass: 14 orchestrator + 11 transcript +
10 spawn-wiring + 3 boot-reconcile + 3 authority + 2 lifecycle.
Workspace typechecks: contracts, daemon, web all exit 0.
* fix(daemon): buffer raw SHIP, emit only normalized; reject SHIP for unclosed round
Round 4 review feedback on PR #481.
The parser-event loop used to unconditionally collectedEvents.push(event)
and bus.emit(panelEventToSse(event)) for every event, including raw
<SHIP>. SSE clients and the transcript could see the agent's forged
status="shipped" / composite="9.5" before decideRound(...) ran, even
when the daemon later corrected the persisted DB row to below_threshold.
The loop now skips ship events entirely; the orchestrator buffers the
raw shipEvent, runs daemon-authoritative scoring, and emits a single
normalized ship payload built from the daemon's computed composite,
selectFallbackRound's mustFix, and decideRound's status. The transcript
and SSE bus now only ever see the daemon-scored ship.
The unknown-round fallback used to make agent-claimed status/composite
authoritative when SHIP referenced a round that was never closed: a
malformed stream could close low round 1, then send <SHIP round="2"
status="shipped" composite="10">, completedRounds.find(r => r.n === 2)
was undefined, and the orchestrator persisted the agent's value. That
re-opened the scoring-integrity hole the previous round was meant to
close. The orchestrator now drops a SHIP whose round isn't in
completedRounds, emits a parser_warning, and falls through to the
no-SHIP fallback policy. The synthetic ship from selectFallbackRound
gets emitted instead, with daemon-authoritative round/composite/status.
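A simplified sketch of the buffered-SHIP normalization described above. It assumes a ship decision of "composite >= threshold", which is only a stand-in for decideRound (the real rule is not spelled out in this log); all type and function names are hypothetical.

```typescript
interface CompletedRound {
  n: number;
  composite: number; // daemon-computed composite for the round
}

// What the agent claims inside its raw SHIP block.
interface AgentShip {
  round: number;
  status: string;
  composite: number;
}

interface NormalizedShip {
  round: number;
  composite: number;
  status: "shipped" | "below_threshold";
}

type ShipOutcome = { ship: NormalizedShip } | { warning: string };

// Build the only ship payload that may be emitted. The agent's status and
// composite are ignored; a SHIP naming an unclosed round is dropped with a
// warning so the caller can fall through to its no-SHIP fallback policy.
function normalizeShipSketch(
  raw: AgentShip,
  completedRounds: CompletedRound[],
  threshold: number,
): ShipOutcome {
  const round = completedRounds.find((r) => r.n === raw.round);
  if (round === undefined) {
    return { warning: `SHIP references unclosed round ${raw.round}` };
  }
  return {
    ship: {
      round: round.n,
      composite: round.composite,
      status: round.composite >= threshold ? "shipped" : "below_threshold",
    },
  };
}
```

Because nothing from the raw SHIP other than the round number survives, a forged status or composite never reaches the transcript or the SSE bus.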
Tests:
- tests/critique-authority.test.ts: extended the lying-ship regression
to also assert the emitted critique.ship payload is downgraded
(status='below_threshold', composite < threshold), so the SSE bus
cannot see the agent's claim. Added a new regression where SHIP
references an unclosed round 2: the agent ship is dropped, a
parser_warning fires, the fallback selects round 1, and the only
emitted critique.ship has round=1 and status=below_threshold.
- 44 critique-suite tests pass: 14 orchestrator + 11 transcript + 10
spawn-wiring + 3 boot-reconcile + 4 authority + 2 lifecycle.
Workspace daemon typecheck exits 0.
---------
Co-authored-by: Nagendhra <nagendhra405@gmail.com>
Co-authored-by: mrcfps <mrc@powerformer.com>
* chore: enforce test directory conventions
Move package, app, and tool tests out of src and add guard enforcement so source directories stay source-only.
* ci: use guard and package-scoped tests
Run the new repository guard in CI and keep test execution aligned with package-scoped commands after removing root aliases.
* ci: align stable release guard check
Use the new repository guard in stable release verification after replacing the residual-JS-only script.
* chore: tighten test layout enforcement
Enforce sibling tests directories, typecheck moved test suites with dedicated configs, and refresh remaining guidance that pointed at src-based tests.
* chore: clarify no-emit test tsconfigs
Explicitly disable declaration-only emit in test tsconfigs so review tooling sees they are no-emit typecheck configs.
Adds the official Discord community link (https://discord.gg/qhbcCH8Am4)
as a badge in the badge row across all 11 localized READMEs so users
have a clear, consistent entry point to the community.
Co-authored-by: joey <joey@joeydeMacBook-Air.local>
* docs: add Arabic README translation
Adds README.ar.md — full Arabic translation of README.md, wrapped in a
top-level <div dir="rtl"> for correct RTL rendering on GitHub. All
URLs, file paths, code blocks, badges, and tables are preserved verbatim;
prose and table cell text are translated to Modern Standard Arabic.
Also wires up the language switcher in all nine sibling README files
(en/de/fr/zh-CN/zh-TW/ko/ja-JP/ru/uk) so the previously-placeholder
العربية entry now links to README.ar.md.
The translation is an initial pass — native-Arabic-speaker review for
phrasing/terminology is welcome.
* docs(ar): preserve code fences verbatim and restore license attribution
Address PR #458 review feedback (P2):
1. Revert translated prose inside fenced code blocks back to verbatim
English, matching the source README.md:
- Prompt-stack composition block
- Architecture ASCII diagram (browser layer, daemon comments,
bottom CLI row)
- Quickstart bash comments
- .od/ tree comments
- Repository structure tree comments
2. Restore the License section's bundled-attribution sentence with
links to skills/guizang-ppt/LICENSE (MIT, @op7418) and
skills/html-ppt/LICENSE (MIT, @lewislulu); the previous version
collapsed it to a generic 'see LICENSE' pointer.
* docs(ar): translate Running the Project and MCP server sections
Address PR #458 follow-up review (mrcfps): the Arabic README jumped from
the .od/ first-run section straight to repository structure, missing the
two sections added to README.md after this branch was forked:
- ## Running the Project — web/localhost mode, fixed-port restarts,
desktop/Electron commands, and the useful-commands table
- ## Use Open Design from your coding agent — stdio MCP server setup,
per-client install flow, daemon-must-be-running note, and the
read-only security model
Command blocks, table structure, and links are preserved verbatim from
README.md per the same convention used elsewhere in the file.
* fix(daemon): respect baseUrl path verbatim in OpenAI-compat proxy
`appendVersionedApiPath` previously force-injected `/v1` unless the
supplied baseUrl ended with `/vN`. That broke any provider whose
OpenAI-compatible surface lives under a sub-path:
https://api.deepinfra.com/v1/openai → ".../v1/openai/v1/chat/completions"
https://openrouter.ai/api/v1 → ".../api/v1/chat/completions" (worked, by luck of /vN suffix)
Now the auto-`/v1` only fires when the user supplied no path at all, so
DeepInfra, OpenRouter, and any other sub-path-mounted compat surface
route to the right endpoint while the canonical
`https://api.openai.com` / `https://api.anthropic.com` shortcuts still
work. Adds a regression test table covering the matrix.
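The new rule can be sketched as below. This is not the actual appendVersionedApiPath; appendApiPathSketch is a hypothetical name, and the sketch covers only the "auto-/v1 when no path was supplied" behavior, with no provider-specific handling.

```typescript
// Append an endpoint suffix to a user-supplied base URL.
// "/v1" is injected only when the base URL carries no path at all;
// any supplied path is kept verbatim.
function appendApiPathSketch(baseUrl: string, suffix: string): string {
  const url = new URL(baseUrl);
  // Drop trailing slashes so "https://host" and "https://host/" match.
  const suppliedPath = url.pathname.replace(/\/+$/, "");
  const prefix = suppliedPath === "" ? "/v1" : suppliedPath;
  url.pathname = `${prefix}${suffix}`;
  return url.toString();
}

const deepinfra = appendApiPathSketch("https://api.deepinfra.com/v1/openai", "/chat/completions");
const openrouter = appendApiPathSketch("https://openrouter.ai/api/v1", "/chat/completions");
const openai = appendApiPathSketch("https://api.openai.com", "/chat/completions");
```

Under this rule the DeepInfra base URL keeps its /v1/openai sub-path instead of gaining a second /v1 segment, while a bare host still receives the default.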
* Account for Anthropic style URLs
Address review feedback from lefarcen and mrcfps on PR #492:
P1: Track setModelRequestId to scope the recovery block to the exact
session/set_model request. This prevents duplicate prompt sends if
session/prompt returns -32603 (which would otherwise match on
expectedId + non-default model conditions).
P1: Add promptRequestId === null guard so the recovery path only
triggers before a prompt has been sent.
P2: In detectAcpModels, only suppress -32603 on unexpected ids.
Expected-id -32603 errors (initialize, session/new) are real probe
failures and should reject immediately rather than causing a silent
15s timeout.
P2: In attachAcpSession, expected-id -32603 errors that don't match
setModelRequestId now call fail() instead of falling through. This
prevents initialize/session/new/session/prompt failures from being
silently swallowed.
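The scoping rules above, together with the -32602 handling from the earlier entries in this log, can be summarized as one decision function. This is a sketch under assumptions: classifyAcpError, the context shape, and the action labels are all hypothetical, and the real handlers are split across detectAcpModels and attachAcpSession.

```typescript
const INVALID_PARAMS = -32602;
const INTERNAL_ERROR = -32603;

// Hypothetical snapshot of session state at the time an error arrives.
interface AcpErrorContext {
  code: number;
  id: number;
  expectedIds: ReadonlySet<number>; // ids of requests this session sent
  setModelRequestId: number | null;
  promptRequestId: number | null;
}

type AcpErrorAction = "recover_with_default_model" | "suppress" | "fail";

function classifyAcpError(ctx: AcpErrorContext): AcpErrorAction {
  const isSetModelReply =
    ctx.setModelRequestId !== null && ctx.id === ctx.setModelRequestId;

  // Recovery is scoped to the exact session/set_model request and only
  // applies before any prompt has been sent.
  if (
    isSetModelReply &&
    ctx.promptRequestId === null &&
    (ctx.code === INTERNAL_ERROR || ctx.code === INVALID_PARAMS)
  ) {
    return "recover_with_default_model";
  }

  // Only -32603 on an unexpected id is treated as cleanup noise.
  if (!ctx.expectedIds.has(ctx.id) && ctx.code === INTERNAL_ERROR) {
    return "suppress";
  }

  // Everything else, including expected-id -32603 and unexpected-id
  // -32602, is a real protocol failure.
  return "fail";
}
```

Keying recovery on the request id rather than on the error code alone is what prevents a failing session/prompt from triggering a second prompt send.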
The `labels/good-first-issue` URL renders an empty list when no issues
carry that label. GitHub's `/contribute` page auto-curates a mix of
good-first-issue, help-wanted, and active items, so it stays useful for
newcomers regardless of label coverage. Wording is unchanged across all
10 README locales.