Commit graph

27 commits

Author SHA1 Message Date
chaoxiaoche
6a08dfe111
Add design system package quality guard (#2224)
* Add design system import manifest schema

* Generate hybrid design system imports

* Read design system usage and cached manifests

* Add design system pull-file tool

* Show design system package evidence

* Wire design system import semantics

* Add design system package quality guard

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-19 16:53:29 +08:00
PerishFire
bd48c597b0
chore: pin dependency versions and harden CI caches (#2189)
* chore: pin dependency versions

* ci: enforce pinned dependency specs

* ci: fix pnpm executable invocation
2026-05-19 13:58:27 +08:00
PerishFire
4424f08be0
[codex] Add packaged desktop auto-update (#1375)
* Add packaged desktop auto-update

* Handle counted beta nightly update versions

* Refresh desktop auto-update branch for main

* Serialize desktop updater operations

* Refresh auto-update branch for packaged paths
2026-05-19 11:20:05 +08:00
chaoxiaoche
f7eb82d7a5
feat(design-systems): import design system projects (#2112)
* feat(design-systems): define project manifest contract

* feat(design-systems): add default project manifest

* feat(daemon): consume design system manifests

* feat(design-systems): import local project systems

* feat(design-systems): import from github repositories

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-18 20:20:38 +08:00
chaoxiaoche
1f66c53203
feat(daemon): consume component manifests (#2053)
* feat(design-systems): extract component manifests

* feat(daemon): consume component manifests

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-18 16:50:52 +08:00
chaoxiaoche
d1a2f9f07e
chore(design-systems): report component fixture coverage (#2049)
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
2026-05-18 16:47:34 +08:00
lefarcen
b268bbe169 Merge origin/garnet-hemisphere (post-9e196d34) — Use Plugin handoff fix
Brings in 11 new garnet commits, most importantly:
- 1a90aef4 feat(plugin-use): implement plugin use handoff functionality —
  fixes the bug QA reported where /plugins Use Plugin would 422 silently
  for template plugins; new flow hands off to HomeView with the plugin
  pre-bound + input form prompted there.
- 2ac58544 feat(plugin-inputs): enhance plugin input handling with file
  upload support — extends PluginInputsForm for file uploads.
- 3b167b69 feat(plugins): registry protocol — new @open-design/registry-protocol
  workspace package (needs build before daemon boot).
- Plus enhancements to plugin metadata, GitHub installer, plugin detail
  view, login/whoami, static HTML preview paths.

Conflicts resolved:
- packages/contracts/src/api/projects.ts: HEAD's skipDiscoveryBrief
  field + garnet's contextPlugins (@-mention plugin context refs) both
  kept on ProjectMetadata.
- apps/landing-page/* (3 files): accepted HEAD — garnet had the older
  single-page landing-page header; main has the multi-page layout
  (/skills/, /systems/, /templates/, /craft/) with dynamic counts. Not
  related to the Use Plugin core fix.

New @open-design/registry-protocol package must be built before daemon
boots; pnpm install does this via postinstall already.
2026-05-14 16:32:35 +08:00
lefarcen
6c16283850 Merge origin/main (post-7c8305f4) into reconcile branch
Brings in 10 new main commits: routine deep-link to specific
conversations (#1508), Windows resource cache fix for Orbit templates,
collapsible comment side panel (#1607), routines project radio polish,
Copilot logo swap, and minor UI fixes.

Conflicts resolved:
- router.ts: garnet's home/view + marketplace routes + main's
  per-project conversationId deep-link field coexist on Route union
- ProjectView.tsx: garnet's isPhantomDaemonRunMessage helper +
  main's isStoppableAssistantMessage helper both kept
- ProjectView.run-cleanup.test.tsx: accepted HEAD (garnet's
  phantom-row regression test); main's three new tests for
  finalizeActiveAssistantMessagesOnStop / clearStreamingConversationMarker
  / shouldClearActiveRunRefs are queued as a follow-up TODO inline.
2026-05-14 15:13:38 +08:00
chaoxiaoche
e57e028222
feat(daemon): make design-system token channel default-on (PR-D) (#1544)
* feat(daemon): make design-system token channel default-on (PR-D)

Flip `OD_DESIGN_TOKEN_CHANNEL` from default-off to default-on. Every
chat that picks a brand with `tokens.css` + `components.html` siblings
(today: `default`, `kami`) now gets the structured token contract
appended to the system prompt automatically. `OD_DESIGN_TOKEN_CHANNEL=0`
keeps the DESIGN.md-only path as a kill switch.

Adds `scripts/check-design-system-flag-parity.ts`, registered in
`pnpm guard`. The guard walks every brand and asserts:

- 147 prose-only brands produce byte-identical prompts under flag-off
  vs flag-on (PR-D's "no-op for legacy brands" promise)
- 2 structured brands diverge as expected (catches a future regression
  that silently dropped the structured blocks)

Smoke evidence on #1385 (PR-C):
- `default` — 10/10 brand tokens used byte-for-byte in treatment vs
  0/10 invented colors in control
- `kami` — treatment recovers brand name (`Kami · 纸`), the two-tier
  surface (`--bg` parchment + `--surface` ivory), the CN font stack
  override, and the `components.html` card pattern; control invented
  "Replica" as a brand name

Co-authored-by: Cursor <cursoragent@cursor.com>

* review: address @nettee + @lefarcen feedback on parity guard

Two blocking findings from #1544 review:

1. @nettee — guard's inventory walk silently passed on unreadable
   filesystem state. `fileExists` swallowed every `stat` error and the
   bare `readdir` catch returned `[]` for any failure. A renamed
   `design-systems/` tree, a permission-denied DESIGN.md, or a
   directory at the brand path would have left `pnpm guard` happy
   after checking 0 brands — exactly the silent misconfiguration this
   guard exists to catch. Both error paths now treat only ENOENT /
   ENOTDIR as absence and rethrow everything else, mirroring the
   `readFileOptional` fix already applied to PR-C's
   `apps/daemon/src/design-systems.ts`.

2. @nettee — guard exercised `composeSystemPrompt` directly, bypassing
   the `process.env.OD_DESIGN_TOKEN_CHANNEL !== '0'` gate in server.ts
   that PR-D actually flipped. A regression that restored `=== '1'`,
   typo'd the env name, or stopped reading assets when the var is
   unset would still leave the guard green. Extracted the predicate
   into `isDesignTokenChannelEnabled(env)` next to
   `readDesignSystemAssets` and added 6 unit tests pinning every value
   that matters: unset / `'1'` / `'true'` / empty / `'0'` /
   whitespace-padded. server.ts now calls the predicate. Any
   regression on the env-flag semantics fails
   `tests/design-system-assets.test.ts` independently of the
   composer-level coverage.

Verified: pnpm guard (13/13), tsc -p scripts/tsconfig.json (clean),
@open-design/daemon typecheck (clean), 32/32 prompt + asset tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* review: pin server-layer asset resolution end-to-end (lefarcen P2)

Round-2 review feedback from @lefarcen on #1544: the predicate suite
in tests/design-system-assets.test.ts pinned the env-flag boolean but
did NOT exercise the server prompt-assembly path that PR-D actually
flipped — the seam where the daemon decides whether to read tokens.css
/ components.html from disk and hand them to composeSystemPrompt. A
regression that, say, restored an inline `=== '1'` gate or stopped
calling isDesignTokenChannelEnabled() from server.ts would still leave
the predicate test green.

Extracted that whole seam into `resolveDesignSystemAssets(id,
builtInRoot, userInstalledRoot, env)` on apps/daemon/src/design-systems.ts.
The function combines:
  1. the env-flag gate (kill switch on `OD_DESIGN_TOKEN_CHANNEL=0`)
  2. the built-in → user-installed root fallback chain (per-file)
  3. the DesignSystemAssets result shape consumed by composeSystemPrompt

server.ts at the prompt-assembly site is now a thin caller of this
function. The previous 13-line inline block (env check + per-file
fallback) collapses to one call, so the whole asset-resolution path
now has a single testable seam.

7 new tests in tests/design-system-assets.test.ts run the full pipeline
end-to-end against real disk fixtures:
  - env unset (default-on): returns built-in assets
  - env=`'0'` (kill switch): returns undefined even with files on disk
  - env=`'1'` (legacy opt-in): still works
  - mixed builtin/user-installed: per-file fallback merges correctly
  - both halves built-in: skips user-installed roundtrip verbatim
  - prose-only brand (no files): undefined / undefined
  - nonexistent brand directory: undefined / undefined

Verified: pnpm guard (13/13), tsc -p scripts/tsconfig.json (clean),
@open-design/daemon typecheck (clean), 39/39 prompt + asset tests
(was 32; +7 new server-layer-resolution tests).

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(test): add missing projectKind to FileViewer deck preview test

The deck preview test added in #1556 (086be271) renders <FileViewer/>
without `projectKind`, which became a required prop in #1509. CI on
main is currently red on this; pick up the trivial fix here so PR-D
can land cleanly.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 14:14:19 +08:00
pftom
3b167b6921 feat(plugins): add registry protocol and enhance plugin management features
- Introduced the `@open-design/registry-protocol` package, enabling improved interactions with plugin registries.
- Updated the `typecheck` script in the daemon's `package.json` to include the new registry protocol.
- Enhanced the CLI with new flags and commands for better plugin management, including `yank` and additional marketplace functionalities.
- Implemented a plugin lockfile system to manage installed plugins and their versions, improving reliability during upgrades.
- Added new marketplace doctor functionality to validate plugin entries and ensure compliance with registry standards.

This update significantly enhances the plugin ecosystem by providing robust registry interactions and improved management capabilities.
2026-05-14 08:55:36 +08:00
lefarcen
d3602be666 Merge origin/main into garnet-hemisphere (reconcile)
Merge of `origin/main` (`03ed3960`, 2026-05-13 pre-0.7.0) into the
161-commit garnet-hemisphere line, reconciling the product-vibe-coded
plugin/marketplace/EntryShell surfaces from garnet with the routines /
skills / live-artifacts feature work landed on main since the fork point.

Headline decisions (full rationale + side-by-side screenshots in
`specs/change/20260513-garnet-skills-automations/reconcile-result-vs-garnet.md`):

- #1 SettingsDialog: keep main's Memory / Skills / External MCP /
  Connectors / Routines / MCP server nav items even though the top-level
  /integrations + /automations routes also cover them. Two entries
  coexist for now; revisit once Track A/B fill in the placeholder content.
- #2 EntryView: accept garnet's thin wrapper delegating to EntryShell.
  Main's PetRail sidebar + image-templates/video-templates tabs are
  intentionally deferred to a follow-up that re-integrates them into
  the new EntryShell layout.
- #3 /integrations + /automations top-level routes: kept (garnet's
  product intent). Skills tab is still a "Coming soon" placeholder
  awaiting Track A; Routines/Schedules/Live-artifacts cards on
  /automations are still mock awaiting Track B.
- #5 DesignFilesPanel: hybrid — main's pagination as primary list,
  garnet's Plugin folders section preserved between the live-artifacts
  block and the pagination block. (by-kind sections drop in favour of
  pagination; plugin-folders rendering stays because it is a
  garnet-specific product addition.)
- #7 server.ts (10 hunks, ~5400 conflict lines): manual hunk-by-hunk
  merge. Both daemon admin routes + plugin/genui routes (garnet) and
  routines/memory/skills upgrades (main) preserved. Garnet's inline
  project route block kept alongside main's `registerProjectRoutes` /
  `registerProjectUploadRoutes` modular wiring — duplicate route
  audit is a follow-up. Garnet's POST /api/projects plugin-snapshot
  resolution + default-scenario fallback is intentionally dropped from
  the inline body (now handled by registerProjectRoutes) and listed for
  follow-up re-integration into `project-routes.ts`.

Verification (worktree at /Users/elian/Documents/open-design-garnet):
- `pnpm typecheck` exits 0 across all workspace packages
- daemon (`pnpm tools-dev run web --namespace reconcile-shots`) boots,
  serves `/api/daemon/status` healthy, and survives a Playwright
  walkthrough of /integrations / /automations / home / projects /
  design-systems / plugins / settings dialog
- `@open-design/plugin-runtime` package built (was missing dist/ on
  garnet); without it the daemon's plugins/* imports fail at boot

Track A (Skills tab → real SkillsSection) and Track B (Automations
cards → real routines / live-artifacts backend) are the two remaining
follow-ups blocking the placeholder/mock content from going live. See
`spec.md` and `track-skills.md` in the same directory.
2026-05-13 22:29:21 +08:00
nettee
f621dbbfea
feat(web): Add Tailwind foundation (#1388) 2026-05-12 21:48:16 +08:00
pftom
5af84c09af feat(web): refactor PluginsHomeSection to use tag-based filtering and introduce PluginCard component
- Replaced the legacy tabbed categorization in `PluginsHomeSection` with a tag-driven approach, allowing dynamic filtering based on plugin tags.
- Introduced a new `PluginCard` component to encapsulate the rendering of individual plugin cards, improving separation of concerns and maintainability.
- Added a `usePluginCategories` hook to manage plugin visibility and filtering logic, enhancing the overall structure and testability of the component.
- Implemented a "More" pill for overflow tags in the filter row, improving user interaction with a cleaner UI.
- Updated CSS styles to support the new layout and improve visual consistency across the plugins home section.

This update significantly enhances the user experience by providing a more flexible and intuitive way to discover and interact with plugins.
2026-05-12 13:25:44 +08:00
chaoxiaoche
a75d9938c7
feat(design-systems): add structured tokens.css schema (default + kami) (#1231)
* feat(design-systems): add structured tokens.css schema (default + kami)

Compile each brand's DESIGN.md prose into a machine-readable :root
block agents paste verbatim, removing the "Primary → --accent"
translation step where most token misuse happens. Daemon prompt
injection lands in a follow-up; lint-artifact already enforces the
shared token vocabulary so no rule changes needed.

Schema validated across two contrasting aesthetics:
- default (sans-serif, cobalt, B2B utility) — stress test the
  shallow form, 2-level fg / 2-level surface
- kami (serif, parchment, ink-blue, print-first) — stress test the
  rich form, 4-level fg ramp, 3-level surface, ring elevation, i18n
  font stacks, and solid-hex tag tints (print renderers double-paint
  alpha)

Schema growth from kami's stress test (5 new optional slots, all
backward-compatible — default aliases via var() to existing tokens):
- --fg-2 / --meta (4-level fg ramp)
- --surface-warm (3-level surface)
- --border-soft (2-level border)
- --elev-ring (ring elevation as first-class level)

Brand-specific extensions live in tokens.css with explicit "NOT in
shared schema" labels and a documented promotion path (≥2 brands
need it → promote to schema slot).

components.html in each brand is a self-contained reference fixture
that exercises every token through real layouts. Both fixtures lint
clean against apps/daemon/src/lint-artifact.ts.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(design-systems): add token-fixture drift guard

Each design system in design-systems/<brand>/ ships two files agents
consume in tandem: tokens.css (canonical token bindings) and
components.html (a self-contained fixture whose first <style> embeds
the same :root paste so the file renders standalone). The fixture's
:root block is a copy of tokens.css's :root block, kept in sync only
by an inline comment.

This adds scripts/check-tokens-fixture-sync.ts and registers it in
pnpm guard. The check pairs each brand's tokens.css with its
components.html and asserts the unscoped :root block is byte-equivalent
after canonical normalization (CSS comments stripped, whitespace
collapsed, separator spacing normalized). Brands missing one half of
the pair, or with no :root rule in either file, fail the guard.

Scoped overrides like :root[lang="zh-CN"] are not required to appear
in the fixture (per the kami fixture's inline comment they are pasted
only when an artifact's <html lang> matches), so the check only
compares the unscoped :root block.

Verified: pnpm guard passes for default + kami, fails on intentional
value drift, fails on missing token, tolerates whitespace-only
formatting differences.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(design-systems): point fixture CTAs to real files

Both default and kami components.html advertised in-page anchors
(#tokens, #spec, #surface, #accent, #type, #components) but defined
no matching ids, so every CTA was a no-op when the fixture was
opened locally — flagged by mrcfps in #1231.

Re-point each link to a real artifact in the same brand directory:

- "View tokens" / "Inspect tokens" / "Inspect typography" → ./tokens.css
- "Read the spec" / "Read the rule" → ./DESIGN.md

Browsers render these as raw source views, which is the desired UX
for a reference fixture: clicking the CTA shows the underlying
contract instead of jumping to nothing. Agents copying the fixture
also learn the pattern of "buttons link to actual sibling resources".

The :root token block is unchanged, so the token-fixture drift guard
still passes for both brands.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(design-systems): codify token schema (A1/A2/B/C layers)

The two-brand pilot (default + kami) settled the shape of the shared
token schema; this commit codifies it as a machine-readable contract
and enforces it in pnpm guard, addressing lefarcen's review on #1231:

> the optional-vs-required split won't generalize cleanly when brand
> #3 needs different Layer A tokens or when multiple brands converge
> on the same extension (promoting C→B→A). Consider surfacing that
> limitation in the PR narrative or in a future SCHEMA.md.

Schema lives under design-systems/_schema/ as three files:

- tokens.schema.ts   — TypeScript declaration of every shared token
                       with its layer (A1-identity / A1-structure /
                       A2 / B-slot), plus per-brand C-extension
                       allowlists and a global C-prefix allowlist
- defaults.css       — CSS mirror of A2 fallback values, used as the
                       human-readable contract reviewer's-eye copy
                       and the future input to the derive script
- AGENTS.md          — schema layer model, C → B-slot → A2 promotion
                       rules, when-not-to-add-a-token guidance

Layer model:

  A1-identity    8 tokens — bg/surface/fg/muted/border/accent +
                 font-display/font-body. The brand IS these values;
                 no fallback is defensible.

  A1-structure  18 tokens — type scale (8), leading (2), tracking
                 (1), section-y (3), container (4). Structural
                 decisions vary per brand by design and have no
                 cross-brand default.

  A2            26 tokens — accent states, semantic colors, motion,
                 base spacing scale, radius, elevation, focus,
                 font-mono. Required in every tokens.css; fallback
                 lives in defaults.css for the future derive script
                 to inline when DESIGN.md does not specify the value.

  B-slot         4 tokens — fg-2 / meta / surface-warm / border-soft.
                 Brand may bind independently or alias the named
                 sibling via var(...) for components that target the
                 richer ramp.

  C-extension    n tokens — brand-specific names (kami's tag-bg-*,
                 leading-display, accent-light, etc.). Allowlisted
                 per-brand in BRAND_EXTENSIONS or globally by prefix
                 in BRAND_EXTENSION_PREFIXES. Promote when a second
                 brand adopts the same name.

Why A2 fails the guard today:
  Artifacts are generated by agents pasting one brand's :root block
  into a single <style>; there is no global stylesheet that supplies
  fallbacks at runtime. A tokens.css missing an A2 declaration would
  silently break any var() reference in the fixture. Until the
  derive script (PR-B) lands and inlines defaults, every brand's
  tokens.css must declare every A2 token directly. The guard
  enforces this strictly.

Why --font-mono lands in A2 (not A1):
  149 brands' DESIGN.md files were surveyed: 87 (58%) declare a
  monospace stack, 62 (42%) do not — including major brands like
  bmw / nike / apple / notion / mastercard / meta. Agent paste
  cannot rely on the brand author having written it down; a
  defaultable A2 fallback (with CJK brands like kami overriding) is
  safer than forcing every brand author to add a field they may not
  realize their kbd / code-block components need.

Five guard checks, each registered as its own entry in scripts/guard.ts
so failures attribute to a specific contract:

  1. token-fixture sync       — components.html :root ↔ tokens.css :root
                                 byte-equivalent (existing)
  2. A1 required tokens       — every brand declares every A1 token
  3. A2 required tokens       — every brand declares every A2 token
  4. unknown token allowlist  — every declared token is in schema or
                                 brand-extension allowlist
  5. A2 defaults parity       — defaults.css ↔ tokens.schema.ts
                                 fallback byte-equivalent

Verified on default + kami:
  - 26 A1 tokens declared in both brands
  - 26 A2 tokens declared in both brands
  - 129 total declarations, all match shared schema or brand extensions
  - defaults.css ↔ tokens.schema.ts parity holds
  - sanity test: drifting --motion-fast in defaults.css fails check 5
    with a clear divergence message

The PR description originally listed "Dedicated SCHEMA.md" as
explicitly NOT in this PR ("Once 3+ brands ship, extracting a single
source of truth becomes worthwhile"). That boundary moves: lefarcen's
review surfaced the schema-generalization risk, and the schema must
exist as a machine-enforced contract before the derive script can
read it. The TS file replaces the markdown that was deferred.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(web/tests): pass missing designTemplates prop to ProjectView

Pre-existing typecheck regression on main: PR #955 (b5eb8c16,
"generic skills + split skills/design-templates + finalize-design
API") added required `designTemplates: SkillSummary[]` to ProjectView
Props but updated only two of the three test fixtures that render
ProjectView directly. The third — ProjectView.api-empty-response.test.tsx
— was missed, so `pnpm typecheck` (and CI on any PR merging into
main) fails on:

  apps/web/tests/components/ProjectView.api-empty-response.test.tsx
    (168,6): error TS2741: Property 'designTemplates' is missing in
    type ...

The other two ProjectView tests already pass `designTemplates={[]}`,
so this aligns this fixture with the existing pattern. Out of scope
for #1231 strictly, but the regression blocks the merged-state
typecheck CI runs that #1231 triggers, and the one-line fix here
restores main's typecheck health for everyone.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(design-systems): enforce B-slot required tokens in pnpm guard

Closes mrcfps + lefarcen review comment thread on #1231:

> The guard validates A2 required tokens here, but there's no
> sibling check for B-slot aliases (--fg-2, --meta, --surface-warm,
> --border-soft). Per the schema docs, every brand must declare
> A1 + A2 + B-slot names so shared components can safely read
> var(--fg-2) etc. Without a B-slot guard, a brand can omit those
> aliases, pass pnpm guard, and break any artifact that references
> them.

Same artifact-paste constraint as A2: agents render artifacts by
pasting one brand's :root block into a single <style>; there is no
runtime cascade, so a missing B-slot makes any var(--fg-2) reference
resolve to nothing. Until now the schema narrative claimed B-slots
were optional with a var() default, but no machine check enforced
declaration — a contract gap reviewers reasonably refused to merge.

This commit closes the gap in three places so machine and narrative
agree:

1. scripts/check-tokens-fixture-sync.ts
   - Add checkDesignSystemBSlotRequiredTokens, mirroring the A2
     check but using getBSlotNames() from the schema.
   - Failure message names each missing slot AND the schema-suggested
     alias (--fg-2 (default alias: var(--fg))) so a brand author
     fixing the failure has a copy-pasteable resolution.
   - Renumber section comments: 5 checks → 6 checks.

2. scripts/guard.ts
   - Register the new check between A2 required and unknown
     allowlist so failures attribute to a specific contract.

3. design-systems/_schema/AGENTS.md
   - Update the layer table: B-slot row's "If omitted" column
     changes from "resolves via var() to a richer sibling" to
     "guard fails — brand must declare, either as var(--sibling)
     (collapsed) or independent value (richer)".
   - Add a "Why B-slot is required (and what the alias is for)"
     section that distinguishes the schema-suggested alias from a
     runtime fallback, with worked examples for default (alias) and
     kami (independent bind).

Verified on default + kami:
- pnpm guard passes all 6 design-system checks
- 4 B-slot tokens declared in both brands (default aliases via var(),
  kami binds independently — both forms satisfy the contract)
- pnpm typecheck clean across the workspace
- Sanity test: removing --fg-2 + --meta from default/tokens.css fires
  the new guard with a precise per-token alias hint:
    [default] design-systems/default/tokens.css is missing 2 B-slot
    tokens (alias the named sibling via var(...) or bind
    independently):
      --fg-2 (default alias: var(--fg)),
      --meta (default alias: var(--muted))

The schema contract is now machine-enforced end-to-end (A1 + A2 +
B-slot all required-with-fixed-form-of-fallback). The derive script
in PR-B can rely on every brand's tokens.css containing every shared
slot name.

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(e2e): skip leading-underscore meta-directories under design-systems/

CI for #1231 went red on `Validate workspace` after merging origin/main.
Cause is a clean collision between two recently-landed changes:

- main #1270 (be77dc03 "Default English resource i18n fallback")
  tightened tests/localized-content.test.ts so every directory under
  design-systems/ is run through assertResourceId() with the strict
  RESOURCE_ID_PATTERN /^[a-z0-9][a-z0-9-]*$/.

- this branch #1231 introduced design-systems/_schema/ as the home
  of the shared token contract (tokens.schema.ts, defaults.css,
  AGENTS.md). The leading underscore signals "meta-directory, not
  brand" — the same convention SCSS partials, Jekyll, Hugo all use.

The two changes never met until CI built the merge commit, where
assertResourceId('_schema') deterministically failed:

  Error: Design system directory _schema has malformed resource id: _schema
    at invariant tests/localized-content.test.ts:66:11
    at assertResourceId tests/localized-content.test.ts:71:3
    at readDesignSystemResources tests/localized-content.test.ts:202:8

Fix tightens readDesignSystemResources's directory filter so the
leading-underscore convention is recognised explicitly:

    .filter((entry) => entry.isDirectory() && !entry.name.startsWith('_'))

This aligns with what apps/daemon/src/design-systems.ts:listDesignSystems
already does implicitly — it requires DESIGN.md per directory, so
_schema/ was always invisible at runtime; the test was the only place
that surfaced it.

Verified locally on the post-merge tree:
- pnpm test (e2e vitest) — tests/localized-content.test.ts: 4 passed
- pnpm guard — all 6 design-system checks pass on default + kami
- pnpm typecheck — clean across the workspace (after pnpm install
  to pull deps for tools/pr that arrived with main)

The fix is intentionally narrow (one filter line in one test) and
documents the convention inline so future meta-directories under
design-systems/ (e.g. _archive/, _drafts/) are covered for free.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: chaoxiaoche <chaoxiaoche@192.168.10.16>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 22:23:34 +08:00
shangxinyu1
10802bb0b0
test: expand nightly UI and desktop regression coverage (#1256)
* e2e(ui): cover examples preview flows

* e2e(ui): cover Codex local CLI fallback UX

* test: expand desktop and connector regression coverage

* e2e(ui): cover workspace restoration flows

* e2e(ui): cover retry recovery workspace flow

* test: cover artifact and connector recovery flows

* e2e(ui): cover Continue in CLI stale provenance flow

* e2e(ui): cover BYOK model fetch caching

* test: expand Orbit and desktop connector coverage

* e2e(ui): cover workspace quick switcher recovery flows

* e2e(ui): cover connector pending authorization recovery

* e2e(ui): cover workspace and conversation restoration routes

* e2e(ui): cover conversation draft and attachment restoration

* e2e(ui): cover conversation history selection recovery

* e2e(ui): cover workspace surface conversation selection

* test: cover artifact presentation and orbit link behavior

* test: cover artifact external link restoration

* e2e(ui): cover root-route deep-link restoration

* e2e(specs): cover Orbit open-artifact desktop click

* e2e(specs): cover desktop artifact open link

* test: fix Orbit settings fixture type drift

* test: split Playwright critical and extended suites

* test: fix ProjectView design template fixtures

* ci: split workspace test stages

* guard: allow split Playwright suite scripts

* test: shrink Playwright critical suite

* test: restore omitted Playwright suites
2026-05-11 19:23:13 +08:00
PerishFire
8c0fb8dc01
feat(tools-pr): add maintainer PR-duty workspace (#1259)
* feat(tools-pr): add maintainer PR-duty workspace

Adds `tools/pr` as the maintainer-only control plane for PR-duty work on
this repo. Thin `gh` wrapper that encodes repo-specific knowledge:
review lanes, forbidden surfaces, lane-specific checklists, validation
command derivation from touched packages.

Subcommands:
- `list` — triage open queue by lane and review-state bucket.
- `view <num>` — agent-friendly review brief for a single PR.
- `classify [num]` — emit script-level tags for one PR or the whole
  open queue; full-queue JSON output lands under `.tmp/tools-pr/classify/`
  with rate-limit telemetry per run.
- `assignment` — assigner-perspective view of PR ownership, idle time,
  and blockers (derived from existing tags; no new judgments).

Tag dictionary (13 tags) covers: bot-only-approval, needs-rebase,
forbidden-surface, unlabeled, duplicate-title, non-ascii-slug,
maintainer-edits-disabled, org-member, unresolved-changes-requested,
stale-approval, and three awaiting-* timing tags. Each rule is
expressible as one factual sentence over `gh` data + repo paths — see
`tools/pr/AGENTS.md` for the full dictionary plus precision rules.

Templates in `tools/pr/templates/*.md` are aesthetic references for
recurring maintainer comments (duplicate-title ask, awaiting-author
nudge, agent-review brief shape). `templates/examples/` holds
frozen-in-time agent-review snapshots for three PR shapes.

Infrastructure:
- `gh()` wraps `execFile` with minimum-touch retry (2 attempts at 1s + 2s
  backoff) on transient 5xx / network errors. Persistent failures still
  surface — retry is anti-jitter, not an exponential-backoff resilience
  layer.
- Heavy chunks (`reviews`, `comments`, `commits`, assignment timelines)
  use cursor-paginated `gh api graphql` via `fetchPaginatedPrList` to
  stay under GitHub's GraphQL server-side timeout. Light chunks stay on
  `gh pr list --json`.
- `fetchOrgMembers` cached per process via `gh api orgs/<owner>/members
  --paginate`.

Wiring:
- Root `package.json` adds `pnpm tools-pr` to the allowed root entry
  points.
- `scripts/postinstall.mjs` builds `tools/pr` alongside other workspace
  packages.
- `scripts/guard.ts` allowlists `tools/pr/bin/tools-pr.mjs` and
  `tools/pr/esbuild.config.mjs`, and adds `pr/` to the `tools/` top-level
  layout allowlist.
- Root `AGENTS.md` and `tools/AGENTS.md` document the new command
  surface, root-command-boundary update, and per-tool ownership.

* docs(agents): brief tools-pr in root AGENTS.md, link to tools/pr/AGENTS.md

Adds a `PR-duty tooling` section to the root AGENTS.md summarising what
`pnpm tools-pr` is, listing the four common subcommands (list / view /
classify / assignment), and pointing readers to `tools/pr/AGENTS.md` for
the full tag dictionary, operational playbook, templates, and design
rules. The section keeps root-level guidance to high-level orientation
while details stay local to the tool's own AGENTS.md.

* fix(tools-pr): drop overly broad touches-root-package.json forbidden hit

`deriveForbidden` was flagging any change to root `package.json` as a
forbidden-surface hit, but AGENTS.md §Root command boundary only forbids
specific *lifecycle* aliases (pnpm dev / test / build / daemon / preview
/ start) — tools-control-plane entrypoints like `pnpm tools-pr` are
explicitly allowed. Distinguishing "forbidden alias" from "allowed
entry" requires reading the diff content, which is `pnpm guard`'s job
rather than a path-derived classify tag.

Dogfooded on this branch's own PR (#1259), which added the `pnpm
tools-pr` script and was incorrectly flagged. Removing the hit aligns
the `forbidden-surface` tag with what tools-pr can mechanically detect
from file paths alone (apps/nextjs/, packages/shared/).

* fix(tools-pr): paginate commits fetch, recognise ready-to-merge, escape title-index separator

Three review follow-ups on #1259, all factual fixes:

- `fetchOpenPrCommits` now uses `fetchPaginatedPrList` instead of a
  one-shot `pullRequests(first: $first)` query. GitHub GraphQL caps
  connection page size at 100, so the previous implementation would
  fail at runtime when callers passed `--limit > 100`. The paginated
  path makes the commits fetch consistent with the other heavy chunks
  (reviews, comments, assignment timelines) and removes the artificial
  ceiling entirely. The `limit` parameter is dropped from
  `fetchOpenPrCommits`; the CLI `--limit` continues to bound the
  `gh pr list --json` chunks.
- `deriveStatus` in `assignment.ts` now reads `facts.reviewDecision`
  and `facts.mergeStateStatus`. When the PR is `APPROVED` with merge
  state `CLEAN` or `UNSTABLE` and carries no blockers, status renders
  as `ready to merge` instead of falling through to `in review`. The
  assignment view loses its main triage signal without this — a clean
  human-approved PR rendered identical to a REVIEW_REQUIRED one.
- `tags.ts:tagDuplicateTitle` and `tags.ts:buildContext` both
  constructed the title-index key with a literal NUL byte between
  author and title, which made the file appear as binary in `git diff`
  / review tooling. Replaced the literal byte with a Unicode escape
  sequence in source; the runtime string value is identical, the
  source stays plain text and round-trips through review tooling
  cleanly.

* fix(tools-pr): raise default --limit to 1000 to cover the live open queue

mrcfps flagged that `tools-pr list` (and `classify --all`, `assignment`)
defaults to `--limit 100`, which silently drops every PR past the first
100 in the open queue. The repo currently sits at 104 open PRs, so the
out-of-the-box run was already omitting four PRs.

Raise the default to 1000 in `list.ts`, `classify.ts`, and `assignment.ts`,
and remove the now-pointless 200 ceiling — `gh pr list --limit N` paginates
internally, so a high cap is cheap. Users can still pass `--limit <small>`
for a truncated preview. CLI help text on the three subcommands updated to
match.

* fix(web): pass designTemplates to ProjectView render helper

#955 made `designTemplates` a required Prop on ProjectView, but the test
helper added in #1244 (`renderProjectView` in
`ProjectView.api-empty-response.test.tsx`) was never updated. The two
PRs landed on main without conflicting, leaving `apps/web` typecheck red
for every PR that rebases past b5eb8c16.

Pass `designTemplates={[] as SkillSummary[]}` alongside the existing
`skills={[] as SkillSummary[]}` so the helper compiles. The component
already treats the array shape (empty included) as a no-op fallback in
the empty-response paths the test exercises.

* fix(tools-pr): correct author signal + merge inline review comments

Two correctness gaps in the awaiting-* signal pipeline surfaced during
review of the new tools-pr commands:

1. `authorSignalAt` iterated every PR commit unconditionally. On
   `maintainerCanModify=true` PRs a maintainer's follow-up push would
   advance the author timestamp, masking a stalled author response.
   Filter commits to those whose `authorLogin` matches `facts.author`,
   mirroring the same filter already applied to comments.

2. `fetchOpenPrComments` (and `fetchView`) only fetched
   `pullRequest.comments` / `gh pr view --json comments`, which is the
   issue-conversation thread. Inline review-thread replies — where
   authors and reviewers actually exchange most fix-up replies — live in
   `reviewThreads.comments` / REST `pulls/{n}/comments`. Missing them let
   `humanReviewerSignalAt` / `authorSignalAt` and the `view` brief point
   at the wrong side after someone replied inline. Extend the list-mode
   GraphQL to also sweep `reviewThreads(last: 20).comments(first: 20)`,
   and add a parallel REST inline-comments fetch in `fetchView` that
   merges into `GhView.comments`.
2026-05-11 19:17:21 +08:00
Tom Huang
b5eb8c1647
feat: generic skills + split skills/design-templates + finalize-design API (#955)
* feat: general-purpose skills with @-mention composition and user import

Lift skills from "one mode-bound skill per project" to a generic capability
the user can compose per turn:

- Daemon: scan multiple skill roots (user-skills under runtime data, then
  the bundled `skills/`); user-imported skills can shadow built-ins by id.
- New `POST /api/skills/import` and `DELETE /api/skills/:id` endpoints,
  with CONFLICT/BAD_REQUEST/NOT_FOUND error codes and built-in delete
  protection.
- ChatRequest gains `skillIds: string[]`; the chat run concatenates each
  picked skill's body (and merges craftRequires) into the system prompt
  for that turn only — the project's persistent `skillId` is untouched.
- Web composer: `@` popover now lists skills alongside project files;
  picks render as removable chips above the textarea and ride along with
  the request as `skillIds`.
- Settings → Library: import form (name/description/triggers/body),
  per-card delete for user skills, "user" origin badge.

* chore(web): drop welcome pet teaser + add ds→prompt-template mapping util

- SettingsDialog: remove the inline pet adoption teaser from the welcome
  panel so the first-run modal stays focused on configuration.
- New `inferPromptTemplateCategoriesForDs(ds)` helper that maps a design
  system's authored metadata to prompt-template gallery categories.
  Imported by the design-system gallery wiring on a sibling branch; no
  callers in this branch yet.

* feat: split skills/design-templates and add finalize-design API

Phase 0 of the skills/design-templates refactor (specs/current/
skills-and-design-templates.md):

- Move ~104 rendering catalogue entries from skills/ to design-templates/
  and keep skills/ for the small set of functional skills that *do work*
  on user input (utilities, briefs, packagers).
- Add design-templates/AGENTS.md and skills/AGENTS.md describing the
  contract, and a brand-agnostic craft/ surface for opt-in craft rules.
- Daemon: add DESIGN_TEMPLATES_DIR / USER_DESIGN_TEMPLATES_DIR roots and
  an /api/design-templates surface mirroring /api/skills. Asset/example
  routes still span both registries so existing srcdoc URLs keep
  resolving across the rename.
- Web: split LibrarySection into SkillsSection + DesignSystemsSection,
  rename the EntryView "Examples" tab to "Templates", and update locales
  + the New-project picker accordingly.

Adds the finalize-design endpoint:

- New apps/daemon/src/finalize-design.ts and packages/contracts/src/api/
  finalize.ts — one-shot synthesis of a project's transcript + active
  design system + current artifact into <projectDir>/DESIGN.md via the
  Anthropic Messages API. Per-project .finalize.lock mirrors the
  transcript-export hygiene from PR #493; provider credentials are not
  persisted by the daemon.

Other supporting changes:

- README + AGENTS.md updates to document the new directory split and
  craft/ surface, plus i18n strings across 13 locales.
- Test refactors and new coverage (finalize-design, runs, sidecar
  server, plus refreshed daemon integration tests).
- .gitignore: scope the *.exe ignore to /OpenDesign.exe so legitimate
  vendor binaries are no longer hidden.

* fix(merge): move clinical-case-report to design-templates/

Origin/main added the clinical-case-report skill under skills/ before
the skills/design-templates split landed. Its od.mode is prototype, so
per specs/current/skills-and-design-templates.md it is a design template
and belongs alongside the other rendering catalogue entries — not under
the slimmed-down functional skills/ root. Moving it keeps the EntryView
Templates tab consistent with origin/main's intent.

* feat(skills): curated design/creative catalogue + collapsible Settings rows

Seed ~100 curated design/creative skill stubs under skills/ sourced from
awesome-claude-skills (ComposioHQ) and awesome-agent-skills (VoltAgent).
Each stub carries an od.category tag so the new filter pill row in
Settings -> Skills can group them. The seed script
(scripts/seed-curated-design-skills.ts, pnpm seed:curated-design-skills)
is idempotent: it only creates folders that don't already exist, so
hand-edited stubs are never overwritten.

- Daemon: parse and surface od.category on SkillInfo with a strict slug
  normaliser; mirror the field on SkillSummary in @open-design/contracts.
  Category is purely a UI hint — system-prompt composition is unchanged.
- Web: rewrite SkillsSection from a left-list / right-detail grid into a
  vertical stack of collapsible rows mirroring the External MCP panel
  (header always visible with name + mode/source/category pills + per-row
  enable toggle; SKILL.md preview, file tree and inline edit form expand
  on demand). Add a Category filter row above the list. Reorder Settings
  nav so Skills + External MCP sit above the Composio/MCP cluster. Update
  composer placeholder/hint across 17 locales to advertise '@ files or
  skills · / for commands'.
- Docs: extend skills/AGENTS.md with the curated catalogue rules
  (idempotency, category vocabulary, no upstream vendoring).

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(skills): teach localized-content + system-prompt tests about the skills/design-templates split

mrcfps blocking review on PR #955: the skills/design-templates split
(b5993385) moved ~110 SKILL.md entries out of `skills/` and into
`design-templates/`, but two repo-level tests still hard-coded the
single-root layout, so CI gates went red on the merged branch:

- `e2e/tests/localized-content.test.ts` only scanned `<repo>/skills`
  while the locale `skillCopy` map keeps id-keyed entries spanning
  both roots (ExamplesTab/Templates uses one lookup regardless of
  origin). Teach the helper to read both `skills/` and
  `design-templates/`, deduplicating ids so the union matches the
  localized claim.
- `apps/daemon/tests/prompts/system.test.ts` read
  `skills/live-artifact/SKILL.md`, which now lives under
  `design-templates/live-artifact/`. Update the absolute path so
  composeSystemPrompt's coverage of the live-artifact preamble is
  exercised again.

Also enroll the curated design/creative catalogue (PR #955, ~91
stubs sourced from awesome-claude-skills / awesome-agent-skills) in
the DE / FR / RU `_SKILL_IDS_WITH_EN_FALLBACK` lists. The stubs are
English-only by design (frontmatter advertises an upstream URL); the
fallback list is exactly the place to acknowledge "we know this id
exists, English copy is fine here" so the localized-content coverage
gate passes without forcing a translation task per locale.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): always quote frontmatter name so importUserSkill round-trips numeric / boolean ids

mrcfps PR #955 review: `buildSkillMarkdown` emitted `name:
${escapeYamlString(name)}` without quotes, so YAML coerced names
like `123`, `true`, `false`, or `null` into non-string scalars on
re-parse. listSkills() then read `data.name` as a number/boolean
and the import flow's follow-up `findSkillById(skills, result.id)`
missed it, falling into `/api/skills/import`'s "imported skill
could not be re-read" 500 path for those ids.

Switch the emitter to a quoted scalar (`name: "..."`) — the
double-escape already in `escapeYamlString` makes the quoted form
safe — and add a round-trip test covering `123`, `true`, `false`,
`null`, and `0` to lock in the contract.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(web): drop staged-skill chips when the matching @<id> token leaves the draft

mrcfps PR #955 review: `submit()` always forwarded every id in
`stagedSkills`, but that state was only mutated on picker click and
chip removal. Hand-deleting an `@<id>` token from the textarea left
the chip staged, so the request still carried `skillIds: [<id>]` and
the daemon composed a skill the prompt no longer referenced.

Sync the chips with the draft inside `handleChange()` by pruning
`stagedSkills` whenever the new value no longer contains the
`@<id>` token (using the same whitespace boundary as
`removeStagedSkill`'s strip regex). Comment explains why this
prune does not run for `staged` file attachments — users frequently
add files via the upload button without leaving an `@<path>` token,
so a symmetric prune there would erase legitimate uploads.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(daemon): stage @-composed skills' side files alongside the active skill

codex PR #955 review: composing a per-turn `@`-picked skill into the
system prompt appended its body (with the `withSkillRootPreamble`
guidance pointing at relative paths under `<cwd>/.od-skills/<folder>/`)
but never staged the actual folder. `startChatRun` only copied
`activeSkillDir`, so when the project's primary skill was different
(or absent) the composed skill's references/, examples/, and scripts/
files lived only at their absolute repo path — agents that honour
the cwd-relative form (or that don't get `--add-dir`, e.g. Codex with
allowlisted gpt-image projects) couldn't reach them.

Thread the composed skills' dirs out of `composeDaemonSystemPrompt`
as `extraSkillDirs` and stage each one through the same
`stageActiveSkill` API used for the primary skill. Dedupe by folder
basename so a project whose primary skill is also `@`-composed isn't
copied twice. Each preamble already advertises its own folder, so the
prompt and the staged tree stay aligned without further changes.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(web): respect the Library disable toggle in the project @-mention picker

codex PR #955 review: only `EntryView` received `enabledSkills`
(filtered against `config.disabledSkills`); active projects still
got `skills={skills}` raw, so a skill the user disabled in Settings
kept appearing in the project's `@`-mention popover and could ride
along to the daemon via `skillIds`. That broke the Library toggle
for any project opened on the post-split branch.

Compute a functional-skills-only enabled subset
(`enabledFunctionalSkills`) and pass it into `<ProjectView>` instead.
Templates stay separate — design-templates are filtered through their
own `enabledDesignTemplates` memo for the Templates gallery — so
ProjectView's chat composer still only sees skills, never templates,
matching the pre-split prop surface.

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(e2e): mock /api/design-templates for example-use-prompt flow

The Templates tab in EntryView fetches from /api/design-templates after
the skills/design-templates split (specs/current/skills-and-design-templates.md).
The example-use-prompt Playwright scenario only mocked /api/skills, so the
gallery card never appeared and the test timed out waiting on
example-card-warm-utility-example. Serve the same fixture summary on both
endpoints so the templates gallery renders the card the test clicks.

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(tools-pack): create design-templates fixture for resources test

The packaging resources copy now bundles the new design-templates tree
alongside skills (see resources.ts BUNDLED_RESOURCE_TREES). The
copyBundledResourceTrees fixture only created skills, design-systems,
craft, etc., so the recursive copy crashed with ENOENT on
design-templates before it could check the prompt-templates assertion.
Add the missing fixture directory so the test exercises the same set
of resource trees the packaged build does.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): clone built-in side files into the shadow on first edit

mrcfps PR #955 review: editing a built-in skill wrote a USER_SKILLS_DIR
shadow folder that contained only a new SKILL.md. The next listSkills()
pass surfaced the shadow as the active dir, but every side-file resolver
(/api/skills/:id/files, /example, /assets/*, the system-prompt preamble,
and the per-turn cwd staging) reads through skill.dir. With nothing but
SKILL.md in the shadow, the bundled assets/, references/, scripts/, and
examples/ disappeared the moment the user hit save — a built-in like
last30days or live-artifact would break immediately after edit instead
of just having its body overridden.

Teach updateUserSkill() to take a `sourceDir` and clone every entry
except SKILL.md / dotfiles into the shadow on the very first edit. The
shadow stays self-contained, so all the resolvers keep working without
fallback bookkeeping. Subsequent edits detect the existing shadow and
skip the clone, so user tweaks under the side tree survive a re-save.

Wire `sourceDir: skill.dir` from server.ts's PUT /api/skills/:id handler
and add two regression tests:
- 'clones built-in side files into the shadow on the first edit' walks
  the file tree after save and asserts assets/template.html, references/
  notes.md, and scripts/helper.sh all round-trip from the built-in.
- 'preserves user-edited side files on subsequent edits' edits the
  staged assets/template.html, re-saves, and confirms the user content
  is still there.

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(e2e): rename home tab from Examples to Templates

The Examples tab was renamed to Templates in EntryView (b5993385's
skills/design-templates split — entry.tabExamples became entry.tabTemplates
and the tab value moved from 'examples' to 'templates'), but
entry-chrome-flows still asserted the old label and testId. Update both.

* fix(skills+web): preserve template body in API mode and dir-based skill delete

Two follow-ups from PR #955 review:

1. ProjectView only received `enabledFunctionalSkills`, but
   `composedSystemPrompt()` still resolved `project.skillId` through that
   prop and `fetchSkill()`. Projects created from the new
   `/api/design-templates` surface keep a template id in `project.skillId`,
   so opening one in API mode dropped the template body from the system
   prompt and the upstream request ran without the project's primary
   template instructions. Now ProjectView takes a separate
   `designTemplates` prop (the unfiltered template list, so a
   later-disabled template still loads for projects already created from
   it) and `composedSystemPrompt()` plus the metadata / `isDeck` lookups
   fall back to that list, with `fetchDesignTemplate()` as the body-fetch
   fallback to `fetchSkill()`. The chat composer's `@`-picker keeps
   receiving only the enabled functional skills.

2. `DELETE /api/skills/:id` used `deleteUserSkill(USER_SKILLS_DIR, skill.id)`
   which re-slugified the frontmatter id and removed
   `<userSkillsDir>/<slug>/`. That matched the import shape but missed the
   install shape — `installFromTarget` writes the folder at
   `sanitizeRepoName(url)` (GitHub) or `path.basename(realpath)` (local
   symlink), neither of which is guaranteed to equal the slugified
   frontmatter `name`. A duplicate `app.delete('/api/skills/:id', ...)`
   handler at the install routes never fired because Express resolved the
   earlier registration first, leaving the install/uninstall path without
   working teardown. The handler now removes `skill.dir` (the absolute
   path listSkills already discovered) under a USER_SKILLS_DIR safety
   check, using `lstat` + `unlinkSync` so symlinked local installs unlink
   cleanly without recursing into the user's source tree. The dead
   duplicate handler is removed; `deleteUserSkill` is dropped from the
   server.ts import set (still exported and unit-tested in skills.ts).
   Regression coverage in `apps/daemon/tests/skills-delete-route.test.ts`
   pins both shapes plus the symlink-preserves-source case.

* test(daemon): point hyperframes system-prompt test at design-templates

The merge with main brought in a hyperframes system-prompt test that
reads `skills/hyperframes/SKILL.md`, but this branch's split moved
`hyperframes` into `design-templates/` (same migration as `live-artifact`
already handled above in this file). CI was failing with ENOENT on the
old path.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 17:48:34 +08:00
PerishFire
976edaf38e
test: harden e2e smoke and release reports (#1140)
* test: harden e2e inspect specs

* test: wire e2e release reports

* chore: bump packaged beta base to 0.6.1

* test: run release smoke vitest directly

* test: add suite-owned tools-dev lifecycle

* ci: harden stable release packaging

* fix(release,e2e): gate stable signing on verify and harden suite cleanup

- restore `needs: [metadata, verify]` on the stable release `build_mac`,
  `build_mac_intel`, `build_win`, and `build_linux` jobs so Apple
  signing/notarization and Windows release builds cannot run before
  pnpm guard, typecheck, and layout checks complete on the metadata commit.
- in `runToolsDevSuite`, drop the `started` flag and always attempt
  `stopToolsDevWeb` in `finally`; record stop errors in diagnostics, and
  when the test body succeeded, escalate the stop failure to the suite
  result and rethrow — so orphan daemon/web processes from an interrupted
  `startToolsDevWeb` or a broken shutdown can no longer pass silently.

Addresses PR #1140 review feedback from lefarcen and mrcfps.
2026-05-11 13:11:16 +08:00
Cursor Agent
bf30b308e3
feat(deploy): docker-compose + Helm chart entry slice (Phase 5)
Plan J4 / spec §15.4 / §15.5 / §16 Phase 5.

Three landings:

- deploy/Dockerfile now COPYs plugins/_official/ into the image so
  the bundled atom plugins from §3.I3 register on container boot —
  without this, registerBundledPlugins() silently no-ops inside the
  container and the §23 self-bootstrap promise breaks for hosted
  deployments.

- tools/pack/docker-compose.yml ships the canonical hosted-mode
  manifest spec §15.4 calls out: two-volume layout (od-data +
  od-config) per §15.2, OD_BIND_HOST=0.0.0.0 + OD_API_TOKEN +
  OD_NAMESPACE + snapshot retention knobs as env, /api/daemon/status
  as the healthcheck endpoint (Phase 1.5). Drop-in usable with
  `docker compose -f tools/pack/docker-compose.yml up -d`.

- tools/pack/helm/open-design/{Chart.yaml,values.yaml,README.md}
  pins the Helm chart parameter surface for the per-cloud overrides
  spec §15.5 enumerates (AWS / GCP / Azure / Aliyun / Tencent /
  Huawei / self-hosted). Templates land in the Phase 5 follow-up
  PR; the values schema is locked here so the per-cloud override
  files (values-<cloud>.yaml) review in isolation.

scripts/guard.ts allowlist gains
`packages/agui-adapter/esbuild.config.mjs` so the new package
passes the residual-JS guard.

Daemon tests stay at 1486/1486 (deploy artifacts only).

Co-authored-by: Tom Huang <1043269994@qq.com>
2026-05-09 13:20:53 +00:00
pftom
4c7cd5d9f2 feat(plugins): introduce plugin system with installation and management capabilities
- Added support for a new plugin system, allowing users to install, uninstall, and manage plugins through the daemon.
- Implemented API endpoints for listing installed plugins, retrieving plugin details, and applying plugins with input validation.
- Introduced a plugin doctor feature to validate plugin manifests and check for issues before application.
- Established a plugin persistence layer with SQLite migrations for managing installed plugins and their metadata.
- Enhanced the CLI with commands for plugin operations, improving user interaction with the plugin ecosystem.
2026-05-09 18:24:44 +08:00
Tom Huang
2df8b775ec
feat(skills): add 32 zhangzara HTML deck templates (#704)
* feat(skills): add 32 zhangzara HTML deck templates

Vendored from upstream MIT-licensed
zarazhangrui/beautiful-html-templates — one Open Design skill per template
(name prefix `html-ppt-zhangzara-`) so each template surfaces as its own
entry in the Examples panel and renders its own preview.

Each skill ships:
- SKILL.md (frontmatter + workflow), description, triggers, and
  od.upstream pointing at the source folder
- example.html (the self-contained deck; daemon's preview route looks
  for <skillDir>/example.html)
- template.json (upstream metadata snapshot, with `slug` re-prefixed to
  `zhangzara-<base>` and a `source` URL)
- assets/deck-stage.js / assets/styles.css for the 8 templates that
  ship a runtime; HTML refs rewritten so the daemon's iframe URL
  rewriter resolves them through /api/skills/<id>/assets/

scripts/guard.ts allowlist updated with the `html-ppt-zhangzara-` prefix
so the vendored upstream JS runtimes pass the residual-JS check.

* fix(skills, i18n): address PR #704 review feedback

- Add the 32 new html-ppt-zhangzara-* skill ids to the de/ru/fr
  SKILL_IDS_WITH_EN_FALLBACK arrays so the localized-content
  coverage e2e test passes. The vendored upstream templates are
  English-only; falling back to the upstream English description
  is the right semantic for this batch.
- Also add the pre-existing social-media-dashboard skill and
  totality-festival design system to the same fallback arrays
  (introduced in #678 without i18n coverage). Tagged with TODOs
  so localized copy can land in a follow-up.
- Ship the upstream MIT LICENSE file in each
  skills/html-ppt-zhangzara-*/ folder so the copyright/permission
  notice travels with the vendored copy, as MIT requires for
  redistributing substantial portions. Update each SKILL.md's
  Source section to reference the bundled LICENSE.
- For the 8 runtime-backed templates (creative-mode,
  editorial-tri-tone, neo-grid-bold, peoples-platform,
  pin-and-paper, pink-script, soft-editorial, stencil-tablet),
  expand the workflow's clone step to instruct the agent to copy
  the assets/ folder alongside example.html — the skill HTML
  references assets/deck-stage.js (and assets/styles.css for
  pin-and-paper) as project-local paths, so cloning the HTML
  alone produces an artifact whose runtime 404s.

Verified locally:
- pnpm guard passes.
- pnpm --filter @open-design/web typecheck passes.
- pnpm --filter @open-design/web test passes (309/309).
- pnpm --filter @open-design/e2e test passes (6/6 active,
  including localized-content coverage for de/ru/fr).

* fix(i18n): drop duplicate totality-festival fallback after merge with main

Main already added 'totality-festival' to the design-system EN-fallback
lists; the TODO entry from this branch became a duplicate after merge.

* fix(skills, guard): address PR #704 follow-up review

- Pin Chart.js CDN to 4.4.7 in coral and cartesian example.html so
  vendored decks no longer track the latest jsDelivr major.
- Narrow scripts/guard.ts zhangzara allowlist to a regex that only
  permits skills/html-ppt-zhangzara-*/assets/deck-stage.js, restoring
  the TypeScript-first guard for any other JS under those skill dirs.
- Reconcile slide_count and 'Slides in demo' with actual <section
  class="slide"> counts: broadside 20 -> 16, monochrome 18 -> 16,
  neo-grid-bold 13 -> 12.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(daemon): keep resolveDataDir return path stable, canonicalize at compare site

The realpathSync wrapper inside resolveDataDir was rewriting every
/var/... result to /private/var/... on macOS, which broke 11 hermetic
assertions in tests/resolve-data-dir.test.ts (absolute paths, relative
paths, and \$HOME / \${HOME} / ~ expansions whose mkdtempSync roots live
under /var/folders/...). It also changed the public OD_DATA_DIR
resolution contract for any downstream caller that compared against the
expanded user-supplied path.

Restore resolveDataDir to return the expanded resolved path unchanged,
and introduce RUNTIME_DATA_DIR_CANONICAL — a one-shot realpath of
RUNTIME_DATA_DIR — used only at the narrow folder-import comparison
site that needs to match against a user-supplied realpath() result. The
import-path symlink protection from #624 still works (a /var-rooted
data dir now compares against its /private/var canonical form), while
resolveDataDir keeps its stable, user-shaped contract.

Verified locally: pnpm --filter @open-design/daemon test (1083/1083),
including all 12 resolve-data-dir.test.ts cases.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-08 12:02:59 +08:00
Tom Huang
56bf6ee1b6
feat: agent-callable research command and /search (#615)
* feat: pre-generation research (Tavily) for grounded generation

Adds an optional pre-generation research step so the agent can produce
slides / prototypes / decks grounded in real sources instead of guessing.

User flow:
  1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY).
  2. Click the new Research button in the chat composer.
  3. On send, the daemon runs a Tavily search, prepends the findings
     as a <research_context> block ahead of the system prompt, and
     spawns the agent. Research progress shows up as status pills in
     the chat stream; the agent cites sources inline as [1]/[2]/...

Phase 1 surface:
  - Single provider (Tavily), single depth ('shallow'), no LLM
    synthesis pass (Tavily's `answer` is the summary).
  - Composer toggle only; no popover / depth picker yet.
  - Reuses the existing `status` SSE agent payload + StatusPill UI
    so no new event variants or renderer code are needed.

Layers touched:
  - contracts: ResearchOptions / Source / Findings DTOs;
    ChatRequest.research; export from index.
  - daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator
    + provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook
    in startChatRun before prompt assembly.
  - web: ChatComposer toggle + ChatSendMeta; threaded through
    ChatPane / ProjectView / streamViaDaemon into ChatRequest.

Side fix (required to land the feature, but useful on its own):
  contracts internal relative imports lacked the `.js` suffix that
  NodeNext module resolution requires. This was already breaking
  `pnpm --filter @open-design/daemon typecheck` on main; without the
  fix, none of the new research types were visible to the daemon.
  All internal contracts imports now carry `.js`.

Spec: specs/current/research-feature.md (phases 2-4 outlined for
follow-up: composer popover, multi-provider, deep recursion, example
skills with research_recommends).

Verified:
  - pnpm --filter @open-design/contracts typecheck/test
  - pnpm --filter @open-design/daemon typecheck (the chokidar
    project-watchers test is a pre-existing flake, unrelated)
  - pnpm --filter @open-design/web typecheck
  - node scripts/verify-media-models.mjs

* fix(daemon): clamp Tavily max_results to 20

Tavily's /search endpoint requires `max_results` in [0, 20]; sending a
larger value (e.g. when `research.depth: "deep"` resolves to 30) returns
400 and `runResearch` silently falls back to no-research. Clamp at the
provider boundary so Phase 2 depth tiers above 20 still produce results
instead of failing the request.

Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)

* Remove stale research merge leftovers

* Add agent-callable research search

* Fix Indonesian locale typecheck

* Fix research command invocation edge cases

* Harden slash search prompt expansion

* Honor research source caps in command contract

* Require search reports in design files

* Add research data provider settings

* Wire web research provider fallback order

* Update research provider fallback wording

* Revert "Update research provider fallback wording"

This reverts commit 86fb6001e3.

* Revert "Wire web research provider fallback order"

This reverts commit 4c9e16036b.

* Revert "Add research data provider settings"

This reverts commit 23630d1746.

* Add Dexter and Last30Days research skills

* Add DCF and Last30Days OD skills

* Add Last30Days and Dexter skills

* Resolve research review threads

---------

Co-authored-by: a1chzt <chizblank@gmail.com>
2026-05-08 10:33:44 +08:00
PerishFire
cb92c93ae0
Migrate beta release publishing to R2 (#805)
* Prebundle standalone web packaged runtime

* Harden mac standalone prebundle policy

* Prebundle mac daemon packaged runtime

* Prune mac Electron locales

* Maximize mac release artifact compression

* Publish beta mac artifacts to R2

* Use remote R2 uploads for beta releases

* Fail fast on beta R2 access issues

* Use S3-compatible uploads for beta R2 releases

* Decouple beta versioning from GitHub releases

* Remove legacy beta metadata source

* Address release beta review notes
2026-05-07 19:13:52 +08:00
PerishFire
6efac8887e
Improve Windows beta packaging and installer flow (#768)
* Optimize Windows packaged web output

* Fix packaged contracts runtime build

* Optimize Windows packaged size pruning

* Prune Windows root Next payload

* Remove Windows bundled Node runtime

* Prune Windows standalone duplicate Next

* Add tools-pack cache foundation

* Cache Windows packaged build layers

* Cache Windows workspace builds

* Cache Electron-ready Windows app

* Split Windows tools-pack module

* Cache Windows dir build outputs

* Split Windows pack build modules

* Document Windows NSIS smoke namespace limits

* Move Windows NSIS smoke note to agents guide

* Optimize Windows beta packaging

* Bump packaged beta base version

* Improve Windows installer namespace UX

* Improve Windows tools-pack cache keys

* Stabilize Windows beta cache version keys

* Cache Windows workspace build outputs

* Optimize windows release beta cache layers

* Cache windows release dependencies

* Trim windows release cache before save

* Refresh windows tools-pack cache key

* Improve windows installer preflight prompts

* Fallback NSIS installer strings to English

* Fix Windows installer cleanup and preflight

* Improve Windows NSIS state logging

* Fix system NSIS Persian language alias

* Use long-path removal for Windows uninstall

* Fix mac tools-pack tests on Windows

* Address Windows packaging review feedback

* Fix Windows installer cache namespace isolation

* Include web output mode in Windows tarball cache key

* Use unique Windows release cache save keys
2026-05-07 16:44:15 +08:00
PerishFire
f1cdb2844a
test(e2e): gate beta packaged runtime (#637)
* test(e2e): gate beta mac packaged runtime

* test(e2e): separate ui automation layout

* test(e2e): move localized content coverage

* chore(release): prepare packaged 0.4.1 beta validation

* test(e2e): keep ui lane playwright-only

* fix(web): keep chat recoverable after conversation load failure

* fix(desktop): honor native mac quit
2026-05-06 17:44:29 +08:00
iulian
14a73d948b
Fix packaged contracts runtime exports (#577)
* fix packaged contracts runtime exports

* fix(packaging): prepare contracts runtime exports on install
2026-05-06 09:11:35 +08:00
PerishFire
bbdd4e84b5
chore: enforce test directory conventions (#496)
* chore: enforce test directory conventions

Move package, app, and tool tests out of src and add guard enforcement so source directories stay source-only.

* ci: use guard and package-scoped tests

Run the new repository guard in CI and keep test execution aligned with package-scoped commands after removing root aliases.

* ci: align stable release guard check

Use the new repository guard in stable release verification after replacing the residual-JS-only script.

* chore: tighten test layout enforcement

Enforce sibling tests directories, typecheck moved test suites with dedicated configs, and refresh remaining guidance that pointed at src-based tests.

* chore: clarify no-emit test tsconfigs

Explicitly disable declaration-only emit in test tsconfigs so review tooling sees they are no-emit typecheck configs.
2026-05-05 15:34:22 +08:00