mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
* feat(daemon): make design-system token channel default-on (PR-D)
Flip `OD_DESIGN_TOKEN_CHANNEL` from default-off to default-on. Every
chat that picks a brand with `tokens.css` + `components.html` siblings
(today: `default`, `kami`) now gets the structured token contract
appended to the system prompt automatically. `OD_DESIGN_TOKEN_CHANNEL=0`
keeps the DESIGN.md-only path as a kill switch.
Adds `scripts/check-design-system-flag-parity.ts`, registered in
`pnpm guard`. The guard walks every brand and asserts:
- 147 prose-only brands produce byte-identical prompts under flag-off
vs flag-on (PR-D's "no-op for legacy brands" promise)
- 2 structured brands diverge as expected (catches a future regression
that silently dropped the structured blocks)
Smoke evidence on #1385 (PR-C):
- `default` — 10/10 brand tokens used byte-for-byte in treatment vs
0/10 invented colors in control
- `kami` — treatment recovers brand name (`Kami · 纸`), the two-tier
surface (`--bg` parchment + `--surface` ivory), the CN font stack
override, and the `components.html` card pattern; control invented
"Replica" as a brand name
Co-authored-by: Cursor <cursoragent@cursor.com>
* review: address @nettee + @lefarcen feedback on parity guard
Two blocking findings from #1544 review:
1. @nettee — guard's inventory walk silently passed on unreadable
filesystem state. `fileExists` swallowed every `stat` error and the
bare `readdir` catch returned `[]` for any failure. A renamed
`design-systems/` tree, a permission-denied DESIGN.md, or a
directory at the brand path would have left `pnpm guard` happy
after checking 0 brands — exactly the silent misconfiguration this
guard exists to catch. Both error paths now treat only ENOENT /
ENOTDIR as absence and rethrow everything else, mirroring the
`readFileOptional` fix already applied to PR-C's
`apps/daemon/src/design-systems.ts`.
2. @nettee — guard exercised `composeSystemPrompt` directly, bypassing
the `process.env.OD_DESIGN_TOKEN_CHANNEL !== '0'` gate in server.ts
that PR-D actually flipped. A regression that restored `=== '1'`,
typo'd the env name, or stopped reading assets when the var is
unset would still leave the guard green. Extracted the predicate
into `isDesignTokenChannelEnabled(env)` next to
`readDesignSystemAssets` and added 6 unit tests pinning every value
that matters: unset / `'1'` / `'true'` / empty / `'0'` /
whitespace-padded. server.ts now calls the predicate. Any
regression on the env-flag semantics fails
`tests/design-system-assets.test.ts` independently of the
composer-level coverage.
Verified: pnpm guard (13/13), tsc -p scripts/tsconfig.json (clean),
@open-design/daemon typecheck (clean), 32/32 prompt + asset tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
* review: pin server-layer asset resolution end-to-end (lefarcen P2)
Round-2 review feedback from @lefarcen on #1544: the predicate suite
in tests/design-system-assets.test.ts pinned the env-flag boolean but
did NOT exercise the server prompt-assembly path that PR-D actually
flipped — the seam where the daemon decides whether to read tokens.css
/ components.html from disk and hand them to composeSystemPrompt. A
regression that, say, restored an inline `=== '1'` gate or stopped
calling isDesignTokenChannelEnabled() from server.ts would still leave
the predicate test green.
Extracted that whole seam into `resolveDesignSystemAssets(id,
builtInRoot, userInstalledRoot, env)` on apps/daemon/src/design-systems.ts.
The function combines:
1. the env-flag gate (kill switch on `OD_DESIGN_TOKEN_CHANNEL=0`)
2. the built-in → user-installed root fallback chain (per-file)
3. the DesignSystemAssets result shape consumed by composeSystemPrompt
server.ts at the prompt-assembly site is now a thin caller of this
function. The previous 13-line inline block (env check + per-file
fallback) collapses to one call, so the whole asset-resolution path
now has a single testable seam.
7 new tests in tests/design-system-assets.test.ts run the full pipeline
end-to-end against real disk fixtures:
- env unset (default-on): returns built-in assets
- env=`'0'` (kill switch): returns undefined even with files on disk
- env=`'1'` (legacy opt-in): still works
- mixed builtin/user-installed: per-file fallback merges correctly
- both halves built-in: skips user-installed roundtrip verbatim
- prose-only brand (no files): undefined / undefined
- nonexistent brand directory: undefined / undefined
Verified: pnpm guard (13/13), tsc -p scripts/tsconfig.json (clean),
@open-design/daemon typecheck (clean), 39/39 prompt + asset tests
(was 32; +7 new server-layer-resolution tests).
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(test): add missing projectKind to FileViewer deck preview test
The deck preview test added in #1556 (086be271) renders <FileViewer/>
without `projectKind`, which became a required prop in #1509. CI on
main is currently red on this; pick up the trivial fix here so PR-D
can land cleanly.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: chaoxiaoche <chaoxiaoche@chaoxiaochedeMacBook-Pro.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
241 lines
10 KiB
TypeScript
241 lines
10 KiB
TypeScript
/* ─────────────────────────────────────────────────────────────────────────
|
|
* scripts/check-design-system-flag-parity.ts
|
|
*
|
|
* Regression guard for the design-system token channel rolled out in
|
|
* PR #1385 (the daemon path that injects `tokens.css` + `components.html`
|
|
* into the agent system prompt when those siblings exist alongside
|
|
* DESIGN.md). Two brands ship the structured form today (`default`,
|
|
* `kami`); the other ~138 brands are prose-only and rely on the legacy
|
|
* DESIGN.md-only behaviour.
|
|
*
|
|
* PR-D will flip the `OD_DESIGN_TOKEN_CHANNEL` env default from off to
|
|
* on. For prose-only brands, that flip MUST be a no-op: missing files
|
|
* resolve to `undefined`, undefined fields skip the new prompt blocks,
|
|
* and the composed system prompt should be byte-identical to today's
|
|
* output. This guard pins that contract end-to-end at every brand:
|
|
*
|
|
* 1. checkDesignSystemFlagParity
|
|
* For each prose-only brand, calling `composeSystemPrompt` with
|
|
* the new asset fields omitted (flag-off / no files) and with
|
|
* them set to the values `readDesignSystemAssets` actually
|
|
* returns (which is `{ undefined, undefined }` for these
|
|
* brands, i.e. the flag-on path with no files on disk) produces
|
|
* byte-identical strings.
|
|
*
|
|
* For structured brands, the same call pair MUST diverge — the
|
|
* flag-on path must contain the `tokens.css` and
|
|
* `components.html` blocks. A silent regression that reverted
|
|
* the composer to skip those blocks would be caught here.
|
|
*
|
|
* The guard mirrors the runtime path the daemon takes in
|
|
* `apps/daemon/src/server.ts` so the contract is exercised in the
|
|
* shape that ships, not a synthetic stand-in.
|
|
*
|
|
* Run standalone: `pnpm exec tsx scripts/check-design-system-flag-parity.ts`
|
|
* Or as part of `pnpm guard` (registered in scripts/guard.ts).
|
|
* ─────────────────────────────────────────────────────────────────── */
|
|
|
|
import { readdir, readFile, stat } from "node:fs/promises";
|
|
import path from "node:path";
|
|
import { fileURLToPath } from "node:url";
|
|
|
|
import {
|
|
readDesignSystemAssets,
|
|
type DesignSystemAssets,
|
|
} from "../apps/daemon/src/design-systems.ts";
|
|
import { composeSystemPrompt } from "../apps/daemon/src/prompts/system.ts";
|
|
|
|
const repoRoot = path.resolve(import.meta.dirname, "..");
|
|
const designSystemsRoot = path.join(repoRoot, "design-systems");
|
|
|
|
const SKIPPED_BRAND_DIRECTORIES = new Set(["_schema"]);
|
|
|
|
function toRepositoryPath(filePath: string): string {
|
|
return path.relative(repoRoot, filePath).split(path.sep).join("/");
|
|
}
|
|
|
|
type BrandSnapshot = {
|
|
id: string;
|
|
brandRoot: string;
|
|
designMdPath: string;
|
|
designMd: string;
|
|
title: string;
|
|
assets: DesignSystemAssets;
|
|
isStructured: boolean;
|
|
};
|
|
|
|
async function fileExists(filePath: string): Promise<boolean> {
|
|
try {
|
|
const stats = await stat(filePath);
|
|
return stats.isFile();
|
|
} catch (err) {
|
|
// Only treat genuine "not there" failures as a clean false. Every
|
|
// other fs error (`EACCES`, `EPERM`, `EIO`, a directory at the
|
|
// file path, …) means the parity inventory cannot be read — and
|
|
// since this guard exists precisely to catch silent
|
|
// misconfigurations during the PR-D rollout, we must surface those
|
|
// loudly instead of treating them as "brand absent".
|
|
if (isAbsenceError(err)) return false;
|
|
throw err;
|
|
}
|
|
}
|
|
|
|
function isAbsenceError(err: unknown): boolean {
|
|
if (typeof err !== "object" || err === null) return false;
|
|
const code = (err as { code?: unknown }).code;
|
|
return code === "ENOENT" || code === "ENOTDIR";
|
|
}
|
|
|
|
function extractTitle(designMd: string, fallback: string): string {
|
|
const match = /^#\s+(.+?)\s*$/m.exec(designMd);
|
|
const raw = match?.[1] ?? fallback;
|
|
// Mirror cleanTitle() in design-systems.ts so the title we feed
|
|
// composeSystemPrompt matches the daemon's runtime title exactly.
|
|
return raw.replace(/^Design System (Inspired by|for)\s+/i, "").trim();
|
|
}
|
|
|
|
async function discoverBrandSnapshots(): Promise<BrandSnapshot[]> {
|
|
let entries;
|
|
try {
|
|
entries = await readdir(designSystemsRoot, { withFileTypes: true });
|
|
} catch (err) {
|
|
// Same absence-vs-real-error split as `fileExists` above. A
|
|
// missing `design-systems/` directory is a non-issue (some
|
|
// packaged distributions ship without it); every other readdir
|
|
// failure means the parity check cannot enumerate brands and
|
|
// must fail the guard loudly.
|
|
if (isAbsenceError(err)) return [];
|
|
throw err;
|
|
}
|
|
|
|
const snapshots: BrandSnapshot[] = [];
|
|
|
|
for (const entry of entries) {
|
|
if (!entry.isDirectory() && !entry.isSymbolicLink()) continue;
|
|
if (SKIPPED_BRAND_DIRECTORIES.has(entry.name)) continue;
|
|
|
|
const id = entry.name;
|
|
const brandRoot = path.join(designSystemsRoot, id);
|
|
const designMdPath = path.join(brandRoot, "DESIGN.md");
|
|
|
|
if (!(await fileExists(designMdPath))) continue;
|
|
|
|
const designMd = await readFile(designMdPath, "utf8");
|
|
const assets = await readDesignSystemAssets(designSystemsRoot, id);
|
|
const isStructured =
|
|
assets.tokensCss !== undefined || assets.fixtureHtml !== undefined;
|
|
|
|
snapshots.push({
|
|
id,
|
|
brandRoot,
|
|
designMdPath,
|
|
designMd,
|
|
title: extractTitle(designMd, id),
|
|
assets,
|
|
isStructured,
|
|
});
|
|
}
|
|
|
|
snapshots.sort((a, b) => a.id.localeCompare(b.id));
|
|
return snapshots;
|
|
}
|
|
|
|
function describeFirstDivergence(left: string, right: string): string {
|
|
const max = Math.min(left.length, right.length);
|
|
let index = 0;
|
|
while (index < max && left.charCodeAt(index) === right.charCodeAt(index)) {
|
|
index += 1;
|
|
}
|
|
// Show a small window around the first differing byte so the failure
|
|
// is actionable without dumping the full ~80 KB system prompt.
|
|
const window = 60;
|
|
const start = Math.max(0, index - 8);
|
|
return [
|
|
` first divergence at byte ${index} (lengths flag-off=${left.length}, flag-on=${right.length}):`,
|
|
` flag-off → …${JSON.stringify(left.slice(start, start + window))}…`,
|
|
` flag-on → …${JSON.stringify(right.slice(start, start + window))}…`,
|
|
].join("\n");
|
|
}
|
|
|
|
export async function checkDesignSystemFlagParity(): Promise<boolean> {
|
|
const snapshots = await discoverBrandSnapshots();
|
|
const violations: string[] = [];
|
|
|
|
let proseOnlyChecked = 0;
|
|
let structuredChecked = 0;
|
|
|
|
for (const brand of snapshots) {
|
|
// Flag-off (today's default): assets fields absent from the
|
|
// ComposeInput entirely. Equivalent to the legacy code path before
|
|
// PR #1385 introduced the channel.
|
|
const flagOff = composeSystemPrompt({
|
|
designSystemBody: brand.designMd,
|
|
designSystemTitle: brand.title,
|
|
});
|
|
|
|
// Flag-on: pass exactly what `readDesignSystemAssets` returned at
|
|
// the top of this loop. For prose-only brands that's
|
|
// `{ tokensCss: undefined, fixtureHtml: undefined }`, which the
|
|
// composer already skips; for structured brands it's the verbatim
|
|
// file contents.
|
|
const flagOn = composeSystemPrompt({
|
|
designSystemBody: brand.designMd,
|
|
designSystemTitle: brand.title,
|
|
designSystemTokensCss: brand.assets.tokensCss,
|
|
designSystemFixtureHtml: brand.assets.fixtureHtml,
|
|
});
|
|
|
|
if (brand.isStructured) {
|
|
structuredChecked += 1;
|
|
if (flagOn === flagOff) {
|
|
violations.push(
|
|
[
|
|
`[${brand.id}] structured brand produced byte-identical prompts under flag-off vs flag-on — the tokens.css / components.html injection is silently inert.`,
|
|
` ${toRepositoryPath(brand.brandRoot)} ships ${brand.assets.tokensCss !== undefined ? "tokens.css" : ""}${brand.assets.tokensCss !== undefined && brand.assets.fixtureHtml !== undefined ? " + " : ""}${brand.assets.fixtureHtml !== undefined ? "components.html" : ""} but the composer did not append the corresponding block.`,
|
|
` This usually means composeSystemPrompt was edited to drop the structured-tier blocks; restore the conditionals around \`designSystemTokensCss\` / \`designSystemFixtureHtml\` in apps/daemon/src/prompts/system.ts.`,
|
|
].join("\n"),
|
|
);
|
|
}
|
|
continue;
|
|
}
|
|
|
|
proseOnlyChecked += 1;
|
|
if (flagOn !== flagOff) {
|
|
violations.push(
|
|
[
|
|
`[${brand.id}] prose-only brand produced different prompts under flag-off vs flag-on — flipping OD_DESIGN_TOKEN_CHANNEL would silently change behaviour for this brand.`,
|
|
describeFirstDivergence(flagOff, flagOn),
|
|
` Either ${toRepositoryPath(brand.brandRoot)} now ships tokens.css / components.html (in which case it should be promoted to the structured tier and pass the divergence side of this check), or composeSystemPrompt is leaking output for undefined asset fields.`,
|
|
].join("\n"),
|
|
);
|
|
}
|
|
}
|
|
|
|
if (violations.length > 0) {
|
|
console.error("Design system flag parity violations:");
|
|
for (const violation of violations) {
|
|
console.error(`- ${violation}`);
|
|
}
|
|
console.error(
|
|
"PR-D's promise is that flipping OD_DESIGN_TOKEN_CHANNEL on by default is a no-op for the ~138 prose-only brands and a deliberate uplift for the structured tier. Both halves of that promise must keep holding.",
|
|
);
|
|
return false;
|
|
}
|
|
|
|
const structuredLabel = `${structuredChecked} structured brand${structuredChecked === 1 ? "" : "s"}`;
|
|
const proseLabel = `${proseOnlyChecked} prose-only brand${proseOnlyChecked === 1 ? "" : "s"}`;
|
|
console.log(
|
|
`Design system flag parity passed: ${proseLabel} produce byte-identical prompts under OD_DESIGN_TOKEN_CHANNEL flag-off vs flag-on; ${structuredLabel} show the expected token-channel divergence.`,
|
|
);
|
|
return true;
|
|
}
|
|
|
|
// ─── Standalone entrypoint ───────────────────────────────────────────
|
|
|
|
const isInvokedDirectly =
|
|
process.argv[1] != null && fileURLToPath(import.meta.url) === path.resolve(process.argv[1]);
|
|
|
|
if (isInvokedDirectly) {
|
|
const passed = await checkDesignSystemFlagParity();
|
|
if (!passed) process.exitCode = 1;
|
|
}
|