mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
* fix(web): parse Provenance with Markdown-bold labels (#1580)

The daemon's finalize synthesis prompt at apps/daemon/src/finalize-design.ts:560-565 lists the five Provenance fields without pinning field-label syntax, so Claude renders them with Markdown-bold labels per Markdown convention (`- **Field:** value`). The parser at apps/web/src/lib/parse-provenance.ts:32-36 uses `[:\s]+` as its label/value separator, which stops at the trailing `**` after the colon; the capture group then slurps the `**` and any following whitespace into the value. Downstream of that, transcriptMessageCount and generatedAt parse as null because the captured tokens don't start with digits or a valid ISO 8601 prefix, and the Continue in CLI clipboard prompt shows `Design system: ** ...`, `Transcript message count when DESIGN.md was generated: unknown`, `DESIGN.md generated at: unknown`.

Fix: strip leading and trailing Markdown emphasis (`*`, `_`, whitespace) from every captured value via a single helper threaded through extractField / extractFieldOrNone / extractNumber / extractDate. Widen the transcriptMessageCount regex's capture from `(\d+)` to `([^\n]+)` so the strip step gets a chance to run on `** 4`. Add `[^:]*` between `count` and `[:\s]+` to mirror the other label-walking regexes for bolded label variants.

Defense-in-depth: tell the synthesis prompt to emit plain `- Field: value` bullets with no emphasis on the labels. The parser hardening is the load-bearing fix; this is belt-and-suspenders for new model variants.

Red-Green-Refactor:
- Phase 1 (Red): 3 new parse-provenance tests covering bold labels with backticked values, bold labels with a short `Generated:` form, and bold labels with `none` sentinels. All 3 failed against pre-fix source.
- Phase 2 (Green): strip + regex widening. All 7 parse-provenance tests + 1158 web tests pass.
- Phase 3: empirically verified against a live finalized DESIGN.md; all five fields now parse correctly.
- Phase 4 (defense-in-depth): one-line addendum to synthesis prompt.
- Phase 5: bold-labelled Provenance fixture added to the hook test (useDesignMdState.test.tsx) so the round-7 `unknown-provenance` fail-closed path is regression-pinned end-to-end.

Backticks in field values are intentionally kept (out of scope per the issue spec; rendered clipboard text reads fine with them). The variant `- **Field**: value` (colon outside emphasis) is not in the issue enumeration and is not handled.

Fixes #1580

* fix(web): narrow Provenance strip to Markdown residue only

Round-2 fix per lefarcen's review on PR #1584. The round-1 helper used `^[\s*_]+` / `[\s*_]+$`, which stripped a literal leading or trailing `*`/`_` from any captured value: `_draft.html` was corrupted to `draft.html`, and a build id like `build_id_v1_` lost its trailing underscore.

Narrow stripMarkdownEmphasis to three explicit passes:
1. Leading `*`/`_` tokens FOLLOWED BY WHITESPACE: only matches the `** ` residue left after `- **Field:** value` is captured starting at the `*`.
2. Trailing WHITESPACE followed by `*`/`_` tokens: mirror of (1) if the value closes with emphasis after whitespace.
3. A single balanced wrap around the remaining value (`**X**` / `*X*` / `__X__` / `_X_`): handles the `- **Field:** **value**` shape and any plain-label `**value**` form.

Asymmetric literal `*`/`_` characters in the value (no whitespace separator, no balanced closing token) are preserved by construction.

Added regression tests:
- plain label + `_draft.html` value
- plain label + `build_id_v1_` value (trailing underscore)
- bold label + `_draft.html` value (residue stripped, literal leading underscore preserved)
- plain label + `**wrapped-id**` value (balanced residue stripped)

All 11 parse-provenance tests + 1162 web tests pass. Empirically re-verified against a live finalized DESIGN.md; all five fields still parse correctly.

---------

Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
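The leak and the narrowed strip described in the commit message can be sketched in isolation. This is a minimal illustration, not the code in parse-provenance.ts: `captureAfterLabel` is a hypothetical stand-in for the pre-fix label regexes, and the body of `stripMarkdownEmphasis` is an assumption reconstructed only from the three passes listed above.

```typescript
// Hypothetical pre-fix style capture: label, then `[:\s]+`, then the rest of
// the line. The real patterns in parse-provenance.ts may differ.
function captureAfterLabel(label: string, line: string): string | null {
  const match = new RegExp(`${label}[:\\s]+([^\\n]+)`).exec(line);
  return match?.[1] ?? null;
}

// Assumed shape of the narrowed helper, following the three described passes.
function stripMarkdownEmphasis(raw: string): string {
  let value = raw.trim();
  // Pass 1: leading `*`/`_` tokens followed by whitespace (the `** ` residue).
  value = value.replace(/^[*_]+\s+/, '');
  // Pass 2: whitespace followed by trailing `*`/`_` tokens.
  value = value.replace(/\s+[*_]+$/, '');
  // Pass 3: a single balanced wrap such as **X**, __X__, *X* or _X_.
  const wrapped = /^(\*\*|__|\*|_)(.+)\1$/.exec(value);
  return wrapped?.[2] ?? value;
}

// The separator consumes only the colon, so the closing `**` leaks into the value.
const leaked = captureAfterLabel('Project ID', '- **Project ID:** abc-123');
console.log(JSON.stringify(leaked)); // "** abc-123"

console.log(stripMarkdownEmphasis(leaked ?? '')); // abc-123
console.log(stripMarkdownEmphasis('_draft.html')); // _draft.html
console.log(stripMarkdownEmphasis('build_id_v1_')); // build_id_v1_
console.log(stripMarkdownEmphasis('**wrapped-id**')); // wrapped-id
```

In this sketch a lone leading or trailing `*`/`_` survives because passes 1 and 2 require adjacent whitespace and pass 3 requires a matching closing token, which is the property the regression tests pin.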
205 lines
7.6 KiB
TypeScript
import { describe, expect, it } from 'vitest';

import { parseProvenance } from '../../src/lib/parse-provenance';

const FRESH = `# DESIGN.md

## Summary

Some content.

## Provenance

- Project ID: 818cf7a8-8399-4220-a507-07802d8842a8
- Design system: alphatrace
- Current artifact: deck.html
- Transcript message count: 42
- Generated UTC timestamp: 2026-05-08T11:55:00Z
`;

describe('parseProvenance', () => {
  it('returns all five fields populated for a happy-path input', () => {
    const result = parseProvenance(FRESH);
    expect(result).not.toBeNull();
    expect(result!.projectId).toBe('818cf7a8-8399-4220-a507-07802d8842a8');
    expect(result!.designSystemId).toBe('alphatrace');
    expect(result!.currentArtifact).toBe('deck.html');
    expect(result!.transcriptMessageCount).toBe(42);
    expect(result!.generatedAt).not.toBeNull();
    expect(result!.generatedAt!.toISOString()).toBe('2026-05-08T11:55:00.000Z');
  });

  it('returns null when the Provenance section is missing', () => {
    const text = `# DESIGN.md\n\n## Summary\n\nThis spec has no provenance.\n`;
    expect(parseProvenance(text)).toBeNull();
  });

  it('treats the "none" sentinel for design system as null', () => {
    const text = `## Provenance

- Project ID: abc-123
- Design system: none
- Current artifact: none
- Transcript message count: 7
- Generated UTC timestamp: 2026-05-08T00:00:00Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    expect(result!.designSystemId).toBeNull();
    expect(result!.currentArtifact).toBeNull();
    // Other fields still populated.
    expect(result!.projectId).toBe('abc-123');
    expect(result!.transcriptMessageCount).toBe(7);
  });

  it('returns generatedAt: null when the timestamp is malformed (no throw)', () => {
    const text = `## Provenance

- Project ID: abc-123
- Design system: alphatrace
- Current artifact: deck.html
- Transcript message count: 42
- Generated UTC timestamp: not-a-date
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    expect(result!.generatedAt).toBeNull();
    // Surrounding fields still populated.
    expect(result!.transcriptMessageCount).toBe(42);
  });

  // Issue #1580: the daemon's synthesis prompt does not pin field-label
  // syntax (finalize-design.ts:560-565), so Claude renders Provenance
  // fields with Markdown-bold labels per Markdown convention. The
  // pre-fix regexes' `[:\s]+` separator stops at the trailing `**`
  // after the colon, leaking `** ` into every captured value and
  // making transcriptMessageCount + generatedAt parse as null.
  it('parses bold-labelled fields with backticked values (live DESIGN.md shape)', () => {
    // Verbatim shape from a finalized DESIGN.md emitted by Claude
    // against the prod synthesis prompt. UUID + filename are
    // illustrative placeholders, not user data.
    const text = `## Provenance

- **Project ID:** \`00000000-0000-0000-0000-000000000000\`
- **Design system:** \`default\` (Neutral Modern — not applied; wireframe overrides all tokens)
- **Current artifact:** \`prototype.html\` (single-file, 1,922 lines, 57KB)
- **Transcript message count:** 4
- **Generated UTC timestamp:** 2026-05-13T12:27:21.499Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    // Backticks may remain in the captured value (out of scope to
    // strip per #1580 spec); the `** ` Markdown-bold prefix must not.
    expect(result!.projectId).toBe('`00000000-0000-0000-0000-000000000000`');
    expect(result!.designSystemId).toBe('`default` (Neutral Modern — not applied; wireframe overrides all tokens)');
    expect(result!.currentArtifact).toBe('`prototype.html` (single-file, 1,922 lines, 57KB)');
    expect(result!.transcriptMessageCount).toBe(4);
    expect(result!.generatedAt).not.toBeNull();
    expect(result!.generatedAt!.toISOString()).toBe('2026-05-13T12:27:21.499Z');
  });

  it('parses bold-labelled fields with plain values and a short "Generated:" label', () => {
    const text = `## Provenance

- **Project ID:** abc-123
- **Design system:** my-system
- **Current artifact:** deck.html
- **Transcript message count:** 12
- **Generated:** 2026-05-08T11:55:00Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    expect(result!.projectId).toBe('abc-123');
    expect(result!.designSystemId).toBe('my-system');
    expect(result!.currentArtifact).toBe('deck.html');
    expect(result!.transcriptMessageCount).toBe(12);
    expect(result!.generatedAt).not.toBeNull();
    expect(result!.generatedAt!.toISOString()).toBe('2026-05-08T11:55:00.000Z');
  });

  // PR #1584 review (lefarcen): the round-1 strip used `^[\s*_]+` /
  // `[\s*_]+$`, which stripped a literal leading/trailing underscore
  // from values like `_draft.html` (corrupting it to `draft.html`).
  // Narrow the strip to only consume Markdown residue, never literal
  // characters in the value itself.
  it('preserves a literal leading underscore in a plain-label value (e.g. _draft.html)', () => {
    const text = `## Provenance

- Project ID: abc-123
- Design system: alphatrace
- Current artifact: _draft.html
- Transcript message count: 7
- Generated UTC timestamp: 2026-05-08T00:00:00Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    // The whole filename must survive: no leading underscore strip.
    expect(result!.currentArtifact).toBe('_draft.html');
  });

  it('preserves a literal trailing underscore in a plain-label id-like value', () => {
    const text = `## Provenance

- Project ID: build_id_v1_
- Design system: alphatrace
- Current artifact: deck.html
- Transcript message count: 7
- Generated UTC timestamp: 2026-05-08T00:00:00Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    expect(result!.projectId).toBe('build_id_v1_');
  });

  it('preserves a literal leading underscore even when the label is Markdown-bold', () => {
    const text = `## Provenance

- **Project ID:** abc-123
- **Design system:** alphatrace
- **Current artifact:** _draft.html
- **Transcript message count:** 7
- **Generated UTC timestamp:** 2026-05-08T00:00:00Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    // The bold-label residue (`** `) must be stripped, but the literal
    // leading underscore on the filename must remain.
    expect(result!.currentArtifact).toBe('_draft.html');
  });

  it('strips a balanced **value** wrap (residue case, no preceding bold-label residue)', () => {
    const text = `## Provenance

- Project ID: **wrapped-id**
- Design system: alphatrace
- Current artifact: deck.html
- Transcript message count: 7
- Generated UTC timestamp: 2026-05-08T00:00:00Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    // **X** is unambiguously Markdown emphasis residue per the issue
    // spec; strip the balanced wrap.
    expect(result!.projectId).toBe('wrapped-id');
  });

  it('still treats "none" as the null sentinel after the bold-label prefix is stripped', () => {
    const text = `## Provenance

- **Project ID:** abc-123
- **Design system:** none
- **Current artifact:** none
- **Transcript message count:** 7
- **Generated UTC timestamp:** 2026-05-08T00:00:00Z
`;
    const result = parseProvenance(text);
    expect(result).not.toBeNull();
    expect(result!.projectId).toBe('abc-123');
    // NONE_SENTINEL must trip on the value after emphasis is stripped,
    // otherwise "** none" leaks through as a real design-system id.
    expect(result!.designSystemId).toBeNull();
    expect(result!.currentArtifact).toBeNull();
    expect(result!.transcriptMessageCount).toBe(7);
    expect(result!.generatedAt).not.toBeNull();
  });
});