open-design/apps/web/src/artifacts/parser.ts
Dongsen 12ce5ad38b
fix(web): ignore <artifact> tags inside markdown code spans and fences (#1132)
* test(web): add failing parser cases for <artifact> recitation in markdown code

Cover the three real-world prose contexts where the model legitimately
quotes the artifact tag without intending to emit one:

- inside an inline backtick span
- inside a fenced code block
- spread across streaming chunks crossing the fence boundary

Establishes the RED baseline before parser code-fence awareness lands.

* fix(web): ignore <artifact> tags inside markdown code spans and fences

The streaming artifact parser scanned the buffer with a raw indexOf,
guarded only by 'next char must be whitespace'. That meant any literal
<artifact ...> the model recited while documenting the protocol — even
inside backticks or a ```html fence — flipped the parser into artifact
mode, swallowed the rest of the reply from the chat UI, and (when a
matching </artifact> appeared in the recitation) silently wrote a
spurious file to disk via persistArtifact.

Replace findOpenTag with a linear scan that tracks fenced code blocks
(```) and inline code spans (`), skipping any <artifact prefix found
inside either. If the buffer ends mid-fence, return a partial match
anchored at the fence start so the next streaming chunk can resolve
the boundary without losing fence context.

Closes #1130.

* fix(web): match renderer fence/inline-code rules in artifact parser

Codex review on PR #1132 caught that the previous fix toggled inFence on
any triple-backtick run anywhere in the buffer, including mid-line, while
the chat renderer (apps/web/src/runtime/markdown.tsx) only treats ``` as
a fence when it occupies a whole line matching /^[ ]{0,3}```(\w[\w+-]*)?\s*$/.
That asymmetry would suppress a real <artifact> tag emitted after a prose
sentence like "the opening marker is ```html and the response then writes:".

Rework findOpenTag in three passes that mirror the renderer:

  1. Walk \n-terminated lines; only a line that matches FENCE_LINE_RE
     toggles fence state. Open fences without a close (or with an
     unterminated tail line) return partial so the next chunk can resolve.
  2. Collect inline code spans with /`[^`]+`/g — the same regex used by
     renderInline — so what the parser skips matches what the user sees as
     code. Unmatched trailing backticks after the last \n hold back.
  3. Find the first <artifact …> outside any skip range; preserve the
     existing partial-prefix tail handling.

Adds a regression test covering the exact case Codex reported.
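
Passes 2 and 3 above can be illustrated with a small self-contained sketch. This is not the code from parser.ts: `inlineCodeRanges` and `firstOutside` are hypothetical helper names, fence handling (pass 1) is left out, and only the inline-code pattern is taken from this message.

```typescript
// Illustrative sketch only; helper names are hypothetical, not from the repo.
type Range = { start: number; end: number }; // end is exclusive

// Collect inline code spans using the pattern quoted in the commit message.
function inlineCodeRanges(text: string): Range[] {
  const re = /`[^`]+`/g;
  const ranges: Range[] = [];
  let m: RegExpExecArray | null = re.exec(text);
  while (m !== null) {
    ranges.push({ start: m.index, end: m.index + m[0].length });
    m = re.exec(text);
  }
  return ranges;
}

// Return the first index of `needle` not covered by any range, or -1.
function firstOutside(text: string, needle: string, ranges: Range[]): number {
  let from = 0;
  while (from <= text.length) {
    const idx = text.indexOf(needle, from);
    if (idx === -1) return -1;
    if (!ranges.some((r) => idx >= r.start && idx < r.end)) return idx;
    from = idx + needle.length;
  }
  return -1;
}

// The quoted tag inside backticks is skipped; the later bare tag is found.
const sample = 'Quote `<artifact type="x">` in prose.\n<artifact identifier="a">';
const found = firstOutside(sample, '<artifact', inlineCodeRanges(sample));
```

Because the pattern requires at least one non-backtick character between the delimiters, a double-backtick span still yields a range that covers the quoted tag.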

* test(web): pin parser behavior on double-backtick and in-fence string literal recitation

Two cases raised in PR #1132 review:

- an artifact tag wrapped in '``<artifact …>``' (double-backtick
  inline code span) should not be treated as a real artifact
- a fenced JS example whose body contains a string literal like
  'const fence = "```";' should not pop fence state early and let a
  later literal <artifact> be parsed as real

Both already pass on 96e88ca because the line-anchored fence regex and
the renderer-aligned inline regex handle them correctly. Pinning the
behavior so future regressions surface as test failures.

* fix(web): make stripArtifact markdown-aware to stop truncating literal recitations

The streaming artifact parser was hardened in 96e88ca to skip <artifact>
recitations inside backticks and fences, but the post-stream stripper at
AssistantMessage.tsx still ran a naive 'content.indexOf("<artifact")' over
the same text events. As reported by lefarcen on PR #1132, that meant
chat replies with literal protocol recitations could still get silently
truncated mid-explanation — even though the parser preserved them in the
text stream and the file panel was no longer polluted with ghost files.

Extract the renderer-aligned classification (FENCE_LINE_RE, INLINE_CODE_RE,
computeSkipRanges, rangeContains) into a single source of truth at
apps/web/src/artifacts/markdown-context.ts so the parser and the stripper
agree on what counts as code. Add apps/web/src/artifacts/strip.ts with a
markdown-aware stripArtifact that:

- ignores any <artifact open inside a fenced block or inline code span
- looks for </artifact> with the same skip-range filter, so a real open
  paired with a literal close inside backticks does not strip a literal
  body that is meant to render
- returns content unchanged when an open exists with no matching real
  close (the previous implementation sliced to end-of-string, which would
  nuke trailing prose on a malformed or still-streaming tag)

Refactor parser.ts to import the shared helpers; behavior preserved (all
seven existing parser tests still pass). New strip.test.ts covers six
cases including the empirically-verified inline-backtick regression.
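
The three rules in the list above can be sketched as follows. This is a guess at the shape, not the contents of strip.ts: `stripArtifactSketch`, `findUnskipped` and `isSkipped` are hypothetical names, the skip ranges are passed in rather than computed from markdown, and the sketch assumes a successful strip returns the text before the open plus the text after the close.

```typescript
// Illustrative sketch only; names are hypothetical and skip ranges are supplied by the caller.
type SkipRange = { start: number; end: number }; // end is exclusive

const OPEN = '<artifact';
const CLOSE = '</artifact>';

function isSkipped(ranges: SkipRange[], idx: number): boolean {
  return ranges.some((r) => idx >= r.start && idx < r.end);
}

// First occurrence of `needle` at or after `from` that lies outside every skip range.
function findUnskipped(content: string, needle: string, from: number, ranges: SkipRange[]): number {
  let idx = content.indexOf(needle, from);
  while (idx !== -1 && isSkipped(ranges, idx)) {
    idx = content.indexOf(needle, idx + needle.length);
  }
  return idx;
}

function stripArtifactSketch(content: string, ranges: SkipRange[]): string {
  const open = findUnskipped(content, OPEN, 0, ranges);
  if (open === -1) return content;
  const close = findUnskipped(content, CLOSE, open + OPEN.length, ranges);
  // No real close: leave the text alone rather than slicing to end-of-string.
  if (close === -1) return content;
  return content.slice(0, open) + content.slice(close + CLOSE.length);
}
```

Applying the same skip filter to the close tag is what keeps a real open followed by a quoted close from removing text that is meant to render.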

* fix(web): align artifact stripper/parser fence rules with renderer exactly

Two gaps surfaced in review at a0bf05f:

- markdown-context.ts used a single FENCE_LINE_RE that allowed 0-3 leading
  spaces and reused the same pattern for opening and closing fences. The
  chat renderer (runtime/markdown.tsx:44 and :49) is asymmetric — opens
  with /^```(\w[\w+-]*)?\s*$/, closes with /^```\s*$/, and rejects any
  leading indentation on either side. Indented "   ```html" was being
  treated as a code fence even though the renderer keeps it as a paragraph,
  and a literal "```html" line inside an open fenced example was closing
  the skip range early — both could expose a real or literal <artifact …>
  to the wrong handler.
- stripArtifact discarded computeSkipRanges' unclosedFenceStart, so a
  fenced literal that ends at EOF without a trailing newline (very common
  for chat output) leaked the inner <artifact …> recitation to the
  stripper, reproducing the original #1130 truncation symptom on a
  narrower input shape.

Split FENCE_LINE_RE into FENCE_OPEN_RE / FENCE_CLOSE_RE with no leading
indentation, make the fence state machine test FENCE_OPEN_RE only while
outside a fence and FENCE_CLOSE_RE only while inside one, and have
stripArtifact extend skip ranges to end-of-content when a fence is left
open. Also tightened the parser's tail-line hold-back regex to
match the renderer's no-leading-space rule. Added regression tests for the
EOF-unclosed-fence case, the indented pseudo-fence (renderer treats as
paragraph, stripper must strip the real artifact), and a "```html" line
inside an open fence.
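
A minimal sketch of the asymmetric fence walk described above, under stated assumptions: `scanFences` and `FenceScan` are hypothetical names, the final unterminated line is classified like any other line (the streaming parser treats it separately), and the two patterns are the ones quoted in this message with the three backticks written as a `{3}` quantifier.

```typescript
// Illustrative sketch only; names are hypothetical, not the markdown-context.ts API.
// Same shapes as the open/close patterns quoted above; no leading indentation allowed.
const FENCE_OPEN = /^`{3}(\w[\w+-]*)?\s*$/;
const FENCE_CLOSE = /^`{3}\s*$/;

interface FenceScan {
  ranges: { start: number; end: number }[]; // closed fences, end exclusive
  unclosedFenceStart: number | null; // start of a fence still open at EOF
}

function scanFences(text: string): FenceScan {
  const ranges: { start: number; end: number }[] = [];
  let openAt: number | null = null;
  let pos = 0;
  while (pos < text.length) {
    const nl = text.indexOf('\n', pos);
    const lineEnd = nl === -1 ? text.length : nl;
    const next = nl === -1 ? text.length : nl + 1;
    const line = text.slice(pos, lineEnd);
    if (openAt === null) {
      // Outside a fence only the opener pattern can change state.
      if (FENCE_OPEN.test(line)) openAt = pos;
    } else if (FENCE_CLOSE.test(line)) {
      // Inside a fence only a bare closer ends it; an info-string line does not.
      ranges.push({ start: openAt, end: next });
      openAt = null;
    }
    pos = next;
  }
  return { ranges, unclosedFenceStart: openAt };
}
```

A caller that works on finished text can treat everything from `unclosedFenceStart` to the end of the content as one more skip range.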

Refs nexu-io/open-design#1130

* refactor(web): align streaming tail-line fence guard with FENCE_OPEN_RE

The streaming parser's tail-line hold-back used a stricter local regex
(/^```\w*$/) than the renderer's FENCE_OPEN_RE (/^```(\w[\w+-]*)?\s*$/),
missing valid opener tails like ```c++, ```ts-, or ``` (trailing space).

In practice these tails are still held back by the unmatched-backtick
parity scan that runs immediately after — three backticks in a tail line
are odd, so firstUnmatched stays set and the parser holds from that
position. So this wasn't a runtime correctness bug, just a regex
divergence that future readers could trip on.

Drop the local regex and reuse FENCE_OPEN_RE so the tail check matches
the same shape the rest of the pipeline already uses. Pinned the
behavior with three new parser tests (`+`/`-` info-string suffix and
trailing-space tails arriving as the first chunk) — they pass at HEAD,
proving the parity scan was already covering these cases.
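
The two tail-line checks discussed above can be sketched in isolation. Assumptions: `lastUnpairedBacktick` and `tailHoldback` are hypothetical names, the input is only the text after the last newline, and skip ranges are ignored. The fence pattern has the same shape as FENCE_OPEN_RE, written with a `{3}` quantifier.

```typescript
// Illustrative sketch only; names are hypothetical and skip ranges are ignored.
const FENCE_OPEN = /^`{3}(\w[\w+-]*)?\s*$/;

// Index of the backtick left unpaired on the tail line, or -1 if all pair up.
function lastUnpairedBacktick(tail: string): number {
  let unpaired = -1;
  for (let k = 0; k < tail.length; k++) {
    if (tail.charAt(k) !== '`') continue;
    // Toggle: an unpaired backtick is closed by the next one.
    unpaired = unpaired === -1 ? k : -1;
  }
  return unpaired;
}

// Earliest hold-back offset within the tail line, or -1 to flush everything.
function tailHoldback(tail: string): number {
  const candidates: number[] = [];
  // A possible fence opener (or one or two backticks so far) anchors at the line start.
  if (FENCE_OPEN.test(tail) || /^`{1,2}$/.test(tail)) candidates.push(0);
  const unpaired = lastUnpairedBacktick(tail);
  if (unpaired !== -1) candidates.push(unpaired);
  return candidates.length === 0 ? -1 : Math.min(...candidates);
}
```

In this sketch the parity scan on its own anchors at the unpaired backtick, while the fence-opener check anchors at the start of the line; the earliest candidate wins.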

Refs nexu-io/open-design#1132 (lefarcen polish P2)

* fix(web): scope inline-code skip ranges per block and reject <artifact prefix-shared opens

INLINE_CODE_RE previously ran over the whole buffer, so an unmatched
backtick in one paragraph could pair with a backtick in a later
paragraph and create a phantom inline span that swallowed any real
<artifact …> between them. Mirror runtime/markdown.tsx by splitting the
buffer on fence / blank / heading / list / hr boundaries and running
INLINE_CODE_RE per block region instead.
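
A simplified sketch of per-block scanning, assuming blank lines are the only block boundary (the change described above also splits on fences, headings and list items). `inlineRangesPerBlock` is a hypothetical name.

```typescript
// Illustrative sketch only; splits on blank lines and nothing else.
type Range = { start: number; end: number }; // end is exclusive

function inlineRangesPerBlock(text: string): Range[] {
  const ranges: Range[] = [];
  let regionStart = 0;

  // Run the inline-code pattern over one region and record absolute offsets.
  const scanRegion = (regionEnd: number): void => {
    const re = /`[^`]+`/g;
    const region = text.slice(regionStart, regionEnd);
    let m: RegExpExecArray | null = re.exec(region);
    while (m !== null) {
      ranges.push({ start: regionStart + m.index, end: regionStart + m.index + m[0].length });
      m = re.exec(region);
    }
  };

  let pos = 0;
  while (pos < text.length) {
    const nl = text.indexOf('\n', pos);
    const lineEnd = nl === -1 ? text.length : nl;
    const next = nl === -1 ? text.length : nl + 1;
    if (text.slice(pos, lineEnd).trim() === '') {
      // A blank line closes the current region; the next one starts after it.
      scanRegion(pos);
      regionStart = next;
    }
    pos = next;
  }
  scanRegion(text.length);
  return ranges;
}
```

With two stray backticks in separate paragraphs, a whole-buffer scan pairs them into one span, while the per-block scan finds no span at all.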

stripArtifact accepted any unskipped `<artifact` substring as a real
open, while the streaming parser already required a following whitespace
character — so prose like `<artifactual>demo</artifact>` was being
truncated to `prefix  suffix`. Extract the parser's real-open guard into
isRealArtifactOpenAt and reuse it from both sides.
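
The guard itself is small. A sketch of what such a check could look like follows; the real isRealArtifactOpenAt lives in markdown-context.ts and may differ, so `isRealOpenAtSketch` is a hypothetical stand-in that only encodes the rule stated above, namely that a whitespace character must follow the prefix.

```typescript
// Illustrative sketch only; not the implementation from markdown-context.ts.
const OPEN_PREFIX_SKETCH = '<artifact';

function isRealOpenAtSketch(buffer: string, idx: number): boolean {
  if (!buffer.startsWith(OPEN_PREFIX_SKETCH, idx)) return false;
  const next = buffer.charAt(idx + OPEN_PREFIX_SKETCH.length);
  // An empty `next` means the buffer ended; that is undecided, not a real open.
  return next !== '' && /\s/.test(next);
}
```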

While reordering findOpenTag for the shared guard, also fix the related
hold-back ordering issue tracked at #1141: a stray tail-line backtick or
fence-opener prefix used to suppress an artifact already complete
earlier in the buffer. Scan for the earliest complete real open first,
then pick the earliest hold-back position only when no complete tag was
found.

Regressions pinned in parser.test.ts and strip.test.ts for both new
finding shapes.

* fix(web): keep HR-shaped lines inside paragraph regions for inline-code scanning

The previous walker closed inline-scan regions on lines matching the HR
regex, but `parseBlocks()` in runtime/markdown.tsx does not break a
paragraph on HR — its inner accumulation loop only breaks on blank /
fence / heading / ul / ol (runtime/markdown.tsx:95-104). HR is only an
HR block in the outer loop's first-look, never mid-paragraph.

So inputs like `intro \`\n---\n<artifact …>…</artifact>\n---\nclosing \``
are one paragraph in the renderer, whose two stray backticks pair to
cover the literal artifact recitation — but the walker was splitting on
the `---` lines, leaving the recitation outside skip ranges, and the
parser/stripper would treat it as a real tag.

Drop HR from the paragraph-break list (HR-shaped lines carry no
backticks of their own, so keeping them inside the surrounding region
is benign either way) and document the renderer-mirror rationale.

Regressions pinned on both sides.
2026-05-11 19:29:22 +08:00


import { computeSkipRanges, FENCE_OPEN_RE, isRealArtifactOpenAt, rangeContains } from './markdown-context';

/**
 * Streaming parser for <artifact identifier="..." type="..." title="...">...</artifact>
 * tags. Simplified from packages/artifacts/src/parser.ts in the reference
 * repo: handles one artifact at a time, ignores nesting.
 *
 * Feed deltas in, iterate events. Every event type here has a direct
 * counterpart in the reference parser; the shape is intentionally preserved
 * so you can upgrade later without rewriting consumers.
 */
export type ArtifactEvent =
  | { type: 'text'; delta: string }
  | { type: 'artifact:start'; identifier: string; artifactType: string; title: string }
  | { type: 'artifact:chunk'; identifier: string; delta: string }
  | { type: 'artifact:end'; identifier: string; fullContent: string };

const OPEN_PREFIX = '<artifact';
const CLOSE_TAG = '</artifact>';

interface ParserState {
  inside: boolean;
  buffer: string;
  identifier: string;
  artifactType: string;
  title: string;
  content: string;
}

// Extract key="value" / key='value' pairs from the raw attribute text of an open tag.
function parseAttrs(raw: string): Record<string, string> {
  const re = /(\w+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  const out: Record<string, string> = {};
  let m: RegExpExecArray | null = re.exec(raw);
  while (m !== null) {
    out[m[1] as string] = (m[2] ?? m[3] ?? '') as string;
    m = re.exec(raw);
  }
  return out;
}

type OpenTagMatch =
  | { kind: 'complete'; start: number; end: number; attrs: string }
  | { kind: 'partial'; start: number }
  | { kind: 'none' };
// Scan the buffer for `<artifact …>` while skipping any positions that the
// chat markdown renderer would render as a fenced code block or inline code
// span. See ./markdown-context.ts for the shared classification used by both
// the streaming parser and the post-stream `<artifact>` stripper.
//
// Streaming caveats handled here on top of the shared ranges:
//   * Open fence with no close yet → hold back from its opening line.
//   * Unterminated tail line that could still resolve into a fence delimiter
//     (e.g. "```", "```ht") → hold back from the line start.
//   * Unmatched opening backtick after the last \n → hold back from it; a
//     future chunk may turn it into an inline code span.
function findOpenTag(buffer: string): OpenTagMatch {
  const len = buffer.length;
  const { ranges, unclosedFenceStart } = computeSkipRanges(buffer);

  // Pass 1: scan for the earliest *complete* real `<artifact …>` open outside
  // any skip range. Done before any hold-back decision, otherwise a stray
  // backtick or fence-opener prefix on a tail line would suppress an already
  // self-contained artifact earlier in the buffer.
  let earliestPartialOpen = -1;
  let from = 0;
  while (from < len) {
    const idx = buffer.indexOf(OPEN_PREFIX, from);
    if (idx === -1) break;
    if (rangeContains(ranges, idx)) {
      from = idx + OPEN_PREFIX.length;
      continue;
    }
    if (unclosedFenceStart !== null && idx >= unclosedFenceStart) {
      // Anything past an unclosed fence opener is inside a code block that
      // will close in a later chunk (or at end-of-buffer for the stripper);
      // treat as skip range, not a real tag.
      break;
    }
    const after = idx + OPEN_PREFIX.length;
    const next = buffer.charAt(after);
    if (next === '') {
      // `<artifact` at the very end of the buffer: it could become real with
      // the next chunk. Nothing follows it, so remember it and stop scanning.
      if (earliestPartialOpen === -1) earliestPartialOpen = idx;
      break;
    }
    if (!isRealArtifactOpenAt(buffer, idx)) {
      // Not a real <artifact ...> open (e.g. "<artifactual"). Keep scanning.
      from = after;
      continue;
    }
    // Walk to the closing `>`, ignoring any `>` inside a quoted attribute value.
    let j = after;
    let quote: '"' | "'" | null = null;
    while (j < len) {
      const c = buffer.charAt(j);
      if (quote !== null) {
        if (c === quote) quote = null;
      } else if (c === '"' || c === "'") {
        quote = c;
      } else if (c === '>') {
        return { kind: 'complete', start: idx, end: j + 1, attrs: buffer.slice(after, j) };
      }
      j++;
    }
    // Ran out of buffer before the closing `>` arrived, so this is an open tag
    // mid-stream. Remember and stop scanning (any later `<artifact` would be
    // a second tag we'd reach next chunk).
    if (earliestPartialOpen === -1) earliestPartialOpen = idx;
    break;
  }

  // Pass 2: no complete open found. Decide whether to hold back, and if so,
  // from which position. Earliest hold-back wins so the text-flush boundary
  // never crosses something that might still resolve into a tag/fence/span.
  let holdback = -1;
  const note = (pos: number | null) => {
    if (pos !== null && pos !== -1 && (holdback === -1 || pos < holdback)) holdback = pos;
  };
  note(earliestPartialOpen);
  note(unclosedFenceStart);

  // Tail line (text after the last \n) that may still become a fence opener.
  const lastNl = buffer.lastIndexOf('\n');
  if (lastNl < len - 1) {
    const tailLineStart = lastNl + 1;
    const tail = buffer.slice(tailLineStart);
    if (FENCE_OPEN_RE.test(tail) || /^`{1,2}$/.test(tail)) {
      note(tailLineStart);
    }
  }

  // Backtick parity on the tail line: an unpaired backtick may still open an
  // inline code span once the next chunk arrives.
  let firstUnmatched = -1;
  let parity = 0;
  for (let k = lastNl + 1; k < len; k++) {
    if (buffer.charAt(k) !== '`') continue;
    if (rangeContains(ranges, k)) continue;
    if (parity === 0) {
      firstUnmatched = k;
      parity = 1;
    } else {
      firstUnmatched = -1;
      parity = 0;
    }
  }
  note(firstUnmatched);

  // Strict prefix at the tail (e.g. "<art"): hold back.
  const tailLt = buffer.lastIndexOf('<');
  if (tailLt !== -1 && !rangeContains(ranges, tailLt)) {
    const slice = buffer.slice(tailLt);
    if (OPEN_PREFIX.startsWith(slice) && slice.length < OPEN_PREFIX.length) {
      note(tailLt);
    }
  }

  if (holdback !== -1) return { kind: 'partial', start: holdback };
  return { kind: 'none' };
}
export function createArtifactParser() {
  const state: ParserState = {
    inside: false,
    buffer: '',
    identifier: '',
    artifactType: '',
    title: '',
    content: '',
  };

  function* feed(delta: string): Generator<ArtifactEvent> {
    state.buffer += delta;
    while (state.buffer.length > 0) {
      if (!state.inside) {
        const open = findOpenTag(state.buffer);
        if (open.kind === 'none') {
          yield { type: 'text', delta: state.buffer };
          state.buffer = '';
          return;
        }
        if (open.kind === 'partial') {
          // Flush the text before the hold-back point and wait for more input.
          if (open.start > 0) {
            yield { type: 'text', delta: state.buffer.slice(0, open.start) };
            state.buffer = state.buffer.slice(open.start);
          }
          return;
        }
        if (open.start > 0) {
          yield { type: 'text', delta: state.buffer.slice(0, open.start) };
        }
        const attrs = parseAttrs(open.attrs);
        state.inside = true;
        state.identifier = attrs['identifier'] ?? '';
        state.artifactType = attrs['type'] ?? '';
        state.title = attrs['title'] ?? '';
        state.content = '';
        state.buffer = state.buffer.slice(open.end);
        yield {
          type: 'artifact:start',
          identifier: state.identifier,
          artifactType: state.artifactType,
          title: state.title,
        };
        continue;
      }

      const closeIdx = state.buffer.indexOf(CLOSE_TAG);
      if (closeIdx === -1) {
        // Hold back CLOSE_TAG.length - 1 characters so a close tag split
        // across chunks is still detected.
        const flushUpTo = state.buffer.length - (CLOSE_TAG.length - 1);
        if (flushUpTo > 0) {
          const chunk = state.buffer.slice(0, flushUpTo);
          state.content += chunk;
          state.buffer = state.buffer.slice(flushUpTo);
          yield { type: 'artifact:chunk', identifier: state.identifier, delta: chunk };
        }
        return;
      }

      const finalChunk = state.buffer.slice(0, closeIdx);
      if (finalChunk.length > 0) {
        state.content += finalChunk;
        yield { type: 'artifact:chunk', identifier: state.identifier, delta: finalChunk };
      }
      yield { type: 'artifact:end', identifier: state.identifier, fullContent: state.content };
      state.buffer = state.buffer.slice(closeIdx + CLOSE_TAG.length);
      state.inside = false;
      state.identifier = '';
      state.artifactType = '';
      state.title = '';
      state.content = '';
    }
  }

  // Emit whatever is still buffered once the stream has ended.
  function* flush(): Generator<ArtifactEvent> {
    if (state.inside) {
      if (state.buffer.length > 0) {
        state.content += state.buffer;
        yield { type: 'artifact:chunk', identifier: state.identifier, delta: state.buffer };
        state.buffer = '';
      }
      yield { type: 'artifact:end', identifier: state.identifier, fullContent: state.content };
    } else if (state.buffer.length > 0) {
      yield { type: 'text', delta: state.buffer };
    }
    state.buffer = '';
    state.inside = false;
  }

  return { feed, flush };
}
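
For readers wiring this parser into a UI, the following sketch shows one way a consumer might fold the event stream into chat text plus finished artifacts. It is an assumption about usage, not code from the repo: `collectEvents` and `Collected` are hypothetical names, and the event type is copied from above so the block stands alone.

```typescript
// Event shape copied from parser.ts so the sketch is self-contained.
type ArtifactEvent =
  | { type: 'text'; delta: string }
  | { type: 'artifact:start'; identifier: string; artifactType: string; title: string }
  | { type: 'artifact:chunk'; identifier: string; delta: string }
  | { type: 'artifact:end'; identifier: string; fullContent: string };

interface Collected {
  text: string;
  artifacts: { identifier: string; title: string; content: string }[];
}

// Hypothetical consumer: folds events into chat text plus completed artifacts.
function collectEvents(events: ReadonlyArray<ArtifactEvent>): Collected {
  const out: Collected = { text: '', artifacts: [] };
  const titles: Record<string, string> = {};
  for (const ev of events) {
    switch (ev.type) {
      case 'text':
        out.text += ev.delta;
        break;
      case 'artifact:start':
        titles[ev.identifier] = ev.title;
        break;
      case 'artifact:chunk':
        // A live preview would append ev.delta here; the final content
        // arrives again in full on artifact:end.
        break;
      case 'artifact:end':
        out.artifacts.push({
          identifier: ev.identifier,
          title: titles[ev.identifier] ?? '',
          content: ev.fullContent,
        });
        break;
    }
  }
  return out;
}
```

With the parser above, the input could be gathered as `Array.from(parser.feed(delta))` for each delta, followed by `Array.from(parser.flush())` once the stream ends.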