feat(daemon): guard against agent-emitted stub artifact regressions (#1171)

* feat(daemon): guard against agent-emitted stub artifact regressions

When an agent emits an <artifact> block whose body is a placeholder
("see other-file.html in this project", a bare filename string, a tiny
fallback page) instead of the full document, the daemon writes the
placeholder to disk verbatim. Users see a 25-500 byte HTML file where
their previous version had tens of kilobytes of real markup.

Add a structural regression guard in writeProjectFile: before writing
an html/deck artifact whose manifest carries metadata.identifier, scan
the project dir for prior siblings matching <identifier>(-\d+)?\.html?
and compare sizes. If the new body is below minRetainedRatio (default
0.2) of the largest prior sibling >= minPriorBytes (default 4096),
flag a regression. Three modes via env:

- OD_ARTIFACT_STUB_GUARD=warn (default) writes the file and attaches
  stubGuardWarning to the response so the frontend can surface it.
- OD_ARTIFACT_STUB_GUARD=reject throws ArtifactRegressionError before
  fs.writeFile; the route returns 422 ARTIFACT_REGRESSION with the
  prior sibling's name and size in error.details.
- OD_ARTIFACT_STUB_GUARD=off skips the guard entirely.

Cross-agent by design: anchored on size delta + identifier match,
no agent-specific stub-phrase regex, so works for any agent backend
behind the agent-adapter abstraction.

The body-then-manifest write order pre-dates this change; the reject
path throws before fs.writeFile so rejections never leave a partial
state behind.

24 unit + 8 HTTP tests cover happy paths, all three modes, deck kind,
.htm extension sibling detection, ratio=1 edge case, and verify
rejected writes leave neither the html nor its manifest sidecar on
disk.

* fix(stub-guard): close same-name, nested-dir, and non-slug bypasses

Code review on PR #1171 (lefarcen, Codex, mrcfps) found three holes
where the stub guard could be silently bypassed. All three are now
closed with HTTP test coverage.

Same-name overwrite (lefarcen P1): the writer's prior-sibling scan
deliberately skipped the file at safeName, but for an in-session
overwrite (persistArtifact reuses the same fileName when
savedArtifactRef.current matches) that file is the prior content,
not the new entry. Drop the exclude-by-name filter; the current
on-disk size at scan time is always the prior because the overwrite
happens after this check.

Subdirectory scoping (Codex/mrcfps P2): writeProjectFile creates
parent directories for nested paths like reports/overview.html, but
the guard only scanned the project root. Pass path.dirname(target)
as scanDir so nested artifacts are evaluated against their real
sibling set.

Non-slug identifier (Codex/lefarcen/mrcfps P2): the web's
persistArtifact slugifies the filename basename but stores the raw
identifier in the manifest, so an identifier like "Landing Page"
yields filename landing-page.html with metadata.identifier="Landing
Page". Build the sibling regex from both the raw identifier and a
slugified variant (mirroring the frontend's slugifier) so either
form matches the same priors.

Also surface warn-mode warnings in the web UI: ProjectView now
checks file.stubGuardWarning after writeProjectTextFile and renders
the warning via setError. Reject-mode 422 surfacing requires
restructuring writeProjectTextFile's return contract and is
deferred.

API change inside the daemon: evaluateArtifactStubGuard /
findPriorArtifactSiblings drop excludeSafeName and rename projectDir
to scanDir. Tests updated.

Tests: 4 new HTTP cases (same-name overwrite preserves prior body,
nested subdir rejects, slug-form match rejects, plus the existing
warn/off/deck/.htm cases) and 1 new unit case (slug-form sibling
match). 44 tests pass.

* fix(stub-guard): empty-slug fallback + reject-mode UI surface

Round 3 review on PR #1171 (lefarcen, mrcfps) found two remaining
holes after 9cc82430 closed the same-name / subdir / non-slug bypasses.

Empty-slug fallback bypass (lefarcen P2): an identifier like "测试"
(all-non-ASCII) strips to empty through the web slugifier, and
persistArtifact's `slice(0,60) || 'artifact'` falls back to the
literal "artifact" basename. The guard searched for raw identifier +
slug only, so a later artifact-2.html stub bypassed the prior. Add
EMPTY_SLUG_FALLBACK_NAME = 'artifact' as a sibling-name candidate
when the slug is empty, mirroring the frontend fallback exactly.

Reject-mode UI silence (mrcfps P2 + lefarcen P2): writeProjectTextFile
collapses any non-OK response (including 422 ARTIFACT_REGRESSION) to
null, and persistArtifact previously had no else branch. Users in
reject mode saw the daemon log fire but the UI was silent. Add an
else branch that surfaces a generic banner pointing at the most
likely cause and mentions checking the daemon logs for structured
details. Also clear savedArtifactRef.current on failure so retries
re-enter the persistence path.

Plumbing the structured 422 details through writeProjectTextFile
itself remains out of scope (cross-cutting client contract change
affecting 5+ call sites). The generic banner is the "at minimum"
path mrcfps suggested.

Tests: 1 new unit case (artifact.html sibling discovery for non-ASCII
identifier) + 1 new HTTP case (empty-slug stub regression rejected
end-to-end). 46 tests pass across stub-guard suites (was 44).

* fix(stub-guard): verify sidecar identity to avoid cross-identifier false positives

Round 4 review on PR #1171 (mrcfps inline + lefarcen review) caught
a false-positive introduced by the round-3 empty-slug fallback. Two
distinct identifiers that both slugify to empty (e.g. "测试" and
"首页") share the artifact*.html basename, so a brand-new save under
the second identifier was being compared against — and falsely
rejected because of — the unrelated first.

The same shape exists symmetrically: a non-empty-slug identifier
literally named "artifact" would falsely match empty-slug fallback
files written under any other identifier.

Fix: filename pattern matching is now a candidate generator, not
the source of truth. For every candidate sibling, read its
.artifact.json sidecar and verify metadata.identifier matches the
input via artifactIdentifiersMatch (raw equality OR shared non-empty
slug). Files without a sidecar are skipped — they weren't written
through the artifact-tag path this guard targets, and treating them
as priors was always a stretch.

Empty-slug equivalence is intentionally NOT honored: 测试 != 首页
even though both slugify to empty. The whole bug was conflating
distinct identifiers via the fallback name; slug-equivalence kicks
in only for non-empty slugs (Landing Page <-> landing-page).

Tests: unit fixtures now write file+sidecar pairs (mirrors prod);
new artifactIdentifiersMatch suite covers the 5 equivalence cases;
new HTTP test does NOT cross-reject distinct empty-slug identifiers
asserts the second save returns 200 instead of 422; new unit test
skips files without a sidecar.

42 tests pass across stub-guard suites.

* fix(stub-guard): require canonical-form anchor in identifier match to avoid 60-char truncation collisions

Round 5 review on PR #1171 (mrcfps) caught another false-positive in
artifactIdentifiersMatch: slugifyArtifactIdentifier truncates at 60
chars, so two distinct >60-char identifiers that share their first
60 chars (e.g. "A...A1" and "A...A2", 70 chars each) slugify to the
same string and would falsely bridge. Same shape as the empty-slug
fallback bug from round 4, just at the other end of the input range.

Tighten the rule: slug-equivalence requires at least one input to BE
its own canonical slug form. That keeps the legitimate bridge
("Landing Page" <-> "landing-page" — second input IS the slug) but
rejects truncation collisions ("A...A1" <-> "A...A2" — neither is in
canonical form).

Side effect: two non-canonical forms that slugify to the same value
no longer bridge (e.g. "Landing Page" vs "LANDING-PAGE"). This is
correct: without one canonical anchor we can't safely call them the
same lineage. Updated the slug-equivalence test to assert the new
semantics explicitly with both directions and a negative case.

Tests: 2 new cases (no bridge for >60-char truncation collision; raw
70-char to its 60-char truncated slug still bridges) + 1 negative
test for the non-canonical-pair case. 45 tests pass.

* fix(stub-guard): cover legacy sidecar-less HTML priors

Round 6 review on PR #1171 (mrcfps, non-blocking) caught a real
legacy bypass: round 4's sidecar-required policy skipped any HTML
file without an .artifact.json companion, but readManifestForPath
(projects.ts) treats those same files as legitimate artifacts via
inferLegacyManifest. So a project with an older sidecar-less
dashboard.html (pre-sidecar era, Write-tool-emitted, paste-text,
manual import, etc.) let its first stub rewrite through as a
supposed "first emission".

Fix: when the sidecar is missing, derive a synthetic identifier
from the filename (strip the (-N)?\.html? suffix) and run it
through the same artifactIdentifiersMatch rules. Synthetic
identifiers come from already-slugified filenames, so they bridge
raw inputs only via the canonical-form rule established in round
5 — no truncation collisions, no empty-slug conflation, no
unrelated cross-identifier matches.

Tests: 3 new unit cases (legacy fallback finds the prior; bridges
raw->slug under the same rules; does NOT bridge unrelated slug
forms via inference) + 1 new HTTP test that seeds a sidecar-less
prior via the artifact-manifest-less write path and asserts the
stub rewrite is rejected with 422 ARTIFACT_REGRESSION.

48 tests pass across stub-guard suites (was 45).

* fix(stub-guard): try both interpretations for legacy filename inference

Round 7 review on PR #1171 (mrcfps, non-blocking) caught a real
ambiguity in the round-6 legacy fallback: a filename like
`phase-2.html` is genuinely ambiguous without a sidecar. It could
be the identifier "phase" with a -2 collision suffix, OR the
standalone identifier "phase-2". The round-6 helper only stripped
the suffix, so a sidecar-less `phase-2.html` followed by a stub
emission with metadata.identifier="phase-2" bypassed the guard
("phase-2" doesn't match the inferred "phase").

Fix: when the sidecar is missing, generate both candidate
identifiers (full basename and suffix-stripped basename) and
accept the file as a prior if either matches. Visible false
positives are preferable to silent false negatives — and the
canonical-form anchor in artifactIdentifiersMatch still rules out
truncation collisions and empty-slug conflations regardless of
which candidate matched.

Tests: 2 new unit cases (full-basename interpretation finds
"phase-2"; suffix-stripped interpretation also finds "phase") and
1 new HTTP test that seeds a sidecar-less `phase-2.html` and
asserts the stub rewrite is rejected with 422 ARTIFACT_REGRESSION.

51 tests pass across stub-guard suites (was 48).

---------

Co-authored-by: Sebastian Westberg <sebastianwestberg@users.noreply.github.com>
This commit is contained in:
Sebastian Westberg 2026-05-11 13:59:37 +02:00 committed by GitHub
parent 0f0d214298
commit 8962088c75
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
8 changed files with 1274 additions and 6 deletions

View file

@ -0,0 +1,304 @@
// Detects "stub" HTML artifact regressions: an agent emits a new artifact
// with the same metadata.identifier as an earlier one, but the body is a
// tiny placeholder ("see <other>.html in this project", a bare filename
// string, an empty fallback page, etc.) instead of the full HTML.
//
// The guard is structural: it compares the new body's size against the
// largest prior sibling sharing the same identifier. It does not pattern-
// match on phrasing, so it works regardless of which agent backend produced
// the regression. False positives are bounded by minPriorBytes (we won't
// compare against priors that are themselves small) and minRetainedRatio.
import type { Dirent } from 'node:fs';
import { readFile, readdir, stat } from 'node:fs/promises';
import path from 'node:path';
export type ArtifactStubGuardMode = 'reject' | 'warn' | 'off';
export interface ArtifactStubGuardConfig {
mode: ArtifactStubGuardMode;
minRetainedRatio: number;
minPriorBytes: number;
}
export interface PriorArtifactSibling {
name: string;
size: number;
}
export interface ArtifactStubGuardWarning {
code: 'ARTIFACT_REGRESSION';
message: string;
identifier: string;
newSize: number;
priorSize: number;
priorName: string;
}
export interface EvaluateArtifactStubGuardInput {
scanDir: string;
identifier: string;
newSize: number;
config: ArtifactStubGuardConfig;
}
export interface EvaluateArtifactStubGuardResult {
outcome: 'pass' | 'warn' | 'reject';
warning?: ArtifactStubGuardWarning;
}
export class ArtifactRegressionError extends Error {
readonly code = 'ARTIFACT_REGRESSION';
readonly identifier: string;
readonly newSize: number;
readonly priorSize: number;
readonly priorName: string;
constructor(message: string, details: { identifier: string; newSize: number; priorSize: number; priorName: string }) {
super(message);
this.name = 'ArtifactRegressionError';
this.identifier = details.identifier;
this.newSize = details.newSize;
this.priorSize = details.priorSize;
this.priorName = details.priorName;
}
}
export const DEFAULT_ARTIFACT_STUB_GUARD_CONFIG: ArtifactStubGuardConfig = {
mode: 'warn',
minRetainedRatio: 0.2,
minPriorBytes: 4096,
};
// HTML-rendered manifest kinds. Decks are HTML files on disk and have the
// same regression failure mode as plain html artifacts (the agent emits a
// placeholder where a multi-KB framework should be), so they're guarded
// alongside `html`.
export const STUB_GUARDED_MANIFEST_KINDS: ReadonlySet<string> = new Set(['html', 'deck']);
function escapeRegExp(value: string): string {
return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
// Mirror of the slugifier in `apps/web/src/components/ProjectView.tsx`'s
// `persistArtifact`. The web path slugifies the identifier for the
// filename basename but keeps the *raw* identifier in the manifest, so a
// regex anchored on the raw identifier alone can miss its own slug-form
// siblings on disk. We try both forms.
export function slugifyArtifactIdentifier(value: string): string {
return value
.toLowerCase()
.replace(/[^a-z0-9_-]+/g, '-')
.replace(/^-+|-+$/g, '')
.slice(0, 60);
}
// Frontend fallback name used by persistArtifact when the slugified
// identifier is empty (e.g. all-non-ASCII identifiers like "测试" strip to
// nothing). Without matching against this, sibling discovery would miss
// the entire "artifact*.html" family that such identifiers produce.
export const EMPTY_SLUG_FALLBACK_NAME = 'artifact';
// Two identifiers refer to the same artifact lineage when they're
// literally equal OR one is the canonical slug form of the other (and
// that slug is non-empty). Slugs alone matching is not enough: the
// frontend slugifier truncates at 60 chars, so two raw identifiers that
// only diverge after character 60 (e.g. "A...A1" and "A...A2", 70 chars
// each) would otherwise falsely bridge. Requiring one side to *be* the
// slug form keeps the "Landing Page" <-> "landing-page" bridge while
// rejecting truncation-induced collisions. Empty-slug equivalence
// (e.g. "测试" vs "首页") is also not treated as a match for the same
// reason — distinct identifiers can both strip to empty.
export function artifactIdentifiersMatch(a: string, b: string): boolean {
if (a === b) return true;
const slugA = slugifyArtifactIdentifier(a);
if (slugA.length === 0) return false;
const slugB = slugifyArtifactIdentifier(b);
if (slugA !== slugB) return false;
return a === slugA || b === slugB;
}
// Reads the canonical identifier from a sibling's `.artifact.json`
// sidecar. Returns null when the sidecar is absent, malformed, or
// missing a string identifier — callers fall back to filename-derived
// inference for legacy artifacts that pre-date the sidecar era.
async function readSidecarIdentifier(scanDir: string, entryName: string): Promise<string | null> {
try {
const raw = await readFile(path.join(scanDir, `${entryName}.artifact.json`), 'utf8');
const parsed = JSON.parse(raw) as { metadata?: { identifier?: unknown } } | null;
const id = parsed?.metadata?.identifier;
return typeof id === 'string' && id.length > 0 ? id : null;
} catch {
return null;
}
}
// Strips the frontend's `-N` collision suffix and the `.html` / `.htm`
// extension to recover the per-lineage basename. Used as one of the
// fallback identifiers for legacy HTML artifacts that don't have a
// sidecar yet (anything written before the manifest era, or HTML
// pasted/uploaded outside the artifact-tag flow). `inferLegacyManifest`
// already treats these as html-kind artifacts elsewhere; we mirror
// that so the guard doesn't silently let a stub overwrite them.
const SYNTHETIC_IDENTIFIER_SUFFIX = /(?:-\d+)?\.html?$/;
const HTML_EXTENSION = /\.html?$/;
function syntheticIdentifierFromFilename(name: string): string {
return name.replace(SYNTHETIC_IDENTIFIER_SUFFIX, '');
}
// `phase-2.html` is genuinely ambiguous: it could be the identifier
// `phase` with a `-2` collision suffix, or the standalone identifier
// `phase-2`. Without a sidecar to disambiguate, both interpretations
// are valid candidates — the guard accepts the file as a prior if
// either matches the input identifier. Visible false positives
// (rejecting a legitimate write the user can override via env) are
// preferable to silent false negatives (stub bypasses the guard).
function legacyCandidateIdentifiers(filename: string): string[] {
const fullBasename = filename.replace(HTML_EXTENSION, '');
const stripped = syntheticIdentifierFromFilename(filename);
const candidates: string[] = [];
if (fullBasename.length > 0) candidates.push(fullBasename);
if (stripped.length > 0 && stripped !== fullBasename) candidates.push(stripped);
return candidates;
}
// Finds prior HTML siblings on disk that share an identifier with a
// newly-written artifact. The frontend's collision-suffixing scheme means
// related entries match `<identifier>(-\d+)?\.html?`. The scan deliberately
// includes any file at the same path as the new write — when an agent
// overwrites `dashboard.html` with the same name, the file currently on
// disk is the prior content (the overwrite happens after this scan).
export async function findPriorArtifactSiblings(
scanDir: string,
identifier: string,
): Promise<PriorArtifactSibling[]> {
if (identifier.length === 0) return [];
const tokens = new Set<string>();
tokens.add(identifier);
const slug = slugifyArtifactIdentifier(identifier);
if (slug.length > 0) tokens.add(slug);
// When the identifier slugifies to empty (e.g. all-non-ASCII), the web
// path falls back to the literal "artifact" basename. Match that family
// so a later artifact-2.html stub doesn't bypass the prior.
else tokens.add(EMPTY_SLUG_FALLBACK_NAME);
const alternation = Array.from(tokens, escapeRegExp).join('|');
const pattern = new RegExp(`^(?:${alternation})(?:-\\d+)?\\.html?$`);
let entries: Dirent[];
try {
entries = await readdir(scanDir, { withFileTypes: true });
} catch {
return [];
}
const results: PriorArtifactSibling[] = [];
for (const entry of entries) {
if (!entry.isFile()) continue;
if (!pattern.test(entry.name)) continue;
// Verify the sibling's canonical identifier matches before treating
// it as a prior. Without this check, distinct identifiers that share
// a sibling-name namespace (most reachably the empty-slug fallback,
// where 测试 and 首页 both land in artifact*.html) would falsely
// warn/reject across each other.
//
// For legacy HTML artifacts without a sidecar (pre-manifest era,
// Write-tool, paste-text, manual import), we fall back to filename-
// derived candidates. Because a name like `phase-2.html` is
// genuinely ambiguous between "phase + collision suffix -2" and "the
// standalone identifier phase-2", we try both interpretations and
// accept the file as a prior if either matches. The canonical-form
// anchor in artifactIdentifiersMatch still rules out truncation
// collisions and empty-slug conflations.
const sidecarIdentifier = await readSidecarIdentifier(scanDir, entry.name);
const candidateIdentifiers = sidecarIdentifier !== null
? [sidecarIdentifier]
: legacyCandidateIdentifiers(entry.name);
if (candidateIdentifiers.length === 0) continue;
if (!candidateIdentifiers.some((c) => artifactIdentifiersMatch(identifier, c))) continue;
try {
const st = await stat(path.join(scanDir, entry.name));
results.push({ name: entry.name, size: st.size });
} catch {
// ignore unreadable entries; they don't influence the guard decision
}
}
return results;
}
export function readArtifactStubGuardConfigFromEnv(env: NodeJS.ProcessEnv = process.env): ArtifactStubGuardConfig {
const rawMode = (env.OD_ARTIFACT_STUB_GUARD ?? '').toLowerCase();
const mode: ArtifactStubGuardMode =
rawMode === 'reject' || rawMode === 'warn' || rawMode === 'off'
? rawMode
: DEFAULT_ARTIFACT_STUB_GUARD_CONFIG.mode;
const ratioRaw = Number(env.OD_ARTIFACT_STUB_GUARD_MIN_RATIO);
// Accept (0, 1] so users can set 1 to reject any shrinkage. Values <=0
// or >1 fall back to default.
const minRetainedRatio =
Number.isFinite(ratioRaw) && ratioRaw > 0 && ratioRaw <= 1
? ratioRaw
: DEFAULT_ARTIFACT_STUB_GUARD_CONFIG.minRetainedRatio;
const minPriorBytesRaw = Number(env.OD_ARTIFACT_STUB_GUARD_MIN_PRIOR_BYTES);
const minPriorBytes =
Number.isInteger(minPriorBytesRaw) && minPriorBytesRaw > 0
? minPriorBytesRaw
: DEFAULT_ARTIFACT_STUB_GUARD_CONFIG.minPriorBytes;
return { mode, minRetainedRatio, minPriorBytes };
}
function buildWarning(
identifier: string,
newSize: number,
prior: PriorArtifactSibling,
): ArtifactStubGuardWarning {
return {
code: 'ARTIFACT_REGRESSION',
message:
`New artifact body for identifier "${identifier}" is ${newSize} bytes, ` +
`but the largest prior sibling "${prior.name}" is ${prior.size} bytes. ` +
'This pattern usually means the agent emitted a placeholder instead of the full document. ' +
'Set OD_ARTIFACT_STUB_GUARD=warn to record the warning without rejecting, or =off to disable the guard entirely.',
identifier,
newSize,
priorSize: prior.size,
priorName: prior.name,
};
}
// Pure decision function: given the prior siblings on disk, decide whether
// the new body is a stub regression. Splitting this from the disk scan
// keeps the unit tests fast and lets callers pre-fetch siblings.
export function classifyArtifactStubGuard(
priors: PriorArtifactSibling[],
identifier: string,
newSize: number,
config: ArtifactStubGuardConfig,
): EvaluateArtifactStubGuardResult {
if (config.mode === 'off') return { outcome: 'pass' };
if (identifier.length === 0) return { outcome: 'pass' };
if (priors.length === 0) return { outcome: 'pass' };
let largest: PriorArtifactSibling | null = null;
for (const prior of priors) {
if (largest === null || prior.size > largest.size) largest = prior;
}
if (largest === null) return { outcome: 'pass' };
if (largest.size < config.minPriorBytes) return { outcome: 'pass' };
const threshold = largest.size * config.minRetainedRatio;
if (newSize >= threshold) return { outcome: 'pass' };
const warning = buildWarning(identifier, newSize, largest);
return { outcome: config.mode === 'reject' ? 'reject' : 'warn', warning };
}
export async function evaluateArtifactStubGuard(
input: EvaluateArtifactStubGuardInput,
): Promise<EvaluateArtifactStubGuardResult> {
if (input.config.mode === 'off') return { outcome: 'pass' };
if (input.identifier.length === 0) return { outcome: 'pass' };
const priors = await findPriorArtifactSiblings(input.scanDir, input.identifier);
return classifyArtifactStubGuard(priors, input.identifier, input.newSize, input.config);
}

View file

@ -1,4 +1,5 @@
import type { Express } from 'express';
import { ArtifactRegressionError } from './artifact-stub-guard.js';
import type { RouteDeps } from './server-context.js';
export interface RegisterProjectRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'paths' | 'projectStore' | 'projectFiles' | 'conversations' | 'templates' | 'status' | 'events' | 'ids'> {}
@ -900,6 +901,16 @@ export function registerProjectFileRoutes(app: Express, ctx: RegisterProjectFile
const body = { file: meta };
res.json(body);
} catch (err: any) {
if (err instanceof ArtifactRegressionError) {
return sendApiError(res, 422, 'ARTIFACT_REGRESSION', err.message, {
details: {
identifier: err.identifier,
newSize: err.newSize,
priorSize: err.priorSize,
priorName: err.priorName,
},
});
}
sendApiError(res, 500, 'INTERNAL_ERROR', 'upload failed');
}
},

View file

@ -15,6 +15,12 @@ import {
parsePersistedManifest,
validateArtifactManifestInput,
} from './artifact-manifest.js';
import {
ArtifactRegressionError,
STUB_GUARDED_MANIFEST_KINDS,
evaluateArtifactStubGuard,
readArtifactStubGuardConfigFromEnv,
} from './artifact-stub-guard.js';
const FORBIDDEN_SEGMENT = /^$|^\.\.?$/;
const RESERVED_PROJECT_FILE_SEGMENTS = new Set(['.live-artifacts']);
@ -395,19 +401,60 @@ export async function writeProjectFile(
}
}
await mkdir(path.dirname(target), { recursive: true });
await writeFile(target, body);
let stubGuardWarning = null;
let validatedManifest = null;
if (artifactManifest && typeof artifactManifest === 'object') {
const manifestFileName = artifactManifestNameFor(safeName);
const manifestTarget = await resolveSafeReal(dir, manifestFileName);
const validated = validateArtifactManifestInput(artifactManifest, safeName);
if (validated.ok && validated.value) {
const nextManifest = validated.value;
await writeFile(manifestTarget, JSON.stringify(nextManifest, null, 2));
validatedManifest = validated.value;
const identifier = typeof validatedManifest.metadata?.identifier === 'string'
? validatedManifest.metadata.identifier
: '';
// Stub-guard applies to HTML-rendered manifest kinds (html, deck).
// Other kinds (markdown, svg, code-snippet) can legitimately be small
// and are skipped.
if (identifier.length > 0 && STUB_GUARDED_MANIFEST_KINDS.has(validatedManifest.kind)) {
// Scan the directory the new file actually lands in, not the project
// root — writeProjectFile accepts nested paths like reports/X.html
// and a root-only scan would miss prior siblings in subdirectories.
const guard = await evaluateArtifactStubGuard({
scanDir: path.dirname(target),
identifier,
newSize: Buffer.byteLength(body),
config: readArtifactStubGuardConfigFromEnv(),
});
if ((guard.outcome === 'reject' || guard.outcome === 'warn') && guard.warning) {
// Operator-visible signal regardless of mode, so on-call can see
// how often the guard fires without combing through 422s.
console.warn(
`[stub-guard] ${guard.outcome} identifier=${guard.warning.identifier} ` +
`newSize=${guard.warning.newSize} priorSize=${guard.warning.priorSize} ` +
`priorName=${guard.warning.priorName} project=${projectId}`,
);
}
if (guard.outcome === 'reject' && guard.warning) {
throw new ArtifactRegressionError(guard.warning.message, {
identifier: guard.warning.identifier,
newSize: guard.warning.newSize,
priorSize: guard.warning.priorSize,
priorName: guard.warning.priorName,
});
}
if (guard.outcome === 'warn' && guard.warning) {
stubGuardWarning = guard.warning;
}
}
}
}
await writeFile(target, body);
if (validatedManifest) {
const manifestFileName = artifactManifestNameFor(safeName);
const manifestTarget = await resolveSafeReal(dir, manifestFileName);
await writeFile(manifestTarget, JSON.stringify(validatedManifest, null, 2));
}
const st = await stat(target);
const persistedManifest = await readManifestForPath(dir, safeName);
return {
const result = {
name: safeName,
path: safeName,
size: st.size,
@ -417,6 +464,8 @@ export async function writeProjectFile(
artifactKind: persistedManifest?.kind,
artifactManifest: persistedManifest,
};
if (stubGuardWarning) result.stubGuardWarning = stubGuardWarning;
return result;
}
function artifactManifestNameFor(name) {

View file

@ -0,0 +1,432 @@
import { mkdtemp, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import path from 'node:path';
import { afterEach, describe, expect, it } from 'vitest';
import {
ArtifactRegressionError,
DEFAULT_ARTIFACT_STUB_GUARD_CONFIG,
artifactIdentifiersMatch,
classifyArtifactStubGuard,
evaluateArtifactStubGuard,
findPriorArtifactSiblings,
readArtifactStubGuardConfigFromEnv,
slugifyArtifactIdentifier,
type ArtifactStubGuardConfig,
} from '../src/artifact-stub-guard.js';
// Helper: write an artifact body + the sidecar manifest the production
// write path would produce. Sibling discovery requires the sidecar to
// verify the canonical identifier, so unit fixtures must include both.
async function writeArtifactPair(dir: string, name: string, body: string, identifier: string): Promise<void> {
await writeFile(path.join(dir, name), body);
const manifest = {
version: 1,
kind: 'html',
title: identifier,
entry: name,
renderer: 'html',
status: 'complete',
exports: ['html'],
metadata: { identifier, artifactType: 'text/html', inferred: false },
};
await writeFile(path.join(dir, `${name}.artifact.json`), JSON.stringify(manifest));
}
function rejectingConfig(overrides: Partial<ArtifactStubGuardConfig> = {}): ArtifactStubGuardConfig {
return { ...DEFAULT_ARTIFACT_STUB_GUARD_CONFIG, mode: 'reject', ...overrides };
}
function warningConfig(overrides: Partial<ArtifactStubGuardConfig> = {}): ArtifactStubGuardConfig {
return { ...DEFAULT_ARTIFACT_STUB_GUARD_CONFIG, mode: 'warn', ...overrides };
}
describe('classifyArtifactStubGuard', () => {
it('passes when no priors exist', () => {
const result = classifyArtifactStubGuard([], 'dashboard', 80, rejectingConfig());
expect(result.outcome).toBe('pass');
expect(result.warning).toBeUndefined();
});
it('passes when guard mode is off', () => {
const result = classifyArtifactStubGuard(
[{ name: 'dashboard.html', size: 80_000 }],
'dashboard',
120,
{ ...DEFAULT_ARTIFACT_STUB_GUARD_CONFIG, mode: 'off' },
);
expect(result.outcome).toBe('pass');
});
it('passes when identifier is empty', () => {
const result = classifyArtifactStubGuard(
[{ name: 'dashboard.html', size: 80_000 }],
'',
120,
rejectingConfig(),
);
expect(result.outcome).toBe('pass');
});
it('passes when largest prior is below the floor', () => {
const result = classifyArtifactStubGuard(
[{ name: 'dashboard.html', size: 1_024 }],
'dashboard',
32,
rejectingConfig({ minPriorBytes: 4_096 }),
);
expect(result.outcome).toBe('pass');
});
it('passes when the new body keeps at least minRetainedRatio of the prior', () => {
const result = classifyArtifactStubGuard(
[{ name: 'dashboard.html', size: 80_000 }],
'dashboard',
40_000,
rejectingConfig({ minRetainedRatio: 0.2 }),
);
expect(result.outcome).toBe('pass');
});
it('rejects when the new body collapses below the ratio of the largest prior', () => {
const result = classifyArtifactStubGuard(
[
{ name: 'dashboard.html', size: 80_000 },
{ name: 'dashboard-2.html', size: 95_000 },
],
'dashboard',
120,
rejectingConfig({ minRetainedRatio: 0.2, minPriorBytes: 4_096 }),
);
expect(result.outcome).toBe('reject');
expect(result.warning).toMatchObject({
code: 'ARTIFACT_REGRESSION',
identifier: 'dashboard',
newSize: 120,
priorSize: 95_000,
priorName: 'dashboard-2.html',
});
expect(result.warning?.message).toContain('dashboard-2.html');
});
it('warns instead of rejecting when mode is warn', () => {
const result = classifyArtifactStubGuard(
[{ name: 'report.html', size: 50_000 }],
'report',
300,
warningConfig(),
);
expect(result.outcome).toBe('warn');
expect(result.warning?.priorSize).toBe(50_000);
});
});
describe('findPriorArtifactSiblings', () => {
const tempDirs: string[] = [];
afterEach(async () => {
for (const dir of tempDirs.splice(0)) {
await rm(dir, { recursive: true, force: true });
}
});
async function makeDir() {
const dir = await mkdtemp(path.join(tmpdir(), 'od-stub-guard-'));
tempDirs.push(dir);
return dir;
}
it('finds bare and suffixed siblings, including the same-named target if it exists', async () => {
const dir = await makeDir();
await writeArtifactPair(dir, 'report.html', 'a'.repeat(20_000), 'report');
await writeArtifactPair(dir, 'report-2.html', 'b'.repeat(40_000), 'report');
await writeArtifactPair(dir, 'report-3.html', 'c'.repeat(60_000), 'report');
await writeArtifactPair(dir, 'unrelated.html', 'x'.repeat(50_000), 'unrelated');
// The target 'report-3.html' is included because it currently exists on
// disk and its current size is the prior content (the overwrite that
// would replace it has not happened yet at scan time). This is the
// same-name-overwrite case: see lefarcen P1.
const priors = await findPriorArtifactSiblings(dir, 'report');
const names = priors.map((p) => p.name).sort();
expect(names).toEqual(['report-2.html', 'report-3.html', 'report.html']);
});
it('returns an empty list when the directory does not exist', async () => {
const priors = await findPriorArtifactSiblings('/nonexistent/od/projects/missing', 'dashboard');
expect(priors).toEqual([]);
});
it('does not match identifiers that share a prefix', async () => {
const dir = await makeDir();
await writeArtifactPair(dir, 'landing.html', 'a'.repeat(1_000), 'landing');
await writeArtifactPair(dir, 'landing-page.html', 'b'.repeat(1_000), 'landing-page');
const priors = await findPriorArtifactSiblings(dir, 'landing');
const names = priors.map((p) => p.name).sort();
expect(names).toEqual(['landing.html']);
});
it('also matches .htm siblings', async () => {
const dir = await makeDir();
await writeArtifactPair(dir, 'overview-doc.htm', 'a'.repeat(20_000), 'overview-doc');
await writeArtifactPair(dir, 'overview-doc-2.html', 'b'.repeat(30_000), 'overview-doc');
const priors = await findPriorArtifactSiblings(dir, 'overview-doc');
const names = priors.map((p) => p.name).sort();
expect(names).toEqual(['overview-doc-2.html', 'overview-doc.htm']);
});
it('matches siblings using the slugified form of a non-slug identifier', async () => {
const dir = await makeDir();
// Frontend persistArtifact slugifies "Landing Page" -> "landing-page"
// for the filename but keeps the raw "Landing Page" in the manifest.
// Both forms refer to the same lineage; sidecar identity uses
// slug-equivalence to bridge them.
await writeArtifactPair(dir, 'landing-page.html', 'a'.repeat(40_000), 'Landing Page');
const priors = await findPriorArtifactSiblings(dir, 'Landing Page');
expect(priors.map((p) => p.name)).toEqual(['landing-page.html']);
});
it('falls back to the "artifact" basename when the identifier slugifies to empty', async () => {
const dir = await makeDir();
// Identifier like "测试" (or any all-non-ASCII / punctuation-only
// string) strips to "" through the web slugifier and persistArtifact's
// `|| 'artifact'` fallback writes it as artifact.html / artifact-2.html.
await writeArtifactPair(dir, 'artifact.html', 'a'.repeat(40_000), '测试');
await writeArtifactPair(dir, 'artifact-2.html', 'b'.repeat(60_000), '测试');
const priors = await findPriorArtifactSiblings(dir, '测试');
const names = priors.map((p) => p.name).sort();
expect(names).toEqual(['artifact-2.html', 'artifact.html']);
});
it('does NOT match a fallback sibling whose sidecar identifier differs (lefarcen/mrcfps round 4)', async () => {
const dir = await makeDir();
// Two distinct empty-slug identifiers both land in the artifact*.html
// namespace. A new save for "首页" must not be compared against the
// earlier "测试" sibling — they're unrelated artifacts that just
// happen to share a fallback basename.
await writeArtifactPair(dir, 'artifact.html', 'a'.repeat(40_000), '测试');
const priors = await findPriorArtifactSiblings(dir, '首页');
expect(priors).toEqual([]);
});
it('falls back to filename-derived identifier for legacy sidecar-less HTML (mrcfps R6)', async () => {
const dir = await makeDir();
// Pre-sidecar legacy file — `inferLegacyManifest` treats it as
// html-kind elsewhere, so the guard should too. Without this
// fallback, a stub overwrite of a legacy `dashboard.html` would
// bypass the guard as a "first emission".
await writeFile(path.join(dir, 'dashboard.html'), 'a'.repeat(40_000));
const priors = await findPriorArtifactSiblings(dir, 'dashboard');
expect(priors.map((p) => p.name)).toEqual(['dashboard.html']);
});
it('legacy fallback bridges raw <-> slug per artifactIdentifiersMatch rules', async () => {
const dir = await makeDir();
// Legacy file basename is the canonical slug form. Input identifier
// "Landing Page" must still bridge to it via slug-equivalence, just
// like the sidecar case.
await writeFile(path.join(dir, 'landing-page.html'), 'a'.repeat(40_000));
const priors = await findPriorArtifactSiblings(dir, 'Landing Page');
expect(priors.map((p) => p.name)).toEqual(['landing-page.html']);
});
it('legacy fallback does NOT bridge unrelated identifiers via filename inference', async () => {
const dir = await makeDir();
// Legacy file basename `dashboard` should not match identifier
// `legacy-dashboard` even though both are slug-form.
await writeFile(path.join(dir, 'dashboard.html'), 'a'.repeat(40_000));
const priors = await findPriorArtifactSiblings(dir, 'legacy-dashboard');
expect(priors).toEqual([]);
});
it('legacy fallback honors identifiers that legitimately end in -<digits> (mrcfps R7)', async () => {
const dir = await makeDir();
// `phase-2.html` is ambiguous without a sidecar: could be "phase"
// with -2 collision suffix, or the standalone identifier "phase-2".
// The guard tries both; here the input names "phase-2" so the full
// basename interpretation must match.
await writeFile(path.join(dir, 'phase-2.html'), 'a'.repeat(40_000));
const priors = await findPriorArtifactSiblings(dir, 'phase-2');
expect(priors.map((p) => p.name)).toEqual(['phase-2.html']);
});
it('legacy fallback also honors the suffix-stripped interpretation', async () => {
const dir = await makeDir();
// Same on-disk file, different input: this time the agent is
// emitting under the `phase` identifier (treating the -2 as a
// collision suffix). The suffix-stripped interpretation must match.
await writeFile(path.join(dir, 'phase-2.html'), 'a'.repeat(40_000));
const priors = await findPriorArtifactSiblings(dir, 'phase');
expect(priors.map((p) => p.name)).toEqual(['phase-2.html']);
});
});
describe('artifactIdentifiersMatch', () => {
it('matches identical raw identifiers', () => {
expect(artifactIdentifiersMatch('dashboard', 'dashboard')).toBe(true);
});
it('bridges raw form to its canonical slug form', () => {
expect(artifactIdentifiersMatch('Landing Page', 'landing-page')).toBe(true);
expect(artifactIdentifiersMatch('landing-page', 'Landing Page')).toBe(true);
});
it('does NOT bridge two non-canonical forms even if they slugify the same', () => {
// Both inputs slugify to "landing-page" but neither IS the canonical
// slug form, so we can't safely call them the same lineage. This is
// the same safety property that protects against truncation
// collisions for >60-char identifiers.
expect(artifactIdentifiersMatch('Landing Page', 'LANDING-PAGE')).toBe(false);
});
it('does not match distinct identifiers that both slugify to empty', () => {
expect(artifactIdentifiersMatch('测试', '首页')).toBe(false);
expect(artifactIdentifiersMatch('!!!', '???')).toBe(false);
});
it('matches a non-ASCII identifier with itself even when its slug is empty', () => {
expect(artifactIdentifiersMatch('测试', '测试')).toBe(true);
});
it('does not match unrelated identifiers with different slugs', () => {
expect(artifactIdentifiersMatch('dashboard', 'legacy-dashboard')).toBe(false);
});
it('does NOT bridge two long raw identifiers that share a 60-char truncated slug (mrcfps R5)', () => {
// Both >60 chars and identical for the first 60, differing only after.
// Their slugify outputs collide via truncation, but neither is the
// canonical slug form of itself, so they must not bridge.
const sixtyAs = 'a'.repeat(60);
const a = `${sixtyAs}-suffix-one`;
const b = `${sixtyAs}-suffix-two`;
expect(a).not.toBe(b);
expect(artifactIdentifiersMatch(a, b)).toBe(false);
});
it('still bridges raw form to truncated slug when the slug is the canonical second input', () => {
// The standard "Landing Page" <-> "landing-page" case must still work.
// Asserting via inputs that hit the truncation boundary: a 70-char raw
// identifier whose slug is the truncated-to-60 form, paired with that
// truncated form passed in directly, should still match.
const slug = 'a'.repeat(60);
const raw = 'a'.repeat(70);
expect(slugifyArtifactIdentifier(raw)).toBe(slug);
expect(artifactIdentifiersMatch(raw, slug)).toBe(true);
});
});
describe('evaluateArtifactStubGuard (integration with disk scan)', () => {
const tempDirs: string[] = [];
afterEach(async () => {
for (const dir of tempDirs.splice(0)) {
await rm(dir, { recursive: true, force: true });
}
});
async function makeDir() {
const dir = await mkdtemp(path.join(tmpdir(), 'od-stub-guard-eval-'));
tempDirs.push(dir);
return dir;
}
it('rejects a stub-sized rewrite of an existing identifier', async () => {
const dir = await makeDir();
await writeArtifactPair(dir, 'presentation.html', 'p'.repeat(60_000), 'presentation');
const result = await evaluateArtifactStubGuard({
scanDir: dir,
identifier: 'presentation',
newSize: 200,
config: rejectingConfig(),
});
expect(result.outcome).toBe('reject');
expect(result.warning?.priorName).toBe('presentation.html');
});
it('passes when the new body comparable in size to the prior', async () => {
const dir = await makeDir();
await writeArtifactPair(dir, 'presentation.html', 'p'.repeat(60_000), 'presentation');
const result = await evaluateArtifactStubGuard({
scanDir: dir,
identifier: 'presentation',
newSize: 50_000,
config: rejectingConfig(),
});
expect(result.outcome).toBe('pass');
});
});
describe('readArtifactStubGuardConfigFromEnv', () => {
it('returns defaults when env vars are absent', () => {
const config = readArtifactStubGuardConfigFromEnv({});
expect(config).toEqual(DEFAULT_ARTIFACT_STUB_GUARD_CONFIG);
});
it('parses recognised mode values', () => {
expect(readArtifactStubGuardConfigFromEnv({ OD_ARTIFACT_STUB_GUARD: 'reject' }).mode).toBe('reject');
expect(readArtifactStubGuardConfigFromEnv({ OD_ARTIFACT_STUB_GUARD: 'WARN' }).mode).toBe('warn');
expect(readArtifactStubGuardConfigFromEnv({ OD_ARTIFACT_STUB_GUARD: 'off' }).mode).toBe('off');
});
it('falls back to default when mode is unrecognised', () => {
expect(readArtifactStubGuardConfigFromEnv({ OD_ARTIFACT_STUB_GUARD: 'maybe' }).mode).toBe(
DEFAULT_ARTIFACT_STUB_GUARD_CONFIG.mode,
);
});
it('honours numeric overrides within range', () => {
const config = readArtifactStubGuardConfigFromEnv({
OD_ARTIFACT_STUB_GUARD_MIN_RATIO: '0.35',
OD_ARTIFACT_STUB_GUARD_MIN_PRIOR_BYTES: '8192',
});
expect(config.minRetainedRatio).toBeCloseTo(0.35);
expect(config.minPriorBytes).toBe(8_192);
});
it('accepts ratio = 1 to reject any shrinkage', () => {
const config = readArtifactStubGuardConfigFromEnv({
OD_ARTIFACT_STUB_GUARD_MIN_RATIO: '1',
});
expect(config.minRetainedRatio).toBe(1);
});
it('rejects out-of-range numeric overrides', () => {
const config = readArtifactStubGuardConfigFromEnv({
OD_ARTIFACT_STUB_GUARD_MIN_RATIO: '5',
OD_ARTIFACT_STUB_GUARD_MIN_PRIOR_BYTES: '-12',
});
expect(config.minRetainedRatio).toBe(DEFAULT_ARTIFACT_STUB_GUARD_CONFIG.minRetainedRatio);
expect(config.minPriorBytes).toBe(DEFAULT_ARTIFACT_STUB_GUARD_CONFIG.minPriorBytes);
});
});
describe('ArtifactRegressionError', () => {
it('carries identifier, sizes, and prior name in details', () => {
const err = new ArtifactRegressionError('regression', {
identifier: 'dashboard',
newSize: 100,
priorSize: 50_000,
priorName: 'dashboard.html',
});
expect(err.code).toBe('ARTIFACT_REGRESSION');
expect(err.name).toBe('ArtifactRegressionError');
expect(err.identifier).toBe('dashboard');
expect(err.newSize).toBe(100);
expect(err.priorSize).toBe(50_000);
expect(err.priorName).toBe('dashboard.html');
});
});

View file

@ -0,0 +1,427 @@
import type http from 'node:http';
import { afterAll, afterEach, beforeAll, describe, expect, it, vi } from 'vitest';
import { startServer } from '../src/server.js';
interface FilePostBody {
name: string;
content: string;
encoding?: 'utf8' | 'base64';
artifactManifest?: unknown;
}
function htmlBody(byteLength: number): string {
const filler = 'x'.repeat(Math.max(0, byteLength - 96));
return `<!doctype html><html><head><title>Doc</title></head><body><main>${filler}</main></body></html>`;
}
function manifestFor(identifier: string, kind: 'html' | 'deck' = 'html') {
return {
kind,
renderer: kind === 'deck' ? 'deck-html' : 'html',
title: identifier,
exports: kind === 'deck' ? ['html', 'pdf'] : ['html'],
metadata: { identifier },
};
}
describe('artifact stub guard via /api/projects/:id/files', () => {
let server: http.Server;
let baseUrl: string;
beforeAll(async () => {
vi.stubEnv('OD_ARTIFACT_STUB_GUARD', 'reject');
vi.stubEnv('OD_ARTIFACT_STUB_GUARD_MIN_PRIOR_BYTES', '1024');
vi.stubEnv('OD_ARTIFACT_STUB_GUARD_MIN_RATIO', '0.2');
const started = (await startServer({ port: 0, returnServer: true })) as {
url: string;
server: http.Server;
};
baseUrl = started.url;
server = started.server;
});
afterAll(() => {
vi.unstubAllEnvs();
return new Promise<void>((resolve) => server.close(() => resolve()));
});
afterEach(() => {
// Each test resets the mode it changed; default back to reject.
vi.stubEnv('OD_ARTIFACT_STUB_GUARD', 'reject');
});
async function createProject(prefix: string) {
const id = `${prefix}-${Date.now()}-${Math.random().toString(36).slice(2)}`;
const resp = await fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ id, name: id }),
});
expect(resp.status).toBe(200);
return id;
}
async function postFile(projectId: string, body: FilePostBody) {
return fetch(`${baseUrl}/api/projects/${projectId}/files`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
});
}
it('rejects a stub-sized rewrite with ARTIFACT_REGRESSION', async () => {
const projectId = await createProject('reject');
const firstResp = await postFile(projectId, {
name: 'dashboard.html',
content: htmlBody(20_000),
artifactManifest: manifestFor('dashboard'),
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'dashboard-2.html',
content: 'See dashboard.html in this project — full standalone file written to disk.',
artifactManifest: manifestFor('dashboard'),
});
expect(stubResp.status).toBe(422);
const stubBody = (await stubResp.json()) as {
error: { code: string; message: string; details?: { identifier?: string; priorName?: string } };
};
expect(stubBody.error.code).toBe('ARTIFACT_REGRESSION');
expect(stubBody.error.details?.identifier).toBe('dashboard');
expect(stubBody.error.details?.priorName).toBe('dashboard.html');
});
it('does not write the new file or its manifest when rejected', async () => {
const projectId = await createProject('not-written');
const firstResp = await postFile(projectId, {
name: 'report.html',
content: htmlBody(20_000),
artifactManifest: manifestFor('report'),
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'report-2.html',
content: '<html><body>see report.html</body></html>',
artifactManifest: manifestFor('report'),
});
expect(stubResp.status).toBe(422);
const listResp = await fetch(`${baseUrl}/api/projects/${projectId}/files`);
expect(listResp.status).toBe(200);
const listBody = (await listResp.json()) as { files: Array<{ name: string }> };
const names = listBody.files.map((f) => f.name).sort();
expect(names).not.toContain('report-2.html');
expect(names).not.toContain('report-2.html.artifact.json');
expect(names).toContain('report.html');
});
it('allows a same-size revision through', async () => {
const projectId = await createProject('allow');
const firstResp = await postFile(projectId, {
name: 'landing-page.html',
content: htmlBody(20_000),
artifactManifest: manifestFor('landing-page'),
});
expect(firstResp.status).toBe(200);
const secondResp = await postFile(projectId, {
name: 'landing-page-2.html',
content: htmlBody(20_500),
artifactManifest: manifestFor('landing-page'),
});
expect(secondResp.status).toBe(200);
});
it('warns instead of rejecting when guard mode is warn', async () => {
vi.stubEnv('OD_ARTIFACT_STUB_GUARD', 'warn');
const projectId = await createProject('warn');
const firstResp = await postFile(projectId, {
name: 'overview.html',
content: htmlBody(20_000),
artifactManifest: manifestFor('overview'),
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'overview-2.html',
content: '<html><body>placeholder</body></html>',
artifactManifest: manifestFor('overview'),
});
expect(stubResp.status).toBe(200);
const writeBody = (await stubResp.json()) as {
file: { name: string; stubGuardWarning?: { code: string; identifier: string } };
};
expect(writeBody.file.name).toBe('overview-2.html');
expect(writeBody.file.stubGuardWarning?.code).toBe('ARTIFACT_REGRESSION');
expect(writeBody.file.stubGuardWarning?.identifier).toBe('overview');
});
it('skips the guard entirely when mode is off', async () => {
vi.stubEnv('OD_ARTIFACT_STUB_GUARD', 'off');
const projectId = await createProject('off');
const firstResp = await postFile(projectId, {
name: 'briefing.html',
content: htmlBody(20_000),
artifactManifest: manifestFor('briefing'),
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'briefing-2.html',
content: '<html><body>see briefing.html</body></html>',
artifactManifest: manifestFor('briefing'),
});
expect(stubResp.status).toBe(200);
const writeBody = (await stubResp.json()) as { file: { stubGuardWarning?: unknown } };
expect(writeBody.file.stubGuardWarning).toBeUndefined();
});
it('accepts a stub-sized first emission of a new identifier', async () => {
const projectId = await createProject('first');
const resp = await postFile(projectId, {
name: 'changelog.html',
content: '<html><body>tiny</body></html>',
artifactManifest: manifestFor('changelog'),
});
expect(resp.status).toBe(200);
});
it('rejects a stub rewrite of a deck artifact (kind: deck)', async () => {
const projectId = await createProject('deck');
const firstResp = await postFile(projectId, {
name: 'kickoff-deck.html',
content: htmlBody(40_000),
artifactManifest: manifestFor('kickoff-deck', 'deck'),
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'kickoff-deck-2.html',
content: '<html><body>see kickoff-deck.html</body></html>',
artifactManifest: manifestFor('kickoff-deck', 'deck'),
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as { error: { code: string } };
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
});
it('detects prior siblings written with .htm extension', async () => {
const projectId = await createProject('htm');
const firstResp = await postFile(projectId, {
name: 'overview-doc.htm',
content: htmlBody(20_000),
artifactManifest: manifestFor('overview-doc'),
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'overview-doc-2.html',
content: '<html><body>see overview-doc.htm</body></html>',
artifactManifest: manifestFor('overview-doc'),
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as {
error: { code: string; details?: { priorName?: string } };
};
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
expect(body.error.details?.priorName).toBe('overview-doc.htm');
});
it('rejects a same-name overwrite that shrinks the existing file (lefarcen P1)', async () => {
const projectId = await createProject('overwrite');
const firstResp = await postFile(projectId, {
name: 'dashboard.html',
content: htmlBody(20_000),
artifactManifest: manifestFor('dashboard'),
});
expect(firstResp.status).toBe(200);
// Same name, same identifier, stub body: existing file is the prior.
const stubResp = await postFile(projectId, {
name: 'dashboard.html',
content: '<html><body>see dashboard.html</body></html>',
artifactManifest: manifestFor('dashboard'),
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as {
error: { code: string; details?: { priorName?: string } };
};
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
expect(body.error.details?.priorName).toBe('dashboard.html');
// Confirm the original 20 KB file is intact (not overwritten).
const filesResp = await fetch(`${baseUrl}/api/projects/${projectId}/files`);
const files = (await filesResp.json()) as { files: Array<{ name: string; size: number }> };
const dashboard = files.files.find((f) => f.name === 'dashboard.html');
expect(dashboard?.size).toBeGreaterThan(15_000);
});
it('rejects stub regressions in subdirectories (Codex/mrcfps P2)', async () => {
const projectId = await createProject('nested');
const firstResp = await postFile(projectId, {
name: 'reports/overview.html',
content: htmlBody(20_000),
artifactManifest: manifestFor('overview'),
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'reports/overview-2.html',
content: '<html><body>see reports/overview.html</body></html>',
artifactManifest: manifestFor('overview'),
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as {
error: { code: string; details?: { priorName?: string } };
};
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
expect(body.error.details?.priorName).toBe('overview.html');
});
it('finds slug-form sibling when manifest carries non-slug identifier (Codex/lefarcen/mrcfps P2)', async () => {
const projectId = await createProject('slug');
// Frontend wrote the previous artifact under a slugified name but the
// manifest carried the raw identifier "Landing Page".
const firstResp = await postFile(projectId, {
name: 'landing-page.html',
content: htmlBody(20_000),
artifactManifest: { ...manifestFor('landing-page'), metadata: { identifier: 'Landing Page' } },
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'landing-page-2.html',
content: '<html><body>see landing-page.html</body></html>',
artifactManifest: { ...manifestFor('landing-page'), metadata: { identifier: 'Landing Page' } },
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as { error: { code: string } };
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
});
it('finds artifact.html siblings when identifier slugifies to empty (lefarcen P2)', async () => {
const projectId = await createProject('empty-slug');
// Frontend persistArtifact slugifies "测试" -> "" -> falls back to
// "artifact", so the file lands as artifact.html. A subsequent stub
// emission with the same non-ASCII identifier must still find the
// prior via the empty-slug fallback.
const firstResp = await postFile(projectId, {
name: 'artifact.html',
content: htmlBody(20_000),
artifactManifest: { ...manifestFor('artifact'), metadata: { identifier: '测试' } },
});
expect(firstResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'artifact-2.html',
content: '<html><body>see artifact.html</body></html>',
artifactManifest: { ...manifestFor('artifact'), metadata: { identifier: '测试' } },
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as {
error: { code: string; details?: { priorName?: string } };
};
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
expect(body.error.details?.priorName).toBe('artifact.html');
});
it('does NOT cross-reject distinct empty-slug identifiers (lefarcen/mrcfps round 4)', async () => {
const projectId = await createProject('empty-slug-distinct');
// First save: identifier "测试", lands as artifact.html with a 20 KB
// body. Sidecar carries identifier="测试".
const firstResp = await postFile(projectId, {
name: 'artifact.html',
content: htmlBody(20_000),
artifactManifest: { ...manifestFor('artifact'), metadata: { identifier: '测试' } },
});
expect(firstResp.status).toBe(200);
// Second save: a *different* non-ASCII identifier "首页" that also
// slugifies to empty. This is a brand-new artifact lineage; the small
// first-emission body must not be compared against the unrelated
// "测试" prior just because both share the artifact*.html namespace.
const secondResp = await postFile(projectId, {
name: 'artifact-2.html',
content: '<html><body>tiny but legitimate first emission</body></html>',
artifactManifest: { ...manifestFor('artifact'), metadata: { identifier: '首页' } },
});
expect(secondResp.status).toBe(200);
});
it('catches stub rewrites of legacy sidecar-less HTML priors (mrcfps R6)', async () => {
const projectId = await createProject('legacy');
// Seed a "legacy" file by POSTing without artifactManifest — the
// route writes the body but no .artifact.json. Mirrors any HTML that
// pre-dates the sidecar era or was uploaded outside the artifact-tag
// flow (Write tool, paste-text, manual import).
const legacyResp = await postFile(projectId, {
name: 'dashboard.html',
content: htmlBody(20_000),
// no artifactManifest -> no sidecar on disk
});
expect(legacyResp.status).toBe(200);
// Now an agent emits a stub artifact with the matching identifier.
// Without the legacy fallback, the guard would skip the sidecar-less
// prior and let this through as a "first emission".
const stubResp = await postFile(projectId, {
name: 'dashboard-2.html',
content: '<html><body>see dashboard.html</body></html>',
artifactManifest: manifestFor('dashboard'),
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as {
error: { code: string; details?: { priorName?: string } };
};
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
expect(body.error.details?.priorName).toBe('dashboard.html');
});
it('catches stub rewrites of legacy priors whose identifier ends in -<digits> (mrcfps R7)', async () => {
const projectId = await createProject('legacy-numeric');
// Seed a sidecar-less `phase-2.html` prior (the standalone
// identifier "phase-2", not "phase + collision suffix"). Without
// the dual-candidate fallback, syntheticIdentifierFromFilename
// would strip the -2 and the legacy fallback would search for
// "phase" instead, missing the prior and bypassing the guard.
const legacyResp = await postFile(projectId, {
name: 'phase-2.html',
content: htmlBody(20_000),
// no artifactManifest -> no sidecar on disk
});
expect(legacyResp.status).toBe(200);
const stubResp = await postFile(projectId, {
name: 'phase-2-2.html',
content: '<html><body>see phase-2.html</body></html>',
artifactManifest: manifestFor('phase-2'),
});
expect(stubResp.status).toBe(422);
const body = (await stubResp.json()) as {
error: { code: string; details?: { priorName?: string } };
};
expect(body.error.code).toBe('ARTIFACT_REGRESSION');
expect(body.error.details?.priorName).toBe('phase-2.html');
});
});

View file

@ -1618,10 +1618,34 @@ export function ProjectView({
});
if (file) {
setFilesRefresh((n) => n + 1);
// Surface the daemon's stub-guard warning when it fires in `warn`
// mode (the default). Without this the warning would land in the
// file metadata silently and the user would never see that the
// model shipped a placeholder.
if (file.stubGuardWarning) {
setError(
`Saved "${file.name}", but the model may have shipped a placeholder: ` +
`${file.stubGuardWarning.message}`,
);
}
// Auto-open the freshly-persisted artifact as a tab so the user
// sees it without an extra click. The Write-tool path already does
// this for tool-emitted files; this handles the artifact-tag path.
requestOpenFile(file.name);
} else {
// writeProjectTextFile collapses non-OK responses (including
// 422 ARTIFACT_REGRESSION from reject-mode stub-guard) to null.
// Surfacing the structured error requires changing that helper's
// return contract for all callers — out of scope here. Until then,
// a generic banner makes the failure observable instead of silent.
// Allow the user to retry by clearing the saved-artifact ref so a
// retry attempt re-enters this code path.
savedArtifactRef.current = '';
setError(
`Couldn't save artifact "${fileName}". The daemon refused the write — ` +
'this is most likely OD_ARTIFACT_STUB_GUARD=reject catching a placeholder body. ' +
'Check the daemon logs for the structured ARTIFACT_REGRESSION details.',
);
}
},
[project.id, projectFiles, requestOpenFile],

View file

@ -15,6 +15,20 @@ export type ProjectFileKind =
| 'spreadsheet'
| 'binary';
// Surfaced when the daemon's stub-guard runs in `warn` mode and detects a
// likely regression (the agent emitted a placeholder body that is much
// smaller than a prior artifact sharing the same `metadata.identifier`).
// In `reject` mode the daemon returns `422 ARTIFACT_REGRESSION` instead and
// no `ProjectFile` is produced.
export interface ProjectFileStubGuardWarning {
code: 'ARTIFACT_REGRESSION';
message: string;
identifier: string;
newSize: number;
priorSize: number;
priorName: string;
}
export interface ProjectFile {
name: string;
path?: string;
@ -25,6 +39,7 @@ export interface ProjectFile {
mime: string;
artifactKind?: ArtifactKind;
artifactManifest?: ArtifactManifest;
stubGuardWarning?: ProjectFileStubGuardWarning;
}
export interface ProjectFilesResponse {

View file

@ -16,6 +16,12 @@ export const API_ERROR_CODES = [
'PROJECT_NOT_FOUND',
'FILE_NOT_FOUND',
'ARTIFACT_NOT_FOUND',
// The agent emitted a new artifact whose body is dramatically smaller than
// a prior artifact sharing the same metadata.identifier. Almost always means
// the agent shipped a placeholder ("see other-file.html in this project",
// a bare filename string, an empty fallback page) instead of the full
// document. Configurable via OD_ARTIFACT_STUB_GUARD (reject|warn|off).
'ARTIFACT_REGRESSION',
'UPSTREAM_UNAVAILABLE',
'RATE_LIMITED',
// PR #974 round-4: desktop-paired daemon received an import request