open-design/apps/daemon/tests/chat-route.test.ts
lefarcen 98a2c63973
feat(daemon): add Antigravity agent adapter (#3157)
* feat(daemon): add Antigravity agent adapter

Adds Google Antigravity (`agy` CLI) as a coding-agent runtime. Detection
picks up `agy` on PATH, the daemon spawns `agy -p "<prompt>"` for a
single non-interactive turn, and the assistant text reply streams back
on stdout. OAuth is shared with the Antigravity IDE through the system
keyring, so users who have signed into the desktop app are authenticated
on first run with no extra step.

`agy` v1.0.3 has no JSON / stream-json / ACP output mode (upstream issue
#119), no `--model` flag (issue #35), and no MCP forwarding hook yet —
the adapter ships with `streamFormat: 'plain'` and a single `default`
fallback model so the model picker doesn't mislead users into thinking
their choice is wired through. We will upgrade buildArgs + add a
dedicated event parser when upstream ships structured output.

Also gitignores `.antigravitycli/`, the project-local config directory
`agy` auto-creates on every run (upstream issue #175).

* fix(daemon): Antigravity adapter — stdin prompt, brand icon, form loop, empty-output guard

- Switch prompt delivery from argv to stdin (`agy -p -`) to avoid the
  30KB maxPromptArgBytes limit that blocked real-world composed prompts
- Add official Antigravity brand SVG icon to agent picker
- Fix repeated question-form loop for plain agents by injecting an
  OVERRIDE block when form answers are already present in the transcript
- Add empty-output guard for plain agents so expired auth or silent
  failures surface a user-visible error instead of a blank "Done" turn

* feat(daemon): expand Antigravity adapter — model picker, form-loop fix, OAuth launcher, log-file classification

PR #3157 follow-up integrating four iterations from end-to-end manual
testing on Gemini 3.5 Flash + GPT-OSS 120B Medium through `agy` v1.0.3.
Each section is independently verifiable; combined they're what made
the first successful artifact generation work end-to-end.

## Model picker via settings.json (agy has no --model flag)

agy v1.0.3 ships no `--model` CLI flag (upstream issue #35), but the
TUI Switch-Model picker writes the chosen label to
`~/.gemini/antigravity-cli/settings.json`'s `"model"` field, and every
`-p` invocation re-reads that file on startup — verified by capturing
the `--log-file` line `Propagating selected model override to backend:
label="<model>"`. Antigravity's `fallbackModels` now lists the 8
labels its TUI exposes (Gemini 3.1 Pro / 3.5 Flash variants, Claude
Sonnet/Opus 4.6 Thinking, GPT-OSS 120B Medium) and `buildArgs`
persists the user's choice to settings.json right before spawn. The
synthetic `default` id is preserved — picking it leaves settings.json
untouched so a user who switches models from agy's own TUI keeps
their choice.

Introduces `RuntimeAgentDef.supportsCustomModel?: boolean`. AMR's
hardcoded blocklist in `SettingsDialog.tsx` migrates to the
declarative flag (it rejects free-form ids at the ACP layer), and
antigravity opts out because its label set is a server-side enum that
silently fails on unrecognised strings.

## Form-loop fix (transcript sanitizer + stronger OVERRIDE)

The discovery form loop on weak/medium plain-stream models (GPT-OSS
120B Medium, Gemini 3.5 Flash) had two reinforcing causes:

  1. `buildDaemonTranscript` packed the prior assistant turn's
     literal `<question-form>` markup into the user request on the
     next turn, giving the model a template to echo. New
     `sanitizePriorAssistantTurnForTranscript` strips
     `<question-form>...</question-form>` blocks and ```json fences
     that match form-schema shape, replacing them with a brief
     placeholder. User content is preserved verbatim (a user who
     legitimately mentions `<question-form>` in chat keeps their
     message intact).
  2. The OVERRIDE block on form-answered turns was 4 lines and only
     banned the bare `<question-form>` tag — models still emitted the
     fenced JSON, form-asking prose ("Got it — tell me the following"),
     and fake system events ("subagents stopped"). The new
     `FORM_ANSWERED_SYSTEM_OVERRIDE` enumerates each anti-pattern and
     pins them via tests, so silently weakening any line reintroduces
     the regression.

Also adds RuntimeAgentDef.resumesSessionViaCli + RuntimeContext.
hasPriorAssistantTurn as forward-looking abstractions (skipTranscript
option on composeChatUserRequestForAgent). Antigravity does NOT opt
in — agy's `-c` resume activates an internal agentic loop with tool
retries and fallback-to-cached-response on tool errors that the OD
system prompt cannot steer; reverted after seeing byte-identical
form re-emissions caused by agy's own retry logic, not OD's transcript.

## One-click OAuth via system terminal

agy print mode can't complete Google Sign-In on its own (the OAuth
callback page asks the user to paste an auth code back into agy, but
`-p` has no input field). Before this commit the auth banner only
told the user to "open a terminal yourself."

Adds `POST /api/agents/antigravity/oauth-launch` and a cross-platform
launcher in `runtimes/terminal-launch.ts`:

  - macOS:    osascript → Terminal.app `do script "agy"` + activate
  - Linux:    tries x-terminal-emulator, gnome-terminal, konsole,
              xfce4-terminal, xterm in order
  - Windows:  `cmd /c start "Open Design" cmd /k agy`

The endpoint hardcodes the `agy` command (no user input → no shell
injection surface) and is loopback-gated like the other daemon
endpoints. The chat's `AGENT_AUTH_REQUIRED` banner now renders a
"Sign in via terminal" button next to Retry; clicking it spawns the
terminal so the user can finish OAuth in one click.

## Silent-failure classification (auth vs quota via --log-file)

agy print mode is silent on stdout/stderr for both missing-OAuth AND
quota-exhausted failures — the upstream
`RESOURCE_EXHAUSTED (code 429): Individual quota reached` and the
`not logged into Antigravity` line only surface in agy's
`--log-file`. Without log inspection the daemon misread quota as
"auth required" and showed the wrong banner.

`RuntimeContext.agentLogFilePath` carries a daemon-owned per-run temp
path that antigravity's buildArgs translates to `--log-file <path>`.
The empty-output guard now reads that log on a `code === 0 &&
!childStdoutSeen` exit, feeds the tail to
`classifyAgentServiceFailure`, and routes:

  - "not logged into Antigravity"     → AGENT_AUTH_REQUIRED with
                                        antigravityAuthGuidance
  - "RESOURCE_EXHAUSTED" / "quota" /  → RATE_LIMITED with
    "Individual quota reached"          antigravityQuotaGuidance
  - none of the above (rare)          → fall back to auth guidance
                                        as the most likely cause

Both surface a terminal launcher in the auth banner: auth gets "Sign
in via terminal", quota gets "Switch model in terminal" — same
endpoint, contextual label. The handler is identical (open agy in a
terminal); the user either signs in or uses agy's Switch Model
picker to pick a model with available quota.

## Validation

- `pnpm guard` pass
- `pnpm --filter @open-design/daemon` runtime + telemetry suites:
  192 passed, 1 skipped (the 1 pre-existing `task-type` failure on
  origin/main is unrelated to this change)
- `pnpm --filter @open-design/web` typecheck pass; sse / amr-guidance
  / AgentIcon suites pass (51 web tests)
- Manual end-to-end on darwin + Gemini 3.5 Flash and GPT-OSS 120B
  Medium: turn-1 question-form rendered correctly, turn-2 produced
  `<artifact>` with full HTML (3.3KB Modern Minimal design) instead
  of re-emitting the form. agy `--log-file` content correctly
  classified as RATE_LIMITED when Gemini Pro quota was exhausted,
  and as AGENT_AUTH_REQUIRED when keychain was cleared.

* fix(web/test): align amrAgent fixture with supportsCustomModel contract

The AMR agent definition in the daemon ships `supportsCustomModel: false`
so the Settings model picker hides the free-text "Custom…" option. The
PR changed `allowCustomModel` from `selected.id !== 'amr'` (hardcoded)
to `selected.supportsCustomModel !== false` (declarative), but the test
fixture was not updated to carry the same field — causing the
`__custom__` sentinel to appear in the picker under test.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): align formAnswerTransition wording with main + scope build directive to discovery

CI surfaced two failures on the merge with main:
- chat-route.test marks submitted discovery form answers ... expected
  the main-version wording 'Do not emit another <formId> form.'
- telemetry-message-finalization keeps non-discovery form answers
  active ... expected task-type to fall through the else branch
  ('Treat these form answers as the active user turn'), not the
  discovery RULE 2/RULE 3 build branch.

The colleague's earlier fba1e40b form-loop fix tightened both pieces
(stronger wording + grouped discovery|task-type into the build branch)
but didn't update the tests that pin the contract. Revert the
transition wording to main and re-scope the build directive to
'discovery' only. The aggressive form-loop suppression we added in
this PR now lives in the system-prompt FORM_ANSWERED_SYSTEM_OVERRIDE
block, which is far stronger than the user-request transition text
this commit reverts.

* fix(daemon): scope formOverride by form id, detach Linux terminal, move agy log cleanup to finally

- FORM_ANSWERED_GENERIC_OVERRIDE: new exported constant for non-discovery/
  non-task-type form ids; contains only the "do not re-ask" suppression
  without the RULE 2 / RULE 3 / artifact directive.
- formAnswerTransitionForCurrentPrompt: extend build-transition branch to
  include task-type alongside discovery, keeping user-turn and system
  override consistent.
- Prompt assembly (server.ts ~10848): derive formOverride from the parsed
  form id — FORM_ANSWERED_SYSTEM_OVERRIDE for discovery/task-type,
  FORM_ANSWERED_GENERIC_OVERRIDE for all other form ids, empty otherwise.
- launchOnLinux: replace execFileAsync (waited for terminal exit, 3 s cap)
  with spawn({ detached: true, stdio: 'ignore' }) + unref(); resolve on
  the 'spawn' event so long-lived interactive terminals (xterm, konsole)
  are not killed mid-OAuth-flow.
- Antigravity log cleanup: move fs.promises.unlink(agentLogFilePath) into
  a try/finally wrapper around the close handler so every exit path
  (success, failure, cancel, non-zero exit) cleans up the per-run temp
  file, preventing unbounded /tmp accumulation.
- Tests: rename task-type case to assert build-transition behaviour; add
  generic-form-id case (preferences) pinning the non-build path; add
  FORM_ANSWERED_GENERIC_OVERRIDE content assertions.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): switch Antigravity buildArgs to chat subcommand invocation

Replace top-level `-p -` with `agy chat [--log-file …] -` so the adapter
uses the documented chat subcommand and stdin sentinel instead of the
unrecognised global -p flag.  Update the agent-args test description and
all four deepEqual assertions to assert the ['chat', '-'] shape.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* test(daemon): drop real-platform default-launch assertion from terminal-launch suite

The removed test called launchAgentInSystemTerminal('agy') with no
platform override, which invokes the real system terminal on every
developer machine running the daemon test suite (Terminal.app on macOS,
cmd.exe on Windows, xterm/gnome-terminal on Linux). That is an
unacceptable OS side effect for a unit test.

The behaviour being asserted — that omitting platform selects
process.platform — is a TypeScript default-parameter guarantee, not a
runtime invariant that needs an integration test. The remaining 'aix'
case continues to pin the unsupported-platform failure shape.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): buffer Antigravity stdout to suppress auth URL before close-time classifier

The plain-stream close handler at code===0 can detect an agy OAuth
prompt in agentStdoutTail and emit AGENT_AUTH_REQUIRED, but by the
time close fires the stdout chunk has already been forwarded to the
client via the plain-stream `send('stdout', { chunk })` path. This
leaves both the raw OAuth URL and the terminal-launch guidance visible
in chat.

Buffer all stdout chunks for the `antigravity` agent instead of
forwarding them immediately. The existing close-time auth-prompt guard
(code===0, !trackingSubstantiveOutput, childStdoutSeen) returns early
when it detects the auth pattern, leaving the buffer unflushed and the
OAuth URL out of the SSE stream. For legitimate assistant output the
buffer is flushed in order just before design.runs.finish so the
chunks still arrive before the run's finished event.

Adds a chat-route integration test using a fake `agy` that exits 0
after printing the canonical auth prompt; asserts that the run emits
AGENT_AUTH_REQUIRED with no event: stdout delta containing the URL.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* test(daemon): isolate antigravity buildArgs argv test from real settings file

Pass a temp antigravitySettingsPath in the RuntimeContext for the
withModel argv assertion so unit tests do not touch
~/.gemini/antigravity-cli/settings.json. Adds the optional
antigravitySettingsPath field to RuntimeContext and threads it
through buildArgs to writeAntigravityModelSelection; production
callers leave it undefined, preserving the existing default path.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* fix(daemon): revert Antigravity buildArgs to `-p -` (the only working agy v1.0.3 invocation)

The looper-reviewer-bot reported `chat` as agy's headless subcommand
based on its environment's agy build, and looper-fixer applied that
shape. The installed CLI (`agy --version` reports `1.0.3`) does NOT
expose a `chat` subcommand — `agy --help`'s `Available subcommands`
section lists only `changelog / help / install / plugin / update`,
and `agy chat - < prompt` exits 0 with empty stdout (the daemon then
forwards it as a 'successful' empty reply, exactly the failure mode
the auth/quota guard at server.ts ~12090 is meant to catch — for the
wrong reason).

`-p` is the documented print-mode flag (`Short alias for --print`)
and `agy -p -` reads the prompt from stdin and prints the model
reply, which the entire end-to-end test sequence in this PR has
verified against (form-loop fix, settings.json model routing,
log-file classification all confirmed working on Gemini 3.5 Flash
+ GPT-OSS 120B Medium with this invocation).

Updates the agent-args test to pin `['-p', '-']` instead of
`['chat', '-']` and adds an inline comment in antigravity.ts noting
that `chat` may exist in a future agy build but is not the contract
on the installed CLI today.

* fix(daemon): serialize Antigravity concrete-model spawns to dodge settings.json race

Reviewer (looper) flagged a concurrency race in the model-routing path:
~/.gemini/antigravity-cli/settings.json is process-global, so two OD
runs starting close together with different concrete models can race
the file — run A writes model A, run B writes model B, then A's agy
finally reads settings.json and executes on model B. The Settings
model picker becomes nondeterministic under parallel conversations.

Adds a per-process promise chain in antigravity.ts:
  - acquireAntigravityModelLock(): chain-await + return release fn
  - waitForAgyToReadModel(logPath, expected): polls agy's --log-file
    for the upstream signal
      'Propagating selected model override to backend: label="<X>"'
    which model_config_manager.go emits once agy has finished reading
    settings.json. Returns true on observed match, false on timeout.
    Regex-escapes the expected label so '(' / ')' in 'GPT-OSS 120B
    (Medium)' match literally, not as a capture group.

server.ts spawn pipeline now acquires the lock BEFORE buildArgs (which
performs the settings.json write) and schedules a release-once handler
that fires when EITHER (a) the log-file confirms agy read the model
or (b) the child exits — the exit fallback prevents a stuck/crashed
agy from starving the queue for every subsequent antigravity spawn.

Default-model spawns bypass the lock entirely: their buildArgs doesn't
touch settings.json, so there's nothing to serialize.

Tests pin:
  - FIFO ordering across 2 / 3 concurrent acquirers
  - Wait helper's regex correctly matches parenthesized labels
  - Wait helper does NOT match a different model with shared prefix
  - Wait helper swallows missing-log-file errors and returns false on
    timeout (no spawn-pipeline crash if the log never appears)

194 → 198 passing runtime tests, 0 regressions.

* fix(daemon): close Antigravity lock release race on slow agy startup (looper #263fd2fe7)

Reviewer flagged that the previous serialization scheduled
`releaseOnce` in `.finally()` on waitForAgyToReadModel — meaning the
helper's `false` timeout return ALSO released the lock. If agy took
longer than the 15s polling window to read settings.json (cold start,
swap-thrash, slow network handshake to the upstream backend), run A's
lock dropped at 15s, run B rewrote settings.json with model B, and
run A's still-starting agy then read the wrong model. Same race the
original mutex was meant to close.

Fix the release semantics to be release-on-confirmation-only:

  - waitForAgyToReadModel: `false` now strictly means 'I gave up
    polling,' not 'agy definitely did not read this.' Document the
    contract so a future caller can't conflate the two. Add an
    optional AbortSignal so server.ts can stop polling when the child
    exits — without it, the leftover watcher could outlive the run
    and accidentally match a later concurrent run's log content,
    releasing the wrong lock.
  - server.ts: schedule `releaseOnce` only when waitForAgyToReadModel
    returns true. The exit handler (which fires for crashes, fast
    exits, normal completion) is now the canonical fallback that
    releases the lock no matter what — the queue can't starve
    permanently because agy always exits eventually. The exit
    handler also fires the AbortController so the watcher cleans up.

New tests pin:
  - timeout returns false WITHOUT any release-implying side effect
  - already-aborted signal short-circuits (no readFile calls)
  - abort mid-poll wakes the helper from its setTimeout (no
    multi-hundred-ms hang waiting out a poll interval that no longer
    matters)

198 → 201 passing runtime tests, 0 regressions.

---------

Co-authored-by: qiongyu1999 <2694684348@qq.com>
2026-05-29 05:43:37 +00:00

2106 lines
76 KiB
TypeScript

import type http from 'node:http';
import { randomUUID } from 'node:crypto';
import {
chmodSync,
existsSync,
mkdirSync,
mkdtempSync,
promises as fsp,
readFileSync,
realpathSync,
rmSync,
symlinkSync,
writeFileSync,
} from 'node:fs';
import { tmpdir } from 'node:os';
import { delimiter, join, resolve } from 'node:path';
import { afterAll, afterEach, beforeAll, describe, expect, it } from 'vitest';
import {
composeLiveInstructionPrompt,
resolveGrantedCodexImagegenOverride,
resolveCodexGeneratedImagesDir,
resolveChatExtraAllowedDirs,
resolveResearchCommandContract,
startServer,
validateCodexGeneratedImagesDir,
} from '../src/server.js';
import { skillCwdAliasSegment } from '../src/cwd-aliases.js';
import { getAgentDef } from '../src/agents.js';
import { readMemoryConfig, writeMemoryConfig } from '../src/memory.js';
import { renderCodexImagegenOverride } from '../src/prompts/system.js';
const FAKE_VELA_FIXTURE = resolve(process.cwd(), 'tests', 'fixtures', 'fake-vela.mjs');
function symlinkDir(target: string, link: string): void {
symlinkSync(target, link, process.platform === 'win32' ? 'junction' : 'dir');
}
async function withFakeAgent<T>(
binName: string,
script: string,
run: () => Promise<T>,
): Promise<T> {
const dir = await fsp.mkdtemp(join(tmpdir(), 'od-chat-route-bin-'));
const oldPath = process.env.PATH;
try {
if (process.platform === 'win32') {
const runner = join(dir, `${binName}-test-runner.cjs`);
await fsp.writeFile(runner, script);
await fsp.writeFile(
join(dir, `${binName}.cmd`),
`@echo off\r\nnode "${runner}" %*\r\n`,
);
} else {
const bin = join(dir, binName);
await fsp.writeFile(bin, `#!/usr/bin/env node\n${script}`);
await fsp.chmod(bin, 0o755);
}
process.env.PATH = `${dir}${delimiter}${oldPath ?? ''}`;
return await run();
} finally {
process.env.PATH = oldPath;
await fsp.rm(dir, { recursive: true, force: true });
}
}
describe('/api/chat', () => {
let server: http.Server;
let baseUrl: string;
let originalMemoryConfig: Awaited<ReturnType<typeof readMemoryConfig>> | null = null;
const originalPath = process.env.PATH;
const originalAgentHome = process.env.OD_AGENT_HOME;
const tempDirs: string[] = [];
async function createPluginFixture(args: {
pluginId: string;
dirName: string;
localSkillPath?: string;
}): Promise<string> {
const root = await fsp.mkdtemp(join(tmpdir(), 'od-plugin-fixture-'));
tempDirs.push(root);
const fixtureDir = resolve(root, args.dirName);
const baseFixtureDir = resolve(
process.cwd(),
'tests',
'fixtures',
'plugin-fixtures',
'sample-plugin',
);
await fsp.cp(baseFixtureDir, fixtureDir, { recursive: true });
const manifestPath = resolve(fixtureDir, 'open-design.json');
const manifest = JSON.parse(await fsp.readFile(manifestPath, 'utf8')) as {
name: string;
title: string;
od?: { context?: { skills?: Array<{ ref?: string; path?: string }> } };
};
manifest.name = args.pluginId;
manifest.title = args.pluginId;
if (args.localSkillPath) {
manifest.od ??= {};
manifest.od.context ??= {};
manifest.od.context.skills = [{ path: args.localSkillPath }];
}
await fsp.writeFile(manifestPath, JSON.stringify(manifest, null, 2), 'utf8');
return fixtureDir;
}
beforeAll(async () => {
if (process.env.OD_DATA_DIR) {
originalMemoryConfig = await readMemoryConfig(process.env.OD_DATA_DIR);
await writeMemoryConfig(process.env.OD_DATA_DIR, {
enabled: false,
extraction: null,
});
}
const started = await startServer({ port: 0, returnServer: true }) as {
url: string;
server: http.Server;
};
baseUrl = started.url;
server = started.server;
});
afterEach(() => {
if (originalPath == null) {
delete process.env.PATH;
} else {
process.env.PATH = originalPath;
}
if (originalAgentHome == null) {
delete process.env.OD_AGENT_HOME;
} else {
process.env.OD_AGENT_HOME = originalAgentHome;
}
});
afterAll(async () => {
for (const dir of tempDirs.splice(0)) {
rmSync(dir, { recursive: true, force: true });
}
if (server) {
await new Promise<void>((resolve) => server.close(() => resolve()));
}
if (process.env.OD_DATA_DIR && originalMemoryConfig) {
await writeMemoryConfig(process.env.OD_DATA_DIR, {
enabled: originalMemoryConfig.enabled,
extraction: originalMemoryConfig.extraction,
});
}
});
it('does not reference an out-of-scope response while starting a run', async () => {
process.env.PATH = '';
const emptyAgentHome = mkdtempSync(join(tmpdir(), 'od-empty-agent-home-'));
tempDirs.push(emptyAgentHome);
process.env.OD_AGENT_HOME = emptyAgentHome;
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'claude',
message: 'hello',
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).not.toContain('res is not defined');
expect(body).toContain('AGENT_UNAVAILABLE');
});
it('marks json stream runs failed when an error frame exits with code 0', async () => {
const conversationId = `conv-${randomUUID()}`;
await withFakeAgent(
'opencode',
`
console.log(JSON.stringify({
type: 'error',
error: { message: 'model not found: fake-opencode-model' },
}));
process.exit(0);
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
conversationId,
message: 'hello',
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('AGENT_EXECUTION_FAILED');
expect(body).toContain('model not found: fake-opencode-model');
expect(body).toContain('"status":"failed"');
expect(body).not.toContain('"status":"succeeded"');
const runsResponse = await fetch(
`${baseUrl}/api/runs?conversationId=${encodeURIComponent(conversationId)}`,
);
const runsBody = (await runsResponse.json()) as {
runs: Array<{ conversationId: string | null; status: string; exitCode: number | null }>;
};
expect(runsBody.runs).toHaveLength(1);
expect(runsBody.runs[0]).toMatchObject({
conversationId,
status: 'failed',
exitCode: 0,
});
},
);
});
it('retries transient AMR Link catalog failures before aborting startup', async () => {
const previousRuntimeKey = process.env.VELA_RUNTIME_KEY;
const previousLinkUrl = process.env.VELA_LINK_URL;
const stateFile = join(tmpdir(), `od-amr-model-retry-${randomUUID()}.json`);
try {
process.env.VELA_RUNTIME_KEY = 'fake-runtime-key';
process.env.VELA_LINK_URL = 'https://amr-link.open-design.ai/v1';
await withFakeAgent(
'vela',
`
const { existsSync, readFileSync, writeFileSync } = require('node:fs');
const { spawn } = require('node:child_process');
const fixture = ${JSON.stringify(FAKE_VELA_FIXTURE)};
const stateFile = ${JSON.stringify(stateFile)};
const args = process.argv.slice(2);
if (args[0] === 'models') {
const state = existsSync(stateFile)
? JSON.parse(readFileSync(stateFile, 'utf8'))
: { attempts: 0 };
state.attempts += 1;
writeFileSync(stateFile, JSON.stringify(state), 'utf8');
if (state.attempts < 3) {
process.stderr.write('Get "https://amr-link.open-design.ai/v1/models": context deadline exceeded\\n');
process.exit(1);
}
}
const child = spawn(process.execPath, [fixture, ...args], {
stdio: 'inherit',
env: process.env,
});
child.on('exit', (code, signal) => {
if (signal) process.kill(process.pid, signal);
process.exit(code ?? 0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'amr',
message: 'hello',
model: 'deepseek-v3.2',
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('"type":"text_delta","delta":"Hello from fake "');
expect(body).toContain('"type":"text_delta","delta":"vela."');
expect(body).not.toContain('model_catalog_unavailable');
const attempts = JSON.parse(readFileSync(stateFile, 'utf8')) as { attempts: number };
expect(attempts.attempts).toBe(3);
},
);
} finally {
rmSync(stateFile, { force: true });
if (previousRuntimeKey == null) delete process.env.VELA_RUNTIME_KEY;
else process.env.VELA_RUNTIME_KEY = previousRuntimeKey;
if (previousLinkUrl == null) delete process.env.VELA_LINK_URL;
else process.env.VELA_LINK_URL = previousLinkUrl;
}
});
it('does not report plugin authoring as succeeded when the agent only emits planning text without artifacts', async () => {
const projectId = `proj-plugin-authoring-${randomUUID()}`;
const createProjectResponse = await fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id: projectId,
name: 'Plugin authoring completion fixture',
skillId: null,
designSystemId: null,
}),
});
expect(createProjectResponse.status).toBe(200);
const conversationsResponse = await fetch(`${baseUrl}/api/projects/${projectId}/conversations`);
expect(conversationsResponse.status).toBe(200);
const conversationsBody = await conversationsResponse.json() as {
conversations: Array<{ id: string }>;
};
const conversationId = conversationsBody.conversations[0]?.id;
expect(conversationId).toBeTruthy();
await withFakeAgent(
'opencode',
`
process.stdin.resume();
process.stdin.on('end', () => {
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: '我来帮你创建一个通用的 Open Design 插件脚手架。先读取文档规范,再生成插件文件。' } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
projectId,
conversationId,
pluginId: 'od-plugin-authoring',
message: '请创建一个可刷新、可审计、由 API 驱动的 Open Design 插件脚手架。',
}),
});
expect(createResponse.status).toBe(202);
const {
runId,
pluginId,
appliedPluginSnapshotId,
} = await createResponse.json() as {
runId: string;
pluginId: string | null;
appliedPluginSnapshotId: string | null;
};
expect(pluginId).toBe('od-plugin-authoring');
expect(appliedPluginSnapshotId).toBeTruthy();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`);
const eventsBody = await readSseUntil(eventsResponse, 'event: final');
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('先读取文档规范,再生成插件文件');
expect(statusBody.status).not.toBe('succeeded');
const filesResponse = await fetch(`${baseUrl}/api/projects/${projectId}/files`);
expect(filesResponse.status).toBe(200);
const filesBody = await filesResponse.json() as { files: Array<{ name: string }> };
expect(filesBody.files.some((file) => file.name.startsWith('generated-plugin/'))).toBe(false);
},
);
});
it('closes the # Instructions block with an explicit "do not echo" guard so models do not parrot the prompt back', async () => {
// claude-opus-4-7 (and a few other instruction-tuned models) start
// their reply by echoing the # Instructions block verbatim, which
// shows up to users as the system prompt leading the visible
// answer. server.ts:9934 closes every Instructions block with a
// trailing guard line; this test pins the literal so a future
// refactor cannot silently drop it.
await withFakeAgent(
'opencode',
`
let prompt = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
prompt += chunk;
});
process.stdin.on('end', () => {
const checks = [
prompt.includes('Do not quote, restate, or echo the # Instructions block above')
? 'has-echo-guard'
: 'missing-echo-guard',
];
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: checks.join('\\n') } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'hello',
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('has-echo-guard');
expect(body).not.toContain('missing-echo-guard');
},
);
});
it('injects @-mention skillIds into the composed system prompt', async () => {
await withFakeAgent(
'opencode',
`
let prompt = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
prompt += chunk;
});
process.stdin.on('end', () => {
const checks = [
prompt.includes('## Composed skill — faq-page') ? 'has-composed-skill-header' : 'missing-composed-skill-header',
prompt.includes('# FAQ Page Skill') ? 'has-faq-skill-body' : 'missing-faq-skill-body',
prompt.includes('category filtering') ? 'has-faq-skill-content' : 'missing-faq-skill-content',
];
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: checks.join('\\n') } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'build an faq page',
skillIds: ['faq-page'],
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('has-composed-skill-header');
expect(body).toContain('has-faq-skill-body');
expect(body).toContain('has-faq-skill-content');
expect(body).not.toContain('missing-composed-skill-header');
expect(body).not.toContain('missing-faq-skill-body');
expect(body).not.toContain('missing-faq-skill-content');
},
);
});
it('stages ad-hoc skill side files into the project cwd', async () => {
const projectId = `project-${randomUUID()}`;
const stagedRelativePath = `.od-skills/${skillCwdAliasSegment(resolve(process.cwd(), '..', '..', 'skills', 'release-notes-one-pager'))}/references/checklist.md`;
const expectedChecklist = await fsp.readFile(
resolve(process.cwd(), '..', '..', 'skills', 'release-notes-one-pager', 'references', 'checklist.md'),
'utf8',
);
const createProjectResponse = await fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id: projectId,
name: 'Ad hoc staged skill project',
}),
});
expect(createProjectResponse.ok).toBe(true);
const fakeAgentScript = `
const fs = require('node:fs');
const stagedChecklist = fs.readFileSync(${JSON.stringify(stagedRelativePath)}, 'utf8');
if (stagedChecklist !== ${JSON.stringify(expectedChecklist)}) {
console.error('staged-skill-side-files-mismatch');
process.exit(1);
}
process.stdin.resume();
process.stdin.on('end', () => {
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: 'staged-skill-side-files-before-spawn' } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`;
await withFakeAgent(
'opencode',
fakeAgentScript,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
projectId,
message: 'draft the release notes',
skillIds: ['release-notes-one-pager'],
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('staged-skill-side-files-before-spawn');
},
);
const stagedFileResponse = await fetch(
`${baseUrl}/api/projects/${projectId}/raw/${stagedRelativePath}`,
);
const stagedFileBody = await stagedFileResponse.text();
expect(stagedFileResponse.ok).toBe(true);
expect(stagedFileBody).toBe(expectedChecklist);
});
it('stages side files for every composed skill into the project cwd', async () => {
const projectId = `project-${randomUUID()}`;
const stagedPaths = [
`.od-skills/${skillCwdAliasSegment(resolve(process.cwd(), '..', '..', 'skills', 'release-notes-one-pager'))}/references/checklist.md`,
`.od-skills/${skillCwdAliasSegment(resolve(process.cwd(), '..', '..', 'skills', 'swiss-creative-mode-template'))}/references/checklist.md`,
] as const;
const expectedBodies = await Promise.all(
[
resolve(process.cwd(), '..', '..', 'skills', 'release-notes-one-pager', 'references', 'checklist.md'),
resolve(process.cwd(), '..', '..', 'skills', 'swiss-creative-mode-template', 'references', 'checklist.md'),
].map((file) => fsp.readFile(file, 'utf8')),
);
const createProjectResponse = await fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id: projectId,
name: 'Multi staged skill project',
}),
});
expect(createProjectResponse.ok).toBe(true);
const fakeAgentScript = `
const fs = require('node:fs');
const stagedBodies = [
fs.readFileSync(${JSON.stringify(stagedPaths[0])}, 'utf8'),
fs.readFileSync(${JSON.stringify(stagedPaths[1])}, 'utf8'),
];
const expectedBodies = ${JSON.stringify(expectedBodies)};
if (JSON.stringify(stagedBodies) !== JSON.stringify(expectedBodies)) {
console.error('multi-staged-skill-side-files-mismatch');
process.exit(1);
}
process.stdin.resume();
process.stdin.on('end', () => {
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: 'multi-staged-skill-side-files-before-spawn' } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`;
await withFakeAgent(
'opencode',
fakeAgentScript,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
projectId,
message: 'compose multiple skills',
skillIds: ['release-notes-one-pager', 'swiss-creative-mode-template'],
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('multi-staged-skill-side-files-before-spawn');
},
);
});
it('propagates the composed skill mode for ad-hoc-only deck skills', async () => {
await withFakeAgent(
'opencode',
`
let prompt = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
prompt += chunk;
});
process.stdin.on('end', () => {
const checks = [
prompt.includes('## Composed skill — open-design-landing-deck') ? 'has-deck-skill-header' : 'missing-deck-skill-header',
prompt.includes('# Slide deck — fixed framework (this is non-negotiable for deck mode)') ? 'has-deck-framework' : 'missing-deck-framework',
];
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: checks.join('\\n') } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'build an editorial brand deck',
skillIds: ['open-design-landing-deck'],
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('has-deck-skill-header');
expect(body).toContain('has-deck-framework');
expect(body).not.toContain('missing-deck-skill-header');
expect(body).not.toContain('missing-deck-framework');
},
);
});
it('preserves a persisted media skill as the primary surface over a composed deck mention', async () => {
await withFakeAgent(
'opencode',
`
let prompt = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
prompt += chunk;
});
process.stdin.on('end', () => {
const checks = [
prompt.includes('# imagegen') ? 'has-base-image-skill-body' : 'missing-base-image-skill-body',
prompt.includes('## Composed skill — open-design-landing-deck') ? 'has-composed-deck-skill-header' : 'missing-composed-deck-skill-header',
prompt.includes('## Media generation contract (load-bearing — overrides softer wording above)') ? 'has-image-contract' : 'missing-image-contract',
prompt.includes('# Slide deck — fixed framework (this is non-negotiable for deck mode)') ? 'unexpected-deck-framework' : 'kept-deck-framework-out',
];
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: checks.join('\\n') } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'generate an image while also referencing a deck template',
skillId: 'imagegen',
skillIds: ['open-design-landing-deck'],
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('has-base-image-skill-body');
expect(body).toContain('has-composed-deck-skill-header');
expect(body).toContain('has-image-contract');
expect(body).toContain('kept-deck-framework-out');
expect(body).not.toContain('missing-base-image-skill-body');
expect(body).not.toContain('missing-composed-deck-skill-header');
expect(body).not.toContain('missing-image-contract');
expect(body).not.toContain('unexpected-deck-framework');
},
);
});
it('honors mediaExecution on legacy chat requests', async () => {
const conversationId = `conv-${randomUUID()}`;
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: `missing-agent-${randomUUID()}`,
conversationId,
message: 'plan an image without using OD media',
skillId: 'imagegen',
mediaExecution: {
mode: 'disabled',
allowedSurfaces: ['image'],
},
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('unknown agent');
const runsResponse = await fetch(
`${baseUrl}/api/runs?conversationId=${encodeURIComponent(conversationId)}`,
);
const runsBody = await runsResponse.json() as {
runs: Array<{ mediaExecution?: { mode?: string; allowedSurfaces?: string[] } }>;
};
expect(runsBody.runs).toHaveLength(1);
expect(runsBody.runs[0]?.mediaExecution).toMatchObject({
mode: 'disabled',
allowedSurfaces: ['image'],
});
});
it('rejects invalid mediaExecution on legacy chat requests', async () => {
const conversationId = `conv-${randomUUID()}`;
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
conversationId,
message: 'generate an image',
mediaExecution: { mode: 'provider-router' },
}),
});
const body = await response.text();
expect(response.status).toBe(400);
expect(body).toContain('mediaExecution.mode');
const runsResponse = await fetch(
`${baseUrl}/api/runs?conversationId=${encodeURIComponent(conversationId)}`,
);
const runsBody = await runsResponse.json() as { runs: unknown[] };
expect(runsBody.runs).toEqual([]);
});
it('propagates ad-hoc skill critique policy into the chat resolver', async () => {
if (!process.env.OD_DATA_DIR) {
throw new Error('OD_DATA_DIR is required for user skill critique-policy tests');
}
const skillId = `critique-opt-out-${randomUUID()}`;
const skillDir = resolve(process.env.OD_DATA_DIR, 'skills', skillId);
const originalCritiqueEnabled = process.env.OD_CRITIQUE_ENABLED;
await fsp.mkdir(skillDir, { recursive: true });
await fsp.writeFile(
resolve(skillDir, 'SKILL.md'),
`---
name: ${skillId}
description: Ad-hoc critique opt-out regression fixture.
od:
critique:
policy: opt-out
---
# Critique opt-out fixture
This skill should suppress critique when selected through skillIds.
`,
'utf8',
);
process.env.OD_CRITIQUE_ENABLED = 'true';
try {
await withFakeAgent(
'opencode',
`
let prompt = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
prompt += chunk;
});
process.stdin.on('end', () => {
const checks = [
prompt.includes('## Composed skill — ${skillId}') ? 'has-opt-out-skill-header' : 'missing-opt-out-skill-header',
prompt.includes('<CRITIQUE_RUN') ? 'unexpected-critique-panel' : 'critique-panel-disabled-by-skill-policy',
];
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: checks.join('\\n') } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
designSystemId: 'default',
message: 'draft an opt-out skill artifact',
skillIds: [skillId],
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('has-opt-out-skill-header');
expect(body).toContain('critique-panel-disabled-by-skill-policy');
expect(body).not.toContain('missing-opt-out-skill-header');
expect(body).not.toContain('unexpected-critique-panel');
},
);
} finally {
if (originalCritiqueEnabled == null) {
delete process.env.OD_CRITIQUE_ENABLED;
} else {
process.env.OD_CRITIQUE_ENABLED = originalCritiqueEnabled;
}
await fsp.rm(skillDir, { recursive: true, force: true });
}
});
it('preserves plugin-local and composed @-mention skills in plugin-bound runs', async () => {
const pluginId = `plugin-local-${randomUUID()}`;
const pluginFixtureDir = await createPluginFixture({
pluginId,
dirName: `plugin-local-${randomUUID()}`,
localSkillPath: './SKILL.md',
});
const installResponse = await fetch(`${baseUrl}/api/plugins/install`, {
method: 'POST',
headers: { 'Content-Type': 'application/json', accept: 'text/event-stream' },
body: JSON.stringify({ source: pluginFixtureDir }),
});
const installBody = await installResponse.text();
expect(installResponse.status).toBe(200);
expect(installBody).toContain(`"id":"${pluginId}"`);
const projectId = `project-${randomUUID()}`;
const createProjectResponse = await fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id: projectId,
name: 'Plugin-bound skill composition project',
pluginId,
pluginInputs: { topic: 'agentic design' },
}),
});
const createProjectBody = await createProjectResponse.json() as {
appliedPluginSnapshotId?: string;
};
expect(createProjectResponse.ok).toBe(true);
expect(createProjectBody.appliedPluginSnapshotId).toBeTruthy();
await withFakeAgent(
'opencode',
`
let prompt = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
prompt += chunk;
});
process.stdin.on('end', () => {
const checks = [
prompt.includes('# Sample Plugin') ? 'has-plugin-skill-body' : 'missing-plugin-skill-body',
prompt.includes('## Composed skill — faq-page') ? 'has-composed-skill-header' : 'missing-composed-skill-header',
prompt.includes('# FAQ Page Skill') ? 'has-composed-skill-body' : 'missing-composed-skill-body',
];
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: checks.join('\\n') } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const createRunResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
projectId,
message: 'build a plugin-backed faq page',
appliedPluginSnapshotId: createProjectBody.appliedPluginSnapshotId,
skillIds: ['faq-page'],
}),
});
const createRunBody = await createRunResponse.json() as { runId: string };
expect(createRunResponse.status).toBe(202);
const eventsResponse = await fetch(`${baseUrl}/api/runs/${createRunBody.runId}/events`);
const body = await readSseUntil(eventsResponse, 'event: final');
expect(body).toContain('has-plugin-skill-body');
expect(body).toContain('has-composed-skill-header');
expect(body).toContain('has-composed-skill-body');
expect(body).not.toContain('missing-plugin-skill-body');
expect(body).not.toContain('missing-composed-skill-header');
expect(body).not.toContain('missing-composed-skill-body');
},
);
});
it('stages colliding plugin and composed skill dirs under distinct aliases', async () => {
if (!process.env.OD_DATA_DIR) {
throw new Error('OD_DATA_DIR is required for colliding skill-dir staging tests');
}
const pluginId = `plugin-collision-${randomUUID()}`;
const pluginFixtureDir = await createPluginFixture({
pluginId,
dirName: 'sample-plugin',
localSkillPath: './SKILL.md',
});
const installResponse = await fetch(`${baseUrl}/api/plugins/install`, {
method: 'POST',
headers: { 'Content-Type': 'application/json', accept: 'text/event-stream' },
body: JSON.stringify({ source: pluginFixtureDir }),
});
const installBody = await installResponse.text();
expect(installResponse.status).toBe(200);
expect(installBody).toContain(`"id":"${pluginId}"`);
const projectId = `project-${randomUUID()}`;
const userSkillDir = resolve(process.env.OD_DATA_DIR, 'skills', 'sample-plugin');
const userChecklist = 'user-skill-checklist';
const userAlias = skillCwdAliasSegment(userSkillDir);
await fsp.mkdir(resolve(userSkillDir, 'references'), { recursive: true });
await fsp.writeFile(
resolve(userSkillDir, 'SKILL.md'),
'# Sample-plugin side-file fixture\n\nRead references/checklist.md before drafting.',
'utf8',
);
await fsp.writeFile(resolve(userSkillDir, 'references', 'checklist.md'), userChecklist, 'utf8');
try {
const createProjectResponse = await fetch(`${baseUrl}/api/projects`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id: projectId,
name: 'Colliding skill-dir project',
pluginId,
pluginInputs: { topic: 'agentic design' },
}),
});
const createProjectBody = await createProjectResponse.json() as {
appliedPluginSnapshotId?: string;
};
const installedPluginResponse = await fetch(`${baseUrl}/api/plugins/${pluginId}`);
const installedPluginBody = await installedPluginResponse.json() as { fsPath: string };
const pluginAlias = skillCwdAliasSegment(installedPluginBody.fsPath);
expect(createProjectResponse.ok).toBe(true);
expect(installedPluginResponse.ok).toBe(true);
expect(createProjectBody.appliedPluginSnapshotId).toBeTruthy();
expect(pluginAlias).not.toBe(userAlias);
await withFakeAgent(
'opencode',
`
const fs = require('node:fs');
const pluginSkill = fs.readFileSync(${JSON.stringify(`.od-skills/${pluginAlias}/SKILL.md`)}, 'utf8');
const userChecklist = fs.readFileSync(${JSON.stringify(`.od-skills/${userAlias}/references/checklist.md`)}, 'utf8');
if (!pluginSkill.includes('# Sample Plugin')) {
console.error('plugin-skill-stage-missing');
process.exit(1);
}
if (userChecklist !== ${JSON.stringify(userChecklist)}) {
console.error('colliding-skill-stage-mismatch');
process.exit(1);
}
process.stdin.resume();
process.stdin.on('end', () => {
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: 'colliding-skill-dirs-staged' } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const createRunResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
projectId,
message: 'use both plugin and user skill side files',
appliedPluginSnapshotId: createProjectBody.appliedPluginSnapshotId,
skillIds: ['sample-plugin'],
}),
});
const createRunBody = await createRunResponse.json() as { runId: string };
expect(createRunResponse.status).toBe(202);
const eventsResponse = await fetch(`${baseUrl}/api/runs/${createRunBody.runId}/events`);
const body = await readSseUntil(eventsResponse, 'event: final');
expect(body).toContain('colliding-skill-dirs-staged');
},
);
} finally {
await fsp.rm(userSkillDir, { recursive: true, force: true });
}
});
it('canonicalizes aliased skill ids before deduping composed skills', async () => {
await withFakeAgent(
'opencode',
`
let prompt = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
prompt += chunk;
});
process.stdin.on('end', () => {
const hasDuplicateComposedAlias = prompt.includes('## Composed skill — open-design-landing');
const checks = [
hasDuplicateComposedAlias ? 'duplicate-alias-composed-skill' : 'deduped-alias-composed-skill',
prompt.includes('# open-design-landing') ? 'has-base-alias-skill-body' : 'missing-base-alias-skill-body',
];
console.log(JSON.stringify({ type: 'step_start' }));
console.log(JSON.stringify({ type: 'text', part: { text: checks.join('\\n') } }));
console.log(JSON.stringify({ type: 'step_finish', part: { tokens: { input: 1, output: 1 } } }));
process.exit(0);
});
`,
async () => {
const response = await fetch(`${baseUrl}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'build the Open Design landing page',
skillId: 'editorial-collage',
skillIds: ['open-design-landing'],
}),
});
const body = await response.text();
expect(response.ok).toBe(true);
expect(body).toContain('deduped-alias-composed-skill');
expect(body).toContain('has-base-alias-skill-body');
expect(body).not.toContain('duplicate-alias-composed-skill');
expect(body).not.toContain('missing-base-alias-skill-body');
},
);
});
it('classifies Cursor Agent authentication stderr as a typed run error', async () => {
await withFakeAgent(
'cursor-agent',
`
const args = process.argv.slice(2);
if (args[0] === '--version') {
console.log('2026.05.07-test');
process.exit(0);
}
if (args[0] === 'models') {
console.log('auto');
process.exit(0);
}
console.error("Authentication required. Please run 'agent login' first, or set CURSOR_API_KEY environment variable.");
process.exit(1);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'cursor-agent',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'AGENT_AUTH_REQUIRED');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('AGENT_AUTH_REQUIRED');
expect(eventsBody).toContain('cursor-agent login');
expect(eventsBody).toContain('cursor-agent status');
expect(statusBody.status).toBe('failed');
},
);
});
it('classifies Cursor Agent Not logged in stderr as a typed run error', async () => {
await withFakeAgent(
'cursor-agent',
`
const args = process.argv.slice(2);
if (args[0] === '--version') {
console.log('2026.05.07-test');
process.exit(0);
}
if (args[0] === 'models') {
console.log('auto');
process.exit(0);
}
console.error('Not logged in');
process.exit(1);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'cursor-agent',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'AGENT_AUTH_REQUIRED');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('AGENT_AUTH_REQUIRED');
expect(eventsBody).toContain('cursor-agent login');
expect(eventsBody).toContain('cursor-agent status');
expect(statusBody.status).toBe('failed');
},
);
});
it('classifies Cursor Agent stdout auth text as a typed run error', async () => {
await withFakeAgent(
'cursor-agent',
`
const args = process.argv.slice(2);
if (args[0] === '--version') {
console.log('2026.05.07-test');
process.exit(0);
}
if (args[0] === 'models') {
console.log('auto');
process.exit(0);
}
console.log('ConnectError: [unauthenticated]');
process.exit(1);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'cursor-agent',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'AGENT_AUTH_REQUIRED');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('AGENT_AUTH_REQUIRED');
expect(eventsBody).toContain('cursor-agent login');
expect(eventsBody).toContain('cursor-agent status');
expect(eventsBody).not.toContain('AGENT_EXECUTION_FAILED');
expect(statusBody.status).toBe('failed');
},
);
});
it('classifies Cursor Agent stdout error payloads as typed auth failures', async () => {
const cursorErrorLine = JSON.stringify({
type: 'error',
message: 'Error: [unauthenticated] Error',
});
await withFakeAgent(
'cursor-agent',
`
const args = process.argv.slice(2);
if (args[0] === '--version') {
console.log('2026.05.07-test');
process.exit(0);
}
if (args[0] === 'models') {
console.log('auto');
process.exit(0);
}
console.log(${JSON.stringify(cursorErrorLine)});
process.exit(1);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'cursor-agent',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'AGENT_AUTH_REQUIRED');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('AGENT_AUTH_REQUIRED');
expect(eventsBody).toContain('cursor-agent login');
expect(eventsBody).toContain('cursor-agent status');
expect(eventsBody).not.toContain('AGENT_EXECUTION_FAILED');
expect(statusBody.status).toBe('failed');
},
);
});
it('classifies DeepSeek TUI config guidance as typed auth failures', async () => {
await withFakeAgent(
'deepseek',
`
const args = process.argv.slice(2);
if (args[0] === '--version') {
console.log('deepseek 0.3.0-test');
process.exit(0);
}
console.error('KEY=<your-key> deepseek --api-key <your-key>');
console.error('api_key = "<your-key>" in ~/.deepseek/config.toml');
process.exit(1);
`,
async () => {
const deepseek = getAgentDef('deepseek');
expect(deepseek).toBeDefined();
const originalBudget = deepseek?.maxPromptArgBytes;
if (deepseek) deepseek.maxPromptArgBytes = 200_000;
try {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'deepseek',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'AGENT_AUTH_REQUIRED');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('AGENT_AUTH_REQUIRED');
expect(eventsBody).toContain('~/.deepseek/config.toml');
expect(eventsBody).toContain('DEEPSEEK_API_KEY');
expect(eventsBody).not.toContain('cursor-agent login');
expect(eventsBody).not.toContain('AGENT_EXECUTION_FAILED');
expect(statusBody.status).toBe('failed');
} finally {
if (deepseek) {
if (originalBudget === undefined) {
delete deepseek.maxPromptArgBytes;
} else {
deepseek.maxPromptArgBytes = originalBudget;
}
}
}
},
);
});
it('suppresses Antigravity auth stdout and emits AGENT_AUTH_REQUIRED without an event: stdout delta', async () => {
await withFakeAgent(
'agy',
`
const args = process.argv.slice(2);
if (args[0] === '--version') {
console.log('1.107.0-test');
process.exit(0);
}
// Simulate agy chat - printing the OAuth prompt and exiting 0
process.stdout.write('Authentication required. Please visit the URL to log in: https://accounts.google.com/o/oauth2/auth?client_id=12345&redirect_uri=antigravity-redirect\\n');
process.stdout.write('Waiting for authentication (timeout 30s)...\\n');
process.stdout.write('Error: authentication timed out.\\n');
process.exit(0);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'antigravity',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'AGENT_AUTH_REQUIRED');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('AGENT_AUTH_REQUIRED');
expect(eventsBody).not.toContain('event: stdout');
expect(eventsBody).not.toContain('accounts.google.com');
expect(statusBody.status).toBe('failed');
},
);
});
it('surfaces Qoder assistant error records through the SSE error channel', async () => {
const qoderErrorLine = JSON.stringify({
type: 'assistant',
message: { content: [] },
error: { message: 'Qoder authentication expired' },
});
await withFakeAgent(
'qodercli',
`console.log(${JSON.stringify(qoderErrorLine)});\nprocess.exit(0);\n`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'qoder',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'event: error');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('Qoder authentication expired');
expect(eventsBody).not.toContain('event: agent\\ndata: {"type":"error"');
expect(statusBody.status).toBe('failed');
},
);
});
it('fails Qoder runs when the result reports is_error with exit code 0', async () => {
const qoderResultLine = JSON.stringify({
type: 'result',
subtype: 'error',
duration_ms: 17,
is_error: true,
stop_reason: 'tool_use_failed',
total_cost_usd: 0,
usage: {
input_tokens: 3,
output_tokens: 1,
},
});
await withFakeAgent(
'qodercli',
`console.log(${JSON.stringify(qoderResultLine)});\nprocess.exit(0);\n`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'qoder',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'event: error');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: agent');
expect(eventsBody).toContain('"type":"usage"');
expect(eventsBody).toContain('"isError":true');
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('Qoder run failed: tool_use_failed');
expect(statusBody.status).toBe('failed');
},
);
});
it('fails stalled json-stream runs after the inactivity timeout elapses', async () => {
const previous = process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = '500';
try {
await withFakeAgent(
'opencode',
`
console.log(JSON.stringify({ type: 'step_start' }));
process.on('SIGTERM', () => process.exit(143));
setInterval(() => {}, 1000);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'event: error');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('Agent stalled without emitting any new output');
expect(eventsBody).toContain('Phase details: spawned agent opencode;');
expect(eventsBody).not.toContain('spawned agent binary');
expect(eventsBody).toMatch(/stdout arrived: (yes|no)/);
expect(statusBody.status).toBe('failed');
},
);
} finally {
if (previous == null) {
delete process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
} else {
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = previous;
}
}
});
it('keeps Claude stream runs alive while structured output is still flowing', async () => {
const previous = process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = '3000';
try {
await withFakeAgent(
'claude',
`
const lines = [
JSON.stringify({ type: 'stream_event', event: { type: 'message_start', message: { id: 'msg-1' }, ttft_ms: 10 } }),
JSON.stringify({ type: 'stream_event', event: { type: 'content_block_start', index: 0, content_block: { type: 'text' } } }),
JSON.stringify({ type: 'stream_event', event: { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'hello ' } } }),
JSON.stringify({ type: 'stream_event', event: { type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'world' } } }),
JSON.stringify({ type: 'stream_event', event: { type: 'content_block_stop', index: 0 } }),
JSON.stringify({ type: 'result', usage: { input_tokens: 1, output_tokens: 2 }, duration_ms: 700, stop_reason: 'end_turn' }),
];
let index = 0;
console.log(lines[index++]);
const timer = setInterval(() => {
if (index >= lines.length) {
clearInterval(timer);
process.exit(0);
return;
}
console.log(lines[index++]);
}, 750);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'claude',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(statusBody.status).toBe('succeeded');
},
);
} finally {
if (previous == null) {
delete process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
} else {
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = previous;
}
}
});
it('surfaces Claude auth diagnostics through the SSE error channel', async () => {
await withFakeAgent(
'claude',
`
console.error(JSON.stringify({ apiKeySource: 'none', error_status: 401 }));
process.exit(1);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'claude',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'event: error');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('event: error');
expect(eventsBody).toContain('/login');
expect(eventsBody).toContain('CLAUDE_CONFIG_DIR');
expect(statusBody.status).toBe('failed');
},
);
});
it('caps oversized inactivity overrides so Node does not fire the timer immediately', async () => {
const previous = process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = '10000000000';
try {
await withFakeAgent(
'opencode',
`
setTimeout(() => {
console.log(JSON.stringify({ type: 'text', part: { text: 'done' } }));
process.exit(0);
}, 50);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(statusBody.status).toBe('succeeded');
},
);
} finally {
if (previous == null) {
delete process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
} else {
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = previous;
}
}
});
it('marks stalled runs failed even when the child ignores SIGTERM', async () => {
const previous = process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = '500';
try {
await withFakeAgent(
'opencode',
`
console.log(JSON.stringify({ type: 'step_start' }));
process.on('SIGTERM', () => {});
setInterval(() => {}, 1000);
`,
async () => {
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: 'hello',
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const eventsController = new AbortController();
const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
signal: eventsController.signal,
});
const eventsBody = await readSseUntil(eventsResponse, 'event: error');
eventsController.abort();
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(eventsBody).toContain('Agent stalled without emitting any new output');
expect(statusBody.status).toBe('failed');
},
);
} finally {
if (previous == null) {
delete process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
} else {
process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = previous;
}
}
});
it('marks submitted discovery form answers as the active turn before the transcript', async () => {
const captureDir = mkdtempSync(join(tmpdir(), 'od-form-answer-prompt-'));
tempDirs.push(captureDir);
const capturePath = join(captureDir, 'prompt.txt');
const previousCapturePath = process.env.OD_CAPTURE_PROMPT_PATH;
process.env.OD_CAPTURE_PROMPT_PATH = capturePath;
try {
await withFakeAgent(
'opencode',
`
const fs = require('node:fs');
let input = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => { input += chunk; });
process.stdin.on('end', () => {
fs.writeFileSync(process.env.OD_CAPTURE_PROMPT_PATH, input, 'utf8');
console.log(JSON.stringify({ type: 'text', part: { text: 'building now' } }));
});
`,
async () => {
const formAnswers = [
'[form answers — discovery]',
'- output: Dashboard / tool UI',
'- brand: Pick a direction for me [value: pick_direction]',
].join('\n');
const transcript = [
'## user',
'Design a metrics dashboard.',
'',
'## assistant',
'<question-form id="discovery" title="Quick brief — 30 seconds"></question-form>',
'',
'## user',
formAnswers,
].join('\n');
const createResponse = await fetch(`${baseUrl}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
agentId: 'opencode',
message: transcript,
currentPrompt: formAnswers,
}),
});
expect(createResponse.status).toBe(202);
const { runId } = await createResponse.json() as { runId: string };
const statusBody = await waitForRunStatus(baseUrl, runId);
expect(statusBody.status).toBe('succeeded');
expect(existsSync(capturePath)).toBe(true);
const prompt = readFileSync(capturePath, 'utf8');
const transitionIdx = prompt.indexOf('## Latest user turn - form answers submitted');
const transcriptIdx = prompt.indexOf('## Full conversation transcript');
expect(transitionIdx).toBeGreaterThan(-1);
expect(transcriptIdx).toBeGreaterThan(transitionIdx);
expect(prompt).toContain('The user has answered the discovery form. Do not emit another discovery form.');
expect(prompt).toContain('Continue with RULE 2 / RULE 3 now.');
expect(prompt).toContain(formAnswers);
},
);
} finally {
if (previousCapturePath == null) {
delete process.env.OD_CAPTURE_PROMPT_PATH;
} else {
process.env.OD_CAPTURE_PROMPT_PATH = previousCapturePath;
}
}
});
});
describe('daemon run creation during shutdown', () => {
it('rejects new run creation while shutdown cleanup is still in flight', async () => {
const previousGrace = process.env.OD_CHAT_RUN_SHUTDOWN_GRACE_MS;
process.env.OD_CHAT_RUN_SHUTDOWN_GRACE_MS = '100';
const started = await startServer({ port: 0, returnServer: true }) as {
url: string;
server: http.Server;
shutdown: () => Promise<void>;
};
try {
await withFakeAgent(
'opencode',
`
process.on('SIGTERM', () => {});
setInterval(() => {}, 1000);
`,
async () => {
const activeResponse = await fetch(`${started.url}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ agentId: 'opencode', message: 'hello' }),
});
expect(activeResponse.status).toBe(202);
const { runId } = await activeResponse.json() as { runId: string };
await waitForRunStatus(started.url, runId, (status) => status === 'running');
const shutdownPromise = started.shutdown();
const runResponse = await fetch(`${started.url}/api/runs`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ agentId: 'opencode', message: 'late run' }),
});
const chatResponse = await fetch(`${started.url}/api/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ agentId: 'opencode', message: 'late chat' }),
});
expect(runResponse.status).toBe(503);
expect(chatResponse.status).toBe(503);
await shutdownPromise;
},
);
} finally {
if (previousGrace == null) {
delete process.env.OD_CHAT_RUN_SHUTDOWN_GRACE_MS;
} else {
process.env.OD_CHAT_RUN_SHUTDOWN_GRACE_MS = previousGrace;
}
await new Promise<void>((resolve) => started.server.close(() => resolve()));
}
});
});
async function readSseUntil(response: Response, marker: string): Promise<string> {
const reader = response.body!.getReader();
const decoder = new TextDecoder();
let body = '';
for (let attempt = 0; attempt < 20; attempt += 1) {
const { done, value } = await reader.read();
if (done) return body;
body += decoder.decode(value, { stream: true });
if (body.includes(marker)) return body;
}
return body;
}
async function waitForRunStatus(
baseUrl: string,
runId: string,
done: (status: string) => boolean = (status) => status !== 'queued' && status !== 'running',
): Promise<{ status: string }> {
let lastStatus = 'unknown';
for (let attempt = 0; attempt < 500; attempt += 1) {
const statusResponse = await fetch(`${baseUrl}/api/runs/${runId}`);
const statusBody = await statusResponse.json() as { status: string };
lastStatus = statusBody.status;
if (done(statusBody.status)) return statusBody;
await new Promise((resolve) => setTimeout(resolve, 25));
}
throw new Error(`run did not reach expected status; last status: ${lastStatus}`);
}
describe('chat prompt helpers', () => {
it('appends the validated Codex override after the client system prompt and removes earlier duplicates', () => {
const override = renderCodexImagegenOverride('codex', {
kind: 'image',
imageModel: 'gpt-image-2',
imageAspect: '1:1',
});
const clientMediaContract =
'## Media generation contract\nclient contract wins unless a later override says otherwise';
const prompt = composeLiveInstructionPrompt({
daemonSystemPrompt: `daemon prompt\n${override}`,
runtimeToolPrompt: 'runtime tools',
clientSystemPrompt: clientMediaContract,
finalPromptOverride: override,
});
const clientIdx = prompt.indexOf(clientMediaContract);
const overrideIdx = prompt.indexOf('## Codex built-in imagegen override');
expect(clientIdx).toBeGreaterThan(-1);
expect(overrideIdx).toBeGreaterThan(clientIdx);
expect(prompt.match(/## Codex built-in imagegen override/g)).toHaveLength(1);
});
it('omits the Codex final imagegen override when run media policy blocks execution', () => {
const metadata = {
kind: 'image',
imageModel: 'gpt-image-2',
imageAspect: '1:1',
};
const mediaExecution = {
mode: 'disabled',
allowedSurfaces: ['image'],
};
const generatedImagesDir = resolveCodexGeneratedImagesDir(
'codex',
metadata,
{ CODEX_HOME: '/tmp/custom-codex-home' },
'/home/tester',
mediaExecution,
);
const otherwiseGrantedDir = resolve('/tmp/custom-codex-home/generated_images');
const override = resolveGrantedCodexImagegenOverride({
agentId: 'codex',
metadata,
codexGeneratedImagesDir: otherwiseGrantedDir,
extraAllowedDirs: [otherwiseGrantedDir],
mediaExecution,
});
const prompt = composeLiveInstructionPrompt({
daemonSystemPrompt: 'daemon media policy prompt',
runtimeToolPrompt: 'runtime tools',
clientSystemPrompt: 'client instructions',
finalPromptOverride: override,
});
expect(generatedImagesDir).toBeNull();
expect(override).toBeNull();
expect(prompt).not.toContain('## Codex built-in imagegen override');
});
it('defaults enabled research without an explicit query to the current message', () => {
const prompt = resolveResearchCommandContract(
{ enabled: true },
'EV market 2025 trends',
);
expect(prompt).toContain('Canonical query for this run:');
expect(prompt).toContain('EV market 2025 trends');
expect(prompt).toContain('the first tool action must be the research command');
});
it('resolves only the narrow Codex generated_images allowlist for known gpt-image image projects', () => {
expect(
resolveCodexGeneratedImagesDir(
'codex',
{ kind: 'image', imageModel: 'gpt-image-2' },
{ CODEX_HOME: '/tmp/custom-codex-home' },
'/home/tester',
),
).toBe(resolve('/tmp/custom-codex-home/generated_images'));
expect(
resolveCodexGeneratedImagesDir(
'codex',
{ kind: 'image', imageModel: 'gpt-image-2-preview' },
{ CODEX_HOME: '/tmp/custom-codex-home' },
'/home/tester',
),
).toBeNull();
expect(
resolveCodexGeneratedImagesDir(
'claude',
{ kind: 'image', imageModel: 'gpt-image-2' },
{ CODEX_HOME: '/tmp/custom-codex-home' },
'/home/tester',
),
).toBeNull();
});
it('rejects a generated_images final-component symlink', () => {
const root = mkdtempSync(join(tmpdir(), 'od-codex-generated-symlink-'));
try {
const codexHome = join(root, 'codex-home');
const symlinkTarget = join(root, 'actual-generated-images');
mkdirSync(codexHome, { recursive: true });
mkdirSync(symlinkTarget, { recursive: true });
symlinkDir(symlinkTarget, join(codexHome, 'generated_images'));
const generatedImagesDir = resolveCodexGeneratedImagesDir(
'codex',
{ kind: 'image', imageModel: 'gpt-image-2' },
{ CODEX_HOME: codexHome },
'/home/tester',
);
expect(
validateCodexGeneratedImagesDir(generatedImagesDir, {
warn: () => undefined,
}),
).toBeNull();
} finally {
rmSync(root, { recursive: true, force: true });
}
});
it('rejects a generated_images dir whose canonical path is inside a protected root', () => {
const root = mkdtempSync(join(tmpdir(), 'od-codex-generated-protected-'));
try {
const protectedRoot = join(root, 'skills');
const protectedGeneratedImages = join(protectedRoot, 'generated_images');
mkdirSync(protectedGeneratedImages, { recursive: true });
const codexHome = join(root, 'codex-home');
symlinkDir(protectedRoot, codexHome);
const generatedImagesDir = resolveCodexGeneratedImagesDir(
'codex',
{ kind: 'image', imageModel: 'gpt-image-2' },
{ CODEX_HOME: codexHome },
'/home/tester',
);
expect(
validateCodexGeneratedImagesDir(generatedImagesDir, {
protectedDirs: [protectedRoot],
warn: () => undefined,
}),
).toBeNull();
} finally {
rmSync(root, { recursive: true, force: true });
}
});
it('grants Codex the canonical validated generated_images dir', () => {
const root = mkdtempSync(join(tmpdir(), 'od-codex-generated-canonical-'));
try {
const actualCodexHome = join(root, 'actual-codex-home');
const symlinkCodexHome = join(root, 'codex-home-link');
mkdirSync(actualCodexHome, { recursive: true });
symlinkDir(actualCodexHome, symlinkCodexHome);
const generatedImagesDir = resolveCodexGeneratedImagesDir(
'codex',
{ kind: 'image', imageModel: 'gpt-image-2' },
{ CODEX_HOME: symlinkCodexHome },
'/home/tester',
);
const validatedDir = validateCodexGeneratedImagesDir(
generatedImagesDir,
{ warn: () => undefined },
);
const canonicalGeneratedImagesDir = join(
realpathSync.native(actualCodexHome),
'generated_images',
);
const extraAllowedDirs = resolveChatExtraAllowedDirs({
agentId: 'codex',
skillsDir: '/repo/skills',
designSystemsDir: '/repo/design-systems',
linkedDirs: ['/linked/reference'],
codexGeneratedImagesDir: validatedDir,
existsSync: () => true,
});
const codex = getAgentDef('codex');
if (!codex) throw new Error('Codex agent definition missing');
const args = codex.buildArgs('', [], extraAllowedDirs, {}, {
cwd: '/tmp/od-project',
});
expect(generatedImagesDir).not.toBe(canonicalGeneratedImagesDir);
expect(validatedDir).toBe(canonicalGeneratedImagesDir);
expect(extraAllowedDirs).toEqual([canonicalGeneratedImagesDir]);
expect(
args.filter(
(arg, index) =>
arg === '--add-dir' || args[index - 1] === '--add-dir',
),
).toEqual(['--add-dir', canonicalGeneratedImagesDir]);
} finally {
rmSync(root, { recursive: true, force: true });
}
});
it('limits Codex extra allowed dirs to the generated_images output dir', () => {
const generatedImagesDir = '/home/tester/.codex/generated_images';
const dirs = resolveChatExtraAllowedDirs({
agentId: ' CoDeX ',
skillsDir: '/repo/skills',
designSystemsDir: '/repo/design-systems',
linkedDirs: ['/linked/reference'],
codexGeneratedImagesDir: generatedImagesDir,
existsSync: () => true,
});
expect(dirs).toEqual([generatedImagesDir]);
const codex = getAgentDef('codex');
if (!codex) throw new Error('Codex agent definition missing');
const args = codex.buildArgs('', [], dirs, {}, { cwd: '/tmp/od-project' });
expect(
args.filter(
(arg, index) =>
arg === '--add-dir' || args[index - 1] === '--add-dir',
),
).toEqual(['--add-dir', generatedImagesDir]);
expect(args).not.toContain('/repo/skills');
expect(args).not.toContain('/repo/design-systems');
expect(args).not.toContain('/linked/reference');
});
it('keeps resource and linked dirs for non-Codex agents without the Codex output dir', () => {
const existingDirs = new Set([
'/repo/skills',
'/repo/design-systems',
'/linked/reference',
'/home/tester/.codex/generated_images',
]);
const dirs = resolveChatExtraAllowedDirs({
agentId: 'claude',
skillsDir: '/repo/skills',
designSystemsDir: '/repo/design-systems',
linkedDirs: ['/linked/reference'],
codexGeneratedImagesDir: '/home/tester/.codex/generated_images',
existsSync: (dir: string) => existingDirs.has(dir),
});
expect(dirs).toEqual([
'/repo/skills',
'/repo/design-systems',
'/linked/reference',
]);
});
it('does not add resource dirs for Codex when imagegen is not whitelisted', () => {
const dirs = resolveChatExtraAllowedDirs({
agentId: 'codex',
skillsDir: '/repo/skills',
designSystemsDir: '/repo/design-systems',
linkedDirs: ['/linked/reference'],
codexGeneratedImagesDir: null,
existsSync: () => true,
});
expect(dirs).toEqual([]);
});
it('omits the Codex override when validation fails or the dir is not granted', () => {
const metadata = { kind: 'image', imageModel: 'gpt-image-2' };
const root = mkdtempSync(join(tmpdir(), 'od-codex-generated-prompt-'));
try {
const codexHome = join(root, 'codex-home');
const symlinkTarget = join(root, 'actual-generated-images');
mkdirSync(codexHome, { recursive: true });
mkdirSync(symlinkTarget, { recursive: true });
symlinkDir(symlinkTarget, join(codexHome, 'generated_images'));
const generatedImagesDir = resolveCodexGeneratedImagesDir(
'codex',
metadata,
{ CODEX_HOME: codexHome },
'/home/tester',
);
const validatedDir = validateCodexGeneratedImagesDir(
generatedImagesDir,
{ warn: () => undefined },
);
const extraAllowedDirs = resolveChatExtraAllowedDirs({
agentId: 'codex',
skillsDir: '/repo/skills',
designSystemsDir: '/repo/design-systems',
linkedDirs: ['/linked/reference'],
codexGeneratedImagesDir: validatedDir,
existsSync: () => true,
});
const validationFailedOverride = resolveGrantedCodexImagegenOverride({
agentId: 'codex',
metadata,
codexGeneratedImagesDir: validatedDir,
extraAllowedDirs,
});
const validationFailedPrompt = composeLiveInstructionPrompt({
daemonSystemPrompt: 'daemon prompt',
runtimeToolPrompt: 'runtime tools',
clientSystemPrompt: 'client media contract',
finalPromptOverride: validationFailedOverride,
});
expect(validatedDir).toBeNull();
expect(extraAllowedDirs).toEqual([]);
expect(validationFailedOverride).toBeNull();
expect(validationFailedPrompt).not.toContain(
'## Codex built-in imagegen override',
);
const validDir = join(root, 'safe-codex-home', 'generated_images');
mkdirSync(validDir, { recursive: true });
const notGrantedOverride = resolveGrantedCodexImagegenOverride({
agentId: 'codex',
metadata,
codexGeneratedImagesDir: validDir,
extraAllowedDirs: [],
});
expect(notGrantedOverride).toBeNull();
} finally {
rmSync(root, { recursive: true, force: true });
}
});
});