open-design/apps/daemon/tests/app-config.test.ts
lefarcen afb331a288
feat: add opt-in Langfuse telemetry (#800)
* docs(specs): add langfuse telemetry change spec

Captures the design for forwarding completed agent runs to Langfuse,
including data-model mapping, field-budget caps, privacy gates,
build-secret injection, GDPR right-to-deletion approach, and the
resolved decisions on default consent, identifier shape, region, and
ownership.

* feat(daemon): add langfuse-trace module and telemetry prefs

Adds the dependency-free building blocks for forwarding completed
agent runs to Langfuse. Two layers:

- AppConfigPrefs gains installationId and a TelemetryPrefs object with
  metrics / content / artifactManifest gates. The daemon validator
  treats telemetry like agentModels — replace-on-write, drop-when-empty,
  reject non-boolean inner values.

- New langfuse-trace.ts builds a {trace-create, generation-create}
  pair from a ReportContext, capping prompt at 8 KB, output at 16 KB,
  artifacts at 50 entries, and dropping any batch larger than 1 MB
  before send. reportRunCompleted is a no-op when LANGFUSE_PUBLIC_KEY /
  LANGFUSE_SECRET_KEY are unset (so dev runs and forks never emit) and
  short-circuits on prefs.metrics === false.
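
A minimal sketch of the validation rule described in the first bullet (replace-on-write, drop-when-empty, reject non-boolean inner values). The function name `sanitizeTelemetry` and the key list constant are hypothetical and chosen for illustration; the real validator lives in the daemon's app-config module and may be structured differently.

```typescript
// Hypothetical sketch, not the daemon's actual validator.
type TelemetryPrefs = {
  metrics?: boolean;
  content?: boolean;
  artifactManifest?: boolean;
};

const TELEMETRY_KEYS = ['metrics', 'content', 'artifactManifest'] as const;

// Returns the sanitized prefs, or undefined when nothing valid remains,
// in which case the caller would drop the `telemetry` key entirely.
function sanitizeTelemetry(input: unknown): TelemetryPrefs | undefined {
  // Only plain objects are accepted; arrays and primitives are rejected.
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    return undefined;
  }
  const source = input as Record<string, unknown>;
  const out: TelemetryPrefs = {};
  for (const key of TELEMETRY_KEYS) {
    const value = source[key];
    // Non-boolean values and unknown keys are silently skipped.
    if (typeof value === 'boolean') out[key] = value;
  }
  return Object.keys(out).length > 0 ? out : undefined;
}
```

Because the result replaces the stored value rather than merging into it, a later write of `{ content: true }` would leave no `metrics` key behind.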

Server-side wiring into the run-close path lands in a follow-up.

* fix(langfuse): default to US Langfuse region

End-to-end smoke against the project's actual dev key on 2026-05-07
returned 401 from cloud.langfuse.com (EU) and 207 from
us.cloud.langfuse.com (US), confirming the org lives in US. Update the
default base URL, the matching test, and the spec's Q3 decision row to
match. Self-hosted or EU-region operators can still override via the
LANGFUSE_BASE_URL env var.

* feat(daemon): wire langfuse trace forwarding into run-close

Adds the daemon-side glue to forward completed agent runs:

- runs.ts gains an optional onTerminate hook fired once per run after it
  reaches a terminal state. Errors thrown from the hook are caught and
  logged, never propagated, so telemetry can never break the run path.

- New langfuse-bridge.ts assembles a ReportContext from the in-memory
  run record, the conversation's persisted assistant message, and the
  user's app-config preferences. It tolerates a missing message (e.g.
  when web has not yet PUT the final delta) and a missing app-config.

- server.ts stashes the original user prompt on the run object inside
  startChatRun so the bridge can include it without crossing the
  createChatRunService boundary, and registers the hook callback when
  building the run service.

Behavior remains a no-op unless LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY
are set in the daemon env AND telemetry.metrics is true in app-config.
A live smoke against us.cloud.langfuse.com on 2026-05-07 confirmed the
matching trace + generation schema is accepted (HTTP 207, both events
201 created).

* fix(langfuse): address PR #800 review feedback

P1 — Move trace forwarding off the daemon-internal run-close hook and
onto the message-persistence path. The original onTerminate hook ran
inside finish() the moment the SSE 'end' event was emitted, which is
*before* the web client's onDone handler refreshes project files and
PUTs producedFiles + final assistant content back to SQLite. Reading
SQLite at that moment routinely missed both. The fix: drop the runs.ts
hook entirely and trigger from PUT /api/projects/:id/conversations/:cid/
messages/:mid when the saved row carries a terminal runStatus. A
reportedRuns Set guards against the multiple PUT calls web makes per
turn (each retry / state update). Set entries auto-evict after the same
30 min TTL the runs map uses. Web persists a terminal-status message in
all three completion paths — onDone (succeeded), onError (failed), and
cancel (canceled) — so this catches every run shape.
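
The dedup guard described in P1 could look roughly like the sketch below. It is an illustration under assumptions: `createReportGuard` and its parameters are hypothetical names, and it swaps the timer-evicted Set for a Map of timestamps with lazy expiry so the example stays deterministic and easy to test.

```typescript
// 30 minutes, matching the TTL mentioned in the description above.
const REPORT_TTL_MS = 30 * 60 * 1000;

// Hypothetical helper: the returned function answers true only the first
// time a run id is seen within the TTL window.
function createReportGuard(
  ttlMs: number = REPORT_TTL_MS,
  now: () => number = Date.now,
): (runId: string) => boolean {
  const reportedAt = new Map<string, number>();
  return function shouldReport(runId: string): boolean {
    const current = now();
    // Lazily evict entries older than the TTL (instead of using timers).
    reportedAt.forEach((at, id) => {
      if (current - at >= ttlMs) reportedAt.delete(id);
    });
    if (reportedAt.has(runId)) return false;
    reportedAt.set(runId, current);
    return true;
  };
}
```

A PUT handler would call the guard with the run id and forward the trace only when it returns true, so retries and state updates for the same turn are ignored.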

P2 — postLangfuseBatch now parses the 207 Multi-Status response body.
Langfuse legacy ingestion always returns 207, and response.ok is true
for 207, so per-event validation errors used to slip through silently.
We now warn when body.errors is non-empty. Two new unit tests.
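
As an illustration of the 207 handling, the sketch below extracts per-event errors from an already-parsed response body. The helper name and the `IngestionError` field names are assumptions; only the presence of a `body.errors` array is taken from the description above.

```typescript
// Assumed shape of one per-event error entry; field names are illustrative.
type IngestionError = { id?: string; status?: number; message?: string };

// Returns the per-event errors from a parsed 207 body, or [] when the body
// is missing, malformed, or reports no errors.
function collectIngestionErrors(body: unknown): IngestionError[] {
  if (typeof body !== 'object' || body === null) return [];
  const errors = (body as { errors?: unknown }).errors;
  if (!Array.isArray(errors)) return [];
  return errors.filter(
    (entry): entry is IngestionError => typeof entry === 'object' && entry !== null,
  );
}

// A caller would warn instead of trusting response.ok alone.
function describeBatchResult(body: unknown): string {
  const failed = collectIngestionErrors(body);
  return failed.length === 0
    ? 'all events accepted'
    : `${failed.length} event(s) rejected`;
}
```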

P2 — truncate() and the HARD_BATCH cap now compare UTF-8 byte length,
not String.length (which counts UTF-16 code units). A 4096-character
CJK prompt occupies 12 KB, well over the 8 KB input cap. truncate also
walks backwards to a UTF-8 leading byte so the cut never lands inside a
multi-byte codepoint. New unit test covers '设'.repeat(4096).
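
A sketch of byte-aware truncation as described above, assuming the standard `TextEncoder` / `TextDecoder` globals. The names `utf8ByteLength` and `truncateUtf8` are hypothetical; the daemon's own `truncate()` may differ in detail (for example, it may append a marker).

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// UTF-8 byte length, as opposed to String.length (UTF-16 code units).
function utf8ByteLength(text: string): number {
  return encoder.encode(text).length;
}

// UTF-8 continuation bytes have the bit pattern 10xxxxxx.
function isContinuationByte(byte: number | undefined): boolean {
  return byte !== undefined && (byte & 0xc0) === 0x80;
}

// Cuts `text` to at most `maxBytes` UTF-8 bytes without splitting a codepoint.
function truncateUtf8(text: string, maxBytes: number): string {
  const bytes = encoder.encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = maxBytes;
  // Walk backwards until `end` sits on a leading byte, so the slice
  // [0, end) contains only complete codepoints.
  while (end > 0 && isContinuationByte(bytes[end])) end--;
  return decoder.decode(bytes.subarray(0, end));
}
```

With an 8192-byte cap, `'设'.repeat(4096)` (12288 bytes) is cut back to 8190 bytes, i.e. 2730 whole characters, because byte 8192 falls in the middle of a three-byte sequence.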

P2 — Spec R7 now lists the actual Langfuse trace deletion endpoint
(DELETE /api/public/traces/{traceId} for single, DELETE /api/public/traces
with body for batch). Verified by curl on us.cloud.langfuse.com:
DELETE /api/public/traces/X → 200; the path the original spec named
(POST /api/public/trace/X) returns 404. Reference link points at
langfuse.com/docs/administration/data-deletion.

P3 — Q4 (legacy ingestion vs OTel) moved from Open Questions to
Resolved Decisions. The implementation already commits to legacy and
the trade-off was discussed during design; the open-question status was
stale.

* feat(web): privacy consent surface + Settings → Privacy tab

Adds the user-facing half of the telemetry feature so the daemon-side
hook from PR #800 has something to talk to.

- AppConfig gains optional `installationId` (anonymous v4 uuid generated
  on first opt-in; null after explicit decline; undefined when the user
  has never seen the consent surface) and `telemetry: TelemetryConfig`
  ({metrics, content, artifactManifest}). syncConfigToDaemon round-trips
  both fields so the bridge module sees the same prefs.

- SettingsDialog grows a Privacy section with two states. When the user
  has never made a consent decision (typical first-run path), the
  section renders the GDPR-aligned consent card: a kicker, the disclosure
  body listing both metrics and conversation content as separate bullets,
  and two equally-prominent buttons ("Share usage data" / "Don't share").
  The Don't-share path keeps the app fully usable (core app must work
  with all tracking declined). After a decision the same panel switches
  to three independent toggles + the anonymous ID + a "Delete my data"
  button that rotates the ID and turns everything off.

- App.tsx points the welcome modal at the new Privacy section so the
  consent decision is the first thing a fresh installation sees.

- 17 i18n keys land in en + zh-CN + zh-TW with hand-translated copy,
  and as English placeholders in the remaining 14 locales — enough for
  the parity check to pass while leaving room for proper localisation
  in a follow-up. Dict type updated.

- Minimal index.css for the consent card + toggle rows so the panel is
  legible without depending on follow-up design polish.

Telemetry remains a no-op end-to-end until the user clicks Share usage
data: the daemon gate (prefs.metrics === true) keeps every code path
short-circuited otherwise.

* refactor(web): rebuild Privacy panel using project-native settings primitives

The first cut used custom .settings-privacy-* classes + raw HTML
checkboxes that didn't match any other Settings tab. Replace with the
shell other sections already use:

- settings-subsection containers with section-head + h4 + .hint
- seg-control / seg-btn pill toggles ("active" / "offline") for each of
  the three telemetry preferences, mirroring NotificationsSection
- a 2-cell seg-control for the consent card so Share usage data and
  Don't share carry identical visual weight (the GDPR equal-prominence
  requirement that the previous accent / outline split missed)
- ghost button + readonly text input for the installation id row,
  mirroring the API-key field pattern elsewhere

Drop the bespoke CSS block in favor of inheriting the existing
settings-section / seg-control / ghost styling. The only privacy-
specific style left is a tight definition list inside the consent card
for the metrics + content disclosure rows.

* refactor(web): use .toggle-row iOS switch for Privacy preferences

Active/offline pills (the seg-control single-cell pattern that
NotificationsSection uses) read awkwardly for a flat preference list.
Switch the three telemetry toggles to .toggle-row — the same control
NewProjectPanel uses for "speaker notes" / "animations": label + hint
on the left, iOS-style sliding switch on the right, full-row click
target. The consent card's two-button seg-control stays as-is — there
the equal-weight pill pair is exactly what GDPR equal-prominence wants.

* feat(web): standalone first-run privacy consent banner

Replaces the Settings-dialog-as-onboarding hack with a dedicated
bottom-right banner card that mounts whenever the user has never made
a privacy decision (cfg.installationId === undefined). The banner is
prominent (anchored to the corner with a soft shadow) but
non-blocking, mirrors cookie-consent UX, and shares the project's
panel styling — same .modal-elevated background, --radius-lg corners,
--shadow-lg lift.

Wiring:

- App.tsx imports PrivacyConsentModal and renders it at the root,
  gated on installationId === undefined && !settingsOpen so it doesn't
  double up with the Privacy tab's own consent card when Settings is
  already showing.
- Share / Don't share both go through handleConfigPersist, so the
  resulting installationId + telemetry prefs land in localStorage and
  the daemon at the same time, reusing the existing autosave plumbing.
- The previous attempt that pinned the welcome SettingsDialog to the
  Privacy section is reverted; onboarding now stays focused on agent
  configuration, and the consent decision lives in its own surface.

* fix(web): keep privacy banner visible while Settings welcome modal is open

The banner gated itself on `!settingsOpen` to avoid double-rendering
with the Privacy tab's consent card. But the first-run path opens the
Settings welcome modal automatically when `onboardingCompleted=false`,
which fired immediately after bootstrap — so the banner flashed for a
moment and then vanished behind the modal backdrop.

Drop the `!settingsOpen` clause so the banner stays mounted whenever
the user has not yet made a privacy decision, and bump its z-index
above the modal backdrop (200 vs 100) so first-run users can actually
reach the consent buttons. The minor visual overlap with the Privacy
tab's own card is fine: clicking either copy resolves both surfaces.

* copy(privacy): soften consent button labels

Banner action buttons now read "Help improve Open Design" / "Not now"
(en, with hand translations in zh-CN / zh-TW and English placeholders
in the other 13 locales) instead of "Share usage data" / "Don't share".

The new wording aligns the affirmative action with the kicker copy
("Help us improve Open Design") and reads less alarming, while the
disclosure list above still names both data categories explicitly so
the consent stays informed under GDPR. The decline button stays as a
soft "Not now" rather than an aggressive "Don't share" so the reject
path doesn't read as hostile to the user.

No structural change — the two-cell seg-control still gives the buttons
identical visual weight, and the underlying side-effects are unchanged
(installationId is generated on Help / nulled on Not now, and the
telemetry prefs flip the same way).

* feat(telemetry): expand trace fields for evals & dataset construction

Each Langfuse trace now ships the full per-turn + per-install fact
sheet that the eval/dataset workflow needs, instead of only the bare
turn id + token count from before. Everything below is gated by
`prefs.metrics === true`; nothing here is content (those gates remain
separate).

Per-turn:
- model — first-class generation.model field, drives Langfuse cost
  lookup and model-grouping in the UI; also mirrored in trace.metadata
  and trace.tags so list-view filters work.
- reasoning — generation.modelParameters.{ reasoning } so the Model
  Parameters card lights up; mirrored in metadata.
- skillId / designSystemId — metadata + tags, so dataset slices can
  group by which skill/DS produced which output.

Per-process / build (constant within one daemon run, cached at start):
- appVersion / appChannel / packaged from app-version.ts
- nodeVersion (process.version), os (platform()), osRelease,
  arch (os.arch())
- clientType — desktop vs web, derived from a new X-OD-Client header
  the web layer sets in providers/daemon.ts (with a User-Agent sniff
  fallback for third-party callers).

Plumbing:
- startChatRun stashes model / reasoning / skillId / designSystemId
  on the run object alongside the existing userPrompt stash.
- POST /api/runs reads X-OD-Client and stores run.clientType.
- langfuse-bridge collects RuntimeInfo once per process and merges
  per-run client carrier; ReportContext gains optional `turn` +
  `runtime` blocks; existing fields stay backward compatible.

Spec gains a "Telemetry Fields Catalog" section enumerating every
field, its source, and the gate it lives under, so the eval team has a
single place to look up what's available without reading the trace
schema by example.

Tests:
- new langfuse-trace tests cover turn tags, runtime tags, generation
  model/modelParameters promotion, modelParameters omission when
  reasoning is unset, and metadata mirroring.
- langfuse-bridge gains an end-to-end "turn-level config" test that
  threads model/reasoning/skill/DS/clientType + appVersion through
  the bridge and asserts the Langfuse payload shape.
- existing tests adjusted to tolerate host-dependent os tag.

* copy(privacy): trim Share button to verb phrase only

"Help improve Open Design" overflowed the equal-width 2-cell
seg-control on the consent banner — the product name is already in
the kicker + headline above the buttons, so the button itself only
needs the verb phrase. Drop the product name from all locales:

- en: Help improve Open Design → Help improve
- zh-CN: 帮助改进 Open Design → 帮助改进
- zh-TW: 協助改進 Open Design → 協助改進

The decline button ("Not now" / "暂不" / "暫不") was already short, so
the two buttons now have comparable length and the equal-prominence
seg-control fits cleanly. Standalone Settings → Privacy panel uses
the same labels for consistency.

* fix(web): defer Settings welcome modal until privacy decision is made

Previously bootstrap raced two surfaces against each other on first
launch: the privacy consent banner (gated on installationId ===
undefined) and the Settings welcome modal (gated on
onboardingCompleted === false). The banner's higher z-index kept it
above the backdrop visually, but having two foreground surfaces at
once is still confusing UX.

Sequence them instead: bootstrap only opens the welcome modal when
the user has *already* resolved consent (installationId !== undefined).
Until then the banner owns the foreground alone. Once the user clicks
Help improve / Not now, the corresponding handler hands off to the
welcome modal if onboarding is still pending. End state matches what
it was before — just without the simultaneous-render flash.

* debug(privacy): log banner gate state to track sudden disappearance

Two console.log points to find which setCfg call (or stale bundle) is
flipping cfg.installationId from undefined to a value while the banner
is visible. To remove once the regression is reproduced.

* fix(privacy): keep installationId + telemetry out of localStorage

Daemon is now the single source of truth for the privacy decision.
Why this matters: the consent banner gates on
`config.installationId === undefined`, but loadConfig() merges
localStorage on top of the daemon's reply, so a stale uuid in
`open-design:config` (left over from a previous opt-in) was
re-hydrating the React state and immediately syncing back to the
daemon — defeating "Delete my data" and re-suppressing the banner
within milliseconds of every page load.

The deeper reason to fix it here, not just patch the gate: a privacy
identifier persisted in browser storage that the user can't see or
clear without DevTools is a compliance liability. Anything users can
revoke needs one canonical place to store it. Daemon `app-config.json`
already serves that role for everything else gated through
syncConfigToDaemon, so installationId + telemetry now ride that path
exclusively:

- saveConfig() strips both keys before writing localStorage.
- loadConfig() strips both keys when reading older stale payloads,
  so existing installs migrate transparently on next launch.
- syncConfigToDaemon() / mergeDaemonConfig still round-trip them, so
  the React state stays in sync with the daemon as before.

Net effect: clearing app-config.json (or hitting "Delete my data") now
fully resets the install identity, with no residual cohort key in
browser storage.
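
The stripping step shared by saveConfig() and loadConfig() can be sketched as a small pure helper. `stripDaemonOnlyKeys` and `DAEMON_ONLY_KEYS` are hypothetical names used for illustration; the web app's actual functions may inline this logic.

```typescript
// Keys that must live only in the daemon's app-config.json.
const DAEMON_ONLY_KEYS = ['installationId', 'telemetry'] as const;
type DaemonOnlyKey = (typeof DAEMON_ONLY_KEYS)[number];

// Returns a shallow copy of `config` without the daemon-only keys.
// The input object is left untouched so React state keeps both fields.
function stripDaemonOnlyKeys<T extends Record<string, unknown>>(
  config: T,
): Omit<T, DaemonOnlyKey> {
  const copy: Record<string, unknown> = { ...config };
  for (const key of DAEMON_ONLY_KEYS) delete copy[key];
  return copy as Omit<T, DaemonOnlyKey>;
}
```

Applying the same helper on both the write path and the read path is what would let older installs with a stale uuid in localStorage migrate on the next launch.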

* feat(privacy): scrub secrets + PII from prompt/output before send

When prefs.content is on, daemon now runs the prompt and assistant
text through a regex scrubber (apps/daemon/src/redact.ts) before
posting to Langfuse. The scrubber is the simplest thing that gives
the user-facing copy a truthful claim — pure regex, zero new
dependencies, fully auditable in this Apache-2.0 repo (vs. pulling a
single-maintainer 5-month-old npm package into a core process).

Categories covered (each replaced with [REDACTED:<kind>]):

- Anthropic / OpenAI sk- keys (incl. proj/live/test/ant variants)
- Langfuse pk-lf- / sk-lf- (specific rule wins over generic sk-)
- GitHub gh[opsur]_ tokens
- AWS access key ids (AKIA + 16 uppercase)
- Google API keys (AIza + 35)
- Slack xox[abprs]- tokens
- Stripe live/test keys
- JWT header.payload.signature triples
- Bearer-header values (scheme word stays readable)
- Emails, IPv4, US-style phone numbers
- Credit cards — 13–19 digit runs that pass a Luhn check, so order ids
  and unix-nanos timestamps that fail Luhn pass through unchanged

Not covered, stated openly in spec + i18n: names, postal addresses,
business-secret semantics, raw 40-hex tokens (too high a false-positive
cost for artifact slugs). Those would require an ML layer.
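
To illustrate the Luhn gate on digit runs, here is a minimal sketch. The function names are hypothetical, `[REDACTED:card]` is an assumed value for `<kind>`, and the pattern only handles contiguous digits (no spaces or dashes), which may be narrower than the rule in redact.ts.

```typescript
// Standard Luhn checksum over a 13 to 19 digit string.
function passesLuhn(digits: string): boolean {
  if (!/^\d{13,19}$/.test(digits)) return false;
  let sum = 0;
  let shouldDouble = false;
  // Walk right to left, doubling every second digit.
  for (let i = digits.length - 1; i >= 0; i--) {
    let digit = digits.charCodeAt(i) - 48;
    if (shouldDouble) {
      digit *= 2;
      if (digit > 9) digit -= 9;
    }
    sum += digit;
    shouldDouble = !shouldDouble;
  }
  return sum % 10 === 0;
}

// Replaces only the digit runs that pass the Luhn check.
function redactCards(text: string): string {
  return text.replace(/\b\d{13,19}\b/g, (run) =>
    passesLuhn(run) ? '[REDACTED:card]' : run,
  );
}
```

For example, the well-known test number 4111111111111111 is replaced, while a millisecond timestamp such as 1778244000000 fails the checksum and passes through unchanged.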

Wired in:
- apps/daemon/src/redact.ts — exports redactSecrets() +
  redactSecretsWithCounts() helper for future audit-summary metadata.
- apps/daemon/src/langfuse-bridge.ts — runs both prompt and output
  through redactSecrets() before they reach the trace builder.
- 18 unit tests cover every pattern plus negative cases (Luhn-failing
  digit runs, out-of-range IPv4 octets, idempotence on re-redacted
  text, ordinary prose passthrough).
- i18n privacyContentHint on en + zh-CN + zh-TW (plus 14 locale
  placeholders) enumerates the categories so the consent disclosure
  matches the implementation — the GDPR informed-consent requirement.
- spec gains a Pre-send Redaction subsection with the regex shape
  table + intentional non-coverage list.

Drive-by: dropped the [privacy] debug logs that traced the now-fixed
bootstrap regression.

* fix(telemetry): make Langfuse reporting resilient

* feat(telemetry): nest Langfuse turn observations

* feat(telemetry): emit Langfuse tool spans

* fix(telemetry): report after finalized message writes

* fix(telemetry): honor persisted terminal status

* fix(web): let consent banner yield page clicks

* fix(telemetry): report current turn prompt only
2026-05-09 10:06:01 +08:00


import http from 'node:http';
import { mkdtemp, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import path from 'node:path';
import express from 'express';
import {
  afterAll,
  afterEach,
  beforeAll,
  beforeEach,
  describe,
  expect,
  it,
} from 'vitest';
import { readAppConfig, writeAppConfig } from '../src/app-config.js';
import { isLocalSameOrigin } from '../src/origin-validation.js';
describe('app-config', () => {
  let dataDir: string;

  beforeEach(async () => {
    dataDir = await mkdtemp(path.join(tmpdir(), 'od-appconfig-'));
  });

  afterEach(async () => {
    await rm(dataDir, { recursive: true, force: true });
  });

  describe('readAppConfig', () => {
    it('returns {} when config file does not exist', async () => {
      expect(await readAppConfig(dataDir)).toEqual({});
    });

    it('returns parsed config from existing file', async () => {
      await writeFile(
        path.join(dataDir, 'app-config.json'),
        JSON.stringify({ onboardingCompleted: true }),
      );
      const cfg = await readAppConfig(dataDir);
      expect(cfg.onboardingCompleted).toBe(true);
    });

    it('returns {} for corrupted JSON without crashing', async () => {
      await writeFile(path.join(dataDir, 'app-config.json'), '{not valid');
      const cfg = await readAppConfig(dataDir);
      expect(cfg).toEqual({});
    });

    it('returns {} when file contains a JSON array', async () => {
      await writeFile(path.join(dataDir, 'app-config.json'), '[1,2,3]');
      const cfg = await readAppConfig(dataDir);
      expect(cfg).toEqual({});
    });

    it('returns {} when file contains a JSON primitive', async () => {
      await writeFile(path.join(dataDir, 'app-config.json'), '"hello"');
      const cfg = await readAppConfig(dataDir);
      expect(cfg).toEqual({});
    });

    it('filters out unknown keys from stored file', async () => {
      await writeFile(
        path.join(dataDir, 'app-config.json'),
        JSON.stringify({ agentId: 'claude', rogue: 'value', __proto: 'x' }),
      );
      const cfg = await readAppConfig(dataDir);
      expect(cfg).toEqual({ agentId: 'claude' });
      expect(cfg).not.toHaveProperty('rogue');
      expect(cfg).not.toHaveProperty('__proto');
    });

    it('filters out invalid scalar values from stored file', async () => {
      await writeFile(
        path.join(dataDir, 'app-config.json'),
        JSON.stringify({
          onboardingCompleted: 'yes',
          agentId: 123,
          skillId: { id: 'bad' },
          designSystemId: ['bad'],
        }),
      );
      const cfg = await readAppConfig(dataDir);
      expect(cfg).toEqual({});
    });

    it('preserves omitted orbit.templateSkillId from legacy stored config', async () => {
      await writeFile(
        path.join(dataDir, 'app-config.json'),
        JSON.stringify({
          orbit: {
            enabled: true,
            time: '09:30',
          },
        }),
      );
      const cfg = await readAppConfig(dataDir);
      expect(cfg.orbit).toEqual({
        enabled: true,
        time: '09:30',
      });
      expect(cfg.orbit).not.toHaveProperty('templateSkillId');
    });

    it('falls back to default orbit time for out-of-range stored values', async () => {
      await writeFile(
        path.join(dataDir, 'app-config.json'),
        JSON.stringify({
          orbit: {
            enabled: true,
            time: '99:99',
          },
        }),
      );
      const cfg = await readAppConfig(dataDir);
      expect(cfg.orbit).toEqual({
        enabled: true,
        time: '08:00',
      });
    });

    it('preserves explicit orbit.templateSkillId null and trimmed string', async () => {
      await writeFile(
        path.join(dataDir, 'app-config.json'),
        JSON.stringify({
          orbit: {
            enabled: false,
            time: '08:00',
            templateSkillId: null,
          },
        }),
      );
      let cfg = await readAppConfig(dataDir);
      expect(cfg.orbit).toEqual({
        enabled: false,
        time: '08:00',
        templateSkillId: null,
      });

      await writeFile(
        path.join(dataDir, 'app-config.json'),
        JSON.stringify({
          orbit: {
            enabled: true,
            time: '10:15',
            templateSkillId: ' orbit-general ',
          },
        }),
      );
      cfg = await readAppConfig(dataDir);
      expect(cfg.orbit).toEqual({
        enabled: true,
        time: '10:15',
        templateSkillId: 'orbit-general',
      });
    });
  });
  describe('writeAppConfig', () => {
    it('creates data directory if missing', async () => {
      const nested = path.join(dataDir, 'sub', 'dir');
      await writeAppConfig(nested, { onboardingCompleted: true });
      const cfg = await readAppConfig(nested);
      expect(cfg.onboardingCompleted).toBe(true);
    });

    it('only persists ALLOWED_KEYS, filtering unknown keys', async () => {
      await writeAppConfig(dataDir, {
        onboardingCompleted: true,
        unknownKey: 'should be dropped',
        agentId: 'claude',
      });
      const cfg = await readAppConfig(dataDir);
      expect(cfg).toEqual({ onboardingCompleted: true, agentId: 'claude' });
      expect(cfg).not.toHaveProperty('unknownKey');
    });

    it('does not persist invalid scalar values', async () => {
      await writeAppConfig(dataDir, {
        onboardingCompleted: 'yes',
        agentId: 123,
        skillId: false,
        designSystemId: { id: 'bad' },
      });
      const cfg = await readAppConfig(dataDir);
      expect(cfg).toEqual({});
    });

    it('merges with existing config', async () => {
      await writeAppConfig(dataDir, { agentId: 'claude' });
      await writeAppConfig(dataDir, { skillId: 'coder' });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentId).toBe('claude');
      expect(cfg.skillId).toBe('coder');
    });

    it('clears a key when null is sent', async () => {
      await writeAppConfig(dataDir, { agentId: 'claude', skillId: 'coder' });
      await writeAppConfig(dataDir, { agentId: null });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentId).toBeNull();
      expect(cfg.skillId).toBe('coder');
    });

    it('clears agentModels when null is sent', async () => {
      await writeAppConfig(dataDir, {
        agentModels: { a: { model: 'gpt-4' } },
        onboardingCompleted: true,
      });
      expect((await readAppConfig(dataDir)).agentModels).toBeDefined();
      await writeAppConfig(dataDir, { agentModels: null });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentModels).toBeUndefined();
      expect(cfg.onboardingCompleted).toBe(true);
    });

    it('clears agentModels when empty object is sent', async () => {
      await writeAppConfig(dataDir, {
        agentModels: { a: { model: 'gpt-4' } },
      });
      await writeAppConfig(dataDir, { agentModels: {} });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentModels).toBeUndefined();
    });

    it('validates agentModels entries, dropping invalid shapes', async () => {
      await writeAppConfig(dataDir, {
        agentModels: {
          validAgent: { model: 'gpt-4', reasoning: 'fast' },
          invalidAgent: 'not-an-object',
          arrayAgent: [1, 2, 3],
          badKeys: { model: 'ok', extra: 42 },
        },
      });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentModels).toEqual({
        validAgent: { model: 'gpt-4', reasoning: 'fast' },
      });
    });

    it('drops agentModels entirely when no entries are valid', async () => {
      await writeAppConfig(dataDir, {
        onboardingCompleted: true,
        agentModels: { bad: 'string-value' },
      });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.onboardingCompleted).toBe(true);
      expect(cfg.agentModels).toBeUndefined();
    });

    it('persists supported per-agent CLI env keys and drops everything else', async () => {
      await writeAppConfig(dataDir, {
        agentCliEnv: {
          claude: {
            CLAUDE_CONFIG_DIR: ' ~/.claude-2 ',
            ANTHROPIC_API_KEY: 'sk-should-not-persist',
          },
          codex: {
            CODEX_HOME: '~/.codex-alt',
            CODEX_BIN: '~/bin/codex-next',
            OPENAI_API_KEY: 'sk-should-not-persist',
          },
          gemini: {
            GEMINI_API_KEY: 'should-not-persist',
          },
          __proto__: {
            CLAUDE_CONFIG_DIR: 'bad',
          },
        },
      });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentCliEnv).toEqual({
        claude: { CLAUDE_CONFIG_DIR: '~/.claude-2' },
        codex: { CODEX_HOME: '~/.codex-alt', CODEX_BIN: '~/bin/codex-next' },
      });
    });

    it('drops agentCliEnv entries that collide with Object.prototype keys', async () => {
      await writeAppConfig(dataDir, {
        agentCliEnv: {
          toString: {
            CODEX_HOME: '~/.codex-prototype',
          },
          hasOwnProperty: {
            CLAUDE_CONFIG_DIR: '~/.claude-prototype',
          },
          claude: {
            CLAUDE_CONFIG_DIR: '~/.claude-2',
          },
        },
      });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentCliEnv).toEqual({
        claude: { CLAUDE_CONFIG_DIR: '~/.claude-2' },
      });
    });

    it('clears agentCliEnv when null or an empty object is sent', async () => {
      await writeAppConfig(dataDir, {
        agentCliEnv: {
          claude: { CLAUDE_CONFIG_DIR: '~/.claude-2' },
        },
        onboardingCompleted: true,
      });
      expect((await readAppConfig(dataDir)).agentCliEnv).toBeDefined();
      await writeAppConfig(dataDir, { agentCliEnv: null });
      let cfg = await readAppConfig(dataDir);
      expect(cfg.agentCliEnv).toBeUndefined();
      expect(cfg.onboardingCompleted).toBe(true);

      await writeAppConfig(dataDir, {
        agentCliEnv: {
          codex: { CODEX_HOME: '~/.codex-alt' },
        },
      });
      await writeAppConfig(dataDir, { agentCliEnv: {} });
      cfg = await readAppConfig(dataDir);
      expect(cfg.agentCliEnv).toBeUndefined();
    });

    it('handles corrupted existing file gracefully on write', async () => {
      await writeFile(path.join(dataDir, 'app-config.json'), 'CORRUPT');
      await writeAppConfig(dataDir, { agentId: 'test' });
      const cfg = await readAppConfig(dataDir);
      expect(cfg.agentId).toBe('test');
    });
  });
});
// ---------------------------------------------------------------------------
// HTTP-layer origin guard
// ---------------------------------------------------------------------------
function httpRequest(
  url: string,
  opts: { method?: string; headers?: Record<string, string>; body?: string },
): Promise<{ status: number; body: string }> {
  return new Promise((resolve, reject) => {
    const parsed = new URL(url);
    const req = http.request(
      {
        hostname: parsed.hostname,
        port: Number(parsed.port),
        path: parsed.pathname,
        method: opts.method ?? 'GET',
        headers: opts.headers ?? {},
      },
      (res) => {
        let data = '';
        res.on('data', (c) => (data += c));
        res.on('end', () => resolve({ status: res.statusCode!, body: data }));
      },
    );
    req.on('error', reject);
    if (opts.body) req.write(opts.body);
    req.end();
  });
}
describe('app-config disabled lists', () => {
  let dataDir: string;

  beforeEach(async () => {
    dataDir = await mkdtemp(path.join(tmpdir(), 'od-disabled-'));
  });

  afterEach(async () => {
    await rm(dataDir, { recursive: true, force: true });
  });

  it('persists disabledSkills as string array', async () => {
    await writeAppConfig(dataDir, { disabledSkills: ['skill-a', 'skill-b'] });
    const cfg = await readAppConfig(dataDir);
    expect(cfg.disabledSkills).toEqual(['skill-a', 'skill-b']);
  });

  it('persists disabledDesignSystems as string array', async () => {
    await writeAppConfig(dataDir, { disabledDesignSystems: ['ds-x'] });
    const cfg = await readAppConfig(dataDir);
    expect(cfg.disabledDesignSystems).toEqual(['ds-x']);
  });

  it('drops disabledSkills when not a string array', async () => {
    await writeAppConfig(dataDir, { disabledSkills: 'not-array' } as any);
    const cfg = await readAppConfig(dataDir);
    expect(cfg.disabledSkills).toBeUndefined();
  });

  it('drops disabledSkills with non-string elements', async () => {
    await writeAppConfig(dataDir, { disabledSkills: [1, 2, 3] } as any);
    const cfg = await readAppConfig(dataDir);
    expect(cfg.disabledSkills).toBeUndefined();
  });

  it('clears disabledSkills when empty array is sent', async () => {
    await writeAppConfig(dataDir, { disabledSkills: ['a'] });
    await writeAppConfig(dataDir, { disabledSkills: [] });
    const cfg = await readAppConfig(dataDir);
    expect(cfg.disabledSkills).toEqual([]);
  });
});
describe('app-config telemetry prefs', () => {
let dataDir: string;
beforeEach(async () => {
dataDir = await mkdtemp(path.join(tmpdir(), 'od-telemetry-'));
});
afterEach(async () => {
await rm(dataDir, { recursive: true, force: true });
});
it('persists installationId as string', async () => {
await writeAppConfig(dataDir, {
installationId: '11111111-2222-3333-4444-555555555555',
});
const cfg = await readAppConfig(dataDir);
expect(cfg.installationId).toBe('11111111-2222-3333-4444-555555555555');
});
it('clears installationId when null is sent', async () => {
await writeAppConfig(dataDir, { installationId: 'abc' });
await writeAppConfig(dataDir, { installationId: null });
const cfg = await readAppConfig(dataDir);
expect(cfg.installationId).toBeNull();
});
it('drops installationId of wrong type', async () => {
await writeAppConfig(dataDir, { installationId: 12345 } as any);
const cfg = await readAppConfig(dataDir);
expect(cfg.installationId).toBeUndefined();
});
it('persists privacyDecisionAt as a timestamp', async () => {
await writeAppConfig(dataDir, { privacyDecisionAt: 1778244000000 });
const cfg = await readAppConfig(dataDir);
expect(cfg.privacyDecisionAt).toBe(1778244000000);
});
it('clears privacyDecisionAt when null is sent', async () => {
await writeAppConfig(dataDir, { privacyDecisionAt: 1778244000000 });
await writeAppConfig(dataDir, { privacyDecisionAt: null });
const cfg = await readAppConfig(dataDir);
expect(cfg.privacyDecisionAt).toBeNull();
});
it('drops privacyDecisionAt of wrong type', async () => {
await writeAppConfig(dataDir, { privacyDecisionAt: 'yesterday' } as any);
const cfg = await readAppConfig(dataDir);
expect(cfg.privacyDecisionAt).toBeUndefined();
});
it('persists full telemetry prefs', async () => {
await writeAppConfig(dataDir, {
telemetry: { metrics: true, content: true, artifactManifest: false },
});
const cfg = await readAppConfig(dataDir);
expect(cfg.telemetry).toEqual({
metrics: true,
content: true,
artifactManifest: false,
});
});
it('persists partial telemetry prefs and omits absent keys', async () => {
await writeAppConfig(dataDir, { telemetry: { metrics: true } });
const cfg = await readAppConfig(dataDir);
expect(cfg.telemetry).toEqual({ metrics: true });
});
it('drops telemetry inner values that are not booleans', async () => {
await writeAppConfig(dataDir, {
telemetry: {
metrics: 'yes' as any,
content: 1 as any,
artifactManifest: true,
},
} as any);
const cfg = await readAppConfig(dataDir);
expect(cfg.telemetry).toEqual({ artifactManifest: true });
});
it('drops telemetry entirely when no inner key is valid', async () => {
await writeAppConfig(dataDir, {
onboardingCompleted: true,
telemetry: { metrics: 'yes' } as any,
} as any);
const cfg = await readAppConfig(dataDir);
expect(cfg.onboardingCompleted).toBe(true);
expect(cfg.telemetry).toBeUndefined();
});
it('drops unknown keys nested inside telemetry', async () => {
await writeAppConfig(dataDir, {
telemetry: { metrics: true, rogue: true } as any,
} as any);
const cfg = await readAppConfig(dataDir);
expect(cfg.telemetry).toEqual({ metrics: true });
expect(cfg.telemetry).not.toHaveProperty('rogue');
});
it('drops telemetry when value is not a plain object', async () => {
await writeAppConfig(dataDir, { telemetry: [true] } as any);
const cfg = await readAppConfig(dataDir);
expect(cfg.telemetry).toBeUndefined();
});
it('clears telemetry when null is sent', async () => {
await writeAppConfig(dataDir, {
telemetry: { metrics: true, content: true },
});
await writeAppConfig(dataDir, { telemetry: null } as any);
const cfg = await readAppConfig(dataDir);
expect(cfg.telemetry).toBeUndefined();
});
it('replaces telemetry without disturbing other keys', async () => {
await writeAppConfig(dataDir, {
installationId: 'install-1',
telemetry: { metrics: true },
agentId: 'claude',
});
await writeAppConfig(dataDir, { telemetry: { content: true } });
const cfg = await readAppConfig(dataDir);
expect(cfg.installationId).toBe('install-1');
expect(cfg.agentId).toBe('claude');
// telemetry is replaced wholesale, not deep-merged, so `metrics` from the
// first write is gone. This matches the agentModels semantics.
expect(cfg.telemetry).toEqual({ content: true });
});
});
describe('app-config origin guard', () => {
let server: http.Server;
let port: number;
let baseUrl: string;
beforeAll(
() =>
new Promise<void>((resolve) => {
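// Minimal stand-in app: only the origin guard is under test, so the
// handlers return canned or echoed bodies instead of touching real config.
const app = express();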
app.use(express.json());
app.get('/api/app-config', (req, res) => {
if (!isLocalSameOrigin(req, port)) {
return res
.status(403)
.json({ error: 'cross-origin request rejected' });
}
res.json({ config: {} });
});
app.put('/api/app-config', (req, res) => {
if (!isLocalSameOrigin(req, port)) {
return res
.status(403)
.json({ error: 'cross-origin request rejected' });
}
res.json({ config: req.body });
});
// Port 0 asks the OS for a free ephemeral port; read it back once listening.
server = app.listen(0, '127.0.0.1', () => {
port = (server.address() as { port: number }).port;
baseUrl = `http://127.0.0.1:${port}`;
resolve();
});
}),
);
afterAll(() => new Promise<void>((resolve) => server.close(() => resolve())));
it('allows GET from same-origin (no Origin header)', async () => {
const res = await httpRequest(`${baseUrl}/api/app-config`, {
headers: { Host: `127.0.0.1:${port}` },
});
expect(res.status).toBe(200);
});
it('allows PUT from same-origin', async () => {
const res = await httpRequest(`${baseUrl}/api/app-config`, {
method: 'PUT',
headers: {
'Content-Type': 'application/json',
Host: `127.0.0.1:${port}`,
Origin: `http://127.0.0.1:${port}`,
},
body: JSON.stringify({ onboardingCompleted: true }),
});
expect(res.status).toBe(200);
});
it('rejects GET with cross-origin Origin header', async () => {
const res = await httpRequest(`${baseUrl}/api/app-config`, {
headers: {
Host: `127.0.0.1:${port}`,
Origin: 'https://evil.com',
},
});
expect(res.status).toBe(403);
});
it('rejects PUT with cross-origin Origin header', async () => {
const res = await httpRequest(`${baseUrl}/api/app-config`, {
method: 'PUT',
headers: {
'Content-Type': 'application/json',
Host: `127.0.0.1:${port}`,
Origin: 'https://evil.com',
},
body: JSON.stringify({ agentId: 'hacked' }),
});
expect(res.status).toBe(403);
});
it('rejects request with wrong Host header', async () => {
const res = await httpRequest(`${baseUrl}/api/app-config`, {
headers: { Host: 'evil.com:9999' },
});
expect(res.status).toBe(403);
});
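it('rejects no-Origin requests that only match configured deployment hosts', async () => {
const previous = process.env.OD_ALLOWED_ORIGINS;
process.env.OD_ALLOWED_ORIGINS = 'https://od.example.com';
try {
const res = await httpRequest(`${baseUrl}/api/app-config`, {
headers: { Host: 'od.example.com' },
});
expect(res.status).toBe(403);
} finally {
// Restore the prior value so this test cannot leak env state into others.
if (previous === undefined) delete process.env.OD_ALLOWED_ORIGINS;
else process.env.OD_ALLOWED_ORIGINS = previous;
}
});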
it('still rejects non-loopback Origin', async () => {
const res = await httpRequest(`${baseUrl}/api/app-config`, {
headers: {
Host: `127.0.0.1:${port}`,
Origin: 'https://evil.com',
},
});
expect(res.status).toBe(403);
});
});