open-design/apps/daemon/tests/run-result.test.ts
lefarcen 255c3058c5
fix(analytics): app_version=0.0.0 + media providers clicks + lock run_finished error_code (#2453)
* fix(analytics): use state for runtime app version so PostHog gets the real value

`useAppVersion()` stored the fetched `/api/version` result in a `useRef`,
but ref writes do NOT trigger a re-render. The hook therefore kept
returning '0.0.0' forever and the downstream `useEffect` that calls
`client.register({ app_version, ui_version })` never re-ran with the
real version. PostHog dashboards then showed `app_version=0.0.0` and
`ui_version=0.0.0` on every event ever shipped from the web client.

Switching to `useState` lets the resolved version flow through React's
render cycle so the register-on-change effect picks it up. The boot
placeholder still ships as '0.0.0' for the first events before the
fetch resolves (we don't re-emit those), but every event after init
now carries the real daemon-pinned version.

Adds a red-spec at apps/web/tests/analytics-app-version.test.tsx that
went red on the `useRef` shape (`expected '0.0.0' to be '1.2.3'`) and
green on the `useState` shape, so a future refactor can't silently
regress it.

* feat(analytics): wire media providers click events + lock run_finished error_code invariant

Two analytics gaps shipped together because both came out of the same
PostHog spot-check after PR #2390 landed:

1. Settings → Media providers (CSV row "client_type=desktop / mason /
   media_providers") wasn't emitting any ui_click events. The contract
   type `SettingsMediaProvidersClickProps` and helper
   `trackSettingsMediaProvidersClick` were defined but no call site
   used them, so the dashboard showed zero traffic on every element.
   Added the four v2 elements:
   - `reload` on the "Reload from daemon" button
   - `key_input` on every per-provider API key field (onFocus, mirrors
     the BYOK key field pattern in this same dialog)
   - `url_input` on every per-provider base-URL field
   - `clear` on each row's Clear button (fires before the confirm
     dialog so the intent signal is recorded even if the user backs
     out)
   Each event carries `providers_id` (provider.id) and `is_configured`
   (truthy when the row has a stored entry).

2. `run_finished` with `result=failed` was reported as missing
   `error_code` on PostHog. Audited every failure path: the daemon's
   `child.on('close', ...)` handler has several branches that call
   `runs.finish('failed', code, signal)` directly without first
   emitting an SSE `error` event (ACP fatal, agentStreamError fall
   through, child close without diagnostic), leaving
   `run.errorCode === null` in the status body. The existing fallback
   in `server.ts` already derives `AGENT_SIGNAL_*` / `AGENT_EXIT_*` /
   `AGENT_TERMINATED_UNKNOWN` from `signal` / `exitCode` for those
   cases, so the wire emission should never blank out — but the logic
   was inline and had no unit coverage.

   Extracted the result/error_code derivation into
   `apps/daemon/src/run-result.ts` and added 12 unit tests covering:
   - explicit errorCode forwarding
   - signal-only failures
   - exit-code-only failures
   - clean (code=0) failures (ACP fatal shape)
   - cancelled runs (with and without stamped code)
   - empty-string errorCode defensive case
   - status→result mapping for succeeded/canceled/failed/unknown

   All 12 pass — confirming the invariant "result=failed always
   carries error_code" holds for every failure shape the daemon
   produces. The refactor pins that invariant so a future change
   loses test coverage rather than silently regressing on PostHog.

   If `error_code` still looks empty on a live event, share the
   PostHog event JSON + the agent id and I'll dig further — at this
   point the daemon emission itself is exercised end-to-end.
2026-05-20 21:50:11 +08:00

142 lines
4.4 KiB
TypeScript

// Unit coverage for `deriveRunErrorCode`. The v2 analytics doc requires
// `run_finished` events with `result === 'failed'` to carry a non-empty
// `error_code` so dashboards keyed on it can break failures down by
// reason. Several daemon failure paths reach `runs.finish('failed',
// ...)` WITHOUT first emitting an SSE `error` event (ACP fatal,
// agentStreamError fall-through, child close with no diagnostic). When
// they do, the run status body's `errorCode` is `null`. The fallback
// chain in `run-result.ts` turns those signal/exitCode hints into a
// best-effort code so the wire emission never blanks out.
//
// This pins each failure shape's expected output. A future refactor
// that loses the fallback re-introduces the original symptom: PostHog
// shows `result=failed` events with no `error_code`.
import { describe, expect, it } from 'vitest';
import {
deriveRunErrorCode,
runResultFromStatus,
} from '../src/run-result.js';
describe('runResultFromStatus', () => {
it('maps succeeded -> success', () => {
expect(runResultFromStatus('succeeded')).toBe('success');
});
it('maps canceled -> cancelled (with the analytics doc spelling)', () => {
expect(runResultFromStatus('canceled')).toBe('cancelled');
});
it('maps failed -> failed', () => {
expect(runResultFromStatus('failed')).toBe('failed');
});
it('maps unknown / partial states to failed (defensive)', () => {
expect(runResultFromStatus('running')).toBe('failed');
expect(runResultFromStatus(undefined)).toBe('failed');
});
});
describe('deriveRunErrorCode', () => {
it('returns undefined for a successful run', () => {
expect(
deriveRunErrorCode({
status: 'succeeded',
errorCode: null,
exitCode: 0,
signal: null,
}),
).toBeUndefined();
});
it('forwards an explicit errorCode set via runs.emit(error, …)', () => {
expect(
deriveRunErrorCode({
status: 'failed',
errorCode: 'AGENT_EXECUTION_FAILED',
exitCode: 1,
signal: null,
}),
).toBe('AGENT_EXECUTION_FAILED');
});
it('derives AGENT_SIGNAL_* when the child died from a signal', () => {
expect(
deriveRunErrorCode({
status: 'failed',
errorCode: null,
exitCode: null,
signal: 'SIGSEGV',
}),
).toBe('AGENT_SIGNAL_SIGSEGV');
});
it('derives AGENT_EXIT_<code> when the child exited non-zero with no signal', () => {
expect(
deriveRunErrorCode({
status: 'failed',
errorCode: null,
exitCode: 1,
signal: null,
}),
).toBe('AGENT_EXIT_1');
expect(
deriveRunErrorCode({
status: 'failed',
errorCode: null,
exitCode: 137,
signal: null,
}),
).toBe('AGENT_EXIT_137');
});
it('falls back to AGENT_TERMINATED_UNKNOWN when neither signal nor non-zero exit is available', () => {
// This is the shape produced when `acpSession.hasFatalError()` is
// true even though the child exited cleanly (code=0, no signal) —
// see `child.on('close', ...)` in server.ts. Without this fallback,
// those failures would still blank out on PostHog.
expect(
deriveRunErrorCode({
status: 'failed',
errorCode: null,
exitCode: 0,
signal: null,
}),
).toBe('AGENT_TERMINATED_UNKNOWN');
});
it('returns undefined for a cancelled run with no stamped code', () => {
// Cancellation is the user's choice; not a diagnosable failure.
expect(
deriveRunErrorCode({
status: 'canceled',
errorCode: null,
exitCode: null,
signal: 'SIGTERM',
}),
).toBeUndefined();
});
it('forwards an explicit errorCode on a cancelled run (cancel during error recovery)', () => {
expect(
deriveRunErrorCode({
status: 'canceled',
errorCode: 'AGENT_AUTH_REQUIRED',
exitCode: 1,
signal: null,
}),
).toBe('AGENT_AUTH_REQUIRED');
});
it('treats empty-string errorCode as "missing" so the fallback chain still kicks in', () => {
// Defensive — `readString` in runs.ts trims and null-checks, but
// any future caller passing an empty string would otherwise emit a
// blank `error_code` field and reintroduce the original symptom.
expect(
deriveRunErrorCode({
status: 'failed',
errorCode: '',
exitCode: 1,
signal: null,
}),
).toBe('AGENT_EXIT_1');
});
});