open-design/apps/daemon/tests/xai-oauth.test.ts
Joey-nexu 56988e406c
feat: integrate xAI SuperGrok subscription as a credential source for Grok media + X search (#2134)
* feat(daemon): add xAI OAuth client with PKCE + token storage

Wraps mcp-oauth.ts PKCE primitives for xAI's auth.x.ai OAuth server.
xAI doesn't speak MCP and doesn't expose Dynamic Client Registration,
so issuer / endpoints / client_id / scope / loopback :56121 are
hardcoded constants.

Adds xai-tokens.ts for persistent storage, mirroring mcp-tokens.ts:
atomic write + chmod 0600 + per-dataDir in-memory mutex. Simplified
for the single-token case (no per-server-id map).

Reference: NousResearch/hermes-agent hermes_cli/auth.py:93-100.
PoC reuses Hermes client_id (b1a00492-...); replace before stable
release once Open Design has its own.

Tests: 11 + 20, all green. tsc --noEmit clean. pnpm guard clean.

* feat(daemon): expose xAI Grok models in Hermes runtime fallbackModels

Lists grok-4.3, grok-4.20-reasoning, grok-4.20-non-reasoning, and
grok-4.20-multi-agent-0309 as discoverable Hermes fallback models.
A user who has not installed Hermes yet now sees these xAI options
in the model picker, signalling that `hermes auth add xai-oauth`
(SuperGrok subscription) or XAI_API_KEY unlocks Grok in Open Design
without OD itself implementing OAuth-for-chat.

`fetchModels` (which calls `hermes acp` to enumerate the user's
actually-installed providers) is unchanged; this list only kicks
in when probing fails (e.g. Hermes off PATH).

Reference: xAI × Nous Research grok-hermes integration announcement,
2026-05-15. https://x.ai/news/grok-hermes

* feat(media): route Grok Imagine through xAI OAuth credentials

Adds resolveXAIBearer() — a refresh-aware helper on top of the
xai-tokens.json store written by the daemon's OAuth client. Returns
a fresh access_token, transparently refreshing in-place when the
stored token enters the 120 s expiry skew window.
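
The skew check described above can be sketched roughly as follows.
This is an illustration only, not the helper's actual code: the
`needsRefresh` name, the `StoredToken` shape, and the use of epoch
milliseconds are assumptions; only the 120 s window comes from this
commit.

```typescript
// Hypothetical sketch of a refresh-skew check; names and units are assumptions.
const EXPIRY_SKEW_MS = 120 * 1000; // the 120 s window mentioned above

interface StoredToken {
  accessToken: string;
  refreshToken?: string;
  expiresAtMs: number; // absolute expiry as epoch milliseconds (assumed shape)
}

// True once the token is inside the skew window, or already expired.
function needsRefresh(token: StoredToken, nowMs: number): boolean {
  return token.expiresAtMs - nowMs <= EXPIRY_SKEW_MS;
}
```

Refreshing slightly early means a request that starts just before
expiry does not carry a token that dies in flight.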

Wires it into media-config.ts so the existing Grok provider gets the
same OAuth-fallback treatment OD already gives the OpenAI provider:
env keys win, then stored Settings keys, then OD-native xAI OAuth,
then a borrowed Hermes-side xai-oauth token from ~/.hermes/auth.json.
SuperGrok subscribers who already authorized Hermes get OD image /
video generation routed through their subscription with zero extra
setup.
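
A minimal sketch of that precedence order, assuming each source has
already been resolved to a token or null. `pickCredential`, the
`Candidate` shape, and the source labels are hypothetical and are not
the names used in media-config.ts.

```typescript
// Hypothetical sketch of the precedence order; source names are illustrative.
type CredentialSource = 'env' | 'stored' | 'od-oauth' | 'hermes-oauth';

interface Candidate {
  source: CredentialSource;
  token: string | null; // null when that source has nothing configured
}

// Highest priority first: env keys, stored Settings keys, OD-native OAuth,
// then the borrowed Hermes-side token.
const PRECEDENCE: readonly CredentialSource[] = [
  'env',
  'stored',
  'od-oauth',
  'hermes-oauth',
];

function pickCredential(candidates: readonly Candidate[]): Candidate | null {
  for (const source of PRECEDENCE) {
    const hit = candidates.find(
      (c) => c.source === source && c.token !== null && c.token !== '',
    );
    if (hit) return hit;
  }
  return null; // unconfigured: the caller surfaces the "no xAI API key" error
}
```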

Updates the "no xAI API key" error in renderGrokImage / renderGrokVideo
to point at the new OAuth path so users hitting it know they have a
zero-cost option.

Also exposes mediaConfigDir() so credential helpers next to
media-config.json (like xai-tokens.json) reuse the same precedence:
OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od.

Tests: 7 new xai-credentials cases (refresh on expiry, refresh
failure, missing refresh_token, response without refresh_token) +
8 new media-config Grok OAuth fallback cases (OD-native, Hermes
borrow, OD vs Hermes precedence, env precedence, stored precedence,
unconfigured, expired-without-refresh). All green; tsc / guard clean.

* feat(media): add xAI Grok TTS provider

Registers grok-tts in the speech model catalog and wires up
renderXAITTS to dispatch (provider=grok, surface=audio, kind=speech)
to https://api.x.ai/v1/tts. xAI exposes a dedicated /tts endpoint
that returns raw audio bytes — distinct from OpenAI's /audio/speech
JSON shape — so TTS gets its own renderer rather than reusing
renderOpenAISpeech.

Credentials route through the same OAuth-aware path as Grok image
and video (PR follow-up to media-config.ts), so a SuperGrok
subscriber gets TTS for free once they have authorized once.

Default request body matches the documented minimal shape
(text / voice_id / language); sample_rate / bit_rate / codec are
left unset so the server applies its mp3 / 24 kHz / 128 kbps
defaults. Plumbing for explicit overrides is left for a later PR
once the agent-facing contract grows the corresponding flags.

Tests: 5 cases covering documented body shape, voice / language
override, env-key fallback, server-error surfacing, and the
no-credentials error. All green; tsc / guard clean.

Reference: https://docs.x.ai/developers/model-capabilities/audio/text-to-speech

* feat(daemon, web): expose xAI OAuth flow in Settings UI

Closes the loop on the Grok integration: a SuperGrok subscriber can
now authorize Open Design directly from Settings → Media Providers →
Grok, with no API key and no Hermes install. After authorizing, image,
video, and TTS routes pick up the bearer through the OAuth fallback
chain added in 'route Grok Imagine through xAI OAuth credentials'.

Daemon side
- xai-oauth-server.ts opens a one-shot HTTP listener on
  127.0.0.1:56121 to receive the OAuth callback. The redirect URI is
  hard-locked to that port because the PoC reuses the Hermes-issued
  client_id. Listener self-closes on first matching callback or after
  a 30 min timeout.
- xai-routes.ts wires three endpoints onto the daemon's HTTP app:
    POST /api/xai/oauth/start       — mint state, open listener,
                                       return authorize URL
    GET  /api/xai/auth/status       — has-token / expiry / in-flight
    POST /api/xai/oauth/disconnect  — wipe stored token, stop listener
- server.ts registers xai-routes alongside the existing mcp-routes.

Web side
- XaiOAuthControl.tsx renders a Sign in / Reconnect / Disconnect
  surface mirroring McpOAuthControl, but polls /api/xai/auth/status
  exclusively because the :56121 callback page lives in a separate
  process and can't postMessage back to the OD UI. SettingsDialog
  embeds it inside the Grok provider row.

Tests: 9 listener cases (bind / state mismatch / replay / favicon /
EADDRINUSE / timeout / explicit error param / one-shot consume /
early stop) + 8 route cases (start mints PKCE URL, second start
replaces in-flight listener, status reports listening + connected,
callback ok stores token, callback error skips storage, disconnect
wipes, cross-origin guard rejects all three endpoints). All 17 +
the 74 from prior commits pass; tsc / web typecheck / pnpm guard
clean.

PoC client_id stays Hermes-issued; user-visible strings are
hardcoded English pending an i18n pass before stable.

* fix(daemon, web): xAI OAuth follow-up — paste-back, X search, UX polish

PoC testing surfaced four real-world rough edges in the Sign in flow
that were not obvious before getting an actual SuperGrok subscription
in front of it. None alter the architecture in 'expose xAI OAuth flow
in Settings UI'; they round it off so the path the user actually walks
matches the one the design assumed.

1. Layout. XaiOAuthControl was a grid item inside .media-provider-body
   and got squeezed into the API-key column. Moves it out of the body
   so the row's flex-column layout gives it the full width — matches
   what every other Settings provider OAuth surface gets.

2. Paste-back. xAI's `auth.x.ai` page often shows a "cannot connect to
   your application" fallback that hands the user a code instead of
   redirecting back to 127.0.0.1:56121, even when the loopback listener
   is reachable (browser DOES quietly redirect in the background, but
   the page lies and shows the manual-paste UI anyway). Adds:
     - POST /api/xai/oauth/complete that takes {state, code} and runs
       completeXAIAuth + setXAIToken + stops the listener.
     - A paste-back input row in XaiOAuthControl that surfaces while
       the dance is in flight; submitting either via Enter or the
       button calls /complete and falls through to the same connected
       state the loopback path lands on.

3. X search. New POST /api/xai/search wraps Grok's native x_search tool
   through the Responses API, gated on the same OAuth-first credential
   chain as Grok image / video / TTS. Body accepts query (required),
   allowed_x_handles, excluded_x_handles, from_date, to_date, model.
   Returns { answer, citations[], model } parsed from the Responses
   payload via two newly exported helpers (extractAnswerText,
   extractUrlCitations).

4. State machine + warning banner. Three issues collapsed into one:
     - Polling that flipped busy → 'idle' the moment the loopback
       listener self-closed disabled the paste-back input even though
       the dance was still recoverable. Removed that branch; awaiting
       state now only ends on connected=true or explicit cancel.
     - paste-input `disabled` was over-eager (`busy !== 'awaiting' &&
       busy !== 'refreshing'`); now it's only blocked while a submit
       is in flight (`busy === 'refreshing'`).
     - Added a heads-up banner inside the awaiting region explaining
       that xAI's "cannot connect" page is a UX bug on their side and
       the OD panel is the source of truth for sign-in success. The
       connected message picks up the cue too: "You can close any
       open xAI browser tabs now."
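
For item 3, the response parsing can be sketched as below. This
assumes an OpenAI-style Responses payload (message items holding
`output_text` parts with `url_citation` annotations); the payload
xAI actually returns may differ. `sketchAnswerText` and
`sketchUrlCitations` are hypothetical stand-ins, not the exported
extractAnswerText / extractUrlCitations helpers.

```typescript
// Assumed OpenAI-style Responses payload; the real xAI shape may differ.
interface Annotation { type: string; url?: string }
interface ContentPart { type: string; text?: string; annotations?: Annotation[] }
interface OutputItem { type: string; content?: ContentPart[] }
interface ResponsesPayload { output?: OutputItem[] }

// Collect every output_text part found inside message items.
function textParts(payload: ResponsesPayload): ContentPart[] {
  const parts: ContentPart[] = [];
  for (const item of payload.output ?? []) {
    if (item.type !== 'message') continue;
    for (const part of item.content ?? []) {
      if (part.type === 'output_text') parts.push(part);
    }
  }
  return parts;
}

// Concatenate the answer text in output order.
function sketchAnswerText(payload: ResponsesPayload): string {
  return textParts(payload).map((p) => p.text ?? '').join('');
}

// Return de-duplicated citation URLs in first-seen order.
function sketchUrlCitations(payload: ResponsesPayload): string[] {
  const urls = new Set<string>();
  for (const part of textParts(payload)) {
    for (const a of part.annotations ?? []) {
      if (a.type === 'url_citation' && a.url) urls.add(a.url);
    }
  }
  return Array.from(urls);
}
```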

Tests: +12 cases on top of the existing 17. The complete endpoint
covers happy path, blank-field rejection, and unknown-state error.
The search endpoint covers blank-query rejection, no-credentials 401,
full bearer / x_search-options forwarding with response parsing, and
upstream-error pass-through. Two helper functions get four direct
parser cases. All 29 in the file pass; 225 across the daemon test
suite pass; tsc / web tsc / pnpm guard all clean.

* fix(daemon): satisfy tsconfig.tests.json strictness in xai test files

The CI workspace typecheck step runs tsconfig.tests.json (which extends
tsconfig.json's strict + exactOptionalPropertyTypes settings and adds
the tests/ directory to the include set) — but the local
`tsc -p tsconfig.json --noEmit` I ran while iterating only covered
src/. That gap let three classes of strict-mode errors slip into the
PR's CI:

- `let outcome: CallbackOutcome | null = null` mutated from inside an
  async callback narrowed to `never` after `outcome?.kind` because TS
  doesn't track cross-function mutation. Switched the seven sites in
  xai-oauth-server.test.ts to a `{ current: CallbackOutcome | null }`
  ref object — TS does narrow .current correctly, so `kind` / `error`
  field access stops collapsing to `never`.
- `await r.json()` resolves to `unknown` in the fetch typings the
  tests compile against, so every `body.field` / `status.connected`
  access in xai-routes.test.ts tripped TS18046. Added a one-line
  `jsonOf<T = any>` helper at the top of the file and switched all
  call sites (both `await r.json()` and `.then((r) => r.json())`).
- The cross-origin guard test iterated `for (const [method, path] of
  [...])` — under noUncheckedIndexedAccess that destructures to
  `string | undefined`, which RequestInit.method (a `string` under
  exactOptionalPropertyTypes) won't accept. Hoisted the cases to a
  typed `ReadonlyArray<readonly [string, string]>` so the elements
  stay non-optional.
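
The ref-object pattern from the first bullet, as a minimal sketch.
The `CallbackOutcome` union and `fakeListener` here are hypothetical
stand-ins for the real types in the daemon source.

```typescript
// Hypothetical outcome type; the real CallbackOutcome lives in the daemon source.
type CallbackOutcome =
  | { kind: 'ok'; code: string }
  | { kind: 'error'; error: string };

// Stand-in for the listener: it simply invokes the callback it is handed.
function fakeListener(onCallback: (o: CallbackOutcome) => void): void {
  onCallback({ kind: 'ok', code: 'abc' });
}

// A ref object instead of `let outcome: CallbackOutcome | null = null`.
const outcome: { current: CallbackOutcome | null } = { current: null };
fakeListener((o) => {
  outcome.current = o;
});

// The property keeps its declared union type, so `.kind` type-checks here.
const kind = outcome.current?.kind;
```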

Behaviour is unchanged; vitest still reports 29/29 across these two
files. tsc -p tsconfig.tests.json --noEmit now passes locally,
matching what CI will run.

* fix(xai-oauth): preserve refresh_token + release :56121 on cancel

Two lifecycle issues Looper flagged on the prior commit:

1. resolveXAIBearer dropped the existing refresh_token whenever the
   refresh response omitted one. RFC 6749 §6 explicitly allows the
   server to skip refresh_token rotation and keep the old one valid;
   xAI's behaviour is currently to rotate, but a future change could
   silently break OD users. With the old code the first refresh
   succeeded but persisted a token with no refresh credential, so the
   next expiry forced the user back through Sign in even though their
   grant was still good. Carries the previous refresh_token forward
   when fresh.refresh_token is absent. Updates the matching
   xai-credentials test to assert the carried-forward value instead of
   the previous (incorrect) "drop it" assertion.

2. The Cancel button in XaiOAuthControl only cleared React-side
   pending state; the daemon's one-shot 127.0.0.1:56121 listener kept
   running for the full 30 min server timeout. /api/xai/auth/status
   would still report listening=true, and that singleton port could
   block the next Sign in (or a Hermes session on the same machine).
   Adds POST /api/xai/oauth/cancel that calls stopActiveListener()
   without touching the stored token (Disconnect is the destructive
   path; this is the narrow "release the port" affordance), wires the
   UI Cancel handler to fire it, and adds two route tests covering
   the listener-stopped-but-token-preserved invariant and the no-op
   behaviour when no listener is in flight.
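
The carry-forward rule from item 1 can be sketched as below. The
`TokenSet` shape and `mergeRefreshed` name are assumptions made for
illustration, not the code in resolveXAIBearer.

```typescript
// Hypothetical token shape mirroring a standard OAuth token response.
interface TokenSet {
  access_token: string;
  refresh_token?: string;
  expires_in?: number;
}

// Keep the previous refresh_token when the refresh response omits one.
function mergeRefreshed(previous: TokenSet, fresh: TokenSet): TokenSet {
  const refreshToken = fresh.refresh_token ?? previous.refresh_token;
  return refreshToken === undefined
    ? { ...fresh }
    : { ...fresh, refresh_token: refreshToken };
}
```

A rotated refresh_token in the response still wins; the previous one
is only a fallback.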

All 38 xai tests + tsconfig.tests.json typecheck + web typecheck +
pnpm guard pass.

* fix(xai-oauth): close two more lifecycle gaps Looper flagged

Both are non-blocking but cheap and right.

1. window.open used 'noopener=no,noreferrer=no' (carried over from the
   sibling McpOAuthControl), which deliberately KEEPS the auth.x.ai
   tab's window.opener reference back to the Settings tab. Reverse
   tabnabbing risk if the auth page or any redirect target along the
   OAuth chain ever turns hostile, with no upside — the xAI flow
   doesn't use postMessage, the daemon receives the code through the
   :56121 listener (or paste-back), so opener access buys nothing.
   Switched to 'noopener,noreferrer'.

2. PendingAuthCache was constructed with its default 10 min TTL while
   the loopback listener self-closes at 30 min and the UI shows a
   pending state for the same 30 min. After 10 min, a user looking at
   a live paste-back input would hit `xAI OAuth state not found or
   expired` even though everything visible (and the daemon socket)
   still claimed the dance was live. Constructed the cache with
   30 * 60 * 1000 so the PKCE state, the open :56121 socket, and the
   paste-back UI all expire together.
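
To illustrate the mismatch in item 2, here is a toy TTL cache with
an injectable clock. `TtlCache` and `LISTENER_TIMEOUT_MS` are
hypothetical; this is not the real PendingAuthCache, whose internals
are not shown in this commit.

```typescript
// Hypothetical TTL cache with an injectable clock; not the real PendingAuthCache.
const LISTENER_TIMEOUT_MS = 30 * 60 * 1000;

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; createdAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  put(key: string, value: V): void {
    this.entries.set(key, { value, createdAt: this.now() });
  }

  // One-shot read: returns the value if still fresh and removes it either way.
  consume(key: string): V | null {
    const entry = this.entries.get(key);
    this.entries.delete(key);
    if (!entry) return null;
    return this.now() - entry.createdAt <= this.ttlMs ? entry.value : null;
  }
}
```

With a 10 min TTL, a paste-back submitted at minute 15 finds nothing;
with the TTL set to LISTENER_TIMEOUT_MS the same submission succeeds.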

The third inline comment (XaiOAuthControl.tsx:248 — "Cancel only
clears React-side state") was a stale reference: the previous commit
fd04887 wired the Cancel button to fire `cancelInFlightOAuth()` which
hits the new `POST /api/xai/oauth/cancel` endpoint. Looper carried
the old comment forward when re-reviewing the rebased file; no code
change needed.

All 38 xai tests still green; tsconfig.tests.json clean; web tsc
clean; pnpm guard clean.

* fix(xai-oauth): keep loopback listener open on stale-tab callbacks

The one-shot listener marked itself consumed at the top of every
/callback request, then closed itself in the finally block whether
or not the state actually matched. A stray browser tab replaying an
old /callback?state=… (real-world scenario: user re-clicked Sign in
before closing the previous tab) would therefore close the singleton
:56121 listener with a state-mismatch error before the real xAI
redirect could arrive.

Now we only tear the listener down on outcomes that actually
terminate the dance:
  - ok callback (matched state, code present)
  - explicit ?error= from xAI (auth provider terminated; we should
    propagate, not wait for the 30 min timeout). xAI's error
    redirects may or may not echo state, but a stale tab can't
    fabricate ?error= without colluding with the auth server, so
    this branch is safe to consume.

Stale tabs / browser prefetches / malformed redirects still get the
HTTP 400 / "Sign-in failed" page, but the listener stays open and
the matching xAI redirect that arrives next is what closes it.

Tests: replaces the previous "rejects state mismatch with kind=error"
test with the recovery scenario (stale-then-real callbacks both hit
the listener; only the real one fires onCallback). Adds a sibling
case for missing-code / missing-state callbacks. xai-oauth-server
suite is now 10/10; full xai sweep 39/39.

* fix(xai-oauth): scope error-callback consume to matching/missing state

c00252c simplified the consume rule to "any explicit ?error= closes
the listener", which was broader than the stale-tab protection added
in the same commit. A browser history replay of an old
`/callback?error=access_denied&state=stale` would set `consumed`,
fire `onCallback`, and tear down the singleton 127.0.0.1:56121 socket
before the current dance's real callback could land — undoing the
defence the commit was supposed to add.

Tighten the rule so error-callbacks consume only when:
  - the URL carries no state (xAI rejected before issuing one, so
    there's nothing to compare against — safe to terminate), or
  - the carried state matches our expectedState (xAI explicitly
    rejected this dance; propagate immediately rather than wait for
    the 30 min timeout).

An ?error= replay carrying a *different* state is now treated like
the stale success replay above: returns the 400 page to the browser,
keeps the listener live, lets the real callback close it.
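
The resulting decision table can be sketched as a pure function.
`classifyCallback`, `CallbackParams`, and the `Decision` labels are
hypothetical; the listener's actual code is not shown here.

```typescript
// Hypothetical view of the query parameters on a /callback request.
interface CallbackParams {
  code: string | null;
  state: string | null;
  error: string | null;
}

// 'ignore' means: answer with the 400 page but keep the listener open.
type Decision = 'consume-ok' | 'consume-error' | 'ignore';

function classifyCallback(
  params: CallbackParams,
  expectedState: string,
): Decision {
  if (params.error !== null) {
    // An error terminates the dance only if it carries no state at all
    // or carries the state of the current dance.
    return params.state === null || params.state === expectedState
      ? 'consume-error'
      : 'ignore';
  }
  if (
    params.state === expectedState &&
    params.code !== null &&
    params.code !== ''
  ) {
    return 'consume-ok';
  }
  // Stale tabs, prefetches, and malformed redirects land here.
  return 'ignore';
}
```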

Tests: adds two cases — error+wrong-state followed by real success
must still resolve to ok; error+matching-state still consumes the
listener and surfaces the error to onCallback. xai-oauth-server
suite goes 10 → 12; full xai sweep 39 → 41.
2026-05-19 11:10:34 +08:00


import { createHash } from 'node:crypto';
import { describe, expect, it } from 'vitest';

import { PendingAuthCache } from '../src/mcp-oauth.js';
import {
  XAI_OAUTH_AUTHORIZATION_ENDPOINT,
  XAI_OAUTH_CLIENT_ID,
  XAI_OAUTH_REDIRECT_PORT,
  XAI_OAUTH_SCOPE,
  XAI_OAUTH_TOKEN_ENDPOINT,
  XAI_PROVIDER_ID,
  beginXAIAuth,
  completeXAIAuth,
  refreshXAIToken,
  xaiRedirectUri,
} from '../src/xai-oauth.js';

type FetchInput = Parameters<typeof fetch>[0];
type FetchInit = Parameters<typeof fetch>[1];
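
// Wraps a (url, init) handler as a fetch-compatible function for the tests.
function makeFetch(
  handler: (url: string, init?: FetchInit) => Promise<Response> | Response,
) {
  return async (input: FetchInput, init?: FetchInit): Promise<Response> => {
    // Request.toString() yields "[object Request]", so read .url for Requests.
    const url =
      typeof input === 'string'
        ? input
        : input instanceof URL
          ? input.href
          : input.url;
    return handler(url, init);
  };
}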

describe('xaiRedirectUri', () => {
  it('matches the loopback / port hermes-agent uses', () => {
    expect(xaiRedirectUri()).toBe(
      `http://127.0.0.1:${XAI_OAUTH_REDIRECT_PORT}/callback`,
    );
  });
});

describe('beginXAIAuth', () => {
  it('builds an authorize URL with PKCE, state, and the configured scope', () => {
    const pending = new PendingAuthCache();
    try {
      const { authorizeUrl, state } = beginXAIAuth({ pending });
      const u = new URL(authorizeUrl);
      expect(u.origin + u.pathname).toBe(XAI_OAUTH_AUTHORIZATION_ENDPOINT);
      expect(u.searchParams.get('response_type')).toBe('code');
      expect(u.searchParams.get('client_id')).toBe(XAI_OAUTH_CLIENT_ID);
      expect(u.searchParams.get('redirect_uri')).toBe(xaiRedirectUri());
      expect(u.searchParams.get('scope')).toBe(XAI_OAUTH_SCOPE);
      expect(u.searchParams.get('code_challenge_method')).toBe('S256');
      const challenge = u.searchParams.get('code_challenge');
      expect(challenge).toMatch(/^[A-Za-z0-9_-]+$/);
      expect(u.searchParams.get('state')).toBe(state);
    } finally {
      pending.stop();
    }
  });

  it('puts a pending state keyed by `state` whose serverId is "xai"', () => {
    const pending = new PendingAuthCache();
    try {
      const { state } = beginXAIAuth({ pending });
      expect(pending.size()).toBe(1);
      // We don't expose the inner state, but consume() should yield a
      // record with the right serverId and a non-empty verifier.
      const consumed = pending.consume(state);
      expect(consumed).not.toBeNull();
      expect(consumed!.serverId).toBe(XAI_PROVIDER_ID);
      expect(consumed!.codeVerifier.length).toBeGreaterThanOrEqual(43);
      expect(consumed!.tokenEndpoint).toBe(XAI_OAUTH_TOKEN_ENDPOINT);
    } finally {
      pending.stop();
    }
  });

  it('produces distinct verifiers / states across calls', () => {
    const pending = new PendingAuthCache();
    try {
      const a = beginXAIAuth({ pending });
      const b = beginXAIAuth({ pending });
      expect(a.state).not.toBe(b.state);
      expect(a.authorizeUrl).not.toBe(b.authorizeUrl);
    } finally {
      pending.stop();
    }
  });

  it('challenge is sha256(verifier) base64url, end-to-end', () => {
    const pending = new PendingAuthCache();
    try {
      const { authorizeUrl, state } = beginXAIAuth({ pending });
      const consumed = pending.consume(state);
      expect(consumed).not.toBeNull();
      const expected = createHash('sha256')
        .update(consumed!.codeVerifier)
        .digest('base64')
        .replace(/\+/g, '-')
        .replace(/\//g, '_')
        .replace(/=+$/g, '');
      const actual = new URL(authorizeUrl).searchParams.get('code_challenge');
      expect(actual).toBe(expected);
    } finally {
      pending.stop();
    }
  });
});

describe('completeXAIAuth', () => {
  it('exchanges code for tokens via the xAI token endpoint', async () => {
    const pending = new PendingAuthCache();
    try {
      const { state } = beginXAIAuth({ pending });
      const fakeFetch = makeFetch(async (url, init) => {
        expect(url).toBe(XAI_OAUTH_TOKEN_ENDPOINT);
        const body = String((init as RequestInit).body ?? '');
        const params = new URLSearchParams(body);
        expect(params.get('grant_type')).toBe('authorization_code');
        expect(params.get('client_id')).toBe(XAI_OAUTH_CLIENT_ID);
        expect(params.get('redirect_uri')).toBe(xaiRedirectUri());
        expect(params.get('code')).toBe('auth-code-123');
        expect(params.get('code_verifier')).toMatch(/^[A-Za-z0-9_-]+$/);
        return new Response(
          JSON.stringify({
            access_token: 'access-abc',
            refresh_token: 'refresh-xyz',
            token_type: 'Bearer',
            expires_in: 3600,
          }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      });
      const tokens = await completeXAIAuth({
        pending,
        state,
        code: 'auth-code-123',
        fetchImpl: fakeFetch,
      });
      expect(tokens.access_token).toBe('access-abc');
      expect(tokens.refresh_token).toBe('refresh-xyz');
      expect(tokens.expires_in).toBe(3600);
    } finally {
      pending.stop();
    }
  });

  it('throws when state is unknown', async () => {
    const pending = new PendingAuthCache();
    try {
      await expect(
        completeXAIAuth({
          pending,
          state: 'never-issued',
          code: 'x',
        }),
      ).rejects.toThrow(/state not found/i);
    } finally {
      pending.stop();
    }
  });

  it('throws when state is replayed (one-shot consume)', async () => {
    const pending = new PendingAuthCache();
    try {
      const { state } = beginXAIAuth({ pending });
      const fakeFetch = makeFetch(
        async () =>
          new Response(
            JSON.stringify({ access_token: 'a', token_type: 'Bearer' }),
            { status: 200, headers: { 'content-type': 'application/json' } },
          ),
      );
      // First consume succeeds.
      await completeXAIAuth({
        pending,
        state,
        code: 'c',
        fetchImpl: fakeFetch,
      });
      // Second consume of the same state must fail.
      await expect(
        completeXAIAuth({
          pending,
          state,
          code: 'c',
          fetchImpl: fakeFetch,
        }),
      ).rejects.toThrow(/state not found/i);
    } finally {
      pending.stop();
    }
  });

  it('rejects state issued for a different serverId', async () => {
    const pending = new PendingAuthCache();
    try {
      // Hand-craft a pending entry as if some other provider had stashed it.
      pending.put('foreign-state', {
        serverId: 'some-other-provider',
        authServerIssuer: 'https://example.test',
        tokenEndpoint: 'https://example.test/token',
        clientId: 'x',
        redirectUri: 'http://localhost/cb',
        codeVerifier: 'v',
        createdAt: Date.now(),
      });
      await expect(
        completeXAIAuth({
          pending,
          state: 'foreign-state',
          code: 'c',
        }),
      ).rejects.toThrow(/serverId/i);
    } finally {
      pending.stop();
    }
  });
});

describe('refreshXAIToken', () => {
  it('refreshes against the fixed xAI token endpoint and client_id', async () => {
    const fakeFetch = makeFetch(async (url, init) => {
      expect(url).toBe(XAI_OAUTH_TOKEN_ENDPOINT);
      const body = String((init as RequestInit).body ?? '');
      const params = new URLSearchParams(body);
      expect(params.get('grant_type')).toBe('refresh_token');
      expect(params.get('refresh_token')).toBe('rt-1');
      expect(params.get('client_id')).toBe(XAI_OAUTH_CLIENT_ID);
      return new Response(
        JSON.stringify({
          access_token: 'new-access',
          refresh_token: 'rt-2',
          token_type: 'Bearer',
          expires_in: 1800,
        }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      );
    });
    const tokens = await refreshXAIToken({
      refreshToken: 'rt-1',
      fetchImpl: fakeFetch,
    });
    expect(tokens.access_token).toBe('new-access');
    expect(tokens.refresh_token).toBe('rt-2');
  });

  it('surfaces token-endpoint errors with the body included', async () => {
    const fakeFetch = makeFetch(
      async () =>
        new Response('{"error":"invalid_grant"}', {
          status: 400,
          headers: { 'content-type': 'application/json' },
        }),
    );
    await expect(
      refreshXAIToken({ refreshToken: 'expired', fetchImpl: fakeFetch }),
    ).rejects.toThrow(/HTTP 400/);
  });
});