mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
1 commit
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
56988e406c
|
feat: integrate xAI SuperGrok subscription as a credential source for Grok media + X search (#2134)
* feat(daemon): add xAI OAuth client with PKCE + token storage

Wraps mcp-oauth.ts PKCE primitives for xAI's auth.x.ai OAuth server. xAI doesn't speak MCP and doesn't expose Dynamic Client Registration, so issuer / endpoints / client_id / scope / loopback :56121 are hardcoded constants.

Adds xai-tokens.ts for persistent storage, mirroring mcp-tokens.ts: atomic write + chmod 0600 + per-dataDir in-memory mutex. Simplified for the single-token case (no per-server-id map).

Reference: NousResearch/hermes-agent hermes_cli/auth.py:93-100. PoC reuses Hermes client_id (b1a00492-...); replace before stable release once Open Design has its own.

Tests: 11 + 20, all green. tsc --noEmit clean. pnpm guard clean.

* feat(daemon): expose xAI Grok models in Hermes runtime fallbackModels

Lists grok-4.3, grok-4.20-reasoning, grok-4.20-non-reasoning, and grok-4.20-multi-agent-0309 as discoverable Hermes fallback models. A user who has not installed Hermes yet now sees these xAI options in the model picker, signalling that `hermes auth add xai-oauth` (SuperGrok subscription) or XAI_API_KEY unlocks Grok in Open Design without OD itself implementing OAuth-for-chat.

`fetchModels` (which calls `hermes acp` to enumerate the user's actually-installed providers) is unchanged; this list only kicks in when probing fails (e.g. Hermes off PATH).

Reference: xAI × Nous Research grok-hermes integration announcement, 2026-05-15. https://x.ai/news/grok-hermes

* feat(media): route Grok Imagine through xAI OAuth credentials

Adds resolveXAIBearer(), a refresh-aware helper on top of the xai-tokens.json store written by the daemon's OAuth client. Returns a fresh access_token, transparently refreshing in place when the stored token enters the 120 s expiry skew window.
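
A minimal sketch of the skew-window check described for resolveXAIBearer(), assuming a token record with an absolute expiry timestamp. The `StoredToken` shape, `needsRefresh`, `resolveBearer`, and the injected `RefreshFn` are hypothetical names for illustration; the real xai-tokens.json schema and helper signatures are not shown in this commit message.

```typescript
// Hypothetical token shape; the actual xai-tokens.json schema is not shown here.
interface StoredToken {
  accessToken: string;
  refreshToken?: string;
  expiresAtMs: number; // absolute expiry, epoch milliseconds
}

// The 120 s expiry skew window named in the commit message.
const EXPIRY_SKEW_MS = 120_000;

// True once the token is inside the skew window (or already expired).
function needsRefresh(
  token: StoredToken,
  nowMs: number,
  skewMs: number = EXPIRY_SKEW_MS,
): boolean {
  return token.expiresAtMs - nowMs <= skewMs;
}

// Assumption: the refresh call is injected so the sketch stays network-free.
type RefreshFn = (refreshToken: string) => Promise<StoredToken>;

// Returns the stored token while it is comfortably valid, otherwise refreshes it.
async function resolveBearer(
  token: StoredToken,
  refresh: RefreshFn,
  nowMs: number = Date.now(),
): Promise<StoredToken> {
  if (!needsRefresh(token, nowMs)) return token;
  if (!token.refreshToken) {
    throw new Error("token is expiring and no refresh_token is stored");
  }
  return refresh(token.refreshToken);
}
```

Refreshing slightly before the real expiry means a request started just inside the window does not arrive at the API with a token that expired in transit.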

Wires resolveXAIBearer() into media-config.ts so the existing Grok provider gets the same OAuth-fallback treatment OD already gives the OpenAI provider: env keys win, then stored Settings keys, then OD-native xAI OAuth, then a borrowed Hermes-side xai-oauth token from ~/.hermes/auth.json. SuperGrok subscribers who already authorized Hermes get OD image / video generation routed through their subscription with zero extra setup.

Updates the "no xAI API key" error in renderGrokImage / renderGrokVideo to point at the new OAuth path so users hitting it know they have a zero-cost option.

Also exposes mediaConfigDir() so credential helpers next to media-config.json (like xai-tokens.json) reuse the same precedence: OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od.

Tests: 7 new xai-credentials cases (refresh on expiry, refresh failure, missing refresh_token, response without refresh_token) + 8 new media-config Grok OAuth fallback cases (OD-native, Hermes borrow, OD vs Hermes precedence, env precedence, stored precedence, unconfigured, expired-without-refresh). All green; tsc / guard clean.

* feat(media): add xAI Grok TTS provider

Registers grok-tts in the speech model catalog and wires up renderXAITTS to dispatch (provider=grok, surface=audio, kind=speech) to https://api.x.ai/v1/tts. xAI exposes a dedicated /tts endpoint that returns raw audio bytes, distinct from OpenAI's /audio/speech JSON shape, so TTS gets its own renderer rather than reusing renderOpenAISpeech.

Credentials route through the same OAuth-aware path as Grok image and video (PR follow-up to media-config.ts), so a SuperGrok subscriber gets TTS for free once they have authorized once.

Default request body matches the documented minimal shape (text / voice_id / language); sample_rate / bit_rate / codec are left unset so the server applies its mp3 / 24 kHz / 128 kbps defaults. Plumbing for explicit overrides is left for a later PR once the agent-facing contract grows the corresponding flags.
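
The minimal request shape described above can be sketched as a pure request builder. This is an illustration under stated assumptions, not the renderXAITTS implementation: `buildTtsRequest`, `TtsOptions`, and the placeholder voice / language defaults are hypothetical, and the `Content-Type` header is an assumption. Only the endpoint URL and the text / voice_id / language field names come from the commit message.

```typescript
// Endpoint taken from the commit message.
const XAI_TTS_URL = "https://api.x.ai/v1/tts";

// Hypothetical option names; the real renderXAITTS signature is not shown.
interface TtsOptions {
  voiceId?: string;
  language?: string;
}

interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Placeholder defaults for illustration only; not the values OD ships.
const PLACEHOLDER_VOICE_ID = "example-voice";
const PLACEHOLDER_LANGUAGE = "en";

// Builds the minimal body: text / voice_id / language and nothing else.
function buildTtsRequest(
  text: string,
  bearer: string,
  opts: TtsOptions = {},
): TtsRequest {
  if (text.trim() === "") throw new Error("text must not be empty");
  const body = {
    text,
    voice_id: opts.voiceId ?? PLACEHOLDER_VOICE_ID,
    language: opts.language ?? PLACEHOLDER_LANGUAGE,
    // sample_rate / bit_rate / codec are omitted so the server applies its defaults.
  };
  return {
    url: XAI_TTS_URL,
    method: "POST",
    headers: {
      Authorization: `Bearer ${bearer}`,
      "Content-Type": "application/json", // assumption: JSON request body
    },
    body: JSON.stringify(body),
  };
}
```

Because the endpoint returns raw audio bytes rather than JSON, a caller passing this request to `fetch` would read the response with `arrayBuffer()` instead of `json()`.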

Tests: 5 cases covering documented body shape, voice / language override, env-key fallback, server-error surfacing, and the no-credentials error. All green; tsc / guard clean.

Reference: https://docs.x.ai/developers/model-capabilities/audio/text-to-speech

* feat(daemon, web): expose xAI OAuth flow in Settings UI

Closes the loop on the Grok integration: a SuperGrok subscriber can now authorize Open Design directly from Settings → Media Providers → Grok, with no API key and no Hermes install. After authorizing, image, video, and TTS routes pick up the bearer through the OAuth fallback chain added in 'route Grok Imagine through xAI OAuth credentials'.

Daemon side

- xai-oauth-server.ts opens a one-shot HTTP listener on 127.0.0.1:56121 to receive the OAuth callback. The redirect URI is hard-locked to that port because the PoC reuses the Hermes-issued client_id. Listener self-closes on first matching callback or after a 30 min timeout.
- xai-routes.ts wires three endpoints onto the daemon's HTTP app:
    POST /api/xai/oauth/start: mint state, open listener, return authorize URL
    GET /api/xai/auth/status: has-token / expiry / in-flight
    POST /api/xai/oauth/disconnect: wipe stored token, stop listener
- server.ts registers xai-routes alongside the existing mcp-routes.

Web side

- XaiOAuthControl.tsx renders a Sign in / Reconnect / Disconnect surface mirroring McpOAuthControl, but polls /api/xai/auth/status exclusively because the :56121 callback page lives in a separate process and can't postMessage back to the OD UI. SettingsDialog embeds it inside the Grok provider row.

Tests: 9 listener cases (bind / state mismatch / replay / favicon / EADDRINUSE / timeout / explicit error param / one-shot consume / early stop) + 8 route cases (start mints PKCE URL, second start replaces in-flight listener, status reports listening + connected, callback ok stores token, callback error skips storage, disconnect wipes, cross-origin guard rejects all three endpoints).
All 17 + the 74 from prior commits pass; tsc / web typecheck / pnpm guard clean.

PoC client_id stays Hermes-issued; user-visible strings are hardcoded English pending an i18n pass before stable.

* fix(daemon, web): xAI OAuth follow-up: paste-back, X search, UX polish

PoC testing surfaced four real-world rough edges in the Sign in flow that were not obvious before getting an actual SuperGrok subscription in front of it. None alter the architecture in 'expose xAI OAuth flow in Settings UI'; they round it off so the path the user actually walks matches the one the design assumed.

1. Layout. XaiOAuthControl was a grid item inside .media-provider-body and got squeezed into the API-key column. Moves it out of the body so the row's flex-column layout gives it the full width, matching what every other Settings provider OAuth surface gets.

2. Paste-back. xAI's `auth.x.ai` page often shows a "cannot connect to your application" fallback that hands the user a code instead of redirecting back to 127.0.0.1:56121, even when the loopback listener is reachable (browser DOES quietly redirect in the background, but the page lies and shows the manual-paste UI anyway). Adds:
   - POST /api/xai/oauth/complete that takes {state, code} and runs completeXAIAuth + setXAIToken + stops the listener.
   - A paste-back input row in XaiOAuthControl that surfaces while the dance is in flight; submitting either via Enter or the button calls /complete and falls through to the same connected state the loopback path lands on.

3. X search. New POST /api/xai/search wraps Grok's native x_search tool through the Responses API, gated on the same OAuth-first credential chain as Grok image / video / TTS. Body accepts query (required), allowed_x_handles, excluded_x_handles, from_date, to_date, model. Returns { answer, citations[], model } parsed from the Responses payload via two newly exported helpers (extractAnswerText, extractUrlCitations).

4. State machine + warning banner.
Three issues collapsed into one:
   - Polling that flipped busy → 'idle' the moment the loopback listener self-closed disabled the paste-back input even though the dance was still recoverable. Removed that branch; awaiting state now only ends on connected=true or explicit cancel.
   - paste-input `disabled` was over-eager (`busy !== 'awaiting' && busy !== 'refreshing'`); now it's only blocked while a submit is in flight (`busy === 'refreshing'`).
   - Added a heads-up banner inside the awaiting region explaining that xAI's "cannot connect" page is a UX bug on their side and the OD panel is the source of truth for sign-in success. The connected message picks up the cue too: "You can close any open xAI browser tabs now."

Tests: +12 cases on top of the existing 17. The complete endpoint covers happy path, blank-field rejection, and unknown-state error. The search endpoint covers blank-query rejection, no-credentials 401, full bearer / x_search-options forwarding with response parsing, and upstream-error pass-through. Two helper functions get four direct parser cases. All 29 in the file pass; 225 across the daemon test suite pass; tsc / web tsc / pnpm guard all clean.

* fix(daemon): satisfy tsconfig.tests.json strictness in xai test files

The CI workspace typecheck step runs tsconfig.tests.json (which extends tsconfig.json's strict + exactOptionalPropertyTypes settings and adds the tests/ directory to the include set), but the local `tsc -p tsconfig.json --noEmit` I ran while iterating only covered src/. That gap let two classes of strict-mode errors slip into the PR's CI:

- `let outcome: CallbackOutcome | null = null` mutated from inside an async callback narrowed to `never` after `outcome?.kind` because TS doesn't track cross-function mutation. Switched the seven sites in xai-oauth-server.test.ts to a `{ current: CallbackOutcome | null }` ref object; TS does narrow .current correctly, so `kind` / `error` field access stops collapsing to `never`.
- `await r.json()` returns `Promise<unknown>` in the lib.dom typings shipped with TS 5.x, so every `body.field` / `status.connected` access in xai-routes.test.ts tripped TS18046. Added a one-line `jsonOf<T = any>` helper at the top of the file and switched all call sites (both `await r.json()` and `.then((r) => r.json())`).
- The cross-origin guard test iterated `for (const [method, path] of [...])`; under noUncheckedIndexedAccess that destructures to `string | undefined`, which RequestInit.method (a `string` under exactOptionalPropertyTypes) won't accept. Hoisted the cases to a typed `ReadonlyArray<readonly [string, string]>` so the elements stay non-optional.

Behaviour is unchanged; vitest still reports 29/29 across these two files. tsc -p tsconfig.tests.json --noEmit now passes locally, matching what CI will run.

* fix(xai-oauth): preserve refresh_token + release :56121 on cancel

Two lifecycle issues Looper flagged on the prior commit:

1. resolveXAIBearer dropped the existing refresh_token whenever the refresh response omitted one. RFC 6749 §6 explicitly allows the server to skip refresh_token rotation and keep the old one valid; xAI's behaviour is currently to rotate, but a future change could silently break OD users. With the old code the first refresh succeeded but persisted a token with no refresh credential, so the next expiry forced the user back through Sign in even though their grant was still good. Carries the previous refresh_token forward when fresh.refresh_token is absent. Updates the matching xai-credentials test to assert the carried-forward value instead of the previous (incorrect) "drop it" assertion.

2. The Cancel button in XaiOAuthControl only cleared React-side pending state; the daemon's one-shot 127.0.0.1:56121 listener kept running for the full 30 min server timeout. /api/xai/auth/status would still report listening=true, and that singleton port could block the next Sign in (or a Hermes session on the same machine).
Adds POST /api/xai/oauth/cancel that calls stopActiveListener() without touching the stored token (Disconnect is the destructive path; this is the narrow "release the port" affordance), wires the UI Cancel handler to fire it, and adds two route tests covering the listener-stopped-but-token-preserved invariant and the no-op behaviour when no listener is in flight.

All 38 xai tests + tsconfig.tests.json typecheck + web typecheck + pnpm guard pass.

* fix(xai-oauth): close two more lifecycle gaps Looper flagged

Both are non-blocking but cheap and right.

1. window.open used 'noopener=no,noreferrer=no' (carried over from the sibling McpOAuthControl), which deliberately KEEPS the auth.x.ai tab's window.opener reference back to the Settings tab. Reverse tabnabbing risk if the auth page or any redirect target along the OAuth chain ever turns hostile, with no upside: the xAI flow doesn't use postMessage, the daemon receives the code through the :56121 listener (or paste-back), so opener access buys nothing. Switched to 'noopener,noreferrer'.

2. PendingAuthCache was constructed with its default 10 min TTL while the loopback listener self-closes at 30 min and the UI shows a pending state for the same 30 min. After 10 min, a user looking at a live paste-back input would hit `xAI OAuth state not found or expired` even though everything visible (and the daemon socket) still claimed the dance was live. Constructed the cache with 30 * 60 * 1000 so the PKCE state, the open :56121 socket, and the paste-back UI all expire together.

The third inline comment (XaiOAuthControl.tsx:248, "Cancel only clears React-side state") was a stale reference: the previous commit |