feat(senseaudio): BYOK chat with image + video generation tools (#2065)

* feat(senseaudio): BYOK chat with image + video generation tools

Adds SenseAudio as a first-class BYOK chat protocol and wires the daemon's chat proxy with a tool loop so BYOK users can generate images and videos without dropping to a CLI agent.

- BYOK protocol: new senseaudio tab + /api/proxy/senseaudio/stream route + connection-test + provider-models discovery (OpenAI-compatible wire)
- Tool loop: generate_image (synchronous /v1/image/sync) and generate_video (async /v1/video/create + 5s polling /v1/video/status, 10-min ceiling, periodic progress log every 30s)
- Settings dropdown + chat-composer dropdown for the BYOK image model default; generate_image's model enum lets the LLM override per call
- Seed-on-success: a successful BYOK chat call idempotently mirrors the key into media-config (preserves env-resolved + already-stored keys)
- Generated artifacts land in <projectsRoot>/<projectId>/ so FileViewer, DesignFilesPanel, and project export pick them up automatically; legacy /api/byok-image/:id route kept for old conversation links
- Markdown renderer learns ![alt](url) image syntax with a scheme allowlist (http(s) / data:image/ / blob: / relative paths)
- i18n key settings.byokImageModel across all 19 locales
- 3 SenseAudio image models registered (2.0, 1.0, doubao-seedream-5.0); 1 video model (doubao-seedance-2.0)
- Tests: byok-tools (29), media-senseaudio-image (8), media-config seed (7), proxy-routes (47), markdown image rendering (8)

* fix(senseaudio): unblock image gen + design file preview switching

- SenseAudio /v1/image/sync rejected the previous size mapping with `参数错误：size` ("parameter error: size"; 1664x936, 936x1664, 1280x960, 960x1280 are not in the gateway's accepted set). Switched to standard HD / SD sizes that every aspect bucket can hit: 1024×1024, 1280×720, 720×1280, 1024×768, 768×1024. Kept the byok-tools and media.ts tables in sync so the BYOK chat tool and the CLI agent path both stop failing on non-square aspects.
- DesignFilesPanel's <DfPreview> was missing a key prop, so React reused the same iframe DOM node when the user picked a different file: the src prop changed but the iframe never navigated. Added key={previewFile.name} so the previous preview unmounts cleanly.
- Updated byok-tools + media-senseaudio-image tests for the new size expectations.

* docs(senseaudio): clear stale provider hint + update README

- Settings → Media → SenseAudio: clear the auto-promoted "Image · TTS · 70+ voices · clone" hint; the provider label alone is enough now that the BYOK chat surface covers image + video tooling.
- README: list the new senseaudio (and missing ollama) proxy routes so the BYOK section reflects what the daemon actually serves, and mention the generate_image / generate_video chat tools that ship with the SenseAudio path.

* fix(senseaudio): address PR #2065 review feedback

Three non-blocking review notes from @PerishCode on PR #2065:

1. Drop the dead /api/byok-image/:id route. The PR description claimed it was "legacy fallback for old chat history" but that storage layout never existed on main, so the route can only ever 400 or 404, never 200. Removed the handler, the isSafeByokImageId export, the unused createReadStream / stat / path / Request / Response imports, and the two byok-image regression tests.
2. Add rejectProxyPluginContext guard to the senseaudio proxy handler so it matches the invariant the other five proxy paths already enforce (plugin runs must go through /api/runs for snapshot pinning). Extended the existing "API fallback rejects plugin runs" describe to also cover /api/proxy/senseaudio/stream with the 409 PLUGIN_REQUIRES_DAEMON expectation.
3. Wrap the secondary image / video downloads (the URLs the SenseAudio gateway hands back in /v1/image/sync .url and /v1/video/status .video_url) in validateBaseUrlResolved so a malicious gateway can't point us at 169.254.169.254 (AWS / Azure metadata) or RFC1918 hosts via the response payload. Also passed `redirect: 'error'` on both fetches to match the SSRF posture the primary proxy fetch already uses. The new assertExternalAssetUrl helper lives next to executeGenerateImage so future tool downloads can reuse it.

Tests: 120/120 daemon tests pass; guard + typecheck green.

* fix(senseaudio): mirror SSRF guard onto renderSenseAudioImage CLI path

Follow-up to 01b1260a: the chat-tool fix in byok-tools.ts wasn't mirrored onto the parallel renderSenseAudioImage path in media.ts. Same attacker-controllable shape (gateway-returned `data.url`), same one-line fix.

- Hoist assertExternalAssetUrl from byok-tools.ts into connectionTest.ts next to validateBaseUrlResolved so both call sites (the BYOK chat tool loop AND the CLI agent media dispatcher) share one helper. Made the error strings provider-agnostic so a future caller doesn't get a misleading "senseaudio" attribution for a Volcengine / Grok / etc. download.
- renderSenseAudioImage now runs the response url through assertExternalAssetUrl before fetching bytes, and passes redirect: 'error' to block a 3xx hop into private space.

Scope intentionally limited to the senseaudio path PerishCode flagged; the other unguarded fetch(entry.url) call sites in media.ts (OpenAI / Volcengine / Grok / Nano-Banana) are pre-existing patterns and belong in a separate follow-up if the daemon wants defense-in-depth across every provider.

Tests: 127/127 daemon tests pass; guard + typecheck green.

---------

Co-authored-by: unknown <mazeliang@sensetime.com>
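
The tool loop this commit message describes (run the model's tool calls, feed each result back as a `role: 'tool'` message, re-issue the completion) could look roughly like the sketch below. It is an illustration under stated assumptions, not the daemon's implementation: `runToolLoop`, the `Message` / `ToolCall` shapes, the injected `complete` and `execute` callbacks, and the `maxRounds` cap are all hypothetical names introduced here.

```typescript
// Hypothetical shapes, loosely modelled on an OpenAI-compatible chat wire.
type ToolCall = { id: string; name: string; arguments: string };
type AssistantMessage = { role: 'assistant'; content: string; tool_calls?: ToolCall[] };
type Message =
  | { role: 'user' | 'system'; content: string }
  | AssistantMessage
  | { role: 'tool'; tool_call_id: string; content: string };
type ToolResult = { ok: boolean; url?: string; error?: string };

// Injected callbacks keep the sketch free of network code:
// `complete` stands in for the upstream completion request,
// `execute` stands in for a tool executor such as generate_image.
type Complete = (messages: Message[]) => Promise<AssistantMessage>;
type Execute = (call: ToolCall) => Promise<ToolResult>;

async function runToolLoop(
  messages: Message[],
  complete: Complete,
  execute: Execute,
  maxRounds = 4, // assumed safety cap so a looping model cannot spin forever
): Promise<Message[]> {
  const history: Message[] = [...messages];
  for (let round = 0; round < maxRounds; round++) {
    const reply = await complete(history);
    history.push(reply);
    const calls = reply.tool_calls ?? [];
    // No tool calls means the model produced its final answer.
    if (calls.length === 0) return history;
    for (const call of calls) {
      const result = await execute(call);
      // Failures are fed back too, so the model can apologise or retry.
      history.push({
        role: 'tool',
        tool_call_id: call.id,
        content: JSON.stringify(result),
      });
    }
  }
  return history;
}
```

Passing failures back as tool messages rather than throwing mirrors the `{ok: false, error}` convention the executors in this commit use.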
2026-06-01 03:14:35 +07:00 · 2026-05-19 23:14:56 +08:00 · 2026-05-19 23:14:56 +08:00 · 210b94069a
commit 210b94069a
parent 431a5e2d79
52 changed files with 3305 additions and 55 deletions
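
The commit message mentions a scheme allowlist for markdown images (http(s), `data:image/`, `blob:`, relative paths). A minimal sketch of such a check follows, assuming a hypothetical helper named `isAllowedImageSrc`; the renderer's real function is not shown in this diff, and rejecting protocol-relative `//host` URLs is an assumption of this sketch rather than something the source states.

```typescript
// Hypothetical helper: decides whether a markdown image URL may be rendered.
function isAllowedImageSrc(src: string): boolean {
  const trimmed = src.trim();
  if (trimmed === '') return false;

  // A leading "name:" is treated as a URL scheme.
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(trimmed);
  if (match === null) {
    // No scheme: a relative path. Protocol-relative URLs are refused
    // here (assumption) because they would load from another host.
    return !trimmed.startsWith('//');
  }

  const scheme = (match[1] ?? '').toLowerCase();
  if (scheme === 'http' || scheme === 'https' || scheme === 'blob') return true;
  // Only image payloads are accepted for data: URLs.
  if (scheme === 'data') return /^data:image\//i.test(trimmed);
  // Everything else (javascript:, file:, ...) is rejected.
  return false;
}
```

A relative URL such as the one `generate_image` returns (`/api/projects/<id>/files/byok-<id>.png`) passes this check, while `javascript:` and non-image `data:` URLs do not.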
--- a/README.md
+++ b/README.md
@@ -63,7 +63,7 @@ OD stands on four open-source shoulders:
 | | What you get |
 |---|---|
 | **Coding-agent CLIs (16)** | Claude Code · Codex CLI · Devin for Terminal · Cursor Agent · Gemini CLI · OpenCode · Qwen Code · Qoder CLI · GitHub Copilot CLI · Hermes (ACP) · Kimi CLI (ACP) · Pi (RPC) · Kiro CLI (ACP) · Kilo (ACP) · Mistral Vibe CLI (ACP) · DeepSeek TUI — auto-detected on `PATH`, swap with one click |
-| **BYOK fallback** | Protocol-specific API proxy at `/api/proxy/{anthropic,openai,azure,google}/stream` — paste `baseUrl` + `apiKey` + `model`, choose Anthropic / OpenAI / Azure OpenAI / Google Gemini, and the daemon normalizes SSE back to the same chat stream. Internal-IP/SSRF blocked at the daemon edge. |
+| **BYOK fallback** | Protocol-specific API proxy at `/api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` — paste `baseUrl` + `apiKey` + `model`, choose Anthropic / OpenAI / Azure OpenAI / Google Gemini / Ollama Cloud / SenseAudio, and the daemon normalizes SSE back to the same chat stream. SenseAudio chat additionally exposes `generate_image` and `generate_video` tools so the model can write rendered artifacts straight into the active project's folder. Internal-IP/SSRF blocked at the daemon edge. |
 | **Design systems built-in** | **129** — 2 hand-authored starters + 70 product systems (Linear, Stripe, Vercel, Airbnb, Tesla, Notion, Anthropic, Apple, Cursor, Supabase, Figma, Xiaohongshu, …) from [`awesome-design-md`][acd2], plus 57 design skills from [`awesome-design-skills`][ads] added directly under `design-systems/` |
 | **Skills built-in** | **31** — 27 in `prototype` mode (web-prototype, saas-landing, dashboard, mobile-app, gamified-app, social-carousel, magazine-poster, dating-web, sprite-animation, motion-frames, critique, tweaks, wireframe-sketch, pm-spec, eng-runbook, finance-report, hr-onboarding, invoice, kanban-board, team-okrs, …) + 4 in `deck` mode (`guizang-ppt` · `simple-deck` · `replit-deck` · `weekly-update`). Grouped in the picker by `scenario`: design / marketing / operation / engineering / product / finance / hr / sale / personal. |
 | **Media generation** | Image · video · audio surfaces ship alongside the design loop. **gpt-image-2** (Azure / OpenAI) for posters, avatars, infographics, illustrated maps · **Seedance 2.0** (ByteDance) for cinematic 15s text-to-video and image-to-video · **HyperFrames** ([heygen-com/hyperframes](https://github.com/heygen-com/hyperframes)) for HTML→MP4 motion graphics (product reveals, kinetic typography, data charts, social overlays, logo outros). **93** ready-to-replicate prompts gallery — 43 gpt-image-2 + 39 Seedance + 11 HyperFrames — under [`prompt-templates/`](prompt-templates/), with preview thumbnails and source attribution. Same chat surface as code; outputs a real `.mp4` / `.png` chip into the project workspace. |
@@ -304,7 +304,7 @@ Every layer is composable. Every layer is a file you can edit. Read [`apps/daemo
 | Frontend | Next.js 16 App Router + React 18 + TypeScript, Vercel-deployable |
 | Daemon | Node 24 · Express · SSE streaming · `better-sqlite3`; tables: `projects` · `conversations` · `messages` · `tabs` · `templates` |
 | Agent transport | `child_process.spawn`; typed-event parsers for `claude-stream-json` (Claude Code), `qoder-stream-json` (Qoder CLI), `copilot-stream-json` (Copilot), `json-event-stream` per-CLI parsers (Codex / Gemini / OpenCode / Cursor Agent), `acp-json-rpc` (Devin / Hermes / Kimi / Kiro / Kilo / Mistral Vibe via Agent Client Protocol), `pi-rpc` (Pi via stdio JSON-RPC), `plain` (Qwen Code / DeepSeek TUI) |
-| BYOK proxy | `POST /api/proxy/{anthropic,openai,azure,google}/stream` → provider-specific upstream APIs, normalized `delta/end/error` SSE; allows loopback local LLM providers, rejects non-loopback private/link-local/CGNAT/multicast/reserved hosts, and disables upstream redirects at the daemon edge |
+| BYOK proxy | `POST /api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` → provider-specific upstream APIs, normalized `delta/end/error` SSE; allows loopback local LLM providers, rejects non-loopback private/link-local/CGNAT/multicast/reserved hosts, and disables upstream redirects at the daemon edge |
 | Storage | Plain files in `.od/projects/<id>/` + SQLite at `.od/app.sqlite` + credentials at `.od/media-config.json` (gitignored, auto-created). `OD_DATA_DIR=<dir>` relocates all daemon data (used for test isolation and read-only-install setups); `OD_MEDIA_CONFIG_DIR=<dir>` further narrows the override to just `media-config.json` for setups that want to keep API keys outside the data dir |
 | Preview | Sandboxed iframe via `srcdoc` + per-skill `<artifact>` parser ([`apps/web/src/artifacts/parser.ts`](apps/web/src/artifacts/parser.ts)) |
 | Export | HTML (inline assets) · PDF (browser print, deck-aware) · PPTX (agent-driven via skill) · ZIP (archiver) · Markdown |
@@ -872,7 +872,7 @@ Pattern is the same as the rest: pick a template, edit the brief, send. The agen
 The chat / artifact loop gets the spotlight, but a handful of less-visible capabilities are already wired and worth knowing before you compare OD to anything else:
 - **Claude Design ZIP import.** Drop an export from claude.ai onto the welcome dialog. `POST /api/import/claude-design` extracts it into a real `.od/projects/<id>/`, opens the entry file as a tab, and stages a continue-where-Anthropic-left-off prompt for your local agent. No re-prompting, no "ask the model to re-create what we just had". ([`apps/daemon/src/server.ts`](apps/daemon/src/server.ts) — `/api/import/claude-design`)
- **Multi-provider BYOK proxy.** `POST /api/proxy/{anthropic,openai,azure,google}/stream` takes `{ baseUrl, apiKey, model, messages }`, builds the provider-specific upstream request, normalizes SSE chunks into `delta/end/error`, and allows loopback local LLM providers while rejecting non-loopback private, link-local, CGNAT, multicast, reserved, and redirect targets to head off SSRF. OpenAI-compatible covers OpenAI, Azure AI Foundry `/openai/v1`, DeepSeek, Groq, MiMo, OpenRouter, Ollama, LM Studio, and self-hosted vLLM; Azure OpenAI adds deployment URL + `api-version`; Google uses Gemini `:streamGenerateContent`.
+- **Multi-provider BYOK proxy.** `POST /api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` takes `{ baseUrl, apiKey, model, messages }`, builds the provider-specific upstream request, normalizes SSE chunks into `delta/end/error`, and allows loopback local LLM providers while rejecting non-loopback private, link-local, CGNAT, multicast, reserved, and redirect targets to head off SSRF. OpenAI-compatible covers OpenAI, Azure AI Foundry `/openai/v1`, DeepSeek, Groq, MiMo, OpenRouter, Ollama, LM Studio, and self-hosted vLLM; Azure OpenAI adds deployment URL + `api-version`; Google uses Gemini `:streamGenerateContent`.
 - **User-saved templates.** Once you like a render, `POST /api/templates` snapshots the HTML + metadata into the SQLite `templates` table. The next project picks it from a "your templates" row in the picker — same surface as the shipped 31, but yours.
 - **Tab persistence.** Every project remembers its open files and active tab in the `tabs` table. Reopen the project tomorrow and the workspace looks exactly the way you left it.
 - **Artifact lint API.** `POST /api/artifacts/lint` runs structural checks on a generated artifact (broken `<artifact>` framing, missing required side files, stale palette tokens) and returns findings the agent can read back into its next turn. The five-dim self-critique uses this to ground its score in real evidence, not vibes.
@@ -974,7 +974,7 @@ Long-form provenance write-up — what we take from each, what we deliberately d
 - [x] Web app + chat + question form + 5-direction picker + todo progress + sandboxed preview
 - [x] 31 skills + 72 design systems + 5 visual directions + 5 device frames
 - [x] SQLite-backed projects · conversations · messages · tabs · templates
- [x] Multi-provider BYOK proxy (`/api/proxy/{anthropic,openai,azure,google}/stream`) with SSRF guard
+- [x] Multi-provider BYOK proxy (`/api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream`) with SSRF guard
 - [x] Claude Design ZIP import (`/api/import/claude-design`)
 - [x] Sidecar protocol + Electron desktop with IPC automation (STATUS / EVAL / SCREENSHOT / CONSOLE / CLICK / SHUTDOWN)
 - [x] Artifact lint API + 5-dim self-critique pre-emit gate
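
The README rows above describe the proxy's SSRF posture: loopback is allowed for local LLM providers, while private, link-local, CGNAT, multicast, and reserved hosts are rejected. The sketch below illustrates that classification for IPv4 literals only. `classifyIPv4` and `isBlockedIPv4Literal` are hypothetical names, the range list is not exhaustive, and a real guard would also need to resolve hostnames before deciding, which this sketch does not attempt.

```typescript
type HostClass =
  | 'loopback' | 'private' | 'link-local' | 'cgnat'
  | 'multicast' | 'reserved' | 'public' | 'not-ipv4';

// Hypothetical helper: classify a dotted-quad IPv4 literal by address range.
function classifyIPv4(host: string): HostClass {
  const parts = host.split('.');
  if (parts.length !== 4) return 'not-ipv4';
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return 'not-ipv4';
  const [a, b] = octets as [number, number, number, number];

  if (a === 127) return 'loopback';                      // 127.0.0.0/8
  if (a === 10) return 'private';                        // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return 'private'; // 172.16.0.0/12
  if (a === 192 && b === 168) return 'private';          // 192.168.0.0/16
  if (a === 169 && b === 254) return 'link-local';       // 169.254.0.0/16
  if (a === 100 && b >= 64 && b <= 127) return 'cgnat';  // 100.64.0.0/10
  if (a >= 224 && a <= 239) return 'multicast';          // 224.0.0.0/4
  if (a === 0 || a >= 240) return 'reserved';            // 0.0.0.0/8, 240.0.0.0/4
  return 'public';
}

// Mirrors the README wording: loopback stays allowed, other
// non-public literal ranges are blocked. Hostnames return false
// here only because this sketch performs no DNS resolution.
function isBlockedIPv4Literal(host: string): boolean {
  const cls = classifyIPv4(host);
  return cls !== 'public' && cls !== 'loopback' && cls !== 'not-ipv4';
}
```

The cloud metadata address 169.254.169.254 called out in the commit message falls in the link-local range and is blocked by this check.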
--- a/apps/daemon/src/byok-tools.ts
+++ b/apps/daemon/src/byok-tools.ts
@@ -0,0 +1,598 @@
 // Tool definitions and executors exposed to BYOK chat sessions.
 //
 // Why this file exists: the BYOK chat proxy (e.g. /api/proxy/senseaudio/stream)
 // is a thin pass-through that doesn't have the agent-runtime scaffolding the
 // CLI agents (Claude Code / Codex / ...) carry. To let users ask their BYOK
 // chat to "draw me a cat" and get an actual rendered PNG back, the daemon
 // injects an OpenAI-shaped `tools` definition into the upstream completion
 // request, then loops on the model's tool_calls: execute → feed the result
 // back as a `role: 'tool'` message → re-issue the completion. The chat surface
 // stays the same; the tool dispatch happens entirely daemon-side.
 //
 // Today we ship two tools: `generate_image`, backed by SenseAudio's
 // /v1/image/sync endpoint, and `generate_video`, backed by the async
 // /v1/video/create + /v1/video/status pair. Both work because the BYOK
 // chat session already authenticates against SenseAudio with the same
 // API key. Additional tools (TTS, research) can be added here as the
 // BYOK surface expands.
 import path from 'node:path';
 import { writeFile } from 'node:fs/promises';
 import { randomBytes } from 'node:crypto';
 import { assertExternalAssetUrl } from './connectionTest.js';
 import { resolveProviderConfig } from './media-config.js';
 import { IMAGE_MODELS } from './media-models.js';
 import { ensureProject } from './projects.js';
 // SenseAudio image model allowlist — derived from the shared media-models
 // registry so adding a new SenseAudio image model in one place (media-models)
 // auto-extends the BYOK tool param enum, the Settings dropdown, and the
 // daemon-side validation. No drift, no hand-maintained constant.
 export const BYOK_SENSEAUDIO_IMAGE_MODELS: readonly string[] = IMAGE_MODELS
  .filter((m) => m.provider === 'senseaudio')
  .map((m) => m.id);
 // Default falls back to the first entry from the registry (today
 // `senseaudio-image-2.0-260319` — the multi-aspect latest). Kept as a
 // computed constant so re-ordering the registry rotates the default
 // without code edits here.
 export const BYOK_SENSEAUDIO_DEFAULT_IMAGE_MODEL =
  BYOK_SENSEAUDIO_IMAGE_MODELS[0] ?? 'senseaudio-image-2.0-260319';
 export function isSenseAudioImageModel(value: unknown): value is string {
  return typeof value === 'string' && BYOK_SENSEAUDIO_IMAGE_MODELS.includes(value);
 }
 const SENSEAUDIO_DEFAULT_BASE_URL = 'https://api.senseaudio.cn';
 const PROMPT_MAX_LENGTH = 2000;
 // SenseAudio video — the API only documents one model today, so the
 // wire id is a const. The chat tool's `generate_video` param surface
 // (prompt, aspect_ratio, duration, resolution, generate_audio) covers
 // every knob the doubao-seedance gateway accepts.
 const SENSEAUDIO_VIDEO_MODEL = 'doubao-seedance-2-0-260128';
 const SENSEAUDIO_VIDEO_ASPECT_RATIOS = ['16:9', '9:16', '4:3', '3:4', '1:1'] as const;
 const SENSEAUDIO_VIDEO_RESOLUTIONS = ['480p', '720p', '1080p'] as const;
 const SENSEAUDIO_VIDEO_DURATION_MIN = 4;
 const SENSEAUDIO_VIDEO_DURATION_MAX = 15;
 const SENSEAUDIO_VIDEO_DURATION_DEFAULT = 5;
 // Polling: SenseAudio docs recommend 5–10 s intervals; we pick 5 s and
 // cap total attempts so a stuck job can't pin the chat stream forever.
 // 120 attempts × 5 s = 10 min ceiling — covers the real-world
 // doubao-seedance latency range (1080p + audio jobs frequently spend
 // 3–8 min on the gateway). The earlier 5-min cap timed out otherwise
 // valid jobs; a ceiling above 10 min makes the chat surface feel stuck.
 const SENSEAUDIO_VIDEO_POLL_INTERVAL_MS_DEFAULT = 5000;
 const SENSEAUDIO_VIDEO_MAX_POLLS = 120;
 // Periodic progress log every N polls so a long-running job emits some
 // signal to the daemon log — without flooding it with one line per
 // 5 s. 6 polls = ~30 s between progress lines.
 const SENSEAUDIO_VIDEO_PROGRESS_LOG_EVERY = 6;
 // SenseAudio's image gateway rejects non-standard pixel sizes with a 400
 // `参数错误：size` ("parameter error: size"; verified against logs from a
 // failed call on 2026-05-16). We stick to common 16-multiple HD / SD
 // sizes that the gateway is known to accept: 1024×1024 for square,
 // 1280×720 / 720×1280 for widescreen / portrait, and 1024×768 / 768×1024
 // for the 4:3 family.
 // The table is duplicated in renderSenseAudioImage (media.ts) for the
 // CLI-agent path so both surfaces stay in sync.
 const ASPECT_TO_SIZE: Record<string, string> = {
  '1:1': '1024x1024',
  '16:9': '1280x720',
  '9:16': '720x1280',
  '4:3': '1024x768',
  '3:4': '768x1024',
 };
 /**
 * OpenAI-compatible tool definitions for image and video generation.
 * Injected into the upstream `tools` array on every
 * /api/proxy/senseaudio/stream request so the LLM can decide on its own
 * when to call them. Each description deliberately tells the model how
 * to embed the returned URL in markdown (an inline image for
 * generate_image, a link for generate_video); the chat UI already
 * renders markdown, so no client-side wiring is required for the
 * result to show up.
 */
 export const BYOK_SENSEAUDIO_TOOLS = [
  {
    type: 'function' as const,
    function: {
      name: 'generate_image',
      description:
        'Generate an image from a text prompt using SenseAudio image models. Returns a URL pointing to the rendered PNG. After this tool succeeds, embed the URL in your reply with markdown image syntax — ![alt](url) — so the user sees the image inline. Use this whenever the user asks to draw, create, generate, design, or illustrate something visual.',
      parameters: {
        type: 'object',
        properties: {
          prompt: {
            type: 'string',
            description:
              'Detailed visual description of the image (Chinese or English are both fine). Include subject, style, lighting, composition. Maximum 2000 characters.',
          },
          aspect_ratio: {
            type: 'string',
            enum: ['1:1', '16:9', '9:16', '4:3', '3:4'],
            description:
              'Output aspect ratio. 1:1 for square avatars and product shots, 16:9 for hero banners, 9:16 for vertical phone posters, 4:3 for editorial covers, 3:4 for posters. Defaults to 1:1 when omitted.',
          },
          model: {
            type: 'string',
            enum: [...BYOK_SENSEAUDIO_IMAGE_MODELS],
            description:
              'Optional model override. Omit this to use the user-configured default from Settings (or the SenseAudio 2.0 multi-aspect model when unset). Choose senseaudio-image-2.0-260319 for multi-aspect generation, senseaudio-image-1.0-260319 for standard sizes, or doubao-seedream-5-0-260128 for high-resolution output through the ByteDance Seedream gateway. The user explicitly picked a default in their Settings — only override when the user asks for a different style/resolution.',
          },
        },
        required: ['prompt'],
      },
    },
  },
  {
    type: 'function' as const,
    function: {
      name: 'generate_video',
      description:
        'Generate a short video (4–15 seconds) from a text prompt using SenseAudio\'s ByteDance Seedance gateway. This is an asynchronous call that can take 30 s to a few minutes — the daemon polls the job for you, so the user just sees the chat waiting. After this tool succeeds, embed the returned URL in your reply as a markdown link, e.g. `[▶ Play video](url)`, because the chat\'s markdown renderer does not currently render `<video>` tags inline. Use this whenever the user asks for a video, clip, animation, or motion graphic.',
      parameters: {
        type: 'object',
        properties: {
          prompt: {
            type: 'string',
            description:
              'Detailed motion description of the video. Include subject, action / camera move / scene transitions, style, lighting. Chinese or English. Maximum 2000 characters.',
          },
          aspect_ratio: {
            type: 'string',
            enum: [...SENSEAUDIO_VIDEO_ASPECT_RATIOS],
            description:
              'Output aspect ratio. 16:9 for cinematic, 9:16 for vertical (phone / TikTok), 1:1 for social square, 4:3 / 3:4 for editorial. Defaults to 16:9.',
          },
          duration: {
            type: 'integer',
            minimum: SENSEAUDIO_VIDEO_DURATION_MIN,
            maximum: SENSEAUDIO_VIDEO_DURATION_MAX,
            description:
              `Video length in seconds (integer). Allowed range ${SENSEAUDIO_VIDEO_DURATION_MIN}–${SENSEAUDIO_VIDEO_DURATION_MAX}; defaults to ${SENSEAUDIO_VIDEO_DURATION_DEFAULT}. Shorter durations finish faster.`,
          },
          resolution: {
            type: 'string',
            enum: [...SENSEAUDIO_VIDEO_RESOLUTIONS],
            description:
              'Output resolution. 480p (fastest), 720p (default, balanced), 1080p (best quality, slowest). Pick 1080p only when the user explicitly asks for high resolution.',
          },
          generate_audio: {
            type: 'boolean',
            description:
              'Whether the model also synthesises an audio track for the clip (background sound, ambience). Defaults to false to keep generation fast; flip to true when the user asks for sound, music, or a "video with audio".',
          },
        },
        required: ['prompt'],
      },
    },
  },
 ];
 /**
 * Runtime context the BYOK tool executor needs. Passed by the chat
 * route on every call so the tool layer stays free of global state and
 * can be unit-tested with a temp directory.
 */
 export interface BYOKToolContext {
  /** Daemon project root — used to look up media-config when the chat
   *  session key is missing. */
  projectRoot: string;
  /** Daemon's PROJECTS_DIR (the `<projectRoot>/.od/projects/` folder
   *  that holds per-project file trees). Generated images land in
   *  `<projectsRoot>/<projectId>/byok-<id>.png` so the project's
   *  FileViewer / DesignFilesPanel discover them automatically and
   *  the file travels with the project on export, archive, rename. */
  projectsRoot: string;
  /** Active project id from the chat surface. Required — the BYOK
   *  chat always runs inside a project, so the tool dispatch refuses
   *  to fire without one rather than dump bytes into a global cache.
   *  Validated upstream via `isSafeId`. */
  projectId: string;
  /** The BYOK chat session's API key — first credential we try. Bypasses
   *  the media-config indirection so the same key the user just pasted
   *  for chat is the same key the image call uses. */
  upstreamApiKey: string;
  /** The BYOK chat session's base URL (may be a custom gateway). Falls
   *  back to api.senseaudio.cn. */
  upstreamBaseUrl?: string;
  /** Default image model the user picked in BYOK Settings, used when the
   *  LLM didn't pass `model` in tool args. Validated upstream — anything
   *  outside `BYOK_SENSEAUDIO_IMAGE_MODELS` is dropped so a stale
   *  client-side config can't smuggle an unregistered model id through.
   *  Falls back to `BYOK_SENSEAUDIO_DEFAULT_IMAGE_MODEL` (the registry's
   *  first SenseAudio image entry) when missing. */
  defaultImageModel?: string;
  /** Test-only override for the video polling interval (ms). Production
   *  uses 5 s (SenseAudio's recommendation) — tests pass small values
   *  (e.g. 1 ms) to keep the suite fast without changing the polling
   *  semantics. */
  videoPollIntervalMs?: number;
 }
 export interface ImageToolResult {
  ok: boolean;
  /** Daemon-served URL on success. */
  url?: string;
  /** Short human-readable failure reason. Stuffed into the `tool` role
   *  reply so the LLM can apologize / retry. */
  error?: string;
 }
 function sanitizeAspectRatio(raw: unknown): string {
  if (typeof raw !== 'string') return '1:1';
  return ASPECT_TO_SIZE[raw] ? raw : '1:1';
 }
 /**
 * Execute the `generate_image` tool. Calls SenseAudio /v1/image/sync,
 * downloads the rendered bytes, writes them to
 * <projectsRoot>/<projectId>/byok-<id>.png, and returns a daemon-served
 * URL. Pure async: the caller is responsible for emitting any SSE
 * events (e.g. "tool result ready").
 *
 * Failure modes return `{ok: false, error}` rather than throwing so the
 * caller can feed the message back to the LLM as a tool_result; that
 * lets the model apologize / suggest a retry instead of the chat
 * silently stopping.
 */
 export async function executeGenerateImage(
  args: { prompt?: unknown; aspect_ratio?: unknown; model?: unknown },
  ctx: BYOKToolContext,
 ): Promise<ImageToolResult> {
  const promptRaw = typeof args.prompt === 'string' ? args.prompt.trim() : '';
  if (!promptRaw) return { ok: false, error: 'prompt is required' };
  const prompt =
    promptRaw.length > PROMPT_MAX_LENGTH
      ? promptRaw.slice(0, PROMPT_MAX_LENGTH)
      : promptRaw;
  const aspect = sanitizeAspectRatio(args.aspect_ratio);
  const size = ASPECT_TO_SIZE[aspect];
  // Model resolution order — LLM args > user's Settings default > registry
  // default. The allowlist guards every step so a hallucinated or stale id
  // can never reach the senseaudio /v1/image/sync wire — the catalogue is
  // the source of truth.
  const senseAudioImageModel = isSenseAudioImageModel(args.model)
    ? args.model
    : isSenseAudioImageModel(ctx.defaultImageModel)
      ? ctx.defaultImageModel
      : BYOK_SENSEAUDIO_DEFAULT_IMAGE_MODEL;
  // Resolve the project folder up front. ensureProject runs
  // `isSafeId` internally, so an attacker who somehow bypassed the
  // chat-routes guard and slipped `../escape` into projectId fails
  // here before we make any upstream call. The returned `dir` is
  // reused at writeFile time below.
  let dir: string;
  try {
    dir = await ensureProject(ctx.projectsRoot, ctx.projectId);
  } catch (err) {
    return {
      ok: false,
      error: `invalid projectId for image storage: ${err instanceof Error ? err.message : String(err)}`,
    };
  }
  // Prefer the BYOK session's key (what the user is actively using).
  // Fall back to media-config (env var > stored) so a user who set
  // OD_SENSEAUDIO_API_KEY but forgot to fill the chat panel still
  // gets a working tool call.
  let apiKey = ctx.upstreamApiKey;
  let baseUrl = ctx.upstreamBaseUrl || SENSEAUDIO_DEFAULT_BASE_URL;
  if (!apiKey) {
    const resolved = await resolveProviderConfig(ctx.projectRoot, 'senseaudio');
    apiKey = resolved.apiKey || '';
    if (resolved.baseUrl) baseUrl = resolved.baseUrl;
  }
  if (!apiKey) {
    return { ok: false, error: 'no SenseAudio API key available' };
  }
  const trimmedBase = baseUrl.replace(/\/+$/, '');
  let imageUrl: string;
  try {
    const resp = await fetch(`${trimmedBase}/v1/image/sync`, {
      method: 'POST',
      headers: {
        authorization: `Bearer ${apiKey}`,
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: senseAudioImageModel,
        prompt,
        size,
      }),
    });
    if (!resp.ok) {
      const text = await resp.text().catch(() => '');
      return {
        ok: false,
        error: `senseaudio image ${resp.status}: ${text.slice(0, 240)}`,
      };
    }
    const data = (await resp.json()) as {
      url?: string;
      error_message?: string;
      base_resp?: { status_code?: number; status_msg?: string };
    };
    if (data?.base_resp && data.base_resp.status_code !== 0) {
      return {
        ok: false,
        error: `senseaudio image api error ${data.base_resp.status_code}: ${data.base_resp.status_msg || 'unknown'}`,
      };
    }
    if (typeof data?.error_message === 'string' && data.error_message) {
      return { ok: false, error: `senseaudio image: ${data.error_message}` };
    }
    if (typeof data?.url !== 'string' || !data.url) {
      return { ok: false, error: 'senseaudio image response missing url' };
    }
    imageUrl = data.url;
  } catch (err) {
    return {
      ok: false,
      error: err instanceof Error ? err.message : String(err),
    };
  }
  const imageUrlCheck = await assertExternalAssetUrl(imageUrl);
  if (!imageUrlCheck.ok) return { ok: false, error: imageUrlCheck.error };
  let bytes: Buffer;
  try {
    const imgResp = await fetch(imageUrl, { redirect: 'error' });
    if (!imgResp.ok) {
      return { ok: false, error: `image download ${imgResp.status}` };
    }
    bytes = Buffer.from(await imgResp.arrayBuffer());
  } catch (err) {
    return {
      ok: false,
      error: `image download failed: ${err instanceof Error ? err.message : String(err)}`,
    };
  }
  if (bytes.length === 0) {
    return { ok: false, error: 'image download returned zero bytes' };
  }
  // Persist into the active project's folder. `dir` was resolved up
  // front via ensureProject — no DB write, no metadata side-effects —
  // and the resulting path slots straight into the existing project
  // file plumbing: listFiles enumerates it for the FileViewer,
  // readProjectFile serves it via GET /api/projects/<id>/files/<filename>,
  // and project archive / export pick it up automatically because it
  // lives under the project's own directory.
  //
  // Filename pattern `byok-<timestamp>-<random>.png` keeps tool
  // outputs distinguishable from user uploads at a glance while
  // staying url-safe.
  const id = `${Date.now().toString(36)}-${randomBytes(4).toString('hex')}`;
  const filename = `byok-${id}.png`;
  await writeFile(path.join(dir, filename), bytes);
  // Return a relative URL through the project file serving route. The
  // web's Next.js rewrites `/api/:path*` to the daemon (see
  // apps/web/next.config.ts), so the chat UI loads the image
  // same-origin — satisfying the strict CSP (`img-src 'self' data:
  // blob:`) without any CORS plumbing.
  return {
    ok: true,
    url: `/api/projects/${encodeURIComponent(ctx.projectId)}/files/${filename}`,
  };
 }
 function sanitizeVideoAspectRatio(raw: unknown): (typeof SENSEAUDIO_VIDEO_ASPECT_RATIOS)[number] {
  if (typeof raw !== 'string') return '16:9';
  return (SENSEAUDIO_VIDEO_ASPECT_RATIOS as readonly string[]).includes(raw)
    ? (raw as (typeof SENSEAUDIO_VIDEO_ASPECT_RATIOS)[number])
    : '16:9';
 }
 function sanitizeVideoResolution(raw: unknown): (typeof SENSEAUDIO_VIDEO_RESOLUTIONS)[number] {
  if (typeof raw !== 'string') return '720p';
  return (SENSEAUDIO_VIDEO_RESOLUTIONS as readonly string[]).includes(raw)
    ? (raw as (typeof SENSEAUDIO_VIDEO_RESOLUTIONS)[number])
    : '720p';
 }
 function sanitizeVideoDuration(raw: unknown): number {
  if (typeof raw !== 'number' || !Number.isFinite(raw)) return SENSEAUDIO_VIDEO_DURATION_DEFAULT;
  const rounded = Math.round(raw);
  if (rounded < SENSEAUDIO_VIDEO_DURATION_MIN) return SENSEAUDIO_VIDEO_DURATION_MIN;
  if (rounded > SENSEAUDIO_VIDEO_DURATION_MAX) return SENSEAUDIO_VIDEO_DURATION_MAX;
  return rounded;
 }
 const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));
 /**
 * Execute the `generate_video` tool. SenseAudio's video API is
 * asynchronous-only: POST /v1/video/create returns a task_id, then
 * GET /v1/video/status?id=<task_id> reports `pending` / `processing`
 * → `completed` (with `video_url`) or `failed` (with `error_message`).
 * We poll every `videoPollIntervalMs` (default 5 s) and bail after
 * `SENSEAUDIO_VIDEO_MAX_POLLS` so a stuck upstream can't pin the
 * chat stream forever.
 *
 * The chat tool waits for the whole loop, so the daemon's outbound
 * SSE response from /api/proxy/senseaudio/stream stays open for the
 * duration. That's intentional — the next chat turn cannot begin
 * until we have a URL to feed back into the tool_result.
 */
 export async function executeGenerateVideo(
  args: {
    prompt?: unknown;
    aspect_ratio?: unknown;
    duration?: unknown;
    resolution?: unknown;
    generate_audio?: unknown;
  },
  ctx: BYOKToolContext,
 ): Promise<ImageToolResult> {
  const promptRaw = typeof args.prompt === 'string' ? args.prompt.trim() : '';
  if (!promptRaw) return { ok: false, error: 'prompt is required' };
  const prompt =
    promptRaw.length > PROMPT_MAX_LENGTH
      ? promptRaw.slice(0, PROMPT_MAX_LENGTH)
      : promptRaw;
  const ratio = sanitizeVideoAspectRatio(args.aspect_ratio);
  const resolution = sanitizeVideoResolution(args.resolution);
  const duration = sanitizeVideoDuration(args.duration);
  const generateAudio = args.generate_audio === true;
  let dir: string;
  try {
    dir = await ensureProject(ctx.projectsRoot, ctx.projectId);
  } catch (err) {
    return {
      ok: false,
      error: `invalid projectId for video storage: ${err instanceof Error ? err.message : String(err)}`,
    };
  }
  let apiKey = ctx.upstreamApiKey;
  let baseUrl = ctx.upstreamBaseUrl || SENSEAUDIO_DEFAULT_BASE_URL;
  if (!apiKey) {
    const resolved = await resolveProviderConfig(ctx.projectRoot, 'senseaudio');
    apiKey = resolved.apiKey || '';
    if (resolved.baseUrl) baseUrl = resolved.baseUrl;
  }
  if (!apiKey) {
    return { ok: false, error: 'no SenseAudio API key available' };
  }
  const trimmedBase = baseUrl.replace(/\/+$/, '');
  // Step 1: POST /v1/video/create → task_id.
  let taskId: string;
  try {
    const resp = await fetch(`${trimmedBase}/v1/video/create`, {
      method: 'POST',
      headers: {
        authorization: `Bearer ${apiKey}`,
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: SENSEAUDIO_VIDEO_MODEL,
        content: [{ type: 'text', text: prompt }],
        duration,
        resolution,
        ratio,
        provider_specific: { generate_audio: generateAudio },
      }),
    });
    if (!resp.ok) {
      const text = await resp.text().catch(() => '');
      return {
        ok: false,
        error: `senseaudio video create ${resp.status}: ${text.slice(0, 240)}`,
      };
    }
    const data = (await resp.json()) as { task_id?: string };
    if (typeof data?.task_id !== 'string' || !data.task_id) {
      return { ok: false, error: 'senseaudio video create response missing task_id' };
    }
    taskId = data.task_id;
  } catch (err) {
    return {
      ok: false,
      error: err instanceof Error ? err.message : String(err),
    };
  }
  // Step 2: poll /v1/video/status until completed / failed / timeout.
  const pollIntervalMs = ctx.videoPollIntervalMs ?? SENSEAUDIO_VIDEO_POLL_INTERVAL_MS_DEFAULT;
  let videoUrl = '';
  for (let attempt = 0; attempt < SENSEAUDIO_VIDEO_MAX_POLLS; attempt++) {
    await sleep(pollIntervalMs);
    let statusResp: Response;
    try {
      statusResp = await fetch(
        `${trimmedBase}/v1/video/status?id=${encodeURIComponent(taskId)}`,
        {
          method: 'GET',
          headers: { authorization: `Bearer ${apiKey}` },
        },
      );
    } catch (err) {
      return {
        ok: false,
        error: `senseaudio video poll failed: ${err instanceof Error ? err.message : String(err)}`,
      };
    }
    if (!statusResp.ok) {
      const text = await statusResp.text().catch(() => '');
      return {
        ok: false,
        error: `senseaudio video status ${statusResp.status}: ${text.slice(0, 240)}`,
      };
    }
    const data = (await statusResp.json()) as {
      status?: string;
      progress?: number;
      video_url?: string;
      error_message?: string;
    };
    if (data?.status === 'completed') {
      if (typeof data.video_url !== 'string' || !data.video_url) {
        return { ok: false, error: 'senseaudio video status completed but missing video_url' };
      }
      videoUrl = data.video_url;
      break;
    }
    if (data?.status === 'failed') {
      return {
        ok: false,
        error: `senseaudio video failed: ${data.error_message || 'unknown reason'}`,
      };
    }
    // pending / processing — continue polling. Emit a periodic log line
    // so a stuck job surfaces in the daemon log instead of silently
    // burning attempts.
    if ((attempt + 1) % SENSEAUDIO_VIDEO_PROGRESS_LOG_EVERY === 0) {
      const pct = typeof data.progress === 'number' ? data.progress : '?';
      console.log(
        `[proxy:senseaudio] generate_video poll ${attempt + 1}/${SENSEAUDIO_VIDEO_MAX_POLLS} task=${taskId} status=${data.status ?? 'unknown'} progress=${pct}`,
      );
    }
  }
  if (!videoUrl) {
    return {
      ok: false,
      error: `senseaudio video timed out after ${SENSEAUDIO_VIDEO_MAX_POLLS} polls`,
    };
  }
  // Step 3: download the mp4 bytes and persist into the project folder.
  // Re-validate the returned URL through assertExternalAssetUrl (which
  // wraps validateBaseUrlResolved) so a malicious gateway can't point us
  // at 169.254.169.254 (AWS / Azure metadata service) or RFC1918 hosts
  // via the response payload.
  const videoUrlCheck = await assertExternalAssetUrl(videoUrl);
  if (!videoUrlCheck.ok) return { ok: false, error: videoUrlCheck.error };
  let bytes: Buffer;
  try {
    const videoResp = await fetch(videoUrl, { redirect: 'error' });
    if (!videoResp.ok) {
      return { ok: false, error: `video download ${videoResp.status}` };
    }
    bytes = Buffer.from(await videoResp.arrayBuffer());
  } catch (err) {
    return {
      ok: false,
      error: `video download failed: ${err instanceof Error ? err.message : String(err)}`,
    };
  }
  if (bytes.length === 0) {
    return { ok: false, error: 'video download returned zero bytes' };
  }
  const id = `${Date.now().toString(36)}-${randomBytes(4).toString('hex')}`;
  const filename = `byok-video-${id}.mp4`;
  await writeFile(path.join(dir, filename), bytes);
  return {
    ok: true,
    url: `/api/projects/${encodeURIComponent(ctx.projectId)}/files/${filename}`,
  };
 }
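
The polling budget described in the `executeGenerateVideo` doc comment can be checked with a little arithmetic. This is a minimal sketch that assumes the figures quoted in the commit message (5 s poll interval, 10-minute ceiling, one progress line every 30 s). The variable names are hypothetical, and the actual `SENSEAUDIO_VIDEO_*` constant values are not shown in this diff.

```typescript
// Illustrative values taken from the commit message; the real
// SENSEAUDIO_VIDEO_* constants live elsewhere and may differ.
const pollIntervalMs = 5000; // 5 s between status polls
const ceilingMs = 10 * 60 * 1000; // 10-minute ceiling
const progressLogMs = 30000; // one progress line every 30 s

// Number of polls that fit under the ceiling, and how many polls
// separate two consecutive progress log lines.
const maxPolls = Math.floor(ceilingMs / pollIntervalMs);
const logEvery = Math.floor(progressLogMs / pollIntervalMs);

// Mirrors the `(attempt + 1) % N === 0` check in the polling loop:
// attempts are zero-based, so the first log fires on the Nth poll.
function shouldLogProgress(attempt: number, every: number): boolean {
  return (attempt + 1) % every === 0;
}

// Collect the 1-based poll numbers on which a progress line would be logged.
const loggedPolls: number[] = [];
for (let attempt = 0; attempt < maxPolls; attempt++) {
  if (shouldLogProgress(attempt, logEvery)) loggedPolls.push(attempt + 1);
}
console.log({ maxPolls, logEvery, logLines: loggedPolls.length });
```

Under these assumed values a job that never completes would emit a bounded number of log lines before the timeout error is returned, which is the behaviour the loop comment describes.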
--- a/apps/daemon/src/chat-routes.ts
+++ b/apps/daemon/src/chat-routes.ts
@@ -1,13 +1,22 @@
 import type { Express } from 'express';
 import type { RouteDeps } from './server-context.js';
 import { newInsertId } from './analytics.js';
 import { seedProviderIfMissing } from './media-config.js';
 import {
  BYOK_SENSEAUDIO_TOOLS,
  executeGenerateImage,
  executeGenerateVideo,
  isSenseAudioImageModel,
  type BYOKToolContext,
 } from './byok-tools.js';
 import { isSafeId as isSafeProjectId } from './projects.js';
 import {
  agentIdToTracking,
  projectKindToTracking,
 } from '@open-design/contracts/analytics';
 import { validateBaseUrlResolved } from './connectionTest.js';
-export interface RegisterChatRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'chat' | 'agents' | 'critique' | 'validation' | 'lifecycle'> {}
+export interface RegisterChatRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'chat' | 'agents' | 'critique' | 'validation' | 'lifecycle' | 'paths'> {}
 // Invariant: a chat assistant message row reflects its run's terminal state
 // even when the web client never persists the cancel/finish itself (refresh
@@ -310,13 +319,13 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
    const protocol = body.protocol;
    if (
      typeof protocol !== 'string' ||
-      !['anthropic', 'openai', 'azure', 'google', 'ollama'].includes(protocol)
+      !['anthropic', 'openai', 'azure', 'google', 'ollama', 'senseaudio'].includes(protocol)
    ) {
      return sendApiError(
        res,
        400,
        'BAD_REQUEST',
-        'protocol must be one of anthropic|openai|azure|google|ollama',
+        'protocol must be one of anthropic|openai|azure|google|ollama|senseaudio',
      );
    }
    if (
@@ -371,13 +380,13 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
        const protocol = body.protocol;
        if (
          typeof protocol !== 'string' ||
-          !['anthropic', 'openai', 'azure', 'google', 'ollama'].includes(protocol)
+          !['anthropic', 'openai', 'azure', 'google', 'ollama', 'senseaudio'].includes(protocol)
        ) {
          return sendApiError(
            res,
            400,
            'BAD_REQUEST',
-            'protocol must be one of anthropic|openai|azure|google|ollama',
+            'protocol must be one of anthropic|openai|azure|google|ollama|senseaudio',
          );
        }
        if (
@@ -1172,4 +1181,354 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
    }
  });
  // SenseAudio chat completions. Wire-compatible with OpenAI (POST
  // /v1/chat/completions, Bearer auth, SSE `data: {...}` + `data: [DONE]`)
  // plus a daemon-side tool loop: the handler injects an OpenAI
  // `tools` array on every upstream request and, when the model
  // responds with a `tool_calls` finish_reason, executes the call
  // locally, appends the assistant + tool messages to the conversation,
  // and re-issues the completion. This is how BYOK chat — which has
  // no agent-runtime scaffolding — gets image-generation parity with
  // the CLI agent path. Loop is bounded by MAX_BYOK_TOOL_LOOPS so a
  // misbehaving model can't pin the daemon in an infinite tool dance.
  const MAX_BYOK_TOOL_LOOPS = 3;
  type AccumulatedToolCall = { id: string; name: string; arguments: string };
  type TurnResult =
    | { kind: 'text_end' }
    | { kind: 'error' }
    | {
        kind: 'tool_calls';
        assistantMessage: any;
        toolCalls: Array<{ id: string; type: 'function'; function: { name: string; arguments: string } }>;
      };
  app.post('/api/proxy/senseaudio/stream', async (req, res) => {
    const proxyBody = req.body || {};
    if (rejectProxyPluginContext(proxyBody, res)) return;
    const {
      baseUrl,
      apiKey,
      model,
      systemPrompt,
      messages,
      maxTokens,
      projectId,
      byokImageModel,
    } = proxyBody;
    if (!apiKey || !model) {
      return sendApiError(
        res,
        400,
        'BAD_REQUEST',
        'apiKey and model are required',
      );
    }
    // projectId is required because the BYOK generate_image and
    // generate_video tools write into the active project's folder;
    // without one we'd have to fall back to a daemon-global cache that
    // orphans the file. The web client always passes project.id from
    // ProjectView, so a missing value means the request did not come
    // through the chat surface.
    if (typeof projectId !== 'string' || !isSafeProjectId(projectId)) {
      return sendApiError(
        res,
        400,
        'BAD_REQUEST',
        'projectId is required and must be a safe identifier',
      );
    }
    const effectiveBaseUrl = baseUrl || 'https://api.senseaudio.cn';
    const validated = await validateExternalApiBaseUrl(effectiveBaseUrl);
    if (validated.error) {
      return sendApiError(
        res,
        validated.forbidden ? 403 : 400,
        validated.forbidden ? 'FORBIDDEN' : 'BAD_REQUEST',
        validated.error,
      );
    }
    const url = appendVersionedApiPath(effectiveBaseUrl, '/chat/completions');
    console.log(
      `[proxy:senseaudio] ${req.method} ${validated.parsed?.hostname ?? '?'} model=${model} project=${projectId}`,
    );
    const workingMessages: any[] = Array.isArray(messages) ? [...messages] : [];
    if (typeof systemPrompt === 'string' && systemPrompt) {
      workingMessages.unshift({ role: 'system', content: systemPrompt });
    }
    // User-configured BYOK default image model. Drop silently if the
    // client sent an id outside the SenseAudio registry: the tool
    // will fall back to the registry default and the LLM can still
    // override per-call via the tool's `model` arg.
    const validDefaultImageModel = isSenseAudioImageModel(byokImageModel)
      ? byokImageModel
      : undefined;
    // Tool execution context, built once per request. The image tool
    // writes into `<projectsRoot>/<projectId>/byok-<id>.png` and returns
    // a relative URL via `/api/projects/:id/files/:filename`. The web's
    // Next.js rewrites `/api/:path*` to the daemon, so the chat UI
    // loads images same-origin through the standard project file
    // route, with no CSP / CORS exceptions needed.
    const toolCtx: BYOKToolContext = {
      projectRoot: ctx.paths.PROJECT_ROOT,
      projectsRoot: ctx.paths.PROJECTS_DIR,
      projectId,
      upstreamApiKey: apiKey,
      upstreamBaseUrl: effectiveBaseUrl,
      // Spread-conditional because tsconfig's exactOptionalPropertyTypes
      // forbids `field: undefined` on an optional slot. The byok-tools
      // executor reads `ctx.defaultImageModel` with `isSenseAudioImageModel`
      // anyway, so a missing key and an undefined value behave the same.
      ...(validDefaultImageModel
        ? { defaultImageModel: validDefaultImageModel }
        : {}),
    };
    // Run one round-trip: POST to upstream, stream text deltas to the
    // client as they arrive, accumulate any tool_call deltas. Returns
    // a typed result describing what to do next (loop on tool calls,
    // close the stream, or bail on error). Closures capture all the
    // SSE helpers from registerChatRoutes.
    const runSenseAudioTurn = async (
      sse: any,
      messagesForTurn: any[],
    ): Promise<TurnResult> => {
      const payload: any = {
        model,
        messages: messagesForTurn,
        max_tokens:
          typeof maxTokens === 'number' && maxTokens > 0 ? maxTokens : 8192,
        stream: true,
        tools: BYOK_SENSEAUDIO_TOOLS,
        tool_choice: 'auto',
      };
      const response = await fetch(url, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify(payload),
        redirect: 'error',
      });
      if (!response.ok) {
        const errorText = await response.text();
        console.error(
          `[proxy:senseaudio] upstream error: ${response.status} ${redactAuthTokens(errorText)}`,
        );
        sendProxyError(sse, `Upstream error: ${response.status}`, {
          code: proxyErrorCode(response.status),
          details: errorText,
          retryable: response.status === 429 || response.status >= 500,
        });
        return { kind: 'error' };
      }
      const accum: Record<number, AccumulatedToolCall> = {};
      let finishReason = '';
      let providerError = '';
      await streamUpstreamSse(response, ({ payload, data }: any) => {
        if (payload === '[DONE]') return true;
        if (!data) return false;
        const streamErr = extractStreamErrorMessage(data);
        if (streamErr) {
          providerError = streamErr;
          return true;
        }
        const choices = (data as any).choices;
        if (!Array.isArray(choices) || choices.length === 0) return false;
        const choice = choices[0] || {};
        const delta = choice.delta || {};
        // Text content streams to the client unchanged. Tool turns and
        // text turns can both share this path — the OpenAI protocol
        // never emits text+tool_calls in the same chunk, but it can
        // emit text before / after a tool_call in the same turn, and
        // we want the user to see whatever the model decided to say.
        if (typeof delta.content === 'string' && delta.content) {
          sse.send('delta', { delta: delta.content });
        }
        // Tool call deltas stream as fragments — `id` arrives once at
        // the start, `function.name` once at the start, and
        // `function.arguments` accumulates a chunked JSON string we
        // have to concatenate. Parallel calls use the `index` field to
        // distinguish slots. Default to 0 when omitted (older models).
        if (Array.isArray(delta.tool_calls)) {
          for (const tc of delta.tool_calls) {
            const idx = typeof tc?.index === 'number' ? tc.index : 0;
            if (!accum[idx]) {
              accum[idx] = { id: '', name: '', arguments: '' };
            }
            const slot = accum[idx];
            if (typeof tc.id === 'string' && tc.id) slot.id = tc.id;
            if (typeof tc.function?.name === 'string' && tc.function.name) {
              slot.name = tc.function.name;
            }
            if (typeof tc.function?.arguments === 'string') {
              slot.arguments += tc.function.arguments;
            }
          }
        }
        if (typeof choice.finish_reason === 'string' && choice.finish_reason) {
          finishReason = choice.finish_reason;
        }
        return false;
      });
      if (providerError) {
        sendProxyError(sse, `Provider error: ${providerError}`, {
          details: providerError,
        });
        return { kind: 'error' };
      }
      if (finishReason === 'tool_calls' && Object.keys(accum).length > 0) {
        const indices = Object.keys(accum)
          .map(Number)
          .sort((a, b) => a - b);
        const toolCalls = indices.map((i) => ({
          id: accum[i]!.id || `call_${i}`,
          type: 'function' as const,
          function: {
            name: accum[i]!.name,
            arguments: accum[i]!.arguments,
          },
        }));
        return {
          kind: 'tool_calls',
          assistantMessage: {
            role: 'assistant',
            content: null,
            tool_calls: toolCalls,
          },
          toolCalls,
        };
      }
      return { kind: 'text_end' };
    };
    const executeOneTool = async (call: {
      id: string;
      function: { name: string; arguments: string };
    }): Promise<{ ok: boolean; url?: string; error?: string; kind?: 'image' | 'video' }> => {
      const fnName = call?.function?.name ?? '';
      if (fnName !== 'generate_image' && fnName !== 'generate_video') {
        return {
          ok: false,
          error: `unknown tool: ${fnName || 'unnamed'}`,
        };
      }
      let args: any = {};
      try {
        args = JSON.parse(call.function.arguments || '{}');
      } catch {
        return { ok: false, error: 'tool arguments were not valid JSON' };
      }
      if (fnName === 'generate_image') {
        const result = await executeGenerateImage(args, toolCtx);
        return { ...result, kind: 'image' };
      }
      // generate_video: longer-running (polls for up to 10 min), async-with-polling.
      const result = await executeGenerateVideo(args, toolCtx);
      return { ...result, kind: 'video' };
    };
    const sse = createSseResponse(res);
    sse.send('start', { model });
    // SenseAudio's gateway issues one API key that works for both
    // /v1/chat/completions and the image / TTS surfaces. Mirror the
    // BYOK key into media-config so the CLI agent path (`od media
    // generate`) picks it up automatically — fire-and-forget; the
    // chat stream must not block on the disk write. seedProviderIfMissing
    // is idempotent and preserves env-var-resolved keys.
    seedProviderIfMissing(ctx.paths.PROJECT_ROOT, 'senseaudio', {
      apiKey,
      baseUrl: effectiveBaseUrl,
    })
      .then((seeded) => {
        if (seeded) {
          console.log(
            '[proxy:senseaudio] seeded media-config.senseaudio from BYOK key',
          );
        }
      })
      .catch((err: unknown) => {
        console.warn(
          `[proxy:senseaudio] seed media-config failed: ${
            err instanceof Error ? err.message : String(err)
          }`,
        );
      });
    try {
      for (let loop = 0; loop < MAX_BYOK_TOOL_LOOPS; loop++) {
        const turn = await runSenseAudioTurn(sse, workingMessages);
        if (turn.kind === 'error') return sse.end();
        if (turn.kind === 'text_end') {
          sse.send('end', {});
          return sse.end();
        }
        // turn.kind === 'tool_calls'
        workingMessages.push(turn.assistantMessage);
        for (const call of turn.toolCalls) {
          const result = await executeOneTool(call);
          // The tool result is delivered to the model as a `tool` role
          // message — a structured payload the model can interpret. We
          // also surface a daemon-side log line so a user reporting "no
          // image showed up" can grep for the call id. The kind field
          // distinguishes image vs video so the daemon picks the right
          // embedding hint for the model (markdown image syntax for
          // PNG, markdown link for MP4 since the chat renderer doesn't
          // currently render <video> tags).
          const toolName = call?.function?.name ?? 'unknown';
          if (result.ok) {
            console.log(
              `[proxy:senseaudio] ${toolName} OK: ${call.id} → ${result.url}`,
            );
          } else {
            console.warn(
              `[proxy:senseaudio] ${toolName} FAILED: ${call.id} — ${result.error}`,
            );
          }
          const content = result.ok
            ? result.kind === 'video'
              ? `Video generated successfully. URL: ${result.url}. Reply to the user with a clickable markdown link, e.g. [▶ Play video](${result.url}). Do NOT use markdown image syntax — the chat renderer does not embed <video> tags.`
              : `Image generated successfully. URL: ${result.url}. Reply to the user with: ![generated image](${result.url})`
            : result.kind === 'video'
              ? `Video generation failed: ${result.error}. Apologize briefly and suggest a retry with a more specific prompt or a shorter duration.`
              : `Image generation failed: ${result.error}. Apologize briefly and suggest a retry with a more specific prompt.`;
          workingMessages.push({
            role: 'tool',
            tool_call_id: call.id,
            content,
          });
        }
      }
      // Tool loop exhausted: the model still wants to call tools, but we
      // refuse another round beyond MAX_BYOK_TOOL_LOOPS. Close the stream
      // gracefully; any text delta the model emitted is already on the wire.
      console.warn(
        `[proxy:senseaudio] tool loop bounded at MAX_BYOK_TOOL_LOOPS=${MAX_BYOK_TOOL_LOOPS}`,
      );
      sse.send('end', {});
      return sse.end();
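    } catch (err: unknown) {
      // Non-Error throwables have no `.message`; stringify them instead.
      const message = err instanceof Error ? err.message : String(err);
      console.error(`[proxy:senseaudio] internal error: ${message}`);
      sendProxyError(sse, message, { code: 'INTERNAL_ERROR' });
      sse.end();
    }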
  });
 }
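
The tool-call accumulation inside `runSenseAudioTurn` can be sketched in isolation. The example below is a simplified, hypothetical helper (`accumulateToolCalls` and its types are not part of this change) that assumes OpenAI-style streamed `tool_calls` fragments: the `id` and `function.name` arrive once, `function.arguments` arrives as string pieces to concatenate, and `index` selects the slot, defaulting to 0 when omitted.

```typescript
// Hypothetical types and helper for illustration only.
type ToolCallDelta = {
  index?: number;
  id?: string;
  function?: { name?: string; arguments?: string };
};
type AccumulatedCall = { id: string; name: string; arguments: string };

// Fold a sequence of streamed chunks (each carrying zero or more
// tool_call fragments) into complete calls, ordered by slot index.
function accumulateToolCalls(chunks: ToolCallDelta[][]): AccumulatedCall[] {
  const slots = new Map<number, AccumulatedCall>();
  for (const chunk of chunks) {
    for (const tc of chunk) {
      const idx = typeof tc.index === 'number' ? tc.index : 0;
      let slot = slots.get(idx);
      if (!slot) {
        slot = { id: '', name: '', arguments: '' };
        slots.set(idx, slot);
      }
      if (tc.id) slot.id = tc.id;
      if (tc.function?.name) slot.name = tc.function.name;
      if (typeof tc.function?.arguments === 'string') {
        // Arguments are a chunked JSON string; concatenate the pieces.
        slot.arguments += tc.function.arguments;
      }
    }
  }
  return Array.from(slots.entries())
    .sort(([a], [b]) => a - b)
    // Fall back to a synthetic id when the stream never supplied one.
    .map(([i, slot]) => ({ ...slot, id: slot.id || `call_${i}` }));
}

const calls = accumulateToolCalls([
  [{ index: 0, id: 'call_a', function: { name: 'generate_image', arguments: '{"pro' } }],
  [{ index: 0, function: { arguments: 'mpt":"a cat"}' } }],
  [{ index: 1, function: { name: 'generate_video', arguments: '{}' } }],
]);
console.log(calls);
```

The arguments string only becomes parseable JSON once every fragment has arrived, which is why the route defers `JSON.parse` until a `tool_calls` finish reason is seen.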
--- a/apps/daemon/src/connectionTest.ts
+++ b/apps/daemon/src/connectionTest.ts
@@ -119,6 +119,41 @@ export async function validateBaseUrlResolved(
  return sync;
 }
 /**
 * SSRF guard for asset URLs handed back inside a successful API
 * response — typically a `data.url` or `data.video_url` that points
 * at the gateway's CDN, but is attacker-controllable when the
 * upstream gateway is compromised or misconfigured. Routes the URL
 * through `validateBaseUrlResolved` (DNS-resolve → reject loopback,
 * RFC1918, link-local, CGNAT, metadata-service IPs) and returns a
 * discriminated union so callers don't have to repeat the
 * `validated.error || !validated.parsed` plumbing.
 *
 * Two callers today:
 *   - `byok-tools.ts` for the chat-tool image/video downloads
 *   - `media.ts` `renderSenseAudioImage` for the CLI agent path
 * Both hand the URL straight to `fetch(...)` next, so pair this
 * guard with `redirect: 'error'` on the fetch to also block a
 * 3xx hop into private space.
 */
 export async function assertExternalAssetUrl(
  rawUrl: string,
 ): Promise<{ ok: true } | { ok: false; error: string }> {
  if (typeof rawUrl !== 'string' || !rawUrl) {
    return { ok: false, error: 'empty download url' };
  }
  const validated = await validateBaseUrlResolved(rawUrl);
  if (validated.error || !validated.parsed) {
    return {
      ok: false,
      error: validated.forbidden
        ? `blocked download url (${validated.error ?? 'internal address'})`
        : `invalid download url: ${validated.error ?? 'unknown reason'}`,
    };
  }
  return { ok: true };
 }
 // Aggressive but not punitive — happy paths usually return in under 2 s.
 // Override with OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS for slow networks
 // or distant providers; invalid values fall back to the default.
@@ -315,10 +350,10 @@ function inspectProviderCompletion(
  const obj = data && typeof data === 'object' ? data as Record<string, unknown> : null;
  if (!obj) return { valid: false };
-  if (protocol === 'openai' || protocol === 'azure') {
+  if (protocol === 'openai' || protocol === 'azure' || protocol === 'senseaudio') {
    const responseModel = typeof obj.model === 'string' ? obj.model : '';
    if (
-      protocol === 'openai' &&
+      (protocol === 'openai' || protocol === 'senseaudio') &&
      enforceResponseModel &&
      responseModel &&
      requestedModel &&
@@ -518,6 +553,12 @@ function buildProviderCall(input: ProviderTestRequest): ProviderCallShape {
        },
      };
    case 'openai':
    case 'senseaudio':
      // SenseAudio is wire-compatible with OpenAI (POST /v1/chat/completions,
      // Bearer auth, identical body + response shape), so the connection
      // smoke test reuses the same call shape. We default the base URL
      // upstream-side in chat-routes; this layer assumes the caller passed
      // a concrete URL via the BYOK form.
      return {
        url: appendVersionedApiPath(baseUrl, '/chat/completions'),
        headers: {
--- a/apps/daemon/src/media-config.ts
+++ b/apps/daemon/src/media-config.ts
@@ -521,3 +521,53 @@ export async function writeConfig(projectRoot: string, body: unknown) {
  await writeStored(projectRoot, next);
  return readMaskedConfig(projectRoot);
 }
 /**
 * Idempotent "seed if empty" write for a single provider slot. The chat
 * proxy uses this to mirror a BYOK key into media-config so the agent's
 * image / TTS path picks up the same credential without the user having
 * to paste it twice. Strict rules:
 *   * No-op when an apiKey is ALREADY stored for `providerId` (the user
 *     may have configured Media independently and we never overwrite).
 *   * No-op when an env-var key resolves for `providerId` (env wins
 *     regardless of disk state — seeding would be invisible).
 *   * No-op when the incoming `apiKey` is empty (we only seed values
 *     the chat layer has just verified upstream).
 *   * Otherwise merge `{ [providerId]: entry }` into the existing
 *     provider map and persist. All other provider slots and aliases
 *     are preserved byte-for-byte.
 *
 * Returns `true` when a write happened (caller can log), `false` when
 * the call was a no-op. Errors are surfaced — the caller decides
 * whether to swallow them (fire-and-forget) or propagate.
 */
 export async function seedProviderIfMissing(
  projectRoot: string,
  providerId: string,
  entry: { apiKey?: string; baseUrl?: string; model?: string },
 ): Promise<boolean> {
  if (!PROVIDER_IDS.includes(providerId)) return false;
  const apiKey = entry.apiKey?.trim() ?? '';
  if (!apiKey) return false;
  // Env var wins at resolution time, so seeding when env is set would
  // be invisible to the user. Skip to avoid confusing on-disk state.
  if (readEnvKey(providerId)) return false;
  const prior = await readStored(projectRoot);
  const priorApiKey =
    typeof prior[providerId]?.apiKey === 'string' && prior[providerId].apiKey.trim()
      ? prior[providerId].apiKey.trim()
      : '';
  if (priorApiKey) return false;
  const baseUrl = entry.baseUrl?.trim() ?? '';
  const model = entry.model?.trim() ?? '';
  const next: ProviderMap = { ...prior };
  next[providerId] = {
    apiKey,
    ...(baseUrl ? { baseUrl } : {}),
    ...(model ? { model } : {}),
  };
  await writeStored(projectRoot, next);
  return true;
 }
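
The no-op rules in the `seedProviderIfMissing` doc comment can be expressed as a pure decision function, which makes them easy to exercise without touching disk. This is a sketch under stated assumptions: `planSeed`, its parameters, and the local `ProviderEntry` / `ProviderMap` types are hypothetical stand-ins, and the env-var lookup is passed in as a plain value instead of calling `readEnvKey`.

```typescript
// Hypothetical stand-in types; the real ones live in media-config.ts.
type ProviderEntry = { apiKey?: string; baseUrl?: string; model?: string };
type ProviderMap = Record<string, ProviderEntry>;

// Returns the next provider map when a seed should be written, or
// null when the call must be a no-op.
function planSeed(
  prior: ProviderMap,
  providerId: string,
  entry: ProviderEntry,
  envKey: string | undefined,
): ProviderMap | null {
  const apiKey = entry.apiKey?.trim() ?? '';
  if (!apiKey) return null; // nothing to seed
  if (envKey) return null; // env var wins at resolution time
  if (prior[providerId]?.apiKey?.trim()) return null; // never overwrite
  const baseUrl = entry.baseUrl?.trim() ?? '';
  const model = entry.model?.trim() ?? '';
  return {
    ...prior, // other provider slots are carried over untouched
    [providerId]: {
      apiKey,
      ...(baseUrl ? { baseUrl } : {}),
      ...(model ? { model } : {}),
    },
  };
}

const seeded = planSeed({}, 'senseaudio', { apiKey: ' k ' }, undefined);
const alreadyStored = planSeed(
  { senseaudio: { apiKey: 'old' } },
  'senseaudio',
  { apiKey: 'new' },
  undefined,
);
const envWins = planSeed({}, 'senseaudio', { apiKey: 'k' }, 'from-env');
console.log(seeded, alreadyStored, envWins);
```

Separating the decision from the write keeps the idempotence rules testable on their own; the real function additionally checks `PROVIDER_IDS` and persists via `writeStored`.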
--- a/apps/daemon/src/media-models.ts
+++ b/apps/daemon/src/media-models.ts
@@ -60,7 +60,7 @@ export const MEDIA_PROVIDERS: MediaProvider[] = [
  {
    id: 'senseaudio',
    label: 'SenseAudio',
-    hint: 'TTS · 70+ system voices · clone',
+    hint: '',
    integrated: true,
    defaultBaseUrl: 'https://api.senseaudio.cn',
    docsUrl: 'https://docs.senseaudio.cn',
@@ -80,6 +80,10 @@ export const IMAGE_MODELS: MediaModel[] = [
  { id: 'doubao-seedream-3-0-t2i-250415', label: 'seedream-3.0', hint: 'ByteDance · Doubao image', provider: 'volcengine', caps: ['t2i'] },
  { id: 'doubao-seededit-3-0-i2i-250628', label: 'seededit-3.0', hint: 'ByteDance · image edit', provider: 'volcengine', caps: ['i2i'] },
  { id: 'senseaudio-image-2.0-260319', label: 'senseaudio-image-2.0', hint: 'SenseAudio · multi-aspect, latest', provider: 'senseaudio', caps: ['t2i', 'i2i'] },
  { id: 'senseaudio-image-1.0-260319', label: 'senseaudio-image-1.0', hint: 'SenseAudio · standard', provider: 'senseaudio', caps: ['t2i', 'i2i'] },
  { id: 'doubao-seedream-5-0-260128', label: 'seedream-5.0', hint: 'SenseAudio · ByteDance Seedream 5.0 hi-res', provider: 'senseaudio', caps: ['t2i', 'i2i'] },
  { id: 'grok-imagine-image', label: 'grok-imagine-image', hint: 'xAI · 2K text-to-image', provider: 'grok', caps: ['t2i'] },
  { id: 'gemini-3.1-flash-image-preview', label: 'nano-banana-2', hint: 'Nano Banana · text-to-image', provider: 'nanobanana', caps: ['t2i'] },
--- a/apps/daemon/src/media.ts
+++ b/apps/daemon/src/media.ts
@@ -57,6 +57,7 @@ import {
  findProvider,
  modelsForSurface,
 } from './media-models.js';
 import { assertExternalAssetUrl } from './connectionTest.js';
 import { resolveModelAlias, resolveProviderConfig } from './media-config.js';
 import {
  ensureProject,
@@ -559,6 +560,11 @@ export async function generateMedia(args: {
      bytes = result.bytes;
      providerNote = result.providerNote;
      suggestedExt = result.suggestedExt;
    } else if (def.provider === 'senseaudio' && surface === 'image') {
      const result = await renderSenseAudioImage(ctx, credentials);
      bytes = result.bytes;
      providerNote = result.providerNote;
      suggestedExt = result.suggestedExt;
    } else if (def.provider === 'fishaudio' && surface === 'audio') {
      const result = await renderFishAudioTTS(ctx, credentials);
      bytes = result.bytes;
@@ -2243,6 +2249,131 @@ async function renderSenseAudioTTS(ctx: MediaContext, credentials: ProviderConfi
  };
 }
 // ---------------------------------------------------------------------------
 // Provider: SenseAudio image — POST /v1/image/sync (synchronous text-to-image).
 //
 // Docs: https://docs.senseaudio.cn/guides/image/overview
 //   * Models: senseaudio-image-2.0-260319 (multi-aspect), senseaudio-image-1.0-260319
 //     (standard), doubao-seedream-5-0-260128 (hi-res). The wire `model` field
 //     accepts the catalog id directly so no alias map is needed.
 //   * Body: { model, prompt (≤2000 chars), size (WxH, required when no
 //     reference), reference (URL or data URI, optional), seed (optional int) }.
 //   * Response: { url: string } pointing at the rendered PNG; we fetch it
 //     once to materialise bytes the dispatcher can write to disk.
 //   * Auth: Authorization: Bearer <API_KEY>; shares the senseaudio provider
 //     slot with the TTS path (OD_SENSEAUDIO_API_KEY / SENSEAUDIO_API_KEY).
 // We default to the /sync endpoint because the chat runtime already streams
 // progress and a single round-trip keeps the dispatcher contract identical
 // to OpenAI / Volcengine image. Switching to /v1/image/async + GET
 // /v1/image/pending is a future option if the upstream model latency
 // outgrows the daemon's request timeout.
 // ---------------------------------------------------------------------------
 const SENSEAUDIO_IMAGE_PROMPT_LIMIT = 2000;
 // SenseAudio's image gateway rejects non-standard pixel sizes with a 400
 // `参数错误：size`. Keep this table in sync with byok-tools.ts's
 // ASPECT_TO_SIZE — both paths hit the same /v1/image/sync endpoint.
 function senseAudioImageSize(aspect?: string): string {
  if (aspect === '16:9') return '1280x720';
  if (aspect === '9:16') return '720x1280';
  if (aspect === '4:3') return '1024x768';
  if (aspect === '3:4') return '768x1024';
  return '1024x1024';
 }
 async function renderSenseAudioImage(ctx: MediaContext, credentials: ProviderConfig): Promise<RenderResult> {
  if (!credentials.apiKey) {
    throw new Error(
      'no SenseAudio API key — configure it in Settings or set OD_SENSEAUDIO_API_KEY',
    );
  }
  const baseUrl = (credentials.baseUrl || SENSEAUDIO_DEFAULT_BASE_URL).replace(
    /\/$/,
    '',
  );
  const promptRaw = (ctx.prompt && ctx.prompt.trim()) || 'A high-quality reference image.';
  // SenseAudio rejects >2000-char prompts with a 4xx; trim defensively so a
  // verbose agent plan doesn't dead-end the generation. Note the truncation
  // is silent: providerNote below does not record that the prompt was cut.
  const prompt =
    promptRaw.length > SENSEAUDIO_IMAGE_PROMPT_LIMIT
      ? promptRaw.slice(0, SENSEAUDIO_IMAGE_PROMPT_LIMIT)
      : promptRaw;
  const size = senseAudioImageSize(ctx.aspect);
  const reference = ctx.imageRef?.dataUrl;
  const body: Record<string, unknown> = {
    model: ctx.wireModel,
    prompt,
    size,
  };
  if (reference) {
    // When a reference image is supplied the API documents `size` as
    // optional; we still send it so the output dimensions stay
    // deterministic across t2i / i2i runs of the same project.
    body.reference = reference;
  }
  const resp = await fetch(`${baseUrl}/v1/image/sync`, {
    method: 'POST',
    headers: {
      authorization: `Bearer ${credentials.apiKey}`,
      'content-type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  const respText = await resp.text();
  if (!resp.ok) {
    throw new Error(`senseaudio image ${resp.status}: ${truncate(respText, 240)}`);
  }
  let data: any;
  try {
    data = JSON.parse(respText);
  } catch {
    throw new Error(`senseaudio image non-JSON: ${truncate(respText, 200)}`);
  }
  // Mirror the TTS base_resp envelope check: HTTP 200 can still encode an
  // upstream logical failure. The image API uses the same shape on the
  // failure path documented for /v1/image/pending (status=failed +
  // error_message), so surface either source verbatim.
  if (data?.base_resp && data.base_resp.status_code !== 0) {
    throw new Error(
      `senseaudio image api error ${data.base_resp.status_code}: ${data.base_resp.status_msg || 'unknown'}`,
    );
  }
  if (typeof data?.error_message === 'string' && data.error_message) {
    throw new Error(`senseaudio image api error: ${data.error_message}`);
  }
  const url = typeof data?.url === 'string' ? data.url : '';
  if (!url) {
    throw new Error('senseaudio image response missing url');
  }
  // Mirror the chat-tool SSRF guard (byok-tools.ts): the gateway-returned
  // `url` is attacker-controllable inside a successful response, so DNS-
  // resolve it through validateBaseUrlResolved and refuse loopback /
  // RFC1918 / metadata-service hosts. Pair with `redirect: 'error'` so a
  // 3xx hop into private space is also blocked.
  const urlCheck = await assertExternalAssetUrl(url);
  if (!urlCheck.ok) {
    throw new Error(`senseaudio image ${urlCheck.error}`);
  }
  const imgResp = await fetch(url, { redirect: 'error' });
  if (!imgResp.ok) {
    throw new Error(`senseaudio image fetch ${imgResp.status}`);
  }
  const bytes = Buffer.from(await imgResp.arrayBuffer());
  if (bytes.length === 0) {
    throw new Error('senseaudio image fetch returned zero bytes');
  }
  return {
    bytes,
    providerNote: `senseaudio/${ctx.wireModel} · ${size}${reference ? ' · i2i' : ''} · ${bytes.length} bytes`,
    suggestedExt: '.png',
  };
 }
 // ---------------------------------------------------------------------------
 // Provider: FishAudio — Speech-1.x family text-to-speech (synchronous).
 //
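Review note: the request-building rules in `renderSenseAudioImage` (aspect-to-size table, 2000-char prompt clamp, optional `reference`) can be read in isolation as the sketch below. It is a minimal stand-alone illustration, not the daemon's code: `buildImageBody`, `ImageRequestBody`, and the `truncated` flag are hypothetical names introduced here, and the size table simply restates the values in `senseAudioImageSize`.

```typescript
// Hypothetical sketch; names are illustrative and not exported by media.ts.
const ASPECT_TO_SIZE = new Map<string, string>([
  ['1:1', '1024x1024'],
  ['16:9', '1280x720'],
  ['9:16', '720x1280'],
  ['4:3', '1024x768'],
  ['3:4', '768x1024'],
]);
const PROMPT_LIMIT = 2000;
const FALLBACK_PROMPT = 'A high-quality reference image.';

interface ImageRequestBody {
  model: string;
  prompt: string;
  size: string;
  reference?: string;
}

function buildImageBody(
  model: string,
  rawPrompt: string,
  aspect?: string,
  reference?: string,
): { body: ImageRequestBody; truncated: boolean } {
  // Blank prompts fall back to a generic prompt, as in the diff.
  const trimmed = rawPrompt.trim() || FALLBACK_PROMPT;
  const truncated = trimmed.length > PROMPT_LIMIT;
  const body: ImageRequestBody = {
    model,
    prompt: truncated ? trimmed.slice(0, PROMPT_LIMIT) : trimmed,
    // Unknown or missing aspects map to the square size.
    size: ASPECT_TO_SIZE.get(aspect ?? '') ?? '1024x1024',
  };
  // `size` is still sent alongside a reference image so output dimensions stay fixed.
  if (reference) body.reference = reference;
  return { body, truncated };
}
```

Returning a `truncated` flag is one possible way to let a caller report the clamp (for example in a provider note); the code in this diff does not do that.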
--- a/apps/daemon/src/memory-llm.ts
+++ b/apps/daemon/src/memory-llm.ts
@@ -142,6 +142,15 @@ const PROVIDER_DEFAULTS = {
    model: 'gemma3:4b',
    baseUrl: 'https://ollama.com',
  },
  // SenseAudio's chat API is OpenAI-compatible (POST /v1/chat/completions,
  // Bearer auth), so the extractor falls through to callOpenAI with this
  // base URL and the user's SenseAudio API key. The default model is the
  // small/fast variant so auto-pick stays cheap; users can swap in
  // senseaudio-s2 or any gateway model via the picker.
  senseaudio: {
    model: 'senseaudio-s2-flash',
    baseUrl: 'https://api.senseaudio.cn',
  },
 };
 // Map an explicit override provider to the env var the daemon should
@@ -169,6 +178,13 @@ function envKeyFor(provider) {
  if (provider === 'ollama') {
    return process.env.OLLAMA_API_KEY?.trim() || '';
  }
  if (provider === 'senseaudio') {
    return (
      process.env.OD_SENSEAUDIO_API_KEY?.trim()
      || process.env.SENSEAUDIO_API_KEY?.trim()
      || ''
    );
  }
  return '';
 }
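Review note: the lookup order in the `senseaudio` branch of `envKeyFor` is easy to misread, so here is a small sketch of the same precedence taken out of `process.env`. `senseAudioKeyFrom` and the `Env` type are hypothetical names used only for this illustration.

```typescript
// Hypothetical helper: same precedence as the diff, but over an injected env object.
type Env = Record<string, string | undefined>;

function senseAudioKeyFrom(env: Env): string {
  // The OD_-prefixed variable wins; a whitespace-only value trims to ''
  // and therefore falls through to the unprefixed variable.
  return (
    env.OD_SENSEAUDIO_API_KEY?.trim()
    || env.SENSEAUDIO_API_KEY?.trim()
    || ''
  );
}
```

Taking the environment as a parameter keeps the precedence testable without mutating `process.env`, which is what the `seedProviderIfMissing` tests have to save and restore by hand.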
--- a/apps/daemon/src/providerModels.ts
+++ b/apps/daemon/src/providerModels.ts
@@ -149,7 +149,9 @@ function extractGoogleModels(data: unknown): ProviderModelOption[] {
 }
 function providerModelsUrl(protocol: ConnectionTestProtocol, baseUrl: string, apiKey: string): string {
-  if (protocol === 'openai') return appendVersionedApiPath(baseUrl, '/models');
+  if (protocol === 'openai' || protocol === 'senseaudio') {
    return appendVersionedApiPath(baseUrl, '/models');
  }
  if (protocol === 'anthropic') {
    const url = new URL(appendVersionedApiPath(baseUrl, '/models'));
    url.searchParams.set('limit', '1000');
@@ -167,7 +169,9 @@ function providerModelsHeaders(
  protocol: ConnectionTestProtocol,
  apiKey: string,
 ): Record<string, string> {
-  if (protocol === 'openai') return { authorization: `Bearer ${apiKey}` };
+  if (protocol === 'openai' || protocol === 'senseaudio') {
    return { authorization: `Bearer ${apiKey}` };
  }
  if (protocol === 'anthropic') {
    return {
      'x-api-key': apiKey,
@@ -178,7 +182,9 @@ function providerModelsHeaders(
 }
 function extractModels(protocol: ConnectionTestProtocol, data: unknown): ProviderModelOption[] {
-  if (protocol === 'openai') return extractOpenAiModels(data);
+  // SenseAudio's /v1/models response follows the OpenAI envelope
  // (`{ data: [{ id, ... }] }`), so the same extractor handles both.
  if (protocol === 'openai' || protocol === 'senseaudio') return extractOpenAiModels(data);
  if (protocol === 'anthropic') return extractAnthropicModels(data);
  if (protocol === 'google') return extractGoogleModels(data);
  return [];
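Review note: since SenseAudio reuses the OpenAI `{ data: [{ id, ... }] }` envelope, one defensive extractor can serve both protocols. The sketch below shows what such an extractor might look like. It is not the body of `extractOpenAiModels`, which this diff does not show, and the `ModelOption` shape (`id` plus `label`) is an assumption standing in for `ProviderModelOption`.

```typescript
// Hypothetical stand-in for ProviderModelOption; the real shape is not shown in the diff.
interface ModelOption {
  id: string;
  label: string;
}

function extractOpenAiShapedModels(data: unknown): ModelOption[] {
  if (typeof data !== 'object' || data === null) return [];
  const list = (data as { data?: unknown }).data;
  if (!Array.isArray(list)) return [];
  const out: ModelOption[] = [];
  for (const entry of list) {
    if (typeof entry !== 'object' || entry === null) continue;
    const id = (entry as { id?: unknown }).id;
    // Skip entries whose id is missing, non-string, or blank.
    if (typeof id === 'string' && id.trim()) {
      out.push({ id: id.trim(), label: id.trim() });
    }
  }
  return out;
}
```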
--- a/apps/daemon/src/server.ts
+++ b/apps/daemon/src/server.ts
@@ -10859,6 +10859,7 @@ export async function startServer({
    db,
    design,
    http: httpDeps,
    paths: pathDeps,
    chat: { startChatRun, submitToolResultToRun },
    agents: agentDeps,
    critique: critiqueDeps,
--- a/apps/daemon/tests/byok-tools.test.ts
+++ b/apps/daemon/tests/byok-tools.test.ts
@@ -0,0 +1,686 @@
 import { mkdir, mkdtemp, readFile, rm } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import path from 'node:path';
 import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
 import {
  BYOK_SENSEAUDIO_TOOLS,
  executeGenerateImage,
  executeGenerateVideo,
 } from '../src/byok-tools.js';
 describe('BYOK_SENSEAUDIO_TOOLS', () => {
  it('exports an OpenAI-shaped generate_image tool definition', () => {
    const tool = BYOK_SENSEAUDIO_TOOLS.find(
      (t) => t.function.name === 'generate_image',
    );
    expect(tool).toBeDefined();
    expect(tool!.type).toBe('function');
    expect(tool!.function.parameters.required).toEqual(['prompt']);
    expect(tool!.function.parameters.properties.aspect_ratio.enum).toEqual([
      '1:1',
      '16:9',
      '9:16',
      '4:3',
      '3:4',
    ]);
  });
  it('exposes both generate_image and generate_video tools', () => {
    const names = BYOK_SENSEAUDIO_TOOLS.map((t) => t.function.name).sort();
    expect(names).toEqual(['generate_image', 'generate_video']);
  });
 });
 describe('executeGenerateImage', () => {
  let root: string;
  let projectsRoot: string;
  const PROJECT_ID = 'test-project';
  const realFetch = globalThis.fetch;
  beforeEach(async () => {
    root = await mkdtemp(path.join(tmpdir(), 'od-byok-tools-'));
    projectsRoot = path.join(root, 'projects');
  });
  afterEach(async () => {
    globalThis.fetch = realFetch;
    vi.unstubAllGlobals();
    await rm(root, { recursive: true, force: true });
  });
  const baseCtx = () => ({
    projectRoot: root,
    projectsRoot,
    projectId: PROJECT_ID,
    upstreamApiKey: 'sa-byok-key',
    upstreamBaseUrl: 'https://api.senseaudio.cn',
  });
  it('calls /v1/image/sync, downloads the URL, persists bytes, and returns a daemon URL', async () => {
    const pngBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
        expect(init?.method).toBe('POST');
        expect(init?.headers).toMatchObject({
          authorization: 'Bearer sa-byok-key',
          'content-type': 'application/json',
        });
        expect(JSON.parse(String(init?.body))).toEqual({
          model: 'senseaudio-image-2.0-260319',
          prompt: 'a tabby cat playing with yarn',
          size: '1024x1024',
        });
        return new Response(
          JSON.stringify({
            url: 'https://cdn.example.test/generated/cat.png',
            base_resp: { status_code: 0, status_msg: 'success' },
          }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url === 'https://cdn.example.test/generated/cat.png') {
        return new Response(pngBytes, {
          status: 200,
          headers: { 'content-type': 'image/png' },
        });
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'a tabby cat playing with yarn' },
      baseCtx(),
    );
    expect(result.ok).toBe(true);
    // Returns a relative URL through the project file route so the
    // chat UI loads same-origin via Next.js's /api/:path* rewrite,
    // satisfying the strict CSP `img-src 'self'`. Path component is
    // url-encoded so unusual (but isSafeId-passing) project ids don't
    // break the URL.
    expect(result.url).toMatch(
      new RegExp(`^/api/projects/${PROJECT_ID}/files/byok-[a-z0-9-]+\\.png$`),
    );
    expect(fetchMock).toHaveBeenCalledTimes(2);
    // Persisted file lives inside the project folder where listFiles /
    // readProjectFile / archive plumbing will all discover it.
    const filename = result.url!.split('/').pop()!;
    const onDisk = await readFile(path.join(projectsRoot, PROJECT_ID, filename));
    expect(onDisk.equals(pngBytes)).toBe(true);
  });
  it('honours args.model when the LLM picks a SenseAudio image model', async () => {
    const pngBytes = Buffer.from([0x89, 0x50]);
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/image/sync')) {
        expect(JSON.parse(String(init?.body)).model).toBe('doubao-seedream-5-0-260128');
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/hi.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(pngBytes, { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'wallpaper', model: 'doubao-seedream-5-0-260128' },
      baseCtx(),
    );
    expect(result.ok).toBe(true);
  });
  it('falls back to ctx.defaultImageModel when args.model is missing', async () => {
    const pngBytes = Buffer.from([0x89, 0x50]);
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/image/sync')) {
        expect(JSON.parse(String(init?.body)).model).toBe('senseaudio-image-1.0-260319');
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/std.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(pngBytes, { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'standard' },
      { ...baseCtx(), defaultImageModel: 'senseaudio-image-1.0-260319' },
    );
    expect(result.ok).toBe(true);
  });
  it('ignores args.model when it is not in the SenseAudio allowlist', async () => {
    const pngBytes = Buffer.from([0x89, 0x50]);
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/image/sync')) {
        // Falls through to ctx.defaultImageModel (registry-valid).
        expect(JSON.parse(String(init?.body)).model).toBe('senseaudio-image-1.0-260319');
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/x.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(pngBytes, { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'spoofed', model: 'evil-model-id' },
      { ...baseCtx(), defaultImageModel: 'senseaudio-image-1.0-260319' },
    );
    expect(result.ok).toBe(true);
  });
  it('falls back to registry default when both args.model and ctx.defaultImageModel are missing/invalid', async () => {
    const pngBytes = Buffer.from([0x89, 0x50]);
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/image/sync')) {
        // Registry default is the first SenseAudio entry — 2.0 today.
        expect(JSON.parse(String(init?.body)).model).toBe('senseaudio-image-2.0-260319');
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/d.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(pngBytes, { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'no model anywhere' },
      { ...baseCtx(), defaultImageModel: 'also-bogus' },
    );
    expect(result.ok).toBe(true);
  });
  it('rejects unsafe projectId before any upstream call', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'x' },
      { ...baseCtx(), projectId: '../escape' },
    );
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/invalid projectId/);
    // ensureProject runs up front so the unsafe id is caught BEFORE
    // any senseaudio upstream call goes out — no token spent, no
    // attempt to write outside the project tree.
    expect(fetchMock).not.toHaveBeenCalled();
  });
  it('maps aspect_ratio to the SenseAudio size string', async () => {
    const pngBytes = Buffer.from([0x89, 0x50]);
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/image/sync')) {
        expect(JSON.parse(String(init?.body)).size).toBe('1280x720');
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/wide.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(pngBytes, { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'widescreen banner', aspect_ratio: '16:9' },
      baseCtx(),
    );
    expect(result.ok).toBe(true);
  });
  it('falls back to 1:1 for unknown aspect_ratio values', async () => {
    const pngBytes = Buffer.from([0x89, 0x50]);
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/image/sync')) {
        expect(JSON.parse(String(init?.body)).size).toBe('1024x1024');
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/square.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(pngBytes, { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage(
      { prompt: 'square thing', aspect_ratio: 'something-else' },
      baseCtx(),
    );
    expect(result.ok).toBe(true);
  });
  it('returns { ok: false } on missing prompt', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage({}, baseCtx());
    expect(result).toEqual({ ok: false, error: 'prompt is required' });
    expect(fetchMock).not.toHaveBeenCalled();
  });
  it('returns { ok: false } when no API key is available', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const ctx = { ...baseCtx(), upstreamApiKey: '' };
    const result = await executeGenerateImage({ prompt: 'whatever' }, ctx);
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/no SenseAudio API key/);
    expect(fetchMock).not.toHaveBeenCalled();
  });
  it('surfaces HTTP failures with status code and truncated body', async () => {
    const fetchMock = vi.fn(async () =>
      new Response('unauthorized', {
        status: 401,
        headers: { 'content-type': 'text/plain' },
      }),
    );
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/senseaudio image 401/);
  });
  it('surfaces error_message envelope verbatim', async () => {
    const fetchMock = vi.fn(async () =>
      new Response(
        JSON.stringify({ error_message: 'sensitive_content_blocked' }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/sensitive_content_blocked/);
  });
  it('surfaces base_resp non-zero status_code', async () => {
    const fetchMock = vi.fn(async () =>
      new Response(
        JSON.stringify({
          base_resp: { status_code: 1004, status_msg: 'quota exhausted' },
        }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/api error 1004/);
    expect(result.error).toMatch(/quota exhausted/);
  });
  it('returns { ok: false } when upstream returns no url', async () => {
    const fetchMock = vi.fn(async () =>
      new Response(
        JSON.stringify({ base_resp: { status_code: 0, status_msg: 'ok' } }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/missing url/);
  });
  it('returns { ok: false } when the image download fails', async () => {
    const fetchMock = vi.fn(async (input: unknown) => {
      const url = String(input);
      if (url.endsWith('/v1/image/sync')) {
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/will-404.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response('not found', { status: 404 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/image download 404/);
  });
 });
 describe('BYOK_SENSEAUDIO_TOOLS — video', () => {
  it('exposes a generate_video tool definition with the documented param surface', () => {
    const video = BYOK_SENSEAUDIO_TOOLS.find(
      (t) => t.function.name === 'generate_video',
    );
    expect(video).toBeDefined();
    const props = video!.function.parameters.properties as Record<string, any>;
    expect(video!.function.parameters.required).toEqual(['prompt']);
    expect(props.aspect_ratio.enum).toEqual(['16:9', '9:16', '4:3', '3:4', '1:1']);
    expect(props.resolution.enum).toEqual(['480p', '720p', '1080p']);
    expect(props.duration).toMatchObject({ type: 'integer', minimum: 4, maximum: 15 });
    expect(props.generate_audio.type).toBe('boolean');
  });
 });
 describe('executeGenerateVideo', () => {
  let root: string;
  let projectsRoot: string;
  const PROJECT_ID = 'test-project';
  const realFetch = globalThis.fetch;
  beforeEach(async () => {
    root = await mkdtemp(path.join(tmpdir(), 'od-byok-video-'));
    projectsRoot = path.join(root, 'projects');
  });
  afterEach(async () => {
    globalThis.fetch = realFetch;
    vi.unstubAllGlobals();
    await rm(root, { recursive: true, force: true });
  });
  const baseCtx = () => ({
    projectRoot: root,
    projectsRoot,
    projectId: PROJECT_ID,
    upstreamApiKey: 'sa-byok-key',
    upstreamBaseUrl: 'https://api.senseaudio.cn',
    // Keep tests fast — 1 ms between polls instead of the production 5 s.
    videoPollIntervalMs: 1,
  });
  it('creates, polls until completed, downloads, and writes the mp4 into the project folder', async () => {
    const mp4Bytes = Buffer.from([0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70]);
    let pollCount = 0;
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url === 'https://api.senseaudio.cn/v1/video/create') {
        expect(init?.method).toBe('POST');
        expect(init?.headers).toMatchObject({
          authorization: 'Bearer sa-byok-key',
          'content-type': 'application/json',
        });
        const body = JSON.parse(String(init?.body));
        expect(body).toEqual({
          model: 'doubao-seedance-2-0-260128',
          content: [{ type: 'text', text: 'a sunset over the ocean' }],
          duration: 8,
          resolution: '1080p',
          ratio: '16:9',
          provider_specific: { generate_audio: true },
        });
        return new Response(
          JSON.stringify({ task_id: 'task-abc' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url.startsWith('https://api.senseaudio.cn/v1/video/status?id=task-abc')) {
        pollCount++;
        if (pollCount === 1) {
          return new Response(
            JSON.stringify({ status: 'pending', progress: 0 }),
            { status: 200, headers: { 'content-type': 'application/json' } },
          );
        }
        if (pollCount === 2) {
          return new Response(
            JSON.stringify({ status: 'processing', progress: 50 }),
            { status: 200, headers: { 'content-type': 'application/json' } },
          );
        }
        return new Response(
          JSON.stringify({
            status: 'completed',
            progress: 100,
            video_url: 'https://cdn.example.test/video/done.mp4',
            duration: 8,
          }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url === 'https://cdn.example.test/video/done.mp4') {
        return new Response(mp4Bytes, {
          status: 200,
          headers: { 'content-type': 'video/mp4' },
        });
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo(
      {
        prompt: 'a sunset over the ocean',
        aspect_ratio: '16:9',
        duration: 8,
        resolution: '1080p',
        generate_audio: true,
      },
      baseCtx(),
    );
    expect(result.ok).toBe(true);
    expect(result.url).toMatch(
      new RegExp(`^/api/projects/${PROJECT_ID}/files/byok-video-[a-z0-9-]+\\.mp4$`),
    );
    // 1× create + 3× poll + 1× download = 5 fetches total.
    expect(fetchMock).toHaveBeenCalledTimes(5);
    expect(pollCount).toBe(3);
    const filename = result.url!.split('/').pop()!;
    const onDisk = await readFile(path.join(projectsRoot, PROJECT_ID, filename));
    expect(onDisk.equals(mp4Bytes)).toBe(true);
  });
  it('defaults duration / resolution / aspect when caller omits them', async () => {
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/video/create')) {
        const body = JSON.parse(String(init?.body));
        expect(body).toMatchObject({
          duration: 5,
          resolution: '720p',
          ratio: '16:9',
          provider_specific: { generate_audio: false },
        });
        return new Response(
          JSON.stringify({ task_id: 'task-defaults' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
        return new Response(
          JSON.stringify({
            status: 'completed',
            video_url: 'https://cdn.example.test/video/d.mp4',
          }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(Buffer.from([0x01]), { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo({ prompt: 'minimal' }, baseCtx());
    expect(result.ok).toBe(true);
  });
  it('clamps duration outside the 4–15 range and falls back to defaults for non-enum aspect_ratio / resolution', async () => {
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const url = String(input);
      if (url.endsWith('/v1/video/create')) {
        const body = JSON.parse(String(init?.body));
        // 99 → clamped to 15; 'octagonal' → falls back to '16:9';
        // '8k' → falls back to '720p'.
        expect(body).toMatchObject({
          duration: 15,
          resolution: '720p',
          ratio: '16:9',
        });
        return new Response(
          JSON.stringify({ task_id: 'task-clamp' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
        return new Response(
          JSON.stringify({
            status: 'completed',
            video_url: 'https://cdn.example.test/clamp.mp4',
          }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      return new Response(Buffer.from([0x02]), { status: 200 });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo(
      {
        prompt: 'overflow',
        duration: 99,
        aspect_ratio: 'octagonal',
        resolution: '8k',
      },
      baseCtx(),
    );
    expect(result.ok).toBe(true);
  });
  it('surfaces a failed status as a tool error so the model can apologize', async () => {
    const fetchMock = vi.fn(async (input: unknown) => {
      const url = String(input);
      if (url.endsWith('/v1/video/create')) {
        return new Response(
          JSON.stringify({ task_id: 'task-fail' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
        return new Response(
          JSON.stringify({
            status: 'failed',
            error_message: 'sensitive_content_blocked',
          }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo(
      { prompt: 'blocked content' },
      baseCtx(),
    );
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/senseaudio video failed/);
    expect(result.error).toMatch(/sensitive_content_blocked/);
  });
  it('times out after SENSEAUDIO_VIDEO_MAX_POLLS polls when the job stays pending', async () => {
    const fetchMock = vi.fn(async (input: unknown) => {
      const url = String(input);
      if (url.endsWith('/v1/video/create')) {
        return new Response(
          JSON.stringify({ task_id: 'task-stuck' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
        return new Response(
          JSON.stringify({ status: 'pending', progress: 0 }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo(
      { prompt: 'stuck job' },
      baseCtx(),
    );
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/timed out/);
    // 1× create + 120× poll = 121 fetches (10-min ceiling at 5 s
    // intervals — kept generous because doubao-seedance frequently
    // spends 3–8 min on the gateway for 1080p+audio jobs).
    expect(fetchMock).toHaveBeenCalledTimes(121);
  }, 30_000);
  it('returns a tool error when create response is missing task_id', async () => {
    const fetchMock = vi.fn(async () =>
      new Response('{"oops": true}', {
        status: 200,
        headers: { 'content-type': 'application/json' },
      }),
    );
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo({ prompt: 'x' }, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/missing task_id/);
  });
  it('returns a tool error when create call returns non-2xx', async () => {
    const fetchMock = vi.fn(async () =>
      new Response('unauthorized', {
        status: 401,
        headers: { 'content-type': 'text/plain' },
      }),
    );
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo({ prompt: 'x' }, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/senseaudio video create 401/);
  });
  it('rejects an unsafe projectId before any upstream call', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo(
      { prompt: 'x' },
      { ...baseCtx(), projectId: '../escape' },
    );
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/invalid projectId/);
    expect(fetchMock).not.toHaveBeenCalled();
  });
  it('rejects empty prompt before any upstream call', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const result = await executeGenerateVideo({}, baseCtx());
    expect(result.ok).toBe(false);
    expect(result.error).toMatch(/prompt is required/);
    expect(fetchMock).not.toHaveBeenCalled();
  });
 });
--- a/apps/daemon/tests/media-config.test.ts
+++ b/apps/daemon/tests/media-config.test.ts
@@ -8,6 +8,7 @@ import {
  readMaskedConfig,
  resolveModelAlias,
  resolveProviderConfig,
  seedProviderIfMissing,
  writeConfig,
 } from '../src/media-config.js';
@@ -868,3 +869,159 @@ describe('media-config model alias resolution (issue #1277)', () => {
    ).toBe('doubao-seedream-5-0');
  });
 });
 describe('seedProviderIfMissing', () => {
  let projectRoot: string;
  const SENSEAUDIO_ENV_KEYS = ['OD_SENSEAUDIO_API_KEY', 'SENSEAUDIO_API_KEY'];
  const originalEnv = Object.fromEntries(
    SENSEAUDIO_ENV_KEYS.map((key) => [key, process.env[key]]),
  );
  const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
  const originalDataDir = process.env.OD_DATA_DIR;
  beforeEach(async () => {
    projectRoot = await mkdtemp(path.join(tmpdir(), 'od-media-seed-'));
    for (const key of SENSEAUDIO_ENV_KEYS) {
      delete process.env[key];
    }
    delete process.env.OD_MEDIA_CONFIG_DIR;
    delete process.env.OD_DATA_DIR;
  });
  afterEach(async () => {
    for (const key of SENSEAUDIO_ENV_KEYS) {
      if (originalEnv[key] == null) {
        delete process.env[key];
      } else {
        process.env[key] = originalEnv[key];
      }
    }
    if (originalMediaConfigDir == null) {
      delete process.env.OD_MEDIA_CONFIG_DIR;
    } else {
      process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
    }
    if (originalDataDir == null) {
      delete process.env.OD_DATA_DIR;
    } else {
      process.env.OD_DATA_DIR = originalDataDir;
    }
    await rm(projectRoot, { recursive: true, force: true });
  });
  async function writeStored(data: unknown) {
    const file = path.join(projectRoot, '.od', 'media-config.json');
    await mkdir(path.dirname(file), { recursive: true });
    await writeFile(file, JSON.stringify(data), 'utf8');
  }
  async function readStoredJson(): Promise<unknown> {
    const file = path.join(projectRoot, '.od', 'media-config.json');
    const raw = await readFile(file, 'utf8');
    return JSON.parse(raw);
  }
  it('writes a fresh entry when the slot is empty', async () => {
    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
      apiKey: 'sa-test-key',
      baseUrl: 'https://api.senseaudio.cn',
    });
    expect(wrote).toBe(true);
    const stored = await readStoredJson();
    expect(stored).toEqual({
      providers: {
        senseaudio: {
          apiKey: 'sa-test-key',
          baseUrl: 'https://api.senseaudio.cn',
        },
      },
    });
  });
  it('no-ops and preserves the stored key when one is already configured', async () => {
    await writeStored({
      providers: {
        senseaudio: { apiKey: 'pre-existing-key', baseUrl: 'https://existing.example' },
      },
    });
    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
      apiKey: 'newer-byok-key',
      baseUrl: 'https://api.senseaudio.cn',
    });
    expect(wrote).toBe(false);
    const stored = (await readStoredJson()) as { providers: Record<string, unknown> };
    expect(stored.providers.senseaudio).toEqual({
      apiKey: 'pre-existing-key',
      baseUrl: 'https://existing.example',
    });
  });
  it('preserves every other provider and aliases when seeding', async () => {
    await writeStored({
      providers: {
        openai: { apiKey: 'sk-openai', baseUrl: 'https://api.openai.com/v1' },
        volcengine: { apiKey: 'ark-key', baseUrl: 'https://ark.cn-beijing.volces.com/api/v3' },
      },
      aliases: { 'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0' },
    });
    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
      apiKey: 'sa-new',
    });
    expect(wrote).toBe(true);
    const stored = (await readStoredJson()) as {
      providers: Record<string, unknown>;
      aliases: Record<string, string>;
    };
    expect(stored.providers.openai).toEqual({
      apiKey: 'sk-openai',
      baseUrl: 'https://api.openai.com/v1',
    });
    expect(stored.providers.volcengine).toEqual({
      apiKey: 'ark-key',
      baseUrl: 'https://ark.cn-beijing.volces.com/api/v3',
    });
    expect(stored.providers.senseaudio).toEqual({ apiKey: 'sa-new' });
    expect(stored.aliases).toEqual({
      'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0',
    });
  });
  it('no-ops when an env var resolves a key for the provider', async () => {
    process.env.OD_SENSEAUDIO_API_KEY = 'env-key';
    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
      apiKey: 'sa-byok-key',
      baseUrl: 'https://api.senseaudio.cn',
    });
    expect(wrote).toBe(false);
    // Nothing was seeded, so the config file was never created and the read rejects.
    await expect(readStoredJson()).rejects.toThrow();
  });
  it('no-ops on empty apiKey', async () => {
    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
      apiKey: '',
      baseUrl: 'https://api.senseaudio.cn',
    });
    expect(wrote).toBe(false);
    await expect(readStoredJson()).rejects.toThrow();
  });
  it('no-ops for unknown provider ids', async () => {
    const wrote = await seedProviderIfMissing(projectRoot, 'not-a-provider', {
      apiKey: 'whatever',
    });
    expect(wrote).toBe(false);
    await expect(readStoredJson()).rejects.toThrow();
  });
  it('resolves the seeded key through resolveProviderConfig', async () => {
    await seedProviderIfMissing(projectRoot, 'senseaudio', {
      apiKey: 'sa-final',
      baseUrl: 'https://api.senseaudio.cn',
    });
    const resolved = await resolveProviderConfig(projectRoot, 'senseaudio');
    expect(resolved).toEqual({
      apiKey: 'sa-final',
      baseUrl: 'https://api.senseaudio.cn',
    });
  });
 });
--- a/apps/daemon/tests/media-senseaudio-image.test.ts
+++ b/apps/daemon/tests/media-senseaudio-image.test.ts
@@ -0,0 +1,305 @@
 import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import path from 'node:path';
 import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
 import { generateMedia } from '../src/media.js';
 const TEST_SENSEAUDIO_BASE_URL = 'https://senseaudio-gateway.example.test';
 const TEST_IMAGE_URL = 'https://cdn.example.test/generated/abc.png';
 const TEST_IMAGE_BYTES = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x01]);
 function buildOkResponse(url = TEST_IMAGE_URL) {
  return new Response(
    JSON.stringify({ url, base_resp: { status_code: 0, status_msg: 'success' } }),
    { status: 200, headers: { 'content-type': 'application/json' } },
  );
 }
 function buildImageFetchResponse(bytes: Buffer) {
  return new Response(bytes, {
    status: 200,
    headers: { 'content-type': 'image/png' },
  });
 }
 describe('senseaudio image generation', () => {
  let root: string;
  let projectRoot: string;
  let projectsRoot: string;
  const realFetch = globalThis.fetch;
  const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
  const originalDataDir = process.env.OD_DATA_DIR;
  beforeEach(async () => {
    root = await mkdtemp(path.join(tmpdir(), 'od-senseaudio-image-'));
    projectRoot = path.join(root, 'project-root');
    projectsRoot = path.join(projectRoot, '.od', 'projects');
    await mkdir(projectsRoot, { recursive: true });
    delete process.env.OD_MEDIA_CONFIG_DIR;
    delete process.env.OD_DATA_DIR;
    delete process.env.OD_SENSEAUDIO_API_KEY;
    delete process.env.SENSEAUDIO_API_KEY;
  });
  afterEach(async () => {
    globalThis.fetch = realFetch;
    if (originalMediaConfigDir == null) {
      delete process.env.OD_MEDIA_CONFIG_DIR;
    } else {
      process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
    }
    if (originalDataDir == null) {
      delete process.env.OD_DATA_DIR;
    } else {
      process.env.OD_DATA_DIR = originalDataDir;
    }
    delete process.env.OD_SENSEAUDIO_API_KEY;
    delete process.env.SENSEAUDIO_API_KEY;
    await rm(root, { recursive: true, force: true });
  });
  async function writeConfig(data: unknown) {
    const file = path.join(projectRoot, '.od', 'media-config.json');
    await mkdir(path.dirname(file), { recursive: true });
    await writeFile(file, JSON.stringify(data), 'utf8');
  }
  it('renders a SenseAudio image with the documented sync defaults', async () => {
    await writeConfig({
      providers: {
        senseaudio: {
          apiKey: 'sense-test-key',
          baseUrl: TEST_SENSEAUDIO_BASE_URL,
        },
      },
    });
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const urlStr = String(input);
      if (urlStr === `${TEST_SENSEAUDIO_BASE_URL}/v1/image/sync`) {
        expect(init?.method).toBe('POST');
        expect(init?.headers).toMatchObject({
          authorization: 'Bearer sense-test-key',
          'content-type': 'application/json',
        });
        expect(JSON.parse(String(init?.body))).toEqual({
          model: 'senseaudio-image-2.0-260319',
          prompt: 'A magazine-style hero poster.',
          size: '1024x1024',
        });
        return buildOkResponse();
      }
      if (urlStr === TEST_IMAGE_URL) {
        return buildImageFetchResponse(TEST_IMAGE_BYTES);
      }
      throw new Error(`unexpected fetch: ${urlStr}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'senseaudio-image-2.0-260319',
      prompt: 'A magazine-style hero poster.',
      output: 'sa-hero.png',
    });
    expect(fetchMock).toHaveBeenCalledTimes(2);
    expect(result.providerId).toBe('senseaudio');
    expect(result.providerNote).toContain('senseaudio/senseaudio-image-2.0-260319');
    expect(result.providerNote).toContain('1024x1024');
    const bytes = await readFile(path.join(projectsRoot, 'project-1', 'sa-hero.png'));
    expect(bytes.equals(TEST_IMAGE_BYTES)).toBe(true);
  });
  it('maps aspect ratios to the SenseAudio size strings', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const urlStr = String(input);
      if (urlStr === `${TEST_SENSEAUDIO_BASE_URL}/v1/image/sync`) {
        expect(JSON.parse(String(init?.body)).size).toBe('1280x720');
        return buildOkResponse();
      }
      return buildImageFetchResponse(TEST_IMAGE_BYTES);
    });
    vi.stubGlobal('fetch', fetchMock);
    await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'senseaudio-image-1.0-260319',
      aspect: '16:9',
      prompt: 'Widescreen banner.',
      output: 'sa-banner.png',
    });
    expect(fetchMock).toHaveBeenCalledTimes(2);
  });
  it('falls back to the canonical base URL when none is configured', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key' },
      },
    });
    const fetchMock = vi.fn(async (input: unknown) => {
      const urlStr = String(input);
      if (urlStr === 'https://api.senseaudio.cn/v1/image/sync') {
        return buildOkResponse();
      }
      return buildImageFetchResponse(TEST_IMAGE_BYTES);
    });
    vi.stubGlobal('fetch', fetchMock);
    await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'doubao-seedream-5-0-260128',
      prompt: 'Default base url.',
      output: 'sa-default-base.png',
    });
    expect(fetchMock).toHaveBeenCalledTimes(2);
  });
  it('reads the API key from OD_SENSEAUDIO_API_KEY when storage is empty', async () => {
    process.env.OD_SENSEAUDIO_API_KEY = 'env-sense-key';
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      if (String(input).endsWith('/v1/image/sync')) {
        expect(init?.headers).toMatchObject({ authorization: 'Bearer env-sense-key' });
        return buildOkResponse();
      }
      return buildImageFetchResponse(TEST_IMAGE_BYTES);
    });
    vi.stubGlobal('fetch', fetchMock);
    await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'senseaudio-image-2.0-260319',
      prompt: 'Env-only key.',
      output: 'sa-env.png',
    });
    expect(fetchMock).toHaveBeenCalledTimes(2);
  });
  it('errors when no API key is configured', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Should fail.',
        output: 'sa-no-key.png',
      }),
    ).rejects.toThrow(/no SenseAudio API key/);
    expect(fetchMock).not.toHaveBeenCalled();
  });
  it('surfaces HTTP-level failures with the status code and truncated body', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });
    const fetchMock = vi.fn(async () =>
      new Response('unauthorized', {
        status: 401,
        headers: { 'content-type': 'text/plain' },
      }),
    );
    vi.stubGlobal('fetch', fetchMock);
    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Bad auth.',
        output: 'sa-401.png',
      }),
    ).rejects.toThrow('senseaudio image 401: unauthorized');
  });
  it('surfaces upstream error_message verbatim when the body reports failure', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });
    const fetchMock = vi.fn(async () =>
      new Response(
        JSON.stringify({ error_message: 'sensitive_content_blocked' }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);
    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Blocked.',
        output: 'sa-blocked.png',
      }),
    ).rejects.toThrow('senseaudio image api error: sensitive_content_blocked');
  });
  it('errors when the response body is missing the image url', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });
    const fetchMock = vi.fn(async () =>
      new Response(
        JSON.stringify({ base_resp: { status_code: 0, status_msg: 'success' } }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);
    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Missing url.',
        output: 'sa-missing-url.png',
      }),
    ).rejects.toThrow('senseaudio image response missing url');
  });
 });
--- a/apps/daemon/tests/proxy-routes.test.ts
+++ b/apps/daemon/tests/proxy-routes.test.ts
@@ -523,6 +523,497 @@ describe('API proxy routes', () => {
    expect(upstreamInit?.redirect).toBe('error');
  });
  it('streams delta + end for SenseAudio chat completions', async () => {
    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      return Promise.resolve(sseResponse([
        'data: {"choices":[{"delta":{"content":"sense"}}]}',
        '',
        'data: [DONE]',
        '',
      ].join('\n')));
    });
    vi.stubGlobal('fetch', fetchMock);
    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        baseUrl: 'https://api.senseaudio.cn',
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'hello' }],
      }),
    });
    await expect(res.text()).resolves.toContain('event: delta\ndata: {"delta":"sense"}');
    expect(fetchMock).toHaveBeenCalledWith(
      'https://api.senseaudio.cn/v1/chat/completions',
      expect.objectContaining({
        headers: expect.objectContaining({ Authorization: 'Bearer sa-test' }),
        redirect: 'error',
      }),
    );
  });
  it('defaults SenseAudio base URL to api.senseaudio.cn when caller omits it', async () => {
    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      return Promise.resolve(sseResponse('data: [DONE]\n\n'));
    });
    vi.stubGlobal('fetch', fetchMock);
    await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'hi' }],
      }),
    });
    expect(String(fetchMock.mock.calls[0]![0])).toBe(
      'https://api.senseaudio.cn/v1/chat/completions',
    );
  });
  it('rejects SenseAudio requests that omit apiKey or model', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const missingKey = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
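      body: JSON.stringify({
        // projectId is supplied so the 400 can only come from the missing apiKey.
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'hi' }],
      }),
    });
    expect(missingKey.status).toBe(400);
    const missingModel = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        apiKey: 'sa-test',
        // projectId is supplied so the 400 can only come from the missing model.
        projectId: 'test-project',
        messages: [{ role: 'user', content: 'hi' }],
      }),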
    });
    expect(missingModel.status).toBe(400);
    expect(fetchMock).not.toHaveBeenCalled();
  });
  it('disables upstream redirects for senseaudio proxy requests', async () => {
    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      return Promise.resolve(sseResponse('data: [DONE]\n\n'));
    });
    vi.stubGlobal('fetch', fetchMock);
    await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        baseUrl: 'https://api.senseaudio.cn',
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'model-one',
        messages: [{ role: 'user', content: 'hi' }],
      }),
    });
    const upstreamCall = fetchMock.mock.calls.find(([input]) =>
      !String(input).startsWith(baseUrl),
    );
    expect(upstreamCall).toBeDefined();
    const upstreamInit = upstreamCall![1] as FetchInit;
    expect(upstreamInit?.redirect).toBe('error');
  });
  it('injects generate_image tool definition on every SenseAudio request', async () => {
    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      return Promise.resolve(sseResponse([
        'data: {"choices":[{"delta":{"content":"ok"}}]}',
        '',
        'data: [DONE]',
        '',
      ].join('\n')));
    });
    vi.stubGlobal('fetch', fetchMock);
    await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        baseUrl: 'https://api.senseaudio.cn',
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'hi' }],
      }),
    });
    const upstreamCall = fetchMock.mock.calls.find(([input]) =>
      !String(input).startsWith(baseUrl),
    );
    expect(upstreamCall).toBeDefined();
    const body = JSON.parse(String((upstreamCall![1] as FetchInit)?.body));
    expect(body.tool_choice).toBe('auto');
    expect(Array.isArray(body.tools)).toBe(true);
    expect(body.tools[0]).toMatchObject({
      type: 'function',
      function: { name: 'generate_image' },
    });
  });
  it('runs the BYOK image tool loop end-to-end', async () => {
    const pngBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x01]);
    const upstreamChatBodies: any[] = [];
    let chatCallIndex = 0;
    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      // SenseAudio image generation
      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
        return new Response(
          JSON.stringify({
            url: 'https://cdn.example.test/cat.png',
            base_resp: { status_code: 0, status_msg: 'success' },
          }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      // Image bytes download (initiated by the tool, not via the proxy)
      if (url === 'https://cdn.example.test/cat.png') {
        return new Response(pngBytes, {
          status: 200,
          headers: { 'content-type': 'image/png' },
        });
      }
      // Upstream chat completions — capture bodies, return different SSE per call
      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
        upstreamChatBodies.push(JSON.parse(String(init?.body || '{}')));
        chatCallIndex++;
        if (chatCallIndex === 1) {
          // First turn: model decides to call generate_image
          return sseResponse([
            'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":null,"tool_calls":[{"index":0,"id":"call_abc","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"a cat\\"}"}}]},"finish_reason":null}]}',
            '',
            'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
            '',
            'data: [DONE]',
            '',
          ].join('\n'));
        }
        // Second turn: model summarises with image embedded in markdown
        return sseResponse([
          'data: {"choices":[{"index":0,"delta":{"content":"Here is your cat: "}}]}',
          '',
          'data: {"choices":[{"index":0,"delta":{"content":"![cat](generated)"}}]}',
          '',
          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
          '',
          'data: [DONE]',
          '',
        ].join('\n'));
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        baseUrl: 'https://api.senseaudio.cn',
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'draw a cat' }],
      }),
    });
    expect(res.status).toBe(200);
    const body = await res.text();
    // Final assistant text streams through to the client
    expect(body).toContain('event: delta');
    expect(body).toContain('Here is your cat');
    expect(body).toContain('![cat](generated)');
    expect(body).toContain('event: end');
    // Two upstream chat completions calls happened (loop ran exactly once)
    expect(upstreamChatBodies).toHaveLength(2);
    // Second upstream call includes assistant{tool_calls} + tool{result}
    const secondMessages = upstreamChatBodies[1].messages;
    expect(secondMessages).toHaveLength(3);
    expect(secondMessages[0]).toEqual({ role: 'user', content: 'draw a cat' });
    expect(secondMessages[1]).toMatchObject({
      role: 'assistant',
      content: null,
      tool_calls: [
        {
          id: 'call_abc',
          type: 'function',
          function: {
            name: 'generate_image',
            arguments: '{"prompt":"a cat"}',
          },
        },
      ],
    });
    expect(secondMessages[2]).toMatchObject({
      role: 'tool',
      tool_call_id: 'call_abc',
      content: expect.stringMatching(
        /Image generated successfully\. URL: \/api\/projects\/test-project\/files\/byok-[a-z0-9-]+\.png/,
      ),
    });
  });
  it('feeds a tool error message back to the model when generate_image fails', async () => {
    const upstreamChatBodies: any[] = [];
    let chatCallIndex = 0;
    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
        return new Response(
          JSON.stringify({ error_message: 'sensitive_content_blocked' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
        upstreamChatBodies.push(JSON.parse(String(init?.body || '{}')));
        chatCallIndex++;
        if (chatCallIndex === 1) {
          return sseResponse([
            'data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_err","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"...\\"}"}}]},"finish_reason":null}]}',
            '',
            'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
            '',
            'data: [DONE]',
            '',
          ].join('\n'));
        }
        return sseResponse([
          'data: {"choices":[{"index":0,"delta":{"content":"Sorry, that one was blocked."}}]}',
          '',
          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
          '',
          'data: [DONE]',
          '',
        ].join('\n'));
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        baseUrl: 'https://api.senseaudio.cn',
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'draw something blocked' }],
      }),
    });
    expect(res.status).toBe(200);
    const body = await res.text();
    expect(body).toContain('Sorry, that one was blocked');
    expect(upstreamChatBodies).toHaveLength(2);
    const toolMsg = upstreamChatBodies[1].messages[2];
    expect(toolMsg.role).toBe('tool');
    expect(toolMsg.tool_call_id).toBe('call_err');
    expect(toolMsg.content).toMatch(/Image generation failed/);
    expect(toolMsg.content).toMatch(/sensitive_content_blocked/);
  });
  it('bounds the BYOK tool loop at MAX_BYOK_TOOL_LOOPS=3', async () => {
    let chatCallIndex = 0;
    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/x.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url === 'https://cdn.example.test/x.png') {
        return new Response(Buffer.from([0x89, 0x50]), { status: 200 });
      }
      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
        chatCallIndex++;
        // Always return tool_calls — the model never returns text
        return sseResponse([
          `data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_${chatCallIndex}","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"x\\"}"}}]},"finish_reason":null}]}`,
          '',
          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
          '',
          'data: [DONE]',
          '',
        ].join('\n'));
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        baseUrl: 'https://api.senseaudio.cn',
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'infinite' }],
      }),
    });
    expect(res.status).toBe(200);
    const body = await res.text();
    expect(body).toContain('event: end');
    // Loop ran exactly MAX_BYOK_TOOL_LOOPS times before bailing.
    expect(chatCallIndex).toBe(3);
  });
  it('writes the generated image into the project folder and serves it via /api/projects/:id/files/*', async () => {
    const pngBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x42, 0x59]);
    let capturedUrl: string | undefined;
    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
      const url = String(input);
      if (url.startsWith(baseUrl)) return realFetch(input, init);
      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
        return new Response(
          JSON.stringify({ url: 'https://cdn.example.test/served.png' }),
          { status: 200, headers: { 'content-type': 'application/json' } },
        );
      }
      if (url === 'https://cdn.example.test/served.png') {
        return new Response(pngBytes, { status: 200 });
      }
      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
        const body = JSON.parse(String(init?.body || '{}'));
        // Capture URL the tool produced from the second turn's tool message.
        const toolMsg = body.messages?.find((m: any) => m.role === 'tool');
        if (toolMsg) {
          const match = /URL: (\/api\/projects\/[A-Za-z0-9._-]+\/files\/byok-[a-z0-9-]+\.png)/.exec(toolMsg.content);
          if (match) capturedUrl = match[1];
        }
        const isToolTurn = !toolMsg;
        if (isToolTurn) {
          return sseResponse([
            'data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_serve","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"s\\"}"}}]},"finish_reason":null}]}',
            '',
            'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
            '',
            'data: [DONE]',
            '',
          ].join('\n'));
        }
        return sseResponse([
          'data: {"choices":[{"index":0,"delta":{"content":"done"}}]}',
          '',
          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
          '',
          'data: [DONE]',
          '',
        ].join('\n'));
      }
      throw new Error(`unexpected fetch: ${url}`);
    });
    vi.stubGlobal('fetch', fetchMock);
    const proxyRes = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        baseUrl: 'https://api.senseaudio.cn',
        apiKey: 'sa-test',
        projectId: 'test-project',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'gen' }],
      }),
    });
    // Drain the SSE body so the tool loop fully completes before we assert.
    await proxyRes.text();
    expect(capturedUrl).toBeDefined();
    // The URL the tool emits is relative: in production it is same-origin
    // via the Next.js rewrite, and here it hits this test server directly.
    // We GET the captured URL through the standard project file route
    // and assert the bytes come back. This proves both halves:
    // (1) the image landed in <projectsRoot>/<projectId>/ as expected
    //     (so listFiles / FileViewer / archive will find it), and
    // (2) /api/projects/:id/files/* serves it without needing any
    //     byok-specific route.
    const imgRes = await realFetch(`${baseUrl}${capturedUrl!}`);
    expect(imgRes.status).toBe(200);
    expect(imgRes.headers.get('content-type')).toMatch(/^image\/png/);
    const served = Buffer.from(await imgRes.arrayBuffer());
    expect(served.equals(pngBytes)).toBe(true);
  });
  it('rejects senseaudio chat requests without a projectId', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        apiKey: 'sa-test',
        model: 'senseaudio-s2',
        messages: [{ role: 'user', content: 'hi' }],
        // no projectId — should 400
      }),
    });
    expect(res.status).toBe(400);
    expect(fetchMock).not.toHaveBeenCalled();
  });
  it('rejects senseaudio chat requests with an unsafe projectId', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);
    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({
        apiKey: 'sa-test',
        model: 'senseaudio-s2',
        projectId: '../etc/passwd',
        messages: [{ role: 'user', content: 'hi' }],
      }),
    });
    expect(res.status).toBe(400);
    expect(fetchMock).not.toHaveBeenCalled();
  });
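The two rejection tests above only pin down the observable behavior (HTTP 400, no upstream fetch). The daemon's actual check is not part of this hunk, so the following is a minimal sketch of one way such a guard could look. `isSafeProjectId` and `PROJECT_ID_PATTERN` are hypothetical names, and the character class is borrowed from the URL regex the serving test uses.

```typescript
// Hypothetical validator; the daemon's real check is not shown in this diff.
// Character class assumed from the test's URL regex: [A-Za-z0-9._-]+
const PROJECT_ID_PATTERN = /^[A-Za-z0-9._-]+$/;

function isSafeProjectId(value: unknown): value is string {
  // A missing or non-string projectId is rejected outright.
  if (typeof value !== 'string') return false;
  // '.' and '..' pass the character class but are path navigation.
  if (value === '.' || value === '..') return false;
  // Anything containing '/', '\\', or other characters fails the pattern.
  return PROJECT_ID_PATTERN.test(value);
}

const projectIdChecks = {
  valid: isSafeProjectId('test-project'),
  traversal: isSafeProjectId('../etc/passwd'),
  missing: isSafeProjectId(undefined),
  dotdot: isSafeProjectId('..'),
};
console.log(JSON.stringify(projectIdChecks));
```

Under this assumption a route handler would return 400 before touching the filesystem or calling the upstream API, which is what `expect(fetchMock).not.toHaveBeenCalled()` asserts.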
  // Plan §3.A4 / spec §11.8 (e2e-7): the API-fallback proxy paths must
  // never carry plugin context. The web sidecar's fallback mode bypasses
  // the daemon snapshot bus, so any pluginId / appliedPluginSnapshotId in
@ -534,6 +1025,7 @@ describe('API proxy routes', () => {
      '/api/proxy/openai/stream',
      '/api/proxy/azure/stream',
      '/api/proxy/google/stream',
      '/api/proxy/senseaudio/stream',
    ];
    for (const path of proxies) {
--- a/apps/web/src/components/ChatComposer.tsx
+++ b/apps/web/src/components/ChatComposer.tsx
@ -14,6 +14,7 @@ import {
  trackStudioClickChatComposer,
  trackStudioViewChatPanel,
 } from '../analytics/events';
 import { IMAGE_MODELS } from "../media/models";
 import { projectRawUrl, uploadProjectFiles, openFolderDialog, fetchConnectors } from "../providers/registry";
 import { patchProject } from "../state/projects";
 import { fetchMcpServers } from "../state/mcp";
@ -126,6 +127,14 @@ interface Props {
  researchAvailable?: boolean;
  projectMetadata?: ProjectMetadata;
  onProjectMetadataChange?: (metadata: ProjectMetadata) => void;
  // SenseAudio BYOK image-model picker shown above the textarea. Hidden
  // when the active chat protocol is anything other than 'senseaudio',
  // so the composer stays clean for every other BYOK tab. The state
  // owner is ProjectView (per-session, reset on refresh); ChatComposer
  // is a fully controlled select.
  byokApiProtocol?: AppConfig['apiProtocol'];
  byokImageModel?: string;
  onChangeByokImageModel?: (model: string) => void;
  currentSkillId?: string | null;
  onProjectSkillChange?: (skillId: string | null) => void;
  // Set when the project was created with a plugin already pinned
@ -188,6 +197,9 @@ export const ChatComposer = forwardRef<ChatComposerHandle, Props>(
      researchAvailable = false,
      projectMetadata,
      onProjectMetadataChange,
      byokApiProtocol,
      byokImageModel,
      onChangeByokImageModel,
      currentSkillId = null,
      onProjectSkillChange,
      pinnedPluginId = null,
@ -1186,6 +1198,53 @@ export const ChatComposer = forwardRef<ChatComposerHandle, Props>(
              t={t}
            />
          ) : null}
          {byokApiProtocol === 'senseaudio' && onChangeByokImageModel ? (
            <div
              className="composer-byok-image-model"
              data-testid="composer-byok-image-model"
              style={{
                display: 'flex',
                alignItems: 'center',
                gap: 8,
                padding: '4px 8px',
                fontSize: 12,
                color: 'var(--text-muted, #888)',
              }}
            >
              <Icon name="image" size={13} />
              <label
                htmlFor="composer-byok-image-model-select"
                style={{ flexShrink: 0 }}
              >
                {t('settings.byokImageModel')}
              </label>
              <select
                id="composer-byok-image-model-select"
                value={byokImageModel ?? ''}
                onChange={(e) => onChangeByokImageModel(e.target.value)}
                style={{
                  background: 'transparent',
                  border: '1px solid var(--border, #444)',
                  borderRadius: 4,
                  padding: '2px 6px',
                  color: 'inherit',
                  fontSize: 12,
                }}
              >
                <option value="">
                  {(IMAGE_MODELS.find((m) => m.provider === 'senseaudio')?.label
                    ?? 'senseaudio-image-2.0') + ' (default)'}
                </option>
                {IMAGE_MODELS.filter((m) => m.provider === 'senseaudio').map(
                  (m) => (
                    <option key={m.id} value={m.id}>
                      {m.label}
                    </option>
                  ),
                )}
              </select>
            </div>
          ) : null}
          {/*
            Spec §8.4 — context bar above the composer input. The
            section now behaves as a pure context bar: it renders the
--- a/apps/web/src/components/ChatPane.tsx
+++ b/apps/web/src/components/ChatPane.tsx
@ -279,6 +279,12 @@ interface Props {
  // message" without forcing a separate side widget.
  activePluginSnapshot?: AppliedPluginSnapshot | null;
  onCollapse?: () => void;
  // SenseAudio BYOK only — wired straight through to ChatComposer for the
  // in-composer image-model picker. Active protocol is read so the picker
  // hides when the user is on any other BYOK tab (azure / openai / …).
  byokApiProtocol?: AppConfig['apiProtocol'];
  byokImageModel?: string;
  onChangeByokImageModel?: (model: string) => void;
 }
 type Tab = 'chat' | 'comments';
@ -327,6 +333,9 @@ export function ChatPane({
  activePluginSnapshot,
  skills = [],
  onCollapse,
  byokApiProtocol,
  byokImageModel,
  onChangeByokImageModel,
 }: Props) {
  const t = useT();
  const logRef = useRef<HTMLDivElement | null>(null);
@ -872,6 +881,9 @@ export function ChatPane({
            researchAvailable={researchAvailable}
            projectMetadata={projectMetadata}
            onProjectMetadataChange={onProjectMetadataChange}
            byokApiProtocol={byokApiProtocol}
            byokImageModel={byokImageModel}
            onChangeByokImageModel={onChangeByokImageModel}
            currentSkillId={currentSkillId}
            onProjectSkillChange={onProjectSkillChange}
            pinnedPluginId={activePluginSnapshot?.pluginId ?? null}
--- a/apps/web/src/components/DesignFilesPanel.tsx
+++ b/apps/web/src/components/DesignFilesPanel.tsx
@ -1192,7 +1192,14 @@ export function DesignFilesPanel({
        </div>
      </div>
      {preview && previewFile ? (
        // Key on the file name so React unmounts the previous DfPreview
        // (and its iframe / image element) when the user clicks a
        // different file. Without this, React diffing reuses the same
        // iframe DOM node and the browser keeps showing the first
        // file's contents — only the `src` prop changes but the iframe
        // never actually navigates.
        <DfPreview
          key={previewFile.name}
          projectId={projectId}
          file={previewFile}
          onOpen={() => onOpenFile(previewFile.name)}
--- a/apps/web/src/components/ProjectView.tsx
+++ b/apps/web/src/components/ProjectView.tsx
@ -486,6 +486,15 @@ export function ProjectView({
  const [liveArtifacts, setLiveArtifacts] = useState<LiveArtifactSummary[]>([]);
  const [liveArtifactEvents, setLiveArtifactEvents] = useState<LiveArtifactEventItem[]>([]);
  const [workspaceFocused, setWorkspaceFocused] = useState(false);
  // Per-session override for the BYOK SenseAudio chat's generate_image
  // tool. Seeded once from Settings (config.byokImageModel) so the
  // composer dropdown opens on the user's chosen default; subsequent
  // selections live only in this component's state — page refresh /
  // project switch resets to the Settings default. Persistent defaults
  // live in Settings → BYOK → SenseAudio → Image generation model.
  const [byokImageModelOverride, setByokImageModelOverride] = useState<string>(
    config.byokImageModel ?? '',
  );
  // `closed` → no surface; `review` → read-only saved-state panel with a
  // preview + reopen-to-edit action (#1822); `edit` → the textarea editor.
  const [instructionsMode, setInstructionsMode] = useState<'closed' | 'review' | 'edit'>('closed');
@ -2202,6 +2211,13 @@ export function ProjectView({
            });
          },
          onError: handlers.onError,
        }, {
          projectId: project.id,
          // SenseAudio BYOK chat reads this to pre-fill the tool param's
          // default model. Prefer the live composer override; fall back
          // to the Settings default when the composer dropdown is on
          // "use default". Other protocols ignore unknown body fields.
          byokImageModel: byokImageModelOverride || config.byokImageModel,
        });
      }
    },
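The override-then-default resolution in the hunk above can be sketched in isolation. `resolveByokImageModel` is a hypothetical name (the PR inlines the expression at the call site); the model ids are taken from the `IMAGE_MODELS` entries added in this PR.

```typescript
// Hypothetical helper; the PR writes this expression inline in ProjectView.
function resolveByokImageModel(
  composerOverride: string,
  settingsDefault: string | undefined,
): string | undefined {
  // '' means the composer dropdown sits on its "(default)" option, so `||`
  // (rather than `??`) is what lets the Settings value take over.
  return composerOverride || settingsDefault;
}

const fromComposer = resolveByokImageModel(
  'doubao-seedream-5-0-260128',
  'senseaudio-image-1.0-260319',
);
const fromSettings = resolveByokImageModel('', 'senseaudio-image-1.0-260319');
const unset = resolveByokImageModel('', undefined);
console.log(fromComposer, fromSettings, unset);
```

When both values are empty the field is falsy, so `streamProxyEndpoint` omits it from the request body and the daemon falls back to its registry default.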
@ -3375,6 +3391,9 @@ export function ProjectView({
              onTogglePet={onTogglePet}
              onOpenPetSettings={onOpenPetSettings}
              researchAvailable={config.mode === 'daemon'}
              byokApiProtocol={config.apiProtocol}
              byokImageModel={byokImageModelOverride}
              onChangeByokImageModel={setByokImageModelOverride}
              projectMetadata={project.metadata}
              onProjectMetadataChange={(metadata) => {
                onProjectChange({ ...project, metadata });
--- a/apps/web/src/components/SettingsDialog.tsx
+++ b/apps/web/src/components/SettingsDialog.tsx
@ -68,7 +68,7 @@ import type {
 import { testAgent, testApiProvider } from '../providers/connection-test';
 import { fetchProviderModels } from '../providers/provider-models';
 import { fetchConnectors, fetchDesignTemplates } from '../providers/registry';
-import { MEDIA_PROVIDERS } from '../media/models';
+import { IMAGE_MODELS, MEDIA_PROVIDERS } from '../media/models';
 import { XaiOAuthControl } from './XaiOAuthControl';
 import type { MediaProvider } from '../media/models';
 import { Toast } from './Toast';
@ -444,6 +444,7 @@ function currentApiProtocolConfig(config: AppConfig): ApiProtocolConfig {
    model: config.model,
    apiVersion: config.apiVersion ?? '',
    apiProviderBaseUrl: config.apiProviderBaseUrl ?? null,
    byokImageModel: config.byokImageModel ?? '',
  };
 }
@ -460,6 +461,11 @@ function applyApiProtocolConfig(
    model: apiConfig.model,
    apiProviderBaseUrl: apiConfig.apiProviderBaseUrl ?? null,
    apiVersion: protocol === 'azure' ? (apiConfig.apiVersion ?? '') : '',
    // byokImageModel is SenseAudio-only — flipping to another BYOK tab
    // shouldn't carry a SenseAudio image-model choice into, say, the
    // OpenAI form. Mirrors the apiVersion guarding above.
    byokImageModel:
      protocol === 'senseaudio' ? (apiConfig.byokImageModel ?? '') : '',
  };
 }
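The guarding in `applyApiProtocolConfig` can be illustrated on its own. This is a reduced sketch, not the full function: `scopeToProtocol` and `ProtocolScopedFields` are hypothetical names, and `protocol` is typed as a plain `string` because the real `AppConfig['apiProtocol']` union is not shown in this diff.

```typescript
// Reduced, hypothetical version of the protocol-scoped fields only.
interface ProtocolScopedFields {
  apiVersion: string;
  byokImageModel: string;
}

function scopeToProtocol(
  protocol: string,
  apiConfig: { apiVersion?: string; byokImageModel?: string },
): ProtocolScopedFields {
  return {
    // apiVersion only makes sense for Azure.
    apiVersion: protocol === 'azure' ? (apiConfig.apiVersion ?? '') : '',
    // byokImageModel only makes sense for SenseAudio.
    byokImageModel:
      protocol === 'senseaudio' ? (apiConfig.byokImageModel ?? '') : '',
  };
}

const saved = {
  apiVersion: '2024-06-01', // illustrative value only
  byokImageModel: 'senseaudio-image-1.0-260319',
};
const onSenseAudio = scopeToProtocol('senseaudio', saved);
const onOpenAI = scopeToProtocol('openai', saved);
console.log(JSON.stringify(onSenseAudio), JSON.stringify(onOpenAI));
```

Switching to the OpenAI tab blanks both fields, so a SenseAudio image-model choice never leaks into another protocol's saved config.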
@ -2683,6 +2689,34 @@ export function SettingsDialog({
                  />
                </label>
              ) : null}
              {apiProtocol === 'senseaudio' ? (
                <label className="field">
                  <span className="field-label">{t('settings.byokImageModel')}</span>
                  <select
                    value={cfg.byokImageModel ?? ''}
                    onChange={(e) =>
                      updateApiConfig({ byokImageModel: e.target.value })
                    }
                  >
                    {/* Default-empty option resolves to the registry default
                        on the daemon side (senseaudio-image-2.0-260319 today).
                        Listing it explicitly lets the picker show what the
                        unconfigured state actually means. */}
                    <option value="">
                      {IMAGE_MODELS.find((m) => m.provider === 'senseaudio')?.label
                        ?? 'senseaudio-image-2.0'}
                      {' (default)'}
                    </option>
                    {IMAGE_MODELS.filter((m) => m.provider === 'senseaudio').map(
                      (m) => (
                        <option key={m.id} value={m.id}>
                          {m.label}
                        </option>
                      ),
                    )}
                  </select>
                </label>
              ) : null}
              <p className="hint">{t('settings.apiHint')}</p>
            </section>
          )}
--- a/apps/web/src/i18n/locales/ar.ts
+++ b/apps/web/src/i18n/locales/ar.ts
@ -202,6 +202,7 @@ export const ar: Dict = {
  'settings.azureDeploymentModelHint':
    'في Azure OpenAI، يُستخدم هذا الحقل كاسم النشر في /openai/deployments/<model>. أدخل اسم النشر الذي أنشأته في Azure.',
  'settings.apiVersion': 'إصدار API',
  'settings.byokImageModel': 'نموذج إنشاء الصور',
  'settings.maxTokens': 'أقصى عدد من الرموز (اختياري)',
  'settings.maxTokensHint':
    'الحد الأقصى لطول الاستجابة. لكل نموذج قيمة افتراضية؛ اتركها فارغة لاستخدامها، أو أدخل رقماً للتجاوز.',
--- a/apps/web/src/i18n/locales/de.ts
+++ b/apps/web/src/i18n/locales/de.ts
@ -202,6 +202,7 @@ export const de: Dict = {
  'settings.azureDeploymentModelHint':
    'Fuer Azure OpenAI wird dieses Feld als Deployment-Name in /openai/deployments/<model> verwendet. Geben Sie den in Azure angelegten Deployment-Namen ein.',
  'settings.apiVersion': 'API-Version',
  'settings.byokImageModel': 'Bilderzeugungsmodell',
  'settings.maxTokens': 'Max. Tokens (optional)',
  'settings.maxTokensHint':
    'Obergrenze für die Antwortlänge. Jedes Modell hat einen abgestimmten Standardwert (im Platzhalter sichtbar); leer lassen, um ihn zu verwenden, oder eine Zahl eingeben, um ihn zu überschreiben.',
--- a/apps/web/src/i18n/locales/en.ts
+++ b/apps/web/src/i18n/locales/en.ts
@ -227,6 +227,7 @@ export const en: Dict = {
  'settings.azureModelFetchHint':
    'For Azure OpenAI, enter the deployment name you created in Azure. Automatic deployment discovery is not available from this BYOK endpoint.',
  'settings.apiVersion': 'API version',
  'settings.byokImageModel': 'Image generation model',
  'settings.maxTokens': 'Max tokens (optional)',
  'settings.maxTokensHint':
    'Cap on the response length. Each model has a tuned default (shown as a placeholder); leave blank to use it, or enter a number to override.',
--- a/apps/web/src/i18n/locales/es-ES.ts
+++ b/apps/web/src/i18n/locales/es-ES.ts
@ -202,6 +202,7 @@ export const esES: Dict = {
  'settings.azureDeploymentModelHint':
    'Para Azure OpenAI, este campo se usa como nombre del despliegue en /openai/deployments/<model>. Introduce el nombre del despliegue que creaste en Azure.',
  'settings.apiVersion': 'Versión de API',
  'settings.byokImageModel': 'Modelo de generación de imágenes',
  'settings.maxTokens': 'Tokens máx. (opcional)',
  'settings.maxTokensHint':
    'Tope para la longitud de la respuesta. Cada modelo tiene un valor por defecto ajustado (visible en el placeholder); déjalo vacío para usarlo o introduce un número para anularlo.',
--- a/apps/web/src/i18n/locales/fa.ts
+++ b/apps/web/src/i18n/locales/fa.ts
@ -202,6 +202,7 @@ export const fa: Dict = {
  'settings.azureDeploymentModelHint':
    'در Azure OpenAI، این فیلد به عنوان نام استقرار در /openai/deployments/<model> استفاده می‌شود. نام استقراری را که در Azure ساخته‌اید وارد کنید.',
  'settings.apiVersion': 'نسخه API',
  'settings.byokImageModel': 'مدل تولید تصویر',
  'settings.maxTokens': 'حداکثر توکن (اختیاری)',
  'settings.maxTokensHint':
    'سقف طول پاسخ. هر مدل مقدار پیش‌فرض تنظیم‌شدهٔ خود را دارد (در placeholder نمایش داده می‌شود)؛ برای استفاده از آن خالی بگذارید، یا برای جایگزینی، عددی وارد کنید.',
--- a/apps/web/src/i18n/locales/fr.ts
+++ b/apps/web/src/i18n/locales/fr.ts
@ -202,6 +202,7 @@ export const fr: Dict = {
  'settings.azureDeploymentModelHint':
    'Pour Azure OpenAI, ce champ est utilisé comme nom du déploiement dans /openai/deployments/<model>. Saisissez le nom du déploiement créé dans Azure.',
  'settings.apiVersion': 'Version API',
  'settings.byokImageModel': "Modèle de génération d'images",
  'settings.maxTokens': 'Tokens max (optionnel)',
  'settings.maxTokensHint':
    'Limite de la longueur de réponse. Chaque modèle a une valeur par défaut (affichée à titre indicatif) ; laissez vide pour l\'utiliser, ou entrez un nombre pour la remplacer.',
--- a/apps/web/src/i18n/locales/hu.ts
+++ b/apps/web/src/i18n/locales/hu.ts
@ -202,6 +202,7 @@ export const hu: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI esetén ez a mező a /openai/deployments/<model> deployment neveként szerepel. Add meg az Azure-ban létrehozott deployment nevét.',
  'settings.apiVersion': 'API-verzió',
  'settings.byokImageModel': 'Képgenerálási modell',
  'settings.maxTokens': 'Max tokenek (opcionális)',
  'settings.maxTokensHint':
    'A válasz hosszának felső határa. Minden modellnek van hangolt alapértelmezése (placeholderként látható); hagyd üresen az alkalmazásához, vagy adj meg számot a felülíráshoz.',
--- a/apps/web/src/i18n/locales/id.ts
+++ b/apps/web/src/i18n/locales/id.ts
@ -202,6 +202,7 @@ export const id: Dict = {
  'settings.azureDeploymentModelHint':
    'Untuk Azure OpenAI, field ini digunakan sebagai nama deployment di /openai/deployments/<model>. Masukkan nama deployment yang kamu buat di Azure.',
  'settings.apiVersion': 'Versi API',
  'settings.byokImageModel': 'Model pembuatan gambar',
  'settings.maxTokens': 'Token maks (opsional)',
  'settings.maxTokensHint':
    'Batas panjang respons. Setiap model punya default sendiri; kosongkan untuk memakainya, atau isi angka untuk menimpa.',
--- a/apps/web/src/i18n/locales/it.ts
+++ b/apps/web/src/i18n/locales/it.ts
@ -199,6 +199,7 @@ export const it: Dict = {
  'settings.azureDeploymentModelHint':
    'Per Azure OpenAI, questo campo viene utilizzato come nome del deployment in /openai/deployments/<model>. Inserisci il nome del deployment creato in Azure.',
  'settings.apiVersion': 'Versione API',
  'settings.byokImageModel': 'Modello di generazione immagini',
  'settings.maxTokens': 'Token massimi (opzionale)',
  'settings.maxTokensHint':
    'Limite della lunghezza della risposta. Ogni modello ha un valore predefinito (mostrato nel placeholder); lascia vuoto per usarlo, o inserisci un numero per sostituirlo.',
--- a/apps/web/src/i18n/locales/ja.ts
+++ b/apps/web/src/i18n/locales/ja.ts
@ -202,6 +202,7 @@ export const ja: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI では、このフィールドが /openai/deployments/<model> のデプロイ名として使われます。Azure で作成したデプロイ名を入力してください。',
  'settings.apiVersion': 'API バージョン',
  'settings.byokImageModel': '画像生成モデル',
  'settings.maxTokens': '最大トークン（任意）',
  'settings.maxTokensHint':
    '応答長の上限。各モデルにチューニング済みのデフォルト値があります（プレースホルダーに表示）。空のままにすればそれを使用し、数値を入力すれば上書きされます。',
--- a/apps/web/src/i18n/locales/ko.ts
+++ b/apps/web/src/i18n/locales/ko.ts
@ -205,6 +205,7 @@ export const ko: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI에서는 이 필드가 /openai/deployments/<model>의 배포 이름으로 사용됩니다. Azure에서 만든 배포 이름을 입력하세요.',
  'settings.apiVersion': 'API 버전',
  'settings.byokImageModel': '이미지 생성 모델',
  'settings.apiHint': '요청은 로컬 daemon 프록시를 통해 설정한 Base URL로 전송됩니다. 키는 이 브라우저에만 저장되며 제공자 요청과 함께 전송됩니다.',
  'settings.skipForNow': '지금은 건너뛰기',
  'settings.getStarted': '시작하기',
--- a/apps/web/src/i18n/locales/pl.ts
+++ b/apps/web/src/i18n/locales/pl.ts
@ -202,6 +202,7 @@ export const pl: Dict = {
  'settings.azureDeploymentModelHint':
      'Dla Azure OpenAI to pole jest używane jako nazwa wdrożenia w /openai/deployments/<model>. Wpisz nazwę wdrożenia utworzonego w Azure.',
  'settings.apiVersion': 'Wersja API',
  'settings.byokImageModel': 'Model generowania obrazów',
  'settings.maxTokens': 'Maks. liczba tokenów (opcjonalnie)',
  'settings.maxTokensHint':
      'Limit długości odpowiedzi. Każdy model ma dostrojony domyślny limit (widoczny jako placeholder); pozostaw puste, aby go użyć, lub wpisz liczbę.',
--- a/apps/web/src/i18n/locales/pt-BR.ts
+++ b/apps/web/src/i18n/locales/pt-BR.ts
@ -202,6 +202,7 @@ export const ptBR: Dict = {
  'settings.azureDeploymentModelHint':
    'No Azure OpenAI, este campo e usado como nome do deployment em /openai/deployments/<model>. Informe o nome do deployment criado no Azure.',
  'settings.apiVersion': 'Versão da API',
  'settings.byokImageModel': 'Modelo de geração de imagens',
  'settings.maxTokens': 'Tokens máx. (opcional)',
  'settings.maxTokensHint':
    'Limite para o comprimento da resposta. Cada modelo tem um valor padrão ajustado (visível no placeholder); deixe em branco para usá-lo ou insira um número para substituí-lo.',
--- a/apps/web/src/i18n/locales/ru.ts
+++ b/apps/web/src/i18n/locales/ru.ts
@ -202,6 +202,7 @@ export const ru: Dict = {
  'settings.azureDeploymentModelHint':
    'Для Azure OpenAI это поле используется как имя развертывания в /openai/deployments/<model>. Укажите имя развертывания, созданного в Azure.',
  'settings.apiVersion': 'Версия API',
  'settings.byokImageModel': 'Модель генерации изображений',
  'settings.maxTokens': 'Макс. токенов (опционально)',
  'settings.maxTokensHint':
    'Ограничение длины ответа. У каждой модели свой настроенный дефолт (виден в плейсхолдере); оставьте поле пустым, чтобы использовать его, или введите число, чтобы переопределить.',
--- a/apps/web/src/i18n/locales/th.ts
+++ b/apps/web/src/i18n/locales/th.ts
@ -198,6 +198,7 @@ export const th: Dict = {
  'settings.azureDeploymentModel': 'ชื่อ Deployment',
  'settings.azureDeploymentModelHint': 'สำหรับ Azure OpenAI ฟิลด์นี้ใช้เป็นชื่อ Deployment ใน /openai/deployments/<model> ป้อนชื่อ Deployment ที่คุณสร้างใน Azure',
  'settings.apiVersion': 'เวอร์ชัน API',
  'settings.byokImageModel': 'โมเดลสร้างภาพ',
  'settings.maxTokens': 'Max tokens (เลือกได้)',
  'settings.maxTokensHint': 'ขีดจำกัดความยาวในการตอบกลับ',
  'settings.apiHint': 'คำสั่งจะถูกส่งผ่าน local daemon proxy ไปยัง base URL ที่คุณตั้งไว้ API Key จะถูกเก็บในเบราว์เซอร์นี้เท่านั้น',
--- a/apps/web/src/i18n/locales/tr.ts
+++ b/apps/web/src/i18n/locales/tr.ts
@ -202,6 +202,7 @@ export const tr: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI icin bu alan /openai/deployments/<model> icindeki dagitim adi olarak kullanilir. Azureda olusturdugunuz dagitim adini girin.',
  'settings.apiVersion': 'API sürümü',
  'settings.byokImageModel': 'Görüntü oluşturma modeli',
  'settings.maxTokens': 'Maks. token (isteğe bağlı)',
  'settings.maxTokensHint':
    'Yanıt uzunluğu sınırı. Her modelin ayarlanmış bir varsayılanı vardır (yer tutucuda görünür); kullanmak için boş bırakın, üzerine yazmak için bir sayı girin.',
--- a/apps/web/src/i18n/locales/uk.ts
+++ b/apps/web/src/i18n/locales/uk.ts
@ -203,6 +203,7 @@ export const uk: Dict = {
  'settings.azureDeploymentModelHint':
    'Для Azure OpenAI це поле використовується як назва розгортання в /openai/deployments/<model>. Введіть назву розгортання, створену в Azure.',
  'settings.apiVersion': 'Версія API',
  'settings.byokImageModel': 'Модель генерації зображень',
  'settings.maxTokens': 'Макс. токенів (необов\'язково)',
  'settings.maxTokensHint':
    'Обмеження на довжину відповіді. Кожна модель має налаштовану за замовчуванням (показано в заповнювачі); залиште поле порожнім, щоб використовувати її, або введіть число, щоб переопрацювати.',
--- a/apps/web/src/i18n/locales/zh-CN.ts
+++ b/apps/web/src/i18n/locales/zh-CN.ts
@ -227,6 +227,7 @@ export const zhCN: Dict = {
  'settings.azureModelFetchHint':
    '对于 Azure OpenAI，请填写你在 Azure 中创建的部署名称。当前 BYOK 端点无法自动发现 deployment。',
  'settings.apiVersion': 'API 版本',
  'settings.byokImageModel': '图片生成模型',
  'settings.maxTokens': '最大 tokens（可选）',
  'settings.maxTokensHint':
    '响应长度上限。每个 model 有调优过的默认值（在 placeholder 里显示），留空即使用，输入数字则覆盖。',
--- a/apps/web/src/i18n/locales/zh-TW.ts
+++ b/apps/web/src/i18n/locales/zh-TW.ts
@ -201,6 +201,7 @@ export const zhTW: Dict = {
  'settings.azureDeploymentModelHint':
    '對於 Azure OpenAI，此欄位會作為 /openai/deployments/<model> 中的部署名稱使用。請填入你在 Azure 中建立的部署名稱。',
  'settings.apiVersion': 'API 版本',
  'settings.byokImageModel': '圖片生成模型',
  'settings.maxTokens': '最大 tokens（可選）',
  'settings.maxTokensHint':
    '回應長度上限。每個 model 有調過的預設值（在 placeholder 顯示），留空即使用，輸入數字則覆蓋。',
--- a/apps/web/src/i18n/types.ts
+++ b/apps/web/src/i18n/types.ts
@ -252,6 +252,7 @@ export interface Dict {
  'settings.azureDeploymentModelHint': string;
  'settings.azureModelFetchHint': string;
  'settings.apiVersion': string;
  'settings.byokImageModel': string;
  'settings.apiHint': string;
  'settings.skipForNow': string;
  'settings.getStarted': string;
--- a/apps/web/src/media/models.ts
+++ b/apps/web/src/media/models.ts
@ -234,7 +234,7 @@ export const MEDIA_PROVIDERS: MediaProvider[] = [
  {
    id: 'senseaudio',
    label: 'SenseAudio',
-    hint: 'TTS · 70+ system voices · clone',
+    hint: '',
    integrated: true,
    defaultBaseUrl: 'https://api.senseaudio.cn',
    docsUrl: 'https://docs.senseaudio.cn',
@ -344,6 +344,29 @@ export const IMAGE_MODELS: MediaModel[] = [
    caps: ['i2i'],
  },
  // SenseAudio — synchronous /v1/image/sync, Bearer auth, reference URL or data URI.
  {
    id: 'senseaudio-image-2.0-260319',
    label: 'senseaudio-image-2.0',
    hint: 'SenseAudio · multi-aspect, latest',
    provider: 'senseaudio',
    caps: ['t2i', 'i2i'],
  },
  {
    id: 'senseaudio-image-1.0-260319',
    label: 'senseaudio-image-1.0',
    hint: 'SenseAudio · standard',
    provider: 'senseaudio',
    caps: ['t2i', 'i2i'],
  },
  {
    id: 'doubao-seedream-5-0-260128',
    label: 'seedream-5.0',
    hint: 'SenseAudio · ByteDance Seedream 5.0 hi-res',
    provider: 'senseaudio',
    caps: ['t2i', 'i2i'],
  },
  // xAI Grok Imagine — text-to-image (1k/2k, 11+ aspect ratios).
  {
    id: 'grok-imagine-image',
--- a/apps/web/src/providers/anthropic.ts
+++ b/apps/web/src/providers/anthropic.ts
@ -11,10 +11,12 @@ import Anthropic from '@anthropic-ai/sdk';
 import { effectiveMaxTokens } from '../state/maxTokens';
 import type { AppConfig, ChatMessage } from '../types';
 import { streamMessageAnthropicProxy } from './anthropic-compatible';
 import type { ProxyContext } from './api-proxy';
 import { streamMessageAzure } from './azure-compatible';
 import { streamMessageGoogle } from './google-compatible';
 import { streamMessageOllama } from './ollama-compatible';
 import { isOpenAICompatible, streamMessageOpenAI } from './openai-compatible';
 import { streamMessageSenseAudio } from './senseaudio-compatible';
 // Re-export for convenience
 export { isOpenAICompatible } from './openai-compatible';
@ -39,6 +41,12 @@ export async function streamMessage(
  history: ChatMessage[],
  signal: AbortSignal,
  handlers: StreamHandlers,
  // Only the senseaudio branch reads `context` today (so the daemon-side
  // `generate_image` tool can write into the active project's folder and
  // pick up the BYOK image-model default). Other branches do not use it;
  // keeping the signature uniform means the single call site in
  // ProjectView passes the same shape regardless of protocol.
  context?: ProxyContext,
 ): Promise<void> {
  // Prefer the explicit Settings protocol; keep the legacy heuristic as a
  // fallback for configs saved before apiProtocol existed.
@ -51,6 +59,9 @@ export async function streamMessage(
  if (cfg.apiProtocol === 'google') {
    return streamMessageGoogle(cfg, system, history, signal, handlers);
  }
  if (cfg.apiProtocol === 'senseaudio') {
    return streamMessageSenseAudio(cfg, system, history, signal, handlers, context);
  }
  if (cfg.apiProtocol === 'openai' || (!cfg.apiProtocol && isOpenAICompatible(cfg.model, cfg.baseUrl))) {
    return streamMessageOpenAI(cfg, system, history, signal, handlers);
  }
--- a/apps/web/src/providers/api-proxy.ts
+++ b/apps/web/src/providers/api-proxy.ts
@ -3,6 +3,22 @@ import type { AppConfig, ChatMessage } from '../types';
 import type { StreamHandlers } from './anthropic';
 import { parseSseFrame } from './sse';
 /**
 * Optional per-request context that some protocols thread into the
 * proxy body. Today only the senseaudio proxy reads these fields:
 *  - `projectId` lets the `generate_image` tool write into the active
 *    project's folder instead of a daemon-global cache.
 *  - `byokImageModel` is the user's BYOK Settings default for the
 *    image tool. The LLM can still override per-call via the tool's
 *    `model` arg; this is just the fallback when it omits one.
 * Other protocols ignore unknown body fields, so callers are free to
 * pass this for every protocol.
 */
 export interface ProxyContext {
  projectId?: string;
  byokImageModel?: string;
 }
 export async function streamProxyEndpoint(
  endpoint: string,
  cfg: AppConfig,
@ -10,6 +26,7 @@ export async function streamProxyEndpoint(
  history: ChatMessage[],
  signal: AbortSignal,
  handlers: StreamHandlers,
  context?: ProxyContext,
 ): Promise<void> {
  if (!cfg.apiKey) {
    handlers.onError(new Error('Missing API key — open Settings and paste one in.'));
@ -30,6 +47,10 @@ export async function streamProxyEndpoint(
        messages: history.map((m) => ({ role: m.role, content: m.content })),
        maxTokens: effectiveMaxTokens(cfg),
        apiVersion: cfg.apiVersion,
        ...(context?.projectId ? { projectId: context.projectId } : {}),
        ...(context?.byokImageModel
          ? { byokImageModel: context.byokImageModel }
          : {}),
      }),
      signal,
    });
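The conditional-spread idiom used for the two context fields can be checked in isolation. `contextBodyFields` is a hypothetical helper written for this illustration (the PR spreads the fields inline into the `JSON.stringify` call); `ProxyContext` mirrors the interface added above.

```typescript
// Mirrors the ProxyContext shape added in this diff.
interface ProxyContext {
  projectId?: string;
  byokImageModel?: string;
}

// Hypothetical helper (not in the PR): returns only the context fields that
// carry a non-empty value, using the same spread idiom as streamProxyEndpoint.
function contextBodyFields(context?: ProxyContext): ProxyContext {
  return {
    ...(context?.projectId ? { projectId: context.projectId } : {}),
    ...(context?.byokImageModel
      ? { byokImageModel: context.byokImageModel }
      : {}),
  };
}

const withBoth = contextBodyFields({
  projectId: 'demo',
  byokImageModel: 'senseaudio-image-1.0-260319',
});
// An empty string is falsy, so the key is left out entirely.
const emptyModel = contextBodyFields({ projectId: 'demo', byokImageModel: '' });
const noContext = contextBodyFields(undefined);
console.log(
  JSON.stringify(withBoth),
  JSON.stringify(emptyModel),
  JSON.stringify(noContext),
);
```

Omitting the key (instead of sending `""` or `null`) keeps request bodies for the other protocols byte-identical to what they were before this change.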
--- a/apps/web/src/providers/senseaudio-compatible.ts
+++ b/apps/web/src/providers/senseaudio-compatible.ts
@ -0,0 +1,33 @@
 /**
 * SenseAudio chat completions provider. Wire-compatible with OpenAI
 * (POST /v1/chat/completions, Bearer auth, SSE delta frames + [DONE]),
 * so the only thing that differs from streamMessageOpenAI is the
 * daemon proxy endpoint — keeping a dedicated client makes the picker
 * tab → daemon log line → upstream call chain readable end-to-end and
 * leaves room for SenseAudio-specific divergence in the future.
 *
 * Routes through the daemon proxy to avoid browser CORS issues.
 * BYOK — the key stays on the user's machine.
 */
 import type { AppConfig, ChatMessage } from '../types';
 import type { StreamHandlers } from './anthropic';
 import { streamProxyEndpoint, type ProxyContext } from './api-proxy';
 export async function streamMessageSenseAudio(
  cfg: AppConfig,
  system: string,
  history: ChatMessage[],
  signal: AbortSignal,
  handlers: StreamHandlers,
  context?: ProxyContext,
 ): Promise<void> {
  return streamProxyEndpoint(
    '/api/proxy/senseaudio/stream',
    cfg,
    system,
    history,
    signal,
    handlers,
    context,
  );
 }
--- a/apps/web/src/runtime/markdown.tsx
+++ b/apps/web/src/runtime/markdown.tsx
@ -262,6 +262,24 @@ function renderBlock(block: Block, key: number): ReactNode {
  return null;
 }
 // Allowed schemes / forms for image `src` attributes. The BYOK chat
 // tool loop emits relative URLs like `/api/byok-image/<id>.png` which
 // the web's Next.js rewrites proxy to the daemon — that's the common
 // case. data: + blob: cover inline / generated images. http(s):// is
 // allowed so a model can reference public images. Anything else
 // (javascript:, file:, vbscript:, …) is rejected so a hallucinated
 // or adversarial URL cannot exfiltrate or execute.
 function isSafeMarkdownImageSrc(src: string): boolean {
  if (!src) return false;
  if (src.startsWith('/') && !src.startsWith('//')) return true;
  return (
    src.startsWith('http://')
    || src.startsWith('https://')
    || src.startsWith('data:image/')
    || src.startsWith('blob:')
  );
 }
 // Inline pass: tokenize into runs of `code`, **bold**, *italic*, links,
 // and plain text. We walk the string with a regex that matches whichever
 // delimiter shows up next; everything between delimiters becomes a text
@ -270,14 +288,19 @@ function renderInline(text: string): ReactNode {
  const out: ReactNode[] = [];
  // Order matters:
  //  1. inline code first so its contents are not re-tokenized as bold/italic.
-  //  2. explicit `[text](url)` markdown links before bare URL autolink so the
+  //  2. image syntax `![alt](url)` BEFORE the link branch. Both share
  //     `[…](…)` and the image is only distinguished by the leading `!`;
  //     letting the link branch win would render `[alt](url)` as a text
  //     link with `!` stranded as a sibling text node and the user would
  //     see the link copy but never the image.
  //  3. explicit `[text](url)` markdown links before bare URL autolink so the
  //     autolink does not greedily swallow the closing paren.
-  //  3. bare http(s) URL autolink BEFORE italic markers — chat output often
+  //  4. bare http(s) URL autolink BEFORE italic markers — chat output often
  //     contains OAuth-style links with `_type=` / `_id=` query params, and
  //     leaving italic to win turns the URL into an italic-fragmented mess.
-  //  4. bold (**a** / __a__) before italic (*a* / _a_).
+  //  5. bold (**a** / __a__) before italic (*a* / _a_).
  const re =
-    /(`[^`]+`)|\[([^\]]+)\]\(([^)\s]+)\)|(https?:\/\/[^\s)<>]+)|(\*\*[^*]+\*\*)|(__[^_]+__)|(\*[^*\n]+\*)|(_[^_\n]+_)/g;
+    /(`[^`]+`)|!\[([^\]]*)\]\(([^)\s]+)\)|\[([^\]]+)\]\(([^)\s]+)\)|(https?:\/\/[^\s)<>]+)|(\*\*[^*]+\*\*)|(__[^_]+__)|(\*[^*\n]+\*)|(_[^_\n]+_)/g;
  let lastIndex = 0;
  let m: RegExpExecArray | null;
  let key = 0;
@ -291,40 +314,61 @@ function renderInline(text: string): ReactNode {
          {m[1].slice(1, -1)}
        </code>,
      );
-    } else if (m[2] && m[3]) {
+    } else if (m[3] !== undefined) {
      // Image: m[2] = alt (may be empty), m[3] = src
      const src = m[3];
      const alt = m[2] || '';
      if (isSafeMarkdownImageSrc(src)) {
        out.push(
          <img
            key={key++}
            className="md-image"
            src={src}
            alt={alt}
            loading="lazy"
            referrerPolicy="no-referrer"
            style={{ maxWidth: '100%', height: 'auto', borderRadius: 6 }}
          />,
        );
      } else {
        // Unsafe scheme — drop the image tag but keep the alt text so
        // the user sees what the model meant to show.
        pushText(out, alt, key++);
      }
    } else if (m[4] && m[5]) {
      out.push(
        <a
          key={key++}
          className="md-link"
-          href={m[3]}
+          href={m[5]}
          target="_blank"
          rel="noreferrer noopener"
        >
-          {m[2]}
+          {m[4]}
        </a>,
      );
-    } else if (m[4]) {
+    } else if (m[6]) {
+      // Bare URL — autolink with the URL as both href and visible text,
      // matching the Markdown `<https://…>` autolink convention.
      out.push(
        <a
          key={key++}
          className="md-link md-link-bare"
          href={m[6]}
          target="_blank"
          rel="noreferrer noopener"
        >
          {m[6]}
        </a>,
      );
    } else if (m[7]) {
-      out.push(<em key={key++}>{m[7].slice(1, -1)}</em>);
+      out.push(<strong key={key++}>{m[7].slice(2, -2)}</strong>);
    } else if (m[8]) {
-      out.push(<em key={key++}>{m[8].slice(1, -1)}</em>);
+      out.push(<strong key={key++}>{m[8].slice(2, -2)}</strong>);
    } else if (m[9]) {
      out.push(<em key={key++}>{m[9].slice(1, -1)}</em>);
    } else if (m[10]) {
      out.push(<em key={key++}>{m[10].slice(1, -1)}</em>);
    }
    lastIndex = re.lastIndex;
  }
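Review note: placing the image alternative ahead of the link alternative shifts every later capture group by two, which is why each branch index in this hunk moves (link 2/3 to 4/5, bare URL 4 to 6, bold 5/6 to 7/8, italic 7/8 to 9/10). The sketch below reuses the regex from the added line in `renderInline` to report which construct each match belongs to. `classify` and `Kind` are hypothetical helpers for illustration only.

```typescript
// Same pattern as the added line in renderInline; classify() is hypothetical.
const INLINE_RE =
  /(`[^`]+`)|!\[([^\]]*)\]\(([^)\s]+)\)|\[([^\]]+)\]\(([^)\s]+)\)|(https?:\/\/[^\s)<>]+)|(\*\*[^*]+\*\*)|(__[^_]+__)|(\*[^*\n]+\*)|(_[^_\n]+_)/g;

type Kind = 'code' | 'image' | 'link' | 'url' | 'bold' | 'italic';

function classify(text: string): Kind[] {
  const kinds: Kind[] = [];
  // A fresh RegExp avoids sharing lastIndex between calls.
  const re = new RegExp(INLINE_RE.source, 'g');
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (m[1]) kinds.push('code');
    // m[2] (alt) may be the empty string, so test m[3] (src) for presence.
    else if (m[3] !== undefined) kinds.push('image');
    else if (m[4] && m[5]) kinds.push('link');
    else if (m[6]) kinds.push('url');
    else if (m[7] || m[8]) kinds.push('bold');
    else if (m[9] || m[10]) kinds.push('italic');
  }
  return kinds;
}

// prints [ 'image', 'link', 'bold', 'italic', 'code' ]
console.log(classify('![a](/p.png) [t](https://x.dev) **b** *i* `c`'));
```

The image branch tests `m[3] !== undefined` rather than truthiness of `m[2]` because `![](url)` is valid and yields an empty alt string.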
--- a/apps/web/src/state/apiProtocols.ts
+++ b/apps/web/src/state/apiProtocols.ts
@ -65,6 +65,22 @@ export const SUGGESTED_MODELS_BY_PROTOCOL: Record<ApiProtocol, readonly string[]
    'gemini-1.5-pro',
    'gemini-1.5-flash',
  ],
  senseaudio: [
    // SenseAudio is an OpenAI-compatible gateway that fronts both its own
    // models (senseaudio-s2 family) and aggregator routes to deepseek /
    // glm / kimi / minimax. Listing the headline house models first keeps
    // the picker's default selection on a SenseAudio-native checkpoint;
    // the aggregator IDs trail so users who arrived for a specific
    // upstream still find it in this tab without retyping it.
    'senseaudio-s2',
    'senseaudio-s2-flash',
    'deepseek-v4-flash',
    'deepseek-v4-pro',
    'glm-5.1',
    'kimi-k2.6',
    'MiniMax-M2.7-highspeed',
    'MiniMax-M2.7',
  ],
  ollama: [
    'cogito-2.1:671b',
    'deepseek-v3.1:671b',
@ -123,6 +139,7 @@ export const FAST_MODEL_BY_PROTOCOL: Record<ApiProtocol, string> = {
  // pick produces a deterministic answer; users who care can override
  // through the Memory model picker.
  ollama: 'gemma3:4b',
  senseaudio: 'senseaudio-s2-flash',
 };
 export const API_PROTOCOL_TABS: ReadonlyArray<{
@ -134,6 +151,7 @@ export const API_PROTOCOL_TABS: ReadonlyArray<{
  { id: 'azure', title: 'Azure OpenAI' },
  { id: 'google', title: 'Google Gemini' },
  { id: 'ollama', title: 'Ollama Cloud' },
  { id: 'senseaudio', title: 'SenseAudio' },
 ];
 export const API_PROTOCOL_LABELS: Record<ApiProtocol, string> = {
@ -142,6 +160,7 @@ export const API_PROTOCOL_LABELS: Record<ApiProtocol, string> = {
  azure: 'Azure OpenAI',
  google: 'Google Gemini',
  ollama: 'Ollama Cloud API',
  senseaudio: 'SenseAudio API',
 };
 export const API_KEY_PLACEHOLDERS: Record<ApiProtocol, string> = {
@ -150,6 +169,7 @@ export const API_KEY_PLACEHOLDERS: Record<ApiProtocol, string> = {
  azure: 'azure key',
  google: 'AIza...',
  ollama: 'Ollama API key',
  senseaudio: 'SenseAudio API key',
 };
 // Default base URL the daemon assumes when the user leaves the field
@ -161,4 +181,5 @@ export const DEFAULT_BASE_URL_BY_PROTOCOL: Record<ApiProtocol, string> = {
  azure: '',
  google: 'https://generativelanguage.googleapis.com',
  ollama: 'https://ollama.com',
  senseaudio: 'https://api.senseaudio.cn',
 };
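Review note: every table in this file is typed `Record<ApiProtocol, …>`, so widening the `ApiProtocol` union with `'senseaudio'` makes the compiler reject any table that lacks the new key. That is why each map gains exactly one line. A minimal sketch of the mechanism, using a reduced local copy of the union (`DemoProtocol`) and a hypothetical `demoBaseUrl` helper; the fallback behavior shown is an assumption about how a default base URL would be consumed, not code from this change.

```typescript
// Reduced, hypothetical copy of the union; the real ApiProtocol has more members.
type DemoProtocol = 'google' | 'ollama' | 'senseaudio';

// Record<DemoProtocol, string> requires exactly one entry per union member.
// Deleting the senseaudio line makes this object fail to type-check.
const DEMO_BASE_URLS: Record<DemoProtocol, string> = {
  google: 'https://generativelanguage.googleapis.com',
  ollama: 'https://ollama.com',
  senseaudio: 'https://api.senseaudio.cn',
};

function demoBaseUrl(protocol: DemoProtocol, override?: string): string {
  // Assumption: an empty or whitespace-only override falls back to the default.
  const trimmed = override?.trim();
  return trimmed ? trimmed : DEMO_BASE_URLS[protocol];
}

// prints https://api.senseaudio.cn
console.log(demoBaseUrl('senseaudio'));
```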
--- a/apps/web/src/state/config.ts
+++ b/apps/web/src/state/config.ts
@ -249,6 +249,22 @@ export const KNOWN_PROVIDERS: KnownProvider[] = [
    model: 'mimo-v2.5-pro',
    models: ['mimo-v2.5-pro'],
  },
  {
    label: 'SenseAudio',
    protocol: 'senseaudio',
    baseUrl: 'https://api.senseaudio.cn',
    model: 'senseaudio-s2',
    models: [
      'senseaudio-s2',
      'senseaudio-s2-flash',
      'deepseek-v4-flash',
      'deepseek-v4-pro',
      'glm-5.1',
      'kimi-k2.6',
      'MiniMax-M2.7-highspeed',
      'MiniMax-M2.7',
    ],
  },
 ];
 function normalizePet(input: Partial<PetConfig> | undefined): PetConfig {
@ -290,6 +306,10 @@ function inferApiProtocol(model: string, baseUrl: string): ApiProtocol {
    // protocol so both chat and the connection test hit the native Ollama
    // proxy instead of the Anthropic or OpenAI paths.
    if (normalized.includes('ollama.com')) return 'ollama';
    // SenseAudio host gets routed to its own proxy so the daemon log line
    // and the BYOK tab UI stay consistent with the protocol the user
    // picked — even though the on-wire shape is OpenAI-compatible.
    if (normalized.includes('senseaudio.cn')) return 'senseaudio';
    return isOpenAICompatible(model, baseUrl) ? 'openai' : 'anthropic';
  } catch {
    // Preserve the rest of the user's settings even if an old saved base URL is
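Review note: the SenseAudio check in this hunk is a substring match on the normalized base URL, so it would also match a URL that merely contains `senseaudio.cn` in its path or query string. If that ever matters, comparing the parsed hostname is one alternative. The sketch below is a hypothetical variant (`isSenseAudioHost` is not in this change) and assumes the base URL parses with the standard `URL` constructor.

```typescript
// Hypothetical alternative to the substring check; not part of this diff.
function isSenseAudioHost(baseUrl: string): boolean {
  try {
    const host = new URL(baseUrl).hostname.toLowerCase();
    // Accept the apex domain and any subdomain, but not look-alike suffixes.
    return host === 'senseaudio.cn' || host.endsWith('.senseaudio.cn');
  } catch {
    // Unparseable input is treated as "not SenseAudio".
    return false;
  }
}

console.log(isSenseAudioHost('https://api.senseaudio.cn'));                // true
console.log(isSenseAudioHost('https://example.com/?next=senseaudio.cn')); // false
```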
--- a/apps/web/src/types.ts
+++ b/apps/web/src/types.ts
@ -91,7 +91,7 @@ export type {
 } from '@open-design/contracts';
 export type ExecMode = 'daemon' | 'api';
-export type ApiProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama';
+export type ApiProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama' | 'senseaudio';
 export type LiveArtifactTabId = `live:${string}`;
 export type ProjectWorkspaceTabId = string | LiveArtifactTabId;
@ -180,6 +180,13 @@ export interface ApiProtocolConfig {
  model: string;
  apiVersion?: string;
  apiProviderBaseUrl?: string | null;
  /** SenseAudio BYOK only — default image model the daemon-side
   *  `generate_image` tool uses when the LLM doesn't pass one. Carries
   *  one of the SenseAudio image model ids (`senseaudio-image-2.0-260319`,
   *  `senseaudio-image-1.0-260319`, `doubao-seedream-5-0-260128`). Stored
   *  per-protocol so flipping between BYOK tabs doesn't reset the
   *  SenseAudio image-model choice. */
  byokImageModel?: string;
 }
 // Per-CLI model + reasoning the user picked in the model menu. Each agent
@ -294,6 +301,11 @@ export interface AppConfig {
  model: string;
  apiProtocol?: ApiProtocol;
  apiVersion?: string;
  /** SenseAudio BYOK only — default image model for the daemon-side
   *  generate_image tool. Mirrors apiProtocolConfigs.senseaudio.byokImageModel
   *  so the active protocol's value lives at the top level (consistent
   *  with how apiKey / baseUrl / model are projected onto AppConfig). */
  byokImageModel?: string;
  apiProtocolConfigs?: Partial<Record<ApiProtocol, ApiProtocolConfig>>;
  /** Internal config schema/migration version for localStorage upgrades. */
  configMigrationVersion?: number;
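Review note: the two doc comments in this file describe a mirror. `byokImageModel` is stored per protocol, and the active protocol's value is projected onto the top level of `AppConfig`. A rough sketch of such a projection follows, under the assumption that switching tabs simply re-reads the stored entry. `TabId`, `TabConfig`, `DemoConfig`, and `projectActiveProtocol` are hypothetical; the real projection code is not shown in this diff.

```typescript
// Hypothetical reduced shapes; the real AppConfig / ApiProtocolConfig are larger.
type TabId = 'openai' | 'senseaudio';

interface TabConfig {
  model: string;
  byokImageModel?: string;
}

interface DemoConfig {
  apiProtocol: TabId;
  model: string;
  byokImageModel?: string;
  apiProtocolConfigs: Partial<Record<TabId, TabConfig>>;
}

function projectActiveProtocol(cfg: DemoConfig, next: TabId): DemoConfig {
  const stored = cfg.apiProtocolConfigs[next];
  return {
    ...cfg,
    apiProtocol: next,
    model: stored?.model ?? cfg.model,
    // Only the stored per-protocol value is surfaced; tabs without one get undefined.
    byokImageModel: stored?.byokImageModel,
  };
}

const start: DemoConfig = {
  apiProtocol: 'senseaudio',
  model: 'senseaudio-s2',
  byokImageModel: 'senseaudio-image-2.0-260319',
  apiProtocolConfigs: {
    openai: { model: 'example-openai-model' },
    senseaudio: { model: 'senseaudio-s2', byokImageModel: 'senseaudio-image-2.0-260319' },
  },
};

const onOpenAi = projectActiveProtocol(start, 'openai');
const backOnSenseAudio = projectActiveProtocol(onOpenAi, 'senseaudio');
// prints: undefined senseaudio-image-2.0-260319
console.log(onOpenAi.byokImageModel, backOnSenseAudio.byokImageModel);
```

Keeping the source of truth in the per-protocol entry is what lets the image-model choice survive a round trip through another BYOK tab.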
--- a/apps/web/src/utils/apiProtocol.ts
+++ b/apps/web/src/utils/apiProtocol.ts
@ -6,6 +6,7 @@ const API_PROTOCOL_LABELS: Record<ApiProtocol, string> = {
  azure: 'Azure OpenAI',
  google: 'Google Gemini',
  ollama: 'Ollama Cloud API',
  senseaudio: 'SenseAudio API',
 };
 const API_PROTOCOL_AGENT_IDS: Record<ApiProtocol, string> = {
@ -14,6 +15,7 @@ const API_PROTOCOL_AGENT_IDS: Record<ApiProtocol, string> = {
  azure: 'azure-openai-api',
  google: 'google-gemini-api',
  ollama: 'ollama-cloud-api',
  senseaudio: 'senseaudio-api',
 };
 export function apiProtocolLabel(protocol: ApiProtocol | undefined): string {
--- a/apps/web/tests/runtime/markdown.test.tsx
+++ b/apps/web/tests/runtime/markdown.test.tsx
@ -105,4 +105,67 @@ describe('renderMarkdown', () => {
    const bodyTd = (out.match(/<tbody>[\s\S]*<\/tbody>/)?.[0] ?? '').match(/<td/g) ?? [];
    expect(bodyTd.length).toBe(2);
  });
  it('renders ![alt](url) as <img> for relative BYOK image URLs', () => {
    const out = html('Here is your cat: ![cute kitten](/api/byok-image/abc-123.png)');
    expect(out).toContain('<img');
    expect(out).toContain('class="md-image"');
    expect(out).toContain('src="/api/byok-image/abc-123.png"');
    expect(out).toContain('alt="cute kitten"');
    expect(out).toContain('loading="lazy"');
    expect(out).toContain('referrerPolicy="no-referrer"');
    // Image syntax must NOT be turned into an <a> link — `[alt](url)`
    // with a leading `!` is image, not link.
    expect(out).not.toContain('<a class="md-link"');
  });
  it('renders ![](url) with empty alt text', () => {
    const out = html('![](/api/byok-image/abc.png)');
    expect(out).toContain('<img');
    expect(out).toContain('alt=""');
  });
  it('renders https image URLs', () => {
    const out = html('![logo](https://example.com/logo.png)');
    expect(out).toContain('<img');
    expect(out).toContain('src="https://example.com/logo.png"');
  });
  it('renders data: image URIs', () => {
    const out = html('![inline](data:image/png;base64,iVBORw0KGgo=)');
    expect(out).toContain('<img');
    expect(out).toContain('src="data:image/png;base64,iVBORw0KGgo="');
  });
  it('drops image tags with unsafe schemes and keeps alt text as plain text', () => {
    const out = html('![hacked](javascript:alert(1))');
    expect(out).not.toContain('<img');
    expect(out).not.toContain('javascript:');
    expect(out).toContain('hacked');
  });
  it('rejects protocol-relative image URLs (could load cross-origin)', () => {
    // `//evil.com/track.png` would inherit the page protocol; not in our
    // allowlist. Should fall through to alt-as-text.
    const out = html('![track](//evil.com/track.png)');
    expect(out).not.toContain('<img');
    expect(out).toContain('track');
  });
  it('keeps regular [text](url) links working alongside image syntax', () => {
    const out = html('Click [here](https://example.com) and look ![image](/api/byok-image/a.png)');
    expect(out).toContain('<a class="md-link"');
    expect(out).toContain('href="https://example.com"');
    expect(out).toContain('>here</a>');
    expect(out).toContain('<img');
    expect(out).toContain('src="/api/byok-image/a.png"');
  });
  it('preserves bold + italic + code after the image regex addition', () => {
    const out = html('**b** and *i* and `c` and ![a](/p.png)');
    expect(out).toContain('<strong>b</strong>');
    expect(out).toContain('<em>i</em>');
    expect(out).toContain('<code class="md-inline-code">c</code>');
    expect(out).toContain('<img');
  });
 });
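Review note: the cases above pin down the image `src` allowlist. For quick reference, the same predicate can be exercised outside React. The function body below is copied from the markdown.tsx hunk; the `samples` list is illustrative and the expected values in the comments follow directly from the prefix checks.

```typescript
// Copied from the markdown.tsx hunk so the snippet runs on its own.
function isSafeMarkdownImageSrc(src: string): boolean {
  if (!src) return false;
  if (src.startsWith('/') && !src.startsWith('//')) return true;
  return (
    src.startsWith('http://')
    || src.startsWith('https://')
    || src.startsWith('data:image/')
    || src.startsWith('blob:')
  );
}

const samples: string[] = [
  '/api/byok-image/abc-123.png',          // true: root-relative path
  'https://example.com/logo.png',         // true: https
  'data:image/png;base64,iVBORw0KGgo=',   // true: inline image data
  '//evil.com/track.png',                 // false: protocol-relative
  'javascript:alert(1)',                  // false: scheme not on the list
  'data:text/html,<b>x</b>',              // false: data: but not an image type
  'HTTPS://example.com/logo.png',         // false: prefix match is case-sensitive
];

for (const s of samples) {
  console.log(isSafeMarkdownImageSrc(s), s);
}
```

The last sample is worth noting: the prefix checks are case-sensitive, so an upper-case scheme falls through to the alt-text fallback rather than rendering.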
--- a/packages/contracts/src/analytics/events.ts
+++ b/packages/contracts/src/analytics/events.ts
@ -229,7 +229,7 @@ export interface SettingsClickByokProviderOptionProps {
  // Tracking doc names azure/google/ollama as azure_openai/google_gemini/
  // ollama_cloud — we forward the code value verbatim and let dashboards
  // map; see tracking-doc-issues.md §2.5.
-  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google';
+  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google' | 'senseaudio';
  // True when the clicked chip was already the active protocol (no-op
  // toggle); false when the click switches protocol.
  is_selected: boolean;
@ -242,10 +242,10 @@ export interface SettingsClickByokFieldProps {
  action: 'focus_byok_field';
  field_id: 'api_key' | 'base_url' | 'model';
  // Code's `apiProtocol` is wider than the CSV's BYOK provider enum
-  // (anthropic|openai|azure|ollama|google). We forward the code value
+  // (anthropic|openai|azure|ollama|google|senseaudio). We forward the code
-  // verbatim so dashboards can group by the actual protocol; the CSV enum
+  // value verbatim so dashboards can group by the actual protocol; the CSV
-  // is a strict subset the product team can revise.
+  // enum is a strict subset the product team can revise.
-  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google';
+  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google' | 'senseaudio';
  has_value: boolean;
 }
@ -261,7 +261,7 @@ export interface SettingsCliTestResultProps {
 export interface SettingsByokTestResultProps {
  page: 'settings';
  area: 'execution_model';
-  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google';
+  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google' | 'senseaudio';
  result: 'success' | 'failed' | 'timeout';
  error_code?: string;
  duration_ms: number;
--- a/packages/contracts/src/api/connectionTest.ts
+++ b/packages/contracts/src/api/connectionTest.ts
@ -139,7 +139,7 @@ export type ConnectionTestKind =
  | 'agent_spawn_failed'
  | 'unknown';
-export type ConnectionTestProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama';
+export type ConnectionTestProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama' | 'senseaudio';
 export interface ProviderTestRequest {
  protocol: ConnectionTestProtocol;
--- a/packages/contracts/src/api/memory.ts
+++ b/packages/contracts/src/api/memory.ts
@ -80,16 +80,19 @@ export interface MemoryListResponse {
 /** Provider/protocol the memory extractor calls. Mirrors the chat
 *  BYOK form's protocols — anthropic + openai-compatible + azure
 *  (openai-compatible at a different URL/header) + google gemini +
- *  ollama (also openai-compatible, just hosted on Ollama Cloud) — so
+ *  ollama (also openai-compatible, just hosted on Ollama Cloud) +
- *  the memory picker can offer the same options as the chat picker
+ *  senseaudio (also openai-compatible, SenseAudio's OpenAI-shaped
- *  above it. The daemon routes ollama through the same callOpenAI
+ *  /v1/chat/completions gateway) — so the memory picker can offer the
- *  path since the wire protocol is identical. */
+ *  same options as the chat picker above it. The daemon routes both
 *  ollama and senseaudio through the same callOpenAI path since the
 *  wire protocol is identical. */
 export type MemoryExtractionProvider =
  | 'anthropic'
  | 'openai'
  | 'azure'
  | 'google'
-  | 'ollama';
+  | 'ollama'
  | 'senseaudio';
 /** Masked version of MemoryExtractionConfig returned by GET endpoints —
 *  the api key field is replaced with a 4-char tail so the settings UI