feat(senseaudio): BYOK chat with image + video generation tools (#2065)

* feat(senseaudio): BYOK chat with image + video generation tools

Adds SenseAudio as a first-class BYOK chat protocol and wires the daemon's chat proxy with a tool loop so BYOK users can generate images and videos without dropping to a CLI agent.

- BYOK protocol: new senseaudio tab + /api/proxy/senseaudio/stream route + connection-test + provider-models discovery (OpenAI-compatible wire)
- Tool loop: generate_image (synchronous /v1/image/sync) and generate_video (async /v1/video/create + 5s polling /v1/video/status, 10-min ceiling, periodic progress log every 30s)
- Settings dropdown + chat-composer dropdown for the BYOK image model default; generate_image's model enum lets the LLM override per call
- Seed-on-success: a successful BYOK chat call idempotently mirrors the key into media-config (preserves env-resolved + already-stored keys)
- Generated artifacts land in <projectsRoot>/<projectId>/ so FileViewer, DesignFilesPanel, and project export pick them up automatically; legacy /api/byok-image/:id route kept for old conversation links
- Markdown renderer learns ![alt](url) image syntax with a scheme allowlist (http(s) / data:image/ / blob: / relative paths)
- i18n key settings.byokImageModel across all 19 locales
- 3 SenseAudio image models registered (2.0, 1.0, doubao-seedream-5.0); 1 video model (doubao-seedance-2.0)
- Tests: byok-tools (29), media-senseaudio-image (8), media-config seed (7), proxy-routes (47), markdown image rendering (8)

* fix(senseaudio): unblock image gen + design file preview switching

- SenseAudio /v1/image/sync rejected the previous size mapping with `参数错误：size` ("parameter error: size"); 1664x936, 936x1664, 1280x960, and 960x1280 are not in the gateway's accepted set. Switched to standard HD / SD sizes that every aspect bucket can hit: 1024×1024, 1280×720, 720×1280, 1024×768, 768×1024. Kept the byok-tools and media.ts tables in sync so the BYOK chat tool and the CLI agent path both stop failing on non-square aspects.
- DesignFilesPanel's <DfPreview> was missing a key prop, so React reused the same iframe DOM node when the user picked a different file: the src prop changed but the iframe never navigated. Added key={previewFile.name} so the previous preview unmounts cleanly.
- Updated byok-tools + media-senseaudio-image tests for the new size expectations.

* docs(senseaudio): clear stale provider hint + update README

- Settings → Media → SenseAudio: clear the auto-promoted "Image · TTS · 70+ voices · clone" hint; the provider label alone is enough now that the BYOK chat surface covers image + video tooling.
- README: list the new senseaudio (and missing ollama) proxy routes so the BYOK section reflects what the daemon actually serves, and mention the generate_image / generate_video chat tools that ship with the SenseAudio path.

* fix(senseaudio): address PR #2065 review feedback

Three non-blocking review notes from @PerishCode on PR #2065:

1. Drop the dead /api/byok-image/:id route. The PR description claimed it was "legacy fallback for old chat history" but that storage layout never existed on main, so the route can only ever 400 or 404, never 200. Removed the handler, the isSafeByokImageId export, the unused createReadStream / stat / path / Request / Response imports, and the two byok-image regression tests.
2. Add rejectProxyPluginContext guard to the senseaudio proxy handler so it matches the invariant the other five proxy paths already enforce (plugin runs must go through /api/runs for snapshot pinning). Extended the existing "API fallback rejects plugin runs" describe to also cover /api/proxy/senseaudio/stream with the 409 PLUGIN_REQUIRES_DAEMON expectation.
3. Wrap the secondary image / video downloads (the URLs the SenseAudio gateway hands back in /v1/image/sync .url and /v1/video/status .video_url) in validateBaseUrlResolved so a malicious gateway can't point us at 169.254.169.254 (AWS / Azure metadata) or RFC1918 hosts via the response payload. Also passed `redirect: 'error'` on both fetches to match the SSRF posture the primary proxy fetch already uses. The new assertExternalAssetUrl helper lives next to executeGenerateImage so future tool downloads can reuse it.

Tests: 120/120 daemon tests pass; guard + typecheck green.

* fix(senseaudio): mirror SSRF guard onto renderSenseAudioImage CLI path

Follow-up to 01b1260a: the chat-tool fix in byok-tools.ts wasn't mirrored onto the parallel renderSenseAudioImage path in media.ts. Same attacker-controllable shape (gateway-returned `data.url`), same one-line fix.

- Hoist assertExternalAssetUrl from byok-tools.ts into connectionTest.ts next to validateBaseUrlResolved so both call sites (the BYOK chat tool loop AND the CLI agent media dispatcher) share one helper. Made the error strings provider-agnostic so a future caller doesn't get a misleading "senseaudio" attribution for a Volcengine / Grok / etc. download.
- renderSenseAudioImage now runs the response url through assertExternalAssetUrl before fetching bytes, and passes redirect: 'error' to block a 3xx hop into private space.

Scope intentionally limited to the senseaudio path PerishCode flagged; the other unguarded fetch(entry.url) call sites in media.ts (OpenAI / Volcengine / Grok / Nano-Banana) are pre-existing patterns and belong in a separate follow-up if the daemon wants defense-in-depth across every provider.

Tests: 127/127 daemon tests pass; guard + typecheck green.

---------

Co-authored-by: unknown <mazeliang@sensetime.com>
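The scheme allowlist named in the commit message (http(s), data:image/, blob:, relative paths) could look roughly like the sketch below. This is an illustration only: `isAllowedImageSrc` and `ALLOWED_PREFIXES` are hypothetical names, the renderer's real check is not shown in this diff, and rejecting protocol-relative `//host/...` URLs is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical sketch of a markdown image `src` allowlist. Not the
// renderer's actual code; names and edge-case choices are assumptions.
const ALLOWED_PREFIXES = ['http://', 'https://', 'data:image/', 'blob:'];

function isAllowedImageSrc(raw: string): boolean {
  const src = raw.trim();
  if (src === '') return false;
  // Control characters can be used to disguise a scheme, so reject them.
  if (/[\x00-\x1f\x7f]/.test(src)) return false;
  const lower = src.toLowerCase();
  if (ALLOWED_PREFIXES.some((prefix) => lower.startsWith(prefix))) return true;
  // Any other explicit scheme (javascript:, data:text/html, file:, ...) is refused.
  if (/^[a-z][a-z0-9+.-]*:/i.test(src)) return false;
  // Assumption: protocol-relative URLs count as absolute and are refused.
  if (src.startsWith('//')) return false;
  // What remains is a relative path such as /api/projects/<id>/files/<name>.
  return true;
}
```

Under these assumptions a daemon-served path like `/api/projects/p1/files/byok-1.png` passes, while `javascript:alert(1)` and `data:text/html,...` do not.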
2026-05-31 19:04:39 +07:00 · 2026-05-19 23:14:56 +08:00 · 2026-05-19 23:14:56 +08:00 · 210b94069a
commit 210b94069a
parent 431a5e2d79
52 changed files with 3305 additions and 55 deletions
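For orientation before the diff: the header comment of byok-tools.ts describes a loop of "execute, feed the result back as a `role: 'tool'` message, re-issue the completion". The sketch below shows that shape under stated assumptions. `runToolLoop`, `Complete`, `Executor`, and the `maxRounds` cap are hypothetical names invented here; the daemon's actual loop lives in the proxy route and is not part of the lines shown.

```typescript
// Minimal tool-loop sketch. All names are illustrative assumptions,
// not the daemon's real API.
type ToolCall = { id: string; name: string; arguments: string };
type ToolResult = { ok: boolean; url?: string; error?: string };
type AssistantMessage = { role: 'assistant'; content: string; tool_calls?: ToolCall[] };
type Message =
  | { role: 'system' | 'user'; content: string }
  | AssistantMessage
  | { role: 'tool'; tool_call_id: string; content: string };

type Complete = (messages: Message[]) => Promise<{ content: string; tool_calls?: ToolCall[] }>;
type Executor = (args: unknown) => Promise<ToolResult>;

async function runToolLoop(
  complete: Complete,
  executors: Record<string, Executor>,
  initial: Message[],
  maxRounds = 4, // assumed safety cap so a model that keeps calling tools cannot loop forever
): Promise<Message[]> {
  const messages: Message[] = [...initial];
  for (let round = 0; round < maxRounds; round++) {
    const reply = await complete(messages);
    const assistant: AssistantMessage = { role: 'assistant', content: reply.content };
    if (reply.tool_calls && reply.tool_calls.length > 0) assistant.tool_calls = reply.tool_calls;
    messages.push(assistant);
    // No tool calls means the model produced its final answer.
    if (!assistant.tool_calls) return messages;
    for (const call of assistant.tool_calls) {
      const exec = executors[call.name];
      let result: ToolResult;
      if (!exec) {
        result = { ok: false, error: `unknown tool ${call.name}` };
      } else {
        try {
          result = await exec(JSON.parse(call.arguments));
        } catch (err) {
          // Failures become tool results so the model can apologize or retry.
          result = { ok: false, error: err instanceof Error ? err.message : String(err) };
        }
      }
      messages.push({ role: 'tool', tool_call_id: call.id, content: JSON.stringify(result) });
    }
  }
  return messages;
}
```

Passing the completion function and the executors in as parameters keeps the loop testable with stubs, in the same spirit as the `BYOKToolContext` comment about keeping the tool layer free of global state.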
--- a/README.md
+++ b/README.md
@@ -63,7 +63,7 @@ OD stands on four open-source shoulders:
 | | What you get |
 |---|---|
 | **Coding-agent CLIs (16)** | Claude Code · Codex CLI · Devin for Terminal · Cursor Agent · Gemini CLI · OpenCode · Qwen Code · Qoder CLI · GitHub Copilot CLI · Hermes (ACP) · Kimi CLI (ACP) · Pi (RPC) · Kiro CLI (ACP) · Kilo (ACP) · Mistral Vibe CLI (ACP) · DeepSeek TUI — auto-detected on `PATH`, swap with one click |
-| **BYOK fallback** | Protocol-specific API proxy at `/api/proxy/{anthropic,openai,azure,google}/stream` — paste `baseUrl` + `apiKey` + `model`, choose Anthropic / OpenAI / Azure OpenAI / Google Gemini, and the daemon normalizes SSE back to the same chat stream. Internal-IP/SSRF blocked at the daemon edge. |
+| **BYOK fallback** | Protocol-specific API proxy at `/api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` — paste `baseUrl` + `apiKey` + `model`, choose Anthropic / OpenAI / Azure OpenAI / Google Gemini / Ollama Cloud / SenseAudio, and the daemon normalizes SSE back to the same chat stream. SenseAudio chat additionally exposes `generate_image` and `generate_video` tools so the model can write rendered artifacts straight into the active project's folder. Internal-IP/SSRF blocked at the daemon edge. |
 | **Design systems built-in** | **129** — 2 hand-authored starters + 70 product systems (Linear, Stripe, Vercel, Airbnb, Tesla, Notion, Anthropic, Apple, Cursor, Supabase, Figma, Xiaohongshu, …) from [`awesome-design-md`][acd2], plus 57 design skills from [`awesome-design-skills`][ads] added directly under `design-systems/` |
 | **Skills built-in** | **31** — 27 in `prototype` mode (web-prototype, saas-landing, dashboard, mobile-app, gamified-app, social-carousel, magazine-poster, dating-web, sprite-animation, motion-frames, critique, tweaks, wireframe-sketch, pm-spec, eng-runbook, finance-report, hr-onboarding, invoice, kanban-board, team-okrs, …) + 4 in `deck` mode (`guizang-ppt` · `simple-deck` · `replit-deck` · `weekly-update`). Grouped in the picker by `scenario`: design / marketing / operation / engineering / product / finance / hr / sale / personal. |
 | **Media generation** | Image · video · audio surfaces ship alongside the design loop. **gpt-image-2** (Azure / OpenAI) for posters, avatars, infographics, illustrated maps · **Seedance 2.0** (ByteDance) for cinematic 15s text-to-video and image-to-video · **HyperFrames** ([heygen-com/hyperframes](https://github.com/heygen-com/hyperframes)) for HTML→MP4 motion graphics (product reveals, kinetic typography, data charts, social overlays, logo outros). **93** ready-to-replicate prompts gallery — 43 gpt-image-2 + 39 Seedance + 11 HyperFrames — under [`prompt-templates/`](prompt-templates/), with preview thumbnails and source attribution. Same chat surface as code; outputs a real `.mp4` / `.png` chip into the project workspace. |
@@ -304,7 +304,7 @@ Every layer is composable. Every layer is a file you can edit. Read [`apps/daemo
 | Frontend | Next.js 16 App Router + React 18 + TypeScript, Vercel-deployable |
 | Daemon | Node 24 · Express · SSE streaming · `better-sqlite3`; tables: `projects` · `conversations` · `messages` · `tabs` · `templates` |
 | Agent transport | `child_process.spawn`; typed-event parsers for `claude-stream-json` (Claude Code), `qoder-stream-json` (Qoder CLI), `copilot-stream-json` (Copilot), `json-event-stream` per-CLI parsers (Codex / Gemini / OpenCode / Cursor Agent), `acp-json-rpc` (Devin / Hermes / Kimi / Kiro / Kilo / Mistral Vibe via Agent Client Protocol), `pi-rpc` (Pi via stdio JSON-RPC), `plain` (Qwen Code / DeepSeek TUI) |
-| BYOK proxy | `POST /api/proxy/{anthropic,openai,azure,google}/stream` → provider-specific upstream APIs, normalized `delta/end/error` SSE; allows loopback local LLM providers, rejects non-loopback private/link-local/CGNAT/multicast/reserved hosts, and disables upstream redirects at the daemon edge |
+| BYOK proxy | `POST /api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` → provider-specific upstream APIs, normalized `delta/end/error` SSE; allows loopback local LLM providers, rejects non-loopback private/link-local/CGNAT/multicast/reserved hosts, and disables upstream redirects at the daemon edge |
 | Storage | Plain files in `.od/projects/<id>/` + SQLite at `.od/app.sqlite` + credentials at `.od/media-config.json` (gitignored, auto-created). `OD_DATA_DIR=<dir>` relocates all daemon data (used for test isolation and read-only-install setups); `OD_MEDIA_CONFIG_DIR=<dir>` further narrows the override to just `media-config.json` for setups that want to keep API keys outside the data dir |
 | Preview | Sandboxed iframe via `srcdoc` + per-skill `<artifact>` parser ([`apps/web/src/artifacts/parser.ts`](apps/web/src/artifacts/parser.ts)) |
 | Export | HTML (inline assets) · PDF (browser print, deck-aware) · PPTX (agent-driven via skill) · ZIP (archiver) · Markdown |
@@ -872,7 +872,7 @@ Pattern is the same as the rest: pick a template, edit the brief, send. The agen
 The chat / artifact loop gets the spotlight, but a handful of less-visible capabilities are already wired and worth knowing before you compare OD to anything else:

 - **Claude Design ZIP import.** Drop an export from claude.ai onto the welcome dialog. `POST /api/import/claude-design` extracts it into a real `.od/projects/<id>/`, opens the entry file as a tab, and stages a continue-where-Anthropic-left-off prompt for your local agent. No re-prompting, no "ask the model to re-create what we just had". ([`apps/daemon/src/server.ts`](apps/daemon/src/server.ts) — `/api/import/claude-design`)
-- **Multi-provider BYOK proxy.** `POST /api/proxy/{anthropic,openai,azure,google}/stream` takes `{ baseUrl, apiKey, model, messages }`, builds the provider-specific upstream request, normalizes SSE chunks into `delta/end/error`, and allows loopback local LLM providers while rejecting non-loopback private, link-local, CGNAT, multicast, reserved, and redirect targets to head off SSRF. OpenAI-compatible covers OpenAI, Azure AI Foundry `/openai/v1`, DeepSeek, Groq, MiMo, OpenRouter, Ollama, LM Studio, and self-hosted vLLM; Azure OpenAI adds deployment URL + `api-version`; Google uses Gemini `:streamGenerateContent`.
+- **Multi-provider BYOK proxy.** `POST /api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream` takes `{ baseUrl, apiKey, model, messages }`, builds the provider-specific upstream request, normalizes SSE chunks into `delta/end/error`, and allows loopback local LLM providers while rejecting non-loopback private, link-local, CGNAT, multicast, reserved, and redirect targets to head off SSRF. OpenAI-compatible covers OpenAI, Azure AI Foundry `/openai/v1`, DeepSeek, Groq, MiMo, OpenRouter, Ollama, LM Studio, and self-hosted vLLM; Azure OpenAI adds deployment URL + `api-version`; Google uses Gemini `:streamGenerateContent`.
 - **User-saved templates.** Once you like a render, `POST /api/templates` snapshots the HTML + metadata into the SQLite `templates` table. The next project picks it from a "your templates" row in the picker — same surface as the shipped 31, but yours.
 - **Tab persistence.** Every project remembers its open files and active tab in the `tabs` table. Reopen the project tomorrow and the workspace looks exactly the way you left it.
 - **Artifact lint API.** `POST /api/artifacts/lint` runs structural checks on a generated artifact (broken `<artifact>` framing, missing required side files, stale palette tokens) and returns findings the agent can read back into its next turn. The five-dim self-critique uses this to ground its score in real evidence, not vibes.
@@ -974,7 +974,7 @@ Long-form provenance write-up — what we take from each, what we deliberately d
 - [x] Web app + chat + question form + 5-direction picker + todo progress + sandboxed preview
 - [x] 31 skills + 72 design systems + 5 visual directions + 5 device frames
 - [x] SQLite-backed projects · conversations · messages · tabs · templates
-- [x] Multi-provider BYOK proxy (`/api/proxy/{anthropic,openai,azure,google}/stream`) with SSRF guard
+- [x] Multi-provider BYOK proxy (`/api/proxy/{anthropic,openai,azure,google,ollama,senseaudio}/stream`) with SSRF guard
 - [x] Claude Design ZIP import (`/api/import/claude-design`)
 - [x] Sidecar protocol + Electron desktop with IPC automation (STATUS / EVAL / SCREENSHOT / CONSOLE / CLICK / SHUTDOWN)
 - [x] Artifact lint API + 5-dim self-critique pre-emit gate
--- a/apps/daemon/src/byok-tools.ts
+++ b/apps/daemon/src/byok-tools.ts
@@ -0,0 +1,598 @@
+// Tool definitions and executors exposed to BYOK chat sessions.
+//
+// Why this file exists: the BYOK chat proxy (e.g. /api/proxy/senseaudio/stream)
+// is a thin pass-through that doesn't have the agent-runtime scaffolding the
+// CLI agents (Claude Code / Codex / ...) carry. To let users ask their BYOK
+// chat to "draw me a cat" and get an actual rendered PNG back, the daemon
+// injects an OpenAI-shaped `tools` definition into the upstream completion
+// request, then loops on the model's tool_calls: execute → feed the result
+// back as a `role: 'tool'` message → re-issue the completion. The chat surface
+// stays the same; the tool dispatch happens entirely daemon-side.
+//
+// Today we ship two tools, `generate_image` (/v1/image/sync) and
+// `generate_video` (/v1/video/create + /v1/video/status polling), both
+// backed by SenseAudio, since the BYOK chat session already authenticates
+// there with the same API key. More tools (TTS, research) can be added later.
+
+import path from 'node:path';
+import { writeFile } from 'node:fs/promises';
+import { randomBytes } from 'node:crypto';
+import { assertExternalAssetUrl } from './connectionTest.js';
+import { resolveProviderConfig } from './media-config.js';
+import { IMAGE_MODELS } from './media-models.js';
+import { ensureProject } from './projects.js';
+
+// SenseAudio image model allowlist — derived from the shared media-models
+// registry so adding a new SenseAudio image model in one place (media-models)
+// auto-extends the BYOK tool param enum, the Settings dropdown, and the
+// daemon-side validation. No drift, no hand-maintained constant.
+export const BYOK_SENSEAUDIO_IMAGE_MODELS: readonly string[] = IMAGE_MODELS
+  .filter((m) => m.provider === 'senseaudio')
+  .map((m) => m.id);
+
+// Default falls back to the first entry from the registry (today
+// `senseaudio-image-2.0-260319` — the multi-aspect latest). Kept as a
+// computed constant so re-ordering the registry rotates the default
+// without code edits here.
+export const BYOK_SENSEAUDIO_DEFAULT_IMAGE_MODEL =
+  BYOK_SENSEAUDIO_IMAGE_MODELS[0] ?? 'senseaudio-image-2.0-260319';
+
+export function isSenseAudioImageModel(value: unknown): value is string {
+  return typeof value === 'string' && BYOK_SENSEAUDIO_IMAGE_MODELS.includes(value);
+}
+
+const SENSEAUDIO_DEFAULT_BASE_URL = 'https://api.senseaudio.cn';
+const PROMPT_MAX_LENGTH = 2000;
+
+// SenseAudio video — the API only documents one model today, so the
+// wire id is a const. The chat tool's `generate_video` param surface
+// (prompt, aspect_ratio, duration, resolution, generate_audio) covers
+// every knob the doubao-seedance gateway accepts.
+const SENSEAUDIO_VIDEO_MODEL = 'doubao-seedance-2-0-260128';
+const SENSEAUDIO_VIDEO_ASPECT_RATIOS = ['16:9', '9:16', '4:3', '3:4', '1:1'] as const;
+const SENSEAUDIO_VIDEO_RESOLUTIONS = ['480p', '720p', '1080p'] as const;
+const SENSEAUDIO_VIDEO_DURATION_MIN = 4;
+const SENSEAUDIO_VIDEO_DURATION_MAX = 15;
+const SENSEAUDIO_VIDEO_DURATION_DEFAULT = 5;
+// Polling: SenseAudio docs recommend 5–10 s intervals; we pick 5 s and
+// cap total attempts so a stuck job can't pin the chat stream forever.
+// 120 attempts × 5 s = 10 min ceiling, which covers the real-world
+// doubao-seedance latency range (1080p + audio jobs frequently spend
+// 3–8 min on the gateway). A 5-min cap timed out otherwise valid jobs;
+// a longer ceiling makes the chat surface feel stuck.
+const SENSEAUDIO_VIDEO_POLL_INTERVAL_MS_DEFAULT = 5000;
+const SENSEAUDIO_VIDEO_MAX_POLLS = 120;
+// Periodic progress log every N polls so a long-running job emits some
+// signal to the daemon log — without flooding it with one line per
+// 5 s. 6 polls = ~30 s between progress lines.
+const SENSEAUDIO_VIDEO_PROGRESS_LOG_EVERY = 6;
+
+// SenseAudio's image gateway rejects non-standard pixel sizes with a 400
+// `参数错误：size` (verified against logs from a failed call on
+// 2026-05-16). We stick to common 16-multiple HD / SD sizes that the
+// gateway is known to accept: 1024×1024 for square, 1280×720 / 720×1280
+// for widescreen / portrait, 1024×768 / 768×1024 for the 4:3 family.
+// The table is duplicated in renderSenseAudioImage (media.ts) for the
+// CLI-agent path so both surfaces stay in sync.
+const ASPECT_TO_SIZE: Record<string, string> = {
+  '1:1': '1024x1024',
+  '16:9': '1280x720',
+  '9:16': '720x1280',
+  '4:3': '1024x768',
+  '3:4': '768x1024',
+};
+
+/**
+ * OpenAI-compatible tool definition for image generation. Injected into
+ * the upstream `tools` array on every /api/proxy/senseaudio/stream
+ * request so the LLM can decide on its own when to call it. The
+ * description deliberately tells the model to embed the returned URL
+ * in markdown — the chat UI already renders markdown images inline,
+ * so no client-side wiring is required for the bytes to show up.
+ */
+export const BYOK_SENSEAUDIO_TOOLS = [
+  {
+    type: 'function' as const,
+    function: {
+      name: 'generate_image',
+      description:
+        'Generate an image from a text prompt using SenseAudio image models. Returns a URL pointing to the rendered PNG. After this tool succeeds, embed the URL in your reply with markdown image syntax — ![alt](url) — so the user sees the image inline. Use this whenever the user asks to draw, create, generate, design, or illustrate something visual.',
+      parameters: {
+        type: 'object',
+        properties: {
+          prompt: {
+            type: 'string',
+            description:
+              'Detailed visual description of the image (Chinese or English are both fine). Include subject, style, lighting, composition. Maximum 2000 characters.',
+          },
+          aspect_ratio: {
+            type: 'string',
+            enum: ['1:1', '16:9', '9:16', '4:3', '3:4'],
+            description:
+              'Output aspect ratio. 1:1 for square avatars and product shots, 16:9 for hero banners, 9:16 for vertical phone posters, 4:3 for editorial covers, 3:4 for posters. Defaults to 1:1 when omitted.',
+          },
+          model: {
+            type: 'string',
+            enum: [...BYOK_SENSEAUDIO_IMAGE_MODELS],
+            description:
+              'Optional model override. Omit this to use the user-configured default from Settings (or the SenseAudio 2.0 multi-aspect model when unset). Choose senseaudio-image-2.0-260319 for multi-aspect generation, senseaudio-image-1.0-260319 for standard sizes, or doubao-seedream-5-0-260128 for high-resolution output through the ByteDance Seedream gateway. The user explicitly picked a default in their Settings — only override when the user asks for a different style/resolution.',
+          },
+        },
+        required: ['prompt'],
+      },
+    },
+  },
+  {
+    type: 'function' as const,
+    function: {
+      name: 'generate_video',
+      description:
+        'Generate a short video (4–15 seconds) from a text prompt using SenseAudio\'s ByteDance Seedance gateway. This is an asynchronous call that can take 30 s to a few minutes — the daemon polls the job for you, so the user just sees the chat waiting. After this tool succeeds, embed the returned URL in your reply as a markdown link, e.g. `[▶ Play video](url)`, because the chat\'s markdown renderer does not currently render `<video>` tags inline. Use this whenever the user asks for a video, clip, animation, or motion graphic.',
+      parameters: {
+        type: 'object',
+        properties: {
+          prompt: {
+            type: 'string',
+            description:
+              'Detailed motion description of the video. Include subject, action / camera move / scene transitions, style, lighting. Chinese or English. Maximum 2000 characters.',
+          },
+          aspect_ratio: {
+            type: 'string',
+            enum: [...SENSEAUDIO_VIDEO_ASPECT_RATIOS],
+            description:
+              'Output aspect ratio. 16:9 for cinematic, 9:16 for vertical (phone / TikTok), 1:1 for social square, 4:3 / 3:4 for editorial. Defaults to 16:9.',
+          },
+          duration: {
+            type: 'integer',
+            minimum: SENSEAUDIO_VIDEO_DURATION_MIN,
+            maximum: SENSEAUDIO_VIDEO_DURATION_MAX,
+            description:
+              `Video length in seconds (integer). Allowed range ${SENSEAUDIO_VIDEO_DURATION_MIN}–${SENSEAUDIO_VIDEO_DURATION_MAX}; defaults to ${SENSEAUDIO_VIDEO_DURATION_DEFAULT}. Shorter durations finish faster.`,
+          },
+          resolution: {
+            type: 'string',
+            enum: [...SENSEAUDIO_VIDEO_RESOLUTIONS],
+            description:
+              'Output resolution. 480p (fastest), 720p (default, balanced), 1080p (best quality, slowest). Pick 1080p only when the user explicitly asks for high resolution.',
+          },
+          generate_audio: {
+            type: 'boolean',
+            description:
+              'Whether the model also synthesises an audio track for the clip (background sound, ambience). Defaults to false to keep generation fast; flip to true when the user asks for sound, music, or a "video with audio".',
+          },
+        },
+        required: ['prompt'],
+      },
+    },
+  },
+];
+
+/**
+ * Runtime context the BYOK tool executor needs. Passed by the chat
+ * route on every call so the tool layer stays free of global state and
+ * can be unit-tested with a temp directory.
+ */
+export interface BYOKToolContext {
+  /** Daemon project root — used to look up media-config when the chat
+   *  session key is missing. */
+  projectRoot: string;
+  /** Daemon's PROJECTS_DIR (the `<projectRoot>/.od/projects/` folder
+   *  that holds per-project file trees). Generated images land in
+   *  `<projectsRoot>/<projectId>/byok-<id>.png` so the project's
+   *  FileViewer / DesignFilesPanel discover them automatically and
+   *  the file travels with the project on export, archive, rename. */
+  projectsRoot: string;
+  /** Active project id from the chat surface. Required — the BYOK
+   *  chat always runs inside a project, so the tool dispatch refuses
+   *  to fire without one rather than dump bytes into a global cache.
+   *  Validated upstream via `isSafeId`. */
+  projectId: string;
+  /** The BYOK chat session's API key, the first credential we try. Bypasses
+   *  the media-config indirection so the key the user just pasted for chat
+   *  is also the key the image call uses. */
+  upstreamApiKey: string;
+  /** The BYOK chat session's base URL (may be a custom gateway). Falls
+   *  back to api.senseaudio.cn. */
+  upstreamBaseUrl?: string;
+  /** Default image model the user picked in BYOK Settings, used when the
+   *  LLM didn't pass `model` in tool args. Validated upstream — anything
+   *  outside `BYOK_SENSEAUDIO_IMAGE_MODELS` is dropped so a stale
+   *  client-side config can't smuggle an unregistered model id through.
+   *  Falls back to `BYOK_SENSEAUDIO_DEFAULT_IMAGE_MODEL` (the registry's
+   *  first SenseAudio image entry) when missing. */
+  defaultImageModel?: string;
+  /** Test-only override for the video polling interval (ms). Production
+   *  uses 5 s (SenseAudio's recommendation) — tests pass small values
+   *  (e.g. 1 ms) to keep the suite fast without changing the polling
+   *  semantics. */
+  videoPollIntervalMs?: number;
+}
+
+export interface ImageToolResult {
+  ok: boolean;
+  /** Daemon-served URL on success. */
+  url?: string;
+  /** Short human-readable failure reason. Stuffed into the `tool` role
+   *  reply so the LLM can apologize / retry. */
+  error?: string;
+}
+
+function sanitizeAspectRatio(raw: unknown): string {
+  if (typeof raw !== 'string') return '1:1';
+  return ASPECT_TO_SIZE[raw] ? raw : '1:1';
+}
+
+/**
+ * Execute the `generate_image` tool. Calls SenseAudio /v1/image/sync,
+ * downloads the rendered bytes, writes them to
+ * `<projectsRoot>/<projectId>/byok-<id>.png`, and returns a daemon-served
+ * URL. Pure async; the caller emits any SSE events ("tool result ready").
+ *
+ * Failure modes return `{ok: false, error}` rather than throwing so the
+ * caller can feed the message back to the LLM as a tool_result; that
+ * lets the model apologize / suggest a retry instead of the chat
+ * silently stopping.
+ */
+export async function executeGenerateImage(
+  args: { prompt?: unknown; aspect_ratio?: unknown; model?: unknown },
+  ctx: BYOKToolContext,
+): Promise<ImageToolResult> {
+  const promptRaw = typeof args.prompt === 'string' ? args.prompt.trim() : '';
+  if (!promptRaw) return { ok: false, error: 'prompt is required' };
+  const prompt =
+    promptRaw.length > PROMPT_MAX_LENGTH
+      ? promptRaw.slice(0, PROMPT_MAX_LENGTH)
+      : promptRaw;
+
+  const aspect = sanitizeAspectRatio(args.aspect_ratio);
+  const size = ASPECT_TO_SIZE[aspect];
+
+  // Model resolution order — LLM args > user's Settings default > registry
+  // default. The allowlist guards every step so a hallucinated or stale id
+  // can never reach the senseaudio /v1/image/sync wire — the catalogue is
+  // the source of truth.
+  const senseAudioImageModel = isSenseAudioImageModel(args.model)
+    ? args.model
+    : isSenseAudioImageModel(ctx.defaultImageModel)
+      ? ctx.defaultImageModel
+      : BYOK_SENSEAUDIO_DEFAULT_IMAGE_MODEL;
+
+  // Resolve the project folder up front. ensureProject runs
+  // `isSafeId` internally, so an attacker who somehow bypassed the
+  // chat-routes guard and slipped `../escape` into projectId fails
+  // here before we make any upstream call. The returned `dir` is
+  // reused at writeFile time below.
+  let dir: string;
+  try {
+    dir = await ensureProject(ctx.projectsRoot, ctx.projectId);
+  } catch (err) {
+    return {
+      ok: false,
+      error: `invalid projectId for image storage: ${err instanceof Error ? err.message : String(err)}`,
+    };
+  }
+
+  // Prefer the BYOK session's key (what the user is actively using).
+  // Fall back to media-config (env var > stored) so a user who set
+  // OD_SENSEAUDIO_API_KEY but forgot to fill the chat panel still
+  // gets a working tool call.
+  let apiKey = ctx.upstreamApiKey;
+  let baseUrl = ctx.upstreamBaseUrl || SENSEAUDIO_DEFAULT_BASE_URL;
+  if (!apiKey) {
+    const resolved = await resolveProviderConfig(ctx.projectRoot, 'senseaudio');
+    apiKey = resolved.apiKey || '';
+    if (resolved.baseUrl) baseUrl = resolved.baseUrl;
+  }
+  if (!apiKey) {
+    return { ok: false, error: 'no SenseAudio API key available' };
+  }
+
+  const trimmedBase = baseUrl.replace(/\/+$/, '');
+  let imageUrl: string;
+  try {
+    const resp = await fetch(`${trimmedBase}/v1/image/sync`, {
+      method: 'POST',
+      headers: {
+        authorization: `Bearer ${apiKey}`,
+        'content-type': 'application/json',
+      },
+      body: JSON.stringify({
+        model: senseAudioImageModel,
+        prompt,
+        size,
+      }),
+    });
+    if (!resp.ok) {
+      const text = await resp.text().catch(() => '');
+      return {
+        ok: false,
+        error: `senseaudio image ${resp.status}: ${text.slice(0, 240)}`,
+      };
+    }
+    const data = (await resp.json()) as {
+      url?: string;
+      error_message?: string;
+      base_resp?: { status_code?: number; status_msg?: string };
+    };
+    if (data?.base_resp && data.base_resp.status_code !== 0) {
+      return {
+        ok: false,
+        error: `senseaudio image api error ${data.base_resp.status_code}: ${data.base_resp.status_msg || 'unknown'}`,
+      };
+    }
+    if (typeof data?.error_message === 'string' && data.error_message) {
+      return { ok: false, error: `senseaudio image: ${data.error_message}` };
+    }
+    if (typeof data?.url !== 'string' || !data.url) {
+      return { ok: false, error: 'senseaudio image response missing url' };
+    }
+    imageUrl = data.url;
+  } catch (err) {
+    return {
+      ok: false,
+      error: err instanceof Error ? err.message : String(err),
+    };
+  }
+
+  const imageUrlCheck = await assertExternalAssetUrl(imageUrl);
+  if (!imageUrlCheck.ok) return { ok: false, error: imageUrlCheck.error };
+
+  let bytes: Buffer;
+  try {
+    const imgResp = await fetch(imageUrl, { redirect: 'error' });
+    if (!imgResp.ok) {
+      return { ok: false, error: `image download ${imgResp.status}` };
+    }
+    bytes = Buffer.from(await imgResp.arrayBuffer());
+  } catch (err) {
+    return {
+      ok: false,
+      error: `image download failed: ${err instanceof Error ? err.message : String(err)}`,
+    };
+  }
+  if (bytes.length === 0) {
+    return { ok: false, error: 'image download returned zero bytes' };
+  }
+
+  // Persist into the active project's folder. `dir` was resolved up
+  // front via ensureProject — no DB write, no metadata side-effects —
+  // and the resulting path slots straight into the existing project
+  // file plumbing: listFiles enumerates it for the FileViewer,
+  // readProjectFile serves it via GET /api/projects/<id>/files/<filename>,
+  // and project archive / export pick it up automatically because it
+  // lives under the project's own directory.
+  //
+  // Filename pattern `byok-<timestamp>-<random>.png` keeps tool
+  // outputs distinguishable from user uploads at a glance while
+  // staying url-safe.
+  const id = `${Date.now().toString(36)}-${randomBytes(4).toString('hex')}`;
+  const filename = `byok-${id}.png`;
+  await writeFile(path.join(dir, filename), bytes);
+
+  // Return a relative URL through the project file serving route. The
+  // web's Next.js rewrites `/api/:path*` to the daemon (see
+  // apps/web/next.config.ts), so the chat UI loads the image
+  // same-origin — satisfying the strict CSP (`img-src 'self' data:
+  // blob:`) without any CORS plumbing.
+  return {
+    ok: true,
+    url: `/api/projects/${encodeURIComponent(ctx.projectId)}/files/${filename}`,
+  };
+}
+
+function sanitizeVideoAspectRatio(raw: unknown): (typeof SENSEAUDIO_VIDEO_ASPECT_RATIOS)[number] {
+  if (typeof raw !== 'string') return '16:9';
+  return (SENSEAUDIO_VIDEO_ASPECT_RATIOS as readonly string[]).includes(raw)
+    ? (raw as (typeof SENSEAUDIO_VIDEO_ASPECT_RATIOS)[number])
+    : '16:9';
+}
+
+function sanitizeVideoResolution(raw: unknown): (typeof SENSEAUDIO_VIDEO_RESOLUTIONS)[number] {
+  if (typeof raw !== 'string') return '720p';
+  return (SENSEAUDIO_VIDEO_RESOLUTIONS as readonly string[]).includes(raw)
+    ? (raw as (typeof SENSEAUDIO_VIDEO_RESOLUTIONS)[number])
+    : '720p';
+}
+
+function sanitizeVideoDuration(raw: unknown): number {
+  if (typeof raw !== 'number' || !Number.isFinite(raw)) return SENSEAUDIO_VIDEO_DURATION_DEFAULT;
+  const rounded = Math.round(raw);
+  if (rounded < SENSEAUDIO_VIDEO_DURATION_MIN) return SENSEAUDIO_VIDEO_DURATION_MIN;
+  if (rounded > SENSEAUDIO_VIDEO_DURATION_MAX) return SENSEAUDIO_VIDEO_DURATION_MAX;
+  return rounded;
+}
+
+const sleep = (ms: number): Promise<void> =>
+  new Promise((resolve) => setTimeout(resolve, ms));
+
+/**
+ * Execute the `generate_video` tool. SenseAudio's video API is
+ * asynchronous-only: POST /v1/video/create returns a task_id, then
+ * GET /v1/video/status?id=<task_id> reports `pending` / `processing`
+ * → `completed` (with `video_url`) or `failed` (with `error_message`).
+ * We poll every `videoPollIntervalMs` (default 5 s) and bail after
+ * `SENSEAUDIO_VIDEO_MAX_POLLS` so a stuck upstream can't pin the
+ * chat stream forever.
+ *
+ * The chat tool waits for the whole loop, so the daemon's outbound
+ * SSE response from /api/proxy/senseaudio/stream stays open for the
+ * duration. That's intentional — the next chat turn cannot begin
+ * until we have a URL to feed back into the tool_result.
+ */
+export async function executeGenerateVideo(
+  args: {
+    prompt?: unknown;
+    aspect_ratio?: unknown;
+    duration?: unknown;
+    resolution?: unknown;
+    generate_audio?: unknown;
+  },
+  ctx: BYOKToolContext,
+): Promise<ImageToolResult> {
+  const promptRaw = typeof args.prompt === 'string' ? args.prompt.trim() : '';
+  if (!promptRaw) return { ok: false, error: 'prompt is required' };
+  const prompt =
+    promptRaw.length > PROMPT_MAX_LENGTH
+      ? promptRaw.slice(0, PROMPT_MAX_LENGTH)
+      : promptRaw;
+
+  const ratio = sanitizeVideoAspectRatio(args.aspect_ratio);
+  const resolution = sanitizeVideoResolution(args.resolution);
+  const duration = sanitizeVideoDuration(args.duration);
+  const generateAudio = args.generate_audio === true;
+
+  let dir: string;
+  try {
+    dir = await ensureProject(ctx.projectsRoot, ctx.projectId);
+  } catch (err) {
+    return {
+      ok: false,
+      error: `invalid projectId for video storage: ${err instanceof Error ? err.message : String(err)}`,
+    };
+  }
+
+  let apiKey = ctx.upstreamApiKey;
+  let baseUrl = ctx.upstreamBaseUrl || SENSEAUDIO_DEFAULT_BASE_URL;
+  if (!apiKey) {
+    const resolved = await resolveProviderConfig(ctx.projectRoot, 'senseaudio');
+    apiKey = resolved.apiKey || '';
+    if (resolved.baseUrl) baseUrl = resolved.baseUrl;
+  }
+  if (!apiKey) {
+    return { ok: false, error: 'no SenseAudio API key available' };
+  }
+  const trimmedBase = baseUrl.replace(/\/+$/, '');
+
+  // Step 1: POST /v1/video/create → task_id.
+  let taskId: string;
+  try {
+    const resp = await fetch(`${trimmedBase}/v1/video/create`, {
+      method: 'POST',
+      headers: {
+        authorization: `Bearer ${apiKey}`,
+        'content-type': 'application/json',
+      },
+      body: JSON.stringify({
+        model: SENSEAUDIO_VIDEO_MODEL,
+        content: [{ type: 'text', text: prompt }],
+        duration,
+        resolution,
+        ratio,
+        provider_specific: { generate_audio: generateAudio },
+      }),
+    });
+    if (!resp.ok) {
+      const text = await resp.text().catch(() => '');
+      return {
+        ok: false,
+        error: `senseaudio video create ${resp.status}: ${text.slice(0, 240)}`,
+      };
+    }
+    const data = (await resp.json()) as { task_id?: string };
+    if (typeof data?.task_id !== 'string' || !data.task_id) {
+      return { ok: false, error: 'senseaudio video create response missing task_id' };
+    }
+    taskId = data.task_id;
+  } catch (err) {
+    return {
+      ok: false,
+      error: err instanceof Error ? err.message : String(err),
+    };
+  }
+
+  // Step 2: poll /v1/video/status until completed / failed / timeout.
+  const pollIntervalMs = ctx.videoPollIntervalMs ?? SENSEAUDIO_VIDEO_POLL_INTERVAL_MS_DEFAULT;
+  let videoUrl = '';
+  for (let attempt = 0; attempt < SENSEAUDIO_VIDEO_MAX_POLLS; attempt++) {
+    await sleep(pollIntervalMs);
+    let statusResp: Response;
+    try {
+      statusResp = await fetch(
+        `${trimmedBase}/v1/video/status?id=${encodeURIComponent(taskId)}`,
+        {
+          method: 'GET',
+          headers: { authorization: `Bearer ${apiKey}` },
+        },
+      );
+    } catch (err) {
+      return {
+        ok: false,
+        error: `senseaudio video poll failed: ${err instanceof Error ? err.message : String(err)}`,
+      };
+    }
+    if (!statusResp.ok) {
+      const text = await statusResp.text().catch(() => '');
+      return {
+        ok: false,
+        error: `senseaudio video status ${statusResp.status}: ${text.slice(0, 240)}`,
+      };
+    }
+    const data = (await statusResp.json()) as {
+      status?: string;
+      progress?: number;
+      video_url?: string;
+      error_message?: string;
+    };
+    if (data?.status === 'completed') {
+      if (typeof data.video_url !== 'string' || !data.video_url) {
+        return { ok: false, error: 'senseaudio video status completed but missing video_url' };
+      }
+      videoUrl = data.video_url;
+      break;
+    }
+    if (data?.status === 'failed') {
+      return {
+        ok: false,
+        error: `senseaudio video failed: ${data.error_message || 'unknown reason'}`,
+      };
+    }
+    // pending / processing — continue polling. Emit a periodic log line
+    // so a stuck job surfaces in the daemon log instead of silently
+    // burning attempts.
+    if ((attempt + 1) % SENSEAUDIO_VIDEO_PROGRESS_LOG_EVERY === 0) {
+      const pct = typeof data?.progress === 'number' ? data.progress : '?';
+      console.log(
+        `[proxy:senseaudio] generate_video poll ${attempt + 1}/${SENSEAUDIO_VIDEO_MAX_POLLS} task=${taskId} status=${data?.status ?? 'unknown'} progress=${pct}`,
+      );
+    }
+  }
+  if (!videoUrl) {
+    return {
+      ok: false,
+      error: `senseaudio video timed out after ${SENSEAUDIO_VIDEO_MAX_POLLS} polls`,
+    };
+  }
+
+  // Step 3: download the mp4 bytes and persist into the project folder.
+  // Re-validate the returned URL via assertExternalAssetUrl (which wraps
+  // validateBaseUrlResolved) so a malicious gateway can't steer us to
+  // 169.254.169.254 (AWS / Azure metadata service) or RFC1918 hosts.
+  const videoUrlCheck = await assertExternalAssetUrl(videoUrl);
+  if (!videoUrlCheck.ok) return { ok: false, error: videoUrlCheck.error };
+
+  let bytes: Buffer;
+  try {
+    const videoResp = await fetch(videoUrl, { redirect: 'error' });
+    if (!videoResp.ok) {
+      return { ok: false, error: `video download ${videoResp.status}` };
+    }
+    bytes = Buffer.from(await videoResp.arrayBuffer());
+  } catch (err) {
+    return {
+      ok: false,
+      error: `video download failed: ${err instanceof Error ? err.message : String(err)}`,
+    };
+  }
+  if (bytes.length === 0) {
+    return { ok: false, error: 'video download returned zero bytes' };
+  }
+  const id = `${Date.now().toString(36)}-${randomBytes(4).toString('hex')}`;
+  const filename = `byok-video-${id}.mp4`;
+  await writeFile(path.join(dir, filename), bytes);
+
+  return {
+    ok: true,
+    url: `/api/projects/${encodeURIComponent(ctx.projectId)}/files/${filename}`,
+  };
+}
+
--- a/apps/daemon/src/chat-routes.ts
+++ b/apps/daemon/src/chat-routes.ts
@ -1,13 +1,22 @@
 import type { Express } from 'express';
 import type { RouteDeps } from './server-context.js';
 import { newInsertId } from './analytics.js';
+import { seedProviderIfMissing } from './media-config.js';
+import {
+  BYOK_SENSEAUDIO_TOOLS,
+  executeGenerateImage,
+  executeGenerateVideo,
+  isSenseAudioImageModel,
+  type BYOKToolContext,
+} from './byok-tools.js';
+import { isSafeId as isSafeProjectId } from './projects.js';
 import {
  agentIdToTracking,
  projectKindToTracking,
 } from '@open-design/contracts/analytics';
 import { validateBaseUrlResolved } from './connectionTest.js';

-export interface RegisterChatRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'chat' | 'agents' | 'critique' | 'validation' | 'lifecycle'> {}
+export interface RegisterChatRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'chat' | 'agents' | 'critique' | 'validation' | 'lifecycle' | 'paths'> {}

 // Invariant: a chat assistant message row reflects its run's terminal state
 // even when the web client never persists the cancel/finish itself (refresh
@ -310,13 +319,13 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
    const protocol = body.protocol;
    if (
      typeof protocol !== 'string' ||
-      !['anthropic', 'openai', 'azure', 'google', 'ollama'].includes(protocol)
+      !['anthropic', 'openai', 'azure', 'google', 'ollama', 'senseaudio'].includes(protocol)
    ) {
      return sendApiError(
        res,
        400,
        'BAD_REQUEST',
-        'protocol must be one of anthropic|openai|azure|google|ollama',
+        'protocol must be one of anthropic|openai|azure|google|ollama|senseaudio',
      );
    }
    if (
@ -371,13 +380,13 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
        const protocol = body.protocol;
        if (
          typeof protocol !== 'string' ||
-          !['anthropic', 'openai', 'azure', 'google', 'ollama'].includes(protocol)
+          !['anthropic', 'openai', 'azure', 'google', 'ollama', 'senseaudio'].includes(protocol)
        ) {
          return sendApiError(
            res,
            400,
            'BAD_REQUEST',
-            'protocol must be one of anthropic|openai|azure|google|ollama',
+            'protocol must be one of anthropic|openai|azure|google|ollama|senseaudio',
          );
        }
        if (
@ -1172,4 +1181,354 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
    }
  });

+  // SenseAudio chat completions. Wire-compatible with OpenAI (POST
+  // /v1/chat/completions, Bearer auth, SSE `data: {...}` + `data: [DONE]`)
+  // plus a daemon-side tool loop: the handler injects an OpenAI
+  // `tools` array on every upstream request and, when the model
+  // responds with a `tool_calls` finish_reason, executes the call
+  // locally, appends the assistant + tool messages to the conversation,
+  // and re-issues the completion. This is how BYOK chat — which has
+  // no agent-runtime scaffolding — gets image-generation parity with
+  // the CLI agent path. Loop is bounded by MAX_BYOK_TOOL_LOOPS so a
+  // misbehaving model can't pin the daemon in an infinite tool dance.
+  const MAX_BYOK_TOOL_LOOPS = 3;
+
+  type AccumulatedToolCall = { id: string; name: string; arguments: string };
+  type TurnResult =
+    | { kind: 'text_end' }
+    | { kind: 'error' }
+    | {
+        kind: 'tool_calls';
+        assistantMessage: any;
+        toolCalls: Array<{ id: string; type: 'function'; function: { name: string; arguments: string } }>;
+      };
+
+  app.post('/api/proxy/senseaudio/stream', async (req, res) => {
+    const proxyBody = req.body || {};
+    if (rejectProxyPluginContext(proxyBody, res)) return;
+    const {
+      baseUrl,
+      apiKey,
+      model,
+      systemPrompt,
+      messages,
+      maxTokens,
+      projectId,
+      byokImageModel,
+    } = proxyBody;
+    if (!apiKey || !model) {
+      return sendApiError(
+        res,
+        400,
+        'BAD_REQUEST',
+        'apiKey and model are required',
+      );
+    }
+    // projectId is required because the BYOK generate_image and
+    // generate_video tools write into the active project's folder; without
+    // one we'd have to fall back to a daemon-global cache that orphans the
+    // file. The web client always passes project.id from ProjectView, so a
+    // missing value means the request did not come through the chat surface.
+    if (typeof projectId !== 'string' || !isSafeProjectId(projectId)) {
+      return sendApiError(
+        res,
+        400,
+        'BAD_REQUEST',
+        'projectId is required and must be a safe identifier',
+      );
+    }
+
+    const effectiveBaseUrl = baseUrl || 'https://api.senseaudio.cn';
+    const validated = await validateExternalApiBaseUrl(effectiveBaseUrl);
+    if (validated.error) {
+      return sendApiError(
+        res,
+        validated.forbidden ? 403 : 400,
+        validated.forbidden ? 'FORBIDDEN' : 'BAD_REQUEST',
+        validated.error,
+      );
+    }
+
+    const url = appendVersionedApiPath(effectiveBaseUrl, '/chat/completions');
+    console.log(
+      `[proxy:senseaudio] ${req.method} ${validated.parsed?.hostname ?? '?'} model=${model} project=${projectId}`,
+    );
+
+    const workingMessages: any[] = Array.isArray(messages) ? [...messages] : [];
+    if (typeof systemPrompt === 'string' && systemPrompt) {
+      workingMessages.unshift({ role: 'system', content: systemPrompt });
+    }
+
+    // User-configured BYOK default image model. Drop silently if the
+    // client sent an id outside the SenseAudio registry; the tool
+    // will fall back to the registry default and the LLM can still
+    // override per-call via the tool's `model` arg.
+    const validDefaultImageModel = isSenseAudioImageModel(byokImageModel)
+      ? byokImageModel
+      : undefined;
+
+    // Tool execution context, built once per request. The image tool
+    // writes into `<projectsRoot>/<projectId>/byok-<id>.png` and returns
+    // a relative URL via `/api/projects/:id/files/:filename`. The web
+    // app's Next.js config rewrites `/api/:path*` to the daemon, so the
+    // chat UI loads images same-origin through the standard project
+    // file route, with no CSP / CORS exceptions needed.
+    const toolCtx: BYOKToolContext = {
+      projectRoot: ctx.paths.PROJECT_ROOT,
+      projectsRoot: ctx.paths.PROJECTS_DIR,
+      projectId,
+      upstreamApiKey: apiKey,
+      upstreamBaseUrl: effectiveBaseUrl,
+      // Spread-conditional because tsconfig's exactOptionalPropertyTypes
+      // forbids `field: undefined` on an optional slot. The byok-tools
+      // executor reads `ctx.defaultImageModel` with `isSenseAudioImageModel`
+      // anyway, so a missing key and an undefined value behave the same.
+      ...(validDefaultImageModel
+        ? { defaultImageModel: validDefaultImageModel }
+        : {}),
+    };
+
+    // Run one round-trip: POST to upstream, stream text deltas to the
+    // client as they arrive, accumulate any tool_call deltas. Returns
+    // a typed result describing what to do next (loop on tool calls,
+    // close the stream, or bail on error). Closures capture all the
+    // SSE helpers from registerChatRoutes.
+    const runSenseAudioTurn = async (
+      sse: any,
+      messagesForTurn: any[],
+    ): Promise<TurnResult> => {
+      const payload: any = {
+        model,
+        messages: messagesForTurn,
+        max_tokens:
+          typeof maxTokens === 'number' && maxTokens > 0 ? maxTokens : 8192,
+        stream: true,
+        tools: BYOK_SENSEAUDIO_TOOLS,
+        tool_choice: 'auto',
+      };
+      const response = await fetch(url, {
+        method: 'POST',
+        headers: {
+          'Content-Type': 'application/json',
+          Authorization: `Bearer ${apiKey}`,
+        },
+        body: JSON.stringify(payload),
+        redirect: 'error',
+      });
+
+      if (!response.ok) {
+        const errorText = await response.text();
+        console.error(
+          `[proxy:senseaudio] upstream error: ${response.status} ${redactAuthTokens(errorText)}`,
+        );
+        sendProxyError(sse, `Upstream error: ${response.status}`, {
+          code: proxyErrorCode(response.status),
+          details: errorText,
+          retryable: response.status === 429 || response.status >= 500,
+        });
+        return { kind: 'error' };
+      }
+
+      const accum: Record<number, AccumulatedToolCall> = {};
+      let finishReason = '';
+      let providerError = '';
+
+      await streamUpstreamSse(response, ({ payload, data }: any) => {
+        if (payload === '[DONE]') return true;
+        if (!data) return false;
+
+        const streamErr = extractStreamErrorMessage(data);
+        if (streamErr) {
+          providerError = streamErr;
+          return true;
+        }
+
+        const choices = (data as any).choices;
+        if (!Array.isArray(choices) || choices.length === 0) return false;
+        const choice = choices[0] || {};
+        const delta = choice.delta || {};
+
+        // Text content streams to the client unchanged. Tool turns and
+        // text turns can both share this path — the OpenAI protocol
+        // never emits text+tool_calls in the same chunk, but it can
+        // emit text before / after a tool_call in the same turn, and
+        // we want the user to see whatever the model decided to say.
+        if (typeof delta.content === 'string' && delta.content) {
+          sse.send('delta', { delta: delta.content });
+        }
+
+        // Tool call deltas stream as fragments — `id` arrives once at
+        // the start, `function.name` once at the start, and
+        // `function.arguments` accumulates a chunked JSON string we
+        // have to concatenate. Parallel calls use the `index` field to
+        // distinguish slots. Default to 0 when omitted (older models).
+        if (Array.isArray(delta.tool_calls)) {
+          for (const tc of delta.tool_calls) {
+            const idx = typeof tc?.index === 'number' ? tc.index : 0;
+            if (!accum[idx]) {
+              accum[idx] = { id: '', name: '', arguments: '' };
+            }
+            const slot = accum[idx];
+            if (typeof tc.id === 'string' && tc.id) slot.id = tc.id;
+            if (typeof tc.function?.name === 'string' && tc.function.name) {
+              slot.name = tc.function.name;
+            }
+            if (typeof tc.function?.arguments === 'string') {
+              slot.arguments += tc.function.arguments;
+            }
+          }
+        }
+
+        if (typeof choice.finish_reason === 'string' && choice.finish_reason) {
+          finishReason = choice.finish_reason;
+        }
+        return false;
+      });
+
+      if (providerError) {
+        sendProxyError(sse, `Provider error: ${providerError}`, {
+          details: providerError,
+        });
+        return { kind: 'error' };
+      }
+
+      if (finishReason === 'tool_calls' && Object.keys(accum).length > 0) {
+        const indices = Object.keys(accum)
+          .map(Number)
+          .sort((a, b) => a - b);
+        const toolCalls = indices.map((i) => ({
+          id: accum[i]!.id || `call_${i}`,
+          type: 'function' as const,
+          function: {
+            name: accum[i]!.name,
+            arguments: accum[i]!.arguments,
+          },
+        }));
+        return {
+          kind: 'tool_calls',
+          assistantMessage: {
+            role: 'assistant',
+            content: null,
+            tool_calls: toolCalls,
+          },
+          toolCalls,
+        };
+      }
+
+      return { kind: 'text_end' };
+    };
+
+    const executeOneTool = async (call: {
+      id: string;
+      function: { name: string; arguments: string };
+    }): Promise<{ ok: boolean; url?: string; error?: string; kind?: 'image' | 'video' }> => {
+      const fnName = call?.function?.name ?? '';
+      if (fnName !== 'generate_image' && fnName !== 'generate_video') {
+        return {
+          ok: false,
+          error: `unknown tool: ${fnName || 'unnamed'}`,
+        };
+      }
+      let args: any = {};
+      try {
+        args = JSON.parse(call.function.arguments || '{}');
+      } catch {
+        return { ok: false, error: 'tool arguments were not valid JSON' };
+      }
+      if (fnName === 'generate_image') {
+        const result = await executeGenerateImage(args, toolCtx);
+        return { ...result, kind: 'image' };
+      }
+      // generate_video: long-running (polls for up to ~10 min), async with polling.
+      const result = await executeGenerateVideo(args, toolCtx);
+      return { ...result, kind: 'video' };
+    };
+
+    const sse = createSseResponse(res);
+    sse.send('start', { model });
+
+    // SenseAudio's gateway issues one API key that works for both
+    // /v1/chat/completions and the image / TTS surfaces. Mirror the
+    // BYOK key into media-config so the CLI agent path (`od media
+    // generate`) picks it up automatically — fire-and-forget; the
+    // chat stream must not block on the disk write. seedProviderIfMissing
+    // is idempotent and preserves env-var-resolved keys.
+    seedProviderIfMissing(ctx.paths.PROJECT_ROOT, 'senseaudio', {
+      apiKey,
+      baseUrl: effectiveBaseUrl,
+    })
+      .then((seeded) => {
+        if (seeded) {
+          console.log(
+            '[proxy:senseaudio] seeded media-config.senseaudio from BYOK key',
+          );
+        }
+      })
+      .catch((err: unknown) => {
+        console.warn(
+          `[proxy:senseaudio] seed media-config failed: ${
+            err instanceof Error ? err.message : String(err)
+          }`,
+        );
+      });
+
+    try {
+      for (let loop = 0; loop < MAX_BYOK_TOOL_LOOPS; loop++) {
+        const turn = await runSenseAudioTurn(sse, workingMessages);
+        if (turn.kind === 'error') return sse.end();
+        if (turn.kind === 'text_end') {
+          sse.send('end', {});
+          return sse.end();
+        }
+        // turn.kind === 'tool_calls'
+        workingMessages.push(turn.assistantMessage);
+        for (const call of turn.toolCalls) {
+          const result = await executeOneTool(call);
+          // The tool result is delivered to the model as a `tool` role
+          // message — a structured payload the model can interpret. We
+          // also surface a daemon-side log line so a user reporting "no
+          // image showed up" can grep for the call id. The kind field
+          // distinguishes image vs video so the daemon picks the right
+          // embedding hint for the model (markdown image syntax for
+          // PNG, markdown link for MP4 since the chat renderer doesn't
+          // currently render <video> tags).
+          const toolName = call?.function?.name ?? 'unknown';
+          if (result.ok) {
+            console.log(
+              `[proxy:senseaudio] ${toolName} OK: ${call.id} → ${result.url}`,
+            );
+          } else {
+            console.warn(
+              `[proxy:senseaudio] ${toolName} FAILED: ${call.id} — ${result.error}`,
+            );
+          }
+          const content = result.ok
+            ? result.kind === 'video'
+              ? `Video generated successfully. URL: ${result.url}. Reply to the user with a clickable markdown link, e.g. [▶ Play video](${result.url}). Do NOT use markdown image syntax — the chat renderer does not embed <video> tags.`
+              : `Image generated successfully. URL: ${result.url}. Reply to the user with: ![generated image](${result.url})`
+            : result.kind === 'video'
+              ? `Video generation failed: ${result.error}. Apologize briefly and suggest a retry with a more specific prompt or a shorter duration.`
+              : `Image generation failed: ${result.error}. Apologize briefly and suggest a retry with a more specific prompt.`;
+          workingMessages.push({
+            role: 'tool',
+            tool_call_id: call.id,
+            content,
+          });
+        }
+      }
+      // Tool loop exhausted: the model still wants to call tools but we
+      // refuse another round. Close the stream gracefully; the last text
+      // delta the model emitted (if any) is already on the wire.
+      console.warn(
+        `[proxy:senseaudio] tool loop bounded at MAX_BYOK_TOOL_LOOPS=${MAX_BYOK_TOOL_LOOPS}`,
+      );
+      sse.send('end', {});
+      return sse.end();
+    } catch (err: any) {
+      console.error(`[proxy:senseaudio] internal error: ${err?.message ?? String(err)}`);
+      sendProxyError(sse, err?.message ?? String(err), { code: 'INTERNAL_ERROR' });
+      sse.end();
+    }
+  });
+
 }
--- a/apps/daemon/src/connectionTest.ts
+++ b/apps/daemon/src/connectionTest.ts
@ -119,6 +119,41 @@ export async function validateBaseUrlResolved(
  return sync;
 }

+/**
+ * SSRF guard for asset URLs handed back inside a successful API
+ * response — typically a `data.url` or `data.video_url` that points
+ * at the gateway's CDN, but is attacker-controllable when the
+ * upstream gateway is compromised or misconfigured. Routes the URL
+ * through `validateBaseUrlResolved` (DNS-resolve → reject loopback,
+ * RFC1918, link-local, CGNAT, metadata-service IPs) and returns a
+ * discriminated union so callers don't have to repeat the
+ * `validated.error || !validated.parsed` plumbing.
+ *
+ * Two callers today:
+ *   - `byok-tools.ts` for the chat-tool image/video downloads
+ *   - `media.ts` `renderSenseAudioImage` for the CLI agent path
+ * Both hand the URL straight to `fetch(...)` next, so pair this
+ * guard with `redirect: 'error'` on the fetch to also block a
+ * 3xx hop into private space.
+ */
+export async function assertExternalAssetUrl(
+  rawUrl: string,
+): Promise<{ ok: true } | { ok: false; error: string }> {
+  if (typeof rawUrl !== 'string' || !rawUrl) {
+    return { ok: false, error: 'empty download url' };
+  }
+  const validated = await validateBaseUrlResolved(rawUrl);
+  if (validated.error || !validated.parsed) {
+    return {
+      ok: false,
+      error: validated.forbidden
+        ? `blocked download url (${validated.error ?? 'internal address'})`
+        : `invalid download url: ${validated.error ?? 'unknown reason'}`,
+    };
+  }
+  return { ok: true };
+}
+
 // Aggressive but not punitive — happy paths usually return in under 2 s.
 // Override with OD_CONNECTION_TEST_PROVIDER_TIMEOUT_MS for slow networks
 // or distant providers; invalid values fall back to the default.
@ -315,10 +350,10 @@ function inspectProviderCompletion(
  const obj = data && typeof data === 'object' ? data as Record<string, unknown> : null;
  if (!obj) return { valid: false };

-  if (protocol === 'openai' || protocol === 'azure') {
+  if (protocol === 'openai' || protocol === 'azure' || protocol === 'senseaudio') {
    const responseModel = typeof obj.model === 'string' ? obj.model : '';
    if (
-      protocol === 'openai' &&
+      (protocol === 'openai' || protocol === 'senseaudio') &&
      enforceResponseModel &&
      responseModel &&
      requestedModel &&
@ -518,6 +553,12 @@ function buildProviderCall(input: ProviderTestRequest): ProviderCallShape {
        },
      };
    case 'openai':
+    case 'senseaudio':
+      // SenseAudio is wire-compatible with OpenAI (POST /v1/chat/completions,
+      // Bearer auth, identical body + response shape), so the connection
+      // smoke test reuses the same call shape. We default the base URL
+      // upstream-side in chat-routes; this layer assumes the caller passed
+      // a concrete URL via the BYOK form.
      return {
        url: appendVersionedApiPath(baseUrl, '/chat/completions'),
        headers: {
--- a/apps/daemon/src/media-config.ts
+++ b/apps/daemon/src/media-config.ts
@ -521,3 +521,53 @@ export async function writeConfig(projectRoot: string, body: unknown) {
  await writeStored(projectRoot, next);
  return readMaskedConfig(projectRoot);
 }
+
+/**
+ * Idempotent "seed if empty" write for a single provider slot. The chat
+ * proxy uses this to mirror a BYOK key into media-config so the agent's
+ * image / TTS path picks up the same credential without the user having
+ * to paste it twice. Strict rules:
+ *   * No-op when an apiKey is ALREADY stored for `providerId` (the user
+ *     may have configured Media independently and we never overwrite).
+ *   * No-op when an env-var key resolves for `providerId` (env wins
+ *     regardless of disk state — seeding would be invisible).
+ *   * No-op when the incoming `apiKey` is empty (nothing to seed; the
+ *     key itself is not verified upstream by this function).
+ *   * Otherwise merge `{ [providerId]: entry }` into the existing
+ *     provider map and persist. All other provider slots and aliases
+ *     are preserved byte-for-byte.
+ *
+ * Returns `true` when a write happened (caller can log), `false` when
+ * the call was a no-op. Errors are surfaced — the caller decides
+ * whether to swallow them (fire-and-forget) or propagate.
+ */
+export async function seedProviderIfMissing(
+  projectRoot: string,
+  providerId: string,
+  entry: { apiKey?: string; baseUrl?: string; model?: string },
+): Promise<boolean> {
+  if (!PROVIDER_IDS.includes(providerId)) return false;
+  const apiKey = entry.apiKey?.trim() ?? '';
+  if (!apiKey) return false;
+  // Env var wins at resolution time, so seeding when env is set would
+  // be invisible to the user. Skip to avoid confusing on-disk state.
+  if (readEnvKey(providerId)) return false;
+
+  const prior = await readStored(projectRoot);
+  const priorApiKey =
+    typeof prior[providerId]?.apiKey === 'string' && prior[providerId].apiKey.trim()
+      ? prior[providerId].apiKey.trim()
+      : '';
+  if (priorApiKey) return false;
+
+  const baseUrl = entry.baseUrl?.trim() ?? '';
+  const model = entry.model?.trim() ?? '';
+  const next: ProviderMap = { ...prior };
+  next[providerId] = {
+    apiKey,
+    ...(baseUrl ? { baseUrl } : {}),
+    ...(model ? { model } : {}),
+  };
+  await writeStored(projectRoot, next);
+  return true;
+}
--- a/apps/daemon/src/media-models.ts
+++ b/apps/daemon/src/media-models.ts
@ -60,7 +60,7 @@ export const MEDIA_PROVIDERS: MediaProvider[] = [
  {
    id: 'senseaudio',
    label: 'SenseAudio',
-    hint: 'TTS · 70+ system voices · clone',
+    hint: '',
    integrated: true,
    defaultBaseUrl: 'https://api.senseaudio.cn',
    docsUrl: 'https://docs.senseaudio.cn',
@ -80,6 +80,10 @@ export const IMAGE_MODELS: MediaModel[] = [
  { id: 'doubao-seedream-3-0-t2i-250415', label: 'seedream-3.0', hint: 'ByteDance · Doubao image', provider: 'volcengine', caps: ['t2i'] },
  { id: 'doubao-seededit-3-0-i2i-250628', label: 'seededit-3.0', hint: 'ByteDance · image edit', provider: 'volcengine', caps: ['i2i'] },

+  { id: 'senseaudio-image-2.0-260319', label: 'senseaudio-image-2.0', hint: 'SenseAudio · multi-aspect, latest', provider: 'senseaudio', caps: ['t2i', 'i2i'] },
+  { id: 'senseaudio-image-1.0-260319', label: 'senseaudio-image-1.0', hint: 'SenseAudio · standard', provider: 'senseaudio', caps: ['t2i', 'i2i'] },
+  { id: 'doubao-seedream-5-0-260128', label: 'seedream-5.0', hint: 'SenseAudio · ByteDance Seedream 5.0 hi-res', provider: 'senseaudio', caps: ['t2i', 'i2i'] },
+
  { id: 'grok-imagine-image', label: 'grok-imagine-image', hint: 'xAI · 2K text-to-image', provider: 'grok', caps: ['t2i'] },

  { id: 'gemini-3.1-flash-image-preview', label: 'nano-banana-2', hint: 'Nano Banana · text-to-image', provider: 'nanobanana', caps: ['t2i'] },
--- a/apps/daemon/src/media.ts
+++ b/apps/daemon/src/media.ts
@ -57,6 +57,7 @@ import {
  findProvider,
  modelsForSurface,
 } from './media-models.js';
+import { assertExternalAssetUrl } from './connectionTest.js';
 import { resolveModelAlias, resolveProviderConfig } from './media-config.js';
 import {
  ensureProject,
@ -559,6 +560,11 @@ export async function generateMedia(args: {
      bytes = result.bytes;
      providerNote = result.providerNote;
      suggestedExt = result.suggestedExt;
+    } else if (def.provider === 'senseaudio' && surface === 'image') {
+      const result = await renderSenseAudioImage(ctx, credentials);
+      bytes = result.bytes;
+      providerNote = result.providerNote;
+      suggestedExt = result.suggestedExt;
    } else if (def.provider === 'fishaudio' && surface === 'audio') {
      const result = await renderFishAudioTTS(ctx, credentials);
      bytes = result.bytes;
@@ -2243,6 +2249,131 @@ async function renderSenseAudioTTS(ctx: MediaContext, credentials: ProviderConfi
  };
 }

+// ---------------------------------------------------------------------------
+// Provider: SenseAudio image — POST /v1/image/sync (synchronous text-to-image).
+//
+// Docs: https://docs.senseaudio.cn/guides/image/overview
+//   * Models: senseaudio-image-2.0-260319 (multi-aspect), senseaudio-image-1.0-260319
+//     (standard), doubao-seedream-5-0-260128 (hi-res). The wire `model` field
+//     accepts the catalog id directly so no alias map is needed.
+//   * Body: { model, prompt (≤2000 chars), size (WxH, required when no
+//     reference), reference (URL or data URI, optional), seed (optional int) }.
+//   * Response: { url: string } pointing at the rendered PNG; we fetch it
+//     once to materialise bytes the dispatcher can write to disk.
+//   * Auth: Authorization: Bearer <API_KEY>; shares the senseaudio provider
+//     slot with the TTS path (OD_SENSEAUDIO_API_KEY / SENSEAUDIO_API_KEY).
+// We default to the /sync endpoint because the chat runtime already streams
+// progress and a single round-trip keeps the dispatcher contract identical
+// to OpenAI / Volcengine image. Switching to /v1/image/async + GET
+// /v1/image/pending is a future option if the upstream model latency
+// outgrows the daemon's request timeout.
+// ---------------------------------------------------------------------------
+
+const SENSEAUDIO_IMAGE_PROMPT_LIMIT = 2000;
+
+// SenseAudio's image gateway rejects non-standard pixel sizes with a 400
+// `参数错误：size`. Keep this table in sync with byok-tools.ts's
+// ASPECT_TO_SIZE — both paths hit the same /v1/image/sync endpoint.
+function senseAudioImageSize(aspect?: string): string {
+  if (aspect === '16:9') return '1280x720';
+  if (aspect === '9:16') return '720x1280';
+  if (aspect === '4:3') return '1024x768';
+  if (aspect === '3:4') return '768x1024';
+  return '1024x1024';
+}
+
+async function renderSenseAudioImage(ctx: MediaContext, credentials: ProviderConfig): Promise<RenderResult> {
+  if (!credentials.apiKey) {
+    throw new Error(
+      'no SenseAudio API key — configure it in Settings or set OD_SENSEAUDIO_API_KEY',
+    );
+  }
+  const baseUrl = (credentials.baseUrl || SENSEAUDIO_DEFAULT_BASE_URL).replace(
+    /\/$/,
+    '',
+  );
+  const promptRaw = (ctx.prompt && ctx.prompt.trim()) || 'A high-quality reference image.';
+  // SenseAudio rejects >2000-char prompts with a 4xx; trim defensively so a
+  // verbose agent plan doesn't dead-end the generation. Note: providerNote
+  // below does not record the truncation, so a clipped prompt is silent.
+  const prompt =
+    promptRaw.length > SENSEAUDIO_IMAGE_PROMPT_LIMIT
+      ? promptRaw.slice(0, SENSEAUDIO_IMAGE_PROMPT_LIMIT)
+      : promptRaw;
+  const size = senseAudioImageSize(ctx.aspect);
+  const reference = ctx.imageRef?.dataUrl;
+
+  const body: Record<string, unknown> = {
+    model: ctx.wireModel,
+    prompt,
+    size,
+  };
+  if (reference) {
+    // When a reference image is supplied the API documents `size` as
+    // optional; we still send it so the output dimensions stay
+    // deterministic across t2i / i2i runs of the same project.
+    body.reference = reference;
+  }
+
+  const resp = await fetch(`${baseUrl}/v1/image/sync`, {
+    method: 'POST',
+    headers: {
+      authorization: `Bearer ${credentials.apiKey}`,
+      'content-type': 'application/json',
+    },
+    body: JSON.stringify(body),
+  });
+  const respText = await resp.text();
+  if (!resp.ok) {
+    throw new Error(`senseaudio image ${resp.status}: ${truncate(respText, 240)}`);
+  }
+  let data: any;
+  try {
+    data = JSON.parse(respText);
+  } catch {
+    throw new Error(`senseaudio image non-JSON: ${truncate(respText, 200)}`);
+  }
+  // Mirror the TTS base_resp envelope check: HTTP 200 can still encode an
+  // upstream logical failure. The image API reports failures in the shape
+  // documented for /v1/image/pending (status=failed + error_message), so
+  // check both envelopes and surface whichever error is present verbatim.
+  if (data?.base_resp && data.base_resp.status_code !== 0) {
+    throw new Error(
+      `senseaudio image api error ${data.base_resp.status_code}: ${data.base_resp.status_msg || 'unknown'}`,
+    );
+  }
+  if (typeof data?.error_message === 'string' && data.error_message) {
+    throw new Error(`senseaudio image api error: ${data.error_message}`);
+  }
+  const url = typeof data?.url === 'string' ? data.url : '';
+  if (!url) {
+    throw new Error('senseaudio image response missing url');
+  }
+  // Mirror the chat-tool SSRF guard (byok-tools.ts): the gateway-returned
+  // `url` is attacker-controllable inside a successful response, so DNS-
+  // resolve it through validateBaseUrlResolved and refuse loopback /
+  // RFC1918 / metadata-service hosts. Pair with `redirect: 'error'` so a
+  // 3xx hop into private space is also blocked.
+  const urlCheck = await assertExternalAssetUrl(url);
+  if (!urlCheck.ok) {
+    throw new Error(`senseaudio image ${urlCheck.error}`);
+  }
+  const imgResp = await fetch(url, { redirect: 'error' });
+  if (!imgResp.ok) {
+    throw new Error(`senseaudio image fetch ${imgResp.status}`);
+  }
+  const bytes = Buffer.from(await imgResp.arrayBuffer());
+  if (bytes.length === 0) {
+    throw new Error('senseaudio image fetch returned zero bytes');
+  }
+
+  return {
+    bytes,
+    providerNote: `senseaudio/${ctx.wireModel} · ${size}${reference ? ' · i2i' : ''} · ${bytes.length} bytes`,
+    suggestedExt: '.png',
+  };
+}
+
 // ---------------------------------------------------------------------------
 // Provider: FishAudio — Speech-1.x family text-to-speech (synchronous).
 //
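Review note on the prompt clamp in `renderSenseAudioImage`: a minimal sketch, assuming the 2000-character limit from `SENSEAUDIO_IMAGE_PROMPT_LIMIT`, of how a truncation count could be carried into a providerNote-style string. `clampPrompt` and `describeRun` are hypothetical names that do not exist in this diff, and lengths are counted in UTF-16 code units, as `String.prototype.slice` does.

```typescript
// Hypothetical helpers, not part of this change.
const PROMPT_LIMIT = 2000; // assumed to mirror SENSEAUDIO_IMAGE_PROMPT_LIMIT

interface ClampedPrompt {
  prompt: string;
  droppedChars: number; // 0 when nothing was cut
}

// Trim surrounding whitespace, then cut the prompt at the limit and
// report how many UTF-16 code units were dropped.
function clampPrompt(raw: string, limit: number = PROMPT_LIMIT): ClampedPrompt {
  const trimmed = raw.trim();
  if (trimmed.length <= limit) {
    return { prompt: trimmed, droppedChars: 0 };
  }
  return {
    prompt: trimmed.slice(0, limit),
    droppedChars: trimmed.length - limit,
  };
}

// Build a note in the same "a · b · c" style as providerNote, appending
// the truncation count only when something was actually dropped.
function describeRun(model: string, size: string, clamped: ClampedPrompt): string {
  const note = `senseaudio/${model} · ${size}`;
  return clamped.droppedChars > 0
    ? `${note} · prompt truncated by ${clamped.droppedChars} chars`
    : note;
}
```

Returning the dropped count alongside the prompt keeps the clamp a pure function, so the caller decides whether to log it, put it in the note, or ignore it.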
--- a/apps/daemon/src/memory-llm.ts
+++ b/apps/daemon/src/memory-llm.ts
@@ -142,6 +142,15 @@ const PROVIDER_DEFAULTS = {
    model: 'gemma3:4b',
    baseUrl: 'https://ollama.com',
  },
+  // SenseAudio's chat API is OpenAI-compatible (POST /v1/chat/completions,
+  // Bearer auth), so the extractor falls through to callOpenAI with this
+  // base URL and the user's SenseAudio API key. The default model is the
+  // small/fast variant so auto-pick stays cheap; users can swap in
+  // senseaudio-s2 or any gateway model via the picker.
+  senseaudio: {
+    model: 'senseaudio-s2-flash',
+    baseUrl: 'https://api.senseaudio.cn',
+  },
 };

 // Map an explicit override provider to the env var the daemon should
@@ -169,6 +178,13 @@ function envKeyFor(provider) {
  if (provider === 'ollama') {
    return process.env.OLLAMA_API_KEY?.trim() || '';
  }
+  if (provider === 'senseaudio') {
+    return (
+      process.env.OD_SENSEAUDIO_API_KEY?.trim()
+      || process.env.SENSEAUDIO_API_KEY?.trim()
+      || ''
+    );
+  }
  return '';
 }

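Review note on the `envKeyFor` change: the precedence it encodes (`OD_SENSEAUDIO_API_KEY` first, then `SENSEAUDIO_API_KEY`, else the empty string) can be sketched as a standalone function. `senseAudioKeyFrom` and the `Env` type are hypothetical and take the environment as a parameter so the behaviour can be checked without touching `process.env`.

```typescript
// Hypothetical standalone version of the lookup, not part of this change.
type Env = Record<string, string | undefined>;

// Prefer the OD_-prefixed variable; a missing or whitespace-only value
// trims to '' (falsy) and falls through to the next candidate.
function senseAudioKeyFrom(env: Env): string {
  return (
    env.OD_SENSEAUDIO_API_KEY?.trim()
    || env.SENSEAUDIO_API_KEY?.trim()
    || ''
  );
}
```

Because the chain uses `||` rather than `??`, a variable that is set but blank is treated the same as one that is unset.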
--- a/apps/daemon/src/providerModels.ts
+++ b/apps/daemon/src/providerModels.ts
@@ -149,7 +149,9 @@ function extractGoogleModels(data: unknown): ProviderModelOption[] {
 }

 function providerModelsUrl(protocol: ConnectionTestProtocol, baseUrl: string, apiKey: string): string {
-  if (protocol === 'openai') return appendVersionedApiPath(baseUrl, '/models');
+  if (protocol === 'openai' || protocol === 'senseaudio') {
+    return appendVersionedApiPath(baseUrl, '/models');
+  }
  if (protocol === 'anthropic') {
    const url = new URL(appendVersionedApiPath(baseUrl, '/models'));
    url.searchParams.set('limit', '1000');
@@ -167,7 +169,9 @@ function providerModelsHeaders(
  protocol: ConnectionTestProtocol,
  apiKey: string,
 ): Record<string, string> {
-  if (protocol === 'openai') return { authorization: `Bearer ${apiKey}` };
+  if (protocol === 'openai' || protocol === 'senseaudio') {
+    return { authorization: `Bearer ${apiKey}` };
+  }
  if (protocol === 'anthropic') {
    return {
      'x-api-key': apiKey,
@@ -178,7 +182,9 @@ function providerModelsHeaders(
 }

 function extractModels(protocol: ConnectionTestProtocol, data: unknown): ProviderModelOption[] {
-  if (protocol === 'openai') return extractOpenAiModels(data);
+  // SenseAudio's /v1/models response follows the OpenAI envelope
+  // (`{ data: [{ id, ... }] }`), so the same extractor handles both.
+  if (protocol === 'openai' || protocol === 'senseaudio') return extractOpenAiModels(data);
  if (protocol === 'anthropic') return extractAnthropicModels(data);
  if (protocol === 'google') return extractGoogleModels(data);
  return [];
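Review note on reusing the OpenAI extractor for SenseAudio: the comment above describes the `/v1/models` envelope as `{ data: [{ id, ... }] }`. The body of `extractOpenAiModels` is not shown in this diff, so the following is only a sketch of defensive parsing for that shape; `extractModelIds` is a hypothetical name.

```typescript
// Hypothetical parser for an OpenAI-style model list, not the daemon's
// actual extractOpenAiModels implementation.
function extractModelIds(payload: unknown): string[] {
  if (typeof payload !== 'object' || payload === null) return [];
  const list = (payload as { data?: unknown }).data;
  if (!Array.isArray(list)) return [];
  return list
    // Pull `id` from each entry, tolerating non-object entries.
    .map((item: unknown) =>
      typeof item === 'object' && item !== null
        ? (item as { id?: unknown }).id
        : undefined,
    )
    // Keep only non-empty string ids.
    .filter((id): id is string => typeof id === 'string' && id.length > 0);
}
```

Treating the payload as `unknown` and narrowing step by step means a malformed gateway response yields an empty list rather than a thrown error.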
--- a/apps/daemon/src/server.ts
+++ b/apps/daemon/src/server.ts
@@ -10859,6 +10859,7 @@ export async function startServer({
    db,
    design,
    http: httpDeps,
+    paths: pathDeps,
    chat: { startChatRun, submitToolResultToRun },
    agents: agentDeps,
    critique: critiqueDeps,
--- a/apps/daemon/tests/byok-tools.test.ts
+++ b/apps/daemon/tests/byok-tools.test.ts
@@ -0,0 +1,686 @@
+import { mkdir, mkdtemp, readFile, rm } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import path from 'node:path';
+import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
+
+import {
+  BYOK_SENSEAUDIO_TOOLS,
+  executeGenerateImage,
+  executeGenerateVideo,
+} from '../src/byok-tools.js';
+
+describe('BYOK_SENSEAUDIO_TOOLS', () => {
+  it('exports an OpenAI-shaped generate_image tool definition', () => {
+    const tool = BYOK_SENSEAUDIO_TOOLS.find(
+      (t) => t.function.name === 'generate_image',
+    );
+    expect(tool).toBeDefined();
+    expect(tool!.type).toBe('function');
+    expect(tool!.function.parameters.required).toEqual(['prompt']);
+    expect(tool!.function.parameters.properties.aspect_ratio.enum).toEqual([
+      '1:1',
+      '16:9',
+      '9:16',
+      '4:3',
+      '3:4',
+    ]);
+  });
+
+  it('exposes both generate_image and generate_video tools', () => {
+    const names = BYOK_SENSEAUDIO_TOOLS.map((t) => t.function.name).sort();
+    expect(names).toEqual(['generate_image', 'generate_video']);
+  });
+});
+
+describe('executeGenerateImage', () => {
+  let root: string;
+  let projectsRoot: string;
+  const PROJECT_ID = 'test-project';
+  const realFetch = globalThis.fetch;
+
+  beforeEach(async () => {
+    root = await mkdtemp(path.join(tmpdir(), 'od-byok-tools-'));
+    projectsRoot = path.join(root, 'projects');
+  });
+
+  afterEach(async () => {
+    globalThis.fetch = realFetch;
+    vi.unstubAllGlobals();
+    await rm(root, { recursive: true, force: true });
+  });
+
+  const baseCtx = () => ({
+    projectRoot: root,
+    projectsRoot,
+    projectId: PROJECT_ID,
+    upstreamApiKey: 'sa-byok-key',
+    upstreamBaseUrl: 'https://api.senseaudio.cn',
+  });
+
+  it('calls /v1/image/sync, downloads the URL, persists bytes, and returns a daemon URL', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
+        expect(init?.method).toBe('POST');
+        expect(init?.headers).toMatchObject({
+          authorization: 'Bearer sa-byok-key',
+          'content-type': 'application/json',
+        });
+        expect(JSON.parse(String(init?.body))).toEqual({
+          model: 'senseaudio-image-2.0-260319',
+          prompt: 'a tabby cat playing with yarn',
+          size: '1024x1024',
+        });
+        return new Response(
+          JSON.stringify({
+            url: 'https://cdn.example.test/generated/cat.png',
+            base_resp: { status_code: 0, status_msg: 'success' },
+          }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url === 'https://cdn.example.test/generated/cat.png') {
+        return new Response(pngBytes, {
+          status: 200,
+          headers: { 'content-type': 'image/png' },
+        });
+      }
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'a tabby cat playing with yarn' },
+      baseCtx(),
+    );
+
+    expect(result.ok).toBe(true);
+    // Returns a relative URL through the project file route so the
+    // chat UI loads same-origin via Next.js's /api/:path* rewrite,
+    // satisfying the strict CSP `img-src 'self'`. Path component is
+    // url-encoded so unusual (but isSafeId-passing) project ids don't
+    // break the URL.
+    expect(result.url).toMatch(
+      new RegExp(`^/api/projects/${PROJECT_ID}/files/byok-[a-z0-9-]+\\.png$`),
+    );
+    expect(fetchMock).toHaveBeenCalledTimes(2);
+
+    // Persisted file lives inside the project folder where listFiles /
+    // readProjectFile / archive plumbing will all discover it.
+    const filename = result.url!.split('/').pop()!;
+    const onDisk = await readFile(path.join(projectsRoot, PROJECT_ID, filename));
+    expect(onDisk.equals(pngBytes)).toBe(true);
+  });
+
+  it('honours args.model when the LLM picks a SenseAudio image model', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50]);
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/image/sync')) {
+        expect(JSON.parse(String(init?.body)).model).toBe('doubao-seedream-5-0-260128');
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/hi.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(pngBytes, { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'wallpaper', model: 'doubao-seedream-5-0-260128' },
+      baseCtx(),
+    );
+    expect(result.ok).toBe(true);
+  });
+
+  it('falls back to ctx.defaultImageModel when args.model is missing', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50]);
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/image/sync')) {
+        expect(JSON.parse(String(init?.body)).model).toBe('senseaudio-image-1.0-260319');
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/std.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(pngBytes, { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'standard' },
+      { ...baseCtx(), defaultImageModel: 'senseaudio-image-1.0-260319' },
+    );
+    expect(result.ok).toBe(true);
+  });
+
+  it('ignores args.model when it is not in the SenseAudio allowlist', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50]);
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/image/sync')) {
+        // Falls through to ctx.defaultImageModel (registry-valid).
+        expect(JSON.parse(String(init?.body)).model).toBe('senseaudio-image-1.0-260319');
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/x.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(pngBytes, { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'spoofed', model: 'evil-model-id' },
+      { ...baseCtx(), defaultImageModel: 'senseaudio-image-1.0-260319' },
+    );
+    expect(result.ok).toBe(true);
+  });
+
+  it('falls back to registry default when both args.model and ctx.defaultImageModel are missing/invalid', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50]);
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/image/sync')) {
+        // Registry default is the first SenseAudio entry — 2.0 today.
+        expect(JSON.parse(String(init?.body)).model).toBe('senseaudio-image-2.0-260319');
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/d.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(pngBytes, { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'no model anywhere' },
+      { ...baseCtx(), defaultImageModel: 'also-bogus' },
+    );
+    expect(result.ok).toBe(true);
+  });
+
+  it('rejects unsafe projectId before any upstream call', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'x' },
+      { ...baseCtx(), projectId: '../escape' },
+    );
+
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/invalid projectId/);
+    // ensureProject runs up front so the unsafe id is caught BEFORE
+    // any senseaudio upstream call goes out — no token spent, no
+    // attempt to write outside the project tree.
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
+  it('maps aspect_ratio to the SenseAudio size string', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50]);
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/image/sync')) {
+        expect(JSON.parse(String(init?.body)).size).toBe('1280x720');
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/wide.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(pngBytes, { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'widescreen banner', aspect_ratio: '16:9' },
+      baseCtx(),
+    );
+
+    expect(result.ok).toBe(true);
+  });
+
+  it('falls back to 1:1 for unknown aspect_ratio values', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50]);
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/image/sync')) {
+        expect(JSON.parse(String(init?.body)).size).toBe('1024x1024');
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/square.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(pngBytes, { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage(
+      { prompt: 'square thing', aspect_ratio: 'something-else' },
+      baseCtx(),
+    );
+
+    expect(result.ok).toBe(true);
+  });
+
+  it('returns { ok: false } on missing prompt', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage({}, baseCtx());
+
+    expect(result).toEqual({ ok: false, error: 'prompt is required' });
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
+  it('returns { ok: false } when no API key is available', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const ctx = { ...baseCtx(), upstreamApiKey: '' };
+    const result = await executeGenerateImage({ prompt: 'whatever' }, ctx);
+
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/no SenseAudio API key/);
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
+  it('surfaces HTTP failures with status code and truncated body', async () => {
+    const fetchMock = vi.fn(async () =>
+      new Response('unauthorized', {
+        status: 401,
+        headers: { 'content-type': 'text/plain' },
+      }),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/senseaudio image 401/);
+  });
+
+  it('surfaces error_message envelope verbatim', async () => {
+    const fetchMock = vi.fn(async () =>
+      new Response(
+        JSON.stringify({ error_message: 'sensitive_content_blocked' }),
+        { status: 200, headers: { 'content-type': 'application/json' } },
+      ),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/sensitive_content_blocked/);
+  });
+
+  it('surfaces base_resp non-zero status_code', async () => {
+    const fetchMock = vi.fn(async () =>
+      new Response(
+        JSON.stringify({
+          base_resp: { status_code: 1004, status_msg: 'quota exhausted' },
+        }),
+        { status: 200, headers: { 'content-type': 'application/json' } },
+      ),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/api error 1004/);
+    expect(result.error).toMatch(/quota exhausted/);
+  });
+
+  it('returns { ok: false } when upstream returns no url', async () => {
+    const fetchMock = vi.fn(async () =>
+      new Response(
+        JSON.stringify({ base_resp: { status_code: 0, status_msg: 'ok' } }),
+        { status: 200, headers: { 'content-type': 'application/json' } },
+      ),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/missing url/);
+  });
+
+  it('returns { ok: false } when the image download fails', async () => {
+    const fetchMock = vi.fn(async (input: unknown) => {
+      const url = String(input);
+      if (url.endsWith('/v1/image/sync')) {
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/will-404.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response('not found', { status: 404 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateImage({ prompt: 'x' }, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/image download 404/);
+  });
+});
+
+describe('BYOK_SENSEAUDIO_TOOLS — video', () => {
+  it('exposes a generate_video tool definition with the documented param surface', () => {
+    const video = BYOK_SENSEAUDIO_TOOLS.find(
+      (t) => t.function.name === 'generate_video',
+    );
+    expect(video).toBeDefined();
+    const props = video!.function.parameters.properties as Record<string, any>;
+    expect(video!.function.parameters.required).toEqual(['prompt']);
+    expect(props.aspect_ratio.enum).toEqual(['16:9', '9:16', '4:3', '3:4', '1:1']);
+    expect(props.resolution.enum).toEqual(['480p', '720p', '1080p']);
+    expect(props.duration).toMatchObject({ type: 'integer', minimum: 4, maximum: 15 });
+    expect(props.generate_audio.type).toBe('boolean');
+  });
+});
+
+describe('executeGenerateVideo', () => {
+  let root: string;
+  let projectsRoot: string;
+  const PROJECT_ID = 'test-project';
+  const realFetch = globalThis.fetch;
+
+  beforeEach(async () => {
+    root = await mkdtemp(path.join(tmpdir(), 'od-byok-video-'));
+    projectsRoot = path.join(root, 'projects');
+  });
+
+  afterEach(async () => {
+    globalThis.fetch = realFetch;
+    vi.unstubAllGlobals();
+    await rm(root, { recursive: true, force: true });
+  });
+
+  const baseCtx = () => ({
+    projectRoot: root,
+    projectsRoot,
+    projectId: PROJECT_ID,
+    upstreamApiKey: 'sa-byok-key',
+    upstreamBaseUrl: 'https://api.senseaudio.cn',
+    // Keep tests fast — 1 ms between polls instead of the production 5 s.
+    videoPollIntervalMs: 1,
+  });
+
+  it('creates, polls until completed, downloads, and writes the mp4 into the project folder', async () => {
+    const mp4Bytes = Buffer.from([0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70]);
+    let pollCount = 0;
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+
+      if (url === 'https://api.senseaudio.cn/v1/video/create') {
+        expect(init?.method).toBe('POST');
+        expect(init?.headers).toMatchObject({
+          authorization: 'Bearer sa-byok-key',
+          'content-type': 'application/json',
+        });
+        const body = JSON.parse(String(init?.body));
+        expect(body).toEqual({
+          model: 'doubao-seedance-2-0-260128',
+          content: [{ type: 'text', text: 'a sunset over the ocean' }],
+          duration: 8,
+          resolution: '1080p',
+          ratio: '16:9',
+          provider_specific: { generate_audio: true },
+        });
+        return new Response(
+          JSON.stringify({ task_id: 'task-abc' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+
+      if (url.startsWith('https://api.senseaudio.cn/v1/video/status?id=task-abc')) {
+        pollCount++;
+        if (pollCount === 1) {
+          return new Response(
+            JSON.stringify({ status: 'pending', progress: 0 }),
+            { status: 200, headers: { 'content-type': 'application/json' } },
+          );
+        }
+        if (pollCount === 2) {
+          return new Response(
+            JSON.stringify({ status: 'processing', progress: 50 }),
+            { status: 200, headers: { 'content-type': 'application/json' } },
+          );
+        }
+        return new Response(
+          JSON.stringify({
+            status: 'completed',
+            progress: 100,
+            video_url: 'https://cdn.example.test/video/done.mp4',
+            duration: 8,
+          }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+
+      if (url === 'https://cdn.example.test/video/done.mp4') {
+        return new Response(mp4Bytes, {
+          status: 200,
+          headers: { 'content-type': 'video/mp4' },
+        });
+      }
+
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo(
+      {
+        prompt: 'a sunset over the ocean',
+        aspect_ratio: '16:9',
+        duration: 8,
+        resolution: '1080p',
+        generate_audio: true,
+      },
+      baseCtx(),
+    );
+
+    expect(result.ok).toBe(true);
+    expect(result.url).toMatch(
+      new RegExp(`^/api/projects/${PROJECT_ID}/files/byok-video-[a-z0-9-]+\\.mp4$`),
+    );
+
+    // 1× create + 3× poll + 1× download = 5 fetches total.
+    expect(fetchMock).toHaveBeenCalledTimes(5);
+    expect(pollCount).toBe(3);
+
+    const filename = result.url!.split('/').pop()!;
+    const onDisk = await readFile(path.join(projectsRoot, PROJECT_ID, filename));
+    expect(onDisk.equals(mp4Bytes)).toBe(true);
+  });
+
+  it('defaults duration / resolution / aspect when caller omits them', async () => {
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/video/create')) {
+        const body = JSON.parse(String(init?.body));
+        expect(body).toMatchObject({
+          duration: 5,
+          resolution: '720p',
+          ratio: '16:9',
+          provider_specific: { generate_audio: false },
+        });
+        return new Response(
+          JSON.stringify({ task_id: 'task-defaults' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
+        return new Response(
+          JSON.stringify({
+            status: 'completed',
+            video_url: 'https://cdn.example.test/video/d.mp4',
+          }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(Buffer.from([0x01]), { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo({ prompt: 'minimal' }, baseCtx());
+    expect(result.ok).toBe(true);
+  });
+
+  it('clamps duration outside the 4–15 range and falls back to defaults for non-enum aspect_ratio / resolution', async () => {
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const url = String(input);
+      if (url.endsWith('/v1/video/create')) {
+        const body = JSON.parse(String(init?.body));
+        // 99 → clamped to 15; 'octagonal' → falls back to '16:9';
+        // '8k' → falls back to '720p'.
+        expect(body).toMatchObject({
+          duration: 15,
+          resolution: '720p',
+          ratio: '16:9',
+        });
+        return new Response(
+          JSON.stringify({ task_id: 'task-clamp' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
+        return new Response(
+          JSON.stringify({
+            status: 'completed',
+            video_url: 'https://cdn.example.test/clamp.mp4',
+          }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      return new Response(Buffer.from([0x02]), { status: 200 });
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo(
+      {
+        prompt: 'overflow',
+        duration: 99,
+        aspect_ratio: 'octagonal',
+        resolution: '8k',
+      },
+      baseCtx(),
+    );
+    expect(result.ok).toBe(true);
+  });
+
+  it('surfaces a failed status as a tool error so the model can apologize', async () => {
+    const fetchMock = vi.fn(async (input: unknown) => {
+      const url = String(input);
+      if (url.endsWith('/v1/video/create')) {
+        return new Response(
+          JSON.stringify({ task_id: 'task-fail' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
+        return new Response(
+          JSON.stringify({
+            status: 'failed',
+            error_message: 'sensitive_content_blocked',
+          }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo(
+      { prompt: 'blocked content' },
+      baseCtx(),
+    );
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/senseaudio video failed/);
+    expect(result.error).toMatch(/sensitive_content_blocked/);
+  });
+
+  it('times out after SENSEAUDIO_VIDEO_MAX_POLLS polls when the job stays pending', async () => {
+    const fetchMock = vi.fn(async (input: unknown) => {
+      const url = String(input);
+      if (url.endsWith('/v1/video/create')) {
+        return new Response(
+          JSON.stringify({ task_id: 'task-stuck' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url.startsWith('https://api.senseaudio.cn/v1/video/status')) {
+        return new Response(
+          JSON.stringify({ status: 'pending', progress: 0 }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo(
+      { prompt: 'stuck job' },
+      baseCtx(),
+    );
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/timed out/);
+    // 1× create + 120× poll = 121 fetches (10-min ceiling at 5 s
+    // intervals — kept generous because doubao-seedance frequently
+    // spends 3–8 min on the gateway for 1080p+audio jobs).
+    expect(fetchMock).toHaveBeenCalledTimes(121);
+  }, 30_000);
+
+  it('returns a tool error when create response is missing task_id', async () => {
+    const fetchMock = vi.fn(async () =>
+      new Response('{"oops": true}', {
+        status: 200,
+        headers: { 'content-type': 'application/json' },
+      }),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo({ prompt: 'x' }, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/missing task_id/);
+  });
+
+  it('returns a tool error when create call returns non-2xx', async () => {
+    const fetchMock = vi.fn(async () =>
+      new Response('unauthorized', {
+        status: 401,
+        headers: { 'content-type': 'text/plain' },
+      }),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo({ prompt: 'x' }, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/senseaudio video create 401/);
+  });
+
+  it('rejects an unsafe projectId before any upstream call', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo(
+      { prompt: 'x' },
+      { ...baseCtx(), projectId: '../escape' },
+    );
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/invalid projectId/);
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
+  it('rejects empty prompt before any upstream call', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await executeGenerateVideo({}, baseCtx());
+    expect(result.ok).toBe(false);
+    expect(result.error).toMatch(/prompt is required/);
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+});
--- a/apps/daemon/tests/media-config.test.ts
+++ b/apps/daemon/tests/media-config.test.ts
@@ -8,6 +8,7 @@ import {
  readMaskedConfig,
  resolveModelAlias,
  resolveProviderConfig,
+  seedProviderIfMissing,
  writeConfig,
 } from '../src/media-config.js';

@@ -868,3 +869,159 @@ describe('media-config model alias resolution (issue #1277)', () => {
    ).toBe('doubao-seedream-5-0');
  });
 });
+
+describe('seedProviderIfMissing', () => {
+  let projectRoot: string;
+  const SENSEAUDIO_ENV_KEYS = ['OD_SENSEAUDIO_API_KEY', 'SENSEAUDIO_API_KEY'];
+  const originalEnv = Object.fromEntries(
+    SENSEAUDIO_ENV_KEYS.map((key) => [key, process.env[key]]),
+  );
+  const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
+  const originalDataDir = process.env.OD_DATA_DIR;
+
+  beforeEach(async () => {
+    projectRoot = await mkdtemp(path.join(tmpdir(), 'od-media-seed-'));
+    for (const key of SENSEAUDIO_ENV_KEYS) {
+      delete process.env[key];
+    }
+    delete process.env.OD_MEDIA_CONFIG_DIR;
+    delete process.env.OD_DATA_DIR;
+  });
+
+  afterEach(async () => {
+    for (const key of SENSEAUDIO_ENV_KEYS) {
+      if (originalEnv[key] == null) {
+        delete process.env[key];
+      } else {
+        process.env[key] = originalEnv[key];
+      }
+    }
+    if (originalMediaConfigDir == null) {
+      delete process.env.OD_MEDIA_CONFIG_DIR;
+    } else {
+      process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
+    }
+    if (originalDataDir == null) {
+      delete process.env.OD_DATA_DIR;
+    } else {
+      process.env.OD_DATA_DIR = originalDataDir;
+    }
+    await rm(projectRoot, { recursive: true, force: true });
+  });
+
+  async function writeStored(data: unknown) {
+    const file = path.join(projectRoot, '.od', 'media-config.json');
+    await mkdir(path.dirname(file), { recursive: true });
+    await writeFile(file, JSON.stringify(data), 'utf8');
+  }
+
+  async function readStoredJson(): Promise<unknown> {
+    const file = path.join(projectRoot, '.od', 'media-config.json');
+    const raw = await readFile(file, 'utf8');
+    return JSON.parse(raw);
+  }
+
+  it('writes a fresh entry when the slot is empty', async () => {
+    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
+      apiKey: 'sa-test-key',
+      baseUrl: 'https://api.senseaudio.cn',
+    });
+    expect(wrote).toBe(true);
+    const stored = await readStoredJson();
+    expect(stored).toEqual({
+      providers: {
+        senseaudio: {
+          apiKey: 'sa-test-key',
+          baseUrl: 'https://api.senseaudio.cn',
+        },
+      },
+    });
+  });
+
+  it('no-ops and preserves the stored key when one is already configured', async () => {
+    await writeStored({
+      providers: {
+        senseaudio: { apiKey: 'pre-existing-key', baseUrl: 'https://existing.example' },
+      },
+    });
+    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
+      apiKey: 'newer-byok-key',
+      baseUrl: 'https://api.senseaudio.cn',
+    });
+    expect(wrote).toBe(false);
+    const stored = (await readStoredJson()) as { providers: Record<string, unknown> };
+    expect(stored.providers.senseaudio).toEqual({
+      apiKey: 'pre-existing-key',
+      baseUrl: 'https://existing.example',
+    });
+  });
+
+  it('preserves every other provider and aliases when seeding', async () => {
+    await writeStored({
+      providers: {
+        openai: { apiKey: 'sk-openai', baseUrl: 'https://api.openai.com/v1' },
+        volcengine: { apiKey: 'ark-key', baseUrl: 'https://ark.cn-beijing.volces.com/api/v3' },
+      },
+      aliases: { 'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0' },
+    });
+    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
+      apiKey: 'sa-new',
+    });
+    expect(wrote).toBe(true);
+    const stored = (await readStoredJson()) as {
+      providers: Record<string, unknown>;
+      aliases: Record<string, string>;
+    };
+    expect(stored.providers.openai).toEqual({
+      apiKey: 'sk-openai',
+      baseUrl: 'https://api.openai.com/v1',
+    });
+    expect(stored.providers.volcengine).toEqual({
+      apiKey: 'ark-key',
+      baseUrl: 'https://ark.cn-beijing.volces.com/api/v3',
+    });
+    expect(stored.providers.senseaudio).toEqual({ apiKey: 'sa-new' });
+    expect(stored.aliases).toEqual({
+      'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0',
+    });
+  });
+
+  it('no-ops when an env var resolves a key for the provider', async () => {
+    process.env.OD_SENSEAUDIO_API_KEY = 'env-key';
+    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
+      apiKey: 'sa-byok-key',
+      baseUrl: 'https://api.senseaudio.cn',
+    });
+    expect(wrote).toBe(false);
+    await expect(readStoredJson()).rejects.toThrow();
+  });
+
+  it('no-ops on empty apiKey', async () => {
+    const wrote = await seedProviderIfMissing(projectRoot, 'senseaudio', {
+      apiKey: '',
+      baseUrl: 'https://api.senseaudio.cn',
+    });
+    expect(wrote).toBe(false);
+    await expect(readStoredJson()).rejects.toThrow();
+  });
+
+  it('no-ops for unknown provider ids', async () => {
+    const wrote = await seedProviderIfMissing(projectRoot, 'not-a-provider', {
+      apiKey: 'whatever',
+    });
+    expect(wrote).toBe(false);
+    await expect(readStoredJson()).rejects.toThrow();
+  });
+
+  it('resolves the seeded key through resolveProviderConfig', async () => {
+    await seedProviderIfMissing(projectRoot, 'senseaudio', {
+      apiKey: 'sa-final',
+      baseUrl: 'https://api.senseaudio.cn',
+    });
+    const resolved = await resolveProviderConfig(projectRoot, 'senseaudio');
+    expect(resolved).toEqual({
+      apiKey: 'sa-final',
+      baseUrl: 'https://api.senseaudio.cn',
+    });
+  });
+});
--- a/apps/daemon/tests/media-senseaudio-image.test.ts
+++ b/apps/daemon/tests/media-senseaudio-image.test.ts
@@ -0,0 +1,305 @@
+import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import path from 'node:path';
+import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
+
+import { generateMedia } from '../src/media.js';
+
+const TEST_SENSEAUDIO_BASE_URL = 'https://senseaudio-gateway.example.test';
+const TEST_IMAGE_URL = 'https://cdn.example.test/generated/abc.png';
+const TEST_IMAGE_BYTES = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x01]);
+
+function buildOkResponse(url = TEST_IMAGE_URL) {
+  return new Response(
+    JSON.stringify({ url, base_resp: { status_code: 0, status_msg: 'success' } }),
+    { status: 200, headers: { 'content-type': 'application/json' } },
+  );
+}
+
+function buildImageFetchResponse(bytes: Buffer) {
+  return new Response(bytes, {
+    status: 200,
+    headers: { 'content-type': 'image/png' },
+  });
+}
+
+describe('senseaudio image generation', () => {
+  let root: string;
+  let projectRoot: string;
+  let projectsRoot: string;
+  const realFetch = globalThis.fetch;
+  const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
+  const originalDataDir = process.env.OD_DATA_DIR;
+
+  beforeEach(async () => {
+    root = await mkdtemp(path.join(tmpdir(), 'od-senseaudio-image-'));
+    projectRoot = path.join(root, 'project-root');
+    projectsRoot = path.join(projectRoot, '.od', 'projects');
+    await mkdir(projectsRoot, { recursive: true });
+    delete process.env.OD_MEDIA_CONFIG_DIR;
+    delete process.env.OD_DATA_DIR;
+    delete process.env.OD_SENSEAUDIO_API_KEY;
+    delete process.env.SENSEAUDIO_API_KEY;
+  });
+
+  afterEach(async () => {
+    globalThis.fetch = realFetch;
+    if (originalMediaConfigDir == null) {
+      delete process.env.OD_MEDIA_CONFIG_DIR;
+    } else {
+      process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
+    }
+    if (originalDataDir == null) {
+      delete process.env.OD_DATA_DIR;
+    } else {
+      process.env.OD_DATA_DIR = originalDataDir;
+    }
+    delete process.env.OD_SENSEAUDIO_API_KEY;
+    delete process.env.SENSEAUDIO_API_KEY;
+    await rm(root, { recursive: true, force: true });
+  });
+
+  async function writeConfig(data: unknown) {
+    const file = path.join(projectRoot, '.od', 'media-config.json');
+    await mkdir(path.dirname(file), { recursive: true });
+    await writeFile(file, JSON.stringify(data), 'utf8');
+  }
+
+  it('renders a SenseAudio image with the documented sync defaults', async () => {
+    await writeConfig({
+      providers: {
+        senseaudio: {
+          apiKey: 'sense-test-key',
+          baseUrl: TEST_SENSEAUDIO_BASE_URL,
+        },
+      },
+    });
+
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const urlStr = String(input);
+      if (urlStr === `${TEST_SENSEAUDIO_BASE_URL}/v1/image/sync`) {
+        expect(init?.method).toBe('POST');
+        expect(init?.headers).toMatchObject({
+          authorization: 'Bearer sense-test-key',
+          'content-type': 'application/json',
+        });
+        expect(JSON.parse(String(init?.body))).toEqual({
+          model: 'senseaudio-image-2.0-260319',
+          prompt: 'A magazine-style hero poster.',
+          size: '1024x1024',
+        });
+        return buildOkResponse();
+      }
+      if (urlStr === TEST_IMAGE_URL) {
+        return buildImageFetchResponse(TEST_IMAGE_BYTES);
+      }
+      throw new Error(`unexpected fetch: ${urlStr}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const result = await generateMedia({
+      projectRoot,
+      projectsRoot,
+      projectId: 'project-1',
+      surface: 'image',
+      model: 'senseaudio-image-2.0-260319',
+      prompt: 'A magazine-style hero poster.',
+      output: 'sa-hero.png',
+    });
+
+    expect(fetchMock).toHaveBeenCalledTimes(2);
+    expect(result.providerId).toBe('senseaudio');
+    expect(result.providerNote).toContain('senseaudio/senseaudio-image-2.0-260319');
+    expect(result.providerNote).toContain('1024x1024');
+
+    const bytes = await readFile(path.join(projectsRoot, 'project-1', 'sa-hero.png'));
+    expect(bytes.equals(TEST_IMAGE_BYTES)).toBe(true);
+  });
+
+  it('maps a 16:9 aspect ratio to the SenseAudio 1280x720 size string', async () => {
+    await writeConfig({
+      providers: {
+        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
+      },
+    });
+
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      const urlStr = String(input);
+      if (urlStr === `${TEST_SENSEAUDIO_BASE_URL}/v1/image/sync`) {
+        expect(JSON.parse(String(init?.body)).size).toBe('1280x720');
+        return buildOkResponse();
+      }
+      return buildImageFetchResponse(TEST_IMAGE_BYTES);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    await generateMedia({
+      projectRoot,
+      projectsRoot,
+      projectId: 'project-1',
+      surface: 'image',
+      model: 'senseaudio-image-1.0-260319',
+      aspect: '16:9',
+      prompt: 'Widescreen banner.',
+      output: 'sa-banner.png',
+    });
+
+    expect(fetchMock).toHaveBeenCalledTimes(2);
+  });
+
+  it('falls back to the canonical base URL when none is configured', async () => {
+    await writeConfig({
+      providers: {
+        senseaudio: { apiKey: 'sense-test-key' },
+      },
+    });
+
+    const fetchMock = vi.fn(async (input: unknown) => {
+      const urlStr = String(input);
+      if (urlStr === 'https://api.senseaudio.cn/v1/image/sync') {
+        return buildOkResponse();
+      }
+      return buildImageFetchResponse(TEST_IMAGE_BYTES);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    await generateMedia({
+      projectRoot,
+      projectsRoot,
+      projectId: 'project-1',
+      surface: 'image',
+      model: 'doubao-seedream-5-0-260128',
+      prompt: 'Default base url.',
+      output: 'sa-default-base.png',
+    });
+
+    expect(fetchMock).toHaveBeenCalledTimes(2);
+  });
+
+  it('reads the API key from OD_SENSEAUDIO_API_KEY when storage is empty', async () => {
+    process.env.OD_SENSEAUDIO_API_KEY = 'env-sense-key';
+    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
+      if (String(input).endsWith('/v1/image/sync')) {
+        expect(init?.headers).toMatchObject({ authorization: 'Bearer env-sense-key' });
+        return buildOkResponse();
+      }
+      return buildImageFetchResponse(TEST_IMAGE_BYTES);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    await generateMedia({
+      projectRoot,
+      projectsRoot,
+      projectId: 'project-1',
+      surface: 'image',
+      model: 'senseaudio-image-2.0-260319',
+      prompt: 'Env-only key.',
+      output: 'sa-env.png',
+    });
+
+    expect(fetchMock).toHaveBeenCalledTimes(2);
+  });
+
+  it('errors when no API key is configured', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    await expect(
+      generateMedia({
+        projectRoot,
+        projectsRoot,
+        projectId: 'project-1',
+        surface: 'image',
+        model: 'senseaudio-image-2.0-260319',
+        prompt: 'Should fail.',
+        output: 'sa-no-key.png',
+      }),
+    ).rejects.toThrow(/no SenseAudio API key/);
+
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
+  it('surfaces HTTP-level failures with the status code and truncated body', async () => {
+    await writeConfig({
+      providers: {
+        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
+      },
+    });
+
+    const fetchMock = vi.fn(async () =>
+      new Response('unauthorized', {
+        status: 401,
+        headers: { 'content-type': 'text/plain' },
+      }),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    await expect(
+      generateMedia({
+        projectRoot,
+        projectsRoot,
+        projectId: 'project-1',
+        surface: 'image',
+        model: 'senseaudio-image-2.0-260319',
+        prompt: 'Bad auth.',
+        output: 'sa-401.png',
+      }),
+    ).rejects.toThrow('senseaudio image 401: unauthorized');
+  });
+
+  it('surfaces upstream error_message verbatim when the body reports failure', async () => {
+    await writeConfig({
+      providers: {
+        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
+      },
+    });
+
+    const fetchMock = vi.fn(async () =>
+      new Response(
+        JSON.stringify({ error_message: 'sensitive_content_blocked' }),
+        { status: 200, headers: { 'content-type': 'application/json' } },
+      ),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    await expect(
+      generateMedia({
+        projectRoot,
+        projectsRoot,
+        projectId: 'project-1',
+        surface: 'image',
+        model: 'senseaudio-image-2.0-260319',
+        prompt: 'Blocked.',
+        output: 'sa-blocked.png',
+      }),
+    ).rejects.toThrow('senseaudio image api error: sensitive_content_blocked');
+  });
+
+  it('errors when the response body is missing the image url', async () => {
+    await writeConfig({
+      providers: {
+        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
+      },
+    });
+
+    const fetchMock = vi.fn(async () =>
+      new Response(
+        JSON.stringify({ base_resp: { status_code: 0, status_msg: 'success' } }),
+        { status: 200, headers: { 'content-type': 'application/json' } },
+      ),
+    );
+    vi.stubGlobal('fetch', fetchMock);
+
+    await expect(
+      generateMedia({
+        projectRoot,
+        projectsRoot,
+        projectId: 'project-1',
+        surface: 'image',
+        model: 'senseaudio-image-2.0-260319',
+        prompt: 'Missing url.',
+        output: 'sa-missing-url.png',
+      }),
+    ).rejects.toThrow('senseaudio image response missing url');
+  });
+});
--- a/apps/daemon/tests/proxy-routes.test.ts
+++ b/apps/daemon/tests/proxy-routes.test.ts
@@ -523,6 +523,497 @@ describe('API proxy routes', () => {
    expect(upstreamInit?.redirect).toBe('error');
  });

+  it('streams delta + end for SenseAudio chat completions', async () => {
+    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+      return Promise.resolve(sseResponse([
+        'data: {"choices":[{"delta":{"content":"sense"}}]}',
+        '',
+        'data: [DONE]',
+        '',
+      ].join('\n')));
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        baseUrl: 'https://api.senseaudio.cn',
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'hello' }],
+      }),
+    });
+
+    await expect(res.text()).resolves.toContain('event: delta\ndata: {"delta":"sense"}');
+    expect(fetchMock).toHaveBeenCalledWith(
+      'https://api.senseaudio.cn/v1/chat/completions',
+      expect.objectContaining({
+        headers: expect.objectContaining({ Authorization: 'Bearer sa-test' }),
+        redirect: 'error',
+      }),
+    );
+  });
+
+  it('defaults SenseAudio base URL to api.senseaudio.cn when caller omits it', async () => {
+    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+      return Promise.resolve(sseResponse('data: [DONE]\n\n'));
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'hi' }],
+      }),
+    });
+
+    expect(String(fetchMock.mock.calls[0]![0])).toBe(
+      'https://api.senseaudio.cn/v1/chat/completions',
+    );
+  });
+
+  it('rejects SenseAudio requests that omit apiKey or model', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const missingKey = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'hi' }],
+      }),
+    });
+    expect(missingKey.status).toBe(400);
+
+    const missingModel = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        apiKey: 'sa-test',
+        messages: [{ role: 'user', content: 'hi' }],
+      }),
+    });
+    expect(missingModel.status).toBe(400);
+
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
+  it('disables upstream redirects for senseaudio proxy requests', async () => {
+    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+      return Promise.resolve(sseResponse('data: [DONE]\n\n'));
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        baseUrl: 'https://api.senseaudio.cn',
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'model-one',
+        messages: [{ role: 'user', content: 'hi' }],
+      }),
+    });
+
+    const upstreamCall = fetchMock.mock.calls.find(([input]) =>
+      !String(input).startsWith(baseUrl),
+    );
+    expect(upstreamCall).toBeDefined();
+    const upstreamInit = upstreamCall![1] as FetchInit;
+    expect(upstreamInit?.redirect).toBe('error');
+  });
+
+  it('injects generate_image tool definition on every SenseAudio request', async () => {
+    const fetchMock = vi.fn((input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+      return Promise.resolve(sseResponse([
+        'data: {"choices":[{"delta":{"content":"ok"}}]}',
+        '',
+        'data: [DONE]',
+        '',
+      ].join('\n')));
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        baseUrl: 'https://api.senseaudio.cn',
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'hi' }],
+      }),
+    });
+
+    const upstreamCall = fetchMock.mock.calls.find(([input]) =>
+      !String(input).startsWith(baseUrl),
+    );
+    expect(upstreamCall).toBeDefined();
+    const body = JSON.parse(String((upstreamCall![1] as FetchInit)?.body));
+    expect(body.tool_choice).toBe('auto');
+    expect(Array.isArray(body.tools)).toBe(true);
+    expect(body.tools[0]).toMatchObject({
+      type: 'function',
+      function: { name: 'generate_image' },
+    });
+  });
+
+  it('runs the BYOK image tool loop end-to-end', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x01]);
+    const upstreamChatBodies: any[] = [];
+    let chatCallIndex = 0;
+
+    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+
+      // SenseAudio image generation
+      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
+        return new Response(
+          JSON.stringify({
+            url: 'https://cdn.example.test/cat.png',
+            base_resp: { status_code: 0, status_msg: 'success' },
+          }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+
+      // Image bytes download (initiated by the tool, not via the proxy)
+      if (url === 'https://cdn.example.test/cat.png') {
+        return new Response(pngBytes, {
+          status: 200,
+          headers: { 'content-type': 'image/png' },
+        });
+      }
+
+      // Upstream chat completions — capture bodies, return different SSE per call
+      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
+        upstreamChatBodies.push(JSON.parse(String(init?.body || '{}')));
+        chatCallIndex++;
+        if (chatCallIndex === 1) {
+          // First turn: model decides to call generate_image
+          return sseResponse([
+            'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":null,"tool_calls":[{"index":0,"id":"call_abc","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"a cat\\"}"}}]},"finish_reason":null}]}',
+            '',
+            'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
+            '',
+            'data: [DONE]',
+            '',
+          ].join('\n'));
+        }
+        // Second turn: model summarises with image embedded in markdown
+        return sseResponse([
+          'data: {"choices":[{"index":0,"delta":{"content":"Here is your cat: "}}]}',
+          '',
+          'data: {"choices":[{"index":0,"delta":{"content":"![cat](generated)"}}]}',
+          '',
+          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
+          '',
+          'data: [DONE]',
+          '',
+        ].join('\n'));
+      }
+
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        baseUrl: 'https://api.senseaudio.cn',
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'draw a cat' }],
+      }),
+    });
+
+    expect(res.status).toBe(200);
+    const body = await res.text();
+
+    // Final assistant text streams through to the client
+    expect(body).toContain('event: delta');
+    expect(body).toContain('Here is your cat');
+    expect(body).toContain('![cat](generated)');
+    expect(body).toContain('event: end');
+
+    // Two upstream chat completion calls happened (one tool round trip).
+    expect(upstreamChatBodies).toHaveLength(2);
+
+    // Second upstream call includes assistant{tool_calls} + tool{result}
+    const secondMessages = upstreamChatBodies[1].messages;
+    expect(secondMessages).toHaveLength(3);
+    expect(secondMessages[0]).toEqual({ role: 'user', content: 'draw a cat' });
+    expect(secondMessages[1]).toMatchObject({
+      role: 'assistant',
+      content: null,
+      tool_calls: [
+        {
+          id: 'call_abc',
+          type: 'function',
+          function: {
+            name: 'generate_image',
+            arguments: '{"prompt":"a cat"}',
+          },
+        },
+      ],
+    });
+    expect(secondMessages[2]).toMatchObject({
+      role: 'tool',
+      tool_call_id: 'call_abc',
+      content: expect.stringMatching(
+        /Image generated successfully\. URL: \/api\/projects\/test-project\/files\/byok-[a-z0-9-]+\.png/,
+      ),
+    });
+  });
+
+  it('feeds a tool error message back to the model when generate_image fails', async () => {
+    const upstreamChatBodies: any[] = [];
+    let chatCallIndex = 0;
+    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
+        return new Response(
+          JSON.stringify({ error_message: 'sensitive_content_blocked' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
+        upstreamChatBodies.push(JSON.parse(String(init?.body || '{}')));
+        chatCallIndex++;
+        if (chatCallIndex === 1) {
+          return sseResponse([
+            'data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_err","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"...\\"}"}}]},"finish_reason":null}]}',
+            '',
+            'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
+            '',
+            'data: [DONE]',
+            '',
+          ].join('\n'));
+        }
+        return sseResponse([
+          'data: {"choices":[{"index":0,"delta":{"content":"Sorry, that one was blocked."}}]}',
+          '',
+          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
+          '',
+          'data: [DONE]',
+          '',
+        ].join('\n'));
+      }
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        baseUrl: 'https://api.senseaudio.cn',
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'draw something blocked' }],
+      }),
+    });
+
+    expect(res.status).toBe(200);
+    const body = await res.text();
+    expect(body).toContain('Sorry, that one was blocked');
+
+    expect(upstreamChatBodies).toHaveLength(2);
+    const toolMsg = upstreamChatBodies[1].messages[2];
+    expect(toolMsg.role).toBe('tool');
+    expect(toolMsg.tool_call_id).toBe('call_err');
+    expect(toolMsg.content).toMatch(/Image generation failed/);
+    expect(toolMsg.content).toMatch(/sensitive_content_blocked/);
+  });
+
+  it('bounds the BYOK tool loop at MAX_BYOK_TOOL_LOOPS=3', async () => {
+    let chatCallIndex = 0;
+    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/x.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url === 'https://cdn.example.test/x.png') {
+        return new Response(Buffer.from([0x89, 0x50]), { status: 200 });
+      }
+      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
+        chatCallIndex++;
+        // Always return tool_calls — the model never returns text
+        return sseResponse([
+          `data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_${chatCallIndex}","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"x\\"}"}}]},"finish_reason":null}]}`,
+          '',
+          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
+          '',
+          'data: [DONE]',
+          '',
+        ].join('\n'));
+      }
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        baseUrl: 'https://api.senseaudio.cn',
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'infinite' }],
+      }),
+    });
+
+    expect(res.status).toBe(200);
+    const body = await res.text();
+    expect(body).toContain('event: end');
+    // Loop ran exactly MAX_BYOK_TOOL_LOOPS times before bailing.
+    expect(chatCallIndex).toBe(3);
+  });
+
+  it('writes the generated image into the project folder and serves it via /api/projects/:id/files/*', async () => {
+    const pngBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x42, 0x59]);
+    let capturedUrl: string | undefined;
+
+    const fetchMock = vi.fn(async (input: FetchInput, init?: FetchInit) => {
+      const url = String(input);
+      if (url.startsWith(baseUrl)) return realFetch(input, init);
+      if (url === 'https://api.senseaudio.cn/v1/image/sync') {
+        return new Response(
+          JSON.stringify({ url: 'https://cdn.example.test/served.png' }),
+          { status: 200, headers: { 'content-type': 'application/json' } },
+        );
+      }
+      if (url === 'https://cdn.example.test/served.png') {
+        return new Response(pngBytes, { status: 200 });
+      }
+      if (url === 'https://api.senseaudio.cn/v1/chat/completions') {
+        const body = JSON.parse(String(init?.body || '{}'));
+        // Capture the URL the tool produced, read from the second turn's tool message.
+        const toolMsg = body.messages?.find((m: any) => m.role === 'tool');
+        if (toolMsg) {
+          const match = /URL: (\/api\/projects\/[A-Za-z0-9._-]+\/files\/byok-[a-z0-9-]+\.png)/.exec(toolMsg.content);
+          if (match) capturedUrl = match[1];
+        }
+        const isToolTurn = !toolMsg;
+        if (isToolTurn) {
+          return sseResponse([
+            'data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_serve","type":"function","function":{"name":"generate_image","arguments":"{\\"prompt\\":\\"s\\"}"}}]},"finish_reason":null}]}',
+            '',
+            'data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}',
+            '',
+            'data: [DONE]',
+            '',
+          ].join('\n'));
+        }
+        return sseResponse([
+          'data: {"choices":[{"index":0,"delta":{"content":"done"}}]}',
+          '',
+          'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
+          '',
+          'data: [DONE]',
+          '',
+        ].join('\n'));
+      }
+      throw new Error(`unexpected fetch: ${url}`);
+    });
+    vi.stubGlobal('fetch', fetchMock);
+
+    const proxyRes = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        baseUrl: 'https://api.senseaudio.cn',
+        apiKey: 'sa-test',
+        projectId: 'test-project',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'gen' }],
+      }),
+    });
+    // Drain the SSE body so the tool loop fully completes before we assert.
+    await proxyRes.text();
+
+    expect(capturedUrl).toBeDefined();
+    // The URL the tool emits is relative: same-origin via the Next.js
+    // rewrite in production, served directly by this test server here.
+    // We GET the captured URL through the standard project file route
+    // and assert the bytes come back. This proves both halves:
+    // (1) the image landed in <projectsRoot>/<projectId>/ as expected
+    //     (so listFiles / FileViewer / archive will find it), and
+    // (2) /api/projects/:id/files/* serves it without needing any
+    //     byok-specific route.
+    const imgRes = await realFetch(`${baseUrl}${capturedUrl!}`);
+    expect(imgRes.status).toBe(200);
+    expect(imgRes.headers.get('content-type')).toMatch(/^image\/png/);
+    const served = Buffer.from(await imgRes.arrayBuffer());
+    expect(served.equals(pngBytes)).toBe(true);
+  });
+
+  it('rejects senseaudio chat requests without a projectId', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        apiKey: 'sa-test',
+        model: 'senseaudio-s2',
+        messages: [{ role: 'user', content: 'hi' }],
+        // no projectId — should 400
+      }),
+    });
+
+    expect(res.status).toBe(400);
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
+  it('rejects senseaudio chat requests with an unsafe projectId', async () => {
+    const fetchMock = vi.fn();
+    vi.stubGlobal('fetch', fetchMock);
+
+    const res = await realFetch(`${baseUrl}/api/proxy/senseaudio/stream`, {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({
+        apiKey: 'sa-test',
+        model: 'senseaudio-s2',
+        projectId: '../etc/passwd',
+        messages: [{ role: 'user', content: 'hi' }],
+      }),
+    });
+
+    expect(res.status).toBe(400);
+    expect(fetchMock).not.toHaveBeenCalled();
+  });
+
  // Plan §3.A4 / spec §11.8 (e2e-7): the API-fallback proxy paths must
  // never carry plugin context. The web sidecar's fallback mode bypasses
  // the daemon snapshot bus, so any pluginId / appliedPluginSnapshotId in
@ -534,6 +1025,7 @@ describe('API proxy routes', () => {
      '/api/proxy/openai/stream',
      '/api/proxy/azure/stream',
      '/api/proxy/google/stream',
+      '/api/proxy/senseaudio/stream',
    ];

    for (const path of proxies) {
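
Review note on the 'bounds the BYOK tool loop at MAX_BYOK_TOOL_LOOPS=3' test above: the behaviour it pins down can be pictured as a counted loop around the upstream chat call. The following is only a minimal sketch under stated assumptions, not the daemon's implementation. `runToolLoop`, `Turn`, `ToolCall`, and `Message` are hypothetical names, and the model call is synchronous here purely to keep the example self-contained (the real proxy streams SSE asynchronously).

```typescript
// Hypothetical shapes for illustration; the daemon's real types are not shown in this diff.
interface ToolCall { id: string; name: string; args: string }
interface Turn { text: string; toolCalls: ToolCall[] }
interface Message {
  role: 'user' | 'assistant' | 'tool';
  content: string;
  tool_call_id?: string;
}

// Value taken from the test name in the diff above.
const MAX_BYOK_TOOL_LOOPS = 3;

function runToolLoop(
  messages: Message[],
  callModel: (history: Message[]) => Turn,
  runTool: (call: ToolCall) => string,
): { chatCalls: number; finalText: string } {
  const history = [...messages];
  let chatCalls = 0;
  let finalText = '';
  // Stop after MAX_BYOK_TOOL_LOOPS upstream calls even if the model
  // keeps asking for tools and never produces a text answer.
  while (chatCalls < MAX_BYOK_TOOL_LOOPS) {
    const turn = callModel(history);
    chatCalls++;
    finalText = turn.text;
    if (turn.toolCalls.length === 0) break; // plain text answer, done
    history.push({ role: 'assistant', content: turn.text });
    for (const call of turn.toolCalls) {
      // Feed each tool result back as a `tool` message keyed by call id.
      history.push({ role: 'tool', tool_call_id: call.id, content: runTool(call) });
    }
  }
  return { chatCalls, finalText };
}
```

With a stub model that always returns a tool call, `chatCalls` ends at 3, which is the same count the test asserts through `chatCallIndex`.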
--- a/apps/web/src/components/ChatComposer.tsx
+++ b/apps/web/src/components/ChatComposer.tsx
@ -14,6 +14,7 @@ import {
  trackStudioClickChatComposer,
  trackStudioViewChatPanel,
 } from '../analytics/events';
+import { IMAGE_MODELS } from "../media/models";
 import { projectRawUrl, uploadProjectFiles, openFolderDialog, fetchConnectors } from "../providers/registry";
 import { patchProject } from "../state/projects";
 import { fetchMcpServers } from "../state/mcp";
@ -126,6 +127,14 @@ interface Props {
  researchAvailable?: boolean;
  projectMetadata?: ProjectMetadata;
  onProjectMetadataChange?: (metadata: ProjectMetadata) => void;
+  // SenseAudio BYOK image-model picker shown above the textarea. Hidden
+  // when the active chat protocol is anything other than 'senseaudio',
+  // so the composer stays clean for every other BYOK tab. The state
+  // owner is ProjectView (per-session, reset on refresh); ChatComposer
+  // is a fully controlled select.
+  byokApiProtocol?: AppConfig['apiProtocol'];
+  byokImageModel?: string;
+  onChangeByokImageModel?: (model: string) => void;
  currentSkillId?: string | null;
  onProjectSkillChange?: (skillId: string | null) => void;
  // Set when the project was created with a plugin already pinned
@ -188,6 +197,9 @@ export const ChatComposer = forwardRef<ChatComposerHandle, Props>(
      researchAvailable = false,
      projectMetadata,
      onProjectMetadataChange,
+      byokApiProtocol,
+      byokImageModel,
+      onChangeByokImageModel,
      currentSkillId = null,
      onProjectSkillChange,
      pinnedPluginId = null,
@ -1186,6 +1198,53 @@ export const ChatComposer = forwardRef<ChatComposerHandle, Props>(
              t={t}
            />
          ) : null}
+          {byokApiProtocol === 'senseaudio' && onChangeByokImageModel ? (
+            <div
+              className="composer-byok-image-model"
+              data-testid="composer-byok-image-model"
+              style={{
+                display: 'flex',
+                alignItems: 'center',
+                gap: 8,
+                padding: '4px 8px',
+                fontSize: 12,
+                color: 'var(--text-muted, #888)',
+              }}
+            >
+              <Icon name="image" size={13} />
+              <label
+                htmlFor="composer-byok-image-model-select"
+                style={{ flexShrink: 0 }}
+              >
+                {t('settings.byokImageModel')}
+              </label>
+              <select
+                id="composer-byok-image-model-select"
+                value={byokImageModel ?? ''}
+                onChange={(e) => onChangeByokImageModel(e.target.value)}
+                style={{
+                  background: 'transparent',
+                  border: '1px solid var(--border, #444)',
+                  borderRadius: 4,
+                  padding: '2px 6px',
+                  color: 'inherit',
+                  fontSize: 12,
+                }}
+              >
+                <option value="">
+                  {(IMAGE_MODELS.find((m) => m.provider === 'senseaudio')?.label
+                    ?? 'senseaudio-image-2.0') + ' (default)'}
+                </option>
+                {IMAGE_MODELS.filter((m) => m.provider === 'senseaudio').map(
+                  (m) => (
+                    <option key={m.id} value={m.id}>
+                      {m.label}
+                    </option>
+                  ),
+                )}
+              </select>
+            </div>
+          ) : null}
          {/*
            Spec §8.4 — context bar above the composer input. The
            section now behaves as a pure context bar: it renders the
--- a/apps/web/src/components/ChatPane.tsx
+++ b/apps/web/src/components/ChatPane.tsx
@ -279,6 +279,12 @@ interface Props {
  // message" without forcing a separate side widget.
  activePluginSnapshot?: AppliedPluginSnapshot | null;
  onCollapse?: () => void;
+  // SenseAudio BYOK only — wired straight through to ChatComposer for the
+  // in-composer image-model picker. Active protocol is read so the picker
+  // hides when the user is on any other BYOK tab (azure / openai / …).
+  byokApiProtocol?: AppConfig['apiProtocol'];
+  byokImageModel?: string;
+  onChangeByokImageModel?: (model: string) => void;
 }

 type Tab = 'chat' | 'comments';
@ -327,6 +333,9 @@ export function ChatPane({
  activePluginSnapshot,
  skills = [],
  onCollapse,
+  byokApiProtocol,
+  byokImageModel,
+  onChangeByokImageModel,
 }: Props) {
  const t = useT();
  const logRef = useRef<HTMLDivElement | null>(null);
@ -872,6 +881,9 @@ export function ChatPane({
            researchAvailable={researchAvailable}
            projectMetadata={projectMetadata}
            onProjectMetadataChange={onProjectMetadataChange}
+            byokApiProtocol={byokApiProtocol}
+            byokImageModel={byokImageModel}
+            onChangeByokImageModel={onChangeByokImageModel}
            currentSkillId={currentSkillId}
            onProjectSkillChange={onProjectSkillChange}
            pinnedPluginId={activePluginSnapshot?.pluginId ?? null}
--- a/apps/web/src/components/DesignFilesPanel.tsx
+++ b/apps/web/src/components/DesignFilesPanel.tsx
@ -1192,7 +1192,14 @@ export function DesignFilesPanel({
        </div>
      </div>
      {preview && previewFile ? (
+        // Key on the file name so React unmounts the previous DfPreview
+        // (and its iframe / image element) when the user clicks a
+        // different file. Without this, React diffing reuses the same
+        // iframe DOM node and the browser keeps showing the first
+        // file's contents — only the `src` prop changes but the iframe
+        // never actually navigates.
        <DfPreview
+          key={previewFile.name}
          projectId={projectId}
          file={previewFile}
          onOpen={() => onOpenFile(previewFile.name)}
--- a/apps/web/src/components/ProjectView.tsx
+++ b/apps/web/src/components/ProjectView.tsx
@ -486,6 +486,15 @@ export function ProjectView({
  const [liveArtifacts, setLiveArtifacts] = useState<LiveArtifactSummary[]>([]);
  const [liveArtifactEvents, setLiveArtifactEvents] = useState<LiveArtifactEventItem[]>([]);
  const [workspaceFocused, setWorkspaceFocused] = useState(false);
+  // Per-session override for the BYOK SenseAudio chat's generate_image
+  // tool. Seeded once from Settings (config.byokImageModel) so the
+  // composer dropdown opens on the user's chosen default; subsequent
+  // selections live only in this component's state — page refresh /
+  // project switch resets to the Settings default. Persistent defaults
+  // live in Settings → BYOK → SenseAudio → Image generation model.
+  const [byokImageModelOverride, setByokImageModelOverride] = useState<string>(
+    config.byokImageModel ?? '',
+  );
  // `closed` → no surface; `review` → read-only saved-state panel with a
  // preview + reopen-to-edit action (#1822); `edit` → the textarea editor.
  const [instructionsMode, setInstructionsMode] = useState<'closed' | 'review' | 'edit'>('closed');
@ -2202,6 +2211,13 @@ export function ProjectView({
            });
          },
          onError: handlers.onError,
+        }, {
+          projectId: project.id,
+          // SenseAudio BYOK chat reads this to pre-fill the tool param's
+          // default model. Prefer the live composer override; fall back
+          // to the Settings default when the composer dropdown is on
+          // "use default". Other protocols ignore unknown body fields.
+          byokImageModel: byokImageModelOverride || config.byokImageModel,
        });
      }
    },
@ -3375,6 +3391,9 @@ export function ProjectView({
              onTogglePet={onTogglePet}
              onOpenPetSettings={onOpenPetSettings}
              researchAvailable={config.mode === 'daemon'}
+              byokApiProtocol={config.apiProtocol}
+              byokImageModel={byokImageModelOverride}
+              onChangeByokImageModel={setByokImageModelOverride}
              projectMetadata={project.metadata}
              onProjectMetadataChange={(metadata) => {
                onProjectChange({ ...project, metadata });
--- a/apps/web/src/components/SettingsDialog.tsx
+++ b/apps/web/src/components/SettingsDialog.tsx
@ -68,7 +68,7 @@ import type {
 import { testAgent, testApiProvider } from '../providers/connection-test';
 import { fetchProviderModels } from '../providers/provider-models';
 import { fetchConnectors, fetchDesignTemplates } from '../providers/registry';
-import { MEDIA_PROVIDERS } from '../media/models';
+import { IMAGE_MODELS, MEDIA_PROVIDERS } from '../media/models';
 import { XaiOAuthControl } from './XaiOAuthControl';
 import type { MediaProvider } from '../media/models';
 import { Toast } from './Toast';
@ -444,6 +444,7 @@ function currentApiProtocolConfig(config: AppConfig): ApiProtocolConfig {
    model: config.model,
    apiVersion: config.apiVersion ?? '',
    apiProviderBaseUrl: config.apiProviderBaseUrl ?? null,
+    byokImageModel: config.byokImageModel ?? '',
  };
 }

@ -460,6 +461,11 @@ function applyApiProtocolConfig(
    model: apiConfig.model,
    apiProviderBaseUrl: apiConfig.apiProviderBaseUrl ?? null,
    apiVersion: protocol === 'azure' ? (apiConfig.apiVersion ?? '') : '',
+    // byokImageModel is SenseAudio-only — flipping to another BYOK tab
+    // shouldn't carry a SenseAudio image-model choice into, say, the
+    // OpenAI form. Mirrors the apiVersion guarding above.
+    byokImageModel:
+      protocol === 'senseaudio' ? (apiConfig.byokImageModel ?? '') : '',
  };
 }

@ -2683,6 +2689,34 @@ export function SettingsDialog({
                  />
                </label>
              ) : null}
+              {apiProtocol === 'senseaudio' ? (
+                <label className="field">
+                  <span className="field-label">{t('settings.byokImageModel')}</span>
+                  <select
+                    value={cfg.byokImageModel ?? ''}
+                    onChange={(e) =>
+                      updateApiConfig({ byokImageModel: e.target.value })
+                    }
+                  >
+                    {/* Default-empty option resolves to the registry default
+                        on the daemon side (senseaudio-image-2.0-260319 today).
+                        Listing it explicitly lets the picker show what the
+                        unconfigured state actually means. */}
+                    <option value="">
+                      {IMAGE_MODELS.find((m) => m.provider === 'senseaudio')?.label
+                        ?? 'senseaudio-image-2.0'}
+                      {' (default)'}
+                    </option>
+                    {IMAGE_MODELS.filter((m) => m.provider === 'senseaudio').map(
+                      (m) => (
+                        <option key={m.id} value={m.id}>
+                          {m.label}
+                        </option>
+                      ),
+                    )}
+                  </select>
+                </label>
+              ) : null}
              <p className="hint">{t('settings.apiHint')}</p>
            </section>
          )}
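
Review note: the composer picker (ChatComposer.tsx) and the Settings picker above build the same option list, an empty-value "(default)" entry followed by every SenseAudio image model. A shared helper could keep the two in step. The sketch below is only an illustration under assumptions: `byokImageModelOptions`, `MediaModelLike`, and `SelectOption` are hypothetical names that do not exist in this PR, and the fallback label is copied from the JSX.

```typescript
// Hypothetical minimal shape; the real MediaModel type has more fields.
interface MediaModelLike { id: string; label: string; provider: string }
interface SelectOption { value: string; label: string }

// Builds the <option> data for the BYOK image-model picker:
// first an empty-value entry labelled after the provider's first
// registered model, then one entry per registered model.
function byokImageModelOptions(
  models: MediaModelLike[],
  provider: string = 'senseaudio',
  fallbackLabel: string = 'senseaudio-image-2.0',
): SelectOption[] {
  const own = models.filter((m) => m.provider === provider);
  const first = own.length > 0 ? own[0] : undefined;
  const defaultLabel = (first ? first.label : fallbackLabel) + ' (default)';
  return [
    { value: '', label: defaultLabel },
    ...own.map((m) => ({ value: m.id, label: m.label })),
  ];
}
```

Both components could then map the returned array to `<option>` elements, so a change to the default label would only need to be made once.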
--- a/apps/web/src/i18n/locales/ar.ts
+++ b/apps/web/src/i18n/locales/ar.ts
@ -202,6 +202,7 @@ export const ar: Dict = {
  'settings.azureDeploymentModelHint':
    'في Azure OpenAI، يُستخدم هذا الحقل كاسم النشر في /openai/deployments/<model>. أدخل اسم النشر الذي أنشأته في Azure.',
  'settings.apiVersion': 'إصدار API',
+  'settings.byokImageModel': 'نموذج إنشاء الصور',
  'settings.maxTokens': 'أقصى عدد من الرموز (اختياري)',
  'settings.maxTokensHint':
    'الحد الأقصى لطول الاستجابة. لكل نموذج قيمة افتراضية؛ اتركها فارغة لاستخدامها، أو أدخل رقماً للتجاوز.',
--- a/apps/web/src/i18n/locales/de.ts
+++ b/apps/web/src/i18n/locales/de.ts
@ -202,6 +202,7 @@ export const de: Dict = {
  'settings.azureDeploymentModelHint':
    'Fuer Azure OpenAI wird dieses Feld als Deployment-Name in /openai/deployments/<model> verwendet. Geben Sie den in Azure angelegten Deployment-Namen ein.',
  'settings.apiVersion': 'API-Version',
+  'settings.byokImageModel': 'Bilderzeugungsmodell',
  'settings.maxTokens': 'Max. Tokens (optional)',
  'settings.maxTokensHint':
    'Obergrenze für die Antwortlänge. Jedes Modell hat einen abgestimmten Standardwert (im Platzhalter sichtbar); leer lassen, um ihn zu verwenden, oder eine Zahl eingeben, um ihn zu überschreiben.',
--- a/apps/web/src/i18n/locales/en.ts
+++ b/apps/web/src/i18n/locales/en.ts
@ -227,6 +227,7 @@ export const en: Dict = {
  'settings.azureModelFetchHint':
    'For Azure OpenAI, enter the deployment name you created in Azure. Automatic deployment discovery is not available from this BYOK endpoint.',
  'settings.apiVersion': 'API version',
+  'settings.byokImageModel': 'Image generation model',
  'settings.maxTokens': 'Max tokens (optional)',
  'settings.maxTokensHint':
    'Cap on the response length. Each model has a tuned default (shown as a placeholder); leave blank to use it, or enter a number to override.',
--- a/apps/web/src/i18n/locales/es-ES.ts
+++ b/apps/web/src/i18n/locales/es-ES.ts
@ -202,6 +202,7 @@ export const esES: Dict = {
  'settings.azureDeploymentModelHint':
    'Para Azure OpenAI, este campo se usa como nombre del despliegue en /openai/deployments/<model>. Introduce el nombre del despliegue que creaste en Azure.',
  'settings.apiVersion': 'Versión de API',
+  'settings.byokImageModel': 'Modelo de generación de imágenes',
  'settings.maxTokens': 'Tokens máx. (opcional)',
  'settings.maxTokensHint':
    'Tope para la longitud de la respuesta. Cada modelo tiene un valor por defecto ajustado (visible en el placeholder); déjalo vacío para usarlo o introduce un número para anularlo.',
--- a/apps/web/src/i18n/locales/fa.ts
+++ b/apps/web/src/i18n/locales/fa.ts
@ -202,6 +202,7 @@ export const fa: Dict = {
  'settings.azureDeploymentModelHint':
    'در Azure OpenAI، این فیلد به عنوان نام استقرار در /openai/deployments/<model> استفاده می‌شود. نام استقراری را که در Azure ساخته‌اید وارد کنید.',
  'settings.apiVersion': 'نسخه API',
+  'settings.byokImageModel': 'مدل تولید تصویر',
  'settings.maxTokens': 'حداکثر توکن (اختیاری)',
  'settings.maxTokensHint':
    'سقف طول پاسخ. هر مدل مقدار پیش‌فرض تنظیم‌شدهٔ خود را دارد (در placeholder نمایش داده می‌شود)؛ برای استفاده از آن خالی بگذارید، یا برای جایگزینی، عددی وارد کنید.',
--- a/apps/web/src/i18n/locales/fr.ts
+++ b/apps/web/src/i18n/locales/fr.ts
@ -202,6 +202,7 @@ export const fr: Dict = {
  'settings.azureDeploymentModelHint':
    'Pour Azure OpenAI, ce champ est utilisé comme nom du déploiement dans /openai/deployments/<model>. Saisissez le nom du déploiement créé dans Azure.',
  'settings.apiVersion': 'Version API',
+  'settings.byokImageModel': "Modèle de génération d'images",
  'settings.maxTokens': 'Tokens max (optionnel)',
  'settings.maxTokensHint':
    'Limite de la longueur de réponse. Chaque modèle a une valeur par défaut (affichée à titre indicatif) ; laissez vide pour l\'utiliser, ou entrez un nombre pour la remplacer.',
--- a/apps/web/src/i18n/locales/hu.ts
+++ b/apps/web/src/i18n/locales/hu.ts
@ -202,6 +202,7 @@ export const hu: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI esetén ez a mező a /openai/deployments/<model> deployment neveként szerepel. Add meg az Azure-ban létrehozott deployment nevét.',
  'settings.apiVersion': 'API-verzió',
+  'settings.byokImageModel': 'Képgenerálási modell',
  'settings.maxTokens': 'Max tokenek (opcionális)',
  'settings.maxTokensHint':
    'A válasz hosszának felső határa. Minden modellnek van hangolt alapértelmezése (placeholderként látható); hagyd üresen az alkalmazásához, vagy adj meg számot a felülíráshoz.',
--- a/apps/web/src/i18n/locales/id.ts
+++ b/apps/web/src/i18n/locales/id.ts
@ -202,6 +202,7 @@ export const id: Dict = {
  'settings.azureDeploymentModelHint':
    'Untuk Azure OpenAI, field ini digunakan sebagai nama deployment di /openai/deployments/<model>. Masukkan nama deployment yang kamu buat di Azure.',
  'settings.apiVersion': 'Versi API',
+  'settings.byokImageModel': 'Model pembuatan gambar',
  'settings.maxTokens': 'Token maks (opsional)',
  'settings.maxTokensHint':
    'Batas panjang respons. Setiap model punya default sendiri; kosongkan untuk memakainya, atau isi angka untuk menimpa.',
--- a/apps/web/src/i18n/locales/it.ts
+++ b/apps/web/src/i18n/locales/it.ts
@ -199,6 +199,7 @@ export const it: Dict = {
  'settings.azureDeploymentModelHint':
    'Per Azure OpenAI, questo campo viene utilizzato come nome del deployment in /openai/deployments/<model>. Inserisci il nome del deployment creato in Azure.',
  'settings.apiVersion': 'Versione API',
+  'settings.byokImageModel': 'Modello di generazione immagini',
  'settings.maxTokens': 'Token massimi (opzionale)',
  'settings.maxTokensHint':
    'Limite della lunghezza della risposta. Ogni modello ha un valore predefinito (mostrato nel placeholder); lascia vuoto per usarlo, o inserisci un numero per sostituirlo.',
--- a/apps/web/src/i18n/locales/ja.ts
+++ b/apps/web/src/i18n/locales/ja.ts
@ -202,6 +202,7 @@ export const ja: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI では、このフィールドが /openai/deployments/<model> のデプロイ名として使われます。Azure で作成したデプロイ名を入力してください。',
  'settings.apiVersion': 'API バージョン',
+  'settings.byokImageModel': '画像生成モデル',
  'settings.maxTokens': '最大トークン（任意）',
  'settings.maxTokensHint':
    '応答長の上限。各モデルにチューニング済みのデフォルト値があります（プレースホルダーに表示）。空のままにすればそれを使用し、数値を入力すれば上書きされます。',
--- a/apps/web/src/i18n/locales/ko.ts
+++ b/apps/web/src/i18n/locales/ko.ts
@ -205,6 +205,7 @@ export const ko: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI에서는 이 필드가 /openai/deployments/<model>의 배포 이름으로 사용됩니다. Azure에서 만든 배포 이름을 입력하세요.',
  'settings.apiVersion': 'API 버전',
+  'settings.byokImageModel': '이미지 생성 모델',
  'settings.apiHint': '요청은 로컬 daemon 프록시를 통해 설정한 Base URL로 전송됩니다. 키는 이 브라우저에만 저장되며 제공자 요청과 함께 전송됩니다.',
  'settings.skipForNow': '지금은 건너뛰기',
  'settings.getStarted': '시작하기',
--- a/apps/web/src/i18n/locales/pl.ts
+++ b/apps/web/src/i18n/locales/pl.ts
@ -202,6 +202,7 @@ export const pl: Dict = {
  'settings.azureDeploymentModelHint':
      'Dla Azure OpenAI to pole jest używane jako nazwa wdrożenia w /openai/deployments/<model>. Wpisz nazwę wdrożenia utworzonego w Azure.',
  'settings.apiVersion': 'Wersja API',
+  'settings.byokImageModel': 'Model generowania obrazów',
  'settings.maxTokens': 'Maks. liczba tokenów (opcjonalnie)',
  'settings.maxTokensHint':
      'Limit długości odpowiedzi. Każdy model ma dostrojony domyślny limit (widoczny jako placeholder); pozostaw puste, aby go użyć, lub wpisz liczbę.',
--- a/apps/web/src/i18n/locales/pt-BR.ts
+++ b/apps/web/src/i18n/locales/pt-BR.ts
@ -202,6 +202,7 @@ export const ptBR: Dict = {
  'settings.azureDeploymentModelHint':
    'No Azure OpenAI, este campo e usado como nome do deployment em /openai/deployments/<model>. Informe o nome do deployment criado no Azure.',
  'settings.apiVersion': 'Versão da API',
+  'settings.byokImageModel': 'Modelo de geração de imagens',
  'settings.maxTokens': 'Tokens máx. (opcional)',
  'settings.maxTokensHint':
    'Limite para o comprimento da resposta. Cada modelo tem um valor padrão ajustado (visível no placeholder); deixe em branco para usá-lo ou insira um número para substituí-lo.',
--- a/apps/web/src/i18n/locales/ru.ts
+++ b/apps/web/src/i18n/locales/ru.ts
@ -202,6 +202,7 @@ export const ru: Dict = {
  'settings.azureDeploymentModelHint':
    'Для Azure OpenAI это поле используется как имя развертывания в /openai/deployments/<model>. Укажите имя развертывания, созданного в Azure.',
  'settings.apiVersion': 'Версия API',
+  'settings.byokImageModel': 'Модель генерации изображений',
  'settings.maxTokens': 'Макс. токенов (опционально)',
  'settings.maxTokensHint':
    'Ограничение длины ответа. У каждой модели свой настроенный дефолт (виден в плейсхолдере); оставьте поле пустым, чтобы использовать его, или введите число, чтобы переопределить.',
--- a/apps/web/src/i18n/locales/th.ts
+++ b/apps/web/src/i18n/locales/th.ts
@ -198,6 +198,7 @@ export const th: Dict = {
  'settings.azureDeploymentModel': 'ชื่อ Deployment',
  'settings.azureDeploymentModelHint': 'สำหรับ Azure OpenAI ฟิลด์นี้ใช้เป็นชื่อ Deployment ใน /openai/deployments/<model> ป้อนชื่อ Deployment ที่คุณสร้างใน Azure',
  'settings.apiVersion': 'เวอร์ชัน API',
+  'settings.byokImageModel': 'โมเดลสร้างภาพ',
  'settings.maxTokens': 'Max tokens (เลือกได้)',
  'settings.maxTokensHint': 'ขีดจำกัดความยาวในการตอบกลับ',
  'settings.apiHint': 'คำสั่งจะถูกส่งผ่าน local daemon proxy ไปยัง base URL ที่คุณตั้งไว้ API Key จะถูกเก็บในเบราว์เซอร์นี้เท่านั้น',
--- a/apps/web/src/i18n/locales/tr.ts
+++ b/apps/web/src/i18n/locales/tr.ts
@ -202,6 +202,7 @@ export const tr: Dict = {
  'settings.azureDeploymentModelHint':
    'Azure OpenAI icin bu alan /openai/deployments/<model> icindeki dagitim adi olarak kullanilir. Azureda olusturdugunuz dagitim adini girin.',
  'settings.apiVersion': 'API sürümü',
+  'settings.byokImageModel': 'Görüntü oluşturma modeli',
  'settings.maxTokens': 'Maks. token (isteğe bağlı)',
  'settings.maxTokensHint':
    'Yanıt uzunluğu sınırı. Her modelin ayarlanmış bir varsayılanı vardır (yer tutucuda görünür); kullanmak için boş bırakın, üzerine yazmak için bir sayı girin.',
--- a/apps/web/src/i18n/locales/uk.ts
+++ b/apps/web/src/i18n/locales/uk.ts
@ -203,6 +203,7 @@ export const uk: Dict = {
  'settings.azureDeploymentModelHint':
    'Для Azure OpenAI це поле використовується як назва розгортання в /openai/deployments/<model>. Введіть назву розгортання, створену в Azure.',
  'settings.apiVersion': 'Версія API',
+  'settings.byokImageModel': 'Модель генерації зображень',
  'settings.maxTokens': 'Макс. токенів (необов\'язково)',
  'settings.maxTokensHint':
    'Обмеження на довжину відповіді. Кожна модель має налаштовану за замовчуванням (показано в заповнювачі); залиште поле порожнім, щоб використовувати її, або введіть число, щоб переопрацювати.',
--- a/apps/web/src/i18n/locales/zh-CN.ts
+++ b/apps/web/src/i18n/locales/zh-CN.ts
@ -227,6 +227,7 @@ export const zhCN: Dict = {
  'settings.azureModelFetchHint':
    '对于 Azure OpenAI，请填写你在 Azure 中创建的部署名称。当前 BYOK 端点无法自动发现 deployment。',
  'settings.apiVersion': 'API 版本',
+  'settings.byokImageModel': '图片生成模型',
  'settings.maxTokens': '最大 tokens（可选）',
  'settings.maxTokensHint':
    '响应长度上限。每个 model 有调优过的默认值（在 placeholder 里显示），留空即使用，输入数字则覆盖。',
--- a/apps/web/src/i18n/locales/zh-TW.ts
+++ b/apps/web/src/i18n/locales/zh-TW.ts
@ -201,6 +201,7 @@ export const zhTW: Dict = {
  'settings.azureDeploymentModelHint':
    '對於 Azure OpenAI，此欄位會作為 /openai/deployments/<model> 中的部署名稱使用。請填入你在 Azure 中建立的部署名稱。',
  'settings.apiVersion': 'API 版本',
+  'settings.byokImageModel': '圖片生成模型',
  'settings.maxTokens': '最大 tokens（可選）',
  'settings.maxTokensHint':
    '回應長度上限。每個 model 有調過的預設值（在 placeholder 顯示），留空即使用，輸入數字則覆蓋。',
--- a/apps/web/src/i18n/types.ts
+++ b/apps/web/src/i18n/types.ts
@ -252,6 +252,7 @@ export interface Dict {
  'settings.azureDeploymentModelHint': string;
  'settings.azureModelFetchHint': string;
  'settings.apiVersion': string;
+  'settings.byokImageModel': string;
  'settings.apiHint': string;
  'settings.skipForNow': string;
  'settings.getStarted': string;
--- a/apps/web/src/media/models.ts
+++ b/apps/web/src/media/models.ts
@ -234,7 +234,7 @@ export const MEDIA_PROVIDERS: MediaProvider[] = [
  {
    id: 'senseaudio',
    label: 'SenseAudio',
-    hint: 'TTS · 70+ system voices · clone',
+    hint: '',
    integrated: true,
    defaultBaseUrl: 'https://api.senseaudio.cn',
    docsUrl: 'https://docs.senseaudio.cn',
@ -344,6 +344,29 @@ export const IMAGE_MODELS: MediaModel[] = [
    caps: ['i2i'],
  },

+  // SenseAudio — synchronous /v1/image/sync, Bearer auth, reference URL or data URI.
+  {
+    id: 'senseaudio-image-2.0-260319',
+    label: 'senseaudio-image-2.0',
+    hint: 'SenseAudio · multi-aspect, latest',
+    provider: 'senseaudio',
+    caps: ['t2i', 'i2i'],
+  },
+  {
+    id: 'senseaudio-image-1.0-260319',
+    label: 'senseaudio-image-1.0',
+    hint: 'SenseAudio · standard',
+    provider: 'senseaudio',
+    caps: ['t2i', 'i2i'],
+  },
+  {
+    id: 'doubao-seedream-5-0-260128',
+    label: 'seedream-5.0',
+    hint: 'SenseAudio · ByteDance Seedream 5.0 hi-res',
+    provider: 'senseaudio',
+    caps: ['t2i', 'i2i'],
+  },
+
  // xAI Grok Imagine — text-to-image (1k/2k, 11+ aspect ratios).
  {
    id: 'grok-imagine-image',
--- a/apps/web/src/providers/anthropic.ts
+++ b/apps/web/src/providers/anthropic.ts
@ -11,10 +11,12 @@ import Anthropic from '@anthropic-ai/sdk';
 import { effectiveMaxTokens } from '../state/maxTokens';
 import type { AppConfig, ChatMessage } from '../types';
 import { streamMessageAnthropicProxy } from './anthropic-compatible';
+import type { ProxyContext } from './api-proxy';
 import { streamMessageAzure } from './azure-compatible';
 import { streamMessageGoogle } from './google-compatible';
 import { streamMessageOllama } from './ollama-compatible';
 import { isOpenAICompatible, streamMessageOpenAI } from './openai-compatible';
+import { streamMessageSenseAudio } from './senseaudio-compatible';

 // Re-export for convenience
 export { isOpenAICompatible } from './openai-compatible';
@ -39,6 +41,12 @@ export async function streamMessage(
  history: ChatMessage[],
  signal: AbortSignal,
  handlers: StreamHandlers,
+  // Only the senseaudio branch reads `context` today: `projectId` lets
+  // the daemon-side `generate_image` tool write into the active
+  // project's folder, and `byokImageModel` sets its fallback model.
+  // Other protocols ignore it; the uniform signature lets the single
+  // call site in ProjectView pass the same shape for every protocol.
+  context?: ProxyContext,
 ): Promise<void> {
  // Prefer the explicit Settings protocol; keep the legacy heuristic as a
  // fallback for configs saved before apiProtocol existed.
@ -51,6 +59,9 @@ export async function streamMessage(
  if (cfg.apiProtocol === 'google') {
    return streamMessageGoogle(cfg, system, history, signal, handlers);
  }
+  if (cfg.apiProtocol === 'senseaudio') {
+    return streamMessageSenseAudio(cfg, system, history, signal, handlers, context);
+  }
  if (cfg.apiProtocol === 'openai' || (!cfg.apiProtocol && isOpenAICompatible(cfg.model, cfg.baseUrl))) {
    return streamMessageOpenAI(cfg, system, history, signal, handlers);
  }
--- a/apps/web/src/providers/api-proxy.ts
+++ b/apps/web/src/providers/api-proxy.ts
@@ -3,6 +3,22 @@ import type { AppConfig, ChatMessage } from '../types';
 import type { StreamHandlers } from './anthropic';
 import { parseSseFrame } from './sse';

+/**
+ * Optional per-request context that some protocols thread into the
+ * proxy body. Today only the senseaudio proxy reads these fields:
+ *  - `projectId` lets the `generate_image` tool write into the active
+ *    project's folder instead of a daemon-global cache.
+ *  - `byokImageModel` is the user's BYOK Settings default for the
+ *    image tool. The LLM can still override per-call via the tool's
+ *    `model` arg; this is just the fallback when it omits one.
+ * Other protocols ignore unknown body fields, so callers are free to
+ * pass this for every protocol.
+ */
+export interface ProxyContext {
+  projectId?: string;
+  byokImageModel?: string;
+}
+
 export async function streamProxyEndpoint(
  endpoint: string,
  cfg: AppConfig,
@@ -10,6 +26,7 @@ export async function streamProxyEndpoint(
  history: ChatMessage[],
  signal: AbortSignal,
  handlers: StreamHandlers,
+  context?: ProxyContext,
 ): Promise<void> {
  if (!cfg.apiKey) {
    handlers.onError(new Error('Missing API key — open Settings and paste one in.'));
@@ -30,6 +47,10 @@ export async function streamProxyEndpoint(
        messages: history.map((m) => ({ role: m.role, content: m.content })),
        maxTokens: effectiveMaxTokens(cfg),
        apiVersion: cfg.apiVersion,
+        ...(context?.projectId ? { projectId: context.projectId } : {}),
+        ...(context?.byokImageModel
+          ? { byokImageModel: context.byokImageModel }
+          : {}),
      }),
      signal,
    });
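
The conditional spreads in the hunk above keep optional context fields out of the request body when they are unset. A minimal TypeScript sketch of that pattern follows; `buildProxyBody` is a hypothetical helper introduced only for illustration and is not part of this change.

```typescript
// Hypothetical standalone sketch of the conditional-spread pattern.
// `ProxyContext` mirrors the interface added in api-proxy.ts;
// `buildProxyBody` is an illustrative name, not a function in the repo.
interface ProxyContext {
  projectId?: string;
  byokImageModel?: string;
}

function buildProxyBody(
  base: Record<string, unknown>,
  context?: ProxyContext,
): Record<string, unknown> {
  return {
    ...base,
    // Spreading `{}` adds nothing, so a missing or empty value
    // leaves the key out of the serialized JSON altogether.
    ...(context?.projectId ? { projectId: context.projectId } : {}),
    ...(context?.byokImageModel
      ? { byokImageModel: context.byokImageModel }
      : {}),
  };
}
```

Because empty strings are falsy, a blank `byokImageModel` is omitted entirely rather than sent as `""`, so the daemon only sees the key when the user has actually picked a default.
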
--- a/apps/web/src/providers/senseaudio-compatible.ts
+++ b/apps/web/src/providers/senseaudio-compatible.ts
@@ -0,0 +1,33 @@
+/**
+ * SenseAudio chat completions provider. Wire-compatible with OpenAI
+ * (POST /v1/chat/completions, Bearer auth, SSE delta frames + [DONE]),
+ * so the only thing that differs from streamMessageOpenAI is the
+ * daemon proxy endpoint — keeping a dedicated client makes the picker
+ * tab → daemon log line → upstream call chain readable end-to-end and
+ * leaves room for SenseAudio-specific divergence in the future.
+ *
+ * Routes through the daemon proxy to avoid browser CORS issues.
+ * BYOK — the key stays on the user's machine.
+ */
+import type { AppConfig, ChatMessage } from '../types';
+import type { StreamHandlers } from './anthropic';
+import { streamProxyEndpoint, type ProxyContext } from './api-proxy';
+
+export async function streamMessageSenseAudio(
+  cfg: AppConfig,
+  system: string,
+  history: ChatMessage[],
+  signal: AbortSignal,
+  handlers: StreamHandlers,
+  context?: ProxyContext,
+): Promise<void> {
+  return streamProxyEndpoint(
+    '/api/proxy/senseaudio/stream',
+    cfg,
+    system,
+    history,
+    signal,
+    handlers,
+    context,
+  );
+}
--- a/apps/web/src/runtime/markdown.tsx
+++ b/apps/web/src/runtime/markdown.tsx
@@ -262,6 +262,24 @@ function renderBlock(block: Block, key: number): ReactNode {
  return null;
 }

+// Allowed schemes / forms for image `src` attributes. The BYOK chat
+// tool loop emits root-relative URLs like `/api/byok-image/<id>.png`,
+// which the web's Next.js rewrites proxy to the daemon; that's the
+// common case. data:image/ + blob: cover inline / generated images.
+// http(s):// is allowed so a model can reference public images.
+// Anything else (javascript:, file:, vbscript:, `//host`, …) is
+// rejected so a hallucinated or adversarial URL cannot exfiltrate or execute.
+function isSafeMarkdownImageSrc(src: string): boolean {
+  if (!src) return false;
+  if (src.startsWith('/') && !src.startsWith('//')) return true;
+  return (
+    src.startsWith('http://')
+    || src.startsWith('https://')
+    || src.startsWith('data:image/')
+    || src.startsWith('blob:')
+  );
+}
+
 // Inline pass: tokenize into runs of `code`, **bold**, *italic*, links,
 // and plain text. We walk the string with a regex that matches whichever
 // delimiter shows up next; everything between delimiters becomes a text
@@ -270,14 +288,19 @@ function renderInline(text: string): ReactNode {
  const out: ReactNode[] = [];
  // Order matters:
  //  1. inline code first so its contents are not re-tokenized as bold/italic.
-  //  2. explicit `[text](url)` markdown links before bare URL autolink so the
+  //  2. image syntax `![alt](url)` BEFORE the link branch. Both share
+  //     `[…](…)` and the image is only distinguished by the leading `!`;
+  //     letting the link branch win would render `[alt](url)` as a text
+  //     link with `!` stranded as a sibling text node and the user would
+  //     see the link copy but never the image.
+  //  3. explicit `[text](url)` markdown links before bare URL autolink so the
  //     autolink does not greedily swallow the closing paren.
-  //  3. bare http(s) URL autolink BEFORE italic markers — chat output often
+  //  4. bare http(s) URL autolink BEFORE italic markers — chat output often
  //     contains OAuth-style links with `_type=` / `_id=` query params, and
  //     leaving italic to win turns the URL into an italic-fragmented mess.
-  //  4. bold (**a** / __a__) before italic (*a* / _a_).
+  //  5. bold (**a** / __a__) before italic (*a* / _a_).
  const re =
-    /(`[^`]+`)|\[([^\]]+)\]\(([^)\s]+)\)|(https?:\/\/[^\s)<>]+)|(\*\*[^*]+\*\*)|(__[^_]+__)|(\*[^*\n]+\*)|(_[^_\n]+_)/g;
+    /(`[^`]+`)|!\[([^\]]*)\]\(([^)\s]+)\)|\[([^\]]+)\]\(([^)\s]+)\)|(https?:\/\/[^\s)<>]+)|(\*\*[^*]+\*\*)|(__[^_]+__)|(\*[^*\n]+\*)|(_[^_\n]+_)/g;
  let lastIndex = 0;
  let m: RegExpExecArray | null;
  let key = 0;
@@ -291,40 +314,61 @@ function renderInline(text: string): ReactNode {
          {m[1].slice(1, -1)}
        </code>,
      );
-    } else if (m[2] && m[3]) {
+    } else if (m[3] !== undefined) {
+      // Image: m[2] = alt (may be empty), m[3] = src
+      const src = m[3];
+      const alt = m[2] || '';
+      if (isSafeMarkdownImageSrc(src)) {
+        out.push(
+          <img
+            key={key++}
+            className="md-image"
+            src={src}
+            alt={alt}
+            loading="lazy"
+            referrerPolicy="no-referrer"
+            style={{ maxWidth: '100%', height: 'auto', borderRadius: 6 }}
+          />,
+        );
+      } else {
+        // Unsafe scheme — drop the image tag but keep the alt text so
+        // the user sees what the model meant to show.
+        pushText(out, alt, key++);
+      }
+    } else if (m[4] && m[5]) {
      out.push(
        <a
          key={key++}
          className="md-link"
-          href={m[3]}
-          target="_blank"
-          rel="noreferrer noopener"
-        >
-          {m[2]}
-        </a>,
-      );
-    } else if (m[4]) {
-      // Bare URL — autolink with the URL as both href and visible text,
-      // matching the Markdown `<https://…>` autolink convention.
-      out.push(
-        <a
-          key={key++}
-          className="md-link md-link-bare"
-          href={m[4]}
+          href={m[5]}
          target="_blank"
          rel="noreferrer noopener"
        >
          {m[4]}
        </a>,
      );
-    } else if (m[5]) {
-      out.push(<strong key={key++}>{m[5].slice(2, -2)}</strong>);
    } else if (m[6]) {
-      out.push(<strong key={key++}>{m[6].slice(2, -2)}</strong>);
+      // Bare URL — autolink with the URL as both href and visible text,
+      // matching the Markdown `<https://…>` autolink convention.
+      out.push(
+        <a
+          key={key++}
+          className="md-link md-link-bare"
+          href={m[6]}
+          target="_blank"
+          rel="noreferrer noopener"
+        >
+          {m[6]}
+        </a>,
+      );
    } else if (m[7]) {
-      out.push(<em key={key++}>{m[7].slice(1, -1)}</em>);
+      out.push(<strong key={key++}>{m[7].slice(2, -2)}</strong>);
    } else if (m[8]) {
-      out.push(<em key={key++}>{m[8].slice(1, -1)}</em>);
+      out.push(<strong key={key++}>{m[8].slice(2, -2)}</strong>);
+    } else if (m[9]) {
+      out.push(<em key={key++}>{m[9].slice(1, -1)}</em>);
+    } else if (m[10]) {
+      out.push(<em key={key++}>{m[10].slice(1, -1)}</em>);
    }
    lastIndex = re.lastIndex;
  }
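
The image branch above combines a regex match with a scheme allowlist. A reduced TypeScript sketch of that flow follows, under the assumption that only image syntax is tokenized; `tokenizeImages`, `isSafeImageSrc`, and the `Token` type are hypothetical names for illustration and do not appear in markdown.tsx, and the sketch returns plain objects instead of React nodes.

```typescript
// Hypothetical standalone sketch; not the project's renderer.
type Token =
  | { kind: 'text'; value: string }
  | { kind: 'image'; alt: string; src: string };

function isSafeImageSrc(src: string): boolean {
  if (!src) return false;
  // Root-relative paths pass; protocol-relative (`//host`) does not.
  if (src.startsWith('/') && !src.startsWith('//')) return true;
  return (
    src.startsWith('http://') ||
    src.startsWith('https://') ||
    src.startsWith('data:image/') ||
    src.startsWith('blob:')
  );
}

function tokenizeImages(text: string): Token[] {
  // Same image pattern as the diff: alt may be empty, src stops at
  // the first `)` or whitespace.
  const re = /!\[([^\]]*)\]\(([^)\s]+)\)/g;
  const out: Token[] = [];
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (m.index > last) {
      out.push({ kind: 'text', value: text.slice(last, m.index) });
    }
    const alt = m[1] ?? '';
    const src = m[2] ?? '';
    if (isSafeImageSrc(src)) {
      out.push({ kind: 'image', alt, src });
    } else if (alt) {
      // Unsafe src: drop the image, keep the alt text visible.
      out.push({ kind: 'text', value: alt });
    }
    last = re.lastIndex;
  }
  if (last < text.length) {
    out.push({ kind: 'text', value: text.slice(last) });
  }
  return out;
}
```

One consequence of the `[^)\s]+` source pattern is that a URL containing `)` is cut at the first closing paren, so for `![h](javascript:alert(1))` the rejected source is `javascript:alert(1` and the final `)` is left behind as plain text.
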
--- a/apps/web/src/state/apiProtocols.ts
+++ b/apps/web/src/state/apiProtocols.ts
@@ -65,6 +65,22 @@ export const SUGGESTED_MODELS_BY_PROTOCOL: Record<ApiProtocol, readonly string[]
    'gemini-1.5-pro',
    'gemini-1.5-flash',
  ],
+  senseaudio: [
+    // SenseAudio is an OpenAI-compatible gateway that fronts both its own
+    // models (senseaudio-s2 family) and aggregator routes to deepseek /
+    // glm / kimi / minimax. Listing the headline house models first keeps
+    // the picker's default selection on a SenseAudio-native checkpoint;
+    // the aggregator IDs trail so users who arrived for a specific
+    // upstream still find it in this tab without retyping it.
+    'senseaudio-s2',
+    'senseaudio-s2-flash',
+    'deepseek-v4-flash',
+    'deepseek-v4-pro',
+    'glm-5.1',
+    'kimi-k2.6',
+    'MiniMax-M2.7-highspeed',
+    'MiniMax-M2.7',
+  ],
  ollama: [
    'cogito-2.1:671b',
    'deepseek-v3.1:671b',
@@ -123,6 +139,7 @@ export const FAST_MODEL_BY_PROTOCOL: Record<ApiProtocol, string> = {
  // pick produces a deterministic answer; users who care can override
  // through the Memory model picker.
  ollama: 'gemma3:4b',
+  senseaudio: 'senseaudio-s2-flash',
 };

 export const API_PROTOCOL_TABS: ReadonlyArray<{
@@ -134,6 +151,7 @@ export const API_PROTOCOL_TABS: ReadonlyArray<{
  { id: 'azure', title: 'Azure OpenAI' },
  { id: 'google', title: 'Google Gemini' },
  { id: 'ollama', title: 'Ollama Cloud' },
+  { id: 'senseaudio', title: 'SenseAudio' },
 ];

 export const API_PROTOCOL_LABELS: Record<ApiProtocol, string> = {
@@ -142,6 +160,7 @@ export const API_PROTOCOL_LABELS: Record<ApiProtocol, string> = {
  azure: 'Azure OpenAI',
  google: 'Google Gemini',
  ollama: 'Ollama Cloud API',
+  senseaudio: 'SenseAudio API',
 };

 export const API_KEY_PLACEHOLDERS: Record<ApiProtocol, string> = {
@@ -150,6 +169,7 @@ export const API_KEY_PLACEHOLDERS: Record<ApiProtocol, string> = {
  azure: 'azure key',
  google: 'AIza...',
  ollama: 'Ollama API key',
+  senseaudio: 'SenseAudio API key',
 };

 // Default base URL the daemon assumes when the user leaves the field
@@ -161,4 +181,5 @@ export const DEFAULT_BASE_URL_BY_PROTOCOL: Record<ApiProtocol, string> = {
  azure: '',
  google: 'https://generativelanguage.googleapis.com',
  ollama: 'https://ollama.com',
+  senseaudio: 'https://api.senseaudio.cn',
 };
--- a/apps/web/src/state/config.ts
+++ b/apps/web/src/state/config.ts
@@ -249,6 +249,22 @@ export const KNOWN_PROVIDERS: KnownProvider[] = [
    model: 'mimo-v2.5-pro',
    models: ['mimo-v2.5-pro'],
  },
+  {
+    label: 'SenseAudio',
+    protocol: 'senseaudio',
+    baseUrl: 'https://api.senseaudio.cn',
+    model: 'senseaudio-s2',
+    models: [
+      'senseaudio-s2',
+      'senseaudio-s2-flash',
+      'deepseek-v4-flash',
+      'deepseek-v4-pro',
+      'glm-5.1',
+      'kimi-k2.6',
+      'MiniMax-M2.7-highspeed',
+      'MiniMax-M2.7',
+    ],
+  },
 ];

 function normalizePet(input: Partial<PetConfig> | undefined): PetConfig {
@@ -290,6 +306,10 @@ function inferApiProtocol(model: string, baseUrl: string): ApiProtocol {
    // protocol so both chat and the connection test hit the native Ollama
    // proxy instead of the Anthropic or OpenAI paths.
    if (normalized.includes('ollama.com')) return 'ollama';
+    // SenseAudio host gets routed to its own proxy so the daemon log line
+    // and the BYOK tab UI stay consistent with the protocol the user
+    // picked — even though the on-wire shape is OpenAI-compatible.
+    if (normalized.includes('senseaudio.cn')) return 'senseaudio';
    return isOpenAICompatible(model, baseUrl) ? 'openai' : 'anthropic';
  } catch {
    // Preserve the rest of the user's settings even if an old saved base URL is
--- a/apps/web/src/types.ts
+++ b/apps/web/src/types.ts
@@ -91,7 +91,7 @@ export type {
 } from '@open-design/contracts';

 export type ExecMode = 'daemon' | 'api';
-export type ApiProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama';
+export type ApiProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama' | 'senseaudio';

 export type LiveArtifactTabId = `live:${string}`;
 export type ProjectWorkspaceTabId = string | LiveArtifactTabId;
@@ -180,6 +180,13 @@ export interface ApiProtocolConfig {
  model: string;
  apiVersion?: string;
  apiProviderBaseUrl?: string | null;
+  /** SenseAudio BYOK only — default image model the daemon-side
+   *  `generate_image` tool uses when the LLM doesn't pass one. Carries
+   *  one of the SenseAudio image model ids (`senseaudio-image-2.0-260319`,
+   *  `senseaudio-image-1.0-260319`, `doubao-seedream-5-0-260128`). Stored
+   *  per-protocol so flipping between BYOK tabs doesn't reset the
+   *  SenseAudio image-model choice. */
+  byokImageModel?: string;
 }

 // Per-CLI model + reasoning the user picked in the model menu. Each agent
@@ -294,6 +301,11 @@ export interface AppConfig {
  model: string;
  apiProtocol?: ApiProtocol;
  apiVersion?: string;
+  /** SenseAudio BYOK only — default image model for the daemon-side
+   *  generate_image tool. Mirrors apiProtocolConfigs.senseaudio.byokImageModel
+   *  so the active protocol's value lives at the top level (consistent
+   *  with how apiKey / baseUrl / model are projected onto AppConfig). */
+  byokImageModel?: string;
  apiProtocolConfigs?: Partial<Record<ApiProtocol, ApiProtocolConfig>>;
  /** Internal config schema/migration version for localStorage upgrades. */
  configMigrationVersion?: number;
--- a/apps/web/src/utils/apiProtocol.ts
+++ b/apps/web/src/utils/apiProtocol.ts
@@ -6,6 +6,7 @@ const API_PROTOCOL_LABELS: Record<ApiProtocol, string> = {
  azure: 'Azure OpenAI',
  google: 'Google Gemini',
  ollama: 'Ollama Cloud API',
+  senseaudio: 'SenseAudio API',
 };

 const API_PROTOCOL_AGENT_IDS: Record<ApiProtocol, string> = {
@@ -14,6 +15,7 @@ const API_PROTOCOL_AGENT_IDS: Record<ApiProtocol, string> = {
  azure: 'azure-openai-api',
  google: 'google-gemini-api',
  ollama: 'ollama-cloud-api',
+  senseaudio: 'senseaudio-api',
 };

 export function apiProtocolLabel(protocol: ApiProtocol | undefined): string {
--- a/apps/web/tests/runtime/markdown.test.tsx
+++ b/apps/web/tests/runtime/markdown.test.tsx
@ -105,4 +105,67 @@ describe('renderMarkdown', () => {
    const bodyTd = (out.match(/<tbody>[\s\S]*<\/tbody>/)?.[0] ?? '').match(/<td/g) ?? [];
    expect(bodyTd.length).toBe(2);
  });
+
+  it('renders ![alt](url) as <img> for relative BYOK image URLs', () => {
+    const out = html('Here is your cat: ![cute kitten](/api/byok-image/abc-123.png)');
+    expect(out).toContain('<img');
+    expect(out).toContain('class="md-image"');
+    expect(out).toContain('src="/api/byok-image/abc-123.png"');
+    expect(out).toContain('alt="cute kitten"');
+    expect(out).toContain('loading="lazy"');
+    expect(out).toContain('referrerPolicy="no-referrer"');
+    // Image syntax must NOT be turned into an <a> link: `[alt](url)`
+    // with a leading `!` is an image, not a link.
+    expect(out).not.toContain('<a class="md-link"');
+  });
+
+  it('renders ![](url) with empty alt text', () => {
+    const out = html('![](/api/byok-image/abc.png)');
+    expect(out).toContain('<img');
+    expect(out).toContain('alt=""');
+  });
+
+  it('renders https image URLs', () => {
+    const out = html('![logo](https://example.com/logo.png)');
+    expect(out).toContain('<img');
+    expect(out).toContain('src="https://example.com/logo.png"');
+  });
+
+  it('renders data: image URIs', () => {
+    const out = html('![inline](data:image/png;base64,iVBORw0KGgo=)');
+    expect(out).toContain('<img');
+    expect(out).toContain('src="data:image/png;base64,iVBORw0KGgo="');
+  });
+
+  it('drops image tags with unsafe schemes and keeps alt text as plain text', () => {
+    const out = html('![hacked](javascript:alert(1))');
+    expect(out).not.toContain('<img');
+    expect(out).not.toContain('javascript:');
+    expect(out).toContain('hacked');
+  });
+
+  it('rejects protocol-relative image URLs (could load cross-origin)', () => {
+    // `//evil.com/track.png` would inherit the page protocol; not in our
+    // allowlist. Should fall through to alt-as-text.
+    const out = html('![track](//evil.com/track.png)');
+    expect(out).not.toContain('<img');
+    expect(out).toContain('track');
+  });
+
+  it('keeps regular [text](url) links working alongside image syntax', () => {
+    const out = html('Click [here](https://example.com) and look ![image](/api/byok-image/a.png)');
+    expect(out).toContain('<a class="md-link"');
+    expect(out).toContain('href="https://example.com"');
+    expect(out).toContain('>here</a>');
+    expect(out).toContain('<img');
+    expect(out).toContain('src="/api/byok-image/a.png"');
+  });
+
+  it('preserves bold + italic + code after the image regex addition', () => {
+    const out = html('**b** and *i* and `c` and ![a](/p.png)');
+    expect(out).toContain('<strong>b</strong>');
+    expect(out).toContain('<em>i</em>');
+    expect(out).toContain('<code class="md-inline-code">c</code>');
+    expect(out).toContain('<img');
+  });
 });
--- a/packages/contracts/src/analytics/events.ts
+++ b/packages/contracts/src/analytics/events.ts
@@ -229,7 +229,7 @@ export interface SettingsClickByokProviderOptionProps {
  // Tracking doc names azure/google/ollama as azure_openai/google_gemini/
  // ollama_cloud — we forward the code value verbatim and let dashboards
  // map; see tracking-doc-issues.md §2.5.
-  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google';
+  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google' | 'senseaudio';
  // True when the clicked chip was already the active protocol (no-op
  // toggle); false when the click switches protocol.
  is_selected: boolean;
@@ -242,10 +242,10 @@ export interface SettingsClickByokFieldProps {
  action: 'focus_byok_field';
  field_id: 'api_key' | 'base_url' | 'model';
  // Code's `apiProtocol` is wider than the CSV's BYOK provider enum
-  // (anthropic|openai|azure|ollama|google). We forward the code value
-  // verbatim so dashboards can group by the actual protocol; the CSV enum
-  // is a strict subset the product team can revise.
-  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google';
+  // (anthropic|openai|azure|ollama|google|senseaudio). We forward the code
+  // value verbatim so dashboards can group by the actual protocol; the CSV
+  // enum is a strict subset the product team can revise.
+  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google' | 'senseaudio';
  has_value: boolean;
 }

@@ -261,7 +261,7 @@ export interface SettingsCliTestResultProps {
 export interface SettingsByokTestResultProps {
  page: 'settings';
  area: 'execution_model';
-  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google';
+  provider_id: 'anthropic' | 'openai' | 'azure' | 'ollama' | 'google' | 'senseaudio';
  result: 'success' | 'failed' | 'timeout';
  error_code?: string;
  duration_ms: number;
--- a/packages/contracts/src/api/connectionTest.ts
+++ b/packages/contracts/src/api/connectionTest.ts
@@ -139,7 +139,7 @@ export type ConnectionTestKind =
  | 'agent_spawn_failed'
  | 'unknown';

-export type ConnectionTestProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama';
+export type ConnectionTestProtocol = 'anthropic' | 'openai' | 'azure' | 'google' | 'ollama' | 'senseaudio';

 export interface ProviderTestRequest {
  protocol: ConnectionTestProtocol;
--- a/packages/contracts/src/api/memory.ts
+++ b/packages/contracts/src/api/memory.ts
@@ -80,16 +80,19 @@ export interface MemoryListResponse {
 /** Provider/protocol the memory extractor calls. Mirrors the chat
 *  BYOK form's protocols — anthropic + openai-compatible + azure
 *  (openai-compatible at a different URL/header) + google gemini +
- *  ollama (also openai-compatible, just hosted on Ollama Cloud) — so
- *  the memory picker can offer the same options as the chat picker
- *  above it. The daemon routes ollama through the same callOpenAI
- *  path since the wire protocol is identical. */
+ *  ollama (also openai-compatible, just hosted on Ollama Cloud) +
+ *  senseaudio (also openai-compatible, SenseAudio's OpenAI-shaped
+ *  /v1/chat/completions gateway) — so the memory picker can offer the
+ *  same options as the chat picker above it. The daemon routes both
+ *  ollama and senseaudio through the same callOpenAI path since the
+ *  wire protocol is identical. */
 export type MemoryExtractionProvider =
  | 'anthropic'
  | 'openai'
  | 'azure'
  | 'google'
-  | 'ollama';
+  | 'ollama'
+  | 'senseaudio';

 /** Masked version of MemoryExtractionConfig returned by GET endpoints —
 *  the api key field is replaced with a 4-char tail so the settings UI