mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
* feat(senseaudio): BYOK chat with image + video generation tools
Adds SenseAudio as a first-class BYOK chat protocol and wires the daemon's
chat proxy with a tool loop so BYOK users can generate images and videos
without dropping to a CLI agent.
- BYOK protocol: new senseaudio tab + /api/proxy/senseaudio/stream route +
connection-test + provider-models discovery (OpenAI-compatible wire)
- Tool loop: generate_image (synchronous /v1/image/sync) and generate_video
(async /v1/video/create + 5s polling /v1/video/status, 10-min ceiling,
periodic progress log every 30s)
- Settings dropdown + chat-composer dropdown for the BYOK image model
default; generate_image's model enum lets the LLM override per call
- Seed-on-success: a successful BYOK chat call idempotently mirrors the
key into media-config (preserves env-resolved + already-stored keys)
- Generated artifacts land in <projectsRoot>/<projectId>/ so FileViewer,
DesignFilesPanel, and project export pick them up automatically;
legacy /api/byok-image/:id route kept for old conversation links
- Markdown renderer learns `![alt](src)` image syntax with a scheme
allowlist (http(s) / data:image/ / blob: / relative paths)
- i18n key settings.byokImageModel across all 19 locales
- 3 SenseAudio image models registered (2.0, 1.0, doubao-seedream-5.0);
1 video model (doubao-seedance-2.0)
- Tests: byok-tools (29), media-senseaudio-image (8), media-config seed
(7), proxy-routes (47), markdown image rendering (8)
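A minimal sketch of the scheme allowlist described for the Markdown renderer, assuming a single predicate applied to the image `src`. The name `isAllowedImageSrc` and the handling of protocol-relative URLs are assumptions made for illustration, not the renderer's actual code.

```typescript
// Hypothetical helper: name and exact rules are assumptions for illustration.
const ALLOWED_SCHEMES = ['http:', 'https:', 'blob:'];

function isAllowedImageSrc(src: string): boolean {
  const trimmed = src.trim();
  if (trimmed === '') return false;

  // A leading "scheme:" as defined by RFC 3986 (letter, then letters/digits/+/./-).
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(trimmed);
  if (!match) {
    // No scheme means a relative path. Protocol-relative "//host/..." is
    // rejected here as a conservative choice of this sketch.
    return !/^\/\//.test(trimmed);
  }

  const scheme = match[1].toLowerCase() + ':';
  // data: URLs are only accepted for image media types.
  if (scheme === 'data:') return /^data:image\//i.test(trimmed);
  return ALLOWED_SCHEMES.indexOf(scheme) !== -1;
}
```

Under these assumptions, `javascript:` and `data:text/html` sources are dropped, while project-relative paths such as `sa-hero.png` still render.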
* fix(senseaudio): unblock image gen + design file preview switching
- SenseAudio /v1/image/sync rejected the previous size mapping with
`参数错误:size` ("parameter error: size"; 1664x936, 936x1664, 1280x960, 960x1280 are not in
the gateway's accepted set). Switched to standard HD / SD sizes that
every aspect bucket can hit: 1024×1024, 1280×720, 720×1280,
1024×768, 768×1024. Kept the byok-tools and media.ts tables in sync
so the BYOK chat tool and the CLI agent path both stop failing on
non-square aspects.
- DesignFilesPanel's <DfPreview> was missing a key prop, so React
reused the same iframe DOM node when the user picked a different
file — the src prop changed but the iframe never navigated. Added
key={previewFile.name} so the previous preview unmounts cleanly.
- Updated byok-tools + media-senseaudio-image tests for the new size
expectations.
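The size fix above can be pictured as one lookup table shared by both call paths. This is a sketch under assumptions: only the square default and the `16:9` to `1280x720` pairing are confirmed by the tests in this file. The remaining pairings are inferred from the ratios, and the table name, function name, and square fallback for unknown aspects are hypothetical.

```typescript
// Hypothetical names; only '1:1' (default) and '16:9' are confirmed by the tests.
type Aspect = '1:1' | '16:9' | '9:16' | '4:3' | '3:4';

const SENSEAUDIO_SIZE_BY_ASPECT: Record<Aspect, string> = {
  '1:1': '1024x1024',
  '16:9': '1280x720',
  '9:16': '720x1280', // inferred from the ratio
  '4:3': '1024x768', // inferred from the ratio
  '3:4': '768x1024', // inferred from the ratio
};

function senseAudioSizeFor(aspect?: string): string {
  if (
    aspect !== undefined &&
    Object.prototype.hasOwnProperty.call(SENSEAUDIO_SIZE_BY_ASPECT, aspect)
  ) {
    return SENSEAUDIO_SIZE_BY_ASPECT[aspect as Aspect];
  }
  // Assumption: unknown or missing aspects fall back to the square size.
  return SENSEAUDIO_SIZE_BY_ASPECT['1:1'];
}
```

Keeping a single table is one way to stop the BYOK chat tool and the CLI agent path from drifting apart again.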
* docs(senseaudio): clear stale provider hint + update README
- Settings → Media → SenseAudio: clear the auto-promoted
"Image · TTS · 70+ voices · clone" hint; the provider label alone is
enough now that the BYOK chat surface covers image + video tooling.
- README: list the new senseaudio (and missing ollama) proxy routes so
the BYOK section reflects what the daemon actually serves, and
mention the generate_image / generate_video chat tools that ship
with the SenseAudio path.
* fix(senseaudio): address PR #2065 review feedback
Three non-blocking review notes from @PerishCode on PR #2065:
1. Drop the dead /api/byok-image/:id route. The PR description claimed
it was "legacy fallback for old chat history" but that storage
layout never existed on main, so the route can only ever 400 or
404 — never 200. Removed the handler, the isSafeByokImageId
export, the unused createReadStream / stat / path / Request /
Response imports, and the two byok-image regression tests.
2. Add rejectProxyPluginContext guard to the senseaudio proxy
handler so it matches the invariant the other five proxy paths
already enforce (plugin runs must go through /api/runs for
snapshot pinning). Extended the existing "API fallback rejects
plugin runs" describe to also cover /api/proxy/senseaudio/stream
with the 409 PLUGIN_REQUIRES_DAEMON expectation.
3. Wrap the secondary image / video downloads (the URLs the
SenseAudio gateway hands back in /v1/image/sync .url and
/v1/video/status .video_url) in validateBaseUrlResolved so a
malicious gateway can't point us at 169.254.169.254 (AWS / Azure
metadata) or RFC1918 hosts via the response payload. Also passed
`redirect: 'error'` on both fetches to match the SSRF posture
the primary proxy fetch already uses. The new
assertExternalAssetUrl helper lives next to executeGenerateImage
so future tool downloads can reuse it.
Tests: 120/120 daemon tests pass; guard + typecheck green.
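To illustrate the kind of check item 3 describes, here is a simplified sketch that rejects gateway-returned URLs whose host is a private, loopback, or link-local IPv4 literal. It is not the real `assertExternalAssetUrl`: that helper is described as wrapping `validateBaseUrlResolved`, which presumably also resolves hostnames, whereas this sketch inspects literals only. Both function names below are hypothetical.

```typescript
// Hypothetical sketch; it checks IPv4 literals only and does no DNS resolution.
function isPrivateOrLinkLocalIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false; // not an IPv4 literal; a real guard must resolve the name
  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 10) return true; // 10.0.0.0/8 (RFC 1918)
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12 (RFC 1918)
  if (a === 192 && b === 168) return true; // 192.168.0.0/16 (RFC 1918)
  if (a === 169 && b === 254) return true; // 169.254.0.0/16, incl. cloud metadata
  if (a === 127) return true; // loopback
  return false;
}

function assertPublicAssetUrlSketch(raw: string): string {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    throw new Error('asset url is not a valid URL');
  }
  if (url.protocol !== 'https:' && url.protocol !== 'http:') {
    throw new Error('asset url must use http(s)');
  }
  if (isPrivateOrLinkLocalIPv4(url.hostname)) {
    throw new Error('asset url points at a private or link-local address');
  }
  return url.toString();
}
```

A caller would validate the URL first and then fetch it with `redirect: 'error'`, so a 3xx response cannot route the download into private address space after the check has passed.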
* fix(senseaudio): mirror SSRF guard onto renderSenseAudioImage CLI path
Follow-up to 01b1260a — the chat-tool fix in byok-tools.ts wasn't
mirrored onto the parallel renderSenseAudioImage path in media.ts.
Same attacker-controllable shape (gateway-returned `data.url`),
same one-line fix.
- Hoist assertExternalAssetUrl from byok-tools.ts into
connectionTest.ts next to validateBaseUrlResolved so both call
sites (the BYOK chat tool loop AND the CLI agent media dispatcher)
share one helper. Made the error strings provider-agnostic so a
future caller doesn't get a misleading "senseaudio" attribution
for a Volcengine / Grok / etc. download.
- renderSenseAudioImage now runs the response url through
assertExternalAssetUrl before fetching bytes, and passes
redirect: 'error' to block a 3xx hop into private space.
Scope intentionally limited to the senseaudio path PerishCode
flagged; the other unguarded fetch(entry.url) call sites in
media.ts (OpenAI / Volcengine / Grok / Nano-Banana) are pre-existing
patterns and belong in a separate follow-up if the daemon wants
defense-in-depth across every provider.
Tests: 127/127 daemon tests pass; guard + typecheck green.
---------
Co-authored-by: unknown <mazeliang@sensetime.com>
305 lines
9.3 KiB
TypeScript
import { mkdir, mkdtemp, readFile, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import path from 'node:path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

import { generateMedia } from '../src/media.js';

const TEST_SENSEAUDIO_BASE_URL = 'https://senseaudio-gateway.example.test';
const TEST_IMAGE_URL = 'https://cdn.example.test/generated/abc.png';
const TEST_IMAGE_BYTES = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x01]);

function buildOkResponse(url = TEST_IMAGE_URL) {
  return new Response(
    JSON.stringify({ url, base_resp: { status_code: 0, status_msg: 'success' } }),
    { status: 200, headers: { 'content-type': 'application/json' } },
  );
}

function buildImageFetchResponse(bytes: Buffer) {
  return new Response(bytes, {
    status: 200,
    headers: { 'content-type': 'image/png' },
  });
}

describe('senseaudio image generation', () => {
  let root: string;
  let projectRoot: string;
  let projectsRoot: string;
  const realFetch = globalThis.fetch;
  const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
  const originalDataDir = process.env.OD_DATA_DIR;

  beforeEach(async () => {
    root = await mkdtemp(path.join(tmpdir(), 'od-senseaudio-image-'));
    projectRoot = path.join(root, 'project-root');
    projectsRoot = path.join(projectRoot, '.od', 'projects');
    await mkdir(projectsRoot, { recursive: true });
    delete process.env.OD_MEDIA_CONFIG_DIR;
    delete process.env.OD_DATA_DIR;
    delete process.env.OD_SENSEAUDIO_API_KEY;
    delete process.env.SENSEAUDIO_API_KEY;
  });

  afterEach(async () => {
    globalThis.fetch = realFetch;
    if (originalMediaConfigDir == null) {
      delete process.env.OD_MEDIA_CONFIG_DIR;
    } else {
      process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
    }
    if (originalDataDir == null) {
      delete process.env.OD_DATA_DIR;
    } else {
      process.env.OD_DATA_DIR = originalDataDir;
    }
    delete process.env.OD_SENSEAUDIO_API_KEY;
    delete process.env.SENSEAUDIO_API_KEY;
    await rm(root, { recursive: true, force: true });
  });

  async function writeConfig(data: unknown) {
    const file = path.join(projectRoot, '.od', 'media-config.json');
    await mkdir(path.dirname(file), { recursive: true });
    await writeFile(file, JSON.stringify(data), 'utf8');
  }

  it('renders a SenseAudio image with the documented sync defaults', async () => {
    await writeConfig({
      providers: {
        senseaudio: {
          apiKey: 'sense-test-key',
          baseUrl: TEST_SENSEAUDIO_BASE_URL,
        },
      },
    });

    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const urlStr = String(input);
      if (urlStr === `${TEST_SENSEAUDIO_BASE_URL}/v1/image/sync`) {
        expect(init?.method).toBe('POST');
        expect(init?.headers).toMatchObject({
          authorization: 'Bearer sense-test-key',
          'content-type': 'application/json',
        });
        expect(JSON.parse(String(init?.body))).toEqual({
          model: 'senseaudio-image-2.0-260319',
          prompt: 'A magazine-style hero poster.',
          size: '1024x1024',
        });
        return buildOkResponse();
      }
      if (urlStr === TEST_IMAGE_URL) {
        return buildImageFetchResponse(TEST_IMAGE_BYTES);
      }
      throw new Error(`unexpected fetch: ${urlStr}`);
    });
    vi.stubGlobal('fetch', fetchMock);

    const result = await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'senseaudio-image-2.0-260319',
      prompt: 'A magazine-style hero poster.',
      output: 'sa-hero.png',
    });

    expect(fetchMock).toHaveBeenCalledTimes(2);
    expect(result.providerId).toBe('senseaudio');
    expect(result.providerNote).toContain('senseaudio/senseaudio-image-2.0-260319');
    expect(result.providerNote).toContain('1024x1024');

    const bytes = await readFile(path.join(projectsRoot, 'project-1', 'sa-hero.png'));
    expect(bytes.equals(TEST_IMAGE_BYTES)).toBe(true);
  });

  it('maps aspect ratios to the SenseAudio size strings', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });

    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      const urlStr = String(input);
      if (urlStr === `${TEST_SENSEAUDIO_BASE_URL}/v1/image/sync`) {
        expect(JSON.parse(String(init?.body)).size).toBe('1280x720');
        return buildOkResponse();
      }
      return buildImageFetchResponse(TEST_IMAGE_BYTES);
    });
    vi.stubGlobal('fetch', fetchMock);

    await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'senseaudio-image-1.0-260319',
      aspect: '16:9',
      prompt: 'Widescreen banner.',
      output: 'sa-banner.png',
    });

    expect(fetchMock).toHaveBeenCalledTimes(2);
  });

  it('falls back to the canonical base URL when none is configured', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key' },
      },
    });

    const fetchMock = vi.fn(async (input: unknown) => {
      const urlStr = String(input);
      if (urlStr === 'https://api.senseaudio.cn/v1/image/sync') {
        return buildOkResponse();
      }
      return buildImageFetchResponse(TEST_IMAGE_BYTES);
    });
    vi.stubGlobal('fetch', fetchMock);

    await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'doubao-seedream-5-0-260128',
      prompt: 'Default base url.',
      output: 'sa-default-base.png',
    });

    expect(fetchMock).toHaveBeenCalledTimes(2);
  });

  it('reads the API key from OD_SENSEAUDIO_API_KEY when storage is empty', async () => {
    process.env.OD_SENSEAUDIO_API_KEY = 'env-sense-key';
    const fetchMock = vi.fn(async (input: unknown, init?: RequestInit) => {
      if (String(input).endsWith('/v1/image/sync')) {
        expect(init?.headers).toMatchObject({ authorization: 'Bearer env-sense-key' });
        return buildOkResponse();
      }
      return buildImageFetchResponse(TEST_IMAGE_BYTES);
    });
    vi.stubGlobal('fetch', fetchMock);

    await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'senseaudio-image-2.0-260319',
      prompt: 'Env-only key.',
      output: 'sa-env.png',
    });

    expect(fetchMock).toHaveBeenCalledTimes(2);
  });

  it('errors when no API key is configured', async () => {
    const fetchMock = vi.fn();
    vi.stubGlobal('fetch', fetchMock);

    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Should fail.',
        output: 'sa-no-key.png',
      }),
    ).rejects.toThrow(/no SenseAudio API key/);

    expect(fetchMock).not.toHaveBeenCalled();
  });

  it('surfaces HTTP-level failures with the status code and truncated body', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });

    const fetchMock = vi.fn(async () =>
      new Response('unauthorized', {
        status: 401,
        headers: { 'content-type': 'text/plain' },
      }),
    );
    vi.stubGlobal('fetch', fetchMock);

    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Bad auth.',
        output: 'sa-401.png',
      }),
    ).rejects.toThrow('senseaudio image 401: unauthorized');
  });

  it('surfaces upstream error_message verbatim when the body reports failure', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });

    const fetchMock = vi.fn(async () =>
      new Response(
        JSON.stringify({ error_message: 'sensitive_content_blocked' }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);

    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Blocked.',
        output: 'sa-blocked.png',
      }),
    ).rejects.toThrow('senseaudio image api error: sensitive_content_blocked');
  });

  it('errors when the response body is missing the image url', async () => {
    await writeConfig({
      providers: {
        senseaudio: { apiKey: 'sense-test-key', baseUrl: TEST_SENSEAUDIO_BASE_URL },
      },
    });

    const fetchMock = vi.fn(async () =>
      new Response(
        JSON.stringify({ base_resp: { status_code: 0, status_msg: 'success' } }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);

    await expect(
      generateMedia({
        projectRoot,
        projectsRoot,
        projectId: 'project-1',
        surface: 'image',
        model: 'senseaudio-image-2.0-260319',
        prompt: 'Missing url.',
        output: 'sa-missing-url.png',
      }),
    ).rejects.toThrow('senseaudio image response missing url');
  });
});