open-design/apps/daemon/tests/media-alias-capability.test.ts
Nagendhra Madishetti d5566d7627
feat(daemon): user-configurable model alias for the media dispatcher (#1277) (#1309)
* feat(daemon): user-configurable model alias / redirect for the media dispatcher (#1277)

Tilmirs's use case in #1277: their Doubao access has moved from
`doubao-seedream-3-0-t2i-250415` to `doubao-seedream-5-0`, but the
project's registered catalog still emits the old id. Every call
fails because the old name no longer resolves at Volcengine.
Until now the only workaround was patching the source on every
update.

This adds a user-configurable alias layer that swaps the catalog
id for whatever wire-name the provider expects, without changing
the catalog itself. Two storage layers (env wins over disk,
matching the rest of media-config):

1. **Environment variable** `OD_MEDIA_MODEL_ALIASES` carries a
   JSON map: `'{"doubao-seedream-3-0-t2i-250415":"doubao-seedream-5-0"}'`.
   Single var, portable across shells (Windows cmd.exe rejects
   hyphens in env-var names, so the per-id pattern lefarcen
   suggested wouldn't have worked on Windows). Malformed JSON is
   tolerated: resolution falls through to the on-disk map rather
   than blowing up mid-generation.

2. **media-config.json** gains a top-level `aliases` field:
   ```json
   {
     "providers": { ... },
     "aliases": {
       "doubao-seedream-3-0-t2i-250415": "doubao-seedream-5-0"
     }
   }
   ```
   The Settings UI's existing PUT writes providers only, so the
   writeStored path now reads the existing aliases and preserves
   them on every write. Without that, a Settings save would
   silently wipe the user's aliases. The Settings UI surface for
   editing aliases is a separate follow-up; manual JSON edit and
   the env var are the v1 entry points.
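
A minimal sketch of how the two layers could combine, assuming the
coercion rules listed under Tests below. The helper names here are
hypothetical and are not the daemon's actual exports:

```typescript
// Sketch only: helper names are hypothetical, not the daemon's exports.
type AliasMap = Record<string, string>;

/** Keep string -> string entries with a non-blank key and value; drop the rest. */
function coerceAliasEntries(raw: unknown): AliasMap {
  const out: AliasMap = {};
  if (raw === null || typeof raw !== 'object' || Array.isArray(raw)) return out;
  for (const [key, value] of Object.entries(raw as Record<string, unknown>)) {
    if (typeof value !== 'string') continue; // null / number / nested object
    if (key.trim() === '' || value.trim() === '') continue; // blank key or empty string
    out[key] = value;
  }
  return out;
}

/** Parse the env var; malformed JSON yields an empty map instead of throwing. */
function parseEnvAliasMap(rawEnv: string | undefined): AliasMap {
  if (!rawEnv) return {};
  try {
    return coerceAliasEntries(JSON.parse(rawEnv));
  } catch {
    return {};
  }
}

/** Merge both layers; env entries overwrite stored entries with the same key. */
function mergeAliasLayers(rawEnv: string | undefined, storedAliases: unknown) {
  const env = parseEnvAliasMap(rawEnv);
  const stored = coerceAliasEntries(storedAliases);
  return { effective: { ...stored, ...env }, env, stored };
}
```

Because a malformed env value collapses to an empty map, the spread
leaves the stored entries untouched, which is the fall-through
behaviour described in item 1.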

The resolution happens inside `startMediaGeneration` after the
catalog lookup and surface validation have already accepted the
registered id, so users still get the "unknown model" error if
they request a catalog id that doesn't exist. The swap only
changes what the provider receives on the wire (volcengine,
openai, grok, fal, nanobanana etc. each pass `ctx.model`
straight into their request body).
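
The ordering described above can be pictured roughly as follows.
This is a sketch under assumptions: the catalog set, the error text
and the function name are illustrative, not the real
`startMediaGeneration` internals:

```typescript
// Sketch only: catalog contents and function name are illustrative.
const CATALOG_SKETCH = new Set<string>([
  'doubao-seedream-3-0-t2i-250415',
  'dall-e-3',
]);

function resolveWireModelSketch(
  requested: string,
  aliases: Record<string, string>,
): string {
  // 1. Validation runs against the registered catalog id, so an
  //    unregistered id is rejected before any alias is consulted.
  if (!CATALOG_SKETCH.has(requested)) {
    throw new Error(`unknown model: ${requested}`);
  }
  // 2. Only then is the id swapped for the wire name, if one is configured.
  const hit = Object.prototype.hasOwnProperty.call(aliases, requested)
    ? aliases[requested]
    : undefined;
  return hit ?? requested;
}
```

In this sketch, requesting the wire name directly (for example
`doubao-seedream-5-0`) still fails validation, because aliases map
from catalog ids and never add entries to the catalog.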

Per-provider auto-generated output names and on-disk file naming
use the function-level `model` parameter (the catalog id), so a
`.png` named after `doubao-seedream-3-0-t2i-250415` keeps showing
the registered id the agent / CLI asked for, not the wire-level
alias. `providerNote` strings include the wire name so the user
can see what was actually sent.

Public API additions:
- `resolveModelAlias(projectRoot, modelId)` -> the wire name (or
  the original if no alias matches).
- `readAliasMap(projectRoot)` -> { effective, env, stored } for
  the future Settings UI's source-attribution display.

Tests
- 8 new cases in tests/media-config.test.ts (suite goes 14 -> 22):
  pass-through, stored map, env map, env-over-stored precedence,
  malformed-env fall-through, coercion of bad entries (null /
  number / nested object / empty string / blank key), readAliasMap
  source attribution, and a writeConfig regression that pins
  alias preservation on a Settings-style provider PUT.
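
The alias-preservation behaviour pinned by the last case can be
sketched as a pure merge step. The type and function names are
hypothetical, and the real write path also handles disk I/O that
is omitted here:

```typescript
// Sketch only: names are hypothetical; file I/O is intentionally left out.
interface StoredMediaConfigSketch {
  providers: Record<string, unknown>;
  aliases?: Record<string, string>;
}

/**
 * Apply a Settings-style PUT that carries providers only. Any aliases
 * already on disk are copied forward instead of being dropped.
 */
function applyProviderPutSketch(
  existing: StoredMediaConfigSketch | null,
  providers: Record<string, unknown>,
): StoredMediaConfigSketch {
  const next: StoredMediaConfigSketch = { providers };
  if (existing?.aliases && Object.keys(existing.aliases).length > 0) {
    next.aliases = { ...existing.aliases };
  }
  return next;
}
```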

Validated
- pnpm guard clean
- pnpm --filter @open-design/daemon typecheck clean (both
  tsconfig.json and tsconfig.tests.json)
- Media test suite (media-config + media-tasks-routes +
  media-tasks-persistence + media-nanobanana): 33/33

Pre-existing daemon test failures on Windows (symlinks, CODEX_BIN
runtime resolution, MCP config, skills, server-paths) are
unrelated to this change and reproduce on a clean main checkout.

* fix(daemon): preserve catalog id for capability branches, surface aliases via /api/media/config (PR #1309 review)

Lefarcen + codex P2 on PR #1309: the alias swap overwrote
`ctx.model` globally, which silently disabled every renderer
branch that keys behaviour off the catalog id. A user aliasing
`dall-e-3 -> azure-dalle3-deployment` would have the wire name
swapped correctly but `body.response_format = 'b64_json'` and
`body.quality = 'hd'` would no longer be set, because the
`ctx.model.startsWith('dall-e-')` / `ctx.model === 'dall-e-3'`
checks now saw the alias. The same regression hit the
gpt-image-* size selection, the gpt-4o-mini-tts instructions
branch, and the openaiSizeFor() sizing function.

MediaContext now carries both fields:
- `model` — the registered catalog id (`dall-e-3`,
  `gpt-4o-mini-tts`, `doubao-seedream-3-0-t2i-250415`). All
  model-family capability branches read from here.
- `wireModel` — the post-alias wire name. Every `body.model = `,
  every URL template, and every `providerNote` string reads from
  here so the user sees what was actually sent and the provider
  gets the alias.
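
As an illustration of the split, an OpenAI image renderer might
build its request body along these lines. The interface and
function names are a sketch, not the daemon's actual code; only the
branch conditions and field values come from the description above:

```typescript
// Sketch only: a trimmed-down context and renderer, names hypothetical.
interface MediaContextSketch {
  model: string; // registered catalog id, drives capability branches
  wireModel: string; // post-alias name, sent to the provider
  prompt: string;
}

function buildOpenAIImageBodySketch(
  ctx: MediaContextSketch,
): Record<string, unknown> {
  // The provider sees the wire name.
  const body: Record<string, unknown> = {
    model: ctx.wireModel,
    prompt: ctx.prompt,
  };
  // Model-family behaviour keys off the catalog id, so an alias
  // cannot silently disable these branches.
  if (ctx.model.startsWith('dall-e-')) body.response_format = 'b64_json';
  if (ctx.model === 'dall-e-3') body.quality = 'hd';
  return body;
}
```

With `model: 'dall-e-3'` and `wireModel: 'azure-dalle3-deployment'`,
the body carries the alias in `model` while still setting
`response_format` and `quality`.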

Renderers updated: openai image (body.model + providerNote +
openaiSizeFor keeps catalog), openai speech (body.model +
providerNote + gpt-4o-mini-tts instructions keeps catalog),
volcengine video (body + note), volcengine image (body + note +
openaiSizeFor keeps catalog), grok image (body + note), grok video
(body + note), nanobanana (`credentials.model || ctx.wireModel ||
default` chain), minimax TTS, fishaudio TTS. The MINIMAX/FISHAUDIO
hardcoded maps now sit BEHIND the user alias: explicit user alias
wins over the project's legacy rebranding table, then the table
wins over the catalog id fallback. Stub-fallback diagnostics (the
SVG placeholder + stub providerNote string) keep the catalog id
since those are debug surfaces, not provider calls.
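
The MINIMAX/FISHAUDIO precedence can be sketched as a three-step
fallback. The function names and the ids used in the usage note are
hypothetical, and the real renderers may detect an alias differently
(for example by comparing `ctx.wireModel` with `ctx.model`):

```typescript
// Sketch only: function names are hypothetical.
function ownEntry(map: Record<string, string>, key: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : undefined;
}

/**
 * Precedence: explicit user alias, then the project's legacy
 * rebranding table, then the catalog id itself.
 */
function pickWireNameSketch(
  catalogId: string,
  userAliases: Record<string, string>,
  legacyTable: Record<string, string>,
): string {
  return (
    ownEntry(userAliases, catalogId) ??
    ownEntry(legacyTable, catalogId) ??
    catalogId
  );
}
```

For a hypothetical legacy entry `{ 'tts-old': 'tts-new' }`, a user
alias `{ 'tts-old': 'my-deployment' }` yields `my-deployment`;
without the user alias the table yields `tts-new`.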

Lefarcen P3: the PR description claimed readAliasMap was the
daemon-public API, but the /api/media/config route returned only
readMaskedConfig (which had no aliases field). readMaskedConfig
now returns `{ providers, aliases: { effective, env, stored } }`
so the future Settings UI PR can consume the source-attributed
map directly. The `aliases` field is always present (empty maps
when nothing is configured) so the UI has a stable shape to read.
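
The response shape described above, sketched as types plus a small
builder. The names are illustrative and provider masking is left
out; only the `{ providers, aliases: { effective, env, stored } }`
layout is taken from the change itself:

```typescript
// Sketch only: type and function names are hypothetical.
interface AliasSourcesSketch {
  effective: Record<string, string>;
  env: Record<string, string>;
  stored: Record<string, string>;
}

interface MaskedConfigSketch {
  providers: Record<string, unknown>;
  aliases: AliasSourcesSketch;
}

/** Always returns an `aliases` object, with empty maps when nothing is set. */
function buildMaskedConfigSketch(
  providers: Record<string, unknown>,
  env: Record<string, string> = {},
  stored: Record<string, string> = {},
): MaskedConfigSketch {
  return {
    providers,
    aliases: {
      effective: { ...stored, ...env }, // env wins over disk
      env: { ...env },
      stored: { ...stored },
    },
  };
}
```

A consumer such as the future Settings UI can then read
`aliases.env` and `aliases.stored` without null checks, even on a
fresh install.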

Tests
- New `media-alias-capability.test.ts` (2 jsdom cases) drives
  generateMedia end-to-end with a stubbed fetch and asserts on
  the request body. Pins the regression: aliased dall-e-3 still
  sends `response_format: 'b64_json'` + `quality: 'hd'`; aliased
  gpt-4o-mini-tts still attaches the instructions field from the
  voice prop.
- `media-config.test.ts` grows by 2 cases (suite goes 22 -> 24):
  readMaskedConfig surfaces the alias map (both env and stored
  sources), and the empty-state shape for fresh installs.

Validated
- pnpm guard clean
- pnpm --filter @open-design/daemon typecheck clean (both
  tsconfig.json and tsconfig.tests.json)
- Media test suite (config + alias-capability + nanobanana +
  tasks-persistence + tasks-routes): 37/37

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
2026-05-14 14:58:39 +08:00


/**
 * Regression coverage for the lefarcen + codex P2 on PR #1309: when a
 * user aliases a registered catalog id to a custom wire-name via
 * `OD_MEDIA_MODEL_ALIASES` or media-config.json's `aliases` map, the
 * dispatcher must still apply the model-FAMILY behaviour the catalog
 * id implies (DALL-E response_format, dall-e-3 hd quality,
 * gpt-4o-mini-tts instructions, etc.) and only swap the value that
 * goes into the provider's `body.model` field.
 *
 * The test stubs fetch and asserts on the request body for an
 * aliased dall-e-3 -> azure-custom-deployment call. Before the fix
 * ctx.model was overwritten with the alias, so the
 * `startsWith('dall-e-')` and `=== 'dall-e-3'` branches stopped
 * firing and the body was missing both response_format and the hd
 * quality flag, exactly the regression codex described.
 */
import { mkdir, mkdtemp, rm, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import path from 'node:path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import { generateMedia } from '../src/media.js';
const PNG_BASE64 = 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMCAO+X2uoAAAAASUVORK5CYII=';
describe('media alias preserves catalog-keyed capability branching (#1309 review)', () => {
  let root: string;
  let projectRoot: string;
  let projectsRoot: string;
  const realFetch = globalThis.fetch;
  const originalEnvAliases = process.env.OD_MEDIA_MODEL_ALIASES;
  const originalOpenAIKey = process.env.OPENAI_API_KEY;
  const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
  const originalDataDir = process.env.OD_DATA_DIR;

  beforeEach(async () => {
    root = await mkdtemp(path.join(tmpdir(), 'od-media-alias-cap-'));
    projectRoot = path.join(root, 'project-root');
    projectsRoot = path.join(projectRoot, '.od', 'projects');
    await mkdir(projectsRoot, { recursive: true });
    delete process.env.OD_MEDIA_MODEL_ALIASES;
    delete process.env.OD_MEDIA_CONFIG_DIR;
    delete process.env.OD_DATA_DIR;
    process.env.OPENAI_API_KEY = 'sk-test-key';
  });

  afterEach(async () => {
    globalThis.fetch = realFetch;
    vi.unstubAllGlobals();
    if (originalEnvAliases == null) {
      delete process.env.OD_MEDIA_MODEL_ALIASES;
    } else {
      process.env.OD_MEDIA_MODEL_ALIASES = originalEnvAliases;
    }
    if (originalOpenAIKey == null) {
      delete process.env.OPENAI_API_KEY;
    } else {
      process.env.OPENAI_API_KEY = originalOpenAIKey;
    }
    if (originalMediaConfigDir == null) {
      delete process.env.OD_MEDIA_CONFIG_DIR;
    } else {
      process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
    }
    if (originalDataDir == null) {
      delete process.env.OD_DATA_DIR;
    } else {
      process.env.OD_DATA_DIR = originalDataDir;
    }
    await rm(root, { recursive: true, force: true });
  });

  // Writes a media-config.json under the temp project root so the
  // stored-alias layer has something to read.
  async function writeStoredConfig(data: unknown) {
    const file = path.join(projectRoot, '.od', 'media-config.json');
    await mkdir(path.dirname(file), { recursive: true });
    await writeFile(file, JSON.stringify(data), 'utf8');
  }

  it('alias dall-e-3 -> custom-deployment still sends dall-e-3 response_format + hd quality', async () => {
    await writeStoredConfig({
      providers: {},
      aliases: { 'dall-e-3': 'azure-dalle3-deployment' },
    });
    let capturedBody: Record<string, unknown> | null = null;
    const fetchMock = vi.fn(async (_input: unknown, init?: RequestInit) => {
      capturedBody = JSON.parse(String(init?.body)) as Record<string, unknown>;
      return new Response(
        JSON.stringify({ data: [{ b64_json: PNG_BASE64 }] }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      );
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'image',
      model: 'dall-e-3',
      prompt: 'A watercolor shiba inu under cherry blossoms',
      aspect: '1:1',
      output: 'aliased.png',
    });
    expect(fetchMock).toHaveBeenCalledTimes(1);
    expect(capturedBody).not.toBeNull();
    // Wire name swap landed: the provider receives the alias.
    expect(capturedBody!.model).toBe('azure-dalle3-deployment');
    // Capability branches keyed on the catalog id continue to fire.
    expect(capturedBody!.response_format).toBe('b64_json');
    expect(capturedBody!.quality).toBe('hd');
    // providerNote reflects what was actually sent, so a user
    // inspecting the result sees the wire name.
    expect(result.providerNote).toContain('azure-dalle3-deployment');
    expect(result.providerNote).not.toContain('dall-e-3');
  });

  it('alias gpt-4o-mini-tts -> custom-deployment still attaches style instructions', async () => {
    process.env.OD_MEDIA_MODEL_ALIASES = JSON.stringify({
      'gpt-4o-mini-tts': 'custom-tts-deployment',
    });
    let capturedBody: Record<string, unknown> | null = null;
    const fetchMock = vi.fn(async (_input: unknown, init?: RequestInit) => {
      capturedBody = JSON.parse(String(init?.body)) as Record<string, unknown>;
      // Speech endpoints return raw audio bytes, not JSON.
      return new Response(Buffer.from([1, 2, 3, 4]), {
        status: 200,
        headers: { 'content-type': 'audio/mpeg' },
      });
    });
    vi.stubGlobal('fetch', fetchMock);
    const result = await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'audio',
      audioKind: 'speech',
      model: 'gpt-4o-mini-tts',
      prompt: 'Hello there.',
      // gpt-4o-mini-tts accepts free-form speaking style in `voice`
      // when the value isn't a known OpenAI voice id. The dispatcher
      // routes that string into `body.instructions` ONLY when the
      // model branch fires.
      voice: 'warm and slow',
      output: 'aliased.mp3',
    });
    expect(fetchMock).toHaveBeenCalledTimes(1);
    expect(capturedBody).not.toBeNull();
    expect(capturedBody!.model).toBe('custom-tts-deployment');
    // Capability branch keyed on the catalog id continues to fire
    // even though the wire-level model is the alias: the
    // gpt-4o-mini-tts-specific instructions field is still attached.
    expect(capturedBody!.instructions).toBe('warm and slow');
    expect(result.providerNote).toContain('custom-tts-deployment');
  });
});