mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
* feat(daemon): user-configurable model alias / redirect for the media dispatcher (#1277)

Tilmirs's use case in #1277: their Doubao access has moved from `doubao-seedream-3-0-t2i-250415` to `doubao-seedream-5-0`, but the project's registered catalog still emits the old id. Every call fails because the old name no longer resolves at Volcengine. Until now the only workaround was patching the source on every update.

This adds a user-configurable alias layer that swaps the catalog id for whatever wire-name the provider expects, without changing the catalog itself.

Two storage layers (env wins over disk, matching the rest of media-config):

1. **Environment variable** `OD_MEDIA_MODEL_ALIASES` carries a JSON map: `'{"doubao-seedream-3-0-t2i-250415":"doubao-seedream-5-0"}'`. Single var, portable across shells (Windows cmd.exe rejects hyphens in env-var names, so the per-id pattern lefarcen suggested wouldn't have worked on Windows). Malformed JSON is tolerated: it falls through to the on-disk map rather than blowing up mid-generation.

2. **media-config.json** gains a top-level `aliases` field:

   ```json
   {
     "providers": { ... },
     "aliases": { "doubao-seedream-3-0-t2i-250415": "doubao-seedream-5-0" }
   }
   ```

   The Settings UI's existing PUT writes providers only, so the writeStored path now reads the existing aliases and preserves them on every write. Without that, a Settings save would silently wipe the user's aliases. The Settings UI surface for editing aliases is a separate follow-up; manual JSON edit and the env var are the v1 entry points.

The resolution happens inside `startMediaGeneration` after the catalog lookup and surface validation have already accepted the registered id, so users still get the "unknown model" error if they request a catalog id that doesn't exist. The swap only changes what the provider receives on the wire (volcengine, openai, grok, fal, nanobanana etc. each pass `ctx.model` straight into their request body).

Per-provider auto-output-name and the file-naming side use the function-level `model` parameter (the catalog id), so a `.png` named after `doubao-seedream-3-0-t2i-250415` keeps surfacing the registered id the agent / CLI asked for, not the wire-level alias. `providerNote` strings include the wire name so the user can see what was actually sent.

Public API additions:

- `resolveModelAlias(projectRoot, modelId)` -> the wire name (or the original if no alias matches).
- `readAliasMap(projectRoot)` -> { effective, env, stored } for the future Settings UI's source-attribution display.

Tests

- 8 new cases in tests/media-config.test.ts (suite goes 14 -> 22): pass-through, stored map, env map, env-over-stored precedence, malformed-env fall-through, coercion of bad entries (null / number / nested object / empty string / blank key), readAliasMap source attribution, and a writeConfig regression that pins alias preservation on a Settings-style provider PUT.

Validated

- pnpm guard clean
- pnpm --filter @open-design/daemon typecheck clean (both tsconfig.json and tsconfig.tests.json)
- Media test suite (media-config + media-tasks-routes + media-tasks-persistence + media-nanobanana): 33/33

Pre-existing daemon test failures on Windows (symlinks, CODEX_BIN runtime resolution, MCP config, skills, server-paths) are unrelated to this change and reproduce on a clean main checkout.

* fix(daemon): preserve catalog id for capability branches, surface aliases via /api/media/config (PR #1309 review)

Lefarcen + codex P2 on PR #1309: the alias swap overwrote `ctx.model` globally, which silently disabled every renderer branch that keys behaviour off the catalog id. A user aliasing `dall-e-3 -> azure-dalle3-deployment` would have the wire name swapped correctly but `body.response_format = 'b64_json'` and `body.quality = 'hd'` would no longer be set, because the `ctx.model.startsWith('dall-e-')` / `ctx.model === 'dall-e-3'` checks now saw the alias. The same regression hit the gpt-image-* size selection, the gpt-4o-mini-tts instructions branch, and the openaiSizeFor() sizing function.

MediaContext now carries both fields:

- `model`: the registered catalog id (`dall-e-3`, `gpt-4o-mini-tts`, `doubao-seedream-3-0-t2i-250415`). All model-family capability branches read from here.
- `wireModel`: the post-alias wire name. Every `body.model = `, every URL template, and every `providerNote` string reads from here so the user sees what was actually sent and the provider gets the alias.

Renderers updated: openai image (body.model + providerNote + openaiSizeFor keeps catalog), openai speech (body.model + providerNote + gpt-4o-mini-tts instructions keeps catalog), volcengine video (body + note), volcengine image (body + note + openaiSizeFor keeps catalog), grok image (body + note), grok video (body + note), nanobanana (`credentials.model || ctx.wireModel || default` chain), minimax TTS, fishaudio TTS.

The MINIMAX/FISHAUDIO hardcoded maps now sit BEHIND the user alias: explicit user alias wins over the project's legacy rebranding table, then the table wins over the catalog id fallback.

Stub-fallback diagnostics (the SVG placeholder + stub providerNote string) keep the catalog id since those are debug surfaces, not provider calls.

Lefarcen P3: the PR description claimed readAliasMap was the daemon-public API, but the /api/media/config route returned only readMaskedConfig (which had no aliases field). readMaskedConfig now returns `{ providers, aliases: { effective, env, stored } }` so the future Settings UI PR can consume the source-attributed map directly. The `aliases` field is always present (empty maps when nothing is configured) so the UI has a stable shape to read.

Tests

- New `media-alias-capability.test.ts` (2 jsdom cases) drives generateMedia end-to-end with a stubbed fetch and asserts on the request body. Pins the regression: aliased dall-e-3 still sends `response_format: 'b64_json'` + `quality: 'hd'`; aliased gpt-4o-mini-tts still attaches the instructions field from the voice prop.
- `media-config.test.ts` grows by 2 cases (suite goes 22 -> 24): readMaskedConfig surfaces the alias map (both env and stored sources), and the empty-state shape for fresh installs.

Validated

- pnpm guard clean
- pnpm --filter @open-design/daemon typecheck clean (both tsconfig.json and tsconfig.tests.json)
- Media test suite (config + alias-capability + nanobanana + tasks-persistence + tasks-routes): 37/37

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
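For orientation, the behaviour described in the two commit messages above can be condensed into a small standalone sketch. This is not the daemon's code: `parseEnvAliases`, `resolveWireModel`, and `buildImageBody` are hypothetical names, the env value and the stored map are passed in as plain arguments instead of being read from `process.env` and `media-config.json`, and the renderer fragment is reduced to the DALL-E branch only. It illustrates three points: bad entries are dropped, env wins over stored with pass-through as the fallback, and capability branches keep reading the catalog id while the request body carries the wire name.

```typescript
type AliasMap = Record<string, string>;

// Keep only entries whose key and value are non-empty strings (trimmed).
function coerceAliasMap(raw: unknown): AliasMap {
  const out: AliasMap = {};
  if (raw === null || typeof raw !== 'object') return out;
  for (const [key, value] of Object.entries(raw as Record<string, unknown>)) {
    if (typeof value !== 'string') continue;
    const k = key.trim();
    const v = value.trim();
    if (k && v) out[k] = v;
  }
  return out;
}

// Hypothetical helper: parse the env var's JSON payload.
// Malformed JSON yields an empty map so resolution falls through to disk.
function parseEnvAliases(rawEnv: string | undefined): AliasMap {
  if (!rawEnv || !rawEnv.trim()) return {};
  try {
    return coerceAliasMap(JSON.parse(rawEnv));
  } catch {
    return {};
  }
}

// Hypothetical helper: env wins over stored; pass-through when neither matches.
function resolveWireModel(
  modelId: string,
  rawEnv: string | undefined,
  stored: unknown,
): string {
  const env = parseEnvAliases(rawEnv);
  return env[modelId] ?? coerceAliasMap(stored)[modelId] ?? modelId;
}

// Hypothetical renderer fragment: the body carries the wire name, while the
// DALL-E capability branch keys off the catalog id.
function buildImageBody(model: string, wireModel: string): Record<string, unknown> {
  const body: Record<string, unknown> = { model: wireModel };
  if (model.startsWith('dall-e-')) {
    body.response_format = 'b64_json';
    body.quality = model === 'dall-e-3' ? 'hd' : 'standard';
  }
  return body;
}
```

Calling `buildImageBody('dall-e-3', resolveWireModel('dall-e-3', undefined, { 'dall-e-3': 'azure-dalle3-deployment' }))` in this sketch yields a body whose `model` is the alias while `response_format` and `quality` are still set, which is the property the second commit restores.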
parent
2976c76fc3
commit
d5566d7627
4 changed files with 539 additions and 29 deletions
|
|
@ -44,9 +44,20 @@ import { expandHomePrefix } from './home-expansion.js';
|
|||
const PROVIDER_IDS = MEDIA_PROVIDERS.map((p) => p.id);
|
||||
type ProviderEntry = { apiKey?: string; baseUrl?: string; model?: string };
|
||||
type ProviderMap = Record<string, ProviderEntry>;
|
||||
type ModelAliasMap = Record<string, string>;
|
||||
type JsonRecord = Record<string, unknown>;
|
||||
type OAuthCredential = { apiKey: string; source: string };
|
||||
|
||||
// Single env var carries the full alias map as JSON so we don't have
|
||||
// to dynamically lift `OD_MEDIA_MODEL_ALIAS_<id>=value` into a record
|
||||
// with all the env-var-name escaping that entails (Windows cmd.exe in
|
||||
// particular rejects hyphens). The shape mirrors the on-disk
|
||||
// `aliases` map so users can switch storage layers without rewriting
|
||||
// their workflow:
|
||||
//
|
||||
// OD_MEDIA_MODEL_ALIASES='{"doubao-seedream-3-0-t2i-250415":"doubao-seedream-5-0"}'
|
||||
const ENV_MODEL_ALIASES = 'OD_MEDIA_MODEL_ALIASES';
|
||||
|
||||
function isRecord(value: unknown): value is JsonRecord {
|
||||
return value !== null && typeof value === 'object';
|
||||
}
|
||||
|
|
@ -128,24 +139,106 @@ function configFile(projectRoot: string): string {
|
|||
return path.join(dir, 'media-config.json');
|
||||
}
|
||||
|
||||
async function readStored(projectRoot: string): Promise<ProviderMap> {
|
||||
/**
|
||||
* Normalise an arbitrary unknown into a string-to-string map, dropping
|
||||
* keys that have empty / non-string values. Shared by the env-var
|
||||
* parser and the on-disk reader so both layers reject malformed
|
||||
* entries the same way.
|
||||
*/
|
||||
function coerceAliasMap(raw: unknown): ModelAliasMap {
|
||||
if (!isRecord(raw)) return {};
|
||||
const out: ModelAliasMap = {};
|
||||
for (const [k, v] of Object.entries(raw)) {
|
||||
if (typeof k !== 'string' || !k.trim()) continue;
|
||||
if (typeof v !== 'string' || !v.trim()) continue;
|
||||
out[k.trim()] = v.trim();
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
async function readStoredFile(projectRoot: string): Promise<JsonRecord> {
|
||||
try {
|
||||
const raw = await readFile(configFile(projectRoot), 'utf8');
|
||||
const parsed = JSON.parse(raw);
|
||||
if (isRecord(parsed) && isRecord(parsed.providers)) {
|
||||
return parsed.providers as ProviderMap;
|
||||
}
|
||||
return {};
|
||||
return isRecord(parsed) ? parsed : {};
|
||||
} catch (err) {
|
||||
if (errorCode(err) === 'ENOENT') return {};
|
||||
throw err;
|
||||
}
|
||||
}
|
||||
|
||||
async function writeStored(projectRoot: string, providers: ProviderMap): Promise<void> {
|
||||
async function readStored(projectRoot: string): Promise<ProviderMap> {
|
||||
const parsed = await readStoredFile(projectRoot);
|
||||
return isRecord(parsed.providers) ? (parsed.providers as ProviderMap) : {};
|
||||
}
|
||||
|
||||
async function readStoredAliases(projectRoot: string): Promise<ModelAliasMap> {
|
||||
const parsed = await readStoredFile(projectRoot);
|
||||
return coerceAliasMap(parsed.aliases);
|
||||
}
|
||||
|
||||
async function writeStored(
|
||||
projectRoot: string,
|
||||
providers: ProviderMap,
|
||||
aliases?: ModelAliasMap,
|
||||
): Promise<void> {
|
||||
const file = configFile(projectRoot);
|
||||
await mkdir(path.dirname(file), { recursive: true });
|
||||
await writeFile(file, JSON.stringify({ providers }, null, 2), 'utf8');
|
||||
// Preserve any existing aliases when the caller doesn't pass them.
|
||||
// The Settings UI writes providers only; without this, every
|
||||
// provider edit would silently wipe the user's model aliases (issue
|
||||
// #1277 introduces aliases but the Settings UI surface for editing
|
||||
// them lands in a follow-up PR).
|
||||
const resolvedAliases = aliases ?? (await readStoredAliases(projectRoot));
|
||||
const body: JsonRecord = { providers };
|
||||
if (Object.keys(resolvedAliases).length > 0) {
|
||||
body.aliases = resolvedAliases;
|
||||
}
|
||||
await writeFile(file, JSON.stringify(body, null, 2), 'utf8');
|
||||
}
|
||||
|
||||
function readEnvAliases(): ModelAliasMap {
|
||||
const raw = process.env[ENV_MODEL_ALIASES];
|
||||
if (typeof raw !== 'string' || !raw.trim()) return {};
|
||||
try {
|
||||
return coerceAliasMap(JSON.parse(raw));
|
||||
} catch {
|
||||
// Malformed JSON is non-fatal — the user can fix the env var
|
||||
// without restarting the daemon mid-generation, and silent fall-
|
||||
// through to the on-disk map matches the precedent of the rest
|
||||
// of the env / stored config resolution in this module.
|
||||
return {};
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolve a registered model id to the wire-name the provider should
|
||||
* actually receive on the network. Env wins over stored, mirroring
|
||||
* the precedence the rest of media-config uses for `apiKey` (issue
|
||||
* #1277). Pass-through when no alias is configured.
|
||||
*/
|
||||
export async function resolveModelAlias(
|
||||
projectRoot: string,
|
||||
modelId: string,
|
||||
): Promise<string> {
|
||||
const envAliases = readEnvAliases();
|
||||
if (envAliases[modelId]) return envAliases[modelId]!;
|
||||
const stored = await readStoredAliases(projectRoot);
|
||||
return stored[modelId] ?? modelId;
|
||||
}
|
||||
|
||||
/**
|
||||
* Read the merged alias map (env + stored). Exposed for the
|
||||
* `/api/media/config` GET endpoint so the Settings UI can display
|
||||
* which aliases are active and where they came from.
|
||||
*/
|
||||
export async function readAliasMap(
|
||||
projectRoot: string,
|
||||
): Promise<{ effective: ModelAliasMap; env: ModelAliasMap; stored: ModelAliasMap }> {
|
||||
const env = readEnvAliases();
|
||||
const stored = await readStoredAliases(projectRoot);
|
||||
const effective: ModelAliasMap = { ...stored, ...env };
|
||||
return { effective, env, stored };
|
||||
}
|
||||
|
||||
function readEnvKey(providerId: string): string | null {
|
||||
|
|
@ -260,9 +353,20 @@ export async function resolveProviderConfig(projectRoot: string, providerId: str
|
|||
* frontend can show "••••" + a "configured" indicator without leaking
|
||||
* the secret back into the DOM.
|
||||
*/
|
||||
export async function readMaskedConfig(projectRoot: string): Promise<{ providers: Record<string, { configured: boolean; source: string; apiKeyTail: string; baseUrl: string; model?: string }> }> {
|
||||
export interface MaskedConfigResponse {
|
||||
providers: Record<string, { configured: boolean; source: string; apiKeyTail: string; baseUrl: string; model?: string }>;
|
||||
/**
|
||||
* Effective alias map plus source attribution. The Settings UI can
|
||||
* show "from env" vs "from media-config.json" badges next to each
|
||||
* entry without needing a second endpoint. Empty maps mean no
|
||||
* aliases are configured (issue #1277).
|
||||
*/
|
||||
aliases: { effective: ModelAliasMap; env: ModelAliasMap; stored: ModelAliasMap };
|
||||
}
|
||||
|
||||
export async function readMaskedConfig(projectRoot: string): Promise<MaskedConfigResponse> {
|
||||
const stored = await readStored(projectRoot);
|
||||
const providers: Record<string, { configured: boolean; source: string; apiKeyTail: string; baseUrl: string; model?: string }> = {};
|
||||
const providers: MaskedConfigResponse['providers'] = {};
|
||||
for (const id of PROVIDER_IDS) {
|
||||
const entry = stored[id] || {};
|
||||
const envKey = readEnvKey(id);
|
||||
|
|
@ -284,7 +388,8 @@ export async function readMaskedConfig(projectRoot: string): Promise<{ providers
|
|||
: {}),
|
||||
};
|
||||
}
|
||||
return { providers };
|
||||
const aliases = await readAliasMap(projectRoot);
|
||||
return { providers, aliases };
|
||||
}
|
||||
|
||||
/**
|
||||
|
|
|
|||
|
|
@ -53,7 +53,7 @@ import {
|
|||
findProvider,
|
||||
modelsForSurface,
|
||||
} from './media-models.js';
|
||||
import { resolveProviderConfig } from './media-config.js';
|
||||
import { resolveModelAlias, resolveProviderConfig } from './media-config.js';
|
||||
import {
|
||||
ensureProject,
|
||||
kindFor,
|
||||
|
|
@ -67,7 +67,29 @@ type ProgressFn = (message: string) => void;
|
|||
type ImageRef = { path: string; abs: string; mime: string; size: number; dataUrl: string };
|
||||
type MediaContext = {
|
||||
surface: MediaSurface;
|
||||
/**
|
||||
* Registered catalog id (e.g. `dall-e-3`, `gpt-4o-mini-tts`,
|
||||
* `doubao-seedream-3-0-t2i-250415`). Every model-family branch in
|
||||
* the renderers below keys off this field so DALL·E sizing,
|
||||
* gpt-image quality, gpt-4o-mini-tts instructions, and the
|
||||
* MINIMAX/FISHAUDIO TTS lookup tables continue to fire even when
|
||||
* the user has aliased the catalog id to a custom wire-name via
|
||||
* issue #1277's alias layer. lefarcen + codex P2 review on PR
|
||||
* #1309 caught the regression where a single `ctx.model` doubled
|
||||
* for both purposes and accidentally disabled the capability
|
||||
* branches under aliasing.
|
||||
*/
|
||||
model: string;
|
||||
/**
|
||||
* What the provider's request body should carry as `model` (or
|
||||
* what gets templated into the URL for Azure-style deployment
|
||||
* routing). Equal to `model` when no alias is configured; equal
|
||||
* to the user-supplied alias from `OD_MEDIA_MODEL_ALIASES` /
|
||||
* `media-config.json` otherwise. Renderers must use this field
|
||||
* for `body.model = ...` and for `providerNote` so users see
|
||||
* what was actually sent.
|
||||
*/
|
||||
wireModel: string;
|
||||
modelDef: MediaModel;
|
||||
provider: MediaProvider | null;
|
||||
prompt: string;
|
||||
|
|
@ -352,9 +374,19 @@ export async function generateMedia(args: {
|
|||
// and decide how to splice the data URL into their request.
|
||||
const imageRef = await resolveProjectImage(image, dir);
|
||||
|
||||
// Resolve any user-configured model alias BEFORE we hand the id to a
|
||||
// dispatcher (issue #1277). Catalog lookup + surface validation above
|
||||
// ran against the original id so we still enforce the registered
|
||||
// catalog; the alias only changes what the provider receives on the
|
||||
// wire. lefarcen + codex P2 on PR #1309: keep BOTH values on ctx so
|
||||
// capability branches (DALL-E sizing, gpt-image quality, gpt-4o-mini-tts
|
||||
// instructions, MINIMAX/FISHAUDIO TTS map) continue to key off the
|
||||
// catalog id while the provider's request body carries the alias.
|
||||
const wireModel = await resolveModelAlias(projectRoot, model);
|
||||
const ctx = {
|
||||
surface,
|
||||
model,
|
||||
wireModel,
|
||||
modelDef: def,
|
||||
provider: findProvider(def.provider),
|
||||
prompt: prompt || '',
|
||||
|
|
@ -615,12 +647,15 @@ async function renderOpenAIImage(ctx: MediaContext, credentials: ProviderConfig)
|
|||
};
|
||||
// For non-Azure calls, include `model` in the body. Azure infers it
|
||||
// from the deployment in the path so omitting it keeps payloads
|
||||
// compatible across both flavors.
|
||||
// compatible across both flavors. The wire-name (post-alias) goes
|
||||
// on the body so the user's alias from issue #1277 reaches the API.
|
||||
if (!azure) {
|
||||
body.model = ctx.model;
|
||||
body.model = ctx.wireModel;
|
||||
}
|
||||
// gpt-image-* returns b64_json by default and rejects response_format,
|
||||
// so we only pass it for dall-e-* (where it's required).
|
||||
// Capability branches key off the CATALOG id (not the alias) so a
|
||||
// user who aliased `dall-e-3` to a custom Azure / proxy deployment
|
||||
// still gets the DALL-E-specific quality + response_format flags
|
||||
// (lefarcen + codex P2 on PR #1309).
|
||||
if (ctx.model.startsWith('dall-e-')) {
|
||||
body.response_format = 'b64_json';
|
||||
body.quality = ctx.model === 'dall-e-3' ? 'hd' : 'standard';
|
||||
|
|
@ -675,7 +710,7 @@ async function renderOpenAIImage(ctx: MediaContext, credentials: ProviderConfig)
|
|||
const tag = azure ? 'azure-openai' : 'openai';
|
||||
return {
|
||||
bytes,
|
||||
providerNote: `${tag}/${ctx.model} · ${ctx.aspect} · ${bytes.length} bytes`,
|
||||
providerNote: `${tag}/${ctx.wireModel} · ${ctx.aspect} · ${bytes.length} bytes`,
|
||||
suggestedExt: '.png',
|
||||
};
|
||||
}
|
||||
|
|
@ -810,7 +845,7 @@ async function renderOpenAISpeech(ctx: MediaContext, credentials: ProviderConfig
|
|||
response_format: format,
|
||||
};
|
||||
if (!azure) {
|
||||
body.model = ctx.model;
|
||||
body.model = ctx.wireModel;
|
||||
}
|
||||
if (instructions && ctx.model === 'gpt-4o-mini-tts') {
|
||||
body.instructions = instructions;
|
||||
|
|
@ -840,7 +875,7 @@ async function renderOpenAISpeech(ctx: MediaContext, credentials: ProviderConfig
|
|||
throw new Error('openai speech returned zero bytes');
|
||||
}
|
||||
const tag = azure ? 'azure-openai' : 'openai';
|
||||
const noteBits = [`${tag}/${ctx.model}`, voiceId, `${format}`, `${bytes.length} bytes`];
|
||||
const noteBits = [`${tag}/${ctx.wireModel}`, voiceId, `${format}`, `${bytes.length} bytes`];
|
||||
if (instructions) noteBits.splice(2, 0, 'styled');
|
||||
return {
|
||||
bytes,
|
||||
|
|
@ -898,7 +933,7 @@ async function renderVolcengineVideo(ctx: MediaContext, credentials: ProviderCon
|
|||
}
|
||||
|
||||
const taskBody = {
|
||||
model: ctx.model,
|
||||
model: ctx.wireModel,
|
||||
content,
|
||||
};
|
||||
|
||||
|
|
@ -988,7 +1023,7 @@ async function renderVolcengineVideo(ctx: MediaContext, credentials: ProviderCon
|
|||
|
||||
return {
|
||||
bytes,
|
||||
providerNote: `volcengine/${ctx.model} · ${ratio} · ${durationSec}s · ${bytes.length} bytes`,
|
||||
providerNote: `volcengine/${ctx.wireModel} · ${ratio} · ${durationSec}s · ${bytes.length} bytes`,
|
||||
suggestedExt: '.mp4',
|
||||
};
|
||||
}
|
||||
|
|
@ -1012,9 +1047,12 @@ async function renderVolcengineImage(ctx: MediaContext, credentials: ProviderCon
|
|||
const baseUrl = (credentials.baseUrl || 'https://ark.cn-beijing.volces.com/api/v3').replace(/\/$/, '');
|
||||
|
||||
const body = {
|
||||
model: ctx.model,
|
||||
model: ctx.wireModel,
|
||||
prompt: ctx.prompt || 'A high-quality reference image.',
|
||||
response_format: 'b64_json',
|
||||
// openaiSizeFor branches on the catalog id (gpt-image-* vs dall-e-*
|
||||
// accept different size enums), so it must NOT see the post-alias
|
||||
// wire name. lefarcen + codex P2 on PR #1309.
|
||||
size: openaiSizeFor(ctx.model, ctx.aspect),
|
||||
};
|
||||
const resp = await fetch(`${baseUrl}/images/generations`, {
|
||||
|
|
@ -1049,7 +1087,7 @@ async function renderVolcengineImage(ctx: MediaContext, credentials: ProviderCon
|
|||
}
|
||||
return {
|
||||
bytes,
|
||||
providerNote: `volcengine/${ctx.model} · ${ctx.aspect} · ${bytes.length} bytes`,
|
||||
providerNote: `volcengine/${ctx.wireModel} · ${ctx.aspect} · ${bytes.length} bytes`,
|
||||
suggestedExt: '.png',
|
||||
};
|
||||
}
|
||||
|
|
@ -1082,7 +1120,7 @@ async function renderGrokImage(ctx: MediaContext, credentials: ProviderConfig):
|
|||
|
||||
const aspectRatio = grokAspectFor(ctx.aspect);
|
||||
const body = {
|
||||
model: ctx.model,
|
||||
model: ctx.wireModel,
|
||||
prompt: ctx.prompt || 'A high-quality reference image.',
|
||||
n: 1,
|
||||
aspect_ratio: aspectRatio,
|
||||
|
|
@ -1125,7 +1163,7 @@ async function renderGrokImage(ctx: MediaContext, credentials: ProviderConfig):
|
|||
// trusts the extension.
|
||||
return {
|
||||
bytes,
|
||||
providerNote: `grok/${ctx.model} · ${aspectRatio} · ${bytes.length} bytes`,
|
||||
providerNote: `grok/${ctx.wireModel} · ${aspectRatio} · ${bytes.length} bytes`,
|
||||
suggestedExt: sniffImageExt(bytes),
|
||||
};
|
||||
}
|
||||
|
|
@ -1138,7 +1176,7 @@ async function renderNanoBananaImage(ctx: MediaContext, credentials: ProviderCon
|
|||
}
|
||||
|
||||
const baseUrl = (credentials.baseUrl || NANOBANANA_DEFAULT_BASE_URL).replace(/\/$/, '');
|
||||
const wireModel = (credentials.model || ctx.model || NANOBANANA_DEFAULT_MODEL).trim();
|
||||
const wireModel = (credentials.model || ctx.wireModel || NANOBANANA_DEFAULT_MODEL).trim();
|
||||
const body = {
|
||||
contents: [{
|
||||
parts: [{
|
||||
|
|
@ -1261,7 +1299,7 @@ async function renderGrokVideo(ctx: MediaContext, credentials: ProviderConfig, o
|
|||
const aspectRatio = grokAspectFor(ctx.aspect);
|
||||
|
||||
const body: Record<string, unknown> = {
|
||||
model: ctx.model,
|
||||
model: ctx.wireModel,
|
||||
prompt: ctx.prompt || 'A short cinematic clip.',
|
||||
duration: durationSec,
|
||||
aspect_ratio: aspectRatio,
|
||||
|
|
@ -1375,7 +1413,7 @@ async function renderGrokVideo(ctx: MediaContext, credentials: ProviderConfig, o
|
|||
|
||||
return {
|
||||
bytes,
|
||||
providerNote: `grok/${ctx.model} · ${aspectRatio} · ${durationSec}s · ${bytes.length} bytes`,
|
||||
providerNote: `grok/${ctx.wireModel} · ${aspectRatio} · ${durationSec}s · ${bytes.length} bytes`,
|
||||
suggestedExt: '.mp4',
|
||||
};
|
||||
}
|
||||
|
|
@ -1583,7 +1621,13 @@ async function renderMinimaxTTS(ctx: MediaContext, credentials: ProviderConfig):
|
|||
/\/$/,
|
||||
'',
|
||||
);
|
||||
const wireModel = MINIMAX_TTS_MODEL_MAP[ctx.model] || ctx.model;
|
||||
// Precedence: user alias from #1277 (when set) -> project's known
|
||||
// MINIMAX legacy rename map -> catalog id. The user knows their
|
||||
// deployment name better than our hardcoded table, so an explicit
|
||||
// alias trumps the legacy mapping.
|
||||
const wireModel = ctx.wireModel !== ctx.model
|
||||
? ctx.wireModel
|
||||
: (MINIMAX_TTS_MODEL_MAP[ctx.model] || ctx.model);
|
||||
const text = (ctx.prompt && ctx.prompt.trim()) || 'This is a test.';
|
||||
// Voice id picks: the agent can pass --voice to choose, otherwise we
|
||||
// default to a neutral Mandarin male voice that handles both Chinese
|
||||
|
|
@ -1686,7 +1730,11 @@ async function renderFishAudioTTS(ctx: MediaContext, credentials: ProviderConfig
|
|||
/\/$/,
|
||||
'',
|
||||
);
|
||||
const wireModel = FISHAUDIO_TTS_MODEL_MAP[ctx.model] || ctx.model;
|
||||
// Same precedence as the MINIMAX TTS path: user alias wins, then
|
||||
// the project's hardcoded fishaudio map, then catalog id.
|
||||
const wireModel = ctx.wireModel !== ctx.model
|
||||
? ctx.wireModel
|
||||
: (FISHAUDIO_TTS_MODEL_MAP[ctx.model] || ctx.model);
|
||||
const text = (ctx.prompt && ctx.prompt.trim()) || 'This is a test.';
|
||||
|
||||
// FishAudio's `reference_id` slot pins which voice the synth uses.
|
||||
|
|
|
|||
161
apps/daemon/tests/media-alias-capability.test.ts
Normal file
|
|
@ -0,0 +1,161 @@
|
|||
/**
|
||||
* Regression coverage for the lefarcen + codex P2 on PR #1309: when a
|
||||
* user aliases a registered catalog id to a custom wire-name via
|
||||
* `OD_MEDIA_MODEL_ALIASES` or media-config.json's `aliases` map, the
|
||||
* dispatcher must still apply the model-FAMILY behaviour the catalog
|
||||
* id implies (DALL-E response_format, dall-e-3 hd quality,
|
||||
* gpt-4o-mini-tts instructions, etc.) and only swap the value that
|
||||
* goes into the provider's `body.model` field.
|
||||
*
|
||||
* The test stubs fetch and asserts on the request body for an
|
||||
* aliased dall-e-3 -> azure-custom-deployment call. Before the fix
|
||||
* ctx.model was overwritten with the alias, so the
|
||||
* `startsWith('dall-e-')` and `=== 'dall-e-3'` branches stopped
|
||||
* firing and the body was missing both response_format and the hd
|
||||
* quality flag — exactly the regression codex described.
|
||||
*/
|
||||
|
||||
import { mkdir, mkdtemp, rm, writeFile } from 'node:fs/promises';
|
||||
import { tmpdir } from 'node:os';
|
||||
import path from 'node:path';
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
import { generateMedia } from '../src/media.js';
|
||||
|
||||
const PNG_BASE64 = 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMCAO+X2uoAAAAASUVORK5CYII=';
|
||||
|
||||
describe('media alias preserves catalog-keyed capability branching (#1309 review)', () => {
|
||||
let root: string;
|
||||
let projectRoot: string;
|
||||
let projectsRoot: string;
|
||||
const realFetch = globalThis.fetch;
|
||||
const originalEnvAliases = process.env.OD_MEDIA_MODEL_ALIASES;
|
||||
const originalOpenAIKey = process.env.OPENAI_API_KEY;
|
||||
const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
|
||||
const originalDataDir = process.env.OD_DATA_DIR;
|
||||
|
||||
beforeEach(async () => {
|
||||
root = await mkdtemp(path.join(tmpdir(), 'od-media-alias-cap-'));
|
||||
projectRoot = path.join(root, 'project-root');
|
||||
projectsRoot = path.join(projectRoot, '.od', 'projects');
|
||||
await mkdir(projectsRoot, { recursive: true });
|
||||
delete process.env.OD_MEDIA_MODEL_ALIASES;
|
||||
delete process.env.OD_MEDIA_CONFIG_DIR;
|
||||
delete process.env.OD_DATA_DIR;
|
||||
process.env.OPENAI_API_KEY = 'sk-test-key';
|
||||
});
|
||||
|
||||
afterEach(async () => {
|
||||
globalThis.fetch = realFetch;
|
||||
vi.unstubAllGlobals();
|
||||
if (originalEnvAliases == null) {
|
||||
delete process.env.OD_MEDIA_MODEL_ALIASES;
|
||||
} else {
|
||||
process.env.OD_MEDIA_MODEL_ALIASES = originalEnvAliases;
|
||||
}
|
||||
if (originalOpenAIKey == null) {
|
||||
delete process.env.OPENAI_API_KEY;
|
||||
} else {
|
||||
process.env.OPENAI_API_KEY = originalOpenAIKey;
|
||||
}
|
||||
if (originalMediaConfigDir == null) {
|
||||
delete process.env.OD_MEDIA_CONFIG_DIR;
|
||||
} else {
|
||||
process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
|
||||
}
|
||||
if (originalDataDir == null) {
|
||||
delete process.env.OD_DATA_DIR;
|
||||
} else {
|
||||
process.env.OD_DATA_DIR = originalDataDir;
|
||||
}
|
||||
await rm(root, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
async function writeStoredConfig(data: unknown) {
|
||||
const file = path.join(projectRoot, '.od', 'media-config.json');
|
||||
await mkdir(path.dirname(file), { recursive: true });
|
||||
await writeFile(file, JSON.stringify(data), 'utf8');
|
||||
}
|
||||
|
||||
it('alias dall-e-3 -> custom-deployment still sends dall-e-3 response_format + hd quality', async () => {
|
||||
await writeStoredConfig({
|
||||
providers: {},
|
||||
aliases: { 'dall-e-3': 'azure-dalle3-deployment' },
|
||||
});
|
||||
|
||||
let capturedBody: Record<string, unknown> | null = null;
|
||||
const fetchMock = vi.fn(async (_input: unknown, init?: RequestInit) => {
|
||||
capturedBody = JSON.parse(String(init?.body)) as Record<string, unknown>;
|
||||
return new Response(
|
||||
JSON.stringify({ data: [{ b64_json: PNG_BASE64 }] }),
|
||||
{ status: 200, headers: { 'content-type': 'application/json' } },
|
||||
);
|
||||
});
|
||||
vi.stubGlobal('fetch', fetchMock);
|
||||
|
||||
const result = await generateMedia({
|
||||
projectRoot,
|
||||
projectsRoot,
|
||||
projectId: 'project-1',
|
||||
surface: 'image',
|
||||
model: 'dall-e-3',
|
||||
prompt: 'A watercolor shiba inu under cherry blossoms',
|
||||
aspect: '1:1',
|
||||
output: 'aliased.png',
|
||||
});
|
||||
|
||||
expect(fetchMock).toHaveBeenCalledTimes(1);
|
||||
expect(capturedBody).not.toBeNull();
|
||||
// Wire name swap landed — the provider receives the alias.
|
||||
expect(capturedBody!.model).toBe('azure-dalle3-deployment');
|
||||
// Capability branches keyed on the catalog id continue to fire.
|
||||
expect(capturedBody!.response_format).toBe('b64_json');
|
||||
expect(capturedBody!.quality).toBe('hd');
|
||||
// providerNote reflects what was actually sent, so a user
|
||||
// inspecting the result sees the wire name.
|
||||
expect(result.providerNote).toContain('azure-dalle3-deployment');
|
||||
expect(result.providerNote).not.toContain('dall-e-3');
|
||||
});
|
||||
|
||||
  it('alias gpt-4o-mini-tts -> custom-tts-deployment still attaches style instructions', async () => {
    process.env.OD_MEDIA_MODEL_ALIASES = JSON.stringify({
      'gpt-4o-mini-tts': 'custom-tts-deployment',
    });

    let capturedBody: Record<string, unknown> | null = null;
    const fetchMock = vi.fn(async (_input: unknown, init?: RequestInit) => {
      capturedBody = JSON.parse(String(init?.body)) as Record<string, unknown>;
      // Speech endpoints return raw audio bytes, not JSON.
      return new Response(Buffer.from([1, 2, 3, 4]), {
        status: 200,
        headers: { 'content-type': 'audio/mpeg' },
      });
    });
    vi.stubGlobal('fetch', fetchMock);

    const result = await generateMedia({
      projectRoot,
      projectsRoot,
      projectId: 'project-1',
      surface: 'audio',
      audioKind: 'speech',
      model: 'gpt-4o-mini-tts',
      prompt: 'Hello there.',
      // gpt-4o-mini-tts accepts free-form speaking style in `voice`
      // when the value isn't a known OpenAI voice id. The dispatcher
      // routes that string into `body.instructions` ONLY when the
      // model branch fires.
      voice: 'warm and slow',
      output: 'aliased.mp3',
    });

    expect(fetchMock).toHaveBeenCalledTimes(1);
    expect(capturedBody).not.toBeNull();
    expect(capturedBody!.model).toBe('custom-tts-deployment');
    // Capability branch keyed on the catalog id continues to fire
    // even though the wire-level model is the alias; the
    // gpt-4o-mini-tts-specific instructions field is still attached.
    expect(capturedBody!.instructions).toBe('warm and slow');
    expect(result.providerNote).toContain('custom-tts-deployment');
  });
});
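
The behaviour pinned by the last test can be sketched roughly as follows. This is a minimal illustration, not the dispatcher's actual code: `buildSpeechBody`, `KNOWN_VOICES`, the `SpeechBody` shape, and the `'alloy'` fallback voice are all hypothetical names and values chosen for the example. The point it shows is that the capability check reads the catalog id while the request body carries the aliased wire name.

```typescript
// Hypothetical, illustrative subset of built-in voice ids; the real list may differ.
const KNOWN_VOICES = new Set(['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer']);

// Assumed request shape for a speech call; field names other than
// `model` and `instructions` are not confirmed by the tests above.
interface SpeechBody {
  model: string;
  input: string;
  voice: string;
  instructions?: string;
}

function buildSpeechBody(
  catalogId: string, // id the user requested and the catalog validated
  wireModel: string, // id after alias resolution, sent to the provider
  prompt: string,
  voice: string,
): SpeechBody {
  const body: SpeechBody = { model: wireModel, input: prompt, voice };
  // The branch is keyed on the catalog id, so it still fires when the
  // wire-level model has been swapped for a custom deployment name.
  if (catalogId === 'gpt-4o-mini-tts' && !KNOWN_VOICES.has(voice)) {
    body.voice = 'alloy'; // hypothetical default voice
    body.instructions = voice; // free-form speaking style
  }
  return body;
}

const body = buildSpeechBody(
  'gpt-4o-mini-tts',
  'custom-tts-deployment',
  'Hello there.',
  'warm and slow',
);
```

Keying the branch on `wireModel` instead would silently drop the style instructions for every aliased deployment, which is the regression the test guards against.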

@@ -4,7 +4,9 @@ import path from 'node:path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

import {
  readAliasMap,
  readMaskedConfig,
  resolveModelAlias,
  resolveProviderConfig,
  writeConfig,
} from '../src/media-config.js';

@@ -469,3 +471,197 @@ describe('media-config OpenAI OAuth fallback', () => {
    });
  });
});

describe('media-config model alias resolution (issue #1277)', () => {
  let projectRoot: string;
  const originalEnvAliases = process.env.OD_MEDIA_MODEL_ALIASES;
  const originalMediaConfigDir = process.env.OD_MEDIA_CONFIG_DIR;
  const originalDataDir = process.env.OD_DATA_DIR;

  beforeEach(async () => {
    projectRoot = await mkdtemp(path.join(tmpdir(), 'od-media-alias-'));
    delete process.env.OD_MEDIA_MODEL_ALIASES;
    delete process.env.OD_MEDIA_CONFIG_DIR;
    delete process.env.OD_DATA_DIR;
  });

  afterEach(async () => {
    if (originalEnvAliases == null) {
      delete process.env.OD_MEDIA_MODEL_ALIASES;
    } else {
      process.env.OD_MEDIA_MODEL_ALIASES = originalEnvAliases;
    }
    if (originalMediaConfigDir == null) {
      delete process.env.OD_MEDIA_CONFIG_DIR;
    } else {
      process.env.OD_MEDIA_CONFIG_DIR = originalMediaConfigDir;
    }
    if (originalDataDir == null) {
      delete process.env.OD_DATA_DIR;
    } else {
      process.env.OD_DATA_DIR = originalDataDir;
    }
    await rm(projectRoot, { recursive: true, force: true });
  });

  async function writeStoredMediaConfig(data: unknown) {
    const file = path.join(projectRoot, '.od', 'media-config.json');
    await mkdir(path.dirname(file), { recursive: true });
    await writeFile(file, JSON.stringify(data), 'utf8');
  }

  it('passes through unmapped model ids unchanged', async () => {
    expect(await resolveModelAlias(projectRoot, 'doubao-seedream-3-0-t2i-250415')).toBe(
      'doubao-seedream-3-0-t2i-250415',
    );
  });

  it('redirects via the stored aliases map in media-config.json', async () => {
    // The flagship use case from the issue: registered catalog id
    // -> the new model name the user actually has access to.
    await writeStoredMediaConfig({
      providers: {},
      aliases: { 'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0' },
    });
    expect(
      await resolveModelAlias(projectRoot, 'doubao-seedream-3-0-t2i-250415'),
    ).toBe('doubao-seedream-5-0');
  });

  it('redirects via the OD_MEDIA_MODEL_ALIASES env var', async () => {
    process.env.OD_MEDIA_MODEL_ALIASES = JSON.stringify({
      'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0',
    });
    expect(
      await resolveModelAlias(projectRoot, 'doubao-seedream-3-0-t2i-250415'),
    ).toBe('doubao-seedream-5-0');
  });

  it('lets the env var override an on-disk alias (env wins for power users)', async () => {
    await writeStoredMediaConfig({
      providers: {},
      aliases: { 'doubao-seedream-3-0-t2i-250415': 'on-disk-alias' },
    });
    process.env.OD_MEDIA_MODEL_ALIASES = JSON.stringify({
      'doubao-seedream-3-0-t2i-250415': 'env-alias',
    });
    expect(
      await resolveModelAlias(projectRoot, 'doubao-seedream-3-0-t2i-250415'),
    ).toBe('env-alias');
  });

  it('tolerates malformed env JSON and falls through to the stored map', async () => {
    // A user with a half-typed env var (`OD_MEDIA_MODEL_ALIASES='{'`)
    // should still get their on-disk aliases, not a hard error mid-
    // generation.
    process.env.OD_MEDIA_MODEL_ALIASES = '{not valid json';
    await writeStoredMediaConfig({
      providers: {},
      aliases: { 'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0' },
    });
    expect(
      await resolveModelAlias(projectRoot, 'doubao-seedream-3-0-t2i-250415'),
    ).toBe('doubao-seedream-5-0');
  });

  it('drops non-string and empty alias entries during coercion', async () => {
    // Defends against a future schema bump (number / null / nested
    // object) and against accidental empty-string entries from a
    // Settings UI form. The coercion must never feed garbage into a
    // dispatcher's request body.
    process.env.OD_MEDIA_MODEL_ALIASES = JSON.stringify({
      'good-key': 'good-value',
      'empty-key': '',
      'null-key': null,
      'object-key': { nested: 'no' },
      '': 'blank-key-rejected',
    });
    expect(await resolveModelAlias(projectRoot, 'good-key')).toBe('good-value');
    expect(await resolveModelAlias(projectRoot, 'empty-key')).toBe('empty-key');
    expect(await resolveModelAlias(projectRoot, 'null-key')).toBe('null-key');
    expect(await resolveModelAlias(projectRoot, 'object-key')).toBe('object-key');
  });

  it('exposes the merged map via readAliasMap so Settings can show source attribution', async () => {
    await writeStoredMediaConfig({
      providers: {},
      aliases: { 'stored-only': 'a', 'overridden': 'stored-value' },
    });
    process.env.OD_MEDIA_MODEL_ALIASES = JSON.stringify({
      'env-only': 'b',
      'overridden': 'env-value',
    });
    const map = await readAliasMap(projectRoot);
    expect(map.stored).toEqual({ 'stored-only': 'a', 'overridden': 'stored-value' });
    expect(map.env).toEqual({ 'env-only': 'b', 'overridden': 'env-value' });
    expect(map.effective).toEqual({
      'stored-only': 'a',
      'env-only': 'b',
      'overridden': 'env-value',
    });
  });

  it('readMaskedConfig surfaces the alias map for the Settings UI', async () => {
    // Lefarcen P3 (#1309 review): the prior PR description claimed
    // `readAliasMap` was the daemon-public API for the Settings UI,
    // but the HTTP route returned only `readMaskedConfig` (which
    // had no aliases field). The fix wires aliases into the GET
    // response so a future Settings UI PR can consume them without
    // touching the daemon.
    await writeStoredMediaConfig({
      providers: {},
      aliases: { 'dall-e-3': 'azure-dalle3' },
    });
    process.env.OD_MEDIA_MODEL_ALIASES = JSON.stringify({
      'gpt-4o-mini-tts': 'custom-tts',
    });

    const masked = await readMaskedConfig(projectRoot);

    expect(masked.aliases.stored).toEqual({ 'dall-e-3': 'azure-dalle3' });
    expect(masked.aliases.env).toEqual({ 'gpt-4o-mini-tts': 'custom-tts' });
    expect(masked.aliases.effective).toEqual({
      'dall-e-3': 'azure-dalle3',
      'gpt-4o-mini-tts': 'custom-tts',
    });
  });

  it('readMaskedConfig returns empty alias maps when no aliases are configured', async () => {
    // Settings UI needs a stable shape so it can render "no aliases
    // configured" without crashing on `aliases.effective` being
    // undefined.
    const masked = await readMaskedConfig(projectRoot);
    expect(masked.aliases.effective).toEqual({});
    expect(masked.aliases.env).toEqual({});
    expect(masked.aliases.stored).toEqual({});
  });

  it('writeConfig preserves aliases when a Settings-style provider PUT lands', async () => {
    // The Settings UI in its current shape writes providers only.
    // Without alias preservation, every provider edit would wipe the
    // user's aliases. This pins the regression so a future refactor
    // that touches writeStored has to keep both fields.
    await writeStoredMediaConfig({
      providers: {},
      aliases: { 'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0' },
    });
    await writeConfig(projectRoot, {
      providers: {
        openai: { apiKey: 'sk-key', baseUrl: '' },
      },
    });
    const onDisk = JSON.parse(
      await readFile(
        path.join(projectRoot, '.od', 'media-config.json'),
        'utf8',
      ),
    );
    expect(onDisk.providers.openai).toMatchObject({ apiKey: 'sk-key' });
    expect(onDisk.aliases).toEqual({
      'doubao-seedream-3-0-t2i-250415': 'doubao-seedream-5-0',
    });
    expect(
      await resolveModelAlias(projectRoot, 'doubao-seedream-3-0-t2i-250415'),
    ).toBe('doubao-seedream-5-0');
  });
});
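
Taken together, the tests in this block describe a small merge-and-lookup contract: coerce each source to string-to-string entries, let the env map override the stored map, and pass unmapped ids through. A minimal sketch of that contract, assuming nothing about the real `media-config` internals, might look like this. `coerceAliasMap`, `parseEnvAliases`, `mergeAliasMaps`, and `resolveAlias` are hypothetical helper names, and file I/O is left out so the example stays self-contained.

```typescript
type AliasMap = Record<string, string>;

// Keep only entries whose key and value are both non-empty strings.
function coerceAliasMap(raw: unknown): AliasMap {
  const out: AliasMap = {};
  if (raw === null || typeof raw !== 'object' || Array.isArray(raw)) return out;
  for (const [key, value] of Object.entries(raw as Record<string, unknown>)) {
    if (key !== '' && typeof value === 'string' && value !== '') out[key] = value;
  }
  return out;
}

// Malformed JSON yields an empty map instead of throwing, so the
// stored map still applies.
function parseEnvAliases(raw: string | undefined): AliasMap {
  if (!raw) return {};
  try {
    return coerceAliasMap(JSON.parse(raw));
  } catch {
    return {};
  }
}

// Returns each source plus the merged view; env entries win on conflict.
function mergeAliasMaps(storedRaw: unknown, envRaw: string | undefined) {
  const stored = coerceAliasMap(storedRaw);
  const env = parseEnvAliases(envRaw);
  return { stored, env, effective: { ...stored, ...env } };
}

// Unmapped ids pass through unchanged. The own-property check avoids
// matching inherited keys such as "toString".
function resolveAlias(effective: AliasMap, modelId: string): string {
  const hit = Object.prototype.hasOwnProperty.call(effective, modelId)
    ? effective[modelId]
    : undefined;
  return hit ?? modelId;
}

const merged = mergeAliasMaps(
  { 'stored-only': 'a', overridden: 'stored-value' },
  JSON.stringify({ 'env-only': 'b', overridden: 'env-value', 'null-key': null }),
);
```

Returning `stored`, `env`, and `effective` side by side mirrors the shape the tests expect from `readAliasMap`, which is what lets a Settings surface attribute each alias to its source.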