open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Author	SHA1	Message	Date
Dan Porat	3395d2c855	feat(daemon): implement fal.ai renderer for image + video generation (#1606 ) * feat(daemon): implement fal.ai renderer for image + video generation Adds renderFalImage and renderFalVideo backed by the fal queue API (queue.fal.run). Any fal-ai/* model path can be used directly without a catalog entry, enabling the full fal model library without code changes. Catalogued shortcuts are mapped via FAL_ENDPOINTS to their fal-ai/* paths; OD_FAL_MAX_POLL_MS controls the poll ceiling. Expands the fal model catalog with flux-pro-ultra, flux-dev-fal, flux-schnell-fal, ideogram-v3-fal, recraft-v3-fal (images) and veo-3-fal, veo-2-fal, wan-2.1-t2v, wan-2.1-i2v, seedance-1-pro-fal, kling-2.1-t2v-fal (video). Marks fal provider as integrated: true in both daemon and web model registries. * fix(daemon): address fal renderer review comments - Correct Wan 2.1 endpoints: wan-video/v2.1/* → fal-ai/wan-t2v / fal-ai/wan-i2v - Correct Kling 2.1 t2v endpoint: .../pro/... → .../master/text-to-video - Add FAL_IMAGE_USES_ASPECT_RATIO: flux-pro-ultra sends aspect_ratio not image_size - Add FAL_VIDEO_NO_DURATION: Wan models reject the duration field - Add FAL_VIDEO_STRING_DURATION: Veo expects duration as "5s" not 5 - Fix falQueueBase() to use anchored regex replace, avoiding mangled custom base URLs - Do not wrap payload under input — raw fal queue HTTP API expects flat body; the input wrapper is an SDK abstraction only (confirmed by 422 validation error from fal showing prompt missing at body.prompt) * fix(daemon): correct fal queue protocol comment (flat body, no SDK input wrapper) * fix(daemon): clamp Veo duration to valid fal buckets (4s/6s/8s) * fix(daemon): report effective fal Veo duration in providerNote (with snap warning) * fix(daemon): reduce image generation latency from 4m37s to ~73s Five layered fixes targeting the overhead that padded a ~10s fal API call into a 4m37s user-facing wait: 1. Skip DISCOVERY_AND_PHILOSOPHY for media surfaces (image/video/audio). The ~3000-token HTML-artifact discovery layer is irrelevant for media generation and forced the agent to parse and override all its rules before dispatching. Removes it from the system prompt entirely for these surfaces; MEDIA_GENERATION_CONTRACT is the sole authority. 2. Broaden the wait-loop contract to cover ALL slow models, not just "Volcengine i2v / hyperframes-html". Any model whose generation exceeds 25s — including fal flux-pro-ultra, Veo, Sora — returns exit 2 from od media generate. The contract now makes this universal and provides a python3-based bash pattern (jq is not guaranteed to be installed on all agent runtimes). 3. Increase od media wait polling budget from 25s to 120s. od media generate keeps its 25s budget for fast feedback; od media wait is purpose-built to sit and poll, so it can safely use the full 2-minute bash-tool window. Reduces re-entries for a 3-minute generation from ~7 to ~2. 4. First fal poll is now immediate instead of always sleeping 3s before the first status check. Saves 3s for all fal jobs. 5. Project metadata no longer emits "(unknown — ask)" for imageModel and aspectRatio when unset. Emits the actual defaults (gpt-image-2, aspect-ratio scene heuristic) so the agent can dispatch without extended reasoning about model selection. Also adds dispatch-immediately defaults and a brief-reply rule (2–3 sentences max after generation). Measured end-to-end on the exact problem prompt before/after: Before: 4m37s (discovery form + 7x LLM re-entries + jq failure) After: ~73s (single bash loop, no question turn, image delivered) * feat(daemon): inject media dispatch hint for non-media project surfaces Agents running inside prototype, deck, and other non-image/video/audio projects previously had no knowledge of `od media generate`, so when asked to create an image with fal they would try to call provider REST APIs directly and ask the user for API keys — even though the daemon already holds credentials in .od/media-config.json. Add MEDIA_DISPATCH_HINT to composeSystemPrompt for all non-media surfaces. The hint tells the agent to always route media generation through the daemon dispatcher, and explicitly forbids prompting for API keys. Verified end-to-end: a prototype project generates a 952 KB image via flux-pro-ultra in ~52s with no key errors. * fix(daemon): prevent agent from converting bash env vars to PowerShell syntax MEDIA_DISPATCH_HINT now explicitly labels the shell as POSIX bash and shows the correct $VAR form side-by-side with a warning NOT to use PowerShell $env:VAR. Without this, claude-sonnet running on a Windows host converts the example to PowerShell syntax (`& $env:OD_NODE_BIN`) which then fails at the bash executor with 'syntax error near unexpected token &'. * fix(daemon): add generate→wait loop to MEDIA_DISPATCH_HINT for slow models MEDIA_DISPATCH_HINT previously showed only a bare call. flux-pro-ultra and other slow models always exit 2 after ~25s — without the wait loop the agent would treat exit 2 as a failure and report an error to the user. Replace the single-command example with the canonical generate→wait loop (matching media-contract.ts), add an explicit note that exit 2 means 'keep polling', and reinforce the POSIX bash / no-PowerShell rule directly inside the code block. * fix(daemon): allow fal-ai/* passthrough in media-agent contract The media-agent prompt instructed the agent to warn and substitute the default model for any ID not in the catalogue. This blocked the custom fal-ai/* passthrough path the daemon already supports, so users could not reach uncatalogued fal models from the normal chat flow. Carve out the fal-ai/* exception so the agent passes those IDs through directly instead of warning or substituting. * fix(daemon): align MEDIA_DISPATCH_HINT with exit-0 generate contract media generate now always exits 0 (handoff included). The non-media agent hint still checked ec==2 to decide whether to keep polling, so slow fal models (flux-pro-ultra, veo-3-fal) would stop after printing the handoff JSON instead of entering the wait loop. - generate error check: drop the ec!=2 exception (exits 0 always) - while loop: drive on taskId presence, not ec==2; stop on ec==0/5 - footer: remove --surface inference claim; CLI requires it explicitly * fix(guard): add test-fal-webui.ts to e2e scripts allowlist CI failed: guard flagged e2e/scripts/test-fal-webui.ts as an unapproved package-owned entrypoint. Add it to allowedE2eScripts. * fix(daemon): update prompt test expectations to match exit-0 handoff wording The two stale assertions checked for the old generate-exits-2 copy which no longer exists in the contract. Update them to match the current always-exits-0 wording. * fix(daemon): move skipDiscoveryBrief override before discovery block * chore(e2e): remove ad-hoc fal webui test script The script was a one-time developer helper used to manually validate fal image generation through the live UI. It relied on a real fal API key and hardcoded local port, so it cannot participate in the e2e package's fixture/reporting/CI conventions. Removing it per reviewer feedback. - Delete e2e/scripts/test-fal-webui.ts - Remove its guard.ts allowlist entry - Gitignore the file and its screenshots to prevent accidental re-addition * chore: remove accidental local scratch files from branch Remove bash.exe.stackdump (MSYS crash dump) and fix_loop.py (one-off local rewrite helper) — neither is a repo-owned source artifact. * fix(prompts): document fal-ai/* passthrough in non-media dispatch hint Prototype/deck agents now know arbitrary fal-ai/* model ids are valid --model values and should be forwarded as-is, mirroring the exception already present in media-contract.ts. Adds a prompt regression test. * fix(daemon): use renderMediaGenerationContract(mediaExecution) for media surfaces --------- Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-31 04:44:44 +00:00
Denis Redozubov	c847ace554	Add run-scoped media execution policy (#3106 ) * feat(contracts): add run media execution policy * feat(daemon): enforce run media execution policy * test(daemon): cover media execution policy gates	2026-05-28 09:19:40 +00:00
Eli	18b947c25f	[codex] Land design system GitHub intake handoff (#2187 ) * Add Claude-style design system workflow * Merge design system workflow into main * Restore design system workflow UI styles * Fix design system setup scrolling * Fix design system setup connector button * Preserve connector auth link after popup block * Simplify connected GitHub setup state * Open generated design system workspace project * Summarize design system auto prompt in chat * Add bounded GitHub connector design intake * Prefer path-scoped GitHub intake tools * Restore branch GitHub design context intake * Restore design system review workspace * Restore design system manager tab * Let design system workflow routes own details * Open editable design systems as projects * Restore design system workspace coverage * Fix bounded GitHub connector intake * Hide design system review while generating * Suppress design system generation questions * Constrain GitHub design intake to bounded command * Tolerate oversized GitHub metadata during intake * Rebuild daemon CLI when sources change * Fallback when GitHub connector snapshots are rate limited * Allow GitHub intake without Composio * Use native GitHub auth for design intake * Remove design system review group heading * Improve design system extraction evidence * Align design system scaffold with Claude output * Add evidence inventory for design system intake * Add local design system evidence intake * Add design system package audit gate * Allow auditing Claude Design reference packages * Audit design system package content quality * Migrate legacy design system artifacts * Clean migrated design system artifacts * Require modular design system UI kits * Reject thin design system UI kits * Prioritize core design evidence intake * Require role-based design system UI kits * Clean stale design system manifest references * Require representative preserved design assets * Warn on generic design system visuals * Enforce design system quality warnings * Audit connected design system UI kits * Require mounted design system UI kits * Require composed design system app shells * Require runnable JSX design system kits * Require browser globals for design system components * Infer design system names from source URLs * Require source examples in design system packages * Bind preserved fonts in design system tokens * Require skill frontmatter in design system packages * Preserve build icons in design system packages * Require real assets in brand previews * Require substantive source examples * Require product overview in design system README * Require reusable UI kit README * Require reusable design system skill docs * Seed Claude-style UI kit entry contract * Preserve runtime build assets in design packages * Audit design system packages after generation * Audit design system first-run output * Audit source-backed preview cards * Align design system UI kit scaffolds * Materialize design evidence package artifacts * Show project chat during design system setup * Hand off design system setup to project chat * Auto-repair design system audit failures * Harden design system evidence preservation * Tighten design system package guidance * Add targeted design system repair guidance * Bound design system audit auto repair * Use connector statuses in design system setup * Audit design system preview manifests * Require README preview manifests for design systems * Fix design system GitHub intake handoff * Fix daemon prompt CI assertions	2026-05-19 14:30:17 +08:00
Yuhao Chen	0e61313347	fix(prompts): stabilize discovery brand answers (#1861 )	2026-05-16 15:50:52 +08:00
Quang Do	3d0e708720	fix(daemon): treat media generate handoff as success (#1715 )	2026-05-15 14:11:40 +08:00
Marc Chan	055e55abd8	Add batch design system testing (#1515 ) * feat: add batch design system testing * fix: use daemon default agent for batch tests * fix: honor batch project prompt flags Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * fix: persist batch run output * fix: honor dry-run before daemon resolution Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * fix: persist batch assistant run ids Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode) * fix: cancel timed-out batch runs Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)	2026-05-14 14:19:32 +08:00
kami	4f76e836ae	feat(audio): add ElevenLabs audio support (#1384 ) * docs: add ElevenLabs audio support design * docs: add ElevenLabs audio implementation plan * feat(daemon): add ElevenLabs speech renderer * feat(daemon): add ElevenLabs sound effects renderer * fix(daemon): preserve ElevenLabs sfx durations * feat(web): expose ElevenLabs media providers * feat(daemon): document ElevenLabs audio contract * feat(audio): add ElevenLabs voice selection * chore: ignore superpowers scratch docs * fix(daemon): cache ElevenLabs voice options * fix(audio): expand ElevenLabs voice and SFX selection * fix(audio): align ElevenLabs SFX controls * fix(audio): tighten ElevenLabs SFX prompt budget * fix(audio): preflight ElevenLabs SFX prompt length * fix(audio): surface ElevenLabs lookup failures * fix(audio): sanitize ElevenLabs prompt errors	2026-05-13 15:53:41 +08:00
Deheng Huang	09b78c2f9b	feat(daemon): let Codex image projects use built-in imagegen (#622 ) * feat(daemon): let Codex image projects avoid API-key setup Codex has a built-in image generation path available inside the agent runtime, while the generic media dispatcher still routes gpt-image models through the daemon OpenAI provider. Pass the active agent id into prompt composition so Codex-only gpt-image projects can use built-in imagegen first without changing non-Codex media behavior. Constraint: Existing media contract remains the default path for non-Codex agents and explicit provider fallback Rejected: Add a nested daemon Codex media provider \| heavier auth, streaming, timeout, cancellation, and output parsing surface for this parity fix Confidence: high Scope-risk: narrow Directive: Keep this override after the media contract so it can intentionally supersede dispatcher-only wording for Codex gpt-image projects Tested: pnpm --dir apps/daemon exec vitest run -c vitest.config.ts tests/system-prompt-template.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI * fix(daemon): harden Codex imagegen prompt routing PR review found the Codex override could be superseded by the web-supplied media contract, trusted unvalidated image model metadata, and assumed generated image paths outside the workspace were readable. This keeps the override daemon-owned, appends it last in the live prompt, validates against registered gpt-image model IDs, allowlists only Codex's generated_images folder, and tightens copy-failure instructions. Constraint: The web contracts composer still emits the generic media contract without agent identity. Rejected: Mirror Codex-specific prompt logic into contracts/web \| duplicates daemon model registry and still leaves final ordering fragile. Confidence: high Scope-risk: narrow Directive: Keep Codex imagegen override appended after client systemPrompt so it remains the final media instruction for Codex gpt-image projects. Tested: pnpm --dir apps/daemon exec vitest run -c vitest.config.ts tests/system-prompt-template.test.ts tests/agents.test.ts tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI * fix(daemon): keep Codex add-dir writable scope narrow PR review found Codex --add-dir grants writable workspace access, so passing skill, design-system, and linked reference directories through the same chat allowlist broke their documented read-only boundary. This routes chat extra directories by active agent: Codex receives only the validated generated_images output directory needed for built-in imagegen, while non-Codex adapters keep the existing resource and linked-directory read access behavior. Constraint: Codex CLI treats --add-dir as writable sandbox expansion. Constraint: The daemon still stages active skill files into the project cwd as Codex's read-safe path. Rejected: Keep one shared extraAllowedDirs list for all agents \| grants Codex write access to read-only resources. Confidence: high Scope-risk: narrow Directive: Do not add read-only resource/reference directories to Codex --add-dir unless Codex gains a read-only allowlist flag. Tested: git diff --check -- apps/daemon/src/server.ts apps/daemon/tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon exec vitest run tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI * fix(daemon): validate Codex imagegen add-dir grants PR review found the generated_images grant still trusted symlinked paths and rendered the Codex override before proving the sandbox grant would be present. This validates the generated_images directory before prompt assembly, rejects final-component symlinks and protected-root canonical escapes, passes Codex the canonical grant path, and only appends the Codex imagegen override when that same path is in extraAllowedDirs. Constraint: Codex --add-dir grants writable workspace access, so path aliases into read-only resource roots must be rejected. Rejected: Keep returning the nominal CODEX_HOME path after validation \| leaves Codex operating through a symlink alias instead of the audited grant target. Confidence: high Scope-risk: narrow Directive: Keep Codex imagegen prompt rendering downstream of generated_images validation and grant resolution. Tested: git diff --check -- apps/daemon/src/server.ts apps/daemon/tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon exec vitest run -c vitest.config.ts tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon exec vitest run -c vitest.config.ts tests/agents.test.ts tests/chat-route.test.ts Tested: pnpm --filter @open-design/daemon typecheck Tested: pnpm guard Tested: pnpm typecheck Not-tested: Live Codex image generation inside the Open Design UI	2026-05-06 18:28:16 +08:00
Tom Huang	513bd4edea	feat(web): pick prompt templates (not design systems) for image/video projects (#192 ) * feat(web): pick prompt templates (not design systems) for image/video projects The New Project panel now shows the curated prompt-template gallery in the image and video tabs instead of the Design System picker. Design systems only make sense for prototypes, slide decks, templates, and the freeform "other" canvas — they don't map onto image/video generation. The picked template's body is editable in-line ("optimize" affordance) and the (possibly tuned) prompt is snapshotted into ProjectMetadata as `promptTemplate`. The system-prompt composer surfaces it on every turn as a stylistic + structural reference for the agent — same shape we already use for the saved-template flow. Also rename the right-side gallery tabs from "Image prompts" / "Video prompts" to "Image templates" / "Video templates" across all locales. Made-with: Cursor * fix(prompt-templates): address review — escape fences, error UI, empty hint, tests Apply the actionable feedback from the code review: - Security: escape triple-backticks in `metadata.promptTemplate.prompt` before interpolating it into the system prompt's fenced block. A user who pasted ``` into the editable template body could otherwise close the fence and inject free-form instructions for the agent. Apply the same fix in both the contracts composer and the daemon mirror. - UX: surface fetchPromptTemplate failures via an inline error banner outside the popover, with a one-click retry button bound to the last failed pick. Previously the error toast lived inside the popover and vanished as soon as the popover closed. - UX: show a subtle hint below the prompt textarea when the body is empty, so users who clear it on purpose understand the agent will have no template reference. - Defense in depth: gate the `referenceTemplate:` metadata bullet on a non-empty prompt body in both composers, so a stale title can never appear in the system prompt without the body it claims to reference. - Tests: add tests/system-prompt-template.test.ts covering the happy path (image + video), the backtick escape, the truncation cap, the empty-prompt skip, the non-media kinds, and the missing-source path. - i18n: add `promptTemplates.retry` and `newproj.promptTemplateBodyEmpty` across all 9 locales; backfill the prompt-template picker keys for ja, es-ES, de which landed on main after this branch. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-01 23:31:31 +08:00

9 commits