openpencil/server/api/ai/generate.ts
Kayshen Xu 3b00d5564d
V0.0.1 (#4)
* feat(canvas,panels): multi-selection drag, shift-key selection, and chat pipeline UI

- Preserve multi-selection when clicking a selected object without Shift,
  enabling drag-move of the whole set
- Set selectionKey to shiftKey so only Shift+click toggles multi-select
- Refactor chat-message step rendering: extract step blocks with regex,
  add pipeline checklist progress UI, separate step display from markdown
- Adjust AI chat panel selection count display and layout

* feat(opencode): integrate OpenCode SDK for enhanced AI interactions

- Add support for OpenCode as a new AI provider, allowing users to connect to a local OpenCode server.
- Implement new API endpoints for generating and streaming chat responses via OpenCode.
- Update UI components to include OpenCode in provider selection and display.
- Introduce OpenCode logo and enhance the agent settings dialog to manage OpenCode connections.
- Refactor existing chat and generation logic to accommodate provider-specific routing.
- Include type definitions for OpenCode SDK to ensure type safety and improve developer experience.

* feat(electron): integrate Electron framework for desktop application support

- Add Electron configuration and main process setup for building a desktop application.
- Implement IPC communication for file operations (open, save) between the renderer and main processes.
- Create a preload script to expose Electron APIs to the renderer.
- Update package.json to include Electron and related dependencies.
- Enhance the build process with electron-builder for packaging the application.
- Introduce a new electron-builder.yml configuration file for build settings.
- Modify Vite configuration to support Electron-specific builds.
- Update UI components to accommodate Electron's window management and drag regions.

* fix(canvas): multi-selection drag reparenting and infinite recursion guard

- Add re-entry guard to object:modified handler to prevent infinite
  recursion caused by discardActiveObject() firing object:modified
  via _finalizeCurrentTransform
- Extend drag-into detection to non-layout root frames (rootFrameBounds)
  in addition to layout containers (layoutContainerBounds)
- Add checkReparentIntoFrame() fallback for root-level objects dragged
  into frames, handling both layout and non-layout targets
- Redirect _currentTransform to ActiveSelection in mouse:down so the
  whole group moves/scales/rotates together
- Support same-container reorder in commitDragIntoMulti
- Fix guide-utils getEdges() for center-origin ActiveSelection objects
- Track selection descendants for visual drag-follow during multi-drag
- Show position coordinates during multi-selection drag in dimension label
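
A re-entry guard of the kind described in the first bullet can be sketched as follows. This is a generic illustration, not the code from this commit: `withReentryGuard` is a hypothetical name, and the sketch assumes the nested event fires synchronously while the handler is still running.

```typescript
// Hypothetical helper (not from the commit): wraps a handler so that calls
// made while it is already running are ignored instead of recursing.
function withReentryGuard<A extends unknown[]>(
  handler: (...args: A) => void,
): (...args: A) => void {
  let running = false
  return (...args: A) => {
    if (running) return // nested call triggered by the handler itself
    running = true
    try {
      handler(...args)
    } finally {
      // Always release the guard, even if the handler throws
      running = false
    }
  }
}
```

In the scenario above, the wrapped function would be the `object:modified` listener, so the event fired from inside `discardActiveObject()` returns immediately.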

* feat(store,canvas): add reusable component and instance system

add makeReusable/detachComponent store methods, RefNode resolution in
canvas sync with component/instance selection borders, virtual child ID
handling in selection context, purple/instance-colored frame labels,
and collision avoidance when duplicating nodes

* feat(panels): add component/instance UI to layer and property panels

layer panel shows diamond icon and purple/#9281f7 styling for
components/instances with context menu actions; property panel resolves
RefNode display, routes instance overrides, adds header with go-to and
detach buttons, and supports inline name editing on click

* feat(canvas): match hover outline to selection style for components and instances

hover outlines now use purple solid for reusable components and #9281f7
dashed for instances, consistent with their selection border styling

* fix(editor): save directly without dialog for previously opened files

prioritize fileName check over File System Access API so opened files
save via download without showing a file picker; also add Cmd+Alt+K
shortcut for creating reusable components

* refactor(canvas): remove dimension label overlay

remove the blue dimension/position label that appeared on selection,
drag, and scale interactions

* feat(export): add export section for raster image export functionality

- Introduce a new ExportSection component to facilitate exporting layers as raster images in various formats (PNG, JPEG, WEBP) with adjustable scale options.
- Integrate the ExportSection into the PropertyPanel for easy access during design editing.
- Enhance export utility functions to support exporting layers and managing descendant nodes.

* feat(assets): add new icon files for application branding

- Introduce new icon files in various formats (ICNS, ICO, PNG) for application branding.
- Add logo image for the Electron interface to enhance visual identity.
- Update canvas selection logic to allow programmatic selection from the layer panel.

* feat(electron): enhance Electron app integration and CI/CD workflows

- Add new commands for Electron development, compilation, and building processes in CLAUDE.md.
- Update README.md to reflect the availability of the application as both a web and desktop app.
- Introduce a FixedChecklist component in the AI chat panel for better user interaction with generated tasks.
- Implement CI/CD workflows for automated testing and Electron builds in GitHub Actions.
- Refactor design generator prompts to support element-by-element streaming for improved performance.

* feat(ai): enhance chat functionality and orchestration capabilities

- Introduce context optimization to manage chat history size and prevent unbounded growth.
- Implement complexity assessment for design prompts to determine orchestration needs.
- Add orchestrator functionality for parallel design generation, improving performance and responsiveness.
- Update chat message parsing to include explicit status handling for orchestrator steps.
- Refactor design generator to support orchestration and streamline context building.

* refactor(ai): streamline orchestration and remove complexity classifier

- Remove the complexity classifier as its functionality is no longer needed.
- Update design generator to always route through the orchestrator, simplifying the logic.
- Enhance error handling during orchestration to ensure fallback to direct generation is clear.
- Introduce a new sub-agent prompt for improved output formatting and clarity in design generation.

* feat(ai): refine orchestrator prompt and timeout settings

- Update the ORCHESTRATOR_PROMPT to allow for more granular task division, specifying 4-15 atomic sections.
- Enhance the JSON output structure by adding new subtasks for the hero section.
- Adjust ORCHESTRATOR_TIMEOUTS to extend hard and no-text timeout values for improved orchestration performance.

* feat(docs): update CLAUDE.md and README.md for new features and improvements

- Expand the Fabric.js integration section in CLAUDE.md to include new files and functionalities.
- Highlight new features such as double-click frame entry, advanced drag-and-drop capabilities, and per-layer export options in README.md.
- Add context optimization and multi-provider support for AI in README.md.
- Update the project structure descriptions for clarity and completeness.

* refactor(config): update Vite configuration import for Vitest compatibility

* chore(ci): update GitHub Actions workflow to trigger on main branch pushes

* feat(uikit): introduce UIKit browser and component management features

- Add a new ComponentBrowserPanel for browsing and managing UI components.
- Implement ComponentBrowserGrid and ComponentBrowserCard for displaying components.
- Integrate UIKit store for managing component kits, search queries, and active categories.
- Enhance editor layout to support toggling the UIKit browser with keyboard shortcuts.
- Include import/export functionality for UIKit components.

* feat(types): add clipContent to ContainerProps

Allow frames to explicitly clip overflowing children, essential for
cards with cornerRadius + image children.

* feat(canvas): wire letterSpacing, textGrowth and clipContent to Fabric.js

- Convert letterSpacing (px) to Fabric charSpacing (1/1000 em) in
  factory and sync
- Select IText vs Textbox based on textGrowth mode, restore canvas
  selection after text object recreation
- Add multi-line text height estimation with CJK character support
- Honor clipContent flag in clipping logic
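
The letterSpacing conversion in the first bullet depends on the font size, because Fabric.js expresses `charSpacing` in thousandths of an em. A minimal sketch, assuming both values are in pixels; the helper names are hypothetical and not taken from this commit:

```typescript
// Hypothetical helper: converts letter spacing in pixels to Fabric.js
// charSpacing units (1/1000 em), given the font size in pixels.
function pxToCharSpacing(letterSpacingPx: number, fontSizePx: number): number {
  if (fontSizePx <= 0) return 0 // avoid dividing by zero on invalid sizes
  return (letterSpacingPx / fontSizePx) * 1000
}

// Inverse conversion, for reading a Fabric text object back into pixels.
function charSpacingToPx(charSpacing: number, fontSizePx: number): number {
  return (charSpacing * fontSizePx) / 1000
}
```

For example, 2 px of letter spacing at a 16 px font size is 0.125 em, which corresponds to a `charSpacing` of 125.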

* feat(panels): split text Layout and Typography into separate sections

- Add readOnly prop to NumberInput for non-editable dimension display
- Extract text Layout section (dimensions, fill/hug, resizing toggles)
  into text-layout-section.tsx
- Enhance Typography section with line height, letter spacing, vertical
  alignment controls
- Reorder property panel: Size → Layout → Appearance → Fill → Stroke →
  Typography → Effects
- Add hideWH prop to SizeSection to avoid duplicate W/H inputs

* feat(panels): rewrite layout-section with full Flex Layout panel

- 3x3 alignment grid with context-aware behavior per layout direction
- Gap section with numeric/space-between/space-around radio modes
- Multi-mode padding: single, 2-axis (V/H), 4-individual (T/R/B/L)
  with gear popover
- Dimensions (W/H) and sizing checkboxes (fill/hug/clip) integrated
  into layout panel

* feat(panels): improve layer panel with auto-collapse and scroll-to-selection

- Collapse all layers by default, auto-collapse newly added nodes
- Auto-expand ancestor layers when a child is selected on canvas
- Scroll selected layer item into view after expansion

* fix(panels): improve AI chat panel overflow and color picker styling

- Fix chat panel overflow with proper flex layout and min-h-0
- Cap checklist height with scrollable overflow
- Adjust color picker input sizing

* feat(codegen): add textGrowth, textAlignVertical and clipContent to code generation

- textGrowth: auto → whitespace-nowrap, fixed-width-height → overflow-hidden
- textAlignVertical: middle/bottom → vertical-align CSS
- clipContent: true → overflow-hidden on containers
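
The mappings above could be expressed as a small class builder along these lines. This is a sketch under assumptions: the node shape and the function name `layoutClasses` are hypothetical, and `align-middle` / `align-bottom` are assumed as the utility classes for the vertical-align rule, which the commit message does not spell out.

```typescript
// Hypothetical node shape, limited to the values named in the commit message.
interface TextLayoutProps {
  textGrowth?: 'auto' | 'fixed-width-height'
  textAlignVertical?: 'top' | 'middle' | 'bottom'
  clipContent?: boolean
}

// Returns the utility classes implied by the three properties.
// A Set is used so that overflow-hidden is emitted only once.
function layoutClasses(node: TextLayoutProps): string[] {
  const classes = new Set<string>()
  if (node.textGrowth === 'auto') classes.add('whitespace-nowrap')
  if (node.textGrowth === 'fixed-width-height') classes.add('overflow-hidden')
  // Assumed class names for the vertical-align mapping
  if (node.textAlignVertical === 'middle') classes.add('align-middle')
  if (node.textAlignVertical === 'bottom') classes.add('align-bottom')
  if (node.clipContent === true) classes.add('overflow-hidden')
  return Array.from(classes)
}
```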

* feat(ai): integrate text typography properties into design prompts

- Add textGrowth, lineHeight, letterSpacing, textAlignVertical, fontStyle
  to PEN_NODE_SCHEMA and sub-agent prompt
- Add typography scale guidelines with lineHeight/letterSpacing defaults
- Update examples with lineHeight, clipContent, fill_container usage
- Add phone placeholder and clipContent rules

* feat(ai): enhance streaming service with runtime config and thinking modes

- Extract timeout constants to ai-runtime-config.ts
- Add thinking mode control (adaptive/disabled/enabled) to stream options
- Add StyleGuide interface for orchestrator visual consistency
- Add ping timeout and first-text timeout options

* feat(ai): improve design generator with typography defaults and phone placeholder

- Set default lineHeight on text nodes (1.2 for headings, 1.5 for body)
- Expand phone/mockup placeholder detection with shape-based fallback
- Flatten nested phone-shaped frames in post-streaming tree fixes
- Increase CJK character width estimation and button padding buffer

* feat(ai): enhance orchestrator with style guide and prompt optimizer

- Pass style guide from orchestrator plan to sub-agent prompts
- Add orchestrator-prompt-optimizer for context-aware prompt tuning
- Improve parallel sub-task generation and error recovery

* feat(ai): add thinking mode and Codex provider support to server API

- Support thinkingMode/effort params in chat and generate endpoints
- Add Codex (OpenAI) provider streaming via codex-client utility
- Forward thinking config to both Anthropic SDK and Agent SDK paths

* feat(mcp): implement OpenPencil file format and MCP server integration

- Introduce support for .op file format alongside .pen
- Add MCP server functionality for document management and tool operations
- Implement batch processing tools for design and variable management
- Enhance save and open dialogs to accommodate new file format
- Update dependencies and scripts for MCP server compilation and execution

* feat(build): update electron build configuration and resource handling

- Add mcp-server.cjs to extraResources in electron-builder.yml for production packaging
- Modify package.json build script to include MCP server compilation
- Enhance main.ts to set ELECTRON_RESOURCES_PATH for resource access
- Update mcp-install.ts to resolve MCP server path from resources in production

---------

Co-authored-by: Fini <fini.yang@gmail.com>
2026-02-22 12:09:12 +08:00

234 lines
7.4 KiB
TypeScript

import { defineEventHandler, readBody, setResponseHeaders } from 'h3'
import { resolveClaudeCli } from '../../utils/resolve-claude-cli'
import { runCodexExec } from '../../utils/codex-client'

interface GenerateBody {
  system: string
  message: string
  model?: string
  provider?: string
  thinkingMode?: 'adaptive' | 'disabled' | 'enabled'
  thinkingBudgetTokens?: number
  effort?: 'low' | 'medium' | 'high' | 'max'
}

/**
 * Non-streaming AI generation endpoint.
 * Requests with provider 'opencode' or 'openai' are routed to OpenCode or Codex.
 * Otherwise tries ANTHROPIC_API_KEY first (via Anthropic SDK);
 * falls back to local Claude Code (via Agent SDK, uses OAuth login).
 */
export default defineEventHandler(async (event) => {
  const body = await readBody<GenerateBody>(event)
  if (!body?.message || !body?.system) {
    setResponseHeaders(event, { 'Content-Type': 'application/json' })
    return { error: 'Missing required fields: system, message' }
  }

  // Explicit provider routing
  if (body.provider === 'opencode') {
    return generateViaOpenCode(body, body.model)
  }
  if (body.provider === 'openai') {
    return generateViaCodex(body, body.model)
  }

  // Default: existing behavior (backward-compatible)
  const apiKey = process.env.ANTHROPIC_API_KEY
  if (apiKey) {
    // generateViaAnthropicSDK does not throw; it reports failures as { error },
    // so the result has to be inspected for the fallback to take effect.
    const result = await generateViaAnthropicSDK(apiKey, body, body.model)
    if (!('error' in result)) {
      return result
    }
    // SDK not installed or failed: fall back to Agent SDK
    console.warn('[AI] Anthropic SDK generation failed, falling back to Agent SDK:', result.error)
  }
  return generateViaAgentSDK(body, body.model)
})

/** Generate via Anthropic SDK */
async function generateViaAnthropicSDK(apiKey: string, body: GenerateBody, model?: string) {
  try {
    const { default: Anthropic } = await import('@anthropic-ai/sdk')
    const client = new Anthropic({ apiKey })
    const response = await client.messages.create({
      model: model || 'claude-sonnet-4-5-20250929',
      max_tokens: 4096,
      system: body.system,
      messages: [{ role: 'user', content: body.message }],
    })
    const textBlock = response.content.find((b: { type: string }) => b.type === 'text')
    return { text: textBlock && 'text' in textBlock ? textBlock.text : '' }
  } catch (error) {
    const message = error instanceof Error ? error.message : 'Unknown error'
    return { error: message }
  }
}

/** Generate via Claude Agent SDK (uses local Claude Code OAuth login, no API key needed) */
async function generateViaAgentSDK(body: GenerateBody, model?: string): Promise<{ text?: string; error?: string }> {
  try {
    const { query } = await import('@anthropic-ai/claude-agent-sdk')

    // Remove CLAUDECODE env to allow running from within a CC terminal
    const env = { ...process.env } as Record<string, string | undefined>
    delete env.CLAUDECODE

    const claudePath = resolveClaudeCli()
    const q = query({
      prompt: body.message,
      options: {
        systemPrompt: body.system,
        model: model || 'claude-sonnet-4-6',
        maxTurns: 1,
        tools: [],
        plugins: [],
        permissionMode: 'plan',
        persistSession: false,
        env,
        ...(claudePath ? { pathToClaudeCodeExecutable: claudePath } : {}),
      },
    })

    for await (const message of q) {
      if (message.type === 'result') {
        if (message.subtype === 'success') {
          return { text: message.result }
        }
        const errors = 'errors' in message ? (message.errors as string[]) : []
        return { error: errors.join('; ') || `Query ended with: ${message.subtype}` }
      }
    }
    return { error: 'No result received from Claude Agent SDK' }
  } catch (error) {
    const message = error instanceof Error ? error.message : 'Unknown error'
    return { error: message }
  }
}

/** Generate via Codex (OpenAI provider) through the codex-client utility */
async function generateViaCodex(body: GenerateBody, model?: string): Promise<{ text?: string; error?: string }> {
  const result = await runCodexExec(body.message, {
    model,
    systemPrompt: body.system,
    thinkingMode: body.thinkingMode,
    thinkingBudgetTokens: body.thinkingBudgetTokens,
    effort: body.effort,
  })
  return result.error ? { error: result.error } : { text: result.text ?? '' }
}

/** Map the request effort to the three levels sent to OpenCode; 'max' is clamped to 'high' */
function mapOpenCodeEffort(
  effort?: 'low' | 'medium' | 'high' | 'max',
): 'low' | 'medium' | 'high' | undefined {
  if (!effort) return undefined
  if (effort === 'max') return 'high'
  return effort
}

/**
 * Build the optional reasoning payload from effort, thinkingMode and
 * thinkingBudgetTokens. 'adaptive' leaves `enabled` unset.
 * Returns undefined when no reasoning option applies.
 */
function buildOpenCodeReasoning(
  body: GenerateBody,
): Record<string, unknown> | undefined {
  const reasoning: Record<string, unknown> = {}
  const effort = mapOpenCodeEffort(body.effort)
  if (effort) {
    reasoning.effort = effort
  }
  if (body.thinkingMode === 'enabled') {
    reasoning.enabled = true
  } else if (body.thinkingMode === 'disabled') {
    reasoning.enabled = false
  }
  if (typeof body.thinkingBudgetTokens === 'number' && body.thinkingBudgetTokens > 0) {
    reasoning.budgetTokens = body.thinkingBudgetTokens
  }
  return Object.keys(reasoning).length > 0 ? reasoning : undefined
}

/**
 * Send the prompt with reasoning options when any are set. If that call
 * returns an error, retry once with the base payload (no reasoning).
 */
async function promptOpenCodeWithThinking(
  ocClient: any,
  basePayload: Record<string, unknown>,
  body: GenerateBody,
): Promise<{ data: any; error: any }> {
  const reasoning = buildOpenCodeReasoning(body)
  if (!reasoning) {
    return await ocClient.session.prompt(basePayload)
  }

  const enhanced = { ...basePayload, reasoning }
  const firstTry = await ocClient.session.prompt(enhanced)
  if (!firstTry.error) {
    return firstTry
  }

  console.warn('[AI] OpenCode reasoning options rejected, retrying without reasoning.')
  return await ocClient.session.prompt(basePayload)
}

/** Generate via OpenCode SDK (connects to a running OpenCode server) */
async function generateViaOpenCode(body: GenerateBody, model?: string): Promise<{ text?: string; error?: string }> {
  let ocServer: { close(): void } | undefined
  try {
    const { getOpencodeClient } = await import('../../utils/opencode-client')
    const oc = await getOpencodeClient()
    const ocClient = oc.client
    ocServer = oc.server

    const { data: session, error: sessionError } = await ocClient.session.create({
      title: 'OpenPencil Generate',
    })
    if (sessionError || !session) {
      return { error: 'Failed to create OpenCode session' }
    }

    // Inject system prompt as context (no AI reply)
    await ocClient.session.prompt({
      sessionID: session.id,
      noReply: true,
      parts: [{ type: 'text', text: body.system }],
    })

    // Parse model string ("providerID/modelID")
    let modelOption: { providerID: string; modelID: string } | undefined
    if (model && model.includes('/')) {
      const idx = model.indexOf('/')
      modelOption = { providerID: model.slice(0, idx), modelID: model.slice(idx + 1) }
    }

    // Send main prompt and await full response
    const promptPayload: Record<string, unknown> = {
      sessionID: session.id,
      ...(modelOption ? { model: modelOption } : {}),
      parts: [{ type: 'text', text: body.message }],
    }
    const { data: result, error: promptError } = await promptOpenCodeWithThinking(
      ocClient,
      promptPayload,
      body,
    )
    if (promptError) {
      return { error: 'OpenCode generation failed' }
    }

    // Extract text from response parts
    const texts: string[] = []
    if (result?.parts) {
      for (const part of result.parts) {
        if (part.type === 'text' && part.text) {
          texts.push(part.text)
        }
      }
    }
    return { text: texts.join('') }
  } catch (error) {
    const message = error instanceof Error ? error.message : 'Unknown error'
    return { error: message }
  } finally {
    const { releaseOpencodeServer } = await import('../../utils/opencode-client')
    releaseOpencodeServer(ocServer)
  }
}
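
One detail of the OpenCode path that may help callers: the model string is split at the first `/` only, so a model ID that itself contains slashes stays intact, and a string without a slash yields no model override. The standalone sketch below mirrors that logic for illustration; `parseOpenCodeModel` and the sample model names are hypothetical and not part of this file.

```typescript
// Hypothetical standalone version of the "providerID/modelID" parsing
// performed inside generateViaOpenCode.
interface OpenCodeModel {
  providerID: string
  modelID: string
}

function parseOpenCodeModel(model?: string): OpenCodeModel | undefined {
  if (!model || !model.includes('/')) return undefined
  const idx = model.indexOf('/') // first slash separates provider from model
  return { providerID: model.slice(0, idx), modelID: model.slice(idx + 1) }
}

// Sample inputs (model names are made up for the example)
const simple = parseOpenCodeModel('acme/fast-model')
const nested = parseOpenCodeModel('acme/vendor/fast-model')
const bare = parseOpenCodeModel('fast-model')
```

Here `simple` resolves to provider `acme` with model `fast-model`, `nested` keeps `vendor/fast-model` as the model ID, and `bare` is `undefined`, in which case the handler sends the prompt without a `model` field.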