openpencil/server/api/ai/chat.ts
Kayshen Xu 90bbcb16fd
V0.4.0 (#44)
* perf(canvas): bitmap cache during zoom/pan, fix save for large files

- Add canvas-zoom-cache: captures pixel snapshot on first viewport change,
  draws cached bitmap with transform delta (~0.1ms vs ~15ms per frame).
  Refreshes snapshot every 200ms during continuous interaction.
- Integrate zoom cache into wheel zoom and space+drag panning.
- Set renderOnAddRemove: false on Fabric canvas to avoid per-object renders.
- Remove depth/size LOD from viewport culling (zoom cache handles perf).
- Rewrite syncCanvasPositionsToStore: single tree walk + single store set
  instead of per-object updateNode (was O(n²) with 200+ history pushes).
- Unify save/save-as: .op files save in-place, non-.op triggers save-as.
- Add Electron filePath storage for reliable native IPC saves.
- Add error handling with fallback for all save paths.

* feat(canvas): integrate CanvasKit for enhanced rendering and layout

- Added support for CanvasKit WASM to improve rendering performance and capabilities.
- Introduced a new SkiaCanvas component for rendering using CanvasKit.
- Implemented spatial indexing with RBush for efficient hit testing in the Skia engine.
- Enhanced text measurement and layout handling using Canvas 2D for accurate word wrapping.
- Updated layout engine to accommodate badge and overlay nodes without affecting layout flow.
- Bumped version from 0.3.3 to 0.3.4 to reflect these significant changes.

* fix(electron): add graceful-fs to devDependencies for Node.js v25 compat (#38)

* feat(figma): enhance instance override handling and derived data processing

- Added support for symbol tree context in instance conversion.
- Improved the application of instance overrides by filtering derived data to exclude nested instances.
- Implemented a two-strategy approach for resolving overrides and derived data, accommodating both direct index mapping and expanded DFS for nested instances.
- Updated FigmaSymbolOverride interface to extend FigmaNodeChange, enhancing the handling of overridable node properties.

* refactor(canvas): remove loading state and enhance error handling in SkiaCanvas

- Eliminated the loading state management from SkiaCanvas, simplifying the component.
- Updated error handling to directly display error messages without loading indicators.
- Adjusted the EditorLayout to directly render SkiaCanvas without suspense, improving performance.
- Introduced drag-and-drop file handling in the canvas store for better user experience.
- Enhanced Figma import dialog to auto-process pending files from drag-and-drop.

* fix(ai): improve error messages for API authentication and connection issues

- Updated error hints in chat.ts to provide clearer instructions for API authentication, including running "claude login" or setting the ANTHROPIC_API_KEY in settings.json.
- Enhanced friendly error messages in connect-agent.ts to guide users on authentication steps when encountering connection errors.
- Refactored resolve-claude-agent-env.ts to improve environment variable normalization, allowing for better handling of ANTHROPIC_CUSTOM_HEADERS and ensuring proper serialization of object values.

* refactor(env): streamline environment variable reading and enhance settings handling

- Renamed and refactored the function for reading Claude settings to improve clarity and maintainability.
- Introduced a new function to read settings from both `settings.json` and `settings.local.json`, ensuring local settings take precedence.
- Updated test setup to use the system's temporary directory for security tests, enhancing compatibility across environments.
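
The precedence rule in the second bullet can be sketched as a plain object merge. This is a minimal illustration, not the project's actual reader: `ClaudeSettings` and `mergeClaudeSettings` are hypothetical names, and it assumes both files have already been parsed into objects.

```typescript
// Hypothetical shape; the real settings files may hold more fields.
interface ClaudeSettings {
  env?: Record<string, string>
  [key: string]: unknown
}

/** Merge two parsed settings objects; values from `local` win on conflict. */
function mergeClaudeSettings(base: ClaudeSettings, local: ClaudeSettings): ClaudeSettings {
  return {
    ...base,
    ...local,
    // Merge `env` key by key so the local file can override a single variable
    // without discarding the rest of the base environment.
    env: { ...(base.env ?? {}), ...(local.env ?? {}) },
  }
}
```

Merging `env` per key, rather than replacing it wholesale, is a design assumption here; the commit message does not say which behavior the real function uses.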

* feat(ai): add fallback models for third-party API proxies in connect-agent

- Introduced FALLBACK_CLAUDE_MODELS to provide default model options when supportedModels() fails, enhancing connectivity with third-party API proxies.
- Updated error handling in connectClaudeCode to return fallback models on specific connection errors, ensuring users can still connect and select a model.

* chore: bump version from 0.3.4 to 0.4.0 in package.json

* feat(ai): enhance prompt handling for basic-tier models

- Introduced logic to inline system prompts into user messages for basic-tier models (e.g., MiniMax, GLM) to ensure proper instruction visibility.
- Updated design modification and orchestrator functions to accommodate this change, improving interaction with third-party routers.
- Added a utility function to strip non-standard XML-like tags from AI responses, enhancing JSON extraction reliability.
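
A rough sketch of the prompt inlining described in the first bullet, assuming the system prompt is prepended to the first user message. `Message` and `inlineSystemPrompt` are illustrative names, not the ones used in the codebase.

```typescript
type Message = { role: 'user' | 'assistant'; content: string }

/**
 * Fold the system prompt into the first user message so that models which
 * ignore a separate system field still see the instructions.
 */
function inlineSystemPrompt(system: string, messages: Message[]): Message[] {
  const firstUser = messages.findIndex((m) => m.role === 'user')
  // No user message yet: send the instructions as a user message of their own.
  if (firstUser === -1) return [{ role: 'user', content: system }, ...messages]
  return messages.map((m, i) =>
    i === firstUser ? { ...m, content: `${system}\n\n${m.content}` } : m,
  )
}
```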

* fix(server): add ESM-compatible __dirname polyfill (#42)

- Add fileURLToPath/dirname polyfill to mcp-install.ts
- Add fileURLToPath/dirname polyfill to mcp-server-manager.ts

Fixes browser version crash with "__dirname is not defined"
Fixes MCP over HTTP transport failure since v3.3

Closes #37
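
For reference, the usual shape of this polyfill is shown below. It is a sketch of the general technique rather than the exact code added to the two files; `dirnameFromModuleUrl` is a hypothetical helper name.

```typescript
import { dirname } from 'node:path'
import { fileURLToPath } from 'node:url'

/** Derive the directory of an ES module from its `import.meta.url`. */
function dirnameFromModuleUrl(moduleUrl: string): string {
  return dirname(fileURLToPath(moduleUrl))
}

// In an ES module this would stand in for the missing CommonJS global:
//   const __dirname = dirnameFromModuleUrl(import.meta.url)
```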

* fix(server): make MCP HTTP server survive parent process lifecycle (#43)

## Problem
The MCP HTTP server process died when:
1. User closed the Settings/Agent dialog in the UI
2. User interacted with the editor canvas
3. Nitro server hot-reloaded or restarted
4. Electron app sent SIGTERM to Nitro on window close

This made the MCP server unusable for HTTP transport, requiring users
to keep the settings dialog open constantly.

## Root Cause Analysis
The MCP server was spawned as a regular child process without detached
mode. This created several fatal dependencies:

1. **Signal propagation**: Lines 29-34 registered handlers for SIGINT,
   SIGTERM, SIGHUP that killed the MCP process when the parent received
   these signals (e.g., Electron closing Nitro)

2. **Parent-child lifecycle binding**: Without detached mode, the child
   process is tied to the parent's event loop. Any parent instability
   (hot reload, dialog closure causing microtasks, etc.) could affect
   the child

3. **In-memory process tracking**: The mcpProcess variable was stored
   in module memory, so Nitro restarts/hot-reloads would lose track of
   the running server

## Solution
1. **Detached spawn mode**: Use `detached: true` + `unref()` to make
   the MCP process completely independent of the parent

2. **PID file persistence**: Track the process via files in /tmp
   instead of in-memory variables, surviving parent restarts

3. **Remove signal handlers**: Delete the SIGINT/SIGTERM/SIGHUP
   handlers that were killing the MCP process

4. **Cross-platform kill**: Use process.kill() instead of execSync
   with taskkill for safer process termination
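
The four points above can be sketched roughly as follows. This is an illustration under assumptions, not the rewritten `mcp-server-manager.ts`: the PID file name and all function names are hypothetical.

```typescript
import { spawn } from 'node:child_process'
import { existsSync, readFileSync, rmSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Hypothetical PID file path; the commit only says the files live in /tmp.
const PID_FILE = join(tmpdir(), 'example-mcp-server.pid')

/** Signal 0 delivers nothing; it only reports whether the PID exists. */
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0)
    return true
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === 'EPERM'
  }
}

/** Spawn a process that outlives the parent and record its PID on disk. */
function startDetached(command: string, args: string[]): number | undefined {
  const child = spawn(command, args, { detached: true, stdio: 'ignore' })
  // Spawn failures (e.g. command not found) arrive asynchronously.
  child.on('error', () => rmSync(PID_FILE, { force: true }))
  child.unref() // let the parent's event loop exit without waiting
  if (child.pid !== undefined) writeFileSync(PID_FILE, String(child.pid))
  return child.pid
}

/** Recover the tracked PID after a parent restart, ignoring stale files. */
function readTrackedPid(): number | undefined {
  if (!existsSync(PID_FILE)) return undefined
  const pid = Number.parseInt(readFileSync(PID_FILE, 'utf-8').trim(), 10)
  return Number.isInteger(pid) && pid > 0 && isProcessAlive(pid) ? pid : undefined
}

/** Stop the tracked process with process.kill() and remove the PID file. */
function stopTracked(): void {
  const pid = readTrackedPid()
  if (pid !== undefined) process.kill(pid)
  rmSync(PID_FILE, { force: true })
}
```

Because the PID lives on disk, a restarted parent can call `readTrackedPid()` to find a server it did not spawn itself, which is the property the in-memory `mcpProcess` variable lacked.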

## Testing
### Test Case 1: Settings Dialog Close
- Open OpenPencil → Settings → Start MCP HTTP Server
- Close settings dialog
- Reopen settings → MCP server still running

### Test Case 2: Editor Interaction
- Start MCP HTTP Server
- Draw on canvas, create shapes, use all tools
- Check MCP server status → Still running

### Test Case 3: Browser Version
- Run `npx vite --port 3000`
- Start MCP from browser UI
- Close tab, reopen → Server still running

### Test Case 4: Electron App
- Start MCP from installed app
- Close app window
- Reopen app → Server still running (tracked via PID file)

### Test Case 5: Stop/Restart Cycle
- Start → Stop → Start again
- Works correctly

## Files Changed
- server/utils/mcp-server-manager.ts - Complete rewrite with detached mode
- server/api/ai/mcp-install.ts - Add ESM __dirname polyfill

Fixes #37

Co-authored-by: Kayshen Xu <kayshen.xu@gmail.com>

* fix(canvas): respect node-level theme overrides during variable resolution (#41)

Previously, resolveNodeForCanvas was called with a single activeTheme
for all nodes, ignoring any per-node theme property. This meant nodes
with theme: {"Mode": "Light"} would still render using Dark mode colors.

This fix merges each node's theme property with the default activeTheme,
allowing individual nodes to override theme axes for proper themed
variable resolution.

Fixes design documents where some panels need different theme variants
than the default (e.g., Light mode components in a Dark mode document).
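
The merge described above amounts to overlaying the node's theme axes on the document default. A minimal sketch, assuming themes are flat axis-to-value maps; `Theme` and `resolveNodeTheme` are hypothetical names:

```typescript
type Theme = Record<string, string>

/** Overlay a node's own theme axes on top of the document's active theme. */
function resolveNodeTheme(activeTheme: Theme, nodeTheme?: Theme): Theme {
  // Axes the node does not mention keep the document-level value.
  return nodeTheme ? { ...activeTheme, ...nodeTheme } : activeTheme
}
```

With an active theme of `{ Mode: 'Dark', Density: 'Compact' }` and a node theme of `{ Mode: 'Light' }`, the node resolves variables against `{ Mode: 'Light', Density: 'Compact' }`.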

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Kayshen Xu <kayshen.xu@gmail.com>

* feat(ai): enhance text processing by stripping tool call blocks

- Added a new function to strip fake tool call blocks emitted by basic-tier models, ensuring cleaner JSON extraction.
- Updated the existing text processing flow to incorporate this new function, improving the handling of non-standard model artifacts.
- Modified orchestrator prompts to disallow function calls and tool call syntax in JSON outputs, enhancing response consistency.

* fix(server): resolve merge conflict and clean up ESM __dirname polyfill

- Removed merge conflict markers from mcp-install.ts and mcp-server-manager.ts.
- Ensured consistent implementation of ESM-compatible __dirname polyfill across both files.

* feat(ai): add node availability check and HTTP fallback for MCP server installation

- Implemented a function to check for the availability of Node.js on the system, enhancing compatibility for environments without Node.js.
- Added a fallback mechanism to install the MCP server using an HTTP URL when Node.js is not found, ensuring functionality remains intact.
- Updated the installation response to include a flag indicating if the HTTP fallback was used, allowing the UI to reflect the server status accurately.
- Enhanced the agent settings dialog to synchronize the MCP server status when the HTTP server is auto-started.

* refactor(canvas): replace Fabric.js with CanvasKit/Skia for enhanced rendering performance

- Removed Fabric.js dependencies and related code, transitioning to CanvasKit/Skia as the primary canvas engine.
- Updated documentation and README files to reflect the new canvas technology.
- Adjusted data flow and component interactions to accommodate the new rendering engine.
- Ensured backward compatibility by retaining legacy Fabric.js files for specific utilities until full removal is feasible.

* refactor(canvas): update layout engine and remove Fabric.js dependencies

- Enhanced the canvas layout engine by integrating Skia and CanvasKit, replacing Fabric.js references.
- Improved type safety by refining type assertions and imports across various components.
- Added a new function to retrieve canvas dimensions, defaulting to 800x600 if no engine is mounted.
- Removed obsolete rendering logic related to Fabric.js, streamlining the layout indicator functionality.
- Updated multiple components to utilize the new canvas size retrieval method, ensuring consistent behavior across the application.

* fix(figma): resolve multiple .fig import rendering issues

- Fix layout properties in preserve mode: only apply auto-layout (gap,
  padding, justify, align) for frames with stackMode, use absolute
  positioning for non-auto-layout frames
- Reverse children order for auto-layout frames in preserve mode to
  match Figma's layout flow order (tree builder sorts descending for
  z-stacking but layout needs ascending)
- Fix instance inlining: always inline symbol content when instance has
  no local children, regardless of override/derived data presence
- Fix derived data mapping: use direct GUID matching (Strategy 0) when
  derived guidPath GUIDs are actual symbol node GUIDs, preventing the
  off-by-one size/transform misassignment from index-based mapping
- Add styleIdForFill/styleIdForStrokeFill resolution: resolve fill style
  references to inline paints before tree building, fixing missing fills
  on 448+ nodes including active menu items and styled elements
- Fix collectImageBlobs to detect JPEG/GIF/WebP in addition to PNG
- Fix instance override application: apply overrides even when derived
  data is absent

* fix(figma): skip opacity=0 nodes during import to fix breadcrumb visibility

Nodes with opacity <= 0 are now skipped in convertChildren, along with
all their descendants. This fixes the breadcrumb "Page / Page / Page"
text showing through despite the parent items having opacity=0, since
the Skia renderer does not propagate parent opacity to child nodes.
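
A small sketch of the skipping rule, assuming a simplified node shape. `ImportNode` and `dropInvisible` are illustrative names; the real logic lives in `convertChildren`.

```typescript
// Hypothetical, simplified node shape for illustration only.
interface ImportNode {
  name: string
  opacity?: number
  children?: ImportNode[]
}

/** Drop nodes with opacity <= 0; their descendants are dropped with them. */
function dropInvisible(nodes: ImportNode[]): ImportNode[] {
  return nodes
    .filter((n) => (n.opacity ?? 1) > 0) // a missing opacity counts as fully visible
    .map((n) => (n.children ? { ...n, children: dropInvisible(n.children) } : n))
}
```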

* fix(figma): resolve styleIdForFill in instance override entries

Style references (styleIdForFill, styleIdForStrokeFill) inside
symbolOverrides were not being resolved to inline paints. This caused
text nodes in overridden instances (e.g. "All Products" card on blue
background) to retain the symbol's default colors instead of the
instance-specific white fills.

The resolveStyleReferences pre-processing step now iterates into each
node's symbolData.symbolOverrides array and resolves style references
there as well.

* fix(figma): apply icon colors and fix arc rendering in .fig import

- Fix donut chart missing segment: swap start/end angles when
  endingAngle < startingAngle instead of adding 2π, producing correct
  clockwise arc equivalents for counter-clockwise Figma arcs

- Fix icon stroke thickness: scale strokeWeight proportionally in
  scaleTreeChildren, applyInstanceOverrides derived sizing, and
  convertVector for Lucide icons rendered smaller than 24×24

- Fix nested instance color overrides: build nestedOverrideMap from
  multi-guid override paths (e.g. instanceGuid/childGuid) and inject
  child-scoped overrides into nested INSTANCE symbolOverrides, enabling
  Dashboard Summary Card icons to receive white/blue stroke colors

- Fix override-only instances: when derivedSymbolData is empty but
  symbolOverrides exist (e.g. sidebar icon instances with stroke color
  overrides but no size changes), fall through to direct GUID matching
  so stroke paints are correctly applied

- Include strokePaints in hasVisualOverrides check so instances with
  stroke-only overrides are properly inlined with their symbol children
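
The arc fix in the first bullet can be illustrated as below, assuming angles in radians. `ArcAngles` and `normalizeArcAngles` are hypothetical names, and the sketch covers only the swap itself, not how the importer draws the arc.

```typescript
interface ArcAngles {
  startAngle: number
  endAngle: number
}

/**
 * When the ending angle is smaller than the starting angle, swap the two
 * instead of adding 2π, so the result always runs from low to high.
 */
function normalizeArcAngles(startingAngle: number, endingAngle: number): ArcAngles {
  return endingAngle < startingAngle
    ? { startAngle: endingAngle, endAngle: startingAngle }
    : { startAngle: startingAngle, endAngle: endingAngle }
}
```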

* feat(canvas): add vector text rendering with bundled fonts for Figma import

Replace bitmap text fallback with CanvasKit Paragraph API for true vector
text rendering. Bundle 11 popular font families locally via @fontsource
packages to ensure reliable rendering in regions where Google Fonts CDN
is unavailable. Key changes:

- SkiaFontManager: local-first font loading with Google Fonts fallback
- Paragraph API rendering with caching, styled segments, and alignment
- Fixed-width text tolerance to prevent unwanted line wrapping from font
  metric differences between Figma and CanvasKit
- Auto-width text manual alignment offset (center/right) to avoid
  infinite layout width issues
- Figma text mapper: support RAW lineHeight units
- Figma import dialog: pre-load fonts after conversion

* fix(canvas): align text centering with CanvasKit and improve font fallback

- Replace Fabric.js FONT_SIZE_MULT (1.13) with actual paragraph height
  (fontSize * lineHeight) for cross-axis text centering, fixing icon
  vertical misalignment in horizontal layouts
- Remove Fabric-specific optical correction (getTextOpticalCenterYOffset)
  since CanvasKit halfLeading handles this correctly
- Add font fallback chain with separate Inter Ext family for latin-ext
  glyph coverage (₦ U+20A6 etc.)
- Cache failed font loads to prevent repeated fetch attempts per frame
- Skip Google Fonts requests for known system/proprietary fonts
  (PingFang SC, Microsoft YaHei, D-DIN-PRO, etc.)

* fix(figma): recursive nested instance expansion and CJK font support

- Fix 3 bugs in applyInstanceOverrides() that broke deeply nested Figma
  component instances (6+ levels): propagate multi-guid derivedSymbolData
  to nested instances, resolve virtual GUIDs via pkToNodeGuid map, and
  build nestedDerivedMap alongside nestedOverrideMap for recursive expansion
- Bundle Noto Sans SC (chinese-simplified + latin subsets, 400/700 weights)
  for offline CJK vector text rendering without Google Fonts dependency
- Add Noto Sans SC to font fallback chain for CJK glyph coverage
- Add China CDN mirror (fonts.font.im) as fallback when Google Fonts is
  inaccessible, with 4s timeout on primary CDN
- Increase .fig file size limits to 150MB compressed / 300MB decompressed

* fix(canvas): enhance CJK font support and pre-load Noto Sans SC

- Update SkiaEngine to pre-load Noto Sans SC alongside Inter for improved CJK glyph coverage in the font fallback chain, preventing rendering issues with system fonts.
- Modify FigmaImportDialog to ensure Noto Sans SC is included when primary fonts are system fonts, ensuring proper rendering of CJK text.

* refactor(figma): optimize instance override handling with full DFS strategy

- Replace hybrid skip-INSTANCE and expanded DFS with a unified full DFS approach for handling all node types, including INSTANCE nodes.
- Update the mapping logic to ensure correct localID assignments and improve the handling of overflow entries during instance expansion.
- Enhance the overall structure and readability of the applyInstanceOverrides function by consolidating the traversal methods.

* refactor(figma): improve layout mapping and instance override logic

- Enhance the mapFigmaLayout function to conditionally set gap based on stackSpacing and justifyContent, ensuring compatibility with Figma's layout behavior.
- Simplify the applyInstanceOverrides function by removing redundant logic and focusing on a unified full DFS approach for node mapping, improving clarity and performance.
- Ensure derivedSymbolData is correctly replaced with nested data from outer instances, preventing incorrect merging of entries.

* refactor(figma): enhance instance override logic with root-inclusive DFS

- Introduce a root-inclusive depth-first search (DFS) to accurately detect and skip overrides targeting the symbol root during instance processing.
- Update the applyInstanceOverrides function to improve clarity and performance by refining the handling of derived data and overrides.
- Ensure that overrides targeting the root are excluded from child nodes, preventing incorrect application of overrides in nested instances.

---------

Co-authored-by: Fini <fini.yang@gmail.com>
Co-authored-by: Related8919 <191752213+Related8919@users.noreply.github.com>
Co-authored-by: Hrijul Dey <44521405+hr1juldey@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
2026-03-15 10:01:35 +08:00

753 lines
28 KiB
TypeScript

import { defineEventHandler, readBody, setResponseHeaders } from 'h3'
import { readFile, writeFile, mkdtemp, rm } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
import { resolveClaudeCli } from '../../utils/resolve-claude-cli'
import { runCodexExec } from '../../utils/codex-client'
import {
  buildClaudeAgentEnv,
  getClaudeAgentDebugFilePath,
} from '../../utils/resolve-claude-agent-env'
/** Pattern for detecting sensitive data in debug log output */
export const SENSITIVE_LOG_PATTERN = /ANTHROPIC_API_KEY=|Authorization:\s*Bearer|api[_-]?key\s*[:=]/i
/** Allowed media types for image attachments */
export const ALLOWED_MEDIA_TYPES = new Set(['image/png', 'image/jpeg', 'image/gif', 'image/webp'])
/** Resolve file extension from media type, falling back to 'png' for disallowed types */
export function resolveMediaExtension(mediaType: string): string {
  return ALLOWED_MEDIA_TYPES.has(mediaType) ? mediaType.split('/')[1] : 'png'
}
interface ChatAttachmentWire {
  name: string
  mediaType: string
  data: string // base64
}
interface ChatBody {
  system: string
  messages: Array<{ role: 'user' | 'assistant'; content: string; attachments?: ChatAttachmentWire[] }>
  model?: string
  provider?: 'anthropic' | 'openai' | 'opencode' | 'copilot'
  thinkingMode?: 'adaptive' | 'disabled' | 'enabled'
  thinkingBudgetTokens?: number
  effort?: 'low' | 'medium' | 'high' | 'max'
}
/** Read the last `maxLines` non-empty lines of the debug log, dropping lines that match SENSITIVE_LOG_PATTERN. */
async function readDebugTail(path?: string, maxLines = 40): Promise<string[] | undefined> {
  if (!path) return undefined
  try {
    const raw = await readFile(path, 'utf-8')
    const lines = raw.split('\n').filter((l) => l.trim().length > 0)
    const sanitized = lines.filter((l) => !SENSITIVE_LOG_PATTERN.test(l))
    return sanitized.slice(-maxLines)
  } catch {
    return undefined
  }
}
/** Append troubleshooting hints to an exit-code-1 error, based on known patterns in the debug log tail. */
function buildClaudeExitHint(rawError: string, debugTail?: string[]): string | undefined {
  if (!/process exited with code 1/i.test(rawError)) return undefined
  if (!debugTail || debugTail.length === 0) return undefined
  const text = debugTail.join('\n')
  const hints: string[] = []
  if (/Failed to save config with lock: Error: EPERM|operation not permitted, .*\.claude\.json/i.test(text)) {
    hints.push('Claude Code cannot write ~/.claude.json in the current runtime (permission denied).')
  }
  if (/Connection error|Could not resolve host|Failed to connect/i.test(text)) {
    hints.push('Upstream API connection failed (check proxy/DNS/network reachability to your ANTHROPIC_BASE_URL).')
  }
  if (/ANTHROPIC_CUSTOM_HEADERS present: false, has Authorization header: false/i.test(text)) {
    hints.push(
      'No API auth header detected. Run "claude login" to authenticate, ' +
      'or set ANTHROPIC_API_KEY in ~/.claude/settings.json ' +
      '(env: { "ANTHROPIC_API_KEY": "sk-..." }).',
    )
  }
  if (hints.length === 0) return undefined
  return `${rawError}\n${hints.join(' ')}`
}
/**
 * Streaming chat endpoint.
 * Routes to the appropriate provider SDK based on the `provider` field.
 * Requires explicit provider and model; no fallback routing.
 */
export default defineEventHandler(async (event) => {
  const body = await readBody<ChatBody>(event)
  if (!body?.messages || !body?.system) {
    setResponseHeaders(event, { 'Content-Type': 'application/json' })
    return { error: 'Missing required fields: system, messages' }
  }
  if (!body.provider) {
    setResponseHeaders(event, { 'Content-Type': 'application/json' })
    return { error: 'Missing provider. Provider fallback is disabled.' }
  }
  if (!body.model?.trim()) {
    setResponseHeaders(event, { 'Content-Type': 'application/json' })
    return { error: 'Missing model. Model fallback is disabled.' }
  }
  if (body.provider !== 'anthropic' && body.provider !== 'openai' && body.provider !== 'opencode' && body.provider !== 'copilot') {
    setResponseHeaders(event, { 'Content-Type': 'application/json' })
    return { error: 'Missing or unsupported provider. Provider fallback is disabled.' }
  }
  setResponseHeaders(event, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  })
  if (body.provider === 'anthropic') return streamViaAgentSDK(body, body.model)
  if (body.provider === 'opencode') return streamViaOpenCode(body, body.model)
  if (body.provider === 'copilot') return streamViaCopilot(body, body.model)
  return streamViaCodex(body, body.model)
})
// Keep-alive ping interval (ms) — prevents client timeout while waiting for API TTFT
const KEEPALIVE_INTERVAL_MS = 15_000
/** Translate the request's thinking settings into the Agent SDK `thinking` option; undefined when no mode is set. */
function getAgentThinkingConfig(body: ChatBody):
  | { type: 'adaptive' | 'disabled' }
  | { type: 'enabled'; budgetTokens?: number }
  | undefined {
  if (!body.thinkingMode) return undefined
  if (body.thinkingMode === 'enabled') {
    return { type: 'enabled', budgetTokens: body.thinkingBudgetTokens }
  }
  return { type: body.thinkingMode }
}
/**
 * Save base64 attachments to temp files. Returns { tempDir, files[] }; the caller must clean up tempDir.
 *
 * When `insideProject` is true, files are saved under `.openpencil-tmp/` in the
 * current working directory so that Claude Code Agent SDK (which restricts reads
 * to the project directory in plan mode) can access them.
 */
async function saveAttachmentsToTempFiles(
  attachments: ChatAttachmentWire[],
  insideProject = false,
): Promise<{ tempDir: string; files: string[] }> {
  let tempDir: string
  if (insideProject) {
    const { mkdirSync, chmodSync } = await import('node:fs')
    const baseDir = join(process.cwd(), '.openpencil-tmp')
    mkdirSync(baseDir, { recursive: true, mode: 0o700 })
    chmodSync(baseDir, 0o700)
    tempDir = await mkdtemp(join(baseDir, 'attach-'))
  } else {
    tempDir = await mkdtemp(join(tmpdir(), 'openpencil-attach-'))
  }
  const files: string[] = []
  for (const att of attachments) {
    const ext = resolveMediaExtension(att.mediaType)
    const filePath = join(tempDir, `${files.length}.${ext}`)
    await writeFile(filePath, Buffer.from(att.data, 'base64'))
    files.push(filePath)
  }
  return { tempDir, files }
}
/** Collect all attachments from the last user message */
function getLastUserAttachments(body: ChatBody): ChatAttachmentWire[] {
  const lastUser = [...body.messages].reverse().find((m) => m.role === 'user')
  return lastUser?.attachments ?? []
}
/**
 * Strip "NEVER use tools" and similar instructions from system prompt
 * when we need Claude Code Agent SDK to use its Read tool for image analysis.
 */
function stripNoToolsRestriction(systemPrompt: string): string {
  return systemPrompt
    .replace(/^.*NEVER use tools.*$/gim, '')
    .replace(/\n{3,}/g, '\n\n')
}
/** Stream via Claude Agent SDK (uses local Claude Code OAuth login, no API key needed) */
function streamViaAgentSDK(body: ChatBody, model?: string) {
  const stream = new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder()
      // Send keep-alive pings until the first real chunk arrives
      const pingTimer = setInterval(() => {
        try {
          controller.enqueue(encoder.encode(`data: ${JSON.stringify({ type: 'ping', content: '' })}\n\n`))
        } catch { /* stream already closed */ }
      }, KEEPALIVE_INTERVAL_MS)
      let debugFile: string | undefined
      let attachTempDir: string | undefined
      try {
        const { query } = await import('@anthropic-ai/claude-agent-sdk')
        // Build prompt from the last user message
        const lastUserMsg = [...body.messages].reverse().find((m) => m.role === 'user')
        let prompt = lastUserMsg?.content ?? ''
        // If the last user message has image attachments, save to temp files
        // inside the project directory so Claude Code has read permission.
        const attachments = getLastUserAttachments(body)
        const hasImageAttachments = attachments.length > 0
        if (hasImageAttachments) {
          const saved = await saveAttachmentsToTempFiles(attachments, true)
          attachTempDir = saved.tempDir
          const imageRefs = saved.files.map((f) =>
            `First, use the Read tool to read the image file at "${f}". Then analyze it and respond to the user.`,
          ).join('\n')
          prompt = imageRefs + '\n\n' + (prompt || 'Describe what you see in the image.')
        }
        // Remove CLAUDECODE env to allow running from within a CC terminal
        const env = buildClaudeAgentEnv()
        debugFile = getClaudeAgentDebugFilePath()
        const claudePath = resolveClaudeCli()
        const thinking = getAgentThinkingConfig(body)
        // When images are attached, strip the "NEVER use tools" restriction from
        // the system prompt so Claude Code will use its Read tool to view images.
        const effectiveSystemPrompt = hasImageAttachments
          ? stripNoToolsRestriction(body.system)
          : body.system
        // When images are attached, use result-based flow (like validate.ts):
        // let Claude Code read the image via its Read tool internally, then
        // only emit the final result text. This avoids streaming intermediate
        // tool-use preamble like "I need to read the file first".
        if (hasImageAttachments) {
          const runImageQuery = async (): Promise<string> => {
            const q = query({
              prompt,
              options: {
                systemPrompt: effectiveSystemPrompt,
                ...(model ? { model } : {}),
                maxTurns: 3,
                plugins: [],
                permissionMode: 'plan',
                persistSession: false,
                ...(body.effort ? { effort: body.effort } : {}),
                ...(thinking ? { thinking } : {}),
                env,
                ...(debugFile ? { debugFile } : {}),
                ...(claudePath ? { pathToClaudeCodeExecutable: claudePath } : {}),
              },
            })
            try {
              for await (const message of q) {
                if (message.type === 'result') {
                  const isErrorResult = 'is_error' in message && Boolean((message as { is_error?: boolean }).is_error)
                  if (message.subtype === 'success' && !isErrorResult) {
                    return message.result ?? ''
                  }
                  const errors = 'errors' in message ? (message.errors as string[]) : []
                  const resultText = 'result' in message ? String(message.result ?? '') : ''
                  const errContent = errors.join('; ') || resultText || `Query ended with: ${message.subtype}`
                  throw new Error(errContent)
                }
              }
              return ''
            } finally {
              q.close()
            }
          }
          const resultText = await runImageQuery()
          clearInterval(pingTimer)
          if (resultText) {
            controller.enqueue(
              encoder.encode(`data: ${JSON.stringify({ type: 'text', content: resultText })}\n\n`),
            )
          }
        } else {
          // Normal text-only chat: stream partial messages as before
          const runQuery = async () => {
            const q = query({
              prompt,
              options: {
                systemPrompt: effectiveSystemPrompt,
                ...(model ? { model } : {}),
                maxTurns: 1,
                includePartialMessages: true,
                tools: [],
                plugins: [],
                permissionMode: 'plan',
                persistSession: false,
                ...(body.effort ? { effort: body.effort } : {}),
                ...(thinking ? { thinking } : {}),
                env,
                ...(debugFile ? { debugFile } : {}),
                ...(claudePath ? { pathToClaudeCodeExecutable: claudePath } : {}),
              },
            })
            try {
              for await (const message of q) {
                if (message.type === 'stream_event') {
                  const ev = message.event
                  if (ev.type === 'content_block_delta') {
                    if (ev.delta.type === 'text_delta') {
                      clearInterval(pingTimer)
                      const data = JSON.stringify({ type: 'text', content: ev.delta.text })
                      controller.enqueue(encoder.encode(`data: ${data}\n\n`))
                    } else if (ev.delta.type === 'thinking_delta') {
                      // Keep pings alive during thinking; only stop on text output
                      const data = JSON.stringify({ type: 'thinking', content: (ev.delta as any).thinking })
                      controller.enqueue(encoder.encode(`data: ${data}\n\n`))
                    }
                  }
                } else if (message.type === 'result') {
                  const isErrorResult = 'is_error' in message && Boolean((message as { is_error?: boolean }).is_error)
                  if (message.subtype !== 'success' || isErrorResult) {
                    const errors = 'errors' in message ? (message.errors as string[]) : []
                    const resultText = 'result' in message ? String(message.result ?? '') : ''
                    const content = errors.join('; ') || resultText || `Query ended with: ${message.subtype}`
                    controller.enqueue(
                      encoder.encode(`data: ${JSON.stringify({ type: 'error', content })}\n\n`),
                    )
                  }
                }
              }
            } finally {
              q.close()
            }
          }
          await runQuery()
        }
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify({ type: 'done', content: '' })}\n\n`),
        )
      } catch (error) {
        const rawContent = error instanceof Error ? error.message : 'Unknown error'
        const tail = await readDebugTail(debugFile)
        const hintedContent = buildClaudeExitHint(rawContent, tail)
        const content = hintedContent ?? rawContent
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify({ type: 'error', content })}\n\n`),
        )
      } finally {
        clearInterval(pingTimer)
        if (attachTempDir) {
          rm(attachTempDir, { recursive: true, force: true }).catch(() => {})
        }
        controller.close()
      }
    },
  })
  return new Response(stream)
}
/** Error name → user-friendly label mapping */
const OPENCODE_ERROR_LABELS: Record<string, string> = {
  APIError: 'API error',
  ProviderAuthError: 'Authentication failed',
  UnknownError: 'Unknown error',
  MessageOutputLengthError: 'Response too long',
  MessageAbortedError: 'Request aborted',
  StructuredOutputError: 'Output format error',
  ContextOverflowError: 'Context too long',
}
/**
 * Extract a human-readable message from an OpenCode error object.
 * Handles structured errors like { name: "APIError", data: { message: "..." } }
 * and nested JSON in message strings.
 */
export function formatOpenCodeError(error: unknown): string {
  if (!error) return 'Unknown error'
  if (typeof error === 'string') return error
  const err = error as Record<string, any>
  // Structured OpenCode error: { name, data: { message, ... } }
  if (err.name && err.data?.message) {
    const label = OPENCODE_ERROR_LABELS[err.name] ?? err.name
    let msg: string = err.data.message
    // Try to extract nested error message from JSON in the message string
    // e.g. 'Unauthorized: {"error":{"code":"invalid_api_key","message":"invalid access token"}}'
    const jsonStart = msg.indexOf('{')
    if (jsonStart > 0) {
      try {
        const nested = JSON.parse(msg.slice(jsonStart))
        const nestedMsg = nested?.error?.message ?? nested?.message
        if (nestedMsg) {
          const prefix = msg.slice(0, jsonStart).replace(/:\s*$/, '').trim()
          msg = prefix ? `${prefix}: ${nestedMsg}` : nestedMsg
        }
      } catch { /* not JSON, use as-is */ }
    }
    // Separate the label from the message so they do not run together
    return `${label}: ${msg}`
  }
  // Plain { message } object
  if (err.message) return err.message
  // Fallback: truncated JSON
  const json = JSON.stringify(error)
  return json.length > 200 ? json.slice(0, 200) + '…' : json
}
/** Parse an OpenCode model string ("providerID/modelID") into its parts */
function parseOpenCodeModel(model?: string): { providerID: string; modelID: string } | undefined {
if (!model || !model.includes('/')) return undefined
const idx = model.indexOf('/')
return { providerID: model.slice(0, idx), modelID: model.slice(idx + 1) }
}
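/*
For reference, a small sketch of the split that parseOpenCodeModel performs.
`splitModel` is a hypothetical copy used for illustration, and the model
strings are made-up examples. Because the split happens at the first slash
only, model IDs that themselves contain slashes stay intact.

```typescript
// Hypothetical standalone copy of the "providerID/modelID" split.
function splitModel(model?: string): { providerID: string; modelID: string } | undefined {
  if (!model || !model.includes('/')) return undefined
  const idx = model.indexOf('/')
  return { providerID: model.slice(0, idx), modelID: model.slice(idx + 1) }
}

console.log(splitModel('acme/fast-1'))      // { providerID: 'acme', modelID: 'fast-1' }
console.log(splitModel('router/acme/fast')) // { providerID: 'router', modelID: 'acme/fast' }
console.log(splitModel('fast-1'))           // undefined
```
*/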
/** Map ChatBody effort to the effort value sent to OpenCode; 'max' is sent as 'high' */
function mapOpenCodeEffort(
effort?: 'low' | 'medium' | 'high' | 'max',
): 'low' | 'medium' | 'high' | undefined {
if (!effort) return undefined
if (effort === 'max') return 'high'
return effort
}
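/**
* Build the optional `reasoning` payload for an OpenCode prompt from the
* request's effort, thinking mode, and thinking token budget.
* Returns undefined when none of these are set, so the key can be omitted.
*/
function buildOpenCodeReasoning(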
body: ChatBody,
): Record<string, unknown> | undefined {
const reasoning: Record<string, unknown> = {}
const effort = mapOpenCodeEffort(body.effort)
if (effort) {
reasoning.effort = effort
}
if (body.thinkingMode === 'enabled') {
reasoning.enabled = true
} else if (body.thinkingMode === 'disabled') {
reasoning.enabled = false
}
if (typeof body.thinkingBudgetTokens === 'number' && body.thinkingBudgetTokens > 0) {
reasoning.budgetTokens = body.thinkingBudgetTokens
}
return Object.keys(reasoning).length > 0 ? reasoning : undefined
}
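/**
* Wrap an async generator with a deadline: yields values until the stream ends
* or `timeoutPromise` resolves, whichever comes first.
* The promise is created once by the caller, so this is an overall deadline for
* the whole stream, not a per-item idle timeout. The wrapped generator is not
* closed when the deadline fires.
*/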
async function* streamWithTimeout<T>(
stream: AsyncGenerator<T>,
timeoutPromise: Promise<{ done: true; value: undefined }>,
): AsyncGenerator<T> {
while (true) {
const result = await Promise.race([
stream.next(),
timeoutPromise,
]) as IteratorResult<T>
if (result.done) break
yield result.value
}
}
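/**
* Stream via Codex CLI. The CLI call is awaited as a whole, so the client
* receives keepalive pings while it runs, then a single text event with the
* full result (no incremental deltas).
*/
function streamViaCodex(body: ChatBody, model?: string) {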
const stream = new ReadableStream({
async start(controller) {
const encoder = new TextEncoder()
const pingTimer = setInterval(() => {
try {
controller.enqueue(encoder.encode(`data: ${JSON.stringify({ type: 'ping', content: '' })}\n\n`))
} catch { /* stream already closed */ }
}, KEEPALIVE_INTERVAL_MS)
let attachTempDir: string | undefined
try {
const lastUserMsg = [...body.messages].reverse().find((m) => m.role === 'user')
const prompt = lastUserMsg?.content ?? ''
// Save image attachments to temp files for Codex CLI
const attachments = getLastUserAttachments(body)
let imageFiles: string[] | undefined
if (attachments.length > 0) {
const saved = await saveAttachmentsToTempFiles(attachments)
attachTempDir = saved.tempDir
imageFiles = saved.files
}
const result = await runCodexExec(prompt, {
model,
systemPrompt: body.system,
thinkingMode: body.thinkingMode,
thinkingBudgetTokens: body.thinkingBudgetTokens,
effort: body.effort,
imageFiles,
})
clearInterval(pingTimer)
if (result.error) {
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'error', content: result.error })}\n\n`),
)
return
}
if (result.text) {
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'text', content: result.text })}\n\n`),
)
}
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'done', content: '' })}\n\n`),
)
} catch (error) {
const content = error instanceof Error ? error.message : 'Unknown error'
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'error', content })}\n\n`),
)
} finally {
clearInterval(pingTimer)
if (attachTempDir) {
rm(attachTempDir, { recursive: true, force: true }).catch(() => {})
}
controller.close()
}
},
})
return new Response(stream)
}
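/*
Every branch in these streamers writes the same wire format: one Server-Sent
Events frame per chunk, i.e. a `data:` line carrying a JSON object with `type`
and `content`, terminated by a blank line. A minimal sketch of that framing
follows. `sseFrame` and `parseSseFrames` are hypothetical helpers, not
functions in this module, and the chunk type union lists only the types that
appear in this file.

```typescript
type ChunkType = 'ping' | 'text' | 'thinking' | 'error' | 'done'
interface Chunk { type: ChunkType; content: string }

// Serialize one chunk as an SSE frame.
function sseFrame(chunk: Chunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`
}

// Split a buffer of complete frames back into chunks (client-side view).
function parseSseFrames(buffer: string): Chunk[] {
  return buffer
    .split('\n\n')
    .filter((frame) => frame.startsWith('data: '))
    .map((frame) => JSON.parse(frame.slice('data: '.length)) as Chunk)
}

const wire = sseFrame({ type: 'text', content: 'hello' }) + sseFrame({ type: 'done', content: '' })
console.log(parseSseFrames(wire).map((c) => c.type)) // [ 'text', 'done' ]
```

JSON.stringify escapes newlines inside `content`, so a frame never contains a
blank line of its own and splitting on "\n\n" is safe for this payload shape.
*/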
/** Stream via OpenCode SDK using event subscription for real-time streaming */
function streamViaOpenCode(body: ChatBody, model?: string) {
const stream = new ReadableStream({
async start(controller) {
const encoder = new TextEncoder()
const pingTimer = setInterval(() => {
try {
controller.enqueue(encoder.encode(`data: ${JSON.stringify({ type: 'ping', content: '' })}\n\n`))
} catch { /* stream already closed */ }
}, KEEPALIVE_INTERVAL_MS)
let ocServer: { close(): void } | undefined
try {
const { getOpencodeClient } = await import('../../utils/opencode-client')
const oc = await getOpencodeClient()
const ocClient = oc.client
ocServer = oc.server
// Create a session for this conversation
const { data: session, error: sessionError } = await ocClient.session.create({
title: 'OpenPencil Chat',
})
if (sessionError || !session) {
throw new Error(`Failed to create OpenCode session: ${formatOpenCodeError(sessionError)}`)
}
// Inject system prompt as context (no AI reply)
await ocClient.session.prompt({
sessionID: session.id,
noReply: true,
parts: [{ type: 'text', text: body.system }],
})
// Build prompt from the last user message
const lastUserMsg = [...body.messages].reverse().find((m) => m.role === 'user')
const prompt = lastUserMsg?.content ?? ''
const parsed = parseOpenCodeModel(model)
if (model && !parsed) {
console.warn(`[AI] OpenCode: could not parse model string "${model}", sending without model override`)
}
// Build parts array, adding image attachments if present
const attachments = getLastUserAttachments(body)
const parts: Array<Record<string, unknown>> = [
...attachments.map((a) => ({
type: 'image',
url: `data:${a.mediaType};base64,${a.data}`,
})),
{ type: 'text', text: prompt || 'Analyze these images.' },
]
console.log(`[AI] OpenCode streaming prompt: model=${model}, parsed=${JSON.stringify(parsed)}`)
// Build prompt payload with optional model and reasoning
const promptPayload: Record<string, unknown> = {
sessionID: session.id,
...(parsed ? { model: parsed } : {}),
parts,
}
const reasoning = buildOpenCodeReasoning(body)
if (reasoning) {
promptPayload.reasoning = reasoning
}
// Subscribe to event stream for real-time deltas
const eventResult = await ocClient.event.subscribe()
const eventStream = eventResult.stream
// Send prompt asynchronously — response comes via events
const { error: asyncError } = await ocClient.session.promptAsync(promptPayload as any)
if (asyncError) {
const detail = formatOpenCodeError(asyncError)
console.error('[AI] OpenCode promptAsync error:', detail)
throw new Error(detail)
}
// Consume event stream, forwarding text deltas to client
let emittedText = false
const sessionId = session.id
const STREAM_TIMEOUT_MS = 180_000
const timeoutPromise = new Promise<{ done: true; value: undefined }>((resolve) =>
setTimeout(() => resolve({ done: true, value: undefined }), STREAM_TIMEOUT_MS),
)
for await (const event of streamWithTimeout(eventStream, timeoutPromise)) {
if (!event || !('type' in event)) continue
const eventType = event.type as string
// Stream text deltas for our session
if (eventType === 'message.part.delta') {
const props = (event as any).properties
if (props?.sessionID === sessionId && props.field === 'text') {
const data = JSON.stringify({ type: 'text', content: props.delta })
controller.enqueue(encoder.encode(`data: ${data}\n\n`))
emittedText = true
}
// Forward reasoning deltas as thinking chunks
if (props?.sessionID === sessionId && props.field === 'reasoning') {
const data = JSON.stringify({ type: 'thinking', content: props.delta })
controller.enqueue(encoder.encode(`data: ${data}\n\n`))
}
continue
}
// Session went idle — response complete
if (eventType === 'session.idle') {
const props = (event as any).properties
if (props?.sessionID === sessionId) break
continue
}
// Session error
if (eventType === 'session.error') {
const props = (event as any).properties
if (props?.sessionID === sessionId || !props?.sessionID) {
const errMsg = formatOpenCodeError(props?.error)
console.error('[AI] OpenCode session error:', errMsg)
const data = JSON.stringify({ type: 'error', content: errMsg })
controller.enqueue(encoder.encode(`data: ${data}\n\n`))
break
}
continue
}
}
clearInterval(pingTimer)
if (!emittedText) {
console.warn('[AI] OpenCode returned no text via streaming events')
const data = JSON.stringify({ type: 'error', content: 'OpenCode returned an empty response. The model may not have generated any output.' })
controller.enqueue(encoder.encode(`data: ${data}\n\n`))
}
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'done', content: '' })}\n\n`),
)
} catch (error) {
const content = error instanceof Error ? error.message : 'Unknown error'
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'error', content })}\n\n`),
)
} finally {
const { releaseOpencodeServer } = await import('../../utils/opencode-client')
releaseOpencodeServer(ocServer)
clearInterval(pingTimer)
controller.close()
}
},
})
return new Response(stream)
}
/** Map ChatBody effort to Copilot SDK ReasoningEffort */
function mapCopilotReasoningEffort(
effort?: 'low' | 'medium' | 'high' | 'max',
): 'low' | 'medium' | 'high' | 'xhigh' | undefined {
if (!effort) return undefined
if (effort === 'max') return 'xhigh'
return effort
}
/** Stream via GitHub Copilot SDK (@github/copilot-sdk) */
function streamViaCopilot(body: ChatBody, model?: string) {
const stream = new ReadableStream({
async start(controller) {
const encoder = new TextEncoder()
const pingTimer = setInterval(() => {
try {
controller.enqueue(encoder.encode(`data: ${JSON.stringify({ type: 'ping', content: '' })}\n\n`))
} catch { /* stream already closed */ }
}, KEEPALIVE_INTERVAL_MS)
let copilotClient: { stop(): Promise<unknown> } | undefined
try {
const { CopilotClient, approveAll } = await import('@github/copilot-sdk')
// Use standalone copilot binary to avoid Bun's node:sqlite issue
const { resolveCopilotCli } = await import('../../utils/copilot-client')
const cliPath = resolveCopilotCli()
const client = new CopilotClient({
autoStart: true,
...(cliPath ? { cliPath } : {}),
})
copilotClient = client
await client.start()
const session = await client.createSession({
...(model ? { model } : {}),
streaming: true,
onPermissionRequest: approveAll,
systemMessage: { mode: 'replace', content: body.system },
...(body.effort ? { reasoningEffort: mapCopilotReasoningEffort(body.effort) } : {}),
})
const lastUserMsg = [...body.messages].reverse().find((m) => m.role === 'user')
const prompt = lastUserMsg?.content ?? ''
// Subscribe to streaming deltas
session.on('assistant.message_delta', (event) => {
clearInterval(pingTimer)
const deltaContent = (event as any).data?.deltaContent ?? ''
if (deltaContent) {
const data = JSON.stringify({ type: 'text', content: deltaContent })
try {
controller.enqueue(encoder.encode(`data: ${data}\n\n`))
} catch { /* stream closed */ }
}
})
// Wait for completion
await session.sendAndWait({ prompt }, 120_000)
await session.destroy()
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'done', content: '' })}\n\n`),
)
} catch (error) {
const content = error instanceof Error ? error.message : 'Unknown error'
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ type: 'error', content })}\n\n`),
)
} finally {
clearInterval(pingTimer)
if (copilotClient) {
copilotClient.stop().catch(() => {})
}
controller.close()
}
},
})
return new Response(stream)
}