mirror of https://github.com/nexu-io/open-design.git, synced 2026-05-31 19:04:39 +07:00
[codex] Improve Claude Code exit diagnostics (#1267)
* fix daemon claude diagnostics
* fix claude custom endpoint auth diagnostics
* fix project view api empty response test props
* fix claude diagnostic review gaps
* fix silent custom endpoint claude diagnostics
* fix claude diagnostic credential redaction
* fix quoted api key redaction
* fix claude diagnostic tail redaction
* fix silent claude configured profile diagnostics
parent a75d9938c7
commit 5bd9763181
12 changed files with 802 additions and 4 deletions
@@ -343,6 +343,13 @@ open-design/
 
 - **`better-sqlite3` fails to load / ABI mismatch after a Node.js version change** — `pnpm install` re-runs `postinstall` automatically and rebuilds the native addon for the current Node.js. To rebuild manually or verify the fix: `pnpm --filter @open-design/daemon rebuild better-sqlite3` then `pnpm --filter @open-design/daemon exec node -e "require('better-sqlite3')"`. Requires build tools: `python3`, `make`, `g++` (or `clang++`). If you have `ignore-scripts=true` in your `.npmrc`, run `node scripts/postinstall.mjs` after `pnpm install`.
 - **"no agents found on PATH"** — install one of: `claude`, `codex`, `devin`, `gemini`, `opencode`, `cursor-agent`, `qwen`, `qodercli`, `copilot`. Or switch to API mode in Settings and paste a provider key.
+- **Claude Code exits with code 1** — Open Design was able to start `claude`, but the spawned non-interactive run failed before producing a response. From the same shell or app environment that starts Open Design, check:
+  ```bash
+  claude --version
+  claude auth status --text
+  printf 'hello' | claude -p --output-format stream-json --verbose --permission-mode bypassPermissions
+  ```
+  If the smoke test reports `401`, `apiKeySource: "none"`, or another auth error without a custom endpoint, run `claude`, use `/login`, exit Claude, and retry Open Design. If you use multiple Claude profiles, set **Settings -> Execution & model -> Claude Code config directory** to the profile path such as `~/.claude-2`. If `ANTHROPIC_BASE_URL` or a proxy is set, check the endpoint URL, proxy credentials, endpoint auth environment, and model access; remove the custom endpoint only if you want to retry with standard Claude Code auth. On Windows, native PowerShell and WSL use separate Claude installs and credential stores; re-authenticate in the same environment Open Design uses, and check Windows Credential Manager if `/login` does not repair native Windows credentials.
 - **daemon 500 on /api/chat** — check the daemon terminal for the stderr tail; usually the CLI rejected its args. Different CLIs take different argv shapes; see `apps/daemon/src/agents.ts` `buildArgs` if you need to tweak.
 - **media generation says `OD_BIN` is missing or daemon URL is `:0`** — run the media dispatcher checks above. Do not resume the old CLI session; reopen the project from the Open Design app so the daemon can inject fresh `OD_*` variables.
 - **Codex loads too much plugin context** — start Open Design with `OD_CODEX_DISABLE_PLUGINS=1 pnpm tools-dev` to make daemon-spawned Codex processes run with `--disable plugins`.

157
apps/daemon/src/claude-diagnostics.ts
Normal file
@@ -0,0 +1,157 @@
+import { redactSecrets } from './redact.js';
+
+export interface ClaudeCliDiagnosticInput {
+  agentId: string;
+  exitCode?: number | null;
+  signal?: string | null;
+  stderrTail?: string | null;
+  stdoutTail?: string | null;
+  env?: Record<string, unknown> | null;
+}
+
+export interface ClaudeCliDiagnostic {
+  message: string;
+  detail: string;
+  retryable: boolean;
+}
+
+function envValue(
+  env: Record<string, unknown> | null | undefined,
+  key: string,
+): string | null {
+  if (!env) return null;
+  const found = Object.keys(env).find((k) => k.toUpperCase() === key);
+  if (!found) return null;
+  const value = env[found];
+  return typeof value === 'string' && value.trim() ? value.trim() : null;
+}
+
+function body(input: ClaudeCliDiagnosticInput): string {
+  return [input.stderrTail, input.stdoutTail]
+    .filter((value): value is string => typeof value === 'string' && value.length > 0)
+    .join('\n');
+}
+
+function withContext(
+  message: string,
+  detail: string,
+  input: ClaudeCliDiagnosticInput,
+): ClaudeCliDiagnostic {
+  const configDir = envValue(input.env, 'CLAUDE_CONFIG_DIR');
+  const baseUrl = envValue(input.env, 'ANTHROPIC_BASE_URL');
+  const diagnosticTail = redactSecrets(body(input)).replace(/\s+/g, ' ').trim().slice(-240);
+  const context: string[] = [message, detail];
+  if (diagnosticTail) context.push(`Claude output: ${diagnosticTail}`);
+  if (configDir) context.push(`Effective CLAUDE_CONFIG_DIR: ${configDir}.`);
+  if (baseUrl) context.push('ANTHROPIC_BASE_URL is set for this Claude Code process.');
+  return {
+    message: redactSecrets(message),
+    detail: redactSecrets(context.filter(Boolean).join(' ')),
+    retryable: true,
+  };
+}
+
+export function diagnoseClaudeCliFailure(
+  input: ClaudeCliDiagnosticInput,
+): ClaudeCliDiagnostic | null {
+  if (input.agentId !== 'claude') return null;
+  if (input.exitCode === 0 && !input.signal) return null;
+
+  const text = body(input);
+  const normalized = text.toLowerCase();
+  const hasCustomBaseUrl = envValue(input.env, 'ANTHROPIC_BASE_URL') !== null;
+  const hasConfigDir = envValue(input.env, 'CLAUDE_CONFIG_DIR') !== null;
+
+  const authFailure =
+    /\b401\b/.test(text) ||
+    /apikeysource["'\s:]+none/i.test(text) ||
+    /(auth|oauth|credential|token).*(fail|invalid|missing|expired|not found|none|unauthorized)/i.test(text) ||
+    /(unauthorized|invalid api key|missing api key|could not authenticate|authentication failed)/i.test(text);
+  if (authFailure && hasCustomBaseUrl) {
+    return withContext(
+      'Claude Code could not authenticate with the configured custom Anthropic endpoint.',
+      'Check ANTHROPIC_BASE_URL, proxy credentials, endpoint authentication environment, and model access. Remove the custom endpoint only if you want to retry with standard Claude Code auth.',
+      input,
+    );
+  }
+  if (authFailure) {
+    const configHint = hasConfigDir
+      ? 'The configured Claude config directory may contain stale or expired auth state.'
+      : 'If you use multiple Claude profiles, set CLAUDE_CONFIG_DIR in Settings so Open Design spawns the same profile that works in your terminal.';
+    return withContext(
+      'Claude Code could not authenticate. Run `claude`, use `/login`, then retry the Open Design request.',
+      `The spawned Claude Code process exited before producing a response. ${configHint}`,
+      input,
+    );
+  }
+
+  const modelUnavailable =
+    /selected model is not available/i.test(text) ||
+    /current plan or region/i.test(text) ||
+    /(model).*(not available|not supported|unsupported|not found|not have access|no access)/i.test(text);
+  if (modelUnavailable && hasCustomBaseUrl) {
+    return withContext(
+      'Claude Code could not access the selected model through the configured custom endpoint.',
+      'The custom ANTHROPIC_BASE_URL or proxy may not expose the model Claude Code selected. Change the model, fix the endpoint/proxy, or remove ANTHROPIC_BASE_URL and retry with standard Claude Code auth.',
+      input,
+    );
+  }
+
+  const windowsCredentialMismatch =
+    /credential manager/i.test(text) ||
+    /\bwsl\b/i.test(text) ||
+    /powershell/i.test(text) ||
+    /native windows/i.test(text);
+  if (windowsCredentialMismatch) {
+    return withContext(
+      'Claude Code appears to be using credentials from a different local environment.',
+      'Re-authenticate Claude Code in the same Windows, WSL, or shell environment that Open Design uses. On native Windows, check Windows Credential Manager if `/login` does not repair the session.',
+      input,
+    );
+  }
+
+  const configStateFailure =
+    /(config|profile|session|credential|oauth)/i.test(text) &&
+    /(stale|corrupt|expired|different|missing|not found|invalid)/i.test(text);
+  if (configStateFailure) {
+    const message = hasConfigDir
+      ? 'Claude Code failed while using the configured Claude profile.'
+      : 'Claude Code may be using a different or stale local profile than your terminal.';
+    const detail = hasConfigDir
+      ? 'Re-run `claude` and `/login` for that profile, then retry Open Design.'
+      : 'Run `claude` and `/login`, or set CLAUDE_CONFIG_DIR in Settings when you use multiple Claude profiles.';
+    return withContext(message, detail, input);
+  }
+
+  if (!text.trim() && input.exitCode === 1 && hasCustomBaseUrl) {
+    return withContext(
+      'Claude Code exited before producing diagnostics while using a custom Anthropic endpoint.',
+      'Check ANTHROPIC_BASE_URL, proxy credentials, endpoint authentication environment, and model access. Remove the custom endpoint only if you want to retry with standard Claude Code auth.',
+      input,
+    );
+  }
+
+  if (!text.trim() && input.exitCode === 1) {
+    const message = hasConfigDir
+      ? 'Claude Code exited before producing diagnostics while using the configured Claude profile.'
+      : 'Claude Code exited before producing diagnostics.';
+    const detail = hasConfigDir
+      ? 'Re-run `claude` and `/login` for that profile, then retry Open Design.'
+      : 'Run `claude`, use `/login`, and retry. If you use multiple Claude profiles, set CLAUDE_CONFIG_DIR in Settings so Open Design uses the same profile as your terminal.';
+    return withContext(
+      message,
+      detail,
+      input,
+    );
+  }
+
+  if (normalized.includes('anthropic_base_url') && hasCustomBaseUrl) {
+    return withContext(
+      'Claude Code failed while using a custom Anthropic endpoint.',
+      'Check the ANTHROPIC_BASE_URL endpoint, proxy, model access, and authentication settings, then retry.',
+      input,
+    );
+  }
+
+  return null;
+}

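In `withContext`, the combined stderr/stdout text is passed through `redactSecrets` before it is cut down to its last 240 characters. The sketch below illustrates why that order matters. It is not the project's code: `redactBearer` is a hypothetical, simplified stand-in for `redactSecrets` that only recognises `Bearer <token>` credentials, and the two wrapper functions are named for illustration only.

```typescript
// Hypothetical, simplified stand-in for the project's redactSecrets helper.
// It only recognises "Bearer <token>" credentials.
function redactBearer(input: string): string {
  return input.replace(
    /Bearer\s+[A-Za-z0-9._~+\/=-]+/g,
    'Bearer [REDACTED:bearer_token]',
  );
}

const TAIL_LENGTH = 240; // same cap as the diagnostic tail in withContext

// Order used by withContext: redact the full text, then keep the tail.
function redactThenTail(text: string): string {
  return redactBearer(text).slice(-TAIL_LENGTH);
}

// Reversed order: slicing first can cut off the "Bearer " prefix, so the
// pattern no longer matches and raw credential characters survive.
function tailThenRedact(text: string): string {
  return redactBearer(text.slice(-TAIL_LENGTH));
}

const credential = 'a'.repeat(300);
const stderr = `401 Authorization: Bearer ${credential}`;

const safe = redactThenTail(stderr);
const leaky = tailThenRedact(stderr);

console.log(safe.includes('a'.repeat(80)));  // false: the token was replaced before slicing
console.log(leaky.includes('a'.repeat(80))); // true: the slice hid the prefix from the pattern
```

The same reasoning applies to any pattern that relies on a leading keyword or header name, which is presumably why the commit adds tests with 300-character credentials.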
@@ -30,6 +30,7 @@ import { createCommandInvocation } from '@open-design/platform';
 import { attachAcpSession } from './acp.js';
 import { attachPiRpcSession } from './pi-rpc.js';
 import { createClaudeStreamHandler } from './claude-stream.js';
+import { diagnoseClaudeCliFailure } from './claude-diagnostics.js';
 import { createCopilotStreamHandler } from './copilot-stream.js';
 import { createJsonEventStreamHandler } from './json-event-stream.js';
 import { agentCliEnvForAgent, validateAgentCliEnv } from './app-config.js';
@@ -724,12 +725,15 @@ interface AgentSink {
   streamError: Promise<Error>;
   getText: () => string;
   getStderrTail: () => string;
+  appendRawStdout: (chunk: string) => void;
+  getRawStdoutTail: () => string;
   dispose: () => void;
 }
 
 export function createAgentSink(): AgentSink {
   let buffer = '';
   let stderrTail = '';
+  let rawStdoutTail = '';
   let debounceTimer: ReturnType<typeof setTimeout> | null = null;
   let resolveResult!: (value: AgentSinkResult) => void;
   let resolveStreamError!: (value: Error) => void;
@@ -774,6 +778,12 @@ export function createAgentSink(): AgentSink {
     scheduleTextResolution();
   };
 
+  const appendRawStdout = (chunk: string) => {
+    if (typeof chunk === 'string' && chunk.length > 0) {
+      rawStdoutTail = (rawStdoutTail + chunk).slice(-400);
+    }
+  };
+
   const send = (event: string, payload: unknown) => {
     const data = (payload ?? {}) as Record<string, unknown>;
     if (event === 'error') {
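The `appendRawStdout` helper in this hunk keeps a bounded tail of raw stdout by concatenating each chunk and slicing from the end. A standalone sketch of that pattern follows; `createTailBuffer` and its `limit` parameter are hypothetical names chosen for illustration, not part of the commit.

```typescript
// Hypothetical helper: remembers only the last `limit` characters appended.
function createTailBuffer(limit: number) {
  let tail = '';
  return {
    append(chunk: unknown): void {
      // Mirror the guard in appendRawStdout: ignore non-strings and empty chunks.
      if (typeof chunk === 'string' && chunk.length > 0) {
        tail = (tail + chunk).slice(-limit);
      }
    },
    get: (): string => tail,
  };
}

const stdoutTail = createTailBuffer(8);
stdoutTail.append('first chunk ');
stdoutTail.append(''); // ignored: empty chunk
stdoutTail.append('exit 401');
const afterTwoChunks = stdoutTail.get();
console.log(afterTwoChunks); // prints "exit 401"

stdoutTail.append('ab');
const afterThreeChunks = stdoutTail.get();
console.log(afterThreeChunks); // prints "it 401ab"
```

Because the buffer is re-sliced on every append, memory stays bounded by `limit` plus the size of one chunk regardless of how much the child process writes.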
@@ -805,7 +815,10 @@ export function createAgentSink(): AgentSink {
     }
     if (event === 'stdout') {
       const chunk = data.chunk;
-      if (typeof chunk === 'string') consumeText(chunk);
+      if (typeof chunk === 'string') {
+        appendRawStdout(chunk);
+        consumeText(chunk);
+      }
       return;
     }
     if (event === 'stderr') {
@@ -825,6 +838,8 @@ export function createAgentSink(): AgentSink {
     streamError,
     getText: () => buffer,
     getStderrTail: () => stderrTail,
+    appendRawStdout,
+    getRawStdoutTail: () => rawStdoutTail,
     dispose: () => {
       if (debounceTimer) {
         clearTimeout(debounceTimer);
@@ -846,13 +861,17 @@ function attachAgentStreamHandlers(
   cwd: string,
   model: string | undefined,
   send: (event: string, payload: unknown) => void,
+  appendRawStdout?: (chunk: string) => void,
 ): AgentSpawnHandle {
   let acpSession: { hasFatalError?: () => boolean } | null = null;
   child.stdout?.setEncoding('utf8');
   child.stderr?.setEncoding('utf8');
   if (def.streamFormat === 'claude-stream-json') {
     const claude = createClaudeStreamHandler((ev: unknown) => send('agent', ev));
-    child.stdout?.on('data', (chunk: string) => claude.feed(chunk));
+    child.stdout?.on('data', (chunk: string) => {
+      appendRawStdout?.(chunk);
+      claude.feed(chunk);
+    });
     child.on('close', () => claude.flush());
   } else if (def.streamFormat === 'copilot-stream-json') {
     const copilot = createCopilotStreamHandler((ev: unknown) => send('agent', ev));
@@ -1100,6 +1119,7 @@ async function testAgentConnectionInternal(
     tempDir,
     input.model,
     sink.send,
+    sink.appendRawStdout,
   );
 
   const resultFromChildExit = (
@@ -1141,7 +1161,29 @@ async function testAgentConnectionInternal(
       if (exitedCleanly) return resultFromAgentText(buffered);
     }
     const stderrTail = sink.getStderrTail().trim();
+    const rawStdoutTail = sink.getRawStdoutTail().trim();
     const acpFatal = Boolean(acpSession?.hasFatalError?.());
+    const claudeDiagnostic = diagnoseClaudeCliFailure({
+      agentId: input.agentId,
+      exitCode: winner.code,
+      signal: winner.signal,
+      stderrTail,
+      stdoutTail: rawStdoutTail || buffered,
+      env,
+    });
+    if (claudeDiagnostic) {
+      console.warn(
+        `[test:agent] ${def.name} → claude_diagnostic: ${claudeDiagnostic.detail}`,
+      );
+      return {
+        ok: false,
+        kind: 'agent_spawn_failed',
+        latencyMs,
+        model,
+        agentName: def.name,
+        detail: claudeDiagnostic.detail,
+      };
+    }
     const detail = redactSecrets(
       [
         winner.code != null ? `exit ${winner.code}` : null,

@@ -98,6 +98,9 @@ const PATTERNS: readonly Pattern[] = [
 // non-card numbers (timestamps, IDs, hashes). We isolate the candidate
 // then run a Luhn check before redacting.
 const CARD_CANDIDATE = /\b(?:\d[ -]?){12,18}\d\b/g;
+const API_KEY_HEADER =
+  /(^|[^?&\w-])("?)(x-api-key|api-key|x-goog-api-key)\2(\s*[:=]\s*)("[^"]*"|[^\s,;"'#}]+)/gi;
+const API_KEY_QUERY = /([?&](?:key|api_key|api-key)=)[^&#\s,;"']+/gi;
 
 function isLuhnValid(digits: string): boolean {
   if (digits.length < 13 || digits.length > 19) return false;
@@ -116,6 +119,19 @@ function isLuhnValid(digits: string): boolean {
   return sum % 10 === 0;
 }
 
+function redactApiKeyHeaderValue(
+  prefix: string,
+  quote: string,
+  name: string,
+  separator: string,
+  value: string,
+): string {
+  const redactedValue = value.startsWith('"')
+    ? '"[REDACTED:api_key_header]"'
+    : '[REDACTED:api_key_header]';
+  return `${prefix}${quote}${name}${quote}${separator}${redactedValue}`;
+}
+
 /**
  * Returns `input` with every recognised secret / PII pattern replaced by
  * a `[REDACTED:<kind>]` marker. Idempotent — re-running on already
@@ -130,6 +146,19 @@ export function redactSecrets(input: string): string {
   for (const { name, regex } of PATTERNS) {
     out = out.replace(regex, `[REDACTED:${name}]`);
   }
+  out = out
+    .replace(
+      API_KEY_HEADER,
+      (
+        _match,
+        prefix: string,
+        quote: string,
+        name: string,
+        separator: string,
+        value: string,
+      ) => redactApiKeyHeaderValue(prefix, quote, name, separator, value),
+    )
+    .replace(API_KEY_QUERY, '$1[REDACTED:api_key_query]');
   out = out.replace(CARD_CANDIDATE, (match) => {
     const digits = match.replace(/\D/g, '');
     return isLuhnValid(digits) ? '[REDACTED:credit_card]' : match;
@@ -157,6 +186,28 @@ export function redactSecretsWithCounts(input: string): {
     });
     if (matched > 0) counts[name] = matched;
   }
+  let apiKeyHeaderCount = 0;
+  out = out.replace(
+    API_KEY_HEADER,
+    (
+      _match,
+      prefix: string,
+      quote: string,
+      name: string,
+      separator: string,
+      value: string,
+    ) => {
+      apiKeyHeaderCount += 1;
+      return redactApiKeyHeaderValue(prefix, quote, name, separator, value);
+    },
+  );
+  if (apiKeyHeaderCount > 0) counts.api_key_header = apiKeyHeaderCount;
+  let apiKeyQueryCount = 0;
+  out = out.replace(API_KEY_QUERY, (_match, prefix: string) => {
+    apiKeyQueryCount += 1;
+    return `${prefix}[REDACTED:api_key_query]`;
+  });
+  if (apiKeyQueryCount > 0) counts.api_key_query = apiKeyQueryCount;
   let cardCount = 0;
   out = out.replace(CARD_CANDIDATE, (match) => {
     const digits = match.replace(/\D/g, '');

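To see how the two new patterns behave in isolation, the sketch below copies `API_KEY_HEADER`, `API_KEY_QUERY`, and `redactApiKeyHeaderValue` from the hunks above into a standalone script and feeds them the sample strings used in the unit tests. The `redactApiKeys` wrapper is a hypothetical helper added for illustration; the real `redactSecrets` also applies the `PATTERNS` list and the card check, which this diff does not show in full.

```typescript
// Patterns and replacement helper copied from the hunks above.
const API_KEY_HEADER =
  /(^|[^?&\w-])("?)(x-api-key|api-key|x-goog-api-key)\2(\s*[:=]\s*)("[^"]*"|[^\s,;"'#}]+)/gi;
const API_KEY_QUERY = /([?&](?:key|api_key|api-key)=)[^&#\s,;"']+/gi;

function redactApiKeyHeaderValue(
  prefix: string,
  quote: string,
  name: string,
  separator: string,
  value: string,
): string {
  // Keep the surrounding quotes when the value itself was quoted (JSON style).
  const redactedValue = value.startsWith('"')
    ? '"[REDACTED:api_key_header]"'
    : '[REDACTED:api_key_header]';
  return `${prefix}${quote}${name}${quote}${separator}${redactedValue}`;
}

// Hypothetical wrapper: applies only the two API-key patterns.
function redactApiKeys(input: string): string {
  return input
    .replace(
      API_KEY_HEADER,
      (
        _match: string,
        prefix: string,
        quote: string,
        name: string,
        separator: string,
        value: string,
      ) => redactApiKeyHeaderValue(prefix, quote, name, separator, value),
    )
    .replace(API_KEY_QUERY, '$1[REDACTED:api_key_query]');
}

const plain = redactApiKeys(
  '401 x-api-key: header-secret-123 url=https://proxy.example.test/v1?key=query-secret-456',
);
const quoted = redactApiKeys('401 {"x-api-key":"secret-value-123"}');

console.log(plain);
// 401 x-api-key: [REDACTED:api_key_header] url=https://proxy.example.test/v1?key=[REDACTED:api_key_query]
console.log(quoted);
// 401 {"x-api-key":"[REDACTED:api_key_header]"}
```

The `\2` backreference requires the header name to be closed by the same quote character that opened it (or by none), and the leading `[^?&\w-]` guard keeps the header pattern from matching query parameters, which `API_KEY_QUERY` handles separately.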
@@ -61,6 +61,7 @@ import {
 import { attachAcpSession } from './acp.js';
 import { attachPiRpcSession } from './pi-rpc.js';
 import { createClaudeStreamHandler } from './claude-stream.js';
+import { diagnoseClaudeCliFailure } from './claude-diagnostics.js';
 import { loadCritiqueConfigFromEnv } from './critique/config.js';
 import { reconcileStaleRuns } from './critique/persistence.js';
 import { runOrchestrator } from './critique/orchestrator.js';
@@ -3729,6 +3730,9 @@ export async function startServer({
       let child;
       let acpSession = null;
       let writePromptToChildStdin = false;
+      let spawnedAgentEnv = null;
+      let agentStdoutTail = '';
+      let agentStderrTail = '';
       try {
         // Prompt delivery via stdin is now the universal default. This bypasses
         // both the cmd.exe 8KB limit and the CreateProcess 32KB limit.
@@ -3747,6 +3751,7 @@ export async function startServer({
           ),
           ...odMediaEnv,
         };
+        spawnedAgentEnv = env;
         const invocation = createCommandInvocation({
           command: resolvedBin,
           args,
@@ -3796,9 +3801,12 @@ export async function startServer({
       // structured adapters that buffer partial lines (Codex item.completed,
       // pi-rpc session/prompt, ACP agent messages) and models that spend a
       // long time in non-streamed reasoning still keep the run alive.
-      child.stdout.on('data', () => {
+      child.stdout.on('data', (chunk) => {
         childStdoutSeen = true;
         noteAgentActivity();
+        if (def.id === 'claude') {
+          agentStdoutTail = `${agentStdoutTail}${chunk}`.slice(-1000);
+        }
       });
 
       // ---- Memory: assistant-reply buffer for LLM extraction --------------
@@ -4096,6 +4104,9 @@ export async function startServer({
       run.acpSession = acpSession;
       child.stderr.on('data', (chunk) => {
         noteAgentActivity();
+        if (def.id === 'claude') {
+          agentStderrTail = `${agentStderrTail}${chunk}`.slice(-1000);
+        }
         send('stderr', { chunk });
       });
 
@@ -4143,6 +4154,23 @@ export async function startServer({
           : code === 0
             ? 'succeeded'
             : 'failed';
+        if (status === 'failed') {
+          const diagnostic = diagnoseClaudeCliFailure({
+            agentId: def.id,
+            exitCode: code,
+            signal,
+            stderrTail: agentStderrTail,
+            stdoutTail: agentStdoutTail,
+            env: spawnedAgentEnv,
+          });
+          if (diagnostic) {
+            send('error', createSseErrorPayload(
+              'AGENT_EXECUTION_FAILED',
+              diagnostic.message,
+              { retryable: diagnostic.retryable, details: { detail: diagnostic.detail } },
+            ));
+          }
+        }
         design.runs.finish(run, status, code, signal);
       });
       if (writePromptToChildStdin && child.stdin) {

@@ -342,6 +342,41 @@ const timer = setInterval(() => {
     }
   });
 
+  it('surfaces Claude auth diagnostics through the SSE error channel', async () => {
+    await withFakeAgent(
+      'claude',
+      `
+console.error(JSON.stringify({ apiKeySource: 'none', error_status: 401 }));
+process.exit(1);
+`,
+      async () => {
+        const createResponse = await fetch(`${baseUrl}/api/runs`, {
+          method: 'POST',
+          headers: { 'Content-Type': 'application/json' },
+          body: JSON.stringify({
+            agentId: 'claude',
+            message: 'hello',
+          }),
+        });
+        expect(createResponse.status).toBe(202);
+        const { runId } = await createResponse.json() as { runId: string };
+
+        const eventsController = new AbortController();
+        const eventsResponse = await fetch(`${baseUrl}/api/runs/${runId}/events`, {
+          signal: eventsController.signal,
+        });
+        const eventsBody = await readSseUntil(eventsResponse, 'event: error');
+        eventsController.abort();
+        const statusBody = await waitForRunStatus(baseUrl, runId);
+
+        expect(eventsBody).toContain('event: error');
+        expect(eventsBody).toContain('/login');
+        expect(eventsBody).toContain('CLAUDE_CONFIG_DIR');
+        expect(statusBody.status).toBe('failed');
+      },
+    );
+  });
+
   it('caps oversized inactivity overrides so Node does not fire the timer immediately', async () => {
     const previous = process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS;
     process.env.OD_CHAT_RUN_INACTIVITY_TIMEOUT_MS = '10000000000';

159
apps/daemon/tests/claude-diagnostics.test.ts
Normal file
@@ -0,0 +1,159 @@
+import { describe, expect, it } from 'vitest';
+import { diagnoseClaudeCliFailure } from '../src/claude-diagnostics.js';
+
+describe('diagnoseClaudeCliFailure', () => {
+  it('maps Claude auth failures to /login guidance', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: '{"apiKeySource":"none","error_status":401}',
+      env: {},
+    });
+
+    expect(diagnostic?.message).toContain('/login');
+    expect(diagnostic?.detail).toContain('CLAUDE_CONFIG_DIR');
+  });
+
+  it('maps custom endpoint model access failures to endpoint guidance', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail:
+        'Error: The selected model is not available in your current plan or region.',
+      env: { ANTHROPIC_BASE_URL: 'https://proxy.example.com' },
+    });
+
+    expect(diagnostic?.message).toContain('custom endpoint');
+    expect(diagnostic?.detail).toContain('ANTHROPIC_BASE_URL');
+  });
+
+  it('maps custom endpoint auth failures to endpoint credential guidance', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: '{"apiKeySource":"none","error_status":401}',
+      env: { ANTHROPIC_BASE_URL: 'https://proxy.example.com' },
+    });
+
+    expect(diagnostic?.message).toContain('custom Anthropic endpoint');
+    expect(diagnostic?.detail).toContain('ANTHROPIC_BASE_URL');
+    expect(diagnostic?.detail).toContain('proxy credentials');
+    expect(diagnostic?.detail).not.toContain('use `/login`');
+  });
+
+  it('maps silent custom endpoint exits to endpoint guidance', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: '',
+      stdoutTail: '',
+      env: { ANTHROPIC_BASE_URL: 'https://proxy.example.com' },
+    });
+
+    expect(diagnostic?.message).toContain('custom Anthropic endpoint');
+    expect(diagnostic?.detail).toContain('ANTHROPIC_BASE_URL');
+    expect(diagnostic?.detail).toContain('proxy credentials');
+    expect(diagnostic?.detail).not.toContain('use `/login`');
+  });
+
+  it('maps silent configured-profile exits to profile guidance', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: '',
+      stdoutTail: '',
+      env: { CLAUDE_CONFIG_DIR: '/tmp/claude-alt' },
+    });
+
+    expect(diagnostic?.message).toContain('configured Claude profile');
+    expect(diagnostic?.detail).toContain('Re-run `claude` and `/login` for that profile');
+    expect(diagnostic?.detail).toContain('Effective CLAUDE_CONFIG_DIR: /tmp/claude-alt');
+  });
+
+  it('includes configured Claude config directory context', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: 'Authentication failed: token expired',
+      env: { CLAUDE_CONFIG_DIR: '/tmp/claude-alt' },
+    });
+
+    expect(diagnostic?.detail).toContain('Effective CLAUDE_CONFIG_DIR: /tmp/claude-alt');
+  });
+
+  it('does not classify unrelated non-Claude failures', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'codex',
+      exitCode: 1,
+      stderrTail: 'Authentication failed',
+      env: {},
+    });
+
+    expect(diagnostic).toBeNull();
+  });
+
+  it('redacts token-like text from returned details', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: '401 Authorization: Bearer abcdef0123456789ABCDEF==',
+      env: {},
+    });
+
+    expect(diagnostic?.detail).not.toContain('abcdef0123456789ABCDEF');
+    expect(diagnostic?.detail).toContain('[REDACTED:bearer_token]');
+  });
+
+  it('redacts provider header and query API keys from returned details', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail:
+        '401 x-api-key: header-secret-123 url=https://proxy.example.test/v1?key=query-secret-456',
+      env: { ANTHROPIC_BASE_URL: 'https://proxy.example.test' },
+    });
+
+    expect(diagnostic?.detail).not.toContain('header-secret-123');
+    expect(diagnostic?.detail).not.toContain('query-secret-456');
+    expect(diagnostic?.detail).toContain('x-api-key: [REDACTED:api_key_header]');
+    expect(diagnostic?.detail).toContain('?key=[REDACTED:api_key_query]');
+  });
+
+  it('redacts quoted provider API key headers from returned details', () => {
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: '401 {"x-api-key":"secret-value-123"}',
+      env: { ANTHROPIC_BASE_URL: 'https://proxy.example.test' },
+    });
+
+    expect(diagnostic?.detail).not.toContain('secret-value-123');
+    expect(diagnostic?.detail).toContain('"x-api-key":"[REDACTED:api_key_header]"');
+  });
+
+  it('redacts long bearer tokens before taking the diagnostic tail', () => {
+    const credential = 'a'.repeat(300);
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: `401 Authorization: Bearer ${credential}`,
+      env: {},
+    });
+
+    expect(diagnostic?.detail).not.toContain('a'.repeat(80));
+    expect(diagnostic?.detail).toContain('[REDACTED:bearer_token]');
+  });
+
+  it('redacts long provider API key headers before taking the diagnostic tail', () => {
+    const credential = 'b'.repeat(300);
+    const diagnostic = diagnoseClaudeCliFailure({
+      agentId: 'claude',
+      exitCode: 1,
+      stderrTail: `401 x-api-key: ${credential}`,
+      env: { ANTHROPIC_BASE_URL: 'https://proxy.example.test' },
+    });
+
+    expect(diagnostic?.detail).not.toContain('b'.repeat(80));
+    expect(diagnostic?.detail).toContain('x-api-key: [REDACTED:api_key_header]');
+  });
+});

@@ -82,6 +82,10 @@ async function withFakeCodex<T>(script: string, run: () => Promise<T>): Promise<
   return withFakeAgent('codex', script, run);
 }
 
+async function withFakeClaude<T>(script: string, run: () => Promise<T>): Promise<T> {
+  return withFakeAgent('claude', script, run);
+}
+
 async function withFakeOpenCode<T>(script: string, run: () => Promise<T>): Promise<T> {
   return withFakeAgent('opencode', script, run);
 }
@@ -1173,6 +1177,122 @@ console.log(JSON.stringify({ type: 'item.completed', item: { type: 'agent_messag
     );
   });
 
+  it('returns Claude /login guidance when the spawned CLI cannot authenticate', async () => {
+    await withFakeClaude(
+      `console.error(JSON.stringify({ apiKeySource: 'none', error_status: 401 })); process.exit(1);`,
+      async () => {
+        const result = await testAgentConnection({ agentId: 'claude' });
+
+        expect(result).toMatchObject({
+          ok: false,
+          kind: 'agent_spawn_failed',
+          agentName: 'Claude Code',
+        });
+        expect(result.detail).toContain('/login');
+        expect(result.detail).toContain('CLAUDE_CONFIG_DIR');
+      },
+    );
+  });
+
+  it('returns Claude /login guidance when auth failure stream JSON is emitted on stdout', async () => {
+    await withFakeClaude(
+      `console.log(JSON.stringify({ apiKeySource: 'none', error_status: 401 })); process.exit(1);`,
+      async () => {
+        const result = await testAgentConnection({ agentId: 'claude' });
+
+        expect(result).toMatchObject({
+          ok: false,
+          kind: 'agent_spawn_failed',
+          agentName: 'Claude Code',
+        });
+        expect(result.detail).toContain('/login');
+        expect(result.detail).toContain('CLAUDE_CONFIG_DIR');
+      },
+    );
+  });
+
+  it('returns custom endpoint guidance for Claude model access failures', async () => {
+    const previous = process.env.ANTHROPIC_BASE_URL;
+    process.env.ANTHROPIC_BASE_URL = 'https://proxy.example.com';
+    try {
+      await withFakeClaude(
+        `console.error('Error: The selected model is not available in your current plan or region.'); process.exit(1);`,
+        async () => {
+          const result = await testAgentConnection({ agentId: 'claude' });
+
+          expect(result).toMatchObject({
+            ok: false,
+            kind: 'agent_spawn_failed',
+            agentName: 'Claude Code',
+          });
+          expect(result.detail).toContain('ANTHROPIC_BASE_URL');
+          expect(result.detail).toContain('custom');
+        },
+      );
+    } finally {
+      if (previous == null) {
+        delete process.env.ANTHROPIC_BASE_URL;
+      } else {
+        process.env.ANTHROPIC_BASE_URL = previous;
+      }
+    }
+  });
+
+  it('returns custom endpoint guidance for Claude auth failures with a custom endpoint', async () => {
+    const previous = process.env.ANTHROPIC_BASE_URL;
+    process.env.ANTHROPIC_BASE_URL = 'https://proxy.example.com';
+    try {
+      await withFakeClaude(
+        `console.error(JSON.stringify({ apiKeySource: 'none', error_status: 401 })); process.exit(1);`,
+        async () => {
+          const result = await testAgentConnection({ agentId: 'claude' });
+
+          expect(result).toMatchObject({
+            ok: false,
+            kind: 'agent_spawn_failed',
+            agentName: 'Claude Code',
+          });
+          expect(result.detail).toContain('ANTHROPIC_BASE_URL');
+          expect(result.detail).toContain('proxy credentials');
+          expect(result.detail).not.toContain('use `/login`');
+        },
+      );
+    } finally {
+      if (previous == null) {
+        delete process.env.ANTHROPIC_BASE_URL;
+      } else {
+        process.env.ANTHROPIC_BASE_URL = previous;
+      }
+    }
+  });
+
+  it('returns configured profile guidance for silent Claude exits', async () => {
+    const previous = process.env.CLAUDE_CONFIG_DIR;
+    process.env.CLAUDE_CONFIG_DIR = '/tmp/claude-alt';
+    try {
+      await withFakeClaude(
+        `process.exit(1);`,
+        async () => {
+          const result = await testAgentConnection({ agentId: 'claude' });
+
+          expect(result).toMatchObject({
+            ok: false,
+            kind: 'agent_spawn_failed',
+            agentName: 'Claude Code',
+          });
+          expect(result.detail).toContain('configured Claude profile');
+          expect(result.detail).toContain('Effective CLAUDE_CONFIG_DIR: /tmp/claude-alt');
+        },
+      );
+    } finally {
+      if (previous == null) {
+        delete process.env.CLAUDE_CONFIG_DIR;
+      } else {
+        process.env.CLAUDE_CONFIG_DIR = previous;
+      }
+    }
+  });
+
   it('classifies structured Codex model errors as not_found_model', async () => {
     await withFakeCodex(
       `console.log(JSON.stringify({ type: 'error', message: "The 'dddd' model is not supported when using Codex with a ChatGPT account." }));`,
|
||||
|
|
|
|||
|
|
@@ -69,6 +69,42 @@ describe('redactSecrets', () => {
    ).toBe('Authorization: Bearer [REDACTED:bearer_token]');
  });

  it('redacts provider API key header values while keeping header names', () => {
    expect(redactSecrets('x-api-key: secret-value-123')).toBe(
      'x-api-key: [REDACTED:api_key_header]',
    );
    expect(redactSecrets('api-key=azure-secret-456')).toBe(
      'api-key=[REDACTED:api_key_header]',
    );
    expect(redactSecrets('x-goog-api-key: google-secret-789, next header')).toBe(
      'x-goog-api-key: [REDACTED:api_key_header], next header',
    );
    expect(redactSecrets('{"x-api-key":"secret-value-123"}')).toBe(
      '{"x-api-key":"[REDACTED:api_key_header]"}',
    );
    expect(redactSecrets('{"x-api-key": "secret-value-123"}')).toBe(
      '{"x-api-key": "[REDACTED:api_key_header]"}',
    );
    expect(redactSecrets('{"api-key":"secret-value-123"}')).toBe(
      '{"api-key":"[REDACTED:api_key_header]"}',
    );
    expect(redactSecrets('{"x-goog-api-key":"secret-value-123"}')).toBe(
      '{"x-goog-api-key":"[REDACTED:api_key_header]"}',
    );
  });

  it('redacts API key query values while keeping URL structure', () => {
    expect(redactSecrets('https://proxy.example.test/v1?key=secret-value-123&model=x')).toBe(
      'https://proxy.example.test/v1?key=[REDACTED:api_key_query]&model=x',
    );
    expect(redactSecrets('https://proxy.example.test/v1?model=x&api_key=secret_value_456')).toBe(
      'https://proxy.example.test/v1?model=x&api_key=[REDACTED:api_key_query]',
    );
    expect(redactSecrets('https://proxy.example.test/v1?api-key=secret-value-789#tail')).toBe(
      'https://proxy.example.test/v1?api-key=[REDACTED:api_key_query]#tail',
    );
  });

  it('redacts email addresses', () => {
    expect(redactSecrets('contact me at jane.doe+stuff@example.co.uk!')).toBe(
      'contact me at [REDACTED:email]!',

@@ -179,6 +179,19 @@ export interface DaemonReattachOptions {
  onRunEventId?: (eventId: string) => void;
}

function daemonSseErrorMessage(data: SseErrorPayload): string {
  const message = String(data.error?.message ?? data.message ?? 'daemon error');
  const detail =
    data.error?.details &&
    typeof data.error.details === 'object' &&
    !Array.isArray(data.error.details) &&
    typeof data.error.details.detail === 'string'
      ? data.error.details.detail
      : null;
  if (!detail || detail === message || message.includes(detail)) return message;
  return `${message}\n${detail}`;
}

export async function streamViaDaemon({
  agentId,
  history,

@@ -410,7 +423,7 @@ async function consumeDaemonRun({
       if (event.event === 'error') {
         onRunStatus?.('failed');
         const data = event.data as SseErrorPayload;
-        handlers.onError(new Error(String(data.error?.message ?? data.message ?? 'daemon error')));
+        handlers.onError(new Error(daemonSseErrorMessage(data)));
         return;
       }

@@ -211,6 +211,40 @@ describe('streamViaDaemon', () => {
    expect(handlers.onDone).not.toHaveBeenCalled();
  });

  it('includes unified SSE error details in daemon error messages', async () => {
    const handlers = createDaemonHandlers();
    vi.stubGlobal(
      'fetch',
      vi.fn()
        .mockResolvedValueOnce(jsonResponse({ runId: 'run-1' }))
        .mockResolvedValueOnce(
          sseResponse(
            [
              'event: error',
              'data: {"message":"Claude Code failed","error":{"code":"AGENT_EXECUTION_FAILED","message":"Claude Code failed","details":{"detail":"Set CLAUDE_CONFIG_DIR in Settings and retry."}}}',
              '',
              '',
            ].join('\n'),
          ),
        ),
    );

    await streamViaDaemon({
      agentId: 'mock',
      history: [{ id: '1', role: 'user', content: 'hello' }],
      systemPrompt: '',
      signal: new AbortController().signal,
      handlers,
    });

    expect(handlers.onError).toHaveBeenCalledWith(
      expect.objectContaining({
        message: expect.stringContaining('Set CLAUDE_CONFIG_DIR in Settings'),
      }),
    );
    expect(handlers.onDone).not.toHaveBeenCalled();
  });

  it('keeps the daemon run alive when the browser-side stream aborts', async () => {
    const handlers = createDaemonHandlers();
    const controller = new AbortController();

116
specs/change/20260511-issue-564-claude-diagnostics/spec.md
Normal file
@@ -0,0 +1,116 @@
---
id: 20260511-issue-564-claude-diagnostics
name: Issue 564 Claude Diagnostics
status: planned
created: '2026-05-11'
---

## Overview

### Problem Statement

Issue #564 reports that Claude Code can appear installed and selectable in Open Design, but a run exits immediately with the generic message `Agent exited with code 1`. PR #604 addressed one confirmed variant by adding configurable per-agent CLI environment, including `CLAUDE_CONFIG_DIR`, but the issue still contains other failure modes that are not explained well to users.

The remaining in-scope problem is diagnostic: Claude local auth/config/model failures collapse into the same generic exit message, which prevents users from knowing whether to re-authenticate, set `CLAUDE_CONFIG_DIR`, or fix a custom endpoint/proxy setup.

### Goals

- Classify common Claude Code auth, config, endpoint, and model-access failures into actionable user-facing errors.
- Surface the effective Claude CLI configuration path where it helps diagnose multi-profile or stale-auth cases.
- Document known recovery paths for `/login`, `CLAUDE_CONFIG_DIR`, custom `ANTHROPIC_BASE_URL`, proxy, and model availability problems.

### Non-Goals

- Do not persist or inject Claude auth tokens.
- Do not edit Claude Code credentials or platform credential stores.
- Do not remove support for custom `ANTHROPIC_BASE_URL`; only make failures clearer.
- Do not implement a new image-generation provider in this change.
- Do not implement API/BYOK image-mode capability validation in this change. That behavior is related to the same issue thread, but it is deferred to a separate media-capability follow-up so this change can stay focused on Claude CLI diagnostics.

## Research

### Current State

- Claude Code runs are spawned by the daemon as `claude -p --output-format stream-json --verbose ... --permission-mode bypassPermissions`, with the prompt delivered over stdin.
- `spawnEnvForAgent('claude', ...)` merges inherited daemon environment with configured agent CLI env, preserves custom endpoint env when `ANTHROPIC_BASE_URL` is set, and strips `ANTHROPIC_API_KEY` for normal Claude Code runs so Claude login/subscription auth wins.
- Settings already allow a `CLAUDE_CONFIG_DIR` value, which addresses the multiple-Claude-profile variant reported in #564 and implemented by PR #604.
- The daemon forwards child stderr over SSE, but the web client ultimately reports non-zero exits as `agent exited with code <n>` plus a short stderr tail when available.
- The connection-test path has a richer stderr tail and returns `agent_spawn_failed`, but it does not yet classify Claude-specific auth/config/model failures into stable remediation messages.
- API/BYOK mode uses a plain stream path. Recent work added a prompt rule to suppress `tool_calls` for plain API mode. Image-generation capability validation is still needed, but is intentionally deferred from this spec.
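
The environment handling described in the list above can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual `spawnEnvForAgent` code: the name `sketchClaudeSpawnEnv` and its signature are hypothetical, and it assumes that "preserving custom endpoint env" means leaving `ANTHROPIC_API_KEY` untouched whenever `ANTHROPIC_BASE_URL` is set.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper, for illustration only (not the daemon's real function).
function sketchClaudeSpawnEnv(inherited: Env, configured: Env): Env {
  // Configured agent CLI env overrides the inherited daemon env.
  const env: Env = { ...inherited, ...configured };

  // Assumption: with no custom endpoint, drop the API key so that
  // Claude login/subscription auth wins for normal runs.
  if (!env.ANTHROPIC_BASE_URL) {
    delete env.ANTHROPIC_API_KEY;
  }
  return env;
}
```

Under these assumptions, a stray `ANTHROPIC_API_KEY` in the daemon's shell does not reach a normal Claude Code run, while a custom endpoint setup keeps its key.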

### Known Failure Variants from #564

- Custom endpoint or proxy rejects the model Claude Code selects, producing a model/plan/region error upstream.
- Multiple Claude config directories cause Open Design's spawned Claude process to use different state than the user's terminal session.
- Stale, expired, or corrupted Claude auth state makes the non-interactive `claude -p` run fail even when basic terminal checks appear healthy.
- On Windows, native PowerShell and WSL can use separate Claude installs and separate credential stores.
- A ghost `ANTHROPIC_API_KEY` or `ANTHROPIC_AUTH_TOKEN` can interfere with expected OAuth/subscription auth in some environments.

## Design

### Claude Failure Classification

Add a Claude-specific diagnostic helper used by both chat-run failure handling and the agent connection test. The helper should accept the agent id, exit code/signal, stderr tail, stdout tail when available, and effective configured env. It returns either a typed actionable error or `null` to preserve the generic fallback.

The first version should classify these cases:

- `401`, `apiKeySource: "none"`, missing/invalid token, or authentication failure: tell the user to run `claude`, use `/login`, then retry the same Open Design run.
- `ANTHROPIC_BASE_URL` present with model/plan/region unavailable text: explain that the custom endpoint or proxy does not expose the selected Claude Code model and recommend changing the model, fixing the endpoint, or temporarily removing the custom endpoint.
- `CLAUDE_CONFIG_DIR` present: include the effective expanded path in the diagnostic detail so users can compare it with the terminal where Claude works.
- No `CLAUDE_CONFIG_DIR` present but symptoms match config-state failure: suggest setting it in Settings when using multiple Claude profiles.
- Windows credential or WSL/native mismatch indicators: suggest re-authenticating in the same shell/environment used by Open Design and checking Windows Credential Manager where applicable.
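
A rough sketch of how such a helper could classify the first few cases above. Everything here is illustrative: `classifyClaudeExit`, the `ClaudeExitInfo` and `ClaudeDiagnosis` shapes, the `kind` values, and the regular expressions are assumptions made for the example rather than the implemented helper, and the Windows case is left out.

```typescript
// Hypothetical input/output shapes, for illustration only.
interface ClaudeExitInfo {
  agentId: string;
  exitCode: number | null;
  stderrTail: string;
  stdoutTail?: string;
  env: Record<string, string | undefined>;
}

interface ClaudeDiagnosis {
  kind: 'auth' | 'custom_endpoint' | 'configured_profile';
  detail: string;
}

// Assumed text patterns; a real classifier would likely need more variants.
const AUTH_FAILURE =
  /\b401\b|apiKeySource["']?\s*:\s*["']none["']|authentication (?:failed|error)|invalid (?:api key|token)/i;
const MODEL_UNAVAILABLE = /model is not available|current plan or region/i;

function classifyClaudeExit(info: ClaudeExitInfo): ClaudeDiagnosis | null {
  // Only non-zero (or signal) exits of the Claude agent are classified.
  if (info.agentId !== 'claude' || info.exitCode === 0) return null;

  const output = `${info.stderrTail}\n${info.stdoutTail ?? ''}`;
  const configDir = info.env.CLAUDE_CONFIG_DIR;
  const profileHint = configDir
    ? ` Effective CLAUDE_CONFIG_DIR: ${configDir}`
    : ' If you use multiple Claude profiles, set CLAUDE_CONFIG_DIR in Settings.';
  const authFailed = AUTH_FAILURE.test(output);
  const modelUnavailable = MODEL_UNAVAILABLE.test(output);

  // With a custom endpoint, point at the endpoint instead of suggesting a login.
  if (info.env.ANTHROPIC_BASE_URL && (authFailed || modelUnavailable)) {
    return {
      kind: 'custom_endpoint',
      detail:
        'ANTHROPIC_BASE_URL is set. Check the custom endpoint URL, proxy credentials, and model access.' +
        profileHint,
    };
  }
  if (authFailed) {
    return {
      kind: 'auth',
      detail: 'Run `claude`, use `/login`, then retry the same run.' + profileHint,
    };
  }
  // Silent exit with a configured profile: surface the effective path.
  if (output.trim() === '' && configDir) {
    return {
      kind: 'configured_profile',
      detail: 'Claude exited without output. Check the configured Claude profile.' + profileHint,
    };
  }
  return null; // Preserve the generic fallback.
}
```

Returning `null` for anything unrecognized keeps the existing generic exit message for unrelated failures and for non-Claude agents.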

The helper should redact secrets before returning details. It must not echo token values, full API keys, or authorization headers.
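
As a simplified illustration of that redaction step, the sketch below masks API key header values and bearer tokens in a diagnostic tail. The function name `redactDiagnosticTail` and both regular expressions are hypothetical; only the `[REDACTED:...]` labels are borrowed from this change's `redactSecrets` tests, and the real redactor covers more cases, such as query parameters and email addresses.

```typescript
// Hypothetical, simplified patterns; not the repository's redactSecrets implementation.
const API_KEY_HEADER =
  /\b((?:x-goog-api-key|x-api-key|api-key)"?\s*[:=]\s*"?)[^\s"',&#]+/gi;
const BEARER_TOKEN = /\b(Bearer\s+)[A-Za-z0-9._~+\/=-]+/g;

function redactDiagnosticTail(text: string): string {
  // Keep the header name and separator (capture group 1); replace only the value.
  return text
    .replace(API_KEY_HEADER, '$1[REDACTED:api_key_header]')
    .replace(BEARER_TOKEN, '$1[REDACTED:bearer_token]');
}
```

Keeping the header name while masking the value leaves enough context to see which credential was involved without leaking it.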

### User-Facing Surfaces

- Chat run failures should display the classified message instead of only `agent exited with code 1` when a Claude-specific diagnosis is available.
- Settings connection test should return the same classified remediation through the existing failure-result shape so users can validate the fix before starting a project run.
- Keep the raw stderr tail available in logs for maintainers, but keep UI messages short and actionable.

### Documentation

Add a `Claude Code exits with code 1` troubleshooting section to the primary setup/troubleshooting doc. Include:

- `claude --version`
- `claude auth status --text`
- `printf 'hello' | claude -p --output-format stream-json --verbose --permission-mode bypassPermissions`
- `claude` then `/login`
- Setting `CLAUDE_CONFIG_DIR` in Settings for multi-profile setups.
- Checking/removing custom `ANTHROPIC_BASE_URL` and proxy settings when the selected model is unavailable.
- Windows-specific note that WSL and native Windows Claude credentials are separate.

### Deferred: API/BYOK Image Capability Handling

For image/media surfaces in API/BYOK mode, a later follow-up should validate that the request can be routed before treating the run as successful.

- If a configured daemon media provider can satisfy the selected image model, route through the existing media-generation path.
- If no media provider/tool route is available, fail with a typed error explaining that API chat mode cannot execute image-generation tools and that the user should configure Settings -> Media or use a capable local CLI agent.
- Do not rely on the model to self-report unsupported tool usage; the app should make the capability decision before or at run start.

## Implementation Plan

1. Add the shared Claude diagnostic helper in daemon-owned code and unit-test it with representative stderr/stdout tails.
2. Wire the helper into `/api/chat` child close handling for Claude runs before falling back to the generic non-zero exit message.
3. Wire the same helper into the agent connection-test result path.
4. Add troubleshooting documentation for the known #564 recovery paths.
5. Keep PR #604 behavior intact: configured `CLAUDE_CONFIG_DIR` remains a supported Settings field and is passed into detection, connection tests, and chat runs.

## Success Criteria

- A Claude auth failure produces a remediation that mentions `/login`, not only `exit code 1`.
- A custom endpoint/model-access failure produces a remediation that mentions `ANTHROPIC_BASE_URL` or endpoint/model availability.
- A multi-profile failure path can be resolved through Settings by setting `CLAUDE_CONFIG_DIR`.
- Existing non-Claude agent failure behavior remains unchanged unless the same generic fallback already applies.

## Test Plan

- Daemon unit tests for Claude diagnostic classification:
  - 401 / `apiKeySource: "none"`.
  - selected model unavailable / plan or region text.
  - custom `ANTHROPIC_BASE_URL` present.
  - configured `CLAUDE_CONFIG_DIR` present.
  - unrelated stderr falls back to generic behavior.
- Connection-test tests that Claude-specific failures return actionable `agent_spawn_failed` detail.
- Chat-run tests that non-zero Claude exits emit a classified SSE error when possible.
- Run `pnpm guard`, `pnpm typecheck`, and package-scoped daemon/web tests touched by the implementation.