refactor(host): orchestrator uses chat-panel agent, drop Anthropic-key dependency

User pointed out that the desktop chat panel already has 5 CLI agents (Claude Code / Codex / OpenCode / Copilot / Gemini) that manage their own auth: Claude Code is logged in by the user, Copilot rides GitHub auth, Gemini rides gcloud. The #27 intent-gate landing required a separate `OPENPENCIL_ANTHROPIC_API_KEY` env var to launch the orchestrator, which broke UX consistency and was a real design defect: the existing CLI agents already provide LLM access, so the orchestrator should reuse them instead of requiring users to wire up a parallel direct-API key.

Root cause was a trait mismatch: `Orchestrator` needs an `LlmClient` impl, and `DesktopLlmClient` (now deleted) was written to take `Arc<dyn agent::Provider>`, but `agent::Provider` is only implemented by `AnthropicProvider` / `OpenAiCompatProvider`, NOT by the CLI-backed `ChatProvider`s (`ClaudeCodeProvider` etc.).

Fix is a thin `ChatProvider → LlmClient` adapter:

- New `chat_provider_llm::ChatProviderLlmClient` (~80 lines): owns an `Arc<dyn ChatProvider>`; each `LlmClient::call` spawns a thread that drains the provider's blocking iterator into a futures mpsc channel and returns the receive half as a `BoxStream`. Same async↔sync bridge `BlockingRecvIter` uses in the opposite direction.
- `DesignSession::start` is now generic over `L: LlmClient + Send + 'static` instead of taking `Arc<dyn Provider>` + `default_model`; the caller picks the `LlmClient` impl. Test e2e path (`from_channels`) unaffected.
- `chat_session::launch_if_pending`: the Design branch reads `chat_selected_agent`, wraps the existing `provider_for_agent` output in `ChatProviderLlmClient`, and hands it to `DesignSession`. Unwired agents (Codex / OpenCode) fall through to the chat-path unwired-agent error bubble, the same UX as a chat send to an unwired agent.

Deletions:

- `chat_orchestrator.rs`: `DesktopLlmClient` was its last surviving content; with the new adapter no caller needs it. Removed the file and the `mod chat_orchestrator;` line.
- `chat_session::provider_for_design` and the `OPENPENCIL_ANTHROPIC_API_KEY` / `ANTHROPIC_API_KEY` / `OPENPENCIL_ORCHESTRATOR_MODEL` env reads.
- `op-host-desktop/Cargo.toml`: the `agent` crate's `["anthropic"]` feature. No path inside the host needs an `agent::Provider` impl anymore (the trait stays imported for the `BuiltInProvider` shim in `chat_runtime.rs`, future-facing).

`op-smoke` keeps `AnthropicProvider` + `OpenAiCompatProvider` directly because the smoke deliberately bypasses any CLI auth path to validate the orchestrator against a raw API endpoint independently of the host UI.

`cargo test -p op-host-desktop`: 106 passed (unchanged). `cargo fmt --all -- --check` and `cargo clippy --workspace --all-targets -- -D warnings`: clean.
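
For readers who want the shape of the bridge without reading the full diff, here is a minimal, std-only sketch of the adapter pattern described above. It is an illustration under stated assumptions, not the code in this commit: `Delta`, `BlockingProvider`, `Llm`, `Adapter`, `run_turn`, and `Canned` are hypothetical, trimmed stand-ins for `ChatDelta`, `ChatProvider`, `LlmClient`, `ChatProviderLlmClient`, `DesignSession::start`, and a real CLI provider, and `std::sync::mpsc` replaces the futures mpsc channel / `BoxStream` so the example builds without external crates.

```rust
use std::sync::mpsc::{self, Receiver};
use std::sync::Arc;
use std::thread;

/// Hypothetical, trimmed stand-in for `ChatDelta`.
pub enum Delta {
    Text(String),
    ToolUse,
    Error(String),
    Done,
}

/// Hypothetical stand-in for `ChatProvider`: `send` returns a blocking iterator.
pub trait BlockingProvider: Send + Sync {
    fn send(&self, prompt: String) -> Box<dyn Iterator<Item = Delta> + Send>;
}

/// Hypothetical stand-in for `LlmClient`; a std `Receiver` plays the stream.
pub trait Llm {
    fn call(&self, prompt: String) -> Receiver<Result<String, String>>;
}

/// Adapter: turns any `BlockingProvider` into an `Llm`.
pub struct Adapter {
    provider: Arc<dyn BlockingProvider>,
}

impl Adapter {
    pub fn new(provider: Arc<dyn BlockingProvider>) -> Self {
        Self { provider }
    }
}

impl Llm for Adapter {
    fn call(&self, prompt: String) -> Receiver<Result<String, String>> {
        let (tx, rx) = mpsc::channel();
        let provider = Arc::clone(&self.provider);
        // Drain the blocking iterator on a dedicated thread; the caller
        // only ever sees the receive half of the channel.
        thread::spawn(move || {
            for delta in provider.send(prompt) {
                let chunk = match delta {
                    Delta::Text(s) => Some(Ok(s)),
                    Delta::Error(msg) => Some(Err(msg)),
                    Delta::ToolUse => None, // not routed to the caller
                    Delta::Done => break,   // closes the channel by dropping `tx`
                };
                if let Some(c) = chunk {
                    if tx.send(c).is_err() {
                        break; // receiver dropped, stop draining
                    }
                }
            }
        });
        rx
    }
}

/// Generic over the client, like the new `start<L: LlmClient + Send + 'static>`:
/// the caller picks the impl and the worker thread takes ownership of it.
pub fn run_turn<L: Llm + Send + 'static>(
    llm: L,
    prompt: String,
) -> thread::JoinHandle<Result<String, String>> {
    thread::spawn(move || -> Result<String, String> {
        let mut text = String::new();
        for chunk in llm.call(prompt) {
            text.push_str(&chunk?);
        }
        Ok(text)
    })
}

/// Canned provider used only for this demo.
pub struct Canned;

impl BlockingProvider for Canned {
    fn send(&self, prompt: String) -> Box<dyn Iterator<Item = Delta> + Send> {
        let deltas = vec![
            Delta::Text(format!("plan for {prompt}: ")),
            Delta::ToolUse,
            Delta::Text("ok".to_string()),
            Delta::Done,
            Delta::Text("ignored".to_string()),
        ];
        Box::new(deltas.into_iter())
    }
}
```

Calling `run_turn(Adapter::new(Arc::new(Canned)), "login".to_string())` and joining the handle yields the two text chunks concatenated; the tool-use delta is skipped and nothing after `Done` is forwarded.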
2026-06-01 03:14:29 +07:00 · 2026-05-24 20:41:36 +08:00 · 2026-05-24 20:41:36 +08:00 · 4f9665227b
commit 4f9665227b
parent dbd5f1dc09
7 changed files with 142 additions and 161 deletions
--- a/crates/op-host-desktop/Cargo.toml
+++ b/crates/op-host-desktop/Cargo.toml
@@ -160,9 +160,7 @@ dirs = "5"
 # the trait + the engine wiring + `from_provider`; concrete
 # Provider impls (Anthropic / OpenAI-compat / Ollama) flip on once the
 # workspace rust-toolchain bumps.
-agent = { path = "../../vendor/agent/crates/agent", default-features = false, features = [
-  "anthropic",
-] }
+agent = { path = "../../vendor/agent/crates/agent", default-features = false }
 # Forked from bartolli/anthropic-agent-sdk — lives under
 # `vendor/anthropic-agent-sdk/` so OP owns the source and can diverge
 # from upstream. Provides the Claude Code CLI subprocess transport +
--- a/crates/op-host-desktop/src/chat_orchestrator.rs
+++ b/crates/op-host-desktop/src/chat_orchestrator.rs
@@ -1,110 +0,0 @@
-//! Desktop `LlmClient` adapter for `op_orchestrator`.
-//!
-//! [`DesktopLlmClient`] bridges `agent::Provider` (the host's LLM
-//! interface) to `op_orchestrator::LlmClient` (the orchestrator's
-//! transport-free seam). Each `call` spins up a fresh `QueryEngine` so
-//! planner / sub-agent / cleanup turns get independent context —
-//! distinct from `BuiltInProvider` in `chat_runtime.rs`, which shares
-//! one engine across a chat thread to accumulate user history.
-//!
-//! Live caller: [`design_session::DesignSession::start`], which owns
-//! the worker thread that drives `Orchestrator::run` against this
-//! adapter + a `RemoteDocSink`. Previous `DesktopDocSink` + entry-point
-//! `run_design_request` were removed once `DesignSession` replaced
-//! them with the actor-channel model (no UI freeze; ID-remapped state
-//! mirrored back to the worker via ack snapshots).
-
-use std::sync::Arc;
-
-use agent::abort::AbortController;
-use agent::provider::Provider;
-use agent::query::QueryEngine;
-use agent::stream::Event;
-use futures::channel::mpsc;
-use futures::StreamExt;
-use op_orchestrator::{CallRequest, LlmChunk, LlmClient, LlmError};
-
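-/// Desktop implementation of `LlmClient`. Each `call` creates a new `QueryEngine`,
-/// so the planner and each sub-agent get mutually isolated conversation contexts
-/// (the history-accumulating shared engine behind `BuiltInProvider` is not reused).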
-pub struct DesktopLlmClient {
-    provider: Arc<dyn Provider>,
-    /// Default model, used when `CallRequest.model` is `None`.
-    default_model: String,
-}
-
-impl DesktopLlmClient {
-    pub fn new(provider: Arc<dyn Provider>, default_model: impl Into<String>) -> Self {
-        Self {
-            provider,
-            default_model: default_model.into(),
-        }
-    }
-}
-
-impl LlmClient for DesktopLlmClient {
-    fn call(
-        &self,
-        req: CallRequest,
-    ) -> futures::stream::BoxStream<'static, Result<LlmChunk, LlmError>> {
-        let (tx, rx) = mpsc::unbounded::<Result<LlmChunk, LlmError>>();
-
-        // Already aborted before the call: return a stream carrying a single aborted error.
-        if req.abort.is_set() {
-            let _ = tx.unbounded_send(Err(LlmError {
-                message: "aborted".into(),
-                aborted: true,
-            }));
-            return Box::pin(rx);
-        }
-
-        let provider = self.provider.clone();
-        let model = req
-            .model
-            .clone()
-            .unwrap_or_else(|| self.default_model.clone());
-        let system = req.system_prompt.clone();
-        let user = req.user_prompt.clone();
-
-        crate::chat_runtime::shared_runtime().spawn(async move {
-            let engine = QueryEngine::new(provider, model).with_system(system);
-            let abort = AbortController::new();
-            let stream = match engine.run(user, abort).await {
-                Ok(s) => s,
-                Err(e) => {
-                    let _ = tx.unbounded_send(Err(LlmError {
-                        message: e.to_string(),
-                        aborted: false,
-                    }));
-                    return;
-                }
-            };
-            let mut stream = stream;
-            while let Some(item) = stream.next().await {
-                let sent = match item {
-                    Ok(Event::TextDelta { delta }) => tx.unbounded_send(Ok(LlmChunk::Text(delta))),
-                    Ok(Event::Thinking { delta }) => {
-                        tx.unbounded_send(Ok(LlmChunk::Thinking(delta)))
-                    }
-                    Ok(Event::Result { .. }) => break,
-                    Ok(Event::Error { code, message }) => tx.unbounded_send(Err(LlmError {
-                        message: format!("{code}: {message}"),
-                        aborted: false,
-                    })),
-                    // ToolUse / ToolResult / Usage / anything else: the orchestrator
-                    // only wants text, so skip silently.
-                    Ok(_) => Ok(()),
-                    Err(e) => tx.unbounded_send(Err(LlmError {
-                        message: e.to_string(),
-                        aborted: false,
-                    })),
-                };
-                if sent.is_err() {
-                    break; // receiver has been dropped
-                }
-            }
-        });
-
-        Box::pin(rx)
-    }
-}
--- a/crates/op-host-desktop/src/chat_provider_llm.rs
+++ b/crates/op-host-desktop/src/chat_provider_llm.rs
@@ -0,0 +1,107 @@
+//! `ChatProvider` → `LlmClient` adapter.
+//!
+//! Lets the orchestrator reuse whichever chat-panel agent the user
+//! already selected (Claude Code / Copilot / Gemini / …) as its LLM
+//! transport, instead of forcing a separate Anthropic API key.
+//!
+//! The CLI agents (`ClaudeCodeProvider`, `CopilotProvider`,
+//! `SubprocessProvider`) manage their own auth — `claude` is logged in
+//! by the user, Copilot rides GitHub auth, Gemini rides `gcloud`.
+//! `agent::Provider` (the `QueryEngine`-facing trait) is a different
+//! shape and only `AnthropicProvider` implements it, hence the original
+//! Anthropic-key requirement. This adapter eliminates that
+//! requirement by turning any `ChatProvider` into the `LlmClient` the
+//! orchestrator wants.
+//!
+//! Each `LlmClient::call` spawns one `std::thread` that drains
+//! `provider.send(req)` (a *blocking* iterator) into a futures mpsc
+//! channel; the returned `BoxStream` is the receive half. This is the
+//! same async↔sync bridge `BlockingRecvIter` uses in the opposite
+//! direction in `chat_runtime.rs`.
+
+use std::sync::Arc;
+use std::thread;
+
+use futures::channel::mpsc;
+use futures::stream::BoxStream;
+use op_ai::chat_provider::{ChatDelta, ChatProvider, ChatRequest, EffortLevel, ThinkingMode};
+use op_orchestrator::{CallRequest, LlmChunk, LlmClient, LlmError};
+
+/// Wraps a `ChatProvider` so the orchestrator can call it as an
+/// `LlmClient`. `Arc` so multiple concurrent `call()`s can share the
+/// same provider (orchestrator may run planner + sub-agents in
+/// parallel under `concurrency > 1`).
+pub struct ChatProviderLlmClient {
+    provider: Arc<dyn ChatProvider>,
+}
+
+impl ChatProviderLlmClient {
+    pub fn new(provider: Arc<dyn ChatProvider>) -> Self {
+        Self { provider }
+    }
+}
+
+impl LlmClient for ChatProviderLlmClient {
+    fn call(&self, req: CallRequest) -> BoxStream<'static, Result<LlmChunk, LlmError>> {
+        let (tx, rx) = mpsc::unbounded::<Result<LlmChunk, LlmError>>();
+
+        if req.abort.is_set() {
+            let _ = tx.unbounded_send(Err(LlmError {
+                message: "aborted".into(),
+                aborted: true,
+            }));
+            return Box::pin(rx);
+        }
+
+        // Build the ChatRequest. The orchestrator's `CallRequest` only
+        // carries system + user prompts + an abort handle + an optional
+        // model hint; the chat-side knobs (thinking / effort /
+        // attachments) get sensible defaults — sub-agents inherit
+        // whatever the CLI does by default.
+        let chat_req = ChatRequest {
+            system_prompt: req.system_prompt.clone(),
+            user_message: req.user_prompt.clone(),
+            // The orchestrator's prompts can run long (planner system
+            // is ~12 KB, sub-agents emit dense JSON). Give them room.
+            max_output_tokens: 8192,
+            thinking: ThinkingMode::Disabled,
+            effort: EffortLevel::Low,
+            attachments: vec![],
+        };
+
+        // `provider.send` returns a *blocking* iterator. Drain it on a
+        // dedicated thread; the LLM call's `BoxStream` is the receive
+        // half of the futures mpsc channel.
+        let provider = self.provider.clone();
+        thread::spawn(move || {
+            for delta in provider.send(chat_req) {
+                let chunk = match delta {
+                    ChatDelta::TextDelta(s) => Some(Ok(LlmChunk::Text(s))),
+                    ChatDelta::Thinking(s) => Some(Ok(LlmChunk::Thinking(s))),
+                    ChatDelta::Error(msg) => Some(Err(LlmError {
+                        message: msg,
+                        aborted: false,
+                    })),
+                    // `Done` closes the stream by exiting the loop; the
+                    // orchestrator parses the accumulated text and
+                    // decides what to do.
+                    ChatDelta::Done { .. } => break,
+                    // Tool calls aren't routed through the orchestrator
+                    // — it expects a single text completion per call.
+                    // If a CLI agent decides to invoke an MCP tool
+                    // mid-turn the result text follows in subsequent
+                    // `TextDelta`s anyway.
+                    ChatDelta::ToolUse { .. } => None,
+                };
+                if let Some(c) = chunk {
+                    if tx.unbounded_send(c).is_err() {
+                        // Receiver dropped — orchestrator turn aborted.
+                        break;
+                    }
+                }
+            }
+        });
+
+        Box::pin(rx)
+    }
+}
--- a/crates/op-host-desktop/src/chat_session.rs
+++ b/crates/op-host-desktop/src/chat_session.rs
@@ -11,7 +11,6 @@ use std::sync::mpsc::{self, Receiver, TryRecvError};
 use std::sync::Arc;
 use std::thread;

-use agent::provider::Provider;
 use op_ai::chat_provider::{ChatDelta, ChatProvider, ChatRequest, CliName};
 use op_editor_core::{ChatMessage, ChatToolCall};
 use op_host_native::WidgetHostNative;
@@ -19,6 +18,7 @@ use op_orchestrator::{classify_intent, DesignRequest, Intent};

 use crate::chat_claude::ClaudeCodeProvider;
 use crate::chat_copilot::CopilotProvider;
+use crate::chat_provider_llm::ChatProviderLlmClient;
 use crate::chat_subprocess::SubprocessProvider;
 use crate::design_session::DesignSession;

@@ -166,16 +166,24 @@ pub fn launch_if_pending(
    };
    host.mark_editor_state_dirty();
    // Intent gate (op_orchestrator::classify_intent) — design verbs
-    // route to the orchestrator pipeline when a Provider is available;
-    // otherwise we fall through to the existing chat path so the user
-    // still gets an answer.
+    // route to the orchestrator pipeline, which reuses the chat
+    // panel's currently-selected agent as its LLM (via
+    // `ChatProviderLlmClient`). Unwired agents (Codex / OpenCode) fall
+    // through to the chat path so `provider_for_agent` can surface the
+    // unwired-agent error in the assistant bubble.
+    let agent_idx = host.editor_state().editor_ui.chat_selected_agent;
    if matches!(classify_intent(&user_text), Intent::Design) {
-        if let Some((provider, model)) = provider_for_design() {
+        if let Some(provider) = provider_for_agent(agent_idx) {
            *current_chat = None;
+            let llm = ChatProviderLlmClient::new(Arc::from(provider));
            let initial_state = host.editor_state().clone();
            let request = DesignRequest {
                prompt: user_text,
-                model: Some(model.clone()),
+                // The chosen chat agent decides its own model; the
+                // orchestrator only passes through `req.model` when
+                // it explicitly overrides per sub-call (it doesn't
+                // today).
+                model: None,
                provider: None,
                design_md: initial_state.doc.design_md.clone(),
                append_context: None,
@@ -183,17 +191,12 @@ pub fn launch_if_pending(
                validation_enabled: false,
                visual_ref_enabled: false,
            };
-            *current_design = Some(DesignSession::start(
-                provider,
-                model,
-                request,
-                initial_state,
-            ));
+            *current_design = Some(DesignSession::start(llm, request, initial_state));
            return true;
        }
-        // Design intent but no Provider configured — fall through to
-        // the chat path. The assistant CLI will still answer the user
-        // (most CLIs handle design verbs as chat).
+        // Design intent but the selected agent has no ChatProvider
+        // bridge yet — fall through to the chat path so the unwired
+        // agent error message lands in the assistant bubble.
    }
    // Taking the chat path — drop any in-flight design turn so its
    // worker's next `apply` returns false (channel dropped) and its
@@ -202,7 +205,6 @@ pub fn launch_if_pending(
    // kept overwriting the new bubble content + applying ack'd
    // EditorCommands long after the user moved on).
    *current_design = None;
-    let agent_idx = host.editor_state().editor_ui.chat_selected_agent;
    let Some(provider) = provider_for_agent(agent_idx) else {
        // Selected agent has no `ChatProvider` bridge yet (Codex /
        // OpenCode HTTP-server transport). Surface that honestly in
@@ -259,26 +261,6 @@ pub fn launch_if_pending(
    true
 }

-/// Build an `agent::Provider` for the orchestrator's design path.
-/// MVP: reads `OPENPENCIL_ANTHROPIC_API_KEY` (preferred) or
-/// `ANTHROPIC_API_KEY` from the environment and constructs an
-/// `AnthropicProvider`. Returns `None` when neither key is set so the
-/// caller can fall back to the chat-CLI path honestly.
-///
-/// `OPENPENCIL_*` is checked first so users can isolate the
-/// orchestrator's API key from any `ANTHROPIC_API_KEY` other tooling
-/// might consume (e.g. `claude` CLI, the Anthropic SDK examples).
-fn provider_for_design() -> Option<(Arc<dyn Provider>, String)> {
-    let api_key = std::env::var("OPENPENCIL_ANTHROPIC_API_KEY")
-        .ok()
-        .or_else(|| std::env::var("ANTHROPIC_API_KEY").ok())
-        .filter(|k| !k.is_empty())?;
-    let model = std::env::var("OPENPENCIL_ORCHESTRATOR_MODEL")
-        .unwrap_or_else(|_| "claude-sonnet-4-6".to_string());
-    let provider = agent::provider::anthropic::AnthropicProvider::new(api_key);
-    Some((Arc::new(provider) as Arc<dyn Provider>, model))
-}
-
 /// Build the `ChatProvider` for an agent index (into
 /// `AgentProvider::ALL`: 0 ClaudeCode, 1 CodexCli, 2 OpenCode,
 /// 3 GithubCopilot, 4 GeminiCli). Claude Code uses its dedicated
--- a/crates/op-host-desktop/src/design_session.rs
+++ b/crates/op-host-desktop/src/design_session.rs
@@ -44,18 +44,15 @@
 //! sees the channel closed and returns `false`, ending the turn.

 use std::sync::mpsc::{self, Receiver, Sender, SyncSender, TryRecvError};
-use std::sync::Arc;
 use std::thread;

-use agent::provider::Provider;
 use op_editor_core::{EditorCommand, EditorState};
 use op_host_native::WidgetHostNative;
 use op_orchestrator::{
-    AbortFlag, DesignRequest, DocSink, Orchestrator, OrchestratorError, Progress, RunSummary,
-    SkippedScreenshotProvider, SkippedVisionLlmClient, ValidationProviders,
+    AbortFlag, DesignRequest, DocSink, LlmClient, Orchestrator, OrchestratorError, Progress,
+    RunSummary, SkippedScreenshotProvider, SkippedVisionLlmClient, ValidationProviders,
 };

-use crate::chat_orchestrator::DesktopLlmClient;
 use crate::chat_runtime::shared_runtime;
 use crate::pre_validator::LintPreValidator;

@@ -108,9 +105,13 @@ impl DesignSession {
    /// Spawn a worker that runs `Orchestrator::run` against a
    /// `RemoteDocSink`. Returns immediately; the LLM turn streams off
    /// the UI thread.
-    pub fn start(
-        provider: Arc<dyn Provider>,
-        default_model: String,
+    ///
+    /// `llm` is any `LlmClient` implementation — production code passes
+    /// a `ChatProviderLlmClient` wrapping the user's currently-selected
+    /// chat agent (Claude Code / Copilot / Gemini), so the orchestrator
+    /// rides whatever CLI auth the chat panel already has.
+    pub fn start<L: LlmClient + Send + 'static>(
+        llm: L,
        request: DesignRequest,
        initial_state: EditorState,
    ) -> Self {
@@ -120,7 +121,6 @@ impl DesignSession {
        thread::Builder::new()
            .name("op-design-turn".into())
            .spawn(move || {
-                let llm = DesktopLlmClient::new(provider, default_model);
                let mut sink = RemoteDocSink::new(cmd_tx, initial_state);
                let abort = AbortFlag::new();
                let pre_validator = LintPreValidator;
--- a/crates/op-host-desktop/src/main.rs
+++ b/crates/op-host-desktop/src/main.rs
@@ -9,7 +9,7 @@ mod chat_attachment;
 mod chat_claude;
 mod chat_copilot;
 mod chat_http_server;
-mod chat_orchestrator;
+mod chat_provider_llm;
 mod chat_runtime;
 mod chat_session;
 mod chat_subprocess;
--- a/crates/op-smoke/src/main.rs
+++ b/crates/op-smoke/src/main.rs
@@ -55,8 +55,12 @@ use op_orchestrator::{

 /// `LlmClient` impl for the smoke runner — `AnthropicProvider` under a
 /// `QueryEngine`, with every call spawned onto the current tokio runtime.
-/// Mirrors `op-host-desktop::chat_orchestrator::DesktopLlmClient` but
-/// uses `tokio::spawn` instead of a shared `Runtime::spawn` handle.
+/// Standalone — `op-host-desktop` no longer ships a desktop
+/// `LlmClient`; its production path goes through
+/// `chat_provider_llm::ChatProviderLlmClient` (wrapping the user's
+/// selected chat CLI). The smoke needs to talk to a raw API endpoint
+/// to validate orchestrator behaviour independently of any CLI, hence
+/// this dedicated client.
 struct SmokeLlmClient {
    provider: Arc<dyn Provider>,
    default_model: String,