vndangkhoa/zed

mirror of https://github.com/zed-industries/zed.git synced 2026-06-01 03:14:56 +07:00

Author	SHA1	Message	Date
Jakub Konka	da87558cc2	Revert "Improve ChatGPT subscription response resilience" (#58035 ) Reverts zed-industries/zed#57891	2026-05-29 11:23:56 +00:00
morgankrey	ef5606bb61	Improve ChatGPT subscription response resilience (#57891 ) ## Summary This started from #57636, after we saw ChatGPT subscription/Codex requests stall over the past week. OpenCode v1.15.11 shipped related resilience fixes for the same class of Codex subscription endpoint issues, so this ports the relevant pieces into Zed's native ChatGPT subscription provider. When Zed asks ChatGPT/Codex for a response, sometimes the server connection can get stuck before it even sends the first response headers. Before this PR, Zed could wait indefinitely, which looks like OpenCode/Zed “stalling.” This PR makes Zed: - Wait up to 10 seconds for the server to start responding. - If nothing comes back in that window, treat it as a temporary network/API failure. - Let the existing retry logic try again instead of leaving the user stuck. - Send a stable session-id header so OpenAI’s Codex backend can associate requests with the same Zed agent thread. - Add tests to make sure: - stuck-before-response requests time out, - normal slow streaming responses are not cut off, - ChatGPT subscription requests send the right session header, - the agent retries this kind of failure. The intended user-facing result is: fewer “the assistant is just sitting there forever” failures when using ChatGPT subscription models. ## Verification - cargo test -p open_ai responses - cargo test -p language_models openai_subscribed - cargo test -p agent test_send_retry_on_http_send_error - cargo check -p open_ai - cargo check -p language_models - cargo check -p agent Release Notes: - Fixed ChatGPT subscription requests stalling indefinitely before response headers arrive.	2026-05-28 16:41:26 +00:00
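Editor's note on #57891: the "bounded wait for the first response, unbounded wait afterwards" idea can be sketched with a standard-library channel standing in for the HTTP stream. This is only an analogy under stated assumptions: `FirstEventError` and `wait_for_first_event` are hypothetical names, and the actual change works on async HTTP responses rather than `std::sync::mpsc`.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical error type: why no first event arrived.
#[derive(Debug, PartialEq)]
enum FirstEventError {
    /// Nothing arrived within the limit; the caller may retry.
    TimedOut,
    /// The sender went away without producing anything.
    Disconnected,
}

/// Wait for the first event only. Later events would be read without this
/// deadline, so a slow but live stream is not cut off.
fn wait_for_first_event<T>(rx: &Receiver<T>, limit: Duration) -> Result<T, FirstEventError> {
    rx.recv_timeout(limit).map_err(|err| match err {
        RecvTimeoutError::Timeout => FirstEventError::TimedOut,
        RecvTimeoutError::Disconnected => FirstEventError::Disconnected,
    })
}
```

A caller would map `FirstEventError::TimedOut` to whatever error class its retry logic already treats as transient.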
Tom Houlé	a1d019bdd8	language_models: Support fast mode on ChatGPT subscription provider (#57436 ) Same mechanism as for BYOK: `service_tier == priority`. Most of the work is already done. When validating this in manual testing, I noticed we get back `service_tier == auto` in the response, unlike in the regular OpenAI API scenario with BYOK, but apparently [it doesn't mean priority tier wasn't applied](https://github.com/openai/codex/issues/14204#issuecomment-4033184620). It's not a hard confirmation, but the model does seem to respond faster when I toggle fast mode on. Release Notes: - Added Fast Mode (priority service tier) support to OpenAI models used through the ChatGPT subscription provider.	2026-05-27 09:22:00 +00:00
Bennet Bo Fenner	88a54a2683	open_ai: Fix error message not showing up when using ChatGPT subscription (#57750 ) The API seems to return nested errors, so made the error deserialise properly in case we get `{ "error": {...} }` instead of a top-level error Closes #57024 For testing, you can prompt something like: `tell me about https://registry.npmjs.org/vite-plus.` Before: <img width="631" height="69" alt="image" src="https://github.com/user-attachments/assets/5d02e7ec-8176-4bff-87d7-908ac8f0b498" /> After: <img width="697" height="61" alt="image" src="https://github.com/user-attachments/assets/97fac249-8b76-463c-8483-a150f5db9857" /> Release Notes: - openai: Fixed an issue where error messages would not show up properly	2026-05-27 09:18:39 +00:00
Tom Houlé	5e717a06cd	open_ai: Support fast mode in BYOK via the Responses API service_tier (#57412 ) Maps the existing `Speed::Fast` plumbing to OpenAI's `service_tier: "priority"`, which matches what "fast mode" in Codex does. Relevant docs [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-service_tier). Like for the existing Anthropic fast mode we have a `Model::supports_priority` method for the variants on https://openai.com/api-priority-processing. Pro, nano, and legacy gpt-4 are excluded; Custom defaults to false. This is gated to staff only for now (not in this diff, but the existing fast mode feature), until we have the mechanism to require confirmation before you enable fast mode. Release Notes: - Added support for Fast Mode (priority service tier) on the OpenAI API provider.	2026-05-27 09:17:23 +00:00
Lukas Wirth	63ff99795c	agent: Make messages vec cheap to clone (#57712 ) For long threads we will spend more and more time cloning the messages just to save them to the database, as we need a copy of everything to do so asynchronously. Messages are really expensive to clone though and we accumulate a lot of them really fast, so even for smaller threads we start seeing pauses in the millisecond range. The fix to this is fairly simple though, we never mutate the messages once pushed to the vec, so just Arc them. This PR also slightly changes `UserMessage` to be a bit faster to clone as well. Release Notes: - Fixed a cause of stutters when interacting with the agent	2026-05-26 10:00:34 +00:00
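Editor's note on #57712: the "just Arc them" fix can be illustrated as below. `Message` and `Thread` here are simplified, hypothetical stand-ins and not the agent crate's real types; the point is only that cloning a `Vec<Arc<T>>` copies pointers and bumps reference counts rather than copying message bodies.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an agent message with a large payload.
#[derive(Debug)]
struct Message {
    text: String,
}

/// Messages are never mutated after being pushed, so each one can sit
/// behind an `Arc` and be shared with a background save task.
#[derive(Default)]
struct Thread {
    messages: Vec<Arc<Message>>,
}

impl Thread {
    fn push(&mut self, text: &str) {
        self.messages.push(Arc::new(Message { text: text.to_owned() }));
    }

    /// Cloning the vec bumps one reference count per message instead of
    /// copying every message body.
    fn snapshot(&self) -> Vec<Arc<Message>> {
        self.messages.clone()
    }
}
```

The snapshot can then be moved into an asynchronous save without blocking on a deep copy.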
Bennet Bo Fenner	68340172a1	agent: Remove unused `LanguageModelImage` APIs (#57050 ) Pulled out from #56866. Will help with MCP image support Release Notes: - N/A	2026-05-18 12:22:22 +00:00
marius851000	53c910982c	open_ai: Fix parsing response if token use info is unspecified (#55919 ) I tried to use Google Cloud to test gemma4 and compare the results with Ollama. It returned responses such as ```json {"choices":[{"delta":{"content":"Hello","reasoning_content":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null,"matched_stop":null}],"created":1778081610,"id":"KV_7adz7Ov20xN8Py-angQ8","model":"google/gemma-4-26b-a4b-it-maas","object":"chat.completion.chunk","usage":{"extra_properties":{"google":{"traffic_type":"ON_DEMAND"}}}} ``` (notice that, while "usage" is present, it does not have any of the usual values). I later ran into more issues when parsing the response (unrelated to this one), so I decided to try the Google AI endpoint, which has its own set of issues. These simple changes should only loosen the accepted format, so no new compatibility errors are expected (but I haven’t tried other providers). Self-Review Checklist: - [x] I've reviewed my own diff for quality, security, and reliability - [x] Unsafe blocks (if any) have justifying comments - [ ] The content is consistent with the [UI/UX checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist) (no change) - [ ] Tests cover the new/changed behavior - [x] Performance impact has been considered and is acceptable Release Notes: - Improved open-ai compatibility when token usage info is absent	2026-05-17 19:50:14 +00:00
morgankrey	37f6d7a15c	Add ChatGPT subscription provider via OAuth 2.0 PKCE (#53166 ) Adds a new language model provider that lets users authenticate with their ChatGPT Plus/Pro subscription and use OpenAI models (codex-mini-latest, o4-mini, o3) directly in the Zed agent — without needing a separate API key. ## How it works 1. OAuth 2.0 + PKCE sign-in: Uses OpenAI's official Codex CLI client ID to run an authorization code flow. A local HTTP server on `127.0.0.1:1455` captures the callback, exchanges the code for tokens, and stores them in the system keychain. 2. Token refresh: Access tokens are automatically refreshed when they're within 5 minutes of expiry, using the stored refresh token. 3. Responses API: Requests go to `https://chatgpt.com/backend-api/codex/responses` using the existing `open_ai::responses` client (Responses API format, not Chat Completions which was deprecated for this endpoint in Feb 2026). 4. Required headers: `originator: zed`, `OpenAI-Beta: responses=experimental`, `ChatGPT-Account-Id` (extracted from JWT), `store: false` in the body. 
## Files changed - `crates/open_ai/src/responses.rs`: Add `store: Option<bool>` field to `Request`; add `extra_headers` param to `stream_response` for per-provider header injection - `crates/language_models/src/provider/openai_subscribed.rs`: New provider (sign-in UI, OAuth flow, token storage/refresh, model list) - `crates/language_models/src/provider/open_ai.rs`, `open_ai_compatible.rs`, `opencode.rs`: Pass `vec![]` for new `extra_headers` param - `crates/language_models/src/language_models.rs`: Register the new provider - `crates/language_models/Cargo.toml`: Add `rand` and `sha2` deps for PKCE ## Open questions / known gaps - [ ] Terms of service: Usage appears to be within OpenAI's ToS (interactive use via their official CLI client ID), but needs legal sign-off before shipping - [ ] Redirect URI: Currently `http://localhost:1455/auth/callback` — may need to match exactly what OpenAI's Codex CLI uses - [ ] UI polish: The sign-in card is functional but minimal; needs design review - [ ] Error messages: OAuth error responses from the callback URL aren't surfaced to the user yet - [ ] `o3` availability: o3 may require a higher subscription tier; consider gating it ## Testing Sign-in flow was designed to match the Copilot Chat provider pattern. Manual testing against the live OAuth endpoint is needed. Release Notes: - Added ChatGPT subscription provider, allowing users to use their ChatGPT Plus/Pro subscription with the Zed agent --------- Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com> Co-authored-by: Richard Feldman <richard@zed.dev> Co-authored-by: Richard Feldman <oss@rtfeldman.com> Co-authored-by: Agus Zubiaga <agus@zed.dev>	2026-05-14 21:03:56 +00:00
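Editor's note on #53166: the token refresh rule ("refresh when within 5 minutes of expiry") reduces to a small time comparison. The sketch below assumes the expiry is available as a `SystemTime`; `needs_refresh` and `REFRESH_WINDOW` are hypothetical names, not taken from the provider's code.

```rust
use std::time::{Duration, SystemTime};

/// Refresh window taken from the commit description (5 minutes).
const REFRESH_WINDOW: Duration = Duration::from_secs(5 * 60);

/// Returns true when the token has expired or will expire within the window.
fn needs_refresh(expires_at: SystemTime, now: SystemTime) -> bool {
    match expires_at.duration_since(now) {
        Ok(remaining) => remaining <= REFRESH_WINDOW,
        // `expires_at` is earlier than `now`: already expired.
        Err(_) => true,
    }
}
```

Passing `now` in explicitly, rather than calling `SystemTime::now()` inside, keeps the check easy to test.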
Ben Brandt	78c889c21d	open_ai: Responses API improvements (#56476 ) Release Notes: - Removed deprecated OpenAI models - Added support for gpt-5.4-nano/mini models for OpenAI provider - Improved output quality when using OpenAI models --------- Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de> Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com> Co-authored-by: Gaauwe Rombouts <mail@grombouts.nl>	2026-05-12 14:47:16 +00:00
Bennet Bo Fenner	aeb05899b3	open_ai: Support specifying reasoning effort (#56411 ) Closes #54875 Release Notes: - Added support for specifying effort level when using OpenAI models	2026-05-11 13:48:33 +00:00
kangxl	febc3ebcb4	agent: Fix Preserve tool call ID/name across DashScope empty delta chunks (#54872 ) Release Notes: - Fixed: DashScope (Aliyun) tool calls now preserve id and name across streaming delta chunks --------------------------------------------------------------------------------------------------- Aliyun (DashScope) SSE streaming sends id="" and name="" in subsequent tool_calls delta chunks after the first chunk. Previously these empty strings would unconditionally overwrite the accumulated id and name values, causing tool calls to lose identity and fail. Add is_empty() guards so id and name are only updated when the delta provides a non-empty value (falsy guard pattern), matching how Hermes Agent and OpenAI SDK handle this provider edge case. Test stream_maps_preserves_tool_id_and_name_across_empty_deltas simulates DashScope's actual streaming behavior and asserts that the completed ToolUse retains the correct id, name, and arguments. Files changed: 1 (+148/-2) - crates/open_ai/src/completion.rs CLA signed. - [x] I've reviewed my own diff for quality, security, and reliability - [x] Tests cover the new/changed behavior - [x] Performance impact has been considered and is acceptable <img width="980" height="1392" alt="CleanShot 2026-04-26 at 00 03 20@2x" src="https://github.com/user-attachments/assets/428a845b-82a0-44eb-9e43-1a351de6ca6a" /> After fix: <img width="900" height="1398" alt="CleanShot 2026-04-26 at 00 02 15@2x" src="https://github.com/user-attachments/assets/604e36fd-bf90-4549-9e60-8a927033d3e9" /> --------- Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>	2026-05-11 13:47:30 +00:00
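Editor's note on #54872: the `is_empty()` guard described above might look roughly like this. `ToolCall` and `apply_delta` are hypothetical names for illustration; the real accumulator lives in `crates/open_ai/src/completion.rs` and may be shaped differently.

```rust
/// Hypothetical accumulator for one streamed tool call.
#[derive(Debug, Default, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

impl ToolCall {
    /// Merge one delta chunk. Empty `id`/`name` strings are ignored so they
    /// cannot wipe values received in an earlier chunk.
    fn apply_delta(&mut self, id: Option<&str>, name: Option<&str>, arguments: Option<&str>) {
        if let Some(id) = id.filter(|id| !id.is_empty()) {
            self.id = id.to_owned();
        }
        if let Some(name) = name.filter(|name| !name.is_empty()) {
            self.name = name.to_owned();
        }
        // Argument fragments are concatenated in arrival order.
        if let Some(fragment) = arguments {
            self.arguments.push_str(fragment);
        }
    }
}
```

Feeding a first chunk with a real id and name, followed by a chunk carrying `""` for both, leaves the original id and name in place while the arguments keep growing.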
Bennet Bo Fenner	bf3fc2336d	agent: Allow tools to output multiple content parts (#54518 ) Self-Review Checklist: - [x] I've reviewed my own diff for quality, security, and reliability - [x] Unsafe blocks (if any) have justifying comments - [x] The content is consistent with the [UI/UX checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist) - [x] Tests cover the new/changed behavior - [x] Performance impact has been considered and is acceptable Closes #ISSUE Release Notes: - N/A	2026-04-27 12:36:11 +00:00
Bennet Bo Fenner	83cc2ec054	open_ai: Use responses API for all models (#54910 ) From the [docs](https://developers.openai.com/api/docs/guides/migrate-to-responses#responses-benefits): > Better performance: Using reasoning models, like GPT-5, with Responses will result in better model intelligence when compared to Chat Completions. Our internal evals reveal a 3% improvement in SWE-bench with same prompt and setup. Agentic by default: The Responses API is an agentic loop, allowing the model to call multiple tools, like web_search, image_generation, file_search, code_interpreter, remote MCP servers, as well as your own custom functions, within the span of one API request. Lower costs: Results in lower costs due to improved cache utilization (40% to 80% improvement when compared to Chat Completions in internal tests). Stateful context: Use store: true to maintain state from turn to turn, preserving reasoning and tool context from turn-to-turn. Flexible inputs: Pass a string with input or a list of messages; use instructions for system-level guidance. Encrypted reasoning: Opt-out of statefulness while still benefiting from advanced reasoning. Future-proof: Future-proofed for upcoming models. Self-Review Checklist: - [x] I've reviewed my own diff for quality, security, and reliability - [x] Unsafe blocks (if any) have justifying comments - [x] The content is consistent with the [UI/UX checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist) - [ ] Tests cover the new/changed behavior - [x] Performance impact has been considered and is acceptable Closes #ISSUE Release Notes: - Always use Responses API for OpenAI models	2026-04-27 09:52:51 +00:00
Bennet Bo Fenner	db1d79a5ca	open_ai: Add support for gpt-5.5 (#54820 ) Release Notes: - Added support for GPT 5.5 and GPT 5.5 Pro via the OpenAI provider	2026-04-24 20:26:56 +00:00
Matt Van Horn	adab7b8871	language_models: Honor images capability for custom OpenAI models (#54223 ) ## Summary Users who add custom OpenAI models under `language_models.openai.available_models` can set `capabilities.images: true` to declare that the endpoint accepts image inputs. Today, that setting is silently ignored: the Agent panel's image-attach button stays disabled regardless, and the only workaround is to switch to a built-in OpenAI model, attach the image, and switch back. Root cause: `Model::Custom` does not carry a `supports_images` field, and the OpenAI provider's `supports_images()` for the `Custom` arm hardcodes `false`. ## Changes 1. `crates/settings_content/src/language_model.rs`: add `images: bool` to `OpenAiModelCapabilities` with `#[serde(default)]` so existing settings.json files keep working unchanged. 2. `crates/open_ai/src/open_ai.rs`: add `supports_images: bool` to `Model::Custom` with a matching serde default. 3. `crates/language_models/src/provider/open_ai.rs`: pass `model.capabilities.images` into the `Model::Custom` variant in `provided_models`, and return the stored value from `supports_images()` for `Custom`. Existing `Model::Custom { .. }` match sites (`completion.rs:829`, various in `open_ai.rs`) all use `..` so they continue to compile without change. ## Testing - `cargo check -p settings_content -p open_ai -p language_models`: clean. - I was not able to complete `./script/clippy` locally: the build stalled on the first-time `webrtc-sys` download for livekit-rust-sdks (TLS close_notify failure on docs.rs mirror). Happy to rerun once CI has cached artifacts. - Manually verified the capability plumbing by tracing: settings.json -> `OpenAiModelCapabilities.images` -> `Model::Custom { supports_images }` -> `supports_images()` -> `Thread::prompt_capabilities` -> `SessionCapabilities.supports_images()` -> `build_add_context_menu` gate in `thread_view.rs`. 
## Related Issues Closes #50752 Release Notes: - Fixed custom OpenAI models ignoring the `capabilities.images` setting in `language_models.openai.available_models`. This contribution was developed with AI assistance (Codex). --------- Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>	2026-04-24 10:36:46 +00:00
Ben Brandt	2eafa6e6aa	language_models: Remove unused language model token counting (#54177 ) Drop the `count_tokens` API and related implementations across providers, and remove the unused `tiktoken-rs` dependency. I was going to update the dependency because they finally released a fix we needed. But then I realized we only used this API in one place, the Rules library. And for most models it would have been wildly incorrect because we use tiktoken, i.e. OpenAI tokenizers, for almost every model, which is going to give incorrect results. Given that, I just removed these because the difference in how we get these has caused plenty of confusion in the past. Self-Review Checklist: - [x] I've reviewed my own diff for quality, security, and reliability - [x] Unsafe blocks (if any) have justifying comments - [x] The content is consistent with the [UI/UX checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist) - [x] Tests cover the new/changed behavior - [x] Performance impact has been considered and is acceptable Release Notes: - N/A	2026-04-22 13:39:48 +00:00
Guilherme do Amaral Alves	7b082cbb6f	Add interleaved_reasoning option to openai compatible models (#54016 ) Release Notes: - Added interleaved_reasoning option to openai compatible models --- This PR adds the interleaved_reasoning option for OpenAI-compatible models, addressing the issue described in https://github.com/ggml-org/llama.cpp/issues/20837. In my testing, enabling interleaved_reasoning not only resolved the tool-calling issues encountered by Qwen3.5 models in llama.cpp, but also appeared to improve the model's coding capabilities. I have also verified the outgoing requests using a proxy to ensure the parameter is being sent correctly. It is also likely that this change will benefit other models and providers as well. Note: While I used AI to assist with the implementation, I have reviewed and tested the changes. As I am relatively new to Rust and the Zed codebase, I would appreciate any feedback or suggestions for improvement. I am happy to make further adjustments if needed. Thank you all for building such an amazing editor! Co-authored-by: Oleksiy Syvokon <oleksiy@zed.dev>	2026-04-22 10:40:37 +00:00
Danilo Leal	399d3d267e	docs: Update mentions to "assistant panel" (#53514 ) We don't use this terminology anymore; now it's "agent panel". Release Notes: - N/A	2026-04-09 10:42:21 -03:00
Agus Zubiaga	98c17ca160	language_models: Refactor deps and extract cloud (#53270 ) - `language_model` no longer depends on provider-specific crates such as `anthropic` and `open_ai` (inverted dependency) - `language_model_core` was extracted from `language_model` which contains the types for the provider-specific crates to convert to/from. - `gpui::SharedString` has been extracted into its own crate (still exposed by `gpui`), so `language_model_core` and provider API crates don't have to depend on `gpui`. - Removes some unnecessary `&'static str` \| `SharedString` -> `String` -> `SharedString` conversions across the codebase. - Extracts the core logic of the cloud `LanguageModelProvider` into its own crate with simpler dependencies. Release Notes: - N/A --------- Co-authored-by: John Tur <john-tur@outlook.com>	2026-04-07 12:28:19 -03:00
Ben Brandt	be3a5e2c06	open_ai: Support structured OpenAI tool output content (#51832 ) Allow function call outputs to carry either plain text or a list of input content items, so image tool results are serialized as image content instead of a raw base64 string. Release Notes: - N/A	2026-03-18 12:06:54 +00:00
Elier	905d28cc54	Add stream_options.include_usage for OpenAI-compatible API token usage (#45812 ) ## Summary This PR enables token usage reporting in streaming responses for OpenAI-compatible APIs (OpenAI, xAI/Grok, OpenRouter, etc). ## Problem Currently, the token counter UI in the Agent Panel doesn't display usage for some OpenAI-compatible providers because they don't return usage data during streaming by default. According to OpenAI's API documentation, the `stream_options.include_usage` parameter must be set to `true` to receive usage statistics in streaming responses. ## Solution - Added StreamOptions struct with `include_usage` field to the open_ai crate - Added `stream_options` field to the Request struct - Automatically set `stream_options: { include_usage: true }` when `stream: true` - Updated edit_prediction requests with `stream_options: None` (non-streaming) ## Testing Tested with xAI Grok models - token counter now correctly shows usage after sending a message. ## References - [OpenAI Chat Completions API - stream_options](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options) - [xAI API Documentation](https://docs.x.ai/api)	2026-03-17 10:38:14 +00:00
Neel	175707f95c	open_ai: Support reasoning summaries in OpenAI Responses API (#50959 ) Related to AI-79. Release Notes: - N/A	2026-03-09 13:51:22 +00:00
Richard Feldman	3b3ffc022e	Add GPT-5.4 and GPT-5.4-pro BYOK models (#50858 ) Add GPT-5.4 and GPT-5.4-pro as Bring Your Own Key model options for the OpenAI provider. GPT-5.4 (`gpt-5.4`): - 1,050,000 token context window, 128K max output - Supports chat completions, images, parallel tool calls - Default reasoning effort: none GPT-5.4-pro (`gpt-5.4-pro`): - 1,050,000 token context window, 128K max output - Responses API only (no chat completions) - Default reasoning effort: medium (supports medium/high/xhigh) Also fixes context window sizes for GPT-5 mini and GPT-5 nano (272K → 400K) to match current OpenAI docs. Closes AI-78 Release Notes: - Added GPT-5.4 and GPT-5.4-pro as available models when using your own OpenAI API key.	2026-03-05 23:40:03 -05:00
Richard Feldman	a18b7727ee	Add GPT-5.3-Codex BYOK model under the OpenAI provider (#50122 ) Adds `gpt-5.3-codex` as a built-in model under the OpenAI provider for BYOK usage. Model specs: - 400,000 context window - 128,000 max output tokens - Reasoning token support (default medium effort) - Uses the Responses API (like other codex models) - Token counting falls back to the gpt-5 tokenizer Closes AI-59 Release Notes: - Added support for GPT-5.3-Codex as a bring-your-own-key model in the OpenAI provider.	2026-02-25 16:29:01 -05:00
Richard Feldman	0b8424a14c	Remove deprecated GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini (#49082 ) Remove GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini from BYOK model options in Zed before OpenAI retires these models. These models are being retired by OpenAI (ChatGPT workspace support ends April 3, 2026), so they have been removed from the available models list in Zed's BYOK provider. Closes AI-4 Release Notes: - Removed deprecated GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini models from OpenAI BYOK provider	2026-02-13 04:54:22 +00:00
Oleksiy Syvokon	757ee0571e	ep: Use rejected_output for DPO training + OpenAI support (#47697 ) Release Notes: - N/A --------- Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>	2026-01-27 13:02:40 +00:00
Aero	7bd3075d53	open_ai: Support reasoning content (#43662 ) Support for Kimi K2 Thinking Release Notes: - Added support for thinking traces when using OpenAI-API-compatible AI providers --------- Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>	2026-01-21 10:08:59 +00:00
Richard Feldman	e5706f2349	Add BYOK GPT-5.2-codex support (#47025 ) <img width="449" height="559" alt="Screenshot 2026-01-16 at 4 52 12 PM" src="https://github.com/user-attachments/assets/1b5583d7-9b90-46b1-a32f-9821543ea542" /> Release Notes: - Add support for GPT-5.2-Codex via OpenAI API Key	2026-01-16 17:09:08 -05:00
Marshall Bowers	c6a38f2cfb	open_ai: Use proper type for Responses API `input` (#46526 ) This PR makes it so we use a proper type for the Responses API `input` rather than a `serde_json::Value`. It should have never used `serde_json::Value` to begin with. Release Notes: - N/A	2026-01-10 17:40:20 +00:00
Marshall Bowers	30f776e47f	open_ai: Move `responses` module to its own file (#46450 ) This PR moves the `responses` module to its own module in the `open_ai` crate. Release Notes: - N/A	2026-01-09 14:29:08 +00:00
Matt Stallone	84017bca89	Add OpenAI Responses API support with chat_completions capability flag (#39989 ) Add support for OpenAI's /responses endpoint for models that don't support /chat/completions API. This enables compatibility with newer model variants (`gpt-5-codex`, `gpt-5-pro`, `o3-pro`, etc) while maintaining compatibility with existing configs Changes: - Add `supports_chat_completions` flag to model capabilities that defaults to true for existing behavior - Implement responses API client with streaming support as per [OpenAI documentation](https://app.stainless.com/api/spec/documented/openai/openapi.documented.yml). - Add `ResponseEventMapper` to convert responses events to completion events for maintainer simplicity - Update UI to allow toggling `chat_completions` capability - Add `gpt-5-codex` model Closes #38858 Release Notes: - Added support for `gpt-5-codex` model --------- Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>	2026-01-05 18:15:54 +01:00
Richard Feldman	b5a0a3322d	Add GPT-5.2 support (#44656 ) <img width="429" height="188" alt="Screenshot 2025-12-11 at 3 45 26 PM" src="https://github.com/user-attachments/assets/fe9f1b86-7268-4c63-a8c2-75ac671012c9" /> Release Notes: - Added GPT-5.2 support when using your own OpenAI key	2025-12-11 15:49:10 -05:00
Agus Zubiaga	f08fd732a7	Add experimental mercury edit prediction provider (#44256 ) Release Notes: - N/A --------- Co-authored-by: Ben Kunkle <ben@zed.dev> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>	2025-12-06 10:08:44 +00:00
Mikayla Maki	53eb35f5b2	Add GPT 5.1 to Zed BYOK (#43492 ) Release Notes: - Added support for OpenAI's GPT 5.1 model to BYOK	2025-11-25 14:17:27 -08:00
Tim McLean	fb90b12073	Add retry support for OpenAI-compatible LLM providers (#37891 ) Automatically retry the agent's LLM completion requests when the provider returns 429 Too Many Requests. Uses the Retry-After header to determine the retry delay if it is available. Many providers are frequently overloaded or have low rate limits. These providers are essentially unusable without automatic retries. Tested with Cerebras configured via openai_compatible. Related: #31531 Release Notes: - Added automatic retries for OpenAI-compatible LLM providers --------- Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>	2025-11-13 14:15:46 +00:00
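Editor's note on #37891: choosing a retry delay from a 429 response could be sketched as follows. The function name and the 5-second fallback are assumptions for illustration and are not taken from the commit; only the delay-seconds form of `Retry-After` is parsed, and the HTTP-date form falls back to the default.

```rust
use std::time::Duration;

/// Assumed fallback when the header is missing or unparseable.
const DEFAULT_RETRY_DELAY: Duration = Duration::from_secs(5);

/// Returns the delay before retrying, or `None` if the status is not 429.
/// Only the delay-seconds form of Retry-After is handled here; the
/// HTTP-date form falls back to the default.
fn retry_delay(status: u16, retry_after: Option<&str>) -> Option<Duration> {
    if status != 429 {
        return None;
    }
    let delay = retry_after
        .and_then(|value| value.trim().parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_RETRY_DELAY);
    Some(delay)
}
```

A production version would likely also cap the delay and the number of attempts.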
Max Brunsfeld	784fdcaee3	zeta2: Build edit prediction prompt and process model output in client (#41870 ) Release Notes: - N/A --------- Co-authored-by: Agus Zubiaga <agus@zed.dev> Co-authored-by: Ben Kunkle <ben@zed.dev> Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>	2025-11-06 18:36:58 -05:00
Techy	27a18843d4	open_ai: Make the deltas optional (#39142 ) I am using an Azure OpenAI instance since that is what is provided at work, and with how they have it set up not all responses contain a delta, which led to errors and truncated responses. This is related to how they are filtering potentially offensive requests and responses. I don't believe this filter was made in-house; instead I believe it is provided by Microsoft/Azure, so I suspect this fix may help other users. Release Notes: - N/A Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>	2025-11-05 13:47:14 +01:00
Julia Ryan	ef5b8c6fed	Remove workspace-hack (#40216 ) We've been considering removing workspace-hack for a couple reasons: - Lukas ran into a situation where its build script seemed to be causing spurious rebuilds. This seems more likely to be a cargo bug than an issue with workspace-hack itself (given that it has an empty build script), but we don't necessarily want to take the time to hunt that down right now. - Marshall mentioned hakari interacts poorly with automated crate updates (in our case provided by Renovate) because you'd need to run `cargo hakari generate && cargo hakari manage-deps` after their changes and we prefer to not have actions that make commits. Currently removing workspace-hack causes our workspace to grow from ~1700 to ~2000 crates being built (depending on platform), which is mainly a problem when you're building the whole workspace or running tests across the normal and remote binaries (which is where feature-unification nets us the most sharing). It doesn't impact incremental times noticeably when you're just iterating on `-p zed`, and we'll hopefully get these savings back in the future when rust-lang/cargo#14774 (which re-implements the functionality of hakari) is finished. Release Notes: - N/A	2025-10-17 18:58:14 +00:00
Conrad Irwin	fcdab160f9	Settings refactor (#38367 ) Co-Authored-By: Ben K <ben@zed.dev> Co-Authored-By: Anthony <anthony@zed.dev> Co-Authored-By: Mikayla <mikayla@zed.dev> Release Notes: - settings: Major internal changes to settings. The primary user-facing effect is that some settings which did not make sense in project settings files are no-longer read from there. (For example the inline blame settings) --------- Co-authored-by: Ben Kunkle <ben@zed.dev> Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com> Co-authored-by: Anthony <anthony@zed.dev>	2025-09-18 16:47:23 +00:00
ZhangJun	7091c70a1e	open_ai: Trim newline before "data:" prefix and account for the possibility of no space after ":" (#37644 ) I'm using an OpenAI-compatible model, but got nothing in the agent thread panel, and the Zed log has a "Model generated an empty summary" line. I added a log line to open_ai.rs: <img width="2454" height="626" alt="image" src="https://github.com/user-attachments/assets/85354c7d-a0cc-4bba-86fd-2a640038a13e" /> and got: <img width="3456" height="278" alt="image" src="https://github.com/user-attachments/assets/7746aedd-5d76-44b5-90f2-e129a1507178" /> It appears that `let line = line.strip_prefix("data: ")?;` cannot handle this correctly. Release Notes: - N/A --------- Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>	2025-09-08 22:01:55 +02:00
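Editor's note on #37644: a more tolerant `data:` line parser, in the spirit of the commit title, might look like the sketch below. `sse_data_payload` is a hypothetical helper written for illustration, not the function in `open_ai.rs`.

```rust
/// Extract the payload of an SSE `data` line, or `None` for other lines.
fn sse_data_payload(line: &str) -> Option<&str> {
    // Some servers emit stray newlines before the field name.
    let line = line.trim_start();
    let rest = line.strip_prefix("data:")?;
    // The space after the colon is optional; drop at most one.
    let rest = rest.strip_prefix(' ').unwrap_or(rest);
    // Remove a trailing line terminator, if any.
    Some(rest.trim_end_matches(|c| c == '\r' || c == '\n'))
}
```

With this shape, `data: {...}`, `data:{...}`, and a line preceded by a newline all yield the same payload, while non-data lines such as `event: ping` return `None`.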
Umesh Yadav	9f749881b3	language_models: Fix tool_choice null issue for other providers (#34554 ) Follow up: #34532 Closes #35434 Mostly fixes an issue where, when tool_choice is none, it was being serialised as null. This was fixed for OpenRouter; this PR follows up and cleans up the other providers that might have the same issue, as it is against the spec. Release Notes: - N/A	2025-09-03 01:22:57 +02:00
Antonio Scandurra	39d86eeb7f	Trim API key when submitting requests to LLM providers (#37082 ) This prevents the common footgun of copy/pasting an API key starting/ending with extra newlines, which would lead to a "bad request" error. Closes #37038 Release Notes: - agent: Support pasting language model API keys that contain newlines.	2025-08-28 12:00:44 +00:00
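Editor's note on #37082: the fix amounts to trimming whitespace before the key is used in a request. A minimal sketch, assuming a Bearer-style header (`bearer_header` is a hypothetical helper and `sk-test` a placeholder key):

```rust
/// Build an Authorization header value from a possibly copy-pasted key.
/// Leading and trailing whitespace, including newlines, is removed.
fn bearer_header(api_key: &str) -> String {
    format!("Bearer {}", api_key.trim())
}
```

For example, `bearer_header("\nsk-test\n")` produces a header value with no embedded newlines.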
Michael Sloan	0470baca50	open_ai: Remove `model` field from ResponseStreamEvent (#36902 ) Closes #36901 Release Notes: - Fixed use of Open WebUI as an LLM provider.	2025-08-25 19:50:08 +00:00
Piotr Osiewicz	05fc0c432c	Fix a bunch of other low-hanging style lints (#36498 ) - Fix a bunch of low hanging style lints like unnecessary-return - Fix single worktree violation - And the rest Release Notes: - N/A	2025-08-19 21:26:17 +02:00
Oleksiy Syvokon	42ffa8900a	open_ai: Fix error response parsing (#36390 ) Closes #35925 Release Notes: - Fixed OpenAI error response parsing in some cases	2025-08-18 08:54:31 +00:00
Oleksiy Syvokon	2a57b160b0	openai: Don't send prompt_cache_key for OpenAI-compatible models (#36231 ) Some APIs fail when they get this parameter Closes #36215 Release Notes: - Fixed OpenAI-compatible providers that don't support prompt caching and/or reasoning	2025-08-15 13:54:24 +03:00
Oleksiy Syvokon	a3dcc76687	openai: Don't send reasoning_effort if it's not set (#36228 ) Release Notes: - N/A	2025-08-15 09:12:18 +00:00
Cretezy	8ff2e3e195	language_models: Add reasoning_effort for custom models (#35929 ) Release Notes: - Added `reasoning_effort` support to custom models Tested using the following config: ```json5 "language_models": { "openai": { "available_models": [ { "name": "gpt-5-mini", "display_name": "GPT 5 Mini (custom reasoning)", "max_output_tokens": 128000, "max_tokens": 272000, "reasoning_effort": "high" // Can be minimal, low, medium (default), and high } ], "version": "1" } } ``` Docs: https://platform.openai.com/docs/api-reference/chat/create#chat_create-reasoning_effort This work could be used to split GPT 5/5-mini/5-nano into each of its reasoning effort variants, e.g. `gpt-5`, `gpt-5 low`, `gpt-5 minimal`, `gpt-5 high`, and the same for mini/nano. Release Notes: * Added a setting to control `reasoning_effort` in OpenAI models	2025-08-13 06:09:16 +00:00
Oleksiy Syvokon	7167f193c0	open_ai: Send `prompt_cache_key` to improve caching (#36065 ) Release Notes: - N/A Co-authored-by: Michael Sloan <mgsloan@gmail.com>	2025-08-12 21:51:23 +03:00

109 commits