Commit graph

30 commits

Author SHA1 Message Date
Bennet Bo Fenner
122619624d
x_ai: Add support for specifying reasoning effort (#58078)
See
https://docs.x.ai/developers/model-capabilities/text/reasoning#the-reasoning_effort-parameter

Closes #58056

Release Notes:

- agent: Added support for specifying reasoning effort for Grok 4.3
(xAI)
2026-05-29 16:28:27 +00:00
Conrad Irwin
be705e677b
Merge gpui::Task and scheduler::Task (#53674)
Release Notes:

- N/A or Added/Fixed/Improved ...
2026-05-05 22:41:13 +00:00
Ben Brandt
2eafa6e6aa
language_models: Remove unused language model token counting (#54177)
Drop the `count_tokens` API and related implementations across
providers, and remove the unused `tiktoken-rs` dependency.

I was going to update the dependency becuase they finally released a fix
we needed. But then I realized we only used this api in one place, the
Rules library. And for most models it would have been wildly incorrect
becuase we use tiktoken, i.e. OpenAI tokenizers, for almost every model,
which is going to give incorrect results.

Given that, I just removed these because the difference in how we get
these has caused plenty of confusion in the past.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-04-22 13:39:48 +00:00
Guilherme do Amaral Alves
7b082cbb6f
Add interleaved_reasoning option to openai compatible models (#54016)
Release Notes:

- Added interleaved_reasoning option to openai compatible models

---

This PR adds the interleaved_reasoning option for OpenAI-compatible
models, addressing the issue described in
https://github.com/ggml-org/llama.cpp/issues/20837.

In my testing, enabling interleaved_reasoning not only resolved the
tool-calling issues encountered by Qwen3.5 models in llama.cpp, but also
appeared to improve the model's coding capabilities. I have also
verified the outgoing requests using a proxy to ensure the parameter is
being sent correctly.It is also likely that this change will benefit
other models and providers as well.

Note: While I used AI to assist with the implementation, I have reviewed
and tested the changes. As I am relatively new to Rust and the Zed
codebase, I would appreciate any feedback or suggestions for
improvement. I am happy to make further adjustments if needed.

Thank you all for building such an amazing editor!

Co-authored-by: Oleksiy Syvokon <oleksiy@zed.dev>
2026-04-22 10:40:37 +00:00
Agus Zubiaga
98c17ca160
language_models: Refactor deps and extract cloud (#53270)
- `language_model` no longer depends on provider-specific crates such as
`anthropic` and `open_ai` (inverted dependency)
- `language_model_core` was extracted from `language_model` which
contains the types for the provider-specific crates to convert to/from.
- `gpui::SharedString` has been extracted into its own crate (still
exposed by `gpui`), so `language_model_core` and provider API crates
don't have to depend on `gpui`.
- Removes some unnecessary `&'static str` | `SharedString` -> `String`
-> `SharedString` conversions across the codebase.
- Extracts the core logic of the cloud `LanguageModelProvider` into its
own crate with simpler dependencies.


Release Notes:

- N/A

---------

Co-authored-by: John Tur <john-tur@outlook.com>
2026-04-07 12:28:19 -03:00
Jakub Konka
29609d3599
language_model: Decouple from Zed-specific implementation details (#52913)
This PR decouples `language_model`'s dependence on Zed-specific
implementation details. In particular
* `credentials_provider` is split into a generic `credentials_provider`
crate that provides a trait, and `zed_credentials_provider` that
implements the said trait for Zed-specific providers and has functions
that can populate a global state with them
* `zed_env_vars` is split into a generic `env_var` crate that provides
generic tooling for managing env vars, and `zed_env_vars` that contains
Zed-specific statics
* `client` is now dependent on `language_model` and not vice versa

Release Notes:

- N/A
2026-04-02 17:06:57 -03:00
Anil Pai
a777605ec5
Use split token display for xAI models (#48719)
### Split token display for xAI

Extends the split input/output token display (introduced in #46829 for
OpenAI) to all xAI models.

Instead of the combined `48k / 1M` token counter, xAI models now show:
- **↑** input tokens used / input token limit
- **↓** output tokens used / output token limit

#### Before

<img width="513" height="128" alt="Screenshot 2026-02-08 at 11 07 13 AM"
src="https://github.com/user-attachments/assets/14e5cb4a-9b5c-4081-bbfb-407a737bf234"
/>


#### After

<img width="610" height="126" alt="Screenshot 2026-02-08 at 11 05 36 AM"
src="https://github.com/user-attachments/assets/92396dcb-8905-4f87-9b9e-d8b0f63225ba"
/>


#### Changes

- **x_ai.rs** — Override `supports_split_token_display()` to return
`true` on `XAiLanguageModel`. All built-in Grok models already implement
`max_output_tokens()`, so no additional plumbing was needed.
- **cloud.rs** — Add `XAi` to the `matches!` pattern in
`CloudLanguageModel::supports_split_token_display()` so cloud-routed xAI
models also get the split display.

#### Tests

- `test_xai_supports_split_token_display` — Verifies all built-in Grok
model variants return `true` for split token display.
- `test_xai_models_have_max_output_tokens` — Validates all built-in Grok
models report `max_output_tokens` that is `Some`, positive, and less
than `max_token_count` (required for the UI to compute the input token
limit).
- `test_split_token_display_supported_providers` — Confirms the cloud
provider match pattern includes `OpenAi` and `XAi` while excluding
`Anthropic` and `Google`.

Release Notes:

- Changed the display of tokens for xAI models to reflect the
input/output limits.

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>
2026-03-17 10:42:31 +00:00
Bennet Bo Fenner
87bc2aac5c
Add support for streaming tool input to more providers (#50682)
To test:
- [x] Bedrock
- [x] Copilot Chat
- [x] Deepseek
- [x] Open AI
- [x] Open Router
- [x] Vercel
- [x] Vercel AI Gateway
- [x] xAI
- [x] Mistral

Release Notes:

- N/A
2026-03-04 17:36:25 +01:00
Richard Feldman
29cf14ed2f
Fix rate limiter holding permits during tool execution (#47494)
The rate limiter's semaphore guard was being held for the entire
duration of a turn, including during tool execution. This caused
deadlocks when subagents tried to acquire permits while parent requests
were waiting for them to complete.

## The Problem

In `run_turn_internal`, the stream (which contains the `RateLimitGuard`
holding the semaphore permit) was kept alive throughout the entire loop
iteration - including during **tool execution**:

1. Parent request acquires permit
2. Parent starts streaming, consumes response
3. Parent starts executing tools (subagents)
4. **Stream/guard still held** while tools execute
5. Subagents try to acquire permits → blocked because parent still holds
permit
6. Deadlock if all permits are held by parents waiting for subagent
children

## The Fix

Two changes were made:

1. **Drop the stream early**: Added an explicit `drop(events)` after the
stream is fully consumed but before tool execution begins. This releases
the rate limit permit so subagents can acquire it.

2. **Removed the `bypass_rate_limit` workaround**: Since the root cause
is now fixed, the bypass mechanism is no longer needed.

Note: no release notes because subagents are still feature-flagged, and
this rate limiting change isn't actually observable without them.

Release Notes:

- N/A
2026-01-23 12:15:55 -05:00
Richard Feldman
21050e2d37
Fix nested request rate limiting deadlock for subagent edit_file (#47232)
## Problem

When subagents use the `edit_file` tool, it creates an `EditAgent` that
makes its own model request to get the edit instructions. These "nested"
requests compete with the parent subagent conversation requests for rate
limiter permits.

The rate limiter uses a semaphore with a limit of 4 concurrent requests
per model instance. When multiple subagents run in parallel:

1. 3 subagents each hold 1 permit for their ongoing conversation streams
(3 permits used)
2. When all 3 try to use `edit_file` simultaneously, their edit agents
need permits too
3. Only 1 edit agent can get the 4th permit; the other 2 block waiting
4. The blocked edit agents can't complete, so their parent subagent
conversations can't complete
5. The parent conversations hold their permits, so the blocked edit
agents stay blocked
6. **Deadlock**

## Solution

Added a `bypass_rate_limit` field to `LanguageModelRequest`. When set to
`true`, the request skips the rate limiter semaphore entirely. The
`EditAgent` sets this flag because its requests are already "part of" a
rate-limited parent request.

(No release notes because subagents are still feature-flagged.)

Release Notes:
- N/A

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
2026-01-20 21:51:54 -05:00
Mikayla Maki
97c35c084b
gpui: Actually remove the Result from AsyncApp (#45809)
Depends on: https://github.com/zed-industries/zed/pull/45768

Refactor plan:
https://gist.github.com/mikayla-maki/6c4bf263fd80050715ba01f45478796e
Overall plan:
https://gist.github.com/mikayla-maki/7bb5078e4385a2e683e1e1eb40d17d38

This is the big one.

Release Notes:

- N/A

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 12:48:24 -08:00
Jakub Konka
f90fe5ce9e
language_models: Make wording for setting env vars consistent (#46240)
Release Notes:

- N/A
2026-01-07 10:26:49 +00:00
Richard Feldman
6055b45ee1
Add support for provider extensions (but no extensions yet) (#45277)
This adds support for provider extensions but doesn't actually add any
yet.

Release Notes:

- N/A
2025-12-18 17:05:04 -05:00
Danilo Leal
0283bfb049
Enable configuring edit prediction providers through the settings UI (#44505)
- Edit prediction providers can now be configured through the settings
UI
- Cleaned up the status bar menu to only show _configured_ providers
- Added to the status bar icon button tooltip the name of the active
provider
- Only display the data collection functionality under "Privacy" for the
Zed models
- Moved the Codestral edit prediction provider out of the Mistral
section in the agent panel into the settings UI
- Refined and improved UI and states for configuring GitHub Copilot as
both an agent and edit prediction provider

#### Todos before merge:

- [x] UI: Unify with settings UI style and tidy it all up
- [x] Unify Copilot modal `impl`s to use separate window
- [x] Remove stop light icons from GitHub modal
- [x] Make dismiss events work on GitHub modal
- [ ] Investigate workarounds to tell if Copilot authenticated even when
LSP not running


Release Notes:

- settings_ui: Added a section for configuring edit prediction providers
under AI > Edit Predictions, including Codestral and GitHub Copilot.
Once you've updated you can use the following link to open it:
zed://settings/edit_predictions.providers

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
2025-12-13 11:06:30 -05:00
Tim McLean
fb90b12073
Add retry support for OpenAI-compatible LLM providers (#37891)
Automatically retry the agent's LLM completion requests when the
provider returns 429 Too Many Requests. Uses the Retry-After header to
determine the retry delay if it is available.

Many providers are frequently overloaded or have low rate limits. These
providers are essentially unusable without automatic retries.

Tested with Cerebras configured via openai_compatible.

Related: #31531 

Release Notes:

- Added automatic retries for OpenAI-compatible LLM providers

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-11-13 14:15:46 +00:00
Danilo Leal
2fb3d593bc
agent_ui: Add component to standardize the configured LLM card (#42314)
This PR adds a new component to the `language_models` crate called
`ConfiguredApiCard`:

<img width="500" height="420" alt="Screenshot 2025-11-09 at 2  07@2x"
src="https://github.com/user-attachments/assets/655ea941-2df8-4489-a4da-bba34acf33a9"
/>

We were previously recreating this component from scratch with regular
divs in all LLM providers render function, which was redundant as they
all essentially looked the same and didn't have any major variations
aside from labels. We can clean up a bunch of similar code with this
change, which is cool!

Release Notes:

- N/A
2025-11-09 14:32:05 -03:00
chenmi
cc1d66b530
agent_ui: Improve API key configuration UI display (#42306)
Improve the layout and text display of API key configuration in multiple
language model providers to ensure proper text wrapping and ellipsis
handling when API URLs are long.

Before:

<img width="320" alt="image"
src="https://github.com/user-attachments/assets/2f89182c-34a0-4f95-a43a-c2be98d34873"
/>

After:

<img width="320" alt="image"
src="https://github.com/user-attachments/assets/09bf5cc3-07f0-47bc-b21a-d84b8b1caa67"
/>

Changes include:
- Add proper flex layout with overflow handling
- Replace truncate_and_trailoff with CSS text ellipsis
- Ensure consistent UI behavior across all providers

Release Notes:

- Improved API key configuration display in language model settings
2025-11-09 13:00:31 -03:00
Danilo Leal
8f3da5c5cd
settings_ui: Add pickers for theme and icon themes (#40829)
In the process of adding pickers for the theme and icon themes fields in
the settings UI, I felt like there was an improvement opportunity in
regards to where some of these components are stored. The `ui_input`
crate originally was meant only for the text field-like component, which
couldn't be in the regular `ui` crate due to the dependency with
`editor`. Given we had also added the number field there—which is
similar in also having the same dependency—it made sense to think of
this crate more like a home for form-like components rather than for
only one component.

However, we were also storing some settings UI-specific stuff in that
crate, which didn't feel right. So I ended up creating a new directory
within the `settings_ui` for components and moved all the pickers and
the custom input field there. I think this makes it for a cleaner
structure.

Release Notes:

- settings_ui: Added the ability to search for theme and icon themes in
their respective fields.
2025-10-21 19:58:43 -03:00
Jowell Young
92a09ecf25
x_ai: Add support for tools and images with custom models (#38792)
After the change, we can add "supports_images", "supports_tools" and
"parallel_tool_calls" properties to set up new models. Our
`settings.json` will be as follows:
```json
  "language_models": {
     "x_ai": {
       "api_url": "https://api.x.ai/v1",
       "available_models": [
         {
           "name": "grok-4-fast-reasoning",
           "display_name": "Grok 4 Fast Reasoning",
           "max_tokens": 2000000,
           "max_output_tokens": 64000,
           "supports_tools": true,
           "parallel_tool_calls": true,
         },
         {
           "name": "grok-4-fast-non-reasoning",
           "display_name": "Grok 4 Fast Non-Reasoning",
           "max_tokens": 2000000,
           "max_output_tokens": 64000,
           "supports_images": true,
         }
       ]
     }
   }

```

Closes https://github.com/zed-industries/zed/issues/38752

Release Notes:

- xAI: Added support for for configuring tool and image support for
custom model configurations
2025-09-29 11:38:55 +00:00
Michael Sloan
67984d5e49
provider configuration: Use SingleLineInput instead of Editor (#38814)
Release Notes:

- N/A
2025-09-25 22:38:27 +00:00
Conrad Irwin
fcdab160f9
Settings refactor (#38367)
Co-Authored-By: Ben K <ben@zed.dev>
Co-Authored-By: Anthony <anthony@zed.dev>
Co-Authored-By: Mikayla <mikayla@zed.dev>

Release Notes:

- settings: Major internal changes to settings. The primary user-facing
effect is that some settings which did not make sense in project
settings files are no-longer read from there. (For example the inline
blame settings)

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
Co-authored-by: Anthony <anthony@zed.dev>
2025-09-18 16:47:23 +00:00
Michael Sloan
a598fbaa73
ai: Show "API key configured for {URL}" for non-default urls (#38170)
Followup to #38163, also makes some changes intended to be included in
that PR.

Release Notes:

- N/A
2025-09-15 05:49:25 +00:00
Michael Sloan
634ae72cad
Misc cleanup + clear language model provider API key editors when API keys are submitted (#38165)
Followup to #38163 along with some other misc cleanups

Release Notes:

- N/A
2025-09-15 05:08:38 +00:00
Michael Sloan
98edf1bf0b
Reload API keys when URLs configured for LLM providers change (#38163)
Three motivations for this:

* Changing provider URL could cause credentials for the prior URL to be
sent to the new URL.
* The UI is in a misleading state after URL change - it shows a
configured API key, but on restart it will show no API key.
* #34110 will add support for both URL and key configuration for Ollama.
This is the first provider to have UI for setting the URL, and this
makes these issues show up more directly as odd UI interactions.

#37610 implemented something similar for the OpenAI and OpenAI
compatible providers. This extracts out some shared code, uses it in all
relevant providers, and adds more safety around key use.

I haven't tested all providers, but the per-provider changes were pretty
mechanical, so hopefully work properly.

Release Notes:

- Fixed handling of changes to LLM provider URL in settings to also load
the associated API key.
2025-09-15 03:36:24 +00:00
Daniel Dye
d7c735959e
Add xAI's Grok Code Fast 1 model (#36959)
Release Notes:

- Add the `grok-code-fast-1` model to xAI's list of available models.
2025-08-26 21:08:45 +00:00
Piotr Osiewicz
9e0e233319
Fix clippy::needless_borrow lint violations (#36444)
Release Notes:

- N/A
2025-08-18 21:54:35 +00:00
Agus Zubiaga
8b89ea1a80
Handle auth for claude (#36442)
We'll now use the anthropic provider to get credentials for `claude` and
embed its configuration view in the panel when they are not present.

Release Notes:

- N/A
2025-08-18 20:40:59 +00:00
Oleksiy Syvokon
2a57b160b0
openai: Don't send prompt_cache_key for OpenAI-compatible models (#36231)
Some APIs fail when they get this parameter

Closes #36215

Release Notes:

- Fixed OpenAI-compatible providers that don't support prompt caching
and/or reasoning
2025-08-15 13:54:24 +03:00
Cretezy
8ff2e3e195
language_models: Add reasoning_effort for custom models (#35929)
Release Notes:

- Added `reasoning_effort` support to custom models

Tested using the following config:
```json5
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-5-mini",
          "display_name": "GPT 5 Mini (custom reasoning)",
          "max_output_tokens": 128000,
          "max_tokens": 272000,
          "reasoning_effort": "high" // Can be minimal, low, medium (default), and high
        }
      ],
      "version": "1"
    }
  }
```

Docs:
https://platform.openai.com/docs/api-reference/chat/create#chat_create-reasoning_effort

This work could be used to split the GPT 5/5-mini/5-nano into each of
it's reasoning effort variant. E.g. `gpt-5`, `gpt-5 low`, `gpt-5
minimal`, `gpt-5 high`, and same for mini/nano.

Release Notes:

* Added a setting to control `reasoning_effort` in OpenAI models
2025-08-13 06:09:16 +00:00
Umesh Yadav
ec52e9281a
Add xAI language model provider (#33593)
Closes #30010

Release Notes:

- Add support for xAI language model provider
2025-07-15 15:35:50 -04:00