Document Gemini caching behavior (#57864) (cherry-pick to stable) (#57871)

Cherry-pick of #57864 to stable

----

## Summary

- Document that Zed-hosted Gemini models do not use Google context caching
- Clarify that Gemini usage is billed only as input and output tokens, with no cached-input price
- Link to Google's Vertex AI context caching and zero-data-retention documentation for background

## Validation

- `script/generate-action-metadata`
- `mdbook build docs`

Release Notes:

- N/A

Co-authored-by: morgankrey <morgan@zed.dev>
2026-06-01 03:14:56 +07:00 · 2026-05-27 20:13:52 +00:00 · efbbd27a46
commit efbbd27a46
parent ce85db149e
1 changed file with 2 additions and 0 deletions
--- a/docs/src/ai/models.md
+++ b/docs/src/ai/models.md
@@ -91,6 +91,8 @@ As of February 19, 2026, Zed Pro serves newer model versions in place of the ret

 ## Usage {#usage}

+Because Zed-hosted Gemini models do not use Google context caching, Gemini usage is billed only as input and output tokens; there is no separate cached-input price for these models. This preserves zero-data-retention behavior for hosted Gemini requests. For background, see Google's Vertex AI documentation on [context caching](https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview) and [zero data retention](https://cloud.google.com/vertex-ai/generative-ai/docs/vertex-ai-zero-data-retention).
+
 Any usage of a Zed-hosted model will be billed at the Zed Price (rightmost column above). See [Plans and Usage](./plans-and-usage.md) for details on Zed's plans and limits for use of hosted models.

 > LLMs can enter unproductive loops that require user intervention. Monitor longer-running tasks and interrupt if needed.
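
As a rough illustration of the billing rule this change documents, the sketch below computes a request's cost from input and output tokens only, with no cached-input term. The function name `gemini_request_cost` and the per-million-token prices passed to it are hypothetical placeholders chosen for the example; they are not Zed's actual prices or any real API.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def gemini_request_cost(input_tokens, output_tokens,
                        input_price_per_mtok, output_price_per_mtok):
    """Return the cost of one request, billed on input and output tokens only.

    Hypothetical helper: prices are expressed per million tokens and passed
    as strings so that Decimal parses them exactly. There is deliberately no
    cached-input parameter, mirroring the documented behavior.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_mtok) / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_mtok) / TOKENS_PER_MILLION
    return input_cost + output_cost


# Placeholder prices, for illustration only.
cost = gemini_request_cost(200_000, 10_000, "1.00", "4.00")
print(cost)
```

Because every input token is charged at the same input rate, resending a long, unchanged context costs the same on each request under this model.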