Document Gemini caching behavior (#57864) (cherry-pick to stable) (#57871)

Cherry-pick of #57864 to stable

----
## Summary

- Document that Zed-hosted Gemini models do not use Google context
caching
- Clarify that Gemini usage is billed only as input and output tokens,
with no cached-input price
- Link to Google's Vertex AI context caching and zero-data-retention
documentation for background

## Validation

- `script/generate-action-metadata`
- `mdbook build docs`

Release Notes:

- N/A

Co-authored-by: morgankrey <morgan@zed.dev>
zed-zippy[bot] 2026-05-27 20:13:52 +00:00 committed by GitHub
parent ce85db149e
commit efbbd27a46


@@ -91,6 +91,8 @@ As of February 19, 2026, Zed Pro serves newer model versions in place of the ret
## Usage {#usage}
Because Zed-hosted Gemini models do not use Google context caching, Gemini usage is billed only as input and output tokens; there is no separate cached-input price for these models. This preserves zero-data-retention behavior for hosted Gemini requests. For background, see Google's Vertex AI documentation on [context caching](https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview) and [zero data retention](https://cloud.google.com/vertex-ai/generative-ai/docs/vertex-ai-zero-data-retention).
Any usage of a Zed-hosted model will be billed at the Zed Price (rightmost column above). See [Plans and Usage](./plans-and-usage.md) for details on Zed's plans and limits for use of hosted models.
> LLMs can enter unproductive loops that require user intervention. Monitor longer-running tasks and interrupt if needed.