Treat Claude Code stdout like "Not logged in · Please run /login." as an
auth failure in diagnoseClaudeCliFailure so connection tests and chat
runs surface actionable login guidance instead of raw CLI text.
Lazy srcdoc transport was still active after URL-load preview switched off,
leaving the visible iframe on an empty activation shell until Edit forced a
full srcdoc reload. Mount real artifact HTML whenever srcdoc is the active
transport and remount when leaving URL-load.
Fixes#2791
After fixing source acquisition (#3078), the #3060 validation run reached
the container and got through most of `pnpm install`, then failed building
the better-sqlite3 native module: prebuild-install could not reach github
releases and the node-gyp fallback could not fetch node headers from
nodejs.org (ECONNRESET). The electron postinstall hits the same blocked
hosts, and package tarballs from npmjs were throttled to ~20 KB/s.
The runner's network to npmjs / nodejs.org / github releases is throttled
or reset by GFW; the China npm mirror (npmmirror.com) is fast and complete
(verified from the runner: registry ~2.4 MB/s, node headers ~3.6 MB/s,
better-sqlite3 prebuilt present). Point the in-container install at it via
registry + disturl (node-gyp headers) + electron / electron-builder /
better-sqlite3 binary mirrors + Playwright download host.
Package integrity is still verified against the lockfile, so the mirror
only changes transport. Once a native module builds, pnpm's side-effects
cache in the persistent store keeps it warm for later runs.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Local CLI chat does not supply BYOK credentials to finalize synthesis.
Resolve per-protocol saved settings before calling the daemon and show an
actionable toast instead of a generic BAD_REQUEST when credentials are
missing.
Fixes#2959
Add a `skip_comment` workflow_dispatch input (default false). When set,
the "Comment exploration report" step is skipped, so a validation/dry
run can exercise the full pipeline and produce the report artifact
without posting a public comment on the target PR (useful when testing
against an external contributor's PR). The report is still uploaded as
an artifact for review.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The sandbox checked out PR code with `git fetch https://github.com/...`
*inside* the container. The self-hosted runner's bandwidth to github.com
is throttled across every transport (HTTPS/SSH/codeload/API, all
~30-90 KB/s) and the HTTPS handshake is frequently RST'd, so a
from-scratch fetch of this ~200MB repo is impractical and unreliable per
run (run 26491460889 failed here with repeated GnuTLS resets).
Move source acquisition to the trusted host and make it incremental:
- Keep a persistent bare mirror of the base repo
($HOME/.cache/agent-pr-explore/open-design.git, overridable via
OD_SANDBOX_REPO_MIRROR). Each run fetches only the PR's delta via
`refs/pull/<n>/head` over SSH -- the one transport GFW doesn't reset --
using a read-only deploy key (OD_SANDBOX_GIT_SSH_KEY).
- Take the head from the BASE repo's pull ref so fork PRs work without
depending on the head fork, and verify it equals the resolved HEAD_SHA.
- Check the PR head into a per-run worktree and mount it read-only into
the container; the container copies it into a writable workdir and no
longer needs (or has) any github access.
The deploy key stays on the trusted host and is never exposed to the
untrusted PR code. The mirror must be seeded once on the runner (the
error message prints the exact clone command if it is missing).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Use DESIGN.md telemetry palette values in tokens.css and the
components.html fixture. Document the --success/--danger mapping in
DESIGN.md so spec readers and CSS consumers stay aligned.
Expand and briefly highlight the saved routine row so users can
review it immediately. Extract newest-first sort helper and add
regression tests for list ordering and post-create focus.
* ci: skip docker pull when agent sandbox image is already cached
The agent PR exploration script ran an unconditional `docker pull
"$image"` before `docker run`. Under `set -e`, a transient registry
timeout (the self-hosted runner's network to docker.io is unreliable)
aborts the whole run even when the base image (node:24-bookworm) is
already cached locally — which is what happened on run 26490782540.
Skip the pull entirely when the image is already present, and only pull
when it is missing. This avoids both the failure and the wasted pull
timeout on every run, and keeps a run's base image stable. Refreshing
the cached image is a separate, explicit operation on the runner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: persist agent sandbox pnpm store across runs
The pnpm store was placed under $RUNNER_TEMP, which the Actions runner
wipes per job, so every agent exploration re-downloaded all dependencies
from the npm registry — slow, and as fragile as the runner's docker.io
access (the same network class that already broke the docker pull).
Move the store to a persistent host path ($HOME/.cache/agent-pr-explore/
pnpm-store, overridable via OD_SANDBOX_PNPM_STORE) so a warm,
content-addressed store is reused across runs. `rm -rf "$root"` no longer
touches it since it lives outside the per-run root.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>