Real Codex runs (#3060) explored correctly — verifying 3-4 UI cases with
DOM evidence — but Codex over-planned (6 steps), executed the high-value
ones, then went silent chasing a remaining planned step and tripped
expect-cli's ~180s no-output watchdog, aborting the turn before it emitted
a final report. The run then fell back to an advisory artifact, so the
real findings never reached report.md.
Tighten the prompt so Codex finishes and submits before going idle:
- cap at 3 cases (was 6) and target 2-3, quality over breadth;
- add a CRITICAL instruction stating the runner aborts with no report
after ~3 min of no output, so Codex must stop after 2-3 cases and emit
the complete report in one final turn rather than leaving planned steps
pending or retrying silently.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #1306 routed artifacts whose source matches htmlNeedsSandboxShim() through buildSrcdoc(), which injects a localStorage / sessionStorage polyfill before any user script runs. Issue #1403 stayed open as a verification placeholder against the original repro shape — a React tree whose useState initializer reads localStorage in a sandboxed iframe.
file-viewer-render-mode.test.ts already covers the routing decision. This commit closes the loop on the runtime payload: a real-shape React artifact is fed through buildSrcdoc, and the produced doc is run inside a Node vm context whose window forbids Web Storage the same way an allow-scripts iframe does. We assert that:
- (a) the bare sandbox raises SecurityError on access;
- (b) the shim takes over and exposes a working in-memory store;
- (c) the original boot script that read localStorage from initializers runs cleanly;
- (d) the shim does NOT clobber a working native storage when one is present (the allow-same-origin path stays untouched).
The test also pins shim ordering: the shim script must appear before the first user storage read for the polyfill to be effective.
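The shape of that harness can be sketched roughly as follows. This is an illustration under assumptions, not the test in this commit: `SHIM` is a hypothetical stand-in for whatever buildSrcdoc() injects, and `makeSandboxWindow` / `runInSandbox` are made-up helper names. Scripts read storage through `window.localStorage` so the throwing getter is hit by a plain property access.

```typescript
import * as vm from "node:vm";

// Hypothetical stand-in for the injected polyfill (the real one comes from buildSrcdoc()).
export const SHIM = `
(function () {
  function usable(name) {
    try { return !!window[name]; } catch (e) { return false; }
  }
  function memoryStore() {
    var data = new Map();
    return {
      getItem: function (k) { return data.has(k) ? data.get(k) : null; },
      setItem: function (k, v) { data.set(k, String(v)); },
      removeItem: function (k) { data.delete(k); },
      clear: function () { data.clear(); },
    };
  }
  ["localStorage", "sessionStorage"].forEach(function (name) {
    // Only take over when native storage is unusable.
    if (!usable(name)) {
      Object.defineProperty(window, name, { value: memoryStore(), configurable: true });
    }
  });
})();
`;

// Build a window object. Without `nativeStorage`, the storage getters throw,
// mimicking an iframe sandboxed with allow-scripts only.
export function makeSandboxWindow(nativeStorage?: unknown): Record<string, unknown> {
  const win: Record<string, unknown> = {};
  for (const name of ["localStorage", "sessionStorage"]) {
    if (nativeStorage !== undefined) {
      Object.defineProperty(win, name, { value: nativeStorage, configurable: true });
      continue;
    }
    Object.defineProperty(win, name, {
      configurable: true,
      get() {
        const err = new Error("The operation is insecure.");
        err.name = "SecurityError";
        throw err;
      },
    });
  }
  win.window = win;
  return win;
}

// Run scripts in order inside one vm context and return the window for assertions.
export function runInSandbox(scripts: string[], nativeStorage?: unknown): Record<string, unknown> {
  const win = makeSandboxWindow(nativeStorage);
  const context = vm.createContext(win);
  for (const src of scripts) vm.runInContext(src, context);
  return win;
}
```

Running the boot script alone should surface the SecurityError, while running `[SHIM, boot]` in that order should complete; passing a `nativeStorage` object exercises case (d).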
MediaSurface rendered preview.poster straight into an <img> with no error handler, so an official Community card whose poster URL returned 404, failed to decode, or pointed at a dead host left the browser's default broken-image glyph on the discovery surface. This was reported on the Home page, where several official image-template cards looked unreliable side-by-side with healthy ones.
Track a per-URL load-failure flag and swap in the existing plugins-home__media-fallback element (the typographic glyph + media icon) when the <img> fires onError. The flag resets whenever preview.poster changes, so filter rotations or a daemon repopulating the preview after an offline flip get a fresh attempt instead of staying stuck on the fallback.
Regression tests cover the four shapes: default <img> render, error -> fallback swap, poster URL change resets the failed state, and the original no-poster branch still goes straight to the fallback.
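A minimal sketch of the fallback decision, assuming the failure is tracked by remembering which URL failed, so a new poster URL gets a fresh attempt without an explicit reset. The actual MediaSurface state shape is not shown in this message; `PosterState`, `onPosterError`, and `resolveMedia` are hypothetical names.

```typescript
// Hypothetical state: the poster URL whose <img> last fired onError, if any.
export interface PosterState {
  failedUrl: string | null;
}

export const initialPosterState: PosterState = { failedUrl: null };

// Called from the <img> onError handler with the URL that failed.
export function onPosterError(url: string): PosterState {
  return { failedUrl: url };
}

// Decide what to render for the current poster.
export function resolveMedia(
  poster: string | undefined,
  state: PosterState,
): "img" | "fallback" {
  if (!poster) return "fallback"; // original no-poster branch
  return state.failedUrl === poster ? "fallback" : "img";
}
```

Keying the flag on the URL rather than a bare boolean is one way to get the reset-on-change behaviour described above; an effect that clears a boolean when the poster changes would work equally well.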
Accept <ask-question> as an alias for <question-form> and locate close
tags with a Unicode-safe scan so Turkish dotted-I prose before the tag
does not desync parser indices.
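To illustrate the desync: lowercasing the whole buffer before searching changes its length when it contains "İ" (U+0130), because that character lowercases to two code units. The sketch below shows one index-safe alternative, a case-insensitive match on the original string. It is an assumption about the approach, not the parser's actual code, and the function names are hypothetical.

```typescript
// Matches either closing tag, case-insensitively, on the ORIGINAL text.
const CLOSE_TAG = /<\/(?:question-form|ask-question)\s*>/i;

export function findCloseTag(text: string): { index: number; length: number } | null {
  const m = CLOSE_TAG.exec(text);
  return m ? { index: m.index, length: m[0].length } : null;
}

// For contrast only: indices here refer to the lowercased copy, which can be
// longer than the original, so they must not be used to slice `text`.
export function naiveFindCloseTag(text: string): number {
  return text.toLowerCase().indexOf("</question-form>");
}
```

With `"\u0130stanbul</Question-Form>"`, the safe scan reports the tag at index 8 in the original string, whereas the naive scan reports 9.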
expect-cli defaults to the Claude Code ACP provider, which is not
installed on the self-hosted runner, so the exploration step errored
(AcpProviderNotInstalledError) and fell back to a reachability-only smoke
trace instead of real UI exploration.
Pass `-a codex` to expect-cli so it drives the Codex agent (installed on
the runner, authenticated via CODEX_HOME). Configurable via
OD_EXPECT_AGENT (set to empty to use expect-cli's default). When the
agent is unavailable the existing smoke-trace fallback still applies, so
this is safe even before Codex is authenticated.
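The OD_EXPECT_AGENT rule above, restated as a small sketch. The real step may well be a shell snippet in the workflow; `expectAgentArgs` is a hypothetical helper used only to pin down the three cases (unset, empty, explicit value).

```typescript
// Returns the extra argv to pass to expect-cli for agent selection.
// Unset -> default to codex; empty string -> no flag, so expect-cli uses its own default.
export function expectAgentArgs(env: Record<string, string | undefined>): string[] {
  const agent = env.OD_EXPECT_AGENT ?? "codex";
  return agent === "" ? [] : ["-a", agent];
}
```

Using `??` rather than `||` is what keeps an explicitly empty value distinct from an unset one.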
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Treat Claude Code stdout like "Not logged in · Please run /login." as an
auth failure in diagnoseClaudeCliFailure so connection tests and chat
runs surface actionable login guidance instead of raw CLI text.
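A rough sketch of the kind of check involved, assuming a simple pattern match on stdout. The real diagnoseClaudeCliFailure signature and return type are not shown in this message, so `looksLikeClaudeAuthFailure` is a hypothetical helper and the pattern is an illustration, not the shipped one.

```typescript
// Matches the "Not logged in" line or the "/login" hint, case-insensitively.
const NOT_LOGGED_IN = /not logged in|please run \/login/i;

export function looksLikeClaudeAuthFailure(stdout: string): boolean {
  return NOT_LOGGED_IN.test(stdout);
}
```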
Lazy srcdoc transport was still active after URL-load preview was switched off,
leaving the visible iframe on an empty activation shell until Edit forced a
full srcdoc reload. Mount real artifact HTML whenever srcdoc is the active
transport and remount when leaving URL-load.
Fixes #2791
After fixing source acquisition (#3078), the #3060 validation run reached
the container and got through most of `pnpm install`, then failed building
the better-sqlite3 native module: prebuild-install could not reach github
releases and the node-gyp fallback could not fetch node headers from
nodejs.org (ECONNRESET). The electron postinstall hits the same blocked
hosts, and package tarballs from npmjs were throttled to ~20 KB/s.
The runner's network to npmjs / nodejs.org / github releases is throttled
or reset by GFW; the China npm mirror (npmmirror.com) is fast and complete
(verified from the runner: registry ~2.4 MB/s, node headers ~3.6 MB/s,
better-sqlite3 prebuilt present). Point the in-container install at it via
registry + disturl (node-gyp headers) + electron / electron-builder /
better-sqlite3 binary mirrors + Playwright download host.
Package integrity is still verified against the lockfile, so the mirror
only changes transport. Once a native module builds, pnpm's side-effects
cache in the persistent store keeps it warm for later runs.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>