Commit graph

135 commits

Author SHA1 Message Date
Caprika
76c7d31c53
chore: bump vela cli to 0.0.4 (#3239)
* chore: bump vela cli to 0.0.4-test.0

* chore: refresh lockfile for vela cli 0.0.4-test.0

* chore(nix): refresh pnpm deps hash

* fix: materialize electron before mac release checks

* fix: rebuild electron when mac framework links are invalid

* revert: drop release workflow experiments

* chore(nix): refresh pnpm deps hash

* fix: stop blocking beta mac release on electron symlink preflight

* fix: stop using custom electron dist for beta mac packaging

* fix: guard oversized chat images and opencode overflow

* chore: bump vela cli to 0.0.4

* chore(nix): refresh pnpm deps hash

* fix(daemon): surface prompt-image stat failures instead of dropping them

resolveSafePromptImagePaths only swallowed unresolvable path input; once a
path was confirmed inside UPLOAD_DIR and existed, a statSync failure
(EACCES/EPERM, a file vanishing mid-run) silently dropped the image and let
the run continue without that prompt context. Since this helper is now also
the 1 MB enforcement point, that turned an infra/validation failure into a
'successful' run with missing required context.

Collect those into a new failedImages bucket and fail the run with
INTERNAL_ERROR at the call site, mirroring the oversized-image guard. Add a
unit test covering statSync throwing.

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
Co-authored-by: lefarcen <935902669@qq.com>
2026-05-29 06:41:17 +00:00
lefarcen
551f967d2c
ci(agent-pr-explore): rewrite prompt — non-lazy disposition + mandatory probe list (#3156)
* ci(agent-pr-explore): rewrite prompt — non-lazy disposition + mandatory probe list

Background: on PR #2355 (large AMR runtime add) the agent stopped at smoke level
because the positive path was gated on a missing `vela` binary. The old prompt
explicitly instructed "if setup prerequisites block, return inconclusive
immediately" + "do not spend more than two attempts on test data" + "do not run
arbitrary host shell commands", which made the agent give up rather than:
- use the PR's own `tests/fixtures/fake-vela.mjs`
- set the `VELA_BIN` env the runtime reads
- probe `/api/integrations/vela/*` directly via fetch

This rewrite shifts disposition + adds 4 structured unblock steps:

- **Mindset**: each /explore is a precious, expensive run; be thorough, not lazy.
- **STEP 0** Read PR body for `## Test Plan` section — declared cases = MUST-COVER.
- **STEP 1** Extract diff-driven probe list (new routes / components / env vars /
  fixtures / CLI flags). Anything skipped requires explicit written reason.
- **STEP 2** Before giving up, try (a) PR-provided fixtures, (b) build minimal stub
  inside container, (c) probe APIs directly via page.evaluate fetch,
  (d) search repo / related PRs / docs for unknown terms.
- **STEP 3** 4-7 cases for substantive PRs (was hard cap 3).
- **STEP 4** Login / multi-tab / OAuth — use Playwright multi-page handling;
  read creds from env, never echo.
- **SECURITY** strengthened: env vars matching common secret patterns are
  confidential; never echo / log / write to file / page.evaluate / report.
- **Report** new required §Mitigations Attempted for Inconclusive verdicts —
  must list what was tried + why each didn't unblock.

Kept unchanged: 3-min keepalive constraint, untrusted-data rule, no-host-shell
rule, report markdown structure (/⚠️// + case emoji).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(agent-pr-explore): truncation-aware + soft-request protocol + capability fix

Addresses review feedback from @mrcfps:

(1) STEP 1 — diff probe list was instructed to enumerate "every new
    route/component/env/fixture/flag" without acknowledging that the harness
    already truncates the diff upstream (file_patch_max_chars + context_max_bytes).
    On exactly the large PRs this prompt targets, the agent only sees a slice.

    Fix: prompt now explicitly tells the agent the context MAY be truncated and
    to (a) note the truncation in §🧭 Scope and (b) emit §📎 Needs to ask the
    maintainer to attach the missing source files into the private workspace
    for the next run.

(2) STEP 2 — old text told the agent to "create stubs inside the sandbox
    container", "rewire env / PATH", and "run gh / grep searches". The harness
    does not expose docker exec or arbitrary shell to the agent (capabilities
    are fs:write on host + Playwright on host driving the dockerized app via
    HTTP). The instructions promised things the agent literally cannot do.

    Fix: STEP 2 now spells out the actual capabilities (host fs:write, Playwright
    page.evaluate / page.request, host-side $WORKSPACE_DIR if maintainer pre-
    attached one). The "unblock by stub" path is rewritten honestly: build a
    host-side stub if useful, but acknowledge container env is fixed at
    docker run and signal what's needed via §🔑 Needs / §📎 Needs for the next
    iteration. The "search repo for unknown term" step (which required gh/grep)
    is dropped in favor of using $WORKSPACE_DIR materials.

(3) Soft-request protocol (new):

    The agent is READ-ONLY for secrets and workspace — it cannot self-attach.
    But it can SIGNAL what was needed via two new optional report sections:

    - §🔑 Needs — secret request ("VELA_RUNTIME_KEY: needed to verify ...")
    - §📎 Needs — workspace file request ("amr-auth-spec.md: clarifies ...")

    The dashboard (synclo platform; see nexu-io/synclo#79 RFC §6.8) will parse
    these structurally and surface as one-click attach hints to the maintainer
    on the run detail screen. Pure passive signal; no auto-action; zero prompt-
    injection risk (no code path takes the values).

    Hard rules:
    - No pasting of existing secret values (security)
    - Each item MUST tie back to a specific blocked case in §🧪 Cases — not
      speculative
    - Workspace privacy: agent may reference workspace files BY PURPOSE
      ("verified positive-1 from test-plan.md") but NEVER paste their content
      into the report

This commit is non-trivial (~65 lines net) but the changes are tightly scoped:
honesty about capabilities + a new signaling channel that replaces the
impossible direct-action promises.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(agent-pr-explore): remove gh-search + container-exec language from prompt

MINDSET bullet: replace 'gh search issues/prs/code' with in-scope
materials (PR body, diff context, workspace files) plus
page.request/page.evaluate probes, matching actual harness capabilities.

SECURITY bullet: replace the contradictory 'You may run commands INSIDE
the sandbox container' with a clear statement that the agent has no shell
or container exec access, only the Playwright browser.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* ci(agent-pr-explore): fix Needs section misuse and unawaited fetch body

Move fixture env-var wiring request from §🔑 Needs to §📎 Needs and
remove the concrete host path from the example; §🔑 Needs is
secret-name-only and must not carry filesystem paths. Await r.text() in
the page.evaluate fetch example so the body field resolves to a string
instead of an unresolved Promise.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* ci(agent-pr-explore): broaden 📎 Needs to cover env/config wiring alongside file attachments

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* ci(agent-pr-explore): fix step-2b Needs routing and split AMR_USER/AMR_PASS example

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

* ci(agent-pr-explore): fix mindset bullet — env var cannot be set mid-run

Replace "set the env var" in the MINDSET mitigations list with
"identify the env var and request the needed startup wiring in §📎 Needs"
to match the actual capability boundary: container env is fixed at
docker run time and cannot be changed by the agent mid-run.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 09:51:19 +00:00
lefarcen
9a4816b101
feat(plugins): site-wide plugin detail pages, share-to-site links, landing deploy trigger (#2999)
* feat(plugins): site-wide plugin detail pages, share-to-site links, landing deploy trigger

Why: a merged plugin PR didn't redeploy the landing site (plugins/** was missing from the deploy paths), and the desktop Share menu copied a local/404 link instead of the public marketplace URL. The landing plugin routing left by the detail-page rework also 404'd: the locale listing's cards used a multi-segment href while detail pages were single-segment, and only 388 bundled _official plugins had pages.

What changed:
- Deploy: landing-page deploy/ci trigger on plugins/**, and skip the slow previews step on an exact cache hit (cache key aligned across both workflows so a PR-built cache is reused by main).
- Share URL: packages/contracts/plugin-url.ts owns the single-segment plugin URL scheme; the web Share menu and the landing site both derive links from it. Web links now point at https://open-design.ai/plugins/<slug>/.
- Full detail coverage: detail pages now cover all 403 local plugins (_official incl. atoms + community), each rendered from its local manifest. Fixes the locale-listing 404s and the community manifest-name/catalog-id (- vs /) mismatch.
- Self-host: daemon exposes OD_SITE_ORIGIN via /api/app-config; web falls back to the canonical origin until the daemon answers.

Validation: pnpm guard, pnpm typecheck (all packages), contracts + web tests green, and a full build E2E confirming all 403 catalog ids and locale-listing cards resolve to built detail pages (0 missing).

* chore: retrigger CI

* ci(landing): carry plugins/** trigger + previews cache-hit into #2994 split workflows

Merged origin/main, which split landing deploy into staging + manual production (#2994). git auto-migrated my landing-page-deploy.yml changes into landing-page-staging.yml via rename detection (plugins/** path, fallback-preview-card.ts cache key, cache-hit skip all carried). The new manual landing-page-production.yml didn't have them, so add the previews cache-key alignment + cache-hit skip there too (plugins/** path is N/A — production is workflow_dispatch only).

* fix(ci): wrangler-action uses pnpm so it tolerates landing's workspace dep

This PR added @open-design/contracts (workspace:*) to apps/landing-page/package.json so the landing site can share the plugin-url slug rules. But the landing deploy/preview steps run cloudflare/wrangler-action with packageManager: npm in workingDirectory apps/landing-page, and 'npm i wrangler' chokes on the workspace: protocol (EUNSUPPORTEDPROTOCOL), failing 'Validate landing page'. Switch all three landing wrangler-action steps (staging / ci preview / production) to packageManager: pnpm, which is workspace-aware.

* test(e2e): bundled plugins now offer the README badge

After this branch, buildPluginShareUrl returns a public open-design.ai link for bundled plugins (not just official-marketplace ones), so the home-starter share menu now shows 'Copy README badge'. Update the assertion from toHaveCount(0) to toBeVisible().

* fix(landing): drop @open-design/contracts dep, use a landing-local slug helper

Per review on #2999: the marketing site must not import @open-design/contracts (AGENTS.md boundary — it's the web/daemon product-runtime contract layer). Move the slug/path helpers into landing-local app/_lib/plugin-slug.ts; the web client keeps contracts' plugin-url. The two derive the same scheme and are verified in lockstep by the e2e route check (403 share URLs -> 403 detail pages, 0 missing). landing no longer has a workspace dep, so revert the wrangler-action packageManager back to npm.

* fix(landing): include plugins/_official in previews cache key

Per review on #2999: generate-previews.ts builds bundled-plugin preview jobs from plugins/_official/**/open-design.json and renders fallback cards from manifest fields (title/description/mode/scenario/tags). With plugins/** now triggering the workflow but the cache key not hashing plugin inputs, a plugin-only PR/merge could exact-hit an old cache and skip the preview regen, shipping with a stale or missing /previews/plugins/<manifest-id>.png. Add plugins/_official/** to the cache key in all three landing workflows (ci, staging, production). community is not currently covered by generate-previews so its glob is omitted.

* fix(plugins): include community marketplace installs in share gate

hasPublicPage now covers sourceMarketplaceId === 'community' so the
README badge and public detail link surface for community installs.
Community manifest names carry a community- prefix that diverges from
the landing-page route slug, so URL derivation uses sourceMarketplaceEntryName
(community/<folder>) instead — pluginDetailSlug takes the last segment,
matching the /plugins/<folder>/ route the landing page emits.
Adds component tests for buildPluginShareUrl, badge copy, and the
Open-in-marketplace link for a community/registry-starter record.

Generated-By: looper 0.9.2 (runner=fixer, agent=claude-code)

---------

Co-authored-by: mrcfps <mrc@powerformer.com>
2026-05-28 09:07:12 +00:00
lefarcen
df8a0faff6
feat(runtimes): register AMR (vela) as an ACP stdio agent (#2355)
* feat(runtimes): register AMR (vela) as an ACP stdio agent

AMR is the vela CLI's ACP runtime mode. `vela agent run --runtime opencode`
speaks ACP JSON-RPC over stdio (see vela's
`specs/current/runtime/manual-agent-run-openrouter.md`); per
`docs/new-agent-runtime-acp.md` we expose it through the same `streamFormat:
'acp-json-rpc'` transport that already powers Hermes, Devin, Kimi, etc.

The new `defs/amr.ts` is the entire wiring — `buildArgs` returns
`['agent', 'run', '--runtime', 'opencode']`, `fetchModels` reuses
`detectAcpModels`, and the fallback list seeds the OpenRouter ids vela's
e2e baseline uses. `executables.ts`/`app-config.ts`/`metadata.ts` get the
matching `VELA_BIN`/`VELA_LINK_URL`/`VELA_RUNTIME_KEY`/`VELA_OPENCODE_BIN`
allowlist + install/docs URLs, so users can configure the per-agent env in
Settings without leaking into other adapters.

Coverage: `tests/fixtures/fake-vela.mjs` is a minimal ACP stub that returns
the documented `initialize` / `session/new` / `session/set_model` /
`session/prompt` shapes; `tests/amr-acp-integration.test.ts` spawns it via
`child_process.spawn` and drives a full turn through `attachAcpSession` and
`detectAcpModels`, so the ACP transport contract for AMR is end-to-end
verified locally even before a real `vela` binary is installed.

Validated:
- pnpm guard
- pnpm typecheck (all workspace projects)
- pnpm --filter @open-design/daemon test (2881/2881)

Deferred: real OpenRouter-backed turn through a built `vela` binary —
the runtime def needs no changes for that path, only `VELA_RUNTIME_KEY`
and `VELA_LINK_URL` in env (or Settings).

* fix(runtimes/amr): pin a concrete default model and bare openai ids

End-to-end validation against a freshly-built `vela` (nexu-io/vela@main)
+ OpenRouter surfaced two contract details the first AMR runtime def
got wrong:

1. vela rejects `session/prompt` with `session/set_model must be called
   before session/prompt`. attachAcpSession in apps/daemon/src/acp.ts
   skips set_model whenever the picked model is the synthetic 'default'
   id, so AMR's fallback list must NOT include DEFAULT_MODEL_OPTION. The
   def now ships a concrete `gpt-5.4-mini` as both `fetchModels`'
   default option and `fallbackModels[0]`, which makes attachAcpSession
   always send a real `session/set_model` for AMR turns.

2. `vela --runtime opencode` auto-prepends `openai/` to whatever modelId
   it forwards to opencode's openai provider. With OpenRouter-style ids
   like `openai/gpt-5.4-mini`, opencode receives the double-prefixed
   `openai/openai/gpt-5.4-mini` and replies `ProviderModelNotFoundError`.
   The new fallback list ships the bare ids opencode's openai registry
   actually knows about (gpt-5.4, gpt-5.4-mini, gpt-5.4-fast, etc.).

Stub + tests:
- tests/fixtures/fake-vela.mjs now enforces the set_model gate the same
  way real vela does, so a regression that silently goes back to
  model: 'default' would surface as a fatal error in tests instead of a
  hidden production failure.
- tests/amr-acp-integration.test.ts pins both contracts: no 'default' /
  no 'openai/' prefix in fallbackModels, and a negative case that
  asserts session/prompt fails when no model is set.

Adds `apps/daemon/scripts/verify-amr-real-vela.mjs` — a small dev-time
runner that drives `attachAcpSession` against a real `vela` binary and
prints the daemon's chat events, so future protocol drift can be checked
against an actual OpenRouter call.

Verified locally: `vela agent run --runtime opencode` + OpenRouter
returns the prompted string ("AMR-E2E-PASS") through the full daemon
pipeline; daemon test suite stays 2883/2883.

* fix(runtimes/amr): substitute concrete model when chat run sends 'default'

A plugin-driven AMR run from the UI surfaced a real-world hole in the
prior commit:

  json-rpc id 3: session/set_model must be called before session/prompt

The Default-design-router plugin (and any caller that doesn't pin a
real model) sends `model: 'default'` straight through, which the AMR
runtime def cannot accept — vela rejects `session/prompt` without
`session/set_model` and attachAcpSession skips set_model whenever
model === 'default'. Just leaving DEFAULT_MODEL_OPTION out of the
adapter's `fallbackModels` is not enough: the chat-run handler in
server.ts still forwarded 'default' verbatim.

This adds `resolveModelForAgent(def, resolved, env?)` as the
single source of truth for the substitution:

  1. If the caller picked a real id, pass it through.
  2. Else, if `def.defaultModelEnvVar` is set and the daemon process
     env has a non-empty value for it, return that (operator escape
     hatch — see below).
  3. Else, if the def's `fallbackModels` does NOT contain a 'default'
     id, return `fallbackModels[0].id`.
  4. Else, return the original value (the historic shape — defs that
     list 'default' themselves are untouched).

AMR sets `defaultModelEnvVar: 'VELA_DEFAULT_MODEL'`, so when
opencode's openai-provider registry deprecates `gpt-5.4-mini`
upstream, an operator can swap the fallback id without a code change
by exporting `VELA_DEFAULT_MODEL=gpt-5.5` before launching tools-dev
/ od. Worth noting the env var must live in the daemon's `process.env`
(Settings-UI per-agent env values only reach the spawned child, not
the daemon's resolver) — the new field's docblock spells this out.

Coverage:
- `tests/runtimes/resolve-model.test.ts` — 8 unit tests covering all
  four resolver branches plus the env-override happy path / fallback /
  ignore-when-user-picked-a-real-id case.
- `pnpm --filter @open-design/daemon typecheck` clean.

* chore(runtimes/amr): move AMR to the top of the base agent list

So `AMR (vela)` shows up first in the agent picker / status views,
ahead of claude / codex. Pure ordering change; no behavior delta.

* feat(amr): Sign-in / Sign-out button on the AMR Settings card

The first half of the AMR work assumed the operator would set
VELA_RUNTIME_KEY / VELA_LINK_URL on the daemon process and never
surfaced login state to users. This adds the missing UX so a fresh
install can drive the full path from Settings:

  - GET  /api/integrations/vela/status   reads ~/.vela/config.json
    for the active profile and returns { loggedIn, profile, user }
    (without leaking the runtime/control keys themselves).
  - POST /api/integrations/vela/login    spawns `vela login` once
    (409 if one is already in flight). The vela CLI opens the user's
    browser to the device-authorization page itself — Open Design
    only needs to kick the subprocess off.
  - POST /api/integrations/vela/logout   removes ~/.vela/config.json
    so the next status read returns logged-out.

`AmrAgentCard` is a dedicated agent-card component for AMR because
the existing `<button>` row can't host an interactive sub-control
(nested interactive elements). It polls /status after a login click
until the daemon reports loggedIn=true (or 5 minutes elapse), and
exposes a Sign-out action on hover. Other adapters (claude, codex,
hermes, …) keep their existing `<button>` card.

i18n: 8 new keys (settings.amrLogin / Logout / LoggingIn / etc.)
added to en + zh-CN. Other locales spread `en` and inherit the
English copy until translations land.

Coverage:
- `tests/integrations/vela.test.ts` pins the config.json reader
  against a tmp HOME — including the negative case where a profile
  has user info but no runtimeKey (still logged-out), and the
  secret-leak guard ("rt-secret-*" must not appear in the projection
  payload).
- `tests/components/AmrAgentCard.test.tsx` covers all four UI
  states (logged-out, logging-in, logged-in, logging-out) plus the
  click-propagation invariant the divergent card was built to keep.

`pnpm --filter @open-design/daemon test` 2901 / 2901 passing.
`pnpm --filter @open-design/web test` 1719 / 1719 passing.
`pnpm typecheck` + `pnpm guard` clean.

Dev script side-effects: `apps/daemon/scripts/verify-amr-real-vela.mjs`
no longer requires both VELA_RUNTIME_KEY and VELA_LINK_URL — if
VELA_PROFILE is set, the vela CLI is allowed to resolve credentials
from `~/.vela/config.json`. Added the two AMR `.mjs` fixtures to
`scripts/guard.ts` allowlist with the executable-fixture / dev-runner
rationale.

* fix(connection-test): substitute model for AMR before attachAcpSession

The chat-run path in server.ts already routes the requested model through
`resolveModelForAgent` so AMR / vela (whose CLI demands an explicit
`session/set_model` before `session/prompt`) gets the def's first
concrete fallback id when the chat run ships `model: 'default'`.
`connectionTest.ts` was wiring `attachAcpSession({ ..., model: model ?? null })`
directly, which made the Test Connection button on the AMR Settings
card deadlock with the same `session/set_model must be called before
session/prompt` error the chat-run path already handles — surfaced as a
permanent "Testing connection…" spinner in the UI.

Reuse the same helper here so Test Connection mirrors chat-run behavior.

* test(amr): three-layer end-to-end coverage for the AMR login + turn flow

The PR up to this point shipped runtime + UI code with unit-level Vitest
coverage. This commit adds the cross-layer regression net the live demo
relied on:

1. apps/daemon/tests/integrations/vela.routes.test.ts (HTTP, Vitest)
   Spins up the real daemon Express app via `startServer({port:0,...})`,
   persists `agentCliEnv.amr.VELA_BIN = <fake>` into app-config.json,
   and exercises every /api/integrations/vela/* endpoint against the
   extended fake-vela stub:
     - status reads ~/.vela/config.json under various states
     - login spawns the fake, waits for config.json to appear, returns
       pid + startedAt + profile
     - 409 already-running guard with the stub's delay knob
     - logout removes the file (idempotent)
     - secrets (runtimeKey / controlKey) never leak in the projection
     - login → status round-trip flips loggedIn=false → true

2. e2e/tests/amr/turn.test.ts (tools-dev orchestrated, Vitest)
   Boots a namespaced daemon + web pair through `createSmokeSuite`,
   inlines a self-contained fake `vela` binary that handles BOTH
   `vela login` (writes ~/.vela/config.json) and
   `vela agent run --runtime opencode` (ACP stdio with the
   `session/set_model must precede session/prompt` gate the real binary
   enforces), then drives a complete /api/runs lifecycle for
   `agentId: 'amr', model: 'default'` and asserts the assistant message
   captures the fake's streamed text. This is the test that would have
   surfaced today's plugin-default-model regression (the `set_model
   before prompt` error) at PR time instead of demo time.

3. e2e/ui/amr-login-pill.test.ts (Playwright)
   Mocks /api/agents + /api/integrations/vela/{status,login,logout}
   to drive the Settings AMR card through the full Sign in → Signed in
   → Sign out cycle. Pins the AmrLoginPill polling contract and the
   aria-label semantics (the pill's accessible name is "Sign out" once
   logged in, regardless of which label the hover-state text shows).

fake-vela.mjs extensions:
   - Handles `vela login` argv by writing
     ~/.vela/config.json for the active VELA_PROFILE and exiting 0 —
     mirrors real vela's on-disk side-effect without the device-auth
     loop.
   - FAKE_VELA_LOGIN_DELAY_MS knob so route tests can observe the
     in-flight state of the spawn lifecycle.
   - FAKE_VELA_LOGIN_USER_EMAIL / _USER_PLAN to assert the surfaced
     user fields end-to-end.

Validated:
   - `pnpm guard` + `pnpm typecheck` (all workspace projects)
   - `pnpm --filter @open-design/daemon test`: 2998 / 2998 passing,
     including the new 8-test integration suite.
   - `cd e2e && pnpm test tests/amr`: 1 / 1 passing.
   - `cd e2e && pnpm exec playwright test ui/amr-login-pill.test.ts`:
     1 / 1 passing (6.7s).

* feat(amr): package native cli and refine login ui

* feat(amr): wire vela cli beta packaging

* docs(amr): document vela ci packaging review

* docs(amr): refine vela ci integration review

* fix(ci): refresh nix pnpm dependency hashes

* fix(pack): clean up Vela CLI packaging

* fix(pack): bundle Vela CLI support files

* fix(amr): recover login attempts from stale auth state

* test: expand AMR and automations coverage

* fix(amr): address review follow-ups

* test(web): align tasks fixtures with contracts

* fix(daemon): type wildcard route params

* fix(ci): refresh PR merge validation

* fix(amr): clear env credentials on logout

* feat(settings): inline local CLI model configuration

* fix(amr): recognize daemon env credentials

* [codex] Fix Vela companion packaging (#2979)

* Fix Vela companion packaging

* Update Nix pnpm dependency hashes

* [codex] Surface AMR account failures (#2980)

* fix: surface AMR account failures

* fix: cover AMR recovery error guidance

* chore: bump beta base version to 0.8.1 (#2990)

* Fix AMR profile and packaged runtime review issues

* Detect packaged AMR OpenCode companion tree

* feat(web): polish AMR frontend flows

* Polish AMR onboarding card

* fix: read AMR login state from dot-amr config (#3048)

* test: tighten AMR credential and packaging coverage

* test: restore AMR executable test env helper

* [codex] Fix packaged mac Dock identity and AMR label (#3076)

* Fix packaged mac sidecar Dock identity

* Rename AMR assistant label

* Fix AMR live models and dot-amr login state (#3073)

* fix: read AMR login state from dot-amr config

* fix: load live AMR models before runs

* fix: point AMR onboarding link to production wallet

* fix: address AMR model review feedback

* fix: persist live AMR model fallback

* [codex] Fix AMR link catalog model ids (#3088)

* Fix packaged mac sidecar Dock identity

* Rename AMR assistant label

* Fix AMR link catalog model ids

* Fix AMR model normalization typecheck

* Use live AMR model for default runs

* fix: polish AMR runtime settings UI

* Accelerate AMR startup defaults (#3092)

* Surface AMR insufficient balance wallet URL (#3099)

* fix(web): polish onboarding controls (#3112)

* fix(web): show CLI scan loading state

* Avoid duplicate AMR wallet recharge links (#3117)

* Avoid duplicate AMR wallet recharge links

* Use Vela CLI 0.0.3 test package

* chore(nix): refresh pnpm deps hash

* Fix AMR wallet guidance display

---------

Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>

* chore(pack): pin Vela CLI 0.0.3-test.1 (#3127)

* chore(nix): refresh pnpm deps hash

* chore(pack): pin Vela CLI 0.0.3

* chore(nix): refresh pnpm deps hash

* fix(web): suppress AMR exit 130 fallback (#3136)

* feat(web): nudge users to hosted AMR on model/auth/quota failures (#3083)

* feat(web): nudge users to hosted AMR on model/auth/quota failures

When a non-AMR agent run fails with an auth / quota / upstream model
error, surface an inline nudge under the error pill linking to Open
Design's hosted AMR gateway (https://open-design.ai/amr). The nudge
fires `surface_view` (element=run_failed_toast) on impression and
`ui_click` (element=go_amr) on the link.

Also teach the daemon to classify CLI-agent auth/quota/upstream failures
(Claude Code, codex, ...) into specific API error codes
(AGENT_AUTH_REQUIRED / RATE_LIMITED / UPSTREAM_UNAVAILABLE) instead of
the generic AGENT_EXECUTION_FAILED, so both the error message and the
nudge key off accurate codes. AMR's own runs are excluded from the
nudge — they keep the dedicated sign-in / recharge affordances.

* feat(web): rework failed-run AMR guidance into per-case error UI

Replace the single inline nudge with a per-case failed-run experience
driven by the run's error code + agent:

- The error card is now neutral gray (was red) and always carries a
  retry button; it is driven by the persisted per-message error event so
  it survives a reload.
- Non-AMR agent hitting a model/auth/quota wall: a theme-color promotion
  card under the error card offers "switch to AMR & retry" — switches the
  run to AMR, opens Settings on the AMR card, and auto-retries once the
  account signs in (ProjectView polls vela login status, independent of
  the Settings pill lifecycle, with success / 5-min-timeout / unmount
  exits).
- AMR agent unauthorized: clearer copy + an "authorize & retry" button.
- AMR agent out of balance: clearer copy + a "top up" button to the AMR
  wallet, with manual retry.
- Settings AMR card: when opened from the nudge, it scrolls into view and
  pulses, and an authorize-button coachmark (a fake hand cursor that
  rises in and dismisses on hover) points at the sign-in control when not
  yet authorized.

analytics: surface_view (run_failed_toast) on the promotion card and
ui_click (go_amr) on its action are retained. i18n adds chat.amrCard.*
and chat.amrError.* (en / zh-CN / zh-TW translated; other locales fall
back to en) and drops the old chat.amrErrorGuidance keys.

* fix(daemon): require status context for numeric service-failure codes

Per review on #3083: the model-service classifier matched bare HTTP
status numbers (`500`, `502`, `429`, `401`), so ordinary CLI output like
`line 500`, `read 502 bytes`, or `exit code 401` could be misclassified
as a provider outage / auth wall and wrongly surface the AMR nudge. Now
a status number only counts when it carries explicit context (`HTTP 500`,
`status 503`, `code: 401`, `502 Bad Gateway`); textual provider phrases
(overloaded, bad gateway, service unavailable, rate limit, …) are
unchanged. Adds fixtures proving unrelated numeric output stays null.

* fix(web): keep error pill for failed runs ChatPane's card doesn't cover

Per review on #3083: the per-message gray error pill was suppressed for
every persisted error status event, but ChatPane only renders the
replacement top-level error card for `retryableAssistantMessage` (the
last failed assistant). So a failed turn that is no longer last (after a
follow-up) or an older failed run in history showed neither the pill nor
the card — its error detail vanished, undercutting reload/history
survival. ChatPane now passes `errorCardOwnerId` (the assistant id whose
error the card represents); AssistantMessage suppresses only that one
pill and keeps rendering StatusPill for all other error events.

* fix(daemon): don't treat a process exit code as an HTTP status

Follow-up to review on #3083: the status-context helper accepted a bare
`code` prefix, so `exit code 401` / `process exited with code 429` still
matched and got classified as AGENT_AUTH_REQUIRED / RATE_LIMITED (the
very `exit code 401` case the comment calls out as noise). `code` now
only counts when qualified (`status code` / `error code` / `response
code`) or punctuation-bound (`code: 401`); bare `exit code N` no longer
matches. Adds fixtures for exit-code lines returning null.

* chore(web): translate AMR card / error keys for 16 remaining locales

PR #3083 added 10 new `chat.amrCard.*` / `chat.amrError.*` keys but only
provided en/zh-CN/zh-TW translations; the other 16 locales fell back to
English. Translate the card title/body, three chips, primary CTA, and
the AMR self-error (auth / balance) messages and buttons for ar, de,
es-ES, fa, fr, hu, id, it, ja, ko, pl, pt-BR, ru, th, tr, uk.

* fix(amr): address review feedback on #2355

Targeted fixes for the unresolved review threads on #2355. Each fix
includes / updates a focused test.

- runtimes/executables.ts: `packagedVelaOpenCodeCompanionTree` now
  verifies the inner `opencode` executable exists + is runnable, not
  just the directory. This closes the false-positive availability path
  that let `detectAgents()` surface AMR as available even when the
  packaged companion was empty / partially copied (mrcfps, 4 threads).

- runtimes/executables.ts: `resolveAmrOpenCodeExecutable` now prefers
  the bundled `<OD_RESOURCE_ROOT>/bin/libexec/opencode/opencode` over a
  stale `opencode` on the user's PATH, so packaged AMR builds can't be
  hijacked by a global installation.

- web/EntryShell.tsx: when the Local CLI scan returns an available
  agent and the previously-selected agent is AMR, switch the selection
  to the first available local agent so the runtime and persisted
  agent agree before Continue.

- server.ts (model-probe branch): for AMR, check `readVelaLoginStatus`
  BEFORE rejecting on an empty live-model catalog — a signed-out user
  was getting `AMR_MODEL_UNAVAILABLE` ("choose a model") instead of
  the correct `AMR_AUTH_REQUIRED` (sign-in affordance).

- server.ts (default model fallback): if the user asked for the AMR
  agent default and the cached id is no longer in the FRESH catalog,
  fall back to `liveModels[0]` from the probe instead of rejecting the
  run as `AMR_MODEL_UNAVAILABLE`.

- integrations/vela.ts: route `vela login` through
  `createCommandInvocation` so an npm/Node-style `vela.cmd` / `.bat`
  shim on Windows gets the correct `cmd.exe /d /s /c …` wrapping with
  verbatim args (matches `execAgentFile` / chat-run spawning).

- tools/pack/src/linux.ts: in containerized Linux builds, bind-mount
  the host directory of `OPEN_DESIGN_VELA_CLI_BIN` and rewrite the env
  to the container-side path. The host path was being passed in as-is
  even though the default container only mounts /project, /tools-pack
  and cache/home — `copyOptionalVelaCliBinary` saw a missing path.

Deferred (out of scope for this PR):
- `od amr status/login/logout/cancel` CLI subcommands (AGENTS.md
  UI/CLI dual-track rule, server.ts:5763) — sizable surface; tracked
  for a separate focused PR.
- Strict `--require-vela-cli` for Windows + mac-x64 beta builds:
  prematurely blocked — `@powerformer/vela-cli` only publishes the
  `darwin-arm64` platform binary today; adding the flag elsewhere
  would fail the builds. Revisit once win/x64/linux binaries ship.

* fix(amr): hoist sendAmrAccountFailure above the AMR catalog preflight (TDZ)

The new signed-out AMR branch in the catalog preflight at server.ts:10875
calls `sendAmrAccountFailure(...)` to emit AMR_AUTH_REQUIRED, but the
const declaration sat ~100 lines below at the outer function scope. Because
`const` is TDZ-aware, that branch would have thrown `ReferenceError:
Cannot access 'sendAmrAccountFailure' before initialization` for the
exact users it tries to help — defeating the original intent.

Hoist the helper to just above the AMR preflight block so it's available
to every AMR code path in this function. Behavior elsewhere is unchanged.

Also rerun the daemon test suite: `launch.test.ts > resolveAgentLaunch
uses packaged built-in Vela for AMR` was creating the
`<resourceRoot>/bin/libexec/opencode/` companion *directory* only, but
this PR's earlier tightening of `packagedVelaOpenCodeCompanionTree`
also requires the inner `opencode` executable. Add it to that fixture
to match the new contract; the test was a sibling of the executables /
env-and-detection fixtures already updated in 13fc4f4.

Addresses #2355 review (mrcfps, 2026-05-28).

* feat(web): add hover cancel for AMR login (#3158)

* feat(web): add hover cancel for AMR login

* fix(web): don't bounce AmrLoginPill back to 'Signing in…' after local cancel

Both codex-connector (P2) and looper (CHANGES_REQUESTED) on this PR
flagged the same race in the new local-cancel path: `handleCancelLogin`
dispatches `notifyAmrLoginStatusChanged('login-canceled')` immediately
after `/login/cancel` returns, but the `AMR_LOGIN_STATUS_EVENT` listener
unconditionally re-enters `refresh()` and then restarts polling
whenever `/api/integrations/vela/status` still reports
`loginInFlight: true`.

That is a real race because the daemon's `cancelVelaLogin()` only sends
SIGTERM (escalating to SIGKILL after `LOGIN_CANCEL_KILL_GRACE_MS` =
2000 ms) and keeps the child in `activeLoginProcs` until it actually
exits — so the first `/status` read after a successful cancel can
legally still come back as in-flight. Under that window the pill flips
back to 'Signing in…' and can later surface the timeout/error path even
though the user already canceled, defeating the behavior promised in
the PR description.

Fix the listener instead of every dispatch site: in the
`login-canceled` branch, after the local reset (stopPolling +
setPending(null) + clear refs), optimistically mark every subscribed
pill instance as not-in-flight (`setStatus((c) => c ? { ...c,
loginInFlight: false } : c)`) and `return` — skip the
refresh-and-reconcile branch below entirely. The next explicit refresh
(component mount, user interaction, or a `status-changed` event) will
pick up the daemon's confirmed state once the child has actually
exited.

Add a focused regression test that holds `/api/integrations/vela/status`
at `loginInFlight: true` even after a successful `/login/cancel`,
asserting that the pill stays at the Canceled → Authorize sequence and
never bounces back to 'Signing in…'. This test fails on the pre-fix
listener and passes on the new behavior; existing
'cancels an in-flight AMR sign-in…' and 'reconciles late AMR browser
completion to Signed in after local cancel' tests continue to pass.

Addresses review feedback on #3158 (chatgpt-codex-connector, nettee).

---------

Co-authored-by: lefarcen <935902669@qq.com>

---------

Co-authored-by: a1chzt <chizblank@gmail.com>
Co-authored-by: Amy <1184569493@qq.com>
Co-authored-by: Mason <jinmeihong0201@gmail.com>
Co-authored-by: Caprika <56862773+alchemistklk@users.noreply.github.com>
Co-authored-by: open-design-bot[bot] <282769551+open-design-bot[bot]@users.noreply.github.com>
2026-05-28 05:09:55 +00:00
PerishFire
d3a5e2901b
Add release-beta-s self-hosted workflow placeholder (#3150)
* Add self-hosted beta release workflow placeholder

* ci: register self-hosted runner labels for release-beta-s lint

actionlint flagged the custom self-hosted labels nexu-win and release-beta
as unknown. Add them to .github/actionlint.yaml so the release-beta-s
workflow passes the lint gate.

Generated-By: looper 0.0.0-dev (runner=fixer, agent=claude-code)

---------

Co-authored-by: libertecode <libertecode@proton.me>
2026-05-28 03:44:52 +00:00
lefarcen
abe72af2a2
fix(ci): resolve blog-SEO base ref to a SHA in merge queue (#3137)
Some checks failed
visual-baseline / Capture visual baselines (push) Waiting to run
actionlint / Lint GitHub Actions workflows (push) Failing after 1s
ci / Detect CI change scopes (push) Successful in 0s
landing-page-ci / Validate landing page (push) Failing after 1s
landing-page-staging / Deploy landing page to staging (push) Has been skipped
nix-check / build (push) Failing after 2s
ci / Validate Nix flake (push) Has been skipped
ci / Preflight (push) Failing after 2s
ci / Workspace unit tests (push) Failing after 2s
ci / Daemon workspace tests (push) Failing after 2s
ci / Web workspace tests (push) Failing after 1s
ci / Browser tests (push) Failing after 1s
ci / Build workspaces (push) Failing after 1s
ci / Validate workspace (push) Failing after 0s
ci / Runtime trace (push) Has been skipped
The "Lint changed blog SEO" and "Guard blog URL changes" steps in
landing-page-ci.yml fell back to the literal "HEAD^" when no base SHA was
available. On merge_group (and first-push) events there is no
pull_request.base.sha / before, so BASE became "HEAD^", and the
blog-indexing scripts' assertSafeGitRef (regex /^[A-Za-z0-9_./:-]+$/,
which forbids "^") threw `Unsafe git ref for base: HEAD^`, failing the
merge queue for every PR.

Resolve the fallback to a concrete commit with `git rev-parse HEAD^`
(checkout uses fetch-depth: 0, so it is always available), mirroring
blog-indexing-on-deploy.yml. The resulting 40-hex SHA passes the ref guard.
2026-05-27 17:07:09 +00:00
lefarcen
ce9fa687ca
ci: trigger PR exploration via maintainer /explore comment (no approval) (#3139)
* ci: trigger PR exploration via maintainer "/explore" comment (no approval)

Add a low-friction way to run the sandbox exploration: a maintainer
comments "/explore" on a PR.

- on: issue_comment (kept workflow_dispatch). The job `if` allows the
  comment path only when it is on a PR and the commenter has write access
  (author_association OWNER/MEMBER/COLLABORATOR), so randoms cannot trigger
  it; untrusted PR code still runs only inside the Docker sandbox.
- Drop the agent-pr-explore environment approval gate: both triggers are
  already write-gated and there is no auto-trigger, so the extra manual
  approval is redundant. R2 creds are repo-level secrets (no env-scoped
  secrets), so they stay available without the environment.
- Feedback: 👀 reaction on the command + a placeholder comment carrying
  the report marker (so the run yields one evolving comment), 🚀 on
  success, and 👎 + a failure note (with the run link) on failure.

Does not auto-run on every PR, so unrelated PRs stay clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: don't clobber a produced report with the /explore failure note

Review: the failure-feedback step ran after the always() report step, so
on the failure-with-report case (sandbox wrote a report then exited
non-zero) it overwrote the just-posted report with the generic "failed
before producing a report" note — losing the useful output.

Guard it: if the report file exists, leave the posted report in place and
skip the failure note/reaction. Only post the short failure note when no
report was produced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 16:48:58 +00:00
lefarcen
ae7a417208
ci: add idempotent provision script for the agent-pr-explore runner (#3122)
* ci: add idempotent provision script for the agent-pr-explore runner

The self-hosted runner's setup was hand-assembled and easy to lose on a
rebuild — most dangerously the codex-acp pin: expect-cli bundles
codex-acp 0.10, which is incompatible with ChatGPT-account auth (every
model rejected); we run 0.15, but any expect-cli reinstall silently
reverts it and breaks the agent.

Add a self-contained, idempotent provision script that brings the
runner's config layer back to a working state and is safe to re-run:
codex model pin (gpt-5.4), the codex-acp 0.15 pin (npm pack + extract +
chmod), deploy-key generation, base-repo git mirror seed/refresh,
pnpm-store/reports dirs, the weekly image-refresh helper + cron, and the
readiness self-check helper. The header documents the manual/secret
steps it intentionally does not automate (base toolchain + colima, the
interactive `codex login`, registering the deploy key on the repo, and
registering the Actions runner service).

Verified idempotent against the live runner (all checks pass, no config
disturbed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: provision — update codex model key in place, don't truncate config.toml

Review: step 2 overwrote the whole ~/.codex/config.toml with just the
model line whenever the exact pin wasn't already present, dropping any
other Codex settings on a re-run — destructive, contradicting the
idempotent goal. Now: replace an existing `model =` line in place (sed),
append only when the key is absent, and leave the rest of config.toml
untouched. Verified preservation locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: provision — create ~/.ssh before ssh-keygen on fresh host

Review: on the fresh-rebuild path this script targets, ~/.ssh usually
does not exist, so `ssh-keygen -f ~/.ssh/od_agent_deploy` fails with
"No such file or directory" and the deploy key (and downstream mirror
bootstrap) never gets created. mkdir -p the key's parent dir (chmod 700)
before keygen, and only print the pubkey when it actually exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:51:59 +00:00
Marc Chan
bc44a8add3
ci: relay contributor card events to worker (#3113) 2026-05-27 14:48:01 +00:00
lefarcen
54f225d6b3
ci: retry PR-context gh calls so a transient API blip doesn't abort the run (#3128)
* ci: retry PR-context gh calls so a transient API blip doesn't abort the run

The early PR-context gathering calls `gh pr diff`, `gh pr view`, and
`gh api .../files`. gh hits api.github.com under the hood, and a single
transient timeout/5xx there aborts the whole run before any exploration
(seen on #3083: "could not find pull request diff: Get \"https://api.
github.com/...\": net/http timeout"). These were the only network calls
in the run without a retry (source fetch + npm already retry).

Add a small gh_retry helper (4 attempts, linear backoff) and wrap the
three read-only context calls. gh writes nothing to stdout on a failed
API call, so retrying is safe even for the calls piped into the context
file; the retry warning goes to stderr.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: address review — buffer gh retries to files (no paginated duplication)

Review (Siri-Ray): wrapping `gh api --paginate` retries inline while the
context block is redirected to the file means a mid-pagination failure
leaves partial pages in the context, and the retry appends them again —
duplicating the patches section and burning the context budget the agent
reads.

Replace the in-pipe gh_retry with gh_retry_file: each call buffers to its
own file per attempt (`>` truncates on open, so a failed/partial attempt
is discarded before the next), and the context block just cats the
finished files. Fetch PR body + patches to files up front, then assemble.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:42:08 +00:00
lefarcen
a6a56099ca
ci: show per-case pass/fail status emoji in agent report (#3118)
Reviewers asked for at-a-glance outcomes. Instruct the agent to begin each
"Cases Tested" bullet with a status emoji ( pass /  fail / ⚠️ warning /
 inconclusive) and a bold case name, so the report shows which checks
passed or failed without reading each line.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 13:34:29 +00:00
lefarcen
bf61a39cb5
ci: clean agent report (write-to-file) + slim artifacts/uploads (#3116)
* ci: clean agent report (write-to-file) + slim artifacts/uploads

Four related cleanups to the agent PR exploration output:

1. Clean report. The PR comment / report.md was assembled by dumping the
   entire verbose expect.log (ACP init logs, "Git failed" warnings, the
   ~24KB echoed prompt, ANSI codes, progress checklist) under the trace
   header -- ~28KB of noise. Instead, instruct the agent to write its
   final Markdown report to a file via its file-write tool, and have the
   runner read that file directly. Verified: Codex writes a clean report
   to the given absolute path. Falls back to an inconclusive note if the
   agent did not finish.

2. Drop duplicate trace/video. The script copied
   playwright-smoke-trace.zip -> playwright-trace.zip (a ~28MB legacy
   duplicate) and the webm likewise, and uploaded both to R2. Keep only
   the canonical smoke-named artifacts.

3. Slim the GitHub artifact. The trace zips and videos are already on R2;
   exclude *.zip / *.webm from the uploaded artifact so it drops from
   ~56MB to <1MB (report + logs only).

4. Persist report on the runner. Copy the report / agent-report /
   expect.log / trace URL to a stable host dir
   ($HOME/.cache/agent-pr-explore/reports/pr-<n>) so dry runs
   (skip_comment) can be inspected without downloading the artifact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: address review — keep advisory reports + recursive artifact excludes

Review findings on the report/artifact cleanup:

1. Regression fix: the non-app-surface and deterministic-verifier branches
   write their pre-baked advisory report (Inconclusive / Pass / Fail) and
   never run the agent, so they don't produce agent-report.md. After
   switching write_agent_report_artifact to read only agent-report.md they
   fell through to the "agent did not write a final report" fallback,
   dropping the real advisory (and mis-reporting on .github-only PRs like
   this one). Fix: those branches now write their advisory directly to
   $agent_report_file — single source of truth for the report body.

2. Recursive artifact excludes: the source Playwright recording lives at
   artifacts/playwright-video/<uuid>.webm; non-recursive !*.webm / !*.zip
   didn't match the subdirectory. Use **/*.zip and **/*.webm so the slim
   actually holds.

3. Drop the now-dangling summary.legacyTrace field (the legacy trace copy
   is no longer produced), matching the legacyVideo removal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 12:30:05 +00:00
Jane
f8c860a505
feat(landing-page): localize plugins library across 18 locales (#3010)
* feat(landing-page): localize plugins library across 18 locales

PR #2926 shipped the new `/plugins/` library hub + four kind sub-routes
+ detail pages, but the chrome was English-only — visitors landing on
`/zh/plugins/` saw the old marketplace registry placeholder rendered
by the catch-all instead, and detail pages rendered identical English
copy regardless of locale prefix. This PR brings the plugin surface
to feature parity with `/zh/skills/`, `/zh/templates/`, `/zh/systems/`,
`/zh/craft/`.

## What changes

- New `app/_lib/plugins-i18n.ts` — single source for all plugin chrome
  copy (hub, list pages, chip rails, share dialog, detail-page meta
  labels). English baseline + 17 locale overrides keyed on
  `LandingLocaleCode` (the same short-code shape `localeFromPath()`
  returns). Missing keys per locale fall back to English so a
  partially-translated locale still renders sensibly. Translations
  cover hub copy, four tile titles + blurbs, seven artifact-kind
  labels + descriptions, 23 scene-subcategory labels, 18 detail-page
  chrome strings, and a six-key share-dialog table with a
  per-locale `shareTemplate({title, url})` function (translated for
  every locale where `_lib/i18n.ts` already had one — same voice).
- `app/pages/plugins/{,templates/,templates/[kind]/,skills/,systems/,
  craft/,[slug]/}/index.astro` — every hardcoded English string now
  reads `getPluginsCopy(locale)` keys. Page logic and routing
  unchanged.
- New short-code wrappers under `app/pages/[locale]/plugins/` — six
  files (hub + three sub-routes + `[kind]/` and `[slug]/`) following
  the same pattern `[locale]/skills/index.astro` already uses: each
  re-exports the canonical page component and adds a per-locale
  `getStaticPaths()` so the build emits 17 locale prefixes per
  plugin route. Total plugin-route prerender count goes from ~390 to
  ~7 000, matching the existing skill/template scaling.
- Catch-all (`[locale]/[...path].astro`) — old `getPublicPlugins` /
  `getRegistryCounts` registry rendering removed (placeholder UI
  that was never wired to a real marketplace data source). Plugin
  routes now live exclusively under `[locale]/plugins/...` short-code
  wrappers, so the catch-all stops claiming `'plugins'` as a route
  root. The dead-code path also drops a `pluginCounts.all` reference
  the title row was reading.
- `.plugins-tile-grid` styles promoted from a scoped `<style>` in the
  default-locale hub to global `app/sub-pages.css` so the
  short-code wrapper renders the same hub markup without re-mounting
  per-page CSS — `display: contents`-style scoping pitfalls in
  Astro's per-component CSS scoping made this the cleanest fix.

## Surface area

- [ ] **UI** — new page / dialog / panel / menu item / setting / empty state in `apps/web` or `apps/desktop`
- [ ] **Keyboard shortcut** — new or changed
- [ ] **CLI / env var** — new `od` subcommand or flag, new `tools-dev` flag, or new `OD_*` env var
- [ ] **API / contract** — new `/api/*` endpoint, new SSE event, or changed shape in `packages/contracts`
- [ ] **Extension point** — new entry under `skills/`, `design-systems/`, `design-templates/`, or `craft/`, or change to the skills protocol
- [ ] **i18n keys** — new translation keys (full plugin chrome added across all 18 locales)
- [ ] **New top-level dependency** — adding any new entry to the **root** `package.json`
- [ ] **Default behavior change** — changes what existing users experience without opting in
- [x] **None** — landing-page-only restoration of i18n parity for the plugin surface

## Validation

- `pnpm --filter @open-design/landing-page typecheck` → 0 errors
- `pnpm --filter @open-design/landing-page build:static` → 16 127 pages
  built (+6 584 over current main: ~388 plugin detail pages × 17
  locale prefixes plus the hub + four sub-routes × 17 locales).
- `copy-example-html.ts` reports `266 entry files + 65 referenced
  files`, identical to before — no regression in the asset-mirroring
  pipeline.
- Local Playwright smoke (`/zh/plugins/...`):
  - `/zh/plugins/` renders `<title>插件库 · Open Design</title>`,
    label `插件库`, h1 `407 个可组合的构件。`, four tiles labelled
    `模板 / 技能 / 设计系统 / 工艺`.
  - `/zh/plugins/templates/video/` renders h1 `48 视频`, scene chips
    `全部 / 动效 / 短视频 / 营销 / 产品 / 数据讲解`.
  - `/zh/plugins/example-article-magazine/` share dialog renders
    `复制下面的文案、然后跳到你想分享的平台粘贴即可` etc., share
    template auto-interpolates plugin title + URL into Chinese voice.
  - All 18 locale prefixes (`/zh`, `/zh-tw`, `/ja`, `/ko`, `/de`,
    `/fr`, `/ru`, `/es`, `/pt-br`, `/it`, `/vi`, `/pl`, `/id`, `/nl`,
    `/ar`, `/tr`, `/uk`) → 200 across hub + four sub-routes + sample
    detail page.
  - English `/plugins/` unchanged (default-locale path bypasses the
    `[locale]/...` wrapper).

* feat(landing-page): finish plugins i18n chrome across 18 locales

The first localization pass shipped a partial fix: hub headings, lead
copy, two-level page chrome, detail-page metadata labels, the share
dialog, and the chip rail were still falling back to English on every
non-English locale because plugins-i18n.ts only filled a chrome slice
for `zh` and the file header even claimed "7 artifact-kind labels and
25 scene-subcategory labels are translated" for every locale that did
not yet have those blocks.

Three changes close the visible gap:

1. plugins-i18n.ts: fills the 27 still-missing chrome fields per locale
   for zh-tw / ja / ko / de / fr / ru / es / pt-br / it / vi / pl / id /
   nl / ar / tr / uk. Includes the 7-key category map, the 23-key
   subcategory map, hubHeading / hubLead, the 4 *Label / *Heading /
   *Lead triples for the templates / skills / systems / craft hub
   pages, the 4 tile blurbs, the 4 browse buttons, sceneLabel, allChip,
   the 12 detail-page metadata labels (mode / scenario / platform /
   surface / author / manifest id / tags / preview caption / find on
   GitHub / homepage / open in new tab) and bucket label map, the
   detail share dialog (title / copy link / jump-to), and the
   header-side nav.plugins entry. zh receives the same 11 detail-page
   and share-dialog labels it was also missing.

2. header.tsx + site-footer.astro: routes the hardcoded "Plugins /
   Templates / Skills / Systems / Craft" labels through `nav.*` from
   HeaderCopy, so every locale gets its own dropdown trigger and
   footer column. Adds `nav.plugins` to HeaderCopy and fills it in 18
   locales with the local form ("插件" / "プラグイン" / "Plugins" /
   "Plug-ins" / "Plaginy" / "الإضافات" / etc).

3. plugin-row.astro + content-i18n.ts: chip rail. The bundled-plugin
   branch now runs raw `mode` / `scenario` slugs through the shared
   localizeTaxonomyValue, and that helper now also consults the
   plugins-i18n subcategory map before giving up. localizeTaxonomyValue
   now returns undefined on a true miss instead of the unknownTag
   placeholder, so chips drop quietly instead of showing "Category" /
   "分類" / "Categoría" for taxonomy slugs we have not localized yet.
   Callers that genuinely want the placeholder (`localizeContentTag`,
   blog `category`, system noun) still keep the explicit fallback.

Out of scope and tracked separately: per-plugin title and description
in plugins/_official/* (author-supplied English metadata, ~401 plugins
without an i18n schema in the manifest yet — needs RFC + tooling
before the manifests can be expanded), and adding the long tail of
mode / scenario / category slugs (`code-migration`, `plugin-sharing`,
`tune-collab`, `live-artifacts`, `engineering`, ...) to TAXONOMY_TERMS
so chips render localized labels for every taxonomy value rather than
dropping silently.

* feat(landing-page): cover plugins chip rail long-tail taxonomy slugs

PR #3010's first round localized the high-frequency mode/scenario
chips (prototype, video, image, marketing, design, ...) but left the
~37 mode/scenario and 14 category slugs that show up in real `od.*`
metadata — code-migration, plugin-sharing, design-system, planning,
scenario, refine, discovery, handoff, token-map, tune-collab, orbit,
live-artifacts, engineering, healthcare, hr, sales, support,
default-router, downstream-export, figma-migration, media-generation,
plugin-authoring, validation, 3d-shaders, animation-motion,
audio-music, creative-direction, design-systems, diagrams, documents,
image-generation, marketing-creative, screenshots, slides,
video-generation, web-artifacts, ... — falling through to undefined
and dropping their chip silently on every non-English locale.

The data layer is the source of truth here, so this expansion lands
in `content-i18n.ts:TAXONOMY_TERMS` / `CATEGORY_LABELS` rather than
the plugins-i18n catalog: a single dictionary entry per slug fans out
to every chip-rail consumer (catalog rows, detail metadata, the
templates/[kind] facets) without each consumer touching its own copy.

Translations cover all 17 non-`en` locales. Brand and product nouns
(Figma, Open Design, BYOK, plugin) stay literal; technical taxonomy
slugs get short equivalents that read as chips rather than full
prose. The result on `/ja/plugins/skills/` matches `/plugins/skills/`
chip-for-chip (30 chips both sides) instead of dropping 27 of them
the way the previous iteration did.

* feat(landing-page): read manifest title_i18n / description_i18n on bundled plugins

PR #3010's prior rounds localized chrome and chip rails but the
catalog's most prominent text — each row's plugin name and blurb —
stayed English on every non-English locale. The plugin manifest
schema (`packages/contracts/src/plugins/manifest.ts`) has supported
`title_i18n` and `description_i18n` (Record<locale, string>) on every
manifest from spec v1; ~24 of the 401 first-party manifests already
carry one for `zh-CN`. The reader was just never wired to use them.

This change does the reader half: bundled-plugins.ts captures the
two i18n maps off each `open-design.json`, plugin-row.astro and the
detail page resolve them at render time via two new helpers
(`resolveBundledTitle`, `resolveBundledDescription`) that mirror the
short→long fallback chain documented in the manifest spec
(`htmlLang` like `zh-CN` → short `LandingLocaleCode` like `zh` →
primary tag → `en` → English baseline). The static-paths pass still
runs once for all locales — it has to, since each manifest produces
one URL — but the title/description shown on the rendered page now
reads the locale off `Astro.url.pathname` and picks the right entry
out of the maps.

Verified locally: `/zh/plugins/example-card-twitter/` now reads
"Twitter 分享卡 / 推特金句 / 数据卡, 适合配推文" from the manifest's
existing `zh-CN` block instead of the English baseline.

Plugin-data half follows in a separate commit. The 17 non-English
locales × 401 manifests need backfilling so the reader has something
to resolve to; that's data, not schema, and lands as a sequence of
manifest patches rather than tangled with this code change.

* feat(plugins): translate scenarios bucket title/description across 17 locales

Closes the first chunk of #3028. Eleven scenarios plugins (the
default-scenario bundle for each taskKind: code-migration,
figma-migration, media-generation, new-generation, tune-collab,
plugin-authoring; the default design router; the React / Vue /
Next.js downstream-export starters; and the Refine baseline) get
title_i18n + description_i18n filled for all 17 non-English locales
the landing page serves (zh-CN, zh-TW, ja, ko, de, fr, ru, es,
pt-BR, it, vi, pl, id, nl, ar, tr, uk).

The reader landed in 7ddfe36; this commit is data-only. taskKind
slugs that other docs reference by name (`code-migration`,
`figma-migration`, `tune-collab`, etc.) stay literal in the
descriptions so cross-references still resolve. Brand nouns —
Open Design, Next.js, React, Vue, Figma — also stay literal.

`/ja/plugins/od-code-migration/` now reads
"コードマイグレーション(デフォルトシナリオ)" instead of the English
baseline; `/zh/plugins/skills/` shows "代码迁移(默认场景)" in the
catalog row.

Remaining buckets (image-templates 45, video-templates 50,
examples 140, design-systems 142 = 377 plugins) follow in
subsequent commits in this PR.

* fix(landing-page): drop CJK template wrap when source name is still English

The Chinese / Japanese / Korean fallback templates for craft, skill,
template, system, plugin, and blog text splice the source `name` /
`title` into a CJK sentence frame: ``${name}工艺规则``,
``Open Design 指南:${topic}``, ``${name} は…のスキルです``. When the
underlying SKILL.md / craft markdown / blog frontmatter still ships
an English name (true for ~95% of the catalog today), that produces
mid-sentence script straddling on `/zh/...`, `/zh-tw/...`, `/ja/...`,
`/ko/...` like:

  H1   : "Editorial typography hierarchy工艺规则"
  Lead : "这条 Open Design 工艺规则定义 Editorial typography hierarchy
          的执行标准…"
  Plug : "video 插件 · 3D Animated Boy Building Lego"

That reads worse than the all-English fallback, because the visitor
parses the page in two scripts at once.

Adds a `nameNeedsEnglishFallback` guard that fires for the four CJK
locales whenever the spliced-in name has no CJK characters of its
own, and threads it through every `localizeXxxText` helper:
craft, template, system, plugin, skill, blog. When it fires the
helper returns the raw English content untouched, so the section
renders end-to-end in one language. Chrome (header, footer, breadcrumb,
buttons, share dialog) keeps its CJK rendering — only the
title-and-lead block falls back.

Side benefit: the same guard kicks in on the long tail of plugin
manifests still pending `title_i18n` / `description_i18n` backfill
(tracked in #3028), so `/zh/plugins/<bundled>/` no longer pairs a
"video 插件 · 3D Animated Boy Building Lego" title with a Chinese
breadcrumb. The page reads "3D Animated Boy Building Lego" + the
English manifest description, while header / footer / breadcrumbs
stay localized. Once a manifest ships its i18n maps, the chrome and
body re-converge automatically.

Non-CJK non-Latin scripts (ar, vi, ...) keep the previous behavior —
their templates already read tolerably with English names. If that
turns out to be wrong on a real audit, the same guard generalizes by
adding the matching Unicode range and locale set.

* feat(plugins): translate image-templates bucket title/description across 17 locales

44 of 45 image-templates plugins get title_i18n + description_i18n
filled for all 17 non-English locales (zh-CN, zh-TW, ja, ko, de, fr,
ru, es, pt-BR, it, vi, pl, id, nl, ar, tr, uk). Generated via Claude
Sonnet 4.5 over the OpenRouter gateway, ~$1.38 in API spend, 156s
wall-clock. Brand and cultural references stay literal (Open Design,
Lego, Hanfu, Showa, Pokémon, Black Myth: Wukong). Long AI generation
prompts collapse to a 1-2 sentence summary capturing what the plugin
does — the description doubles as catalog blurb on the landing site,
not as the actual generation prompt (which lives in example.html /
the manifest's preview entry).

Skipped: `profile-avatar-realistically-imperfect-ai-selfie` returned
malformed JSON on three retries; will rerun with a tighter prompt in
a follow-up commit. Catalog rows for that plugin keep falling back to
the raw English fields per #3010's reader change, so nothing breaks.

Tracking: closes the image-templates row in #3028.

* feat(plugins): translate video-templates bucket title/description across 17 locales

49 of 50 video-templates plugins get title_i18n + description_i18n
filled for the 17 non-English landing locales. Generated via Claude
Sonnet 4.5 over OpenRouter, ~$1.47 in API spend, 177s wall-clock.
HyperFrames templates, the Three Kingdoms cinematic series, the
Seedance/short-film prompts, and the K-pop / wuxia / anime variants
all get a 1-2 sentence catalog blurb in each locale; brand and
cultural tokens (Black Myth: Wukong, Hanfu, Showa, Pokémon, Three
Kingdoms / 三国志, Lego, Disney, K-pop, HyperFrames) stay literal.

Skipped: `live-action-anime-adaptation-water-vs-thunder-breathing-duel`
returned malformed JSON on three retries; will rerun in followup.
Falls back to the raw English fields per the reader landed in 7ddfe36.

Tracking: closes the video-templates row in #3028.

* feat(plugins): translate examples bucket (117/140) title/description across 17 locales

117 of 140 examples plugins get title_i18n + description_i18n filled
for the 17 non-English landing locales. Generated via Claude Sonnet
4.5 over OpenRouter, $3.94 in API spend, ~13 min wall-clock at
8-way concurrency. Existing zh-CN translations on 24 manifests are
preserved (the merge keeps author-supplied entries and only adds
missing locales).

23 of 140 returned malformed JSON on three retries — the output
likely hit the 4000 max_tokens ceiling on plugins whose description
balloons across 17 locales. Those manifests fall back to English on
non-`en` rendering per the reader landed in 7ddfe36, and will rerun
in a follow-up commit with a larger token budget and a stricter
output schema.

Tracking: closes 117/140 of the examples row in #3028; the remaining
23 stay open in that issue's failure list.

* feat(plugins): translate design-systems bucket (141/142) title/description across 17 locales

141 of 142 design-systems plugins get title_i18n + description_i18n
filled for the 17 non-English landing locales. Generated via Claude
Sonnet 4.5 over OpenRouter, $2.55 in API spend, 301s wall-clock at
8-way concurrency.

Translator script gained two improvements between examples and this
bucket:
- max_tokens bumped from 4000 to 8000 so 17-locale outputs stop
  truncating on the long-tail manifests with verbose descriptions
- a balanced-brace JSON extractor that pulls the outermost `{ ... }`
  from the response, tolerating trailing prose Claude occasionally
  appends after the JSON object.

Result: only 1 manifest (`totality-festival`) failed parse this
batch, down from ~16% on the examples bucket. The next commit
re-runs the prior buckets' failures with the improved script.

Tracking: closes 141/142 of the design-systems row in #3028.

* fix(plugins): backfill 4 plugins that retried green after JSON extractor improvement

dcf-valuation, social-media-dashboard, wireframe-sketch (examples
bucket) and live-action-anime-adaptation-water-vs-thunder-breathing-duel
(video-templates bucket) parse cleanly under the balanced-brace
extractor introduced for the design-systems batch. The remaining
22 failures from the prior runs hit a different parse mode (Claude
emitting unescaped double quotes inside string values when the source
description contains its own English quotes like 'make it professional');
those will need a tighter prompt and rerun.

* fix(plugins): translate the last 22 plugins with quote-handling prompt fix

The 22 stuck plugins all carried English / Chinese double-quoted
phrases inside their description (\"make it professional\",
\"What's inside\", \"电子杂志 × 电子墨水\") that Claude was emitting
back inside JSON string values without escaping, breaking the parse.

Added one rule to the translator prompt — never use a straight double
quote inside a translated string, prefer single quotes / curly quotes
/ CJK 『 』 / 《 》 — and the previously stuck batch sailed through
clean: 22/22 ok, 0 retries, $0.85.

This closes the long tail of #3028:
- scenarios   11/11   ✓
- image-templates 45/45 ✓
- video-templates 50/50 ✓
- examples    140/140  ✓
- design-systems 142/142 ✓
- atoms       N/A (filtered from public catalog)

All 388 catalog-visible plugins now ship title_i18n + description_i18n
for all 17 non-English locales the landing page serves.

* fix(plugins): clean up four review-flagged i18n data issues

- apps/landing-page/app/_lib/plugins-i18n.ts:759 — Polish bucket
  label `examples: 'Przyklad'` was missing the diacritic; every
  other Polish string in the same block uses proper diacritics.
  Restore to 'Przykład'. (Reviewer: looper #4364985878.)

- video-templates/cinematic-route-navigation-guide — German
  title_i18n.de was a byte-for-byte copy of en ("Cinematic Route
  Navigation Guide") while the German description was already
  translated. Replace with "Cinematischer Routen-Navigationsleitfaden"
  to match the German voice the description sets.

- video-templates/hollywood-haute-couture-fantasy-video-prompt —
  Dutch title_i18n.nl was identical to en for the same reason.
  Translate the trailing noun phrase: "Hollywood Haute Couture
  Fantasy Videoprompt" (mirrors the Dutch description's compound
  word style).

- video-templates/video-seedance-three-kingdoms-guanyu-slaying-yanliang —
  Korean Hangul `돌진` had leaked into the Turkish description (a
  translation-pipeline artifact where the model copied the verb
  from the Korean output without translating it). Replace
  "saflarına돌진 eder" with the idiomatic Turkish "saflarına dalar".

All four are data-only fixes against existing manifests; no schema
changes, no reader changes. typecheck stays at 0 errors.

* fix(landing-page): localize aria-labels, alt text and BreadcrumbList JSON-LD on plugin detail page

The PR's prior rounds left six accessibility / structured-data
surfaces on `/{locale}/plugins/<slug>/` either entirely English or
mixing English chrome with the localized plugin title. Reviewer
flagged each one across multiple loops; this commit clears them all:

1. `aria-label` on the open-in-new-tab popout no longer reuses the
   visible label `pcopy.detailOpenInNewTab` (which carries the
   decorative `↗`). Added `detailOpenInNewTabAria` — same wording,
   no glyph — and the `<a aria-label>` consumes that key. The
   visible link text still ends in `↗`.

2. `<nav class="breadcrumb" aria-label="Breadcrumb">` now reads
   `aria-label={pcopy.breadcrumbLabel}`. Eighteen locales filled
   ("面包屑导航", "パンくずリスト", "Brotkrumen-Navigation",
   "Fil d'Ariane", "مسار التنقل", "İçerik haritası", ...).

3. Share-dialog `<button aria-label="Close">` now reads
   `aria-label={pcopy.shareDialogClose}`. Eighteen locales filled
   ("关闭", "閉じる", "Cerrar", "Закрыть", "إغلاق", ...).

4. Three template-literal a11y strings (`${pluginTitle} preview`,
   `Open interactive preview for ${pluginTitle}`, `${pluginTitle}
   interactive preview`) become function calls
   (`pcopy.previewImageAlt(t)`, `previewSummaryAria(t)`,
   `previewIframeTitle(t)`) so the sentence frame around the
   plugin title rotates with the page locale. Two `<img alt>` call
   sites (the static preview at line 210 and the click-to-expand
   thumbnail at line 179) both consume `previewImageAlt`.

5. `BreadcrumbList` JSON-LD position-2 now reads
   `name: pcopy.hubLabel` instead of hardcoded English `"Plugins"`.
   The visible breadcrumb at line 105 already renders
   `pcopy.hubLabel`; this aligns the structured data with the
   rendered chrome on every locale.

The new function-typed keys deliberately interpolate `pluginTitle`
(which is itself locale-resolved via `resolveBundledTitle`) so the
mixed-language guard from commit 002d457 is preserved: a manifest
without a per-locale title still flows through to a coherent
single-language a11y string because `pluginTitle` falls back to
English along with the rest of the section.

apps/landing-page typecheck stays at 0 errors.

Closes reviewer threads:
- #pullrequestreview-4364985878 (Open in new tab aria)
- #pullrequestreview-4368926224 (Polish typo + plus mixed-language alt/aria)
- #4373... (BreadcrumbList JSON-LD)
- #4374... (aria-label="Close" + aria-label="Breadcrumb")

* fix(landing-page): redirect legacy fa/hu/th /plugins/ paths to canonical

When the new `/{locale}/plugins/...` short-code wrappers landed, the
legacy catch-all `pages/[locale]/[...path].astro` dropped `'plugins'`
from its `paths` list. That intentionally avoids serving stale
marketplace-registry placeholder routes for the modern landing
locales — but it also takes `/fa/plugins/`, `/hu/plugins/`, and
`/th/plugins/` from 200 to 404, because those three legacy locales
live only in the old `_lib/i18n.ts:LOCALES` set and are not part of
`LANDING_LOCALES` (the modern 18-locale list the new wrappers serve).

Three `301`s in `_redirects` send those legacy URLs to the canonical
English `/plugins/...` so SEO and inbound links keep working until
the legacy locale set is retired entirely.

Reviewer thread (#pullrequestreview-4364052045) flagged this as a
non-blocking regression across multiple loops; this commit closes it.

* ci(landing-page): add merge_group trigger so the queue can clear PRs

`landing-page-ci.yml` only fired on `pull_request` and `push:main`,
which meant the required `Validate landing page` and
`Strict PR visual tests` checks never dispatched against the
`merge_group` ref the merge queue creates. The queue then sat at
"awaiting checks" until it timed out and ejected the PR (the
deadlock observed during the 5/26 release window).

Adding a `merge_group: { types: [checks_requested] }` trigger to
the same workflow lets the queued ref reuse the existing job graph,
matching the pattern in `ci.yml` which already wires `merge_group`.

Also drops `plugins/**` into the same paths filter as `pull_request`
since the new bundled-plugins reader (commit 7ddfe364) consumes
those manifests' `title_i18n` / `description_i18n` maps and the
landing-page CI must rerun when manifest data changes.

---------

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-27 09:30:59 +00:00
lefarcen
995601de9c
ci: make agent exploration finalize promptly (avoid inactivity abort) (#3100)
Real Codex runs (#3060) explored correctly — verifying 3-4 UI cases with
DOM evidence — but Codex over-planned (6 steps), executed the high-value
ones, then went silent chasing a remaining planned step and tripped
expect-cli's ~180s no-output watchdog, aborting the turn before it emitted
a final report. The run then fell back to an advisory artifact, so the
real findings never reached report.md.

Tighten the prompt so Codex finishes and submits before going idle:
- cap at 3 cases (was 6) and target 2-3, quality over breadth;
- add a CRITICAL instruction stating the runner aborts with no report
  after ~3 min of no output, so Codex must stop after 2-3 cases and emit
  the complete report in one final turn rather than leaving planned steps
  pending or retrying silently.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:15:06 +08:00
lefarcen
fed464509b
ci: drive agent PR exploration with the Codex ACP backend (#3086)
expect-cli defaults to the Claude Code ACP provider, which is not
installed on the self-hosted runner, so the exploration step errored
(AcpProviderNotInstalledError) and fell back to a reachability-only smoke
trace instead of real UI exploration.

Pass `-a codex` to expect-cli so it drives the Codex agent (installed on
the runner, authenticated via CODEX_HOME). Configurable via
OD_EXPECT_AGENT (set to empty to use expect-cli's default). When the
agent is unavailable the existing smoke-trace fallback still applies, so
this is safe even before Codex is authenticated.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 15:05:03 +08:00
lefarcen
114be63a4e
ci: route agent sandbox installs through the China npm mirror (#3084)
After fixing source acquisition (#3078), the #3060 validation run reached
the container and got through most of `pnpm install`, then failed building
the better-sqlite3 native module: prebuild-install could not reach github
releases and the node-gyp fallback could not fetch node headers from
nodejs.org (ECONNRESET). The electron postinstall hits the same blocked
hosts, and package tarballs from npmjs were throttled to ~20 KB/s.

The runner's network to npmjs / nodejs.org / github releases is throttled
or reset by GFW; the China npm mirror (npmmirror.com) is fast and complete
(verified from the runner: registry ~2.4 MB/s, node headers ~3.6 MB/s,
better-sqlite3 prebuilt present). Point the in-container install at it via
registry + disturl (node-gyp headers) + electron / electron-builder /
better-sqlite3 binary mirrors + Playwright download host.

Package integrity is still verified against the lockfile, so the mirror
only changes transport. Once a native module builds, pnpm's side-effects
cache in the persistent store keeps it warm for later runs.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:30:59 +08:00
lefarcen
1ac3da130f
ci: add skip_comment dry-run input to agent PR exploration (#3080)
Add a `skip_comment` workflow_dispatch input (default false). When set,
the "Comment exploration report" step is skipped, so a validation/dry
run can exercise the full pipeline and produce the report artifact
without posting a public comment on the target PR (useful when testing
against an external contributor's PR). The report is still uploaded as
an artifact for review.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 05:51:48 +00:00
lefarcen
12141648e4
ci: fetch agent sandbox PR source on the host over SSH via a local mirror (#3078)
The sandbox checked out PR code with `git fetch https://github.com/...`
*inside* the container. The self-hosted runner's bandwidth to github.com
is throttled across every transport (HTTPS/SSH/codeload/API, all
~30-90 KB/s) and the HTTPS handshake is frequently RST'd, so a
from-scratch fetch of this ~200MB repo is impractical and unreliable per
run (run 26491460889 failed here with repeated GnuTLS resets).

Move source acquisition to the trusted host and make it incremental:

- Keep a persistent bare mirror of the base repo
  ($HOME/.cache/agent-pr-explore/open-design.git, overridable via
  OD_SANDBOX_REPO_MIRROR). Each run fetches only the PR's delta via
  `refs/pull/<n>/head` over SSH -- the one transport GFW doesn't reset --
  using a read-only deploy key (OD_SANDBOX_GIT_SSH_KEY).
- Take the head from the BASE repo's pull ref so fork PRs work without
  depending on the head fork, and verify it equals the resolved HEAD_SHA.
- Check the PR head into a per-run worktree and mount it read-only into
  the container; the container copies it into a writable workdir and no
  longer needs (or has) any github access.

The deploy key stays on the trusted host and is never exposed to the
untrusted PR code. The mirror must be seeded once on the runner (the
error message prints the exact clone command if it is missing).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 05:36:13 +00:00
Alan Matias
f176b2ce5e
refactor(issue-template): separate logs and screenshots fields in bug report (#3032) 2026-05-27 04:41:28 +00:00
lefarcen
2ed93e9c5d
ci: reuse cached docker image and persist pnpm store for agent sandbox (#3074)
* ci: skip docker pull when agent sandbox image is already cached

The agent PR exploration script ran an unconditional `docker pull
"$image"` before `docker run`. Under `set -e`, a transient registry
timeout (the self-hosted runner's network to docker.io is unreliable)
aborts the whole run even when the base image (node:24-bookworm) is
already cached locally — which is what happened on run 26490782540.

Skip the pull entirely when the image is already present, and only pull
when it is missing. This avoids both the failure and the wasted pull
timeout on every run, and keeps a run's base image stable. Refreshing
the cached image is a separate, explicit operation on the runner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: persist agent sandbox pnpm store across runs

The pnpm store was placed under $RUNNER_TEMP, which the Actions runner
wipes per job, so every agent exploration re-downloaded all dependencies
from the npm registry — slow, and as fragile as the runner's docker.io
access (the same network class that already broke the docker pull).

Move the store to a persistent host path ($HOME/.cache/agent-pr-explore/
pnpm-store, overridable via OD_SANDBOX_PNPM_STORE) so a warm,
content-addressed store is reused across runs. `rm -rf "$root"` no longer
touches it since it lives outside the per-run root.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 12:49:26 +08:00
lefarcen
80639d4da4
ci: make agent PR exploration trusted checkout lightweight (#3071)
The "Checkout trusted base scripts" step did a full actions/checkout of
this large repo on the self-hosted runner. On a recent run it stalled in
the initial `git fetch --depth=1 origin <sha>` for many minutes before
the agent script ever started, and the run had to be cancelled.

The trusted host side only needs the self-contained
`.github/scripts/agent-pr-explore-sandbox.sh`; PR code is checked out
inside Docker and PR context is gathered via the API. Replace the full
checkout with a single-file fetch via `gh api` (raw), pinned to the same
trusted base/dispatch commit, which avoids the git-protocol fetch of the
whole repo entirely.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 04:18:19 +00:00
lefarcen
7b8bf0d9fb
ci: map agent trace upload to existing R2 secrets (#3013)
* ci: map agent trace upload to existing R2 secrets

* ci: make agent report comments macos-compatible

* ci: ensure Playwright browsers for agent traces
2026-05-27 03:01:36 +00:00
lefarcen
7312c64580
ci(landing): split landing deploy into staging gate + manual production (#2994)
* ci(landing): split landing deploy into staging gate + manual production

A merge to `main` previously published the landing page straight to
production (open-design.ai) via `landing-page-deploy`. There was no
buffer to review the rendered site, so a bad merge was live instantly.

Split deploys across two Cloudflare Pages projects so production is only
ever reached by an explicit human action:

- `landing-page-staging` (push to main) -> staging project
  `open-design-landing-staging` -> staging.open-design.ai.
- `landing-page-production` (manual workflow_dispatch only) -> production
  project `open-design-landing` -> open-design.ai. Only this workflow
  names the production project; gate it with required reviewers on the
  `production` GitHub environment.
- `landing-page-ci` now also deploys a per-PR preview into the staging
  project (`--branch=pr-<n>`) for same-repo branches and comments the URL.
  Fork PRs (no secrets / read-only token) skip the deploy and keep just
  the build validation. Path filters already scope this to landing edits.

Decouple search-engine indexing from staging:

- `blog-indexing-on-deploy` now triggers on `landing-page-production`
  (not every main push), so the test environment is never submitted to
  Google/IndexNow.
- It diffs from a new `blog-indexed-prod` tag (the last indexed prod
  commit) instead of `HEAD^`, and force-advances the tag after a
  successful run, so a manual promotion bundling several merged posts
  indexes all of them rather than only the last commit.

Staging and PR-preview builds drop `PUBLIC_GA_MEASUREMENT_ID` so test
traffic does not pollute the production GA property.

* ci(landing): keep staging + PR previews out of the search index

staging.open-design.ai mirrors production and is exposed via cert
transparency logs, so search engines can discover it. Indexing the
mirror competes with open-design.ai for the same content.

Emit `<meta name="robots" content="noindex, nofollow">` whenever
OD_LANDING_NOINDEX=1, and set that flag on the staging and PR-preview
builds (production leaves it unset and stays indexable). noindex is
used rather than a robots.txt Disallow so crawlers can still fetch the
page and read both the tag and the canonical, which already points at
the production origin.

* fix(landing): make staging noindex actually take effect

The previous commit read `process.env.OD_LANDING_NOINDEX` directly in
`seo-head.astro`, but `.astro` frontmatter is transformed by Vite and
does not see process.env, so the meta never rendered. Two fixes:

- Inject the flag as the compile-time constant `__OD_LANDING_NOINDEX__`
  via `vite.define` in astro.config.ts (config runs in Node and can read
  process.env); SeoHead consumes that constant.
- The homepage (`index.astro`) and `og.astro` build their own <head> and
  never use SeoHead, so a per-component meta can miss pages. Add an
  `astro:build:done` integration that appends a catch-all
  `/*  X-Robots-Tag: noindex, nofollow` to the Cloudflare Pages `_headers`
  on staging/preview builds, covering every response (homepage, assets,
  any custom-head page) at the HTTP layer. Production builds leave
  `_headers` untouched.

Verified: build with OD_LANDING_NOINDEX=1 emits the _headers block and
the SeoHead <meta>; build without the flag emits neither; astro check
clean.

* fix(landing): address review — pin prod checkout to main, defer index pointer

Two blockers from review:

- landing-page-production: workflow_dispatch can be launched from any ref
  via the Actions "Use workflow from" dropdown, so an operator could ship
  an arbitrary branch to open-design.ai. Pin the checkout to `ref: main`
  so the deployed artifact always equals reviewed main.

- blog-indexing-on-deploy: the `blog-indexed-prod` pointer was advanced
  right after sitemap submission, before Inspect / Search Analytics /
  Render status / Open status PR. A failure in any of those still moved
  the pointer, so the next production run skipped those posts. Move the
  advance to the very end, gated on `success()`, so a failure leaves the
  tag in place and the range is re-processed next run (submissions are
  idempotent).

* fix(landing): gate production promotion to the main ref only

Follow-up to the production-path review note: pinning checkout to main
fixed the deployed content, but the workflow was still dispatchable from
any ref, which records a non-main production run and would dodge
blog-indexing's `workflow_run` `branches: [main]` filter. Gate the whole
job on `github.ref == 'refs/heads/main'` so a dispatch from any other
branch/tag is skipped outright.
2026-05-26 14:05:04 +00:00
lefarcen
a0ea9bdaf3
ci: make agent PR exploration manual only (#2993)
* Make agent PR explore manual dispatch only

* chore: retrigger PR checks

* chore: retrigger CI after Actions recovery
2026-05-26 12:59:58 +00:00
lefarcen
b5bf28060b
Add sandboxed agent PR exploration (#2604) 2026-05-26 07:52:42 +00:00
Marc Chan
d5659d82d4
chore(nix): streamline pnpm deps hash maintenance (#2919)
* chore(nix): streamline pnpm deps hash maintenance

Generated-By: looper 0.9.0 (runner=worker, agent=opencode)

* fix(ci): satisfy actionlint in nix hash autofix

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* fix(ci): allow nix hash autofix on fork PRs

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* fix(ci): follow up nix hash review

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)

* fix(ci): tolerate nix hash bot token failures

Generated-By: looper 0.9.0 (runner=fixer, agent=opencode)
2026-05-26 07:35:38 +00:00
Marc Chan
125dcd0174
fix(ci): run fork visual reports from trusted code (#2935)
* fix: run fork visual reports from trusted code

* fix: auto-approve strict web visual capture

* fix: address visual report review feedback

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: propagate visual report storage failures

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: validate PR screenshots before upload

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: validate visual PR identity before comment

* fix: harden fork visual report validation

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: address remaining fork visual report review feedback

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: handle stale fork visual report lookup

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)

* fix: allow stale fork visual report fallback

Generated-By: looper 0.9.1 (runner=fixer, agent=opencode)
2026-05-26 06:17:04 +00:00
PerishFire
bfcafc81fd
feat(pack): add Windows portable zip target alongside NSIS installer (#2937)
Adds a new `--to zip` (and `--to all`) tools-pack Windows build target that
produces a portable `.zip` from the cached `win-unpacked` tree using the
bundled 7z. The zip lays files at the archive root so users can extract it
anywhere and launch `Open Design.exe` without going through the NSIS
installer, addressing the no-install download request.

Release plumbing is updated to publish the portable zip and its sha256
beside the existing installer on R2 for beta, preview, and stable channels
(default on, gated by `WINDOWS_INCLUDE_ZIP`/`WIN_INCLUDE_ZIP`). The
electron-updater `latest.yml` feed continues to point only at the
installer; the zip is a manual-download convenience and is intentionally
excluded from the in-app updater.

Closes #1121

Generated-By: looper 0.0.0-dev (runner=worker, agent=claude-code)

Co-authored-by: libertecode <libertecode@proton.me>
2026-05-26 06:14:44 +00:00
Jane
40ae0836dd
feat(landing-page): rebuild plugins library to mirror in-app taxonomy (#2926)
* feat(landing-page): synthesize fallback preview cards for instruction skills

The skill catalog renders a diagonal-stripe placeholder for any skill
without a runnable example.html, which leaves ~70% of /skills/ as a
field of bare grey thumbs (instruction skills like copywriting,
creative-director, color-expert, brainstorming have no static demo
because their output depends on the agent's input).

Synthesize a typographic editorial card from each SKILL.md frontmatter
and screenshot it through the same Playwright pipeline that handles
real demos, so every catalog row carries a thumbnail. Cards include:

  - OPEN DESIGN · SKILL top label + Nº NNN index (1..96 over the
    instruction subset, sorted by od.featured then alphabetical)
  - Big Playfair Display slug with a coral dot accent
  - Italic serif description clamped to 3 lines
  - mode/category chips + "Curated from <author>" attribution
  - Warm-paper background with a subtle 135° stripe to thread the
    landing's existing visual language

Bundle a few related improvements caught while building this:

  - SkillRecord gains a `kind: 'instruction' | 'template'` field so
    the detail page can render differently per kind (instruction
    skills now render the SKILL.md body inline as "About this skill",
    template skills keep the click-to-expand iframe demo).
  - Catalog row thumbnails switch from the bespoke IntersectionObserver
    pipeline to native `loading="lazy"` (with eager + fetchpriority=high
    on the first 3). The observer's swap latency stranded mid-list
    rows on the SVG placeholder during fast scrolls; native lazy uses
    the browser's 1250-3000px lookahead so the placeholder flash is
    gone.
  - precise-lazyload rootMargin bumped to 1500px for any remaining
    data-precise-src callers.
  - CI cache key for generated previews now folds in
    fallback-preview-card.ts so a template tweak invalidates the cache.

* feat(landing-page): rebuild plugins library to mirror in-app taxonomy

The marketing site's `/skills/`, `/templates/`, `/systems/`, `/craft/`
top-level entries were organized around author-supplied `od.mode` /
`od.scenario` taxonomies that visitors never see inside Open Design
itself. The in-app Plugins home (`apps/web/src/components/plugins-home/`)
groups every bundled plugin by the artifact it produces — Prototype,
Live Artifact, Slides, Image, Video, HyperFrames, Audio — and that's
the language users encounter the moment they open the product.

This PR rebuilds the public library around the same taxonomy and the
same data source so a visitor reading "Templates · 231" on the
marketing site sees the same 231 inside the app.

## What changes

- New top-level `/plugins/` hub: four tiles (Templates, Skills,
  Systems, Craft) with live counts pulled straight from
  `plugins/_official/<bucket>/<slug>/open-design.json` — the daemon's
  bundled-plugin registry.
- `/plugins/templates/` lists every bundled plugin that lands in one
  of the seven artifact kinds. Seven sub-routes
  (`/plugins/templates/prototype/`, `/deck/`, `/image/`, `/video/`,
  `/hyperframes/`, `/audio/`, `/live-artifact/`) carry the same chip
  rail with an active state, so visitors can switch artifact kinds
  with one click without losing the rail.
- Each artifact-kind sub-route shows a Scene chip rail when the kind
  has scene buckets (Prototype / Slides / Image / Video each get
  five-six). The Scene filter runs client-side via inline `style.display`
  toggles; URLs stay one-per-kind so we don't multiply 25 × 18 locales
  worth of static pages just for filter combinations.
- `/plugins/skills/` collects the instruction-only entries (mode
  doesn't fit any of the seven kinds) — copywriting, color theory,
  creative direction, brainstorming, etc.
- `/plugins/systems/` lists the 150 bundled design systems via the
  legacy SystemCard renderer (palette swatches, tagline) so the
  visual treatment matches the in-product library.
- `/plugins/craft/` keeps the existing craft principles list.
- `/plugins/<manifest-id>/` detail pages built from manifest metadata:
  hero (poster image or playable Cloudflare Stream MP4 for video
  templates), author / mode / scenario / tags, GitHub source link.
  Author URLs pointing at the `nexu-io` org redirect to the
  `nexu-io/open-design` repo so the attribution is actionable.
- Header dropdown labelled "Plugins" with the four sub-routes; footer
  Library column updated to match.
- Old marketplace registry pages under `/plugins/` and
  `/[locale]/plugins/` removed (they were a dormant placeholder UI;
  the actual manifests it tried to load lived nowhere). The rest of
  the legacy plugin-registry loader stays intact for any other
  consumer.

## Preview generation

Bundled plugins ship `od.preview.poster` URLs on R2 for image and
video templates; those are used directly. The other 293 entries
(html-mode examples, design-systems, scenarios) had no poster, so
`generate-previews.ts` was extended to:

1. Screenshot a local `example.html` referenced by `od.preview.entry`
   when present (134 examples).
2. Synthesize the same typographic editorial card the SKILL.md
   fallback uses, sourced from manifest title / description / mode /
   author (159 systems / scenarios / misc).

Output lands at `public/previews/plugins/<manifest-id>.png`. The
catalog loader checks for the local file when the manifest carries no
poster URL, so the row's `<img src>` always has something to point at.

Result: every catalog row and every detail page has a thumbnail;
visiting `/plugins/templates/video/` shows the same 48 entries the
in-app Plugins home shows, hyperframes the same 13, etc.

## Counts

- Templates: 231 (Prototype 59 + Slides 59 + Image 46 + Video 48 +
  HyperFrames 13 + Audio 1 + Live Artifact 5)
- Skills: 15
- Systems: 150
- Craft: 11

Atoms (13 infrastructure plugins, `od.kind === 'atom'`) are filtered
to mirror the in-app behaviour.

* fix(landing-page): use Astro 6 render() helper for SKILL.md body

Astro 6 dropped `entry.render()` in favour of a top-level `render(entry)`
helper imported from `astro:content`. The instruction-kind skill detail
page was still using the legacy method, which compiled locally on Astro
6 only because tsx ignored the missing prototype method, but `astro
check` (run in CI) flagged it as ts(2551) and broke the workflow.

---------

Co-authored-by: Joey-nexu <joeylee12629@gmail.com>
2026-05-26 02:49:58 +00:00
lefarcen
b4f700540f
ci: add agent explore workflow placeholder (#2830) 2026-05-24 20:22:51 +08:00
lefarcen
a37d11fe72
Merge pull request #2461 from nexu-io/release/v0.8.0
release: Open Design 0.8.0 — Everything is a plugin. Headless. Plugins create plugins.
2026-05-23 12:38:36 +08:00
lefarcen
c14baf07d3 Merge origin/main into release/v0.8.0
PR #2461 sync prep — resolves 14 conflicts merging 84 main-side commits
on top of 58 release-side commits accumulated during the 0.8.0 cycle.

Resolution summary:

Take main (theirs) where main carried deliberate forward progress:
- apps/web/src/components/PluginCard.tsx — 7 hunks, i18n migration:
  hardcoded English aria-labels/titles replaced with t() calls keyed
  on pluginCard.* (all 8 keys verified present in en.ts).
- apps/web/src/components/TasksView.tsx — 1 hunk, source-ingestion
  feature: sortedRoutines (newest-first), sourceIngestionTemplates,
  patchSourceForm, submitSourceIngestion. activeCount/pausedCount
  semantics preserved (now keyed on sortedRoutines, count unchanged).
- e2e/ui/app.test.ts — new node:fs/promises + tmpdir + path + @/timeouts
  imports needed by main-side test helpers.
- e2e/ui/settings-local-cli-codex-fallback.test.ts — menu-dismissal
  helper block added by main.

Keep both sides where each added a different field to the same object
literal:
- apps/web/src/components/ProjectView.tsx (locale + analyticsHints
  spread).
- apps/web/src/components/DesignSystemFlow.tsx (locale + analyticsHints).

Take release (ours) where release carried deliberate work that ships
0.8.0:
- CHANGELOG.md — release-side 0.8.0 entry + PR link refs; main's
  Unreleased section was the same body of work, now finalized.
- apps/landing-page/public/{apple-touch-icon,favicon}.png +
  apps/web/public/app-icon.svg — release-side visual refresh assets
  consistent with 0.8.0 stable ship.
- tools/pack/src/linux.ts — packageVersion const required by line 466;
  taking main's empty line would build-error.
- e2e/ui/project-management-flows.test.ts +
  e2e/ui/settings-api-protocol.test.ts +
  e2e/ui/settings-memory-routines.test.ts — release-side release-smoke
  hardening (shangxinyu1 + PerishFire) takes precedence on overlap.

Closes-issue / unblocks: PR #2461 sync release/v0.8.0 → main.
2026-05-23 12:17:18 +08:00
Marc Chan
135c6b42d8
fix(ci): lint workflow changes with actionlint (#2742)
* fix(ci): lint workflow changes with actionlint

* fix(ci): lint workflow changes with actionlint

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): lint workflow changes with actionlint

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): lint workflow changes with actionlint

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
2026-05-23 12:12:55 +08:00
Marc Chan
866661ac65
fix(ci): run merge queue checks on merge groups (#2745) 2026-05-23 11:59:30 +08:00
Marc Chan
6592d638ce
ci: gate fork PR workflow auto-approval (#2683)
* ci: gate fork PR workflow auto-approval

* ci: rename fork PR approval workflow

* ci: normalize fork workflow paths

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): match action_required workflow runs

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): denylist tool config paths

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): retry action_required workflow lookup

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): restrict fork workflow approvals to target PR

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): keep polling fork workflow approvals

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): revalidate fork workflow approvals before approving

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): poll longer for first fork approval run

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): make fork approval poll budget configurable

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): drop stale fork approval runs

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): deny dotted tsconfig variants in fork approvals

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): run fork approval regression in guard

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): refresh Nix pnpm deps hash

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* test(web): mock useI18n in reattach restore test

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): accept status-only fork approvals

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): rerun fork approval on retarget

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): ignore base tip churn in PR association

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): broaden pending approval run fetch

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): skip non-retarget fork approval edits

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): checkout visual comment workflow head

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): paginate workflow approval run lookup

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): harden fork workflow follow-ups

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): honor full post-appearance settling window

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): validate manual visual comment checkout

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
2026-05-23 11:48:36 +08:00
Marc Chan
1c7233ef10
fix(landing-page): speed up landing-page CI builds (#2734)
* fix(landing-page): speed up landing-page CI builds

* fix(landing-page): disable dev-only landing caches

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(landing-page): reuse previews across CI runs

* fix(landing-page): hash shared preview dependencies

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(landing-page): skip missing preview html reads

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(landing-page): rerun previews on cache hits

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): repair landing-page workflow cache keys
2026-05-23 00:30:31 +08:00
Marc Chan
a5b47c5f76
fix(ci): narrow workflow scope and reuse setup steps (#2708)
* fix(ci): narrow workflow scope and reuse setup steps

* fix(ci): narrow workflow scope and reuse setup steps

Repair Nix fixed-output hashes for the filtered daemon and web source trees.

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): narrow workflow scope and reuse setup steps

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): narrow workflow scope and reuse setup steps

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)

* fix(ci): repair daemon and nix checks

Generated-By: looper 0.0.0-dev (runner=fixer, agent=opencode)
2026-05-22 18:58:53 +08:00
ashleyashli
558fedd207
fix(landing): wire GA4 rollout config (#2615)
Co-authored-by: ashley li <ashleyli@ashleydeMacBook-Air-2.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-22 14:56:58 +08:00
PerishFire
1ef865dd31
fix: defer Windows packaged updater installer (#2575)
* fix: tighten packaged updater flow

* test: prune noisy extended ui coverage

* fix: hide unpublished release artifacts

* test: validate release updater channels

* fix: align prerelease release namespaces

* fix: align packaged updater validation
2026-05-21 19:13:18 +08:00
PerishFire
526c7f7c26
Fix packaged auto-update release validation (#2565)
* fix: tighten packaged updater flow

* test: prune noisy extended ui coverage

* fix: hide unpublished release artifacts

* test: validate release updater channels

* fix: align prerelease release namespaces
2026-05-21 18:15:53 +08:00
Marc Chan
10192dcc52
fix(ci): catch nix hash drift before merge (#2530)
* fix(ci): catch nix hash drift before merge

* fix(nix): add pnpm hash refresh helper

* chore(nix): drop redundant hash alias

* fix(nix): raise update-hash output buffer

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(nix): handle current pnpm deps hash

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)

* fix(nix): reject non-mismatch hash updates

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)
2026-05-21 16:08:13 +08:00
lefarcen
50a4dc8a62 Merge origin/main into release/v0.8.0 2026-05-21 13:17:52 +08:00
Marc Chan
23d28682da
fix(ci): fall back to manifest PR number for visual comments (#2506)
* fix(ci): fall back to manifest PR number for visual comments

* fix(ci): restore authoritative visual PR lookup

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)
2026-05-21 12:06:41 +08:00
lefarcen
ebf4a3ffca
feat(release): upload browser sourcemaps to PostHog for packaged builds (#2508)
* i18n: add translations for media provider coming soon section (#2415)

* i18n: add translations for media provider coming soon section

- Add 'settings.mediaProviderComingSoonHint' key to all 19 locales
- Replace hardcoded English strings in SettingsDialog.tsx with i18n keys
- Reuse existing 'tasks.comingSoon' and 'settings.agentInstall.docs' keys
- Resolves TODO(i18n) comment at line 5091

* fix: escape single quotes in translation strings

* fix: escape all single quotes in English translation string

* feat(release): upload browser sourcemaps to PostHog for packaged builds

Next.js was emitting minified JS with no browser sourcemaps, so PostHog
Error Tracking surfaces frames like fO / fz / s4 / tD instead of real
file:line locations. This wires up the full pipeline:

- apps/web/next.config.ts: enable productionBrowserSourceMaps so next build
  emits .js.map alongside each chunk.
- tools/pack/src/web-sourcemaps.ts: new helper that runs after next build
  and before any packaging step copies the web output into the Electron
  resources. Uses @posthog/cli to inject chunk IDs and upload sourcemaps
  to PostHog, then ALWAYS strips every .map under .next/static so source
  never ships inside an installer (saves ~14 MB per packaged image too).
- tools/pack/src/{mac/workspace,win/app,linux}.ts: call processWebSourcemaps
  immediately after the @open-design/web build step.
- tools/pack/src/config.ts: read POSTHOG_CLI_API_KEY + POSTHOG_CLI_PROJECT_ID
  (with POSTHOG_PERSONAL_API_KEY / POSTHOG_PROJECT_ID aliases) and expose
  them on ToolPackConfig with the same shape as the existing posthogKey /
  posthogHost fields.
- .github/workflows/release-{beta,preview,stable}.yml: pass the new secrets
  through so all three release channels symbolicate stacks.

When the API key is missing (PR builds, forks, local contributor builds),
the helper logs and skips the upload — but still strips .map files. The
strip step is unconditional because shipping a sourcemap is equivalent to
shipping the source.

Adds tools/pack/tests/web-sourcemaps.test.ts covering: missing chunks dir
silently noop, no-map noop, strip-only path when credentials are absent,
recursive walker for nested subdirectories. CLI happy path is left to the
release workflow itself.

Required follow-up (cannot push from code): add a repo secret named
POSTHOG_CLI_API_KEY (the phx_ personal API key) and a repo var named
POSTHOG_CLI_PROJECT_ID (the numeric project id, 420348 for our project)
in nexu-io/open-design settings before merging.

* fix(web-sourcemaps): use management host for CLI, not ingest host

POSTHOG_HOST is the ingest URL (us.i.posthog.com) used by the runtime SDK
to POST events to /capture/. The @posthog/cli sourcemap upload talks to
the **management** API (us.posthog.com) and gets a 404 on the ingest
host. The two are not interchangeable.

Adds a separate `posthogCliHost` field on ToolPackConfig sourced from
POSTHOG_CLI_HOST (with no fallback to POSTHOG_HOST). When the env is
unset the @posthog/cli defaults to the US Cloud app host on its own,
which is correct for our project — so this PR doesn't need a new repo
variable for it.

---------

Co-authored-by: Nicholas-Xiong <2482929840@qq.com>
2026-05-21 11:48:57 +08:00
lefarcen
f5f8937421 Merge origin/main into release/v0.8.0
Conflict resolved by taking origin/main:

- apps/web/src/components/EntryNavRail.tsx  design-systems rail
  button icon name palette-filled (release-side) -> blocks (main);
  main's icon swap is part of the more recent design-systems rail
  pass.
2026-05-21 10:52:08 +08:00
Marc Chan
c45c5c9764
fix(ci): align visual selectors and nix hashes (#2471)
* fix(ci): align visual selectors and nix hashes

* fix(ci): add strict PR visual verification

* fix(ci): repair visual-home captures

Generated-By: looper 0.8.1 (runner=fixer, agent=opencode)
2026-05-21 10:45:37 +08:00
lefarcen
1cfe274a90 Merge origin/main into release/v0.8.0
Conflicts resolved by taking origin/main on all six points:

- apps/web/src/components/HomeHero.tsx:479-487  brand div removed
  (main dropped the .home-hero__brand wrapper; the release-side visual
  refresh still had it).
- apps/web/src/components/HomeHero.tsx:894-898  attach Icon size
  18 (main's update) replaces 20 from release.
- apps/web/src/components/HomeHero.tsx:913-927  submit button uses
  <Icon name="arrow-up" size={22} /> (main's component refactor)
  instead of the release-side inline SVG.
- apps/web/src/components/EntryShell.tsx:578-582  Discord Icon size
  14 (main) instead of 16 (release).
- apps/web/src/styles/home/home-hero.css  drop .home-hero__brand /
  __brand-mark / __brand-name rules — main removed both the component
  div and these CSS rules together; keeping the CSS would be dead code.
- apps/web/src/styles/home/entry-layout.css  Discord badge icon color
  #5865f2 (main, the brand color introduced by PR #2386) instead of
  release's neutral var(--text-strong).
2026-05-20 20:59:00 +08:00
PerishFire
899c9fe4d8
Support nightly and preview package identity (#2437) 2026-05-20 19:46:39 +08:00
shangxinyu1
71044bd3d6
test(e2e): harden extended coverage state assertions (#2245)
* test(e2e): harden extended coverage contracts

* docs(testing): add e2e hardening status

* fix(web): persist artifact chips after daemon runs

* ci: install playwright browsers for e2e vitest

* Fix daemon run recovery across reloads

Pin daemon-created runs to assistant messages immediately so hard reloads before the create response can reattach.

Replay terminal and active run events from the beginning on reload so restored turns keep assistant text, thinking events, produced files, and artifacts.

Fixes #2366

Fixes #2368

Fixes #2371

* test(e2e): preserve fake runtime selection across reload

* fix(web): scope daemon run recovery to daemon mode

* fix(e2e): remove duplicate delayed smoke flag

* fix(web): scope replay artifact recovery to current run

* fix(daemon): remove duplicate run-create pin
2026-05-20 16:21:01 +08:00
ashleyashli
65e760b88a
feat(seo): add GSC report opportunities (#2388)
Co-authored-by: ashley li <ashleyli@ashleydeMacBook-Air-2.local>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-20 16:19:14 +08:00