open-design/AGENTS.md
PerishFire 8c0fb8dc01
feat(tools-pr): add maintainer PR-duty workspace (#1259)
* feat(tools-pr): add maintainer PR-duty workspace

Adds `tools/pr` as the maintainer-only control plane for PR-duty work on
this repo. Thin `gh` wrapper that encodes repo-specific knowledge:
review lanes, forbidden surfaces, lane-specific checklists, validation
command derivation from touched packages.

Subcommands:
- `list` — triage open queue by lane and review-state bucket.
- `view <num>` — agent-friendly review brief for a single PR.
- `classify [num]` — emit script-level tags for one PR or the whole
  open queue; full-queue JSON output lands under `.tmp/tools-pr/classify/`
  with rate-limit telemetry per run.
- `assignment` — assigner-perspective view of PR ownership, idle time,
  and blockers (derived from existing tags; no new judgments).

Tag dictionary (13 tags) covers: bot-only-approval, needs-rebase,
forbidden-surface, unlabeled, duplicate-title, non-ascii-slug,
maintainer-edits-disabled, org-member, unresolved-changes-requested,
stale-approval, and three awaiting-* timing tags. Each rule is
expressible as one factual sentence over `gh` data + repo paths — see
`tools/pr/AGENTS.md` for the full dictionary plus precision rules.

Templates in `tools/pr/templates/*.md` are aesthetic references for
recurring maintainer comments (duplicate-title ask, awaiting-author
nudge, agent-review brief shape). `templates/examples/` holds
frozen-in-time agent-review snapshots for three PR shapes.

Infrastructure:
- `gh()` wraps `execFile` with minimum-touch retry (2 attempts at 1s + 2s
  backoff) on transient 5xx / network errors. Persistent failures still
  surface — retry is anti-jitter, not an exponential-backoff resilience
  layer.
- Heavy chunks (`reviews`, `comments`, `commits`, assignment timelines)
  use cursor-paginated `gh api graphql` via `fetchPaginatedPrList` to
  stay under GitHub's GraphQL server-side timeout. Light chunks stay on
  `gh pr list --json`.
- `fetchOrgMembers` cached per process via `gh api orgs/<owner>/members
  --paginate`.

Wiring:
- Root `package.json` adds `pnpm tools-pr` to the allowed root entry
  points.
- `scripts/postinstall.mjs` builds `tools/pr` alongside other workspace
  packages.
- `scripts/guard.ts` allowlists `tools/pr/bin/tools-pr.mjs` and
  `tools/pr/esbuild.config.mjs`, and adds `pr/` to the `tools/` top-level
  layout allowlist.
- Root `AGENTS.md` and `tools/AGENTS.md` document the new command
  surface, root-command-boundary update, and per-tool ownership.

* docs(agents): brief tools-pr in root AGENTS.md, link to tools/pr/AGENTS.md

Adds a `PR-duty tooling` section to the root AGENTS.md summarising what
`pnpm tools-pr` is, listing the four common subcommands (list / view /
classify / assignment), and pointing readers to `tools/pr/AGENTS.md` for
the full tag dictionary, operational playbook, templates, and design
rules. The section keeps root-level guidance to high-level orientation
while details stay local to the tool's own AGENTS.md.

* fix(tools-pr): drop overly broad touches-root-package.json forbidden hit

`deriveForbidden` was flagging any change to root `package.json` as a
forbidden-surface hit, but AGENTS.md §Root command boundary only forbids
specific *lifecycle* aliases (pnpm dev / test / build / daemon / preview
/ start) — tools-control-plane entrypoints like `pnpm tools-pr` are
explicitly allowed. Distinguishing "forbidden alias" from "allowed
entry" requires reading the diff content, which is `pnpm guard`'s job
rather than a path-derived classify tag.

Dogfooded on this branch's own PR (#1259), which added the `pnpm
tools-pr` script and was incorrectly flagged. Removing the hit aligns
the `forbidden-surface` tag with what tools-pr can mechanically detect
from file paths alone (apps/nextjs/, packages/shared/).

* fix(tools-pr): paginate commits fetch, recognise ready-to-merge, escape title-index separator

Three review follow-ups on #1259, all factual fixes:

- `fetchOpenPrCommits` now uses `fetchPaginatedPrList` instead of a
  one-shot `pullRequests(first: $first)` query. GitHub GraphQL caps
  connection page size at 100, so the previous implementation would
  fail at runtime when callers passed `--limit > 100`. The paginated
  path makes the commits fetch consistent with the other heavy chunks
  (reviews, comments, assignment timelines) and removes the artificial
  ceiling entirely. The `limit` parameter is dropped from
  `fetchOpenPrCommits`; the CLI `--limit` continues to bound the
  `gh pr list --json` chunks.
- `deriveStatus` in `assignment.ts` now reads `facts.reviewDecision`
  and `facts.mergeStateStatus`. When the PR is `APPROVED` with merge
  state `CLEAN` or `UNSTABLE` and carries no blockers, status renders
  as `ready to merge` instead of falling through to `in review`. The
  assignment view loses its main triage signal without this — a clean
  human-approved PR rendered identical to a REVIEW_REQUIRED one.
- `tags.ts:tagDuplicateTitle` and `tags.ts:buildContext` both
  constructed the title-index key with a literal NUL byte between
  author and title, which made the file appear as binary in `git diff`
  / review tooling. Replaced the literal byte with a Unicode escape
  sequence in source; the runtime string value is identical, the
  source stays plain text and round-trips through review tooling
  cleanly.

* fix(tools-pr): raise default --limit to 1000 to cover the live open queue

mrcfps flagged that `tools-pr list` (and `classify --all`, `assignment`)
defaults to `--limit 100`, which silently drops every PR past the first
100 in the open queue. The repo currently sits at 104 open PRs, so the
out-of-the-box run was already omitting four PRs.

Raise the default to 1000 in `list.ts`, `classify.ts`, and `assignment.ts`,
and remove the now-pointless 200 ceiling — `gh pr list --limit N` paginates
internally, so a high cap is cheap. Users can still pass `--limit <small>`
for a truncated preview. CLI help text on the three subcommands updated to
match.

* fix(web): pass designTemplates to ProjectView render helper

#955 made `designTemplates` a required Prop on ProjectView, but the test
helper added in #1244 (`renderProjectView` in
`ProjectView.api-empty-response.test.tsx`) was never updated. The two
PRs landed on main without conflicting, leaving `apps/web` typecheck red
for every PR that rebases past b5eb8c16.

Pass `designTemplates={[] as SkillSummary[]}` alongside the existing
`skills={[] as SkillSummary[]}` so the helper compiles. The component
already treats the array shape (empty included) as a no-op fallback in
the empty-response paths the test exercises.

* fix(tools-pr): correct author signal + merge inline review comments

Two correctness gaps in the awaiting-* signal pipeline surfaced during
review of the new tools-pr commands:

1. `authorSignalAt` iterated every PR commit unconditionally. On
   `maintainerCanModify=true` PRs a maintainer's follow-up push would
   advance the author timestamp, masking a stalled author response.
   Filter commits to those whose `authorLogin` matches `facts.author`,
   mirroring the same filter already applied to comments.

2. `fetchOpenPrComments` (and `fetchView`) only fetched
   `pullRequest.comments` / `gh pr view --json comments`, which is the
   issue-conversation thread. Inline review-thread replies — where
   authors and reviewers actually exchange most fix-up replies — live in
   `reviewThreads.comments` / REST `pulls/{n}/comments`. Missing them let
   `humanReviewerSignalAt` / `authorSignalAt` and the `view` brief point
   at the wrong side after someone replied inline. Extend the list-mode
   GraphQL to also sweep `reviewThreads(last: 20).comments(first: 20)`,
   and add a parallel REST inline-comments fetch in `fetchView` that
   merges into `GhView.comments`.
2026-05-11 19:17:21 +08:00

16 KiB

Directory guide

This file is the single source of truth for agents entering this repository. Read this file first; after entering apps/, packages/, tools/, or e2e/, read that layer's AGENTS.md for module-level details. Do not copy module details back into the root file; root stays focused on cross-repository boundaries, workflow, and commands.

Core documentation index

  • Product and onboarding: README.md, README.zh-CN.md, QUICKSTART.md.
  • Contribution and environment: CONTRIBUTING.md, CONTRIBUTING.zh-CN.md.
  • Architecture and protocols: docs/spec.md, docs/architecture.md, docs/skills-protocol.md, docs/agent-adapters.md, docs/modes.md.
  • Roadmap and references: docs/roadmap.md, docs/references.md, docs/code-review-guidelines.md, specs/current/maintainability-roadmap.md.
  • Directory-level agent guidance: apps/AGENTS.md, packages/AGENTS.md, tools/AGENTS.md, e2e/AGENTS.md.

Workspace directories

  • Workspace packages come from pnpm-workspace.yaml: apps/*, packages/*, tools/*, and e2e.
  • Top-level content directories: skills/ (functional skills the agent invokes mid-task — utilities, briefs, packagers; see skills/AGENTS.md), design-templates/ (rendering catalogue: decks, prototypes, image/video/audio templates; see design-templates/AGENTS.md and specs/current/skills-and-design-templates.md), design-systems/ (brand DESIGN.md files), craft/ (universal brand-agnostic craft rules a skill can opt into via od.craft.requires).
  • apps/web is the Next.js 16 App Router + React 18 web runtime; do not restore apps/nextjs.
  • apps/daemon is the local privileged daemon and od bin. It owns /api/*, agent spawning, skills, design systems, artifacts, and static serving.
  • apps/desktop is the Electron shell; it discovers the web URL through sidecar IPC.
  • apps/packaged is the thin packaged Electron runtime entry; it starts packaged sidecars and owns the od:// entry glue only.
  • packages/contracts is the pure TypeScript web/daemon app contract layer.
  • packages/sidecar-proto owns the Open Design sidecar business protocol; packages/sidecar owns the generic sidecar runtime; packages/platform owns generic OS process primitives.
  • tools/dev is the local development lifecycle control plane.
  • tools/pack is the local packaged build/start/stop/logs control plane and mac beta release artifact preparation surface.
  • tools/pr is the maintainer PR-duty control plane: a thin gh wrapper that encodes this repo's review-lane derivation, forbidden-surface flags, lane checklists, and validation-command suggestions.
  • e2e owns user-level end-to-end smoke tests and Playwright UI automation; read e2e/AGENTS.md before editing its tests or commands.

Inactive or placeholder directories

  • apps/nextjs and packages/shared have been removed; do not recreate or reference them.
  • .od/, .tmp/, Playwright reports, and agent scratch directories are local runtime data and must stay out of git.

Development workflow

Environment baseline

  • Runtime target is Node ~24 and pnpm@10.33.2; use Corepack so the pnpm version pinned in package.json is selected.
  • New project-owned entrypoints, modules, scripts, tests, reporters, and configs should default to TypeScript.
  • Residual JavaScript is limited to generated output, vendored dependencies, explicitly documented compatibility build artifacts, and the allowlist in scripts/guard.ts.

Local lifecycle

  • Use pnpm tools-dev as the only local development lifecycle entry point.
  • Do not add or restore root lifecycle aliases: pnpm dev, pnpm dev:all, pnpm daemon, pnpm preview, or pnpm start.
  • Ports are governed by tools-dev flags: --daemon-port and --web-port.
  • tools-dev exports OD_PORT for the web proxy target and OD_WEB_PORT for the web listener; do not use NEXT_PORT.

Root command boundary

  • Keep root scripts reserved for true repo-level checks and tools control-plane entrypoints: pnpm guard, pnpm typecheck, pnpm tools-dev, pnpm tools-pack, and pnpm tools-pr.
  • Do not add root aggregate pnpm build or pnpm test aliases. Build/test commands must stay package-scoped (pnpm --filter <package> ...) or tool-scoped (pnpm tools-pack ... / pnpm tools-pr ...).
  • Do not add root e2e aliases; e2e package commands and ownership rules live in e2e/AGENTS.md.

Boundary constraints

  • Tests under apps/, packages/, and tools/ live in a package/app/tool-level tests/ directory sibling to src/; keep src/ source-only and do not add new *.test.ts or *.test.tsx files under src/. Playwright UI automation belongs to e2e/ui/, not app packages.
  • App packages must not import another app's private src/ or tests/ implementation as a shared helper. In particular, apps/web/** must not import apps/daemon/src/**; web/daemon integration belongs behind HTTP APIs, packages/contracts, and app-local provider boundaries.
  • Cross-app, cross-runtime, or repository-resource consistency checks belong in e2e/tests/ when they need to observe more than one app/package boundary; promote reusable logic to a pure package instead of borrowing another app's private source.
  • Keep shared API DTOs, SSE event unions, error shapes, task shapes, and example payloads in packages/contracts; update contracts before wiring divergent web/daemon request or response shapes.
  • Keep packages/contracts pure TypeScript and free of Next.js, Express, Node filesystem/process APIs, browser APIs, SQLite, daemon internals, and sidecar control-plane dependencies.
  • Keep project-owned entrypoints, modules, scripts, tests, reporters, and configs TypeScript-first; generated dist/*.js is runtime output, and source edits belong in .ts files.
  • New .js, .mjs, or .cjs files need an explicit generated/vendor/compatibility reason and must pass pnpm guard.
  • App business logic must not know about sidecar/control-plane concepts. Keep sidecar awareness in apps/<app>/sidecar or the desktop sidecar entry wrapper.
  • Shared web/daemon app contracts belong in packages/contracts; that package must not depend on Next.js, Express, Node filesystem/process APIs, browser APIs, SQLite, daemon internals, or the sidecar control-plane protocol.
  • Sidecar process stamps must have exactly five fields: app, mode, namespace, ipc, and source.
  • Orchestration layers (tools-dev, tools-pack, packaged launchers) must call package primitives; do not hand-build --od-stamp-* args or process-scan regexes.
  • Packaged runtime paths must be namespace-scoped and independent from daemon/web ports; ports are transient transport details only.
  • Default runtime files live under <project-root>/.tmp/<source>/<namespace>/...; POSIX IPC sockets are fixed at /tmp/open-design/ipc/<namespace>/<app>.sock.

Git commit policy

  • Git commits must not include Co-authored-by trailers or any other co-author metadata.

Code review guide

  • Use docs/code-review-guidelines.md as the repository-wide review standard. That document is the operational guide; this AGENTS.md is the source of truth when the two disagree.
  • Walk reviews top-down through docs/code-review-guidelines.md: Product relevance test → forbidden surfaces → ownership/scope → matching lane → checklist → comments → approval bar.
  • Pick the matching review lane: default code/tests, contract and protocol changes, design-system additions, skill additions, or craft additions.
  • Before reviewing changes under apps/, packages/, tools/, or e2e/, read that directory's AGENTS.md and apply its local boundaries.
  • Blocking review feedback should focus on correctness, security/secrets, data integrity, repository boundary violations, contract/migration breakage, missing required validation, or high-risk maintainability issues.
  • Only maintainers may close a PR instead of requesting changes, and only when the change is not salvageable on the existing branch (wrong target product, foreign test harness, DOM/API assumptions absent from this repo, or scripts that conflict with lifecycle rules).

PR-duty tooling

pnpm tools-pr is the maintainer-only control plane for PR-duty work on this repo. It is a thin gh wrapper that encodes repo-specific knowledge — review-lane derivation, forbidden-surface flags, per-lane checklists, validation-command suggestions, and a fixed dictionary of factual classify tags (bot-only-approval, needs-rebase, stale-approval, unresolved-changes-requested, awaiting-* timing, org-member, etc.). The tool is read-only on the PR surface: it never approves, merges, comments, or closes; those side effects stay in explicit gh invocations the maintainer runs.

Common subcommands:

  • pnpm tools-pr list — triage the open queue by lane and review-state bucket.
  • pnpm tools-pr view <num> — factual review brief for a single PR.
  • pnpm tools-pr classify --all — script-level tag JSON for the whole open queue (entry point for cron / digest consumers); per-PR classify <num> for spot checks.
  • pnpm tools-pr assignment — assigner-perspective ownership + idle-time / blocker view across the queue.

For the full tag dictionary, operational playbook (direct merge / duplicate-title / awaiting-author / org-member / agent-review flows), comment templates, language-detection rules, and tool-design constraints (precision boundaries, factual-output rule, retry + pagination strategy), see tools/pr/AGENTS.md.

Validation strategy

  • After package, workspace, or command-entry changes, run pnpm install so workspace links and generated dist entries stay fresh.
  • Before marking regular work ready, run at least pnpm guard and pnpm typecheck, plus the package-scoped tests/builds that match the files changed. Do not use or add root pnpm test/pnpm build aliases.
  • For local web runtime loops, prefer pnpm tools-dev run web --daemon-port <port> --web-port <port>.
  • On a GUI-capable machine, validate desktop by running pnpm tools-dev, then pnpm tools-dev inspect desktop status.
  • Stamp/namespace changes must validate two concurrent namespaces and run desktop inspect eval plus inspect screenshot for each namespace.
  • Path/log changes must run pnpm tools-dev logs --namespace <name> --json and confirm log paths are under .tmp/tools-dev/<namespace>/....

Bug follow-up workflow

The following is a working playbook for routine bug follow-ups, distilled from recent practice. Treat it as a default action shape, not a contract — production reality always has edges these bullets can't anticipate, so use judgment when the situation doesn't fit cleanly.

  • Lead with a red spec. Default to encoding the bug as a falsifiable test that goes red before any source change, so the fix is anchored in observable behavior rather than source-code intuition. If a red spec can't be written cheaply, that's usually a signal to clarify scope rather than push forward on a guess.
  • Try the cheapest layer first. Reach for the lightest test layer that can still see the symptom (e2e Vitest at the daemon HTTP boundary → app-local Vitest → Playwright UI → platform-native harnesses), and drop down only when the cheaper layer can't.
  • Hold the spec's scope. Defects discovered outside the bug's described boundary belong in a follow-up — their own red spec, their own PR — not in this fix. List them in the PR body's "Adjacent issues" section with the rationale and move on.
  • Let the fix read as an invariant. Prefer a named helper whose docblock describes what must hold over a bolt-on if guard with apologetic history-comments. The call site should read as intent.
  • Diff against the baseline. When neighboring suites have pre-existing failures, stash or check out upstream before claiming "no new failures."
  • Link the issue from the PR body. Use Fixes #N / Closes #N / Resolves #N so the issue auto-closes on merge and the release-time reverse lookup (gh issue view N --json closedByPullRequestsReferencesgit tag --contains <merge sha>) actually has a chain to follow. The repo's PR template prompts for this; deleting the prompt is fine when the PR genuinely closes nothing.
  • Stage human verification for visible bugs. When the symptom needs an eye to confirm — UI, platform-native behavior, animations, race conditions a unit test can't see — green specs alone aren't acceptance. Stand up a buggy-vs-fix comparison the reviewer can drive themselves (typical shape: two namespaced runtimes, one on main, one on the fix branch), and seed any required data only through production HTTP APIs; source-level test backdoors invalidate the verification because they prove a fake flow rather than the real one.

For a worked example of one full loop (red e2e spec → fix → green), see e2e/tests/dialog/stop-reconciles-message.test.ts (issue #135).

Common commands

pnpm install
pnpm tools-dev
pnpm tools-dev start web
pnpm tools-dev run web --daemon-port 17456 --web-port 17573
pnpm tools-dev status --json
pnpm tools-dev logs --json
pnpm tools-dev inspect desktop status --json
pnpm tools-dev inspect desktop screenshot --path /tmp/open-design.png
pnpm tools-dev stop
pnpm tools-dev check
pnpm guard
pnpm typecheck
pnpm tools-pr list
pnpm tools-pr list --bucket=merge-ready,approved-blocked
pnpm tools-pr list --lane=skill,contract --json
pnpm tools-pr view 1180
pnpm tools-pr view 1180 --json
pnpm --filter @open-design/web typecheck
pnpm --filter @open-design/web test
pnpm --filter @open-design/web build
pnpm --filter @open-design/daemon test
pnpm --filter @open-design/daemon build
pnpm --filter @open-design/desktop build
pnpm --filter @open-design/tools-dev build
pnpm --filter @open-design/tools-pack build
pnpm --filter @open-design/tools-pr build
pnpm tools-pack mac build --to all
pnpm tools-pack mac install
pnpm tools-pack mac cleanup
pnpm tools-pack win build --to nsis
pnpm tools-pack win install
pnpm tools-pack win cleanup
pnpm tools-pack linux build --to appimage
pnpm tools-pack linux install
pnpm tools-pack linux build --containerized

FAQ

Why is there no root pnpm dev / pnpm start?

To avoid starting daemon, web, and desktop through inconsistent env, port, namespace, or log paths. All local lifecycle flows must go through pnpm tools-dev.

Why should apps/nextjs not be restored?

The current web runtime is apps/web. The historical apps/nextjs layout has been removed from the active repo shape; restoring it would reintroduce duplicate app boundaries and stale scripts.

How does desktop discover the web URL?

Desktop queries runtime status through sidecar IPC. The web URL comes from tools-dev launch status, not from desktop guessing ports or reading web internals.

How are sidecar-proto, sidecar, and platform split?

@open-design/sidecar-proto owns Open Design app/mode/source constants, namespace validation, stamp fields/flags, IPC message schema, status shapes, and error semantics. @open-design/sidecar provides only generic bootstrap, IPC transport, path/runtime resolution, launch env, and JSON runtime files. @open-design/platform provides only generic OS process stamp serialization, command parsing, and process matching/search primitives, consuming the proto descriptor.

Where is data written?

The daemon writes .od/ by default: SQLite at .od/app.sqlite, agent CWDs under .od/projects/<id>/, saved renders under .od/artifacts/, and credentials at .od/media-config.json. Two env vars override the storage root, in order:

  1. OD_DATA_DIR=<dir> — relocates all daemon runtime data to <dir> (used by Playwright for test isolation, and by the packaged daemon and the Home Manager / NixOS modules to point the daemon at a writable directory when the install root is read-only). The path is resolved with ~/ expansion and relative paths anchored to <projectRoot>.
  2. OD_MEDIA_CONFIG_DIR=<dir> — narrower override that relocates only media-config.json. Same resolution semantics. Most installs do not need this; it exists for setups that want to keep API credentials in a different location from the rest of the runtime data.

Default precedence is OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od.

When is pnpm install required?

Run pnpm install after changing package manifests, workspace layout, command entrypoints, bin/link-related content, or after adding/removing workspace packages.