mirror of https://github.com/nexu-io/open-design.git synced 2026-05-31 19:04:39 +07:00

feat(tools-pr): add maintainer PR-duty workspace (#1259 )

* feat(tools-pr): add maintainer PR-duty workspace

Adds `tools/pr` as the maintainer-only control plane for PR-duty work on
this repo. Thin `gh` wrapper that encodes repo-specific knowledge:
review lanes, forbidden surfaces, lane-specific checklists, validation
command derivation from touched packages.

Subcommands:
- `list` — triage open queue by lane and review-state bucket.
- `view <num>` — agent-friendly review brief for a single PR.
- `classify [num]` — emit script-level tags for one PR or the whole
  open queue; full-queue JSON output lands under `.tmp/tools-pr/classify/`
  with rate-limit telemetry per run.
- `assignment` — assigner-perspective view of PR ownership, idle time,
  and blockers (derived from existing tags; no new judgments).

Tag dictionary (13 tags) covers: bot-only-approval, needs-rebase,
forbidden-surface, unlabeled, duplicate-title, non-ascii-slug,
maintainer-edits-disabled, org-member, unresolved-changes-requested,
stale-approval, and three awaiting-* timing tags. Each rule is
expressible as one factual sentence over `gh` data + repo paths — see
`tools/pr/AGENTS.md` for the full dictionary plus precision rules.

Templates in `tools/pr/templates/*.md` are aesthetic references for
recurring maintainer comments (duplicate-title ask, awaiting-author
nudge, agent-review brief shape). `templates/examples/` holds
frozen-in-time agent-review snapshots for three PR shapes.

Infrastructure:
- `gh()` wraps `execFile` with minimum-touch retry (2 attempts at 1s + 2s
  backoff) on transient 5xx / network errors. Persistent failures still
  surface — retry is anti-jitter, not an exponential-backoff resilience
  layer.
- Heavy chunks (`reviews`, `comments`, `commits`, assignment timelines)
  use cursor-paginated `gh api graphql` via `fetchPaginatedPrList` to
  stay under GitHub's GraphQL server-side timeout. Light chunks stay on
  `gh pr list --json`.
- `fetchOrgMembers` cached per process via `gh api orgs/<owner>/members
  --paginate`.

Wiring:
- Root `package.json` adds `pnpm tools-pr` to the allowed root entry
  points.
- `scripts/postinstall.mjs` builds `tools/pr` alongside other workspace
  packages.
- `scripts/guard.ts` allowlists `tools/pr/bin/tools-pr.mjs` and
  `tools/pr/esbuild.config.mjs`, and adds `pr/` to the `tools/` top-level
  layout allowlist.
- Root `AGENTS.md` and `tools/AGENTS.md` document the new command
  surface, root-command-boundary update, and per-tool ownership.

* docs(agents): brief tools-pr in root AGENTS.md, link to tools/pr/AGENTS.md

Adds a `PR-duty tooling` section to the root AGENTS.md summarising what
`pnpm tools-pr` is, listing the four common subcommands (list / view /
classify / assignment), and pointing readers to `tools/pr/AGENTS.md` for
the full tag dictionary, operational playbook, templates, and design
rules. The section keeps root-level guidance to high-level orientation
while details stay local to the tool's own AGENTS.md.

* fix(tools-pr): drop overly broad touches-root-package.json forbidden hit

`deriveForbidden` was flagging any change to root `package.json` as a
forbidden-surface hit, but AGENTS.md §Root command boundary only forbids
specific *lifecycle* aliases (pnpm dev / test / build / daemon / preview
/ start) — tools-control-plane entrypoints like `pnpm tools-pr` are
explicitly allowed. Distinguishing "forbidden alias" from "allowed
entry" requires reading the diff content, which is `pnpm guard`'s job
rather than a path-derived classify tag.

Dogfooded on this branch's own PR (#1259), which added the `pnpm
tools-pr` script and was incorrectly flagged. Removing the hit aligns
the `forbidden-surface` tag with what tools-pr can mechanically detect
from file paths alone (apps/nextjs/, packages/shared/).

* fix(tools-pr): paginate commits fetch, recognise ready-to-merge, escape title-index separator

Three review follow-ups on #1259, all factual fixes:

- `fetchOpenPrCommits` now uses `fetchPaginatedPrList` instead of a
  one-shot `pullRequests(first: $first)` query. GitHub GraphQL caps
  connection page size at 100, so the previous implementation would
  fail at runtime when callers passed `--limit > 100`. The paginated
  path makes the commits fetch consistent with the other heavy chunks
  (reviews, comments, assignment timelines) and removes the artificial
  ceiling entirely. The `limit` parameter is dropped from
  `fetchOpenPrCommits`; the CLI `--limit` continues to bound the
  `gh pr list --json` chunks.
- `deriveStatus` in `assignment.ts` now reads `facts.reviewDecision`
  and `facts.mergeStateStatus`. When the PR is `APPROVED` with merge
  state `CLEAN` or `UNSTABLE` and carries no blockers, status renders
  as `ready to merge` instead of falling through to `in review`. The
  assignment view loses its main triage signal without this — a clean
  human-approved PR rendered identical to a REVIEW_REQUIRED one.
- `tags.ts:tagDuplicateTitle` and `tags.ts:buildContext` both
  constructed the title-index key with a literal NUL byte between
  author and title, which made the file appear as binary in `git diff`
  / review tooling. Replaced the literal byte with a Unicode escape
  sequence in source; the runtime string value is identical, the
  source stays plain text and round-trips through review tooling
  cleanly.

* fix(tools-pr): raise default --limit to 1000 to cover the live open queue

mrcfps flagged that `tools-pr list` (and `classify --all`, `assignment`)
defaults to `--limit 100`, which silently drops every PR past the first
100 in the open queue. The repo currently sits at 104 open PRs, so the
out-of-the-box run was already omitting four PRs.

Raise the default to 1000 in `list.ts`, `classify.ts`, and `assignment.ts`,
and remove the now-pointless 200 ceiling — `gh pr list --limit N` paginates
internally, so a high cap is cheap. Users can still pass `--limit <small>`
for a truncated preview. CLI help text on the three subcommands updated to
match.

* fix(web): pass designTemplates to ProjectView render helper

#955 made `designTemplates` a required Prop on ProjectView, but the test
helper added in #1244 (`renderProjectView` in
`ProjectView.api-empty-response.test.tsx`) was never updated. The two
PRs landed on main without conflicting, leaving `apps/web` typecheck red
for every PR that rebases past b5eb8c16.

Pass `designTemplates={[] as SkillSummary[]}` alongside the existing
`skills={[] as SkillSummary[]}` so the helper compiles. The component
already treats the array shape (empty included) as a no-op fallback in
the empty-response paths the test exercises.

* fix(tools-pr): correct author signal + merge inline review comments

Two correctness gaps in the awaiting-* signal pipeline surfaced during
review of the new tools-pr commands:

1. `authorSignalAt` iterated every PR commit unconditionally. On
   `maintainerCanModify=true` PRs a maintainer's follow-up push would
   advance the author timestamp, masking a stalled author response.
   Filter commits to those whose `authorLogin` matches `facts.author`,
   mirroring the same filter already applied to comments.

2. `fetchOpenPrComments` (and `fetchView`) only fetched
   `pullRequest.comments` / `gh pr view --json comments`, which is the
   issue-conversation thread. Inline review-thread replies — where
   authors and reviewers actually exchange most fix-up replies — live in
   `reviewThreads.comments` / REST `pulls/{n}/comments`. Missing them let
   `humanReviewerSignalAt` / `authorSignalAt` and the `view` brief point
   at the wrong side after someone replied inline. Extend the list-mode
   GraphQL to also sweep `reviewThreads(last: 20).comments(first: 20)`,
   and add a parallel REST inline-comments fetch in `fetchView` that
   merges into `GhView.comments`.

2026-05-11 19:17:21 +08:00

16 KiB

Raw Blame History

Directory guide

This file is the single source of truth for agents entering this repository. Read this file first; after entering apps/, packages/, tools/, or e2e/, read that layer's AGENTS.md for module-level details. Do not copy module details back into the root file; root stays focused on cross-repository boundaries, workflow, and commands.

Core documentation index

Product and onboarding: README.md, README.zh-CN.md, QUICKSTART.md.
Contribution and environment: CONTRIBUTING.md, CONTRIBUTING.zh-CN.md.
Architecture and protocols: docs/spec.md, docs/architecture.md, docs/skills-protocol.md, docs/agent-adapters.md, docs/modes.md.
Roadmap and references: docs/roadmap.md, docs/references.md, docs/code-review-guidelines.md, specs/current/maintainability-roadmap.md.
Directory-level agent guidance: apps/AGENTS.md, packages/AGENTS.md, tools/AGENTS.md, e2e/AGENTS.md.

Workspace directories

Workspace packages come from pnpm-workspace.yaml: apps/*, packages/*, tools/*, and e2e.
Top-level content directories: skills/ (functional skills the agent invokes mid-task — utilities, briefs, packagers; see skills/AGENTS.md), design-templates/ (rendering catalogue: decks, prototypes, image/video/audio templates; see design-templates/AGENTS.md and specs/current/skills-and-design-templates.md), design-systems/ (brand DESIGN.md files), craft/ (universal brand-agnostic craft rules a skill can opt into via od.craft.requires).
apps/web is the Next.js 16 App Router + React 18 web runtime; do not restore apps/nextjs.
apps/daemon is the local privileged daemon and od bin. It owns /api/*, agent spawning, skills, design systems, artifacts, and static serving.
apps/desktop is the Electron shell; it discovers the web URL through sidecar IPC.
apps/packaged is the thin packaged Electron runtime entry; it starts packaged sidecars and owns the od:// entry glue only.
packages/contracts is the pure TypeScript web/daemon app contract layer.
packages/sidecar-proto owns the Open Design sidecar business protocol; packages/sidecar owns the generic sidecar runtime; packages/platform owns generic OS process primitives.
tools/dev is the local development lifecycle control plane.
tools/pack is the local packaged build/start/stop/logs control plane and mac beta release artifact preparation surface.
tools/pr is the maintainer PR-duty control plane: a thin gh wrapper that encodes this repo's review-lane derivation, forbidden-surface flags, lane checklists, and validation-command suggestions.
e2e owns user-level end-to-end smoke tests and Playwright UI automation; read e2e/AGENTS.md before editing its tests or commands.

Inactive or placeholder directories

apps/nextjs and packages/shared have been removed; do not recreate or reference them.
.od/, .tmp/, Playwright reports, and agent scratch directories are local runtime data and must stay out of git.

Development workflow

Environment baseline

Runtime target is Node ~24 and pnpm@10.33.2; use Corepack so the pnpm version pinned in package.json is selected.
New project-owned entrypoints, modules, scripts, tests, reporters, and configs should default to TypeScript.
Residual JavaScript is limited to generated output, vendored dependencies, explicitly documented compatibility build artifacts, and the allowlist in scripts/guard.ts.

Local lifecycle

Use pnpm tools-dev as the only local development lifecycle entry point.
Do not add or restore root lifecycle aliases: pnpm dev, pnpm dev:all, pnpm daemon, pnpm preview, or pnpm start.
Ports are governed by tools-dev flags: --daemon-port and --web-port.
tools-dev exports OD_PORT for the web proxy target and OD_WEB_PORT for the web listener; do not use NEXT_PORT.

Root command boundary

Keep root scripts reserved for true repo-level checks and tools control-plane entrypoints: pnpm guard, pnpm typecheck, pnpm tools-dev, pnpm tools-pack, and pnpm tools-pr.
Do not add root aggregate pnpm build or pnpm test aliases. Build/test commands must stay package-scoped (pnpm --filter <package> ...) or tool-scoped (pnpm tools-pack ... / pnpm tools-pr ...).
Do not add root e2e aliases; e2e package commands and ownership rules live in e2e/AGENTS.md.

Boundary constraints

Tests under apps/, packages/, and tools/ live in a package/app/tool-level tests/ directory sibling to src/; keep src/ source-only and do not add new *.test.ts or *.test.tsx files under src/. Playwright UI automation belongs to e2e/ui/, not app packages.
App packages must not import another app's private src/ or tests/ implementation as a shared helper. In particular, apps/web/** must not import apps/daemon/src/**; web/daemon integration belongs behind HTTP APIs, packages/contracts, and app-local provider boundaries.
Cross-app, cross-runtime, or repository-resource consistency checks belong in e2e/tests/ when they need to observe more than one app/package boundary; promote reusable logic to a pure package instead of borrowing another app's private source.
Keep shared API DTOs, SSE event unions, error shapes, task shapes, and example payloads in packages/contracts; update contracts before wiring divergent web/daemon request or response shapes.
Keep packages/contracts pure TypeScript and free of Next.js, Express, Node filesystem/process APIs, browser APIs, SQLite, daemon internals, and sidecar control-plane dependencies.
Keep project-owned entrypoints, modules, scripts, tests, reporters, and configs TypeScript-first; generated dist/*.js is runtime output, and source edits belong in .ts files.
New .js, .mjs, or .cjs files need an explicit generated/vendor/compatibility reason and must pass pnpm guard.
App business logic must not know about sidecar/control-plane concepts. Keep sidecar awareness in apps/<app>/sidecar or the desktop sidecar entry wrapper.
Shared web/daemon app contracts belong in packages/contracts; that package must not depend on Next.js, Express, Node filesystem/process APIs, browser APIs, SQLite, daemon internals, or the sidecar control-plane protocol.
Sidecar process stamps must have exactly five fields: app, mode, namespace, ipc, and source.
Orchestration layers (tools-dev, tools-pack, packaged launchers) must call package primitives; do not hand-build --od-stamp-* args or process-scan regexes.
Packaged runtime paths must be namespace-scoped and independent from daemon/web ports; ports are transient transport details only.
Default runtime files live under <project-root>/.tmp/<source>/<namespace>/...; POSIX IPC sockets are fixed at /tmp/open-design/ipc/<namespace>/<app>.sock.

Git commit policy

Git commits must not include Co-authored-by trailers or any other co-author metadata.

Code review guide

Use docs/code-review-guidelines.md as the repository-wide review standard. That document is the operational guide; this AGENTS.md is the source of truth when the two disagree.
Walk reviews top-down through docs/code-review-guidelines.md: Product relevance test → forbidden surfaces → ownership/scope → matching lane → checklist → comments → approval bar.
Pick the matching review lane: default code/tests, contract and protocol changes, design-system additions, skill additions, or craft additions.
Before reviewing changes under apps/, packages/, tools/, or e2e/, read that directory's AGENTS.md and apply its local boundaries.
Blocking review feedback should focus on correctness, security/secrets, data integrity, repository boundary violations, contract/migration breakage, missing required validation, or high-risk maintainability issues.
Only maintainers may close a PR instead of requesting changes, and only when the change is not salvageable on the existing branch (wrong target product, foreign test harness, DOM/API assumptions absent from this repo, or scripts that conflict with lifecycle rules).

PR-duty tooling

pnpm tools-pr is the maintainer-only control plane for PR-duty work on this repo. It is a thin gh wrapper that encodes repo-specific knowledge — review-lane derivation, forbidden-surface flags, per-lane checklists, validation-command suggestions, and a fixed dictionary of factual classify tags (bot-only-approval, needs-rebase, stale-approval, unresolved-changes-requested, awaiting-* timing, org-member, etc.). The tool is read-only on the PR surface: it never approves, merges, comments, or closes; those side effects stay in explicit gh invocations the maintainer runs.

Common subcommands:

pnpm tools-pr list — triage the open queue by lane and review-state bucket.
pnpm tools-pr view <num> — factual review brief for a single PR.
pnpm tools-pr classify --all — script-level tag JSON for the whole open queue (entry point for cron / digest consumers); per-PR classify <num> for spot checks.
pnpm tools-pr assignment — assigner-perspective ownership + idle-time / blocker view across the queue.

For the full tag dictionary, operational playbook (direct merge / duplicate-title / awaiting-author / org-member / agent-review flows), comment templates, language-detection rules, and tool-design constraints (precision boundaries, factual-output rule, retry + pagination strategy), see tools/pr/AGENTS.md.

Validation strategy

After package, workspace, or command-entry changes, run pnpm install so workspace links and generated dist entries stay fresh.
Before marking regular work ready, run at least pnpm guard and pnpm typecheck, plus the package-scoped tests/builds that match the files changed. Do not use or add root pnpm test/pnpm build aliases.
For local web runtime loops, prefer pnpm tools-dev run web --daemon-port <port> --web-port <port>.
On a GUI-capable machine, validate desktop by running pnpm tools-dev, then pnpm tools-dev inspect desktop status.
Stamp/namespace changes must validate two concurrent namespaces and run desktop inspect eval plus inspect screenshot for each namespace.
Path/log changes must run pnpm tools-dev logs --namespace <name> --json and confirm log paths are under .tmp/tools-dev/<namespace>/....

Bug follow-up workflow

The following is a working playbook for routine bug follow-ups, distilled from recent practice. Treat it as a default action shape, not a contract — production reality always has edges these bullets can't anticipate, so use judgment when the situation doesn't fit cleanly.

Lead with a red spec. Default to encoding the bug as a falsifiable test that goes red before any source change, so the fix is anchored in observable behavior rather than source-code intuition. If a red spec can't be written cheaply, that's usually a signal to clarify scope rather than push forward on a guess.
Try the cheapest layer first. Reach for the lightest test layer that can still see the symptom (e2e Vitest at the daemon HTTP boundary → app-local Vitest → Playwright UI → platform-native harnesses), and drop down only when the cheaper layer can't.
Hold the spec's scope. Defects discovered outside the bug's described boundary belong in a follow-up — their own red spec, their own PR — not in this fix. List them in the PR body's "Adjacent issues" section with the rationale and move on.
Let the fix read as an invariant. Prefer a named helper whose docblock describes what must hold over a bolt-on if guard with apologetic history-comments. The call site should read as intent.
Diff against the baseline. When neighboring suites have pre-existing failures, stash or check out upstream before claiming "no new failures."
Link the issue from the PR body. Use Fixes #N / Closes #N / Resolves #N so the issue auto-closes on merge and the release-time reverse lookup (gh issue view N --json closedByPullRequestsReferences → git tag --contains <merge sha>) actually has a chain to follow. The repo's PR template prompts for this; deleting the prompt is fine when the PR genuinely closes nothing.
Stage human verification for visible bugs. When the symptom needs an eye to confirm — UI, platform-native behavior, animations, race conditions a unit test can't see — green specs alone aren't acceptance. Stand up a buggy-vs-fix comparison the reviewer can drive themselves (typical shape: two namespaced runtimes, one on main, one on the fix branch), and seed any required data only through production HTTP APIs; source-level test backdoors invalidate the verification because they prove a fake flow rather than the real one.

For a worked example of one full loop (red e2e spec → fix → green), see e2e/tests/dialog/stop-reconciles-message.test.ts (issue #135).

Common commands

pnpm install
pnpm tools-dev
pnpm tools-dev start web
pnpm tools-dev run web --daemon-port 17456 --web-port 17573
pnpm tools-dev status --json
pnpm tools-dev logs --json
pnpm tools-dev inspect desktop status --json
pnpm tools-dev inspect desktop screenshot --path /tmp/open-design.png
pnpm tools-dev stop
pnpm tools-dev check

pnpm guard
pnpm typecheck

pnpm tools-pr list
pnpm tools-pr list --bucket=merge-ready,approved-blocked
pnpm tools-pr list --lane=skill,contract --json
pnpm tools-pr view 1180
pnpm tools-pr view 1180 --json

pnpm --filter @open-design/web typecheck
pnpm --filter @open-design/web test
pnpm --filter @open-design/web build
pnpm --filter @open-design/daemon test
pnpm --filter @open-design/daemon build
pnpm --filter @open-design/desktop build
pnpm --filter @open-design/tools-dev build
pnpm --filter @open-design/tools-pack build
pnpm --filter @open-design/tools-pr build

pnpm tools-pack mac build --to all
pnpm tools-pack mac install
pnpm tools-pack mac cleanup
pnpm tools-pack win build --to nsis
pnpm tools-pack win install
pnpm tools-pack win cleanup
pnpm tools-pack linux build --to appimage
pnpm tools-pack linux install
pnpm tools-pack linux build --containerized

FAQ

Why is there no root `pnpm dev` / `pnpm start`?

To avoid starting daemon, web, and desktop through inconsistent env, port, namespace, or log paths. All local lifecycle flows must go through pnpm tools-dev.

Why should `apps/nextjs` not be restored?

The current web runtime is apps/web. The historical apps/nextjs layout has been removed from the active repo shape; restoring it would reintroduce duplicate app boundaries and stale scripts.

How does desktop discover the web URL?

Desktop queries runtime status through sidecar IPC. The web URL comes from tools-dev launch status, not from desktop guessing ports or reading web internals.

How are sidecar-proto, sidecar, and platform split?

@open-design/sidecar-proto owns Open Design app/mode/source constants, namespace validation, stamp fields/flags, IPC message schema, status shapes, and error semantics. @open-design/sidecar provides only generic bootstrap, IPC transport, path/runtime resolution, launch env, and JSON runtime files. @open-design/platform provides only generic OS process stamp serialization, command parsing, and process matching/search primitives, consuming the proto descriptor.

Where is data written?

The daemon writes .od/ by default: SQLite at .od/app.sqlite, agent CWDs under .od/projects/<id>/, saved renders under .od/artifacts/, and credentials at .od/media-config.json. Two env vars override the storage root, in order:

OD_DATA_DIR=<dir> — relocates all daemon runtime data to <dir> (used by Playwright for test isolation, and by the packaged daemon and the Home Manager / NixOS modules to point the daemon at a writable directory when the install root is read-only). The path is resolved with ~/ expansion and relative paths anchored to <projectRoot>.
OD_MEDIA_CONFIG_DIR=<dir> — narrower override that relocates only media-config.json. Same resolution semantics. Most installs do not need this; it exists for setups that want to keep API credentials in a different location from the rest of the runtime data.

Default precedence is OD_MEDIA_CONFIG_DIR > OD_DATA_DIR > <projectRoot>/.od.

When is `pnpm install` required?

Run pnpm install after changing package manifests, workspace layout, command entrypoints, bin/link-related content, or after adding/removing workspace packages.

16 KiB Raw Blame History