open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Author	SHA1	Message	Date
PerishFire	c3d41c7d45	fix(tools-pr): chunk stats fetch through cursor-paginated GraphQL (#1285) `fetchOpenPrs` was reading the stats chunk via `gh pr list --limit 1000 --json mergeStateStatus,...`. With the default limit raised to 1000 in #1259, this 502s reliably on the live open queue (107 PRs): GitHub's GraphQL gateway has to recompute mergeStateStatus for every PR up front, and the resulting query exceeds the gateway budget once the requested page passes ~60 PRs. Switch the stats chunk to `fetchPaginatedPrList`, the same cursor-paginated GraphQL helper that already drives reviews / comments / commits / assignment-timelines. Page size stays at PR_LIST_PAGE_SIZE (30), well within the gateway budget, and the heavy stats fetch is now consistent with the other heavy chunks. Verified locally: `pnpm tools-pr list` now completes against the live 107-PR queue without a 502.	2026-05-11 20:51:29 +08:00
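The cursor-paginated fetch described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the repository's `fetchPaginatedPrList`: the `Page` shape, `fetchAllPages`, and the in-memory `makeFakeBackend` are hypothetical names introduced here, and only the page size of 30 comes from the commit message.

```typescript
// Hypothetical page shape; the real helper's types are not shown in the log.
interface Page<T> {
  nodes: T[];
  endCursor: string | null;
  hasNextPage: boolean;
}

type FetchPage<T> = (first: number, after: string | null) => Promise<Page<T>>;

const PR_LIST_PAGE_SIZE = 30; // page size named in the commit message

// Request small pages and follow the cursor until the backend reports no more pages.
async function fetchAllPages<T>(
  fetchPage: FetchPage<T>,
  pageSize: number = PR_LIST_PAGE_SIZE,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  for (;;) {
    const page: Page<T> = await fetchPage(pageSize, cursor);
    all.push(...page.nodes);
    if (!page.hasNextPage || page.endCursor === null) return all;
    cursor = page.endCursor;
  }
}

// Fake in-memory backend standing in for the GraphQL endpoint (assumption for the demo).
function makeFakeBackend(total: number): { fetchPage: FetchPage<number>; calls: number[] } {
  const items = Array.from({ length: total }, (_, i) => i + 1);
  const calls: number[] = [];
  const fetchPage: FetchPage<number> = async (first, after) => {
    const start = after === null ? 0 : Number(after);
    const nodes = items.slice(start, start + first);
    calls.push(nodes.length); // record how many items each request returned
    const end = start + nodes.length;
    return { nodes, endCursor: String(end), hasNextPage: end < items.length };
  };
  return { fetchPage, calls };
}
```

With a 107-item fake queue, this loop issues four requests, and no single request asks for more than 30 items, which is the property the commit relies on to stay within the gateway budget.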
nettee	be77dc0394	Default English resource i18n fallback (#1270)	2026-05-11 20:29:05 +08:00
Joey-nexu	12ac2e988e	docs: add Maintainer rules (MAINTAINERS.md + CONTRIBUTING entry-point) (#1290 ) Adds a public set of rules for the External Maintainer role: who qualifies, how nominations work, what permissions Maintainers gain, what's expected of them, and how step-down works. The Core Team's individual roster is intentionally not enumerated. What's public is the rules everyone plays by. - New file: MAINTAINERS.md (English authoritative version) + 5 locale variants matching the existing CONTRIBUTING.md i18n surface (de, fr, ja-JP, pt-BR, zh-CN). Non-EN/non-zh-CN variants are machine-translated drafts marked at the top — native-speaker review is welcome via follow-up PRs. - CONTRIBUTING.md (and its 5 locale variants): adds a short Becoming a Maintainer section that points at MAINTAINERS.md, so the rules live in one place and translation drift is bounded. Decisions not in this PR (intentional): - No internal Core Team roster. - No internal observability dashboards. - No nomination PR / public voting flow (Core-Team-consensus-driven for now; to be revisited once External Maintainers exceed 5). Co-authored-by: Joey Li <lijinwei@open-design.ai>	2026-05-11 20:19:55 +08:00
Caprika	fb079d8115	Add reliable agent-browser skill (#1284) * Add reliable agent browser skill * Fix ProjectView delete conversation test props	2026-05-11 20:09:12 +08:00
PerishFire	1eb20e3807	fix(web): keep tweaks selection usable without annotations (#1268)	2026-05-11 20:06:49 +08:00
Sebastian Westberg	8962088c75	feat(daemon): guard against agent-emitted stub artifact regressions (#1171 ) * feat(daemon): guard against agent-emitted stub artifact regressions When an agent emits an <artifact> block whose body is a placeholder ("see other-file.html in this project", a bare filename string, a tiny fallback page) instead of the full document, the daemon writes the placeholder to disk verbatim. Users see a 25-500 byte HTML file where their previous version had tens of kilobytes of real markup. Add a structural regression guard in writeProjectFile: before writing an html/deck artifact whose manifest carries metadata.identifier, scan the project dir for prior siblings matching <identifier>(-\d+)?\.html? and compare sizes. If the new body is below minRetainedRatio (default 0.2) of the largest prior sibling >= minPriorBytes (default 4096), flag a regression. Three modes via env: - OD_ARTIFACT_STUB_GUARD=warn (default) writes the file and attaches stubGuardWarning to the response so the frontend can surface it. - OD_ARTIFACT_STUB_GUARD=reject throws ArtifactRegressionError before fs.writeFile; the route returns 422 ARTIFACT_REGRESSION with the prior sibling's name and size in error.details. - OD_ARTIFACT_STUB_GUARD=off skips the guard entirely. Cross-agent by design: anchored on size delta + identifier match, no agent-specific stub-phrase regex, so works for any agent backend behind the agent-adapter abstraction. The body-then-manifest write order pre-dates this change; the reject path throws before fs.writeFile so rejections never leave a partial state behind. 24 unit + 8 HTTP tests cover happy paths, all three modes, deck kind, .htm extension sibling detection, ratio=1 edge case, and verify rejected writes leave neither the html nor its manifest sidecar on disk. * fix(stub-guard): close same-name, nested-dir, and non-slug bypasses Code review on PR #1171 (lefarcen, Codex, mrcfps) found three holes where the stub guard could be silently bypassed. 
All three are now closed with HTTP test coverage. Same-name overwrite (lefarcen P1): the writer's prior-sibling scan deliberately skipped the file at safeName, but for an in-session overwrite (persistArtifact reuses the same fileName when savedArtifactRef.current matches) that file is the prior content, not the new entry. Drop the exclude-by-name filter; the current on-disk size at scan time is always the prior because the overwrite happens after this check. Subdirectory scoping (Codex/mrcfps P2): writeProjectFile creates parent directories for nested paths like reports/overview.html, but the guard only scanned the project root. Pass path.dirname(target) as scanDir so nested artifacts are evaluated against their real sibling set. Non-slug identifier (Codex/lefarcen/mrcfps P2): the web's persistArtifact slugifies the filename basename but stores the raw identifier in the manifest, so an identifier like "Landing Page" yields filename landing-page.html with metadata.identifier="Landing Page". Build the sibling regex from both the raw identifier and a slugified variant (mirroring the frontend's slugifier) so either form matches the same priors. Also surface warn-mode warnings in the web UI: ProjectView now checks file.stubGuardWarning after writeProjectTextFile and renders the warning via setError. Reject-mode 422 surfacing requires restructuring writeProjectTextFile's return contract and is deferred. API change inside the daemon: evaluateArtifactStubGuard / findPriorArtifactSiblings drop excludeSafeName and rename projectDir to scanDir. Tests updated. Tests: 4 new HTTP cases (same-name overwrite preserves prior body, nested subdir rejects, slug-form match rejects, plus the existing warn/off/deck/.htm cases) and 1 new unit case (slug-form sibling match). 44 tests pass. * fix(stub-guard): empty-slug fallback + reject-mode UI surface Round 3 review on PR #1171 (lefarcen, mrcfps) found two remaining holes after `9cc82430` closed the same-name / subdir / non-slug bypasses. 
Empty-slug fallback bypass (lefarcen P2): an identifier like "测试" (all-non-ASCII) strips to empty through the web slugifier, and persistArtifact's `slice(0,60) \|\| 'artifact'` falls back to the literal "artifact" basename. The guard searched for raw identifier + slug only, so a later artifact-2.html stub bypassed the prior. Add EMPTY_SLUG_FALLBACK_NAME = 'artifact' as a sibling-name candidate when the slug is empty, mirroring the frontend fallback exactly. Reject-mode UI silence (mrcfps P2 + lefarcen P2): writeProjectTextFile collapses any non-OK response (including 422 ARTIFACT_REGRESSION) to null, and persistArtifact previously had no else branch. Users in reject mode saw the daemon log fire but the UI was silent. Add an else branch that surfaces a generic banner pointing at the most likely cause and mentions checking the daemon logs for structured details. Also clear savedArtifactRef.current on failure so retries re-enter the persistence path. Plumbing the structured 422 details through writeProjectTextFile itself remains out of scope (cross-cutting client contract change affecting 5+ call sites). The generic banner is the "at minimum" path mrcfps suggested. Tests: 1 new unit case (artifact.html sibling discovery for non-ASCII identifier) + 1 new HTTP case (empty-slug stub regression rejected end-to-end). 46 tests pass across stub-guard suites (was 44). * fix(stub-guard): verify sidecar identity to avoid cross-identifier false positives Round 4 review on PR #1171 (mrcfps inline + lefarcen review) caught a false-positive introduced by the round-3 empty-slug fallback. Two distinct identifiers that both slugify to empty (e.g. "测试" and "首页") share the artifact.html basename, so a brand-new save under the second identifier was being compared against — and falsely rejected because of — the unrelated first. 
The same shape exists symmetrically: a non-empty-slug identifier literally named "artifact" would falsely match empty-slug fallback files written under any other identifier. Fix: filename pattern matching is now a candidate generator, not the source of truth. For every candidate sibling, read its .artifact.json sidecar and verify metadata.identifier matches the input via artifactIdentifiersMatch (raw equality OR shared non-empty slug). Files without a sidecar are skipped: they weren't written through the artifact-tag path this guard targets, and treating them as priors was always a stretch. Empty-slug equivalence is intentionally NOT honored: 测试 != 首页 even though both slugify to empty. The whole bug was conflating distinct identifiers via the fallback name; slug-equivalence kicks in only for non-empty slugs (Landing Page <-> landing-page). Tests: unit fixtures now write file+sidecar pairs (mirrors prod); new artifactIdentifiersMatch suite covers the 5 equivalence cases; new HTTP test "does NOT cross-reject distinct empty-slug identifiers" asserts the second save returns 200 instead of 422; new unit test skips files without a sidecar. 42 tests pass across stub-guard suites. * fix(stub-guard): require canonical-form anchor in identifier match to avoid 60-char truncation collisions Round 5 review on PR #1171 (mrcfps) caught another false-positive in artifactIdentifiersMatch: slugifyArtifactIdentifier truncates at 60 chars, so two distinct >60-char identifiers that share their first 60 chars (e.g. "A...A1" and "A...A2", 70 chars each) slugify to the same string and would falsely bridge. Same shape as the empty-slug fallback bug from round 4, just at the other end of the input range. Tighten the rule: slug-equivalence requires at least one input to BE its own canonical slug form. That keeps the legitimate bridge ("Landing Page" <-> "landing-page", where the second input IS the slug) but rejects truncation collisions ("A...A1" <-> "A...A2", where neither is in canonical form).
Side effect: two non-canonical forms that slugify to the same value no longer bridge (e.g. "Landing Page" vs "LANDING-PAGE"). This is correct: without one canonical anchor we can't safely call them the same lineage. Updated the slug-equivalence test to assert the new semantics explicitly with both directions and a negative case. Tests: 2 new cases (no bridge for >60-char truncation collision; raw 70-char to its 60-char truncated slug still bridges) + 1 negative test for the non-canonical-pair case. 45 tests pass. * fix(stub-guard): cover legacy sidecar-less HTML priors Round 6 review on PR #1171 (mrcfps, non-blocking) caught a real legacy bypass: round 4's sidecar-required policy skipped any HTML file without an .artifact.json companion, but readManifestForPath (projects.ts) treats those same files as legitimate artifacts via inferLegacyManifest. So a project with an older sidecar-less dashboard.html (pre-sidecar era, Write-tool-emitted, paste-text, manual import, etc.) let its first stub rewrite through as a supposed "first emission". Fix: when the sidecar is missing, derive a synthetic identifier from the filename (strip the (-N)?\.html? suffix) and run it through the same artifactIdentifiersMatch rules. Synthetic identifiers come from already-slugified filenames, so they bridge raw inputs only via the canonical-form rule established in round 5 — no truncation collisions, no empty-slug conflation, no unrelated cross-identifier matches. Tests: 3 new unit cases (legacy fallback finds the prior; bridges raw->slug under the same rules; does NOT bridge unrelated slug forms via inference) + 1 new HTTP test that seeds a sidecar-less prior via the artifact-manifest-less write path and asserts the stub rewrite is rejected with 422 ARTIFACT_REGRESSION. 48 tests pass across stub-guard suites (was 45). 
* fix(stub-guard): try both interpretations for legacy filename inference Round 7 review on PR #1171 (mrcfps, non-blocking) caught a real ambiguity in the round-6 legacy fallback: a filename like `phase-2.html` is genuinely ambiguous without a sidecar. It could be the identifier "phase" with a -2 collision suffix, OR the standalone identifier "phase-2". The round-6 helper only stripped the suffix, so a sidecar-less `phase-2.html` followed by a stub emission with metadata.identifier="phase-2" bypassed the guard ("phase-2" doesn't match the inferred "phase"). Fix: when the sidecar is missing, generate both candidate identifiers (full basename and suffix-stripped basename) and accept the file as a prior if either matches. Visible false positives are preferable to silent false negatives — and the canonical-form anchor in artifactIdentifiersMatch still rules out truncation collisions and empty-slug conflations regardless of which candidate matched. Tests: 2 new unit cases (full-basename interpretation finds "phase-2"; suffix-stripped interpretation also finds "phase") and 1 new HTTP test that seeds a sidecar-less `phase-2.html` and asserts the stub rewrite is rejected with 422 ARTIFACT_REGRESSION. 51 tests pass across stub-guard suites (was 48). --------- Co-authored-by: Sebastian Westberg <sebastianwestberg@users.noreply.github.com>	2026-05-11 19:59:37 +08:00
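The two core rules of the stub guard, as the commit message above describes them, can be sketched like this: the size-ratio check against the largest eligible prior sibling, and the identifier match with the canonical-form anchor from round 5. This is an illustrative sketch only. The slugifier body is an assumption (the frontend's real slugifier is not shown in the log), `isStubRegression` is a hypothetical name, and sidecar reading, legacy filename inference, and the warn / reject / off modes are left out.

```typescript
const MAX_SLUG_LENGTH = 60; // truncation length cited in the commit message

// Assumed slugifier: lowercase, collapse non [a-z0-9] runs to "-", trim dashes, truncate.
function slugifyArtifactIdentifier(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, MAX_SLUG_LENGTH);
}

// Raw equality, OR a shared non-empty slug where at least one input already
// IS its own canonical slug (the "canonical-form anchor").
function artifactIdentifiersMatch(a: string, b: string): boolean {
  if (a === b) return true;
  const slugA = slugifyArtifactIdentifier(a);
  const slugB = slugifyArtifactIdentifier(b);
  if (slugA === "" || slugA !== slugB) return false; // empty slugs never bridge
  return a === slugA || b === slugB;
}

interface StubGuardOptions {
  minRetainedRatio: number; // default 0.2 per the commit message
  minPriorBytes: number; // default 4096 per the commit message
}

const DEFAULT_OPTIONS: StubGuardOptions = { minRetainedRatio: 0.2, minPriorBytes: 4096 };

// Flag a regression when the new body is below minRetainedRatio of the largest
// prior sibling that is at least minPriorBytes in size.
function isStubRegression(
  newBytes: number,
  priorSizes: number[],
  opts: StubGuardOptions = DEFAULT_OPTIONS,
): boolean {
  const eligible = priorSizes.filter((size) => size >= opts.minPriorBytes);
  if (eligible.length === 0) return false; // nothing big enough to compare against
  const largest = Math.max(...eligible);
  return newBytes < largest * opts.minRetainedRatio;
}
```

Under these assumptions, "Landing Page" bridges to "landing-page", while "Landing Page" and "LANDING-PAGE" do not bridge because neither is in canonical form, matching the semantics the commit message states.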
初晨	0f0d214298	fix(web): render static previews for sketch json files (#1060) * fix(web): render static previews for sketch json files * fix(web): tolerate malformed sketch text items * fix(web): harden sketch preview parsing * fix(web): preserve sketch items on round-trip * fix(web): clear sketch files destructively * fix(web): unblock unsupported sketch saves	2026-05-11 19:29:46 +08:00
Dongsen	fd67b680d7	fix(contracts): pin API-mode override above discovery layer (#313 ) (#1207 ) * fix(contracts): pin API-mode override above discovery layer (#313) The old streamFormat='plain' rule was appended at the BOTTOM of the composed prompt, but DISCOVERY_AND_PHILOSOPHY is pinned at the TOP with its own 'these override anything later' header — so its hard rules ('TodoWrite on turn 3', 'brand-spec extraction via Bash + Read + WebFetch') still won precedence in API mode. With no real tools wired through to the Anthropic Messages path, the agent narrated pseudo-tool markup (<todo-list>...</todo-list>, [读取 X]) instead of emitting structured tool_use events the UI could render. Move the API-mode override to the absolute top of the prompt so it beats the discovery layer, name every unavailable tool, and explicitly forbid the pseudo-tool / fake-protocol markup observed in #313. <artifact> output and <question-form> discovery are still allowed — both are markup the UI parses, not tool calls. * fix(daemon): mirror API-mode override above discovery layer (#313) Address Codex + mrcfps review on #1207: the daemon has its own copy of composeSystemPrompt that is hit by any adapter declaring streamFormat: 'plain' (e.g. DeepSeek) via server.ts:6190. That copy still appended the obsolete bottom '## API mode rule', which loses the precedence war against DISCOVERY_AND_PHILOSOPHY's 'these override anything later' header — so plain-stream daemon agents could still narrate <todo-list> / [读取 X] pseudo-tool markup. Mirror the same top-anchored API_MODE_OVERRIDE here (byte-identical to the contracts copy) so both code paths produce the same behaviour. Adds 8 daemon-side tests including the indexOf-based positional assertion that pins the override above the discovery layer header.	2026-05-11 19:29:34 +08:00
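The ordering fix and its indexOf-based positional check might look roughly like the sketch below. The section texts are placeholders and the `composeSystemPrompt` signature is an assumption for illustration; the log only establishes that the API-mode override is placed above the discovery layer when `streamFormat` is 'plain'.

```typescript
// Placeholder section texts; the real prompt contents are not reproduced in the log.
const API_MODE_OVERRIDE = "## API mode override\nNo tools are available in this mode.";
const DISCOVERY_AND_PHILOSOPHY = "## Discovery\nThese rules override anything later.";

// Hypothetical signature: the override goes first so the discovery layer's
// "override anything later" header cannot outrank it.
function composeSystemPrompt(opts: { streamFormat?: string; body: string }): string {
  const parts: string[] = [];
  if (opts.streamFormat === "plain") parts.push(API_MODE_OVERRIDE);
  parts.push(DISCOVERY_AND_PHILOSOPHY, opts.body);
  return parts.join("\n\n");
}

// Positional assertion in the spirit of the daemon-side test mentioned above.
function overridePrecedesDiscovery(prompt: string): boolean {
  const overrideAt = prompt.indexOf(API_MODE_OVERRIDE);
  const discoveryAt = prompt.indexOf(DISCOVERY_AND_PHILOSOPHY);
  return overrideAt !== -1 && discoveryAt !== -1 && overrideAt < discoveryAt;
}
```

Pinning the position rather than mere presence is the point: the earlier bug had the rule in the prompt, just in a place where it lost precedence.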
Dongsen	12ce5ad38b	fix(web): ignore <artifact> tags inside markdown code spans and fences (#1132 ) * test(web): add failing parser cases for <artifact> recitation in markdown code Cover the three real-world prose contexts where the model legitimately quotes the artifact tag without intending to emit one: - inside an inline backtick span - inside a fenced code block - spread across streaming chunks crossing the fence boundary Establishes the RED baseline before parser code-fence awareness lands. * fix(web): ignore <artifact> tags inside markdown code spans and fences The streaming artifact parser scanned the buffer with a raw indexOf, guarded only by 'next char must be whitespace'. That meant any literal <artifact ...> the model recited while documenting the protocol — even inside backticks or a ```html fence — flipped the parser into artifact mode, swallowed the rest of the reply from the chat UI, and (when a matching </artifact> appeared in the recitation) silently wrote a spurious file to disk via persistArtifact. Replace findOpenTag with a linear scan that tracks fenced code blocks (```) and inline code spans (`), skipping any <artifact prefix found inside either. If the buffer ends mid-fence, return a partial match anchored at the fence start so the next streaming chunk can resolve the boundary without losing fence context. Closes #1130. * fix(web): match renderer fence/inline-code rules in artifact parser Codex review on PR #1132 caught that the previous fix toggled inFence on any triple-backtick run anywhere in the buffer, including mid-line, while the chat renderer (apps/web/src/runtime/markdown.tsx) only treats ``` as a fence when it occupies a whole line matching /^[ ]{0,3}```(\w[\w+-])?\s$/. That asymmetry would suppress a real <artifact> tag emitted after a prose sentence like "the opening marker is ```html and the response then writes:". Rework findOpenTag in three passes that mirror the renderer: 1. 
Walk \n-terminated lines; only a line that matches FENCE_LINE_RE toggles fence state. Open fences without a close (or with an unterminated tail line) return partial so the next chunk can resolve. 2. Collect inline code spans with /`[^`]+`/g — the same regex used by renderInline — so what the parser skips matches what the user sees as code. Unmatched trailing backticks after the last \n hold back. 3. Find the first <artifact …> outside any skip range; preserve the existing partial-prefix tail handling. Adds a regression test covering the exact case Codex reported. * test(web): pin parser behavior on double-backtick and in-fence string literal recitation Two cases raised in PR #1132 review: - a real artifact tag wrapped in '``<artifact …>``' (double-backtick inline code span) should not be treated as a real artifact - a fenced JS example whose body contains a string literal like 'const fence = "```";' should not pop fence state early and let a later literal <artifact> be parsed as real Both already pass on 96e88ca because the line-anchored fence regex and the renderer-aligned inline regex handle them correctly. Pinning the behavior so future regressions surface as test failures. * fix(web): make stripArtifact markdown-aware to stop truncating literal recitations The streaming artifact parser was hardened in 96e88ca to skip <artifact> recitations inside backticks and fences, but the post-stream stripper at AssistantMessage.tsx still ran a naive 'content.indexOf("<artifact")' over the same text events. As reported by lefarcen on PR #1132, that meant chat replies with literal protocol recitations could still get silently truncated mid-explanation — even though the parser preserved them in the text stream and the file panel was no longer polluted with ghost files. 
Extract the renderer-aligned classification (FENCE_LINE_RE, INLINE_CODE_RE, computeSkipRanges, rangeContains) into a single source of truth at apps/web/src/artifacts/markdown-context.ts so the parser and the stripper agree on what counts as code. Add apps/web/src/artifacts/strip.ts with a markdown-aware stripArtifact that: - ignores any <artifact open inside a fenced block or inline code span - looks for </artifact> with the same skip-range filter, so a real open paired with a literal close inside backticks does not strip a literal body that is meant to render - returns content unchanged when an open exists with no matching real close (the previous implementation sliced to end-of-string, which would nuke trailing prose on a malformed or still-streaming tag) Refactor parser.ts to import the shared helpers; behavior preserved (all seven existing parser tests still pass). New strip.test.ts covers six cases including the empirically-verified inline-backtick regression. * fix(web): align artifact stripper/parser fence rules with renderer exactly Two gaps surfaced in review at a0bf05f: - markdown-context.ts used a single FENCE_LINE_RE that allowed 0-3 leading spaces and reused the same pattern for opening and closing fences. The chat renderer (runtime/markdown.tsx:44 and :49) is asymmetric — opens with /^```(\w[\w+-])?\s$/, closes with /^```\s$/, and rejects any leading indentation on either side. Indented " ```html" was being treated as a code fence even though the renderer keeps it as a paragraph, and a literal "```html" line inside an open fenced example was closing the skip range early — both could expose a real or literal <artifact …> to the wrong handler. - stripArtifact discarded computeSkipRanges' unclosedFenceStart, so a fenced literal that ends at EOF without a trailing newline (very common for chat output) leaked the inner <artifact …> recitation to the stripper, reproducing the original #1130 truncation symptom on a narrower input shape. 
Split FENCE_LINE_RE into FENCE_OPEN_RE / FENCE_CLOSE_RE with no leading indentation, gate the fence state machine on the right side of the toggle, and have stripArtifact extend skip ranges to end-of-content when a fence is left open. Also tightened the parser's tail-line hold-back regex to match the renderer's no-leading-space rule. Added regression tests for the EOF-unclosed-fence case, the indented pseudo-fence (renderer treats as paragraph, stripper must strip the real artifact), and a "```html" line inside an open fence. Refs nexu-io/open-design#1130 * refactor(web): align streaming tail-line fence guard with FENCE_OPEN_RE The streaming parser's tail-line hold-back used a stricter local regex (/^```\w$/) than the renderer's FENCE_OPEN_RE (/^```(\w[\w+-])?\s$/), missing valid opener tails like ```c++, ```ts-, or ``` (trailing space). In practice these tails are still held back by the unmatched-backtick parity scan that runs immediately after: three backticks in a tail line are odd, so firstUnmatched stays set and the parser holds from that position. So this wasn't a runtime correctness bug, just a regex divergence that future readers could trip on. Drop the local regex and reuse FENCE_OPEN_RE so the tail check matches the same shape the rest of the pipeline already uses. Pinned the behavior with three new parser tests (`+`/`-` info-string suffix and trailing-space tails arriving as the first chunk); they pass at HEAD, proving the parity scan was already covering these cases. Refs nexu-io/open-design#1132 (lefarcen polish P2) * fix(web): scope inline-code skip ranges per block and reject <artifact prefix-shared opens INLINE_CODE_RE previously ran over the whole buffer, so an unmatched backtick in one paragraph could pair with a backtick in a later paragraph and create a phantom inline span that swallowed any real <artifact …> between them.
Mirror runtime/markdown.tsx by splitting the buffer on fence / blank / heading / list / hr boundaries and running INLINE_CODE_RE per block region instead. stripArtifact accepted any unskipped `<artifact` substring as a real open, while the streaming parser already required a following whitespace character — so prose like `<artifactual>demo</artifact>` was being truncated to `prefix suffix`. Extract the parser's real-open guard into isRealArtifactOpenAt and reuse it from both sides. While reordering findOpenTag for the shared guard, also fix the related hold-back ordering issue tracked at #1141: a stray tail-line backtick or fence-opener prefix used to suppress an artifact already complete earlier in the buffer. Scan for the earliest complete real open first, then pick the earliest hold-back position only when no complete tag was found. Regressions pinned in parser.test.ts and strip.test.ts for both new finding shapes. * fix(web): keep HR-shaped lines inside paragraph regions for inline-code scanning The previous walker closed inline-scan regions on lines matching the HR regex, but `parseBlocks()` in runtime/markdown.tsx does not break a paragraph on HR — its inner accumulation loop only breaks on blank / fence / heading / ul / ol (runtime/markdown.tsx:95-104). HR is only an HR block in the outer loop's first-look, never mid-paragraph. So inputs like `intro \`\n---\n<artifact …>…</artifact>\n---\nclosing \`` are one paragraph in the renderer, whose two stray backticks pair to cover the literal artifact recitation — but the walker was splitting on the `---` lines, leaving the recitation outside skip ranges, and the parser/stripper would treat it as a real tag. Drop HR from the paragraph-break list (HR-shaped lines carry no backticks of their own, so keeping them inside the surrounding region is benign either way) and document the renderer-mirror rationale. Regressions pinned on both sides.	2026-05-11 19:29:22 +08:00
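The skip-range idea from the commit above can be illustrated with a simplified, non-streaming sketch. The fence regex shapes below are assumptions chosen to reflect the described rules (opener may carry an info string, closer may not, no leading indentation), and prose regions are split only on blank lines and fences here, whereas the commit also breaks on heading and list boundaries. Function names reuse those mentioned in the message, but the bodies are not the repository's implementation, and the streaming partial-match handling is omitted.

```typescript
const TICKS = "`" + "`" + "`"; // built by concatenation to keep the sample fence-safe
// Assumed fence shapes, anchored to whole lines.
const FENCE_OPEN_RE = new RegExp("^" + TICKS + "(\\w[\\w+-]*)?\\s*$");
const FENCE_CLOSE_RE = new RegExp("^" + TICKS + "\\s*$");
const ARTIFACT_OPEN = "<artifact";

type Range = [number, number]; // [start, end), end exclusive

function computeSkipRanges(text: string): Range[] {
  const ranges: Range[] = [];
  let offset = 0;
  let fenceStart = -1; // offset of the open fence line, -1 when outside a fence
  let regionStart = -1; // start of the current prose region, -1 when none
  const flushRegion = (end: number): void => {
    if (regionStart < 0) return;
    const region = text.slice(regionStart, end);
    const inlineCode = /`[^`]+`/g; // inline-span pattern quoted in the commit message
    let m: RegExpExecArray | null;
    while ((m = inlineCode.exec(region)) !== null) {
      ranges.push([regionStart + m.index, regionStart + m.index + m[0].length]);
    }
    regionStart = -1;
  };
  for (const line of text.split("\n")) {
    const lineEnd = offset + line.length;
    if (fenceStart >= 0) {
      if (FENCE_CLOSE_RE.test(line)) {
        ranges.push([fenceStart, lineEnd]);
        fenceStart = -1;
      }
    } else if (FENCE_OPEN_RE.test(line)) {
      flushRegion(offset);
      fenceStart = offset;
    } else if (line.trim() === "") {
      flushRegion(offset); // a blank line ends the prose region
    } else if (regionStart < 0) {
      regionStart = offset;
    }
    offset = lineEnd + 1;
  }
  if (fenceStart >= 0) ranges.push([fenceStart, text.length]); // unclosed fence runs to EOF
  else flushRegion(text.length);
  return ranges;
}

// A real open is "<artifact" followed by whitespace, so "<artifactual>" is rejected.
function isRealArtifactOpenAt(text: string, i: number): boolean {
  return text.startsWith(ARTIFACT_OPEN, i) && /\s/.test(text.charAt(i + ARTIFACT_OPEN.length));
}

// Returns the index of the first real open tag outside code, or -1.
function findRealArtifactOpen(text: string): number {
  const skip = computeSkipRanges(text);
  let i = text.indexOf(ARTIFACT_OPEN);
  while (i !== -1) {
    const pos = i;
    const insideCode = skip.some(([start, end]) => pos >= start && pos < end);
    if (!insideCode && isRealArtifactOpenAt(text, pos)) return pos;
    i = text.indexOf(ARTIFACT_OPEN, pos + 1);
  }
  return -1;
}
```

Sharing one range computation between the parser and the stripper is the design choice the commit settles on, so both sides agree on what counts as code.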
Sid	156bf5a34e	fix(web): refresh home projects after deleting a conversation (#1202 ) (#1219 ) The home design cards render their `Needs input` badge from the cached `/api/projects` payload — App.tsx owns the `projects` state and exposes a `refreshProjects` callback that ProjectView already fires from every other state-changing branch (run end, live-artifact events, project rename, etc.). The conversation-delete branch silently skipped it: deleting a conversation that owned an unanswered `<question-form>` flips the daemon-side flag, but the home view kept showing the stale badge until the next manual reload. Call `onProjectsRefresh()` immediately after a successful `deleteConversation` API response (and only then — if the request fails the cached state is still the truth and we must not pretend otherwise). Adds `onProjectsRefresh` to the useCallback deps for exhaustive-deps correctness; matches the pattern at the four existing call sites in this file. New regression coverage in `apps/web/tests/components/ProjectView.deleteConversation.test.tsx`: - triggers onProjectsRefresh after deleting a conversation (verified RED before this fix, GREEN after) - does not trigger onProjectsRefresh when the delete request fails (defensive complement so a future "always refresh" refactor doesn't paper over a real failure with a stale-but-confident UI)	2026-05-11 19:29:09 +08:00
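The refresh-only-on-success rule from the commit above reduces to a small pattern, sketched here outside React. The helper name and the boolean-returning `deleteConversation` callback are assumptions for illustration; the log only states that `onProjectsRefresh()` fires after a successful delete response and not after a failed one.

```typescript
// Hypothetical helper: refresh the cached project list only when the delete succeeded.
async function deleteConversationAndRefresh(
  deleteConversation: () => Promise<boolean>, // assumed to resolve true on success
  onProjectsRefresh: () => void,
): Promise<boolean> {
  const ok = await deleteConversation();
  if (ok) onProjectsRefresh(); // on failure the cached state is still the truth
  return ok;
}
```

Keeping the failure path silent with respect to the refresh mirrors the second regression test listed in the commit, which guards against a future "always refresh" refactor.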
shangxinyu1	10802bb0b0	test: expand nightly UI and desktop regression coverage (#1256 ) * e2e(ui): cover examples preview flows * e2e(ui): cover Codex local CLI fallback UX * test: expand desktop and connector regression coverage * e2e(ui): cover workspace restoration flows * e2e(ui): cover retry recovery workspace flow * test: cover artifact and connector recovery flows * e2e(ui): cover Continue in CLI stale provenance flow * e2e(ui): cover BYOK model fetch caching * test: expand Orbit and desktop connector coverage * e2e(ui): cover workspace quick switcher recovery flows * e2e(ui): cover connector pending authorization recovery * e2e(ui): cover workspace and conversation restoration routes * e2e(ui): cover conversation draft and attachment restoration * e2e(ui): cover conversation history selection recovery * e2e(ui): cover workspace surface conversation selection * test: cover artifact presentation and orbit link behavior * test: cover artifact external link restoration * e2e(ui): cover root-route deep-link restoration * e2e(specs): cover Orbit open-artifact desktop click * e2e(specs): cover desktop artifact open link * test: fix Orbit settings fixture type drift * test: split Playwright critical and extended suites * test: fix ProjectView design template fixtures * ci: split workspace test stages * guard: allow split Playwright suite scripts * test: shrink Playwright critical suite * test: restore omitted Playwright suites	2026-05-11 19:23:13 +08:00
PerishFire	8c0fb8dc01	feat(tools-pr): add maintainer PR-duty workspace (#1259 ) * feat(tools-pr): add maintainer PR-duty workspace Adds `tools/pr` as the maintainer-only control plane for PR-duty work on this repo. Thin `gh` wrapper that encodes repo-specific knowledge: review lanes, forbidden surfaces, lane-specific checklists, validation command derivation from touched packages. Subcommands: - `list` — triage open queue by lane and review-state bucket. - `view <num>` — agent-friendly review brief for a single PR. - `classify [num]` — emit script-level tags for one PR or the whole open queue; full-queue JSON output lands under `.tmp/tools-pr/classify/` with rate-limit telemetry per run. - `assignment` — assigner-perspective view of PR ownership, idle time, and blockers (derived from existing tags; no new judgments). Tag dictionary (13 tags) covers: bot-only-approval, needs-rebase, forbidden-surface, unlabeled, duplicate-title, non-ascii-slug, maintainer-edits-disabled, org-member, unresolved-changes-requested, stale-approval, and three awaiting-* timing tags. Each rule is expressible as one factual sentence over `gh` data + repo paths — see `tools/pr/AGENTS.md` for the full dictionary plus precision rules. Templates in `tools/pr/templates/.md` are aesthetic references for recurring maintainer comments (duplicate-title ask, awaiting-author nudge, agent-review brief shape). `templates/examples/` holds frozen-in-time agent-review snapshots for three PR shapes. Infrastructure: - `gh()` wraps `execFile` with minimum-touch retry (2 attempts at 1s + 2s backoff) on transient 5xx / network errors. Persistent failures still surface — retry is anti-jitter, not an exponential-backoff resilience layer. - Heavy chunks (`reviews`, `comments`, `commits`, assignment timelines) use cursor-paginated `gh api graphql` via `fetchPaginatedPrList` to stay under GitHub's GraphQL server-side timeout. Light chunks stay on `gh pr list --json`. 
- `fetchOrgMembers` cached per process via `gh api orgs/<owner>/members --paginate`. Wiring: - Root `package.json` adds `pnpm tools-pr` to the allowed root entry points. - `scripts/postinstall.mjs` builds `tools/pr` alongside other workspace packages. - `scripts/guard.ts` allowlists `tools/pr/bin/tools-pr.mjs` and `tools/pr/esbuild.config.mjs`, and adds `pr/` to the `tools/` top-level layout allowlist. - Root `AGENTS.md` and `tools/AGENTS.md` document the new command surface, root-command-boundary update, and per-tool ownership. * docs(agents): brief tools-pr in root AGENTS.md, link to tools/pr/AGENTS.md Adds a `PR-duty tooling` section to the root AGENTS.md summarising what `pnpm tools-pr` is, listing the four common subcommands (list / view / classify / assignment), and pointing readers to `tools/pr/AGENTS.md` for the full tag dictionary, operational playbook, templates, and design rules. The section keeps root-level guidance to high-level orientation while details stay local to the tool's own AGENTS.md. * fix(tools-pr): drop overly broad touches-root-package.json forbidden hit `deriveForbidden` was flagging any change to root `package.json` as a forbidden-surface hit, but AGENTS.md §Root command boundary only forbids specific lifecycle aliases (pnpm dev / test / build / daemon / preview / start); tools-control-plane entrypoints like `pnpm tools-pr` are explicitly allowed. Distinguishing "forbidden alias" from "allowed entry" requires reading the diff content, which is `pnpm guard`'s job rather than a path-derived classify tag. Dogfooded on this branch's own PR (#1259), which added the `pnpm tools-pr` script and was incorrectly flagged. Removing the hit aligns the `forbidden-surface` tag with what tools-pr can mechanically detect from file paths alone (apps/nextjs/, packages/shared/).
* fix(tools-pr): paginate commits fetch, recognise ready-to-merge, escape title-index separator Three review follow-ups on #1259, all factual fixes: - `fetchOpenPrCommits` now uses `fetchPaginatedPrList` instead of a one-shot `pullRequests(first: $first)` query. GitHub GraphQL caps connection page size at 100, so the previous implementation would fail at runtime when callers passed `--limit > 100`. The paginated path makes the commits fetch consistent with the other heavy chunks (reviews, comments, assignment timelines) and removes the artificial ceiling entirely. The `limit` parameter is dropped from `fetchOpenPrCommits`; the CLI `--limit` continues to bound the `gh pr list --json` chunks. - `deriveStatus` in `assignment.ts` now reads `facts.reviewDecision` and `facts.mergeStateStatus`. When the PR is `APPROVED` with merge state `CLEAN` or `UNSTABLE` and carries no blockers, status renders as `ready to merge` instead of falling through to `in review`. The assignment view loses its main triage signal without this — a clean human-approved PR rendered identical to a REVIEW_REQUIRED one. - `tags.ts:tagDuplicateTitle` and `tags.ts:buildContext` both constructed the title-index key with a literal NUL byte between author and title, which made the file appear as binary in `git diff` / review tooling. Replaced the literal byte with a Unicode escape sequence in source; the runtime string value is identical, the source stays plain text and round-trips through review tooling cleanly. * fix(tools-pr): raise default --limit to 1000 to cover the live open queue mrcfps flagged that `tools-pr list` (and `classify --all`, `assignment`) defaults to `--limit 100`, which silently drops every PR past the first 100 in the open queue. The repo currently sits at 104 open PRs, so the out-of-the-box run was already omitting four PRs. 
Raise the default to 1000 in `list.ts`, `classify.ts`, and `assignment.ts`, and remove the now-pointless 200 ceiling — `gh pr list --limit N` paginates internally, so a high cap is cheap. Users can still pass `--limit <small>` for a truncated preview. CLI help text on the three subcommands updated to match. * fix(web): pass designTemplates to ProjectView render helper #955 made `designTemplates` a required Prop on ProjectView, but the test helper added in #1244 (`renderProjectView` in `ProjectView.api-empty-response.test.tsx`) was never updated. The two PRs landed on main without conflicting, leaving `apps/web` typecheck red for every PR that rebases past `b5eb8c16`. Pass `designTemplates={[] as SkillSummary[]}` alongside the existing `skills={[] as SkillSummary[]}` so the helper compiles. The component already treats the array shape (empty included) as a no-op fallback in the empty-response paths the test exercises. * fix(tools-pr): correct author signal + merge inline review comments Two correctness gaps in the awaiting-* signal pipeline surfaced during review of the new tools-pr commands: 1. `authorSignalAt` iterated every PR commit unconditionally. On `maintainerCanModify=true` PRs a maintainer's follow-up push would advance the author timestamp, masking a stalled author response. Filter commits to those whose `authorLogin` matches `facts.author`, mirroring the same filter already applied to comments. 2. `fetchOpenPrComments` (and `fetchView`) only fetched `pullRequest.comments` / `gh pr view --json comments`, which is the issue-conversation thread. Inline review-thread replies — where authors and reviewers actually exchange most fix-up replies — live in `reviewThreads.comments` / REST `pulls/{n}/comments`. Missing them let `humanReviewerSignalAt` / `authorSignalAt` and the `view` brief point at the wrong side after someone replied inline. 
Extend the list-mode GraphQL to also sweep `reviewThreads(last: 20).comments(first: 20)`, and add a parallel REST inline-comments fetch in `fetchView` that merges into `GhView.comments`.	2026-05-11 19:17:21 +08:00
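
The author-signal fix in #1259 (count only commits whose `authorLogin` matches the PR author) can be sketched roughly as below. This is a minimal illustration and not the tools-pr source: the `PrCommit` and `PrFacts` shapes, the `committedAt` field, and the `latestAuthorCommitAt` name are assumptions, and the real `authorSignalAt` also folds in comments, which this sketch leaves out.

```typescript
// Hypothetical shapes; the real tools-pr types are not shown in the log.
interface PrCommit {
  authorLogin: string | null;
  committedAt: string; // ISO 8601 timestamp (assumed field name)
}

interface PrFacts {
  author: string;
  commits: PrCommit[];
}

// Latest commit timestamp attributable to the PR author, or null if none.
// Commits pushed by anyone else (for example a maintainer on a
// maintainerCanModify=true PR) are skipped so they cannot mask a stalled author.
function latestAuthorCommitAt(facts: PrFacts): string | null {
  let latest: string | null = null;
  for (const commit of facts.commits) {
    if (commit.authorLogin !== facts.author) continue;
    if (latest === null || Date.parse(commit.committedAt) > Date.parse(latest)) {
      latest = commit.committedAt;
    }
  }
  return latest;
}
```

With one author commit followed by a later maintainer push, the function reports the author commit's timestamp, which is the behaviour the commit message describes.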
Tom Huang	b5eb8c1647	feat: generic skills + split skills/design-templates + finalize-design API (#955 ) * feat: general-purpose skills with @-mention composition and user import Lift skills from "one mode-bound skill per project" to a generic capability the user can compose per turn: - Daemon: scan multiple skill roots (user-skills under runtime data, then the bundled `skills/`); user-imported skills can shadow built-ins by id. - New `POST /api/skills/import` and `DELETE /api/skills/:id` endpoints, with CONFLICT/BAD_REQUEST/NOT_FOUND error codes and built-in delete protection. - ChatRequest gains `skillIds: string[]`; the chat run concatenates each picked skill's body (and merges craftRequires) into the system prompt for that turn only — the project's persistent `skillId` is untouched. - Web composer: `@` popover now lists skills alongside project files; picks render as removable chips above the textarea and ride along with the request as `skillIds`. - Settings → Library: import form (name/description/triggers/body), per-card delete for user skills, "user" origin badge. * chore(web): drop welcome pet teaser + add ds→prompt-template mapping util - SettingsDialog: remove the inline pet adoption teaser from the welcome panel so the first-run modal stays focused on configuration. - New `inferPromptTemplateCategoriesForDs(ds)` helper that maps a design system's authored metadata to prompt-template gallery categories. Imported by the design-system gallery wiring on a sibling branch; no callers in this branch yet. * feat: split skills/design-templates and add finalize-design API Phase 0 of the skills/design-templates refactor (specs/current/ skills-and-design-templates.md): - Move ~104 rendering catalogue entries from skills/ to design-templates/ and keep skills/ for the small set of functional skills that do work on user input (utilities, briefs, packagers). 
- Add design-templates/AGENTS.md and skills/AGENTS.md describing the contract, and a brand-agnostic craft/ surface for opt-in craft rules. - Daemon: add DESIGN_TEMPLATES_DIR / USER_DESIGN_TEMPLATES_DIR roots and an /api/design-templates surface mirroring /api/skills. Asset/example routes still span both registries so existing srcdoc URLs keep resolving across the rename. - Web: split LibrarySection into SkillsSection + DesignSystemsSection, rename the EntryView "Examples" tab to "Templates", and update locales + the New-project picker accordingly. Adds the finalize-design endpoint: - New apps/daemon/src/finalize-design.ts and packages/contracts/src/api/ finalize.ts — one-shot synthesis of a project's transcript + active design system + current artifact into <projectDir>/DESIGN.md via the Anthropic Messages API. Per-project .finalize.lock mirrors the transcript-export hygiene from PR #493; provider credentials are not persisted by the daemon. Other supporting changes: - README + AGENTS.md updates to document the new directory split and craft/ surface, plus i18n strings across 13 locales. - Test refactors and new coverage (finalize-design, runs, sidecar server, plus refreshed daemon integration tests). - .gitignore: scope the .exe ignore to /OpenDesign.exe so legitimate vendor binaries are no longer hidden. fix(merge): move clinical-case-report to design-templates/ Origin/main added the clinical-case-report skill under skills/ before the skills/design-templates split landed. Its od.mode is prototype, so per specs/current/skills-and-design-templates.md it is a design template and belongs alongside the other rendering catalogue entries — not under the slimmed-down functional skills/ root. Moving it keeps the EntryView Templates tab consistent with origin/main's intent. 
* feat(skills): curated design/creative catalogue + collapsible Settings rows Seed ~100 curated design/creative skill stubs under skills/ sourced from awesome-claude-skills (ComposioHQ) and awesome-agent-skills (VoltAgent). Each stub carries an od.category tag so the new filter pill row in Settings -> Skills can group them. The seed script (scripts/seed-curated-design-skills.ts, pnpm seed:curated-design-skills) is idempotent: it only creates folders that don't already exist, so hand-edited stubs are never overwritten. - Daemon: parse and surface od.category on SkillInfo with a strict slug normaliser; mirror the field on SkillSummary in @open-design/contracts. Category is purely a UI hint — system-prompt composition is unchanged. - Web: rewrite SkillsSection from a left-list / right-detail grid into a vertical stack of collapsible rows mirroring the External MCP panel (header always visible with name + mode/source/category pills + per-row enable toggle; SKILL.md preview, file tree and inline edit form expand on demand). Add a Category filter row above the list. Reorder Settings nav so Skills + External MCP sit above the Composio/MCP cluster. Update composer placeholder/hint across 17 locales to advertise '@ files or skills · / for commands'. - Docs: extend skills/AGENTS.md with the curated catalogue rules (idempotency, category vocabulary, no upstream vendoring). 
Co-authored-by: Cursor <cursoragent@cursor.com> * test(skills): teach localized-content + system-prompt tests about the skills/design-templates split mrcfps blocking review on PR #955: the skills/design-templates split (`b5993385`) moved ~110 SKILL.md entries out of `skills/` and into `design-templates/`, but two repo-level tests still hard-coded the single-root layout, so CI gates went red on the merged branch: - `e2e/tests/localized-content.test.ts` only scanned `<repo>/skills` while the locale `skillCopy` map keeps id-keyed entries spanning both roots (ExamplesTab/Templates uses one lookup regardless of origin). Teach the helper to read both `skills/` and `design-templates/`, deduplicating ids so the union matches the localized claim. - `apps/daemon/tests/prompts/system.test.ts` read `skills/live-artifact/SKILL.md`, which now lives under `design-templates/live-artifact/`. Update the absolute path so composeSystemPrompt's coverage of the live-artifact preamble is exercised again. Also enroll the curated design/creative catalogue (PR #955, ~91 stubs sourced from awesome-claude-skills / awesome-agent-skills) in the DE / FR / RU `_SKILL_IDS_WITH_EN_FALLBACK` lists. The stubs are English-only by design (frontmatter advertises an upstream URL); the fallback list is exactly the place to acknowledge "we know this id exists, English copy is fine here" so the localized-content coverage gate passes without forcing a translation task per locale. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(skills): always quote frontmatter name so importUserSkill round-trips numeric / boolean ids mrcfps PR #955 review: `buildSkillMarkdown` emitted `name: ${escapeYamlString(name)}` without quotes, so YAML coerced names like `123`, `true`, `false`, or `null` into non-string scalars on re-parse. 
listSkills() then read `data.name` as a number/boolean and the import flow's follow-up `findSkillById(skills, result.id)` missed it, falling into `/api/skills/import`'s "imported skill could not be re-read" 500 path for those ids. Switch the emitter to a quoted scalar (`name: "..."`) — the double-escape already in `escapeYamlString` makes the quoted form safe — and add a round-trip test covering `123`, `true`, `false`, `null`, and `0` to lock in the contract. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): drop staged-skill chips when the matching @<id> token leaves the draft mrcfps PR #955 review: `submit()` always forwarded every id in `stagedSkills`, but that state was only mutated on picker click and chip removal. Hand-deleting an `@<id>` token from the textarea left the chip staged, so the request still carried `skillIds: [<id>]` and the daemon composed a skill the prompt no longer referenced. Sync the chips with the draft inside `handleChange()` by pruning `stagedSkills` whenever the new value no longer contains the `@<id>` token (using the same whitespace boundary as `removeStagedSkill`'s strip regex). Comment explains why this prune does not run for `staged` file attachments — users frequently add files via the upload button without leaving an `@<path>` token, so a symmetric prune there would erase legitimate uploads. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(daemon): stage @-composed skills' side files alongside the active skill codex PR #955 review: composing a per-turn `@`-picked skill into the system prompt appended its body (with the `withSkillRootPreamble` guidance pointing at relative paths under `<cwd>/.od-skills/<folder>/`) but never staged the actual folder. 
`startChatRun` only copied `activeSkillDir`, so when the project's primary skill was different (or absent) the composed skill's references/, examples/, and scripts/ files lived only at their absolute repo path — agents that honour the cwd-relative form (or that don't get `--add-dir`, e.g. Codex with allowlisted gpt-image projects) couldn't reach them. Thread the composed skills' dirs out of `composeDaemonSystemPrompt` as `extraSkillDirs` and stage each one through the same `stageActiveSkill` API used for the primary skill. Dedupe by folder basename so a project whose primary skill is also `@`-composed isn't copied twice. Each preamble already advertises its own folder, so the prompt and the staged tree stay aligned without further changes. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): respect the Library disable toggle in the project @-mention picker codex PR #955 review: only `EntryView` received `enabledSkills` (filtered against `config.disabledSkills`); active projects still got `skills={skills}` raw, so a skill the user disabled in Settings kept appearing in the project's `@`-mention popover and could ride along to the daemon via `skillIds`. That broke the Library toggle for any project opened on the post-split branch. Compute a functional-skills-only enabled subset (`enabledFunctionalSkills`) and pass it into `<ProjectView>` instead. Templates stay separate — design-templates are filtered through their own `enabledDesignTemplates` memo for the Templates gallery — so ProjectView's chat composer still only sees skills, never templates, matching the pre-split prop surface. Co-authored-by: Cursor <cursoragent@cursor.com> * test(e2e): mock /api/design-templates for example-use-prompt flow The Templates tab in EntryView fetches from /api/design-templates after the skills/design-templates split (specs/current/skills-and-design-templates.md). 
The example-use-prompt Playwright scenario only mocked /api/skills, so the gallery card never appeared and the test timed out waiting on example-card-warm-utility-example. Serve the same fixture summary on both endpoints so the templates gallery renders the card the test clicks. Co-authored-by: Cursor <cursoragent@cursor.com> * test(tools-pack): create design-templates fixture for resources test The packaging resources copy now bundles the new design-templates tree alongside skills (see resources.ts BUNDLED_RESOURCE_TREES). The copyBundledResourceTrees fixture only created skills, design-systems, craft, etc., so the recursive copy crashed with ENOENT on design-templates before it could check the prompt-templates assertion. Add the missing fixture directory so the test exercises the same set of resource trees the packaged build does. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(skills): clone built-in side files into the shadow on first edit mrcfps PR #955 review: editing a built-in skill wrote a USER_SKILLS_DIR shadow folder that contained only a new SKILL.md. The next listSkills() pass surfaced the shadow as the active dir, but every side-file resolver (/api/skills/:id/files, /example, /assets/, the system-prompt preamble, and the per-turn cwd staging) reads through skill.dir. With nothing but SKILL.md in the shadow, the bundled assets/, references/, scripts/, and examples/ disappeared the moment the user hit save — a built-in like last30days or live-artifact would break immediately after edit instead of just having its body overridden. Teach updateUserSkill() to take a `sourceDir` and clone every entry except SKILL.md / dotfiles into the shadow on the very first edit. The shadow stays self-contained, so all the resolvers keep working without fallback bookkeeping. Subsequent edits detect the existing shadow and skip the clone, so user tweaks under the side tree survive a re-save. 
Wire `sourceDir: skill.dir` from server.ts's PUT /api/skills/:id handler and add two regression tests: - 'clones built-in side files into the shadow on the first edit' walks the file tree after save and asserts assets/template.html, references/ notes.md, and scripts/helper.sh all round-trip from the built-in. - 'preserves user-edited side files on subsequent edits' edits the staged assets/template.html, re-saves, and confirms the user content is still there. Co-authored-by: Cursor <cursoragent@cursor.com> test(e2e): rename home tab from Examples to Templates The Examples tab was renamed to Templates in EntryView (b5993385's skills/design-templates split — entry.tabExamples became entry.tabTemplates and the tab value moved from 'examples' to 'templates'), but entry-chrome-flows still asserted the old label and testId. Update both. * fix(skills+web): preserve template body in API mode and dir-based skill delete Two follow-ups from PR #955 review: 1. ProjectView only received `enabledFunctionalSkills`, but `composedSystemPrompt()` still resolved `project.skillId` through that prop and `fetchSkill()`. Projects created from the new `/api/design-templates` surface keep a template id in `project.skillId`, so opening one in API mode dropped the template body from the system prompt and the upstream request ran without the project's primary template instructions. Now ProjectView takes a separate `designTemplates` prop (the unfiltered template list, so a later-disabled template still loads for projects already created from it) and `composedSystemPrompt()` plus the metadata / `isDeck` lookups fall back to that list, with `fetchDesignTemplate()` as the body-fetch fallback to `fetchSkill()`. The chat composer's `@`-picker keeps receiving only the enabled functional skills. 2. `DELETE /api/skills/:id` used `deleteUserSkill(USER_SKILLS_DIR, skill.id)` which re-slugified the frontmatter id and removed `<userSkillsDir>/<slug>/`. 
That matched the import shape but missed the install shape — `installFromTarget` writes the folder at `sanitizeRepoName(url)` (GitHub) or `path.basename(realpath)` (local symlink), neither of which is guaranteed to equal the slugified frontmatter `name`. A duplicate `app.delete('/api/skills/:id', ...)` handler at the install routes never fired because Express resolved the earlier registration first, leaving the install/uninstall path without working teardown. The handler now removes `skill.dir` (the absolute path listSkills already discovered) under a USER_SKILLS_DIR safety check, using `lstat` + `unlinkSync` so symlinked local installs unlink cleanly without recursing into the user's source tree. The dead duplicate handler is removed; `deleteUserSkill` is dropped from the server.ts import set (still exported and unit-tested in skills.ts). Regression coverage in `apps/daemon/tests/skills-delete-route.test.ts` pins both shapes plus the symlink-preserves-source case. * test(daemon): point hyperframes system-prompt test at design-templates The merge with main brought in a hyperframes system-prompt test that reads `skills/hyperframes/SKILL.md`, but this branch's split moved `hyperframes` into `design-templates/` (same migration as `live-artifact` already handled above in this file). CI was failing with ENOENT on the old path. --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 17:48:34 +08:00
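
The staged-skill chip fix in #955 (drop a chip once its `@<id>` token leaves the draft) might look something like the sketch below. The function names and the exact boundary rule are assumptions: the log only says the prune uses the same whitespace boundary as `removeStagedSkill`'s strip regex, and that regex is not shown, so a token here counts as present only when it is delimited by whitespace or the string edges.

```typescript
// Escape a string for literal use inside a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when `draft` contains `@<id>` delimited by whitespace or string edges.
// The boundary rule is an assumption for illustration.
function draftMentions(draft: string, id: string): boolean {
  const pattern = new RegExp(`(^|\\s)@${escapeRegExp(id)}(?=\\s|$)`);
  return pattern.test(draft);
}

// Keep only the staged skill ids whose token is still present in the draft.
function pruneStagedSkills(staged: readonly string[], draft: string): string[] {
  return staged.filter((id) => draftMentions(draft, id));
}
```

Calling `pruneStagedSkills` from the textarea change handler keeps the chips and the eventual `skillIds` payload aligned with what the prompt actually references. As the commit notes, the same prune would be wrong for file attachments, which are often added without an `@<path>` token.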
PerishFire	f2db5a749c	chore: enforce PR→issue linking discipline (#1263) PRs that omit Fixes #N break the release-time reverse lookup (issue → closing PR → merge sha → first containing tag), since the auto-link only fires on the explicit closing keywords. We've been doing this by hand on recent fixes; codify it so future PRs don't drift. - Add .github/pull_request_template.md with a Fixes # placeholder so the link surface is in front of the author by default. - Add a corresponding bullet to the Bug follow-up workflow in the root AGENTS.md so the discipline lives next to the methodology that produces issue-linked work.	2026-05-11 17:24:24 +08:00
PerishFire	a797e079b1	fix(desktop): exit fullscreen before hiding window on macOS close (#1249) * fix(desktop): exit fullscreen before hiding window on macOS close (#1215) When a preview is in 演示 → 全屏 (Present → Fullscreen) mode, the macOS close handler called window.hide() directly, leaving the OS fullscreen Space orphaned as a black screen: the window vanished but the Space stayed up. Extract hideWindowExitingFullscreen as the named invariant ("hide, but first leave fullscreen so the OS Space tears down with the window") and route the darwin close handler through it. The hide is deferred until 'leave-full-screen' fires so we don't race the OS Space teardown. Bootstraps Vitest on apps/desktop with a single test under tests/main/hide-window-exiting-fullscreen.test.ts that exercises the helper through a structural mock; the bug shape is pure logic, no real Electron window required. Spec was red against a hide-only helper and green after the leave-full-screen sequencing. * docs(agents): codify bug follow-up workflow Distill the spec-first / cheapest-layer / scope-discipline / invariant-shaped-fix / baseline-diff playbook used recently on #135 and #1215 into a top-level subsection of root AGENTS.md, framed as a default action shape with explicit room for case-by-case judgment rather than a hard rule. Includes a single pointer back to the worked example spec. * docs(agents): require staged human verification for visible bugs Add the human-verification gate as a sixth bullet in the Bug follow-up workflow. UI / platform-native / animation symptoms can pass green specs and still ship the visible regression, as proven by #1215, where the desktop unit test green-lighted the helper logic but only a side-by-side buggy-vs-fix run on a real macOS Space proved the black screen actually went away. Reinforces the production-API-only seed constraint while we're there: source-level backdoors prove a fake flow, not the real one, so they invalidate the verification. 
* fix(desktop): defer hide across the fullscreen-enter transition (#1215) mrcfps observed on PR #1249 that the close handler only catches windows already in fullscreen — Electron's enter-full-screen event is async on macOS, so isFullScreen() can still read false during the OS Space transition triggered by requestFullscreen(). A close in that window took the plain hide() path and stranded the same black Space the fix was meant to eliminate. Track in-flight fullscreen entry from webContents.enter-html-full-screen (set) and BrowserWindow.{enter,leave}-full-screen (clear), and surface it through WindowFullscreenSurface.isEnteringFullscreen. The helper now parks on enter-full-screen until the OS confirms the Space, then runs the existing exit-then-hide path. Adds a regression test ("waits out a fullscreen-enter transition before exiting and hiding") that goes red against the previous helper.	2026-05-11 17:04:42 +08:00
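
The sequencing described in #1249 (wait out an in-flight fullscreen entry, leave fullscreen, then hide) can be sketched against a structural surface as below. This is an illustration under assumptions, not the apps/desktop source: the commit names `hideWindowExitingFullscreen` and `WindowFullscreenSurface.isEnteringFullscreen`, but the member list chosen here and the `FakeWindow` test double are hypothetical. The `isFullScreen`, `setFullScreen`, `hide`, `once` members and the two event names mirror Electron's `BrowserWindow` API.

```typescript
type FullscreenEvent = "enter-full-screen" | "leave-full-screen";

// Assumed structural surface; only the members this sketch needs.
interface WindowFullscreenSurface {
  isFullScreen(): boolean;
  isEnteringFullscreen(): boolean;
  setFullScreen(flag: boolean): void;
  hide(): void;
  once(event: FullscreenEvent, listener: () => void): void;
}

function hideWindowExitingFullscreen(win: WindowFullscreenSurface): void {
  // Leave fullscreen first and hide only once the OS confirms the exit.
  const exitThenHide = (): void => {
    win.once("leave-full-screen", () => win.hide());
    win.setFullScreen(false);
  };
  if (win.isFullScreen()) {
    exitThenHide();
  } else if (win.isEnteringFullscreen()) {
    // Entry is still in flight: park until the Space exists, then exit it.
    win.once("enter-full-screen", exitThenHide);
  } else {
    win.hide();
  }
}

// Hypothetical test double that records calls and lets a test fire events.
class FakeWindow implements WindowFullscreenSurface {
  log: string[] = [];
  private fullscreen: boolean;
  private entering: boolean;
  private listeners = new Map<FullscreenEvent, Array<() => void>>();
  constructor(fullscreen: boolean, entering: boolean) {
    this.fullscreen = fullscreen;
    this.entering = entering;
  }
  isFullScreen(): boolean { return this.fullscreen; }
  isEnteringFullscreen(): boolean { return this.entering; }
  setFullScreen(flag: boolean): void { this.log.push(`setFullScreen(${flag})`); }
  hide(): void { this.log.push("hide"); }
  once(event: FullscreenEvent, listener: () => void): void {
    const queue = this.listeners.get(event) ?? [];
    queue.push(listener);
    this.listeners.set(event, queue);
  }
  // Simulate the OS confirming a transition and run one-shot listeners.
  emit(event: FullscreenEvent): void {
    this.fullscreen = event === "enter-full-screen";
    this.entering = false;
    const queue = this.listeners.get(event) ?? [];
    this.listeners.delete(event);
    for (const listener of queue) listener();
  }
}
```

A test in the spirit of the regression named in the commit would construct `new FakeWindow(false, true)`, call the helper, and check that nothing happens until `enter-full-screen` and then `leave-full-screen` are emitted in turn.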
Caprika	f7f2661bda	[codex] Handle empty API responses as no output (#1244) * Handle empty API responses as no output * Fix empty API response comment cleanup * Stabilize API empty response detection	2026-05-11 16:57:02 +08:00
PerishFire	421ddf553c	fix(pack/win): close running app before silent reinstall (#1238)	2026-05-11 16:35:07 +08:00
nettee	e859c31574	fix(web): complete finished tool calls missing results (#1240)	2026-05-11 15:54:11 +08:00
Tom Huang	e254d1280b	feat(memory): auto-memory store with chat-protocol-aware extraction (#999 ) * feat(memory): auto-memory store with chat-protocol-aware extraction Markdown memory store at <dataDir>/memory/ with two extractors — heuristic regex for explicit "remember:" / "我是 X" markers, and a small-model LLM pass after each turn — folded into the system prompt so cross-chat preferences, role, and ongoing-work context survive restarts. Settings UI: - Memory tab lists entries, exposes a hand-edited MEMORY.md index, and shows an extraction history with per-attempt phase/skip/failure rows. - Memory model picker is inline next to the chat model picker (CLI and BYOK) so the choice "which fast model mines facts each turn?" sits next to the chat-model decision instead of a separate panel. The picker reuses the same SUGGESTED_MODELS table and "Custom..." pattern the chat picker uses. LLM extractor supports all four protocols (anthropic / openai / azure / google); pickProvider takes the chat agent id from the chat handler and constrains its auto-pick to the chat's protocol family — Claude Code chats no longer surprise users by silently extracting on whatever OpenAI key happens to be in media-config. When no matching key is configured the attempt records as 'skipped: no-provider' instead of quietly switching vendors. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): keep hint outside <label> and disambiguate Model selectors The inline Memory model picker wrapped its hint paragraph inside the <label>, which made the hint's "API key" / "model" wording bleed into the <select>'s accessible name and broke Playwright's getByLabel('API key') / getByLabel('Model') strict-mode matching in the existing settings-api-protocol e2e suite. - Move the hint <p> out of the <label> in MemoryModelInline so the select's accessible name is just "Memory model". 
- Switch the chat-Model selectors in settings-api-protocol.test.ts from getByLabel('Model') to getByRole('combobox', { name: 'Model', exact: true }) so they no longer collide with the new "Memory model" select that sits next to the chat Model picker. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): address review changes — BYOK wiring, MEMORY.md index, /v1, label wrapper Addresses the four blocking review threads on PR #999. 1. MemoryModelInline accessibility (mrcfps) The inline picker still wrapped its select + custom input + flash + hint inside a single <label>, which made the select's accessible name absorb every text descendant — including the "API key" / "model" hint copy. The previous fix moved only the hint outside; the reviewer asked for a non-label wrapper. Switch to <div className="field"> and associate just the short title with the controls via `aria-labelledby` / `aria-label`. The select's accessible name is now exactly "Memory model" so `getByLabel` strict-mode locators on the surrounding chat form stop cross-matching the memory copy. 2. Respect the hand-edited MEMORY.md index (mrcfps + codex) `composeMemoryBody()` was reading every .md file in the memory dir, ignoring the index. Removing a `- [Name](id.md)` line had no effect on future prompts. Parse the index's `INDEX_LINK_RE` bullets and filter `listMemoryEntries()` to the linked id set, so the editor's "delete this line to disable injection" promise actually holds. 3. Versioned OpenAI-compatible base URLs (codex) `callOpenAI` and `callAnthropic` hard-coded `/v1` onto `provider.baseUrl`, breaking custom endpoints whose saved URL already includes `/v1` (`/v1/v1/chat/completions`). Apply the same conditional `appendVersionedApiPath` helper the chat proxy and connection-test routes already use. 4. Wire memory into BYOK / API-mode chats (mrcfps + codex) The previous PR's daemon-only memory hook never fired for BYOK, leaving the Memory tab + model picker as a no-op for that mode. 
Add the missing surface and wire it through ProjectView: - contracts: extend `composeSystemPrompt` with `memoryBody`, mirroring the daemon's local composer; add `MemorySystemPromptResponse` and the `attemptedLLM` flag on `ExtractMemoryResponse`. - daemon: expose `GET /api/memory/system-prompt` (returns the composed body) and turn `POST /api/memory/extract` into a two-phase endpoint — heuristic-only when only userMessage is supplied (pre-turn), LLM-only when assistantMessage is also supplied (post-turn), so the extraction-history doesn't double up. - web: ProjectView's BYOK branch now fetches the memory body before composing the system prompt, runs the heuristic extractor before the run (so "remember:" markers in this turn reach this turn's prompt), accumulates assistant text during streaming, and queues the LLM extractor on `onDone` — fire-and- forget so it never blocks the chat round-trip. Co-authored-by: Cursor <cursoragent@cursor.com> fix(memory): re-sync BYOK memory override when chat config drifts The inline memory-model picker captured `apiProtocol` / `chatApiKey` / `chatBaseUrl` / `chatApiVersion` into the saved override only at the moment the user clicked a model. If they later swapped the BYOK protocol tab, rotated the API key, or edited the base URL in the same settings flow, the daemon's background extractor kept calling the old vendor / credential — directly contradicting the picker's "borrows the surrounding chat picker's protocol, key, base URL, and api-version automatically" promise. Add a debounced effect that compares the persisted (masked) shape against the live chat props and re-PATCHes /api/memory/config when they drift. The masked config exposes `apiKeyTail` (last 4 chars), so key rotation is detectable without ever round-tripping the secret back to the browser. 
The 300 ms debounce coalesces the keystroke- granularity prop updates the parent settings dialog streams during its autosave loop, so a user editing the base URL doesn't trigger one PATCH per character. Background re-syncs are silent — the "Saved!" flash only fires for explicit user clicks, so the picker doesn't feel like it's fighting them as they edit unrelated chat fields. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): thread BYOK chat config through /api/memory/extract default path Leaving the BYOK memory picker on "Same as chat" still broke the default LLM extraction path: `MemoryModelInline` clears the override for that option, both `/api/memory/extract` calls in `ProjectView` only sent the messages, and the daemon never persists BYOK creds, so `extractWithLLM(..., { chatAgentId: null })` always reached `pickProvider()` with no chat context and fell through to env / media-config — the wrong vendor for a BYOK chat that works for inference. Thread the live BYOK chat config through the extract endpoint as a per-call snapshot: - contracts: extend `ExtractMemoryRequest` with an optional `chatProvider` (provider/apiKey/baseUrl/apiVersion/model) and add `'chat-byok'` to the credentialSource enum. - daemon: parse + validate `chatProvider` on `/api/memory/extract` (provider must be one of the five known shapes) and forward to `extractWithLLM` as a new option. `pickProvider()` gets a new path 2 that uses the snapshot directly with the per-protocol fast-model default — so a memory pass on `gpt-4o` / `claude-sonnet-4-5` silently turns into a cheap `gpt-4o-mini` / `claude-haiku-4-5` call instead of paying chat-tier rates for sediment work. Override and CLI-agent-constrained paths still win when they apply. - web: `ProjectView` snapshots `apiProtocol` / `apiKey` / `baseUrl` / `apiVersion` from the live `AppConfig` on each BYOK extract call (both pre-turn heuristic-only and post-turn LLM phases). 
The picker's existing drift-resync effect already covers explicit overrides; this snapshot covers the implicit "Same as chat" default that the override flow can't reach. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(memory): treat empty apiKey on PATCH as a real clear MemoryModelInline silently re-PATCHes /api/memory/config whenever the surrounding BYOK chat creds drift. The previous reuse branch lumped `apiKey === ''` together with `apiKey === undefined`, so clearing the chat API key from the picker quietly preserved the old daemon-side secret and kept calling the provider on a stale credential. Distinguish four states for the apiKey field: - absent -> preserve stored secret (form re-save without re-typing) - '' -> clear stored secret (user removed it from the picker) - 'sk-...' -> replace - new provider -> ignore stored secret entirely Add tests/memory-config-route.test.ts covering all four cases. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 15:45:42 +08:00
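
The four `apiKey` states listed in the last fix of #999 can be written as one small resolver. The sketch below is only an illustration of that decision table: the `StoredMemoryConfig` and `MemoryConfigPatch` shapes and the `resolveApiKey` name are assumptions, and the real `/api/memory/config` handler is not shown in the log.

```typescript
// Hypothetical shapes for the persisted config and an incoming PATCH body.
interface StoredMemoryConfig {
  provider: string;
  apiKey?: string;
}

interface MemoryConfigPatch {
  provider: string;
  apiKey?: string; // absent, "" (clear), or a new secret
}

// Returns the secret to persist after the PATCH, or undefined for "none".
function resolveApiKey(
  stored: StoredMemoryConfig | null,
  patch: MemoryConfigPatch,
): string | undefined {
  if (patch.apiKey === undefined) {
    // Absent: preserve the stored secret, but only for the same provider.
    // A new provider ignores the stored secret entirely.
    if (stored !== null && stored.provider === patch.provider) {
      return stored.apiKey;
    }
    return undefined;
  }
  // Empty string is an explicit clear, not a "keep what you have".
  if (patch.apiKey === "") return undefined;
  // Any other value replaces the stored secret.
  return patch.apiKey;
}
```

Treating `""` and `undefined` as distinct inputs is the point of the fix: a form re-save that omits the field keeps the secret, while a user who empties the field actually removes it.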
Tom Huang	e11e86d468	feat(hyperframes): land HTML-in-Canvas across web + skills (#866 ) * feat(hyperframes): land HTML-in-Canvas across web + skills Ships HTML-in-Canvas as a first-class HyperFrames video path: - 7 new video prompt templates (liquid glass, iPhone+MacBook, portal, shatter, magnetic, liquid background, text-cursor reveal). - skills/hyperframes/references/html-in-canvas.md, surfaced via SKILL.md description+triggers and the system-prompt pre-flight references list. - ChatPane starter prompts now branch by project kind and video model, so the hyperframes-html surface shows HTML-in-canvas-shaped prompts instead of the generic prototype trio. - NewProjectPanel propagates a picked template's model+aspect onto the project, and defaults videoModel to hyperframes-html when the hyperframes skill resolves for the video tab. Polish bundled in the same branch: - DesignFilesPanel empty state becomes a centered pill with a "New sketch" CTA; designFiles.empty copy simplified across 19 locales. - Topbar project title + meta render on one baseline row separated by a middot. - scripts/seed-test-projects.ts hardens daemon URL discovery against pnpm engine warnings on stdout. * fix(new-project): preserve explicit video model choice across tab revisits Latch a videoModelTouched guard once the user picks a model via the dropdown or via a template that declares one, so the hyperframes-html auto-default no longer silently overwrites the override when the Video tab is re-entered. Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(i18n): register hyperframes html-in-canvas templates, category, and tags Adds the seven new prompt-template ids, the "VFX / HTML-in-Canvas" category, and the new tag set to the de/ru/fr i18n bundles so the e2e localized-content coverage test passes. 
Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code) * fix(daemon): inject html-in-canvas preflight for hyperframes runs The contracts-side derivePreflight() learned about references/html-in-canvas.md when this PR landed, but the daemon copy at apps/daemon/src/prompts/system.ts kept the older five-ref allowlist. server.ts:4138 wires composeSystemPrompt from the daemon copy into live chat runs, so the main HyperFrames flow this PR is meant to improve still wasn't auto-injecting the preflight directive in production. Mirror the html-in-canvas case into the daemon composer and lock it behind a daemon-side test so the two copies cannot drift again on this reference. The broader live-artifact preflight gap (artifact-schema / connector-policy / refresh-contract) is pre-existing drift and is intentionally out of scope here. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(web): restyle designs empty state as centered card on grid backdrop Swap the horizontal pill for a stacked card and add a faint grid backdrop so the empty designs surface reads as an intentional canvas rather than a gap. Title now wraps instead of truncating; container is taller. * fix(new-project): pin skillId to hyperframes when videoModel is hyperframes-html When the Video tab resolves its skill it used to fall back to `list[0]?.id` if no skill declared `default_for: video`. That list is built from an unsorted `readdir()` in apps/daemon/src/skills.ts, so a freshly mounted project could land on `video-shortform` even when the user had explicitly chosen the HyperFrames-HTML model (or one of the new `hyperframes-html-in-canvas-*` templates). The agent then ran without the hyperframes SKILL body or its `references/html-in-canvas.md` preflight, the exact regression PR #866 was meant to land. `skillIdForTab` now pins to `hyperframes` whenever the current video model is `hyperframes-html`, regardless of discovery order. 
Added a unit test that mounts both `video-shortform` and `hyperframes` (with hyperframes last, simulating the bad readdir order) and asserts the create payload routes through `hyperframes`. --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 15:45:12 +08:00
PerishFire	31e57fd773	fix(daemon): persist runStatus/endedAt on chat run termination (#1230 ) * fix(daemon): persist runStatus/endedAt on chat run termination (#135) POST /api/runs created the run but never reconciled the messages row on terminal status. If the web failed to persist the cancel (refresh, dropped PUT), the row stayed at run_status='running' / ended_at=NULL, and on reload the elapsed timer kept climbing because the renderer fell back to now - startedAt. Mirror routine/orbit reconciliation: attach a wait-completion handler that updates run_status and ended_at, guarded by COALESCE and a run_status IN ('queued','running') filter so concurrent web persists are not clobbered. Adds cancelRun helper and two regression specs under e2e/tests/dialog/. * fix(daemon): annotate reconcile callback params for chat-routes The chat run reconciliation block landed in chat-routes.ts after the recent server-route split (#1043), where stricter type checking surfaces implicit `any` parameters. Annotate the wait/then callback as `{ status: string }` and the catch callback as `unknown`. * refactor(daemon): extract reconcileAssistantMessageOnRunEnd helper The inline if/wait/then/catch block in POST /api/runs read as a bolt-on patch. Lift it to a named file-scope helper so the route handler stays intent-level (start the run, arrange follow-up reconciliation) and the guard for missing assistantMessageId is an internal detail. The helper's docblock describes the invariant ("messages row reflects the run's terminal state even without web persist"); commit history keeps the issue context. * test(e2e): wait for any terminal status in stop-reconcile spec The earlier .catch fallback chained two waitForRunStatus calls (canceled then succeeded). waitForRunStatus throws on the first non-expected terminal, so a canceled run that resolves to failed (e.g. agent exits non-zero on SIGTERM) would still abort the test before reaching the messages-row assertion. 
Add waitForRunTerminal to e2e/lib/vitest/runs.ts: polls until any terminal status without throwing on mismatch, since this spec's claim is about the resulting messages row, not which terminal the run took. Addresses Codex inline review on PR #1230.	2026-05-11 15:37:52 +08:00
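The reconciliation guard described above (update only rows still in `queued` / `running`, and keep an already-recorded `ended_at`) can be modelled in memory as follows. This is a sketch of the semantics only: the SQL in the comment is an assumed shape, and `MessageRow` and `reconcileOnRunEnd` are hypothetical names rather than the daemon's actual helper.

```typescript
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "canceled";

// Hypothetical in-memory stand-in for a messages row.
interface MessageRow {
  runStatus: RunStatus;
  endedAt: number | null;
}

// Assumed SQL equivalent:
//   UPDATE messages
//      SET run_status = ?, ended_at = COALESCE(ended_at, ?)
//    WHERE id = ? AND run_status IN ('queued', 'running')
function reconcileOnRunEnd(
  row: MessageRow,
  terminal: RunStatus,
  now: number,
): MessageRow {
  const stillOpen = row.runStatus === "queued" || row.runStatus === "running";
  if (!stillOpen) {
    // The web client already persisted a terminal state; do not clobber it.
    return row;
  }
  // COALESCE semantics: keep an existing endedAt, otherwise stamp it now.
  return { runStatus: terminal, endedAt: row.endedAt ?? now };
}
```

Because the filter skips rows that are already terminal, the daemon-side handler and a concurrent web persist can run in either order without overwriting each other.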
nettee	ab922327f4	refactor(daemon): split agent runtime definitions (#1063 )	2026-05-11 15:01:55 +08:00
nettee	b1d440d2bd	refactor(daemon): split route registration (#1043 ) * spec * refactor(daemon): split server route registrars * refactor(daemon): group route registrar dependencies * refactor(daemon): move remaining domain routes out of server * update doc * revert spec * fix daemon route context contract Generated-By: looper 0.5.6 (runner=fixer, agent=opencode) * fix media task persistence Generated-By: looper 0.5.6 (runner=fixer, agent=opencode) * fix: restore daemon route registrations * fix: restore static resource mutation origin checks	2026-05-11 15:00:23 +08:00
PerishFire	976edaf38e	test: harden e2e smoke and release reports (#1140 ) * test: harden e2e inspect specs * test: wire e2e release reports * chore: bump packaged beta base to 0.6.1 * test: run release smoke vitest directly * test: add suite-owned tools-dev lifecycle * ci: harden stable release packaging * fix(release,e2e): gate stable signing on verify and harden suite cleanup - restore `needs: [metadata, verify]` on the stable release `build_mac`, `build_mac_intel`, `build_win`, and `build_linux` jobs so Apple signing/notarization and Windows release builds cannot run before pnpm guard, typecheck, and layout checks complete on the metadata commit. - in `runToolsDevSuite`, drop the `started` flag and always attempt `stopToolsDevWeb` in `finally`; record stop errors in diagnostics, and when the test body succeeded, escalate the stop failure to the suite result and rethrow — so orphan daemon/web processes from an interrupted `startToolsDevWeb` or a broken shutdown can no longer pass silently. Addresses PR #1140 review feedback from lefarcen and mrcfps.	2026-05-11 13:11:16 +08:00
Sid	1dc0224599	fix(desktop): enforce minimum window size on main client (#1189 ) (#1203 ) The main BrowserWindow was created with only `width: 1280, height: 900` and no `minWidth` / `minHeight`, so Electron honored arbitrary user drags. Past roughly 900×600 the project page's left/right split (chat composer + designs panel + preview pane) overlaps and the top navigation clips, which is the broken first impression reported in #1189. Pin `minWidth: 900, minHeight: 600` on the main window — preserves the usable layout floor while still fitting common 13" small-screen laptops. The ephemeral print sub-window (`show: false`, closed on print completion) is unchanged: it isn't user-resizable so a min-size floor has no observable effect there.	2026-05-11 12:33:47 +08:00
shangxinyu1	b19aa6c907	Improve Codex CLI path fallback UX (#1205 ) * Improve Codex CLI path fallback UX (#1193) * Handle ENOENT Codex shim fallback	2026-05-11 12:00:47 +08:00
Botshelo Brandon Tidimalo	979733d39b	feat(web): add Cmd+, shortcut to open settings with platform shortcut badge (#1173 ) Register a capture-phase Cmd+, (mac) / Ctrl+, (win/linux) listener in App.tsx that opens Settings, and show a shortcut badge on the Settings menu item in both AvatarMenu and EntryView. Extract the duplicated isMac platform check into a shared isMacPlatform() utility in utils/platform.ts, replacing inline copies in FileWorkspace and ProjectView as well.	2026-05-11 11:43:57 +08:00
Nicholas-Xiong	2838a28585	fix: set writable OD_DATA_DIR default for nix run (#1159 ) Fixes #1157 When running via 'nix run github:nexu-io/open-design', the daemon attempted to create runtime state under the Nix store package path: /nix/store/.../lib/open-design/.od/projects The Nix store is read-only at runtime, causing startup to fail with ENOENT when mkdir() tried to create the projects directory. This commit updates the nix run wrapper to export OD_DATA_DIR with a writable default ($HOME/.od) when the variable is unset. Users can still override it by setting OD_DATA_DIR before running. The Home Manager and NixOS modules already set OD_DATA_DIR, so they are unaffected by this change.	2026-05-11 10:52:53 +08:00
github-actions[bot]	d3b1804523	docs(readme): refresh contributors wall (#1188 ) Co-authored-by: mrcfps <23410977+mrcfps@users.noreply.github.com>	2026-05-11 10:50:30 +08:00
github-actions[bot]	12708fd379	Update docs/assets/github-metrics.svg - [Skip GitHub Action] (#1183 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2026-05-11 10:50:16 +08:00
shangxinyu1	d45bf3fb9a	test: expand entry and settings automation coverage (#954 ) * test: harden new project panel metadata coverage * test: expand entry e2e coverage * test: drop e2e docs from the guarded package * test: cover examples gallery interactions * test: cover examples preview modal actions * test: cover examples preview escape fullscreen * test: cover examples template prompt filtering * test: cover updated settings and entry tabs * test: fix entry/settings coverage type drift * test: fix example preview fetch assertion * test: fix new project panel skill fixture	2026-05-11 10:49:42 +08:00
Nagendhra Madishetti	32fa0c23bb	feat(daemon): Critique Theater Phase 6.2 (artifact extraction + endpoint) (#1085 ) The orchestrator was leaving artifactPath = null on every shipped run because the SHIP <ARTIFACT> body never made it past the parser. Reviewers caught this on PR #1006: a rerun-style endpoint built on top of that null could not return a usable prior-art reference, and tests that synthesized artifactPath via insertCritiqueRun were hiding the gap rather than covering the feature. This PR closes that gap. The parser now hands the orchestrator a ShipArtifactPayload (round, mime, body) through a side-channel callback, and the orchestrator writes the bytes to <artifactsDir>/<projectId>/<runId>/artifact.<ext> via a new artifact-writer module. The row's artifactPath is the absolute on-disk path. The web layer never sees that path: it fetches the bytes through GET /api/projects/:projectId/critique/:runId/artifact, which the new artifact-handler module serves with a mime-derived Content-Type, X-Content-Type-Options: nosniff, a CSP header for HTML and SVG, and the same cross-project leak guard pattern the interrupt handler uses. The body and mime intentionally never travel on the SSE wire. The SHIP PanelEvent (which doubles as the SSE payload shape) keeps its lightweight artifactRef, and the orchestrator strips body/mime before bus.emit, so a multi-megabyte artifact does not broadcast to every subscriber. The new orchestrator test asserts this explicitly. 
Defense in depth in the writer + handler: - mime allowlist with text/html, text/css, text/markdown, text/plain, application/json, image/svg+xml; everything else falls through to application/octet-stream + .bin so unknown payloads can't be misinterpreted as a known type; - UTF-8 byte-length cap, configurable via cfg.parserMaxBlockBytes, so multi-byte payloads can't sneak past a JS .length check; - atomic write through a sibling tmp file + rename so a daemon crash mid-write can't leave a half-written artifact under the canonical name; - path-traversal guard on the GET endpoint that resolves the row's artifactPath against the artifacts root and refuses anything that escapes it, refuses non-regular files (symlinks, dirs), and refuses files larger than the response cap. Folded in two non-blocking notes lefarcen left on PR #1016 (the contracts move) since persistence.ts was already in scope here: - P2: introduced CritiquePersistedStatus = CritiqueRunStatus | 'running' in the contracts package. CritiqueRunRow.status and CritiqueRunInsert.status now use it, and the inline `as CritiqueRunStatus | 'running'` widen in interrupt-handler.ts is gone. Public DTOs continue to use the terminal-only CritiqueRunStatus so a future endpoint can't leak a 'running' row through the wire. - P3: added AssertExhaustiveValues + a compile-time assertion that CRITIQUE_RUN_STATUSES covers every CritiqueRunStatus variant. Adding a value to ShipStatus or CritiqueRunStatus without updating the array now fails the build with a tuple naming the missing variants instead of silently dropping out of UI filters. Coverage: 174 critique tests across 14 files pass locally, including the new critique-artifact-writer (13 cases) and critique-artifact-endpoint (11 cases) suites, the inverted critique-lifecycle artifact-persistence test, and the orchestrator happy-path that asserts the SSE ship payload does NOT carry body or mime. 
Validated: pnpm guard, pnpm --filter @open-design/contracts build, pnpm --filter @open-design/daemon build (full tsc), pnpm --filter @open-design/web typecheck, pnpm --filter @open-design/daemon exec vitest run tests/critique (all green). This is step (b) of the four-step plan that PR #1006's closing comment laid out. Step (a) was the contracts move in PR #1016. Steps (c) (persist original_message_id / agent_id / model_id) and (d) (real rerun endpoint on top of (a)+(b)+(c)) follow. Co-authored-by: Nagendhra <nagendhra405@gmail.com>	2026-05-10 23:59:04 +08:00
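Two of the writer-side defenses listed above, the mime allowlist with its `application/octet-stream` + `.bin` fallback and the UTF-8 byte-length cap, can be sketched as pure functions. This is an illustration rather than the artifact-writer module itself: the function names are hypothetical, and the file extensions other than `.bin` are assumptions, since the commit message only lists the mime types.

```typescript
// Mime types from the commit message; the extensions are assumed.
const MIME_TO_EXT = new Map<string, string>([
  ["text/html", ".html"],
  ["text/css", ".css"],
  ["text/markdown", ".md"],
  ["text/plain", ".txt"],
  ["application/json", ".json"],
  ["image/svg+xml", ".svg"],
]);

// Unknown types fall through to a generic binary type and extension.
function resolveArtifactType(mime: string): { mime: string; ext: string } {
  const normalized = mime.trim().toLowerCase();
  const ext = MIME_TO_EXT.get(normalized);
  return ext === undefined
    ? { mime: "application/octet-stream", ext: ".bin" }
    : { mime: normalized, ext };
}

// Counts UTF-8 bytes by hand so the sketch has no runtime dependencies;
// in Node, Buffer.byteLength(text, "utf8") is the usual way to get this.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1;
    } else if (unit < 0x800) {
      bytes += 2;
    } else if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // valid surrogate pair encodes one 4-byte code point
        i++;
      } else {
        bytes += 3; // lone surrogate, encoded as U+FFFD
      }
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// True when the body is over the cap measured in bytes, not UTF-16 units.
function exceedsByteCap(body: string, maxBytes: number): boolean {
  return utf8ByteLength(body) > maxBytes;
}
```

Measuring bytes rather than `.length` matters for CJK or emoji-heavy payloads, where one UTF-16 unit can occupy three bytes on disk.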
Matt Van Horn	976a5900f8	fix: clear stale upload failure banner when previewing files (#797 ) * fix: clear stale upload failure banner when previewing existing files Closes #786 - Clear uploadError in openFile() so navigating to a file dismisses the banner - Scope banner visibility to the Design Files tab so stale errors do not bleed into preview surfaces - Add test pinning that no banner is rendered when there is no upload error * fix(workspace): move upload banner into DesignFilesPanel + interactive test Per @mrcfps + @lefarcen review on PR #797: - Move the upload-error banner from FileWorkspace into DesignFilesPanel body. Hide it whenever the in-panel preview is active (the missed flow that mrcfps and lefarcen flagged: single-click preview kept activeTab on DESIGN_FILES_TAB, so the old guard left the banner mounted above the preview). - Keep a fallback banner in FileWorkspace that fires only when activeTab is not Design Files. This preserves the partial-upload visibility flagged by chatgpt-codex-connector: a partial upload opens the last successful file (flipping activeTab to a viewer) and the failure note still surfaces. - Wrap uploadProjectFiles in try/catch so thrown errors surface a banner instead of disappearing. - Replace the brittle viewer-empty assertion with two interactive vitest cases: (1) mock-fail upload, banner visible, preview file, banner hidden, close preview, banner back, dismiss, banner gone; (2) partial-upload uploaded+failed, banner appears on the viewer surface with the existing 'Uploaded N file(s), but M failed' text. - Add df-upload-banner class and stable test ids upload-error-banner and upload-error-dismiss so future tests don't rely on the generic viewer-empty class. Closes #786 staleness; addresses follow-up review. --------- Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com> Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-10 23:56:24 +08:00
Yuhao Chen	35e7b622b7	fix(web): allow pod-to-chat comment text to wrap instead of truncating (#793 ) (#1156 )	2026-05-10 23:27:06 +08:00
Zihuailin	06e677cb72	Fix pending prompt clearing for templates (#1148 )	2026-05-10 21:52:49 +08:00
code-Y	84f768d4a2	feat: add WeChat design system, login-flow skill, and fix API mode tool_calls bug (#1083 ) * feat: add WeChat design system, login-flow skill, and fix API mode tool_calls bug - Add WeChat design system (design-systems/wechat/) with full brand spec including color palette, typography, and component rules for chat UI - Add login-flow skill (skills/login-flow/) for mobile authentication flows with P0 checklist, example HTML, and i18n registration across 3 locales - Fix DeepSeek V4 bug: API/BYOK mode (streamFormat=plain) models now receive a directive to emit only <artifact> HTML blocks and suppress tool_calls, since plain adapters proxy to external providers that cannot execute tools * fix: restore full server.ts and WeChat DESIGN.md from ad46d8cd commit Restore files that were corrupted in PR #1083 head branch. The WeChat DESIGN.md was reduced to a single line (filename only) and server.ts was reduced to ~1 line. Both are restored to their original ad46d8cd state with full content. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: restore full server.ts and WeChat DESIGN.md from ad46d8cd Restore files corrupted in PR #1083: - apps/daemon/src/server.ts: restored 7106-line file - design-systems/wechat/DESIGN.md: restored 301-line WeChat design spec - skills/login-flow/SKILL.md: restored from local working state - skills/login-flow/example.html: restored 351-line example HTML * fix: only suppress tool_calls when streamFormat='plain' explicitly, remove nonexistent assets/template.html 1. streamFormat check now requires explicit 'plain' value instead of defaulting to 'plain' when undefined. This prevents normal tool-using chat runs from incorrectly inheriting the API/BYOK tool_calls suppression rule. 2. login-flow SKILL.md: removed reference to assets/template.html since that file does not exist in the skill bundle and derivePreflight() would inject a hard instruction to read it before any other tool, causing pre-flight to fail. 
* fix: thread streamFormat to composeSystemPrompt in server.ts call Previously the composeSystemPrompt call at line ~4940 omitted streamFormat, causing the composer to default to 'plain' and suppress tool_calls even for tool-using chat runs. Now streamFormat is passed through from the adapter definition so the API mode rule only fires when streamFormat='plain' is explicitly set. * fix: WeChat category metadata, font-family, and login-flow example interactivity WeChat DESIGN.md: - Add Category: Social & Messaging metadata so it appears correctly in picker - Fix font-family declaration: remove invalid -webkit-font-family prefix, use standard font-family so downstream CSS generation works correctly skills/login-flow/example.html: - Add password toggle click handler so show/hide actually works - Change Apple icon fill from hardcoded #fff to currentColor so it is visible on light backgrounds * fix: mirror streamFormat suppression in contracts composer and add WeChat i18n 1. packages/contracts/src/prompts/system.ts: Add streamFormat parameter to ComposeInput and ComposeInput interface, mirroring the same suppression rule from daemon prompts/system.ts. When streamFormat='plain' is passed, a directive is appended telling models not to emit tool_calls and to only output <artifact> HTML blocks. 2. apps/web/src/i18n/content.{ts,fr,ru}.ts: Add WeChat design system entries: - Add 'wechat' to DE/FR/RU_DESIGN_SYSTEM_IDS_WITH_EN_FALLBACK arrays - Add 'wechat' summary to DE/FR/RU_DESIGN_SYSTEM_SUMMARIES - Add 'Social & Messaging' category to DE/FR/RU_DESIGN_SYSTEM_CATEGORIES (matching the Category: Social & Messaging metadata in WeChat DESIGN.md) * fix: thread streamFormat='plain' into web composeSystemPrompt for api mode * test: focus localized content coverage on missing resources --------- Co-authored-by: Open Design Contributor <z@open-design.dev> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-10 20:38:33 +08:00
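Dongsen	bfedbeca0f	fix(prompts): add "When NOT to emit" guardrail to artifact handoff (#1143 ) (#1145 ) * fix(prompts): add "When NOT to emit <artifact>" clauses (#1143) Field report in #1143: when a turn only uses the Edit tool to modify an existing HTML file and writes no new canonical HTML, the AI still closes out by following the system prompt's "non-negotiable output rule" to the letter and stuffs a one-sentence Chinese summary into an `<artifact type="text/html">` block. The downstream persistence path then writes that sentence of prose to disk as valid HTML, polluting the project files panel (see the screenshot in the #50 comments). The root cause is that the prompt has no no-emit clause. Changes in this commit: - Reword "Artifact handoff (non-negotiable output rule)" conditionally as "Artifact handoff" + "When you ship a fresh deliverable…" - Workflow step 5 ("Finish") gains a branch: in-place edits do not emit an artifact - Add a "When NOT to emit `<artifact>`" subsection that spells out three no-emit conditions: - in-place edits only: no new canonical HTML was produced this turn → just say which file changed and what changed - the body must be a complete `<!doctype html>` document → summaries, paths, bash output, and explanations go in a plain reply, not wrapped in the tag - when in doubt, don't emit → re-emitting an unchanged artifact has no value, and an empty-shell artifact misleads the user and pollutes the panel Tests: apps/daemon/tests/prompts/system.test.ts adds a describe block "artifact handoff no-emit clauses (#1143)" with 4 cases asserting that the composed prompt contains the required phrases. The persistence-layer backstop for #50 (pre-write HTML gate) is tracked separately in #1144 and complements this PR: this PR keeps the AI from emitting empty-shell artifacts, and #1144 adds a second barrier before the write, so the project panel stays clean even if the prompt fails to hold. * fix(prompts): make the dominant discovery layer honor the artifact no-emit exception too (Option C) review (lefarcen P1, mrcfps blocker): the "When NOT to emit <artifact>" exception added to the base layer was overridden by the unconditional emit directive in the higher-priority DISCOVERY_AND_PHILOSOPHY layer, so the main #1143 path could still produce empty-shell artifacts. Patched per Option C from the review: - Add an "Artifact emission is conditional" dominant-layer invariant paragraph before RULE 3 in discovery.ts (conditional emit is declared once in the dominant layer; the base layer keeps the detailed rules) - discovery.ts:17 arc comment / :143 plan template step 9 / :262 default arc recap all become conditional (emit only when this turn wrote new canonical HTML) - deck-framework.ts:327 deck workflow step 7 becomes conditional as well Tests add 2 assertions: - Negative: the assembled prompt no longer contains an unqualified "Emit single <artifact>" / "emit a single <artifact>." line 
- Positive: the discovery layer contains phrasing equivalent to "only when this turn wrote a new canonical HTML" and "only edited an existing HTML file" * test(prompts): add a deck-mode negative assertion covering deck-framework.ts:327 review (lefarcen P2): the previous round's negative assertion calls composeSystemPrompt({}), which does not trigger the DECK_FRAMEWORK_DIRECTIVE concatenation (it is appended only when skillMode === 'deck' or metadata.kind === 'deck'). If deck-framework.ts:327 later regresses to "Emit single <artifact>", the no-argument negative assertion would still pass falsely. Add an explicit deck-mode assertion: - Negative: the deck-mode prompt contains no unqualified "7. Emit single <artifact>" line - Positive: it contains the updated "Emit single <artifact> if a new canonical deck HTML"	2026-05-10 20:28:22 +08:00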
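Dongsen	7c1db80893	fix(web): intercept prose-as-HTML artifacts before writing to disk (#50 ) (#1144 ) * fix(web): reject prose-as-HTML artifacts at write time (#50) When a turn makes only in-place edits (no new canonical HTML produced), the AI occasionally still follows the system prompt's non-negotiable closing rule and emits an `<artifact type="text/html">` block that contains nothing but a one-sentence Chinese summary. `persistArtifact` previously performed no content validation, so such prose was written to disk as valid HTML at `.od/projects/<id>/<id>.html` with a `kind: html` manifest attached, polluting the project files panel (see the screenshot in the #50 comments). Add a pure `validateHtmlArtifact` function: it requires non-empty content + length ≥64 + a `<!doctype html>` or `<html>` tag (case-insensitive, BOM-tolerant). `persistArtifact` calls the gate in the `ext === '.html'` branch; on failure it reports through `setError` and writes no file. Scope is limited to the `<artifact>`-tag persistence path: draft HTML that users save manually in FileViewer/FileWorkspace goes through a different entry point and is unaffected. The prompt-level root cause (the missing no-emit clause) has been split out and is tracked separately in #1143; this PR is the persistence-layer backstop. * fix(web): anchor HTML structural check at content start (#1144 review) In the #1144 review, mrcfps pointed out a false negative in the original implementation: HTML_OPENING_TAG_RE / DOCTYPE_RE used .test() to search the whole string, so when the AI described a change and quoted a tag name inline ("Updated the <html lang> attribute..."), the text slipped through (length over 64, contains `<html `) and still landed as a ghost HTML file. Fix: merge the two regexes into STARTS_WITH_DOCUMENT_RE with a ^ anchor, requiring the first non-whitespace token of the trimmed content to be `<!doctype html>` or `<html`. Tag names that appear mid-string no longer count. Also, following lefarcen's non-blocking documentation suggestion, rewrite the docblock to be more precise: - "structural sniff" replaces "validation", making clear this is not an HTML validator - list three scope boundaries: not-a-linter / .jsx-tsx-skipped / the manual user-save path is unaffected - the 64-character threshold rejects a 49-character minimal empty doc (e.g. `<!doctype html><html><body></body></html>`); this is documented as an intentional trade-off, since AI output is expected to be a non-trivial deliverable Add 3 test cases covering the false negative mrcfps described: - long prose quoting `<html lang>` inline should be rejected - long prose quoting `<!doctype html>` inline should be rejected - a fragment whose first token is a non-document tag such as `<p>` should be rejected	2026-05-10 20:22:48 +08:00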
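The anchored structural sniff described in the commit above could look roughly like this. It is a minimal sketch, assuming only what the message states (non-empty, at least 64 characters, BOM-tolerant, first token `<!doctype html>` or `<html`); the exact regex, the `SniffResult` shape, and the name `validateHtmlArtifactSketch` are assumptions and not the shipped `validateHtmlArtifact`.

```typescript
const MIN_HTML_LENGTH = 64;

// Anchored at the start: a tag name quoted mid-sentence does not match.
const STARTS_WITH_DOCUMENT_RE = /^(?:<!doctype\s+html|<html[\s>])/i;

// Hypothetical result shape for illustration.
type SniffResult = { ok: true } | { ok: false; reason: string };

function validateHtmlArtifactSketch(content: string): SniffResult {
  // Drop a leading byte-order mark, then surrounding whitespace.
  const trimmed = content.replace(/^\uFEFF/, "").trim();
  if (trimmed.length === 0) {
    return { ok: false, reason: "empty" };
  }
  if (trimmed.length < MIN_HTML_LENGTH) {
    return { ok: false, reason: "too-short" };
  }
  if (!STARTS_WITH_DOCUMENT_RE.test(trimmed)) {
    return { ok: false, reason: "not-a-document" };
  }
  return { ok: true };
}
```

This is a sniff rather than a validator: it only decides whether the payload plausibly starts like a full document, which is enough to keep a prose summary from being saved as an `.html` file.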
lefarcen	93a08689e4	fix(web): truncate entry footer pet label (#1150 )	2026-05-10 19:45:39 +08:00
Sid	e948405c22	fix(web): surface connector auth errors and stop silent popup close (#725 ) (#1128 ) * fix(web): surface connector auth errors and stop silent popup close (#725) Two layered bugs caused the "Twitter Connect button does nothing" symptom: 1. ConnectorsBrowser dropped result.error from connectConnector. On Electron desktop the popup is never opened (electronAPI.openExternal path), so the existing renderConnectorAuthError(null, ...) was a no-op and the user got zero feedback. 2. registry.ts silently called authWindow?.close() whenever the connect response did not carry { kind: 'redirect_required', redirectUrl }, leaving web users with a popup that vanishes without explanation. Patch: - Add a per-connector connectorAuthorizationError state and render it as an inline banner on both ConnectorCard and ConnectorDetailDrawer (mirrors the existing cancel-failed pattern; reuses the existing .connector-authorization-error styling). - Replace authWindow?.close() with a renderConnectorAuthInfo helper that branches on auth.kind ('connected' | 'pending' | unknown) and writes an explanatory message to the popup before the user closes it. - Tests: 1 registry test for the pending/info popup branch, 2 ConnectorsBrowser tests for surfacing and clearing the inline banner. * fix(web): clear connector auth error on background status refresh Addresses review feedback from @mrcfps and the Codex bot on PR #1128: the inline error banner stayed visible even after background status refresh marked the connector as `connected` (e.g. user completes auth out-of-band through the Composio dashboard, then focus/poll/message refresh observes the connection). - Add clearConnectorAuthorizationErrorsForConnected helper next to the existing pending-state helpers; same shape, returns the same object reference when nothing changes so React skips a re-render. 
- Wire it into reloadConnectorStatuses so every status refresh path (pending poll, focus, OAuth callback message) drops stale errors for any connector now reported as connected. - Add 2 unit tests for the helper next to the existing pending-state helper tests in EntryView.test.ts.	2026-05-10 19:38:18 +08:00
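The "returns the same object reference when nothing changes" behaviour of the helper described above can be sketched as follows. The state shapes are assumptions (a plain id-to-message record and an id-to-status record), and `clearErrorsForConnected` is a hypothetical stand-in for `clearConnectorAuthorizationErrorsForConnected`, whose real signature is not shown in the commit message.

```typescript
// Assumed shapes for illustration only.
type ConnectorStatus = "connected" | "pending" | "disconnected";
type AuthErrors = Record<string, string>;

function clearErrorsForConnected(
  errors: AuthErrors,
  statuses: Record<string, ConnectorStatus>,
): AuthErrors {
  let next: AuthErrors | null = null;
  for (const id of Object.keys(errors)) {
    if (statuses[id] === "connected") {
      // Copy lazily, only once something actually needs to be removed.
      if (next === null) {
        next = { ...errors };
      }
      delete next[id];
    }
  }
  // Same reference when nothing changed, so a React state setter
  // receiving this value can bail out of the re-render.
  return next ?? errors;
}
```

A typical call site would pass this to a state setter in updater form, for example `setErrors((prev) => clearErrorsForConnected(prev, statuses))`, where `setErrors` is again a hypothetical name.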
Priyanshu Kayarkar	eabf3a6e86	feat: add collapsible MCP JSON field-mapping helper (#1136 ) * feat(web): add collapsible MCP JSON helper component * feat(web): add collapsible MCP JSON field-mapping helper * test(web): add McpJsonHelper component tests for toggle behavior * fix(web): scope helper id per row and show helper * test(web): rewrite McpJsonHelper tests to use row-scoped ids * feat(mcp): use stable _localId for McpRow keys and aria-controls - Add _localId to DraftRow and genLocalId() - Use _localId as React key and helper id to avoid duplicate DOM ids - Move helper outside transport branches so helper is visible for all transports - Fix malformed template.homepage anchor * fix(web): restore _localId-scoped helperId and helper visibility for all transports * test(web): replace integration test with _localId-scoped helper tests * test(web): exercise McpJsonHelper via production McpClientSection in jsdom * fix(web): resolve typecheck errors * test(web): expand rows before querying helper toggles to fix timeout	2026-05-10 19:37:46 +08:00
Jie Zhu	1f625cff77	fix(i18n): translate comments panel UI to Chinese (#1139 ) The comments panel in the project page left sidebar was missing Chinese translations for all UI strings. Users with Chinese language settings would see English text in the comments section, which created an inconsistent experience. This commit adds complete translations for: - Comment section titles (attached/saved comments) - Action buttons (add/remove/add all) - Empty state messages - Comment placeholder text - Attachment-related labels Both simplified Chinese (zh-CN) and traditional Chinese (zh-TW) locales are updated to provide full Chinese language support for the comments feature.	2026-05-10 19:37:22 +08:00
Jie Zhu	602cf704e2	fix(web): center close button in MCP picker dialog (#1137 )	2026-05-10 15:32:58 +08:00
郭一通	13005f4fea	fix(desktop): allow about:blank popup for PDF export fallback (#1081 ) The renderer's PDF export fallback uses window.open('', '_blank') to open a blank window that is then navigated to a Blob URL. Electron's setWindowOpenHandler only allowed blob: and od: protocols, so about:blank was denied and the user saw a "Popup blocked" alert. Fix: add about:blank to the allowed child window URL whitelist. Co-authored-by: Ken <hitken@users.noreply.github.com>	2026-05-10 12:21:15 +08:00
Arya Kaushal	9079c51ba3	feat(daemon): HTTP 206 range request support for video/audio files Fixes #784 (#1105 ) * feat(daemon): HTTP 206 range request support for video/audio files (#784) Stream video and audio via fs.createReadStream with Accept-Ranges: bytes and 206 Partial Content responses so browsers can play and seek media inline. Non-media files keep the existing buffer path unchanged. Add parseByteRange (RFC 7233-compliant) and resolveProjectFilePath to projects.ts, and 23 unit tests covering all range edge cases. * fix(daemon): move range streaming to /raw/* route used by media viewers The inline VideoViewer and AudioViewer components fetch /api/projects/:id/raw/* (via projectRawUrl), not /files/*. Apply the HTTP 206 / Accept-Ranges streaming path to the raw route while preserving its Origin: null CORS behaviour for sandboxed iframes. Add 7 route-level HTTP tests against a real startServer() instance covering 200 full, 206 partial, suffix range, open-ended range, 416 unsatisfiable, non-media passthrough, and 404 cases. --------- Co-authored-by: mrcfps <mrc@powerformer.com>	2026-05-10 12:16:52 +08:00
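A single-range parser in the spirit of the `parseByteRange` helper mentioned above might look like this. It is a simplified sketch, not the daemon's implementation: the name `parseByteRangeSketch`, the return shape, and the choice to ignore multi-range and malformed headers (serving the full file instead) are all assumptions.

```typescript
type ByteRange = { start: number; end: number };
// null means "ignore the header and serve the whole file with 200".
type RangeResult = ByteRange | "unsatisfiable" | null;

function parseByteRangeSketch(
  header: string | undefined,
  size: number,
): RangeResult {
  if (!header) return null;
  // Only a single "bytes=start-end" spec is handled in this sketch.
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match) return null;
  const rawStart = match[1] ?? "";
  const rawEnd = match[2] ?? "";
  if (rawStart === "" && rawEnd === "") return null;

  if (rawStart === "") {
    // Suffix range: the last N bytes of the file.
    const suffix = Number(rawEnd);
    if (suffix === 0 || size === 0) return "unsatisfiable";
    return { start: Math.max(size - suffix, 0), end: size - 1 };
  }

  const start = Number(rawStart);
  if (start >= size) return "unsatisfiable"; // maps to a 416 response
  // Open-ended ("500-") or over-long ends are clamped to the last byte.
  const end = rawEnd === "" ? size - 1 : Math.min(Number(rawEnd), size - 1);
  if (end < start) return null; // invalid spec, treated as absent
  return { start, end };
}
```

A route would then answer a `ByteRange` with status 206, a `Content-Range: bytes start-end/size` header, and a stream over that slice (for example `fs.createReadStream(path, { start, end })`, whose `end` option is inclusive), and answer `"unsatisfiable"` with 416.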
Nicholas-Xiong	31f89f74fd	fix: remove Trump pet from bundled community pets (#1103 ) Fixes #1042 Problem: The Trump pet was included in the bundled community pets catalog, which appeared in the Built-in pet adoption picker. This raised concerns about keeping politically-charged content in the default pet selection. Solution: - Removed the trump pet directory from assets/community-pets/ - Removed 'trump' from the BUNDLED_PETS list in bake-community-pets.ts The pet is still available on Codex Pet Share for users who want to download it manually, but it no longer ships as a built-in option. Impact: - Trump pet no longer appears in the default pet adoption picker - Users can still access it via "Download community pets" if desired - Keeps the built-in pet selection neutral and welcoming Related: - PR #850 (previous attempt that was closed without merging)	2026-05-10 12:11:00 +08:00
Yuhao Chen	6f2584e315	fix(web): prevent chat messages from overflowing into workspace area (#662 ) (#1104 ) Add overflow-x: hidden to .chat-log so any horizontally overflowing content (thinking blocks, status pills, form cards) is clipped inside the chat pane instead of spilling into the workspace area. Add min-width: 0 and max-width: 100% to .msg so flex items in the chat log column cannot expand beyond the panel width when their intrinsic content is wider than the container. Add min-width: 0 to .assistant-flow to prevent the intermediate flex container from propagating intrinsic content width up to the message boundary. This complements the existing overflow: hidden on .pane and .split-chat-slot from #740 by also constraining the intermediate flex items that can propagate width from deeply nested content up to the scroll container boundary.	2026-05-10 11:58:49 +08:00
soulme	cbb3c0e33a	Improve design files grouping (#1082) Add a modified-date grouping mode to make busy design workspaces easier to scan as generated files accumulate. The new view keeps existing batch actions and pagination available, adds localized labels, and covers date boundaries with component tests.	2026-05-10 11:55:34 +08:00
bojie.hbj	bb578b3dca	fix: Support OpenCode Write tool display as card (#1126) The Write tool from OpenCode AI wasn't being displayed correctly as a card. This fix addresses two issues: 1. Tool name normalization: Added support for lowercase 'write' in addition to 'Write' 2. Field naming normalization: Added support for camelCase 'filePath' in addition to snake_case 'file_path' Changes made: - Added `normalizeToolInput()` function in daemon.ts for root-level field normalization - Updated ToolCard.tsx to recognize both tool name variants and field naming conventions - Updated AssistantMessage.tsx for tool name recognition - Updated ProjectView.tsx for file path parsing in auto-open feature This ensures consistent behavior across different AI providers regardless of their tool naming conventions.	2026-05-10 11:49:00 +08:00
Nicholas-Xiong	29e5732f44	fix: prevent design system filter popover from shifting position on reopen (#960) * fix: prevent design system filter popover from shifting position on reopen Fixes #921 The design system filter popover was repositioning incorrectly when reopened after filtering, sometimes appearing too high and becoming partially hidden at the top of the viewport. Root cause: - The popover uses position: absolute with top: calc(100% + 6px) - When filtering reduces the number of items, the popover height shrinks - On reopen, the reduced height can cause the popover to appear higher than expected, especially if the trigger button is near the top Solution: - Added min-height: 120px to .ds-picker-list - This ensures the popover maintains a consistent minimum height - Prevents position shifts when content is filtered - The popover stays anchored correctly to its trigger The 120px minimum provides enough space for ~3-4 items while keeping the popover stable across filter state changes. * fix: scope min-height to design-system picker only The .ds-picker-list class is shared by multiple pickers: - NewProjectPanel prompt-template picker - SettingsDialog MCP client picker - NewProjectPanel design-system picker Adding a global min-height: 120px would affect all pickers, causing unnecessary blank space when they have few items. This adds a dedicated .ds-picker-list-design-systems modifier class to scope the min-height fix to only the design-system picker, which is the one affected by the position-shift bug.	2026-05-10 11:47:40 +08:00
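
Note on commit 9079c51ba3 (HTTP 206 range support): the message mentions a `parseByteRange` helper but does not show it. The block below is only a rough sketch of how single-range parsing along RFC 7233 lines might look. The function names, return shape, and the decision to ignore multi-range headers are assumptions for illustration, not the repository's actual code.

```typescript
// Hypothetical sketch; NOT the repository's parseByteRange implementation.
type ByteRange = { start: number; end: number };
// null means "ignore the header and answer 200 with the full body";
// "unsatisfiable" means "answer 416".
type RangeResult = ByteRange | "unsatisfiable" | null;

function parseSingleByteRange(header: string | undefined, size: number): RangeResult {
  if (!header) return null;
  // Only a single "bytes=start-end" spec is handled; multi-range is ignored (assumption).
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match) return null;
  const rawStart = match[1] ?? "";
  const rawEnd = match[2] ?? "";
  if (rawStart === "" && rawEnd === "") return null;

  if (rawStart === "") {
    // Suffix range ("bytes=-N"): the last N bytes of the file.
    const suffix = Number(rawEnd);
    if (suffix === 0 || size === 0) return "unsatisfiable";
    return { start: Math.max(size - suffix, 0), end: size - 1 };
  }

  const start = Number(rawStart);
  if (start >= size) return "unsatisfiable";
  // Open-ended range ("bytes=N-") runs to the last byte; an explicit end is clamped.
  const end = rawEnd === "" ? size - 1 : Math.min(Number(rawEnd), size - 1);
  if (end < start) return null; // invalid spec: ignore the header
  return { start, end };
}

// Value for the Content-Range header of a 206 response.
function contentRange(range: ByteRange, size: number): string {
  return `bytes ${range.start}-${range.end}/${size}`;
}
```

In a handler, a `ByteRange` result would presumably be passed as the `start` and `end` options of `fs.createReadStream` (both inclusive), with `Content-Length` set to `end - start + 1`.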

561 commits
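
Note on commit bb578b3dca (OpenCode Write tool card): the message describes accepting both `Write` / `write` as tool names and both `file_path` / `filePath` as field names, but the code itself is not part of this log. The following is a minimal sketch of that idea under assumptions: the alias table, the function names, and the rule that an explicit snake_case key wins over its camelCase alias are all hypothetical and may differ from the real `normalizeToolInput()`.

```typescript
// Hypothetical sketch; NOT the repository's normalizeToolInput().
type ToolInput = Record<string, unknown>;

// Assumed alias table: camelCase key -> canonical snake_case key.
const FIELD_ALIASES: Record<string, string> = { filePath: "file_path" };

// Rewrites root-level alias keys to their canonical names; nested objects are untouched.
function normalizeToolInputSketch(input: ToolInput): ToolInput {
  const out: ToolInput = {};
  for (const [key, value] of Object.entries(input)) {
    const canonical = FIELD_ALIASES[key] ?? key;
    // If the canonical key is also present, keep it and drop the alias (assumption).
    if (canonical !== key && canonical in input) continue;
    out[canonical] = value;
  }
  return out;
}

// Accepts the two tool-name variants named in the commit message.
function isWriteTool(name: string): boolean {
  return name === "Write" || name === "write";
}
```

Normalizing once at the boundary would let downstream UI code read a single field name, which seems to be the intent the commit message describes.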