Commit graph

17 commits

Author SHA1 Message Date
lefarcen
e1bc83a476
feat(analytics): PostHog product analytics (P0 events, consent-gated, packaged) (#1428)
* feat(analytics): scaffold PostHog product-analytics integration

- Add @open-design/contracts/analytics subpath with the 17 P0 event
  payload types, header constants, and code↔CSV enum mapping helpers.
- Add apps/daemon/src/analytics.ts with env-gated posthog-node client,
  request-scoped analytics context reader, and artifact-id anonymizer.
- Expose GET /api/analytics/config so the web bundle never embeds the
  PostHog key at build time; daemon owns POSTHOG_KEY / POSTHOG_HOST.
- Add apps/web/src/analytics module (identity + lazy posthog-js client
  + React provider) and mount it under <I18nProvider> in app/layout.

No event wiring yet — that lands in the next commit alongside trigger
points (App.tsx, EntryView, NewProjectPanel, SettingsDialog, FileViewer,
runs.ts).

* feat(analytics): wire app_launch, home_view, home_click, project_create_result

- App.tsx: fire app_launch once after first effect tick. handleCreateProject
  now emits project_create_result on both success and failure paths.
- EntryView.tsx: home_view (page) gated on agents loading so
  has_available_cli isn't transiently false; home_view (asset_panel) fires
  per top-tab change with the right result_count.
- NewProjectPanel.tsx: home_click create_button fires before delegating to
  the parent; a fresh request_id is generated here and threaded through
  onCreate so the matching project_create_result stitches via $insert_id.
- contracts/analytics: tighten createTabToTracking and topTabToTracking
  for the worktree branch's renamed tabs (live-artifact, templates).

* feat(analytics): wire settings_view + 3 settings_click events

- settings_view fires on dialog mount and on every section switch,
  carrying the active section (mapped via settingsSectionToTracking
  for the 16-section worktree layout), execution_mode, and the
  selected CLI provider id when present.
- settings_click execution_mode_tab: setMode now emits before/after
  values whenever the user toggles between Local CLI and BYOK.
- settings_click cli_provider_card: agent card onClick reports
  cli_provider_id via agentIdToTracking (kiro → other).
- settings_click byok_field: onFocus added to api_key, model select,
  and base_url inputs; provider_id widened to include google so the
  worktree's Gemini protocol slot type-checks.

* feat(analytics): wire studio_view + studio_click chat, studio_view artifact

- packages/contracts/src/analytics/artifact-id.ts: FNV-1a 64-bit helper
  produces a 16-hex anonymized id for (projectId, fileName). Stable
  cross-platform so the daemon and the web bundle resolve the same id
  without a Web Crypto round-trip; daemon now re-exports it.
- ChatComposer: studio_view chat_panel fires once per project mount,
  studio_click chat_composer fires on attachment + send buttons with
  estimated user_query_tokens (length/4) and has_attachment.
- FileViewer: studio_view artifact fires once per (project, file) at
  the dispatcher level, before any sub-viewer renders, with
  artifact_kind derived from the renderer registry / file.kind table.
- Widen TrackingExportFormat to include markdown and cloudflare_pages
  so the worktree branch's full share menu can emit verbatim.

* feat(analytics): wire studio_click share_option + artifact_export_result

HtmlViewer's share menu now emits both events per click via a
fireShareExport helper:

- studio_click share_option fires immediately on click with the chosen
  export_format and a fresh request_id.
- artifact_export_result fires when the export resolves — success for
  sync exporters (html, markdown, template) the moment the call
  returns, success/failed for async exporters (pdf, zip, deploy)
  via .then/.catch. The same request_id threads both events so
  PostHog stitches click → result via $insert_id.

DEPLOY_PROVIDER_OPTIONS maps to the CSV's vercel / cloudflare_pages
slots; markdown is now a first-class export_format value.

Also ignore .env.local so local POSTHOG_KEY / .env-style secrets
don't get committed.

* feat(analytics): emit run_created and run_finished from the daemon

POST /api/runs now reads the analytics context off the
x-od-analytics-* headers the web client sets on every fetch, then:

- Captures run_created with project_id, conversation_id, run_id,
  model_id, agent_provider_id (mapped via agentIdToTracking),
  skill_id, design_system_id, plus the token_count_source marker.
- Schedules a run_finished capture on runs.wait(run) resolution,
  mapping succeeded/canceled/failed to success/cancelled/failed and
  reporting total_duration_ms.

Both events use a stable insert_id derived from the same uuid so
PostHog dedupes the daemon-side mirror against any future
web-side capture without double-counting.

Token sub-fields (user_query_tokens/system_prompt_tokens/...) stay
omitted in v1 — the claude-stream parser only exposes input/output
totals today. See tracking-doc-issues.md §3.2.

* feat(analytics): emit settings_cli_test_result + settings_byok_test_result

The original BLOCKING-list assumed these CSV P0 events were not
implementable in this branch because main lacked Test buttons. The
worktree HEAD actually wires `handleTestAgent` and `handleTestProvider`
in SettingsDialog, so both events are now in scope.

- handleTestAgent emits settings_cli_test_result on success and
  failure paths with cli_provider_id mapped via agentIdToTracking,
  result drawn from result.ok / catch branch, error_code from
  result.kind or the thrown error name, and duration_ms timed via
  performance.now().
- handleTestProvider emits settings_byok_test_result analogously,
  using apiProtocol (anthropic|openai|azure|ollama|google) directly
  as provider_id — wider than the CSV's 5-value enum, documented in
  tracking-doc-issues.md §2.5.

Contracts: add SettingsCliTestResultProps / SettingsByokTestResultProps
plus matching track* helpers. AnalyticsEventName union now covers all
14 P0 events this branch supports.

* feat(analytics): gate PostHog on the existing telemetry.metrics consent

The integration now reuses the same first-launch privacy banner +
Settings → Privacy toggle that gates Langfuse, so a single user
decision controls both telemetry sinks.

- /api/analytics/config now consults the persisted AppConfigPrefs:
  it returns enabled=true only when POSTHOG_KEY is set AND the user
  has chosen "Share usage data" (telemetry.metrics === true). The
  response also echoes installationId so the web client uses the
  same anonymous id Langfuse keys off of — one identity per install,
  shared across both sinks.
- Web AnalyticsProvider:
  - Bootstrap fetch resolves installationId and threads it through
    the x-od-analytics-anonymous-id header on every /api/* fetch,
    so daemon-side captures (run_created / run_finished /
    project_create_result) land on the same person record.
  - Exposes a setConsent(granted) method that calls posthog-js's
    opt_in_capturing / opt_out_capturing, wired from App.tsx via a
    useEffect watching config.telemetry?.metrics. Toggling Privacy
    → metrics now stops/resumes events immediately, no reload.
- app_launch additionally gates on telemetry.metrics so a freshly-
  declined user fires nothing, and a freshly-opted-in user fires on
  the next reload.

* feat(packaging): bake POSTHOG_KEY into packaged daemon spawn env

Wires PostHog product analytics through the same Langfuse-style build-
secret pipeline so official Open Design builds ship with the key while
fork builds compile without it (the integration short-circuits cleanly
when POSTHOG_KEY is absent).

tools/pack
- resolveToolPackConfig reads POSTHOG_KEY / POSTHOG_HOST from
  process.env at packaging time, validates them (no whitespace in the
  key, http(s) URL for host, trailing-slash strip), and stamps them on
  ToolPackConfig. Fork builds without the env vars simply omit the
  fields; the daemon-side gate keeps things off in that case.
- Mac, Windows, and Linux packaged-config writers each append the two
  fields to open-design-config.json next to the existing
  telemetryRelayUrl entry.

apps/packaged
- RawPackagedConfig / PackagedConfig surface posthogKey / posthogHost
  so the Electron entry and headless entry both forward them to the
  daemon sidecar.
- buildPackagedDaemonSpawnEnv emits POSTHOG_KEY / POSTHOG_HOST into
  the daemon child env when present. The daemon's existing analytics
  module reads these via process.env — no daemon-side changes needed.
- The headless packaged path falls back to process.env for fields the
  builder hasn't injected, mirroring how OPEN_DESIGN_TELEMETRY_RELAY_URL
  is read there.

CI
- release-beta.yml and release-stable.yml expose POSTHOG_KEY (secret)
  and POSTHOG_HOST (var) at workflow-env scope so every packaging job
  inherits them. PR / fork builds without these set simply skip the
  bake step.

Tests
- tools/pack: config.test.ts covers bake-through, fork-build omission,
  whitespace rejection, invalid-URL rejection, and trailing-slash
  normalization.
- apps/packaged: sidecars.test.ts covers buildPackagedDaemonSpawnEnv
  forwarding the keys when present and omitting them when null.

* feat(analytics): enable PostHog autocapture + perf + exceptions

Flip on the PostHog SDK's automatic diagnostic features so we capture
click paths, page transitions, web vitals, dead clicks, and browser
exceptions without scattering instrumentation through the codebase.

Privacy defense lives in one place — apps/web/src/analytics/scrub.ts —
wired in via posthog-js's `before_send` hook so every outgoing event
passes through the same audit point:

  - $autocapture / $rageclick / $dead_click / $copy_autocapture:
    strips $el_text and value/placeholder/aria-label attrs from any
    input, textarea, password input, or contenteditable element. PostHog
    autocapture does not capture input.value by default, but $el_text
    on a <textarea> reflects the typed content — that's the prompt
    body for us, so it has to be scrubbed every time.
  - $pageview / $pageleave: drops query string and fragment from
    $current_url / $referrer so any future ?q=… can't leak.
  - $exception: rewrites file:// and absolute filesystem paths in
    stack frames to app://apps/<repo-relative> so we don't ship the
    user's home directory.
  - Suppresses $opt_in entirely — duplicate of our explicit
    setConsent toggle in App.tsx.

Element-level defense in depth is limited to the single most sensitive
surface: the chat composer textarea gets `ph-no-capture` so PostHog
never even generates an event for clicks inside that subtree. Every
other input relies on scrub.ts — sprinkling the class through every
form would be noisy and easy to forget on new surfaces.

The existing Privacy → "Share usage data" toggle continues to gate
every new feature: posthog-js's opt_out_capturing() halts autocapture,
$pageview, $exception, web vitals, and dead clicks alongside the
explicit capture() calls — one global switch.

11 unit tests pin the scrub rules in apps/web/tests/analytics-scrub.test.ts.

* ci(nix): bump pnpmDepsHash for posthog-js + posthog-node additions

Adding posthog-js to apps/web and posthog-node to apps/daemon changed
pnpm-lock.yaml, which Nix's fixed-output pnpmDeps derivation pins by
sha256. The CI nix flake check failed with:

  specified: sha256-KF3Mld72/iau+pJmA7HvnanRx8VLtDP0N624SKrtrrc=
  got:       sha256-PGFgX4lYyeH2TRAXfUq52A3EOa6bb1gO59hPsXhEk3s=

Copy the new hash into both nix/package-web.nix and
nix/package-daemon.nix per the procedure documented in nix/README.md
§"First-build hash pinning".

* feat(analytics): unify PostHog identity with Langfuse installationId

PostHog's distinct_id is the installationId stamped by /api/analytics/
config; Langfuse already reads the same id off app-config.json to
populate trace.userId. With both sinks keying off the same anonymous
identity, dashboards can correlate user actions (PostHog events) with
LLM runs (Langfuse traces) without re-identifying.

Two gaps closed:

1. applyConsent(false) — clear posthog-js's persisted ph_*_posthog
   localStorage entry on opt-out via posthog.reset(). Without this, a
   user who opts out, then clicks Delete my data, then re-opts in
   would see PostHog stitch their new session to the deleted identity
   because bootstrap.distinctID only takes effect on first init.

2. applyIdentity(newInstallationId) — Delete my data rotates the
   installationId in app-config; App.tsx now watches config.installationId
   and calls posthog.reset() then identify(newId) so the next event
   batch is fully decoupled from the deleted one. Idempotent on
   same-id re-renders so benign config refreshes don't churn PostHog
   identities.

The fetch wrapper's x-od-analytics-anonymous-id header also flips to
the new id on rotation so daemon-side captures (run_created /
run_finished) land on the same person record from the very next API
call, not after a reload.

The end-to-end rotation flow is verified against a live PostHog
project; these unit tests pin the safety guards (no-client paths, null
inputs) since stubbing posthog-js's init-loaded callback chain is
brittle.

* fix(langfuse): require both metrics AND content consent for trace reports

Tightens the Langfuse gate so a user who shares anonymous metrics but
NOT conversation content stops emitting Langfuse traces entirely —
Langfuse is used for turn-quality evals which only make sense with
prompt/output bodies. PostHog (product analytics, content-free) stays
gated on `metrics` alone and is unaffected.

i18n: "Conversation content" → "Conversation and tool content" with
hints expanded to mention tool inputs/outputs so the consent surface
matches what the trace actually carries (en + zh-CN).

Bundled here per PR scope — change originated outside this PostHog
PR but lands cleanly on the same files; gating Langfuse strictly
on `content` makes the dual-sink consent model (PostHog = metrics,
Langfuse = metrics + content) symmetric across both i18n locales and
the daemon-side gate.

* feat(analytics): wire byok_provider_option + fix PR review P1s

Adds the BYOK protocol-chip click event (5-value provider_id mirroring
the apiProtocol Settings UI) and resolves four P1 review threads on
PR #1428.

byok_provider_option:
- New SettingsClickByokProviderOptionProps in contracts (provider_id =
  anthropic|openai|azure|google|ollama; maps to CSV's 5 values per
  tracking-doc-issues.md §2.5).
- trackSettingsClickByokProviderOption helper in apps/web/src/analytics.
- SettingsDialog hooks it on the protocol-chip onClick alongside the
  existing setApiProtocol call; is_selected reflects whether the chip
  was already active.

Review fixes:

1. client.ts (Siri-Ray): clear `initPromise` when the resolution is
   null so a Privacy → metrics opt-in after a previous decline triggers
   a fresh /api/analytics/config fetch. Without this, the disabled
   response was cached forever — first-session opt-in needed a reload
   to start sending PostHog events.

2. provider.tsx (Siri-Ray): replace `url.includes('/api/')` with a
   strict same-origin + /api/ pathname check (shared
   `isSameOriginApiCall` helper). Outbound third-party URLs containing
   `/api/` (e.g. provider.example.com/api/x) no longer receive our
   x-od-analytics-* headers.

3. provider.tsx (codex-connector, lefarcen): gate header injection on
   `resolvedAnonId` being non-null. When Privacy → metrics is off,
   /api/analytics/config returns enabled=false → resolvedAnonId stays
   null → wrapper never installs → daemon can't read consent-bearing
   headers → no daemon-side PostHog event. setConsent now also clears
   resolvedAnonId on opt-out and re-fetches on opt-in.

4. daemon/analytics.ts (defense in depth): createAnalyticsService now
   takes dataDir and capture() re-reads app-config to check
   telemetry.metrics inside the fire-and-forget wrapper. Even if a
   stale header somehow reaches the daemon after opt-out, the capture
   is dropped before posthog-node.capture is called.

* fix(web): place "Share usage data" on the right in privacy consent banner

Swap button order in PrivacyConsentModal and the in-settings ConsentCard
so the affirmative "Share usage data" lands on the right and "Not now"
on the left. Matches the OK-on-the-right pattern users expect for
primary actions.

Both buttons keep equal visual prominence (same .privacy-consent-action
styling) so the swap doesn't change the EDPB equal-prominence stance
called out in the original Langfuse telemetry spec.

* feat(analytics): populate run_finished token totals from claude-stream usage

Daemon's claude-stream parser already emits agent usage events with
input_tokens / output_tokens totals; the run service buffers them in
run.events and Langfuse reads them out the same way. The run_finished
PostHog event was leaving these fields empty.

Scan run.events for the most recent agent usage frame on terminal
transition and emit input_tokens / output_tokens / total_tokens when
present. token_count_source flips to 'provider_usage' only when at
least one count landed; runs without provider-side usage data keep
'unknown'.

Provider does not break the input down into the 7 sub-fields the
tracking doc lists (memory / context / attachment / system_prompt /
…); those stay omitted until a parser change exposes them.

* feat(analytics): estimate user_query_tokens from prompt length

The user_query_tokens field for run_created / run_finished was hardcoded
to 0. We can't tokenize without bundling a model-specific tokenizer, but
the character/4 heuristic is the industry-standard estimate when one
isn't available and is enough for funnel analysis (prompt-length cohorts,
short-vs-long-query conversion rates).

Extracted from req.body via the same telemetryPromptFromRunRequest
pattern the daemon already uses for langfuse-bridge (currentPrompt then
message fallback). Only the integer count goes to PostHog — the prompt
text itself never leaves the daemon.

token_count_source flips appropriately:
- run_created with a prompt: 'estimated' (was 'unknown')
- run_created with no prompt: 'unknown'
- run_finished with provider usage: 'provider_usage' (overrides
  baseProps' 'estimated' value)
- run_finished without provider usage: inherits 'estimated' or 'unknown'
  from baseProps so input/output absent doesn't mask the estimate.
2026-05-12 22:32:42 +08:00
lefarcen
2a0ebea50b release: Open Design 0.7.0
- bump 14 monorepo package.json files to 0.7.0 (root + apps/{web,daemon,desktop,packaged,landing-page} + packages/{contracts,platform,sidecar,sidecar-proto} + tools/{dev,pack,pr} + e2e); apps/packaged was already at 0.6.1 from beta lane, all others at 0.6.0
- add CHANGELOG.md [0.7.0] - 2026-05-12 entry covering 97 merged PRs since 0.6.0:
  - Critique Theater: Phase 7 web client state machine (#1307) + Phase 6.2 daemon artifact extraction (#1085)
  - Web/UI: thumbs-up/down feedback widget (#1308), Cmd+, opens Settings (#1173), Finalize design package + Continue in CLI (#974), fetch models button for BYOK (#1034), provider models alphabetical sort (#1097), collapsible MCP JSON field-mapping (#1136), design file rename (#894)
  - Daemon: auto-memory store with chat-protocol-aware extraction (#999), install/uninstall skills & design systems (#1003), HTTP 206 range requests for video/audio (#1105), scheduled routines (#1033), agent runtime + route registration refactor (#1063, #1043)
  - HyperFrames: HTML-in-Canvas across web + skills (#866)
  - Skills/design systems: generic skills + design-templates split + finalize-design API (#955), agent-browser skill (#1284), WeChat design system + login-flow skill (#1083), hud/loom/trading-terminal design systems (#1069), release-notes-one-pager skill (#873), tokens.css schema (#1231)
  - Packaging: macOS Intel (x64) build (#759), official Nix flake (#402), beta packaging cache (#1095)
  - Maintainer ops: tools-pr PR-duty workspace (#1259), MAINTAINERS.md (#1290), contributor card bot (#932), PR→issue linking discipline (#1263)
  - Changed: conversation run isolation (#1271), default English i18n fallback (#1270), Codex CLI exit diagnostics / empty-response handling / path fallback (#1267, #1244, #1205)
  - Fixed: ~30 web + desktop + daemon + packaging bugfixes
  - Internal: nightly UI/desktop regression coverage (#1256), e2e/release report hardening (#1140), entry/settings automation (#954)
- catch up [Unreleased] compare link to v0.7.0 and add missing [0.6.0] release link
- add 97 PR footnote refs ([#402]..[#1330])

Verified locally: pnpm install + pre-build contracts/daemon/desktop dist + pnpm typecheck (exit 0 across all 14 packages on Node 22.22 with engine-warning).

Release workflow validation runs after merge via release-stable.
2026-05-12 15:33:28 +08:00
Demoniooo
617fb043fe
feat(settings): add fetch models button for BYOK providers (#1034)
* feat(settings): add fetch models button for BYOK providers

* fix(settings): exclude Ollama from fetch models, add manual-entry hint

* fix(provider-models): classify non-JSON upstream errors by HTTP status

* fix(i18n): drop redundant English overrides from non-English locales

* fix(provider-models): allow ollama through allowlist, return unsupported_protocol

---------

Co-authored-by: haolin122 <hl6593@nyu.edu>
2026-05-09 22:28:03 +08:00
Marc Chan
b03a504da6
release: Open Design 0.6.0 (#1080) 2026-05-09 19:58:11 +08:00
初晨
9ef136ced5
fix: sync Orbit last run with selected prompt template (#937)
* fix(orbit): scope last run to selected template

* fix(orbit): preserve legacy last run on upgrade

* fix(orbit): pin legacy last-run fallback on refresh

* fix(orbit): pin template id at run start

* test(web): sync orbit fixtures with skill summary
2026-05-09 11:19:59 +08:00
Bryan A
e13adf2e63
feat(daemon): finalize design package endpoint (closes #450) (#832)
* feat(daemon): scaffold /api/projects/:id/finalize/anthropic (refs #450)

Phase C of the PR 2 plan for issue #450: scaffold the route + module
shape so subsequent phases (D-I) land function bodies and tests against
a stable surface that already passes typecheck.

What lands here:
- apps/daemon/src/finalize-design.ts: module-level constants
  (DEFAULT_BASE_URL, DEFAULT_MAX_TOKENS=16000, INPUT_BODY_CAP_BYTES=384KiB,
  LOCK_FILENAME=.finalize.lock, OUTPUT_FILENAME=DESIGN.md,
  DEFAULT_TIMEOUT_MS=120s); inline interfaces for the request/response
  shape (kept out of packages/contracts per scope rules); two error
  classes - FinalizePackageLockedError (mirrors PR #493's
  TranscriptExportLockedError) and FinalizeUpstreamError (carries upstream
  HTTP status for the route's error mapping); function stub that throws
  "not yet implemented".
- apps/daemon/tests/finalize-design.test.ts: vitest harness with
  describe.skip placeholder so the file imports cleanly. Real cases land
  in phases D-I. Default-import of node:fs (per memory: vi.spyOn cannot
  redefine on the frozen ESM Module Namespace; CJS exports object is
  mutable).
- apps/daemon/src/server.ts: route handler at
  POST /api/projects/:id/finalize/anthropic, slotted next to the
  existing :id/deploy* family. Validates apiKey/model non-empty, optional
  baseUrl via the existing validateExternalApiBaseUrl closure (forbidden
  -> 403, invalid -> 400), optional maxTokens positive number; calls
  getProject (404 on miss); calls finalizeDesignPackage (which throws,
  caught and mapped to 500 for now); maps known error classes
  (FinalizePackageLockedError -> 409, FinalizeUpstreamError -> 502)
  pre-emptively.

Path shape rationale (Bryan-confirmed): project-scoped path matches every
sibling /api/projects/:id/* route in server.ts (deploy, deployments,
deploy/preflight); provider-namespaced segment leaves a clean expansion
line for /api/projects/:id/finalize/openai etc. as follow-ups.

Field-name rationale: apiKey, baseUrl, model, maxTokens match
ProxyStreamRequest verbatim (packages/contracts/src/api/proxy.ts:8-19)
so a future caller can reuse the same body shape. baseUrl is optional
here (intentional divergence from the proxy at server.ts which requires
it) so standard Anthropic users do not need to set it; Bedrock /
self-hosted-proxy users still can.

Verification: pnpm --filter @open-design/daemon typecheck exits 0;
finalize-design.test.ts loads cleanly with 1 skipped placeholder; no
other tests touched.

Refs nexu-io/open-design#450 (PR 2 scaffold; pipeline body in subsequent
commits)

* feat(daemon): transcript truncation helper for /finalize prompt

Phase D of the PR 2 plan for issue #450: lands the helper that bounds
the transcript section of the synthesis prompt.

Why this exists: real-world signal at authoring time was a local project
transcript already at 3.95 MB. Anthropic's claude-opus-4-7 context cap
is roughly 200K tokens (~700 KB at typical density). Inserting an
unbounded transcript would 4xx upstream on the first real call. This
helper keeps the on-disk .transcript.jsonl lossless (PR #493's contract)
while making the prompt-inclusion bounded.

Strategy:
- Cap output at INPUT_BODY_CAP_BYTES (384 KiB) so the prompt has room
  for the system prompt + design system body + current artifact + room
  for the synthesis output.
- Always preserve the header line - it carries projectId, schemaVersion,
  conversation/message counts, attachment counts; synthesis quality
  depends on knowing the original sizes.
- Split equal byte budgets between head and tail so both project genesis
  and most-recent intent survive. Two thinking segments separated only
  by mid-session truncation lose the same kind of boundary that PR #493
  preserves between thinking blocks - that's accepted; smarter
  semantic chunking is a follow-up.
- Insert a single `{"kind":"truncated","reason":"size","omittedBytes":N}`
  sentinel JSON line between the head and tail so a synthesis consumer
  can detect the gap. omittedBytes is the difference between the
  original UTF-8 byte length and the output's UTF-8 byte length.
- If the head + tail budgets together cover the whole body (e.g. all
  message lines are tiny), no marker is emitted - the output is the
  input verbatim.

Tests:
- "returns the input verbatim when the JSONL fits under the 384 KiB cap"
  pins that small transcripts pass through unchanged with no marker.
- "head+tail truncates with a single marker line when the JSONL exceeds
  the 384 KiB cap" pins that output is bounded, header survives, exactly
  one marker emitted with non-zero omittedBytes, both ends of the body
  preserved, and at least one middle message omitted.

Suite delta: +2 tests in finalize-design.test.ts.

Refs nexu-io/open-design#450

* fix(daemon): resolve noUncheckedIndexedAccess in truncateTranscriptForPrompt

D1 (0eaa123) shipped with `body[headIndex]` and `body[i]` typed as
`string | undefined` under TypeScript's `noUncheckedIndexedAccess`
strict mode. Local typecheck would have caught it but the prior
verification piped through `tail` which masked the non-zero exit code
of `tsc`. Coalesce each access via `?? ''` (the array is from
`String.split('\n')` so undefined elements are not actually reachable;
the coalesce is a type-narrowing convenience, not a behavior change).

Verification: `pnpm --filter @open-design/daemon typecheck` exits 0;
`pnpm --filter @open-design/daemon test finalize-design` shows 2/2 +
1 skipped, identical to the pre-fix run.

Refs nexu-io/open-design#450

* feat(daemon): current-artifact resolver for /finalize

Phase E of the PR 2 plan for issue #450: resolves which artifact (if
any) accompanies the transcript + design system in the synthesis prompt.

Priority order (Bryan-locked in plan §6):
  1. The file referenced by tabs.is_active = 1 IF an
     <name>.artifact.json sidecar exists on disk. Sidecar presence is
     the discriminator: an inferred manifest from
     `inferLegacyManifest` (e.g. for a bare .html with no sidecar)
     does NOT count, and an active tab pointing at a non-artifact file
     (.md, .txt) falls through.
  2. Newest project file with a real .artifact.json sidecar, sorted by
     manifest.updatedAt descending. Files without an updatedAt sort
     last so legacy pre-streaming manifests do not get accidentally
     promoted.
  3. Returns null - "no artifact in scope". The Phase H caller will
     emit `artifact: null` in the response and the prompt's "Current
     artifact" section will read "none".

Sidecar presence is checked via `existsSync` on the on-disk path, NOT
via the `artifactManifest` field returned by readProjectFile/listFiles
(those run inferLegacyManifest as a fallback for known kinds, which
would otherwise cause a bare .html with no sidecar to look like an
artifact).

Tests:
- "returns the active-tab artifact when its sidecar is present, even
  if a newer artifact exists elsewhere": pinned.html (older
  updatedAt) is in the active tab; newer.html (newer updatedAt) is
  not. Resolver returns pinned.html - intent (active tab) beats
  recency.
- "falls through to newest .artifact.json when active tab points at a
  non-artifact file": README.md is the active tab (no sidecar);
  design.html has a real sidecar. Resolver falls through and returns
  design.html.
- "returns null when no active tab and no .artifact.json sidecars
  exist": only a README.md is in the project; no tabs row. Resolver
  returns null.

Suite delta: +3 tests in finalize-design.test.ts (5 active total).

Refs nexu-io/open-design#450

* feat(daemon): synthesis prompt construction for /finalize

Phase F of the PR 2 plan for issue #450: builds the system + user
prompts that get sent to Anthropic's Messages API in the synthesis
call. Pure function; no IO, no side effects.

System prompt (literal, stored as a module-level constant): instructs
Claude to emit a DESIGN.md document with a fixed 7-heading structure
(# DESIGN.md / ## Summary / ## Brand & Voice / ## Information
Architecture / ## Components & Patterns / ## Visual System / ## Open
Questions / ## Provenance). The Provenance section is required to list
project ID, design system, current artifact, transcript message count,
and the UTC generation timestamp.

User prompt (built at runtime): structured payload with the truncated
transcript JSONL, the design system body, and the current artifact
body, each under a ## heading. Missing inputs (no design system
selected, no artifact in scope) produce explicit "none" headings +
parenthetical placeholder body so Claude does not hallucinate content
for absent sections.

Truncation is the caller's concern - this function does not
re-truncate. The caller (Phase H pipeline) feeds in a JSONL that has
already been bounded by truncateTranscriptForPrompt.

Tests:
- "includes the transcript JSONL verbatim and the generation context":
  pins all section headings, the transcript body verbatim, the design
  system body verbatim, the artifact body verbatim, and every
  generation-context line.
- "falls back to \"none\" + parenthetical when no design system is
  selected": designSystemId=null and designSystemBody=null -> heading
  reads "## Active design system: none" with the parenthetical body.
- "falls back to \"none\" + parenthetical when no artifact is in
  scope": artifact=null -> heading reads "## Current artifact: none"
  with the parenthetical body.

Suite delta: +3 tests in finalize-design.test.ts (8 active total).

Refs nexu-io/open-design#450

* feat(daemon): Anthropic call + retry strategy for /finalize

Phase G of the PR 2 plan for issue #450: lands the upstream Claude
Messages API call with a single transient-error retry, plus the
response extractor that turns Anthropic's content array into the
DESIGN.md body.

What lands here:
- appendVersionedApiPath: inlined from the connectionTest helper at
  apps/daemon/src/connectionTest.ts:188-195 (it is not exported there).
  Appends /v1/messages when the base URL has no /vN segment, otherwise
  appends /messages directly. Same semantics; ~5 lines.
- callAnthropicWithRetry: POSTs to <base>/v1/messages with the canonical
  Anthropic headers (content-type, x-api-key, anthropic-version:
  2023-06-01) and body shape ({ model, max_tokens, system, messages,
  stream:false }). One retry on transient (HTTP 429 or 5xx); on terminal
  failure throws FinalizeUpstreamError carrying the upstream HTTP
  status and raw body text. The route handler in Phase I maps status
  to AUTH_FAILED / RATE_LIMITED / UPSTREAM_FAILED and runs the body
  through redactSecrets before exposing it as `details`.
- extractDesignMd: concatenates content[].text for every block where
  type === 'text', preserving order. Throws FinalizeUpstreamError(502)
  on three malformed-response shapes: non-object payload, missing
  content array, zero text blocks. The route handler maps the throw
  to 502 UPSTREAM_FAILED so synthesis cannot land a half-empty
  DESIGN.md on disk.
- Test-only `_sleepMs` injection on the call params so the retry-delay
  sleep is instant under vitest. Default sleep uses setTimeout.

Retry posture (1 retry on transient) is opinionated; the maintainer's
"standard exponential backoff" answer was directional and a single
retry matches the existing daemon's posture (transcript export and
connectionTest do zero retries) while staying inside the daemon's
blocking-fast posture for /finalize.

Tests:
- callAnthropicWithRetry: throws on 401 with no retry; retries once
  on 429 and resolves on second 200; throws after both 5xx attempts;
  propagates AbortError when signal is pre-aborted.
- extractDesignMd: concatenates ordered text blocks; throws on
  missing content array; throws on content with zero text blocks.

A spurious typecheck error from `exactOptionalPropertyTypes` (signal
typed as AbortSignal | undefined where RequestInit expects
AbortSignal | null) was resolved by conditionally spreading signal
into the RequestInit literal.

Suite delta: +7 tests in finalize-design.test.ts (15 active total).

Refs nexu-io/open-design#450

* feat(daemon): wire /finalize pipeline end-to-end

Phase H of the PR 2 plan for issue #450: stitches together every
phase D-G primitive into the full finalizeDesignPackage pipeline that
the route handler in Phase I will expose over HTTP.

Pipeline (in execution order, all inside a try/finally that always
releases the lockfile):
1. getProject(db, projectId): defensive 404 (the route validates first;
   this throw catches direct CLI/script callers).
2. mkdirSync(<projectDir>, { recursive: true }): some projects have DB
   rows but no on-disk dir yet (PR #493's same fix).
3. fs.openSync(.finalize.lock, 'wx'): EEXIST -> FinalizePackageLockedError
   (mirror PR #493's TranscriptExportLockedError).
4. exportProjectTranscript(db, projectsRoot, projectId, { now }): produces
   .transcript.jsonl on disk; we read the body and run it through
   truncateTranscriptForPrompt to bound the prompt-inclusion size.
5. readDesignSystem(designSystemsRoot, designSystemId): returns null when
   the project has no design_system_id selected, when the design system
   directory does not exist, or when the DESIGN.md file is missing.
6. resolveCurrentArtifact(db, projectsRoot, projectId): active tab ->
   newest .artifact.json by manifest.updatedAt -> null.
7. buildSynthesisPrompt({...}): system + user prompt (per Phase F).
8. callAnthropicWithRetry({...}): one retry on 429/5xx; throws
   FinalizeUpstreamError on terminal failure.
9. extractDesignMd(payload): concatenates content[].text blocks; throws
   FinalizeUpstreamError(502) on malformed shape.
10. Atomic write: writeFileSync({flag:'wx'}) -> reopen for fsync ->
    rename. Errors unlink tmp before rethrowing.
11. Lock release in finally (always closeSync + unlinkSync).

Bounded blocking: the function uses its own AbortController + 120s
timeout when the caller does not supply a signal. Caller-supplied
signal takes precedence.

Type tightening: switched the local Db interface to
`type Db = Database.Database` (better-sqlite3) so the function signature
is compatible with `exportProjectTranscript`'s typed parameter. Source
file already had a `better-sqlite3` import in claude-design-import area
of the daemon, so no new dependency.

Tests:
- "writes DESIGN.md atomically on the happy path": end-to-end with
  seeded project + conversation + 2 messages + design system on disk;
  asserts file at exact path + body bytes match the fetch mock.
- "response carries every documented field with correct types":
  designMdPath/bytesWritten/model/inputTokens/outputTokens/artifact/
  transcriptMessageCount/designSystemId all present and typed.
- "emits design system 'none' in the prompt when no design_system_id is
  set": fetch mock asserts on the body it receives.
- "throws FinalizePackageLockedError when .finalize.lock is already
  held": pre-create lockfile; assert throw + DESIGN.md not written +
  pre-existing lock NOT unlinked (we did not own it).
- "replaces an existing DESIGN.md atomically on a second finalize":
  inject a sentinel between two finalize calls; assert sentinel is
  gone after second run.
- "cleans up tmp file AND lock file on every error path": mock
  fs.writeFileSync to throw on the tmp path; assert no DESIGN.md.tmp.*
  remain, no DESIGN.md, no .finalize.lock.
- "uses the default https://api.anthropic.com baseUrl when baseUrl is
  omitted": fetch URL begins with the default; baseUrl=undefined path.

vi.restoreAllMocks() now runs in afterEach so the writeFileSync spy
from the cleanup test does not leak into subsequent tests.

Suite delta: +7 tests in finalize-design.test.ts (22 active total).

Refs nexu-io/open-design#450

* feat(daemon): /finalize HTTP route handler + error mapping

Phase I of the PR 2 plan for issue #450: replaces the Phase C stub's
catch-all 500 with status-aware error mapping that surfaces the right
HTTP status + error code for each documented failure mode, and adds
HTTP-layer tests that boot startServer to exercise the route's
validation branches.

Route handler changes:
- :id format guard: an inline regex matching isSafeId at
  apps/daemon/src/projects.ts:556-558 rejects unsafe ids with 400
  BAD_REQUEST before any DB or filesystem work. Without this, an id
  like 'bad!id' would either fail getProject as 404 (wrong code) or
  reach the function and throw 'invalid project id' (mapped to 500).
- FinalizeUpstreamError mapping is now status-aware:
  - upstream 401 -> 401 AUTH_FAILED
  - upstream 429 -> 429 RATE_LIMITED
  - upstream 5xx (or our own 502 sentinel for malformed responses)
    -> 502 UPSTREAM_FAILED
  In all cases the upstream raw text is run through redactSecrets so
  the apiKey cannot leak through `details` even if the upstream
  echoes the inbound headers.
- AbortError mapping: when the 120s AbortController fires (or the
  caller pre-aborted the signal), surface as 503 TIMEOUT.
- Default case: console.error the error per daemon convention; client
  sees 500 INTERNAL with the message routed through redactSecrets.
- Imported redactSecrets alongside the existing connectionTest
  imports (apps/daemon/src/server.ts:51).

HTTP-layer tests (boot startServer({port:0,returnServer:true}) once
in beforeAll, mirror the proxy-routes.test.ts pattern):
- "400 BAD_REQUEST when baseUrl is not a valid URL (test #13)":
  baseUrl='not-a-url'.
- "403 FORBIDDEN when baseUrl points at a private internal IP (test
  #14)": baseUrl='http://10.0.0.1'. Note: validateBaseUrl explicitly
  allows loopback (for local OpenAI-compatible servers) and only
  blocks non-loopback private IPs (10/8, 172.16/12, 192.168/16,
  fc00::/7, fe80::/10).
- "400 BAD_REQUEST when apiKey is missing (test #15)": apiKey omitted.
- "400 BAD_REQUEST when :id contains characters outside the safe-id
  regex (test #16)": id='bad!id' contains '!' which is not in
  [A-Za-z0-9._-].

Suite delta: +4 tests (26 active in finalize-design.test.ts).
Full daemon suite: 1078/1078 pass; baseline+26 (the +5 above plan
target reflects retry+extract split into more granular unit tests
than originally enumerated; all real, none skipped).

Refs nexu-io/open-design#450

* fix(daemon): tighten isSafeId to reject pure-dot project ids

Addresses the P1 path-traversal finding from @lefarcen on PR #832
(https://github.com/nexu-io/open-design/pull/832#discussion_r3202512644).

The pre-fix `isSafeId` at apps/daemon/src/projects.ts:556-558 used
regex `/^[A-Za-z0-9._-]{1,128}$/` which permitted pure-dot ids
(`.`, `..`, `...`) because `.` is in the character class. `projectDir`
and `resolveProjectDir` both delegated to `isSafeId`, so an id of
`..` would resolve to the PARENT of `.od/projects/` via `path.join`.

Threat model (per @lefarcen):
- An attacker creates a project row whose stored id is `..` (or
  another pure-dot variant) — for instance via a workflow that
  writes the row directly without going through the API. Subsequent
  finalize/write ops keyed by that id then escape the project tree.
- A direct CLI / scripted caller passing `..` as the project id
  reaches the function without HTTP normalization saving us. (Express
  normalizes %2e%2e to .. and collapses path segments, which yields
  404 for the URL `/api/projects/%2e%2e/...` in practice — but that's
  Express's protection, not ours.)

Fix:
- isSafeId now explicitly rejects pure-dot ids (`/^\.+$/.test(id)`)
  before the char-class regex check. Empty string and inputs longer
  than 128 chars are also rejected explicitly so the function fails
  closed on edge cases.
- isSafeId is now exported from apps/daemon/src/projects.ts so the
  /finalize route handler in apps/daemon/src/server.ts can use the
  same validator instead of re-implementing the regex inline. This
  prevents drift between the route guard and the projectDir guard,
  which was how this hole originally appeared.

Tests (in finalize-design.test.ts because that's where the threat was
flagged; isSafeId is daemon-wide so a dedicated test file would also
work):
- isSafeId rejects `.`, `..`, `...`, `....`
- isSafeId rejects ids with `/`, `\`, `!`, leading whitespace
- isSafeId rejects empty string and >128 chars
- isSafeId rejects non-string inputs (null/undefined/number)
- isSafeId accepts plain ids, ids with mid-string dots, UUIDs, single chars

Suite delta: +7 tests (33 active in finalize-design.test.ts).
Full daemon suite: 1085/1085.

Refs nexu-io/open-design#832

* fix(daemon): address PR #832 P1 findings — imported folders + network 502

Addresses two of the three P1 findings from @lefarcen on PR #832:

1. Imported-folder projects route DESIGN.md to metadata.baseDir
   (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512656,
   also flagged independently by @chatgpt-codex-connector at #discussion_r3202430470)

   The pipeline previously called `projectDir(projectsRoot, projectId)`
   unconditionally, which resolves to `.od/projects/<id>`. For projects
   created via /api/import/folder the project row's `metadata.baseDir`
   carries the user's actual folder; without threading metadata through,
   finalize would silently land DESIGN.md in the hidden daemon data dir
   and the current-artifact resolver would miss the user's real files.

   Fix: switch from `projectDir` to `resolveProjectDir(projectsRoot,
   projectId, metadata)` in both `finalizeDesignPackage` and
   `resolveCurrentArtifact`. Thread `project.metadata` (from
   `getProject`'s normalized row) through both call paths. The resolver
   gets a new optional `metadata` parameter; native projects pass null
   and get identical behavior.

2. Network failures and JSON parse errors now map to 502 UPSTREAM_FAILED
   (https://github.com/nexu-io/open-design/pull/832#discussion_r3202512661)

   Pre-fix, only HTTP-non-OK responses were wrapped as
   FinalizeUpstreamError. DNS failures (ECONNREFUSED, ENOTFOUND), fetch
   TypeErrors, and `response.json()` SyntaxErrors fell through to the
   route's catch-all and surfaced as 500 INTERNAL — incorrect: those are
   upstream-level failures, not daemon bugs.

   Fix:
   - Wrap callAnthropicWithRetry in a try/catch that passes
     FinalizeUpstreamError and AbortError through verbatim, but rewraps
     any other thrown error as FinalizeUpstreamError(502, '', message).
   - Wrap response.json() in a try/catch that rewraps SyntaxError as
     FinalizeUpstreamError(502, '', "upstream Anthropic returned non-JSON
     body: ...").
   - The route handler's existing FinalizeUpstreamError mapping then
     correctly maps these to 502 with the message in `details` (run
     through redactSecrets first).

Tests:
- "writes DESIGN.md under metadata.baseDir for imported-folder projects":
  inserts a project row with metadata.baseDir pointing at a
  user-folder temp dir; asserts result.designMdPath lands there AND
  the hidden .od/projects/<id> dir does NOT contain a DESIGN.md.
- "rewraps fetch network rejection as FinalizeUpstreamError(502)":
  fetchImpl throws TypeError with cause.code='ENOTFOUND'; assert thrown
  error has name=FinalizeUpstreamError and status=502.
- "rewraps 200 with non-JSON body as FinalizeUpstreamError(502)":
  fetchImpl returns 200 with text/html body; response.json() throws
  SyntaxError internally; assert FinalizeUpstreamError(502).

Suite delta: +3 tests (36 active in finalize-design.test.ts).
Full daemon suite: green at last check; will re-verify before push.

Refs nexu-io/open-design#832

* refactor(daemon): move /finalize DTOs to contracts + map error codes + validate active-tab

Addresses the P2 and P3 findings from @lefarcen on PR #832:

P2 — Error codes + DTOs not in packages/contracts
  https://github.com/nexu-io/open-design/pull/832#discussion_r3202512673

  Reverses my plan's locked decision #10 ("no contracts changes in this
  PR; inline the request/response types"). That rule came from the
  predecessor PROMPT brief's anti-pattern table; @lefarcen's review is
  fresher signal and supersedes it. Drift risk between the daemon's
  inline types and any future PR 3 web client is real.

  - New contracts module: packages/contracts/src/api/finalize.ts with
    FinalizeAnthropicRequest / FinalizeArtifactRef /
    FinalizeAnthropicResponse. Re-exported from the package root and
    made addressable via `@open-design/contracts/api/finalize` subpath.
  - Daemon source imports the canonical types from contracts and
    re-exports the public type names so internal references keep
    working without touching every call site.
  - Daemon-local error codes remapped to existing ApiErrorCode union
    members (apps/daemon/src/server.ts), per @lefarcen's suggested
    mapping:
      FINALIZE_IN_PROGRESS -> CONFLICT
      AUTH_FAILED          -> UNAUTHORIZED
      UPSTREAM_FAILED      -> UPSTREAM_UNAVAILABLE
      TIMEOUT              -> UPSTREAM_UNAVAILABLE (status 503)
      INTERNAL             -> INTERNAL_ERROR
    HTTP status codes are unchanged; only the `code` field in the
    error JSON body changed.

P3 — Active-tab name not validated before sidecar probe
  https://github.com/nexu-io/open-design/pull/832#discussion_r3202512684

  resolveCurrentArtifact now runs the active tab's name through
  validateProjectPath BEFORE composing it into a path.join expression.
  An invalid tab (traversal segments, absolute path, null byte,
  reserved segment) causes resolveCurrentArtifact to fall through to
  the newest-artifact branch rather than abort or probe outside the
  project directory.

Tests:
- "falls through (does not throw) when active tab name contains
  traversal segments": injects a malformed `tabs.name =
  '../../../etc/passwd'` row directly via SQL (bypassing production
  tab-creation validation), seeds a real artifact, asserts the
  resolver returns the real artifact rather than the malformed name.

Suite delta: +1 test (37 active in finalize-design.test.ts).
Full daemon suite: 1089/1089 green.

Refs nexu-io/open-design#832

* fix(contracts): publish /api/finalize as standalone runtime entrypoint

Addresses @mrcfps's CI-red review on PR #832
(https://github.com/nexu-io/open-design/pull/832, inline comment on
packages/contracts/package.json).

The previous J3 commit added `./api/finalize` as a type-only subpath:
the entry had only a `types` field, no `default`. That broke the
contracts package-runtime gate (packages/contracts/tests/package-
runtime.test.ts:38-47) which asserts every exports entry exposes both
a `.mjs` runtime and a `.d.ts` types target. mrcfps proposed two fixes;
this commit takes path B — make finalize a first-class published
module rather than a type-only re-export from the package root.

Path B vs path A (a peer-AI second opinion via /collaborate confirmed):
under NodeNext + ESM with exports-map semantics, TypeScript validates
re-exported symbols against the published module-identity surface.
Because the previous J3 had `./api/finalize` neither declared as an
exports-map entry nor materialized as a standalone .mjs, TS omitted
the re-exported names during package boundary analysis. Even at
runtime `import('@open-design/contracts').FINALIZE_SCHEMA_VERSION`
worked from the bundled index.mjs but the type-checker rejected it.
Path B aligns the runtime and declaration surfaces.

Changes:
- packages/contracts/esbuild.config.mjs: add `./src/api/finalize.ts`
  to entryPoints so dist/api/finalize.mjs is generated as a standalone
  module rather than only inlined into the bundled root.
- packages/contracts/package.json: re-add `./api/finalize` to the
  exports map with both `default: ./dist/api/finalize.mjs` AND
  `types: ./dist/api/finalize.d.ts`. Mirrors `./api/connectionTest`'s
  shape (the canonical pattern for first-class submodule entries).
- packages/contracts/src/api/finalize.ts: keep the runtime export
  `FINALIZE_SCHEMA_VERSION = 1` (giving the standalone module a real
  value to emit beyond the type-only interfaces) and update the
  doc-comment now that the standalone .mjs is wired.
- apps/daemon/src/finalize-design.ts: switch the type import from
  the inline declarations introduced in the prior J3 fallback to
  `import type { ... } from '@open-design/contracts/api/finalize'`.
  Re-export the names so internal references inside finalize-design.ts
  keep working without touching every call site.

Verified:
- node --input-type=module -e "import('@open-design/contracts/api/finalize').then(m=>console.log(JSON.stringify(Object.keys(m))))"
  prints ["FINALIZE_SCHEMA_VERSION"] — runtime resolution clean.
- pnpm --filter @open-design/contracts test: 6/6 (including both
  package-runtime.test.ts cases on the rebuilt exports map).
- pnpm --filter @open-design/daemon typecheck: exits 0.
- pnpm --filter @open-design/daemon test: 1089/1089 (no regression vs
  the prior J3 number).

Refs nexu-io/open-design#832

---------

Co-authored-by: DevForgeAI CI/CD Engineer <devforge-ai@development.ai>
2026-05-08 19:52:11 +08:00
Tom Huang
56bf6ee1b6
feat: agent-callable research command and /search (#615)
* feat: pre-generation research (Tavily) for grounded generation

Adds an optional pre-generation research step so the agent can produce
slides / prototypes / decks grounded in real sources instead of guessing.

User flow:
  1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY).
  2. Click the new Research button in the chat composer.
  3. On send, the daemon runs a Tavily search, prepends the findings
     as a <research_context> block ahead of the system prompt, and
     spawns the agent. Research progress shows up as status pills in
     the chat stream; the agent cites sources inline as [1]/[2]/...

Phase 1 surface:
  - Single provider (Tavily), single depth ('shallow'), no LLM
    synthesis pass (Tavily's `answer` is the summary).
  - Composer toggle only; no popover / depth picker yet.
  - Reuses the existing `status` SSE agent payload + StatusPill UI
    so no new event variants or renderer code are needed.

Layers touched:
  - contracts: ResearchOptions / Source / Findings DTOs;
    ChatRequest.research; export from index.
  - daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator
    + provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook
    in startChatRun before prompt assembly.
  - web: ChatComposer toggle + ChatSendMeta; threaded through
    ChatPane / ProjectView / streamViaDaemon into ChatRequest.

Side fix (required to land the feature, but useful on its own):
  contracts internal relative imports lacked the `.js` suffix that
  NodeNext module resolution requires. This was already breaking
  `pnpm --filter @open-design/daemon typecheck` on main; without the
  fix, none of the new research types were visible to the daemon.
  All internal contracts imports now carry `.js`.

Spec: specs/current/research-feature.md (phases 2-4 outlined for
follow-up: composer popover, multi-provider, deep recursion, example
skills with research_recommends).

Verified:
  - pnpm --filter @open-design/contracts typecheck/test
  - pnpm --filter @open-design/daemon typecheck (the chokidar
    project-watchers test is a pre-existing flake, unrelated)
  - pnpm --filter @open-design/web typecheck
  - node scripts/verify-media-models.mjs

* fix(daemon): clamp Tavily max_results to 20

Tavily's /search endpoint requires `max_results` in [0, 20]; sending a
larger value (e.g. when `research.depth: "deep"` resolves to 30) returns
400 and `runResearch` silently falls back to no-research. Clamp at the
provider boundary so Phase 2 depth tiers above 20 still produce results
instead of failing the request.

Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)

* Remove stale research merge leftovers

* Add agent-callable research search

* Fix Indonesian locale typecheck

* Fix research command invocation edge cases

* Harden slash search prompt expansion

* Honor research source caps in command contract

* Require search reports in design files

* Add research data provider settings

* Wire web research provider fallback order

* Update research provider fallback wording

* Revert "Update research provider fallback wording"

This reverts commit 86fb6001e3.

* Revert "Wire web research provider fallback order"

This reverts commit 4c9e16036b.

* Revert "Add research data provider settings"

This reverts commit 23630d1746.

* Add Dexter and Last30Days research skills

* Add DCF and Last30Days OD skills

* Add Last30Days and Dexter skills

* Resolve research review threads

---------

Co-authored-by: a1chzt <chizblank@gmail.com>
2026-05-08 10:33:44 +08:00
lefarcen
2bb029cb58
release: Open Design 0.5.0 (#820)
0.5.0 已从 c21cbc6 发布(https://github.com/nexu-io/open-design/releases/tag/open-design-v0.5.0);本次 squash 把版本 bump 与 CHANGELOG [0.5.0] 条目带到 main 历史,便于后续 0.5.1 release 在 main 上走标准 dispatch 流程。
2026-05-08 00:41:01 +08:00
monshunter
e6e5928be1
feat(web): add connection tests for execution settings (#507)
* feat(settings): add connection test for providers and CLI agents

Adds a "Test" action in the Settings dialog that verifies the configured
provider (Anthropic/OpenAI/Azure/Google) or CLI agent without sending a
real chat. Backed by a new daemon endpoint and shared contracts, with
categorized inline statuses and i18n strings across all supported locales.

* fix(settings): address connection test review feedback

* fix(daemon): pass empty MCP servers for connection probes

* fix(connection-test): address review blockers

* fix(daemon): fail json stream runs on structured errors

* fix(contracts): build connection test subpath export

* Use draft CLI env in agent connection tests

* fix(i18n): add fallback ids for new curated content
2026-05-07 11:25:37 +08:00
lefarcen
ae4a08773a
chore(release): prepare 0.4.1 (#659)
- bump remaining monorepo package.json files to 0.4.1 after apps/packaged was already bumped in #637
- add CHANGELOG.md [0.4.1] - 2026-05-06 entry covering the startup hotfix and 19 merged PRs since 0.4.0:
  - Added: manual edit mode (#620), Cmd/Ctrl+P quick file switcher (#556), resizable chat panel (#563), PI status/cancel updates (#618), accessibility and RTL/Bidi craft modules (#587, #595), i18n structure checks (#608)
  - Changed: first-PR README links now surface help-wanted issues (#605)
  - Fixed: packaged contracts runtime exports (#577), packaged runtime beta gating (#637), ACP/MCP/agent fixes (#604, #612, #627), conversation error recovery (#623), native mac quit (#637)
  - Documentation/Internal: OD_DATA_DIR migration docs (#570), Simplified Chinese QUICKSTART (#578), zh-TW/ko README syncs (#586, #619), generated metrics (#592)

Release workflow validation runs after merge via release-stable.
2026-05-06 18:05:56 +08:00
iulian
14a73d948b
Fix packaged contracts runtime exports (#577)
* fix packaged contracts runtime exports

* fix(packaging): prepare contracts runtime exports on install
2026-05-06 09:11:35 +08:00
lefarcen
963bbf2500
release: Open Design 0.4.0 (#454) 2026-05-05 23:39:40 +08:00
PerishFire
bbdd4e84b5
chore: enforce test directory conventions (#496)
* chore: enforce test directory conventions

Move package, app, and tool tests out of src and add guard enforcement so source directories stay source-only.

* ci: use guard and package-scoped tests

Run the new repository guard in CI and keep test execution aligned with package-scoped commands after removing root aliases.

* ci: align stable release guard check

Use the new repository guard in stable release verification after replacing the residual-JS-only script.

* chore: tighten test layout enforcement

Enforce sibling tests directories, typecheck moved test suites with dedicated configs, and refresh remaining guidance that pointed at src-based tests.

* chore: clarify no-emit test tsconfigs

Explicitly disable declaration-only emit in test tsconfigs so review tooling sees they are no-emit typecheck configs.
2026-05-05 15:34:22 +08:00
Nagendhra Madishetti
47eeaf445d
feat: Critique Theater foundation (contracts + parser, Phases 0-2) (#387)
* docs(specs): add Critique Theater design spec for panel-tempered artifacts

* docs(specs): add Critique Theater implementation plan

* docs(specs): rename UI to Design Jury, add lane-density modes, ship-rule explainer, label sizing

* feat(contracts): add CritiqueConfig schema and defaults

* fix(contracts): apply Task 1.1 review (CRITIQUE_PROTOCOL_VERSION rename, descriptions, RoleWeights export)

* feat(contracts): add PanelEvent discriminated union and isPanelEvent guard

* fix(contracts): apply Task 1.2 review (exhaustive event-type list, runId guard, import order)

* feat(contracts): add CritiqueSseEvent variants and panelEventToSse mapper

* test(daemon): add v1 wire-protocol golden fixtures for Critique Theater parser

* feat(daemon): add v1 streaming parser for Critique Theater wire protocol

* chore(contracts): add .js extensions to relative imports for NodeNext consumers

* fix(daemon): satisfy noUncheckedIndexedAccess in v1 parser regex match access

* test(daemon): cover parser failure modes; fix unclosed-PANELIST swallow bug

* fix(daemon,contracts): address PR #387 review

- parser now clamps panelist + DIM scores against the run-declared scale
  captured from <CRITIQUE_RUN scale=...>, not a hardcoded 100
- PANELIST appearing before any <ROUND n=...> opens now throws
  MalformedBlockError rather than emitting events with NaN round
- DIM_RE and MUST_FIX_RE hoisted to module scope and lastIndex reset per
  call so the parser hot path stops recompiling regex per artifact
- overflow check after drain simplified to a plain buf.length > cap test
  (the prior compound condition was always true on the right side and
  obscured intent)
- scoreThreshold <= scoreScale refine gains a 1e-9 epsilon so floating
  slack does not reject semantically valid configs
- round-1 designer ARTIFACT guard gains a comment naming the spec
  invariant and the v2 relaxation path
- 3 new regression tests cover the panelist-without-round, scale=10
  clamp, and scale=20 plumbing cases

* docs(specs): rationale for non-goals, failure-mode rate targets, Phase 10 matrix, Phase 14 doc layout

* Merge branch 'main' into feat/critique-theater

Resolves the contracts/index.ts conflict by keeping the .js extensions added
by chore(contracts) 2d6e8d6 and slotting in the new export for ./api/app-config
introduced upstream by #255 (9d700ec). Critique Theater additions
(./sse/critique, ./critique) preserved in their original positions.

Verified after merge:
  pnpm --filter @open-design/contracts test    -> 10/10 pass
  pnpm --filter @open-design/contracts typecheck -> exit 0
  pnpm --filter @open-design/daemon typecheck  -> exit 0
  pnpm --filter @open-design/web typecheck     -> exit 0

Two daemon tests in tests/media-config.test.ts fail both before and after the
merge because they read real OAuth credentials from the developer machine
instead of using mock fixtures. That's an upstream isolation issue on
origin/main, not something this branch introduces.

* fix: unblock web build and address mrcfps PANELIST oversize bypass

The chore commit that added .js extensions to satisfy daemon's nodenext
typecheck broke apps/web's Next.js build, because webpack tried to resolve
the literal ./common.js when only common.ts exists on disk. Replaced with
a subpath approach: contracts/exports gains a './critique' entry pointing
straight at src/critique.ts (which has no relative imports), and daemon
imports route through @open-design/contracts/critique instead of the
barrel. Web keeps the bundler-friendly barrel; daemon's nodenext walks
only the leaf module. All 13 contracts source files reverted to no-.js.

Separately, mrcfps flagged that parserMaxBlockBytes was only enforced on
the leftover buffer after drain returned, so a complete oversized block
arriving in one chunk slipped past the cap. Added an explicit per-block
size check inside drain for every buffered block type (PANELIST,
ROUND_END, SHIP). Three regression tests yield the whole stream as a
single chunk and assert OversizeBlockError fires before any events emit.

* fix(daemon): close three v1 parser invariant gaps from mrcfps review

Three independent gaps that all let malformed or oversized protocol
output pass the v1 envelope contract:

(1) Envelope guard. ROUND, PANELIST, ROUND_END, and SHIP now throw
MalformedBlockError when state.inRun is false. Without this, a stream
that omits <CRITIQUE_RUN> could still emit panelist_* events without
the run_started handshake, leaving downstream reducers with no run-level
config.

(2) UTF-8 byte length. Both the per-block size check and the post-drain
buf-size check now compare Buffer.byteLength(text, 'utf8') against
parserMaxBlockBytes. The previous string-length comparison let multibyte
content (CJK, emoji) inside <NOTES>/<SUMMARY> exceed the configured
byte cap while staying under the JS string length cap, bypassing the
daemon's resource guard.

(3) Header-end ordering. PANELIST, ROUND_END, and SHIP now require the
opener's > to appear before the matched closing tag. A malformed opener
like <PANELIST role="x" score="8"</PANELIST> previously fell through
to the closing tag's > and emitted events for an invalid block.

Four regression tests cover each gap (ROUND-without-run,
SHIP-without-run, multibyte-byte-cap, malformed-opener).

* fix(lockfile): regenerate to include contracts zod + vitest entries

The earlier conflict resolution took main's lockfile and ran pnpm
install, but the install pass on Windows didn't write the contracts
package's zod and vitest entries back into the lockfile. CI's
--frozen-lockfile install rejected the resulting state. Re-running
pnpm install with --no-frozen-lockfile rewrites the lockfile so it
now matches every package.json across the workspace, including
contracts/zod ^3.23.8 and contracts/vitest ^2.1.8. Verified locally:
pnpm install --frozen-lockfile passes.

---------

Co-authored-by: Nagendhra <nagendhra405@gmail.com>
2026-05-04 20:28:28 +08:00
lefarcen
016c08183f
release: Open Design 0.3.0 2026-05-03 23:07:28 +08:00
lefarcen
62b01a6dbf
release: Open Design 0.2.0 (#297) 2026-05-02 22:28:59 +08:00
nettee
56d08b8c5f
Add shared contracts and migrate project code to TypeScript (#118) 2026-04-30 13:01:15 +08:00