open-design/craft

Craft references

Brand-agnostic craft knowledge. Each file is a small, dense rulebook on one dimension of professional UI craft (typography, color, motion, …). Skills opt into the references they need; the daemon injects only the requested ones into the system prompt above the active skill body.

Why a third axis next to skills/ and design-systems/

| Axis | Scope | Example |
| --- | --- | --- |
| skills/ | Artifact shape | saas-landing, dashboard, pricing-page |
| design-systems/ | Brand visual language (the 9-section DESIGN.md) | linear-app, apple, notion |
| craft/ | Universal craft knowledge, true regardless of brand | letter-spacing rules, accent-overuse caps, anti-AI-slop |

DESIGN.md tells the agent which colors and fonts a brand uses. craft/ tells the agent the universal rules a competent designer applies on top — e.g. ALL CAPS always needs ≥0.06em tracking, regardless of the brand.

How a skill opts in

Add an od.craft.requires array to the skill's front-matter. Only the listed sections are injected, so a skill that needs only typography pays no token cost for color/motion content.

```yaml
od:
  craft:
    requires: [typography, color, anti-ai-slop]
```

Allowed values match the file names in this directory minus the .md extension. Unknown values are silently ignored (forward-compatible).
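As a rough illustration of the opt-in behavior described above, the sketch below resolves a requires list against the sections that currently exist and places them above the skill body. Every name in it (AVAILABLE_SECTIONS, resolveCraftSections, buildPrompt) is hypothetical, the file contents are placeholders, and dropping duplicate slugs is an assumption; the daemon's real implementation may differ.

```typescript
// Hypothetical sketch: names below are illustrative, not the daemon's real API.

/** Slugs that currently have a matching craft/<slug>.md file (placeholder bodies). */
const AVAILABLE_SECTIONS: ReadonlyMap<string, string> = new Map([
  ["typography", "(contents of craft/typography.md)"],
  ["color", "(contents of craft/color.md)"],
  ["anti-ai-slop", "(contents of craft/anti-ai-slop.md)"],
]);

/** Keep only requested slugs that resolve to a file; drop the rest silently. */
function resolveCraftSections(
  requires: readonly string[],
  available: ReadonlyMap<string, string> = AVAILABLE_SECTIONS,
): string[] {
  const seen = new Set<string>();
  const resolved: string[] = [];
  for (const slug of requires) {
    // Unknown slugs are skipped without a warning (forward-compatible).
    // Skipping duplicates is an assumption made for this sketch.
    if (!available.has(slug) || seen.has(slug)) continue;
    seen.add(slug);
    resolved.push(slug);
  }
  return resolved;
}

/** Place the requested craft sections above the active skill body. */
function buildPrompt(
  requires: readonly string[],
  skillBody: string,
  available: ReadonlyMap<string, string> = AVAILABLE_SECTIONS,
): string {
  const craft = resolveCraftSections(requires, available).map(
    (slug) => available.get(slug) as string,
  );
  return [...craft, skillBody].join("\n\n");
}
```

With this sketch, a skill that lists a slug with no matching file yet (for example icons) simply gets the sections that do exist; once a file for that slug is added to the map, the same requires list picks it up with no skill edit.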

Why silent fallback instead of fail-fast?

A skeptical reader will ask: "If a skill requests a planned-but-not-yet-vendored section and the corresponding file doesn't exist yet, shouldn't we warn the user?" We chose forward-compatibility over fail-fast: a skill authored today can list a planned slug and start benefiting the moment the matching craft/<slug>.md is vendored in a follow-up PR, with no skill edit needed. The cost of a missed reference is a missing paragraph in the system prompt, not a broken skill — so the loud failure mode is not worth the friction.

Note for skill authors arriving from older guidance: an earlier draft used motion as the future-slug placeholder. The shipped equivalent today is animation-discipline. Use that one if your skill emits motion.

Enforcement levels

Craft files mix auto-checked rules and guidance.

  • Auto-checked. Rules wired into apps/daemon/src/lint-artifact.ts — currently the P0 list in anti-ai-slop.md (Tailwind-indigo accent, two-stop hero gradients, emoji-as-icons, etc.). The linter reports these as findings back to the UI (for P0/P1 badges) and to the agent (as a system reminder for self-correction). Artifact persistence is not currently hard-blocked on P0 hits.
  • Guidance. The rest. The agent reads the rules, reviewers apply them, the linter doesn't check them.

A purely behavioral craft file (state-coverage, animation-discipline) is guidance unless a specific rule is later promoted into lint-artifact.ts.
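To make the auto-checked level concrete, here is a minimal sketch of what one such rule could look like: it flags Tailwind's default indigo-500 and indigo-600 hex values and returns findings rather than throwing, mirroring the note that persistence is not hard-blocked. The Finding shape, the rule id, and both function names are assumptions for illustration and are not taken from apps/daemon/src/lint-artifact.ts.

```typescript
// Hypothetical sketch: the Finding shape and rule id are illustrative only.

type Severity = "P0" | "P1";

interface Finding {
  rule: string;
  severity: Severity;
  message: string;
}

// Tailwind's default indigo-500 (#6366f1) and indigo-600 (#4f46e5) hex values.
const TAILWIND_INDIGO = /#(?:6366f1|4f46e5)\b/i;

/** Report findings for an artifact's source; never throws or blocks. */
function lintArtifact(source: string): Finding[] {
  const findings: Finding[] = [];
  if (TAILWIND_INDIGO.test(source)) {
    findings.push({
      rule: "tailwind-indigo-accent",
      severity: "P0",
      message: "Accent uses a default Tailwind indigo; use var(--accent) instead.",
    });
  }
  return findings;
}

/** Render findings as a plain-text reminder the agent could self-correct from. */
function toAgentReminder(findings: readonly Finding[]): string {
  if (findings.length === 0) return "";
  const lines = findings.map((f) => `- [${f.severity}] ${f.rule}: ${f.message}`);
  return ["Lint findings on the last artifact:", ...lines].join("\n");
}
```

The same findings array could feed both consumers named above: the UI reads severity for its badges, and the agent receives the rendered reminder text.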

Files

| File | Section name | When to require |
| --- | --- | --- |
| typography.md | typography | Any skill that emits typed content (~all skills) |
| color.md | color | Any skill that emits styled output (~all skills) |
| anti-ai-slop.md | anti-ai-slop | Marketing pages, landing pages, decks |
| state-coverage.md | state-coverage | Any skill with stateful UI (dashboards, mobile apps, forms, list/table views) |
| animation-discipline.md | animation-discipline | Any skill that ships motion: mobile apps, multi-screen flows, gamified UI, transitions, microinteractions |
| accessibility-baseline.md | accessibility-baseline | Any skill that ships interactive UI: dashboards, forms, mobile flows, anything with focus/labels/keyboard paths |
| rtl-and-bidi.md | rtl-and-bidi | Any skill that ships localized text or layout: blogs, docs, financial tables, mobile apps, anything that may render Arabic / Hebrew / Persian |
| form-validation.md | form-validation | Any skill whose primary artifact contains an interactive form: lead capture, sign-in, signup, settings, multi-step intake |
| laws-of-ux.md | laws-of-ux | Any skill whose composition decisions hit named cognitive limits: pricing pages (Hick's, Choice Overload, Von Restorff), dashboards (Pareto, Selective Attention, Working Memory), onboarding (Goal-Gradient, Zeigarnik, Peak-End), modals (Fitts's, Tesler's). Sibling axis to the rendering-rule files above: covers what to compose, not how to render. |

Partial-stateful skills. A skill that's mostly static but contains an embedded form, data table, or query surface should still require state-coverage. The rules then apply to the stateful component, not the whole page.

More sections (icons, craft-details) will be added in follow-up PRs as we wire the linter side.

Attribution

Craft content is adapted from the MIT-licensed refero_skill project (© Refero Design), with edits to fit Open Design's house style and link back to OD's design tokens (var(--accent) etc.) instead of generic Tailwind hex values.