Commit graph

8 commits

Author SHA1 Message Date
Tom Huang
56bf6ee1b6
feat: agent-callable research command and /search (#615)
* feat: pre-generation research (Tavily) for grounded generation

Adds an optional pre-generation research step so the agent can produce
slides / prototypes / decks grounded in real sources instead of guessing.

User flow:
  1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY).
  2. Click the new Research button in the chat composer.
  3. On send, the daemon runs a Tavily search, prepends the findings
     as a <research_context> block ahead of the system prompt, and
     spawns the agent. Research progress shows up as status pills in
     the chat stream; the agent cites sources inline as [1]/[2]/...

Phase 1 surface:
  - Single provider (Tavily), single depth ('shallow'), no LLM
    synthesis pass (Tavily's `answer` is the summary).
  - Composer toggle only; no popover / depth picker yet.
  - Reuses the existing `status` SSE agent payload + StatusPill UI
    so no new event variants or renderer code are needed.

Layers touched:
  - contracts: ResearchOptions / Source / Findings DTOs;
    ChatRequest.research; export from index.
  - daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator
    + provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook
    in startChatRun before prompt assembly.
  - web: ChatComposer toggle + ChatSendMeta; threaded through
    ChatPane / ProjectView / streamViaDaemon into ChatRequest.

Side fix (required to land the feature, but useful on its own):
  contracts internal relative imports lacked the `.js` suffix that
  NodeNext module resolution requires. This was already breaking
  `pnpm --filter @open-design/daemon typecheck` on main; without the
  fix, none of the new research types were visible to the daemon.
  All internal contracts imports now carry `.js`.

Spec: specs/current/research-feature.md (phases 2-4 outlined for
follow-up: composer popover, multi-provider, deep recursion, example
skills with research_recommends).

Verified:
  - pnpm --filter @open-design/contracts typecheck/test
  - pnpm --filter @open-design/daemon typecheck (the chokidar
    project-watchers test is a pre-existing flake, unrelated)
  - pnpm --filter @open-design/web typecheck
  - node scripts/verify-media-models.mjs

* fix(daemon): clamp Tavily max_results to 20

Tavily's /search endpoint requires `max_results` in [0, 20]; sending a
larger value (e.g. when `research.depth: "deep"` resolves to 30) returns
400 and `runResearch` silently falls back to no-research. Clamp at the
provider boundary so Phase 2 depth tiers above 20 still produce results
instead of failing the request.

Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)

* Remove stale research merge leftovers

* Add agent-callable research search

* Fix Indonesian locale typecheck

* Fix research command invocation edge cases

* Harden slash search prompt expansion

* Honor research source caps in command contract

* Require search reports in design files

* Add research data provider settings

* Wire web research provider fallback order

* Update research provider fallback wording

* Revert "Update research provider fallback wording"

This reverts commit 86fb6001e3.

* Revert "Wire web research provider fallback order"

This reverts commit 4c9e16036b.

* Revert "Add research data provider settings"

This reverts commit 23630d1746.

* Add Dexter and Last30Days research skills

* Add DCF and Last30Days OD skills

* Add Last30Days and Dexter skills

* Resolve research review threads

---------

Co-authored-by: a1chzt <chizblank@gmail.com>
2026-05-08 10:33:44 +08:00
Mohamed Abdallah
bc9a49ff48
craft: add laws-of-ux guidance
Adds the laws-of-ux craft guidance for generated UI work.
2026-05-07 20:02:26 +08:00
Mohamed Abdallah
d80402a8ca
craft: add form-validation so generated forms aren't stuck in 2018 RHF/Formik patterns (#625)
* feat(craft): add form-validation + opt-ins on saas-landing, mobile-onboarding

Module 5 of 5 in the behavioral craft series proposed in #501.
Modules 1-4 merged: state-coverage (#502), animation-discipline (#515),
accessibility-baseline (#587), rtl-and-bidi (#595).

Picks up where accessibility-baseline.md ends (label + describedby +
invalid + role=alert for inline errors) and connects the four layers a
real form spans: WHATWG Constraint Validation as the platform floor,
validation timing as a state machine on the input, WCAG 3.3.x as the
announcement and recovery contract, schema as the cross-stack truth.

Sections: input state machine; validation timing (4 rules anchored on
:user-invalid Baseline 2023); Constraint Validation API rules
(setCustomValidity, requestSubmit vs submit, readonly + #11841,
inputmode); error wiring beyond the baseline (adaptive messages,
error summary without role=alert, preserve user input on error);
schema as cross-stack contract (Standard Schema, server-authoritative,
Zod 4 z.email() form); WCAG 3.3.3 / 3.3.4 / 3.3.8 / 3.3.9; native
mobile parity (UIKit, SwiftUI, Compose, Flutter, RN); common mistakes.

Reviewed in 3 loops with Claude CLI Opus 4.7 xhigh effort:
- Loop 1: 6 P0s caught (SwiftUI Form validity claim, SwiftUI
  announcement primitive, Compose semantics syntax, UIKit
  UIAlertController, contradictory Baymard stats, 3.3.8 CAPTCHA
  framing reversed) + 11 P1/P2s; all addressed.
- Loop 2: verified P0 fixes; flagged 1 P1 (RN table row scrambled) +
  4 P2s; all addressed.
- Loop 3: SHIP verdict. Three P2 nits applied (Zod 4 z.email() form,
  WebAIM Million 2026 stat woven in: 51% page-level, 33.1% input-level).

WebAIM Million 2026 numbers verified directly against
webaim.org/projects/million/.

Skill opt-ins: saas-landing (lead capture form), mobile-onboarding
(sign-in screen). Skill bodies do not contain validation-specific
instructions that would override craft guidance — opt-in alone is
sufficient. README updated.

Refs #501.

* fix(craft+skills): form-validation review fixes (lefarcen + mrcfps P2s)

Both non-blocking findings addressed:

- Drop form-validation from saas-landing.craft.requires. The skill body
  produces a CTA-driven landing page with no JS and no interactive
  form. Adding form-validation injected ~221 lines of irrelevant prompt
  pressure and conflicted with the README opt-in rule ("primary
  artifact contains an interactive form"). mobile-onboarding keeps the
  opt-in — sign-in screen is a real form.
- Reword timing rule 4 (async checks). Previous "never block submit on
  a network round-trip" was too broad and conflicted with the
  schema-layer "server is the truth" rule. Split into two paths:
  background preflight (uniqueness, address lookup) doesn't gate the
  form; authoritative submit-path server validation must await the
  server response and surface its field errors. The rule is "don't let
  a slow background check freeze the form," not "don't ever wait for
  the server."

* fix(craft): form-validation mrcfps round-2 (novalidate trade-off, Flutter RTL)

Two non-blocking precision items:

- novalidate trade-off: previous wording said keeping required/pattern/type
  preserves no-JS PE, but a literal server-rendered <form novalidate>
  disables the browser's submit-blocking and validation UI even when
  JS is unavailable — losing the no-JS constraint-validation floor.
  Reworded to spell out the two safe patterns: (A) render <form>
  without novalidate server-side and have the form library set
  form.noValidate = true after hydration, or (B) ship novalidate from
  the start only when the submit path reaches server validation
  without JS. Either way, keep the constraint attributes.
- Flutter announcement example: hardcoded TextDirection.ltr would
  announce Arabic/Hebrew/Persian validation messages with wrong bidi
  direction when this craft is combined with rtl-and-bidi. Switched
  to SemanticsService.announce(message, Directionality.of(context))
  with an explicit warning never to hardcode the direction.

* fix(craft): form-validation mrcfps round-3 (readonly safety, Compose error message)

Two non-blocking precision items:

- Non-input readonly fallback: previous text said `aria-readonly` plus
  hidden mirror input was an option for non-input controls that need
  to submit. But `aria-readonly` doesn't actually stop a `<select>` or
  custom widget from being changed, so the visible control can drift
  while the hidden input ships a stale value — user sees one option,
  server gets another. Tightened: prefer `disabled` plus a same-named
  hidden input, or non-editable text plus hidden input. If using
  `aria-readonly`, the interaction must also be blocked or the two
  values kept in sync.
- Compose error message: previous rule was too absolute about avoiding
  `Modifier.semantics { error("…") }`. `isError = true` flips the
  field state but does not carry the localized error message; Android
  Compose accessibility guidance pairs `isError` with
  `semantics { error(message) }` so the accessibility service gets the
  real text. The trap is duplication, not the API itself. Reframed
  the rule: use both, source the message from the same state field as
  `supportingText` so they stay in sync.

* fix(craft): form-validation Compose live-region API name

Compose row in the native-mobile parity table named a "LiveRegion"
semantic that doesn't exist. Real API is `Modifier.semantics
{ liveRegion = LiveRegionMode.Polite }` on the supporting-text node.
Also replaced the generic `view.announceForAccessibility(…)` with the
Compose-idiomatic `LocalView.current.announceForAccessibility(message)`
so generated snippets compile.
2026-05-06 20:09:30 +08:00
Mohamed Abdallah
be1b3dae40
craft: add rtl-and-bidi so OD artifacts don't break for Arabic / Hebrew / Persian users (#595)
* feat(craft): add rtl-and-bidi + opt-ins on blog-post, docs-page, finance-report

Module 4 of 5 in the behavioral craft series proposed in #501. Modules
1 (state-coverage, #502) and 2 (animation-discipline, #515) merged.
Module 3 (accessibility-baseline, #587) open at time of authoring.

Differentiating niche per the corpus prior-art survey: zero existing
OSS RTL skill is Apache-2.0, framework-agnostic, and aligned with
UAX #9 rev 51. The closest comparators (idanlevi1/rtlify 5★, MIT;
skills-il/localization 7★, MIT) are LTR-web-skewed and don't cover
Flutter Directionality, RN I18nManager, Compose LocalLayoutDirection,
or iOS UIKit semanticContentAttribute / SwiftUI layoutDirection.

Three-loop adversarial review pass via Claude Opus 4.7 xhigh effort
(codex unavailable). Loop 1 caught five revisions (typography spin-out,
WebKit prose compression, mistakes-list trim 12→9, alreq letter-spacing
rename dropped, WebKit r94775 specific revision dropped). Loop 2 caught
one blocking SwiftUI 4 claim and three nits. Loop 3 said ship.

Skill opt-ins picked to avoid PR #587 merge surface: blog-post (long-form
text), docs-page (LTR code islands in RTL prose), finance-report
(numerals + IBAN + currency).

Refs #501.

* fix(craft): rtl-and-bidi review fixes (lefarcen 6 findings)

- P2 #1 WebKit #50949: bug is RESOLVED FIXED, not still open. Verified
  directly against bugs.webkit.org. Removed the broken-WebKit framing;
  the recommendation to prefer <bdi> over CSS now stands on UAX #9
  §2.7 ("prefer markup over CSS or control characters") rather than a
  WebKit bug. Source list updated to drop the dead reference.
- P2 #2 isolate vs embedding controls: U+202C PDF is the
  embedding/override terminator, not an isolate terminator. Split into
  two families: isolate controls (U+2066/2067/2068 + U+2069 PDI) for
  modern code, embedding/override controls (U+202A/202B/202D/202E +
  U+202C PDF) as legacy. Recommend isolates first.
- P2 #3 base direction and language: new section covering
  <html dir lang>, mixed-language subtrees, dir=auto for UGC. Without
  this, agents can follow every other rule and still ship an LTR
  document containing Arabic.
- P2 #4 phone/IBAN/card values: bare <bdi> is unreliable for
  weak/neutral character runs; updated must-mirror bullet and forms
  section to require <bdi dir="ltr">. Added common-mistake entry.
- P3 #1 native mobile budget: added a one-line opt-out hint at the
  top of the section so HTML-only skills know they can skim it. Full
  split into web/native files deferred — the table is 16 lines on a
  176-line file, the cost is bounded.
- P3 #2 lintability: restructured "common mistakes" into three groups
  — mechanically lintable, needs script detection, HTML semantics —
  with explicit exception language (chart axes, physical-object icons,
  platform-pinned UI). Avoids false positives in future linting.

Reviewed via Claude CLI Opus 4.7 xhigh effort (3 loops on the
original draft); these fixes are explicit reviewer responses with
WebKit Bugzilla state verified live.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(craft): rtl-and-bidi mrcfps round-2 precision (lang+dir, isolate picks)

Two non-blocking precision items:

- lang-without-dir scope: previous wording implied English never needs
  dir="ltr". True only at the document root in a default-LTR page.
  lang does not reset an inherited bidi base direction, so an
  <section lang="en"> inside an RTL ancestor still resolves RTL.
  Reworded to "lang without dir is fine at the document root in a
  default-LTR page; inside any opposite-direction ancestor, set both."
- Plain-text isolate picks: previous wording recommended U+2068 / U+2069
  generically. U+2068 is FSI (first-strong auto-detect) — wrong default
  for known-direction runs, especially weak/neutral-heavy values like
  phone, IBAN, card numbers (the same class this file forces to LTR in
  HTML). Split: LRI/PDI for known-LTR, RLI/PDI for known-RTL, FSI/PDI
  reserved for unknown direction. Added an explicit "don't default to
  FSI" callout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(craft+skills): rtl-and-bidi mrcfps round-3 — skill-body conflicts + bidi semantic correction

P1 BLOCKING — skill-body physical-direction conflicts (mrcfps):

- skills/docs-page: "left nav" / "right-rail TOC" / "left-edge accent
  stripe" survive in skill body even with the rtl-and-bidi opt-in,
  because craft is injected ABOVE the skill body. An Arabic docs
  request would still see "Left nav" and emit physical-direction
  layout. Updated description, lay-out section, and self-check to
  inline-start / inline-end vocabulary; added a self-check bullet
  requiring logical CSS on rails and accent.

- skills/blog-post: pull-quote "accent rule on the left" updated to
  "accent rule on the inline-start edge" with a matching note about
  flipping under dir="rtl".

P1 craft semantic correction (mrcfps):

- HTML-semantics lint: previous wording equated <bdi dir="auto"> with
  unicode-bidi: plaintext. Not equivalent. <bdi> isolates an inline
  run from surrounding bidi resolution; unicode-bidi: plaintext
  changes how base direction is *determined* for each plaintext
  paragraph in a block. Different surfaces. Reworded the lint guidance
  to "prefer semantic isolation in HTML for inline runs; reach for
  unicode-bidi: plaintext only when that block-level paragraph
  behavior is explicitly required and tested" — and explicitly flagged
  that they are not drop-in equivalents to avoid future linters
  flagging valid CSS with a non-equivalent fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(craft): rtl-and-bidi mrcfps round-4 — split progress-bar from media scrubber

Non-blocking precision: prior must-mirror bullet lumped "progress-bar
fill" together with sliders, which would have flipped a video / audio
scrubber under dir="rtl" — directly conflicting with the must-not-mirror
rule for media playback controls (play/pause/FF/rewind represent tape
direction, not reading direction). The two cases collide on every audio
or video player.

- Must-mirror progress bars now scoped to "non-media" (download, upload,
  form-completion).
- Media scrubber / progress timeline added explicitly to the must-not-
  mirror media bullet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 12:43:48 +08:00
Mohamed Abdallah
b064637e3f
feat(craft): accessibility-baseline module + opt-ins on dashboard, hr-onboarding, mobile-onboarding (#587)
* feat(craft): add accessibility-baseline + opt-ins on dashboard, hr-onboarding, mobile-onboarding

Module 3 of 5 in the behavioral craft series proposed in #501. Modules 1 (state-coverage, #502) and 2 (animation-discipline, #515) merged earlier today.

The differentiator that survived the corpus review is native-mobile
parity. Existing OSS prior art (fecarrico/A11Y.md, awesome-copilot,
Community-Access) covers web ARIA well, none covers Flutter Semantics,
Compose semantics, iOS UIKit/SwiftUI, or RN labelling APIs.

Secondary differentiator: jurisdictional legal-floor calibration. EAA
references WCAG 2.1 (via EN 301 549 v3.2.1), not 2.2. ADA Title II
2026-04-24 deadline slipped to 2027-04-26 via 2026-04-20 IFR. Most
existing OSS a11y prior art doesn't track either accurately.

Three-loop adversarial review pass before push (codex unavailable, ran
via substitute agent). Loop 1 caught nine cuts plus four factual fixes
including a wrong Android Compose API name. Loop 2 verified and flagged
two more trims. Loop 3 said ship.

Anchored citations: WCAG 2.2 Understanding pages, ISO/IEC 40500:2025,
ADA Title II 2024 + 2026-04-20 IFR, EN 301 549 v3.2.1, WAI-ARIA 1.3 +
AccName 1.2 + Core AAM 1.2, WebAIM Million 2025, A11yn (arXiv 2510.13914),
APCA W3C silver branch.

Refs #501.

* fix(craft): accessibility-baseline review fixes (lefarcen + mrcfps)

Address all P1/P2/P3 findings:

- P1 (lefarcen): add "Keyboard operability and semantic structure" section covering tab reachability (2.1.1), activation keys, no keyboard trap (2.1.2), focus order (2.4.3), native-control-first, document language (3.1.1), heading hierarchy (1.3.1, 2.4.6), landmarks (1.3.1, 2.4.1), text alternatives (1.1.1)
- P2 (lefarcen): expand jurisdiction scope with US Section 508 (WCAG 2.0 AA), ADA Title III caveat, EU WAD reference
- P2 (lefarcen + mrcfps): rename contrast-table row to "Normal text below 18 pt regular / 14 pt bold" so the table matches the threshold rule
- P2 (mrcfps): correct "exclusive" → "inclusive" — exact 4.5:1 / 3:1 passes; the no-rounding rule is what makes 2.999:1 fail
- P2 (lefarcen): add "Prior art and scope" note differentiating from existing OSS a11y agent docs
- P3 (lefarcen): narrow APCA framing to "not part of WCAG/EN/ADA/Section 508" and clarify size/weight-dependent thresholds
- P3 (lefarcen): expand WCAG 2.5.8 exceptions list (Spacing, Equivalent, Inline, User Agent Control, Essential)
- Common-mistakes additions: Section 508/2.1 confusion, tabindex>0 anti-pattern, modal-focus-trap distinction from 2.1.2, heading-size vs level confusion

* fix(craft): accessibility-baseline mrcfps round-2 precision fixes

All three non-blocking precision items addressed:

- Update WebAIM Million benchmark from 2025 to 2026 (February 2026 crawl). Form labels: page-level 51% (was 48.2%), input-level 33.1% (was 34.2%) of 6.9M inputs (was 6.3M). ARIA: 59.1 errors on ARIA pages vs 42 on non-ARIA (was 57 vs 27); gap is ~17 in 2026, was 30 in 2025. ARIA usage 82.7% of pages (was 79.4%). Verified directly against webaim.org/projects/million/.
- Soften keyboard/semantic-structure intro: Level A items are still labeled Level A, but 2.4.6 Headings and Labels is correctly tagged AA, and the one-h1 / no-skipped-levels rules are now framed as OD craft conventions on top of WCAG's programmatic-structure floor (1.3.1).
- Tighten <a> activation note: bare <a> without href is not focusable, not a link, and not keyboard-operable. Use <a href="…"> for navigation or <button> for actions. Added a "common mistakes" entry to lock the rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 09:18:59 +08:00
Mohamed Abdallah
5da21e4054
feat(craft): animation-discipline module + opt-ins on mobile-app, mobile-onboarding, gamified-app (#515)
* feat(craft): add animation-discipline + opt-ins on mobile-app, mobile-onboarding, gamified-app

Animation discipline is the second behavioral craft module proposed in
#501 and explicitly invited in @mrcfps's post-merge comment on #502.

Differentiation from prior art (LottieFiles motion-design-skill, MIT,
96 stars): citation-grounded against primary sources rather than
asserted. Anchors:

- Tversky/Morrison/Bétrancourt 2002 (IJHCS) on the one demonstrated
  win-condition for animation
- Heer & Robertson TVCG 2007 on staging (with the actual durations
  they tested, not the laundered '300-1000ms' rule)
- Harrison/Yeo/Hudson CHI 2010 on perceived-duration scope (progress
  bars only, not skeletons)
- Doherty & Thadani IBM 1982 productivity numbers
- Material 3 motion tokens (M3 standard vs M2 legacy delta)
- IBM @carbon/motion durations
- Apple SwiftUI Animation API published defaults
- W3C View Transitions API + WCAG 2.2.2/2.3.3 calibration
- WebKit 2017 prefers-reduced-motion rationale

The 'common mistakes (lint these)' section busts five specific
folklore claims that don't survive primary-source check, including
the Doherty-400ms attribution and the M2-vs-M3 standard easing
confusion.

Three skills opt in via od.craft.requires:
  - mobile-app (animation-heavy mobile screens)
  - mobile-onboarding (multi-screen flow with transitions)
  - gamified-app (animations central to the format)

Refs #501.

* fix(craft): address review findings on animation-discipline

Six findings from @lefarcen's CHANGES_REQUESTED review on #515,
addressed in one pass. Reviewed by codex across three loops before
push.

P1 integration gaps:
- gamified-app and mobile-onboarding skills now require both
  state-coverage and animation-discipline (both render stateful UI
  with motion).
- craft/README.md silent-fallback example reframed as a
  planned-but-not-yet-vendored placeholder rather than a hard-coded
  next-to-ship slug. Note added pointing skill authors who arrive from
  older guidance at animation-discipline as the equivalent of the
  earlier 'motion' placeholder.

P2 reasoning completeness:
- > 500 ms duration row reframed: 'Reserved for cross-screen, staged,
  or platform-native transitions (e.g. M3 long2-extraLong4, Heer &
  Robertson 2007's per-stage recommendation)'. Surrounding paragraph
  rewritten with an enumerated category — 'Non-navigation
  microinteractions: hover, press, toggle, validation, chip selection,
  row expansion' — rather than the vague 'routine' term.
- New 'Flashing limits' subsection added in the Reduced motion
  section. WCAG 2.3.1 (Level A) three-flashes-in-any-one-second-period
  rule with the area/brightness threshold qualifier; WCAG 2.3.2 (AAA)
  unconditional rule. Photosensitive epilepsy framing.
- New 'Repeated and ambient motion' section added. Five rules covering
  iteration cap, WCAG 2.2.2 pause control after 5s, cancel-on-route,
  one-shot reward animations, and spinner timeout cross-referencing
  state-coverage.md.

File length now 154 lines (was 130, 80-110 craft target). Trade is
citation density and the new sections demanded by the integration
context (gamified/onboarding skills with looping motion).

Refs #501, #515.
2026-05-05 18:32:30 +08:00
Mohamed Abdallah
ab58b62b17
feat(craft): state-coverage module + opt-ins on dashboard, mobile-app, kanban-board (#502)
* feat(craft): add state-coverage rules + opt-ins on dashboard, mobile-app, kanban-board

State coverage is the most reliable AI-design failure: agents ship only the
populated state. This adds craft/state-coverage.md (108 lines, matches the
existing craft format) covering the five required states (loading, empty,
error, populated, edge), three form-specific states, ARIA/focus rules, and
loading-duration thresholds.

Sources are public: WCAG 2.2, NN/g, Material Design 3, Apple HIG, Baymard.

Three skills with stateful UI opt in via od.craft.requires:
  - dashboard
  - mobile-app
  - kanban-board

Decks, ppt, image-poster and other static-output skills do not opt in.

Refs: see issue body for the broader proposal (state-coverage is module 1
of 5 behavioral craft modules).

* fix(craft): address review findings on state-coverage

Four P2 findings from #502 review addressed in one pass.

- Edge state Test matrix added under the five-states table (dashboard,
  mobile, form, search, detail-view scenarios with concrete thresholds).
- Server-driven empty pattern added as trailing note in the empty-state
  composition section.
- Retry discipline subsection added after error severity tiers
  (immediate first retry, exponential 2s/4s/8s backoff, 3-retry floor,
  Last-attempted timestamp).
- README enforcement-levels subsection added distinguishing auto-checked
  P0 rules from guidance; partial-stateful skill clarification added
  after the Files table.

No rewrites. ~30 lines added. File stays inside the 80-110-line craft
target.

* fix(craft): correct lint enforcement claim + remove duplicate threshold message

Two findings from @mrcfps review (Looper-generated against ee95b909).

- README: rewrote Enforcement-levels P0 description. Verified against
  apps/daemon/src/server.ts:1706-1727: /api/artifacts/save writes the
  file first, then calls lintArtifact, then returns findings in the
  response. Findings reach the UI (P0/P1 badges) and the agent (system
  reminder for self-correction). Persistence is not hard-blocked on P0.
  Original wording mischaracterized this as a generation gate.
- state-coverage: 30-60s duration-table bucket no longer duplicates the
  '15 s taking longer than expected' message from the loading row.
  Reworded to focus on cancel affordance and explicitly note the
  longer-than-expected notice already fired at 15 s.

Both findings non-blocking per Looper but genuine factual issues. Fixed
in one pass.
2026-05-05 16:31:05 +08:00
Tom Huang
1edab990bb
feat(craft): add brand-agnostic craft references + Refero-derived lint rules (#225)
* feat(craft): add brand-agnostic craft references and refero-derived lint rules

Introduce `craft/` as a third top-level content axis alongside `skills/`
and `design-systems/`, holding universal (brand-agnostic) craft rules
that apply on top of any DESIGN.md. Skills opt in via a new
`od.craft.requires` front-matter array; the daemon resolves the slug
list and injects the matching files between DESIGN.md and the skill
body in the system prompt.

Initial vendor (MIT, adapted from referodesign/refero_skill): typography
craft, color craft, anti-ai-slop. Pilot wired on saas-landing.

Extend the existing lint-artifact pass with two refero-derived rules:
- P0 ai-default-indigo — solid #6366f1 / #4f46e5 / #4338ca / #8b5cf6 as
  accent (not just gradients) is the most-reported AI tell.
- P1 all-caps-no-tracking — `text-transform: uppercase` rules without
  ≥0.06em letter-spacing.

The craft loader silently drops missing files so a skill can
forward-reference future sections (e.g. `motion`) without breaking.

* fix(daemon): skip :root token blocks in ai-default-indigo lint

The ai-default-indigo P0 check scanned the whole HTML for the raw
hex, so brands that intentionally encode indigo as `--accent: #6366f1`
in :root and consume it via var(--accent) downstream were flagged
as AI-default — a false positive that forced the agent to "fix"
valid output. Strip :root token-definition blocks (including
attribute-selector theme variants) before scanning, mirroring the
existing pattern used by the raw-hex P1 check. Hex still flagged
when it appears in component rules or inline styles.

* docs(craft): address PR #225 P3 review feedback

- craft/README.md: explain why missing craft sections are silently
  dropped (forward-compatibility) instead of surfacing a warning.
- craft/typography.md: ground the 0.06em ALL CAPS tracking floor in
  Bringhurst-derived typographic practice rather than presenting
  the threshold as unattributed.
- craft/color.md: cover the edge case where a brand's DESIGN.md
  intentionally encodes indigo as --accent — `var(--accent)` uses
  remain unflagged because the linter only inspects hardcoded hex.
- docs/skills-protocol.md: link the "missing files dropped silently"
  note back to craft/README.md for the canonical slug list and the
  rationale behind the choice.

* fix(craft): address PR #225 P0 review feedback

- tools/pack: copy `craft/` into the packaged resource root alongside
  `skills`, `design-systems`, and `frames`, so the `od.craft.requires`
  integration isn't a silent no-op when the daemon resolves
  `${OD_RESOURCE_ROOT}/craft` in packaged builds.
- packages/contracts: add `craftRequires?: string[]` to `SkillSummary`
  (and therefore `SkillDetail`) so the field that `listSkills()`
  already returns and `/api/skills(/:id)` already serializes via
  `...rest` is part of the documented web/daemon contract instead of
  leaking through as an untyped property.
- apps/daemon/lint-artifact: expand the indigo token-strip pass to
  cover selector lists containing `:root` (e.g. `:root, [data-theme="light"]`)
  and any rule whose body is custom-property-only (e.g. a
  `[data-theme="dark"] { --accent: ... }` theme variant). Real
  component rules with a hardcoded indigo are still preserved so the
  P0 finding still fires; tests cover the new selector-list and
  theme-variant cases.

* fix(craft): address PR #225 follow-up review feedback

- lint-artifact: scope the indigo token-strip to <style> blocks so the
  rule-shaped regex no longer captures leading `<style>` text into the
  selector (which broke `:root` recognition for token blocks that mix
  `color-scheme`/etc. with `--accent`). Run the strip on the extracted
  CSS instead, with a regression covering `:root { color-scheme: light;
  --accent: #6366f1 }`.
- lint-artifact: tighten the custom-property-only exemption to global
  theme-scope selectors (`:root`, `html`, `body`, bare attribute
  selectors like `[data-theme="dark"]`). Component-local rules such as
  `.cta { --cta-bg: #6366f1 }` are no longer exempted, so an agent
  cannot launder default indigo through a local var. Regression test
  added.
- craft/anti-ai-slop.md: stop claiming every rule below is enforced by
  the linter; only several are. The unenforced rules (standard
  Hero→Features→Pricing→FAQ→CTA flow, decorative blob/wave SVG
  backgrounds, perfect symmetry) are now flagged inline as
  "(guidance, not auto-checked)" so the contract with the lint surface
  stays honest.

* fix(daemon): tighten lint-artifact iteration and :root token gating

- all-caps-no-tracking: iterate every <style> block. The previous
  check called `exec` once on a non-global regex, so an artifact
  whose offending uppercase rule sat in a second <style> block
  (e.g. a reset block followed by a components block) slipped
  past. Switch to `matchAll` and break across both loops once a
  violation is found. Regression test covers a second-block
  uppercase rule.
- ai-default-indigo: stop unconditionally exempting any selector
  list containing `:root`. The exemption now requires both
  conditions to hold: every selector in the list is global theme
  scope AND the body is token-shaped (CSS custom properties or
  the `color-scheme` keyword). So `:root { background: #6366f1 }`
  and `:root, .cta { --cta-bg: #6366f1 }` no longer launder a
  hardcoded indigo through the strip pass. Regression tests cover
  both bypass shapes.

* fix(daemon): scope theme-attr exemption and strip CSS comments in token blocks

Address PR #225 review feedback on `ai-default-indigo`:

- The bare-attribute branch of `selectorListIsGlobalThemeScope` accepted
  any `[attr=...]` selector, so a custom-property-only rule on a
  component/state attribute (e.g. `[data-variant="primary"]`,
  `[aria-current="page"]`) was treated as a global theme block and
  stripped before the indigo scan — exactly the component-local indigo
  laundering this lint is meant to catch. Restrict the exemption to a
  small allowlist of known theme switches: `data-theme`,
  `data-color-scheme`, `data-mode`.
- `stripTokenBlocksFromCss` split rule bodies on `;` and matched each
  fragment from the start, so a token block whose body contained a
  normal CSS comment such as `:root { /* brand accent */ --accent:
  #6366f1; }` produced a fragment beginning with the comment, failed
  `isTokenShapedDeclaration`, and the rule was left in scope of the
  indigo scan — a false P0 on a legitimate token definition. Strip CSS
  comments before splitting/classifying declarations.

Add regression coverage: arbitrary component/state attribute selectors
still trip `ai-default-indigo`; `data-color-scheme` theme variants stay
exempted; `:root` token blocks with leading, trailing, and
between-declaration CSS comments are recognized.

* fix(daemon): strip CSS comments and recognize tokens nested in at-rules

The all-caps-no-tracking scan ran against raw `<style>` content, so a
commented-out rule like `/* .eyebrow { text-transform: uppercase; } */`
matched `upperRe` and emitted a P1 for CSS the browser ignores. Strip
CSS comments from the style body before structural matching.

`stripTokenBlocksFromCss` only matched flat `selector { body }` rules,
so a media-query-wrapped token block like
`@media (prefers-color-scheme: dark) { :root { --accent: #6366f1 } }`
had its outer `@media` rule treated as the selector/body pair and the
inner `:root` token block was never stripped, producing a P0 false
positive on legitimate responsive theme CSS. Tighten the body
alternation to `[^{}]*` so the regex matches innermost rules and
recognizes the inner `:root` block directly while preserving the
outer at-rule wrapper.

* fix(daemon): align ai-default-indigo list with documented cardinal sins

The lint's AI_DEFAULT_INDIGO subset omitted #3730a3 and #a855f7, which
craft/anti-ai-slop.md lists as P0-blocked solid accents. An artifact
could hard-code one of those documented colors as a button fill and
slip past the indigo scan unless it happened to be inside a gradient.

Bring the lint set to the exact list documented in the craft doc, and
tighten the doc's wording from "etc." to an explicit enumeration that
points at AI_DEFAULT_INDIGO so the prompt contract and daemon behavior
stay in sync. Add regression tests pinning each newly-included hex.

* fix(daemon): tighten theme-scope selector and scan inline ALL CAPS

The theme-scope exemption used to accept any attribute on `:root`,
`html`, or `body` (e.g. `:root[data-variant="primary"]`), letting an
agent launder default indigo through a component/state attribute and
slip past the `ai-default-indigo` lint. The prefixed branches now
require the attribute name to be one of GLOBAL_THEME_ATTRIBUTES,
matching the bare-attribute branch.

The `all-caps-no-tracking` rule only iterated `<style>` blocks, so
inline declarations like `<span style="text-transform: uppercase">`
produced no finding even though craft/typography.md treats the
≥0.06em tracking floor as having no exceptions. Added a second scan
over `style="..."` attributes that runs the same letter-spacing
check and dedupes against the existing `<style>`-block finding so
the agent gets a single corrective signal per artifact.

* fix(daemon): align uppercase tracking px floor with the 0.06em rule

The previous absolute fallback (>=1.5px) was stricter than the craft
rule it enforces. `font-size: 12px; letter-spacing: 1px` is 0.083em
— above the 0.06em floor — but 1.5px would reject it and trigger an
unnecessary correction loop on compliant small-label CSS.

Extract `hasAdequateUppercaseTracking`: read `font-size` from the same
rule body and compare px tracking against `fontSize * 0.06`; fall back
to a conservative >=1px floor when font-size is inherited (covers the
default 16px body where 1px ≈ 0.0625em). Apply the helper to both the
<style>-block scan and the inline-style scan, and add 12–14px label
tests in both branches.

* fix(daemon): treat rem letter-spacing as absolute, not per-element em

`rem` was previously folded into the same branch as `em` and accepted
at the 0.06 threshold. But `rem` is relative to the root font-size
(16px default), not the element's own font-size, so on a 48px heading
`letter-spacing: 0.06rem` resolves to 0.96px — about 0.02em of the
element, well below the 0.06em rule the lint enforces.

Convert rem to absolute px through the 16px root assumption and reuse
the same px-vs-element-font-size resolution: same-rule `font-size: <n>px`
gives an exact `n * 0.06` floor; otherwise the conservative >=1px
fallback applies. Add regression tests for 48px headings with 0.06rem
tracking (must flag) plus the 16px-element and rem-floor matches that
must keep passing, in both <style>-block and inline-style branches.

* fix(daemon): resolve var() refs in uppercase tracking lint

`hasAdequateUppercaseTracking` only matched literal numeric values,
so a tokenized rule like `letter-spacing: var(--caps-tracking)` —
exactly the pattern the craft prompt steers artifacts toward — was
falsely reported as `all-caps-no-tracking`. Extract `--name: value`
declarations from global theme scopes (`:root`, `html`, theme-attribute
selectors) once per artifact, then expand simple `var(--name)` (and
`var(--name, fallback)`) references in the inspected rule body before
applying the existing 0.06em / px-floor / rem-conversion logic.
References without a matching token and no fallback stay in place,
preserving the conservative "missing tracking" finding.

* fix(daemon): resolve rem and var() font-size in uppercase tracking lint

Previously the px-vs-element-font-size resolution only matched
`font-size: <n>px`. Any rem-based or tokenized display size fell
through to the lenient `>= 1px` body-text fallback, so an artifact
emitting `.display { font-size: 3rem; text-transform: uppercase;
letter-spacing: 1px; }` (a ~48px heading with a 2.88px floor) slipped
past the lint that this helper exists to enforce.

Resolve `rem` font-size via the same root-font assumption already used
for tracking, and treat any explicitly declared but unresolvable unit
(`em`, `%`, `calc(...)`, an unresolved `var(...)`) conservatively —
refuse the lenient fallback so the rule must use either an `em`
letter-spacing or a verifiable px/rem font-size.

`var()` font-size declarations resolve through the existing
`resolveCssVars` pass before the size scan runs, so the same fix
catches the tokenized-display-size pattern (`--display-size: 3rem`).

* fix(daemon): parse declarations to ignore custom-prop names in uppercase tracking lint

The hasAdequateUppercaseTracking and resolveFontSizePx helpers used substring regexes against the rule body, so a token-name declaration such as `--letter-spacing: 0.08em` or `--display-font-size: 48px` could satisfy the `letter-spacing` / `font-size` checks even though it has no rendered effect — letting actual ALL-CAPS-without-tracking rules slip past the P1 lint.

Parse the declaration list, compare exact property names, and skip declarations whose property starts with `--`. Adds regression tests covering token-name letter-spacing (style-block + inline) and a token-name font-size masking the bail-out branch.

* fix(daemon): scope indigo token exemption to --accent only

Previously stripTokenBlocksFromCss removed every custom-property-only
global theme block before the ai-default-indigo scan, which let a
laundered indigo token like `:root { --primary: #6366f1 }` consumed
via `var(--primary)` slip past the lint. The craft contract is that
the only escape hatch is encoding indigo as the design system's
`--accent` token; any other token name is still the LLM-default
color hidden behind an arbitrary name. Narrow the strip pass so a
non-`--accent` token whose value carries an AI-default indigo hex
keeps the rule in scope, and add regression tests for `--primary` /
`--button-bg` global tokens feeding a CTA, including the at-rule
and theme-attribute variants.

* fix(daemon): model CSS cascade in tracking lint and detect blue→cyan trust gradients

Address PR #225 review feedback (3 comments):

- `letter-spacing` / `font-size` selection now picks the LAST matching
  declaration in the rule body, modeling CSS source-order cascade.
  `.eyebrow { letter-spacing: 0.08em; letter-spacing: 0.02em }` renders
  the noncompliant 0.02em the browser actually shows; the previous
  first-match behaviour silently passed it.
- `extractCssTokens` now records every distinct value seen for a token
  across global theme scopes, and `hasAdequateUppercaseTracking`
  enumerates each combination so a default-theme value below the floor
  cannot be rescued by a scoped override that happened to be parsed
  later (`:root { --caps-tracking: 0.02em }` +
  `[data-theme="dark"] { --caps-tracking: 0.08em }` now fires).
- New `trust-gradient` P0 rule pairs blue/sky tokens against cyan
  tokens in `linear-gradient(...)` bodies so `blue→cyan` two-stop
  trust gradients (documented as a cardinal sin in
  `craft/anti-ai-slop.md`) are actually enforced — both the hex form
  (`linear-gradient(90deg, #3b82f6, #06b6d4)`) and the keyword form
  (`linear-gradient(90deg, blue, cyan)`).

Adds 11 regression tests covering each path (cascade override in
<style> and inline form, font-size cascade shifting the floor, both
orderings of the conflicting-token cascade, the don't-over-fire case
when every theme value clears the floor, hex / keyword / sky variants
of the trust gradient, and the don't-double-fire case when
purple-gradient already caught a mixed gradient).

* fix(daemon): apply per-scope cascade in extractCssTokens

When the same CSS custom property is declared more than once inside a
single rule body (e.g. `:root { --caps-tracking: 0.02em;
--caps-tracking: 0.08em }`), CSS source-order cascade collapses to the
last value; the earlier declaration never reaches any element.
`extractCssTokens` was treating intra-scope duplicates as simultaneous
theme alternatives, so `hasAdequateUppercaseTracking` enumerated the
stale 0.02em and emitted a spurious all-caps-no-tracking finding.

Collapse duplicate token declarations within a rule body to the last
value before merging into the cross-scope distinct-value map. Cross-scope
overrides (separate `:root` and `[data-theme]` rules) remain preserved
as distinct values so the conservative theme-cascade check still fires
when ANY applicable theme renders below the floor.

* fix(daemon): scope tracking lint to innermost rules and per-theme tokens

Restrict the upperRe body alternation to [^{}]* so the regex matches
innermost CSS rules and skips at-rule wrappers — an outer @media or
@supports could otherwise capture as a single rule whose selector was
the at-rule and whose body began with the inner selector token, masking
the same-rule font-size and letting noncompliant tracking on large
headings slip through the lenient inherited-size fallback.

Replace the by-name-distinct-values token map with per-scope token
records and a buildResolvedThemes pass that materializes one effective
map per theme. Paired token declarations now stay paired during
evaluation, so theme variants like :root + [data-theme=dark] no longer
generate cross-theme cartesian pairings (e.g. default-size + dark-track)
that emit false positives on legitimate light/dark themes.

---------

Co-authored-by: looper <looper@open-claude.dev>
2026-05-02 11:00:33 +08:00