open-design/docs/roadmap.md
Tom Huang 86ec951fb9
[codex] Add automation templates and proposal workflows (#2193)
2026-05-19 16:35:28 +08:00

Roadmap

Parent: spec.md · Siblings: architecture.md · skills-protocol.md · agent-adapters.md · modes.md

Phased plan from "spec-only today" to "usable MVP" to "published v1." All estimates assume one focused developer; multiply by 0.6 for two and 0.4 for three.
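As a quick illustration of the team-size multipliers above, here is a minimal sketch. The helper name `scaleEstimate` and the lookup table are hypothetical and exist only to show the arithmetic; the multipliers themselves (0.6 for two developers, 0.4 for three) come from this document.

```typescript
// Hypothetical helper: scale a one-developer estimate by team size,
// using the multipliers stated in this roadmap.
const TEAM_FACTORS: Record<number, number> = { 1: 1.0, 2: 0.6, 3: 0.4 };

function scaleEstimate(weeksForOne: number, developers: number): number {
  const factor = TEAM_FACTORS[developers];
  if (factor === undefined) {
    // The roadmap gives no multiplier beyond three developers.
    throw new Error(`No multiplier defined for ${developers} developers`);
  }
  return weeksForOne * factor;
}

// Phase 2 is listed at ~8 weeks for one developer.
console.log(scaleEstimate(8, 2).toFixed(1)); // "4.8"
console.log(scaleEstimate(8, 3).toFixed(1)); // "3.2"
```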


Phase 0 — Spec finalization (current, ~3–5 days)

Goal: get the interfaces right before writing implementation code. All decisions that are cheap to change on paper and expensive to change in code live here.

Deliverables:

  • README.md + docs/spec.md + architecture / protocol / adapter / modes / references docs (this repo, as of now)
  • docs/schemas/skill-manifest.json — JSON Schema for the od: front-matter block
  • docs/schemas/design-system.md — formal spec of the 9-section DESIGN.md
  • docs/schemas/protocol.md — HTTP/SSE API schemas
  • docs/schemas/adapter.md — adapter interface in TypeScript, printed out
  • docs/examples/DESIGN.sample.md — a working example design system
  • docs/examples/saas-landing-skill/ — a working example skill (the one sketched in skills-protocol.md §8)
  • Resolve the four "open questions" sections found at the end of the spec docs

Exit criteria: every interface we'll implement has a signed-off schema in this repo. No code yet.


Phase 1 — MVP (~6–8 weeks)

Goal: a single developer can clone, install, start the daemon, point at Claude Code, and produce a prototype and a deck from scratch. The tool is usable for real work even if not polished.

Scope

Included:

  • Web app (Next.js 16, App Router)
    • chat pane · artifact tree · sandboxed iframe preview · export menu
    • skill picker · mode picker · design-system picker
    • no comment mode yet · no sliders yet · no template gallery UI yet
  • Local daemon (Node)
    • HTTP/SSE API on :7456
    • agent detection + cached results
    • skill registry (scan three dirs, hot-reload)
    • artifact store (plain files + history.jsonl)
    • design-system resolver
    • export pipeline (HTML + ZIP only; PDF/PPTX in Phase 2)
  • Agent adapters
    • claude-code — native skill loading, streaming, surgical edit
    • api-fallback — direct Anthropic Messages API, minimal tool loop (Read/Write/Edit only)
  • Skills shipped in repo
    • saas-landing (Prototype)
    • magazine-web-ppt (Deck, fork of guizang-ppt-skill)
  • Modes available
    • Prototype (fully working)
    • Deck (fully working)
    • Design System (basic: from text brief only; no screenshot input yet)
    • Template (deferred to Phase 2)
  • Topologies
    • A — fully local (primary)
    • C — Vercel + direct API (partial; no daemon features)
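The "plain files + history.jsonl" artifact store in the scope list can be pictured as an append-only log with one JSON object per line. The sketch below is illustrative only: the `HistoryEntry` fields and both function names are assumptions, since the actual schema is not defined in this roadmap.

```typescript
// Hypothetical entry shape; the real fields are not specified in this roadmap.
interface HistoryEntry {
  timestamp: string; // ISO 8601
  artifactPath: string; // path relative to the artifact root
  action: "create" | "edit" | "delete";
}

// Serialize one entry as a single JSONL line. JSON.stringify escapes any
// newlines inside string values, so the result never spans multiple lines.
function toJsonlLine(entry: HistoryEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Parse a whole history.jsonl body, skipping blank lines.
function parseHistory(body: string): HistoryEntry[] {
  return body
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as HistoryEntry);
}

const log =
  toJsonlLine({ timestamp: "2026-04-24T10:00:00Z", artifactPath: "index.html", action: "create" }) +
  toJsonlLine({ timestamp: "2026-04-24T10:05:00Z", artifactPath: "index.html", action: "edit" });

console.log(parseHistory(log).length); // 2
```

Because every change is one appended line, the file diffs cleanly, which fits the git-reviewable goal in MVP exit criterion 6.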

Explicitly out of MVP:

  • Codex / Cursor / Gemini adapters
  • Comment mode + sliders
  • Template gallery + template skill
  • Design System from screenshot (vision) / PDF / URL
  • PDF / PPTX export
  • Topology B (Vercel + tunneled local daemon)
  • Docker compose file
  • Skill tests (od skill test)
  • Auth / multi-user

Week-by-week breakdown

| Week | Theme | Concrete deliverables |
|------|-------|-----------------------|
| 1 | Scaffolding | pnpm workspaces (apps/web, apps/daemon, e2e); Next.js 16 base; daemon CLI skeleton; CI green |
| 2 | Daemon core | HTTP/SSE API; project/conversation store; skill registry scanning; artifact store; design-system resolver loading DESIGN.md |
| 3 | Claude Code adapter | detection (PATH + ~/.claude/ probe); spawn with --output-format stream-json; parser from JSON-lines → AgentEvent; streaming to daemon's session; cancel via SIGTERM |
| 4 | API-fallback adapter | Anthropic Messages streaming; minimal tool loop (Read/Write/Edit rooted to artifact cwd); integration with skill prompt injection |
| 5 | Web UI — chat + file workspace | React state + daemon-backed project store; SSE client; chat pane; file workspace reflects project files; skill picker |
| 6 | Web UI — preview + export | sandboxed iframe with hot reload; JSX → vendored React/Babel runtime; export ZIP; export self-contained HTML (inline CSS) |
| 7 | Default skills | port guizang-ppt-skill (no modifications; add od: extension block); write saas-landing skill; write 12 DESIGN.md examples; docs for skill authors |
| 8 | Polish + dogfood | end-to-end dogfooding; performance pass (daemon <500ms cold start, first generation overhead <50ms); bug-fixing; first publishable alpha |
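The Week 3 step "parser from JSON-lines → AgentEvent" needs, at minimum, a splitter that copes with lines arriving in arbitrary chunks from the child process. The sketch below shows only that buffering step. The class name and the sample payloads are hypothetical and do not reflect Claude Code's actual stream-json schema; mapping the parsed values to `AgentEvent` is left out because that type is not defined here.

```typescript
// Incremental JSON-lines splitter: feed it arbitrary chunks, get back
// one parsed value per complete line. Partial lines stay buffered.
class JsonLinesParser {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is either "" or an incomplete line; keep it buffered.
    this.buffer = lines.pop() ?? "";
    const values: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue;
      values.push(JSON.parse(trimmed)); // throws on a malformed line
    }
    return values;
  }
}

// Hypothetical payloads, split mid-line to exercise the buffering.
const parser = new JsonLinesParser();
const first = parser.push('{"kind":"text","value":"he');
const second = parser.push('llo"}\n{"kind":"done"}\n');
console.log(first.length, second.length); // 0 2
```

A real adapter would probably catch the `JSON.parse` error per line and surface it as an error event rather than letting it abort the stream.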

MVP exit criteria

  1. corepack enable && pnpm install && pnpm tools-dev run web works on clean macOS and Linux with Node 24.
  2. With Claude Code installed: prototype + deck generation works end-to-end.
  3. Without Claude Code installed: API-fallback produces prototypes (not decks — guizang-ppt-skill needs native skill loading).
  4. A user can drop a DESIGN.md into the project root and subsequent generations respect it.
  5. A third party can publish a skill repo; od skill add <url> installs it and it works.
  6. Artifacts are plain files; git add ./.od/artifacts/ and git log tell a sensible story.
  7. No Electron, no Tauri, no desktop packaging anywhere in the repo.

Phase 2 — v1 (~8 weeks after MVP)

Goal: feature parity with the "UI-polish-heavy" parts of Open CoDesign, plus multi-agent support and all four modes.

Scope

Agent adapters:

  • codex (P1)
  • cursor-agent (P1)
  • capability-driven UI gating (disable features per adapter)
  • agent fallback chain
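One plausible reading of the "agent fallback chain" item is: try adapters in a preferred order and use the first one that was detected on the machine. The sketch below assumes that reading. The function name and the particular ordering are illustrative, not a committed design.

```typescript
// Adapter ids taken from this roadmap; the ordering below is an assumption.
type AdapterId = "claude-code" | "codex" | "cursor-agent" | "api-fallback";

// Return the first adapter in the preferred order that was detected,
// or undefined when none is available.
function pickAdapter(
  chain: readonly AdapterId[],
  detected: ReadonlySet<AdapterId>,
): AdapterId | undefined {
  return chain.find((id) => detected.has(id));
}

const chain: AdapterId[] = ["claude-code", "codex", "cursor-agent", "api-fallback"];
const detected = new Set<AdapterId>(["cursor-agent", "api-fallback"]);
console.log(pickAdapter(chain, detected)); // "cursor-agent"
```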

UI:

  • Comment mode (click element → surgical edit; only when capabilities.surgicalEdit)
  • Slider parameters (live-tweak od.parameters)
  • Multi-frame preview (desktop / tablet / phone)
  • Template gallery UI with thumbnails
  • Design System editor (split view: markdown ↔ sample-components preview)
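Capability-driven gating, such as enabling Comment mode only when `capabilities.surgicalEdit` is set, can be expressed as a lookup from UI feature to required capability. This is a minimal sketch: only `surgicalEdit` is named in this roadmap, while the `streaming` flag, the feature names, and `enabledFeatures` are hypothetical.

```typescript
// Hypothetical capability flags; only surgicalEdit appears in this roadmap.
interface AdapterCapabilities {
  surgicalEdit: boolean;
  streaming: boolean;
}

type UiFeature = "commentMode" | "livePreviewStream";

// Map each gated feature to the single capability it requires.
const REQUIRES: Record<UiFeature, keyof AdapterCapabilities> = {
  commentMode: "surgicalEdit",
  livePreviewStream: "streaming",
};

// List the features the UI may enable for a given adapter.
function enabledFeatures(caps: AdapterCapabilities): UiFeature[] {
  return (Object.keys(REQUIRES) as UiFeature[]).filter((feature) => caps[REQUIRES[feature]]);
}

// An adapter without surgical edit keeps streaming but loses Comment mode.
console.log(enabledFeatures({ surgicalEdit: false, streaming: true }));
```

Keeping the mapping in one table makes it easy to also render the per-adapter limitations in-product, as the risk register suggests.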

Skills:

  • Template skills: stripe-ish-landing, linear-ish-docs, notion-ish-workspace, vercel-ish-pricing
  • More Prototype skills: dashboard, login-flow, empty-state-pack, pricing-page
  • More Deck skills: pitch-deck, product-demo-deck
  • Design System skills: design-system-from-screenshot, design-system-refine

Modes:

  • Template mode fully shipped
  • Design System mode extended: screenshot input, URL input

Export:

  • PDF (Puppeteer)
  • PPTX (pptxgenjs, driven by slides.json)

Deployment:

  • Docker compose file
  • Topology B: Vercel web + tunneled local daemon
    • Ship a helper subcommand: od daemon --expose using cloudflared (opt-in, documented)

Dev experience:

  • od skill test with cheap-model runs
  • Skill author starter template: od skill scaffold

v1 exit criteria

  1. All four modes fully functional.
  2. Three adapters working (Claude Code, Codex, Cursor Agent); fallback chain shipping.
  3. PDF + PPTX export working for at least the magazine-web-ppt + pitch-deck skills.
  4. Deployed example at demo.open-design.dev (Topology C).
  5. Skill author docs published; at least one third-party skill submitted.
  6. Documentation site rebuilt from these spec docs.

Phase 3 — v2 (~12 weeks after v1)

Goal: ecosystem + robustness.

Scope sketch (non-binding):

  • Skill marketplace UI — searchable, categorized, install with one click
  • Skill signing / checksums
  • Gemini CLI + OpenCode + OpenClaw adapters (P2 tier)
  • Windows support
  • Collaborative mode (multi-user session on a single daemon)
  • "Freeze prototype as design system" action
  • Figma export (behind the Open CoDesign post-1.0 line; borrow their approach when they ship it)
  • Telemetry (opt-in, self-hosted, never phoning home to a central service)
  • Hosted SaaS offering (optional; full-local stays primary)

v2 isn't promised. It's the direction if v1 lands.

Self-evolution track

The newer Automations direction is tracked in specs/current/automation-self-evolution.md. It folds routines, scheduled connector digests, live-artifact refreshes, Orbit, memory extraction, skill creation, token compression, and design-system extraction into one Automation template model.

Milestones:

| Milestone | Deliverable |
|-----------|-------------|
| SE0 | Contracts for source packets, automation templates, evolution proposals, memory tree nodes, and compression reports. |
| SE1 | Editable memory tree that agents actually consume through the daemon and BYOK/API-mode prompt resolver. |
| SE2 | Automation template registry exposed in both web UI and od automation. |
| SE3 | Design-system extraction and skill crystallization proposals with review gates. |
| SE4 | Connector-driven ingestion into memory/design-system/skill proposals with provenance. |
| SE5 | Optional token compression with before/after token reports and rollback-safe stored originals. |

SE1 starts from the existing Markdown memory store: /api/memory/tree and od memory tree list/view/edit/move expose a derived editable tree while the same selected entries continue feeding daemon and BYOK/API-mode prompts.

SE2 also includes the first review gate: /api/automation-proposals plus od automation proposal list/get/apply/reject can review memory-node, skill, and design-system proposals. Accepted memory proposals write into the memory tree; accepted skill and design-system proposals write reviewed drafts under the user-owned runtime roots.
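The review gate described above amounts to a small state machine: a proposal is pending until someone applies or rejects it, and where an accepted proposal lands depends on its kind. The sketch below is an assumption-laden illustration. The type names, the status values, and the rule that a decision is final are hypothetical; only the three proposal kinds and the apply/reject verbs come from this document.

```typescript
type ProposalKind = "memory-node" | "skill" | "design-system";
type ProposalStatus = "pending" | "applied" | "rejected";

interface Proposal {
  id: string;
  kind: ProposalKind;
  status: ProposalStatus;
}

// Only pending proposals can be reviewed; a decision is final in this sketch.
function review(proposal: Proposal, decision: "apply" | "reject"): Proposal {
  if (proposal.status !== "pending") {
    throw new Error(`Proposal ${proposal.id} is already ${proposal.status}`);
  }
  return { ...proposal, status: decision === "apply" ? "applied" : "rejected" };
}

// Where an accepted proposal is written, following the paragraph above.
function destination(kind: ProposalKind): string {
  return kind === "memory-node" ? "memory tree" : "reviewed draft under runtime roots";
}

const pending: Proposal = { id: "p1", kind: "skill", status: "pending" };
const applied = review(pending, "apply");
console.log(applied.status, "->", destination(applied.kind));
```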

SE3/SE4 start closing the source loop through /api/automation-ingestions, /api/automation-source-packets, and od automation source ingest/list/get. The Automations page now has a source-ingestion panel that can turn pasted connector/repo/artifact/chat context into stored source packets plus reviewable memory, skill, and design-system proposals. Each ingestion can choose off/balanced/aggressive compression and records before/after token counts while preserving the original packet.
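To make "records before/after token counts while preserving the original packet" concrete, here is a minimal sketch of a compression report. Everything in it is a stand-in: the `CompressionReport` shape is hypothetical, the token counter is a whitespace split rather than a real tokenizer, and the two compression levels use toy rules chosen only so the example runs.

```typescript
type CompressionLevel = "off" | "balanced" | "aggressive";

// Hypothetical report shape; the real contract is an SE0 deliverable.
interface CompressionReport {
  level: CompressionLevel;
  tokensBefore: number;
  tokensAfter: number;
  original: string; // kept verbatim so the compression can be rolled back
  compressed: string;
}

// Stand-in token counter (whitespace split); a real tokenizer would differ.
const countTokens = (text: string): number =>
  text.split(/\s+/).filter((token) => token.length > 0).length;

function compress(original: string, level: CompressionLevel): CompressionReport {
  let compressed = original;
  if (level !== "off") {
    // Toy "balanced" rule: trim each line and drop blank ones.
    let lines = original
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
    if (level === "aggressive") {
      // Toy "aggressive" rule: additionally drop consecutive duplicate lines.
      lines = lines.filter((line, i, all) => i === 0 || line !== all[i - 1]);
    }
    compressed = lines.join("\n");
  }
  return {
    level,
    tokensBefore: countTokens(original),
    tokensAfter: countTokens(compressed),
    original,
    compressed,
  };
}

const report = compress("note one\nnote one\n\nnote two", "aggressive");
console.log(report.tokensBefore, report.tokensAfter); // 6 4
```

Storing `original` next to `compressed` is what makes the operation rollback-safe: discarding the compressed text loses nothing.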

Exit criteria: a connected or uploaded source can become reviewable memory, skill, and design-system proposals; accepted proposals are visible in the tree and are consumed by a later agent run without extra prompting.


Risk register

| Risk | Impact | Mitigation |
|------|--------|------------|
| Claude Code JSON stream format changes between versions | adapter breaks | pin version range; write a compatibility test; keep a parser for each major release |
| Third-party agent CLIs don't expose enough to stream tool calls | UX degrades silently | capability flags + feature gates; document per-adapter limitations in-product |
| @mariozechner/pi-ai or similar abstractions get popular and contributors ask us to support them | scope creep | defer; if demand is real, add as yet-another-adapter next to api-fallback |
| Vercel deploy (Topology B) flaky because of tunnel setup | users can't try the cloud path | ship Topology C (direct API) as the always-works path; document Topology B as advanced |
| guizang-ppt-skill or similar upstream skill changes format | default deck skill breaks | pin git SHA in our default install; monitor upstream |
| DESIGN.md format evolves in awesome-claude-design | incompatibility | track upstream; adopt changes; our resolver is tolerant of missing sections |
| Anthropic ships an open-source Claude Design | differentiation collapses | our moat is the "uses user's existing agent" angle; Anthropic is unlikely to ship that |
| Skill security (malicious skill via od skill add) | user machine compromise | install-time warning; rely on agent's own permission model; document best practices |
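The mitigation "our resolver is tolerant of missing sections" can be sketched as: split DESIGN.md by heading, then report absent sections instead of failing. This is an illustration under stated assumptions. The section names in `EXPECTED`, the choice of `## ` as the section marker, and both function names are hypothetical; the real 9-section list belongs to the DESIGN.md spec, not to this sketch.

```typescript
// Split a markdown document into sections keyed by "## " heading text.
function splitSections(markdown: string): Map<string, string> {
  const sections = new Map<string, string>();
  let current: string | null = null;
  let body: string[] = [];
  const flush = (): void => {
    if (current !== null) sections.set(current, body.join("\n").trim());
  };
  for (const line of markdown.split("\n")) {
    const match = /^##\s+(.+)$/.exec(line);
    if (match) {
      flush(); // close the previous section before starting a new one
      current = match[1].trim();
      body = [];
    } else if (current !== null) {
      body.push(line);
    }
  }
  flush();
  return sections;
}

// Hypothetical section names, used only for this example.
const EXPECTED = ["Colors", "Typography", "Spacing"];

// Missing sections are reported to the caller, never thrown.
function resolveDesignSystem(markdown: string): { sections: Map<string, string>; missing: string[] } {
  const sections = splitSections(markdown);
  const missing = EXPECTED.filter((name) => !sections.has(name));
  return { sections, missing };
}

const sample = "# Brand\n\n## Colors\nprimary: #000\n\n## Typography\nInter\n";
const resolved = resolveDesignSystem(sample);
console.log(resolved.missing); // only "Spacing" is missing
```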

Decision log (lightweight)

Record one line per material decision as we go. Example entries:

  • 2026-04-24 — Use plain files + history.jsonl over SQLite for artifacts. Why: git-reviewable, no driver dependency, matches "skills are files" ethos.
  • 2026-04-24 — Adopt DESIGN.md (awesome-claude-design) verbatim rather than inventing a new format. Why: 68 existing files are immediately compatible.
  • 2026-04-24 — Do not ship an Electron / Tauri wrapper. Why: every minute on code-signing is a minute not on skills; cc-switch already solves the tray-icon use case.
  • 2026-04-24 — Delegate the entire agent loop to the user's CLI. Why: reimplementing is worse than integrating; ecosystem compatibility beats control.

Later decisions may supersede earlier ones, so keep the log append-only and date every entry.


What to do right after reading this

If you're the implementer:

  1. Read spec.md top to bottom.
  2. Skim architecture.md, skills-protocol.md, agent-adapters.md.
  3. Argue with anything in the four "open questions" sections; file one-line decisions.
  4. Fill in the missing Phase 0 deliverables (the docs/schemas/ and docs/examples/ files).
  5. Scaffold the monorepo and start Week 1.

If you're evaluating the concept:

  1. Read README.md + spec.md §13.
  2. Check the comparison matrix in references.md.
  3. Look at the worked example in skills-protocol.md §7 — that's the end-to-end feel.