Roadmap
Parent: spec.md · Siblings: architecture.md · skills-protocol.md · agent-adapters.md · modes.md
Phased plan from "spec-only today" to "usable MVP" to "published v1." All estimates assume one focused developer; multiply by 0.6 for two and 0.4 for three.
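As a worked example of the multipliers above, the sketch below scales a single-developer range by team size. The name `scaleEstimate` and the lookup table are illustrative only. The 0.6 and 0.4 factors come from the text, which gives no factor for teams larger than three, so the function refuses those rather than guessing.

```typescript
// Multipliers from the roadmap text: 1 developer = 1.0, 2 = 0.6, 3 = 0.4.
const TEAM_FACTOR: Record<number, number> = { 1: 1.0, 2: 0.6, 3: 0.4 };

type WeekRange = { min: number; max: number };

// Scale a single-developer estimate (in weeks) for a team of 1 to 3 developers.
function scaleEstimate(single: WeekRange, developers: number): WeekRange {
  const factor = TEAM_FACTOR[developers];
  if (factor === undefined) {
    throw new Error(`no multiplier given for ${developers} developers`);
  }
  // Round to one decimal so floating-point noise does not leak into the plan.
  const round = (weeks: number) => Math.round(weeks * factor * 10) / 10;
  return { min: round(single.min), max: round(single.max) };
}

// Phase 1 is estimated at 6 to 8 weeks for one developer.
const phase1: WeekRange = { min: 6, max: 8 };
console.log(scaleEstimate(phase1, 2)); // { min: 3.6, max: 4.8 }
console.log(scaleEstimate(phase1, 3)); // { min: 2.4, max: 3.2 }
```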
Phase 0 — Spec finalization (current, ~3–5 days)
Goal: get the interfaces right before writing implementation code. All decisions that are cheap to change on paper and expensive to change in code live here.
Deliverables:
- `README.md` + `docs/spec.md` + architecture / protocol / adapter / modes / references docs (this repo, as of now)
- `docs/schemas/skill-manifest.json`: JSON Schema for the `od:` front-matter block
- `docs/schemas/design-system.md`: formal spec of the 9-section `DESIGN.md`
- `docs/schemas/protocol.md`: HTTP/SSE API schemas
- `docs/schemas/adapter.md`: adapter interface in TypeScript, printed out
- `docs/examples/DESIGN.sample.md`: a working example design system
- `docs/examples/saas-landing-skill/`: a working example skill (the one sketched in `skills-protocol.md` §8)
- Resolve the four "open questions" at the end of each spec doc
Exit criteria: every interface we'll implement has a signed-off schema in this repo. No code yet.
Phase 1 — MVP (~6–8 weeks)
Goal: a single developer can clone, install, start the daemon, point at Claude Code, and produce a prototype and a deck from scratch. The tool is usable for real work even if not polished.
Scope
Included:
- Web app (Next.js 16, App Router)
  - chat pane · artifact tree · sandboxed iframe preview · export menu
  - skill picker · mode picker · design-system picker
  - no comment mode yet · no sliders yet · no template gallery UI yet
- Local daemon (Node)
  - HTTP/SSE API on `:7456`
  - agent detection + cached results
  - skill registry (scan three dirs, hot-reload)
  - artifact store (plain files + `history.jsonl`)
  - design-system resolver
  - export pipeline (HTML + ZIP only; PDF/PPTX in Phase 2)
- Agent adapters
  - `claude-code`: native skill loading, streaming, surgical edit
  - `api-fallback`: direct Anthropic Messages API, minimal tool loop (Read/Write/Edit only)
- Skills shipped in repo
  - `saas-landing` (Prototype)
  - `magazine-web-ppt` (Deck, fork of guizang-ppt-skill)
- Modes available
  - Prototype (fully working)
  - Deck (fully working)
  - Design System (basic: from text brief only; no screenshot input yet)
  - Template (deferred to Phase 2)
- Topologies
  - A: fully local (primary)
  - C: Vercel + direct API (partial; no daemon features)
Explicitly out of MVP:
- Codex / Cursor / Gemini adapters
- Comment mode + sliders
- Template gallery + template skill
- Design System from screenshot (vision) / PDF / URL
- PDF / PPTX export
- Topology B (Vercel + tunneled local daemon)
- Docker compose file
- Skill tests (`od skill test`)
- Auth / multi-user
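For readers unfamiliar with the SSE half of the daemon API, here is a minimal sketch of one event frame in the standard Server-Sent Events wire format. The event name `artifact.updated` and the payload field are invented for illustration. The real schemas are a Phase 0 deliverable (`docs/schemas/protocol.md`) and may look different.

```typescript
// Format one Server-Sent Events frame: an "event:" line, a "data:" line,
// and a blank line that terminates the frame.
function formatSseFrame(event: string, payload: Record<string, unknown>): string {
  // JSON.stringify without an indent argument emits a single line,
  // so one "data:" line is enough.
  const data = JSON.stringify(payload);
  return `event: ${event}\ndata: ${data}\n\n`;
}

// Hypothetical event name and payload, for illustration only.
const frame = formatSseFrame("artifact.updated", { path: "index.html" });
console.log(JSON.stringify(frame));
// "event: artifact.updated\ndata: {\"path\":\"index.html\"}\n\n"
```

A browser `EventSource` listening for `artifact.updated` would receive the JSON string as `event.data`.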
Week-by-week breakdown
| Week | Theme | Concrete deliverables |
|---|---|---|
| 1 | Scaffolding | pnpm workspaces (apps/web, apps/daemon, e2e); Next.js 16 base; daemon CLI skeleton; CI green |
| 2 | Daemon core | HTTP/SSE API; project/conversation store; skill registry scanning; artifact store; design-system resolver loading DESIGN.md |
| 3 | Claude Code adapter | detection (PATH + ~/.claude/ probe); spawn with --output-format stream-json; parser from JSON-lines → AgentEvent; streaming to daemon's session; cancel via SIGTERM |
| 4 | API-fallback adapter | Anthropic Messages streaming; minimal tool loop (Read/Write/Edit rooted to artifact cwd); integration with skill prompt injection |
| 5 | Web UI — chat + file workspace | React state + daemon-backed project store; SSE client; chat pane; file workspace reflects project files; skill picker |
| 6 | Web UI — preview + export | sandboxed iframe with hot reload; JSX → vendored React/Babel runtime; export ZIP; export self-contained HTML (inline CSS) |
| 7 | Default skills | port guizang-ppt-skill (no modifications; add od: extension block); write saas-landing skill; write 1–2 DESIGN.md examples; docs for skill authors |
| 8 | Polish + dogfood | end-to-end dogfooding; performance pass (daemon <500ms cold start, first generation overhead <50ms); bug-fixing; first publishable alpha |
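The Week 3 step "parser from JSON-lines → AgentEvent" is mostly a buffering problem, because a chunk read from the child process can end in the middle of a line. Below is a minimal sketch, assuming a hypothetical `AgentEvent` shape (the real one belongs to the adapter schema). It makes no claim about the fields Claude Code actually emits.

```typescript
// Hypothetical event shape; the real AgentEvent is defined by the adapter spec.
type AgentEvent = { type: string; [key: string]: unknown };

// Incrementally splits decoded stdout chunks into JSON lines and parses
// each complete line into an AgentEvent.
class JsonLinesParser {
  private buffer = "";

  // Feed one chunk; returns the events completed by this chunk.
  push(chunk: string): AgentEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.buffer = lines.pop() ?? "";
    const events: AgentEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue; // skip blank lines
      try {
        const parsed: unknown = JSON.parse(trimmed);
        const hasType =
          typeof parsed === "object" &&
          parsed !== null &&
          typeof (parsed as { type?: unknown }).type === "string";
        // Valid JSON without a string "type" field is ignored.
        if (hasType) events.push(parsed as AgentEvent);
      } catch {
        // Malformed line: surface it as an event rather than crash the stream.
        events.push({ type: "parse_error", raw: trimmed });
      }
    }
    return events;
  }
}

const parser = new JsonLinesParser();
console.log(parser.push('{"type":"text","value":"hel').length); // 0
console.log(parser.push('lo"}\n{"type":"done"}\n').length); // 2
```

Keeping the partial line in `buffer` means the caller can pass chunks exactly as they arrive, without aligning reads to line boundaries.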
MVP exit criteria
- `corepack enable && pnpm install && pnpm tools-dev run web` works on clean macOS and Linux with Node 24.
- With Claude Code installed: prototype + deck generation works end-to-end.
- Without Claude Code installed: API-fallback produces prototypes (not decks, because guizang-ppt-skill needs native skill loading).
- A user can drop a DESIGN.md into the project root and subsequent generations respect it.
- A third party can publish a skill repo; `od skill add <url>` installs it and it works.
- Artifacts are plain files; `git add ./.od/artifacts/` and `git log` tell a sensible story.
- No Electron, no Tauri, no desktop packaging anywhere in the repo.
Phase 2 — v1 (~8 weeks after MVP)
Goal: feature parity with the "UI-polish-heavy" parts of Open CoDesign + multi-agent support + the full four modes.
Scope
Agent adapters:
- `codex` (P1)
- `cursor-agent` (P1)
- capability-driven UI gating (disable features per adapter)
- agent fallback chain
UI:
- Comment mode (click element → surgical edit; only when `capabilities.surgicalEdit`)
- Slider parameters (live-tweak `od.parameters`)
- Multi-frame preview (desktop / tablet / phone)
- Template gallery UI with thumbnails
- Design System editor (split view: markdown ↔ sample-components preview)
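The capability gating and fallback chain named in the adapter and UI lists above could be sketched as follows. Only `capabilities.surgicalEdit` appears in this roadmap. The `streaming` flag, the `Adapter` shape, both function names, and the example capability values are assumptions made for illustration.

```typescript
// Hypothetical capability flags; only surgicalEdit is named in this roadmap.
type Capabilities = { streaming: boolean; surgicalEdit: boolean };
type Adapter = { id: string; available: boolean; capabilities: Capabilities };

// UI gating: Comment mode is offered only when the adapter can edit surgically.
function enabledFeatures(adapter: Adapter): { commentMode: boolean } {
  return { commentMode: adapter.capabilities.surgicalEdit };
}

// Fallback chain: walk the preferred order, return the first available adapter.
function pickAdapter(chain: string[], adapters: Adapter[]): Adapter | undefined {
  for (const id of chain) {
    const match = adapters.find((a) => a.id === id && a.available);
    if (match) return match;
  }
  return undefined;
}

// Example values are made up and say nothing about the real CLIs.
const adapters: Adapter[] = [
  { id: "claude-code", available: false, capabilities: { streaming: true, surgicalEdit: true } },
  { id: "codex", available: true, capabilities: { streaming: true, surgicalEdit: false } },
  { id: "api-fallback", available: true, capabilities: { streaming: true, surgicalEdit: false } },
];

const active = pickAdapter(["claude-code", "codex", "api-fallback"], adapters);
console.log(active?.id); // codex
console.log(active ? enabledFeatures(active).commentMode : false); // false
```

Deriving the UI state from the active adapter, rather than from the preferred one, keeps Comment mode from being offered after a fallback to an adapter that cannot honor it.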
Skills:
- Template skills: `stripe-ish-landing`, `linear-ish-docs`, `notion-ish-workspace`, `vercel-ish-pricing`
- More Prototype skills: `dashboard`, `login-flow`, `empty-state-pack`, `pricing-page`
- More Deck skills: `pitch-deck`, `product-demo-deck`
- Design System skills: `design-system-from-screenshot`, `design-system-refine`
Modes:
- Template mode fully shipped
- Design System mode extended: screenshot input, URL input
Export:
- PDF (Puppeteer)
- PPTX (pptxgenjs, driven by `slides.json`)
Deployment:
- Docker compose file
- Topology B: Vercel web + tunneled local daemon
  - Ship a helper subcommand: `od daemon --expose` using `cloudflared` (opt-in, documented)
Dev experience:
- `od skill test` with cheap-model runs
- Skill author starter template: `od skill scaffold`
v1 exit criteria
- All four modes fully functional.
- Three adapters working (Claude Code, Codex, Cursor Agent); fallback chain shipping.
- PDF + PPTX export working for at least the `magazine-web-ppt` + `pitch-deck` skills.
- Deployed example at `demo.open-design.dev` (Topology C).
- Skill author docs published; at least one third-party skill submitted.
- Documentation site rebuilt from these spec docs.
Phase 3 — v2 (~12 weeks after v1)
Goal: ecosystem + robustness.
Scope sketch (non-binding):
- Skill marketplace UI — searchable, categorized, install with one click
- Skill signing / checksums
- Gemini CLI + OpenCode + OpenClaw adapters (P2 tier)
- Windows support
- Collaborative mode (multi-user session on a single daemon)
- "Freeze prototype as design system" action
- Figma export (behind the Open CoDesign post-1.0 line; borrow their approach when they ship it)
- Telemetry (opt-in, self-hosted, never phoning home to a central service)
- Hosted SaaS offering (optional; full-local stays primary)
v2 isn't promised. It's the direction if v1 lands.
Self-evolution track
The newer Automations direction is tracked in
specs/current/automation-self-evolution.md.
It folds routines, scheduled connector digests, live-artifact refreshes, Orbit,
memory extraction, skill creation, token compression, and design-system
extraction into one Automation template model.
Milestones:
| Milestone | Deliverable |
|---|---|
| SE0 | Contracts for source packets, automation templates, evolution proposals, memory tree nodes, and compression reports. |
| SE1 | Editable memory tree that agents actually consume through the daemon and BYOK/API-mode prompt resolver. |
| SE2 | Automation template registry exposed in both web UI and od automation. |
| SE3 | Design-system extraction and skill crystallization proposals with review gates. |
| SE4 | Connector-driven ingestion into memory/design-system/skill proposals with provenance. |
| SE5 | Optional token compression with before/after token reports and rollback-safe stored originals. |
SE1 starts from the existing Markdown memory store: /api/memory/tree and
od memory tree list/view/edit/move expose a derived editable tree while the
same selected entries continue feeding daemon and BYOK/API-mode prompts.
SE2 also includes the first review gate: /api/automation-proposals plus
od automation proposal list/get/apply/reject can review memory-node, skill,
and design-system proposals. Accepted memory proposals write into the memory
tree; accepted skill and design-system proposals write reviewed drafts under
the user-owned runtime roots.
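The review gate described above can be pictured as a small state transition. This is a sketch under stated assumptions, not the daemon's implementation: the status names, the `Proposal` shape, and the in-memory `Stores` stand-in are hypothetical. Only the three proposal kinds and the apply/reject outcomes come from the text.

```typescript
type ProposalKind = "memory-node" | "skill" | "design-system";
type ProposalStatus = "pending" | "applied" | "rejected"; // hypothetical names

type Proposal = { id: string; kind: ProposalKind; status: ProposalStatus; body: string };

// In-memory stand-ins for the memory tree and the user-owned runtime roots.
type Stores = { memoryTree: string[]; drafts: string[] };

// Apply or reject a pending proposal; a decided proposal cannot be reviewed again.
function reviewProposal(p: Proposal, decision: "apply" | "reject", stores: Stores): Proposal {
  if (p.status !== "pending") {
    throw new Error(`proposal ${p.id} was already ${p.status}`);
  }
  if (decision === "reject") return { ...p, status: "rejected" };
  if (p.kind === "memory-node") {
    stores.memoryTree.push(p.body); // accepted memory goes into the tree
  } else {
    stores.drafts.push(`${p.kind}:${p.body}`); // skills and design systems become reviewed drafts
  }
  return { ...p, status: "applied" };
}

const stores: Stores = { memoryTree: [], drafts: [] };
const memory: Proposal = { id: "p1", kind: "memory-node", status: "pending", body: "prefers dark UI" };
console.log(reviewProposal(memory, "apply", stores).status); // applied
console.log(stores.memoryTree.length); // 1
```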
SE3/SE4 start closing the source loop through /api/automation-ingestions,
/api/automation-source-packets, and od automation source ingest/list/get.
The Automations page now has a source-ingestion panel that can turn pasted
connector/repo/artifact/chat context into stored source packets plus reviewable
memory, skill, and design-system proposals. Each ingestion can choose
off/balanced/aggressive compression and records before/after token counts while
preserving the original packet.
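To make the compression bookkeeping concrete, the sketch below records before/after counts and always keeps the original text. Everything except the three mode names is an assumption: the word-based `countTokens` is a crude stand-in for a real tokenizer, and the `compress` strategies are placeholders, not the project's compressor.

```typescript
type CompressionMode = "off" | "balanced" | "aggressive";

type IngestionRecord = {
  original: string; // always stored untouched, so compression can be rolled back
  compressed: string;
  tokensBefore: number;
  tokensAfter: number;
  mode: CompressionMode;
};

// Crude stand-in for a tokenizer: counts whitespace-separated words.
const countTokens = (text: string): number => text.split(/\s+/).filter(Boolean).length;

// Placeholder strategies, not the project's real compressor.
function compress(text: string, mode: CompressionMode): string {
  if (mode === "off") return text;
  const lines = text.split("\n").map((l) => l.trim()).filter((l) => l !== "");
  if (mode === "balanced") return lines.join("\n"); // drop blank lines and padding
  return Array.from(new Set(lines)).join("\n"); // aggressive: also drop repeated lines
}

function ingest(original: string, mode: CompressionMode): IngestionRecord {
  const compressed = compress(original, mode);
  return {
    original,
    compressed,
    tokensBefore: countTokens(original),
    tokensAfter: countTokens(compressed),
    mode,
  };
}

const record = ingest("a b\n\na b\nc", "aggressive");
console.log(record.tokensBefore, record.tokensAfter); // 5 3
console.log(record.original === "a b\n\na b\nc"); // true
```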
Exit criteria: a connected or uploaded source can become reviewable memory, skill, and design-system proposals; accepted proposals are visible in the tree and are consumed by a later agent run without extra prompting.
Risk register
| Risk | Impact | Mitigation |
|---|---|---|
| Claude Code JSON stream format changes between versions | adapter breaks | pin version range; write a compatibility test; keep a parser for each major release |
| Third-party agent CLIs don't expose enough to stream tool calls | UX degrades silently | capability flags + feature gates; document per-adapter limitations in-product |
| @mariozechner/pi-ai or similar abstractions get popular and contributors ask us to support them | scope creep | defer; if demand is real, add as yet-another-adapter next to api-fallback |
| Vercel deploy (Topology B) flaky because of tunnel setup | users can't try the cloud path | ship Topology C (direct API) as the always-works path; document Topology B as advanced |
| guizang-ppt-skill or similar upstream skill changes format | default deck skill breaks | pin git SHA in our default install; monitor upstream |
| DESIGN.md format evolves in awesome-claude-design | incompatibility | track upstream; adopt changes; our resolver is tolerant of missing sections |
| Anthropic ships an open-source Claude Design | differentiation collapses | our moat is the "uses user's existing agent" angle; Anthropic is unlikely to ship that |
| Skill security (malicious skill via od skill add) | user machine compromise | install-time warning; rely on agent's own permission model; document best practices |
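The "pin version range" mitigation in the first row could start as a check like the one below, run against whatever version string the agent CLI reports. This is a sketch only: the function names are invented, the version numbers are placeholders rather than real Claude Code releases, and a production check would more likely use a semver library.

```typescript
type Version = [number, number, number];

// Parse a leading "major.minor.patch"; returns undefined for other shapes.
function parseVersion(v: string): Version | undefined {
  const m = /^(\d+)\.(\d+)\.(\d+)/.exec(v.trim());
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : undefined;
}

// Negative, zero, or positive, comparing major, then minor, then patch.
function compareVersions(a: Version, b: Version): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// True when min <= version < maxExclusive; unparseable input is treated as unsupported.
function inPinnedRange(version: string, min: string, maxExclusive: string): boolean {
  const v = parseVersion(version);
  const lo = parseVersion(min);
  const hi = parseVersion(maxExclusive);
  if (!v || !lo || !hi) return false;
  return compareVersions(v, lo) >= 0 && compareVersions(v, hi) < 0;
}

// Placeholder numbers, not actual supported releases.
console.log(inPinnedRange("1.4.2", "1.0.0", "2.0.0")); // true
console.log(inPinnedRange("2.0.0", "1.0.0", "2.0.0")); // false
```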
Decision log (lightweight)
Record one line per material decision as we go. Example entries:
- 2026-04-24: Use plain files + `history.jsonl` over SQLite for artifacts. Why: git-reviewable, no driver dependency, matches "skills are files" ethos.
- 2026-04-24: Adopt `DESIGN.md` (awesome-claude-design) verbatim rather than inventing a new format. Why: 68 existing files are immediately compatible.
- 2026-04-24: Do not ship an Electron / Tauri wrapper. Why: every minute on code-signing is a minute not on skills; `cc-switch` already solves the tray-icon use case.
- 2026-04-24: Delegate the entire agent loop to the user's CLI. Why: reimplementing is worse than integrating; ecosystem compatibility beats control.
Decisions supersede each other; keep the log append-only and date every entry.
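To illustrate why the first decision calls `history.jsonl` git-reviewable, here is a sketch of append-only JSON Lines handling. The `HistoryEntry` fields are hypothetical, since the roadmap fixes only the file name and the plain-files idea. The functions work on strings so the example stays independent of any file-system layout.

```typescript
// Hypothetical entry shape; the real schema is not defined in this roadmap.
type HistoryEntry = { at: string; action: string; path: string };

// Append-only: each entry becomes exactly one line, so diffs stay line-based.
function appendEntry(log: string, entry: HistoryEntry): string {
  return log + JSON.stringify(entry) + "\n";
}

// Read the log back, one JSON object per non-empty line.
function readEntries(log: string): HistoryEntry[] {
  return log
    .split("\n")
    .filter((line) => line !== "")
    .map((line) => JSON.parse(line) as HistoryEntry);
}

let log = "";
log = appendEntry(log, { at: "2026-04-24T10:00:00Z", action: "create", path: "index.html" });
log = appendEntry(log, { at: "2026-04-24T10:05:00Z", action: "edit", path: "index.html" });
console.log(readEntries(log).map((e) => e.action)); // [ 'create', 'edit' ]
```

Because earlier lines are never rewritten, each new generation shows up in `git diff` as added lines only.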
What to do right after reading this
If you're the implementer:
- Read `spec.md` top to bottom.
- Skim `architecture.md`, `skills-protocol.md`, `agent-adapters.md`.
- Argue with anything in the four "open questions" sections; file one-line decisions.
- Fill in the missing Phase 0 deliverables (the `docs/schemas/` and `docs/examples/` files).
- Scaffold the monorepo and start Week 1.
If you're evaluating the concept:
- Read `README.md` + `spec.md` §1–3.
- Check the comparison matrix in `references.md`.
- Look at the worked example in `skills-protocol.md` §7; that's the end-to-end feel.