open-design

vndangkhoa/open-design

mirror of https://github.com/nexu-io/open-design.git synced 2026-06-01 03:14:35 +07:00

Commit graph

Author: kami
SHA1: 055680a67d
Message:

fix(daemon): dedupe scheduled routine slots (#1971)

* fix(daemon): dedupe scheduled routine slots

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): claim scheduled routine runs atomically

Co-authored-by: multica-agent <github@multica.ai>

* Fix routine loser snapshot rollback

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): defer scheduled routine side effects

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): terminate in-memory run on scheduled prepare failure

If `prepare()` throws after `persistPreparedRun()` has mutated the
routine run with real project/conversation/agentRunId values, the catch
in `RoutineService.start_` previously left the in-memory chat run
queued (no `discard()`). Its `completion` promise then hung forever
waiting on `design.runs.wait(run)`, and the `routine_runs` row stayed
pinned to `routine-pending-*` placeholders even though the project and
conversation rows for the real IDs had already been created.

The catch now calls `handlerStart.discard?.()` so the in-memory run
terminates as `canceled`, releasing `completion`, and passes the real
IDs through `updateRun` so the persisted failed row reflects what was
attempted instead of the placeholder sentinels. A cleanup failure
inside `discard()` is logged via `console.error` rather than swallowed,
following the same surface-don't-swallow rule the loser cleanup path
uses. The original prepare error is still rethrown so the scheduler
advances to the next cadence (the slot claim is already terminal, so
retrying the same slot would just duplicate-claim and lose).
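
The failure path described above can be sketched roughly as follows. This is a minimal, synchronous illustration rather than the daemon's actual code: `startScheduled`, the `Handler`, `RunIds` and `RunRow` shapes, the `preparedIds` field and the injectable `logError` parameter are hypothetical stand-ins. Only `prepare`, `discard`, `updateRun` and the use of `console.error` come from the commit message.

```typescript
// Hypothetical shapes; the real daemon types are not shown in the commit message.
interface RunIds {
  projectId: string;
  conversationId: string;
  agentRunId: string;
}

interface RunRow extends RunIds {
  status: "queued" | "failed";
}

interface Handler {
  // Assumption: the real IDs become readable once persistPreparedRun() has run.
  preparedIds?: RunIds;
  prepare(): void;
  discard?: () => void;
}

function startScheduled(
  handler: Handler,
  updateRun: (patch: Partial<RunRow>) => void,
  logError: (...args: unknown[]) => void = console.error,
): void {
  try {
    handler.prepare();
  } catch (prepareError) {
    try {
      // Terminate the in-memory run so nothing keeps waiting on it.
      handler.discard?.();
    } catch (cleanupError) {
      // Surface the cleanup failure instead of swallowing it.
      logError("routine discard failed", cleanupError);
    }
    // Record the real IDs (when known) on the failed row instead of placeholders.
    updateRun({ status: "failed", ...(handler.preparedIds ?? {}) });
    // Rethrow the original error so the caller can advance to the next cadence.
    throw prepareError;
  }
}
```

The inner `try` keeps a failing cleanup from masking the original prepare error, which is the one the caller needs to see.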

Added regression coverage in `apps/daemon/tests/routines.test.ts` for
both the normal prepare-failure path (real IDs persisted, discard
fired, completion resolved) and the case where the cleanup itself also
throws (failure surfaces via console.error, the row is still finalized
with the real IDs).

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): clear placeholder IDs on scheduled prepare failure

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): finalize routine prepare failures

* fix(daemon): defer manual routine setup cleanup

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): drop loser chat runs and rollback partial snapshot pins

Two follow-ups from the latest scheduler-claim review:

- Duplicate scheduled losers used to call `design.runs.finish(run, 'canceled')`,
  exposing a phantom canceled routine run on `/api/runs` even though no
  `routine_runs` row, conversation, or messages were ever committed. Split
  the handler tear-down into `discardUnstarted` (used for never-inserted
  paths — drops the in-memory run via the new `design.runs.drop()`) and the
  existing `discard` (used after `prepare()` runs — still finalizes as
  canceled and rolls back partial state).
- `resolvePluginSnapshot()` calls `linkSnapshotToProject()` before linking
  the conversation/run, so a failure mid-link could leave the reused project
  pinned to a snapshot the routine never durably claimed while
  `resolvedRoutineSnapshot` stayed null. Capture the intermediate snapshot
  id in `partiallyAppliedSnapshotId` when the resolver throws, and let
  `discard()` fall back to it for `restoreProjectSnapshotLink` so the
  previous project pin is restored either way.
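
The difference between the two tear-down paths in the first bullet can be illustrated with a small in-memory model. This is a sketch under assumptions: `RunRegistry`, `create`, `list` and `makeTeardown` are hypothetical names invented for the example. Only `finish`, `drop`, `discard` and `discardUnstarted` are taken from the commit message, and their real signatures may differ.

```typescript
type RunStatus = "queued" | "canceled";

interface ChatRun {
  id: string;
  status: RunStatus;
}

// Hypothetical in-memory registry standing in for `design.runs`.
class RunRegistry {
  private runs = new Map<string, ChatRun>();

  create(id: string): ChatRun {
    const run: ChatRun = { id, status: "queued" };
    this.runs.set(id, run);
    return run;
  }

  // Keeps the run visible, marked with a terminal status.
  finish(run: ChatRun, status: RunStatus): void {
    run.status = status;
  }

  // Forgets the run entirely, so it never shows up in listings.
  drop(run: ChatRun): void {
    this.runs.delete(run.id);
  }

  // Stand-in for what a listing endpoint such as /api/runs would return.
  list(): ChatRun[] {
    return Array.from(this.runs.values());
  }
}

function makeTeardown(registry: RunRegistry, run: ChatRun, rollback: () => void) {
  return {
    // Loser path: nothing was ever inserted, so just drop the run.
    discardUnstarted(): void {
      registry.drop(run);
    },
    // Post-prepare path: roll back partial state, then finalize as canceled.
    discard(): void {
      rollback();
      registry.finish(run, "canceled");
    },
  };
}
```

In this model a duplicate loser disappears from `list()` altogether, while a run that got as far as `prepare()` stays visible as `canceled`.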

Regression coverage added in `tests/routine-schedule-claims.test.ts`:

- A scheduled loser does not surface a phantom canceled chat run via
  `/api/runs` after the slot is lost.
- A resolver that throws after `linkSnapshotToProject()` (forced via a
  SQLite trigger on `conversations.applied_plugin_snapshot_id`) still
  restores the reused project's previous pin in `discard()`.
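
The partial-pin rollback exercised by the second regression test can be sketched as below. This is an illustrative model, not the daemon's implementation: `Project`, `SnapshotState`, `previousSnapshotId`, `resolveSnapshot` and `discardSnapshot` are hypothetical, and the guard that restores only when the routine's own pin is still in place is an assumption. The names `resolvedRoutineSnapshot` and `partiallyAppliedSnapshotId` come from the commit message.

```typescript
// Hypothetical in-memory model of a project's snapshot pin.
interface Project {
  appliedSnapshotId: string | null;
}

interface SnapshotState {
  previousSnapshotId: string | null;
  resolvedRoutineSnapshot: string | null;
  partiallyAppliedSnapshotId: string | null;
}

function resolveSnapshot(
  project: Project,
  snapshotId: string,
  linkConversation: () => void,
  state: SnapshotState,
): void {
  state.previousSnapshotId = project.appliedSnapshotId;
  // The project is pinned first, mirroring the ordering described above.
  project.appliedSnapshotId = snapshotId;
  try {
    linkConversation();
  } catch (error) {
    // Remember the pin that was applied before the failure.
    state.partiallyAppliedSnapshotId = snapshotId;
    throw error;
  }
  state.resolvedRoutineSnapshot = snapshotId;
}

function discardSnapshot(project: Project, state: SnapshotState): void {
  // Fall back to the partially applied id when resolution never completed.
  const applied = state.resolvedRoutineSnapshot ?? state.partiallyAppliedSnapshotId;
  // Assumption: restore only if this routine's pin is still on the project.
  if (applied !== null && project.appliedSnapshotId === applied) {
    project.appliedSnapshotId = state.previousSnapshotId;
  }
}
```

Without the fallback, a resolver that throws mid-link leaves `resolvedRoutineSnapshot` null, and the discard step has nothing to roll back from.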

* fix(daemon): return prepared routine run ids

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: kami.c <kami.c@chative.com>

Date: 2026-05-29 03:20:47 +00:00

1 commit