open-design/apps/daemon/tests/prompts/discovery-todo-cap.test.ts
Patrick A c5e38bbe58
fix(prompts): remove 10-item cap from discovery TodoWrite plan (#2298)
* fix(daemon): remove 10-item cap from discovery TodoWrite plan prompt

The RULE 3 sentence in DISCOVERY_AND_PHILOSOPHY told the model to write
'a plan of 5–10 short imperative items'. That upper bound caused the agent
to cap every plan at exactly ten steps even when the task genuinely needed
more. The TodoWrite JSON schema imposes no maxItems constraint, so the cap
was entirely prompt-driven.

Replace '5–10 short imperative items' with 'short imperative items covering
the work'. TodoWrite intent, RULE 3 label, and planning-before-building
requirement all survive unchanged.

Red spec: apps/daemon/tests/prompts/discovery-todo-cap.test.ts

* fix(prompts): remove 10-item cap from contracts discovery copy and harden tests

[pass-6,7 BLOCKER] packages/contracts/src/prompts/discovery.ts still had
the old '5-10 short imperative items' wording. apps/web imports
composeSystemPrompt from @open-design/contracts (ProjectView.tsx:43),
so web-originated chat runs were still subject to the cap.

[pass-8 WARNING] discovery-todo-cap.test.ts did not cover the contracts
copy, leaving that path unguarded. Also no guard against semantically
equivalent re-introduction via 'at most / maximum / no more than'.

Changes:
- packages/contracts/src/prompts/discovery.ts: apply same wording fix as
  apps/daemon; add inline rationale comment
- apps/daemon/src/prompts/discovery.ts: add inline rationale comment
- apps/daemon/tests/prompts/discovery-todo-cap.test.ts: add 4th assertion
  blocking 'at most|maximum|no more than N item' re-introduction
- packages/contracts/tests/system-prompt.test.ts: add 5-assertion suite
  guarding the contracts copy and composed prompt output
2026-05-22 16:23:37 +08:00

import { describe, expect, it } from 'vitest';
import { DISCOVERY_AND_PHILOSOPHY } from '../../src/prompts/discovery.js';
// The system prompt historically told the model to write "a plan of 5–10 short
// imperative items". That upper bound caused the agent to cap every plan at
// exactly ten steps and then stop or skip additional items — even when the task
// genuinely needed more. There is no maxItems constraint in the upstream
// TodoWrite JSON schema (the array is unbounded), so the cap is entirely
// prompt-driven and can be removed here.
//
// This test locks the absence of the cap so a future prompt edit cannot
// accidentally re-introduce the "5–10" or "5 to 10" wording.
describe('discovery.ts RULE 3 — TodoWrite plan item count', () => {
  it('does not cap the plan at 10 items via "5–10" wording', () => {
    // The old wording was "a plan of 5–10 short imperative items".
    // After the fix the sentence must not mention an upper bound of 10.
    // The character class accepts both an en dash and an ASCII hyphen.
    expect(DISCOVERY_AND_PHILOSOPHY).not.toMatch(/5[–\-]10\s+short\s+imperative/);
  });

  it('does not cap the plan at 10 items via "5 to 10" wording', () => {
    expect(DISCOVERY_AND_PHILOSOPHY).not.toMatch(/5 to 10\s+(?:short\s+)?items/i);
  });

  it('does not re-introduce a numeric cap via "at most / maximum / no more than" phrasing', () => {
    // Guard against semantically equivalent upper-bound re-introduction.
    expect(DISCOVERY_AND_PHILOSOPHY).not.toMatch(
      /(?:at most|maximum|no more than)\s+1[0-9]\s+(?:todo|plan|step|item)/i,
    );
  });

  it('still instructs the agent to write at least a few items', () => {
    // The intent — plan with TodoWrite before building — must survive the fix.
    expect(DISCOVERY_AND_PHILOSOPHY).toContain('TodoWrite');
    expect(DISCOVERY_AND_PHILOSOPHY).toContain('RULE 3');
  });
});
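
The guards above can be tried outside vitest against sample strings. The following is a minimal sketch, not part of the repository: `CAP_PATTERNS` and `findCapWording` are hypothetical names, and the sample sentences are illustrative rather than the actual contents of DISCOVERY_AND_PHILOSOPHY. The patterns mirror the three negative assertions in the test, assuming the first one is meant to accept either an en dash or a hyphen.

```typescript
// Hypothetical helper: the same three patterns the test asserts against,
// collected so a candidate prompt string can be checked in one call.
const CAP_PATTERNS: RegExp[] = [
  /5[–\-]10\s+short\s+imperative/,
  /5 to 10\s+(?:short\s+)?items/i,
  /(?:at most|maximum|no more than)\s+1[0-9]\s+(?:todo|plan|step|item)/i,
];

// Returns every pattern that detects cap wording in the given prompt text.
// None of the patterns use the global flag, so RegExp.test is stateless here.
function findCapWording(prompt: string): RegExp[] {
  return CAP_PATTERNS.filter((pattern) => pattern.test(prompt));
}

// Illustrative inputs (assumed wording, not copied from the real prompt).
const oldWording = 'write a plan of 5–10 short imperative items';
const newWording = 'write a plan of short imperative items covering the work';

console.log(findCapWording(oldWording).length); // 1
console.log(findCapWording(newWording).length); // 0
console.log(findCapWording('use at most 10 items').length); // 1
console.log(findCapWording('no more than 20 steps').length); // 0
```

The last call shows a limitation of the third pattern as written: `1[0-9]` only covers the numbers 10 through 19, so a cap such as "no more than 20 steps" would pass the guard unnoticed. Widening it to `\d+` would be one option if broader coverage is wanted.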