open-design/apps/daemon/tests/research.test.ts
Tom Huang 56bf6ee1b6
feat: agent-callable research command and /search (#615)
* feat: pre-generation research (Tavily) for grounded generation

Adds an optional pre-generation research step so the agent can produce
slides / prototypes / decks grounded in real sources instead of guessing.

User flow:
  1. Settings -> Tavily Search -> paste API key (or set TAVILY_API_KEY).
  2. Click the new Research button in the chat composer.
  3. On send, the daemon runs a Tavily search, prepends the findings
     as a <research_context> block ahead of the system prompt, and
     spawns the agent. Research progress shows up as status pills in
     the chat stream; the agent cites sources inline as [1]/[2]/...
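
A minimal sketch of step 3, assuming hypothetical `Findings` / `Source` shapes and helper names (`renderResearchContext`, `withResearch`); the real DTOs live in the contracts package and the real hook is in startChatRun, so treat the layout of the block as illustrative only:

```typescript
// Hypothetical shapes for illustration; not the actual contracts DTOs.
interface Source {
  title: string;
  url: string;
  snippet: string;
}

interface Findings {
  query: string;
  summary: string;
  sources: Source[];
}

// Render findings as a <research_context> block. Sources are numbered from 1
// so the agent can cite them inline as [1], [2], ...
function renderResearchContext(findings: Findings): string {
  const sources = findings.sources
    .map((s, i) => `[${i + 1}] ${s.title} (${s.url})\n${s.snippet}`)
    .join('\n\n');
  return [
    '<research_context>',
    `Query: ${findings.query}`,
    `Summary: ${findings.summary}`,
    '',
    sources,
    '</research_context>',
  ].join('\n');
}

// Prepend the block ahead of the system prompt. Research is optional, so a
// null result leaves the prompt unchanged.
function withResearch(systemPrompt: string, findings: Findings | null): string {
  if (!findings) return systemPrompt;
  return `${renderResearchContext(findings)}\n\n${systemPrompt}`;
}
```

Returning the untouched prompt when there are no findings keeps the research step strictly additive: a failed or skipped search cannot change the agent's baseline behavior.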

Phase 1 surface:
  - Single provider (Tavily), single depth ('shallow'), no LLM
    synthesis pass (Tavily's `answer` is the summary).
  - Composer toggle only; no popover / depth picker yet.
  - Reuses the existing `status` SSE agent payload + StatusPill UI
    so no new event variants or renderer code are needed.
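
The Phase 1 mapping from a Tavily response to findings might look roughly like this. The field names (`answer`, `results`, `content`, `published_date` on the input; `summary`, `snippet`, `publishedAt` on the output) follow the daemon test added in this commit; the type names, the `normalizeTavily` helper, and the fallbacks for missing fields are assumptions:

```typescript
// Subset of a Tavily /search response, limited to the fields used here.
interface TavilyResult {
  title?: string;
  url: string;
  content?: string;
  published_date?: string;
}

interface TavilyResponse {
  answer?: string;
  results?: TavilyResult[];
}

// Hypothetical output shapes; the real DTOs are defined in the contracts package.
interface ResearchSource {
  title: string;
  url: string;
  snippet: string;
  provider: 'tavily';
  publishedAt?: string;
}

interface ResearchFindings {
  query: string;
  summary: string;
  provider: 'tavily';
  depth: 'shallow';
  sources: ResearchSource[];
}

// No LLM synthesis pass: Tavily's `answer` is used directly as the summary.
function normalizeTavily(query: string, raw: TavilyResponse): ResearchFindings {
  const sources: ResearchSource[] = (raw.results ?? []).map((r) => ({
    title: r.title ?? r.url, // assumed fallback when a result has no title
    url: r.url,
    snippet: r.content ?? '',
    provider: 'tavily' as const,
    // Only set publishedAt when the provider supplied a date.
    ...(r.published_date ? { publishedAt: r.published_date } : {}),
  }));
  return {
    query,
    summary: raw.answer ?? '',
    provider: 'tavily',
    depth: 'shallow',
    sources,
  };
}
```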

Layers touched:
  - contracts: ResearchOptions / Source / Findings DTOs;
    ChatRequest.research; export from index.
  - daemon: apps/daemon/src/research/{index,tavily}.ts orchestrator
    + provider; tavily added to MEDIA_PROVIDERS and ENV_KEYS; hook
    in startChatRun before prompt assembly.
  - web: ChatComposer toggle + ChatSendMeta; threaded through
    ChatPane / ProjectView / streamViaDaemon into ChatRequest.

Side fix (required to land the feature, but useful on its own):
  contracts internal relative imports lacked the `.js` suffix that
  NodeNext module resolution requires. This was already breaking
  `pnpm --filter @open-design/daemon typecheck` on main; without the
  fix, none of the new research types were visible to the daemon.
  All internal contracts imports now carry `.js`.

Spec: specs/current/research-feature.md (phases 2-4 outlined for
follow-up: composer popover, multi-provider, deep recursion, example
skills with research_recommends).

Verified:
  - pnpm --filter @open-design/contracts typecheck/test
  - pnpm --filter @open-design/daemon typecheck (the chokidar
    project-watchers test is a pre-existing flake, unrelated)
  - pnpm --filter @open-design/web typecheck
  - node scripts/verify-media-models.mjs

* fix(daemon): clamp Tavily max_results to 20

Tavily's /search endpoint requires `max_results` in [0, 20]. Sending a
larger value (e.g. when `research.depth: "deep"` resolves to 30) returns
HTTP 400, and `runResearch` silently falls back to running without
research. Clamp at the provider boundary so Phase 2 depth tiers above 20
still produce results instead of failing the request.
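
A sketch of the clamp, under stated assumptions: `clampMaxResults`, `buildTavilyBody`, and the default of 5 are hypothetical, while the request fields mirror what the daemon test asserts on the outgoing body.

```typescript
// Upper bound that Tavily's /search endpoint accepts for max_results.
const TAVILY_MAX_RESULTS = 20;

// Clamp a requested source count into [0, 20]. The fallback of 5 is an
// assumption for illustration, not the daemon's actual default.
function clampMaxResults(requested: number | undefined, fallback = 5): number {
  const n = Number.isFinite(requested) ? Math.trunc(requested as number) : fallback;
  return Math.min(TAVILY_MAX_RESULTS, Math.max(0, n));
}

// Build the request body at the provider boundary, so callers can ask for
// any depth tier without knowing about the provider's limit.
function buildTavilyBody(query: string, maxSources?: number) {
  return {
    query,
    search_depth: 'basic',
    max_results: clampMaxResults(maxSources),
    include_answer: true,
    include_raw_content: false,
  };
}
```

Clamping inside the provider rather than in the caller keeps the limit next to the API it belongs to; a second provider with a different cap would carry its own bound.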

Generated-By: looper 0.6.1 (runner=fixer, agent=claude-code)

* Remove stale research merge leftovers

* Add agent-callable research search

* Fix Indonesian locale typecheck

* Fix research command invocation edge cases

* Harden slash search prompt expansion

* Honor research source caps in command contract

* Require search reports in design files

* Add research data provider settings

* Wire web research provider fallback order

* Update research provider fallback wording

* Revert "Update research provider fallback wording"

This reverts commit 86fb6001e3.

* Revert "Wire web research provider fallback order"

This reverts commit 4c9e16036b.

* Revert "Add research data provider settings"

This reverts commit 23630d1746.

* Add Dexter and Last30Days research skills

* Add DCF and Last30Days OD skills

* Add Last30Days and Dexter skills

* Resolve research review threads

---------

Co-authored-by: a1chzt <chizblank@gmail.com>
2026-05-08 10:33:44 +08:00


import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import path from 'node:path';
import { afterEach, describe, expect, it, vi } from 'vitest';

import { searchResearch, ResearchError } from '../src/research/index.js';

// Env vars that may hold the Tavily key; snapshotted below and restored
// after every test so cases cannot leak state into each other.
const TAVILY_ENV_KEYS = ['OD_TAVILY_API_KEY', 'TAVILY_API_KEY'];

type FetchInput = Parameters<typeof fetch>[0];
type FetchInit = Parameters<typeof fetch>[1];

describe('research search', () => {
  const originalEnv = Object.fromEntries(
    TAVILY_ENV_KEYS.map((key) => [key, process.env[key]]),
  );
  let projectRoot: string | null = null;

  afterEach(async () => {
    vi.unstubAllGlobals();
    for (const key of TAVILY_ENV_KEYS) {
      if (originalEnv[key] == null) delete process.env[key];
      else process.env[key] = originalEnv[key];
    }
    // Remove the temp project directory created by the test, if any.
    const dir = projectRoot;
    projectRoot = null;
    if (dir) await rm(dir, { recursive: true, force: true });
  });

  async function tempProjectRoot() {
    projectRoot = await mkdtemp(path.join(tmpdir(), 'od-research-project-'));
    return projectRoot;
  }

  it('requires a Tavily API key', async () => {
    for (const key of TAVILY_ENV_KEYS) delete process.env[key];

    await expect(
      searchResearch({ projectRoot: await tempProjectRoot(), query: 'EV trends' }),
    ).rejects.toMatchObject({
      code: 'TAVILY_API_KEY_MISSING',
      status: 400,
    } satisfies Partial<ResearchError>);
  });

  it('uses shallow Tavily search and normalizes JSON findings', async () => {
    process.env.OD_TAVILY_API_KEY = 'tvly-test';

    // Stub the global fetch with a canned Tavily response.
    const fetchMock = vi.fn(async (_input: FetchInput, _init?: FetchInit) =>
      new Response(
        JSON.stringify({
          answer: 'EV sales are growing.',
          results: [
            {
              title: 'EV report',
              url: 'https://example.com/ev',
              content: 'EV adoption increased in 2025.',
              published_date: '2025-05-01',
            },
          ],
        }),
        { status: 200, headers: { 'content-type': 'application/json' } },
      ),
    );
    vi.stubGlobal('fetch', fetchMock);

    // maxSources deliberately exceeds Tavily's cap of 20.
    const findings = await searchResearch({
      projectRoot: await tempProjectRoot(),
      query: 'EV market 2025 trends',
      maxSources: 50,
    });

    expect(findings).toMatchObject({
      query: 'EV market 2025 trends',
      summary: 'EV sales are growing.',
      provider: 'tavily',
      depth: 'shallow',
      sources: [
        {
          title: 'EV report',
          url: 'https://example.com/ev',
          snippet: 'EV adoption increased in 2025.',
          provider: 'tavily',
          publishedAt: '2025-05-01',
        },
      ],
    });

    // Inspect the outgoing request body: max_results must be clamped to 20.
    const [, init] = fetchMock.mock.calls[0] as [FetchInput, FetchInit];
    const body = JSON.parse(String(init!.body));
    expect(body).toMatchObject({
      query: 'EV market 2025 trends',
      search_depth: 'basic',
      max_results: 20,
      include_answer: true,
      include_raw_content: false,
    });
  });
});