open-design/apps/web/tests/components/chat-feedback.test.tsx
lefarcen 6690dbd5bb
feat(analytics): PostHog + Langfuse instrumentation for assistant feedback (#1558)
* feat(analytics): PostHog + Langfuse instrumentation for assistant feedback

Re-bases the original three-commit PR onto release/v0.8.0. The web-side
feedback UI instrumentation (surface_view / ui_click / feedback_submit_result)
landed on main while this branch was open, so on this rebase that wiring
is taken from main; the remaining net additions are:

- Contracts: TrackingFeedback* enums and the four dedicated
  assistant_feedback_* event payload types (click, reason_view,
  reason_click, reason_submit), plus normalizeCustomReason helper.
  The new event-name variants are added to TrackingEventName and the
  AnalyticsEventPayload discriminated union next to the existing
  surface_view/ui_click variants — both wire formats coexist.
- POST /api/runs/:id/feedback in apps/daemon/src/chat-routes.ts:
  thin route that validates rating, allowlists reasonCodes through a
  simple string filter, and fire-and-forgets into the daemon's
  reportFeedback hook.
- apps/daemon/src/langfuse-bridge.ts reportRunFeedbackFromDaemon
  forwards the rating + reasonCodes into Langfuse as user_rating
  (NUMERIC ±1) + user_rating_reason (CATEGORICAL, one per code)
  score-create entries. Gates on telemetry.metrics + telemetry.content.
- apps/web/src/providers/daemon.ts reportChatRunFeedback (fire-and-forget
  fetch) and apps/web/src/components/ProjectView.tsx wiring so each
  thumbs-up/down + reason submission posts the side-channel.

Conflicts resolved (release/v0.8.0 vs the branch's old base):
- packages/contracts/src/analytics/events.ts: keep main's
  file_upload_result / feedback_submit_result / settings_* event
  variants alongside the new assistant_feedback_* additions.
- apps/daemon/src/server.ts: keep DNS-aware validateExternalApiBaseUrl,
  add reportFeedback closure wired into registerChatRoutes telemetry.
- apps/daemon/src/chat-routes.ts: keep both /tool-result and the new
  /feedback routes; merge RegisterChatRoutesDeps to include both
  'paths' and 'telemetry'. Drop PR's chat-routes-local
  reconcileAssistantMessageOnRunEnd helper (main has the equivalent in
  server.ts).
- apps/web/src/components/ChatPane.tsx & AssistantMessage.tsx & ProjectView.tsx:
  keep main's projectKindForTracking prop name and its existing
  emission of surface_view / ui_click / feedback_submit_result; the
  PR's analyticsCtx-based reason_view/click/submit emission is dropped
  in this rebase since it would duplicate the existing wire format.
- apps/web/tests/components/*: rename projectKind → projectKindForTracking
  to match ChatPane's current prop name.

Outstanding review feedback (from the pre-rebase round, will be
addressed in a follow-up commit):
- AssistantMessage tests not yet passing the new feedback context to
  the direct render path.
- ProjectView clear-feedback path skips reportChatRunFeedback, leaving
  stale Langfuse user_rating scores.
- buildFeedbackPayload has no deletion path for previously-submitted
  user_rating_reason scores when the user switches thumbs.
- POST /api/runs/:id/feedback always returns {status:'accepted'} even
  when consent is off; needs to surface skipped_consent / skipped_no_sink.
- reasonCodes are filtered to string[] but not allowlisted against
  ChatMessageFeedbackReasonCode or deduped.

* fix(analytics): address review on assistant feedback rebase

Picks up the in-scope correctness items from the prior review round
and the rebase residue without rewriting history:

- chat-routes.ts: `/feedback` now awaits the daemon's preflight
  outcome and echoes it as the response. The contract was already
  shaped as `accepted | skipped_consent | skipped_no_sink`, but the
  previous handler always returned `accepted` because the network
  send was fire-and-forget. The consent + sink decision is local
  (a small file read and an env-var lookup); the actual Langfuse
  upload still runs as a detached promise.
- chat-routes.ts: reasonCodes are now allowlisted against the
  contract's reason-code union and deduplicated before reaching
  Langfuse, so a stale or replayed client can't poison the
  Langfuse score table with unknown categorical values or
  duplicate stable ids in the same batch.
- langfuse-bridge.ts: split the consent + sink resolution from the
  fire-and-forget network send so the route can claim `accepted`
  honestly. The legacy `skipped_no_sink` return on app-config read
  failure is preserved.

Contracts + comment hygiene:
- TrackingFeedbackReasonCode in packages/contracts/src/analytics/events.ts
  drifted from ChatMessageFeedbackReasonCode in packages/contracts/src/api/chat.ts;
  add `followed_design_system` and `missed_design_system` so the
  analytics wire format stays aligned with the persistence shape.
- langfuse-trace.ts buildFeedbackPayload: the docblock claimed the
  raw custom-reason text is bucketed before send. Product reversed
  that on 2026-05-13 (raw text now ships, consent-gated). Replace
  the stale comment with the real semantics + a note that there is
  no tombstone path for reason codes the user removes in a
  follow-up submission (left as scope for a later PR).
- AssistantMessage.tsx: remove the now-unused
  `AssistantFeedbackAnalyticsCtx` interface and a stray blank-line
  delete from the rebase; restore the analytics-context comment
  above the feedback hook.

Left as follow-up (intentional, documented in code):
- Sending a tombstone score when the user clears their rating —
  ProjectView still skips reportChatRunFeedback on `change===null`,
  so Langfuse retains the previous rating until the user re-submits.
  The PostHog event captures the clear separately.
- Removing reason-code scores when the user re-submits with a
  smaller set — buildFeedbackPayload only overwrites the codes
  present in the current payload.

* feat(analytics): wire PR's dedicated assistant_feedback_* events

The four dedicated event types (`assistant_feedback_click` /
`_reason_view` / `_reason_click` / `_reason_submit`) the PR added to
contracts were sitting unused after the rebase because main's
umbrella `surface_view` / `ui_click` / `feedback_submit_result`
emissions covered the same user gestures. Wire the dedicated events
alongside the umbrella ones so both wire formats fire on every
feedback action — dashboards / evals can pick whichever schema they
were built against without losing signal.

Each dedicated event has stricter typing than its umbrella sibling
(`project_id` / `project_kind` / `conversation_id` are non-null), so
the new emissions are guarded behind a presence check and skipped on
test renders that mount AssistantMessage without project context. The
umbrella emissions retain their nullable fallbacks unchanged.

Pairing:
- surface_view (feedback reason panel) ↔ assistant_feedback_reason_view
- ui_click (feedback button)           ↔ assistant_feedback_click
- ui_click (reason submit button)      ↔ assistant_feedback_reason_click
- feedback_submit_result               ↔ assistant_feedback_reason_submit

Reason click + submit share the existing `requestId` so PostHog can
stitch click→result across both schemas, matching the spec.
2026-05-21 19:28:51 +08:00

413 lines
12 KiB
TypeScript

// @vitest-environment jsdom
if (typeof HTMLElement.prototype.scrollTo !== 'function') {
HTMLElement.prototype.scrollTo = function (
options?: ScrollToOptions | number,
_y?: number,
) {
if (typeof options === 'object' && options !== null) {
if (options.top !== undefined) this.scrollTop = options.top;
if (options.left !== undefined) this.scrollLeft = options.left;
}
};
}
import { cleanup, fireEvent, render, screen, within } from '@testing-library/react';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import { ChatPane } from '../../src/components/ChatPane';
import type { ChatMessage, ChatMessageFeedbackChange } from '../../src/types';
const originalScrollIntoView = Element.prototype.scrollIntoView;
if (typeof Element.prototype.scrollIntoView !== 'function') {
Element.prototype.scrollIntoView = vi.fn();
}
function completedAssistant(
input: Partial<ChatMessage> = {},
): ChatMessage {
return {
id: 'assistant-1',
role: 'assistant',
content: 'Done',
createdAt: 1_700_000_000_000,
startedAt: 1_700_000_000_000,
endedAt: 1_700_000_003_000,
runStatus: 'succeeded',
...input,
};
}
function completedArtifactAssistant(
input: Partial<ChatMessage> = {},
): ChatMessage {
return completedAssistant({
producedFiles: [
{
name: 'index.html',
size: 1024,
mtime: 1_700_000_003_000,
kind: 'html',
mime: 'text/html',
},
],
...input,
});
}
function completedEditAssistant(
input: Partial<ChatMessage> = {},
): ChatMessage {
return completedAssistant({
events: [
{
kind: 'tool_use',
id: 'edit-1',
name: 'Edit',
input: { file_path: 'index.html' },
},
{
kind: 'tool_result',
toolUseId: 'edit-1',
content: 'Done',
isError: false,
},
],
...input,
});
}
function completedLiveArtifactAssistant(
input: Partial<ChatMessage> = {},
): ChatMessage {
return completedAssistant({
events: [
{
kind: 'live_artifact',
action: 'updated',
projectId: 'project-1',
artifactId: 'live-1',
title: 'Ricky Dental Poster',
refreshStatus: 'idle',
},
],
...input,
});
}
function renderChatPane({
messages,
streaming = false,
onAssistantFeedback = vi.fn(),
hasActiveDesignSystem = false,
}: {
messages: ChatMessage[];
streaming?: boolean;
onAssistantFeedback?: (
assistantMessage: ChatMessage,
change: ChatMessageFeedbackChange,
) => void;
hasActiveDesignSystem?: boolean;
}) {
return {
onAssistantFeedback,
...render(
<ChatPane
projectKindForTracking="prototype"
messages={messages}
streaming={streaming}
error={null}
projectId="project-1"
projectFiles={[]}
hasActiveDesignSystem={hasActiveDesignSystem}
onEnsureProject={async () => 'project-1'}
onSend={() => {}}
onStop={() => {}}
conversations={[]}
activeConversationId="conversation-1"
onSelectConversation={() => {}}
onDeleteConversation={() => {}}
onAssistantFeedback={onAssistantFeedback}
/>,
),
};
}
describe('chat assistant feedback', () => {
afterEach(() => cleanup());
beforeEach(() => {
Element.prototype.scrollIntoView = vi.fn();
});
afterEach(() => {
Element.prototype.scrollIntoView = originalScrollIntoView;
vi.restoreAllMocks();
});
it('collects feedback after any successfully completed assistant turn', () => {
renderChatPane({
messages: [completedAssistant()],
});
expect(screen.getByRole('group', { name: 'Feedback' })).toBeTruthy();
});
it('collects positive and negative feedback on completed artifact results', () => {
const { onAssistantFeedback } = renderChatPane({
messages: [completedArtifactAssistant()],
});
const feedbackGroup = screen.getByRole('group', { name: 'Feedback' });
const footer = document.querySelector('.assistant-footer');
expect(feedbackGroup.textContent).not.toContain('Feedback');
expect(footer?.contains(feedbackGroup)).toBe(true);
fireEvent.click(screen.getByRole('button', { name: 'Helpful' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-1' }),
{ rating: 'positive' },
);
fireEvent.click(screen.getByRole('button', { name: 'Not helpful' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-1' }),
{ rating: 'negative' },
);
expect(document.querySelector('.assistant-feedback-burst')).toBeTruthy();
});
it('shows feedback after completed artifact edits without newly produced files', () => {
renderChatPane({
messages: [completedEditAssistant()],
});
expect(screen.getByRole('group', { name: 'Feedback' })).toBeTruthy();
});
it('shows feedback after completed live artifact updates', () => {
renderChatPane({
messages: [completedLiveArtifactAssistant()],
});
expect(screen.getByRole('group', { name: 'Feedback' })).toBeTruthy();
});
it('keeps every artifact turn feedback control visible and independent', () => {
const { onAssistantFeedback } = renderChatPane({
messages: [
completedArtifactAssistant({ id: 'assistant-1' }),
{
id: 'user-1',
role: 'user',
content: 'Make another version',
createdAt: 1_700_000_004_000,
},
completedArtifactAssistant({ id: 'assistant-2', createdAt: 1_700_000_005_000 }),
],
});
const groups = screen.getAllByRole('group', { name: 'Feedback' });
expect(groups).toHaveLength(2);
fireEvent.click(within(groups[0]!).getByRole('button', { name: 'Helpful' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-1' }),
{ rating: 'positive' },
);
fireEvent.click(within(groups[1]!).getByRole('button', { name: 'Not helpful' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-2' }),
{ rating: 'negative' },
);
});
it('shows the persisted feedback state without saved copy', () => {
renderChatPane({
messages: [
completedArtifactAssistant({
feedback: {
rating: 'negative',
createdAt: 1_700_000_004_000,
updatedAt: 1_700_000_004_000,
},
}),
],
});
expect(screen.queryByText('Feedback saved')).toBeNull();
expect(
screen.getByRole('button', { name: 'Not helpful' }).getAttribute('aria-pressed'),
).toBe('true');
expect(
screen.getByRole('button', { name: 'Helpful' }).getAttribute('aria-pressed'),
).toBe('false');
});
it('clicking an already selected feedback rating clears it', () => {
const { onAssistantFeedback } = renderChatPane({
messages: [
completedArtifactAssistant({
feedback: {
rating: 'positive',
createdAt: 1_700_000_004_000,
updatedAt: 1_700_000_004_000,
},
}),
],
});
fireEvent.click(screen.getByRole('button', { name: 'Helpful' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-1' }),
null,
);
});
it('collects preset and custom reasons after a rating is selected', () => {
const { onAssistantFeedback } = renderChatPane({
messages: [completedArtifactAssistant()],
});
fireEvent.click(screen.getByRole('button', { name: 'Helpful' }));
expect(screen.getByText('Tell us why')).toBeTruthy();
expect(screen.getByText('😊')).toBeTruthy();
expect(
screen.getByTestId('assistant-feedback-discord-positive').getAttribute('href'),
).toBe('https://discord.gg/mHAjSMV6gz');
expect(screen.getByText(/Share what you made with the/i)).toBeTruthy();
fireEvent.click(screen.getByLabelText('Understood my request'));
fireEvent.click(screen.getByLabelText('Other'));
fireEvent.change(screen.getByPlaceholderText('Add a short note...'), {
target: { value: 'The layout is ready to present.' },
});
fireEvent.click(screen.getByRole('button', { name: 'Submit' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-1' }),
expect.objectContaining({
rating: 'positive',
reasonCodes: ['matched_request', 'other'],
customReason: 'The layout is ready to present.',
reasonsSubmittedAt: expect.any(Number),
}),
);
expect(screen.queryByText('Tell us why')).toBeNull();
});
it('adds design-system feedback reasons only when a design system is active', () => {
const { unmount } = renderChatPane({
messages: [completedArtifactAssistant()],
});
fireEvent.click(screen.getByRole('button', { name: 'Helpful' }));
expect(screen.queryByLabelText('Followed the design system')).toBeNull();
unmount();
const { onAssistantFeedback } = renderChatPane({
messages: [completedArtifactAssistant()],
hasActiveDesignSystem: true,
});
fireEvent.click(screen.getByRole('button', { name: 'Helpful' }));
fireEvent.click(screen.getByLabelText('Followed the design system'));
fireEvent.click(screen.getByRole('button', { name: 'Submit' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-1' }),
expect.objectContaining({
rating: 'positive',
reasonCodes: ['followed_design_system'],
}),
);
fireEvent.click(screen.getByRole('button', { name: 'Not helpful' }));
expect(screen.getByLabelText('Did not follow the design system')).toBeTruthy();
});
it('clears custom reason when Other is deselected', () => {
const { onAssistantFeedback } = renderChatPane({
messages: [completedArtifactAssistant()],
});
fireEvent.click(screen.getByRole('button', { name: 'Helpful' }));
fireEvent.click(screen.getByLabelText('Other'));
fireEvent.change(screen.getByPlaceholderText('Add a short note...'), {
target: { value: 'This note should not be submitted.' },
});
fireEvent.click(screen.getByLabelText('Other'));
expect(screen.queryByPlaceholderText('Add a short note...')).toBeNull();
fireEvent.click(screen.getByLabelText('Understood my request'));
fireEvent.click(screen.getByRole('button', { name: 'Submit' }));
expect(onAssistantFeedback).toHaveBeenLastCalledWith(
expect.objectContaining({ id: 'assistant-1' }),
expect.objectContaining({
rating: 'positive',
reasonCodes: ['matched_request'],
customReason: undefined,
reasonsSubmittedAt: expect.any(Number),
}),
);
});
it('uses a sad marker for negative feedback reasons', () => {
renderChatPane({
messages: [completedArtifactAssistant()],
});
fireEvent.click(screen.getByRole('button', { name: 'Not helpful' }));
expect(screen.getByText('Tell us why')).toBeTruthy();
expect(screen.getByText('😔')).toBeTruthy();
expect(
screen.getByTestId('assistant-feedback-discord-negative').getAttribute('href'),
).toBe('https://discord.gg/mHAjSMV6gz');
expect(
screen.getByText(/so the team can understand what went wrong/i),
).toBeTruthy();
});
it('scrolls the feedback reasons panel into view after selecting a rating', () => {
const scrollIntoView = vi.fn();
Element.prototype.scrollIntoView = scrollIntoView;
renderChatPane({
messages: [completedArtifactAssistant()],
});
fireEvent.click(screen.getByRole('button', { name: 'Not helpful' }));
expect(scrollIntoView).toHaveBeenCalledWith({ block: 'start', behavior: 'smooth' });
});
it('does not ask for feedback while the assistant is still running', () => {
renderChatPane({
streaming: true,
messages: [
{
id: 'assistant-1',
role: 'assistant',
content: '',
createdAt: 1_700_000_000_000,
startedAt: 1_700_000_000_000,
runStatus: 'running',
producedFiles: [
{
name: 'index.html',
size: 1024,
mtime: 1_700_000_003_000,
kind: 'html',
mime: 'text/html',
},
],
},
],
});
expect(screen.queryByRole('group', { name: 'Feedback' })).toBeNull();
});
});