mirror of
https://github.com/nexu-io/open-design.git
synced 2026-06-01 03:14:35 +07:00
* feat(analytics): PostHog + Langfuse instrumentation for assistant feedback
Re-bases the original three-commit PR onto release/v0.8.0. The web-side
feedback UI instrumentation (surface_view / ui_click / feedback_submit_result)
landed on main while this branch was open, so on this rebase that wiring
is taken from main; the remaining net additions are:
- Contracts: TrackingFeedback* enums and the four dedicated
assistant_feedback_* event payload types (click, reason_view,
reason_click, reason_submit), plus normalizeCustomReason helper.
The new event-name variants are added to TrackingEventName and the
AnalyticsEventPayload discriminated union next to the existing
surface_view/ui_click variants — both wire formats coexist.
- POST /api/runs/:id/feedback in apps/daemon/src/chat-routes.ts:
thin route that validates rating, allowlists reasonCodes through a
simple string filter, and fire-and-forgets into the daemon's
reportFeedback hook.
- apps/daemon/src/langfuse-bridge.ts reportRunFeedbackFromDaemon
forwards the rating + reasonCodes into Langfuse as user_rating
(NUMERIC ±1) + user_rating_reason (CATEGORICAL, one per code)
score-create entries. Gates on telemetry.metrics + telemetry.content.
- apps/web/src/providers/daemon.ts reportChatRunFeedback (fire-and-forget
fetch) and apps/web/src/components/ProjectView.tsx wiring so each
thumbs-up/down + reason submission posts the side-channel.
Conflicts resolved (release/v0.8.0 vs the branch's old base):
- packages/contracts/src/analytics/events.ts: keep main's
file_upload_result / feedback_submit_result / settings_* event
variants alongside the new assistant_feedback_* additions.
- apps/daemon/src/server.ts: keep DNS-aware validateExternalApiBaseUrl,
add reportFeedback closure wired into registerChatRoutes telemetry.
- apps/daemon/src/chat-routes.ts: keep both /tool-result and the new
/feedback routes; merge RegisterChatRoutesDeps to include both
'paths' and 'telemetry'. Drop PR's chat-routes-local
reconcileAssistantMessageOnRunEnd helper (main has the equivalent in
server.ts).
- apps/web/src/components/ChatPane.tsx & AssistantMessage.tsx & ProjectView.tsx:
keep main's projectKindForTracking prop name and its existing
emission of surface_view / ui_click / feedback_submit_result; the
PR's analyticsCtx-based reason_view/click/submit emission is dropped
in this rebase since it would duplicate the existing wire format.
- apps/web/tests/components/*: rename projectKind → projectKindForTracking
to match ChatPane's current prop name.
Outstanding review feedback (from the pre-rebase round, will be
addressed in a follow-up commit):
- AssistantMessage tests not yet passing the new feedback context to
the direct render path.
- ProjectView clear-feedback path skips reportChatRunFeedback, leaving
stale Langfuse user_rating scores.
- buildFeedbackPayload has no deletion path for previously-submitted
user_rating_reason scores when the user switches thumbs.
- POST /api/runs/:id/feedback always returns {status:'accepted'} even
when consent is off; needs to surface skipped_consent / skipped_no_sink.
- reasonCodes are filtered to string[] but not allowlisted against
ChatMessageFeedbackReasonCode or deduped.
* fix(analytics): address review on assistant feedback rebase
Picks up the in-scope correctness items from the prior review round
and the rebase residue without rewriting history:
- chat-routes.ts: `/feedback` now awaits the daemon's preflight
outcome and echoes it as the response. The contract was already
shaped as `accepted | skipped_consent | skipped_no_sink`, but the
previous handler always returned `accepted` because the network
send was fire-and-forget. The consent + sink decision is local
(a small file read and an env-var lookup); the actual Langfuse
upload still runs as a detached promise.
- chat-routes.ts: reasonCodes are now allowlisted against the
contract's reason-code union and deduplicated before reaching
Langfuse, so a stale or replayed client can't poison the
Langfuse score table with unknown categorical values or
duplicate stable ids in the same batch.
- langfuse-bridge.ts: split the consent + sink resolution from the
fire-and-forget network send so the route can claim `accepted`
honestly. The legacy `skipped_no_sink` return on app-config read
failure is preserved.
Contracts + comment hygiene:
- TrackingFeedbackReasonCode in packages/contracts/src/analytics/events.ts
drifted from ChatMessageFeedbackReasonCode in packages/contracts/src/api/chat.ts;
add `followed_design_system` and `missed_design_system` so the
analytics wire format stays aligned with the persistence shape.
- langfuse-trace.ts buildFeedbackPayload: the docblock claimed the
raw custom-reason text is bucketed before send. Product reversed
that on 2026-05-13 (raw text now ships, consent-gated). Replace
the stale comment with the real semantics + a note that there is
no tombstone path for reason codes the user removes in a
follow-up submission (left as scope for a later PR).
- AssistantMessage.tsx: remove the now-unused
`AssistantFeedbackAnalyticsCtx` interface and a stray blank-line
delete from the rebase; restore the analytics-context comment
above the feedback hook.
Left as follow-up (intentional, documented in code):
- Sending a tombstone score when the user clears their rating —
ProjectView still skips reportChatRunFeedback on `change===null`,
so Langfuse retains the previous rating until the user re-submits.
The PostHog event captures the clear separately.
- Removing reason-code scores when the user re-submits with a
smaller set — buildFeedbackPayload only overwrites the codes
present in the current payload.
* feat(analytics): wire PR's dedicated assistant_feedback_* events
The four dedicated event types (`assistant_feedback_click` /
`_reason_view` / `_reason_click` / `_reason_submit`) the PR added to
contracts were sitting unused after the rebase because main's
umbrella `surface_view` / `ui_click` / `feedback_submit_result`
emissions covered the same user gestures. Wire the dedicated events
alongside the umbrella ones so both wire formats fire on every
feedback action — dashboards / evals can pick whichever schema they
were built against without losing signal.
Each dedicated event has stricter typing than its umbrella sibling
(`project_id` / `project_kind` / `conversation_id` are non-null), so
the new emissions are guarded behind a presence check and skipped on
test renders that mount AssistantMessage without project context. The
umbrella emissions retain their nullable fallbacks unchanged.
Pairing:
- surface_view (feedback reason panel) ↔ assistant_feedback_reason_view
- ui_click (feedback button) ↔ assistant_feedback_click
- ui_click (reason submit button) ↔ assistant_feedback_reason_click
- feedback_submit_result ↔ assistant_feedback_reason_submit
Reason click + submit share the existing `requestId` so PostHog can
stitch click→result across both schemas, matching the spec.
258 lines
9.2 KiB
TypeScript
258 lines
9.2 KiB
TypeScript
// @vitest-environment jsdom
|
|
|
|
// Polyfill scrollTo for jsdom (not available in jsdom's HTMLElement)
|
|
if (typeof HTMLElement.prototype.scrollTo !== 'function') {
|
|
HTMLElement.prototype.scrollTo = function (
|
|
options?: ScrollToOptions | number,
|
|
_y?: number,
|
|
) {
|
|
if (typeof options === 'object' && options !== null) {
|
|
if (options.top !== undefined) this.scrollTop = options.top;
|
|
if (options.left !== undefined) this.scrollLeft = options.left;
|
|
}
|
|
};
|
|
}
|
|
|
|
import { act, cleanup, fireEvent, render, screen } from '@testing-library/react';
|
|
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
|
|
import { ChatPane } from '../../src/components/ChatPane';
|
|
import type { ChatMessage } from '../../src/types';
|
|
|
|
// jsdom does not run a layout engine, so scrollHeight/clientHeight/scrollTop
|
|
// are all 0. The scroll-preservation effect derives "near bottom" and the
|
|
// restore target from those, so we install prototype-level getters/setters
|
|
// that route every chat-log read/write through a per-test `geom` object.
|
|
//
|
|
// The earlier shape installed instance-level Object.defineProperty mocks on
|
|
// the *remounted* chat-log only AFTER `await switchTab('Chat')`. Inside that
|
|
// act() the component schedules a rAF that writes scrollTop on the new
|
|
// element; depending on whether jsdom's rAF polyfill flushed before the await
|
|
// resolved, the write either landed on the still-default prototype setter
|
|
// (lost) or the not-yet-installed instance setter (also lost). The instance
|
|
// mock's closure-captured `remountedTop` then served its initial 0 forever
|
|
// and the assertion failed nondeterministically across CI runs without any
|
|
// product-code change. Patching at the prototype level eliminates the race
|
|
// because any chat-log instance React mounts later automatically reads/writes
|
|
// through this single test-controlled geometry.
|
|
type Geom = { scrollHeight: number; clientHeight: number; scrollTop: number };
|
|
let geom: Geom;
|
|
let rafCallbacks: FrameRequestCallback[] = [];
|
|
let resizeCallbacks: ResizeObserverCallback[] = [];
|
|
let savedDescriptors: Record<
|
|
'scrollTop' | 'scrollHeight' | 'clientHeight',
|
|
PropertyDescriptor | undefined
|
|
>;
|
|
let originalResizeObserver: typeof ResizeObserver | undefined;
|
|
|
|
function isChatLog(el: HTMLElement): boolean {
|
|
return typeof el?.classList?.contains === 'function' && el.classList.contains('chat-log');
|
|
}
|
|
|
|
beforeEach(() => {
|
|
geom = { scrollHeight: 0, clientHeight: 0, scrollTop: 0 };
|
|
rafCallbacks = [];
|
|
resizeCallbacks = [];
|
|
vi.spyOn(window, 'requestAnimationFrame').mockImplementation((callback) => {
|
|
rafCallbacks.push(callback);
|
|
return rafCallbacks.length;
|
|
});
|
|
originalResizeObserver = globalThis.ResizeObserver;
|
|
class MockResizeObserver {
|
|
constructor(callback: ResizeObserverCallback) {
|
|
resizeCallbacks.push(callback);
|
|
}
|
|
observe = vi.fn();
|
|
unobserve = vi.fn();
|
|
disconnect = vi.fn();
|
|
}
|
|
Object.defineProperty(globalThis, 'ResizeObserver', {
|
|
configurable: true,
|
|
writable: true,
|
|
value: MockResizeObserver,
|
|
});
|
|
savedDescriptors = {
|
|
scrollTop: Object.getOwnPropertyDescriptor(HTMLElement.prototype, 'scrollTop'),
|
|
scrollHeight: Object.getOwnPropertyDescriptor(HTMLElement.prototype, 'scrollHeight'),
|
|
clientHeight: Object.getOwnPropertyDescriptor(HTMLElement.prototype, 'clientHeight'),
|
|
};
|
|
Object.defineProperty(HTMLElement.prototype, 'scrollTop', {
|
|
configurable: true,
|
|
get(this: HTMLElement) {
|
|
return isChatLog(this) ? geom.scrollTop : 0;
|
|
},
|
|
set(this: HTMLElement, v: number) {
|
|
if (isChatLog(this)) geom.scrollTop = v;
|
|
},
|
|
});
|
|
Object.defineProperty(HTMLElement.prototype, 'scrollHeight', {
|
|
configurable: true,
|
|
get(this: HTMLElement) {
|
|
return isChatLog(this) ? geom.scrollHeight : 0;
|
|
},
|
|
});
|
|
Object.defineProperty(HTMLElement.prototype, 'clientHeight', {
|
|
configurable: true,
|
|
get(this: HTMLElement) {
|
|
return isChatLog(this) ? geom.clientHeight : 0;
|
|
},
|
|
});
|
|
});
|
|
|
|
afterEach(() => {
|
|
cleanup();
|
|
vi.restoreAllMocks();
|
|
rafCallbacks = [];
|
|
resizeCallbacks = [];
|
|
if (originalResizeObserver) {
|
|
Object.defineProperty(globalThis, 'ResizeObserver', {
|
|
configurable: true,
|
|
writable: true,
|
|
value: originalResizeObserver,
|
|
});
|
|
} else {
|
|
delete (globalThis as unknown as { ResizeObserver?: unknown }).ResizeObserver;
|
|
}
|
|
for (const key of ['scrollTop', 'scrollHeight', 'clientHeight'] as const) {
|
|
const original = savedDescriptors[key];
|
|
if (original) {
|
|
Object.defineProperty(HTMLElement.prototype, key, original);
|
|
} else {
|
|
delete (HTMLElement.prototype as unknown as Record<string, unknown>)[key];
|
|
}
|
|
}
|
|
});
|
|
|
|
function setGeom(partial: Partial<Geom>) {
|
|
geom = { ...geom, ...partial };
|
|
}
|
|
|
|
function setUserScroll(top: number) {
|
|
geom.scrollTop = top;
|
|
const el = document.querySelector('.chat-log');
|
|
if (el) fireEvent.scroll(el);
|
|
}
|
|
|
|
function chatPaneEl(messages: ChatMessage[], activeConversationId: string | null) {
|
|
return (
|
|
<ChatPane
|
|
projectKindForTracking="prototype"
|
|
messages={messages}
|
|
streaming={false}
|
|
error={null}
|
|
projectId="project-1"
|
|
projectFiles={[]}
|
|
onEnsureProject={async () => 'project-1'}
|
|
onSend={() => {}}
|
|
onStop={() => {}}
|
|
conversations={[]}
|
|
activeConversationId={activeConversationId}
|
|
onSelectConversation={() => {}}
|
|
onDeleteConversation={() => {}}
|
|
/>
|
|
);
|
|
}
|
|
|
|
function renderChatPane(messages: ChatMessage[], activeConversationId: string | null = null) {
|
|
return render(chatPaneEl(messages, activeConversationId));
|
|
}
|
|
|
|
const sampleMessages: ChatMessage[] = [
|
|
{ id: 'u1', role: 'user', content: 'first request', createdAt: Date.now() },
|
|
{ id: 'a1', role: 'assistant', content: 'first reply', createdAt: Date.now() },
|
|
{ id: 'u2', role: 'user', content: 'second request', createdAt: Date.now() },
|
|
{ id: 'a2', role: 'assistant', content: 'second reply', createdAt: Date.now() },
|
|
];
|
|
|
|
async function flushFrame() {
|
|
await act(async () => {
|
|
const callbacks = rafCallbacks;
|
|
rafCallbacks = [];
|
|
callbacks.forEach((callback) => callback(performance.now()));
|
|
await Promise.resolve();
|
|
});
|
|
}
|
|
|
|
async function triggerResize() {
|
|
await act(async () => {
|
|
const callbacks = [...resizeCallbacks];
|
|
callbacks.forEach((callback) =>
|
|
callback([], {} as ResizeObserver),
|
|
);
|
|
await Promise.resolve();
|
|
});
|
|
}
|
|
|
|
describe('chat scroll behavior', () => {
|
|
it('does not render the removed comments tab beside chat', () => {
|
|
renderChatPane(sampleMessages);
|
|
|
|
expect(screen.queryByRole('tab', { name: 'Comments' })).toBeNull();
|
|
expect(screen.queryByRole('tab', { name: 'Chat' })).toBeNull();
|
|
});
|
|
|
|
it('does not auto-scroll a short scrollback (~90px above bottom) when new content streams in', async () => {
|
|
setGeom({ scrollHeight: 1000, clientHeight: 400, scrollTop: 0 });
|
|
const { rerender } = render(chatPaneEl(sampleMessages, null));
|
|
// Drain the initial-bottom-scroll rAF queued during the first render,
|
|
// otherwise it fires after our setUserScroll calls and re-pins the
|
|
// ref to true behind the test's back.
|
|
await flushFrame();
|
|
|
|
// User intentionally scrolls 90px up: distance = 1000 - 510 - 400 = 90.
|
|
// That's between the 80px auto-follow cutoff and the 120px jump-button
|
|
// threshold, so the user is deliberately reading slightly earlier
|
|
// output and should not be yanked to the latest message.
|
|
setUserScroll(510);
|
|
|
|
// A new assistant message streams in; scrollHeight grows.
|
|
const streamed: ChatMessage[] = [
|
|
...sampleMessages,
|
|
{ id: 'a3', role: 'assistant', content: 'streaming chunk', createdAt: Date.now() },
|
|
];
|
|
setGeom({ scrollHeight: 1100 });
|
|
await act(async () => {
|
|
rerender(chatPaneEl(streamed, null));
|
|
});
|
|
await flushFrame();
|
|
|
|
// Auto-scroll must respect the user's pause: scrollTop stays where
|
|
// they put it instead of snapping to the new scrollHeight.
|
|
expect(geom.scrollTop).toBe(510);
|
|
});
|
|
|
|
it('keeps following the latest content when the pinned chat DOM grows after render', async () => {
|
|
setGeom({ scrollHeight: 1000, clientHeight: 400, scrollTop: 0 });
|
|
renderChatPane(sampleMessages);
|
|
await flushFrame();
|
|
|
|
expect(geom.scrollTop).toBe(1000);
|
|
|
|
// A message body can grow after the messages effect has already run
|
|
// (markdown/tool rows/forms/images/layout). If the user was pinned to
|
|
// the bottom, that DOM resize must still carry them to the latest turn.
|
|
setGeom({ scrollHeight: 1200 });
|
|
await triggerResize();
|
|
await flushFrame();
|
|
|
|
expect(geom.scrollTop).toBe(1200);
|
|
});
|
|
|
|
it('lands new conversation at its own bottom when switching conversations', async () => {
|
|
const { rerender } = render(chatPaneEl(sampleMessages, 'conv-A'));
|
|
setGeom({ scrollHeight: 1000, clientHeight: 400, scrollTop: 0 });
|
|
|
|
// User scrolls up in conversation A.
|
|
setUserScroll(150);
|
|
|
|
// The active conversation changes to B. It must land at conversation
|
|
// B's own initial bottom, not at scrollTop: 0 and not at conversation
|
|
// A's saved offset.
|
|
rerender(chatPaneEl(sampleMessages, 'conv-B'));
|
|
await flushFrame();
|
|
|
|
// Saved state was cleared by the activeConversationId-reset effect,
|
|
// so the new conversation lands at its own scrollHeight rather than
|
|
// the browser default 0.
|
|
expect(geom.scrollTop).toBe(1000);
|
|
});
|
|
});
|