mirror of https://github.com/nexu-io/open-design.git (synced 2026-06-01 03:14:35 +07:00)
feat(analytics): PostHog + Langfuse instrumentation for assistant feedback (#1558)
* feat(analytics): PostHog + Langfuse instrumentation for assistant feedback
Re-bases the original three-commit PR onto release/v0.8.0. The web-side
feedback UI instrumentation (surface_view / ui_click / feedback_submit_result)
landed on main while this branch was open, so on this rebase that wiring
is taken from main; the remaining net additions are:
- Contracts: TrackingFeedback* enums and the four dedicated
assistant_feedback_* event payload types (click, reason_view,
reason_click, reason_submit), plus normalizeCustomReason helper.
The new event-name variants are added to TrackingEventName and the
AnalyticsEventPayload discriminated union next to the existing
surface_view/ui_click variants — both wire formats coexist.
- POST /api/runs/:id/feedback in apps/daemon/src/chat-routes.ts:
thin route that validates rating, allowlists reasonCodes through a
simple string filter, and fire-and-forgets into the daemon's
reportFeedback hook.
- apps/daemon/src/langfuse-bridge.ts reportRunFeedbackFromDaemon
forwards the rating + reasonCodes into Langfuse as user_rating
(NUMERIC ±1) + user_rating_reason (CATEGORICAL, one per code)
score-create entries. Gates on telemetry.metrics + telemetry.content.
- apps/web/src/providers/daemon.ts reportChatRunFeedback (fire-and-forget
fetch) and apps/web/src/components/ProjectView.tsx wiring so each
thumbs-up/down + reason submission posts the side-channel.
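The rating-to-score mapping listed above can be sketched as follows. This is a minimal illustration rather than the code in the bridge: the `ScoreEntry` type and the `toScoreEntries` name are hypothetical, and only the score names, data types, and the +1 / -1 values come from the description.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
type Rating = 'positive' | 'negative';

interface ScoreEntry {
  name: 'user_rating' | 'user_rating_reason';
  dataType: 'NUMERIC' | 'CATEGORICAL';
  value: number | string;
}

// One NUMERIC user_rating entry, followed by one CATEGORICAL entry per code.
function toScoreEntries(rating: Rating, reasonCodes: string[]): ScoreEntry[] {
  const entries: ScoreEntry[] = [
    {
      name: 'user_rating',
      dataType: 'NUMERIC',
      value: rating === 'positive' ? 1 : -1,
    },
  ];
  for (const code of reasonCodes) {
    entries.push({
      name: 'user_rating_reason',
      dataType: 'CATEGORICAL',
      value: code,
    });
  }
  return entries;
}
```

A thumbs-down with two reason codes would yield three entries under this sketch: the numeric rating first, then the two categorical reasons in submission order.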
Conflicts resolved (release/v0.8.0 vs the branch's old base):
- packages/contracts/src/analytics/events.ts: keep main's
file_upload_result / feedback_submit_result / settings_* event
variants alongside the new assistant_feedback_* additions.
- apps/daemon/src/server.ts: keep DNS-aware validateExternalApiBaseUrl,
add reportFeedback closure wired into registerChatRoutes telemetry.
- apps/daemon/src/chat-routes.ts: keep both /tool-result and the new
/feedback routes; merge RegisterChatRoutesDeps to include both
'paths' and 'telemetry'. Drop PR's chat-routes-local
reconcileAssistantMessageOnRunEnd helper (main has the equivalent in
server.ts).
- apps/web/src/components/ChatPane.tsx, AssistantMessage.tsx, and ProjectView.tsx:
keep main's projectKindForTracking prop name and its existing
emission of surface_view / ui_click / feedback_submit_result; the
PR's analyticsCtx-based reason_view/click/submit emission is dropped
in this rebase since it would duplicate the existing wire format.
- apps/web/tests/components/*: rename projectKind → projectKindForTracking
to match ChatPane's current prop name.
Outstanding review feedback (from the pre-rebase round, will be
addressed in a follow-up commit):
- AssistantMessage tests do not yet pass the new feedback context to
the direct render path.
- ProjectView clear-feedback path skips reportChatRunFeedback, leaving
stale Langfuse user_rating scores.
- buildFeedbackPayload has no deletion path for previously-submitted
user_rating_reason scores when the user switches thumbs.
- POST /api/runs/:id/feedback always returns {status:'accepted'} even
when consent is off; needs to surface skipped_consent / skipped_no_sink.
- reasonCodes are filtered to string[] but not allowlisted against
ChatMessageFeedbackReasonCode or deduped.
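The last item above (allowlisting and deduplicating reasonCodes) could be addressed along these lines. This is a sketch under stated assumptions: `ALLOWED_REASON_CODES` holds only a sample of codes, and `sanitizeReasonCodes` is a hypothetical helper name, not a function from the repository.

```typescript
// Hypothetical, partial allowlist; the full union lives in the contracts package.
const ALLOWED_REASON_CODES: ReadonlySet<string> = new Set([
  'matched_request',
  'missed_request',
  'other',
]);

// Keep only known string codes and drop repeats, preserving first-seen order.
function sanitizeReasonCodes(input: unknown): string[] {
  if (!Array.isArray(input)) return [];
  const known = input.filter(
    (c): c is string => typeof c === 'string' && ALLOWED_REASON_CODES.has(c),
  );
  // Set iteration follows insertion order, so the first occurrence wins.
  return Array.from(new Set(known));
}
```

Filtering before deduplicating means unknown or non-string values never enter the set, so the result is safe to use directly as categorical values.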
* fix(analytics): address review on assistant feedback rebase
Picks up the in-scope correctness items from the prior review round
and the rebase residue without rewriting history:
- chat-routes.ts: `/feedback` now awaits the daemon's preflight
outcome and echoes it as the response. The contract was already
shaped as `accepted | skipped_consent | skipped_no_sink`, but the
previous handler always returned `accepted` because the network
send was fire-and-forget. The consent + sink decision is local
(a small file read and an env-var lookup); the actual Langfuse
upload still runs as a detached promise.
- chat-routes.ts: reasonCodes are now allowlisted against the
contract's reason-code union and deduplicated before reaching
Langfuse, so a stale or replayed client can't poison the
Langfuse score table with unknown categorical values or
duplicate stable ids in the same batch.
- langfuse-bridge.ts: split the consent + sink resolution from the
fire-and-forget network send so the route can claim `accepted`
honestly. The legacy `skipped_no_sink` return on app-config read
failure is preserved.
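The split between the local preflight decision and the detached network send might look roughly like this. It is a simplified sketch, not the bridge's implementation: `decideOutcome`, `reportWithPreflight`, and the injected `Deps` functions are hypothetical names introduced so the example runs without a daemon.

```typescript
type Outcome = 'accepted' | 'skipped_consent' | 'skipped_no_sink';

interface Prefs {
  metrics?: boolean;
  content?: boolean;
}

// Pure preflight: both consent flags must be on, and a sink must be configured.
function decideOutcome(prefs: Prefs, sinkConfigured: boolean): Outcome {
  if (prefs.metrics !== true || prefs.content !== true) return 'skipped_consent';
  if (!sinkConfigured) return 'skipped_no_sink';
  return 'accepted';
}

// Hypothetical dependencies, injected for illustration only.
interface Deps {
  readPrefs: () => Promise<Prefs>; // local config read
  hasSink: () => boolean; // e.g. an env-var lookup
  send: () => Promise<void>; // the network upload
}

async function reportWithPreflight(deps: Deps): Promise<{ status: Outcome }> {
  let prefs: Prefs;
  try {
    prefs = await deps.readPrefs();
  } catch {
    // Mirrors the legacy behaviour described above: a config read failure
    // is reported as skipped_no_sink.
    return { status: 'skipped_no_sink' };
  }
  const status = decideOutcome(prefs, deps.hasSink());
  if (status === 'accepted') {
    // Detached on purpose: send failures are logged, never returned.
    void deps.send().catch((err) => console.warn('send failed:', String(err)));
  }
  return { status };
}
```

Because only the cheap local checks are awaited, the caller gets a truthful status without waiting on the upload.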
Contracts + comment hygiene:
- TrackingFeedbackReasonCode in packages/contracts/src/analytics/events.ts
drifted from ChatMessageFeedbackReasonCode in packages/contracts/src/api/chat.ts;
add `followed_design_system` and `missed_design_system` so the
analytics wire format stays aligned with the persistence shape.
- langfuse-trace.ts buildFeedbackPayload: the docblock claimed the
raw custom-reason text is bucketed before send. Product reversed
that on 2026-05-13 (raw text now ships, consent-gated). Replace
the stale comment with the real semantics + a note that there is
no tombstone path for reason codes the user removes in a
follow-up submission (left as scope for a later PR).
- AssistantMessage.tsx: remove the now-unused
`AssistantFeedbackAnalyticsCtx` interface, revert a stray blank-line
deletion left over from the rebase, and restore the analytics-context
comment above the feedback hook.
Left as follow-up (intentional, documented in code):
- Sending a tombstone score when the user clears their rating —
ProjectView still skips reportChatRunFeedback on `change===null`,
so Langfuse retains the previous rating until the user re-submits.
The PostHog event captures the clear separately.
- Removing reason-code scores when the user re-submits with a
smaller set — buildFeedbackPayload only overwrites the codes
present in the current payload.
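To make the second follow-up concrete, the sketch below computes which reason scores would be left behind by a smaller re-submission. It assumes the stable id scheme `${runId}-reason-${code}` used by buildFeedbackPayload; `staleReasonScoreIds` is a hypothetical helper, not part of this change.

```typescript
// Stable per (run, code), matching the id scheme assumed above.
function reasonScoreId(runId: string, code: string): string {
  return `${runId}-reason-${code}`;
}

// Ids written by an earlier submission that a later, smaller one never touches.
function staleReasonScoreIds(
  runId: string,
  previousCodes: string[],
  currentCodes: string[],
): string[] {
  const kept = new Set(currentCodes);
  return previousCodes
    .filter((code) => !kept.has(code))
    .map((code) => reasonScoreId(runId, code));
}
```

A later tombstone path could emit overwriting scores for exactly these ids; this sketch only identifies them.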
* feat(analytics): wire PR's dedicated assistant_feedback_* events
The four dedicated event types (`assistant_feedback_click` /
`_reason_view` / `_reason_click` / `_reason_submit`) the PR added to
contracts were sitting unused after the rebase because main's
umbrella `surface_view` / `ui_click` / `feedback_submit_result`
emissions covered the same user gestures. Wire the dedicated events
alongside the umbrella ones so both wire formats fire on every
feedback action — dashboards / evals can pick whichever schema they
were built against without losing signal.
Each dedicated event has stricter typing than its umbrella sibling
(`project_id` / `project_kind` / `conversation_id` are non-null), so
the new emissions are guarded behind a presence check and skipped on
test renders that mount AssistantMessage without project context. The
umbrella emissions retain their nullable fallbacks unchanged.
Pairing:
- surface_view (feedback reason panel) ↔ assistant_feedback_reason_view
- ui_click (feedback button) ↔ assistant_feedback_click
- ui_click (reason submit button) ↔ assistant_feedback_reason_click
- feedback_submit_result ↔ assistant_feedback_reason_submit
Reason click + submit share the existing `requestId` so PostHog can
stitch click→result across both schemas, matching the spec.
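The dual emission with a presence guard and a shared `requestId` can be sketched as below. The `Track` signature is an assumption modeled on the `(name, props, options)` calls in this commit, and `emitReasonSubmit`, `FeedbackIds`, and the emission order are hypothetical; the real component code may differ.

```typescript
// Hypothetical tracker signature; the real Track type may differ.
type Track = (
  name: string,
  props: Record<string, unknown>,
  options?: { requestId: string },
) => void;

interface FeedbackIds {
  projectId?: string | null;
  projectKind?: string | null;
  conversationId?: string | null;
}

// Emits the umbrella pair always, and the dedicated pair only when the
// stricter identity fields are all present. Every event shares one requestId.
function emitReasonSubmit(track: Track, ids: FeedbackIds, requestId: string): void {
  const loose = {
    project_id: ids.projectId ?? null,
    project_kind: ids.projectKind ?? null,
    conversation_id: ids.conversationId ?? null,
  };
  track('ui_click', loose, { requestId });
  track('feedback_submit_result', loose, { requestId });

  if (ids.projectId && ids.projectKind && ids.conversationId) {
    const strict = {
      project_id: ids.projectId,
      project_kind: ids.projectKind,
      conversation_id: ids.conversationId,
    };
    track('assistant_feedback_reason_click', strict, { requestId });
    track('assistant_feedback_reason_submit', strict, { requestId });
  }
}
```

With full identity this sketch fires four events under one `requestId`; a render without project context fires only the two umbrella events.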
parent 10e2019c59
commit 6690dbd5bb
18 changed files with 780 additions and 9 deletions

@@ -12,7 +12,25 @@ import { isSafeId as isSafeProjectId } from './projects.js';
 import { projectKindToTracking } from '@open-design/contracts/analytics';
 import { validateBaseUrlResolved } from './connectionTest.js';

-export interface RegisterChatRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'chat' | 'agents' | 'critique' | 'validation' | 'lifecycle' | 'paths'> {}
+// Allowlist for the `/feedback` route. Mirrors the
+// ChatMessageFeedbackReasonCode union in packages/contracts/src/api/chat.ts.
+// Kept inline (not imported as a runtime value, since the contract type is
+// type-only) so a stale client can't poison Langfuse with unknown categories.
+const FEEDBACK_REASON_ALLOWLIST: ReadonlySet<string> = new Set([
+  'matched_request',
+  'strong_visual',
+  'useful_structure',
+  'easy_to_continue',
+  'followed_design_system',
+  'missed_request',
+  'weak_visual',
+  'incomplete_output',
+  'hard_to_use',
+  'missed_design_system',
+  'other',
+]);
+
+export interface RegisterChatRoutesDeps extends RouteDeps<'db' | 'design' | 'http' | 'chat' | 'agents' | 'critique' | 'validation' | 'lifecycle' | 'paths' | 'telemetry'> {}

 export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
   const { db, design } = ctx;
@@ -122,6 +140,74 @@ export function registerChatRoutes(app: Express, ctx: RegisterChatRoutesDeps) {
     res.json({ ok: true });
   });

+  // Receives the user's thumbs-up/down (+ reason codes) for an assistant
+  // turn and forwards it to Langfuse as a `score-create`. Web persists the
+  // feedback itself via PUT /messages/:id; this endpoint exists only as a
+  // telemetry side channel — the daemon is the single network egress for
+  // Langfuse and gates on `telemetry.metrics + telemetry.content` consent.
+  //
+  // The consent + sink decision is fast (awaits a small file read, no
+  // network); we await it so the response status honestly reflects whether
+  // the score was enqueued, skipped for consent, or skipped because no
+  // Langfuse sink is configured. The actual Langfuse network call happens
+  // as a detached promise inside the bridge.
+  app.post('/api/runs/:id/feedback', async (req, res) => {
+    const runId = req.params.id;
+    const body = (req.body ?? {}) as Partial<{
+      projectId: string;
+      conversationId: string;
+      assistantMessageId: string;
+      rating: 'positive' | 'negative';
+      reasonCodes: string[];
+      hasCustomReason: boolean;
+      customReason: string;
+    }>;
+    if (!runId) {
+      return sendApiError(res, 400, 'INVALID_RUN_ID', 'runId missing');
+    }
+    if (body.rating !== 'positive' && body.rating !== 'negative') {
+      return sendApiError(res, 400, 'INVALID_RATING', 'rating must be positive or negative');
+    }
+    // Drop anything outside the contract-side reason allowlist and
+    // deduplicate; otherwise a malformed or replayed client payload could
+    // create unknown Langfuse categories or duplicate score ids in the
+    // same batch.
+    const reasonCodes = Array.isArray(body.reasonCodes)
+      ? Array.from(
+          new Set(
+            body.reasonCodes.filter(
+              (c): c is string =>
+                typeof c === 'string' && FEEDBACK_REASON_ALLOWLIST.has(c),
+            ),
+          ),
+        )
+      : [];
+    const customReason = typeof body.customReason === 'string' ? body.customReason : '';
+    const reportFeedback = ctx.telemetry?.reportFeedback;
+    if (!reportFeedback) {
+      res.status(202).json({ status: 'skipped_no_sink' });
+      return;
+    }
+    // Build score metadata bag that lands in the Langfuse score body.
+    // Mirrors the PostHog event so analysts can cross-reference.
+    const scoreMetadata: Record<string, unknown> = {
+      projectId: body.projectId,
+      conversationId: body.conversationId,
+      assistantMessageId: body.assistantMessageId,
+      hasCustomReason: body.hasCustomReason === true,
+      customReason,
+    };
+    const outcome = await reportFeedback({
+      runId,
+      rating: body.rating,
+      reasonCodes,
+      hasCustomReason: body.hasCustomReason === true,
+      customReason,
+      scoreMetadata,
+    });
+    res.status(202).json(outcome);
+  });
+
   app.post('/api/chat', (req, res) => {
     if (isDaemonShuttingDown()) {
       return sendApiError(res, 503, 'UPSTREAM_UNAVAILABLE', 'daemon is shutting down');

@@ -14,9 +14,12 @@ import { readAppConfig } from './app-config.js';
 import type { AppVersionInfo } from './app-version.js';
 import { listMessages } from './db.js';
 import {
+  readTelemetrySinkConfig,
   reportRunCompleted,
+  reportRunFeedback,
   type ArtifactSummary,
   type EventsSummary,
+  type FeedbackReportContext,
   type MessageSummary,
   type ReportContext,
   type RuntimeInfo,
@@ -357,3 +360,71 @@ export async function reportRunCompletedFromDaemon(
     console.warn('[langfuse-bridge] report failed:', String(err));
   }
 }
+
+export interface ReportRunFeedbackFromDaemonOpts {
+  dataDir: string;
+  runId: string;
+  rating: 'positive' | 'negative';
+  reasonCodes: string[];
+  hasCustomReason: boolean;
+  /** Raw "other" free text. Empty when no custom reason. */
+  customReason: string;
+  /** Extra context for Langfuse score metadata (projectId / conversationId / assistantMessageId). */
+  scoreMetadata?: Record<string, unknown>;
+  fetchImpl?: typeof fetch;
+}
+
+/**
+ * Result for the POST /api/runs/:id/feedback handler. Telemetry is
+ * best-effort and the network call runs after the response is sent, but
+ * the handler still tells the caller whether the report was at least
+ * enqueued — useful for QA and e2e.
+ */
+export type FeedbackReportOutcome =
+  | { status: 'accepted' }
+  | { status: 'skipped_consent' }
+  | { status: 'skipped_no_sink' };
+
+export async function reportRunFeedbackFromDaemon(
+  opts: ReportRunFeedbackFromDaemonOpts,
+): Promise<FeedbackReportOutcome> {
+  let cfg;
+  try {
+    cfg = await readAppConfig(opts.dataDir);
+  } catch (err) {
+    console.warn('[langfuse-bridge] feedback config read failed:', String(err));
+    return { status: 'skipped_no_sink' };
+  }
+  const prefs = cfg.telemetry ?? {};
+  if (prefs.metrics !== true || prefs.content !== true) {
+    return { status: 'skipped_consent' };
+  }
+  // Pre-resolve the sink before claiming `accepted`. Avoids advertising a
+  // successful enqueue to callers when there's no Langfuse endpoint
+  // configured to ship the score to.
+  const sink = readTelemetrySinkConfig();
+  if (!sink) {
+    return { status: 'skipped_no_sink' };
+  }
+  const ctx: FeedbackReportContext = {
+    runId: opts.runId,
+    installationId: cfg.installationId ?? null,
+    prefs,
+    rating: opts.rating,
+    reasonCodes: opts.reasonCodes,
+    hasCustomReason: opts.hasCustomReason,
+    customReason: opts.customReason,
+    ...(opts.scoreMetadata ? { metadata: opts.scoreMetadata } : {}),
+  };
+  // Fire-and-forget the actual network send so the route can respond
+  // immediately. The handler's response already encodes the consent +
+  // sink-presence outcome above; failures inside the send are operational
+  // telemetry, not a client-facing signal.
+  void reportRunFeedback(
+    ctx,
+    opts.fetchImpl ? { fetchImpl: opts.fetchImpl } : {},
+  ).catch((err) => {
+    console.warn('[langfuse-bridge] feedback report failed:', String(err));
+  });
+  return { status: 'accepted' };
+}

@@ -151,6 +151,29 @@ export interface ReportRunOpts {
   fetchImpl?: typeof fetch;
 }

+/**
+ * Payload sent to Langfuse when a user thumbs-up/down's an assistant turn.
+ *
+ * The `runId` doubles as the Langfuse trace id (same convention used by
+ * buildTracePayload), so the score lands on the existing trace if the run
+ * was previously reported. If the run wasn't reported (e.g. content
+ * consent was off at run completion, then turned on before the user
+ * scored), Langfuse will accept the score anyway and the trace will
+ * materialize when/if the daemon backfills it.
+ */
+export interface FeedbackReportContext {
+  runId: string;
+  installationId: string | null;
+  prefs: TelemetryPrefs;
+  rating: 'positive' | 'negative';
+  reasonCodes: string[];
+  /** Raw "other" free text the user typed. Trimmed; empty string when absent. */
+  customReason: string;
+  hasCustomReason: boolean;
+  /** Optional context bag that ends up in Langfuse score metadata. */
+  metadata?: Record<string, unknown>;
+}
+
 export function readLangfuseConfig(
   env: NodeJS.ProcessEnv = process.env,
 ): LangfuseConfig | null {
@@ -658,3 +681,105 @@ export async function reportRunCompleted(
   }
   await postLangfuseBatch(config, batch, fetchImpl);
 }
+
+// Build a Langfuse `score-create` batch for a user-supplied turn rating.
+//
+// Langfuse scores let evals filter traces by user feedback. We emit one
+// NUMERIC score (`user_rating`, +1 / -1) plus optional CATEGORICAL scores
+// for each reason code, so the Langfuse UI's score filters work out of
+// the box. Raw custom-reason text rides in the score metadata when the
+// user opted into telemetry.content; the consent gate lives in
+// reportRunFeedback below, so this builder stays content-agnostic.
+//
+// Limitation: stable score ids (`${traceId}-rating`, `${traceId}-reason-${code}`)
+// mean re-submission overwrites cleanly, but reason codes the user removes
+// in a follow-up submission do not get a tombstone. A future change can
+// thread `removedReasonCodes` through and emit overwriting "cleared"
+// scores for them; not done here to keep this PR scoped to the bridge.
+export function buildFeedbackPayload(ctx: FeedbackReportContext): unknown[] {
+  const traceId = ctx.runId;
+  const nowIso = new Date().toISOString();
+  const batch: unknown[] = [];
+
+  const ratingMetadata: Record<string, unknown> = {
+    reasonCodes: ctx.reasonCodes,
+    reasonCount: ctx.reasonCodes.length,
+    hasCustomReason: ctx.hasCustomReason,
+    // Raw text — gated upstream by telemetry.content consent.
+    customReason: ctx.customReason || undefined,
+    installationId: ctx.installationId ?? undefined,
+    ...(ctx.metadata ?? {}),
+  };
+
+  batch.push({
+    id: randomUUID(),
+    type: 'score-create',
+    timestamp: nowIso,
+    body: {
+      id: `${traceId}-rating`,
+      traceId,
+      name: 'user_rating',
+      value: ctx.rating === 'positive' ? 1 : -1,
+      dataType: 'NUMERIC',
+      comment: ctx.rating,
+      metadata: ratingMetadata,
+    },
+  });
+
+  for (const code of ctx.reasonCodes) {
+    batch.push({
+      id: randomUUID(),
+      type: 'score-create',
+      timestamp: nowIso,
+      body: {
+        // Stable per (run, code) so re-submission overwrites cleanly.
+        id: `${traceId}-reason-${code}`,
+        traceId,
+        name: 'user_rating_reason',
+        value: code,
+        dataType: 'CATEGORICAL',
+        // Group the reason under the rating it was submitted with so a
+        // "matched_request" tag on a thumbs-down run is still visibly
+        // negative in the Langfuse UI.
+        comment: ctx.rating,
+      },
+    });
+  }
+
+  return batch;
+}
+
+export async function reportRunFeedback(
+  ctx: FeedbackReportContext,
+  opts: ReportRunOpts = {},
+): Promise<void> {
+  if (ctx.prefs.metrics !== true) return;
+  if (ctx.prefs.content !== true) return;
+
+  const config = resolveReportConfig(opts);
+  if (!config) return;
+
+  let batch: unknown[];
+  try {
+    batch = buildFeedbackPayload(ctx);
+  } catch (error) {
+    console.warn(`[langfuse-trace] Feedback payload build error: ${String(error)}`);
+    return;
+  }
+
+  const serialized = JSON.stringify({ batch });
+  const serializedBytes = Buffer.byteLength(serialized, 'utf8');
+  if (serializedBytes > HARD_BATCH_MAX_BYTES) {
+    console.warn(
+      `[langfuse-trace] Feedback batch too large (${serializedBytes}B > ${HARD_BATCH_MAX_BYTES}B), dropping feedback for ${ctx.runId}`,
+    );
+    return;
+  }
+
+  const fetchImpl = opts.fetchImpl ?? globalThis.fetch;
+  if (config.kind === 'relay') {
+    await postRelayBatch(config, serialized, fetchImpl);
+    return;
+  }
+  await postLangfuseBatch(config, batch, fetchImpl);
+}

@@ -55,6 +55,21 @@ export interface RoutineDeps {

 export interface TelemetryDeps {
   reportFinalizedMessage: (saved: any, body?: any) => void;
+  /**
+   * Best-effort Langfuse score emission for assistant-turn user ratings.
+   * Returns the categorical outcome so the API surface in chat-routes can
+   * report back to the web client whether the report was accepted or
+   * skipped (consent off / no sink). The handler must not await this in
+   * the request hot path — fire-and-forget.
+   */
+  reportFeedback?: (req: {
+    runId: string;
+    rating: 'positive' | 'negative';
+    reasonCodes: string[];
+    hasCustomReason: boolean;
+    customReason: string;
+    scoreMetadata?: Record<string, unknown>;
+  }) => Promise<{ status: 'accepted' | 'skipped_consent' | 'skipped_no_sink' }>;
 }

 export interface ServerContext {

@@ -184,7 +184,10 @@ import { renderDesignSystemPreview } from './design-system-preview.js';
 import { renderDesignSystemShowcase } from './design-system-showcase.js';
 import { createChatRunService } from './runs.js';
 import { deriveRunErrorCode, runResultFromStatus } from './run-result.js';
-import { reportRunCompletedFromDaemon } from './langfuse-bridge.js';
+import {
+  reportRunCompletedFromDaemon,
+  reportRunFeedbackFromDaemon,
+} from './langfuse-bridge.js';
 import {
   createAnalyticsService,
   newInsertId,
@@ -4619,6 +4622,19 @@ export async function startServer({
     getAppVersion: () => cachedAppVersion,
   });

+  const reportFeedback = (req: {
+    runId: string;
+    rating: 'positive' | 'negative';
+    reasonCodes: string[];
+    hasCustomReason: boolean;
+    customReason: string;
+    scoreMetadata?: Record<string, unknown>;
+  }) =>
+    reportRunFeedbackFromDaemon({
+      dataDir: RUNTIME_DATA_DIR,
+      ...req,
+    });
+
   // DNS-aware wrapper. The sync `validateBaseUrl` only inspects the literal
   // hostname string, so a public DNS name pointing at an internal address
   // (`internal.example.com → 10.0.0.5`) still passes. We delegate to
@@ -11770,7 +11786,7 @@ export async function startServer({
     critique: critiqueDeps,
     validation: validationDeps,
     lifecycle: { isDaemonShuttingDown: () => daemonShuttingDown },
-
+    telemetry: { reportFinalizedMessage, reportFeedback },
   });

   registerStaticSpaFallback(app, STATIC_DIR);

@@ -1,10 +1,13 @@
 import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

 import {
+  buildFeedbackPayload,
   buildTracePayload,
   readLangfuseConfig,
   readTelemetrySinkConfig,
   reportRunCompleted,
+  reportRunFeedback,
+  type FeedbackReportContext,
   type LangfuseConfig,
   type ReportContext,
   type TelemetrySinkConfig,
@@ -749,3 +752,110 @@ describe('reportRunCompleted', () => {
     expect(warnSpy).not.toHaveBeenCalled();
   });
 });
+
+function makeFeedbackCtx(
+  overrides: Partial<FeedbackReportContext> = {},
+): FeedbackReportContext {
+  return {
+    runId: 'run-feedback-1',
+    installationId: 'install-uuid-1',
+    prefs: { metrics: true, content: true },
+    rating: 'positive',
+    reasonCodes: ['matched_request'],
+    hasCustomReason: false,
+    customReason: '',
+    ...overrides,
+  };
+}
+
+describe('buildFeedbackPayload', () => {
+  it('emits a numeric user_rating score plus per-reason categorical scores', () => {
+    const batch = buildFeedbackPayload(
+      makeFeedbackCtx({
+        rating: 'negative',
+        reasonCodes: ['missed_request', 'weak_visual'],
+        hasCustomReason: true,
+        customReason: 'It got the layout wrong on tablet',
+      }),
+    ) as Array<Record<string, any>>;
+    expect(batch).toHaveLength(3);
+    const ratingScore = batch[0]!;
+    expect(ratingScore.type).toBe('score-create');
+    expect(ratingScore.body.traceId).toBe('run-feedback-1');
+    expect(ratingScore.body.name).toBe('user_rating');
+    expect(ratingScore.body.value).toBe(-1);
+    expect(ratingScore.body.dataType).toBe('NUMERIC');
+    expect(ratingScore.body.comment).toBe('negative');
+    expect(ratingScore.body.metadata).toMatchObject({
+      reasonCount: 2,
+      customReason: 'It got the layout wrong on tablet',
+      hasCustomReason: true,
+    });
+    for (const reasonScore of batch.slice(1)) {
+      expect(reasonScore.body.name).toBe('user_rating_reason');
+      expect(reasonScore.body.dataType).toBe('CATEGORICAL');
+      expect(reasonScore.body.comment).toBe('negative');
+      expect(reasonScore.body.traceId).toBe('run-feedback-1');
+    }
+    expect(batch[1]!.body.value).toBe('missed_request');
+    expect(batch[2]!.body.value).toBe('weak_visual');
+  });
+
+  it('does not emit reason scores when no codes were submitted', () => {
+    const batch = buildFeedbackPayload(
+      makeFeedbackCtx({ reasonCodes: [] }),
+    ) as Array<Record<string, any>>;
+    expect(batch).toHaveLength(1);
+    expect(batch[0]!.body.name).toBe('user_rating');
+    expect(batch[0]!.body.value).toBe(1);
+  });
+});
+
+describe('reportRunFeedback', () => {
+  const TEST_CONFIG: LangfuseConfig = {
+    baseUrl: 'https://us.cloud.langfuse.com',
+    authHeader: 'Basic Zm9vOmJhcg==',
+    retries: 0,
+    timeoutMs: 1000,
+  };
+
+  beforeEach(() => {
+    vi.useRealTimers();
+  });
+
+  it('skips when metrics consent is off', async () => {
+    const fetchSpy = vi.fn();
+    await reportRunFeedback(makeFeedbackCtx({ prefs: { metrics: false, content: true } }), {
+      config: TEST_CONFIG,
+      fetchImpl: fetchSpy as any,
+    });
+    expect(fetchSpy).not.toHaveBeenCalled();
+  });
+
+  it('skips when content consent is off', async () => {
+    const fetchSpy = vi.fn();
+    await reportRunFeedback(makeFeedbackCtx({ prefs: { metrics: true, content: false } }), {
+      config: TEST_CONFIG,
+      fetchImpl: fetchSpy as any,
+    });
+    expect(fetchSpy).not.toHaveBeenCalled();
+  });
+
+  it('posts a score-create batch to /api/public/ingestion when consent is on', async () => {
+    const fetchSpy = vi.fn().mockResolvedValue(
+      new Response(JSON.stringify({ successes: [], errors: [] }), { status: 207 }),
+    );
+    await reportRunFeedback(
+      makeFeedbackCtx({ reasonCodes: ['matched_request'] }),
+      { config: TEST_CONFIG, fetchImpl: fetchSpy as any },
+    );
+    expect(fetchSpy).toHaveBeenCalledTimes(1);
+    const [url, init] = fetchSpy.mock.calls[0]!;
+    expect(url).toBe('https://us.cloud.langfuse.com/api/public/ingestion');
+    expect(init.method).toBe('POST');
+    const body = JSON.parse(init.body);
+    expect(body.batch).toHaveLength(2);
+    expect(body.batch[0].type).toBe('score-create');
+    expect(body.batch[0].body.value).toBe(1);
+  });
+});

@@ -53,7 +53,11 @@ import type {
   PresentPopoverClickProps,
   ShareOptionPopoverClickProps,
   AssistantFeedbackButtonClickProps,
+  AssistantFeedbackClickProps,
+  AssistantFeedbackReasonClickProps,
   AssistantFeedbackReasonSubmitClickProps,
+  AssistantFeedbackReasonSubmitProps,
+  AssistantFeedbackReasonViewProps,
   SettingsSidebarClickProps,
   SettingsExecutionModeTabClickProps,
   SettingsLocalCliClickProps,
@@ -616,3 +620,47 @@ export function trackSettingsConnectorAuthResult(
 ): void {
   send(track, 'settings_connector_auth_result', props);
 }
+
+export function trackAssistantFeedbackClick(
+  track: Track,
+  props: AssistantFeedbackClickProps,
+) {
+  track(
+    'assistant_feedback_click',
+    props as unknown as Record<string, unknown>,
+  );
+}
+
+export function trackAssistantFeedbackReasonView(
+  track: Track,
+  props: AssistantFeedbackReasonViewProps,
+) {
+  track(
+    'assistant_feedback_reason_view',
+    props as unknown as Record<string, unknown>,
+  );
+}
+
+export function trackAssistantFeedbackReasonClick(
+  track: Track,
+  props: AssistantFeedbackReasonClickProps,
+  options?: { requestId: string },
+) {
+  track(
+    'assistant_feedback_reason_click',
+    props as unknown as Record<string, unknown>,
+    options,
+  );
+}
+
+export function trackAssistantFeedbackReasonSubmit(
+  track: Track,
+  props: AssistantFeedbackReasonSubmitProps,
+  options?: { requestId: string },
+) {
+  track(
+    'assistant_feedback_reason_submit',
+    props as unknown as Record<string, unknown>,
+    options,
+  );
+}

@@ -7,11 +7,20 @@ import { submitChatRunToolResult } from "../providers/daemon";
 import { useAnalytics } from "../analytics/provider";
 import {
   trackAssistantFeedbackButtonClick,
+  trackAssistantFeedbackClick,
+  trackAssistantFeedbackReasonClick,
   trackAssistantFeedbackReasonPanelSurfaceView,
+  trackAssistantFeedbackReasonSubmit,
   trackAssistantFeedbackReasonSubmitClick,
+  trackAssistantFeedbackReasonView,
   trackFeedbackSubmitResult,
 } from "../analytics/events";
-import type { TrackingProjectKind } from "@open-design/contracts/analytics";
+import {
+  normalizeCustomReason,
+  type TrackingFeedbackReasonCode,
+  type TrackingFeedbackRatingWithNone,
+  type TrackingProjectKind,
+} from "@open-design/contracts/analytics";
 import {
   splitOnQuestionForms,
   type QuestionForm,
@@ -550,10 +559,10 @@ function AssistantFeedback({
 }) {
   const t = useT();
   const analytics = useAnalytics();
-  // P0 — analytics context the feedback events need. The four ids are
-  // either user-anchored (projectId / assistantMessageId) or run-anchored
-  // (runId), so we pass them down with a stable identity. `producedFileCount`
-  // feeds `has_produced_files` on assistant_feedback_button click.
+  // Analytics context the feedback events need. The four ids are either
+  // user-anchored (projectId / assistantMessageId) or run-anchored (runId),
+  // so we pass them down with a stable identity. `producedFileCount` feeds
+  // `has_produced_files` on assistant_feedback_button click.
   const [burstKey, setBurstKey] = useState(0);
   const [reasonRating, setReasonRating] =
     useState<ChatMessageFeedbackRating | null>(null);
@@ -585,6 +594,24 @@ function AssistantFeedback({
       run_id: runId ?? "",
       rating: reasonRating,
     });
+    // Dedicated assistant_feedback_reason_view event paired with the
+    // umbrella surface_view above. Requires the full project + conversation
+    // identity (its props type is stricter than the umbrella variant);
+    // skipped on test renders that mount AssistantMessage without those.
+    if (projectId && projectKind && conversationId) {
+      trackAssistantFeedbackReasonView(analytics.track, {
+        page: "studio",
+        area: "chat_panel",
+        element: "assistant_feedback_reason_panel",
+        view_type: "panel",
+        project_id: projectId,
+        project_kind: projectKind,
+        conversation_id: conversationId,
+        assistant_message_id: assistantMessageId,
+        run_id: runId ?? null,
+        rating: reasonRating,
+      });
+    }
   }, [
     reasonRating,
     analytics.track,
@ -620,6 +647,26 @@ function AssistantFeedback({
|
|||
rating_before: ratingBefore,
|
||||
has_produced_files: producedFileCount > 0,
|
||||
});
|
||||
// Dedicated assistant_feedback_click paired with the umbrella ui_click
|
||||
// above. Carries the post-action rating in the widened union (allows
|
||||
// 'none' for the clear path).
|
||||
if (projectId && projectKind && conversationId) {
|
||||
const ratingAfter: TrackingFeedbackRatingWithNone = nextRating ?? "none";
|
||||
trackAssistantFeedbackClick(analytics.track, {
|
||||
page: "studio",
|
||||
area: "chat_panel",
|
||||
element: "assistant_feedback_button",
|
||||
action: nextRating ? "submit_feedback_rating" : "clear_feedback_rating",
|
||||
project_id: projectId,
|
||||
project_kind: projectKind,
|
||||
conversation_id: conversationId,
|
||||
assistant_message_id: assistantMessageId,
|
||||
run_id: runId ?? null,
|
||||
rating: ratingAfter,
|
||||
rating_before: ratingBefore,
|
||||
has_produced_files: producedFileCount > 0,
|
||||
});
|
||||
}
|
||||
onFeedback(nextRating ? { rating: nextRating } : null);
|
||||
};
|
||||
const toggleReasonCode = (code: ChatMessageFeedbackReasonCode) => {
|
||||
|
|
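The click hunk above derives two fields from the next rating: the post-action `rating` (widened with `'none'`) and the `action`. A minimal sketch of that derivation in isolation follows. The helper name `describeFeedbackClick` and the local type aliases are assumptions for illustration; the PR inlines this logic in the click handler.

```typescript
// Hypothetical helper, for illustration only; not part of the PR.
type Rating = 'positive' | 'negative';
type RatingWithNone = Rating | 'none';
type FeedbackAction = 'submit_feedback_rating' | 'clear_feedback_rating';

interface ClickFields {
  rating: RatingWithNone;
  action: FeedbackAction;
}

// `nextRating` is null when the user clears a previously set rating.
function describeFeedbackClick(nextRating: Rating | null): ClickFields {
  return {
    // Post-action state: fall back to 'none' on the clear path.
    rating: nextRating ?? 'none',
    action: nextRating ? 'submit_feedback_rating' : 'clear_feedback_rating',
  };
}

console.log(describeFeedbackClick('positive'));
console.log(describeFeedbackClick(null));
```

Keeping `'none'` as an explicit value, rather than omitting the field, means a cleared rating is still visible as its own state in the event stream.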
@ -687,6 +734,47 @@ function AssistantFeedback({
|
|||
},
|
||||
{ requestId },
|
||||
);
|
||||
// Dedicated assistant_feedback_reason_click + reason_submit paired with
|
||||
// the umbrella ui_click + feedback_submit_result above. Both fire under
|
||||
// the same `requestId` so PostHog can stitch click → result per the
|
||||
// tracking spec.
|
||||
if (projectId && projectKind && conversationId) {
|
||||
const reasons = reasonCodes as TrackingFeedbackReasonCode[];
|
||||
const sharedPayload = {
|
||||
page: "studio" as const,
|
||||
area: "chat_panel" as const,
|
||||
project_id: projectId,
|
||||
project_kind: projectKind,
|
||||
conversation_id: conversationId,
|
||||
assistant_message_id: assistantMessageId,
|
||||
run_id: runId ?? null,
|
||||
rating: reasonRating,
|
||||
reason: reasons,
|
||||
reason_count: reasons.length,
|
||||
has_custom_reason: hasCustomReason,
|
||||
custom_reason: hasCustomReason
|
||||
? normalizeCustomReason(trimmedCustomReason)
|
||||
: "",
|
||||
};
|
||||
trackAssistantFeedbackReasonClick(
|
||||
analytics.track,
|
||||
{
|
||||
...sharedPayload,
|
||||
element: "assistant_feedback_reason_submit_button",
|
||||
action: "click_submit_feedback_reason",
|
||||
},
|
||||
{ requestId },
|
||||
);
|
||||
trackAssistantFeedbackReasonSubmit(
|
||||
analytics.track,
|
||||
{
|
||||
...sharedPayload,
|
||||
element: "assistant_feedback_reason_submit",
|
||||
action: "submit_feedback_reason",
|
||||
},
|
||||
{ requestId },
|
||||
);
|
||||
}
|
||||
onFeedback({
|
||||
rating: reasonRating,
|
||||
reasonCodes,
|
||||
|
|
|
|||
|
|
@ -19,9 +19,11 @@ import {
|
|||
listActiveChatRuns,
|
||||
listProjectRuns,
|
||||
reattachDaemonRun,
|
||||
reportChatRunFeedback,
|
||||
streamViaDaemon,
|
||||
} from '../providers/daemon';
|
||||
import { fetchElevenLabsVoiceOptions } from '../providers/elevenlabs-voices';
|
||||
import { normalizeCustomReason } from '@open-design/contracts/analytics';
|
||||
import {
|
||||
deletePreviewComment,
|
||||
fetchPreviewComments,
|
||||
|
|
@ -1516,8 +1518,25 @@ export function ProjectView({
|
|||
},
|
||||
true,
|
||||
);
|
||||
// Forward affirmative ratings to the daemon → Langfuse `score-create`.
|
||||
// Clears (change=null) are skipped — Langfuse scores are append-only,
|
||||
// and the rating is also captured by the PostHog event so a clear is
|
||||
// recoverable downstream if we ever need it.
|
||||
const runId = assistantMessage.runId;
|
||||
if (change && runId && activeConversationId) {
|
||||
void reportChatRunFeedback({
|
||||
runId,
|
||||
projectId: project.id,
|
||||
conversationId: activeConversationId,
|
||||
assistantMessageId: assistantMessage.id,
|
||||
rating: change.rating,
|
||||
reasonCodes: change.reasonCodes ?? [],
|
||||
hasCustomReason: !!change.customReason,
|
||||
customReason: normalizeCustomReason(change.customReason),
|
||||
});
|
||||
}
|
||||
},
|
||||
-    [updateMessageById],
+    [updateMessageById, activeConversationId, project.id],
|
||||
);
|
||||
|
||||
const appendAssistantErrorEvent = useCallback(
|
||||
|
|
|
|||
|
|
@ -365,6 +365,31 @@ export async function submitChatRunToolResult(
|
|||
}
|
||||
}
|
||||
|
||||
// Forwards the user's assistant-turn rating to the daemon so it can emit
|
||||
// a Langfuse `score-create`. Fire-and-forget — failures are not surfaced
|
||||
// to the UI (the rating is already persisted on the message itself via
|
||||
// the PUT /messages/:id round-trip).
|
||||
export async function reportChatRunFeedback(req: {
|
||||
runId: string;
|
||||
projectId: string;
|
||||
conversationId: string;
|
||||
assistantMessageId: string;
|
||||
rating: 'positive' | 'negative';
|
||||
reasonCodes: string[];
|
||||
hasCustomReason: boolean;
|
||||
customReason: string;
|
||||
}): Promise<void> {
|
||||
try {
|
||||
await fetch(`/api/runs/${encodeURIComponent(req.runId)}/feedback`, {
|
||||
method: 'POST',
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify(req),
|
||||
});
|
||||
} catch {
|
||||
// Best-effort.
|
||||
}
|
||||
}
|
||||
|
||||
export async function listActiveChatRuns(
|
||||
projectId: string,
|
||||
conversationId: string,
|
||||
|
|
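The commit message describes the daemon forwarding each rating as one `user_rating` score (NUMERIC, ±1) plus one `user_rating_reason` score (CATEGORICAL) per reason code. A rough sketch of that mapping follows. The `ScoreEntry` shape and the `toScoreEntries` name are assumptions made for illustration; they are neither the Langfuse SDK's types nor the daemon's actual code.

```typescript
type Rating = 'positive' | 'negative';

// Illustrative shape only; not the Langfuse SDK's type.
type ScoreEntry =
  | { name: 'user_rating'; dataType: 'NUMERIC'; value: 1 | -1 }
  | { name: 'user_rating_reason'; dataType: 'CATEGORICAL'; value: string };

// Hypothetical mapping: one numeric entry, then one categorical entry per code.
function toScoreEntries(rating: Rating, reasonCodes: string[]): ScoreEntry[] {
  const entries: ScoreEntry[] = [
    {
      name: 'user_rating',
      dataType: 'NUMERIC',
      value: rating === 'positive' ? 1 : -1,
    },
  ];
  for (const code of reasonCodes) {
    entries.push({
      name: 'user_rating_reason',
      dataType: 'CATEGORICAL',
      value: code,
    });
  }
  return entries;
}

console.log(toScoreEntries('negative', ['missed_request', 'weak_visual']));
```

Emitting one categorical entry per code, instead of a joined string, keeps each reason individually filterable.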
|
|||
|
|
@ -46,6 +46,7 @@ describe('ChatPane streaming state', () => {
|
|||
|
||||
render(
|
||||
<ChatPane
|
||||
projectKindForTracking="prototype"
|
||||
messages={messages}
|
||||
streaming={false}
|
||||
error={null}
|
||||
|
|
@ -117,6 +118,7 @@ Expected output:
|
|||
|
||||
render(
|
||||
<ChatPane
|
||||
projectKindForTracking="prototype"
|
||||
messages={messages}
|
||||
streaming={false}
|
||||
error={null}
|
||||
|
|
|
|||
|
|
@ -23,6 +23,8 @@ describe('AssistantMessage tool status', () => {
|
|||
it('shows Done for a completed run tool use that has no tool result', () => {
|
||||
const { container } = render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={messageWithEvents([
|
||||
{
|
||||
kind: 'tool_use',
|
||||
|
|
@ -43,6 +45,8 @@ describe('AssistantMessage tool status', () => {
|
|||
it('keeps legacy completed messages without runStatus as Done', () => {
|
||||
const { container } = render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={{
|
||||
...messageWithEvents([
|
||||
{
|
||||
|
|
@ -66,6 +70,8 @@ describe('AssistantMessage tool status', () => {
|
|||
it('shows Done in a grouped completed run when tool results are missing', () => {
|
||||
const { container } = render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={messageWithEvents([
|
||||
{
|
||||
kind: 'tool_use',
|
||||
|
|
@ -92,6 +98,8 @@ describe('AssistantMessage tool status', () => {
|
|||
it('does not show Done when a failed run is missing a tool result', () => {
|
||||
const { container } = render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={{
|
||||
...messageWithEvents([
|
||||
{
|
||||
|
|
@ -115,6 +123,8 @@ describe('AssistantMessage tool status', () => {
|
|||
it('does not show Done when a canceled run is missing a tool result', () => {
|
||||
const { container } = render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={{
|
||||
...messageWithEvents([
|
||||
{
|
||||
|
|
@ -138,6 +148,8 @@ describe('AssistantMessage tool status', () => {
|
|||
it('keeps Running for a streaming tool use that has no tool result', () => {
|
||||
const { container } = render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={{
|
||||
...messageWithEvents([
|
||||
{
|
||||
|
|
|
|||
|
|
@ -79,6 +79,8 @@ describe('AssistantMessage unfinished todo state', () => {
|
|||
it('shows a soft no-output state instead of Done for empty API responses', () => {
|
||||
render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={messageWithEvents([
|
||||
{ kind: 'status', label: 'empty_response', detail: 'deepseek-chat' },
|
||||
{
|
||||
|
|
@ -101,6 +103,8 @@ describe('AssistantMessage unfinished todo state', () => {
|
|||
it('keeps Done for a completed latest TodoWrite fixture', () => {
|
||||
render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={messageWithEvents([
|
||||
{
|
||||
kind: 'tool_use',
|
||||
|
|
@ -166,6 +170,8 @@ describe('AssistantMessage unfinished todo state', () => {
|
|||
const onContinue = vi.fn();
|
||||
render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={messageWithEvents([
|
||||
{
|
||||
kind: 'tool_use',
|
||||
|
|
@ -213,6 +219,8 @@ describe('AssistantMessage unfinished todo state', () => {
|
|||
it('hides the continue button on older assistant turns', () => {
|
||||
render(
|
||||
<AssistantMessage
|
||||
projectKind="prototype"
|
||||
conversationId="conv-1"
|
||||
message={messageWithEvents([
|
||||
{
|
||||
kind: 'tool_use',
|
||||
|
|
|
|||
|
|
@ -113,6 +113,7 @@ function renderChatPane({
|
|||
onAssistantFeedback,
|
||||
...render(
|
||||
<ChatPane
|
||||
projectKindForTracking="prototype"
|
||||
messages={messages}
|
||||
streaming={streaming}
|
||||
error={null}
|
||||
|
|
|
|||
|
|
@ -135,6 +135,7 @@ function setUserScroll(top: number) {
|
|||
function chatPaneEl(messages: ChatMessage[], activeConversationId: string | null) {
|
||||
return (
|
||||
<ChatPane
|
||||
projectKindForTracking="prototype"
|
||||
messages={messages}
|
||||
streaming={false}
|
||||
error={null}
|
||||
|
|
|
|||
|
|
@ -21,6 +21,7 @@ import type { ChatMessage, Conversation } from '../../src/types';
|
|||
function renderChatPane(messages: ChatMessage[]) {
|
||||
return render(
|
||||
<ChatPane
|
||||
projectKindForTracking="prototype"
|
||||
messages={messages}
|
||||
streaming={false}
|
||||
error={null}
|
||||
|
|
|
|||
|
|
@ -33,6 +33,10 @@ export type AnalyticsEventName =
|
|||
| 'artifact_export_result'
|
||||
// Feedback
|
||||
| 'feedback_submit_result'
|
||||
| 'assistant_feedback_click'
|
||||
| 'assistant_feedback_reason_view'
|
||||
| 'assistant_feedback_reason_click'
|
||||
| 'assistant_feedback_reason_submit'
|
||||
// Settings
|
||||
| 'settings_view'
|
||||
| 'settings_cli_test_result'
|
||||
|
|
@ -158,6 +162,35 @@ export type TrackingRunResult = 'success' | 'failed' | 'cancelled';
|
|||
export type TrackingExportResult = 'success' | 'failed' | 'cancelled';
|
||||
export type TrackingTestResult = 'success' | 'failed' | 'timeout';
|
||||
|
||||
export type TrackingFeedbackRating = 'positive' | 'negative';
|
||||
// Click events emit `none` when the user clears a previously-set rating, so
|
||||
// `rating` (post-state) and `rating_before` (pre-state) on click both use
|
||||
// this widened union. Reason events still require a concrete rating.
|
||||
export type TrackingFeedbackRatingWithNone = 'positive' | 'negative' | 'none';
|
||||
export type TrackingFeedbackAction =
|
||||
| 'submit_feedback_rating'
|
||||
| 'clear_feedback_rating';
|
||||
|
||||
// Mirrors ChatMessageFeedbackReasonCode in packages/contracts/src/api/chat.ts.
|
||||
// Kept independent so the analytics wire format can evolve without forcing
|
||||
// a contract bump on the chat persistence shape.
|
||||
export type TrackingFeedbackReasonCode =
|
||||
| 'matched_request'
|
||||
| 'strong_visual'
|
||||
| 'useful_structure'
|
||||
| 'easy_to_continue'
|
||||
| 'followed_design_system'
|
||||
| 'missed_request'
|
||||
| 'weak_visual'
|
||||
| 'incomplete_output'
|
||||
| 'hard_to_use'
|
||||
| 'missed_design_system'
|
||||
| 'other';
|
||||
|
||||
// Product confirmed on 2026-05-13: custom_reason ships the raw text so
|
||||
// analysts can read the actual feedback. The earlier length-bucket approach
|
||||
// from the tracking doc draft is no longer in effect.
|
||||
|
||||
export type TrackingTokenCountSource =
|
||||
| 'provider_usage'
|
||||
| 'estimated'
|
||||
|
|
@ -1215,6 +1248,65 @@ export interface FeedbackSubmitResultProps {
|
|||
result: TrackingResult;
|
||||
}
|
||||
|
||||
interface AssistantFeedbackBase {
|
||||
page: 'studio';
|
||||
area: 'chat_panel';
|
||||
project_id: string;
|
||||
project_kind: TrackingProjectKind;
|
||||
conversation_id: string;
|
||||
assistant_message_id: string;
|
||||
// run_id may be absent for messages whose run record is missing or pruned,
|
||||
// but the product funnel keys off this; we emit `null` rather than dropping
|
||||
// the field so PostHog can distinguish "no run id" from "field forgotten".
|
||||
run_id: string | null;
|
||||
rating: TrackingFeedbackRating;
|
||||
}
|
||||
|
||||
// Click events override `rating` to allow `'none'` because the user can
|
||||
// clear a previously-set rating; reason_* events still inherit the
|
||||
// stricter `positive | negative` base since they only fire after the user
|
||||
// commits to a thumb.
|
||||
export interface AssistantFeedbackClickProps
|
||||
extends Omit<AssistantFeedbackBase, 'rating'> {
|
||||
element: 'assistant_feedback_button';
|
||||
action: TrackingFeedbackAction;
|
||||
/** Post-action state. `'none'` when the user just cleared their rating. */
|
||||
rating: TrackingFeedbackRatingWithNone;
|
||||
/** Pre-action state. Renamed from `previous_rating` for symmetry with `rating`. */
|
||||
rating_before: TrackingFeedbackRatingWithNone;
|
||||
has_produced_files: boolean;
|
||||
}
|
||||
|
||||
export interface AssistantFeedbackReasonViewProps extends AssistantFeedbackBase {
|
||||
element: 'assistant_feedback_reason_panel';
|
||||
view_type: 'panel';
|
||||
}
|
||||
|
||||
// Shape shared by reason_click (button click) and reason_submit (result).
|
||||
// Both fire from the same submit handler with the same payload, threaded by
|
||||
// request_id so PostHog can stitch click→result.
|
||||
interface AssistantFeedbackReasonResultBase extends AssistantFeedbackBase {
|
||||
reason: TrackingFeedbackReasonCode[];
|
||||
reason_count: number;
|
||||
has_custom_reason: boolean;
|
||||
/** Raw free-text the user typed in the "other" input. Empty string when
|
||||
* the user didn't select "other" or left the field blank. Product
|
||||
* confirmed on 2026-05-13 that the raw text ships (no length bucketing). */
|
||||
custom_reason: string;
|
||||
}
|
||||
|
||||
export interface AssistantFeedbackReasonClickProps
|
||||
extends AssistantFeedbackReasonResultBase {
|
||||
element: 'assistant_feedback_reason_submit_button';
|
||||
action: 'click_submit_feedback_reason';
|
||||
}
|
||||
|
||||
export interface AssistantFeedbackReasonSubmitProps
|
||||
extends AssistantFeedbackReasonResultBase {
|
||||
element: 'assistant_feedback_reason_submit';
|
||||
action: 'submit_feedback_reason';
|
||||
}
|
||||
|
||||
// SETTINGS view + result events (page=settings)
|
||||
export interface SettingsViewProps {
|
||||
page_name: TrackingSettingsPage;
|
||||
|
|
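The click props above use `Omit<AssistantFeedbackBase, 'rating'>` before re-declaring `rating`. TypeScript rejects an interface that extends a base while widening one of its property types, so the strict property has to be dropped first. A trimmed-down sketch of the pattern, with shortened hypothetical type names standing in for the real ones:

```typescript
type Rating = 'positive' | 'negative';
type RatingWithNone = Rating | 'none';

// Stand-in for AssistantFeedbackBase, reduced to three fields.
interface FeedbackBase {
  assistant_message_id: string;
  run_id: string | null;
  rating: Rating;
}

// Drop the strict `rating`, then re-declare it with the wider union.
interface ClickProps extends Omit<FeedbackBase, 'rating'> {
  rating: RatingWithNone;
  rating_before: RatingWithNone;
}

// Reason events keep the strict base unchanged.
interface ReasonViewProps extends FeedbackBase {
  view_type: 'panel';
}

// Hypothetical helper: true when a click cleared an existing rating.
function isClear(props: ClickProps): boolean {
  return props.rating === 'none' && props.rating_before !== 'none';
}

const cleared: ClickProps = {
  assistant_message_id: 'msg-1',
  run_id: null,
  rating: 'none',
  rating_before: 'positive',
};
const view: ReasonViewProps = {
  assistant_message_id: 'msg-1',
  run_id: 'run-1',
  rating: 'negative',
  view_type: 'panel',
};

console.log(isClear(cleared), view.rating);
```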
@ -1264,6 +1356,19 @@ export type AnalyticsEventPayload =
|
|||
| { event: 'file_upload_result'; props: FileUploadResultProps }
|
||||
| { event: 'artifact_export_result'; props: ArtifactExportResultProps }
|
||||
| { event: 'feedback_submit_result'; props: FeedbackSubmitResultProps }
|
||||
| { event: 'assistant_feedback_click'; props: AssistantFeedbackClickProps }
|
||||
| {
|
||||
event: 'assistant_feedback_reason_view';
|
||||
props: AssistantFeedbackReasonViewProps;
|
||||
}
|
||||
| {
|
||||
event: 'assistant_feedback_reason_click';
|
||||
props: AssistantFeedbackReasonClickProps;
|
||||
}
|
||||
| {
|
||||
event: 'assistant_feedback_reason_submit';
|
||||
props: AssistantFeedbackReasonSubmitProps;
|
||||
}
|
||||
| { event: 'settings_view'; props: SettingsViewProps }
|
||||
| { event: 'settings_cli_test_result'; props: SettingsCliTestResultProps }
|
||||
| { event: 'settings_byok_test_result'; props: SettingsByokTestResultProps }
|
||||
|
|
@ -1567,3 +1672,13 @@ export function deriveConfigureGlobals(
|
|||
configure_availability: configureAvailability,
|
||||
};
|
||||
}
|
||||
|
||||
// Normalize the "other" custom-reason free text for transport. Trims
// whitespace and returns an empty string when the text is blank, null, or
// undefined. Callers should pass the raw text only when `has_custom_reason`
// is true; the helper itself is permissive.
|
||||
export function normalizeCustomReason(
|
||||
text: string | null | undefined,
|
||||
): string {
|
||||
return (text ?? '').trim();
|
||||
}
|
||||
|
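A short usage sketch for `normalizeCustomReason`, showing how the `custom_reason` field in the reason payload can be gated on `has_custom_reason`. The function body is copied from the hunk above; `customReasonFields` is a hypothetical wrapper added only for this example.

```typescript
// Copied from the diff above.
function normalizeCustomReason(text: string | null | undefined): string {
  return (text ?? '').trim();
}

// Hypothetical wrapper: the helper is permissive, so the caller decides
// whether the free text should ship at all.
function customReasonFields(
  hasCustomReason: boolean,
  rawText: string | null | undefined,
): { has_custom_reason: boolean; custom_reason: string } {
  return {
    has_custom_reason: hasCustomReason,
    custom_reason: hasCustomReason ? normalizeCustomReason(rawText) : '',
  };
}

console.log(customReasonFields(true, '  layout felt cramped  '));
console.log(customReasonFields(false, 'ignored text'));
```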
|
|
|||
|
|
@ -70,6 +70,34 @@ export interface ChatMessageFeedback {
|
|||
updatedAt?: number;
|
||||
}
|
||||
|
||||
/**
|
||||
* POST /api/runs/:runId/feedback — relays the user's assistant-turn rating
|
||||
* to Langfuse as a `score-create` so evals can filter traces by feedback.
|
||||
* The daemon is the single network egress point for telemetry (web never
|
||||
* talks to Langfuse directly), and gates this on `telemetry.metrics +
|
||||
* telemetry.content` consent independently of what the browser thinks.
|
||||
*
|
||||
* `customReason` ships the raw free text the user typed in the "other"
|
||||
* input (trimmed). Product confirmed on 2026-05-13 that analysts need the
|
||||
* text to make sense of the feedback; this is consent-gated behind
|
||||
* `telemetry.content` like the rest of the message-content telemetry.
|
||||
*/
|
||||
export interface ChatRunFeedbackRequest {
|
||||
projectId: string;
|
||||
conversationId: string;
|
||||
assistantMessageId: string;
|
||||
rating: ChatMessageFeedbackRating;
|
||||
reasonCodes: ChatMessageFeedbackReasonCode[];
|
||||
hasCustomReason: boolean;
|
||||
/** Raw "other" free text (trimmed). Empty string when no custom reason. */
|
||||
customReason: string;
|
||||
}
|
||||
|
||||
export interface ChatRunFeedbackResponse {
|
||||
/** `'accepted'` once the daemon has enqueued the score; a `'skipped_*'` status means nothing was forwarded. */
|
||||
status: 'accepted' | 'skipped_consent' | 'skipped_no_sink';
|
||||
}
|
||||
|
||||
export interface ChatRunCreateResponse {
|
||||
runId: string;
|
||||
appliedPluginSnapshotId?: string;
|
||||
|
|
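The commit message says the feedback route validates `rating` and allowlists `reasonCodes` through a simple string filter. The route body is not shown in this part of the diff, so the following is only a sketch of what such validation might look like. `parseFeedbackBody` and `ParsedFeedback` are hypothetical names; the reason codes are the ones listed in `TrackingFeedbackReasonCode` above.

```typescript
const REASON_CODES = [
  'matched_request', 'strong_visual', 'useful_structure', 'easy_to_continue',
  'followed_design_system', 'missed_request', 'weak_visual',
  'incomplete_output', 'hard_to_use', 'missed_design_system', 'other',
] as const;

type ReasonCode = (typeof REASON_CODES)[number];
type Rating = 'positive' | 'negative';

interface ParsedFeedback {
  rating: Rating;
  reasonCodes: ReasonCode[];
}

const ALLOWED_CODES = new Set<string>(REASON_CODES);

// Hypothetical validator: returns null for an unusable body, otherwise the
// rating plus only those reason codes that are on the allowlist.
function parseFeedbackBody(body: unknown): ParsedFeedback | null {
  if (typeof body !== 'object' || body === null) return null;
  const { rating, reasonCodes } = body as {
    rating?: unknown;
    reasonCodes?: unknown;
  };
  const parsedRating: Rating | null =
    rating === 'positive' ? 'positive' : rating === 'negative' ? 'negative' : null;
  if (parsedRating === null) return null;
  const codes: ReasonCode[] = Array.isArray(reasonCodes)
    ? reasonCodes.filter(
        (c): c is ReasonCode => typeof c === 'string' && ALLOWED_CODES.has(c),
      )
    : [];
  return { rating: parsedRating, reasonCodes: codes };
}

console.log(parseFeedbackBody({ rating: 'positive', reasonCodes: ['matched_request', 'bogus'] }));
```

Dropping unknown codes silently, rather than rejecting the whole request, fits the fire-and-forget nature of the side channel; whether the real route does the same is an assumption.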
|
|||