open-design/apps
Sid 362b92c1a6
fix(packaged): swallow harmless setTypeOfService EINVAL from undici (#895) (#906)
* fix(packaged): swallow harmless setTypeOfService EINVAL from undici (#895)

Opening Settings → Pets → Community in the packaged desktop app
surfaced a native "JavaScript error in main process" dialog with
`Uncaught Exception: Error: setTypeOfService EINVAL`. Root cause:
undici's socket setup tries to set the IP_TOS byte for QoS / DSCP
marking on outbound sockets, and the macOS kernel refuses with
EINVAL on certain configurations (VPNs, IPv6-only sockets, some
firewall postures). The byte is purely advisory — the socket
itself is healthy and serves traffic — so the rejection should
not crash the app.

Two cooperating layers:

1. **`protocol.ts`** registers the `od://` scheme that backs every
   renderer page load and API call in the packaged build by
   forwarding through Node's global `fetch` (which is undici under
   the hood). Pulled the inner request handler out as
   `handleOdRequest()` so a test can drive it with a stub fetch,
   and wrapped the `await fetch()` in a try/catch that returns a
   502 Response on failure. Without this, every undici rejection —
   not just `setTypeOfService` — propagated to Electron's default
   uncaught-exception path. Now the renderer sees a normal error
   response and the main process keeps running.

2. **`logging.ts`** adds a defensive `process.on('uncaughtException')`
   handler with a narrow filter, `isHarmlessSocketOptionError`,
   that only matches the canonical undici shape: the message contains
   `setTypeOfService` AND either the code is `EINVAL` or the message
   contains `EINVAL`. For any unrecognised error the handler re-throws
   via `setImmediate` so Node's default crash and Electron's
   crash dialog still fire end-to-end. A future regression that
   broadens the filter to "every EINVAL" is caught by the unit
   tests below.
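
A minimal sketch of the first layer, assuming `handleOdRequest()` takes the request plus an injectable fetch. The real signature, the `FetchLike` type, the response body text, and the `rejectingFetch` stub are illustrative assumptions, not the code in `protocol.ts`:

```typescript
// Hypothetical shape: the actual handleOdRequest signature is not shown here.
type FetchLike = (input: Request) => Promise<Response>;

async function handleOdRequest(
  request: Request,
  fetchImpl: FetchLike,
): Promise<Response> {
  try {
    // Forward the od:// request through the injected fetch implementation.
    return await fetchImpl(request);
  } catch (error) {
    // Any fetch rejection becomes a normal 502 response instead of
    // escaping to the main process as an uncaught exception.
    const message = error instanceof Error ? error.message : String(error);
    return new Response(`Upstream fetch failed: ${message}`, {
      status: 502,
      headers: { "content-type": "text/plain; charset=utf-8" },
    });
  }
}

// Stub fetch that rejects with the error shape described for #895.
const rejectingFetch: FetchLike = async () => {
  throw Object.assign(new Error("setTypeOfService EINVAL"), { code: "EINVAL" });
};
```

Passing the fetch implementation as a parameter is what lets a test drive the handler with a stub such as `rejectingFetch` and assert on the returned status without touching the network.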

Tests: 13 new tests across `tests/protocol.test.ts` (5) and
`tests/logging.test.ts` (8) pin both layers — including the
explicit #895 regression case (fetch rejecting with the canonical
EINVAL shape returns a 502 instead of throwing) and the negative
guard against the filter swallowing real bugs (a generic write
EINVAL or a setTypeOfService EACCES is *not* matched).

Verified locally:
- `pnpm --filter @open-design/packaged vitest tests/protocol.test.ts tests/logging.test.ts` → 13/13
- packaged `tsconfig.json` and `tsconfig.tests.json` (the CI killer): both clean
- the one pre-existing failure in `tests/sidecars.test.ts` (`adds custom VP_HOME/bin …`) is independent of this PR — confirmed by stashing the change and re-running

* fix(packaged): break recursive rethrow + tighten EINVAL filter (#906 review)

@mrcfps and @lefarcen both flagged a real P1 in the first iteration:
the non-harmless branch of the new uncaughtException handler
rethrew via setImmediate while the same listener was still
registered, so a real bug would re-enter the handler indefinitely
instead of terminating. mrcfps reproduced the loop with a minimal
Node script. lefarcen also flagged that the filter trusted the
message string over a contradicting structured `code`.

Both fixes:

1. **Recursive rethrow (P1).** Extract the handler as a named
   factory, `createFatalUncaughtExceptionHandler(logger)`, that
   captures itself in a closure. On non-harmless errors the handler
   now calls `process.removeListener('uncaughtException', self)` before
   scheduling the rethrow. With no listener registered, the next
   throw lands in Node's default crash path, which is exactly
   what we wanted ("preserve fail-fast for real bugs").

2. **`code` is authoritative (P2).** When `code` is present on the
   error, only `code === 'EINVAL'` qualifies. A contradicting
   `EACCES`/`EPERM` paired with `setTypeOfService EINVAL` in the
   message now slips through to the crash path instead of being
   swallowed. Message-based detection only fires when `code` is
   genuinely absent (some libuv builds don't populate it on raw
   thrown Errors).
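
Both fixes together might look roughly like the sketch below. The `Logger` type and the log messages are assumptions made for illustration; only the function names and the remove-then-rethrow order come from the description above:

```typescript
// Hypothetical logger shape; the real logger interface is not shown here.
type Logger = {
  warn: (message: string, error: Error) => void;
  error: (message: string, error: Error) => void;
};

function isHarmlessSocketOptionError(error: unknown): boolean {
  if (!(error instanceof Error)) return false;
  if (!error.message.includes("setTypeOfService")) return false;
  const code = (error as { code?: unknown }).code;
  // A present code is authoritative: only EINVAL qualifies.
  if (code !== undefined && code !== null) return code === "EINVAL";
  // Fall back to the message only when code is genuinely absent.
  return error.message.includes("EINVAL");
}

function createFatalUncaughtExceptionHandler(logger: Logger): (error: Error) => void {
  const self = (error: Error): void => {
    if (isHarmlessSocketOptionError(error)) {
      logger.warn("ignoring harmless socket option error", error);
      return;
    }
    logger.error("fatal uncaught exception", error);
    // Unregister first so the rethrow cannot re-enter this handler.
    process.removeListener("uncaughtException", self);
    // With this listener gone, the throw is no longer intercepted here.
    setImmediate(() => {
      throw error;
    });
  };
  return self;
}
```

Under these assumptions, registration would be a single call such as `process.on("uncaughtException", createFatalUncaughtExceptionHandler(console))`, since `console` already provides compatible `warn` and `error` methods.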

3 new tests pin both fixes:
   - `does NOT match when code contradicts the message` and the
     EPERM variant guard against the P2 regression.
   - `removes itself from uncaughtException listeners before
     scheduling the rethrow` uses `vi.spyOn(process,
     'removeListener')` and a stubbed setImmediate to assert the
     call order: removeListener fires before setImmediate
     schedules the throw.
   - `does NOT re-enter itself when invoked twice` is a
     belt-and-suspenders loop guard — even if a future refactor
     dropped the removeListener call, the test would catch
     runaway scheduling.

Verified locally:
- packaged vitest: 18/18 (was 13, +3 new tests; +2 negative-guard
  tests for the P2 filter; -0 deletions)
- packaged tsc -p tsconfig.json --noEmit: clean
- packaged tsc -p tsconfig.tests.json --noEmit (the CI killer): clean
2026-05-09 01:16:23 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| daemon | Bug Fix: Media generation task state is volatile and lost on daemon restart #648 (#884) | 2026-05-09 00:00:18 +08:00 |
| desktop | feat(desktop): export artifacts directly to PDF (#532) | 2026-05-08 23:42:12 +08:00 |
| landing-page | release: Open Design 0.5.0 (#820) | 2026-05-08 00:41:01 +08:00 |
| packaged | fix(packaged): swallow harmless setTypeOfService EINVAL from undici (#895) (#906) | 2026-05-09 01:16:23 +08:00 |
| web | fix(i18n): rename live artifact tab label in zh-CN and zh-TW (#969) | 2026-05-09 01:15:14 +08:00 |
| AGENTS.md | test(e2e): gate beta packaged runtime (#637) | 2026-05-06 17:44:29 +08:00 |