[hero_lifecycle] shared admin RPC helpers hardcode bare /api/rpc; should be /api/{domain}/rpc #151

Closed
opened 2026-06-14 15:00:53 +00:00 by sameh-farouk · 2 comments
Member

The shared admin/web RPC helpers — window.rpc (parts/hero_scripts.html) and the api_docs/jobs/logs panes' rpc-url — POST to bare /api/rpc. But serve_domains serves each domain at /api/{domain}/rpc (single domain = /api/main/rpc), so live RPC from any single-domain migrated admin returns -32000 Backend unavailable.

Discovery/spec already use domain-scoped URLs (DomainCache::resolve emits /api/{domain}/openrpc.json); only the RPC-call path is bare.

Fix options: (a) serve_domains also serves a bare /api/rpc alias for the single domain, or (b) the shared components use the resolved /api/{domain}/rpc. Affects all single-domain migrated admins (planner now, collab soon).
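The mismatch can be illustrated with a small model of the route table. This is only a sketch, not the real `serve_domains`: `servedRpcPaths` and `BARE_RPC` are hypothetical names, and the only facts taken from the report are the `/api/{domain}/rpc` shape and the single domain being named `main`.

```typescript
// Hypothetical model of which RPC paths a service exposes (not the real serve_domains).
const BARE_RPC = "/api/rpc";

function servedRpcPaths(domains: string[]): Set<string> {
  // Per the report, each domain is served only at /api/{domain}/rpc.
  return new Set(domains.map((d) => `/api/${d}/rpc`));
}

// What the shared helpers POST to, versus what a single-domain peer serves.
const served = servedRpcPaths(["main"]);
const helperHits: boolean = served.has(BARE_RPC);        // bare path: not mounted
const scopedHits: boolean = served.has("/api/main/rpc"); // domain-scoped path: mounted

console.log({ helperHits, scopedHits });
```

Either fix option closes the gap: (a) adds the bare path to the served set, (b) changes what the helpers look up.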

Author
Member

Triage (verified on-box) — difficulty: M, breaking: no (but the obvious quick fix crashes)

Bug confirmed on-the-wire. Shared window.rpc (parts/hero_scripts.html:37) + the api_docs/jobs/logs/terminal panes POST bare /api/rpc; the admin proxy forwards it verbatim to the peer's socket-local /api/rpc (openrpc_proxy/mod.rs:284). But serve_domains serves single-domain peers only at /api/{domain}/rpc (= /api/main/rpc), with no bare /api/rpc (manifest.rs:803-804), so the peer 404s → reshaped to -32000 Backend unavailable. Discovery/spec already work (domain-scoped openrpc.json). This is why hero_planner ships an interim domain-scoped rpc_url.

Not breaking: the correct fix (a single-domain bare /api/rpc alias) is purely additive on the wire — existing /api/{domain}/rpc callers untouched, machine SDKs already POST the domain-scoped path. No downstream build breaks.

⚠️ Difficulty is M, not S — the obvious quick fix panics. Naively merging oschema_server::build_router into serve_rpc_domains (the tempting ~5-line approach) double-registers /api/ping (already mounted by service_root_router, rpc.rs:377) → axum panics on startup for every service (the documented die-loudly path). It also won't compile as written (no RpcState in scope at serve_rpc_domains), and a per-domain variant duplicate-route-panics on multi-domain services (e.g. hero_proc, 4 domains).
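To make the failure mode concrete, here is a rough model of a router that refuses duplicate paths. It is a sketch under assumptions: `StrictRouter` is a hypothetical stand-in (axum panics on overlapping routes, this version throws), and the exact route list of the merged-in router is assumed for illustration.

```typescript
// Hypothetical stand-in for a router that rejects duplicate paths.
// axum panics on overlapping routes; this model throws instead.
class StrictRouter {
  private readonly paths = new Set<string>();

  route(path: string): this {
    if (this.paths.has(path)) {
      throw new Error(`duplicate route: ${path}`);
    }
    this.paths.add(path);
    return this;
  }

  merge(other: string[]): this {
    for (const p of other) this.route(p);
    return this;
  }
}

// Assumed for illustration: the root router already owns /api/ping, and the
// router being merged brings its own /api/ping along with the RPC paths.
const rootRoutes = ["/api/ping"];
const mergedIn = ["/api/ping", "/api/rpc", "/api/openrpc.json"];

function tryNaiveMerge(): string {
  try {
    new StrictRouter().merge(rootRoutes).merge(mergedIn);
    return "ok";
  } catch (e) {
    return (e as Error).message;
  }
}

console.log(tryNaiveMerge()); // duplicate route: /api/ping
```

The same check explains the multi-domain case: mounting a bare alias once per domain registers `/api/rpc` more than once.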

Correct approach: gate on domains.len()==1 at the launcher (serve_rpc_domains, manifest.rs:736); mount only /api/rpc + /api/openrpc.json for the lone domain (NOT /api/ping/health/heroservice — those already exist); update the ~4 in-repo #[cfg(test)] assertions that hardcode bare /api/rpc (e.g. logs/mod.rs, core/tests.rs); then revert hero_planner's interim domain-scoped rpc_url (planner_admin/src/router.rs:62-66, per its own TODO).
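The gate described above might look roughly like this. It is a sketch, not the real `serve_rpc_domains`: `routesFor` and `hasDuplicates` are hypothetical helpers, and the pre-existing root routes are reduced to `/api/ping` for brevity.

```typescript
// Hypothetical sketch of the launcher-side gate (not the real serve_rpc_domains).
function routesFor(domains: string[]): string[] {
  // Route assumed to be mounted already by the root router.
  const routes = ["/api/ping"];

  for (const d of domains) {
    routes.push(`/api/${d}/rpc`, `/api/${d}/openrpc.json`);
  }

  // Alias only when there is exactly one domain, and only the two paths
  // that do not exist yet, so nothing is registered twice.
  if (domains.length === 1) {
    routes.push("/api/rpc", "/api/openrpc.json");
  }
  return routes;
}

const hasDuplicates = (rs: string[]): boolean => new Set(rs).size !== rs.length;

const single = routesFor(["main"]);
const multi = routesFor(["a", "b"]);

console.log(single.includes("/api/rpc"), multi.includes("/api/rpc"));
```

Because the alias is added once at the launcher rather than once per domain, multi-domain services keep their current route table unchanged.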

Option (b) alternative (no server alias): switch the shared templates to the resolved /api/{domain}/rpc — but that's 7 edit sites + an async DomainCache hop into two currently-synchronous index handlers, for the same user-visible result. Option (a) is preferred.

Consumers of the shared helper: hero_lib default shell, hero_planner_admin, hero_sync_gmail_admin (single-domain — also broken today). hero_proc_admin uses its own per-domain JS rpc() and is unaffected. Option (a) fixes everyone server-side with zero per-consumer edits.

Author
Member

Client half merged to development (squash) — completing #151 alongside the already-landed server control plane.

What landed (client side, in shared hero_lifecycle):

  • window.rpc (templates/parts/hero_scripts.html) now splits routing: ping/discover_domains → bare /api/rpc (control plane), data methods → /api/{domain}/rpc (domain discovered once via discover_domains, cached; optional 3rd domain arg for multi-domain admins).
  • api-docs pane rpc_url is domain-scoped in both constructors — api_docs/router.rs::pane_handler (the actually-served path) and api_docs/mod.rs::from_app — and the domain <select> now switches rpc-url alongside spec-url.
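The split routing in the first bullet can be sketched as follows. This is not the shipped `window.rpc`: `rpcUrlFor` and `makeDomainResolver` are hypothetical names, discovery is injected as a plain function so the sketch runs without a server, and picking the first discovered domain (with a `main` fallback) is an assumption.

```typescript
// Hypothetical sketch of the split-routing decision (not the shipped window.rpc).
const CONTROL_PLANE_METHODS = new Set(["ping", "discover_domains"]);

function rpcUrlFor(method: string, domain: string): string {
  // Control-plane methods stay on the bare endpoint; data methods are domain-scoped.
  return CONTROL_PLANE_METHODS.has(method) ? "/api/rpc" : `/api/${domain}/rpc`;
}

// Discover the domain once and reuse it. `discover` stands in for the
// discover_domains call; using the first entry is an assumption.
function makeDomainResolver(discover: () => string[]): (override?: string) => string {
  let cached: string | undefined;
  return (override?: string): string => {
    if (override !== undefined) return override; // explicit domain for multi-domain admins
    if (cached === undefined) cached = discover()[0] ?? "main";
    return cached;
  };
}

let discoverCalls = 0;
const resolveDomain = makeDomainResolver(() => {
  discoverCalls += 1;
  return ["main"];
});

const pingUrl = rpcUrlFor("ping", resolveDomain());
const dataUrl = rpcUrlFor("workspace_list", resolveDomain());
const otherUrl = rpcUrlFor("workspace_list", resolveDomain("other"));

console.log(pingUrl, dataUrl, otherUrl, discoverCalls);
```

Keeping the URL choice in a pure function separate from the transport makes the routing rule easy to check without a browser.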

Verification: hero_lifecycle builds; 51/51 lib tests pass serially (the context-test flakes are pre-existing parallel/disk races); live browser E2E on hero_planner: connection indicator green (ping → control plane → pong), window.rpc('workspace_list') → /api/main/rpc → real data, api-docs rpc-url domain-scoped, 0 RPC console errors.

Coordination / follow-ups:

  • hero_planner is the only consumer needing a manual step: it vendors hero_scripts.html (#152 workaround), so it must re-sync that partial and drop its interim rpc_url override after bumping the hero_lib dep. Every other consumer that mounts the shared api-docs pane gets the fix transparently on rebuild.
  • The jobs/logs/terminal panes still hardcode bare /api/rpc (same class of bug; no current consumer hits it — hero_proc uses its own rpc-peer wiring) — worth a small follow-up.
Reference
lhumina_code/hero_lib#151