proc: verify new RPC + multi-domain contract (server only) #152
lhumina_code/hero_proc#152
proc: verify new RPC + multi-domain contract (server only)
Scope: server RPC + multi-domain correctness only. Admin crates (`hero_proc_admin`, `hero_proc_admin_dx`, `hero_proc_admin_dx_app`) are out of scope and untouched. `hero_router` is not modified (only consumed/listed).

Ground truth derived from a live build + infocheck + contract probe + full test run on the latest `development` of both `hero_proc` and `hero_skills`, against hero_lib `development` HEAD `925b3df9` (the `Cargo.lock` was refreshed from a stale `a9b14b60`).

## Phase 0: ground truth
### Static / build

- `cargo check --workspace` ✅ green
- `cargo clippy --workspace -- -D warnings` ✅ green (0 warnings)
- `lab infocheck` ✅ all in-scope crates clean (the only failure is `hero_proc_admin_dx_app`, out of scope, not a workspace member)
- `lab build --install` ✅ server + admin + cli built; admin_dx* skipped (disabled)

### Live multi-domain contract probe on `hero_proc/rpc.sock`

- `GET /api/domains.json`
- `GET /api/{domain}/openrpc.json` ×4
- `POST /api/{domain}/rpc` (ping / sources / secret_list / service_list)
- `GET /health.json`, `GET /heroservice.json`
- `GET /api/{domain}/events`

Integration suite (`--basic --functional --extended`): 278 passed / 4 stable failures (+1 transient flake: `uc39_batch_insert` once hit `logger I/O error: No such file or directory`, passed on rerun).

## Root causes (verified, not guessed)
- `web.rs::extra_router` serves the job-log SSE handler at top-level `/events`. The hero_lib `serve_domains` macro, the SDK, and the schema (`oschema/logs/logs.oschema:53`, `stream_job(job_sid) @sse(...)`) all use the canonical `/api/{domain}/events`. The generated logs spec already advertises the correct `x-sse` extension (endpoint: `/events`, filter: `job_sid`); only the served path is wrong. hero_router PR #120/#121 aligned to forward `/api/{domain}/events` verbatim and derive channels from `x-sse`, so this is exactly the contract it now consumes.
- `basic::cleanup::clean_test_data_is_idempotent` → test bug. It asserts that a single `clean_by_tag` returns 0, relying on a prior subtest having done the first clean (fails in isolation and on a persistent DB). Server `clean_by_tag` is genuinely idempotent (live: first/second/third all return 0). Fix the test to be self-contained; do not weaken assertions.
- `uc39_batch_insert` → transient/env flake, not reproducible.

## Downstream (LIST only; hero_router not modified)
hero_router `development` is already aligned to the canonical contract (PR #120 `8dcffe5`: forward verbatim; PR #121 `d1dfe88`: derive SSE from `x-sse`, multiplex canonical `/events`). Once hero_proc serves `/api/logs/events`, SSE works end-to-end through the router with no router change. No other raw-RPC consumers were found broken by the current wire in scope.

## Open decisions (resolved)
- Remove stale top-level `/schema/`: yes (unreferenced; macro uses `oschema/`).
- Serve SSE at `/api/{domain}/events` + keep `/events` alias; no `sse.json` stubs (router dropped them).

## Task checklist
1. SSE at `/api/logs/events` (+ keep `/events` alias), in `web.rs`
2. `clean_test_data_is_idempotent` self-contained
3. `disabled = true` in `crates/hero_proc_test/service.toml`
4. Remove stale top-level `/schema/`
5. hero_lib `development` lock bump `a9b14b60` → `925b3df9`

Verify: `cargo check`/`clippy`, `lab build`, full integration suite green, live re-probe of `/api/logs/events`, graceful + force shutdown. PR → `development`.

All tasks done and verified: PR #153 → `development`.

| # | Task | Commit |
|---|------|--------|
| 1 | SSE at `/api/logs/events` (+ `/events` alias) | `7f7a11d` |
| 2 | `clean_test_data_is_idempotent` self-contained | `d9f569a` |
| 3 | `disabled = true` in `hero_proc_test/service.toml` | `4a36c76` |
| 4 | Remove stale top-level `/schema/` | `84465cf` |
| 5 | Lock bump `a9b14b60 → 925b3df9` | `7100287` |

Extra finding (folded into task 2, commit `d9f569a`): `clean_test_data_removes_everything` was also failing; the root cause was traced live, not guessed. The test deleted scheduled actions via `schedule_delete` (which removes the action without its logs) before `clean_by_tag`, orphaning every scheduled action's log subtree so its entries stayed queryable. The server delete was proved correct by isolating it: a direct `logs.delete(src)` on a leftover src dropped `count 233 → 0` and held. Fixed test-side with `schedule_disable` + drain-to-terminal + flush settle; no assertions weakened. (Same class as #126 / #141.)

Verification:
`cargo check` ✅ · `clippy --all-targets -D warnings` ✅ (0 warnings) · `lab build --install` ✅ · full suite `--basic --functional --extended` 282 passed / 0 failed, run twice (cleanup test was previously flaky) · graceful SIGTERM ✅ · live re-probe `GET /api/logs/events?job_sid=…` now served (was 404).

Update: rebased onto `development` after `953e752` (remove tracked `Cargo.lock` + gitignore). The earlier lock-bump commit (task 5) was dropped: with `Cargo.lock` now untracked, hero_lib resolves fresh from `branch = "development"` (currently `60867649`, floated up from `925b3df9`). Rebuilt + re-ran the full suite on the new base: 282 passed / 0 failed, clippy clean, SSE path live (`GET /api/logs/events` → handler 404 on bogus sid, i.e. route mounted).

Rebased commit SHAs on PR #153:
- `e6d1e11` fix(server): SSE at `/api/logs/events`
- `da124ac` fix(test): cleanup tests self-contained + race-free
- `fd2ede4` chore: disable hero_proc_test service
- `a6a55df` chore: remove stale top-level `/schema/`

PR #153 is mergeable into `development`.
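
For reference, the self-contained shape of the idempotency check from task 2 can be sketched roughly as below. This is only an illustration, not the code in `hero_proc_test`: `Store`, `insert`, and `check_clean_is_idempotent` are hypothetical names, the in-memory map stands in for the real server, and it is assumed that `clean_by_tag` returns the number of entries it removed. The point is that the test performs its own first clean instead of relying on an earlier subtest, so it passes in isolation and on a persistent DB.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the server's tagged test data.
#[derive(Default)]
struct Store {
    /// tag -> number of entries currently carrying that tag
    entries: HashMap<String, usize>,
}

impl Store {
    /// Adds `count` entries under `tag` (hypothetical helper for the sketch).
    fn insert(&mut self, tag: &str, count: usize) {
        *self.entries.entry(tag.to_string()).or_insert(0) += count;
    }

    /// Removes everything under `tag`; assumed to return how many entries were removed.
    fn clean_by_tag(&mut self, tag: &str) -> usize {
        self.entries.remove(tag).unwrap_or(0)
    }
}

/// Self-contained check: does its own first clean, then cleans again.
/// Returns (removed by first clean, removed by second clean).
fn check_clean_is_idempotent(store: &mut Store, tag: &str) -> (usize, usize) {
    let first = store.clean_by_tag(tag); // may remove leftovers from earlier runs
    let second = store.clean_by_tag(tag); // must find nothing left
    (first, second)
}

fn main() {
    let mut store = Store::default();
    // Simulate leftovers that a previous subtest or a persistent DB may have left behind.
    store.insert("test", 3);

    let (first, second) = check_clean_is_idempotent(&mut store, "test");
    // The assertion stays as strict as before: a repeated clean removes nothing.
    assert_eq!(second, 0);
    println!("first clean removed {first}, second clean removed {second}");
}
```

The first call is deliberately left unasserted in `main`, since its result depends on whatever state the test starts from; only the repeated call carries the idempotency claim.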