feat: universal Headroom prompt-compression middleware (#151) #152
Reference
lhumina_code/hero_aibroker!152
Closes #151.
What
Universal Headroom prompt-compression middleware wired into `Router::chat_completions`. Runs on every backend (openai, openrouter, groq, sambanova, kimi, alibaba, mother brokers). Default-off via `Config::compression_enabled`; force-enabled via the new `--compression` CLI flag.
How it hooks in
In `Router::chat_completions`, the middleware call sits right after `attach_attribution_headers(...)` and before the streaming-vs-blocking dispatch. This single chokepoint covers both response paths. Streaming responses are untouched (compression only mutates the request body).
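As a rough illustration of the chokepoint described here, the following std-only sketch shows a default-off flag, a `with_compression(bool)` builder, and a helper that falls back to the original request on any error or panic. Only the names `Router`, `chat_completions`, `with_compression`, `maybe_compress_chat_request`, `Config::compression_enabled`, and `--compression` come from this PR. `ChatRequest`, `apply_cli_flags`, `stand_in_compress`, and `dedup_lines` are hypothetical, the compressor is a toy that stands in for the Headroom call, and `eprintln!` stands in for `tracing::warn!`. It is not the code from the branch.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical stand-in for the broker config; only the flag matters here.
#[derive(Default)]
pub struct Config {
    pub compression_enabled: bool, // `Default` yields `false`, i.e. default-off
}

/// Assumed CLI handling: `--compression` forces the flag on at startup.
pub fn apply_cli_flags(mut config: Config, args: &[&str]) -> Config {
    if args.contains(&"--compression") {
        config.compression_enabled = true;
    }
    config
}

/// Hypothetical, simplified OpenAI-shaped request body.
#[derive(Clone, Debug, PartialEq)]
pub struct ChatRequest {
    pub messages: Vec<String>,
    pub stream: bool,
}

/// Toy compressor: drops consecutive duplicate lines in one message.
fn dedup_lines(text: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    for line in text.lines() {
        if out.last() != Some(&line) {
            out.push(line);
        }
    }
    out.join("\n")
}

/// Stand-in for the upstream compression call (NOT the Headroom API).
/// Errors on an empty message list and panics on the literal message
/// "PANIC", purely so both failure paths can be exercised.
fn stand_in_compress(req: &ChatRequest) -> Result<ChatRequest, String> {
    if req.messages.is_empty() {
        return Err("no messages to compress".to_string());
    }
    if req.messages.iter().any(|m| m == "PANIC") {
        panic!("simulated panic in the upstream call");
    }
    Ok(ChatRequest {
        messages: req.messages.iter().map(|m| dedup_lines(m)).collect(),
        stream: req.stream, // only the body text changes, never the mode
    })
}

/// Fail-open wrapper: any error or panic leaves the request untouched.
pub fn maybe_compress_chat_request(req: ChatRequest) -> ChatRequest {
    let attempt = panic::catch_unwind(AssertUnwindSafe(|| stand_in_compress(&req)));
    match attempt {
        Ok(Ok(compressed)) => compressed,
        Ok(Err(err)) => {
            eprintln!("compression skipped: {err}");
            req
        }
        Err(_) => {
            eprintln!("compression panicked; forwarding original request");
            req
        }
    }
}

pub struct Router {
    compression_enabled: bool,
}

impl Router {
    pub fn new() -> Self {
        Router { compression_enabled: false }
    }

    /// Builder that carries the config flag into the router.
    pub fn with_compression(mut self, enabled: bool) -> Self {
        self.compression_enabled = enabled;
        self
    }

    /// Returns the body that would be dispatched upstream.
    pub fn chat_completions(&self, req: ChatRequest) -> ChatRequest {
        // Attribution headers would be attached before this point.
        let req = if self.compression_enabled {
            maybe_compress_chat_request(req)
        } else {
            req
        };
        // Streaming and blocking dispatch would both receive this same body.
        req
    }
}
```

In this sketch a router would be built as `Router::new().with_compression(config.compression_enabled)`. Because the flag is checked once, before the dispatch split, a streaming request and a blocking request go through the same code path.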
Fail-open
Every error path (serialize, deserialize, panic in the upstream call caught by `catch_unwind`, JSON shape mismatch) emits `tracing::warn!` at `target = "aibroker.compression"` and leaves the request untouched.
rusqlite downgrade
Headroom's `headroom-core` pins `rusqlite 0.32`; the broker was on `0.39`. Cargo's `links = "sqlite3"` constraint forbids two crates linking the same native library in one binary, so the broker was downgraded to `0.32`. Its usage is limited to `middleware/apikey.rs` and `middleware/request_log.rs`, which rely only on stable API (`Connection`, `params!`, `ToSql`, `Error::SqliteFailure`). All 13 `middleware::request_log::tests` pass post-downgrade.
Tests
- `cargo test -p hero_aibroker_server --bin hero_aibroker_server`. Includes 3 new compression-related tests + all SQLite-using tests.
- `openrpc` 3/3, `domains` 15/15 pass. `fake_server` + `e2e` failures pre-date this branch (`PATH_SOCKET` handling, hard-coded `127.0.0.1:0` targets).
- `hero_aibroker_server --fake --compression` on a 29 KB log-heavy prompt:
- `stream: true` with compression on
Files changed
- `Cargo.toml` (workspace): `headroom-proxy` and `headroom-core` git deps pinned at `01fdedc6`; downgraded `rusqlite` from `0.39` → `0.32`
- `crates/hero_aibroker_server/Cargo.toml`
- `crates/hero_aibroker_server/src/config/mod.rs`: `compression_enabled: bool` field + default test
- `crates/hero_aibroker_server/src/service/compression.rs`: `maybe_compress_chat_request` helper with `catch_unwind`, structured tracing, 2 unit tests
- `crates/hero_aibroker_server/src/service/mod.rs`: `pub mod compression;`
- `crates/hero_aibroker_server/src/service/router.rs`: `compression_enabled` field on `Router`, `with_compression(bool)` builder, middleware call in `chat_completions`
- `crates/hero_aibroker_server/src/api_openrpc/mod.rs`, `crates/hero_aibroker_server/src/api_openrpc/admin/common.rs`: `config.compression_enabled` → `Router::with_compression` at both construction sites
- `crates/hero_aibroker_server/src/main.rs`: `--compression` CLI flag forcing `config.compression_enabled = true` at startup
- `README.md`
Deviations from the spec (recorded in the issue comment)
- `tests/compression.rs` - existing tests cover default-off; on-path covered by live test (network-dependent in CI).
- `--compression` CLI flag added (not in original spec); necessary for live-test toggle.
- `libsqlite3-sys` `links` constraint.
Out of scope (separate tickets if/when needed)
- `/v1/messages` compression (`compress_anthropic_request`). Claude/Anthropic models routed via OpenRouter (the broker's default) are already covered by this PR because the broker is OpenAI-shape end to end.
- `compression_enabled` at runtime without restart.
- `tokens_saved`.