Keyspace fragmentation: KEYS / DBSIZE / EXISTS / SCAN ignore hash, list & set keys #39
Reference
lhumina_code/hero_db#39
Summary
KEYS, DBSIZE, EXISTS, and SCAN only see the string keyspace. Keys holding hashes, lists, or sets are invisible to keyspace enumeration and existence checks, even though they exist and are accessible by their own type commands.
Reproduction (passes on redis 7.0.15, fails on hero_db)
Root cause
Storage is a single flat redb table (crates/hero_db/src/stor/redb.rs, table "data") with per-type key prefixes H: / L: / S: / X: (see crates/hero_db/src/db/engine/types.rs). The keyspace ops filter those out or only look at the bare key:
keys(): crates/hero_db/src/db/engine/keyspace.rs:39-48 explicitly continues past H:/L:/S: keys (streams are re-surfaced via :_meta, hash/list/set are not).
dbsize(): keyspace.rs:196-200 counts only keys without an H:/L:/S:/TTL: prefix.
exists(): keyspace.rs:82 does a point-get on the bare key only.
key_type() and del() (keyspace.rs:146-172, :85-144) do walk all prefixes, so TYPE and DEL are already correct. The fix is to bring KEYS/DBSIZE/EXISTS/SCAN in line with them.
Impact
Breaks any client, tool, or admin UI that enumerates or probes the keyspace.
EXISTS returning 0 for a live key is especially dangerous for existence-guarded logic.
Secondary
DBSIZE also over-counts streams: each X:/XG:/XC:/XP: sub-key is counted as a separate key.
Shares a root cause (flat prefixed keyspace, no per-key type index) with the type-safety issue #40.
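To make the intended behaviour concrete, here is a minimal sketch of type-agnostic keyspace enumeration over a flat prefixed table. It is not hero_db's code: a BTreeMap stands in for the redb table, the function names are hypothetical, and the sub-key layout `<prefix><key>:<suffix>` (with no ':' inside logical keys of non-string types) is an assumption made only for illustration. The prefixes themselves are the ones named in this issue.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Per-type prefixes named in the issue (hash, list, set, stream sub-keys).
const TYPE_PREFIXES: [&str; 7] = ["H:", "L:", "S:", "X:", "XG:", "XC:", "XP:"];
/// TTL entries are metadata, not keys of their own.
const TTL_PREFIX: &str = "TTL:";

/// Map a raw table key to the logical key it belongs to.
/// ASSUMPTION: prefixed entries look like `<prefix><key>:<suffix>`.
fn logical_key(raw: &str) -> Option<&str> {
    if raw.starts_with(TTL_PREFIX) {
        return None;
    }
    for prefix in TYPE_PREFIXES.iter() {
        if let Some(rest) = raw.strip_prefix(*prefix) {
            // Everything up to the first ':' is treated as the logical key.
            return Some(rest.split(':').next().unwrap_or(rest));
        }
    }
    // No prefix: a plain string value stored under the bare key.
    Some(raw)
}

/// KEYS-like enumeration: every logical key exactly once, whatever its type.
fn logical_keys(table: &BTreeMap<String, Vec<u8>>) -> BTreeSet<String> {
    table
        .keys()
        .filter_map(|raw| logical_key(raw))
        .map(|key| key.to_string())
        .collect()
}

/// DBSIZE-like count: sub-keys of one hash or stream collapse into one key.
fn dbsize(table: &BTreeMap<String, Vec<u8>>) -> usize {
    logical_keys(table).len()
}

/// EXISTS-like probe: bare key first, then one range lookup per prefix.
fn exists(table: &BTreeMap<String, Vec<u8>>, key: &str) -> bool {
    if table.contains_key(key) {
        return true;
    }
    TYPE_PREFIXES.iter().any(|prefix| {
        let start = format!("{prefix}{key}:");
        table
            .range(start.clone()..)
            .next()
            .map_or(false, |(raw, _)| raw.starts_with(&start))
    })
}
```

Under these assumptions a hash with two fields and a stream with several sub-keys each count once, and a probe for a hash, list, or set key succeeds. A real fix would presumably reuse the prefix walk that key_type() and del() already perform rather than a helper like this.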
Filed from a Redis-compatibility audit (hero_db v0.6.0 @ main aacaad1). Every finding was cross-validated: the same probe passes on stock redis-server 7.0.15 and fails on hero_db, using the Apache Kvrocks gocase suite (Go) and a redis-py 8.0 probe (Python). Root causes verified against the source.
Org-wide consumer blast-radius audit (#39 + #40)
Scanned all 132 non-empty repos across 6 orgs (lhumina_code, geomind_code, geomind_research, ourworld_it, ourworld_org, projectmycelium): remote shallow-clone of each default branch, then classified hero_db references and drilled into the data consumers.
Who actually consumes hero_db (data path)
HeroDbClient; exposes generic herodb_keys/herodb_exists/get/set/… to Rhai scripts
herodb_keys() + herodb_exists() (clients_rhai/src/herodb.rs:480,491)
HeroDbClient; set/get/del (string) + sadd/smembers (set), disjoint key namespaces
storage.rs:100-147)
redis.get/set/del/hset/hgetall/sadd/smembers on distinct keys + database.create
redis.hset/hget/hgetall/hdel on one fixed hash key
HeroDBServerClient for ontology.*/graph (its own scan/keys are NOT hero_db's)
6378 config
hero_db_*.sock (stale, #34-class)
(The ~25 raw "risky" grep hits were false positives: Path::exists() and "keys" in OpenRPC JSON schemas.)
#39 blast radius (KEYS/DBSIZE/EXISTS/SCAN see all types): narrow
herodb_keys() / herodb_exists() are exposed to Rhai scripts. Today they see string keys only; after #39 they'd also surface hash/list/set keys (relevant if the instance is shared with hero_slides/wallet/aibroker). A script that enumerates keys and then GETs each one could get non-string keys back (nil, or WRONGTYPE once #40 lands). Regression-check the Rhai scripts.
EXISTS flipping to true for live hash/list/set keys is a correctness fix nobody currently depends on.
#40 blast radius (type-safety / WRONGTYPE + possible type index): two risks
{prefix}{id} vs index_key; hero_aibroker set/hash/string on distinct keys; hero_slides single hash key). None reuses a key as two types. Only wildcard is hero_lib_rhai's generic scripting surface, but it exposes string ops only, so it can't create conflicting types itself.
Recommendations
TYPE/DEL already walk all prefixes). No on-disk change, zero migration risk. Only gate: a regression pass on hero_lib_rhai / Rhai scripts that enumerate keys.
cargo_dep consumers.
Audit method: org-wide remote scan of 132 repos / 6 orgs (not just locally-cloned repos), per the cross-org blast-radius convention.
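For the regression gate on scripts that enumerate keys and then GET each one, the pattern to look for can be sketched as follows. Everything here is hypothetical: the KeyspaceClient trait, MockDb, and collect_strings are illustration-only names and not the HeroDbClient or Rhai API. The mock assumes the post-#39 behaviour (keys() returns keys of every type) and the post-#40 behaviour (GET on a non-string key fails with a WRONGTYPE error, as stock Redis does).

```rust
use std::collections::BTreeMap;

/// Stored value in the mock: either a string or some other Redis type.
enum Value {
    Str(String),
    Other(&'static str), // type name only, e.g. "hash", "list", "set"
}

/// Hypothetical minimal client surface; NOT the real HeroDbClient API.
trait KeyspaceClient {
    fn keys(&self) -> Vec<String>;
    fn key_type(&self, key: &str) -> Option<&'static str>;
    fn get(&self, key: &str) -> Result<Option<String>, String>;
}

/// In-memory stand-in for a server whose KEYS sees every type.
struct MockDb(BTreeMap<String, Value>);

impl KeyspaceClient for MockDb {
    fn keys(&self) -> Vec<String> {
        self.0.keys().cloned().collect()
    }

    fn key_type(&self, key: &str) -> Option<&'static str> {
        self.0.get(key).map(|value| match value {
            Value::Str(_) => "string",
            Value::Other(name) => *name,
        })
    }

    fn get(&self, key: &str) -> Result<Option<String>, String> {
        match self.0.get(key) {
            None => Ok(None),
            Some(Value::Str(s)) => Ok(Some(s.clone())),
            // Mirrors the error stock Redis returns for GET on a non-string key.
            Some(Value::Other(_)) => Err(
                "WRONGTYPE Operation against a key holding the wrong kind of value".to_string(),
            ),
        }
    }
}

/// Enumerate keys, but only GET those whose TYPE is "string".
fn collect_strings<C: KeyspaceClient>(db: &C) -> Result<BTreeMap<String, String>, String> {
    let mut out = BTreeMap::new();
    for key in db.keys() {
        if db.key_type(&key) != Some("string") {
            continue; // skip hash/list/set keys that KEYS now surfaces
        }
        if let Some(value) = db.get(&key)? {
            out.insert(key, value);
        }
    }
    Ok(out)
}
```

A script that skips the type check and calls get() on every enumerated key would hit the error branch as soon as the instance holds a hash, list, or set key, which is the case the regression pass is meant to catch.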