embedding loading broken, can't load more books/collections, libraries don't show up #23
We need to do something here, this is too slow.
We should calculate a hash per collection (files sorted and hashed) or per file (choose one and test), then check it against the KVS to decide whether a page needs processing.
It's too slow: when we use the embedder directly it runs at 100+ embeddings per second, but here it is much lower. Why?
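A rough sketch of the hashing idea above, in Python for illustration only (it is not taken from the project's code). The helper names, the `collection:demo` key, and the plain dict standing in for the KVS are all assumptions. It hashes files in sorted order so the result does not depend on directory listing order.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """SHA-256 of a single file's bytes (the per-file option)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def collection_hash(root: Path) -> str:
    """One hash for a whole collection (the per-collection option).

    Files are visited in sorted order, and each relative path is mixed in
    so that renaming a file also changes the result.
    """
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(file_hash(path).encode("ascii"))
        digest.update(b"\n")
    return digest.hexdigest()


def needs_processing(kvs: dict, key: str, current_hash: str) -> bool:
    """True when no hash is stored under `key` or the stored one differs.

    `kvs` is a plain dict here, a stand-in for the real key-value store.
    """
    return kvs.get(key) != current_hash
```

The caller would store the new hash only after processing succeeds, for example `kvs["collection:demo"] = collection_hash(root)`, so a failed run is retried on the next start.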
RPC API: http://127.0.0.1:8883/rpc
OpenRPC: http://127.0.0.1:8883/rpc/discover
Books dir: examples/ebooks_local
Use books_client to interact with the API.
Starting Hero Books server at http://127.0.0.1:8883
Books directory: examples/ebooks_local
Hero Embedder: http://localhost:3752/rpc
Exporting 7 books for web serving...
Found collection: python_intro (namespace: coding)
Exported 'coding_python': 2 pages, 0 images (namespace: coding)
Found collection: rust_basics (namespace: coding)
Exported 'coding_rust': 2 pages, 0 images (namespace: coding)
Found collection: rust_basics (namespace: herobooks)
Found collection: scifi_classics (namespace: herobooks)
Found collection: python_intro (namespace: herobooks)
Found collection: hero_books_docs (namespace: herobooks)
Found collection: hero_redis_docs (namespace: herobooks)
Found collection: philosophy_intro (namespace: herobooks)
Found collection: hero_tools_docs (namespace: herobooks)
Exported 'herobooks_guide': 4 pages, 0 images (namespace: herobooks)
Found collection: hero_redis_docs (namespace: herobooks)
Exported 'herobooks_redis': 2 pages, 0 images (namespace: herobooks)
Found collection: hero_tools_docs (namespace: herobooks)
Exported 'herobooks_tools': 2 pages, 0 images (namespace: herobooks)
Found collection: philosophy_intro (namespace: literature)
Exported 'literature_philosophy': 2 pages, 0 images (namespace: literature)
Found collection: scifi_classics (namespace: literature)
Exported 'literature_scifi': 2 pages, 0 images (namespace: literature)
Export directory: /var/folders/c1/m7s3rsy512b7yf46z05hbcy80000gn/T/herotools_export
Indexing books for vector search...
Found 3 namespaces: ["herobooks", "coding", "literature"]
Indexing 3 books to namespace 'herobooks'
Indexing book: herobooks_redis
2 pages to process, 0 unchanged
Processed: overview (14 Q&A pairs) [Q&A cached]
Processed: vector_search (17 Q&A pairs) [Q&A cached]
Uploading 31 documents to hero_embedder...
Book 'herobooks_redis' indexed: 2 processed, 0 unchanged, 0 errors, 31 documents uploaded (31 from cache)
Indexed 'Hero Redis Guide': 2 pages, 31 documents
Indexing book: herobooks_guide
4 pages to process, 0 unchanged
Processed: 1_introduction (10 Q&A pairs) [Q&A cached]
Processed: 2_getting_started (20 Q&A pairs) [Q&A cached]
Something is wrong with loading: I loaded the demo for mycelium and it did not show up in books when I started it.
Where do we keep track of which books we have?
And libraries?
Maybe this needs to be in redis (or hero_redis).
I see there are multiple demos,
but this should be more generic: all libraries should be remembered.
There is no point in having these multiple demos.
Addressed in PR #24 (branch development-fixes-feb).
Commits: 08591c5, 23ea189
kvs_set(), kvs_get(), kvs_delete(), kvs_list()
- book:{name} → JSON with name, namespace, path, page_count, title
- page_hash:{namespace}:{book}:{page} → content hash
Note: KVS requires the hero_embedder SDK change (namespace_create_with_quality()), which has been pushed to hero_embedder development. Until hero_embedder is rebuilt with these changes, KVS operations will log warnings and fall back to file-based state.
KVS Persistence Complete
What was done
Write side (persist_books_state()) was already working: it stores book metadata in the hero_embedder KVS namespace herobooks_state.
Read side (load_books_state()) was missing and is now implemented:
- compute_book_content_hash(): Computes a single hash per book by hashing all page contents + page names + page count. Detects any content change, page reorder, or page addition/removal.
- persist_books_state(): Enhanced to store content_hash alongside existing metadata (name, namespace, page_count, image_count, export_path).
- load_books_state(): New function that reads all book:* keys from KVS on startup and returns a map of book_name → (namespace, content_hash, page_count).
- index_books_for_search(): Modified to accept persisted state. For each exported book, it computes the current content hash and compares it against the KVS-stored hash. If they match, the book is skipped entirely (embeddings are already in hero_embedder).
Startup flow updated: Step 3 loads KVS state, Step 4 indexes (skipping unchanged books), Step 5 persists updated state.
Hash-based change detection
Two layers of hash checking now work together:
.ai/*.toml metadata): skip AI Q&A re-extraction for unchanged pages
Test results
Cold start (no KVS state): All 3 books indexed normally, 26 pages processed, 667 documents uploaded. State persisted to KVS.
Warm restart (KVS state exists):
Search works correctly after warm restart (embeddings persist in hero_embedder).
Also fixed
Branch: development-fixes-feb