processing is wrong #22

Open
opened 2026-02-07 15:52:11 +00:00 by despiegk · 3 comments
Owner
Processing complete: 0 processed, 24 unchanged, 0 skipped, 0 errors
  Processed: 0, Unchanged: 24, Errors: 0
----------------------------------------
Processing: mycelium_hero_plan
  Markdown files found: 3
  Existing .ai/ metadata: 3 files
Processing collection 'mycelium_hero_plan' with 3 pages
  Unchanged: intro
  Unchanged: we_are_different
  Unchanged: realitycheck

Processing complete: 0 processed, 3 unchanged, 0 skipped, 0 errors
  Processed: 0, Unchanged: 3, Errors: 0
----------------------------------------
Processing: mycelium_network_tech
  Markdown files found: 23
  Existing .ai/ metadata: 6 files
Processing collection 'mycelium_network_tech' with 23 pages
  Unchanged: threefold_grid
  Unchanged: message_bus
    Using hero_embedder for embeddings
  Processed: name_services
  Unchanged: message
  Unchanged: what_is_it
  Unchanged: secure_network
  Unchanged: secure_mail
  Unchanged: secure_chat
  Unchanged: use_the_app
  Unchanged: packet
    Using hero_embedder for embeddings
  Processed: mycelium_app
  Unchanged: api_yaml
  Unchanged: additional_information
  Unchanged: introduction
  Unchanged: features
    Using hero_embedder for embeddings
  Processed: mycelium_apps
    Using hero_embedder for embeddings
  Processed: data_packet
  Unchanged: secure_file_management
  Unchanged: secure_calendar
  Unchanged: soon
    Using hero_embedder for embeddings
  Processed: linux_installation
  Unchanged: ai
    Using hero_embedder for embeddings
  Processed: secure_contacts

Processing complete: 6 processed, 17 unchanged, 0 skipped, 0 errors
  Processed: 6, Unchanged: 17, Errors: 0
----------------------------------------
Processing: mycelium_society
  Markdown files found: 17
Processing collection 'mycelium_society' with 17 pages
    Using hero_embedder for embeddings
  Processed: ecommerce
    Using hero_embedder for embeddings
  Processed: calendar
    Using hero_embedder for embeddings
  Processed: freezone
    Using hero_embedder for embeddings
  Processed: products
    Using hero_embedder for embeddings
  Processed: pricing
    Using hero_embedder for embeddings
  Processed: register
    Using hero_embedder for embeddings
  Processed: referrals
    Using hero_embedder for embeddings
  Processed: mail
    Using hero_embedder for embeddings
  Processed: coder
    Using hero_embedder for embeddings
  Processed: country_as_code
    Using hero_embedder for embeddings
  Processed: names
    Using hero_embedder for embeddings
  Processed: intro
    Using hero_embedder for embeddings
  Processed: product_overview
    Using hero_embedder for embeddings
  Processed: chat

We should first process all questions/answers in the collections, then export.

Only then do we process books, which start from an exported library.

Then we use the metadata (which was also exported with the library) to do the embeddings. This should be fast.
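
The ordering requested above could be sketched roughly as follows. This is only an illustration in Python, not the hero_books code: the `Library` dataclass, the three stage functions, and the dict shapes are all hypothetical stand-ins, and the Q&A step is a placeholder string rather than a real LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class Library:
    """Hypothetical stand-in for an exported library (pages plus metadata)."""
    pages: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)


def process_questions(collections: dict) -> dict:
    """Stage 1: build Q&A metadata per page, at collection level."""
    return {
        (name, page): {"qa": f"Q&A for {page}"}  # placeholder for the LLM call
        for name, pages in collections.items()
        for page in pages
    }


def export_library(collections: dict, metadata: dict) -> Library:
    """Stage 2: export the pages together with their metadata."""
    lib = Library()
    for name, pages in collections.items():
        for page, text in pages.items():
            lib.pages[(name, page)] = text
    lib.metadata = dict(metadata)
    return lib


def embed_books(lib: Library) -> list:
    """Stage 3: books start from the exported library and reuse its metadata."""
    return [(key, lib.metadata[key]["qa"]) for key in sorted(lib.pages)]


def run_pipeline(collections: dict) -> list:
    """Run the three stages strictly in the requested order."""
    metadata = process_questions(collections)
    lib = export_library(collections, metadata)
    return embed_books(lib)
```

The point of the sketch is that stage 3 only reads what stage 2 exported, so the embedding step has no reason to regenerate Q&A and should stay fast.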

despiegk added this to the now milestone 2026-02-07 15:52:24 +00:00
despiegk added this to the ACTIVE project 2026-02-07 15:52:29 +00:00
Author
Owner

========================================
Demo Mycelium complete!
========================================

Next steps:
Start server: make run-mycelium
Search UI: http://localhost:8883/search

This output is probably not right at the end of demo loading.

Author
Owner

Use this:

There is a new RPC endpoint for hero_embedder: it's a key/value store per namespace. We can use this to remember info we don't want to forget (instead of a default Redis). Use this for checking if we need to process a page again, or for other config.

Use it for remembering the metadata we need to keep.
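
As a rough sketch of how a per-namespace key/value store could answer "do we need to process this page again?": the `NamespacedKV` class below is an in-memory stand-in written for illustration. The actual hero_embedder RPC methods are not shown in this issue, so every name here (`get`, `set`, `needs_processing`, `mark_processed`) is hypothetical.

```python
import hashlib


class NamespacedKV:
    """In-memory stand-in for a per-namespace key/value store (hypothetical API)."""

    def __init__(self):
        self._data = {}

    def get(self, namespace: str, key: str):
        # Returns None when the namespace or the key is unknown.
        return self._data.get(namespace, {}).get(key)

    def set(self, namespace: str, key: str, value: str) -> None:
        self._data.setdefault(namespace, {})[key] = value


def content_hash(content: str) -> str:
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def needs_processing(kv: NamespacedKV, namespace: str, page: str, content: str) -> bool:
    """True when the stored hash is missing or differs from the current content."""
    return kv.get(namespace, page) != content_hash(content)


def mark_processed(kv: NamespacedKV, namespace: str, page: str, content: str) -> None:
    """Record the hash only after the page was processed successfully."""
    kv.set(namespace, page, content_hash(content))
```

Keeping the check and the write separate means a page whose processing fails is retried on the next run, because its hash is never recorded.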

Owner

Addressed in PR #24 (branch development-fixes-feb).

Commit: 23ea189 Phase 2

  • Restructured pipeline order: process_collections_for_qa() → export_books_for_serving() → index_books_for_search()
  • Q&A extraction now happens at collection level (before export)
  • Exporter copies .ai/ metadata from source collection to export directory
  • Indexer uses cached Q&A from .ai/ metadata — never re-calls LLM if hash matches
  • Hash-based change detection: only processes changed files
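
For readers following along, the hash check described in the bullets above might look something like the sketch below. It is not taken from PR #24: the `.ai/<page>.json` file layout, the JSON field names, and the `load_or_extract_qa` function are assumptions made for illustration, and `extract_qa` stands in for whatever performs the LLM call.

```python
import hashlib
import json
from pathlib import Path


def load_or_extract_qa(page_path: Path, extract_qa):
    """Return (metadata, reused); `extract_qa` is a stand-in for the LLM call."""
    content = page_path.read_text(encoding="utf-8")
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()

    # Assumed layout: <collection>/.ai/<page>.json next to the markdown file.
    meta_path = page_path.parent / ".ai" / f"{page_path.stem}.json"
    if meta_path.exists():
        cached = json.loads(meta_path.read_text(encoding="utf-8"))
        if cached.get("hash") == digest:
            return cached, True  # hash matches: reuse the cached Q&A

    # Missing or stale metadata: extract again and refresh the cache.
    meta = {"hash": digest, "qa": extract_qa(content)}
    meta_path.parent.mkdir(parents=True, exist_ok=True)
    meta_path.write_text(json.dumps(meta), encoding="utf-8")
    return meta, False
```

Under these assumptions, a second run over an unchanged page returns the cached metadata without invoking `extract_qa` at all.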
Reference
lhumina_code/hero_books#22