
HeroIndex

A Tantivy-based full-text search server with OpenRPC socket interface.

Repository: https://forge.ourworld.tf/lhumina_research/hero_index_server

Packages

This workspace contains two packages:

  • heroindex - The search server binary
  • heroindex_client - Client library for connecting to the server

Features

  • Multiple Index Management - Create, delete, and manage multiple Tantivy indexes
  • Dynamic Schemas - Define custom schemas with various field types
  • Full-Text Search - Match queries, phrase queries, fuzzy search
  • Exact Queries - Term queries, range queries, regex, prefix matching
  • Boolean Queries - Combine queries with must/should/must_not clauses
  • Fast Fields - Columnar storage for sorting and aggregations
  • Web Admin UI - Browser-based dashboard for managing databases, queries, and monitoring
  • HTTP JSON-RPC Endpoint - POST /rpc for HTTP-based JSON-RPC 2.0 clients
  • MCP Server - Model Context Protocol endpoint at POST /mcp for AI assistant integration
  • OpenRPC Interface - Unix socket + HTTP JSON-RPC interface with discovery
  • Performance Test Tab - Built-in benchmark tool for load testing and search benchmarks
  • Demo Database - Auto-created on first startup with sample documents
  • Concurrent Connections - Multiple clients can connect simultaneously

Binaries

This project builds two binaries:

| Binary | Description |
|---|---|
| heroindex | Full-text search server (main binary) |
| heroindex_test | Test utility for the client library |

Installation

Requirements:

  • Rust 1.92.0+
  • Unix-like operating system (Linux, macOS)

git clone https://forge.ourworld.tf/lhumina_research/hero_index_server.git
cd hero_index_server

# Build and install to ~/hero/bin/
make install

Both binaries are now in ~/hero/bin/ and ready to use!

Add to PATH

export PATH="$HOME/hero/bin:$PATH"

Then run anywhere:

heroindex  # Start the server

From crates.io

cargo install heroindex

(Note: to get the full project with both binaries, build from source.)

Development & Usage

Quick Start

# Build and install
make install

# Run the server (uses defaults - no args needed!)
make run

# In another terminal, use the client
heroindex_test  # or use the client library in your code

Make Commands

# Installation & Running
make build         # Build release binaries
make install       # Build and install to ~/hero/bin/
make installdev    # Install debug build (faster compile)
make run           # Run server with defaults
make rundev        # Run with debug logging

# Testing & Quality
make check         # Fast code check
make test          # Run all tests
make test-all      # Run all tests including integration
make perftest      # Run performance benchmark

# Development
make fmt           # Format code
make fmt-check     # Check code formatting
make lint          # Run clippy linter

# Maintenance
make clean         # Remove build artifacts
make all           # Full cycle: clean → check → test → build
make help          # Show all commands

Server Arguments

The server uses sensible defaults - no arguments required!

# Just run it with defaults
make run

# Or after install, run directly
heroindex

Default Configuration

| Argument | Default | Description |
|---|---|---|
| --dir | ~/hero/var/index/defaulttest | Base directory for all indexes |
| --socket | ~/hero/var/socket_heroindex | Unix socket for RPC interface |
| --http-port | 9753 | HTTP server port (Web UI + API) |
| --http-host | 127.0.0.1 | HTTP server bind address |
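
The default paths above start with `~`, which the shell normally expands but a program receiving the literal string must expand itself. The following is a minimal sketch of such an expansion using only the standard library; `expand_tilde` is a hypothetical helper written for illustration, and how HeroIndex actually resolves these paths is not shown in this README.

```rust
use std::env;
use std::path::PathBuf;

/// Expand a leading "~" or "~/" using the given home directory.
/// Paths without a leading tilde are returned unchanged.
/// (Hypothetical helper; not taken from the HeroIndex source.)
fn expand_tilde(path: &str, home: &str) -> PathBuf {
    if path == "~" {
        PathBuf::from(home)
    } else if let Some(rest) = path.strip_prefix("~/") {
        PathBuf::from(home).join(rest)
    } else {
        PathBuf::from(path)
    }
}

fn main() {
    // Fall back to "/" if HOME is unset (an arbitrary choice for this sketch).
    let home = env::var("HOME").unwrap_or_else(|_| String::from("/"));
    for default in ["~/hero/var/index/defaulttest", "~/hero/var/socket_heroindex"] {
        println!("{} -> {}", default, expand_tilde(default, &home).display());
    }
}
```

Passing the home directory as a parameter keeps the function easy to test without touching the process environment.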

Server Interfaces

The server exposes three interfaces simultaneously (MCP shares the HTTP port):

| Interface | Address | Protocol | Purpose |
|---|---|---|---|
| HTTP | http://127.0.0.1:9753 | HTTP + JSON-RPC 2.0 | Web UI, REST API, HTTP RPC |
| MCP | http://127.0.0.1:9753/mcp | MCP over HTTP | AI assistant tool integration |
| Unix Socket | ~/hero/var/socket_heroindex | JSON-RPC 2.0 | Programmatic client access |

Web Admin UI

Open http://127.0.0.1:9753 in your browser to access:

  • Overview - Database info, schema viewer, document counts
  • Query - Execute search queries with results display
  • Documents - Add documents individually or in batches
  • API Docs - Full JSON-RPC 2.0 method reference with examples
  • Logs - Real-time operation log viewer
  • Perf Test - Load 100k documents and benchmark search performance

HTTP JSON-RPC Endpoint

All 18 RPC methods are available via HTTP POST:

# Health check
curl -X POST http://127.0.0.1:9753/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"server.ping","params":[],"id":1}'

# List databases
curl -X POST http://127.0.0.1:9753/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"db.list","params":[],"id":1}'

# Download OpenRPC spec
curl http://127.0.0.1:9753/openrpc.json
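
For callers that cannot shell out to curl, the same request can be assembled by hand. The sketch below builds the JSON-RPC 2.0 body and a minimal HTTP/1.1 POST with the standard library only. The helper names (`rpc_body`, `http_post`, `send`) are illustrative, and the method name is assumed to contain no characters that need JSON escaping.

```rust
/// Build a JSON-RPC 2.0 request body with empty positional params,
/// matching the shape of the curl examples above.
/// Assumes `method` contains no quotes or backslashes.
fn rpc_body(method: &str, id: u64) -> String {
    format!(r#"{{"jsonrpc":"2.0","method":"{}","params":[],"id":{}}}"#, method, id)
}

/// Wrap a JSON body in a minimal HTTP/1.1 POST request.
fn http_post(host: &str, port: u16, path: &str, body: &str) -> String {
    format!(
        "POST {} HTTP/1.1\r\nHost: {}:{}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        path, host, port, body.len(), body
    )
}

/// Send the request and return the raw HTTP response.
/// Requires a running server, so `main` below does not call it.
#[allow(dead_code)]
fn send(host: &str, port: u16, request: &str) -> std::io::Result<String> {
    use std::io::{Read, Write};
    let mut stream = std::net::TcpStream::connect((host, port))?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}

fn main() {
    let body = rpc_body("server.ping", 1);
    println!("{}", http_post("127.0.0.1", 9753, "/rpc", &body));
}
```

In a real client an HTTP library would normally replace the hand-written request; the point here is only the shape of the payload.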

MCP (Model Context Protocol) Endpoint

HeroIndex exposes an MCP endpoint at POST /mcp on the same HTTP port (9753), allowing AI assistants (Claude, etc.) to use it as a tool server.

Supported MCP methods: initialize, tools/list, tools/call, ping

Available tools (16):

| Tool | Description |
|---|---|
| server_ping | Health check |
| server_stats | Server uptime, database count, total docs |
| db_list | List all databases |
| db_create | Create database with schema |
| db_delete | Delete a database |
| db_close | Close database (free memory) |
| db_select | Select database for operations |
| db_info | Info about selected database |
| schema_get | Get schema of selected database |
| doc_add | Add a single document |
| doc_add_batch | Add documents in batch |
| doc_delete | Delete documents by field/value |
| index_commit | Commit pending changes |
| index_reload | Reload index reader |
| search_query | Execute a search query |
| search_count | Count matching documents |

Example MCP usage:

# Initialize MCP session
curl -X POST http://127.0.0.1:9753/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"initialize","params":{},"id":1}'

# List available tools
curl -X POST http://127.0.0.1:9753/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":2}'

# Call a tool (list databases)
curl -X POST http://127.0.0.1:9753/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"db_list","arguments":{}},"id":3}'

# Search via MCP
curl -X POST http://127.0.0.1:9753/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"search_query","arguments":{"query":{"type":"match","field":"body","value":"search"},"limit":5}},"id":4}'
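
The `tools/call` requests above nest the tool name and its arguments inside `params`. As a rough sketch, the envelope can be generated like this; `tools_call` is a hypothetical helper, and the arguments are passed as a pre-serialized JSON string to keep the example free of external crates.

```rust
/// Build an MCP "tools/call" request in the shape used by the curl examples.
/// `arguments_json` must already be valid JSON (for example "{}").
/// Assumes `tool` contains no characters that need JSON escaping.
fn tools_call(tool: &str, arguments_json: &str, id: u64) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","method":"tools/call","params":{{"name":"{}","arguments":{}}},"id":{}}}"#,
        tool, arguments_json, id
    )
}

fn main() {
    // Same payload as the "list databases" curl call above.
    println!("{}", tools_call("db_list", "{}", 3));

    // Same payload as the "search via MCP" curl call above.
    let args = r#"{"query":{"type":"match","field":"body","value":"search"},"limit":5}"#;
    println!("{}", tools_call("search_query", args, 4));
}
```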

MCP client configuration (e.g. for Claude Desktop claude_desktop_config.json):

{
  "mcpServers": {
    "heroindex": {
      "url": "http://127.0.0.1:9753/mcp"
    }
  }
}

Custom Arguments

Override defaults if needed:

heroindex --dir /custom/data --socket /tmp/search.sock --http-port 8080 --http-host 0.0.0.0
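
To make the override behaviour concrete, here is a small sketch of "defaults first, flags on top" argument handling. It is an illustration only: the `Config` struct and `parse_args` function are hypothetical, and the real server may well use an argument-parsing crate instead.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Config {
    dir: String,
    socket: String,
    http_port: u16,
    http_host: String,
}

impl Default for Config {
    /// Defaults copied from the "Default Configuration" table.
    fn default() -> Self {
        Config {
            dir: "~/hero/var/index/defaulttest".to_string(),
            socket: "~/hero/var/socket_heroindex".to_string(),
            http_port: 9753,
            http_host: "127.0.0.1".to_string(),
        }
    }
}

/// Apply `--flag value` pairs on top of the defaults.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Config, String> {
    let mut cfg = Config::default();
    let mut iter = args.into_iter();
    while let Some(flag) = iter.next() {
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?;
        match flag.as_str() {
            "--dir" => cfg.dir = value,
            "--socket" => cfg.socket = value,
            "--http-port" => {
                cfg.http_port = value
                    .parse::<u16>()
                    .map_err(|e| format!("bad port: {}", e))?;
            }
            "--http-host" => cfg.http_host = value,
            other => return Err(format!("unknown argument: {}", other)),
        }
    }
    Ok(cfg)
}

fn main() {
    // Skip argv[0] (the program name).
    match parse_args(std::env::args().skip(1)) {
        Ok(cfg) => println!("{:?}", cfg),
        Err(e) => eprintln!("error: {}", e),
    }
}
```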

Using the Client Library

Option 1: Use the Installed Binary

After make install, the heroindex_test utility is available:

heroindex_test

Option 2: As a Rust Library

Add to your Cargo.toml:

[dependencies]
heroindex_client = { git = "https://forge.ourworld.tf/lhumina_research/hero_index_server.git" }
serde_json = "1.0"
tokio = { version = "1.0", features = ["full"] }

Then update dependencies:

cargo update

Or use a specific version from crates.io:

[dependencies]
heroindex_client = "0.1"
serde_json = "1.0"
tokio = { version = "1.0", features = ["full"] }

Quick Start

use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Connect to the server (uses default socket path)
    let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;

    // 2. Create a database with schema
    client.db_create("articles", json!({
        "fields": [
            {"name": "title", "type": "text", "stored": true, "indexed": true},
            {"name": "body", "type": "text", "stored": true, "indexed": true}
        ]
    })).await?;

    // 3. Select the database
    client.db_select("articles").await?;

    // 4. Add a document
    client.doc_add(json!({
        "title": "Hello World",
        "body": "Rust is awesome"
    })).await?;

    // 5. Commit and search
    client.commit().await?;
    client.reload().await?;

    let results = client.search(
        json!({"type": "match", "field": "body", "value": "rust"}),
        10, 0
    ).await?;

    println!("Found {} results", results.total_hits);
    Ok(())
}

Example 1: Basic Search

use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;

    client.db_create("docs", json!({
        "fields": [{"name": "content", "type": "text", "stored": true, "indexed": true}]
    })).await?;

    client.db_select("docs").await?;

    // Add documents
    client.doc_add(json!({"content": "Rust programming language"})).await?;
    client.doc_add(json!({"content": "Python for data science"})).await?;

    client.commit().await?;
    client.reload().await?;

    // Search
    let results = client.search(
        json!({"type": "match", "field": "content", "value": "rust"}),
        10, 0
    ).await?;

    for hit in results.hits {
        println!("{:?}", hit.doc);
    }

    Ok(())
}
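
The two trailing numbers passed to `search` in the examples above (`10, 0`) look like a result limit and a starting offset. That reading is an assumption, since the signature is not documented in this README. Under that assumption, paging through results is simple arithmetic, sketched here with two hypothetical helpers:

```rust
/// Compute (limit, offset) for a zero-based page number.
fn page_window(page: u64, page_size: u64) -> (u64, u64) {
    (page_size, page * page_size)
}

/// Number of pages needed to show `total_hits` results (rounds up).
fn page_count(total_hits: u64, page_size: u64) -> u64 {
    if page_size == 0 {
        0
    } else {
        (total_hits + page_size - 1) / page_size
    }
}

fn main() {
    let total_hits = 25; // e.g. taken from `results.total_hits`
    for page in 0..page_count(total_hits, 10) {
        let (limit, offset) = page_window(page, 10);
        // These two values would be passed as the last arguments to `search`.
        println!("page {}: limit={}, offset={}", page, limit, offset);
    }
}
```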

Example 2: Batch Insert & Range Query

use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;

    client.db_create("products", json!({
        "fields": [
            {"name": "name", "type": "text", "stored": true, "indexed": true},
            {"name": "price", "type": "u64", "stored": true, "indexed": true, "fast": true}
        ]
    })).await?;

    client.db_select("products").await?;

    // Batch insert 1000 products
    let docs: Vec<_> = (1..=1000)
        .map(|i| json!({"name": format!("Product {}", i), "price": 10 + i as u64}))
        .collect();

    client.doc_add_batch(docs).await?;
    client.commit().await?;
    client.reload().await?;

    // Find products priced from $50 (inclusive) to $100 (exclusive)
    let results = client.search(
        json!({"type": "range", "field": "price", "gte": 50, "lt": 100}),
        100, 0
    ).await?;

    println!("Found {} products in price range", results.total_hits);
    Ok(())
}

Example 3: Fuzzy Search (Typo Tolerance)

use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;

    client.db_create("words", json!({
        "fields": [{"name": "word", "type": "text", "stored": true, "indexed": true}]
    })).await?;

    client.db_select("words").await?;

    client.doc_add(json!({"word": "programming"})).await?;
    client.doc_add(json!({"word": "elephant"})).await?;

    client.commit().await?;
    client.reload().await?;

    // Find "programing" (typo) - will match "programming"
    let results = client.search(
        json!({"type": "fuzzy", "field": "word", "value": "programing", "distance": 1}),
        10, 0
    ).await?;

    println!("Found matches for typo: {:?}", results.hits);
    Ok(())
}
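
The `distance` parameter is best understood as an edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one word into the other. The sketch below computes the classic Levenshtein distance to show why "programing" matches "programming" at distance 1. It illustrates the concept only and is not how the server evaluates fuzzy queries internally.

```rust
/// Classic dynamic-programming Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the previous prefix of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // curr[0] = i: deleting all i chars of the prefix of `a`.
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        prev = curr;
    }
    prev[b.len()]
}

fn main() {
    for (typo, word) in [("programing", "programming"), ("serch", "search")] {
        println!("{} -> {}: distance {}", typo, word, levenshtein(typo, word));
    }
}
```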

Example 4: Boolean Queries (AND, OR, NOT)

use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;

    client.db_create("news", json!({
        "fields": [
            {"name": "title", "type": "text", "stored": true, "indexed": true},
            {"name": "category", "type": "str", "stored": true, "indexed": true}
        ]
    })).await?;

    client.db_select("news").await?;

    client.doc_add(json!({"title": "Rust wins award", "category": "tech"})).await?;
    client.doc_add(json!({"title": "Python ecosystem grows", "category": "tech"})).await?;
    client.doc_add(json!({"title": "Rust racing circuit", "category": "sports"})).await?;

    client.commit().await?;
    client.reload().await?;

    // Find: must have "Rust" AND must NOT be "sports"
    let results = client.search(
        json!({
            "type": "boolean",
            "must": [{"type": "match", "field": "title", "value": "rust"}],
            "must_not": [{"type": "term", "field": "category", "value": "sports"}]
        }),
        10, 0
    ).await?;

    println!("Found {} relevant articles", results.total_hits);
    Ok(())
}
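
To see which of the three sample documents the boolean query should return, the `must` / `must_not` logic can be replayed in memory. This is a simplified illustration (whitespace tokenization, ASCII case folding), not the server's actual query evaluation; all names in it are made up for the example.

```rust
struct Doc {
    title: &'static str,
    category: &'static str,
}

/// True if `text` contains `word` as a whitespace-separated token, ignoring ASCII case.
fn has_token(text: &str, word: &str) -> bool {
    text.split_whitespace().any(|t| t.eq_ignore_ascii_case(word))
}

/// must: title contains "rust"; must_not: category is exactly "sports".
fn matches(doc: &Doc) -> bool {
    let must = has_token(doc.title, "rust");
    let must_not = doc.category == "sports";
    must && !must_not
}

/// The same three documents as in Example 4.
fn sample_docs() -> Vec<Doc> {
    vec![
        Doc { title: "Rust wins award", category: "tech" },
        Doc { title: "Python ecosystem grows", category: "tech" },
        Doc { title: "Rust racing circuit", category: "sports" },
    ]
}

fn main() {
    for doc in sample_docs().iter().filter(|d| matches(d)) {
        println!("{} ({})", doc.title, doc.category); // prints "Rust wins award (tech)"
    }
}
```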

Example 5: Multiple Databases

use heroindex_client::HeroIndexClient;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;

    // Create separate databases for different content types
    let schema = json!({"fields": [
        {"name": "content", "type": "text", "stored": true, "indexed": true}
    ]});

    client.db_create("blog_posts", schema.clone()).await?;
    client.db_create("documentation", schema.clone()).await?;
    client.db_create("comments", schema).await?;

    // Work with blog posts
    client.db_select("blog_posts").await?;
    client.doc_add(json!({"content": "My first blog post"})).await?;
    // Commit before switching, assuming commit applies only to the selected database
    client.commit().await?;

    // Switch to documentation
    client.db_select("documentation").await?;
    client.doc_add(json!({"content": "API Reference"})).await?;

    client.commit().await?;

    // List all databases
    let dbs = client.db_list().await?;
    println!("Databases: {}", dbs.databases.len());

    Ok(())
}

Performance

In the results below, every search operation completes in under 2 ms. Run the performance benchmark to measure HeroIndex on your own hardware:

make perftest

Benchmark Results

| Operation | Time | Throughput |
|---|---|---|
| Single document insert | 0.2ms | |
| Batch insert (1000 docs) | 7ms | 137,000 docs/sec |
| Simple search | 0.5ms | |
| Paginated search (100 results) | 1.5ms | |
| Range query | 1.9ms | |
| Fuzzy search | 0.5ms | |
| Count query | 0.04ms | |
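
The time and throughput columns are related by simple division. As a worked check, exactly 7 ms per 1000 documents would give roughly 142,857 docs/sec, while the reported 137,000 docs/sec corresponds to about 7.3 ms per batch, so the time column is presumably rounded. The helper names below are illustrative.

```rust
/// Documents per second for a batch of `docs` that took `millis` milliseconds.
fn docs_per_sec(docs: u64, millis: f64) -> f64 {
    docs as f64 / (millis / 1000.0)
}

/// Milliseconds per batch of `docs` implied by a throughput `rate` in docs/sec.
fn batch_millis(docs: u64, rate: f64) -> f64 {
    docs as f64 / rate * 1000.0
}

fn main() {
    println!("{:.0} docs/sec at 7 ms per 1000 docs", docs_per_sec(1000, 7.0));
    println!(
        "{:.2} ms per 1000 docs at 137,000 docs/sec",
        batch_millis(1000, 137_000.0)
    );
}
```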

API Reference

See docs/specs.md for the complete OpenRPC interface specification.

Query Types

| Type | Description | Example |
|---|---|---|
| all | Match all documents | {"type": "all"} |
| match | Full-text match | {"type": "match", "field": "body", "value": "search terms"} |
| term | Exact term match | {"type": "term", "field": "id", "value": "abc123"} |
| fuzzy | Fuzzy matching | {"type": "fuzzy", "field": "title", "value": "serch", "distance": 1} |
| phrase | Exact phrase | {"type": "phrase", "field": "body", "value": "exact phrase"} |
| prefix | Prefix matching | {"type": "prefix", "field": "title", "value": "hel"} |
| range | Numeric/date range | {"type": "range", "field": "price", "gte": 10, "lt": 100} |
| regex | Regex pattern | {"type": "regex", "field": "title", "value": "test.*"} |
| boolean | Combine queries | {"type": "boolean", "must": [...], "should": [...], "must_not": [...]} |
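
Because boolean queries nest other queries, it can help to model them as a recursive type. The sketch below covers a subset of the table (all, match, term, range, boolean) and serializes to the JSON shapes shown, without external crates. The `Query` enum is hypothetical and not part of heroindex_client. Whether the server accepts an empty `should` array is also an assumption.

```rust
/// A subset of the query types from the table above.
enum Query {
    All,
    Match { field: String, value: String },
    Term { field: String, value: String },
    Range { field: String, gte: u64, lt: u64 },
    Boolean { must: Vec<Query>, should: Vec<Query>, must_not: Vec<Query> },
}

/// Escape backslashes and double quotes for embedding in a JSON string.
/// (Control characters are not handled in this sketch.)
fn esc(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Serialize a list of queries as a JSON array.
fn list(queries: &[Query]) -> String {
    let parts: Vec<String> = queries.iter().map(|q| q.to_json()).collect();
    format!("[{}]", parts.join(","))
}

impl Query {
    fn to_json(&self) -> String {
        match self {
            Query::All => r#"{"type":"all"}"#.to_string(),
            Query::Match { field, value } => format!(
                r#"{{"type":"match","field":"{}","value":"{}"}}"#,
                esc(field), esc(value)
            ),
            Query::Term { field, value } => format!(
                r#"{{"type":"term","field":"{}","value":"{}"}}"#,
                esc(field), esc(value)
            ),
            Query::Range { field, gte, lt } => format!(
                r#"{{"type":"range","field":"{}","gte":{},"lt":{}}}"#,
                esc(field), gte, lt
            ),
            Query::Boolean { must, should, must_not } => format!(
                r#"{{"type":"boolean","must":{},"should":{},"must_not":{}}}"#,
                list(must), list(should), list(must_not)
            ),
        }
    }
}

fn main() {
    let q = Query::Boolean {
        must: vec![Query::Match { field: "title".into(), value: "rust".into() }],
        should: vec![],
        must_not: vec![Query::Term { field: "category".into(), value: "sports".into() }],
    };
    println!("{}", q.to_json());
    println!("{}", Query::Range { field: "price".into(), gte: 50, lt: 100 }.to_json());
    println!("{}", Query::All.to_json());
}
```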

Field Types

| Type | Description |
|---|---|
| text | Full-text searchable string (tokenized) |
| str | Exact string (keyword, not tokenized) |
| u64 | Unsigned 64-bit integer |
| i64 | Signed 64-bit integer |
| f64 | 64-bit floating point |
| date | DateTime (RFC 3339 format) |
| bool | Boolean |
| json | JSON object |
| bytes | Binary data |
| ip | IP address |
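
A schema is a list of field definitions, each naming one of the types above. The following sketch generates the same schema JSON shape used in the client examples (`name`, `type`, `stored`, `indexed`, optional `fast`). The `FieldType` and `Field` types are hypothetical, every field is assumed to be stored and indexed, and field names are assumed not to need JSON escaping.

```rust
/// The field types listed in the table above.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum FieldType { Text, Str, U64, I64, F64, Date, Bool, Json, Bytes, Ip }

impl FieldType {
    /// The type name as it appears in schema JSON.
    fn as_str(self) -> &'static str {
        match self {
            FieldType::Text => "text",
            FieldType::Str => "str",
            FieldType::U64 => "u64",
            FieldType::I64 => "i64",
            FieldType::F64 => "f64",
            FieldType::Date => "date",
            FieldType::Bool => "bool",
            FieldType::Json => "json",
            FieldType::Bytes => "bytes",
            FieldType::Ip => "ip",
        }
    }
}

struct Field {
    name: &'static str,
    ty: FieldType,
    fast: bool,
}

/// Serialize fields as {"fields":[...]}; every field is stored and indexed.
fn schema_json(fields: &[Field]) -> String {
    let parts: Vec<String> = fields
        .iter()
        .map(|f| {
            // Only emit "fast" when it is enabled, as in Example 2.
            let fast = if f.fast { r#","fast":true"# } else { "" };
            format!(
                r#"{{"name":"{}","type":"{}","stored":true,"indexed":true{}}}"#,
                f.name, f.ty.as_str(), fast
            )
        })
        .collect();
    format!(r#"{{"fields":[{}]}}"#, parts.join(","))
}

fn main() {
    let schema = schema_json(&[
        Field { name: "name", ty: FieldType::Text, fast: false },
        Field { name: "price", ty: FieldType::U64, fast: true },
    ]);
    println!("{}", schema);
}
```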

Project Structure

hero_index_server/
├── Cargo.toml              # Workspace configuration
├── Makefile                # Build automation
├── buildenv.sh             # Build environment variables
├── scripts/
│   └── build_lib.sh        # Build library & utilities
├── README.md
├── VERSION                 # Version file
├── docs/
│   ├── specs.md            # OpenRPC interface specification
│   └── OPENRPC.md          # Full API method documentation
├── .forgejo/workflows/     # CI/CD pipelines (Linux & macOS)
├── heroindex/              # Server package
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs         # Entry point, HTTP + socket servers, demo DB
│   │   ├── error.rs        # Error types
│   │   ├── logging.rs      # In-memory operation log store
│   │   ├── mcp.rs          # MCP (Model Context Protocol) server
│   │   ├── web/            # HTTP server (Axum)
│   │   │   ├── mod.rs
│   │   │   ├── handlers.rs # HTTP routes, RPC endpoint, OpenRPC spec
│   │   │   └── state.rs    # Shared AppState
│   │   └── modules/
│   │       ├── mod.rs
│   │       ├── index_manager.rs
│   │       ├── schema.rs
│   │       ├── query.rs
│   │       ├── rpc.rs
│   │       └── handlers.rs # RPC method handlers (18 methods)
│   ├── templates/          # Askama HTML templates (Web UI)
│   │   ├── base.html       # Layout: Bootstrap 5.3 dark theme, navbar
│   │   └── index.html      # Dashboard: tabs, forms, perf test
│   └── tests/
│       └── integration.rs  # Integration tests + performance benchmark
└── heroindex_client/       # Client library package
    ├── Cargo.toml
    └── src/
        ├── lib.rs
        ├── client.rs
        ├── error.rs
        └── types.rs

Using with Claude via MCP (Local Machine Integration)

HeroIndex can be integrated with Claude via the Model Context Protocol (MCP).

Important: MCP servers are configured locally on your machine in ~/.claude/mcp.json. They:

  • Work everywhere on this machine (web UI, Claude Code, API, etc.)
  • Don't sync to other machines or to the cloud
  • Persist across all sessions on this machine
  • Are available to all Claude clients on this machine

Quick Start: Add MCP Servers to Claude

Add HeroIndex to Claude (Global)

claude mcp add --transport http heroindex http://localhost:9753/mcp

Add Other MCP Servers (Global Examples)

# Sentry - Error tracking and monitoring
claude mcp add --transport http sentry https://mcp.sentry.dev/mcp

# Filesystem - Access local files (stdio servers are launched from a command given after "--")
claude mcp add --transport stdio filesystem -- npx -y @modelcontextprotocol/server-filesystem /path/to/directory

# GitHub - Repository management
claude mcp add --transport stdio github -- /path/to/github/tool

What This Does (Local to Your Machine)

When you run claude mcp add, Claude:

  1. Registers the server locally in ~/.claude/mcp.json (on this machine only)
  2. Makes it available everywhere on this machine - web UI, Claude Code, API
  3. Persists it across sessions, so there is no need to reconfigure
  4. Works with all Claude clients on this machine
  5. Does not sync the configuration to other computers or to the cloud

HeroIndex MCP Capabilities

With HeroIndex as an MCP server, Claude can:

  • Create databases - Define schemas with multiple field types
  • Search documents - Full-text, fuzzy, boolean, range queries
  • Manage indexes - List, select, delete databases
  • Monitor performance - View statistics and benchmark results
  • Batch operations - Insert multiple documents at once
  • Analyze data - Process search results with AI

MCP Server Configuration Files (Local to Your Machine)

MCP servers are stored in your local Claude configuration directory (not synced to cloud):

| Platform | Location | Scope |
|---|---|---|
| macOS/Linux | ~/.claude/mcp.json | This machine only |
| Windows | %APPDATA%\Claude\mcp.json | This machine only |

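Note: Configuration is local to this machine. Each machine where you use Claude needs its own MCP server setup. The exact file can differ by client: Claude Desktop, for example, reads claude_desktop_config.json, as shown in the MCP client configuration earlier in this README.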

Example: Full MCP Configuration

{
  "mcpServers": {
    "heroindex": {
      "transport": "http",
      "url": "http://localhost:9753/mcp"
    },
    "sentry": {
      "transport": "http",
      "url": "https://mcp.sentry.dev/mcp"
    },
    "filesystem": {
      "transport": "stdio",
      "command": "filesystem",
      "args": ["/path/to/directory"]
    }
  }
}

Getting Started with HeroIndex MCP (On This Machine)

  1. Start HeroIndex server (on this machine):

    make run
    # Listens on http://localhost:9753 (local machine only)
    
  2. Add to Claude (local to this machine):

    claude mcp add --transport http heroindex http://localhost:9753/mcp
    # Stored in ~/.claude/mcp.json on this machine
    
  3. Use in Claude (on this machine):

    • Web UI: Start a new conversation, HeroIndex is available
    • Claude Code: Use in your terminal with /mcp commands
    • API: Access via Claude API with MCP context
  4. Manage servers (local to this machine):

    # List all configured MCP servers
    claude mcp list
    
    # Remove a server
    claude mcp remove heroindex
    
    # Update server configuration
    claude mcp add --transport http heroindex http://localhost:9754/mcp
    
  5. On other machines:

    • Repeat steps 1-2 on each machine where you want to use HeroIndex
    • Configuration doesn't sync automatically

Natural Language Examples

Once configured, you can ask Claude:

  • "Create a search index for documents with title, body, and date fields"
  • "Search for articles about machine learning from the last month"
  • "Show me the statistics for all databases"
  • "Find fuzzy matches for 'algoritm' in the documents"
  • "Run a performance benchmark and analyze the results"

License

MIT