HeroIndex
A Tantivy-based full-text search server with OpenRPC socket interface.
Repository: https://forge.ourworld.tf/lhumina_research/hero_index_server
Packages
This workspace contains two packages:
- heroindex - The search server binary
- heroindex_client - Client library for connecting to the server
Features
- Multiple Index Management - Create, delete, and manage multiple Tantivy indexes
- Dynamic Schemas - Define custom schemas with various field types
- Full-Text Search - Match queries, phrase queries, fuzzy search
- Exact Queries - Term queries, range queries, regex, prefix matching
- Boolean Queries - Combine queries with must/should/must_not clauses
- Fast Fields - Columnar storage for sorting and aggregations
- Web Admin UI - Browser-based dashboard for managing databases, queries, and monitoring
- HTTP JSON-RPC Endpoint - POST /rpc for HTTP-based JSON-RPC 2.0 clients
- MCP Server - Model Context Protocol endpoint at POST /mcp for AI assistant integration
- OpenRPC Interface - Unix socket + HTTP JSON-RPC interface with discovery
- Performance Test Tab - Built-in benchmark tool for load testing and search benchmarks
- Demo Database - Auto-created on first startup with sample documents
- Concurrent Connections - Multiple clients can connect simultaneously
Binaries
This project builds two binaries:
| Binary | Description |
|---|---|
| `heroindex` | Full-text search server (main binary) |
| `heroindex_test` | Test utility for the client library |
Installation
From Source (Recommended)
Requirements:
- Rust 1.92.0+
- Unix-like operating system (Linux, macOS)
git clone https://forge.ourworld.tf/lhumina_research/hero_index_server.git
cd hero_index_server
# Build and install to ~/hero/bin/
make install
Both binaries are now in ~/hero/bin/ and ready to use!
Add to PATH
export PATH="$HOME/hero/bin:$PATH"
Then run anywhere:
heroindex # Start the server
From crates.io
cargo install heroindex
(Note: For full project with both binaries, build from source)
Development & Usage
Quick Start
# Build and install
make install
# Run the server (uses defaults - no args needed!)
make run
# In another terminal, use the client
heroindex_test # or use the client library in your code
Make Commands
# Installation & Running
make build # Build release binaries
make install # Build and install to ~/hero/bin/
make installdev # Install debug build (faster compile)
make run # Run server with defaults
make rundev # Run with debug logging
# Testing & Quality
make check # Fast code check
make test # Run all tests
make test-all # Run all tests including integration
make perftest # Run performance benchmark
# Development
make fmt # Format code
make fmt-check # Check code formatting
make lint # Run clippy linter
# Maintenance
make clean # Remove build artifacts
make all # Full cycle: clean → check → test → build
make help # Show all commands
Server Arguments
The server uses sensible defaults - no arguments required!
# Just run it with defaults
make run
# Or after install, run directly
heroindex
Default Configuration
| Argument | Default | Description |
|---|---|---|
| `--dir` | `~/hero/var/index/defaulttest` | Base directory for all indexes |
| `--socket` | `~/hero/var/socket_heroindex` | Unix socket for RPC interface |
| `--http-port` | `9753` | HTTP server port (Web UI + API) |
| `--http-host` | `127.0.0.1` | HTTP server bind address |
Server Interfaces
The server exposes three interfaces simultaneously (the MCP endpoint shares the HTTP port):
| Interface | Address | Protocol | Purpose |
|---|---|---|---|
| HTTP | `http://127.0.0.1:9753` | HTTP + JSON-RPC 2.0 | Web UI, REST API, HTTP RPC |
| MCP | `http://127.0.0.1:9753/mcp` | MCP over HTTP | AI assistant tool integration |
| Unix Socket | `~/hero/var/socket_heroindex` | JSON-RPC 2.0 | Programmatic client access |
Web Admin UI
Open http://127.0.0.1:9753 in your browser to access:
- Overview - Database info, schema viewer, document counts
- Query - Execute search queries with results display
- Documents - Add documents individually or in batches
- API Docs - Full JSON-RPC 2.0 method reference with examples
- Logs - Real-time operation log viewer
- Perf Test - Load 100k documents and benchmark search performance
HTTP JSON-RPC Endpoint
All 18 RPC methods are available via HTTP POST:
# Health check
curl -X POST http://127.0.0.1:9753/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"server.ping","params":[],"id":1}'
# List databases
curl -X POST http://127.0.0.1:9753/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"db.list","params":[],"id":1}'
# Download OpenRPC spec
curl http://127.0.0.1:9753/openrpc.json
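The curl calls above can also be issued from Rust without extra crates. The following is a minimal sketch, not the project's client: `jsonrpc_body` and `http_post` are hypothetical helper names, and the code assumes the server answers a plain HTTP/1.1 POST on the default address and closes the connection afterwards.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Build a JSON-RPC 2.0 request body. `params_json` must already be valid
/// JSON (for example "[]"); `method` is inserted without escaping.
fn jsonrpc_body(method: &str, params_json: &str, id: u64) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","method":"{}","params":{},"id":{}}}"#,
        method, params_json, id
    )
}

/// Frame a JSON body as a minimal HTTP/1.1 POST request.
fn http_post(host: &str, port: u16, path: &str, body: &str) -> String {
    format!(
        "POST {} HTTP/1.1\r\nHost: {}:{}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        path, host, port, body.len(), body
    )
}

fn main() {
    // Same payload as the "Health check" curl call.
    let body = jsonrpc_body("server.ping", "[]", 1);
    let request = http_post("127.0.0.1", 9753, "/rpc", &body);

    // Assumption: a HeroIndex server is listening on the default port.
    match TcpStream::connect(("127.0.0.1", 9753)) {
        Ok(mut stream) => {
            let _ = stream.set_read_timeout(Some(Duration::from_secs(2)));
            if stream.write_all(request.as_bytes()).is_ok() {
                let mut response = String::new();
                let _ = stream.read_to_string(&mut response);
                println!("{response}");
            }
        }
        Err(e) => eprintln!("could not connect to 127.0.0.1:9753: {e}"),
    }
}
```

For anything beyond a quick check, an HTTP client crate is the more robust choice; the manual framing here only shows what travels over the wire.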
MCP (Model Context Protocol) Endpoint
HeroIndex exposes an MCP endpoint at POST /mcp on the same HTTP port (9753), allowing AI assistants (Claude, etc.) to use it as a tool server.
Supported MCP methods: initialize, tools/list, tools/call, ping
Available tools (16):
| Tool | Description |
|---|---|
| `server_ping` | Health check |
| `server_stats` | Server uptime, database count, total docs |
| `db_list` | List all databases |
| `db_create` | Create database with schema |
| `db_delete` | Delete a database |
| `db_close` | Close database (free memory) |
| `db_select` | Select database for operations |
| `db_info` | Info about selected database |
| `schema_get` | Get schema of selected database |
| `doc_add` | Add a single document |
| `doc_add_batch` | Add documents in batch |
| `doc_delete` | Delete documents by field/value |
| `index_commit` | Commit pending changes |
| `index_reload` | Reload index reader |
| `search_query` | Execute a search query |
| `search_count` | Count matching documents |
Example MCP usage:
# Initialize MCP session
curl -X POST http://127.0.0.1:9753/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"initialize","params":{},"id":1}'
# List available tools
curl -X POST http://127.0.0.1:9753/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":2}'
# Call a tool (list databases)
curl -X POST http://127.0.0.1:9753/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"db_list","arguments":{}},"id":3}'
# Search via MCP
curl -X POST http://127.0.0.1:9753/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"search_query","arguments":{"query":{"type":"match","field":"body","value":"search"},"limit":5}},"id":4}'
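Each curl call above carries its own JSON-RPC `id`. As a sketch of how a client might generate the same request bodies in order, the hypothetical `McpRequests` type below keeps a counter and wraps tool calls in the `tools/call` envelope. It only builds strings; sending them and handling any session state are left out, and tool names are inserted without escaping.

```rust
/// Generates MCP request bodies with increasing JSON-RPC ids.
/// (Hypothetical helper for illustration; not part of heroindex_client.)
struct McpRequests {
    next_id: u64,
}

impl McpRequests {
    fn new() -> Self {
        Self { next_id: 1 }
    }

    /// Build a request for `method`; `params_json` must be valid JSON.
    fn request(&mut self, method: &str, params_json: &str) -> String {
        let id = self.next_id;
        self.next_id += 1;
        format!(
            r#"{{"jsonrpc":"2.0","method":"{}","params":{},"id":{}}}"#,
            method, params_json, id
        )
    }

    /// Wrap a tool name and its JSON arguments in a tools/call request.
    fn tool_call(&mut self, tool: &str, arguments_json: &str) -> String {
        let params = format!(r#"{{"name":"{}","arguments":{}}}"#, tool, arguments_json);
        self.request("tools/call", &params)
    }
}

fn main() {
    let mut mcp = McpRequests::new();
    // Same order as the curl examples: initialize, list tools, call a tool.
    println!("{}", mcp.request("initialize", "{}"));
    println!("{}", mcp.request("tools/list", "{}"));
    println!("{}", mcp.tool_call("db_list", "{}"));
}
```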
MCP client configuration (e.g. for Claude Desktop claude_desktop_config.json):
{
"mcpServers": {
"heroindex": {
"url": "http://127.0.0.1:9753/mcp"
}
}
}
Custom Arguments
Override defaults if needed:
heroindex --dir /custom/data --socket /tmp/search.sock --http-port 8080 --http-host 0.0.0.0
Using the Client Library
Option 1: Use the Installed Binary
After make install, the heroindex_test utility is available:
heroindex_test
Option 2: As a Rust Library
Add to your Cargo.toml:
[dependencies]
heroindex_client = { git = "https://forge.ourworld.tf/lhumina_research/hero_index_server.git" }
serde_json = "1.0"
tokio = { version = "1.0", features = ["full"] }
Then update dependencies:
cargo update
Or use a specific version from crates.io:
[dependencies]
heroindex_client = "0.1"
serde_json = "1.0"
tokio = { version = "1.0", features = ["full"] }
Quick Start
use heroindex_client::HeroIndexClient;
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Connect to the server (uses default socket path)
let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;
// 2. Create a database with schema
client.db_create("articles", json!({
"fields": [
{"name": "title", "type": "text", "stored": true, "indexed": true},
{"name": "body", "type": "text", "stored": true, "indexed": true}
]
})).await?;
// 3. Select the database
client.db_select("articles").await?;
// 4. Add a document
client.doc_add(json!({
"title": "Hello World",
"body": "Rust is awesome"
})).await?;
// 5. Commit and search
client.commit().await?;
client.reload().await?;
let results = client.search(
json!({"type": "match", "field": "body", "value": "rust"}),
10, 0
).await?;
println!("Found {} results", results.total_hits);
Ok(())
}
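A note on the socket path: the Rust standard library does not expand a leading `~`, and this README does not say whether `HeroIndexClient::connect` does. If it turns out not to, a small helper such as the hypothetical `expand_tilde` below can resolve the path before connecting. This is a sketch that takes the home directory as a parameter so it stays easy to test.

```rust
use std::path::PathBuf;

/// Replace a leading "~" or "~/" with `home`; other paths are returned unchanged.
/// (Hypothetical helper; not part of heroindex_client.)
fn expand_tilde(path: &str, home: &str) -> PathBuf {
    if path == "~" {
        PathBuf::from(home)
    } else if let Some(rest) = path.strip_prefix("~/") {
        PathBuf::from(home).join(rest)
    } else {
        PathBuf::from(path)
    }
}

fn main() {
    // Assumption: HOME is set, as on typical Linux and macOS systems.
    let home = std::env::var("HOME").unwrap_or_else(|_| String::from("/"));
    let socket = expand_tilde("~/hero/var/socket_heroindex", &home);
    println!("{}", socket.display());
}
```

The resulting path can then be passed to the client in place of the literal `~/...` string.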
Example 1: Simple Full-Text Search
use heroindex_client::HeroIndexClient;
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;
client.db_create("docs", json!({
"fields": [{"name": "content", "type": "text", "stored": true, "indexed": true}]
})).await?;
client.db_select("docs").await?;
// Add documents
client.doc_add(json!({"content": "Rust programming language"})).await?;
client.doc_add(json!({"content": "Python for data science"})).await?;
client.commit().await?;
client.reload().await?;
// Search
let results = client.search(
json!({"type": "match", "field": "content", "value": "rust"}),
10, 0
).await?;
for hit in results.hits {
println!("{:?}", hit.doc);
}
Ok(())
}
Example 2: Batch Insert & Range Query
use heroindex_client::HeroIndexClient;
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;
client.db_create("products", json!({
"fields": [
{"name": "name", "type": "text", "stored": true, "indexed": true},
{"name": "price", "type": "u64", "stored": true, "indexed": true, "fast": true}
]
})).await?;
client.db_select("products").await?;
// Batch insert 1000 products
let docs: Vec<_> = (1..=1000)
.map(|i| json!({"name": format!("Product {}", i), "price": 10 + i as u64}))
.collect();
client.doc_add_batch(docs).await?;
client.commit().await?;
client.reload().await?;
// Find products priced from $50 (inclusive) to $100 (exclusive)
let results = client.search(
json!({"type": "range", "field": "price", "gte": 50, "lt": 100}),
100, 0
).await?;
println!("Found {} products in price range", results.total_hits);
Ok(())
}
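To sanity-check the result of Example 2, the same range can be evaluated in plain Rust. This sketch assumes `gte` is an inclusive lower bound and `lt` an exclusive upper bound, as the key names suggest; `count_in_range` is a hypothetical helper, not a server call.

```rust
/// Count values v with gte <= v < lt.
fn count_in_range(prices: &[u64], gte: u64, lt: u64) -> usize {
    prices.iter().filter(|&&p| p >= gte && p < lt).count()
}

fn main() {
    // Same prices as the batch insert above: 10 + i for i in 1..=1000.
    let prices: Vec<u64> = (1..=1000u64).map(|i| 10 + i).collect();
    let matches = count_in_range(&prices, 50, 100);
    println!("{matches} prices fall in [50, 100)");
}
```

Under that assumption the search in Example 2 should report 50 hits on a fresh `products` database (prices 50 through 99).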
Example 3: Fuzzy Search (Typo Tolerance)
use heroindex_client::HeroIndexClient;
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;
client.db_create("words", json!({
"fields": [{"name": "word", "type": "text", "stored": true, "indexed": true}]
})).await?;
client.db_select("words").await?;
client.doc_add(json!({"word": "programming"})).await?;
client.doc_add(json!({"word": "elephant"})).await?;
client.commit().await?;
client.reload().await?;
// Find "programing" (typo) - will match "programming"
let results = client.search(
json!({"type": "fuzzy", "field": "word", "value": "programing", "distance": 1}),
10, 0
).await?;
println!("Found matches for typo: {:?}", results.hits);
Ok(())
}
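The `distance` parameter is easiest to understand as an edit distance. The sketch below computes the classic Levenshtein distance (insertions, deletions, substitutions) in plain Rust; it assumes HeroIndex's fuzzy query uses this metric, which is how Tantivy's fuzzy term queries are usually described, but the server's exact behavior (for example, how transpositions are counted) is not confirmed here.

```rust
/// Levenshtein distance between two strings, computed row by row.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr.push(substitute.min(delete).min(insert));
        }
        prev = curr;
    }
    prev[b.len()]
}

fn main() {
    // One missing "m": within distance 1, so the fuzzy query can match.
    println!("{}", levenshtein("programing", "programming"));
    // Far apart: not a match at distance 1.
    println!("{}", levenshtein("programing", "elephant"));
}
```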
Example 4: Boolean Queries (AND, OR, NOT)
use heroindex_client::HeroIndexClient;
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;
client.db_create("news", json!({
"fields": [
{"name": "title", "type": "text", "stored": true, "indexed": true},
{"name": "category", "type": "str", "stored": true, "indexed": true}
]
})).await?;
client.db_select("news").await?;
client.doc_add(json!({"title": "Rust wins award", "category": "tech"})).await?;
client.doc_add(json!({"title": "Python ecosystem grows", "category": "tech"})).await?;
client.doc_add(json!({"title": "Rust racing circuit", "category": "sports"})).await?;
client.commit().await?;
client.reload().await?;
// Find: must have "Rust" AND must NOT be "sports"
let results = client.search(
json!({
"type": "boolean",
"must": [{"type": "match", "field": "title", "value": "rust"}],
"must_not": [{"type": "term", "field": "category", "value": "sports"}]
}),
10, 0
).await?;
println!("Found {} relevant articles", results.total_hits);
Ok(())
}
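The effect of `must` and `must_not` in Example 4 can be modeled on the three sample documents without a server. This is an illustrative sketch only: it assumes the `match` query compares lowercase whitespace-separated tokens and that `term` on a `str` field is an exact comparison. `Doc`, `sample_docs`, and `select` are hypothetical names.

```rust
/// A document with the two fields from the example schema.
struct Doc {
    title: &'static str,
    category: &'static str,
}

/// The three documents added in Example 4.
fn sample_docs() -> Vec<Doc> {
    vec![
        Doc { title: "Rust wins award", category: "tech" },
        Doc { title: "Python ecosystem grows", category: "tech" },
        Doc { title: "Rust racing circuit", category: "sports" },
    ]
}

/// Titles that contain `must_word` (case-insensitive token match)
/// and whose category is not exactly `excluded`.
fn select(docs: &[Doc], must_word: &str, excluded: &str) -> Vec<&'static str> {
    docs.iter()
        .filter(|d| d.title.split_whitespace().any(|t| t.eq_ignore_ascii_case(must_word)))
        .filter(|d| d.category != excluded)
        .map(|d| d.title)
        .collect()
}

fn main() {
    let hits = select(&sample_docs(), "rust", "sports");
    println!("{hits:?}"); // prints ["Rust wins award"]
}
```

Under these assumptions the boolean query in Example 4 should report one relevant article.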
Example 5: Multiple Databases
use heroindex_client::HeroIndexClient;
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut client = HeroIndexClient::connect("~/hero/var/socket_heroindex").await?;
// Create separate databases for different content types
let schema = json!({"fields": [
{"name": "content", "type": "text", "stored": true, "indexed": true}
]});
client.db_create("blog_posts", schema.clone()).await?;
client.db_create("documentation", schema.clone()).await?;
client.db_create("comments", schema).await?;
// Work with blog posts
client.db_select("blog_posts").await?;
client.doc_add(json!({"content": "My first blog post"})).await?;
// Switch to documentation
client.db_select("documentation").await?;
client.doc_add(json!({"content": "API Reference"})).await?;
client.commit().await?;
// List all databases
let dbs = client.db_list().await?;
println!("Databases: {}", dbs.databases.len());
Ok(())
}
Performance
Run the built-in performance benchmark to measure indexing and search speed on your own hardware:
make perftest
Benchmark Results
| Operation | Time | Throughput |
|---|---|---|
| Single document insert | 0.2ms | — |
| Batch insert (1000 docs) | 7ms | 137,000 docs/sec |
| Simple search | 0.5ms | — |
| Paginated search (100 results) | 1.5ms | — |
| Range query | 1.9ms | — |
| Fuzzy search | 0.5ms | — |
| Count query | 0.04ms | — |
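The time and throughput columns are related by throughput = documents / seconds. The sketch below works through the batch-insert row: a time of exactly 7 ms would correspond to roughly 143,000 docs/sec, so the 137,000 docs/sec figure suggests the measured time was closer to 7.3 ms before rounding. That reading is an inference from the two numbers, not something the benchmark output confirms.

```rust
/// Documents per second for `docs` documents processed in `millis` milliseconds.
fn docs_per_sec(docs: u64, millis: f64) -> f64 {
    docs as f64 / (millis / 1000.0)
}

/// Milliseconds needed for `docs` documents at a given throughput.
fn millis_for(docs: u64, docs_per_sec: f64) -> f64 {
    docs as f64 / docs_per_sec * 1000.0
}

fn main() {
    // 1000 docs in a rounded 7 ms.
    println!("{:.0} docs/sec", docs_per_sec(1000, 7.0));
    // Time implied by the reported 137,000 docs/sec.
    println!("{:.2} ms", millis_for(1000, 137_000.0));
}
```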
API Reference
See docs/specs.md for the complete OpenRPC interface specification.
Query Types
| Type | Description | Example |
|---|---|---|
| `all` | Match all documents | `{"type": "all"}` |
| `match` | Full-text match | `{"type": "match", "field": "body", "value": "search terms"}` |
| `term` | Exact term match | `{"type": "term", "field": "id", "value": "abc123"}` |
| `fuzzy` | Fuzzy matching | `{"type": "fuzzy", "field": "title", "value": "serch", "distance": 1}` |
| `phrase` | Exact phrase | `{"type": "phrase", "field": "body", "value": "exact phrase"}` |
| `prefix` | Prefix matching | `{"type": "prefix", "field": "title", "value": "hel"}` |
| `range` | Numeric/date range | `{"type": "range", "field": "price", "gte": 10, "lt": 100}` |
| `regex` | Regex pattern | `{"type": "regex", "field": "title", "value": "test.*"}` |
| `boolean` | Combine queries | `{"type": "boolean", "must": [...], "should": [...], "must_not": [...]}` |
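When queries are assembled in code rather than written by hand, a typed representation helps avoid malformed JSON. The sketch below models a subset of the query types from the table as a hypothetical `Query` enum with a hand-written `to_json`; it is not part of `heroindex_client`, and in a real project `serde_json::json!` (as used in the examples above) is the simpler route.

```rust
/// A subset of the query types from the table above.
enum Query {
    All,
    Match { field: String, value: String },
    Term { field: String, value: String },
    Range { field: String, gte: u64, lt: u64 },
    Boolean { must: Vec<Query>, must_not: Vec<Query> },
}

/// Escape characters that may not appear raw inside a JSON string.
fn esc(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Serialize a list of queries as comma-separated JSON objects.
fn join(queries: &[Query]) -> String {
    queries.iter().map(Query::to_json).collect::<Vec<_>>().join(",")
}

impl Query {
    /// Serialize to the JSON shape shown in the table.
    fn to_json(&self) -> String {
        match self {
            Query::All => r#"{"type":"all"}"#.to_string(),
            Query::Match { field, value } => format!(
                r#"{{"type":"match","field":"{}","value":"{}"}}"#,
                esc(field), esc(value)
            ),
            Query::Term { field, value } => format!(
                r#"{{"type":"term","field":"{}","value":"{}"}}"#,
                esc(field), esc(value)
            ),
            Query::Range { field, gte, lt } => format!(
                r#"{{"type":"range","field":"{}","gte":{},"lt":{}}}"#,
                esc(field), gte, lt
            ),
            Query::Boolean { must, must_not } => format!(
                r#"{{"type":"boolean","must":[{}],"must_not":[{}]}}"#,
                join(must), join(must_not)
            ),
        }
    }
}

fn main() {
    let q = Query::Boolean {
        must: vec![Query::Match { field: "title".into(), value: "rust".into() }],
        must_not: vec![Query::Term { field: "category".into(), value: "sports".into() }],
    };
    println!("{}", q.to_json());
}
```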
Field Types
| Type | Description |
|---|---|
| `text` | Full-text searchable string (tokenized) |
| `str` | Exact string (keyword, not tokenized) |
| `u64` | Unsigned 64-bit integer |
| `i64` | Signed 64-bit integer |
| `f64` | 64-bit floating point |
| `date` | DateTime (RFC 3339 format) |
| `bool` | Boolean |
| `json` | JSON object |
| `bytes` | Binary data |
| `ip` | IP address |
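A schema that uses a type name outside this list (for example `string` instead of `str`) is an easy mistake to make. As a sketch, a client could check field types locally before calling `db_create`; `FIELD_TYPES` and `unknown_field_types` are hypothetical names, and the server remains the authority on what it accepts.

```rust
/// Field type names from the table above.
const FIELD_TYPES: [&str; 10] = [
    "text", "str", "u64", "i64", "f64", "date", "bool", "json", "bytes", "ip",
];

/// Return the names of fields whose type is not in FIELD_TYPES.
/// Each entry is a (field name, field type) pair.
fn unknown_field_types<'a>(fields: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    fields
        .iter()
        .filter(|field| !FIELD_TYPES.iter().any(|known| *known == field.1))
        .map(|field| field.0)
        .collect()
}

fn main() {
    let fields = [("title", "text"), ("category", "string"), ("price", "u64")];
    let bad = unknown_field_types(&fields);
    println!("fields with unknown types: {bad:?}");
}
```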
Project Structure
hero_index_server/
├── Cargo.toml                      # Workspace configuration
├── Makefile                        # Build automation
├── buildenv.sh                     # Build environment variables
├── scripts/
│   └── build_lib.sh                # Build library & utilities
├── README.md
├── VERSION                         # Version file
├── docs/
│   ├── specs.md                    # OpenRPC interface specification
│   └── OPENRPC.md                  # Full API method documentation
├── .forgejo/workflows/             # CI/CD pipelines (Linux & macOS)
├── heroindex/                      # Server package
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs                 # Entry point, HTTP + socket servers, demo DB
│   │   ├── error.rs                # Error types
│   │   ├── logging.rs              # In-memory operation log store
│   │   ├── mcp.rs                  # MCP (Model Context Protocol) server
│   │   ├── web/                    # HTTP server (Axum)
│   │   │   ├── mod.rs
│   │   │   ├── handlers.rs         # HTTP routes, RPC endpoint, OpenRPC spec
│   │   │   └── state.rs            # Shared AppState
│   │   └── modules/
│   │       ├── mod.rs
│   │       ├── index_manager.rs
│   │       ├── schema.rs
│   │       ├── query.rs
│   │       ├── rpc.rs
│   │       └── handlers.rs         # RPC method handlers (18 methods)
│   ├── templates/                  # Askama HTML templates (Web UI)
│   │   ├── base.html               # Layout: Bootstrap 5.3 dark theme, navbar
│   │   └── index.html              # Dashboard: tabs, forms, perf test
│   └── tests/
│       └── integration.rs          # Integration tests + performance benchmark
└── heroindex_client/               # Client library package
    ├── Cargo.toml
    └── src/
        ├── lib.rs
        ├── client.rs
        ├── error.rs
        └── types.rs
Using with Claude via MCP (Local Machine Integration)
HeroIndex can be integrated with Claude via the Model Context Protocol (MCP).
Important: MCP servers added with claude mcp add are stored in Claude Code's local configuration on this machine. They:
- ✅ Are available to Claude Code on this machine
- ❌ Don't sync to other machines or to the cloud
- ✅ Persist across sessions on this machine
- ❌ Are not shared automatically with other clients; Claude Desktop, for example, reads its own claude_desktop_config.json (see the MCP client configuration above)
Quick Start: Add MCP Servers to Claude
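Add HeroIndex to Claude (Global)
# --scope user makes the server available in every project on this machine;
# without it, Claude Code registers the server for the current project only
claude mcp add --transport http --scope user heroindex http://localhost:9753/mcp
Add Other MCP Servers (Global Examples)
# Sentry - Error tracking and monitoring
claude mcp add --transport http --scope user sentry https://mcp.sentry.dev/mcp
# Filesystem - Access local files (stdio servers take the command to launch after "--")
claude mcp add --transport stdio --scope user filesystem -- npx -y @modelcontextprotocol/server-filesystem /path/to/directory
# GitHub - Repository management (replace the path with the command that starts your GitHub MCP server)
claude mcp add --transport stdio --scope user github -- /path/to/github/tool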
What This Does (Local to Your Machine)
When you run claude mcp add, Claude Code:
- ✅ Registers the server in its local configuration (on this machine only)
- ✅ Makes it available in Claude Code sessions on this machine
- ✅ Persists it across sessions, so there is no need to reconfigure
- ❌ Does not sync it to other computers or to the cloud
HeroIndex MCP Capabilities
With HeroIndex as an MCP server, Claude can:
- ✅ Create databases - Define schemas with multiple field types
- ✅ Search documents - Full-text, fuzzy, boolean, range queries
- ✅ Manage indexes - List, select, delete databases
- ✅ Monitor the server - View uptime, database count, and total document count
- ✅ Batch operations - Insert multiple documents at once
- ✅ Analyze data - Process search results with AI
MCP Server Configuration Files (Local to Your Machine)
MCP servers are stored in Claude Code's local configuration file (not synced to the cloud):
| Platform | Location | Scope |
|---|---|---|
| macOS/Linux | `~/.claude.json` | This machine only |
| Windows | `%USERPROFILE%\.claude.json` | This machine only |
Note: Configuration is local to this machine. Each machine where you use Claude needs its own MCP server setup. Servers added with --scope project are written to a .mcp.json file in the project root instead, which travels with the repository only if you commit it.
Example: Full MCP Configuration
{
  "mcpServers": {
    "heroindex": {
      "type": "http",
      "url": "http://localhost:9753/mcp"
    },
    "sentry": {
      "type": "http",
      "url": "https://mcp.sentry.dev/mcp"
    },
    "filesystem": {
      "type": "stdio",
      "command": "filesystem",
      "args": ["/path/to/directory"]
    }
  }
}
Getting Started with HeroIndex MCP (On This Machine)
1. Start HeroIndex server (on this machine):
   make run
   # Listens on http://localhost:9753 (local machine only)
2. Add to Claude (local to this machine):
   claude mcp add --transport http heroindex http://localhost:9753/mcp
   # Stored in Claude Code's local configuration on this machine
3. Use in Claude (on this machine):
   - Claude Code: run /mcp in a session to check that heroindex is connected, then ask Claude to use its tools
   - Other clients (e.g. Claude Desktop): add the server to that client's own configuration file
4. Manage servers (local to this machine):
   # List all configured MCP servers
   claude mcp list
   # Remove a server
   claude mcp remove heroindex
   # Update server configuration (remove the old entry first if the name is already registered)
   claude mcp add --transport http heroindex http://localhost:9754/mcp
5. On other machines:
   - Repeat steps 1-2 on each machine where you want to use HeroIndex
   - Configuration doesn't sync automatically
Natural Language Examples
Once configured, you can ask Claude:
- "Create a search index for documents with title, body, and date fields"
- "Search for articles about machine learning from the last month"
- "Show me the statistics for all databases"
- "Find fuzzy matches for 'algoritm' in the documents"
- "Run a performance benchmark and analyze the results"
License
MIT