Cluster Manager
A modern, web-based tool for managing multiple servers via SSH with real-time monitoring, script execution, and process management integration.
Features
Core Capabilities
- Server Inventory Management
  - Automatic discovery from SSH config
  - Extended metadata (tags, ports, custom fields)
  - Server status monitoring and health checks
- Shell Connections
  - SSH connections with key management
  - Mosh support for mobile connections
  - SSH agent integration for secure key forwarding
- Port Forwarding
  - Local-to-remote tunneling
  - Remote-to-local tunneling
  - Predefined port mapping from metadata
- Script Execution
  - Execute scripts on single or multiple nodes
  - Parallel execution with filtering
  - Job queuing and tracking
  - Real-time execution monitoring
- Process Management
  - Zinit integration for service monitoring
  - Service start/stop/restart control
  - Process status and health tracking
- Modern Web UI
  - Bootstrap5-based responsive design
  - Real-time updates via WebSocket
  - Terminal integration for shell access
  - Job history and status tracking
Advanced Features
- Filtering & Grouping
  - Tag-based server filtering
  - Predefined server groups
  - Status-based filtering
- Execution Profiles
  - Deployment profile (parallel with retries)
  - Maintenance profile (sequential)
  - Custom execution configurations
- Security
  - SSH agent support
  - Host key verification
  - Configurable authentication
- Extensibility
  - Rhai scripting support
  - Webhook notifications
  - Custom command templates
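The two execution profiles named above could be modeled roughly as follows. This is a minimal sketch using only the standard library, not the project's implementation: the `Profile` enum, its `max_retries` field, and the helper names are assumptions made for illustration, and the per-server task is a plain closure standing in for a real SSH call.

```rust
use std::thread;

/// Hypothetical execution profile; variant names follow the README,
/// the `max_retries` field is an assumption of this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Profile {
    /// Run on all servers at once, retrying failures.
    Deployment { max_retries: u32 },
    /// Run on one server at a time, without retries.
    Maintenance,
}

/// Call `task` until it succeeds or `max_retries` extra attempts are used up.
fn run_with_retries<F>(max_retries: u32, task: F) -> Result<String, String>
where
    F: Fn() -> Result<String, String>,
{
    let mut last = task();
    let mut used = 0;
    while last.is_err() && used < max_retries {
        used += 1;
        last = task();
    }
    last
}

/// Run `task` for every server according to `profile`.
/// Results are returned in the same order as `servers`.
fn run_profile<F>(profile: Profile, servers: &[&str], task: F) -> Vec<Result<String, String>>
where
    F: Fn(&str) -> Result<String, String> + Sync,
{
    let task = &task;
    match profile {
        // Sequential: one server after the other.
        Profile::Maintenance => servers.iter().map(|&s| task(s)).collect(),
        // Parallel: one scoped thread per server, each with its own retry loop.
        Profile::Deployment { max_retries } => thread::scope(|scope| {
            let handles: Vec<_> = servers
                .iter()
                .map(|&s| scope.spawn(move || run_with_retries(max_retries, || task(s))))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker thread panicked"))
                .collect()
        }),
    }
}
```

One thread per server keeps the sketch short; a real deployment profile would more likely cap parallelism with a worker pool or an async runtime.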
Quick Start
Prerequisites
- Rust 1.92+ (install)
- SSH configured with keys
- Git
Setup (5 minutes)
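# Clone the repository (or start from an empty directory), then enter it
cd cluster_manager_ssh
# Initialize the Cargo project (skip this step if Cargo.toml already exists; cargo init refuses to run on an existing package)
cargo init --name cluster_manager_ssh
# Build
cargo build
# Run
cargo run
# Access UI (open is macOS-specific; use xdg-open on Linux)
open http://localhost:7483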
For detailed setup instructions, see QUICKSTART.md.
Documentation
For Developers
- INSTRUCTIONS.md - Complete architecture, component design, and implementation guide
  - Project structure and module breakdown
  - Detailed component specifications
  - Integration points with SSH and Zinit libraries
  - Phase-by-phase implementation guide
- API_SPECIFICATION.md - RESTful API endpoints reference
  - All HTTP endpoints with request/response examples
  - WebSocket events and real-time streaming
  - Error responses and status codes
  - Rate limiting and versioning
- CONFIGURATION.md - Configuration and setup guide
  - SSH config file format
  - Extended metadata in TOML
  - Environment variables
  - Advanced configuration options
- QUICKSTART.md - Quick start for new developers
  - Step-by-step project initialization
  - Minimal working example
  - Troubleshooting guide
For Users
- Configuration Guide - Setting up servers, SSH keys, metadata
- User Guide - Using the web UI and executing scripts
- Deployment Guide - Docker, systemd, production setup
Architecture Overview
Layered Architecture
┌─────────────────────────────────────────────┐
│          Web UI Layer (Bootstrap5)          │
│  - Server Dashboard                         │
│  - Connection Manager                       │
│  - Script Executor                          │
│  - Job Monitor                              │
└─────────────────────┬───────────────────────┘
                      │
        ┌─────────────┴──────────────┐
        │                            │
    ┌───▼────────┐            ┌──────▼──────┐
    │ HTTP       │            │ WebSocket   │
    │ Server     │            │ Streaming   │
    └───┬────────┘            └──────┬──────┘
        │                            │
    ┌───┴────────────────────────────┴───┐
    │    Backend Business Logic Layer    │
    │  - Server Management               │
    │  - Connection Handling             │
    │  - Script Execution                │
    │  - Job Tracking                    │
    └───┬────────────────────────────────┘
        │
        ├─────────────────────┬──────────┐
        │                     │          │
    ┌───▼─────────┐ ┌─────────▼───┐ ┌────▼──────┐
    │ SSH Lib     │ │ Zinit       │ │ Rhai      │
    │ (hero_lib)  │ │ Client      │ │ Scripting │
    └───┬─────────┘ └─────────┬───┘ └────┬──────┘
        │                     │          │
        └─────────────────────┼──────────┘
                              │
                  System Integration Layer
                    - SSH Connections
                    - Process Management
                    - File Operations
Key Components
| Component | Purpose | Language/Framework |
|---|---|---|
| Server Manager | Inventory & configuration | Rust |
| Connection Handler | SSH/Mosh/Direct connections | Rust + hero_lib |
| Script Executor | Execute scripts on servers | Rust |
| Job Tracker | Track execution status | Rust |
| Web Server | HTTP API and UI serving | Actix-web |
| WebSocket Handler | Real-time updates | Actix-ws |
| UI Assets | Frontend interface | Bootstrap5 + JS |
API Overview
Server Management
# List servers
GET /api/servers?tags=prod,web
# Get server details
GET /api/servers/{id}
# Probe server status
POST /api/servers/{id}/status
# Update server metadata
PUT /api/servers/{id}
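As an illustration of how the `tags=prod,web` query above might be applied to the inventory, here is a small standard-library sketch. The `Server` struct and both function names are hypothetical, and the choice that a server must carry every requested tag (AND semantics) is an assumption; API_SPECIFICATION.md is the authority on the real behavior.

```rust
/// Hypothetical in-memory server record; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Server {
    id: String,
    tags: Vec<String>,
}

/// Split the value of a `tags` query parameter ("prod,web") into tag names,
/// ignoring surrounding whitespace and empty segments.
fn parse_tags(param: &str) -> Vec<String> {
    param
        .split(',')
        .map(str::trim)
        .filter(|t| !t.is_empty())
        .map(String::from)
        .collect()
}

/// Keep only servers that carry every requested tag (assumed AND semantics).
fn filter_by_tags<'a>(servers: &'a [Server], wanted: &[String]) -> Vec<&'a Server> {
    servers
        .iter()
        .filter(|s| wanted.iter().all(|t| s.tags.contains(t)))
        .collect()
}
```

With an empty tag list every server passes the filter, which matches the intuition that `GET /api/servers` without a query returns the full inventory.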
Connections
# Establish connection
POST /api/connections
{ "server_id": "web-1", "method": "ssh" }
# List active connections
GET /api/connections
# Close connection
DELETE /api/connections/{id}
Script Execution
# Submit job
POST /api/jobs
{ "script": "...", "servers": [...] }
# List jobs
GET /api/jobs?status=completed
# Get job details
GET /api/jobs/{id}
# Stream job execution
GET /api/jobs/{id}/stream
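The job endpoints above imply a small lifecycle per job. The sketch below shows one way to track it in memory. Only the `completed` status appears in this README; the other state names, the numeric job ids, and the `JobTracker` type are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Job lifecycle states. Only `completed` is named in the README;
/// the remaining variants are assumptions of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobStatus {
    Queued,
    Running,
    Completed,
    Failed,
}

impl JobStatus {
    /// Allowed moves: Queued -> Running -> Completed | Failed.
    fn can_become(self, next: JobStatus) -> bool {
        matches!(
            (self, next),
            (JobStatus::Queued, JobStatus::Running)
                | (JobStatus::Running, JobStatus::Completed)
                | (JobStatus::Running, JobStatus::Failed)
        )
    }
}

/// Hypothetical in-memory tracker keyed by numeric job id.
#[derive(Default)]
struct JobTracker {
    jobs: HashMap<u64, JobStatus>,
    next_id: u64,
}

impl JobTracker {
    /// Register a new job in the Queued state and return its id.
    fn submit(&mut self) -> u64 {
        self.next_id += 1;
        self.jobs.insert(self.next_id, JobStatus::Queued);
        self.next_id
    }

    /// Apply a status change, rejecting unknown ids and illegal transitions.
    fn advance(&mut self, id: u64, next: JobStatus) -> Result<(), String> {
        let current = self
            .jobs
            .get_mut(&id)
            .ok_or_else(|| format!("unknown job {id}"))?;
        if !current.can_become(next) {
            return Err(format!("job {id}: {current:?} -> {next:?} is not allowed"));
        }
        *current = next;
        Ok(())
    }

    /// Ids of all jobs in `status`, sorted ascending (cf. `GET /api/jobs?status=...`).
    fn with_status(&self, status: JobStatus) -> Vec<u64> {
        let mut ids: Vec<u64> = self
            .jobs
            .iter()
            .filter(|(_, s)| **s == status)
            .map(|(id, _)| *id)
            .collect();
        ids.sort_unstable();
        ids
    }
}
```

Rejecting illegal transitions in one place keeps the history endpoint and the stream endpoint from ever disagreeing about a job's state.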
Process Management
# List processes on server
GET /api/servers/{id}/processes
# Get process status
GET /api/servers/{id}/processes/{name}
# Control process
POST /api/servers/{id}/processes/{name}/action
{ "action": "restart" }
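The action payload above can be validated before it reaches Zinit. The following sketch assumes the three actions listed under Process Management (start, stop, restart) are the only valid ones; the `ProcessAction` enum and `action_request` helper are hypothetical names, not part of the project's API.

```rust
use std::str::FromStr;

/// Actions listed under "Process Management"; the enum itself is a sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProcessAction {
    Start,
    Stop,
    Restart,
}

impl FromStr for ProcessAction {
    type Err = String;

    /// Parse the `action` field case-insensitively, rejecting anything else.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "start" => Ok(ProcessAction::Start),
            "stop" => Ok(ProcessAction::Stop),
            "restart" => Ok(ProcessAction::Restart),
            other => Err(format!("unsupported action: {other}")),
        }
    }
}

/// Build the request path and JSON body for the action endpoint.
fn action_request(server_id: &str, process: &str, action: ProcessAction) -> (String, String) {
    let verb = match action {
        ProcessAction::Start => "start",
        ProcessAction::Stop => "stop",
        ProcessAction::Restart => "restart",
    };
    (
        format!("/api/servers/{server_id}/processes/{process}/action"),
        format!("{{ \"action\": \"{verb}\" }}"),
    )
}
```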
See API_SPECIFICATION.md for complete API reference.
Configuration
Server Configuration File
~/.ssh/config (standard SSH):
Host web-server-1
HostName 10.0.1.5
User deploy
IdentityFile ~/.ssh/id_rsa
ForwardAgent yes
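Automatic discovery from a file like the one above comes down to reading `Host` blocks. The sketch below is a deliberately simplified parser, not the project's loader: `HostEntry` and `parse_ssh_config` are hypothetical names, only the first alias on a `Host` line is used, wildcard and negated patterns are skipped, and `Key=value` syntax, `Match` blocks, and `Include` directives are not handled.

```rust
/// Subset of a `Host` block that a server inventory might need (hypothetical).
#[derive(Debug, Default, Clone, PartialEq)]
struct HostEntry {
    alias: String,
    host_name: Option<String>,
    user: Option<String>,
    port: Option<u16>,
}

/// Collect concrete hosts from ssh_config-style text.
fn parse_ssh_config(text: &str) -> Vec<HostEntry> {
    let mut hosts: Vec<HostEntry> = Vec::new();
    // True while the current block belongs to a concrete (non-pattern) host.
    let mut in_concrete_host = false;

    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let Some((key, value)) = line.split_once(char::is_whitespace) else {
            continue;
        };
        let value = value.trim();

        // Keywords in ssh_config are case-insensitive.
        match key.to_ascii_lowercase().as_str() {
            "host" => {
                let alias = value.split_whitespace().next().unwrap_or("");
                let is_pattern = alias.contains(|c: char| matches!(c, '*' | '?' | '!'));
                in_concrete_host = !alias.is_empty() && !is_pattern;
                if in_concrete_host {
                    hosts.push(HostEntry { alias: alias.to_string(), ..Default::default() });
                }
            }
            key if in_concrete_host => {
                let entry = hosts.last_mut().expect("entry pushed on Host line");
                match key {
                    "hostname" => entry.host_name = Some(value.to_string()),
                    "user" => entry.user = Some(value.to_string()),
                    "port" => entry.port = value.parse().ok(),
                    _ => {} // options such as IdentityFile are left to ssh itself
                }
            }
            _ => {}
        }
    }
    hosts
}
```

Applied to the example above, this yields a single entry for `web-server-1` with host name `10.0.1.5` and user `deploy`; the port stays unset, so the SSH default applies.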
Extended Metadata
~/.ssh/cluster_manager.toml:
[servers.web-server-1]
tags = ["production", "web", "us-east"]
enabled = true
environment = "production"
[servers.web-server-1.ports]
http = { local = 8080, remote = 80, description = "HTTP" }
https = { local = 8443, remote = 443, description = "HTTPS" }
[server_groups]
production_web = ["web-server-1", "web-server-2"]
See CONFIGURATION.md for complete configuration options.
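To make the `ports` table above concrete, the sketch below turns one mapping into arguments for the OpenSSH client. The `-N`, `-L`, and `-R` flags are standard `ssh` options; the `Direction` and `PortMapping` types, the function name, and the use of `localhost` as the forwarding target are assumptions for illustration, and the project may well set up tunnels through its SSH library instead of the CLI.

```rust
/// Tunnel direction, named after the README's "Port Forwarding" bullets.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    LocalToRemote,
    RemoteToLocal,
}

/// One entry of `[servers.<name>.ports]`; the struct is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PortMapping {
    local: u16,
    remote: u16,
}

/// Arguments for an `ssh` invocation that only forwards a port (`-N` runs no remote command).
fn forward_args(host: &str, direction: Direction, map: &PortMapping) -> Vec<String> {
    let (flag, spec) = match direction {
        // -L: listen on the local port, forward to the remote port on the server.
        Direction::LocalToRemote => ("-L", format!("{}:localhost:{}", map.local, map.remote)),
        // -R: listen on the server's port, forward back to the local port.
        Direction::RemoteToLocal => ("-R", format!("{}:localhost:{}", map.remote, map.local)),
    };
    vec!["-N".to_string(), flag.to_string(), spec, host.to_string()]
}
```

For the `http` mapping above and the local-to-remote direction, the result corresponds to `ssh -N -L 8080:localhost:80 web-server-1`.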
Development
Project Structure
cluster_manager_ssh/
├── src/
│ ├── config/ # Configuration loading
│ ├── server/ # Server management
│ ├── connection/ # SSH/Mosh connections
│ ├── execution/ # Script execution & jobs
│ ├── ui/ # Web server & handlers
│ ├── integration/ # External library wrappers
│ ├── error.rs # Error types
│ ├── state.rs # App state
│ ├── lib.rs # Library root
│ └── main.rs # Entry point
├── tests/ # Unit & integration tests
├── ui/assets/ # HTML, CSS, JS assets
├── docs/ # Documentation
├── INSTRUCTIONS.md # Full architectural specs
├── API_SPECIFICATION.md # API reference
├── CONFIGURATION.md # Configuration guide
├── QUICKSTART.md # Quick start guide
├── Cargo.toml # Dependencies
└── README.md # This file
Building
# Development build
cargo build
# Release build
cargo build --release
# With specific features
cargo build --features "mosh-support,rhai-scripts"
Running
# Default (localhost:7483)
cargo run
# Custom port
CLUSTER_MANAGER_PORT=8080 cargo run
# With debug logging
RUST_LOG=debug cargo run
# Production
./target/release/cluster_manager_ssh
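The port override shown above could be resolved along these lines. This is a sketch, not the project's startup code: the function names are hypothetical, and treating an empty variable as "use the default" is an assumption. The default of 7483 and the `CLUSTER_MANAGER_PORT` variable come from this README.

```rust
use std::env;

/// Default port used throughout this README.
const DEFAULT_PORT: u16 = 7483;

/// Turn the raw value of CLUSTER_MANAGER_PORT into a port number.
/// A missing or blank value falls back to the default (assumed behavior).
fn resolve_port(raw: Option<&str>) -> Result<u16, String> {
    match raw.map(str::trim) {
        None | Some("") => Ok(DEFAULT_PORT),
        Some(v) => v
            .parse::<u16>()
            .map_err(|e| format!("invalid CLUSTER_MANAGER_PORT {v:?}: {e}")),
    }
}

/// Read the variable from the process environment and resolve it.
fn port_from_env() -> Result<u16, String> {
    resolve_port(env::var("CLUSTER_MANAGER_PORT").ok().as_deref())
}
```

Keeping the parsing in `resolve_port` separate from the environment lookup makes the fallback logic testable without mutating process-wide state.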
Testing
# Run all tests
cargo test
# Run specific test
cargo test server_filter
# With output
cargo test -- --nocapture
Deployment
Docker
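# Build stage: use a builder on the same Debian release as the runtime image,
# so the binary links against a glibc that is available at runtime
FROM rust:1-bookworm AS builder
WORKDIR /build
COPY . .
RUN cargo build --release
# Runtime stage: only the binary plus the SSH and Mosh clients
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y openssh-client mosh \
    && rm -rf /var/lib/apt/lists/*
COPY --from=builder /build/target/release/cluster_manager_ssh /usr/local/bin/
EXPOSE 7483
CMD ["cluster_manager_ssh"]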
Build and run:
docker build -t cluster_manager .
docker run -p 7483:7483 -v ~/.ssh:/root/.ssh cluster_manager
Systemd Service
Create /etc/systemd/system/cluster_manager.service:
[Unit]
Description=Cluster Manager SSH
After=network.target
[Service]
Type=simple
User=cluster
WorkingDirectory=/opt/cluster_manager
ExecStart=/usr/local/bin/cluster_manager_ssh
Environment="CLUSTER_MANAGER_PORT=7483"
Restart=always
[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl enable cluster_manager
sudo systemctl start cluster_manager
Integration Points
Hero Lib (SSH)
use hero_lib_os::ssh::SshConnection;
// Connect to server via SSH library
let conn = SshConnection::connect(
    hostname, port, user, config
).await?;
Zinit (Process Management)
use zinit_client::Client;
// Connect to Zinit and manage services
let client = Client::new(config).await?;
let processes = client.list_services().await?;
Rhai (Scripting)
// Execute Rhai scripts with context
let result = executor.execute_script(
    "ssh server { print('Hello') }",
    context
).await?;
Security Considerations
SSH Security
- Requires valid SSH configuration
- Supports SSH agent for key management
- Host key verification enabled by default
- SSH agent forwarding available
Connection Security
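- All server connections encrypted via SSH
- Mosh sessions are set up over SSH and then encrypted with AES-128 in OCB mode
- WebSocket connections share the UI's HTTP/HTTPS endpoint, so they are encrypted only when the UI is served over HTTPS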
Authentication (Future)
- API key support ready
- OAuth2 integration possible
- RBAC framework planned
Data Security
- No credentials stored (uses SSH agent)
- Job scripts not persisted in plaintext
- Audit logging support
Performance Characteristics
- Connection Pool: up to 20 concurrent SSH connections
- Job Queue: up to 1000 queued jobs
- Parallel Execution: up to 10 jobs running simultaneously
- Job History: 30 days retention (configurable)
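The queue and parallelism limits listed above interact: jobs wait in the queue until one of the execution slots frees up. A minimal sketch of that bookkeeping follows. The `Scheduler` type, its method names, and the FIFO ordering are assumptions; the numbers 1000 and 10 are the limits from the list.

```rust
use std::collections::VecDeque;

/// Hypothetical scheduler enforcing a queue capacity and a parallelism cap.
struct Scheduler {
    queued: VecDeque<u64>,
    running: Vec<u64>,
    max_queued: usize,
    max_parallel: usize,
}

impl Scheduler {
    fn new(max_queued: usize, max_parallel: usize) -> Self {
        Scheduler {
            queued: VecDeque::new(),
            running: Vec::new(),
            max_queued,
            max_parallel,
        }
    }

    /// Accept a job id, or hand it back when the queue is full.
    fn enqueue(&mut self, job: u64) -> Result<(), u64> {
        if self.queued.len() >= self.max_queued {
            return Err(job);
        }
        self.queued.push_back(job);
        Ok(())
    }

    /// Move queued jobs into free execution slots (FIFO) and return the ids started.
    fn start_ready(&mut self) -> Vec<u64> {
        let mut started = Vec::new();
        while self.running.len() < self.max_parallel {
            let Some(job) = self.queued.pop_front() else { break };
            self.running.push(job);
            started.push(job);
        }
        started
    }

    /// Free the slot held by `job`; returns false if it was not running.
    fn finish(&mut self, job: u64) -> bool {
        match self.running.iter().position(|&j| j == job) {
            Some(i) => {
                self.running.swap_remove(i);
                true
            }
            None => false,
        }
    }
}
```

Returning the rejected id from `enqueue` lets the caller decide whether to report an error to the API client or retry later.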
Troubleshooting
SSH Connection Issues
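Host key verification failed:
# If the server's host key legitimately changed, remove the stale known_hosts entry
ssh-keygen -R 10.0.1.5
# To accept keys of hosts seen for the first time, add to ~/.ssh/config
# (avoid "StrictHostKeyChecking no" with UserKnownHostsFile=/dev/null: it disables host key verification entirely)
Host *
StrictHostKeyChecking accept-new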
SSH key not found:
# Ensure SSH agent is running
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
Port Already in Use
# Use different port
CLUSTER_MANAGER_PORT=8080 cargo run
# Or kill process using port
lsof -ti :7483 | xargs kill -9
Configuration Not Loading
# Verify config file location
ls ~/.ssh/cluster_manager.toml
# Check log output
RUST_LOG=debug cargo run
Contributing
To contribute to this project:
- Read INSTRUCTIONS.md for architecture details
- Follow the module structure defined in the project
- Add tests for new functionality
- Update documentation
- Run cargo test and cargo clippy before committing
Roadmap
Phase 1 (Current)
- Core architecture design
- Configuration system
- SSH integration
- Basic web UI
Phase 2
- Script execution engine
- Job tracking system
- WebSocket real-time updates
- Bootstrap5 UI implementation
Phase 3
- Zinit process management
- Rhai scripting support
- Advanced filtering
- Job scheduling
Phase 4
- Authentication & RBAC
- Audit logging
- Multi-user support
- High availability setup
License
[Specify your license here]
Support
- Documentation: See docs/ directory
- Issues: GitHub Issues
- Discussions: GitHub Discussions
References
External Documentation
- SSH Protocol (RFC 4251)
- Hero Lib Packages
- Zinit Process Supervisor
- Bootstrap5 Documentation
- Actix-web Framework
- Rhai Scripting Language
Related Tools
- OpenSSH
- Mosh (mobile shell)
- tmux/screen
- pdsh (parallel shell)
Project Version: 0.1.0
Last Updated: 2026-01-28
Status: Specifications Complete - Ready for Implementation
Maintainer: Your Team