Self-Hosted AI Agent Server

One brain.
Many arms.
Infinite tasks.

Install Linux. Install OctopOS. Done.
Manage your entire AI agent swarm from a single web console — no Docker, no cloud dependency.

Quick Start · How it works →
"Proxmox for AI Agents" — The missing platform for self-hosted agent swarms.

Everything you need. Nothing you don't.

Built for real infrastructure. No Docker networking nightmares.

🧠

Boss + Worker Swarms

One Boss-Agent per project coordinates a pool of specialists in parallel. Results aggregated, streamed token-by-token to the user.

🔌

One-Command Install

One bash command. GPU auto-detected. Matrix, nginx, TLS, systemd — all configured. Web console ready in minutes.

💬

Matrix Communication

Agents are real Matrix bots. Use Element to supervise any conversation live. Built-in watchdog restarts crashed bots automatically.

🤖

Multi-LLM per Agent

Each agent runs its own model. Ollama locally, Claude Max (OAuth), OpenAI — mixed per task. Configurable temperature and token limits.

📚

QMD Skills

Give agents extra knowledge via Markdown files with YAML frontmatter. Scope: always or keyword-triggered on-demand. Hot-reload, no restart.

📁

Project Isolation

Each project gets a Linux user, isolated filesystem, Samba share, and Matrix room. Agents can only access their own project directory.

🔗

Webhook System

Trigger agents from external systems — Git pushes, CI pipelines, monitoring alerts. HMAC-signed outgoing webhooks included.
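HMAC-signed webhook verification on the receiving end can be sketched with Python's standard library (a minimal illustration; the header format and secret handling are assumptions, not OctopOS's actual API):

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check an HMAC-SHA256 signature of the form 'sha256=<hex>' against the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, avoiding timing side channels
    return hmac.compare_digest(expected, signature_header)
```

The same construction in reverse (sign the body, attach the header) would cover the outgoing side.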

🛡️

Security & Audit

JWT auth, login rate-limiting, atomic config writes, three-layer tool permission model. Full audit log of all user actions.

🖥️

Full Web Console

Manage agents, projects, users, skills, webhooks, logs and LLM config — all without SSH. Built with React + Tailwind.

How it works

A message travels through all ten processing stages in under a second.

👤 Web Chat
⚡ FastAPI Core
📡 Matrix Room
🧠 Boss Agent
🐙 Worker Swarm
✅ SSE Stream

The user sees a clean streaming chat. The agents do the rest — with full Matrix audit trail.

Architecture

Clean separation of concerns. Every layer replaceable.

┌─────────────────────────────────────────┐
CLIENT React Console · Element · REST API
├─────────────────────────────────────────┤
GATEWAY nginx HTTPS · JWT Auth · Rate Limit
├─────────────────────────────────────────┤
CORE FastAPI · Orchestrator · Audit Log
├─────────────────────────────────────────┤
COMM BUS conduwuit Matrix · Agent Bots
├─────────────────────────────────────────┤
LLM ADAPTER Ollama · Claude OAuth · OpenAI
└─────────────────────────────────────────┘

Stack

Boring technology. Battle-tested components.

Core
Python 3.12 + FastAPI · litellm · matrix-nio · anthropic SDK
Console
React 18 + TypeScript · Vite · Tailwind CSS · shadcn/ui
Matrix
conduwuit (Rust) · single binary · RocksDB · no Postgres
LLM
Ollama + cloud APIs · llama3 · Claude · OpenAI · mixed per agent
Infrastructure
Bash + systemd · no Docker · idempotent installer
Platform
Ubuntu 22/24 LTS · Proxmox VM · PCIe GPU passthrough

Deployment Profiles

Start small. Scale to full GPU when ready.

🔵 Lite

Cloud APIs · No GPU required
  • Any VM or VPS
  • Claude Max OAuth + OpenAI
  • Full web console
  • Matrix communication
  • All project features
  • HTTPS out of the box

Agent System

From the handbook — how agents, skills and projects work together.

Agent Types

🎯 Boss
Receives user messages, coordinates worker agents, streams results back
🔬 Specialist
Domain expert delegated tasks from Boss — tax, legal, code, etc.
⚡ Worker
Short-lived task agent, spawned on-demand, discarded after completion
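The Boss-to-Worker fan-out can be sketched with asyncio (an illustrative sketch only; the function names mirror the tool names above but are not OctopOS internals):

```python
import asyncio

async def run_worker(task: str) -> str:
    # Stand-in for a short-lived worker agent making its own LLM call
    await asyncio.sleep(0)
    return f"result for {task}"

async def dispatch_task(tasks: list[str]) -> list[str]:
    # One worker per subtask, run in parallel; results come back in task order
    return list(await asyncio.gather(*(run_worker(t) for t in tasks)))

results = asyncio.run(dispatch_task(["tax check", "code review"]))
```

The Boss would then aggregate `results` into a single answer before streaming it back.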

QMD Skills

---
skill: Tax Law Basics
scope: on-demand
triggers: [tax, vat, income]
priority: 10
---
## Knowledge content here...

Hot-reload. No restart needed. Injected into system prompt on keyword match.
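The keyword-triggered loading described above can be sketched as follows (a deliberately tiny frontmatter parser and matcher, assuming the file shape shown; not the actual OctopOS loader):

```python
import re

def parse_frontmatter(text: str) -> dict:
    """Minimal parser for the YAML-style frontmatter block shown above (sketch only)."""
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    meta = {}
    if match:
        for line in match.group(1).splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

def skill_applies(meta: dict, message: str) -> bool:
    """always-scoped skills load unconditionally; on-demand skills need a trigger hit."""
    if meta.get("scope") == "always":
        return True
    triggers = meta.get("triggers", "").strip("[]").split(",")
    words = message.lower()
    return any(t.strip() and t.strip() in words for t in triggers)
```

A matching skill's body would then be appended to the agent's system prompt for that turn.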

Available Tools

file_read — read project files
file_write — write project files
web_search — search the web
http_request — call external APIs
dispatch_task — delegate to workers (boss only)
spawn_agent — create short-lived agent
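A registry of this shape is typically a name-to-callable map the LLM loop dispatches into. A minimal sketch (names echo the list above for illustration; this is not the real OctopOS registry):

```python
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}

def tool(name: str):
    """Register a function under a tool name so the agent loop can dispatch to it."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return decorator

@tool("file_read")
def file_read(path: str) -> str:
    # A real implementation would confine paths to the project's isolated directory
    with open(path, encoding="utf-8") as f:
        return f.read()

def execute(name: str, **kwargs) -> str:
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

Intercepted tool calls from the model would be routed through `execute` and the results fed back for the next turn.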

Data Flow A → Z

From the technical docs — what happens when a message is sent.

1. Browser: POST /projects/{id}/message/stream → nginx (TLS termination) → FastAPI
2. Auth: JWT validated, project loaded, asyncio.Queue selected
3. Matrix: message enqueued → Boss-Agent's Matrix room → matrix-nio client
4. Skills: skill files scanned; always-loaded and keyword-matched skills injected into the system prompt
5. LLM: litellm routes to the configured provider (Ollama / Claude / OpenAI), streaming enabled
6. Tools: tool calls intercepted, executed via the registry, results returned to the LLM (multi-turn)
7. Workers: dispatch_task → parallel worker calls → results aggregated back to the Boss
8. Response: final answer posted to the Matrix room, stored in session history
9. SSE: Core dequeues tokens → data: {"text":"..."} → browser renders Markdown
10. Done: data: {"done":true} → stream closed → audit log written
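The SSE framing in the last two stages can be sketched like this (a minimal illustration of the wire format shown above, not the actual Core code):

```python
import json

def sse_event(payload: dict) -> str:
    """Serialize one chunk as a Server-Sent Events 'data:' line."""
    return f"data: {json.dumps(payload)}\n\n"

def stream(tokens):
    # Token chunks first, then the terminating done event
    for tok in tokens:
        yield sse_event({"text": tok})
    yield sse_event({"done": True})
```

In FastAPI, a generator like `stream` would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.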

Quick Start

Three commands. One browser tab. Running.

$ git clone https://github.com/tilleulenspiegel/octopos.git
$ cd octopos
$ sudo bash installer/install.sh
 
# → open https://<your-ip> → Setup Wizard → done

Documentation

Everything you need to install, use, and extend OctopOS.

📖

Handbook

Installation, setup wizard, agents, projects, chat, LLM config, troubleshooting

⚙️

Technical Docs

Architecture, module overview, data flow A–Z, error handling, known limitations

🔌

API Reference

All REST endpoints with request/response examples — agents, projects, webhooks, audit

🛠️

Developer Guide

Write tools, skills, endpoints, Console pages — extend OctopOS with new features

Production ready — actively developed
Core complete · Console complete · Docs complete · API not yet stable