Self-Hosted AI Agent Server

One brain.
Many arms.
Infinite tasks.

Install Linux. Install OctopOS. Done.
Manage your entire AI agent swarm from a single web console — no Docker, no cloud dependency.

Quick Start · How it works →
"Proxmox for AI Agents" — The missing platform for self-hosted agent swarms.

Everything you need. Nothing you don't.

Built for real infrastructure. No Docker networking nightmares.

🧠

Boss + Worker Swarms

One Boss-Agent per project coordinates a pool of specialists in parallel. Results aggregated, streamed token-by-token to the user.

🔌

One-Command Install

One bash command. GPU auto-detected. Matrix, nginx, TLS, systemd — all configured. Web console ready in minutes.

💬

Matrix Communication

Agents are real Matrix bots. Use Element to supervise any conversation live. Built-in watchdog restarts crashed bots automatically.

🤖

Multi-LLM per Agent

Each agent runs its own model. Ollama locally, Claude Max (OAuth), OpenAI — mixed per task. Configurable temperature and token limits.

📚

QMD Skills

Give agents extra knowledge via Markdown files with YAML frontmatter. Scope: always or keyword-triggered on-demand. Hot-reload, no restart.

📁

Project Isolation

Each project gets a Linux user, isolated filesystem, Samba share, and Matrix room. Agents can only access their own project directory.

🔗

Webhook System

Trigger agents from external systems — Git pushes, CI pipelines, monitoring alerts. HMAC-signed outgoing webhooks included.
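HMAC-signed webhook verification on the receiving end can be sketched with Python's standard library (a minimal illustration; the header format and secret handling are assumptions, not OctopOS's actual API):

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check an HMAC-SHA256 signature of the form 'sha256=<hex>' against the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, avoiding timing side channels
    return hmac.compare_digest(expected, signature_header)
```

The same construction in reverse (sign the body, attach the header) would cover the outgoing side.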

🛡️

Security & Audit

JWT auth, login rate-limiting, atomic config writes, three-layer tool permission model. Full audit log of all user actions.

🖥️

Full Web Console

Manage agents, projects, users, skills, webhooks, logs and LLM config — all without SSH. Built with React + Tailwind.

How it works

A message travels through all ten processing stages in under a second.

👤 Web Chat
⚡ FastAPI Core
📡 Matrix Room
🧠 Boss Agent
🐙 Worker Swarm
✅ SSE Stream

The user sees a clean streaming chat. The agents do the rest — with full Matrix audit trail.

Architecture

Clean separation of concerns. Every layer replaceable.

┌─────────────────────────────────────────┐
CLIENT React Console · Element · REST API
├─────────────────────────────────────────┤
GATEWAY nginx HTTPS · JWT Auth · Rate Limit
├─────────────────────────────────────────┤
CORE FastAPI · Orchestrator · Audit Log
├─────────────────────────────────────────┤
COMM BUS conduwuit Matrix · Agent Bots
├─────────────────────────────────────────┤
LLM ADAPTER Ollama · Claude OAuth · OpenAI
└─────────────────────────────────────────┘

Stack

Boring technology. Battle-tested components.

Core
Python 3.12 + FastAPI · litellm · matrix-nio · anthropic SDK
Console
React 18 + TypeScript · Vite · Tailwind CSS · shadcn/ui
Matrix
conduwuit (Rust) · single binary · RocksDB · no Postgres
LLM
Ollama + cloud APIs · llama3 · Claude · OpenAI · mixed per agent
Infrastructure
Bash + systemd · no Docker · idempotent installer
Platform
Ubuntu 22/24 LTS · Proxmox VM · PCIe GPU passthrough

Deployment Profiles

Start small. Scale to full GPU when ready.

🔵 Lite

Cloud APIs · No GPU required
  • Any VM or VPS
  • Claude Max OAuth + OpenAI
  • Full web console
  • Matrix communication
  • All project features
  • HTTPS out of the box

Agent System

From the handbook — how agents, skills and projects work together.

Agent Types

🎯 Boss
Receives user messages, coordinates worker agents, streams results back
🔬 Specialist
Domain expert delegated tasks from Boss — tax, legal, code, etc.
⚡ Worker
Short-lived task agent, spawned on-demand, discarded after completion
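The Boss-to-Worker fan-out can be sketched with asyncio (an illustrative sketch only; the function names mirror the tool names above but are not OctopOS internals):

```python
import asyncio

async def run_worker(task: str) -> str:
    # Stand-in for a short-lived worker agent making its own LLM call
    await asyncio.sleep(0)
    return f"result for {task}"

async def dispatch_task(tasks: list[str]) -> list[str]:
    # One worker per subtask, run in parallel; results come back in task order
    return list(await asyncio.gather(*(run_worker(t) for t in tasks)))

results = asyncio.run(dispatch_task(["tax check", "code review"]))
```

The Boss would then aggregate `results` into a single answer before streaming it back.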

QMD Skills

---
skill: Tax Law Basics
scope: on-demand
triggers: [tax, vat, income]
priority: 10
---
## Knowledge content here...

Hot-reload. No restart needed. Injected into system prompt on keyword match.
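The keyword-triggered loading described above can be sketched as follows (a deliberately tiny frontmatter parser and matcher, assuming the file shape shown; not the actual OctopOS loader):

```python
import re

def parse_frontmatter(text: str) -> dict:
    """Minimal parser for the YAML-style frontmatter block shown above (sketch only)."""
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    meta = {}
    if match:
        for line in match.group(1).splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

def skill_applies(meta: dict, message: str) -> bool:
    """always-scoped skills load unconditionally; on-demand skills need a trigger hit."""
    if meta.get("scope") == "always":
        return True
    triggers = meta.get("triggers", "").strip("[]").split(",")
    words = message.lower()
    return any(t.strip() and t.strip() in words for t in triggers)
```

A matching skill's body would then be appended to the agent's system prompt for that turn.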

Available Tools

file_read — read project files
file_write — write project files
web_search — search the web
http_request — call external APIs
dispatch_task — delegate to workers (boss only)
spawn_agent — create short-lived agent
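A registry of this shape is typically a name-to-callable map the LLM loop dispatches into. A minimal sketch (names echo the list above for illustration; this is not the real OctopOS registry):

```python
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}

def tool(name: str):
    """Register a function under a tool name so the agent loop can dispatch to it."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return decorator

@tool("file_read")
def file_read(path: str) -> str:
    # A real implementation would confine paths to the project's isolated directory
    with open(path, encoding="utf-8") as f:
        return f.read()

def execute(name: str, **kwargs) -> str:
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

Intercepted tool calls from the model would be routed through `execute` and the results fed back for the next turn.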

Data Flow A → Z

From the technical docs — what happens when a message is sent.

1. Browser: POST /projects/{id}/message/stream → nginx (TLS termination) → FastAPI
2. Auth: JWT validated, project loaded, asyncio.Queue selected
3. Matrix: message enqueued → Boss-Agent's Matrix room → matrix-nio client
4. Skills: skill files scanned; always-loaded and keyword-matched skills injected into the system prompt
5. LLM: litellm routes to the configured provider (Ollama / Claude / OpenAI), streaming enabled
6. Tools: tool calls intercepted, executed via the registry, results returned to the LLM (multi-turn)
7. Workers: dispatch_task → parallel worker calls → results aggregated back to the Boss
8. Response: final answer posted to the Matrix room, stored in session history
9. SSE: Core dequeues tokens → data: {"text":"..."} → browser renders Markdown
10. Done: data: {"done":true} → stream closed → audit log written
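The SSE framing in the last two stages can be sketched like this (a minimal illustration of the wire format shown above, not the actual Core code):

```python
import json

def sse_event(payload: dict) -> str:
    """Serialize one chunk as a Server-Sent Events 'data:' line."""
    return f"data: {json.dumps(payload)}\n\n"

def stream(tokens):
    # Token chunks first, then the terminating done event
    for tok in tokens:
        yield sse_event({"text": tok})
    yield sse_event({"done": True})
```

In FastAPI, a generator like `stream` would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.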

Quick Start

Three commands. One browser tab. Running.

$ git clone https://github.com/tilleulenspiegel/octopos.git
$ cd octopos
$ sudo bash installer/install.sh
 
# → open https://<your-ip> → Setup Wizard → done

Documentation

Everything you need to install, use, and extend OctopOS.

📖

Handbook

Installation, setup wizard, agents, projects, chat, LLM config, troubleshooting

⚙️

Technical Docs

Architecture, module overview, data flow A–Z, error handling, known limitations

🔌

API Reference

All REST endpoints with request/response examples — agents, projects, webhooks, audit

🛠️

Developer Guide

Write tools, skills, endpoints, Console pages — extend OctopOS with new features

Production ready — actively developed
Core complete · Console complete · Docs complete · API not yet stable