Everruns is a durable agentic harness engine built in Rust with a PostgreSQL-backed durable execution system. It provides a modular architecture for building, deploying, and managing AI agents with production-grade durability and observability.

Documentation Index
Fetch the complete documentation index at: https://mintlify.com/everruns/everruns/llms.txt
Use this file to discover all available pages before exploring further.
Core Architecture
The system is organized into three distinct layers that work together to provide flexible, durable agent execution. In the architecture diagram, solid arrows represent configuration ownership — Harnesses and Agents contain Capabilities. Dashed arrows represent runtime assembly — configuration merges into a RuntimeAgent, which executes in a Session.

System Components
Control Plane (Server)
The control plane manages system state and provides APIs:

- REST API (port 9000) — Public HTTP API for agent management, sessions, and streaming events
- gRPC Server (port 9001) — Internal service for worker communication
- PostgreSQL Storage — All persistent state (agents, sessions, events, workflows)
- SSE Streaming — Real-time event delivery to clients
Workers
Stateless executors that process agent turns:

- Communicate exclusively via gRPC (no direct database access)
- Execute the reason-act loop (LLM calls + tool execution)
- Support horizontal scaling with push-based task distribution
- Automatic failover via heartbeat monitoring
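The reason-act loop mentioned above can be sketched as follows. This is a minimal illustration, not Everruns' actual worker code: the `LlmStep` type and the `call_llm` / `run_tool` functions are hypothetical stand-ins (the real worker talks to the control plane over gRPC and to a real LLM provider).

```rust
// Hypothetical sketch of a worker's reason-act loop: call the LLM,
// execute any requested tool, feed the result back, repeat until the
// LLM produces a final answer.

#[derive(Debug, PartialEq)]
enum LlmStep {
    ToolCall { name: String, args: String },
    FinalAnswer(String),
}

// Stub LLM: requests one tool call, then finishes.
fn call_llm(history: &[String]) -> LlmStep {
    if history.iter().any(|m| m.starts_with("tool:")) {
        LlmStep::FinalAnswer("done".to_string())
    } else {
        LlmStep::ToolCall { name: "search".into(), args: "everruns".into() }
    }
}

// Stub tool executor.
fn run_tool(name: &str, args: &str) -> String {
    format!("tool:{name}({args}) -> ok")
}

// Drive the loop for one agent turn.
fn run_turn(user_input: &str) -> String {
    let mut history = vec![format!("user:{user_input}")];
    loop {
        match call_llm(&history) {
            LlmStep::ToolCall { name, args } => history.push(run_tool(&name, &args)),
            LlmStep::FinalAnswer(text) => return text,
        }
    }
}

fn main() {
    println!("{}", run_turn("hello"));
}
```

Because the loop is stateless between turns, any worker can pick up the next turn, which is what makes horizontal scaling and failover straightforward.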
Durable Execution Engine
PostgreSQL-backed workflow orchestration:

- Event-sourced workflows with automatic retries
- Circuit breakers and dead letter queues
- Distributed task claiming via SKIP LOCKED
- Push notifications for low-latency task distribution (<10ms P99)
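A claim query in the style used for SKIP LOCKED task distribution might look like the sketch below. The table and column names are illustrative, not Everruns' actual schema; the key idea is that `FOR UPDATE SKIP LOCKED` lets concurrent workers each lock a different pending row instead of blocking on rows another worker has already claimed.

```rust
// Illustrative claim query (hypothetical schema). FOR UPDATE SKIP LOCKED
// skips rows locked by other transactions, so concurrent workers never
// contend for the same task.
const CLAIM_TASK_SQL: &str = "
    UPDATE tasks
    SET status = 'running', claimed_by = $1
    WHERE id = (
        SELECT id FROM tasks
        WHERE status = 'pending'
        ORDER BY created_at
        LIMIT 1
        FOR UPDATE SKIP LOCKED
    )
    RETURNING id;
";

fn main() {
    println!("{CLAIM_TASK_SQL}");
}
```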
Configuration Hierarchy
Everruns uses a three-tier configuration model:

- Harness — Infrastructure and base behavior
- Agent — Domain-specific customization (optional)
- Session — Runtime overrides (optional)

Each tier can contribute:

- System prompt additions
- Enabled capabilities
- LLM model selection (overridable at each level)
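The three-tier merge can be sketched as below. Struct and field names are illustrative, and the merge rules (later tiers override the model, prompt additions layer in order, capabilities accumulate) are assumptions consistent with the hierarchy described above, not Everruns' exact semantics.

```rust
// Hypothetical sketch: merge Harness -> Agent -> Session layers into a
// RuntimeAgent. Session settings take precedence over Agent, which takes
// precedence over Harness.

#[derive(Clone, Default)]
struct ConfigLayer {
    model: Option<String>,
    prompt_addition: Option<String>,
    capabilities: Vec<String>,
}

#[derive(Debug, PartialEq)]
struct RuntimeAgent {
    model: String,
    system_prompt: String,
    capabilities: Vec<String>,
}

fn assemble(harness: ConfigLayer, agent: ConfigLayer, session: ConfigLayer) -> RuntimeAgent {
    let layers = [harness, agent, session];
    // Last tier to set a model wins (session > agent > harness).
    let model = layers.iter().rev()
        .find_map(|l| l.model.clone())
        .unwrap_or_else(|| "default-model".to_string());
    // Prompt additions are concatenated in tier order.
    let system_prompt = layers.iter()
        .filter_map(|l| l.prompt_addition.as_deref())
        .collect::<Vec<_>>()
        .join("\n");
    // Capabilities accumulate across tiers.
    let capabilities = layers.iter().flat_map(|l| l.capabilities.clone()).collect();
    RuntimeAgent { model, system_prompt, capabilities }
}

fn main() {
    let harness = ConfigLayer {
        model: Some("base-model".into()),
        prompt_addition: Some("You are helpful.".into()),
        capabilities: vec!["files".into()],
    };
    let session = ConfigLayer { model: Some("fast-model".into()), ..Default::default() };
    let rt = assemble(harness, ConfigLayer::default(), session);
    println!("{rt:?}"); // session's model overrides the harness default
}
```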
Prompt Layering
System prompts are composed in a specific order. All prompt sections are wrapped in XML tags for clear boundaries. See the XML Prompt Formatting spec for details.
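XML-wrapped prompt composition can be sketched as follows. The tag names used here are made up for illustration; the actual tags and ordering are defined in the XML Prompt Formatting spec.

```rust
// Hypothetical sketch: each prompt section is wrapped in a named XML tag
// so section boundaries stay unambiguous, then sections are joined in a
// fixed order.
fn wrap_section(tag: &str, body: &str) -> String {
    format!("<{tag}>\n{body}\n</{tag}>")
}

fn compose_prompt(sections: &[(&str, &str)]) -> String {
    sections.iter()
        .map(|(tag, body)| wrap_section(tag, body))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    // Illustrative section names only.
    let prompt = compose_prompt(&[
        ("harness", "Base behavior."),
        ("agent", "Domain customization."),
        ("session", "Runtime overrides."),
    ]);
    println!("{prompt}");
}
```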
Data Flow
A typical agent execution flows from the REST API into the control plane, which dispatches the turn to a worker over gRPC; the worker runs the reason-act loop and emits events that are persisted and streamed back to the client via SSE.

Event Streaming
All conversation data is stored as an append-only event log. Messages are reconstructed from events at read time:

- Events are the source of truth (immutable, sequenced)
- Messages are derived views (not stored separately)
- SSE delivers events in real-time to connected clients
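Deriving messages from the event log can be sketched as below. The event variants are illustrative, not Everruns' actual event schema; the point is that messages are a pure function of the ordered event sequence and are never stored separately.

```rust
// Hypothetical sketch: replay an append-only event log into derived
// Message views.

#[derive(Clone)]
enum Event {
    MessageStarted { role: &'static str },
    TextDelta { text: &'static str },
    MessageCompleted,
}

#[derive(Debug, PartialEq)]
struct Message {
    role: String,
    content: String,
}

// Fold the immutable, sequenced events into messages.
fn replay(events: &[Event]) -> Vec<Message> {
    let mut messages = Vec::new();
    let mut current: Option<Message> = None;
    for event in events {
        match event {
            Event::MessageStarted { role } => {
                current = Some(Message { role: role.to_string(), content: String::new() });
            }
            Event::TextDelta { text } => {
                if let Some(m) = current.as_mut() {
                    m.content.push_str(text);
                }
            }
            Event::MessageCompleted => {
                if let Some(m) = current.take() {
                    messages.push(m);
                }
            }
        }
    }
    messages
}

fn main() {
    let log = [
        Event::MessageStarted { role: "assistant" },
        Event::TextDelta { text: "Hello, " },
        Event::TextDelta { text: "world." },
        Event::MessageCompleted,
    ];
    println!("{:?}", replay(&log));
}
```

The same fold works incrementally, which is why SSE clients can build the conversation view from the live event stream.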
Development Modes
Everruns supports two deployment modes:

DEV_MODE (In-Memory)
- No PostgreSQL required
- In-process execution (no separate workers)
- Data lost on restart
- Ideal for rapid development and testing
Full Mode (Production)
- PostgreSQL-backed persistence
- Separate worker processes
- Durable workflows and events
- Horizontal scalability
Observability
Built-in observability via OpenTelemetry:

- Distributed tracing — Spans for workflows, activities, LLM calls
- Gen-AI conventions — Semantic attributes for LLM operations
- Event listeners — Pluggable observability backends
- Jaeger integration — Local trace visualization

LLM spans capture:

- Token usage (prompt, completion, total)
- Model information
- Latency and finish reasons
- Tool calls and results
Next Steps
Harnesses
Learn about harness types and configuration
Agents
Understand agent configuration and capabilities
Sessions
Explore session lifecycle and management
Capabilities
Discover the capability system