Alpha · Early Access

Orchestrator for self-hosted AI agents

100% private · fully offline

Hierarchical orchestration with dynamic task planning, persistent memory with semantic retrieval, and a closed-loop agent behavioral correction system.

Soul Studio in action — demo coming soon

Core Capabilities

Architecture, not a wrapper

01 · HTN Planning

Advanced agents on top of local models

Each agent is an autonomous computational module with encapsulated state, a specialized system prompt and an isolated tool space. The Archangel is an HTN planner: it builds a graph of sub-tasks with explicit dependencies, allocates agents by specialization and coordinates them through a shared state bus.
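As a rough illustration of the HTN idea described above, the sketch below expands a compound task into primitive sub-tasks and assigns each one to an agent by specialization. The task names, the `METHODS` table and the `AGENT_FOR` mapping are hypothetical and are not taken from Soul Studio; a real planner would also track dependencies and shared state.

```python
# Hypothetical method table: compound task -> ordered sub-tasks.
METHODS = {
    "build_landing_page": ["write_copy", "design_layout", "implement_page"],
    "implement_page": ["generate_html", "run_checks"],
}

# Hypothetical mapping from primitive task to agent specialization.
AGENT_FOR = {
    "write_copy": "writer",
    "design_layout": "designer",
    "generate_html": "coder",
    "run_checks": "tester",
}


def decompose(task: str) -> list[str]:
    """Recursively expand compound tasks until only primitive tasks remain."""
    if task not in METHODS:
        return [task]  # primitive task: no further decomposition
    plan: list[str] = []
    for sub_task in METHODS[task]:
        plan.extend(decompose(sub_task))
    return plan


plan = decompose("build_landing_page")
assignments = [(task, AGENT_FOR[task]) for task in plan]
```

Here `plan` lists the primitive tasks in execution order and `assignments` pairs each of them with the agent type that would handle it.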

02 · Parallel Execution

Higher efficiency from local models

Three orthogonal mechanisms: (i) context-window management via semantic retrieval, where at inference time the model receives a sample of stored fragments weighted by relevance score; (ii) parallel execution of DAG nodes with synchronization at dependency barriers; (iii) adaptive task routing by latency, quality and VRAM profile.

03 · ANN Retrieval

Context beyond the inference window: the agent keeps what it has learned

Four memory layers: working (the inference window), episodic (extractive summarization), semantic (a vector DB queried by ANN search, scored by cosine similarity with temporal decay) and procedural (behavioral patterns). Retrieval is treated as a ranking task: select the k most relevant fragments that fit within a token budget.
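The ranking step can be sketched as follows. This is a minimal illustration under stated assumptions, not Soul Studio's implementation: it uses a brute-force scan instead of an ANN index, the `Fragment` type is hypothetical, and the exponential decay with a 72-hour half-life is an assumed form of "temporal decay".

```python
import math
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    embedding: list[float]
    age_hours: float
    tokens: int


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(query: list[float], frag: Fragment, half_life_hours: float = 72.0) -> float:
    """Similarity scaled by an assumed exponential decay on fragment age."""
    decay = 0.5 ** (frag.age_hours / half_life_hours)
    return cosine(query, frag.embedding) * decay


def retrieve(query: list[float], store: list[Fragment], k: int, token_budget: int) -> list[Fragment]:
    """Pick up to k fragments in score order, skipping any that overflow the budget."""
    ranked = sorted(store, key=lambda f: score(query, f), reverse=True)
    picked: list[Fragment] = []
    used = 0
    for frag in ranked:
        if len(picked) == k:
            break
        if used + frag.tokens <= token_budget:
            picked.append(frag)
            used += frag.tokens
    return picked


store = [
    Fragment("fresh", [1.0, 0.0], age_hours=0, tokens=50),
    Fragment("old", [1.0, 0.0], age_hours=72, tokens=30),
    Fragment("big", [0.8, 0.6], age_hours=0, tokens=200),
    Fragment("off-topic", [0.0, 1.0], age_hours=0, tokens=10),
]
selected = retrieve([1.0, 0.0], store, k=2, token_budget=100)
```

In this example the "big" fragment ranks second by score but is skipped because it does not fit the remaining budget, so the older fragment is selected instead.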

04 · Real-time Observability

Productivity monitoring for local models

Real-time observability stack: task completion rate, confidence calibration score, tool invocation success ratio, context utilization efficiency, VRAM·time. Hallucination detection via cross-validation between agents and consistency checking on factual claims.

05 · Closed-loop Control

Live behavioral correction algorithm

The behavioral monitor runs as a closed control loop: it computes a topic drift score, an uncertainty expression rate and a task alignment index. On a threshold breach it modifies the system prompt at runtime, rebuilds the retrieval strategy and changes the tool access scope. If it detects a reasoning loop, it forces a context reset.
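One way to picture the control loop is a function that maps monitored metrics to corrective actions. Everything below is an assumption made for illustration: the threshold values, the action names and the pairing of each metric with a specific action are hypothetical, since the page does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    topic_drift: float       # 0..1, higher means further off topic
    uncertainty_rate: float  # share of statements that express uncertainty
    task_alignment: float    # 0..1, higher means better aligned with the task
    loop_detected: bool = False


# Hypothetical thresholds, chosen only for the example.
THRESHOLDS = {"topic_drift": 0.6, "uncertainty_rate": 0.4, "task_alignment": 0.5}


def corrections(m: Metrics) -> list[str]:
    """Return the corrective actions to apply for one monitoring tick."""
    if m.loop_detected:
        return ["reset_context"]  # a reasoning loop overrides everything else
    actions: list[str] = []
    if m.topic_drift > THRESHOLDS["topic_drift"]:
        actions.append("rewrite_system_prompt")
    if m.uncertainty_rate > THRESHOLDS["uncertainty_rate"]:
        actions.append("rebuild_retrieval_strategy")
    if m.task_alignment < THRESHOLDS["task_alignment"]:
        actions.append("narrow_tool_scope")
    return actions


healthy = corrections(Metrics(0.1, 0.1, 0.9))
drifting = corrections(Metrics(0.8, 0.1, 0.3))
looping = corrections(Metrics(0.8, 0.9, 0.1, loop_detected=True))
```

Keeping the decision logic as a pure function of the metrics makes the thresholds easy to tune and test separately from the agent runtime.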

06 · Air-gap Architecture

100% offline — full data privacy

The entire inference stack, retrieval, tool calls and inter-agent coordination are confined to localhost with no outbound connections. Telemetry is architecturally absent. Keeping all data on the machine simplifies GDPR, HIPAA and SOC 2 compliance, although an isolated runtime alone does not make a deployment compliant.

07 · DAG Scheduler

Agent orchestrator for complex multi-level projects

The scheduler builds a DAG with dependencies, priorities and resource constraints. Agents execute nodes in parallel; artifacts are passed through typed interfaces with schema verification. Online replanning: only the affected sub-tree is rebuilt, not the entire pipeline.
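The parallel execution model can be sketched with Python's standard `graphlib.TopologicalSorter`, which hands out nodes as soon as their dependencies are complete. The graph contents and the `run_node` stand-in are hypothetical; priorities, resource constraints and replanning are left out of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical task graph: node -> set of nodes it depends on.
GRAPH = {
    "schema": set(),
    "api": {"schema"},
    "ui": {"schema"},
    "tests": {"api", "ui"},
}


def run_node(name: str) -> str:
    """Stand-in for an agent executing one node of the graph."""
    return f"{name}:done"


def execute(graph: dict[str, set[str]]) -> list[list[str]]:
    """Run the graph in waves; each wave holds nodes whose dependencies are met."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves: list[list[str]] = []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            list(pool.map(run_node, ready))  # barrier: wait for the whole wave
            sorter.done(*ready)
            waves.append(ready)
    return waves


waves = execute(GRAPH)
```

Waiting for a full wave is the simplest form of a dependency barrier. A finer-grained scheduler would call `done()` per node as each future completes, so that downstream nodes start earlier.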

08 · Plugin Architecture

Integration with 40+ services and systems

A unified tool-calling interface on top of REST, GraphQL and WebSocket adapters. Each connector encapsulates authorization, rate-limit handling and retry with exponential backoff. New integrations are added without modifying the orchestrator core.
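Retry with exponential backoff, as mentioned above, can be sketched in a few lines. The function name, the default delays and the choice of `ConnectionError` as the retryable error are assumptions for illustration; a real connector would also honor server-provided rate-limit hints.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on ConnectionError with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out retries from concurrent callers.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the retry schedule can be tested without actually waiting.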

09 · Local SD/FLUX

Built-in unlimited image generator

Images are generated locally with Stable Diffusion or FLUX models, exposed as a first-class tool that agents call programmatically. When LLM inference and image generation run in parallel, VRAM is managed by dynamically offloading weights between GPU and CPU.

10 · Whisper / Vocoder

Full voice communication — agents hear and speak

STT uses Whisper-compatible architectures with local inference; TTS uses low-latency neural vocoder models. Transcribed voice input is parsed into a structured intent. Response synthesis is asynchronous: audio is streamed as it is generated.

11 · Multi-model Cascade

Vision for any model — even small ones

A cascaded multi-model pipeline: small models analyze image structure, and the results are aggregated into a structured text description for a text-only LLM agent. This provides vision capabilities without a multimodal LLM in the main stack, which substantially lowers the VRAM requirement.

12 · Model Catalog

Large library of local models with specialization search

A catalog with metadata: benchmarks (MMLU, HumanEval, GSM8K, MATH) per task, VRAM at different quantization levels (Q4_K_M, Q5_K_M, Q8_0, F16), throughput on reference hardware. Search matches the task profile to model characteristics within resource constraints.
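To show how VRAM needs vary across the quantization levels listed above, the sketch below estimates the size of the model weights from the parameter count. The bits-per-weight figures are approximate values commonly quoted for llama.cpp formats and should be treated as assumptions, as should the fixed 1.5 GB overhead for KV cache and runtime buffers.

```python
# Approximate bits per weight for common llama.cpp formats (assumed values).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q8_0": 8.5, "F16": 16.0}


def weight_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fits(params_billion: float, vram_gb: float, overhead_gb: float = 1.5) -> list[str]:
    """Quantization levels whose weights plus an assumed overhead fit in VRAM."""
    return [
        quant
        for quant in BITS_PER_WEIGHT
        if weight_gb(params_billion, quant) + overhead_gb <= vram_gb
    ]


options_7b_on_8gb = fits(7, vram_gb=8)
```

Real memory use also depends on context length and batch size, so an estimate like this is only a first-pass filter before measured figures are consulted.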

13 · Unified LLM Layer

Full cloud-model support — Gemini, Anthropic, OpenAI, Grok, DeepSeek

A unified LLM abstraction layer: a single interface for local and cloud providers. The orchestrator routes tasks by a multi-criteria function: cost, latency, capability profiles and privacy constraints. Routing is configured declaratively.
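A multi-criteria routing function of this kind might look like the sketch below. The `Provider` type, the provider entries, their numbers and the weights are all hypothetical placeholders, not measurements or Soul Studio configuration. Privacy is treated as a hard constraint, and the remaining criteria are combined into a weighted score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    local: bool
    cost: float        # relative cost per request (placeholder scale)
    latency_ms: float
    capability: float  # 0..1, placeholder capability score


# Placeholder entries for illustration only.
PROVIDERS = [
    Provider("local-small", local=True, cost=0.0, latency_ms=900.0, capability=0.6),
    Provider("cloud-large", local=False, cost=1.0, latency_ms=400.0, capability=0.9),
]

WEIGHTS = {"capability": 0.5, "cost": 0.3, "latency": 0.2}


def route(providers: list[Provider], *, private: bool, min_capability: float) -> Provider:
    """Filter by hard constraints, then pick the best weighted score."""
    candidates = [
        p for p in providers
        if p.capability >= min_capability and (p.local or not private)
    ]
    if not candidates:
        raise LookupError("no provider satisfies the constraints")
    max_cost = max(p.cost for p in candidates) or 1.0
    max_latency = max(p.latency_ms for p in candidates) or 1.0

    def score(p: Provider) -> float:
        return (
            WEIGHTS["capability"] * p.capability
            - WEIGHTS["cost"] * p.cost / max_cost
            - WEIGHTS["latency"] * p.latency_ms / max_latency
        )

    return max(candidates, key=score)


private_choice = route(PROVIDERS, private=True, min_capability=0.5)
demanding_choice = route(PROVIDERS, private=False, min_capability=0.8)
```

A private task never reaches a cloud provider regardless of score, which is why the filter runs before the scoring step.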

Use Cases

What people build on Soul Studio

01
Web Development

Advanced websites and landing pages

The Code agent operates at the architecture level: component graph, API contracts, dependency schema, ADRs in long-term memory. It iterates autonomously on the results of static analysis, test coverage and performance metrics.

02
Rapid Prototyping

Prototyping — an ideal fit for startups

Full stack: DB schema, migrations, API layer with validation, auth, UI and infrastructure configs. Generated code includes exception handling, structured logging and a baseline security model.

03
Game Dev

Game development

The Game agent works with ECS patterns, behavioral state machines and event-driven systems. It generates game logic, NPC behavior trees and procedural generation. Pairs with the image generator and TTS.

04
Workflow Automation

Workflow management and automation

The Workflow agent builds event-driven pipelines with branching, exception handling and rollback. Behavior adapts based on execution history in procedural memory — parameterized logic that learns from precedents.
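Rollback in a multi-step pipeline is often implemented with compensating actions, where each completed step registers how to undo itself. The sketch below assumes that approach; the step names and the `run_pipeline` helper are hypothetical and the page does not state which rollback mechanism is actually used.

```python
from typing import Callable

# A step is (name, action, compensation).
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_pipeline(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    log: list[str] = []
    completed: list[tuple[str, Callable[[], None]]] = []
    for name, action, undo in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, done_undo in reversed(completed):
                done_undo()
                log.append(f"rolled_back:{done_name}")
            break
        log.append(f"done:{name}")
        completed.append((name, undo))
    return log


def fail() -> None:
    raise RuntimeError("CRM unavailable")


log = run_pipeline([
    ("extract", lambda: None, lambda: None),
    ("update_crm", fail, lambda: None),
    ("notify", lambda: None, lambda: None),
])
```

In the example the second step fails, so the first step is compensated and the third never runs.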

05
Data Processing

Data monitoring and processing

The agent acts as a continuous consumer: parses sources on a schedule or event trigger, normalizes to a target schema, detects anomalies (z-score, IQR, isolation forest) and generates fully contextual alerts.
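Two of the anomaly checks named above, z-score and IQR, can be sketched with the standard library alone. The thresholds (3 standard deviations and 1.5 × IQR) are conventional defaults and the sample readings are made up for illustration; isolation forest is omitted because it needs a third-party library.

```python
import statistics


def zscore_outliers(values: list[float], threshold: float = 3.0) -> list[float]:
    """Values more than `threshold` population standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


def iqr_outliers(values: list[float], k: float = 1.5) -> list[float]:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sample readings with one obvious spike.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 10, 9, 12, 10, 11, 10, 9, 10, 11, 95]
z_flags = zscore_outliers(readings)
iqr_flags = iqr_outliers(readings)
```

Both methods flag the spike here. The z-score test is sensitive to sample size, since a single extreme value inflates the standard deviation, which is one reason to run more than one detector.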

06
Integrations

Integrations: Gmail, Google Calendar, Monday, AirTable, Notion...

Semantically consistent transactions across multiple services: inbound document → extraction → cross-reference with CRM → status update → task → notification. Atomic from a business-logic standpoint.

07
Research & Writing

Writing books and research papers

The Research agent builds a knowledge graph from a source corpus, tracks claim consistency and stores a stylistic profile in long-term memory. It generates bibliographies in any citation format.

08
Web Scraping

Website scanning — data extraction and structuring

Playwright backend: JS rendering, pagination traversal, dynamic content. Deduplication by content hash, normalization to a target schema. Horizontal scaling via parallel browser sessions.
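Deduplication by content hash can be sketched as below. The normalization rule (collapse whitespace, lowercase) and the `body` field name are assumptions for the example; what counts as "the same content" depends on the target schema.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 of the text after collapsing whitespace and lowercasing."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(pages: list[dict]) -> list[dict]:
    """Keep the first page seen for each distinct content hash."""
    seen: set[str] = set()
    unique: list[dict] = []
    for page in pages:
        digest = content_hash(page["body"])
        if digest not in seen:
            seen.add(digest)
            unique.append(page)
    return unique


pages = [
    {"url": "https://example.com/a", "body": "Hello   World"},
    {"url": "https://example.com/a?ref=1", "body": "hello world"},
    {"url": "https://example.com/b", "body": "Something else"},
]
unique_pages = deduplicate(pages)
```

Hashing the content rather than the URL catches the common case where the same page is reachable under several addresses.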

09
Marketing

Building and launching ad campaigns

The agent generates content variants by audience parameters, an A/B hypothesis matrix and creative packages tailored to placement formats. It then analyzes performance metrics and proposes iterations based on statistical significance.

10
Music Generation

Music composition

Generation of MIDI sequences, harmonic progressions, melodic lines and arrangements through specialized generative models. The agent keeps musical context (key, meter, thematic material) and iterates as a co-author.

Versions

Three versions. One studio.

02 · Coming Soon

Soul Studio
Cloud

Same power — through the browser

Identical orchestration architecture with inference on managed infrastructure. API-compatible with Local: agent configurations port over without modification.

  • Any device
  • Browser
  • Team collaboration
  • Multi-user workspaces
In development
03 · Lightweight

Soul Studio
Mini

Same architecture — for low-end hardware

Orchestration architecture optimized for CPU inference. 1B–4B models with aggressive Q4 quantization. Parity on memory architecture and behavioral correction.

  • CPU inference
  • 8 GB RAM
  • Q4 quantization
  • Fully offline
In development

Vision

An operating system for AI work

The modern landscape of LLM-based automation systems is defined by a fundamental structural contradiction: although the quality of local open-source models is high enough to solve a significant share of production tasks, the infrastructure layer that would orchestrate them effectively is missing.

Specialized 7B-parameter models can outperform general-purpose commercial systems on their target tasks, and quantized 34B-class models are competitive with leading commercial offerings on a large portion of standard benchmarks.

Soul Studio is that infrastructure layer.

01

Orchestration as the determinant of system intelligence

A multi-agent system with a correctly implemented planner can outperform a single monolithic model many times larger, through execution parallelism and per-task agent specialization.

02

Persistent hierarchical memory as a precondition for agentic behavior

A system without long-term memory with semantic retrieval is a stateless function, not an autonomous agent. Accumulating institutional knowledge through episodic and procedural memory is the key condition for an agent to grow more effective over time.

03

Local inference as an architectural advantage, not a compromise

A deterministic execution environment, no network round trips on local tool calls, no rate limits and full control over the runtime have value in their own right, independent of any privacy considerations.

Roadmap

Where Soul Studio is heading

Each phase is a complete functional layer, not an interim state. We build bottom-up: first a production-ready core, then vertical specializations and an open ecosystem. No phase starts before the previous one stabilizes.

01 · Current phase
Live · Alpha

Foundation

Production-ready orchestrator, 40+ adapters, Local + Cloud

02 · Phase 2
Q3 2025

Vertical Agents

Vertical agent packages: Legal, Healthcare, FinTech, eCommerce

03 · Phase 3
Q1 2026

Ecosystem

SDK for custom agents, marketplace of specializations

04 · Phase 4
2026+

Federation

Distributed agent networks, node federation, P2P coordination

Team

Practitioners, not theorists

Soul Studio is built by a team of practicing engineers who have accumulated systematic experience with existing solutions in the agentic systems space — LangChain, AutoGPT, LM Studio, Open WebUI — and concluded that they are fundamentally limited for production use.

None of the existing solutions delivers all at once: production-ready orchestration of many specialized agents, hierarchical persistent memory with semantic retrieval, a closed-loop behavioral correction layer and full execution-environment isolation. Soul Studio is built as the answer to this combined set of requirements — not as an incremental improvement.

40+ tool adapters
4 hierarchical memory layers
100% local inference

  • No venture funding
  • Decisions driven by engineering rationale
  • Used in production every day

Plans

No hidden conditions

Upgrade

$39 one-time

Upgrade to the current version for existing users

  • Update to the latest version
  • All new agents and tools
  • Configurations preserved
  • Memory data migration

Subscription

$19 per month

Continuous updates and full access to all Soul products

  • Soul Studio Local + Cloud + Mini
  • All updates across all versions
  • Early access to new features
  • Test utilities and beta releases
  • Priority support channel

Refunds: 14 days, no questions asked