Self-Hosted AI Agent Platforms 2026: CISO & Regulated Buyer Guide

Last updated May 2026

The multi-tenant cloud model that won enterprise software in the 2010s is losing regulated buyers in 2026. Banks under DORA, healthcare organizations under eHealth data-residency rules, manufacturers with export-controlled IP, and EU public-sector entities under national sovereignty requirements are all asking the same question: which agentic AI platforms can we actually run inside our own perimeter?

This is not a fringe concern. The EU AI Act (Regulation 2024/1689), DORA (Regulation 2022/2554), and NIS2 (Directive 2022/2555) collectively create a documentation and audit-trail obligation that is much easier to satisfy when the organization owns the infrastructure stack. Add the CLOUD Act exposure of US-headquartered providers and the data-sovereignty requirements of specific sectors (financial supervisory data, healthcare patient records, defence-adjacent IP), and self-hosted agentic platforms shift from a nice-to-have to a procurement filter.

This guide is written for CISOs, Chief Compliance Officers, and IT procurement leads evaluating self-hostable agentic AI platforms. We are honest about the TCO of self-hosting — it is not free and it is not simple. We are also honest about which platforms genuinely support it versus which ones describe "private VPC" as self-hosting.

What "self-hosted" means in this guide

We use three tiers, consistent with the framing in our guide to sovereign agentic AI platforms 2026:

Tier A — VPC or private cloud. The vendor manages the control plane; workloads run in a customer-owned VPC or private cloud. Data does not leave the customer's network boundary, but the customer depends on the vendor's control plane for updates, configuration, and licensing.

Tier B — Customer-operated infrastructure. The platform runs entirely on customer-operated infrastructure (on-premises servers, customer-owned VMs). The vendor provides software licenses and support. The customer owns the runtime, the keys, and the audit trail. No dependency on the vendor's cloud.

Tier C — Air-gapped. As Tier B, plus: no network connectivity to vendor infrastructure. Model weights and all dependencies are delivered by physical media or isolated transfer. Required for security-classified and defence-adjacent workloads.

Buyers under DORA Article 30 (key contractual provisions, including data location and audit access) typically need Tier B as a minimum for core workloads. EU public-sector buyers under national sovereignty guidance may need Tier C.

The honest TCO picture

Self-hosting is not cheap. Before evaluating vendors, procurement leads should budget:

Infrastructure. GPU compute for model inference (H100 or equivalent: €25,000–€60,000 per unit as of Q1 2026 if purchased; €3–8/hour if rented from EU-resident providers like Hetzner or OVHcloud). Storage for model weights (70B parameter models require ~140 GB at fp16; 405B models require ~810 GB). Redundancy and backup infrastructure.
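The storage figures above follow from a simple bytes-per-parameter calculation. A minimal sketch (the function name is illustrative):

```python
def model_storage_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate on-disk size of model weights.

    fp16/bf16 = 2 bytes per parameter; int8 = 1; int4 = 0.5.
    params_billion * 1e9 params * bytes_per_param / 1e9 bytes-per-GB
    simplifies to params_billion * bytes_per_param. Ignores tokenizer
    files and serialization overhead (small in comparison).
    """
    return params_billion * bytes_per_param

# 70B at fp16 → 140 GB; 405B at fp16 → 810 GB; 70B at int4 → 35 GB
```

Quantized variants shrink proportionally, which is why int4 builds of 70B models fit on a single 80 GB GPU.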

Engineering capacity. A minimal self-hosted agentic platform requires at least one platform engineer for installation, configuration, and ongoing maintenance. Realistically two for a production deployment with SLA requirements. Budget 0.5–1.0 FTE ongoing for patching, upgrades, and incident response.

Model licensing. Open-weight models (Llama 3.x, Mistral, Falcon) permit commercial use but carry conditions: the Llama community license, for example, requires a separate grant from Meta above a 700 million monthly-active-user threshold. Proprietary models with on-premises licenses (Aleph Alpha PhariaAI, Cohere Command) have per-seat or per-token licensing costs.

Operational overhead. Monitoring, alerting, capacity planning, security scanning. Plan for 15–20% of initial build effort as annual operational overhead.

Buyers who cannot commit this capacity should evaluate managed options before committing to self-hosting. Choosing managed SaaS is not a weakness: the compliance burden of operating an under-resourced self-hosted platform can exceed the compliance burden of using a well-governed managed platform with strong contractual protections.

Platforms reviewed

n8n — fair-code, genuinely self-hostable

n8n is the most widely deployed self-hostable workflow automation platform with agentic capabilities. It is fair-code rather than OSI-approved open source: the core is distributed under the Sustainable Use License, free to self-host for internal business use, with enterprise features under a separate commercial license. The platform ships with a visual workflow builder, an MCP node library, AI agent nodes, and a community library of thousands of workflow templates.

For agentic workflows, n8n supports multi-step agent loops, tool calling, and integration with open-weight models via Ollama or custom HTTP nodes. It is not natively designed as a fleet console for multiple concurrent agents — that framing is closer to Knowlee or CrewAI Enterprise — but individual agent workflows can be built and run reliably.

Self-host tier: Tier B. Runs on any Linux VM; Docker or Kubernetes deployment documented. No dependency on n8n cloud for the self-hosted version.

Strengths. Source-available fair-code core. Massive template library. No per-seat license for the self-hosted community edition. Strong community. Good fit for engineering-driven orgs that want composability over out-of-the-box completeness.

Trade-offs. No native AI Act-shaped governance metadata at the registry level. Audit trail is per-workflow, not fleet-level. Compliance metadata (risk classification, data categories, human oversight) must be added as custom workflow fields. Multi-agent fleet console is not a first-class feature.

TCO note. n8n self-hosted is infrastructure cost + engineering time only. No software license fee.

deepset Haystack Enterprise — on-premises option for NLP pipelines

deepset is a Berlin-based AI company; Haystack is its open-source NLP and RAG framework. Haystack Enterprise adds managed deployment options, enterprise support, and a cloud UI — but the core pipeline framework is deployable on-premises or in a customer VPC.

Haystack's agentic capabilities center on pipeline composition: retrieval, generation, tool calling, and multi-step reasoning built from composable components. It is strong for document-heavy workflows (contract intelligence, regulatory review, knowledge-base-driven agents) and weaker for sales or marketing agent fleets.

Self-host tier: Tier A/B depending on configuration. deepset Cloud is the managed option; Haystack framework deploys on-prem.

Strengths. EU legal entity (Germany). Strong NLP and RAG depth. Good fit for legal, compliance, and document-processing workflows. Active open-source community.

Trade-offs. Framework-level, not a fleet console. No native kanban or jobs registry. Governance metadata must be added by the implementing team.

LightOn Paradigm — private deployment, French legal entity

LightOn's Paradigm platform supports private deployment of LLMs for enterprise, with a strong story for French public-sector buyers. On-premises and private-cloud deployment; French legal entity; ANSSI relationships.

Self-host tier: Tier B/C (on-prem and air-gapped options discussed with enterprise buyers).

Strengths. French EU legal entity. Private and air-gapped deployment options. Strong for French-language and multilingual workloads. Good for French public sector under national sovereignty requirements.

Trade-offs. Foundation model and inference layer, not an agentic fleet OS. Orchestration must be built on top. Smaller ecosystem than the hyperscalers.

Aleph Alpha — air-gapped PhariaAI

Aleph Alpha's PhariaAI platform has the most mature air-gapped story of any EU AI company. The platform has been deployed in classified environments for German government use cases. Enterprise licensing for on-premises and air-gapped deployment is available; the legal entity is German.

Self-host tier: Tier B/C.

Strengths. Genuine air-gap capability. German legal entity. Mature enterprise and public-sector deployment track record. Highest-quality EU-native foundation model for German-language and multilingual tasks.

Trade-offs. High implementation complexity and cost for air-gapped deployments. Not an agentic fleet OS — orchestration is the customer's responsibility. Model licensing is proprietary.

GLBNXT — sovereign multi-cloud

GLBNXT positions around sovereign multi-cloud: the ability to run workloads across multiple EU-resident cloud providers without CLOUD Act exposure. Dutch legal entity; financial services and healthcare focus.

Self-host tier: Tier A/B.

Strengths. EU legal entity. Multi-cloud flexibility avoids single-provider lock-in. Financial services and healthcare expertise.

Trade-offs. Agentic platform maturity less developed than Aleph Alpha or Haystack for specific NLP tasks. Fleet orchestration not the primary product surface.

Cohere Coral — private VPC and on-premises

Cohere provides the Command R and Command A model families with private VPC and on-premises deployment options. Self-hosted model licensing is available to enterprise buyers. Cohere's architecture allows the model weights to be delivered inside the customer perimeter; inference runs locally.

Self-host tier: Tier A/B.

Strengths. Strong multilingual models. Genuine private-deployment story. Good fit for regulated industries willing to invest in local inference hardware.

Trade-offs. Parent company is North American; buyers under strict CLOUD Act exclusion should verify legal entity at contract level. Orchestration layer is the buyer's responsibility.

Knowlee — self-hostable agentic orchestration OS

Knowlee is designed to be deployed on EU-resident or customer-owned infrastructure. The platform runs on any Linux server (Docker or bare Node.js deployment); the operator owns the file system, the Neo4j graph database, the Supabase instance, and the audit trail. There is no mandatory dependency on Knowlee cloud infrastructure for production operation.

The governance model is built for self-hosted regulated deployment: every job in the registry carries risk_level, data_categories, human_oversight_required, approved_by, and approved_at fields. Every run lands in state/jobs/logs/ with structured outputs. The audit layer surfaces any unapproved run of a flagged job. This is the AI Act documentation requirement in operational form, not a dashboard retrofit.
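The audit check described above can be expressed in a few lines. The field names (risk_level, human_oversight_required, approved_by, approved_at) are taken from the registry description; the JSON file layout under state/jobs/logs/ and the function itself are illustrative assumptions, not Knowlee's actual implementation:

```python
import json
from pathlib import Path

def unapproved_runs(log_dir: str) -> list[dict]:
    """Flag any run of an oversight-flagged job that lacks an approval record.

    Hypothetical sketch: assumes one JSON document per run, each embedding
    the governance fields of the job it executed.
    """
    flagged = []
    for log_file in sorted(Path(log_dir).glob("*.json")):
        run = json.loads(log_file.read_text())
        job = run.get("job", {})
        if job.get("human_oversight_required") and not (
            job.get("approved_by") and job.get("approved_at")
        ):
            flagged.append({"run": log_file.name, "risk_level": job.get("risk_level")})
    return flagged
```

The point is that the approval state lives in the same structured record as the run output, so the audit query is a file scan rather than a log-forensics exercise.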

The foundation model layer is configurable: Knowlee can route to self-hosted open-weight models (via Ollama, vLLM, or direct API), EU-sovereign models (Aleph Alpha, LightOn), or managed model APIs. The orchestration layer does not require a specific model provider.
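Backend-agnostic routing of this kind typically reduces to building the right request shape per inference server. The endpoint paths below follow the public Ollama and OpenAI-compatible (vLLM) HTTP APIs; the BACKENDS table and function are a hypothetical sketch, not Knowlee's configuration format:

```python
# Hypothetical routing table: localhost URLs and the "vllm"/"ollama" keys
# are assumptions for illustration.
BACKENDS = {
    "ollama": {"url": "http://localhost:11434/api/generate"},
    "vllm":   {"url": "http://localhost:8000/v1/chat/completions"},
}

def build_request(backend: str, model: str, prompt: str) -> dict:
    """Build the HTTP request for the configured inference backend."""
    url = BACKENDS[backend]["url"]
    if backend == "ollama":
        # Ollama's native generate API takes a flat prompt string.
        payload = {"model": model, "prompt": prompt, "stream": False}
    else:
        # OpenAI-compatible servers (vLLM, and most managed APIs) take messages.
        payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return {"url": url, "json": payload}
```

Because the orchestration layer only depends on this request shape, swapping a managed API for a self-hosted open-weight model is a configuration change, not a re-architecture.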

Self-host tier: Tier B. Tier C possible with offline model hosting; contact for details.

Strengths. EU legal entity. Genuinely self-hostable: no mandatory Knowlee cloud dependency. AI Act-shaped governance as first-class data model. Multi-vertical fleet console (4Sales, 4Talents, 4Legals, 4Marketing). Cross-agent Neo4j memory compounds across runs. See agentic orchestration for the category context.

Trade-offs. More operator overhead than managed SaaS. Configuration requires platform engineering capacity. Not designed for teams that want a no-code builder with zero infrastructure responsibility.

Pricing. Self-hosted single-vertical engagements start in the low-five-figure euro range annually. Contact for multi-vertical and enterprise terms.

CrewAI Enterprise — open-source roots, self-hostable

CrewAI started as an open-source multi-agent framework. CrewAI Enterprise adds a management UI, observability tooling, and managed deployment — but the core framework is self-hostable on customer infrastructure. Strong developer ergonomics; role-based agent crews are the native primitive.

Self-host tier: Tier B.

Strengths. Open-source transparency on the agent runtime. Self-hostable. Active developer community. Good fit for engineering organizations that want composable multi-agent systems.

Trade-offs. AI Act-shaped governance metadata is not a first-class feature — compliance fields must be added by the implementing team. Fleet console is less mature than Knowlee or the enterprise platform alternatives. Enterprise tier is newer than the framework.

Comparison matrix

Platform | Legal entity | Self-host tier | AI Act governance native | Fleet console | Foundation model layer
---|---|---|---|---|---
n8n | German GmbH | Tier B | No (custom) | Partial | Any (via nodes)
deepset Haystack Enterprise | German GmbH | Tier A/B | No (custom) | No | Any
LightOn Paradigm | French SAS | Tier B/C | Not disclosed | No | LightOn models
Aleph Alpha PhariaAI | German GmbH | Tier B/C | Partial | No | PhariaAI
GLBNXT | Dutch | Tier A/B | Not disclosed | Partial | Various
Cohere Coral | Canadian (verify) | Tier A/B | Partial | No | Cohere Command
Knowlee | EU | Tier B | Yes, native fields | Yes (kanban + registry) | Any (configurable)
CrewAI Enterprise | US | Tier B | Not disclosed | Partial | Any

Self-hosting decision checklist

Before committing to self-hosting, answer:

  1. Do you have 0.5–1.0 FTE of platform engineering capacity to allocate to ongoing operations?
  2. Do you have EU-resident GPU compute budget for local model inference?
  3. Does your compliance requirement mandate Tier B or Tier C, or would a well-governed managed EU-entity platform satisfy procurement?
  4. Is your audit-trail requirement satisfied by platform-native governance fields, or do you need to prove control-plane isolation?
  5. What is your disaster-recovery plan for self-hosted model infrastructure?

If the answer to question 1 or 2 is no, reconsider managed options from EU-incorporated vendors before committing to self-hosting.

Frequently asked questions

Is self-hosting always required for AI Act compliance? No. The AI Act does not require on-premises deployment. It requires documentation, risk classification, human oversight, and audit trails. A well-governed managed platform from an EU-incorporated entity can satisfy AI Act obligations. Self-hosting is required when DORA, national sovereignty rules, or sector-specific data-residency requirements apply on top of AI Act.

Can I use open-weight models (Llama, Mistral) in a self-hosted agentic platform? Yes. Most platforms in this guide support open-weight models via Ollama, vLLM, or direct API. Knowlee, n8n, Haystack, and CrewAI all support pluggable model backends. Verify the model license (Llama community license, Apache 2.0, MIT) against your commercial use requirements.

What is the minimum hardware for a production self-hosted deployment? For a 7–13B parameter model: a single A100 80 GB or H100 80 GB GPU is sufficient. For 70B models: 2–4 H100s. For orchestration-only (no local inference): any modern server with 16+ GB RAM. Storage: 2–10 TB NVMe depending on model size and log retention requirements.
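Those GPU counts follow from a rule-of-thumb calculation. A sketch, assuming fp16 weights and roughly 20% headroom for KV cache and activations (real requirements vary with batch size and context length):

```python
import math

def gpus_needed(params_billion: float, gpu_vram_gb: int = 80,
                bytes_per_param: float = 2.0, headroom: float = 1.2) -> int:
    """Rule-of-thumb GPU count for serving a model at a given precision.

    headroom covers KV cache and activations; 1.2 is a rough planning
    figure, not a guarantee for long-context or high-batch serving.
    """
    weights_gb = params_billion * bytes_per_param
    return math.ceil(weights_gb * headroom / gpu_vram_gb)

# 13B fp16 → 1× 80 GB GPU; 70B fp16 → 3× 80 GB; 70B int8 → 2× 80 GB
```

Under this estimate a 70B fp16 model lands at three 80 GB GPUs; the lower end of the 2–4 range assumes int8 quantization.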

Does self-hosting increase or decrease security risk? Both, depending on the organization. Self-hosting eliminates multi-tenant attack surface and third-party breach risk. It introduces internal attack surface and requires the organization to own patching, access controls, and incident response. Organizations with mature security operations typically find self-hosting reduces their overall risk posture; organizations without that capability may increase it.

How does Knowlee handle model licensing in self-hosted deployments? Knowlee is the orchestration layer; the foundation model is a separate component. Buyers select and license their own foundation model (EU-sovereign, open-weight, or proprietary) and configure Knowlee to route to it. There is no bundled model license.

Related reading