ONT Platform Architecture


Operator Native Thinking

The cluster is the documentation.

Not a representation of it. Not a mirror of it. The thing itself. With an API that makes the organization queryable. When every governance decision is a CRD, every policy a versioned resource, and every contract a reconciled object, the cluster holds organizational truth.

Open source repos: 7
Published schemas: 36
Alpha release: v1.9.3-alpha.1
Open source license: Apache 2
Infrastructure governance is the consequence of living documentation. Not the purpose.
ONT is a living documentation system first. Every document is a reconciling object. Every relationship is a versioned contract. The entire organizational truth is queryable through a Kubernetes API.
The Problem

Three failures every operator knows

The coordination failures that DevOps, SRE, and Platform Engineering named but could not structurally fix.

01

Documentation rots

The moment you publish it, it drifts. Every wiki page, every architecture diagram, every runbook is already describing a system that has moved on. No discipline solves a structural problem. ONT eliminates the representation entirely.

02

Operators are islands

Thousands of Kubernetes operators exist. Each team builds in their own dialect. No shared contract surface. No common language. No way to compose governance across domain boundaries. When the senior engineer leaves, the knowledge leaves.

03

AI has no substrate

AI in production requires semantic structure, causal memory, and an enforced approval boundary. Most platforms have none of these. Organizations introducing AI into unstructured environments are accelerating failure modes, not operations.

The ONT Answer

One structural change. Everything follows.

Kubernetes already gave us the right primitives: the why-what-how separation at infrastructure scale. Human intent through CRDs. Organizational memory in etcd. Automated execution through controllers. But it left the semantic layer incomplete.

ONT completes it. Domain as the boundary of responsibility. Operators as intellectual delegates. CRDs as versioned contracts. Lineage as the chain connecting every object to its governing authority.

The consequence: when contracts accumulate over time, expressed with precision and bounded by domain, they become the most honest training corpus a domain AI could ever learn from. Not hallucination. Inheritance.

🔍

Queryable

Ask the cluster what is true right now. Not what a document says. What the platform enforces.

📊

Diffable

Ask what changed, when, who changed it, and what governance event authorized the change.

📋

Auditable

Every state transition is timestamped, attributed to an actor, and written to the Guardian audit sink.

🔁

Living

The documentation cannot drift from the running system because it is the running system.
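
As a rough illustration of the Diffable and Auditable properties, the sketch below compares two revisions of a resource spec and attaches who, when, and which governance event to every changed field. The `AuditEntry` shape, the field names, and the sample values are assumptions made for this example; they are not ONT's audit schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical audit record: one changed field plus who, when, and why."""
    path: str
    old: Any
    new: Any
    actor: str
    timestamp: str
    governance_event: str


def diff_spec(old: dict, new: dict, prefix: str = "") -> list[tuple[str, Any, Any]]:
    """Return (path, old_value, new_value) for every leaf value that differs."""
    changes = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}.{key}" if prefix else key
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            changes.extend(diff_spec(a, b, path))  # recurse into nested maps
        elif a != b:
            changes.append((path, a, b))
    return changes


def audit(old: dict, new: dict, actor: str, timestamp: str, governance_event: str) -> list[AuditEntry]:
    """Attach attribution to each changed field."""
    return [
        AuditEntry(path, a, b, actor, timestamp, governance_event)
        for path, a, b in diff_spec(old, new)
    ]


# Illustrative values only.
before = {"talosVersion": "v1.7.0", "nodes": {"workers": 3}}
after = {"talosVersion": "v1.7.1", "nodes": {"workers": 3}}
entries = audit(before, after, actor="alice",
                timestamp="2024-01-01T00:00:00Z",
                governance_event="upgrade-approved")
```

Only the changed leaf (`talosVersion`) produces an entry; unchanged nested fields are skipped.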

Living Documentation Architecture

The cluster authors. The human reads.

A five-layer architecture where every component deployed on Kubernetes produces its own documentation continuously. The lineage field is a first-class structural field in every manifest, not an annotation. It is reserved by the schema and fulfilled by the LineageController as part of the governance contract.

Component Deploy
An ONT-governed component is deployed or updated. The event enters the documentation pipeline.
Status: Operational

Lineage Controller
Fulfills the first-class lineage field reserved in every manifest. Domain to SubDomain to Service to ExecutionUnit. No intelligence. Pure observation. Version, timestamp, owner at every edge in the chain. Live for InfrastructureTalosCluster. Full nine-GVK coverage in progress. Reconcilers write LineageSynced=False; controller flips to True when operational.
Status: Alpha: Partial

Lineage Sink
Event-driven collector. Routes deploy, update, and deletion events to the Document Store. No intelligence. Structured fact routing only.
Status: Interface Defined

Document Store
Two databases, two roles. Neo4j or PostgreSQL for the lineage graph traversal. MongoDB for populated document blobs indexed by searchDescriptor. etcd holds DocumentSchema CRD definitions only, never document blobs.
Status: Next Layer

Translation
NLP fills bounded template slots declared in DocumentSchema. Input: structured cluster delta only. Output: populated blob per field. Exports to Confluence, PDF, Markdown, runbook. Human reads. Human never authors.
Status: Next Layer
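
The lineage chain described for the Lineage Controller can be pictured with a minimal sketch. It assumes each node carries version, timestamp, and owner and points at its parent; the `LineageNode` class, the helper names, and the sample values are hypothetical. The dictionary returned by `lineage_condition` follows the usual Kubernetes status-condition shape and stands in for the `LineageSynced` condition the text mentions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Chain order taken from the text: Domain to SubDomain to Service to ExecutionUnit.
LEVELS = ("Domain", "SubDomain", "Service", "ExecutionUnit")


@dataclass(frozen=True)
class LineageNode:
    """Hypothetical lineage entry with version, timestamp, and owner."""
    kind: str
    name: str
    version: str
    timestamp: str
    owner: str
    parent: Optional["LineageNode"] = None


def chain_to_authority(node: LineageNode) -> list[LineageNode]:
    """Walk parent links and return the chain ordered from root to node."""
    chain = []
    current: Optional[LineageNode] = node
    while current is not None:
        chain.append(current)
        current = current.parent
    return list(reversed(chain))


def lineage_condition(node: LineageNode) -> dict:
    """Report LineageSynced=True only if the chain starts at Domain and follows LEVELS in order."""
    kinds = tuple(n.kind for n in chain_to_authority(node))
    ok = kinds == LEVELS[: len(kinds)]
    return {"type": "LineageSynced", "status": "True" if ok else "False"}


domain = LineageNode("Domain", "payments", "v1", "2024-01-01T00:00:00Z", "team-a")
sub = LineageNode("SubDomain", "billing", "v1", "2024-01-02T00:00:00Z", "team-a", domain)
svc = LineageNode("Service", "invoicer", "v3", "2024-01-03T00:00:00Z", "team-b", sub)
unit = LineageNode("ExecutionUnit", "invoicer-pod", "v3", "2024-01-03T00:05:00Z", "team-b", svc)
orphan = LineageNode("Service", "stray", "v1", "2024-01-04T00:00:00Z", "unknown")
```

Here `unit` resolves to a complete four-level chain, while `orphan` has no Domain at its root and would be reported as not synced.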
AI does
Narrates what the structured input already says
Translates spec diffs into readable change-narrative
Summarises component purpose from labels, image, and lineage context
Fills nlp-generated fields in DocumentSchema only
AI does not
Decide what a component should do
Infer relationships not expressed in lineage fields
Generate content outside schema-defined fields
Override human-set export flags or make architectural decisions
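
The boundary between what AI does and does not do can be sketched as a guard around slot filling. The example assumes a DocumentSchema-like mapping that marks each field as human-owned or `nlp-generated`; the mapping format, the field names, and the function are hypothetical and only illustrate the rule that generated content must stay inside schema-declared fields.

```python
# Hypothetical field declarations: field name -> who may populate it.
schema = {
    "title": "human",
    "exportToPdf": "human",
    "changeNarrative": "nlp-generated",
    "purpose": "nlp-generated",
}


def fill_slots(schema: dict, existing: dict, generated: dict) -> dict:
    """Merge generated text into a document, but only into nlp-generated fields."""
    allowed = {name for name, source in schema.items() if source == "nlp-generated"}
    rejected = sorted(set(generated) - allowed)
    if rejected:
        # Covers both unknown fields and attempts to overwrite human-owned fields.
        raise ValueError(f"fields outside the nlp-generated slots: {rejected}")
    document = dict(existing)
    document.update(generated)
    return document


existing = {"title": "Invoicer service", "exportToPdf": True}
document = fill_slots(schema, existing, {"purpose": "Generates invoices."})
```

An attempt to write `exportToPdf` or an undeclared field raises `ValueError` instead of silently changing the document.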
OpenAPI JSON Schema

The ONT Schema Standard

36 schemas across 4 layers. Importable by any operator. The community standard for ONT domain contracts.

Shared (6): SealedCausalChain, BindingStability, PhaseModel, RationaleField, GovernanceEvent, KubernetesMetadata
Seam (12): LineageRecord, TalosCluster, RunnerConfig, PackDelivery, PackBuild, PackExecution, PackInstalled, PackReceipt, PackLog, MachineConfigSync, SeamMembership, DSNSZone
Domain (9): DomainIdentity, DomainBoundary, DomainPolicy, DomainRelationship, DomainEvent, DomainWorkflow, DomainResource, DomainAudit, DomainSemanticNameService
Application (9): AppBoundary, AppIdentity, AppPolicy, AppTopology, AppEventSchema, AppWorkflow, AppResourceProfile, AppAuditPolicy, AppProfile
Import the full index
https://schema.ontai.dev/v1alpha1/index.json
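
A quick consistency check of the catalogue above can be written as a small sketch. It hard-codes the schema names listed on this page rather than fetching the index, because the structure of `index.json` is not described here; the `CATALOGUE` variable is an illustrative name.

```python
# Schema names copied from the catalogue on this page, grouped by layer.
CATALOGUE = {
    "Shared": [
        "SealedCausalChain", "BindingStability", "PhaseModel",
        "RationaleField", "GovernanceEvent", "KubernetesMetadata",
    ],
    "Seam": [
        "LineageRecord", "TalosCluster", "RunnerConfig", "PackDelivery",
        "PackBuild", "PackExecution", "PackInstalled", "PackReceipt",
        "PackLog", "MachineConfigSync", "SeamMembership", "DSNSZone",
    ],
    "Domain": [
        "DomainIdentity", "DomainBoundary", "DomainPolicy",
        "DomainRelationship", "DomainEvent", "DomainWorkflow",
        "DomainResource", "DomainAudit", "DomainSemanticNameService",
    ],
    "Application": [
        "AppBoundary", "AppIdentity", "AppPolicy", "AppTopology",
        "AppEventSchema", "AppWorkflow", "AppResourceProfile",
        "AppAuditPolicy", "AppProfile",
    ],
}

counts = {layer: len(names) for layer, names in CATALOGUE.items()}
total = sum(counts.values())
all_names = [name for names in CATALOGUE.values() for name in names]
```

The per-layer counts add up to the 36 schemas across 4 layers stated above, and no name appears in more than one layer.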
The Seam Operator Family

Six intellectual delegates. One governance chain.

Each operator is the institutional form of what a senior engineer knows about a bounded domain, written as code, running continuously, available always.

Guardian
guardian.ontai.dev
Trust root and security substrate for every cluster in the ONT operator family. Provisions RBAC via PermissionSnapshot computation, Ed25519 signing, and cross-cluster distribution. Runs the admission webhook that gates all ONT operators. Writes governance and audit events to the CNPG-backed audit sink. LineageArchiver persists LineageRecord governance history to the domain memory store.
Alpha
Platform
seam.ontai.dev / platform.ontai.dev
Cluster lifecycle authority. Imports existing Talos Linux clusters and drives the full CAPI and direct bootstrap paths for management and tenant roles. Day 2 operations: Talos version upgrades, Kubernetes upgrades, PKI rotation, hardening profile application, etcd maintenance, and MachineConfigSync for machineconfig source-of-truth. DriftSignalReconciler creates corrective UpgradePolicies when version drift is detected.
Alpha
Dispatcher
seam.ontai.dev
Pack delivery engine. Compiles Helm charts, raw YAML, and Kustomize overlays into signed three-layer OCI artifacts. Drives PackExecution through a five-gate delivery sequence to PackInstalled. PackReceipt carries Ed25519 signatures verified by tenant Conductor before acknowledgment. Triggers corrective redeployment when RuntimeDrift signals arrive from tenant Conductor agents.
Alpha
Conductor
seam.ontai.dev
Three-image execution model. Compiler: offline workstation CLI, never deployed, sole scaffolding authority for new operators. Exec mode: debian-slim Kueue Jobs, 20 registered operational capabilities including machineconfig-sync, etcd-maintenance, node-reboot, and platform-upgrade. Agent mode: distroless, deployed to every cluster, running TalosVersionDriftLoop, PackPodHealthLoop, and ClusterNodeHealthLoop. AutonomyLevel gate via OperatorContextWatcher governs all autonomous actions.
Alpha – v1.9.3-alpha.1
Seam
seam.ontai.dev
Exclusive schema authority for all cross-operator CRD definitions. Owns LineageRecord, DriftSignal, RunnerConfig, OperatorContext, SeamMembership, MachineConfigSync, and the full pack lifecycle family: PackDelivery, PackExecution, PackInstalled, PackReceipt, PackLog. No operator defines CRDs that Seam owns. seam-sdk enforces this contract at compile time for every operator that imports it.
Alpha
seam-sdk
library — never deployed
Compile-time contract for all ONT operators. Every operator implements SeamOperator, declares SeamMembership on startup, and uses CreationRationale as a compile-time enumeration. Non-compliant operators do not compile. conductor-sdk extends the contract for Conductor capability declarations and execution lineage schema. Both are Go modules imported by every component in the operator family.
Go Library
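
The shape of the seam-sdk contract can be approximated in a short sketch. seam-sdk is a Go library and enforces its contract at compile time; the Python version below only mirrors the idea with an abstract base class, which rejects incomplete operators when they are instantiated rather than when they are compiled. The method names and the `CreationRationale` members are hypothetical.

```python
from abc import ABC, abstractmethod
from enum import Enum


class CreationRationale(Enum):
    """Closed set of reasons an object may be created (members are hypothetical)."""
    CLUSTER_IMPORT = "ClusterImport"
    PACK_DELIVERY = "PackDelivery"


class SeamOperator(ABC):
    """Hypothetical contract every operator must implement."""

    @abstractmethod
    def seam_membership(self) -> dict:
        """Declare membership on startup."""

    @abstractmethod
    def reconcile(self, obj: dict, rationale: CreationRationale) -> dict:
        """Reconcile one object, recording why it was created."""


class DemoOperator(SeamOperator):
    def seam_membership(self) -> dict:
        return {"operator": "demo"}

    def reconcile(self, obj: dict, rationale: CreationRationale) -> dict:
        if not isinstance(rationale, CreationRationale):
            raise TypeError("rationale must be a CreationRationale member")
        return {**obj, "creationRationale": rationale.value}


class IncompleteOperator(SeamOperator):
    # Missing seam_membership, so this class cannot be instantiated.
    def reconcile(self, obj: dict, rationale: CreationRationale) -> dict:
        return obj


result = DemoOperator().reconcile({"name": "cluster-a"}, CreationRationale.CLUSTER_IMPORT)
```

Instantiating `IncompleteOperator` raises `TypeError`, which is the closest Python analogue to "non-compliant operators do not compile".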

Screen (virtualization, virt.ontai.dev), Vortex, and ONTAR are future scope. No implementation until Governor-approved ADR.

AI Sequencing

Build the substrate first. Then add AI.

AI in production operations requires semantic structure, causal memory, and an enforced approval boundary. ONT builds all three before asking AI to operate within them.

1

Governance layer

Layer One CRDs give governance configuration a formal address. AI can now distinguish governance decisions from operational tuning.

2

Lineage chain

Every object traces to its governing authority. AI has causal memory, not just current signals. Past decisions are queryable.

3

Human boundary

Layer One changes require GitOps with human identity. This is architectural, not a prompt instruction. AI cannot bypass it.

4

AI inherits intent

Accumulated governance decisions become the most honest training corpus a domain AI could ever learn from. Not hallucination. Inheritance.
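
The human boundary in step 3 can be illustrated as an admission-style check. This sketch assumes a change request records its layer, the kind of actor, and whether it arrived through GitOps; the `ChangeRequest` fields are hypothetical, and the treatment of changes outside Layer One (passed through to whatever other gates apply) is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    """Hypothetical description of a proposed change."""
    layer: int        # 1 means a Layer One governance change
    actor_kind: str   # "human" or "automation"
    via_gitops: bool  # True if the change arrived through the GitOps path


def admit(change: ChangeRequest) -> bool:
    """Allow Layer One changes only from a human identity through GitOps."""
    if change.layer != 1:
        # Assumption: this particular boundary only constrains Layer One.
        return True
    return change.actor_kind == "human" and change.via_gitops


human_change = ChangeRequest(layer=1, actor_kind="human", via_gitops=True)
ai_change = ChangeRequest(layer=1, actor_kind="automation", via_gitops=True)
direct_change = ChangeRequest(layer=1, actor_kind="human", via_gitops=False)
tuning_change = ChangeRequest(layer=2, actor_kind="automation", via_gitops=False)
```

Because the check sits in the admission path rather than in a prompt, an automated actor cannot talk its way past it.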

UPL / KBCL Framework

The structural isomorphism that made safe delegation possible.

ONT independently converged with KBCL (Kapital, Balans, Cikulaer, Lag), a selection-systems framework derived from Universal Process Law (UPL). UPL identifies four universal forces that govern all processes, whether cognitive, biological, social, or artificial:

Kapital: capacity, meaning what is selectable at all.
Balans: bounded coordination mechanics, meaning how actors coordinate within that capacity.
Cikulaer: creativity, the act of realization that brings new state into being.
Lag: non-realized capacity, meaning what the system held but did not actualize (the structural loss).

The convergence with ONT is not adaptation. It is recognition.

All states (A): CRD schema
Realizable (A(T)): schema + RBAC
Perceived (M_perc): reconciler model
Selection (B): reconcile decision
Realization (C): LineageRecord
System state (T): RunnerConfig
K — Kapital

Capacity defines what is selectable

UPL's first force: capacity is the structural boundary of any selection system. No action is selectable beyond the system's current capacity. In ONT, RunnerConfig.status.capabilities is the live capacity register -- the precise enumeration of every action Conductor is currently authorized and equipped to perform. Capacity is not configuration. It is the formal definition of A(T).
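
The idea of a capacity register can be shown with a minimal sketch. It assumes `status.capabilities` is a plain list of capability names, which is a simplification for illustration; the four names used are the ones this page lists for Conductor's exec mode.

```python
# Assumed shape: capabilities as a flat list of names under status.
runner_config = {
    "kind": "RunnerConfig",
    "status": {
        "capabilities": [
            "machineconfig-sync",
            "etcd-maintenance",
            "node-reboot",
            "platform-upgrade",
        ]
    },
}


def selectable(runner_config: dict, action: str) -> bool:
    """An action is selectable only if it appears in the capacity register."""
    capabilities = runner_config.get("status", {}).get("capabilities", [])
    return action in capabilities


can_reboot = selectable(runner_config, "node-reboot")
can_do_unknown = selectable(runner_config, "not-a-registered-capability")
```

An action missing from the register is simply not part of A(T), so nothing downstream should be able to select it.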

B — Balans

Bounded coordination mechanics

UPL's second force: within available capacity, multiple actors must coordinate without exceeding the structural boundary. Coordination that breaches the capacity boundary does not produce more -- it produces loss. In ONT, OperatorContext encodes the coordination contract: autonomyLevel sets the boundary (observe-only through full-delegation), ApprovalGates coordinate human and autonomous actors on contested actions. Every reconcile decision is a B step. Governance is bounded coordination made machine-readable.
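
A bounded coordination step of this kind might look like the sketch below. Only the two end points of the autonomy range (observe-only and full-delegation) come from the text; the intermediate `APPROVAL_GATED` level, the function, and the returned strings are hypothetical.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    OBSERVE_ONLY = 0
    APPROVAL_GATED = 1   # hypothetical intermediate level
    FULL_DELEGATION = 2


def gate(level: AutonomyLevel, human_approved: bool = False) -> str:
    """Decide what an autonomous actor may do with a proposed action."""
    if level == AutonomyLevel.OBSERVE_ONLY:
        return "observe"          # never act, even if someone approved
    if level == AutonomyLevel.FULL_DELEGATION or human_approved:
        return "execute"
    return "await-approval"       # approval gate still open


decisions = [
    gate(AutonomyLevel.OBSERVE_ONLY, human_approved=True),
    gate(AutonomyLevel.APPROVAL_GATED),
    gate(AutonomyLevel.APPROVAL_GATED, human_approved=True),
    gate(AutonomyLevel.FULL_DELEGATION),
]
```

Each call corresponds to one B step: the outcome depends on the declared boundary and on whether a human has cleared the gate.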

C — Cikulaer

Realization is a creative act

UPL's third force: realization is not mechanical execution -- it is the creative moment that produces a genuinely new system state. The act of selection plus action creates something that did not exist. In ONT, LineageRecord is the record of that creative act: causal derivation chain, sealed at creation, controller-authored, carrying CreationRationale and ActorRef. PermissionSnapshot (Ed25519-signed by Guardian) is the creativity record of RBAC state brought into being. The cluster does not replay -- it creates.
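
One way to picture a record that is sealed at creation is an immutable object whose digest covers its own fields and the seal of its parent. The text does not say how ONT seals a LineageRecord, so the SHA-256 chaining below is purely an assumption for illustration, as are the field and function names.

```python
import hashlib
import json
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LineageRecord:
    """Hypothetical sealed record; frozen so fields cannot be reassigned."""
    subject: str
    creation_rationale: str
    actor_ref: str
    parent_seal: str
    seal: str


def _digest(subject: str, creation_rationale: str, actor_ref: str, parent_seal: str) -> str:
    payload = json.dumps(
        {
            "subject": subject,
            "creationRationale": creation_rationale,
            "actorRef": actor_ref,
            "parentSeal": parent_seal,
        },
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def seal_record(subject: str, creation_rationale: str, actor_ref: str, parent_seal: str = "") -> LineageRecord:
    """Create a record and compute its seal in one step."""
    seal = _digest(subject, creation_rationale, actor_ref, parent_seal)
    return LineageRecord(subject, creation_rationale, actor_ref, parent_seal, seal)


def verify(record: LineageRecord) -> bool:
    """Recompute the digest and compare it with the stored seal."""
    expected = _digest(record.subject, record.creation_rationale,
                       record.actor_ref, record.parent_seal)
    return record.seal == expected


root = seal_record("cluster-a", "ClusterImport", "controller/platform")
child = seal_record("pack-1", "PackDelivery", "controller/dispatcher", parent_seal=root.seal)
tampered = replace(child, actor_ref="someone-else")  # copy with an altered field
```

The altered copy keeps the old seal, so `verify` rejects it, while the original records verify cleanly.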

L — Lag

Non-realized capacity is structural loss

UPL's fourth force: not all capacity becomes realization. The gap between what the system holds as K and what it actually selects and realizes through B and C is Lag -- non-realized capacity, structural loss. In ONT, DriftSignal fires when M_perc diverges from A(T): the system had the capacity but failed to realize it. RemediationPolicy tracks escalation when realization attempts fail. ClusterNodeHealthLoop makes non-realization visible at 60-second resolution. Loss is not failure. It is the measure of unrealized governance.
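
Drift detection of this kind reduces, in its simplest form, to comparing what was declared with what is perceived. The sketch treats both as flat dictionaries and emits one signal per diverging field; the dictionary shapes and the signal fields are hypothetical simplifications, not the DriftSignal schema.

```python
def detect_drift(declared: dict, perceived: dict) -> list:
    """Return one signal per field whose perceived value differs from the declared one."""
    signals = []
    for field in sorted(declared):
        want = declared[field]
        got = perceived.get(field)  # None if the field was never observed
        if got != want:
            signals.append({
                "kind": "DriftSignal",
                "field": field,
                "declared": want,
                "perceived": got,
            })
    return signals


# Illustrative values only.
declared = {"talosVersion": "v1.7.1", "workers": 3}
perceived = {"talosVersion": "v1.7.0", "workers": 3}
signals = detect_drift(declared, perceived)
```

An empty result means the perceived state matches the declaration; each returned entry marks capacity that was held but not realized.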

Convergence

The structural recognition

ONT derived its governance structure from Kubernetes operator patterns. KBCL derived K-B-C-L from systems theory applied to the Swedish labor market. Both reached the same structure: capacity (K) defines the selection space, bounded coordination (B) allocates it, creative realization (C) produces new state, and non-realized capacity (L) is the governance debt. UPL proves this is universal. ONT is the practical OPS implementation of the KBCL framework: every invariant, every CRD, every reconciler loop traces directly to one of these four forces.

Documentation

Everything you need to get started

Founding Document
ONT: Operator Native Thinking
The complete thesis. Preamble through closing. Definitions, invariants, six parts, and the soul of ONT.
Read →
Architecture
The Cluster Is the Documentation
The precise five-layer documentation architecture. LineageController through Translation Layer. What is built and what is next.
Read →
Playbook
Brownfield Adoption
Four stages for teams with running systems and no CRDs. Honest effort estimates. Discovery through reconciliation validation.
Read →
Framework
Operator Validation
How to know your operator correctly encodes domain knowledge. Three verification properties. Six production readiness criteria.
Read →
Specification
Vortex Retrieval Interface
Three queries that make accumulated governance memory queryable at 3am. Defined inputs, outputs, and authorization model.
Read →
Decision Record
ONTAR: Pod Execution Boundary
The Talos philosophy applied to container runtime. Six invariants. Phase authority. Bootstrap trust. Future specification only.
Read →
Architecture PDF
ONT Platform Architecture
The full visual architecture document. Operator family, CRD catalogue, lineage chain, deployment topology, and roadmap.
View PDF →