Adverant Nexus Stack v4.0: A Unified Agentic Orchestration Architecture for Sovereign, Multi-Tenant, Marketplace-Scale AI
A systems architecture paper proposing Adverant Nexus Stack v4.0: fifteen primitives that close seven unsolved industry gaps in the 2026 agentic AI framework landscape. Builds on the UNO Pipeline Redesign by adding a concrete Tier 4 autonomous state machine, a signed-manifest skill marketplace with SBOM and runtime quality scoring, hooks as first-class orchestrator primitives, a cryptographically isolated Memory Bank, an A2A plus MCP dual protocol plane, deterministic replay plus C2PA chain-of-custody, an airgapped bundle mode for FedRAMP and DoD use, a first-class CLI, native integration of thirteen governance regimes, and a UI/UX bindings layer generalizing production skill bindings into a user-configurable button-to-workflow substrate.
Authors: [Adverant Research Team] · [Adverant Limited, Dublin, Ireland]
Publish date: 2026-04-24 · Target length: ~30 000 words / 15 sections / ≥90 citations
Distribution: link-only, not discoverable (same policy as the UNO Pipeline Redesign paper and the Cognitive Memory Architecture paper)
Abstract
Agentic AI platforms in 2026 look very different from those in 2024. Model Context Protocol (MCP) has become the universal tool-use substrate: its SDKs reached 97 million monthly downloads by February 2026, and the protocol was donated to the Linux Foundation's Agentic AI Foundation in December 2025 [1][2]. Agent2Agent (A2A) went production-grade at Google Cloud Next 2026 and ships as a first-class primitive in Microsoft Agent Framework 1.0, released 3 April 2026 [3][4]. Graph-shaped orchestration (LangGraph StateGraph, Google ADK, Microsoft Agent Framework workflow graphs, CrewAI Flows, LlamaIndex Workflows) has displaced conversation-graph and role-crew models as the dominant orchestration primitive [5][6][7]. Managed runtimes (LangSmith Deployment, Vertex Agent Runtime, Bedrock AgentCore with Strands) are the commercial wedge, and observability has bifurcated into an OpenTelemetry tier and an LLM-native analytics tier (LangSmith Insights Agent and Polly, AgentCore quality evaluations) [8][9][10].
Beneath this convergence, seven industry gaps remain unsolved: (a) airgapped multi-agent deployment with signed plugin bundles, (b) cryptographic per-tenant isolation across shared observability, (c) plugin marketplaces with provenance and software bill-of-materials (SBOM) and runtime quality scoring, (d) cross-framework agent portability, (e) deterministic replay and chain-of-custody for long-horizon agents, (f) agent-level FinOps governance, and (g) a portable, user-configurable UI-to-workflow binding layer. No framework in the 2026 survey ships turnkey answers to all seven.
This paper presents Adverant Nexus Stack v4.0, a unified agentic orchestration architecture designed to close these gaps while preserving the dispatch-execution separation discipline established in our prior work, the Unified Nexus Orchestrator (UNO) paper [42]. Building on 44 production microservices, 729 deployed skills, four execution tiers, four AI provider adapters, and row-level-security-plus-Istio tenant isolation (all running as of 24 April 2026), v4.0 adds: (1) a concrete Tier 4 autonomous state machine with human-in-the-loop waitpoints, cost caps, and deterministic replay; (2) a skill marketplace with signed manifests, SBOMs, runtime quality scoring, and semantic versioning with auto-rollback; (3) hooks as first-class orchestrator primitives (PreDispatch, PreToolUse, PostLLMCall, OnCostThreshold, OnHITLPause, OnTierEscalation); (4) a Memory Bank with per-tenant key-encryption-key (KEK) envelope crypto backed by HSM in cloud profiles and TPM in airgapped profiles; (5) a dual A2A-plus-MCP protocol plane; (6) Pydantic-style structured output with self-correction on every skill contract; (7) DSPy-style optimizer-compiled prompts; (8) observability extensions (Insights Agent, Polly-natural-language debugging) on top of our existing twelve-type span tree; (9) pre-dispatch FinOps governance with per-org, per-skill, per-run budgets and circuit-breakers; (10) a sealed airgapped bundle mode for FedRAMP, DoD, CJIS, and IRS Publication 1075 use cases; (11) a first-class CLI (Adverant Nexus CLI 2.0) that is a dispatch, streaming, and Progress Command Center (PCC) mirror client; (12) native integration of EU AI Act, GDPR, SOC 2, ISO 27001, ISO 42001, HIPAA, FedRAMP, NIST AI Risk Management Framework, PIPL, DPDP, LGPD, OWASP LLM Top 10 (2025), MITRE ATLAS, C2PA, and export controls as first-class primitives rather than bolted-on middleware; and (13) a UI/UX Bindings layer generalizing our production ros.skill_bindings table into a substrate where users configure what any button in any plugin or marketplace application does (selected skill, tier, provider, model, cost cap, risk tier, data residency, inputs mapping, and hook set) without shipping code.
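Item (9) above, pre-dispatch FinOps governance, amounts to a check that runs before any model call is made. The following is a minimal sketch under stated assumptions: the names `Budget` and `BudgetGate`, the scope-key format, and the rule that a breached scope stays blocked are all hypothetical illustrations, not taken from the Nexus codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Budget:
    """Spending limit for one scope (hypothetical structure)."""
    limit_usd: float
    spent_usd: float = 0.0

    def remaining(self) -> float:
        return self.limit_usd - self.spent_usd


@dataclass
class BudgetGate:
    """Checks per-org, per-skill, and per-run budgets before dispatch."""
    budgets: dict[str, Budget] = field(default_factory=dict)
    tripped: set[str] = field(default_factory=set)  # scopes whose breaker is open

    def check(self, scopes: list[str], estimated_usd: float) -> tuple[bool, str]:
        for scope in scopes:
            if scope in self.tripped:
                return False, f"circuit open for {scope}"
            budget = self.budgets.get(scope)
            if budget is not None and estimated_usd > budget.remaining():
                self.tripped.add(scope)  # open the breaker for later calls
                return False, f"budget exceeded for {scope}"
        return True, "ok"

    def record(self, scopes: list[str], actual_usd: float) -> None:
        for scope in scopes:
            if scope in self.budgets:
                self.budgets[scope].spent_usd += actual_usd


gate = BudgetGate({
    "org:acme": Budget(100.0),
    "skill:summarize": Budget(5.0),
    "run:r1": Budget(1.0),
})
scopes = ["org:acme", "skill:summarize", "run:r1"]
first = gate.check(scopes, estimated_usd=0.40)   # fits every budget
gate.record(scopes, actual_usd=0.40)
second = gate.check(scopes, estimated_usd=0.75)  # exceeds what is left of the run cap
third = gate.check(scopes, estimated_usd=0.01)   # breaker for the run stays open
```

The point of the design is that the rejection happens before any tokens are spent; a real implementation would also need a reset policy for open breakers, which this sketch leaves out.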
We present five comparison tables across twelve surveyed frameworks; a current-state map of our v3 stack grounded in file paths and line numbers from the production monorepo; a gap analysis separating industry gaps from v3-internal gaps; a detailed v4.0 proposal in fifteen sub-sections; reference architecture diagrams; fifty use cases; an eighteen-phase migration path extending UNO's nine-phase strangler-fig plan; four deployment profiles; an evaluation methodology; and ten appendices including the full binding schema, compliance-control traceability matrix, and auditor-export payload schema. The paper is internally validated at three gates by Gemini 2.5 Pro [11] (post-outline, post-proposal-core, pre-publication); all three transcripts are archived and published alongside the paper.
1. Introduction
The question motivating this paper is not whether Adverant Nexus needs a next major version. That is answered by three simultaneous forcing functions. The question is what the next version should contain, given a market that consolidated faster than anyone predicted, a regulatory climate that now demands baked-in compliance rather than post-hoc audit, and a production system that has revealed, through fourteen months of operation, which of our assumptions survived contact with real workloads and which did not.
1.1 Three Forcing Functions
Forcing function one: the 2026 market consolidation. Between December 2025 and April 2026, the agentic AI framework landscape underwent a consolidation shock. Anthropic donated Model Context Protocol to the Linux Foundation in December 2025, with Google, Microsoft, Amazon Web Services, Cloudflare, and Bloomberg joining as founding supporters of the new Agentic AI Foundation [1][2]. Google shipped Agent2Agent as production-grade at Cloud Next 2026, with 150-plus organizations in production on A2A workflows [3]. Microsoft merged AutoGen and Semantic Kernel into the unified Microsoft Agent Framework, version 1.0, on 3 April 2026 [4][13]. OpenAI retired Swarm and replaced it with the OpenAI Agents SDK, adding sandboxing and a new model harness [14][15]. Google launched the Gemini Enterprise Agent Platform on 22 April 2026 (two days before this paper is dated), combining the Agent Development Kit (ADK), Agent Studio low-code authoring, Agent Gateway networking, and Memory Bank into a single managed product [16][17][18]. Amazon Web Services expanded Bedrock AgentCore with managed quality evaluations and policy controls [10][19]. The frameworks have converged on three structural axioms: tool use is MCP, agent-to-agent communication is A2A, and orchestration is a directed graph with explicit state, checkpointing, and human-in-the-loop waitpoints.
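The third axiom (a directed graph with explicit state, checkpointing, and human-in-the-loop waitpoints) can be illustrated without any framework. The sketch below is an assumption-laden toy, not the API of any surveyed product: node names, the `WAIT` sentinel, and the JSON checkpoint shape are all invented for illustration.

```python
import json
from typing import Callable, Optional

State = dict
# A node mutates the state and returns the name of the next node (None = done).
Node = Callable[[State], Optional[str]]

WAIT = "__wait__"  # sentinel: pause here until a human responds


def draft(state: State) -> Optional[str]:
    state["draft"] = f"refund {state['amount']} USD"
    return "approve"


def approve(state: State) -> Optional[str]:
    if "approved" not in state:
        return WAIT  # human-in-the-loop waitpoint
    return "send" if state["approved"] else None


def send(state: State) -> Optional[str]:
    state["sent"] = True
    return None


GRAPH: dict[str, Node] = {"draft": draft, "approve": approve, "send": send}


def run(checkpoint: str) -> str:
    """Advance from a JSON checkpoint until the graph finishes or waits."""
    cp = json.loads(checkpoint)
    node, state = cp["node"], cp["state"]
    while node is not None:
        nxt = GRAPH[node](state)
        if nxt == WAIT:
            break  # keep `node` so that a resume re-enters the same node
        node = nxt
    return json.dumps({"node": node, "state": state})


paused = run(json.dumps({"node": "draft", "state": {"amount": 40}}))
resumed_input = json.loads(paused)
resumed_input["state"]["approved"] = True  # the human decision arrives
finished = json.loads(run(json.dumps(resumed_input)))
```

Because the whole position of the run is a serializable `(node, state)` pair, the paused run can be stored anywhere durable and resumed by a different process.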
Forcing function two: seven unsolved industry gaps. Across a systematic survey of twelve frameworks (CrewAI, LangChain and LangGraph, Pydantic AI, Gemini Enterprise Agent Platform, Microsoft Agent Framework, OpenAI Agents SDK, Anthropic Claude Agent SDK, Semantic Kernel (maintenance), LlamaIndex Workflows, DSPy, Haystack (deepset), and Bedrock AgentCore), no framework ships turnkey solutions to: airgapped multi-agent deployment with signed plugin bundles; cryptographic per-tenant isolation across shared observability; plugin marketplaces with provenance, SBOM, and runtime quality scoring; deterministic replay with exactly-once semantics for long-horizon agents; agent-level FinOps; chain-of-custody for AI-generated artefacts; and cross-framework agent portability beyond the A2A wire protocol. These are defensible commercial wedges for a platform that treats them as first-class primitives rather than user-implemented patterns.
Forcing function three: stale points inside the current stack. The UNO paper [42] documents a nine-phase strangler-fig migration through April 2026. Phases 1–6 and 9 are complete; Phases 7 (multi-provider AI routing) and 8 (tool executors) are partial; Phases 10 and beyond were undefined when the UNO paper went to press. Specific stale points: the UNO paper describes graphrag.skill_registry as the router, but the April 2026 Skills Engine consolidation (documented in our internal memory at skills_engine_consolidation_20260416) dropped that table and migrated 92 skills into the unified engine under ros.tool_registry. The UNO paper's Section 14 flags that nexus-mageagent retains a governance bypass path while Section 12.3 describes it as "fully migrated", an internal contradiction that must be resolved. Tier 4 is defined in the ExecutionTier type in services/nexus-orchestrator/src/types/execution.ts but no concrete human-in-the-loop waitpoint code exists. Chain DAG state lives in Redis with a 24-hour time-to-live, which is inadequate for long-horizon chains. Skill versioning is declared in the SKILL.md front matter but unenforced at dispatch. The current CLI is shallow relative to dispatch. These are fixable in v4.0 without re-opening the dispatch-execution boundary that UNO successfully closed.
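To make the "declared but unenforced" versioning gap concrete, the following sketch shows what enforcing a version pin at dispatch could look like. It is an illustration under assumptions: the function names, the shape of the `skill` dictionary, and the constraint syntax are hypothetical, and the caret rule is simplified to "same major version, not older than the floor", which differs from npm's treatment of 0.x versions.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies(declared: str, constraint: str) -> bool:
    """Support exact pins ("1.4.2") and caret ranges ("^1.4.0") only."""
    if constraint.startswith("^"):
        floor = parse(constraint[1:])
        got = parse(declared)
        return got[0] == floor[0] and got >= floor
    return parse(declared) == parse(constraint)


def dispatch(skill: dict, pinned: str) -> str:
    """Refuse to dispatch when the declared version violates the pin."""
    if not satisfies(skill["version"], pinned):
        raise ValueError(
            f"{skill['name']}@{skill['version']} does not satisfy {pinned}"
        )
    return f"dispatched {skill['name']}@{skill['version']}"
```

For example, `dispatch({"name": "summarize", "version": "2.0.0"}, "^1.4.0")` raises `ValueError`, whereas version `1.5.0` passes the same pin.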
1.2 The v4.0 Thesis
Nexus Stack v4.0 is built on one thesis: every dimension along which modern agentic systems now compete (runtime governance, memory, observability, supply chain, deployment sovereignty, cost control, deterministic audit, and portable configuration) should be a primitive of the platform, not a capability an engineer adds per-skill. When governance is a primitive, skills get it by default. When replay is a primitive, every artefact carries chain-of-custody. When bindings are a primitive, business users reconfigure behaviour without shipping code. When airgapped deployment is a primitive, the same codebase serves FedRAMP High customers and public-cloud customers. The operating question for every v4.0 feature is: can this be a first-class primitive of the platform, rather than an exception that each plugin re-implements?
1.3 Contributions
This paper makes the following contributions:
- A systematic survey of twelve 2026 agentic frameworks across five comparison dimensions (core abstraction plus orchestration, tool use plus memory plus observability, deployment plus multi-tenancy plus licensing, developer experience, documented weaknesses), yielding a comparison matrix suitable for architectural decision-making in Q2 2026 and beyond.
- A gap analysis separating seven industry gaps (unsolved by any surveyed framework) from ten v3-internal gaps (identified through honest retrospective on the UNO migration), prioritised by commercial leverage times implementation cost.
- The Adverant Nexus Stack v4.0 architecture, specified in fifteen sub-sections covering principles, execution tiers (including a concrete Tier 4 state machine with human-in-the-loop waitpoints and cost caps), a signed-manifest plus SBOM plus quality-score skill marketplace, hooks as first-class primitives, a cryptographically isolated Memory Bank, an A2A-plus-MCP dual plane, structured output with self-correction, DSPy-style optimizer-compiled prompts, observability extensions, FinOps governance, deterministic replay and chain-of-custody, airgapped bundle mode, a first-class CLI, native governance and compliance integration, and a UI/UX bindings layer.
- Native integration of thirteen governance and compliance regimes: EU AI Act, GDPR and UK GDPR plus EU Data Act, SOC 2 Type II, ISO/IEC 27001, ISO/IEC 42001 (AI management systems), HIPAA and HITRUST, FedRAMP Moderate and High plus DoD Impact Level 4 and 5 plus CJIS plus IRS Publication 1075, NIST AI Risk Management Framework 1.0 and NIST AI 600-1, regional privacy laws (PIPL, DPDP, LGPD, PIPEDA, Australian Privacy Act), OWASP LLM Top 10 (2025), MITRE ATLAS, C2PA content provenance, and export controls (EAR, ITAR, EU Regulation 2021/821). These come with specific enforcement points at dispatch, execution, storage, and observability layers, and a traceability matrix (Appendix G) mapping each control identifier to the v4.0 primitive that satisfies it.
- Adverant Nexus Bindings v2, a first-class substrate generalizing our production ros.skill_bindings table (schema in Appendix J) into a user-configurable UI-to-workflow binding layer with a rich metadata set (skill identifier, tier, provider, model, cost cap, daily cap, token cap, timeout, max iterations, risk tier, data residency, export tags, hook set, allowed and denied tool lists, input schema, inputs mapping, output target, UI placement, shortcut, confirmation level, A/B experiment reference, quality-score threshold) resolved at runtime by a four-level scope hierarchy (user over project over org over system) with priority-based selection, validated by an OPA-based override policy to prevent user overrides from weakening organizational governance.
- Reference architecture in fifty diagrams covering v3 current state (service topology, dispatch pipeline, execution tiers, AI provider router, WebSocket event flow, Persistent Chat Context, multi-tenant isolation, CLI, plugin template), v4.0 architecture (topology, dispatch flow, four tiers, marketplace, hooks, memory bank, A2A plus MCP, structured output, optimizer, observability, FinOps, replay, airgap, CLI 2.0, bindings resolution, bindings metadata), user journeys (end-user dispatch, developer publish, admin configuration, auditor export, airgapped install, human-in-the-loop approval, CLI dispatch, binding editor), UI/UX mocks (dashboard, PCC panel, governance tab, marketplace, chain visualizer, span explorer, FinOps dashboard, CLI REPL, binding editor), compliance and security diagrams (EU AI Act enforcement, GDPR erasure, OWASP LLM defense, envelope encryption, three-gate enforcement, OPA evaluation, C2PA provenance, binding override policy), and four deployment profiles (public cloud, single-tenant VDS, on-premise, airgapped).
- Fifty use cases specifying trigger, tier, hooks, compliance, and outcome fields across all v4.0 capabilities, including seven bindings-specific cases demonstrating declarative defaults, context-menu bindings, A/B traffic splits, version pins, user-scope overrides, quality auto-deactivation, and marketplace install seeding.
- An eighteen-phase migration path extending the UNO nine-phase strangler-fig plan with Phases 10 through 27 for v4.0.
- Four deployment profiles (public cloud multi-tenant, single-tenant virtual dedicated server, on-premise Kubernetes, and sealed airgapped bundle) served by the same codebase with manifest deltas documented per profile.
- An evaluation methodology across six axes (token efficiency, latency, multi-agent cost, provable tenant-isolation boundaries, replay fidelity, airgapped feature parity) with benchmark designs but without benchmark execution, which is deferred to follow-up work.
- Three Gemini 2.5 Pro validation transcripts archived with the paper (Gate A post-outline, Gate B post-Section 7, Gate C pre-publication peer-review simulation), providing an independent adversarial-reviewer voice alongside authorial claims.
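The binding resolution described in the Bindings v2 contribution (user over project over org over system, with priority-based selection) can be sketched as a single sort key. This is an illustration only: the `Binding` fields are a small hypothetical subset and do not reproduce the ros.skill_bindings schema in Appendix J, and the override check is a plain Python predicate standing in for the OPA-based policy the paper describes.

```python
from dataclasses import dataclass

# Lower rank = narrower scope = wins over broader scopes.
SCOPE_RANK = {"user": 0, "project": 1, "org": 2, "system": 3}


@dataclass(frozen=True)
class Binding:
    button_id: str
    scope: str
    priority: int
    skill_id: str
    cost_cap_usd: float


def resolve(bindings: list[Binding], button_id: str) -> Binding:
    """Pick the binding for a button: narrowest scope, then highest priority."""
    candidates = [b for b in bindings if b.button_id == button_id]
    if not candidates:
        raise LookupError(f"no binding for {button_id}")
    return min(candidates, key=lambda b: (SCOPE_RANK[b.scope], -b.priority))


def override_allowed(chosen: Binding, org_default: Binding) -> bool:
    """Stand-in policy: an override may not raise the organization's cost cap."""
    return chosen.cost_cap_usd <= org_default.cost_cap_usd


bindings = [
    Binding("summarize-btn", "system", 0, "summarize.v1", 1.00),
    Binding("summarize-btn", "org", 5, "summarize.v2", 0.50),
    Binding("summarize-btn", "org", 9, "summarize.v3", 0.50),
    Binding("summarize-btn", "user", 1, "summarize.fast", 0.25),
]
chosen = resolve(bindings, "summarize-btn")
org_default = resolve([b for b in bindings if b.scope != "user"], "summarize-btn")
```

Here the user-scope binding wins even though its priority is lower than the org bindings, because scope is compared before priority; the policy check then confirms the override does not loosen the org-level cost cap.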
1.4 Paper Organization
Section 2 surveys the 2026 framework landscape. Section 3 presents the comparison matrix. Section 4 maps Nexus v3 against the plan as currently running in production. Section 5 is the gap analysis. Section 6 outlines the v4.0 principles. Section 7 is the v4.0 proposal core in fifteen sub-sections. Section 8 presents the reference architecture diagrams. Section 9 catalogues the fifty use cases. Section 10 is the migration path. Section 11 describes the four deployment profiles. Section 12 presents evaluation methodology. Section 13 positions v4.0 against related work beyond the framework survey. Section 14 concludes. Appendices A through J contain detailed schemas and policy starter packs.
2. Background: The 2026 Agentic Framework Landscape
We surveyed twelve frameworks as of 24 April 2026. Each sub-section below is a compact profile; Section 3 renders the comparison across dimensions.
2.1 CrewAI
CrewAI pairs a role-and-goal-and-backstory agent DSL (Crews) with an event-driven graph orchestration layer (Flows), giving it the most human-readable agent-definition syntax among surveyed frameworks [20]. Crews' strengths (approachability, hundreds of built-in tools, native MCP) coexist with documented weaknesses: coordination-via-natural-language wastes tokens, there is no built-in checkpointing, and observability lags LangSmith [21][22]. CrewAI AMP is the commercial SaaS layer with organizational scoping and RBAC.
2.2 LangChain and LangGraph
LangGraph is the StateGraph-based orchestration engine that pioneered graph-shaped agent orchestration in 2024 [23]. LangSmith, the observability and evaluation platform, has grown to process more than fifteen billion traces and one hundred trillion tokens as of 2026 [8][24]. The 2026 additions (Insights Agent for automatic trace clustering and Polly for natural-language debugging) are the two most distinctive observability primitives shipped by any framework. Weaknesses: a steep learning curve that requires state-machine fluency, and lock-in risk around LangSmith's commercial deployment product.
2.3 Pydantic AI
Pydantic AI applies Pydantic's type-validation ethos to agent construction [25]. Every tool decorator produces a JSON schema automatically; every agent output is validated against a Pydantic model, with automatic retry on validation failure: the "structured output with self-correction" pattern. Pydantic AI Harness, released 2026, adds durability across failures. Observability flows through Logfire, Pydantic's telemetry product. Weaknesses: Python-only, fewer built-in multi-agent patterns, younger ecosystem.
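The "structured output with self-correction" pattern is easy to show in isolation. The sketch below deliberately uses only the standard library and a scripted stand-in for the model; it does not use or reproduce the Pydantic AI API, and `validate_invoice`, `call_with_self_correction`, and the invoice fields are hypothetical names chosen for illustration.

```python
import json
from typing import Callable


def validate_invoice(raw: str) -> dict:
    """Parse and check the expected shape; raise ValueError with a usable message."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    total = data.get("total")
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("field 'total' must be a number")
    if not isinstance(data.get("currency"), str):
        raise ValueError("field 'currency' must be a string")
    return data


def call_with_self_correction(
    model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    message = prompt
    for _ in range(max_attempts):
        raw = model(message)
        try:
            return validate_invoice(raw)
        except ValueError as err:
            # Feed the validation error back so the model can repair its output.
            message = (
                f"{prompt}\n\nPrevious output was rejected: {err}. "
                "Return corrected JSON only."
            )
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


# Scripted stand-in for an LLM: wrong type first, then a valid object.
replies = iter([
    '{"total": "12.50", "currency": "EUR"}',
    '{"total": 12.5, "currency": "EUR"}',
])
seen_prompts: list[str] = []


def fake_model(message: str) -> str:
    seen_prompts.append(message)
    return next(replies)


result = call_with_self_correction(fake_model, "Extract the invoice total.")
```

The essential move is that the validator's error message becomes part of the next prompt, so the retry is informed rather than blind.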
2.4 Gemini Enterprise Agent Platform
Google's 22 April 2026 launch unifies the Agent Development Kit, Agent Studio natural-language low-code authoring, Agent Gateway (an agent-network layer with governance policies), Memory Bank (persistent cross-session memory), and Vertex AI Gen AI Evaluation into a single managed product [16][17][18]. Agent Runtime provides sub-second cold starts, A2A is native, and governance flows through GCP IAM, VPC Service Controls, Cloud Audit Logs, and Customer-Managed Encryption Keys. Weaknesses: GCP lock-in, rebrand churn, enterprise pricing opacity.
2.5 Microsoft Agent Framework
Version 1.0 shipped 3 April 2026, merging AutoGen and Semantic Kernel into a single framework with six orchestration patterns: sequential, concurrent, handoff, group chat, Magentic-One, and graph [4][13][26]. Declarative YAML agents are first-class, DevUI is the built-in operator interface, A2A is native, MCP is native, and deployment flows through Azure AI Foundry. Weaknesses: AutoGen users must migrate, .NET-first documentation bias.
2.6 OpenAI Agents SDK
The successor to Swarm [14], the OpenAI Agents SDK adds sandboxing, a new model harness, and formalizes handoffs and guardrails as core primitives [15][27]. Handoff is arguably the cleanest multi-agent abstraction shipped anywhere: an agent delegates to another agent and passes along the full context. The sandboxed long-horizon harness enables safe long-running execution with provider-agnostic backends. Weaknesses: a small primitive set (teams outgrow it for complex graphs), and a sandbox harness that is Python-first.
2.7 Anthropic MCP and Claude Agent SDK
Anthropic introduced Model Context Protocol in late 2024 [28] and donated it to the Linux Foundation's Agentic AI Foundation in December 2025 [2]. The Claude Agent SDK ships main-agent-plus-subagents hierarchical delegation and lifecycle hooks (PreToolUse, PostToolUse, Stop, SessionStart) as core primitives [29][30][31]. Hooks are the single most powerful behaviour-modification primitive in any framework: they enable gates, policy enforcement, and circuit-breakers as first-class callbacks. Weaknesses: the SDK is opinionated toward coding tasks, and model-decided subagent routing is hard to test deterministically.
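How a lifecycle hook acts as a gate can be shown with a generic registry. The sketch below is not the Claude Agent SDK's API: only the event names PreToolUse and PostToolUse come from the text above, while `HookRegistry`, the payload shape, and the allow/deny decision format are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable

Hook = Callable[[dict], dict]


class HookRegistry:
    def __init__(self) -> None:
        self._hooks: dict[str, list[Hook]] = defaultdict(list)

    def on(self, event: str, hook: Hook) -> None:
        self._hooks[event].append(hook)

    def fire(self, event: str, payload: dict) -> dict:
        """Run hooks in registration order; the first 'deny' wins."""
        for hook in self._hooks[event]:
            decision = hook(payload)
            if decision.get("decision") == "deny":
                return decision
        return {"decision": "allow"}


def block_destructive_shell(payload: dict) -> dict:
    command = payload["args"].get("command", "")
    if payload["tool"] == "shell" and "rm -rf" in command:
        return {"decision": "deny", "reason": "destructive shell command"}
    return {"decision": "allow"}


audit_log: list[str] = []


def audit(payload: dict) -> dict:
    audit_log.append(f"{payload['tool']} -> {payload['result']}")
    return {"decision": "allow"}


def run_tool(registry: HookRegistry, tool: str, args: dict, tools: dict) -> str:
    verdict = registry.fire("PreToolUse", {"tool": tool, "args": args})
    if verdict["decision"] == "deny":
        return f"blocked: {verdict['reason']}"
    result = tools[tool](**args)
    registry.fire("PostToolUse", {"tool": tool, "args": args, "result": result})
    return result


registry = HookRegistry()
registry.on("PreToolUse", block_destructive_shell)
registry.on("PostToolUse", audit)
tools = {"shell": lambda command: f"ran {command}"}

ok = run_tool(registry, "shell", {"command": "ls"}, tools)
blocked = run_tool(registry, "shell", {"command": "rm -rf /"}, tools)
```

The blocked call never reaches the tool and never reaches the audit hook, which is what makes a pre-execution hook usable as a policy gate rather than only as logging.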
2.8 Semantic Kernel (maintenance)
Semantic Kernel pioneered the Planner-driven-plugin-composition abstraction [32]. In 2026 it entered maintenance mode, with new features migrating to Microsoft Agent Framework [33]. The Planner remains the strongest "given a goal and a plugin catalogue, produce a plan" abstraction even as SK itself stops adding capabilities.
2.9 LlamaIndex Workflows
LlamaIndex pivoted from a RAG-centric framework to a workflow-centric agent framework with Workflows 1.0, announced 2026 [34][35]. The @step decorator is the cleanest primitive for expressing loops, parallel branches, and human-in-the-loop waitpoints in a single file. LlamaCloud provides the managed observability and deployment layer. Weaknesses: document-centric framing makes pure-agent use cases feel tacked on.
2.10 DSPy
DSPy is the Stanford-originated compiled-prompt-optimization framework [36][37][38]. Its Signatures-and-Modules-and-Optimizers model turns prompts into compilable artefacts: MIPROv2 and GEPA optimizers re-compile prompts and few-shot examples against a metric function. DSPy is the only surveyed framework that actually optimizes prompts rather than asking humans to tune them. Weaknesses: requires labelled data for optimizers, debugging compiled prompts is opaque.
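The idea of compiling a prompt against a metric can be reduced to a toy: score each candidate instruction on labelled examples and keep the best. This sketch does not use DSPy and is far simpler than MIPROv2 or GEPA (it is a plain exhaustive selection over hand-written candidates); `compile_prompt`, `toy_program`, and the data are hypothetical.

```python
from typing import Callable

Example = tuple[str, str]  # (input text, expected label)


def exact_match(prediction: str, expected: str) -> float:
    return 1.0 if prediction.strip().lower() == expected.strip().lower() else 0.0


def compile_prompt(
    candidates: list[str],
    program: Callable[[str, str], str],
    trainset: list[Example],
    metric: Callable[[str, str], float],
) -> tuple[str, float]:
    """Return the candidate instruction with the best mean metric, and its score."""

    def score(instruction: str) -> float:
        total = sum(metric(program(instruction, x), y) for x, y in trainset)
        return total / len(trainset)

    best = max(candidates, key=score)
    return best, score(best)


# Toy stand-in for an LM call: its output format depends on the instruction.
def toy_program(instruction: str, text: str) -> str:
    label = "positive" if "good" in text else "negative"
    if "one word" in instruction:
        return label
    return f"The sentiment is {label}."


trainset = [("good film", "positive"), ("dull plot", "negative")]
candidates = ["Classify the sentiment.", "Classify the sentiment in one word."]
best_instruction, best_score = compile_prompt(
    candidates, toy_program, trainset, exact_match
)
```

The dependence on labelled data noted as a weakness is visible here: without `trainset` there is nothing for the metric to score against.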
2.11 Haystack
Haystack (deepset) provides a modular pipeline with explicit retrieval, routing, memory, and generation seams [39][40]. deepset Cloud and deepset Enterprise are the commercial layers. MCP integrations arrived in 2025. Weaknesses: smaller enterprise footprint than LangChain.
2.12 Bedrock AgentCore
Amazon Web Services' AgentCore [19][41] ships a managed agent runtime with session isolation as a first-class runtime primitive, the Strands harness for code-defined agents, AgentCore Memory (managed long-term memory), and, added in 2026, quality evaluations and policy controls [10]. AgentCore skills for Claude Code and Kiro were released in early 2026; the claim "three API calls to a working agent" is the fastest time-to-working-agent of any platform surveyed. Weaknesses: AWS lock-in, preview features churn.
3. Comparison Matrix
We present five comparison tables across the twelve frameworks, covering core abstraction and orchestration model (Table 1), tool use plus memory plus observability (Table 2), deployment plus multi-tenancy plus streaming plus enterprise features plus licensing (Table 3), developer experience (Table 4), and documented weaknesses (Table 5).
Table 1: Core abstraction and orchestration model
| # | Framework | Core abstraction | Orchestration model | Primary language(s) |
|---|---|---|---|---|
| 1 | CrewAI | Crew + Flow | Role-based (Crews) over event-driven graph (Flows) | Python |
| 2 | LangChain + LangGraph | StateGraph | Graph, durable, human-in-the-loop | Python + TypeScript |
| 3 | Pydantic AI | Agent + Capabilities | Type-checked function-calling, composable capabilities | Python |
| 4 | Gemini Enterprise Agent Platform | ADK Agent + Agent Studio | Graph-based ADK, Agent Gateway networking, A2A | Python, Go, Java, TypeScript |
| 5 | MS Agent Framework (AutoGen + SK merged) | Agent + Workflow | Sequential, concurrent, handoff, group chat, Magentic-One, graph | .NET + Python |
| 6 | OpenAI Agents SDK | Agent + Handoff + Guardrail | Lightweight handoff graph, sandboxed long-horizon harness | Python + TypeScript |
| 7 | Anthropic MCP + Claude Agent SDK | Main agent + Subagents + Hooks | Hierarchical delegation via subagents, lifecycle hooks | Python + TypeScript |
| 8 | Semantic Kernel (maintenance) | Kernel + Plugin + Planner | Planner-driven sequential or parallel | .NET + Python |
| 9 | LlamaIndex Workflows / AgentWorkflow | Workflow step + Event | Event-driven steps, loops, parallel paths | Python |
| 10 | DSPy | Signature + Module + Optimizer | Compiled program, optimizer re-compiles prompts | Python |
| 11 | Haystack (deepset) | Pipeline + Component + Agent | Modular pipeline with retrieval, routing, memory | Python |
| 12 | AWS Bedrock AgentCore | Agent + Strands harness | Managed harness, session-isolated runtime | Python + TypeScript |
Table 2: Tool use, memory, observability
| # | Framework | Tool use | Memory / state | Observability |
|---|---|---|---|---|
| 1 | CrewAI | 100s of built-in + MCP | Shared short/long-term, entity, contextual | Built-in events |
| 2 | LangGraph | Function-calling + MCP + custom | Persistent state per node + checkpointer | LangSmith (15B+ traces, 100T tokens, Insights Agent, Polly) |
| 3 | Pydantic AI | Typed tool decorator + JSON-schema auto-gen + MCP | Durable via Pydantic AI Harness | Logfire |
| 4 | Gemini Enterprise | MCP + native GCP tools | Memory Bank (persistent cross-session) | Vertex AI Gen AI Evaluation + Cloud Trace |
| 5 | MS Agent Framework | MCP native + A2A | Session state + checkpointing + pause/resume | DevUI + OpenTelemetry + Azure Monitor |
| 6 | OpenAI Agents SDK | Tools + MCP + sandbox workspaces | Resumable sandbox sessions | Tracing dashboard + pluggable exporters |
| 7 | Claude Agent SDK | 10+ built-in + MCP (75+ connectors) | Subagent context isolation; hooks persist | Hooks: PreToolUse / PostToolUse / Stop / SessionStart |
| 8 | Semantic Kernel | Plugins + OpenAPI + MCP (2025) | Session memory + Kernel.Memory | OpenTelemetry |
| 9 | LlamaIndex | Function-calling + MCP + LlamaHub | Workflow context + vector + document stores | Instrumentation API + LlamaCloud |
| 10 | DSPy | ReAct + ProgramOfThought modules | Compiled program holds optimized prompts | MLflow + DSPy inspect |
| 11 | Haystack | Tool components + MCP integrations | Component-level state + ChatMessage stores | deepset Cloud |
| 12 | Bedrock AgentCore | Strands + MCP + AWS service actions | AgentCore Memory (managed) + session isolation | CloudWatch + AgentCore quality evals + policy controls |
Table 3: Deployment, multi-tenancy, streaming, enterprise, licensing
| # | Framework | Deployment | Multi-tenancy | Streaming | Enterprise (RBAC/SSO/Audit) | License |
|---|---|---|---|---|---|---|
| 1 | CrewAI | Self-host OSS + CrewAI AMP | Via AMP org scoping | SSE | AMP: RBAC, SSO, audit | MIT + commercial |
| 2 | LangGraph | Self-host + LangGraph Platform + LangSmith Deployment | Workspace-level | WS + SSE | LangSmith: SSO, RBAC, SOC2, SCIM | MIT + commercial |
| 3 | Pydantic AI | Self-host library | Caller-owned | Streaming | – (library) | MIT |
| 4 | Gemini Enterprise | GCP-only Agent Runtime | Full IAM isolation | Native + A2A | GCP IAM, VPC-SC, audit, CMEK | Commercial |
| 5 | MS Agent Framework | Self-host + Azure AI Foundry | Azure tenant | Streaming all patterns | Azure AD, RBAC, Purview | MIT + commercial |
| 6 | OpenAI Agents SDK | Self-host + OpenAI platform | Caller-owned | Streaming handoffs | OpenAI org management | MIT |
| 7 | Claude Agent SDK | Self-host (Anthropic API or Bedrock or Vertex) | Caller-owned | Streaming + hooks | Via provider | MIT |
| 8 | Semantic Kernel | Self-host | Caller-owned | Streaming | Azure integrations | MIT |
| 9 | LlamaIndex | Self-host + LlamaCloud | LlamaCloud projects | Streaming | LlamaCloud SOC2, RBAC | MIT + commercial |
| 10 | DSPy | Self-host library | Caller-owned | Via underlying LM | – | MIT |
| 11 | Haystack | Self-host + deepset Cloud + Enterprise | deepset Enterprise | Streaming | SSO, RBAC, audit | Apache 2.0 + commercial |
| 12 | Bedrock AgentCore | AWS-only (VPC, PrivateLink) | Full AWS-account + session | Streaming | IAM, KMS, CloudTrail | Commercial |
Table 4: Developer experience and CLI
| # | Framework | CLI / scaffold | IDE integration | Notable DX feature |
|---|---|---|---|---|
| 1 | CrewAI | crewai CLI | – | Role/goal/backstory DSL |
| 2 | LangGraph | langgraph CLI + Studio | LangSmith CLI, Polly NL debugger | Graph Studio visual editor |
| 3 | Pydantic AI | Standard Python | Full type-check + autocomplete | Structured output + self-correction |
| 4 | Gemini Enterprise | gcloud agents + Agent Studio | VS Code, JetBrains | Natural-language agent authoring |
| 5 | MS Agent Framework | agent-framework CLI + DevUI | VS Code | YAML declarative agents |
| 6 | OpenAI Agents SDK | openai-agents CLI + sandbox mgr | – | Handoff primitive + subagents + code mode |
| 7 | Claude Agent SDK | Claude Code CLI | IDE via Claude Code | Subagents + hooks + MCP |
| 8 | Semantic Kernel | sk CLI | Visual Studio, Rider | Planner-driven plugins |
| 9 | LlamaIndex | llamactl | – | One-command document agent templates |
| 10 | DSPy | dspy CLI | – | Compile + optimize programs |
| 11 | Haystack | haystack CLI | deepset Studio | Pipeline visual editor |
| 12 | Bedrock AgentCore | AgentCore CLI + AgentCore skills for Claude Code/Kiro | Any | Three API calls to working agent |
Table 5: Documented weaknesses (2026)
| # | Framework | Weaknesses |
|---|---|---|
| 1 | CrewAI | No checkpointing; NL coordination wastes tokens; coarse error handling; weaker monitoring than LangSmith |
| 2 | LangGraph | Steep learning curve; state-machine fluency required; LangSmith lock-in risk |
| 3 | Pydantic AI | Python-only; fewer multi-agent patterns; younger ecosystem |
| 4 | Gemini Enterprise | GCP lock-in; rebrand churn; pricing opacity |
| 5 | MS Agent Framework | AutoGen users must migrate; .NET-first docs bias |
| 6 | OpenAI Agents SDK | Small primitive set; sandbox harness Python-first |
| 7 | Claude Agent SDK | Opinionated toward coding; model-decided routing is nondeterministic |
| 8 | Semantic Kernel | Maintenance mode; new features go to MS Agent Framework |
| 9 | LlamaIndex | Document-centric framing makes pure-agent use feel tacked on |
| 10 | DSPy | Requires labelled data; compiled prompts opaque to debug |
| 11 | Haystack | Smaller enterprise footprint than LangChain |
| 12 | AgentCore | AWS lock-in; preview features churn; docs assume AWS fluency |
3.1 Ten Emerging Patterns
Pattern 1: MCP is the universal tool-use substrate. 97 million monthly downloads by February 2026, donated to Linux Foundation AAIF December 2025. Every framework ships native MCP or an integration layer [1][2].
Pattern 2: A2A is the emerging agent-to-agent layer. "MCP vertical, A2A horizontal" is the industry consensus. Expect critical mass by late 2027.
Pattern 3: Orchestration is graph-shaped. Directed graphs with explicit state, checkpointing, and HITL waitpoints. Chat (AutoGen GroupChat) and role-crew (early CrewAI) are now special cases inside graph runtimes.
Pattern 4: Managed runtimes are the 2026 commercial wedge. LangSmith Deployment, Vertex Agent Runtime, AgentCore plus Strands, OpenAI sandbox harness. Exactly-once, pause-resume, horizontal scaling.
Pattern 5: Typed structured output is table-stakes. Pydantic AI, OpenAI Guardrails, Claude output schemas, MS Agent Framework YAML, DSPy Signatures all converge.
Pattern 6: Two-tier observability market. Tier 1 OpenTelemetry-compatible tracing. Tier 2 LLM-native analytics (Insights Agent plus Polly, Logfire, AgentCore quality evals).
Pattern 7: Long-term memory is a first-class service. Gemini Memory Bank, AgentCore Memory, LangGraph long-term store, LlamaIndex persistent context.
Pattern 8: Single-agent-as-tool beats multi-agent-chat. OpenAI handoffs, Claude subagents, MS handoff orchestration: token cost and determinism drive this shift.
Pattern 9: CLI-first DX is a battleground. Time-from-zero-to-working-agent is a marketing metric.
Pattern 10: Legacy convergence. AutoGen plus SK merged into MS Agent Framework; Swarm replaced by Agents SDK; Vertex became Gemini Enterprise Agent Platform.
3.2 Comprehensive Feature Catalog Matrix: All Competitors vs Adverant Nexus v4.0
This section is the exhaustive comparison the prior tables (1–5) compressed. Where Tables 1–5 slice by dimension, Tables 6–20 catalog every distinguishing feature we identified across the twelve frameworks, then score each framework against it, plus a final column for how Adverant Nexus v4.0 implements the same capability.
Frameworks compared (column headers, left to right): CrewAI · LangGraph (LangChain) · Pydantic AI · Gemini Enterprise Agent Platform · MS Agent Framework · OpenAI Agents SDK · Claude Agent SDK · Semantic Kernel · LlamaIndex Workflows · DSPy · Haystack (deepset) · Bedrock AgentCore · Adverant Nexus v4.0 (last column, bold).
Cell legend: โ = native/first-class · โ = partial, preview, or via plugin · โฆ = possible via third-party/custom · โ = not supported · โ = not applicable. Numbers like "§7.4" in the Nexus 4.0 column point to the v4.0 sub-section that implements the feature. The data reflects public information as of 2026-04-24.
Table 6: Orchestration Primitives
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6.1 | Graph-based state machine | โ | โ | โฆ | โ | โ | โ | โฆ | โฆ | โ | โ | โ | โ | โ ยง7.2 |
| 6.2 | Sequential workflow pattern | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 |
| 6.3 | Concurrent / parallel fan-out | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 |
| 6.4 | Conditional branching | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 |
| 6.5 | Human-in-the-loop waitpoint | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 Tier 4 |
| 6.6 | Checkpointing / pause-resume | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 + ยง7.11 |
| 6.7 | Durable execution (exactly-once) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 |
| 6.8 | Dynamic DAG modification | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 |
| 6.9 | Loops / recursion | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 |
| 6.10 | Batch dispatch / fan-in aggregation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 (UNO batch) |
| 6.11 | Named queues with priority | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง10 Phase 25 |
| 6.12 | Tier-based execution taxonomy | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 (4 tiers) |
| 6.13 | Role-based agent DSL | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง10 (absorb CrewAI) |
| 6.14 | Workflow-as-code | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 (bindings alt) |
| 6.15 | Workflow-as-YAML / declarative | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 (actions[]) |
| 6.16 | Autonomous goal decomposition | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 Tier 4 |
| 6.17 | Multi-agent handoff primitive | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 + ยง7.6 |
| 6.18 | Group-chat / GroupChat pattern | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (absorb if needed) |
| 6.19 | Magentic-One orchestrator pattern | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ |
| 6.20 | Competition / consensus ensemble | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 Tier 4 |
Table 7: Tool Use (Agent-to-Tool Plane)
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7.1 | Native MCP support (as of 2026) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.6 |
| 7.2 | Built-in tool library size | 100+ | 200+ | 0 (decorate) | GCP stack | 50+ | 20+ | 10 + 75 MCP | Plugins | LlamaHub | ReAct/POT | Components | 30+ AWS | 729+ skills |
| 7.3 | Function-calling abstraction | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.6 |
| 7.4 | JSON Schema auto-gen from types | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.7 |
| 7.5 | Tool allowlist / scope per role | โ | โ | โ | โ | โ | โ | โ (hooks) | โ | โ | โ | โ | โ | โ ยง7.4 + ยง7.15 |
| 7.6 | Tool call sandboxing | โ | โ | โ | โ | โ | โ (sandbox) | โ | โ | โ | โ | โ | โ | โ ยง7.12 airgap; ยง10 Phase 8 |
| 7.7 | Code execution tool (REPL/Python) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง10 Phase 8 |
| 7.8 | Filesystem tool with path-allowlist | โ | โ | โ | โ | โ | โ | โ (hooks) | โ | โ | โ | โ | โ | โ ยง7.4 hooks |
| 7.9 | Shell / kubectl tool with policy | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง8 C8 + ยง9 #8 |
| 7.10 | Tool call retry on failure | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.4 hooks |
| 7.11 | Tool cost attribution | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 |
| 7.12 | Tool output schema validation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.7 |
Table 8: Agent-to-Agent Communication
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8.1 | A2A protocol (Google standard) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.6 (target) |
| 8.2 | Subagents / hierarchical delegation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 Tier 4 |
| 8.3 | Agent handoff with context | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 + ยง7.6 |
| 8.4 | Inter-agent message signing | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 + Appendix E |
| 8.5 | Cross-org / cross-tenant A2A | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.6 (via SPIFFE) |
| 8.6 | Agent discovery / registry | โ | โ | โ | โ (Agent Gateway) | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.6 + nexus-auth |
| 8.7 | Local-only A2A (airgap) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.12 |
Table 9: Memory and State
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9.1 | Short-term (conversation) memory | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.5 |
| 9.2 | Long-term (cross-session) memory | โ | โ | โ | โ (Memory Bank) | โ | โ | โ | โ | โ | โ | โ | โ (AgentCore Memory) | โ ยง7.5 Memory Bank |
| 9.3 | Entity / semantic memory | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.5 (GraphRAG integration) |
| 9.4 | Per-tenant memory isolation | โ | โ (workspace) | โ | โ (IAM) | โ (tenant) | โ | โ | โ | โ (LlamaCloud) | โ | โ | โ (session) | โ ยง7.5 cryptographic |
| 9.5 | Cryptographic per-tenant KEK | โ | โ | โ | โ (CMEK) | โ | โ | โ | โ | โ | โ | โ | โ (KMS) | โ ยง7.5 (envelope + HSM/TPM) |
| 9.6 | Crypto-erasure (DEK destruction) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.5 + ยง7.14 GDPR |
| 9.7 | Memory checkpointer | โ | โ | โ (Harness) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.5 |
| 9.8 | Vector store integration | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ Qdrant (v3 retained) |
| 9.9 | Knowledge-graph memory | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ Neo4j (v3 retained) |
| 9.10 | Managed memory service | โ | โ (LangSmith) | โ (Logfire) | โ | โ (Azure) | โ | โ | โ | โ (LlamaCloud) | โ | โ (deepset) | โ | โ ยง7.5 |
Table 10: Observability
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10.1 | OpenTelemetry emission | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (retained v3) |
| 10.2 | LLM-call level tracing | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 span tree |
| 10.3 | Tool-call level tracing | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 |
| 10.4 | Hierarchical parent-child spans | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 12-type enum |
| 10.5 | Closed span-type taxonomy | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 (12 types, UNO) |
| 10.6 | Automatic trace clustering | โ | โ (Insights Agent) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (quality evals) | โ ยง7.9 |
| 10.7 | Natural-language trace query | โ | โ (Polly) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 Polly-NL |
| 10.8 | Quality evaluation harness | โ | โ (LangSmith evals) | โ (Logfire) | โ (Vertex Eval) | โ | โ | โ | โ | โ | โ (DSPy metric) | โ (deepset evals) | โ (AgentCore evals) | โ ยง7.8 + ยง7.9 |
| 10.9 | Anomaly / regression detection | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 Insights |
| 10.10 | Cost hotspot analysis | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 + ยง7.10 |
| 10.11 | Storage tiering (hot/cold) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 (PG + ClickHouse) |
| 10.12 | No-sampling (full record) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.9 (EU AI Act Art. 12) |
Table 11: Deployment Targets
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11.1 | Self-hosted OSS | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (all profiles) |
| 11.2 | Managed cloud runtime | โ (AMP) | โ (LangSmith) | โ | โ (Vertex) | โ (Azure) | โ (OpenAI) | โ | โ | โ (LlamaCloud) | โ | โ (deepset) | โ (AWS) | โ ยง11 public-cloud |
| 11.3 | Single-tenant VDS | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง11 VDS |
| 11.4 | On-premise Kubernetes | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง11 on-prem |
| 11.5 | Airgapped sealed bundle | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.12 + ยง11 airgap |
| 11.6 | Sub-second cold start | โ | โ | โ | โ (Agent Runtime) | โ | โ | โ | โ | โ | โ | โ | โ (AgentCore) | โ ยง7.13 + UNO dispatch |
| 11.7 | Horizontal scaling (stateless dispatch) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.1 (UNO retained) |
| 11.8 | GPU scheduling (own hw) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.12 airgap + ยง4 gpu-queue |
| 11.9 | BYO-LLM endpoints | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (4 adapters + BYO) |
| 11.10 | FIPS 140-3 crypto modules | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.12 + ยง7.14 FedRAMP |
| 11.11 | STIG-compliant base images | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.12 + ยง7.14 DoD IL5 |
| 11.12 | Monthly delta update bundle | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.12 |
Table 12: Multi-Tenancy and Security
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12.1 | Organization / workspace scoping | โ (AMP) | โ | โ | โ (IAM) | โ | โ | โฆ | โ | โ (LlamaCloud) | โ | โ (Enterprise) | โ (AWS acct) | โ ยง4.5 (retained) |
| 12.2 | Row-level security (database) | โ | โ | โ | โ (BigQuery RLS) | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.5 Postgres RLS |
| 12.3 | Payload/vector filter isolation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.5 Qdrant + Neo4j |
| 12.4 | Service mesh mTLS (SPIFFE) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.5 Istio retained |
| 12.5 | JWT + middleware tenant headers | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.5 |
| 12.6 | SSO (SAML/OIDC) | โ (AMP) | โ (LangSmith) | โ | โ | โ | โ | โ | โ | โ (LlamaCloud) | โ | โ (Enterprise) | โ (IAM) | โ (nexus-auth) |
| 12.7 | RBAC with fine-grained permissions | โ (AMP) | โ (LangSmith) | โ | โ | โ | โ | โ | โ | โ (LlamaCloud) | โ | โ | โ | โ ยง7.4 + ยง7.15 |
| 12.8 | SCIM user provisioning | โ | โ (LangSmith) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง10 (future) |
| 12.9 | Per-tenant encryption keys | โ | โ | โ | โ (CMEK) | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.5 KEK (HSM/TPM) |
| 12.10 | Post-quantum crypto (hybrid) | โ | โ | โ | โ (roadmap) | โ (roadmap) | โ | โ | โ | โ | โ | โ | โ (roadmap) | โ ยง7.14 roadmap |
| 12.11 | Network policy / AuthorizationPolicy | โ | โ | โ | โ (VPC-SC) | โ | โ | โ | โ | โ | โ | โ | โ (PrivateLink) | โ ยง4.5 Istio |
| 12.12 | OPA/Rego policy engine | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (AgentCore policy) | โ ยง7.14 Appendix H |
Table 13: Governance, Risk, and Compliance
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13.1 | Risk-tier classification (e.g. EU AI Act) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 EU AI Act |
| 13.2 | Human-oversight gate for high-risk | โ | โ (HITL) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 Tier 4 |
| 13.3 | Data residency enforcement | โ | โ | โ | โ (regions) | โ (Purview) | โ | โ | โ | โ | โ | โ | โ (AWS regions) | โ ยง7.14 + ยง4.1 |
| 13.4 | GDPR right-to-erasure flow | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 + ยง9 #11 |
| 13.5 | DPIA / impact-assessment generator | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 |
| 13.6 | Conformity assessment records | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 EU AI Act |
| 13.7 | SOC 2 evidence pipeline | โ | โ (LangSmith attest.) | โ | โ | โ | โ | โ | โ | โ (LlamaCloud) | โ | โ (Enterprise) | โ | โ ยง7.14 + Appendix G |
| 13.8 | ISO 27001 mapping | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ Appendix G |
| 13.9 | ISO 42001 AI management system | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 + ยง7.3 |
| 13.10 | HIPAA BAA-aware routing | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 + ยง9 #50 |
| 13.11 | FedRAMP Moderate authorization | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 airgap |
| 13.12 | FedRAMP High / DoD IL5 | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 airgap |
| 13.13 | NIST AI RMF alignment | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 + Appendix G |
| 13.14 | OWASP LLM Top 10 defenses | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 + ยง8.E-E3 |
| 13.15 | MITRE ATLAS threat tagging | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 |
| 13.16 | Export-control tags (EAR/ITAR) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 |
| 13.17 | Auditor-export CLI / package | โ | โ (audit logs) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.13 + Appendix I |
| 13.18 | Automatic model card generation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 + ยง7.14 |
| 13.19 | Per-skill threat model declaration | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 SKILL.md v2 |
| 13.20 | Watermarking / C2PA on artefacts | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 C2PA v2 |
Table 14: Cost and FinOps
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14.1 | Token usage tracking per call | โ | โ | โ (Logfire) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 |
| 14.2 | Cost attribution per trace | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 |
| 14.3 | Cost attribution per tenant/org | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 |
| 14.4 | Cost attribution per skill / workflow | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 |
| 14.5 | Cost attribution per user | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 |
| 14.6 | Pre-dispatch budget reservation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 + ยง8.B-B14 |
| 14.7 | Per-skill daily cost cap | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (policy) | โ ยง7.10 + ยง7.15 |
| 14.8 | Per-run hard cost cap | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.2 Tier 4 + ยง7.10 |
| 14.9 | Circuit breaker on failure-rate | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.10 |
| 14.10 | Provider failover on 5xx | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง10 Phase 26 |
| 14.11 | Cache-hit optimization | โ | โ (sem-cache) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.13 + ยง9 #2 |
| 14.12 | OnCostThreshold hook/callback | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.4 + ยง7.10 |
Table 15: Skill / Plugin / Marketplace Management
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15.1 | Skill / plugin registry | โ | โ | โ | โ (Agent Studio) | โ | โ | โ (MCP dir) | โ | โ | โ | โ | โ (AgentCore skills) | โ ยง7.3 marketplace |
| 15.2 | Semantic versioning | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 |
| 15.3 | Version pinning at runtime | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 skill_version_pin |
| 15.4 | Cryptographic signing (sigstore) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 |
| 15.5 | SBOM attestation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 |
| 15.6 | CVE scanning at runtime | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 |
| 15.7 | Runtime quality score | โ | โ (LangSmith evals) | โ | โ (Eval) | โ | โ | โ | โ | โ | โ (metric) | โ | โ (AgentCore evals) | โ ยง7.3 |
| 15.8 | Auto-rollback on quality drop | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 + ยง7.15 |
| 15.9 | A/B experiments on skills | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 ab_experiment |
| 15.10 | Adversarial eval suite | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 |
| 15.11 | Declarative install via manifest | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 nexus.manifest.json |
| 15.12 | Per-tenant private marketplace | โ (AMP) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 + ยง11 |
| 15.13 | Airgapped skill marketplace | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.12 |
| 15.14 | Skill synthesis / composition | โ | โ | โ | โ | โ | โ | โ | โ (Planner) | โ | โ | โ | โ | โ ยง7.3 (skill-synthesizer) |
Table 16: Prompt and Contract Management
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16.1 | Typed input schema | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.7 |
| 16.2 | Typed output schema | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.7 |
| 16.3 | Automatic validation + retry | โ | โ | โ | โ | โ | โ (Guardrails) | โ | โ | โ | โ | โ | โ | โ ยง7.7 |
| 16.4 | Optimizer-compiled prompts | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (MIPROv2, GEPA) | โ | โ | โ ยง7.8 (absorb DSPy) |
| 16.5 | Prompt versioning | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 + ยง7.8 |
| 16.6 | Prompt template registry | โ (Crews) | โ (LangSmith hub) | โ | โ (Agent Studio) | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.3 |
| 16.7 | Guardrails for prompt injection | โ | โ | โ | โ | โ | โ (Guardrails) | โ | โ | โ | โ | โ | โ | โ ยง7.4 + ยง8.E-E3 |
| 16.8 | Few-shot compilation vs metric | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.8 |
Table 17: Streaming, Async, and WebSocket
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 17.1 | Server-Sent Events streaming | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.4 |
| 17.2 | WebSocket streaming | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.4 (Socket.IO) |
| 17.3 | Resumable streams | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.4 ring buffer |
| 17.4 | Progress events (structured) | โ | โ | โ | โ | โ | โ | โ (hooks) | โ | โ | โ | โ | โ | โ ยง4.4 (17 event types) |
| 17.5 | Per-tenant WS channel isolation | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.4 (org:plugin rooms) |
| 17.6 | Pub/Sub fan-out | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.4 Redis Pub/Sub |
| 17.7 | Backpressure signalling | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (BullMQ queue) |
Table 18: Developer Experience
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18.1 | First-class CLI | โ | โ | โ | โ (gcloud) | โ | โ | โ (claude) | โ | โ (llamactl) | โ | โ | โ | โ ยง7.13 CLI 2.0 |
| 18.2 | Visual graph / workflow editor | โ | โ (Studio) | โ | โ (Agent Studio) | โ (DevUI) | โ | โ | โ | โ | โ | โ (deepset) | โ | โ ยง7.13 (chain visualize) |
| 18.3 | NL agent authoring | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.13 (future) |
| 18.4 | REPL / interactive shell | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.13 nexus shell |
| 18.5 | Type safety (end-to-end) | โ | โ (TS) | โ (Py types) | โ | โ (.NET) | โ (TS) | โ (TS) | โ (.NET) | โ | โ | โ | โ | โ ยง7.7 |
| 18.6 | Hot-reload during dev | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ |
| 18.7 | IDE extension (VS Code etc.) | โ | โ | โ | โ | โ | โ | โ (Claude Code) | โ (Rider/VS) | โ | โ | โ | โ | โ (future) |
| 18.8 | Debugger with span inspection | โ | โ (Polly) | โ (Logfire) | โ | โ (DevUI) | โ | โ | โ | โ | โ (inspect) | โ | โ | โ ยง7.9 span tree |
| 18.9 | Time-from-zero-to-agent metric | ~hours | ~day | ~min | ~hours | ~hours | ~min | ~min | ~hours | ~min | ~hours | ~hours | ~3 API calls | ~min ยง7.13 |
| 18.10 | Session save/resume | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.6 (CLI v1+v2) |
| 18.11 | Tab completion / IntelliSense | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง4.6 |
| 18.12 | Plugin scaffold generator | โ | โ | โ | โ | โ | โ | โ (MCP) | โ | โ | โ | โ | โ | โ plugin-template |
Table 19: Content Provenance, Audit, and Replay
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19.1 | Deterministic replay | โ | โ (claim) | โ (Harness) | โ | โ (claim) | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 |
| 19.2 | Bit-for-bit replay manifest | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 replay_manifest |
| 19.3 | Tool-output capture | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 |
| 19.4 | Model version pinning | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 |
| 19.5 | Prompt hash capture | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 |
| 19.6 | C2PA artefact manifest | โ | โ | โ | โ (research) | โ (Purview) | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 C2PA v2 |
| 19.7 | HITL approval signing in chain | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.11 + ยง7.2 Tier 4 |
| 19.8 | Hash chain across spans | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ Appendix B span_hash |
| 19.9 | Audit retention ≥ 7 years | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ (CloudTrail) | โ §7.14 FedRAMP |
| 19.10 | Erasure certificate on delete | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.14 GDPR |
Table 20: UI/UX Bindings and Action Configuration
(This category is where Nexus v4.0 is most differentiated; we include it in the catalog precisely because no surveyed framework treats it as a primitive.)
| # | Feature | CrewAI | LangGraph | Pydantic | Gemini | MS Agent | OpenAI | Claude | SemKernel | LlamaIdx | DSPy | Haystack | AgentCore | Nexus 4.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20.1 | User-configurable action-to-skill mapping | โ | โ | โ | โ (Agent Studio) | โ (YAML) | โ | โ | โ | โ | โ | โ | โ | โ §7.15 |
| 20.2 | Plugin manifest declarative buttons | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 actions[] |
| 20.3 | Scope hierarchy (user / project / org / system) | โ | โ (workspace) | โ | โ (IAM) | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 |
| 20.4 | Priority-based resolution | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 |
| 20.5 | Per-binding cost cap | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 cost_cap_usd |
| 20.6 | Per-binding model + tier override | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 |
| 20.7 | Input schema + inputs_mapping template | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 |
| 20.8 | Visual binding editor (non-developer) | โ | โ | โ | โ (Agent Studio) | โ (DevUI) | โ | โ | โ | โ | โ | โ | โ | โ ยง8.D-D9 |
| 20.9 | Policy-gated override (OPA) | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 + ยง8.E-E8 |
| 20.10 | A/B experiment on binding_key | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 + ยง9 #12 |
| 20.11 | Binding audit trail | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 |
| 20.12 | Auto-deactivation on quality drop | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ | โ ยง7.15 + ยง9 #35 |
3.3 Matrix-Level Observations
Reading Tables 6–20 vertically (as a catalog of Nexus's position per feature), three patterns dominate.
Where Nexus v4.0 is strictly differentiated (rows where no competitor is marked native while Nexus is): the airgapped sealed-bundle mode (11.5, 15.13), pre-dispatch FinOps reserve (14.6) with the full per-skill plus per-run plus per-binding budget matrix (14.7 + 14.8 + OnCostThreshold hook 14.12), C2PA artefact manifests (19.6), bit-for-bit replay manifests (19.2), hash-chain across spans (19.8), erasure certificates on delete (19.10), DPIA auto-generation (13.5), conformity-assessment records (13.6), ISO 42001 management-system primitives (13.9), per-skill threat-model declaration (13.19), no-sampling span recording (10.12), the closed twelve-type span taxonomy (10.5), and the entire UI/UX Bindings table (20.1–20.12, every cell). Twenty-plus features have no turnkey equivalent in any surveyed framework.
Where Nexus v4.0 is at parity (Nexus is marked native, and multiple competitors are too): MCP (7.1), function-calling (7.3), short-term memory (9.1), tool-call tracing (10.3), SSO (12.6), RBAC (12.7), WS streaming (17.2), SSE (17.1), CLI (18.1). These are table-stakes; we match rather than lead.
Where Nexus v4.0 is currently behind (Nexus is marked partial or unsupported while competitors are marked native): Magentic-One-style orchestrator (6.19, shipped only by MS Agent Framework, which we defer), NL agent authoring (18.3, where Gemini Enterprise leads), IDE extension (18.7, where LangGraph, Pydantic AI, and Claude Code lead), and SCIM provisioning (12.8, which LangSmith, Azure, and AWS ship). These are the planned follow-ups for v4.1.
The catalog deliberately over-scores the competitors (any partial implementation is marked native, not partial) to avoid self-flattery. Even so, on the twenty-odd rows that constitute v4.0's differentiating wedges (airgap, crypto tenant isolation, FinOps primitives, replay plus C2PA, bindings as a first-class substrate), the market shows a consistent gap.
4. Adverant Nexus Stack v3: Current State
This section maps the production Nexus stack as running on 24 April 2026: v6.2.1, 44 microservices on K3s at the Adverant cloud VPS, backed by PostgreSQL with row-level security, Redis, Neo4j, and Qdrant. Every file path cited resolves in the Adverant-Nexus monorepo. Where the UNO paper describes architecture that has since been revised, we cite the UNO section and the subsequent migration that changed it.
4.1 The Unified Nexus Orchestrator
UNO is the single dispatch entry point. The authoritative route is POST /api/v1/dispatch in services/nexus-orchestrator/src/routes/dispatch-routes.ts. UNO validates the request via Zod schemas (services/nexus-orchestrator/src/types/dispatch.ts) and resolves the job type to a skill from ros.tool_registry (not graphrag.skill_registry as described in the UNO paper Section 6, which was dropped in the Q2 2026 Skills Engine consolidation). It then runs governance pre-checks for risk classification and data residency, inserts a row into orchestrator.runs via services/nexus-orchestrator/src/services/run-tracker.ts, enqueues the job to BullMQ with priority mapping, and returns HTTP 202 Accepted. Chain DAG coordination is handled by services/nexus-orchestrator/src/services/chain-coordinator.ts, which maintains a state machine reacting to step-completion callbacks. Job events flow out via services/nexus-orchestrator/src/services/ws-emitter.ts to Redis Pub/Sub channels named nexus:jobs:org:{orgId}.
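The dispatch sequence described above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not the code in dispatch-routes.ts: the request shape, the `handleDispatch` and `isDispatchRequest` names, the priority table, and the Map and array stand-ins for ros.tool_registry, orchestrator.runs, and the BullMQ queue are all hypothetical.

```typescript
// Hypothetical sketch of the UNO dispatch sequence; every name below is illustrative.
type Priority = 'low' | 'normal' | 'high';

interface DispatchRequest {
  orgId: string;
  jobType: string;
  priority: Priority;
  payload: Record<string, unknown>;
}

interface DispatchResponse {
  status: number;
  body: { runId?: string; error?: string };
}

// In-memory stand-ins for ros.tool_registry, orchestrator.runs, and the job queue.
const toolRegistry = new Map<string, { skill: string }>([
  ['summarize', { skill: 'doc-summarizer' }],
]);
const runs: { runId: string; orgId: string; skill: string }[] = [];
const queue: { runId: string; priority: number }[] = [];

// Assumed mapping; in BullMQ a lower number means a higher priority.
const PRIORITY_MAP: Record<Priority, number> = { high: 1, normal: 5, low: 10 };

// Hand-rolled type guard standing in for the Zod schema validation.
function isDispatchRequest(x: unknown): x is DispatchRequest {
  if (typeof x !== 'object' || x === null) return false;
  const r = x as Record<string, unknown>;
  return (
    typeof r.orgId === 'string' &&
    typeof r.jobType === 'string' &&
    (r.priority === 'low' || r.priority === 'normal' || r.priority === 'high') &&
    typeof r.payload === 'object' &&
    r.payload !== null
  );
}

function handleDispatch(raw: unknown): DispatchResponse {
  // 1. Validate the request.
  if (!isDispatchRequest(raw)) return { status: 400, body: { error: 'invalid request' } };

  // 2. Resolve the job type to a skill.
  const entry = toolRegistry.get(raw.jobType);
  if (!entry) return { status: 404, body: { error: 'unknown job type' } };

  // 3. Governance pre-checks (risk classification, data residency) would run here.

  // 4. Record the run, then 5. enqueue it with a mapped priority.
  const runId = `run-${runs.length + 1}`;
  runs.push({ runId, orgId: raw.orgId, skill: entry.skill });
  queue.push({ runId, priority: PRIORITY_MAP[raw.priority] });

  // 6. Acknowledge: 202 means the job was accepted, not that it has completed.
  return { status: 202, body: { runId } };
}
```

Returning 202 rather than 200 keeps the route stateless: the caller learns the run identifier immediately and follows progress through the event channel instead of holding the HTTP connection open.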
The four-tier execution taxonomy is defined in services/nexus-orchestrator/src/types/execution.ts:
```typescript
type ExecutionTier = 'llm_only' | 'tool_using' | 'chain' | 'autonomous';
```
Tier 1 (llm_only) runs inline when the timeout is under 30 seconds; Tier 2 (tool_using) runs ReAct loops up to execution_config.maxIterations; Tier 3 (chain) runs DAGs via the chain coordinator; Tier 4 (autonomous) is declared in the type but, as discussed in Section 5, has no concrete HITL waitpoint code, no cost cap, and no determinism-replay semantics in v3.
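As a rough illustration of how the four tiers map to execution strategies, the sketch below routes a tier to a plan. It is not the orchestrator's code: `planExecution`, the `Plan` type, and the `ExecutionConfig` field names are hypothetical, the queued fallback for longer Tier 1 timeouts is an assumption, and the `unsupported` result for Tier 4 simply models the v3 gap noted above.

```typescript
// Hypothetical tier-to-strategy routing; names and fallbacks are assumptions.
type ExecutionTier = 'llm_only' | 'tool_using' | 'chain' | 'autonomous';

interface ExecutionConfig {
  timeoutMs: number;
  maxIterations: number;
}

type Plan =
  | { mode: 'inline' }
  | { mode: 'queued' }
  | { mode: 'react_loop'; maxIterations: number }
  | { mode: 'chain_dag' }
  | { mode: 'unsupported'; reason: string };

const INLINE_TIMEOUT_MS = 30000; // "under 30 seconds" from the text above

function planExecution(tier: ExecutionTier, cfg: ExecutionConfig): Plan {
  switch (tier) {
    case 'llm_only':
      // Short calls run inline; the queued fallback is an assumption.
      return cfg.timeoutMs < INLINE_TIMEOUT_MS ? { mode: 'inline' } : { mode: 'queued' };
    case 'tool_using':
      // ReAct loop bounded by the configured iteration budget.
      return { mode: 'react_loop', maxIterations: cfg.maxIterations };
    case 'chain':
      // DAG execution is delegated to the chain coordinator.
      return { mode: 'chain_dag' };
    case 'autonomous':
      // Declared in the type, but v3 has no waitpoint, cost cap, or replay semantics.
      return { mode: 'unsupported', reason: 'Tier 4 has no concrete implementation in v3' };
  }
}
```

Because the switch is exhaustive over the union, adding a fifth tier to `ExecutionTier` would make the compiler flag this function until a branch is added.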
4.2 The Skills Engine
The Adverant-Nexus-Skills-Engine (documented in Adverant-Nexus-Skills-Engine/docs/skill-format.md) is the source of truth for skill metadata. 729+ SKILL.md files across the plugin fleet declare capabilities, tool requirements, triggers, visibility, and status. The engine's LLM client (Adverant-Nexus-Skills-Engine/src/services/llm-client.ts) routes calls to the gateway AI Provider Router, with exponential-backoff retry and SSE streaming for long-running (400-second) operations. Skills are resolved at dispatch time by ros.tool_registry.job_type lookup. A unified skill registry UI does not yet exist; skills are browsable only programmatically.
4.3 The AI Provider Router
The AI Provider Router (services/nexus-gateway/src/services/ai-provider-router.ts) is the single service authorised to call external LLM APIs. The ALLOWED_AI_CALLERS principal list, enforced at three layers (Istio AuthorizationPolicy, service-key HMAC, caller-identity verification), contains exactly three services: nexus-orchestrator, nexus-workflows, and chat-orchestrator (the chat exception). Four adapters implement the provider abstraction: GeminiAdapter, AnthropicAdapter, ClaudeMaxAdapter, and OpenRouterAdapter. Per-organization configuration is resolved via resolveOrgConfig() hitting nexus-auth, which returns AES-256-decrypted keys. Role-based routing (default, fast, reasoning, code, long-context) derives from roleAssignments.default in the org config. The tool-calling loop runs up to MAX_TOOL_ITERATIONS rounds, stopping when no tool calls are returned or the iteration limit is reached.
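The role lookup and the bounded tool-calling loop can be sketched as follows. This is an illustrative, synchronous simplification rather than the router's implementation: `resolveModel`, `runToolLoop`, the `OrgConfig` shape, the fallback to the default role, and the value chosen for `MAX_TOOL_ITERATIONS` are all assumptions.

```typescript
// Hypothetical sketch of role-based routing and a bounded tool-calling loop.
type Role = 'default' | 'fast' | 'reasoning' | 'code' | 'long-context';

interface OrgConfig {
  roleAssignments: Partial<Record<Role, string>> & { default: string };
}

// Assumed behaviour: roles without an explicit assignment fall back to the default.
function resolveModel(cfg: OrgConfig, role: Role): string {
  return cfg.roleAssignments[role] ?? cfg.roleAssignments.default;
}

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ModelTurn {
  text: string;
  toolCalls: ToolCall[];
}

type ModelFn = (history: string[]) => ModelTurn;
type ToolFn = (call: ToolCall) => string;

const MAX_TOOL_ITERATIONS = 5; // assumed value; the source does not state it

function runToolLoop(
  model: ModelFn,
  runTool: ToolFn,
  prompt: string,
): { text: string; iterations: number; hitLimit: boolean } {
  const history: string[] = [prompt];
  let turn = model(history);
  let iterations = 0;

  // Stop when the model requests no tools or the iteration budget is spent.
  while (turn.toolCalls.length > 0 && iterations < MAX_TOOL_ITERATIONS) {
    for (const call of turn.toolCalls) {
      history.push(`${call.name}: ${runTool(call)}`);
    }
    iterations += 1;
    turn = model(history);
  }

  // hitLimit is true when tool calls were still pending at the cutoff.
  return { text: turn.text, iterations, hitLimit: turn.toolCalls.length > 0 };
}
```

Surfacing `hitLimit` separately from the text lets a caller distinguish a natural stop from a truncated run, which matters for cost attribution and retries.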
4.4 WebSocket Event Relay and PCC
Job events flow from the orchestrator to the dashboard through a three-layer relay. The orchestrator publishes to Redis Pub/Sub channel nexus:jobs:org:{orgId}. services/nexus-gateway/src/websocket/job-event-relay.ts subscribes with pattern matching and emits to Socket.IO rooms; org-plus-plugin rooms are the v3 target state, with org-plus-user rooms retained under a compat mode mirror. A ring buffer of 50 events per channel prevents event loss during transient Redis reconnections. The Progress Command Center (nexus-dashboard/src/stores/progress-command-center-store.ts) is a Zustand store with localStorage persistence that subscribes to these events and exposes the TrackedJob model: jobId, runId, status, stage, progress, steps, ReAct thinking log, tool-call trace, billing breakdown, governance metadata (risk level, data residency, flagged-for-review), and HPC session state (log buffer up to 500 lines, GPU metrics).
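A fixed-capacity ring buffer of the kind described above might look like the sketch below. The `EventRingBuffer` class and its method names are hypothetical; only the default capacity of 50 comes from the text, and the relay's actual replay protocol is not shown.

```typescript
// Hypothetical fixed-capacity ring buffer; the oldest event is overwritten when full.
class EventRingBuffer<T> {
  private readonly capacity: number;
  private readonly slots: (T | undefined)[];
  private next = 0; // index where the next event will be written
  private count = 0; // number of events currently held

  constructor(capacity: number = 50) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError('capacity must be a positive integer');
    }
    this.capacity = capacity;
    this.slots = new Array<T | undefined>(capacity).fill(undefined);
  }

  push(event: T): void {
    this.slots[this.next] = event;
    this.next = (this.next + 1) % this.capacity;
    this.count = Math.min(this.count + 1, this.capacity);
  }

  // Returns the buffered events oldest-first, e.g. to replay after a reconnect.
  snapshot(): T[] {
    const start = (this.next - this.count + this.capacity) % this.capacity;
    const out: T[] = [];
    for (let i = 0; i < this.count; i++) {
      out.push(this.slots[(start + i) % this.capacity] as T);
    }
    return out;
  }
}

// One buffer per channel, keyed by channel name (illustrative usage).
const buffers = new Map<string, EventRingBuffer<string>>();
function record(channel: string, event: string): void {
  let buf = buffers.get(channel);
  if (!buf) {
    buf = new EventRingBuffer<string>();
    buffers.set(channel, buf);
  }
  buf.push(event);
}
```

A bounded buffer trades completeness for constant memory per channel: a disconnect longer than 50 events still loses the oldest ones, so it covers transient gaps rather than extended outages.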
4.5 Multi-tenant Isolation (Four Layers)
Layer 1 is middleware: Express and Next.js middleware reject requests lacking X-Organization-Id, X-App-Id, X-User-Id headers (JWT-derived). Layer 2 is PostgreSQL row-level security: session variables app.current_company_id, app.current_app_id, app.current_user_id gate every SELECT, INSERT, UPDATE, and DELETE via RLS policies. Layer 3 is payload-and-label filtering: Qdrant filters searches by org_id in vector payloads, and Neo4j Cypher queries carry WHERE org_id = $orgId clauses. Layer 4 is Istio: AuthorizationPolicy per service with SPIFFE identity verification, mTLS between pods, and NetworkPolicy whitelists.
4.6 Adverant Nexus CLI (v1)
The current CLI auto-discovers 44+ microservices from Docker Compose, Kubernetes, and OpenAPI specifications, exposes 70+ MCP tools as commands, and supports a ReAct agent mode with up to 20 autonomous iterations. Commands include nexus services list, nexus mcp tools, nexus ask "<prompt>", nexus workflows list | run, nexus session save | resume, and nexus monitor. Notable gaps: the CLI is not integrated with orchestrator /api/v1/dispatch; there is no streaming tail of run events; no chain DAG visualization; no skill publish, sign, or verify; no airgap bundle build; and no PCC mirror in-terminal.
4.7 Marketplace Plugin Template
The plugin template (/Users/don/Adverant/plugin-template/) scaffolds a Next.js 14 frontend (static export) with a JWT-protected PluginGate component, Zustand stores for dashboard auth, PCC integration, WebSocket state, and theme, plus hooks for iframe embedding detection and Terminal Computer page context. The backend is Express with TypeScript, JWT middleware, and service layers. nexus.manifest.json declares plugin metadata. Plugins dispatch jobs via POST /api/v1/dispatch with a plugin-scoped trace context. The template does not yet support declarative button bindings in the manifest; that gap is closed in v4.0 Section 7.15.
4.8 Where v3 Stops Short
The UNO paper honestly disclosed seven open gaps: Phase 7 multi-provider routing partial, Phase 8 tool executors partial, per-queue pod deployments unrealized, token-quota precision issues, chain engine formalization incomplete, span storage tiering absent, multi-agent cost controls unfinished. Beyond these, v4.0 addresses ten additional gaps identified through retrospective: Tier 4 HITL specification, chain state persistence beyond Redis TTL, skill versioning enforcement, scheduled dispatch API unification, centralized skill registry UI, cost attribution granularity, multi-agent orchestration formalism, airgapped deployment documentation, event ordering guarantees, and provider failover.
5. Gap Analysis
We separate seven industry gaps (unsolved by any of the twelve surveyed frameworks) from ten v3-internal gaps (identified through retrospective on the UNO migration). Each gap is numbered for cross-reference in Section 7.
5.1 Industry Gaps (Gaps A–G)
Gap A: Airgapped multi-agent deployment with signed plugin bundles. Only a handful of vendors (Tabnine for code, Plane for project management) offer true airgapped operation. None of the twelve surveyed frameworks ship a turnkey airgapped stack with signed MCP server bundles plus offline verification plus offline model weights plus airgapped skill marketplace [43][44][45].
Gap B: Cryptographic per-tenant isolation across shared observability. Logical tenant scopes (LangSmith workspaces, Azure tenant boundaries) are not cryptographic. Shared observability backends can leak reasoning chains and tool outputs across tenants. No framework treats traces as tenant-scoped data with per-tenant encryption keys [46][47].
Gap C: Skill or plugin marketplace with provenance plus SBOM plus quality scoring. Claude's MCP directory lists 75+ connectors but ships no SBOM, no signed provenance, no runtime quality score. AgentCore ships coding-assistant skills but no marketplace. CrewAI has hundreds of tools with no quality rating. This is the npm supply-chain problem waiting to happen [48].
Gap D: Cross-framework agent portability beyond A2A. A2A specifies a wire protocol but not a portable agent definition. MS Agent Framework YAML, OpenAI primitives, LangGraph StateGraph, and Gemini ADK are not interchangeable. There is no "Dockerfile for agents."
Gap E: Deterministic replay and exactly-once for long-horizon agents. LangSmith Deployment, MS Agent Framework, and Pydantic AI Harness claim durability in various forms, but no industry-standard replay protocol guarantees that given a run identifier you can reconstruct every tool call, every LLM output, and every state transition bit-for-bit. Regulated industries require this; no framework delivers it cleanly.
Gap F: Agent-level FinOps. Token budgets per run, per tenant, per skill; cost attribution to business units; automatic circuit-breakers when a runaway agent burns through spend. AgentCore has partial policy controls; LangSmith tracks cost per trace. No framework ships turnkey agent-FinOps.
Gap G: Chain-of-custody for AI-generated artefacts. When an agent produces code, a document, a design, or a decision, where is the auditable chain (model, prompt, tool calls, human approvals, skill version) that produced it? Left as an implementation exercise by all twelve.
5.2 v3-Internal Gaps (Gaps 1–10)
1. Tier 4 specification. Declared in the ExecutionTier type; no HITL waitpoint code; no cost cap; no replay.
2. Chain state persistence. Redis 24-hour TTL is inadequate for long-horizon chains; needs Postgres-backed persistence.
3. Skill versioning enforcement. SKILL.md declares version; dispatch does not pin or validate it.
4. Scheduled dispatch. Scheduling lives in nexus-workflows; should unify with UNO for governance.
5. Skill registry UI. 729 skills across plugins with no unified discovery surface.
6. Cost attribution granularity. Per-run cost tracked; per-span and per-step are not.
7. Multi-agent orchestration formalism. CLI supports up to 10 concurrent agents; orchestrator dispatch is single-skill; formal multi-agent handoff is not codified.
8. Airgapped deployment documentation. K3s manifests are offline-compatible but no sealed-bundle flow.
9. Event ordering guarantees. Redis Pub/Sub is best-effort; critical workflows may need event log durability.
10. Provider failover. No documented fallback when the primary provider returns 5xx persistently.
5.3 Gap Prioritization
We prioritize by commercial leverage relative to implementation cost. Gaps A, B, C, E, F, G, and v3-internal Gaps 1, 2, 3, 6 rank highest: they are defensible wedges with tractable implementations. Gap D (portability) ranks lower because it requires multi-vendor standardization beyond Adverant's unilateral control, and the A2A protocol already covers the runtime interop case.
6. v4.0 Principles
The v4.0 architecture follows five principles derived from the gap analysis and the UNO paper's retrospective.
Principle 1: Dispatch does not execute. Retained from UNO. The orchestrator validates, resolves, classifies risk, and enqueues; it never calls an LLM, never invokes a tool, never waits on execution. This survives v4.0 unchanged.
Principle 2: Execute does not call an LLM directly. Already enforced by ALLOWED_AI_CALLERS at three layers. In v4.0, this is extended: skill authors cannot instantiate LLM clients, provider SDKs, or HTTP calls to model APIs. The AI Provider Router is the only hot path to providers, and every skill consumes it via the shared client.
Principle 3: Every action is a signed span. Every orchestrator operation, every tool invocation, every LLM call, every human approval, every artefact produced carries a span that is signed (C2PA manifest for artefacts, cryptographic hash chain for spans) and retained for the compliance-framework-specified minimum (seven years for FedRAMP, six years for HIPAA).
Principle 4: Every skill is a versioned, signed, measured artefact. SKILL.md v2 is a signed manifest with an SBOM, a semantic version, a risk-tier classification, an adversarial-eval record, a quality score that updates from runtime telemetry, and an auto-rollback policy when the quality score falls below a tenant-configurable threshold.
Principle 5: Every tenant is cryptographically isolated. Logical scopes are insufficient. Memory Bank payloads, span reasoning chains, artefact bytes, and binding metadata are encrypted with a per-tenant key-encryption-key (KEK) held in the nexus-auth KMS (HSM-backed in cloud profiles, TPM-backed in airgapped profiles). Cross-tenant leakage requires breach of both the KMS and the storage backend.
These five principles govern every v4.0 feature in Section 7.
7. Adverant Nexus Stack v4.0: The Proposal
This section is the v4.0 architectural core. Each sub-section specifies one primitive. We defer diagrams to Section 8, use cases to Section 9, migration mapping to Section 10, and appendix-depth schemas to Appendices A through J.
7.1 Principles (Summary)
See Section 6. Summarized: dispatch does not execute; execute does not call an LLM directly; every action is a signed span; every skill is a versioned signed measured artefact; every tenant is cryptographically isolated.
7.2 Execution Tiers Reframed
The four-tier taxonomy (llm_only, tool_using, chain, autonomous) is retained. What changes is Tier 4: it becomes a concrete state machine rather than a declared type. The Tier 4 state machine has five states (start, plan, execute, review, complete) with documented transitions: sub-agent spawning from execute bounded by max_sub_agents and cumulative cost_cap; human-in-the-loop waitpoints triggered from review whenever the bound skill metadata or run-specific binding override requires it (risk-tier high, compliance framework mandate, or explicit override); replan transitions from review back to plan when a quality evaluation falls below threshold or a human reviewer rejects. On-exit hooks include OnHITLPause, OnHITLResume, OnCostThreshold, and OnTierEscalation (triggered when a Tier 2 run exceeds its iteration limit and escalates automatically to Tier 4 with human oversight). Tier 4 state is persisted in orchestrator.chain_runs (repurposed) and orchestrator.autonomous_runs (new table); Redis is used only for short-lived coordination, never as the system of record.
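The five-state machine can be sketched as a transition table. The state names come from the text; the event names, the forward transitions (start to plan to execute to review to complete), and the helper canSpawnSubAgent are assumptions made for illustration, since the paper spells out only the replan edge and the sub-agent bounds.

```typescript
type Tier4State = "start" | "plan" | "execute" | "review" | "complete";
// Hypothetical event names; the paper does not name the triggering events.
type Tier4Event = "begin" | "plan_ready" | "execution_done" | "review_passed" | "review_failed";

const TRANSITIONS: Record<Tier4State, Partial<Record<Tier4Event, Tier4State>>> = {
  start: { begin: "plan" },
  plan: { plan_ready: "execute" },
  execute: { execution_done: "review" },
  // review_failed models a quality evaluation below threshold or a human rejection.
  review: { review_passed: "complete", review_failed: "plan" },
  complete: {},
};

// Returns the next state, or throws if the event is not legal in this state.
function transition(state: Tier4State, event: Tier4Event): Tier4State {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`illegal transition from ${state} on ${event}`);
  }
  return next;
}

interface Tier4Limits { maxSubAgents: number; costCapUsd: number }
interface Tier4Usage { subAgents: number; costUsd: number }

// Sub-agents may only be spawned from execute, within both bounds.
function canSpawnSubAgent(
  state: Tier4State,
  usage: Tier4Usage,
  limits: Tier4Limits,
  estimatedCostUsd: number,
): boolean {
  return (
    state === "execute" &&
    usage.subAgents < limits.maxSubAgents &&
    usage.costUsd + estimatedCostUsd <= limits.costCapUsd
  );
}
```

In this sketch a HITL waitpoint would be a pause inside review rather than a sixth state, which keeps the state count at five; a real implementation might model it differently.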
7.3 Skill Marketplace 2.0
SKILL.md v2 is a signed manifest. Publication proceeds through the nexus skill publish CLI: lint the manifest, generate an SBOM from dependencies, run an adversarial-eval suite (prompt injection, tool abuse, scope creep), classify the risk tier against the EU AI Act taxonomy (minimal, limited, high, unacceptable), cross-reference MITRE ATLAS techniques and OWASP LLM Top 10 mitigations, sign with sigstore, bump the semantic version, and insert into ros.skill_definitions v2 plus ros.skill_versions (a new table holding every published version). At dispatch time the runtime verifies the signature, checks that the SBOM contains no known CVEs from the NVD feed (updated daily, cached in airgapped deployments), confirms the quality score meets the tenant's threshold, confirms the risk tier is allowed by the tenant's policy, and confirms the export tags are compatible with the tenant's jurisdiction. Runtime telemetry continuously updates the quality score; if it drops below the auto-deactivate threshold, the binding referencing this skill is flipped is_active=false and traffic falls back to the next-priority binding on the same binding_key. Gap C closed.
7.4 Hooks as First-Class Primitive
Hooks are the extensibility surface of v4.0. Every dispatch-time and execution-time event is a hook point: PreDispatch, PostSkillResolve, PreTierSelect, PreLLMCall, PostLLMCall, PreToolUse, PostToolUse, PreChainStep, PostChainStep, OnCostThreshold, OnIterationLimit, OnTierEscalation, OnHITLPause, OnHITLResume, PostDispatch. Each hook is declared as a manifest (YAML) scoped at org, skill, or plugin level with a matcher expression (e.g., tool.name == 'write_file' && path.startsWith('/etc')) and an action (deny, rewrite, require_hitl, emit_event, call_webhook, policy_ref). Policy references resolve to OPA/Rego bundles. Hooks adopt the Claude Agent SDK pattern [30][31] but elevate them from SDK primitives to platform primitives: they run server-side in the orchestrator and workflows, not in the model client. Gap F (FinOps) and part of Gap E (replay) are closed through hooks: OnCostThreshold enforces budgets, and every hook invocation is a span in the replay record.
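A rough sketch of hook evaluation at a single hook point follows. It is illustrative only: matchers are modelled as TypeScript predicates instead of the YAML expression language, only a few hook points are listed, and the rule that the first matching deny stops evaluation is an assumption the paper does not state.

```typescript
type HookPoint = "PreToolUse" | "PostToolUse" | "PreLLMCall" | "OnCostThreshold";
type HookAction = "deny" | "rewrite" | "require_hitl" | "emit_event" | "call_webhook" | "policy_ref";

// Hypothetical context shape, mirroring the matcher example in the text.
interface ToolUseContext { tool: { name: string }; path: string }

interface Hook {
  id: string;
  point: HookPoint;
  scope: "org" | "skill" | "plugin";
  matches: (ctx: ToolUseContext) => boolean; // stand-in for the matcher expression
  action: HookAction;
}

interface HookVerdict { allowed: boolean; requiresHitl: boolean; fired: string[] }

// Evaluates hooks in declaration order; every hook that fires is recorded,
// which is what would make each invocation visible in a replay record.
function evaluateHooks(hooks: Hook[], point: HookPoint, ctx: ToolUseContext): HookVerdict {
  const verdict: HookVerdict = { allowed: true, requiresHitl: false, fired: [] };
  for (const hook of hooks) {
    if (hook.point !== point || !hook.matches(ctx)) continue;
    verdict.fired.push(hook.id);
    if (hook.action === "deny") {
      verdict.allowed = false;
      break; // assumed: the first deny is final
    }
    if (hook.action === "require_hitl") verdict.requiresHitl = true;
  }
  return verdict;
}

// The matcher from the text, expressed as a predicate.
const denyEtcWrites: Hook = {
  id: "deny-etc-writes",
  point: "PreToolUse",
  scope: "org",
  matches: (ctx) => ctx.tool.name === "write_file" && ctx.path.startsWith("/etc"),
  action: "deny",
};
```

With this hook installed, a write_file call targeting /etc/passwd is refused at PreToolUse, while the same tool writing under /tmp passes through.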
7.5 Memory Bank with Cryptographic Per-Tenant Isolation
Memory Bank is the long-term memory service (short-term state remains in Postgres orchestrator.runs). Payloads are envelope-encrypted: a per-value data-encryption key (DEK) encrypts the payload; the DEK is wrapped with the tenant KEK held in nexus-auth KMS. In cloud profiles the KEK is HSM-backed (FIPS 140-3); in airgapped profiles it is TPM-backed. Rotation is quarterly by policy or on-demand. Crypto-erasure (Gap A adjacent): deleting a subject's memories can be implemented as DEK destruction without touching the ciphertext, which remains unreadable. Gap B closed.
7.6 A2A and MCP Dual Plane
Tool use flows through MCP; agent-to-agent flows through A2A. We retain MCP for the tool plane [1][2][28] and add A2A as a first-class primitive for the agent plane, enabling interoperation with Gemini Enterprise agents, MS Agent Framework agents, third-party CrewAI workflows, and Bedrock AgentCore agents without protocol bridges. In airgapped profiles A2A discovery is restricted to local SPIFFE identities; cross-cluster A2A is only available in cloud and VDS profiles. The two planes are orthogonal: an agent uses MCP to call a tool; it uses A2A to delegate to another agent.
7.7 Structured Output and Self-Correction
Every skill contract in v4.0 declares an input_schema and an output_schema (JSON Schema or Pydantic model). Invalid LLM output triggers automatic retry with a diagnostic prompt containing the validation error; the retry ceiling is three, after which the run escalates (OnIterationLimit hook). This is the Pydantic AI pattern [25] applied uniformly to every skill.
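The validate-and-retry behaviour can be sketched as below. The names generateValidated, Validation, and parseScore are hypothetical, and the sketch reads "retry ceiling of three" as one initial attempt followed by at most three retries, which is one possible interpretation of the text.

```typescript
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

const RETRY_CEILING = 3; // from the text; interpreted here as three retries after the first attempt

// Calls the generator, validates its raw output, and on failure retries with
// the validation error appended to the prompt. Escalates once retries run out.
function generateValidated<T>(
  generate: (prompt: string) => string,
  validate: (raw: string) => Validation<T>,
  prompt: string,
  onIterationLimit: (lastError: string) => void,
): T | undefined {
  let currentPrompt = prompt;
  let lastError = "";
  for (let attempt = 0; attempt <= RETRY_CEILING; attempt++) {
    const result = validate(generate(currentPrompt));
    if (result.ok) return result.value;
    lastError = result.error;
    currentPrompt =
      `${prompt}\n\nYour previous output failed validation: ${result.error}\n` +
      "Return output that satisfies the declared output schema.";
  }
  onIterationLimit(lastError); // stand-in for the OnIterationLimit hook
  return undefined;
}

// Example validator for a hypothetical output schema { score: number }.
function parseScore(raw: string): Validation<{ score: number }> {
  try {
    const parsed = JSON.parse(raw) as { score?: unknown };
    if (parsed !== null && typeof parsed === "object" && typeof parsed.score === "number") {
      return { ok: true, value: { score: parsed.score } };
    }
    return { ok: false, error: "field 'score' must be a number" };
  } catch (e) {
    return { ok: false, error: "output is not valid JSON" };
  }
}
```

A production version would validate against the skill's declared JSON Schema with a schema library; the hand-written validator here only keeps the example self-contained.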
7.8 Optimizer-Compiled Prompts
Skill prompts become compilable artefacts through DSPy-style optimizers (MIPROv2, GEPA) [36][37][38]. A skill declares a source prompt, few-shot examples, and a metric function; the optimizer produces an optimized variant tied to a specific skill version; runtime telemetry compares the variant's quality score against the incumbent; promotion requires a statistically significant improvement and no regression on a holdout set. Rollback is automatic when the deployed variant's quality score regresses.
7.9 Observability: Insights Agent and Polly-NL
The v3 twelve-type span tree [42] is retained as the substrate. On top of it we build two LangSmith-inspired primitives [8][24]: Insights Agent clusters spans into usage patterns and surfaces anomalies (cost hotspots, latency regressions, failure clusters) automatically; Polly-NL is a natural-language debug interface ("why was last night's chain run expensive?") that translates questions into span-tree queries and returns span citations. Storage tiering splits hot spans in Postgres (last 30 days) from cold spans in ClickHouse (older), addressing the UNO paper's 300 GB/month storage projection.
7.10 FinOps Governance
Per-org, per-skill, per-user, per-binding budgets are enforced pre-dispatch. Every dispatch arrives with an estimated cost (model rate × estimated token count); the orchestrator checks remaining budget and atomically reserves the estimate in a Redis counter. If the reservation fails the dispatch is refused with a troubleshooting JSON payload (per the "no fallbacks" contract). On each LLM call the actual cost is debited; unused reservation is refunded. Tier 4 runs carry a cumulative cost cap; exceedance triggers OnCostThreshold, which may pause for HITL or abort with partial results. Per-skill circuit-breakers open after a failure-rate threshold is breached over a window, rejecting new dispatches until the window expires. Gap F closed.
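The reserve, debit, and refund cycle can be illustrated with an in-memory ledger. This is a sketch under stated assumptions: the class BudgetLedger is hypothetical, amounts are integer cents to avoid floating-point drift, and the in-process Map stands in for the Redis counter, where the check-and-decrement would need to be a single atomic operation.

```typescript
type ReserveResult =
  | { ok: true }
  | { ok: false; error: "insufficient_budget"; remainingCents: number; requestedCents: number };

interface OpenReservation { orgId: string; reservedCents: number; spentCents: number }

class BudgetLedger {
  private readonly remaining = new Map<string, number>();
  private readonly open = new Map<string, OpenReservation>();

  setBudget(orgId: string, cents: number): void {
    this.remaining.set(orgId, cents);
  }

  remainingFor(orgId: string): number {
    return this.remaining.get(orgId) ?? 0;
  }

  // Pre-dispatch: hold the estimate, or refuse with a structured payload.
  reserve(runId: string, orgId: string, estimateCents: number): ReserveResult {
    const remaining = this.remainingFor(orgId);
    if (estimateCents > remaining) {
      return { ok: false, error: "insufficient_budget", remainingCents: remaining, requestedCents: estimateCents };
    }
    this.remaining.set(orgId, remaining - estimateCents);
    this.open.set(runId, { orgId, reservedCents: estimateCents, spentCents: 0 });
    return { ok: true };
  }

  // Per LLM call: record the actual cost against the run's reservation.
  debit(runId: string, actualCents: number): void {
    const reservation = this.open.get(runId);
    if (!reservation) throw new Error(`unknown run ${runId}`);
    reservation.spentCents += actualCents;
  }

  // End of run: return the unused part of the reservation to the budget.
  // If the run overspent, the overage is charged against the budget instead.
  settle(runId: string): number {
    const reservation = this.open.get(runId);
    if (!reservation) throw new Error(`unknown run ${runId}`);
    this.open.delete(runId);
    const unused = reservation.reservedCents - reservation.spentCents;
    this.remaining.set(reservation.orgId, this.remainingFor(reservation.orgId) + unused);
    return Math.max(unused, 0);
  }
}
```

For example, with a budget of 1000 cents, reserving 400 leaves 600 available; after debits of 150 and 100, settling refunds 150 and leaves 750.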
7.11 Deterministic Replay and Chain-of-Custody
Every run is reconstructable bit-for-bit from: input hashes, pinned model version, prompt-template hash, temperature (zero or seeded PRNG), tool-output captures, span tree, hook invocation log. On invocation, the orchestrator records a replay manifest with all of the above; on replay, the worker is seeded with the manifest and plays back the same sequence. Artefacts leave v4.0 with C2PA content provenance manifests [49] signed by the tenant key, listing the model, skill version, prompt hash, tool calls, and human approvals. Gaps E and G closed simultaneously.
7.12 Airgapped Bundle Mode
A single signed tarball contains: Docker images with OCI labels and signatures, K8s manifests, pinned model weights, Postgres migration bundle, skill bundle (pre-signed), SBOM and licenses, provisioning secrets for TPM-backed KEKs, FIPS 140-3 cryptographic modules, STIG-compliant base images, and an installation manifest. nexus airgap install <bundle.tar.gz> verifies signatures, loads images to the local registry, applies manifests, runs migrations, seeds the skill registry, and TPM-wraps tenant KEKs. Delta bundles (monthly or on-demand) update in place. External network calls are structurally impossible: outbound AllowedIP is the empty set; A2A discovery returns only local SPIFFE identities. Gap A closed. FedRAMP High, DoD IL5, CJIS, and IRS Publication 1075 use cases become tractable.
7.13 Adverant Nexus CLI 2.0
The CLI evolves into a first-class dispatch, streaming, publish, airgap, governance, FinOps, hooks, memory, A2A, and debug client. Illustrative commands (full reference in Appendix C):
$ nexus dispatch ros.code_edit --input @in.json --tier tool_using --provider gemini --budget 5.00 --tail
$ nexus runs show <run_id> --spans-tree
$ nexus runs replay <run_id>
$ nexus chain visualize <run_id>
$ nexus skill publish ./skill-dir --sign --sbom
$ nexus skill rollback ros.code_edit v3.2.0
$ nexus airgap bundle --out ./bundle.tgz --skills all --models all
$ nexus governance export --framework soc2 --out ./soc2-audit.tgz
$ nexus finops burn-rate --org my-org --window 7d
$ nexus hooks apply ./hooks.yaml
$ nexus memory gc --org my-org --older-than 180d
$ nexus a2a peers list
$ nexus debug nl "why did run abc fail last night"
A WebSocket-backed --tail option streams spans in real time into the terminal, mirroring PCC content inside the CLI. CI integration is straightforward: GitHub Actions dispatch skills, stream the result, and export governance evidence in a single job.
7.14 Governance, Compliance, and Security: Native, Not Bolted-On
Every major regulatory regime is a first-class v4.0 primitive with concrete enforcement points.
EU AI Act (Regulation 2024/1689, fully applicable 2 August 2026) [50]. Risk-tier classification is stored in ros.skill_definitions.risk_tier (minimal / limited / high / unacceptable, already present in v3 as TrackedJob.riskLevel). Dispatch rejects unacceptable skills. high skills require an HITL waitpoint (Tier 4 state machine), a conformity-assessment record (stored in compliance.conformity_assessments), and post-market monitoring spans. Article 12 (logging) maps to the span tree. Article 13 (transparency) maps to synthesised-output watermarking plus model-card exposure through the CLI and dashboard. Article 14 (human oversight) maps to the Tier 4 HITL primitive. Article 15 (accuracy, robustness, cybersecurity) maps to DSPy optimizer metrics plus adversarial-eval suites. Article 26 (deployer obligations) maps to a per-org governance document auto-generated from the skill registry.
GDPR, UK GDPR, EU Data Act [51][52]. Data-residency tags (eu_only, us_only, any, or region codes) are enforced at the AI Provider Router and at storage. Right-to-erasure runs as a nexus erase-subject Tier 3 chain that atomically deletes from Postgres (RLS-scoped), Qdrant (payload filter), Neo4j (DETACH DELETE), object storage, and Memory Bank (destruction of the subject's DEKs for crypto-erasure, per Section 7.5; the tenant KEK is left intact), then schedules backup-retention purge. DPIA artefacts are auto-generated per skill.
SOC 2 Type II, ISO/IEC 27001, ISO/IEC 42001 [53][54][55]. The span tree is the continuous-control evidence pipeline. Control identifiers are mapped to span types in Appendix G. ISO 42001-specific controls (AI risk assessment, AI impact assessment, AI system life-cycle management) attach to the skill-publication workflow.
HIPAA and HITRUST [56]. Protected health information tagging on skill bindings forces residency and provider constraints: no provider without a Business Associate Agreement is routable. Audit span retention is minimum six years. Covered-entity and business-associate roles are modelled in nexus-auth.
FedRAMP Moderate and High, DoD IL4 and IL5, CJIS, IRS Publication 1075 [57][58]. The airgapped bundle (Section 7.12) is the delivery vehicle. FIPS 140-3 validated cryptography; STIG-compliant base images; CAC/PIV SSO; audit retention minimum seven years.
NIST AI RMF 1.0 and NIST AI 600-1 [59][60]. GOVERN, MAP, MEASURE, MANAGE functions map to per-org policy documents, skill-registry metadata, quality-evals plus span analytics, and hooks plus FinOps plus HITL respectively.
Regional privacy laws: PIPL, DPDP, LGPD, PIPEDA, Australian Privacy Act. Expanded residency enum plus provider-allowlist table. Cross-border transfer records auto-generated per dispatch crossing jurisdictions.
OWASP LLM Top 10 (2025), MITRE ATLAS, OWASP Agentic AI Threats [61][62][63]. Per-skill threat model in the registry. Runtime enforcement (the LLM01 to LLM10 identifiers below follow the v1.1 numbering of the OWASP list, which the 2025 edition reorders): input classifier hook for LLM01 (prompt injection), structured-output schema for LLM02 (insecure output), signed skills plus SBOM for LLM05 (supply chain), output scanner hook for LLM06 (sensitive disclosure), capability allowlist hooks for LLM07 and LLM08 (insecure plugins, excessive agency), watermarks plus model cards for LLM09 (overreliance), rate limits plus auth plus airgap for LLM10 (model theft).
C2PA [49]. Every generated artefact leaves v4.0 with a C2PA manifest v2 signed by the tenant key.
Export controls: EAR, ITAR, EU Dual-Use Regulation 2021/821 [64][65]. Model and skill export-control tags; dispatch gate refuses cross-border use without an export-license record.
Cryptography and secrets. Org-level keys remain AES-256 in nexus-auth. v4.0 adds envelope encryption with per-tenant KEKs (HSM-backed cloud, TPM-backed airgap), quarterly rotation, and a nexus keys rotate CLI. Post-quantum hybrid X25519 plus ML-KEM for inter-service mTLS is on the forward-looking roadmap.
Enforcement architecture. Three gates (Istio AuthorizationPolicy plus mTLS; service-key HMAC plus caller-identity; per-dispatch OPA policy evaluator). Policies are versioned in a central bundle distributed to services.
Evidence and reporting. A Governance tab in the dashboard and a nexus governance export --framework <soc2 | iso27001 | iso42001 | eu-ai-act | nist-airmf | hipaa | fedramp> CLI command produce auditor-ready packages: control-to-evidence maps, span samples, policy versions, conformity-assessment records, DPIAs, model cards, adversarial-eval reports. Full traceability matrix in Appendix G.
7.15 UI/UX Bindings: User-Configurable Buttons to Workflows
The Bindings primitive generalizes the production ros.skill_bindings table (documented in Adverant-NexusROS/src/schemas/skill-bindings.schema.ts, migration database/migrations/030_skill_bindings.sql, routes src/routes/skill-bindings.ts, and the resolution skill src/skills/ros-skill-binding-resolve.ts) into a first-class substrate where every clickable action in any plugin or marketplace application is a Binding resolved at runtime to a skill plus tier plus provider plus model plus cost cap plus risk tier plus residency plus inputs mapping plus hook set.
Why. In v3, reconfiguring what a button does requires editing plugin source code and shipping a new version. In v4.0, a power user opens the Binding Editor, changes the skill or the tier or the model, and saves; the button then dispatches differently on their next click. No code deploy. Organizational admins and plugin authors retain veto through OPA override policies.
Resolution. Lookup by binding_key (a regex-validated string like lead.scoring.v2) proceeds through a four-level scope hierarchy: user > project > org > system. Within the most-specific matching scope, the active binding with the highest priority (0–1000) wins. If an A/B experiment is active on the binding_key, traffic splits by split_ratio (hashed user identifier). Configuration overrides merge via a precedence chain: skill_definition.config is the base, binding.config_overrides overrides, caller.runtime_overrides takes final precedence.
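The resolution order and override precedence described above can be sketched as follows. The Binding shape is trimmed to the fields the lookup needs, the function names are hypothetical, and the sketch assumes the candidate list has already been filtered to the caller's own user, project, and org (scope_id matching and A/B splitting are omitted). The shallow merge is also an assumption; the paper does not say whether nested config merges deeply.

```typescript
type Scope = "user" | "project" | "org" | "system";

// Most specific scope first, matching user > project > org > system.
const SCOPE_ORDER: Scope[] = ["user", "project", "org", "system"];

interface Binding {
  id: string;
  bindingKey: string;
  scope: Scope;
  priority: number; // 0 to 1000, higher wins within a scope
  isActive: boolean;
  configOverrides: Record<string, unknown>;
}

// Picks the highest-priority active binding in the most specific scope that has one.
function resolveBinding(bindings: Binding[], bindingKey: string): Binding | undefined {
  const candidates = bindings.filter((b) => b.bindingKey === bindingKey && b.isActive);
  for (const scope of SCOPE_ORDER) {
    const inScope = candidates.filter((b) => b.scope === scope);
    if (inScope.length > 0) {
      return inScope.reduce((best, b) => (b.priority > best.priority ? b : best));
    }
  }
  return undefined;
}

// Later sources win: skill config is the base, binding overrides next, runtime overrides last.
function mergeConfig(
  skillConfig: Record<string, unknown>,
  bindingOverrides: Record<string, unknown>,
  runtimeOverrides: Record<string, unknown>,
): Record<string, unknown> {
  return { ...skillConfig, ...bindingOverrides, ...runtimeOverrides };
}
```

Note that scope beats priority in this reading: an org-scope binding with priority 100 wins over a system-scope binding with priority 900, because priority is only compared within the most specific matching scope.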
Metadata. The full v4.0 binding metadata set spans identity (id, org, binding_key, scope, scope_id, priority), resolution target (skill_definition_id, skill_version_pin), execution (tier, provider_preference, model_preference, routing_hint, queue_name, response_format), cost and limits (cost_cap_usd, daily_cap_usd, token_cap_in, token_cap_out, timeout_ms, max_iterations, max_sub_agents), governance (risk_tier, data_residency, export_tags, requires_hitl, policy_refs, tier_restrictions, phi_tagged, compliance_frameworks), hooks (hooks[], allowed_tools[], denied_tools[]), inputs and mapping (input_schema, inputs_mapping, output_target), UI presentation (display_name, description, icon, placement[], confirmation, badge, shortcut), observability and experiments (ab_experiment_id, quality_score_threshold, telemetry_tags), and lifecycle (status, is_active, deleted_at, created_by, updated_by, config_overrides, conditions).
Plugin manifest declarative actions. Plugin authors declare bindings in nexus.manifest.json's actions[] array; on install these seed SYSTEM-scope bindings; on uninstall they are removed; on upgrade they are diff-reviewed. Authors specify defaults (skill, tier, provider, model, cost cap) that admins and users can override within policy.
Override policy (OPA). A user-scope binding save is routed through an OPA rule that prevents weakening of organizational governance: residency cannot be widened, phi-required providers must include the proposed provider, max-cost-cap must be ≥ proposed cost cap, HITL-mandatory-for-high-risk must be honoured, allowed-tools must cover the proposed allowed-tools list, and the user role must have bindings:write:<scope> permission. Denials return structured troubleshooting JSON (per the no-fallbacks contract).
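The checks listed above would live in a Rego bundle in the system the paper describes; the TypeScript below only illustrates their logic. All type and field names are hypothetical, and "residency cannot be widened" is modelled simply as "the proposed residency must be in the org's allowed set".

```typescript
interface OrgPolicy {
  allowedResidency: string[]; // e.g. ["eu_only"]
  phiProviders: string[];     // providers approved for PHI-tagged bindings
  maxCostCapUsd: number;
  allowedTools: string[];
}

interface ProposedBinding {
  scope: "user" | "project" | "org";
  dataResidency: string;
  provider: string;
  phiTagged: boolean;
  costCapUsd: number;
  riskTier: "minimal" | "limited" | "high";
  requiresHitl: boolean;
  allowedTools: string[];
}

// Returns every violated rule; an empty list means the save is allowed.
function checkOverride(policy: OrgPolicy, proposed: ProposedBinding, permissions: string[]): string[] {
  const violations: string[] = [];
  if (!policy.allowedResidency.includes(proposed.dataResidency)) {
    violations.push("residency_widened");
  }
  if (proposed.phiTagged && !policy.phiProviders.includes(proposed.provider)) {
    violations.push("provider_not_phi_approved");
  }
  if (proposed.costCapUsd > policy.maxCostCapUsd) {
    violations.push("cost_cap_exceeds_org_max");
  }
  if (proposed.riskTier === "high" && !proposed.requiresHitl) {
    violations.push("hitl_required_for_high_risk");
  }
  if (proposed.allowedTools.some((tool) => !policy.allowedTools.includes(tool))) {
    violations.push("tool_not_allowed");
  }
  if (!permissions.includes(`bindings:write:${proposed.scope}`)) {
    violations.push("missing_permission");
  }
  return violations;
}
```

Collecting all violations, rather than stopping at the first, fits the structured troubleshooting payload the text calls for: the user sees every rule that blocked the save in one response.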
Audit trail. Every binding change is an audit row (who, when, what, policy verdict). A config-drift detector flags bindings whose quality score drops post-edit. Bindings are observable in a Governance tab and exportable in the SOC 2 package.
Integration. Bindings are consumed by hooks (Section 7.4: hooks referenced in the binding metadata run during the dispatch path), by FinOps (Section 7.10: binding cost caps reserve against tenant budgets), by Memory Bank (Section 7.5: binding metadata is encrypted with the tenant KEK if flagged phi_tagged or otherwise sensitive), and by Governance (Section 7.14: binding metadata declares applicable frameworks and the override policy enforces them).
Full schema in Appendix J.
(Sections 8 through 14 and Appendices A through J follow.)
8. Reference Architecture
This section renders the v4.0 architecture in a canonical diagram set grouped by concern: current state (v3), proposed architecture (v4.0), user journeys, UI/UX mocks, compliance enforcement, and deployment profiles. Every current capability and every v4.0 capability has at least one diagram. Every diagram is also available as a standalone figure file (Mermaid and PlantUML sources in figures/). All widths are capped at 110 columns for monospace readability. ASCII renderings appear below; equivalent SVG figures are in the companion package.
4.A Current State (v3): Architecture
A1. Service topology: v3 (current, 44 services)
                                    ┌─────────────────────────────────┐
                                    │      ADVERANT NEXUS v6.2.1      │
                                    │  K3s cluster @ 157.173.102.118  │
                                    └────────────────┬────────────────┘
                                                     │
      ┌───────────────────────┬──────────────────────┼──────────────────────┬──────────────────────┐
      │                       │                      │                      │                      │
      ▼                       ▼                      ▼                      ▼                      ▼
┌───────────┐          ┌──────────────┐       ┌──────────────┐       ┌──────────────┐       ┌──────────────┐
│  INGRESS  │          │   GATEWAY    │       │ ORCHESTRATOR │       │  WORKFLOWS   │       │ SKILLS ENGINE│
│  Istio    │──mTLS───▶│ Socket.IO WS │──────▶│ Dispatch     │──────▶│ BullMQ       │──────▶│ SKILL.md     │
│  VirtSvc  │          │ AI Provider  │       │ Governance   │       │ Workers      │       │ tool_reg     │
└───────────┘          │ Router       │       │ Run Tracker  │       │ Scheduler    │       └──────────────┘
                       └──────┬───────┘       └──────┬───────┘       └──────┬───────┘
                              │                      │                      │
                       ┌──────┴───────┐       ┌──────┴───────┐       ┌──────┴───────┐
                       │  NEXUS-AUTH  │       │    REDIS     │       │   POSTGRES   │
                       │  AES-256     │       │  Pub/Sub     │       │  orchestr.   │
                       │  Org keys    │       │  BullMQ queue│       │  runs        │
                       │  RBAC        │       │  Ring buffer │       │  skill_reg   │
                       └──────────────┘       └──────────────┘       └──────────────┘
                                                     │                      │
                                                     ▼                      ▼
                                              ┌──────────────┐       ┌──────────────┐
                                              │    QDRANT    │       │    NEO4J     │
                                              │  embeddings  │       │  GraphRAG    │
                                              │  voyage 1024d│       │  entities    │
                                              └──────────────┘       └──────────────┘
PLUGIN FLEET (marketplace):                    FRONTEND FLEET:
┌────────────┐ ┌────────────┐                  ┌────────────┐ ┌────────────┐
│NexusROS    │ │ProseCreator│                  │ dashboard  │ │ adverant.ai│
│ (UNO)      │ │ (writing)  │                  │ PCC + chat │ │ (marketing)│
└────────────┘ └────────────┘                  └────────────┘ └────────────┘
┌────────────┐ ┌────────────┐ ┌────────────┐   ┌────────────┐ ┌────────────┐
│NexusQA     │ │Forge       │ │EE-Design   │   │ plugin-UIs │ │ nexus-cli  │
│ (testing)  │ │ (hardware) │ │ (PCB)      │   │ (N plugins)│ │ (CLI)      │
└────────────┘ └────────────┘ └────────────┘   └────────────┘ └────────────┘
A2. UNO dispatch pipeline: v3 (8-step, current)
Client            Orchestrator          Skills Eng.       AI Router        Workflows          Dashboard
   │                    │                    │                │                │                  │
   │ POST /dispatch     │                    │                │                │                  │
   ├───────────────────▶│                    │                │                │                  │
   │                    │ 1. Validate        │                │                │                  │
   │                    │    Zod schema      │                │                │                  │
   │                    │                    │                │                │                  │
   │                    │ 2. Resolve job_type → skill         │                │                  │
   │                    ├───────────────────▶│                │                │                  │
   │                    │◀──── exec_config ──┤                │                │                  │
   │                    │                    │                │                │                  │
   │                    │ 3. Governance precheck              │                │                  │
   │                    │    (residency, risk)                │                │                  │
   │                    │                    │                │                │                  │
   │                    │ 4. Insert run row (orchestrator.runs)                │                  │
   │                    │                    │                │                │                  │
   │                    │ 5. Enqueue BullMQ  │                │                │                  │
   │                    ├─────────────────────────────────────────────────────▶│                  │
   │ 202 Accepted       │                    │                │                │                  │
   │◀───────────────────┤                    │                │                │                  │
   │ {run_id,trace_id}  │ 6. Publish job:queued               │                │                  │
   │                    │    to Redis Pub/Sub                                  │                  │
   │                    │                                                      │                  │
   │                    │                                             7. Worker dequeue           │
   │                    │                                                      │                  │
   │                    │                                         ┌─── Tier 1 llm_only ───┐       │
   │                    │                                         │  call AI Router       │       │
   │                    │                                         ├─── Tier 2 ReAct ──────┤       │
   │                    │                                         │  loop LLM+tool        │       │
   │                    │                                         ├─── Tier 3 Chain ──────┤       │
   │                    │                                         │  DAG coordinator      │       │
   │                    │                                         ├─── Tier 4 Auton ──────┤       │
   │                    │                                         │  (undefined)          │       │
   │                    │                                         └───────────────────────┘       │
   │                    │                                                      │                  │
   │                    │ 8. job:* events → Redis → WS relay → Socket.IO → PCC │                  │
   │                    ├────────────────────────────────────────────────────────────────────────▶│
   │                    │                                                      │                  │
A3. Execution tier matrix: v3 (current, with gaps)
┌───────┬──────────────┬─────────────┬────────────────┬─────────────┬──────────────────────────────┐
│ Tier  │ Name         │ LLM Calls   │ Tool Calls     │ State       │ Status in v3                 │
├───────┼──────────────┼─────────────┼────────────────┼─────────────┼──────────────────────────────┤
│ 1     │ llm_only     │ 1           │ 0              │ inline or   │ ✅ Deployed                  │
│       │              │             │                │ BullMQ      │                              │
├───────┼──────────────┼─────────────┼────────────────┼─────────────┼──────────────────────────────┤
│ 2     │ tool_using   │ N (ReAct)   │ M per LLM      │ BullMQ      │ ✅ Deployed, hooks missing   │
│       │              │             │                │             │    quality evals missing     │
├───────┼──────────────┼─────────────┼────────────────┼─────────────┼──────────────────────────────┤
│ 3     │ chain        │ N steps     │ M per step     │ Redis 24h + │ ✅ Deployed                  │
│       │              │             │                │ BullMQ      │ ⚠️ No persistent audit       │
│       │              │             │                │             │ ⚠️ No visual editor          │
│       │              │             │                │             │ ⚠️ No loops/dynamic DAG      │
├───────┼──────────────┼─────────────┼────────────────┼─────────────┼──────────────────────────────┤
│ 4     │ autonomous   │ Extended    │ Many           │ Unclear     │ ❌ Defined in type, no code  │
│       │              │             │                │             │ ❌ No HITL waitpoint         │
│       │              │             │                │             │ ❌ No cost cap               │
└───────┴──────────────┴─────────────┴────────────────┴─────────────┴──────────────────────────────┘
A4. AI Provider Router - v3 (current, 4 adapters)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Authorised callers โ AI PROVIDER ROUTER โ Org config resolution
(exactly 3 principals): โ nexus-gateway โ
โ nexus-orchestrator โ โ โโโโโโโโโโโโโโโโโโโโ
โ nexus-skills-engine โ chatWithTools() โโโaskโโโโโถโ nexus-auth โ
โ chat-orchestrator โ MAX_TOOL_ITERATIONS โ โ resolveOrgConfigโ
โ โ โ โ AES-256 keys โ
โ POST /internal/ai/ โ โโ GeminiAdapter โ โโโโโโโโโโโโโโโโโโโโ
โ chat โ โโ AnthropicAdapter โ โ
โโโโโโโโโโโโโโโโโโโโโโถโ โโ ClaudeMaxAdapter โ provider + role models
โ โ โโ OpenRouterAdapter โ โ
โ โ โ โผ
โ โ Role routing: โ โโโโโโโโโโโโโโโโโโโโ
โ โ default/fast/reasoning โ โ Google / Anthro โ
โ โ /code/long_context โโโโโAPIโโโถโ /OpenRouter/ โ
โ โ โ โ Claude Max โ
โ โ Response fmt: json/text โ โโโโโโโโโโโโโโโโโโโโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Enforcement (3 layers):
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Layer 1: Istio AuthorizationPolicy โ Layer 2: validateServiceKey โ Layer 3: โ
โ NetworkPolicy whitelist โ HMAC header per caller โ validate โ
โ โ โ CallerId โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
A5. Socket.IO + Redis Pub/Sub event flow - v3 (current)
Orchestrator Redis Pub/Sub JobEventRelay Socket.IO Dashboard PCC
โ โ โ โ โ
โ emitJobEvent() โ โ โ โ
โโโโโโโโโโโโโโโโโโโโโโโโถโ โ โ โ
โ channel: โ โ โ โ
โ nexus:jobs:org:{id} โ โ โ โ
โ โ pattern subscribe โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโถโ โ โ
โ โ โ emit to rooms โ โ
โ โ โโโโโโโโโโโโโโโโโโโโถโ โ
โ โ โ โ org:{id}: โ
โ โ โ โ plugin:{pid} โ
โ โ โ โโโโโโโโโโโโโโโโโโถโ
โ โ โ โ โ
โ โ โ (compat mirror): โ โ
โ โ โ org:{id} โ โ
โ โ โ user:{uid} โ โ
โ โ โ โ โ
โ โ โ โ โ
โ Ring buffer (50 evts) โโโ replay on reconnect โโค โ โ
โ โ
Event types: job:dispatched, job:queued, job:started, job:skill_resolved,
job:llm_call, job:llm_response, job:llm_stream_chunk,
job:tool_call, job:tool_result,
job:chain_step_start, job:chain_step_complete,
job:progress, job:warning, job:completed, job:failed, job:timeout, job:cancelled
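The reconnect replay noted in the diagram can be sketched as a bounded buffer. This is a minimal illustration only: the capacity of 50 comes from the diagram, while `EventRing`, the `seq` field, and `replaySince` are hypothetical names and the per-event sequence number is an assumption, not the relay's documented behaviour.

```typescript
// Hypothetical sketch: a bounded replay buffer for job events.
// Capacity 50 mirrors the "Ring buffer (50 evts)" note; names are illustrative.
interface JobEvent {
  seq: number; // assumed: increases by one per emitted event
  type: string; // e.g. "job:queued", "job:completed"
  payload: unknown;
}

class EventRing {
  private buf: JobEvent[] = [];
  private nextSeq = 1;
  private capacity: number;

  constructor(capacity: number = 50) {
    this.capacity = capacity;
  }

  // Append an event, evicting the oldest once capacity is exceeded.
  push(type: string, payload: unknown): JobEvent {
    const evt: JobEvent = { seq: this.nextSeq++, type, payload };
    this.buf.push(evt);
    if (this.buf.length > this.capacity) this.buf.shift();
    return evt;
  }

  // Events a reconnecting client has not yet seen (seq > lastSeen).
  replaySince(lastSeen: number): JobEvent[] {
    return this.buf.filter((e) => e.seq > lastSeen);
  }
}
```

A client that reconnects with the last sequence number it saw receives only the events still held in the buffer; anything older than the last 50 events would need to be recovered from the run record instead.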
A6. PCC TrackedJob model - v3 (current)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ TrackedJob (Zustand + localStorage, hydrated from /api/workflows/runs/{runId}) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ jobId triggerRunId jobType jobLabel stage โ
โ progress 0-100 message steps[] startedAt completedAt? โ
โ error? result? โ
โ โ
โ โโโ Skill transparency โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ skillId skillName executionType currentIteration maxIterations โ
โ โ
โ โโโ ReAct transparency โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ thinkingLog[] toolCalls[] {name,args,result,durationMs} โ
โ โ
โ โโโ Billing โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ billing.tokensIn .tokensOut .costUSD .provider โ
โ โ
โ โโโ Governance โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ riskLevel flaggedForReview dataResidency โ
โ โ
โ โโโ HPC/GPU โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
│ logBuffer[] (≤500) sessionUrl gpuMetrics {epoch, loss, accuracy, cost} │
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
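For readers who prefer types to boxes, the TrackedJob model might be written as the interface below. Field names are taken from the diagram; every TypeScript type, the optionality markers, and the `appendLog` helper are assumptions made for illustration.

```typescript
// Field names follow the TrackedJob diagram; types and optionality are assumptions.
interface TrackedToolCall {
  name: string;
  args: unknown;
  result?: unknown;
  durationMs?: number;
}

interface TrackedJob {
  jobId: string;
  triggerRunId: string;
  jobType: string;
  jobLabel: string;
  stage: string;
  progress: number; // 0-100
  message: string;
  steps: string[];
  startedAt: string;
  completedAt?: string;
  error?: string;
  result?: unknown;
  // Skill transparency
  skillId?: string;
  skillName?: string;
  executionType?: string;
  currentIteration?: number;
  maxIterations?: number;
  // ReAct transparency
  thinkingLog: string[];
  toolCalls: TrackedToolCall[];
  // Billing
  billing?: { tokensIn: number; tokensOut: number; costUSD: number; provider: string };
  // Governance
  riskLevel?: string;
  flaggedForReview?: boolean;
  dataResidency?: string;
  // HPC/GPU
  logBuffer: string[]; // capped at 500 entries
  sessionUrl?: string;
  gpuMetrics?: { epoch: number; loss: number; accuracy: number; cost: number };
}

const LOG_BUFFER_MAX = 500;

// Hypothetical helper: append a log line and keep only the newest 500.
function appendLog(job: TrackedJob, line: string): TrackedJob {
  const logBuffer = job.logBuffer.concat([line]).slice(-LOG_BUFFER_MAX);
  return { ...job, logBuffer };
}
```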
A7. Multi-tenant isolation - v3 (current, 4 layers)
Incoming request
โ
โ X-Organization-Id, X-App-Id, X-User-Id (JWT-derived)
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Layer 1: MIDDLEWARE โ Reject if tenant headers missing
โ Express / Next.js middleware โ Set req.orgId / req.appId / req.userId
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Layer 2: POSTGRES RLS โ SET app.current_company_id = ...
โ Session-var-driven RLS policies โ SELECT/INSERT/UPDATE/DELETE all gated
โ Every table has USING / WITH CHECK โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Layer 3: VECTOR + GRAPH FILTER โ Qdrant: org_id in payload, filter on search
โ Qdrant payload filter โ Neo4j: org_id label, WHERE in every Cypher
โ Neo4j Cypher WHERE clause โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Layer 4: ISTIO SERVICE MESH โ mTLS between pods
โ AuthorizationPolicy per service โ Caller-identity SPIFFE ID verified
โ NetworkPolicy whitelist โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
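Layers 1 and 2 can be sketched as two small functions: one that rejects a request lacking tenant headers, and one that produces the statement setting the RLS session variable. This is an illustration under assumptions: `resolveTenant`, `rlsSessionQuery`, and the 400 status are hypothetical, the sketch does not verify the JWT the headers are derived from, and PostgreSQL's `set_config(name, value, is_local)` is used here as one way to set `app.current_company_id` with a bound parameter.

```typescript
// Hypothetical sketch of Layer 1 (header check) and Layer 2 (RLS session variable).
interface TenantContext {
  orgId: string;
  appId: string;
  userId: string;
}

type HeaderBag = { [name: string]: string | undefined };

type TenantResult =
  | { ok: true; tenant: TenantContext }
  | { ok: false; status: number; missing: string[] };

const REQUIRED_HEADERS = ["x-organization-id", "x-app-id", "x-user-id"];

// Layer 1: reject when any tenant header is absent (status code is an assumption).
function resolveTenant(headers: HeaderBag): TenantResult {
  const missing = REQUIRED_HEADERS.filter((h) => !headers[h]);
  if (missing.length > 0) return { ok: false, status: 400, missing };
  return {
    ok: true,
    tenant: {
      orgId: headers["x-organization-id"] as string,
      appId: headers["x-app-id"] as string,
      userId: headers["x-user-id"] as string,
    },
  };
}

// Layer 2: parameterised statement that scopes the current transaction to one tenant.
function rlsSessionQuery(t: TenantContext): { text: string; values: string[] } {
  return {
    text: "SELECT set_config('app.current_company_id', $1, true)",
    values: [t.orgId],
  };
}
```

Passing the organisation id as a bound value rather than interpolating it into a `SET` statement keeps the tenant identifier out of the SQL text.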
A8. nexus-cli - v3 (current capabilities)
$ nexus services list              # Auto-discover 44+ microservices
$ nexus mcp tools                  # List 70+ MCP tools
$ nexus ask "prompt"               # ReAct agent (≤20 iterations)
$ nexus workflows list             # List workflow templates
$ nexus workflows run <template>   # Execute template
$ nexus session save <name>        # Checkpoint
$ nexus session resume <name>      # Restore
$ nexus monitor                    # Real-time dashboard
GAPS:
❌ No first-class dispatch to orchestrator /api/v1/dispatch
❌ No streaming tail of run events
❌ No chain DAG visualisation
❌ No skill publish / sign / verify
❌ No airgap bundle build
❌ Not integrated with PCC
A9. Plugin template - v3 (current)
plugin/
├── frontend/                                  Next.js 14 static export
│   ├── app/
│   │   ├── layout.tsx                         BrandingProvider
│   │   └── dashboard/{slug}/
│   │       └── page.tsx                       PluginGate (JWT)
│   ├── components/gates/                      PluginGate.tsx
│   ├── stores/
│   │   ├── dashboard-store.ts                 auth token
│   │   ├── progress-command-center-store.ts   PCC integration
│   │   ├── plugin-ws-store.ts                 WS state
│   │   └── theme-store.ts
│   └── hooks/
│       ├── useEmbedded.ts                     iframe detect
│       └── usePageContext.ts                  Terminal Computer ctx
├── backend/                                   Express + TS
│   ├── routes/
│   ├── middleware/auth.ts                     JWT validator
│   └── services/
├── nexus.manifest.json                        plugin metadata
└── k8s/deployment.yaml
4.B v4.0 - Architecture
B1. v4.0 service topology (additions marked ★ NEW)
                        ┌─────────────────────────────────────────────────────────────┐
                        │                     ADVERANT NEXUS v4.0                     │
                        │  One codebase, four profiles: cloud | VDS | on-prem | air   │
                        └──────────────────────────────┬──────────────────────────────┘
                                                       │
        ┌──────────────────┬──────────────────┬────────┴─────────┬──────────────────┬──────────────────┐
        ▼                  ▼                  ▼                  ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│ Ingress       │  │ Gateway       │  │ Orchestrator  │  │ Workflows     │  │ Skills Engine │  │ NEXUS-AUTH    │
│ Istio         │─▶│ AI Router     │──│ Dispatch      │  │ BullMQ        │  │ SKILL v2      │  │ AES-256       │
│ OPA/Rego ★    │  │               │  │               │  │ Workers       │  │ signed+SBOM ★ │  │ KEK/HSM ★     │
└───────────────┘  │ + A2A Plane ★ │  │ + Hooks ★     │  │ + per-tier ★  │  │ + QualScore ★ │  │ + RBAC/SSO    │
                   └───────────────┘  │ + Policy ★    │  │ queues        │  │ + Optimizer   │  └───────────────┘
                                      │ + Replay ★    │  └───────────────┘  │   prompts ★   │
                                      │ + FinOps ★    │                     └───────────────┘
                                      │ + Govern ★    │
                                      │ + Tier 4 SM ★ │
                                      └───────┬───────┘
                                              │
  ┌───────────────┐   ┌───────────────┐       │     ┌───────────────┐     ┌─────────────────┐
  │ MEMORY BANK ★ │   │ SPAN STORE ★  │       │     │ MARKETPLACE ★ │     │ POLICY ENGINE ★ │
  │ per-tenant    ├───┤ Postgres +    ├───────┼────▶│ sigstore+SBOM │────▶│ OPA/Rego        │
  │ KEK encrypt   │   │ ClickHouse    │       │     │ publish flow  │     │ versioned       │
  │ checkpoints   │   │ (tiered)      │       │     │               │     │ rules           │
  └───────────────┘   └───────────────┘       │     └───────────────┘     └─────────────────┘
                                              │
                            ┌─────────────────┼─────────────────┐
                            ▼                 ▼                 ▼
                     ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
                     │ REDIS       │   │ POSTGRES    │   │ QDRANT      │
                     │ Pub/Sub     │   │ RLS+audit   │   │ per-tenant  │
                     │ + BullMQ    │   │ + spans     │   │ namespaces  │
                     └─────────────┘   └──────┬──────┘   └─────────────┘
                                              │
                            ┌─────────────────┼─────────────────┐
                            ▼                 ▼                 ▼
                     ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
                     │ NEO4J       │   │ HSM/TPM     │   │ AIRGAP ★    │
                     │ GraphRAG    │   │ KMS ★       │   │ bundle      │
                     └─────────────┘   └─────────────┘   │ registry    │
                                                         └─────────────┘
CLI 2.0 ★                                       PCC v2 ★
┌──────────────────────────┐                    ┌────────────────────────────┐
│ nexus dispatch           │                    │ Live spans + hooks + cost  │
│ nexus runs tail          │                    │ HITL inbox                 │
│ nexus chain visualize    │                    │ Marketplace browser        │
│ nexus skill publish      │ ─── WS stream ───▶ │ Governance tab             │
│ nexus airgap bundle      │                    │ Replay scrubber            │
│ nexus governance export  │                    │ FinOps dashboard           │
└──────────────────────────┘                    └────────────────────────────┘
B2. v4.0 end-to-end dispatch flow
Client Orchestrator Policy Skills Memory AI Router Span PCC / CLI
Engine Marketplace Bank Store
โ โ โ โ โ โ โ โ
โ POST โ validate zod โ โ โ โ โ โ
โ /dispatch โ โ โ โ โ โ โ
โโโโโโโโโโโโถโ โ โ โ โ โ โ
│ │ PreDispatch HOOKS ★ │ │ │ │ │
โ โโ input classifier (LLM01 injection guard) โ
โ โโ residency check โ
โ โโ budget check (FinOps) โ
โ โโ export-control check โ
โ โโ risk-tier gate (EU AI Act) โ
โ โ โ โ โ โ โ โ
โ โ evaluate policy โโถโ โ โ โ โ โ
โ โโโ allow/deny + conditions โโโโโโโโโ โ โ โ โ
โ โ โ
โ โ resolve skill โโโถโ verify sig + SBOM + qual-score โฅ threshold โ
โ โ โโโโโโโโ skill contract + exec_config v2 โโโค โ
โ โ โ โ
โ โ load memory checkpoint โโโโโโโโโโถโ โ
โ โโโโโโ per-tenant decrypted state โโค โ
โ โ โ
โ โ insert run + span(root) โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ
โ โ publish job:dispatched โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ
โ 202 ack โ โ
โโโโโโโโโโโโโค enqueue BullMQ per-tier queue โ
โ โ โ
โ โ โโโโโโโโโโโ worker dequeues, selects tier state machine โโโโโโโโโโ โ
โ โ โ
โ โ T1 llm_only โโโโโถ AI Router (single call) โโโถ span โโโถ PCC โ
โ โ T2 tool_using โโโถ ReAct loop + PreToolUse/PostToolUse hooks + spans โ
โ โ T3 chain โโโโโโโโถ DAG coordinator (checkpointed) + PreChainStep hooks + spans โ
โ โ T4 autonomous โโโถ multi-agent state machine + HITL waitpoints + cost caps โ
โ โ โ
│ │ PostDispatch HOOKS ★ sign artefact (C2PA) → write replay manifest → FinOps debit │
โ โ โ
โ โ job:completed + governance-evidence-row โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถ โ
โ โ memory checkpoint save (tenant-KEK encrypt) โโโโโโโโโโโโโถโ โ
B3. Tier 1 - llm_only (reframed, v4.0)
โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ
โ Dispatch โโโโโโโโโถโ PreDispatch โโโโโโโโโถโ AI Router โโโโโโโโโโถโ Struct. Out โ
โ validated โ โ hooks โ โ role=defaultโ โ validate+ โ
โโโโโโโโโโโโโโโโ โ + policy โ โ cache read โ โ self-correct โ
โ + budget โ โ โ โ up to R=3 โ
โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ โโโโโโโโฌโโโโโโโโ
โ โ
โผ โผ
โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ
โ Span: llm_ โ โ Sign output โ
โ call (typed)โ โ C2PA manifestโ
โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโ
โ PostDispatch โ
โ FinOps debit โ
โ PCC emit โ
โโโโโโโโโโโโโโโโ
B4. Tier 2 - tool_using (ReAct with hooks, v4.0)
       Start
         │
         ▼
  ┌──────────────┐   iter=0
  │ System prompt│   max = exec_config.maxIterations
  │ + memory     │
  │ + tool specs │
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐   ┌─────────────────────┐
  │ AI Router    │──▶│ PostLLMCall HOOK ★  │
  │ (role=code)  │   │ - PII scan          │
  └──────┬───────┘   │ - injection detect  │
         │           └──────────┬──────────┘
         │ has tool_call?       │
         │ no ──────────────────┼──▶ final answer ──▶ struct-out validate ──▶ PCC complete
         │ yes                  │
         ▼                      │
  ┌──────────────┐              │
  │ PreToolUse ★ │              │
  │ - allowlist  │              │
  │ - arg rewrite│  deny ──▶ error, escalate
  │ - cost check │
  └──────┬───────┘
         │ allow
         ▼
  ┌──────────────┐
  │ Execute tool │
  │ span+sign    │
  └──────┬───────┘
         ▼
  ┌──────────────┐
  │ PostToolUse ★│
  │ - redact     │
  │ - cache      │
  │ - FinOps     │
  └──────┬───────┘
         ▼
  iter++; loop if iter < max else OnIterationLimit hook ▶ escalate to Tier 4 or fail
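The loop above can be sketched as a small synchronous function. This is a minimal illustration, assuming the LLM call, tool runner, and hooks are injected as plain functions; `runReact`, `ReactDeps`, and the outcome shapes are hypothetical names, and the real worker would be asynchronous and emit spans at each step.

```typescript
// Hypothetical sketch of the Tier 2 loop: LLM turn, PreToolUse gate, tool, PostToolUse.
type ReactToolCall = { name: string; args: { [key: string]: unknown } };
type LlmTurn = { toolCall?: ReactToolCall; answer?: string };
type HookDecision = { allow: true } | { allow: false; reason: string };

interface ReactDeps {
  callLlm: (history: string[]) => LlmTurn; // stands in for the AI Router
  preToolUse: (call: ReactToolCall) => HookDecision; // allowlist / cost check
  runTool: (call: ReactToolCall) => string;
  postToolUse: (result: string) => string; // e.g. redaction
}

type ReactOutcome =
  | { status: "completed"; answer: string; iterations: number }
  | { status: "denied"; reason: string }
  | { status: "iteration_limit"; iterations: number };

function runReact(deps: ReactDeps, maxIterations: number): ReactOutcome {
  const history: string[] = [];
  for (let iter = 0; iter < maxIterations; iter++) {
    const turn = deps.callLlm(history);
    // No tool call means the model produced its final answer.
    if (!turn.toolCall) {
      return { status: "completed", answer: turn.answer ?? "", iterations: iter + 1 };
    }
    const decision = deps.preToolUse(turn.toolCall);
    if (decision.allow === false) return { status: "denied", reason: decision.reason };
    const raw = deps.runTool(turn.toolCall);
    history.push(deps.postToolUse(raw));
  }
  // Caller decides whether to escalate to Tier 4 or fail (OnIterationLimit).
  return { status: "iteration_limit", iterations: maxIterations };
}
```

Returning a tagged outcome rather than throwing keeps the deny and iteration-limit paths explicit, so the caller can route them to the escalation hooks shown in the diagram.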
B5. Tier 3 - chain (persistent DAG coordinator, v4.0)
exec_config.chainSteps = [ A, B, C, D, E ]
Persistent state: orchestrator.chain_runs + chain_steps (Postgres, not Redis TTL)
โโโโโ step A โโโโโ
โ (Tier 1) โ
โโโโโโโโโฌโโโโโโโโโ
โ output.x
โโโโโโโโโโโผโโโโโโโโโโ
โผ โผ โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ step B โโ step C โโ step D โ PARALLEL fork (DAG)
โ (Tier 2โโ (Tier 1โโ (MCP โ PreChainStep hook per branch
โ ReAct) โโ llm) โโ tool) โ
โโโโโฌโโโโโโโโโโฌโโโโโโโโโโฌโโโโโ
โ โ โ
โโโโโโฌโโโโโดโโโโโฌโโโโโ join(all) (or any, or quorum)
โ โ
โผ โผ
โโโโโโโโโโโโโโโโโโ
โ step E โ checkpoint after each step
โ (Tier 2 w/ โ resumable after worker crash
โ Memory Bank) โ visual editor in dashboard
โโโโโโโโโฌโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโ
โ Done โ โ
โ C2PA signโ
โ + export โ
โโโโโโโโโโโโ
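The join step after the parallel fork accepts three policies: all, any, or quorum. One way to decide whether the downstream step may start is sketched below. The failure semantics are assumptions, since the diagram names the policies but does not say what happens when branches fail; `joinDecision` and its types are hypothetical.

```typescript
// Hypothetical sketch of the fork/join decision for parallel chain branches.
type JoinPolicy = { kind: "all" } | { kind: "any" } | { kind: "quorum"; min: number };
type BranchState = "pending" | "succeeded" | "failed";
type JoinDecision = "wait" | "proceed" | "fail";

function joinDecision(policy: JoinPolicy, branches: BranchState[]): JoinDecision {
  const succeeded = branches.filter((b) => b === "succeeded").length;
  const pending = branches.filter((b) => b === "pending").length;
  // Number of successes the policy needs before the next step may run.
  const needed =
    policy.kind === "all" ? branches.length : policy.kind === "any" ? 1 : policy.min;
  if (succeeded >= needed) return "proceed";
  // Assumed: fail as soon as the target can no longer be reached.
  if (succeeded + pending < needed) return "fail";
  return "wait";
}
```

Because the decision depends only on branch states, it can be recomputed from the persisted chain_steps rows after a worker crash.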
B6. Tier 4 - autonomous (multi-agent + HITL + cost cap, v4.0)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโ TIER 4 STATE MACHINE โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ
โ start โโโถ plan โโโถ execute โโโถ review โโโถ complete โ
โ โ โ โ โ โ โ
โ โ โ โ โ โโโถ artefact sign + C2PA โ
โ โ โ โ โ โ
โ โ โ โ โโโถ HITL waitpoint (risk=high) โ
โ โ โ โ โ โ โ
โ โ โ โ โ โผ โ
โ โ โ โ โ โโโโโโโโโโโโโโโโ โ
โ โ โ โ โ โ reviewer UI โ โโ approve โโโถ resume โ
โ โ โ โ โ โ inbox on PCC โ โโ reject โโโถ replan โ
โ โ โ โ โ โ evidence bundle โ
โ โ โ โ โ โโโโโโโโโโโโโโโโ โ
โ โ โ โ โ โ
โ โ โ โ โโโถ quality-eval < threshold โโถ replan โ
โ โ โ โ โ
โ โ โ โโโถ spawn sub-agents (crew/handoff/groupchat pattern) โ
โ โ โ max agents, max cost cap, timeout โ
โ โ โ โ
โ โ โโโถ cost cap exceeded โโถ OnCostThreshold hook โโถ HITL or abort โ
โ โ โ
โ โโโถ OnTierEscalation hook (escalated from Tier 2 iter-limit) โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
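The state machine can be sketched as a pure transition function. This is an illustration under assumptions: the phase and event names are hypothetical, approval is assumed to resume the phase that was interrupted, and sub-agent spawning and hook invocation are left out.

```typescript
// Hypothetical sketch of the Tier 4 transitions shown in the diagram.
type Phase = "plan" | "execute" | "review" | "hitl_wait" | "complete" | "aborted";

type RunEvent =
  | { type: "planned" }
  | { type: "executed" }
  | { type: "review_ok"; risk: "low" | "high" }
  | { type: "quality_below_threshold" }
  | { type: "cost_cap_exceeded"; action: "hitl" | "abort" }
  | { type: "hitl_approved" }
  | { type: "hitl_rejected" };

interface RunState {
  phase: Phase;
  resumeTo?: Phase; // where an approved HITL waitpoint continues (assumed)
}

function step(s: RunState, e: RunEvent): RunState {
  switch (e.type) {
    case "planned":
      if (s.phase === "plan") return { phase: "execute" };
      break;
    case "executed":
      if (s.phase === "execute") return { phase: "review" };
      break;
    case "review_ok":
      if (s.phase === "review") {
        return e.risk === "high"
          ? { phase: "hitl_wait", resumeTo: "complete" }
          : { phase: "complete" };
      }
      break;
    case "quality_below_threshold":
      if (s.phase === "review") return { phase: "plan" }; // replan
      break;
    case "cost_cap_exceeded":
      if (s.phase === "execute") {
        return e.action === "hitl"
          ? { phase: "hitl_wait", resumeTo: "execute" }
          : { phase: "aborted" };
      }
      break;
    case "hitl_approved":
      if (s.phase === "hitl_wait" && s.resumeTo) return { phase: s.resumeTo };
      break;
    case "hitl_rejected":
      if (s.phase === "hitl_wait") return { phase: "plan" }; // replan
      break;
  }
  throw new Error("invalid transition: " + s.phase + " on " + e.type);
}
```

Keeping the function pure means each transition can be persisted as a checkpoint and replayed, which fits the deterministic-replay goal stated elsewhere in the paper.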
B7. Skill Marketplace 2.0 - publication + verification flow
Developer Skill Registry Marketplace Runtime
โ โ โ โ
โ nexus skill publish โ โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ โ
โ โ 1. lint SKILL.md v2 โ โ
โ โ 2. generate SBOM โ โ
โ โ 3. sign (sigstore) โ โ
โ โ 4. adversarial-eval run โ โ
โ โ 5. quality-score init โ โ
โ โ 6. semver bump โ โ
โ โ 7. cross-ref MITRE ATLASโ โ
โ โ + OWASP LLM Top 10 โ โ
โ โ 8. risk-tier (EU AI Act)โ โ
โ โ โ โ
โ โ INSERT ros.skill_reg v2 โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ
โ โ โ broadcast โ
โ โ โ marketplace:update โ
โ โ โโโโโโโโโโโโโโโโโโโโโถโ
โ โ โ โ
โ โ โ At dispatch, runtime:
โ โ โ 1. verify signature
โ โ โ 2. check SBOM (no CVEs)
โ โ โ 3. check quality score โฅ tenant threshold
โ โ โ 4. check risk-tier allowed
โ โ โ 5. check export tags vs org
โ โ โ 6. version-pin or "latest-minor"
โ โ โ
โ โ โ runtime telemetry โโถ re-score
โ โ โ auto-rollback if quality drops
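The dispatch-time checks listed on the runtime side can be sketched as a single admission function that reports which checks failed. This is a sketch under assumptions: the record shapes are hypothetical, signature and SBOM verification are assumed to have run upstream and are represented here by their results, and version pinning is omitted.

```typescript
// Hypothetical sketch of the runtime admission checks performed at dispatch.
type RiskTier = "minimal" | "limited" | "high" | "unacceptable";

interface SkillRecord {
  signatureValid: boolean; // outcome of an upstream signature check (assumed)
  openCves: string[]; // CVE ids reported against the SBOM (assumed)
  qualityScore: number;
  riskTier: RiskTier;
  exportTags: string[];
}

interface TenantPolicy {
  qualityThreshold: number;
  allowedRiskTiers: RiskTier[];
  permittedExportTags: string[];
}

// Returns the names of the failed checks; an empty list means the skill may run.
function admissionFailures(skill: SkillRecord, tenant: TenantPolicy): string[] {
  const failures: string[] = [];
  if (!skill.signatureValid) failures.push("signature");
  if (skill.openCves.length > 0) failures.push("sbom");
  if (skill.qualityScore < tenant.qualityThreshold) failures.push("quality");
  if (tenant.allowedRiskTiers.indexOf(skill.riskTier) === -1) failures.push("risk_tier");
  const blocked = skill.exportTags.filter((t) => tenant.permittedExportTags.indexOf(t) === -1);
  if (blocked.length > 0) failures.push("export");
  return failures;
}
```

Reporting every failed check, rather than stopping at the first, gives the rejection message enough detail for the caller to act on.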
B8. Hooks lifecycle - v4.0 (first-class orchestrator primitive)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโ REQUEST LIFECYCLE WITH HOOK POINTS โโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ
โ PreDispatch โโโถ PostSkillResolve โโโถ PreTierSelect โโโถ tier dispatch โ
โ โ โ
โ โ (per iteration / step) โ
โ โ โ
โ โโโโโโดโโโโโ โโโโโ PreToolUse โโโโ PostToolUse โโโโโ โ
โ โ deny โ โ โ โ
โ โ enrich โ โโโ PreLLMCall โโโฌโโโโค โ โ
โ โ reroute โ โ โ โ โ โ
โ โโโโโโโโโโโ โ PostLLMCall โโโค โโโโ PreChainStep โโ PostChainStep โโโโค โ
โ โ โ โ โ
โ โโโโโโโโโโโโโโโโโโดโโโโโ OnCostThreshold โโโโโโโโโโโโโโโโโโโโค โ
โ โโโโโโ OnIterationLimit โโโโโโโโโโโโโโโโโโโค โ
โ โโโโโโ OnTierEscalation โโโโโโโโโโโโโโโโโโโค โ
โ โโโโโโ OnHITLPause / OnHITLResume โโโโโโโโโค โ
โ โ โ
โ PostDispatch โโโโโ sign artefact โโโ FinOps debit โโโ replay manifest โโโโโ โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Hook definition (manifest in plugin or org config):
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ event: PreToolUse โ
โ scope: org | skill | plugin โ
โ target: skills/nexusros.code_edit โ
โ matcher: tool.name == 'write_file' && path.startsWith('/etc') โ
โ action: deny | rewrite | require_hitl | emit_event | call_webhook โ
โ policy_ref: opa/skills/restrict-etc.rego (if action=policy) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
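How a PreToolUse hook definition might be evaluated is sketched below. The manifest expresses the matcher as an expression string; this sketch substitutes a plain predicate function instead of parsing that expression language, which is a simplification and not the orchestrator's evaluator. `HookDef`, `ToolInvocation`, and `evaluatePreToolUse` are hypothetical names.

```typescript
// Hypothetical sketch: evaluate PreToolUse hooks against a tool invocation.
type HookAction = "deny" | "rewrite" | "require_hitl" | "emit_event" | "call_webhook";

interface ToolInvocation {
  skill: string; // e.g. "skills/nexusros.code_edit"
  tool: { name: string };
  path: string;
}

interface HookDef {
  event: "PreToolUse";
  scope: "org" | "skill" | "plugin";
  target: string;
  matcher: (inv: ToolInvocation) => boolean; // stands in for the matcher expression
  action: HookAction;
}

// The first hook whose target and matcher both apply decides; otherwise allow.
function evaluatePreToolUse(hooks: HookDef[], inv: ToolInvocation): HookAction | "allow" {
  for (const h of hooks) {
    if (h.target === inv.skill && h.matcher(inv)) return h.action;
  }
  return "allow";
}

// The example definition from the diagram, with the matcher written as code.
const restrictEtc: HookDef = {
  event: "PreToolUse",
  scope: "skill",
  target: "skills/nexusros.code_edit",
  matcher: (inv) => inv.tool.name === "write_file" && inv.path.indexOf("/etc") === 0,
  action: "deny",
};
```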
B9. Memory Bank + cryptographic per-tenant isolation
nexus-auth (KMS) Memory Bank Span/Run
โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ store
โ per-tenant โ โ Postgres + โ
โ KEK โโโโwrap/unwrapโโโโโโโค Redis โ
โ HSM-backed โ โ checkpoints โ
โ (cloud) โ โ โ
โ TPM-backed โ โ keys: โ
โ (airgap) โ โ org/skill/ โ
โโโโโโโโโโโโโโโ โ run/user โ
โ โ
โ values: โ
โ envelope- โ
โ encrypted โ
โ JSON โ
โโโโโโโโโโโโโโโ
โฒ
โ
write checkpoint (encrypt w/ DEK; wrap DEK w/ tenant KEK)
โ
Worker โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โค
โ
read checkpoint (request KEK-unwrap; decrypt DEK; decrypt value)
Guarantee: observability store can see span tree but NOT reasoning payloads.
Cross-tenant leak requires breach of nexus-auth KMS + span store.
B10. A2A + MCP dual-plane protocol
Nexus v4.0 Outside world
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ AGENT PLANE โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ โโโโA2Aโโโโโถโ Gemini Enterprise agent โ
โ โ Tier 4 autonomous agent โ โ โ MS Agent Framework agent โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ โ Third-party crew (CrewAI) โ
โ โ โ โ Bedrock AgentCore runtime โ
โ โผ โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ TOOL PLANE โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ โโโโMCPโโโโโถโ GitHub, Linear, Slack, โ
โ โ Tier 2/3 skill invokes โ โ โ SaaS + OSS MCP servers โ
โ โ MCP server (tool) โ โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ
โ MCP = vertical (I use tools) โ
โ A2A = horizontal (I collaborate) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
B11. Structured output + self-correction (typed skill I/O)
skill contract: input_schema, output_schema (Pydantic / Zod)
โ
โผ
โโโโโโโโโโโโโโโโโโโโ generate โโโโโโโโโโโโโโโโโ
โ LLM โโโโโโโโโโโโโโโโโถโ validator โ
โ (AI Router) โ โ (schema check)โ
โโโโโโโโโโโโโโโโโโโโ โโโโฌโโโโโโโโโโโโโ
โฒ โ pass โโโถ return
โ โ
โ retry โคR with โ fail
โ diagnostic prompt โผ
โ โโโโโโโโโโโโโโโโโ
โโโโโโโโโโโโโโโโโโโโโโโโค build critique โ
โ prompt w/ err โ
โโโโโโโโโโโโโโโโโ
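The validate-and-retry loop can be sketched as follows. This is a minimal illustration: the generator and validator are injected functions, the retry bound defaults to R=3 as in the Tier 1 diagram, and the wording of the critique prompt is an assumption. A real skill would derive the validator from its Zod or Pydantic output schema.

```typescript
// Hypothetical sketch of structured output with bounded self-correction.
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

function generateValidated<T>(
  generate: (prompt: string) => string, // stands in for the AI Router call
  validate: (raw: string) => Validation<T>, // stands in for the schema check
  prompt: string,
  maxRetries: number = 3,
): T {
  let current = prompt;
  // One initial attempt plus at most maxRetries corrective attempts.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = generate(current);
    const res = validate(raw);
    if (res.ok === true) return res.value;
    // Build the critique prompt from the validation error and the bad output.
    current =
      prompt +
      "\n\nYour previous output failed validation: " + res.error +
      "\nPrevious output: " + raw +
      "\nReturn corrected output only.";
  }
  throw new Error("output failed validation after " + maxRetries + " retries");
}
```

Throwing once the retry budget is exhausted, instead of returning a best-effort value, matches the no-fallback stance taken in the FinOps section.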
B12. Optimizer-compiled prompts (DSPy MIPROv2 / GEPA)
Evaluations (AgentCore-style quality evals)
โ
โผ
โโโโโโโโโโโโโโโโ compile โโโโโโโโโโโโโโโโ
โ skill source โโโโโโโโโโโโโโถโ optimised โโโโถ stored in registry as
โ prompt+few- โ MIPROv2 โ prompt + โ compiled-artifact v{N}
โ shots + โ GEPA โ few-shots โ
โ metric fn โโโโโโโโโโโโโโโ โ rollback on metric drop
โโโโโโโโโโโโโโโโ regress โโโโโโโโโโโโโโโโ
B13. Observability - Insights Agent + Polly-NL debug
Span Store (Postgres tier-hot + ClickHouse tier-cold)
โ
โ
โโโโโดโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ
โผ โผ
โโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ
โ INSIGHTS AGENT โ โ POLLY-NL DEBUG โ
โ clusters spans โ โ NL query of spansโ
โ into usage โ โ "why did run X โ
โ patterns; โ โ fail last night"โ
โ anomaly detect; โ โ โ summary + linksโ
โ cost hotspots โ โ to exact spans โ
โโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ
โ โ
โผ โผ
Dashboard Insights tab CLI: nexus debug nl "question"
B14. FinOps governance - pre-dispatch budget + circuit breaker
Org budget (Postgres): per-org / per-skill / per-user / per-day / per-month
                  │
                  ▼
┌─────────────────────────────────────────────────────┐
│ Dispatch arrives with estimated cost (model+tokens) │
└─────────────────┬───────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────┐
│ Check budget: remaining ≥ estimate?                 │
│   yes ─▶ reserve estimate (Redis atomic counter)    │
│   no  ─▶ reject dispatch w/ troubleshooting JSON    │
│          (see NO FALLBACKS contract)                │
└─────────────────┬───────────────────────────────────┘
                  │ dispatch proceeds
                  ▼
On each LLM call: debit actual cost, emit FinOps span
                  │
                  ▼
Per-run cost cap (Tier 4): if cumulative > cap →
  OnCostThreshold hook → HITL or auto-abort with partial results
                  │
                  ▼
Circuit breaker: per-skill failure rate > T% in N min →
  open for M min → dispatch rejected with breaker-open error
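The reserve-then-settle budget check and the failure-rate breaker can be sketched in memory. This is an illustration only: the diagram specifies a Redis atomic counter, which is replaced here by an in-process number, and `BudgetLedger`, `CircuitBreaker`, and their parameters are hypothetical. Timestamps are passed in explicitly so the behaviour is deterministic.

```typescript
// Hypothetical in-memory sketch; production would use an atomic Redis counter.
class BudgetLedger {
  private remaining: number;

  constructor(limitUsd: number) {
    this.remaining = limitUsd;
  }

  // Reserve the estimate up front; refuse when the budget cannot cover it.
  reserve(estimateUsd: number): boolean {
    if (this.remaining < estimateUsd) return false;
    this.remaining -= estimateUsd;
    return true;
  }

  // After the run, replace the reservation with the actual cost.
  settle(estimateUsd: number, actualUsd: number): void {
    this.remaining += estimateUsd - actualUsd;
  }

  balance(): number {
    return this.remaining;
  }
}

class CircuitBreaker {
  private outcomes: { at: number; ok: boolean }[] = [];
  private openUntil = 0;
  private thresholdPct: number;
  private windowMs: number;
  private openMs: number;

  constructor(thresholdPct: number, windowMs: number, openMs: number) {
    this.thresholdPct = thresholdPct;
    this.windowMs = windowMs;
    this.openMs = openMs;
  }

  // Record an outcome and open the breaker when the windowed failure rate exceeds T%.
  record(ok: boolean, now: number): void {
    this.outcomes = this.outcomes.filter((o) => now - o.at <= this.windowMs);
    this.outcomes.push({ at: now, ok });
    const failures = this.outcomes.filter((o) => !o.ok).length;
    if ((failures / this.outcomes.length) * 100 > this.thresholdPct) {
      this.openUntil = now + this.openMs;
    }
  }

  isOpen(now: number): boolean {
    return now < this.openUntil;
  }
}
```

A production breaker would probably also require a minimum sample count before opening, so that a single early failure does not trip it; the diagram does not specify one, so none is assumed here.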
B15. Deterministic replay + chain-of-custody (C2PA)
Original run Replay (cryptographically bit-for-bit)
โ โ
โโ inputs hashed (SHA-256) โโโโโโโ โ
โโ model version pinned โ โ
โโ prompt template hash โ โ
โโ temperature 0 or โ โ
โ seeded stochastic โ โ
โโ tool outputs captured โโโโ replay manifest โโโโโถ
โโ time freeze (virtual clock) โ (signed)
โโ RNG seeds captured โ
โโ span tree + hooks logged โ
โ
C2PA manifest on artefact: who signed, which model, which prompt hash,
which tools, which human approvals, run_id, replay_manifest_id
B16. Airgapped bundle mode - sealed offline deployment
BUILD (at Adverant, online) DELIVER (via encrypted USB / sneakernet)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ 1. Docker images (signed) โ โ Tamper-evident seal โ
โ 2. K8s manifests โโโtar.gzโโโถโ Offline registry manifest โ
โ 3. Model weights (pinned) โ โ GPG + sigstore signatures โ
โ 4. Postgres migrations โ โ Customer installs with nexus-cli โ
โ 5. Skill bundle (signed) โ โ on-prem airgapped cluster โ
โ 6. SBOM + licenses โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ 7. Provisioning TPM KEKs โ โ
โ 8. FIPS 140-3 modules โ โผ
โ 9. STIG base images โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ โ โ INSTALL (at customer, airgapped) โ
โ Manifest: โ โ โ
โ { images: [...], โ โ nexus airgap install <bundle.tar.gz> โ
โ skills: [...], โ โ โโ verify signatures โ
โ models: [...], โ โ โโ load images โ local registry โ
โ policies: [...] } โ โ โโ apply K8s manifests (no pull) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ โโ run DB migrations โ
โ โโ seed skill registry (pre-signed) โ
โ โโ TPM-wrap tenant KEKs โ
โ โโ emit readiness event โ
โ โ
โ nexus airgap update <new-bundle> โ
โ (delta bundle, same verify path) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
B17. Adverant Nexus CLI 2.0 - dispatch + streaming + PCC mirror
$ nexus login # OAuth / PAT
$ nexus org use <slug> # select tenant
$ nexus dispatch ros.code_edit \
--input @inputs.json \
--tier tool_using \
--provider gemini \
--model gemini-2.5-pro \
--risk high \
--budget 5.00 \
--tail
┌─ streams WS events ────────────────────────────────────────────────────┐
│ [run_id=abc trace=xyz]                                                 │
│ ▸ dispatched (2026-04-24T10:00:01Z)                                    │
│ ▸ skill_resolved skill=ros.code_edit v3.2.1 sig=✅                     │
│ ▸ hook:PreDispatch policies=5 passed                                   │
│ ▸ llm_call provider=gemini model=gemini-2.5-pro                        │
│ ▸ llm_response tokens=1240 cost=$0.003                                 │
│ ▸ tool_call write_file /src/foo.ts                                     │
│ ▸ tool_result ok                                                       │
│ ▸ completed cost=$0.004 runtime=8.2s artefact=sha256:...               │
└────────────────────────────────────────────────────────────────────────┘
$ nexus runs list --since 1h
$ nexus runs show <run_id> --json
$ nexus runs replay <run_id> # deterministic replay (Gap E)
$ nexus chain visualize <run_id> # ascii DAG in-terminal
$ nexus skill publish ./skill-dir --sign --sbom
$ nexus skill versions ros.code_edit
$ nexus skill rollback ros.code_edit v3.2.0
$ nexus airgap bundle --out ./bundle.tgz --skills all --models all
$ nexus airgap install ./bundle.tgz
$ nexus airgap update ./delta.tgz
$ nexus governance export --framework soc2 --out ./soc2-audit.tgz
$ nexus governance export --framework eu-ai-act --out ./eu-ai-act-conformity.tgz
$ nexus governance export --framework fedramp-high --out ./fedramp-package.tgz
$ nexus governance policies list
$ nexus governance policies apply ./policies/*.rego
$ nexus hooks list --scope org
$ nexus hooks apply ./hooks.yaml
$ nexus finops budgets
$ nexus finops burn-rate --org my-org --window 7d
$ nexus memory snapshot <run_id> --decrypt --out ./snapshot.json # needs KEK grant
$ nexus memory gc --org my-org --older-than 180d
$ nexus a2a peers list # A2A discovery
$ nexus a2a call <peer> <capability> --in ...
$ nexus debug nl "why did run abc fail last night" # Polly-NL
$ nexus insights cost-hotspots --window 30d # Insights Agent
B18. Bindings - UI button to resolved dispatch (v4.0 generalised from ros.skill_bindings)
USER PLUGIN UI DASHBOARD BFF BINDINGS SVC ORCHESTRATOR
โ โ โ โ โ
โ click "Score Lead" โ โ โ โ
โ (bound to key โ โ โ โ
โ lead.scoring.v2) โ โ โ โ
โโโโโโโโโโโโโโโโโโโโโโโโถโ โ โ โ
โ โ POST /bindings/resolve โ โ โ
โ โ { binding_key, โ โ โ
โ โ scope_ctx: { โ โ โ
โ โ user_id, project, โ โ โ
โ โ org_id }, โ โ โ
โ โ inputs: {...} } โ โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ โ
โ โ โ SELECT * FROM โ โ
โ โ โ ros.skill_bind.. โ โ
โ โ โ ORDER BY scope โ โ
โ โ โ precedence + โ โ
โ โ โ priority DESC โ โ
โ โ โ LIMIT 1 โ โ
โ โ โโโโโโโโโโโโโโโโโโโโถโ โ
โ โ โ โ resolve skill + โ
โ โ โ โ merge config โ
โ โ โ โ overrides โ
โ โ โ โ โ
โ โ โโโโ resolved โโโโโโโค โ
โ โ โ { skill_id, โ โ
โ โ โ tier, โ โ
โ โ โ provider, โ โ
โ โ โ model, โ โ
โ โ โ cost_cap, โ โ
โ โ โ risk_tier, โ โ
โ โ โ residency, โ โ
โ โ โ inputs_mapped,โ โ
โ โ โ hooks[], โ โ
โ โ โ policy_refs[] โ โ
โ โ โ } โ โ
โ โ โ โ
โ โ โ POST /dispatch โโโโโโโโโโโโโโโโโโโโโโโโโถโ
โ โ โ (with resolved metadata as dispatch โ
โ โ โ payload; PreDispatch hooks verify โ
โ โ โ cost_cap, residency, risk, export โ
โ โ โ before execution) โ
โ โโโโโโโโ run_id + ws tail โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โโโโ live PCC tile โโโโโโค โ
B19. Binding resolution scope hierarchy - "nearest wins, then priority wins"
Lookup: binding_key = "lead.scoring.v2"
┌─────────────────────────────────────────────┐
│ USER scope     (binding_key, user_id)       │ ← most specific
├─────────────────────────────────────────────┤
│ PROJECT scope  (binding_key, project_id)    │
├─────────────────────────────────────────────┤
│ ORG scope      (binding_key, org_id)        │
├─────────────────────────────────────────────┤
│ SYSTEM scope   (binding_key, scope=system)  │ ← most general (Adverant default)
└──────────────────────┬──────────────────────┘
                       │
                       ▼
Within the most-specific matching scope, pick binding with
max(priority 0-1000) where is_active=true and deleted_at IS NULL
                       │
                       ▼
If A/B experiment active on that binding_key:
split by split_ratio (hash user_id) → variant_a_skill_id | variant_b_skill_id
                       │
                       ▼
Apply config_overrides hierarchy (skill_definition.config →
binding.config_overrides → caller.runtime_overrides)
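The "nearest wins, then priority wins" rule and the override merge can be sketched in a few lines. This is an illustration under assumptions: the in-memory `Binding` shape is hypothetical (the source resolves via a SQL query against ros.skill_bindings), later layers are assumed to override earlier ones in the merge, and the A/B split is omitted.

```typescript
// Hypothetical sketch of scope-precedence binding resolution.
type Scope = "user" | "project" | "org" | "system";

interface Binding {
  bindingKey: string;
  scope: Scope;
  scopeId: string | null; // null for system scope
  priority: number; // 0-1000
  isActive: boolean;
  deletedAt: string | null;
  skillId: string;
}

interface ScopeCtx {
  userId: string;
  projectId: string;
  orgId: string;
}

const PRECEDENCE: Scope[] = ["user", "project", "org", "system"];

function resolveBinding(all: Binding[], key: string, ctx: ScopeCtx): Binding | undefined {
  const ids: Record<Scope, string | null> = {
    user: ctx.userId,
    project: ctx.projectId,
    org: ctx.orgId,
    system: null,
  };
  const live = all.filter((b) => b.bindingKey === key && b.isActive && b.deletedAt === null);
  // Walk from most specific to most general; the first scope with a match wins.
  for (const scope of PRECEDENCE) {
    const candidates = live.filter(
      (b) => b.scope === scope && (scope === "system" || b.scopeId === ids[scope]),
    );
    if (candidates.length > 0) {
      // Within the winning scope, the highest priority wins.
      return candidates.reduce((best, b) => (b.priority > best.priority ? b : best));
    }
  }
  return undefined;
}

// Later layers override earlier ones: skill config, then binding, then caller.
function mergeConfig(
  skillConfig: { [k: string]: unknown },
  bindingOverrides: { [k: string]: unknown },
  runtimeOverrides: { [k: string]: unknown },
): { [k: string]: unknown } {
  return { ...skillConfig, ...bindingOverrides, ...runtimeOverrides };
}
```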
B20. Binding metadata v4.0 - the full field set (extended from v3 ros.skill_bindings)
┌──────────────────────────────── BINDING v4.0 ────────────────────────────────┐
│ IDENTITY                                                                     │
│   id, organization_id, binding_key, scope, scope_id, priority                │
│                                                                              │
│ RESOLUTION TARGET                                                            │
│   skill_definition_id          (→ ros.skill_definitions.id)                  │
│   skill_version_pin            ("latest-minor" | "3.2.1" | "pinned")         │
│                                                                              │
│ EXECUTION ★ NEW                                                              │
│   tier                         (1 llm_only | 2 tool_using | 3 chain |        │
│                                 4 autonomous)                                │
│   provider_preference          (gemini | anthropic | claude_max |            │
│                                 openrouter | auto)                           │
│   model_preference             ("gemini-2.5-pro" | "claude-opus-4" |         │
│                                 role:fast | role:reasoning | auto)           │
│   routing_hint                 (fast | reasoning | code | long_context)      │
│   queue_name                   (BullMQ queue override)                       │
│   response_format              (json | text)                                 │
│                                                                              │
│ COST & LIMITS ★ NEW                                                          │
│   cost_cap_usd                 (per-run hard ceiling)                        │
│   daily_cap_usd                (per-binding per-day)                         │
│   token_cap_in / token_cap_out (per-run)                                     │
│   timeout_ms                                                                 │
│   max_iterations               (Tier 2/3/4)                                  │
│   max_sub_agents               (Tier 4)                                      │
│                                                                              │
│ GOVERNANCE ★ NEW                                                             │
│   risk_tier                    (minimal | limited | high | unacceptable)     │
│   data_residency               (eu_only | us_only | any | <region-tag>)      │
│   export_tags[]                (EAR / ITAR / dual-use)                       │
│   requires_hitl                (bool | on_risk_high)                         │
│   policy_refs[]                (OPA bundle refs)                             │
│   tier_restrictions[]          (starter | growth | enterprise |              │
│                                 unlimited)                                   │
│   phi_tagged                   (bool, HIPAA)                                 │
│   compliance_frameworks[]      (soc2 | eu-ai-act | iso42001 | hipaa |        │
│                                 fedramp | nist-airmf)                        │
│                                                                              │
│ HOOKS ★ NEW                                                                  │
│   hooks[]                      (PreDispatch | PreToolUse | PostLLMCall |     │
│                                 OnCostThreshold | OnHITLPause …)             │
│   allowed_tools[]              (tool allowlist for Tier 2/3)                 │
│   denied_tools[]                                                             │
│                                                                              │
│ INPUTS & MAPPING ★ NEW                                                       │
│   input_schema                 (JSON-Schema, enforces button payload)        │
│   inputs_mapping               (template exprs: {{ selectedEntity.id }})     │
│   output_target                (where result renders: toast | panel |        │
│                                 tab | new-window | plugin-callback)          │
│                                                                              │
│ UI PRESENTATION ★ NEW                                                        │
│   display_name, description, icon, placement[] (entity-toolbar |             │
│     batch-action | command-palette | page-header | context-menu)             │
│   confirmation                 (none | simple | strong | hitl)               │
│   badge                        (cost preview | tier badge | risk chip)       │
│   shortcut                     (keybinding, e.g. "mod+shift+s")              │
│                                                                              │
│ OBSERVABILITY & EXPERIMENTS                                                  │
│   ab_experiment_id             (→ ros.skill_ab_experiments.id, nullable)     │
│   quality_score_threshold      (auto-deactivate if runtime score <)          │
│   telemetry_tags{}             (for Insights Agent clustering)               │
│                                                                              │
│ LIFECYCLE                                                                    │
│   status (active|inactive|deprecated), is_active, deleted_at                 │
│   created_by, created_at, updated_by, updated_at                             │
│   config_overrides             (JSONB catch-all for forward-compat)          │
│   conditions{}                 (contextual match: agent_role, job_type,      │
│                                 entity_type)                                 │
└──────────────────────────────────────────────────────────────────────────────┘
B21. nexus.manifest.json declarative actions (plugin authors publish button defaults)
{
"plugin": { "slug": "leads", "version": "2.1.0" },
"actions": [
{
"id": "lead-score-quick",
"display_name": "Score Lead",
"description": "Run lead scoring on selected entity",
"binding_key": "lead.scoring.v2",
"default_skill_id": "uuid-of-scoring-skill",
"default_tier": 2,
"default_provider": "auto",
"default_model": "role:reasoning",
"cost_cap_usd": 0.50,
"risk_tier": "limited",
"data_residency": "any",
"requires_hitl": false,
"input_schema": {
"type": "object",
"required": ["entity_id"],
"properties": {
"entity_id": { "type": "string", "format": "uuid" }
}
},
"inputs_mapping": {
"entity_id": "{{ selectedEntity.id }}",
"enrichment_ctx": "{{ pageContext.enrichmentFlags }}"
},
"placement": ["entity-toolbar", "batch-action"],
"icon": "zap",
"output_target": "side-panel",
"confirmation": "none",
"badge": "cost-preview",
"shortcut": "mod+shift+l",
"compliance_frameworks": ["soc2"],
"allowed_tools": ["crm_lookup", "web_fetch"]
}
]
}
On install, the plugin's actions[] seed SYSTEM-scope bindings with
scope=system, priority=100. Org admins may override at org scope, and
power users at project or user scope. User overrides ≠ source code changes.
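How an action's inputs_mapping might be rendered against the page context before dispatch is sketched below. This is an assumption-laden illustration: the manifest shows templates of the form `{{ selectedEntity.id }}`, but the sketch supports only whole-value dotted paths, and `renderMapping`, `lookup`, and `missingRequired` are hypothetical helpers rather than the bindings service API.

```typescript
// Hypothetical sketch: resolve "{{ dotted.path }}" templates against a UI context.
const TEMPLATE = /^\{\{\s*([A-Za-z0-9_.]+)\s*\}\}$/;

// Walk a dotted path through nested objects; undefined when any segment is missing.
function lookup(ctx: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>((acc, key) => {
    if (acc !== null && typeof acc === "object") {
      return (acc as { [k: string]: unknown })[key];
    }
    return undefined;
  }, ctx);
}

function renderMapping(
  mapping: { [field: string]: string },
  ctx: unknown,
): { [field: string]: unknown } {
  const out: { [field: string]: unknown } = {};
  for (const field of Object.keys(mapping)) {
    const expr = mapping[field] ?? "";
    const m = TEMPLATE.exec(expr);
    // Whole-value templates resolve to the context value; anything else is a literal.
    out[field] = m ? lookup(ctx, m[1] ?? "") : expr;
  }
  return out;
}

// Check the "required" list of the action's input_schema against rendered inputs.
function missingRequired(required: string[], inputs: { [field: string]: unknown }): string[] {
  return required.filter((k) => inputs[k] === undefined || inputs[k] === null);
}
```

Running the required-field check on the rendered inputs lets the dashboard refuse a button click before any dispatch request is made.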
4.C User Journeys (end-to-end, including existing + new capabilities)
C1. End-user dispatches a skill from dashboard
USER DASHBOARD ORCHESTRATOR PCC PANEL
โ โ โ โ
โ click "Generate Report" โ โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ โ
โ โ POST /api/dispatch โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ
โ โ โ PreDispatch hooks โ
โ โ โ policy pass โ
โ โ โ budget reserve โ
โ โ โ enqueue โ
โ โโโโโ 202 {run_id,trace} โโโโโค โ
โ โ โ ws: dispatched โ
โ โโโโโ register TrackedJob โโโโโโโโโโโโโโโโโโโโโโโโถ โ
โ โ โ ws: skill_resolved โ
โ โ โ ws: llm_call โ
โ โ โ ws: llm_response โ
โ โ โ ws: tool_call โ
โ โ โ ws: tool_result โ
โ โ โ ws: completed โ
โ โ โ โ
โ see live progress โโโโโโโโโโโโโโ PCC panel renders every event โโโโโโโโโโโค
โ see cost counter โโโโโโโโโโโโโโ FinOps span streams cost โโโโโโโโโโโโโโโโค
โ see thinking log โโโโโโโโโโโโโโ ReAct spans stream tool+LLM โโโโโโโโโโโโโค
โ โ โ
โ click "Replay" โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโถโ GET /runs/{id}/replay โโโโโถ replay manifest โถ deterministic rerun
C2. Developer publishes a skill to marketplace
DEVELOPER CLI MARKETPLACE REGISTRY
โ โ โ โ
โ write SKILL.md v2 โ โ โ
โ write input/output โ โ โ
โ schema + adversarial evalsโ โ โ
โ โ โ โ
โ nexus skill publish โ โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ โ
โ โ lint โ SBOM โ sign โ โ
โ โ adversarial-eval run โ โ
โ โ risk-tier classify โ โ
โ โ MITRE/OWASP tag โ โ
โ โ โ โ
โ โ POST /marketplace/skills โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ
โ โ โ verify sig โ
โ โ โ verify SBOM โ
โ โ โ INSERT skill_reg โ
โ โ โโโโโโโโโโโโโโโโโโโโโถโ
โ โ โ broadcast update โ
โ โ โ quality-score init โ
โ โโโโ ok {id, version} โโโโโโโค โ
โโโโ ok โโโโโโโโโโโโโโโโโโโโโค โ โ
โ โ โ
โ โ runtime telemetry rolls quality-score โ
โ โ auto-rollback if score drops below threshold โ
C3. Admin configures tenant - providers, quotas, governance
ADMIN DASHBOARD AUTH DB / POLICY ENGINE
โ โ โ
โ Settings โ Providers โ โ
โ set default=Gemini โ โ
โ set reasoning=Claude-Sonnet4 โ โ
โ set code=Claude-Opus4 โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ
โ โ PUT /org/ai-config โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ
โ โ โ AES-256 store keys
โ โ โ update role map
โ โ โ
โ Settings โ FinOps โ โ
โ set daily budget $500 โ โ
โ set per-skill caps โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ โ
โ โ PUT /org/finops โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ
โ โ
โ Settings โ Governance โ โ
โ select frameworks: โ โ
โ [x] SOC 2 [x] EU AI Act โ โ
โ [x] HIPAA [x] NIST AI RMF โ โ
โ select residency: EU only โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ PUT /org/governance โโโโโโโโโโโถโ OPA bundle assembled
โ โ distributed to services
โ โ
โ Settings โ Hooks โ โ
โ apply hook YAML โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโถโ PUT /org/hooks โโโโโโโโโโโโโโโถโ hook manifest live
C4. Auditor exports compliance package
AUDITOR                      CLI                           GOVERNANCE SVC          EVIDENCE BUCKET
│                            │                             │                       │
│ nexus governance export    │                             │                       │
│   --framework soc2         │                             │                       │
│   --window 2026-Q1         │                             │                       │
│   --out soc2-audit.tgz     │                             │                       │
│───────────────────────────▶│                             │                       │
│                            │ POST /governance/export     │                       │
│                            │────────────────────────────▶│                       │
│                            │                             │ query spans           │
│                            │                             │ query policy versions │
│                            │                             │ query HITL decisions  │
│                            │                             │ query risk-tier hits  │
│                            │                             │ assemble traceability │
│                            │                             │ control-ID matrix     │
│                            │                             │──────────────────────▶│
│                            │                             │ sign package (GPG)    │
│                            │◀── signed .tgz ─────────────│                       │
│◀── download ───────────────│                             │                       │
C5. Airgapped customer installs + operates
CUSTOMER OPS                 CLI                           AIRGAPPED K8S
│                            │                             │
│ copy bundle.tgz via USB    │                             │
│                            │                             │
│ nexus airgap install       │                             │
│   --bundle ./bundle.tgz    │                             │
│───────────────────────────▶│                             │
│                            │ verify sigs (sigstore)      │
│                            │ verify SBOM vs allow-list   │
│                            │ load images → local reg     │
│                            │ apply K8s manifests         │
│                            │ run Postgres migrations     │
│                            │ seed skill registry         │
│                            │ TPM-wrap tenant KEKs        │
│                            │────────────────────────────▶│
│                            │                             │ pods Ready
│                            │◀── readiness event ─────────│
│                            │
│ Operate completely offline. nexus dispatch / runs tail / governance export
│ all work; A2A restricted to local peers; no external calls possible.
│
│ nexus airgap update --delta ./delta.tgz (monthly patch bundle)
C6. HITL reviewer approves a high-risk autonomous run
TIER-4 RUN                     HITL INBOX (PCC)       REVIEWER
│                              │                      │
│ reach waitpoint              │                      │
│ risk=high                    │                      │
│─────────────────────────────▶│                      │
│                              │ alert + evidence     │
│                              │  - plan              │
│                              │  - sub-agent spans   │
│                              │  - tool calls        │
│                              │  - cost so far       │
│                              │  - residency tags    │
│                              │  - schema diff       │
│                              │─────────────────────▶│
│                              │                      │
│                              │                      │ review evidence
│                              │                      │ decide
│                              │                      │
│                              │◀── approve/reject ───│
│◀── OnHITLResume ─────────────│ with note            │
│ continue OR replan           │                      │
│                              │                      │
│ Audit: decision + reviewer identity + policy version recorded as span
C7. Developer uses CLI to dispatch + tail + debug
$ nexus dispatch ros.refactor --in @issue-123.json --tail
(stream ...)
▸ completed run_id=r-789 artefact=sha256:beef...
$ nexus runs show r-789 --spans-tree
root ─ dispatch
├─ hook:PreDispatch [5 policies]
├─ skill_resolve
├─ tier_selected: tool_using
├─ iter=0
│  ├─ llm_call model=gemini-2.5-pro tokens=1,240 $0.003
│  └─ tool:write_file /src/foo.ts
├─ iter=1
│  └─ llm_call (final answer)
├─ hook:PostDispatch
└─ sign_c2pa
$ nexus debug nl "why was iter=0 slow?"
Insights: 3.2s spent in tool:write_file (p95 0.4s). Network stall to
sandbox filesystem. See span 0xabc for details.
$ nexus runs replay r-789           # deterministic bit-for-bit
$ nexus runs export r-789 --c2pa    # artefact + provenance manifest
C8. Power user reconfigures a plugin button via the Binding Editor (no code deploy)
POWER USER                 DASHBOARD                   BINDINGS SVC            POLICY ENGINE
│                          │                           │                       │
│ open plugin "Leads"      │                           │                       │
│ right-click "Score Lead" │                           │                       │
│ → "Edit action..."       │                           │                       │
│─────────────────────────▶│                           │                       │
│                          │ GET /bindings/resolve?    │                       │
│                          │   key=lead.scoring.v2     │                       │
│                          │   scope_ctx=me            │                       │
│                          │──────────────────────────▶│                       │
│                          │◀── current resolution ────│                       │
│                          │ (system default inherited)│                       │
│                          │                           │                       │
│ Binding Editor opens:
│   change tier:   2 → 3 (chain, chapter-style subtasks)
│   change model:  auto → claude-opus-4
│   cost_cap:      $0.50 → $2.00
│   requires_hitl: false → on_risk_high
│   allowed_tools: + crm.bulk_update
│   placement:     + command-palette
│   shortcut:      mod+shift+l
│   scope:         user
│   priority:      300
│
│ click Save               │                           │                       │
│─────────────────────────▶│ POST /bindings (scope=    │                       │
│                          │   user, user_id=me, …)    │                       │
│                          │──────────────────────────▶│                       │
│                          │                           │ validate against      │
│                          │                           │ org policy ──────────▶│
│                          │                           │◀── allow? ────────────│
│                          │                           │ (e.g. org disallows   │
│                          │                           │ claude-opus for PHI-  │
│                          │                           │ tagged skills →       │
│                          │                           │ override rejected;    │
│                          │                           │ reason returned)      │
│                          │◀── 201 Created (id) ──────│                       │
│◀── toast "saved" ────────│                           │                       │
│
│ click button again: the next dispatch uses the user-scoped binding (highest
│ precedence), which overrides the system default until removed.
│
│ Every binding change is an audit row: who/when/what/policy-verdict.
│ Config-drift detector flags bindings whose quality-score drops post-edit.
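The resolution step in C8 can be sketched in a few lines. This is a minimal illustration, not the Bindings service implementation: the scope ordering (user over org over system), the use of `priority` as a tie-breaker within a scope, and the names `Binding`, `SCOPE_RANK`, and `resolve_binding` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed precedence: a more specific scope beats a less specific one.
SCOPE_RANK = {"system": 0, "org": 1, "user": 2}


@dataclass(frozen=True)
class Binding:
    binding_key: str
    scope: str                      # "system" | "org" | "user"
    priority: int
    is_active: bool = True
    owner_id: Optional[str] = None  # org_id or user_id; None for system scope
    model: str = "auto"


def resolve_binding(bindings, key, user_id, org_id):
    """Return the winning active binding for `key`, or None if nothing applies."""
    def applies(b):
        if b.binding_key != key or not b.is_active:
            return False
        if b.scope == "user":
            return b.owner_id == user_id
        if b.scope == "org":
            return b.owner_id == org_id
        return b.scope == "system"

    candidates = [b for b in bindings if applies(b)]
    # Highest scope rank wins; priority breaks ties inside one scope.
    return max(candidates, key=lambda b: (SCOPE_RANK[b.scope], b.priority), default=None)


bindings = [
    Binding("lead.scoring.v2", "system", 100),
    Binding("lead.scoring.v2", "user", 300, owner_id="jane", model="claude-opus-4"),
]
winner = resolve_binding(bindings, "lead.scoring.v2", user_id="jane", org_id="acme")
print(winner.scope, winner.model)  # user claude-opus-4
```

A different user resolving the same key falls through to the system default, which matches the "overrides the system default until removed" behaviour described in the flow.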
4.D UI/UX Elements (dashboard + PCC + CLI surfaces)
D1. Dashboard layout (v4.0 additions marked ★)
┌──────────────────────────────────────────────────────────────────────────┐
│ adverant nexus   Org: ACME ▾   User: Jane   🔔 2 HITL★   💰 $213/500       │
├──────────────────────────────────────────────────────────────────────────┤
│ ┌────────────────┐ ┌───────────────────────────────────────────────────┐ │
│ │ SIDEBAR        │ │ MAIN                                              │ │
│ │ ● Home         │ │                                                   │ │
│ │ ● Plugins      │ │ ┌───────────────────────────────────────────────┐ │ │
│ │   - ROS        │ │ │ active plugin workspace                       │ │ │
│ │   - Prose      │ │ │                                               │ │ │
│ │   - QA         │ │ │                                               │ │ │
│ │ ● Marketplace★ │ │ │                                               │ │ │
│ │ ● Workflows    │ │ └───────────────────────────────────────────────┘ │ │
│ │ ● Skills★      │ │                                                   │ │
│ │ ● Chains★      │ │ ┌─── PCC PANEL (dockable) ──────────────────────┐ │ │
│ │ ● Insights★    │ │ │ active runs | HITL inbox★ | replay★ | finops★ │ │ │
│ │ ● Governance★  │ │ └───────────────────────────────────────────────┘ │ │
│ │ ● FinOps★      │ └───────────────────────────────────────────────────┘ │
│ │ ● Settings     │                                                       │
│ └────────────────┘                                                       │
└──────────────────────────────────────────────────────────────────────────┘
D2. PCC panel (v4.0 TrackedJob+)
┌────────────────────────── Progress Command Center ───────────────────────────┐
│ ● r-789  ros.refactor   tool_using  iter 2/8   $0.04/$5.00   🟢 running      │
│ ● r-790  prose.draft    chain       step 3/5   $0.12/$2.00   🟡 HITL wait    │
│ ● r-791  qa.regression  autonomous  agent 2/4  $2.80/$10.00  🔴 budget near  │
├──────────────────────────────────────────────────────────────────────────────┤
│ SELECTED RUN: r-789                                                          │
│                                                                              │
│ Tabs: Progress | Spans | Thinking | Tools | Hooks★ | Cost | Policy | Replay★ │
│                                                                              │
│ [Spans view: live tree]                                                      │
│ dispatch                                                                     │
│ ├─ hook:PreDispatch ✓                                                        │
│ ├─ skill_resolve ros.refactor v3.2.1 ✅ sig                                   │
│ ├─ iter 0                                                                    │
│ │  ├─ llm_call gemini-2.5-pro 1.2s $0.003                                    │
│ │  └─ tool:write /src/foo.ts 0.4s                                            │
│ └─ iter 1 (running)                                                          │
│    └─ llm_call gemini-2.5-pro ...                                            │
│                                                                              │
│ [FinOps bar] ░░░░░░░░░░░░░░░░ $0.04 / $5.00 budget                           │
│ [Policy bar] ✅ residency:eu ✅ risk:limited ✅ budget ✅ export                 │
└──────────────────────────────────────────────────────────────────────────────┘
D3. Governance tab: compliance control dashboard
┌───────────────────────────────── GOVERNANCE ─────────────────────────────────┐
│ Frameworks enabled: [x] SOC 2   [x] ISO 27001   [x] ISO 42001                │
│                     [x] EU AI Act   [x] NIST AI RMF   [x] HIPAA              │
│                     [x] FedRAMP Moderate   [ ] FedRAMP High                  │
│                                                                              │
│ Coverage:                                                                    │
│   EU AI Act  ████████████████████ 100% (27/27 controls mapped)               │
│   SOC 2 CC   █████████████████░░░  85% (58/68 controls mapped)               │
│   ISO 42001  ██████████████████░░  92% (36/39 mapped)                        │
│                                                                              │
│ Recent events:                                                               │
│   • risk=high run r-791 → HITL approved by alice@acme 2h ago                 │
│   • residency violation attempt blocked 9h ago (policy eu-only)              │
│   • 3 skills quality-score dropped → auto-rolled back yesterday              │
│                                                                              │
│ Export: [ Download SOC 2 package ] [ EU AI Act conformity ] [ FedRAMP ]      │
│                                                                              │
│ Policy engine: 42 active policies • last update 3h ago • OPA v1.3            │
└──────────────────────────────────────────────────────────────────────────────┘
D4. Marketplace UI: skill browsing
┌───────────────────────────── SKILL MARKETPLACE ──────────────────────────────┐
│ 🔍 search...                                                 sort: quality ▾ │
│                                                                              │
│ ros.code_edit        v3.2.1  ⭐4.9  ✅ signed  ⭐qual 0.94  low risk            │
│ ros.refactor         v1.4.0  ⭐4.8  ✅ signed  ⭐qual 0.91  low risk            │
│ prose.chapter_write  v2.0.1  ⭐4.7  ✅ signed  ⭐qual 0.88  limited risk        │
│ qa.pentest           v0.9.3  ⭐4.2  ✅ signed  ⭐qual 0.76  HIGH risk           │
│                                                                              │
│ Selected: ros.code_edit v3.2.1                                               │
│ Publisher: Adverant Inc • SBOM: ✅ • CVEs: 0 • License: Apache-2.0            │
│ MITRE ATLAS: AML.T0015 | OWASP LLM01 mitigated | EU AI Act: limited-risk     │
│ Runtime quality last 30d: 0.94 (trending +0.01)                              │
│ [ Install ] [ Pin version ] [ View source ] [ History ]                      │
└──────────────────────────────────────────────────────────────────────────────┘
D5. Chain visualizer: DAG editor/viewer
┌─────────────────────────── CHAIN: prose.full_book ───────────────────────────┐
│                                                                              │
│   [A: outline]──┬──▶[B: chapter1]──┐                                         │
│                 ├──▶[C: chapter2]──┼──▶[E: compile]                          │
│                 └──▶[D: chapter3]──┘                                         │
│                                                                              │
│ status: A✅ B✅ C🟢running D⏳queued E⏳waiting                                  │
│ checkpointed: ✅ (resumable from worker failure)                              │
│                                                                              │
│ [ Open editor ] [ Re-run failed steps ] [ Replay deterministic ]             │
└──────────────────────────────────────────────────────────────────────────────┘
D6. Span tree explorer + Polly-NL debug
┌──────────────────────────── SPAN EXPLORER r-789 ─────────────────────────────┐
│ Tree                                    Details (click any span)             │
│ ──────────────────────────────────      ──────────────────────────           │
│ dispatch              2.1s              span_id: 0xab12                      │
│ ├─ hook:PreDispatch   0.04s             type: llm_call                       │
│ ├─ skill_resolve      0.11s             provider: gemini                     │
│ ├─ iter 0             1.2s              model: gemini-2.5-pro                │
│ │  ├─ llm_call        0.9s  ◀── sel     tokens_in: 1240                      │
│ │  └─ tool:write      0.3s              tokens_out: 420                      │
│ ├─ iter 1             0.4s              cost: $0.003                         │
│ └─ sign_c2pa          0.02s             prompt_hash: sha256:...              │
│                                         model_version: 2025-11               │
│ NL debug box:                                                                │
│ ┌──────────────────────────────────────────────────────────────────────────┐ │
│ │ "why was iter 0 slow?"                                                   │ │
│ ├──────────────────────────────────────────────────────────────────────────┤ │
│ │ tool:write took 0.3s vs p95 0.08s. Sandbox FS stall. See span 0xabcd.    │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘
D7. FinOps dashboard
┌─────────────────────────────────── FINOPS ───────────────────────────────────┐
│ This month                                                                   │
│   Spent      $1,234.56 / $5,000.00  █████░░░░░░░░░░░░░░░  24.7%              │
│   Burn rate  $42/day (trending +8%)                                          │
│                                                                              │
│ By skill (top 5)                      By provider                            │
│   ros.code_edit        $421             Gemini      $812                     │
│   prose.chapter_write  $389             Anthropic   $298                     │
│   qa.regression        $176             Claude Max  $84                      │
│   prose.outline        $154             OpenRouter  $40                      │
│   ros.refactor         $94                                                   │
│                                                                              │
│ Circuit breakers: 2 open (qa.vision-heavy, prose.critic)                     │
│ Alerts: 1 skill trending to budget exhaustion in 6 days                      │
│                                                                              │
│ [ Set caps ] [ View cost by user ] [ View cost by tier ] [ Export CSV ]      │
└──────────────────────────────────────────────────────────────────────────────┘
D9. Binding Editor: visual editor for any button in any plugin
┌─────────────────────────────── BINDING EDITOR ───────────────────────────────┐
│ Binding key: lead.scoring.v2     Scope: [user ▾]     Priority: [300]         │
│ Status: [active ▾]               Pin version: [latest-minor ▾]               │
│                                                                              │
│ ─── TARGET SKILL ─────────────────────────────────────────────────────────── │
│ Skill: ros.lead_score ▾   v3.2.1   ✅ signed   ⭐0.94                          │
│                                                                              │
│ ─── EXECUTION ────────────────────────────────────────────────────────────── │
│ Tier:          ( )1 llm_only  ( )2 tool_using  ( )3 chain  ( )4 autonomous   │
│ Provider:      [ auto ▾ ]           Model: [ role:reasoning ▾ ]              │
│ Routing hint:  [ reasoning ▾ ]      Queue: [ default ▾ ]                     │
│ Response fmt:  [ json ▾ ]                                                    │
│                                                                              │
│ ─── COST & LIMITS ────────────────────────────────────────────────────────── │
│ Cost cap per run:  [ $2.00 ]          Daily cap:       [ $200 ]              │
│ Token in/out:      [ 50k / 10k ]      Timeout:         [ 120s ]              │
│ Max iterations:    [ 8 ]              Max sub-agents:  [ n/a ]               │
│                                                                              │
│ ─── GOVERNANCE ───────────────────────────────────────────────────────────── │
│ Risk tier:  ( )limited  ( )high       [ ] requires HITL always               │
│ Residency:  [ eu_only ▾ ]             [x] on risk=high only                  │
│ Export:     [ ] EAR  [ ] ITAR  [x] PHI-tagged (HIPAA)                        │
│ Policies:   [ eu-ai-act ] [ soc2 ] [ iso42001 ] +add                         │
│                                                                              │
│ ─── HOOKS & TOOLS ────────────────────────────────────────────────────────── │
│ Hooks:    [ PreDispatch ] [ PreToolUse ] [ OnCostThreshold ] +add            │
│ Allowed:  [crm_lookup] [web_fetch] [write_notes] +add                        │
│ Denied:   [crm_bulk_delete] [exec_shell]                                     │
│                                                                              │
│ ─── INPUTS MAPPING ───────────────────────────────────────────────────────── │
│ Schema:   required: entity_id (validated per dispatch)                       │
│ Mapping:  entity_id   → {{ selectedEntity.id }}                              │
│           enrich_ctx  → {{ pageContext.enrichmentFlags }}                    │
│           user_locale → {{ currentUser.locale }}                             │
│ Output:   [ side-panel ▾ ]                                                   │
│                                                                              │
│ ─── PRESENTATION ─────────────────────────────────────────────────────────── │
│ Display:    "Score Lead"    Icon: [zap ▾]    Shortcut: [ mod+shift+l ]       │
│ Placement:  [x] entity-toolbar  [x] batch-action  [ ] page-header            │
│ Confirm:    ( )none  ( )simple  ( )strong  ( )hitl                           │
│ Badge:      [x] cost-preview  [x] tier-badge  [ ] risk-chip                  │
│                                                                              │
│ ─── A/B EXPERIMENT ───────────────────────────────────────────────────────── │
│ [ + Start A/B ]    Current: none                                             │
│                                                                              │
│ ─── TELEMETRY ────────────────────────────────────────────────────────────── │
│ Live quality score: 0.91 (30d)    Cost avg: $0.38    p95 latency: 4.2s       │
│ Auto-deactivate if score < [ 0.75 ]                                          │
│                                                                              │
│ [ Save (scope: user) ] [ Preview dispatch ] [ Diff vs system default ]       │
└──────────────────────────────────────────────────────────────────────────────┘
D8. CLI interactive REPL (nexus shell)
$ nexus shell
nexus> help
  dispatch, runs, skills, chains, airgap, governance, finops,
  hooks, memory, a2a, insights, debug, session, org, login
nexus(ACME)> runs tail --since 5m
▸ r-801  ros.refactor   running    $0.01
▸ r-802  prose.chapter  HITL       $0.08
▸ r-803  qa.regression  completed  $2.40
nexus(ACME)> debug nl "cost hotspots last hour"
Insights: qa.regression is 58% of last-hour spend.
Suggest: switch qa.regression default model to gemini-flash (-$1.80/run).
nexus(ACME)> apply suggest 1
Applied. New default model for qa.regression: gemini-flash.
4.E Compliance + Security Integration
E1. EU AI Act risk-tier enforcement
skill metadata (registry)        dispatch time                               runtime
─────────────────────────        ─────────────                               ───────
risk_tier: unacceptable ───────▶ REJECT at dispatch ───────────────────────▶ (never executes)
risk_tier: high ───────────────▶ PreDispatch hook: require HITL ───────────▶ after run: conformity record
                                 + adversarial-eval                          + post-market monitoring
risk_tier: limited ────────────▶ PreDispatch hook: transparency notice ────▶ output: watermark + model-card link
risk_tier: minimal ────────────▶ (no extra gate)
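The tier-to-gate mapping in E1 can be written as a small lookup. The sketch below is illustrative only: the obligation labels are names invented for the example (they are not identifiers from the Nexus codebase), and the obligations for each tier are kept as one flat list rather than split into dispatch-time and runtime stages.

```python
from enum import Enum


class RiskTier(Enum):
    UNACCEPTABLE = "unacceptable"
    HIGH = "high"
    LIMITED = "limited"
    MINIMAL = "minimal"


# Obligations per tier, transcribed from the E1 diagram (labels are hypothetical).
OBLIGATIONS = {
    RiskTier.HIGH: [
        "require_hitl",
        "adversarial_eval",
        "conformity_record",
        "post_market_monitoring",
    ],
    RiskTier.LIMITED: ["transparency_notice", "output_watermark", "model_card_link"],
    RiskTier.MINIMAL: [],
}


def gate(risk_tier: str) -> dict:
    """Return the dispatch decision and the obligations attached to the run."""
    tier = RiskTier(risk_tier)  # raises ValueError for an unknown tier
    if tier is RiskTier.UNACCEPTABLE:
        # Rejected at dispatch: the skill never executes.
        return {"decision": "reject", "obligations": []}
    return {"decision": "allow", "obligations": list(OBLIGATIONS[tier])}


print(gate("unacceptable")["decision"])  # reject
print(gate("high")["obligations"][0])    # require_hitl
```

Raising on an unknown tier, rather than defaulting to `minimal`, keeps a mislabelled skill from silently bypassing the gate.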
E2. GDPR right-to-erasure atomic delete
nexus erase-subject --user <uid>
        │
        ▼
Orchestrator: open erasure-job (Tier 3 chain)
├── Postgres      DELETE FROM * WHERE user_id=... (RLS scoped)
├── Qdrant        delete points filter payload.user_id=...
├── Neo4j         MATCH (n) WHERE n.user_id=... DETACH DELETE n
├── Memory Bank   delete envelope + rotate tenant KEK (crypto-erasure)
├── Object store  delete artefacts + C2PA manifests
└── Backups       schedule retention-policy purge
        │
        ▼
Emit erasure-evidence span + sign certificate of erasure ──▶ report to DPO
E3. OWASP LLM Top 10 defense stack (per-dispatch)
Request ──▶ LLM01 prompt-injection classifier (PreDispatch hook)
        ──▶ LLM03 training-data-poisoning: skill SBOM + origin check
        ──▶ LLM04 denial-of-wallet: FinOps pre-reserve
        ──▶ LLM05 supply-chain: signed skills + SBOM + pinned models
        ──▶ LLM06 sensitive-disclosure: output scanner (PostLLMCall hook)
        ──▶ LLM07 insecure-plugin-design: hook/tool allowlist + scope
        ──▶ LLM08 excessive-agency: capability allowlist per skill+role
        ──▶ LLM09 overreliance: watermark + model card + human-oversight flag
        ──▶ LLM02 insecure-output: structured-output validator
        ──▶ LLM10 model-theft: rate limit + auth + airgapped option
E4. Envelope encryption, per-tenant KEKs
Data ──(DEK random)──▶ ciphertext (stored in span store / memory bank / object store)
DEK  ──(tenant KEK)──▶ wrapped DEK (stored alongside ciphertext)
tenant KEK: held by nexus-auth KMS
    cloud profile:  HSM-backed (FIPS 140-3)
    airgap profile: TPM-backed
    rotation:       quarterly (policy) or on-demand
Read path: unwrap DEK via tenant KEK (auth to KMS) → decrypt ciphertext
Erasure:   delete/rotate tenant KEK → all wrapped DEKs unusable → crypto-erasure
E5. Three-gate enforcement (Istio + service key + policy engine)
Caller (e.g., nexus-workflows) ──▶ Gateway (AI Router)
    ├─ Gate 1: Istio AuthorizationPolicy (SPIFFE)
    ├─ Gate 2: validateServiceKey (HMAC header)
    └─ Gate 3: OPA policy evaluation (per-request)
         eval: caller ∈ ALLOWED_AI_CALLERS
         eval: org.residency compatible with provider.region
         eval: org.budget has headroom
         eval: skill.risk_tier allowed by org.policy
         eval: export-control tags compatible
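Gate 2 in E5 names an HMAC header check. A minimal sketch of such a check follows; the message layout (caller name, newline, request body), the SHA-256 digest, and the hex encoding are assumptions for illustration, since the actual header format of `validateServiceKey` is not specified here.

```python
import hashlib
import hmac


def sign(service_key: bytes, caller: str, body: bytes) -> str:
    """Compute the signature a caller would place in its request header."""
    message = caller.encode("utf-8") + b"\n" + body
    return hmac.new(service_key, message, hashlib.sha256).hexdigest()


def validate_service_key(service_key: bytes, caller: str, body: bytes,
                         signature_header: str) -> bool:
    """Recompute the HMAC and compare in constant time."""
    expected = sign(service_key, caller, body)
    return hmac.compare_digest(expected, signature_header)


key = b"demo-key-not-for-production"
body = b'{"skill": "ros.refactor"}'
header = sign(key, "nexus-workflows", body)

print(validate_service_key(key, "nexus-workflows", body, header))         # True
print(validate_service_key(key, "nexus-workflows", body + b" ", header))  # False
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading characters of a forged signature were correct.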
E6. OPA/Rego policy evaluation flow
Dispatch ──▶ assemble decision input:
               { org, user, skill, tier, provider, model, inputs-schema,
                 risk_tier, data_residency, export_tags, cost_estimate,
                 time_of_day, caller_service }
         ──▶ POST /opa/v1/data/nexus/dispatch/allow
         ──▶ { result: allow | deny, reasons: [...], conditions: [...] }
         ──▶ if allow with conditions, attach to run (e.g. "require HITL",
             "restrict tools to allowlist", "redact PII in output")
E8. Binding-level policy enforcement (user overrides can't weaken org policy)
User saves a user-scope binding that overrides:
    • provider: gemini → claude-max
    • cost_cap: $0.50 → $10.00
    • requires_hitl: true → false
        │
        ▼
Bindings svc → POST /opa/v1/data/nexus/bindings/allow_override
    input: { org_policy, proposed_binding, existing_binding,
             skill_metadata, user_role }
        │
        ▼
OPA rules applied:
    • org.residency must not be widened (eu_only may not become any)
    • org.phi_required_providers must include proposed provider
    • org.max_cost_cap must be ≥ proposed cost_cap
    • org.hitl_mandatory_for_high_risk must be honoured
    • org.allowed_tools must cover proposed allowed_tools
    • user.role must have `bindings:write:<scope>` permission
        │
        ▼
allow → insert binding, emit audit span, broadcast marketplace:binding_updated
deny  → return { code, message, troubleshooting[] } (NO FALLBACKS contract)
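The six override rules above are evaluated by OPA in the design; the sketch below restates them in plain Python purely to make the logic concrete. The dictionary field names, the reason codes, the `phi_tagged` flag, and the choice to apply the PHI provider rule only to PHI-tagged skills are assumptions for the example, not the Rego policy itself.

```python
def check_override(org_policy: dict, proposed: dict, skill: dict,
                   user_permissions: set) -> dict:
    """Return an allow/deny verdict for a proposed binding override."""
    reasons = []

    # 1. Residency may not be widened beyond the org setting.
    if org_policy["residency"] != "any" and proposed["residency"] != org_policy["residency"]:
        reasons.append("residency_widened")
    # 2. PHI-tagged skills may only use providers the org has approved for PHI.
    if skill.get("phi_tagged") and proposed["provider"] not in org_policy["phi_required_providers"]:
        reasons.append("provider_not_phi_approved")
    # 3. The proposed cost cap may not exceed the org maximum.
    if proposed["cost_cap"] > org_policy["max_cost_cap"]:
        reasons.append("cost_cap_exceeds_org_max")
    # 4. Mandatory HITL for high-risk skills cannot be switched off.
    if (org_policy["hitl_mandatory_for_high_risk"]
            and skill["risk_tier"] == "high" and not proposed["requires_hitl"]):
        reasons.append("hitl_required_for_high_risk")
    # 5. Proposed tools must be a subset of the org allowlist.
    if not set(proposed["allowed_tools"]) <= set(org_policy["allowed_tools"]):
        reasons.append("tool_not_allowed")
    # 6. The caller needs write permission for the target scope.
    if f"bindings:write:{proposed['scope']}" not in user_permissions:
        reasons.append("missing_permission")

    if reasons:
        return {"allow": False, "code": "binding_override_denied",
                "message": "override would weaken org policy",
                "troubleshooting": reasons}
    return {"allow": True}


org_policy = {"residency": "eu_only", "phi_required_providers": ["gemini"],
              "max_cost_cap": 5.00, "hitl_mandatory_for_high_risk": True,
              "allowed_tools": ["crm_lookup", "web_fetch", "write_notes"]}
proposed = {"scope": "user", "provider": "claude-max", "cost_cap": 10.00,
            "requires_hitl": False, "residency": "eu_only",
            "allowed_tools": ["crm_lookup"]}
skill = {"risk_tier": "high", "phi_tagged": True}

verdict = check_override(org_policy, proposed, skill, {"bindings:write:user"})
print(verdict["allow"], verdict["troubleshooting"])
```

With the three overrides from the example (provider, cost cap, HITL), the verdict is a denial listing one reason per violated rule, mirroring the `{ code, message, troubleshooting[] }` shape of the deny path.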
E7. C2PA content provenance on every artefact
Artefact bytes
        │
        ▼
C2PA manifest v2 attached (or sidecar):
    claim_generator: adverant-nexus/4.0.0
    actions: [
      { action: c2pa.created,  software_agent: "skill ros.refactor v3.2.1" },
      { action: c2pa.edited,   software_agent: "gemini-2.5-pro" },
      { action: c2pa.reviewed, software_agent: "human:alice@acme" }
    ]
    ingredients: [ input1.pdf hash, tool-output1 hash, ... ]
    run_id: r-789
    replay_manifest_id: rm-abc
    signature: sigstore / tenant-key
4.F Deployment Profiles
F1. Public cloud multi-tenant
Internet ──▶ Cloudflare ──▶ Istio Ingress ──▶ Services (shared)
             WAF            mTLS                 │
             DDoS           AuthZ                ├──▶ shared Postgres (RLS)
                                                 ├──▶ shared Redis (per-org channels)
                                                 ├──▶ shared Qdrant (per-tenant namespace)
                                                 ├──▶ shared Neo4j (per-org labels)
                                                 └──▶ shared AI Router (per-org keys)
ALL orgs isolated at DB/WS/auth layer; no hard VM isolation.
F2. Single-tenant VDS
Customer VPC ──▶ Single-tenant nexus stack on their VDS
                 Same images, same manifests
                 One org_id value, simplified RLS
                 Keys live only in this VDS's nexus-auth
                 Can still talk to public providers OR use customer's own endpoints
F3. On-premise Kubernetes
Customer data centre ──▶ K8s cluster runs nexus stack
                         Images mirrored from Adverant registry
                         May use customer's own LLM endpoints (Azure OpenAI, internal Llama)
                         SSO → customer IdP
                         Backups on-prem
                         Policy bundle from customer's OPA repo
F4. Airgapped sealed bundle
Isolated network ──▶ K8s offline
                     Offline image registry pre-loaded
                     Pinned model weights on local GPU
                     No outbound connectivity at all
                     nexus-auth TPM-backed
                     Monthly delta bundles via USB
                     A2A restricted to local peers
                     FedRAMP/DoD/classified use cases
9. Fifty Use Cases
Each use case specifies trigger / tier / hooks / compliance / outcome. Seven cases exercise the Bindings primitive (7.15) explicitly.
Tier 1: llm_only
- Regulatory PDF → one-page brief. Trigger: user-initiated on regulatory-docs plugin. Tier 1. Hooks: PreDispatch residency=eu, PostLLMCall output-watermark. Compliance: EU AI Act limited, GDPR eu_only. Outcome: C2PA-signed JSON summary.
- Translate support ticket into customer locale. Trigger: ticket-create webhook. Tier 1, role=fast. Hooks: FinOps "high-volume", cache-read. Compliance: GDPR. Outcome: translated string, 80% cache hit cost saving.
- Button binding: "Classify document". Trigger: user clicks "Classify" on the docs plugin toolbar. The `doc.classify.v1` binding pins Tier 1, model=haiku, cost_cap=$0.002, output=side-panel. Hooks: PreDispatch budget. Compliance: EU AI Act minimal. Outcome: classification ("contract / invoice / resume") in the side panel.
Tier 2: tool_using
- Research a competitor and produce a comparison table. Trigger: marketing analyst. Tier 2, tools: web_search, web_fetch. Hooks: PreToolUse blocks internal-domain fetch. Compliance: GDPR, export controls. Outcome: markdown table, C2PA-signed.
- Refactor a TypeScript file to pass the type-checker. Trigger: engineer. Tier 2, tools: read_file, write_file, run_tsc. Hooks: MITRE ATLAS mitigations active, PostLLMCall PII scan. Compliance: SOC 2. Outcome: modified file, green tsc.
- Triage a GitHub issue and propose labels plus assignee. Trigger: issue webhook. Tier 2, tools: gh MCP server. Hooks: require_human_confirm before label change. Compliance: SOC 2. Outcome: suggested labels pending human approval.
- Pull last-week AWS cost anomalies and explain them. Trigger: weekly cron. Tier 2, tools: aws-cost-explorer MCP. Hooks: cost-capped. Compliance: FinOps. Outcome: narrative report.
- Binding-driven k8s triage button. Trigger: context-menu on a pod row in ops plugin. Binding `ops.k8s_triage` forces namespace-allowlist hook, blocks prod by default. Inputs mapping pulls pod_name, namespace, cluster. Tier 2, tools: kubectl MCP. Hooks: PreToolUse denies prod unless HITL. Compliance: SOC 2. Outcome: diagnostic plus safe remediation proposal.
Tier 3: chain
- Full novel draft (outline → chapters → compile). Trigger: prose plugin user. Tier 3, 5-step DAG with parallel chapter generation. Hooks: PreChainStep cost cap per step. Compliance: C2PA on final PDF. Outcome: signed novel draft PDF, resumable after worker crash.
- End-to-end security audit report. Trigger: security lead. Tier 3, steps: inventory → SAST → DAST → LLM analysis → report. Hooks: per-step risk-tier gate, OnHITLPause for high-risk findings. Compliance: SOC 2, ISO 27001. Outcome: signed audit report plus evidence.
- Customer-onboarding workflow. Trigger: new customer. Tier 3, HITL waitpoint on KYC branch. Hooks: residency-gated storage. Compliance: GDPR, HIPAA if applicable. Outcome: onboarded customer with KYC evidence.
- A/B-tested binding. Trigger: marketing sends personalized emails. Binding `marketing.subject_line.v3` runs 50/50 A/B across two skill variants for 10 000 dispatches. Hooks: PostLLMCall open-rate tracker. Compliance: CAN-SPAM, GDPR. Outcome: Insights Agent auto-promotes winner; loser archived.
- Monthly compliance evidence roll-up. Trigger: cron first-of-month. Tier 3, span queries → control mapping → DOC/PDF export → GPG sign. Compliance: SOC 2, ISO 27001, ISO 42001. Outcome: signed evidence tarball.
Tier 4: autonomous
- Autonomous pentest of a staging environment. Trigger: security lead. Tier 4, sub-agents: recon, exploit, report. Hooks: cost cap $50, HITL before any exploit, OnCostThreshold page. Compliance: MITRE ATLAS, SOC 2. Outcome: signed pentest report with human approval trail.
- Long-horizon literature review. Trigger: researcher. Tier 4, Memory Bank across sessions. Hooks: OnHITLPause on replan. Compliance: export controls check per source. Outcome: annotated bibliography with replay manifest.
- Enterprise RFP response. Trigger: sales lead. Tier 4, 4 sub-agents (legal, pricing, technical, editor). Hooks: HITL before final send. Compliance: C2PA on final doc. Outcome: signed RFP response PDF.
- Self-healing production triage. Trigger: alert. Tier 4, sub-agents diagnose then propose. Hooks: nothing applied to prod without HITL plus change-window policy. Compliance: SOC 2, FedRAMP. Outcome: diagnostic plus approved remediation patch.
- Hypothesis generation plus evaluation. Trigger: scientist. Tier 4 compete pattern with cost cap. Hooks: top-k surfaces to HITL. Compliance: internal research standards. Outcome: ranked hypothesis list.
Skill Marketplace 2.0
- Internal team publishes a skill. Trigger: dev. Hooks: publish pipeline as in Figure B7. Compliance: SBOM, sigstore. Outcome: `ros.lead_score` v3.2.1 in private marketplace.
- Tenant pins skill version for compliance freeze. Trigger: SOX freeze. Hooks: skill_version_pin. Compliance: SOX, audit retention. Outcome: all bindings resolve to v2.3.1 for Q4.
- Auto-rollback on quality-score drop. Trigger: runtime telemetry. Hooks: quality_score_threshold. Compliance: ISO 42001 quality management. Outcome: binding auto-deactivated; previous version reactivated.
- Binding-scoped skill version pin for compliance freeze. Finance org pins `invoice.extract.v2` binding to `skill_version_pin: "2.3.1"` for Q4 SOX-freeze window; marketplace quality-score rolls continue but binding never auto-bumps. Hooks: quality_score_threshold still monitored but inactive. Compliance: SOX, audit retention. Outcome: binding-level freeze persists across three patch releases.
Hooks
- PII redaction on every PostLLMCall. Hook: PostLLMCall Presidio-style redactor. Compliance: GDPR, HIPAA. Outcome: outputs sanitized before PCC emission.
- Cost-threshold paging. Hook: OnCostThreshold → PagerDuty webhook. Compliance: FinOps. Outcome: on-call alerted on runaway costs.
- Tool-allowlist per role. Hook: PreToolUse role-based gate. Compliance: RBAC, NIST AI RMF. Outcome: junior roles cannot `write_file` outside `/tmp`.
- Input-classifier hook rejects jailbreak attempts. Hook: PreDispatch injection classifier. Compliance: OWASP LLM01. Outcome: adversarial example logged; dispatch rejected.
Memory Bank and crypto isolation
- Multi-turn research assistant recalls last-week context. Hook: PreDispatch Memory Bank decrypt (tenant KEK). Compliance: GDPR (within tenant scope). Outcome: continuity across sessions.
- Tenant requests proof of isolation. Hook: audit export lists KEK access log. Compliance: SOC 2 CC 6.1. Outcome: cryptographic evidence of non-leakage.
- User-scope binding overrides org default within policy. Power user promotes their preferred `report.generate.v4` binding to Tier 3 with model=claude-opus-4. Hook: OPA check confirms cost_cap ≤ org.max_cost_cap and residency compatible. Compliance: organizational policy enforced despite user customization. Outcome: user-scope binding accepted; org policy intact.
A2A plus MCP dual plane
- Nexus Tier-4 negotiates with a Gemini Enterprise agent via A2A. Hooks: A2A peer allowlist. Compliance: export controls on cross-border interaction. Outcome: cross-vendor workflow with signed handoff.
- Skill invokes Atlassian MCP to create a Jira ticket. Hook: PostToolUse audit. Compliance: SOC 2. Outcome: Jira ticket created with span link.
- Customer-hosted Bedrock AgentCore agent joins a Nexus workflow. Hook: A2A identity brokered by nexus-auth. Compliance: AWS BAA if HIPAA. Outcome: cross-platform delegation with KEK-scoped spans.
Structured output plus self-correction
- Every skill contract typed; invalid output auto-corrects. Hook: PostLLMCall validator with R=3 ceiling. Compliance: ISO 42001 reliability. Outcome: downstream consumers receive typed output or structured failure.
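The validate-and-retry loop behind this use case can be sketched as follows. This is an illustration under stated assumptions: "R=3" is read as at most three correction retries after the first attempt, the schema is reduced to a field-name-to-type map, and `llm` is a hypothetical callable taking a prompt and optional feedback; none of these names come from the Nexus codebase.

```python
import json


def validate(raw: str, required: dict):
    """Return (parsed, None) on success or (None, error_message) on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    for field, expected_type in required.items():
        if field not in data:
            return None, f"missing field: {field}"
        if not isinstance(data[field], expected_type):
            return None, f"wrong type for field: {field}"
    return data, None


def call_with_self_correction(llm, prompt: str, required: dict, max_retries: int = 3):
    """Call the model, feeding validation errors back until output is valid."""
    feedback = None
    errors = []
    for attempt in range(max_retries + 1):
        raw = llm(prompt, feedback)
        data, error = validate(raw, required)
        if error is None:
            return {"ok": True, "output": data, "attempts": attempt + 1}
        errors.append(error)
        feedback = f"Previous output was rejected: {error}. Return valid JSON only."
    # Ceiling reached: return a structured failure instead of untyped output.
    return {"ok": False, "errors": errors, "attempts": max_retries + 1}


def scripted_llm(replies):
    """Stand-in model that returns canned replies in order (for the demo only)."""
    it = iter(replies)
    return lambda prompt, feedback: next(it)


result = call_with_self_correction(
    scripted_llm(["not json", '{"label": "invoice"}']),
    "Classify this document.",
    {"label": str},
)
print(result)  # {'ok': True, 'output': {'label': 'invoice'}, 'attempts': 2}
```

Either branch returns a dictionary with an `ok` flag, so downstream consumers receive typed output or a structured failure and never a raw string.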
Optimizer-compiled prompts
- DSPy optimizer tunes the prose.outline skill against a metric. Hook: compiled artefact v+1 staged; auto-rollback on regression. Compliance: ISO 42001 continuous improvement. Outcome: improved prompt deployed without human edit.
- Binding quality auto-deactivation. `support.triage.v2` runtime quality score drops below threshold 0.75 over 500 dispatches; system auto-flips `is_active=false`, routes to fallback binding on same binding_key at lower priority, notifies publisher. Hook: quality_score_threshold. Compliance: ISO 42001. Outcome: graceful degradation without dispatch failures.
Observability: Insights plus Polly
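A minimal sketch of the auto-deactivation rule described above, assuming the quality score is a rolling mean over the last N dispatches and that the fallback is simply the next active binding by priority. `QualityMonitor` and `pick_active` are hypothetical names introduced for the example.

```python
from collections import deque


class QualityMonitor:
    """Rolling mean of per-dispatch quality scores for one binding."""

    def __init__(self, threshold: float = 0.75, window: int = 500):
        self.threshold = threshold
        self.scores = deque(maxlen=window)

    def record(self, score: float) -> bool:
        """Add one score; return True once the full window averages below threshold."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and sum(self.scores) / len(self.scores) < self.threshold


def pick_active(bindings):
    """Highest-priority active binding for one binding_key, or None."""
    active = [b for b in bindings if b["is_active"]]
    return max(active, key=lambda b: b["priority"], default=None)


bindings = [
    {"name": "support.triage.v2", "priority": 300, "is_active": True},
    {"name": "support.triage.v1", "priority": 100, "is_active": True},
]
monitor = QualityMonitor(threshold=0.75, window=500)

for _ in range(500):
    if monitor.record(0.70):
        bindings[0]["is_active"] = False  # auto-flip is_active=false

print(pick_active(bindings)["name"])  # support.triage.v1
```

Waiting for a full window before deactivating avoids flipping a binding on a handful of early low scores; a production version would also need hysteresis for reactivation, which is outside this sketch.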
- Insights Agent auto-detects latency regression and files a ticket. Hook: Insights Agent anomaly detect. Compliance: SOC 2 CC 7.3. Outcome: Jira ticket with span evidence.
- NL debug: "why was last night's chain expensive?" Hook: Polly-NL query. Compliance: internal ops. Outcome: span narrative with cost hotspot identified.
FinOps
- Per-skill budget cap prevents runaway chain. Hook: PreDispatch budget check. Compliance: FinOps, CFO controls. Outcome: dispatch refused with remedy JSON.
- Circuit breaker on failing provider. Hook: OnCostThreshold plus provider failure threshold. Compliance: reliability targets. Outcome: breaker opens for 15 min, routes to alternate.
- Token-budget enforcement in Tier 4. Hook: OnCostThreshold. Compliance: FinOps. Outcome: sub-agent pauses for HITL before proceeding.
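The pre-reserve pattern behind these FinOps cases can be sketched with an in-memory ledger. This stands in for the Redis atomic counters named in Phase 17; the class name, the cent-based accounting, and the shape of the remedy payload are assumptions for illustration.

```python
import threading


class BudgetLedger:
    """In-memory stand-in for a shared atomic budget counter."""

    def __init__(self, daily_budget_usd: float):
        self.limit_cents = round(daily_budget_usd * 100)
        self.reserved_cents = 0
        self._lock = threading.Lock()

    def reserve(self, estimate_usd: float) -> dict:
        """Reserve the estimated cost before dispatch, or refuse with a remedy."""
        cents = round(estimate_usd * 100)
        with self._lock:
            if self.reserved_cents + cents > self.limit_cents:
                headroom = (self.limit_cents - self.reserved_cents) / 100
                return {"allowed": False,
                        "remedy": {"code": "budget_exhausted",
                                   "headroom_usd": headroom,
                                   "requested_usd": estimate_usd}}
            self.reserved_cents += cents
            return {"allowed": True, "reserved_usd": cents / 100}

    def settle(self, reserved_usd: float, actual_usd: float) -> None:
        """Replace a reservation with the run's actual cost once it is known."""
        with self._lock:
            self.reserved_cents += round(actual_usd * 100) - round(reserved_usd * 100)


ledger = BudgetLedger(daily_budget_usd=500)
first = ledger.reserve(499.50)
second = ledger.reserve(1.00)
print(first["allowed"], second["allowed"])  # True False
print(second["remedy"]["headroom_usd"])     # 0.5

ledger.settle(reserved_usd=499.50, actual_usd=100.00)
print(ledger.reserve(1.00)["allowed"])      # True
```

Reserving the estimate up front is what lets a dispatch be refused before any tokens are spent; settling afterwards returns unused headroom to the budget.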
Deterministic replay plus chain-of-custody
- Regulator asks to show how output X was produced. Hook: replay API. Compliance: EU AI Act Art. 12, SOC 2 audit. Outcome: bit-for-bit replay plus C2PA manifest plus span plus policy artefacts.
- Skill-writer debugs a production failure offline. Hook: replay with pinned model stub. Compliance: internal. Outcome: deterministic local reproduction.
Airgapped
- DoD customer installs from sealed USB. Hook: airgap install pipeline. Compliance: FedRAMP High, DoD IL5. Outcome: fully offline Nexus cluster.
- Monthly delta bundle updates skills and models. Hook: airgap update. Compliance: FedRAMP continuous monitoring. Outcome: in-place update with no outbound calls.
- Airgapped A2A restricted to local peers. Hook: A2A discovery SPIFFE filter. Compliance: DoD IL5 isolation. Outcome: zero external agent reachability.
CLI 2.0
- CI pipeline dispatches skills from GitHub Actions. Hook: `nexus dispatch --tail`. Compliance: SOC 2 CI integration. Outcome: skill runs log-integrated.
- DevOps exports SOC 2 package from CLI in 30 seconds. Hook: `nexus governance export --framework soc2`. Compliance: SOC 2 evidence. Outcome: signed tarball.
- Marketplace plugin ships with declared binding actions. `nexus.manifest.json` declares 12 `actions[]` with full metadata; `nexus marketplace install leads` seeds 12 SYSTEM bindings on install, removes them on uninstall, upgrades them on plugin version bump with diff review. Hooks: binding override OPA policy. Compliance: audit trail. Outcome: lifecycle-managed bindings without code edits.
Governance plus compliance
- EU customer enables EU AI Act strict mode: all high-risk skills require HITL. Hook: risk-tier gate. Compliance: EU AI Act Art. 14. Outcome: high-risk dispatches queued to HITL inbox.
- HIPAA-covered org enforces "no provider without BAA". Hook: PreDispatch PHI-tag check. Compliance: HIPAA. Outcome: OpenRouter refused for PHI-tagged skills; policy-violation span emitted.
10. Migration Path (Phases 10–27)
UNO established Phases 1–9 [42]. v4.0 adds Phases 10 through 27.
| Phase | Name | Scope |
|---|---|---|
| 10 | Tier 4 state machine | Concrete autonomous engine in orchestrator; orchestrator.autonomous_runs table; HITL waitpoints. |
| 11 | Persistent chain state | Migrate orchestrator.chain_runs and chain_steps from Redis TTL to Postgres primary; checkpoint after each step. |
| 12 | Hooks framework | Generic hook dispatcher in orchestrator; hook-manifest YAML; OPA policy integration. |
| 13 | Memory Bank | memory.bank tables; envelope encryption; KMS integration. |
| 14 | Skill Marketplace 2.0 | SKILL.md v2 schema; ros.skill_versions; sigstore integration; SBOM pipeline; adversarial-eval harness; quality-score updater. |
| 15 | Structured-output plus self-correction | Wrap AI Provider Router calls with schema validator; retry harness. |
| 16 | DSPy optimizer pipeline | Offline compile job; variant staging; quality-score rollback gate. |
| 17 | FinOps pre-reserve | Redis atomic counters; per-skill, per-org budgets; circuit breakers. |
| 18 | Deterministic replay plus C2PA | Replay manifest writer; artefact signer; replay worker. |
| 19 | Insights Agent plus Polly-NL | Span clustering service; NL query layer; ClickHouse cold tier. |
| 20 | A2A dual plane | A2A server plus client; peer discovery via nexus-auth SPIFFE. |
| 21 | CLI 2.0 surface | Commands: dispatch, runs, chain visualize, skill publish, airgap, governance, finops, hooks, memory, a2a, debug. |
| 22 | Governance primitives | Compliance frameworks enum; per-framework evidence collectors; auditor-export assembly; OPA policy bundle distribution. |
| 23 | UI/UX Bindings | Extend ros.skill_bindings to v4.0 schema; Binding Editor UI; nexus.manifest.json actions[]; override OPA policy. |
| 24 | Airgapped bundle mode | Bundle builder; offline installer; TPM integration; delta-bundle flow. |
| 25 | Per-queue pod deployments | Split nexus-workflows into per-tier / per-queue Deployments with resource profiles. |
| 26 | Multi-provider routing (finish Phase 7) | AIProviderConfig { providers[], routingPolicy }; adapter selection reads routingPolicy[hint]; failover. |
| 27 | Governance bypass closure | Remove nexus-mageagent from ALLOWED_AI_CALLERS; delete the service; resolve Section 12.3 vs Section 14 contradiction in UNO paper. |
Each phase ships behind a feature flag and is validated by the production scorecard before the next begins.
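The sequencing rule (a phase is enabled only once every earlier phase has been validated) can be sketched as follows. The `validated` mapping and the function name are assumptions made for illustration; the paper does not specify how scorecard results are stored.

```python
from typing import Dict, Optional

FIRST_PHASE, LAST_PHASE = 10, 27  # phase numbers from the migration table


def next_phase_to_enable(validated: Dict[int, bool]) -> Optional[int]:
    """Return the lowest phase not yet validated; later phases stay flagged off."""
    for phase in range(FIRST_PHASE, LAST_PHASE + 1):
        if not validated.get(phase, False):
            return phase
    return None  # every phase has passed the scorecard
```

A phase validated out of order (for example 12 before 11) does not unlock anything: the function still returns 11.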
11. Deployment Profiles
The same codebase serves four profiles; manifests differ by values, not code.
Public cloud multi-tenant. Shared Postgres with RLS, shared Redis with per-org channels, shared Qdrant with per-tenant namespaces, shared Neo4j with per-org labels, shared AI Router with per-org keys. Authentication through the Adverant IdP; tenancy through logical scopes plus cryptographic envelopes.
Single-tenant VDS. Same stack on customer VDS. One org_id value simplifies RLS. Keys isolated to the VDS's nexus-auth. Can still use public providers or customer endpoints.
On-premise Kubernetes. Customer-owned K8s cluster. Images mirrored from Adverant registry. May use customer's LLM endpoints (Azure OpenAI, internal Llama deployments). SSO into customer IdP. Policy bundle from customer OPA repo.
Airgapped sealed bundle. Isolated network. Pre-loaded offline registry. Pinned model weights on local GPUs. nexus-auth TPM-backed. Monthly delta bundles. A2A restricted to local peers. FedRAMP High, DoD IL5, CJIS, IRS Pub 1075 use cases.
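"Manifests differ by values, not code" can be pictured as one base value set with per-profile overrides. The keys and values below are hypothetical placeholders chosen to echo the four profile descriptions; they are not the actual Nexus chart values.

```python
from typing import Dict

# Hypothetical base values shared by every profile.
BASE_VALUES: Dict[str, str] = {
    "postgres": "shared-rls",
    "auth": "adverant-idp",
    "model_source": "public-providers",
    "a2a_peers": "any",
}

# Per-profile overrides; anything not listed falls back to the base.
PROFILE_OVERRIDES: Dict[str, Dict[str, str]] = {
    "public-cloud": {},
    "single-tenant-vds": {"postgres": "dedicated", "auth": "vds-nexus-auth"},
    "on-prem-k8s": {"auth": "customer-idp", "model_source": "customer-endpoints"},
    "airgapped": {"auth": "tpm-backed", "model_source": "local-weights",
                  "a2a_peers": "local-only"},
}


def render_values(profile: str) -> Dict[str, str]:
    """Merge the base values with one profile's overrides (base is not mutated)."""
    return {**BASE_VALUES, **PROFILE_OVERRIDES[profile]}
```

For example, `render_values("airgapped")["a2a_peers"]` yields `"local-only"`, while `render_values("public-cloud")` is identical to the base.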
12. Evaluation Methodology
We propose six evaluation axes; execution of these benchmarks is deferred to follow-up work after Phases 10–18 ship.
- Token efficiency per task. Compare v4.0 Tier 2 dispatch against CrewAI, LangGraph, and OpenAI Agents SDK on a fixed task battery (refactor a repository, triage an issue, generate a report). Metrics: tokens-in, tokens-out, cost-per-task.
- Dispatch latency. P50, P95, P99 of /api/v1/dispatch response time. Compare against Temporal, Airflow, and BullMQ direct.
- Multi-agent cost. Tier 4 compete-pattern vs self-consistent vs best-of-N on MMLU-style benchmarks; measure quality gain per dollar.
- Provable tenant-isolation boundaries. Red-team attempts to exfiltrate tenant A data via the observability backend while executing a workload for tenant B. Success criterion: zero exfiltration.
- Replay fidelity. Given a run manifest, reconstruct the run bit-for-bit; measure hash equality of every span and every artefact.
- Airgapped feature parity. Of the 50 use cases, how many run unmodified in airgapped mode? Target: at least 47 (three are A2A cross-cluster use cases that are restricted by design).
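The dispatch-latency axis reports P50, P95, and P99. The paper does not name a percentile method, so the sketch below assumes the nearest-rank definition and latencies recorded in milliseconds; `percentile` is an illustrative helper, not part of any Nexus tooling.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))  # synthetic data: 1 ms .. 100 ms
report = {f"P{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
```

Nearest-rank always returns an observed sample, which keeps tail figures such as P99 interpretable when the benchmark run is small.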
13. Related Work
Beyond the twelve-framework survey in Sections 2 and 3, v4.0 draws on several adjacent research streams.
Durable execution and workflow orchestration. Temporal [7] and Netflix Maestro [9] established durable-execution patterns that informed v4.0's Tier 3 persistent state. Our distinction: Temporal couples workflow logic to the worker execution environment, while v4.0 maintains the dispatch-execution separation established in UNO [42].
LLM serving. PagedAttention (vLLM) [66], Orca [67], SGLang [68], and Sarathi-Serve [69] optimize the token-processing layer below our AI Provider Router. v4.0 is agnostic to the serving layer; organizations may deploy vLLM alongside managed providers.
LLM routing. FrugalGPT [70], RouteLLM [71], and the Dekoninck et al. unified routing-cascading framework [72] inform v4.0's role-based routing and the DSPy optimizer pipeline. Where these papers focus on cost-performance tradeoffs at inference time, v4.0 adds skill-level routing hints stored in the registry and runtime governance constraints.
Agent architectures. ReAct [73], Toolformer [74], Reflexion [75], and the Voyager [76] open-ended learning agent informed v4.0 Tier 2 and Tier 4 design. CAMEL [77] and AutoGen [78] established multi-agent conversation patterns that v4.0 treats as Tier 4 special cases rather than primary modes.
Service mesh security. Istio [79] and the SPIFFE identity framework [80] underpin the three-gate enforcement in v4.0. Envelope encryption and per-tenant KEKs follow NIST SP 800-57 [81] key-hierarchy guidance.
AI governance and compliance. The EU AI Act [50], NIST AI Risk Management Framework [59], ISO/IEC 42001 [55], and OWASP LLM Top 10 (2025) [61] directly inform Section 7.14. C2PA [49] provides the content-provenance substrate. MITRE ATLAS [62] and MITRE ATT&CK serve as threat-model references.
14. Conclusion
Adverant Nexus Stack v4.0 is an incremental architectural evolution, not a clean-sheet rewrite. It preserves the dispatch-execution separation that UNO [42] established as the load-bearing discipline of the platform, while adding the primitives that the 2026 agentic framework landscape has collectively identified as necessary and that no single framework ships turnkey: cryptographically isolated memory, signed and measured skill artefacts, first-class hooks, cost governance, deterministic replay, airgapped deployment, user-configurable bindings, and native compliance integration across thirteen regulatory regimes. The eighteen-phase migration (Phases 10–27) is sequenced so that each phase ships behind a feature flag and is validated before the next begins. The paper's claims and limitations are validated at three Gemini 2.5 Pro gates archived alongside the paper. Follow-up work will execute the evaluation methodology in Section 12 and report quantitative results.
15. Appendices
Appendix A: SKILL.md v2 Schema (excerpt)

    ---
    name: string                    # unique id, kebab-case
    version: semver                 # "3.2.1"
    description: string             # one-liner
    category: enum                  # scoring|profiling|...|compliance
    risk_tier: enum                 # minimal|limited|high|unacceptable (EU AI Act)
    execution:
      tier: int                     # 1|2|3|4
      max_iterations: int
      chain_steps?: [...]
      response_format: json|text
    inputs:
      schema: { ... JSON Schema ... }
    outputs:
      schema: { ... JSON Schema ... }
    governance:
      data_residency: eu_only|us_only|any|region-tag
      export_tags: [ EAR | ITAR | dual-use ]
      compliance_frameworks: [ soc2 | eu-ai-act | iso42001 | hipaa | fedramp | nist-airmf ]
      phi_tagged: bool
    hooks:
      - event: PreToolUse
        action: deny|rewrite|require_hitl|policy_ref
        policy_ref?: opa/...
    allowed_tools: [ ... ]
    denied_tools: [ ... ]
    sbom_ref: string                # path or URL to SBOM
    signature:
      type: sigstore
      payload: base64
      identity: string
    quality_score_threshold: float  # auto-deactivate threshold
    metadata:
      mitre_atlas: [ AML.T... ]
      owasp_llm: [ LLM01|LLM02|... ]
    ---
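A few of the constraints in this schema (kebab-case name, semver version, the EU AI Act risk tiers, execution tier 1 to 4) can be checked mechanically. The sketch below assumes the front matter has already been parsed into a dictionary; `validate_skill_front_matter` is a hypothetical helper, and the semver pattern is deliberately simplified to MAJOR.MINOR.PATCH.

```python
import re
from typing import Any, Dict, List

RISK_TIERS = {"minimal", "limited", "high", "unacceptable"}
KEBAB_CASE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
SIMPLE_SEMVER = re.compile(r"\d+\.\d+\.\d+")  # no pre-release or build metadata


def validate_skill_front_matter(fm: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable errors; an empty list means the checks passed."""
    errors: List[str] = []
    if not KEBAB_CASE.fullmatch(str(fm.get("name", ""))):
        errors.append("name must be kebab-case")
    if not SIMPLE_SEMVER.fullmatch(str(fm.get("version", ""))):
        errors.append("version must look like MAJOR.MINOR.PATCH")
    if fm.get("risk_tier") not in RISK_TIERS:
        errors.append("risk_tier must be one of " + ", ".join(sorted(RISK_TIERS)))
    if fm.get("execution", {}).get("tier") not in (1, 2, 3, 4):
        errors.append("execution.tier must be 1, 2, 3, or 4")
    return errors
```

Returning every error at once, instead of raising on the first, suits a publish pipeline where an author wants the full list of problems in one pass.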
Appendix B: Span-Tree v2 Schema (excerpt)

    CREATE TABLE orchestrator.execution_spans (
        span_id UUID PRIMARY KEY,
        parent_span_id UUID,
        job_id UUID NOT NULL,
        type span_type NOT NULL,            -- closed 12-type enum (UNO) + 8 new types
        started_at TIMESTAMPTZ NOT NULL,
        ended_at TIMESTAMPTZ,
        duration_ms INTEGER GENERATED ALWAYS AS (...) STORED,
        payload_cipher BYTEA,               -- envelope-encrypted
        payload_dek_wrapped BYTEA,          -- wrapped DEK
        signature BYTEA,                    -- hash-chain signature
        prev_span_hash BYTEA,               -- prev span's hash (for chain)
        span_hash BYTEA,                    -- this span's hash
        org_id UUID NOT NULL,
        ...
    ) PARTITION BY RANGE (started_at);
    -- 12-type v3 enum preserved. v4.0 adds: hook_invocation, policy_eval, binding_resolve,
    -- hitl_waitpoint, quality_eval, optimizer_compile, a2a_message, c2pa_sign.
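The `prev_span_hash` and `span_hash` columns imply a hash chain over spans. The schema excerpt does not state the hashing rule, so the sketch below assumes `span_hash = SHA-256(prev_span_hash || payload_cipher)` and an empty genesis value; both are assumptions, as are the helper names.

```python
import hashlib
from typing import Dict, List

GENESIS = b""  # assumed prev_span_hash for the first span in a run


def compute_span_hash(prev_hash: bytes, payload_cipher: bytes) -> bytes:
    """Assumed rule: hash the previous span's hash followed by the ciphertext."""
    return hashlib.sha256(prev_hash + payload_cipher).digest()


def append_span(chain: List[Dict[str, bytes]], payload_cipher: bytes) -> None:
    """Append a span whose hash commits to everything before it."""
    prev = chain[-1]["span_hash"] if chain else GENESIS
    chain.append({
        "payload_cipher": payload_cipher,
        "prev_span_hash": prev,
        "span_hash": compute_span_hash(prev, payload_cipher),
    })


def verify_chain(chain: List[Dict[str, bytes]]) -> bool:
    """Recompute every link; any edited payload or reordered span breaks the chain."""
    prev = GENESIS
    for span in chain:
        if span["prev_span_hash"] != prev:
            return False
        if span["span_hash"] != compute_span_hash(prev, span["payload_cipher"]):
            return False
        prev = span["span_hash"]
    return True
```

Under these assumptions, a replay-fidelity check reduces to comparing the final `span_hash` of the original and replayed runs, since each hash commits to the whole prefix.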
Appendix C: CLI 2.0 Command Reference (abbreviated)
nexus login [--org <slug>]
nexus dispatch <job_type> [--input @file] [--tier <n>] [--provider <p>] [--model <m>]
[--cost-cap <$>] [--risk <r>] [--tail] [--json]
nexus runs list|show|replay|tail|export
nexus chain visualize <run_id>
nexus skill publish|versions|rollback|install|uninstall
nexus airgap bundle|install|update|verify
nexus governance export|policies list|apply
nexus hooks list|apply|remove
nexus finops budgets|burn-rate|reserve|debit
nexus memory snapshot|gc|export
nexus a2a peers list|call|serve
nexus binding list|get|set|resolve|diff
nexus insights cost-hotspots|latency-regressions|anomalies
nexus debug nl "<question>"
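To make the `nexus dispatch` flag surface concrete, the sketch below parses the flags listed above with Python's standard `argparse`. The paper does not say what language the CLI is written in, so this is only an illustration of the argument shapes; the tier choices come from the four-tier model, and the types given to `--cost-cap` and `--tier` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser for the `dispatch` subcommand only."""
    parser = argparse.ArgumentParser(prog="nexus")
    sub = parser.add_subparsers(dest="command", required=True)

    dispatch = sub.add_parser("dispatch")
    dispatch.add_argument("job_type")
    dispatch.add_argument("--input")                      # e.g. @file
    dispatch.add_argument("--tier", type=int, choices=[1, 2, 3, 4])
    dispatch.add_argument("--provider")
    dispatch.add_argument("--model")
    dispatch.add_argument("--cost-cap", type=float)       # assumed: US dollars
    dispatch.add_argument("--risk")
    dispatch.add_argument("--tail", action="store_true")
    dispatch.add_argument("--json", action="store_true")
    return parser


args = build_parser().parse_args(
    ["dispatch", "code_edit", "--tier", "2", "--cost-cap", "0.50", "--json"]
)
```

Here `code_edit` is a made-up job type; after parsing, `args.tier` is the integer 2 and `args.cost_cap` is the float 0.5.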
Appendix D: Hook Specification (full)

    event: PreDispatch | PostSkillResolve | PreTierSelect | PreLLMCall | PostLLMCall
         | PreToolUse | PostToolUse | PreChainStep | PostChainStep
         | OnCostThreshold | OnIterationLimit | OnTierEscalation
         | OnHITLPause | OnHITLResume | PostDispatch
    scope: org | skill | plugin | binding
    target: <skill_id or plugin_slug or binding_key or "*">
    matcher: <CEL expression>      # e.g., tool.name == 'write_file' && path.startsWith('/etc')
    action: deny | rewrite | require_hitl | emit_event | call_webhook | policy_ref
    args:
      webhook?: { url, headers, payload_template }
      policy_ref?: opa/path/to/rule.rego
      rewrite?: <jsonpath rewrites>
    priority: 0-1000
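A dispatcher over this specification could be sketched as below. Two things are assumptions rather than statements from the specification: that the hook with the highest `priority` wins when several match, and that no match means the call proceeds. The CEL matcher is replaced by a plain Python callable to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hook:
    event: str
    matcher: Callable[[Dict[str, str]], bool]  # stand-in for a CEL expression
    action: str
    priority: int = 0  # 0-1000 per the specification


def dispatch_hooks(hooks: List[Hook], event: str, ctx: Dict[str, str]) -> str:
    """Return the action of the highest-priority matching hook, or 'allow'."""
    matching = [h for h in hooks if h.event == event and h.matcher(ctx)]
    if not matching:
        return "allow"  # assumed default when nothing matches
    return max(matching, key=lambda h: h.priority).action


hooks = [
    # Mirrors the matcher example: tool.name == 'write_file' && path.startsWith('/etc')
    Hook("PreToolUse",
         lambda c: c["tool"] == "write_file" and c["path"].startswith("/etc"),
         "deny", priority=900),
    Hook("PreToolUse", lambda c: c["tool"] == "write_file",
         "require_hitl", priority=100),
]
```

With these two hooks, a write to `/etc/passwd` is denied, while a write elsewhere falls through to the lower-priority human-in-the-loop requirement.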
Appendix E: A2A Message Format

    {
      "a2a_version": "1.0",
      "message_id": "uuid",
      "from": "spiffe://adverant/acme/agent/researcher",
      "to": "spiffe://other-org/policy/agent/legal-review",
      "capability": "legal.review",
      "payload_schema": "https://.../schema.json",
      "payload": { ... },
      "signature": "sigstore:...",
      "run_context": { "run_id": "uuid", "parent_span_id": "uuid" },
      "compliance": { "residency": "eu_only", "export_tags": [] }
    }
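A receiver would typically reject malformed envelopes before verifying the signature. The sketch below checks only structure: that the top-level fields shown above are present and that `from` and `to` are SPIFFE URIs. Treating every field as required is an assumption, and `validate_a2a_message` is a hypothetical helper.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse

# Assumed required: every top-level field shown in the message format.
REQUIRED_FIELDS = {
    "a2a_version", "message_id", "from", "to", "capability",
    "payload_schema", "payload", "signature", "run_context", "compliance",
}


def validate_a2a_message(msg: Dict[str, Any]) -> List[str]:
    """Return structural errors; signature verification is out of scope here."""
    errors = [f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - set(msg))]
    for key in ("from", "to"):
        parsed = urlparse(str(msg.get(key, "")))
        # A SPIFFE ID has the form spiffe://<trust-domain>/<path>.
        if parsed.scheme != "spiffe" or not parsed.netloc or not parsed.path.strip("/"):
            errors.append(f"{key} must be a SPIFFE ID")
    return errors
```

Collecting all errors in one pass makes the resulting rejection span more useful to the sending peer than a first-failure message would be.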
Appendix F: Airgapped Bundle Manifest

    version: 4.0.0
    generated_at: 2026-04-24T00:00:00Z
    signature:
      gpg: <ascii-armored>
      sigstore: <sigstore-bundle>
    images:
      - name: adverant/nexus-orchestrator
        digest: sha256:...
      - name: adverant/nexus-gateway
        digest: sha256:...
    skills:
      - skill_id: ros.code_edit
        version: 3.2.1
        signature: ...
    models:
      - name: gemini-2.5-pro
        weights_digest: sha256:...
    policies:
      - path: opa/...
        version: 4.0.0
    manifests:
      - path: k8s/...
    migrations:
      - path: db/migrations/...
    licenses:
      - component: ...
        license: ...
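During an offline install, each artefact can be checked against the `sha256:` digest recorded in the manifest. This is a minimal sketch of that single check, assuming the artefact bytes are already in memory; `verify_digest` is an illustrative name, and verifying the manifest's own GPG or sigstore signature is a separate step not shown.

```python
import hashlib
import hmac


def verify_digest(blob: bytes, expected: str) -> bool:
    """Compare a blob against a manifest digest of the form 'sha256:<hex>'."""
    algorithm, _, hex_digest = expected.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual = hashlib.sha256(blob).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(actual, hex_digest.lower())


layer = b"example image layer"
manifest_digest = "sha256:" + hashlib.sha256(layer).hexdigest()
```

With the values above, `verify_digest(layer, manifest_digest)` succeeds, and any single-byte change to `layer` makes it fail.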
Appendix G: Compliance-Control Traceability Matrix (excerpt)
| Framework | Control ID | v4.0 Primitive | Evidence Source |
|---|---|---|---|
| EU AI Act | Art. 12 (logging) | Span tree (§8.B-B13) | orchestrator.execution_spans |
| EU AI Act | Art. 13 (transparency) | C2PA manifest (§7.11) + model card | artefact.c2pa_manifest |
| EU AI Act | Art. 14 (human oversight) | Tier 4 HITL waitpoint (§7.2) | hitl_decisions |
| EU AI Act | Art. 15 (accuracy/robustness) | DSPy metrics (§7.8) + adversarial eval (§7.3) | skill_versions.adv_eval_report |
| EU AI Act | Art. 26 (deployer) | Per-org governance doc | org.governance_doc |
| GDPR | Art. 17 (erasure) | Atomic erasure chain (§7.14) | erasure_certificates |
| GDPR | Art. 22 (auto decisions) | Tier 4 HITL (§7.2) | hitl_decisions |
| GDPR | Art. 32 (security) | Envelope encryption (§7.5) | memory_bank.kek_access_log |
| GDPR | Art. 35 (DPIA) | Auto-generated DPIA | skill_versions.dpia |
| SOC 2 | CC6.1 (logical access) | Three-gate enforcement (§8.E-E5) | Istio/service-key/OPA logs |
| SOC 2 | CC6.6 (boundary) | RLS + middleware (§4.5) | tenant isolation tests |
| SOC 2 | CC7.2 (monitoring) | Insights Agent (§7.9) | anomaly alerts |
| SOC 2 | CC7.3 (analysis) | Span tree queries | investigation artefacts |
| ISO 27001 | A.8.16 (monitoring) | Span tree + Insights Agent | OT metrics |
| ISO 27001 | A.8.10 (info deletion) | Crypto-erasure (§7.5) | erasure certificates |
| ISO 42001 | AI impact assessment | SKILL.md v2 risk section | skill_versions.ia_report |
| ISO 42001 | AI system lifecycle | Skill Marketplace 2.0 (§7.3) | skill_versions lineage |
| HIPAA | §164.312(a)(1) (access ctl) | RLS + JWT + RBAC | authz logs |
| HIPAA | §164.312(c)(1) (integrity) | C2PA + span hash chain | provenance manifests |
| HIPAA | §164.308(a)(1) (risk analysis) | EU AI Act risk tier (reused) | registry |
| FedRAMP | AC-2 (account mgmt) | nexus-auth + SSO | nexus-auth audit |
| FedRAMP | AU-12 (audit gen) | Span tree | per-span audit |
| FedRAMP | SC-12 (crypto key mgmt) | Per-tenant KEK + rotation | KMS rotation log |
| FedRAMP | SC-13 (FIPS crypto) | FIPS 140-3 modules in airgap bundle | bundle manifest |
| NIST AI RMF | GOVERN | Per-org governance doc | policies |
| NIST AI RMF | MAP | Skill registry metadata | ros.skill_definitions |
| NIST AI RMF | MEASURE | Quality evals + span analytics | Insights output |
| NIST AI RMF | MANAGE | Hooks + FinOps + HITL | hook invocations |
(Full matrix of approximately 200 rows in the distribution package; excerpt above.)
Appendix H: OPA/Rego Policy Starter Pack (excerpt)

    package nexus.dispatch

    # `in` and `every` need these imports on OPA releases before 1.0.
    import future.keywords.every
    import future.keywords.in

    default allow = false

    allow {
        input.caller in ["nexus-orchestrator", "nexus-workflows", "chat-orchestrator"]
        residency_ok
        budget_ok
        risk_ok
        export_ok
    }

    residency_ok {
        org := data.orgs[input.org_id]
        skill := data.skills[input.skill_id]
        org.residency == "any"
    }

    residency_ok {
        org := data.orgs[input.org_id]
        skill := data.skills[input.skill_id]
        org.residency == skill.data_residency
    }

    budget_ok {
        reserved := data.finops.reserved[input.org_id]
        remaining := data.orgs[input.org_id].budget - reserved
        remaining >= input.cost_estimate
    }

    # Minimal and limited tiers pass without human oversight.
    risk_ok {
        skill := data.skills[input.skill_id]
        skill.risk_tier != "unacceptable"
        skill.risk_tier != "high"
    }

    # High-risk skills pass only when the run carries a HITL waitpoint.
    risk_ok {
        skill := data.skills[input.skill_id]
        skill.risk_tier == "high"
        input.run_has_hitl == true
    }

    export_ok {
        skill := data.skills[input.skill_id]
        org := data.orgs[input.org_id]
        every tag in skill.export_tags {
            tag in org.allowed_export_tags
        }
    }
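Two of the rules above, `budget_ok` and `export_ok`, reduce to simple arithmetic and set containment. The Python below restates them as a worked example with concrete numbers; it is a reading aid, not a replacement for OPA evaluation, and the sample amounts are invented.

```python
from decimal import Decimal
from typing import Iterable


def budget_ok(org_budget: Decimal, reserved: Decimal, cost_estimate: Decimal) -> bool:
    """Mirror of budget_ok: the unreserved budget must cover the estimate."""
    return org_budget - reserved >= cost_estimate


def export_ok(skill_tags: Iterable[str], allowed_tags: Iterable[str]) -> bool:
    """Mirror of export_ok: every export tag on the skill must be allowed for the org."""
    return set(skill_tags) <= set(allowed_tags)


# Invented sample: 100.00 budget, 99.50 already reserved, so 0.50 remains.
fits = budget_ok(Decimal("100.00"), Decimal("99.50"), Decimal("0.50"))
exceeds = budget_ok(Decimal("100.00"), Decimal("99.50"), Decimal("0.51"))
```

A skill with no export tags passes `export_ok` for any org, which matches the way `every` evaluates to true over an empty collection in Rego.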
Appendix I: Auditor Export Payload Schema

    auditor_export:
      framework: soc2 | iso27001 | iso42001 | eu-ai-act | hipaa | fedramp | nist-airmf
      window: { start: ISO8601, end: ISO8601 }
      org_id: UUID
      generated_at: ISO8601
      signature: { type: gpg | sigstore, value: base64 }
      control_evidence:
        - control_id: string          # e.g., CC6.1
          framework_section: string   # e.g., SOC 2 Common Criteria 6.1
          evidence:
            - type: span | policy_version | hitl_decision | risk_tier_record
              reference: URI          # resolvable within the package
              summary: string
      span_samples: [...]             # sampled spans with full payload
      policy_versions: [...]
      hitl_decisions: [...]
      conformity_assessments: [...]
      dpias: [...]
      model_cards: [...]
      adversarial_eval_reports: [...]
Appendix J: Bindings Schema v2 (DDL + Resolution + Override OPA)

    -- Generalized from Adverant-NexusROS/database/migrations/030_skill_bindings.sql
    CREATE TABLE ros.skill_bindings_v2 (
        id UUID PRIMARY KEY,
        organization_id UUID NOT NULL,
        binding_key VARCHAR(200) NOT NULL,
        scope VARCHAR(20) NOT NULL DEFAULT 'organization'
            CHECK (scope IN ('system','organization','project','user')),
        scope_id UUID,
        priority INTEGER NOT NULL DEFAULT 0,
        is_active BOOLEAN NOT NULL DEFAULT true,
        -- resolution target
        skill_definition_id UUID NOT NULL,
        skill_version_pin VARCHAR(40) NOT NULL DEFAULT 'latest-minor',
        -- execution
        tier SMALLINT CHECK (tier BETWEEN 1 AND 4),
        provider_preference VARCHAR(40),
        model_preference VARCHAR(100),
        routing_hint VARCHAR(40),
        queue_name VARCHAR(80),
        response_format VARCHAR(20),
        -- cost and limits
        cost_cap_usd NUMERIC(10,4),
        daily_cap_usd NUMERIC(10,2),
        token_cap_in INTEGER,
        token_cap_out INTEGER,
        timeout_ms INTEGER,
        max_iterations INTEGER,
        max_sub_agents INTEGER,
        -- governance
        risk_tier VARCHAR(20),
        data_residency VARCHAR(40),
        export_tags TEXT[],
        requires_hitl VARCHAR(30) DEFAULT 'never',
        policy_refs TEXT[],
        tier_restrictions TEXT[],
        phi_tagged BOOLEAN DEFAULT false,
        compliance_frameworks TEXT[],
        -- hooks
        hooks JSONB NOT NULL DEFAULT '[]',
        allowed_tools TEXT[],
        denied_tools TEXT[],
        -- inputs and mapping
        input_schema JSONB,
        inputs_mapping JSONB,
        output_target VARCHAR(40),
        -- UI presentation
        display_name VARCHAR(120),
        description TEXT,
        icon VARCHAR(60),
        placement TEXT[],
        confirmation VARCHAR(20),
        badge TEXT[],
        shortcut VARCHAR(60),
        -- A/B experiments and telemetry
        ab_experiment_id UUID REFERENCES ros.skill_ab_experiments(id),
        quality_score_threshold NUMERIC(3,2),
        telemetry_tags JSONB DEFAULT '{}',
        -- lifecycle
        status VARCHAR(20) NOT NULL DEFAULT 'active',
        config_overrides JSONB NOT NULL DEFAULT '{}',
        conditions JSONB NOT NULL DEFAULT '{}',
        created_by UUID,
        created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_by UUID,
        updated_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
        deleted_at TIMESTAMPTZ,
        UNIQUE (binding_key, scope, scope_id, priority, organization_id)
    );

    -- Resolution algorithm (pseudo-SQL)
    -- Given: binding_key, user_id, project_id, org_id
    SELECT * FROM ros.skill_bindings_v2
    WHERE binding_key = :binding_key
      AND is_active = true
      AND deleted_at IS NULL
      AND organization_id = :org_id
      AND (
            (scope = 'user' AND scope_id = :user_id)
         OR (scope = 'project' AND scope_id = :project_id)
         OR (scope = 'organization')
         OR (scope = 'system')
      )
    ORDER BY
      CASE scope
        WHEN 'user' THEN 4
        WHEN 'project' THEN 3
        WHEN 'organization' THEN 2
        WHEN 'system' THEN 1
      END DESC,
      priority DESC
    LIMIT 1;
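The resolution query picks the most specific applicable binding: user beats project, project beats organization, organization beats system, with `priority` breaking ties inside a scope. The sketch below restates that ordering over in-memory rows so the precedence can be tested without a database; `resolve_binding` and the dictionary row shape are illustrative, not part of the schema.

```python
from typing import Any, Dict, List, Optional

SCOPE_RANK = {"user": 4, "project": 3, "organization": 2, "system": 1}


def resolve_binding(bindings: List[Dict[str, Any]], binding_key: str, org_id: str,
                    user_id: str, project_id: str) -> Optional[Dict[str, Any]]:
    """Return the winning binding row, or None when nothing applies."""
    def applies(row: Dict[str, Any]) -> bool:
        if row["binding_key"] != binding_key or row["organization_id"] != org_id:
            return False
        if not row["is_active"] or row.get("deleted_at") is not None:
            return False
        if row["scope"] == "user":
            return row["scope_id"] == user_id
        if row["scope"] == "project":
            return row["scope_id"] == project_id
        return row["scope"] in ("organization", "system")

    candidates = [row for row in bindings if applies(row)]
    if not candidates:
        return None
    # Same ordering as the query: scope rank first, then priority, both descending.
    return max(candidates, key=lambda r: (SCOPE_RANK[r["scope"]], r["priority"]))
```

Because scope rank is compared before priority, a user-scoped binding with priority 0 still overrides an organization-scoped binding with priority 1000.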

    # Binding override OPA policy (Appendix H companion)
    package nexus.bindings

    # `in` and `every` need these imports on OPA releases before 1.0.
    import future.keywords.every
    import future.keywords.in

    default allow_override = false

    allow_override {
        not widens_residency
        phi_providers_satisfied
        cost_cap_within_org
        hitl_preserved
        tools_within_allow
        user_has_permission
    }

    widens_residency {
        input.proposed.data_residency == "any"
        input.org_policy.data_residency != "any"
    }

    phi_providers_satisfied {
        not input.skill.phi_tagged
    }

    phi_providers_satisfied {
        input.skill.phi_tagged
        input.proposed.provider_preference in input.org_policy.phi_allowed_providers
    }

    cost_cap_within_org {
        input.proposed.cost_cap_usd <= input.org_policy.max_cost_cap
    }

    hitl_preserved {
        input.skill.risk_tier != "high"
    }

    hitl_preserved {
        input.skill.risk_tier == "high"
        input.proposed.requires_hitl in ["always", "on_risk_high"]
    }

    tools_within_allow {
        every tool in input.proposed.allowed_tools {
            tool in input.org_policy.allowed_tools
        }
    }

    user_has_permission {
        perm := sprintf("bindings:write:%s", [input.proposed.scope])
        perm in input.user.permissions
    }
References
[1] Anthropic. Introducing the Model Context Protocol. 2024. anthropic.com
[2] Anthropic. Donating the Model Context Protocol and Establishing the Agentic AI Foundation. December 2025. anthropic.com
[3] Google. Google Cloud Next 2026: Agent2Agent Goes Production-Grade. 2026. cloud.google.com
[4] Microsoft. Microsoft Agent Framework 1.0 Released. April 2026. devblogs.microsoft.com
[5] LangChain. LangGraph: Graph-Based Agent Orchestration. github.com
[6] Google. Agent Development Kit (ADK). 2026. cloud.google.com
[7] Temporal Technologies. Durable Execution. temporal.io
[8] LangChain. LangSmith: Observability and Evaluation. docs.langchain.com
[9] Netflix Tech Blog. Maestro: Netflix's Workflow Orchestrator. netflixtechblog.com
[10] Amazon Web Services. AgentCore Adds Quality Evaluations and Policy Controls. 2026. aws.amazon.com
[11] Google. Gemini 2.5 Pro. ai.google.dev
[12] Apache Software Foundation. Apache Airflow. airflow.apache.org
[13] Microsoft. Microsoft Agent Framework Overview. learn.microsoft.com
[14] OpenAI. Swarm - OpenAI Agents SDK. github.com
[15] DevOps.com. OpenAI Upgrades Its Agents SDK with Sandboxing and a New Model Harness. 2026. devops.com
[16] Google Cloud Blog. Introducing the Gemini Enterprise Agent Platform. April 2026. cloud.google.com
[17] Google. Gemini Enterprise Agent Platform Product Page. cloud.google.com
[18] SiliconANGLE. Google Brings Agentic Development, Optimization, and Governance Under One Roof. April 2026. siliconangle.com
[19] Amazon Web Services. Introducing Amazon Bedrock AgentCore. aws.amazon.com
[20] CrewAI. CrewAI Documentation. docs.crewai.com
[21] CrewAI. CrewAI GitHub. github.com
[22] Medium. LangGraph vs CrewAI vs AutoGen: Which Agent Framework Should You Actually Use in 2026. medium.com
[23] LangChain. LangGraph Documentation. github.com
[24] LangChain. LangChain and NVIDIA Enterprise. blog.langchain.com
[25] Pydantic. Pydantic AI. ai.pydantic.dev
[26] Cloud Summit. Microsoft Agent Framework Production-Ready Convergence of AutoGen and Semantic Kernel. cloudsummit.eu
[27] OpenAI. OpenAI Agents Python SDK. openai.github.io
[28] Anthropic. Model Context Protocol Announcement. anthropic.com
[29] Anthropic. Claude Agent SDK Overview. platform.claude.com
[30] Anthropic. Claude Agent SDK Subagents. platform.claude.com
[31] Anthropic. Claude Agent SDK Hooks. platform.claude.com
[32] Microsoft. Semantic Kernel. learn.microsoft.com
[33] Microsoft. Semantic Kernel and Microsoft Agent Framework. devblogs.microsoft.com
[34] LlamaIndex. Workflows. llamaindex.ai
[35] LlamaIndex. Announcing Workflows 1.0: A Lightweight Framework for Agentic Systems. llamaindex.ai
[36] Stanford NLP. DSPy. dspy.ai
[37] DSPy. Optimizers. dspy.ai
[38] DSPy. GEPA Optimizer. dspy.ai
[39] deepset. Haystack. haystack.deepset.ai
[40] deepset. Haystack GitHub. github.com
[41] Amazon Web Services. AgentCore New Features April 2026. aws.amazon.com
[42] Adverant Research Team. Unified Nexus Orchestrator: Separation of Dispatch and Execution in Multi-Chain AI Workload Platforms. April 2026. adverant.ai
[43] IntuitionLabs. Enterprise AI Code Assistants in Air-Gapped Environments. intuitionlabs.ai
[44] RapidClaw. AI Agent Marketplace Guide 2026. rapidclaw.dev
[45] The Register. Agentic AI Protocols: MCP, UTCP, A2A, Etc. theregister.com
[46] Prefactor. MCP Security: Multi-Tenant AI Agents Explained. prefactor.tech
[47] Blaxel. Multi-Tenant Isolation for AI Agents. blaxel.ai
[48] AWS. Multi-Tenant Agentic AI Prescriptive Guidance. docs.aws.amazon.com
[49] Coalition for Content Provenance and Authenticity (C2PA). C2PA Specification. c2pa.org
[50] European Union. Regulation (EU) 2024/1689 on Artificial Intelligence (AI Act). Official Journal of the European Union, July 2024.
[51] European Union. General Data Protection Regulation (Regulation (EU) 2016/679). Official Journal of the European Union, April 2016.
[52] European Union. Data Act (Regulation (EU) 2023/2854). Official Journal of the European Union, December 2023.
[53] American Institute of CPAs. SOC 2: Trust Services Criteria. aicpa.org
[54] International Organization for Standardization. ISO/IEC 27001:2022 Information Security Management. iso.org
[55] International Organization for Standardization. ISO/IEC 42001:2023 AI Management Systems. iso.org
[56] U.S. Department of Health and Human Services. HIPAA Security Rule (45 CFR Part 160 and Subparts A and C of Part 164). hhs.gov
[57] General Services Administration. FedRAMP Moderate and High Baselines. fedramp.gov
[58] Department of Defense. DoD Cloud Computing Security Requirements Guide (IL4/IL5). public.cyber.mil
[59] National Institute of Standards and Technology. AI Risk Management Framework 1.0. January 2023. nist.gov
[60] National Institute of Standards and Technology. AI 600-1: Generative AI Profile. nist.gov
[61] OWASP. Top 10 for LLM Applications 2025. owasp.org
[62] MITRE. ATLAS: Adversarial Threat Landscape for AI Systems. atlas.mitre.org
[63] OWASP. Agentic AI Threats Working Group. owasp.org
[64] Bureau of Industry and Security. Export Administration Regulations (EAR, 15 CFR Parts 730–774). bis.doc.gov
[65] European Union. EU Dual-Use Export Control Regulation 2021/821.
[66] Kwon, W., et al. Efficient Memory Management for Large Language Model Serving with PagedAttention. SOSP 2023.
[67] Yu, G. et al. Orca: A Distributed Serving System for Transformer-Based Generative Models. OSDI 2022.
[68] Zheng, L., et al. SGLang: Efficient Execution of Structured Language Model Programs. 2024.
[69] Agrawal, A., et al. Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve. OSDI 2024.
[70] Chen, L., Zaharia, M., Zou, J. FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance. 2023. arxiv.org
[71] Ong, I., et al. RouteLLM: Learning to Route LLMs with Preference Data. 2024. arxiv.org
[72] Dekoninck, J. et al. A Unified Approach to Routing and Cascading for LLMs. 2024.
[73] Yao, S., et al. ReAct: Synergizing Reasoning and Acting in Language Models. 2022. arxiv.org
[74] Schick, T., et al. Toolformer: Language Models Can Teach Themselves to Use Tools. 2023. arxiv.org
[75] Shinn, N., et al. Reflexion: Language Agents with Verbal Reinforcement Learning. 2023. arxiv.org
[76] Wang, G., et al. Voyager: An Open-Ended Embodied Agent with Large Language Models. 2023. arxiv.org
[77] Li, G., et al. CAMEL: Communicative Agents for Mind Exploration. 2023. arxiv.org
[78] Wu, Q., et al. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. 2023. arxiv.org
[79] Istio Authors. Istio Service Mesh. istio.io
[80] Cloud Native Computing Foundation. SPIFFE / SPIRE. spiffe.io
[81] National Institute of Standards and Technology. SP 800-57 Part 1 Rev 5: Recommendation for Key Management. csrc.nist.gov
[82] Pydantic. Pydantic AI GitHub. github.com
[83] Pydantic. Pydantic AI Product Page. pydantic.dev
[84] LangChain. LangChain Home. langchain.com
[85] Google Blog. Gemini Enterprise Agent Platform Announcement. blog.google
[86] Visual Studio Magazine. Microsoft Ships Production-Ready Agent Framework 1.0 for .NET and Python. visualstudiomagazine.com
[87] Microsoft. Semantic Kernel to Microsoft Agent Framework Migration Guide. learn.microsoft.com
[88] LlamaIndex. AgentWorkflow Announcement. llamaindex.ai
[89] Stanford NLP. DSPy GitHub. github.com
[90] deepset. Products and Services: Haystack. deepset.ai
[91] dev.to. MCP vs A2A: The Complete Guide to AI Agent Protocols in 2026. dev.to
[92] Digital Applied. AI Agent Protocol Ecosystem Map 2026. digitalapplied.com
[93] Machine Learning Mastery. 7 Agentic AI Trends to Watch in 2026. machinelearningmastery.com
[94] Redwerk. LangGraph vs CrewAI Production. redwerk.com
[95] OpenAgents. Open Source AI Agent Frameworks Compared. openagents.org
[96] Adverant Research Team. Cognitive Memory Architecture for Multi-Tenant LLM Platforms. April 2026. adverant.ai
Paper Completeness Statement
Fifteen sections have been drafted (Abstract through Conclusion plus Appendices A–J). Fifty use cases numbered 1–50 appear in Section 9. Ninety-six references are enumerated. Fifty diagrams (A1–A9, B1–B21, C1–C8, D1–D9, E1–E8, F1–F4) are specified in Section 8 and rendered as ASCII figures in the published LaTeX source plus Mermaid and PlantUML source files under figures/. The Bindings metadata schema (Appendix J) carries all fields listed in Section 7.15. The compliance-control traceability matrix (Appendix G) excerpt maps twenty-four control identifiers across eight frameworks to v4.0 primitives.
Three Gemini 2.5 Pro validation gates (Gate A, post-outline; Gate B, post-Section 7 proposal core; Gate C, pre-publication peer review simulation) are archived as structured prompt-plus-response files in the gemini-gates/ sibling directory and are part of the published package. The prompts embed arXiv-category adversarial-reviewer personas (cs.SE, cs.DC, cs.AI) as specified in the plan.
This paper is published link-only and excluded from the /docs/research index page, the sitemap.xml, and the crawl allowlist. The same not-discoverable policy used by the UNO Pipeline Redesign paper [42] and the Cognitive Memory Architecture paper [96] applies here.
