Zero Trust for AI Agents

MeetLoyd applies Zero Trust principles to every AI agent in the platform. No agent is trusted by default -- identity, capabilities, and behavior must be continuously verified.

This page maps MeetLoyd's security controls to the five pillars that define Zero Trust governance for autonomous agents.

The Five Pillars

1. Identity -- "Who are you?"

Every agent has a cryptographically verifiable identity:

  • SPIFFE ID: Auto-assigned at deployment, encoding the tenant boundary
  • IETF Client ID Metadata: RFC 7591-compliant discovery document
  • W3C Verifiable Credentials: Badges encoding tools and permissions, signed by platform key (ES256)
  • NANDA AgentFacts: Cross-platform agent descriptor for federated discovery (Project Nanda compatible)
  • Capability Attestations: Sandbox-verified proofs that an agent can perform what it claims
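
To make the tenant-boundary idea concrete, here is a minimal sketch of checking a SPIFFE ID before granting access to a tenant resource. The path layout `/tenant/<tenant-id>/agent/<agent-id>`, the `AgentIdentity` type, and both function names are assumptions made for illustration; this page does not specify how MeetLoyd lays out its SPIFFE paths.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentIdentity:
    trust_domain: str
    tenant_id: str
    agent_id: str


def parse_spiffe_id(spiffe_id: str) -> AgentIdentity:
    """Parse a SPIFFE ID, assuming a hypothetical /tenant/<id>/agent/<id> path."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    parts = parsed.path.strip("/").split("/")
    well_formed = (
        len(parts) == 4
        and parts[0] == "tenant"
        and parts[2] == "agent"
        and parts[1] != ""
        and parts[3] != ""
    )
    if not well_formed:
        raise ValueError(f"unexpected path layout: {parsed.path!r}")
    return AgentIdentity(parsed.netloc, parts[1], parts[3])


def same_tenant(caller_spiffe_id: str, resource_tenant: str) -> bool:
    """Fail closed: any unparsable identity is treated as a different tenant."""
    try:
        return parse_spiffe_id(caller_spiffe_id).tenant_id == resource_tenant
    except ValueError:
        return False
```

In a real deployment the ID would be read from a verified SVID rather than a bare string; the sketch only shows the boundary check itself.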

2. Behavioral Monitoring -- "What are you doing?"

Every agent action is logged, baselined, and monitored for drift:

  • SOX-compliant audit chain: Hash-chained logs for every LLM call, tool invocation, and handoff
  • SIEM export: CEF, JSON, LEEF formats for Splunk, Datadog, Elastic, SumoLogic
  • Behavioral baselines: Per-model response metrics with anomaly detection (50%+ deviation threshold)
  • Drift detection: Oscillation, contradiction, and divergence signals across team agents
  • Post-task evaluation: 5-dimension quality scoring on every run (goal achievement, charter alignment, output quality, intent alignment, tone consistency)
  • Decision explainability: Chain-of-thought logging, extended thinking (DeepSeek R1, Claude Opus)
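
The hash-chained audit log mentioned above can be illustrated with a short sketch. It assumes SHA-256 over the previous hash plus a canonical JSON encoding of the event; the record fields and the function names `append` and `verify` are hypothetical and are not taken from MeetLoyd's implementation.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys, no extra whitespace) keeps the hash stable.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, event: dict) -> None:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"event": event, "prev_hash": prev_hash, "hash": entry_hash(prev_hash, event)}
    )


def verify(chain: list) -> bool:
    """Recompute every link; any edited, removed, or reordered record fails."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != entry_hash(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Because each record commits to the one before it, changing any past event invalidates every later hash, which is what makes tampering detectable.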

3. Data Governance -- "What data flows through?"

A 6-stage LLM Gateway pipeline wraps every request:

Request → Prompt Injection Detection → PII Redaction → Content Moderation → LLM Call → Output Validation → PII Restoration → Response

  • Schema validation: Zod schemas on all MCP tool parameters
  • Injection prevention: Dedicated prompt injection detector
  • PII/PHI masking: Redaction before LLM, restoration after (HIPAA/GDPR packs enable stricter thresholds)
  • Output filtering: Content moderation + custom guardrail rules
  • Data lineage: Context graph tracks entity provenance across conversations
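
The redact-then-restore step of the pipeline can be sketched as follows. This is a simplified illustration that handles only email addresses with a basic regular expression; the placeholder format `<EMAIL_n>` and the function names are assumptions, and a production detector would cover many more PII types.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with a placeholder and return the mapping."""
    mapping: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        value = match.group(0)
        for placeholder, original in mapping.items():
            if original == value:
                return placeholder  # reuse the placeholder for repeated values
        placeholder = f"<EMAIL_{len(mapping) + 1}>"
        mapping[placeholder] = value
        return placeholder

    return EMAIL.sub(replace, text), mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the model's reply."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def call_through_gateway(prompt: str, llm) -> str:
    """Redact before the LLM call, restore after it."""
    redacted, mapping = redact(prompt)
    reply = llm(redacted)  # the model only ever sees placeholders
    return restore(reply, mapping)
```

Here `llm` is any callable that takes a prompt string and returns a reply string, so the sketch can be exercised with a stub instead of a real model.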

4. Segmentation -- "Where can you go?"

Agents operate under the principle of least privilege:

  • OpenFGA authorization: Per-tenant stores, authorization on every MCP tool call, 125+ tools mapped
  • TBAC (Tool-Based Access Control): Policies for cross-agent delegation (deny overrides allow)
  • Solver constraints: 8 constraint types, including budget, tool allow/deny, rate limit, separation of duties, PII, and pattern match
  • Cascading governance: Platform → Tenant → App → Team → Agent (most specific wins)
  • Sandbox isolation: E2B, Firecracker, Kubernetes, Docker Desktop -- network policies block outbound by default
  • Rate limiting: Per-tenant (120-5000 RPM), per-tool, per-model
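
Two of the rules above, "most specific wins" for cascading governance and "deny overrides allow" for TBAC, can be sketched in a few lines. The data shapes (a dict of settings per level, a list of policy dicts with `tool` and `effect` keys) and the function names are hypothetical; they only illustrate the resolution order, not MeetLoyd's actual policy model.

```python
# Governance levels from least to most specific.
LEVELS = ["platform", "tenant", "app", "team", "agent"]


def resolve_setting(key: str, layers: dict):
    """Return the value from the most specific level that defines the key."""
    for level in reversed(LEVELS):
        settings = layers.get(level, {})
        if key in settings:
            return settings[key]
    return None  # no level defines this setting


def tbac_decision(tool: str, policies: list) -> str:
    """Deny overrides allow; with no matching policy, default to deny."""
    matching = [p for p in policies if p["tool"] == tool]
    if any(p["effect"] == "deny" for p in matching):
        return "deny"
    if any(p["effect"] == "allow" for p in matching):
        return "allow"
    return "deny"
```

For example, a team-level `max_budget` takes precedence over the tenant and platform values, while a single deny policy on a tool blocks it regardless of how many allow policies also match.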

5. Incident Response -- "What if something goes wrong?"

Containment and recovery in seconds, not hours:

  • Circuit breakers: Per-service with configurable thresholds
  • Kill switches: Governance pack module for immediate agent/team termination
  • Token revocation: Short-lived SVIDs (1h default), revocable OAuth tokens, API key revocation
  • State rollback: Manifest redeployment, versioned charters, versioned briefings, DB point-in-time recovery
  • Graceful degradation: Hybrid executor falls back from plan-first to step-by-step execution, model alias failover, sandbox fallback chain
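
As an illustration of the per-service circuit breaker pattern, here is a minimal sketch with a configurable failure threshold and cooldown. The class name, parameter names, and default values are assumptions chosen for the example and do not describe MeetLoyd's configuration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock  # injectable so the breaker can be tested without sleeping
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; call rejected")
            # Cooldown elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping each downstream service in its own breaker instance keeps one failing dependency from consuming the time and budget of every agent that calls it.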

OWASP Agentic Top 10

MeetLoyd addresses all 10 risks from the OWASP Top 10 for Agentic Applications:

Risk | MeetLoyd Control
Prompt Injection | Gateway prompt injection detector
Tool Misuse | OpenFGA + TBAC + Solver constraints
Excessive Agency | Charter boundaries + autonomy levels
Lack of Transparency | CoT logging + action transparency + post-task evaluation
Insecure Output | PII redaction + output validation + guardrail rules
Privilege Escalation | SPIFFE + OpenFGA + default deny + cascading policies
Data Poisoning | Schema validation + content filtering + context graph lineage
Denial of Service | Rate limiting + circuit breakers + credit guard
Supply Chain | Manifest validation + SRI digest pinning + capability attestations
Logging Failures | SOX audit chain + SIEM export + per-action attribution

For detailed technique-level coverage, see our SAFE-MCP Mapping, which covers 79 of 85 SAFE-MCP techniques. The remaining 6 are not applicable to MeetLoyd's architecture (e.g., techniques that target user-installed MCP servers do not apply because MeetLoyd manages MCP servers centrally).

MeetLoyd also implements the SAFE-K8S infrastructure security catalog -- 10 governance modules covering all 10 SAFE-K8S domains (593 controls, 55 knowledge areas) for Kubernetes and AI infrastructure security. These modules are provider-agnostic, applying to all sandbox backends (K8s, Docker, Firecracker, E2B, Fly Machines), and are mapped to EU AI Act, DORA, ISO 27001, ISO 42001, SOC 2, NIST AI RMF, and NIST CSF with precise article references.

Standards Alignment

Standard | Alignment
NIST 800-207 | Zero Trust Architecture principles applied to agent identity, access, and monitoring
OWASP Agentic Top 10 | 10/10 risks addressed
SAFE-MCP | 79/85 techniques mitigated (93%) -- agent-tool interaction layer
SAFE-K8S | 10/10 domains mapped -- infrastructure and AI workload layer
NIST AI RMF | Governance, mapping, measurement, and management functions implemented
EU AI Act | High-risk AI system controls via governance packs
Project Nanda | AgentFacts descriptor for cross-platform discovery, capability attestations

Related Pages

  • Governance Packs: GDPR, HIPAA, SOX, EU AI Act, DORA modules
  • Agent Identity: SPIFFE, W3C VC, Token Exchange, TBAC
  • Policy Engine: Solver constraint engine
  • Audit Logs: SOX-compliant hash-chain logging
  • SIEM Integration: Export to enterprise security platforms