Zero Trust for AI Agents

MeetLoyd applies Zero Trust principles to every AI agent in the platform. No agent is trusted by default -- identity, capabilities, and behavior must be continuously verified.

This page maps MeetLoyd's security controls to the five pillars that define Zero Trust governance for autonomous agents.

The Five Pillars

1. Identity -- "Who are you?"

Every agent has a cryptographically verifiable identity:

SPIFFE ID: Auto-assigned at deployment, encoding the tenant boundary
IETF Client ID Metadata: RFC 7591-compliant discovery document
W3C Verifiable Credentials: Badges encoding tools and permissions, signed by platform key (ES256)
NANDA AgentFacts: Cross-platform agent descriptor for federated discovery (Project Nanda compatible)
Capability Attestations: Sandbox-verified proofs that an agent can perform what it claims
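
As an illustration of how a tenant boundary can be read back out of a SPIFFE ID, here is a minimal sketch. The path layout /tenant/<tenant>/agent/<agent> and the function names are assumptions made for this example, not MeetLoyd's documented ID scheme.

```python
from urllib.parse import urlparse


def parse_agent_spiffe_id(spiffe_id: str) -> dict:
    """Split a SPIFFE ID into trust domain, tenant, and agent name.

    Assumes the hypothetical path layout /tenant/<tenant>/agent/<agent>.
    """
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    parts = parsed.path.strip("/").split("/")
    if (len(parts) != 4 or parts[0] != "tenant" or parts[2] != "agent"
            or not parts[1] or not parts[3]):
        raise ValueError(f"unexpected path layout: {parsed.path!r}")
    return {"trust_domain": parsed.netloc, "tenant": parts[1], "agent": parts[3]}


def same_tenant(caller_id: str, target_id: str) -> bool:
    """Return True only when both IDs share a trust domain and a tenant."""
    caller = parse_agent_spiffe_id(caller_id)
    target = parse_agent_spiffe_id(target_id)
    return (caller["trust_domain"] == target["trust_domain"]
            and caller["tenant"] == target["tenant"])
```

A check like same_tenant would typically run before any cross-agent call, so that a request from another tenant is rejected on identity alone.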

2. Behavioral Monitoring -- "What are you doing?"

Every agent action is logged, baselined, and monitored for drift:

SOX-compliant audit chain: Hash-chained logs for every LLM call, tool invocation, and handoff
SIEM export: CEF, JSON, LEEF formats for Splunk, Datadog, Elastic, SumoLogic
Behavioral baselines: Per-model response metrics with anomaly detection that flags deviations of 50% or more from the baseline
Drift detection: Oscillation, contradiction, and divergence signals across team agents
Post-task evaluation: 5-dimension quality scoring on every run (goal achievement, charter alignment, output quality, intent alignment, tone consistency)
Decision explainability: Chain-of-thought logging, extended thinking (DeepSeek R1, Claude Opus)
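
The hash-chained audit log mentioned above can be sketched as follows. This is a generic illustration of the technique, assuming SHA-256 and a JSON-serialized event. The field names and the genesis value are hypothetical and do not describe MeetLoyd's actual log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event payload."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev_hash,
             "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, changing a single logged action invalidates that entry and every entry after it, which is what makes tampering detectable.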

3. Data Governance -- "What data flows through?"

A 6-stage LLM Gateway pipeline wraps every request:

Request → Prompt Injection Detection → PII Redaction → Content Moderation → LLM Call → Output Validation → PII Restoration → Response

Schema validation: Zod schemas on all MCP tool parameters
Injection prevention: Dedicated prompt injection detector
PII/PHI masking: Redaction before LLM, restoration after (HIPAA/GDPR packs enable stricter thresholds)
Output filtering: Content moderation + custom guardrail rules
Data lineage: Context graph tracks entity provenance across conversations
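
The pipeline order above can be sketched in a few lines. This is a toy illustration only: the phrase list, the blocked terms, the email-only redaction, and every name in it are assumptions for the example, and real detectors are far more involved than these checks.

```python
import re
from typing import Callable

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
INJECTION_PHRASES = ("ignore previous instructions",)  # toy heuristic (assumption)
BLOCKED_TERMS = ("forbidden-topic",)                   # toy moderation list (assumption)


class GatewayRejection(Exception):
    """Raised when a stage refuses to pass the request or response along."""


def redact_pii(text: str):
    """Replace each email address with a placeholder and remember the mapping."""
    mapping = {}

    def _swap(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token

    return EMAIL_RE.sub(_swap, text), mapping


def restore_pii(text: str, mapping: dict) -> str:
    """Put the original values back in place of their placeholders."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def run_gateway(prompt: str, call_llm: Callable[[str], str]) -> str:
    # Stage 1: prompt injection detection
    if any(phrase in prompt.lower() for phrase in INJECTION_PHRASES):
        raise GatewayRejection("prompt injection suspected")
    # Stage 2: PII redaction
    redacted, mapping = redact_pii(prompt)
    # Stage 3: content moderation
    if any(term in redacted.lower() for term in BLOCKED_TERMS):
        raise GatewayRejection("content moderation")
    # Stage 4: LLM call (the model only ever sees the redacted prompt)
    output = call_llm(redacted)
    # Stage 5: output validation
    if not isinstance(output, str) or not output.strip():
        raise GatewayRejection("output validation failed")
    # Stage 6: PII restoration
    return restore_pii(output, mapping)
```

The point of the ordering is that redaction happens before the model call and restoration after it, so the original values never leave the gateway.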

4. Segmentation -- "Where can you go?"

Agents operate under the principle of least privilege:

OpenFGA authorization: Per-tenant stores, authorization on every MCP tool call, 125+ tools mapped
TBAC (Tool-Based Access Control): Policies for cross-agent delegation (deny overrides allow)
Solver constraints: 8 constraint types, including budget, tool allow/deny, rate limit, separation of duties, PII, and pattern match
Cascading governance: Platform → Tenant → App → Team → Agent (most specific wins)
Sandbox isolation: E2B, Firecracker, Kubernetes, Docker Desktop -- network policies block outbound by default
Rate limiting: Per-tenant (120-5000 RPM), per-tool, per-model
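
Two of the rules above, "most specific wins" for cascading governance and "deny overrides allow" for delegation, can be sketched as below. The data shapes, the "*" wildcard, and the function names are hypothetical, chosen only to make the resolution order concrete.

```python
# Governance levels ordered from least to most specific.
LEVELS = ("platform", "tenant", "app", "team", "agent")


def resolve_setting(key: str, policies: dict, default=None):
    """Return the value set at the most specific level that defines the key."""
    for level in reversed(LEVELS):
        level_policy = policies.get(level, {})
        if key in level_policy:
            return level_policy[key]
    return default


def delegation_allowed(tool: str, rules: list) -> bool:
    """Evaluate (effect, tool) rules: any matching deny wins, default is deny.

    The "*" wildcard is an assumption made for this sketch.
    """
    effects = {effect for effect, rule_tool in rules if rule_tool in (tool, "*")}
    if "deny" in effects:
        return False
    return "allow" in effects
```

For example, a team-level max_rpm overrides the platform-level value, while a setting defined only at the platform level still applies to every agent beneath it.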

5. Incident Response -- "What if something goes wrong?"

Containment and recovery in seconds, not hours:

Circuit breakers: Per-service with configurable thresholds
Kill switches: Governance pack module for immediate agent/team termination
Token revocation: Short-lived SVIDs (1h default), revocable OAuth tokens, API key revocation
State rollback: Manifest redeployment, versioned charters, versioned briefings, DB point-in-time recovery
Graceful degradation: Hybrid executor falls back from plan-first to step-by-step execution, model alias failover, sandbox fallback chain
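
A per-service circuit breaker with configurable thresholds can be sketched as follows. This is a generic version of the pattern. The class name, parameter names, and default values are assumptions for the example rather than MeetLoyd's configuration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after = reset_after              # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_after:
            return "half_open"
        return "open"

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("service temporarily disabled")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self._failures >= self.failure_threshold or self._opened_at is not None:
                self._opened_at = self._clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Once open, the breaker fails fast instead of calling the unhealthy service, then allows a single trial call after the reset interval to decide whether to close again.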

OWASP Agentic Top 10

MeetLoyd addresses all 10 risks from the OWASP Top 10 for Agentic Applications:

Risk	MeetLoyd Control
Prompt Injection	Gateway prompt injection detector
Tool Misuse	OpenFGA + TBAC + Solver constraints
Excessive Agency	Charter boundaries + autonomy levels
Lack of Transparency	CoT logging + action transparency + post-task evaluation
Insecure Output	PII redaction + output validation + guardrail rules
Privilege Escalation	SPIFFE + OpenFGA + default deny + cascading policies
Data Poisoning	Schema validation + content filtering + context graph lineage
Denial of Service	Rate limiting + circuit breakers + credit guard
Supply Chain	Manifest validation + SRI digest pinning + capability attestations
Logging Failures	SOX audit chain + SIEM export + per-action attribution

For technique-level coverage, see our SAFE-MCP Mapping, which covers 79 of 85 SAFE-MCP techniques. The remaining 6 are not applicable to MeetLoyd's architecture (e.g., techniques that rely on user-installed MCP servers, since MeetLoyd manages MCP servers centrally).

MeetLoyd also implements the SAFE-K8S infrastructure security catalog -- 10 governance modules covering all 10 SAFE-K8S domains (593 controls, 55 knowledge areas) for Kubernetes and AI infrastructure security. These modules are provider-agnostic, applying to all sandbox backends (K8s, Docker, Firecracker, E2B, Fly Machines), and are mapped to EU AI Act, DORA, ISO 27001, ISO 42001, SOC 2, NIST AI RMF, and NIST CSF with precise article references.

Standards Alignment

Standard	Alignment
NIST SP 800-207	Zero Trust Architecture principles applied to agent identity, access, and monitoring
OWASP Agentic Top 10	10/10 risks addressed
SAFE-MCP	79/85 techniques mitigated (93%) -- agent-tool interaction layer
SAFE-K8S	10/10 domains mapped -- infrastructure and AI workload layer
NIST AI RMF	Govern, Map, Measure, and Manage functions implemented
EU AI Act	High-risk AI system controls via governance packs
Project Nanda	AgentFacts descriptor for cross-platform discovery, capability attestations

Governance Packs

GDPR, HIPAA, SOX, EU AI Act, DORA modules

Agent Identity

SPIFFE, W3C VC, Token Exchange, TBAC

Policy Engine

Solver constraint engine

Audit Logs

SOX-compliant hash-chain logging

SIEM Integration

Export to enterprise security platforms
