Observability & Telemetry

MeetLoyd implements OpenTelemetry (OTel) for standardized observability of AI agent executions, providing distributed tracing and metrics following industry-standard semantic conventions.

Overview

The telemetry system provides:

  • Distributed Tracing: Track requests across agent executions, LLM calls, and tool invocations
  • Metrics Collection: Counters and histograms for executions, tokens, latency, and costs
  • Gen AI Semantic Conventions: Industry-standard gen_ai.* attributes for LLM observability
  • OTLP Export: Compatible with any OpenTelemetry-compatible backend

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      MeetLoyd Platform                      │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │    Agent    │  │     LLM     │  │        Tool         │  │
│  │  Execution  │──│    Calls    │──│      Execution      │  │
│  │    Spans    │  │    Spans    │  │        Spans        │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
│         │                │                    │             │
│         └────────────────┼────────────────────┘             │
│                          ▼                                  │
│              ┌───────────────────────┐                      │
│              │   OpenTelemetry SDK   │                      │
│              │ - BatchSpanProcessor  │                      │
│              │ - MetricReader        │                      │
│              └───────────────────────┘                      │
│                          │                                  │
└──────────────────────────┼──────────────────────────────────┘
                           ▼
               ┌───────────────────────┐
               │  OTLP/HTTP Exporter   │
               └───────────────────────┘
                           │
                           ▼
               ┌───────────────────────┐
               │ Observability Backend │
               │ (Grafana, Honeycomb,  │
               │  Datadog, Jaeger)     │
               └───────────────────────┘
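
The batching step in the diagram can be pictured with a toy model. The sketch below is not the OpenTelemetry SDK's BatchSpanProcessor and not MeetLoyd's code; `ToyBatchProcessor`, `max_batch_size`, and the `export` callback are hypothetical names used only to show the idea of buffering finished spans and exporting them in groups.

```python
from typing import Callable, Dict, List

Span = Dict[str, object]  # simplified stand-in for a finished span


class ToyBatchProcessor:
    """Buffers finished spans and hands them to an exporter in batches."""

    def __init__(self, export: Callable[[List[Span]], None], max_batch_size: int = 3) -> None:
        self._export = export
        self._max_batch_size = max_batch_size
        self._buffer: List[Span] = []

    def on_end(self, span: Span) -> None:
        # Queue the span; flush once the buffer reaches the batch size.
        self._buffer.append(span)
        if len(self._buffer) >= self._max_batch_size:
            self.flush()

    def flush(self) -> None:
        # Export whatever is buffered (for example at shutdown).
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._export(batch)


# Collect exported batches in a list instead of sending them over OTLP/HTTP.
exported_batches: List[List[Span]] = []
processor = ToyBatchProcessor(exported_batches.append, max_batch_size=3)
for i in range(4):
    processor.on_end({"name": f"span-{i}"})
processor.flush()
batch_sizes = [len(batch) for batch in exported_batches]
```

Four spans with a batch size of three produce one full batch and one remainder batch on the final flush. A real processor would also flush on a timer and bound its queue.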

Traced Operations

Agent Execution Spans

A root span is created for each agent execution, with these attributes:

| Attribute | Description |
| --- | --- |
| gen_ai.agent.id | Unique agent identifier |
| gen_ai.agent.name | Human-readable agent name |
| gen_ai.request.model | LLM model used |
| gen_ai.provider.name | Provider (anthropic, openai, etc.) |
| meetloyd.tenant.id | Tenant identifier |
| meetloyd.user.id | User who triggered execution |

LLM Call Spans

Child spans for each LLM API call:

| Attribute | Description |
| --- | --- |
| gen_ai.operation.name | Operation type (chat, completion) |
| gen_ai.request.model | Model requested |
| gen_ai.request.max_tokens | Max tokens parameter |
| gen_ai.request.temperature | Temperature parameter |
| gen_ai.usage.input_tokens | Input tokens consumed |
| gen_ai.usage.output_tokens | Output tokens generated |
| gen_ai.response.id | Provider response ID |
| gen_ai.response.finish_reasons | Why generation stopped |

Tool Execution Spans

Child spans for each tool invocation:

| Attribute | Description |
| --- | --- |
| gen_ai.tool.name | Tool identifier |
| gen_ai.tool.call.id | Unique call identifier |
| gen_ai.tool.type | Tool type (builtin, custom) |
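
To make the parent/child relationship between the three span types concrete, here is a minimal sketch of one trace as plain Python data. It does not use the OpenTelemetry SDK; `SketchSpan`, the span names (`agent.execution`, `llm.call`, `tool.execution`), and all attribute values are hypothetical. Only the attribute keys come from the tables above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchSpan:
    """Hypothetical stand-in for a span: a name, attributes, and children."""

    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    children: List["SketchSpan"] = field(default_factory=list)

    def child(self, name: str, attributes: Dict[str, Any]) -> "SketchSpan":
        # Child spans hang off the span that started them.
        span = SketchSpan(name, dict(attributes))
        self.children.append(span)
        return span


def total_tokens(span: SketchSpan) -> int:
    """Sum input and output tokens over a span and all of its descendants."""
    own = sum(
        int(span.attributes.get(key, 0))
        for key in ("gen_ai.usage.input_tokens", "gen_ai.usage.output_tokens")
    )
    return own + sum(total_tokens(c) for c in span.children)


# Root span: one per agent execution.
root = SketchSpan("agent.execution", {
    "gen_ai.agent.id": "agent-123",
    "gen_ai.agent.name": "support-bot",
    "gen_ai.request.model": "example-model",
    "gen_ai.provider.name": "anthropic",
    "meetloyd.tenant.id": "tenant-1",
    "meetloyd.user.id": "user-1",
})

# Child span: one per LLM API call.
root.child("llm.call", {
    "gen_ai.operation.name": "chat",
    "gen_ai.request.model": "example-model",
    "gen_ai.request.max_tokens": 1024,
    "gen_ai.request.temperature": 0.2,
    "gen_ai.usage.input_tokens": 120,
    "gen_ai.usage.output_tokens": 30,
    "gen_ai.response.id": "resp-1",
    "gen_ai.response.finish_reasons": ["stop"],
})

# Child span: one per tool invocation.
root.child("tool.execution", {
    "gen_ai.tool.name": "search",
    "gen_ai.tool.call.id": "call-1",
    "gen_ai.tool.type": "builtin",
})

child_names = [c.name for c in root.children]
tokens = total_tokens(root)
```

Because token usage lives on the LLM call spans, a per-execution total is obtained by walking the children of the root span.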

Metrics

Counters

| Metric | Description |
| --- | --- |
| gen_ai.agent.executions | Total agent executions |
| gen_ai.llm.calls | Total LLM API calls |
| gen_ai.tool.calls | Total tool invocations |
| gen_ai.tokens.input | Total input tokens |
| gen_ai.tokens.output | Total output tokens |
| gen_ai.errors | Total errors by type |

Histograms

| Metric | Description |
| --- | --- |
| gen_ai.agent.duration | Execution duration (ms) |
| gen_ai.llm.latency | LLM call latency (ms) |
| gen_ai.tool.latency | Tool execution latency (ms) |
| gen_ai.cost.usd | Execution cost (USD) |
| gen_ai.tokens.per_execution | Tokens per execution |
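
The difference between the two instrument types can be sketched in a few lines: a counter only accumulates, while a histogram keeps a distribution of observed values. This is an in-memory illustration, not the OpenTelemetry metrics API; `SketchMetrics` and `record_llm_call` are hypothetical helpers, and only the metric names are taken from the tables above.

```python
from collections import defaultdict
from typing import DefaultDict, List


class SketchMetrics:
    """Toy in-memory registry with counters and histograms."""

    def __init__(self) -> None:
        self.counters: DefaultDict[str, int] = defaultdict(int)
        self.histograms: DefaultDict[str, List[float]] = defaultdict(list)

    def add(self, name: str, value: int = 1) -> None:
        # Counters are monotonic: they only ever increase.
        self.counters[name] += value

    def record(self, name: str, value: float) -> None:
        # Histograms keep individual observations for later aggregation.
        self.histograms[name].append(value)


def record_llm_call(metrics: SketchMetrics, input_tokens: int,
                    output_tokens: int, latency_ms: float) -> None:
    """Update the LLM-related instruments for a single call."""
    metrics.add("gen_ai.llm.calls")
    metrics.add("gen_ai.tokens.input", input_tokens)
    metrics.add("gen_ai.tokens.output", output_tokens)
    metrics.record("gen_ai.llm.latency", latency_ms)


metrics = SketchMetrics()
record_llm_call(metrics, input_tokens=120, output_tokens=30, latency_ms=850.0)
record_llm_call(metrics, input_tokens=80, output_tokens=20, latency_ms=650.0)

latencies = metrics.histograms["gen_ai.llm.latency"]
mean_latency_ms = sum(latencies) / len(latencies)
```

A real backend aggregates histogram observations into buckets, which is what makes percentile queries such as p95 latency possible.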

Configuration

OpenTelemetry is disabled by default. To enable it, set these environment variables:

# Enable telemetry
OTEL_ENABLED=true

# OTLP endpoint (required when enabled)
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-backend.com

# Service identification
OTEL_SERVICE_NAME=meetloyd
OTEL_SERVICE_VERSION=1.0.0

# Sampling rate (0.0-1.0, default 1.0)
# Use 0.1 (10%) for high-volume production
OTEL_SAMPLE_RATE=0.1

# Console export for debugging (development only)
OTEL_CONSOLE_EXPORT=false
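
As an illustration of how these variables interact, the following sketch reads them into a typed configuration object. It is an assumption about reasonable behavior, not MeetLoyd's actual loader: `TelemetryConfig` and `load_telemetry_config` are hypothetical names, the `meetloyd` service-name default is assumed, and rejecting an out-of-range sample rate with `ValueError` is a design choice of this sketch.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional, Tuple


@dataclass(frozen=True)
class TelemetryConfig:
    enabled: bool
    endpoint: Optional[str]
    service_name: str
    sample_rate: float
    console_export: bool


def _flag(env: Mapping[str, str], name: str) -> bool:
    # Treat only the literal string "true" (any case) as enabled.
    return env.get(name, "false").strip().lower() == "true"


def load_telemetry_config(env: Mapping[str, str]) -> Tuple[TelemetryConfig, List[str]]:
    """Return the parsed config plus any warnings to log."""
    warnings: List[str] = []
    enabled = _flag(env, "OTEL_ENABLED")
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip() or None

    # Enabled without an endpoint: warn and fall back to disabled.
    if enabled and endpoint is None:
        warnings.append("OTEL_ENABLED=true but OTEL_EXPORTER_OTLP_ENDPOINT is not set")
        enabled = False

    sample_rate = float(env.get("OTEL_SAMPLE_RATE", "1.0"))
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("OTEL_SAMPLE_RATE must be between 0.0 and 1.0")

    config = TelemetryConfig(
        enabled=enabled,
        endpoint=endpoint,
        service_name=env.get("OTEL_SERVICE_NAME", "meetloyd"),  # assumed default
        sample_rate=sample_rate,
        console_export=_flag(env, "OTEL_CONSOLE_EXPORT"),
    )
    return config, warnings


config, config_warnings = load_telemetry_config({
    "OTEL_ENABLED": "true",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://your-backend.com",
    "OTEL_SAMPLE_RATE": "0.1",
})
```

In a real process the mapping passed in would be `os.environ`. Passing a plain dict keeps the function easy to test.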

Supported Backends

MeetLoyd's OTLP export is compatible with:

  • Grafana Cloud - https://otlp-gateway-<zone>.grafana.net/otlp
  • Honeycomb - https://api.honeycomb.io
  • Datadog - https://trace.agent.datadoghq.com
  • Axiom - https://api.axiom.co
  • AWS X-Ray - Via OpenTelemetry Collector
  • Azure Monitor - Via Azure Monitor Exporter
  • Google Cloud Trace - Via GCP exporter
  • Any OTLP-compatible collector

Enterprise Cloud Deployment

Grafana Cloud

Grafana Cloud provides a fully managed observability stack with a generous free tier. It works with any cloud provider.

Setup:

  1. Create a Grafana Cloud account at grafana.com
  2. Navigate to Connections > Add new connection > OpenTelemetry (OTLP)
  3. Generate an API token with metrics:write and traces:write scopes
  4. Configure MeetLoyd:

OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-east-0.grafana.net/otlp
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <base64-encoded-instance-id:api-token>"
OTEL_SERVICE_NAME=meetloyd
OTEL_SAMPLE_RATE=0.1
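
The `<base64-encoded-instance-id:api-token>` placeholder is the Base64 encoding of the instance ID and API token joined by a colon, as in standard HTTP Basic authentication. A minimal sketch of building the header value follows; `grafana_basic_auth_header` is a hypothetical helper and the credentials are made-up examples.

```python
import base64


def grafana_basic_auth_header(instance_id: str, api_token: str) -> str:
    """Build the value for OTEL_EXPORTER_OTLP_HEADERS using Basic auth."""
    credentials = f"{instance_id}:{api_token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return f"Authorization=Basic {encoded}"


# Example values only; use your own instance ID and token.
header = grafana_basic_auth_header("123456", "example-token")
```

The resulting string is what goes inside the quotes of `OTEL_EXPORTER_OTLP_HEADERS`. Treat it as a secret, since Base64 is an encoding and not encryption.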

Features:

  • Trace explorer with gen_ai.* attribute filtering
  • Pre-built dashboards for LLM metrics
  • Alerting on error rates and latency
  • 50GB free tier for traces

AWS (Amazon Web Services)

Option 1: AWS Distro for OpenTelemetry (ADOT) + X-Ray

Deploy the ADOT Collector as a sidecar or daemon:

# ECS Task Definition or EKS DaemonSet
OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_SERVICE_NAME=meetloyd

The ADOT Collector forwards traces to X-Ray and metrics to CloudWatch.

Option 2: Direct to Grafana Cloud from AWS

OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-east-0.grafana.net/otlp
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <token>"

AWS-specific attributes automatically captured:

  • cloud.provider: aws
  • cloud.region: deployment region
  • cloud.availability_zone: AZ if applicable

Azure

Option 1: Azure Monitor Application Insights

Azure Monitor supports OTLP ingestion directly:

OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://<region>.in.applicationinsights.azure.com/v2/track
OTEL_EXPORTER_OTLP_HEADERS="x-api-key=<instrumentation-key>"
OTEL_SERVICE_NAME=meetloyd

Option 2: Azure Monitor OpenTelemetry Collector

Deploy the OTel Collector with Azure Monitor exporter in AKS or Container Apps:

exporters:
  azuremonitor:
    connection_string: ${APPLICATIONINSIGHTS_CONNECTION_STRING}

Azure-specific benefits:

  • Native integration with Azure Monitor dashboards
  • Log Analytics queries across traces and metrics
  • Azure RBAC for access control

Google Cloud Platform (GCP)

Option 1: Cloud Trace Direct Export

OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://cloudtrace.googleapis.com
OTEL_SERVICE_NAME=meetloyd

# Use workload identity or service account
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

Option 2: OpenTelemetry Collector with GCP Exporter

Deploy on GKE or Cloud Run with the collector:

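# Receive OTLP over HTTP from MeetLoyd
receivers:
  otlp:
    protocols:
      http:

exporters:
  googlecloud:
    project: your-gcp-project

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [googlecloud]
    metrics:
      receivers: [otlp]
      exporters: [googlecloud]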

GCP-specific features:

  • Native Cloud Trace integration
  • BigQuery export for long-term analysis
  • Integration with Cloud Monitoring dashboards

Multi-Cloud / Hybrid

For organizations with multi-cloud deployments, we recommend:

  1. Grafana Cloud as the centralized backend (cloud-agnostic)
  2. OpenTelemetry Collector deployed in each cloud
  3. Same OTLP endpoint configured across all environments

# Same configuration works everywhere; only cloud.provider changes per cloud
OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-east-0.grafana.net/otlp
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <token>"
OTEL_SERVICE_NAME=meetloyd
OTEL_RESOURCE_ATTRIBUTES="deployment.environment=production,cloud.provider=aws"

This provides unified observability across AWS, Azure, and GCP deployments.
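
`OTEL_RESOURCE_ATTRIBUTES` is a comma-separated list of `key=value` pairs, and in a multi-cloud setup it is the one value that differs between environments. The sketch below shows one way to generate and parse it; `resource_attributes_for` and `parse_resource_attributes` are hypothetical helpers, and the parser ignores percent-encoding for brevity.

```python
from typing import Dict


def resource_attributes_for(cloud_provider: str, environment: str = "production") -> str:
    """Build the OTEL_RESOURCE_ATTRIBUTES value for one deployment."""
    return f"deployment.environment={environment},cloud.provider={cloud_provider}"


def parse_resource_attributes(raw: str) -> Dict[str, str]:
    """Split a key=value,key=value string into a dictionary."""
    attributes: Dict[str, str] = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # tolerate trailing commas
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        attributes[key.strip()] = value.strip()
    return attributes


# One value per cloud; everything else in the configuration stays the same.
per_cloud = {cloud: resource_attributes_for(cloud) for cloud in ("aws", "azure", "gcp")}
parsed_aws = parse_resource_attributes(per_cloud["aws"])
```

Tagging each deployment with its own `cloud.provider` lets a shared backend filter or group traces by cloud.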

Security Considerations

Data Sensitivity

Telemetry data may contain:

  • Agent and user identifiers
  • Model names and parameters
  • Token counts and costs
  • Execution timing information

Telemetry does not contain:

  • Prompt content or responses
  • User messages
  • Tool input/output data
  • PII or sensitive business data
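
One common way to enforce this kind of boundary is an allow-list applied before attributes are attached to a span. The source does not describe how MeetLoyd implements it, so the following is only a sketch of the technique: `ALLOWED_ATTRIBUTES` mirrors the attribute tables in this document, while `filter_attributes` and the `prompt.text` and `tool.input` keys are hypothetical.

```python
from typing import Any, Dict, Mapping

# Attribute keys documented for agent, LLM, and tool spans.
ALLOWED_ATTRIBUTES = frozenset({
    "gen_ai.agent.id",
    "gen_ai.agent.name",
    "gen_ai.provider.name",
    "gen_ai.operation.name",
    "gen_ai.request.model",
    "gen_ai.request.max_tokens",
    "gen_ai.request.temperature",
    "gen_ai.usage.input_tokens",
    "gen_ai.usage.output_tokens",
    "gen_ai.response.id",
    "gen_ai.response.finish_reasons",
    "gen_ai.tool.name",
    "gen_ai.tool.call.id",
    "gen_ai.tool.type",
    "meetloyd.tenant.id",
    "meetloyd.user.id",
})


def filter_attributes(attributes: Mapping[str, Any]) -> Dict[str, Any]:
    """Keep only allow-listed keys; anything else is dropped."""
    return {k: v for k, v in attributes.items() if k in ALLOWED_ATTRIBUTES}


candidate = {
    "gen_ai.agent.id": "agent-123",
    "gen_ai.usage.input_tokens": 120,
    "prompt.text": "hypothetical prompt content",  # hypothetical key, dropped
    "tool.input": {"query": "hypothetical"},       # hypothetical key, dropped
}
safe = filter_attributes(candidate)
```

An allow-list fails closed: a newly introduced attribute is excluded until someone deliberately adds it.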

Network Security

  • OTLP exports to remote backends use HTTPS (the local collector example uses plain HTTP on localhost)
  • Authentication headers supported for backend auth
  • No data exported when telemetry is disabled

Compliance

The OpenTelemetry implementation supports:

  • SOC 2: Provides audit trail of AI operations
  • ISO 27001: Enables monitoring and incident detection
  • GDPR: No PII in telemetry data by design

Graceful Degradation

When telemetry is disabled or misconfigured:

  • All tracing functions return no-ops
  • Negligible performance overhead, since no spans or metrics are recorded or exported
  • Application continues normally
  • Warning logged if enabled without endpoint
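
The no-op pattern behind this behavior can be sketched as follows. This is an illustration of the technique, not MeetLoyd's or the OpenTelemetry SDK's code; `SketchTracer`, `NoOpSpan`, `RecordingSpan`, and `run_agent` are hypothetical names. The point is that calling code is identical whether telemetry is on or off.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List


class NoOpSpan:
    """Accepts the same calls as a real span and does nothing."""

    def set_attribute(self, key: str, value: Any) -> None:
        pass


class RecordingSpan(NoOpSpan):
    """Keeps attributes in memory so they could be exported later."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attributes: Dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


class SketchTracer:
    def __init__(self, enabled: bool) -> None:
        self.enabled = enabled
        self.finished: List[RecordingSpan] = []

    @contextmanager
    def start_span(self, name: str) -> Iterator[NoOpSpan]:
        if not self.enabled:
            # Disabled: hand back a no-op and record nothing.
            yield NoOpSpan()
            return
        span = RecordingSpan(name)
        try:
            yield span
        finally:
            self.finished.append(span)


def run_agent(tracer: SketchTracer) -> str:
    # Application code does not branch on whether telemetry is enabled.
    with tracer.start_span("agent.execution") as span:
        span.set_attribute("gen_ai.agent.id", "agent-123")
        return "done"


disabled_tracer = SketchTracer(enabled=False)
enabled_tracer = SketchTracer(enabled=True)
disabled_result = run_agent(disabled_tracer)
enabled_result = run_agent(enabled_tracer)
```

Both calls return the same result; only the enabled tracer ends up with a finished span.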

Additional Observability

Beyond OpenTelemetry, MeetLoyd provides:

  • Structured Logging: Pino-based JSON logs with correlation IDs
  • Chain of Thought Logging: Full reasoning capture in database
  • Agent Run Tracking: Execution history in agent_runs table
  • Audit Logs: Security-relevant events for compliance

See Audit Logs and SIEM Integration for more.