OpenTelemetry GenAI semantic conventions define standardized gen_ai.* attributes for production-grade LLM and AI agent observability. These conventions (v1.37+) cover operation names, token metrics (input/output), latency measurements, and model metadata—enabling vendor-neutral tracing across all LLM providers. The GenAI SIG, formed in April 2024, maintains these conventions under CNCF oversight.
What You Will Learn
- GenAI semantic conventions — standardized gen_ai.* attributes for LLM spans, agent traces, and events
- Token metrics monitoring — input/output tokens, cost calculation, and TTFT/TPS latency tracking
- AI agent tracing — distributed tracing across multi-step workflows with context propagation
- OTel Collector pipeline — configuration for GenAI telemetry with processors and exporters
- Integration backends — Datadog, Grafana Tempo, Jaeger for visualizing GenAI traces
Understanding GenAI Semantic Conventions
The OpenTelemetry Generative AI Special Interest Group (GenAI SIG) was formed in April 2024 to standardize how LLM and AI agent operations are observed. These semantic conventions provide a vendor-neutral schema for describing AI operations—so your telemetry data stays consistent whether you use OpenAI, Anthropic, Google Gemini, or any other provider.
As of early 2026, the GenAI conventions cover four primary areas: LLM client spans for direct API calls, agent spans for multi-step workflows, events for capturing prompt/completion content, and metrics for aggregated measurements. Version 1.37 is the common baseline that backends target, and version 1.41.0 adds reasoning tokens and enhanced agent attributes.
Core gen_ai.* Attributes
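As a minimal sketch of the attribute set this article relies on, the helper below builds the four core attributes for one LLM call. The function name `genai_span_attributes` and the sample values are hypothetical; only the attribute keys come from the conventions discussed here.

```python
def genai_span_attributes(operation, model, input_tokens, output_tokens):
    """Build the core gen_ai.* attributes for a single LLM call."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return {
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }


# Sample values are illustrative, not taken from a real request.
attrs = genai_span_attributes("chat", "gpt-4o", 1200, 350)
for key, value in attrs.items():
    print(f"{key} = {value}")
```

With the OpenTelemetry Python API, a dictionary like this can be attached to an active span through `span.set_attributes(attrs)`.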
Token Metrics and Cost Monitoring
Token monitoring is the foundation of LLM cost observability. Because providers price by token, with different rates for input and output, tracking the two separately is essential. OpenTelemetry captures token usage through gen_ai.usage.input_tokens and gen_ai.usage.output_tokens, enabling trace-level cost transparency.
To calculate cost, multiply token counts by your model pricing rates. For example, OpenAI GPT-4o pricing (as of May 2026) is approximately $2.50/1M input tokens and $10/1M output tokens. Emit a custom gen_ai.usage.cost_usd metric using a pricing table you control.
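The arithmetic can be sketched as follows. The pricing table and the `cost_usd` helper are assumptions for illustration; the gpt-4o rates are the approximate figures quoted above and should be replaced with the rates you actually pay.

```python
from decimal import Decimal

# Hypothetical pricing table in USD per 1M tokens. The gpt-4o entry uses the
# approximate rates quoted in the text; keep your own table up to date.
PRICING_PER_MILLION = {
    "gpt-4o": {"input": Decimal("2.50"), "output": Decimal("10.00")},
}


def cost_usd(model, input_tokens, output_tokens):
    """Return the cost of one call in USD using the pricing table above."""
    rates = PRICING_PER_MILLION[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / Decimal(1_000_000)


# 1,200 input tokens and 350 output tokens on gpt-4o.
cost = cost_usd("gpt-4o", input_tokens=1200, output_tokens=350)
print(f"gen_ai.usage.cost_usd = {cost}")
```

Decimal is used instead of float so that small per-request costs add up without rounding drift when they are aggregated.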
Latency Metrics: TTFT and TPS
For streaming responses, Time to First Token (TTFT) and Tokens Per Second (TPS) matter more than total latency. TTFT measures perceived responsiveness: the lower it is, the sooner users see the first output. TPS measures generation speed once streaming begins.
The standard GenAI metric instruments include gen_ai.client.operation.duration for end-to-end latency, gen_ai.server.time_to_first_token for TTFT, and gen_ai.server.time_per_output_token, whose reciprocal gives TPS. Suggested SLO targets for chat: P50 latency under 500ms and P99 under 5s, with P99 total generation time under 30s.
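To make the relationship between these measurements concrete, here is a small sketch that derives TTFT, TPS, and time per output token from token arrival timestamps. The function `streaming_latency` and its input format are hypothetical; in a real client you would record the timestamps with a monotonic clock as chunks arrive.

```python
def streaming_latency(request_start, token_times):
    """Derive streaming latency figures from token arrival times in seconds."""
    if not token_times:
        raise ValueError("need at least one token timestamp")
    ttft = token_times[0] - request_start
    total = token_times[-1] - request_start
    # Generation speed is measured over the tokens after the first one.
    decode_time = token_times[-1] - token_times[0]
    later_tokens = len(token_times) - 1
    tps = later_tokens / decode_time if decode_time > 0 else None
    per_token = decode_time / later_tokens if later_tokens > 0 else None
    return {
        "ttft_s": ttft,
        "total_s": total,
        "tps": tps,
        "time_per_output_token_s": per_token,
    }


# Five tokens: the first after 0.5s, then one every 0.25s.
stats = streaming_latency(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(stats)
```

Excluding the first token from the TPS calculation keeps prompt processing time, which TTFT already captures, out of the generation speed figure.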
AI Agent Distributed Tracing
Agentic AI introduces complexity that traditional APM struggles to represent. A single user request may spawn multiple model calls, tool executions, and retrieval steps. OpenTelemetry handles this via W3C TraceContext headers (traceparent, tracestate), which instrumented HTTP clients and servers propagate automatically.
For queue-based communication, you must serialize the context into the message yourself. Agent spans use gen_ai.agent.* attributes such as gen_ai.agent.name and gen_ai.agent.id, while tool invocations carry gen_ai.tool.name. Together these enable full trace visualization across the agent workflow.
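For the queue-based case, the sketch below shows the idea using only the standard library: the producer writes a W3C traceparent value into the message headers and the consumer parses it back. The `publish` and `consume` helpers and the message layout are hypothetical. In practice the OpenTelemetry propagation API (`opentelemetry.propagate.inject` and `extract`) does this for you against a dictionary carrier.

```python
import json
import re
import secrets

# W3C traceparent, version 00: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def make_traceparent(trace_id, span_id, sampled=True):
    """Format a traceparent header value for the current span."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def publish(payload, traceparent):
    """Serialize a queue message that carries trace context in its headers."""
    return json.dumps({"headers": {"traceparent": traceparent}, "body": payload})


def consume(raw):
    """Parse a message and recover the parent trace context, if it is valid."""
    message = json.loads(raw)
    match = TRACEPARENT_RE.match(message["headers"].get("traceparent", ""))
    if match is None:
        return message["body"], None
    trace_id, parent_span_id, flags = match.groups()
    context = {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": int(flags, 16) & 1 == 1,
    }
    return message["body"], context


trace_id = secrets.token_hex(16)  # 32 hex characters
span_id = secrets.token_hex(8)    # 16 hex characters
raw = publish({"task": "summarize"}, make_traceparent(trace_id, span_id))
body, ctx = consume(raw)
print(body, ctx)
```

The consumer would start its own span with the recovered trace ID and parent span ID, so the worker's activity appears under the same trace as the request that enqueued it.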
Use auto-instrumentation libraries for LangChain, LangGraph, CrewAI, or OpenAI Agents SDK whenever possible. These libraries handle context propagation and span creation automatically, ensuring consistent distributed traces without custom code.
OTel Collector Pipeline for GenAI
The OpenTelemetry Collector provides a vendor-neutral telemetry pipeline that receives, processes, and exports data. For GenAI workloads, the collector normalizes attributes from different instrumentation libraries (OpenInference, OpenLLMetry) to the standard gen_ai.* conventions.
Collector Configuration
Configure the OTel Collector with the genainormalizer processor to transform attributes from popular frameworks to GenAI semantic conventions:
processors:
  genainormalizer:
    semconv_version: "1.41.0"
    profiles: [openinference, openllmetry]
    remove_originals: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [genainormalizer, batch]
      exporters: [otlp/backend]
The genainormalizer processor maps attributes from OpenInference and OpenLLMetry libraries (covering 60+ frameworks including LangChain, CrewAI, PydanticAI, Strands) to the OTel GenAI semantic conventions. It supports custom mappings and optional overwrite behavior.
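Conceptually, this kind of normalization is a key-renaming pass over span attributes. The sketch below illustrates the idea in Python; it is not the processor's implementation, and the mapping table is an illustrative assumption based on OpenInference-style attribute names rather than the processor's actual table. The `remove_originals` and `overwrite` parameters mirror the behaviors described above.

```python
# Illustrative mapping only; not the processor's actual mapping table.
OPENINFERENCE_TO_GENAI = {
    "llm.model_name": "gen_ai.request.model",
    "llm.token_count.prompt": "gen_ai.usage.input_tokens",
    "llm.token_count.completion": "gen_ai.usage.output_tokens",
}


def normalize(attributes, mapping, remove_originals=True, overwrite=False):
    """Return a copy of the attributes with mapped keys renamed to gen_ai.*."""
    result = dict(attributes)
    for source_key, target_key in mapping.items():
        if source_key not in attributes:
            continue
        # Keep an existing gen_ai.* value unless overwrite is requested.
        if overwrite or target_key not in result:
            result[target_key] = attributes[source_key]
        if remove_originals:
            result.pop(source_key, None)
    return result


span_attrs = {
    "llm.model_name": "gpt-4o",
    "llm.token_count.prompt": 1200,
    "llm.token_count.completion": 350,
    "http.method": "POST",
}
normalized = normalize(span_attrs, OPENINFERENCE_TO_GENAI)
print(normalized)
```

Attributes outside the mapping, such as `http.method` here, pass through unchanged, which is the behavior you would want from any normalization step placed in a shared pipeline.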
Integration: Datadog and Grafana
Datadog LLM Observability natively supports OpenTelemetry GenAI Semantic Conventions (v1.37+). This allows you to instrument once with OTel, export via your existing collector pipeline, and analyze GenAI spans directly in Datadog—no code changes required.
Similarly, Grafana Cloud Traces (powered by Tempo) provides full distributed tracing visualization. You can correlate agent traces with metrics and logs for faster root cause analysis. The OpenTelemetry Python instrumentation for OpenAI Agents SDK automatically emits GenAI-compliant spans.
Implementation Checklist
1. Add gen_ai.* attributes to spans. Configure your LLM SDK or auto-instrumentation to emit gen_ai.operation.name, gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens on every span.
2. Set up OTel Collector with genainormalizer. Deploy an OTel Collector that receives OTLP data, normalizes attributes to GenAI semconv v1.41.0, and exports to your backend.
3. Configure metrics and cost tracking. Create histograms for gen_ai.client.operation.duration to capture latency percentiles. Emit gen_ai.usage.cost_usd using your model pricing table.
4. Connect observability backend. Export to Datadog, Grafana Tempo, Jaeger, or Honeycomb. Verify traces appear with correct gen_ai.* attributes.
5. Set SLOs and alerts. Define latency thresholds (P50 < 500ms, P99 < 5s), error rate limits (1-2%), and cost alerts when token usage spikes 2x baseline.
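The SLO and alert thresholds above can be expressed as a simple check. This is a sketch under stated assumptions: the helper names, the nearest-rank percentile method, the 2% error ceiling, and the sample data are all illustrative, and a production setup would normally evaluate these rules in the observability backend rather than in application code.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def check_slos(latencies_ms, errors, requests, tokens_now, tokens_baseline):
    """Evaluate the latency, error rate, and token spike rules."""
    return {
        "p50_ok": percentile(latencies_ms, 50) < 500,
        "p99_ok": percentile(latencies_ms, 99) < 5000,
        "error_rate_ok": errors / requests <= 0.02,
        "token_spike": tokens_now >= 2 * tokens_baseline,
    }


# Made-up latency samples in milliseconds, with two slow outliers.
latencies_ms = [220, 310, 280, 450, 390, 610, 300, 350, 4200, 5300]
report = check_slos(
    latencies_ms,
    errors=1,
    requests=100,
    tokens_now=90_000,
    tokens_baseline=40_000,
)
print(report)
```

In this sample the median is healthy but the slowest request breaches the P99 limit, and token usage has more than doubled against the baseline, so two of the four rules would raise an alert.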
Final Verdict
OpenTelemetry GenAI semantic conventions provide the vendor-neutral standard that production AI systems need. By adopting gen_ai.* attributes in v1.37+ (or v1.41.0 for reasoning tokens), teams achieve consistent observability across providers, enabling cost tracking, latency analysis, and distributed tracing for AI agents. The OTel Collector pipeline with genainormalizer ensures compatibility across 60+ frameworks—making this the foundation for enterprise-grade LLM and agent observability in 2026.
Last Updated: May 06, 2026 | Source: OpenTelemetry Official Documentation (opentelemetry.io)