What are OpenTelemetry GenAI semantic conventions?

GenAI semantic conventions are standardized gen_ai.* attributes for LLM and AI agent observability, maintained by the OpenTelemetry GenAI SIG (formed April 2024). They provide a vendor-neutral schema covering operation names, token metrics, latency, and model metadata.

What are the core gen_ai.* attributes?

Key attributes include gen_ai.operation.name (chat, text_completion), gen_ai.request.model (model name), gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.usage.reasoning.output_tokens (v1.41+), and gen_ai.provider.name (openai, anthropic, gcp.vertex_ai).

How do you monitor token usage and cost with OpenTelemetry?

Track token usage via gen_ai.usage.input_tokens and gen_ai.usage.output_tokens separately, because input and output tokens are priced differently. Calculate cost by multiplying each token count by the model's rate, then emit it as a custom gen_ai.usage.cost_usd value driven by your own pricing table. Latency is tracked via gen_ai.client.operation.duration.

What are TTFT and TPS in LLM latency monitoring?

Time to First Token (TTFT) measures perceived responsiveness from request to first token. Tokens Per Second (TPS) measures generation speed. For streaming, TTFT matters more than total latency. Recommended: P50 TTFT < 500ms, P99 < 5s for chat.

How do you implement distributed tracing for AI agents?

Use W3C TraceContext headers (traceparent, tracestate) for distributed tracing across multi-step agent workflows. Agent spans use gen_ai.agent.name, gen_ai.agent.id, and gen_ai.tool.name attributes. Auto-instrumentation libraries for LangChain, LangGraph, and the OpenAI Agents SDK handle span creation and context propagation for you.

How do you configure an OTel Collector pipeline for GenAI?

Configure the OTel Collector with the genainormalizer processor to map attributes from OpenInference and OpenLLMetry libraries (60+ frameworks) to GenAI semconv v1.41.0. Set the semconv_version, profiles, and remove_originals options.

OpenTelemetry GenAI Semantic Conventions 2026: End-to-End Implementation Guide for LLM & Agent Observability

Complete guide to implementing OpenTelemetry GenAI semantic conventions for LLM and AI agent observability with token metrics, tracing, and OTel Collector pipeline

Sk Jabedul Haque

May 6, 2026 • 5 min read • 1078 views

OpenTelemetry GenAI semantic conventions define standardized gen_ai.* attributes for production-grade LLM and AI agent observability. These conventions (v1.37+) cover operation names, token metrics (input/output), latency measurements, and model metadata—enabling vendor-neutral tracing across all LLM providers. The GenAI SIG, formed in April 2024, maintains these conventions under CNCF oversight.

What You Will Learn

GenAI semantic conventions — standardized gen_ai.* attributes for LLM spans, agent traces, and events
Token metrics monitoring — input/output tokens, cost calculation, and latency TTFT/TPS tracking
AI agent tracing — distributed tracing across multi-step workflows with context propagation
OTel Collector pipeline — configuration for GenAI telemetry with processors and exporters
Integration backends — Datadog, Grafana Tempo, Jaeger for visualizing GenAI traces

Understanding GenAI Semantic Conventions

The OpenTelemetry Generative AI Special Interest Group (GenAI SIG) was formed in April 2024 to standardize how LLM and AI agent operations are observed. These semantic conventions provide a vendor-neutral schema for describing AI operations—so your telemetry data stays consistent whether you use OpenAI, Anthropic, Google Gemini, or any other provider.

As of early 2026, the GenAI conventions cover four primary areas: LLM client spans for direct API calls, agent spans for multi-step workflows, events for capturing prompt/completion content, and metrics for aggregated measurements. Version 1.37 serves as the common baseline that backends support, and version 1.41.0 adds reasoning tokens and enhanced agent attributes.

Core gen_ai.* Attributes

Attribute	Type	Description	Example
gen_ai.operation.name	string	Type of LLM operation	chat, text_completion
gen_ai.request.model	string	Model name for request	gpt-4o, claude-sonnet-4-20250514
gen_ai.response.model	string	Model that generated response	gpt-4o-2024-08-06
gen_ai.provider.name	string	LLM provider identifier	openai, anthropic, gcp.vertex_ai
gen_ai.usage.input_tokens	int	Prompt token count	150
gen_ai.usage.output_tokens	int	Completion token count	200
gen_ai.usage.reasoning.output_tokens	int	Reasoning chain tokens (v1.41+)	512
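To make the table concrete, here is a minimal sketch of assembling these attributes for a single chat call. The helper names (`build_chat_span_attributes`, `span_name`) are hypothetical and not part of any SDK. The sketch assumes you would pass the resulting dictionary to your tracer, for example through the OpenTelemetry Python API's `span.set_attributes(...)`.

```python
from typing import Dict, Optional, Union

AttrValue = Union[str, int]


def build_chat_span_attributes(
    provider: str,
    request_model: str,
    response_model: str,
    input_tokens: int,
    output_tokens: int,
    reasoning_tokens: Optional[int] = None,
) -> Dict[str, AttrValue]:
    """Return gen_ai.* attributes for one chat call (hypothetical helper)."""
    attrs: Dict[str, AttrValue] = {
        "gen_ai.operation.name": "chat",
        "gen_ai.provider.name": provider,
        "gen_ai.request.model": request_model,
        "gen_ai.response.model": response_model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }
    # Only emit the reasoning count when the provider actually reports one.
    if reasoning_tokens is not None:
        attrs["gen_ai.usage.reasoning.output_tokens"] = reasoning_tokens
    return attrs


def span_name(attrs: Dict[str, AttrValue]) -> str:
    """Build a span name in the "{operation} {request model}" form."""
    return f'{attrs["gen_ai.operation.name"]} {attrs["gen_ai.request.model"]}'


if __name__ == "__main__":
    example = build_chat_span_attributes(
        "openai", "gpt-4o", "gpt-4o-2024-08-06", 150, 200
    )
    print(span_name(example), example)
```

Keeping the request model and response model as separate attributes matters in practice, since a provider may resolve an alias such as gpt-4o to a dated snapshot.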

Token Metrics and Cost Monitoring

Token monitoring is the foundation of LLM cost observability. Since LLMs price by token—differently for input and output—tracking these separately is essential. OpenTelemetry captures token usage through gen_ai.usage.input_tokens and gen_ai.usage.output_tokens, enabling trace-level cost transparency.

To calculate cost, multiply token counts by your model pricing rates. For example, OpenAI GPT-4o pricing (as of May 2026) is approximately $2.50/1M input tokens and $10/1M output tokens. Emit a custom gen_ai.usage.cost_usd metric using a pricing table you control.

Model	Direction	Price per 1M tokens
GPT-4o	Input	$2.50
GPT-4o	Output	$10
GPT-4o Mini	Input	$0.15
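The cost arithmetic can be sketched as follows. This is an illustration under stated assumptions: the `PRICING` table and the `estimate_cost_usd` function are hypothetical names, and the table contains only the GPT-4o rates quoted above (GPT-4o Mini is left out because only its input rate is given). The result is what you would record as the custom gen_ai.usage.cost_usd value.

```python
from decimal import Decimal
from typing import Dict, Tuple

# USD per 1M tokens as (input_rate, output_rate).
# Assumption: rates copied from the article; maintain your own table in production.
PRICING: Dict[str, Tuple[Decimal, Decimal]] = {
    "gpt-4o": (Decimal("2.50"), Decimal("10.00")),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(
    model: str,
    input_tokens: int,
    output_tokens: int,
    pricing: Dict[str, Tuple[Decimal, Decimal]] = PRICING,
) -> Decimal:
    """Price input and output tokens separately, then sum.

    Raises KeyError for a model that is missing from the pricing table,
    so an unknown model never gets silently priced at zero.
    """
    input_rate, output_rate = pricing[model]
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # 150 input tokens and 200 output tokens on gpt-4o.
    print(estimate_cost_usd("gpt-4o", 150, 200))
```

Decimal is used instead of float so that very small per-request costs add up without rounding drift when you aggregate them.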

Latency Metrics: TTFT and TPS

For streaming responses, Time to First Token (TTFT) and Tokens Per Second (TPS) matter more than total latency. TTFT measures perceived responsiveness—users see the first output faster with lower TTFT. TPS measures generation speed once streaming begins.
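Both measurements can be derived from three timestamps taken around a streamed response. The sketch below is illustrative only: `StreamTiming` and the function names are hypothetical, and TPS is assumed here to mean tokens generated after the first one, divided by the time between the first and last token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamTiming:
    """Timestamps in seconds from a monotonic clock (hypothetical structure)."""

    request_sent: float
    first_token: float
    last_token: float
    output_tokens: int


def ttft_seconds(timing: StreamTiming) -> float:
    """Time to First Token: delay between sending the request and the first token."""
    return timing.first_token - timing.request_sent


def tokens_per_second(timing: StreamTiming) -> float:
    """Generation speed once streaming has begun.

    The first token marks the start of the window, so only the
    remaining tokens are counted against the elapsed time.
    """
    window = timing.last_token - timing.first_token
    if window <= 0 or timing.output_tokens <= 1:
        return 0.0
    return (timing.output_tokens - 1) / window


def time_per_output_token(timing: StreamTiming) -> float:
    """Reciprocal view of TPS: average seconds per token after the first."""
    tps = tokens_per_second(timing)
    return 0.0 if tps == 0.0 else 1.0 / tps


if __name__ == "__main__":
    timing = StreamTiming(request_sent=10.0, first_token=10.5, last_token=14.5, output_tokens=201)
    print(ttft_seconds(timing), tokens_per_second(timing))
```

In a real client, you would capture these timestamps with `time.monotonic()` around the streaming loop.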

The standard GenAI metric instruments include gen_ai.client.operation.duration for end-to-end latency, plus the server-side gen_ai.server.time_to_first_token for TTFT and gen_ai.server.time_per_output_token, whose reciprocal gives TPS. Recommended SLO targets for chat: P50 TTFT under 500 ms, P99 TTFT under 5 s, and P99 total generation under 30 s.
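A duration histogram such as gen_ai.client.operation.duration groups each observation into a bucket, and percentiles are later estimated from the bucket counts. The sketch below shows that bucketing with the standard library only. The exponential boundaries (0.01 s doubling up to 81.92 s) reflect my recollection of the advisory boundaries for this metric and should be treated as an assumption to confirm against the specification. `bucket_counts` is a hypothetical helper, not an SDK call.

```python
from bisect import bisect_left
from typing import List, Sequence

# Assumed boundaries in seconds: 0.01, 0.02, 0.04, ... 81.92 (14 values).
BOUNDARIES: List[float] = [0.01 * 2**i for i in range(14)]


def bucket_counts(
    durations_s: Sequence[float],
    boundaries: Sequence[float] = BOUNDARIES,
) -> List[int]:
    """Count observations per bucket, with upper-inclusive bucket edges.

    Bucket i holds values v where boundaries[i-1] < v <= boundaries[i].
    The final bucket holds everything above the last boundary.
    """
    counts = [0] * (len(boundaries) + 1)
    for duration in durations_s:
        counts[bisect_left(boundaries, duration)] += 1
    return counts


if __name__ == "__main__":
    print(bucket_counts([0.005, 0.5, 0.9, 3.0, 100.0]))
```

Because percentiles are interpolated within a bucket, boundaries that are too coarse around your SLO thresholds make P50 and P99 estimates imprecise. It is worth checking that a boundary sits near each threshold you alert on.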

AI Agent Distributed Tracing

Agentic AI introduces complexity that traditional APM cannot handle. A single user request may spawn multiple model calls, tool executions, and retrieval steps. OpenTelemetry handles this via W3C TraceContext headers (traceparent, tracestate), which instrumented HTTP clients and servers propagate automatically.

For queue-based communication, you must serialize the context into the message yourself, because there are no HTTP headers to carry it. Agent spans use gen_ai.agent.* attributes such as gen_ai.agent.name and gen_ai.agent.id, and tool invocations use gen_ai.tool.name. Together these enable full trace visualization across the agent workflow.
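For the queue case, the idea is to carry the traceparent value inside the message so that the consumer can continue the same trace. The following is a minimal standard-library sketch of that idea. The message shape (a dict with a "headers" key) and all function names are assumptions made for illustration. In an instrumented service you would more likely call `inject` and `extract` from `opentelemetry.propagate` with the message headers as the carrier.

```python
import re
import secrets
from typing import Any, Dict, Optional, Tuple

# Version 00 of the W3C traceparent format: version-traceid-parentid-flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span_id() -> str:
    """Random 8-byte span ID as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Serialize trace context into a traceparent string."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def inject_into_message(message: Dict[str, Any], traceparent: str) -> Dict[str, Any]:
    """Return a copy of the message with the context stored in its headers."""
    headers = dict(message.get("headers", {}))
    headers["traceparent"] = traceparent
    return {**message, "headers": headers}


def extract_from_message(message: Dict[str, Any]) -> Optional[Tuple[str, str, bool]]:
    """Return (trace_id, parent_span_id, sampled), or None if absent or invalid."""
    value = message.get("headers", {}).get("traceparent", "")
    match = _TRACEPARENT.match(value)
    if match is None:
        return None
    trace_id, parent_span_id, flags = match.groups()
    # All-zero IDs are invalid in the W3C format.
    if set(trace_id) == {"0"} or set(parent_span_id) == {"0"}:
        return None
    return trace_id, parent_span_id, bool(int(flags, 16) & 0x01)


if __name__ == "__main__":
    parent = make_traceparent("4bf92f3577b34da6a3ce929d0e0e4736", new_span_id())
    queued = inject_into_message({"body": "run retrieval step"}, parent)
    print(extract_from_message(queued))
```

The consumer starts its own span with the extracted trace ID and uses the extracted span ID as its parent, which is what keeps the tool call attached to the original user request in the trace view. This sketch ignores tracestate and future traceparent versions for brevity.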

Professional Recommendation

Use auto-instrumentation libraries for LangChain, LangGraph, CrewAI, or OpenAI Agents SDK whenever possible. These libraries handle context propagation and span creation automatically, ensuring consistent distributed traces without custom code.

OTel Collector Pipeline for GenAI

The OpenTelemetry Collector provides a vendor-neutral telemetry pipeline that receives, processes, and exports data. For GenAI workloads, the collector normalizes attributes from different instrumentation libraries (OpenInference, OpenLLMetry) to the standard gen_ai.* conventions.

Collector Configuration

Configure the OTel Collector with the genainormalizer processor to transform attributes from popular frameworks to GenAI semantic conventions:

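receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  genainormalizer:
    semconv_version: "1.41.0"
    profiles: [openinference, openllmetry]
    remove_originals: true
  batch:

exporters:
  otlp/backend:
    endpoint: backend.example.com:4317  # placeholder; replace with your backend's OTLP endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [genainormalizer, batch]
      exporters: [otlp/backend]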

The genainormalizer processor maps attributes from OpenInference and OpenLLMetry libraries (covering 60+ frameworks including LangChain, CrewAI, PydanticAI, Strands) to the OTel GenAI semantic conventions. It supports custom mappings and optional overwrite behavior.
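Conceptually, this kind of normalization is a key-renaming pass over span attributes. The sketch below illustrates the idea in Python. It is not the processor's implementation, and the mapping table is an illustrative assumption: the source keys are my understanding of names those libraries have used, not the processor's actual profile contents. The `remove_originals` and `overwrite` parameters mirror the options described above.

```python
from typing import Any, Dict, Iterable, Mapping

# Illustrative mappings only (assumed source keys, hypothetical table).
ILLUSTRATIVE_MAPPINGS: Dict[str, Dict[str, str]] = {
    "openinference": {
        "llm.model_name": "gen_ai.request.model",
        "llm.token_count.prompt": "gen_ai.usage.input_tokens",
        "llm.token_count.completion": "gen_ai.usage.output_tokens",
    },
    "openllmetry": {
        "gen_ai.usage.prompt_tokens": "gen_ai.usage.input_tokens",
        "gen_ai.usage.completion_tokens": "gen_ai.usage.output_tokens",
    },
}


def normalize_attributes(
    attributes: Mapping[str, Any],
    profiles: Iterable[str],
    remove_originals: bool = True,
    overwrite: bool = False,
) -> Dict[str, Any]:
    """Rename library-specific keys to gen_ai.* keys, leaving other keys untouched."""
    result = dict(attributes)
    for profile in profiles:
        for source_key, target_key in ILLUSTRATIVE_MAPPINGS[profile].items():
            if source_key not in attributes:
                continue
            # Keep an existing gen_ai.* value unless overwrite is requested.
            if overwrite or target_key not in result:
                result[target_key] = attributes[source_key]
            if remove_originals:
                result.pop(source_key, None)
    return result


if __name__ == "__main__":
    raw = {"llm.model_name": "gpt-4o", "llm.token_count.prompt": 150, "http.method": "POST"}
    print(normalize_attributes(raw, ["openinference"]))
```

Doing this in the collector rather than in application code means each service can keep whichever instrumentation library it already uses, while the backend only ever sees one attribute schema.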

Integration: Datadog and Grafana

Datadog LLM Observability natively supports OpenTelemetry GenAI Semantic Conventions (v1.37+). This allows you to instrument once with OTel, export via your existing collector pipeline, and analyze GenAI spans directly in Datadog—no code changes required.

Similarly, Grafana Cloud Traces (powered by Tempo) provides full distributed tracing visualization. You can correlate agent traces with metrics and logs for faster root cause analysis. The OpenTelemetry Python instrumentation for OpenAI Agents SDK automatically emits GenAI-compliant spans.

Platform	Best For	Setup Complexity
Datadog LLM Observability	Enterprise monitoring with cost analysis	Low — uses existing OTel Collector
Grafana Tempo	Open-source tracing with metrics correlation	Medium — self-hosted options
Jaeger	Lightweight development/testing	Low — single binary
Honeycomb	Query-focused debugging	Medium — requires cloud account

Implementation Checklist

Add gen_ai.* attributes to spans

Configure your LLM SDK or auto-instrumentation to emit gen_ai.operation.name, gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens on every LLM call span.

Set up OTel Collector with genainormalizer

Deploy an OTel Collector that receives OTLP data, normalizes attributes to GenAI semconv v1.41.0, and exports to your backend.

Configure metrics and cost tracking

Create histograms for gen_ai.client.operation.duration to capture latency percentiles. Emit gen_ai.usage.cost_usd using your model pricing table.

Connect observability backend

Export to Datadog, Grafana Tempo, Jaeger, or Honeycomb. Verify traces appear with correct gen_ai.* attributes.

Set SLOs and alerts

Define latency thresholds (P50 TTFT < 500 ms, P99 TTFT < 5 s), an error-rate limit (1-2%), and cost alerts that fire when token usage reaches 2x baseline.
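The alert conditions in this last step can be sketched as a small evaluation function. This is an illustration under stated assumptions: the latency thresholds are applied to TTFT samples, the error-rate limit uses the upper end of the 1-2% range, percentiles use the simple nearest-rank method, and every name below is hypothetical. A production setup would normally express these rules in the backend's alerting system rather than in application code.

```python
import math
from typing import Dict, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; raises ValueError on an empty sample."""
    if not values:
        raise ValueError("percentile requires at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def evaluate_slos(
    ttft_samples_s: Sequence[float],
    error_count: int,
    request_count: int,
    tokens_current: int,
    tokens_baseline: int,
    p50_limit_s: float = 0.5,
    p99_limit_s: float = 5.0,
    max_error_rate: float = 0.02,
    spike_factor: float = 2.0,
) -> Dict[str, bool]:
    """Return one boolean per rule; the *_ok keys are True when the SLO holds."""
    return {
        "p50_ttft_ok": percentile(ttft_samples_s, 50) < p50_limit_s,
        "p99_ttft_ok": percentile(ttft_samples_s, 99) < p99_limit_s,
        "error_rate_ok": (error_count / request_count) <= max_error_rate,
        # True means the cost alert should fire.
        "token_spike": tokens_current >= spike_factor * tokens_baseline,
    }


if __name__ == "__main__":
    print(evaluate_slos([0.2, 0.3, 0.4, 0.6, 6.0], 1, 100, 2500, 1000))
```

With only a handful of samples, the P99 is simply the slowest request, so evaluate these rules over a window large enough for the tail percentile to be meaningful.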

Final Verdict

OpenTelemetry GenAI semantic conventions provide the vendor-neutral standard that production AI systems need. By adopting gen_ai.* attributes in v1.37+ (or v1.41.0 for reasoning tokens), teams achieve consistent observability across providers, enabling cost tracking, latency analysis, and distributed tracing for AI agents. The OTel Collector pipeline with genainormalizer ensures compatibility across 60+ frameworks—making this the foundation for enterprise-grade LLM and agent observability in 2026.

Last Updated: May 06, 2026 | Source: OpenTelemetry Official Documentation (opentelemetry.io)
