OpenTelemetry

The otelMiddleware factory wires TanStack AI into your existing OpenTelemetry setup. Every chat() call produces a root span, one child span per agent-loop iteration, and one grandchild span per tool call — all with GenAI semantic-convention attributes. It also records GenAI token and duration histograms when a Meter is provided.

Setup

Install @opentelemetry/api — it's an optional peer dependency of @tanstack/ai:

sh
pnpm add @opentelemetry/api

Wire up your OTel SDK however you already do (e.g. @opentelemetry/sdk-node). Then pass a Tracer (and optionally a Meter) into the middleware. The OTel middleware lives on its own subpath, so applications that never import it are unaffected:

ts
import { chat } from '@tanstack/ai'
import { otelMiddleware } from '@tanstack/ai/middlewares/otel'
import { openaiText } from '@tanstack/ai-openai'
import { trace, metrics } from '@opentelemetry/api'

const otel = otelMiddleware({
  tracer: trace.getTracer('my-app'),
  meter: metrics.getMeter('my-app'),
})

const result = await chat({
  adapter: openaiText('gpt-4o'),
  messages: [{ role: 'user', content: 'hi' }],
  middleware: [otel],
  stream: false,
})

What gets emitted

Spans

text
chat gpt-4o              (root, kind: INTERNAL)
├── chat gpt-4o #0       (iteration, kind: CLIENT)
│   ├── execute_tool get_weather
│   └── execute_tool get_time
└── chat gpt-4o #1       (iteration, kind: CLIENT)

Iteration spans are numbered (#0, #1, ...) so distinct iterations of the same chat are easy to pick apart in trace viewers.
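
The naming scheme in the tree above can be pictured as a small pure function. This is only a sketch: `SpanNameInfo` and `defaultSpanName` are hypothetical names introduced for illustration, and the real `info` object passed by the middleware may be shaped differently.

```typescript
// Hypothetical, simplified description of a span; not the library's own type.
type SpanNameInfo =
  | { kind: 'chat'; model: string }
  | { kind: 'iteration'; model: string; iteration: number }
  | { kind: 'tool'; toolName: string }

// Produces names matching the tree shown above.
function defaultSpanName(info: SpanNameInfo): string {
  switch (info.kind) {
    case 'chat':
      return `chat ${info.model}`
    case 'iteration':
      // The zero-based suffix keeps iterations of one chat distinguishable.
      return `chat ${info.model} #${info.iteration}`
    case 'tool':
      return `execute_tool ${info.toolName}`
  }
}

const rootName = defaultSpanName({ kind: 'chat', model: 'gpt-4o' })
const iterationName = defaultSpanName({ kind: 'iteration', model: 'gpt-4o', iteration: 1 })
const toolName = defaultSpanName({ kind: 'tool', toolName: 'get_weather' })
```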

Attribute reference

| Level | Attribute | Value |
| --- | --- | --- |
| root / iteration | gen_ai.system | openai, anthropic, ... |
| iteration | gen_ai.operation.name | chat |
| root / iteration | gen_ai.request.model | requested model |
| iteration | gen_ai.response.model | actual model |
| iteration | gen_ai.request.temperature | from config |
| iteration | gen_ai.request.top_p | from config |
| iteration | gen_ai.request.max_tokens | from config |
| iteration | gen_ai.usage.input_tokens | per iteration |
| iteration | gen_ai.usage.output_tokens | per iteration |
| root / iteration | gen_ai.usage.total_tokens | provider-reported total |
| root / iteration | gen_ai.usage.cost | provider-reported cost, when available |
| root / iteration | gen_ai.usage.cache_read.input_tokens | cached prompt tokens, when reported |
| root / iteration | gen_ai.usage.cache_creation.input_tokens | cache-write prompt tokens, when reported |
| root / iteration | gen_ai.usage.reasoning.output_tokens | reasoning/thinking tokens, when reported |
| root / iteration | tanstack.ai.usage.duration_seconds | duration-based billing (e.g. transcription), when reported |
| root / iteration | tanstack.ai.usage.upstream_cost | gateway upstream cost (e.g. OpenRouter), when reported |
| root / iteration | tanstack.ai.usage.upstream_input_cost | upstream input cost split, when reported |
| root / iteration | tanstack.ai.usage.upstream_output_cost | upstream output cost split, when reported |
| iteration | gen_ai.response.finish_reasons | [stop], [tool_calls], ... |
| root | gen_ai.usage.input_tokens | rolled up |
| root | gen_ai.usage.output_tokens | rolled up |
| root | tanstack.ai.iterations | iteration count |
| tool | gen_ai.tool.name | tool name |
| tool | gen_ai.tool.call.id | tool call id |
| tool | gen_ai.tool.type | function |
| tool | tanstack.ai.tool.outcome | success / error |

Usage attributes beyond input/output tokens are emitted only when the provider reports them, so spans stay clean otherwise. Cache and reasoning breakdowns use the official GenAI semconv names; gen_ai.usage.cost and gen_ai.usage.total_tokens are de-facto extensions consumed directly by backends like PostHog — without them, backends re-derive cost from their own price tables and lose cache discounts and gateway markup. Fields with no established convention (duration-based billing, the upstream cost split) are TanStack-namespaced.
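
The "emitted only when reported" rule amounts to a mapping step that skips missing fields. The sketch below is illustrative, not the middleware's actual code: the `ReportedUsage` shape and the `usageAttributes` helper are hypothetical, and only the attribute keys are taken from the reference table.

```typescript
// Hypothetical shape for provider-reported usage; the field names are assumptions.
type ReportedUsage = {
  inputTokens?: number
  outputTokens?: number
  totalTokens?: number
  cost?: number
  cacheReadTokens?: number
  reasoningTokens?: number
}

// Attribute keys as listed in the attribute reference.
const USAGE_KEYS: Array<[keyof ReportedUsage, string]> = [
  ['inputTokens', 'gen_ai.usage.input_tokens'],
  ['outputTokens', 'gen_ai.usage.output_tokens'],
  ['totalTokens', 'gen_ai.usage.total_tokens'],
  ['cost', 'gen_ai.usage.cost'],
  ['cacheReadTokens', 'gen_ai.usage.cache_read.input_tokens'],
  ['reasoningTokens', 'gen_ai.usage.reasoning.output_tokens'],
]

function usageAttributes(usage: ReportedUsage): Record<string, number> {
  const attributes: Record<string, number> = {}
  for (const [field, key] of USAGE_KEYS) {
    const value = usage[field]
    // Skip anything the provider did not report so the span stays clean.
    if (value !== undefined) attributes[key] = value
  }
  return attributes
}

// A provider that reports tokens and cost, but no cache or reasoning breakdown.
const usageAttrs = usageAttributes({ inputTokens: 12, outputTokens: 30, cost: 0.0004 })
```

Only the three reported fields end up in `usageAttrs`; the cache and reasoning keys are absent rather than set to zero.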

Metrics

Two GenAI-standard histograms:

  • gen_ai.client.operation.duration (seconds) — recorded once per chat() call, covering all agent-loop iterations and tool execution. On error or abort the record carries an error.type attribute (the thrown error's name, or "cancelled" for aborts).
  • gen_ai.client.token.usage (tokens) — recorded once per iteration (two records: input and output), tagged with gen_ai.token.type.

Both gen_ai.response.id and gen_ai.response.model are deliberately excluded from metric attributes to keep cardinality low (per-request custom-model names and request IDs would blow up the series set).
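
To make the shape of these records concrete, here is a rough sketch of what one iteration and one failed call might contribute. The helper names (`tokenUsageRecords`, `errorTypeFor`) are hypothetical, and the exact set of attributes the middleware attaches to metrics is an assumption; only `gen_ai.token.type`, `error.type`, and the exclusion of response id and response model come from the description above.

```typescript
type MetricRecord = { value: number; attributes: Record<string, string> }

// One iteration yields two gen_ai.client.token.usage records.
function tokenUsageRecords(
  requestModel: string,
  inputTokens: number,
  outputTokens: number,
): MetricRecord[] {
  // Assumed attribute set: low-cardinality values only,
  // so no gen_ai.response.id and no gen_ai.response.model.
  const base = {
    'gen_ai.operation.name': 'chat',
    'gen_ai.request.model': requestModel,
  }
  return [
    { value: inputTokens, attributes: { ...base, 'gen_ai.token.type': 'input' } },
    { value: outputTokens, attributes: { ...base, 'gen_ai.token.type': 'output' } },
  ]
}

// error.type for the duration record: "cancelled" for aborts,
// otherwise the thrown error's name.
function errorTypeFor(error: unknown, aborted: boolean): string {
  if (aborted) return 'cancelled'
  // The fallback for non-Error throwables is an assumption of this sketch.
  return error instanceof Error ? error.name : 'Error'
}

const tokenRecords = tokenUsageRecords('gpt-4o', 12, 30)
const failureType = errorTypeFor(new TypeError('bad response'), false)
const abortType = errorTypeFor(new Error('aborted'), true)
```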

Privacy: capturing prompts and completions

By default, only metadata lands on spans. To record prompt and completion content, set captureContent: true. Content is captured as OTel span events following the GenAI convention:

  • gen_ai.user.message, gen_ai.system.message, gen_ai.assistant.message, gen_ai.tool.message, gen_ai.choice

Pass a redact function to strip PII before anything is recorded:

ts
otelMiddleware({
  tracer,
  captureContent: true,
  redact: (text) => text.replace(/\b\d{3}-\d{2}-\d{4}\b/g, '[SSN]'),
})

If redact throws, the middleware writes the literal sentinel "[redaction_failed]" into the span event and logs a warning — it never falls back to the raw content. This is the load-bearing invariant for users who ship traces to third-party backends: a broken redactor should shut off capture, not leak prompts.
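
The fail-closed behavior can be sketched as a thin wrapper around the user's redactor. This is a minimal illustration under stated assumptions, not the middleware's source: `safeRedact` and its `warn` parameter are hypothetical names, while the sentinel string is the one quoted above.

```typescript
const REDACTION_FAILED = '[redaction_failed]'

// Runs the user-supplied redactor and never returns the raw text if it throws.
function safeRedact(
  text: string,
  redact: (text: string) => string,
  warn: (message: string) => void = console.warn,
): string {
  try {
    return redact(text)
  } catch {
    // Log a fixed message only: the error itself might embed the raw content.
    warn('redact callback threw; content replaced with sentinel')
    return REDACTION_FAILED
  }
}

const redactSsn = (text: string) => text.replace(/\b\d{3}-\d{2}-\d{4}\b/g, '[SSN]')
const brokenRedactor = (_text: string): string => {
  throw new Error('boom')
}

const redacted = safeRedact('ssn 123-45-6789', redactSsn)
const failedClosed = safeRedact('my secret prompt', brokenRedactor, () => {})
```

The key design point is that the `catch` branch has no path back to `text`: a failing redactor can only ever produce the sentinel.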

Accumulated assistant text (the gen_ai.choice event) is capped at maxContentLength characters (default 100 000); longer completions are truncated with a trailing "…" marker.
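
As a rough illustration of the cap, the sketch below truncates by string length and appends the marker. `truncateContent` is a hypothetical helper, and whether the marker counts toward the limit in the real middleware is not stated here; this version appends it after the capped text.

```typescript
// Caps text at maxContentLength characters (UTF-16 code units in this sketch)
// and appends a trailing marker when anything was cut off.
function truncateContent(text: string, maxContentLength = 100_000): string {
  if (text.length <= maxContentLength) return text
  return text.slice(0, maxContentLength) + '…'
}

const shortText = truncateContent('abc', 4)
const longText = truncateContent('abcdef', 4)
```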

Multimodal content (images, audio, video, documents) is represented as placeholder strings ([image], [audio], ...) to preserve message order without dumping binary data onto spans. Use onSpanEnd if you need richer multimodal capture.
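
One way to picture the placeholder substitution is a flattening step over message parts. The `ContentPart` type and `flattenForCapture` function are hypothetical stand-ins for illustration, and joining parts with a space is an assumption of this sketch.

```typescript
// Hypothetical, simplified message-part shape; the library's real types may differ.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image' | 'audio' | 'video' | 'document' }

// Text is kept as-is; every binary part becomes a "[type]" placeholder,
// so message order survives without any payload landing on the span.
function flattenForCapture(parts: ContentPart[]): string {
  return parts
    .map((part) => (part.type === 'text' ? part.text : `[${part.type}]`))
    .join(' ')
}

const captured = flattenForCapture([
  { type: 'text', text: 'What is in this picture?' },
  { type: 'image' },
  { type: 'text', text: 'And in this clip?' },
  { type: 'audio' },
])
```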

Prompt/system/user message events fire from onConfig at the start of every iteration, which means the full conversation history (as the adapter will re-send it) is re-emitted on each iteration span. This mirrors what the provider actually sees on the wire.

Extension points

All four extensions are optional. Each wraps user code in try/catch — a thrown callback becomes a log line, never a broken chat.

spanNameFormatter(info)

Override default span names. info.kind is 'chat' | 'iteration' | 'tool'.

ts
otelMiddleware({
  tracer,
  spanNameFormatter: (info) =>
    info.kind === 'tool' ? `tool:${info.toolName}` : `chat:${info.ctx.model}`,
})

attributeEnricher(info)

Add custom attributes to every span. Fires once per span.

ts
otelMiddleware({
  tracer,
  attributeEnricher: () => ({
    'tenant.id': getCurrentTenant(),
  }),
})

onBeforeSpanStart(info, options)

Mutate SpanOptions immediately before tracer.startSpan(...). Useful for adding links, custom start times, or extra default attributes.
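
A callback of this kind might look like the sketch below. To keep it self-contained, it uses minimal structural stand-ins (`SpanOptionsLike`, `SpanInfoLike`) rather than the real types; in an application you would use `SpanOptions` from @opentelemetry/api and the middleware's own `info` type. The `app.span.kind` attribute key is hypothetical.

```typescript
// Minimal stand-ins for illustration only.
type SpanOptionsLike = {
  attributes?: Record<string, string | number | boolean>
  startTime?: number
}
type SpanInfoLike = { kind: 'chat' | 'iteration' | 'tool'; toolName?: string }

// Mutates the options in place, preserving any attributes already set.
const onBeforeSpanStart = (info: SpanInfoLike, options: SpanOptionsLike): void => {
  options.attributes = {
    ...options.attributes,
    'app.span.kind': info.kind, // hypothetical custom attribute
  }
}

// Simulate what happens just before a tool span is started.
const spanOptions: SpanOptionsLike = { attributes: { existing: 1 } }
onBeforeSpanStart({ kind: 'tool', toolName: 'get_weather' }, spanOptions)
```

Under the assumption that the parameter types line up with the real ones, the same function could be passed as the `onBeforeSpanStart` option of `otelMiddleware`.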

onSpanEnd(info, span)

Fires just before every span.end(). Common uses: record custom events, emit per-tool metrics via your own Meter.

ts
const toolDuration = meter.createHistogram('tool.duration')
otelMiddleware({
  tracer,
  onSpanEnd: (info, span) => {
    if (info.kind === 'tool') {
      // Placeholder value: record the tool's real duration here, e.g. from
      // timestamps you track yourself (the span is still recording at this point).
      toolDuration.record(1, { 'tool.name': info.toolName })
    }
  },
})

See also

  • Middleware: the lifecycle this middleware hooks into
  • Debug Logging: quick console-output diagnostics, complementary to OTel