TanStack AI ships ready-made middleware so you don't have to hand-roll the common cases. Each one is an ordinary ChatMiddleware — drop it into the middleware array of any chat() call. This page documents every built-in.
| Middleware | Import | What it does |
|---|---|---|
| toolCacheMiddleware | @tanstack/ai/middlewares | Cache tool-call results by name + arguments |
| contentGuardMiddleware | @tanstack/ai/middlewares | Redact / transform / block streamed text content |
| otelMiddleware | @tanstack/ai/middlewares/otel | Emit OpenTelemetry spans + GenAI metrics |
toolCacheMiddleware and contentGuardMiddleware are exported from the main @tanstack/ai/middlewares barrel. otelMiddleware lives on its own subpath (@tanstack/ai/middlewares/otel) so that importing the barrel never eagerly pulls in @opentelemetry/api (an optional peer dependency).
toolCacheMiddleware caches tool-call results keyed by tool name and arguments. When a tool is called with the same name and arguments as an earlier call, the cached result is returned immediately and the tool is not re-executed.
import { chat } from "@tanstack/ai";
import { toolCacheMiddleware } from "@tanstack/ai/middlewares";
const stream = chat({
  adapter: openaiText("gpt-5.5"),
  messages,
  tools: [weatherTool, stockTool],
  middleware: [
    toolCacheMiddleware({
      ttl: 60_000, // Cache entries expire after 60 seconds
      maxSize: 50, // Keep at most 50 entries (LRU eviction)
      toolNames: ["getWeather"], // Only cache specific tools
    }),
  ],
});
Options:
| Option | Type | Default | Description |
|---|---|---|---|
| maxSize | number | 100 | Maximum cache entries. Oldest evicted first (LRU). Only applies to the default in-memory storage. |
| ttl | number | Infinity | Time-to-live in milliseconds. Expired entries are not served. |
| toolNames | string[] | All tools | Only cache these tools. Others pass through. |
| keyFn | (toolName, args) => string | JSON.stringify([toolName, args]) | Custom cache key derivation. |
| storage | ToolCacheStorage | In-memory Map | Custom storage backend. When provided, maxSize is ignored — the storage manages its own capacity. |
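To make the interaction between ttl, toolNames, and keyFn concrete, here is a minimal sketch of the lookup logic the options table describes. It is not the library's implementation: `createCachedCaller`, `SketchOptions`, and the injectable `now` parameter are hypothetical names introduced for illustration, and the tool is synchronous for brevity.

```typescript
// Hypothetical stand-in for the cache logic; not the library's internals.
type CacheEntry = { result: unknown; timestamp: number };

interface SketchOptions {
  ttl?: number; // milliseconds; entries older than this are not served
  toolNames?: string[]; // when set, only these tools are cached
  keyFn?: (toolName: string, args: unknown) => string;
}

function createCachedCaller(options: SketchOptions = {}) {
  const ttl = options.ttl ?? Infinity;
  const keyFn =
    options.keyFn ??
    ((toolName: string, args: unknown) => JSON.stringify([toolName, args]));
  const entries = new Map<string, CacheEntry>();

  // `now` is injectable so expiry can be exercised without waiting.
  return function callTool(
    toolName: string,
    args: unknown,
    execute: () => unknown,
    now: number = Date.now(),
  ): unknown {
    // Tools outside the allow-list bypass the cache entirely.
    if (options.toolNames && !options.toolNames.includes(toolName)) {
      return execute();
    }
    const key = keyFn(toolName, args);
    const hit = entries.get(key);
    if (hit && now - hit.timestamp < ttl) return hit.result; // fresh hit
    const result = execute(); // miss or expired entry: run the tool
    entries.set(key, { result, timestamp: now });
    return result;
  };
}

// Usage: the second call is served from the cache, the third has expired.
let executions = 0;
const getWeather = () => {
  executions += 1;
  return { tempC: 21 };
};
const call = createCachedCaller({ ttl: 60_000, toolNames: ["getWeather"] });
call("getWeather", { city: "Oslo" }, getWeather, 0);
call("getWeather", { city: "Oslo" }, getWeather, 30_000);
call("getWeather", { city: "Oslo" }, getWeather, 90_000);
console.log(executions); // 2
```

One consequence of a JSON.stringify-based key is that it is sensitive to property order: `{ a: 1, b: 2 }` and `{ b: 2, a: 1 }` serialize differently, so they would be cached separately unless a custom keyFn normalizes the arguments.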
Behaviors:
Custom key function — useful when you want to ignore certain arguments:
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}
toolCacheMiddleware({
  keyFn: (toolName, args) => {
    // Ignore pagination, cache by query only. `args` is `unknown`, so
    // narrow it with a type guard before destructuring.
    if (!isRecord(args)) return JSON.stringify([toolName, args]);
    const { page, ...rest } = args;
    return JSON.stringify([toolName, rest]);
  },
});
By default the cache lives in-memory and is scoped to a single toolCacheMiddleware() instance. Pass a storage option to use an external backend like Redis, localStorage, or a database. This also enables sharing a cache across multiple chat() calls.
The storage interface:
// Implement this interface (exported from `@tanstack/ai/middlewares`):
interface ToolCacheStorage {
  getItem: (key: string) => ToolCacheEntry | undefined | Promise<ToolCacheEntry | undefined>;
  setItem: (key: string, value: ToolCacheEntry) => void | Promise<void>;
  deleteItem: (key: string) => void | Promise<void>;
}
// ToolCacheEntry is { result: unknown; timestamp: number }
All methods may return a Promise for async backends. The middleware handles TTL checking; your storage only needs to store and retrieve entries.
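As a starting point before reaching for an external backend, the interface can be satisfied by a plain Map. The sketch below redeclares the two types locally so it runs on its own; in an application you would import ToolCacheStorage from @tanstack/ai/middlewares instead. `createMapStorage`, its `maxEntries` parameter, and the extra `size` helper are hypothetical names, and the eviction policy (oldest insertion first) is a choice made here because custom storage manages its own capacity.

```typescript
// Local copies of the documented shapes so the sketch is self-contained.
type ToolCacheEntry = { result: unknown; timestamp: number };
interface ToolCacheStorage {
  getItem: (key: string) => ToolCacheEntry | undefined | Promise<ToolCacheEntry | undefined>;
  setItem: (key: string, value: ToolCacheEntry) => void | Promise<void>;
  deleteItem: (key: string) => void | Promise<void>;
}

// Hypothetical helper: a bounded, Map-backed storage.
// A Map iterates in insertion order, so its first key is the oldest entry.
function createMapStorage(maxEntries = 100): ToolCacheStorage & { size: () => number } {
  const entries = new Map<string, ToolCacheEntry>();
  return {
    getItem: (key) => entries.get(key),
    setItem: (key, value) => {
      entries.delete(key); // re-insert so an updated key counts as newest
      entries.set(key, value);
      if (entries.size > maxEntries) {
        const oldest = entries.keys().next().value;
        if (oldest !== undefined) entries.delete(oldest);
      }
    },
    deleteItem: (key) => {
      entries.delete(key);
    },
    size: () => entries.size,
  };
}

// Usage: with room for two entries, adding a third evicts the first.
const storage = createMapStorage(2);
storage.setItem("a", { result: 1, timestamp: 0 });
storage.setItem("b", { result: 2, timestamp: 0 });
storage.setItem("c", { result: 3, timestamp: 0 });
console.log(storage.size()); // 2
```

Because the object has the same three methods as ToolCacheStorage, it could be passed as the storage option; the timestamp field is left untouched so the middleware can do its own TTL check.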
Redis example:
import { createClient } from "redis";
import { chat } from "@tanstack/ai";
import { toolCacheMiddleware, type ToolCacheStorage } from "@tanstack/ai/middlewares";
const redis = createClient();
// node-redis v4+ requires an explicit connect() before issuing commands
await redis.connect();
const redisStorage: ToolCacheStorage = {
  getItem: async (key) => {
    const raw = await redis.get(`tool-cache:${key}`);
    return raw ? JSON.parse(raw) : undefined;
  },
  setItem: async (key, value) => {
    await redis.set(`tool-cache:${key}`, JSON.stringify(value));
  },
  deleteItem: async (key) => {
    await redis.del(`tool-cache:${key}`);
  },
};
const stream = chat({
  adapter,
  messages,
  tools: [weatherTool],
  middleware: [toolCacheMiddleware({ storage: redisStorage, ttl: 60_000 })],
});
Sharing a cache across requests:
// Create storage once, reuse across chat() calls
const globalCache = new Map<string, { result: unknown; timestamp: number }>();
const sharedStorage: ToolCacheStorage = {
  getItem: (key) => globalCache.get(key),
  setItem: (key, value) => { globalCache.set(key, value); },
  deleteItem: (key) => { globalCache.delete(key); },
};
// Every request handled by this route shares the same cache
app.post("/api/chat", async (req) => {
  const stream = chat({
    adapter,
    messages: req.body.messages,
    tools: [weatherTool],
    middleware: [toolCacheMiddleware({ storage: sharedStorage })],
  });
  return toServerSentEventsResponse(stream);
});
contentGuardMiddleware filters or transforms streamed text content as it flows through onChunk. Use it to redact sensitive data (SSNs, emails, API keys), enforce a profanity filter, or rewrite text on the fly. Rules are applied to TEXT_MESSAGE_CONTENT chunks; all other chunk types pass through untouched.
import { chat } from "@tanstack/ai";
import { contentGuardMiddleware } from "@tanstack/ai/middlewares";
const stream = chat({
  adapter: openaiText("gpt-5.5"),
  messages,
  middleware: [
    contentGuardMiddleware({
      rules: [
        // Regex + replacement
        { pattern: /\b\d{3}-\d{2}-\d{4}\b/g, replacement: "[SSN REDACTED]" },
        // Custom transform function
        { fn: (text) => text.replaceAll("badword", "****") },
      ],
      strategy: "buffered",
    }),
  ],
});
Options:
| Option | Type | Default | Description |
|---|---|---|---|
| rules | ContentGuardRule[] | — | Required. Applied in order; each rule receives the previous rule's output. A rule is either { pattern: RegExp; replacement: string } or { fn: (text: string) => string }. |
| strategy | 'delta' | 'buffered' | 'buffered' | How content is matched. See below. |
| bufferSize | number | 50 | (Buffered only) Characters held back before emitting, so patterns spanning chunk boundaries still match. Set it ≥ the longest pattern you expect. Flushed at stream end. |
| blockOnMatch | boolean | false | When true, drop the entire chunk if any rule changes the content (instead of emitting the filtered version). |
| onFiltered | (info: ContentFilteredInfo) => void | — | Callback fired whenever a rule changes content. Receives { messageId, original, filtered, strategy }. |
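The rules, blockOnMatch, and onFiltered rows above can be read as a small pipeline. The sketch below illustrates that reading and is not the middleware's source: `applyRules` and `guardChunk` are hypothetical helpers, the rule type is copied from the table, and the callback payload is trimmed to `original` and `filtered`.

```typescript
// Rule shape as described in the options table.
type ContentGuardRule =
  | { pattern: RegExp; replacement: string }
  | { fn: (text: string) => string };

// Rules run in order; each rule receives the previous rule's output.
function applyRules(text: string, rules: ContentGuardRule[]): string {
  return rules.reduce(
    (current, rule) =>
      "fn" in rule
        ? rule.fn(current)
        : current.replace(rule.pattern, rule.replacement),
    text,
  );
}

// Hypothetical per-chunk decision: returns the text to emit,
// or null when the chunk is dropped because blockOnMatch is set.
function guardChunk(
  text: string,
  rules: ContentGuardRule[],
  blockOnMatch: boolean,
  onFiltered?: (info: { original: string; filtered: string }) => void,
): string | null {
  const filtered = applyRules(text, rules);
  if (filtered === text) return text; // nothing changed: pass through
  onFiltered?.({ original: text, filtered });
  return blockOnMatch ? null : filtered;
}

const rules: ContentGuardRule[] = [
  { pattern: /\b\d{3}-\d{2}-\d{4}\b/g, replacement: "[SSN REDACTED]" },
  { fn: (text) => text.replace(/badword/g, "****") },
];

console.log(guardChunk("badword 123-45-6789", rules, false)); // "**** [SSN REDACTED]"
console.log(guardChunk("badword 123-45-6789", rules, true)); // null
```

Rule order matters whenever one rule's output could match a later rule, so put redactions that must always win last.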
Matching strategies:
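As a rough illustration of why a buffered strategy matters, consider a pattern split across two chunks. The sketch assumes that delta filters each chunk on its own and that buffered holds back a tail of characters before emitting; that reading is inferred from the bufferSize description, not confirmed library behavior. `filterPerChunk` and `filterWithHoldBack` are hypothetical helpers.

```typescript
type PatternRule = { pattern: RegExp; replacement: string };

function applyPatternRules(text: string, patternRules: PatternRule[]): string {
  return patternRules.reduce(
    (current, rule) => current.replace(rule.pattern, rule.replacement),
    text,
  );
}

// Per-chunk filtering: a match split across two chunks is never seen whole.
function filterPerChunk(chunks: string[], patternRules: PatternRule[]): string {
  return chunks.map((chunk) => applyPatternRules(chunk, patternRules)).join("");
}

// Hold-back filtering: keep the last `holdBack` characters pending so a
// partial match at the end of one chunk can complete in the next one.
function filterWithHoldBack(
  chunks: string[],
  patternRules: PatternRule[],
  holdBack: number,
): string {
  let pending = "";
  let emitted = "";
  for (const chunk of chunks) {
    pending = applyPatternRules(pending + chunk, patternRules);
    const cut = Math.max(0, pending.length - holdBack);
    emitted += pending.slice(0, cut); // safe to emit
    pending = pending.slice(cut); // still held back
  }
  return emitted + pending; // flush the remainder at stream end
}

const ssnRule: PatternRule = {
  pattern: /\b\d{3}-\d{2}-\d{4}\b/g,
  replacement: "[SSN REDACTED]",
};
const chunks = ["My SSN is 123-45", "-6789, thanks."];

console.log(filterPerChunk(chunks, [ssnRule]));
// "My SSN is 123-45-6789, thanks."
console.log(filterWithHoldBack(chunks, [ssnRule], 11));
// "My SSN is [SSN REDACTED], thanks."
```

The hold-back of 11 equals the length of the SSN pattern, which matches the advice to set bufferSize at least as large as the longest pattern you expect. The trade-off is latency: held-back characters reach the client only once enough later text has arrived or the stream ends.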
Behaviors:
otelMiddleware emits vendor-neutral OpenTelemetry traces and metrics for every chat() call: a root span per call, a child span per agent-loop iteration, and a grandchild span per tool execution, all tagged with GenAI semantic-convention attributes.
import { chat } from "@tanstack/ai";
import { otelMiddleware } from "@tanstack/ai/middlewares/otel";
import { trace, metrics } from "@opentelemetry/api";
const otel = otelMiddleware({
  tracer: trace.getTracer("my-app"),
  meter: metrics.getMeter("my-app"), // optional; enables GenAI histograms
});
const result = await chat({
  adapter: openaiText("gpt-5.5"),
  messages,
  middleware: [otel],
});
otelMiddleware has its own configuration surface (content capture, redaction, span-name formatting, attribute enrichment, lifecycle callbacks) and requires the optional @opentelemetry/api peer dependency. See the dedicated OpenTelemetry guide for full setup, the span/metric catalogue, and all options.
These built-ins are just ChatMiddleware objects — nothing about them is privileged. To build your own, see the Middleware guide for the full hook reference, the context object, and composition rules.