by Alem Tuzlak on March 12, 2026.

If you've built an AI-powered application with more than a handful of tools, you've probably hit the wall: every tool definition you send to the LLM costs tokens, eats into the context window, and — past a certain point — actually makes the model worse at picking the right tool. More tools means more noise, slower responses, and higher bills.
Today we're shipping lazy tool discovery in TanStack AI, a mechanism that lets the LLM discover tools on demand instead of receiving all of them upfront.
Consider a customer support agent with 30 tools: ticket lookup, order management, refund processing, knowledge base search, escalation workflows, analytics queries, and more. On any given request, the user probably needs 2–3 of these. But the LLM sees all 30 tool definitions on every single call.
This creates three problems:

- Token cost: every tool definition is sent on every request, whether or not it's used.
- Context pressure: 30 definitions eat a meaningful slice of the context window before the conversation even starts.
- Degraded tool selection: past a certain point, more options make the model worse at picking the right tool.
Lazy tool discovery adds a single flag to tool definitions:
const searchProducts = toolDefinition({
  name: 'searchProducts',
  description: 'Search products by keyword',
  inputSchema: z.object({
    query: z.string(),
  }),
  lazy: true,
})
That's it. Tools marked lazy: true are withheld from the LLM. In their place, the LLM sees a single synthetic tool called __lazy__tool__discovery__ whose description lists the names of all available lazy tools.
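To make that concrete, here's a rough sketch of what the synthetic tool might look like from the model's side. This is illustrative only, not the library's actual internals: the field names and the `toolName` parameter are assumptions.

```typescript
// Illustrative shape of the synthetic discovery tool the LLM sees in
// place of the withheld lazy tools. The real tool is generated
// internally by @tanstack/ai; this sketch only mirrors the idea.
const lazyToolNames = ['compareGuitars', 'calculateFinancing', 'searchGuitars']

const discoveryTool = {
  name: '__lazy__tool__discovery__',
  description:
    'Look up the full definition of a hidden tool before using it. ' +
    `Available tools: ${lazyToolNames.join(', ')}`,
  // Hypothetical input: the name of the tool to discover.
  inputSchema: { toolName: 'string' },
}
```

The key point is that the model pays for a list of names, not for every full definition and schema.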
When the LLM decides it needs a tool, the flow looks like this:

1. The LLM calls __lazy__tool__discovery__ with the name of the tool it wants.
2. The engine returns that tool's full definition.
3. The tool is added to the set available to the LLM for the rest of the conversation.
4. The LLM calls the tool normally.
From the LLM's perspective, it asked about a tool, learned what it does, and then used it. From your perspective, you saved tokens on every request where that tool wasn't needed.
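That round-trip can be sketched as a sequence of tool calls. The message shapes below are hypothetical, not the SDK's wire format; they just show the order of events.

```typescript
// Hypothetical trace of one lazy discovery round-trip.
type Step = {
  role: string
  toolCall?: { name: string; input: Record<string, unknown> }
  result?: { name: string; description: string }
}

const trace: Step[] = [
  // 1. The model asks the discovery tool about a hidden tool by name.
  {
    role: 'assistant',
    toolCall: {
      name: '__lazy__tool__discovery__',
      input: { toolName: 'compareGuitars' },
    },
  },
  // 2. The engine replies with that tool's full definition.
  {
    role: 'tool',
    result: {
      name: 'compareGuitars',
      description: 'Compare two or more guitars side by side',
    },
  },
  // 3. The now-visible tool is called directly on the next step.
  {
    role: 'assistant',
    toolCall: { name: 'compareGuitars', input: { guitarIds: [1, 2] } },
  },
]
```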
When you pass tools to chat(), the engine separates them into eager tools (the default) and lazy tools. A LazyToolManager class handles the rest: it builds the synthetic __lazy__tool__discovery__ tool from the lazy tools' names, resolves discovery calls by returning the requested tool's full definition, and keeps discovered tools available for the rest of the conversation.
If none of your tools have lazy: true, the behavior is identical to before. No discovery tool is created, no extra processing happens, no code paths change. The feature is entirely opt-in.
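The partitioning step itself is simple. Here's a minimal sketch of the idea; `partitionTools` and the `ToolDef` shape are my own illustrative helpers, not the actual LazyToolManager code.

```typescript
// Hypothetical minimal version of the eager/lazy split described above.
interface ToolDef {
  name: string
  lazy?: boolean
}

function partitionTools(tools: ToolDef[]) {
  // Tools are eager by default; only an explicit lazy: true is withheld.
  const eager = tools.filter((t) => t.lazy !== true)
  const lazy = tools.filter((t) => t.lazy === true)
  return { eager, lazy }
}

const { eager, lazy } = partitionTools([
  { name: 'getGuitars' },
  { name: 'compareGuitars', lazy: true },
])
```

With an empty `lazy` array there is nothing to build a discovery tool from, which is why the feature costs nothing when unused.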
Here's a guitar store chat application with a mix of eager and lazy tools:
import { chat, toolDefinition, maxIterations } from '@tanstack/ai'
import { openaiText } from '@tanstack/ai-openai'
import { z } from 'zod'

// Always available — core functionality
const getGuitars = toolDefinition({
  name: 'getGuitars',
  description: 'Get all guitars from inventory',
  inputSchema: z.object({}),
}).server(() => fetchGuitarsFromDB())

const recommendGuitar = toolDefinition({
  name: 'recommendGuitar',
  description: 'Display a guitar recommendation to the user',
  inputSchema: z.object({ id: z.number() }),
}).server(({ id }) => ({ id }))

// Discovered on demand — secondary features
const compareGuitars = toolDefinition({
  name: 'compareGuitars',
  description: 'Compare two or more guitars side by side',
  inputSchema: z.object({
    guitarIds: z.array(z.number()).min(2),
  }),
  lazy: true,
}).server(({ guitarIds }) => buildComparison(guitarIds))

const calculateFinancing = toolDefinition({
  name: 'calculateFinancing',
  description: 'Calculate monthly payment plans for a guitar',
  inputSchema: z.object({
    guitarId: z.number(),
    months: z.number(),
  }),
  lazy: true,
}).server(({ guitarId, months }) => computePaymentPlan(guitarId, months))

const searchGuitars = toolDefinition({
  name: 'searchGuitars',
  description: 'Search guitars by keyword in name or description',
  inputSchema: z.object({
    query: z.string(),
  }),
  lazy: true,
}).server(({ query }) => searchInventory(query))

// Use in chat — lazy tools work automatically
const stream = chat({
  adapter: openaiText('gpt-4o'),
  messages,
  tools: [
    getGuitars,
    recommendGuitar,
    compareGuitars,
    calculateFinancing,
    searchGuitars,
  ],
  agentLoopStrategy: maxIterations(20),
})
When a user asks "recommend me a guitar", the LLM sees getGuitars, recommendGuitar, and __lazy__tool__discovery__. It calls the first two and never touches discovery. Tokens saved.
When a user asks "compare the Motherboard Guitar and the Racing Guitar", the LLM sees the discovery tool, discovers compareGuitars, and calls it. One extra round-trip, but only when the feature is actually needed.
When a user follows up with "how much would the cheaper one cost per month?", the LLM has compareGuitars already available (from the earlier discovery) and discovers calculateFinancing. The conversation builds naturally without re-discovering tools.
Lazy discovery makes sense when:

- you have many tools and most requests use only a few of them,
- individual tool descriptions and schemas are long and expensive in tokens, or
- a tool backs a secondary feature that only some conversations touch.
Keep tools eager (the default) when:

- a tool is core functionality used in most conversations, or
- the extra discovery round-trip would add unacceptable latency on a hot path.
A good rule of thumb: if a tool is used in less than 30% of conversations, it's a strong candidate for lazy: true.
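As a back-of-envelope check on that rule of thumb, here's the per-request token math. Every number below is made up for illustration; real token counts depend on your schemas and model.

```typescript
// Rough per-request cost of one tool, eager vs. lazy, using assumed numbers.
const defTokens = 120 // assumed cost of the full tool definition
const nameInListTokens = 5 // assumed cost of the name inside the discovery list
const discoveryCallTokens = 60 // assumed cost of one discovery round-trip
const usageRate = 0.3 // tool needed in 30% of conversations

// Eager: pay the full definition on every request.
const eagerCost = defTokens

// Lazy: always pay for the name in the discovery list, plus the
// discovery round-trip and full definition only when the tool is used.
const lazyCost = nameInListTokens + usageRate * (discoveryCallTokens + defTokens)
```

Under these assumptions the lazy version averages roughly half the eager cost at 30% usage, and the gap widens as the usage rate drops or the definition grows.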
Lazy tool discovery is available now in @tanstack/ai. Add lazy: true to any tool definition and you're done.
We're exploring a few follow-on improvements.
Check out the full documentation for details, or try it out in the ts-react-chat example which includes three lazy tools with test prompts.
TanStack AI is an open-source, provider-agnostic AI SDK for building type-safe AI applications. Get started here.