TL;DR: This is a breaking change. The root-level convenience sampling props on chat() / ai() / generate() — temperature, topP, and maxTokens — have been removed and now live inside provider-native modelOptions instead. Passing them at the root no longer type-checks and has no effect at runtime. Move each one into modelOptions under its provider's canonical name (e.g. OpenAI's max_output_tokens, Anthropic's max_tokens, Gemini's maxOutputTokens, Ollama's nested options.num_predict). A provider-aware codemod does the rewrite for you. metadata is unaffected and stays at the root.
Previously, chat() accepted three generic sampling props directly at the root of its options:
chat({
  adapter: openaiText('gpt-4o'),
  messages,
  temperature: 0.3,
  topP: 0.9,
  maxTokens: 100,
})
These were a convenience layer that the runtime mapped onto whatever the underlying provider expected. That generic mapping is now gone. Sampling parameters live where every other model-specific knob already lives: inside the provider-native modelOptions object, under each provider's own canonical key name.
chat({
  adapter: openaiText('gpt-4o'),
  messages,
  modelOptions: {
    temperature: 0.3,
    top_p: 0.9,
    max_output_tokens: 100,
  },
})
The root prop names are the same everywhere (temperature, topP, maxTokens). The modelOptions target key differs per provider, so use the exact key your provider expects.
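Because the three root names are identical for every provider, a small runtime check can flag call sites that still pass them, which is useful given that they are now silently ignored at runtime. The sketch below is a hypothetical helper, not part of the library; findLegacySamplingProps and LEGACY_SAMPLING_PROPS are names invented for illustration.

```typescript
// The three root-level sampling props named in this guide.
const LEGACY_SAMPLING_PROPS = ['temperature', 'topP', 'maxTokens'] as const

type LegacySamplingProp = (typeof LEGACY_SAMPLING_PROPS)[number]

// Hypothetical helper (not a library API): returns the legacy sampling
// props that are still set at the root of an options object.
// Keys inside modelOptions are not inspected.
function findLegacySamplingProps(
  options: Record<string, unknown>,
): LegacySamplingProp[] {
  return LEGACY_SAMPLING_PROPS.filter(
    (prop) =>
      Object.prototype.hasOwnProperty.call(options, prop) &&
      options[prop] !== undefined,
  )
}

// Usage: two stale root props, one key already moved into modelOptions.
const stale = findLegacySamplingProps({
  messages: [],
  temperature: 0.3,
  maxTokens: 100,
  modelOptions: { top_p: 0.9 },
})
console.log(stale) // [ 'temperature', 'maxTokens' ]
```

A check like this could run in a test or a development-only wrapper while a codebase is being migrated.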
// Before
chat({
  adapter: openaiText('gpt-4o'),
  messages,
  temperature: 0.3,
  topP: 0.9,
  maxTokens: 100,
})
// After
chat({
  adapter: openaiText('gpt-4o'),
  messages,
  modelOptions: {
    temperature: 0.3,
    top_p: 0.9,
    max_output_tokens: 100,
  },
})
// Before
chat({
  adapter: anthropicText('claude-sonnet-4-5'),
  messages,
  temperature: 0.3,
  topP: 0.9,
  maxTokens: 1024,
})
// After
chat({
  adapter: anthropicText('claude-sonnet-4-5'),
  messages,
  modelOptions: {
    temperature: 0.3,
    top_p: 0.9,
    max_tokens: 1024,
  },
})
// Before
chat({
  adapter: geminiText('gemini-3.1-pro-preview'),
  messages,
  temperature: 0.3,
  topP: 0.9,
  maxTokens: 2048,
})
// After
chat({
  adapter: geminiText('gemini-3.1-pro-preview'),
  messages,
  modelOptions: {
    temperature: 0.3,
    topP: 0.9,
    maxOutputTokens: 2048,
  },
})
Ollama is the one provider where sampling parameters are nested inside an options object within modelOptions, and the token limit is named num_predict:
// Before
chat({
  adapter: ollamaText('llama3'),
  messages,
  temperature: 0.3,
  topP: 0.9,
  maxTokens: 1000,
})
// After
chat({
  adapter: ollamaText('llama3'),
  messages,
  modelOptions: {
    options: {
      temperature: 0.3,
      top_p: 0.9,
      num_predict: 1000,
    },
  },
})

| Root prop | OpenAI | Anthropic | Gemini | Grok | Groq | OpenRouter | Ollama (nested under options) |
|---|---|---|---|---|---|---|---|
| temperature | temperature | temperature | temperature | temperature | temperature | temperature | options.temperature |
| topP | top_p | top_p | topP | top_p | top_p | topP | options.top_p |
| maxTokens | max_output_tokens | max_tokens | maxOutputTokens | max_tokens | max_completion_tokens | maxCompletionTokens | options.num_predict |
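
The table can also be read as a lookup from (provider, root prop) to a modelOptions key. The sketch below transcribes it into a TypeScript map and builds the modelOptions object from the old root values. The Provider identifiers, PROVIDER_KEYS, and toModelOptions are hypothetical names for illustration only; the key strings come from the table.

```typescript
type Provider =
  | 'openai' | 'anthropic' | 'gemini' | 'grok' | 'groq' | 'openrouter' | 'ollama'

const ROOT_PROPS = ['temperature', 'topP', 'maxTokens'] as const
type RootProp = (typeof ROOT_PROPS)[number]

// Target key per root prop; nestUnder marks providers whose sampling
// keys sit inside a nested object (only Ollama in the table).
interface KeyMap {
  temperature: string
  topP: string
  maxTokens: string
  nestUnder?: string
}

// Transcribed from the mapping table in this guide.
const PROVIDER_KEYS: Record<Provider, KeyMap> = {
  openai: { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_output_tokens' },
  anthropic: { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_tokens' },
  gemini: { temperature: 'temperature', topP: 'topP', maxTokens: 'maxOutputTokens' },
  grok: { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_tokens' },
  groq: { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_completion_tokens' },
  openrouter: { temperature: 'temperature', topP: 'topP', maxTokens: 'maxCompletionTokens' },
  ollama: { temperature: 'temperature', topP: 'top_p', maxTokens: 'num_predict', nestUnder: 'options' },
}

// Hypothetical helper: builds a modelOptions object from old root values.
function toModelOptions(
  provider: Provider,
  sampling: Partial<Record<RootProp, number>>,
): Record<string, unknown> {
  const keys = PROVIDER_KEYS[provider]
  const flat: Record<string, number> = {}
  for (const prop of ROOT_PROPS) {
    const value = sampling[prop]
    if (value !== undefined) flat[keys[prop]] = value
  }
  // Wrap in the nested object when the provider requires it.
  return keys.nestUnder ? { [keys.nestUnder]: flat } : flat
}

console.log(toModelOptions('ollama', { temperature: 0.3, topP: 0.9, maxTokens: 1000 }))
// { options: { temperature: 0.3, top_p: 0.9, num_predict: 1000 } }
```

Keeping the nesting rule as data (nestUnder) rather than a special case in the function means a second nested provider would only need a new table entry.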
A jscodeshift codemod moves the root sampling props into modelOptions for you, renaming each one to the correct provider-native key. It resolves the provider from the adapter: factory call (e.g. openaiText('gpt-4o') → OpenAI), so the rewrite is provider-aware. Run it from the repo:
pnpm codemod:move-sampling-to-model-options "src/**/*.{ts,tsx}"
Or run the published transform directly, with no clone needed:
npx jscodeshift \
  --parser=tsx \
  -t https://raw.githubusercontent.com/TanStack/ai/main/codemods/move-sampling-to-model-options/transform.ts \
  "src/**/*.{ts,tsx}"
Add --dry --print to preview the rewrite without modifying files.
What it does:
Report + skip (never partial): the codemod never partially transforms a call. It leaves the call untouched and emits an api.report(...) message when it can't safely proceed:
See codemods/move-sampling-to-model-options/README.md for the full transform details and limitations.
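
To illustrate the "report + skip, never partial" idea, here is a minimal sketch of the decision step, written as a plain function rather than a real jscodeshift transform. It is not the codemod's actual code. The CallSite shape, planRewrite, and the single skip condition shown (the adapter factory cannot be resolved to a provider) are assumptions for illustration; only the factory names that appear in this guide are listed.

```typescript
// Factory names that appear in this guide. Other adapters are left out
// on purpose rather than guessed.
const FACTORY_TO_PROVIDER = new Map<string, string>([
  ['openaiText', 'openai'],
  ['anthropicText', 'anthropic'],
  ['geminiText', 'gemini'],
  ['ollamaText', 'ollama'],
])

const ROOT_SAMPLING_PROPS = ['temperature', 'topP', 'maxTokens']

// Hypothetical, simplified view of one chat() call site.
interface CallSite {
  adapterFactory: string | null // null when `adapter:` is not a direct factory call
  rootProps: string[]
}

type Plan =
  | { kind: 'noop' }
  | { kind: 'rewrite'; provider: string; props: string[] }
  | { kind: 'skip'; report: string }

// Decide the outcome for the whole call before touching anything, so the
// result is either a full rewrite or no change plus a report.
function planRewrite(call: CallSite): Plan {
  const props = call.rootProps.filter((p) => ROOT_SAMPLING_PROPS.includes(p))
  if (props.length === 0) return { kind: 'noop' }

  const provider =
    call.adapterFactory === null
      ? undefined
      : FACTORY_TO_PROVIDER.get(call.adapterFactory)
  if (provider === undefined) {
    return {
      kind: 'skip',
      report: `cannot resolve provider from adapter; left ${props.join(', ')} at the root`,
    }
  }
  return { kind: 'rewrite', provider, props }
}

console.log(planRewrite({ adapterFactory: 'openaiText', rootProps: ['messages', 'topP'] }))
// { kind: 'rewrite', provider: 'openai', props: [ 'topP' ] }
console.log(planRewrite({ adapterFactory: null, rootProps: ['temperature'] }).kind)
// skip
```

Separating planning from mutation is what makes the all-or-nothing guarantee easy to keep: nothing is edited until every prop in the call has a known target.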
metadata is not a sampling parameter and is unaffected — it stays at the root of chat():
chat({
  adapter: openaiText('gpt-4o'),
  messages,
  metadata: { requestId: 'abc-123' }, // ← still at the root
  modelOptions: {
    temperature: 0.3,
    max_output_tokens: 100,
  },
})
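
For options objects that are built dynamically, where a codemod cannot see the literal, the same move can be sketched as a small runtime function. The example below handles only the OpenAI key names shown in this guide and leaves every other root key, including metadata, where it is. migrateOpenAIOptions and LegacyOptions are hypothetical names, and letting keys already present in modelOptions win over the moved values is a design assumption, not documented library behavior.

```typescript
// Hypothetical shape of a pre-migration options object.
interface LegacyOptions {
  temperature?: number
  topP?: number
  maxTokens?: number
  modelOptions?: Record<string, unknown>
  [key: string]: unknown
}

// Moves the three root sampling props into modelOptions under OpenAI's
// key names. All other root keys (messages, metadata, ...) pass through.
function migrateOpenAIOptions(legacy: LegacyOptions): Record<string, unknown> {
  const { temperature, topP, maxTokens, modelOptions, ...rest } = legacy

  const moved: Record<string, unknown> = {}
  if (temperature !== undefined) moved.temperature = temperature
  if (topP !== undefined) moved.top_p = topP
  if (maxTokens !== undefined) moved.max_output_tokens = maxTokens

  // Assumption: keys already set in modelOptions take precedence.
  return { ...rest, modelOptions: { ...moved, ...modelOptions } }
}

const migrated = migrateOpenAIOptions({
  messages: [],
  metadata: { requestId: 'abc-123' },
  temperature: 0.3,
  maxTokens: 100,
})
console.log(JSON.stringify(migrated))
// {"messages":[],"metadata":{"requestId":"abc-123"},"modelOptions":{"temperature":0.3,"max_output_tokens":100}}
```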