Ask AI is part of Agent Studio and isn’t available as a standalone feature for new applications. Use these docs for existing Ask AI implementations. Migration guides will be added when available.
Ask AI’s token usage, and your costs, depend on how much content it sends to the large language model (LLM). To reduce tokens while maintaining response quality, apply these strategies.
| Strategy | How it helps | How to apply |
| --- | --- | --- |
| Reduce hits per LLM request | Ask AI includes 7 search results (“hits”) by default in each LLM request. While this provides useful context, it also increases the number of tokens sent, raising costs. Reducing the number of hits lowers token usage. | In your Ask AI assistant configuration, change “Set a maximum number of search hits per LLM request”. |
| Split records into smaller chunks | Smaller record chunks ensure the LLM receives only relevant context. For example, split long documentation pages into smaller records based on headings. | See Markdown indexing. |
| Shorten large records | Return only the most relevant excerpts from records instead of the full content. | Use the attributesToSnippet parameter or configure it in the Algolia dashboard. |
| Limit record size | Prevent overly large records from inflating token usage. Set a maximum record size to truncate long records before indexing. | Use the maxRecordBytes parameter when indexing content. |
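To make the "split by headings" and "limit record size" strategies concrete, here is a minimal Python sketch of what that preprocessing could look like if you build records yourself before indexing. It is an illustration only, not Algolia's implementation: the function names, the record shape (`title`, `content`), and the `MAX_RECORD_BYTES` value are all hypothetical, and the byte cap shown here only mimics the idea behind maxRecordBytes rather than reproducing its exact behavior.

```python
import re

# Hypothetical cap chosen for illustration; not a documented Algolia default.
MAX_RECORD_BYTES = 2000

# Matches Markdown ATX headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^#{1,6}\s+(.+)$")


def split_by_headings(markdown: str) -> list[dict]:
    """Split a Markdown page into one record per heading section.

    Simplification: lines starting with "#" inside fenced code blocks
    are also treated as headings.
    """
    sections = [{"title": "", "lines": []}]
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A heading starts a new section.
            sections.append({"title": match.group(1).strip(), "lines": []})
        else:
            sections[-1]["lines"].append(line)

    records = []
    for section in sections:
        content = "\n".join(section["lines"]).strip()
        # Skip the empty preamble that precedes the first heading.
        if section["title"] or content:
            records.append({"title": section["title"], "content": content})
    return records


def truncate_to_bytes(text: str, max_bytes: int) -> str:
    """Truncate text so its UTF-8 encoding fits within max_bytes."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a multi-byte character cut in half at the boundary.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


page = """# Install
Run the installer.

## Configure
Set the API key.
"""

records = split_by_headings(page)
for record in records:
    record["content"] = truncate_to_bytes(record["content"], MAX_RECORD_BYTES)
```

With this sample page, the sketch yields two small records ("Install" and "Configure") instead of one record holding the whole page, so a search hit sent to the LLM carries only the matching section.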
Last modified on May 29, 2026