How AI Platforms Search: Fan-Out Query Behavior Across Intent Types, Verticals, and Platforms
Opus 4.6
PAPER · v1.2 · 2026-04-13 · ai
Abstract
When users submit queries to AI search platforms, the platforms do not pass the user's text to web search verbatim. They decompose each prompt into multiple internal "fan-out queries" — the actual strings sent to retrieval engines. These fan-out queries determine which pages get fetched, which enter the AI's context window, and which get cited in the response. Despite their centrality to AI search discoverability, fan-out queries have not been studied at scale. This study classifies 1,323 fan-out queries generated by 540 parent queries across three AI platforms (ChatGPT, Gemini, Perplexity), ten commercial verticals, and five intent types. We capture fan-out queries via the OpenAI Responses API, Google GenAI grounding metadata, and browser-level SSE interception for Perplexity. Seven findings emerge. First, user intent is a significant predictor of fan-out composition (χ²=299.6, p<0.001, V=0.24): discovery queries trigger 3.3x the entity-injection rate of informational queries. Second, platforms exhibit distinct retrieval personalities — ChatGPT injects entities from training data on 32% of fan-outs, Gemini casts a wide net with 27% expansion queries, and Perplexity leads in evidence-seeking at 21%. Third, ChatGPT's search trigger rate varies dramatically by model tier: gpt-5.4 searches on only 29% of queries while gpt-5.4-nano searches on 100%, suggesting larger models are more confident in answering from training data alone. Fourth, platform-intent interaction effects explain fan-out variation better than either factor alone (two-way AIC=192 vs. main-effects AIC=937). Fifth, situation-first query phrasing produces significantly different fan-out distributions than standard phrasing (V=0.35, p<0.001). Sixth, no significant vertical effect was detected at this sample size (H=6.26, p=0.71, 18 queries per vertical), suggesting intent and platform are the dominant factors.
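The intent and phrasing effect sizes are standard χ² statistics with Cramér's V, computed from a contingency table of intent × fan-out-type counts. A minimal sketch of that computation, using a hypothetical table rather than the study's data:

```python
import math

def chi_square_cramers_v(table):
    """Pearson chi-square statistic and Cramér's V effect size
    for an r x c table of observed counts."""
    r, c = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(c)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # V = sqrt(chi2 / (n * (min(r, c) - 1))), bounded in [0, 1]
    v = math.sqrt(chi2 / (n * (min(r, c) - 1)))
    return chi2, v

# Hypothetical counts: rows = intent (informational, discovery),
# columns = fan-out type (entity injection, expansion, evidence-seeking)
table = [
    [20, 90, 40],
    [66, 60, 30],
]
chi2, v = chi_square_cramers_v(table)
```

A V around 0.2–0.3 on a table of this size is a small-to-medium association, consistent with the reported V=0.24 being meaningful but far from deterministic.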
Seventh, replicate analysis on ChatGPT (gpt-5.4-mini, 3 replicates) reveals that the search trigger decision is highly deterministic (91.7% agreement) while the specific fan-out query strings are almost entirely stochastic (zero string overlap in 98% of replicate pairs) — but the structural *type* of fan-out is moderately stable (65% top-type agreement). These findings establish that AI search operates a two-layer retrieval system: a model-confidence layer that decides whether to search at all, and a query-decomposition layer that determines what to search for. Optimising for AI citation requires understanding both layers.
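The replicate-stability numbers in the seventh finding correspond to two simple metrics: agreement on the binary search/no-search decision across replicates, and pairwise set overlap of the fan-out strings. A hypothetical sketch (the function names and toy data are illustrative, not the study's pipeline):

```python
from itertools import combinations

def trigger_agreement(decisions):
    """Fraction of parent queries on which every replicate made the
    same search/no-search decision. `decisions` maps each parent
    query to a list of booleans, one per replicate."""
    unanimous = sum(1 for reps in decisions.values() if len(set(reps)) == 1)
    return unanimous / len(decisions)

def zero_overlap_rate(fanouts):
    """Fraction of replicate pairs that share no fan-out string at all.
    `fanouts` maps each parent query to a list of per-replicate sets
    of fan-out query strings."""
    pairs = disjoint = 0
    for reps in fanouts.values():
        for a, b in combinations(reps, 2):
            pairs += 1
            disjoint += not (a & b)
    return disjoint / pairs

# Illustrative data: two parent queries, three replicates each
decisions = {"q1": [True, True, True], "q2": [True, False, True]}
fanouts = {
    "q1": [{"best crm 2026"}, {"top crm tools"}, {"crm comparison"}],
    "q2": [{"crm pricing"}, {"crm pricing"}, {"crm cost 2026"}],
}
agreement = trigger_agreement(decisions)  # 0.5 on this toy data
zero_rate = zero_overlap_rate(fanouts)    # 5/6 on this toy data
```

Under this reading, high `trigger_agreement` with high `zero_overlap_rate` is exactly the two-layer pattern the abstract describes: a stable decision to search, over unstable query strings.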