Generative AI overviews now appear for roughly half of web queries and reshape which sites users see, favoring Google-owned content and sidelining publishers that block Google's AI crawler; the AI summaries are also less stable and less robust to small query changes than traditional search results, raising concerns about publisher visibility and revenue.
Generative AI is being increasingly integrated into web search for the convenience it provides users. In this work, we aim to understand how generative AI disrupts web search by retrieving and presenting the information and sources differently from traditional search engines. We introduce a public benchmark dataset of 11,500 user queries to support our study and future research of generative search. We compare the search results returned by Google's search engine, the accompanying AI Overview (AIO), and Gemini 2.5 Flash for each query. We have made several key findings. First, we find that for 51.5% of representative, real-user queries, AIOs are generated, and are displayed above the organic search results. Controversial questions frequently result in an AIO. Second, we show that the retrieved sources are substantially different for each search engine (<0.2 average Jaccard similarity). Traditional Google search is significantly more likely to retrieve information from popular or institutional websites in government or education, while generative search engines are significantly more likely to retrieve Google-owned content. Third, we observe that websites that block Google's AI crawler are significantly less likely to be retrieved by AIOs, even though AIOs have access to the content. Finally, AIOs are less consistent when processing two runs of the same query, and are less robust to minor query edits. Our findings have important implications for understanding how generative search impacts website visibility, the effectiveness of generative engine optimization techniques, and the information users receive. We call for revenue frameworks to foster a sustainable and mutually beneficial ecosystem for publishers and generative search providers.
Summary
Main Finding
Generative-search summaries (Google AI Overviews, AIOs) are now common and materially change which web sources users see. AIOs appear for 65.6% of the authors' 11,500-query benchmark and for 51.5% of the representative ORCAS subset. Compared with traditional SERPs, they cite a different set of sources (average Jaccard similarity of 0.11–0.18 across engine pairs), favor Google-owned content, cite popular and institutional domains less often, and are significantly less likely to cite sites that block Google’s AI crawler. They are also less consistent and less robust than traditional search. These shifts have direct economic consequences for publishers, the SEO/GEO market, and the incentives governing web content access and compensation.
Key Points
- AIO prevalence
  - Overall AIO generation: 65.6% of the 11,500 benchmark queries.
  - Representative real-user (ORCAS) queries: 51.5% produced an AIO.
  - Very high for long, informational, question-formatted queries (e.g., ELI5: 94.6%; NQ questions ≈ 86.2%); low for product keyword queries (Amazon Retail: 17.4%).
  - AIOs are rare for trending queries (≈ 8.1%) but very common for sensitive/political queries (≈ 93.8% of political queries).
- Divergent source sets
  - Low overlap between engines: average Jaccard similarity of sources = 0.18 for AIO vs SERP, 0.11 for AIO vs Gemini, 0.16 for SERP vs Gemini.
  - Rank-sensitive agreement (RBO) is also low: AIO vs SERP ≈ 0.23; AIO vs Gemini ≈ 0.15; SERP vs Gemini ≈ 0.21.
  - Each engine returns ≈ 8–10 sources on average (SERP 8.75, AIO 9.24, Gemini 9.68), so low overlap reflects different retrieval/prioritization methods, not list length.
- Source characteristics
  - Traditional SERP favors popular domains and institutional (.gov, .edu) sites.
  - Generative outputs (AIO/Gemini) cite proportionally more Google-owned content and fewer popular/institutional sites.
  - Websites that block Google’s AI crawler are significantly less likely to be cited in AIOs even though AIOs can technically access the content.
- Consistency and robustness
  - AIOs are less consistent across repeated runs and more sensitive to minor query edits or device/location changes than traditional SERP.
- High-stakes behavior
  - For contentious or political queries, AIOs are frequently produced and often take a stance in the generated text (~33.4% of AIOs expressed a stance, versus ~5.6% of Gemini responses).
  - Generative summaries have well-documented risks (hallucination, cherry-picking citations, attribution errors), which are amplified when they replace direct links to institutional sources.
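The crawler-blocking finding under "Source characteristics" depends on classifying each site by whether its robots.txt disallows Google's AI crawler. The following is a minimal sketch of such a check using Python's standard `urllib.robotparser`. It assumes the relevant robots.txt token is `Google-Extended`; the summary does not name the token, and the authors' actual classification procedure may differ. The robots.txt bodies and the `example.com` URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: "Google-Extended" is the token of interest. The summary only
# says "Google's AI crawler" and does not name a specific user agent.
AI_CRAWLER_TOKEN = "Google-Extended"


def blocks_ai_crawler(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if robots_txt disallows AI_CRAWLER_TOKEN from fetching url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(AI_CRAWLER_TOKEN, url)


# Hypothetical robots.txt bodies, for illustration only.
BLOCKING = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

OPEN = """\
User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print(blocks_ai_crawler(BLOCKING))  # site opts out of the AI crawler
    print(blocks_ai_crawler(OPEN))      # site allows all crawlers
```

A site classified as blocking in this way can still be reachable by the regular search crawler, which is consistent with the observation that AIOs can technically access the content.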
Data & Methods
- Benchmark dataset
  - 11,500 queries spanning 9 types: ORCAS (5,000 real-user queries labeled by intent), Amazon Retail (500), Retail-Comp (500), Retail-Q (500), Debate (1,000), ELI5 (1,000), Localized (1,000), NQ (1,000), NQ Keywords (1,000).
  - Additional time-sensitive subsets used in some analyses.
- Collection procedures
  - SERP and AIO results collected via SerpAPI simulating a mobile device from Newark, NJ to keep device/location controlled.
  - Gemini 2.5 Flash responses collected via Gemini API with Google Search grounding enabled; no custom system prompts.
  - Collection date for the main benchmark: Dec 7–8, 2025.
  - For comparability, the analysis focuses on first-page SERP results and only on queries where all three systems returned sources (7,439 queries used for many comparisons).
- Metrics and analyses
  - Set overlap: Jaccard similarity at URL level.
  - Rank-aware overlap: Rank-Biased Overlap (RBO) with persistence parameter p = 0.9.
  - Additional analyses: per-domain changes, domain categories (popularity, institution type), effect of robots/AI-crawler blocking, consistency across repeated runs and minor query edits, stance detection on generated summaries.
  - Statistical testing (e.g., chi-square for categorical differences) and robustness checks reported; dataset and code available: https://github.com/rag24/AIO
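The two overlap metrics listed under "Metrics and analyses" can be sketched as follows. This is an illustrative implementation, not the authors' code: it assumes each engine's result is a ranked list of URL strings without duplicates, and it uses the truncated (prefix-only) form of RBO, since the summary does not say which RBO variant the paper applies. The example URLs are hypothetical.

```python
from typing import Sequence


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Set overlap |A ∩ B| / |A ∪ B|; defined here as 0.0 when both are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def rbo_truncated(a: Sequence[str], b: Sequence[str], p: float = 0.9) -> float:
    """Truncated RBO: (1 - p) * sum over depths d of p^(d-1) * |A[:d] ∩ B[:d]| / d.

    Evaluated down to the length of the longer list. Assumes neither list
    contains duplicate entries.
    """
    depth = max(len(a), len(b))
    seen_a: set = set()
    seen_b: set = set()
    score = 0.0
    for d in range(1, depth + 1):
        if d <= len(a):
            seen_a.add(a[d - 1])
        if d <= len(b):
            seen_b.add(b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the two depth-d prefixes
        score += (p ** (d - 1)) * agreement
    return (1 - p) * score


if __name__ == "__main__":
    # Hypothetical ranked source lists for one query.
    serp = ["https://a.example/1", "https://b.example/2", "https://c.example/3"]
    aio = ["https://b.example/2", "https://c.example/3", "https://d.example/4"]
    print(jaccard(serp, aio))  # 0.5 (2 shared URLs out of 4 distinct)
    print(rbo_truncated(serp, aio, p=0.9))
```

Because the truncated form sums only over observed prefixes, two identical lists of length k score 1 - p^k rather than 1. Values from this sketch are therefore not directly comparable to the RBO figures reported above unless the same variant is used.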
Implications for AI Economics
- Revenue and traffic redistribution
  - Generative summaries can reduce direct clicks to publishers by providing synthesized answers up front, threatening ad-driven and subscription revenues that rely on pageviews.
  - Publishers face a trade-off: blocking AI crawlers may protect raw content but further reduces visibility in AIOs; allowing crawlers risks reuse/excerpting without commensurate compensation.
  - The paper calls for revenue frameworks (licensing, revenue-sharing, micropayments) to align publisher and generative-search incentives and avoid a race-to-the-bottom for content access.
- Market power and vertical integration risks
  - The finding that generative outputs disproportionately cite Google-owned content raises concerns about preferential treatment and self-preferencing, with potential antitrust and competition policy implications.
  - If dominant search/generative providers amplify their own content, network effects could accelerate concentration and lock-in, reducing diversity of information sources and bargaining power of independent publishers.
- Impacts on SEO / GEO markets and service providers
  - Traditional SEO tactics may become less effective because generative systems retrieve and rank sources differently; the nascent Generative Engine Optimization (GEO) industry faces uncertain efficacy.
  - Publishers and GEO vendors need new measurement tools to evaluate visibility in AIOs and to monetize generative citations (if any).
- Externalities and public-good concerns
  - Generative search may rely less on institutional (.gov/.edu) and otherwise high-credence sources, especially for politically sensitive queries. This has social-welfare implications (misinformation risks, lower-quality information in civic domains).
  - There is a potential mismatch between private incentives (minimize cost/complexity of grounding) and public interest (accurate, transparent sourcing).
- Policy and marketplace remedies
  - Short- to medium-term: transparency requirements (source provenance, citation links), audits of grounding behavior, and standards for citation display could mitigate information-quality externalities.
  - Medium- to long-term: negotiated compensation models (licensing deals, per-use payments, aggregator revenue shares), API-based content use markets, or regulatory interventions to prevent anti-competitive self-preferencing.
  - Antitrust and privacy regulators may need to consider how crawler access, data extraction, and preferential citation affect competition and content markets.
- Research & measurement needs (for economists and policymakers)
  - Quantify traffic and revenue impacts on publishers from AIO-style displays (click-through vs. summary consumption).
  - Model platform-publisher bargaining under alternative licensing/revenue-sharing arrangements.
  - Evaluate welfare trade-offs between user convenience (one-shot answers) and information quality/diversity.
  - Develop standardized metrics for AIO transparency, robustness, and citation fidelity, supported by routine independent audits and public datasets (the paper’s dataset/code are an example).
Short summary takeaway: generative search has already altered which sources users see, which in turn is likely to affect how much traffic publishers receive. That disruption creates economic pressure on publishers and raises competition, compensation, and public-good issues that call for new measurement, business models, and policy interventions to align incentives between generative-search providers and content creators.
Repository / data: https://github.com/rag24/AIO (authors’ processed datasets and code).
Assessment
Claims (10)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| We introduce a public benchmark dataset of 11,500 user queries to support our study and future research of generative search. | Other | null_result | high | dataset size (number of queries) | n=11500; 11,500 user queries; 0.3 |
| For 51.5% of representative, real-user queries, AI Overviews (AIOs) are generated and are displayed above the organic search results. | Adoption Rate | positive | high | presence and placement of AI Overview (AIO) | n=11500; 51.5%; 0.18 |
| Controversial questions frequently result in an AIO. | Adoption Rate | positive | medium | likelihood of AIO generation for controversial queries | n=11500; 0.11 |
| The retrieved sources are substantially different for each search engine (average pairwise Jaccard similarity < 0.2). | Adoption Rate | mixed | high | overlap (Jaccard similarity) of retrieved source domains across engines | n=11500; <0.2 average Jaccard similarity; 0.18 |
| Traditional Google search is significantly more likely to retrieve information from popular or institutional websites in government or education. | Adoption Rate | positive | high | proportion of results from government/education/institutional websites | n=11500; 0.18 |
| Generative search engines are significantly more likely to retrieve Google-owned content. | Adoption Rate | positive | high | proportion of results that are Google-owned content | n=11500; 0.18 |
| Websites that block Google's AI crawler are significantly less likely to be retrieved by AIOs, despite having access to the content. | Adoption Rate | negative | high | likelihood/frequency of being retrieved in AIOs for crawler-blocking vs non-blocking sites | n=11500; 0.18 |
| AIOs are less consistent when processing two runs of the same query. | Output Quality | negative | high | run-to-run consistency/variability of AIO outputs | 0.18 |
| AIOs are less robust to minor query edits. | Output Quality | negative | high | robustness of results to minor query edits | 0.18 |
| These findings have important implications for website visibility, the effectiveness of generative engine optimization techniques, and the information users receive; we call for revenue frameworks to foster a sustainable and mutually beneficial ecosystem for publishers and generative search providers. | Governance And Regulation | positive | high | policy recommendation for revenue frameworks / publisher sustainability | 0.03 |