How big are Google’s grounding chunks?

Analysis of how Google selects content to ground Gemini-powered AI shows a fixed 2,000-word budget per query, where relevance rank determines word share.

When Google uses your website to ground its Gemini AI, it doesn't actually read your entire page. Instead, it operates on a strict budget. For any given query, Google allocates a total budget of about two thousand words, and it splits this fixed amount among all the sources it pulls from.

Your relevance rank determines how much of that budget you get. The top-ranked source gets the lion's share, usually a little over five hundred words. By the time you get down to the fifth spot, that allocation is cut in half.

This also means writing longer articles won't help you get more coverage. In fact, after about fifteen hundred words, you hit a wall of diminishing returns. Google simply stops selecting more content. A tight, eight-hundred-word article might see more than half of its text used. But a massive four-thousand-word page will only see about thirteen percent of its content make the cut.

The takeaway for content strategy is clear. Density beats length. Don't try to write the longest page on the web. Instead, focus on being the most relevant, concise source for the query.

Note: Highlighted passages in this article mark the parts used to ground Gemini when the article title was used as the prompt.

Our prior analysis showed that Google doesn’t use your full page content when grounding its Gemini-powered AI systems. Now we have substantially more data to share, specifically around how much content gets selected and what determines that selection.

Dataset Overview

We analysed 7,060 queries with 3+ sources, comparing grounding snippets against full page content for 2,275 tokenized pages.

| Metric | Value |
| --- | --- |
| Queries Analysed | 7,060 |
| Pages Tokenized | 2,275 |
| Total Snippets | 883,262 |
| Avg Words / Chunk | 15.5 |

The ~2,000 Word Budget

Each query has a fixed grounding budget of approximately 2,000 words total, distributed across sources by relevance rank.

| Percentile | Total Words Per Query |
| --- | --- |
| p25 | 1,546 |
| p50 (median) | 1,929 |
| p75 | 2,325 |
| p95 | 2,798 |

This budget is remarkably consistent regardless of how many sources are used or how long the individual pages are.
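As a rough illustration of how a per-query total like the ones above could be derived, the sketch below sums snippet words per query and takes the median. The record layout, the sample snippets, and the whitespace-based word count are all assumptions made for the example; the article does not describe the actual tooling.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (query_id, source_rank, snippet_text).
# These are toy values, not data from the study.
snippets = [
    ("q1", 1, "grounding chunks are short passages"),
    ("q1", 2, "selected from each source page"),
    ("q2", 1, "rank determines share"),
]


def words_per_query(records):
    """Sum whitespace-delimited words across all snippets of each query."""
    totals = defaultdict(int)
    for query_id, _rank, text in records:
        totals[query_id] += len(text.split())
    return dict(totals)


totals = words_per_query(snippets)
median_total = median(totals.values())
print(totals)
print(median_total)
```

On the real dataset, the same aggregation over all snippets would yield the distribution summarised in the percentile table.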

Rank Determines Your Share

The total budget is divided among sources based on their relevance ranking:

| Rank | Median Words | Share of Total |
| --- | --- | --- |
| #1 | 531 | 28% |
| #2 | 433 | 23% |
| #3 | 378 | 20% |
| #4 | 330 | 17% |
| #5 | 266 | 13% |

Being the #1 ranked source gets you 2x the grounding compared to being #5. You’re competing for share of a fixed pie, not expanding the pie.
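The arithmetic behind the 2x figure can be checked directly from the median words per rank. The sketch below is only a back-of-the-envelope check: it normalises the five medians against their own sum, which is an assumption and gives shares that differ by a point or so from the reported ones, presumably because those were computed differently (for example, per query).

```python
# Median grounding words per rank, copied from the table above.
median_words = {1: 531, 2: 433, 3: 378, 4: 330, 5: 266}


def shares(words_by_rank):
    """Return each rank's fraction of the summed words."""
    total = sum(words_by_rank.values())
    return {rank: words / total for rank, words in words_by_rank.items()}


rank_shares = shares(median_words)
for rank, share in rank_shares.items():
    print(f"#{rank}: {median_words[rank]} words, {share:.1%} of the top-five medians")

# Ratio of the top-ranked source to the fifth-ranked source.
ratio = median_words[1] / median_words[5]
print(f"#1 vs #5: {ratio:.2f}x")
```

Because the shares always sum to one, any gain for one rank in this model comes at the expense of the others, which is the fixed-pie point made above.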

Per-Source Selection

For individual sources, the grounding selection follows this distribution:

| Percentile | Words | Characters |
| --- | --- | --- |
| p50 (median) | 377 | 2,427 |
| p75 | 491 | 3,182 |
| p90 | 605 | 3,863 |
| p95 | 648 | 4,202 |
| Max | 1,769 | 11,541 |

77% of pages get 200-600 words selected. The typical page gets ~377 words.

Coverage Drops as Page Size Increases

We compared grounding selection against original page size:

| Page Words | Avg Grounding Words | Coverage |
| --- | --- | --- |
| <1K | 370 | 61% |
| 1-2K | 492 | 35% |
| 2-3K | 532 | 22% |
| 3K+ | 544 | 13% |

| Page Chars | Avg Grounding Chars | Coverage |
| --- | --- | --- |
| <5K | 2,127 | 66% |
| 5-10K | 3,024 | 42% |
| 10-20K | 3,363 | 25% |
| 20K+ | 3,574 | 12% |

Grounding plateaus at ~540 words / ~3,500 characters. Pages over 2,000 words see diminishing returns: adding more content dilutes your coverage percentage without increasing how much gets selected.
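To see why coverage falls as pages grow, it helps to treat the plateau as a hard ceiling. The sketch below is a simplification under that assumption: it supposes at most ~540 words are ever selected and computes the best-case coverage for a given page length. Real averages for short pages are lower than this ceiling (370 words for pages under 1K), so read the output as an upper bound, not a prediction.

```python
PLATEAU_WORDS = 540  # approximate ceiling reported above, treated here as a hard cap


def coverage_ceiling(page_words, plateau=PLATEAU_WORDS):
    """Best-case fraction of a page that can be selected under a fixed word cap."""
    if page_words <= 0:
        raise ValueError("page_words must be positive")
    return min(page_words, plateau) / page_words


for page_words in (500, 800, 2000, 4000):
    print(f"{page_words:>5} words -> at most {coverage_ceiling(page_words):.1%} coverage")
```

Under this model an 800-word page can have more than half of its text selected, while a 4,000-word page tops out at about 13.5%, in line with the 13% reported for the 3K+ bucket.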

Key Takeaways

Fixed budget per query: ~2,000 words total, split among sources
Rank matters most: #1 source gets 531 words, #5 gets 266 words
Diminishing returns: Pages over 1,500 words see little additional content selected
Concise wins: A tight 800-word page gets 50%+ coverage; a 4,000-word page gets 13%

The implication for content strategy is clear: density beats length. Focus on being the most relevant source for a query, not the longest.

Investigation Report: How Google selects the Perfect Snippet.

Dan Petrovic · Dec 20, 18:16