
LongFormContextResult

The data wrapper Chrome uses to deliver extracted page content to Gemini's context window. It separates external source material from conversation, ensuring the model grounds its answers in the page rather than its training data. Each block carries a session-wide index used for citation mapping.


When you share a webpage or paste a link into Gemini, Google Chrome doesn't just send the raw text. Instead, it packages the page into a structured data wrapper designed to help the artificial intelligence understand exactly what it is reading.

This wrapper serves a crucial purpose. It tells Gemini that the incoming data is external source material, completely separate from your conversation. This boundary ensures the model grounds its answers in the actual page content, rather than relying on its training data.

Each block of content is labeled with a three-part index number, such as 0.1.84. The first number tracks your conversation thread, the second identifies the type of block, and the third is a session-wide counter that increases with each data chunk processed. This numbering system is the backbone of Gemini's citations, allowing the AI to trace its references back to the exact source block.

The structure also adapts to the type of media you share. A standard webpage arrives as a single block containing the full text. A YouTube video, however, is broken down into three distinct blocks: the title, the description, and the transcript. The transcript is further divided into timestamped segments, giving Gemini precise context for every moment in the video.

LongFormContextResult is the data wrapper Chrome uses to deliver extracted page content into Gemini's context window. When a user shares a tab or pastes a URL into Gemini, the extracted content is packaged into this structure rather than sent as raw text.

The wrapper serves a specific purpose: it tells Gemini that the incoming data is external source material, not part of the conversation. This separation ensures the model grounds its answers in the actual page content rather than drawing from its training data.

Each LongFormContextResult block carries a three-part index such as 0.1.84. The first number tracks the conversation thread, the second identifies the block type, and the third is a session-wide sequential node counter that increments for every data chunk processed in the session, whether a page, a video transcript, or a description. This index is the citation backbone: when Gemini references a specific piece of content, it traces back to the exact source block through this numbering.
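
The three-part index can be illustrated with a small Python sketch. This is not Chrome's or Gemini's actual code: the names BlockIndex and resolve_citation are hypothetical, and the only detail taken from the description above is the dotted form (thread, block type, node counter).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BlockIndex:
    """Hypothetical model of an index such as "0.1.84"."""
    thread: int      # first number: conversation thread
    block_type: int  # second number: block type
    node: int        # third number: session-wide node counter

    @classmethod
    def parse(cls, text: str) -> "BlockIndex":
        parts = text.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected three dot-separated parts, got {text!r}")
        thread, block_type, node = (int(p) for p in parts)
        return cls(thread, block_type, node)

    def __str__(self) -> str:
        return f"{self.thread}.{self.block_type}.{self.node}"


def resolve_citation(index_text: str, blocks: Dict[BlockIndex, str]) -> str:
    """Look up the source block a citation points to (assumed mapping)."""
    return blocks[BlockIndex.parse(index_text)]


# Usage: one block registered under the example index from the text.
blocks = {BlockIndex.parse("0.1.84"): "full extracted page text"}
source = resolve_citation("0.1.84", blocks)
```

Because the dataclass is frozen, a parsed index is hashable and can serve directly as a dictionary key, which is one simple way a citation could be traced back to its source block.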

The schema differs by content type. A webpage arrives as a single block containing the full extracted text. A YouTube video arrives as three separate blocks — title, description, and transcript — with the transcript further split into timestamped segments, each carrying its own start time, end time, and index.
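
A rough Python sketch of how the two shapes might be modeled follows. Every class, field, and function name here is an assumption made for illustration, and the index values passed in are opaque placeholders; the text does not give the real field names or the mapping of block types to numbers.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class TranscriptSegment:
    start_s: float  # segment start time, in seconds (assumed unit)
    end_s: float    # segment end time, in seconds (assumed unit)
    index: str      # the segment's own index (format not specified in the text)
    text: str


@dataclass
class ContextBlock:
    index: str
    kind: str  # hypothetical labels: "page", "title", "description", "transcript"
    text: str = ""
    segments: List[TranscriptSegment] = field(default_factory=list)


def webpage_blocks(index: str, text: str) -> List[ContextBlock]:
    """A webpage arrives as a single block holding the full extracted text."""
    return [ContextBlock(index=index, kind="page", text=text)]


def video_blocks(
    indices: Tuple[str, str, str],
    title: str,
    description: str,
    segments: Sequence[TranscriptSegment],
) -> List[ContextBlock]:
    """A YouTube video arrives as three blocks: title, description, transcript."""
    title_idx, desc_idx, transcript_idx = indices
    return [
        ContextBlock(index=title_idx, kind="title", text=title),
        ContextBlock(index=desc_idx, kind="description", text=description),
        ContextBlock(index=transcript_idx, kind="transcript", segments=list(segments)),
    ]


def segment_at(block: ContextBlock, t: float) -> Optional[TranscriptSegment]:
    """Return the segment covering time t (start inclusive, end exclusive)."""
    for seg in block.segments:
        if seg.start_s <= t < seg.end_s:
            return seg
    return None
```

Keeping start and end times on each segment is what would let a lookup such as segment_at map a moment in the video to the exact transcript text that covers it.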

