Google Uses Chrome to Supply Context to Gemini Chat

Page Content Agent

The Page Content Agent walks the browser's internal rendering tree and produces a tree of content nodes. Each node gets a type, geometry, text styling, interaction info, and accessibility data.

A Complete List of Node Types:

  1. Root — The top-level container for the entire page
  2. Container — A grouping element (e.g. div) that holds other nodes
  3. Text — A piece of visible text content
  4. Paragraph — A block of text (maps to <p> tags)
  5. Heading — A title or subtitle (h1 through h6)
  6. Anchor — A clickable link
  7. Image — An image element
  8. SvgRoot — The root of an inline SVG graphic
  9. Canvas — A drawable canvas area
  10. Video — A video player element
  11. Form — A form container (groups inputs together)
  12. FormControl — An interactive input (text field, dropdown, button, etc.)
  13. Table — A data table
  14. TableRow — A single row within a table
  15. TableCell — A single cell within a table row
  16. OrderedList — A numbered list
  17. UnorderedList — A bulleted list
  18. ListItem — A single item within a list
  19. Iframe — An embedded page within the page
  20. DialogModal — A popup dialog that blocks page interaction
  21. DialogModeless — A popup dialog that allows page interaction
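
As a rough mental model, the node tree can be pictured as a typed, recursive data structure. The sketch below is illustrative only: `NodeType`, `ContentNode`, and `walk` are hypothetical names, and the numeric values simply follow the order of the list above rather than Chrome's real enum values.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class NodeType(Enum):
    # Numbered in the same order as the list above (assumption, not Chrome's values).
    ROOT = auto()
    CONTAINER = auto()
    TEXT = auto()
    PARAGRAPH = auto()
    HEADING = auto()
    ANCHOR = auto()
    IMAGE = auto()
    SVG_ROOT = auto()
    CANVAS = auto()
    VIDEO = auto()
    FORM = auto()
    FORM_CONTROL = auto()
    TABLE = auto()
    TABLE_ROW = auto()
    TABLE_CELL = auto()
    ORDERED_LIST = auto()
    UNORDERED_LIST = auto()
    LIST_ITEM = auto()
    IFRAME = auto()
    DIALOG_MODAL = auto()
    DIALOG_MODELESS = auto()


@dataclass
class ContentNode:
    node_type: NodeType
    text: str = ""
    children: list["ContentNode"] = field(default_factory=list)


def walk(node: ContentNode):
    """Yield the node and all of its descendants, parents before children."""
    yield node
    for child in node.children:
        yield from walk(child)


page = ContentNode(NodeType.ROOT, children=[
    ContentNode(NodeType.HEADING, text="Welcome"),
    ContentNode(NodeType.PARAGRAPH, children=[
        ContentNode(NodeType.TEXT, text="Hello"),
    ]),
])
print([n.node_type.name for n in walk(page)])
# ['ROOT', 'HEADING', 'PARAGRAPH', 'TEXT']
```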

Geometry

There are three types of bounding box per node:

  1. Outer bounding box - The full rectangle the element occupies on screen, including parts that might be hidden behind other elements or clipped by a scrollable container
  2. Visible bounding box - The portion of the outer box that's actually visible to the user (clipped by scroll containers, parent boundaries, and the viewport edges)
  3. Fragment bounding boxes - When an element wraps across multiple lines (like a long link that breaks mid-sentence), each line segment gets its own separate rectangle. Only present when there are 2+ fragments

Each box is defined by x, y, width, height in viewport pixel coordinates. There's also a CSS position value per node (static, relative, absolute, fixed, sticky) since that affects how the element is positioned on the page.
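
One way to picture the relationship between the outer and visible boxes is as a chain of rectangle intersections. This is a minimal sketch under that assumption: `Rect`, `intersect`, and `visible_box` are hypothetical helpers, and the sketch ignores transforms and anything Chrome does beyond clipping by ancestors and the viewport.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlapping rectangle, or None if the two do not overlap."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.width, b.x + b.width)
    bottom = min(a.y + a.height, b.y + b.height)
    if right <= left or bottom <= top:
        return None
    return Rect(left, top, right - left, bottom - top)


def visible_box(outer: Rect, clips: list[Rect]) -> Optional[Rect]:
    """Clip the outer box by each clip rect (scroll containers, viewport) in turn."""
    box = outer
    for clip in clips:
        box = intersect(box, clip)
        if box is None:
            return None  # fully clipped away
    return box


viewport = Rect(0, 0, 1280, 720)
print(visible_box(Rect(100, 600, 300, 200), [viewport]))
# Rect(x=100, y=600, width=300, height=120)
```

An element that hangs 80 pixels below the viewport keeps its full outer box but reports a shorter visible box.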

Text Styling

  1. Text size — Ratio of the element's font size to the page's base font size, bucketed into XL, L, M, S, XS (so a heading that's 2x the base size = XL, normal body text = M, fine print = XS)
  2. Has emphasis — Whether the text is visually stressed: bold, italic, underlined, or superscript/subscript
  3. Color — The RGBA color value of the text
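
The text size bucketing could look something like the sketch below. Only three anchor points come from the description above (2x the base size is XL, body text is M, fine print is XS); the exact cutoffs and the function name `text_size_bucket` are assumptions made for illustration.

```python
def text_size_bucket(font_px: float, base_px: float) -> str:
    """Bucket a font size by its ratio to the page's base font size."""
    ratio = font_px / base_px
    # Cutoffs below are hypothetical; Chrome's real thresholds may differ.
    if ratio >= 2.0:
        return "XL"
    if ratio >= 1.3:
        return "L"
    if ratio >= 0.9:
        return "M"
    if ratio >= 0.75:
        return "S"
    return "XS"


print(text_size_bucket(32, 16))  # XL
print(text_size_bucket(16, 16))  # M
print(text_size_bucket(10, 16))  # XS
```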

Accessibility

Three kinds of data: per-node interaction info, page-level landmark roles, and labels.

Per-node interaction info:

  1. Is focusable — Can the element receive keyboard focus
  2. Is tabbable — Can the user Tab to it (focusable + has a non-negative tab index)
  3. Is disabled — Element is grayed out / non-interactive
  4. Clickability reasons — Why the element is considered clickable (16 possible reasons): clickable control, has click events, has mouse hover events, has mouse click events, has key events, is editable, has cursor:pointer style, has :hover CSS pseudo-class, has an ARIA role implying clickability, has aria-haspopup, is an ARIA toggle, is ARIA selectable, has aria-expanded=true, has aria-expanded=false, has autocomplete, has a tabindex
  5. Disabled reasons — Why interaction is blocked: aria-disabled, HTML disabled attribute, cursor:not-allowed style
  6. Z-order — The element's stacking position in the document (which elements sit on top of which)
  7. Scroller info — If the element is a scroll container: its total scrollable area, visible area, and whether it scrolls horizontally, vertically, or both
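
The clickability reasons lend themselves to a set of flags, since one element can be clickable for several reasons at once. The sketch below is an illustration, not Chrome's actual types: `ClickabilityReason`, `is_tabbable`, and `is_actionable` are hypothetical names, and treating "any reason and not disabled" as actionable is an assumption.

```python
from enum import Flag, auto


class ClickabilityReason(Flag):
    # The 16 reasons listed above, in the same order.
    CLICKABLE_CONTROL = auto()
    CLICK_EVENTS = auto()
    MOUSE_HOVER_EVENTS = auto()
    MOUSE_CLICK_EVENTS = auto()
    KEY_EVENTS = auto()
    EDITABLE = auto()
    CURSOR_POINTER = auto()
    HOVER_PSEUDO_CLASS = auto()
    ARIA_ROLE = auto()
    ARIA_HAS_POPUP = auto()
    ARIA_TOGGLE = auto()
    ARIA_SELECTABLE = auto()
    ARIA_EXPANDED_TRUE = auto()
    ARIA_EXPANDED_FALSE = auto()
    AUTOCOMPLETE = auto()
    TAB_INDEX = auto()


def is_tabbable(is_focusable: bool, tab_index: int) -> bool:
    """Tabbable means focusable plus a non-negative tab index."""
    return is_focusable and tab_index >= 0


def is_actionable(reasons: ClickabilityReason, is_disabled: bool) -> bool:
    """Assumption: at least one clickability reason and not disabled."""
    return bool(reasons) and not is_disabled


button = ClickabilityReason.CLICKABLE_CONTROL | ClickabilityReason.CURSOR_POINTER
print(is_actionable(button, is_disabled=False))  # True
print(is_tabbable(is_focusable=True, tab_index=-1))  # False
```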

Landmark roles (annotated on container nodes):

  1. Header — Page or section header
  2. Nav — Navigation block
  3. Search — Search functionality area
  4. Main — Primary content area
  5. Article — Self-contained content piece
  6. Section — Thematic grouping
  7. Aside — Sidebar / tangential content
  8. Footer — Page or section footer
  9. ContentHidden — Content hidden via CSS content-visibility
  10. PaidContent — Content behind a paywall

Labels:

  1. aria-label — Explicit text label set on the element
  2. aria-labelledby — Label composed by referencing the text content of other elements by ID

Output Formatters

Once the Page Content Agent builds the node tree, three formatters can convert it into different output formats:

  1. Inner Text Builder — Takes the node tree and flattens it into plain text. Like copying a page and pasting into Notepad. All structure is lost, just the readable words remain.
  2. Inner HTML Builder — Takes the node tree and produces cleaned-up HTML. Structured markup but stripped of scripts, styles, and noise.
  3. Document Chunker — Splits the extracted text into passage-sized chunks suitable for feeding into an LLM context window. Handles splitting at sentence and paragraph boundaries so chunks don't break mid-thought.
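
To make the chunking idea concrete, here is a minimal sketch of splitting at paragraph and sentence boundaries. It is not Chrome's Document Chunker: `chunk_text` is a hypothetical function, the character budget stands in for whatever size limit the real chunker uses, and the sentence splitter is a naive regex.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole paragraphs, falling back to whole sentences."""
    chunks: list[str] = []
    current = ""
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        paragraph = " ".join(paragraph.split())  # normalise whitespace
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate  # the paragraph fits in the open chunk
            continue
        if current:
            chunks.append(current)  # close the open chunk at a paragraph boundary
            current = ""
        if len(paragraph) <= max_chars:
            current = paragraph
            continue
        # Paragraph is too long on its own: split it at sentence boundaries.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            candidate = f"{current} {sentence}" if current else sentence
            if len(candidate) <= max_chars or not current:
                current = candidate  # an over-long single sentence stays whole
            else:
                chunks.append(current)
                current = sentence
    if current:
        chunks.append(current)
    return chunks


sample = "One. Two.\n\nThree is a longer sentence here. Four."
print(chunk_text(sample, max_chars=20))
# ['One. Two.', 'Three is a longer sentence here.', 'Four.']
```

A sentence longer than the budget is emitted as its own oversized chunk rather than being cut, which matches the goal of never breaking mid-thought.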

Privacy

APC (Annotated Page Content) extraction applies several privacy protections during the tree walk, controlling what content reaches the AI:

  1. Password redaction — Password field values are never included in the output. This covers native password inputs, fields using CSS -webkit-text-security to mask characters, and fields that were ever set to type "password" even if later changed to plain text.
  2. Cross-origin iframe redaction — If an embedded page (iframe) comes from a different domain, its content is replaced with redacted metadata (just the origin). Only same-origin iframes have their content included.
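
The two redaction rules can be sketched as a recursive copy of the tree that drops sensitive data on the way down. Everything here is illustrative: nodes are plain dictionaries, and field names such as `input_type`, `was_ever_password`, `text_security_masked`, and `redacted_origin` are hypothetical. The origin comparison is also simplified (scheme plus host and explicit port, with no default-port normalisation).

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def is_password_like(node: dict) -> bool:
    """Native password input, CSS-masked field, or a field that was ever a password."""
    return bool(
        node.get("input_type") == "password"
        or node.get("text_security_masked")
        or node.get("was_ever_password")
    )


def redact(node: dict, page_origin: str) -> dict:
    """Return a copy of the tree with the privacy rules applied during the walk."""
    if node.get("type") == "Iframe" and origin_of(node.get("src", "")) != page_origin:
        # Cross-origin frame: keep only the origin, drop the whole subtree.
        return {"type": "Iframe", "redacted_origin": origin_of(node.get("src", ""))}
    out = {key: value for key, value in node.items() if key != "children"}
    if node.get("type") == "FormControl" and is_password_like(node):
        out.pop("value", None)  # the value never reaches the output
        out["redacted"] = True
    out["children"] = [redact(child, page_origin) for child in node.get("children", [])]
    return out
```

Because the redaction happens while the copy is being built, any formatter that runs afterwards only ever sees the cleaned tree.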

Node IDs

Each node in the tree can carry up to two identifiers:

  1. content_node_id — A sequential number assigned to every node via depth-first traversal (1, 2, 3, ...). Every node gets one. This is what the AI uses to reference specific parts of the page.
  2. dom_node_id — A selective ID assigned only to nodes whose types appear on an internal allowlist. Not every node gets one. This is kept selective to avoid growing Chrome's internal hash maps unnecessarily.
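
Sequential numbering via depth-first traversal can be sketched in a few lines. This assumes a pre-order walk (parent before children) and plain dictionary nodes; `assign_content_node_ids` is a hypothetical helper, not a Chrome function.

```python
from itertools import count


def assign_content_node_ids(root: dict) -> None:
    """Number every node 1, 2, 3, ... in depth-first (pre-order) order."""
    counter = count(1)
    stack = [root]
    while stack:
        node = stack.pop()
        node["content_node_id"] = next(counter)
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(node.get("children", [])))


tree = {"name": "root", "children": [
    {"name": "a", "children": [{"name": "a1"}]},
    {"name": "b"},
]}
assign_content_node_ids(tree)
print(tree["children"][0]["children"][0]["content_node_id"])  # 3
print(tree["children"][1]["content_node_id"])  # 4
```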

Supporting Modules

  1. Paid content detection — Checks for schema.org markup (JSON-LD and microdata) indicating content is behind a paywall (isAccessibleForFree: false). Paywalled nodes are flagged so the AI doesn't leak content the user hasn't paid for.
  2. Ad-related element detection — Heuristically identifies ad containers by matching common class names, IDs, and data attributes associated with ad networks.
  3. Debug Utilities — Converts internal enum values to human-readable strings for logging and debugging. For example, turning node type number 5 into the string "Heading".
  4. Frame Metadata Observer — Lets other parts of Chrome subscribe to notifications when page metadata changes (title, meta tags, etc.). Keeps APC's view of the page current as the page updates dynamically.
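
For the paid content check, the JSON-LD case is easy to illustrate. The sketch below only inspects top-level objects in a single JSON-LD block and does not cover microdata or nested `hasPart` sections; `is_paywalled` is a hypothetical name. It accepts both the boolean `false` and the string form, since both appear in real-world markup.

```python
import json


def is_paywalled(json_ld: str) -> bool:
    """True if a JSON-LD block declares isAccessibleForFree as false."""
    try:
        data = json.loads(json_ld)
    except json.JSONDecodeError:
        return False  # malformed markup is treated as no paywall signal
    items = data if isinstance(data, list) else [data]
    for item in items:
        if not isinstance(item, dict):
            continue
        value = item.get("isAccessibleForFree")
        if value is False:
            return True
        if isinstance(value, str) and value.strip().lower() == "false":
            return True
    return False


markup = '{"@type": "NewsArticle", "isAccessibleForFree": false}'
print(is_paywalled(markup))  # True
```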

Enabling Page Interactions

This system isn't just for reading pages — it's designed to let the AI interact with them. Gemini can click buttons, fill forms, and navigate links by referencing specific nodes in the tree.

The challenge is that pages change dynamically. A button might shift position, a list might reorder, or new content might load. This is handled with a matching algorithm that identifies target elements by combining multiple properties — node type, interactivity, and location on the page. If there's ambiguity, it further verifies by comparing text content to make sure the right element is acted on.
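
The matching step described above might be sketched as follows: filter by type, interactivity, and proximity first, then use text only to break ties. The `Snapshot` class, the `find_target` function, and the 50 pixel distance threshold are all assumptions made for illustration, not Chrome's actual algorithm.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    node_type: str
    is_interactive: bool
    x: float
    y: float
    text: str


def find_target(target: Snapshot, live_nodes: list[Snapshot],
                max_distance: float = 50.0) -> Optional[Snapshot]:
    """Find the live node that best matches a node seen at extraction time."""
    candidates = [
        n for n in live_nodes
        if n.node_type == target.node_type
        and n.is_interactive == target.is_interactive
        and math.hypot(n.x - target.x, n.y - target.y) <= max_distance
    ]
    if len(candidates) == 1:
        return candidates[0]
    # Ambiguous: verify by text content; give up unless exactly one matches.
    by_text = [n for n in candidates if n.text == target.text]
    return by_text[0] if len(by_text) == 1 else None


seen = Snapshot("FormControl", True, 100, 200, "Save")
live = [
    Snapshot("FormControl", True, 100, 210, "Save"),    # shifted down 10px
    Snapshot("FormControl", True, 130, 210, "Cancel"),  # nearby sibling
]
print(find_target(seen, live).text)  # Save
```

Returning `None` when the match is still ambiguous is a deliberate choice in this sketch: acting on the wrong element is worse than not acting.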

Structured Markdown with Node References

One of the output formats is structured Markdown where each element is tagged with a unique ID that links back to the original node in the tree. For example:

# Welcome to Example {#2}
This is a paragraph of text. {#4}
[Click here](https://example.com) {#5}

These IDs are what make interaction possible. The AI can say "click element {#5}" and Chrome knows exactly which DOM node that refers to — no fragile CSS selectors or XPath queries needed.
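
On the consuming side, resolving a reference like {#5} comes down to indexing the markers in the Markdown. Here is a minimal sketch, assuming the `{#N}` syntax shown in this article; `index_node_refs` is a hypothetical helper that maps each ID to the text of the line it appears on.

```python
import re

NODE_REF = re.compile(r"\{#(\d+)\}")


def index_node_refs(markdown: str) -> dict[int, str]:
    """Map each node ID to its line of Markdown with the markers stripped."""
    index: dict[int, str] = {}
    for line in markdown.splitlines():
        ids = [int(raw) for raw in NODE_REF.findall(line)]
        label = " ".join(NODE_REF.sub(" ", line).split())
        for node_id in ids:
            index[node_id] = label
    return index


doc = "\n".join([
    "# Welcome to Example {#2}",
    "This is a paragraph of text. {#4}",
    "[Click here](https://example.com) {#5}",
])
print(index_node_refs(doc)[5])
# [Click here](https://example.com)
```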

Selective Node ID Allowlist

Not every node needs a DOM node ID. Assigning IDs broadly grows Chrome's internal hash maps, which hurts renderer performance even after extraction is finished.

An allowlist mechanism controls which node types receive a dom_node_id:

  1. If no allowlist is set, IDs are emitted broadly (legacy behaviour).
  2. If an allowlist is set (even empty), IDs are always emitted for required cases — actionable targets like buttons and links, and metadata-linked nodes such as focused elements, selections, and label references.
  3. If the allowlist names specific node types, those types also get IDs.

The sequential content_node_id is unaffected — every node always gets one. The allowlist only controls the more expensive dom_node_id.
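
The three allowlist rules reduce to a small decision function. This sketch assumes the caller already knows whether a node is a "required case" (an actionable target or a metadata-linked node); `should_emit_dom_node_id` and its parameters are hypothetical names.

```python
from typing import Optional


def should_emit_dom_node_id(node_type: str, is_required: bool,
                            allowlist: Optional[frozenset]) -> bool:
    """Decide whether a node receives a dom_node_id."""
    if allowlist is None:
        return True  # no allowlist set: legacy behaviour, IDs emitted broadly
    if is_required:
        return True  # actionable or metadata-linked nodes always get an ID
    return node_type in allowlist  # otherwise only explicitly listed types


print(should_emit_dom_node_id("Paragraph", False, None))         # True
print(should_emit_dom_node_id("Paragraph", False, frozenset()))  # False
print(should_emit_dom_node_id("Anchor", True, frozenset()))      # True
```

Note the difference between `None` and an empty set: an empty allowlist is still "set", so it switches off the broad legacy behaviour.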

How It All Fits Together

A request comes in from one of Chrome's AI features. The Page Content Agent walks the rendering tree and produces a structured node tree. Privacy protections — password redaction, iframe redaction, paid content flagging — are applied during the tree walk itself, before any output is generated. Nothing sensitive reaches the formatters in the first place.

That tree can then be:

  1. Serialized as JSON for direct consumption
  2. Converted to structured Markdown with node ID references, enabling the AI to target specific elements on the page
  3. Passed to the inner text builder for plain text output
  4. Passed to the inner HTML builder for cleaned-up HTML
  5. Run through the document chunker to split into LLM-sized passages

When the AI needs to interact with the page — clicking a button, filling a form, following a link — it references a node ID from the Markdown output. Chrome matches that ID back to the actual element on the page, verifying by type, position, and content to handle cases where the page has changed since extraction.

In Action

Here's what a broken-down page looks like. This is the https://dejan.ai/ home page:

![DEJAN]
{#471} AI SEO
{#469} SRO
{#467} Blog
{#465} Models
{#463} *Book a call*
{#461} Sign in
# {#458} DEJAN is an AI SEO agency that makes global brands visible in AI search, chat, assistants and agents.
{#456} Our team uses use machine learning and mechanistic interpretability to understand exactly why AI systems recommend a brand, then make yours the brand they recommend.
{#454} We cover the main AI ecosystems including Google, OpenAI and Anthropic. This includes AI Mode, AI Overviews, Gemini App as well as ChatGPT, Perplexity, Copilot and Claude Models.
{#451} *Book a conference call with our senior strategy team to discuss your project in detail.*
{#449} *Schedule a Call*
{#446} *ENGAGED BY GLOBAL BRANDS.*
![Brand Logos: Virgin Australia, Atlassian, Zendesk, iStock] ![Brand Logos: Alibaba, Nickelodeon, Xero, Beyond Bank] ![Brand Logos: Griffith University, Sportsbet, Ubank, OWAYO] ![Brand Logos: Compare the Market, Australia Post, Expedia, Sixt] ![Brand Logos: JLL, Healthengine, Trip.com, ABODO] ![Brand Logos: TPG, Gizmag, Lendi, Petspiration Group]
{#437} *HOW WE WORK*
## {#435} The ARC Framework
{#433} *01*
## {#431} Association
{#429} Map Connections
{#427} We nurture a strong culture of testing and measuring. We like to know what works, what doesn’t work, and most importantly, we like to know why.
{#425} *02*
## {#423} Relevance
{#421} Find Connection Strength
{#419} We innovate all the time. It’s in our DNA. When working together with your team we’re very likely to come up with something that’s never been done before.
{#417} *03*
## {#415} Citations
{#413} Selection Rate Optimization
{#411} We see ourselves as an extension of your team and take great care to ensure that you understand our work. Our best campaigns are based on strong collaboration.
## {#409} Bayesian Content Optimizer
## {#407} Become the source AI chooses.
### {#405} Content Optimizer is a content optimization engine that aligns your page content with AI model preferences, making your brand more likely to be selected, cited, and recommended in AI search and chat. It optimizes your pages for the way AI assistants evaluate, compare, and choose sources.
## {#402} AI search has changed how content wins
{#400} Your page can rank #1 and still never appear in an AI answer.
{#398} When someone asks a question in ChatGPT, Gemini, Perplexity, Claude, or Google AI Mode, the model runs its own search, pulls a handful of competing pages, and decides — in a single pass — which source to trust, which passage to quote, and which brand to recommend. That decision isn’t driven by keywords or backlinks. The model weighs clarity, relevance, structure, evidence, specificity, and how completely your content answers the question. Most pages were never written for that evaluation.
{#396} *Content Optimizer is built for it.*
## {#393} From one page to your whole site
{#391} Optimize a single page against a single query, or batch hundreds of page-and-query pairs in one pass. Content Optimizer reuses its research across the batch, so overlapping competitors are analyzed once — and you get a portfolio-level view of where you’re winning and where you’re not.
## {#388} What you get
{#386} Every run turns analysis into changes your team can act on.
## {#384} Competitive source analysis
{#382} See your page measured against the exact competitors an AI would weigh for a query, and understand why one source is preferred over another.
## {#380} Rank factor insights
{#378} See which content attributes helped or hurt your page in the model’s evaluation — and where the next gain is most likely to come from.
## {#376} Content briefs
{#374} Optimization results turned into a clear, actionable editorial brief — the concrete page edits your writers can execute straight away.
## {#372} Citation-focused content improvements
{#370} Sharpen the specific passages AI systems are most likely to quote, cite, or summarize when they answer on your topic.
## {#368} An optimization narrative
{#366} A plain-English summary of what worked, what didn’t, and the key insight from the run — so anyone on the team can follow the reasoning.
{#364} *Optimizes for the decision, not the ranking*
{#362} Traditional SEO optimizes for where you sit on a results page. Content Optimizer optimizes for something different: whether an AI grounding system would choose to quote you.
{#360} The mechanism is direct. When an AI assistant answers a question, it compares competing sources and selects the most quotable passage. Content Optimizer simulates that exact decision with an AI {#359} *ranker* {#358} — then iteratively rewrites your page, or the snippet an AI would lift from it, until the ranker prefers your content over the competition.
{#356} It isn’t a checklist or a static score. It’s a measured contest, run round after round, until your page is the one the model picks.
{#354} *This is not traditional SEO*
{#352} It works alongside your SEO — but it optimizes for a different moment in the user’s journey.
## {#350} Traditional SEO
1. {#348} Optimizes for position on a search results page
2. {#346} Targets crawlers and ranking algorithms
3. {#344} Measured by keyword rankings and clicks
4. {#342} Guided by general best-practice checklists
## {#338} Content Optimizer
1. {#336} Optimizes for selection inside an AI-generated answer
2. {#334} Targets the model’s source-evaluation step
3. {#332} Measured by whether the AI ranker prefers your page
4. {#330} Guided by a measured, round-by-round contest
![Content Optimizer]
## {#325} From baseline to the top of the set
{#323} A single, transparent loop you can watch round by round.
## {#321} Choose a query or topic
{#319} Start with the question, entity, or search intent you want your page to win.
## {#317} Assemble the competitive set
{#315} Content Optimizer pulls the live results for that query and gathers the competing pages an AI assistant would actually encounter and weigh.
## {#313} Establish a baseline
{#311} An AI ranker scores your page against those competitors, using multiple independent samples for a stable, trustworthy starting rank.
## {#309} Optimize round by round
{#307} Each round, the engine forms a hypothesis, applies a targeted edit, and re-runs the ranker. Changes that improve your rank are kept; the rest are discarded.
## {#305} Converge on the winning version
{#303} The loop repeats until your content is the preferred source — or until you’ve seen exactly which changes move the needle and which don’t.
## {#301} Receive the brief
{#299} When the run finishes, you get a plain-English narrative of what worked and a content brief of concrete edits to apply.
{#297} ***Client Success: OWAYO*** ![Client Success: Owayo]
| {#294} *Metric* | {#292} *Apr 15, 2026* | {#290} *May 31, 2026* | {#288} *Percentage Points Up* | {#286} *% Increase* |
| {#283} *Share of Voice* | {#281} 2.18% | {#279} 3.87% | {#277} +1.69 | {#275} +77.52% |
| {#272} *Mention Share* | {#270} 2.06% | {#268} 4.37% | {#266} +2.31 | {#264} +112.14% |
| {#261} *Citation Share* | {#259} 2.30% | {#257} 3.38% | {#255} +1.08 | {#253} +46.96% |
## {#249} BACKGROUND
{#247} At the start of the AI Visibility campaign {#246} OWAYO {#244} wasn’t being recommended in AI assistant chat sessions and AI Mode for audiences in the USA.
## {#242} AUDIT
{#240} Using our bayesian content optimizer we found that the brand was overly EU-centric causing models to withhold recommendations for the audiences in the USA.
## {#238} CAMPAIGN
{#236} An on-site optimisation followed by a 6 month off-site brand alignment campaign resulted in OWAYO’s AI visibility by up to 90% per entity and 2% global uplift for all targeted entities.
{#234} **FAQs**
{#232} *Does this replace my SEO?*
{#230} No — it complements it. Traditional SEO gets your page into the competitive set an AI assistant considers. Content Optimizer helps you win selection once you’re there, so you’re the source that gets quoted and recommended.
{#228} *Which AI models does it optimize for?*
{#226} It models the source-selection behavior of the major AI search systems — Google AI Mode, ChatGPT search, Perplexity, Gemini, and Claude — and gives you rank-factor attribution broken down per model, so you can see what works where.
{#224} *Do I have to rewrite my whole page?*
{#222} No. Snippet mode tunes a single extractable passage. Page mode proposes targeted, line-level edits rather than a full rewrite. You always decide what to apply.
{#220} *Will optimized content be penalized by Google?*
{#218} No. The changes improve clarity, structure, evidence, and specificity — the same qualities that serve human readers. There’s no keyword stuffing and no manipulation; the page simply answers the question better.
{#216} *How long does a run take?*
{#214} A single snippet run completes quickly. Page-mode and batch runs take longer because they make more changes and test more competitors. Every run ends with a narrative summary and a content brief.
{#212} *Can my editorial team keep control?*
{#210} Yes. Human-in-the-loop mode lets your team choose each change and write it themselves, while the AI ranker keeps an objective score of whether it actually improved your standing.
{#208} *What do I need to get started?*
{#206} A page — or a set of pages — and the queries you want to win. We handle the competitive research, scraping, ranking, and reporting.
## {#204} AI Visibility Philosophy & Approach
![AI SEO Process]
{#200} DEJAN’s methodology transcends traditional AI SEO, diving into the core mechanics of LLMs to provide actionable intelligence for AI visibility. {#199} We understand that AI models, while appearing intelligent, operate on statistical probabilities and learned associations. Our tools are designed to surface these underlying mechanisms, providing clarity on how and why models make certain decisions.
{#197} *Deep Understanding*
{#195} We move beyond surface-level metrics to analyze log probabilities, token flow, and decision-making junctions within LLMs, identifying precise points for optimization.
{#193} *Actionable*
{#191} Our insights translate directly into actionable strategies for content creation, internal linking, and even influencing user prompting patterns, ensuring your brand’s message resonates effectively with AI.
{#189} *Testing.*
{#187} We nurture a strong culture of testing and measuring. We like to know what works, what doesn’t work, and most importantly, we like to know why.
{#185} *Innovation.*
{#183} We innovate all the time. It’s in our DNA. When working together with your team we’re very likely to come up with something that’s never been done before.
{#181} *Collaboration.*
{#179} We see ourselves as an extension of your team and take great care to ensure that you understand our work. Our best campaigns are based on strong collaboration.
{#177} *Meet our core team*
{#175} We’re an all-senior team with experience in a wide range of projects and industries.
![Mike Jolly]
{#171} *MIKE JOLLY*
{#169} *DIRECTOR OF STRATEGY*
![Blake Walsh]
{#165} *BLAKE WALSH*
{#163} *SEO*
![Giordano Chng]
{#159} *GIORDANO CHNG*
{#157} *SEO*
![Liam Buttery]
{#153} *LIAM BUTTERY*
{#151} *SEO*
![Dan Petrovic]
{#146} *DAN PETROVIC*
{#143} *AI SEO*
![Martin Reed]
{#139} *MARTIN REED*
{#137} *TECHNICAL SEO*
![Bianca Hall]
{#133} *BIANCA HALL*
{#131} *PUBLIC RELATIONS*
![Alex Petrovic]
{#127} *ALEX PETROVIC*
{#125} *SEO*
![Danielle White]
{#121} *DANIELLE WHITE*
{#119} *OPERATIONS*
![Milos Dosen]
{#115} *MILOS DOSEN*
{#113} *CFO*
![Josip Ivanovic]
{#109} *JOSIP IVANOVIC*
{#107} *DEVELOPER*
![Nemek Nowaczyk]
{#103} *NEMEK NOWACZYK*
{#101} *PPC*
![Dragan Grubacki]
{#97} *DRAGAN GRUBACKI*
{#95} *TECHNICAL SEO*
![Finn Arrowsmith]
{#91} *FINN ARROWSMITH*
{#89} *OUTREACH*
{#87} *We were given our very own bespoke internal link recommendation engine that leverages world-class language models and data science. It’s one thing to theorize about the potential of machine learning in SEO, but it’s entirely another to witness it first-hand. It changed my perspective on what’s possible in enterprise SEO.*
![Scott Schulfer]
{#83} Scott Schulfer
{#81} Senior SEO Manager
{#79} **Zendesk**
![Brand Logos: Temple & Webster, Jim's Group, Pepperstone, Flight Centre] ![Brand Logos: Containers for Change, iSelect, Mater Foundation, inkStation] ![Brand Logos: Leonardo.ai, KingGee, SAE, AIB] ![Brand Logos: Chemist Warehouse, New Atlas, ITP, Hard Yakka, Redspot] ![Brand Logos: Carzoos, Oscar Wylee, ABC]
{#71} **Featured In**
{#69} Dan Petrovic, an academic and consultant on {#68} SEO and generative AI {#66} , said Google’s size, expertise and massive trove of search data gave it a massive advantage, but that Gemini 3 Pro would probably be a more expensive model to run.
{#64} Tim Biggs, The Sydney Morning Herald
![The Sydney Morning Herald Logo]
{#58} Dan Petrovic made a super write up around Chrome’s latest embedding model with all the juicy details on his blog. Great read.
##### {#56} JASON MAYES
##### {#54} WEB AI LEAD
## {#52} GOOGLE
{#50} *GOOGLE WEB AI*
![Jason Mayes, Google]
![MOZ Top 10] {#42} Featured in “ {#41} Moz Top 10 {#39} “, {#38} twice {#36} .
![TLDR Tech AI]
![MOZ Logo]
{#28} Moz Recommended Agency
{#26} *Book a conference call with our senior strategy team to discuss your project in detail.*
{#24} *Schedule a Call*
{#21} *DEJAN*
{#19} AI SEO {#17} SRO {#15} Blog {#13} Models {#11} Book a call {#9} Concepts {#7} · {#6} OKF bundle
{#3} DEJAN SEO PTY LTD, trading as DEJAN AI.

---

Dan Petrovic · Jul 05, 02:40