Google’s New URL Context Tool

Google has just released a new tool that lets Gemini fetch text directly from a supplied page. OpenAI has had this ability for a while, but for Google it is completely new. Previously, their models were limited to the Search Grounding tool alone.

Gemini now combines tools that let it search the web and then deeply “read” specific webpages. This allows it to ground its responses in real-world data. Let’s explore two key internal capabilities, a search tool and a browsing tool (URL context), and understand how they interact, especially when “Grounding with Google Search” is enabled.

The Core: Understanding Web Content with “URL Context” (the browse Tool)

At its heart, Gemini’s ability to understand the internet relies on what can be termed “URL Context.” This means it can take a specific web address (URL), access its content, and understand what’s written there. For an AI like Gemini, this is often managed through an internal function, let’s call it browse for simplicity.

The definition for such a tool is clear:

def browse(urls: list[str]) -> list[BrowseResult]:
    """Print the content of the urls.
     Results are in the following format:
     url: "url"
     content: "content"
     title: "title"
    """

What this browse tool does: When Gemini is provided with one or more URLs, it uses this browse capability to visit each page. It then extracts the main textual content and the page’s title. This is akin to the AI carefully reading a specific document.
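The signature above refers to a BrowseResult type that the definition itself does not show. Here is a minimal sketch of what such a record and its printed form might look like, assuming only the three fields named in the docstring. The dataclass and the format_result helper are hypothetical illustrations, not Gemini's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class BrowseResult:
    """Hypothetical record mirroring the three fields named in the docstring."""
    url: str
    content: str
    title: str


def format_result(result: BrowseResult) -> str:
    """Render one result in the url/content/title layout the docstring describes."""
    return (
        f'url: "{result.url}"\n'
        f'content: "{result.content}"\n'
        f'title: "{result.title}"'
    )


# Placeholder values only; a real tool would fill these from the fetched page.
example = BrowseResult(
    url="https://dejan.ai/blog/gemini-grounding/",
    content="(page text would appear here)",
    title="(page title would appear here)",
)
print(format_result(example))
```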

An Example of browse in Action:
Imagine a user asks Gemini: “Can you summarize the article at https://dejan.ai/blog/gemini-grounding/?”

Gemini’s internal process would then involve executing a command similar to this:

print(browse(urls=['https://dejan.ai/blog/gemini-grounding/']))

Which would yield:

  • URL: https://dejan.ai/blog/gemini-grounding/
  • Title: New insights into how Google grounds its LLM, Gemini. – DEJAN
  • Content: (A summary of the article, including points about Gemini’s internal indexing for search results, its operational loop of thinking and action stages, the use of Google Search and Conversation Retrieval tools, verification principles, error handling, and the importance of contextual parameters like date, time, and location.)

With this information, Gemini can then synthesize a summary for the user, citing the article as the source for its information.

Broadening the Scope: The concise_search Tool

But what happens if the user doesn’t provide a specific URL? For instance, a query like: “What AI models does Dejan AI offer?” This is where Gemini’s search capability, perhaps through an internal tool like concise_search, becomes essential.

The definition of such a tool might look like:

def concise_search(query: str, max_num_results: int = 3):
  """Does a search for the query and prints up to the max_num_results results. Results are _not_ returned, only available in outputs."""

What this concise_search tool does: It takes the user’s query, performs a web search, and prints a list of relevant URLs, typically with snippets of content. As the docstring notes, the results are not returned as a value; they are only visible in the tool output. This is like Gemini consulting a vast digital library catalog.

An Example of concise_search:
For the query “dejan ai models”, Gemini would internally execute:

print(concise_search(query="dejan ai models", max_num_results=3))

The Output:
Gemini receives a list of search results. For “dejan ai models,” these results include links to DEJAN’s “Our Models” page, Dan Petrovic’s Hugging Face profile listing various models, and an article about LinkBERT. These results often point to URLs like https://vertexaisearch.cloud.google.com/..., which are part of Google’s infrastructure for providing grounded search results.

The Synergy: “Grounding with Google Search”

When “Grounding with Google Search” is enabled for Gemini, it doesn’t just pick one tool over the other; it orchestrates a sophisticated workflow. This is guided by a set of internal instructions that tell Gemini how to combine these capabilities.

These instructions typically emphasize:

  1. If a user asks a question without a specific URL, Gemini should first use its search tool.
  2. Then, it should analyze the search results, paying close attention to the vertexaisearch URLs.
  3. Finally, it should use its browse tool (URL context) to deeply read the content of these specific search result pages.

The Grounded Workflow Illustrated:

Let’s take the query: “What AI models does Dejan AI offer?”

  1. Gemini’s Internal Analysis: The query is general and no specific URL is provided. “Grounding with Google Search” protocols apply. The first step is to use the search tool.
  2. Gemini Executes (Search): print(concise_search(query="dejan ai models", max_num_results=3)) (Output is similar to the example shown earlier)
  3. Gemini’s Internal Analysis (Post-Search): The search provides several promising URLs. The instructions guide Gemini to prioritize these for browsing.
  4. Gemini Executes (Browse Search Results): print(browse(urls=['https://vertexaisearch.cloud.google.com/grounding-api-redirect/AbF9wXGnLhpm8jDi9HywZ6LpSXte7g2BbnovULh-PjWTTHbKu7MaeQLEC5ikMi9BiLmy8JFWX0ftCOKE135ogWll7LzmTRU-hbd2Ne5JwW0POxs=', 'https://vertexaisearch.cloud.google.com/grounding-api-redirect/AbF9wXEXv7QWCPXcXYb3xB_Ol5UZHCCbOWoEpfdmbqMr4lnp-sG3JhMWcyy-mlEoMrOVtTfPeNK9Ysi5hBOgqcIyd4B__ehxE05mkkHogOQI_dyDOwYVkvP1'])) # Example with 2 URLs from search (Note: This is a conceptual representation. The actual URLs browsed would be based on the live output of the search.)
  5. Gemini Synthesizes and Responds: Having “read” the content of the pages from the search results (e.g., DEJAN’s “Our Models” page and the Hugging Face profile), Gemini can now construct a detailed answer. For instance: “Dejan AI offers several specialized machine learning models, primarily for SEO applications. These include a Query Intent Classifier (with variants like Intent-XS and Intent-XL built on Albert base models) to determine user intent behind search queries (e.g., Commercial, Informational, Navigational). They have also developed LinkBERT (with mini and XL versions), a fine-tuned BERT model that predicts natural link placements in web content, simulating author behavior and aiding in tasks like anchor text suggestion and inorganic link detection. Additionally, they offer a Query Form Quality Classifier, an improvement over Google’s previous work, using ALBERT architecture to identify ambiguous queries. Their Hugging Face profile also lists models like ‘ai-detection-small’, ‘QDF-large’ (Query Deserves Freshness), and ‘substance’ classifiers.” Gemini would cite the specific sources from the search.
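The search-then-browse flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: grounded_browse, fake_search, and fake_browse are hypothetical names, the stand-in tools return placeholder data, and the redirect URLs are made up. The one detail taken from the article is that browsed URLs are expected to start with "https://vertexaisearch".

```python
from typing import Callable

# Prefix the article says search-result URLs start with.
GROUNDING_PREFIX = "https://vertexaisearch"


def grounded_browse(
    query: str,
    search: Callable[[str, int], list[str]],
    browse: Callable[[list[str]], list[dict]],
    max_num_results: int = 3,
) -> list[dict]:
    """Search first, then browse only the URLs that look like grounding redirects."""
    urls = search(query, max_num_results)
    to_browse = [url for url in urls if url.startswith(GROUNDING_PREFIX)]
    if not to_browse:
        return []
    # Multiple URLs are browsed in a single call.
    return browse(to_browse)


def fake_search(query: str, max_num_results: int) -> list[str]:
    """Stand-in search tool returning made-up URLs (hypothetical)."""
    results = [
        "https://vertexaisearch.cloud.google.com/grounding-api-redirect/EXAMPLE-1",
        "https://example.com/not-a-grounding-url",
        "https://vertexaisearch.cloud.google.com/grounding-api-redirect/EXAMPLE-2",
    ]
    return results[:max_num_results]


def fake_browse(urls: list[str]) -> list[dict]:
    """Stand-in browse tool returning placeholder page data (hypothetical)."""
    return [{"url": url, "title": "(title)", "content": "(content)"} for url in urls]


pages = grounded_browse("dejan ai models", fake_search, fake_browse)
for index, page in enumerate(pages, start=1):
    print(index, page["url"])
```

Passing the tools in as arguments keeps the sketch runnable without any network access; real search and browse functions could be swapped in without changing the control flow.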

Implications for Content and SEO:

Understanding this process reveals how crucial high-quality, clearly structured content is:

  • Discoverability & Readability: Content must be discoverable by search and easily parsable by AI tools like browse. Clear headings, good organization, and factual accuracy are key.
  • Answering Questions Directly: Gemini’s ability to ground responses means content that directly and comprehensively answers user questions is more likely to be leveraged.
  • The Role of vertexaisearch: These URLs indicate that Google’s systems have processed and identified specific content as authoritative or relevant for grounding.
  • Transparency and Trust: Gemini’s process, when it includes citing the URLs used (via browse), builds trust by showing the origin of the information.

By combining broad web search with deep reading of specific pages, Google’s Gemini can provide answers that are not only comprehensive but also grounded in the information available on the internet, making it a powerful tool for information retrieval and synthesis.

Does it really visit the page?

No. Our tests suggest Google fetches page information from internal storage. We set up a server-side request logger for testing. When prompted, Gemini “fetched” the page text, but the server log files recorded no visit.
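The logger used for the test is not shown here. As a rough idea of what such a logger involves, here is a minimal Python stand-in that records each GET request as one JSON line. Everything in it is an assumption for illustration: the function and class names, the requests.log file name, the port, and the subset of fields recorded.

```python
import json
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_log_record(client_ip: str, method: str, path: str, headers: dict) -> dict:
    """Collect the request details worth keeping for a fetch test."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S"),
        "ip": client_ip,
        "request_method": method,
        "uri": path,
        "user_agent": lowered.get("user-agent", ""),
        "headers": dict(headers),
    }


class LoggingHandler(BaseHTTPRequestHandler):
    """Serves a tiny test page and appends every GET request to a log file."""

    def do_GET(self):
        record = build_log_record(
            self.client_address[0], self.command, self.path, dict(self.headers.items())
        )
        with open("requests.log", "a", encoding="utf-8") as log_file:
            log_file.write(json.dumps(record) + "\n")
        body = b"<html><head><title>Fetch test</title></head><body>Test page</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start the logging server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), LoggingHandler).serve_forever()
```

Calling serve() on a publicly reachable host and then asking an assistant to read the page shows whether a live request actually arrives.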

In an additional test, we changed the title of a page and asked Gemini to fetch the latest information from that URL. It returned the old title.

Finally, this very article was published and Gemini failed to fetch its content on request. Instead, the usual generic tool response was supplied to the model:

“I’m sorry. I’m not able to access the website(s) you’ve provided. The most common reasons the content may not be available to me are paywalls, login requirements or sensitive information, but there are other reasons that I may not be able to access a site.”

In contrast, when you send GPT to the page there’s a clear entry in our log file:

{"time":"2025-05-21 10:09:55","ip":"52.230.164.176","host":"","forwarded_for":"","user_agent":"Mozilla\/5.0 AppleWebKit\/537.36 (KHTML, like Gecko); compatible; ChatGPT-User\/1.0; +https:\/\/openai.com\/bot","request_method":"GET","uri":"\/test.php","query_string":"","referer":"","accept":"text\/html,application\/xhtml+xml,application\/xml;q=0.9,image\/avif,image\/webp,image\/apng,\/;q=0.8,application\/signed-exchange;v=b3;q=0.9","accept_lang":"en-US,en;q=0.9","accept_enc":"gzip, deflate, br","content_type":"","content_length":"","cookies":"","origin":"","protocol":"HTTP\/1.1","port":"443","https":"on","HTTP_HOST":"dejan.ai","HTTP_USER_AGENT":"Mozilla\/5.0 AppleWebKit\/537.36 (KHTML, like Gecko); compatible; ChatGPT-User\/1.0; +https:\/\/openai.com\/bot","HTTP_ACCEPT_LANGUAGE":"en-US,en;q=0.9","HTTP_ACCEPT_ENCODING":"gzip, deflate, br","HTTP_ACCEPT":"text\/html,application\/xhtml+xml,application\/xml;q=0.9,image\/avif,image\/webp,image\/apng,\/;q=0.8,application\/signed-exchange;v=b3;q=0.9","HTTP_X_DATADOG_TRACE_ID":"4310971778737635183","HTTP_X_DATADOG_PARENT_ID":"17309162417739219663","HTTP_X_DATADOG_SAMPLING_PRIORITY":"2","HTTP_X_DATADOG_TAGS":"_dd.p.tid=682da66d00000000,_dd.p.dm=-4","HTTP_TRACEPARENT":"00-682da66d000000003bd3a5fe0463ff6f-f0367f82d4e342cf-01","HTTP_TRACESTATE":"dd=p:f0367f82d4e342cf;s:2;t.dm:-4;t.tid:682da66d00000000","HTTP_X_OPENAI_TRAFFIC_SOURCE":"user","HTTP_X_OPENAI_ORIGINATOR":"browse","HTTP_X_OPENAI_ORIGINATOR_ENV":"prod","HTTP_X_OPENAI_PRODUCT_SKU":"unknown","HTTP_X_OPENAI_INTERNAL_CALLER":"browse","HTTP_X_REQUEST_ID":"76373afa-8b1c-4853-89a6-56dd50627308","HTTP_X_ENVOY_EXPECTED_RQ_TIMEOUT_MS":"14460","HTTP_X_HTTPS":"1"}

And here’s Anthropic’s Claude:

{"time":"2025-05-21 10:14:27","ip":"34.34.241.48","host":"","forwarded_for":"","user_agent":"Mozilla\/5.0 AppleWebKit\/537.36 (KHTML, like Gecko; compatible; Claude-User\/1.0; +Claude-User@anthropic.com)","request_method":"GET","uri":"\/test.php","query_string":"","referer":"","accept":"\/","accept_lang":"","accept_enc":"gzip, deflate","content_type":"","content_length":"","cookies":"","origin":"","protocol":"HTTP\/1.1","port":"443","https":"on","HTTP_HOST":"dejan.ai","HTTP_ACCEPT":"\/","HTTP_ACCEPT_ENCODING":"gzip, deflate","HTTP_CONNECTION":"keep-alive","HTTP_USER_AGENT":"Mozilla\/5.0 AppleWebKit\/537.36 (KHTML, like Gecko; compatible; Claude-User\/1.0; +Claude-User@anthropic.com)","HTTP_X_HTTPS":"1"}

Perhaps by coincidence, right after prompting Grok there were a number of rogue, unsigned requests from 94.156.41.18, 45.130.33.251, 85.254.114.95, 207.90.46.241, 45.145.136.243 and 157.97.127.99:

{"time":"2025-05-21 10:16:03","ip":"94.156.41.18","host":"","forwarded_for":"","user_agent":"Mozilla\/5.0 (iPhone; CPU iPhone OS 18_0 like Mac OS X) AppleWebKit\/605.1.15 (KHTML, like Gecko) Version\/18.0 Mobile\/15E148 Safari\/604.1","request_method":"GET","uri":"\/test.php","query_string":"","referer":"","accept":"text\/html,application\/xhtml+xml,application\/xml;q=0.9,*\/*;q=0.8","accept_lang":"en-US,en;q=0.9","accept_enc":"gzip, deflate, br","content_type":"","content_length":"","cookies":"","origin":"","protocol":"HTTP\/1.1","port":"443","https":"on","HTTP_HOST":"dejan.ai","HTTP_SEC_FETCH_DEST":"document","HTTP_USER_AGENT":"Mozilla\/5.0 (iPhone; CPU iPhone OS 18_0 like Mac OS X) AppleWebKit\/605.1.15 (KHTML, like Gecko) Version\/18.0 Mobile\/15E148 Safari\/604.1","HTTP_ACCEPT":"text\/html,application\/xhtml+xml,application\/xml;q=0.9,*\/*;q=0.8","HTTP_SEC_FETCH_SITE":"none","HTTP_SEC_FETCH_MODE":"navigate","HTTP_ACCEPT_LANGUAGE":"en-US,en;q=0.9","HTTP_PRIORITY":"u=0, i","HTTP_ACCEPT_ENCODING":"gzip, deflate, br","HTTP_X_HTTPS":"1"}
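Log entries like the ones above can be sorted by the token each assistant puts in its user agent. The sketch below is a simple illustration: classify_entry and the KNOWN_AGENTS table are hypothetical names, and the two tokens come straight from the ChatGPT and Claude entries shown in this article. User agents are easy to spoof, so treat the label as a hint rather than proof.

```python
import json

# Tokens taken from the user_agent values in the log entries above.
KNOWN_AGENTS = {
    "ChatGPT-User": "OpenAI (ChatGPT)",
    "Claude-User": "Anthropic (Claude)",
}


def classify_entry(log_line: str) -> str:
    """Label one JSON log line by the assistant token in its user agent."""
    entry = json.loads(log_line)
    user_agent = entry.get("user_agent", "")
    for token, label in KNOWN_AGENTS.items():
        if token in user_agent:
            return label
    # No recognised token: the request did not identify itself.
    return "unsigned"


sample = (
    '{"ip": "34.34.241.48", "user_agent": "Mozilla/5.0 AppleWebKit/537.36 '
    '(KHTML, like Gecko; compatible; Claude-User/1.0; +Claude-User@anthropic.com)"}'
)
print(classify_entry(sample))  # prints: Anthropic (Claude)
```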

Internal Tool Instructions

I managed to get hold of Gemini’s internal tool instructions:

***Instruction when answering questions***.
1. Always try to generate tool_code blocks before responding, gather as much information as you can before answering the questions

2. If there is no url in the user query, DO NOT COME UP WITH A URL DIRECTLY TO BROWSE. Instead, use the search tool first, then browse the urls you get from the search tool.

3. Always try to use the browse tool after the search tool, this can help you get more relevant information. Do the following when you want to browse any url based on the search result you get

4. Recognize the urls in the search result, which shown in the tool output. The urls should start with "https://vertexaisearch"

5. Browse the urls in step 4, use print statement to see the result.
# Guidelines for browse tool

When you are asked to browse multiple urls, you can browse multiple urls in a single call.
Note: Always use the tool_code block first in order to use the browse tool to answer the user query.

The current time is Wednesday, May 21, 2025 at 7:06 AM UTC.


# Guidelines for citations

Each sentence in the response which refers to a browsed result MUST end with a citation, in the format "Sentence. [INDEX]", where INDEX is a browsed result index. Use commas to separate indices if multiple browsed url sources are used. If the sentence does not refer to any browsed urls content, DO NOT add a citation.
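To make the citation rule concrete, here is a tiny sketch of how a sentence might be tagged. The add_citation helper is hypothetical, and the exact spacing between comma-separated indices is an assumption, since the instructions only say to separate them with commas.

```python
def add_citation(sentence: str, indices: list[int]) -> str:
    """Append browsed-result indices to a sentence, or leave it unchanged."""
    if not indices:
        # Sentences that use no browsed content get no citation.
        return sentence
    joined = ", ".join(str(index) for index in indices)
    return f"{sentence} [{joined}]"


print(add_citation("LinkBERT predicts natural link placements.", [1, 2]))
# prints: LinkBERT predicts natural link placements. [1, 2]
```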

Appendix: A Developer’s Look at Gemini’s URL Context Tool (Gemini API)

While the previous sections described Gemini’s internal logic and tools in a more conceptual way, Google also provides specific documentation for developers using the Gemini API. This documentation sheds more light on the official “URL context tool,” which aligns with the browse functionality discussed earlier.

Experimental Feature with Powerful Applications

According to Google’s Gemini API documentation, the URL context tool is an experimental feature designed to let developers provide Gemini with URLs as additional context directly within a prompt. The model can then retrieve content from these URLs to inform and enhance its responses. This is particularly useful for a variety of tasks, including:

  • Extracting key data points or talking points from articles.
  • Comparing information across multiple web links.
  • Synthesizing data from several online sources.
  • Answering questions based on the content of specific pages.
  • Analyzing web content for purposes like drafting job descriptions or creating test questions.

Two Primary Modes of Operation

Developers can leverage the URL context tool in two main configurations:

  1. URL Context Only: In this mode, developers provide specific URLs directly in their prompt for Gemini to analyze. For example, a prompt might be, “Summarize this document: [YOUR_URL]” or “Extract key features from the product description on this page: [YOUR_URL].” Gemini then focuses its analysis solely on the provided URLs.
  2. Grounding with Google Search + URL Context: This more comprehensive mode allows Gemini to first use its Google Search capabilities to find relevant information online if no specific URLs are given, or to augment URLs that are provided. After the search phase, it then employs the URL context tool to read and understand the content of the most relevant search results (or the provided URLs). A prompt might be, “Recommend 3 books for beginners to learn more about [YOUR_SUBJECT],” where Gemini would search for relevant books and then use URL context to understand summaries or reviews.
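The difference between the two modes comes down to which tools are listed in the request. The sketch below builds a request body for each mode using only the standard library. The build_request_body helper is hypothetical, and the "url_context" and "google_search" field names are written as I recall them from Google's REST examples, so check them against the current API reference before relying on them.

```python
import json


def build_request_body(prompt: str, with_search: bool = False) -> dict:
    """Build a generateContent-style body with URL context, optionally plus search."""
    # Field names are assumptions based on Google's published REST examples.
    tools = [{"url_context": {}}]
    if with_search:
        tools.append({"google_search": {}})
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "tools": tools,
    }


# Mode 1: URL context only, with the URL placed directly in the prompt.
url_only = build_request_body("Summarize this document: [YOUR_URL]")

# Mode 2: Grounding with Google Search plus URL context.
search_plus_url = build_request_body(
    "Recommend 3 books for beginners to learn more about [YOUR_SUBJECT]",
    with_search=True,
)
print(json.dumps(search_plus_url, indent=2))
```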

Technical Implementation and Metadata

The Gemini API documentation provides code examples (Python, JavaScript, REST) showing how developers can integrate this. For instance, in Python it involves the google.genai package and its tool types, specifically types.UrlContext.

A key aspect highlighted is the url_context_metadata that can be returned in Gemini’s response. This metadata provides information about the URLs that were retrieved and processed, including their status (e.g., success or failure in retrieval). This metadata can also show the actual URLs that were retrieved, which sometimes might be vertexaisearch.cloud.google.com/grounding-api-redirect/... URLs, indicating that the content was processed through Google’s grounding infrastructure, even if the original URL was different.

Supported Models and Limitations

According to the documentation at the time of writing, this experimental URL context tool is supported by models such as:

  • gemini-2.5-pro-preview-05-06
  • gemini-2.5-flash-preview-05-20
  • gemini-2.0-flash
  • gemini-2.0-flash-live-001

Being an experimental feature, it has some limitations:

  • It can consume up to 20 URLs per request.
  • For optimal results, it’s recommended for use with standard web pages rather than multimedia content like YouTube videos.
  • During the experimental phase, its usage is free, with billing anticipated later.
  • Quotas are in place: 1500 queries per day per project via the Gemini API and 100 queries per day per user in Google AI Studio.

This developer-focused information from the Gemini API documentation confirms the core capabilities discussed earlier: Gemini’s ability to directly process URL content is a fundamental feature, whether invoked by an agent through a browse command or by a developer through the url_context tool in the API. The “Grounding with Google Search” feature then leverages this URL processing ability to provide even more comprehensive and contextually aware responses by first discovering relevant URLs through search.

