An experiment testing Google's AI Mode suggests it may rely on Google's existing index or cached web data rather than performing live HTTP requests for all URLs.
When you use Google's AI Mode to fetch a web page, you might assume it is browsing the live internet in real time. But a recent experiment suggests that this is not always how it works.
To test this, a brand-new web page was uploaded to a live server. A standard connection check confirmed the page was live and fully accessible. However, when a Python script was run inside Google's AI Mode to fetch that same new web address, it returned a "not found" error.
The script was then changed to fetch an older, already indexed page on the same server. This time, the AI successfully retrieved the page and its content.
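The exact script used in the experiment is not reproduced here, so the following is only a minimal sketch of what such a check could look like. The function name `check_url`, the `User-Agent` string, and the `opener` parameter (which exists so the function can be exercised without network access) are all assumptions made for illustration.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch `url` and return (status_code, start_of_body).

    Returns (code, "") when the server answers with an HTTP error such as 404,
    and (None, reason) when no HTTP response is received at all.
    """
    # The User-Agent value is a placeholder, not the one used in the experiment.
    request = urllib.request.Request(url, headers={"User-Agent": "fetch-check/0.1"})
    try:
        with opener(request, timeout=timeout) as response:
            # Read only the first couple of kilobytes; enough to confirm content.
            body = response.read(2048).decode("utf-8", errors="replace")
            return response.status, body[:200]
    except urllib.error.HTTPError as err:
        # The server (or whatever sits in between) answered with an error status.
        return err.code, ""
    except urllib.error.URLError as err:
        # DNS failure, refused connection, timeout, and similar network errors.
        return None, str(err.reason)
```

Running `check_url` against the new page's address and then against the older page's address, once from an ordinary machine and once inside AI Mode's Python environment, would give the kind of side-by-side comparison described above.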
This suggests that Google's AI Mode does not perform a fresh, live request to the target server every time. Instead, it seems to consult Google's existing search index or cache first. If a page is brand-new and has not been crawled yet, the AI behaves as though the page does not exist and reports a "not found" (404) error, even though the page is actually live.
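To make the hypothesis concrete, here is a toy model that contrasts a direct request with a lookup against a previously crawled snapshot. It is purely illustrative and says nothing about how Google actually implements AI Mode; the function names, the dictionaries, and the example.com URLs are all made up for this sketch.

```python
def direct_fetch(url, live_web):
    """Model of a real-time request: the answer reflects the server's current state."""
    if url in live_web:
        return 200, live_web[url]
    return 404, ""


def snapshot_fetch(url, snapshot):
    """Model of the hypothesised behaviour: only an earlier crawl is consulted."""
    if url in snapshot:
        return 200, snapshot[url]
    # The page may well exist on the server, but the snapshot has never seen it.
    return 404, ""


# Hypothetical state of the world: two live pages, only one of them crawled.
live_web = {
    "https://example.com/old.html": "old page",
    "https://example.com/new.html": "new page",
}
snapshot = {"https://example.com/old.html": "old page"}

for url in live_web:
    direct_status, _ = direct_fetch(url, live_web)
    snapshot_status, _ = snapshot_fetch(url, snapshot)
    print(url, "direct:", direct_status, "snapshot:", snapshot_status)
```

In this model the two approaches agree on the older page and disagree only on the new one, which matches the pattern seen in the experiment.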
For developers and researchers, this is a crucial distinction. It means the AI's web access is mediated through Google's existing snapshot of the internet, rather than direct, real-time browsing. If you need up-to-the-second accuracy on a newly published page, the AI might tell you it simply is not there.
I recently stumbled upon a fascinating aspect of how Google’s AI Mode (powered by a custom Gemini model) interacts with the internet. I ran a simple test, and the results suggest that instead of performing truly live fetches for all URLs, the AI Mode relies on Google’s existing index or a cached version of the web. This can lead to some surprising discrepancies when dealing with brand-new or unindexed content.
Here’s What I Did:
First, I disabled the use of search_tool and had AI Mode run Python code in its local environment.
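My experiment was straightforward: I uploaded a brand-new page to a live server, confirmed that it was accessible, and then had AI Mode fetch it with a Python script. The fetch returned a "not found" error, while the same script pointed at an older, already indexed page on the same server succeeded.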
I repeated the test with another file (test.php) and got the same result.
My Observations and Implications:
The key takeaway for me was the stark difference in how AI Mode handled the newly created page: it reported a "not found" error even though the page was live and fully accessible.
However, for a page that is likely already known to Google (indexed or cached), the AI Mode correctly fetched and reported its status and content.
This strongly suggests to me that when Google’s AI Mode (or its Python execution environment) attempts to access a URL, it doesn’t necessarily perform a fresh, live HTTP request to the target server every single time. It seems more likely that it first consults Google’s vast index or a cached representation of the web.
Why This Matters (To Me, and Maybe To You):
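This behavior has implications for anyone who relies on AI Mode to read fresh content: its web access appears to be mediated by Google’s existing snapshot of the web, so a newly published page may be reported as missing even though it is live.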