Author: Dan Petrovic
-

AI Brand Authority Index: Ranking 2.9 Million Brands by Associative Embeddedness in Gemini’s Memory
Abstract: When a large language model is asked to “name 100 brands at random,” it doesn’t produce uniform randomness. It produces a distribution shaped by its training data, revealing which brands occupy the most cognitive real estate in the model’s parametric memory. We present a methodology for quantifying brand authority in AI memory using Personalized…
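The core measurement idea in the abstract — that repeated “name brands at random” prompts yield a skewed, non-uniform distribution — can be sketched as a simple frequency tally. This is a minimal illustration, not the paper’s actual methodology; the sample lists below stand in for repeated model outputs, and the full index additionally uses the (truncated) personalized ranking method mentioned above.

```python
from collections import Counter

def brand_frequency_ranking(samples):
    """Rank brands by how often they appear across repeated
    'name N brands at random' samples. Higher frequency serves as
    a proxy for stronger embedding in the model's parametric memory."""
    counts = Counter(brand for sample in samples for brand in sample)
    total = sum(counts.values())
    return [(brand, n / total) for brand, n in counts.most_common()]

# Hypothetical samples standing in for repeated model outputs.
samples = [
    ["Nike", "Apple", "Coca-Cola"],
    ["Apple", "Sony", "Adidas"],
    ["Apple", "Coca-Cola", "Puma"],
]
ranking = brand_frequency_ranking(samples)
```

A uniform-random generator would flatten this ranking; the degree of skew is what the index quantifies.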
-

TurboQuant: From Paper to Triton Kernel in One Session
Implementing Google’s KV cache compression algorithm on Gemma 3 4B, and everything that went wrong along the way. On March 24, 2026, Google Research published a blog post introducing TurboQuant, a compression algorithm for large language model inference. The paper behind it, “Online Vector Quantization with Near-optimal Distortion Rate”, had been on arXiv since April…
-

Clickbait Titles Exploit Attention Through Latent Entities
Every clickbait title works the same way: it removes exactly one critical variable (the subject, the reason, the process, or the outcome) and charges you a click to fill in the blank. This missing variable, which we call a latent entity, is so pervasive it has become normalized, and nobody questions it anymore. You should! That…
-

Fanout Query Analysis
When AI models like Gemini, GPT or Nova answer a question using web search, they don’t just run your query as-is. They generate their own internal search queries, or fanout queries. A single user prompt can trigger multiple fanout queries as the model breaks down the question, explores subtopics and verifies information. We captured 365,920…
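The capture-and-count approach the teaser describes — one user prompt fanning out into several internal search queries — can be sketched with a toy log. The prompts and queries below are hypothetical placeholders, not examples from the 365,920-query dataset.

```python
from statistics import mean

# Hypothetical capture log: each user prompt mapped to the internal
# search (fanout) queries the model generated while answering it.
captured = {
    "best laptop for video editing": [
        "best laptops video editing 2025",
        "laptop GPU requirements video editing",
        "video editing laptop RAM recommendations",
    ],
    "is coffee good for you": [
        "coffee health benefits studies",
        "daily caffeine intake risks",
    ],
}

# Fanout rate: how many internal queries each prompt triggered.
fanout_per_prompt = {p: len(qs) for p, qs in captured.items()}
avg_fanout = mean(fanout_per_prompt.values())
```

Aggregating this ratio over a large capture set is what turns raw logs into fanout statistics.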
-

Reverse Prompting: Reconstructing Prompts from AI-Generated Text
We fine-tuned Google’s Gemma 3 (270M) to reverse the typical LLM workflow: given an AI-generated response, the model reconstructs the most likely prompt that produced it. We generated 100,000 synthetic prompt-response pairs using Gemini 2.5 Flash, trained for a single epoch on a consumer GPU, and built a Streamlit app that sweeps 24 decoding configurations…
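A sweep over 24 decoding configurations, as the teaser mentions, is typically a Cartesian grid over sampling parameters. The specific grid below is an assumption for illustration (4 temperatures × 3 top-p values × 2 repetition penalties = 24); the post’s actual parameter values are not stated in this excerpt.

```python
from itertools import product

# Hypothetical sweep grid; the actual 24 configurations may differ.
temperatures = [0.2, 0.5, 0.8, 1.0]
top_ps = [0.9, 0.95, 1.0]
repetition_penalties = [1.0, 1.2]

configs = [
    {"temperature": t, "top_p": p, "repetition_penalty": r}
    for t, p, r in product(temperatures, top_ps, repetition_penalties)
]
# 4 x 3 x 2 = 24 decoding configurations, each of which would be
# passed to the model's generate() call and scored against the target prompt.
```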
-

Rufus – Under the Hood. What Drives Amazon’s AI Shopping Assistant?
What’s Publicly Known About the Pipeline, Backend, and Response Anatomy. Rufus is not “one model that magically answers.” Public Amazon/AWS descriptions point to a multi-component system. Speculative schema of the pipeline: request → answer. Step A is input and context assembly: public descriptions indicate what customers can ask, and Amazon also describes using conversational context and (more recently) account memory…
-

Is Query Length a Reliable Predictor of Search Volume?
The answer is no. There’s a widely held intuition in SEO and ecommerce search: short queries have high volume, long queries have low volume. “laptop” gets millions of searches. “left handed ergonomic vertical mouse wireless” does not. It feels obvious. But is query length actually a reliable predictor of search volume? Or is it a…
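The claim that length does not reliably predict volume can be checked directly by correlating query word count against (log) search volume. The data below is a hand-picked hypothetical sample chosen to match the intuition; on a real dataset, the post reports the correlation breaks down.

```python
from math import log10, sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient, stdlib only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical (query, monthly search volume) pairs.
data = [
    ("laptop", 1_500_000),
    ("gaming laptop", 250_000),
    ("cheap gaming laptop", 40_000),
    ("best budget gaming laptop 2025", 8_000),
    ("left handed ergonomic vertical mouse wireless", 90),
]
lengths = [len(q.split()) for q, _ in data]
log_volumes = [log10(v) for _, v in data]
r = pearson(lengths, log_volumes)
```

On a cherry-picked sample like this, r is strongly negative; the post’s point is that a representative sample tells a different story.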
-

Search Grounding is Transient
There is a fundamental misconception about how Google’s AI search and Gemini chatbot process retrieved web content. It is widely understood that these systems use Retrieval-Augmented Generation (RAG) to search the web, pull snippets from pages, and ground their answers in factual data. However, there is a pervasive assumption that once an AI pulls in…
-
SRO & Grounding Snippets
Source: dejan.ai/blog/category/ai-seo/sro/. Author: Dan Petrovic, DEJAN AI. Posts analyzed: 5 (Sep 2025 – Feb 2026). What is SRO? SRO (Selection Rate Optimization) is a new discipline coined by DEJAN that addresses visibility in AI-powered search (Google AI Mode, Gemini Chat, AI Overviews). It is the AI-native successor to traditional SEO click-through-rate optimization. The core premise:…
-

What extraction method is Google using to build grounding snippets?
I’ve been reverse-engineering Google’s Gemini grounding pipeline (AI Mode, Gemini Chat, etc.) by examining the raw groundingSupports and groundingChunks returned by the API. Specifically, I’m interested in the snippet construction step: the part where, given a query and a retrieved web page, the system selects which sentences to include in the grounding context supplied to the…
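The grounding metadata being examined pairs a list of source chunks with supports that index into it. The fragment below mirrors that general shape (chunk list plus supports carrying segment text and chunk indices) as a minimal parsing sketch; the field values are invented, and real responses carry additional fields not shown here.

```python
# Hypothetical response fragment mirroring the shape of the API's
# grounding metadata: chunks (sources) plus supports that index into them.
grounding = {
    "groundingChunks": [
        {"web": {"uri": "https://example.com/a", "title": "Page A"}},
        {"web": {"uri": "https://example.com/b", "title": "Page B"}},
    ],
    "groundingSupports": [
        {"segment": {"text": "Claim one."}, "groundingChunkIndices": [0]},
        {"segment": {"text": "Claim two."}, "groundingChunkIndices": [0, 1]},
    ],
}

def supports_to_sources(meta):
    """Map each grounded text segment to the page titles backing it."""
    chunks = meta["groundingChunks"]
    return {
        s["segment"]["text"]: [
            chunks[i]["web"]["title"] for i in s["groundingChunkIndices"]
        ]
        for s in meta["groundingSupports"]
    }

mapping = supports_to_sources(grounding)
```

Walking this mapping across many responses is what exposes which sentences the snippet-construction step chose to ground on.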
