Category: SEO
-
Dynamic per-label thresholds for large-scale search query classification with Otsu’s method
Solving the “Which Score Is Good Enough?” Puzzle. The real-world problem: arbitrary-label search-query intent classifiers spit out a confidence score per label. On clean demos you set one global cut-off, say 0.50, and move on. In production, manual tuning per label quickly turns into a never-ending game of whack-a-mole, especially when the taxonomy is customized client-by-client (e.g., SaaS…
-
Training Gemma‑3‑1B Embedding Model with LoRA
In our previous post, Training a Query Fan-Out Model, we demonstrated how to generate millions of high-quality query reformulations without human labelling, by navigating the embedding space between a seed query and its target document and then decoding each intermediate vector back into text using a trained query decoder. That decoder’s success critically depends on…
-
Universal Query Classifier
Generalist, Open‑Set Classification for Any Label Taxonomy. We’ve developed a search query classifier that takes any list of labels you hand it at inference time and tells you which ones match each search query. No retraining, ever. Just swap in new labels as they appear. Old workflow | Pain | New workflow: Build + label data + retrain…
-
Another failed attempt to kill SEO
If Marie Haynes, Barry Schwartz or Cindy Krum had written an article declaring SEO dead and proposing we rebrand our industry, you’d seriously consider it. Wouldn’t you? What about Zach Cohen and Seema Amble? I don’t know either. Looked them up just now. Two VC people with no significant footprint or long-term interest in SEO, Machine…
-
From Hallucinations to Clicks
Anastasia Kotsiubynska proposed a method to repurpose LLM-hallucinated URLs: set up redirects from hallucinated 404 URLs with more than one session to the most similar valid 200 pages. I really like this, but since I work on websites with many millions of pages where volumes of hallucinated URLs are typically beyond the scope of manual…
-
What is GEO?
GEO stands for Generative Engine Optimisation, an acronym easily confused with the well-established “geo-” prefix commonly associated with geosciences. What is a ‘Generative Engine’? “Generative engine” is a recently made-up term coined by the marketing community in an attempt to rename chatbots, more recently known as AI assistants, including ChatGPT, Claude, Grok, Gemini and Perplexity. Basically…
-
Google’s New URL Context Tool
Google has just released a new system that allows Gemini to fetch text directly from a supplied page. OpenAI has had this ability for a while now, but for Google this is completely new. Previously their models were limited to the Search Grounding tool alone. Gemini now employs a combination of tools and processes with the ability…
-
How Google grounds its LLM, Gemini.
In previous analyses (Gemini System Prompt Breakdown, Google’s Grounding Decision Process, and Hacking Gemini), we uncovered key aspects of how Google’s Gemini large language model verifies its responses through external grounding. A recent accidental exposure has provided deeper insights into Google’s internal processes, confirming and significantly expanding our earlier findings. Accidental Exposure of Gemini’s Grounding…
-
Google Lens Modes
lns_mode is a parameter that classifies Google Lens queries into text, un (unimodal), or mu (multimodal). Google Lens has quietly become one of the most advanced visual search tools in the world. Behind the scenes, it works by constructing detailed, context-rich search queries that include a growing set of parameters. One of the newest additions…
-
Content Substance Classification
Demo: https://dejan.ai/tools/substance/ Preface: In 1951, Isaac Asimov proposed an NLP method called Symbolic Logic Analysis (SLA) where text is reduced to its essential logical components. This method involves breaking down sentences into symbolic forms, allowing for a precise examination of salience and semantics analogous to contemporary transformer-based NER (named entity recognition) and summarisation techniques. In…