A deep learning approach using a Query Demand Estimator to automatically predict search volume ranges for long-tail queries generated by a fan-out model.
Finding new search terms for your website is easy with modern language models, but figuring out which ones actually get traffic is a massive bottleneck. That is why we developed the Query Demand Estimator, or QDE.
The QDE is a deep learning model designed to automatically predict search volume ranges for millions of generated keywords. We trained the model using historical search data, grouping queries into twelve distinct volume buckets.
While predicting the exact volume bucket is challenging, our model gets it right, or very close, nearly fifty-five percent of the time. Because it is choosing from twelve possible ranges, this performance is far better than random guessing. It gives content teams a fast, reliable way to prioritize their search engine optimization efforts.
For queries that have never been seen before, the most promising approach is entity tracking. Since most searches are about specific people, products, or brands, we can predict search volume by analyzing how popular those entities are across Wikipedia, knowledge graphs, and news trends.
By combining deep learning with entity analysis, businesses can turn a massive list of raw keywords into a prioritized, actionable roadmap for organic growth.
While traditional keyword research tools provide valuable data, they often fall short in discovering truly novel or long-tail search query variations that a business might not yet rank for, or even be aware of. This is where our query fan-out model comes in. Using advanced language models to generate a vast array of related search queries from existing organic queries.
However, generating a massive list of potential keywords creates a new challenge: how do you efficiently assess the search volume potential of these new, unproven queries? Manually checking each one is impractical. This article we present a deep learning approach developed to automatically predict the search volume ranges for these fan-out queries, transforming a broad list into an actionable, prioritized asset.
Content teams and SEO strategists constantly seek to expand their keyword footprint. Given a primary query like “AI SEO” and a target URL (e.g., dejan.ai), a fan-out generation model can suggest many diverse, yet related queries.
Here’s the exact output from the fan-out model for a single search query:
While invaluable for identifying new opportunities, this explosion of data quickly becomes overwhelming. Each generated query ideally needs a search volume estimate to determine its potential value and prioritize content efforts. Relying on external tools for millions of queries is costly and time-consuming.

To address this, we developed a Query Demand Estimator (QDE) using a deep learning model. The core idea is to train a sequence classification model to categorize a given query into predefined search volume buckets.
A tool driven by QDE model, trained for one specific industry.
The success of any supervised learning model hinges on the quality and quantity of its training data. Our approach involved:
label_id,label_text
We leveraged a pre-trained transformer model, specifically mDeBERTa-v3-base, known for its strong performance across various natural language understanding tasks, including classification. The choice of mDeBERTa also offers multilingual capabilities, which is advantageous for global businesses.
The model was fine-tuned as a sequence classifier:
The training process involved:
Once trained, the QDE model was integrated into our fan-out query generation system. As the fan-out model generated new query variations for a given URL and seed query, each new variation was immediately passed to the QDE model for a volume prediction. This allowed the system to:
Validation is crucial. To assess the QDE model’s real-world utility, we compared its predictions against a true gold standard: a subset of queries from a held out dataset, representing terms where our properties consistently ranked in the top 1-10 positions. For these queries, impression data closely reflects actual search volume.
The validation process involved:
Exact Match Accuracy: 23.31%
Initially, this might seem modest. It means that for 23.31% of the overlapping queries, the model predicted the exact search volume bucket.

Combined Accuracy (Exact + Adjacent): 54.80%

This metric is far more representative of the model’s practical value. It indicates that for 54.80% of the queries, the model’s prediction was either exactly correct OR within one adjacent search volume bucket (e.g., predicting “501-1000” when the actual was “251-500” or “1001-2000”). This level of accuracy is highly beneficial for prioritizing.

This isn’t a yes/no problem. The model isn’t picking between just 2 outcomes (like heads vs. tails). Instead, it has to choose among 11 different possible volume ranges (labels).


The confusion matrix (a table showing actual vs. predicted labels) provided deeper insights:

The deep learning-powered QDE model, integrated with fan-out query generation, transforms a previously manual and time-consuming process into an automated, scalable, and data-driven one. While not always achieving perfect exact-bucket accuracy, its ability to correctly or nearly correctly classify query search volume over 50% of the time provides an invaluable, actionable signal.
This system empowers SEO teams to:
The future of SEO keyword research is increasingly augmented by AI, allowing businesses to be more agile, comprehensive, and ultimately, more successful in capturing organic search demand.
Entity popularity signals offer the most promising foundation for zero-shot query volume prediction, according to a synthesis of recent machine learning and information retrieval literature. While direct research on predicting volume for never-before-seen queries remains sparse, converging evidence from entity importance estimation, query performance prediction, knowledge graph embeddings, and emerging LLM prompt analysis provides a viable methodological path. The core insight: because queries fundamentally seek information about entities—people, products, brands, concepts—modeling entity-level demand can generalize to arbitrary query formulations containing those entities.
The most directly relevant work comes from Van Gysel et al. (SIGIR 2020), who developed entity popularity prediction for virtual assistants at Apple. Their system forecasts which entities will trend before substantial query history exists, achieving 20% relative error reduction on emerging entity recognition. The key finding: temporal patterns of entity mentions in external signals (news, social media) correlate with future search demand, enabling predictions for entities with minimal query history.
Yang et al.’s Named Entity Topic Model (NETM) from Knowledge-Based Systems (2020) provides a theoretical framework for cold-start popularity prediction. The model assigns each named entity a “popularity-gain distribution over semantic topics”—essentially learning that certain entity-topic combinations drive engagement. For novel content, the model aggregates popularity gains from constituent entities across their topic distributions, predicting volume without any prior behavioral data.
The KB-PPN architecture (KDD 2018 Deep Learning Day) demonstrates how knowledge graph embeddings enable volume transfer between entities. By linking items to Freebase entities and encoding them via TransE embeddings, an LSTM with attention can identify similar entities in the knowledge graph and transfer their popularity dynamics. This directly addresses zero-shot prediction: a new product launch can borrow volume patterns from similar products identified via embedding proximity.
| Approach | Key Signal | Cold-Start Capability | Venue |
|---|---|---|---|
| Van Gysel et al. | Historical entity interactions + temporal patterns | High | SIGIR 2020 |
| NETM | Entity × topic popularity matrix | High | KBS 2020 |
| KB-PPN | TransE entity embeddings + KB neighbors | Medium-High | KDD 2018 |
| GENI | GNN on KG predicate-aware importance | Medium | KDD 2019 |
Park et al.’s GENI (KDD 2019) directly addresses the question: “How can we estimate the importance of nodes in a knowledge graph?” Their graph neural network aggregates importance scores (not just embeddings) across predicate-aware relationships, outperforming PageRank by 5-17% on NDCG@100 for importance prediction. The architecture captures that entities connected to other important entities via meaningful relations inherit importance—a signal correlating with search demand.
Thalhammer and Rettinger (ESWC 2016) found that combining Wikipedia’s internal link structure (PageRank) with page-view data provides the strongest entity importance estimates. Wikipedia page views serve as a direct search demand proxy, while link-based importance captures “objective relevance” independent of current interest. Their methodology enables entity importance estimation even for entities lacking search engine volume data.
Entity salience offers another avenue. Google’s work on entity salience (Dunietz & Gillick, EACL 2014) introduced automatic corpus generation using document-abstract pairs—entities appearing in abstracts are deemed salient. Features like first mention position, mention frequency, and headline presence predict which entities will be searched. The NYT-Salience dataset with millions of training examples is publicly available at google-research-datasets/nyt-salience, enabling training of salience predictors.
The recent GUM-SAGE dataset (ACL Findings 2025) advances this with graded salience scores (1-5) based on summary inclusion across 5 annotators and 12 genres. This graded approach better models varying levels of search interest than binary salience. Code and data are available at github.com/jessicaxylin/GUM-SAGE.
The information retrieval community has extensively studied Query Performance Prediction (QPP)—predicting query difficulty before or after retrieval. While QPP targets retrieval effectiveness rather than volume, its methods transfer remarkably well.
Cronen-Townsend et al.’s Clarity Score (SIGIR 2002) measures the KL-divergence between a query language model and the collection language model. High clarity indicates focused, unambiguous queries; low clarity suggests ambiguity. The connection to volume: ambiguous queries may have higher aggregate volume (multiple intents summed) while highly specific queries have lower volume but clearer intent. Query specificity features from QPP can inform volume estimation.
Carmel and Yom-Tov’s synthesis book (2010) provides a comprehensive taxonomy: pre-retrieval predictors use only query features (IDF statistics, linguistic features), while post-retrieval methods use search results. For novel query volume prediction, pre-retrieval features are essential—they require no behavioral data.
Modern neural approaches show promise. BERT-QPP (Arabzadeh et al., CIKM 2021) fine-tunes BERT for pre-retrieval difficulty prediction, demonstrating that contextualized embeddings capture query characteristics correlating with retrieval outcomes. Deep-QPP (Datta et al., WSDM 2022) uses 2D convolution over query-document interaction matrices. These architectures could be adapted for volume prediction by replacing retrieval effectiveness targets with volume labels.
Three representation paradigms dominate:
Semantic embeddings via Sentence-BERT enable similarity-based volume transfer. The intuition: if a novel query is semantically similar to known queries with volume data, transfer their volumes weighted by similarity. Nishikawa et al. (NAACL 2025 Industry) advanced this with user-behavior-driven contrastive learning at Yahoo Japan—training embeddings where queries leading to the same click or appearing in the same session are pulled together. This captures behavioral intent rather than surface lexical similarity.
Entity-based features leverage named entity recognition to extract persons, organizations, locations, and products from queries. Features include entity type, entity popularity (Wikipedia pageviews), entity recency (trending vs. evergreen), and entity embeddings from knowledge graphs. Queries containing trending celebrities or new product launches can be flagged for higher predicted volume based on entity-level signals alone.
Hybrid representations combining embeddings, entity features, and lexical statistics (query length, average IDF) perform best in production. Query length correlates strongly with volume—shorter queries tend to have higher volume (head queries) while longer, more specific queries populate the tail.
Recommended feature vector:- SBERT embedding (768-dim)- Entity types present (one-hot)- Entity popularity scores (continuous)- Query length, avg IDF- Query type (navigational/informational/transactional)The emergence of conversational AI has created new query surfaces. LMSYS-Chat-1M (ICLR 2024) provides 1 million real ChatGPT/LLaMA/Vicuna conversations across 25 models, revealing that prompt distributions follow power-law patterns similar to web search. WildChat-4.8M extends this with demographic metadata, showing geographic heterogeneity (21% US, 15% Russia, 10% China) and task distribution (creative writing dominates, followed by analysis and coding).
Critically, no published research exists on predicting novel LLM prompt volume or transferring web search patterns to prompt prediction. This represents a significant research gap. The datasets enable such work: BERTopic analysis of LMSYS-Chat-1M identified 29+ coherent topics, providing a taxonomy for prompt volume modeling.
Yelp’s engineering team demonstrated practical query volume prediction: they found query distributions follow power laws enabling pre-computation of LLM responses for 95% of traffic. This validates that prompt/query volume is predictable enough for industrial applications.
While cross-domain recommendation research provides architectural patterns—DSAM uses shared LSTMs with attention for preference transfer, CDTM addresses feature dimensional heterogeneity—no work specifically addresses transferring volume models from web search to e-commerce search to LLM prompts.
Domain generalization frameworks (Wang et al., IEEE TKDE 2022) suggest approaches: domain-invariant representation learning, meta-learning for few-shot adaptation, and self-supervised pretraining. The recipe would be: pretrain query embeddings on web search volume prediction, then fine-tune on the target domain (e-commerce, LLM prompts) with limited labeled data.
Google Trends preprocessing research (arXiv 2024) offers practical methodology: hierarchical clustering groups semantically similar queries to overcome data sparsity, combined with correlation-based filtering. This semantic grouping approach directly applies to novel query volume estimation—cluster novel queries with known queries, then estimate volume from cluster characteristics.
Datasets enabling entity-centric volume prediction:
| Dataset | Description | Access |
|---|---|---|
| NYT-Salience | Millions of entity salience labels | github.com/google-research-datasets |
| GUM-SAGE | Graded entity salience (12 genres) | github.com/jessicaxylin/GUM-SAGE |
| WikiPopular | Cold-start web traffic with text | Recent WWW paper |
| LMSYS-Chat-1M | LLM conversation logs | huggingface.co/datasets/lmsys |
| WildChat-4.8M | ChatGPT conversations + metadata | huggingface.co/datasets/allenai |
| Wikipedia Pageviews | Direct popularity signals | dumps.wikimedia.org |
Key code resources:
Confirmed gaps in the literature:
The most promising research direction combines entity importance from knowledge graphs with semantic similarity to known queries: extract entities from a novel query, estimate entity-level importance via KG signals, identify semantically similar known queries, then fuse these signals in a learned model. This entity-centric architecture leverages the key insight that entities are the fundamental units of search demand, enabling generalization to arbitrary query formulations about those entities.
The path to predicting volume for unseen queries runs through entities. Entity popularity prediction from knowledge graphs, Wikipedia signals, and content salience provides cold-start capability. Query performance prediction offers proven pre-retrieval features. Semantic embeddings enable similarity-based volume transfer from known queries. The missing piece—direct volume prediction models combining these signals—represents both the primary research gap and the clearest opportunity. Industrial applications at Apple, Yahoo Japan, and Yelp validate the approach’s viability; academic formalization and public benchmarks remain the field’s next frontier.
If you’d like a custom QDE model trained for your own website or client please apply below. This type of model training is best suited for websites with at least 100K, ideally 1M queries. We’ll evaluate your dataset and advise whether it’s suitable for model training.