Google’s open-source “Gemini Fullstack LangGraph Quickstart” pairs Gemini 2.5 with LangGraph to showcase a fully transparent, citation-driven research agent (Mikami 2025). A React frontend (Vite, Tailwind CSS, Shadcn UI) collects user queries and displays progress, while a FastAPI/LangGraph backend orchestrates a multi-step workflow:
- Query Generation: Gemini interprets the user’s request and formulates initial search terms.
- Web Research: These terms drive calls to Google Search (via Gemini tooling) to fetch relevant pages.
- Reflection & Gap Analysis: Gemini inspects the results, detects missing subtopics, and—if needed—generates follow-up queries.
- Iterative Refinement: The loop continues until sufficient coverage is achieved.
- Answer Synthesis: Gemini composes a coherent, citation-annotated response.
Although this isn’t the production system behind Google’s AI Mode or AI Overviews, it provides detailed technical insight into how to build a “DeepSearch”-style agent by modularizing query formulation, retrieval, reflection, and synthesis (project repo). It’s a practical blueprint for anyone wanting to understand the nuts and bolts of an advanced, LLM-driven research pipeline.
from datetime import datetime


# Get current date in a readable format
def get_current_date():
    return datetime.now().strftime("%B %d, %Y")


query_writer_instructions = """Your goal is to generate sophisticated and diverse web search queries. These queries are intended for an advanced automated web research tool capable of analyzing complex results, following links, and synthesizing information.
Instructions:
- Always prefer a single search query, only add another query if the original question requests multiple aspects or elements and one query is not enough.
- Each query should focus on one specific aspect of the original question.
- Don't produce more than {number_queries} queries.
- Queries should be diverse, if the topic is broad, generate more than 1 query.
- Don't generate multiple similar queries, 1 is enough.
- Query should ensure that the most current information is gathered. The current date is {current_date}.
Format:
- Format your response as a JSON object with BOTH of these exact keys:
- "rationale": Brief explanation of why these queries are relevant
- "query": A list of search queries
Example:
Topic: What revenue grew more last year apple stock or the number of people buying an iphone
```json
{{
"rationale": "To answer this comparative growth question accurately, we need specific data points on Apple's stock performance and iPhone sales metrics. These queries target the precise financial information needed: company revenue trends, product-specific unit sales figures, and stock price movement over the same fiscal period for direct comparison.",
"query": ["Apple total revenue growth fiscal year 2024", "iPhone unit sales growth fiscal year 2024", "Apple stock price growth fiscal year 2024"],
}}
```
Context: {research_topic}"""
web_searcher_instructions = """Conduct targeted Google Searches to gather the most recent, credible information on "{research_topic}" and synthesize it into a verifiable text artifact.
Instructions:
- Query should ensure that the most current information is gathered. The current date is {current_date}.
- Conduct multiple, diverse searches to gather comprehensive information.
- Consolidate key findings while meticulously tracking the source(s) for each specific piece of information.
- The output should be a well-written summary or report based on your search findings.
- Only include the information found in the search results, don't make up any information.
Research Topic:
{research_topic}
"""
reflection_instructions = """You are an expert research assistant analyzing summaries about "{research_topic}".
Instructions:
- Identify knowledge gaps or areas that need deeper exploration and generate a follow-up query. (1 or multiple).
- If provided summaries are sufficient to answer the user's question, don't generate a follow-up query.
- If there is a knowledge gap, generate a follow-up query that would help expand your understanding.
- Focus on technical details, implementation specifics, or emerging trends that weren't fully covered.
Requirements:
- Ensure the follow-up query is self-contained and includes necessary context for web search.
Output Format:
- Format your response as a JSON object with these exact keys:
- "is_sufficient": true or false
- "knowledge_gap": Describe what information is missing or needs clarification
- "follow_up_queries": Write a specific question to address this gap
Example:
```json
{{
"is_sufficient": true, // or false
"knowledge_gap": "The summary lacks information about performance metrics and benchmarks", // "" if is_sufficient is true
"follow_up_queries": ["What are typical performance benchmarks and metrics used to evaluate [specific technology]?"] // [] if is_sufficient is true
}}
```
Reflect carefully on the Summaries to identify knowledge gaps and produce a follow-up query. Then, produce your output following this JSON format:
Summaries:
{summaries}
"""
answer_instructions = """Generate a high-quality answer to the user's question based on the provided summaries.
Instructions:
- The current date is {current_date}.
- You are the final step of a multi-step research process, don't mention that you are the final step.
- You have access to all the information gathered from the previous steps.
- You have access to the user's question.
- Generate a high-quality answer to the user's question based on the provided summaries and the user's question.
- you MUST include all the citations from the summaries in the answer correctly.
User Context:
- {research_topic}
Summaries:
{summaries}"""
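The templates above are ordinary Python format strings: single braces mark placeholders, while doubled braces come out as literal JSON braces. The following is a minimal sketch of how such a template might be filled. `demo_template` and `build_prompt` are hypothetical names, and the template is a shortened stand-in rather than the full prompt.

```python
from datetime import datetime


def get_current_date() -> str:
    """Return today's date in a readable format such as 'June 05, 2025'."""
    return datetime.now().strftime("%B %d, %Y")


# Shortened stand-in for query_writer_instructions (illustrative only).
demo_template = """Don't produce more than {number_queries} queries.
The current date is {current_date}.
Example:
{{
    "rationale": "...",
    "query": ["..."]
}}
Context: {research_topic}"""


def build_prompt(template: str, research_topic: str, number_queries: int) -> str:
    """Fill single-brace placeholders; '{{' and '}}' become literal braces."""
    return template.format(
        number_queries=number_queries,
        current_date=get_current_date(),
        research_topic=research_topic,
    )


prompt = build_prompt(demo_template, "solid-state battery progress", 3)
print(prompt)
```

Forgetting to double the braces in the JSON example would make `str.format` raise an error, which is why the prompts escape them.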
1. Query Writer Instructions
Purpose:
Generate one or more highly focused search queries so an automated research tool can retrieve exactly the data needed.
Key Elements:
- Single‐query preference: Only split into multiple queries if the original topic truly requires distinct angles.
- One‐aspect‐per‐query: Each query hones in on a single facet of the research topic.
- Limit on count: Don’t exceed {number_queries}.
- Diversity requirement: If the topic is broad, you can use more queries; just avoid near‐duplicates.
- Recency filter: Always aim to pull in the most current information (using {current_date} as a reference).
How It Works in Practice:
- Receive research_topic and parameters (number_queries, current_date).
- Analyze the topic’s scope: Is it narrow enough for one query, or does it cover multiple sub‐areas?
- Produce JSON with:
  - "rationale": A brief justification for why these exact queries were chosen.
  - "query": An array of one to {number_queries} strings, each a standalone search string.
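The JSON contract described above can be checked on the receiving side. This is a minimal sketch, assuming the model reply arrives as raw text that may contain extra prose around the JSON object. `QueryPlan` and `parse_query_output` are hypothetical names, not taken from the project.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class QueryPlan:
    rationale: str
    query: list[str]


def parse_query_output(raw: str, number_queries: int) -> QueryPlan:
    """Extract the JSON object from a model reply and enforce the query cap."""
    # Replies may wrap the JSON in prose or a fence; take the outermost {...} span.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))

    queries = data.get("query", [])
    if isinstance(queries, str):  # tolerate a bare string instead of a list
        queries = [queries]
    queries = [q.strip() for q in queries if isinstance(q, str) and q.strip()]
    if not queries:
        raise ValueError("model returned no usable queries")

    # Truncate rather than fail if the model ignored the limit.
    return QueryPlan(
        rationale=str(data.get("rationale", "")),
        query=queries[:number_queries],
    )


reply = (
    'Here you go:\n{"rationale": "Compare both metrics.", '
    '"query": ["Apple revenue growth fiscal year 2024", '
    '"iPhone unit sales growth fiscal year 2024", '
    '"Apple stock price growth fiscal year 2024"]}'
)
result = parse_query_output(reply, number_queries=2)
print(result.query)
```

Note that `json.loads` is strict: a reply that copies the trailing comma from the prompt’s example would be rejected, so a production pipeline would likely rely on structured output or a more forgiving parser.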
Proactive Suggestions & Alternatives:
- Template refinement:
  - If you know the target search engine supports advanced operators (e.g., site:, filetype:), you can mention that in the rationale or even embed them in sample queries.
  - Alternative: Adopt a “PICO” (Problem/Intervention/Comparison/Outcome) or “SPICE” (Setting/Perspective/Intervention/Comparison/Evaluation) framework for clinical or comparative questions to systematically build queries.
- Dynamic query count:
  - Instead of a fixed {number_queries}, use a sliding scale based on detected topic complexity. For very broad topics, allow up to 5–7 queries and include logic to merge or discard near‐duplicates.
- Synonym expansion:
  - Suggest including synonyms or related terms within a single query via Boolean operators (e.g., “(AI OR machine learning) ethics 2025”) to reduce the total number while preserving breadth.
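The dynamic query count suggestion could be prototyped roughly as follows. This is only a sketch under stated assumptions: the complexity heuristic in `query_budget` and the 0.6 Jaccard threshold in `dedupe_queries` are illustrative choices, not values from the project.

```python
import re


def token_set(query: str) -> set[str]:
    """Lowercased alphanumeric tokens of a query."""
    return set(re.findall(r"[a-z0-9]+", query.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Token overlap between two queries, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedupe_queries(queries: list[str], threshold: float = 0.6) -> list[str]:
    """Keep the first query of each near-duplicate group."""
    kept: list[str] = []
    kept_tokens: list[set[str]] = []
    for q in queries:
        tokens = token_set(q)
        if all(jaccard(tokens, seen) < threshold for seen in kept_tokens):
            kept.append(q)
            kept_tokens.append(tokens)
    return kept


def query_budget(topic: str, low: int = 1, high: int = 7) -> int:
    """Crude heuristic (an assumption): separators such as commas, 'and',
    'or', and 'vs', plus topic length, hint at the number of sub-questions."""
    separators = len(re.findall(r",|\band\b|\bvs\b|\bor\b", topic.lower()))
    budget = low + separators + len(token_set(topic)) // 8
    return max(low, min(high, budget))


candidates = [
    "Apple revenue growth fiscal year 2024",
    "Apple revenue growth in fiscal year 2024",
    "iPhone unit sales 2024",
]
unique = dedupe_queries(candidates)
print(unique)
print(query_budget("GPU pricing"))
```

Token overlap is a cheap proxy for similarity; embedding-based comparison would catch paraphrases that share few words.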
2. Web Searcher Instructions
Purpose:
Turn those queries into concrete Google searches, retrieve results, and condense them into a structured, source‐verified summary.
Key Elements:
- Targeted Google Searches: Use precise queries from step one.
- Currency emphasis: Always check that the content is up to date (reference {current_date}).
- Multiple searches: Run several different queries to cover all angles.
- Meticulous source tracking: Every fact or data point in the final summary must include its source.
- Well‐written report: Synthesize findings into a coherent narrative without inventing details.
How It Works in Practice:
- Take research_topic.
- Execute each query (from the previous step) against Google (or another high‐quality index).
- Capture key findings: statistics, dates, quotes, methodology, etc.
- Annotate each piece of information with its exact URL or citation note.
- Produce a cohesive summary that integrates everything, segmented by theme or subtopic.
Proactive Suggestions & Alternatives:
- Use multiple search engines:
  - Beyond Google, consider Bing or a specialized academic engine (e.g., Semantic Scholar). Note differences in coverage for technical or niche topics.
  - If budget and tooling allow, employ an API like the Bing Web Search API, which offers programmatic rate limits and JSON‐friendly outputs.
- Automate link‐following logic:
  - Rather than stopping at the first page of results, script a headless browser or crawler to click through related pages (e.g., “see also” links within Wikipedia). That lets you capture deeper context without manual clicking.
- Early filtering & clustering:
  - Before synthesizing, cluster results by date or subtheme (use simple TF-IDF clustering) to avoid redundant reporting and to surface outliers.
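The TF-IDF clustering idea can be tried without any external library. Below is a minimal sketch using hand-rolled TF-IDF vectors, cosine similarity, and a greedy single-pass grouping. The 0.3 similarity threshold and all function names are assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build sparse TF-IDF vectors using a smoothed inverse document frequency."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_snippets(docs: list[str], threshold: float = 0.3) -> list[list[int]]:
    """Greedy pass: a snippet joins the first cluster whose seed is similar
    enough, otherwise it starts a new cluster. Returns lists of indices."""
    vectors = tfidf_vectors(docs)
    clusters: list[list[int]] = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


snippets = [
    "Solid-state battery energy density reached a new record",
    "New record for solid-state battery energy density announced",
    "Central bank raises interest rates again",
]
groups = cluster_snippets(snippets)
print(groups)
```

Singleton clusters are the candidates for “outliers”; clusters with several members point at findings that can be reported once with multiple sources.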
3. Reflection Instructions
Purpose:
Analyze the assembled summaries, pinpoint gaps, and suggest follow‐up queries for any missing technical or emerging details.
Key Elements:
- is_sufficient flag: Indicates whether the current summaries fully answer the user’s question.
- knowledge_gap: If false, explain exactly what’s missing, e.g., “no concrete benchmarks” or “latest regulatory changes aren’t covered.”
- follow_up_queries: One or more self‐contained questions designed to fill those gaps in a subsequent search pass.
How It Works in Practice:
- Receive all summary text generated by the Web Searcher step.
- Compare against the original research_topic: Did we cover every angle, especially technical specifications, use cases, or recent breakthroughs?
- If something’s missing:
  - Set "is_sufficient": false.
  - In "knowledge_gap", state in plain terms what’s lacking.
  - List one or more precise follow‐up questions that, when sent through the Query Writer, will generate queries to address those holes.
- If all gaps are filled:
  - Set "is_sufficient": true, with an empty knowledge_gap and [] for follow_up_queries.
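The two branches above map naturally onto a small validated record. This is a minimal sketch, assuming the reflection output arrives as strict JSON. `Reflection` and `parse_reflection` are hypothetical names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Reflection:
    is_sufficient: bool
    knowledge_gap: str = ""
    follow_up_queries: list[str] = field(default_factory=list)


def parse_reflection(raw_json: str) -> Reflection:
    """Validate the reflection output and normalise the 'sufficient' case."""
    data = json.loads(raw_json)
    flag = data.get("is_sufficient")
    if not isinstance(flag, bool):
        raise ValueError("is_sufficient must be a JSON boolean")

    if flag:
        # Sufficient coverage: gap and follow-ups are forced to be empty.
        return Reflection(is_sufficient=True)

    follow_ups = [
        q.strip()
        for q in data.get("follow_up_queries", [])
        if isinstance(q, str) and q.strip()
    ]
    if not follow_ups:
        raise ValueError("coverage reported as insufficient, but no follow-up queries given")
    return Reflection(False, str(data.get("knowledge_gap", "")), follow_ups)


gap = parse_reflection(
    '{"is_sufficient": false, "knowledge_gap": "No benchmarks", '
    '"follow_up_queries": ["What benchmarks exist for the technology?"]}'
)
done = parse_reflection(
    '{"is_sufficient": true, "knowledge_gap": "ignored", "follow_up_queries": ["ignored"]}'
)
print(gap.follow_up_queries, done.follow_up_queries)
```

The `//` comments in the prompt’s example are hints for the model only; a reply that echoed them would not be valid JSON and would fail in `json.loads`.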
Proactive Suggestions & Alternatives:
- Automated gap detection via keywords:
  - Instead of a purely manual scan, run a keyword-extraction script (e.g., using RAKE or YAKE) on both the research question and the summary. Flag topics present in the question but not in the summary.
- Incorporate domain‐expert heuristics:
  - For technical subjects, embed a checklist of must‐cover points (e.g., performance metrics, software versions, key protocols). If any checklist item is missing, automatically trigger a follow‐up query.
- Iterative deep dives:
  - If the summary is “sufficient but shallow,” allow a user toggle to run an “advanced reflection” that specifically hunts for implementation examples, code snippets, or raw datasets.
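Keyword-based gap detection can be approximated with a few lines of standard-library code. The sketch below deliberately swaps RAKE/YAKE for a plain stopword filter, so it is a simplified stand-in rather than either algorithm. The stopword list and function names are assumptions.

```python
import re

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "how", "in",
    "is", "it", "of", "on", "or", "the", "to", "what", "which", "with",
}


def keywords(text: str) -> set[str]:
    """Lowercased words of two or more characters that are not stopwords."""
    words = re.findall(r"[a-z0-9][a-z0-9\-]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def missing_keywords(question: str, summary: str) -> list[str]:
    """Keywords that appear in the question but nowhere in the summary."""
    return sorted(keywords(question) - keywords(summary))


question = "What are the latency benchmarks and licensing terms for Mistral models?"
summary = "Mistral models ship under permissive licensing terms, according to the vendor."
gaps = missing_keywords(question, summary)
print(gaps)
```

Because there is no stemming, “benchmark” and “benchmarks” count as different words; flagged terms are best treated as hints for a follow-up query, not as proof of a gap.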
4. Answer Instructions
Purpose:
Produce the final, polished answer to the user’s original question—completely grounded in the summaries and properly cited.
Key Elements:
- Use current_date as context.
- Assume all necessary data is available from earlier steps.
- Integrate citations directly into the narrative.
- Do not mention that this is the “final step” of a multi‐step process.
How It Works in Practice:
- Receive the user’s question plus all collected summaries (and citations).
- Structure a coherent response: introduction, body (grouped by subtheme), conclusion.
- Embed citation markers wherever a fact or quotation is used, so that the user can trace it back to the source.
- Avoid adding any new information not present in the summaries.
Proactive Suggestions & Alternatives:
- Interactive footnotes or tooltips:
  - Instead of inline bracket citations, consider generating numbered footnotes or hover‐tooltips so the user can click to see the source without breaking the reading flow.
- Visual summaries:
  - If appropriate, embed a small chart or timeline (via python_user_visible) to illustrate trends or dates mentioned in the summaries. This can make complex data easier to digest.
- Hyperlinked references:
  - Where permissible, convert citations into live hyperlinks so readers can jump directly to the original webpage.
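The footnote suggestion can be illustrated with a small post-processing step. This sketch assumes citations arrive as inline Markdown links of the form `[label](url)`, which is an assumption about the answer format. `to_footnotes` is a hypothetical helper.

```python
import re

# Matches inline Markdown links such as [label](https://example.com/page).
LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def to_footnotes(text: str) -> str:
    """Replace inline links with numbered markers and append a source list.
    The same URL always receives the same number."""
    numbers: dict[str, int] = {}

    def replace(match: re.Match) -> str:
        url = match.group(2)
        number = numbers.setdefault(url, len(numbers) + 1)
        return f"[{number}]"

    body = LINK.sub(replace, text)
    notes = "\n".join(f"[{n}] {url}" for url, n in numbers.items())
    return f"{body}\n\n{notes}" if notes else body


answer = (
    "Sales rose 5% [report](https://example.com/a) while costs fell "
    "[memo](https://example.com/b); see also [report](https://example.com/a)."
)
rendered = to_footnotes(answer)
print(rendered)
```

Keeping the URL in the footnote list preserves traceability while the body text stays readable.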
Overall Workflow
Query Writer → Web Searcher → Reflection → Answer.
- Each module hands off to the next via well‐defined JSON formats.
- If Reflection flags gaps, loop back to Query Writer with new follow‐up questions.
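The control flow of this workflow can be sketched as a plain loop. The four callables stand in for the LLM-backed modules and are stubbed here, and `max_loops` is an assumed safety cap. None of these names come from the project, which wires the steps together with LangGraph rather than a hand-written loop.

```python
from typing import Callable


def run_research(
    topic: str,
    write_queries: Callable[[str], list[str]],
    search: Callable[[str], str],
    reflect: Callable[[str, list[str]], tuple[bool, list[str]]],
    answer: Callable[[str, list[str]], str],
    max_loops: int = 3,
) -> str:
    """Query Writer -> Web Searcher -> Reflection, repeated until coverage is
    sufficient or the loop cap is reached, then Answer."""
    summaries: list[str] = []
    queries = write_queries(topic)
    for _ in range(max_loops):
        summaries.extend(search(q) for q in queries)
        sufficient, follow_ups = reflect(topic, summaries)
        if sufficient or not follow_ups:
            break
        # Loop back: follow-up questions pass through the Query Writer again.
        queries = [q for question in follow_ups for q in write_queries(question)]
    return answer(topic, summaries)


# Stub modules for demonstration only.
searched: list[str] = []


def fake_write(text: str) -> list[str]:
    return [f"q:{text}"]


def fake_search(query: str) -> str:
    searched.append(query)
    return f"summary of {query}"


def fake_reflect(topic: str, summaries: list[str]) -> tuple[bool, list[str]]:
    return (len(summaries) >= 2, ["gap"])


def fake_answer(topic: str, summaries: list[str]) -> str:
    return " | ".join(summaries)


result = run_research("topic", fake_write, fake_search, fake_reflect, fake_answer)
print(result)
```

The loop cap matters in practice: without it, a reflection step that never reports sufficiency would keep issuing searches indefinitely.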