Building Advanced AI Agents: How to Combine RAG with Real-Time Web Search

A common challenge in AI development is creating agents that are not limited to a static training set. To build truly useful tools for enterprise environments, developers often need to combine internal knowledge (like a company handbook) with real-time web search.
This hybrid approach allows an agent to act as a primary source for internal policies while having the web as a fallback for broader queries, all while maintaining strict control over which sources are trusted.
- Converting Web Pages to LLM-Friendly Markdown
The first step in any web-integrated pipeline is getting raw HTML into a format an LLM can understand. Tools like Docling are highly effective for this, allowing you to convert a specific URL into clean Markdown.
By converting a page—such as the EU AI Act—into Markdown, you provide the LLM with a structured text format that is easy to summarize and interpret. This is essential for agents that need to "read" a specific page provided by a user in real-time.
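As a minimal sketch, the URL-to-Markdown step with Docling might look like this. The `DocumentConverter` usage reflects Docling's documented API; the `extract_headings` helper is a hypothetical sanity check written here, not part of Docling:

```python
def url_to_markdown(url: str) -> str:
    """Fetch a page with Docling and return it as LLM-friendly Markdown."""
    from docling.document_converter import DocumentConverter  # pip install docling
    converter = DocumentConverter()
    result = converter.convert(url)
    return result.document.export_to_markdown()

def extract_headings(markdown: str) -> list[str]:
    """Quick sanity check: did the conversion preserve document structure?"""
    return [line.lstrip("#").strip()
            for line in markdown.splitlines()
            if line.startswith("#")]
```

You would then pass the returned Markdown (for example, the converted EU AI Act page) directly into the LLM's context for summarization or question answering.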
- Real-Time Web Search with Domain Filtering
While searching a single page is useful, broader web search makes an agent truly dynamic. Using the OpenAI Web Search tool, developers can implement "agentic" searches that crawl the internet to find answers.
Key features of this implementation include:
• Domain Filtering: You can restrict the agent to a specific list of "allowed domains" (e.g., government websites or official documentation) to ensure high-quality, focused results.
• Reasoning Models: Using small reasoning models (e.g., GPT-5 mini or nano) allows the agent to decide whether it needs to perform multiple searches or iterative loops to satisfy a query.
• Citations: By using structured output (via Pydantic), the agent can return not just an answer, but a list of URLs and text snippets showing exactly where the information came from.
- Integrating Internal Knowledge (RAG)
For many clients, the core of an AI assistant is its access to internal instructions or handbooks. In a production environment, this is typically handled via a RAG (Retrieval-Augmented Generation) pipeline.
In this pattern, the handbook is treated as a tool. The agent only calls the "search handbook" function when it determines that the user's question relates to internal data rather than general knowledge or the live web.
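A sketch of the handbook-as-a-tool pattern: the flat schema below follows the OpenAI Responses API tool format, while `search_handbook` and its retrieval step are placeholders for whatever vector store you actually use:

```python
# Tool definition the model sees: its description is what teaches the
# agent to call this only for internal questions.
HANDBOOK_TOOL = {
    "type": "function",
    "name": "search_handbook",
    "description": (
        "Search the internal company handbook. Use this ONLY for questions "
        "about internal policies, procedures, or benefits."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "What to look up."},
        },
        "required": ["query"],
    },
}

def search_handbook(query: str) -> str:
    """Placeholder retrieval step for the RAG pipeline.

    In production this would embed the query, hit your vector store
    (e.g. pgvector, Qdrant), and return the top-matching handbook chunks.
    """
    # chunks = vector_store.similarity_search(query, k=4)  # hypothetical backend
    return f"[handbook excerpts relevant to: {query!r}]"
```

The key design choice is in the `description`: the model, not your code, decides when internal retrieval is relevant, so the tool description carries the routing logic.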
- Bringing it All Together: The Multi-Tool Agent
The true power of this system is unlocked when you combine these capabilities into a single interactive agent. By abstracting functions into a dedicated tools folder, you can create a clean, scalable architecture.
A sophisticated search agent follows this decision-making process:
1. Analyze the Query: Does the user want internal info, a specific web page, or a general search?
2. Select the Tool: The agent decides whether to call the handbook tool, the single-page scraper, or the web search tool.
3. Synthesize and Cite: The agent brings together information from multiple sources (if necessary) and replies in a structured format with citations.
- Conclusion
Combining RAG with real-time web search represents the next step in AI engineering. By using structured outputs, domain filtering, and tool-calling patterns, developers can create AI assistants that are both grounded in private data and aware of the ever-changing world.
For developers looking to implement this, focusing on modular tool structures and type-safe Pydantic models is the most efficient way to scale these "search-aware" agents.